From: Michal Privoznik
To: libvir-list@redhat.com
Cc: mfleming@suse.de
Date: Mon, 13 Mar 2017 13:29:53 +0100
Message-Id: <70108c0f23c3e7a35f16bca1a50d6b1933382930.1489405375.git.mprivozn@redhat.com>
Subject: [libvirt] [PATCH 2/2] qemu: Adaptive timeout for connecting to monitor

There were a couple of reports on the list (e.g. [1]) that guests
with huge amounts of RAM are unable to start because libvirt
kills qemu in the initialization phase. The problem is that if a
guest is configured to use hugepages, the kernel has to zero them
all out before handing them over to the qemu process. For
instance, 402GiB worth of 1GiB pages took around 105 seconds
(~3.8GiB/s). Since we do not want to make the timeout for
connecting to the monitor configurable [2], we have to teach
libvirt to account for this fact. This commit implements the
"1s per each 1GiB of RAM" approach as suggested here [3].

1: https://www.redhat.com/archives/libvir-list/2017-March/msg00373.html
2: The reason is that ideally someday it will be libvirt that
   creates the monitor socket and qemu will just use it.
3: https://www.redhat.com/archives/libvir-list/2017-March/msg00405.html

Signed-off-by: Michal Privoznik
---
 src/qemu/qemu_capabilities.c |  2 +-
 src/qemu/qemu_monitor.c      | 36 +++++++++++++++++++++++++++++++-----
 src/qemu/qemu_monitor.h      |  1 +
 src/qemu/qemu_process.c      |  8 ++++++++
 tests/qemumonitortestutils.c |  1 +
 5 files changed, 42 insertions(+), 6 deletions(-)

diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c
index 5a3b4ac50..54dfd22d8 100644
--- a/src/qemu/qemu_capabilities.c
+++ b/src/qemu/qemu_capabilities.c
@@ -4761,7 +4761,7 @@ virQEMUCapsInitQMPCommandRun(virQEMUCapsInitQMPCommandPtr cmd,
     cmd->vm->pid = cmd->pid;
 
     if (!(cmd->mon = qemuMonitorOpen(cmd->vm, &cmd->config, true,
-                                     &callbacks, NULL)))
+                                     0, &callbacks, NULL)))
         goto ignore;
 
     virObjectLock(cmd->mon);
diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c
index d71f84c80..272350bf5 100644
--- a/src/qemu/qemu_monitor.c
+++ b/src/qemu/qemu_monitor.c
@@ -327,11 +327,13 @@ qemuMonitorDispose(void *obj)
 
 
 static int
-qemuMonitorOpenUnix(const char *monitor, pid_t cpid)
+qemuMonitorOpenUnix(const char *monitor,
+                    pid_t cpid,
+                    unsigned long long timeout)
 {
     struct sockaddr_un addr;
     int monfd;
-    virTimeBackOffVar timeout;
+    virTimeBackOffVar timebackoff;
     int ret = -1;
 
     if ((monfd = socket(AF_UNIX, SOCK_STREAM, 0)) < 0) {
@@ -348,9 +350,9 @@ qemuMonitorOpenUnix(const char *monitor, pid_t cpid)
         goto error;
     }
 
-    if (virTimeBackOffStart(&timeout, 1, 30*1000 /* ms */) < 0)
+    if (virTimeBackOffStart(&timebackoff, 1, timeout * 1000) < 0)
         goto error;
-    while (virTimeBackOffWait(&timeout)) {
+    while (virTimeBackOffWait(&timebackoff)) {
         ret = connect(monfd, (struct sockaddr *) &addr, sizeof(addr));
 
         if (ret == 0)
@@ -871,10 +873,30 @@ qemuMonitorOpenInternal(virDomainObjPtr vm,
 }
 
 
+#define QEMU_DEFAULT_MONITOR_WAIT 30
+
+/**
+ * qemuMonitorOpen:
+ * @vm: domain object
+ * @config: monitor configuration
+ * @json: enable JSON on the monitor
+ * @timeout: number of seconds to add to the default timeout
+ * @cb: monitor event handlers
+ * @opaque: opaque data for @cb
+ *
+ * Opens the monitor for running qemu. It may happen that it
+ * takes some time for qemu to create the monitor socket (e.g.
+ * because the kernel is zeroing configured hugepages), therefore
+ * we wait up to default + timeout seconds for the monitor to show
+ * up, after which a failure is reported.
+ *
+ * Returns monitor object, NULL on error.
+ */
 qemuMonitorPtr
 qemuMonitorOpen(virDomainObjPtr vm,
                 virDomainChrSourceDefPtr config,
                 bool json,
+                unsigned long long timeout,
                 qemuMonitorCallbacksPtr cb,
                 void *opaque)
 {
@@ -882,10 +904,14 @@ qemuMonitorOpen(virDomainObjPtr vm,
     bool hasSendFD = false;
     qemuMonitorPtr ret;
 
+    timeout += QEMU_DEFAULT_MONITOR_WAIT;
+
     switch (config->type) {
     case VIR_DOMAIN_CHR_TYPE_UNIX:
         hasSendFD = true;
-        if ((fd = qemuMonitorOpenUnix(config->data.nix.path, vm ? vm->pid : 0)) < 0)
+        if ((fd = qemuMonitorOpenUnix(config->data.nix.path,
+                                      vm ? vm->pid : 0,
+                                      timeout)) < 0)
             return NULL;
         break;
 
diff --git a/src/qemu/qemu_monitor.h b/src/qemu/qemu_monitor.h
index 847e9458a..3c37a6ffe 100644
--- a/src/qemu/qemu_monitor.h
+++ b/src/qemu/qemu_monitor.h
@@ -246,6 +246,7 @@ char *qemuMonitorUnescapeArg(const char *in);
 qemuMonitorPtr qemuMonitorOpen(virDomainObjPtr vm,
                                virDomainChrSourceDefPtr config,
                                bool json,
+                               unsigned long long timeout,
                                qemuMonitorCallbacksPtr cb,
                                void *opaque)
     ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(4);
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index b9c1847bb..6a9c53aea 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -1658,6 +1658,7 @@ qemuConnectMonitor(virQEMUDriverPtr driver, virDomainObjPtr vm, int asyncJob,
     qemuDomainObjPrivatePtr priv = vm->privateData;
     int ret = -1;
     qemuMonitorPtr mon = NULL;
+    unsigned long long timeout = 0;
 
     if (qemuSecuritySetDaemonSocketLabel(driver->securityManager, vm->def) < 0) {
         VIR_ERROR(_("Failed to set security context for monitor for %s"),
@@ -1665,6 +1666,12 @@ qemuConnectMonitor(virQEMUDriverPtr driver, virDomainObjPtr vm, int asyncJob,
         return -1;
     }
 
+    /* When using hugepages, the kernel zeroes them out before
+     * handing them over to qemu. This can be very time
+     * consuming. Therefore, add one second to the timeout for
+     * each 1GiB of guest RAM. */
+    timeout = vm->def->mem.total_memory / (1024 * 1024);
+
     /* Hold an extra reference because we can't allow 'vm' to be
      * deleted until the monitor gets its own reference. */
     virObjectRef(vm);
@@ -1675,6 +1682,7 @@ qemuConnectMonitor(virQEMUDriverPtr driver, virDomainObjPtr vm, int asyncJob,
     mon = qemuMonitorOpen(vm,
                           priv->monConfig,
                           priv->monJSON,
+                          timeout,
                           &monitorCallbacks,
                           driver);
 
diff --git a/tests/qemumonitortestutils.c b/tests/qemumonitortestutils.c
index cfd0a38cb..89857a662 100644
--- a/tests/qemumonitortestutils.c
+++ b/tests/qemumonitortestutils.c
@@ -1175,6 +1175,7 @@ qemuMonitorTestNew(bool json,
     if (!(test->mon = qemuMonitorOpen(test->vm,
                                       &src,
                                       json,
+                                      0,
                                       &qemuMonitorTestCallbacks,
                                       driver)))
         goto error;
-- 
2.11.0
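
To make the timeout arithmetic in this patch concrete, here is a minimal standalone sketch. It is not libvirt code. It assumes the guest's total memory is expressed in KiB, which is what the division by 1024 * 1024 in the patch implies, and the helper names extra_timeout_secs() and monitor_timeout_ms() are hypothetical, introduced only for illustration.

```c
#include <assert.h>
#include <limits.h>

/* Default wait in seconds; mirrors QEMU_DEFAULT_MONITOR_WAIT from the patch. */
#define DEFAULT_MONITOR_WAIT 30ULL

/* Hypothetical helper: extra seconds granted for a guest of the given size.
 * total_memory_kib is assumed to be in KiB, so dividing by 1024 * 1024
 * yields whole GiB, i.e. one extra second per full GiB of guest RAM. */
unsigned long long
extra_timeout_secs(unsigned long long total_memory_kib)
{
    return total_memory_kib / (1024ULL * 1024ULL);
}

/* Hypothetical helper: total wait in milliseconds, the unit the patch
 * passes to virTimeBackOffStart() as "timeout * 1000". */
unsigned long long
monitor_timeout_ms(unsigned long long total_memory_kib)
{
    unsigned long long secs = extra_timeout_secs(total_memory_kib) +
                              DEFAULT_MONITOR_WAIT;

    /* Guard the multiplication below against overflow. */
    assert(secs <= ULLONG_MAX / 1000ULL);
    return secs * 1000ULL;
}
```

For the 402GiB guest mentioned in the commit message, monitor_timeout_ms(402ULL * 1024 * 1024) works out to 432000 ms (432 s), comfortably above the roughly 105 s the kernel needed to zero the pages. Because of the integer division, a guest with less than 1GiB of RAM gets only the 30 s default.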