From nobody Sun May 5 03:34:22 2024
From: Christian Ehrhardt
To: Alex Williamson, Daniel P. Berrangé, libvir-list@redhat.com
Date: Fri, 3 Aug 2018 08:29:40 +0200
Message-Id: <20180803062941.31641-2-christian.ehrhardt@canonical.com>
In-Reply-To: <20180803062941.31641-1-christian.ehrhardt@canonical.com>
References: <20180803062941.31641-1-christian.ehrhardt@canonical.com>
Cc: Christian Ehrhardt
Subject: [libvirt] [RFC 1/2] process: wait longer 5->30s on hard shutdown
List-Id: Development discussions about the libvirt library & tools
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

When virProcessKillPainfully finds that SIGTERM was not enough, we are
already on a bad path: the system may be overloaded or struggling to
free and reap resources in time.

In those cases, give the SIGKILL that is sent after 10 seconds more
time to take effect (30 seconds instead of 5). This only applies if
force was set, since only then do we fall back to SIGKILL at all.

Signed-off-by: Christian Ehrhardt
---
 src/util/virprocess.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/util/virprocess.c b/src/util/virprocess.c
index f92b0dce37..10952b0980 100644
--- a/src/util/virprocess.c
+++ b/src/util/virprocess.c
@@ -350,6 +350,7 @@ virProcessKillPainfully(pid_t pid, bool force)
 {
     size_t i;
     int ret = -1;
+    int maxwait = (force ? 200 : 75 );
     const char *signame = "TERM";
 
     VIR_DEBUG("vpid=%lld force=%d", (long long)pid, force);
@@ -357,12 +358,12 @@ virProcessKillPainfully(pid_t pid, bool force)
     /* This loop sends SIGTERM, then waits a few iterations (10 seconds)
      * to see if it dies. If the process still hasn't exited, and
      * @force is requested, a SIGKILL will be sent, and this will
-     * wait up to 5 seconds more for the process to exit before
+     * wait up to 30 seconds more for the process to exit before
      * returning.
      *
      * Note that setting @force could result in dataloss for the process.
      */
-    for (i = 0; i < 75; i++) {
+    for (i = 0; i < maxwait; i++) {
         int signum;
         if (i == 0) {
             signum = SIGTERM; /* kindly suggest it should exit */
-- 
2.17.1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

From nobody Sun May 5 03:34:22 2024
From: Christian Ehrhardt
To: Alex Williamson, Daniel P. Berrangé, libvir-list@redhat.com
Date: Fri, 3 Aug 2018 08:29:41 +0200
Message-Id: <20180803062941.31641-3-christian.ehrhardt@canonical.com>
In-Reply-To: <20180803062941.31641-1-christian.ehrhardt@canonical.com>
References: <20180803062941.31641-1-christian.ehrhardt@canonical.com>
Cc: Christian Ehrhardt
Subject: [libvirt] [RFC 2/2] process: accept the lack of /proc/ as valid process removal
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

There were cases where the process was gone (no /proc/ entry anymore),
but kill with signal=0 was still able to reach the process. This can
happen while the kernel is still cleaning up resources.

In the most common cases, a /proc/ entry is left with the process in
zombie state until it is reaped (by init). Those cases usually resolve
rather quickly, as init will call wait to reap them.

The more critical and confusing cases are those where the process is
gone from all that (not in /proc/ anymore), but the kernel still
considers it reachable by kill with signal 0. This is due to kill(2)
only checking for "existence of a process ID" but not the process
itself.

This effect has mostly been seen when using plenty of SR-IOV resources
in the guest (which might explain the extra cleanup phase in the
kernel). To resolve those issues, libvirt will accept /proc/ being gone
as a valid exit as well (on top of signal 0 returning ESRCH).

Signed-off-by: Christian Ehrhardt
---
 src/util/virprocess.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/src/util/virprocess.c b/src/util/virprocess.c
index 10952b0980..8d863a6777 100644
--- a/src/util/virprocess.c
+++ b/src/util/virprocess.c
@@ -352,9 +352,14 @@ virProcessKillPainfully(pid_t pid, bool force)
     int ret = -1;
     int maxwait = (force ? 200 : 75 );
     const char *signame = "TERM";
+    char *procPath = NULL;
 
     VIR_DEBUG("vpid=%lld force=%d", (long long)pid, force);
 
+    if (virAsprintf(&procPath, "/proc/%lld", (long long) pid) < 0)
+        VIR_WARN("Can't allocate procPath to check for exit of pid %lld",
+                 (long long)pid);
+
     /* This loop sends SIGTERM, then waits a few iterations (10 seconds)
      * to see if it dies. If the process still hasn't exited, and
      * @force is requested, a SIGKILL will be sent, and this will
@@ -393,6 +398,24 @@ virProcessKillPainfully(pid_t pid, bool force)
             ret = signum == SIGTERM ? 0 : 1;
             goto cleanup; /* process is dead */
         }
+        /*
+         * There were cases where the process was gone (no /proc/ entry
+         * anymore), but kill with signal=0 was still able to reach it, as
+         * kill(2) only checks for "existence of a process ID" but not the
+         * process itself. This can happen, for example, while the kernel is
+         * still cleaning up resources. We accept having no /proc/ entry left
+         * as a valid exit of the process as well.
+         */
+        if (procPath != NULL && !virFileExists(procPath)) {
+            if (errno == ENOENT) {
+                ret = signum == SIGTERM ? 0 : 1;
+                /* DEBUG as it could be just the race from signal to cleanup */
+                VIR_DEBUG("Process with pid %lld still reachable with signals "
+                          "but %s no longer exists",
+                          (long long) pid, procPath);
+                goto cleanup;
+            }
+        }
 
         usleep(200 * 1000);
     }
@@ -402,6 +425,7 @@ virProcessKillPainfully(pid_t pid, bool force)
                          (long long)pid, signame);
 
  cleanup:
+    VIR_FREE(procPath);
     return ret;
 }
 
-- 
2.17.1
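
For readers following the series, here is a minimal, self-contained
sketch of how the two changes combine. It is not the libvirt
implementation. The function names (max_iterations,
seconds_after_escalation, proc_entry_gone, kill_painfully_sketch) and
the constants are hypothetical. The escalation point of 50 iterations is
derived from the 10 seconds and the 200 ms sleep quoted in the series,
and the /proc check assumes a Linux procfs mounted at /proc.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical constants mirroring the numbers quoted in the series. */
#define POLL_INTERVAL_MS     200  /* one poll every 200 ms */
#define ESCALATION_ITERATION 50   /* 50 * 200 ms = 10 s before SIGKILL */

/* Loop bound from patch 1/2: 200 polls with force, 75 without. */
int max_iterations(bool force)
{
    return force ? 200 : 75;
}

/* Seconds the loop keeps polling after the 10-second mark. */
int seconds_after_escalation(bool force)
{
    int remaining = max_iterations(force) - ESCALATION_ITERATION;

    assert(remaining >= 0);
    return remaining * POLL_INTERVAL_MS / 1000;
}

/* Idea from patch 2/2: true if /proc/<pid> is missing (ENOENT). */
bool proc_entry_gone(pid_t pid)
{
    char path[64];

    snprintf(path, sizeof(path), "/proc/%lld", (long long)pid);
    return access(path, F_OK) < 0 && errno == ENOENT;
}

/* Returns 0 once the process is gone, -1 on error or timeout. */
int kill_painfully_sketch(pid_t pid, bool force)
{
    const struct timespec interval = {
        .tv_sec = 0,
        .tv_nsec = POLL_INTERVAL_MS * 1000000L,
    };

    for (int i = 0; i < max_iterations(force); i++) {
        int signum = 0; /* default: only probe for existence */

        if (i == 0)
            signum = SIGTERM;
        else if (i == ESCALATION_ITERATION && force)
            signum = SIGKILL;

        if (kill(pid, signum) < 0)
            return errno == ESRCH ? 0 : -1;

        /* Still signalable, but no /proc entry: treat as exited. */
        if (proc_entry_gone(pid))
            return 0;

        nanosleep(&interval, NULL);
    }

    errno = EBUSY;
    return -1;
}
```

A caller would use kill_painfully_sketch(pid, true) for a forced
shutdown. Note that a child of the calling process stays a zombie with
a /proc entry until it is reaped with wait, so the sketch only makes
sense for processes that some other parent (for example init) reaps.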