[PATCH] tests/qtest: kill off QEMU with SIGKILL when qtest exits abnormally

Daniel P. Berrangé posted 1 patch 1 year, 11 months ago
tests/qtest/libqtest.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
[PATCH] tests/qtest: kill off QEMU with SIGKILL when qtest exits abnormally
Posted by Daniel P. Berrangé 1 year, 11 months ago
If a qtest program exits without calling qtest_quit(), then the
QEMU emulator process will remain running in the background forever.

Unfortunately this scenario is exactly what will happen when a
g_assert() check triggers an abort().

Prior to switching to use of 'meson test', this problem would
cause tap-driver.pl to hang forever. It was waiting for its
STDIN to report EOF, but that would never happen due to the
orphaned QEMU emulator processes keeping the pipe open forever.
Fortunately this doesn't happen with meson, but it is still
desirable to not leak QEMU processes when asserts fire.

Using the Linux-specific prctl(PR_SET_PDEATHSIG) syscall, we
can ensure that QEMU gets sent SIGKILL as soon as the controlling
qtest exits, despite being daemonized.

Note, technically the death signal is sent when the *thread* that
called fork() exits. IOW, if you are calling qtest_init() in one
thread, letting that thread exit, and then expecting to run
qtest_quit() in a different thread, things are not going to work
out. Fortunately that is not a scenario that exists in qtests,
as pairs of qtest_init and qtest_quit are always called from the
same thread.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 tests/qtest/libqtest.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/tests/qtest/libqtest.c b/tests/qtest/libqtest.c
index 228357f1ea..553e82e492 100644
--- a/tests/qtest/libqtest.c
+++ b/tests/qtest/libqtest.c
@@ -19,6 +19,9 @@
 #include <sys/socket.h>
 #include <sys/wait.h>
 #include <sys/un.h>
+#ifdef __linux__
+#include <sys/prctl.h>
+#endif /* __linux__ */
 
 #include "libqtest.h"
 #include "libqmp.h"
@@ -301,6 +304,21 @@ QTestState *qtest_init_without_qmp_handshake(const char *extra_args)
     s->expected_status = 0;
     s->qemu_pid = fork();
     if (s->qemu_pid == 0) {
+#ifdef __linux__
+        /*
+         * If the controlling qtest process exits without calling
+         * the qtest_quit() method, the QEMU processes will get
+         * orphaned and remain running forever in the background.
+         *
+         * Missing qtest_quit() calls are, unfortunately, exactly
+         * what happens when a g_assert() check triggers abort() in
+         * a failing test scenario.
+         *
+         * This PR_SET_PDEATHSIG setup will ensure QEMU will
+         * get terminated with SIGKILL.
+         */
+        prctl(PR_SET_PDEATHSIG, SIGKILL, 0, 0, 0);
+#endif /* __linux__ */
         if (!g_setenv("QEMU_AUDIO_DRV", "none", true)) {
             exit(1);
         }
-- 
2.36.1
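
For anyone who wants to try the mechanism outside of libqtest, here is a
minimal standalone sketch of the technique used in the patch above. It is
not the libqtest code. The helper name spawn_with_pdeathsig() is
hypothetical, and the getppid() comparison is an extra precaution that the
patch does not include: it covers the window in which the parent dies
between fork() and prctl(). On non-Linux hosts the sketch degrades to a
plain fork()/execvp().

```c
#define _POSIX_C_SOURCE 200809L /* for fork(), execvp(), kill() under strict -std */
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#ifdef __linux__
#include <sys/prctl.h>
#endif

/*
 * Hypothetical helper, not part of libqtest: fork and exec a child
 * that the kernel kills with SIGKILL once the forking thread of the
 * parent process exits. Returns the child pid, or -1 if fork() fails.
 */
static pid_t spawn_with_pdeathsig(const char *file, char *const args[])
{
    pid_t parent = getpid();
    pid_t pid;

    assert(file != NULL);
    pid = fork();
    if (pid == 0) {
#ifdef __linux__
        /* Ask the kernel to send us SIGKILL when our parent goes away. */
        if (prctl(PR_SET_PDEATHSIG, SIGKILL, 0, 0, 0) < 0) {
            _exit(127);
        }
        /*
         * Assumed extra check, absent from the patch: if the parent
         * already died before prctl() ran, we have been reparented
         * and no death signal will ever arrive.
         */
        if (getppid() != parent) {
            _exit(127);
        }
#endif
        execvp(file, args);
        _exit(127); /* only reached if execvp() failed */
    }
    return pid;
}
```

A caller would pass an argv-style array, for example { "sleep", "30", NULL },
and keep the returned pid for the normal kill()/waitpid() cleanup path. The
death signal setting survives the exec in the ordinary case, which is why it
can be set before execvp().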


Re: [PATCH] tests/qtest: kill off QEMU with SIGKILL when qtest exits abnormally
Posted by Thomas Huth 1 year, 11 months ago
On 13/05/2022 16.37, Daniel P. Berrangé wrote:
> If a qtest program exits without calling qtest_quit(), then the
> QEMU emulator process will remain running in the background forever.
> 
> Unfortunately this scenario is exactly what will happen when a
> g_assert() check triggers an abort().
> 
> Prior to switching to use of 'meson test', this problem would
> cause tap-driver.pl to hang forever. It was waiting for its
> STDIN to report EOF, but that would never happen due to the
> orphaned QEMU emulator processes keeping the pipe open forever.
> Fortunately this doesn't happen with meson, but it is still
> desirable to not leak QEMU processes when asserts fire.
> 
> Using the Linux specific prctl(PR_SET_PDEATHSIG) syscall, we
> can ensure that QEMU gets sent SIGKILL as soon as the controlling
> qtest exits, despite being daemonized.
> 
> Note, technically the death signal is sent when the *thread* that
> called fork() exits. IOW, if you are calling qtest_init() in one
> thread, letting that thread exit, and then expecting to run
> qtest_quit() in a different thread, things are not going to work
> out. Fortunately that is not a scenario that exists in qtests,
> as pairs of qtest_init and qtest_quit are always called from the
> same thread.
> 
> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> ---
>   tests/qtest/libqtest.c | 18 ++++++++++++++++++
>   1 file changed, 18 insertions(+)
> 
> diff --git a/tests/qtest/libqtest.c b/tests/qtest/libqtest.c
> index 228357f1ea..553e82e492 100644
> --- a/tests/qtest/libqtest.c
> +++ b/tests/qtest/libqtest.c
> @@ -19,6 +19,9 @@
>   #include <sys/socket.h>
>   #include <sys/wait.h>
>   #include <sys/un.h>
> +#ifdef __linux__
> +#include <sys/prctl.h>
> +#endif /* __linux__ */
>   
>   #include "libqtest.h"
>   #include "libqmp.h"
> @@ -301,6 +304,21 @@ QTestState *qtest_init_without_qmp_handshake(const char *extra_args)
>       s->expected_status = 0;
>       s->qemu_pid = fork();
>       if (s->qemu_pid == 0) {
> +#ifdef __linux__
> +        /*
> +         * If the controlling qtest process exits without calling
> +         * the qtest_quit() method, the QEMU processes will get
> +         * orphaned and remain running forever in the background.
> +         *
> +         * Missing qtest_quit() calls are, unfortunately, exactly
> +         * what happens when a g_assert() check triggers abort() in
> +         * a failing test scenario.
> +         *
> +         * This PR_SET_PDEATHSIG setup will ensure QEMU will
> +         * get terminated with SIGKILL.
> +         */
> +        prctl(PR_SET_PDEATHSIG, SIGKILL, 0, 0, 0);
> +#endif /* __linux__ */
>           if (!g_setenv("QEMU_AUDIO_DRV", "none", true)) {
>               exit(1);
>           }

Would it make sense to install a signal handler for SIGABRT instead and make 
sure that we tear down the QEMU instance there? ... that would then also 
work for other non-Linux operating systems?

  Thomas
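
To make the SIGABRT idea concrete, here is a rough sketch of what such a
handler could look like. It is an illustration only and does not reproduce
the handler that libqtest installs. The names tracked_child,
abrt_kill_child() and install_abrt_cleanup() are hypothetical, the sketch
tracks a single child, and it assumes that a pid fits in sig_atomic_t. It
sends SIGKILL rather than SIGTERM so that the child cannot ignore the
request. kill() and waitpid() are both on the POSIX list of
async-signal-safe functions.

```c
#define _POSIX_C_SOURCE 200809L /* for sigaction(), kill() under strict -std */
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical bookkeeping: the one child to tear down on abort(). */
static volatile sig_atomic_t tracked_child = -1;

/* Runs in signal context, so it only calls async-signal-safe functions. */
static void abrt_kill_child(int signo)
{
    pid_t pid = (pid_t)tracked_child;

    (void)signo;
    if (pid > 0) {
        kill(pid, SIGKILL);
        waitpid(pid, NULL, 0); /* reap it so no zombie is left behind */
    }
}

/*
 * Register the cleanup handler for SIGABRT. SA_RESETHAND restores the
 * default disposition once the handler has run, so an abort() still
 * terminates the test program afterwards.
 */
static int install_abrt_cleanup(pid_t child)
{
    struct sigaction sa;

    assert(child > 0);
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = abrt_kill_child;
    sa.sa_flags = SA_RESETHAND;
    sigemptyset(&sa.sa_mask);

    tracked_child = child;
    return sigaction(SIGABRT, &sa, NULL);
}
```

This is portable across POSIX systems, but it only helps for exits that go
through SIGABRT. It does nothing if the test program dies from a signal for
which no handler is installed.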


Re: [PATCH] tests/qtest: kill off QEMU with SIGKILL when qtest exits abnormally
Posted by Thomas Huth 1 year, 11 months ago
On 13/05/2022 16.47, Thomas Huth wrote:
> On 13/05/2022 16.37, Daniel P. Berrangé wrote:
>> If a qtest program exits without calling qtest_quit(), then the
>> QEMU emulator process will remain running in the background forever.
>>
>> Unfortunately this scenario is exactly what will happen when a
>> g_assert() check triggers an abort().
>>
>> Prior to switching to use of 'meson test', this problem would
>> cause tap-driver.pl to hang forever. It was waiting for its
>> STDIN to report EOF, but that would never happen due to the
>> orphaned QEMU emulator processes keeping the pipe open forever.
>> Fortunately this doesn't happen with meson, but it is still
>> desirable to not leak QEMU processes when asserts fire.
>>
>> Using the Linux specific prctl(PR_SET_PDEATHSIG) syscall, we
>> can ensure that QEMU gets sent SIGKILL as soon as the controlling
>> qtest exits, despite being daemonized.
>>
>> Note, technically the death signal is sent when the *thread* that
>> called fork() exits. IOW, if you are calling qtest_init() in one
>> thread, letting that thread exit, and then expecting to run
>> qtest_quit() in a different thread, things are not going to work
>> out. Fortunately that is not a scenario that exists in qtests,
>> as pairs of qtest_init and qtest_quit are always called from the
>> same thread.
>>
>> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
>> ---
>>   tests/qtest/libqtest.c | 18 ++++++++++++++++++
>>   1 file changed, 18 insertions(+)
>>
>> diff --git a/tests/qtest/libqtest.c b/tests/qtest/libqtest.c
>> index 228357f1ea..553e82e492 100644
>> --- a/tests/qtest/libqtest.c
>> +++ b/tests/qtest/libqtest.c
>> @@ -19,6 +19,9 @@
>>   #include <sys/socket.h>
>>   #include <sys/wait.h>
>>   #include <sys/un.h>
>> +#ifdef __linux__
>> +#include <sys/prctl.h>
>> +#endif /* __linux__ */
>>   #include "libqtest.h"
>>   #include "libqmp.h"
>> @@ -301,6 +304,21 @@ QTestState *qtest_init_without_qmp_handshake(const char *extra_args)
>>       s->expected_status = 0;
>>       s->qemu_pid = fork();
>>       if (s->qemu_pid == 0) {
>> +#ifdef __linux__
>> +        /*
>> +         * If the controlling qtest process exits without calling
>> +         * the qtest_quit() method, the QEMU processes will get
>> +         * orphaned and remain running forever in the background.
>> +         *
>> +         * Missing qtest_quit() calls are, unfortunately, exactly
>> +         * what happens when a g_assert() check triggers abort() in
>> +         * a failing test scenario.
>> +         *
>> +         * This PR_SET_PDEATHSIG setup will ensure QEMU will
>> +         * get terminated with SIGKILL.
>> +         */
>> +        prctl(PR_SET_PDEATHSIG, SIGKILL, 0, 0, 0);
>> +#endif /* __linux__ */
>>           if (!g_setenv("QEMU_AUDIO_DRV", "none", true)) {
>>               exit(1);
>>           }
> 
> Would it make sense to install a signal handler for SIGABRT instead and make 
> sure that we tear down the QEMU instance there? ... that would then also 
> work for other non-Linux operating systems?

Wait, we're doing that already ... why doesn't it work for your case?

  Thomas



Re: [PATCH] tests/qtest: kill off QEMU with SIGKILL when qtest exits abnormally
Posted by Daniel P. Berrangé 1 year, 11 months ago
On Fri, May 13, 2022 at 04:49:09PM +0200, Thomas Huth wrote:
> On 13/05/2022 16.47, Thomas Huth wrote:
> > On 13/05/2022 16.37, Daniel P. Berrangé wrote:
> > > If a qtest program exits without calling qtest_quit(), then the
> > > QEMU emulator process will remain running in the background forever.
> > > 
> > > Unfortunately this scenario is exactly what will happen when a
> > > g_assert() check triggers an abort().
> > > 
> > > Prior to switching to use of 'meson test', this problem would
> > > cause tap-driver.pl to hang forever. It was waiting for its
> > > STDIN to report EOF, but that would never happen due to the
> > > orphaned QEMU emulator processes keeping the pipe open forever.
> > > Fortunately this doesn't happen with meson, but it is still
> > > desirable to not leak QEMU processes when asserts fire.
> > > 
> > > Using the Linux specific prctl(PR_SET_PDEATHSIG) syscall, we
> > > can ensure that QEMU gets sent SIGKILL as soon as the controlling
> > > qtest exits, despite being daemonized.
> > > 
> > > Note, technically the death signal is sent when the *thread* that
> > > called fork() exits. IOW, if you are calling qtest_init() in one
> > > thread, letting that thread exit, and then expecting to run
> > > qtest_quit() in a different thread, things are not going to work
> > > out. Fortunately that is not a scenario that exists in qtests,
> > > as pairs of qtest_init and qtest_quit are always called from the
> > > same thread.
> > > 
> > > Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> > > ---
> > >   tests/qtest/libqtest.c | 18 ++++++++++++++++++
> > >   1 file changed, 18 insertions(+)
> > > 
> > > diff --git a/tests/qtest/libqtest.c b/tests/qtest/libqtest.c
> > > index 228357f1ea..553e82e492 100644
> > > --- a/tests/qtest/libqtest.c
> > > +++ b/tests/qtest/libqtest.c
> > > @@ -19,6 +19,9 @@
> > >   #include <sys/socket.h>
> > >   #include <sys/wait.h>
> > >   #include <sys/un.h>
> > > +#ifdef __linux__
> > > +#include <sys/prctl.h>
> > > +#endif /* __linux__ */
> > >   #include "libqtest.h"
> > >   #include "libqmp.h"
> > > @@ -301,6 +304,21 @@ QTestState *qtest_init_without_qmp_handshake(const char *extra_args)
> > >       s->expected_status = 0;
> > >       s->qemu_pid = fork();
> > >       if (s->qemu_pid == 0) {
> > > +#ifdef __linux__
> > > +        /*
> > > +         * If the controlling qtest process exits without calling
> > > +         * the qtest_quit() method, the QEMU processes will get
> > > +         * orphaned and remain running forever in the background.
> > > +         *
> > > +         * Missing qtest_quit() calls are, unfortunately, exactly
> > > +         * what happens when a g_assert() check triggers abort() in
> > > +         * a failing test scenario.
> > > +         *
> > > +         * This PR_SET_PDEATHSIG setup will ensure QEMU will
> > > +         * get terminated with SIGKILL.
> > > +         */
> > > +        prctl(PR_SET_PDEATHSIG, SIGKILL, 0, 0, 0);
> > > +#endif /* __linux__ */
> > >           if (!g_setenv("QEMU_AUDIO_DRV", "none", true)) {
> > >               exit(1);
> > >           }
> > 
> > Would it make sense to install a signal handler for SIGABRT instead and
> > make sure that we tear down the QEMU instance there? ... that would then
> > also work for other non-Linux operating systems?
> 
> Wait, we're doing that already ... why doesn't it work for your case?

Oops, hook_list_is_empty() has inverted logic, so the SIGABRT handler
never gets registered.

In any case, the SIGABRT handler only sends SIGTERM, so there's a chance
QEMU still might not exit. Or the test program can fail with SIGSEGV,
in which case we can't safely run any cleanup code. So the prctl()
feels useful regardless.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
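
The concern that SIGTERM alone may not be enough can be illustrated with a
small sketch of a SIGTERM-then-SIGKILL escalation for the ordinary cleanup
path. This is an assumption about how one might structure it, not a
description of what libqtest does. The helper name terminate_child() and
the 10 ms polling interval are hypothetical. It does not help when the test
program itself crashes, which is the case the prctl() approach covers.

```c
#define _POSIX_C_SOURCE 200809L /* for kill(), nanosleep() under strict -std */
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: send SIGTERM, poll for up to timeout_ms for the
 * child to exit, then fall back to SIGKILL. Returns the wait status of
 * the child, or -1 if waitpid() fails.
 */
static int terminate_child(pid_t pid, int timeout_ms)
{
    const struct timespec step = { 0, 10 * 1000 * 1000 }; /* 10 ms */
    int status = 0;
    int waited;

    assert(pid > 0);
    kill(pid, SIGTERM);
    for (waited = 0; waited < timeout_ms; waited += 10) {
        pid_t r = waitpid(pid, &status, WNOHANG);
        if (r == pid) {
            return status; /* exited in response to SIGTERM */
        }
        if (r < 0) {
            return -1;
        }
        nanosleep(&step, NULL);
    }

    /* The child ignored or blocked SIGTERM: SIGKILL cannot be ignored. */
    kill(pid, SIGKILL);
    if (waitpid(pid, &status, 0) != pid) {
        return -1;
    }
    return status;
}
```

The caller can inspect the result with WIFSIGNALED() and WTERMSIG() to see
whether the child went away on SIGTERM or had to be killed.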