[libvirt] [PATCH 5/5] libvirtd: Fix order of cleanup processing

John Ferlan posted 5 patches 7 years, 6 months ago
[libvirt] [PATCH 5/5] libvirtd: Fix order of cleanup processing
Posted by John Ferlan 7 years, 6 months ago
Current cleanup processing is ad-hoc at best - it's led to a couple of
strange and hard to diagnose timing problems and crashes.

So rather than perform cleanup in a somewhat random order, let's
perform cleanup in the exact opposite order of startup.

Signed-off-by: John Ferlan <jferlan@redhat.com>
---
 daemon/libvirtd.c | 42 +++++++++++++++++++++++++++++-------------
 1 file changed, 29 insertions(+), 13 deletions(-)

diff --git a/daemon/libvirtd.c b/daemon/libvirtd.c
index 87c5b22710..92037e9070 100644
--- a/daemon/libvirtd.c
+++ b/daemon/libvirtd.c
@@ -1524,9 +1524,35 @@ int main(int argc, char **argv) {
                 0, "shutdown", NULL, NULL);
 
  cleanup:
-    virNetlinkEventServiceStopAll();
+    /* NB: Order for cleanup should attempt to kept in the inverse order
+     * was was used to create/start the daemon - there are some fairly
+     * important reliances built into the startup processing that use
+     * refcnt's in order to manage objects. Removing too early could
+     * cause crashes. */
     virNetDaemonClose(dmn);
+
+    virNetlinkEventServiceStopAll();
+
+    if (driversInitialized) {
+        /* Set via daemonRunStateInit thread in daemonStateInit.
+         * NB: It is possible that virNetlinkEventServerStart fails
+         * and we jump to cleanup before driversInitialized has
+         * been set. That could leave things inconsistent; however,
+         * resolution of that possibility is perhaps more trouble than
+         * it's worth to handle. */
+        driversInitialized = false;
+        virStateCleanup();
+    }
+
+    virObjectUnref(dmn);
+
     virNetlinkShutdown();
+
+    if (pid_file_fd != -1)
+        virPidFileReleasePath(pid_file, pid_file_fd);
+
+    VIR_FREE(run_dir);
+
     if (statuswrite != -1) {
         if (ret != 0) {
             /* Tell parent of daemon what failed */
@@ -1537,25 +1563,15 @@ int main(int argc, char **argv) {
         }
         VIR_FORCE_CLOSE(statuswrite);
     }
-    if (pid_file_fd != -1)
-        virPidFileReleasePath(pid_file, pid_file_fd);
 
     VIR_FREE(sock_file);
     VIR_FREE(sock_file_ro);
     VIR_FREE(sock_file_adm);
+
     VIR_FREE(pid_file);
-    VIR_FREE(remote_config_file);
-    VIR_FREE(run_dir);
 
+    VIR_FREE(remote_config_file);
     daemonConfigFree(config);
 
-    if (driversInitialized) {
-        driversInitialized = false;
-        virStateCleanup();
-    }
-    /* Now that the hypervisor shutdown inhibition functions that use
-     * 'dmn' as a parameter are done, we can finally unref 'dmn' */
-    virObjectUnref(dmn);
-
     return ret;
 }
-- 
2.13.6

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH 5/5] libvirtd: Fix order of cleanup processing
Posted by Erik Skultety 7 years, 6 months ago
On Tue, Nov 07, 2017 at 09:39:56PM -0500, John Ferlan wrote:
> Current cleanup processing is ad-hoc at best - it's led to a couple of
> strange and hard to diagnose timing problems and crashes.
>
> So rather than perform cleanup in a somewhat random order, let's
> perform cleanup in the exact opposite order of startup.
>
> Signed-off-by: John Ferlan <jferlan@redhat.com>
> ---
>  daemon/libvirtd.c | 42 +++++++++++++++++++++++++++++-------------
>  1 file changed, 29 insertions(+), 13 deletions(-)
>
> diff --git a/daemon/libvirtd.c b/daemon/libvirtd.c
> index 87c5b22710..92037e9070 100644
> --- a/daemon/libvirtd.c
> +++ b/daemon/libvirtd.c
> @@ -1524,9 +1524,35 @@ int main(int argc, char **argv) {
>                  0, "shutdown", NULL, NULL);
>
>   cleanup:
> -    virNetlinkEventServiceStopAll();
> +    /* NB: Order for cleanup should attempt to kept in the inverse order
> +     * was was used to create/start the daemon - there are some fairly
> +     * important reliances built into the startup processing that use
> +     * refcnt's in order to manage objects. Removing too early could
> +     * cause crashes. */

^Not a very useful  commentary, more suitable for the commit message - so I
preferred if you'd drop it from here.

>      virNetDaemonClose(dmn);
> +
> +    virNetlinkEventServiceStopAll();
> +
> +    if (driversInitialized) {
> +        /* Set via daemonRunStateInit thread in daemonStateInit.
> +         * NB: It is possible that virNetlinkEventServerStart fails
> +         * and we jump to cleanup before driversInitialized has
> +         * been set. That could leave things inconsistent; however,
> +         * resolution of that possibility is perhaps more trouble than
> +         * it's worth to handle. */

True, I guess, nevertheless it's not very useful either...

...

> +    virObjectUnref(dmn);

Why ^here? Just because we're doing the cleanup the exact opposite way? The
approach is right, but I think in general cleanup routines should be run first
followed by releasing all the objects, so this should stay where it is right
now - at the very end.

I agree with the rest of the hunks.

Reviewed-by: Erik Skultety <eskultet@redhat.com>

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH 5/5] libvirtd: Fix order of cleanup processing
Posted by John Ferlan 7 years, 6 months ago

On 11/10/2017 10:23 AM, Erik Skultety wrote:
> On Tue, Nov 07, 2017 at 09:39:56PM -0500, John Ferlan wrote:
>> Current cleanup processing is ad-hoc at best - it's led to a couple of
>> strange and hard to diagnose timing problems and crashes.
>>
>> So rather than perform cleanup in a somewhat random order, let's
>> perform cleanup in the exact opposite order of startup.
>>
>> Signed-off-by: John Ferlan <jferlan@redhat.com>
>> ---
>>  daemon/libvirtd.c | 42 +++++++++++++++++++++++++++++-------------
>>  1 file changed, 29 insertions(+), 13 deletions(-)
>>
>> diff --git a/daemon/libvirtd.c b/daemon/libvirtd.c
>> index 87c5b22710..92037e9070 100644
>> --- a/daemon/libvirtd.c
>> +++ b/daemon/libvirtd.c
>> @@ -1524,9 +1524,35 @@ int main(int argc, char **argv) {
>>                  0, "shutdown", NULL, NULL);
>>
>>   cleanup:
>> -    virNetlinkEventServiceStopAll();
>> +    /* NB: Order for cleanup should attempt to kept in the inverse order
>> +     * was was used to create/start the daemon - there are some fairly
>> +     * important reliances built into the startup processing that use
>> +     * refcnt's in order to manage objects. Removing too early could
>> +     * cause crashes. */
> 
> ^Not a very useful  commentary, more suitable for the commit message - so I
> preferred if you'd drop it from here.
> 

I like it here because I won't necessarily go find the commit that added
this code in order to know that we need to be very careful and orderly
about the cleanup order.

The only time I'd go back to a commit message is to perhaps point out
why the order is the way it is when some other patch f*ks it up ;-)

>>      virNetDaemonClose(dmn);
>> +
>> +    virNetlinkEventServiceStopAll();
>> +
>> +    if (driversInitialized) {
>> +        /* Set via daemonRunStateInit thread in daemonStateInit.
>> +         * NB: It is possible that virNetlinkEventServerStart fails
>> +         * and we jump to cleanup before driversInitialized has
>> +         * been set. That could leave things inconsistent; however,
>> +         * resolution of that possibility is perhaps more trouble than
>> +         * it's worth to handle. */
> 
> True, I guess, nevertheless it's not very useful either...
> 
> ...
> 

It would be for someone chasing a very rogue and odd driver core. I
think it was important to know/understand that there is an open window
here which someone could drive a truck through.  Maybe someone else has
a more wonderful idea or set of patches that would avoid this... I could
add an XXX in front of which seems to be our norm of - someone someday
when they have copious free time and wants to pull their hair out could
actually fix this problem.

>> +    virObjectUnref(dmn);
> 
> Why ^here? Just because we're doing the cleanup the exact opposite way? The
> approach is right, but I think in general cleanup routines should be run first
> followed by releasing all the objects, so this should stay where it is right
> now - at the very end.

Nothing after this theoretically relies on @dmn still existing.  Nothing
else uses it or has a reference to it....

John

> 
> I agree with the rest of the hunks.
> 
> Reviewed-by: Erik Skultety <eskultet@redhat.com>
> 

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH 5/5] libvirtd: Fix order of cleanup processing
Posted by Erik Skultety 7 years, 5 months ago
On Fri, Nov 10, 2017 at 05:51:26PM -0500, John Ferlan wrote:
>
>
> On 11/10/2017 10:23 AM, Erik Skultety wrote:
> > On Tue, Nov 07, 2017 at 09:39:56PM -0500, John Ferlan wrote:
> >> Current cleanup processing is ad-hoc at best - it's led to a couple of
> >> strange and hard to diagnose timing problems and crashes.
> >>
> >> So rather than perform cleanup in a somewhat random order, let's
> >> perform cleanup in the exact opposite order of startup.
> >>
> >> Signed-off-by: John Ferlan <jferlan@redhat.com>
> >> ---
> >>  daemon/libvirtd.c | 42 +++++++++++++++++++++++++++++-------------
> >>  1 file changed, 29 insertions(+), 13 deletions(-)
> >>
> >> diff --git a/daemon/libvirtd.c b/daemon/libvirtd.c
> >> index 87c5b22710..92037e9070 100644
> >> --- a/daemon/libvirtd.c
> >> +++ b/daemon/libvirtd.c
> >> @@ -1524,9 +1524,35 @@ int main(int argc, char **argv) {
> >>                  0, "shutdown", NULL, NULL);
> >>
> >>   cleanup:
> >> -    virNetlinkEventServiceStopAll();
> >> +    /* NB: Order for cleanup should attempt to kept in the inverse order
> >> +     * was was used to create/start the daemon - there are some fairly
> >> +     * important reliances built into the startup processing that use
> >> +     * refcnt's in order to manage objects. Removing too early could
> >> +     * cause crashes. */
> >
> > ^Not a very useful  commentary, more suitable for the commit message - so I
> > preferred if you'd drop it from here.
> >
>
> I like it here because I won't necessarily go find the commit that added
> this code in order to know that we need to be very careful and orderly
> about the cleanup order.
>
> The only time I'd go back to a commit message is to perhaps point out
> why the order is the way it is when some other patch f*ks it up ;-)

Well, I think that releasing memory in the inverse order should be an implicit
rationale, you could at least shorten the commentary to something as simple as
"Cleanup order needs to be kept strictly in the inverse order"...

>
> >>      virNetDaemonClose(dmn);
> >> +
> >> +    virNetlinkEventServiceStopAll();
> >> +
> >> +    if (driversInitialized) {
> >> +        /* Set via daemonRunStateInit thread in daemonStateInit.
> >> +         * NB: It is possible that virNetlinkEventServerStart fails
> >> +         * and we jump to cleanup before driversInitialized has
> >> +         * been set. That could leave things inconsistent; however,
> >> +         * resolution of that possibility is perhaps more trouble than
> >> +         * it's worth to handle. */
> >
> > True, I guess, nevertheless it's not very useful either...
> >
> > ...
> >
>
> It would be for someone chasing a very rogue and odd driver core. I
> think it was important to know/understand that there is an open window
> here which someone could drive a truck through.

Maybe I'm just paranoid enough to expect such a window to exist everywhere...
not a fan of the commentary though...


Reviewed-by: Erik Skultety <eskultet@redhat.com>

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH 5/5] libvirtd: Fix order of cleanup processing
Posted by John Ferlan 7 years, 5 months ago

On 11/16/2017 02:58 AM, Erik Skultety wrote:
> On Fri, Nov 10, 2017 at 05:51:26PM -0500, John Ferlan wrote:
>>
>>
>> On 11/10/2017 10:23 AM, Erik Skultety wrote:
>>> On Tue, Nov 07, 2017 at 09:39:56PM -0500, John Ferlan wrote:
>>>> Current cleanup processing is ad-hoc at best - it's led to a couple of
>>>> strange and hard to diagnose timing problems and crashes.
>>>>
>>>> So rather than perform cleanup in a somewhat random order, let's
>>>> perform cleanup in the exact opposite order of startup.
>>>>
>>>> Signed-off-by: John Ferlan <jferlan@redhat.com>
>>>> ---
>>>>  daemon/libvirtd.c | 42 +++++++++++++++++++++++++++++-------------
>>>>  1 file changed, 29 insertions(+), 13 deletions(-)
>>>>
>>>> diff --git a/daemon/libvirtd.c b/daemon/libvirtd.c
>>>> index 87c5b22710..92037e9070 100644
>>>> --- a/daemon/libvirtd.c
>>>> +++ b/daemon/libvirtd.c
>>>> @@ -1524,9 +1524,35 @@ int main(int argc, char **argv) {
>>>>                  0, "shutdown", NULL, NULL);
>>>>
>>>>   cleanup:
>>>> -    virNetlinkEventServiceStopAll();
>>>> +    /* NB: Order for cleanup should attempt to kept in the inverse order
>>>> +     * was was used to create/start the daemon - there are some fairly
>>>> +     * important reliances built into the startup processing that use
>>>> +     * refcnt's in order to manage objects. Removing too early could
>>>> +     * cause crashes. */
>>>
>>> ^Not a very useful  commentary, more suitable for the commit message - so I
>>> preferred if you'd drop it from here.
>>>
>>
>> I like it here because I won't necessarily go find the commit that added
>> this code in order to know that we need to be very careful and orderly
>> about the cleanup order.
>>
>> The only time I'd go back to a commit message is to perhaps point out
>> why the order is the way it is when some other patch f*ks it up ;-)
> 
> Well, I think that releasing memory in the inverse order should be an implicit
> rationale, you could at least shorten the commentary to something as simple as
> "Cleanup order needs to be kept strictly in the inverse order"...
> 

I don't disagree about implicit rationale - unfortunately as seen
in this cleanup reality is further from those logical thoughts.

>>
>>>>      virNetDaemonClose(dmn);
>>>> +
>>>> +    virNetlinkEventServiceStopAll();
>>>> +
>>>> +    if (driversInitialized) {
>>>> +        /* Set via daemonRunStateInit thread in daemonStateInit.
>>>> +         * NB: It is possible that virNetlinkEventServerStart fails
>>>> +         * and we jump to cleanup before driversInitialized has
>>>> +         * been set. That could leave things inconsistent; however,
>>>> +         * resolution of that possibility is perhaps more trouble than
>>>> +         * it's worth to handle. */
>>>
>>> True, I guess, nevertheless it's not very useful either...
>>>
>>> ...
>>>
>>
>> It would be for someone chasing a very rogue and odd driver core. I
>> think it was important to know/understand that there is an open window
>> here which someone could drive a truck through.
> 
> Maybe I'm just paranoid enough to expect such a window to exist everywhere...
> not a fan of the commentary though...
> 
> 
> Reviewed-by: Erik Skultety <eskultet@redhat.com>
> 


Fair enough - point made.  I'll adjust as follows and add the extra
commentary about drivers initialized in the commit message for anyone
that cares to go back through history... FWIW, all of cleanup:

 cleanup:
    /* Keep cleanup order in inverse order of startup */
    virNetDaemonClose(dmn);

    virNetlinkEventServiceStopAll();

    if (driversInitialized) {
        /* NB: Possible issue with timing window between driversInitialized
         * setting if virNetlinkEventServerStart fails */
        driversInitialized = false;
        virStateCleanup();
    }

    virObjectUnref(adminProgram);
    virObjectUnref(srvAdm);
    virObjectUnref(qemuProgram);
    virObjectUnref(lxcProgram);
    virObjectUnref(remoteProgram);
    virObjectUnref(srv);
    virObjectUnref(dmn);

    virNetlinkShutdown();

    if (pid_file_fd != -1)
        virPidFileReleasePath(pid_file, pid_file_fd);

    VIR_FREE(run_dir);

    if (statuswrite != -1) {
        if (ret != 0) {
            /* Tell parent of daemon what failed */
            char status = ret;
            while (write(statuswrite, &status, 1) == -1 &&
                   errno == EINTR)
                ;
        }
        VIR_FORCE_CLOSE(statuswrite);
    }

    VIR_FREE(sock_file);
    VIR_FREE(sock_file_ro);
    VIR_FREE(sock_file_adm);

    VIR_FREE(pid_file);

    VIR_FREE(remote_config_file);
    daemonConfigFree(config);

    return ret;


John

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list