Currently both virtlogd and virtlockd use a single worker thread for
dispatching RPC messages. Even this is overkill: their RPC message
handling callbacks all run in short, finite time, so blocking the
main loop is not an issue like you'd see in libvirtd with long running
QEMU commands.

By setting max_workers==0, we can turn off the worker thread and run
these daemons single threaded. This in turn fixes a serious problem in
the virtlockd daemon whereby it loses all fcntl() locks at re-exec due
to multiple threads existing. fcntl() locks only get preserved if the
process is single threaded at the time of exec().
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 src/locking/lock_daemon.c | 4 ++--
 src/logging/log_daemon.c  | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/src/locking/lock_daemon.c b/src/locking/lock_daemon.c
index 79ab90fc91..7afff42246 100644
--- a/src/locking/lock_daemon.c
+++ b/src/locking/lock_daemon.c
@@ -165,7 +165,7 @@ virLockDaemonNew(virLockDaemonConfigPtr config, bool privileged)
         goto error;
 
     if (!(srv = virNetServerNew("virtlockd", 1,
-                                1, 1, 0, config->max_clients,
+                                0, 0, 0, config->max_clients,
                                 config->max_clients, -1, 0,
                                 NULL,
                                 virLockDaemonClientNew,
@@ -180,7 +180,7 @@ virLockDaemonNew(virLockDaemonConfigPtr config, bool privileged)
     srv = NULL;
 
     if (!(srv = virNetServerNew("admin", 1,
-                                1, 1, 0, config->admin_max_clients,
+                                0, 0, 0, config->admin_max_clients,
                                 config->admin_max_clients, -1, 0,
                                 NULL,
                                 remoteAdmClientNew,
diff --git a/src/logging/log_daemon.c b/src/logging/log_daemon.c
index d54d26ab9d..35d7ebb6d2 100644
--- a/src/logging/log_daemon.c
+++ b/src/logging/log_daemon.c
@@ -154,7 +154,7 @@ virLogDaemonNew(virLogDaemonConfigPtr config, bool privileged)
         goto error;
 
     if (!(srv = virNetServerNew("virtlogd", 1,
-                                1, 1, 0, config->max_clients,
+                                0, 0, 0, config->max_clients,
                                 config->max_clients, -1, 0,
                                 NULL,
                                 virLogDaemonClientNew,
@@ -169,7 +169,7 @@ virLogDaemonNew(virLogDaemonConfigPtr config, bool privileged)
     srv = NULL;
 
     if (!(srv = virNetServerNew("admin", 1,
-                                1, 1, 0, config->admin_max_clients,
+                                0, 0, 0, config->admin_max_clients,
                                 config->admin_max_clients, -1, 0,
                                 NULL,
                                 remoteAdmClientNew,
--
2.14.3
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
On 03/06/2018 12:58 PM, Daniel P. Berrangé wrote:
> Currently both virtlogd and virtlockd use a single worker thread for
> dispatching RPC messages. Even this is overkill and their RPC message
> handling callbacks all run in short, finite time and so blocking the
> main loop is not an issue like you'd see in libvirtd with long running
> QEMU commands.
>
> By setting max_workers==0, we can turn off the worker thread and run
> these daemons single threaded. This in turn fixes a serious problem in
> the virtlockd daemon whereby it looses all fcntl() locks at re-exec due

s/looses/loses

> to multiple threads existing. fcntl() locks only get preserved if the
> process is single threaded at time of exec().
>
> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> ---
>  src/locking/lock_daemon.c | 4 ++--
>  src/logging/log_daemon.c  | 4 ++--
>  2 files changed, 4 insertions(+), 4 deletions(-)

Reviewed-by: John Ferlan <jferlan@redhat.com>

John
On 03/06/2018 10:58 AM, Daniel P. Berrangé wrote:
> Currently both virtlogd and virtlockd use a single worker thread for
> dispatching RPC messages. Even this is overkill and their RPC message
> handling callbacks all run in short, finite time and so blocking the
> main loop is not an issue like you'd see in libvirtd with long running
> QEMU commands.
>
> By setting max_workers==0, we can turn off the worker thread and run
> these daemons single threaded. This in turn fixes a serious problem in
> the virtlockd daemon whereby it looses all fcntl() locks at re-exec due
> to multiple threads existing. fcntl() locks only get preserved if the
> process is single threaded at time of exec().

I suppose this change has no effect when e.g. starting many domains in
parallel when locking is enabled. Before the change, there's still only
one worker thread to process requests.

I've tested the series and locks are now preserved across re-execs of
virtlockd. Question is whether we want this change or pursue fixing the
underlying kernel bug?

FYI, via the non-public bug I asked a glibc maintainer about the lost
lock behavior. He agreed it is a kernel bug and posted the below comment
to the bug.

Regards,
Jim

First, I agree that POSIX file record locks (i.e. the fcntl F_SETLK
ones, which you're using) _are_ to be preserved over execve (absent any
FD_CLOEXEC of course, which you aren't using). (Relevant quote from
fcntl(2):

    Record locks are not inherited by a child created via fork(2),
    but are preserved across an execve(2).)

Second I agree that the existence or non-existence of threads must not
play a role in the above.

Third I see this from strace of your reproducer (only relevant
snippets, with a bit of explanation):

  13:54:09.581429 clone(child_stack=0x7f74b0517ff0, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7f74b05189d0, tls=0x7f74b0518700, child_tidptr=0x7f74b05189d0) = 30911
  Process 30911 attached

So, here we created the thread 30911. PID 30910 is now taking the lock:

  [pid 30910] 13:54:09.581562 open("lock.txt", O_WRONLY|O_CREAT|O_TRUNC, 0755 <unfinished ...>
  [pid 30911] 13:54:09.581595 set_robust_list(0x7f74b05189e0, 24 <unfinished ...>
  [pid 30910] 13:54:09.581647 <... open resumed> ) = 3
  [pid 30911] 13:54:09.581671 <... set_robust_list resumed> ) = 0
  [pid 30910] 13:54:09.581693 fcntl(3, F_SETLK, {type=F_WRLCK, whence=SEEK_SET, start=0, len=42} <unfinished ...>
  [pid 30911] 13:54:09.581722 rt_sigprocmask(SIG_BLOCK, [CHLD], <unfinished ...>
  [pid 30910] 13:54:09.581750 <... fcntl resumed> ) = 0

Lock acquired. Now we do a little debug output and sleeping, and the
only actions on FD 3 during this are:

  [pid 30910] 13:54:09.581790 fcntl(3, F_GETFD <unfinished ...>
  [pid 30910] 13:54:09.581872 <... fcntl resumed> ) = 0
  [pid 30910] 13:54:09.581911 fcntl(3, F_SETFD, 0 <unfinished ...>
  [pid 30910] 13:54:09.581958 <... fcntl resumed> ) = 0

Then comes the execve from 30910, with no intervening thread exit or
the like. But of course during the execve the thread 30911 is killed:

  [pid 30910] 13:54:19.582600 execve("/suse/matz/lock-strangeness", ["/suse/matz/lock-strangeness", "3"], [/* 0 vars */] <unfinished ...>
  [pid 30911] 13:54:19.583422 +++ exited with 0 +++
  13:54:19.583749 <... execve resumed> ) = 0
  13:54:19.583845 brk(0) = 0xc0c000

So, 30910 is alone again and is executing, loading the shared libs,
relocating and so on. The first action done to FD 3 (retained over the
execve) is:

  13:54:19.588016 fcntl(3, F_GETLK, {type=F_UNLCK, whence=SEEK_SET, start=0, len=42, pid=0}) = 0
  13:54:19.588101 fcntl(3, F_GETFD) = 0
  13:54:19.588480 fcntl(3, F_SETFD, FD_CLOEXEC) = 0

Bleah! F_UNLCK we get, which isn't what we're supposed to get. As there
are no intervening syscalls that could in any way have removed the lock
(in fact there are none other to file descriptor 3) it can't be glibc
or libpthread (or anything else userspace is doing) that would have
caused the lock to be removed. It's the kernel itself, and hence a bug
therein.

If I may hazard a guess it would be that the forced exits of all
non-main threads done by the kernel somehow lead to cleaning up the
locks, possibly because the actions that need doing during fork(2)
(which also includes other-threads-exit, but also lock-removal) are
conflated with the actions that need happen during execve(2) (which
must _not_ include lock-removal).
On Tue, Mar 06, 2018 at 04:46:05PM -0700, Jim Fehlig wrote:
> On 03/06/2018 10:58 AM, Daniel P. Berrangé wrote:
> > Currently both virtlogd and virtlockd use a single worker thread for
> > dispatching RPC messages. Even this is overkill and their RPC message
> > handling callbacks all run in short, finite time and so blocking the
> > main loop is not an issue like you'd see in libvirtd with long running
> > QEMU commands.
> >
> > By setting max_workers==0, we can turn off the worker thread and run
> > these daemons single threaded. This in turn fixes a serious problem in
> > the virtlockd daemon whereby it looses all fcntl() locks at re-exec due
> > to multiple threads existing. fcntl() locks only get preserved if the
> > process is single threaded at time of exec().
>
> I suppose this change has no affect when e.g. starting many domains in
> parallel when locking is enabled. Before the change, there's still only
> one worker thread to process requests.
>
> I've tested the series and locks are now preserved across re-execs of
> virtlockd. Question is whether we want this change or pursue fixing the
> underlying kernel bug?
>
> FYI, via the non-public bug I asked a glibc maintainer about the lost
> lock behavior. He agreed it is a kernel bug and posted the below comment
> to the bug.
>
> Regards,
> Jim
>
> First, I agree that POSIX file record locks (i.e. the fcntl F_SETLK
> ones, which you're using) _are_ to be preserved over execve (absent any
> FD_CLOEXEC of course, which you aren't using). (Relevant quote from
> fcntl(2):
>
>     Record locks are not inherited by a child created via fork(2),
>     but are preserved across an execve(2).
>
> Second I agree that the existence or non-existence of threads must not
> play a role in the above.

I've asked some Red Hat experts too and they suggest it looks like a
kernel bug. The question is whether this is a recent kernel regression,
that is easily fixed, or whether it's a long term problem.

I've at least verified that this broken behaviour existed in RHEL-7 (but
it's possible it was backported when OFD locks were implemented). I still
want to test RHEL-6 and RHEL-5 to see if this problem goes back
indefinitely.

My inclination though is that we'll need to work around the problem in
libvirt regardless.

Regards,
Daniel

--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
On Wed, Mar 07, 2018 at 10:10:29AM +0000, Daniel P. Berrangé wrote:
> On Tue, Mar 06, 2018 at 04:46:05PM -0700, Jim Fehlig wrote:
> > On 03/06/2018 10:58 AM, Daniel P. Berrangé wrote:
> > > Currently both virtlogd and virtlockd use a single worker thread for
> > > dispatching RPC messages. Even this is overkill and their RPC message
> > > handling callbacks all run in short, finite time and so blocking the
> > > main loop is not an issue like you'd see in libvirtd with long running
> > > QEMU commands.
> > >
> > > By setting max_workers==0, we can turn off the worker thread and run
> > > these daemons single threaded. This in turn fixes a serious problem in
> > > the virtlockd daemon whereby it looses all fcntl() locks at re-exec due
> > > to multiple threads existing. fcntl() locks only get preserved if the
> > > process is single threaded at time of exec().
> >
> > I suppose this change has no affect when e.g. starting many domains in
> > parallel when locking is enabled. Before the change, there's still only
> > one worker thread to process requests.
> >
> > I've tested the series and locks are now preserved across re-execs of
> > virtlockd. Question is whether we want this change or pursue fixing the
> > underlying kernel bug?
> >
> > FYI, via the non-public bug I asked a glibc maintainer about the lost
> > lock behavior. He agreed it is a kernel bug and posted the below comment
> > to the bug.
> >
> > Regards,
> > Jim
> >
> > First, I agree that POSIX file record locks (i.e. the fcntl F_SETLK
> > ones, which you're using) _are_ to be preserved over execve (absent any
> > FD_CLOEXEC of course, which you aren't using). (Relevant quote from
> > fcntl(2):
> >
> >     Record locks are not inherited by a child created via fork(2),
> >     but are preserved across an execve(2).
> >
> > Second I agree that the existence or non-existence of threads must not
> > play a role in the above.
>
> I've asked some Red Hat experts too and they suggest it looks like a
> kernel bug. The question is whether this is a recent kernel regression,
> that is easily fixed, or whether its a long term problem.
>
> I've at least verified that this broken behaviour existed in RHEL-7 (but
> its possible it was backported when OFD locks were implemented). I still
> want to test RHEL-6 and RHEL-5 to see if this problem goes back
> indefinitely.

I've checked RHEL6 & RHEL5 and both are affected, so this is a
long-standing Linux problem, and so we'll need to work around it.

FYI I've got a kernel bug open here to track it from the RHEL side:

  https://bugzilla.redhat.com/show_bug.cgi?id=1552621

Regards,
Daniel

--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
On 03/07/2018 06:07 AM, Daniel P. Berrangé wrote:
> On Wed, Mar 07, 2018 at 10:10:29AM +0000, Daniel P. Berrangé wrote:
>> On Tue, Mar 06, 2018 at 04:46:05PM -0700, Jim Fehlig wrote:
>>> On 03/06/2018 10:58 AM, Daniel P. Berrangé wrote:
>>>> Currently both virtlogd and virtlockd use a single worker thread for
>>>> dispatching RPC messages. Even this is overkill and their RPC message
>>>> handling callbacks all run in short, finite time and so blocking the
>>>> main loop is not an issue like you'd see in libvirtd with long running
>>>> QEMU commands.
>>>>
>>>> By setting max_workers==0, we can turn off the worker thread and run
>>>> these daemons single threaded. This in turn fixes a serious problem in
>>>> the virtlockd daemon whereby it looses all fcntl() locks at re-exec due
>>>> to multiple threads existing. fcntl() locks only get preserved if the
>>>> process is single threaded at time of exec().
>>>
>>> I suppose this change has no affect when e.g. starting many domains in
>>> parallel when locking is enabled. Before the change, there's still only
>>> one worker thread to process requests.
>>>
>>> I've tested the series and locks are now preserved across re-execs of
>>> virtlockd. Question is whether we want this change or pursue fixing the
>>> underlying kernel bug?
>>>
>>> FYI, via the non-public bug I asked a glibc maintainer about the lost
>>> lock behavior. He agreed it is a kernel bug and posted the below comment
>>> to the bug.
>>>
>>> Regards,
>>> Jim
>>>
>>> First, I agree that POSIX file record locks (i.e. the fcntl F_SETLK
>>> ones, which you're using) _are_ to be preserved over execve (absent any
>>> FD_CLOEXEC of course, which you aren't using). (Relevant quote from
>>> fcntl(2):
>>>
>>>     Record locks are not inherited by a child created via fork(2),
>>>     but are preserved across an execve(2).
>>>
>>> Second I agree that the existence or non-existence of threads must not
>>> play a role in the above.
>>
>> I've asked some Red Hat experts too and they suggest it looks like a
>> kernel bug. The question is whether this is a recent kernel regression,
>> that is easily fixed, or whether its a long term problem.
>>
>> I've at least verified that this broken behaviour existed in RHEL-7 (but
>> its possible it was backported when OFD locks were implemented). I still
>> want to test RHEL-6 and RHEL-5 to see if this problem goes back
>> indefinitely.
>
> I've checked RHEL6 & RHEL5 and both are affected, so this a long time Linux
> problem, and so we'll need to workaround it.

We have some vintage distros around for long term support and I managed
to "bisect" the problem a bit: The reproducer works on kernel 2.6.16 but
breaks on 2.6.32.

> FYI I've got kernel bug open here to track it from RHEL side:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1552621

Thanks!

Regards,
Jim
On 03/06/2018 10:58 AM, Daniel P. Berrangé wrote:
> Currently both virtlogd and virtlockd use a single worker thread for
> dispatching RPC messages. Even this is overkill and their RPC message
> handling callbacks all run in short, finite time and so blocking the
> main loop is not an issue like you'd see in libvirtd with long running
> QEMU commands.
>
> By setting max_workers==0, we can turn off the worker thread and run
> these daemons single threaded. This in turn fixes a serious problem in
> the virtlockd daemon whereby it looses all fcntl() locks at re-exec due
> to multiple threads existing. fcntl() locks only get preserved if the
> process is single threaded at time of exec().
>
> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> ---
>  src/locking/lock_daemon.c | 4 ++--
>  src/logging/log_daemon.c  | 4 ++--
>  2 files changed, 4 insertions(+), 4 deletions(-)

Given the outcome of the discussions,

Reviewed-by: Jim Fehlig <jfehlig@suse.com>

Regards,
Jim

>
> diff --git a/src/locking/lock_daemon.c b/src/locking/lock_daemon.c
> index 79ab90fc91..7afff42246 100644
> --- a/src/locking/lock_daemon.c
> +++ b/src/locking/lock_daemon.c
> @@ -165,7 +165,7 @@ virLockDaemonNew(virLockDaemonConfigPtr config, bool privileged)
>          goto error;
>
>      if (!(srv = virNetServerNew("virtlockd", 1,
> -                                1, 1, 0, config->max_clients,
> +                                0, 0, 0, config->max_clients,
>                                  config->max_clients, -1, 0,
>                                  NULL,
>                                  virLockDaemonClientNew,
> @@ -180,7 +180,7 @@ virLockDaemonNew(virLockDaemonConfigPtr config, bool privileged)
>      srv = NULL;
>
>      if (!(srv = virNetServerNew("admin", 1,
> -                                1, 1, 0, config->admin_max_clients,
> +                                0, 0, 0, config->admin_max_clients,
>                                  config->admin_max_clients, -1, 0,
>                                  NULL,
>                                  remoteAdmClientNew,
> diff --git a/src/logging/log_daemon.c b/src/logging/log_daemon.c
> index d54d26ab9d..35d7ebb6d2 100644
> --- a/src/logging/log_daemon.c
> +++ b/src/logging/log_daemon.c
> @@ -154,7 +154,7 @@ virLogDaemonNew(virLogDaemonConfigPtr config, bool privileged)
>          goto error;
>
>      if (!(srv = virNetServerNew("virtlogd", 1,
> -                                1, 1, 0, config->max_clients,
> +                                0, 0, 0, config->max_clients,
>                                  config->max_clients, -1, 0,
>                                  NULL,
>                                  virLogDaemonClientNew,
> @@ -169,7 +169,7 @@ virLogDaemonNew(virLogDaemonConfigPtr config, bool privileged)
>      srv = NULL;
>
>      if (!(srv = virNetServerNew("admin", 1,
> -                                1, 1, 0, config->admin_max_clients,
> +                                0, 0, 0, config->admin_max_clients,
>                                  config->admin_max_clients, -1, 0,
>                                  NULL,
>                                  remoteAdmClientNew,