[libvirt] [PATCH] nodedev: Document the udevEventHandleThread

John Ferlan posted 1 patch 5 years, 5 months ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/libvirt tags/patchew/20181019120443.2733-1-jferlan@redhat.com
There is a newer version of this series
src/node_device/node_device_udev.c | 49 ++++++++++++++++++++++++++++++
1 file changed, 49 insertions(+)
[libvirt] [PATCH] nodedev: Document the udevEventHandleThread
Posted by John Ferlan 5 years, 5 months ago
Add details of the algorithm and why it was used to help future
readers understand the issues encountered. Based largely off a
description provided by Erik Skultety <eskultet@redhat.com>.

Signed-off-by: John Ferlan <jferlan@redhat.com>
---

 see: https://www.redhat.com/archives/libvir-list/2018-October/msg00915.html

 src/node_device/node_device_udev.c | 49 ++++++++++++++++++++++++++++++
 1 file changed, 49 insertions(+)

diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c
index 22897591de..8ca81e65ad 100644
--- a/src/node_device/node_device_udev.c
+++ b/src/node_device/node_device_udev.c
@@ -1591,6 +1591,53 @@ udevEventMonitorSanityCheck(udevEventDataPtr priv,
 }
 
 
+/**
+ * udevEventHandleThread
+ * @opaque: unused
+ *
+ * Thread to handle the udevEventHandleCallback processing when udev
+ * tells us there's a device change for us (add, modify, delete, etc).
+ *
+ * Management of the processing udev device notification is a delicate
+ * balance between the udev process, the scheduler, and when exactly the
+ * data from/for the socket is received. The udev processing code notifies
+ * the udevEventHandleCallback that it has data; however, the actual recv()
+ * done in the udev code to fill in the @device structure can be delayed
+ * due to how threads are scheduled.
+ *
+ * As it turns out, the event loop could be scheduled (and it in fact
+ * was) earlier than the handler thread. What that essentially means is
+ * that by the time the thread actually handled the event and read the
+ * data from the monitor, the event loop fired the very same event, simply
+ * because the data hadn't been retrieved from the socket at that point yet.
+ *
+ * Thus this algorithm is that once udevEventHandleCallback is notified
+ * there is data, this loop will process as much data as possible until
+ * udev_monitor_receive_device fails to get the @device. This may be
+ * because we've processed everything or because the scheduling thread
+ * hasn't been able to populate data in the socket. Once we're done
+ * processing everything we can, wait on the next event/notification.
+ * Although this may incur a slight delay if the socket data wasn't
+ * populated, the event/notifications are fairly consistent so there's
+ * no need to use virCondWaitUntil.
+ *
+ * NB: Usage of event based socket algorithm causes some issues with
+ * older platforms such as CentOS 6 in which the data socket is opened
+ * without the NONBLOCK bit set. Still even if the reset of priv->dataReady
+ * were moved to just after the udev_monitor_receive_device in order to
+ * avoid the NONBLOCK issue, the scheduler would still come into play
+ * especially for environments when multiple devices are added into the
+ * system. Even those environments would still be blocked on the udev
+ * socket recv() call. The algorithm for handling that scenario would
+ * be more complex and involve pulling the monitor object out of the
+ * private data lockable object and would need to be guarded by a
+ * separate lock. Devising algorithms to work around issues with older
+ * OS's at the expense of more modern OS algorithms in newer event
+ * processing code may result in unexpected issues, so the choice is
+ * to encourage use of newer OS's with newer udev event processing code.
+ * Newer devices, such as mdevs make heavy usage of event processing
+ * and it's expected future devices would follow that model.
+ */
 static void
 udevEventHandleThread(void *opaque ATTRIBUTE_UNUSED)
 {
@@ -1637,6 +1684,8 @@ udevEventHandleThread(void *opaque ATTRIBUTE_UNUSED)
                 return;
             }
 
+            /* See the above description why this needs to be here
+             * and not after the udev_monitor_receive_device. */
             virObjectLock(priv);
             priv->dataReady = false;
             virObjectUnlock(priv);
-- 
2.17.2

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH] nodedev: Document the udevEventHandleThread
Posted by Erik Skultety 5 years, 4 months ago
On Fri, Oct 19, 2018 at 08:04:43AM -0400, John Ferlan wrote:
> Add details of the algorithm and why it was used to help future
> readers understand the issues encountered. Based largely off a
> description provided by Erik Skultety <eskultet@redhat.com>.

Since the link to the archive would be missing from the commit message, I feel
like simply blaming commit cdbe13329a7 for lacking documentation/justification
of the algorithm used is the right way of composing the commit message.

I appreciate the effort to document it properly, although I feel it's a bit
overwhelming, so the comments I have below is just a sheer attempt to reduce
the amount of text documenting a single function, you know, if a function needs
a comment like that, it suggests that there's something wrong with that
function...

>
> Signed-off-by: John Ferlan <jferlan@redhat.com>
> ---
>
>  see: https://www.redhat.com/archives/libvir-list/2018-October/msg00915.html
>
>  src/node_device/node_device_udev.c | 49 ++++++++++++++++++++++++++++++
>  1 file changed, 49 insertions(+)
>
> diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c
> index 22897591de..8ca81e65ad 100644
> --- a/src/node_device/node_device_udev.c
> +++ b/src/node_device/node_device_udev.c
> @@ -1591,6 +1591,53 @@ udevEventMonitorSanityCheck(udevEventDataPtr priv,
>  }
>
>
> +/**
> + * udevEventHandleThread
> + * @opaque: unused
> + *
> + * Thread to handle the udevEventHandleCallback processing when udev
> + * tells us there's a device change for us (add, modify, delete, etc).
> + *
> + * Management of the processing udev device notification is a delicate
> + * balance between the udev process, the scheduler, and when exactly the
> + * data from/for the socket is received.

I'd drop ^this sentence, the issue with this unfortunate mix of factors is
explained further down, let's try keeping the amount of commentary down.

> + * The udev processing code notifies
> + * the udevEventHandleCallback that it has data; however, the actual recv()
> + * done in the udev code to fill in the @device structure can be delayed
> + * due to how threads are scheduled.

How about:
"When we get notified by udev that there are data to be processed, the actual
data retrieval by libudev may be delayed due to how threads are scheduled."

> + *
> + * As it turns out, the event loop could be scheduled (and it in fact
> + * was) earlier than the handler thread. What that essentially means is
> + * that by the time the thread actually handled the event and read the
> + * data from the monitor, the event loop fired the very same event, simply
> + * because the data hadn't been retrieved from the socket at that point yet.

^This doesn't need to be a separate paragraph, you can connect it to the
previous one. Also, some words don't feel right for commentary purposes (I know
they're mine :D). How about:

"In fact, the event loop could be scheduled earlier than the handler thread,
thus potentially emitting the very same event the handler thread is currently
trying to process, simply because the data hadn't been retrieved from the
socket at that time yet."

> + *
> + * Thus this algorithm is that once udevEventHandleCallback is notified
> + * there is data,

^this can be dropped...

> + * this loop will process as much data as possible until
> + * udev_monitor_receive_device fails to get the @device. This may be
> + * because we've processed everything or because the scheduling thread
> + * hasn't been able to populate data in the socket. Once we're done
> + * processing everything we can, wait on the next event/notification.

How about:

"Instead of waiting for the next event after retrieving a device data we keep
reading from the udev monitor until we encounter one of EAGAIN or EWOULDBLOCK
errors, only then we'll wait for the next event."

Also, I'd move that bit into the function body, so it's directly tied to
the piece of code responsible for it.

> + * Although this may incur a slight delay if the socket data wasn't
> + * populated, the event/notifications are fairly consistent so there's
> + * no need to use virCondWaitUntil.

^This could be dropped.

> + *
> + * NB: Usage of event based socket algorithm causes some issues with
> + * older platforms such as CentOS 6 in which the data socket is opened
> + * without the NONBLOCK bit set. Still even if the reset of priv->dataReady
> + * were moved to just after the udev_monitor_receive_device in order to
> + * avoid the NONBLOCK issue, the scheduler would still come into play

"Trying to move the reset of the @priv->dataReady flag within the loop wouldn't
help much as the scheduler would still come into play."

Everything written below could go to the commit message itself.


Would you agree to the proposed adjustments?
Erik

> + * especially for environments when multiple devices are added into the
> + * system. Even those environments would still be blocked on the udev
> + * socket recv() call. The algorithm for handling that scenario would
> + * be more complex and involve pulling the monitor object out of the
> + * private data lockable object and would need to be guarded by a
> + * separate lock. Devising algorithms to work around issues with older
> + * OS's at the expense of more modern OS algorithms in newer event
> + * processing code may result in unexpected issues, so the choice is
> + * to encourage use of newer OS's with newer udev event processing code.
> + * Newer devices, such as mdevs make heavy usage of event processing
> + * and it's expected future devices would follow that model.
> + */
>  static void
>  udevEventHandleThread(void *opaque ATTRIBUTE_UNUSED)
>  {
> @@ -1637,6 +1684,8 @@ udevEventHandleThread(void *opaque ATTRIBUTE_UNUSED)
>                  return;
>              }
>
> +            /* See the above description why this needs to be here
> +             * and not after the udev_monitor_receive_device. */
>              virObjectLock(priv);
>              priv->dataReady = false;
>              virObjectUnlock(priv);
> --
> 2.17.2
>
> --
> libvir-list mailing list
> libvir-list@redhat.com
> https://www.redhat.com/mailman/listinfo/libvir-list

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH] nodedev: Document the udevEventHandleThread
Posted by John Ferlan 5 years, 4 months ago

On 11/1/18 7:36 AM, Erik Skultety wrote:
> On Fri, Oct 19, 2018 at 08:04:43AM -0400, John Ferlan wrote:
>> Add details of the algorithm and why it was used to help future
>> readers understand the issues encountered. Based largely off a
>> description provided by Erik Skultety <eskultet@redhat.com>.
> 
> Since the link to the archive would be missing from the commit message, I feel
> like simply blaming commit cdbe13329a7 for lacking documentation/justification
> of the algorithm used is the right way of composing the commit message.
> 

OK - thanks for the commit link...

> I appreciate the effort to document it properly, although I feel it's a bit
> overwhelming, so the comments I have below is just a sheer attempt to reduce
> the amount of text documenting a single function, you know, if a function needs
> a comment like that, it suggests that there's something wrong with that
> function...
> 

Understood - I threw a lot of spaghetti on the wall and altering what's
here is perfectly fine. I didn't want to leave something out that you
may have found important...

BTW: Another explanation for a lengthy function header - it may suggest
that if you're going to post a patch changing it's logic, you had better
know what you're doing ;-)

>>
>> Signed-off-by: John Ferlan <jferlan@redhat.com>
>> ---
>>
>>  see: https://www.redhat.com/archives/libvir-list/2018-October/msg00915.html
>>
>>  src/node_device/node_device_udev.c | 49 ++++++++++++++++++++++++++++++
>>  1 file changed, 49 insertions(+)
>>
>> diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c
>> index 22897591de..8ca81e65ad 100644
>> --- a/src/node_device/node_device_udev.c
>> +++ b/src/node_device/node_device_udev.c
>> @@ -1591,6 +1591,53 @@ udevEventMonitorSanityCheck(udevEventDataPtr priv,
>>  }
>>
>>
>> +/**
>> + * udevEventHandleThread
>> + * @opaque: unused
>> + *
>> + * Thread to handle the udevEventHandleCallback processing when udev
>> + * tells us there's a device change for us (add, modify, delete, etc).
>> + *
>> + * Management of the processing udev device notification is a delicate
>> + * balance between the udev process, the scheduler, and when exactly the
>> + * data from/for the socket is received.
> 
> I'd drop ^this sentence, the issue with this unfortunate mix of factors is
> explained further down, let's try keeping the amount of commentary down.
> 

I'll move it to the commit message ;-)

>> + * The udev processing code notifies
>> + * the udevEventHandleCallback that it has data; however, the actual recv()
>> + * done in the udev code to fill in the @device structure can be delayed
>> + * due to how threads are scheduled.
> 
> How about:
> "When we get notified by udev that there are data to be processed, the actual
> data retrieval by libudev may be delayed due to how threads are scheduled."
> 
> + *
>> + * As it turns out, the event loop could be scheduled (and it in fact
>> + * was) earlier than the handler thread. What that essentially means is
>> + * that by the time the thread actually handled the event and read the
>> + * data from the monitor, the event loop fired the very same event, simply
>> + * because the data hadn't been retrieved from the socket at that point yet.
> 
> ^This doesn't need to be a separate paragraph, you can connect it to the
> previous one. Also, some words don't feel right for commentary purposes (I know
> they're mine :D). How about:
> 
> "In fact, the event loop could be scheduled earlier than the handler thread,
> thus potentially emitting the very same event the handler thread is currently
> trying to process, simply because the data hadn't been retrieved from the
> socket at that time yet."
> 
>> + *
>> + * Thus this algorithm is that once udevEventHandleCallback is notified
>> + * there is data,
> 
> ^this can be dropped...
> 
>> + * this loop will process as much data as possible until
>> + * udev_monitor_receive_device fails to get the @device. This may be
>> + * because we've processed everything or because the scheduling thread
>> + * hasn't been able to populate data in the socket. Once we're done
>> + * processing everything we can, wait on the next event/notification.
> 
> How about:
> 
> "Instead of waiting for the next event after retrieving a device data we keep
> reading from the udev monitor until we encounter one of EAGAIN or EWOULDBLOCK
> errors, only then we'll wait for the next event."
> 
> Also, I'd move that bit into the function body, so it's directly tied to
> the piece of code responsible for it.
> 
>> + * Although this may incur a slight delay if the socket data wasn't
>> + * populated, the event/notifications are fairly consistent so there's
>> + * no need to use virCondWaitUntil.
> 
> ^This could be dropped.
> 
>> + *
>> + * NB: Usage of event based socket algorithm causes some issues with
>> + * older platforms such as CentOS 6 in which the data socket is opened
>> + * without the NONBLOCK bit set. Still even if the reset of priv->dataReady
>> + * were moved to just after the udev_monitor_receive_device in order to
>> + * avoid the NONBLOCK issue, the scheduler would still come into play
> 
> "Trying to move the reset of the @priv->dataReady flag within the loop wouldn't
> help much as the scheduler would still come into play."
> 
> Everything written below could go to the commit message itself.
> 
> 
> Would you agree to the proposed adjustments?

yep, a v2 is on it's way.

John

> Erik
> 
>> + * especially for environments when multiple devices are added into the
>> + * system. Even those environments would still be blocked on the udev
>> + * socket recv() call. The algorithm for handling that scenario would
>> + * be more complex and involve pulling the monitor object out of the
>> + * private data lockable object and would need to be guarded by a
>> + * separate lock. Devising algorithms to work around issues with older
>> + * OS's at the expense of more modern OS algorithms in newer event
>> + * processing code may result in unexpected issues, so the choice is
>> + * to encourage use of newer OS's with newer udev event processing code.
>> + * Newer devices, such as mdevs make heavy usage of event processing
>> + * and it's expected future devices would follow that model.
>> + */
>>  static void
>>  udevEventHandleThread(void *opaque ATTRIBUTE_UNUSED)
>>  {
>> @@ -1637,6 +1684,8 @@ udevEventHandleThread(void *opaque ATTRIBUTE_UNUSED)
>>                  return;
>>              }
>>
>> +            /* See the above description why this needs to be here
>> +             * and not after the udev_monitor_receive_device. */
>>              virObjectLock(priv);
>>              priv->dataReady = false;
>>              virObjectUnlock(priv);
>> --
>> 2.17.2
>>
>> --
>> libvir-list mailing list
>> libvir-list@redhat.com
>> https://www.redhat.com/mailman/listinfo/libvir-list

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list