[PATCH v2] async: the main AioContext is only "current" if under the BQL

Paolo Bonzini posted 1 patch 2 years, 10 months ago
git fetch https://github.com/patchew-project/next-importer-push tags/patchew/20210609122234.544153-1-pbonzini@redhat.com
Maintainers: Kevin Wolf <kwolf@redhat.com>, Fam Zheng <fam@euphon.net>, Stefan Hajnoczi <stefanha@redhat.com>, Max Reitz <mreitz@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>
include/block/aio.h   |  5 ++++-
iothread.c            |  9 +--------
stubs/iothread-lock.c |  2 +-
stubs/iothread.c      |  8 --------
stubs/meson.build     |  1 -
tests/unit/iothread.c |  9 +--------
util/async.c          | 20 ++++++++++++++++++++
util/main-loop.c      |  1 +
8 files changed, 28 insertions(+), 27 deletions(-)
delete mode 100644 stubs/iothread.c
[PATCH v2] async: the main AioContext is only "current" if under the BQL
Posted by Paolo Bonzini 2 years, 10 months ago
If we want to wake up a coroutine from a worker thread, aio_co_wake()
currently does not work.  In that scenario, aio_co_wake() calls
aio_co_enter(), but there is no current AioContext and therefore
qemu_get_current_aio_context() returns the main AioContext.  aio_co_wake()
then attempts to call aio_context_acquire() instead of going through
aio_co_schedule().

The default case of qemu_get_current_aio_context() was added to cover
synchronous I/O started from the vCPU thread, but the main and vCPU
threads are quite different.  The main thread is an I/O thread itself,
only running a more complicated event loop; the vCPU thread instead
is essentially a worker thread that occasionally calls
qemu_mutex_lock_iothread().  It is only in those critical sections
that it acts as if it were the home thread of the main AioContext.

Therefore, this patch detaches qemu_get_current_aio_context() from
iothreads, which is a useless complication.  The AioContext pointer
is stored directly in the thread-local variable, including for the
main loop.  Worker threads (including vCPU threads) optionally behave
as temporary home threads if they have taken the big QEMU lock,
but if that is not the case they will always schedule coroutines
on remote threads via aio_co_schedule().

With this change, qemu_mutex_iothread_locked() must be changed from
true to false.  The previous value of true was needed because the
main thread did not have an AioContext in the thread-local variable,
but now it does have one.

Reported-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 include/block/aio.h   |  5 ++++-
 iothread.c            |  9 +--------
 stubs/iothread-lock.c |  2 +-
 stubs/iothread.c      |  8 --------
 stubs/meson.build     |  1 -
 tests/unit/iothread.c |  9 +--------
 util/async.c          | 20 ++++++++++++++++++++
 util/main-loop.c      |  1 +
 8 files changed, 28 insertions(+), 27 deletions(-)
 delete mode 100644 stubs/iothread.c
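
For reviewers, the lookup order that the util/async.c hunk introduces can be
modelled standalone.  This is only a sketch under stated assumptions:
AioContext is reduced to a name, bql_held stands in for
qemu_mutex_iothread_locked(), and the model_* names are hypothetical, not
QEMU APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for QEMU's AioContext; only identity matters. */
typedef struct AioContext {
    const char *name;
} AioContext;

/* Stand-in for the main loop AioContext. */
static AioContext main_ctx = { "main" };

/* Per-thread context pointer, as in the patch. */
static _Thread_local AioContext *my_aiocontext;

/* Assumption: models the result of qemu_mutex_iothread_locked(). */
static _Thread_local bool bql_held;

AioContext *model_get_current_aio_context(void)
{
    if (my_aiocontext) {
        /* Main loop or IOThread: the context was registered at startup. */
        return my_aiocontext;
    }
    if (bql_held) {
        /* Possibly a vCPU thread inside a BQL critical section. */
        return &main_ctx;
    }
    /* Plain worker thread: no current context. */
    return NULL;
}

void model_set_current_aio_context(AioContext *ctx)
{
    /* The context of a thread is set once and never replaced. */
    assert(!my_aiocontext);
    my_aiocontext = ctx;
}
```

A thread that never registered a context gets NULL unless it holds the lock,
and a registered thread gets its own context regardless of the lock.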

diff --git a/include/block/aio.h b/include/block/aio.h
index 5f342267d5..10fcae1515 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -691,10 +691,13 @@ void aio_co_enter(AioContext *ctx, struct Coroutine *co);
  * Return the AioContext whose event loop runs in the current thread.
  *
  * If called from an IOThread this will be the IOThread's AioContext.  If
- * called from another thread it will be the main loop AioContext.
+ * called from the main thread or with the "big QEMU lock" taken it
+ * will be the main loop AioContext.
  */
 AioContext *qemu_get_current_aio_context(void);
 
+void qemu_set_current_aio_context(AioContext *ctx);
+
 /**
  * aio_context_setup:
  * @ctx: the aio context
diff --git a/iothread.c b/iothread.c
index 7f086387be..2c5ccd7367 100644
--- a/iothread.c
+++ b/iothread.c
@@ -39,13 +39,6 @@ DECLARE_CLASS_CHECKERS(IOThreadClass, IOTHREAD,
 #define IOTHREAD_POLL_MAX_NS_DEFAULT 0ULL
 #endif
 
-static __thread IOThread *my_iothread;
-
-AioContext *qemu_get_current_aio_context(void)
-{
-    return my_iothread ? my_iothread->ctx : qemu_get_aio_context();
-}
-
 static void *iothread_run(void *opaque)
 {
     IOThread *iothread = opaque;
@@ -56,7 +49,7 @@ static void *iothread_run(void *opaque)
      * in this new thread uses glib.
      */
     g_main_context_push_thread_default(iothread->worker_context);
-    my_iothread = iothread;
+    qemu_set_current_aio_context(iothread->ctx);
     iothread->thread_id = qemu_get_thread_id();
     qemu_sem_post(&iothread->init_done_sem);
 
diff --git a/stubs/iothread-lock.c b/stubs/iothread-lock.c
index 2a6efad64a..5b45b7fc8b 100644
--- a/stubs/iothread-lock.c
+++ b/stubs/iothread-lock.c
@@ -3,7 +3,7 @@
 
 bool qemu_mutex_iothread_locked(void)
 {
-    return true;
+    return false;
 }
 
 void qemu_mutex_lock_iothread_impl(const char *file, int line)
diff --git a/stubs/iothread.c b/stubs/iothread.c
deleted file mode 100644
index 8cc9e28c55..0000000000
--- a/stubs/iothread.c
+++ /dev/null
@@ -1,8 +0,0 @@
-#include "qemu/osdep.h"
-#include "block/aio.h"
-#include "qemu/main-loop.h"
-
-AioContext *qemu_get_current_aio_context(void)
-{
-    return qemu_get_aio_context();
-}
diff --git a/stubs/meson.build b/stubs/meson.build
index 65c22c0568..4993797f05 100644
--- a/stubs/meson.build
+++ b/stubs/meson.build
@@ -16,7 +16,6 @@ stub_ss.add(files('fw_cfg.c'))
 stub_ss.add(files('gdbstub.c'))
 stub_ss.add(files('get-vm-name.c'))
 stub_ss.add(when: 'CONFIG_LINUX_IO_URING', if_true: files('io_uring.c'))
-stub_ss.add(files('iothread.c'))
 stub_ss.add(files('iothread-lock.c'))
 stub_ss.add(files('isa-bus.c'))
 stub_ss.add(files('is-daemonized.c'))
diff --git a/tests/unit/iothread.c b/tests/unit/iothread.c
index afde12b4ef..f9b0791084 100644
--- a/tests/unit/iothread.c
+++ b/tests/unit/iothread.c
@@ -30,13 +30,6 @@ struct IOThread {
     bool stopping;
 };
 
-static __thread IOThread *my_iothread;
-
-AioContext *qemu_get_current_aio_context(void)
-{
-    return my_iothread ? my_iothread->ctx : qemu_get_aio_context();
-}
-
 static void iothread_init_gcontext(IOThread *iothread)
 {
     GSource *source;
@@ -54,9 +47,9 @@ static void *iothread_run(void *opaque)
 
     rcu_register_thread();
 
-    my_iothread = iothread;
     qemu_mutex_lock(&iothread->init_done_lock);
     iothread->ctx = aio_context_new(&error_abort);
+    qemu_set_current_aio_context(iothread->ctx);
 
     /*
      * We must connect the ctx to a GMainContext, because in older versions
diff --git a/util/async.c b/util/async.c
index 674dbefb7c..5d9b7cc1eb 100644
--- a/util/async.c
+++ b/util/async.c
@@ -649,3 +649,23 @@ void aio_context_release(AioContext *ctx)
 {
     qemu_rec_mutex_unlock(&ctx->lock);
 }
+
+static __thread AioContext *my_aiocontext;
+
+AioContext *qemu_get_current_aio_context(void)
+{
+    if (my_aiocontext) {
+        return my_aiocontext;
+    }
+    if (qemu_mutex_iothread_locked()) {
+        /* Possibly in a vCPU thread.  */
+        return qemu_get_aio_context();
+    }
+    return NULL;
+}
+
+void qemu_set_current_aio_context(AioContext *ctx)
+{
+    assert(!my_aiocontext);
+    my_aiocontext = ctx;
+}
diff --git a/util/main-loop.c b/util/main-loop.c
index d9c55df6f5..4ae5b23e99 100644
--- a/util/main-loop.c
+++ b/util/main-loop.c
@@ -170,6 +170,7 @@ int qemu_init_main_loop(Error **errp)
     if (!qemu_aio_context) {
         return -EMFILE;
     }
+    qemu_set_current_aio_context(qemu_aio_context);
     qemu_notify_bh = qemu_bh_new(notify_event_cb, NULL);
     gpollfds = g_array_new(FALSE, FALSE, sizeof(GPollFD));
     src = aio_get_g_source(qemu_aio_context);
-- 
2.31.1


Re: [PATCH v2] async: the main AioContext is only "current" if under the BQL
Posted by Vladimir Sementsov-Ogievskiy 2 years, 10 months ago
09.06.2021 15:22, Paolo Bonzini wrote:
> If we want to wake up a coroutine from a worker thread, aio_co_wake()
> currently does not work.  In that scenario, aio_co_wake() calls
> aio_co_enter(), but there is no current AioContext and therefore
> qemu_get_current_aio_context() returns the main thread.  aio_co_wake()
> then attempts to call aio_context_acquire() instead of going through
> aio_co_schedule().
> 
> The default case of qemu_get_current_aio_context() was added to cover
> synchronous I/O started from the vCPU thread, but the main and vCPU
> threads are quite different.  The main thread is an I/O thread itself,
> only running a more complicated event loop; the vCPU thread instead
> is essentially a worker thread that occasionally calls
> qemu_mutex_lock_iothread().  It is only in those critical sections
> that it acts as if it were the home thread of the main AioContext.
> 
> Therefore, this patch detaches qemu_get_current_aio_context() from
> iothreads, which is a useless complication.  The AioContext pointer
> is stored directly in the thread-local variable, including for the
> main loop.  Worker threads (including vCPU threads) optionally behave
> as temporary home threads if they have taken the big QEMU lock,
> but if that is not the case they will always schedule coroutines
> on remote threads via aio_co_schedule().
> 
> With this change, qemu_mutex_iothread_locked() must be changed from

Did you miss "qemu_mutex_iothread_locked() stub function"?

> true to false.  The previous value of true was needed because the
> main thread did not have an AioContext in the thread-local variable,
> but now it does have one.
> 
> Reported-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

This is a weak review, since I'm not well versed in iothreads and I don't know whether all existing callers of qemu_get_current_aio_context() are OK with the new behavior. Technically it looks OK:

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

Also, it works well with my upcoming nbd series, thanks:

Tested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

Just to be sure: the AioContext of an iothread is set only once and never changes, right?


> ---
>   include/block/aio.h   |  5 ++++-
>   iothread.c            |  9 +--------
>   stubs/iothread-lock.c |  2 +-
>   stubs/iothread.c      |  8 --------
>   stubs/meson.build     |  1 -
>   tests/unit/iothread.c |  9 +--------
>   util/async.c          | 20 ++++++++++++++++++++
>   util/main-loop.c      |  1 +
>   8 files changed, 28 insertions(+), 27 deletions(-)
>   delete mode 100644 stubs/iothread.c
> 
> diff --git a/include/block/aio.h b/include/block/aio.h
> index 5f342267d5..10fcae1515 100644
> --- a/include/block/aio.h
> +++ b/include/block/aio.h
> @@ -691,10 +691,13 @@ void aio_co_enter(AioContext *ctx, struct Coroutine *co);
>    * Return the AioContext whose event loop runs in the current thread.
>    *
>    * If called from an IOThread this will be the IOThread's AioContext.  If
> - * called from another thread it will be the main loop AioContext.
> + * called from the main thread or with the "big QEMU lock" taken it
> + * will be the main loop AioContext.

Worth adding: "In other cases, return NULL"?
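
To illustrate why the NULL case matters to callers, here is a minimal
standalone sketch of the first check that the commit message describes for
aio_co_enter(). It is a simplified model, not the QEMU code: AioContext is a
hypothetical stand-in, the WakePath enum and model_wake_path() are invented
for illustration, and the real function handles further cases (for example
when the caller is itself running in a coroutine).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for QEMU's AioContext. */
typedef struct AioContext {
    int id;
} AioContext;

typedef enum {
    WAKE_SCHEDULE,        /* hand over to the home thread (aio_co_schedule path) */
    WAKE_ENTER_DIRECTLY   /* acquire the context and enter the coroutine here */
} WakePath;

/*
 * A coroutine is entered directly only when its home context is also the
 * caller's current context.  A NULL current context never matches a valid
 * home context, so a plain worker thread always takes the schedule path.
 */
WakePath model_wake_path(const AioContext *home, const AioContext *current)
{
    assert(home != NULL);
    return home != current ? WAKE_SCHEDULE : WAKE_ENTER_DIRECTLY;
}
```

With the old default, a worker thread would have passed the main context as
"current" and taken the direct path for coroutines whose home is the main
context; with NULL it takes the schedule path.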


-- 
Best regards,
Vladimir

Re: [PATCH v2] async: the main AioContext is only "current" if under the BQL
Posted by Paolo Bonzini 2 years, 10 months ago
On 09/06/21 17:01, Vladimir Sementsov-Ogievskiy wrote:
> 09.06.2021 15:22, Paolo Bonzini wrote:
>> If we want to wake up a coroutine from a worker thread, aio_co_wake()
>> currently does not work.  In that scenario, aio_co_wake() calls
>> aio_co_enter(), but there is no current AioContext and therefore
>> qemu_get_current_aio_context() returns the main thread.  aio_co_wake()
>> then attempts to call aio_context_acquire() instead of going through
>> aio_co_schedule().
>>
>> The default case of qemu_get_current_aio_context() was added to cover
>> synchronous I/O started from the vCPU thread, but the main and vCPU
>> threads are quite different.  The main thread is an I/O thread itself,
>> only running a more complicated event loop; the vCPU thread instead
>> is essentially a worker thread that occasionally calls
>> qemu_mutex_lock_iothread().  It is only in those critical sections
>> that it acts as if it were the home thread of the main AioContext.
>>
>> Therefore, this patch detaches qemu_get_current_aio_context() from
>> iothreads, which is a useless complication.  The AioContext pointer
>> is stored directly in the thread-local variable, including for the
>> main loop.  Worker threads (including vCPU threads) optionally behave
>> as temporary home threads if they have taken the big QEMU lock,
>> but if that is not the case they will always schedule coroutines
>> on remote threads via aio_co_schedule().
>>
>> With this change, qemu_mutex_iothread_locked() must be changed from
> 
> Did you miss "qemu_mutex_iothread_locked() stub function"?

Yes, that comment refers to the stub.  Let me resubmit with a testcase 
and I'll fix that too.

Paolo


Re: [PATCH v2] async: the main AioContext is only "current" if under the BQL
Posted by Eric Blake 2 years, 10 months ago
On Fri, Jun 11, 2021 at 01:14:27PM +0200, Paolo Bonzini wrote:
> > > With this change, qemu_mutex_iothread_locked() must be changed from
> > 
> > Did you miss "qemu_mutex_iothread_locked() stub function"?
> 
> Yes, that comment refers to the stub.  Let me resubmit with a testcase and
> I'll fix that too.

Vladimir's v4 series reworking nbd connection depends on this one.
I'll wait for your v3 to land, but once it does, I'm happy to queue it
through my NBD tree along with Vladimir's work, if that makes it
easier.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org