This field is accessed by multiple threads without a lock. Use explicit
qatomic_read()/qatomic_set() calls. There is no need for acquire/release
because blk_set_disable_request_queuing() doesn't provide any
guarantees (it helps that it's used at BlockBackend creation time and
not when there is I/O in flight).

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
---
 block/block-backend.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index 68807be32b..0cba4add20 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -82,7 +82,7 @@ struct BlockBackend {
 
     int quiesce_counter; /* atomic: written under BQL, read by other threads */
     CoQueue queued_requests;
-    bool disable_request_queuing;
+    bool disable_request_queuing; /* atomic */
 
     VMChangeStateEntry *vmsh;
     bool force_allow_inactivate;
@@ -1232,7 +1232,7 @@ void blk_set_allow_aio_context_change(BlockBackend *blk, bool allow)
 void blk_set_disable_request_queuing(BlockBackend *blk, bool disable)
 {
     IO_CODE();
-    blk->disable_request_queuing = disable;
+    qatomic_set(&blk->disable_request_queuing, disable);
 }
 
 static int coroutine_fn GRAPH_RDLOCK
@@ -1271,7 +1271,8 @@ static void coroutine_fn blk_wait_while_drained(BlockBackend *blk)
 {
     assert(blk->in_flight > 0);
 
-    if (qatomic_read(&blk->quiesce_counter) && !blk->disable_request_queuing) {
+    if (qatomic_read(&blk->quiesce_counter) &&
+        !qatomic_read(&blk->disable_request_queuing)) {
         blk_dec_in_flight(blk);
         qemu_co_queue_wait(&blk->queued_requests, NULL);
         blk_inc_in_flight(blk);
--
2.39.2
On 3/7/23 22:04, Stefan Hajnoczi wrote:
>  static int coroutine_fn GRAPH_RDLOCK
> @@ -1271,7 +1271,8 @@ static void coroutine_fn blk_wait_while_drained(BlockBackend *blk)
>  {
>      assert(blk->in_flight > 0);
> 
> -    if (qatomic_read(&blk->quiesce_counter) && !blk->disable_request_queuing) {
> +    if (qatomic_read(&blk->quiesce_counter) &&
> +        !qatomic_read(&blk->disable_request_queuing)) {

The qatomic_inc in blk_inc_in_flight() made me a bit nervous that
smp_mb__after_rmw() was needed there, but it's okay.

First, blk_wait_while_drained() only has to pause the device _eventually_,
not immediately. Even if it misses that blk->quiesce_counter == 1, the I/O
will proceed and it'll just take a little more polling before
bdrv_drained_begin() exits.

Second, I checked with CPPMEM that the barriers in AIO_WAIT_WHILE() and
aio_wait_kick() save the day, even if the load of blk->quiesce_counter is
reordered before the incremented value (1) is stored to blk->in_flight.

The CPPMEM model here uses mo_relaxed to force all possible kinds of havoc:

int main() {
  atomic_int quiesce_counter = 0;
  atomic_int waiters = 0;
  atomic_int in_flight = 0;

  {{{ { quiesce_counter.store(1, mo_relaxed);
        waiters.store(1, mo_relaxed);                    // AIO_WAIT_WHILE starts here
        atomic_thread_fence(mo_seq_cst);
        in_flight.load(mo_relaxed).readsvalue(1); }      // if 1, sleep

  ||| { in_flight.store(1, mo_relaxed);                  // bdrv_inc_in_flight
        quiesce_counter.load(mo_relaxed).readsvalue(1);  // go down "if"
        in_flight.store(0, mo_release);                  // bdrv_dec_in_flight
        atomic_thread_fence(mo_seq_cst);                 // aio_wait_kick starts here
        waiters.load(mo_relaxed).readsvalue(0); }        // if 0, do not wake
  }}};

  return 0;
}

Because CPPMEM shows no execution consistent with the buggy .readsvalue(),
either AIO_WAIT_WHILE() will not go to sleep or it will be woken up with
in_flight == 0. The polling loop then ends and drained_end restarts the
coroutine from blk->queued_requests.
Paolo

>          blk_dec_in_flight(blk);
>          qemu_co_queue_wait(&blk->queued_requests, NULL);
>          blk_inc_in_flight(blk);
> --
> 2.39.2
On Thu, Mar 09, 2023 at 12:18:00PM +0100, Paolo Bonzini wrote:
> On 3/7/23 22:04, Stefan Hajnoczi wrote:
> >  static int coroutine_fn GRAPH_RDLOCK
> >  @@ -1271,7 +1271,8 @@ static void coroutine_fn blk_wait_while_drained(BlockBackend *blk)
> >  {
> >      assert(blk->in_flight > 0);
> > 
> > -    if (qatomic_read(&blk->quiesce_counter) && !blk->disable_request_queuing) {
> > +    if (qatomic_read(&blk->quiesce_counter) &&
> > +        !qatomic_read(&blk->disable_request_queuing)) {
>
> The qatomic_inc in blk_inc_in_flight() made me a bit nervous that
> smp_mb__after_rmw() was needed there, but it's okay.

Yes. I wrote it under the assumption that sequentially consistent
operations like qatomic_inc() are implicit barriers.

> First, blk_wait_while_drained() only has to pause the device _eventually_,
> not immediately. Even if it misses that blk->quiesce_counter == 1, the I/O
> will proceed and it'll just take a little more polling before
> bdrv_drained_begin() exits.
>
> Second, I checked with CPPMEM that the barriers in AIO_WAIT_WHILE() and
> aio_wait_kick() save the day, even if the load of blk->quiesce_counter is
> reordered before the incremented value (1) is stored to blk->in_flight.
>
> The CPPMEM model here uses mo_relaxed to force all possible kinds of havoc:
>
> int main() {
>   atomic_int quiesce_counter = 0;
>   atomic_int waiters = 0;
>   atomic_int in_flight = 0;
>
>   {{{ { quiesce_counter.store(1, mo_relaxed);
>         waiters.store(1, mo_relaxed);                    // AIO_WAIT_WHILE starts here
>         atomic_thread_fence(mo_seq_cst);
>         in_flight.load(mo_relaxed).readsvalue(1); }      // if 1, sleep
>
>   ||| { in_flight.store(1, mo_relaxed);                  // bdrv_inc_in_flight
>         quiesce_counter.load(mo_relaxed).readsvalue(1);  // go down "if"
>         in_flight.store(0, mo_release);                  // bdrv_dec_in_flight
>         atomic_thread_fence(mo_seq_cst);                 // aio_wait_kick starts here
>         waiters.load(mo_relaxed).readsvalue(0); }        // if 0, do not wake
>   }}};
>
>   return 0;
> }
>
> Because CPPMEM shows no execution consistent with the buggy .readsvalue(),
> either AIO_WAIT_WHILE() will not go to sleep or it will be woken up with
> in_flight == 0. The polling loop then ends and drained_end restarts the
> coroutine from blk->queued_requests.

Okay.

Stefan
On 3/7/23 22:04, Stefan Hajnoczi wrote:
> This field is accessed by multiple threads without a lock. Use explicit
> qatomic_read()/qatomic_set() calls. There is no need for acquire/release
> because blk_set_disable_request_queuing() doesn't provide any
> guarantees (it helps that it's used at BlockBackend creation time and
> not when there is I/O in flight).

This in turn means it does not need to be atomic - atomics are only
needed if there are concurrent writes. No big deal; I am now resurrecting
the series from the time I had noticed the queued_requests thread-safety
problem, so this will be simplified in 8.1. For now your version is okay,
thanks for fixing it!

Paolo

> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
> ---
>  block/block-backend.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/block/block-backend.c b/block/block-backend.c
> index 68807be32b..0cba4add20 100644
> --- a/block/block-backend.c
> +++ b/block/block-backend.c
> @@ -82,7 +82,7 @@ struct BlockBackend {
> 
>      int quiesce_counter; /* atomic: written under BQL, read by other threads */
>      CoQueue queued_requests;
> -    bool disable_request_queuing;
> +    bool disable_request_queuing; /* atomic */
> 
>      VMChangeStateEntry *vmsh;
>      bool force_allow_inactivate;
> @@ -1232,7 +1232,7 @@ void blk_set_allow_aio_context_change(BlockBackend *blk, bool allow)
>  void blk_set_disable_request_queuing(BlockBackend *blk, bool disable)
>  {
>      IO_CODE();
> -    blk->disable_request_queuing = disable;
> +    qatomic_set(&blk->disable_request_queuing, disable);
>  }
> 
>  static int coroutine_fn GRAPH_RDLOCK
> @@ -1271,7 +1271,8 @@ static void coroutine_fn blk_wait_while_drained(BlockBackend *blk)
>  {
>      assert(blk->in_flight > 0);
> 
> -    if (qatomic_read(&blk->quiesce_counter) && !blk->disable_request_queuing) {
> +    if (qatomic_read(&blk->quiesce_counter) &&
> +        !qatomic_read(&blk->disable_request_queuing)) {
>          blk_dec_in_flight(blk);
>          qemu_co_queue_wait(&blk->queued_requests, NULL);
>          blk_inc_in_flight(blk);
On Thu, Mar 09, 2023 at 10:07:40AM +0100, Paolo Bonzini wrote:
> On 3/7/23 22:04, Stefan Hajnoczi wrote:
> > This field is accessed by multiple threads without a lock. Use explicit
> > qatomic_read()/qatomic_set() calls. There is no need for acquire/release
> > because blk_set_disable_request_queuing() doesn't provide any
> > guarantees (it helps that it's used at BlockBackend creation time and
> > not when there is I/O in flight).
>
> This in turn means it does not need to be atomic - atomics are only
> needed if there are concurrent writes. No big deal; I am now resurrecting
> the series from the time I had noticed the queued_requests thread-safety
> problem, so this will be simplified in 8.1. For now your version is okay,
> thanks for fixing it!

I was under the impression that variables accessed by multiple threads
outside a lock or similar primitive need memory_order_relaxed both as
documentation and to tell the compiler that they should indeed be atomic
(but without ordering guarantees).

I think memory_order_relaxed also tells the compiler to do a bit more,
like to generate just a single store to the variable for each occurrence
in the code (ruling out "speculative" and "out-of-thin-air" stores).

It's the documentation part that's most interesting in this case. Do we
not want to identify variables that are accessed outside a lock and
therefore require some thought?

Stefan
On 3/9/23 13:31, Stefan Hajnoczi wrote:
> On Thu, Mar 09, 2023 at 10:07:40AM +0100, Paolo Bonzini wrote:
>> On 3/7/23 22:04, Stefan Hajnoczi wrote:
>>> This field is accessed by multiple threads without a lock. Use explicit
>>> qatomic_read()/qatomic_set() calls. There is no need for acquire/release
>>> because blk_set_disable_request_queuing() doesn't provide any
>>> guarantees (it helps that it's used at BlockBackend creation time and
>>> not when there is I/O in flight).
>>
>> This in turn means it does not need to be atomic - atomics are only
>> needed if there are concurrent writes. No big deal; I am now resurrecting
>> the series from the time I had noticed the queued_requests thread-safety
>> problem, so this will be simplified in 8.1. For now your version is okay,
>> thanks for fixing it!
>
> I was under the impression that variables accessed by multiple threads
> outside a lock or similar primitive need memory_order_relaxed both as
> documentation and to tell the compiler that they should indeed be atomic
> (but without ordering guarantees).

Atomic accesses are needed to avoid data races. Data races are concurrent
accesses, of which at least one is a non-atomic write. (A is concurrent
with B if you can't be sure that A happens before B or vice versa; this
intuitively is the "lock or similar primitive" that you mentioned.
Happens-before changes from one execution to the other, but it is enough
to somehow prove there _is_ an ordering; for example, given two accesses
that are done while a mutex is taken, one will always happen before the
other.)

In this case all writes to disable_request_queuing happen not just
outside I/O, but even *before* the first I/O. No writes that are
concurrent with reads => no need to use atomics for reads.

For example the stdin global variable is accessed from multiple threads
and you would never use atomics to read the pointer. Just don't write to
it and there won't be data races.
> I think memory_order_relaxed also tells the compiler to do a bit more,
> like to generate just a single store to the variable for each occurrence
> in the code ("speculative" and "out-of-thin-air" stores).

The correspondence is not necessarily 1:1; some optimizations are
possible. For example this:

  qatomic_set(&x, 0);
  a = qatomic_read(&x);
  qatomic_set(&x, a + 1);

can be changed to

  a = 0;
  qatomic_set(&x, 1);

(because it is safe to assume that no other thread sees the state where
x == 0). Or the first read here:

  a = qatomic_read(&x);
  a = qatomic_read(&x);

can be optimized out, unlike Linux's READ_ONCE(). I have no idea if
compilers actually perform these optimizations, but if they do they are
neither frequent (maybe in languages that inline a lot more, but not in
QEMU) nor problematic. Even though there's some freedom in
removing/consolidating code, it's true as you said that speculation is a
no-no!

> It's the documentation part that's most interesting in this case. Do we
> not want to identify variables that are accessed outside a lock and
> therefore require some thought?

I think it depends, see the stdin example before. As the API is now, I
agree that using qatomic_read() is the right thing to do. In principle
the flag could bounce back and forth many times. :/

With a better API, the balance may tilt on the side of not using atomics.
We'll see when I post the patch. :)

Paolo
On 7/3/23 22:04, Stefan Hajnoczi wrote:
> This field is accessed by multiple threads without a lock. Use explicit
> qatomic_read()/qatomic_set() calls. There is no need for acquire/release
> because blk_set_disable_request_queuing() doesn't provide any
> guarantees (it helps that it's used at BlockBackend creation time and
> not when there is I/O in flight).
>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
> ---
>  block/block-backend.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>