[PATCH v2 00/10] block: Make raw_co_get_allocated_file_size asynchronous

Fabiano Rosas posted 10 patches 10 months, 2 weeks ago
block/file-posix.c                 | 40 +++++++++++++++++--
block/meson.build                  |  1 +
block/monitor/block-hmp-cmds.c     | 22 ++++++-----
block/qapi.c                       | 62 +++++++++---------------------
blockdev.c                         |  6 +--
hmp-commands-info.hx               |  1 +
include/block/block-hmp-cmds.h     |  2 +-
include/block/qapi.h               | 17 ++++----
include/block/raw-aio.h            |  4 +-
qapi/block-core.json               |  5 ++-
qemu-img.c                         |  2 -
scripts/block-coroutine-wrapper.py |  1 +
12 files changed, 90 insertions(+), 73 deletions(-)
Posted by Fabiano Rosas 10 months, 2 weeks ago
Hi,

The major change from the last version is that this time I'm moving
all of the callers of bdrv_get_allocated_file_size() into
coroutines. I had to make some temporary changes to avoid asserts
while not all the callers were converted.

I tried my best to explain why I think the changes are safe. To avoid
changing too much of the code, I added a change that removes
hmp_nbd_server_start's dependency on qmp_query_block. That way I
don't need to move all of the nbd code into a coroutine as well.

Based on:
 [PATCH v2 00/11] block: Re-enable the graph lock
 https://lore.kernel.org/r/20230605085711.21261-1-kwolf@redhat.com

changes:

  - fixed duplicated commit message [Lin]
  - clarified why we need to convert info-block [Claudio]
  - added my rationale of why the changes are safe [Eric]
  - converted all callers to coroutines [Kevin]
  - made hmp_nbd_server_start not depend on qmp_query_block

CI run: https://gitlab.com/farosas/qemu/-/pipelines/895525156
===
v1:
https://lore.kernel.org/r/20230523213903.18418-1-farosas@suse.de

As discussed in another thread [1], here's an RFC addressing a VCPU
softlockup encountered when issuing QMP commands that target a disk
placed on NFS.

Since QMP commands happen with the qemu_global_mutex locked, any
command that takes too long to finish will block other threads waiting
to take the global mutex. One such thread could be a VCPU thread going
out of the guest to handle IO.

This is the case when issuing the QMP command query-block, which
eventually calls raw_co_get_allocated_file_size(). This function makes
an 'fstat' call that has been observed to take a long time (seconds)
over NFS.

NFS latency issues aside, we can improve the situation by not blocking
VCPU threads while the command is running.

Move the 'fstat' call into the thread-pool and make the necessary
adaptations to ensure raw_co_get_allocated_file_size runs in a
coroutine in the block driver aio_context.
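
For readers unfamiliar with the pattern, here is a minimal sketch of the
idea in the paragraph above. It is a Python analogy only, not QEMU code:
asyncio stands in for QEMU coroutines, the default executor stands in for
the thread-pool, and slow_fstat(), allocated_file_size() and demo() are
hypothetical names made up for illustration. The sleep simulates a slow
NFS server, and st_blocks is assumed to be in 512-byte units, as on
Linux.

```python
import asyncio
import os
import time


def slow_fstat(fd):
    """Hypothetical stand-in for an fstat() that stalls on a slow NFS server."""
    time.sleep(0.2)  # simulated latency, not a measured value
    return os.fstat(fd)


async def allocated_file_size(fd, stat_fn=os.fstat):
    """Run the blocking stat call in a worker thread and await its result."""
    loop = asyncio.get_running_loop()
    # None selects the event loop's default thread pool executor.
    st = await loop.run_in_executor(None, stat_fn, fd)
    # Assumption: st_blocks counts 512-byte units.
    return st.st_blocks * 512


async def demo(path, stat_fn=os.fstat):
    """Query the size while a ticker task shows the loop is not blocked."""
    ticks = 0

    async def ticker():
        nonlocal ticks
        while True:
            await asyncio.sleep(0.01)
            ticks += 1

    fd = os.open(path, os.O_RDONLY)
    task = asyncio.create_task(ticker())
    try:
        size = await allocated_file_size(fd, stat_fn)
    finally:
        task.cancel()
        # Wait for the cancellation to be processed before returning.
        await asyncio.gather(task, return_exceptions=True)
        os.close(fd)
    return size, ticks
```

Calling asyncio.run(demo(path, slow_fstat)) returns the size together
with the number of ticks counted while the stat call was in flight. A
non-zero tick count is the point of the exercise: the thread running the
event loop kept making progress instead of waiting on the file system,
which is what the series aims for with the VCPU threads.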

[1] Question about QMP and BQL
https://lists.gnu.org/archive/html/qemu-devel/2023-05/msg03141.html

Fabiano Rosas (8):
  block: Remove bdrv_query_block_node_info
  block: Remove unnecessary variable in bdrv_block_device_info
  block: Allow the wrapper script to see functions declared in qapi.h
  block: Temporarily mark bdrv_co_get_allocated_file_size as mixed
  block: Convert bdrv_query_block_graph_info to coroutine
  block: Convert bdrv_block_device_info into co_wrapper
  block: Don't query all block devices at hmp_nbd_server_start
  block: Convert qmp_query_block() to coroutine_fn

João Silva (1):
  block: Add a thread-pool version of fstat

Lin Ma (1):
  block: Convert qmp_query_named_block_nodes to coroutine

 block/file-posix.c                 | 40 +++++++++++++++++--
 block/meson.build                  |  1 +
 block/monitor/block-hmp-cmds.c     | 22 ++++++-----
 block/qapi.c                       | 62 +++++++++---------------------
 blockdev.c                         |  6 +--
 hmp-commands-info.hx               |  1 +
 include/block/block-hmp-cmds.h     |  2 +-
 include/block/qapi.h               | 17 ++++----
 include/block/raw-aio.h            |  4 +-
 qapi/block-core.json               |  5 ++-
 qemu-img.c                         |  2 -
 scripts/block-coroutine-wrapper.py |  1 +
 12 files changed, 90 insertions(+), 73 deletions(-)

-- 
2.35.3


Re: [PATCH v2 00/10] block: Make raw_co_get_allocated_file_size asynchronous
Posted by Claudio Fontana 8 months ago
Hi all,

We currently have to maintain something downstream for this, since the current behavior can compound problems on top of existing bad NFS latency.
Could someone continue to help review this work?

Thanks,

Claudio


On 6/9/23 22:19, Fabiano Rosas wrote:
> Hi,
> 
> The major change from the last version is that this time I'm moving
> all of the callers of bdrv_get_allocated_file_size() into
> coroutines. I had to make some temporary changes to avoid asserts
> while not all the callers were converted.
> 
> I tried my best to explain why I think the changes are safe. To avoid
> changing too much of the code I added a change that removes the
> dependency of qmp_query_block from hmp_nbd_server_start, that way I
> don't need to move all of the nbd code into a coroutine as well.
> 
> Based on:
>  [PATCH v2 00/11] block: Re-enable the graph lock
>  https://lore.kernel.org/r/20230605085711.21261-1-kwolf@redhat.com
> 
> changes:
> 
>   - fixed duplicated commit message [Lin]
>   - clarified why we need to convert info-block [Claudio]
>   - added my rationale of why the changes are safe [Eric]
>   - converted all callers to coroutines [Kevin]
>   - made hmp_nbd_server_start don't depend on qmp_query_block
> 
> CI run: https://gitlab.com/farosas/qemu/-/pipelines/895525156
> ===
> v1:
> https://lore.kernel.org/r/20230523213903.18418-1-farosas@suse.de
> 
> As discussed in another thread [1], here's an RFC addressing a VCPU
> softlockup encountered when issuing QMP commands that target a disk
> placed on NFS.
> 
> Since QMP commands happen with the qemu_global_mutex locked, any
> command that takes too long to finish will block other threads waiting
> to take the global mutex. One such thread could be a VCPU thread going
> out of the guest to handle IO.
> 
> This is the case when issuing the QMP command query-block, which
> eventually calls raw_co_get_allocated_file_size(). This function makes
> an 'fstat' call that has been observed to take a long time (seconds)
> over NFS.
> 
> NFS latency issues aside, we can improve the situation by not blocking
> VCPU threads while the command is running.
> 
> Move the 'fstat' call into the thread-pool and make the necessary
> adaptations to ensure raw_co_get_allocated_file_size runs in a
> coroutine in the block driver aio_context.
> 
> 1- Question about QMP and BQL
> https://lists.gnu.org/archive/html/qemu-devel/2023-05/msg03141.html
> 
> Fabiano Rosas (8):
>   block: Remove bdrv_query_block_node_info
>   block: Remove unnecessary variable in bdrv_block_device_info
>   block: Allow the wrapper script to see functions declared in qapi.h
>   block: Temporarily mark bdrv_co_get_allocated_file_size as mixed
>   block: Convert bdrv_query_block_graph_info to coroutine
>   block: Convert bdrv_block_device_info into co_wrapper
>   block: Don't query all block devices at hmp_nbd_server_start
>   block: Convert qmp_query_block() to coroutine_fn
> 
> João Silva (1):
>   block: Add a thread-pool version of fstat
> 
> Lin Ma (1):
>   block: Convert qmp_query_named_block_nodes to coroutine
> 
>  block/file-posix.c                 | 40 +++++++++++++++++--
>  block/meson.build                  |  1 +
>  block/monitor/block-hmp-cmds.c     | 22 ++++++-----
>  block/qapi.c                       | 62 +++++++++---------------------
>  blockdev.c                         |  6 +--
>  hmp-commands-info.hx               |  1 +
>  include/block/block-hmp-cmds.h     |  2 +-
>  include/block/qapi.h               | 17 ++++----
>  include/block/raw-aio.h            |  4 +-
>  qapi/block-core.json               |  5 ++-
>  qemu-img.c                         |  2 -
>  scripts/block-coroutine-wrapper.py |  1 +
>  12 files changed, 90 insertions(+), 73 deletions(-)
> 


Re: [PATCH v2 00/10] block: Make raw_co_get_allocated_file_size asynchronous
Posted by Fabiano Rosas 9 months, 3 weeks ago
Fabiano Rosas <farosas@suse.de> writes:

> Hi,
>
> The major change from the last version is that this time I'm moving
> all of the callers of bdrv_get_allocated_file_size() into
> coroutines. I had to make some temporary changes to avoid asserts
> while not all the callers were converted.
>
> I tried my best to explain why I think the changes are safe. To avoid
> changing too much of the code I added a change that removes the
> dependency of qmp_query_block from hmp_nbd_server_start, that way I
> don't need to move all of the nbd code into a coroutine as well.
>
> Based on:
>  [PATCH v2 00/11] block: Re-enable the graph lock
>  https://lore.kernel.org/r/20230605085711.21261-1-kwolf@redhat.com
>
> changes:
>
>   - fixed duplicated commit message [Lin]
>   - clarified why we need to convert info-block [Claudio]
>   - added my rationale of why the changes are safe [Eric]
>   - converted all callers to coroutines [Kevin]
>   - made hmp_nbd_server_start don't depend on qmp_query_block
>
> CI run: https://gitlab.com/farosas/qemu/-/pipelines/895525156

Ping, this seems to have fallen through the cracks.