[PATCH v2] KVM: dirty ring: check if vcpu is created before dirty_ring_reap_one

Weinan Liu posted 1 patch 1 year, 2 months ago
Posted by Weinan Liu 1 year, 2 months ago
The assertion '(dirty_gfns && ring_size)' in kvm_dirty_ring_reap_one fails
if the vcpu has not finished being created yet. This bug occasionally
occurs when I start 200+ QEMU instances on my 16G, 6-core x86 machine. It
triggers reliably if a 'sleep(10)' is inserted into kvm_vcpu_thread_fn as
below:

 static void *kvm_vcpu_thread_fn(void *arg)
 {
     CPUState *cpu = arg;
     int r;

     rcu_register_thread();

+    sleep(10);
     qemu_mutex_lock_iothread();
     qemu_thread_get_self(cpu->thread);
     cpu->thread_id = qemu_get_thread_id();
     cpu->can_do_io = 1;

With this delay, the dirty ring reaper wakes up before the vcpu has
finished being created.

Signed-off-by: Weinan Liu <liu-weinan@qq.com>
---
 accel/kvm/kvm-all.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 7e6a6076b1..0070ad72b8 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -1416,6 +1416,11 @@ static void *kvm_dirty_ring_reaper_thread(void *data)
          */
         sleep(1);
 
+        /* ensure kvm_init_vcpu is finished, so cpu->kvm_dirty_gfns is ok */
+        if (!phase_check(PHASE_MACHINE_READY)) {
+            continue;
+        }
+
         /* keep sleeping so that dirtylimit not be interfered by reaper */
         if (dirtylimit_in_service()) {
             continue;
-- 
2.25.1
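
The guard added by the patch can be modelled in isolation. What follows is a minimal standalone sketch, not QEMU code: struct model_vcpu, model_reap_one and model_reaper_wakeup are hypothetical names, and the machine_ready flag only stands in for phase_check(PHASE_MACHINE_READY).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-vCPU dirty ring fields. */
struct model_vcpu {
    void *dirty_gfns;    /* NULL until vCPU initialization maps the ring */
    uint32_t ring_size;  /* 0 until vCPU initialization sets it */
};

/* Models the reap step, including the precondition that was failing. */
static int model_reap_one(const struct model_vcpu *cpu)
{
    assert(cpu->dirty_gfns && cpu->ring_size);
    return 1; /* pretend one vCPU ring was reaped */
}

/*
 * One reaper wakeup. machine_ready plays the role of
 * phase_check(PHASE_MACHINE_READY) in the patch.
 * Returns -1 if the wakeup was skipped, else the number of rings reaped.
 */
static int model_reaper_wakeup(const struct model_vcpu *cpus, size_t n,
                               bool machine_ready)
{
    int reaped = 0;

    if (!machine_ready) {
        return -1; /* vCPUs may be half-initialized: touch nothing */
    }
    for (size_t i = 0; i < n; i++) {
        reaped += model_reap_one(&cpus[i]);
    }
    return reaped;
}
```

In this model, a wakeup that happens while machine_ready is false never reaches the assertion, which is the effect the early 'continue' is meant to have in the reaper loop.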
Re: [PATCH v2] KVM: dirty ring: check if vcpu is created before dirty_ring_reap_one
Posted by Peter Xu 1 year, 2 months ago
Hi, Weinan,

On Sun, Feb 05, 2023 at 06:45:45AM +0000, Weinan Liu wrote:
> The assertion '(dirty_gfns && ring_size)' in kvm_dirty_ring_reap_one fails
> if the vcpu has not finished being created yet. This bug occasionally
> occurs when I start 200+ QEMU instances on my 16G, 6-core x86 machine. It
> triggers reliably if a 'sleep(10)' is inserted into kvm_vcpu_thread_fn as
> below:
> 
>  static void *kvm_vcpu_thread_fn(void *arg)
>  {
>      CPUState *cpu = arg;
>      int r;
> 
>      rcu_register_thread();
> 
> +    sleep(10);
>      qemu_mutex_lock_iothread();
>      qemu_thread_get_self(cpu->thread);
>      cpu->thread_id = qemu_get_thread_id();
>      cpu->can_do_io = 1;
> 
> With this delay, the dirty ring reaper wakes up before the vcpu has
> finished being created.
> 
> Signed-off-by: Weinan Liu <liu-weinan@qq.com>
> ---
>  accel/kvm/kvm-all.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
> index 7e6a6076b1..0070ad72b8 100644
> --- a/accel/kvm/kvm-all.c
> +++ b/accel/kvm/kvm-all.c
> @@ -1416,6 +1416,11 @@ static void *kvm_dirty_ring_reaper_thread(void *data)
>           */
>          sleep(1);
>  
> +        /* ensure kvm_init_vcpu is finished, so cpu->kvm_dirty_gfns is ok */
> +        if (!phase_check(PHASE_MACHINE_READY)) {
> +            continue;
> +        }
> +

Here's an old patch for this:

https://lore.kernel.org/all/20220927154653.77296-1-peterx@redhat.com/

IMHO that one is more straightforward and self-contained than this
one.  What do you think?

When posting new patches, please also remember to copy maintainers.  For
this one, it's:

$ ./scripts/get_maintainer.pl -f accel/kvm/kvm-all.c 
Paolo Bonzini <pbonzini@redhat.com> (supporter:Overall KVM CPUs)

Thanks,

-- 
Peter Xu
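
For comparison with the machine-wide phase check, a self-contained fix could test each vCPU individually inside the reap function. The sketch below is a standalone model under that assumption and does not reproduce the linked patch: struct sketch_vcpu, its created flag, sketch_reap_one and sketch_reap_all are hypothetical names used only for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-vCPU state for this sketch. */
struct sketch_vcpu {
    bool created;        /* set once vCPU initialization has completed */
    void *dirty_gfns;    /* ring mapping, valid only after creation */
    uint32_t ring_size;  /* ring entry count, valid only after creation */
};

/*
 * Reap one vCPU ring. A vCPU that is visible to the reaper but not yet
 * fully created is skipped instead of tripping an assertion.
 */
static uint32_t sketch_reap_one(const struct sketch_vcpu *cpu)
{
    if (!cpu->created) {
        return 0;
    }
    /* Placeholder: pretend every entry in the ring was dirty. */
    return cpu->ring_size;
}

/* Reap all vCPUs and return the total number of entries collected. */
static uint32_t sketch_reap_all(const struct sketch_vcpu *cpus, size_t n)
{
    uint32_t total = 0;

    for (size_t i = 0; i < n; i++) {
        total += sketch_reap_one(&cpus[i]);
    }
    return total;
}
```

In this model, vCPUs that are already created keep being reaped while another one is still coming up, and the check does not depend on any machine-wide state.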