[Qemu-devel] [PATCH] throttle: fix a qemu crash problem when calling blk_delete

sochin.jiang posted 1 patch 10 years, 4 months ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/1385265352-40801-1-git-send-email-sochin.jiang@huawei.com
Test checkpatch passed
Test docker passed
Test s390x passed
There is a newer version of this series
block/io.c      | 8 ++++++++
util/throttle.c | 6 +++---
2 files changed, 11 insertions(+), 3 deletions(-)
[Qemu-devel] [PATCH] throttle: fix a qemu crash problem when calling blk_delete
Posted by sochin.jiang 10 years, 4 months ago
From: "sochin.jiang" <sochin.jiang@huawei.com>

commit 7ca7f0 moves the throttling related part of the BDS life cycle management
to BlockBackend, adds call to throttle_timers_detach_aio_context in blk_remove_bs.
commit 1606e remove a block device from its throttle group in blk_delete by calling
blk_io_limits_disable, this fix an easily reproducible qemu crash. But delete a
BB without a BDS inserted could easily cause a qemu crash too by calling
bdrv_drained_begin in blk_io_limits_disable. Say, a simply drive_add and then a
drive_del command.

Signed-off-by: sochin.jiang <sochin.jiang@huawei.com>
---
 block/io.c      | 8 ++++++++
 util/throttle.c | 6 +++---
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/block/io.c b/block/io.c
index 0854e0f..c411a87 100644
--- a/block/io.c
+++ b/block/io.c
@@ -269,6 +269,10 @@ static void coroutine_fn bdrv_co_yield_to_drain(BlockDriverState *bs,
 
 void bdrv_drained_begin(BlockDriverState *bs)
 {
+    if (!bs) {
+        return;
+    }
+
     if (qemu_in_coroutine()) {
         bdrv_co_yield_to_drain(bs, true);
         return;
@@ -284,6 +288,10 @@ void bdrv_drained_begin(BlockDriverState *bs)
 
 void bdrv_drained_end(BlockDriverState *bs)
 {
+    if (!bs) {
+        return;
+    }
+
     if (qemu_in_coroutine()) {
         bdrv_co_yield_to_drain(bs, false);
         return;
diff --git a/util/throttle.c b/util/throttle.c
index b38e742..35a21fc 100644
--- a/util/throttle.c
+++ b/util/throttle.c
@@ -245,8 +245,6 @@ void throttle_timers_init(ThrottleTimers *tt,
 /* destroy a timer */
 static void throttle_timer_destroy(QEMUTimer **timer)
 {
-    assert(*timer != NULL);
-
     timer_del(*timer);
     timer_free(*timer);
     *timer = NULL;
@@ -258,7 +256,9 @@ void throttle_timers_detach_aio_context(ThrottleTimers *tt)
     int i;
 
     for (i = 0; i < 2; i++) {
-        throttle_timer_destroy(&tt->timers[i]);
+        if (tt->timers[i]) {
+            throttle_timer_destroy(&tt->timers[i]);
+        }
     }
 }
 
-- 
1.8.3.1


Re: [Qemu-devel] [Qemu-block] [PATCH] throttle: fix a qemu crash problem when calling blk_delete
Posted by Alberto Garcia 6 years, 5 months ago
On Sun 24 Nov 2013 04:55:52 AM CET, sochin.jiang wrote:
       ^^^^^^^^^^^
I guess the date in your computer is wrong :-)

> commit 7ca7f0 moves the throttling related part of the BDS life cycle
> management to BlockBackend, adds call to
> throttle_timers_detach_aio_context in blk_remove_bs.  commit 1606e
> remove a block device from its throttle group in blk_delete by calling
> blk_io_limits_disable, this fix an easily reproducible qemu crash. But
> delete a BB without a BDS inserted could easily cause a qemu crash too
> by calling bdrv_drained_begin in blk_io_limits_disable. Say, a simply
> drive_add and then a drive_del command.

Thanks, I can reproduce this easily by running QEMU and doing

   drive_add 0 if=none,throttling.iops-total=5000

followed by

   drive_del none0

>  void bdrv_drained_begin(BlockDriverState *bs)
>  {
> +    if (!bs) {
> +        return;
> +    }
> +
>      if (qemu_in_coroutine()) {
>          bdrv_co_yield_to_drain(bs, true);
>          return;
> @@ -284,6 +288,10 @@ void bdrv_drained_begin(BlockDriverState *bs)
>  
>  void bdrv_drained_end(BlockDriverState *bs)
>  {
> +    if (!bs) {
> +        return;
> +    }
> +

I'd say that if someone calls bdrv_drained_begin() with a NULL pointer
then the problem is in the caller...

>  static void throttle_timer_destroy(QEMUTimer **timer)
>  {
> -    assert(*timer != NULL);
> -
>      timer_del(*timer);
>      timer_free(*timer);
>      *timer = NULL;
> @@ -258,7 +256,9 @@ void throttle_timers_detach_aio_context(ThrottleTimers *tt)
>      int i;
>  
>      for (i = 0; i < 2; i++) {
> -        throttle_timer_destroy(&tt->timers[i]);
> +        if (tt->timers[i]) {
> +            throttle_timer_destroy(&tt->timers[i]);
> +        }
>      }
>  }

Why is this part necessary? In what situation you end up calling
throttle_timers_detach_aio_context() twice?

Berto

Re: [Qemu-devel] [Qemu-block] [PATCH] throttle: fix a qemu crash problem when calling blk_delete
Posted by sochin.jiang 6 years, 5 months ago
Thanks for replying.

Indeed, the problem comes from the caller. I guest

some code should be reconsidered in blk_remove_bs

and blk_delete, especially with throttling.

Secondly, when handling drive_del in hmp_drive_del,

throttle_timers_detach_aio_context() is called

first time in blk_remove_bs and again throughing blk_unref,

see below:

hmp_drive_del->
    blk_remove_bs->
        *throttle_timers_detach_aio_context*->
    ...
    blk_unref->
        blk_delete->
            blk_io_limits_disable->
                throttle_group_unregister_tgm->
                    throttle_timers_destroy->
                        *throttle_timers_detach_aio_context*->**...**


sochin

.      


On 2017/10/20 19:43, Alberto Garcia wrote:
> On Sun 24 Nov 2013 04:55:52 AM CET, sochin.jiang wrote:
>        ^^^^^^^^^^^
> I guess the date in your computer is wrong :-)
>
>> commit 7ca7f0 moves the throttling related part of the BDS life cycle
>> management to BlockBackend, adds call to
>> throttle_timers_detach_aio_context in blk_remove_bs.  commit 1606e
>> remove a block device from its throttle group in blk_delete by calling
>> blk_io_limits_disable, this fix an easily reproducible qemu crash. But
>> delete a BB without a BDS inserted could easily cause a qemu crash too
>> by calling bdrv_drained_begin in blk_io_limits_disable. Say, a simply
>> drive_add and then a drive_del command.
> Thanks, I can reproduce this easily by running QEMU and doing
>
>    drive_add 0 if=none,throttling.iops-total=5000
>
> followed by
>
>    drive_del none0
>
>>  void bdrv_drained_begin(BlockDriverState *bs)
>>  {
>> +    if (!bs) {
>> +        return;
>> +    }
>> +
>>      if (qemu_in_coroutine()) {
>>          bdrv_co_yield_to_drain(bs, true);
>>          return;
>> @@ -284,6 +288,10 @@ void bdrv_drained_begin(BlockDriverState *bs)
>>  
>>  void bdrv_drained_end(BlockDriverState *bs)
>>  {
>> +    if (!bs) {
>> +        return;
>> +    }
>> +
> I'd say that if someone calls bdrv_drained_begin() with a NULL pointer
> then the problem is in the caller...
>
>>  static void throttle_timer_destroy(QEMUTimer **timer)
>>  {
>> -    assert(*timer != NULL);
>> -
>>      timer_del(*timer);
>>      timer_free(*timer);
>>      *timer = NULL;
>> @@ -258,7 +256,9 @@ void throttle_timers_detach_aio_context(ThrottleTimers *tt)
>>      int i;
>>  
>>      for (i = 0; i < 2; i++) {
>> -        throttle_timer_destroy(&tt->timers[i]);
>> +        if (tt->timers[i]) {
>> +            throttle_timer_destroy(&tt->timers[i]);
>> +        }
>>      }
>>  }
> Why is this part necessary? In what situation you end up calling
> throttle_timers_detach_aio_context() twice?
>
> Berto
>
> .
>