[RFC PATCH] migration/rdma: Fix out of order wrid

Li Zhijian posted 1 patch 2 years, 10 months ago
git fetch https://github.com/patchew-project/next-importer-push tags/patchew/20210610085831.19779-1-lizhijian@cn.fujitsu.com
Maintainers: Juan Quintela <quintela@redhat.com>, "Dr. David Alan Gilbert" <dgilbert@redhat.com>
migration/rdma.c | 23 +++++++++++++++++++++++
1 file changed, 23 insertions(+)
[RFC PATCH] migration/rdma: Fix out of order wrid
Posted by Li Zhijian 2 years, 10 months ago
destination:
../qemu/build/qemu-system-x86_64 -enable-kvm -netdev tap,id=hn0,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown -device e1000,netdev=hn0,mac=50:52:54:00:11:22 -boot c -drive if=none,file=./Fedora-rdma-server-migration.qcow2,id=drive-virtio-disk0 -device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 -m 2048 -smp 2 -device piix3-usb-uhci -device usb-tablet -monitor stdio -vga qxl -spice streaming-video=filter,port=5902,disable-ticketing -incoming rdma:192.168.22.23:8888
qemu-system-x86_64: -spice streaming-video=filter,port=5902,disable-ticketing: warning: short-form boolean option 'disable-ticketing' deprecated
Please use disable-ticketing=on instead
QEMU 6.0.50 monitor - type 'help' for more information
(qemu) trace-event qemu_rdma_block_for_wrid_miss on
(qemu) dest_init RDMA Device opened: kernel name rxe_eth0 uverbs device name uverbs2, infiniband_verbs class device path /sys/class/infiniband_verbs/uverbs2, infiniband class device path /sys/class/infiniband/rxe_eth0, transport: (2) Ethernet
qemu_rdma_block_for_wrid_miss A Wanted wrid CONTROL SEND (2000) but got CONTROL RECV (4000)

source:
../qemu/build/qemu-system-x86_64 -enable-kvm -netdev tap,id=hn0,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown -device e1000,netdev=hn0,mac=50:52:54:00:11:22 -boot c -drive if=none,file=./Fedora-rdma-server.qcow2,id=drive-virtio-disk0 -device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 -m 2048 -smp 2 -device piix3-usb-uhci -device usb-tablet -monitor stdio -vga qxl -spice streaming-video=filter,port=5901,disable-ticketing -S
qemu-system-x86_64: -spice streaming-video=filter,port=5901,disable-ticketing: warning: short-form boolean option 'disable-ticketing' deprecated
Please use disable-ticketing=on instead
QEMU 6.0.50 monitor - type 'help' for more information
(qemu)
(qemu) trace-event qemu_rdma_block_for_wrid_miss on
(qemu) migrate -d rdma:192.168.22.23:8888
source_resolve_host RDMA Device opened: kernel name rxe_eth0 uverbs device name uverbs2, infiniband_verbs class device path /sys/class/infiniband_verbs/uverbs2, infiniband class device path /sys/class/infiniband/rxe_eth0, transport: (2) Ethernet
(qemu) qemu_rdma_block_for_wrid_miss A Wanted wrid WRITE RDMA (1) but got CONTROL RECV (4000)

NOTE: soft RoCE is used as the RDMA device.
[root@iaas-rpma images]# rdma link show rxe_eth0/1
link rxe_eth0/1 state ACTIVE physical_state LINK_UP netdev eth0

The migration cannot complete because out-of-order (OOO) CQ events occur.
OOO events can occur on both the source side and the destination side. The
hang only happens when SEND and RECV completions arrive out of order; OOO
between 'WRITE RDMA' and 'RECV' doesn't matter.

The OOO sequence is shown below:
    source                          destination
    qemu_rdma_write_one()           qemu_rdma_registration_handle()
 1. post_recv X                     post_recv Y
 2.                                 post_send X
 3.                                 wait X CQ event
 4. X CQ event
 5. post_send Y
 6. wait Y CQ event
 7.                                 Y CQ event (dropped)
 8. Y CQ event (send Y done)
 9.                                 X CQ event (send X done)
10.                                 wait Y CQ event (dropped at (7), blocks forever)

In my hundred or so runs it only happens on a soft RoCE RDMA device;
a hardware IB device works fine.
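
The destination side of the sequence above can be modeled without any RDMA
hardware. The following is only a minimal sketch, not the code in
migration/rdma.c: SimCq, sim_wait_for() and sim_destination() are
hypothetical names, the wrid values are the ones printed in the trace output
above, and "would block forever" is modeled as returning -1 once the
simulated queue is drained. It shows that dropping a mismatched control
completion loses it, while remembering a single one lets the later wait
succeed.

```c
#include <assert.h>
#include <stdint.h>

/* wrid values as printed in the trace output (assumed decimal). */
enum {
    SIM_WRID_NONE = 0,
    SIM_WRID_SEND_CONTROL = 2000,
    SIM_WRID_RECV_CONTROL = 4000
};

/* Hypothetical stand-in for one shared completion queue. */
typedef struct {
    const int64_t *events; /* completions in arrival order */
    int count;             /* number of entries in events */
    int next;              /* index of the next unread completion */
    int64_t stashed;       /* one remembered OOO completion, or SIM_WRID_NONE */
} SimCq;

/*
 * Wait for `wanted`. Returns 0 when it completed, -1 when the queue is
 * drained (the real code would block here), -2 when a second out-of-order
 * control completion shows up while one is already remembered.
 */
static int sim_wait_for(SimCq *cq, int64_t wanted, int stash_enabled)
{
    assert(cq->next <= cq->count);

    /* A completion seen during an earlier wait satisfies this one. */
    if (stash_enabled && cq->stashed == wanted) {
        cq->stashed = SIM_WRID_NONE;
        return 0;
    }
    while (cq->next < cq->count) {
        int64_t got = cq->events[cq->next++];

        if (got == wanted) {
            return 0;
        }
        if (stash_enabled && got >= SIM_WRID_SEND_CONTROL) {
            if (cq->stashed != SIM_WRID_NONE) {
                return -2;
            }
            cq->stashed = got; /* remember it instead of dropping it */
        }
        /* With stashing disabled the mismatched completion is lost. */
    }
    return -1;
}

/* Steps 7, 9 and 10 of the table: RECV Y arrives before SEND X is done. */
static int sim_destination(int stash_enabled)
{
    static const int64_t events[] = {
        SIM_WRID_RECV_CONTROL, SIM_WRID_SEND_CONTROL
    };
    SimCq cq = { events, 2, 0, SIM_WRID_NONE };
    int ret = sim_wait_for(&cq, SIM_WRID_SEND_CONTROL, stash_enabled);

    if (ret != 0) {
        return ret;
    }
    return sim_wait_for(&cq, SIM_WRID_RECV_CONTROL, stash_enabled);
}
```

In this model sim_destination(0) returns -1, which corresponds to step 10
blocking forever, and sim_destination(1) returns 0 because the RECV
completion was kept.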

Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
---
This is just a draft to address this problem. One possible approach
would be to create independent CQs for SEND and RECV, which would let
us poll only the CQ we are really interested in. But that could be a
big change.
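
To illustrate the independent-CQ idea, here is a rough hardware-free sketch.
It is not a proposed implementation: SplitQueue, SplitEndpoint and the
split_* functions are hypothetical, and the wrid values are the ones from
the trace output. In libibverbs, struct ibv_qp_init_attr has separate
send_cq and recv_cq fields, so a real version would presumably create two
CQs and assign one to each; that wiring is not shown here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* wrid values as printed in the trace output (assumed decimal). */
enum {
    SPLIT_WRID_SEND_CONTROL = 2000,
    SPLIT_WRID_RECV_CONTROL = 4000
};

#define SPLIT_CQ_DEPTH 8

/* Hypothetical stand-in for one completion queue. */
typedef struct {
    int64_t wrid[SPLIT_CQ_DEPTH];
    int head; /* next completion to read */
    int tail; /* next free slot */
} SplitQueue;

/* One endpoint with independent queues for SEND and RECV completions. */
typedef struct {
    SplitQueue send_cq;
    SplitQueue recv_cq;
} SplitEndpoint;

static SplitQueue *split_queue_for(SplitEndpoint *ep, int64_t wrid)
{
    return wrid == SPLIT_WRID_RECV_CONTROL ? &ep->recv_cq : &ep->send_cq;
}

/* Deliver a completion to the queue that matches its type. */
static void split_complete(SplitEndpoint *ep, int64_t wrid)
{
    SplitQueue *q = split_queue_for(ep, wrid);

    assert(q->tail < SPLIT_CQ_DEPTH);
    q->wrid[q->tail++] = wrid;
}

/*
 * Poll only the queue that can contain `wanted`. Returns 0 when found,
 * -1 when that queue is drained (the real code would block here).
 */
static int split_wait_for(SplitEndpoint *ep, int64_t wanted)
{
    SplitQueue *q = split_queue_for(ep, wanted);

    while (q->head < q->tail) {
        if (q->wrid[q->head++] == wanted) {
            return 0;
        }
    }
    return -1;
}

/* Same arrival order as the failing case: RECV first, then SEND. */
static int split_cq_demo(void)
{
    SplitEndpoint ep;

    memset(&ep, 0, sizeof(ep));
    split_complete(&ep, SPLIT_WRID_RECV_CONTROL);
    split_complete(&ep, SPLIT_WRID_SEND_CONTROL);

    if (split_wait_for(&ep, SPLIT_WRID_SEND_CONTROL) != 0) {
        return -1;
    }
    return split_wait_for(&ep, SPLIT_WRID_RECV_CONTROL);
}
```

Because waiting for the SEND never reads the RECV queue, the early RECV
completion is still there when it is wanted, and split_cq_demo() returns 0
without any stashing logic.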
---
 migration/rdma.c | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/migration/rdma.c b/migration/rdma.c
index b703bf1b918..7a2b0a8853e 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -364,6 +364,8 @@ typedef struct RDMAContext {
     struct ibv_comp_channel *comp_channel;  /* completion channel */
     struct ibv_pd *pd;                      /* protection domain */
     struct ibv_cq *cq;                      /* completion queue */
+    int64_t ooo_wrid;
+    int64_t ooo_wrid_byte_len;
 
     /*
      * If a previous write failed (perhaps because of a failed
@@ -1612,11 +1614,32 @@ static int qemu_rdma_block_for_wrid(RDMAContext *rdma, int wrid_requested,
         wr_id = wr_id_in & RDMA_WRID_TYPE_MASK;
 
         if (wr_id == RDMA_WRID_NONE) {
+            if (rdma->ooo_wrid >= RDMA_WRID_SEND_CONTROL && rdma->ooo_wrid == wrid_requested) {
+                error_report("get expected ooo wrid %d", wrid_requested);
+                if (byte_len && rdma->ooo_wrid_byte_len != -1) {
+                    *byte_len = rdma->ooo_wrid_byte_len;
+                    rdma->ooo_wrid = RDMA_WRID_NONE;
+                    return 0;
+                }
+            }
             break;
         }
         if (wr_id != wrid_requested) {
             trace_qemu_rdma_block_for_wrid_miss(print_wrid(wrid_requested),
                        wrid_requested, print_wrid(wr_id), wr_id);
+            if (wr_id >= RDMA_WRID_SEND_CONTROL) {
+                if (rdma->ooo_wrid > RDMA_WRID_NONE) {
+                    error_report("more than one out of order wird(%ld, %ld)", rdma->ooo_wrid, wr_id);
+                    return -1;
+                }
+                error_report("get out of order wird(%ld)\n", wr_id);
+                rdma->ooo_wrid = wr_id;
+                if (byte_len) {
+                    rdma->ooo_wrid_byte_len = *byte_len;
+                } else {
+                    rdma->ooo_wrid_byte_len = -1;
+                }
+            }
         }
     }
 
-- 
2.28.0




Re: [RFC PATCH] migration/rdma: Fix out of order wrid
Posted by no-reply@patchew.org 2 years, 10 months ago
Patchew URL: https://patchew.org/QEMU/20210610085831.19779-1-lizhijian@cn.fujitsu.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20210610085831.19779-1-lizhijian@cn.fujitsu.com
Subject: [RFC PATCH] migration/rdma: Fix out of order wrid

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
2d73991 migration/rdma: Fix out of order wrid

=== OUTPUT BEGIN ===
ERROR: line over 90 characters
#76: FILE: migration/rdma.c:1618:
+            if (rdma->ooo_wrid >= RDMA_WRID_SEND_CONTROL && rdma->ooo_wrid == wrid_requested) {

ERROR: line over 90 characters
#91: FILE: migration/rdma.c:1633:
+                    error_report("more than one out of order wird(%ld, %ld)", rdma->ooo_wrid, wr_id);

ERROR: Error messages should not contain newlines
#94: FILE: migration/rdma.c:1636:
+                error_report("get out of order wird(%ld)\n", wr_id);

total: 3 errors, 0 warnings, 40 lines checked

Commit 2d739918ecc4 (migration/rdma: Fix out of order wrid) has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/20210610085831.19779-1-lizhijian@cn.fujitsu.com/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com