> -----Original Message-----
> From: Bryan Zhang <bryan.zhang@bytedance.com>
> Sent: Wednesday, March 27, 2024 6:42 AM
> To: qemu-devel@nongnu.org
> Cc: peterx@redhat.com; farosas@suse.de; Liu, Yuan1 <yuan1.liu@intel.com>;
> berrange@redhat.com; Zou, Nanhai <nanhai.zou@intel.com>;
> hao.xiang@linux.dev; Bryan Zhang <bryan.zhang@bytedance.com>
> Subject: [PATCH v2 0/5] *** Implement using Intel QAT to offload ZLIB
>
> v2:
> - Rebase changes on top of recent multifd code changes.
> - Use QATzip API 'qzMalloc' and 'qzFree' to allocate QAT buffers.
> - Remove parameter tuning and use QATzip's defaults for better
>   performance.
> - Add parameter to enable QAT software fallback.
>
> v1:
> https://lists.nongnu.org/archive/html/qemu-devel/2023-12/msg03761.html
>
> * Performance
>
> We present updated performance results. For circumstantial reasons, v1
> presented performance on a low-bandwidth (1Gbps) network.
>
> Here, we present updated results with a similar setup as before but with
> two main differences:
>
> 1. Our machines have a ~50Gbps connection, tested using 'iperf3'.
> 2. We had a bug in our memory allocation causing us to only use ~1/2 of
>    the VM's RAM. Now we properly allocate and fill nearly all of the VM's
>    RAM.
>
> Thus, the test setup is as follows:
>
> We perform multifd live migration over TCP using a VM with 64GB memory.
> We prepare the machine's memory by powering it on, allocating a large
> amount of memory (60GB) as a single buffer, and filling the buffer with
> the repeated contents of the Silesia corpus[0]. This is in lieu of a more
> realistic memory snapshot, which proved troublesome to acquire.
>
> We analyze CPU usage by averaging the output of 'top' every second
> during migration. This is admittedly imprecise, but we feel that it
> accurately portrays the different degrees of CPU usage of varying
> compression methods.
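
For anyone reproducing the memory preparation described in the quoted
setup, the buffer fill could look roughly like the sketch below. This is
only an illustration, not the script behind the quoted numbers: the
function name fill_with_corpus is hypothetical, and the tiny sizes
stand in for the 60GB buffer and the Silesia corpus contents.

```python
def fill_with_corpus(buf: bytearray, corpus: bytes) -> None:
    """Overwrite buf in place with corpus repeated end to end.

    Hypothetical helper: the last repetition is truncated so that the
    buffer is filled exactly, whatever its size.
    """
    if not corpus:
        raise ValueError("corpus must be non-empty")
    view = memoryview(buf)
    for offset in range(0, len(buf), len(corpus)):
        # Copy a full corpus, or only what still fits at the tail.
        chunk = min(len(corpus), len(buf) - offset)
        view[offset:offset + chunk] = corpus[:chunk]


if __name__ == "__main__":
    # Small stand-ins: a 10-byte "guest buffer" and a 3-byte "corpus".
    demo = bytearray(10)
    fill_with_corpus(demo, b"abc")
    print(bytes(demo))
```

Repeating a real corpus, rather than writing zeros or random bytes, keeps
the data compressible in a way closer to real workloads, which matters
when the thing under test is a compression method.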
>
> We present the latency, throughput, and CPU usage results for all of the
> compression methods, with varying numbers of multifd threads (4, 8, and
> 16).
>
> [0] The Silesia corpus can be accessed here:
> https://sun.aei.polsl.pl//~sdeor/index.php?page=silesia
>
> ** Results
>
> 4 multifd threads:
>
> |---------------|---------------|----------------|---------|---------|
> |method         |time(sec)      |throughput(mbps)|send cpu%|recv cpu%|
> |---------------|---------------|----------------|---------|---------|
> |qatzip         | 23.13         | 8749.94        |117.50   |186.49   |
> |---------------|---------------|----------------|---------|---------|
> |zlib           |254.35         |  771.87        |388.20   |144.40   |
> |---------------|---------------|----------------|---------|---------|
> |zstd           | 54.52         | 3442.59        |414.59   |149.77   |
> |---------------|---------------|----------------|---------|---------|
> |none           | 12.45         |43739.60        |159.71   |204.96   |
> |---------------|---------------|----------------|---------|---------|
>
> 8 multifd threads:
>
> |---------------|---------------|----------------|---------|---------|
> |method         |time(sec)      |throughput(mbps)|send cpu%|recv cpu%|
> |---------------|---------------|----------------|---------|---------|
> |qatzip         | 16.91         |12306.52        |186.37   |391.84   |
> |---------------|---------------|----------------|---------|---------|
> |zlib           |130.11         | 1508.89        |753.86   |289.35   |
> |---------------|---------------|----------------|---------|---------|
> |zstd           | 27.57         | 6823.23        |786.83   |303.80   |
> |---------------|---------------|----------------|---------|---------|
> |none           | 11.82         |46072.63        |163.74   |238.56   |
> |---------------|---------------|----------------|---------|---------|
>
> 16 multifd threads:
>
> |---------------|---------------|----------------|---------|---------|
> |method         |time(sec)      |throughput(mbps)|send cpu%|recv cpu%|
> |---------------|---------------|----------------|---------|---------|
> |qatzip         |18.64          |11044.52        | 573.61  |437.65   |
> |---------------|---------------|----------------|---------|---------|
> |zlib           |66.43          | 2955.79        |1469.68  |567.47   |
> |---------------|---------------|----------------|---------|---------|
> |zstd           |14.17          |13290.66        |1504.08  |615.33   |
> |---------------|---------------|----------------|---------|---------|
> |none           |16.82          |32363.26        | 180.74  |217.17   |
> |---------------|---------------|----------------|---------|---------|
>
> ** Observations

I'm a little confused about the CPU utilization on the destination
during decompression. It seems the CPU is doing the decompression
instead of QAT. I checked the qzDecompress code, and it behaves the same
as qzCompress: if the decompression task has not completed, it tries to
stay in a sleep state as much as possible.

Maybe I understand it incorrectly, but I think QAT should help save CPU
resources in both compression and decompression.

Thank you very much for providing this version. I will set up an
environment with your patch set to test the performance.

> - In general, not using compression outperforms using compression in a
>   non-network-bound environment.
> - 'qatzip' outperforms other compression methods with 4 and 8 workers,
>   achieving a ~91% latency reduction over 'zlib' with 4 workers, and a
>   ~58% latency reduction over 'zstd' with 4 workers.
> - 'qatzip' maintains comparable performance with 'zstd' at 16 workers,
>   showing a ~32% increase in latency. This performance difference
>   becomes more noticeable with more workers, as CPU compression is highly
>   parallelizable.
> - 'qatzip' compression uses considerably less CPU than other compression
>   methods. At 8 workers, 'qatzip' demonstrates a ~75% reduction in
>   compression CPU usage compared to 'zstd' and 'zlib'.
> - 'qatzip' decompression CPU usage is less impressive, and is even
>   slightly worse than 'zstd' and 'zlib' CPU usage at 4 and 8 workers.
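
The latency and send-side CPU percentages in the quoted observations can
be re-derived from the quoted tables. A minimal Python sketch follows;
the helper names reduction_pct and increase_pct are hypothetical, and
the numbers are copied from the 4, 8 and 16 thread tables.

```python
# time(sec) and send cpu% values copied from the quoted results tables.
TIME_SEC = {
    4: {"qatzip": 23.13, "zlib": 254.35, "zstd": 54.52},
    16: {"qatzip": 18.64, "zlib": 66.43, "zstd": 14.17},
}
SEND_CPU = {
    8: {"qatzip": 186.37, "zlib": 753.86, "zstd": 786.83},
}


def reduction_pct(new: float, old: float) -> float:
    """Percentage by which `new` is lower than `old`."""
    return (1.0 - new / old) * 100.0


def increase_pct(new: float, old: float) -> float:
    """Percentage by which `new` is higher than `old`."""
    return (new / old - 1.0) * 100.0


if __name__ == "__main__":
    t4, t16, cpu8 = TIME_SEC[4], TIME_SEC[16], SEND_CPU[8]
    print("latency reduction vs zlib, 4 threads: %.1f%%"
          % reduction_pct(t4["qatzip"], t4["zlib"]))
    print("latency reduction vs zstd, 4 threads: %.1f%%"
          % reduction_pct(t4["qatzip"], t4["zstd"]))
    print("latency increase vs zstd, 16 threads: %.1f%%"
          % increase_pct(t16["qatzip"], t16["zstd"]))
    print("send cpu reduction vs zlib, 8 threads: %.1f%%"
          % reduction_pct(cpu8["qatzip"], cpu8["zlib"]))
    print("send cpu reduction vs zstd, 8 threads: %.1f%%"
          % reduction_pct(cpu8["qatzip"], cpu8["zstd"]))
```

The same two helpers can be pointed at the recv cpu% column to compare
decompression CPU usage per thread count.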
>
> Bryan Zhang (5):
>   meson: Introduce 'qatzip' feature to the build system
>   migration: Add migration parameters for QATzip
>   migration: Introduce unimplemented 'qatzip' compression method
>   migration: Implement 'qatzip' methods using QAT
>   tests/migration: Add integration test for 'qatzip' compression method
>
>  hw/core/qdev-properties-system.c |   6 +-
>  meson.build                      |  10 +
>  meson_options.txt                |   2 +
>  migration/meson.build            |   1 +
>  migration/migration-hmp-cmds.c   |   8 +
>  migration/multifd-qatzip.c       | 382 +++++++++++++++++++++++++++++++
>  migration/multifd.h              |   1 +
>  migration/options.c              |  57 +++++
>  migration/options.h              |   2 +
>  qapi/migration.json              |  40 +++-
>  scripts/meson-buildoptions.sh    |   3 +
>  tests/qtest/meson.build          |   4 +
>  tests/qtest/migration-test.c     |  35 +++
>  13 files changed, 549 insertions(+), 2 deletions(-)
>  create mode 100644 migration/multifd-qatzip.c
>
> --
> 2.30.2