:p
atchew
Login
The following patches are queued for QEMU stable v7.2.11: https://gitlab.com/qemu-project/qemu/-/commits/staging-7.2 Patch freeze is 2024-04-20, and the release is planned for 2024-04-22: https://wiki.qemu.org/Planning/7.2 Please respond here or CC qemu-stable@nongnu.org on any additional patches you think should (or shouldn't) be included in the release. The changes which are staging for inclusion, with the original commit hash from master branch, are given below the bottom line. Thanks! /mjt -------------------------------------- 01 9ea920dc2825 Daniel P. Berrangé: gitlab: update FreeBSD Cirrus CI image to 13.3 02 f5af80271aad David Parsons: ui/cocoa: Fix window clipping on macOS 14 03 bc6bd20ee353 Zhuojia Shen: target/arm: align exposed ID registers with Linux 04 3dc2afeab296 Peter Maydell: tests/tcg/aarch64/sysregs.c: Use S syntax for id_aa64zfr0_el1 and id_aa64smfr0_el1 05 1f51573f7925 Richard Henderson: target/arm: Fix SME full tile indexing 06 fd7f95f23d6f Peter Maydell: hw/rtc/sun4v-rtc: Relicense to GPLv2-or-later 07 012b170173bc Dmitrii Gavrilov: system/qdev-monitor: move drain_call_rcu call under if (!dev) in qmp_device_add() 08 a9198b3132d8 Sven Schnelle: hw/scsi/lsi53c895a: stop script on phase mismatch 09 8b09b7fe4708 Sven Schnelle: hw/scsi/lsi53c895a: add missing decrement of reentrancy counter 10 9876359990dd Sven Schnelle: hw/scsi/lsi53c895a: add timer to scripts processing 11 9bc9e9511944 Michael Tokarev: make-release: switch to .xz format by default 12 4cadf1023498 Laurent Vivier: e1000e: fix link state on resume 13 6a5287ce8047 Nick Briggs: Avoid unaligned fetch in ladr_match() 14 784fd35387e9 Klaus Jensen: hw/nvme: clean up confusing use of errp/local_err 15 973f76cf7743 Klaus Jensen: hw/nvme: cleanup error reporting in nvme_init_pci() 16 4f0a4a3d5854 Minwoo Im: hw/nvme: separate 'serial' property for VFs 17 ee7bda4d38cd Klaus Jensen: hw/nvme: generalize the mbar size helper 18 fa905f65c554 Klaus Jensen: hw/nvme: add machine compatibility parameter to enable msix exclusive bar 19 31180dbdca28 Akihiko Odaki: pcie: Introduce pcie_sriov_num_vfs 20 91bb64a8d201 Akihiko Odaki: hw/nvme: Use pcie_sriov_num_vfs() 21 6081b4243cd6 Akihiko Odaki: pcie_sriov: Validate NumVFs 22 74e2845c5f95 Jonathan Cameron: hmat acpi: Fix out of bounds access due to missing use of indirection 23 2e128776dc56 Cédric Le Goater: migration: Skip only empty block devices 24 c45f8f1aef35 Thomas Huth: tests/unit: Bump test-aio-multithread test timeout to 2 minutes 25 e1b363e328d5 Thomas Huth: tests/unit: Bump test-crypto-block test timeout to 5 minutes 26 63b18312d14a Kevin Wolf: tests/unit: Bump test-replication timeout to 60 seconds 27 55f7c6a5f2bd Peter Maydell: tests: Raise timeouts for bufferiszero and crypto-tlscredsx509 28 5f97afe2543f Paolo Bonzini: target/i386: introduce function to query MMU indices 29 90f641531c78 Paolo Bonzini: target/i386: use separate MMU indexes for 32-bit accesses 30 2cc68629a6fc Paolo Bonzini: target/i386: fix direction of "32-bit MMU" test 31 7fd226b04746 Tao Su: target/i386: Revert monitor_puts() in do_inject_x86_mce() 32 1590154ee437 Song Gao: target/loongarch: Fix qemu-system-loongarch64 assert failed with the option '-d int' 33 7c7a9f578e4f Lorenz Brun: hw/scsi/scsi-generic: Fix io_timeout property not applying 34 a158c63b3ba1 Yao Xingtao: monitor/hmp-cmds-target: Append a space in error message in gpa2hva() 35 1c188fc8cbff Akihiko Odaki: virtio-net: Fix vhost virtqueue notifiers for RSS 36 2911e9b95f3b Richard Henderson: tcg/optimize: Fix sign_mask for logical right-shift 37 4a3aa11e1fb2 Richard Henderson: target/hppa: Clear psw_n for BE on use_nullify_skip path 38 1d2f2b35bc86 Michael Tokarev: gitlab-ci/cirrus: switch from 'master' to 'latest' 39 44e25fbc1900 Peter Maydell: hw/intc/arm_gicv3: ICC_HPPIR* return SPURIOUS if int group is disabled 40 4c54f5bc8e1d Yajun Wu: hw/net/virtio-net: fix qemu set used ring flag even vhost started 41 2d9a31b3c273 Wafer: hw/virtio: Fix packed virtqueue flush used_idx
From: Daniel P. Berrangé <berrange@redhat.com> The 13.2 images have been deleted from gcloud Cc: qemu-stable@nongnu.org Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 20240304144456.3825935-3-berrange@redhat.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org> (cherry picked from commit 9ea920dc28254cd9a363aaef01985dffd8abedd7) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> (Mjt: 7.2 used FreeBSD version 13.1, not 13.2) diff --git a/.gitlab-ci.d/cirrus.yml b/.gitlab-ci.d/cirrus.yml index XXXXXXX..XXXXXXX 100644 --- a/.gitlab-ci.d/cirrus.yml +++ b/.gitlab-ci.d/cirrus.yml @@ -XXX,XX +XXX,XX @@ x64-freebsd-13-build: NAME: freebsd-13 CIRRUS_VM_INSTANCE_TYPE: freebsd_instance CIRRUS_VM_IMAGE_SELECTOR: image_family - CIRRUS_VM_IMAGE_NAME: freebsd-13-1 + CIRRUS_VM_IMAGE_NAME: freebsd-13-3 CIRRUS_VM_CPUS: 8 CIRRUS_VM_RAM: 8G UPDATE_COMMAND: pkg update -- 2.39.2
From: David Parsons <dave@daveparsons.net> macOS Sonoma changes the NSView.clipsToBounds to false by default where it was true in earlier version of macOS. This causes the window contents to be occluded by the frame at the top of the window. This fixes the issue by conditionally compiling the clipping on Sonoma to true. NSView only exposes the clipToBounds in macOS 14 and so has to be fixed via conditional compilation. Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1994 Signed-off-by: David Parsons <dave@daveparsons.net> Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com> Message-ID: <20240224140620.39200-1-dave@daveparsons.net> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> (cherry picked from commit f5af80271aad356233b2bea2369b3b2211fa395d) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> diff --git a/ui/cocoa.m b/ui/cocoa.m index XXXXXXX..XXXXXXX 100644 --- a/ui/cocoa.m +++ b/ui/cocoa.m @@ -XXX,XX +XXX,XX @@ #define MAC_OS_X_VERSION_10_13 101300 #endif +#ifndef MAC_OS_VERSION_14_0 +#define MAC_OS_VERSION_14_0 140000 +#endif + /* 10.14 deprecates NSOnState and NSOffState in favor of * NSControlStateValueOn/Off, which were introduced in 10.13. * Define for older versions @@ -XXX,XX +XXX,XX @@ - (id)initWithFrame:(NSRect)frameRect screen.width = frameRect.size.width; screen.height = frameRect.size.height; kbd = qkbd_state_init(dcl.con); +#if MAC_OS_X_VERSION_MAX_ALLOWED >= MAC_OS_VERSION_14_0 + [self setClipsToBounds:YES]; +#endif } return self; -- 2.39.2
From: Zhuojia Shen <chaosdefinition@hotmail.com> In CPUID registers exposed to userspace, some registers were missing and some fields were not exposed. This patch aligns exposed ID registers and their fields with what the upstream kernel currently exposes. Specifically, the following new ID registers/fields are exposed to userspace: ID_AA64PFR1_EL1.BT: bits 3-0 ID_AA64PFR1_EL1.MTE: bits 11-8 ID_AA64PFR1_EL1.SME: bits 27-24 ID_AA64ZFR0_EL1.SVEver: bits 3-0 ID_AA64ZFR0_EL1.AES: bits 7-4 ID_AA64ZFR0_EL1.BitPerm: bits 19-16 ID_AA64ZFR0_EL1.BF16: bits 23-20 ID_AA64ZFR0_EL1.SHA3: bits 35-32 ID_AA64ZFR0_EL1.SM4: bits 43-40 ID_AA64ZFR0_EL1.I8MM: bits 47-44 ID_AA64ZFR0_EL1.F32MM: bits 55-52 ID_AA64ZFR0_EL1.F64MM: bits 59-56 ID_AA64SMFR0_EL1.F32F32: bit 32 ID_AA64SMFR0_EL1.B16F32: bit 34 ID_AA64SMFR0_EL1.F16F32: bit 35 ID_AA64SMFR0_EL1.I8I32: bits 39-36 ID_AA64SMFR0_EL1.F64F64: bit 48 ID_AA64SMFR0_EL1.I16I64: bits 55-52 ID_AA64SMFR0_EL1.FA64: bit 63 ID_AA64MMFR0_EL1.ECV: bits 63-60 ID_AA64MMFR1_EL1.AFP: bits 47-44 ID_AA64MMFR2_EL1.AT: bits 35-32 ID_AA64ISAR0_EL1.RNDR: bits 63-60 ID_AA64ISAR1_EL1.FRINTTS: bits 35-32 ID_AA64ISAR1_EL1.BF16: bits 47-44 ID_AA64ISAR1_EL1.DGH: bits 51-48 ID_AA64ISAR1_EL1.I8MM: bits 55-52 ID_AA64ISAR2_EL1.WFxT: bits 3-0 ID_AA64ISAR2_EL1.RPRES: bits 7-4 ID_AA64ISAR2_EL1.GPA3: bits 11-8 ID_AA64ISAR2_EL1.APA3: bits 15-12 The code is also refactored to use symbolic names for ID register fields for better readability and maintainability. The test case in tests/tcg/aarch64/sysregs.c is also updated to match the intended behavior. Signed-off-by: Zhuojia Shen <chaosdefinition@hotmail.com> Message-id: DS7PR12MB6309FB585E10772928F14271ACE79@DS7PR12MB6309.namprd12.prod.outlook.com Reviewed-by: Peter Maydell <peter.maydell@linaro.org> [PMM: use Sn_n_Cn_Cn_n syntax to work with older assemblers that don't recognize id_aa64isar2_el1 and id_aa64mmfr2_el1] Signed-off-by: Peter Maydell <peter.maydell@linaro.org> (cherry picked from commit bc6bd20ee3538347afb750c4bd06edca4a922897) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> (Mjt: pick this for v8.0.0-2361-g1f51573f79 "target/arm: Fix SME full tile indexing") diff --git a/target/arm/helper.c b/target/arm/helper.c index XXXXXXX..XXXXXXX 100644 --- a/target/arm/helper.c +++ b/target/arm/helper.c @@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu) #ifdef CONFIG_USER_ONLY static const ARMCPRegUserSpaceInfo v8_user_idregs[] = { { .name = "ID_AA64PFR0_EL1", - .exported_bits = 0x000f000f00ff0000, - .fixed_bits = 0x0000000000000011 }, + .exported_bits = R_ID_AA64PFR0_FP_MASK | + R_ID_AA64PFR0_ADVSIMD_MASK | + R_ID_AA64PFR0_SVE_MASK | + R_ID_AA64PFR0_DIT_MASK, + .fixed_bits = (0x1u << R_ID_AA64PFR0_EL0_SHIFT) | + (0x1u << R_ID_AA64PFR0_EL1_SHIFT) }, { .name = "ID_AA64PFR1_EL1", - .exported_bits = 0x00000000000000f0 }, + .exported_bits = R_ID_AA64PFR1_BT_MASK | + R_ID_AA64PFR1_SSBS_MASK | + R_ID_AA64PFR1_MTE_MASK | + R_ID_AA64PFR1_SME_MASK }, { .name = "ID_AA64PFR*_EL1_RESERVED", - .is_glob = true }, - { .name = "ID_AA64ZFR0_EL1" }, + .is_glob = true }, + { .name = "ID_AA64ZFR0_EL1", + .exported_bits = R_ID_AA64ZFR0_SVEVER_MASK | + R_ID_AA64ZFR0_AES_MASK | + R_ID_AA64ZFR0_BITPERM_MASK | + R_ID_AA64ZFR0_BFLOAT16_MASK | + R_ID_AA64ZFR0_SHA3_MASK | + R_ID_AA64ZFR0_SM4_MASK | + R_ID_AA64ZFR0_I8MM_MASK | + R_ID_AA64ZFR0_F32MM_MASK | + R_ID_AA64ZFR0_F64MM_MASK }, + { .name = "ID_AA64SMFR0_EL1", + .exported_bits = R_ID_AA64SMFR0_F32F32_MASK | + R_ID_AA64SMFR0_B16F32_MASK | + R_ID_AA64SMFR0_F16F32_MASK | + R_ID_AA64SMFR0_I8I32_MASK | + R_ID_AA64SMFR0_F64F64_MASK | + R_ID_AA64SMFR0_I16I64_MASK | + R_ID_AA64SMFR0_FA64_MASK }, { .name = "ID_AA64MMFR0_EL1", - .fixed_bits = 0x00000000ff000000 }, - { .name = "ID_AA64MMFR1_EL1" }, + .exported_bits = R_ID_AA64MMFR0_ECV_MASK, + .fixed_bits = (0xfu << R_ID_AA64MMFR0_TGRAN64_SHIFT) | + (0xfu << R_ID_AA64MMFR0_TGRAN4_SHIFT) }, + { .name = "ID_AA64MMFR1_EL1", + .exported_bits = R_ID_AA64MMFR1_AFP_MASK }, + { .name = "ID_AA64MMFR2_EL1", + .exported_bits = R_ID_AA64MMFR2_AT_MASK }, { .name = "ID_AA64MMFR*_EL1_RESERVED", - .is_glob = true }, + .is_glob = true }, { .name = "ID_AA64DFR0_EL1", - .fixed_bits = 0x0000000000000006 }, - { .name = "ID_AA64DFR1_EL1" }, + .fixed_bits = (0x6u << R_ID_AA64DFR0_DEBUGVER_SHIFT) }, + { .name = "ID_AA64DFR1_EL1" }, { .name = "ID_AA64DFR*_EL1_RESERVED", - .is_glob = true }, + .is_glob = true }, { .name = "ID_AA64AFR*", - .is_glob = true }, + .is_glob = true }, { .name = "ID_AA64ISAR0_EL1", - .exported_bits = 0x00fffffff0fffff0 }, + .exported_bits = R_ID_AA64ISAR0_AES_MASK | + R_ID_AA64ISAR0_SHA1_MASK | + R_ID_AA64ISAR0_SHA2_MASK | + R_ID_AA64ISAR0_CRC32_MASK | + R_ID_AA64ISAR0_ATOMIC_MASK | + R_ID_AA64ISAR0_RDM_MASK | + R_ID_AA64ISAR0_SHA3_MASK | + R_ID_AA64ISAR0_SM3_MASK | + R_ID_AA64ISAR0_SM4_MASK | + R_ID_AA64ISAR0_DP_MASK | + R_ID_AA64ISAR0_FHM_MASK | + R_ID_AA64ISAR0_TS_MASK | + R_ID_AA64ISAR0_RNDR_MASK }, { .name = "ID_AA64ISAR1_EL1", - .exported_bits = 0x000000f0ffffffff }, + .exported_bits = R_ID_AA64ISAR1_DPB_MASK | + R_ID_AA64ISAR1_APA_MASK | + R_ID_AA64ISAR1_API_MASK | + R_ID_AA64ISAR1_JSCVT_MASK | + R_ID_AA64ISAR1_FCMA_MASK | + R_ID_AA64ISAR1_LRCPC_MASK | + R_ID_AA64ISAR1_GPA_MASK | + R_ID_AA64ISAR1_GPI_MASK | + R_ID_AA64ISAR1_FRINTTS_MASK | + R_ID_AA64ISAR1_SB_MASK | + R_ID_AA64ISAR1_BF16_MASK | + R_ID_AA64ISAR1_DGH_MASK | + R_ID_AA64ISAR1_I8MM_MASK }, + { .name = "ID_AA64ISAR2_EL1", + .exported_bits = R_ID_AA64ISAR2_WFXT_MASK | + R_ID_AA64ISAR2_RPRES_MASK | + R_ID_AA64ISAR2_GPA3_MASK | + R_ID_AA64ISAR2_APA3_MASK }, { .name = "ID_AA64ISAR*_EL1_RESERVED", - .is_glob = true }, + .is_glob = true }, }; modify_arm_cp_regs(v8_idregs, v8_user_idregs); #endif @@ -XXX,XX +XXX,XX @@ void register_cp_regs_for_features(ARMCPU *cpu) #ifdef CONFIG_USER_ONLY static const ARMCPRegUserSpaceInfo id_v8_user_midr_cp_reginfo[] = { { .name = "MIDR_EL1", - .exported_bits = 0x00000000ffffffff }, - { .name = "REVIDR_EL1" }, + .exported_bits = R_MIDR_EL1_REVISION_MASK | + R_MIDR_EL1_PARTNUM_MASK | + R_MIDR_EL1_ARCHITECTURE_MASK | + R_MIDR_EL1_VARIANT_MASK | + R_MIDR_EL1_IMPLEMENTER_MASK }, + { .name = "REVIDR_EL1" }, }; modify_arm_cp_regs(id_v8_midr_cp_reginfo, id_v8_user_midr_cp_reginfo); #endif diff --git a/tests/tcg/aarch64/Makefile.target b/tests/tcg/aarch64/Makefile.target index XXXXXXX..XXXXXXX 100644 --- a/tests/tcg/aarch64/Makefile.target +++ b/tests/tcg/aarch64/Makefile.target @@ -XXX,XX +XXX,XX @@ config-cc.mak: Makefile $(call cc-option,-march=armv8.1-a+sve2, CROSS_CC_HAS_SVE2); \ $(call cc-option,-march=armv8.3-a, CROSS_CC_HAS_ARMV8_3); \ $(call cc-option,-mbranch-protection=standard, CROSS_CC_HAS_ARMV8_BTI); \ - $(call cc-option,-march=armv8.5-a+memtag, CROSS_CC_HAS_ARMV8_MTE)) 3> config-cc.mak + $(call cc-option,-march=armv8.5-a+memtag, CROSS_CC_HAS_ARMV8_MTE); \ + $(call cc-option,-march=armv9-a+sme, CROSS_CC_HAS_ARMV9_SME)) 3> config-cc.mak -include config-cc.mak # Pauth Tests @@ -XXX,XX +XXX,XX @@ endif ifneq ($(CROSS_CC_HAS_SVE),) # System Registers Tests AARCH64_TESTS += sysregs +ifneq ($(CROSS_CC_HAS_ARMV9_SME),) +sysregs: CFLAGS+=-march=armv9-a+sme -DHAS_ARMV9_SME +else sysregs: CFLAGS+=-march=armv8.1-a+sve +endif # SVE ioctl test AARCH64_TESTS += sve-ioctls diff --git a/tests/tcg/aarch64/sysregs.c b/tests/tcg/aarch64/sysregs.c index XXXXXXX..XXXXXXX 100644 --- a/tests/tcg/aarch64/sysregs.c +++ b/tests/tcg/aarch64/sysregs.c @@ -XXX,XX +XXX,XX @@ #define HWCAP_CPUID (1 << 11) #endif +/* + * Older assemblers don't recognize newer system register names, + * but we can still access them by the Sn_n_Cn_Cn_n syntax. + */ +#define SYS_ID_AA64ISAR2_EL1 S3_0_C0_C6_2 +#define SYS_ID_AA64MMFR2_EL1 S3_0_C0_C7_2 + int failed_bit_count; /* Read and print system register `id' value */ @@ -XXX,XX +XXX,XX @@ int main(void) * minimum valid fields - for the purposes of this check allowed * to have non-zero values. */ - get_cpu_reg_check_mask(id_aa64isar0_el1, _m(00ff,ffff,f0ff,fff0)); - get_cpu_reg_check_mask(id_aa64isar1_el1, _m(0000,00f0,ffff,ffff)); + get_cpu_reg_check_mask(id_aa64isar0_el1, _m(f0ff,ffff,f0ff,fff0)); + get_cpu_reg_check_mask(id_aa64isar1_el1, _m(00ff,f0ff,ffff,ffff)); + get_cpu_reg_check_mask(SYS_ID_AA64ISAR2_EL1, _m(0000,0000,0000,ffff)); /* TGran4 & TGran64 as pegged to -1 */ - get_cpu_reg_check_mask(id_aa64mmfr0_el1, _m(0000,0000,ff00,0000)); - get_cpu_reg_check_zero(id_aa64mmfr1_el1); + get_cpu_reg_check_mask(id_aa64mmfr0_el1, _m(f000,0000,ff00,0000)); + get_cpu_reg_check_mask(id_aa64mmfr1_el1, _m(0000,f000,0000,0000)); + get_cpu_reg_check_mask(SYS_ID_AA64MMFR2_EL1, _m(0000,000f,0000,0000)); /* EL1/EL0 reported as AA64 only */ get_cpu_reg_check_mask(id_aa64pfr0_el1, _m(000f,000f,00ff,0011)); - get_cpu_reg_check_mask(id_aa64pfr1_el1, _m(0000,0000,0000,00f0)); + get_cpu_reg_check_mask(id_aa64pfr1_el1, _m(0000,0000,0f00,0fff)); /* all hidden, DebugVer fixed to 0x6 (ARMv8 debug architecture) */ get_cpu_reg_check_mask(id_aa64dfr0_el1, _m(0000,0000,0000,0006)); get_cpu_reg_check_zero(id_aa64dfr1_el1); - get_cpu_reg_check_zero(id_aa64zfr0_el1); + get_cpu_reg_check_mask(id_aa64zfr0_el1, _m(0ff0,ff0f,00ff,00ff)); +#ifdef HAS_ARMV9_SME + get_cpu_reg_check_mask(id_aa64smfr0_el1, _m(80f1,00fd,0000,0000)); +#endif get_cpu_reg_check_zero(id_aa64afr0_el1); get_cpu_reg_check_zero(id_aa64afr1_el1); -- 2.39.2
From: Peter Maydell <peter.maydell@linaro.org> Some assemblers will complain about attempts to access id_aa64zfr0_el1 and id_aa64smfr0_el1 by name if the test binary isn't built for the right processor type: /tmp/ccASXpLo.s:782: Error: selected processor does not support system register name 'id_aa64zfr0_el1' /tmp/ccASXpLo.s:829: Error: selected processor does not support system register name 'id_aa64smfr0_el1' However, these registers are in the ID space and are guaranteed to read-as-zero on older CPUs, so the access is both safe and sensible. Switch to using the S syntax, as we already do for ID_AA64ISAR2_EL1 and ID_AA64MMFR2_EL1. This allows us to drop the HAS_ARMV9_SME check and the makefile machinery to adjust the CFLAGS for this test, so we don't rely on having a sufficiently new compiler to be able to check these registers. This means we're actually testing the SME ID register: no released GCC yet recognizes -march=armv9-a+sme, so that was always skipped. It also avoids a future problem if we try to switch the "do we have SME support in the toolchain" check from "in the compiler" to "in the assembler" (at which point we would otherwise run into the above errors). Signed-off-by: Peter Maydell <peter.maydell@linaro.org> (cherry picked from commit 3dc2afeab2964b54848715b913b6c605f36be3e1) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> (Mjt: pick this for v8.0.0-2361-g1f51573f79 "target/arm: Fix SME full tile indexing") diff --git a/tests/tcg/aarch64/Makefile.target b/tests/tcg/aarch64/Makefile.target index XXXXXXX..XXXXXXX 100644 --- a/tests/tcg/aarch64/Makefile.target +++ b/tests/tcg/aarch64/Makefile.target @@ -XXX,XX +XXX,XX @@ AARCH64_TESTS += mte-1 mte-2 mte-3 mte-4 mte-5 mte-6 mte-7 mte-%: CFLAGS += -march=armv8.5-a+memtag endif -ifneq ($(CROSS_CC_HAS_SVE),) # System Registers Tests AARCH64_TESTS += sysregs -ifneq ($(CROSS_CC_HAS_ARMV9_SME),) -sysregs: CFLAGS+=-march=armv9-a+sme -DHAS_ARMV9_SME -else -sysregs: CFLAGS+=-march=armv8.1-a+sve -endif +ifneq ($(CROSS_CC_HAS_SVE),) # SVE ioctl test AARCH64_TESTS += sve-ioctls sve-ioctls: CFLAGS+=-march=armv8.1-a+sve diff --git a/tests/tcg/aarch64/sysregs.c b/tests/tcg/aarch64/sysregs.c index XXXXXXX..XXXXXXX 100644 --- a/tests/tcg/aarch64/sysregs.c +++ b/tests/tcg/aarch64/sysregs.c @@ -XXX,XX +XXX,XX @@ /* * Older assemblers don't recognize newer system register names, * but we can still access them by the Sn_n_Cn_Cn_n syntax. + * This also means we don't need to specifically request that the + * assembler enables whatever architectural features the ID registers + * syntax might be gated behind. */ #define SYS_ID_AA64ISAR2_EL1 S3_0_C0_C6_2 #define SYS_ID_AA64MMFR2_EL1 S3_0_C0_C7_2 +#define SYS_ID_AA64ZFR0_EL1 S3_0_C0_C4_4 +#define SYS_ID_AA64SMFR0_EL1 S3_0_C0_C4_5 int failed_bit_count; @@ -XXX,XX +XXX,XX @@ int main(void) /* all hidden, DebugVer fixed to 0x6 (ARMv8 debug architecture) */ get_cpu_reg_check_mask(id_aa64dfr0_el1, _m(0000,0000,0000,0006)); get_cpu_reg_check_zero(id_aa64dfr1_el1); - get_cpu_reg_check_mask(id_aa64zfr0_el1, _m(0ff0,ff0f,00ff,00ff)); -#ifdef HAS_ARMV9_SME - get_cpu_reg_check_mask(id_aa64smfr0_el1, _m(80f1,00fd,0000,0000)); -#endif + get_cpu_reg_check_mask(SYS_ID_AA64ZFR0_EL1, _m(0ff0,ff0f,00ff,00ff)); + get_cpu_reg_check_mask(SYS_ID_AA64SMFR0_EL1, _m(80f1,00fd,0000,0000)); get_cpu_reg_check_zero(id_aa64afr0_el1); get_cpu_reg_check_zero(id_aa64afr1_el1); -- 2.39.2
From: Richard Henderson <richard.henderson@linaro.org> For the outer product set of insns, which take an entire matrix tile as output, the argument is not a combined tile+column. Therefore using get_tile_rowcol was incorrect, as we extracted the tile number from itself. The test case relies only on assembler support for SME, since no release of GCC recognizes -march=armv9-a+sme yet. Cc: qemu-stable@nongnu.org Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1620 Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20230622151201.1578522-5-richard.henderson@linaro.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> [PMM: dropped now-unneeded changes to sysregs CFLAGS] Signed-off-by: Peter Maydell <peter.maydell@linaro.org> (cherry picked from commit 1f51573f7925b80e79a29f87c7d9d6ead60960c0) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> diff --git a/target/arm/translate-sme.c b/target/arm/translate-sme.c index XXXXXXX..XXXXXXX 100644 --- a/target/arm/translate-sme.c +++ b/target/arm/translate-sme.c @@ -XXX,XX +XXX,XX @@ static TCGv_ptr get_tile_rowcol(DisasContext *s, int esz, int rs, return addr; } +/* + * Resolve tile.size[0] to a host pointer. + * Used by e.g. outer product insns where we require the entire tile. + */ +static TCGv_ptr get_tile(DisasContext *s, int esz, int tile) +{ + TCGv_ptr addr = tcg_temp_new_ptr(); + int offset; + + offset = tile * sizeof(ARMVectorReg) + offsetof(CPUARMState, zarray); + + tcg_gen_addi_ptr(addr, cpu_env, offset); + return addr; +} + static bool trans_ZERO(DisasContext *s, arg_ZERO *a) { if (!dc_isar_feature(aa64_sme, s)) { @@ -XXX,XX +XXX,XX @@ static bool do_adda(DisasContext *s, arg_adda *a, MemOp esz, return true; } - /* Sum XZR+zad to find ZAd. */ - za = get_tile_rowcol(s, esz, 31, a->zad, false); + za = get_tile(s, esz, a->zad); zn = vec_full_reg_ptr(s, a->zn); pn = pred_full_reg_ptr(s, a->pn); pm = pred_full_reg_ptr(s, a->pm); @@ -XXX,XX +XXX,XX @@ static bool do_outprod(DisasContext *s, arg_op *a, MemOp esz, return true; } - /* Sum XZR+zad to find ZAd. */ - za = get_tile_rowcol(s, esz, 31, a->zad, false); + za = get_tile(s, esz, a->zad); zn = vec_full_reg_ptr(s, a->zn); zm = vec_full_reg_ptr(s, a->zm); pn = pred_full_reg_ptr(s, a->pn); @@ -XXX,XX +XXX,XX @@ static bool do_outprod_fpst(DisasContext *s, arg_op *a, MemOp esz, return true; } - /* Sum XZR+zad to find ZAd. */ - za = get_tile_rowcol(s, esz, 31, a->zad, false); + za = get_tile(s, esz, a->zad); zn = vec_full_reg_ptr(s, a->zn); zm = vec_full_reg_ptr(s, a->zm); pn = pred_full_reg_ptr(s, a->pn); diff --git a/tests/tcg/aarch64/Makefile.target b/tests/tcg/aarch64/Makefile.target index XXXXXXX..XXXXXXX 100644 --- a/tests/tcg/aarch64/Makefile.target +++ b/tests/tcg/aarch64/Makefile.target @@ -XXX,XX +XXX,XX @@ config-cc.mak: Makefile $(call cc-option,-march=armv8.3-a, CROSS_CC_HAS_ARMV8_3); \ $(call cc-option,-mbranch-protection=standard, CROSS_CC_HAS_ARMV8_BTI); \ $(call cc-option,-march=armv8.5-a+memtag, CROSS_CC_HAS_ARMV8_MTE); \ - $(call cc-option,-march=armv9-a+sme, CROSS_CC_HAS_ARMV9_SME)) 3> config-cc.mak + $(call cc-option,-Wa$(COMMA)-march=armv9-a+sme, CROSS_AS_HAS_ARMV9_SME)) 3> config-cc.mak -include config-cc.mak # Pauth Tests @@ -XXX,XX +XXX,XX @@ AARCH64_TESTS += mte-1 mte-2 mte-3 mte-4 mte-5 mte-6 mte-7 mte-%: CFLAGS += -march=armv8.5-a+memtag endif +# SME Tests +ifneq ($(CROSS_AS_HAS_ARMV9_SME),) +AARCH64_TESTS += sme-outprod1 +endif + # System Registers Tests AARCH64_TESTS += sysregs diff --git a/tests/tcg/aarch64/sme-outprod1.c b/tests/tcg/aarch64/sme-outprod1.c new file mode 100644 index XXXXXXX..XXXXXXX --- /dev/null +++ b/tests/tcg/aarch64/sme-outprod1.c @@ -XXX,XX +XXX,XX @@ +/* + * SME outer product, 1 x 1. + * SPDX-License-Identifier: GPL-2.0-or-later + */ + +#include <stdio.h> + +extern void foo(float *dst); + +asm( +" .arch_extension sme\n" +" .type foo, @function\n" +"foo:\n" +" stp x29, x30, [sp, -80]!\n" +" mov x29, sp\n" +" stp d8, d9, [sp, 16]\n" +" stp d10, d11, [sp, 32]\n" +" stp d12, d13, [sp, 48]\n" +" stp d14, d15, [sp, 64]\n" +" smstart\n" +" ptrue p0.s, vl4\n" +" fmov z0.s, #1.0\n" +/* + * An outer product of a vector of 1.0 by itself should be a matrix of 1.0. + * Note that we are using tile 1 here (za1.s) rather than tile 0. + */ +" zero {za}\n" +" fmopa za1.s, p0/m, p0/m, z0.s, z0.s\n" +/* + * Read the first 4x4 sub-matrix of elements from tile 1: + * Note that za1h should be interchangable here. + */ +" mov w12, #0\n" +" mova z0.s, p0/m, za1v.s[w12, #0]\n" +" mova z1.s, p0/m, za1v.s[w12, #1]\n" +" mova z2.s, p0/m, za1v.s[w12, #2]\n" +" mova z3.s, p0/m, za1v.s[w12, #3]\n" +/* + * And store them to the input pointer (dst in the C code): + */ +" st1w {z0.s}, p0, [x0]\n" +" add x0, x0, #16\n" +" st1w {z1.s}, p0, [x0]\n" +" add x0, x0, #16\n" +" st1w {z2.s}, p0, [x0]\n" +" add x0, x0, #16\n" +" st1w {z3.s}, p0, [x0]\n" +" smstop\n" +" ldp d8, d9, [sp, 16]\n" +" ldp d10, d11, [sp, 32]\n" +" ldp d12, d13, [sp, 48]\n" +" ldp d14, d15, [sp, 64]\n" +" ldp x29, x30, [sp], 80\n" +" ret\n" +" .size foo, . - foo" +); + +int main() +{ + float dst[16]; + int i, j; + + foo(dst); + + for (i = 0; i < 16; i++) { + if (dst[i] != 1.0f) { + break; + } + } + + if (i == 16) { + return 0; /* success */ + } + + /* failure */ + for (i = 0; i < 4; ++i) { + for (j = 0; j < 4; ++j) { + printf("%f ", (double)dst[i * 4 + j]); + } + printf("\n"); + } + return 1; +} -- 2.39.2
From: Peter Maydell <peter.maydell@linaro.org> The sun4v RTC device model added under commit a0e893039cf2ce0 in 2016 was unfortunately added with a license of GPL-v3-or-later, which is not compatible with other QEMU code which has a GPL-v2-only license. Relicense the code in the .c and the .h file to GPL-v2-or-later, to make it compatible with the rest of QEMU. Cc: qemu-stable@nongnu.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Paolo Bonzini (for Red Hat) <pbonzini@redhat.com> Signed-off-by: Artyom Tarasenko <atar4qemu@gmail.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Acked-by: Alex Bennée <alex.bennee@linaro.org> Message-id: 20240223161300.938542-1-peter.maydell@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org> (cherry picked from commit fd7f95f23d6fe485332c1d4b489eb719fcb7c225) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> diff --git a/hw/rtc/sun4v-rtc.c b/hw/rtc/sun4v-rtc.c index XXXXXXX..XXXXXXX 100644 --- a/hw/rtc/sun4v-rtc.c +++ b/hw/rtc/sun4v-rtc.c @@ -XXX,XX +XXX,XX @@ * * Copyright (c) 2016 Artyom Tarasenko * - * This code is licensed under the GNU GPL v3 or (at your option) any later + * This code is licensed under the GNU GPL v2 or (at your option) any later * version. */ diff --git a/include/hw/rtc/sun4v-rtc.h b/include/hw/rtc/sun4v-rtc.h index XXXXXXX..XXXXXXX 100644 --- a/include/hw/rtc/sun4v-rtc.h +++ b/include/hw/rtc/sun4v-rtc.h @@ -XXX,XX +XXX,XX @@ * * Copyright (c) 2016 Artyom Tarasenko * - * This code is licensed under the GNU GPL v3 or (at your option) any later + * This code is licensed under the GNU GPL v2 or (at your option) any later * version. */ -- 2.39.2
From: Dmitrii Gavrilov <ds-gavr@yandex-team.ru> Original goal of addition of drain_call_rcu to qmp_device_add was to cover the failure case of qdev_device_add. It seems call of drain_call_rcu was misplaced in 7bed89958bfbf40df what led to waiting for pending RCU callbacks under happy path too. What led to overall performance degradation of qmp_device_add. In this patch call of drain_call_rcu moved under handling of failure of qdev_device_add. Signed-off-by: Dmitrii Gavrilov <ds-gavr@yandex-team.ru> Message-ID: <20231103105602.90475-1-ds-gavr@yandex-team.ru> Fixes: 7bed89958bf ("device_core: use drain_call_rcu in in qmp_device_add", 2020-10-12) Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit 012b170173bcaa14b9bc26209e0813311ac78489) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> diff --git a/softmmu/qdev-monitor.c b/softmmu/qdev-monitor.c index XXXXXXX..XXXXXXX 100644 --- a/softmmu/qdev-monitor.c +++ b/softmmu/qdev-monitor.c @@ -XXX,XX +XXX,XX @@ void qmp_device_add(QDict *qdict, QObject **ret_data, Error **errp) return; } dev = qdev_device_add(opts, errp); - - /* - * Drain all pending RCU callbacks. This is done because - * some bus related operations can delay a device removal - * (in this case this can happen if device is added and then - * removed due to a configuration error) - * to a RCU callback, but user might expect that this interface - * will finish its job completely once qmp command returns result - * to the user - */ - drain_call_rcu(); - if (!dev) { + /* + * Drain all pending RCU callbacks. This is done because + * some bus related operations can delay a device removal + * (in this case this can happen if device is added and then + * removed due to a configuration error) + * to a RCU callback, but user might expect that this interface + * will finish its job completely once qmp command returns result + * to the user + */ + drain_call_rcu(); + qemu_opts_del(opts); return; } -- 2.39.2
From: Sven Schnelle <svens@stackframe.org> Netbsd isn't happy with qemu lsi53c895a emulation: cd0(esiop0:0:2:0): command with tag id 0 reset esiop0: autoconfiguration error: phase mismatch without command esiop0: autoconfiguration error: unhandled scsi interrupt, sist=0x80 sstat1=0x0 DSA=0x23a64b1 DSP=0x50 This is because lsi_bad_phase() triggers a phase mismatch, which stops SCRIPT processing. However, after returning to lsi_command_complete(), SCRIPT is restarted with lsi_resume_script(). Fix this by adding a return value to lsi_bad_phase(), and only resume script processing when lsi_bad_phase() didn't trigger a host interrupt. Signed-off-by: Sven Schnelle <svens@stackframe.org> Tested-by: Helge Deller <deller@gmx.de> Message-ID: <20240302214453.2071388-1-svens@stackframe.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit a9198b3132d81a6bfc9fdbf6f3d3a514c2864674) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> diff --git a/hw/scsi/lsi53c895a.c b/hw/scsi/lsi53c895a.c index XXXXXXX..XXXXXXX 100644 --- a/hw/scsi/lsi53c895a.c +++ b/hw/scsi/lsi53c895a.c @@ -XXX,XX +XXX,XX @@ static inline void lsi_set_phase(LSIState *s, int phase) s->sstat1 = (s->sstat1 & ~PHASE_MASK) | phase; } -static void lsi_bad_phase(LSIState *s, int out, int new_phase) +static int lsi_bad_phase(LSIState *s, int out, int new_phase) { + int ret = 0; /* Trigger a phase mismatch. */ if (s->ccntl0 & LSI_CCNTL0_ENPMJ) { if ((s->ccntl0 & LSI_CCNTL0_PMJCTL)) { @@ -XXX,XX +XXX,XX @@ static void lsi_bad_phase(LSIState *s, int out, int new_phase) trace_lsi_bad_phase_interrupt(); lsi_script_scsi_interrupt(s, LSI_SIST0_MA, 0); lsi_stop_script(s); + ret = 1; } lsi_set_phase(s, new_phase); + return ret; } @@ -XXX,XX +XXX,XX @@ static int lsi_queue_req(LSIState *s, SCSIRequest *req, uint32_t len) static void lsi_command_complete(SCSIRequest *req, size_t resid) { LSIState *s = LSI53C895A(req->bus->qbus.parent); - int out; + int out, stop = 0; out = (s->sstat1 & PHASE_MASK) == PHASE_DO; trace_lsi_command_complete(req->status); @@ -XXX,XX +XXX,XX @@ static void lsi_command_complete(SCSIRequest *req, size_t resid) s->command_complete = 2; if (s->waiting && s->dbc != 0) { /* Raise phase mismatch for short transfers. */ - lsi_bad_phase(s, out, PHASE_ST); + stop = lsi_bad_phase(s, out, PHASE_ST); + if (stop) { + s->waiting = 0; + } } else { lsi_set_phase(s, PHASE_ST); } @@ -XXX,XX +XXX,XX @@ static void lsi_command_complete(SCSIRequest *req, size_t resid) lsi_request_free(s, s->current); scsi_req_unref(req); } - lsi_resume_script(s); + if (!stop) { + lsi_resume_script(s); + } } /* Callback to indicate that the SCSI layer has completed a transfer. */ -- 2.39.2
From: Sven Schnelle <svens@stackframe.org> When the maximum count of SCRIPTS instructions is reached, the code stops execution and returns, but fails to decrement the reentrancy counter. This effectively renders the SCSI controller unusable because on next entry the reentrancy counter is still above the limit. This bug was seen on HP-UX 10.20 which seems to trigger SCRIPTS loops. Fixes: b987718bbb ("hw/scsi/lsi53c895a: Fix reentrancy issues in the LSI controller (CVE-2023-0330)") Signed-off-by: Sven Schnelle <svens@stackframe.org> Message-ID: <20240128202214.2644768-1-svens@stackframe.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Tested-by: Helge Deller <deller@gmx.de> Signed-off-by: Thomas Huth <thuth@redhat.com> (cherry picked from commit 8b09b7fe47082c69295a0fc0cc01b041b6385025) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> diff --git a/hw/scsi/lsi53c895a.c b/hw/scsi/lsi53c895a.c index XXXXXXX..XXXXXXX 100644 --- a/hw/scsi/lsi53c895a.c +++ b/hw/scsi/lsi53c895a.c @@ -XXX,XX +XXX,XX @@ again: lsi_script_scsi_interrupt(s, LSI_SIST0_UDC, 0); lsi_disconnect(s); trace_lsi_execute_script_stop(); + reentrancy_level--; return; } insn = read_dword(s, s->dsp); -- 2.39.2
From: Sven Schnelle <svens@stackframe.org> HP-UX 10.20 seems to make the lsi53c895a spinning on a memory location under certain circumstances. As the SCSI controller and CPU are not running at the same time this loop will never finish. After some time, the check loop interrupts with a unexpected device disconnect. This works, but is slow because the kernel resets the scsi controller. Instead of signaling UDC, start a timer and exit the loop. Until the timer fires, the CPU can process instructions which might changes the memory location. The limit of instructions is also reduced because scripts running on the SCSI processor are usually very short. This keeps the time until the loop is exit short. Suggested-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Sven Schnelle <svens@stackframe.org> Message-ID: <20240229204407.1699260-1-svens@stackframe.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit 9876359990dd4c8a48de65cf5e1c3d13e96a7f4e) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> diff --git a/hw/scsi/lsi53c895a.c b/hw/scsi/lsi53c895a.c index XXXXXXX..XXXXXXX 100644 --- a/hw/scsi/lsi53c895a.c +++ b/hw/scsi/lsi53c895a.c @@ -XXX,XX +XXX,XX @@ static const char *names[] = { #define LSI_TAG_VALID (1 << 16) /* Maximum instructions to process. */ -#define LSI_MAX_INSN 10000 +#define LSI_MAX_INSN 100 typedef struct lsi_request { SCSIRequest *req; @@ -XXX,XX +XXX,XX @@ enum { LSI_WAIT_RESELECT, /* Wait Reselect instruction has been issued */ LSI_DMA_SCRIPTS, /* processing DMA from lsi_execute_script */ LSI_DMA_IN_PROGRESS, /* DMA operation is in progress */ + LSI_WAIT_SCRIPTS, /* SCRIPTS stopped because of instruction count limit */ }; enum { @@ -XXX,XX +XXX,XX @@ struct LSIState { MemoryRegion ram_io; MemoryRegion io_io; AddressSpace pci_io_as; + QEMUTimer *scripts_timer; int carry; /* ??? Should this be an a visible register somewhere? */ int status; @@ -XXX,XX +XXX,XX @@ static void lsi_soft_reset(LSIState *s) s->sbr = 0; assert(QTAILQ_EMPTY(&s->queue)); assert(!s->current); + timer_del(s->scripts_timer); } static int lsi_dma_40bit(LSIState *s) @@ -XXX,XX +XXX,XX @@ static void lsi_wait_reselect(LSIState *s) } } +static void lsi_scripts_timer_start(LSIState *s) +{ + trace_lsi_scripts_timer_start(); + timer_mod(s->scripts_timer, qemu_clock_get_us(QEMU_CLOCK_VIRTUAL) + 500); +} + static void lsi_execute_script(LSIState *s) { PCIDevice *pci_dev = PCI_DEVICE(s); @@ -XXX,XX +XXX,XX @@ static void lsi_execute_script(LSIState *s) int insn_processed = 0; static int reentrancy_level; + if (s->waiting == LSI_WAIT_SCRIPTS) { + timer_del(s->scripts_timer); + s->waiting = LSI_NOWAIT; + } + reentrancy_level++; s->istat1 |= LSI_ISTAT1_SRUN; @@ -XXX,XX +XXX,XX @@ again: /* * Some windows drivers make the device spin waiting for a memory location * to change. If we have executed more than LSI_MAX_INSN instructions then - * assume this is the case and force an unexpected device disconnect. This - * is apparently sufficient to beat the drivers into submission. + * assume this is the case and start a timer. Until the timer fires, the + * host CPU has a chance to run and change the memory location. * * Another issue (CVE-2023-0330) can occur if the script is programmed to * trigger itself again and again. Avoid this problem by stopping after @@ -XXX,XX +XXX,XX @@ again: * which should be enough for all valid use cases). */ if (++insn_processed > LSI_MAX_INSN || reentrancy_level > 8) { - if (!(s->sien0 & LSI_SIST0_UDC)) { - qemu_log_mask(LOG_GUEST_ERROR, - "lsi_scsi: inf. loop with UDC masked"); - } - lsi_script_scsi_interrupt(s, LSI_SIST0_UDC, 0); - lsi_disconnect(s); - trace_lsi_execute_script_stop(); + s->waiting = LSI_WAIT_SCRIPTS; + lsi_scripts_timer_start(s); reentrancy_level--; return; } @@ -XXX,XX +XXX,XX @@ static int lsi_post_load(void *opaque, int version_id) return -EINVAL; } + if (s->waiting == LSI_WAIT_SCRIPTS) { + lsi_scripts_timer_start(s); + } return 0; } @@ -XXX,XX +XXX,XX @@ static const struct SCSIBusInfo lsi_scsi_info = { .cancel = lsi_request_cancelled }; +static void scripts_timer_cb(void *opaque) +{ + LSIState *s = opaque; + + trace_lsi_scripts_timer_triggered(); + s->waiting = LSI_NOWAIT; + lsi_execute_script(s); +} + static void lsi_scsi_realize(PCIDevice *dev, Error **errp) { LSIState *s = LSI53C895A(dev); @@ -XXX,XX +XXX,XX @@ static void lsi_scsi_realize(PCIDevice *dev, Error **errp) "lsi-ram", 0x2000); memory_region_init_io(&s->io_io, OBJECT(s), &lsi_io_ops, s, "lsi-io", 256); + s->scripts_timer = timer_new_us(QEMU_CLOCK_VIRTUAL, scripts_timer_cb, s); /* * Since we use the address-space API to interact with ram_io, disable the @@ -XXX,XX +XXX,XX @@ static void lsi_scsi_exit(PCIDevice *dev) LSIState *s = LSI53C895A(dev); address_space_destroy(&s->pci_io_as); + timer_del(s->scripts_timer); } static void lsi_class_init(ObjectClass *klass, void *data) diff --git a/hw/scsi/trace-events b/hw/scsi/trace-events index XXXXXXX..XXXXXXX 100644 --- a/hw/scsi/trace-events +++ b/hw/scsi/trace-events @@ -XXX,XX +XXX,XX @@ lsi_execute_script_stop(void) "SCRIPTS execution stopped" lsi_awoken(void) "Woken by SIGP" lsi_reg_read(const char *name, int offset, uint8_t ret) "Read reg %s 0x%x = 0x%02x" lsi_reg_write(const char *name, int offset, uint8_t val) "Write reg %s 0x%x = 0x%02x" +lsi_scripts_timer_triggered(void) "SCRIPTS timer triggered" +lsi_scripts_timer_start(void) "SCRIPTS timer started" # virtio-scsi.c virtio_scsi_cmd_req(int lun, uint32_t tag, uint8_t cmd) "virtio_scsi_cmd_req lun=%u tag=0x%x cmd=0x%x" -- 2.39.2
For a long time, we provide two compression formats in the download area, .bz2 and .xz. There's absolutely no reason to provide two in parallel, .xz compresses better, and all the links we use points to .xz. Downstream distributions mostly use .xz too. For the release maintenance providing two formats is definitely extra burden too. Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> (cherry picked from commit 9bc9e95119445d7a430b0fc8b7daf22a3612bbd3) diff --git a/scripts/make-release b/scripts/make-release index XXXXXXX..XXXXXXX 100755 --- a/scripts/make-release +++ b/scripts/make-release @@ -XXX,XX +XXX,XX @@ git submodule update --init CryptoPkg/Library/OpensslLib/openssl \ MdeModulePkg/Library/BrotliCustomDecompressLib/brotli) popd -tar --exclude=.git -cjf ${destination}.tar.bz2 ${destination} +tar --exclude=.git -cJf ${destination}.tar.xz ${destination} rm -rf ${destination} -- 2.39.2
From: Laurent Vivier <lvivier@redhat.com> On resume e1000e_vm_state_change() always calls e1000e_autoneg_resume() that sets link_down to false, and thus activates the link even if we have disabled it. The problem can be reproduced starting qemu in paused state (-S) and then set the link to down. When we resume the machine the link appears to be up. Reproducer: # qemu-system-x86_64 ... -device e1000e,netdev=netdev0,id=net0 -S {"execute": "qmp_capabilities" } {"execute": "set_link", "arguments": {"name": "net0", "up": false}} {"execute": "cont" } To fix the problem, merge the content of e1000e_vm_state_change() into e1000e_core_post_load() as e1000 does. Buglink: https://issues.redhat.com/browse/RHEL-21867 Fixes: 6f3fbe4ed06a ("net: Introduce e1000e device emulation") Suggested-by: Akihiko Odaki <akihiko.odaki@daynix.com> Signed-off-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> (cherry picked from commit 4cadf10234989861398e19f3bb441d3861f3bb7c) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c index XXXXXXX..XXXXXXX 100644 --- a/hw/net/e1000e_core.c +++ b/hw/net/e1000e_core.c @@ -XXX,XX +XXX,XX @@ e1000e_intmgr_timer_resume(E1000IntrDelayTimer *timer) } } -static void -e1000e_intmgr_timer_pause(E1000IntrDelayTimer *timer) -{ - if (timer->running) { - timer_del(timer->timer); - } -} - static inline void e1000e_intrmgr_stop_timer(E1000IntrDelayTimer *timer) { @@ -XXX,XX +XXX,XX @@ e1000e_intrmgr_resume(E1000ECore *core) } } -static void -e1000e_intrmgr_pause(E1000ECore *core) -{ - int i; - - e1000e_intmgr_timer_pause(&core->radv); - e1000e_intmgr_timer_pause(&core->rdtr); - e1000e_intmgr_timer_pause(&core->raid); - e1000e_intmgr_timer_pause(&core->tidv); - e1000e_intmgr_timer_pause(&core->tadv); - - e1000e_intmgr_timer_pause(&core->itr); - - for (i = 0; i < E1000E_MSIX_VEC_NUM; i++) { - e1000e_intmgr_timer_pause(&core->eitr[i]); - } -} - static void e1000e_intrmgr_reset(E1000ECore *core) { @@ -XXX,XX +XXX,XX @@ e1000e_core_read(E1000ECore *core, hwaddr addr, unsigned size) return 0; } -static inline void -e1000e_autoneg_pause(E1000ECore *core) -{ - timer_del(core->autoneg_timer); -} - static void e1000e_autoneg_resume(E1000ECore *core) { @@ -XXX,XX +XXX,XX @@ e1000e_autoneg_resume(E1000ECore *core) } } -static void -e1000e_vm_state_change(void *opaque, bool running, RunState state) -{ - E1000ECore *core = opaque; - - if (running) { - trace_e1000e_vm_state_running(); - e1000e_intrmgr_resume(core); - e1000e_autoneg_resume(core); - } else { - trace_e1000e_vm_state_stopped(); - e1000e_autoneg_pause(core); - e1000e_intrmgr_pause(core); - } -} - void e1000e_core_pci_realize(E1000ECore *core, const uint16_t *eeprom_templ, @@ -XXX,XX +XXX,XX @@ e1000e_core_pci_realize(E1000ECore *core, e1000e_autoneg_timer, core); e1000e_intrmgr_pci_realize(core); - core->vmstate = - qemu_add_vm_change_state_handler(e1000e_vm_state_change, core); - for (i = 0; i < E1000E_NUM_QUEUES; i++) { net_tx_pkt_init(&core->tx[i].tx_pkt, core->owner, E1000E_MAX_TX_FRAGS, core->has_vnet); @@ -XXX,XX +XXX,XX @@ e1000e_core_pci_uninit(E1000ECore *core) e1000e_intrmgr_pci_unint(core); - qemu_del_vm_change_state_handler(core->vmstate); - for (i = 0; i < E1000E_NUM_QUEUES; i++) { net_tx_pkt_reset(core->tx[i].tx_pkt); net_tx_pkt_uninit(core->tx[i].tx_pkt); @@ -XXX,XX +XXX,XX @@ e1000e_core_post_load(E1000ECore *core) */ nc->link_down = (core->mac[STATUS] & E1000_STATUS_LU) == 0; + /* + * we need to restart intrmgr timers, as an older version of + * QEMU can have stopped them before migration + */ + e1000e_intrmgr_resume(core); + e1000e_autoneg_resume(core); + return 0; } diff --git a/hw/net/e1000e_core.h b/hw/net/e1000e_core.h index XXXXXXX..XXXXXXX 100644 --- a/hw/net/e1000e_core.h +++ b/hw/net/e1000e_core.h @@ -XXX,XX +XXX,XX @@ struct E1000Core { E1000IntrDelayTimer eitr[E1000E_MSIX_VEC_NUM]; bool eitr_intr_pending[E1000E_MSIX_VEC_NUM]; - VMChangeStateEntry *vmstate; - uint32_t itr_guest_value; uint32_t eitr_guest_value[E1000E_MSIX_VEC_NUM]; -- 2.39.2
From: Nick Briggs <nicholas.h.briggs@gmail.com> There is no guarantee that the PCNetState is allocated such that csr[8] is allocated on an 8-byte boundary. Since not all hosts are capable of unaligned fetches the 16-bit elements need to be fetched individually to avoid a potential fault. Closes issue #2143 Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2143 Signed-off-by: Nick Briggs <nicholas.h.briggs@gmail.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Jason Wang <jasowang@redhat.com> (cherry picked from commit 6a5287ce80470bb8df95901d73ee779a64e70c3a) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> diff --git a/hw/net/pcnet.c b/hw/net/pcnet.c index XXXXXXX..XXXXXXX 100644 --- a/hw/net/pcnet.c +++ b/hw/net/pcnet.c @@ -XXX,XX +XXX,XX @@ static inline int ladr_match(PCNetState *s, const uint8_t *buf, int size) { struct qemu_ether_header *hdr = (void *)buf; if ((*(hdr->ether_dhost)&0x01) && - ((uint64_t *)&s->csr[8])[0] != 0LL) { + (s->csr[8] | s->csr[9] | s->csr[10] | s->csr[11]) != 0) { uint8_t ladr[8] = { s->csr[8] & 0xff, s->csr[8] >> 8, s->csr[9] & 0xff, s->csr[9] >> 8, -- 2.39.2
From: Klaus Jensen <k.jensen@samsung.com> Remove an unnecessary local Error value in nvme_realize(). In the process, change nvme_check_constraints() to return a bool. Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com> (cherry picked from commit 784fd35387e9e6b42e3f365ddf44263eb25de8f7) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> (Mjt: needed for v8.2.0-2319-gfa905f65c5 "hw/nvme: add machine compatibility parameter to enable msix exclusive bar") diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c index XXXXXXX..XXXXXXX 100644 --- a/hw/nvme/ctrl.c +++ b/hw/nvme/ctrl.c @@ -XXX,XX +XXX,XX @@ static const MemoryRegionOps nvme_cmb_ops = { }, }; -static void nvme_check_constraints(NvmeCtrl *n, Error **errp) +static bool nvme_check_params(NvmeCtrl *n, Error **errp) { NvmeParams *params = &n->params; @@ -XXX,XX +XXX,XX @@ static void nvme_check_constraints(NvmeCtrl *n, Error **errp) if (n->namespace.blkconf.blk && n->subsys) { error_setg(errp, "subsystem support is unavailable with legacy " "namespace ('drive' property)"); - return; + return false; } if (params->max_ioqpairs < 1 || params->max_ioqpairs > NVME_MAX_IOQPAIRS) { error_setg(errp, "max_ioqpairs must be between 1 and %d", NVME_MAX_IOQPAIRS); - return; + return false; } if (params->msix_qsize < 1 || params->msix_qsize > PCI_MSIX_FLAGS_QSIZE + 1) { error_setg(errp, "msix_qsize must be between 1 and %d", PCI_MSIX_FLAGS_QSIZE + 1); - return; + return false; } if (!params->serial) { error_setg(errp, "serial property not set"); - return; + return false; } if (n->pmr.dev) { if (host_memory_backend_is_mapped(n->pmr.dev)) { error_setg(errp, "can't use already busy memdev: %s", object_get_canonical_path_component(OBJECT(n->pmr.dev))); - return; + return false; } if (!is_power_of_2(n->pmr.dev->size)) { error_setg(errp, "pmr backend size needs to be power of 2 in size"); - return; + return false; } host_memory_backend_set_mapped(n->pmr.dev, true); @@ -XXX,XX +XXX,XX @@ static void nvme_check_constraints(NvmeCtrl *n, Error **errp) if (n->params.zasl > n->params.mdts) { error_setg(errp, "zoned.zasl (Zone Append Size Limit) must be less " "than or equal to mdts (Maximum Data Transfer Size)"); - return; + return false; } if (!n->params.vsl) { error_setg(errp, "vsl must be non-zero"); - return; + return false; } if (params->sriov_max_vfs) { if (!n->subsys) { error_setg(errp, "subsystem is required for the use of SR-IOV"); - return; + return false; } if (params->sriov_max_vfs > NVME_MAX_VFS) { error_setg(errp, "sriov_max_vfs must be between 0 and %d", NVME_MAX_VFS); - return; + return false; } if (params->cmb_size_mb) { error_setg(errp, "CMB is not supported with SR-IOV"); - return; + return false; } if (n->pmr.dev) { error_setg(errp, "PMR is not supported with SR-IOV"); - return; + return false; } if (!params->sriov_vq_flexible || !params->sriov_vi_flexible) { error_setg(errp, "both sriov_vq_flexible and sriov_vi_flexible" " must be set for the use of SR-IOV"); - return; + return false; } if (params->sriov_vq_flexible < params->sriov_max_vfs * 2) { error_setg(errp, "sriov_vq_flexible must be greater than or equal" " to %d (sriov_max_vfs * 2)", params->sriov_max_vfs * 2); - return; + return false; } if (params->max_ioqpairs < params->sriov_vq_flexible + 2) { error_setg(errp, "(max_ioqpairs - sriov_vq_flexible) must be" " greater than or equal to 2"); - return; + return false; } if (params->sriov_vi_flexible < params->sriov_max_vfs) { error_setg(errp, "sriov_vi_flexible must be greater than or equal" " to %d (sriov_max_vfs)", params->sriov_max_vfs); - return; + return false; } if (params->msix_qsize < params->sriov_vi_flexible + 1) { error_setg(errp, "(msix_qsize - sriov_vi_flexible) must be" " greater than or equal to 1"); - return; + return false; } if (params->sriov_max_vi_per_vf && @@ -XXX,XX +XXX,XX @@ static void nvme_check_constraints(NvmeCtrl *n, Error **errp) error_setg(errp, "sriov_max_vi_per_vf must meet:" " (sriov_max_vi_per_vf - 1) %% %d == 0 and" " sriov_max_vi_per_vf >= 1", NVME_VF_RES_GRANULARITY); - return; + return false; } if (params->sriov_max_vq_per_vf && @@ -XXX,XX +XXX,XX @@ static void nvme_check_constraints(NvmeCtrl *n, Error **errp) error_setg(errp, "sriov_max_vq_per_vf must meet:" " (sriov_max_vq_per_vf - 1) %% %d == 0 and" " sriov_max_vq_per_vf >= 2", NVME_VF_RES_GRANULARITY); - return; + return false; } } + + return true; } static void nvme_init_state(NvmeCtrl *n) @@ -XXX,XX +XXX,XX @@ static void nvme_realize(PCIDevice *pci_dev, Error **errp) { NvmeCtrl *n = NVME(pci_dev); NvmeNamespace *ns; - Error *local_err = NULL; NvmeCtrl *pn = NVME(pcie_sriov_get_pf(pci_dev)); if (pci_is_vf(pci_dev)) { @@ -XXX,XX +XXX,XX @@ static void nvme_realize(PCIDevice *pci_dev, Error **errp) n->subsys = pn->subsys; } - nvme_check_constraints(n, &local_err); - if (local_err) { - error_propagate(errp, local_err); + if (!nvme_check_params(n, errp)) { return; } @@ -XXX,XX +XXX,XX @@ static void nvme_realize(PCIDevice *pci_dev, Error **errp) &pci_dev->qdev, n->parent_obj.qdev.id); if (nvme_init_subsys(n, errp)) { - error_propagate(errp, local_err); return; } nvme_init_state(n); -- 2.39.2
From: Klaus Jensen <k.jensen@samsung.com> Replace the local Error variable with errp and ERRP_GUARD() and change the return value to bool. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Klaus Jensen <k.jensen@samsung.com> (cherry picked from commit 973f76cf7743545a5d8a0a8bfdfe2cd02aa3e238) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> (Mjt: needed for v8.2.0-2319-gfa905f65c5 "hw/nvme: add machine compatibility parameter to enable msix exclusive bar") diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c index XXXXXXX..XXXXXXX 100644 --- a/hw/nvme/ctrl.c +++ b/hw/nvme/ctrl.c @@ -XXX,XX +XXX,XX @@ static int nvme_add_pm_capability(PCIDevice *pci_dev, uint8_t offset) return 0; } -static int nvme_init_pci(NvmeCtrl *n, PCIDevice *pci_dev, Error **errp) +static bool nvme_init_pci(NvmeCtrl *n, PCIDevice *pci_dev, Error **errp) { + ERRP_GUARD(); uint8_t *pci_conf = pci_dev->config; uint64_t bar_size; unsigned msix_table_offset, msix_pba_offset; int ret; - Error *err = NULL; - pci_conf[PCI_INTERRUPT_PIN] = 1; pci_config_set_prog_interface(pci_conf, 0x2); @@ -XXX,XX +XXX,XX @@ static int nvme_init_pci(NvmeCtrl *n, PCIDevice *pci_dev, Error **errp) } ret = msix_init(pci_dev, n->params.msix_qsize, &n->bar0, 0, msix_table_offset, - &n->bar0, 0, msix_pba_offset, 0, &err); - if (ret < 0) { - if (ret == -ENOTSUP) { - warn_report_err(err); - } else { - error_propagate(errp, err); - return ret; - } + &n->bar0, 0, msix_pba_offset, 0, errp); + if (ret == -ENOTSUP) { + /* report that msix is not supported, but do not error out */ + warn_report_err(*errp); + *errp = NULL; + } else if (ret < 0) { + /* propagate error to caller */ + return false; } nvme_update_msixcap_ts(pci_dev, n->conf_msix_qsize); @@ -XXX,XX +XXX,XX @@ static int nvme_init_pci(NvmeCtrl *n, PCIDevice *pci_dev, Error **errp) nvme_init_sriov(n, pci_dev, 0x120); } - return 0; + return true; } static void nvme_init_subnqn(NvmeCtrl *n) @@ -XXX,XX +XXX,XX @@ static void nvme_realize(PCIDevice *pci_dev, Error **errp) return; } nvme_init_state(n); - if (nvme_init_pci(n, pci_dev, errp)) { + if (!nvme_init_pci(n, pci_dev, errp)) { return; } nvme_init_ctrl(n, pci_dev); -- 2.39.2
From: Minwoo Im <minwoo.im@samsung.com> Currently, when a VF is created, it uses the 'params' object of the PF as it is. In other words, the 'params.serial' string memory area is also shared. In this situation, if the VF is removed from the system, the PF's 'params.serial' object is released with object_finalize() followed by object_property_del_all() which release the memory for 'serial' property. If that happens, the next VF created will inherit a serial from a corrupted memory area. If this happens, an error will occur when comparing subsys->serial and n->params.serial in the nvme_subsys_register_ctrl() function. Cc: qemu-stable@nongnu.org Fixes: 44c2c09488db ("hw/nvme: Add support for SR-IOV") Signed-off-by: Minwoo Im <minwoo.im@samsung.com> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Klaus Jensen <k.jensen@samsung.com> (cherry picked from commit 4f0a4a3d5854824e5c5eccf353d4a1f4f749a29d) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c index XXXXXXX..XXXXXXX 100644 --- a/hw/nvme/ctrl.c +++ b/hw/nvme/ctrl.c @@ -XXX,XX +XXX,XX @@ static void nvme_realize(PCIDevice *pci_dev, Error **errp) if (pci_is_vf(pci_dev)) { /* * VFs derive settings from the parent. PF's lifespan exceeds - * that of VF's, so it's safe to share params.serial. + * that of VF's. */ memcpy(&n->params, &pn->params, sizeof(NvmeParams)); + + /* + * Set PF's serial value to a new string memory to prevent 'serial' + * property object release of PF when a VF is removed from the system. + */ + n->params.serial = g_strdup(pn->params.serial); n->subsys = pn->subsys; } -- 2.39.2
From: Klaus Jensen <k.jensen@samsung.com> Generalize the mbar size helper such that it can handle cases where the MSI-X table and PBA are expected to be in an exclusive bar. Cc: qemu-stable@nongnu.org Reviewed-by: Jesper Wendel Devantier <foss@defmacro.it> Signed-off-by: Klaus Jensen <k.jensen@samsung.com> (cherry picked from commit ee7bda4d38cda3eaf114c850a723dd12e23d3abc) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c index XXXXXXX..XXXXXXX 100644 --- a/hw/nvme/ctrl.c +++ b/hw/nvme/ctrl.c @@ -XXX,XX +XXX,XX @@ static void nvme_init_pmr(NvmeCtrl *n, PCIDevice *pci_dev) memory_region_set_enabled(&n->pmr.dev->mr, false); } -static uint64_t nvme_bar_size(unsigned total_queues, unsigned total_irqs, - unsigned *msix_table_offset, - unsigned *msix_pba_offset) +static uint64_t nvme_mbar_size(unsigned total_queues, unsigned total_irqs, + unsigned *msix_table_offset, + unsigned *msix_pba_offset) { - uint64_t bar_size, msix_table_size, msix_pba_size; + uint64_t bar_size, msix_table_size; bar_size = sizeof(NvmeBar) + 2 * total_queues * NVME_DB_SIZE; + + if (total_irqs == 0) { + goto out; + } + bar_size = QEMU_ALIGN_UP(bar_size, 4 * KiB); if (msix_table_offset) { @@ -XXX,XX +XXX,XX @@ static uint64_t nvme_bar_size(unsigned total_queues, unsigned total_irqs, *msix_pba_offset = bar_size; } - msix_pba_size = QEMU_ALIGN_UP(total_irqs, 64) / 8; - bar_size += msix_pba_size; + bar_size += QEMU_ALIGN_UP(total_irqs, 64) / 8; - bar_size = pow2ceil(bar_size); - return bar_size; +out: + return pow2ceil(bar_size); } static void nvme_init_sriov(NvmeCtrl *n, PCIDevice *pci_dev, uint16_t offset) @@ -XXX,XX +XXX,XX @@ static void nvme_init_sriov(NvmeCtrl *n, PCIDevice *pci_dev, uint16_t offset) uint16_t vf_dev_id = n->params.use_intel_id ? PCI_DEVICE_ID_INTEL_NVME : PCI_DEVICE_ID_REDHAT_NVME; NvmePriCtrlCap *cap = &n->pri_ctrl_cap; - uint64_t bar_size = nvme_bar_size(le16_to_cpu(cap->vqfrsm), + uint64_t bar_size = nvme_mbar_size(le16_to_cpu(cap->vqfrsm), le16_to_cpu(cap->vifrsm), NULL, NULL); @@ -XXX,XX +XXX,XX @@ static bool nvme_init_pci(NvmeCtrl *n, PCIDevice *pci_dev, Error **errp) ERRP_GUARD(); uint8_t *pci_conf = pci_dev->config; uint64_t bar_size; - unsigned msix_table_offset, msix_pba_offset; + unsigned msix_table_offset = 0, msix_pba_offset = 0; int ret; pci_conf[PCI_INTERRUPT_PIN] = 1; @@ -XXX,XX +XXX,XX @@ static bool nvme_init_pci(NvmeCtrl *n, PCIDevice *pci_dev, Error **errp) } /* add one to max_ioqpairs to account for the admin queue pair */ - bar_size = nvme_bar_size(n->params.max_ioqpairs + 1, n->params.msix_qsize, - &msix_table_offset, &msix_pba_offset); + bar_size = nvme_mbar_size(n->params.max_ioqpairs + 1, n->params.msix_qsize, + &msix_table_offset, &msix_pba_offset); memory_region_init(&n->bar0, OBJECT(n), "nvme-bar0", bar_size); memory_region_init_io(&n->iomem, OBJECT(n), &nvme_mmio_ops, n, "nvme", -- 2.39.2
From: Klaus Jensen <k.jensen@samsung.com> Commit 1901b4967c3f ("hw/block/nvme: move msix table and pba to BAR 0") moved the MSI-X table and PBA to BAR 0 to make room for enabling CMR and PMR at the same time. As reported by Julien Grall in #2184, this breaks migration through system hibernation. Add a machine compatibility parameter and set it on machines pre 6.0 to enable the old behavior automatically, restoring the hibernation migration support. Cc: qemu-stable@nongnu.org Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2184 Fixes: 1901b4967c3f ("hw/block/nvme: move msix table and pba to BAR 0") Reported-by: Julien Grall julien@xen.org Tested-by: Julien Grall julien@xen.org Reviewed-by: Jesper Wendel Devantier <foss@defmacro.it> Signed-off-by: Klaus Jensen <k.jensen@samsung.com> (cherry picked from commit fa905f65c5549703279f68c253914799b10ada47) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> diff --git a/hw/core/machine.c b/hw/core/machine.c index XXXXXXX..XXXXXXX 100644 --- a/hw/core/machine.c +++ b/hw/core/machine.c @@ -XXX,XX +XXX,XX @@ GlobalProperty hw_compat_5_2[] = { { "PIIX4_PM", "smm-compat", "on"}, { "virtio-blk-device", "report-discard-granularity", "off" }, { "virtio-net-pci-base", "vectors", "3"}, + { "nvme", "msix-exclusive-bar", "on"}, }; const size_t hw_compat_5_2_len = G_N_ELEMENTS(hw_compat_5_2); diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c index XXXXXXX..XXXXXXX 100644 --- a/hw/nvme/ctrl.c +++ b/hw/nvme/ctrl.c @@ -XXX,XX +XXX,XX @@ static bool nvme_check_params(NvmeCtrl *n, Error **errp) } if (n->pmr.dev) { + if (params->msix_exclusive_bar) { + error_setg(errp, "not enough BARs available to enable PMR"); + return false; + } + if (host_memory_backend_is_mapped(n->pmr.dev)) { error_setg(errp, "can't use already busy memdev: %s", object_get_canonical_path_component(OBJECT(n->pmr.dev))); @@ -XXX,XX +XXX,XX @@ static bool nvme_init_pci(NvmeCtrl *n, PCIDevice *pci_dev, Error **errp) pcie_ari_init(pci_dev, 0x100, 1); } - /* add one to max_ioqpairs to account for the admin queue pair */ - bar_size = nvme_mbar_size(n->params.max_ioqpairs + 1, n->params.msix_qsize, - &msix_table_offset, &msix_pba_offset); + if (n->params.msix_exclusive_bar && !pci_is_vf(pci_dev)) { + bar_size = nvme_mbar_size(n->params.max_ioqpairs + 1, 0, NULL, NULL); + memory_region_init_io(&n->iomem, OBJECT(n), &nvme_mmio_ops, n, "nvme", + bar_size); + pci_register_bar(pci_dev, 0, PCI_BASE_ADDRESS_SPACE_MEMORY | + PCI_BASE_ADDRESS_MEM_TYPE_64, &n->iomem); + ret = msix_init_exclusive_bar(pci_dev, n->params.msix_qsize, 4, errp); + } else { + assert(n->params.msix_qsize >= 1); - memory_region_init(&n->bar0, OBJECT(n), "nvme-bar0", bar_size); - memory_region_init_io(&n->iomem, OBJECT(n), &nvme_mmio_ops, n, "nvme", - msix_table_offset); - memory_region_add_subregion(&n->bar0, 0, &n->iomem); + /* add one to max_ioqpairs to account for the admin queue pair */ + bar_size = nvme_mbar_size(n->params.max_ioqpairs + 1, + n->params.msix_qsize, &msix_table_offset, + &msix_pba_offset); - if (pci_is_vf(pci_dev)) { - pcie_sriov_vf_register_bar(pci_dev, 0, &n->bar0); - } else { - pci_register_bar(pci_dev, 0, PCI_BASE_ADDRESS_SPACE_MEMORY | - PCI_BASE_ADDRESS_MEM_TYPE_64, &n->bar0); + memory_region_init(&n->bar0, OBJECT(n), "nvme-bar0", bar_size); + memory_region_init_io(&n->iomem, OBJECT(n), &nvme_mmio_ops, n, "nvme", + msix_table_offset); + memory_region_add_subregion(&n->bar0, 0, &n->iomem); + + if (pci_is_vf(pci_dev)) { + pcie_sriov_vf_register_bar(pci_dev, 0, &n->bar0); + } else { + pci_register_bar(pci_dev, 0, PCI_BASE_ADDRESS_SPACE_MEMORY | + PCI_BASE_ADDRESS_MEM_TYPE_64, &n->bar0); + } + + ret = msix_init(pci_dev, n->params.msix_qsize, + &n->bar0, 0, msix_table_offset, + &n->bar0, 0, msix_pba_offset, 0, errp); } - ret = msix_init(pci_dev, n->params.msix_qsize, - &n->bar0, 0, msix_table_offset, - &n->bar0, 0, msix_pba_offset, 0, errp); + if (ret == -ENOTSUP) { /* report that msix is not supported, but do not error out */ warn_report_err(*errp); @@ -XXX,XX +XXX,XX @@ static Property nvme_props[] = { params.sriov_max_vi_per_vf, 0), DEFINE_PROP_UINT8("sriov_max_vq_per_vf", NvmeCtrl, params.sriov_max_vq_per_vf, 0), + DEFINE_PROP_BOOL("msix-exclusive-bar", NvmeCtrl, params.msix_exclusive_bar, + false), DEFINE_PROP_END_OF_LIST(), }; diff --git a/hw/nvme/nvme.h b/hw/nvme/nvme.h index XXXXXXX..XXXXXXX 100644 --- a/hw/nvme/nvme.h +++ b/hw/nvme/nvme.h @@ -XXX,XX +XXX,XX @@ typedef struct NvmeParams { uint16_t sriov_vi_flexible; uint8_t sriov_max_vq_per_vf; uint8_t sriov_max_vi_per_vf; + bool msix_exclusive_bar; } NvmeParams; typedef struct NvmeCtrl { -- 2.39.2
From: Akihiko Odaki <akihiko.odaki@daynix.com> igb can use this function to change its behavior depending on the number of virtual functions currently enabled. Signed-off-by: Gal Hammer <gal.hammer@sap.com> Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Jason Wang <jasowang@redhat.com> (cherry picked from commit 31180dbdca2859ae9841939f85158908453ea01d) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> (Mjt: needed for v8.2.0-2290-g91bb64a8d2 "hw/nvme: Use pcie_sriov_num_vfs()" (CVE-2024-26328)) diff --git a/hw/pci/pcie_sriov.c b/hw/pci/pcie_sriov.c index XXXXXXX..XXXXXXX 100644 --- a/hw/pci/pcie_sriov.c +++ b/hw/pci/pcie_sriov.c @@ -XXX,XX +XXX,XX @@ PCIDevice *pcie_sriov_get_vf_at_index(PCIDevice *dev, int n) } return NULL; } + +uint16_t pcie_sriov_num_vfs(PCIDevice *dev) +{ + return dev->exp.sriov_pf.num_vfs; +} diff --git a/include/hw/pci/pcie_sriov.h b/include/hw/pci/pcie_sriov.h index XXXXXXX..XXXXXXX 100644 --- a/include/hw/pci/pcie_sriov.h +++ b/include/hw/pci/pcie_sriov.h @@ -XXX,XX +XXX,XX @@ PCIDevice *pcie_sriov_get_pf(PCIDevice *dev); */ PCIDevice *pcie_sriov_get_vf_at_index(PCIDevice *dev, int n); +/* Returns the current number of virtual functions. */ +uint16_t pcie_sriov_num_vfs(PCIDevice *dev); + #endif /* QEMU_PCIE_SRIOV_H */ -- 2.39.2
From: Akihiko Odaki <akihiko.odaki@daynix.com> nvme_sriov_pre_write_ctrl() used to directly inspect SR-IOV configurations to know the number of VFs being disabled due to SR-IOV configuration writes, but the logic was flawed and resulted in out-of-bound memory access. It assumed PCI_SRIOV_NUM_VF always has the number of currently enabled VFs, but it actually doesn't in the following cases: - PCI_SRIOV_NUM_VF has been set but PCI_SRIOV_CTRL_VFE has never been. - PCI_SRIOV_NUM_VF was written after PCI_SRIOV_CTRL_VFE was set. - VFs were only partially enabled because of realization failure. It is a responsibility of pcie_sriov to interpret SR-IOV configurations and pcie_sriov does it correctly, so use pcie_sriov_num_vfs(), which it provides, to get the number of enabled VFs before and after SR-IOV configuration writes. Cc: qemu-stable@nongnu.org Fixes: CVE-2024-26328 Fixes: 11871f53ef8e ("hw/nvme: Add support for the Virtualization Management command") Suggested-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Message-Id: <20240228-reuse-v8-1-282660281e60@daynix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> (cherry picked from commit 91bb64a8d2014fda33a81fcf0fce37340f0d3b0c) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c index XXXXXXX..XXXXXXX 100644 --- a/hw/nvme/ctrl.c +++ b/hw/nvme/ctrl.c @@ -XXX,XX +XXX,XX @@ static void nvme_pci_reset(DeviceState *qdev) nvme_ctrl_reset(n, NVME_RESET_FUNCTION); } -static void nvme_sriov_pre_write_ctrl(PCIDevice *dev, uint32_t address, - uint32_t val, int len) +static void nvme_sriov_post_write_config(PCIDevice *dev, uint16_t old_num_vfs) { NvmeCtrl *n = NVME(dev); NvmeSecCtrlEntry *sctrl; - uint16_t sriov_cap = dev->exp.sriov_cap; - uint32_t off = address - sriov_cap; - int i, num_vfs; + int i; - if (!sriov_cap) { - return; - } - - if (range_covers_byte(off, len, PCI_SRIOV_CTRL)) { - if (!(val & PCI_SRIOV_CTRL_VFE)) { - num_vfs = pci_get_word(dev->config + sriov_cap + PCI_SRIOV_NUM_VF); - for (i = 0; i < num_vfs; i++) { - sctrl = &n->sec_ctrl_list.sec[i]; - nvme_virt_set_state(n, le16_to_cpu(sctrl->scid), false); - } - } + for (i = pcie_sriov_num_vfs(dev); i < old_num_vfs; i++) { + sctrl = &n->sec_ctrl_list.sec[i]; + nvme_virt_set_state(n, le16_to_cpu(sctrl->scid), false); } } static void nvme_pci_write_config(PCIDevice *dev, uint32_t address, uint32_t val, int len) { - nvme_sriov_pre_write_ctrl(dev, address, val, len); + uint16_t old_num_vfs = pcie_sriov_num_vfs(dev); + pci_default_write_config(dev, address, val, len); pcie_cap_flr_write_config(dev, address, val, len); + nvme_sriov_post_write_config(dev, old_num_vfs); } static const VMStateDescription nvme_vmstate = { -- 2.39.2
From: Akihiko Odaki <akihiko.odaki@daynix.com> The guest may write NumVFs greater than TotalVFs and that can lead to buffer overflow in VF implementations. Cc: qemu-stable@nongnu.org Fixes: CVE-2024-26327 Fixes: 7c0fa8dff811 ("pcie: Add support for Single Root I/O Virtualization (SR/IOV)") Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Message-Id: <20240228-reuse-v8-2-282660281e60@daynix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@ericsson.com> (cherry picked from commit 6081b4243cd64dff1b2cf5b0c215c71e9d7e753b) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> diff --git a/hw/pci/pcie_sriov.c b/hw/pci/pcie_sriov.c index XXXXXXX..XXXXXXX 100644 --- a/hw/pci/pcie_sriov.c +++ b/hw/pci/pcie_sriov.c @@ -XXX,XX +XXX,XX @@ static void register_vfs(PCIDevice *dev) assert(sriov_cap > 0); num_vfs = pci_get_word(dev->config + sriov_cap + PCI_SRIOV_NUM_VF); + if (num_vfs > pci_get_word(dev->config + sriov_cap + PCI_SRIOV_TOTAL_VF)) { + return; + } dev->exp.sriov_pf.vf = g_new(PCIDevice *, num_vfs); -- 2.39.2
From: Jonathan Cameron <Jonathan.Cameron@huawei.com> With a numa set up such as -numa nodeid=0,cpus=0 \ -numa nodeid=1,memdev=mem \ -numa nodeid=2,cpus=1 and appropriate hmat_lb entries the initiator list is correctly computed and writen to HMAT as 0,2 but then the LB data is accessed using the node id (here 2), landing outside the entry_list array. Stash the reverse lookup when writing the initiator list and use it to get the correct array index index. Fixes: 4586a2cb83 ("hmat acpi: Build System Locality Latency and Bandwidth Information Structure(s)") Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Message-Id: <20240307160326.31570-3-Jonathan.Cameron@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> (cherry picked from commit 74e2845c5f95b0c139c79233ddb65bb17f2dd679) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> diff --git a/hw/acpi/hmat.c b/hw/acpi/hmat.c index XXXXXXX..XXXXXXX 100644 --- a/hw/acpi/hmat.c +++ b/hw/acpi/hmat.c @@ -XXX,XX +XXX,XX @@ static void build_hmat_lb(GArray *table_data, HMAT_LB_Info *hmat_lb, uint32_t *initiator_list) { int i, index; + uint32_t initiator_to_index[MAX_NODES] = {}; HMAT_LB_Data *lb_data; uint16_t *entry_list; uint32_t base; @@ -XXX,XX +XXX,XX @@ static void build_hmat_lb(GArray *table_data, HMAT_LB_Info *hmat_lb, /* Initiator Proximity Domain List */ for (i = 0; i < num_initiator; i++) { build_append_int_noprefix(table_data, initiator_list[i], 4); + /* Reverse mapping for array possitions */ + initiator_to_index[initiator_list[i]] = i; } /* Target Proximity Domain List */ @@ -XXX,XX +XXX,XX @@ static void build_hmat_lb(GArray *table_data, HMAT_LB_Info *hmat_lb, entry_list = g_new0(uint16_t, num_initiator * num_target); for (i = 0; i < hmat_lb->list->len; i++) { lb_data = &g_array_index(hmat_lb->list, HMAT_LB_Data, i); - index = lb_data->initiator * num_target + lb_data->target; + index = initiator_to_index[lb_data->initiator] * num_target + + lb_data->target; entry_list[index] = (uint16_t)(lb_data->data / hmat_lb->base); } -- 2.39.2
From: Cédric Le Goater <clg@redhat.com> The block .save_setup() handler calls a helper routine init_blk_migration() which builds a list of block devices to take into account for migration. When one device is found to be empty (sectors == 0), the loop exits and all the remaining devices are ignored. This is a regression introduced when bdrv_iterate() was removed. Change that by skipping only empty devices. Cc: Markus Armbruster <armbru@redhat.com> Cc: qemu-stable <qemu-stable@nongnu.org> Suggested-by: Kevin Wolf <kwolf@redhat.com> Fixes: fea68bb6e9fa ("block: Eliminate bdrv_iterate(), use bdrv_next()") Signed-off-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Link: https://lore.kernel.org/r/20240312120431.550054-1-clg@redhat.com [peterx: fix "Suggested-by:"] Signed-off-by: Peter Xu <peterx@redhat.com> (cherry picked from commit 2e128776dc56f502c2ee41750afe83938f389528) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> diff --git a/migration/block.c b/migration/block.c index XXXXXXX..XXXXXXX 100644 --- a/migration/block.c +++ b/migration/block.c @@ -XXX,XX +XXX,XX @@ static int init_blk_migration(QEMUFile *f) } sectors = bdrv_nb_sectors(bs); - if (sectors <= 0) { + if (sectors == 0) { + continue; + } + if (sectors < 0) { ret = sectors; bdrv_next_cleanup(&it); goto out; -- 2.39.2
From: Thomas Huth <thuth@redhat.com> When running the tests in slow mode on a very loaded system and with --enable-debug, the test-aio-multithread can take longer than 1 minute. Bump the timeout to two minutes to make sure that it also passes in such situations. Signed-off-by: Thomas Huth <thuth@redhat.com> Message-Id: <20231215070357.10888-14-thuth@redhat.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> (cherry picked from commit c45f8f1aef35730a2dcf3cabe296ac12965db43d) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> diff --git a/tests/unit/meson.build b/tests/unit/meson.build index XXXXXXX..XXXXXXX 100644 --- a/tests/unit/meson.build +++ b/tests/unit/meson.build @@ -XXX,XX +XXX,XX @@ test_env.set('G_TEST_SRCDIR', meson.current_source_dir()) test_env.set('G_TEST_BUILDDIR', meson.current_build_dir()) slow_tests = { + 'test-aio-multithread' : 120, 'test-crypto-tlscredsx509': 45, 'test-crypto-tlssession': 45 } -- 2.39.2
From: Thomas Huth <thuth@redhat.com> When running the tests in slow mode on a very loaded system and with --enable-debug, the test-crypto-block can take longer than 4 minutes. Bump the timeout to 5 minutes to make sure that it also passes in such situations. Signed-off-by: Thomas Huth <thuth@redhat.com> Message-Id: <20231215070357.10888-15-thuth@redhat.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> (cherry picked from commit e1b363e328d559cd5f86d3d1d7b84d0154e153d3) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> diff --git a/tests/unit/meson.build b/tests/unit/meson.build index XXXXXXX..XXXXXXX 100644 --- a/tests/unit/meson.build +++ b/tests/unit/meson.build @@ -XXX,XX +XXX,XX @@ test_env.set('G_TEST_BUILDDIR', meson.current_build_dir()) slow_tests = { 'test-aio-multithread' : 120, + 'test-crypto-block' : 300, 'test-crypto-tlscredsx509': 45, 'test-crypto-tlssession': 45 } -- 2.39.2
From: Kevin Wolf <kwolf@redhat.com> We're seeing timeouts for this test on CI runs (specifically for ubuntu-20.04-s390x-all). It doesn't fail consistently, but even the successful runs take about 27 or 28 seconds, which is not very far from the 30 seconds timeout. Bump the timeout a bit to make failure less likely even on this CI host. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20240125165803.48373-1-kwolf@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> (cherry picked from commit 63b18312d14ac984acaf13c7c55d9baa2d61496e) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> diff --git a/tests/unit/meson.build b/tests/unit/meson.build index XXXXXXX..XXXXXXX 100644 --- a/tests/unit/meson.build +++ b/tests/unit/meson.build @@ -XXX,XX +XXX,XX @@ slow_tests = { 'test-aio-multithread' : 120, 'test-crypto-block' : 300, 'test-crypto-tlscredsx509': 45, - 'test-crypto-tlssession': 45 + 'test-crypto-tlssession': 45, + 'test-replication': 60, } foreach test_name, extra: tests -- 2.39.2
From: Peter Maydell <peter.maydell@linaro.org> On our gcov CI job, the bufferiszero and crypto-tlscredsx509 tests time out occasionally, making the job flaky. Double the timeout on these two tests. Cc: qemu-stable@nongnu.org Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2221 Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-id: 20240312110815.116992-1-peter.maydell@linaro.org (cherry picked from commit 55f7c6a5f2bd82e1d2d0eac6eee0185ce0451815) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> diff --git a/tests/unit/meson.build b/tests/unit/meson.build index XXXXXXX..XXXXXXX 100644 --- a/tests/unit/meson.build +++ b/tests/unit/meson.build @@ -XXX,XX +XXX,XX @@ test_env.set('G_TEST_BUILDDIR', meson.current_build_dir()) slow_tests = { 'test-aio-multithread' : 120, + 'test-bufferiszero': 60, 'test-crypto-block' : 300, - 'test-crypto-tlscredsx509': 45, - 'test-crypto-tlssession': 45, + 'test-crypto-tlscredsx509': 90, + 'test-crypto-tlssession': 90, 'test-replication': 60, } -- 2.39.2
From: Paolo Bonzini <pbonzini@redhat.com> Remove knowledge of specific MMU indexes (other than MMU_NESTED_IDX and MMU_PHYS_IDX) from mmu_translate(). This will make it possible to split 32-bit and 64-bit MMU indexes. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit 5f97afe2543f09160a8d123ab6e2e8c6d98fa9ce) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> (Mjt: context fixup in target/i386/cpu.h due to other changes in that area) diff --git a/target/i386/cpu.h b/target/i386/cpu.h index XXXXXXX..XXXXXXX 100644 --- a/target/i386/cpu.h +++ b/target/i386/cpu.h @@ -XXX,XX +XXX,XX @@ static inline int cpu_mmu_index(CPUX86State *env, bool ifetch) ? MMU_KNOSMAP_IDX : MMU_KSMAP_IDX; } +static inline bool is_mmu_index_smap(int mmu_index) +{ + return mmu_index == MMU_KSMAP_IDX; +} + +static inline bool is_mmu_index_user(int mmu_index) +{ + return mmu_index == MMU_USER_IDX; +} + static inline bool is_mmu_index_32(int mmu_index) { assert(mmu_index < MMU_PHYS_IDX); diff --git a/target/i386/tcg/sysemu/excp_helper.c b/target/i386/tcg/sysemu/excp_helper.c index XXXXXXX..XXXXXXX 100644 --- a/target/i386/tcg/sysemu/excp_helper.c +++ b/target/i386/tcg/sysemu/excp_helper.c @@ -XXX,XX +XXX,XX @@ static bool mmu_translate(CPUX86State *env, const TranslateParams *in, { const target_ulong addr = in->addr; const int pg_mode = in->pg_mode; - const bool is_user = (in->mmu_idx == MMU_USER_IDX); + const bool is_user = is_mmu_index_user(in->mmu_idx); const MMUAccessType access_type = in->access_type; uint64_t ptep, pte, rsvd_mask; PTETranslate pte_trans = { @@ -XXX,XX +XXX,XX @@ do_check_protect_pse36: } int prot = 0; - if (in->mmu_idx != MMU_KSMAP_IDX || !(ptep & PG_USER_MASK)) { + if (!is_mmu_index_smap(in->mmu_idx) || !(ptep & PG_USER_MASK)) { prot |= PAGE_READ; if ((ptep & PG_RW_MASK) || !(is_user || (pg_mode & PG_MODE_WP))) { prot |= PAGE_WRITE; -- 2.39.2
From: Paolo Bonzini <pbonzini@redhat.com> Accesses from a 32-bit environment (32-bit code segment for instruction accesses, EFER.LMA==0 for processor accesses) have to mask away the upper 32 bits of the address. While a bit wasteful, the easiest way to do so is to use separate MMU indexes. These days, QEMU anyway is compiled with a fixed value for NB_MMU_MODES. Split MMU_USER_IDX, MMU_KSMAP_IDX and MMU_KNOSMAP_IDX in two. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit 90f641531c782c873a05895f411c05fbbbef3c49) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> (Mjt: move changes for x86_cpu_mmu_index() to cpu_mmu_index() due to missing v8.2.0-1030-gace0c5fe5950 "target/i386: Populate CPUClass.mmu_index" Increase NB_MMU_MODES from 5 to 8 in target/i386/cpu-param.h due to missing v7.2.0-2640-gffd824f3f32d "include/exec: Set default NB_MMU_MODES to 16" v7.2.0-2647-g6787318a5d86 "target/i386: Remove NB_MMU_MODES define" which relaxed upper limit of MMU index for i386, since this commit starts using MMU_NESTED_IDX=7. Thanks Zhao Liu and Paolo Bonzini for the analisys and suggestions. ) diff --git a/target/i386/cpu-param.h b/target/i386/cpu-param.h index XXXXXXX..XXXXXXX 100644 --- a/target/i386/cpu-param.h +++ b/target/i386/cpu-param.h @@ -XXX,XX +XXX,XX @@ # define TARGET_VIRT_ADDR_SPACE_BITS 32 #endif #define TARGET_PAGE_BITS 12 -#define NB_MMU_MODES 5 +#define NB_MMU_MODES 8 #ifndef CONFIG_USER_ONLY # define TARGET_TB_PCREL 1 diff --git a/target/i386/cpu.h b/target/i386/cpu.h index XXXXXXX..XXXXXXX 100644 --- a/target/i386/cpu.h +++ b/target/i386/cpu.h @@ -XXX,XX +XXX,XX @@ uint64_t cpu_get_tsc(CPUX86State *env); #define cpu_list x86_cpu_list /* MMU modes definitions */ -#define MMU_KSMAP_IDX 0 -#define MMU_USER_IDX 1 -#define MMU_KNOSMAP_IDX 2 -#define MMU_NESTED_IDX 3 -#define MMU_PHYS_IDX 4 +#define MMU_KSMAP64_IDX 0 +#define MMU_KSMAP32_IDX 1 +#define MMU_USER64_IDX 2 +#define MMU_USER32_IDX 3 +#define MMU_KNOSMAP64_IDX 4 +#define MMU_KNOSMAP32_IDX 5 +#define MMU_PHYS_IDX 6 +#define MMU_NESTED_IDX 7 + +#ifdef CONFIG_USER_ONLY +#ifdef TARGET_X86_64 +#define MMU_USER_IDX MMU_USER64_IDX +#else +#define MMU_USER_IDX MMU_USER32_IDX +#endif +#endif static inline int cpu_mmu_index(CPUX86State *env, bool ifetch) { - return (env->hflags & HF_CPL_MASK) == 3 ? MMU_USER_IDX : - (!(env->hflags & HF_SMAP_MASK) || (env->eflags & AC_MASK)) - ? MMU_KNOSMAP_IDX : MMU_KSMAP_IDX; + int mmu_index_32 = (env->hflags & HF_CS64_MASK) ? 1 : 0; + int mmu_index_base = + (env->hflags & HF_CPL_MASK) == 3 ? MMU_USER64_IDX : + !(env->hflags & HF_SMAP_MASK) ? MMU_KNOSMAP64_IDX : + (env->eflags & AC_MASK) ? MMU_KNOSMAP64_IDX : MMU_KSMAP64_IDX; + + return mmu_index_base + mmu_index_32; } static inline bool is_mmu_index_smap(int mmu_index) { - return mmu_index == MMU_KSMAP_IDX; + return (mmu_index & ~1) == MMU_KSMAP64_IDX; } static inline bool is_mmu_index_user(int mmu_index) { - return mmu_index == MMU_USER_IDX; + return (mmu_index & ~1) == MMU_USER64_IDX; } static inline bool is_mmu_index_32(int mmu_index) @@ -XXX,XX +XXX,XX @@ static inline bool is_mmu_index_32(int mmu_index) static inline int cpu_mmu_index_kernel(CPUX86State *env) { - return !(env->hflags & HF_SMAP_MASK) ? MMU_KNOSMAP_IDX : - ((env->hflags & HF_CPL_MASK) < 3 && (env->eflags & AC_MASK)) - ? MMU_KNOSMAP_IDX : MMU_KSMAP_IDX; + int mmu_index_32 = (env->hflags & HF_LMA_MASK) ? 1 : 0; + int mmu_index_base = + !(env->hflags & HF_SMAP_MASK) ? MMU_KNOSMAP64_IDX : + ((env->hflags & HF_CPL_MASK) < 3 && (env->eflags & AC_MASK)) ? MMU_KNOSMAP64_IDX : MMU_KSMAP64_IDX; + + return mmu_index_base + mmu_index_32; } #define CC_DST (env->cc_dst) diff --git a/target/i386/tcg/sysemu/excp_helper.c b/target/i386/tcg/sysemu/excp_helper.c index XXXXXXX..XXXXXXX 100644 --- a/target/i386/tcg/sysemu/excp_helper.c +++ b/target/i386/tcg/sysemu/excp_helper.c @@ -XXX,XX +XXX,XX @@ static bool get_physical_address(CPUX86State *env, vaddr addr, if (likely(use_stage2)) { in.cr3 = env->nested_cr3; in.pg_mode = env->nested_pg_mode; - in.mmu_idx = MMU_USER_IDX; + in.mmu_idx = + env->nested_pg_mode & PG_MODE_LMA ? MMU_USER64_IDX : MMU_USER32_IDX; in.ptw_idx = MMU_PHYS_IDX; if (!mmu_translate(env, &in, out, err)) { -- 2.39.2
From: Paolo Bonzini <pbonzini@redhat.com> The low bit of MMU indices for x86 TCG indicates whether the processor is in 32-bit mode and therefore linear addresses have to be masked to 32 bits. However, the index was computed incorrectly, leading to possible conflicts in the TLB for any address above 4G. Analyzed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Fixes: b1661801c18 ("target/i386: Fix physical address truncation", 2024-02-28) Fixes: 1c15f97b4f1 ("target/i386: Fix physical address truncation" in stable-7.2) Cc: qemu-stable@nongnu.org Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2206 Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit 2cc68629a6fc198f4a972698bdd6477f883aedfb) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> (Mjt: move changes for x86_cpu_mmu_index() to cpu_mmu_index() due to missing v8.2.0-1030-gace0c5fe59 "target/i386: Populate CPUClass.mmu_index") diff --git a/target/i386/cpu.h b/target/i386/cpu.h index XXXXXXX..XXXXXXX 100644 --- a/target/i386/cpu.h +++ b/target/i386/cpu.h @@ -XXX,XX +XXX,XX @@ uint64_t cpu_get_tsc(CPUX86State *env); static inline int cpu_mmu_index(CPUX86State *env, bool ifetch) { - int mmu_index_32 = (env->hflags & HF_CS64_MASK) ? 1 : 0; + int mmu_index_32 = (env->hflags & HF_CS64_MASK) ? 0 : 1; int mmu_index_base = (env->hflags & HF_CPL_MASK) == 3 ? MMU_USER64_IDX : !(env->hflags & HF_SMAP_MASK) ? MMU_KNOSMAP64_IDX : @@ -XXX,XX +XXX,XX @@ static inline bool is_mmu_index_32(int mmu_index) static inline int cpu_mmu_index_kernel(CPUX86State *env) { - int mmu_index_32 = (env->hflags & HF_LMA_MASK) ? 1 : 0; + int mmu_index_32 = (env->hflags & HF_LMA_MASK) ? 0 : 1; int mmu_index_base = !(env->hflags & HF_SMAP_MASK) ? MMU_KNOSMAP64_IDX : ((env->hflags & HF_CPL_MASK) < 3 && (env->eflags & AC_MASK)) ? MMU_KNOSMAP64_IDX : MMU_KSMAP64_IDX; -- 2.39.2
From: Tao Su <tao1.su@linux.intel.com> monitor_puts() doesn't check the monitor pointer, but do_inject_x86_mce() may have a parameter with NULL monitor pointer. Revert monitor_puts() in do_inject_x86_mce() to fix, then the fact that we send the same message to monitor and log is again more obvious. Fixes: bf0c50d4aa85 (monitor: expose monitor_puts to rest of code) Reviwed-by: Xiaoyao Li <xiaoyao.li@intel.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Tao Su <tao1.su@linux.intel.com> Message-ID: <20240320083640.523287-1-tao1.su@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit 7fd226b04746f0be0b636de5097f1b42338951a0) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> diff --git a/target/i386/helper.c b/target/i386/helper.c index XXXXXXX..XXXXXXX 100644 --- a/target/i386/helper.c +++ b/target/i386/helper.c @@ -XXX,XX +XXX,XX @@ static void do_inject_x86_mce(CPUState *cs, run_on_cpu_data data) if (need_reset) { emit_guest_memory_failure(MEMORY_FAILURE_ACTION_RESET, ar, recursive); - monitor_puts(params->mon, msg); + monitor_printf(params->mon, "%s", msg); qemu_log_mask(CPU_LOG_RESET, "%s\n", msg); qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET); return; -- 2.39.2
From: Song Gao <gaosong@loongson.cn> qemu-system-loongarch64 assert failed with the option '-d int', the helper_idle() raise an exception EXCP_HLT, but the exception name is undefined. Signed-off-by: Song Gao <gaosong@loongson.cn> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20240321123606.1704900-1-gaosong@loongson.cn> (cherry picked from commit 1590154ee4376819a8c6ee61e849ebf4a4e7cd02) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> (Mjt: fixup for lack of 2 commits adding new entries into excp_names[]: v8.0.0-514-ga3f3db5cda "target/loongarch: Add CHECK_SXE maccro for check LSX enable" and v8.1.0-801-gb8f1bdf3d1 "target/loongarch: check_vec support check LASX instructions") diff --git a/target/loongarch/cpu.c b/target/loongarch/cpu.c index XXXXXXX..XXXXXXX 100644 --- a/target/loongarch/cpu.c +++ b/target/loongarch/cpu.c @@ -XXX,XX +XXX,XX @@ const char * const fregnames[32] = { "f24", "f25", "f26", "f27", "f28", "f29", "f30", "f31", }; -static const char * const excp_names[] = { - [EXCCODE_INT] = "Interrupt", - [EXCCODE_PIL] = "Page invalid exception for load", - [EXCCODE_PIS] = "Page invalid exception for store", - [EXCCODE_PIF] = "Page invalid exception for fetch", - [EXCCODE_PME] = "Page modified exception", - [EXCCODE_PNR] = "Page Not Readable exception", - [EXCCODE_PNX] = "Page Not Executable exception", - [EXCCODE_PPI] = "Page Privilege error", - [EXCCODE_ADEF] = "Address error for instruction fetch", - [EXCCODE_ADEM] = "Address error for Memory access", - [EXCCODE_SYS] = "Syscall", - [EXCCODE_BRK] = "Break", - [EXCCODE_INE] = "Instruction Non-Existent", - [EXCCODE_IPE] = "Instruction privilege error", - [EXCCODE_FPD] = "Floating Point Disabled", - [EXCCODE_FPE] = "Floating Point Exception", - [EXCCODE_DBP] = "Debug breakpoint", - [EXCCODE_BCE] = "Bound Check Exception", +struct TypeExcp { + int32_t exccode; + const char * const name; +}; + +static const struct TypeExcp excp_names[] = { + {EXCCODE_INT, "Interrupt"}, + {EXCCODE_PIL, "Page invalid exception for load"}, + {EXCCODE_PIS, "Page invalid exception for store"}, + {EXCCODE_PIF, "Page invalid exception for fetch"}, + {EXCCODE_PME, "Page modified exception"}, + {EXCCODE_PNR, "Page Not Readable exception"}, + {EXCCODE_PNX, "Page Not Executable exception"}, + {EXCCODE_PPI, "Page Privilege error"}, + {EXCCODE_ADEF, "Address error for instruction fetch"}, + {EXCCODE_ADEM, "Address error for Memory access"}, + {EXCCODE_SYS, "Syscall"}, + {EXCCODE_BRK, "Break"}, + {EXCCODE_INE, "Instruction Non-Existent"}, + {EXCCODE_IPE, "Instruction privilege error"}, + {EXCCODE_FPD, "Floating Point Disabled"}, + {EXCCODE_FPE, "Floating Point Exception"}, + {EXCCODE_DBP, "Debug breakpoint"}, + {EXCCODE_BCE, "Bound Check Exception"}, + {EXCCODE_SXD, "128 bit vector instructions Disable exception"}, + {EXCCODE_ASXD, "256 bit vector instructions Disable exception"}, + {EXCP_HLT, "EXCP_HLT"}, }; const char *loongarch_exception_name(int32_t exception) { - assert(excp_names[exception]); - return excp_names[exception]; + int i; + + for (i = 0; i < ARRAY_SIZE(excp_names); i++) { + if (excp_names[i].exccode == exception) { + return excp_names[i].name; + } + } + return "Unknown"; } void G_NORETURN do_raise_exception(CPULoongArchState *env, @@ -XXX,XX +XXX,XX @@ void G_NORETURN do_raise_exception(CPULoongArchState *env, { CPUState *cs = env_cpu(env); - qemu_log_mask(CPU_LOG_INT, "%s: %d (%s)\n", + qemu_log_mask(CPU_LOG_INT, "%s: expection: %d (%s)\n", __func__, exception, loongarch_exception_name(exception)); @@ -XXX,XX +XXX,XX @@ static void loongarch_cpu_do_interrupt(CPUState *cs) CPULoongArchState *env = &cpu->env; bool update_badinstr = 1; int cause = -1; - const char *name; bool tlbfill = FIELD_EX64(env->CSR_TLBRERA, CSR_TLBRERA, ISTLBR); uint32_t vec_size = FIELD_EX64(env->CSR_ECFG, CSR_ECFG, VS); if (cs->exception_index != EXCCODE_INT) { - if (cs->exception_index < 0 || - cs->exception_index >= ARRAY_SIZE(excp_names)) { - name = "unknown"; - } else { - name = excp_names[cs->exception_index]; - } - qemu_log_mask(CPU_LOG_INT, "%s enter: pc " TARGET_FMT_lx " ERA " TARGET_FMT_lx - " TLBRERA " TARGET_FMT_lx " %s exception\n", __func__, - env->pc, env->CSR_ERA, env->CSR_TLBRERA, name); + " TLBRERA " TARGET_FMT_lx " exception: %d (%s)\n", + __func__, env->pc, env->CSR_ERA, env->CSR_TLBRERA, + cs->exception_index, + loongarch_exception_name(cs->exception_index)); } switch (cs->exception_index) { -- 2.39.2
From: Lorenz Brun <lorenz@brun.one> The io_timeout property, introduced in c9b6609 (part of 6.0) is silently overwritten by the hardcoded default value of 30 seconds (DEFAULT_IO_TIMEOUT) in scsi_generic_realize because that function is being called after the properties have already been applied. The property definition already has a default value which is applied correctly when no value is explicitly set, so we can just remove the code which overrides the io_timeout completely. This has been tested by stracing SG_IO operations with the io_timeout property set and unset and now sets the timeout field in the ioctl request to the proper value. Fixes: c9b6609b69facad ("scsi: make io_timeout configurable") Signed-off-by: Lorenz Brun <lorenz@brun.one> Message-ID: <20240315145831.2531695-1-lorenz@brun.one> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> (cherry picked from commit 7c7a9f578e4fb1adff7ac8d9acaaaedb87474e76) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> diff --git a/hw/scsi/scsi-generic.c b/hw/scsi/scsi-generic.c index XXXXXXX..XXXXXXX 100644 --- a/hw/scsi/scsi-generic.c +++ b/hw/scsi/scsi-generic.c @@ -XXX,XX +XXX,XX @@ static void scsi_generic_realize(SCSIDevice *s, Error **errp) /* Only used by scsi-block, but initialize it nevertheless to be clean. */ s->default_scsi_version = -1; - s->io_timeout = DEFAULT_IO_TIMEOUT; scsi_generic_read_device_inquiry(s); } -- 2.39.2
From: Yao Xingtao <yaoxt.fnst@fujitsu.com> In qemu monitor mode, when we use gpa2hva command to print the host virtual address corresponding to a guest physical address, if the gpa is not in RAM, the error message is below: (qemu) gpa2hva 0x750000000 Memory at address 0x750000000is not RAM A space is missed between '0x750000000' and 'is'. Signed-off-by: Yao Xingtao <yaoxt.fnst@fujitsu.com> Fixes: e9628441df ("hmp: gpa2hva and gpa2hpa hostaddr command") Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Dr. David Alan Gilbert <dave@treblig.org> Message-ID: <20240319021610.2423844-1-ruansy.fnst@fujitsu.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> (cherry picked from commit a158c63b3ba120f1656e4dd815d186c623fb5ef6) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> (Mjt: in 7.2. it is in monitor/misc.c, not in monitor/hmp-cmds-target.c) diff --git a/monitor/misc.c b/monitor/misc.c index XXXXXXX..XXXXXXX 100644 --- a/monitor/misc.c +++ b/monitor/misc.c @@ -XXX,XX +XXX,XX @@ void *gpa2hva(MemoryRegion **p_mr, hwaddr addr, uint64_t size, Error **errp) } if (!memory_region_is_ram(mrs.mr) && !memory_region_is_romd(mrs.mr)) { - error_setg(errp, "Memory at address 0x%" HWADDR_PRIx "is not RAM", addr); + error_setg(errp, "Memory at address 0x%" HWADDR_PRIx " is not RAM", addr); memory_region_unref(mrs.mr); return NULL; } -- 2.39.2
From: Akihiko Odaki <akihiko.odaki@daynix.com> virtio_net_guest_notifier_pending() and virtio_net_guest_notifier_mask() checked VIRTIO_NET_F_MQ to know there are multiple queues, but VIRTIO_NET_F_RSS also enables multiple queues. Refer to n->multiqueue, which is set to true either of VIRTIO_NET_F_MQ or VIRTIO_NET_F_RSS is enabled. Fixes: 68b0a6395f36 ("virtio-net: align ctrl_vq index for non-mq guest for vhost_vdpa") Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com> (cherry picked from commit 1c188fc8cbffc5f05cc616cab4e1372fb6e6f11f) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c index XXXXXXX..XXXXXXX 100644 --- a/hw/net/virtio-net.c +++ b/hw/net/virtio-net.c @@ -XXX,XX +XXX,XX @@ static bool virtio_net_guest_notifier_pending(VirtIODevice *vdev, int idx) VirtIONet *n = VIRTIO_NET(vdev); NetClientState *nc; assert(n->vhost_started); - if (!virtio_vdev_has_feature(vdev, VIRTIO_NET_F_MQ) && idx == 2) { + if (!n->multiqueue && idx == 2) { /* Must guard against invalid features and bogus queue index * from being set by malicious guest, or penetrated through * buggy migration stream. @@ -XXX,XX +XXX,XX @@ static void virtio_net_guest_notifier_mask(VirtIODevice *vdev, int idx, VirtIONet *n = VIRTIO_NET(vdev); NetClientState *nc; assert(n->vhost_started); - if (!virtio_vdev_has_feature(vdev, VIRTIO_NET_F_MQ) && idx == 2) { + if (!n->multiqueue && idx == 2) { /* Must guard against invalid features and bogus queue index * from being set by malicious guest, or penetrated through * buggy migration stream. -- 2.39.2
From: Richard Henderson <richard.henderson@linaro.org> The 'sign' computation is attempting to locate the sign bit that has been repeated, so that we can test if that bit is known zero. That computation can be zero if there are no known sign repetitions. Cc: qemu-stable@nongnu.org Fixes: 93a967fbb57 ("tcg/optimize: Propagate sign info for shifting") Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2248 Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> (cherry picked from commit 2911e9b95f3bb03783ae5ca3e2494dc3b44a9161) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> (Mjt: trivial context fixup in tests/tcg/aarch64/Makefile.target) diff --git a/tcg/optimize.c b/tcg/optimize.c index XXXXXXX..XXXXXXX 100644 --- a/tcg/optimize.c +++ b/tcg/optimize.c @@ -XXX,XX +XXX,XX @@ static bool fold_shift(OptContext *ctx, TCGOp *op) * will not reduced the number of input sign repetitions. */ sign = (s_mask & -s_mask) >> 1; - if (!(z_mask & sign)) { + if (sign && !(z_mask & sign)) { ctx->s_mask = s_mask; } break; diff --git a/tests/tcg/aarch64/Makefile.target b/tests/tcg/aarch64/Makefile.target index XXXXXXX..XXXXXXX 100644 --- a/tests/tcg/aarch64/Makefile.target +++ b/tests/tcg/aarch64/Makefile.target @@ -XXX,XX +XXX,XX @@ VPATH += $(AARCH64_SRC) # Base architecture tests AARCH64_TESTS=fcvt pcalign-a64 +AARCH64_TESTS += test-2248 fcvt: LDFLAGS+=-lm diff --git a/tests/tcg/aarch64/test-2248.c b/tests/tcg/aarch64/test-2248.c new file mode 100644 index XXXXXXX..XXXXXXX --- /dev/null +++ b/tests/tcg/aarch64/test-2248.c @@ -XXX,XX +XXX,XX @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +/* See https://gitlab.com/qemu-project/qemu/-/issues/2248 */ + +#include <assert.h> + +__attribute__((noinline)) +long test(long x, long y, long sh) +{ + long r; + asm("cmp %1, %2\n\t" + "cset x12, lt\n\t" + "and w11, w12, #0xff\n\t" + "cmp w11, #0\n\t" + "csetm x14, ne\n\t" + "lsr x13, x14, %3\n\t" + "sxtb %0, w13" + : "=r"(r) + : "r"(x), "r"(y), "r"(sh) + : "x11", "x12", "x13", "x14"); + return r; +} + +int main() +{ + long r = test(0, 1, 2); + assert(r == -1); + return 0; +} -- 2.39.2
From: Richard Henderson <richard.henderson@linaro.org> Along this path we have already skipped the insn to be nullified, so the subsequent insn should be executed. Cc: qemu-stable@nongnu.org Reported-by: Sven Schnelle <svens@stackframe.org> Tested-by: Sven Schnelle <svens@stackframe.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> (cherry picked from commit 4a3aa11e1fb25c28c24a43fd2835c429b00a463d) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> diff --git a/target/hppa/translate.c b/target/hppa/translate.c index XXXXXXX..XXXXXXX 100644 --- a/target/hppa/translate.c +++ b/target/hppa/translate.c @@ -XXX,XX +XXX,XX @@ static bool trans_be(DisasContext *ctx, arg_be *a) tcg_gen_addi_reg(cpu_iaoq_b, cpu_iaoq_f, 4); tcg_gen_mov_i64(cpu_iasq_f, new_spc); tcg_gen_mov_i64(cpu_iasq_b, cpu_iasq_f); + nullify_set(ctx, 0); } else { copy_iaoq_entry(cpu_iaoq_f, ctx->iaoq_b, cpu_iaoq_b); if (ctx->iaoq_b == -1) { -- 2.39.2
Commit ab72522797 "gitlab: switch from 'stable' to 'latest' docker container tags" switched most tags to 'latest' but missed cirrus image. Fix this now. Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2256 Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> Message-id: 20240401051633.2780456-1-mjt@tls.msk.ru Signed-off-by: Peter Maydell <peter.maydell@linaro.org> (cherry picked from commit 1d2f2b35bc86b7a13dc3009a3c5031220aa0b7de) diff --git a/.gitlab-ci.d/cirrus.yml b/.gitlab-ci.d/cirrus.yml index XXXXXXX..XXXXXXX 100644 --- a/.gitlab-ci.d/cirrus.yml +++ b/.gitlab-ci.d/cirrus.yml @@ -XXX,XX +XXX,XX @@ .cirrus_build_job: extends: .base_job_template stage: build - image: registry.gitlab.com/libvirt/libvirt-ci/cirrus-run:master + image: registry.gitlab.com/libvirt/libvirt-ci/cirrus-run:latest needs: [] timeout: 80m allow_failure: true -- 2.39.2
From: Peter Maydell <peter.maydell@linaro.org> If the group of the highest priority pending interrupt is disabled via ICC_IGRPEN*, the ICC_HPPIR* registers should return INTID_SPURIOUS, not the interrupt ID. (See the GIC architecture specification pseudocode functions ICC_HPPIR1_EL1[] and HighestPriorityPendingInterrupt().) Make HPPIR reads honour the group disable, the way we already do when determining whether to preempt in icc_hppi_can_preempt(). Cc: qemu-stable@nongnu.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20240328153333.2522667-1-peter.maydell@linaro.org (cherry picked from commit 44e25fbc1900c99c91a44e532c5bd680bc403459) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> diff --git a/hw/intc/arm_gicv3_cpuif.c b/hw/intc/arm_gicv3_cpuif.c index XXXXXXX..XXXXXXX 100644 --- a/hw/intc/arm_gicv3_cpuif.c +++ b/hw/intc/arm_gicv3_cpuif.c @@ -XXX,XX +XXX,XX @@ static uint64_t icc_hppir0_value(GICv3CPUState *cs, CPUARMState *env) */ bool irq_is_secure; - if (cs->hppi.prio == 0xff) { + if (icc_no_enabled_hppi(cs)) { return INTID_SPURIOUS; } @@ -XXX,XX +XXX,XX @@ static uint64_t icc_hppir1_value(GICv3CPUState *cs, CPUARMState *env) */ bool irq_is_secure; - if (cs->hppi.prio == 0xff) { + if (icc_no_enabled_hppi(cs)) { return INTID_SPURIOUS; } -- 2.39.2
From: Yajun Wu <yajunw@nvidia.com> When vhost-user or vhost-kernel is handling virtio net datapath, QEMU should not touch used ring. But with vhost-user socket reconnect scenario, in a very rare case (has pending kick event). VRING_USED_F_NO_NOTIFY is set by QEMU in following code path: #0 virtio_queue_split_set_notification (vq=0x7ff5f4c920a8, enable=0) at ../hw/virtio/virtio.c:511 #1 0x0000559d6dbf033b in virtio_queue_set_notification (vq=0x7ff5f4c920a8, enable=0) at ../hw/virtio/virtio.c:576 #2 0x0000559d6dbbbdbc in virtio_net_handle_tx_bh (vdev=0x559d703a6aa0, vq=0x7ff5f4c920a8) at ../hw/net/virtio-net.c:2801 #3 0x0000559d6dbf4791 in virtio_queue_notify_vq (vq=0x7ff5f4c920a8) at ../hw/virtio/virtio.c:2248 #4 0x0000559d6dbf79da in virtio_queue_host_notifier_read (n=0x7ff5f4c9211c) at ../hw/virtio/virtio.c:3525 #5 0x0000559d6d9a5814 in virtio_bus_cleanup_host_notifier (bus=0x559d703a6a20, n=1) at ../hw/virtio/virtio-bus.c:321 #6 0x0000559d6dbf83c9 in virtio_device_stop_ioeventfd_impl (vdev=0x559d703a6aa0) at ../hw/virtio/virtio.c:3774 #7 0x0000559d6d9a55c8 in virtio_bus_stop_ioeventfd (bus=0x559d703a6a20) at ../hw/virtio/virtio-bus.c:259 #8 0x0000559d6d9a53e8 in virtio_bus_grab_ioeventfd (bus=0x559d703a6a20) at ../hw/virtio/virtio-bus.c:199 #9 0x0000559d6dbf841c in virtio_device_grab_ioeventfd (vdev=0x559d703a6aa0) at ../hw/virtio/virtio.c:3783 #10 0x0000559d6d9bde18 in vhost_dev_enable_notifiers (hdev=0x559d707edd70, vdev=0x559d703a6aa0) at ../hw/virtio/vhost.c:1592 #11 0x0000559d6d89a0b8 in vhost_net_start_one (net=0x559d707edd70, dev=0x559d703a6aa0) at ../hw/net/vhost_net.c:266 #12 0x0000559d6d89a6df in vhost_net_start (dev=0x559d703a6aa0, ncs=0x559d7048d890, data_queue_pairs=31, cvq=0) at ../hw/net/vhost_net.c:412 #13 0x0000559d6dbb5b89 in virtio_net_vhost_status (n=0x559d703a6aa0, status=15 '\017') at ../hw/net/virtio-net.c:311 #14 0x0000559d6dbb5e34 in virtio_net_set_status (vdev=0x559d703a6aa0, status=15 '\017') at ../hw/net/virtio-net.c:392 #15 0x0000559d6dbb60d8 in virtio_net_set_link_status (nc=0x559d7048d890) at ../hw/net/virtio-net.c:455 #16 0x0000559d6da64863 in qmp_set_link (name=0x559d6f0b83d0 "hostnet1", up=true, errp=0x7ffdd76569f0) at ../net/net.c:1459 #17 0x0000559d6da7226e in net_vhost_user_event (opaque=0x559d6f0b83d0, event=CHR_EVENT_OPENED) at ../net/vhost-user.c:301 #18 0x0000559d6ddc7f63 in chr_be_event (s=0x559d6f2ffea0, event=CHR_EVENT_OPENED) at ../chardev/char.c:62 #19 0x0000559d6ddc7fdc in qemu_chr_be_event (s=0x559d6f2ffea0, event=CHR_EVENT_OPENED) at ../chardev/char.c:82 This issue causes guest kernel stop kicking device and traffic stop. Add vhost_started check in virtio_net_handle_tx_bh to fix this wrong VRING_USED_F_NO_NOTIFY set. Signed-off-by: Yajun Wu <yajunw@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Message-ID: <20240402045109.97729-1-yajunw@nvidia.com> [PMD: Use unlikely()] Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> (cherry picked from commit 4c54f5bc8e1d38f15cc35b6a6932d8fbe219c692) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c index XXXXXXX..XXXXXXX 100644 --- a/hw/net/virtio-net.c +++ b/hw/net/virtio-net.c @@ -XXX,XX +XXX,XX @@ static void virtio_net_handle_tx_bh(VirtIODevice *vdev, VirtQueue *vq) VirtIONet *n = VIRTIO_NET(vdev); VirtIONetQueue *q = &n->vqs[vq2q(virtio_get_queue_index(vq))]; + if (unlikely(n->vhost_started)) { + return; + } + if (unlikely((n->status & VIRTIO_NET_S_LINK_UP) == 0)) { virtio_net_drop_tx_queue_data(vdev, vq); return; -- 2.39.2
From: Wafer <wafer@jaguarmicro.com> In the event of writing many chains of descriptors, the device must write just the id of the last buffer in the descriptor chain, skip forward the number of descriptors in the chain, and then repeat the operations for the rest of chains. Current QEMU code writes all the buffer ids consecutively, and then skips all the buffers altogether. This is a bug, and can be reproduced with a VirtIONet device with _F_MRG_RXBUB and without _F_INDIRECT_DESC: If a virtio-net device has the VIRTIO_NET_F_MRG_RXBUF feature but not the VIRTIO_RING_F_INDIRECT_DESC feature, 'VirtIONetQueue->rx_vq' will use the merge feature to store data in multiple 'elems'. The 'num_buffers' in the virtio header indicates how many elements are merged. If the value of 'num_buffers' is greater than 1, all the merged elements will be filled into the descriptor ring. The 'idx' of the elements should be the value of 'vq->used_idx' plus 'ndescs'. Fixes: 86044b24e8 ("virtio: basic packed virtqueue support") Acked-by: Eugenio Pérez <eperezma@redhat.com> Signed-off-by: Wafer <wafer@jaguarmicro.com> Message-Id: <20240407015451.5228-2-wafer@jaguarmicro.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> (cherry picked from commit 2d9a31b3c27311eca1682cb2c076d7a300441960) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c index XXXXXXX..XXXXXXX 100644 --- a/hw/virtio/virtio.c +++ b/hw/virtio/virtio.c @@ -XXX,XX +XXX,XX @@ static void virtqueue_packed_flush(VirtQueue *vq, unsigned int count) return; } + /* + * For indirect element's 'ndescs' is 1. + * For all other elemment's 'ndescs' is the + * number of descriptors chained by NEXT (as set in virtqueue_packed_pop). + * So When the 'elem' be filled into the descriptor ring, + * The 'idx' of this 'elem' shall be + * the value of 'vq->used_idx' plus the 'ndescs'. + */ + ndescs += vq->used_elems[0].ndescs; for (i = 1; i < count; i++) { - virtqueue_packed_fill_desc(vq, &vq->used_elems[i], i, false); + virtqueue_packed_fill_desc(vq, &vq->used_elems[i], ndescs, false); ndescs += vq->used_elems[i].ndescs; } virtqueue_packed_fill_desc(vq, &vq->used_elems[0], 0, true); - ndescs += vq->used_elems[0].ndescs; vq->inuse -= ndescs; vq->used_idx += ndescs; -- 2.39.2
The following patches are queued for QEMU stable v7.2.11: https://gitlab.com/qemu-project/qemu/-/commits/staging-7.2 Patch freeze is 2024-04-20, and the release is planned for 2024-04-22: https://wiki.qemu.org/Planning/7.2 Please respond here or CC qemu-stable@nongnu.org on any additional patches you think should (or shouldn't) be included in the release. The changes which are staging for inclusion, with the original commit hash from master branch, are given below the bottom line. Thanks! /mjt -------------------------------------- 01* 9ea920dc2825 Daniel P. Berrangé: gitlab: update FreeBSD Cirrus CI image to 13.3 02* f5af80271aad David Parsons: ui/cocoa: Fix window clipping on macOS 14 03* bc6bd20ee353 Zhuojia Shen: target/arm: align exposed ID registers with Linux 04* 3dc2afeab296 Peter Maydell: tests/tcg/aarch64/sysregs.c: Use S syntax for id_aa64zfr0_el1 and id_aa64smfr0_el1 05* 1f51573f7925 Richard Henderson: target/arm: Fix SME full tile indexing 06* fd7f95f23d6f Peter Maydell: hw/rtc/sun4v-rtc: Relicense to GPLv2-or-later 07* 012b170173bc Dmitrii Gavrilov: system/qdev-monitor: move drain_call_rcu call under if (!dev) in qmp_device_add() 08* a9198b3132d8 Sven Schnelle: hw/scsi/lsi53c895a: stop script on phase mismatch 09* 8b09b7fe4708 Sven Schnelle: hw/scsi/lsi53c895a: add missing decrement of reentrancy counter 10* 9876359990dd Sven Schnelle: hw/scsi/lsi53c895a: add timer to scripts processing 11* 9bc9e9511944 Michael Tokarev: make-release: switch to .xz format by default 12* 4cadf1023498 Laurent Vivier: e1000e: fix link state on resume 13* 6a5287ce8047 Nick Briggs: Avoid unaligned fetch in ladr_match() 14* 784fd35387e9 Klaus Jensen: hw/nvme: clean up confusing use of errp/local_err 15* 973f76cf7743 Klaus Jensen: hw/nvme: cleanup error reporting in nvme_init_pci() 16* 4f0a4a3d5854 Minwoo Im: hw/nvme: separate 'serial' property for VFs 17* ee7bda4d38cd Klaus Jensen: hw/nvme: generalize the mbar size helper 18* fa905f65c554 Klaus Jensen: hw/nvme: add machine compatibility parameter to enable msix exclusive bar 19* 31180dbdca28 Akihiko Odaki: pcie: Introduce pcie_sriov_num_vfs 20* 91bb64a8d201 Akihiko Odaki: hw/nvme: Use pcie_sriov_num_vfs() 21* 6081b4243cd6 Akihiko Odaki: pcie_sriov: Validate NumVFs 22* 74e2845c5f95 Jonathan Cameron: hmat acpi: Fix out of bounds access due to missing use of indirection 23* 2e128776dc56 Cédric Le Goater: migration: Skip only empty block devices 24* c45f8f1aef35 Thomas Huth: tests/unit: Bump test-aio-multithread test timeout to 2 minutes 25* e1b363e328d5 Thomas Huth: tests/unit: Bump test-crypto-block test timeout to 5 minutes 26* 63b18312d14a Kevin Wolf: tests/unit: Bump test-replication timeout to 60 seconds 27* 55f7c6a5f2bd Peter Maydell: tests: Raise timeouts for bufferiszero and crypto-tlscredsx509 28* 5f97afe2543f Paolo Bonzini: target/i386: introduce function to query MMU indices 29* 90f641531c78 Paolo Bonzini: target/i386: use separate MMU indexes for 32-bit accesses 30* 2cc68629a6fc Paolo Bonzini: target/i386: fix direction of "32-bit MMU" test 31* 7fd226b04746 Tao Su: target/i386: Revert monitor_puts() in do_inject_x86_mce() 32* 1590154ee437 Song Gao: target/loongarch: Fix qemu-system-loongarch64 assert failed with the option '-d int' 33* 7c7a9f578e4f Lorenz Brun: hw/scsi/scsi-generic: Fix io_timeout property not applying 34* a158c63b3ba1 Yao Xingtao: monitor/hmp-cmds-target: Append a space in error message in gpa2hva() 35* 1c188fc8cbff Akihiko Odaki: virtio-net: Fix vhost virtqueue notifiers for RSS 36* 2911e9b95f3b Richard Henderson: tcg/optimize: Fix sign_mask for logical right-shift 37* 4a3aa11e1fb2 Richard Henderson: target/hppa: Clear psw_n for BE on use_nullify_skip path 38* 1d2f2b35bc86 Michael Tokarev: gitlab-ci/cirrus: switch from 'master' to 'latest' 39* 44e25fbc1900 Peter Maydell: hw/intc/arm_gicv3: ICC_HPPIR* return SPURIOUS if int group is disabled 40* 4c54f5bc8e1d Yajun Wu: hw/net/virtio-net: fix qemu set used ring flag even vhost started 41* 2d9a31b3c273 Wafer: hw/virtio: Fix packed virtqueue flush used_idx 42 e25fe886b89a Richard Henderson: tcg/optimize: Do not attempt to constant fold neg_vec 43 f0907ff4cae7 Richard Henderson: linux-user: Fix waitid return of siginfo_t and rusage 44 ec0504b989ca Philippe Mathieu-Daudé: hw/virtio: Introduce virtio_bh_new_guarded() helper 45 ba28e0ff4d95 Philippe Mathieu-Daudé: hw/display/virtio-gpu: Protect from DMA re-entrancy bugs 46 b4295bff25f7 Philippe Mathieu-Daudé: hw/char/virtio-serial-bus: Protect from DMA re-entrancy bugs 47 f4729ec39ad9 Philippe Mathieu-Daudé: hw/virtio/virtio-crypto: Protect from DMA re-entrancy bugs 48 aa88f99c87c0 Yuquan Wang: qemu-options: Fix CXL Fixed Memory Window interleave-granularity typo 49 7a86544f286d Philippe Mathieu-Daudé: hw/block/nand: Factor nand_load_iolen() method out 50 2e3e09b36800 Philippe Mathieu-Daudé: hw/block/nand: Have blk_load() take unsigned offset and return boolean 51 d39fdfff348f Philippe Mathieu-Daudé: hw/block/nand: Fix out-of-bound access in NAND block buffer 52 fc09ff2979de Philippe Mathieu-Daudé: hw/misc/applesmc: Fix memory leak in reset() handler 53 eaf2bd29538d Philippe Mathieu-Daudé: backends/cryptodev: Do not abort for invalid session ID 54 ad766d603f39 Philippe Mathieu-Daudé: hw/net/lan9118: Fix overflow in MIL TX FIFO 55 a45223467e4e Philippe Mathieu-Daudé: hw/net/lan9118: Replace magic '2048' value by MIL_TXFIFO_SIZE definition 56 9e4b27ca6bf4 Philippe Mathieu-Daudé: hw/sd/sdhci: Do not update TRNMOD when Command Inhibit (DAT) is set 57 b754cb2dcde2 Zack Buhman: target/sh4: add missing CHECK_NOT_DELAY_SLOT 58 2df5c1f5b014 Harsh Prateek Bora: ppc/spapr: Introduce SPAPR_IRQ_NR_IPIS to refer IRQ range for CPU IPIs. 59 c4f91d7b7be7 Harsh Prateek Bora: ppc/spapr: Initialize max_cpus limit to SPAPR_IRQ_NR_IPIS. (commit(s) marked with * were in previous series and are not resent)
From: Richard Henderson <richard.henderson@linaro.org> Split out the tail of fold_neg to fold_neg_no_const so that we can avoid attempting to constant fold vector negate. Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2150 Signed-off-by: Richard Henderson <richard.henderson@linaro.org> (cherry picked from commit e25fe886b89a396bae5847520b70c148587d490a) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> (Mjt: context fixup in tests/tcg/aarch64/Makefile.target) diff --git a/tcg/optimize.c b/tcg/optimize.c index XXXXXXX..XXXXXXX 100644 --- a/tcg/optimize.c +++ b/tcg/optimize.c @@ -XXX,XX +XXX,XX @@ static bool fold_nand(OptContext *ctx, TCGOp *op) return false; } -static bool fold_neg(OptContext *ctx, TCGOp *op) +static bool fold_neg_no_const(OptContext *ctx, TCGOp *op) { - uint64_t z_mask; - - if (fold_const1(ctx, op)) { - return true; - } - /* Set to 1 all bits to the left of the rightmost. */ - z_mask = arg_info(op->args[1])->z_mask; + uint64_t z_mask = arg_info(op->args[1])->z_mask; ctx->z_mask = -(z_mask & -z_mask); /* @@ -XXX,XX +XXX,XX @@ static bool fold_neg(OptContext *ctx, TCGOp *op) return true; } +static bool fold_neg(OptContext *ctx, TCGOp *op) +{ + return fold_const1(ctx, op) || fold_neg_no_const(ctx, op); +} + static bool fold_nor(OptContext *ctx, TCGOp *op) { if (fold_const2_commutative(ctx, op) || @@ -XXX,XX +XXX,XX @@ static bool fold_sub_to_neg(OptContext *ctx, TCGOp *op) if (have_neg) { op->opc = neg_op; op->args[1] = op->args[2]; - return fold_neg(ctx, op); + return fold_neg_no_const(ctx, op); } return false; } diff --git a/tests/tcg/aarch64/Makefile.target b/tests/tcg/aarch64/Makefile.target index XXXXXXX..XXXXXXX 100644 --- a/tests/tcg/aarch64/Makefile.target +++ b/tests/tcg/aarch64/Makefile.target @@ -XXX,XX +XXX,XX @@ VPATH += $(AARCH64_SRC) # Base architecture tests AARCH64_TESTS=fcvt pcalign-a64 -AARCH64_TESTS += test-2248 +AARCH64_TESTS += test-2248 test-2150 fcvt: LDFLAGS+=-lm diff --git a/tests/tcg/aarch64/test-2150.c b/tests/tcg/aarch64/test-2150.c new file mode 100644 index XXXXXXX..XXXXXXX --- /dev/null +++ b/tests/tcg/aarch64/test-2150.c @@ -XXX,XX +XXX,XX @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +/* See https://gitlab.com/qemu-project/qemu/-/issues/2150 */ + +int main() +{ + asm volatile( + "movi v6.4s, #1\n" + "movi v7.4s, #0\n" + "sub v6.2d, v7.2d, v6.2d\n" + : : : "v6", "v7"); + return 0; +} -- 2.39.2
From: Richard Henderson <richard.henderson@linaro.org> The copy back to siginfo_t should be conditional only on arg3, not the specific values that might have been written. The copy back to rusage was missing entirely. Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2262 Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Tested-by: Alex Fan <alex.fan.q@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> (cherry picked from commit f0907ff4cae743f1a4ef3d0a55a047029eed06ff) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> diff --git a/linux-user/syscall.c b/linux-user/syscall.c index XXXXXXX..XXXXXXX 100644 --- a/linux-user/syscall.c +++ b/linux-user/syscall.c @@ -XXX,XX +XXX,XX @@ static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1, #ifdef TARGET_NR_waitid case TARGET_NR_waitid: { + struct rusage ru; siginfo_t info; - info.si_pid = 0; - ret = get_errno(safe_waitid(arg1, arg2, &info, arg4, NULL)); - if (!is_error(ret) && arg3 && info.si_pid != 0) { - if (!(p = lock_user(VERIFY_WRITE, arg3, sizeof(target_siginfo_t), 0))) + + ret = get_errno(safe_waitid(arg1, arg2, (arg3 ? &info : NULL), + arg4, (arg5 ? &ru : NULL))); + if (!is_error(ret)) { + if (arg3) { + p = lock_user(VERIFY_WRITE, arg3, + sizeof(target_siginfo_t), 0); + if (!p) { + return -TARGET_EFAULT; + } + host_to_target_siginfo(p, &info); + unlock_user(p, arg3, sizeof(target_siginfo_t)); + } + if (arg5 && host_to_target_rusage(arg5, &ru)) { return -TARGET_EFAULT; - host_to_target_siginfo(p, &info); - unlock_user(p, arg3, sizeof(target_siginfo_t)); + } } } return ret; -- 2.39.2
From: Philippe Mathieu-Daudé <philmd@linaro.org> Introduce virtio_bh_new_guarded(), similar to qemu_bh_new_guarded() but using the transport memory guard, instead of the device one (there can only be one virtio device per virtio bus). Inspired-by: Gerd Hoffmann <kraxel@redhat.com> Reviewed-by: Gerd Hoffmann <kraxel@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20240409105537.18308-2-philmd@linaro.org> (cherry picked from commit ec0504b989ca61e03636384d3602b7bf07ffe4da) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> (Mjt: trivial #include context fixup in include/hw/virtio/virtio.h) diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c index XXXXXXX..XXXXXXX 100644 --- a/hw/virtio/virtio.c +++ b/hw/virtio/virtio.c @@ -XXX,XX +XXX,XX @@ static void virtio_register_types(void) } type_init(virtio_register_types) + +QEMUBH *virtio_bh_new_guarded_full(DeviceState *dev, + QEMUBHFunc *cb, void *opaque, + const char *name) +{ + DeviceState *transport = qdev_get_parent_bus(dev)->parent; + + return qemu_bh_new_full(cb, opaque, name, + &transport->mem_reentrancy_guard); +} diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h index XXXXXXX..XXXXXXX 100644 --- a/include/hw/virtio/virtio.h +++ b/include/hw/virtio/virtio.h @@ -XXX,XX +XXX,XX @@ #include "standard-headers/linux/virtio_ring.h" #include "qom/object.h" #include "hw/virtio/vhost.h" +#include "block/aio.h" /* * A guest should never accept this. It implies negotiation is broken @@ -XXX,XX +XXX,XX @@ static inline bool virtio_device_disabled(VirtIODevice *vdev) bool virtio_legacy_allowed(VirtIODevice *vdev); bool virtio_legacy_check_disabled(VirtIODevice *vdev); +QEMUBH *virtio_bh_new_guarded_full(DeviceState *dev, + QEMUBHFunc *cb, void *opaque, + const char *name); +#define virtio_bh_new_guarded(dev, cb, opaque) \ + virtio_bh_new_guarded_full((dev), (cb), (opaque), (stringify(cb))) + #endif -- 2.39.2
From: Philippe Mathieu-Daudé <philmd@linaro.org> Replace qemu_bh_new_guarded() by virtio_bh_new_guarded() so the bus and device use the same guard. Otherwise the DMA-reentrancy protection can be bypassed: $ cat << EOF | qemu-system-i386 -display none -nodefaults \ -machine q35,accel=qtest \ -m 512M \ -device virtio-gpu \ -qtest stdio outl 0xcf8 0x80000820 outl 0xcfc 0xe0004000 outl 0xcf8 0x80000804 outw 0xcfc 0x06 write 0xe0004030 0x4 0x024000e0 write 0xe0004028 0x1 0xff write 0xe0004020 0x4 0x00009300 write 0xe000401c 0x1 0x01 write 0x101 0x1 0x04 write 0x103 0x1 0x1c write 0x9301c8 0x1 0x18 write 0x105 0x1 0x1c write 0x107 0x1 0x1c write 0x109 0x1 0x1c write 0x10b 0x1 0x00 write 0x10d 0x1 0x00 write 0x10f 0x1 0x00 write 0x111 0x1 0x00 write 0x113 0x1 0x00 write 0x115 0x1 0x00 write 0x117 0x1 0x00 write 0x119 0x1 0x00 write 0x11b 0x1 0x00 write 0x11d 0x1 0x00 write 0x11f 0x1 0x00 write 0x121 0x1 0x00 write 0x123 0x1 0x00 write 0x125 0x1 0x00 write 0x127 0x1 0x00 write 0x129 0x1 0x00 write 0x12b 0x1 0x00 write 0x12d 0x1 0x00 write 0x12f 0x1 0x00 write 0x131 0x1 0x00 write 0x133 0x1 0x00 write 0x135 0x1 0x00 write 0x137 0x1 0x00 write 0x139 0x1 0x00 write 0xe0007003 0x1 0x00 EOF ... ================================================================= ==276099==ERROR: AddressSanitizer: heap-use-after-free on address 0x60d000011178 at pc 0x562cc3b736c7 bp 0x7ffed49dee60 sp 0x7ffed49dee58 READ of size 8 at 0x60d000011178 thread T0 #0 0x562cc3b736c6 in virtio_gpu_ctrl_response hw/display/virtio-gpu.c:180:42 #1 0x562cc3b7c40b in virtio_gpu_ctrl_response_nodata hw/display/virtio-gpu.c:192:5 #2 0x562cc3b7c40b in virtio_gpu_simple_process_cmd hw/display/virtio-gpu.c:1015:13 #3 0x562cc3b82873 in virtio_gpu_process_cmdq hw/display/virtio-gpu.c:1050:9 #4 0x562cc4a85514 in aio_bh_call util/async.c:169:5 #5 0x562cc4a85c52 in aio_bh_poll util/async.c:216:13 #6 0x562cc4a1a79b in aio_dispatch util/aio-posix.c:423:5 #7 0x562cc4a8a2da in aio_ctx_dispatch util/async.c:358:5 #8 0x7f36840547a8 in g_main_context_dispatch (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x547a8) #9 0x562cc4a8b753 in glib_pollfds_poll util/main-loop.c:290:9 #10 0x562cc4a8b753 in os_host_main_loop_wait util/main-loop.c:313:5 #11 0x562cc4a8b753 in main_loop_wait util/main-loop.c:592:11 #12 0x562cc3938186 in qemu_main_loop system/runstate.c:782:9 #13 0x562cc43b7af5 in qemu_default_main system/main.c:37:14 #14 0x7f3683a6c189 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16 #15 0x7f3683a6c244 in __libc_start_main csu/../csu/libc-start.c:381:3 #16 0x562cc2a58ac0 in _start (qemu-system-i386+0x231bac0) 0x60d000011178 is located 56 bytes inside of 136-byte region [0x60d000011140,0x60d0000111c8) freed by thread T0 here: #0 0x562cc2adb662 in __interceptor_free (qemu-system-i386+0x239e662) #1 0x562cc3b86b21 in virtio_gpu_reset hw/display/virtio-gpu.c:1524:9 #2 0x562cc416e20e in virtio_reset hw/virtio/virtio.c:2145:9 #3 0x562cc37c5644 in virtio_pci_reset hw/virtio/virtio-pci.c:2249:5 #4 0x562cc4233758 in memory_region_write_accessor system/memory.c:497:5 #5 0x562cc4232eea in access_with_adjusted_size system/memory.c:573:18 previously allocated by thread T0 here: #0 0x562cc2adb90e in malloc (qemu-system-i386+0x239e90e) #1 0x7f368405a678 in g_malloc (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x5a678) #2 0x562cc4163ffc in virtqueue_split_pop hw/virtio/virtio.c:1612:12 #3 0x562cc4163ffc in virtqueue_pop hw/virtio/virtio.c:1783:16 #4 0x562cc3b91a95 in virtio_gpu_handle_ctrl hw/display/virtio-gpu.c:1112:15 #5 0x562cc4a85514 in aio_bh_call util/async.c:169:5 #6 0x562cc4a85c52 in aio_bh_poll util/async.c:216:13 #7 0x562cc4a1a79b in aio_dispatch util/aio-posix.c:423:5 SUMMARY: AddressSanitizer: heap-use-after-free hw/display/virtio-gpu.c:180:42 in virtio_gpu_ctrl_response With this change, the same reproducer triggers: qemu-system-i386: warning: Blocked re-entrant IO on MemoryRegion: virtio-pci-common-virtio-gpu at addr: 0x6 Fixes: CVE-2024-3446 Cc: qemu-stable@nongnu.org Reported-by: Alexander Bulekov <alxndr@bu.edu> Reported-by: Yongkang Jia <kangel@zju.edu.cn> Reported-by: Xiao Lei <nop.leixiao@gmail.com> Reported-by: Yiming Tao <taoym@zju.edu.cn> Buglink: https://bugs.launchpad.net/qemu/+bug/1888606 Reviewed-by: Gerd Hoffmann <kraxel@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20240409105537.18308-3-philmd@linaro.org> (cherry picked from commit ba28e0ff4d95b56dc334aac2730ab3651ffc3132) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> (Mjt: context fixup in hw/display/virtio-gpu.c:virtio_gpu_device_realize() due to missing v8.1.0-rc2-69-ga41e2d97f92b "virtio-gpu: reset gfx resources in main thread". Maybe it's worth to pick this too) diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c index XXXXXXX..XXXXXXX 100644 --- a/hw/display/virtio-gpu.c +++ b/hw/display/virtio-gpu.c @@ -XXX,XX +XXX,XX @@ void virtio_gpu_device_realize(DeviceState *qdev, Error **errp) g->ctrl_vq = virtio_get_queue(vdev, 0); g->cursor_vq = virtio_get_queue(vdev, 1); - g->ctrl_bh = qemu_bh_new_guarded(virtio_gpu_ctrl_bh, g, - &qdev->mem_reentrancy_guard); - g->cursor_bh = qemu_bh_new_guarded(virtio_gpu_cursor_bh, g, - &qdev->mem_reentrancy_guard); + g->ctrl_bh = virtio_bh_new_guarded(qdev, virtio_gpu_ctrl_bh, g); + g->cursor_bh = virtio_bh_new_guarded(qdev, virtio_gpu_cursor_bh, g); QTAILQ_INIT(&g->reslist); QTAILQ_INIT(&g->cmdq); QTAILQ_INIT(&g->fenceq); -- 2.39.2
From: Philippe Mathieu-Daudé <philmd@linaro.org> Replace qemu_bh_new_guarded() by virtio_bh_new_guarded() so the bus and device use the same guard. Otherwise the DMA-reentrancy protection can be bypassed. Fixes: CVE-2024-3446 Cc: qemu-stable@nongnu.org Suggested-by: Alexander Bulekov <alxndr@bu.edu> Reviewed-by: Gerd Hoffmann <kraxel@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20240409105537.18308-4-philmd@linaro.org> (cherry picked from commit b4295bff25f7b50de1d9cc94a9c6effd40056bca) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c index XXXXXXX..XXXXXXX 100644 --- a/hw/char/virtio-serial-bus.c +++ b/hw/char/virtio-serial-bus.c @@ -XXX,XX +XXX,XX @@ static void virtser_port_device_realize(DeviceState *dev, Error **errp) return; } - port->bh = qemu_bh_new_guarded(flush_queued_data_bh, port, - &dev->mem_reentrancy_guard); + port->bh = virtio_bh_new_guarded(dev, flush_queued_data_bh, port); port->elem = NULL; } -- 2.39.2
From: Philippe Mathieu-Daudé <philmd@linaro.org> Replace qemu_bh_new_guarded() by virtio_bh_new_guarded() so the bus and device use the same guard. Otherwise the DMA-reentrancy protection can be bypassed. Fixes: CVE-2024-3446 Cc: qemu-stable@nongnu.org Suggested-by: Alexander Bulekov <alxndr@bu.edu> Reviewed-by: Gerd Hoffmann <kraxel@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20240409105537.18308-5-philmd@linaro.org> (cherry picked from commit f4729ec39ad97a42ceaa7b5697f84f440ea6e5dc) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> diff --git a/hw/virtio/virtio-crypto.c b/hw/virtio/virtio-crypto.c index XXXXXXX..XXXXXXX 100644 --- a/hw/virtio/virtio-crypto.c +++ b/hw/virtio/virtio-crypto.c @@ -XXX,XX +XXX,XX @@ static void virtio_crypto_device_realize(DeviceState *dev, Error **errp) vcrypto->vqs[i].dataq = virtio_add_queue(vdev, 1024, virtio_crypto_handle_dataq_bh); vcrypto->vqs[i].dataq_bh = - qemu_bh_new_guarded(virtio_crypto_dataq_bh, &vcrypto->vqs[i], - &dev->mem_reentrancy_guard); + virtio_bh_new_guarded(dev, virtio_crypto_dataq_bh, + &vcrypto->vqs[i]); vcrypto->vqs[i].vcrypto = vcrypto; } -- 2.39.2
From: Yuquan Wang <wangyuquan1236@phytium.com.cn> Fix the unit typo of interleave-granularity of CXL Fixed Memory Window in qemu-option.hx. Fixes: 03b39fcf64 ("hw/cxl: Make the CFMW a machine parameter.") Signed-off-by: Yuquan Wang wangyuquan1236@phytium.com.cn Message-ID: <20240407083539.1488172-2-wangyuquan1236@phytium.com.cn> [PMD: Reworded] Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> (cherry picked from commit aa88f99c87c0e5d195d6d96190374650553ea61f) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> diff --git a/qemu-options.hx b/qemu-options.hx index XXXXXXX..XXXXXXX 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -XXX,XX +XXX,XX @@ SRST platform and configuration dependent. ``interleave-granularity=granularity`` sets the granularity of - interleave. Default 256KiB. Only 256KiB, 512KiB, 1024KiB, 2048KiB - 4096KiB, 8192KiB and 16384KiB granularities supported. + interleave. Default 256 (bytes). Only 256, 512, 1k, 2k, + 4k, 8k and 16k granularities supported. Example: :: - -machine cxl-fmw.0.targets.0=cxl.0,cxl-fmw.0.targets.1=cxl.1,cxl-fmw.0.size=128G,cxl-fmw.0.interleave-granularity=512k + -machine cxl-fmw.0.targets.0=cxl.0,cxl-fmw.0.targets.1=cxl.1,cxl-fmw.0.size=128G,cxl-fmw.0.interleave-granularity=512 ERST DEF("M", HAS_ARG, QEMU_OPTION_M, -- 2.39.2
From: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20240409135944.24997-2-philmd@linaro.org> (cherry picked from commit 7a86544f286d8af4fa5251101c1026ddae92cc3d) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> diff --git a/hw/block/nand.c b/hw/block/nand.c index XXXXXXX..XXXXXXX 100644 --- a/hw/block/nand.c +++ b/hw/block/nand.c @@ -XXX,XX +XXX,XX @@ static inline void nand_pushio_byte(NANDFlashState *s, uint8_t value) } } +/* + * nand_load_block: Load block containing (s->addr + @offset). + * Returns length of data available at @offset in this block. + */ +static unsigned nand_load_block(NANDFlashState *s, unsigned offset) +{ + unsigned iolen; + + s->blk_load(s, s->addr, offset); + + iolen = (1 << s->page_shift); + if (s->gnd) { + iolen += 1 << s->oob_shift; + } + assert(offset <= iolen); + iolen -= offset; + + return iolen; +} + static void nand_command(NANDFlashState *s) { - unsigned int offset; switch (s->cmd) { case NAND_CMD_READ0: s->iolen = 0; @@ -XXX,XX +XXX,XX @@ static void nand_command(NANDFlashState *s) case NAND_CMD_NOSERIALREAD2: if (!(nand_flash_ids[s->chip_id].options & NAND_SAMSUNG_LP)) break; - offset = s->addr & ((1 << s->addr_shift) - 1); - s->blk_load(s, s->addr, offset); - if (s->gnd) - s->iolen = (1 << s->page_shift) - offset; - else - s->iolen = (1 << s->page_shift) + (1 << s->oob_shift) - offset; + s->iolen = nand_load_block(s, s->addr & ((1 << s->addr_shift) - 1)); break; case NAND_CMD_RESET: @@ -XXX,XX +XXX,XX @@ uint32_t nand_getio(DeviceState *dev) if (!s->iolen && s->cmd == NAND_CMD_READ0) { offset = (int) (s->addr & ((1 << s->addr_shift) - 1)) + s->offset; s->offset = 0; - - s->blk_load(s, s->addr, offset); - if (s->gnd) - s->iolen = (1 << s->page_shift) - offset; - else - s->iolen = (1 << s->page_shift) + (1 << s->oob_shift) - offset; + s->iolen = nand_load_block(s, offset); } if (s->ce || s->iolen <= 0) { -- 2.39.2
From: Philippe Mathieu-Daudé <philmd@linaro.org> Negative offset is meaningless, use unsigned type. Return a boolean value indicating success. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20240409135944.24997-3-philmd@linaro.org> (cherry picked from commit 2e3e09b368001f7eaeeca7a9b49cb1f0c9092d85) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> diff --git a/hw/block/nand.c b/hw/block/nand.c index XXXXXXX..XXXXXXX 100644 --- a/hw/block/nand.c +++ b/hw/block/nand.c @@ -XXX,XX +XXX,XX @@ struct NANDFlashState { void (*blk_write)(NANDFlashState *s); void (*blk_erase)(NANDFlashState *s); - void (*blk_load)(NANDFlashState *s, uint64_t addr, int offset); + /* + * Returns %true when block containing (@addr + @offset) is + * successfully loaded, otherwise %false. + */ + bool (*blk_load)(NANDFlashState *s, uint64_t addr, unsigned offset); uint32_t ioaddr_vmstate; }; @@ -XXX,XX +XXX,XX @@ static void glue(nand_blk_erase_, NAND_PAGE_SIZE)(NANDFlashState *s) } } -static void glue(nand_blk_load_, NAND_PAGE_SIZE)(NANDFlashState *s, - uint64_t addr, int offset) +static bool glue(nand_blk_load_, NAND_PAGE_SIZE)(NANDFlashState *s, + uint64_t addr, unsigned offset) { if (PAGE(addr) >= s->pages) { - return; + return false; } if (s->blk) { @@ -XXX,XX +XXX,XX @@ static void glue(nand_blk_load_, NAND_PAGE_SIZE)(NANDFlashState *s, offset, NAND_PAGE_SIZE + OOB_SIZE - offset); s->ioaddr = s->io; } + + return true; } static void glue(nand_init_, NAND_PAGE_SIZE)(NANDFlashState *s) -- 2.39.2
From: Philippe Mathieu-Daudé <philmd@linaro.org> nand_command() and nand_getio() don't check @offset points into the block, nor the available data length (s->iolen) is not negative. In order to fix: - check the offset is in range in nand_blk_load_NAND_PAGE_SIZE(), - do not set @iolen if blk_load() failed. Reproducer: $ cat << EOF | qemu-system-arm -machine tosa \ -monitor none -serial none \ -display none -qtest stdio write 0x10000111 0x1 0xca write 0x10000104 0x1 0x47 write 0x1000ca04 0x1 0xd7 write 0x1000ca01 0x1 0xe0 write 0x1000ca04 0x1 0x71 write 0x1000ca00 0x1 0x50 write 0x1000ca04 0x1 0xd7 read 0x1000ca02 0x1 write 0x1000ca01 0x1 0x10 EOF ================================================================= ==15750==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x61f000000de0 at pc 0x560e61557210 bp 0x7ffcfc4a59f0 sp 0x7ffcfc4a59e8 READ of size 1 at 0x61f000000de0 thread T0 #0 0x560e6155720f in mem_and hw/block/nand.c:101:20 #1 0x560e6155ac9c in nand_blk_write_512 hw/block/nand.c:663:9 #2 0x560e61544200 in nand_command hw/block/nand.c:293:13 #3 0x560e6153cc83 in nand_setio hw/block/nand.c:520:13 #4 0x560e61a0a69e in tc6393xb_nand_writeb hw/display/tc6393xb.c:380:13 #5 0x560e619f9bf7 in tc6393xb_writeb hw/display/tc6393xb.c:524:9 #6 0x560e647c7d03 in memory_region_write_accessor softmmu/memory.c:492:5 #7 0x560e647c7641 in access_with_adjusted_size softmmu/memory.c:554:18 #8 0x560e647c5f66 in memory_region_dispatch_write softmmu/memory.c:1514:16 #9 0x560e6485409e in flatview_write_continue softmmu/physmem.c:2825:23 #10 0x560e648421eb in flatview_write softmmu/physmem.c:2867:12 #11 0x560e64841ca8 in address_space_write softmmu/physmem.c:2963:18 #12 0x560e61170162 in qemu_writeb tests/qtest/videzzo/videzzo_qemu.c:1080:5 #13 0x560e6116eef7 in dispatch_mmio_write tests/qtest/videzzo/videzzo_qemu.c:1227:28 0x61f000000de0 is located 0 bytes to the right of 3424-byte region [0x61f000000080,0x61f000000de0) allocated by thread T0 here: #0 0x560e611276cf in malloc /root/llvm-project/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3 #1 0x7f7959a87e98 in g_malloc (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x57e98) #2 0x560e64b98871 in object_new qom/object.c:749:12 #3 0x560e64b5d1a1 in qdev_new hw/core/qdev.c:153:19 #4 0x560e61547ea5 in nand_init hw/block/nand.c:639:11 #5 0x560e619f8772 in tc6393xb_init hw/display/tc6393xb.c:558:16 #6 0x560e6390bad2 in tosa_init hw/arm/tosa.c:250:12 SUMMARY: AddressSanitizer: heap-buffer-overflow hw/block/nand.c:101:20 in mem_and ==15750==ABORTING Broken since introduction in commit 3e3d5815cb ("NAND Flash memory emulation and ECC calculation helpers for use by NAND controllers"). Cc: qemu-stable@nongnu.org Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1445 Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1446 Reported-by: Qiang Liu <cyruscyliu@gmail.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20240409135944.24997-4-philmd@linaro.org> (cherry picked from commit d39fdfff348fdf00173b7a58e935328a64db7d28) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> diff --git a/hw/block/nand.c b/hw/block/nand.c index XXXXXXX..XXXXXXX 100644 --- a/hw/block/nand.c +++ b/hw/block/nand.c @@ -XXX,XX +XXX,XX @@ static unsigned nand_load_block(NANDFlashState *s, unsigned offset) { unsigned iolen; - s->blk_load(s, s->addr, offset); + if (!s->blk_load(s, s->addr, offset)) { + return 0; + } iolen = (1 << s->page_shift); if (s->gnd) { @@ -XXX,XX +XXX,XX @@ static bool glue(nand_blk_load_, NAND_PAGE_SIZE)(NANDFlashState *s, return false; } + if (offset > NAND_PAGE_SIZE + OOB_SIZE) { + return false; + } + if (s->blk) { if (s->mem_oob) { if (blk_pread(s->blk, SECTOR(addr) << BDRV_SECTOR_BITS, -- 2.39.2
From: Philippe Mathieu-Daudé <philmd@linaro.org> AppleSMCData is allocated with g_new0() in applesmc_add_key(): release it with g_free(). Leaked since commit 1ddda5cd36 ("AppleSMC device emulation"). Cc: qemu-stable@nongnu.org Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2272 Reported-by: Zheyu Ma <zheyuma97@gmail.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <20240408095217.57239-3-philmd@linaro.org> (cherry picked from commit fc09ff2979defdcf8d00c2db94022d5d610e36ba) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> diff --git a/hw/misc/applesmc.c b/hw/misc/applesmc.c index XXXXXXX..XXXXXXX 100644 --- a/hw/misc/applesmc.c +++ b/hw/misc/applesmc.c @@ -XXX,XX +XXX,XX @@ static void qdev_applesmc_isa_reset(DeviceState *dev) /* Remove existing entries */ QLIST_FOREACH_SAFE(d, &s->data_def, node, next) { QLIST_REMOVE(d, node); + g_free(d); } s->status = 0x00; s->status_1e = 0x00; -- 2.39.2
From: Philippe Mathieu-Daudé <philmd@linaro.org> Instead of aborting when a session ID is invalid, return VIRTIO_CRYPTO_INVSESS ("Invalid session id"). Reproduced using: $ cat << EOF | qemu-system-i386 -display none \ -machine q35,accel=qtest -m 512M -nodefaults \ -object cryptodev-backend-builtin,id=cryptodev0 \ -device virtio-crypto-pci,id=crypto0,cryptodev=cryptodev0 \ -qtest stdio outl 0xcf8 0x80000804 outw 0xcfc 0x06 outl 0xcf8 0x80000820 outl 0xcfc 0xe0008000 write 0x10800e 0x1 0x01 write 0xe0008016 0x1 0x01 write 0xe0008020 0x4 0x00801000 write 0xe0008028 0x4 0x00c01000 write 0xe000801c 0x1 0x01 write 0x110000 0x1 0x05 write 0x110001 0x1 0x04 write 0x108002 0x1 0x11 write 0x108008 0x1 0x48 write 0x10800c 0x1 0x01 write 0x108018 0x1 0x10 write 0x10801c 0x1 0x02 write 0x10c002 0x1 0x01 write 0xe000b005 0x1 0x00 EOF Assertion failed: (session_id < MAX_NUM_SESSIONS && builtin->sessions[session_id]), function cryptodev_builtin_close_session, file cryptodev-builtin.c, line 430. Cc: qemu-stable@nongnu.org Reported-by: Zheyu Ma <zheyuma97@gmail.com> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2274 Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: zhenwei pi <pizhenwei@bytedance.com> Message-Id: <20240409094757.9127-1-philmd@linaro.org> (cherry picked from commit eaf2bd29538d039df80bb4b1584de33a61312bc6) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> diff --git a/backends/cryptodev-builtin.c b/backends/cryptodev-builtin.c index XXXXXXX..XXXXXXX 100644 --- a/backends/cryptodev-builtin.c +++ b/backends/cryptodev-builtin.c @@ -XXX,XX +XXX,XX @@ static int cryptodev_builtin_close_session( CRYPTODEV_BACKEND_BUILTIN(backend); CryptoDevBackendBuiltinSession *session; - assert(session_id < MAX_NUM_SESSIONS && builtin->sessions[session_id]); + if (session_id >= MAX_NUM_SESSIONS || !builtin->sessions[session_id]) { + return -VIRTIO_CRYPTO_INVSESS; + } session = builtin->sessions[session_id]; if (session->cipher) { -- 2.39.2
From: Philippe Mathieu-Daudé <philmd@linaro.org> When the MAC Interface Layer (MIL) transmit FIFO is full, truncate the packet, and raise the Transmitter Error (TXE) flag. Broken since model introduction in commit 2a42499017 ("LAN9118 emulation"). When using the reproducer from https://gitlab.com/qemu-project/qemu/-/issues/2267 we get: hw/net/lan9118.c:798:17: runtime error: index 2048 out of bounds for type 'uint8_t[2048]' (aka 'unsigned char[2048]') #0 0x563ec9a057b1 in tx_fifo_push hw/net/lan9118.c:798:43 #1 0x563ec99fbb28 in lan9118_writel hw/net/lan9118.c:1042:9 #2 0x563ec99f2de2 in lan9118_16bit_mode_write hw/net/lan9118.c:1205:9 #3 0x563ecbf78013 in memory_region_write_accessor system/memory.c:497:5 #4 0x563ecbf776f5 in access_with_adjusted_size system/memory.c:573:18 #5 0x563ecbf75643 in memory_region_dispatch_write system/memory.c:1521:16 #6 0x563ecc01bade in flatview_write_continue_step system/physmem.c:2713:18 #7 0x563ecc01b374 in flatview_write_continue system/physmem.c:2743:19 #8 0x563ecbff1c9b in flatview_write system/physmem.c:2774:12 #9 0x563ecbff1768 in address_space_write system/physmem.c:2894:18 ... [*] LAN9118 DS00002266B.pdf, Table 5.3.3 "INTERRUPT STATUS REGISTER" Cc: qemu-stable@nongnu.org Reported-by: Will Lester Reported-by: Chuhong Yuan <hslester96@gmail.com> Suggested-by: Peter Maydell <peter.maydell@linaro.org> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2267 Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <20240409133801.23503-3-philmd@linaro.org> (cherry picked from commit ad766d603f39888309cfb1433ba2de1d0e9e4f58) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> diff --git a/hw/net/lan9118.c b/hw/net/lan9118.c index XXXXXXX..XXXXXXX 100644 --- a/hw/net/lan9118.c +++ b/hw/net/lan9118.c @@ -XXX,XX +XXX,XX @@ static void tx_fifo_push(lan9118_state *s, uint32_t val) /* Documentation is somewhat unclear on the ordering of bytes in FIFO words. Empirical results show it to be little-endian. */ - /* TODO: FIFO overflow checking. */ while (n--) { + if (s->txp->len == MIL_TXFIFO_SIZE) { + /* + * No more space in the FIFO. The datasheet is not + * precise about this case. We choose what is easiest + * to model: the packet is truncated, and TXE is raised. + * + * Note, it could be a fragmented packet, but we currently + * do not handle that (see earlier TX_B case). + */ + qemu_log_mask(LOG_GUEST_ERROR, + "MIL TX FIFO overrun, discarding %u byte%s\n", + n, n > 1 ? "s" : ""); + s->int_sts |= TXE_INT; + break; + } s->txp->data[s->txp->len] = val & 0xff; s->txp->len++; val >>= 8; -- 2.39.2
From: Philippe Mathieu-Daudé <philmd@linaro.org> The magic 2048 is explained in the LAN9211 datasheet (DS00002414A) in chapter 1.4, "10/100 Ethernet MAC": The MAC Interface Layer (MIL), within the MAC, contains a 2K Byte transmit and a 128 Byte receive FIFO which is separate from the TX and RX FIFOs. [...] Note, the use of the constant in lan9118_receive() reveals that our implementation is using the same buffer for both tx and rx. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <20240409133801.23503-2-philmd@linaro.org> (cherry picked from commit a45223467e4e185fff1c76a6483784fa379ded77) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> diff --git a/hw/net/lan9118.c b/hw/net/lan9118.c index XXXXXXX..XXXXXXX 100644 --- a/hw/net/lan9118.c +++ b/hw/net/lan9118.c @@ -XXX,XX +XXX,XX @@ do { fprintf(stderr, "lan9118: error: " fmt , ## __VA_ARGS__);} while (0) #define GPT_TIMER_EN 0x20000000 +/* + * The MAC Interface Layer (MIL), within the MAC, contains a 2K Byte transmit + * and a 128 Byte receive FIFO which is separate from the TX and RX FIFOs. + */ +#define MIL_TXFIFO_SIZE 2048 + enum tx_state { TX_IDLE, TX_B, @@ -XXX,XX +XXX,XX @@ typedef struct { int32_t pad; int32_t fifo_used; int32_t len; - uint8_t data[2048]; + uint8_t data[MIL_TXFIFO_SIZE]; } LAN9118Packet; static const VMStateDescription vmstate_lan9118_packet = { @@ -XXX,XX +XXX,XX @@ static const VMStateDescription vmstate_lan9118_packet = { VMSTATE_INT32(pad, LAN9118Packet), VMSTATE_INT32(fifo_used, LAN9118Packet), VMSTATE_INT32(len, LAN9118Packet), - VMSTATE_UINT8_ARRAY(data, LAN9118Packet, 2048), + VMSTATE_UINT8_ARRAY(data, LAN9118Packet, MIL_TXFIFO_SIZE), VMSTATE_END_OF_LIST() } }; @@ -XXX,XX +XXX,XX @@ static ssize_t lan9118_receive(NetClientState *nc, const uint8_t *buf, return -1; } - if (size >= 2048 || size < 14) { + if (size >= MIL_TXFIFO_SIZE || size < 14) { return -1; } -- 2.39.2
From: Philippe Mathieu-Daudé <philmd@linaro.org> Per "SD Host Controller Standard Specification Version 3.00": * 2.2.5 Transfer Mode Register (Offset 00Ch) Writes to this register shall be ignored when the Command Inhibit (DAT) in the Present State register is 1. Do not update the TRNMOD register when Command Inhibit (DAT) bit is set to avoid the present-status register going out of sync, leading to malicious guest using DMA mode and overflowing the FIFO buffer: $ cat << EOF | qemu-system-i386 \ -display none -nographic -nodefaults \ -machine accel=qtest -m 512M \ -device sdhci-pci,sd-spec-version=3 \ -device sd-card,drive=mydrive \ -drive if=none,index=0,file=null-co://,format=raw,id=mydrive \ -qtest stdio outl 0xcf8 0x80001013 outl 0xcfc 0x91 outl 0xcf8 0x80001001 outl 0xcfc 0x06000000 write 0x9100002c 0x1 0x05 write 0x91000058 0x1 0x16 write 0x91000005 0x1 0x04 write 0x91000028 0x1 0x08 write 0x16 0x1 0x21 write 0x19 0x1 0x20 write 0x9100000c 0x1 0x01 write 0x9100000e 0x1 0x20 write 0x9100000f 0x1 0x00 write 0x9100000c 0x1 0x00 write 0x91000020 0x1 0x00 EOF Stack trace (part): ================================================================= ==89993==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x615000029900 at pc 0x55d5f885700d bp 0x7ffc1e1e9470 sp 0x7ffc1e1e9468 WRITE of size 1 at 0x615000029900 thread T0 #0 0x55d5f885700c in sdhci_write_dataport hw/sd/sdhci.c:564:39 #1 0x55d5f8849150 in sdhci_write hw/sd/sdhci.c:1223:13 #2 0x55d5fa01db63 in memory_region_write_accessor system/memory.c:497:5 #3 0x55d5fa01d245 in access_with_adjusted_size system/memory.c:573:18 #4 0x55d5fa01b1a9 in memory_region_dispatch_write system/memory.c:1521:16 #5 0x55d5fa09f5c9 in flatview_write_continue system/physmem.c:2711:23 #6 0x55d5fa08f78b in flatview_write system/physmem.c:2753:12 #7 0x55d5fa08f258 in address_space_write system/physmem.c:2860:18 ... 0x615000029900 is located 0 bytes to the right of 512-byte region [0x615000029700,0x615000029900) allocated by thread T0 here: #0 0x55d5f7237b27 in __interceptor_calloc #1 0x7f9e36dd4c50 in g_malloc0 #2 0x55d5f88672f7 in sdhci_pci_realize hw/sd/sdhci-pci.c:36:5 #3 0x55d5f844b582 in pci_qdev_realize hw/pci/pci.c:2092:9 #4 0x55d5fa2ee74b in device_set_realized hw/core/qdev.c:510:13 #5 0x55d5fa325bfb in property_set_bool qom/object.c:2358:5 #6 0x55d5fa31ea45 in object_property_set qom/object.c:1472:5 #7 0x55d5fa332509 in object_property_set_qobject om/qom-qobject.c:28:10 #8 0x55d5fa31f6ed in object_property_set_bool qom/object.c:1541:15 #9 0x55d5fa2e2948 in qdev_realize hw/core/qdev.c:292:12 #10 0x55d5f8eed3f1 in qdev_device_add_from_qdict system/qdev-monitor.c:719:10 #11 0x55d5f8eef7ff in qdev_device_add system/qdev-monitor.c:738:11 #12 0x55d5f8f211f0 in device_init_func system/vl.c:1200:11 #13 0x55d5fad0877d in qemu_opts_foreach util/qemu-option.c:1135:14 #14 0x55d5f8f0df9c in qemu_create_cli_devices system/vl.c:2638:5 #15 0x55d5f8f0db24 in qmp_x_exit_preconfig system/vl.c:2706:5 #16 0x55d5f8f14dc0 in qemu_init system/vl.c:3737:9 ... SUMMARY: AddressSanitizer: heap-buffer-overflow hw/sd/sdhci.c:564:39 in sdhci_write_dataport Add assertions to ensure the fifo_buffer[] is not overflowed by malicious accesses to the Buffer Data Port register. Fixes: CVE-2024-3447 Cc: qemu-stable@nongnu.org Fixes: d7dfca0807 ("hw/sdhci: introduce standard SD host controller") Buglink: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=58813 Reported-by: Alexander Bulekov <alxndr@bu.edu> Reported-by: Chuhong Yuan <hslester96@gmail.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <CAFEAcA9iLiv1XGTGKeopgMa8Y9+8kvptvsb8z2OBeuy+5=NUfg@mail.gmail.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20240409145524.27913-1-philmd@linaro.org> (cherry picked from commit 9e4b27ca6bf4974f169bbca7f3dca117b1208b6f) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> diff --git a/hw/sd/sdhci.c b/hw/sd/sdhci.c index XXXXXXX..XXXXXXX 100644 --- a/hw/sd/sdhci.c +++ b/hw/sd/sdhci.c @@ -XXX,XX +XXX,XX @@ static uint32_t sdhci_read_dataport(SDHCIState *s, unsigned size) } for (i = 0; i < size; i++) { + assert(s->data_count < s->buf_maxsz); value |= s->fifo_buffer[s->data_count] << i * 8; s->data_count++; /* check if we've read all valid data (blksize bytes) from buffer */ @@ -XXX,XX +XXX,XX @@ static void sdhci_write_dataport(SDHCIState *s, uint32_t value, unsigned size) } for (i = 0; i < size; i++) { + assert(s->data_count < s->buf_maxsz); s->fifo_buffer[s->data_count] = value & 0xFF; s->data_count++; value >>= 8; @@ -XXX,XX +XXX,XX @@ sdhci_write(void *opaque, hwaddr offset, uint64_t val, unsigned size) if (!(s->capareg & R_SDHC_CAPAB_SDMA_MASK)) { value &= ~SDHC_TRNS_DMA; } + + /* TRNMOD writes are inhibited while Command Inhibit (DAT) is true */ + if (s->prnsts & SDHC_DATA_INHIBIT) { + mask |= 0xffff; + } + MASKED_WRITE(s->trnmod, mask, value & SDHC_TRNMOD_MASK); MASKED_WRITE(s->cmdreg, mask >> 16, value >> 16); -- 2.39.2
From: Zack Buhman <zack@buhman.org> CHECK_NOT_DELAY_SLOT is correctly applied to the branch-related instructions, but not to the PC-relative mov* instructions. I verified the existence of an illegal slot exception on a SH7091 when any of these instructions are attempted inside a delay slot. This also matches the behavior described in the SH-4 ISA manual. Signed-off-by: Zack Buhman <zack@buhman.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20240407150705.5965-1-zack@buhman.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewd-by: Yoshinori Sato <ysato@users.sourceforge.jp> (cherry picked from commit b754cb2dcde26a7bc8a9d17bb6900a0ac0dd38e2) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> (Mjt: trivial context (whitespace before comments) fixup) diff --git a/target/sh4/translate.c b/target/sh4/translate.c index XXXXXXX..XXXXXXX 100644 --- a/target/sh4/translate.c +++ b/target/sh4/translate.c @@ -XXX,XX +XXX,XX @@ static void _decode_opc(DisasContext * ctx) tcg_gen_movi_i32(REG(B11_8), B7_0s); return; case 0x9000: /* mov.w @(disp,PC),Rn */ + CHECK_NOT_DELAY_SLOT { TCGv addr = tcg_const_i32(ctx->base.pc_next + 4 + B7_0 * 2); tcg_gen_qemu_ld_i32(REG(B11_8), addr, ctx->memidx, MO_TESW); @@ -XXX,XX +XXX,XX @@ static void _decode_opc(DisasContext * ctx) } return; case 0xd000: /* mov.l @(disp,PC),Rn */ + CHECK_NOT_DELAY_SLOT { TCGv addr = tcg_const_i32((ctx->base.pc_next + 4 + B7_0 * 4) & ~3); tcg_gen_qemu_ld_i32(REG(B11_8), addr, ctx->memidx, MO_TESL); @@ -XXX,XX +XXX,XX @@ static void _decode_opc(DisasContext * ctx) } return; case 0xc700: /* mova @(disp,PC),R0 */ + CHECK_NOT_DELAY_SLOT tcg_gen_movi_i32(REG(0), ((ctx->base.pc_next & 0xfffffffc) + 4 + B7_0 * 4) & ~3); return; -- 2.39.2
From: Harsh Prateek Bora <harshpb@linux.ibm.com> spapr_irq_init currently uses existing macro SPAPR_XIRQ_BASE to refer to the range of CPU IPIs during initialization of nr-irqs property. It is more appropriate to have its own define which can be further reused as appropriate for correct interpretation. Suggested-by: Cedric Le Goater <clg@kaod.org> Reviewed-by: Cédric Le Goater <clg@kaod.org> Tested-by: Kowshik Jois <kowsjois@linux.ibm.com> Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> (cherry picked from commit 2df5c1f5b014126595a26c6797089d284a3b211c) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c index XXXXXXX..XXXXXXX 100644 --- a/hw/ppc/spapr_irq.c +++ b/hw/ppc/spapr_irq.c @@ -XXX,XX +XXX,XX @@ #include "trace.h" +QEMU_BUILD_BUG_ON(SPAPR_IRQ_NR_IPIS > SPAPR_XIRQ_BASE); + static const TypeInfo spapr_intc_info = { .name = TYPE_SPAPR_INTC, .parent = TYPE_INTERFACE, @@ -XXX,XX +XXX,XX @@ void spapr_irq_init(SpaprMachineState *spapr, Error **errp) int i; dev = qdev_new(TYPE_SPAPR_XIVE); - qdev_prop_set_uint32(dev, "nr-irqs", smc->nr_xirqs + SPAPR_XIRQ_BASE); + qdev_prop_set_uint32(dev, "nr-irqs", smc->nr_xirqs + SPAPR_IRQ_NR_IPIS); /* * 8 XIVE END structures per CPU. One for each available * priority @@ -XXX,XX +XXX,XX @@ void spapr_irq_init(SpaprMachineState *spapr, Error **errp) } spapr->qirqs = qemu_allocate_irqs(spapr_set_irq, spapr, - smc->nr_xirqs + SPAPR_XIRQ_BASE); + smc->nr_xirqs + SPAPR_IRQ_NR_IPIS); /* * Mostly we don't actually need this until reset, except that not diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h index XXXXXXX..XXXXXXX 100644 --- a/include/hw/ppc/spapr_irq.h +++ b/include/hw/ppc/spapr_irq.h @@ -XXX,XX +XXX,XX @@ #include "qom/object.h" /* - * IRQ range offsets per device type + * The XIVE IRQ backend uses the same layout as the XICS backend but + * covers the full range of the IRQ number space. The IRQ numbers for + * the CPU IPIs are allocated at the bottom of this space, below 4K, + * to preserve compatibility with XICS which does not use that range. + */ + +/* + * CPU IPI range (XIVE only) */ #define SPAPR_IRQ_IPI 0x0 +#define SPAPR_IRQ_NR_IPIS 0x1000 + +/* + * IRQ range offsets per device type + */ #define SPAPR_XIRQ_BASE XICS_IRQ_BASE /* 0x1000 */ #define SPAPR_IRQ_EPOW (SPAPR_XIRQ_BASE + 0x0000) -- 2.39.2
From: Harsh Prateek Bora <harshpb@linux.ibm.com> Initialize the machine specific max_cpus limit as per the maximum range of CPU IPIs available. Keeping between 4096 to 8192 will throw IRQ not free error due to XIVE/XICS limitation and keeping beyond 8192 will hit assert in tcg_region_init or spapr_xive_claim_irq. Logs: Without patch fix: [root@host build]# qemu-system-ppc64 -accel tcg -smp 10,maxcpus=4097 qemu-system-ppc64: IRQ 4096 is not free [root@host build]# On LPAR: [root@host build]# qemu-system-ppc64 -accel tcg -smp 10,maxcpus=8193 ** ERROR:../tcg/region.c:774:tcg_region_init: assertion failed: (region_size >= 2 * page_size) Bail out! ERROR:../tcg/region.c:774:tcg_region_init: assertion failed: (region_size >= 2 * page_size) Aborted (core dumped) [root@host build]# On x86: [root@host build]# qemu-system-ppc64 -accel tcg -smp 10,maxcpus=8193 qemu-system-ppc64: ../hw/intc/spapr_xive.c:596: spapr_xive_claim_irq: Assertion `lisn < xive->nr_irqs' failed. Aborted (core dumped) [root@host build]# With patch fix: [root@host build]# qemu-system-ppc64 -accel tcg -smp 10,maxcpus=4097 qemu-system-ppc64: Invalid SMP CPUs 4097. The max CPUs supported by machine 'pseries-8.2' is 4096 [root@host build]# Reported-by: Kowshik Jois <kowsjois@linux.ibm.com> Tested-by: Kowshik Jois <kowsjois@linux.ibm.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> (cherry picked from commit c4f91d7b7be76c47015521ab0109c6e998a369b0) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index XXXXXXX..XXXXXXX 100644 --- a/hw/ppc/spapr.c +++ b/hw/ppc/spapr.c @@ -XXX,XX +XXX,XX @@ static void spapr_machine_class_init(ObjectClass *oc, void *data) mc->block_default_type = IF_SCSI; /* - * Setting max_cpus to INT32_MAX. Both KVM and TCG max_cpus values - * should be limited by the host capability instead of hardcoded. - * max_cpus for KVM guests will be checked in kvm_init(), and TCG - * guests are welcome to have as many CPUs as the host are capable - * of emulate. + * While KVM determines max cpus in kvm_init() using kvm_max_vcpus(), + * In TCG the limit is restricted by the range of CPU IPIs available. */ - mc->max_cpus = INT32_MAX; + mc->max_cpus = SPAPR_IRQ_NR_IPIS; mc->no_parallel = 1; mc->default_boot_order = ""; -- 2.39.2