This is a series of semi-related patches for the x86 macOS Hypervisor.framework (hvf) accelerator backend. The intention is to (1) fix bugs and (2) move the hvf backend to more modern and efficient Hypervisor.framework APIs. The goal is to replace the main hv_vcpu_run() call with hv_vcpu_run_until(). On the one hand, doing so noticeably improves performance in itself. On the other hand, using the new API is a prerequisite for enabling hvf's in-kernel virtual APIC implementation, which provides a further, even more drastic performance improvement on many workloads. Integrating the APIC requires a number of large commits which still need some compatibility bugfixing, and which I hope to submit as a later patch set.

Compared to v2 of the patch set, I've re-added the kick and hv_vcpu_run_until patches after analysing hv_vcpu_interrupt in more detail and finding it safe. Plus, there's an ergonomic improvement to the assert_hvf_ok macro.

In this series:

Patch 1 enables the INVTSC CPUID bit when running with hvf. This can enable some optimisations in the guest OS, and I've not found any reason it shouldn't be allowed for hvf-based hosts. It now also includes setting a migration blocker when the feature is active.

Patch 2 fixes a bunch of compile warnings that kept littering my build logs, so I finally caved and fixed them. As far as I can tell, these were all ancient typos.

Patch 3 sorts out the minor mess of hvf vCPU IDs/handles. The aarch64 and x86-64 versions of Hypervisor.framework's APIs use different integral types (uint64_t vs unsigned int) when referencing vCPUs, so this changes the code to use the correct one depending on build arch instead of awkward pointer casts. (There's currently only one instance of such a cast, but patches 5 and 6 would have added more, so I'm fixing this preemptively.)

Patch 4 fixes dirty page tracking for the x86 hvf backend. This has previously only accidentally worked because hv_vcpu_run() makes spurious EPT fault exits.
Switching to hv_vcpu_run_until() surfaces this issue when using an emulated VGA adapter, for example, as those use dirty page tracking to do partial screen updates. This issue caused problems on previous attempts to switch to the newer HVF APIs, as it had been masked by the inefficiency of the older APIs.

Patch 5 implements a "real" vCPU kick using hv_vcpu_interrupt(). So far, the x86 hvf backend's external interrupt injection has relied on hv_vcpu_run()'s behaviour of exiting the VM very frequently without host action to handle external interrupts. This is not a great state of affairs in itself, but it fails entirely when switching to hv_vcpu_run_until(), which returns only when there is actual work to be done by the VMM.

In previous attempts to introduce this improved "kick", there has been doubt about hv_vcpu_interrupt()'s reliability in edge cases, particularly when an interrupt is issued while the vCPU thread is either not running hv_vcpu_run_until() at all, or is in the process of entering or exiting VMX mode. I believe this concern is unfounded, for three reasons:

1) The patches have been in use in a fleet of hundreds of production systems running CI workloads for over two years. No symptoms of a missed interrupt (or indeed any other issues) have been encountered.

2) I have tried to provoke such an edge case and failed. To do this, I hacked up Michael Steil's toy "hvdos" VMM to run some bare-metal assembly code, then hit the running VM with a barrage of hv_vcpu_interrupt calls at randomly spaced intervals. Any hv_vcpu_interrupt call is followed shortly by a VM exit, regardless of what state the vCPU thread was in. Multiple interrupts in short succession are coalesced, but only before a VM exit. If an interrupt is issued after the VM has already exited, a further exit is triggered as soon as the vCPU thread attempts to re-enter the VM.
The code for this is here: https://gitlab.com/pmdj/hvf-edge-cases

3) The doubts about hv_vcpu_interrupt edge cases seem to originate in observed behaviour that was actually caused by the dirty memory tracking bug fixed in patch 4 of this series. That bug had nothing to do with hv_vcpu_interrupt as such; it was surfaced by hv_vcpu_run_until()'s change in EPT fault exit behaviour compared to hv_vcpu_run().

Of course, this is not PROOF of absence of defects, but I'm not aware of any formal proofs covering other QEMU code or dependencies. I have also asked Apple's DTS to officially clarify hv_vcpu_interrupt()'s behaviour in edge cases, but unfortunately received no reply.

Patch 6 switches from hv_vcpu_run() to hv_vcpu_run_until() where supported. This was of course the goal, and the previous patches fix all the bugs that this switch would otherwise have surfaced.

Patch 7 provides a small improvement to error messages if and when an hvf call fails. assert_hvf_ok() would previously only tell you the error code of the failing call, not which call was failing, nor the call site. The change makes it behave more like a classic assertion, reporting the expression as well as the source code location along with the error.

changelog:

v3:
- Back to one patch series with all the changes.
- Detailed investigation into edge case behaviour of hv_vcpu_interrupt. Determined it was behaving fine, with no race condition workarounds needed, so the kick and hv_vcpu_run_until patches have stayed essentially the same as in v1.
- Added the assert_hvf_ok patch because I kept using it when debugging.

v2:
- Migration blocker when INVTSC is set (Thanks Paolo for pointing that out!)
- Dirty page tracking fix (Thanks Roman for noticing the regression in observed behaviour on certain VMs, which led me to debugging this issue.)
- vCPU handle type cleanup (Based on discussion with Paolo)
- Added fixes for existing compile warnings.
- Split patch series into 2 parts.
This work has been sponsored by Sauce Labs Inc.

Phil Dennis-Jordan (7):
  i386/hvf: Adds support for INVTSC cpuid bit
  i386/hvf: Fixes some compilation warnings
  hvf: Consistent types for vCPU handles
  i386/hvf: Fixes dirty memory tracking by page granularity RX->RWX change
  i386/hvf: In kick_vcpu use hv_vcpu_interrupt to force exit
  i386/hvf: Updates API usage to use modern vCPU run function
  hvf: Makes assert_hvf_ok report failed expression

 accel/hvf/hvf-accel-ops.c    |  2 +-
 accel/hvf/hvf-all.c          | 49 ++++++++++++++++--------------------
 include/sysemu/hvf_int.h     |  9 +++++--
 target/i386/hvf/hvf.c        | 47 +++++++++++++++++++++++++++++++---
 target/i386/hvf/vmx.h        |  3 +--
 target/i386/hvf/x86_cpuid.c  |  4 +++
 target/i386/hvf/x86_decode.c |  2 +-
 target/i386/hvf/x86_emu.c    |  4 +--
 8 files changed, 81 insertions(+), 39 deletions(-)

-- 
2.36.1
Queued, thanks. Thanks for persisting! It sucks that the hv_vcpu_interrupt() API docs are not clear, but your tests are great. The self-interrupt one is the case that I was most worried about, and you're covering it. Sorry for being a pain for nothing, at least retrospectively.

Paolo
On Thu, 6 Jun 2024 at 10:24, Paolo Bonzini <pbonzini@redhat.com> wrote:
> Queued, thanks.

Thanks - also for reviewing, etc.!

> Thanks for persisting! It sucks that the hv_vcpu_interrupt() API docs
> are not clear, but your tests are great. The self-interrupt one is
> the case that I was most worried about, and you're covering it.
> Sorry for being a pain for nothing, at least retrospectively.

No worries - the concern is understandable, especially in the face of the unfortunate apparent regression which turned out to be the dirty page tracking bug. And I agree: the hv_vcpu_interrupt docs, along with the rest of Hypervisor.framework's, are terrible. There does not appear to have been any thought about what a developer using that API might care about. I've been working on integrating the HVF APIC/PIC/IOAPIC implementations, and there are ambiguities and edge cases galore. Unfortunately (?), the perf improvement is worth the trouble of trial & error…

Phil