[PATCH v15 15/60] i386/xen: add pc_machine_kvm_type to initialize XEN_EMULATE mode

David Woodhouse posted 60 patches 2 years, 4 months ago
[PATCH v15 15/60] i386/xen: add pc_machine_kvm_type to initialize XEN_EMULATE mode
Posted by David Woodhouse 2 years, 4 months ago
From: David Woodhouse <dwmw@amazon.co.uk>

The xen_overlay device (and later similar devices for event channels and
grant tables) need to be instantiated. Do this from a kvm_type method on
the PC machine derivatives, since KVM is only way to support Xen emulation
for now.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Paul Durrant <paul@xen.org>
---
 hw/i386/pc.c         | 11 +++++++++++
 include/hw/i386/pc.h |  3 +++
 2 files changed, 14 insertions(+)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 992951c107..a316a01b15 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -90,6 +90,7 @@
 #include "hw/virtio/virtio-iommu.h"
 #include "hw/virtio/virtio-pmem-pci.h"
 #include "hw/virtio/virtio-mem-pci.h"
+#include "hw/i386/kvm/xen_overlay.h"
 #include "hw/mem/memory-device.h"
 #include "sysemu/replay.h"
 #include "target/i386/cpu.h"
@@ -1846,6 +1847,16 @@ static void pc_machine_initfn(Object *obj)
     cxl_machine_init(obj, &pcms->cxl_devices_state);
 }
 
+int pc_machine_kvm_type(MachineState *machine, const char *kvm_type)
+{
+#ifdef CONFIG_XEN_EMU
+    if (xen_mode == XEN_EMULATE) {
+        xen_overlay_create();
+    }
+#endif
+    return 0;
+}
+
 static void pc_machine_reset(MachineState *machine, ShutdownCause reason)
 {
     CPUState *cs;
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 66e3d059ef..467311007e 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -291,12 +291,15 @@ extern const size_t pc_compat_1_5_len;
 extern GlobalProperty pc_compat_1_4[];
 extern const size_t pc_compat_1_4_len;
 
+int pc_machine_kvm_type(MachineState *machine, const char *vm_type);
+
 #define DEFINE_PC_MACHINE(suffix, namestr, initfn, optsfn) \
     static void pc_machine_##suffix##_class_init(ObjectClass *oc, void *data) \
     { \
         MachineClass *mc = MACHINE_CLASS(oc); \
         optsfn(mc); \
         mc->init = initfn; \
+        mc->kvm_type = pc_machine_kvm_type; \
     } \
     static const TypeInfo pc_machine_type_##suffix = { \
         .name       = namestr TYPE_MACHINE_SUFFIX, \
-- 
2.39.0
Re: [PATCH v15 15/60] i386/xen: add pc_machine_kvm_type to initialize XEN_EMULATE mode
Posted by Xiaoyao Li 2 years, 4 months ago
On 3/1/2023 9:51 PM, David Woodhouse wrote:
> From: David Woodhouse <dwmw@amazon.co.uk>
> 
> The xen_overlay device (and later similar devices for event channels and
> grant tables) need to be instantiated. Do this from a kvm_type method on
> the PC machine derivatives, since KVM is only way to support Xen emulation
> for now.

Just curious, isn't there any more reasonable place to add machine 
specific initialization?

abusing the mc->kvm_type() looks bad to me.

If everything goes well, KVM x86 will support multiple vm_type[1].
At that time, QEMU will need to implement mc->kvm_type() to get/parse vm 
type. I have posted the patch to enable it for TDX[2].

[1] 
https://github.com/sean-jc/linux/commit/fbc35ac0be99b5e02248e7c8f1f3cacb9da512ce

[2] 
https://lore.kernel.org/qemu-devel/20220802074750.2581308-4-xiaoyao.li@intel.com/ 


> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> Reviewed-by: Paul Durrant <paul@xen.org>
> ---
>   hw/i386/pc.c         | 11 +++++++++++
>   include/hw/i386/pc.h |  3 +++
>   2 files changed, 14 insertions(+)
> 
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 992951c107..a316a01b15 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -90,6 +90,7 @@
>   #include "hw/virtio/virtio-iommu.h"
>   #include "hw/virtio/virtio-pmem-pci.h"
>   #include "hw/virtio/virtio-mem-pci.h"
> +#include "hw/i386/kvm/xen_overlay.h"
>   #include "hw/mem/memory-device.h"
>   #include "sysemu/replay.h"
>   #include "target/i386/cpu.h"
> @@ -1846,6 +1847,16 @@ static void pc_machine_initfn(Object *obj)
>       cxl_machine_init(obj, &pcms->cxl_devices_state);
>   }
>   
> +int pc_machine_kvm_type(MachineState *machine, const char *kvm_type)
> +{
> +#ifdef CONFIG_XEN_EMU
> +    if (xen_mode == XEN_EMULATE) {
> +        xen_overlay_create();
> +    }
> +#endif
> +    return 0;
> +}
> +
>   static void pc_machine_reset(MachineState *machine, ShutdownCause reason)
>   {
>       CPUState *cs;
> diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
> index 66e3d059ef..467311007e 100644
> --- a/include/hw/i386/pc.h
> +++ b/include/hw/i386/pc.h
> @@ -291,12 +291,15 @@ extern const size_t pc_compat_1_5_len;
>   extern GlobalProperty pc_compat_1_4[];
>   extern const size_t pc_compat_1_4_len;
>   
> +int pc_machine_kvm_type(MachineState *machine, const char *vm_type);
> +
>   #define DEFINE_PC_MACHINE(suffix, namestr, initfn, optsfn) \
>       static void pc_machine_##suffix##_class_init(ObjectClass *oc, void *data) \
>       { \
>           MachineClass *mc = MACHINE_CLASS(oc); \
>           optsfn(mc); \
>           mc->init = initfn; \
> +        mc->kvm_type = pc_machine_kvm_type; \
>       } \
>       static const TypeInfo pc_machine_type_##suffix = { \
>           .name       = namestr TYPE_MACHINE_SUFFIX, \
Re: [PATCH v15 15/60] i386/xen: add pc_machine_kvm_type to initialize XEN_EMULATE mode
Posted by David Woodhouse 2 years, 4 months ago
On Fri, 2023-03-10 at 11:15 +0800, Xiaoyao Li wrote:
> On 3/1/2023 9:51 PM, David Woodhouse wrote:
> > From: David Woodhouse <dwmw@amazon.co.uk>
> > 
> > The xen_overlay device (and later similar devices for event channels and
> > grant tables) need to be instantiated. Do this from a kvm_type method on
> > the PC machine derivatives, since KVM is only way to support Xen emulation
> > for now.
> 
> Just curious, isn't there any more reasonable place to add machine 
> specific initialization?
> 
> abusing the mc->kvm_type() looks bad to me.

Hm, good question. Off the top of my head I have no better answer than
"Paolo made me do it":

https://lore.kernel.org/qemu-devel/8495140d-3301-7693-b804-0554166802da@redhat.com/

But I have gained a bit more clue since December, and reading that
message again I'll put a lot more focus on the fact that he said
"during mc->kvm_type OR AFTERWARDS". That's the *earliest* we can
depend on xen_mode being set to XEN_EMULATE, but now the dust has
settled I don't think we actually *need* to do it that early.

I'll have a play with moving it to pc_basic_device_init() where there's
some other XEN_EMULATE initialisation anyway:

--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1319,6 +1319,10 @@ void pc_basic_device_init(struct PCMachineState *pcms,
 
 #ifdef CONFIG_XEN_EMU
     if (xen_mode == XEN_EMULATE) {
+        xen_overlay_create();
+        xen_evtchn_create();
+        xen_gnttab_create();
+        xen_xenstore_create();
         xen_evtchn_connect_gsis(gsi);
         if (pcms->bus) {
             pci_create_simple(pcms->bus, -1, "xen-platform");
@@ -1868,14 +1872,6 @@ static void pc_machine_initfn(Object *obj)
 
 int pc_machine_kvm_type(MachineState *machine, const char *kvm_type)
 {
-#ifdef CONFIG_XEN_EMU
-    if (xen_mode == XEN_EMULATE) {
-        xen_overlay_create();
-        xen_evtchn_create();
-        xen_gnttab_create();
-        xen_xenstore_create();
-    }
-#endif
     return 0;
 }
 
I have *also* gained some Avocado tests since December, which exercise
Xen guests (with PV disk) in a bunch of different interrupt-delivery
modes. And pass when I do the above. So I'll let the morning coffee
take effect and give it a bit more thought and testing, and probably
submit it.

Shall I leave pc_machine_kvm_type() present but empty, given that
you're about to come along and put something back there?

Thanks.
Re: [PATCH v15 15/60] i386/xen: add pc_machine_kvm_type to initialize XEN_EMULATE mode
Posted by Paolo Bonzini 2 years, 4 months ago
On 3/10/23 09:28, David Woodhouse wrote:
> On Fri, 2023-03-10 at 11:15 +0800, Xiaoyao Li wrote:
>> On 3/1/2023 9:51 PM, David Woodhouse wrote:
>>> From: David Woodhouse <dwmw@amazon.co.uk>
>>>
>>> The xen_overlay device (and later similar devices for event channels and
>>> grant tables) need to be instantiated. Do this from a kvm_type method on
>>> the PC machine derivatives, since KVM is only way to support Xen emulation
>>> for now.
>>
>> Just curious, isn't there any more reasonable place to add machine
>> specific initialization?
>>
>> abusing the mc->kvm_type() looks bad to me.
> 
> Hm, good question. Off the top of my head I have no better answer than
> "Paolo made me do it":
> 
> https://lore.kernel.org/qemu-devel/8495140d-3301-7693-b804-0554166802da@redhat.com/
> 
> But I have gained a bit more clue since December, and reading that
> message again I'll put a lot more focus on the fact that he said
> "during mc->kvm_type OR AFTERWARDS".

I don't think this is abusing mc->kvm_type; that is the point where 
startup code tells the machine "now you have your accelerator 
configuration, do what you want with that info".  In fact I find using 
xen_enabled() in mc->kvm_type to be less ugly than having a MachineClass 
entry just for KVM. :)

But, if you want to move it to pc_basic_device_init() that is certainly 
okay too.  It's not like people are going to add TCG/Xen support 
tomorrow but it is a tiny step in the direction of less 
accelerator-specific code for Xen emulation.

Paolo
Re: [PATCH v15 15/60] i386/xen: add pc_machine_kvm_type to initialize XEN_EMULATE mode
Posted by David Woodhouse 2 years, 4 months ago
On Fri, 2023-03-10 at 09:52 +0100, Paolo Bonzini wrote:
> 
> I don't think this is abusing mc->kvm_type; that is the point where 
> startup code tells the machine "now you have your accelerator 
> configuration, do what you want with that info".  In fact I find using 
> xen_enabled() in mc->kvm_type to be less ugly than having a MachineClass 
> entry just for KVM. :)
> 
> But, if you want to move it to pc_basic_device_init() that is certainly 
> okay too.  It's not like people are going to add TCG/Xen support 
> tomorrow but it is a tiny step in the direction of less 
> accelerator-specific code for Xen emulation.

Yeah, I think I will.

We can't just leave it as it is; it needs *either* a coherent comment
explaining why the init calls happen from different locations, *or*
they need to be put together. Xiaoyao has triggered my "if it needs
documenting, fix it first and then explain what's left" instinct.

And it isn't even just a case of where we call the xen* functions from;
the xen_evtchn_connect_gsis() call *only* exists as a separate function
because we didn't have the GSIs available early enough to pass them to
xen_evtchn_create(). If we do it all in pc_basic_devices_init() then
that can be simplified too.

I cannot come up with a coherent comment explaining the current
situation, and thus I must fix it :)

Speaking of the 'fix it first' instinct... looking back at that mail
from December reminds me that I wanted to make '-M xenfv' work. Because
on the users' behalf, I *hate* what I had to write in
https://qemu-project.gitlab.io/qemu/system/i386/xen.html

qemu-system-x86_64 -M pc --accel kvm,xen-version=0x4000a,kernel-irqchip=split \
     -drive file=${GUEST_IMAGE},if=none,id=disk -device xen-disk,drive=disk,vdev=xvda

Why in the name of all that is holy can't that be

 qemu-system-x86_64 -M xenfv -drive file=${GUEST_IMAGE},if=xen-disk,xen-block.vdev=xvda

Now *maybe* I could live with them also having to add '-accel kvm'. If
we can genuinely make the argument that QEMU running on a Linux/KVM
host and not in a Xen dom0 couldn't possibly be expected to work that
out for itself. 

And in fact, why wouldn't it default to xen-disk for a Xen guest? Why
should the user have to specify that? And why *can't* it default to
xvda for the first (and only) present disk.... 
[PATCH] hw/xen: Simplify emulated Xen platform init
Posted by David Woodhouse 2 years, 4 months ago
From: David Woodhouse <dwmw@amazon.co.uk>

I initially put the basic platform init (overlay pages, grant tables,
event channels) into mc->kvm_type because that was the earliest place
that could sensibly test for xen_mode==XEN_EMULATE.

The intent was to do this early enough that we could then initialise the
XenBus and other parts which would have depended on them, from a generic
location for both Xen and KVM/Xen in the PC-specific code, as seen in
https://lore.kernel.org/qemu-devel/20230116221919.1124201-16-dwmw2@infradead.org/

However, then the Xen on Arm patches came along, and *they* wanted to
do the XenBus init from a 'generic' Xen-specific location instead:
https://lore.kernel.org/qemu-devel/20230210222729.957168-4-sstabellini@kernel.org/

Since there's no generic location that covers all three, I conceded to
do it for XEN_EMULATE mode in pc_basic_devices_init().

And now there's absolutely no point in having some of the platform init
done from pc_machine_kvm_type(); we can move it all up to live in a
single place in pc_basic_devices_init(). This has the added benefit that
we can drop the separate xen_evtchn_connect_gsis() function completely,
and pass just the system GSIs in directly to xen_evtchn_create().

While I'm at it, it does no harm to explicitly pass in the *number* of
said GSIs, because it does make me twitch a bit to pass an array of
impicit size. During the lifetime of the KVM/Xen patchset, that had
already changed (albeit just cosmetically) from GSI_NUM_PINS to
IOAPIC_NUM_PINS.

And document a bit better that this is for the *output* GSI for raising
CPU0's events when the per-CPU vector isn't available. The fact that
we create a whole set of them and then only waggle the one we're told
to, instead of having a single output and only *connecting* it to the
GSI that it should be connected to, is still non-intuitive for me.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 hw/i386/kvm/xen_evtchn.c | 40 ++++++++++++++++++++--------------------
 hw/i386/kvm/xen_evtchn.h |  3 +--
 hw/i386/pc.c             | 13 ++++---------
 3 files changed, 25 insertions(+), 31 deletions(-)

diff --git a/hw/i386/kvm/xen_evtchn.c b/hw/i386/kvm/xen_evtchn.c
index 98a7b85047..f4204949fb 100644
--- a/hw/i386/kvm/xen_evtchn.c
+++ b/hw/i386/kvm/xen_evtchn.c
@@ -146,7 +146,10 @@ struct XenEvtchnState {
     QemuMutex port_lock;
     uint32_t nr_ports;
     XenEvtchnPort port_table[EVTCHN_2L_NR_CHANNELS];
-    qemu_irq gsis[IOAPIC_NUM_PINS];
+
+    /* Connected to the system GSIs for raising callback as GSI / INTx */
+    unsigned int nr_callback_gsis;
+    qemu_irq *callback_gsis;
 
     struct xenevtchn_handle *be_handles[EVTCHN_2L_NR_CHANNELS];
 
@@ -298,7 +301,7 @@ static void gsi_assert_bh(void *opaque)
     }
 }
 
-void xen_evtchn_create(void)
+void xen_evtchn_create(unsigned int nr_gsis, qemu_irq *system_gsis)
 {
     XenEvtchnState *s = XEN_EVTCHN(sysbus_create_simple(TYPE_XEN_EVTCHN,
                                                         -1, NULL));
@@ -309,8 +312,19 @@ void xen_evtchn_create(void)
     qemu_mutex_init(&s->port_lock);
     s->gsi_bh = aio_bh_new(qemu_get_aio_context(), gsi_assert_bh, s);
 
-    for (i = 0; i < IOAPIC_NUM_PINS; i++) {
-        sysbus_init_irq(SYS_BUS_DEVICE(s), &s->gsis[i]);
+    /*
+     * These are the *output* GSI from event channel support, for
+     * signalling CPU0's events via GSI or PCI INTx instead of the
+     * per-CPU vector. We create a *set* of irqs and connect one to
+     * each of the system GSIs which were passed in from the platform
+     * code, and then just trigger the right one as appropriate from
+     * xen_evtchn_set_callback_level().
+     */
+    s->nr_callback_gsis = nr_gsis;
+    s->callback_gsis = g_new0(qemu_irq, nr_gsis);
+    for (i = 0; i < nr_gsis; i++) {
+        sysbus_init_irq(SYS_BUS_DEVICE(s), &s->callback_gsis[i]);
+        sysbus_connect_irq(SYS_BUS_DEVICE(s), i, system_gsis[i]);
     }
 
     /*
@@ -335,20 +349,6 @@ void xen_evtchn_create(void)
     xen_evtchn_ops = &emu_evtchn_backend_ops;
 }
 
-void xen_evtchn_connect_gsis(qemu_irq *system_gsis)
-{
-    XenEvtchnState *s = xen_evtchn_singleton;
-    int i;
-
-    if (!s) {
-        return;
-    }
-
-    for (i = 0; i < IOAPIC_NUM_PINS; i++) {
-        sysbus_connect_irq(SYS_BUS_DEVICE(s), i, system_gsis[i]);
-    }
-}
-
 static void xen_evtchn_register_types(void)
 {
     type_register_static(&xen_evtchn_info);
@@ -429,8 +429,8 @@ void xen_evtchn_set_callback_level(int level)
         return;
     }
 
-    if (s->callback_gsi && s->callback_gsi < IOAPIC_NUM_PINS) {
-        qemu_set_irq(s->gsis[s->callback_gsi], level);
+    if (s->callback_gsi && s->callback_gsi < s->nr_callback_gsis) {
+        qemu_set_irq(s->callback_gsis[s->callback_gsi], level);
         if (level) {
             /* Ensure the vCPU polls for deassertion */
             kvm_xen_set_callback_asserted();
diff --git a/hw/i386/kvm/xen_evtchn.h b/hw/i386/kvm/xen_evtchn.h
index bfb67ac2bc..b740acfc0d 100644
--- a/hw/i386/kvm/xen_evtchn.h
+++ b/hw/i386/kvm/xen_evtchn.h
@@ -16,10 +16,9 @@
 
 typedef uint32_t evtchn_port_t;
 
-void xen_evtchn_create(void);
+void xen_evtchn_create(unsigned int nr_gsis, qemu_irq *system_gsis);
 int xen_evtchn_soft_reset(void);
 int xen_evtchn_set_callback_param(uint64_t param);
-void xen_evtchn_connect_gsis(qemu_irq *system_gsis);
 void xen_evtchn_set_callback_level(int level);
 
 int xen_evtchn_set_port(uint16_t port);
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 1489abf010..25584cb8f3 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1319,7 +1319,10 @@ void pc_basic_device_init(struct PCMachineState *pcms,
 
 #ifdef CONFIG_XEN_EMU
     if (xen_mode == XEN_EMULATE) {
-        xen_evtchn_connect_gsis(gsi);
+        xen_overlay_create();
+        xen_evtchn_create(IOAPIC_NUM_PINS, gsi);
+        xen_gnttab_create();
+        xen_xenstore_create();
         if (pcms->bus) {
             pci_create_simple(pcms->bus, -1, "xen-platform");
         }
@@ -1868,14 +1871,6 @@ static void pc_machine_initfn(Object *obj)
 
 int pc_machine_kvm_type(MachineState *machine, const char *kvm_type)
 {
-#ifdef CONFIG_XEN_EMU
-    if (xen_mode == XEN_EMULATE) {
-        xen_overlay_create();
-        xen_evtchn_create();
-        xen_gnttab_create();
-        xen_xenstore_create();
-    }
-#endif
     return 0;
 }
 
-- 
2.34.1


Re: [PATCH] hw/xen: Simplify emulated Xen platform init
Posted by Paul Durrant 2 years, 3 months ago
On 10/03/2023 10:54, David Woodhouse wrote:
> From: David Woodhouse <dwmw@amazon.co.uk>
> 
> I initially put the basic platform init (overlay pages, grant tables,
> event channels) into mc->kvm_type because that was the earliest place
> that could sensibly test for xen_mode==XEN_EMULATE.
> 
> The intent was to do this early enough that we could then initialise the
> XenBus and other parts which would have depended on them, from a generic
> location for both Xen and KVM/Xen in the PC-specific code, as seen in
> https://lore.kernel.org/qemu-devel/20230116221919.1124201-16-dwmw2@infradead.org/
> 
> However, then the Xen on Arm patches came along, and *they* wanted to
> do the XenBus init from a 'generic' Xen-specific location instead:
> https://lore.kernel.org/qemu-devel/20230210222729.957168-4-sstabellini@kernel.org/
> 
> Since there's no generic location that covers all three, I conceded to
> do it for XEN_EMULATE mode in pc_basic_devices_init().
> 
> And now there's absolutely no point in having some of the platform init
> done from pc_machine_kvm_type(); we can move it all up to live in a
> single place in pc_basic_devices_init(). This has the added benefit that
> we can drop the separate xen_evtchn_connect_gsis() function completely,
> and pass just the system GSIs in directly to xen_evtchn_create().
> 
> While I'm at it, it does no harm to explicitly pass in the *number* of
> said GSIs, because it does make me twitch a bit to pass an array of
> impicit size. During the lifetime of the KVM/Xen patchset, that had
> already changed (albeit just cosmetically) from GSI_NUM_PINS to
> IOAPIC_NUM_PINS.
> 
> And document a bit better that this is for the *output* GSI for raising
> CPU0's events when the per-CPU vector isn't available. The fact that
> we create a whole set of them and then only waggle the one we're told
> to, instead of having a single output and only *connecting* it to the
> GSI that it should be connected to, is still non-intuitive for me.
> 
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> ---
>   hw/i386/kvm/xen_evtchn.c | 40 ++++++++++++++++++++--------------------
>   hw/i386/kvm/xen_evtchn.h |  3 +--
>   hw/i386/pc.c             | 13 ++++---------
>   3 files changed, 25 insertions(+), 31 deletions(-)
> 

Reviewed-by: Paul Durrant <paul@xen.org>