src/qemu/qemu_domain_address.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)
We can't retrieve the isolation group of a device that's
not present in the system. However, it's very common for
VFs to be created late in the boot, so they might not be
present yet when libvirtd starts, which would cause the
guests using them to disappear.
If a PCI address has already been set for the host device
in question, we can assume that it either existed at some
point in the past or the user is assigning addresses
manually: in both cases, we should not error out and just
ignore the (temporary) failure.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1484254
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
---
src/qemu/qemu_domain_address.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/src/qemu/qemu_domain_address.c b/src/qemu/qemu_domain_address.c
index 16bf0cdf9..7f12f186b 100644
--- a/src/qemu/qemu_domain_address.c
+++ b/src/qemu/qemu_domain_address.c
@@ -1012,6 +1012,18 @@ qemuDomainFillDeviceIsolationGroup(virDomainDefPtr def,
tmp = virPCIDeviceAddressGetIOMMUGroupNum(hostAddr);
if (tmp < 0) {
+ /* If there's already a PCI address assigned to the device
+ * we move on instead of erroring out.
+ *
+ * This means we still don't allow non-existing host
+ * devices to be assigned to guests; however, if the host
+ * device existed at some point in the past but no longer
+ * does, which can happen very easily when dealing with VFs,
+ * we prevent the guest from disappearing and give the user
+ * an opportunity to edit its configuration */
+ if (virDeviceInfoPCIAddressPresent(info))
+ goto skip;
+
virReportError(VIR_ERR_INTERNAL_ERROR,
_("Can't look up isolation group for host device "
"%04x:%02x:%02x.%x"),
--
2.13.5
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
On Thu, Aug 24, 2017 at 05:12:20PM +0200, Andrea Bolognani wrote: >We can't retrieve the isolation group of a device that's >not present in the system. However, it's very common for >VFs to be created late in the boot, so they might not be >present yet when libvirtd starts, which would cause the >guests using them to disappear. > >If a PCI address has already been set for the host device >in question, we can assume that it either existed at some >point in the past or the user is assigning addresses >manually: in both cases, we should not error out and just >ignore the (temporary) failure. > >Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1484254 > >Signed-off-by: Andrea Bolognani <abologna@redhat.com> >--- > src/qemu/qemu_domain_address.c | 12 ++++++++++++ > 1 file changed, 12 insertions(+) > Not an expert on this topic, but I did my due diligence and it makes sense to me, so Reviewed-by: Martin Kletzander <mkletzan@redhat.com> if that's enough for you. >diff --git a/src/qemu/qemu_domain_address.c b/src/qemu/qemu_domain_address.c >index 16bf0cdf9..7f12f186b 100644 >--- a/src/qemu/qemu_domain_address.c >+++ b/src/qemu/qemu_domain_address.c >@@ -1012,6 +1012,18 @@ qemuDomainFillDeviceIsolationGroup(virDomainDefPtr def, > tmp = virPCIDeviceAddressGetIOMMUGroupNum(hostAddr); > > if (tmp < 0) { >+ /* If there's already a PCI address assigned to the device >+ * we move on instead of erroring out. >+ * >+ * This means we still don't allow non-existing host >+ * devices to be assigned to guests; however, if the host >+ * device existed at some point in the past but no longer >+ * does, which can happen very easily when dealing with VFs, >+ * we prevent the guest from disappearing and give the user >+ * an opportunity to edit its configuration */ >+ if (virDeviceInfoPCIAddressPresent(info)) >+ goto skip; >+ > virReportError(VIR_ERR_INTERNAL_ERROR, > _("Can't look up isolation group for host device " > "%04x:%02x:%02x.%x"), >-- >2.13.5 > >-- >libvir-list mailing list >libvir-list@redhat.com >https://www.redhat.com/mailman/listinfo/libvir-list -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Fri, Aug 25, 2017 at 11:41:22 +0200, Martin Kletzander wrote: > On Thu, Aug 24, 2017 at 05:12:20PM +0200, Andrea Bolognani wrote: > > We can't retrieve the isolation group of a device that's > > not present in the system. However, it's very common for > > VFs to be created late in the boot, so they might not be > > present yet when libvirtd starts, which would cause the > > guests using them to disappear. > > > > If a PCI address has already been set for the host device > > in question, we can assume that it either existed at some > > point in the past or the user is assigning addresses > > manually: in both cases, we should not error out and just > > ignore the (temporary) failure. > > > > Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1484254 > > > > Signed-off-by: Andrea Bolognani <abologna@redhat.com> > > --- > > src/qemu/qemu_domain_address.c | 12 ++++++++++++ > > 1 file changed, 12 insertions(+) > > > > Not an expert on this topic, but I did my due diligence and it makes > sense to me, so > > Reviewed-by: Martin Kletzander <mkletzan@redhat.com> Disclamer: I'm not objecting to the ACK, it's clearly better than it was. > > if that's enough for you. > > > diff --git a/src/qemu/qemu_domain_address.c b/src/qemu/qemu_domain_address.c > > index 16bf0cdf9..7f12f186b 100644 > > --- a/src/qemu/qemu_domain_address.c > > +++ b/src/qemu/qemu_domain_address.c > > @@ -1012,6 +1012,18 @@ qemuDomainFillDeviceIsolationGroup(virDomainDefPtr def, > > tmp = virPCIDeviceAddressGetIOMMUGroupNum(hostAddr); > > > > if (tmp < 0) { > > + /* If there's already a PCI address assigned to the device > > + * we move on instead of erroring out. > > + * > > + * This means we still don't allow non-existing host > > + * devices to be assigned to guests; however, if the host > > + * device existed at some point in the past but no longer > > + * does, which can happen very easily when dealing with VFs, > > + * we prevent the guest from disappearing and give the user > > + * an opportunity to edit its configuration */ > > + if (virDeviceInfoPCIAddressPresent(info)) > > + goto skip; > > + > > virReportError(VIR_ERR_INTERNAL_ERROR, > > _("Can't look up isolation group for host device " > > "%04x:%02x:%02x.%x"), Why on earth is this in the domain address callback? That means that it's filled on parse. That's silly for a thing like the isolation group. The isolation group is necessary only when you are going to start the VM so only then it should be determined. That would prevent a bug like this altogether. Also there's PCI hotplug... -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Fri, 2017-08-25 at 12:12 +0200, Peter Krempa wrote: > > > @@ -1012,6 +1012,18 @@ qemuDomainFillDeviceIsolationGroup(virDomainDefPtr def, > > > tmp = virPCIDeviceAddressGetIOMMUGroupNum(hostAddr); > > > > > > if (tmp < 0) { > > > + /* If there's already a PCI address assigned to the device > > > + * we move on instead of erroring out. > > > + * > > > + * This means we still don't allow non-existing host > > > + * devices to be assigned to guests; however, if the host > > > + * device existed at some point in the past but no longer > > > + * does, which can happen very easily when dealing with VFs, > > > + * we prevent the guest from disappearing and give the user > > > + * an opportunity to edit its configuration */ > > > + if (virDeviceInfoPCIAddressPresent(info)) > > > + goto skip; > > > + > > > virReportError(VIR_ERR_INTERNAL_ERROR, > > > _("Can't look up isolation group for host device " > > > "%04x:%02x:%02x.%x"), > > Why on earth is this in the domain address callback? That means that > it's filled on parse. That's silly for a thing like the isolation group. > > The isolation group is necessary only when you are going to start the VM > so only then it should be determined. That would prevent a bug like this > altogether. > > Also there's PCI hotplug... Well, the whole point of introducing isolation groups was so that isolation constraint would be respected when assigning PCI addresses, and guest startup is way too late for that. If host devices are not isolated correctly on pSeries guests, depending on the situation the result can be either degraded error handling (undesirable, but maybe not too bad) or the guest failing to start (pretty bad). So picking the right PCI addresses is critical, and we need to know the isolation group in order to do it. Note that these constraints only apply to pSeries, and accordingly the code above only runs for such guests. That said, I hadn't realized that configuring guests to use devices that are not yet or no longer available on the host was supposed to work, but since that's the case the current behavior is basically a regression because it prevents the users from doing something that they would have been able to do in the past. I'll post a v2 that changes the behavior so that we fall back to the old ways (no isolation) for devices that are not available on the host at the time the guest is configured. Thanks for pointing out my mistake :) -- Andrea Bolognani / Red Hat / Virtualization -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
© 2016 - 2025 Red Hat, Inc.