include/libvirt/virterror.h | 1 + src/access/viraccessdriverpolkit.c | 2 +- src/access/viraccessmanager.c | 2 +- src/conf/domain_conf.c | 11 ++++++++--- src/datatypes.h | 30 ++++++++++++++++++++++++++---- src/libvirt-domain.c | 6 ++++++ src/libvirt.c | 1 + src/util/virbuffer.c | 4 ++-- src/util/virconf.c | 6 ++++-- src/util/virerror.c | 10 +++++++++- src/util/virerror.h | 15 ++++++++++----- src/util/virkeyfile.c | 6 ++++-- src/util/virxml.c | 2 +- tools/virsh-domain-monitor.c | 2 ++ tools/vsh.c | 3 +++ 15 files changed, 79 insertions(+), 22 deletions(-)
Background ========== We have plenty of features in libvirt, some of which were designed at a time when the virtualization story was much more straightforward than the multi-architecture, multi-hypervisor, multi-machine world we currently live in and, while we have found ways to keep the APIs chugging along, the result is sometimes somewhat confusing for users and application developers, as well as requiring libvirt developers themselves to spend quite a bit of collective time working around decisions that, in hindsight, turn out to have been less than fortunate. Two concrete examples are considered here: one is the virConnectNumOfDomains() API which, while known to be racy and having non-racy alternatives, can still be used by developers without getting any kind of warning in the process; the other one is the ability to define a domain without specifying the machine type, which is becoming increasingly problematic now with some x86_64 features being limited to q35 and downstreams looking to push for its adoption, as well as being a manifestation of the more general problem of libvirt's default being sometimes too conservative and at odds with the existence of slimmed-down QEMU binaries being built with reducing the total attack surface in mind. Having a proper deprecation story in libvirt would allow us to point users and developers towards the recommended solution in each case, be it using a different API or querying libosinfo for information; after a generous grace period, we could then remove the problematic functionality altogether. This would be a more conservative version of the process we already have in place for dropping support for older QEMU releases, which recently has allowed us to ax massive chunks of effectively dead code and simplify parts of libvirt quite significantly. This series explores one possible approach to the problem and aims to spark project-wide discussion around the topic. Further work ============ * Fix the known issues listed below as well as all not-yet-known issues that will undoubtably surface through discussion :) * Introduce a mechanism to catch use of deprecated APIs at build time, similar to GLib's G_DISABLE_DEPRECATED, to help application developers proactively move off problematic APIs. * Create a formal deprecation policy with well-defined rules and time scales in the spirit of the existing one covering our relationship with QEMU. Know issues =========== * For the more granular (and more interesting) type of deprecation shown in patch 6/7, warnings are not being reported back to the client as expected. I believe this is caused by the RPC code looking for either a failure, in which case the virError is transmitted, or a success, in which case the actual return value is: we'll have to figure out a way for the error to travel across the wire regardless of whether or not the API call was ultimately successful if we want clients to actually receive warnings when non-local drivers are involved. Andrea Bolognani (7): util: Add 'level' argument to virReportErrorHelper() util: Introduce virReportWarning() tools: Print warnings in virsh util: Introduce VIR_ERR_DEPRECATED_FEATURE Deprecate virConnectNumOfDomains() Deprecate missing machine type in virDomainDefineXMLFlags() tools: Force virsh to use deprecated features include/libvirt/virterror.h | 1 + src/access/viraccessdriverpolkit.c | 2 +- src/access/viraccessmanager.c | 2 +- src/conf/domain_conf.c | 11 ++++++++--- src/datatypes.h | 30 ++++++++++++++++++++++++++---- src/libvirt-domain.c | 6 ++++++ src/libvirt.c | 1 + src/util/virbuffer.c | 4 ++-- src/util/virconf.c | 6 ++++-- src/util/virerror.c | 10 +++++++++- src/util/virerror.h | 15 ++++++++++----- src/util/virkeyfile.c | 6 ++++-- src/util/virxml.c | 2 +- tools/virsh-domain-monitor.c | 2 ++ tools/vsh.c | 3 +++ 15 files changed, 79 insertions(+), 22 deletions(-) -- 2.17.1 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Tue, Oct 02, 2018 at 16:14:39 +0200, Andrea Bolognani wrote: > Background > ========== [...] > Two concrete examples are considered here: one is the > virConnectNumOfDomains() API which, while known to be racy and having > non-racy alternatives, can still be used by developers without > getting any kind of warning in the process; the other one is the > ability to define a domain without specifying the machine type, which Okay, but for these particular ones we could do a compile time warning. Not that we ever can remove them though. > is becoming increasingly problematic now with some x86_64 features > being limited to q35 and downstreams looking to push for its Well so I presume the issue is that we can't change to q35 as the default which was vigorously discussed. IMHO the current approach still allows users shoot themselves into the foot (usually with less spectacular results than when the drivers for devices don't work anymore) but still: If you use a libvirt XML without a machine type but with features which require a certain machine type (e.g. cpu and memory hotplug) and define it on a host with new-enough qemu everything will work. If you do the same on a host with older qemu your VM will not start at all (granted, here you get a libvirt error as this will mostly result into missing capabilities. Our documentation states in multiple places that fields not populated by the user are mostly hypervisor dependant what the default will become. In my opinion the machine type is similarly hypervisor dependant, and in this case the "hypervisor" for the libvirt-qemu infrastructure also involves libvirt's qemu driver. > adoption, as well as being a manifestation of the more general > problem of libvirt's default being sometimes too conservative and at > odds with the existence of slimmed-down QEMU binaries being built > with reducing the total attack surface in mind. If your qemu binary does not support certain feature, libvirt will know it. We have capability detection and for that matter we also have machine type detection (we fill in the default according to the canonical name). In such case we are very welcome to choose anything which will satisfy the default. I'm afraid though that the downstreams you are mentioning can't in fact fully drop i440fx for some reason and thus are trying to weasel around by attempting to make us change the default. This I don't support as a worthy goal. If they want to slim down qemu, they are welcome and we can pick a suitable different default. We can also consider using what qemu provides as a default. Users will get the default they asked for. They always can specify their specific type if their software is not happy with it. That is in the end the reason for libvirt recording the machine type in the XML for it's use as we don't want it to change. Not specifying it is a invitation to change it. -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Tue, Oct 02, 2018 at 05:19:30PM +0200, Peter Krempa wrote: > On Tue, Oct 02, 2018 at 16:14:39 +0200, Andrea Bolognani wrote: > > Background > > ========== > > [...] > > > Two concrete examples are considered here: one is the > > virConnectNumOfDomains() API which, while known to be racy and having > > non-racy alternatives, can still be used by developers without > > getting any kind of warning in the process; the other one is the > > ability to define a domain without specifying the machine type, which > > Okay, but for these particular ones we could do a compile time warning. > Not that we ever can remove them though. Definitely agree with Peter, having a runtime warning for issue that you cannot change in runtime situation is IMHO wrong. Even though we cannot remove any API the deprecation warning can be still useful because it can suggest better API to use instead of the old one. > > is becoming increasingly problematic now with some x86_64 features > > being limited to q35 and downstreams looking to push for its > > Well so I presume the issue is that we can't change to q35 as the > default which was vigorously discussed. > > IMHO the current approach still allows users shoot themselves into the > foot (usually with less spectacular results than when the drivers for > devices don't work anymore) but still: > > If you use a libvirt XML without a machine type but with features which > require a certain machine type (e.g. cpu and memory hotplug) and define > it on a host with new-enough qemu everything will work. If you do the > same on a host with older qemu your VM will not start at all (granted, > here you get a libvirt error as this will mostly result into missing > capabilities. > > Our documentation states in multiple places that fields not populated by > the user are mostly hypervisor dependant what the default will become. > > In my opinion the machine type is similarly hypervisor dependant, and in > this case the "hypervisor" for the libvirt-qemu infrastructure also > involves libvirt's qemu driver. > > > adoption, as well as being a manifestation of the more general > > problem of libvirt's default being sometimes too conservative and at > > odds with the existence of slimmed-down QEMU binaries being built > > with reducing the total attack surface in mind. > > If your qemu binary does not support certain feature, libvirt will know > it. We have capability detection and for that matter we also have > machine type detection (we fill in the default according to the > canonical name). In such case we are very welcome to choose anything > which will satisfy the default. > > I'm afraid though that the downstreams you are mentioning can't in fact > fully drop i440fx for some reason and thus are trying to weasel around > by attempting to make us change the default. This I don't support as a > worthy goal. If they want to slim down qemu, they are welcome and we can > pick a suitable different default. > > We can also consider using what qemu provides as a default. Users will > get the default they asked for. They always can specify their specific > type if their software is not happy with it. That is in the end the > reason for libvirt recording the machine type in the XML for it's use as > we don't want it to change. Not specifying it is a invitation to change > it. Again I have to agree here. I don't see any strong reason why we cannot change default machine type. As Peter wrote, if the only reason would be that downstream want's to remove it, well they should change it in downstream as well. But in this case I thing that it's time to move to Q35 as a default machine type, it's supported by majority of recent OSes and it would be sensible default. If some application/user depends on a specific machine type they should ask for it and most likely they are already doing it. We have a lot of attributes in our XML that are usually filled in by default and that default already depends on QEMU version or features that QEMU offers. Pavel -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
I feel like I've addressed most of your points in my reply to Peter's message so I won't repeat the same arguments and will snip quite aggressively, but please let me know if I've missed anything. On Tue, 2018-10-02 at 17:38 +0200, Pavel Hrdina wrote: [...] > But in this case I thing that it's time > to move to Q35 as a default machine type, it's supported by majority of > recent OSes and it would be sensible default. > > If some application/user depends on a specific machine type they should > ask for it and most likely they are already doing it. Even if we agreed that q35 is a better default than i440fx these days, changing libvirt's default would be, in my opinion, merely papering over the real issue, which is: there shouldn't have been a default to begin with. Applications should either choose a specific machine type because they know it's the only one the fits their needs (as you suggest) or, if they don't have such specialized needs, ask libosinfo to figure one out based on information such as the OS that will ultimately run in the guest; the same is true for network interfaces, storage controllers, graphics adapters and the like. The idea that you can have a single "good" default is one that was perhaps true years ago, but it's certainly not true today, and holding on to it only causes pain for users and developers alike. > We have a lot of > attributes in our XML that are usually filled in by default and that > default already depends on QEMU version or features that QEMU offers. We're actually awfully inconsistent in this, since we base some of the choices on QEMU features and have hardcoded values for others; in some particularly neat cases, we might even end up applying *both* approaches :) -- Andrea Bolognani / Red Hat / Virtualization -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On 10/02/2018 05:38 PM, Pavel Hrdina wrote: > On Tue, Oct 02, 2018 at 05:19:30PM +0200, Peter Krempa wrote: >> On Tue, Oct 02, 2018 at 16:14:39 +0200, Andrea Bolognani wrote: >>> Background >>> ========== >> >> [...] >> >>> Two concrete examples are considered here: one is the >>> virConnectNumOfDomains() API which, while known to be racy and having >>> non-racy alternatives, can still be used by developers without >>> getting any kind of warning in the process; the other one is the >>> ability to define a domain without specifying the machine type, which >> >> Okay, but for these particular ones we could do a compile time warning. >> Not that we ever can remove them though. > > Definitely agree with Peter, having a runtime warning for issue that > you cannot change in runtime situation is IMHO wrong. Even though we > cannot remove any API the deprecation warning can be still useful > because it can suggest better API to use instead of the old one. So two of our biggest consumers use python bindings. I'm failing to see how a compile time warning would help here. E.g. if an app is using "conn.conn.numOfDefinedDomains()" we could call virConnectListAllDomains() with appropriate flags. Except we can't, because then the app would be unable to talk to older libvirtds where the procedure doesn't exist. In other words, if our goal is to make users switch to newer versions of functions then we might have to find a better solution (than compilte time warnings). Runtime warnings would work with python though. Michal -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Wed, Oct 03, 2018 at 09:13:00AM +0200, Michal Privoznik wrote: > On 10/02/2018 05:38 PM, Pavel Hrdina wrote: > > On Tue, Oct 02, 2018 at 05:19:30PM +0200, Peter Krempa wrote: > >> On Tue, Oct 02, 2018 at 16:14:39 +0200, Andrea Bolognani wrote: > >>> Background > >>> ========== > >> > >> [...] > >> > >>> Two concrete examples are considered here: one is the > >>> virConnectNumOfDomains() API which, while known to be racy and having > >>> non-racy alternatives, can still be used by developers without > >>> getting any kind of warning in the process; the other one is the > >>> ability to define a domain without specifying the machine type, which > >> > >> Okay, but for these particular ones we could do a compile time warning. > >> Not that we ever can remove them though. > > > > Definitely agree with Peter, having a runtime warning for issue that > > you cannot change in runtime situation is IMHO wrong. Even though we > > cannot remove any API the deprecation warning can be still useful > > because it can suggest better API to use instead of the old one. > > So two of our biggest consumers use python bindings. I'm failing to see > how a compile time warning would help here. E.g. if an app is using > "conn.conn.numOfDefinedDomains()" we could call > virConnectListAllDomains() with appropriate flags. Except we can't, > because then the app would be unable to talk to older libvirtds where > the procedure doesn't exist. > > In other words, if our goal is to make users switch to newer versions of > functions then we might have to find a better solution (than compilte > time warnings). Runtime warnings would work with python though. As John's pointed out in his reply, given our API contract, we're providing devs of apps built on top of libvirt with certain guarantees, so it's fair to assume they'd just ignore the warnings, since we can't break our contract and their product keeps working anyway, so "why bother changing it", right? Unfortunately, I feel it's a common practice not to change anything unless it stops working and there might even be very legitimate reasons for such a practice, like simply not enough man power. Therefore I myself don't see any mass adoption/switch to the new preferred/safe versions of our APIs any time soon. The only effect you'd IMHO achieve by having runtime warnings is log pollution. Having said that, I don't have any better ideas on how we could achieve this brave goal without using the obvious drastic measures. Erik -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Wed, 2018-10-03 at 10:44 +0200, Erik Skultety wrote: > On Wed, Oct 03, 2018 at 09:13:00AM +0200, Michal Privoznik wrote: > > On 10/02/2018 05:38 PM, Pavel Hrdina wrote: > > > Definitely agree with Peter, having a runtime warning for issue that > > > you cannot change in runtime situation is IMHO wrong. Even though we > > > cannot remove any API the deprecation warning can be still useful > > > because it can suggest better API to use instead of the old one. > > > > So two of our biggest consumers use python bindings. I'm failing to see > > how a compile time warning would help here. The compile time warnings would cause bindings developers to find out APIs are being deprecated as they compile against newer libvirt the same way applications developers linking against the C library would; additionally, they could use that information as a motivation to deprecate *their* APIs in a language-appropriate manner. So, as mentioned elsewhere, I believe we should have ways to warn about deprecated features *both* at compile time and at runtime. > > In other words, if our goal is to make users switch to newer versions of > > functions then we might have to find a better solution (than compilte > > time warnings). Runtime warnings would work with python though. > > As John's pointed out in his reply, given our API contract, we're providing > devs of apps built on top of libvirt with certain guarantees, so it's fair to > assume they'd just ignore the warnings, since we can't break our contract and > their product keeps working anyway, so "why bother changing it", right? > Unfortunately, I feel it's a common practice not to change anything unless it > stops working and there might even be very legitimate reasons for such a > practice, like simply not enough man power. > Therefore I myself don't see any mass adoption/switch to the new preferred/safe > versions of our APIs any time soon. The only effect you'd IMHO achieve by having > runtime warnings is log pollution. Having said that, I don't have any better > ideas on how we could achieve this brave goal without using the obvious drastic > measures. Deprecation warnings can still help adoption: if tools start spewing them, users will notice and report bugs, eventually annoying developers into migrating to non-deprecated APIs :) Of course this works better for command line applications like virt-install than for GUI applications like virt-manager, but it's useful nonetheless. That said, in order to reap real benefits, deprecating features should go hand in hand with having a well-defined support policy that includes a timeline describing how, after a generous grace period, the functionality will actually be dropped altogether. -- Andrea Bolognani / Red Hat / Virtualization -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On 10/04/2018 01:03 PM, Andrea Bolognani wrote: > On Wed, 2018-10-03 at 10:44 +0200, Erik Skultety wrote: >> On Wed, Oct 03, 2018 at 09:13:00AM +0200, Michal Privoznik wrote: >>> On 10/02/2018 05:38 PM, Pavel Hrdina wrote: >>>> Definitely agree with Peter, having a runtime warning for issue that >>>> you cannot change in runtime situation is IMHO wrong. Even though we >>>> cannot remove any API the deprecation warning can be still useful >>>> because it can suggest better API to use instead of the old one. >>> >>> So two of our biggest consumers use python bindings. I'm failing to see >>> how a compile time warning would help here. > > The compile time warnings would cause bindings developers to find > out APIs are being deprecated as they compile against newer libvirt > the same way applications developers linking against the C library > would; additionally, they could use that information as a motivation > to deprecate *their* APIs in a language-appropriate manner. > > So, as mentioned elsewhere, I believe we should have ways to warn > about deprecated features *both* at compile time and at runtime. > >>> In other words, if our goal is to make users switch to newer versions of >>> functions then we might have to find a better solution (than compilte >>> time warnings). Runtime warnings would work with python though. >> >> As John's pointed out in his reply, given our API contract, we're providing >> devs of apps built on top of libvirt with certain guarantees, so it's fair to >> assume they'd just ignore the warnings, since we can't break our contract and >> their product keeps working anyway, so "why bother changing it", right? >> Unfortunately, I feel it's a common practice not to change anything unless it >> stops working and there might even be very legitimate reasons for such a >> practice, like simply not enough man power. >> Therefore I myself don't see any mass adoption/switch to the new preferred/safe >> versions of our APIs any time soon. The only effect you'd IMHO achieve by having >> runtime warnings is log pollution. Having said that, I don't have any better >> ideas on how we could achieve this brave goal without using the obvious drastic >> measures. > > Deprecation warnings can still help adoption: if tools start spewing > them, users will notice and report bugs, eventually annoying > developers into migrating to non-deprecated APIs :) This was my reasoning too. We have to have run time warnings. Not all languages are compiled. Even though I have no statistics, I am confident to say that oVirt and OpenStack are our biggest consumers and they use python bindings. Compile time warnings won't help us there, rather the opposite - put pressure on us when writing them. > > Of course this works better for command line applications like > virt-install than for GUI applications like virt-manager, but it's > useful nonetheless. > > That said, in order to reap real benefits, deprecating features > should go hand in hand with having a well-defined support policy > that includes a timeline describing how, after a generous grace > period, the functionality will actually be dropped altogether. > /me grabbing some popcorn and soda O:-) So this was going to be my question. What is the end goal we are trying to achieve? If it is to advertise we have a better API to do the job, we can document it - just like we are doing so far. Also the term "better API" depends on individual use case heavily. If I have a small, one host virtualization solution, and I know nobody else is starting up/shutting down domains, I might use virDomainGetNames() just fine. Worse, if I have a working solution and I upgrade libvirt I'll start seeing warning all of a sudden. This kind of breaks our promise of stability. If the goal is to make mgmt apps switch to APIs we consider better, then we definitely need runtime warnings. I don't see how apps written in dynamic, not compiled languages can detect they're using deprecated API without running themselves. And surely nor developers, nor QEs have 100% code coverage tests. And for dropping functionality - we can not do that. Period. Michal -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Thu, Oct 04, 2018 at 01:16:46PM +0200, Michal Privoznik wrote: > On 10/04/2018 01:03 PM, Andrea Bolognani wrote: > > On Wed, 2018-10-03 at 10:44 +0200, Erik Skultety wrote: > >> On Wed, Oct 03, 2018 at 09:13:00AM +0200, Michal Privoznik wrote: > >>> On 10/02/2018 05:38 PM, Pavel Hrdina wrote: > >>>> Definitely agree with Peter, having a runtime warning for issue that > >>>> you cannot change in runtime situation is IMHO wrong. Even though we > >>>> cannot remove any API the deprecation warning can be still useful > >>>> because it can suggest better API to use instead of the old one. > >>> > >>> So two of our biggest consumers use python bindings. I'm failing to see > >>> how a compile time warning would help here. > > > > The compile time warnings would cause bindings developers to find > > out APIs are being deprecated as they compile against newer libvirt > > the same way applications developers linking against the C library > > would; additionally, they could use that information as a motivation > > to deprecate *their* APIs in a language-appropriate manner. I completely agree with this. > > So, as mentioned elsewhere, I believe we should have ways to warn > > about deprecated features *both* at compile time and at runtime. But definitely not with this paragraph. See below. > >>> In other words, if our goal is to make users switch to newer versions of > >>> functions then we might have to find a better solution (than compilte > >>> time warnings). Runtime warnings would work with python though. > >> > >> As John's pointed out in his reply, given our API contract, we're providing > >> devs of apps built on top of libvirt with certain guarantees, so it's fair to > >> assume they'd just ignore the warnings, since we can't break our contract and > >> their product keeps working anyway, so "why bother changing it", right? > >> Unfortunately, I feel it's a common practice not to change anything unless it > >> stops working and there might even be very legitimate reasons for such a > >> practice, like simply not enough man power. > >> Therefore I myself don't see any mass adoption/switch to the new preferred/safe > >> versions of our APIs any time soon. The only effect you'd IMHO achieve by having > >> runtime warnings is log pollution. Having said that, I don't have any better > >> ideas on how we could achieve this brave goal without using the obvious drastic > >> measures. > > > > Deprecation warnings can still help adoption: if tools start spewing > > them, users will notice and report bugs, eventually annoying > > developers into migrating to non-deprecated APIs :) > > This was my reasoning too. We have to have run time warnings. Not all > languages are compiled. Even though I have no statistics, I am confident > to say that oVirt and OpenStack are our biggest consumers and they use > python bindings. Compile time warnings won't help us there, rather the > opposite - put pressure on us when writing them. You should not mix libvirt and libvirt-python, two separate projects and two separate issues. Even if that is true, we should still handle deprecation warnings for C code as well and that should be compile time only. See bellow. > > Of course this works better for command line applications like > > virt-install than for GUI applications like virt-manager, but it's > > useful nonetheless. > > > > That said, in order to reap real benefits, deprecating features > > should go hand in hand with having a well-defined support policy > > that includes a timeline describing how, after a generous grace > > period, the functionality will actually be dropped altogether. > > > > /me grabbing some popcorn and soda O:-) > > So this was going to be my question. What is the end goal we are trying > to achieve? If it is to advertise we have a better API to do the job, we > can document it - just like we are doing so far. Also the term "better > API" depends on individual use case heavily. If I have a small, one host > virtualization solution, and I know nobody else is starting up/shutting > down domains, I might use virDomainGetNames() just fine. > > Worse, if I have a working solution and I upgrade libvirt I'll start > seeing warning all of a sudden. This kind of breaks our promise of > stability. > > If the goal is to make mgmt apps switch to APIs we consider better, then > we definitely need runtime warnings. I don't see how apps written in > dynamic, not compiled languages can detect they're using deprecated API > without running themselves. And surely nor developers, nor QEs have 100% > code coverage tests. Runtime deprecation warning in C code is wrong in my opinion and even though there are some project doing it we should not do it. Deprecation warning is for developers, not for users. Yes, adding deprecation warning into libvirt will not solve the goal for oVirt nor OpenStack, but this should be handled in the libvirt-python binding code using language-specific manner as Andrea mentioned above. In python it happens to be a runtime warning, warnings.DeprecatedWarning, however, that warning is by default ignored unless it's triggered from __main__, so developers using libvirt-python API would need to enable that warning for their test-suite, that's a python way how to do it. One of the reasons why it's not enabled by default is that as user you have no way how to modify the module code and therefore you should not get the warning. From user point of view everything works how it should and I don't care whether my program uses something deprecated or not. It's unnecessary noise for the user. Again, we should not introduce runtime deprecation warnings in C code. Pavel > > And for dropping functionality - we can not do that. Period. +1, for everyone questioning this please read [1], thanks. Pavel [1] <https://libvirt.org/support.html> -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Thu, 2018-10-04 at 13:36 +0200, Pavel Hrdina wrote: > Runtime deprecation warning in C code is wrong in my opinion and even > though there are some project doing it we should not do it. > Deprecation warning is for developers, not for users. > > Yes, adding deprecation warning into libvirt will not solve the goal for > oVirt nor OpenStack, but this should be handled in the libvirt-python > binding code using language-specific manner as Andrea mentioned above. > In python it happens to be a runtime warning, > warnings.DeprecatedWarning, however, that warning is by default ignored > unless it's triggered from __main__, so developers using libvirt-python > API would need to enable that warning for their test-suite, that's a > python way how to do it. > > One of the reasons why it's not enabled by default is that as user > you have no way how to modify the module code and therefore you should > not get the warning. From user point of view everything works how it > should and I don't care whether my program uses something deprecated > or not. It's unnecessary noise for the user. All of the above is perfectly reasonable if you only consider cases where we might want to deprecate an entire API, but fails to take into account that a large part of libvirt's functionality is exposed through XML elements and attributes rather than C function calls. So if we wanted to warn people that not providing a machine type when defining a guest is a bad idea and they should use libosinfo to figure out the information instead, there's simply no way we can do that at compile time. Not to mention that the existence of virsh somewhat blurs the lines between users and developers, as it's a very thin wrapper where each option translates almost directly to the corresponding C function call and its parameters and flags, but is then installed in $PATH for the world to use... > > And for dropping functionality - we can not do that. Period. > > +1, for everyone questioning this please read [1], thanks. > > [1] <https://libvirt.org/support.html> As mentioned elsewhere, that document defines our current stance on backwards compatibility, but nothing says we can't change it if we think doing so will bring benefits to developers and users, which I strongly believe it would. We used to have a similar, although not formally documented, stance on supporting old QEMU versions, but we've since relaxed it[1] which has benefited libvirt and its developers greatly without, as far as I'm aware, inconveniencing users. [1] A change that, I can't help noticing, you signed off on ;) -- Andrea Bolognani / Red Hat / Virtualization -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Thu, Oct 04, 2018 at 02:06:54PM +0200, Andrea Bolognani wrote: > On Thu, 2018-10-04 at 13:36 +0200, Pavel Hrdina wrote: > > Runtime deprecation warning in C code is wrong in my opinion and even > > though there are some project doing it we should not do it. > > Deprecation warning is for developers, not for users. > > > > Yes, adding deprecation warning into libvirt will not solve the goal for > > oVirt nor OpenStack, but this should be handled in the libvirt-python > > binding code using language-specific manner as Andrea mentioned above. > > In python it happens to be a runtime warning, > > warnings.DeprecatedWarning, however, that warning is by default ignored > > unless it's triggered from __main__, so developers using libvirt-python > > API would need to enable that warning for their test-suite, that's a > > python way how to do it. > > > > One of the reasons why it's not enabled by default is that as user > > you have no way how to modify the module code and therefore you should > > not get the warning. From user point of view everything works how it > > should and I don't care whether my program uses something deprecated > > or not. It's unnecessary noise for the user. > > All of the above is perfectly reasonable if you only consider cases > where we might want to deprecate an entire API, but fails to take > into account that a large part of libvirt's functionality is exposed > through XML elements and attributes rather than C function calls. > > So if we wanted to warn people that not providing a machine type > when defining a guest is a bad idea and they should use libosinfo > to figure out the information instead, there's simply no way we can > do that at compile time. Sure, if we want to warn about missing/wrong/deprecated features from XML we need runtime warnings only for that and that's OK, that is something that user usually can change. > Not to mention that the existence of virsh somewhat blurs the lines > between users and developers, as it's a very thin wrapper where each > option translates almost directly to the corresponding C function > call and its parameters and flags, but is then installed in $PATH > for the world to use... Again if it's something XML related user can change that even while using virsh, if it's API related user should not see it by default. > > > And for dropping functionality - we can not do that. Period. > > > > +1, for everyone questioning this please read [1], thanks. > > > > [1] <https://libvirt.org/support.html> > > As mentioned elsewhere, that document defines our current stance on > backwards compatibility, but nothing says we can't change it if we > think doing so will bring benefits to developers and users, which I > strongly believe it would. > > We used to have a similar, although not formally documented, stance > on supporting old QEMU versions, but we've since relaxed it[1] which > has benefited libvirt and its developers greatly without, as far as > I'm aware, inconveniencing users. I'm not sure whether we had that promise for QEMU versions documented and it's also slightly different situation. We are dropping support for old QEMUs in new libvirts with the assumption that user usually updates both at the same time (through OS distribution channels). Removing APIs is totally different. We would have to bump our library so version and that would break every existing application unless you would rebuild it to use the new library. That would cause a lot of issues for everyone using libvirt. Yes, dropping old or broken API would allow us to drop a lot of code, on the other hand with the deprecation warning in place and deprecation note in the documentation we can start claiming that these APIs are no longer supported and any bugs caused be these APIs (maybe except for security issues) will not be fixed and in the code we could try to isolate that code as much as possible to make it easier for us. Pavel -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Thu, 2018-10-04 at 13:16 +0200, Michal Privoznik wrote: > On 10/04/2018 01:03 PM, Andrea Bolognani wrote: > > That said, in order to reap real benefits, deprecating features > > should go hand in hand with having a well-defined support policy > > that includes a timeline describing how, after a generous grace > > period, the functionality will actually be dropped altogether. > > /me grabbing some popcorn and soda O:-) > > So this was going to be my question. What is the end goal we are trying > to achieve? If it is to advertise we have a better API to do the job, we > can document it - just like we are doing so far. Also the term "better > API" depends on individual use case heavily. If I have a small, one host > virtualization solution, and I know nobody else is starting up/shutting > down domains, I might use virDomainGetNames() just fine. Until the time you give access to your host to another user, and suddenly your scripts will fail 10% of the time for no obvious reason ;) Using the proper API might be more work upfront, but it will pay off in the long run. > Worse, if I have a working solution and I upgrade libvirt I'll start > seeing warning all of a sudden. This kind of breaks our promise of > stability. If the functionality keeps working, then getting a bunch of warnings in the process will definitely not break any stability promise. And if reporting warnings breaks functionality, then we need to find a different way to report them because applications no longer working after a libvirt update is simply unacceptable. > And for dropping functionality - we can not do that. Period. It's *us* who decided that's the case, so it's up to *us* to uphold that decision. Or, you know, revert it :) I'm not saying that we should start dropping random features out of the blue without any rationale, but if we start formally deprecating functionality that we deem problematic and at the same time communicate clearly to both developers and users such functionality will be removed upstream after, say, three years worth of releases, then I see absolutely no problem with that. GTK+ does it, Qt does it; and I'm willing to bet there are more applications linking against either of them than there are linking against libvirt. -- Andrea Bolognani / Red Hat / Virtualization -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Thu, Oct 04, 2018 at 01:38:33PM +0200, Andrea Bolognani wrote: > GTK+ does it, Qt does it; and I'm willing to bet there are more > applications linking against either of them than there are linking > against libvirt. I don't think that's a positive example. As a developer who has used GTK for multiple apps, I *hate* their frequent API deprecations during minor updates. It is a constant battle to keep app code compiling cleanly even within a single releae, no to mention the even greater pain of trying to do GTK2 and GTK3 in parallel. Sure, this deprecation and deletion of code benefits the GTK maintainers, but the cost of creating ongoing pain for every single application developer. That's not a net win IMHO as the set of app developers is much larger. As a counter point, Microsoft Windows never breaks its APIs and that has many orders of magnitude more app developers than GTK/Qt The fact that libvirt never breaks API is one of our greatest selling points. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Thu, 2018-10-04 at 14:02 +0100, Daniel P. Berrangé wrote: > On Thu, Oct 04, 2018 at 01:38:33PM +0200, Andrea Bolognani wrote: > > GTK+ does it, Qt does it; and I'm willing to bet there are more > > applications linking against either of them than there are linking > > against libvirt. > > I don't think that's a positive example. As a developer who has > used GTK for multiple apps, I *hate* their frequent API deprecations > during minor updates. It is a constant battle to keep app code > compiling cleanly even within a single releae, no to mention the > even greater pain of trying to do GTK2 and GTK3 in parallel. > > Sure, this deprecation and deletion of code benefits the GTK > maintainers, but the cost of creating ongoing pain for every > single application developer. That's not a net win IMHO as the > set of app developers is much larger. I think keeping up with changes in libraries you consume is simply due diligence for developers, and while of course that takes up some of your time and is for the most part hardly a glamorous job, it also ensures your application is in tip-top shape at any given time and is taking advantage of all possible improvements in the underlying tech by periodically reconsidering parts of its design, which will ultimately result in a better experience for users. Cleaning up the API from time to time also means developers won't have to figure out which one of the seemingly equivalent solutions to a given problem presents which particular pitfalls, which will reduce frustrations and once again result in better applications. And libvirt developers themselves not having to tiptoe around what is in some cases quite literally a decade worth of backwards compatibility hacks would allow them to focus on features and bug fixes that actually impact users in a positive way, all the while keeping the size of the library from growing out of control, with all the resource usage and attack suface consideration that entails. > As a counter point, Microsoft Windows never breaks its APIs and > that has many orders of magnitude more app developers than GTK/Qt How many of the bugs and crashes experienced daily by Windows users could be avoided if applications were forced to adopt newer, safer APIs over time instead of being allowed to lazily stick to whatever was available at the time Windows 95 was the new hotness? Your guess is as good as mine, but I'm pretty sure the answer is not "zero". > The fact that libvirt never breaks API is one of our greatest > selling points. I don't have hard data on this, but I suspect the success of libvirt is driven more by the great features it offers than its promise to preserve API/ABI compatibility forever; even more so considering that the majority of Open Source projects don't offer anything close to the latter, and yet somehow manage to thrive. -- Andrea Bolognani / Red Hat / Virtualization -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Thu, Oct 04, 2018 at 04:08:31PM +0200, Andrea Bolognani wrote: > On Thu, 2018-10-04 at 14:02 +0100, Daniel P. Berrangé wrote: > > On Thu, Oct 04, 2018 at 01:38:33PM +0200, Andrea Bolognani wrote: > > > GTK+ does it, Qt does it; and I'm willing to bet there are more > > > applications linking against either of them than there are linking > > > against libvirt. > > > > I don't think that's a positive example. As a developer who has > > used GTK for multiple apps, I *hate* their frequent API deprecations > > during minor updates. It is a constant battle to keep app code > > compiling cleanly even within a single releae, no to mention the > > even greater pain of trying to do GTK2 and GTK3 in parallel. > > > > Sure, this deprecation and deletion of code benefits the GTK > > maintainers, but the cost of creating ongoing pain for every > > single application developer. That's not a net win IMHO as the > > set of app developers is much larger. > > I think keeping up with changes in libraries you consume is simply > due diligence for developers, and while of course that takes up > some of your time and is for the most part hardly a glamorous job, > it also ensures your application is in tip-top shape at any given > time and is taking advantage of all possible improvements in the > underlying tech by periodically reconsidering parts of its design, > which will ultimately result in a better experience for users. Changing existing code to use new APIs doesn't imply the app is in tip-top shape. On the contrary, I can arguably make it more likely to break in subtle ways by changing something which is long tested in the field for something new & relatively untested. > And libvirt developers themselves not having to tiptoe around what > is in some cases quite literally a decade worth of backwards > compatibility hacks would allow them to focus on features and bug > fixes that actually impact users in a positive way, all the while > keeping the size of the library from growing out of control, with > all the resource usage and attack suface consideration that entails. Maintaining API compatibility is not really the main contributor to resource usage or attack surface. That is a large architectural issue, such as the monolithic libvirtd and usage memory unsafe language for impl. Deprecating & removing features or changing defaults will have negligible impact. > > As a counter point, Microsoft Windows never breaks its APIs and > > that has many orders of magnitude more app developers than GTK/Qt > > How many of the bugs and crashes experienced daily by Windows users > could be avoided if applications were forced to adopt newer, safer > APIs over time instead of being allowed to lazily stick to whatever > was available at the time Windows 95 was the new hotness? The idea that you can force people to adapt is flawed. Some devs may adapt. Others simply won't so an app that otherwise worked fine will now be needlessly broken. Others will simply bundle the older version of your API in their code getting stuck on old outdated insecure versions. > Your guess is as good as mine, but I'm pretty sure the answer is > not "zero". Adapting to new APIs will itself introduce new bugs. Libvirt has felt this when we've had to adapt to APIs changing in things we use too. > > The fact that libvirt never breaks API is one of our greatest > > selling points. > > I don't have hard data on this, but I suspect the success of > libvirt is driven more by the great features it offers than its > promise to preserve API/ABI compatibility forever; even more so > considering that the majority of Open Source projects don't offer > anything close to the latter, and yet somehow manage to thrive. GTK's API breakage means that we still have applications in Fedora that are stuck on GTK1, despite fact that GTK3 is current and GTK4 is coming soon. The world of hurt caused by Python's gratuituous incompatbilities between py2 and py3 is going to continue to harm the python community for years to come. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Thu, 2018-10-04 at 15:26 +0100, Daniel P. Berrangé wrote: > On Thu, Oct 04, 2018 at 04:08:31PM +0200, Andrea Bolognani wrote: > > I think keeping up with changes in libraries you consume is simply > > due diligence for developers, and while of course that takes up > > some of your time and is for the most part hardly a glamorous job, > > it also ensures your application is in tip-top shape at any given > > time and is taking advantage of all possible improvements in the > > underlying tech by periodically reconsidering parts of its design, > > which will ultimately result in a better experience for users. > > Changing existing code to use new APIs doesn't imply the app > is in tip-top shape. On the contrary, I can arguably make it > more likely to break in subtle ways by changing something which > is long tested in the field for something new & relatively > untested. You don't have to be an early adopter: it's perfectly fine to wait a few point releases before moving to a new API. Never addressing the deprecation warnings, though, is hardly going to result in better code in the long run. > > And libvirt developers themselves not having to tiptoe around what > > is in some cases quite literally a decade worth of backwards > > compatibility hacks would allow them to focus on features and bug > > fixes that actually impact users in a positive way, all the while > > keeping the size of the library from growing out of control, with > > all the resource usage and attack suface consideration that entails. > > Maintaining API compatibility is not really the main contributor > to resource usage or attack surface. That is a large architectural > issue, such as the monolithic libvirtd and usage memory unsafe > language for impl. Deprecating & removing features or changing > defaults will have negligible impact. It is certainly forcing us to write more complex logic, which is easier to get wrong and will result in more bugs regardless of how safe the programming language we use is. Not that I'm opposing any of the changes you mention, mind you :) > > > As a counter point, Microsoft Windows never breaks its APIs and > > > that has many orders of magnitude more app developers than GTK/Qt > > > > How many of the bugs and crashes experienced daily by Windows users > > could be avoided if applications were forced to adopt newer, safer > > APIs over time instead of being allowed to lazily stick to whatever > > was available at the time Windows 95 was the new hotness? > > The idea that you can force people to adapt is flawed. Some devs > may adapt. Others simply won't so an app that otherwise worked > fine will now be needlessly broken. Others will simply bundle > the older version of your API in their code getting stuck on > old outdated insecure versions. Are those the kind of applications you'd trust to manage your virtualization cluster? If the developers can't be bothered to put out a new release for literally *years*, then you're better off using pretty much anything else. > > Your guess is as good as mine, but I'm pretty sure the answer is > > not "zero". > > Adapting to new APIs will itself introduce new bugs. Libvirt has felt > this when we've had to adapt to APIs changing in things we use too. Adding support for new QEMU features also sometimes results in new bugs popping up, but that should teach us to write and test our code more carefully, not to give up on keeping up with the innovations happening in QEMU. > > > The fact that libvirt never breaks API is one of our greatest > > > selling points. > > > > I don't have hard data on this, but I suspect the success of > > libvirt is driven more by the great features it offers than its > > promise to preserve API/ABI compatibility forever; even more so > > considering that the majority of Open Source projects don't offer > > anything close to the latter, and yet somehow manage to thrive. > > GTK's API breakage means that we still have applications in Fedora > that are stuck on GTK1, despite fact that GTK3 is current and GTK4 > is coming soon. 'dnf repoquery --whatrequires gtk+' reports ~30 results, a big chunk of which are related to XMMS, a media player that's not been updated in 10+ year; GTK+ 1 itself hasn't seen a release since 2011. Allowing them to still be just a 'dnf install' away for any user is borderline irresponsible IMHO. > The world of hurt caused by Python's gratuituous > incompatbilities between py2 and py3 is going to continue to harm > the python community for years to come. For what it's worth, and that's probably not much, according to the TIOBE Index Python is more popular now that it's ever been, and it has climbed back from 5th to 3rd place over the past year, despite it being arguably one of the worst times for Python programmers who had been putting off the 2->3 transition until the very last moment. -- Andrea Bolognani / Red Hat / Virtualization -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Tue, 2018-10-02 at 17:19 +0200, Peter Krempa wrote: > On Tue, Oct 02, 2018 at 16:14:39 +0200, Andrea Bolognani wrote: [...] > > Two concrete examples are considered here: one is the > > virConnectNumOfDomains() API which, while known to be racy and having > > non-racy alternatives, can still be used by developers without > > getting any kind of warning in the process; the other one is the > > ability to define a domain without specifying the machine type, which > > Okay, but for these particular ones we could do a compile time warning. I believe we should really have both, to address both applications being rebuilt against newer libvirt all the time, such as open source projects that are included in our CI or in any Linux distro, and those that aren't and just get a new libvirt version from time to time due to OS updates, such as home-grown management tools. > Not that we ever can remove them though. True, that has been our policy so far. Doesn't mean it cannot ever change :) [...] > Our documentation states in multiple places that fields not populated by > the user are mostly hypervisor dependant what the default will become. > > In my opinion the machine type is similarly hypervisor dependant, and in > this case the "hypervisor" for the libvirt-qemu infrastructure also > involves libvirt's qemu driver. I don't necessarily disagree with you, but it should be noted that attempts to change libvirt's own defaults have been rejected times and times again on the basis that existing applications were, despite that being a very bad idea, relying on them. > > adoption, as well as being a manifestation of the more general > > problem of libvirt's default being sometimes too conservative and at > > odds with the existence of slimmed-down QEMU binaries being built > > with reducing the total attack surface in mind. > > If your qemu binary does not support certain feature, libvirt will know > it. We have capability detection and for that matter we also have > machine type detection (we fill in the default according to the > canonical name). In such case we are very welcome to choose anything > which will satisfy the default. > > I'm afraid though that the downstreams you are mentioning can't in fact > fully drop i440fx for some reason and thus are trying to weasel around > by attempting to make us change the default. This I don't support as a > worthy goal. If they want to slim down qemu, they are welcome and we can > pick a suitable different default. q35 is what sparked the discussion, but it's far from the only offender. For example, if I create a guest using $ virt-install \ --os-variant fedora28 \ ... then libosinfo will be queried for information about supported network cards, and the resulting XML will look like <interface type='network'> <model type='virtio'/> ... However, libvirt's own default for x86_64 guests' network devices is rtl8139, which means that if I later 'virsh edit' the guest or 'virsh attach-device' a new network interface I will get that model instead of virtio; to add insult to injury, the above will happen even if my QEMU binary has rtl8139 compiled out and virtio-net-pci compiled in! The only way to avoid such issues is to require the user / management application to provide the model name themselves. > We can also consider using what qemu provides as a default. Users will > get the default they asked for. They always can specify their specific > type if their software is not happy with it. Using QEMU's default machine type is exactly what we were doing until very recently, but we changed that because QEMU's default has been i440fx for so long that applications have come to rely on it and would break if q35 suddenly started showing up instead, which goes to show that they should not have been relying on either QEMU's or libvirt's default, and they should have been providing the machine type explicitly, possibly as obtained by querying libosinfo, instead. Or, looking at it from the other side, that libvirt should have required them to provide the machine type instead of trying to be helpful and filling it in if absent. We can't retroactively mandate applications do that, but we can deprecate such behavior and thus steer them firmly towards the proper solution. -- Andrea Bolognani / Red Hat / Virtualization -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Tue, Oct 02, 2018 at 18:26:12 +0200, Andrea Bolognani wrote: > On Tue, 2018-10-02 at 17:19 +0200, Peter Krempa wrote: > > Our documentation states in multiple places that fields not populated by > > the user are mostly hypervisor dependant what the default will become. > > > > In my opinion the machine type is similarly hypervisor dependant, and in > > this case the "hypervisor" for the libvirt-qemu infrastructure also > > involves libvirt's qemu driver. > > I don't necessarily disagree with you, but it should be noted that > attempts to change libvirt's own defaults have been rejected times > and times again on the basis that existing applications were, despite > that being a very bad idea, relying on them. I'm going to talk only about default values which are stored in domain XML once chosen here as the ones which do not make it into the XML are much more complicated and we can't easily change them... I think rejecting changes to libvirt's defaults in general is wrong. The defaults are not stable and have never been and we should stop pretending they are. When we select a default value for something we store it in the domain XML to protect us from future changes to the default value. Not to mention that defaults may depend on hypervisor version and we need to make sure we fail to start a domain on newer hypervisor with different defaults rather than silently starting a domain with different devices and hope for not breaking the guest OS. So for example, when a domain XML contains no machine type or even just an alias, such as "pc", we replace it with something real when the domain is defined. Thus when the exact same domain XML will result in different machine type depending on the QEMU version. Of course, the difference is not so drastic as in "pc" vs. "q35", but the difference is real. That said, I'm not saying we should just change our defaults just because we think a new value is better. We have have libosinfo for telling users what values are the best for their guest OS. But in case QEMU deprecates some device model or even machine type, I think we should avoid using the deprecated model as a default value to prepare people for transition. Once the deprecated model is removed, our default would change anyway, but existing domain would suddenly start to fail in case we did not change the default when the model was marked as deprecated. Of course, this would not help apps or users using transient domains without default values, but such apps/users are just broken if they rely on defaults. For persistent domains libvirt takes care of preserving default values from when a domain was first defined. Users/apps have to handle this themselves for transient apps if they care. Libvirt updates the live XML and they can read it and update their definition to make the defaults persistent. There's nothing more we can do for them and I don't think this should block us from changing any defaults. We're already doing so anyway. Jirka -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Tue, Oct 02, 2018 at 06:26:12PM +0200, Andrea Bolognani wrote: > On Tue, 2018-10-02 at 17:19 +0200, Peter Krempa wrote: > > On Tue, Oct 02, 2018 at 16:14:39 +0200, Andrea Bolognani wrote: > [...] > > > Two concrete examples are considered here: one is the > > > virConnectNumOfDomains() API which, while known to be racy and having > > > non-racy alternatives, can still be used by developers without > > > getting any kind of warning in the process; the other one is the > > > ability to define a domain without specifying the machine type, which > > > > Okay, but for these particular ones we could do a compile time warning. > > I believe we should really have both, to address both applications > being rebuilt against newer libvirt all the time, such as open source > projects that are included in our CI or in any Linux distro, and > those that aren't and just get a new libvirt version from time to > time due to OS updates, such as home-grown management tools. > > > Not that we ever can remove them though. > > True, that has been our policy so far. Doesn't mean it cannot ever > change :) > > [...] > > Our documentation states in multiple places that fields not populated by > > the user are mostly hypervisor dependant what the default will become. > > > > In my opinion the machine type is similarly hypervisor dependant, and in > > this case the "hypervisor" for the libvirt-qemu infrastructure also > > involves libvirt's qemu driver. > > I don't necessarily disagree with you, but it should be noted that > attempts to change libvirt's own defaults have been rejected times > and times again on the basis that existing applications were, despite > that being a very bad idea, relying on them. > > > > adoption, as well as being a manifestation of the more general > > > problem of libvirt's default being sometimes too conservative and at > > > odds with the existence of slimmed-down QEMU binaries being built > > > with reducing the total attack surface in mind. > > > > If your qemu binary does not support certain feature, libvirt will know > > it. We have capability detection and for that matter we also have > > machine type detection (we fill in the default according to the > > canonical name). In such case we are very welcome to choose anything > > which will satisfy the default. > > > > I'm afraid though that the downstreams you are mentioning can't in fact > > fully drop i440fx for some reason and thus are trying to weasel around > > by attempting to make us change the default. This I don't support as a > > worthy goal. If they want to slim down qemu, they are welcome and we can > > pick a suitable different default. > > q35 is what sparked the discussion, but it's far from the only > offender. For example, if I create a guest using > > $ virt-install \ > --os-variant fedora28 \ > ... > > then libosinfo will be queried for information about supported > network cards, and the resulting XML will look like > > <interface type='network'> > <model type='virtio'/> > ... > > However, libvirt's own default for x86_64 guests' network devices is > rtl8139, which means that if I later 'virsh edit' the guest or 'virsh > attach-device' a new network interface I will get that model instead > of virtio; to add insult to injury, the above will happen even if my > QEMU binary has rtl8139 compiled out and virtio-net-pci compiled in! That's not correct actually. If rtl8139 is missing in QEMU, libvirt will try e1000, and if that's missing it'll try virtio-net. > > We can also consider using what qemu provides as a default. Users will > > get the default they asked for. They always can specify their specific > > type if their software is not happy with it. > > Using QEMU's default machine type is exactly what we were doing until > very recently, but we changed that because QEMU's default has been > i440fx for so long that applications have come to rely on it and > would break if q35 suddenly started showing up instead, which goes to > show that they should not have been relying on either QEMU's or > libvirt's default, and they should have been providing the machine > type explicitly, possibly as obtained by querying libosinfo, instead. > > Or, looking at it from the other side, that libvirt should have > required them to provide the machine type instead of trying to be > helpful and filling it in if absent. We can't retroactively mandate > applications do that, but we can deprecate such behavior and thus > steer them firmly towards the proper solution. The problem with saying applications were doing it "wrong" is that this definition of "wrong" changes. Applications were perfectly justified in not providing a machine type, because the concept didn't even exist in earlier libvirt. Once it did exist, we still only supported x86, and there was no q35, so it was still valid to not specify it. Even today it is reasonable to not care about machine type in case where the app only cares about x86. Our view of "best" way to configure a guest is changing and in many cases it is becoming increasingly clear that there's no single "best" way, or no single perfect default. Taking something that's historically optional and saying it should be mandatory is a defacto API breakage. Deprecating it doesn't magically stop it being an API breakage. It is just giving apps a warning that we're about to hurt them, and I don't consider that a good thing. I think we're largely missing the bigger picture here. Configuring guests, and using libvirt APIs in general, can be very complicated. We provide basic API contract docs, and a crude XML schema reference, but this is woefully insufficient. There's very little telling apps about the big picture way to configure things / implement tasks. We're missing a good developer guide where you'd give info to apps devs about how to effectively use libvirt, so it is no surprise that apps do things that are sub-optimal. Providing better docs to app devs would be far more useful than deprecation warnings which have minimal contextual guidance. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Thu, 2018-10-04 at 14:13 +0100, Daniel P. Berrangé wrote: > On Tue, Oct 02, 2018 at 06:26:12PM +0200, Andrea Bolognani wrote: > > However, libvirt's own default for x86_64 guests' network devices is > > rtl8139, which means that if I later 'virsh edit' the guest or 'virsh > > attach-device' a new network interface I will get that model instead > > of virtio; to add insult to injury, the above will happen even if my > > QEMU binary has rtl8139 compiled out and virtio-net-pci compiled in! > > That's not correct actually. If rtl8139 is missing in QEMU, libvirt > will try e1000, and if that's missing it'll try virtio-net. Welp, you're right. Quite embarrassing that I got it wrong, too, considering that I am the one who implemented this behavior in the first place O:-) The larger point still stands, though: you pretty much never want to see an rtl8139 device popping up in your guest these days, and yet it's awfully easy for that to happen which, in my opinion, is a problem we should address. > > > We can also consider using what qemu provides as a default. Users will > > > get the default they asked for. They always can specify their specific > > > type if their software is not happy with it. > > > > Using QEMU's default machine type is exactly what we were doing until > > very recently, but we changed that because QEMU's default has been > > i440fx for so long that applications have come to rely on it and > > would break if q35 suddenly started showing up instead, which goes to > > show that they should not have been relying on either QEMU's or > > libvirt's default, and they should have been providing the machine > > type explicitly, possibly as obtained by querying libosinfo, instead. > > > > Or, looking at it from the other side, that libvirt should have > > required them to provide the machine type instead of trying to be > > helpful and filling it in if absent. We can't retroactively mandate > > applications do that, but we can deprecate such behavior and thus > > steer them firmly towards the proper solution. > > The problem with saying applications were doing it "wrong" is that > this definition of "wrong" changes. Applications were perfectly > justified in not providing a machine type, because the concept > didn't even exist in earlier libvirt. Once it did exist, we still > only supported x86, and there was no q35, so it was still valid to > not specify it. Sure, that design decision made complete sense in the context it was made in. But the context has changed in a way that turned it from an asset into a liability, and pretending that's not the case will certainly not solve the issue. > Our view of "best" way to configure a guest is changing and in many > cases it is becoming increasingly clear that there's no single > "best" way, or no single perfect default. I completely agree here. > Taking something that's historically optional and saying it should > be mandatory is a defacto API breakage. Deprecating it doesn't > magically stop it being an API breakage. It is just giving apps > a warning that we're about to hurt them, and I don't consider that > a good thing. I disagree: for better or worse, features being deprecated and ultimately removed is one of the realities of software development, and I see very little motivation for libvirt to try and play by different rules. > I think we're largely missing the bigger picture here. Configuring > guests, and using libvirt APIs in general, can be very complicated. > > We provide basic API contract docs, and a crude XML schema reference, > but this is woefully insufficient. There's very little telling apps > about the big picture way to configure things / implement tasks. > > We're missing a good developer guide where you'd give info to apps > devs about how to effectively use libvirt, so it is no surprise that > apps do things that are sub-optimal. Providing better docs to app > devs would be far more useful than deprecation warnings which have > minimal contextual guidance. I agree that having better documentation would help, and we should definitely work towards that goal. However, I'm fairly confident trying to address issues through documentation only will not be enough to get applications off problematic API usage: people mentioned several times elsewhere in the thread how they think *emitting warnings* itself wouldn't be enough, and developers would only take notice once the application breaks for good! How is documentation going to be effective? Who would look at documentation periodically just to make sure they're not using features that have since been deprecated? I know I don't do that for any of libvirt's dependencies, but if I started getting warnings I'd certainly take notice. Unless users are nagged about it during usage and developers are nagged about it during development, usage of problematic APIs will never go away. -- Andrea Bolognani / Red Hat / Virtualization -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Thu, Oct 04, 2018 at 04:49:43PM +0200, Andrea Bolognani wrote: > On Thu, 2018-10-04 at 14:13 +0100, Daniel P. Berrangé wrote: > > I think we're largely missing the bigger picture here. Configuring > > guests, and using libvirt APIs in general, can be very complicated. > > > > We provide basic API contract docs, and a crude XML schema reference, > > but this is woefully insufficient. There's very little telling apps > > about the big picture way to configure things / implement tasks. > > > > We're missing a good developer guide where you'd give info to apps > > devs about how to effectively use libvirt, so it is no surprise that > > apps do things that are sub-optimal. Providing better docs to app > > devs would be far more useful than deprecation warnings which have > > minimal contextual guidance. > > I agree that having better documentation would help, and we should > definitely work towards that goal. > > However, I'm fairly confident trying to address issues through > documentation only will not be enough to get applications off > problematic API usage: people mentioned several times elsewhere in > the thread how they think *emitting warnings* itself wouldn't be > enough, and developers would only take notice once the application > breaks for good! How is documentation going to be effective? Who > would look at documentation periodically just to make sure they're > not using features that have since been deprecated? I know I don't > do that for any of libvirt's dependencies, but if I started getting > warnings I'd certainly take notice. > > Unless users are nagged about it during usage and developers are > nagged about it during development, usage of problematic APIs will > never go away. I'm not concerned if usage doesn't go away entirely, because I believe in our goal to preserve API compatibility indefinitely. If an application is using an API or feature & it works for them that is fine. Even if the API has known design flaws, that isn't a problem unless those flaws are relevant to the usage scenario of the app I'm much more interested in how we can help application developers pick the most effective approaches to using libvirt, and a couple of lines in a deprecation message are not effective for this. A large part of this thread is based off the fact that apps are blindly relying on defaults. This is bad because no single default is a good choice for all scenarios. What's needed is guidance on how to effectively configure guests using libosinfo to identify suitable choices. This means explaining to apps how libosinfo fits into the picture when using libvirt, and describing the caveats around the choices of machine types, or the choices of CPU models. There's no escaping the writing of detailed docs to get across these points. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Thu, 2018-10-04 at 16:04 +0100, Daniel P. Berrangé wrote: > On Thu, Oct 04, 2018 at 04:49:43PM +0200, Andrea Bolognani wrote: > > I agree that having better documentation would help, and we should > > definitely work towards that goal. > > > > However, I'm fairly confident trying to address issues through > > documentation only will not be enough to get applications off > > problematic API usage: people mentioned several times elsewhere in > > the thread how they think *emitting warnings* itself wouldn't be > > enough, and developers would only take notice once the application > > breaks for good! How is documentation going to be effective? Who > > would look at documentation periodically just to make sure they're > > not using features that have since been deprecated? I know I don't > > do that for any of libvirt's dependencies, but if I started getting > > warnings I'd certainly take notice. > > > > Unless users are nagged about it during usage and developers are > > nagged about it during development, usage of problematic APIs will > > never go away. > > I'm not concerned if usage doesn't go away entirely, because I believe > in our goal to preserve API compatibility indefinitely. I don't see that as a goal per se, but rather as a *mean* to achieve goals such as ensuring applications don't break on users during upgrade and application developers don't have to work harder than necessary to keep up with changes in libvirt. It's not, however, quite a silver bullet because, for all the good it does, it also introduces several issues, some of which have been mentioned in this thread. I believe a better balance would be achieved by guaranteeing *long term* API/ABI stability, in the order of several years, so that application developers and users still benefit from not having changes forced on them too frequently but we also have the chance to clean up after our mistakes from time to time, which again would not only help libvirt developers but also, indirectly, users and application developers too. > If an application > is using an API or feature & it works for them that is fine. Even if the > API has known design flaws, that isn't a problem unless those flaws are > relevant to the usage scenario of the app Usage scenarios can and will change. Someone earlier in the thread used the example of a simple application that works just fine despite using racy APIs because there's only one person starting and stopping guests: what happens when a second admin enters the picture? Now you have to spend time figuring out where the (seldomly occurring) issue comes from and ultimately fix it by switching to the non-racy API. Wouldn't it have been much better if using the racy API had raised warnings from the very beginning, causing the developer to switch to the non-racy one before hitting any issues? Also, circling back to the machine type conundrum, what if the application or deployment assume guests will be i440fx but at some point a QEMU binary with i440fx compiled out is installed on the system? Suddenly what used to work doesn't work anymore despite the application itself sticking to "what works for them". > I'm much more interested in how we can help application developers pick > the most effective approaches to using libvirt, and a couple of lines in > a deprecation message are not effective for this. While a short "this is deprecated" message won't give application developers or users the whole story, it will certainly give them motivation to look up the details in the documentation. > A large part of this thread is based off the fact that apps are blindly > relying on defaults. This is bad because no single default is a good > choice for all scenarios. What's needed is guidance on how to effectively > configure guests using libosinfo to identify suitable choices. This means > explaining to apps how libosinfo fits into the picture when using libvirt, > and describing the caveats around the choices of machine types, or the > choices of CPU models. There's no escaping the writing of detailed docs > to get across these points. Do we expect virsh users to also, unprompted, go out of their way to read documentation targeted at application developers? -- Andrea Bolognani / Red Hat / Virtualization -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Thu, Oct 04, 2018 at 14:13:41 +0100, Daniel P. Berrangé wrote: > The problem with saying applications were doing it "wrong" is that > this definition of "wrong" changes. Applications were perfectly > justified in not providing a machine type, because the concept > didn't even exist in earlier libvirt. Once it did exist, we still > only supported x86, and there was no q35, so it was still valid to > not specify it. > > Even today it is reasonable to not care about machine type in case > where the app only cares about x86. > > Our view of "best" way to configure a guest is changing and in many > cases it is becoming increasingly clear that there's no single > "best" way, or no single perfect default. As I tried to explain in another mail in this thread, I definitely agree we should not change defaults just because we think some value is better now. That's really a job for libosinfo. But what if QEMU (or any other hypervisor) marks something (device model, machine type) as deprecated and we use that deprecated value as our default. Shouldn't we be able to tell about it to our users (here runtime warnings could help) when they use such thing explicitly and choose a different default value? That would help users with the transition and once hypervisor drops support for it completely, fewer existing domains will be affected since the recently created ones would already use non-deprecated defaults. We already silently do this to some extent with machine types. If a users specifies "pc" or even nothing, we translate it to the current QEMU default machine type "pc-i440fx-*". So taking "q35" as an example, if QEMU marks i440fx machine types as deprecated (they seem to be thinking about it) I don't see that big difference[*] in changing the default machine type to "q35". After all we will automatically change the default once i440fx is removed completely. [*] sure some devices don't even exist in q35 machine types some domain XMLs may not work anymore while with just newer i440fx machine type the XML would just keep working. But the guest OS will see a different hardware anyway and may be unhappy about it. Jirka -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Fri, Oct 05, 2018 at 10:34:33AM +0200, Jiri Denemark wrote: > On Thu, Oct 04, 2018 at 14:13:41 +0100, Daniel P. Berrangé wrote: > > The problem with saying applications were doing it "wrong" is that > > this definition of "wrong" changes. Applications were perfectly > > justified in not providing a machine type, because the concept > > didn't even exist in earlier libvirt. Once it did exist, we still > > only supported x86, and there was no q35, so it was still valid to > > not specify it. > > > > Even today it is reasonable to not care about machine type in case > > where the app only cares about x86. > > > > Our view of "best" way to configure a guest is changing and in many > > cases it is becoming increasingly clear that there's no single > > "best" way, or no single perfect default. > > As I tried to explain in another mail in this thread, I definitely agree > we should not change defaults just because we think some value is better > now. That's really a job for libosinfo. > > But what if QEMU (or any other hypervisor) marks something (device > model, machine type) as deprecated and we use that deprecated value as > our default. Shouldn't we be able to tell about it to our users (here > runtime warnings could help) when they use such thing explicitly and > choose a different default value? That would help users with the > transition and once hypervisor drops support for it completely, fewer > existing domains will be affected since the recently created ones would > already use non-deprecated defaults. QEMU already emits warnings on stderr whenever something that is deprecated is used, and those end up in the libvirt log file for that guest. I don't think that we need to duplicate what QEMU is already reporting again. It is not the kind of think an application will dynamically take action on at runtime. It is something application developers need to be told about so they can adapt future code releases as needed. > So taking "q35" as an example, if QEMU marks i440fx machine types as > deprecated (they seem to be thinking about it) I don't see that big > difference[*] in changing the default machine type to "q35". After all > we will automatically change the default once i440fx is removed > completely. There's no active plans to deprecate or remove i440fx upstream in QEMU. If a downstream did remove it, then clearly that'll impose some level of pain on applications. This is something where better documentation would help application developers. We report on supported machine types in capabilities XML but few applications do anything useful with this. We're also intending to add support to libosinfo to report on machine type compatibility for OS. We've got nothing that explains to applications how all this functionality fits together, nor good description of application relevant differences between i440fx & q35. So it is not surprising that code apps have around machine types is very sub-par and liable to break. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Fri, Oct 05, 2018 at 09:54:02 +0100, Daniel P. Berrangé wrote: > On Fri, Oct 05, 2018 at 10:34:33AM +0200, Jiri Denemark wrote: > > On Thu, Oct 04, 2018 at 14:13:41 +0100, Daniel P. Berrangé wrote: > > > The problem with saying applications were doing it "wrong" is that > > > this definition of "wrong" changes. Applications were perfectly > > > justified in not providing a machine type, because the concept > > > didn't even exist in earlier libvirt. Once it did exist, we still > > > only supported x86, and there was no q35, so it was still valid to > > > not specify it. > > > > > > Even today it is reasonable to not care about machine type in case > > > where the app only cares about x86. > > > > > > Our view of "best" way to configure a guest is changing and in many > > > cases it is becoming increasingly clear that there's no single > > > "best" way, or no single perfect default. > > > > As I tried to explain in another mail in this thread, I definitely agree > > we should not change defaults just because we think some value is better > > now. That's really a job for libosinfo. > > > > But what if QEMU (or any other hypervisor) marks something (device > > model, machine type) as deprecated and we use that deprecated value as > > our default. Shouldn't we be able to tell about it to our users (here > > runtime warnings could help) when they use such thing explicitly and > > choose a different default value? That would help users with the > > transition and once hypervisor drops support for it completely, fewer > > existing domains will be affected since the recently created ones would > > already use non-deprecated defaults. > > QEMU already emits warnings on stderr whenever something that is > deprecated is used, and those end up in the libvirt log file for > that guest. I don't think that we need to duplicate what QEMU > is already reporting again. > > It is not the kind of think an application will dynamically take > action on at runtime. It is something application developers need > to be told about so they can adapt future code releases as needed. OK, but if it's us choosing some default which QEMU considers deprecated, it's us who should dealt with it. That is, change the default IMHO. Sure apps could just stop relying on defaults and choose something specific which is not deprecated, but that doesn't mean we as a user of QEMU should continue using default values which got deprecated and are likely to be removed. If it's something which is not user/guest visible, we are already trying to move to more modern interfaces. That is easy since it doesn't change anything for libvirt users, but I feel like we need to have a plan for stuff which actually is visible to our users. > > So taking "q35" as an example, if QEMU marks i440fx machine types as > > deprecated (they seem to be thinking about it) I don't see that big > > difference[*] in changing the default machine type to "q35". After all > > we will automatically change the default once i440fx is removed > > completely. > > There's no active plans to deprecate or remove i440fx upstream in > QEMU. If a downstream did remove it, then clearly that'll impose > some level of pain on applications. I wasn't talking about downstream and I probably gave a wrong example and mixed several different things, but the point stays valid. We should have some sort of a policy when and how we change default values rather than saying "we never do that except when we already did so". One of the reason why we store chosen defaults in domain XML is to be able to change the defaults without breaking existing domains. We should formally touch this area in our "Support guarantees" document (https://libvirt.org/support.html) so that users/apps know what they can expect and we don't have to think or fight everytime someone comes up with an idea to change a default value. Jirka -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Fri, Oct 05, 2018 at 12:33:57PM +0200, Jiri Denemark wrote: > On Fri, Oct 05, 2018 at 09:54:02 +0100, Daniel P. Berrangé wrote: > > On Fri, Oct 05, 2018 at 10:34:33AM +0200, Jiri Denemark wrote: > > > On Thu, Oct 04, 2018 at 14:13:41 +0100, Daniel P. Berrangé wrote: > > > > The problem with saying applications were doing it "wrong" is that > > > > this definition of "wrong" changes. Applications were perfectly > > > > justified in not providing a machine type, because the concept > > > > didn't even exist in earlier libvirt. Once it did exist, we still > > > > only supported x86, and there was no q35, so it was still valid to > > > > not specify it. > > > > > > > > Even today it is reasonable to not care about machine type in case > > > > where the app only cares about x86. > > > > > > > > Our view of "best" way to configure a guest is changing and in many > > > > cases it is becoming increasingly clear that there's no single > > > > "best" way, or no single perfect default. > > > > > > As I tried to explain in another mail in this thread, I definitely agree > > > we should not change defaults just because we think some value is better > > > now. That's really a job for libosinfo. > > > > > > But what if QEMU (or any other hypervisor) marks something (device > > > model, machine type) as deprecated and we use that deprecated value as > > > our default. Shouldn't we be able to tell about it to our users (here > > > runtime warnings could help) when they use such thing explicitly and > > > choose a different default value? That would help users with the > > > transition and once hypervisor drops support for it completely, fewer > > > existing domains will be affected since the recently created ones would > > > already use non-deprecated defaults. > > > > QEMU already emits warnings on stderr whenever something that is > > deprecated is used, and those end up in the libvirt log file for > > that guest. I don't think that we need to duplicate what QEMU > > is already reporting again. > > > > It is not the kind of think an application will dynamically take > > action on at runtime. It is something application developers need > > to be told about so they can adapt future code releases as needed. > > OK, but if it's us choosing some default which QEMU considers > deprecated, it's us who should dealt with it. That is, change the > default IMHO. Sure apps could just stop relying on defaults and choose > something specific which is not deprecated, but that doesn't mean we as > a user of QEMU should continue using default values which got deprecated > and are likely to be removed. > > If it's something which is not user/guest visible, we are already trying > to move to more modern interfaces. That is easy since it doesn't change > anything for libvirt users, but I feel like we need to have a plan for > stuff which actually is visible to our users. If some user visible feature is actually removed, then our hand is forced - we have to pick something else to replace it. This is what happened with the default NIC with RHEL stopped building rtl8139, so we had to have a default fallback. That change is/was not seemless since there exist OS which supported rtl8139 but which lacked e1000 / virtio-net drivers. Fortunately apps already use libosinfo to explicitly set NIC model so the breakage was not so visible. If 'i440fx' is ever actually deleted we would have todo something similar - prefer i440fx for best historical compat, but fallback to q35 despite knowing it could break. > > > So taking "q35" as an example, if QEMU marks i440fx machine types as > > > deprecated (they seem to be thinking about it) I don't see that big > > > difference[*] in changing the default machine type to "q35". After all > > > we will automatically change the default once i440fx is removed > > > completely. > > > > There's no active plans to deprecate or remove i440fx upstream in > > QEMU. If a downstream did remove it, then clearly that'll impose > > some level of pain on applications. > > I wasn't talking about downstream and I probably gave a wrong example > and mixed several different things, but the point stays valid. We should > have some sort of a policy when and how we change default values rather > than saying "we never do that except when we already did so". One of the > reason why we store chosen defaults in domain XML is to be able to > change the defaults without breaking existing domains. We should > formally touch this area in our "Support guarantees" document > (https://libvirt.org/support.html) so that users/apps know what they can > expect and we don't have to think or fight everytime someone comes up > with an idea to change a default value. I guess we should outline the NIC default choice as an example of the approach we would take. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Fri, Oct 05, 2018 at 12:10:51 +0100, Daniel P. Berrangé wrote: > On Fri, Oct 05, 2018 at 12:33:57PM +0200, Jiri Denemark wrote: > > On Fri, Oct 05, 2018 at 09:54:02 +0100, Daniel P. Berrangé wrote: > > > On Fri, Oct 05, 2018 at 10:34:33AM +0200, Jiri Denemark wrote: > > > > On Thu, Oct 04, 2018 at 14:13:41 +0100, Daniel P. Berrangé wrote: > > > > > The problem with saying applications were doing it "wrong" is that > > > > > this definition of "wrong" changes. Applications were perfectly > > > > > justified in not providing a machine type, because the concept > > > > > didn't even exist in earlier libvirt. Once it did exist, we still > > > > > only supported x86, and there was no q35, so it was still valid to > > > > > not specify it. > > > > > > > > > > Even today it is reasonable to not care about machine type in case > > > > > where the app only cares about x86. > > > > > > > > > > Our view of "best" way to configure a guest is changing and in many > > > > > cases it is becoming increasingly clear that there's no single > > > > > "best" way, or no single perfect default. > > > > > > > > As I tried to explain in another mail in this thread, I definitely agree > > > > we should not change defaults just because we think some value is better > > > > now. That's really a job for libosinfo. > > > > > > > > But what if QEMU (or any other hypervisor) marks something (device > > > > model, machine type) as deprecated and we use that deprecated value as > > > > our default. Shouldn't we be able to tell about it to our users (here > > > > runtime warnings could help) when they use such thing explicitly and > > > > choose a different default value? That would help users with the > > > > transition and once hypervisor drops support for it completely, fewer > > > > existing domains will be affected since the recently created ones would > > > > already use non-deprecated defaults. > > > > > > QEMU already emits warnings on stderr whenever something that is > > > deprecated is used, and those end up in the libvirt log file for > > > that guest. I don't think that we need to duplicate what QEMU > > > is already reporting again. > > > > > > It is not the kind of think an application will dynamically take > > > action on at runtime. It is something application developers need > > > to be told about so they can adapt future code releases as needed. > > > > OK, but if it's us choosing some default which QEMU considers > > deprecated, it's us who should dealt with it. That is, change the > > default IMHO. Sure apps could just stop relying on defaults and choose > > something specific which is not deprecated, but that doesn't mean we as > > a user of QEMU should continue using default values which got deprecated > > and are likely to be removed. > > > > If it's something which is not user/guest visible, we are already trying > > to move to more modern interfaces. That is easy since it doesn't change > > anything for libvirt users, but I feel like we need to have a plan for > > stuff which actually is visible to our users. > > If some user visible feature is actually removed, then our hand is > forced - we have to pick something else to replace it. This is what > happened with the default NIC with RHEL stopped building rtl8139, > so we had to have a default fallback. > > That change is/was not seemless since there exist OS which supported > rtl8139 but which lacked e1000 / virtio-net drivers. Fortunately apps > already use libosinfo to explicitly set NIC model so the breakage was > not so visible. > > If 'i440fx' is ever actually deleted we would have todo something > similar - prefer i440fx for best historical compat, but fallback > to q35 despite knowing it could break. We already do so. If QEMU supports "pc", we use that as the default (and resolve it to some pc-i440fx-* machine type), otherwise we use whatever QEMU says is the default. My point is whether we should wait until we're forced or switch the default once it's deprecated to make sure newly created domains get the new default early and they will keep working once the old default is removed. Of course, changes to the XML or guest configuration may be needed for the new default to actually work, but I think it's better to do the changes when creating new domains rather than realizing that suddenly all my existing domains won't boot anymore because the default value they got when I created them was removed. Jirka -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Fri, 2018-10-05 at 12:10 +0100, Daniel P. Berrangé wrote: > On Fri, Oct 05, 2018 at 12:33:57PM +0200, Jiri Denemark wrote: > > OK, but if it's us choosing some default which QEMU considers > > deprecated, it's us who should dealt with it. That is, change the > > default IMHO. Sure apps could just stop relying on defaults and choose > > something specific which is not deprecated, but that doesn't mean we as > > a user of QEMU should continue using default values which got deprecated > > and are likely to be removed. > > > > If it's something which is not user/guest visible, we are already trying > > to move to more modern interfaces. That is easy since it doesn't change > > anything for libvirt users, but I feel like we need to have a plan for > > stuff which actually is visible to our users. > > If some user visible feature is actually removed, then our hand is > forced - we have to pick something else to replace it. This is what > happened with the default NIC with RHEL stopped building rtl8139, > so we had to have a default fallback. > > That change is/was not seemless since there exist OS which supported > rtl8139 but which lacked e1000 / virtio-net drivers. Fortunately apps > already use libosinfo to explicitly set NIC model so the breakage was > not so visible. > > If 'i440fx' is ever actually deleted we would have todo something > similar - prefer i440fx for best historical compat, but fallback > to q35 despite knowing it could break. I think this is just running in circles around the issue. Once you've determined that there is no sensible default that works for everyone, and that whatever works now might not work in the future, so applications should provide a value instead of relying on the default if they don't want to break down the line, you've made the default for all intents and purposes entirely pointless. Yet you've stopped just short of actually addressing the underlying issue, which you would do by (after a generous deprecation period) *requiring* applications to provide an explicit value at all times. -- Andrea Bolognani / Red Hat / Virtualization -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Fri, 2018-10-05 at 09:54 +0100, Daniel P. Berrangé wrote: > On Fri, Oct 05, 2018 at 10:34:33AM +0200, Jiri Denemark wrote: > > But what if QEMU (or any other hypervisor) marks something (device > > model, machine type) as deprecated and we use that deprecated value as > > our default. Shouldn't we be able to tell about it to our users (here > > runtime warnings could help) when they use such thing explicitly and > > choose a different default value? That would help users with the > > transition and once hypervisor drops support for it completely, fewer > > existing domains will be affected since the recently created ones would > > already use non-deprecated defaults. > > QEMU already emits warnings on stderr whenever something that is > deprecated is used, and those end up in the libvirt log file for > that guest. I don't think that we need to duplicate what QEMU > is already reporting again. Warnings printed on stderr -> users and developers will actually see them, be annoyed by them, eventually cave in and act upon them. Warnings written to a log -> nobody will notice them, until one day things suddenly stop working apparently out of the blue. We might pretend that's not the case, but really, it is. > > So taking "q35" as an example, if QEMU marks i440fx machine types as > > deprecated (they seem to be thinking about it) I don't see that big > > difference[*] in changing the default machine type to "q35". After all > > we will automatically change the default once i440fx is removed > > completely. > > There's no active plans to deprecate or remove i440fx upstream in > QEMU. If a downstream did remove it, then clearly that'll impose > some level of pain on applications. While i440fx itself has not been deprecated and might possibly never be, other machine types (x86's pc-0.10 and pc-0.11, ppc's prep) are in fact already marked as deprecated and will eventually be dropped entirely from QEMU. -- Andrea Bolognani / Red Hat / Virtualization -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Fri, Oct 05, 2018 at 11:38:14AM +0200, Andrea Bolognani wrote: > On Fri, 2018-10-05 at 09:54 +0100, Daniel P. Berrangé wrote: > > On Fri, Oct 05, 2018 at 10:34:33AM +0200, Jiri Denemark wrote: > > > But what if QEMU (or any other hypervisor) marks something (device > > > model, machine type) as deprecated and we use that deprecated value as > > > our default. Shouldn't we be able to tell about it to our users (here > > > runtime warnings could help) when they use such thing explicitly and > > > choose a different default value? That would help users with the > > > transition and once hypervisor drops support for it completely, fewer > > > existing domains will be affected since the recently created ones would > > > already use non-deprecated defaults. > > > > QEMU already emits warnings on stderr whenever something that is > > deprecated is used, and those end up in the libvirt log file for > > that guest. I don't think that we need to duplicate what QEMU > > is already reporting again. > > Warnings printed on stderr -> users and developers will actually > see them, be annoyed by them, eventually cave in and act upon them. > > Warnings written to a log -> nobody will notice them, until one day > things suddenly stop working apparently out of the blue. > > We might pretend that's not the case, but really, it is. You cannot expect that libvirt will duplicate all the deprecation warnings from all hypervisors for all models or machine types or other values that can be set to a lot of different attributes in our XML. Once we do that for machine type there will be requests to do it for other attributes as well. I think that this is a big NO and way out of scope of libvirt. Pavel -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Fri, 2018-10-05 at 11:56 +0200, Pavel Hrdina wrote: > On Fri, Oct 05, 2018 at 11:38:14AM +0200, Andrea Bolognani wrote: > > Warnings printed on stderr -> users and developers will actually > > see them, be annoyed by them, eventually cave in and act upon them. > > > > Warnings written to a log -> nobody will notice them, until one day > > things suddenly stop working apparently out of the blue. > > > > We might pretend that's not the case, but really, it is. > > You cannot expect that libvirt will duplicate all the deprecation > warnings from all hypervisors for all models or machine types or other > values that can be set to a lot of different attributes in our XML. > > Once we do that for machine type there will be requests to do it for > other attributes as well. I think that this is a big NO and way out > of scope of libvirt. Of course not, that would become very silly very fast - although in many cases we already go out of our way to report a libvirt error for configurations that would cause QEMU to ultimately bail out anyway, so it wouldn't be entirely unprecedented either. What I meant to say is that we should report deprecations of *libvirt* features to the user directly instead of stuffing them into a log file where they will possibly never be noticed until it's way too late. -- Andrea Bolognani / Red Hat / Virtualization -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Fri, Oct 05, 2018 at 11:38:14AM +0200, Andrea Bolognani wrote: > On Fri, 2018-10-05 at 09:54 +0100, Daniel P. Berrangé wrote: > > On Fri, Oct 05, 2018 at 10:34:33AM +0200, Jiri Denemark wrote: > > > But what if QEMU (or any other hypervisor) marks something (device > > > model, machine type) as deprecated and we use that deprecated value as > > > our default. Shouldn't we be able to tell about it to our users (here > > > runtime warnings could help) when they use such thing explicitly and > > > choose a different default value? That would help users with the > > > transition and once hypervisor drops support for it completely, fewer > > > existing domains will be affected since the recently created ones would > > > already use non-deprecated defaults. > > > > QEMU already emits warnings on stderr whenever something that is > > deprecated is used, and those end up in the libvirt log file for > > that guest. I don't think that we need to duplicate what QEMU > > is already reporting again. > > Warnings printed on stderr -> users and developers will actually > see them, be annoyed by them, eventually cave in and act upon them. > > Warnings written to a log -> nobody will notice them, until one day > things suddenly stop working apparently out of the blue. > > We might pretend that's not the case, but really, it is. Unless you're talking about a CLI tool (virt-install, virsh), there is no difference between those two scenarios. For virt-manager, virt-viewer, oVirt, OpenStack, KubeVirt, stderr is never going to be seen, it just ends up in a log file. So I don't find that distinction to be compelling. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Fri, 2018-10-05 at 12:05 +0100, Daniel P. Berrangé wrote: > On Fri, Oct 05, 2018 at 11:38:14AM +0200, Andrea Bolognani wrote: > > Warnings printed on stderr -> users and developers will actually > > see them, be annoyed by them, eventually cave in and act upon them. > > > > Warnings written to a log -> nobody will notice them, until one day > > things suddenly stop working apparently out of the blue. > > > > We might pretend that's not the case, but really, it is. > > Unless you're talking about a CLI tool (virt-install, virsh), there > is no difference between those two scenarios. For virt-manager, > virt-viewer, oVirt, OpenStack, KubeVirt, stderr is never going to > be seen, it just ends up in a log file. So I don't find that > distinction to be compelling. Sure, I used "stderr" as a shorthand for "whatever reporting method can be appropriately used by the application to shove a warning to the user's face" :) For a GUI application, it might be a dialog popping up or a message showing scrolling throuhg the status bar or what have you; for something headless, it might be an email delivered to the admin's inbox. The point is that the user should *not* be required to dig through logs to find out they've been using deprecated features: they should be *told* that's the case right when they do. -- Andrea Bolognani / Red Hat / Virtualization -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Picking up the thread again from here because it looks like as sensible a point as any. On Tue, 2018-10-02 at 18:26 +0200, Andrea Bolognani wrote: > q35 is what sparked the discussion, but it's far from the only > offender. For example, if I create a guest using > > $ virt-install \ > --os-variant fedora28 \ > ... > > then libosinfo will be queried for information about supported > network cards, and the resulting XML will look like > > <interface type='network'> > <model type='virtio'/> > ... > > However, libvirt's own default for x86_64 guests' network devices is > rtl8139, which means that if I later 'virsh edit' the guest or 'virsh > attach-device' a new network interface I will get that model instead > of virtio; [...] I'd say at this point it's fair to conclude that the consensus among libvirt developers is to stick to the existing compatibility promise and thus never deprecate, let alone drop, features; the question of whether defaults could ever change hasn't been answered in such a clear-cut manner, as far as I can tell, but it's also somewhat less relevant because I think we can all agree that having a single default value is simply not going to cut it these days. So applications are supposed to provide explicit values, obtained through libosinfo, so that they will keep working as other parts of the stack change around them; this will, however, not be enforced in any way but rather encouraged through documentation. This is a sensible enough plan, but it fails to address one important scenario: people using virsh directly to script changes and other operations on their guests - see above for an example of how that can end up not working as expected. Since virsh itself is basically just exposing raw libvirt APIs, it doesn't and shouldn't have much logic of its own, plus it falls under the same stability guarantees as the library itself, so we can't easily change it. The obvious solution is to have a *different* command line client that can alter guests the same way virsh can, but is not shipped as part of libvirt and can thus link against libosinfo as well as not having to follow the same strict stability guarantees. Pavel helpfully pointed out that such a client already exists: it's called virt-xml and it's part of virt-manager. It needs a few tweaks before it can really fit the bill, but once that's been taken care of you should be able to use something like $ virt-xml \ myguest \ --os-variant fedora28 \ --add-device \ --network network=default and get a virtio-net device, as you should, instead of rtl8139. Actually accepting --os-variant is the missing bit, but it should be fairly easy to do; not only that, but the existing proposal to store libosinfo metadata in the guest XML during installation has the potential to make even that entirely unnecessary! For installation you'll obviously want to use virt-install, but that's the case already since actually installing a guest from scratch using 'virsh define' is just madness :) IIUC the use case of importing an existing guest image without booting the guest at the same time is not covered, but once again that's only a bugfix away. So once we have these changes in place, command line users can be pretty much completely isolated from libvirt defaults, just like virt-manager and oVirt and Nova users. Then it will be up to us to actually advertise these alternatives and push users away from virsh[1] and towards them. I wonder if showing a message suggesting to use virt-xml instead when 'virsh edit' or 'virsh attach-device' are called would be considered acceptable at that point? [1] Only where it makes sense, of course: 'virsh start' and similar, for example, don't really need an alternative. -- Andrea Bolognani / Red Hat / Virtualization -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Wed, Oct 10, 2018 at 01:09:50PM +0200, Andrea Bolognani wrote: > Pavel helpfully pointed out that such a client already exists: > it's called virt-xml and it's part of virt-manager. > > It needs a few tweaks before it can really fit the bill, but once > that's been taken care of you should be able to use something like > > $ virt-xml \ > myguest \ > --os-variant fedora28 \ > --add-device \ > --network network=default > > and get a virtio-net device, as you should, instead of rtl8139. > > Actually accepting --os-variant is the missing bit, but it should > be fairly easy to do; not only that, but the existing proposal to > store libosinfo metadata in the guest XML during installation has > the potential to make even that entirely unnecessary! > > For installation you'll obviously want to use virt-install, but > that's the case already since actually installing a guest from > scratch using 'virsh define' is just madness :) IIUC the use case > of importing an existing guest image without booting the guest at > the same time is not covered, but once again that's only a bugfix > away. > > So once we have these changes in place, command line users can be > pretty much completely isolated from libvirt defaults, just like > virt-manager and oVirt and Nova users. Then it will be up to us > to actually advertise these alternatives and push users away from > virsh[1] and towards them. > > I wonder if showing a message suggesting to use virt-xml instead > when 'virsh edit' or 'virsh attach-device' are called would be > considered acceptable at that point? Depends what you mean by showing a message ? I'd be fine with the virsh man page referring people to virt-xml as a companion tool. I would certainly not expect invokation of 'virsh edit' to print any text on the console, as it will always be valid to want to use "virsh edit", "virsh atach-device" or any other command precisely because they are an almost direct passthrough to the libvirt API without trying to inject clever logic of their own. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On 10/12/2018 11:21 AM, Daniel P. Berrangé wrote: > On Wed, Oct 10, 2018 at 01:09:50PM +0200, Andrea Bolognani wrote: >> I wonder if showing a message suggesting to use virt-xml instead >> when 'virsh edit' or 'virsh attach-device' are called would be >> considered acceptable at that point? > Depends what you mean by showing a message ? I'd be fine with the > virsh man page referring people to virt-xml as a companion tool. > > I would certainly not expect invokation of 'virsh edit' to print > any text on the console, as it will always be valid to want to > use "virsh edit", "virsh atach-device" or any other command > precisely because they are an almost direct passthrough to the > libvirt API without trying to inject clever logic of their own. What if it was added to the host's MOTD? :-P https://bugzilla.redhat.com/show_bug.cgi?id=1635200\ -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Fri, 2018-10-12 at 16:21 +0100, Daniel P. Berrangé wrote: > On Wed, Oct 10, 2018 at 01:09:50PM +0200, Andrea Bolognani wrote: > > So once we have these changes in place, command line users can be > > pretty much completely isolated from libvirt defaults, just like > > virt-manager and oVirt and Nova users. Then it will be up to us > > to actually advertise these alternatives and push users away from > > virsh[1] and towards them. > > > > I wonder if showing a message suggesting to use virt-xml instead > > when 'virsh edit' or 'virsh attach-device' are called would be > > considered acceptable at that point? > > Depends what you mean by showing a message ? I'd be fine with the > virsh man page referring people to virt-xml as a companion tool. > > I would certainly not expect invokation of 'virsh edit' to print > any text on the console, as it will always be valid to want to > use "virsh edit", "virsh atach-device" or any other command > precisely because they are an almost direct passthrough to the > libvirt API without trying to inject clever logic of their own. Okay, let's forget the runtime messages then: we can mention virt-xml (and virt-install) in the documentation, write blog posts about them, and the like. Does the rest of the plan look reasonable to you? -- Andrea Bolognani / Red Hat / Virtualization -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Sun, Oct 14, 2018 at 01:41:27PM +0200, Andrea Bolognani wrote: > On Fri, 2018-10-12 at 16:21 +0100, Daniel P. Berrangé wrote: > > On Wed, Oct 10, 2018 at 01:09:50PM +0200, Andrea Bolognani wrote: > > > So once we have these changes in place, command line users can be > > > pretty much completely isolated from libvirt defaults, just like > > > virt-manager and oVirt and Nova users. Then it will be up to us > > > to actually advertise these alternatives and push users away from > > > virsh[1] and towards them. > > > > > > I wonder if showing a message suggesting to use virt-xml instead > > > when 'virsh edit' or 'virsh attach-device' are called would be > > > considered acceptable at that point? > > > > Depends what you mean by showing a message ? I'd be fine with the > > virsh man page referring people to virt-xml as a companion tool. > > > > I would certainly not expect invokation of 'virsh edit' to print > > any text on the console, as it will always be valid to want to > > use "virsh edit", "virsh atach-device" or any other command > > precisely because they are an almost direct passthrough to the > > libvirt API without trying to inject clever logic of their own. > > Okay, let's forget the runtime messages then: we can mention > virt-xml (and virt-install) in the documentation, write blog posts > about them, and the like. > > Does the rest of the plan look reasonable to you? Sure, I'm fine with docs updates. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Tue, Oct 02, 2018 at 04:14:39PM +0200, Andrea Bolognani wrote: >Background >========== > >We have plenty of features in libvirt, some of which were designed at >a time when the virtualization story was much more straightforward >than the multi-architecture, multi-hypervisor, multi-machine world we >currently live in and, while we have found ways to keep the APIs >chugging along, the result is sometimes somewhat confusing for users >and application developers, as well as requiring libvirt developers >themselves to spend quite a bit of collective time working around >decisions that, in hindsight, turn out to have been less than >fortunate. > >Two concrete examples are considered here: one is the >virConnectNumOfDomains() API which, while known to be racy and having >non-racy alternatives, can still be used by developers without >getting any kind of warning in the process; the other one is the >ability to define a domain without specifying the machine type, which >is becoming increasingly problematic now with some x86_64 features >being limited to q35 and downstreams looking to push for its >adoption, as well as being a manifestation of the more general >problem of libvirt's default being sometimes too conservative and at >odds with the existence of slimmed-down QEMU binaries being built >with reducing the total attack surface in mind. > >Having a proper deprecation story in libvirt would allow us to point >users and developers towards the recommended solution in each case, >be it using a different API or querying libosinfo for information; >after a generous grace period, we could then remove the problematic >functionality altogether. This would be a more conservative version >of the process we already have in place for dropping support for >older QEMU releases, which recently has allowed us to ax massive >chunks of effectively dead code and simplify parts of libvirt quite >significantly. > I don't think we'll ever be able to remove the functionality. If informing the app developers is the goal, a "Deprecated Features" section in news.xml could do the trick. Jano -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On 10/2/18 10:14 AM, Andrea Bolognani wrote: > Background > ========== > > We have plenty of features in libvirt, some of which were designed at > a time when the virtualization story was much more straightforward > than the multi-architecture, multi-hypervisor, multi-machine world we > currently live in and, while we have found ways to keep the APIs > chugging along, the result is sometimes somewhat confusing for users > and application developers, as well as requiring libvirt developers > themselves to spend quite a bit of collective time working around > decisions that, in hindsight, turn out to have been less than > fortunate. I've read the other responses and as difficult as it is to type - yeah, it seems we're stuck with what we have. Hard to undo old decisions as difficult as they make future life. Being stuck with those decisions makes future adjustments that much harder unless we can find a better way to fracture things between legacy and current (which yes has it's own downsides). Spending time coding and reviewing on "warnings" serves only those that keep up with the latest and greatest serves what purpose? Until it stops working, who notices? Will someone who develops code that hasn't kept up with the advances in virtualization see, know, or care? They have something that works and won't change it. They're probably also holding onto their Commodore VIC-20 hoping for a comeback, too. I would think the "bulk" of the scope is less old API's and more the default value problem. Is this a lazy or uninformed consumer problem? The default value problem seems to be limited to QEMU machine types and PCI model/type chosen. If QEMU changed to default to q35 and/or removed i440fx, then IIUC there'd be lots of problems. No small wonder so many vendors of physical systems go out of business. Bad defaults. That's not to say those that are left are providing systems with great defaults, just better than the others. The whole conversation seems to be a very circular mess/problem. Libvirt's history may be to not make policy decisions, but it does seem there was one core policy decision that is haunting us when trying to move forward. > > Two concrete examples are considered here: one is the > virConnectNumOfDomains() API which, while known to be racy and having > non-racy alternatives, can still be used by developers without > getting any kind of warning in the process; the other one is the > ability to define a domain without specifying the machine type, which > is becoming increasingly problematic now with some x86_64 features > being limited to q35 and downstreams looking to push for its > adoption, as well as being a manifestation of the more general > problem of libvirt's default being sometimes too conservative and at > odds with the existence of slimmed-down QEMU binaries being built > with reducing the total attack surface in mind. > I know a moot point now, but... connectNumOf[Defined]Domains may be used without calling connectList[Defined]Domains. The latter is actually the one I'd think is more buggy as opposed to using connectListAllDomains. I think the same argument can be made for other subsystems with both connectList and connectListAll (e.g. most of them). Still, since connectListAllDomains was introduced in 0.9.13 (01-Jul-2012) - I have to imagine anyone developing code in 2018 would use that instead of the old API's. > Having a proper deprecation story in libvirt would allow us to point > users and developers towards the recommended solution in each case, > be it using a different API or querying libosinfo for information; > after a generous grace period, we could then remove the problematic > functionality altogether. This would be a more conservative version > of the process we already have in place for dropping support for > older QEMU releases, which recently has allowed us to ax massive > chunks of effectively dead code and simplify parts of libvirt quite > significantly. We cannot deprecate old stuff, but can we declare new stuff comes with limited lifetimes? IOW: learn from our previous mistakes and not allow them to happen again. Maybe one of us will still be here in that lifetime and get to remove something! Even though we cannot deprecate, we could move stuff around such that certain things end up on *_legacy.{c,h} files and if problems arise in them, we could have a general "support" statement that would hopefully prod the up the stack app to try and enter the modern ages. Perhaps a time consuming short term with longer term benefits. > > This series explores one possible approach to the problem and aims > to spark project-wide discussion around the topic. Game on ;-)! This series only 'solves' or 'advises of' the problem for anyone that would install whatever libvirt version it got into for both client (virsh specific too) and libvirtd/API. Doesn't help other combinations. They wouldn't necessarily know the message occurred. Some random thoughts without putting too much thought into each. * Can we pull of the python-3 and python-2 magic trick with virsh? Perhaps creating virsh-legacy that gets fork/exec'd if we cannot find some new API we create circa 2018-2019 that knows how to only use newer API's and perhaps (new) API to generate newer/better defaults. I know that only solves the virsh problem, but one would think virsh-{python, perl, etc} would be able to pull off similar tricks. Other libvirt API consumers would be able to use the new API too. Some day soon they are going to have to solve the same problem especially if they provide defaults or have configurations using legacy or older options when something more newer is available. * Can we come up with a mechanism that allows domain admins to update defaults automagically? Perhaps some new attribute that indicates such a desire. * How about a new domine define flag that says pick the latest and greatest defaults. Of course that assumes someone is using the later and greater *Flags API too! * How about a new tools/virt* image that can be run providing information about existing [defined only?] domains that would point out which definitions may use old technology or at least informs a user what may need to change based on current technology? Thus when a domain is stopped, you run the tool, it changes (for example) the machine type and then runs the post parse and verify code and either automatically cleans up or allows the admin to keep updating until they fix things themselves. Puts the onus on someone else to make the "policy decision". * Can/should the Default Machine probing be split into a connect* type driver function and then used by new apps that care? * How much of the "default values" code for various machines or interconnects be split into loadable modules based on machine type? We're splitting cgroups into v1 and v2 right now. Could we do the same to have machine specific code in specific modules? Does i440fx use pci-e? IIRC q35 uses pci-e only. John > > > Further work > ============ > > * Fix the known issues listed below as well as all not-yet-known > issues that will undoubtably surface through discussion :) > > * Introduce a mechanism to catch use of deprecated APIs at build > time, similar to GLib's G_DISABLE_DEPRECATED, to help application > developers proactively move off problematic APIs. > > * Create a formal deprecation policy with well-defined rules and > time scales in the spirit of the existing one covering our > relationship with QEMU. > > > Know issues > =========== > > * For the more granular (and more interesting) type of deprecation > shown in patch 6/7, warnings are not being reported back to the > client as expected. I believe this is caused by the RPC code > looking for either a failure, in which case the virError is > transmitted, or a success, in which case the actual return value > is: we'll have to figure out a way for the error to travel across > the wire regardless of whether or not the API call was ultimately > successful if we want clients to actually receive warnings when > non-local drivers are involved. > > > Andrea Bolognani (7): > util: Add 'level' argument to virReportErrorHelper() > util: Introduce virReportWarning() > tools: Print warnings in virsh > util: Introduce VIR_ERR_DEPRECATED_FEATURE > Deprecate virConnectNumOfDomains() > Deprecate missing machine type in virDomainDefineXMLFlags() > tools: Force virsh to use deprecated features > > include/libvirt/virterror.h | 1 + > src/access/viraccessdriverpolkit.c | 2 +- > src/access/viraccessmanager.c | 2 +- > src/conf/domain_conf.c | 11 ++++++++--- > src/datatypes.h | 30 ++++++++++++++++++++++++++---- > src/libvirt-domain.c | 6 ++++++ > src/libvirt.c | 1 + > src/util/virbuffer.c | 4 ++-- > src/util/virconf.c | 6 ++++-- > src/util/virerror.c | 10 +++++++++- > src/util/virerror.h | 15 ++++++++++----- > src/util/virkeyfile.c | 6 ++++-- > src/util/virxml.c | 2 +- > tools/virsh-domain-monitor.c | 2 ++ > tools/vsh.c | 3 +++ > 15 files changed, 79 insertions(+), 22 deletions(-) > -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
© 2016 - 2023 Red Hat, Inc.