[libvirt] [PATCH 00/11] Avoid numerous calls of virQEMUCapsCacheLookup

Marc Hartmayer posted 11 patches 5 years, 6 months ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/libvirt tags/patchew/20180920174457.8698-1-mhartmay@linux.ibm.com
src/conf/domain_conf.c  | 98 ++++++++++++++++++++++++++---------------
src/conf/domain_conf.h  |  9 ++--
src/qemu/qemu_domain.c  | 82 +++++++++++++++++++---------------
src/qemu/qemu_domain.h  |  4 ++
src/qemu/qemu_process.c | 18 +++-----
src/vz/vz_driver.c      |  6 ++-
6 files changed, 128 insertions(+), 89 deletions(-)
[libvirt] [PATCH 00/11] Avoid numerous calls of virQEMUCapsCacheLookup
Posted by Marc Hartmayer 5 years, 6 months ago
For a domain definition there are numerous calls of
virQEMUCapsCacheLookup (the same applies to the domain start). This
slows down the process since virQEMUCapsCacheLookup validates that the
QEMU capabilitites are still valid (among other things, a fork is done
for this if the user for the QEMU processes is 'qemu'). Therefore
let's reduce the number of virQEMUCapsCacheLookup calls whenever
possible and reasonable.

In addition to the speed up, there is the risk that
virQEMUCapsCacheLookup returns different QEMU capabilities during a
task if, for example, the QEMU binary has changed during the task.

The correct way would be:

 - get the QEMU capabilities only once per task via virQEMUCapsCacheLookup
 - do the task with these QEMU capabilities

or

 - abort the task after a cache invalidation

Note: With this patch series the behavior is still not (completely)
fixed, but the performance has been significantly improved. In a quick
test this gave a speed up of factor 4 for a simple define/undefine
loop.

In general, the more devices a domain has, the more drastic the
overhead becomes (because a cache validation is performed for each
device).

Marc Hartmayer (11):
  qemu: Use VIR_STEAL_PTR macro
  qemu: Introduce qemuDomainUpdateQEMUCaps()
  qemu: Pass QEMUCaps to virDomainDefPostParse
  conf: Use getParseOpaque() in virDomainObjSetDefTransient
  conf: Add function description for virDomainDefPostParse
  conf: Get rid of virDomainDeviceDefPostParseOne
  conf: Extend virDomainDefValidate(Callback) for parseOpaque
  conf: Use domainPostParseData(Alloc|Free) in virDomainDefValidate
  qemu: Use @parseOpaque in qemuDomainDefValidate
  conf: Extend virDomainDeviceDefValidate(Callback) for parseOpaque
  qemu: Use @parseOpaque in qemuDomainDeviceDefValidate

 src/conf/domain_conf.c  | 98 ++++++++++++++++++++++++++---------------
 src/conf/domain_conf.h  |  9 ++--
 src/qemu/qemu_domain.c  | 82 +++++++++++++++++++---------------
 src/qemu/qemu_domain.h  |  4 ++
 src/qemu/qemu_process.c | 18 +++-----
 src/vz/vz_driver.c      |  6 ++-
 6 files changed, 128 insertions(+), 89 deletions(-)

-- 
2.17.0

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH 00/11] Avoid numerous calls of virQEMUCapsCacheLookup
Posted by Daniel P. Berrangé 5 years, 5 months ago
On Thu, Sep 20, 2018 at 07:44:46PM +0200, Marc Hartmayer wrote:
> For a domain definition there are numerous calls of
> virQEMUCapsCacheLookup (the same applies to the domain start). This
> slows down the process since virQEMUCapsCacheLookup validates that the
> QEMU capabilitites are still valid (among other things, a fork is done
> for this if the user for the QEMU processes is 'qemu'). Therefore
> let's reduce the number of virQEMUCapsCacheLookup calls whenever
> possible and reasonable.
> 
> In addition to the speed up, there is the risk that
> virQEMUCapsCacheLookup returns different QEMU capabilities during a
> task if, for example, the QEMU binary has changed during the task.
> 
> The correct way would be:
> 
>  - get the QEMU capabilities only once per task via virQEMUCapsCacheLookup
>  - do the task with these QEMU capabilities
> 
> or
> 
>  - abort the task after a cache invalidation
> 
> Note: With this patch series the behavior is still not (completely)
> fixed, but the performance has been significantly improved. In a quick
> test this gave a speed up of factor 4 for a simple define/undefine
> loop.
> 
> In general, the more devices a domain has, the more drastic the
> overhead becomes (because a cache validation is performed for each
> device).

IIUC from your KVM Forum presentation, the overhead of the
cache lookup is almost entirely coming from the fork() call
triggered by the single call

    kvmUsable = virFileAccessibleAs("/dev/kvm", R_OK | W_OK,
                                    priv->runUid, priv->runGid) == 0;

Rather than the major refactor of the way the parse callbacks
work, we could instead just optimize this call to avoid the
repeated forks.

Even if we reduced the number of calls to the cache, it is
still somewhat overkill to be checking /dev/kvm via fork()
every time. eg even if you reduced it to just a single
cache lookup, if you run virDomainDefine for 100 domains
in parallel it is still going to do 100 forks. 

We could optimize this by jcalling virFileAccessibleAs
once and storing the result in a global. Then just do a
plain stat() call in process to check the st_ctime field
for changes. We only need re-run the heavy virFileAccessibleAs
check if st_ctime has changed (indicating a owner/group/acl
change).

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH 00/11] Avoid numerous calls of virQEMUCapsCacheLookup
Posted by Bjoern Walk 5 years, 5 months ago
Daniel P. Berrangé <berrange@redhat.com> [2018-10-24, 10:43PM +0100]:
> We could optimize this by jcalling virFileAccessibleAs
> once and storing the result in a global. Then just do a
> plain stat() call in process to check the st_ctime field
> for changes. We only need re-run the heavy virFileAccessibleAs
> check if st_ctime has changed (indicating a owner/group/acl
> change).

But can't access permission change outside of changing the actual device
file (e.g. cgroups or other OS capabilities)? Isn't that the whole
purpose of the virFileAccessibleAs gymnastics?

-- 
IBM Systems
Linux on Z & Virtualization Development
--------------------------------------------------
IBM Deutschland Research & Development GmbH
Schönaicher Str. 220, 71032 Böblingen
Phone: +49 7031 16 1819
--------------------------------------------------
Vorsitzende des Aufsichtsrats: Martina Koederitz
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294 
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH 00/11] Avoid numerous calls of virQEMUCapsCacheLookup
Posted by Daniel P. Berrangé 5 years, 5 months ago
On Thu, Oct 25, 2018 at 01:47:26PM +0200, Bjoern Walk wrote:
> Daniel P. Berrangé <berrange@redhat.com> [2018-10-24, 10:43PM +0100]:
> > We could optimize this by jcalling virFileAccessibleAs
> > once and storing the result in a global. Then just do a
> > plain stat() call in process to check the st_ctime field
> > for changes. We only need re-run the heavy virFileAccessibleAs
> > check if st_ctime has changed (indicating a owner/group/acl
> > change).
> 
> But can't access permission change outside of changing the actual device
> file (e.g. cgroups or other OS capabilities)? Isn't that the whole
> purpose of the virFileAccessibleAs gymnastics?

Yes, cgroups could restrict access to /dev/kvm, but virFileAccessibleAs
does not use cgroups, it only cares about using the correct user + group
membership. 

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH 00/11] Avoid numerous calls of virQEMUCapsCacheLookup
Posted by Bjoern Walk 5 years, 5 months ago
Daniel P. Berrangé <berrange@redhat.com> [2018-10-25, 06:32PM +0100]:
> On Thu, Oct 25, 2018 at 01:47:26PM +0200, Bjoern Walk wrote:
> > Daniel P. Berrangé <berrange@redhat.com> [2018-10-24, 10:43PM +0100]:
> > > We could optimize this by jcalling virFileAccessibleAs
> > > once and storing the result in a global. Then just do a
> > > plain stat() call in process to check the st_ctime field
> > > for changes. We only need re-run the heavy virFileAccessibleAs
> > > check if st_ctime has changed (indicating a owner/group/acl
> > > change).
> > 
> > But can't access permission change outside of changing the actual device
> > file (e.g. cgroups or other OS capabilities)? Isn't that the whole
> > purpose of the virFileAccessibleAs gymnastics?
> 
> Yes, cgroups could restrict access to /dev/kvm, but virFileAccessibleAs
> does not use cgroups, it only cares about using the correct user + group
> membership. 

Sorry, but then I don't understand the purpose of this function at all.
Why would you EVER check permissions like that? A simple stat(2) call
should give you the exact same information, no?

-- 
IBM Systems
Linux on Z & Virtualization Development
--------------------------------------------------
IBM Deutschland Research & Development GmbH
Schönaicher Str. 220, 71032 Böblingen
Phone: +49 7031 16 1819
--------------------------------------------------
Vorsitzende des Aufsichtsrats: Martina Koederitz
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294 
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH 00/11] Avoid numerous calls of virQEMUCapsCacheLookup
Posted by Daniel P. Berrangé 5 years, 5 months ago
On Fri, Oct 26, 2018 at 06:57:54AM +0200, Bjoern Walk wrote:
> Daniel P. Berrangé <berrange@redhat.com> [2018-10-25, 06:32PM +0100]:
> > On Thu, Oct 25, 2018 at 01:47:26PM +0200, Bjoern Walk wrote:
> > > Daniel P. Berrangé <berrange@redhat.com> [2018-10-24, 10:43PM +0100]:
> > > > We could optimize this by jcalling virFileAccessibleAs
> > > > once and storing the result in a global. Then just do a
> > > > plain stat() call in process to check the st_ctime field
> > > > for changes. We only need re-run the heavy virFileAccessibleAs
> > > > check if st_ctime has changed (indicating a owner/group/acl
> > > > change).
> > > 
> > > But can't access permission change outside of changing the actual device
> > > file (e.g. cgroups or other OS capabilities)? Isn't that the whole
> > > purpose of the virFileAccessibleAs gymnastics?
> > 
> > Yes, cgroups could restrict access to /dev/kvm, but virFileAccessibleAs
> > does not use cgroups, it only cares about using the correct user + group
> > membership. 
> 
> Sorry, but then I don't understand the purpose of this function at all.
> Why would you EVER check permissions like that? A simple stat(2) call
> should give you the exact same information, no?

The capabilities probing runs as the 'qemu' user and 'qemu' group.
If libvirtd stat()s the /dev/kvm device, it will merely find out
if the *root* user can access it. We need to chcek if the qemu/qemu
user/group can access it, and if you try todo via a stat() then you
have to reinvent the kernel logic for checking supplementary group
memberships and file ACL membership. The only sensible way is for
the process checking to actually be running as the qemu/qemu user
and group, hence we fork and change to this user/group pair.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH 00/11] Avoid numerous calls of virQEMUCapsCacheLookup
Posted by Marc Hartmayer 5 years, 5 months ago
On Wed, Oct 24, 2018 at 11:43 PM +0200, "Daniel P. Berrangé" <berrange@redhat.com> wrote:
> On Thu, Sep 20, 2018 at 07:44:46PM +0200, Marc Hartmayer wrote:
>> For a domain definition there are numerous calls of
>> virQEMUCapsCacheLookup (the same applies to the domain start). This
>> slows down the process since virQEMUCapsCacheLookup validates that the
>> QEMU capabilitites are still valid (among other things, a fork is done
>> for this if the user for the QEMU processes is 'qemu'). Therefore
>> let's reduce the number of virQEMUCapsCacheLookup calls whenever
>> possible and reasonable.
>>
>> In addition to the speed up, there is the risk that
>> virQEMUCapsCacheLookup returns different QEMU capabilities during a
>> task if, for example, the QEMU binary has changed during the task.
>>
>> The correct way would be:
>>
>>  - get the QEMU capabilities only once per task via virQEMUCapsCacheLookup
>>  - do the task with these QEMU capabilities
>>
>> or
>>
>>  - abort the task after a cache invalidation
>>
>> Note: With this patch series the behavior is still not (completely)
>> fixed, but the performance has been significantly improved. In a quick
>> test this gave a speed up of factor 4 for a simple define/undefine
>> loop.
>>
>> In general, the more devices a domain has, the more drastic the
>> overhead becomes (because a cache validation is performed for each
>> device).
>
> IIUC from your KVM Forum presentation, the overhead of the
> cache lookup is almost entirely coming from the fork() call
> triggered by the single call
>
>     kvmUsable = virFileAccessibleAs("/dev/kvm", R_OK | W_OK,
>                                     priv->runUid, priv->runGid) == 0;
>
> Rather than the major refactor of the way the parse callbacks
> work, we could instead just optimize this call to avoid the
> repeated forks.
>
> Even if we reduced the number of calls to the cache, it is
> still somewhat overkill to be checking /dev/kvm via fork()
> every time. eg even if you reduced it to just a single
> cache lookup, if you run virDomainDefine for 100 domains
> in parallel it is still going to do 100 forks.
>
> We could optimize this by jcalling virFileAccessibleAs
> once and storing the result in a global. Then just do a
> plain stat() call in process to check the st_ctime field
> for changes. We only need re-run the heavy virFileAccessibleAs
> check if st_ctime has changed (indicating a owner/group/acl
> change).

Okay, if that’s fine I’ll add this as an additional patch to this
series. Because the various virQEMUCapsCacheLookup calls still seem to
be wrong to me.

 Marc

>
> Regards,
> Daniel
> --
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
>
--
Kind regards / Beste Grüße
   Marc Hartmayer

IBM Deutschland Research & Development GmbH
Vorsitzende des Aufsichtsrats: Martina Koederitz
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294


--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list