[v1] More virsecretobj changes to prepare for common object

[libvirt] [PATCH 8/8] secret: Handle object list removal and deletion properly

Posted by John Ferlan 8 years, 4 months ago

Rather than rely on virSecretObjEndAPI to make the final virObjectUnref
after the call virSecretObjListRemove, be more explicit by calling
virObjectUnref and setting @obj to NULL for secretUndefine and in
the error path of secretDefineXML.

This also fixes a leak during virSecretLoad if the virSecretLoadValue
fails the code jumps to cleanup without setting @ret = obj, thus calling
virSecretObjListRemove which only accounts for the object reference
related to adding the object to the list during virSecretObjListAdd,
but does not account for the reference to the object itself as the
return of @ret would be NULL so the caller wouldn't call virSecretObjEndAPI
on the object recently added thus reducing the refcnt to zero. Thus
cleaning up the virSecretLoadValue error path to make it clearer what
needs to be done on failure.

Signed-off-by: John Ferlan <jferlan@redhat.com>
---
 src/conf/virsecretobj.c    | 14 ++++++--------
 src/secret/secret_driver.c |  9 +++++++--
 2 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/src/conf/virsecretobj.c b/src/conf/virsecretobj.c
index ca13cf8..26646dd 100644
--- a/src/conf/virsecretobj.c
+++ b/src/conf/virsecretobj.c
@@ -918,7 +918,6 @@ virSecretLoad(virSecretObjListPtr secrets,
 {
     virSecretDefPtr def = NULL;
     virSecretObjPtr obj = NULL;
-    virSecretObjPtr ret = NULL;
 
     if (!(def = virSecretDefParseFile(path)))
         goto cleanup;
@@ -930,16 +929,15 @@ virSecretLoad(virSecretObjListPtr secrets,
         goto cleanup;
     def = NULL;
 
-    if (virSecretLoadValue(obj) < 0)
-        goto cleanup;
-
-    ret = obj;
-    obj = NULL;
+    if (virSecretLoadValue(obj) < 0) {
+        virSecretObjListRemove(secrets, obj);
+        virObjectUnref(obj);
+        obj = NULL;
+    }
 
  cleanup:
-    virSecretObjListRemove(secrets, obj);
     virSecretDefFree(def);
-    return ret;
+    return obj;
 }
 
 
diff --git a/src/secret/secret_driver.c b/src/secret/secret_driver.c
index 32cd8bb..14483fd 100644
--- a/src/secret/secret_driver.c
+++ b/src/secret/secret_driver.c
@@ -268,10 +268,13 @@ secretDefineXML(virConnectPtr conn,
      * the backup. The current def will be handled below.
      * Otherwise, this is a new secret, thus remove it.
      */
-    if (backup)
+    if (backup) {
         virSecretObjSetDef(obj, backup);
-    else
+    } else {
         virSecretObjListRemove(driver->secrets, obj);
+        virObjectUnref(obj);
+        obj = NULL;
+    }
 
  cleanup:
     virSecretDefFree(def);
@@ -411,6 +414,8 @@ secretUndefine(virSecretPtr secret)
     virSecretObjDeleteData(obj);
 
     virSecretObjListRemove(driver->secrets, obj);
+    virObjectUnref(obj);
+    obj = NULL;
 
     ret = 0;
 
-- 
2.9.4

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [PATCH 8/8] secret: Handle object list removal and deletion properly

Posted by Michal Privoznik 8 years, 3 months ago

On 06/03/2017 03:27 PM, John Ferlan wrote:
> Rather than rely on virSecretObjEndAPI to make the final virObjectUnref
> after the call virSecretObjListRemove, be more explicit by calling
> virObjectUnref and setting @obj to NULL for secretUndefine and in
> the error path of secretDefineXML.
> 
> This also fixes a leak during virSecretLoad if the virSecretLoadValue
> fails the code jumps to cleanup without setting @ret = obj, thus calling
> virSecretObjListRemove which only accounts for the object reference
> related to adding the object to the list during virSecretObjListAdd,
> but does not account for the reference to the object itself as the
> return of @ret would be NULL so the caller wouldn't call virSecretObjEndAPI
> on the object recently added thus reducing the refcnt to zero. Thus
> cleaning up the virSecretLoadValue error path to make it clearer what
> needs to be done on failure.

I think the real reason is that we cannot call virSecretObjEndAPI()
because that *unlocks* the secret object. However, the object is already
unlocked at that point by virSecretObjListRemove() and thus we would
unlock twice while locking just once. Honestly, I'd rather see that
explanation in the commit message. But it's your call.

> 
> Signed-off-by: John Ferlan <jferlan@redhat.com>
> ---
>  src/conf/virsecretobj.c    | 14 ++++++--------
>  src/secret/secret_driver.c |  9 +++++++--
>  2 files changed, 13 insertions(+), 10 deletions(-)

ACK

Michal

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [PATCH 8/8] secret: Handle object list removal and deletion properly

Posted by John Ferlan 8 years, 3 months ago

On 07/11/2017 11:52 AM, Michal Privoznik wrote:
> On 06/03/2017 03:27 PM, John Ferlan wrote:
>> Rather than rely on virSecretObjEndAPI to make the final virObjectUnref
>> after the call virSecretObjListRemove, be more explicit by calling
>> virObjectUnref and setting @obj to NULL for secretUndefine and in
>> the error path of secretDefineXML.
>>
>> This also fixes a leak during virSecretLoad if the virSecretLoadValue
>> fails the code jumps to cleanup without setting @ret = obj, thus calling
>> virSecretObjListRemove which only accounts for the object reference
>> related to adding the object to the list during virSecretObjListAdd,
>> but does not account for the reference to the object itself as the
>> return of @ret would be NULL so the caller wouldn't call virSecretObjEndAPI
>> on the object recently added thus reducing the refcnt to zero. Thus
>> cleaning up the virSecretLoadValue error path to make it clearer what
>> needs to be done on failure.
> 
> I think the real reason is that we cannot call virSecretObjEndAPI()
> because that *unlocks* the secret object. However, the object is already
> unlocked at that point by virSecretObjListRemove() and thus we would
> unlock twice while locking just once. Honestly, I'd rather see that
> explanation in the commit message. But it's your call.
> 

Partially true - although calling Unlock on something already Unlocked
by the same thread IIRC doesn't do much other than cause an error, but
we don't fail on that error.

It's ironic this relates to something Erik and I discussed during the
virNodeDev* changes w/r/t the "owner" (driver) after the Add/Append now
has a reference on the object and would always need to Unref it even
after removing it from the List.

This just happened to be where I had that oh sh*t moment and realized
that when calling Remove we were essentially unlocking and thus yes
calling EndAPI would unlock and unlocked object.  I'll add something in
about that..

John

As an aside, this is exactly why I started down the path of common
objects. Consider how the callers to virDomainObjListAdd go through
great lengths to managed the returned object some quite differently
based on what they know about how many refs are returned and whether the
calling function calls virDomainObjEndAPI.  If it does, there's always a
virObjectRef done on the object after the Add returns. It's a subtle
thing, but confusing nonetheless.

The thing is the *Remove API will call virHashRemoveEntry which
decrements a refcnt for each table; however, the *Add API only
increments the refcnt once for adding into each table.  Callers are very
careful to understand and manage that.

I'd rather see that mgmt go away. It should be "safe" to call EndAPI obj
the @obj regardless if Add or Lookup was used.  The callers shouldn't
have to know they need to either use *EndAPI or ObjUnlock.  In any case,
I digress and that's a different issue for another day

>>
>> Signed-off-by: John Ferlan <jferlan@redhat.com>
>> ---
>>  src/conf/virsecretobj.c    | 14 ++++++--------
>>  src/secret/secret_driver.c |  9 +++++++--
>>  2 files changed, 13 insertions(+), 10 deletions(-)
> 
> ACK
> 
> Michal
> 

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [PATCH 8/8] secret: Handle object list removal and deletion properly

Posted by Michal Privoznik 8 years, 3 months ago

On 07/13/2017 07:19 PM, John Ferlan wrote:
> 
> 
> On 07/11/2017 11:52 AM, Michal Privoznik wrote:
>> On 06/03/2017 03:27 PM, John Ferlan wrote:
>>> Rather than rely on virSecretObjEndAPI to make the final virObjectUnref
>>> after the call virSecretObjListRemove, be more explicit by calling
>>> virObjectUnref and setting @obj to NULL for secretUndefine and in
>>> the error path of secretDefineXML.
>>>
>>> This also fixes a leak during virSecretLoad if the virSecretLoadValue
>>> fails the code jumps to cleanup without setting @ret = obj, thus calling
>>> virSecretObjListRemove which only accounts for the object reference
>>> related to adding the object to the list during virSecretObjListAdd,
>>> but does not account for the reference to the object itself as the
>>> return of @ret would be NULL so the caller wouldn't call virSecretObjEndAPI
>>> on the object recently added thus reducing the refcnt to zero. Thus
>>> cleaning up the virSecretLoadValue error path to make it clearer what
>>> needs to be done on failure.
>>
>> I think the real reason is that we cannot call virSecretObjEndAPI()
>> because that *unlocks* the secret object. However, the object is already
>> unlocked at that point by virSecretObjListRemove() and thus we would
>> unlock twice while locking just once. Honestly, I'd rather see that
>> explanation in the commit message. But it's your call.
>>
> 
> Partially true - although calling Unlock on something already Unlocked
> by the same thread IIRC doesn't do much other than cause an error, but
> we don't fail on that error.

I don't think this is true. Just consider the following general example:

    Thread A   |   Thread B
------------------------------
1)  lock(X)    |
2)             |  lock(x)
3) unlock(X)   |
4) unlock(X)   |
5)             |  work(X)

In this scenario, thread B starts working on X thinking it's locked
while it isn't. Also, man-page for pthread_mutex_lock says that
unlocking an unlocked lock leads to undefined behaviour (we use NORMAL
non-robust mutexes. And yes, I had to search for what does it mean
non-robust mutex).

> 
> It's ironic this relates to something Erik and I discussed during the
> virNodeDev* changes w/r/t the "owner" (driver) after the Add/Append now
> has a reference on the object and would always need to Unref it even
> after removing it from the List.

About that. I think that if we do our ref counting right, we don't need
to do any special unref(). I mean:

undefine(X) {
  obj = lookup(X);  // obj now has refcnt = 2, one beacuse it's in the
                    // list, one for the undefine()
  listRemove(obj);  // obj.refcnt = 1, the list reference is gone
  endApi(&obj);     // undefine() reference is gone, obj is freed
}

Of course, if list uses two hash tables (one for UUIDs one for names) it
holds two references, but then listRemove() also unref()-s twice.

> 
> This just happened to be where I had that oh sh*t moment and realized
> that when calling Remove we were essentially unlocking and thus yes
> calling EndAPI would unlock and unlocked object.  I'll add something in
> about that..
> 
> John
> 
> As an aside, this is exactly why I started down the path of common
> objects. Consider how the callers to virDomainObjListAdd go through
> great lengths to managed the returned object some quite differently
> based on what they know about how many refs are returned and whether the
> calling function calls virDomainObjEndAPI.  If it does, there's always a
> virObjectRef done on the object after the Add returns. It's a subtle
> thing, but confusing nonetheless.
> 
> The thing is the *Remove API will call virHashRemoveEntry which
> decrements a refcnt for each table; however, the *Add API only
> increments the refcnt once for adding into each table.  Callers are very
> careful to understand and manage that.

I haven't looked into the code, but if this is true then listAdd() and
listRemove() functions are broken because they mangle refcount. It
should be able to do the following and having the refcount of an object
unchanged at the end:

listAdd(obj);
listRemove(obj);

If the refcount is not the same at the end as it was at the beginning,
the list functions are broken and need to be fixed

> 
> I'd rather see that mgmt go away. It should be "safe" to call EndAPI obj
> the @obj regardless if Add or Lookup was used.  The callers shouldn't
> have to know they need to either use *EndAPI or ObjUnlock.  In any case,
> I digress and that's a different issue for another day

I totally agree on this!

Michal

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [PATCH 8/8] secret: Handle object list removal and deletion properly

Posted by John Ferlan 8 years, 3 months ago

On 07/14/2017 04:15 AM, Michal Privoznik wrote:
> On 07/13/2017 07:19 PM, John Ferlan wrote:
>>
>>
>> On 07/11/2017 11:52 AM, Michal Privoznik wrote:
>>> On 06/03/2017 03:27 PM, John Ferlan wrote:
>>>> Rather than rely on virSecretObjEndAPI to make the final virObjectUnref
>>>> after the call virSecretObjListRemove, be more explicit by calling
>>>> virObjectUnref and setting @obj to NULL for secretUndefine and in
>>>> the error path of secretDefineXML.
>>>>
>>>> This also fixes a leak during virSecretLoad if the virSecretLoadValue
>>>> fails the code jumps to cleanup without setting @ret = obj, thus calling
>>>> virSecretObjListRemove which only accounts for the object reference
>>>> related to adding the object to the list during virSecretObjListAdd,
>>>> but does not account for the reference to the object itself as the
>>>> return of @ret would be NULL so the caller wouldn't call virSecretObjEndAPI
>>>> on the object recently added thus reducing the refcnt to zero. Thus
>>>> cleaning up the virSecretLoadValue error path to make it clearer what
>>>> needs to be done on failure.
>>>
>>> I think the real reason is that we cannot call virSecretObjEndAPI()
>>> because that *unlocks* the secret object. However, the object is already
>>> unlocked at that point by virSecretObjListRemove() and thus we would
>>> unlock twice while locking just once. Honestly, I'd rather see that
>>> explanation in the commit message. But it's your call.
>>>
>>
>> Partially true - although calling Unlock on something already Unlocked
>> by the same thread IIRC doesn't do much other than cause an error, but
>> we don't fail on that error.
> 
> I don't think this is true. Just consider the following general example:
> 
>     Thread A   |   Thread B
> ------------------------------
> 1)  lock(X)    |
> 2)             |  lock(x)
> 3) unlock(X)   |
> 4) unlock(X)   |
> 5)             |  work(X)
> 
> In this scenario, thread B starts working on X thinking it's locked
> while it isn't. Also, man-page for pthread_mutex_lock says that
> unlocking an unlocked lock leads to undefined behaviour (we use NORMAL
> non-robust mutexes. And yes, I had to search for what does it mean
> non-robust mutex).
> 

I assume that step 2 is supposed to be (X) and not (x)?

In your scenario above ThreadB gains the lock after step 3 since that's
the moment in time when ThreadA gives it up. When ThreadA goes to unlock
in step 4, it fails because it doesn't own the lock as only ThreadB can
unlock. Without ThreadB in the picture, then the second unlock would be
a failure because the thread isn't locked, cannot unlock an unlocked object.

IMO:  undefined behavior is what I'd call a way to say you could get an
error that the lock is already unlocked or you could get an error that
you don't own the lock to unlock, but we don't want to explain that so
we'll just it's undefined. For some reason I have this recollection
about fork'd children that really make things crazy, but I'm not sure I
can put it into words.

>>
>> It's ironic this relates to something Erik and I discussed during the
>> virNodeDev* changes w/r/t the "owner" (driver) after the Add/Append now
>> has a reference on the object and would always need to Unref it even
>> after removing it from the List.
> 
> About that. I think that if we do our ref counting right, we don't need
> to do any special unref(). I mean:
> 
> undefine(X) {
>   obj = lookup(X);  // obj now has refcnt = 2, one beacuse it's in the
>                     // list, one for the undefine()
>   listRemove(obj);  // obj.refcnt = 1, the list reference is gone
>   endApi(&obj);     // undefine() reference is gone, obj is freed
> }
> 
> Of course, if list uses two hash tables (one for UUIDs one for names) it
> holds two references, but then listRemove() also unref()-s twice.
> 

When one enters a listRemove function, there will be either 2 or 3 refs
on the object (1 or 2 hash tables) and the object is locked.

When one leaves a listRemove function, there is 1 reference on the
object and the object is unlocked.

So is the problem that listRemove shouldn't make that last Unlock? Or
that the caller shouldn't make an EndAPI call after calling Remove?

The corollary is that the listAdd would leave with 2 or 3 refs (again
number of hash tables) and the object locked.  The caller is then
expected to run EndAPI. The obj is then left with 1 or 2 refs because
it's in a table and not because it's being used again (since EndAPI will
set *obj = NULL).

My "chosen" solution is to keep the Unlock in listRemove, but I can be
swayed to have listRemove keep the lock thus having callers use EndAPI
instead of Unref. That would be something I would want to propagate to
other series so as to keep things similar between all consumers.

>>
>> This just happened to be where I had that oh sh*t moment and realized
>> that when calling Remove we were essentially unlocking and thus yes
>> calling EndAPI would unlock and unlocked object.  I'll add something in
>> about that..
>>
>> John
>>
>> As an aside, this is exactly why I started down the path of common
>> objects. Consider how the callers to virDomainObjListAdd go through
>> great lengths to managed the returned object some quite differently
>> based on what they know about how many refs are returned and whether the
>> calling function calls virDomainObjEndAPI.  If it does, there's always a
>> virObjectRef done on the object after the Add returns. It's a subtle
>> thing, but confusing nonetheless.
>>
>> The thing is the *Remove API will call virHashRemoveEntry which
>> decrements a refcnt for each table; however, the *Add API only
>> increments the refcnt once for adding into each table.  Callers are very
>> careful to understand and manage that.
> 
> I haven't looked into the code, but if this is true then listAdd() and
> listRemove() functions are broken because they mangle refcount. It
> should be able to do the following and having the refcount of an object
> unchanged at the end:
> 
> listAdd(obj);
> listRemove(obj);
> 
> If the refcount is not the same at the end as it was at the beginning,
> the list functions are broken and need to be fixed
> 

Go check callers of virDomainObjListAdd... Every single qemu case will
always virObjectRef(vm) after a call... Other consumers will sometimes
use EndAPI and other times use Unlock based primarily on whether they
called Ref as well. It's at best confusing. The problem in my mind is
that the Add was only refcnt'ing once for two lists while the Remove was
having two unref's.

I think if obj = Add is successful, then it's still up to whatever code
does the Remove(obj) code to subsequently indicate it's really done with
@obj by performing the Unref.  At least that's the goal I've been
working towards...

John

>>
>> I'd rather see that mgmt go away. It should be "safe" to call EndAPI obj
>> the @obj regardless if Add or Lookup was used.  The callers shouldn't
>> have to know they need to either use *EndAPI or ObjUnlock.  In any case,
>> I digress and that's a different issue for another day
> 
> I totally agree on this!
> 
> Michal
> 

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list