[libvirt] [PATCH] qemu: agent: Reset agentError when qemuConnectAgent success

Wang Yechao posted 1 patch 5 years, 6 months ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/libvirt tags/patchew/1538048228-21580-1-git-send-email-wang.yechao255@zte.com.cn
src/qemu/qemu_process.c | 1 +
1 file changed, 1 insertion(+)
[libvirt] [PATCH] qemu: agent: Reset agentError when qemuConnectAgent success
Posted by Wang Yechao 5 years, 6 months ago
qemuAgentClose and qemuAgentIO have race condition,
as follows:

main thread:                          second thread:
virEventPollRunOnce                processSerialChangedEvent
  virEventPollDispatchHandles
    virMutexUnlock(&eventLoop.lock)
                                      qemuAgentClose
                                        virObjectLock(mon)
                                        virEventRemoveHandle
                                        VIR_FORCE_CLOSE(mon->fd)
                                        virObjectUnlock(mon)
                                      priv->agentError = false
    qemuAgentIO
      virObjectLock(mon)
        mon->fd != fd --> error = true
        qemuProcessHandleAgentError
          priv->agentError = true
      virObjectUnlock(mon)
    virMutexLock(&eventLoop.lock)

qemuAgentClose set the mon->fd to '-1', and then qemuAgentIO
check the mon->fd not equals to fd that registered before.
qemuProcessHandleAgentError will be called to set
priv->agentError to 'true', then the priv->agentError is
always 'true' except restart libvirtd or restart
qemu-guest-agent process in guest. We can't send any
qemu-agent-command anymore even if qemuConnectAgent return
success later.

This is accidently occurs when hot-add-vcpu in windows2012.
  virsh setvcpus ...
  virsh qemu-agent-command $vm '{"execute":"guest-get-vcpus"}'

Reset the priv->agentError to 'false' when qemuConnectAgent sucess
to fix this problem.

Signed-off-by: Wang Yechao <wang.yechao255@zte.com.cn>
---
 src/qemu/qemu_process.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 29b0ba1..4fbb955 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -269,6 +269,7 @@ qemuConnectAgent(virQEMUDriverPtr driver, virDomainObjPtr vm)
         virResetLastError();
     }
 
+    priv->agentError = false;
     return 0;
 }
 
-- 
1.8.3.1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH] qemu: agent: Reset agentError when qemuConnectAgent success
Posted by Nikolay Shirokovskiy 5 years, 6 months ago
On 27.09.2018 14:37, Wang Yechao wrote:
> qemuAgentClose and qemuAgentIO have race condition,
> as follows:
> 
> main thread:                          second thread:
> virEventPollRunOnce                processSerialChangedEvent
>   virEventPollDispatchHandles
>     virMutexUnlock(&eventLoop.lock)
>                                       qemuAgentClose
>                                         virObjectLock(mon)
>                                         virEventRemoveHandle
>                                         VIR_FORCE_CLOSE(mon->fd)
>                                         virObjectUnlock(mon)
>                                       priv->agentError = false
>     qemuAgentIO
>       virObjectLock(mon)
>         mon->fd != fd --> error = true
>         qemuProcessHandleAgentError
>           priv->agentError = true
>       virObjectUnlock(mon)
>     virMutexLock(&eventLoop.lock)
> 
> qemuAgentClose set the mon->fd to '-1', and then qemuAgentIO
> check the mon->fd not equals to fd that registered before.
> qemuProcessHandleAgentError will be called to set
> priv->agentError to 'true', then the priv->agentError is
> always 'true' except restart libvirtd or restart
> qemu-guest-agent process in guest. We can't send any
> qemu-agent-command anymore even if qemuConnectAgent return
> success later.
> 
> This is accidently occurs when hot-add-vcpu in windows2012.
>   virsh setvcpus ...
>   virsh qemu-agent-command $vm '{"execute":"guest-get-vcpus"}'
> 
> Reset the priv->agentError to 'false' when qemuConnectAgent sucess
> to fix this problem.
> 
> Signed-off-by: Wang Yechao <wang.yechao255@zte.com.cn>
> ---
>  src/qemu/qemu_process.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
> index 29b0ba1..4fbb955 100644
> --- a/src/qemu/qemu_process.c
> +++ b/src/qemu/qemu_process.c
> @@ -269,6 +269,7 @@ qemuConnectAgent(virQEMUDriverPtr driver, virDomainObjPtr vm)
>          virResetLastError();
>      }
>  
> +    priv->agentError = false;
>      return 0;
>  }
>  
> 

There was similar problem with qemu monitor:

commit 89563efc0209b854d2b2e554423423d7602acdbd
Author: Daniel P. Berrange <berrange@redhat.com>
Date:   Fri Sep 28 15:27:39 2012 +0100

    Avoid bogus I/O event errors when closing the QEMU monitor
    
    After calling qemuMonitorClose(), it is still possible for
    the QEMU monitor I/O event callback to get invoked. This
    will trigger an error message because mon->fd has been set
    to -1 at this point. Silently ignore the case where mon->fd
    is -1, likewise for mon->watch being zero.
    
    Signed-off-by: Daniel P. Berrange <berrange@redhat.com

This approach has advantage it does not set agentError at all. So clients
see agent state as disconnected (which is true) instead of error.

Nikolay

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt][PATCH] qemu: agent: Reset agentError whenqemuConnectAgent success
Posted by wang.yechao255@zte.com.cn 5 years, 6 months ago
> On 27.09.2018 14:37, Wang Yechao wrote:
> > qemuAgentClose and qemuAgentIO have race condition,
> > as follows:
> >
> > main thread:                          second thread:
> > virEventPollRunOnce                processSerialChangedEvent
> >   virEventPollDispatchHandles
> >     virMutexUnlock(&eventLoop.lock)
> >                                       qemuAgentClose
> >                                         virObjectLock(mon)
> >                                         virEventRemoveHandle
> >                                         VIR_FORCE_CLOSE(mon->fd)
> >                                         virObjectUnlock(mon)
> >                                       priv->agentError = false
> >     qemuAgentIO
> >       virObjectLock(mon)
> >         mon->fd != fd --> error = true
> >         qemuProcessHandleAgentError
> >           priv->agentError = true
> >       virObjectUnlock(mon)
> >     virMutexLock(&eventLoop.lock)
> >
> > qemuAgentClose set the mon->fd to '-1', and then qemuAgentIO
> > check the mon->fd not equals to fd that registered before.
> > qemuProcessHandleAgentError will be called to set
> > priv->agentError to 'true', then the priv->agentError is
> > always 'true' except restart libvirtd or restart
> > qemu-guest-agent process in guest. We can't send any
> > qemu-agent-command anymore even if qemuConnectAgent return
> > success later.
> >
> > This is accidently occurs when hot-add-vcpu in windows2012.
> >   virsh setvcpus ...
> >   virsh qemu-agent-command $vm '{"execute":"guest-get-vcpus"}'
> >
> > Reset the priv->agentError to 'false' when qemuConnectAgent sucess
> > to fix this problem.
> >
> > Signed-off-by: Wang Yechao <wang.yechao255@zte.com.cn>
> > ---
> >  src/qemu/qemu_process.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
> > index 29b0ba1..4fbb955 100644
> > --- a/src/qemu/qemu_process.c
> > +++ b/src/qemu/qemu_process.c
> > @@ -269,6 +269,7 @@ qemuConnectAgent(virQEMUDriverPtr driver, virDomainObjPtr vm)
> >          virResetLastError();
> >      }
> >
> > +    priv->agentError = false;
> >      return 0;
> >  }
> >
> >
>
> There was similar problem with qemu monitor:
>
> commit 89563efc0209b854d2b2e554423423d7602acdbd
> Author: Daniel P. Berrange <berrange@redhat.com>
> Date:   Fri Sep 28 15:27:39 2012 +0100
>
>     Avoid bogus I/O event errors when closing the QEMU monitor
>
>     After calling qemuMonitorClose(), it is still possible for
>     the QEMU monitor I/O event callback to get invoked. This
>     will trigger an error message because mon->fd has been set
>     to -1 at this point. Silently ignore the case where mon->fd
>     is -1, likewise for mon->watch being zero.
>
>     Signed-off-by: Daniel P. Berrange <berrange@redhat.com
>
> This approach has advantage it does not set agentError at all. So clients
> see agent state as disconnected (which is true) instead of error.
>
> Nikolay

Thanks Nikolay, this solution is better. I'll give another patch.
---
Best wishes
Wang Yechao--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list