[Qemu-devel] [PATCH v2] vnc: do not disconnect on EAGAIN

Michael Tokarev posted 1 patch 7 years, 1 month ago
Patches applied successfully
git fetch https://github.com/patchew-project/qemu tags/patchew/1486115549-9398-1-git-send-email-mjt@msgid.tls.msk.ru
Test checkpatch passed
Test docker passed
Test s390x failed
[Qemu-devel] [PATCH v2] vnc: do not disconnect on EAGAIN
Posted by Michael Tokarev 7 years, 1 month ago
When the qemu vnc server tries to send a large update to a client,
the system may respond with EAGAIN, indicating that the socket send
buffer is full and the data cannot be queued without blocking (this
depends on the network speed, on the client and server, and on what
else is happening).  In this case, something like this happens on the
qemu side (from strace):

sendmsg(16, {msg_name(0)=NULL,
        msg_iov(1)=[{"\244\"..., 729186}],
        msg_controllen=0, msg_flags=0}, 0) = 103950
sendmsg(16, {msg_name(0)=NULL,
        msg_iov(1)=[{"lz\346"..., 1559618}],
        msg_controllen=0, msg_flags=0}, 0) = -1 EAGAIN
sendmsg(-1, {msg_name(0)=NULL,
        msg_iov(1)=[{"lz\346"..., 1559618}],
        msg_controllen=0, msg_flags=0}, 0) = -1 EBADF

qemu closes the socket before the retry, and obviously it gets EBADF
when trying to send to -1.
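
The two errno values in the trace can be reproduced outside qemu. The
following is a minimal sketch, not qemu code: the function names are
hypothetical, and a local AF_UNIX socketpair stands in for the vnc client
connection. It fills a non-blocking socket whose peer never reads until
send() fails, then sends to fd -1 the way the trace does.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical demo, not qemu code: fill a non-blocking socket whose peer
 * never reads, and return the errno of the first send() that fails.
 * Returns 0 if no send() failed before the safety cap was reached.
 */
static int first_error_when_peer_stalls(void)
{
    static const char chunk[4096];           /* zero-filled payload */
    const size_t limit = 64u * 1024 * 1024;  /* safety cap for the loop */
    size_t total = 0;
    int sv[2];
    int err = 0;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) {
        return errno;
    }

    int flags = fcntl(sv[0], F_GETFL, 0);
    if (flags < 0 || fcntl(sv[0], F_SETFL, flags | O_NONBLOCK) != 0) {
        err = errno;
        close(sv[0]);
        close(sv[1]);
        return err;
    }

    while (total < limit) {
        ssize_t n = send(sv[0], chunk, sizeof(chunk), 0);
        if (n < 0) {
            if (errno == EINTR) {
                continue;        /* interrupted, simply retry */
            }
            err = errno;         /* first real failure */
            break;
        }
        total += (size_t)n;
    }

    /* The socket was still perfectly usable; we close it only to clean up. */
    close(sv[0]);
    close(sv[1]);
    return err;
}

/* Sending to fd -1, as in the last sendmsg() of the trace. */
static int error_when_sending_to_closed_fd(void)
{
    ssize_t n = send(-1, "x", 1, 0);
    int err = errno;

    assert(n == -1);
    (void)n;
    return err;
}
```

On a typical Linux system the first function is expected to return EAGAIN
and the second EBADF. The first is a transient condition on a healthy
socket, the second is the consequence of having closed it.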

This is because there used to be special handling for EAGAIN, but it no
longer works since commit 04d2529da27db512dcbd5e99d0e26d333f16efcc, because
we now initiate a vnc disconnect in all error-like cases.

This change was introduced in qemu 2.6 and has caused considerable grief for
many people, whose vnc clients report sporadic, seemingly random disconnects
from the vnc server.

Fix that by doing the disconnect only when necessary, i.e. omitting this
very case of EAGAIN.

Hopefully the existing condition (comparing with QIO_CHANNEL_ERR_BLOCK)
is sufficient, as the original code (before the above commit) was
checking for other errno values too.

Apparently another (semi?)bug exists somewhere here: since the code tries
to write to fd -1, it should probably check whether the connection is
still open first. But this isn't important.

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Fixes: 04d2529da27db512dcbd5e99d0e26d333f16efcc
Cc: Daniel P. Berrange <berrange@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: qemu-stable@nongnu.org
---
v2: previous patch was tab/space-damaged, fixing this now

 ui/vnc.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/ui/vnc.c b/ui/vnc.c
index cdeb79c..f2701e5 100644
--- a/ui/vnc.c
+++ b/ui/vnc.c
@@ -1256,12 +1256,13 @@ ssize_t vnc_client_io_error(VncState *vs, ssize_t ret, Error **errp)
     if (ret <= 0) {
         if (ret == 0) {
             VNC_DEBUG("Closing down client sock: EOF\n");
+            vnc_disconnect_start(vs);
         } else if (ret != QIO_CHANNEL_ERR_BLOCK) {
             VNC_DEBUG("Closing down client sock: ret %zd (%s)\n",
                       ret, errp ? error_get_pretty(*errp) : "Unknown");
+            vnc_disconnect_start(vs);
         }
 
-        vnc_disconnect_start(vs);
         if (errp) {
             error_free(*errp);
             *errp = NULL;
-- 
2.1.4
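
The control flow after the patch can be modelled in isolation. This is a
sketch under stated assumptions, not the code in ui/vnc.c: SketchClient,
sketch_disconnect_start() and SKETCH_ERR_BLOCK are hypothetical stand-ins
for VncState, vnc_disconnect_start() and QIO_CHANNEL_ERR_BLOCK, the value
-2 is assumed, and the return convention (0 on the error path, ret
otherwise) is also an assumption because the hunk does not show it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* Stand-in for QIO_CHANNEL_ERR_BLOCK; name and value are assumptions. */
enum { SKETCH_ERR_BLOCK = -2 };

/* Stand-in for VncState, reduced to the one flag this sketch needs. */
typedef struct SketchClient {
    bool disconnecting;
} SketchClient;

/* Stand-in for vnc_disconnect_start(): only records the request. */
static void sketch_disconnect_start(SketchClient *c)
{
    c->disconnecting = true;
}

/*
 * Model of the patched logic: disconnect on EOF and on hard errors,
 * but leave the connection alone when the channel would merely block.
 */
static ssize_t sketch_io_error(SketchClient *c, ssize_t ret)
{
    assert(c != NULL);

    if (ret <= 0) {
        if (ret == 0) {
            sketch_disconnect_start(c);          /* EOF */
        } else if (ret != SKETCH_ERR_BLOCK) {
            sketch_disconnect_start(c);          /* hard error */
        }
        /* ret == SKETCH_ERR_BLOCK: caller retries later, no disconnect */
        return 0;
    }
    return ret;                                  /* bytes transferred */
}

/* Convenience wrapper: does a given return value trigger a disconnect? */
static bool sketch_disconnects_on(ssize_t ret)
{
    SketchClient c = { .disconnecting = false };

    sketch_io_error(&c, ret);
    return c.disconnecting;
}
```

Before the patch the disconnect call sat after the if/else chain, so the
SKETCH_ERR_BLOCK case reached it as well. Moving it into the two branches
is the whole fix.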


Re: [Qemu-devel] [PATCH v2] vnc: do not disconnect on EAGAIN
Posted by Daniel P. Berrange 7 years, 1 month ago
On Fri, Feb 03, 2017 at 12:52:29PM +0300, Michael Tokarev wrote:
> [...]
> 
> Hopefully the existing condition (comparing with QIO_CHANNEL_ERR_BLOCK)
> is sufficient, as the original code (before the above commit) was
> checking for other errno values too.

There's no need to check WSAEWOULDBLOCK anymore - our win32 portability
code already remaps it to EAGAIN.

Checking EINTR is not required because QIOChannelSocket already
restarts the recvmsg/sendmsg when it sees EINTR.
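
The layering described here can be sketched as a small send wrapper that
restarts on EINTR and turns EAGAIN into a single "would block" value, so
that the caller has only one condition left to test. This is an
illustration, not QIOChannelSocket itself: sketch_send(),
sketch_send_until_stall() and SKETCH_WOULD_BLOCK are hypothetical names,
the value -2 is assumed, and MSG_DONTWAIT is used only to keep the sketch
short (it stands in for a socket that is in non-blocking mode).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Stand-in for a sentinel such as QIO_CHANNEL_ERR_BLOCK (assumed value). */
enum { SKETCH_WOULD_BLOCK = -2 };

/*
 * Send without blocking. Returns the number of bytes sent,
 * SKETCH_WOULD_BLOCK if the send buffer is full, or -1 on a hard error.
 */
static ssize_t sketch_send(int fd, const void *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    for (;;) {
        ssize_t n = send(fd, buf, len, MSG_DONTWAIT);
        if (n >= 0) {
            return n;
        }
        if (errno == EINTR) {
            continue;                  /* restarted here, callers never see it */
        }
        /* On win32 a portability layer would already have remapped
         * WSAEWOULDBLOCK to EAGAIN before this check. */
        if (errno == EAGAIN || errno == EWOULDBLOCK) {
            return SKETCH_WOULD_BLOCK; /* transient, retry when writable */
        }
        return -1;                     /* anything else is a real error */
    }
}

/*
 * Usage demo: write to a socketpair whose peer never reads and return the
 * first non-success result of sketch_send() (0 if the cap was reached).
 */
static ssize_t sketch_send_until_stall(void)
{
    static const char chunk[4096];     /* zero-filled payload */
    int sv[2];
    ssize_t ret = 0;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) {
        return -1;
    }
    for (int i = 0; i < 16384; i++) {  /* at most 64 MiB */
        ret = sketch_send(sv[0], chunk, sizeof(chunk));
        if (ret < 0) {
            break;
        }
        ret = 0;
    }
    close(sv[0]);
    close(sv[1]);
    return ret;
}
```

With a wrapper of this shape, a caller that compares the result against
the "would block" value needs no separate EINTR or WSAEWOULDBLOCK checks.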

> [...]

Reviewed-by: Daniel P. Berrange <berrange@redhat.com>

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|

Re: [Qemu-devel] [PATCH v2] vnc: do not disconnect on EAGAIN
Posted by Gerd Hoffmann 7 years, 1 month ago
On Fri, 2017-02-03 at 12:52 +0300, Michael Tokarev wrote:
> [...]

Added to ui patch queue.

thanks,
  Gerd