From: Jiri Denemark
To: libvir-list@redhat.com
Date: Fri, 20 Oct 2017 14:04:29 +0200
Subject: [libvirt] [PATCH 3/3] qemu: Enabled pause-before-switchover migration capability

QEMU identified a race condition between the device state serialization
and the end of storage migration. Both QEMU and libvirt need to be
updated to fix this.

Our migration workflow is modified so that after starting the migration
we wait for QEMU to enter "pre-switchover", "postcopy-active", or
"completed" state. Once there, we cancel all block jobs as usual. But if
QEMU is in "pre-switchover", we need to resume the migration afterwards
and wait again for the real end (either "postcopy-active" or "completed"
state).

Old QEMU will just enter either "postcopy-active" or "completed"
directly, which is still correctly handled even by new libvirt.
The "pre-switchover" state will only be entered if QEMU supports it and
the pause-before-switchover capability was enabled. Thus all
combinations of libvirt and QEMU will work, but only new QEMU with new
libvirt will avoid the race condition.

Signed-off-by: Jiri Denemark
---
 src/qemu/qemu_migration.c | 64 ++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 63 insertions(+), 1 deletion(-)

diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c
index 4b356002f..af744661f 100644
--- a/src/qemu/qemu_migration.c
+++ b/src/qemu/qemu_migration.c
@@ -1525,6 +1525,16 @@ qemuMigrationCompleted(virQEMUDriverPtr driver,
         goto error;
     }
 
+    /* Migration was paused before serializing device state, let's return to
+     * the caller so that it can finish all block jobs, resume migration, and
+     * wait again for the real end of the migration.
+     */
+    if (flags & QEMU_MIGRATION_COMPLETED_PRE_SWITCHOVER &&
+        jobInfo->status == QEMU_DOMAIN_JOB_STATUS_PAUSED) {
+        VIR_DEBUG("Migration paused before switchover");
+        return 1;
+    }
+
     /* In case of postcopy the source considers migration completed at the
      * moment it switched from active to postcopy-active state. The destination
      * will continue waiting until the migrate state changes to completed.
@@ -3600,6 +3610,28 @@ qemuMigrationConnect(virQEMUDriverPtr driver,
     return ret;
 }
 
+
+static int
+qemuMigrationContinue(virQEMUDriverPtr driver,
+                      virDomainObjPtr vm,
+                      qemuMonitorMigrationStatus status,
+                      qemuDomainAsyncJob asyncJob)
+{
+    qemuDomainObjPrivatePtr priv = vm->privateData;
+    int ret;
+
+    if (qemuDomainObjEnterMonitorAsync(driver, vm, asyncJob) < 0)
+        return -1;
+
+    ret = qemuMonitorMigrateContinue(priv->mon, status);
+
+    if (qemuDomainObjExitMonitor(driver, vm) < 0)
+        ret = -1;
+
+    return ret;
+}
+
+
 static int
 qemuMigrationRun(virQEMUDriverPtr driver,
                  virDomainObjPtr vm,
@@ -3769,6 +3801,12 @@ qemuMigrationRun(virQEMUDriverPtr driver,
                                QEMU_ASYNC_JOB_MIGRATION_OUT) < 0)
         goto error;
 
+    if (qemuMigrationCapsGet(vm, QEMU_MONITOR_MIGRATION_CAPS_PAUSE_BEFORE_SWITCHOVER) &&
+        qemuMigrationSetOption(driver, vm,
+                               QEMU_MONITOR_MIGRATION_CAPS_PAUSE_BEFORE_SWITCHOVER,
+                               true, QEMU_ASYNC_JOB_MIGRATION_OUT) < 0)
+        goto error;
+
     if (qemuMigrationSetParams(driver, vm, QEMU_ASYNC_JOB_MIGRATION_OUT,
                                migParams) < 0)
         goto error;
@@ -3847,7 +3885,7 @@ qemuMigrationRun(virQEMUDriverPtr driver,
         fd = -1;
     }
 
-    waitFlags = 0;
+    waitFlags = QEMU_MIGRATION_COMPLETED_PRE_SWITCHOVER;
     if (abort_on_error)
         waitFlags |= QEMU_MIGRATION_COMPLETED_ABORT_ON_ERROR;
     if (mig->nbd)
@@ -3889,6 +3927,30 @@ qemuMigrationRun(virQEMUDriverPtr driver,
                                          dconn) < 0)
         goto error;
 
+    /* When migration was paused before serializing device state we need to
+     * resume it now once we finished all block jobs and wait for the real
+     * end of the migration.
+     */
+    if (priv->job.current->status == QEMU_DOMAIN_JOB_STATUS_PAUSED) {
+        if (qemuMigrationContinue(driver, vm,
+                                  QEMU_MONITOR_MIGRATION_STATUS_PRE_SWITCHOVER,
+                                  QEMU_ASYNC_JOB_MIGRATION_OUT) < 0)
+            goto error;
+
+        waitFlags ^= QEMU_MIGRATION_COMPLETED_PRE_SWITCHOVER;
+
+        rc = qemuMigrationWaitForCompletion(driver, vm,
+                                            QEMU_ASYNC_JOB_MIGRATION_OUT,
+                                            dconn, waitFlags);
+        if (rc == -2) {
+            goto error;
+        } else if (rc == -1) {
+            /* QEMU reported failed migration, nothing to cancel anymore */
+            cancel = false;
+            goto error;
+        }
+    }
+
     if (iothread) {
         qemuMigrationIOThreadPtr io;
 
-- 
2.14.2
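
For readers who want the control flow without the surrounding libvirt
code, here is a minimal standalone sketch of the ordering the commit
message describes: wait, cancel block jobs, resume if paused, wait
again. It is not libvirt code. Every identifier (sketchState,
sketchQemu, sketchWait, sketchRun) is hypothetical, and QEMU is replaced
by a scripted sequence of states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the migration states named in the commit
 * message; these are not libvirt's or QEMU's real enums. */
typedef enum {
    SKETCH_STATE_ACTIVE,
    SKETCH_STATE_PRE_SWITCHOVER,
    SKETCH_STATE_POSTCOPY_ACTIVE,
    SKETCH_STATE_COMPLETED,
} sketchState;

/* Simulated QEMU that replays a fixed list of states. */
typedef struct {
    const sketchState *states;
    size_t nstates;
    size_t pos;
    bool blockJobsCancelled;
    bool continued;
} sketchQemu;

/* Consume states until one is reached that the caller must act on.
 * "pre-switchover" only stops the wait when stopAtPreSwitchover is set,
 * which plays the role of the PRE_SWITCHOVER wait flag in the patch. */
static sketchState
sketchWait(sketchQemu *q, bool stopAtPreSwitchover)
{
    for (;;) {
        /* The scripted sequence must end in a terminal state. */
        assert(q->pos < q->nstates);
        sketchState s = q->states[q->pos++];

        if (s == SKETCH_STATE_POSTCOPY_ACTIVE || s == SKETCH_STATE_COMPLETED)
            return s;
        if (stopAtPreSwitchover && s == SKETCH_STATE_PRE_SWITCHOVER)
            return s;
    }
}

/* Wait, cancel block jobs, and if the migration was paused before
 * switchover, resume it and wait for the real end. */
static sketchState
sketchRun(sketchQemu *q)
{
    sketchState s = sketchWait(q, true);

    /* Block jobs are cancelled after the first wait in every case. */
    q->blockJobsCancelled = true;

    if (s == SKETCH_STATE_PRE_SWITCHOVER) {
        /* Only reached when QEMU paused before switchover. */
        q->continued = true;
        s = sketchWait(q, false);
    }
    return s;
}
```

With a scripted sequence that never reports "pre-switchover" (the old
QEMU case), sketchRun() takes the same path as before the patch and
never resumes anything, which matches the compatibility claim in the
commit message.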