We can get rid of the build-coroutine-sigaltstack job by moving
the configure flags that should be tested here to other jobs:
Move --with-coroutine=sigaltstack to the build-without-defaults job
and --enable-trace-backends=ftrace to the cross-s390x-kvm-only job.
Signed-off-by: Thomas Huth <thuth@redhat.com>
---
.gitlab-ci.d/buildtest.yml | 14 ++------------
.gitlab-ci.d/crossbuilds.yml | 2 +-
2 files changed, 3 insertions(+), 13 deletions(-)
diff --git a/.gitlab-ci.d/buildtest.yml b/.gitlab-ci.d/buildtest.yml
index 91c7467a66..1438797a1c 100644
--- a/.gitlab-ci.d/buildtest.yml
+++ b/.gitlab-ci.d/buildtest.yml
@@ -533,19 +533,8 @@ build-tci:
- QTEST_QEMU_BINARY="./qemu-system-s390x" ./tests/qtest/pxe-test -m slow
- make check-tcg
-# Alternate coroutines implementations are only really of interest to KVM users
-# However we can't test against KVM on Gitlab-CI so we can only run unit tests
-build-coroutine-sigaltstack:
- extends: .native_build_job_template
- needs:
- job: amd64-ubuntu2004-container
- variables:
- IMAGE: ubuntu2004
- CONFIGURE_ARGS: --with-coroutine=sigaltstack --disable-tcg
- --enable-trace-backends=ftrace
- MAKE_CHECK_ARGS: check-unit
-
# Check our reduced build configurations
+# (and an alternative coroutine implementation)
build-without-defaults:
extends: .native_build_job_template
needs:
@@ -559,6 +548,7 @@ build-without-defaults:
--disable-pie
--disable-qom-cast-debug
--disable-strip
+ --with-coroutine=sigaltstack
TARGETS: avr-softmmu mips64-softmmu s390x-softmmu sh4-softmmu
sparc64-softmmu hexagon-linux-user i386-linux-user s390x-linux-user
MAKE_CHECK_ARGS: check-unit check-qtest-avr check-qtest-mips64
diff --git a/.gitlab-ci.d/crossbuilds.yml b/.gitlab-ci.d/crossbuilds.yml
index 8dbbb8f881..027d2088da 100644
--- a/.gitlab-ci.d/crossbuilds.yml
+++ b/.gitlab-ci.d/crossbuilds.yml
@@ -159,7 +159,7 @@ cross-s390x-kvm-only:
job: s390x-debian-cross-container
variables:
IMAGE: debian-s390x-cross
- EXTRA_CONFIGURE_OPTS: --disable-tcg
+ EXTRA_CONFIGURE_OPTS: --disable-tcg --enable-trace-backends=ftrace
cross-mips64el-kvm-only:
extends: .cross_accel_build_job
--
2.31.1
On Mon, Jan 30, 2023 at 11:44:46AM +0100, Thomas Huth wrote: > We can get rid of the build-coroutine-sigaltstack job by moving > the configure flags that should be tested here to other jobs: > Move --with-coroutine=sigaltstack to the build-without-defaults job > and --enable-trace-backends=ftrace to the cross-s390x-kvm-only job. The biggest user of coroutines is the block layer. So we probably ought to have coroutines aligned with a job that triggers the 'make check-block' for iotests. IIUC, the without-defaults job won't do that. How about, arbitrarily, using either the 'check-system-debian' or 'check-system-ubuntu' job. Those distros are closely related, so getting sigaltstack vs ucontext coverage between them is a good win, and they both trigger the block jobs IIUC. Incidentally sigaltstack is also covered by our Cirrus CI job for macOS. > Signed-off-by: Thomas Huth <thuth@redhat.com> > --- > .gitlab-ci.d/buildtest.yml | 14 ++------------ > .gitlab-ci.d/crossbuilds.yml | 2 +- > 2 files changed, 3 insertions(+), 13 deletions(-) > > diff --git a/.gitlab-ci.d/buildtest.yml b/.gitlab-ci.d/buildtest.yml > index 91c7467a66..1438797a1c 100644 > --- a/.gitlab-ci.d/buildtest.yml > +++ b/.gitlab-ci.d/buildtest.yml > @@ -533,19 +533,8 @@ build-tci: > - QTEST_QEMU_BINARY="./qemu-system-s390x" ./tests/qtest/pxe-test -m slow > - make check-tcg > > -# Alternate coroutines implementations are only really of interest to KVM users > -# However we can't test against KVM on Gitlab-CI so we can only run unit tests > -build-coroutine-sigaltstack: > - extends: .native_build_job_template > - needs: > - job: amd64-ubuntu2004-container > - variables: > - IMAGE: ubuntu2004 > - CONFIGURE_ARGS: --with-coroutine=sigaltstack --disable-tcg > - --enable-trace-backends=ftrace > - MAKE_CHECK_ARGS: check-unit > - > # Check our reduced build configurations > +# (and an alternative coroutine implementation) > build-without-defaults: > extends: .native_build_job_template > needs: > @@ -559,6 +548,7 @@ build-without-defaults: > --disable-pie > --disable-qom-cast-debug > --disable-strip > + --with-coroutine=sigaltstack > TARGETS: avr-softmmu mips64-softmmu s390x-softmmu sh4-softmmu > sparc64-softmmu hexagon-linux-user i386-linux-user s390x-linux-user > MAKE_CHECK_ARGS: check-unit check-qtest-avr check-qtest-mips64 > diff --git a/.gitlab-ci.d/crossbuilds.yml b/.gitlab-ci.d/crossbuilds.yml > index 8dbbb8f881..027d2088da 100644 > --- a/.gitlab-ci.d/crossbuilds.yml > +++ b/.gitlab-ci.d/crossbuilds.yml > @@ -159,7 +159,7 @@ cross-s390x-kvm-only: > job: s390x-debian-cross-container > variables: > IMAGE: debian-s390x-cross > - EXTRA_CONFIGURE_OPTS: --disable-tcg > + EXTRA_CONFIGURE_OPTS: --disable-tcg --enable-trace-backends=ftrace > > cross-mips64el-kvm-only: > extends: .cross_accel_build_job > -- > 2.31.1 > > With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
On 30/01/2023 11.58, Daniel P. Berrangé wrote: > On Mon, Jan 30, 2023 at 11:44:46AM +0100, Thomas Huth wrote: >> We can get rid of the build-coroutine-sigaltstack job by moving >> the configure flags that should be tested here to other jobs: >> Move --with-coroutine=sigaltstack to the build-without-defaults job >> and --enable-trace-backends=ftrace to the cross-s390x-kvm-only job. > > The biggest user of coroutines is the block layer. So we probably > ought to have coroutines aligned with a job that triggers the > 'make check-block' for iotests. IIUC, the without-defaults > job won't do that. How about, arbitrarily, using either the > 'check-system-debian' or 'check-system-ubuntu' job. Those distros > are closely related, so getting sigaltstack vs ucontext coverage > between them is a good win, and they both trigger the block jobs > IIUC. I gave it a try with the ubuntu job, but this apparently trips up the iotests: https://gitlab.com/thuth/qemu/-/jobs/3705965062#L212 Does anybody have a clue what could be going wrong here? Thomas
Am 03.02.2023 um 12:23 hat Thomas Huth geschrieben: > On 30/01/2023 11.58, Daniel P. Berrangé wrote: > > On Mon, Jan 30, 2023 at 11:44:46AM +0100, Thomas Huth wrote: > > > We can get rid of the build-coroutine-sigaltstack job by moving > > > the configure flags that should be tested here to other jobs: > > > Move --with-coroutine=sigaltstack to the build-without-defaults job > > > and --enable-trace-backends=ftrace to the cross-s390x-kvm-only job. > > > > The biggest user of coroutines is the block layer. So we probably > > ought to have coroutines aligned with a job that triggers the > > 'make check-block' for iotests. IIUC, the without-defaults > > job won't do that. How about, arbitrarily, using either the > > 'check-system-debian' or 'check-system-ubuntu' job. Those distros > > are closely related, so getting sigaltstack vs ucontext coverage > > between them is a good win, and they both trigger the block jobs > > IIUC. > > I gave it a try with the ubuntu job, but this apparently trips up the iotests: > > https://gitlab.com/thuth/qemu/-/jobs/3705965062#L212 > > Does anybody have a clue what could be going wrong here? I'm not sure how changing the coroutine backend could cause it, but primarily this looks like an assertion failure in migration code. Dave, Juan, any ideas what this assertion checks and why it could be failing? Kevin
Kevin Wolf <kwolf@redhat.com> wrote: > Am 03.02.2023 um 12:23 hat Thomas Huth geschrieben: >> On 30/01/2023 11.58, Daniel P. Berrangé wrote: >> > On Mon, Jan 30, 2023 at 11:44:46AM +0100, Thomas Huth wrote: >> > > We can get rid of the build-coroutine-sigaltstack job by moving >> > > the configure flags that should be tested here to other jobs: >> > > Move --with-coroutine=sigaltstack to the build-without-defaults job >> > > and --enable-trace-backends=ftrace to the cross-s390x-kvm-only job. >> > >> > The biggest user of coroutines is the block layer. So we probably >> > ought to have coroutines aligned with a job that triggers the >> > 'make check-block' for iotests. IIUC, the without-defaults >> > job won't do that. How about, arbitrarily, using either the >> > 'check-system-debian' or 'check-system-ubuntu' job. Those distros >> > are closely related, so getting sigaltstack vs ucontext coverage >> > between them is a good win, and they both trigger the block jobs >> > IIUC. >> >> I gave it a try with the ubuntu job, but this apparently trips up the iotests: >> >> https://gitlab.com/thuth/qemu/-/jobs/3705965062#L212 >> >> Does anybody have a clue what could be going wrong here? > > I'm not sure how changing the coroutine backend could cause it, but > primarily this looks like an assertion failure in migration code. Adding Peter here, as he is the last one touching that code O:-) > Dave, Juan, any ideas what this assertion checks and why it could be > failing? Really no. +QEMU_PROG: ../migration/ram.c:874: pss_find_next_dirty: Assertion `pss->host_page_end' failed. +./common.rc: line 195: 78727 Aborted (core dumped) ( if [ -n "${QEMU_NEED_PID}" ]; then + echo $BASHPID > "${QEMU_TEST_DIR}/qemu-${_QEMU_HANDLE}.pid"; +fi; GDB=""; if [ -n "${GDB_OPTIONS}" ]; then + GDB="gdbserver ${GDB_OPTIONS}"; +fi; VALGRIND_QEMU="${VALGRIND_QEMU_VM}" _qemu_proc_exec "${VALGRIND_LOGFILE}" $GDB "$QEMU_PROG" $QEMU_OPTIONS "$@" ) pss_find_next_dirty() is only called from three places: find . -type f -exec grep --color=auto -nH --null -e pss_find_next_dirty \{\} + ./ram.c.847: * pss_find_next_dirty: find the next dirty page of current ramblock ./ram.c.857:static void pss_find_next_dirty(PageSearchStatus *pss) ./ram.c.1562: pss_find_next_dirty(pss); ./ram.c.2391: pss_find_next_dirty(pss); ./ram.c.2476: pss_find_next_dirty(pss); I can't see how this can be affected by coroutines changes. What is the test that is failing, and what is the change that I have to do to try to reproduce it? Later, Juan.
On 03/02/2023 13.08, Kevin Wolf wrote: > Am 03.02.2023 um 12:23 hat Thomas Huth geschrieben: >> On 30/01/2023 11.58, Daniel P. Berrangé wrote: >>> On Mon, Jan 30, 2023 at 11:44:46AM +0100, Thomas Huth wrote: >>>> We can get rid of the build-coroutine-sigaltstack job by moving >>>> the configure flags that should be tested here to other jobs: >>>> Move --with-coroutine=sigaltstack to the build-without-defaults job >>>> and --enable-trace-backends=ftrace to the cross-s390x-kvm-only job. >>> >>> The biggest user of coroutines is the block layer. So we probably >>> ought to have coroutines aligned with a job that triggers the >>> 'make check-block' for iotests. IIUC, the without-defaults >>> job won't do that. How about, arbitrarily, using either the >>> 'check-system-debian' or 'check-system-ubuntu' job. Those distros >>> are closely related, so getting sigaltstack vs ucontext coverage >>> between them is a good win, and they both trigger the block jobs >>> IIUC. >> >> I gave it a try with the ubuntu job, but this apparently trips up the iotests: >> >> https://gitlab.com/thuth/qemu/-/jobs/3705965062#L212 >> >> Does anybody have a clue what could be going wrong here? > > I'm not sure how changing the coroutine backend could cause it, but > primarily this looks like an assertion failure in migration code. > > Dave, Juan, any ideas what this assertion checks and why it could be > failing? Ah, I think it's the bug that will be fixed by: https://lore.kernel.org/qemu-devel/20230202160640.2300-2-quintela@redhat.com/ The fix hasn't hit the master branch yet (I think), and I had another patch in my CI that disables the aarch64 binary in that runner, so the iotests suddenly have been executed with the alpha binary there --> migration fails. So never mind, it will be fixed as soon as Juan's pull request gets included. Thomas
On Fri, 3 Feb 2023 at 15:44, Thomas Huth <thuth@redhat.com> wrote: > > On 03/02/2023 13.08, Kevin Wolf wrote: > > Am 03.02.2023 um 12:23 hat Thomas Huth geschrieben: > >> On 30/01/2023 11.58, Daniel P. Berrangé wrote: > >>> On Mon, Jan 30, 2023 at 11:44:46AM +0100, Thomas Huth wrote: > >>>> We can get rid of the build-coroutine-sigaltstack job by moving > >>>> the configure flags that should be tested here to other jobs: > >>>> Move --with-coroutine=sigaltstack to the build-without-defaults job > >>>> and --enable-trace-backends=ftrace to the cross-s390x-kvm-only job. > >>> > >>> The biggest user of coroutines is the block layer. So we probably > >>> ought to have coroutines aligned with a job that triggers the > >>> 'make check-block' for iotests. IIUC, the without-defaults > >>> job won't do that. How about, arbitrarily, using either the > >>> 'check-system-debian' or 'check-system-ubuntu' job. Those distros > >>> are closely related, so getting sigaltstack vs ucontext coverage > >>> between them is a good win, and they both trigger the block jobs > >>> IIUC. > >> > >> I gave it a try with the ubuntu job, but this apparently trips up the iotests: > >> > >> https://gitlab.com/thuth/qemu/-/jobs/3705965062#L212 > >> > >> Does anybody have a clue what could be going wrong here? > > > > I'm not sure how changing the coroutine backend could cause it, but > > primarily this looks like an assertion failure in migration code. > > > > Dave, Juan, any ideas what this assertion checks and why it could be > > failing? > > Ah, I think it's the bug that will be fixed by: > > https://lore.kernel.org/qemu-devel/20230202160640.2300-2-quintela@redhat.com/ > > The fix hasn't hit the master branch yet (I think), and I had another patch > in my CI that disables the aarch64 binary in that runner, so the iotests > suddenly have been executed with the alpha binary there --> migration fails. > > So never mind, it will be fixed as soon as Juan's pull request gets included. The migration tests have been flaky for a while now, including setups where host and guest page sizes are the same. (For instance, my x86 macos box pretty reliably sees failures when the machine is under load.) -- PMM
Peter Maydell <peter.maydell@linaro.org> wrote: > On Fri, 3 Feb 2023 at 15:44, Thomas Huth <thuth@redhat.com> wrote: >> >> On 03/02/2023 13.08, Kevin Wolf wrote: >> > Am 03.02.2023 um 12:23 hat Thomas Huth geschrieben: >> >> On 30/01/2023 11.58, Daniel P. Berrangé wrote: >> >>> On Mon, Jan 30, 2023 at 11:44:46AM +0100, Thomas Huth wrote: >> >>>> We can get rid of the build-coroutine-sigaltstack job by moving >> >>>> the configure flags that should be tested here to other jobs: >> >>>> Move --with-coroutine=sigaltstack to the build-without-defaults job >> >>>> and --enable-trace-backends=ftrace to the cross-s390x-kvm-only job. >> >>> >> >>> The biggest user of coroutines is the block layer. So we probably >> >>> ought to have coroutines aligned with a job that triggers the >> >>> 'make check-block' for iotests. IIUC, the without-defaults >> >>> job won't do that. How about, arbitrarily, using either the >> >>> 'check-system-debian' or 'check-system-ubuntu' job. Those distros >> >>> are closely related, so getting sigaltstack vs ucontext coverage >> >>> between them is a good win, and they both trigger the block jobs >> >>> IIUC. >> >> >> >> I gave it a try with the ubuntu job, but this apparently trips up the iotests: >> >> >> >> https://gitlab.com/thuth/qemu/-/jobs/3705965062#L212 >> >> >> >> Does anybody have a clue what could be going wrong here? >> > >> > I'm not sure how changing the coroutine backend could cause it, but >> > primarily this looks like an assertion failure in migration code. >> > >> > Dave, Juan, any ideas what this assertion checks and why it could be >> > failing? >> >> Ah, I think it's the bug that will be fixed by: >> >> https://lore.kernel.org/qemu-devel/20230202160640.2300-2-quintela@redhat.com/ >> >> The fix hasn't hit the master branch yet (I think), and I had another patch >> in my CI that disables the aarch64 binary in that runner, so the iotests >> suddenly have been executed with the alpha binary there --> migration fails. >> >> So never mind, it will be fixed as soon as Juan's pull request gets included. > > The migration tests have been flaky for a while now, > including setups where host and guest page sizes are the same. > (For instance, my x86 macos box pretty reliably sees failures > when the machine is under load.) I *thought* that we had fixed all of those. But it is difficult for me to know because: - I only happens when one runs "make check" - running ./migration-test have never failed to me - When it fails (and it has been a while since it has failed to me) it is impossible to me to detect what is going on, and as said, I have never been able to reproduce running only migration-test. I will try to run several at the same time and see if it happens. And as Thomas said, I *think* that the fix that Peter Xu posted should fix this issue. Famous last words. Later, Juan.
On 03/02/2023 22.14, Juan Quintela wrote: > Peter Maydell <peter.maydell@linaro.org> wrote: >> On Fri, 3 Feb 2023 at 15:44, Thomas Huth <thuth@redhat.com> wrote: >>> >>> On 03/02/2023 13.08, Kevin Wolf wrote: >>>> Am 03.02.2023 um 12:23 hat Thomas Huth geschrieben: >>>>> On 30/01/2023 11.58, Daniel P. Berrangé wrote: >>>>>> On Mon, Jan 30, 2023 at 11:44:46AM +0100, Thomas Huth wrote: >>>>>>> We can get rid of the build-coroutine-sigaltstack job by moving >>>>>>> the configure flags that should be tested here to other jobs: >>>>>>> Move --with-coroutine=sigaltstack to the build-without-defaults job >>>>>>> and --enable-trace-backends=ftrace to the cross-s390x-kvm-only job. >>>>>> >>>>>> The biggest user of coroutines is the block layer. So we probably >>>>>> ought to have coroutines aligned with a job that triggers the >>>>>> 'make check-block' for iotests. IIUC, the without-defaults >>>>>> job won't do that. How about, arbitrarily, using either the >>>>>> 'check-system-debian' or 'check-system-ubuntu' job. Those distros >>>>>> are closely related, so getting sigaltstack vs ucontext coverage >>>>>> between them is a good win, and they both trigger the block jobs >>>>>> IIUC. >>>>> >>>>> I gave it a try with the ubuntu job, but this apparently trips up the iotests: >>>>> >>>>> https://gitlab.com/thuth/qemu/-/jobs/3705965062#L212 >>>>> >>>>> Does anybody have a clue what could be going wrong here? >>>> >>>> I'm not sure how changing the coroutine backend could cause it, but >>>> primarily this looks like an assertion failure in migration code. >>>> >>>> Dave, Juan, any ideas what this assertion checks and why it could be >>>> failing? >>> >>> Ah, I think it's the bug that will be fixed by: >>> >>> https://lore.kernel.org/qemu-devel/20230202160640.2300-2-quintela@redhat.com/ >>> >>> The fix hasn't hit the master branch yet (I think), and I had another patch >>> in my CI that disables the aarch64 binary in that runner, so the iotests >>> suddenly have been executed with the alpha binary there --> migration fails. >>> >>> So never mind, it will be fixed as soon as Juan's pull request gets included. >> >> The migration tests have been flaky for a while now, >> including setups where host and guest page sizes are the same. >> (For instance, my x86 macos box pretty reliably sees failures >> when the machine is under load.) > > I *thought* that we had fixed all of those. > > But it is difficult for me to know because: > - I only happens when one runs "make check" > - running ./migration-test have never failed to me > - When it fails (and it has been a while since it has failed to me) > it is impossible to me to detect what is going on, and as said, I have > never been able to reproduce running only migration-test. > > I will try to run several at the same time and see if it happens. > > And as Thomas said, I *think* that the fix that Peter Xu posted should > fix this issue. Famous last words. The patch from Peter should fix my problems that I triggered via the iotests - but the migration-qtest is still unstable independent from that issue, I think. See for example the latest staging pipeline: https://gitlab.com/qemu-project/qemu/-/pipelines/767961842 The migration qtest failed in both, the x86-freebsd-build and the ubuntu-20.04-s390x-all pipelin. Thomas
Thomas Huth <thuth@redhat.com> wrote: > On 03/02/2023 22.14, Juan Quintela wrote: >> Peter Maydell <peter.maydell@linaro.org> wrote: >>> On Fri, 3 Feb 2023 at 15:44, Thomas Huth <thuth@redhat.com> wrote: >>>> >>>> On 03/02/2023 13.08, Kevin Wolf wrote: >>>>> Am 03.02.2023 um 12:23 hat Thomas Huth geschrieben: >>>>>> On 30/01/2023 11.58, Daniel P. Berrangé wrote: >>>>>>> On Mon, Jan 30, 2023 at 11:44:46AM +0100, Thomas Huth wrote: >>>>>>>> We can get rid of the build-coroutine-sigaltstack job by moving >>>>>>>> the configure flags that should be tested here to other jobs: >>>>>>>> Move --with-coroutine=sigaltstack to the build-without-defaults job >>>>>>>> and --enable-trace-backends=ftrace to the cross-s390x-kvm-only job. >>>>>>> >>>>>>> The biggest user of coroutines is the block layer. So we probably >>>>>>> ought to have coroutines aligned with a job that triggers the >>>>>>> 'make check-block' for iotests. IIUC, the without-defaults >>>>>>> job won't do that. How about, arbitrarily, using either the >>>>>>> 'check-system-debian' or 'check-system-ubuntu' job. Those distros >>>>>>> are closely related, so getting sigaltstack vs ucontext coverage >>>>>>> between them is a good win, and they both trigger the block jobs >>>>>>> IIUC. >>>>>> >>>>>> I gave it a try with the ubuntu job, but this apparently trips up the iotests: >>>>>> >>>>>> https://gitlab.com/thuth/qemu/-/jobs/3705965062#L212 >>>>>> >>>>>> Does anybody have a clue what could be going wrong here? >>>>> >>>>> I'm not sure how changing the coroutine backend could cause it, but >>>>> primarily this looks like an assertion failure in migration code. >>>>> >>>>> Dave, Juan, any ideas what this assertion checks and why it could be >>>>> failing? >>>> >>>> Ah, I think it's the bug that will be fixed by: >>>> >>>> https://lore.kernel.org/qemu-devel/20230202160640.2300-2-quintela@redhat.com/ >>>> >>>> The fix hasn't hit the master branch yet (I think), and I had another patch >>>> in my CI that disables the aarch64 binary in that runner, so the iotests >>>> suddenly have been executed with the alpha binary there --> migration fails. >>>> >>>> So never mind, it will be fixed as soon as Juan's pull request gets included. >>> >>> The migration tests have been flaky for a while now, >>> including setups where host and guest page sizes are the same. >>> (For instance, my x86 macos box pretty reliably sees failures >>> when the machine is under load.) >> I *thought* that we had fixed all of those. >> But it is difficult for me to know because: >> - I only happens when one runs "make check" >> - running ./migration-test have never failed to me >> - When it fails (and it has been a while since it has failed to me) >> it is impossible to me to detect what is going on, and as said, I have >> never been able to reproduce running only migration-test. >> I will try to run several at the same time and see if it happens. >> And as Thomas said, I *think* that the fix that Peter Xu posted >> should >> fix this issue. Famous last words. > > The patch from Peter should fix my problems that I triggered via the > iotests - but the migration-qtest is still unstable independent from > that issue, I think. See for example the latest staging pipeline: > > https://gitlab.com/qemu-project/qemu/-/pipelines/767961842 > > The migration qtest failed in both, the x86-freebsd-build and the > ubuntu-20.04-s390x-all pipelin. > > Thomas 31/659 qemu:qtest+qtest-aarch64 / qtest-aarch64/migration-test ERROR 48.23s killed by signal 6 SIGABRT >>> G_TEST_DBUS_DAEMON=/home/gitlab-runner/builds/-LCfcJ2T/0/qemu-project/qemu/tests/dbus-vmstate-daemon.sh QTEST_QEMU_IMG=./qemu-img QTEST_QEMU_BINARY=./qemu-system-aarch64 MALLOC_PERTURB_=124 QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon /home/gitlab-runner/builds/-LCfcJ2T/0/qemu-project/qemu/build/tests/qtest/migration-test --tap -k ――――――――――――――――――――――――――――――――――――― ✀ ――――――――――――――――――――――――――――――――――――― stderr: Broken pipe ../tests/qtest/libqtest.c:190: kill_qemu() detected QEMU death from signal 11 (Segmentation fault) (core dumped) TAP parsing error: Too few tests run (expected 41, got 12) (test program exited with status code -6) ―――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――― I don't know hat to do with this: - this is aarch64 tcg - this *works* on f37, or at least I can't reproduce any error with make check on my box, and I *think* my configuration is quite extensive (as far as I know everything that can be compiled in fedora with packages in the distro): configure file: /mnt/code/qemu/full/configure --enable-trace-backends=log --prefix=/usr --sysconfdir=/etc/sysconfig/ --audio-drv-list=pa,alsa --with-coroutine=ucontext --with-git-submodules=validate --enable-alsa --enable-attr --enable-auth-pam --enable-avx2 --enable-avx512f --enable-bochs --enable-bpf --enable-brlapi --disable-bsd-user --enable-bzip2 --enable-cap-ng --enable-capstone --disable-cfi --disable-cfi-debug --enable-cloop --disable-cocoa --enable-containers --disable-coreaudio --enable-coroutine-pool --enable-crypto-afalg --enable-curl --enable-curses --enable-dbus-display --enable-debug-info --disable-debug-mutex --disable-debug-stack-usage --disable-debug-tcg --enable-dmg --enable-docs --disable-dsound --enable-fdt --enable-fuse --enable-fuse-lseek --disable-fuzzing --disable-gcov --disable-gcrypt --enable-gettext --enable-gio --enable-glusterfs --enable-gnutls --disable-gprof --enable-gtk --enable-guest-agent --disable-guest-agent-msi --disable-hax --disable-hvf --enable-iconv --enable-install-blobs --enable-jack --enable-keyring --enable-kvm --enable-l2tpv3 --enable-libdaxctl --enable-libiscsi --enable-libnfs --enable-libpmem --enable-libssh --enable-libudev --enable-libusb --enable-linux-aio --enable-linux-io-uring --enable-linux-user --enable-live-block-migration --disable-lto --disable-lzfse --enable-lzo --disable-malloc-trim --enable-membarrier --enable-module-upgrades --enable-modules --enable-mpath --enable-multiprocess --disable-netmap --enable-nettle --enable-numa --disable-nvmm --enable-opengl --enable-oss --enable-pa --enable-parallels --enable-pie --enable-plugins --enable-png --disable-profiler --enable-pvrdma --enable-qcow1 --enable-qed --disable-qom-cast-debug --enable-rbd --enable-rdma --enable-replication --enable-rng-none --disable-safe-stack --disable-sanitizers --enable-stack-protector --enable-sdl --enable-sdl-image --enable-seccomp --enable-selinux --enable-slirp --enable-slirp-smbd --enable-smartcard --enable-snappy --enable-sparse --enable-spice --enable-spice-protocol --enable-system --enable-tcg --disable-tcg-interpreter --enable-tools --enable-tpm --disable-tsan --disable-u2f --enable-usb-redir --enable-user --disable-vde --enable-vdi --enable-vhost-crypto --enable-vhost-kernel --enable-vhost-net --enable-vhost-user --enable-vhost-user-blk-server --enable-vhost-vdpa --enable-virglrenderer --enable-virtfs --enable-virtiofsd --enable-vnc --enable-vnc-jpeg --enable-vnc-sasl --enable-vte --enable-vvfat --enable-werror --disable-whpx --enable-xen --enable-xen-pci-passthrough --enable-xkbcommon --enable-zstd - It gives a segmentation fault. Nothing else. Can we get at least a backtrace to work from there? Thanks, Juan.
On Mon, 6 Feb 2023 at 08:46, Juan Quintela <quintela@redhat.com> wrote: > 31/659 qemu:qtest+qtest-aarch64 / qtest-aarch64/migration-test ERROR 48.23s killed by signal 6 SIGABRT > >>> G_TEST_DBUS_DAEMON=/home/gitlab-runner/builds/-LCfcJ2T/0/qemu-project/qemu/tests/dbus-vmstate-daemon.sh QTEST_QEMU_IMG=./qemu-img QTEST_QEMU_BINARY=./qemu-system-aarch64 MALLOC_PERTURB_=124 QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon /home/gitlab-runner/builds/-LCfcJ2T/0/qemu-project/qemu/build/tests/qtest/migration-test --tap -k > ――――――――――――――――――――――――――――――――――――― ✀ ――――――――――――――――――――――――――――――――――――― > stderr: > Broken pipe > ../tests/qtest/libqtest.c:190: kill_qemu() detected QEMU death from signal 11 (Segmentation fault) (core dumped) > TAP parsing error: Too few tests run (expected 41, got 12) > (test program exited with status code -6) > ―――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――― > > I don't know hat to do with this: > - this is aarch64 tcg > - this *works* on f37, or at least I can't reproduce any error with make > check on my box, and I *think* my configuration is quite extensive (as > far as I know everything that can be compiled in fedora with packages > in the distro): > - It gives a segmentation fault. Nothing else. > > Can we get at least a backtrace to work from there? Improving the test program / test harness to provide better information would help. Ultimately if we're going to be doing CI in gitlab this is a necessity -- all we are ever going to have is whatever the test program and harness can print to the logs... thanks -- PMM
On Fri, 3 Feb 2023 at 21:14, Juan Quintela <quintela@redhat.com> wrote: > > Peter Maydell <peter.maydell@linaro.org> wrote: > > The migration tests have been flaky for a while now, > > including setups where host and guest page sizes are the same. > > (For instance, my x86 macos box pretty reliably sees failures > > when the machine is under load.) > > I *thought* that we had fixed all of those. > > But it is difficult for me to know because: > - I only happens when one runs "make check" > - running ./migration-test have never failed to me > - When it fails (and it has been a while since it has failed to me) > it is impossible to me to detect what is going on, and as said, I have > never been able to reproduce running only migration-test. Yes. If we could improve the logging in the test so that when an intermittent failure does happen the test prints better clues about what happened, I think that would help a lot. https://lore.kernel.org/qemu-devel/CAFEAcA8x_iM3hN2-P9F+huXnXFXy+D6FzE+Leq4erLdg7zkVGw@mail.gmail.com/ is the thread from late December about the macos failures. -- PMM
Peter Maydell <peter.maydell@linaro.org> wrote: > On Fri, 3 Feb 2023 at 21:14, Juan Quintela <quintela@redhat.com> wrote: >> >> Peter Maydell <peter.maydell@linaro.org> wrote: >> > The migration tests have been flaky for a while now, >> > including setups where host and guest page sizes are the same. >> > (For instance, my x86 macos box pretty reliably sees failures >> > when the machine is under load.) >> >> I *thought* that we had fixed all of those. >> >> But it is difficult for me to know because: >> - I only happens when one runs "make check" >> - running ./migration-test have never failed to me >> - When it fails (and it has been a while since it has failed to me) >> it is impossible to me to detect what is going on, and as said, I have >> never been able to reproduce running only migration-test. > > Yes. If we could improve the logging in the test so that when > an intermittent failure does happen the test prints better > clues about what happened, I think that would help a lot. > > https://lore.kernel.org/qemu-devel/CAFEAcA8x_iM3hN2-P9F+huXnXFXy+D6FzE+Leq4erLdg7zkVGw@mail.gmail.com/ > is the thread from late December about the macos failures. We (red hat) found a similar problem with aarch64, but only when using zero copy. Will try to see if I can reproduce this other there. https://bugzilla.redhat.com/show_bug.cgi?id=2160929 the similar thing to what you have is: - they are trying to cancel - they are on aarch64 but: - they can only reproduce with zero copy enabled. Later, Juan.
On 30/01/2023 11.58, Daniel P. Berrangé wrote: > On Mon, Jan 30, 2023 at 11:44:46AM +0100, Thomas Huth wrote: >> We can get rid of the build-coroutine-sigaltstack job by moving >> the configure flags that should be tested here to other jobs: >> Move --with-coroutine=sigaltstack to the build-without-defaults job >> and --enable-trace-backends=ftrace to the cross-s390x-kvm-only job. > > The biggest user of coroutines is the block layer. So we probably > ought to have coroutines aligned with a job that triggers the > 'make check-block' for iotests. IIUC, the without-defaults > job won't do that. How about, arbitrarily, using either the > 'check-system-debian' or 'check-system-ubuntu' job. Those distros > are closely related, so getting sigaltstack vs ucontext coverage > between them is a good win, and they both trigger the block jobs > IIUC. Ok ... but let's put qemu-block@nongnu.org on CC: first, to see what the block layer folks think about this. > Incidentally sigaltstack is also covered by our Cirrus CI job > for macOS. Oh, nice, so we already have some "check-block" test coverage there! Thomas >> --- >> .gitlab-ci.d/buildtest.yml | 14 ++------------ >> .gitlab-ci.d/crossbuilds.yml | 2 +- >> 2 files changed, 3 insertions(+), 13 deletions(-) >> >> diff --git a/.gitlab-ci.d/buildtest.yml b/.gitlab-ci.d/buildtest.yml >> index 91c7467a66..1438797a1c 100644 >> --- a/.gitlab-ci.d/buildtest.yml >> +++ b/.gitlab-ci.d/buildtest.yml >> @@ -533,19 +533,8 @@ build-tci: >> - QTEST_QEMU_BINARY="./qemu-system-s390x" ./tests/qtest/pxe-test -m slow >> - make check-tcg >> >> -# Alternate coroutines implementations are only really of interest to KVM users >> -# However we can't test against KVM on Gitlab-CI so we can only run unit tests >> -build-coroutine-sigaltstack: >> - extends: .native_build_job_template >> - needs: >> - job: amd64-ubuntu2004-container >> - variables: >> - IMAGE: ubuntu2004 >> - CONFIGURE_ARGS: --with-coroutine=sigaltstack --disable-tcg >> - --enable-trace-backends=ftrace >> - MAKE_CHECK_ARGS: check-unit >> - >> # Check our reduced build configurations >> +# (and an alternative coroutine implementation) >> build-without-defaults: >> extends: .native_build_job_template >> needs: >> @@ -559,6 +548,7 @@ build-without-defaults: >> --disable-pie >> --disable-qom-cast-debug >> --disable-strip >> + --with-coroutine=sigaltstack >> TARGETS: avr-softmmu mips64-softmmu s390x-softmmu sh4-softmmu >> sparc64-softmmu hexagon-linux-user i386-linux-user s390x-linux-user >> MAKE_CHECK_ARGS: check-unit check-qtest-avr check-qtest-mips64 >> diff --git a/.gitlab-ci.d/crossbuilds.yml b/.gitlab-ci.d/crossbuilds.yml >> index 8dbbb8f881..027d2088da 100644 >> --- a/.gitlab-ci.d/crossbuilds.yml >> +++ b/.gitlab-ci.d/crossbuilds.yml >> @@ -159,7 +159,7 @@ cross-s390x-kvm-only: >> job: s390x-debian-cross-container >> variables: >> IMAGE: debian-s390x-cross >> - EXTRA_CONFIGURE_OPTS: --disable-tcg >> + EXTRA_CONFIGURE_OPTS: --disable-tcg --enable-trace-backends=ftrace >> >> cross-mips64el-kvm-only: >> extends: .cross_accel_build_job >> -- >> 2.31.1 >> >> > > With regards, > Daniel
© 2016 - 2025 Red Hat, Inc.