On 19/06/2021 18:26, Richard Henderson wrote:
> Short story is that the first two patches resolve the observed
> problem, by completely bypassing quite a lot of code in memory.c.
>
> Longer story is that we should either use that code in memory.c,
> or we should bypass it to an even lower level, so that we don't
> have multiple locations doing the partial-read assembly thing.
>
> Patch 13 exposes a number of obvious device bugs via make check.
> I'm sure there are more in devices that are less well tested.
>
> Patch 15 has an obvious drawback: it breaks the original #360.
> But it starts the conversation as to whether the check in memory.c
> is in fact broken.
>
>
> r~
>
>
> Mark Cave-Ayland (2):
> NOTFORMERGE q800: test case for do_unaligned_access issue
> accel/tcg: Use byte ops for unaligned loads
>
> Philippe Mathieu-Daudé (1):
> accel/tcg: Extract load_helper_unaligned from load_helper
>
> Richard Henderson (12):
> accel/tcg: Don't test for watchpoints for code read
> accel/tcg: Handle page span access before i/o access
> softmmu/memory: Inline memory_region_dispatch_read1
> softmmu/memory: Simplify access_with_adjusted_size interface
> hw/net/e1000e: Fix size of io operations
> hw/net/e1000e: Fix impl.min_access_size
> hw/pci-host/q35: Improve blackhole_ops
> hw/scsi/megasas: Fix megasas_mmio_ops sizes
> hw/scsi/megasas: Improve megasas_queue_ops min_access_size
> softmmu/memory: Disallow short writes
> softmmu/memory: Support some unaligned access
> RFC accel/tcg: Defer some unaligned accesses to memory subsystem
>
> accel/tcg/cputlb.c | 147 +++++++++++++----------------
> hw/m68k/q800.c | 131 ++------------------------
> hw/net/e1000e.c | 8 +-
> hw/pci-host/q35.c | 9 +-
> hw/scsi/megasas.c | 6 +-
> softmmu/memory.c | 226 +++++++++++++++++++++++++++++++++------------
> 6 files changed, 251 insertions(+), 276 deletions(-)
Hi Richard,

I've had a look over this patchset, and based on my work leading up to #360 it does feel like an improvement: in particular, separating out page-spanning accesses and removing the duplicated cputlb code for splitting/combining unaligned accesses both seem like wins to me.
As mentioned in the report itself, cputlb has effectively been bypassing .min_access_size, but I'm not sure of the consequences: does it mean that the access size check should also be bypassed for the head and tail of unaligned accesses? I'm also not completely sure how this behaviour can change depending on the target CPU.
The comment in patch 15 regarding mr->ops->valid.unaligned is interesting: if the memory subsystem can split the accesses beforehand so that they are accepted by the device, does that check then become obsolete?
My general feeling is that there are still some discussions to be had around defining how unaligned accesses should work, and a qtest covering the various unaligned/page-span cases would help here. Other than that, I do think the patchset is worthwhile, and it may be that, as with the memory_region_access_valid() changes in 5d971f9e67, we accept that the change in behaviour will have some fallout but allow plenty of time for regressions to be fixed.
ATB,
Mark.