OvmfPkg/Library/VirtioLib/VirtioLib.inf | 1 + OvmfPkg/VirtioBlkDxe/VirtioBlk.inf | 5 + OvmfPkg/VirtioGpuDxe/VirtioGpu.inf | 1 + OvmfPkg/VirtioNetDxe/VirtioNet.inf | 1 + OvmfPkg/VirtioRngDxe/VirtioRng.inf | 1 + OvmfPkg/VirtioScsiDxe/VirtioScsi.inf | 1 + OvmfPkg/Include/IndustryStandard/Virtio095.h | 1 + OvmfPkg/Include/IndustryStandard/Virtio10.h | 5 + OvmfPkg/Include/Library/VirtioLib.h | 20 ++++ OvmfPkg/Library/VirtioLib/VirtioLib.c | 96 ++++++++++++++- OvmfPkg/VirtioBlkDxe/VirtioBlk.c | 125 ++++++++++++++++++-- 11 files changed, 244 insertions(+), 13 deletions(-)
I have found that OVMF fails to detect the disk when iommu_platform is set from qemu cli. The failure occurs during the feature bit negotiation. Recently, EDKII introduced IOMMU protocol d1fddc4533bf. SEV patch series introduced a IoMmu protocol driver f9d129e68a45 to set a DMA access attribute and methods to allocate, free, map and unmap the DMA memory for the master bus devices In this patch series, I have tried to enable the IOMMU_PLATFORM feature for VirtioBlkDevice. I am sending this as RFC to seek feedback before I extend the support for other Virtio devices. The patch has been tested in SEV guest - mainly because IoMmuDxe driver installs the IOMMU protocol for SEV guest only. If needed, I can extend the IoMmuDxe driver to install IOMMU protocol for non SEV guests. qemu cli used for testing: # $QEMU \ ... -drive file=${HDA_FILE},if=none,id=disk0,format=qcow2 \ -device virtio-blk-pci,drive=disk0,disable-legacy=on,iommu_platform=true,disable-modern=off,scsi=off ... Cc: Jordan Justen <jordan.l.justen@intel.com> Cc: Laszlo Ersek <lersek@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Michael S. Tsirkin <mst@redhat.com> Contributed-under: TianoCore Contribution Agreement 1.0 Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Brijesh Singh (3): OvmfPkg/Include/Virtio10: Define VIRTIO_F_IOMMU_PLATFORM feature bit OvmfPkg/VirtioLib: Add IOMMU_PLATFORM support OvmfPkg/VirtioBlkDxe: Add VIRITO_F_IOMMU_PLATFORM support OvmfPkg/Library/VirtioLib/VirtioLib.inf | 1 + OvmfPkg/VirtioBlkDxe/VirtioBlk.inf | 5 + OvmfPkg/VirtioGpuDxe/VirtioGpu.inf | 1 + OvmfPkg/VirtioNetDxe/VirtioNet.inf | 1 + OvmfPkg/VirtioRngDxe/VirtioRng.inf | 1 + OvmfPkg/VirtioScsiDxe/VirtioScsi.inf | 1 + OvmfPkg/Include/IndustryStandard/Virtio095.h | 1 + OvmfPkg/Include/IndustryStandard/Virtio10.h | 5 + OvmfPkg/Include/Library/VirtioLib.h | 20 ++++ OvmfPkg/Library/VirtioLib/VirtioLib.c | 96 ++++++++++++++- OvmfPkg/VirtioBlkDxe/VirtioBlk.c | 125 ++++++++++++++++++-- 11 files changed, 244 insertions(+), 13 deletions(-) -- Brijesh Singh 2.7.4 _______________________________________________ edk2-devel mailing list edk2-devel@lists.01.org https://lists.01.org/mailman/listinfo/edk2-devel
Adding Ard On 07/20/17 00:09, Brijesh Singh wrote: > I have found that OVMF fails to detect the disk when iommu_platform is set from > qemu cli. The failure occurs during the feature bit negotiation. > > Recently, EDKII introduced IOMMU protocol d1fddc4533bf. SEV patch series introduced > a IoMmu protocol driver f9d129e68a45 to set a DMA access attribute and methods to > allocate, free, map and unmap the DMA memory for the master bus devices > > In this patch series, I have tried to enable the IOMMU_PLATFORM feature for > VirtioBlkDevice. I am sending this as RFC to seek feedback before I extend the support > for other Virtio devices. The patch has been tested in SEV guest - mainly because > IoMmuDxe driver installs the IOMMU protocol for SEV guest only. If needed, I can > extend the IoMmuDxe driver to install IOMMU protocol for non SEV guests. > > qemu cli used for testing: > > # $QEMU \ > ... > -drive file=${HDA_FILE},if=none,id=disk0,format=qcow2 \ > -device virtio-blk-pci,drive=disk0,disable-legacy=on,iommu_platform=true,disable-modern=off,scsi=off > ... > > Cc: Jordan Justen <jordan.l.justen@intel.com> > Cc: Laszlo Ersek <lersek@redhat.com> > Cc: Jason Wang <jasowang@redhat.com> > Cc: Michael S. Tsirkin <mst@redhat.com> > Contributed-under: TianoCore Contribution Agreement 1.0 > Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> > > Brijesh Singh (3): > OvmfPkg/Include/Virtio10: Define VIRTIO_F_IOMMU_PLATFORM feature bit > OvmfPkg/VirtioLib: Add IOMMU_PLATFORM support > OvmfPkg/VirtioBlkDxe: Add VIRITO_F_IOMMU_PLATFORM support In the course of processing my post-PTO email backlog, yesterday I skimmed this topic very quickly, and thought about it for an hour or so (while away for the computer). Here's my take on it. (1) The closest to any formal language that I found for VIRTIO_F_IOMMU_PLATFORM are: https://issues.oasis-open.org/browse/VIRTIO-154 https://lists.oasis-open.org/archives/virtio-dev/201610/msg00121.html Are these references up-to-date? The ticket also references SVN r587 as the resolution, but that link is dead. (2) A few weeks back I saw Jason's SeaBIOS patch (also pointed out to me more recently by Brijesh, in private): [SeaBIOS] [PATCH] virtio: IOMMU support I don't understand this patch. (I also don't understand the "iommu_platform" device property on the QEMU command line.) According to the spec language quoted from the mailing list above, we have four cases: (2.1) device does not offer VIRITO_F_IOMMU_PLATFORM --> everything works like before (2.2) device offers VIRITO_F_IOMMU_PLATFORM, but the driver does not negotiate it --> device is allowed to reject functioning * device offers VIRITO_F_IOMMU_PLATFORM and the driver negotiates it: there are two possibilities: (2.3) the driver *disables* the IOMMU, and works like before (2.4) the driver *configures* the IOMMU and works accordingly The SeaBIOS patch negotiates VIRITO_F_IOMMU_PLATFORM, but it *neither* disables *nor* configures the IOMMU. It simply *ignores* the IOMMU. I don't see how that is any different *in effect* from simply not negotiating VIRITO_F_IOMMU_PLATFORM -- case (2.2) --, and I don't understand why QEMU tolerates "ignoring the IOMMU" differently from "not negotiating the flag". (3) *If* we indeed just wanted to follow the SeaBIOS patch, then VIRTIO_F_IOMMU_PLATFORM should be treated in OVMF *precisely* in parallel with VIRTIO_F_VERSION_1. (4) I'm confused by the intersection of the SEV and VIRTIO_F_IOMMU_PLATFORM use cases. The IoMmu DXE driver added in edk2 commit f9d129e68a45 ("OvmfPkg: Add IoMmuDxe driver", 2017-07-06) is specific to SEV. In SEV-less guests, the IoMmu DXE driver will not install the IOMMU protocol, but the QEMU command line may still contain the "iommu_platform=true" property. For example, as far as I recall, DPDK guest payloads use an emulated "viommu" device. For this OVMF, would have to provide a separate IOMMU DXE driver, one that could actually interact with the "viommu" device. And there should be some coordination that exactly one instance of gEdkiiIoMmuProtocolGuid OR gIoMmuAbsentProtocolGuid be installed. I see that the SEV references are only in the blurb of the patch set; no actual commit message refers to SEV. That's OK; I just think the blurb is confusing. (5) Now I'm coming to my main point. The virtio device drivers in OvmfPkg are UEFI_DRIVER modules that conform to the UEFI driver model. They consume our home-grown the VIRTIO_DEVICE_PROTOCOL, and produce whatever UEFI protocol is appropriate on top (for example, EFI_BLOCK_IO_PROTOCOL in VirtioBlkDxe). Such a driver has no business talking to the platform's IOMMU protocol directly (even if there is one). Instead: (5.1) The VIRTIO_DEVICE_PROTOCOL has to be extended with the necessary AllocateSharedPages, FreeSharedPages, MapSharedPages, UnmapSharedPages functions. (5.2) All top-level virtio driver code, including VirtioLib, has to be rebased to the new VIRTIO_DEVICE_PROTOCOL member functions. This covers the ring memory itself, the memory pointed-to by the descriptors placed on the ring(s) -- i.e., the device specific requests --, and all further memory that is referenced by those device specific requests. This will result in a larger memory footprint, as all current pool allocations will be turned into page allocations, but I guess that is tolerable. (5.3) All virtio driver code should treat VIRTIO_F_IOMMU_PLATFORM simply in parallel with VIRTIO_F_VERSION_1, and don't act upon VIRTIO_F_IOMMU_PLATFORM in any special shape or form. So basically this is just my point (3) from above. (5.4) There are three VIRTIO_DEVICE_PROTOCOL implementations in edk2: - "OvmfPkg/VirtioPciDeviceDxe" binds legacy-only and transitional virtio-pci devices, and offers virtio 0.9.5 semantics. - "ArmVirtPkg/VirtioFdtDxe" (via "OvmfPkg/Library/VirtioMmioDeviceLib") binds virtio-mmio devices, and offers virtio 0.9.5 semantics. - "OvmfPkg/Virtio10Dxe" binds modern-only virtio-pci devices, and offers virtio 1.0.0 semantics. The first two drivers should implement the AllocateSharedPages() and FreeSharedPages() member functions simply with the corresponding MemoryAllocationLib functions (using BootServicesData type memory), and implement the MapSharedPages() and UnmapSharedPages() member functions as no-ops (return the input addresses transparently). The third driver should implement all four new member functions by respectively delegating the job to: - EFI_PCI_IO_PROTOCOL.AllocateBuffer() -- with BootServicesData -- - EFI_PCI_IO_PROTOCOL.FreeBuffer() - EFI_PCI_IO_PROTOCOL.Map() -- with BusMasterCommonBuffer64 -- - EFI_PCI_IO_PROTOCOL.Unmap() The EFI_PCI_IO_PROTOCOL implementation will delegate these calls to the platform-specific PCI host bridge / root bridge driver, and *that* driver in turn is allowed to talk to an IOMMU protocol (if any). (This last step is already covered by the following edk2 commits: - generally, c15da8eb3587 ("MdeModulePkg/PciHostBridge: Add IOMMU support.", 2017-04-29), - specifically for SEV in OVMF, c6ab9aecb71b ("OvmfPkg: update PciHostBridgeDxe to use PlatformHasIoMmuLib", 2017-07-06).) (5.5) There's a delicate question that has to be considered with care; namely the ExitBootServices() callbacks in the virtio drivers. Currently these functions perform virtio resets. They cause the devices to forget all their configuration (including guest RAM references). Aborting in-flight DMA (e.g. in VirtioNetDxe) is a practice that is specifically recommended in the UEFI Driver Writers' Guide; plus at some point the guest kernel will reclaim and overwrite BootServicesData type memory. If at that point QEMU is still looking at some (originally firmware-allocated) areas as virtio rings, the results won't be amusing. (Speaking from experience.) Hence the resetting of virtio devices upon ExitBootServices(). Now, remember that ExitBootServices() callbacks *must not* change the UEFI memory map, so *no* memory allocations *must* be released in the callbacks. The virtio resets performed in the callbacks are surgical for this reason; nothing else is being done. The memory will be reclaimed by the OS, later. With an IOMMU in the picture, further actions become necessary: *after* the virtio reset, any buffers previously in use (including rings, device specific requests pointed-to by descriptors, and any further memory referenced by those requests) must be *unmapped*, but *not freed*. (Speaking in SEV terms, this will result in those memory areas seeing their C bits restored, without changing the UEFI memmap.) This means that: - the new VIRTIO_DEVICE_PROTOCOL.UnmapSharedPages() function has to be called judiciously from these callbacks, after the virtio reset, - *and* that the *entire* call chain originating from UnmapSharedPages(), through PciIo, through PciRootBridgeIo, to EdkiiIoMmu, *must* not call gBS->FreePool() or gBS->FreePages() (or the equivalent MemoryAllocationLib functions). Note that if the last requirement is currently violated (outside of OvmfPkg), then that is a general problem for physical platforms as well -- IMO, a physical NIC driver too should be able to abort DMA in its exit-boot-services callback and then unmap any relevant IOMMU mappings (via PciIo->Unmap().) Thanks Laszlo _______________________________________________ edk2-devel mailing list edk2-devel@lists.01.org https://lists.01.org/mailman/listinfo/edk2-devel
Hi Laszlo, On 07/25/2017 01:17 PM, Laszlo Ersek wrote: > Adding Ard > > On 07/20/17 00:09, Brijesh Singh wrote: >> I have found that OVMF fails to detect the disk when iommu_platform is set from >> qemu cli. The failure occurs during the feature bit negotiation. >> >> Recently, EDKII introduced IOMMU protocol d1fddc4533bf. SEV patch series introduced >> a IoMmu protocol driver f9d129e68a45 to set a DMA access attribute and methods to >> allocate, free, map and unmap the DMA memory for the master bus devices >> >> In this patch series, I have tried to enable the IOMMU_PLATFORM feature for >> VirtioBlkDevice. I am sending this as RFC to seek feedback before I extend the support >> for other Virtio devices. The patch has been tested in SEV guest - mainly because >> IoMmuDxe driver installs the IOMMU protocol for SEV guest only. If needed, I can >> extend the IoMmuDxe driver to install IOMMU protocol for non SEV guests. >> >> qemu cli used for testing: >> >> # $QEMU \ >> ... >> -drive file=${HDA_FILE},if=none,id=disk0,format=qcow2 \ >> -device virtio-blk-pci,drive=disk0,disable-legacy=on,iommu_platform=true,disable-modern=off,scsi=off >> ... >> >> Cc: Jordan Justen <jordan.l.justen@intel.com> >> Cc: Laszlo Ersek <lersek@redhat.com> >> Cc: Jason Wang <jasowang@redhat.com> >> Cc: Michael S. Tsirkin <mst@redhat.com> >> Contributed-under: TianoCore Contribution Agreement 1.0 >> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> >> >> Brijesh Singh (3): >> OvmfPkg/Include/Virtio10: Define VIRTIO_F_IOMMU_PLATFORM feature bit >> OvmfPkg/VirtioLib: Add IOMMU_PLATFORM support >> OvmfPkg/VirtioBlkDxe: Add VIRITO_F_IOMMU_PLATFORM support > > In the course of processing my post-PTO email backlog, yesterday I > skimmed this topic very quickly, and thought about it for an hour or so > (while away for the computer). Here's my take on it. > > > (1) The closest to any formal language that I found for > VIRTIO_F_IOMMU_PLATFORM are: > > https://issues.oasis-open.org/browse/VIRTIO-154 > https://lists.oasis-open.org/archives/virtio-dev/201610/msg00121.html > > Are these references up-to-date? The ticket also references SVN r587 as > the resolution, but that link is dead. > > > (2) A few weeks back I saw Jason's SeaBIOS patch (also pointed out to me > more recently by Brijesh, in private): > > [SeaBIOS] [PATCH] virtio: IOMMU support > > I don't understand this patch. (I also don't understand the > "iommu_platform" device property on the QEMU command line.) According to > the spec language quoted from the mailing list above, we have four cases: > > (2.1) device does not offer VIRITO_F_IOMMU_PLATFORM --> everything works > like before > > (2.2) device offers VIRITO_F_IOMMU_PLATFORM, but the driver does not > negotiate it --> device is allowed to reject functioning > > * device offers VIRITO_F_IOMMU_PLATFORM and the driver negotiates it: > there are two possibilities: > (2.3) the driver *disables* the IOMMU, and works like before > (2.4) the driver *configures* the IOMMU and works accordingly > > The SeaBIOS patch negotiates VIRITO_F_IOMMU_PLATFORM, but it *neither* > disables *nor* configures the IOMMU. It simply *ignores* the IOMMU. I > don't see how that is any different *in effect* from simply not > negotiating VIRITO_F_IOMMU_PLATFORM -- case (2.2) --, and I don't > understand why QEMU tolerates "ignoring the IOMMU" differently from "not > negotiating the flag". > > > (3) *If* we indeed just wanted to follow the SeaBIOS patch, then > VIRTIO_F_IOMMU_PLATFORM should be treated in OVMF *precisely* in > parallel with VIRTIO_F_VERSION_1. > > > (4) I'm confused by the intersection of the SEV and > VIRTIO_F_IOMMU_PLATFORM use cases. The IoMmu DXE driver added in edk2 > commit f9d129e68a45 ("OvmfPkg: Add IoMmuDxe driver", 2017-07-06) is > specific to SEV. > > In SEV-less guests, the IoMmu DXE driver will not install the IOMMU > protocol, but the QEMU command line may still contain the > "iommu_platform=true" property. For example, as far as I recall, DPDK > guest payloads use an emulated "viommu" device. For this OVMF, would > have to provide a separate IOMMU DXE driver, one that could actually > interact with the "viommu" device. And there should be some coordination > that exactly one instance of gEdkiiIoMmuProtocolGuid OR > gIoMmuAbsentProtocolGuid be installed. > > I see that the SEV references are only in the blurb of the patch set; no > actual commit message refers to SEV. That's OK; I just think the blurb > is confusing. > I was trying to figure out why setting iommu_platform fails to detect the HDD in OVMF. Since SEV IOMMU work was still fresh in my mind hence I thought we simply need to update the Virtio drivers to consume IOMMU protocol directly. But your explanation on how things work makes sense; thanks for explaining it in great detail. I will follow your recommendation, and look into extending VIRTIO_DEVICE_PROTOCOL with necessary functions and delegate the work to EFI_PCI_IO_PROTOCOL (which will use IOMMU driver if available). Thanks Brijesh > > (5) Now I'm coming to my main point. The virtio device drivers in > OvmfPkg are UEFI_DRIVER modules that conform to the UEFI driver model. > They consume our home-grown the VIRTIO_DEVICE_PROTOCOL, and produce > whatever UEFI protocol is appropriate on top (for example, > EFI_BLOCK_IO_PROTOCOL in VirtioBlkDxe). > > Such a driver has no business talking to the platform's IOMMU protocol > directly (even if there is one). Instead: > > > (5.1) The VIRTIO_DEVICE_PROTOCOL has to be extended with the necessary > > AllocateSharedPages, FreeSharedPages, MapSharedPages, UnmapSharedPages > > functions. > > > (5.2) All top-level virtio driver code, including VirtioLib, has to be > rebased to the new VIRTIO_DEVICE_PROTOCOL member functions. This covers > the ring memory itself, the memory pointed-to by the descriptors placed > on the ring(s) -- i.e., the device specific requests --, and all further > memory that is referenced by those device specific requests. > > This will result in a larger memory footprint, as all current pool > allocations will be turned into page allocations, but I guess that is > tolerable. > > > (5.3) All virtio driver code should treat VIRTIO_F_IOMMU_PLATFORM simply > in parallel with VIRTIO_F_VERSION_1, and don't act upon > VIRTIO_F_IOMMU_PLATFORM in any special shape or form. So basically this > is just my point (3) from above. > > > (5.4) There are three VIRTIO_DEVICE_PROTOCOL implementations in edk2: > > - "OvmfPkg/VirtioPciDeviceDxe" binds legacy-only and transitional > virtio-pci devices, and offers virtio 0.9.5 semantics. > > - "ArmVirtPkg/VirtioFdtDxe" (via "OvmfPkg/Library/VirtioMmioDeviceLib") > binds virtio-mmio devices, and offers virtio 0.9.5 semantics. > > - "OvmfPkg/Virtio10Dxe" binds modern-only virtio-pci devices, and offers > virtio 1.0.0 semantics. > > The first two drivers should implement the AllocateSharedPages() and > FreeSharedPages() member functions simply with the corresponding > MemoryAllocationLib functions (using BootServicesData type memory), and > implement the MapSharedPages() and UnmapSharedPages() member functions > as no-ops (return the input addresses transparently). > > The third driver should implement all four new member functions by > respectively delegating the job to: > - EFI_PCI_IO_PROTOCOL.AllocateBuffer() -- with BootServicesData -- > - EFI_PCI_IO_PROTOCOL.FreeBuffer() > - EFI_PCI_IO_PROTOCOL.Map() -- with BusMasterCommonBuffer64 -- > - EFI_PCI_IO_PROTOCOL.Unmap() > > The EFI_PCI_IO_PROTOCOL implementation will delegate these calls to the > platform-specific PCI host bridge / root bridge driver, and *that* > driver in turn is allowed to talk to an IOMMU protocol (if any). > > (This last step is already covered by the following edk2 commits: > - generally, c15da8eb3587 ("MdeModulePkg/PciHostBridge: Add IOMMU > support.", 2017-04-29), > - specifically for SEV in OVMF, c6ab9aecb71b ("OvmfPkg: update > PciHostBridgeDxe to use PlatformHasIoMmuLib", 2017-07-06).) > > > (5.5) There's a delicate question that has to be considered with care; > namely the ExitBootServices() callbacks in the virtio drivers. > > Currently these functions perform virtio resets. They cause the devices > to forget all their configuration (including guest RAM references). > Aborting in-flight DMA (e.g. in VirtioNetDxe) is a practice that is > specifically recommended in the UEFI Driver Writers' Guide; plus at some > point the guest kernel will reclaim and overwrite BootServicesData type > memory. If at that point QEMU is still looking at some (originally > firmware-allocated) areas as virtio rings, the results won't be amusing. > (Speaking from experience.) Hence the resetting of virtio devices upon > ExitBootServices(). > > Now, remember that ExitBootServices() callbacks *must not* change the > UEFI memory map, so *no* memory allocations *must* be released in the > callbacks. The virtio resets performed in the callbacks are surgical for > this reason; nothing else is being done. The memory will be reclaimed by > the OS, later. > > With an IOMMU in the picture, further actions become necessary: *after* > the virtio reset, any buffers previously in use (including rings, device > specific requests pointed-to by descriptors, and any further memory > referenced by those requests) must be *unmapped*, but *not freed*. > > (Speaking in SEV terms, this will result in those memory areas seeing > their C bits restored, without changing the UEFI memmap.) > > This means that: > - the new VIRTIO_DEVICE_PROTOCOL.UnmapSharedPages() function has to be > called judiciously from these callbacks, after the virtio reset, > - *and* that the *entire* call chain originating from > UnmapSharedPages(), through PciIo, through PciRootBridgeIo, to > EdkiiIoMmu, *must* not call gBS->FreePool() or gBS->FreePages() (or the > equivalent MemoryAllocationLib functions). > > Note that if the last requirement is currently violated (outside of > OvmfPkg), then that is a general problem for physical platforms as well > -- IMO, a physical NIC driver too should be able to abort DMA in its > exit-boot-services callback and then unmap any relevant IOMMU mappings > (via PciIo->Unmap().) > > Thanks > Laszlo > _______________________________________________ edk2-devel mailing list edk2-devel@lists.01.org https://lists.01.org/mailman/listinfo/edk2-devel
Hi Laszlo, > > (5.3) All virtio driver code should treat VIRTIO_F_IOMMU_PLATFORM simply > in parallel with VIRTIO_F_VERSION_1, and don't act upon > VIRTIO_F_IOMMU_PLATFORM in any special shape or form. So basically this > is just my point (3) from above. > > > (5.4) There are three VIRTIO_DEVICE_PROTOCOL implementations in edk2: > > - "OvmfPkg/VirtioPciDeviceDxe" binds legacy-only and transitional > virtio-pci devices, and offers virtio 0.9.5 semantics. > > - "ArmVirtPkg/VirtioFdtDxe" (via "OvmfPkg/Library/VirtioMmioDeviceLib") > binds virtio-mmio devices, and offers virtio 0.9.5 semantics. > > - "OvmfPkg/Virtio10Dxe" binds modern-only virtio-pci devices, and offers > virtio 1.0.0 semantics. > > The first two drivers should implement the AllocateSharedPages() and > FreeSharedPages() member functions simply with the corresponding > MemoryAllocationLib functions (using BootServicesData type memory), and > implement the MapSharedPages() and UnmapSharedPages() member functions > as no-ops (return the input addresses transparently). > > The third driver should implement all four new member functions by > respectively delegating the job to: > - EFI_PCI_IO_PROTOCOL.AllocateBuffer() -- with BootServicesData -- > - EFI_PCI_IO_PROTOCOL.FreeBuffer() > - EFI_PCI_IO_PROTOCOL.Map() -- with BusMasterCommonBuffer64 -- > - EFI_PCI_IO_PROTOCOL.Unmap() > I have working to implement patch per your recommendation. I assume you mean map the buffers with EfiPciIoOperationBusMasterCommonBuffer [1]. If so, I see one issue with SEV guest. When SEV is active, IOMMU driver uses a bounce buffer to map host address to a device address. While creating bounce buffer we can map it either for EfiPciIoOperationBusMasterRead or EfiPciIoOperationBusMasterWrite Operation. If caller wants to map EfiPciIoOperationBusMasterCommonBuffer then it must allocate the buffer using EFI_PCI_IO_PROTOCOL.AllocateBuffer() [2] otherwise we will fail to map. I see that PciRootBridgeIo.c has similar check when using a bounce buffer for < 4GB use cases [3]. Do you see any issue if we use EfiPciIoOperationBusMasterRead or EfiPciIoOperationBusMasterWrite instead of EfiPciIoOperationBusMasterCommonBuffer ? [1] https://github.com/tianocore/edk2/blob/master/EdkCompatibilityPkg/Foundation/Efi/Protocol/PciIo/PciIo.h#L169 [2] https://github.com/tianocore/edk2/blob/master/MdeModulePkg/Bus/Pci/PciBusDxe/PciIo.c#L1082 [3] https://github.com/tianocore/edk2/blob/master/MdeModulePkg/Bus/Pci/PciHostBridgeDxe/PciRootBridgeIo.c#L1109 > The EFI_PCI_IO_PROTOCOL implementation will delegate these calls to the > platform-specific PCI host bridge / root bridge driver, and *that* > driver in turn is allowed to talk to an IOMMU protocol (if any). > > (This last step is already covered by the following edk2 commits: > - generally, c15da8eb3587 ("MdeModulePkg/PciHostBridge: Add IOMMU > support.", 2017-04-29), > - specifically for SEV in OVMF, c6ab9aecb71b ("OvmfPkg: update > PciHostBridgeDxe to use PlatformHasIoMmuLib", 2017-07-06).) > > _______________________________________________ edk2-devel mailing list edk2-devel@lists.01.org https://lists.01.org/mailman/listinfo/edk2-devel
On 07/27/17 16:21, Brijesh Singh wrote: > Hi Laszlo, > > >> >> (5.3) All virtio driver code should treat VIRTIO_F_IOMMU_PLATFORM >> simply in parallel with VIRTIO_F_VERSION_1, and don't act upon >> VIRTIO_F_IOMMU_PLATFORM in any special shape or form. So basically >> this is just my point (3) from above. >> >> >> (5.4) There are three VIRTIO_DEVICE_PROTOCOL implementations in edk2: >> >> - "OvmfPkg/VirtioPciDeviceDxe" binds legacy-only and transitional >> virtio-pci devices, and offers virtio 0.9.5 semantics. >> >> - "ArmVirtPkg/VirtioFdtDxe" (via >> "OvmfPkg/Library/VirtioMmioDeviceLib") binds virtio-mmio devices, >> and offers virtio 0.9.5 semantics. >> >> - "OvmfPkg/Virtio10Dxe" binds modern-only virtio-pci devices, and >> offers virtio 1.0.0 semantics. >> >> The first two drivers should implement the AllocateSharedPages() and >> FreeSharedPages() member functions simply with the corresponding >> MemoryAllocationLib functions (using BootServicesData type memory), >> and implement the MapSharedPages() and UnmapSharedPages() member >> functions as no-ops (return the input addresses transparently). >> >> The third driver should implement all four new member functions by >> respectively delegating the job to: >> - EFI_PCI_IO_PROTOCOL.AllocateBuffer() -- with BootServicesData -- >> - EFI_PCI_IO_PROTOCOL.FreeBuffer() >> - EFI_PCI_IO_PROTOCOL.Map() -- with BusMasterCommonBuffer64 -- >> - EFI_PCI_IO_PROTOCOL.Unmap() >> > > I have working to implement patch per your recommendation. I assume > you mean map the buffers with EfiPciIoOperationBusMasterCommonBuffer > [1]. Yes. > If so, I see one issue with SEV guest. When SEV is active, IOMMU > driver uses a bounce buffer to map host address to a device address. > While creating bounce buffer we can map it either for > EfiPciIoOperationBusMasterRead or EfiPciIoOperationBusMasterWrite > Operation. If caller wants to map > EfiPciIoOperationBusMasterCommonBuffer then it must allocate the > buffer using EFI_PCI_IO_PROTOCOL.AllocateBuffer() [2] otherwise we > will fail to map. Correct. > I see that PciRootBridgeIo.c has similar check when using a bounce > buffer for < 4GB use cases [3]. Correct. > Do you see any issue if we use EfiPciIoOperationBusMasterRead or > EfiPciIoOperationBusMasterWrite instead of > EfiPciIoOperationBusMasterCommonBuffer ? Yes, I do. The BusMasterRead and BusMasterWrite operations are suitable for one-shot, uni-directional transfers, where the bounce buffer contents and the client code contents are exchanged on Map/Unmap. However virtio transfers, generally speaking, are not like this. While the individual memory areas pointed-to by separate virtio descriptors can me associated with one specific transfer direction (see the VRING_DESC_F_WRITE flag), the virtio ring area itself is long-lived, and it is simultaneously read and written by both host and guest, asynchronously and repeatedly. This calls for BusMasterCommonBuffer. Once we implement BusMasterCommonBuffer, we can use it for everything else. Another reason why separate allocation and mapping (and conversely, separate unmapping and deallocation) are required is the ExitBootServices() callbacks. In that callback, unmapping must happen *without* memory allocation or deallocation (because the IOMMU must be de-programmed, but the UEFI memmap must not be changed), covering all memory areas that are at that point shared between host and guest (regardless of transfer direction). Normally, Map allocates the bounce buffer internally, and Unmap releases the bounce buffer internally (for BusMasterRead and BusMasterWrite), which is not right here. If you use AllocateBuffer() separately, then Map() -- with BusMasterCommonBuffer -- will not allocate internally, and Unmap() will also not deallocate internally. So, in the ExitBootServices() callback, you will be able to call Unmap() only, and then *not* call FreeBuffer() at all. This is why I suggest introducing all four functions (Allocate, Free, Map, Unmap) to the VIRTIO_DEVICE_PROTOCOL, and why all virtio drivers should use all four functions explicitly, not just Map and Unmap. ... Actually, I think there is a bug in "OvmfPkg/IoMmuDxe/AmdSevIoMmu.c". You have the following distribution of operations at the moment: - IoMmuAllocateBuffer() allocates pages and clears the memory encryption mask - IoMmuFreeBuffer() re-sets the memory encryption mask, and deallocates pages - IoMmuMap() does nothing at all when BusMasterCommonBuffer is requested (and IoMmuAllocateBuffer() was called previously). Otherwise, IoMmuMap() allocates pages, allocates MAP_INFO, and clears the memory encryption mask. - IoMmuUnmap() does nothing when cleaning up a BusMasterCommonBuffer operation (--> NO_MAPPING). Otherwise, IoMmuUnmap() clears the encryption mask, and frees both the pages and MAP_INFO. This distribution of operations seems wrong. The key point is that AllocateBuffer() *need not* result in a buffer that is immediately usable, and that client code is required to call Map() *unconditionally*, even if BusMasterCommonBuffer is the desired operation. Therefore, the right distribution of operations is: - IoMmuAllocateBuffer() allocates pages and does not touch the encryption mask.. - IoMmuFreeBuffer() deallocates pages and does not touch the encryption mask. - IoMmuMap() does not allocate pages when BusMasterCommonBuffer is requested, and it allocates pages (bounce buffer) otherwise. *Regardless* of BusMaster operation, the following actions are carried out unconditionally: . the memory encryption mask is cleared in this function (and in this function only), . An attempt is made to grab a MAP_INFO structure from an internal free list (to be introduced!). The head of the list is a new static variable. If the free list is empty, then a MAP_INFO structure is allocated with AllocatePool(). The NO_MAPPING macro becomes unused and can be deleted from the source code. - IoMmuUnmap() clears the encryption mask unconditionally. (For this, it has to consult the MAP_INFO structure that is being passed in from the caller.) In addition: . If MapInfo->Operation is BusMasterCommonBuffer, then we know the allocation was done separately in AllocateBuffer, so we do not release the pages. Otherwise, we do release the pages. . MapInfo is linked back on the internal free list (see above). It is *never* released with FreePool(). This approach guarantees that IoMmuUnmap() can de-program the IOMMU (= re-set the memory encryption mask) without changing the UEFI memory map. (I trust that MemEncryptSevSetPageEncMask() will not split page tables internally when it *re*sets the encryption mask -- is that correct?) I'm sorry that I didn't catch this earlier in your commit f9d129e68a45 ("OvmfPkg: Add IoMmuDxe driver", 2017-07-06), but as you see, I gave you only an Acked-by on that, not a Reviewed-by. I've really had to think it through from the virtio perspective; I didn't foresee this use case back then (I only felt that I wasn't getting the full picture about the IOMMU protocol details). Thanks Laszlo _______________________________________________ edk2-devel mailing list edk2-devel@lists.01.org https://lists.01.org/mailman/listinfo/edk2-devel
On 27 July 2017 at 18:16, Laszlo Ersek <lersek@redhat.com> wrote: > On 07/27/17 16:21, Brijesh Singh wrote: >> Hi Laszlo, >> >> >>> >>> (5.3) All virtio driver code should treat VIRTIO_F_IOMMU_PLATFORM >>> simply in parallel with VIRTIO_F_VERSION_1, and don't act upon >>> VIRTIO_F_IOMMU_PLATFORM in any special shape or form. So basically >>> this is just my point (3) from above. >>> >>> >>> (5.4) There are three VIRTIO_DEVICE_PROTOCOL implementations in edk2: >>> >>> - "OvmfPkg/VirtioPciDeviceDxe" binds legacy-only and transitional >>> virtio-pci devices, and offers virtio 0.9.5 semantics. >>> >>> - "ArmVirtPkg/VirtioFdtDxe" (via >>> "OvmfPkg/Library/VirtioMmioDeviceLib") binds virtio-mmio devices, >>> and offers virtio 0.9.5 semantics. >>> >>> - "OvmfPkg/Virtio10Dxe" binds modern-only virtio-pci devices, and >>> offers virtio 1.0.0 semantics. >>> >>> The first two drivers should implement the AllocateSharedPages() and >>> FreeSharedPages() member functions simply with the corresponding >>> MemoryAllocationLib functions (using BootServicesData type memory), >>> and implement the MapSharedPages() and UnmapSharedPages() member >>> functions as no-ops (return the input addresses transparently). >>> >>> The third driver should implement all four new member functions by >>> respectively delegating the job to: >>> - EFI_PCI_IO_PROTOCOL.AllocateBuffer() -- with BootServicesData -- >>> - EFI_PCI_IO_PROTOCOL.FreeBuffer() >>> - EFI_PCI_IO_PROTOCOL.Map() -- with BusMasterCommonBuffer64 -- >>> - EFI_PCI_IO_PROTOCOL.Unmap() >>> >> >> I have working to implement patch per your recommendation. I assume >> you mean map the buffers with EfiPciIoOperationBusMasterCommonBuffer >> [1]. > > Yes. > >> If so, I see one issue with SEV guest. When SEV is active, IOMMU >> driver uses a bounce buffer to map host address to a device address. >> While creating bounce buffer we can map it either for >> EfiPciIoOperationBusMasterRead or EfiPciIoOperationBusMasterWrite >> Operation. If caller wants to map >> EfiPciIoOperationBusMasterCommonBuffer then it must allocate the >> buffer using EFI_PCI_IO_PROTOCOL.AllocateBuffer() [2] otherwise we >> will fail to map. > > Correct. > >> I see that PciRootBridgeIo.c has similar check when using a bounce >> buffer for < 4GB use cases [3]. > > Correct. > >> Do you see any issue if we use EfiPciIoOperationBusMasterRead or >> EfiPciIoOperationBusMasterWrite instead of >> EfiPciIoOperationBusMasterCommonBuffer ? > > Yes, I do. The BusMasterRead and BusMasterWrite operations are suitable > for one-shot, uni-directional transfers, where the bounce buffer > contents and the client code contents are exchanged on Map/Unmap. > > However virtio transfers, generally speaking, are not like this. While > the individual memory areas pointed-to by separate virtio descriptors > can me associated with one specific transfer direction (see the > VRING_DESC_F_WRITE flag), the virtio ring area itself is long-lived, and > it is simultaneously read and written by both host and guest, > asynchronously and repeatedly. This calls for BusMasterCommonBuffer. > Once we implement BusMasterCommonBuffer, we can use it for everything > else. > > > Another reason why separate allocation and mapping (and conversely, > separate unmapping and deallocation) are required is the > ExitBootServices() callbacks. In that callback, unmapping must happen > *without* memory allocation or deallocation (because the IOMMU must be > de-programmed, but the UEFI memmap must not be changed), covering all > memory areas that are at that point shared between host and guest > (regardless of transfer direction). > > Normally, Map allocates the bounce buffer internally, and Unmap releases > the bounce buffer internally (for BusMasterRead and BusMasterWrite), > which is not right here. If you use AllocateBuffer() separately, then > Map() -- with BusMasterCommonBuffer -- will not allocate internally, and > Unmap() will also not deallocate internally. So, in the > ExitBootServices() callback, you will be able to call Unmap() only, and > then *not* call FreeBuffer() at all. > > This is why I suggest introducing all four functions (Allocate, Free, > Map, Unmap) to the VIRTIO_DEVICE_PROTOCOL, and why all virtio drivers > should use all four functions explicitly, not just Map and Unmap. > > ... Actually, I think there is a bug in > "OvmfPkg/IoMmuDxe/AmdSevIoMmu.c". You have the following distribution of > operations at the moment: > > - IoMmuAllocateBuffer() allocates pages and clears the memory encryption > mask > > - IoMmuFreeBuffer() re-sets the memory encryption mask, and deallocates > pages > > - IoMmuMap() does nothing at all when BusMasterCommonBuffer is requested > (and IoMmuAllocateBuffer() was called previously). Otherwise, > IoMmuMap() allocates pages, allocates MAP_INFO, and clears the memory > encryption mask. > > - IoMmuUnmap() does nothing when cleaning up a BusMasterCommonBuffer > operation (--> NO_MAPPING). Otherwise, IoMmuUnmap() clears the > encryption mask, and frees both the pages and MAP_INFO. > > This distribution of operations seems wrong. The key point is that > AllocateBuffer() *need not* result in a buffer that is immediately > usable, and that client code is required to call Map() > *unconditionally*, even if BusMasterCommonBuffer is the desired > operation. Therefore, the right distribution of operations is: > > - IoMmuAllocateBuffer() allocates pages and does not touch the > encryption mask.. > > - IoMmuFreeBuffer() deallocates pages and does not touch the encryption > mask. > > - IoMmuMap() does not allocate pages when BusMasterCommonBuffer is > requested, and it allocates pages (bounce buffer) otherwise. > > *Regardless* of BusMaster operation, the following actions are carried > out unconditionally: > > . the memory encryption mask is cleared in this function (and in this > function only), > > . An attempt is made to grab a MAP_INFO structure from an internal > free list (to be introduced!). The head of the list is a new static > variable. If the free list is empty, then a MAP_INFO structure is > allocated with AllocatePool(). The NO_MAPPING macro becomes unused > and can be deleted from the source code. > > - IoMmuUnmap() clears the encryption mask unconditionally. (For this, it > has to consult the MAP_INFO structure that is being passed in from the > caller.) In addition: > > . If MapInfo->Operation is BusMasterCommonBuffer, then we know the > allocation was done separately in AllocateBuffer, so we do not > release the pages. Otherwise, we do release the pages. > > . MapInfo is linked back on the internal free list (see above). It is > *never* released with FreePool(). > > This approach guarantees that IoMmuUnmap() can de-program the IOMMU (= > re-set the memory encryption mask) without changing the UEFI memory > map. (I trust that MemEncryptSevSetPageEncMask() will not split page > tables internally when it *re*sets the encryption mask -- is that > correct?) > > I'm sorry that I didn't catch this earlier in your commit f9d129e68a45 > ("OvmfPkg: Add IoMmuDxe driver", 2017-07-06), but as you see, I gave you > only an Acked-by on that, not a Reviewed-by. I've really had to think it > through from the virtio perspective; I didn't foresee this use case back > then (I only felt that I wasn't getting the full picture about the IOMMU > protocol details). > I have to concur with Laszlo here: the respective semantics of the allocate/map/unmap/free operations should be identical to their counterparts in the PCI I/O protocol, and in the DmaLib in EmbeddedPkg. Note that this is likely to cause problems with proprietary x86 drivers in option ROMs, which are used to getting away with using host addresses for DMA, and fail to invoke Map() for common buffers (or fail to invoke AllocateBuffer() altogether). The API is clear, and drivers that abide by the rules should operate correctly even when using encrypted memory or non-1:1 mapping between the host and PCI address space (which is how I ran into these issues) _______________________________________________ edk2-devel mailing list edk2-devel@lists.01.org https://lists.01.org/mailman/listinfo/edk2-devel
On 07/27/2017 12:56 PM, Ard Biesheuvel wrote: > On 27 July 2017 at 18:16, Laszlo Ersek <lersek@redhat.com> wrote: >> On 07/27/17 16:21, Brijesh Singh wrote: >>> Hi Laszlo, >>> >>> >>>> >>>> (5.3) All virtio driver code should treat VIRTIO_F_IOMMU_PLATFORM >>>> simply in parallel with VIRTIO_F_VERSION_1, and don't act upon >>>> VIRTIO_F_IOMMU_PLATFORM in any special shape or form. So basically >>>> this is just my point (3) from above. >>>> >>>> >>>> (5.4) There are three VIRTIO_DEVICE_PROTOCOL implementations in edk2: >>>> >>>> - "OvmfPkg/VirtioPciDeviceDxe" binds legacy-only and transitional >>>> virtio-pci devices, and offers virtio 0.9.5 semantics. >>>> >>>> - "ArmVirtPkg/VirtioFdtDxe" (via >>>> "OvmfPkg/Library/VirtioMmioDeviceLib") binds virtio-mmio devices, >>>> and offers virtio 0.9.5 semantics. >>>> >>>> - "OvmfPkg/Virtio10Dxe" binds modern-only virtio-pci devices, and >>>> offers virtio 1.0.0 semantics. >>>> >>>> The first two drivers should implement the AllocateSharedPages() and >>>> FreeSharedPages() member functions simply with the corresponding >>>> MemoryAllocationLib functions (using BootServicesData type memory), >>>> and implement the MapSharedPages() and UnmapSharedPages() member >>>> functions as no-ops (return the input addresses transparently). >>>> >>>> The third driver should implement all four new member functions by >>>> respectively delegating the job to: >>>> - EFI_PCI_IO_PROTOCOL.AllocateBuffer() -- with BootServicesData -- >>>> - EFI_PCI_IO_PROTOCOL.FreeBuffer() >>>> - EFI_PCI_IO_PROTOCOL.Map() -- with BusMasterCommonBuffer64 -- >>>> - EFI_PCI_IO_PROTOCOL.Unmap() >>>> >>> >>> I have working to implement patch per your recommendation. I assume >>> you mean map the buffers with EfiPciIoOperationBusMasterCommonBuffer >>> [1]. >> >> Yes. >> >>> If so, I see one issue with SEV guest. When SEV is active, IOMMU >>> driver uses a bounce buffer to map host address to a device address. >>> While creating bounce buffer we can map it either for >>> EfiPciIoOperationBusMasterRead or EfiPciIoOperationBusMasterWrite >>> Operation. If caller wants to map >>> EfiPciIoOperationBusMasterCommonBuffer then it must allocate the >>> buffer using EFI_PCI_IO_PROTOCOL.AllocateBuffer() [2] otherwise we >>> will fail to map. >> >> Correct. >> >>> I see that PciRootBridgeIo.c has similar check when using a bounce >>> buffer for < 4GB use cases [3]. >> >> Correct. >> >>> Do you see any issue if we use EfiPciIoOperationBusMasterRead or >>> EfiPciIoOperationBusMasterWrite instead of >>> EfiPciIoOperationBusMasterCommonBuffer ? >> >> Yes, I do. The BusMasterRead and BusMasterWrite operations are suitable >> for one-shot, uni-directional transfers, where the bounce buffer >> contents and the client code contents are exchanged on Map/Unmap. >> >> However virtio transfers, generally speaking, are not like this. While >> the individual memory areas pointed-to by separate virtio descriptors >> can me associated with one specific transfer direction (see the >> VRING_DESC_F_WRITE flag), the virtio ring area itself is long-lived, and >> it is simultaneously read and written by both host and guest, >> asynchronously and repeatedly. This calls for BusMasterCommonBuffer. >> Once we implement BusMasterCommonBuffer, we can use it for everything >> else. >> >> >> Another reason why separate allocation and mapping (and conversely, >> separate unmapping and deallocation) are required is the >> ExitBootServices() callbacks. In that callback, unmapping must happen >> *without* memory allocation or deallocation (because the IOMMU must be >> de-programmed, but the UEFI memmap must not be changed), covering all >> memory areas that are at that point shared between host and guest >> (regardless of transfer direction). >> >> Normally, Map allocates the bounce buffer internally, and Unmap releases >> the bounce buffer internally (for BusMasterRead and BusMasterWrite), >> which is not right here. If you use AllocateBuffer() separately, then >> Map() -- with BusMasterCommonBuffer -- will not allocate internally, and >> Unmap() will also not deallocate internally. So, in the >> ExitBootServices() callback, you will be able to call Unmap() only, and >> then *not* call FreeBuffer() at all. >> >> This is why I suggest introducing all four functions (Allocate, Free, >> Map, Unmap) to the VIRTIO_DEVICE_PROTOCOL, and why all virtio drivers >> should use all four functions explicitly, not just Map and Unmap. >> >> ... Actually, I think there is a bug in >> "OvmfPkg/IoMmuDxe/AmdSevIoMmu.c". You have the following distribution of >> operations at the moment: >> >> - IoMmuAllocateBuffer() allocates pages and clears the memory encryption >> mask >> >> - IoMmuFreeBuffer() re-sets the memory encryption mask, and deallocates >> pages >> >> - IoMmuMap() does nothing at all when BusMasterCommonBuffer is requested >> (and IoMmuAllocateBuffer() was called previously). Otherwise, >> IoMmuMap() allocates pages, allocates MAP_INFO, and clears the memory >> encryption mask. >> >> - IoMmuUnmap() does nothing when cleaning up a BusMasterCommonBuffer >> operation (--> NO_MAPPING). Otherwise, IoMmuUnmap() clears the >> encryption mask, and frees both the pages and MAP_INFO. >> I agree with you, there is a bug in AmdSevIoMmu.c. I will fix it. And introduce list to track if the buffer was allocated by us. If buffer was allocated by us then we can safely say that it does not require a bounce buffer. While implementing the initial code I was not able to see BusMasterCommonBuffer usage. But with virtio things are getting a bit more clear in my mind. >> This distribution of operations seems wrong. The key point is that >> AllocateBuffer() *need not* result in a buffer that is immediately >> usable, and that client code is required to call Map() >> *unconditionally*, even if BusMasterCommonBuffer is the desired >> operation. Therefore, the right distribution of operations is: >> >> - IoMmuAllocateBuffer() allocates pages and does not touch the >> encryption mask.. >> >> - IoMmuFreeBuffer() deallocates pages and does not touch the encryption >> mask. >> Actually one of main reason why we cleared and restored the memory encryption mask during allocate/free is because we also consume the IOMMU protocol in QemuFwCfgLib as a method to allocate and free a DMA buffer. I am certainly open to suggestions. [1] https://github.com/tianocore/edk2/blob/master/OvmfPkg/Library/QemuFwCfgLib/QemuFwCfgDxe.c#L159 [2] https://github.com/tianocore/edk2/blob/master/OvmfPkg/Library/QemuFwCfgLib/QemuFwCfgDxe.c#L197 >> - IoMmuMap() does not allocate pages when BusMasterCommonBuffer is >> requested, and it allocates pages (bounce buffer) otherwise. >> I am trying to wrap my head around how we can support BusMasterCommonBuffer when buffer was not allocated by us. Changing the memory encryption mask in a page table will not update the contents. Also since the memory encryption mask works on PAGE_SIZE hence changing the encryption mask on not our allocated buffer could mess things up (e.g if NumberOfBytes is not PAGE_SIZE aligned). >> *Regardless* of BusMaster operation, the following actions are carried >> out unconditionally: >> >> . the memory encryption mask is cleared in this function (and in this >> function only), >> >> . An attempt is made to grab a MAP_INFO structure from an internal >> free list (to be introduced!). The head of the list is a new static >> variable. If the free list is empty, then a MAP_INFO structure is >> allocated with AllocatePool(). The NO_MAPPING macro becomes unused >> and can be deleted from the source code. >> >> - IoMmuUnmap() clears the encryption mask unconditionally. (For this, it >> has to consult the MAP_INFO structure that is being passed in from the >> caller.) In addition: >> >> . If MapInfo->Operation is BusMasterCommonBuffer, then we know the >> allocation was done separately in AllocateBuffer, so we do not >> release the pages. Otherwise, we do release the pages. >> >> . MapInfo is linked back on the internal free list (see above). It is >> *never* released with FreePool(). >> >> This approach guarantees that IoMmuUnmap() can de-program the IOMMU (= >> re-set the memory encryption mask) without changing the UEFI memory >> map. (I trust that MemEncryptSevSetPageEncMask() will not split page >> tables internally when it *re*sets the encryption mask -- is that >> correct?) Yes, MemEncryptSevSetPageEncMask() will not split pages when it restores mask. >>>> I'm sorry that I didn't catch this earlier in your commit f9d129e68a45 >> ("OvmfPkg: Add IoMmuDxe driver", 2017-07-06), but as you see, I gave you >> only an Acked-by on that, not a Reviewed-by. I've really had to think it >> through from the virtio perspective; I didn't foresee this use case back >> then (I only felt that I wasn't getting the full picture about the IOMMU >> protocol details). >> > > I have to concur with Laszlo here: the respective semantics of the > allocate/map/unmap/free operations should be identical to their > counterparts in the PCI I/O protocol, and in the DmaLib in > EmbeddedPkg. > > Note that this is likely to cause problems with proprietary x86 > drivers in option ROMs, which are used to getting away with using host > addresses for DMA, and fail to invoke Map() for common buffers (or > fail to invoke AllocateBuffer() altogether). The API is clear, and > drivers that abide by the rules should operate correctly even when > using encrypted memory or non-1:1 mapping between the host and PCI > address space (which is how I ran into these issues) > _______________________________________________ edk2-devel mailing list edk2-devel@lists.01.org https://lists.01.org/mailman/listinfo/edk2-devel
On 07/27/2017 02:00 PM, Brijesh Singh wrote: >>> This distribution of operations seems wrong. The key point is that >>> AllocateBuffer() *need not* result in a buffer that is immediately >>> usable, and that client code is required to call Map() >>> *unconditionally*, even if BusMasterCommonBuffer is the desired >>> operation. Therefore, the right distribution of operations is: >>> >>> - IoMmuAllocateBuffer() allocates pages and does not touch the >>> encryption mask.. >>> >>> - IoMmuFreeBuffer() deallocates pages and does not touch the encryption >>> mask. >>> > > Actually one of main reason why we cleared and restored the memory encryption mask > during allocate/free is because we also consume the IOMMU protocol in QemuFwCfgLib > as a method to allocate and free a DMA buffer. I am certainly open to suggestions. > > [1] https://github.com/tianocore/edk2/blob/master/OvmfPkg/Library/QemuFwCfgLib/QemuFwCfgDxe.c#L159 > [2] https://github.com/tianocore/edk2/blob/master/OvmfPkg/Library/QemuFwCfgLib/QemuFwCfgDxe.c#L197 > >>> - IoMmuMap() does not allocate pages when BusMasterCommonBuffer is >>> requested, and it allocates pages (bounce buffer) otherwise. >>> > > I am trying to wrap my head around how we can support BusMasterCommonBuffer > when buffer was not allocated by us. Changing the memory encryption mask in > a page table will not update the contents. Also since the memory encryption > mask works on PAGE_SIZE hence changing the encryption mask on not our allocated > buffer could mess things up (e.g if NumberOfBytes is not PAGE_SIZE aligned). > I may be missing something in my understanding. Here is a flow I have in my mind, please correct me. OvmfPkg/VirtIoBlk.c: VirtioBlkInit() .... .... VirtioRingInit Virtio->AllocateSharedPages(RingSize, &Ring->Base) PciIo->AllocatePages(RingSize, &RingAddress) Virtio->MapSharedPages(...,BusMasterCommonBuffer, Ring->Base, RingSize, &RingDeviceAddress) ..... ..... This case is straight forward and we can easily maps. No need for bounce buffering. VirtioBlkReadBlocks(..., BufferSize, Buffer,) ...... ...... SynchronousRequest(..., BufferSize, Buffer) .... Virtio->MapSharedPages(..., BusMasterCommonBuffer, Buffer, BufferSize, &DeviceAddress) VirtioAppendDesc(DeviceAddress, BufferSize, ...) VirtioFlush (...) In the above case, "Buffer" was not allocated by us hence we will not able to change the memory encryption attributes. Am I missing something in the flow ? >>> *Regardless* of BusMaster operation, the following actions are carried >>> out unconditionally: >>> >>> . the memory encryption mask is cleared in this function (and in this >>> function only), >>> >>> . An attempt is made to grab a MAP_INFO structure from an internal >>> free list (to be introduced!). The head of the list is a new static >>> variable. If the free list is empty, then a MAP_INFO structure is >>> allocated with AllocatePool(). The NO_MAPPING macro becomes unused >>> and can be deleted from the source code. >>> >>> - IoMmuUnmap() clears the encryption mask unconditionally. (For this, it >>> has to consult the MAP_INFO structure that is being passed in from the >>> caller.) In addition: >>> >>> . If MapInfo->Operation is BusMasterCommonBuffer, then we know the >>> allocation was done separately in AllocateBuffer, so we do not >>> release the pages. Otherwise, we do release the pages. >>> >>> . MapInfo is linked back on the internal free list (see above). It is >>> *never* released with FreePool(). >>> >>> This approach guarantees that IoMmuUnmap() can de-program the IOMMU (= >>> re-set the memory encryption mask) without changing the UEFI memory >>> map. (I trust that MemEncryptSevSetPageEncMask() will not split page >>> tables internally when it *re*sets the encryption mask -- is that >>> correct?) > _______________________________________________ edk2-devel mailing list edk2-devel@lists.01.org https://lists.01.org/mailman/listinfo/edk2-devel
> On 27 Jul 2017, at 21:55, Brijesh Singh <brijesh.singh@amd.com> wrote: > > > > On 07/27/2017 02:00 PM, Brijesh Singh wrote: > >>>> This distribution of operations seems wrong. The key point is that >>>> AllocateBuffer() *need not* result in a buffer that is immediately >>>> usable, and that client code is required to call Map() >>>> *unconditionally*, even if BusMasterCommonBuffer is the desired >>>> operation. Therefore, the right distribution of operations is: >>>> >>>> - IoMmuAllocateBuffer() allocates pages and does not touch the >>>> encryption mask.. >>>> >>>> - IoMmuFreeBuffer() deallocates pages and does not touch the encryption >>>> mask. >>>> >> Actually one of main reason why we cleared and restored the memory encryption mask >> during allocate/free is because we also consume the IOMMU protocol in QemuFwCfgLib >> as a method to allocate and free a DMA buffer. I am certainly open to suggestions. >> [1] https://github.com/tianocore/edk2/blob/master/OvmfPkg/Library/QemuFwCfgLib/QemuFwCfgDxe.c#L159 >> [2] https://github.com/tianocore/edk2/blob/master/OvmfPkg/Library/QemuFwCfgLib/QemuFwCfgDxe.c#L197 >>>> - IoMmuMap() does not allocate pages when BusMasterCommonBuffer is >>>> requested, and it allocates pages (bounce buffer) otherwise. >>>> >> I am trying to wrap my head around how we can support BusMasterCommonBuffer >> when buffer was not allocated by us. Changing the memory encryption mask in >> a page table will not update the contents. Also since the memory encryption >> mask works on PAGE_SIZE hence changing the encryption mask on not our allocated >> buffer could mess things up (e.g if NumberOfBytes is not PAGE_SIZE aligned). > > I may be missing something in my understanding. Here is a flow I have in my > mind, please correct me. > > OvmfPkg/VirtIoBlk.c: > > VirtioBlkInit() > .... > .... > VirtioRingInit > Virtio->AllocateSharedPages(RingSize, &Ring->Base) > PciIo->AllocatePages(RingSize, &RingAddress) > Virtio->MapSharedPages(...,BusMasterCommonBuffer, Ring->Base, RingSize, &RingDeviceAddress) > ..... > ..... > > This case is straight forward and we can easily maps. No need for bounce buffering. > > VirtioBlkReadBlocks(..., BufferSize, Buffer,) > ...... > ...... > SynchronousRequest(..., BufferSize, Buffer) > .... > Virtio->MapSharedPages(..., BusMasterCommonBuffer, Buffer, BufferSize, &DeviceAddress) > VirtioAppendDesc(DeviceAddress, BufferSize, ...) > VirtioFlush (...) > > In the above case, "Buffer" was not allocated by us hence we will not able to change the > memory encryption attributes. Am I missing something in the flow ? > Common buffer mappings may only be created from buffers that were allocated by AllocateBuffer(). In fact, that is its main purpose > >>>> *Regardless* of BusMaster operation, the following actions are carried >>>> out unconditionally: >>>> >>>> . the memory encryption mask is cleared in this function (and in this >>>> function only), >>>> >>>> . An attempt is made to grab a MAP_INFO structure from an internal >>>> free list (to be introduced!). The head of the list is a new static >>>> variable. If the free list is empty, then a MAP_INFO structure is >>>> allocated with AllocatePool(). The NO_MAPPING macro becomes unused >>>> and can be deleted from the source code. >>>> >>>> - IoMmuUnmap() clears the encryption mask unconditionally. (For this, it >>>> has to consult the MAP_INFO structure that is being passed in from the >>>> caller.) In addition: >>>> >>>> . If MapInfo->Operation is BusMasterCommonBuffer, then we know the >>>> allocation was done separately in AllocateBuffer, so we do not >>>> release the pages. Otherwise, we do release the pages. >>>> >>>> . MapInfo is linked back on the internal free list (see above). It is >>>> *never* released with FreePool(). >>>> >>>> This approach guarantees that IoMmuUnmap() can de-program the IOMMU (= >>>> re-set the memory encryption mask) without changing the UEFI memory >>>> map. (I trust that MemEncryptSevSetPageEncMask() will not split page >>>> tables internally when it *re*sets the encryption mask -- is that >>>> correct?) > > > _______________________________________________ edk2-devel mailing list edk2-devel@lists.01.org https://lists.01.org/mailman/listinfo/edk2-devel
> On Jul 27, 2017, at 2:31 PM, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote: > >> >> On 27 Jul 2017, at 21:55, Brijesh Singh <brijesh.singh@amd.com> wrote: >> >> >> >> On 07/27/2017 02:00 PM, Brijesh Singh wrote: >> >>>>> This distribution of operations seems wrong. The key point is that >>>>> AllocateBuffer() *need not* result in a buffer that is immediately >>>>> usable, and that client code is required to call Map() >>>>> *unconditionally*, even if BusMasterCommonBuffer is the desired >>>>> operation. Therefore, the right distribution of operations is: >>>>> >>>>> - IoMmuAllocateBuffer() allocates pages and does not touch the >>>>> encryption mask.. >>>>> >>>>> - IoMmuFreeBuffer() deallocates pages and does not touch the encryption >>>>> mask. >>>>> >>> Actually one of main reason why we cleared and restored the memory encryption mask >>> during allocate/free is because we also consume the IOMMU protocol in QemuFwCfgLib >>> as a method to allocate and free a DMA buffer. I am certainly open to suggestions. >>> [1] https://github.com/tianocore/edk2/blob/master/OvmfPkg/Library/QemuFwCfgLib/QemuFwCfgDxe.c#L159 >>> [2] https://github.com/tianocore/edk2/blob/master/OvmfPkg/Library/QemuFwCfgLib/QemuFwCfgDxe.c#L197 >>>>> - IoMmuMap() does not allocate pages when BusMasterCommonBuffer is >>>>> requested, and it allocates pages (bounce buffer) otherwise. >>>>> >>> I am trying to wrap my head around how we can support BusMasterCommonBuffer >>> when buffer was not allocated by us. Changing the memory encryption mask in >>> a page table will not update the contents. Also since the memory encryption >>> mask works on PAGE_SIZE hence changing the encryption mask on not our allocated >>> buffer could mess things up (e.g if NumberOfBytes is not PAGE_SIZE aligned). >> >> I may be missing something in my understanding. Here is a flow I have in my >> mind, please correct me. >> >> OvmfPkg/VirtIoBlk.c: >> >> VirtioBlkInit() >> .... >> .... >> VirtioRingInit >> Virtio->AllocateSharedPages(RingSize, &Ring->Base) >> PciIo->AllocatePages(RingSize, &RingAddress) >> Virtio->MapSharedPages(...,BusMasterCommonBuffer, Ring->Base, RingSize, &RingDeviceAddress) >> ..... >> ..... >> >> This case is straight forward and we can easily maps. No need for bounce buffering. >> >> VirtioBlkReadBlocks(..., BufferSize, Buffer,) >> ...... >> ...... >> SynchronousRequest(..., BufferSize, Buffer) >> .... >> Virtio->MapSharedPages(..., BusMasterCommonBuffer, Buffer, BufferSize, &DeviceAddress) >> VirtioAppendDesc(DeviceAddress, BufferSize, ...) >> VirtioFlush (...) >> >> In the above case, "Buffer" was not allocated by us hence we will not able to change the >> memory encryption attributes. Am I missing something in the flow ? >> > > > Common buffer mappings may only be created from buffers that were allocated by AllocateBuffer(). In fact, that is its main purpose Brijesh, If you look in the UEFI Spec 13.4 EFI PCI I/O Protocol there is a good write on DMA. DMA Bus Master Read Operation ========================= Call Map() for EfiPciIoOperationBusMasterRead. Program the DMA Bus Master with the DeviceAddress returned by Map(). Start the DMA Bus Master. Wait for DMA Bus Master to complete the read operation. Call Unmap(). DMA Bus Master Write Operation ========================== Call Map() for EfiPciOperationBusMasterWrite. Program the DMA Bus Master with the DeviceAddress returned by Map(). Start the DMA Bus Master. Wait for DMA Bus Master to complete the write operation. Perform a PCI controller specific read transaction to flush all PCI write buffers (See PCI Specification Section 3.2.5.2) . Call Flush(). Call Unmap(). DMA Bus Master Common Buffer Operation ================================== Call AllocateBuffer() to allocate a common buffer. Call Map() for EfiPciIoOperationBusMasterCommonBuffer. Program the DMA Bus Master with the DeviceAddress returned by Map(). The common buffer can now be accessed equally by the processor and the DMA bus master. Call Unmap(). Call FreeBuffer(). Thanks, Andrew Fish >> >>>>> *Regardless* of BusMaster operation, the following actions are carried >>>>> out unconditionally: >>>>> >>>>> . the memory encryption mask is cleared in this function (and in this >>>>> function only), >>>>> >>>>> . An attempt is made to grab a MAP_INFO structure from an internal >>>>> free list (to be introduced!). The head of the list is a new static >>>>> variable. If the free list is empty, then a MAP_INFO structure is >>>>> allocated with AllocatePool(). The NO_MAPPING macro becomes unused >>>>> and can be deleted from the source code. >>>>> >>>>> - IoMmuUnmap() clears the encryption mask unconditionally. (For this, it >>>>> has to consult the MAP_INFO structure that is being passed in from the >>>>> caller.) In addition: >>>>> >>>>> . If MapInfo->Operation is BusMasterCommonBuffer, then we know the >>>>> allocation was done separately in AllocateBuffer, so we do not >>>>> release the pages. Otherwise, we do release the pages. >>>>> >>>>> . MapInfo is linked back on the internal free list (see above). It is >>>>> *never* released with FreePool(). >>>>> >>>>> This approach guarantees that IoMmuUnmap() can de-program the IOMMU (= >>>>> re-set the memory encryption mask) without changing the UEFI memory >>>>> map. (I trust that MemEncryptSevSetPageEncMask() will not split page >>>>> tables internally when it *re*sets the encryption mask -- is that >>>>> correct?) >> >> >> > _______________________________________________ > edk2-devel mailing list > edk2-devel@lists.01.org > https://lists.01.org/mailman/listinfo/edk2-devel _______________________________________________ edk2-devel mailing list edk2-devel@lists.01.org https://lists.01.org/mailman/listinfo/edk2-devel
Hi Andrew, On 07/27/2017 04:38 PM, Andrew Fish wrote: > > Brijesh, > > If you look in the UEFI Spec 13.4 EFI PCI I/O Protocol there is a good write on DMA. > > DMA Bus Master Read Operation > ========================= > Call Map() for EfiPciIoOperationBusMasterRead. > Program the DMA Bus Master with the DeviceAddress returned by Map(). Start the DMA Bus Master. > Wait for DMA Bus Master to complete the read operation. > Call Unmap(). > > > DMA Bus Master Write Operation > ========================== > Call Map() for EfiPciOperationBusMasterWrite. > Program the DMA Bus Master with the DeviceAddress returned by Map(). > Start the DMA Bus Master. > Wait for DMA Bus Master to complete the write operation. > Perform a PCI controller specific read transaction to flush all PCI write buffers (See PCI Specification Section 3.2.5.2) . > Call Flush(). > Call Unmap(). > > DMA Bus Master Common Buffer Operation > ================================== > Call AllocateBuffer() to allocate a common buffer. > Call Map() for EfiPciIoOperationBusMasterCommonBuffer. > Program the DMA Bus Master with the DeviceAddress returned by Map(). > The common buffer can now be accessed equally by the processor and the DMA bus master. Call Unmap(). > Call FreeBuffer(). > Thanks for the link. -Brijesh _______________________________________________ edk2-devel mailing list edk2-devel@lists.01.org https://lists.01.org/mailman/listinfo/edk2-devel
On 07/27/2017 04:31 PM, Ard Biesheuvel wrote: > >> On 27 Jul 2017, at 21:55, Brijesh Singh <brijesh.singh@amd.com> wrote: >> >> >> >> On 07/27/2017 02:00 PM, Brijesh Singh wrote: >> >>>>> This distribution of operations seems wrong. The key point is that >>>>> AllocateBuffer() *need not* result in a buffer that is immediately >>>>> usable, and that client code is required to call Map() >>>>> *unconditionally*, even if BusMasterCommonBuffer is the desired >>>>> operation. Therefore, the right distribution of operations is: >>>>> >>>>> - IoMmuAllocateBuffer() allocates pages and does not touch the >>>>> encryption mask.. >>>>> >>>>> - IoMmuFreeBuffer() deallocates pages and does not touch the encryption >>>>> mask. >>>>> >>> Actually one of main reason why we cleared and restored the memory encryption mask >>> during allocate/free is because we also consume the IOMMU protocol in QemuFwCfgLib >>> as a method to allocate and free a DMA buffer. I am certainly open to suggestions. >>> [1] https://github.com/tianocore/edk2/blob/master/OvmfPkg/Library/QemuFwCfgLib/QemuFwCfgDxe.c#L159 >>> [2] https://github.com/tianocore/edk2/blob/master/OvmfPkg/Library/QemuFwCfgLib/QemuFwCfgDxe.c#L197 >>>>> - IoMmuMap() does not allocate pages when BusMasterCommonBuffer is >>>>> requested, and it allocates pages (bounce buffer) otherwise. >>>>> >>> I am trying to wrap my head around how we can support BusMasterCommonBuffer >>> when buffer was not allocated by us. Changing the memory encryption mask in >>> a page table will not update the contents. Also since the memory encryption >>> mask works on PAGE_SIZE hence changing the encryption mask on not our allocated >>> buffer could mess things up (e.g if NumberOfBytes is not PAGE_SIZE aligned). >> >> I may be missing something in my understanding. Here is a flow I have in my >> mind, please correct me. >> >> OvmfPkg/VirtIoBlk.c: >> >> VirtioBlkInit() >> .... >> .... >> VirtioRingInit >> Virtio->AllocateSharedPages(RingSize, &Ring->Base) >> PciIo->AllocatePages(RingSize, &RingAddress) >> Virtio->MapSharedPages(...,BusMasterCommonBuffer, Ring->Base, RingSize, &RingDeviceAddress) >> ..... >> ..... >> >> This case is straight forward and we can easily maps. No need for bounce buffering. >> >> VirtioBlkReadBlocks(..., BufferSize, Buffer,) >> ...... >> ...... >> SynchronousRequest(..., BufferSize, Buffer) >> .... >> Virtio->MapSharedPages(..., BusMasterCommonBuffer, Buffer, BufferSize, &DeviceAddress) >> VirtioAppendDesc(DeviceAddress, BufferSize, ...) >> VirtioFlush (...) >> >> In the above case, "Buffer" was not allocated by us hence we will not able to change the >> memory encryption attributes. Am I missing something in the flow ? >> > > > Common buffer mappings may only be created from buffers that were allocated by AllocateBuffer(). In fact, that is its main purpose Yes, that part is well understood. If the buffer was allocated by us (e.g vring, request/status structure etc) then those should be mapped as "BusMasterCommonBuffer". But I am trying to figure out, how to map a data buffers before issuing a virtio request. e.g when VirtioBlkReadBlocks() is called, "Buffer" pointer is not a DMA address hence we need to map it. I think it should be mapped using "BusMasterWrite" not "BusMasterCommonBuffer" before adding into vring. >> >>>>> *Regardless* of BusMaster operation, the following actions are carried >>>>> out unconditionally: >>>>> >>>>> . the memory encryption mask is cleared in this function (and in this >>>>> function only), >>>>> >>>>> . An attempt is made to grab a MAP_INFO structure from an internal >>>>> free list (to be introduced!). The head of the list is a new static >>>>> variable. If the free list is empty, then a MAP_INFO structure is >>>>> allocated with AllocatePool(). The NO_MAPPING macro becomes unused >>>>> and can be deleted from the source code. >>>>> >>>>> - IoMmuUnmap() clears the encryption mask unconditionally. (For this, it >>>>> has to consult the MAP_INFO structure that is being passed in from the >>>>> caller.) In addition: >>>>> >>>>> . If MapInfo->Operation is BusMasterCommonBuffer, then we know the >>>>> allocation was done separately in AllocateBuffer, so we do not >>>>> release the pages. Otherwise, we do release the pages. >>>>> >>>>> . MapInfo is linked back on the internal free list (see above). It is >>>>> *never* released with FreePool(). >>>>> >>>>> This approach guarantees that IoMmuUnmap() can de-program the IOMMU (= >>>>> re-set the memory encryption mask) without changing the UEFI memory >>>>> map. (I trust that MemEncryptSevSetPageEncMask() will not split page >>>>> tables internally when it *re*sets the encryption mask -- is that >>>>> correct?) >> >> >> _______________________________________________ edk2-devel mailing list edk2-devel@lists.01.org https://lists.01.org/mailman/listinfo/edk2-devel
On 27 July 2017 at 23:10, Brijesh Singh <brijesh.singh@amd.com> wrote: > > > On 07/27/2017 04:31 PM, Ard Biesheuvel wrote: >> >> >>> On 27 Jul 2017, at 21:55, Brijesh Singh <brijesh.singh@amd.com> wrote: >>> >>> >>> >>> On 07/27/2017 02:00 PM, Brijesh Singh wrote: >>> >>>>>> This distribution of operations seems wrong. The key point is that >>>>>> AllocateBuffer() *need not* result in a buffer that is immediately >>>>>> usable, and that client code is required to call Map() >>>>>> *unconditionally*, even if BusMasterCommonBuffer is the desired >>>>>> operation. Therefore, the right distribution of operations is: >>>>>> >>>>>> - IoMmuAllocateBuffer() allocates pages and does not touch the >>>>>> encryption mask.. >>>>>> >>>>>> - IoMmuFreeBuffer() deallocates pages and does not touch the >>>>>> encryption >>>>>> mask. >>>>>> >>>> Actually one of main reason why we cleared and restored the memory >>>> encryption mask >>>> during allocate/free is because we also consume the IOMMU protocol in >>>> QemuFwCfgLib >>>> as a method to allocate and free a DMA buffer. I am certainly open to >>>> suggestions. >>>> [1] >>>> https://github.com/tianocore/edk2/blob/master/OvmfPkg/Library/QemuFwCfgLib/QemuFwCfgDxe.c#L159 >>>> [2] >>>> https://github.com/tianocore/edk2/blob/master/OvmfPkg/Library/QemuFwCfgLib/QemuFwCfgDxe.c#L197 >>>>>> >>>>>> - IoMmuMap() does not allocate pages when BusMasterCommonBuffer is >>>>>> requested, and it allocates pages (bounce buffer) otherwise. >>>>>> >>>> I am trying to wrap my head around how we can support >>>> BusMasterCommonBuffer >>>> when buffer was not allocated by us. Changing the memory encryption mask >>>> in >>>> a page table will not update the contents. Also since the memory >>>> encryption >>>> mask works on PAGE_SIZE hence changing the encryption mask on not our >>>> allocated >>>> buffer could mess things up (e.g if NumberOfBytes is not PAGE_SIZE >>>> aligned). >>> >>> >>> I may be missing something in my understanding. Here is a flow I have in >>> my >>> mind, please correct me. >>> >>> OvmfPkg/VirtIoBlk.c: >>> >>> VirtioBlkInit() >>> .... >>> .... >>> VirtioRingInit >>> Virtio->AllocateSharedPages(RingSize, &Ring->Base) >>> PciIo->AllocatePages(RingSize, &RingAddress) >>> Virtio->MapSharedPages(...,BusMasterCommonBuffer, Ring->Base, >>> RingSize, &RingDeviceAddress) >>> ..... >>> ..... >>> >>> This case is straight forward and we can easily maps. No need for bounce >>> buffering. >>> >>> VirtioBlkReadBlocks(..., BufferSize, Buffer,) >>> ...... >>> ...... >>> SynchronousRequest(..., BufferSize, Buffer) >>> .... >>> Virtio->MapSharedPages(..., BusMasterCommonBuffer, Buffer, >>> BufferSize, &DeviceAddress) >>> VirtioAppendDesc(DeviceAddress, BufferSize, ...) >>> VirtioFlush (...) >>> In the above case, "Buffer" was not allocated by us hence we will not >>> able to change the >>> memory encryption attributes. Am I missing something in the flow ? >>> >> >> >> Common buffer mappings may only be created from buffers that were >> allocated by AllocateBuffer(). In fact, that is its main purpose > > > Yes, that part is well understood. If the buffer was allocated by us (e.g > vring, request/status > structure etc) then those should be mapped as "BusMasterCommonBuffer". > > But I am trying to figure out, how to map a data buffers before issuing a > virtio request. e.g when > VirtioBlkReadBlocks() is called, "Buffer" pointer is not a DMA address hence > we need to map it. > I think it should be mapped using "BusMasterWrite" not > "BusMasterCommonBuffer" before adding into vring. > If the transfer is strictly unidirectional, then that should work. If the transfer goes both ways, you may need to map/unmap for read and then map/unmap for write _______________________________________________ edk2-devel mailing list edk2-devel@lists.01.org https://lists.01.org/mailman/listinfo/edk2-devel
On 07/28/17 10:39, Ard Biesheuvel wrote: > On 27 July 2017 at 23:10, Brijesh Singh <brijesh.singh@amd.com> wrote: >> >> >> On 07/27/2017 04:31 PM, Ard Biesheuvel wrote: >>> >>> >>>> On 27 Jul 2017, at 21:55, Brijesh Singh <brijesh.singh@amd.com> wrote: >>>> >>>> >>>> >>>> On 07/27/2017 02:00 PM, Brijesh Singh wrote: >>>> >>>>>>> This distribution of operations seems wrong. The key point is that >>>>>>> AllocateBuffer() *need not* result in a buffer that is immediately >>>>>>> usable, and that client code is required to call Map() >>>>>>> *unconditionally*, even if BusMasterCommonBuffer is the desired >>>>>>> operation. Therefore, the right distribution of operations is: >>>>>>> >>>>>>> - IoMmuAllocateBuffer() allocates pages and does not touch the >>>>>>> encryption mask.. >>>>>>> >>>>>>> - IoMmuFreeBuffer() deallocates pages and does not touch the >>>>>>> encryption >>>>>>> mask. >>>>>>> >>>>> Actually one of main reason why we cleared and restored the memory >>>>> encryption mask >>>>> during allocate/free is because we also consume the IOMMU protocol in >>>>> QemuFwCfgLib >>>>> as a method to allocate and free a DMA buffer. I am certainly open to >>>>> suggestions. >>>>> [1] >>>>> https://github.com/tianocore/edk2/blob/master/OvmfPkg/Library/QemuFwCfgLib/QemuFwCfgDxe.c#L159 >>>>> [2] >>>>> https://github.com/tianocore/edk2/blob/master/OvmfPkg/Library/QemuFwCfgLib/QemuFwCfgDxe.c#L197 >>>>>>> >>>>>>> - IoMmuMap() does not allocate pages when BusMasterCommonBuffer is >>>>>>> requested, and it allocates pages (bounce buffer) otherwise. >>>>>>> >>>>> I am trying to wrap my head around how we can support >>>>> BusMasterCommonBuffer >>>>> when buffer was not allocated by us. Changing the memory encryption mask >>>>> in >>>>> a page table will not update the contents. Also since the memory >>>>> encryption >>>>> mask works on PAGE_SIZE hence changing the encryption mask on not our >>>>> allocated >>>>> buffer could mess things up (e.g if NumberOfBytes is not PAGE_SIZE >>>>> aligned). >>>> >>>> >>>> I may be missing something in my understanding. Here is a flow I have in >>>> my >>>> mind, please correct me. >>>> >>>> OvmfPkg/VirtIoBlk.c: >>>> >>>> VirtioBlkInit() >>>> .... >>>> .... >>>> VirtioRingInit >>>> Virtio->AllocateSharedPages(RingSize, &Ring->Base) >>>> PciIo->AllocatePages(RingSize, &RingAddress) >>>> Virtio->MapSharedPages(...,BusMasterCommonBuffer, Ring->Base, >>>> RingSize, &RingDeviceAddress) >>>> ..... >>>> ..... >>>> >>>> This case is straight forward and we can easily maps. No need for bounce >>>> buffering. >>>> >>>> VirtioBlkReadBlocks(..., BufferSize, Buffer,) >>>> ...... >>>> ...... >>>> SynchronousRequest(..., BufferSize, Buffer) >>>> .... >>>> Virtio->MapSharedPages(..., BusMasterCommonBuffer, Buffer, >>>> BufferSize, &DeviceAddress) >>>> VirtioAppendDesc(DeviceAddress, BufferSize, ...) >>>> VirtioFlush (...) >>>> In the above case, "Buffer" was not allocated by us hence we will not >>>> able to change the >>>> memory encryption attributes. Am I missing something in the flow ? >>>> >>> >>> >>> Common buffer mappings may only be created from buffers that were >>> allocated by AllocateBuffer(). In fact, that is its main purpose >> >> >> Yes, that part is well understood. If the buffer was allocated by us (e.g >> vring, request/status >> structure etc) then those should be mapped as "BusMasterCommonBuffer". Brijesh, thanks for the clarification. Previously I replied (at length) to your paragraph that said "trying to wrap my head around...", and it wasn't clear what you meant by "allocated by us". In my previous response, I assumed that you meant a distinction between "allocated in Map()" vs. "allocated in AllocateBuffer()". I stand by my earlier answer for that (assumed) distinction, but now I see that you meant something else. >> >> But I am trying to figure out, how to map a data buffers before issuing a >> virtio request. e.g when >> VirtioBlkReadBlocks() is called, "Buffer" pointer is not a DMA address hence >> we need to map it. >> I think it should be mapped using "BusMasterWrite" not >> "BusMasterCommonBuffer" before adding into vring. >> > > If the transfer is strictly unidirectional, then that should work. If > the transfer goes both ways, you may need to map/unmap for read and > then map/unmap for write > You (Brijesh and Ard) are both right. This question depends on the outermost interface that the specific virtio driver provides. In this case, VirtioBlkReadBlocks() implements EFI_BLOCK_IO_PROTOCOL.ReadBlocks(). The buffer is owned by an independent agent, and it is guaranteed by the ReadBlocks() interface that the transfer is unidirectional. So a standalone Map() call with BusMasterWrite is appropriate, followed by a standalone Unmap() call *before* VirtioBlkReadBlocks() returns. In this sense, my earlier request to "*always* use AllocateBuffer + Map" was too strict. However, in the *general* case, the recommendation remains the same. For the virtio-net driver for example, the interfaces are not synchronous. E.g., while EFI_BLOCK_IO_PROTOCOL.WriteBlocks() is synchronous, EFI_SIMPLE_NETWORK_PROTOCOL.Transmit() is not. So, although in VirtioNetTransmit() we might be tempted to use BusMasterRead, that's not right, because ExitBootServices() could be called before the SNP client collects the buffer with VirtioNetGetStatus(). The ExitBootServices() callback will have to Unmap() the area without freeing it, and that's only possible if BusMasterCommonBuffer is used in VirtioNetTransmit() to begin with. This means that we'll have to save the client's data -- after updating it according to HeaderSize -- with AllocateBuffer() in VirtioNetTransmit(), Map() *that* as BusMasterCommonBuffer, and undo both steps in VirtioNetGetStatus(). And, in the exit-boot-services callback, it has to be Unmap()ped only, but not freed. Basically it depends upon whether you can complete the entire operation synchronously, before the outermost protocol interface returns. I recommend that we work on the IoMmuDxe and QemuFwCfgLib updates first. (And, my apologies again for not catching these issues immediately; as I said, this is my first time doing non-1:1 DMA.) Thanks, Laszlo _______________________________________________ edk2-devel mailing list edk2-devel@lists.01.org https://lists.01.org/mailman/listinfo/edk2-devel
On 07/27/17 21:00, Brijesh Singh wrote: > On 07/27/2017 12:56 PM, Ard Biesheuvel wrote: >> On 27 July 2017 at 18:16, Laszlo Ersek <lersek@redhat.com> wrote: >>> Normally, Map allocates the bounce buffer internally, and Unmap >>> releases the bounce buffer internally (for BusMasterRead and >>> BusMasterWrite), which is not right here. If you use >>> AllocateBuffer() separately, then Map() -- with >>> BusMasterCommonBuffer -- will not allocate internally, and Unmap() >>> will also not deallocate internally. So, in the ExitBootServices() >>> callback, you will be able to call Unmap() only, and then *not* call >>> FreeBuffer() at all. >>> >>> This is why I suggest introducing all four functions (Allocate, >>> Free, Map, Unmap) to the VIRTIO_DEVICE_PROTOCOL, and why all virtio >>> drivers should use all four functions explicitly, not just Map and >>> Unmap. >>> >>> ... Actually, I think there is a bug in >>> "OvmfPkg/IoMmuDxe/AmdSevIoMmu.c". You have the following >>> distribution of operations at the moment: >>> >>> - IoMmuAllocateBuffer() allocates pages and clears the memory >>> encryption mask >>> >>> - IoMmuFreeBuffer() re-sets the memory encryption mask, and >>> deallocates pages >>> >>> - IoMmuMap() does nothing at all when BusMasterCommonBuffer is >>> requested (and IoMmuAllocateBuffer() was called previously). >>> Otherwise, IoMmuMap() allocates pages, allocates MAP_INFO, and >>> clears the memory encryption mask. >>> >>> - IoMmuUnmap() does nothing when cleaning up a BusMasterCommonBuffer >>> operation (--> NO_MAPPING). Otherwise, IoMmuUnmap() clears the >>> encryption mask, and frees both the pages and MAP_INFO. >>> > > I agree with you, there is a bug in AmdSevIoMmu.c. I will fix it. And > introduce list to track if the buffer was allocated by us. If buffer > was allocated by us then we can safely say that it does not require a > bounce buffer. While implementing the initial code I was not able to > see BusMasterCommonBuffer usage. But with virtio things are getting a > bit more clear in my mind. The purpose of the internal list is a little bit different -- it is a "free list", not a tracker list. Under the proposed scheme, a MAP_INFO structure will have to be allocated for *all* mappings: both for CommonBuffer (where the actual pages come from the caller, who retrieved them earlier with an AllocateBuffer() call), and for Read/Write (where the actual pages will be allocated internally to Map). Allocating the MAP_INFO structure in Map() is necessary in *both* cases because the Unmap() function can *only* consult this MAP_INFO structure to learn the address and the size at which it has to re-set the memory encryption mask. This is because the Unmap() function gets no other input parameter. Then, regardless of the original operation (CommonBuffer vs. Read/Write), the MAP_INFO structure has to be recycled in Unmap(). (For CommonBuffer, the actual pages are owned by the caller, so we don't free them here; for Read/Write, the actual pages are owned by Map/Unmap, so we free them in Unmap() *in addition* to recycling MAP_INFO.) The main point is that MAP_INFO cannot be released with FreePool() because that could change the UEFI memmap during ExitBootServices() processing. So MAP_INFO has to be "released" to an internal *free* list. And since we have an internal free list, the original MAP_INFO allocation in Map() should consult the free list first, and only if it is empty should it fall back to AllocatePool(). Whether the actual pages tracked by MAP_INFO were allocated internally by Map(), or externally by the caller (via AllocateBuffer()) is an independent question. (And it can be seen from the MAP_INFO.Operation field.) Does this make it clearer? > >>> This distribution of operations seems wrong. The key point is that >>> AllocateBuffer() *need not* result in a buffer that is immediately >>> usable, and that client code is required to call Map() >>> *unconditionally*, even if BusMasterCommonBuffer is the desired >>> operation. Therefore, the right distribution of operations is: >>> >>> - IoMmuAllocateBuffer() allocates pages and does not touch the >>> encryption mask.. >>> >>> - IoMmuFreeBuffer() deallocates pages and does not touch the >>> encryption mask. >>> > > Actually one of main reason why we cleared and restored the memory > encryption mask during allocate/free is because we also consume the > IOMMU protocol in QemuFwCfgLib as a method to allocate and free a DMA > buffer. I am certainly open to suggestions. Argh. That's again my fault then, I should have noticed it. :( I apologize, the Map/Unmap/AllocateBuffer/FreeBuffer APIs are a "first" for me as well. As discussed earlier (and confirmed by Ard), calling *just* AllocateBuffer() is never enough, Map()/Unmap() are always necessary. So here's what I suggest (to keep the changes localized): - Extend the prototype of InternalQemuFwCfgSevDmaAllocateBuffer() function to output a "VOID *Mapping" parameter as well, in addition to the address. - Modify the prototype of the InternalQemuFwCfgSevDmaFreeBuffer() function to take a "VOID *Mapping" parameter in addition to the buffer address. - In the DXE implementation of InternalQemuFwCfgSevDmaAllocateBuffer(), keep the current IOMMU AllocateBuffer() call, but also call IOMMU Map(), with CommonBuffer. Furthermore, propagate the Mapping output of Map() outwards. (Note that Map() may have to be called in a loop, dependent on "NumberOfBytes".) - In the DXE implementation of InternalQemuFwCfgSevDmaFreeBuffer(), call the IOMMU Unmap() function first (using the new Mapping parameter). > > [1] > https://github.com/tianocore/edk2/blob/master/OvmfPkg/Library/QemuFwCfgLib/QemuFwCfgDxe.c#L159 > > [2] > https://github.com/tianocore/edk2/blob/master/OvmfPkg/Library/QemuFwCfgLib/QemuFwCfgDxe.c#L197 > > >>> - IoMmuMap() does not allocate pages when BusMasterCommonBuffer is >>> requested, and it allocates pages (bounce buffer) otherwise. >>> > > I am trying to wrap my head around how we can support > BusMasterCommonBuffer when buffer was not allocated by us. Changing > the memory encryption mask in a page table will not update the > contents. Thank you for the clarification. I've now tried to review the current code for a better understanding. Are the below statements correct? - For the guest, guest memory is always readable transparently. - For the host, guest memory is always readable *as it was written last*. - If the guest memory was last written with the C bit clear, then the host can read it as plaintext (regardless of the current state of the C bit). - If the guest memory was last written with the C bit set, then the host can only read garbage (regardless of the current state of the C bit). *If* this is the case, then: (a) We already have a sort of guest->host information leak. Namely, in IoMmuFreeBuffer() [OvmfPkg/IoMmuDxe/AmdSevIoMmu.c], the C bit is restored, and the pages are released with gBS->FreePages(). However, the pages being released are not rewritten in place (they are not actually re-encrypted, only marked for encryption whenever they are next written to). This means that pages released like this remain readable to the hypervisor for an unpredictable time. This is not necessarily a critical problem (after all the contents of those pages were visible to the hypervisor at some point anyway!); but in the longer term, such pages could accumulate, and if the hypervisor is compromised later, it could potentially read an unbounded amount of "past guest data" as plain-text. (b) Plus, approaching the question from the Map() direction, we need to consider two scenarios: - Client code calls AllocateBuffer(), then Map(), and it writes to the buffer only then. This should be safe. - client code calls AllocateBuffer(), writes to it, and then calls Map(). This will result in memory contents that look like garbage to the hypervisor. Bad. I can imagine the following to handle these cases: in the Map() and Unmap() functions, we have to decrypt and encrypt the memory contents in-place, after changing the C bit (without allocating additional memory). Introduce a static UINT8 array with EFI_PAGE_SIZE bytes (this will always remain in encrypted memory). Update the C bit with a single function call for the entire range (like now) -- this will not affect the guest-readability of the pages --, then bounce each page within the range to the static buffer and back to its original place. In effect this will in-place encrypt or decrypt the memory, and will be faster than a byte-wise rewrite. * BusMasterRead (host will want to read guest memory): - Client calls Map() without calling AllocateBuffer() first. Map() allocates an internal bounce buffer, clears the C bit, and does the copying. - Client fires off the PCI transaction. - Client calls Unmap(), without calling FreeBuffer() later. Unmap() restores the C-bit, *zeros* the pages, and releases the pages. * BusMasterWrite (host will want to write to guest memory): - Client calls Map() without calling AllocateBuffer() first. Map() allocates an internal bounce buffer and clears the C bit. - Client fires off the PCI transaction. - Client calls Unmap(), without calling FreeBuffer() later. Unmap() copies the bounce buffer to crpyted (client-owned) memory, restores the C-bit, *zeros* the pages, and releases the pages. * BusMasterCommonBuffer: - Client calls AllocateBuffer(), and places some data in the returned memory. - Client calls Map(). Map() clears the C bit in one fell swoop, and then decrypts the buffer in-place (by bouncing it page-wise to the static array and back). - Client communicates with the device. - Client calls Unmap(). Unmap() restores the C bit in one fell swoop, and encrypts the buffer in-place (by bouncing it page-wise to the static array and back). - Client reads some residual data from the buffer. - Client calls FreeBuffer(). FreeBuffer() relases the pages. This is suitable for the discussed ExitBootServices() callback update too, where the callback will reset the virtio device (like now) and then call Unmap() for all shared areas, without calling FreeBuffer() on them. The above logic will re-encrypt the contents without affecting the UEFI memmap. > Also since the memory encryption mask works on PAGE_SIZE hence > changing the encryption mask on not our allocated buffer could mess > things up (e.g if NumberOfBytes is not PAGE_SIZE aligned). This is not a problem for BusMasterRead and BusMasterWrite because for those operations, Map() allocates the internal bounce buffer. Also not a problem for BusMasterCommonBuffer, because then the client first calls AllocateBuffer (whole number of pages), and so Map() can round up the number of bytes and de-crypt full pages in-place (see above). > >>> *Regardless* of BusMaster operation, the following actions are >>> carried out unconditionally: >>> >>> . the memory encryption mask is cleared in this function (and in >>> this function only), >>> >>> . An attempt is made to grab a MAP_INFO structure from an >>> internal free list (to be introduced!). The head of the list is >>> a new static variable. If the free list is empty, then a >>> MAP_INFO structure is allocated with AllocatePool(). The >>> NO_MAPPING macro becomes unused and can be deleted from the >>> source code. >>> >>> - IoMmuUnmap() clears the encryption mask unconditionally. (For >>> this, it has to consult the MAP_INFO structure that is being >>> passed in from the caller.) In addition: >>> >>> . If MapInfo->Operation is BusMasterCommonBuffer, then we know >>> the allocation was done separately in AllocateBuffer, so we do >>> not release the pages. Otherwise, we do release the pages. >>> >>> . MapInfo is linked back on the internal free list (see above). >>> It is *never* released with FreePool(). >>> >>> This approach guarantees that IoMmuUnmap() can de-program the >>> IOMMU (= re-set the memory encryption mask) without changing the >>> UEFI memory map. (I trust that MemEncryptSevSetPageEncMask() will >>> not split page tables internally when it *re*sets the encryption >>> mask -- is that correct?) > > Yes, MemEncryptSevSetPageEncMask() will not split pages when it > restores mask. Great, thanks. Laszlo _______________________________________________ edk2-devel mailing list edk2-devel@lists.01.org https://lists.01.org/mailman/listinfo/edk2-devel
On 07/28/2017 08:38 AM, Laszlo Ersek wrote: > On 07/27/17 21:00, Brijesh Singh wrote: >> On 07/27/2017 12:56 PM, Ard Biesheuvel wrote: >>> On 27 July 2017 at 18:16, Laszlo Ersek <lersek@redhat.com> wrote: > >>>> Normally, Map allocates the bounce buffer internally, and Unmap >>>> releases the bounce buffer internally (for BusMasterRead and >>>> BusMasterWrite), which is not right here. If you use >>>> AllocateBuffer() separately, then Map() -- with >>>> BusMasterCommonBuffer -- will not allocate internally, and Unmap() >>>> will also not deallocate internally. So, in the ExitBootServices() >>>> callback, you will be able to call Unmap() only, and then *not* call >>>> FreeBuffer() at all. >>>> >>>> This is why I suggest introducing all four functions (Allocate, >>>> Free, Map, Unmap) to the VIRTIO_DEVICE_PROTOCOL, and why all virtio >>>> drivers should use all four functions explicitly, not just Map and >>>> Unmap. >>>> >>>> ... Actually, I think there is a bug in >>>> "OvmfPkg/IoMmuDxe/AmdSevIoMmu.c". You have the following >>>> distribution of operations at the moment: >>>> >>>> - IoMmuAllocateBuffer() allocates pages and clears the memory >>>> encryption mask >>>> >>>> - IoMmuFreeBuffer() re-sets the memory encryption mask, and >>>> deallocates pages >>>> >>>> - IoMmuMap() does nothing at all when BusMasterCommonBuffer is >>>> requested (and IoMmuAllocateBuffer() was called previously). >>>> Otherwise, IoMmuMap() allocates pages, allocates MAP_INFO, and >>>> clears the memory encryption mask. >>>> >>>> - IoMmuUnmap() does nothing when cleaning up a BusMasterCommonBuffer >>>> operation (--> NO_MAPPING). Otherwise, IoMmuUnmap() clears the >>>> encryption mask, and frees both the pages and MAP_INFO. >>>> >> >> I agree with you, there is a bug in AmdSevIoMmu.c. I will fix it. And >> introduce list to track if the buffer was allocated by us. If buffer >> was allocated by us then we can safely say that it does not require a >> bounce buffer. While implementing the initial code I was not able to >> see BusMasterCommonBuffer usage. But with virtio things are getting a >> bit more clear in my mind. > > The purpose of the internal list is a little bit different -- it is a > "free list", not a tracker list. > > Under the proposed scheme, a MAP_INFO structure will have to be > allocated for *all* mappings: both for CommonBuffer (where the actual > pages come from the caller, who retrieved them earlier with an > AllocateBuffer() call), and for Read/Write (where the actual pages will > be allocated internally to Map). Allocating the MAP_INFO structure in > Map() is necessary in *both* cases because the Unmap() function can > *only* consult this MAP_INFO structure to learn the address and the size > at which it has to re-set the memory encryption mask. This is because > the Unmap() function gets no other input parameter. > > Then, regardless of the original operation (CommonBuffer vs. > Read/Write), the MAP_INFO structure has to be recycled in Unmap(). (For > CommonBuffer, the actual pages are owned by the caller, so we don't free > them here; for Read/Write, the actual pages are owned by Map/Unmap, so > we free them in Unmap() *in addition* to recycling MAP_INFO.) The main > point is that MAP_INFO cannot be released with FreePool() because that > could change the UEFI memmap during ExitBootServices() processing. So > MAP_INFO has to be "released" to an internal *free* list. > > And since we have an internal free list, the original MAP_INFO > allocation in Map() should consult the free list first, and only if it > is empty should it fall back to AllocatePool(). > > Whether the actual pages tracked by MAP_INFO were allocated internally > by Map(), or externally by the caller (via AllocateBuffer()) is an > independent question. (And it can be seen from the MAP_INFO.Operation > field.) > > Does this make it clearer? > It makes sense. thanks One question, do we need to return EFI_NOT_SUPPORTED error when we get request to map a buffer with CommonBuffer but the buffer was not allocated using "EFI_PCI_IO_PROTOCOL->AllocateBuffer (...)"? At least as per spec, it seems if caller wants to map a buffer with CommonBuffer then it must have called "EFI_PCI_IO_PROTOCOL->AllocateBuffer (...)" >> >>>> This distribution of operations seems wrong. The key point is that >>>> AllocateBuffer() *need not* result in a buffer that is immediately >>>> usable, and that client code is required to call Map() >>>> *unconditionally*, even if BusMasterCommonBuffer is the desired >>>> operation. Therefore, the right distribution of operations is: >>>> >>>> - IoMmuAllocateBuffer() allocates pages and does not touch the >>>> encryption mask.. >>>> >>>> - IoMmuFreeBuffer() deallocates pages and does not touch the >>>> encryption mask. >>>> >> >> Actually one of main reason why we cleared and restored the memory >> encryption mask during allocate/free is because we also consume the >> IOMMU protocol in QemuFwCfgLib as a method to allocate and free a DMA >> buffer. I am certainly open to suggestions. > > Argh. That's again my fault then, I should have noticed it. :( I > apologize, the Map/Unmap/AllocateBuffer/FreeBuffer APIs are a "first" > for me as well. > > As discussed earlier (and confirmed by Ard), calling *just* > AllocateBuffer() is never enough, Map()/Unmap() are always necessary. > > So here's what I suggest (to keep the changes localized): > > - Extend the prototype of InternalQemuFwCfgSevDmaAllocateBuffer() > function to output a "VOID *Mapping" parameter as well, in addition to > the address. > > - Modify the prototype of the InternalQemuFwCfgSevDmaFreeBuffer() > function to take a "VOID *Mapping" parameter in addition to the buffer > address. > > - In the DXE implementation of InternalQemuFwCfgSevDmaAllocateBuffer(), > keep the current IOMMU AllocateBuffer() call, but also call IOMMU > Map(), with CommonBuffer. Furthermore, propagate the Mapping output of > Map() outwards. (Note that Map() may have to be called in a loop, > dependent on "NumberOfBytes".) > > - In the DXE implementation of InternalQemuFwCfgSevDmaFreeBuffer(), call > the IOMMU Unmap() function first (using the new Mapping parameter). > I will update the code and send patch for review. thanks >> >> [1] >> https://github.com/tianocore/edk2/blob/master/OvmfPkg/Library/QemuFwCfgLib/QemuFwCfgDxe.c#L159 >> >> [2] >> https://github.com/tianocore/edk2/blob/master/OvmfPkg/Library/QemuFwCfgLib/QemuFwCfgDxe.c#L197 >> >> >>>> - IoMmuMap() does not allocate pages when BusMasterCommonBuffer is >>>> requested, and it allocates pages (bounce buffer) otherwise. >>>> >> >> I am trying to wrap my head around how we can support >> BusMasterCommonBuffer when buffer was not allocated by us. Changing >> the memory encryption mask in a page table will not update the >> contents. > > Thank you for the clarification. I've now tried to review the current > code for a better understanding. Are the below statements correct? > > - For the guest, guest memory is always readable transparently. > - For the host, guest memory is always readable *as it was written > last*. > - If the guest memory was last written with the C bit clear, then the > host can read it as plaintext (regardless of the current state of the > C bit). > - If the guest memory was last written with the C bit set, then the host > can only read garbage (regardless of the current state of the C bit). > Your understanding is correct. > *If* this is the case, then: > > (a) We already have a sort of guest->host information leak. Namely, in > IoMmuFreeBuffer() [OvmfPkg/IoMmuDxe/AmdSevIoMmu.c], the C bit is > restored, and the pages are released with gBS->FreePages(). However, the > pages being released are not rewritten in place (they are not actually > re-encrypted, only marked for encryption whenever they are next written > to). This means that pages released like this remain readable to the > hypervisor for an unpredictable time. > > This is not necessarily a critical problem (after all the contents of > those pages were visible to the hypervisor at some point anyway!); but > in the longer term, such pages could accumulate, and if the hypervisor > is compromised later, it could potentially read an unbounded amount of > "past guest data" as plain-text. > Very good catch, I can *zero* the pages. IIRC, I did not consider doing so because it may introduce unnecessary perform hit (mainly when dealing with larger pages). I think when we load kernel or initrd using QemuFwCfg DMA interface we allocate large buffers. I will still go ahead and experiment *zeroing* page and measure the performance impact before we integrate the solution. > (b) Plus, approaching the question from the Map() direction, we need to > consider two scenarios: > > - Client code calls AllocateBuffer(), then Map(), and it writes to the > buffer only then. This should be safe. > - client code calls AllocateBuffer(), writes to it, and then calls > Map(). This will result in memory contents that look like garbage to > the hypervisor. Bad. > > I can imagine the following to handle these cases: in the Map() and > Unmap() functions, we have to decrypt and encrypt the memory contents > in-place, after changing the C bit (without allocating additional > memory). Introduce a static UINT8 array with EFI_PAGE_SIZE bytes (this > will always remain in encrypted memory). Update the C bit with a single > function call for the entire range (like now) -- this will not affect > the guest-readability of the pages --, then bounce each page within the > range to the static buffer and back to its original place. In effect > this will in-place encrypt or decrypt the memory, and will be faster > than a byte-wise rewrite. > > * BusMasterRead (host will want to read guest memory): > - Client calls Map() without calling AllocateBuffer() first. Map() > allocates an internal bounce buffer, clears the C bit, and does the > copying. > - Client fires off the PCI transaction. > - Client calls Unmap(), without calling FreeBuffer() later. Unmap() > restores the C-bit, *zeros* the pages, and releases the pages. > Yes this will work just fine (see my previous comments on *zeroing* pages) > * BusMasterWrite (host will want to write to guest memory): > - Client calls Map() without calling AllocateBuffer() first. Map() > allocates an internal bounce buffer and clears the C bit. > - Client fires off the PCI transaction. > - Client calls Unmap(), without calling FreeBuffer() later. Unmap() > copies the bounce buffer to crpyted (client-owned) memory, restores > the C-bit, *zeros* the pages, and releases the pages. > Yes, this will work just fine (see my previous comments on *zeroing* pages) > * BusMasterCommonBuffer: > - Client calls AllocateBuffer(), and places some data in the returned > memory. > - Client calls Map(). Map() clears the C bit in one fell swoop, and > then decrypts the buffer in-place (by bouncing it page-wise to the > static array and back). > - Client communicates with the device. > - Client calls Unmap(). Unmap() restores the C bit in one fell swoop, > and encrypts the buffer in-place (by bouncing it page-wise to the > static array and back). > - Client reads some residual data from the buffer. > - Client calls FreeBuffer(). FreeBuffer() relases the pages. > Yes this works fine as long as the client uses EFI_PCI_IO_PROTOCOL.AllocateBuffer() to allocate the buffer. > This is suitable for the discussed ExitBootServices() callback update > too, where the callback will reset the virtio device (like now) and then > call Unmap() for all shared areas, without calling FreeBuffer() on them. > The above logic will re-encrypt the contents without affecting the UEFI > memmap. > >> Also since the memory encryption mask works on PAGE_SIZE hence >> changing the encryption mask on not our allocated buffer could mess >> things up (e.g if NumberOfBytes is not PAGE_SIZE aligned). > > This is not a problem for BusMasterRead and BusMasterWrite because for > those operations, Map() allocates the internal bounce buffer. Also not a > problem for BusMasterCommonBuffer, because then the client first calls > AllocateBuffer (whole number of pages), and so Map() can round up the > number of bytes and de-crypt full pages in-place (see above). > >> >>>> *Regardless* of BusMaster operation, the following actions are >>>> carried out unconditionally: >>>> >>>> . the memory encryption mask is cleared in this function (and in >>>> this function only), >>>> >>>> . An attempt is made to grab a MAP_INFO structure from an >>>> internal free list (to be introduced!). The head of the list is >>>> a new static variable. If the free list is empty, then a >>>> MAP_INFO structure is allocated with AllocatePool(). The >>>> NO_MAPPING macro becomes unused and can be deleted from the >>>> source code. >>>> >>>> - IoMmuUnmap() clears the encryption mask unconditionally. (For >>>> this, it has to consult the MAP_INFO structure that is being >>>> passed in from the caller.) In addition: >>>> >>>> . If MapInfo->Operation is BusMasterCommonBuffer, then we know >>>> the allocation was done separately in AllocateBuffer, so we do >>>> not release the pages. Otherwise, we do release the pages. >>>> >>>> . MapInfo is linked back on the internal free list (see above). >>>> It is *never* released with FreePool(). >>>> >>>> This approach guarantees that IoMmuUnmap() can de-program the >>>> IOMMU (= re-set the memory encryption mask) without changing the >>>> UEFI memory map. (I trust that MemEncryptSevSetPageEncMask() will >>>> not split page tables internally when it *re*sets the encryption >>>> mask -- is that correct?) >> >> Yes, MemEncryptSevSetPageEncMask() will not split pages when it >> restores mask. > > Great, thanks. > Laszlo > _______________________________________________ edk2-devel mailing list edk2-devel@lists.01.org https://lists.01.org/mailman/listinfo/edk2-devel
On 07/28/17 18:00, Brijesh Singh wrote: > > > On 07/28/2017 08:38 AM, Laszlo Ersek wrote: >> On 07/27/17 21:00, Brijesh Singh wrote: >>> I agree with you, there is a bug in AmdSevIoMmu.c. I will fix it. And >>> introduce list to track if the buffer was allocated by us. If buffer >>> was allocated by us then we can safely say that it does not require a >>> bounce buffer. While implementing the initial code I was not able to >>> see BusMasterCommonBuffer usage. But with virtio things are getting a >>> bit more clear in my mind. >> >> The purpose of the internal list is a little bit different -- it is a >> "free list", not a tracker list. >> >> Under the proposed scheme, a MAP_INFO structure will have to be >> allocated for *all* mappings: both for CommonBuffer (where the actual >> pages come from the caller, who retrieved them earlier with an >> AllocateBuffer() call), and for Read/Write (where the actual pages will >> be allocated internally to Map). Allocating the MAP_INFO structure in >> Map() is necessary in *both* cases because the Unmap() function can >> *only* consult this MAP_INFO structure to learn the address and the size >> at which it has to re-set the memory encryption mask. This is because >> the Unmap() function gets no other input parameter. >> >> Then, regardless of the original operation (CommonBuffer vs. >> Read/Write), the MAP_INFO structure has to be recycled in Unmap(). (For >> CommonBuffer, the actual pages are owned by the caller, so we don't free >> them here; for Read/Write, the actual pages are owned by Map/Unmap, so >> we free them in Unmap() *in addition* to recycling MAP_INFO.) The main >> point is that MAP_INFO cannot be released with FreePool() because that >> could change the UEFI memmap during ExitBootServices() processing. So >> MAP_INFO has to be "released" to an internal *free* list. >> >> And since we have an internal free list, the original MAP_INFO >> allocation in Map() should consult the free list first, and only if it >> is empty should it fall back to AllocatePool(). >> >> Whether the actual pages tracked by MAP_INFO were allocated internally >> by Map(), or externally by the caller (via AllocateBuffer()) is an >> independent question. (And it can be seen from the MAP_INFO.Operation >> field.) >> >> Does this make it clearer? >> > > It makes sense. thanks > > One question, do we need to return EFI_NOT_SUPPORTED error when we get > request to map a buffer with CommonBuffer but the buffer was not > allocated using "EFI_PCI_IO_PROTOCOL->AllocateBuffer (...)"? > > At least as per spec, it seems if caller wants to map a buffer with > CommonBuffer then it must have called > "EFI_PCI_IO_PROTOCOL->AllocateBuffer (...)" Yes, that is it, exactly: if the caller violates a requirement in the UEFI spec, covering the use of the APIs, then the firmware *may* make an attempt to detect that (and to reject it), but the firmware is not *required* to do so. A much simpler example: on a double-free programming error, the second gBS->FreePool() call is not *required* to detect the problem. ("The Buffer that is freed must have been allocated by AllocatePool().") So, I don't think we need to implement measures for checking that a CommonBuffer Map() actually refers to a previously returned, active, AllocateBuffer() address. (Snipping the rest, I've read it all, thanks for the answers. Nothing to add this time :) ) Cheers! Laszlo _______________________________________________ edk2-devel mailing list edk2-devel@lists.01.org https://lists.01.org/mailman/listinfo/edk2-devel
On 07/28/17 18:00, Brijesh Singh wrote: > > > On 07/28/2017 08:38 AM, Laszlo Ersek wrote: >> On 07/27/17 21:00, Brijesh Singh wrote: >>> Actually one of main reason why we cleared and restored the memory >>> encryption mask during allocate/free is because we also consume the >>> IOMMU protocol in QemuFwCfgLib as a method to allocate and free a >>> DMA buffer. I am certainly open to suggestions. >> >> Argh. That's again my fault then, I should have noticed it. :( I >> apologize, the Map/Unmap/AllocateBuffer/FreeBuffer APIs are a "first" >> for me as well. >> >> As discussed earlier (and confirmed by Ard), calling *just* >> AllocateBuffer() is never enough, Map()/Unmap() are always necessary. >> >> So here's what I suggest (to keep the changes localized): >> >> - Extend the prototype of InternalQemuFwCfgSevDmaAllocateBuffer() >> function to output a "VOID *Mapping" parameter as well, in addition >> to the address. >> >> - Modify the prototype of the InternalQemuFwCfgSevDmaFreeBuffer() >> function to take a "VOID *Mapping" parameter in addition to the >> buffer address. >> >> - In the DXE implementation of >> InternalQemuFwCfgSevDmaAllocateBuffer(), keep the current IOMMU >> AllocateBuffer() call, but also call IOMMU Map(), with >> CommonBuffer. Furthermore, propagate the Mapping output of Map() >> outwards. (Note that Map() may have to be called in a loop, >> dependent on "NumberOfBytes".) >> >> - In the DXE implementation of InternalQemuFwCfgSevDmaFreeBuffer(), >> call the IOMMU Unmap() function first (using the new Mapping >> parameter). >> > > I will update the code and send patch for review. thanks Here's an alternative, given that you mentioned being performance-conscious. I'm not "suggesting" this as preferred, just offering it for your consideration. Pick whichever you like more. In this approach, I'm going to go back on my original request, namely to keep the InternalQemuFwCfgDmaBytes() implementation centralized, in the "QemuFwCfgLib.c" file. I now realize that the differences are large enough that this may not have been a good idea. So here goes: * Internal APIs ("QemuFwCfgLibInternal.h"): - Remove the InternalQemuFwCfgSev*() function prototypes. - Introduce the InternalQemuFwCfgDmaBytes() function prototype. * SEC instance ("QemuFwCfgSec.c"): - Remove the InternalQemuFwCfgSev*() function definitions. - Add the InternalQemuFwCfgDmaBytes() function definition with ASSERT(FALSE) + CpuDeadLoop(). * PEI instance ("QemuFwCfgPei.c): - Remove the InternalQemuFwCfgSev*() function definitions. - Copy the InternalQemuFwCfgDmaBytes() function definition from the currently shared code ("QemuFwCfgLib.c"), and remove all branches that are related to the SEV enabled case. IOW, specialize the implementation to the plaintext case. - In QemuFwCfgInitialize(), replace the InternalQemuFwCfgSevIsEnabled() function call with a direct call to MemEncryptSevIsEnabled(). * DXE instance ("QemuFwCfgDxe.c"): - Remove the InternalQemuFwCfgSev*() function definitions. - Copy the InternalQemuFwCfgDmaBytes() function definition from the currently shared code ("QemuFwCfgLib.c"), as a starting point. - Replace the InternalQemuFwCfgSevIsEnabled() call with a direct call to MemEncryptSevIsEnabled(). - When SEV is enabled, split the control area ("FW_CFG_DMA_ACCESS") to a separate buffer. This control area should be allocated with IOMMU AllocateBuffer(), and Map()ped separately, as BusMasterCommonBuffer64. - For the data transfer buffer, use *only* Map() and Unmap(), without AllocateBuffer() and FreeBuffer(). Use BusMasterRead64 and BusMasterWrite64, dependent on FW_CFG_DMA_CTL_WRITE / FW_CFG_DMA_CTL_READ. The point is that the potentially large data area will be bounced only once, because Map()/Unmap() will own the bounce buffer, and the in-place (de)crypting can be avoided. (The fw_cfg DMA transfer is completed in one synchronous operation, as witnessed by the library client.) The explicit CopyMem() calls can be removed. - Remove the InternalQemuFwCfgSevDmaAllocateBuffer() and InternalQemuFwCfgSevDmaFreeBuffer() calls. * shared code ("QemuFwCfgLib.c"): - remove the InternalQemuFwCfgDmaBytes() function definition. Thanks, Laszlo _______________________________________________ edk2-devel mailing list edk2-devel@lists.01.org https://lists.01.org/mailman/listinfo/edk2-devel
On 07/28/17 18:00, Brijesh Singh wrote: > On 07/28/2017 08:38 AM, Laszlo Ersek wrote: snip >> (b) Plus, approaching the question from the Map() direction, we need >> to consider two scenarios: >> >> - Client code calls AllocateBuffer(), then Map(), and it writes to >> the buffer only then. This should be safe. >> - client code calls AllocateBuffer(), writes to it, and then calls >> Map(). This will result in memory contents that look like garbage >> to the hypervisor. Bad. >> >> I can imagine the following to handle these cases: in the Map() and >> Unmap() functions, we have to decrypt and encrypt the memory contents >> in-place, after changing the C bit (without allocating additional >> memory). Introduce a static UINT8 array with EFI_PAGE_SIZE bytes >> (this will always remain in encrypted memory). Update the C bit with >> a single function call for the entire range (like now) -- this will >> not affect the guest-readability of the pages --, then bounce each >> page within the range to the static buffer and back to its original >> place. In effect this will in-place encrypt or decrypt the memory, >> and will be faster than a byte-wise rewrite. snip >> * BusMasterCommonBuffer: >> - Client calls AllocateBuffer(), and places some data in the >> returned memory. >> - Client calls Map(). Map() clears the C bit in one fell swoop, >> and then decrypts the buffer in-place (by bouncing it page-wise >> to the static array and back). >> - Client communicates with the device. >> - Client calls Unmap(). Unmap() restores the C bit in one fell >> swoop, and encrypts the buffer in-place (by bouncing it >> page-wise to the static array and back). >> - Client reads some residual data from the buffer. >> - Client calls FreeBuffer(). FreeBuffer() relases the pages. >> > > Yes this works fine as long as the client uses > EFI_PCI_IO_PROTOCOL.AllocateBuffer() to allocate the buffer. Again, a performance-oriented thought: Above I suggested using a statically allocated page-sized buffer, for the in-place encryption/decryption. Ultimately this means *two* CopyMem()s for the entire buffer (just executed page-wise), in *each* of Map() and Unmap(). Maybe we can do better: what if you perform the CopyMem() from the buffer right back to the same buffer? CopyMem() is *required* to work with overlapping source and target areas (similarly to memmove() in standard C). This would result in *one* CopyMem (for in-place de-/encryption) in each of Map() and Unmap(), and thereby it would have identical performance impact to the BusMasterRead and BusMasterWrite Map() operations (where copying / crypting takes place between distinct memory areas). The OVMF DSC files resolve "BaseMemoryLib" -- which provides CopyMem() -- to "MdePkg/Library/BaseMemoryLibRepStr/BaseMemoryLibRepStr.inf"; regardless of module type. The actual implementation appears to reside in "MdePkg/Library/BaseMemoryLibRepStr/X64/CopyMem.nasm": > global ASM_PFX(InternalMemCopyMem) > ASM_PFX(InternalMemCopyMem): > push rsi > push rdi > mov rsi, rdx ; rsi <- Source > mov rdi, rcx ; rdi <- Destination > lea r9, [rsi + r8 - 1] ; r9 <- End of Source > cmp rsi, rdi > mov rax, rdi ; rax <- Destination as return value > jae .0 > cmp r9, rdi > jae @CopyBackward ; Copy backward if overlapped > .0: > mov rcx, r8 > and r8, 7 > shr rcx, 3 > rep movsq ; Copy as many Qwords as possible > jmp @CopyBytes > @CopyBackward: > mov rsi, r9 ; rsi <- End of Source > lea rdi, [rdi + r8 - 1] ; esi <- End of Destination > std ; set direction flag > @CopyBytes: > mov rcx, r8 > rep movsb ; Copy bytes backward > cld > pop rdi > pop rsi > ret > However, I'm afraid even if this works on SEV (which I certainly expect!), this code won't be reached, due to the following CopyMem() wrapper implementation in "MdePkg/Library/BaseMemoryLibRepStr/CopyMemWrapper.c": > VOID * > EFIAPI > CopyMem ( > OUT VOID *DestinationBuffer, > IN CONST VOID *SourceBuffer, > IN UINTN Length > ) > { > if (Length == 0) { > return DestinationBuffer; > } > ASSERT ((Length - 1) <= (MAX_ADDRESS - (UINTN)DestinationBuffer)); > ASSERT ((Length - 1) <= (MAX_ADDRESS - (UINTN)SourceBuffer)); > > if (DestinationBuffer == SourceBuffer) { > return DestinationBuffer; > } > return InternalMemCopyMem (DestinationBuffer, SourceBuffer, Length); > } As you see, (DestinationBuffer == SourceBuffer) is handled as a no-op (quite justifiedly, except in the case of SEV). Personally I think it would be OK to copy the wrapper function and the assembly code to OvmfPkg/IoMmuDxe/X64, under the names SevCopyMem() and InternalSevCopyMem(), and call SevCopyMem() in the CommonBuffer cases of Map() and Unmap(), for the in-place flipping. For the 32-bit case (OvmfPkgIa32.dsc), my understanding is that guests cannot control the C bit at all (there is no C bit in the PTEs), and memory is always encrypted. Is that correct? If so, then we only need to ensure that SevCopyMem() compile, as it will never be called -- in the entry point function of OvmfPkg/IoMmuDxe, MemEncryptSevIsEnabled() will return FALSE, and so the IOMMU protocol will not be installed. Therefore the 32-bit version (under OvmfPkg/IoMmuDxe/Ia32) of SevCopyMem() can be stubbed out as an ASSERT(FALSE)+CpuDeadLoop(). If you can think of a better location for SevCopyMem(), that's fine as well. For example, you could add it to "OvmfPkg/Library/BaseMemEncryptSevLib" as well. ... I don't think this functionality should be added under MdePkg, because it is *very* special to the IOMMU implementation, and practically no other module should use a "busy" in-place CopyMem(). Thanks Laszlo _______________________________________________ edk2-devel mailing list edk2-devel@lists.01.org https://lists.01.org/mailman/listinfo/edk2-devel
On 7/28/17 2:59 PM, Laszlo Ersek wrote: > On 07/28/17 18:00, Brijesh Singh wrote: >> On 07/28/2017 08:38 AM, Laszlo Ersek wrote: > snip > >>> (b) Plus, approaching the question from the Map() direction, we need >>> to consider two scenarios: >>> >>> - Client code calls AllocateBuffer(), then Map(), and it writes to >>> the buffer only then. This should be safe. >>> - client code calls AllocateBuffer(), writes to it, and then calls >>> Map(). This will result in memory contents that look like garbage >>> to the hypervisor. Bad. >>> >>> I can imagine the following to handle these cases: in the Map() and >>> Unmap() functions, we have to decrypt and encrypt the memory contents >>> in-place, after changing the C bit (without allocating additional >>> memory). Introduce a static UINT8 array with EFI_PAGE_SIZE bytes >>> (this will always remain in encrypted memory). Update the C bit with >>> a single function call for the entire range (like now) -- this will >>> not affect the guest-readability of the pages --, then bounce each >>> page within the range to the static buffer and back to its original >>> place. In effect this will in-place encrypt or decrypt the memory, >>> and will be faster than a byte-wise rewrite. > snip > >>> * BusMasterCommonBuffer: >>> - Client calls AllocateBuffer(), and places some data in the >>> returned memory. >>> - Client calls Map(). Map() clears the C bit in one fell swoop, >>> and then decrypts the buffer in-place (by bouncing it page-wise >>> to the static array and back). >>> - Client communicates with the device. >>> - Client calls Unmap(). Unmap() restores the C bit in one fell >>> swoop, and encrypts the buffer in-place (by bouncing it >>> page-wise to the static array and back). >>> - Client reads some residual data from the buffer. >>> - Client calls FreeBuffer(). FreeBuffer() relases the pages. >>> >> Yes this works fine as long as the client uses >> EFI_PCI_IO_PROTOCOL.AllocateBuffer() to allocate the buffer. > Again, a performance-oriented thought: > > Above I suggested using a statically allocated page-sized buffer, for > the in-place encryption/decryption. Ultimately this means *two* > CopyMem()s for the entire buffer (just executed page-wise), in *each* of > Map() and Unmap(). > > Maybe we can do better: what if you perform the CopyMem() from the > buffer right back to the same buffer? CopyMem() is *required* to work > with overlapping source and target areas (similarly to memmove() in > standard C). > > This would result in *one* CopyMem (for in-place de-/encryption) in each > of Map() and Unmap(), and thereby it would have identical performance > impact to the BusMasterRead and BusMasterWrite Map() operations (where > copying / crypting takes place between distinct memory areas). > > The OVMF DSC files resolve "BaseMemoryLib" -- which provides CopyMem() > -- to "MdePkg/Library/BaseMemoryLibRepStr/BaseMemoryLibRepStr.inf"; > regardless of module type. The actual implementation appears to reside > in "MdePkg/Library/BaseMemoryLibRepStr/X64/CopyMem.nasm": > AMD APM document a procedure which must be used to perform in-place encryption/decryption. We must follow those steps to ensure that data is flush into memory using the correct C-bit. Not doing so may result in unpredictable results. http://support.amd.com/TechDocs/24593.pdf (Section 7.10.8) >> global ASM_PFX(InternalMemCopyMem) >> ASM_PFX(InternalMemCopyMem): >> push rsi >> push rdi >> mov rsi, rdx ; rsi <- Source >> mov rdi, rcx ; rdi <- Destination >> lea r9, [rsi + r8 - 1] ; r9 <- End of Source >> cmp rsi, rdi >> mov rax, rdi ; rax <- Destination as return value >> jae .0 >> cmp r9, rdi >> jae @CopyBackward ; Copy backward if overlapped >> .0: >> mov rcx, r8 >> and r8, 7 >> shr rcx, 3 >> rep movsq ; Copy as many Qwords as possible >> jmp @CopyBytes >> @CopyBackward: >> mov rsi, r9 ; rsi <- End of Source >> lea rdi, [rdi + r8 - 1] ; esi <- End of Destination >> std ; set direction flag >> @CopyBytes: >> mov rcx, r8 >> rep movsb ; Copy bytes backward >> cld >> pop rdi >> pop rsi >> ret >> > However, I'm afraid even if this works on SEV (which I certainly > expect!), this code won't be reached, due to the following CopyMem() > wrapper implementation in > "MdePkg/Library/BaseMemoryLibRepStr/CopyMemWrapper.c": > >> VOID * >> EFIAPI >> CopyMem ( >> OUT VOID *DestinationBuffer, >> IN CONST VOID *SourceBuffer, >> IN UINTN Length >> ) >> { >> if (Length == 0) { >> return DestinationBuffer; >> } >> ASSERT ((Length - 1) <= (MAX_ADDRESS - (UINTN)DestinationBuffer)); >> ASSERT ((Length - 1) <= (MAX_ADDRESS - (UINTN)SourceBuffer)); >> >> if (DestinationBuffer == SourceBuffer) { >> return DestinationBuffer; >> } >> return InternalMemCopyMem (DestinationBuffer, SourceBuffer, Length); >> } > As you see, (DestinationBuffer == SourceBuffer) is handled as a no-op > (quite justifiedly, except in the case of SEV). > > Personally I think it would be OK to copy the wrapper function and the > assembly code to OvmfPkg/IoMmuDxe/X64, under the names SevCopyMem() and > InternalSevCopyMem(), and call SevCopyMem() in the CommonBuffer cases of > Map() and Unmap(), for the in-place flipping. > > For the 32-bit case (OvmfPkgIa32.dsc), my understanding is that guests > cannot control the C bit at all (there is no C bit in the PTEs), and > memory is always encrypted. Is that correct? If so, then we only need to > ensure that SevCopyMem() compile, as it will never be called -- in the > entry point function of OvmfPkg/IoMmuDxe, MemEncryptSevIsEnabled() will > return FALSE, and so the IOMMU protocol will not be installed. Therefore > the 32-bit version (under OvmfPkg/IoMmuDxe/Ia32) of SevCopyMem() can be > stubbed out as an ASSERT(FALSE)+CpuDeadLoop(). > > If you can think of a better location for SevCopyMem(), that's fine as > well. For example, you could add it to > "OvmfPkg/Library/BaseMemEncryptSevLib" as well. > > ... I don't think this functionality should be added under MdePkg, > because it is *very* special to the IOMMU implementation, and > practically no other module should use a "busy" in-place CopyMem(). > > Thanks > Laszlo _______________________________________________ edk2-devel mailing list edk2-devel@lists.01.org https://lists.01.org/mailman/listinfo/edk2-devel
On 7/28/17 7:52 PM, Brijesh Singh wrote: [snip] > AMD APM document a procedure which must be used to perform in-place > encryption/decryption. We must follow those steps to ensure that data is > flush into memory using the correct C-bit. Not doing so may result in > unpredictable results. > > http://support.amd.com/TechDocs/24593.pdf (Section 7.10.8) I am wondering if UEFI provides APIs to get two linear mapping of the same physical address. The steps says we create two mapping of same physical address with different C-bits. I will look into UEFI docs to see if something like this exist otherwise we have to consider two memcpy. Since its just for CommandBuffers (which usually are smaller hence we may be okay from performance point of view. Also its just a boot time thing, does not get used when guest OS is takes over. I will investigate and see what works best without adding extra complexity in the code :) -Brijesh _______________________________________________ edk2-devel mailing list edk2-devel@lists.01.org https://lists.01.org/mailman/listinfo/edk2-devel
On 07/29/17 03:37, Brijesh Singh wrote: > On 7/28/17 7:52 PM, Brijesh Singh wrote: > [snip] >> AMD APM document a procedure which must be used to perform in-place >> encryption/decryption. We must follow those steps to ensure that data is >> flush into memory using the correct C-bit. Not doing so may result in >> unpredictable results. >> >> http://support.amd.com/TechDocs/24593.pdf (Section 7.10.8) > I am wondering if UEFI provides APIs to get two linear mapping of the > same physical address. The steps says we create two mapping of same > physical address with different C-bits. I will look into UEFI docs to > see if something like this exist otherwise we have to consider two > memcpy. Since its just for CommandBuffers (which usually are smaller > hence we may be okay from performance point of view. Right, the separate static bounce buffer (4KB in size) and the two matching CopyMem()s look easier to understand and safer. Hopefully there won't be a performance issue in practice. > Also its just a > boot time thing, does not get used when guest OS is takes over. I will > investigate and see what works best without adding extra complexity in > the code :) Thanks! (Sorry about the late reply.) Laszlo _______________________________________________ edk2-devel mailing list edk2-devel@lists.01.org https://lists.01.org/mailman/listinfo/edk2-devel
© 2016 - 2024 Red Hat, Inc.