From: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
Devices implementing ATS can send translation requests using
pci_ats_request_translation_pasid.
The invalidation events are sent back to the device using the iommu
notifier managed with pci_register_iommu_tlb_event_notifier and
pci_unregister_iommu_tlb_event_notifier
Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com>
---
hw/pci/pci.c | 44 +++++++++++++++++++++++++++++++++++++
include/hw/pci/pci.h | 52 ++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 96 insertions(+)
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 7a483dd05d..93b816aff2 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -2833,6 +2833,50 @@ void pci_device_unset_iommu_device(PCIDevice *dev)
}
}
+ssize_t pci_ats_request_translation_pasid(PCIDevice *dev, uint32_t pasid,
+ bool priv_req, bool exec_req,
+ hwaddr addr, size_t length,
+ bool no_write, IOMMUTLBEntry *result,
+ size_t result_length,
+ uint32_t *err_count)
+{
+ assert(result_length);
+ IOMMUMemoryRegion *iommu_mr = pci_device_iommu_memory_region_pasid(dev,
+ pasid);
+ if (!iommu_mr || !pcie_ats_enabled(dev)) {
+ return -EPERM;
+ }
+ return memory_region_iommu_ats_request_translation(iommu_mr, priv_req,
+ exec_req, addr, length,
+ no_write, result,
+ result_length,
+ err_count);
+}
+
+int pci_register_iommu_tlb_event_notifier(PCIDevice *dev, uint32_t pasid,
+ IOMMUNotifier *n)
+{
+ IOMMUMemoryRegion *iommu_mr = pci_device_iommu_memory_region_pasid(dev,
+ pasid);
+ if (!iommu_mr) {
+ return -EPERM;
+ }
+ return memory_region_register_iommu_notifier(MEMORY_REGION(iommu_mr), n,
+ &error_fatal);
+}
+
+int pci_unregister_iommu_tlb_event_notifier(PCIDevice *dev, uint32_t pasid,
+ IOMMUNotifier *n)
+{
+ IOMMUMemoryRegion *iommu_mr = pci_device_iommu_memory_region_pasid(dev,
+ pasid);
+ if (!iommu_mr) {
+ return -EPERM;
+ }
+ memory_region_unregister_iommu_notifier(MEMORY_REGION(iommu_mr), n);
+ return 0;
+}
+
void pci_setup_iommu(PCIBus *bus, const PCIIOMMUOps *ops, void *opaque)
{
/*
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index b2a9ed7782..d656f2656a 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -473,6 +473,58 @@ bool pci_iommu_init_iotlb_notifier(PCIDevice *dev, uint32_t pasid,
IOMMUNotifier *n, IOMMUNotify fn,
void *opaque);
+/**
+ * pci_ats_request_translation_pasid: perform an ATS request
+ *
+ * Return the number of translations stored in @result in case of success,
+ * a negative error code otherwise.
+ * -ENOMEM is returned when the result buffer is not large enough to store
+ * all the translations
+ *
+ * @dev: the ATS-capable PCI device
+ * @pasid: the pasid of the address space in which the translation will be made
+ * @priv_req: privileged mode bit (PASID TLP)
+ * @exec_req: execute request bit (PASID TLP)
+ * @addr: start address of the memory range to be translated
+ * @length: length of the memory range in bytes
+ * @no_write: request a read-only access translation (if supported by the IOMMU)
+ * @result: buffer in which the TLB entries will be stored
+ * @result_length: result buffer length
+ * @err_count: number of untranslated subregions
+ */
+ssize_t pci_ats_request_translation_pasid(PCIDevice *dev, uint32_t pasid,
+ bool priv_req, bool exec_req,
+ hwaddr addr, size_t length,
+ bool no_write, IOMMUTLBEntry *result,
+ size_t result_length,
+ uint32_t *err_count);
+
+/**
+ * pci_register_iommu_tlb_event_notifier: register a notifier for changes to
+ * IOMMU translation entries in a specific address space.
+ *
+ * Returns 0 on success, or a negative errno otherwise.
+ *
+ * @dev: the device that wants to get notified
+ * @pasid: the pasid of the address space to track
+ * @n: the notifier to register
+ */
+int pci_register_iommu_tlb_event_notifier(PCIDevice *dev, uint32_t pasid,
+ IOMMUNotifier *n);
+
+/**
+ * pci_unregister_iommu_tlb_event_notifier: unregister a notifier that has been
+ * registerd with pci_register_iommu_tlb_event_notifier
+ *
+ * Returns 0 on success, or a negative errno otherwise.
+ *
+ * @dev: the device that wants to unsubscribe
+ * @pasid: the pasid of the address space to be untracked
+ * @n: the notifier to unregister
+ */
+int pci_unregister_iommu_tlb_event_notifier(PCIDevice *dev, uint32_t pasid,
+ IOMMUNotifier *n);
+
/**
* pci_setup_iommu: Initialize specific IOMMU handlers for a PCIBus
*
--
2.45.2
On 24-07-02 05:52:45, CLEMENT MATHIEU--DRIF wrote: > From: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com> > > Devices implementing ATS can send translation requests using > pci_ats_request_translation_pasid. > > The invalidation events are sent back to the device using the iommu > notifier managed with pci_register_iommu_tlb_event_notifier and > pci_unregister_iommu_tlb_event_notifier > > Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com> > --- > hw/pci/pci.c | 44 +++++++++++++++++++++++++++++++++++++ > include/hw/pci/pci.h | 52 ++++++++++++++++++++++++++++++++++++++++++++ > 2 files changed, 96 insertions(+) > > diff --git a/hw/pci/pci.c b/hw/pci/pci.c > index 7a483dd05d..93b816aff2 100644 > --- a/hw/pci/pci.c > +++ b/hw/pci/pci.c > @@ -2833,6 +2833,50 @@ void pci_device_unset_iommu_device(PCIDevice *dev) > } > } > > +ssize_t pci_ats_request_translation_pasid(PCIDevice *dev, uint32_t pasid, > + bool priv_req, bool exec_req, > + hwaddr addr, size_t length, > + bool no_write, IOMMUTLBEntry *result, > + size_t result_length, > + uint32_t *err_count) > +{ > + assert(result_length); > + IOMMUMemoryRegion *iommu_mr = pci_device_iommu_memory_region_pasid(dev, > + pasid); > + if (!iommu_mr || !pcie_ats_enabled(dev)) { > + return -EPERM; > + } > + return memory_region_iommu_ats_request_translation(iommu_mr, priv_req, > + exec_req, addr, length, > + no_write, result, > + result_length, > + err_count); > +} Can we use this function not from the endpoint PCI device, but inside of the pci subsystem (hw/pci/pci.c) to make transparent abstraction for ATS request from PCI endpoint device POV? I guess it would be better to have PCI subsystem to issue ATS request if pcie_ats_enabled(dev) rather than calling from the endpoint side.
On 09/07/2024 12:15, Minwoo Im wrote: > Caution: External email. Do not open attachments or click links, unless this email comes from a known sender and you know the content is safe. > > > On 24-07-02 05:52:45, CLEMENT MATHIEU--DRIF wrote: >> From: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com> >> >> Devices implementing ATS can send translation requests using >> pci_ats_request_translation_pasid. >> >> The invalidation events are sent back to the device using the iommu >> notifier managed with pci_register_iommu_tlb_event_notifier and >> pci_unregister_iommu_tlb_event_notifier >> >> Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com> >> --- >> hw/pci/pci.c | 44 +++++++++++++++++++++++++++++++++++++ >> include/hw/pci/pci.h | 52 ++++++++++++++++++++++++++++++++++++++++++++ >> 2 files changed, 96 insertions(+) >> >> diff --git a/hw/pci/pci.c b/hw/pci/pci.c >> index 7a483dd05d..93b816aff2 100644 >> --- a/hw/pci/pci.c >> +++ b/hw/pci/pci.c >> @@ -2833,6 +2833,50 @@ void pci_device_unset_iommu_device(PCIDevice *dev) >> } >> } >> >> +ssize_t pci_ats_request_translation_pasid(PCIDevice *dev, uint32_t pasid, >> + bool priv_req, bool exec_req, >> + hwaddr addr, size_t length, >> + bool no_write, IOMMUTLBEntry *result, >> + size_t result_length, >> + uint32_t *err_count) >> +{ >> + assert(result_length); >> + IOMMUMemoryRegion *iommu_mr = pci_device_iommu_memory_region_pasid(dev, >> + pasid); >> + if (!iommu_mr || !pcie_ats_enabled(dev)) { >> + return -EPERM; >> + } >> + return memory_region_iommu_ats_request_translation(iommu_mr, priv_req, >> + exec_req, addr, length, >> + no_write, result, >> + result_length, >> + err_count); >> +} > Can we use this function not from the endpoint PCI device, but inside of the pci > subsystem (hw/pci/pci.c) to make transparent abstraction for ATS request from > PCI endpoint device POV? I guess it would be better to have PCI subsystem to > issue ATS request if pcie_ats_enabled(dev) rather than calling from the endpoint > side. Hi, This series aims to bring support for SVM (we are trying to integrate the patches bit by bit). From a spec point of view, I don't know if it would make sense to implement the SVM logic at the PCI level as it's supposed to be implemented by endpoint devices. However, we could consider providing a reference/reusable/encapsulated implementation of SVM with a simplified API that would call the pci_* functions under the hood. Do you have a specific use case in mind? >cmd >
On 24-07-09 11:58:53, CLEMENT MATHIEU--DRIF wrote: > > > On 09/07/2024 12:15, Minwoo Im wrote: > > Caution: External email. Do not open attachments or click links, unless this email comes from a known sender and you know the content is safe. > > > > > > On 24-07-02 05:52:45, CLEMENT MATHIEU--DRIF wrote: > >> From: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com> > >> > >> Devices implementing ATS can send translation requests using > >> pci_ats_request_translation_pasid. > >> > >> The invalidation events are sent back to the device using the iommu > >> notifier managed with pci_register_iommu_tlb_event_notifier and > >> pci_unregister_iommu_tlb_event_notifier > >> > >> Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com> > >> --- > >> hw/pci/pci.c | 44 +++++++++++++++++++++++++++++++++++++ > >> include/hw/pci/pci.h | 52 ++++++++++++++++++++++++++++++++++++++++++++ > >> 2 files changed, 96 insertions(+) > >> > >> diff --git a/hw/pci/pci.c b/hw/pci/pci.c > >> index 7a483dd05d..93b816aff2 100644 > >> --- a/hw/pci/pci.c > >> +++ b/hw/pci/pci.c > >> @@ -2833,6 +2833,50 @@ void pci_device_unset_iommu_device(PCIDevice *dev) > >> } > >> } > >> > >> +ssize_t pci_ats_request_translation_pasid(PCIDevice *dev, uint32_t pasid, > >> + bool priv_req, bool exec_req, > >> + hwaddr addr, size_t length, > >> + bool no_write, IOMMUTLBEntry *result, > >> + size_t result_length, > >> + uint32_t *err_count) > >> +{ > >> + assert(result_length); > >> + IOMMUMemoryRegion *iommu_mr = pci_device_iommu_memory_region_pasid(dev, > >> + pasid); > >> + if (!iommu_mr || !pcie_ats_enabled(dev)) { > >> + return -EPERM; > >> + } > >> + return memory_region_iommu_ats_request_translation(iommu_mr, priv_req, > >> + exec_req, addr, length, > >> + no_write, result, > >> + result_length, > >> + err_count); > >> +} > > Can we use this function not from the endpoint PCI device, but inside of the pci > > subsystem (hw/pci/pci.c) to make transparent abstraction for ATS request from > > PCI endpoint device POV? I guess it would be better to have PCI subsystem to > > issue ATS request if pcie_ats_enabled(dev) rather than calling from the endpoint > > side. > Hi, > > This series aims to bring support for SVM (we are trying to integrate > the patches bit by bit). > From a spec point of view, I don't know if it would make sense to > implement the SVM logic at the PCI level > as it's supposed to be implemented by endpoint devices. Understood that this series is targeting the SVM usage. But ATS feature is something general to PCI devices, not only just for SVM, so I guess it would be better to have caller to `pci_ats_request_translation_pasid()` in pci subsystem like pci_dma_rw() to avoid duplicated implementation in the future for the other PCI enpoint devices. > However, we could consider providing a reference/reusable/encapsulated > implementation of SVM with a simplified API > that would call the pci_* functions under the hood. I would prefer that PCI devices which want to request ATS translation has no additional implementation for ATS, but only pcie_ats_init(). > > Do you have a specific use case in mind? ATS/PRI is the actual use case, and it's not that different what you are targeting for :) > > >cmd > > >
On 09/07/2024 23:17, Minwoo Im wrote: > Caution: External email. Do not open attachments or click links, unless this email comes from a known sender and you know the content is safe. > > > On 24-07-09 11:58:53, CLEMENT MATHIEU--DRIF wrote: >> >> On 09/07/2024 12:15, Minwoo Im wrote: >>> Caution: External email. Do not open attachments or click links, unless this email comes from a known sender and you know the content is safe. >>> >>> >>> On 24-07-02 05:52:45, CLEMENT MATHIEU--DRIF wrote: >>>> From: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com> >>>> >>>> Devices implementing ATS can send translation requests using >>>> pci_ats_request_translation_pasid. >>>> >>>> The invalidation events are sent back to the device using the iommu >>>> notifier managed with pci_register_iommu_tlb_event_notifier and >>>> pci_unregister_iommu_tlb_event_notifier >>>> >>>> Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com> >>>> --- >>>> hw/pci/pci.c | 44 +++++++++++++++++++++++++++++++++++++ >>>> include/hw/pci/pci.h | 52 ++++++++++++++++++++++++++++++++++++++++++++ >>>> 2 files changed, 96 insertions(+) >>>> >>>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c >>>> index 7a483dd05d..93b816aff2 100644 >>>> --- a/hw/pci/pci.c >>>> +++ b/hw/pci/pci.c >>>> @@ -2833,6 +2833,50 @@ void pci_device_unset_iommu_device(PCIDevice *dev) >>>> } >>>> } >>>> >>>> +ssize_t pci_ats_request_translation_pasid(PCIDevice *dev, uint32_t pasid, >>>> + bool priv_req, bool exec_req, >>>> + hwaddr addr, size_t length, >>>> + bool no_write, IOMMUTLBEntry *result, >>>> + size_t result_length, >>>> + uint32_t *err_count) >>>> +{ >>>> + assert(result_length); >>>> + IOMMUMemoryRegion *iommu_mr = pci_device_iommu_memory_region_pasid(dev, >>>> + pasid); >>>> + if (!iommu_mr || !pcie_ats_enabled(dev)) { >>>> + return -EPERM; >>>> + } >>>> + return memory_region_iommu_ats_request_translation(iommu_mr, priv_req, >>>> + exec_req, addr, length, >>>> + no_write, result, >>>> + result_length, >>>> + err_count); >>>> +} >>> Can we use this function not from the endpoint PCI device, but inside of the pci >>> subsystem (hw/pci/pci.c) to make transparent abstraction for ATS request from >>> PCI endpoint device POV? I guess it would be better to have PCI subsystem to >>> issue ATS request if pcie_ats_enabled(dev) rather than calling from the endpoint >>> side. >> Hi, >> >> This series aims to bring support for SVM (we are trying to integrate >> the patches bit by bit). >> From a spec point of view, I don't know if it would make sense to >> implement the SVM logic at the PCI level >> as it's supposed to be implemented by endpoint devices. > Understood that this series is targeting the SVM usage. But ATS feature is > something general to PCI devices, not only just for SVM, so I guess it would be > better to have caller to `pci_ats_request_translation_pasid()` in pci subsystem > like pci_dma_rw() to avoid duplicated implementation in the future for the > other PCI enpoint devices. Would we store the ATC directly in the PCI subsytem? > >> However, we could consider providing a reference/reusable/encapsulated >> implementation of SVM with a simplified API >> that would call the pci_* functions under the hood. > I would prefer that PCI devices which want to request ATS translation has no > additional implementation for ATS, but only pcie_ats_init(). Hi, I think both strategies can coexist. Keeping control can be interesting for people who use Qemu for hardware prototyping and who generally want to experiment. We can keep the current PCI-level API for devices that want to reimplement the logic themselves and add a kind of "DMA module"/"ATS+PRI module" that works out of the box. That module could be called "struct PciDmaModule" and expose a simple set of functions like pci_dma_module_init, pci_dma_module_read, pci_dma_module_write. I think it's important to keep existing DMA API as is to allow devices to do both "with ATS" and "without ATS" operations. Do you agree with that? > >> Do you have a specific use case in mind? > ATS/PRI is the actual use case, and it's not that different what you are > targeting for :) > >> >cmd >> >>>
On 24-07-10 05:17:42, CLEMENT MATHIEU--DRIF wrote: > > > On 09/07/2024 23:17, Minwoo Im wrote: > > Caution: External email. Do not open attachments or click links, unless this email comes from a known sender and you know the content is safe. > > > > > > On 24-07-09 11:58:53, CLEMENT MATHIEU--DRIF wrote: > >> > >> On 09/07/2024 12:15, Minwoo Im wrote: > >>> Caution: External email. Do not open attachments or click links, unless this email comes from a known sender and you know the content is safe. > >>> > >>> > >>> On 24-07-02 05:52:45, CLEMENT MATHIEU--DRIF wrote: > >>>> From: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com> > >>>> > >>>> Devices implementing ATS can send translation requests using > >>>> pci_ats_request_translation_pasid. > >>>> > >>>> The invalidation events are sent back to the device using the iommu > >>>> notifier managed with pci_register_iommu_tlb_event_notifier and > >>>> pci_unregister_iommu_tlb_event_notifier > >>>> > >>>> Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com> > >>>> --- > >>>> hw/pci/pci.c | 44 +++++++++++++++++++++++++++++++++++++ > >>>> include/hw/pci/pci.h | 52 ++++++++++++++++++++++++++++++++++++++++++++ > >>>> 2 files changed, 96 insertions(+) > >>>> > >>>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c > >>>> index 7a483dd05d..93b816aff2 100644 > >>>> --- a/hw/pci/pci.c > >>>> +++ b/hw/pci/pci.c > >>>> @@ -2833,6 +2833,50 @@ void pci_device_unset_iommu_device(PCIDevice *dev) > >>>> } > >>>> } > >>>> > >>>> +ssize_t pci_ats_request_translation_pasid(PCIDevice *dev, uint32_t pasid, > >>>> + bool priv_req, bool exec_req, > >>>> + hwaddr addr, size_t length, > >>>> + bool no_write, IOMMUTLBEntry *result, > >>>> + size_t result_length, > >>>> + uint32_t *err_count) > >>>> +{ > >>>> + assert(result_length); > >>>> + IOMMUMemoryRegion *iommu_mr = pci_device_iommu_memory_region_pasid(dev, > >>>> + pasid); > >>>> + if (!iommu_mr || !pcie_ats_enabled(dev)) { > >>>> + return -EPERM; > >>>> + } > >>>> + return memory_region_iommu_ats_request_translation(iommu_mr, priv_req, > >>>> + exec_req, addr, length, > >>>> + no_write, result, > >>>> + result_length, > >>>> + err_count); > >>>> +} > >>> Can we use this function not from the endpoint PCI device, but inside of the pci > >>> subsystem (hw/pci/pci.c) to make transparent abstraction for ATS request from > >>> PCI endpoint device POV? I guess it would be better to have PCI subsystem to > >>> issue ATS request if pcie_ats_enabled(dev) rather than calling from the endpoint > >>> side. > >> Hi, > >> > >> This series aims to bring support for SVM (we are trying to integrate > >> the patches bit by bit). > >> From a spec point of view, I don't know if it would make sense to > >> implement the SVM logic at the PCI level > >> as it's supposed to be implemented by endpoint devices. > > Understood that this series is targeting the SVM usage. But ATS feature is > > something general to PCI devices, not only just for SVM, so I guess it would be > > better to have caller to `pci_ats_request_translation_pasid()` in pci subsystem > > like pci_dma_rw() to avoid duplicated implementation in the future for the > > other PCI enpoint devices. > > Would we store the ATC directly in the PCI subsytem? Yes, endpoint device (e.g., svm.c) should call pci_* helpers in PCI subsystem with `PCIDevice *pdev instance` which represents the endpoint device itself. By the instance, we can look up the IOTLB entry from the ATC in the PCI subsystem, not the current caller side. > > > >> However, we could consider providing a reference/reusable/encapsulated > >> implementation of SVM with a simplified API > >> that would call the pci_* functions under the hood. > > I would prefer that PCI devices which want to request ATS translation has no > > additional implementation for ATS, but only pcie_ats_init(). > Hi, > > I think both strategies can coexist. > Keeping control can be interesting for people who use Qemu for hardware > prototyping and who generally want to experiment. > We can keep the current PCI-level API for devices that want to > reimplement the logic themselves > and add a kind of "DMA module"/"ATS+PRI module" that works out of the box. I think we should proivde hybrid mode on this. One for a `generic` cache policy mode for every PCI endpoint devices which can be controlled in the PCI subsystem for ATC, the other one is that device-specific cache policy mode which let each device implement their own ATC lookup behaviors to optimize their own caching impact. > That module could be called "struct PciDmaModule" and expose a simple > set of functions like pci_dma_module_init, pci_dma_module_read, > pci_dma_module_write. > I think it's important to keep existing DMA API as is to allow devices > to do both "with ATS" and "without ATS" operations. > > Do you agree with that? Indeed. Keeping the existing APIs is a good choice, but I would like to have endpoint devices code much more simpler for the generic usages :) > > > >> Do you have a specific use case in mind? > > ATS/PRI is the actual use case, and it's not that different what you are > > targeting for :) > > > >> >cmd > >> > >>>
On 11/07/2024 10:04, Minwoo Im wrote: > Caution: External email. Do not open attachments or click links, unless this email comes from a known sender and you know the content is safe. > > > On 24-07-10 05:17:42, CLEMENT MATHIEU--DRIF wrote: >> >> On 09/07/2024 23:17, Minwoo Im wrote: >>> Caution: External email. Do not open attachments or click links, unless this email comes from a known sender and you know the content is safe. >>> >>> >>> On 24-07-09 11:58:53, CLEMENT MATHIEU--DRIF wrote: >>>> On 09/07/2024 12:15, Minwoo Im wrote: >>>>> Caution: External email. Do not open attachments or click links, unless this email comes from a known sender and you know the content is safe. >>>>> >>>>> >>>>> On 24-07-02 05:52:45, CLEMENT MATHIEU--DRIF wrote: >>>>>> From: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com> >>>>>> >>>>>> Devices implementing ATS can send translation requests using >>>>>> pci_ats_request_translation_pasid. >>>>>> >>>>>> The invalidation events are sent back to the device using the iommu >>>>>> notifier managed with pci_register_iommu_tlb_event_notifier and >>>>>> pci_unregister_iommu_tlb_event_notifier >>>>>> >>>>>> Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com> >>>>>> --- >>>>>> hw/pci/pci.c | 44 +++++++++++++++++++++++++++++++++++++ >>>>>> include/hw/pci/pci.h | 52 ++++++++++++++++++++++++++++++++++++++++++++ >>>>>> 2 files changed, 96 insertions(+) >>>>>> >>>>>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c >>>>>> index 7a483dd05d..93b816aff2 100644 >>>>>> --- a/hw/pci/pci.c >>>>>> +++ b/hw/pci/pci.c >>>>>> @@ -2833,6 +2833,50 @@ void pci_device_unset_iommu_device(PCIDevice *dev) >>>>>> } >>>>>> } >>>>>> >>>>>> +ssize_t pci_ats_request_translation_pasid(PCIDevice *dev, uint32_t pasid, >>>>>> + bool priv_req, bool exec_req, >>>>>> + hwaddr addr, size_t length, >>>>>> + bool no_write, IOMMUTLBEntry *result, >>>>>> + size_t result_length, >>>>>> + uint32_t *err_count) >>>>>> +{ >>>>>> + assert(result_length); >>>>>> + IOMMUMemoryRegion *iommu_mr = pci_device_iommu_memory_region_pasid(dev, >>>>>> + pasid); >>>>>> + if (!iommu_mr || !pcie_ats_enabled(dev)) { >>>>>> + return -EPERM; >>>>>> + } >>>>>> + return memory_region_iommu_ats_request_translation(iommu_mr, priv_req, >>>>>> + exec_req, addr, length, >>>>>> + no_write, result, >>>>>> + result_length, >>>>>> + err_count); >>>>>> +} >>>>> Can we use this function not from the endpoint PCI device, but inside of the pci >>>>> subsystem (hw/pci/pci.c) to make transparent abstraction for ATS request from >>>>> PCI endpoint device POV? I guess it would be better to have PCI subsystem to >>>>> issue ATS request if pcie_ats_enabled(dev) rather than calling from the endpoint >>>>> side. >>>> Hi, >>>> >>>> This series aims to bring support for SVM (we are trying to integrate >>>> the patches bit by bit). >>>> From a spec point of view, I don't know if it would make sense to >>>> implement the SVM logic at the PCI level >>>> as it's supposed to be implemented by endpoint devices. >>> Understood that this series is targeting the SVM usage. But ATS feature is >>> something general to PCI devices, not only just for SVM, so I guess it would be >>> better to have caller to `pci_ats_request_translation_pasid()` in pci subsystem >>> like pci_dma_rw() to avoid duplicated implementation in the future for the >>> other PCI enpoint devices. >> Would we store the ATC directly in the PCI subsytem? > Yes, endpoint device (e.g., svm.c) should call pci_* helpers in PCI subsystem > with `PCIDevice *pdev instance` which represents the endpoint device itself. > By the instance, we can look up the IOTLB entry from the ATC in the PCI > subsystem, not the current caller side. > >>>> However, we could consider providing a reference/reusable/encapsulated >>>> implementation of SVM with a simplified API >>>> that would call the pci_* functions under the hood. >>> I would prefer that PCI devices which want to request ATS translation has no >>> additional implementation for ATS, but only pcie_ats_init(). >> Hi, >> >> I think both strategies can coexist. >> Keeping control can be interesting for people who use Qemu for hardware >> prototyping and who generally want to experiment. >> We can keep the current PCI-level API for devices that want to >> reimplement the logic themselves >> and add a kind of "DMA module"/"ATS+PRI module" that works out of the box. > I think we should proivde hybrid mode on this. One for a `generic` cache > policy mode for every PCI endpoint devices which can be controlled in the PCI > subsystem for ATC, the other one is that device-specific cache policy mode > which let each device implement their own ATC lookup behaviors to optimize > their own caching impact. > >> That module could be called "struct PciDmaModule" and expose a simple >> set of functions like pci_dma_module_init, pci_dma_module_read, >> pci_dma_module_write. >> I think it's important to keep existing DMA API as is to allow devices >> to do both "with ATS" and "without ATS" operations. >> >> Do you agree with that? > Indeed. Keeping the existing APIs is a good choice, but I would like to have > endpoint devices code much more simpler for the generic usages :) That's a good point, we will se what we can do once the current work is integrated. Thanks for your comment :) > >>>> Do you have a specific use case in mind? >>> ATS/PRI is the actual use case, and it's not that different what you are >>> targeting for :) >>> >>>> >cmd >>>> >>>>>
On 24-07-11 19:00:58, CLEMENT MATHIEU--DRIF wrote: > > > On 11/07/2024 10:04, Minwoo Im wrote: > > Caution: External email. Do not open attachments or click links, unless this email comes from a known sender and you know the content is safe. > > > > > > On 24-07-10 05:17:42, CLEMENT MATHIEU--DRIF wrote: > >> > >> On 09/07/2024 23:17, Minwoo Im wrote: > >>> Caution: External email. Do not open attachments or click links, unless this email comes from a known sender and you know the content is safe. > >>> > >>> > >>> On 24-07-09 11:58:53, CLEMENT MATHIEU--DRIF wrote: > >>>> On 09/07/2024 12:15, Minwoo Im wrote: > >>>>> Caution: External email. Do not open attachments or click links, unless this email comes from a known sender and you know the content is safe. > >>>>> > >>>>> > >>>>> On 24-07-02 05:52:45, CLEMENT MATHIEU--DRIF wrote: > >>>>>> From: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com> > >>>>>> > >>>>>> Devices implementing ATS can send translation requests using > >>>>>> pci_ats_request_translation_pasid. > >>>>>> > >>>>>> The invalidation events are sent back to the device using the iommu > >>>>>> notifier managed with pci_register_iommu_tlb_event_notifier and > >>>>>> pci_unregister_iommu_tlb_event_notifier > >>>>>> > >>>>>> Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com> > >>>>>> --- > >>>>>> hw/pci/pci.c | 44 +++++++++++++++++++++++++++++++++++++ > >>>>>> include/hw/pci/pci.h | 52 ++++++++++++++++++++++++++++++++++++++++++++ > >>>>>> 2 files changed, 96 insertions(+) > >>>>>> > >>>>>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c > >>>>>> index 7a483dd05d..93b816aff2 100644 > >>>>>> --- a/hw/pci/pci.c > >>>>>> +++ b/hw/pci/pci.c > >>>>>> @@ -2833,6 +2833,50 @@ void pci_device_unset_iommu_device(PCIDevice *dev) > >>>>>> } > >>>>>> } > >>>>>> > >>>>>> +ssize_t pci_ats_request_translation_pasid(PCIDevice *dev, uint32_t pasid, > >>>>>> + bool priv_req, bool exec_req, > >>>>>> + hwaddr addr, size_t length, > >>>>>> + bool no_write, IOMMUTLBEntry *result, > >>>>>> + size_t result_length, > >>>>>> + uint32_t *err_count) > >>>>>> +{ > >>>>>> + assert(result_length); > >>>>>> + IOMMUMemoryRegion *iommu_mr = pci_device_iommu_memory_region_pasid(dev, > >>>>>> + pasid); > >>>>>> + if (!iommu_mr || !pcie_ats_enabled(dev)) { > >>>>>> + return -EPERM; > >>>>>> + } > >>>>>> + return memory_region_iommu_ats_request_translation(iommu_mr, priv_req, > >>>>>> + exec_req, addr, length, > >>>>>> + no_write, result, > >>>>>> + result_length, > >>>>>> + err_count); > >>>>>> +} > >>>>> Can we use this function not from the endpoint PCI device, but inside of the pci > >>>>> subsystem (hw/pci/pci.c) to make transparent abstraction for ATS request from > >>>>> PCI endpoint device POV? I guess it would be better to have PCI subsystem to > >>>>> issue ATS request if pcie_ats_enabled(dev) rather than calling from the endpoint > >>>>> side. > >>>> Hi, > >>>> > >>>> This series aims to bring support for SVM (we are trying to integrate > >>>> the patches bit by bit). > >>>> From a spec point of view, I don't know if it would make sense to > >>>> implement the SVM logic at the PCI level > >>>> as it's supposed to be implemented by endpoint devices. > >>> Understood that this series is targeting the SVM usage. But ATS feature is > >>> something general to PCI devices, not only just for SVM, so I guess it would be > >>> better to have caller to `pci_ats_request_translation_pasid()` in pci subsystem > >>> like pci_dma_rw() to avoid duplicated implementation in the future for the > >>> other PCI enpoint devices. > >> Would we store the ATC directly in the PCI subsytem? > > Yes, endpoint device (e.g., svm.c) should call pci_* helpers in PCI subsystem > > with `PCIDevice *pdev instance` which represents the endpoint device itself. > > By the instance, we can look up the IOTLB entry from the ATC in the PCI > > subsystem, not the current caller side. > > > >>>> However, we could consider providing a reference/reusable/encapsulated > >>>> implementation of SVM with a simplified API > >>>> that would call the pci_* functions under the hood. > >>> I would prefer that PCI devices which want to request ATS translation has no > >>> additional implementation for ATS, but only pcie_ats_init(). > >> Hi, > >> > >> I think both strategies can coexist. > >> Keeping control can be interesting for people who use Qemu for hardware > >> prototyping and who generally want to experiment. > >> We can keep the current PCI-level API for devices that want to > >> reimplement the logic themselves > >> and add a kind of "DMA module"/"ATS+PRI module" that works out of the box. > > I think we should proivde hybrid mode on this. One for a `generic` cache > > policy mode for every PCI endpoint devices which can be controlled in the PCI > > subsystem for ATC, the other one is that device-specific cache policy mode > > which let each device implement their own ATC lookup behaviors to optimize > > their own caching impact. > > > >> That module could be called "struct PciDmaModule" and expose a simple > >> set of functions like pci_dma_module_init, pci_dma_module_read, > >> pci_dma_module_write. > >> I think it's important to keep existing DMA API as is to allow devices > >> to do both "with ATS" and "without ATS" operations. > >> > >> Do you agree with that? > > Indeed. Keeping the existing APIs is a good choice, but I would like to have > > endpoint devices code much more simpler for the generic usages :) > That's a good point, we will se what we can do once the current work is > integrated. > Thanks for your comment :) Do you have a plan to repost this series soon? I would like to apply your patches and review/test this series. It looks like you've been through some of fix patches for VT-d, but I'm curious your further plan for the actual SVM feature :) > > > >>>> Do you have a specific use case in mind? > >>> ATS/PRI is the actual use case, and it's not that different what you are > >>> targeting for :) > >>> > >>>> >cmd > >>>> > >>>>>
On 18/07/2024 01:44, Minwoo Im wrote: Caution: External email. Do not open attachments or click links, unless this email comes from a known sender and you know the content is safe. On 24-07-11 19:00:58, CLEMENT MATHIEU--DRIF wrote: On 11/07/2024 10:04, Minwoo Im wrote: Caution: External email. Do not open attachments or click links, unless this email comes from a known sender and you know the content is safe. On 24-07-10 05:17:42, CLEMENT MATHIEU--DRIF wrote: On 09/07/2024 23:17, Minwoo Im wrote: Caution: External email. Do not open attachments or click links, unless this email comes from a known sender and you know the content is safe. On 24-07-09 11:58:53, CLEMENT MATHIEU--DRIF wrote: On 09/07/2024 12:15, Minwoo Im wrote: Caution: External email. Do not open attachments or click links, unless this email comes from a known sender and you know the content is safe. On 24-07-02 05:52:45, CLEMENT MATHIEU--DRIF wrote: From: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com><mailto:clement.mathieu--drif@eviden.com> Devices implementing ATS can send translation requests using pci_ats_request_translation_pasid. The invalidation events are sent back to the device using the iommu notifier managed with pci_register_iommu_tlb_event_notifier and pci_unregister_iommu_tlb_event_notifier Signed-off-by: Clément Mathieu--Drif <clement.mathieu--drif@eviden.com><mailto:clement.mathieu--drif@eviden.com> --- hw/pci/pci.c | 44 +++++++++++++++++++++++++++++++++++++ include/hw/pci/pci.h | 52 ++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 96 insertions(+) diff --git a/hw/pci/pci.c b/hw/pci/pci.c index 7a483dd05d..93b816aff2 100644 --- a/hw/pci/pci.c +++ b/hw/pci/pci.c @@ -2833,6 +2833,50 @@ void pci_device_unset_iommu_device(PCIDevice *dev) } } +ssize_t pci_ats_request_translation_pasid(PCIDevice *dev, uint32_t pasid, + bool priv_req, bool exec_req, + hwaddr addr, size_t length, + bool no_write, IOMMUTLBEntry *result, + size_t result_length, + uint32_t *err_count) +{ + assert(result_length); + IOMMUMemoryRegion *iommu_mr = pci_device_iommu_memory_region_pasid(dev, + pasid); + if (!iommu_mr || !pcie_ats_enabled(dev)) { + return -EPERM; + } + return memory_region_iommu_ats_request_translation(iommu_mr, priv_req, + exec_req, addr, length, + no_write, result, + result_length, + err_count); +} Can we use this function not from the endpoint PCI device, but inside of the pci subsystem (hw/pci/pci.c) to make transparent abstraction for ATS request from PCI endpoint device POV? I guess it would be better to have PCI subsystem to issue ATS request if pcie_ats_enabled(dev) rather than calling from the endpoint side. Hi, This series aims to bring support for SVM (we are trying to integrate the patches bit by bit). From a spec point of view, I don't know if it would make sense to implement the SVM logic at the PCI level as it's supposed to be implemented by endpoint devices. Understood that this series is targeting the SVM usage. But ATS feature is something general to PCI devices, not only just for SVM, so I guess it would be better to have caller to `pci_ats_request_translation_pasid()` in pci subsystem like pci_dma_rw() to avoid duplicated implementation in the future for the other PCI enpoint devices. Would we store the ATC directly in the PCI subsytem? Yes, endpoint device (e.g., svm.c) should call pci_* helpers in PCI subsystem with `PCIDevice *pdev instance` which represents the endpoint device itself. By the instance, we can look up the IOTLB entry from the ATC in the PCI subsystem, not the current caller side. However, we could consider providing a reference/reusable/encapsulated implementation of SVM with a simplified API that would call the pci_* functions under the hood. I would prefer that PCI devices which want to request ATS translation has no additional implementation for ATS, but only pcie_ats_init(). Hi, I think both strategies can coexist. Keeping control can be interesting for people who use Qemu for hardware prototyping and who generally want to experiment. We can keep the current PCI-level API for devices that want to reimplement the logic themselves and add a kind of "DMA module"/"ATS+PRI module" that works out of the box. I think we should proivde hybrid mode on this. One for a `generic` cache policy mode for every PCI endpoint devices which can be controlled in the PCI subsystem for ATC, the other one is that device-specific cache policy mode which let each device implement their own ATC lookup behaviors to optimize their own caching impact. That module could be called "struct PciDmaModule" and expose a simple set of functions like pci_dma_module_init, pci_dma_module_read, pci_dma_module_write. I think it's important to keep existing DMA API as is to allow devices to do both "with ATS" and "without ATS" operations. Do you agree with that? Indeed. Keeping the existing APIs is a good choice, but I would like to have endpoint devices code much more simpler for the generic usages :) That's a good point, we will se what we can do once the current work is integrated. Thanks for your comment :) Do you have a plan to repost this series soon? I would like to apply your patches and review/test this series. It looks like you've been through some of fix patches for VT-d, but I'm curious your further plan for the actual SVM feature :) Hi, As you may have noticed, my SVM implementation is based on Zhenzhong Duan's FLTS series, which has not yet been integrated. The VT-d quick fixes were the only fixes without dependencies. I'm now waiting for FLTS support to be integrated upstream before sending my series again (I'll add you to CC). If you want to test the features, you can clone the following repos: - https://github.com/BullSequana/Qemu-in-guest-SVM-demo - https://github.com/BullSequana/qemu (submodule of the repo above) >cmd Do you have a specific use case in mind? ATS/PRI is the actual use case, and it's not that different what you are targeting for :) >cmd
© 2016 - 2025 Red Hat, Inc.