From: Laine Stump
To: libvir-list@redhat.com
Cc: mprivozn@redhat.com, moshele@mellanox.com
Date: Thu, 10 Aug 2017 21:42:20 -0400
Subject: [libvirt] [PATCH v2 5/7] util: save the correct VF's info when using a dual port SRIOV NIC in single port mode
Message-Id: <20170811014222.29548-6-laine@laine.org>
In-Reply-To: <20170811014222.29548-1-laine@laine.org>
References: <20170811014222.29548-1-laine@laine.org>

Mellanox ConnectX-3 dual port SRIOV NICs present a bit of a challenge
when assigning one of their VFs to a guest using VFIO device
assignment. These NICs have only a single PCI PF device, and that
single PF has two netdevs sharing the single PCI address - one for
port 1 and one for port 2. When a VF is created it can also have 2
netdevs, or it can be set up in "single port" mode, where the VF has
only a single netdev, and that netdev is connected either to port 1 or
to port 2.

When the VF is created in dual port mode, you get/set the MAC
address/vlan tag for the port 1 VF by sending a netlink message to the
PF's port 1 netdev, and you get/set the MAC address/vlan tag for the
port 2 VF by sending a netlink message to the PF's port 2 netdev. (Of
course libvirt doesn't have any way to describe MAC/vlan info for 2
ports in a single hostdev interface, so that's a bit of a moot point.)

When the VF is created in single port mode, you can *set* the MAC/vlan
info by sending a netlink message to *either* PF netdev - the driver
is smart enough to understand that there's only a single netdev, and
set the MAC/vlan for that netdev. When you want to *get* it, however,
the driver is stricter - it will return 00:00:00:00:00:00 for the MAC
if you request it from the port 1 PF netdev when the VF was configured
to be single port on port 2, or if you request it from the port 2 PF
netdev when the VF was configured to be single port on port 1.

Based on this information, when *getting* the MAC/vlan info (to save
the original setting prior to assignment), we determine the correct PF
netdev by matching phys_port_id between VF and PF.

(IMPORTANT NOTE: this implies that to do PCI device assignment of the
VFs on dual port Mellanox cards using a hostdev interface (i.e. if you
want the MAC address/vlan tag to be set), not only must the VFs be
configured in single port mode, but also the VFs *must* be bound to
the host VF net driver, and libvirt must use managed='yes'.)

By the time libvirt is ready to set the new MAC/vlan tag, the VF has
already been unbound from the host net driver and bound to
vfio-pci. This isn't problematic though because, as stated earlier,
when a VF is created in single port mode, commands to configure it can
be sent to either the port 1 PF netdev or the port 2 PF netdev.

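The phys_port_id matching used on the *get* path can be illustrated
with a small standalone sketch. This is not libvirt's implementation:
struct pf_netdev, pick_pf_netdev() and the fallback_idx parameter are
hypothetical names invented for the example, and the port IDs are
passed in as strings instead of being read from sysfs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record for one netdev of the PF. */
struct pf_netdev {
    const char *name;         /* netdev name (illustrative only) */
    const char *phys_port_id; /* NULL if the driver reports none */
};

/* Return the index of the PF netdev whose phys_port_id equals
 * vf_port_id. When vf_port_id is NULL (for example because the VF is
 * not bound to a host net driver), fall back to fallback_idx, which
 * may or may not be the right port. Return -1 when nothing matches
 * or the fallback index is out of range.
 */
int pick_pf_netdev(const struct pf_netdev *pfs, size_t npfs,
                   const char *vf_port_id, size_t fallback_idx)
{
    size_t i;

    assert(pfs != NULL || npfs == 0);

    if (vf_port_id == NULL)
        return fallback_idx < npfs ? (int)fallback_idx : -1;

    for (i = 0; i < npfs; i++) {
        if (pfs[i].phys_port_id != NULL &&
            strcmp(pfs[i].phys_port_id, vf_port_id) == 0)
            return (int)i;
    }

    return -1;
}
```

The fallback branch mirrors the caveat above: without a bound VF
netdev there is no phys_port_id to compare, so the caller can only
guess which PF netdev to use.
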
When it is time to restore the original MAC/vlan tag, again the VF
will *not* be bound to a host net driver, so it won't be possible to
learn from sysfs whether to use the port 1 or port 2 PF netdev for the
netlink commands. And again, it doesn't matter which netdev you
use. However, we must keep in mind that we saved the original settings
to a file called "${PF}_${VFNUM}". To solve this problem, we just
check for the existence of ${PF1}_${VFNUM} and ${PF2}_${VFNUM}, and
use whichever one we find (since we know that only one can be there).
---

New in V2

 src/util/virhostdev.c | 27 +++++++++++++++++++++------
 src/util/virpci.c     | 31 +++++++++++++++++++++++++++++--
 src/util/virpci.h     |  4 +++-
 3 files changed, 53 insertions(+), 9 deletions(-)

diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c
index 580f0fac0..102fd85c1 100644
--- a/src/util/virhostdev.c
+++ b/src/util/virhostdev.c
@@ -307,7 +307,9 @@ virHostdevIsVirtualFunction(virDomainHostdevDefPtr hostdev)
 
 
 static int
-virHostdevNetDevice(virDomainHostdevDefPtr hostdev, char **linkdev,
+virHostdevNetDevice(virDomainHostdevDefPtr hostdev,
+                    int pfNetDevIdx,
+                    char **linkdev,
                     int *vf)
 {
     int ret = -1;
@@ -317,9 +319,10 @@ virHostdevNetDevice(virDomainHostdevDefPtr hostdev, char **linkdev,
         return ret;
 
     if (virPCIIsVirtualFunction(sysfs_path) == 1) {
-        if (virPCIGetVirtualFunctionInfo(sysfs_path, linkdev,
-                                         vf) < 0)
+        if (virPCIGetVirtualFunctionInfo(sysfs_path, pfNetDevIdx,
+                                         linkdev, vf) < 0) {
             goto cleanup;
+        }
     } else {
         /* In practice this should never happen, since we currently
          * only support assigning SRIOV VFs via parent.data.net);
@@ -545,7 +548,7 @@ virHostdevRestoreNetConfig(virDomainHostdevDefPtr hostdev,
         return ret;
     }
 
-    if (virHostdevNetDevice(hostdev, &linkdev, &vf) < 0)
+    if (virHostdevNetDevice(hostdev, 0, &linkdev, &vf) < 0)
         return ret;
 
     virtPort = virDomainNetGetActualVirtPortProfile(
@@ -565,6 +568,18 @@ virHostdevRestoreNetConfig(virDomainHostdevDefPtr hostdev,
             ret = virNetDevReadNetConfig(linkdev, vf, oldStateDir, &adminMAC, &vlan, &MAC);
 
+        if (ret < 0) {
+            /* see if the config was saved using the PF's "port 2"
+             * netdev for the file name.
+             */
+            VIR_FREE(linkdev);
+
+            if (virHostdevNetDevice(hostdev, 1, &linkdev, &vf) >= 0) {
+                ret = virNetDevReadNetConfig(linkdev, vf, stateDir,
+                                             &adminMAC, &vlan, &MAC);
+            }
+        }
+
 
         if (ret == 0) {
             /* if a MAC was stored for the VF, we should now restore
              * that as the adminMAC. We have to do it this way because
diff --git a/src/util/virpci.c b/src/util/virpci.c
index 62a36b380..5ded77087 100644
--- a/src/util/virpci.c
+++ b/src/util/virpci.c
@@ -2935,10 +2935,14 @@ virPCIGetNetName(const char *device_link_sysfs_path,
 
 int
 virPCIGetVirtualFunctionInfo(const char *vf_sysfs_device_path,
-                             char **pfname, int *vf_index)
+                             int pfNetDevIdx,
+                             char **pfname,
+                             int *vf_index)
 {
     virPCIDeviceAddressPtr pf_config_address = NULL;
     char *pf_sysfs_device_path = NULL;
+    char *vfname = NULL;
+    char *vfPhysPortID = NULL;
     int ret = -1;
 
     if (virPCIGetPhysicalFunction(vf_sysfs_device_path, &pf_config_address) < 0)
@@ -2957,8 +2961,28 @@ virPCIGetVirtualFunctionInfo(const char *vf_sysfs_device_path,
         goto cleanup;
     }
 
-    if (virPCIGetNetName(pf_sysfs_device_path, 0, NULL, pfname) < 0)
+    /* If the caller hasn't asked for a specific pfNetDevIdx, and VF
+     * is bound to a netdev, learn that netdev's phys_port_id (if
+     * available). This can be used to disambiguate when the PF has
+     * multiple netdevs. If the VF isn't bound to a netdev, then we
+     * return netdev[pfNetDevIdx] on the PF, which may or may not be
+     * correct.
+     */
+    if (pfNetDevIdx == -1) {
+        if (virPCIGetNetName(vf_sysfs_device_path, 0, NULL, &vfname) < 0)
+            goto cleanup;
+
+        if (vfname) {
+            if (virNetDevGetPhysPortID(vfname, &vfPhysPortID) < 0)
+                goto cleanup;
+        }
+        pfNetDevIdx = 0;
+    }
+
+    if (virPCIGetNetName(pf_sysfs_device_path,
+                         pfNetDevIdx, vfPhysPortID, pfname) < 0) {
         goto cleanup;
+    }
 
     if (!*pfname) {
         /* this shouldn't be possible. A VF can't exist unless its
@@ -2974,6 +2998,8 @@ virPCIGetVirtualFunctionInfo(const char *vf_sysfs_device_path,
  cleanup:
     VIR_FREE(pf_config_address);
     VIR_FREE(pf_sysfs_device_path);
+    VIR_FREE(vfname);
+    VIR_FREE(vfPhysPortID);
 
     return ret;
 }
@@ -3044,6 +3070,7 @@ virPCIGetNetName(const char *device_link_sysfs_path ATTRIBUTE_UNUSED,
 
 int
 virPCIGetVirtualFunctionInfo(const char *vf_sysfs_device_path ATTRIBUTE_UNUSED,
+                             int pfNetDevIdx ATTRIBUTE_UNUSED,
                              char **pfname ATTRIBUTE_UNUSED,
                              int *vf_index ATTRIBUTE_UNUSED)
 {
diff --git a/src/util/virpci.h b/src/util/virpci.h
index adf336706..f1fbe39e6 100644
--- a/src/util/virpci.h
+++ b/src/util/virpci.h
@@ -226,7 +226,9 @@ int virPCIGetAddrString(unsigned int domain,
 int virPCIDeviceAddressParse(char *address, virPCIDeviceAddressPtr bdf);
 
 int virPCIGetVirtualFunctionInfo(const char *vf_sysfs_device_path,
-                                 char **pfname, int *vf_index);
+                                 int pfNetDevIdx,
+                                 char **pfname,
+                                 int *vf_index);
 
 int virPCIDeviceUnbind(virPCIDevicePtr dev);
 int virPCIDeviceRebind(virPCIDevicePtr dev);
-- 
2.13.3
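
The restore-time lookup described in the commit message (try the file
saved under each PF netdev's name and use whichever exists) can be
sketched on its own as follows. This is only an illustration under
stated assumptions, not the code in the patch: find_saved_config() is
a hypothetical helper, the "<dir>/<pf>_<vfnum>" path layout is taken
from the "${PF}_${VFNUM}" wording of the commit message, and existence
is checked with POSIX access().

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <unistd.h>

/* Try "<state_dir>/<pf>_<vfnum>" for each candidate PF netdev name in
 * order. Return the index of the first candidate whose file exists
 * and leave its path in 'path'. Return -1 (with 'path' emptied) when
 * no candidate has a saved file.
 */
int find_saved_config(const char *state_dir,
                      const char *const *pf_names, size_t npfs,
                      int vfnum, char *path, size_t pathlen)
{
    size_t i;

    assert(state_dir != NULL && pf_names != NULL);
    assert(path != NULL && pathlen > 0);

    for (i = 0; i < npfs; i++) {
        int n = snprintf(path, pathlen, "%s/%s_%d",
                         state_dir, pf_names[i], vfnum);

        if (n < 0 || (size_t)n >= pathlen)
            continue; /* path did not fit in the buffer; skip it */

        if (access(path, F_OK) == 0)
            return (int)i;
    }

    path[0] = '\0';
    return -1;
}
```

Because only one of the two files can exist for a VF in single port
mode, the order in which the candidates are tried does not change the
result.
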