From: Wim ten Have <wim.ten.have@oracle.com>
Add libvirtd NUMA cell domain administration functionality to
describe underlying cell id sibling distances in full fashion
when configuring HVM guests.
Schema updates are made to docs/schemas/cputypes.rng enforcing domain
administration to follow the syntax below the numa cell id and
docs/schemas/basictypes.rng to add "numaDistanceValue".
A minimum value of 10 representing the LOCAL_DISTANCE as 0-9 are
reserved values and can not be used as System Locality Distance Information.
A value of 20 represents the default setting of REMOTE_DISTANCE
where a maximum value of 255 represents UNREACHABLE.
Effectively any cell sibling can be assigned a distance value where
practically 'LOCAL_DISTANCE <= value <= UNREACHABLE'.
[below is an example of a 4 node setup]
<cpu>
<numa>
<cell id='0' cpus='0' memory='2097152' unit='KiB'>
<distances>
<sibling id='0' value='10'/>
<sibling id='1' value='21'/>
<sibling id='2' value='31'/>
<sibling id='3' value='41'/>
</distances>
</cell>
<cell id='1' cpus='1' memory='2097152' unit='KiB'>
<distances>
<sibling id='0' value='21'/>
<sibling id='1' value='10'/>
<sibling id='2' value='31'/>
<sibling id='3' value='41'/>
</distances>
</cell>
<cell id='2' cpus='2' memory='2097152' unit='KiB'>
<distances>
<sibling id='0' value='31'/>
<sibling id='1' value='21'/>
<sibling id='2' value='10'/>
<sibling id='3' value='21'/>
</distances>
</cell>
<cell id='3' cpus='3' memory='2097152' unit='KiB'>
<distances>
<sibling id='0' value='41'/>
<sibling id='1' value='31'/>
<sibling id='2' value='21'/>
<sibling id='3' value='10'/>
</distances>
</cell>
</numa>
</cpu>
Whenever a sibling id the cell LOCAL_DISTANCE does apply and for any
sibling id not being covered a default of REMOTE_DISTANCE is used
for internal computations.
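For illustration, a sketch with made-up values (not taken from the 4 node
example above): if a cell lists only the single sibling below, the intent is
that the cell's own distance is filled in as LOCAL_DISTANCE (10), the 0 <-> 1
value of 31 is mirrored onto cell 1, and any unlisted sibling falls back to
REMOTE_DISTANCE (20) for internal computations.
  <cell id='0' cpus='0' memory='2097152' unit='KiB'>
    <distances>
      <sibling id='1' value='31'/>  <!-- illustrative value only -->
    </distances>
  </cell>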
Signed-off-by: Wim ten Have <wim.ten.have@oracle.com>
---
Changes on v1:
- Add changes to docs/formatdomain.html.in describing schema update.
Changes on v2:
- Automatically apply distance symmetry maintaining cell <-> sibling.
- Check for maximum '255' on numaDistanceValue.
- Automatically complete empty distance ranges.
- Check that sibling_id's are in range with cell identifiers.
- Allow non-contiguous ranges, starting from any node id.
- Respect parameters as ATTRIBUTE_NONNULL fix functions and callers.
- Add and apply topology for LOCAL_DISTANCE=10 and REMOTE_DISTANCE=20.
Changes on v3:
- Add UNREACHABLE if one locality is unreachable from another.
- Add code cleanup aligning function naming in a separate patch.
- Add NUMA related driver code in a separate patch.
- Remove <choice> from numaDistanceValue in docs/schemas/basictypes.rng.
- Correct doc changes.
---
docs/formatdomain.html.in | 63 +++++++++++++-
docs/schemas/basictypes.rng | 7 ++
docs/schemas/cputypes.rng | 18 ++++
src/conf/numa_conf.c | 200 +++++++++++++++++++++++++++++++++++++++++++-
4 files changed, 284 insertions(+), 4 deletions(-)
diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in
index 8ca7637..c453d44 100644
--- a/docs/formatdomain.html.in
+++ b/docs/formatdomain.html.in
@@ -1529,7 +1529,68 @@
</p>
<p>
- This guest NUMA specification is currently available only for QEMU/KVM.
+ This guest NUMA specification is currently available only for
+ QEMU/KVM and Xen. The Xen driver additionally allows for a distinct
+ description of NUMA-arranged <code>sibling</code> <code>cell</code>
+ <code>distances</code> <span class="since">Since 3.6.0</span>.
+ </p>
+
+ <p>
+ On NUMA hardware, the cost of accessing distinct resources such as
+ memory depends on the distance between a <code>cell</code> and its
+ <code>siblings</code>, which can now be described with the help of
+ <code>distances</code>. A detailed description can be found in the
+ ACPI (Advanced Configuration and Power Interface) Specification,
+ in the chapter explaining the system's SLIT (System Locality
+ Distance Information Table).
+ </p>
+
+<pre>
+...
+<cpu>
+ ...
+ <numa>
+ <cell id='0' cpus='0,4-7' memory='512000' unit='KiB'>
+ <distances>
+ <sibling id='0' value='10'/>
+ <sibling id='1' value='21'/>
+ <sibling id='2' value='31'/>
+ <sibling id='3' value='41'/>
+ </distances>
+ </cell>
+ <cell id='1' cpus='1,8-10,12-15' memory='512000' unit='KiB' memAccess='shared'>
+ <distances>
+ <sibling id='0' value='21'/>
+ <sibling id='1' value='10'/>
+ <sibling id='2' value='21'/>
+ <sibling id='3' value='31'/>
+ </distances>
+ </cell>
+ <cell id='2' cpus='2,11' memory='512000' unit='KiB' memAccess='shared'>
+ <distances>
+ <sibling id='0' value='31'/>
+ <sibling id='1' value='21'/>
+ <sibling id='2' value='10'/>
+ <sibling id='3' value='21'/>
+ </distances>
+ </cell>
+ <cell id='3' cpus='3' memory='512000' unit='KiB'>
+ <distances>
+ <sibling id='0' value='41'/>
+ <sibling id='1' value='31'/>
+ <sibling id='2' value='21'/>
+ <sibling id='3' value='10'/>
+ </distances>
+ </cell>
+ </numa>
+ ...
+</cpu>
+...</pre>
+
+ <p>
+ With the Xen driver, if no <code>distances</code> are given to describe
+ the SLIT data between different cells, it will default to a scheme
+ using 10 for local and 20 for remote distances.
</p>
<h3><a id="elementsEvents">Events configuration</a></h3>
diff --git a/docs/schemas/basictypes.rng b/docs/schemas/basictypes.rng
index 1ea667c..1a18cd3 100644
--- a/docs/schemas/basictypes.rng
+++ b/docs/schemas/basictypes.rng
@@ -77,6 +77,13 @@
</choice>
</define>
+ <define name="numaDistanceValue">
+ <data type="unsignedInt">
+ <param name="minInclusive">10</param>
+ <param name="maxInclusive">255</param>
+ </data>
+ </define>
+
<define name="pciaddress">
<optional>
<attribute name="domain">
diff --git a/docs/schemas/cputypes.rng b/docs/schemas/cputypes.rng
index 3eef16a..c45b6df 100644
--- a/docs/schemas/cputypes.rng
+++ b/docs/schemas/cputypes.rng
@@ -129,6 +129,24 @@
</choice>
</attribute>
</optional>
+ <optional>
+ <element name="distances">
+ <oneOrMore>
+ <ref name="numaDistance"/>
+ </oneOrMore>
+ </element>
+ </optional>
+ </element>
+ </define>
+
+ <define name="numaDistance">
+ <element name="sibling">
+ <attribute name="id">
+ <ref name="unsignedInt"/>
+ </attribute>
+ <attribute name="value">
+ <ref name="numaDistanceValue"/>
+ </attribute>
</element>
</define>
diff --git a/src/conf/numa_conf.c b/src/conf/numa_conf.c
index b71dc01..5db4311 100644
--- a/src/conf/numa_conf.c
+++ b/src/conf/numa_conf.c
@@ -29,6 +29,15 @@
#include "virnuma.h"
#include "virstring.h"
+/*
+ * Distance definitions conforming to the ACPI 2.0 SLIT.
+ * See include/linux/topology.h
+ */
+#define LOCAL_DISTANCE 10
+#define REMOTE_DISTANCE 20
+/* SLIT entry value is a one-byte unsigned integer. */
+#define UNREACHABLE 255
+
#define VIR_FROM_THIS VIR_FROM_DOMAIN
VIR_ENUM_IMPL(virDomainNumatuneMemMode,
@@ -48,6 +57,8 @@ VIR_ENUM_IMPL(virDomainMemoryAccess, VIR_DOMAIN_MEMORY_ACCESS_LAST,
"shared",
"private")
+typedef struct _virDomainNumaDistance virDomainNumaDistance;
+typedef virDomainNumaDistance *virDomainNumaDistancePtr;
typedef struct _virDomainNumaNode virDomainNumaNode;
typedef virDomainNumaNode *virDomainNumaNodePtr;
@@ -66,6 +77,12 @@ struct _virDomainNuma {
virBitmapPtr nodeset; /* host memory nodes where this guest node resides */
virDomainNumatuneMemMode mode; /* memory mode selection */
virDomainMemoryAccess memAccess; /* shared memory access configuration */
+
+ struct _virDomainNumaDistance {
+ unsigned int value; /* locality value for node i->j or j->i */
+ unsigned int cellid;
+ } *distances; /* remote node distances */
+ size_t ndistances;
} *mem_nodes; /* guest node configuration */
size_t nmem_nodes;
@@ -686,6 +703,153 @@ virDomainNumatuneNodesetIsAvailable(virDomainNumaPtr numatune,
}
+static int
+virDomainNumaDefNodeDistanceParseXML(virDomainNumaPtr def,
+ xmlXPathContextPtr ctxt,
+ unsigned int cur_cell)
+{
+ int ret = -1;
+ int sibling;
+ char *tmp = NULL;
+ xmlNodePtr *nodes = NULL;
+ size_t i, ndistances = def->nmem_nodes;
+
+ if (!ndistances)
+ return 0;
+
+ /* check if NUMA distances definition is present */
+ if (!virXPathNode("./distances[1]", ctxt))
+ return 0;
+
+ if ((sibling = virXPathNodeSet("./distances[1]/sibling", ctxt, &nodes)) <= 0) {
+ virReportError(VIR_ERR_XML_ERROR, "%s",
+ _("NUMA distances defined without siblings"));
+ goto cleanup;
+ }
+
+ for (i = 0; i < sibling; i++) {
+ virDomainNumaDistancePtr ldist, rdist;
+ unsigned int sibling_id, sibling_value;
+
+ /* siblings are in order of parsing or explicitly numbered */
+ if (!(tmp = virXMLPropString(nodes[i], "id"))) {
+ virReportError(VIR_ERR_XML_ERROR,
+ _("Missing 'id' attribute in NUMA "
+ "distances under 'cell id %d'"),
+ cur_cell);
+ goto cleanup;
+ }
+
+ /* The "id" needs to be applicable */
+ if (virStrToLong_uip(tmp, NULL, 10, &sibling_id) < 0) {
+ virReportError(VIR_ERR_XML_ERROR,
+ _("Invalid 'id' attribute in NUMA "
+ "distances for sibling: '%s'"),
+ tmp);
+ goto cleanup;
+ }
+ VIR_FREE(tmp);
+
+ /* The "id" needs to be within numa/cell range */
+ if (sibling_id >= ndistances) {
+ virReportError(VIR_ERR_XML_ERROR,
+ _("There is no cell administrated matching "
+ "'sibling_id %d' under NUMA 'cell id %d' "),
+ sibling_id, cur_cell);
+ goto cleanup;
+ }
+
+ /* We need a locality value. Check and correct
+ * distance to local and distance to remote node.
+ */
+ if (!(tmp = virXMLPropString(nodes[i], "value"))) {
+ virReportError(VIR_ERR_XML_ERROR,
+ _("Missing 'value' attribute in NUMA distances "
+ "under 'cell id %d' for 'sibling id %d'"),
+ cur_cell, sibling_id);
+ goto cleanup;
+ }
+
+ /* The "value" needs to be applicable */
+ if (virStrToLong_uip(tmp, NULL, 10, &sibling_value) < 0) {
+ virReportError(VIR_ERR_XML_ERROR,
+ _("Invalid 'value' attribute in NUMA "
+ "distances for value: '%s'"),
+ tmp);
+ goto cleanup;
+ }
+ VIR_FREE(tmp);
+
+ /* LOCAL_DISTANCE <= "value" <= UNREACHABLE */
+ if (sibling_value < LOCAL_DISTANCE ||
+ sibling_value > UNREACHABLE) {
+ virReportError(VIR_ERR_XML_ERROR,
+ _("Out of range value '%d' set for "
+ "'sibling id %d' under NUMA 'cell id %d' "),
+ sibling_value, sibling_id, cur_cell);
+ goto cleanup;
+ }
+
+ ldist = def->mem_nodes[cur_cell].distances;
+ if (!ldist) {
+ if (def->mem_nodes[cur_cell].ndistances) {
+ virReportError(VIR_ERR_XML_ERROR,
+ _("Invalid 'ndistances' set in NUMA "
+ "distances for sibling id: '%d'"),
+ cur_cell);
+ goto cleanup;
+ }
+
+ if (VIR_ALLOC_N(ldist, ndistances) < 0)
+ goto cleanup;
+
+ if (!ldist[cur_cell].value)
+ ldist[cur_cell].value = LOCAL_DISTANCE;
+ ldist[cur_cell].cellid = cur_cell;
+ def->mem_nodes[cur_cell].ndistances = ndistances;
+ }
+
+ ldist[sibling_id].cellid = sibling_id;
+ ldist[sibling_id].value = sibling_value;
+ def->mem_nodes[cur_cell].distances = ldist;
+
+ rdist = def->mem_nodes[sibling_id].distances;
+ if (!rdist) {
+ if (def->mem_nodes[sibling_id].ndistances) {
+ virReportError(VIR_ERR_XML_ERROR,
+ _("Invalid 'ndistances' set in NUMA "
+ "distances for sibling id: '%d'"),
+ sibling_id);
+ goto cleanup;
+ }
+
+ if (VIR_ALLOC_N(rdist, ndistances) < 0)
+ goto cleanup;
+
+ if (!rdist[sibling_id].value)
+ rdist[sibling_id].value = LOCAL_DISTANCE;
+ rdist[sibling_id].cellid = sibling_id;
+ def->mem_nodes[sibling_id].ndistances = ndistances;
+ }
+
+ rdist[cur_cell].cellid = cur_cell;
+ rdist[cur_cell].value = sibling_value;
+ def->mem_nodes[sibling_id].distances = rdist;
+ }
+
+ ret = 0;
+
+ cleanup:
+ if (ret) {
+ for (i = 0; i < ndistances; i++)
+ VIR_FREE(def->mem_nodes[i].distances);
+ }
+ VIR_FREE(nodes);
+ VIR_FREE(tmp);
+
+ return ret;
+}
+
int
virDomainNumaDefCPUParseXML(virDomainNumaPtr def,
xmlXPathContextPtr ctxt)
@@ -694,7 +858,7 @@ virDomainNumaDefCPUParseXML(virDomainNumaPtr def,
xmlNodePtr oldNode = ctxt->node;
char *tmp = NULL;
int n;
- size_t i;
+ size_t i, j;
int ret = -1;
/* check if NUMA definition is present */
@@ -712,7 +876,6 @@ virDomainNumaDefCPUParseXML(virDomainNumaPtr def,
def->nmem_nodes = n;
for (i = 0; i < n; i++) {
- size_t j;
int rc;
unsigned int cur_cell = i;
@@ -788,6 +951,10 @@ virDomainNumaDefCPUParseXML(virDomainNumaPtr def,
def->mem_nodes[cur_cell].memAccess = rc;
VIR_FREE(tmp);
}
+
+ /* Parse NUMA distances info */
+ if (virDomainNumaDefNodeDistanceParseXML(def, ctxt, cur_cell) < 0)
+ goto cleanup;
}
ret = 0;
@@ -815,6 +982,8 @@ virDomainNumaDefCPUFormatXML(virBufferPtr buf,
virBufferAddLit(buf, "<numa>\n");
virBufferAdjustIndent(buf, 2);
for (i = 0; i < ncells; i++) {
+ int ndistances;
+
memAccess = virDomainNumaGetNodeMemoryAccessMode(def, i);
if (!(cpustr = virBitmapFormat(virDomainNumaGetNodeCpumask(def, i))))
@@ -829,7 +998,32 @@ virDomainNumaDefCPUFormatXML(virBufferPtr buf,
if (memAccess)
virBufferAsprintf(buf, " memAccess='%s'",
virDomainMemoryAccessTypeToString(memAccess));
- virBufferAddLit(buf, "/>\n");
+
+ ndistances = def->mem_nodes[i].ndistances;
+ if (!ndistances) {
+ virBufferAddLit(buf, "/>\n");
+ } else {
+ size_t j;
+ virDomainNumaDistancePtr distances = def->mem_nodes[i].distances;
+
+ virBufferAddLit(buf, ">\n");
+ virBufferAdjustIndent(buf, 2);
+ virBufferAddLit(buf, "<distances>\n");
+ virBufferAdjustIndent(buf, 2);
+ for (j = 0; j < ndistances; j++) {
+ if (distances[j].value) {
+ virBufferAddLit(buf, "<sibling");
+ virBufferAsprintf(buf, " id='%d'", distances[j].cellid);
+ virBufferAsprintf(buf, " value='%d'", distances[j].value);
+ virBufferAddLit(buf, "/>\n");
+ }
+ }
+ virBufferAdjustIndent(buf, -2);
+ virBufferAddLit(buf, "</distances>\n");
+ virBufferAdjustIndent(buf, -2);
+ virBufferAddLit(buf, "</cell>\n");
+ }
+
VIR_FREE(cpustr);
}
virBufferAdjustIndent(buf, -2);
--
2.9.5
On 09/08/2017 08:47 AM, Wim Ten Have wrote:
> From: Wim ten Have <wim.ten.have@oracle.com>
>
> Add libvirtd NUMA cell domain administration functionality to
> describe underlying cell id sibling distances in full fashion
> when configuring HVM guests.
May I suggest wording this paragraph as:
Add support for describing sibling vCPU distances within a domain's vNUMA cell
configuration.
> Schema updates are made to docs/schemas/cputypes.rng enforcing domain
> administration to follow the syntax below the numa cell id and
> docs/schemas/basictypes.rng to add "numaDistanceValue".
I'm not sure this paragraph is needed in the commit message.
> A minimum value of 10 representing the LOCAL_DISTANCE as 0-9 are
> reserved values and can not be used as System Locality Distance Information.
> A value of 20 represents the default setting of REMOTE_DISTANCE
> where a maximum value of 255 represents UNREACHABLE.
>
> Effectively any cell sibling can be assigned a distance value where
> practically 'LOCAL_DISTANCE <= value <= UNREACHABLE'.
>
> [below is an example of a 4 node setup]
>
> <cpu>
> <numa>
> <cell id='0' cpus='0' memory='2097152' unit='KiB'>
> <distances>
> <sibling id='0' value='10'/>
> <sibling id='1' value='21'/>
> <sibling id='2' value='31'/>
> <sibling id='3' value='41'/>
> </distances>
> </cell>
> <cell id='1' cpus='1' memory='2097152' unit='KiB'>
> <distances>
> <sibling id='0' value='21'/>
> <sibling id='1' value='10'/>
> <sibling id='2' value='31'/>
> <sibling id='3' value='41'/>
> </distances>
> </cell>
> <cell id='2' cpus='2' memory='2097152' unit='KiB'>
> <distances>
> <sibling id='0' value='31'/>
> <sibling id='1' value='21'/>
> <sibling id='2' value='10'/>
> <sibling id='3' value='21'/>
> </distances>
> </cell>
> <cell id='3' cpus='3' memory='2097152' unit='KiB'>
> <distances>
> <sibling id='0' value='41'/>
> <sibling id='1' value='31'/>
> <sibling id='2' value='21'/>
> <sibling id='3' value='10'/>
> </distances>
> </cell>
> </numa>
> </cpu>
How would this look when having more than one cpu in a cell? I suppose something
like
<cpu>
<numa>
<cell id='0' cpus='0-3' memory='2097152' unit='KiB'>
<distances>
<sibling id='0' value='10'/>
<sibling id='1' value='10'/>
<sibling id='2' value='10'/>
<sibling id='3' value='10'/>
<sibling id='4' value='21'/>
<sibling id='5' value='21'/>
<sibling id='6' value='21'/>
<sibling id='7' value='21'/>
</distances>
</cell>
<cell id='1' cpus='4-7' memory='2097152' unit='KiB'>
<distances>
<sibling id='0' value='21'/>
<sibling id='1' value='21'/>
<sibling id='2' value='21'/>
<sibling id='3' value='21'/>
<sibling id='4' value='10'/>
<sibling id='5' value='10'/>
<sibling id='6' value='10'/>
<sibling id='7' value='10'/>
</distances>
</cell>
</numa>
</cpu>
In the V3 thread you mentioned "And to reduce even more we could also
remove LOCAL_DISTANCES as they make a constant factor where; (cell_id ==
sibling_id)". In the above example cell_id 1 == sibling_id 1, but it is not
LOCAL_DISTANCE.
> Whenever a sibling id the cell LOCAL_DISTANCE does apply and for any
> sibling id not being covered a default of REMOTE_DISTANCE is used
> for internal computations.
I'm having a hard time understanding this sentence...
I didn't look closely at the patch since I'd like to understand how multi-cpu
cells are handled before doing so.
Regards,
Jim
On Fri, 6 Oct 2017 08:49:46 -0600
Jim Fehlig <jfehlig@suse.com> wrote:
> On 09/08/2017 08:47 AM, Wim Ten Have wrote:
> > From: Wim ten Have <wim.ten.have@oracle.com>
> >
> > Add libvirtd NUMA cell domain administration functionality to
> > describe underlying cell id sibling distances in full fashion
> > when configuring HVM guests.
>
> May I suggest wording this paragraph as:
>
> Add support for describing sibling vCPU distances within a domain's vNUMA cell
> configuration.
See below (v5 comment).
> > Schema updates are made to docs/schemas/cputypes.rng enforcing domain
> > administration to follow the syntax below the numa cell id and
> > docs/schemas/basictypes.rng to add "numaDistanceValue".
>
> I'm not sure this paragraph is needed in the commit message.
>
> > A minimum value of 10 representing the LOCAL_DISTANCE as 0-9 are
> > reserved values and can not be used as System Locality Distance Information.
> > A value of 20 represents the default setting of REMOTE_DISTANCE
> > where a maximum value of 255 represents UNREACHABLE.
> >
> > Effectively any cell sibling can be assigned a distance value where
> > practically 'LOCAL_DISTANCE <= value <= UNREACHABLE'.
> >
> > [below is an example of a 4 node setup]
> >
> > <cpu>
> > <numa>
> > <cell id='0' cpus='0' memory='2097152' unit='KiB'>
> > <distances>
> > <sibling id='0' value='10'/>
> > <sibling id='1' value='21'/>
> > <sibling id='2' value='31'/>
> > <sibling id='3' value='41'/>
> > </distances>
> > </cell>
> > <cell id='1' cpus='1' memory='2097152' unit='KiB'>
> > <distances>
> > <sibling id='0' value='21'/>
> > <sibling id='1' value='10'/>
> > <sibling id='2' value='31'/>
> > <sibling id='3' value='41'/>
> > </distances>
> > </cell>
> > <cell id='2' cpus='2' memory='2097152' unit='KiB'>
> > <distances>
> > <sibling id='0' value='31'/>
> > <sibling id='1' value='21'/>
> > <sibling id='2' value='10'/>
> > <sibling id='3' value='21'/>
> > </distances>
> > </cell>
> > <cell id='3' cpus='3' memory='2097152' unit='KiB'>
> > <distances>
> > <sibling id='0' value='41'/>
> > <sibling id='1' value='31'/>
> > <sibling id='2' value='21'/>
> > <sibling id='3' value='10'/>
> > </distances>
> > </cell>
> > </numa>
> > </cpu>
>
> How would this look when having more than one cpu in a cell? I suppose something
> like
>
> <cpu>
> <numa>
> <cell id='0' cpus='0-3' memory='2097152' unit='KiB'>
> <distances>
> <sibling id='0' value='10'/>
> <sibling id='1' value='10'/>
> <sibling id='2' value='10'/>
> <sibling id='3' value='10'/>
> <sibling id='4' value='21'/>
> <sibling id='5' value='21'/>
> <sibling id='6' value='21'/>
> <sibling id='7' value='21'/>
> </distances>
> </cell>
> <cell id='1' cpus='4-7' memory='2097152' unit='KiB'>
> <distances>
> <sibling id='0' value='21'/>
> <sibling id='1' value='21'/>
> <sibling id='2' value='21'/>
> <sibling id='3' value='21'/>
> <sibling id='4' value='10'/>
> <sibling id='5' value='10'/>
> <sibling id='6' value='10'/>
> <sibling id='7' value='10'/>
> </distances>
> </cell>
> </numa>
> </cpu>
Nope. That machine seems to make a 2 node vNUMA setup.
Where;
* NUMA node(0) defined by <cell id='0'> holds 4 (cores)
cpus '0-3' with 2GByte of dedicated memory.
* NUMA node(1) defined by <cell id='1'> holds 4 (cores)
cpus '4-7' with 2GByte of dedicated memory.
<cpu>
<numa>
<cell id='0' cpus='0-3' memory='2097152' unit='KiB'>
<distances>
<sibling id='0' value='10'/>
<sibling id='1' value='21'/>
</distances>
</cell>
<cell id='1' cpus='4-7' memory='2097152' unit='KiB'>
<distances>
<sibling id='0' value='21'/>
<sibling id='1' value='10'/>
</distances>
</cell>
</numa>
</cpu>
Specific configuration would typically report the following when examined
from within the guest domain (ignoring, for this example, that it _DOES_
concern a single socket 8 cpu machine).
[root@f25 ~]# lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 8
On-line CPU(s) list: 0-7
Thread(s) per core: 1
Core(s) per socket: 8
Socket(s): 1
*> NUMA node(s): 2
Vendor ID: AuthenticAMD
CPU family: 21
Model: 2
Model name: AMD FX-8320E Eight-Core Processor
Stepping: 0
CPU MHz: 3210.862
BogoMIPS: 6421.83
Virtualization: AMD-V
Hypervisor vendor: Xen
Virtualization type: full
L1d cache: 16K
L1i cache: 64K
L2 cache: 2048K
L3 cache: 8192K
*> NUMA node0 CPU(s): 0-3
*> NUMA node1 CPU(s): 4-7
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm rep_good nopl cpuid extd_apicid pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 x2apic popcnt aes xsave avx f16c hypervisor lahf_lm svm cr8_legacy abm sse4a misalignsse 3dnowprefetch ibs xop lwp fma4 tbm vmmcall bmi1 arat npt lbrv nrip_save tsc_scale vmcb_clean decodeassists pausefilter
[root@f25 ~]# numactl -H
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3
node 0 size: 1990 MB
node 0 free: 1786 MB
node 1 cpus: 4 5 6 7
node 1 size: 1950 MB
node 1 free: 1820 MB
node distances:
node   0   1
  0:  10  21
  1:  21  10
> In the V3 thread you mentioned "And to reduce even more we could also
> remove LOCAL_DISTANCES as they make a constant factor where; (cell_id ==
> sibling_id)". In the above example cell_id 1 == sibling_id 1, but it is not
> LOCAL_DISTANCE.
>
> > Whenever a sibling id the cell LOCAL_DISTANCE does apply and for any
> > sibling id not being covered a default of REMOTE_DISTANCE is used
> > for internal computations.
>
> I'm having a hard time understanding this sentence...
Me.2
> I didn't look closely at the patch since I'd like to understand how multi-cpu
> cells are handled before doing so.
Let me prepare v5. I found a silly error in the code, which is now fixed,
and given the confusion noted above I would like to take a better approach
in the commit messages and within the cover letter.
Regards,
- Wim.
On 10/12/2017 04:37 AM, Wim ten Have wrote:
> On Fri, 6 Oct 2017 08:49:46 -0600
> Jim Fehlig <jfehlig@suse.com> wrote:
>
> [...]
>
> Nope. That machine seems to make a 2 node vNUMA setup.
>
> Where;
> * NUMA node(0) defined by <cell id='0'> holds 4 (cores)
> cpus '0-3' with 2GByte of dedicated memory.
> * NUMA node(1) defined by <cell id='1'> holds 4 (cores)
> cpus '4-7' with 2GByte of dedicated memory.

Correct.

> <cpu>
> <numa>
> <cell id='0' cpus='0-3' memory='2097152' unit='KiB'>
> <distances>
> <sibling id='0' value='10'/>
> <sibling id='1' value='21'/>
> </distances>
> </cell>
> <cell id='1' cpus='4-7' memory='2097152' unit='KiB'>
> <distances>
> <sibling id='0' value='21'/>
> <sibling id='1' value='10'/>
> </distances>
> </cell>
> </numa>
> </cpu>

Duh. sibling id='x' refers to cell with id 'x'. For some reason I had it
stuck in my head that it referred to vcpu with id 'x'.

> Specific configuration would typically report the following when examined
> from within the guest domain (ignoring, for this example, that it _DOES_
> concern a single socket 8 cpu machine).
>
> [...]
>
> [root@f25 ~]# numactl -H
> available: 2 nodes (0-1)
> node 0 cpus: 0 1 2 3
> node 0 size: 1990 MB
> node 0 free: 1786 MB
> node 1 cpus: 4 5 6 7
> node 1 size: 1950 MB
> node 1 free: 1820 MB
> node distances:
> node   0   1
>   0:  10  21
>   1:  21  10

Right, got it.

>> In the V3 thread you mentioned "And to reduce even more we could also
>> remove LOCAL_DISTANCES as they make a constant factor where; (cell_id ==
>> sibling_id)". In the above example cell_id 1 == sibling_id 1, but it is not
>> LOCAL_DISTANCE.
>>
>>> Whenever a sibling id the cell LOCAL_DISTANCE does apply and for any
>>> sibling id not being covered a default of REMOTE_DISTANCE is used
>>> for internal computations.
>>
>> I'm having a hard time understanding this sentence...
>
> Me.2
>
>> I didn't look closely at the patch since I'd like to understand how multi-cpu
>> cells are handled before doing so.
>
> Let me prepare v5. I found a silly error in the code, which is now fixed,
> and given the confusion noted above I would like to take a better approach
> in the commit messages and within the cover letter.

Thanks. Hopefully I'll have time to review it without much delay.

Regards,
Jim