From 44b1a709960e6d3b672fb0a710f58f2759f92c61 Mon Sep 17 00:00:00 2001
From: Rohan Barar
Date: Mon, 22 Jul 2024 17:34:38 +1000
Subject: [PATCH] Improved CPU pinning documentation

---
 docs/libvirt.md | 105 +++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 103 insertions(+), 2 deletions(-)

diff --git a/docs/libvirt.md b/docs/libvirt.md
index b214b3a..27d2fae 100644
--- a/docs/libvirt.md
+++ b/docs/libvirt.md
@@ -124,10 +124,111 @@ Together, these components form a powerful and flexible virtualization stack, wi

-10. (Optional) Configure 'CPU pinning' by following [this excellent guide](https://wiki.archlinux.org/title/PCI_passthrough_via_OVMF#CPU_pinning).
+10. (Optional) Assign specific physical CPU cores to the virtual machine. This can improve performance by reducing context switching and ensuring that the virtual machine's workload consistently uses the same cores, leading to better CPU cache utilisation.
+    1. Run `lscpu -e` to determine which L1, L2 and L3 caches are associated with which CPU cores.
+
+        Example 1 (Intel 11th Gen Core i7-1185G7):
+        ```
+        CPU NODE SOCKET CORE L1d:L1i:L2:L3 ONLINE    MAXMHZ   MINMHZ
+          0    0      0    0 0:0:0:0          yes 4800.0000 400.0000
+          1    0      0    1 1:1:1:0          yes 4800.0000 400.0000
+          2    0      0    2 2:2:2:0          yes 4800.0000 400.0000
+          3    0      0    3 3:3:3:0          yes 4800.0000 400.0000
+          4    0      0    0 0:0:0:0          yes 4800.0000 400.0000
+          5    0      0    1 1:1:1:0          yes 4800.0000 400.0000
+          6    0      0    2 2:2:2:0          yes 4800.0000 400.0000
+          7    0      0    3 3:3:3:0          yes 4800.0000 400.0000
+        ```
+
+        - C0 = T0+T4 → L10+L20+L30
+        - C1 = T1+T5 → L11+L21+L30
+        - C2 = T2+T6 → L12+L22+L30
+        - C3 = T3+T7 → L13+L23+L30
+
+        Example 2 (AMD Ryzen 5 1600):
+        ```
+        CPU NODE SOCKET CORE L1d:L1i:L2:L3 ONLINE    MAXMHZ    MINMHZ
+          0    0      0    0 0:0:0:0          yes 3800.0000 1550.0000
+          1    0      0    0 0:0:0:0          yes 3800.0000 1550.0000
+          2    0      0    1 1:1:1:0          yes 3800.0000 1550.0000
+          3    0      0    1 1:1:1:0          yes 3800.0000 1550.0000
+          4    0      0    2 2:2:2:0          yes 3800.0000 1550.0000
+          5    0      0    2 2:2:2:0          yes 3800.0000 1550.0000
+          6    0      0    3 3:3:3:1          yes 3800.0000 1550.0000
+          7    0      0    3 3:3:3:1          yes 3800.0000 1550.0000
+          8    0      0    4 4:4:4:1          yes 3800.0000 1550.0000
+          9    0      0    4 4:4:4:1          yes 3800.0000 1550.0000
+         10    0      0    5 5:5:5:1          yes 3800.0000 1550.0000
+         11    0      0    5 5:5:5:1          yes 3800.0000 1550.0000
+        ```
+
+        - C0 = T0+T1 → L10+L20+L30
+        - C1 = T2+T3 → L11+L21+L30
+        - C2 = T4+T5 → L12+L22+L30
+        - C3 = T6+T7 → L13+L23+L31
+        - C4 = T8+T9 → L14+L24+L31
+        - C5 = T10+T11 → L15+L25+L31
+
+    2. Select which CPU cores to 'pin'. You should aim to select a combination of CPU cores that minimises sharing of caches between Windows and GNU/Linux.
+
+        Example 1:
+        - CPU cores share the same singular L3 cache, so this cannot be optimised.
+        - CPU cores utilise different L1 and L2 caches, so isolating corresponding thread pairs will help improve performance.
+        - Thus, if limiting the virtual machine to a maximum of 4 threads, there are 10 possible optimal configurations:
+            - T0+T4
+            - T1+T5
+            - T2+T6
+            - T3+T7
+            - T0+T4+T1+T5
+            - T0+T4+T2+T6
+            - T0+T4+T3+T7
+            - T1+T5+T2+T6
+            - T1+T5+T3+T7
+            - T2+T6+T3+T7
+
+        Example 2:
+        - Threads 0-5 utilise one L3 cache whereas threads 6-11 utilise a different L3 cache. Thus, one of these two sets of threads should be pinned to the virtual machine.
+        - Pinning and isolating fewer than these (e.g. threads 8-11) would result in the host system making use of the L3 cache in threads 6 and 7, resulting in cache evictions and therefore bad performance.
+        - Thus, there are only two possible optimal configurations:
+            - T0+T1+T2+T3+T4+T5
+            - T6+T7+T8+T9+T10+T11
+
+    3. Prepare and add/modify the following to the ``, `` and `` sections, adjusting the values to match your selected threads.
+
+        Example 1: The following selects 'T2+T6+T3+T7'.
+
+        ```xml
+        4
+
+
+
+
+
+
+
+
+        ```
+
+        Example 2: The following selects 'T6+T7+T8+T9+T10+T11'.
+
+        ```xml
+        6
+
+
+
+
+
+
+
+
+
+
+        ```

 > [!NOTE]
-> CPU pinning involves assigning specific physical CPU cores to a virtual machine. This can improve performance by reducing context switching and ensuring that the VM's workload consistently uses the same cores, leading to better CPU cache utilisation.
+> More information on configuring CPU pinning can be found in [this excellent guide](https://wiki.archlinux.org/title/PCI_passthrough_via_OVMF#CPU_pinning).

 11. Navigate to the `XML` tab, and edit the `` section to disable all timers except for the hypervclock, thereby drastically reducing idle CPU usage. Once changed, click `Apply`.
```xml