[edk2] [PATCH 4/6] UefiCpuPkg/PiSmmCpuDxeSmm: Enable NXE if it's supported

Jian J Wang posted 6 patches 6 years, 11 months ago
[edk2] [PATCH 4/6] UefiCpuPkg/PiSmmCpuDxeSmm: Enable NXE if it's supported
Posted by Jian J Wang 6 years, 11 months ago
If PcdDxeNxMemoryProtectionPolicy is set to enable protection for memory
of EfiBootServicesCode, EfiConventionalMemory, the BIOS will hang at a page
fault exception triggered by PiSmmCpuDxeSmm.

The root cause is that PiSmmCpuDxeSmm will access default SMM RAM starting
at 0x30000 which is marked as non-executable, but NX feature was not
enabled during SMM initialization. Accessing memory which has invalid
attributes set will cause page fault exception. This patch fixes it by
checking NX capability in cpuid and enable NXE in EFER MSR if it's
available.

Cc: Jiewen Yao <jiewen.yao@intel.com>
Cc: Ruiyu Ni <ruiyu.ni@intel.com>
Cc: Eric Dong <eric.dong@intel.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Jian J Wang <jian.j.wang@intel.com>
---
 UefiCpuPkg/PiSmmCpuDxeSmm/Ia32/SmmInit.nasm | 14 ++++++++++++++
 UefiCpuPkg/PiSmmCpuDxeSmm/X64/SmmInit.nasm  | 12 +++++++++++-
 2 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/UefiCpuPkg/PiSmmCpuDxeSmm/Ia32/SmmInit.nasm b/UefiCpuPkg/PiSmmCpuDxeSmm/Ia32/SmmInit.nasm
index d9df3626c7..db172f108a 100644
--- a/UefiCpuPkg/PiSmmCpuDxeSmm/Ia32/SmmInit.nasm
+++ b/UefiCpuPkg/PiSmmCpuDxeSmm/Ia32/SmmInit.nasm
@@ -42,6 +42,11 @@ ASM_PFX(gcSmiInitGdtr):
 
 global ASM_PFX(SmmStartup)
 ASM_PFX(SmmStartup):
+    DB      0x66
+    mov     eax, 0x80000001             ; read capability
+    cpuid
+    DB      0x66
+    mov     ebx, edx                    ; rdmsr will change edx. keep it in ebx.
     DB      0x66, 0xb8
 ASM_PFX(gSmmCr3): DD 0
     mov     cr3, eax
@@ -50,6 +55,15 @@ ASM_PFX(gSmmCr3): DD 0
     DB      0x66, 0xb8
 ASM_PFX(gSmmCr4): DD 0
     mov     cr4, eax
+    DB      0x66
+    mov     ecx, 0xc0000080             ; IA32_EFER MSR
+    rdmsr
+    DB      0x66
+    test    ebx, BIT20                  ; check NXE capability
+    jz      .1
+    or      ah, BIT3                    ; set NXE bit
+    wrmsr
+.1:
     DB      0x66, 0xb8
 ASM_PFX(gSmmCr0): DD 0
     DB      0xbf, PROTECT_MODE_DS, 0    ; mov di, PROTECT_MODE_DS
diff --git a/UefiCpuPkg/PiSmmCpuDxeSmm/X64/SmmInit.nasm b/UefiCpuPkg/PiSmmCpuDxeSmm/X64/SmmInit.nasm
index 9d05e2cb05..2a3a1141c3 100644
--- a/UefiCpuPkg/PiSmmCpuDxeSmm/X64/SmmInit.nasm
+++ b/UefiCpuPkg/PiSmmCpuDxeSmm/X64/SmmInit.nasm
@@ -42,6 +42,11 @@ ASM_PFX(gcSmiInitGdtr):
 
 global ASM_PFX(SmmStartup)
 ASM_PFX(SmmStartup):
+    DB      0x66
+    mov     eax, 0x80000001             ; read capability
+    cpuid
+    DB      0x66
+    mov     ebx, edx                    ; rdmsr will change edx. keep it in ebx.
     DB      0x66, 0xb8                   ; mov eax, imm32
 ASM_PFX(gSmmCr3): DD 0
     mov     cr3, rax
@@ -54,7 +59,12 @@ ASM_PFX(gSmmCr4): DD 0
     DB      0x66
     mov     ecx, 0xc0000080             ; IA32_EFER MSR
     rdmsr
-    or      ah, 1                       ; set LME bit
+    or      ah, BIT0                    ; set LME bit
+    DB      0x66
+    test    ebx, BIT20                  ; check NXE capability
+    jz      .1
+    or      ah, BIT3                    ; set NXE bit
+.1:
     wrmsr
     DB      0x66, 0xb8                   ; mov eax, imm32
 ASM_PFX(gSmmCr0): DD 0
-- 
2.15.1.windows.2

_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel
Re: [edk2] [PATCH 4/6] UefiCpuPkg/PiSmmCpuDxeSmm: Enable NXE if it's supported
Posted by Dong, Eric 6 years, 11 months ago
Reviewed-by: Eric Dong <eric.dong@intel.com>

> -----Original Message-----
> From: Wang, Jian J
> Sent: Monday, January 15, 2018 4:55 PM
> To: edk2-devel@lists.01.org
> Cc: Yao, Jiewen <jiewen.yao@intel.com>; Ni, Ruiyu <ruiyu.ni@intel.com>;
> Dong, Eric <eric.dong@intel.com>; Laszlo Ersek <lersek@redhat.com>
> Subject: [PATCH 4/6] UefiCpuPkg/PiSmmCpuDxeSmm: Enable NXE if it's
> supported
> 
> If PcdDxeNxMemoryProtectionPolicy is set to enable protection for memory
> of EfiBootServicesCode, EfiConventionalMemory, the BIOS will hang at a
> page fault exception triggered by PiSmmCpuDxeSmm.
> 
> The root cause is that PiSmmCpuDxeSmm will access default SMM RAM
> starting at 0x30000 which is marked as non-executable, but NX feature was
> not enabled during SMM initialization. Accessing memory which has invalid
> attributes set will cause page fault exception. This patch fixes it by checking
> NX capability in cpuid and enable NXE in EFER MSR if it's available.
> 
> Cc: Jiewen Yao <jiewen.yao@intel.com>
> Cc: Ruiyu Ni <ruiyu.ni@intel.com>
> Cc: Eric Dong <eric.dong@intel.com>
> Cc: Laszlo Ersek <lersek@redhat.com>
> Contributed-under: TianoCore Contribution Agreement 1.1
> Signed-off-by: Jian J Wang <jian.j.wang@intel.com>
> ---
>  UefiCpuPkg/PiSmmCpuDxeSmm/Ia32/SmmInit.nasm | 14 ++++++++++++++
> UefiCpuPkg/PiSmmCpuDxeSmm/X64/SmmInit.nasm  | 12 +++++++++++-
>  2 files changed, 25 insertions(+), 1 deletion(-)
> 
> diff --git a/UefiCpuPkg/PiSmmCpuDxeSmm/Ia32/SmmInit.nasm
> b/UefiCpuPkg/PiSmmCpuDxeSmm/Ia32/SmmInit.nasm
> index d9df3626c7..db172f108a 100644
> --- a/UefiCpuPkg/PiSmmCpuDxeSmm/Ia32/SmmInit.nasm
> +++ b/UefiCpuPkg/PiSmmCpuDxeSmm/Ia32/SmmInit.nasm
> @@ -42,6 +42,11 @@ ASM_PFX(gcSmiInitGdtr):
> 
>  global ASM_PFX(SmmStartup)
>  ASM_PFX(SmmStartup):
> +    DB      0x66
> +    mov     eax, 0x80000001             ; read capability
> +    cpuid
> +    DB      0x66
> +    mov     ebx, edx                    ; rdmsr will change edx. keep it in ebx.
>      DB      0x66, 0xb8
>  ASM_PFX(gSmmCr3): DD 0
>      mov     cr3, eax
> @@ -50,6 +55,15 @@ ASM_PFX(gSmmCr3): DD 0
>      DB      0x66, 0xb8
>  ASM_PFX(gSmmCr4): DD 0
>      mov     cr4, eax
> +    DB      0x66
> +    mov     ecx, 0xc0000080             ; IA32_EFER MSR
> +    rdmsr
> +    DB      0x66
> +    test    ebx, BIT20                  ; check NXE capability
> +    jz      .1
> +    or      ah, BIT3                    ; set NXE bit
> +    wrmsr
> +.1:
>      DB      0x66, 0xb8
>  ASM_PFX(gSmmCr0): DD 0
>      DB      0xbf, PROTECT_MODE_DS, 0    ; mov di, PROTECT_MODE_DS
> diff --git a/UefiCpuPkg/PiSmmCpuDxeSmm/X64/SmmInit.nasm
> b/UefiCpuPkg/PiSmmCpuDxeSmm/X64/SmmInit.nasm
> index 9d05e2cb05..2a3a1141c3 100644
> --- a/UefiCpuPkg/PiSmmCpuDxeSmm/X64/SmmInit.nasm
> +++ b/UefiCpuPkg/PiSmmCpuDxeSmm/X64/SmmInit.nasm
> @@ -42,6 +42,11 @@ ASM_PFX(gcSmiInitGdtr):
> 
>  global ASM_PFX(SmmStartup)
>  ASM_PFX(SmmStartup):
> +    DB      0x66
> +    mov     eax, 0x80000001             ; read capability
> +    cpuid
> +    DB      0x66
> +    mov     ebx, edx                    ; rdmsr will change edx. keep it in ebx.
>      DB      0x66, 0xb8                   ; mov eax, imm32
>  ASM_PFX(gSmmCr3): DD 0
>      mov     cr3, rax
> @@ -54,7 +59,12 @@ ASM_PFX(gSmmCr4): DD 0
>      DB      0x66
>      mov     ecx, 0xc0000080             ; IA32_EFER MSR
>      rdmsr
> -    or      ah, 1                       ; set LME bit
> +    or      ah, BIT0                    ; set LME bit
> +    DB      0x66
> +    test    ebx, BIT20                  ; check NXE capability
> +    jz      .1
> +    or      ah, BIT3                    ; set NXE bit
> +.1:
>      wrmsr
>      DB      0x66, 0xb8                   ; mov eax, imm32
>  ASM_PFX(gSmmCr0): DD 0
> --
> 2.15.1.windows.2

_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel
Re: [edk2] [PATCH 4/6] UefiCpuPkg/PiSmmCpuDxeSmm: Enable NXE if it's supported
Posted by Laszlo Ersek 6 years, 10 months ago
Hello Jian,

On 01/15/18 09:54, Jian J Wang wrote:
> If PcdDxeNxMemoryProtectionPolicy is set to enable protection for memory
> of EfiBootServicesCode, EfiConventionalMemory, the BIOS will hang at a page
> fault exception triggered by PiSmmCpuDxeSmm.
>
> The root cause is that PiSmmCpuDxeSmm will access default SMM RAM starting
> at 0x30000 which is marked as non-executable, but NX feature was not
> enabled during SMM initialization. Accessing memory which has invalid
> attributes set will cause page fault exception. This patch fixes it by
> checking NX capability in cpuid and enable NXE in EFER MSR if it's
> available.
>
> Cc: Jiewen Yao <jiewen.yao@intel.com>
> Cc: Ruiyu Ni <ruiyu.ni@intel.com>
> Cc: Eric Dong <eric.dong@intel.com>
> Cc: Laszlo Ersek <lersek@redhat.com>
> Contributed-under: TianoCore Contribution Agreement 1.1
> Signed-off-by: Jian J Wang <jian.j.wang@intel.com>
> ---
>  UefiCpuPkg/PiSmmCpuDxeSmm/Ia32/SmmInit.nasm | 14 ++++++++++++++
>  UefiCpuPkg/PiSmmCpuDxeSmm/X64/SmmInit.nasm  | 12 +++++++++++-
>  2 files changed, 25 insertions(+), 1 deletion(-)

This patch also breaks OVMF on KVM. However, the circumstances and the
symptom differ from those of the other regression that I reported under
patch 1/6 in this series [1] [2].

Namely, this affects the IA32 build, with SMM:

$ build -a IA32 -p OvmfPkg/OvmfPkgIa32.dsc -D SECURE_BOOT_ENABLE \
    -D SMM_REQUIRE -t GCC48 -b NOOPT -D HTTP_BOOT_ENABLE

The value of PcdDxeNxMemoryProtectionPolicy is zero in this build too.

The virtual CPU model that this OVMF build runs on does *not* support
NX. (Please refer to "OvmfPkg/README":

> === SMM support ===
>
> [...]
>
> * QEMU binary and options specific to 32-bit guests:
>
>   $ qemu-system-i386 -cpu coreduo,-nx \

)

The boot hangs at the following location:

> Loading SMM driver at 0x0007FFA2000 EntryPoint=0x0007FFA844D PiSmmCpuDxeSmm.efi
> SMRR Base: 0x7F000000, SMRR Size: 0x1000000
> PcdCpuSmmCodeAccessCheckEnable = 1
> mAddressEncMask = 0x0
> SMRAM TileSize = 0x00002000 (0x00001000, 0x00001000)
> SMRAM SaveState Buffer (0x7FF94000, 0x0000E000)
> CPU[000]  APIC ID=0000  SMBASE=7FF8C000  SaveState=7FF9BC00  Size=00000400
> CPU[001]  APIC ID=0001  SMBASE=7FF8E000  SaveState=7FF9DC00  Size=00000400
> CPU[002]  APIC ID=0002  SMBASE=7FF90000  SaveState=7FF9FC00  Size=00000400
> CPU[003]  APIC ID=0003  SMBASE=7FF92000  SaveState=7FFA1C00  Size=00000400
> <--HANG-->

KVM trace (excerpt):

>  1 CPU-4989  [002] 16163.874223: kvm_enter_smm:        vcpu 1: entering SMM, smbase 0x30000
>  2 CPU-4989  [002] 16163.874244: kvm_fpu:              load
>  3 CPU-4989  [002] 16163.874245: kvm_entry:            vcpu 1
>  4 CPU-4989  [002] 16163.874247: kvm_exit:             reason EPT_VIOLATION rip 0x8000 info 184 0
>  5 CPU-4989  [002] 16163.874247: kvm_page_fault:       address 38000 error_code 184
>  6 CPU-4989  [002] 16163.874251: kvm_entry:            vcpu 1
>  7 CPU-4989  [002] 16163.874253: kvm_exit:             reason EPT_VIOLATION rip 0x7ff864ba info 184 0
>  8 CPU-4989  [002] 16163.874253: kvm_page_fault:       address 7ffb64ba error_code 184
>  9 CPU-4989  [002] 16163.874256: kvm_entry:            vcpu 1
> 10 CPU-4989  [002] 16163.874257: kvm_exit:             reason CPUID rip 0x7ff864c0 info 0 0
> 11 CPU-4989  [002] 16163.874258: kvm_cpuid:            func 80000001 rax 6e8 rbx 0 rcx 0 rdx 0
> 12 CPU-4989  [002] 16163.874258: kvm_entry:            vcpu 1
> 13 CPU-4989  [002] 16163.874259: kvm_exit:             reason CR_ACCESS rip 0x7ff864cb info 3 0
> 14 CPU-4989  [002] 16163.874260: kvm_cr:               cr_write 3 = 0x0
> 15 CPU-4989  [002] 16163.874261: kvm_entry:            vcpu 1
> 16 CPU-4989  [002] 16163.874262: kvm_exit:             reason CR_ACCESS rip 0x7ff864db info 4 0
> 17 CPU-4989  [002] 16163.874263: kvm_cr:               cr_write 4 = 0x640
> 18 CPU-4989  [002] 16163.874273: kvm_entry:            vcpu 1
> 19 CPU-4989  [002] 16163.874274: kvm_exit:             reason MSR_READ rip 0x7ff864e4 info 0 0
> 20 CPU-4989  [002] 16163.874275: kvm_msr:              msr_read c0000080 = 0x0
> 21 CPU-4989  [002] 16163.874276: kvm_entry:            vcpu 1
> 22 CPU-4989  [002] 16163.874277: kvm_exit:             reason EPT_VIOLATION rip 0x64f4 info 184 0
> 23 CPU-4989  [002] 16163.874277: kvm_page_fault:       address 364f4 error_code 184
> 24 CPU-4989  [002] 16163.874282: kvm_entry:            vcpu 1
> 25 CPU-4989  [002] 16163.874283: kvm_exit:             reason EPT_VIOLATION rip 0x64f4 info 183 0
> 26 CPU-4989  [002] 16163.874283: kvm_page_fault:       address f000 error_code 183
> 27 CPU-4989  [002] 16163.874288: kvm_entry:            vcpu 1
> 28 CPU-4989  [002] 16163.874292: kvm_exit:             reason EPT_VIOLATION rip 0x7000 info 184 0
> 29 CPU-4989  [002] 16163.874293: kvm_page_fault:       address 37000 error_code 184
> 30 CPU-4989  [002] 16163.874295: kvm_entry:            vcpu 1
> 31 CPU-4989  [002] 16163.874300: kvm_exit:             reason CPUID rip 0x7ff864c0 info 0 0
> 32 CPU-4989  [002] 16163.874301: kvm_cpuid:            func 80000001 rax 6e8 rbx 0 rcx 0 rdx 0
> 33 CPU-4989  [002] 16163.874301: kvm_entry:            vcpu 1
> 34 CPU-4989  [002] 16163.874302: kvm_exit:             reason CR_ACCESS rip 0x7ff864cb info 3 0
> 35 CPU-4989  [002] 16163.874303: kvm_cr:               cr_write 3 = 0x0
> 36 CPU-4989  [002] 16163.874304: kvm_entry:            vcpu 1

After this point, lines 19-36 repeat infinitely.

The above trace corresponds to the following, from
"UefiCpuPkg/PiSmmCpuDxeSmm/Ia32/SmmInit.nasm":

> 246cd9085f806 (Liming Gao  2016-06-14 16:29:40 +0800  43) global ASM_PFX(SmmStartup)
> 246cd9085f806 (Liming Gao  2016-06-14 16:29:40 +0800  44) ASM_PFX(SmmStartup):
> d4d87596c11d6 (Jian J Wang 2018-01-15 10:16:26 +0800  45)     DB      0x66
> d4d87596c11d6 (Jian J Wang 2018-01-15 10:16:26 +0800  46)     mov     eax, 0x80000001             ; read capability
> d4d87596c11d6 (Jian J Wang 2018-01-15 10:16:26 +0800  47)     cpuid
> d4d87596c11d6 (Jian J Wang 2018-01-15 10:16:26 +0800  48)     DB      0x66
> d4d87596c11d6 (Jian J Wang 2018-01-15 10:16:26 +0800  49)     mov     ebx, edx                    ; rdmsr will change edx. keep it in ebx.
> 246cd9085f806 (Liming Gao  2016-06-14 16:29:40 +0800  50)     DB      0x66, 0xb8
> 246cd9085f806 (Liming Gao  2016-06-14 16:29:40 +0800  51) ASM_PFX(gSmmCr3): DD 0
> 246cd9085f806 (Liming Gao  2016-06-14 16:29:40 +0800  52)     mov     cr3, eax
> 246cd9085f806 (Liming Gao  2016-06-14 16:29:40 +0800  53)     DB      0x67, 0x66
> 246cd9085f806 (Liming Gao  2016-06-14 16:29:40 +0800  54)     lgdt    [cs:ebp + (ASM_PFX(gcSmiInitGdtr) - ASM_PFX(SmmStartup))]
> 246cd9085f806 (Liming Gao  2016-06-14 16:29:40 +0800  55)     DB      0x66, 0xb8
> 246cd9085f806 (Liming Gao  2016-06-14 16:29:40 +0800  56) ASM_PFX(gSmmCr4): DD 0
> 246cd9085f806 (Liming Gao  2016-06-14 16:29:40 +0800  57)     mov     cr4, eax
> d4d87596c11d6 (Jian J Wang 2018-01-15 10:16:26 +0800  58)     DB      0x66
> d4d87596c11d6 (Jian J Wang 2018-01-15 10:16:26 +0800  59)     mov     ecx, 0xc0000080             ; IA32_EFER MSR
> d4d87596c11d6 (Jian J Wang 2018-01-15 10:16:26 +0800  60)     rdmsr
> d4d87596c11d6 (Jian J Wang 2018-01-15 10:16:26 +0800  61)     DB      0x66
> d4d87596c11d6 (Jian J Wang 2018-01-15 10:16:26 +0800  62)     test    ebx, BIT20                  ; check NXE capability
> d4d87596c11d6 (Jian J Wang 2018-01-15 10:16:26 +0800  63)     jz      .1
> d4d87596c11d6 (Jian J Wang 2018-01-15 10:16:26 +0800  64)     or      ah, BIT3                    ; set NXE bit
> d4d87596c11d6 (Jian J Wang 2018-01-15 10:16:26 +0800  65)     wrmsr
> d4d87596c11d6 (Jian J Wang 2018-01-15 10:16:26 +0800  66) .1:
> 246cd9085f806 (Liming Gao  2016-06-14 16:29:40 +0800  67)     DB      0x66, 0xb8
> 246cd9085f806 (Liming Gao  2016-06-14 16:29:40 +0800  68) ASM_PFX(gSmmCr0): DD 0
> 246cd9085f806 (Liming Gao  2016-06-14 16:29:40 +0800  69)     DB      0xbf, PROTECT_MODE_DS, 0    ; mov di, PROTECT_MODE_DS
> 246cd9085f806 (Liming Gao  2016-06-14 16:29:40 +0800  70)     mov     cr0, eax
> 246cd9085f806 (Liming Gao  2016-06-14 16:29:40 +0800  71)     DB      0x66, 0xea                   ; jmp far [ptr48]

The WRMSR is never reached (which is fine) but the CR0 write is also not
reached, ever. Instead, we seem to be stuck in SMM forever, looping back
to SmmStartup.

Here's the bisection log:

> git bisect start
> # good: [018432f0ce1b42541977f61f9c7607257a4bf43a] MdeModulePkg/Ip4Dxe: Add an independent timer for reconfig checking
> git bisect good 018432f0ce1b42541977f61f9c7607257a4bf43a
> # bad: [06c1f423e17fe5ddef824d688d21c83730238ba6] BeagleBoardPkg: reroute Firmware Vendor Pcd to MdeModulePkg
> git bisect bad 06c1f423e17fe5ddef824d688d21c83730238ba6
> # bad: [8ab0bd2397c9d3922e0c7dbb1aa6f7e08799079f] MdePkg/DMAR: Add the definition for DMA_CTRL_PLATFORM_OPT_IN_FLAG bit
> git bisect bad 8ab0bd2397c9d3922e0c7dbb1aa6f7e08799079f
> # good: [24a105a7d8b4b8312743cf265f869dc049b7ff92] BaseTools: Disable warning varargs in XCODE5 align to CLANG38
> git bisect good 24a105a7d8b4b8312743cf265f869dc049b7ff92
> # good: [b2725f57c7a1e6feeb176f1563a4f1a8c2eb6c6f] IntelSiliconPkg IntelVTdPmrPei: Get high top by host address width
> git bisect good b2725f57c7a1e6feeb176f1563a4f1a8c2eb6c6f
> # good: [4f10654e04601fe67a750c9b5a4242efd4141569] UefiCpuPkg/CpuDxe: fix SetMemoryAttributes issue in 32-bit mode
> git bisect good 4f10654e04601fe67a750c9b5a4242efd4141569
> # good: [fbe2c4b9be98a5c2b9c1f6976f51e2456467e752] UefiCpuPkg/CpuDxe: clear NX attr for page directory
> git bisect good fbe2c4b9be98a5c2b9c1f6976f51e2456467e752
> # bad: [94c0129d244f91fa0a7b122414872da49a35f853] MdeModulePkg/PiSmmCore: remove NX attr for SMM RAM
> git bisect bad 94c0129d244f91fa0a7b122414872da49a35f853
> # bad: [d4d87596c11d6e3f8220b6d9677797c802af3a33] UefiCpuPkg/PiSmmCpuDxeSmm: Enable NXE if it's supported
> git bisect bad d4d87596c11d6e3f8220b6d9677797c802af3a33
> # first bad commit: [d4d87596c11d6e3f8220b6d9677797c802af3a33] UefiCpuPkg/PiSmmCpuDxeSmm: Enable NXE if it's supported


Finally, I have an independent question: why are we still adding *new*
0x66 size prefixes with "DB"? NASM supports 16-bit, 32-bit and 64-bit
assembly code in the same source file, and such prefixes can be encoded
symbolically [3]:

> Explicit address-size and operand-size prefixes A16, A32, A64, O16 and
> O32, O64 are provided -- one example of their use is given in chapter
> 10.

Thanks
Laszlo

[1] https://lists.01.org/pipermail/edk2-devel/2018-January/020582.html
[2] https://lists.01.org/pipermail/edk2-devel/2018-January/020586.html
[3] http://www.nasm.us/doc/nasmdoc3.html
_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel
Re: [edk2] [PATCH 4/6] UefiCpuPkg/PiSmmCpuDxeSmm: Enable NXE if it's supported
Posted by Wang, Jian J 6 years, 10 months ago
Hi Laszlo,

I don't know the history of these code but I guess they're converted from .asm file.
That may be why there's "DB 66h" prefix. I think you're right these tricks should be
replaced with more formal ways. Please submit a bz tracker for it.

As to the issue, I don't have clue right now. The code seems no problem. Since msr
write didn't happen, code flow is correct. And those code has executed on real
32-bit platform without problem. I need more time to investigate it.

Regards,
Jian


> -----Original Message-----
> From: Laszlo Ersek [mailto:lersek@redhat.com]
> Sent: Monday, January 29, 2018 6:47 AM
> To: Wang, Jian J <jian.j.wang@intel.com>; edk2-devel@lists.01.org
> Cc: Ni, Ruiyu <ruiyu.ni@intel.com>; Yao, Jiewen <jiewen.yao@intel.com>; Dong,
> Eric <eric.dong@intel.com>; Paolo Bonzini <pbonzini@redhat.com>; Radim
> Kr?má? <rkrcmar@redhat.com>
> Subject: Re: [edk2] [PATCH 4/6] UefiCpuPkg/PiSmmCpuDxeSmm: Enable NXE if
> it's supported
> 
> Hello Jian,
> 
> On 01/15/18 09:54, Jian J Wang wrote:
> > If PcdDxeNxMemoryProtectionPolicy is set to enable protection for memory
> > of EfiBootServicesCode, EfiConventionalMemory, the BIOS will hang at a page
> > fault exception triggered by PiSmmCpuDxeSmm.
> >
> > The root cause is that PiSmmCpuDxeSmm will access default SMM RAM
> starting
> > at 0x30000 which is marked as non-executable, but NX feature was not
> > enabled during SMM initialization. Accessing memory which has invalid
> > attributes set will cause page fault exception. This patch fixes it by
> > checking NX capability in cpuid and enable NXE in EFER MSR if it's
> > available.
> >
> > Cc: Jiewen Yao <jiewen.yao@intel.com>
> > Cc: Ruiyu Ni <ruiyu.ni@intel.com>
> > Cc: Eric Dong <eric.dong@intel.com>
> > Cc: Laszlo Ersek <lersek@redhat.com>
> > Contributed-under: TianoCore Contribution Agreement 1.1
> > Signed-off-by: Jian J Wang <jian.j.wang@intel.com>
> > ---
> >  UefiCpuPkg/PiSmmCpuDxeSmm/Ia32/SmmInit.nasm | 14 ++++++++++++++
> >  UefiCpuPkg/PiSmmCpuDxeSmm/X64/SmmInit.nasm  | 12 +++++++++++-
> >  2 files changed, 25 insertions(+), 1 deletion(-)
> 
> This patch also breaks OVMF on KVM. However, the circumstances and the
> symptom differ from those of the other regression that I reported under
> patch 1/6 in this series [1] [2].
> 
> Namely, this affects the IA32 build, with SMM:
> 
> $ build -a IA32 -p OvmfPkg/OvmfPkgIa32.dsc -D SECURE_BOOT_ENABLE \
>     -D SMM_REQUIRE -t GCC48 -b NOOPT -D HTTP_BOOT_ENABLE
> 
> The value of PcdDxeNxMemoryProtectionPolicy is zero in this build too.
> 
> The virtual CPU model that this OVMF build runs on does *not* support
> NX. (Please refer to "OvmfPkg/README":
> 
> > === SMM support ===
> >
> > [...]
> >
> > * QEMU binary and options specific to 32-bit guests:
> >
> >   $ qemu-system-i386 -cpu coreduo,-nx \
> 
> )
> 
> The boot hangs at the following location:
> 
> > Loading SMM driver at 0x0007FFA2000 EntryPoint=0x0007FFA844D
> PiSmmCpuDxeSmm.efi
> > SMRR Base: 0x7F000000, SMRR Size: 0x1000000
> > PcdCpuSmmCodeAccessCheckEnable = 1
> > mAddressEncMask = 0x0
> > SMRAM TileSize = 0x00002000 (0x00001000, 0x00001000)
> > SMRAM SaveState Buffer (0x7FF94000, 0x0000E000)
> > CPU[000]  APIC ID=0000  SMBASE=7FF8C000  SaveState=7FF9BC00
> Size=00000400
> > CPU[001]  APIC ID=0001  SMBASE=7FF8E000  SaveState=7FF9DC00
> Size=00000400
> > CPU[002]  APIC ID=0002  SMBASE=7FF90000  SaveState=7FF9FC00
> Size=00000400
> > CPU[003]  APIC ID=0003  SMBASE=7FF92000  SaveState=7FFA1C00
> Size=00000400
> > <--HANG-->
> 
> KVM trace (excerpt):
> 
> >  1 CPU-4989  [002] 16163.874223: kvm_enter_smm:        vcpu 1: entering SMM,
> smbase 0x30000
> >  2 CPU-4989  [002] 16163.874244: kvm_fpu:              load
> >  3 CPU-4989  [002] 16163.874245: kvm_entry:            vcpu 1
> >  4 CPU-4989  [002] 16163.874247: kvm_exit:             reason EPT_VIOLATION rip
> 0x8000 info 184 0
> >  5 CPU-4989  [002] 16163.874247: kvm_page_fault:       address 38000
> error_code 184
> >  6 CPU-4989  [002] 16163.874251: kvm_entry:            vcpu 1
> >  7 CPU-4989  [002] 16163.874253: kvm_exit:             reason EPT_VIOLATION rip
> 0x7ff864ba info 184 0
> >  8 CPU-4989  [002] 16163.874253: kvm_page_fault:       address 7ffb64ba
> error_code 184
> >  9 CPU-4989  [002] 16163.874256: kvm_entry:            vcpu 1
> > 10 CPU-4989  [002] 16163.874257: kvm_exit:             reason CPUID rip
> 0x7ff864c0 info 0 0
> > 11 CPU-4989  [002] 16163.874258: kvm_cpuid:            func 80000001 rax 6e8
> rbx 0 rcx 0 rdx 0
> > 12 CPU-4989  [002] 16163.874258: kvm_entry:            vcpu 1
> > 13 CPU-4989  [002] 16163.874259: kvm_exit:             reason CR_ACCESS rip
> 0x7ff864cb info 3 0
> > 14 CPU-4989  [002] 16163.874260: kvm_cr:               cr_write 3 = 0x0
> > 15 CPU-4989  [002] 16163.874261: kvm_entry:            vcpu 1
> > 16 CPU-4989  [002] 16163.874262: kvm_exit:             reason CR_ACCESS rip
> 0x7ff864db info 4 0
> > 17 CPU-4989  [002] 16163.874263: kvm_cr:               cr_write 4 = 0x640
> > 18 CPU-4989  [002] 16163.874273: kvm_entry:            vcpu 1
> > 19 CPU-4989  [002] 16163.874274: kvm_exit:             reason MSR_READ rip
> 0x7ff864e4 info 0 0
> > 20 CPU-4989  [002] 16163.874275: kvm_msr:              msr_read c0000080 = 0x0
> > 21 CPU-4989  [002] 16163.874276: kvm_entry:            vcpu 1
> > 22 CPU-4989  [002] 16163.874277: kvm_exit:             reason EPT_VIOLATION
> rip 0x64f4 info 184 0
> > 23 CPU-4989  [002] 16163.874277: kvm_page_fault:       address 364f4
> error_code 184
> > 24 CPU-4989  [002] 16163.874282: kvm_entry:            vcpu 1
> > 25 CPU-4989  [002] 16163.874283: kvm_exit:             reason EPT_VIOLATION
> rip 0x64f4 info 183 0
> > 26 CPU-4989  [002] 16163.874283: kvm_page_fault:       address f000
> error_code 183
> > 27 CPU-4989  [002] 16163.874288: kvm_entry:            vcpu 1
> > 28 CPU-4989  [002] 16163.874292: kvm_exit:             reason EPT_VIOLATION
> rip 0x7000 info 184 0
> > 29 CPU-4989  [002] 16163.874293: kvm_page_fault:       address 37000
> error_code 184
> > 30 CPU-4989  [002] 16163.874295: kvm_entry:            vcpu 1
> > 31 CPU-4989  [002] 16163.874300: kvm_exit:             reason CPUID rip
> 0x7ff864c0 info 0 0
> > 32 CPU-4989  [002] 16163.874301: kvm_cpuid:            func 80000001 rax 6e8
> rbx 0 rcx 0 rdx 0
> > 33 CPU-4989  [002] 16163.874301: kvm_entry:            vcpu 1
> > 34 CPU-4989  [002] 16163.874302: kvm_exit:             reason CR_ACCESS rip
> 0x7ff864cb info 3 0
> > 35 CPU-4989  [002] 16163.874303: kvm_cr:               cr_write 3 = 0x0
> > 36 CPU-4989  [002] 16163.874304: kvm_entry:            vcpu 1
> 
> After this point, lines 19-36 repeat infinitely.
> 
> The above trace corresponds to the following, from
> "UefiCpuPkg/PiSmmCpuDxeSmm/Ia32/SmmInit.nasm":
> 
> > 246cd9085f806 (Liming Gao  2016-06-14 16:29:40 +0800  43) global
> ASM_PFX(SmmStartup)
> > 246cd9085f806 (Liming Gao  2016-06-14 16:29:40 +0800  44)
> ASM_PFX(SmmStartup):
> > d4d87596c11d6 (Jian J Wang 2018-01-15 10:16:26 +0800  45)     DB      0x66
> > d4d87596c11d6 (Jian J Wang 2018-01-15 10:16:26 +0800  46)     mov     eax,
> 0x80000001             ; read capability
> > d4d87596c11d6 (Jian J Wang 2018-01-15 10:16:26 +0800  47)     cpuid
> > d4d87596c11d6 (Jian J Wang 2018-01-15 10:16:26 +0800  48)     DB      0x66
> > d4d87596c11d6 (Jian J Wang 2018-01-15 10:16:26 +0800  49)     mov     ebx,
> edx                    ; rdmsr will change edx. keep it in ebx.
> > 246cd9085f806 (Liming Gao  2016-06-14 16:29:40 +0800  50)     DB      0x66,
> 0xb8
> > 246cd9085f806 (Liming Gao  2016-06-14 16:29:40 +0800  51)
> ASM_PFX(gSmmCr3): DD 0
> > 246cd9085f806 (Liming Gao  2016-06-14 16:29:40 +0800  52)     mov     cr3, eax
> > 246cd9085f806 (Liming Gao  2016-06-14 16:29:40 +0800  53)     DB      0x67,
> 0x66
> > 246cd9085f806 (Liming Gao  2016-06-14 16:29:40 +0800  54)     lgdt    [cs:ebp +
> (ASM_PFX(gcSmiInitGdtr) - ASM_PFX(SmmStartup))]
> > 246cd9085f806 (Liming Gao  2016-06-14 16:29:40 +0800  55)     DB      0x66,
> 0xb8
> > 246cd9085f806 (Liming Gao  2016-06-14 16:29:40 +0800  56)
> ASM_PFX(gSmmCr4): DD 0
> > 246cd9085f806 (Liming Gao  2016-06-14 16:29:40 +0800  57)     mov     cr4, eax
> > d4d87596c11d6 (Jian J Wang 2018-01-15 10:16:26 +0800  58)     DB      0x66
> > d4d87596c11d6 (Jian J Wang 2018-01-15 10:16:26 +0800  59)     mov     ecx,
> 0xc0000080             ; IA32_EFER MSR
> > d4d87596c11d6 (Jian J Wang 2018-01-15 10:16:26 +0800  60)     rdmsr
> > d4d87596c11d6 (Jian J Wang 2018-01-15 10:16:26 +0800  61)     DB      0x66
> > d4d87596c11d6 (Jian J Wang 2018-01-15 10:16:26 +0800  62)     test    ebx,
> BIT20                  ; check NXE capability
> > d4d87596c11d6 (Jian J Wang 2018-01-15 10:16:26 +0800  63)     jz      .1
> > d4d87596c11d6 (Jian J Wang 2018-01-15 10:16:26 +0800  64)     or      ah,
> BIT3                    ; set NXE bit
> > d4d87596c11d6 (Jian J Wang 2018-01-15 10:16:26 +0800  65)     wrmsr
> > d4d87596c11d6 (Jian J Wang 2018-01-15 10:16:26 +0800  66) .1:
> > 246cd9085f806 (Liming Gao  2016-06-14 16:29:40 +0800  67)     DB      0x66,
> 0xb8
> > 246cd9085f806 (Liming Gao  2016-06-14 16:29:40 +0800  68)
> ASM_PFX(gSmmCr0): DD 0
> > 246cd9085f806 (Liming Gao  2016-06-14 16:29:40 +0800  69)     DB      0xbf,
> PROTECT_MODE_DS, 0    ; mov di, PROTECT_MODE_DS
> > 246cd9085f806 (Liming Gao  2016-06-14 16:29:40 +0800  70)     mov     cr0, eax
> > 246cd9085f806 (Liming Gao  2016-06-14 16:29:40 +0800  71)     DB      0x66,
> 0xea                   ; jmp far [ptr48]
> 
> The WRMSR is never reached (which is fine) but the CR0 write is also not
> reached, ever. Instead, we seem to be stuck in SMM forever, looping back
> to SmmStartup.
> 
> Here's the bisection log:
> 
> > git bisect start
> > # good: [018432f0ce1b42541977f61f9c7607257a4bf43a]
> MdeModulePkg/Ip4Dxe: Add an independent timer for reconfig checking
> > git bisect good 018432f0ce1b42541977f61f9c7607257a4bf43a
> > # bad: [06c1f423e17fe5ddef824d688d21c83730238ba6] BeagleBoardPkg:
> reroute Firmware Vendor Pcd to MdeModulePkg
> > git bisect bad 06c1f423e17fe5ddef824d688d21c83730238ba6
> > # bad: [8ab0bd2397c9d3922e0c7dbb1aa6f7e08799079f] MdePkg/DMAR: Add
> the definition for DMA_CTRL_PLATFORM_OPT_IN_FLAG bit
> > git bisect bad 8ab0bd2397c9d3922e0c7dbb1aa6f7e08799079f
> > # good: [24a105a7d8b4b8312743cf265f869dc049b7ff92] BaseTools: Disable
> warning varargs in XCODE5 align to CLANG38
> > git bisect good 24a105a7d8b4b8312743cf265f869dc049b7ff92
> > # good: [b2725f57c7a1e6feeb176f1563a4f1a8c2eb6c6f] IntelSiliconPkg
> IntelVTdPmrPei: Get high top by host address width
> > git bisect good b2725f57c7a1e6feeb176f1563a4f1a8c2eb6c6f
> > # good: [4f10654e04601fe67a750c9b5a4242efd4141569] UefiCpuPkg/CpuDxe:
> fix SetMemoryAttributes issue in 32-bit mode
> > git bisect good 4f10654e04601fe67a750c9b5a4242efd4141569
> > # good: [fbe2c4b9be98a5c2b9c1f6976f51e2456467e752] UefiCpuPkg/CpuDxe:
> clear NX attr for page directory
> > git bisect good fbe2c4b9be98a5c2b9c1f6976f51e2456467e752
> > # bad: [94c0129d244f91fa0a7b122414872da49a35f853]
> MdeModulePkg/PiSmmCore: remove NX attr for SMM RAM
> > git bisect bad 94c0129d244f91fa0a7b122414872da49a35f853
> > # bad: [d4d87596c11d6e3f8220b6d9677797c802af3a33]
> UefiCpuPkg/PiSmmCpuDxeSmm: Enable NXE if it's supported
> > git bisect bad d4d87596c11d6e3f8220b6d9677797c802af3a33
> > # first bad commit: [d4d87596c11d6e3f8220b6d9677797c802af3a33]
> UefiCpuPkg/PiSmmCpuDxeSmm: Enable NXE if it's supported
> 
> 
> Finally, I have an independent question: why are we still adding *new*
> 0x66 size prefixes with "DB"? NASM supports 16-bit, 32-bit and 64-bit
> assembly code in the same source file, and such prefixes can be encoded
> symbolically [3]:
> 
> > Explicit address-size and operand-size prefixes A16, A32, A64, O16 and
> > O32, O64 are provided -- one example of their use is given in chapter
> > 10.
> 
> Thanks
> Laszlo
> 
> [1] https://lists.01.org/pipermail/edk2-devel/2018-January/020582.html
> [2] https://lists.01.org/pipermail/edk2-devel/2018-January/020586.html
> [3] http://www.nasm.us/doc/nasmdoc3.html
_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel
Re: [edk2] [PATCH 4/6] UefiCpuPkg/PiSmmCpuDxeSmm: Enable NXE if it's supported
Posted by Laszlo Ersek 6 years, 10 months ago
On 01/29/18 10:02, Wang, Jian J wrote:
> Hi Laszlo,
>
> I don't know the history of these code but I guess they're converted
> from .asm file. That may be why there's "DB 66h" prefix. I think
> you're right these tricks should be replaced with more formal ways.
> Please submit a bz tracker for it.

I submitted <https://bugzilla.tianocore.org/show_bug.cgi?id=866>.

> As to the issue, I don't have clue right now. The code seems no
> problem. Since msr write didn't happen, code flow is correct. And
> those code has executed on real 32-bit platform without problem. I
> need more time to investigate it.

I think I've found the issue; more exactly I narrowed it down a bit. I
remember that the same problem drove me mad a few years ago. :)

The issue is that in the middle of such processor mode switches, no jump
instructions work *at all* on KVM. I don't know why, this is just my
experience. The KVM behavior could even be justified by the Intel SDM.
I'm unsure.

Let's look at the patch with more context:

>  global ASM_PFX(SmmStartup)
>  ASM_PFX(SmmStartup):
> +    DB      0x66
> +    mov     eax, 0x80000001             ; read capability
> +    cpuid
> +    DB      0x66
> +    mov     ebx, edx                    ; rdmsr will change edx. keep it in ebx.
>      DB      0x66, 0xb8
>  ASM_PFX(gSmmCr3): DD 0
>      mov     cr3, eax
>      DB      0x67, 0x66
>      lgdt    [cs:ebp + (ASM_PFX(gcSmiInitGdtr) - ASM_PFX(SmmStartup))]
>      DB      0x66, 0xb8
>  ASM_PFX(gSmmCr4): DD 0
>      mov     cr4, eax
> +    DB      0x66
> +    mov     ecx, 0xc0000080             ; IA32_EFER MSR
> +    rdmsr
> +    DB      0x66
> +    test    ebx, BIT20                  ; check NXE capability
> +    jz      .1
> +    or      ah, BIT3                    ; set NXE bit
> +    wrmsr
> +.1:

This code has exactly one jump, and in practice it is never taken
(because NX support is ubiquitous on physical platforms).

In my testing I added some other conditional jumps -- I reimplemented
IsExecuteDisableBitAvailable() from
"MdeModulePkg/Core/DxeIplPeim/Ia32/DxeLoadFunc.c", which has more
conditions. All the conditional jumps that were *not* taken didn't
cause any issues. (This is why the same logic from the patch works for
me in the X64 version, because there the "jz" is never taken, since NX
is always available there.) However, the first jump that *was* taken in
my testing immediately hung or crashed the IA32 guest.

Finally I replaced the entire NX management code with an unconditional
jump forward:

    jmp jump_here
jump_here:

and even this hung / crashed the guest.

For some reason this section of the code is unfit for jumping, under KVM
anyway.

Thanks
Laszlo
_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel
Re: [edk2] [PATCH 4/6] UefiCpuPkg/PiSmmCpuDxeSmm: Enable NXE if it's supported
Posted by Laszlo Ersek 6 years, 10 months ago
On 01/29/18 20:48, Laszlo Ersek wrote:
> On 01/29/18 10:02, Wang, Jian J wrote:
>> Hi Laszlo,
>>
>> I don't know the history of these code but I guess they're converted
>> from .asm file. That may be why there's "DB 66h" prefix. I think
>> you're right these tricks should be replaced with more formal ways.
>> Please submit a bz tracker for it.
> 
> I submitted <https://bugzilla.tianocore.org/show_bug.cgi?id=866>.
> 
>> As to the issue, I don't have clue right now. The code seems no
>> problem. Since msr write didn't happen, code flow is correct. And
>> those code has executed on real 32-bit platform without problem. I
>> need more time to investigate it.
> 
> I think I've found the issue; more exactly I narrowed it down a bit. I
> remember that the same problem drove me mad a few years ago. :)
> 
> The issue is that in the middle of such processor mode switches, no jump
> instructions work *at all* on KVM. I don't know why, this is just my
> experience. The KVM behavior could even be justified by the Intel SDM.
> I'm unsure.
> 
> Let's look at the patch with more context:
> 
>>  global ASM_PFX(SmmStartup)
>>  ASM_PFX(SmmStartup):
>> +    DB      0x66
>> +    mov     eax, 0x80000001             ; read capability
>> +    cpuid
>> +    DB      0x66
>> +    mov     ebx, edx                    ; rdmsr will change edx. keep it in ebx.
>>      DB      0x66, 0xb8
>>  ASM_PFX(gSmmCr3): DD 0
>>      mov     cr3, eax
>>      DB      0x67, 0x66
>>      lgdt    [cs:ebp + (ASM_PFX(gcSmiInitGdtr) - ASM_PFX(SmmStartup))]
>>      DB      0x66, 0xb8
>>  ASM_PFX(gSmmCr4): DD 0
>>      mov     cr4, eax
>> +    DB      0x66
>> +    mov     ecx, 0xc0000080             ; IA32_EFER MSR
>> +    rdmsr
>> +    DB      0x66
>> +    test    ebx, BIT20                  ; check NXE capability
>> +    jz      .1
>> +    or      ah, BIT3                    ; set NXE bit
>> +    wrmsr
>> +.1:
> 
> This code has exactly one jump, and in practice it is never taken
> (because NX support is ubiquitous on physical platforms).
> 
> In my testing I added some other conditional jumps -- I reimplemented
> IsExecuteDisableBitAvailable() from
> "MdeModulePkg/Core/DxeIplPeim/Ia32/DxeLoadFunc.c", which has more
> conditions. All the conditional jumps that were *not* taken didn't
> cause any issues. (This is why the same logic from the patch works for
> me in the X64 version, because there the "jz" is never taken, since NX
> is always available there.) However, the first jump that *was* taken in
> my testing immediately hung or crashed the IA32 guest.
> 
> Finally I replaced the entire NX management code with an unconditional
> jump forward:
> 
>     jmp jump_here
> jump_here:
> 
> and even this hung / crashed the guest.
> 
> For some reason this section of the code is unfit for jumping, under KVM
> anyway.

I found a work-around for this.

While short jumps (= relative to EIP) do not work under KVM, in initial
SMM, near jumps to absolute 32-bit addresses (specified indirectly via
registers) do work [*]. Except, the address calculation has to take into
account the trick that PiSmmCpuDxeSmm applies.

Namely, for initial SMBASE relocation,

- while the *short* gcSmmInitTemplate routine is copied to SMBASE+32KB
  (that is, to (0x3_0000 + 0x8000)), and executed from there,

- the SmmStartup routine is actually executed from the *body* of
  PiSmmCpuDxeSmm -- that is, from SMRAM, to which place the SMM Core
  loaded and *relocated* PiSmmCpuDxeSmm, via normal SMM driver dispatch.

This is why gcSmmInitTemplate jumps to SmmStartup as follows, in
"UefiCpuPkg/PiSmmCpuDxeSmm/Ia32/SmmInit.nasm":

> BITS 16
> ASM_PFX(gcSmmInitTemplate):
>     mov ebp, ASM_PFX(SmmStartup)
>     sub ebp, 0x30000
>     jmp ebp
>
> ASM_PFX(gcSmmInitSize): DW $ - ASM_PFX(gcSmmInitTemplate)

The "mov ebp, ASM_PFX(SmmStartup)" instruction is relocated at
PiSmmCpuDxeSmm dispatch time, to the absolute address of SmmStartup in
the body of PiSmmCpuDxeSmm. This absolute address is relative to zero.
However, when the "jmp ebp" is executed in initial SMM mode, the CS
register is not zero (i.e. EIP:=EBP will not be interpreted relative to
zero). Instead, CS is initially set to 0x3000 in SMM, implying a code
segment base that equals SMBASE (0x3_0000). Hence the "sub ebp, 0x30000"
-- it compensates the original relocation of the "mov" for the suddenly
increased CS base.

The same applies to any near jump (with absolute indirect target) in the
SmmStartup routine. All these jump instructions are relocated within the
body of PiSmmCpuDxeSmm (at driver dispatch time) against a *zero* code
segment base, but when they are actually executed in initial SMM, they
are executed against a code segment base of 0x3_0000.

I will send a patch (or a patch set, not sure yet).

[*] segment limits are raised to 4GB, and near jumps can use such
addresses with the "o32" (32-bit operand-size override) prefix. In fact,
NASM inserts the prefix automatically upon seeing eg. "jmp ebx" in BITS 16.

Thanks!
Laszlo
_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel
Re: [edk2] [PATCH 4/6] UefiCpuPkg/PiSmmCpuDxeSmm: Enable NXE if it's supported
Posted by Wang, Jian J 6 years, 10 months ago
Very good analysis and investigation.

Regards,
Jian


> -----Original Message-----
> From: Laszlo Ersek [mailto:lersek@redhat.com]
> Sent: Tuesday, January 30, 2018 3:49 AM
> To: Wang, Jian J <jian.j.wang@intel.com>; edk2-devel@lists.01.org
> Cc: Ni, Ruiyu <ruiyu.ni@intel.com>; Paolo Bonzini <pbonzini@redhat.com>; Yao,
> Jiewen <jiewen.yao@intel.com>; Dong, Eric <eric.dong@intel.com>; Radim
> Kr?má? <rkrcmar@redhat.com>
> Subject: Re: [edk2] [PATCH 4/6] UefiCpuPkg/PiSmmCpuDxeSmm: Enable NXE if
> it's supported
> 
> On 01/29/18 10:02, Wang, Jian J wrote:
> > Hi Laszlo,
> >
> > I don't know the history of these code but I guess they're converted
> > from .asm file. That may be why there's "DB 66h" prefix. I think
> > you're right these tricks should be replaced with more formal ways.
> > Please submit a bz tracker for it.
> 
> I submitted <https://bugzilla.tianocore.org/show_bug.cgi?id=866>.
> 
> > As to the issue, I don't have clue right now. The code seems no
> > problem. Since msr write didn't happen, code flow is correct. And
> > those code has executed on real 32-bit platform without problem. I
> > need more time to investigate it.
> 
> I think I've found the issue; more exactly I narrowed it down a bit. I
> remember that the same problem drove me mad a few years ago. :)
> 
> The issue is that in the middle of such processor mode switches, no jump
> instructions work *at all* on KVM. I don't know why, this is just my
> experience. The KVM behavior could even be justified by the Intel SDM.
> I'm unsure.
> 
> Let's look at the patch with more context:
> 
> >  global ASM_PFX(SmmStartup)
> >  ASM_PFX(SmmStartup):
> > +    DB      0x66
> > +    mov     eax, 0x80000001             ; read capability
> > +    cpuid
> > +    DB      0x66
> > +    mov     ebx, edx                    ; rdmsr will change edx. keep it in ebx.
> >      DB      0x66, 0xb8
> >  ASM_PFX(gSmmCr3): DD 0
> >      mov     cr3, eax
> >      DB      0x67, 0x66
> >      lgdt    [cs:ebp + (ASM_PFX(gcSmiInitGdtr) - ASM_PFX(SmmStartup))]
> >      DB      0x66, 0xb8
> >  ASM_PFX(gSmmCr4): DD 0
> >      mov     cr4, eax
> > +    DB      0x66
> > +    mov     ecx, 0xc0000080             ; IA32_EFER MSR
> > +    rdmsr
> > +    DB      0x66
> > +    test    ebx, BIT20                  ; check NXE capability
> > +    jz      .1
> > +    or      ah, BIT3                    ; set NXE bit
> > +    wrmsr
> > +.1:
> 
> This code has exactly one jump, and in practice it is never taken
> (because NX support is ubiquitous on physical platforms).
> 
> In my testing I added some other conditional jumps -- I reimplemented
> IsExecuteDisableBitAvailable() from
> "MdeModulePkg/Core/DxeIplPeim/Ia32/DxeLoadFunc.c", which has more
> conditions. All the conditional jumps that were *not* taken didn't
> cause any issues. (This is why the same logic from the patch works for
> me in the X64 version, because there the "jz" is never taken, since NX
> is always available there.) However, the first jump that *was* taken in
> my testing immediately hung or crashed the IA32 guest.
> 
> Finally I replaced the entire NX management code with an unconditional
> jump forward:
> 
>     jmp jump_here
> jump_here:
> 
> and even this hung / crashed the guest.
> 
> For some reason this section of the code is unfit for jumping, under KVM
> anyway.
> 
> Thanks
> Laszlo
_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel