UefiCpuPkg/Library/CpuExceptionHandlerLib/CpuExceptionCommon.c | 537 ++++++++++++++++++-- UefiCpuPkg/Library/CpuExceptionHandlerLib/CpuExceptionCommon.h | 59 ++- UefiCpuPkg/Library/CpuExceptionHandlerLib/Ia32/ArchExceptionHandler.c | 483 +++++++++++++++++- UefiCpuPkg/Library/CpuExceptionHandlerLib/X64/ArchExceptionHandler.c | 426 +++++++++++++++- 4 files changed, 1435 insertions(+), 70 deletions(-)
Hi, This series adds stack trace support during IA32 and X64 CPU exceptions. Informations like back trace, stack contents and image module names (that were part of the call stack) will be dumped out. The current limitation is that it relies on available frame pointers (GCC only) in order to successfully unwind the stack. Jiewen, Thank you very much for your time on this. I've applied the changes you suggested, as well as tested it on IA32 PAE paging mode - it worked as expected. Other than that, I also tested the stack trace in SMM code by manually calling CpuBreakPoint() and then it broke with another exception (page fault). I didn't have much time to look into that, but what I've observed is that the page fault ocurred during the search of PE/COFF image base address (in PeCoffSearchImageBase). The function attempts to search for the image base from "Address" through 0, so any of those dereferenced addresses triggers the page fault. Do you know how we could fix that issue? Perhaps introducing a AddressValidationLib (as Brian suggested previously) and use it within PeCoffSearchImageBase()? I'd also like to thank Brian & Jeff for all the support! Thanks Paulo Repo: https://github.com/pcacjr/edk2.git Branch: stacktrace_v5 Cc: Rick Bramley <richard.bramley@hp.com> Cc: Kimon Berlin <kimon.berlin@hp.com> Cc: Diego Medaglia <diego.meaglia@hp.com> Cc: Andrew Fish <afish@apple.com> Cc: Eric Dong <eric.dong@intel.com> Cc: Laszlo Ersek <lersek@redhat.com> Cc: Brian Johnson <brian.johnson@hpe.com> Cc: Jeff Fan <vanjeff_919@hotmail.com> Cc: Jiewen Yao <jiewen.yao@intel.com> Cc: Paulo Alcantara <paulo@hp.com> Contributed-under: TianoCore Contribution Agreement 1.1 Signed-off-by: Paulo Alcantara <paulo@paulo.ac> --- v1 -> v2: * Add IA32 arch support (GCC toolchain only) * Replace hard-coded stack alignment value (16) with CPU_STACK_ALIGNMENT. * Check for proper stack and frame pointer alignments. * Fix initialization of UnwoundStacksCount to 1. * Move GetPdbFileName() to common code since it will be used by both IA32 and X64 implementations. v2 -> v3: * Fixed wrong assumption about "RIP < ImageBase" to start searching for another PE/COFF image. That is, RIP may point to lower and higher addresses for any other PE/COFF images. Both IA32 & X64. (Thanks Andrew & Jiewen) * Fixed typo: unwond -> unwound. Both IA32 & X64. (Thanks Brian) v3 -> v4: * Validate all frame/stack pointer addresses before dereferencing them as requested by Brian & Jiewen. * Correctly print out IP addresses during the stack traces (by Jeff) v4 -> v5: * Fixed address calculations and improved code as suggested by Jiewen. * Fixed parameter validation as suggested by Brian. * Tested stack stack with IA32 PAE paging mode. Paulo Alcantara (8): UefiCpuPkg/CpuExceptionHandlerLib/X64: Add stack trace support UefiCpuPkg/CpuExceptionHandlerLib: Export GetPdbFileName() UefiCpuPkg/CpuExceptionHandlerLib/Ia32: Add stack trace support UefiCpuPkg/CpuExceptionHandlerLib: Add helper to validate memory addresses UefiCpuPkg/CpuExceptionHandlerLib: Ensure valid frame/stack pointers UefiCpuPkg/CpuExceptionHandlerLib: Correctly print IP addresses UefiCpuPkg/CpuExceptionHandlerLib: Validate memory address ranges UefiCpuPkg/CpuExceptionHandlerLib: Add early check in DumpStackContents UefiCpuPkg/Library/CpuExceptionHandlerLib/CpuExceptionCommon.c | 537 ++++++++++++++++++-- UefiCpuPkg/Library/CpuExceptionHandlerLib/CpuExceptionCommon.h | 59 ++- UefiCpuPkg/Library/CpuExceptionHandlerLib/Ia32/ArchExceptionHandler.c | 483 +++++++++++++++++- UefiCpuPkg/Library/CpuExceptionHandlerLib/X64/ArchExceptionHandler.c | 426 +++++++++++++++- 4 files changed, 1435 insertions(+), 70 deletions(-) -- 2.14.3 _______________________________________________ edk2-devel mailing list edk2-devel@lists.01.org https://lists.01.org/mailman/listinfo/edk2-devel
Thanks Paulo. The series looks really good. I give it thumb up. :-) The 8 patches keep updating 4 files, so I squash them when I review. The comment below is for the final code, not for a specific patch. 1) CpuExceptionCommon.c: IsLinearAddressRangeValid(). // // Check for valid input parameters // if (LinearAddressStart == 0 || LinearAddressEnd == 0 || LinearAddressStart > LinearAddressEnd) { return FALSE; } I think LinearAddressStart is a valid case. In BIOS, we do update IVT in 0 address when CSM is enabled. I do not think we need exclude it here. If we enabled ZeroPointer protection, it can be excluded in page table parsing. 2) I found below logic appears in 2 functions - DumpImageModuleNames() and DumpStacktrace(). Is that possible to merge them? We can calculate in DumpImageAndCpuContent() and use Eip as parameter. Just in case there is 3rd function need use EIP, it won't miss the calculation. (It is a bug fix we did recently.) // // Set current EIP address // if ((ExceptionType == EXCEPT_IA32_PAGE_FAULT) && ((SystemContext.SystemContextIa32->ExceptionData & IA32_PF_EC_ID) != 0)) { // // The EIP in SystemContext could not be used // if it is page fault with I/D set. // Eip = *(UINT32 *)(UINTN)SystemContext.SystemContextIa32->Esp; } else { Eip = SystemContext.SystemContextIa32->Eip; } 3) I am a little surprised on PeCoffSearchImageBase() issue. We have 4 PeCoffSearchImageBase() call in each arch. DumpImageModuleNames() calls twice and DumpStacktrace() calls twice. Do you know which specific one triggers the zero address #PF issue? C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\Ia32\ArchExceptionHandler.c(547): ImageBase = PeCoffSearchImageBase (Eip); C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\Ia32\ArchExceptionHandler.c(613): ImageBase = PeCoffSearchImageBase (Eip); C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\Ia32\ArchExceptionHandler.c(682): ImageBase = PeCoffSearchImageBase (Eip); C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\Ia32\ArchExceptionHandler.c(741): ImageBase = PeCoffSearchImageBase (Eip); C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\X64\ArchExceptionHandler.c(540): ImageBase = PeCoffSearchImageBase (Rip); C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\X64\ArchExceptionHandler.c(613): ImageBase = PeCoffSearchImageBase (Rip); C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\X64\ArchExceptionHandler.c(710): ImageBase = PeCoffSearchImageBase (Rip); C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\X64\ArchExceptionHandler.c(779): ImageBase = PeCoffSearchImageBase (Rip); The EIP from SystemContext seems good. I assume there is not a problem here. if ((ExceptionType == EXCEPT_IA32_PAGE_FAULT) && ((SystemContext.SystemContextIa32->ExceptionData & IA32_PF_EC_ID) != 0)) { // // The EIP in SystemContext could not be used // if it is page fault with I/D set. // Eip = *(UINT32 *)(UINTN)SystemContext.SystemContextIa32->Esp; } else { Eip = SystemContext.SystemContextIa32->Eip; } // // Get initial PE/COFF image base address from current EIP // ImageBase = PeCoffSearchImageBase (Eip); But the EIP from stack frame is unknown and risky. Especially for the code in mode-switch, such as PEI->DXE, UEFI->CSM, AP wakeup->AP C code, SMM entrypoint->SMM C code. Should we add a check for EIP here? // // Set EIP with return address from current stack frame // Eip = *(UINT32 *)((UINTN)Ebp + 4); // // If EIP is zero, then stop unwinding the stack // if (Eip == 0) { break; } // // Search for the respective PE/COFF image based on EIP // ImageBase = PeCoffSearchImageBase (Eip); If you can help us do some more investigation on the root-cause and narrow down the issue, I will appreciate that. We can decide how to fix once it is root-caused. Maybe we just do a simple IsLogicalAddressValid () check before we call PeCoffSearchImageBase(). :-) Thank you Yao Jiewen > -----Original Message----- > From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of Paulo > Alcantara > Sent: Monday, January 15, 2018 8:23 AM > To: edk2-devel@lists.01.org > Cc: Rick Bramley <richard.bramley@hp.com>; Dong, Eric > <eric.dong@intel.com>; Kimon Berlin <kimon.berlin@hp.com>; Andrew Fish > <afish@apple.com>; Yao, Jiewen <jiewen.yao@intel.com>; Diego Medaglia > <diego.meaglia@hp.com>; Laszlo Ersek <lersek@redhat.com> > Subject: [edk2] [RFC v5 0/8] Stack trace support in X64 exception handling > > Hi, > > This series adds stack trace support during IA32 and X64 CPU exceptions. > > Informations like back trace, stack contents and image module names > (that were part of the call stack) will be dumped out. > > The current limitation is that it relies on available frame pointers > (GCC only) in order to successfully unwind the stack. > > Jiewen, > > Thank you very much for your time on this. I've applied the changes you > suggested, as well as tested it on IA32 PAE paging mode - it worked as > expected. > > Other than that, I also tested the stack trace in SMM code by manually > calling CpuBreakPoint() and then it broke with another exception > (page fault). I didn't have much time to look into that, but what I've > observed is that the page fault ocurred during the search of PE/COFF > image base address (in PeCoffSearchImageBase). The function attempts to > search for the image base from "Address" through 0, so any of those > dereferenced addresses triggers the page fault. > > Do you know how we could fix that issue? Perhaps introducing a > AddressValidationLib (as Brian suggested previously) and use it within > PeCoffSearchImageBase()? > > I'd also like to thank Brian & Jeff for all the support! > > Thanks > Paulo > > Repo: https://github.com/pcacjr/edk2.git > Branch: stacktrace_v5 > > Cc: Rick Bramley <richard.bramley@hp.com> > Cc: Kimon Berlin <kimon.berlin@hp.com> > Cc: Diego Medaglia <diego.meaglia@hp.com> > Cc: Andrew Fish <afish@apple.com> > Cc: Eric Dong <eric.dong@intel.com> > Cc: Laszlo Ersek <lersek@redhat.com> > Cc: Brian Johnson <brian.johnson@hpe.com> > Cc: Jeff Fan <vanjeff_919@hotmail.com> > Cc: Jiewen Yao <jiewen.yao@intel.com> > Cc: Paulo Alcantara <paulo@hp.com> > Contributed-under: TianoCore Contribution Agreement 1.1 > Signed-off-by: Paulo Alcantara <paulo@paulo.ac> > --- > > v1 -> v2: > * Add IA32 arch support (GCC toolchain only) > * Replace hard-coded stack alignment value (16) with > CPU_STACK_ALIGNMENT. > * Check for proper stack and frame pointer alignments. > * Fix initialization of UnwoundStacksCount to 1. > * Move GetPdbFileName() to common code since it will be used by both > IA32 and X64 implementations. > > v2 -> v3: > * Fixed wrong assumption about "RIP < ImageBase" to start searching > for another PE/COFF image. That is, RIP may point to lower and > higher addresses for any other PE/COFF images. Both IA32 & X64. > (Thanks Andrew & Jiewen) > * Fixed typo: unwond -> unwound. Both IA32 & X64. (Thanks Brian) > > v3 -> v4: > * Validate all frame/stack pointer addresses before dereferencing them > as requested by Brian & Jiewen. > * Correctly print out IP addresses during the stack traces (by Jeff) > > v4 -> v5: > * Fixed address calculations and improved code as suggested by Jiewen. > * Fixed parameter validation as suggested by Brian. > * Tested stack stack with IA32 PAE paging mode. > > Paulo Alcantara (8): > UefiCpuPkg/CpuExceptionHandlerLib/X64: Add stack trace support > UefiCpuPkg/CpuExceptionHandlerLib: Export GetPdbFileName() > UefiCpuPkg/CpuExceptionHandlerLib/Ia32: Add stack trace support > UefiCpuPkg/CpuExceptionHandlerLib: Add helper to validate memory > addresses > UefiCpuPkg/CpuExceptionHandlerLib: Ensure valid frame/stack pointers > UefiCpuPkg/CpuExceptionHandlerLib: Correctly print IP addresses > UefiCpuPkg/CpuExceptionHandlerLib: Validate memory address ranges > UefiCpuPkg/CpuExceptionHandlerLib: Add early check in > DumpStackContents > > UefiCpuPkg/Library/CpuExceptionHandlerLib/CpuExceptionCommon.c > | 537 ++++++++++++++++++-- > UefiCpuPkg/Library/CpuExceptionHandlerLib/CpuExceptionCommon.h > | 59 ++- > UefiCpuPkg/Library/CpuExceptionHandlerLib/Ia32/ArchExceptionHandler.c | > 483 +++++++++++++++++- > UefiCpuPkg/Library/CpuExceptionHandlerLib/X64/ArchExceptionHandler.c | > 426 +++++++++++++++- > 4 files changed, 1435 insertions(+), 70 deletions(-) > > -- > 2.14.3 > > _______________________________________________ > edk2-devel mailing list > edk2-devel@lists.01.org > https://lists.01.org/mailman/listinfo/edk2-devel _______________________________________________ edk2-devel mailing list edk2-devel@lists.01.org https://lists.01.org/mailman/listinfo/edk2-devel
Hi Paulo I have some more thought on #3 for your consideration. Given the situation that we are doing heap guard feature, a *valid* EIP address might not be enough to guarantee the address is inside of an *executable* page. I am thinking if we need check the non-executable bit (BIT63) as well for DumpImageModuleNames(). (No need for DumpStacktrace()). We can add IsLogicalAddressExecutable() on top of IsLogicalAddressValid(). Also I do not object the idea to add check inside of PeCoffSearchImageBase(). I am thinking in another way - we can let *caller* to input a validation function for the PeCoffLib, instead of let PeCoffLib depend on another library. For example: PeCoffSearchImageBaseEx (Address, ValidateAddressFunc) Thank you Yao Jiewen > -----Original Message----- > From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of Yao, > Jiewen > Sent: Wednesday, January 17, 2018 8:57 PM > To: Paulo Alcantara <paulo@paulo.ac>; edk2-devel@lists.01.org > Cc: Rick Bramley <richard.bramley@hp.com>; Dong, Eric > <eric.dong@intel.com>; Kimon Berlin <kimon.berlin@hp.com>; Andrew Fish > <afish@apple.com>; Diego Medaglia <diego.meaglia@hp.com>; Laszlo Ersek > <lersek@redhat.com> > Subject: Re: [edk2] [RFC v5 0/8] Stack trace support in X64 exception handling > > Thanks Paulo. > > The series looks really good. I give it thumb up. :-) > > The 8 patches keep updating 4 files, so I squash them when I review. The > comment below is for the final code, not for a specific patch. > > 1) CpuExceptionCommon.c: IsLinearAddressRangeValid(). > // > // Check for valid input parameters > // > if (LinearAddressStart == 0 || LinearAddressEnd == 0 || > LinearAddressStart > LinearAddressEnd) { > return FALSE; > } > > I think LinearAddressStart is a valid case. In BIOS, we do update IVT in 0 address > when CSM is enabled. > I do not think we need exclude it here. If we enabled ZeroPointer protection, it > can be excluded in page table parsing. > > 2) I found below logic appears in 2 functions - DumpImageModuleNames() and > DumpStacktrace(). Is that possible to merge them? > We can calculate in DumpImageAndCpuContent() and use Eip as parameter. Just > in case there is 3rd function need use EIP, it won't miss the calculation. > (It is a bug fix we did recently.) > > // > // Set current EIP address > // > if ((ExceptionType == EXCEPT_IA32_PAGE_FAULT) && > ((SystemContext.SystemContextIa32->ExceptionData & > IA32_PF_EC_ID) != 0)) { > // > // The EIP in SystemContext could not be used > // if it is page fault with I/D set. > // > Eip = *(UINT32 *)(UINTN)SystemContext.SystemContextIa32->Esp; > } else { > Eip = SystemContext.SystemContextIa32->Eip; > } > > 3) I am a little surprised on PeCoffSearchImageBase() issue. > > We have 4 PeCoffSearchImageBase() call in each arch. > DumpImageModuleNames() calls twice and DumpStacktrace() calls twice. > Do you know which specific one triggers the zero address #PF issue? > > > C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\Ia32\Arch > ExceptionHandler.c(547): ImageBase = PeCoffSearchImageBase (Eip); > > C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\Ia32\Arch > ExceptionHandler.c(613): ImageBase = PeCoffSearchImageBase (Eip); > > C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\Ia32\Arch > ExceptionHandler.c(682): ImageBase = PeCoffSearchImageBase (Eip); > > C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\Ia32\Arch > ExceptionHandler.c(741): ImageBase = PeCoffSearchImageBase (Eip); > > C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\X64\Arch > ExceptionHandler.c(540): ImageBase = PeCoffSearchImageBase (Rip); > > C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\X64\Arch > ExceptionHandler.c(613): ImageBase = PeCoffSearchImageBase (Rip); > > C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\X64\Arch > ExceptionHandler.c(710): ImageBase = PeCoffSearchImageBase (Rip); > > C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\X64\Arch > ExceptionHandler.c(779): ImageBase = PeCoffSearchImageBase (Rip); > > The EIP from SystemContext seems good. I assume there is not a problem here. > > if ((ExceptionType == EXCEPT_IA32_PAGE_FAULT) && > ((SystemContext.SystemContextIa32->ExceptionData & > IA32_PF_EC_ID) != 0)) { > // > // The EIP in SystemContext could not be used > // if it is page fault with I/D set. > // > Eip = *(UINT32 *)(UINTN)SystemContext.SystemContextIa32->Esp; > } else { > Eip = SystemContext.SystemContextIa32->Eip; > } > > // > // Get initial PE/COFF image base address from current EIP > // > ImageBase = PeCoffSearchImageBase (Eip); > > But the EIP from stack frame is unknown and risky. Especially for the code in > mode-switch, such as PEI->DXE, UEFI->CSM, AP wakeup->AP C code, SMM > entrypoint->SMM C code. > Should we add a check for EIP here? > > // > // Set EIP with return address from current stack frame > // > Eip = *(UINT32 *)((UINTN)Ebp + 4); > > // > // If EIP is zero, then stop unwinding the stack > // > if (Eip == 0) { > break; > } > > // > // Search for the respective PE/COFF image based on EIP > // > ImageBase = PeCoffSearchImageBase (Eip); > > If you can help us do some more investigation on the root-cause and narrow > down the issue, I will appreciate that. > > We can decide how to fix once it is root-caused. > > Maybe we just do a simple IsLogicalAddressValid () check before we call > PeCoffSearchImageBase(). :-) > > > Thank you > Yao Jiewen > > > -----Original Message----- > > From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of > Paulo > > Alcantara > > Sent: Monday, January 15, 2018 8:23 AM > > To: edk2-devel@lists.01.org > > Cc: Rick Bramley <richard.bramley@hp.com>; Dong, Eric > > <eric.dong@intel.com>; Kimon Berlin <kimon.berlin@hp.com>; Andrew Fish > > <afish@apple.com>; Yao, Jiewen <jiewen.yao@intel.com>; Diego Medaglia > > <diego.meaglia@hp.com>; Laszlo Ersek <lersek@redhat.com> > > Subject: [edk2] [RFC v5 0/8] Stack trace support in X64 exception handling > > > > Hi, > > > > This series adds stack trace support during IA32 and X64 CPU exceptions. > > > > Informations like back trace, stack contents and image module names > > (that were part of the call stack) will be dumped out. > > > > The current limitation is that it relies on available frame pointers > > (GCC only) in order to successfully unwind the stack. > > > > Jiewen, > > > > Thank you very much for your time on this. I've applied the changes you > > suggested, as well as tested it on IA32 PAE paging mode - it worked as > > expected. > > > > Other than that, I also tested the stack trace in SMM code by manually > > calling CpuBreakPoint() and then it broke with another exception > > (page fault). I didn't have much time to look into that, but what I've > > observed is that the page fault ocurred during the search of PE/COFF > > image base address (in PeCoffSearchImageBase). The function attempts to > > search for the image base from "Address" through 0, so any of those > > dereferenced addresses triggers the page fault. > > > > Do you know how we could fix that issue? Perhaps introducing a > > AddressValidationLib (as Brian suggested previously) and use it within > > PeCoffSearchImageBase()? > > > > I'd also like to thank Brian & Jeff for all the support! > > > > Thanks > > Paulo > > > > Repo: https://github.com/pcacjr/edk2.git > > Branch: stacktrace_v5 > > > > Cc: Rick Bramley <richard.bramley@hp.com> > > Cc: Kimon Berlin <kimon.berlin@hp.com> > > Cc: Diego Medaglia <diego.meaglia@hp.com> > > Cc: Andrew Fish <afish@apple.com> > > Cc: Eric Dong <eric.dong@intel.com> > > Cc: Laszlo Ersek <lersek@redhat.com> > > Cc: Brian Johnson <brian.johnson@hpe.com> > > Cc: Jeff Fan <vanjeff_919@hotmail.com> > > Cc: Jiewen Yao <jiewen.yao@intel.com> > > Cc: Paulo Alcantara <paulo@hp.com> > > Contributed-under: TianoCore Contribution Agreement 1.1 > > Signed-off-by: Paulo Alcantara <paulo@paulo.ac> > > --- > > > > v1 -> v2: > > * Add IA32 arch support (GCC toolchain only) > > * Replace hard-coded stack alignment value (16) with > > CPU_STACK_ALIGNMENT. > > * Check for proper stack and frame pointer alignments. > > * Fix initialization of UnwoundStacksCount to 1. > > * Move GetPdbFileName() to common code since it will be used by both > > IA32 and X64 implementations. > > > > v2 -> v3: > > * Fixed wrong assumption about "RIP < ImageBase" to start searching > > for another PE/COFF image. That is, RIP may point to lower and > > higher addresses for any other PE/COFF images. Both IA32 & X64. > > (Thanks Andrew & Jiewen) > > * Fixed typo: unwond -> unwound. Both IA32 & X64. (Thanks Brian) > > > > v3 -> v4: > > * Validate all frame/stack pointer addresses before dereferencing them > > as requested by Brian & Jiewen. > > * Correctly print out IP addresses during the stack traces (by Jeff) > > > > v4 -> v5: > > * Fixed address calculations and improved code as suggested by Jiewen. > > * Fixed parameter validation as suggested by Brian. > > * Tested stack stack with IA32 PAE paging mode. > > > > Paulo Alcantara (8): > > UefiCpuPkg/CpuExceptionHandlerLib/X64: Add stack trace support > > UefiCpuPkg/CpuExceptionHandlerLib: Export GetPdbFileName() > > UefiCpuPkg/CpuExceptionHandlerLib/Ia32: Add stack trace support > > UefiCpuPkg/CpuExceptionHandlerLib: Add helper to validate memory > > addresses > > UefiCpuPkg/CpuExceptionHandlerLib: Ensure valid frame/stack pointers > > UefiCpuPkg/CpuExceptionHandlerLib: Correctly print IP addresses > > UefiCpuPkg/CpuExceptionHandlerLib: Validate memory address ranges > > UefiCpuPkg/CpuExceptionHandlerLib: Add early check in > > DumpStackContents > > > > UefiCpuPkg/Library/CpuExceptionHandlerLib/CpuExceptionCommon.c > > | 537 ++++++++++++++++++-- > > UefiCpuPkg/Library/CpuExceptionHandlerLib/CpuExceptionCommon.h > > | 59 ++- > > UefiCpuPkg/Library/CpuExceptionHandlerLib/Ia32/ArchExceptionHandler.c | > > 483 +++++++++++++++++- > > UefiCpuPkg/Library/CpuExceptionHandlerLib/X64/ArchExceptionHandler.c > | > > 426 +++++++++++++++- > > 4 files changed, 1435 insertions(+), 70 deletions(-) > > > > -- > > 2.14.3 > > > > _______________________________________________ > > edk2-devel mailing list > > edk2-devel@lists.01.org > > https://lists.01.org/mailman/listinfo/edk2-devel > _______________________________________________ > edk2-devel mailing list > edk2-devel@lists.01.org > https://lists.01.org/mailman/listinfo/edk2-devel _______________________________________________ edk2-devel mailing list edk2-devel@lists.01.org https://lists.01.org/mailman/listinfo/edk2-devel
Hi Jiewen, "Yao, Jiewen" <jiewen.yao@intel.com> writes: > I have some more thought on #3 for your consideration. > > Given the situation that we are doing heap guard feature, a *valid* > EIP address might not be enough to guarantee the address is inside of > an *executable* page. OK. > > I am thinking if we need check the non-executable bit (BIT63) as well for DumpImageModuleNames(). (No need for DumpStacktrace()). > We can add IsLogicalAddressExecutable() on top of > IsLogicalAddressValid(). OK. I can do that. > Also I do not object the idea to add check inside of PeCoffSearchImageBase(). > I am thinking in another way - we can let *caller* to input a validation function for the PeCoffLib, instead of let PeCoffLib depend on another library. > > For example: > PeCoffSearchImageBaseEx (Address, ValidateAddressFunc) Looks good to me! Whether or not we're making it into a external library in the future, we should at least get it working and well tested for the stack trace support. Thanks! Paulo >> -----Original Message----- >> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of Yao, >> Jiewen >> Sent: Wednesday, January 17, 2018 8:57 PM >> To: Paulo Alcantara <paulo@paulo.ac>; edk2-devel@lists.01.org >> Cc: Rick Bramley <richard.bramley@hp.com>; Dong, Eric >> <eric.dong@intel.com>; Kimon Berlin <kimon.berlin@hp.com>; Andrew Fish >> <afish@apple.com>; Diego Medaglia <diego.meaglia@hp.com>; Laszlo Ersek >> <lersek@redhat.com> >> Subject: Re: [edk2] [RFC v5 0/8] Stack trace support in X64 exception handling >> >> Thanks Paulo. >> >> The series looks really good. I give it thumb up. :-) >> >> The 8 patches keep updating 4 files, so I squash them when I review. The >> comment below is for the final code, not for a specific patch. >> >> 1) CpuExceptionCommon.c: IsLinearAddressRangeValid(). >> // >> // Check for valid input parameters >> // >> if (LinearAddressStart == 0 || LinearAddressEnd == 0 || >> LinearAddressStart > LinearAddressEnd) { >> return FALSE; >> } >> >> I think LinearAddressStart is a valid case. In BIOS, we do update IVT in 0 address >> when CSM is enabled. >> I do not think we need exclude it here. If we enabled ZeroPointer protection, it >> can be excluded in page table parsing. >> >> 2) I found below logic appears in 2 functions - DumpImageModuleNames() and >> DumpStacktrace(). Is that possible to merge them? >> We can calculate in DumpImageAndCpuContent() and use Eip as parameter. Just >> in case there is 3rd function need use EIP, it won't miss the calculation. >> (It is a bug fix we did recently.) >> >> // >> // Set current EIP address >> // >> if ((ExceptionType == EXCEPT_IA32_PAGE_FAULT) && >> ((SystemContext.SystemContextIa32->ExceptionData & >> IA32_PF_EC_ID) != 0)) { >> // >> // The EIP in SystemContext could not be used >> // if it is page fault with I/D set. >> // >> Eip = *(UINT32 *)(UINTN)SystemContext.SystemContextIa32->Esp; >> } else { >> Eip = SystemContext.SystemContextIa32->Eip; >> } >> >> 3) I am a little surprised on PeCoffSearchImageBase() issue. >> >> We have 4 PeCoffSearchImageBase() call in each arch. >> DumpImageModuleNames() calls twice and DumpStacktrace() calls twice. >> Do you know which specific one triggers the zero address #PF issue? >> >> >> C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\Ia32\Arch >> ExceptionHandler.c(547): ImageBase = PeCoffSearchImageBase (Eip); >> >> C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\Ia32\Arch >> ExceptionHandler.c(613): ImageBase = PeCoffSearchImageBase (Eip); >> >> C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\Ia32\Arch >> ExceptionHandler.c(682): ImageBase = PeCoffSearchImageBase (Eip); >> >> C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\Ia32\Arch >> ExceptionHandler.c(741): ImageBase = PeCoffSearchImageBase (Eip); >> >> C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\X64\Arch >> ExceptionHandler.c(540): ImageBase = PeCoffSearchImageBase (Rip); >> >> C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\X64\Arch >> ExceptionHandler.c(613): ImageBase = PeCoffSearchImageBase (Rip); >> >> C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\X64\Arch >> ExceptionHandler.c(710): ImageBase = PeCoffSearchImageBase (Rip); >> >> C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\X64\Arch >> ExceptionHandler.c(779): ImageBase = PeCoffSearchImageBase (Rip); >> >> The EIP from SystemContext seems good. I assume there is not a problem here. >> >> if ((ExceptionType == EXCEPT_IA32_PAGE_FAULT) && >> ((SystemContext.SystemContextIa32->ExceptionData & >> IA32_PF_EC_ID) != 0)) { >> // >> // The EIP in SystemContext could not be used >> // if it is page fault with I/D set. >> // >> Eip = *(UINT32 *)(UINTN)SystemContext.SystemContextIa32->Esp; >> } else { >> Eip = SystemContext.SystemContextIa32->Eip; >> } >> >> // >> // Get initial PE/COFF image base address from current EIP >> // >> ImageBase = PeCoffSearchImageBase (Eip); >> >> But the EIP from stack frame is unknown and risky. Especially for the code in >> mode-switch, such as PEI->DXE, UEFI->CSM, AP wakeup->AP C code, SMM >> entrypoint->SMM C code. >> Should we add a check for EIP here? >> >> // >> // Set EIP with return address from current stack frame >> // >> Eip = *(UINT32 *)((UINTN)Ebp + 4); >> >> // >> // If EIP is zero, then stop unwinding the stack >> // >> if (Eip == 0) { >> break; >> } >> >> // >> // Search for the respective PE/COFF image based on EIP >> // >> ImageBase = PeCoffSearchImageBase (Eip); >> >> If you can help us do some more investigation on the root-cause and narrow >> down the issue, I will appreciate that. >> >> We can decide how to fix once it is root-caused. >> >> Maybe we just do a simple IsLogicalAddressValid () check before we call >> PeCoffSearchImageBase(). :-) >> >> >> Thank you >> Yao Jiewen >> >> > -----Original Message----- >> > From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of >> Paulo >> > Alcantara >> > Sent: Monday, January 15, 2018 8:23 AM >> > To: edk2-devel@lists.01.org >> > Cc: Rick Bramley <richard.bramley@hp.com>; Dong, Eric >> > <eric.dong@intel.com>; Kimon Berlin <kimon.berlin@hp.com>; Andrew Fish >> > <afish@apple.com>; Yao, Jiewen <jiewen.yao@intel.com>; Diego Medaglia >> > <diego.meaglia@hp.com>; Laszlo Ersek <lersek@redhat.com> >> > Subject: [edk2] [RFC v5 0/8] Stack trace support in X64 exception handling >> > >> > Hi, >> > >> > This series adds stack trace support during IA32 and X64 CPU exceptions. >> > >> > Informations like back trace, stack contents and image module names >> > (that were part of the call stack) will be dumped out. >> > >> > The current limitation is that it relies on available frame pointers >> > (GCC only) in order to successfully unwind the stack. >> > >> > Jiewen, >> > >> > Thank you very much for your time on this. I've applied the changes you >> > suggested, as well as tested it on IA32 PAE paging mode - it worked as >> > expected. >> > >> > Other than that, I also tested the stack trace in SMM code by manually >> > calling CpuBreakPoint() and then it broke with another exception >> > (page fault). I didn't have much time to look into that, but what I've >> > observed is that the page fault ocurred during the search of PE/COFF >> > image base address (in PeCoffSearchImageBase). The function attempts to >> > search for the image base from "Address" through 0, so any of those >> > dereferenced addresses triggers the page fault. >> > >> > Do you know how we could fix that issue? Perhaps introducing a >> > AddressValidationLib (as Brian suggested previously) and use it within >> > PeCoffSearchImageBase()? >> > >> > I'd also like to thank Brian & Jeff for all the support! >> > >> > Thanks >> > Paulo >> > >> > Repo: https://github.com/pcacjr/edk2.git >> > Branch: stacktrace_v5 >> > >> > Cc: Rick Bramley <richard.bramley@hp.com> >> > Cc: Kimon Berlin <kimon.berlin@hp.com> >> > Cc: Diego Medaglia <diego.meaglia@hp.com> >> > Cc: Andrew Fish <afish@apple.com> >> > Cc: Eric Dong <eric.dong@intel.com> >> > Cc: Laszlo Ersek <lersek@redhat.com> >> > Cc: Brian Johnson <brian.johnson@hpe.com> >> > Cc: Jeff Fan <vanjeff_919@hotmail.com> >> > Cc: Jiewen Yao <jiewen.yao@intel.com> >> > Cc: Paulo Alcantara <paulo@hp.com> >> > Contributed-under: TianoCore Contribution Agreement 1.1 >> > Signed-off-by: Paulo Alcantara <paulo@paulo.ac> >> > --- >> > >> > v1 -> v2: >> > * Add IA32 arch support (GCC toolchain only) >> > * Replace hard-coded stack alignment value (16) with >> > CPU_STACK_ALIGNMENT. >> > * Check for proper stack and frame pointer alignments. >> > * Fix initialization of UnwoundStacksCount to 1. >> > * Move GetPdbFileName() to common code since it will be used by both >> > IA32 and X64 implementations. >> > >> > v2 -> v3: >> > * Fixed wrong assumption about "RIP < ImageBase" to start searching >> > for another PE/COFF image. That is, RIP may point to lower and >> > higher addresses for any other PE/COFF images. Both IA32 & X64. >> > (Thanks Andrew & Jiewen) >> > * Fixed typo: unwond -> unwound. Both IA32 & X64. (Thanks Brian) >> > >> > v3 -> v4: >> > * Validate all frame/stack pointer addresses before dereferencing them >> > as requested by Brian & Jiewen. >> > * Correctly print out IP addresses during the stack traces (by Jeff) >> > >> > v4 -> v5: >> > * Fixed address calculations and improved code as suggested by Jiewen. >> > * Fixed parameter validation as suggested by Brian. >> > * Tested stack stack with IA32 PAE paging mode. >> > >> > Paulo Alcantara (8): >> > UefiCpuPkg/CpuExceptionHandlerLib/X64: Add stack trace support >> > UefiCpuPkg/CpuExceptionHandlerLib: Export GetPdbFileName() >> > UefiCpuPkg/CpuExceptionHandlerLib/Ia32: Add stack trace support >> > UefiCpuPkg/CpuExceptionHandlerLib: Add helper to validate memory >> > addresses >> > UefiCpuPkg/CpuExceptionHandlerLib: Ensure valid frame/stack pointers >> > UefiCpuPkg/CpuExceptionHandlerLib: Correctly print IP addresses >> > UefiCpuPkg/CpuExceptionHandlerLib: Validate memory address ranges >> > UefiCpuPkg/CpuExceptionHandlerLib: Add early check in >> > DumpStackContents >> > >> > UefiCpuPkg/Library/CpuExceptionHandlerLib/CpuExceptionCommon.c >> > | 537 ++++++++++++++++++-- >> > UefiCpuPkg/Library/CpuExceptionHandlerLib/CpuExceptionCommon.h >> > | 59 ++- >> > UefiCpuPkg/Library/CpuExceptionHandlerLib/Ia32/ArchExceptionHandler.c | >> > 483 +++++++++++++++++- >> > UefiCpuPkg/Library/CpuExceptionHandlerLib/X64/ArchExceptionHandler.c >> | >> > 426 +++++++++++++++- >> > 4 files changed, 1435 insertions(+), 70 deletions(-) >> > >> > -- >> > 2.14.3 >> > >> > _______________________________________________ >> > edk2-devel mailing list >> > edk2-devel@lists.01.org >> > https://lists.01.org/mailman/listinfo/edk2-devel >> _______________________________________________ >> edk2-devel mailing list >> edk2-devel@lists.01.org >> https://lists.01.org/mailman/listinfo/edk2-devel _______________________________________________ edk2-devel mailing list edk2-devel@lists.01.org https://lists.01.org/mailman/listinfo/edk2-devel
"Yao, Jiewen" <jiewen.yao@intel.com> writes: Hi Jiewen, Thank you very much for teh review! My comments below: > The 8 patches keep updating 4 files, so I squash them when I review. The comment below is for the final code, not for a specific patch. > > 1) CpuExceptionCommon.c: IsLinearAddressRangeValid(). > // > // Check for valid input parameters > // > if (LinearAddressStart == 0 || LinearAddressEnd == 0 || > LinearAddressStart > LinearAddressEnd) { > return FALSE; > } > > I think LinearAddressStart is a valid case. In BIOS, we do update IVT in 0 address when CSM is enabled. > I do not think we need exclude it here. If we enabled ZeroPointer > protection, it can be excluded in page table parsing. OK - didn't know about it. I'll remove that check. > 2) I found below logic appears in 2 functions - DumpImageModuleNames() and DumpStacktrace(). Is that possible to merge them? > We can calculate in DumpImageAndCpuContent() and use Eip as parameter. Just in case there is 3rd function need use EIP, it won't miss the calculation. > (It is a bug fix we did recently.) > > // > // Set current EIP address > // > if ((ExceptionType == EXCEPT_IA32_PAGE_FAULT) && > ((SystemContext.SystemContextIa32->ExceptionData & IA32_PF_EC_ID) != 0)) { > // > // The EIP in SystemContext could not be used > // if it is page fault with I/D set. > // > Eip = *(UINT32 *)(UINTN)SystemContext.SystemContextIa32->Esp; > } else { > Eip = SystemContext.SystemContextIa32->Eip; > } Yes - I think it's possible and a good improvement. I'll do it. > 3) I am a little surprised on PeCoffSearchImageBase() issue. > > We have 4 PeCoffSearchImageBase() call in each arch. DumpImageModuleNames() calls twice and DumpStacktrace() calls twice. > Do you know which specific one triggers the zero address #PF issue? > > C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\Ia32\ArchExceptionHandler.c(547): ImageBase = PeCoffSearchImageBase (Eip); > C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\Ia32\ArchExceptionHandler.c(613): ImageBase = PeCoffSearchImageBase (Eip); > C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\Ia32\ArchExceptionHandler.c(682): ImageBase = PeCoffSearchImageBase (Eip); > C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\Ia32\ArchExceptionHandler.c(741): ImageBase = PeCoffSearchImageBase (Eip); > C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\X64\ArchExceptionHandler.c(540): ImageBase = PeCoffSearchImageBase (Rip); > C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\X64\ArchExceptionHandler.c(613): ImageBase = PeCoffSearchImageBase (Rip); > C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\X64\ArchExceptionHandler.c(710): ImageBase = PeCoffSearchImageBase (Rip); > C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\X64\ArchExceptionHandler.c(779): ImageBase = PeCoffSearchImageBase (Rip); > When I saw the #PF when testing stack trace in SMM, I was running out of time and I just saved the log file with the trace. I'm attaching the log for you, but I'm still going to look into that issue when time permits. I haven't looked on how the SMI entry is set up, but don't you think that we should do something similiar like in DXE and PEI phases with "push $0" and help the debugger or tracer know when stop unwinding the stack -- in DEBUG mode, at least? Of course, if that's really possible. > The EIP from SystemContext seems good. I assume there is not a problem here. > > if ((ExceptionType == EXCEPT_IA32_PAGE_FAULT) && > ((SystemContext.SystemContextIa32->ExceptionData & IA32_PF_EC_ID) != 0)) { > // > // The EIP in SystemContext could not be used > // if it is page fault with I/D set. > // > Eip = *(UINT32 *)(UINTN)SystemContext.SystemContextIa32->Esp; > } else { > Eip = SystemContext.SystemContextIa32->Eip; > } Possibly. I'll add some debug and see what the Eip looks like at this point. > // > // Get initial PE/COFF image base address from current EIP > // > ImageBase = PeCoffSearchImageBase (Eip); > > But the EIP from stack frame is unknown and risky. Especially for the code in mode-switch, such as PEI->DXE, UEFI->CSM, AP wakeup->AP C code, SMM entrypoint->SMM C code. > Should we add a check for EIP here? > > // > // Set EIP with return address from current stack frame > // > Eip = *(UINT32 *)((UINTN)Ebp + 4); > > // > // If EIP is zero, then stop unwinding the stack > // > if (Eip == 0) { > break; > } > > // > // Search for the respective PE/COFF image based on EIP > // > ImageBase = PeCoffSearchImageBase (Eip); I think so. There is no guarantee that the return address (Eip) will be valid even if we're handling a valid frame pointer. It might get corrupted at some point. Thanks for pointing it out! I'll validate them. > If you can help us do some more investigation on the root-cause and narrow down the issue, I will appreciate that. > > We can decide how to fix once it is root-caused. Sure. I will. Thanks for the comments! I learnt a lot of with them :-) Paulo > >> -----Original Message----- >> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of Paulo >> Alcantara >> Sent: Monday, January 15, 2018 8:23 AM >> To: edk2-devel@lists.01.org >> Cc: Rick Bramley <richard.bramley@hp.com>; Dong, Eric >> <eric.dong@intel.com>; Kimon Berlin <kimon.berlin@hp.com>; Andrew Fish >> <afish@apple.com>; Yao, Jiewen <jiewen.yao@intel.com>; Diego Medaglia >> <diego.meaglia@hp.com>; Laszlo Ersek <lersek@redhat.com> >> Subject: [edk2] [RFC v5 0/8] Stack trace support in X64 exception handling >> >> Hi, >> >> This series adds stack trace support during IA32 and X64 CPU exceptions. >> >> Informations like back trace, stack contents and image module names >> (that were part of the call stack) will be dumped out. >> >> The current limitation is that it relies on available frame pointers >> (GCC only) in order to successfully unwind the stack. >> >> Jiewen, >> >> Thank you very much for your time on this. I've applied the changes you >> suggested, as well as tested it on IA32 PAE paging mode - it worked as >> expected. >> >> Other than that, I also tested the stack trace in SMM code by manually >> calling CpuBreakPoint() and then it broke with another exception >> (page fault). I didn't have much time to look into that, but what I've >> observed is that the page fault ocurred during the search of PE/COFF >> image base address (in PeCoffSearchImageBase). The function attempts to >> search for the image base from "Address" through 0, so any of those >> dereferenced addresses triggers the page fault. >> >> Do you know how we could fix that issue? Perhaps introducing a >> AddressValidationLib (as Brian suggested previously) and use it within >> PeCoffSearchImageBase()? >> >> I'd also like to thank Brian & Jeff for all the support! >> >> Thanks >> Paulo >> >> Repo: https://github.com/pcacjr/edk2.git >> Branch: stacktrace_v5 >> >> Cc: Rick Bramley <richard.bramley@hp.com> >> Cc: Kimon Berlin <kimon.berlin@hp.com> >> Cc: Diego Medaglia <diego.meaglia@hp.com> >> Cc: Andrew Fish <afish@apple.com> >> Cc: Eric Dong <eric.dong@intel.com> >> Cc: Laszlo Ersek <lersek@redhat.com> >> Cc: Brian Johnson <brian.johnson@hpe.com> >> Cc: Jeff Fan <vanjeff_919@hotmail.com> >> Cc: Jiewen Yao <jiewen.yao@intel.com> >> Cc: Paulo Alcantara <paulo@hp.com> >> Contributed-under: TianoCore Contribution Agreement 1.1 >> Signed-off-by: Paulo Alcantara <paulo@paulo.ac> >> --- >> >> v1 -> v2: >> * Add IA32 arch support (GCC toolchain only) >> * Replace hard-coded stack alignment value (16) with >> CPU_STACK_ALIGNMENT. >> * Check for proper stack and frame pointer alignments. >> * Fix initialization of UnwoundStacksCount to 1. >> * Move GetPdbFileName() to common code since it will be used by both >> IA32 and X64 implementations. >> >> v2 -> v3: >> * Fixed wrong assumption about "RIP < ImageBase" to start searching >> for another PE/COFF image. That is, RIP may point to lower and >> higher addresses for any other PE/COFF images. Both IA32 & X64. >> (Thanks Andrew & Jiewen) >> * Fixed typo: unwond -> unwound. Both IA32 & X64. (Thanks Brian) >> >> v3 -> v4: >> * Validate all frame/stack pointer addresses before dereferencing them >> as requested by Brian & Jiewen. >> * Correctly print out IP addresses during the stack traces (by Jeff) >> >> v4 -> v5: >> * Fixed address calculations and improved code as suggested by Jiewen. >> * Fixed parameter validation as suggested by Brian. >> * Tested stack stack with IA32 PAE paging mode. >> >> Paulo Alcantara (8): >> UefiCpuPkg/CpuExceptionHandlerLib/X64: Add stack trace support >> UefiCpuPkg/CpuExceptionHandlerLib: Export GetPdbFileName() >> UefiCpuPkg/CpuExceptionHandlerLib/Ia32: Add stack trace support >> UefiCpuPkg/CpuExceptionHandlerLib: Add helper to validate memory >> addresses >> UefiCpuPkg/CpuExceptionHandlerLib: Ensure valid frame/stack pointers >> UefiCpuPkg/CpuExceptionHandlerLib: Correctly print IP addresses >> UefiCpuPkg/CpuExceptionHandlerLib: Validate memory address ranges >> UefiCpuPkg/CpuExceptionHandlerLib: Add early check in >> DumpStackContents >> >> UefiCpuPkg/Library/CpuExceptionHandlerLib/CpuExceptionCommon.c >> | 537 ++++++++++++++++++-- >> UefiCpuPkg/Library/CpuExceptionHandlerLib/CpuExceptionCommon.h >> | 59 ++- >> UefiCpuPkg/Library/CpuExceptionHandlerLib/Ia32/ArchExceptionHandler.c | >> 483 +++++++++++++++++- >> UefiCpuPkg/Library/CpuExceptionHandlerLib/X64/ArchExceptionHandler.c | >> 426 +++++++++++++++- >> 4 files changed, 1435 insertions(+), 70 deletions(-) >> >> -- >> 2.14.3 >> >> _______________________________________________ >> edk2-devel mailing list >> edk2-devel@lists.01.org >> https://lists.01.org/mailman/listinfo/edk2-devel _______________________________________________ edk2-devel mailing list edk2-devel@lists.01.org https://lists.01.org/mailman/listinfo/edk2-devel
Paulo Alcantara <paulo@paulo.ac> writes: >> 3) I am a little surprised on PeCoffSearchImageBase() issue. >> >> We have 4 PeCoffSearchImageBase() call in each arch. DumpImageModuleNames() calls twice and DumpStacktrace() calls twice. >> Do you know which specific one triggers the zero address #PF issue? >> >> C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\Ia32\ArchExceptionHandler.c(547): ImageBase = PeCoffSearchImageBase (Eip); >> C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\Ia32\ArchExceptionHandler.c(613): ImageBase = PeCoffSearchImageBase (Eip); >> C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\Ia32\ArchExceptionHandler.c(682): ImageBase = PeCoffSearchImageBase (Eip); >> C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\Ia32\ArchExceptionHandler.c(741): ImageBase = PeCoffSearchImageBase (Eip); >> C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\X64\ArchExceptionHandler.c(540): ImageBase = PeCoffSearchImageBase (Rip); >> C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\X64\ArchExceptionHandler.c(613): ImageBase = PeCoffSearchImageBase (Rip); >> C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\X64\ArchExceptionHandler.c(710): ImageBase = PeCoffSearchImageBase (Rip); >> C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\X64\ArchExceptionHandler.c(779): ImageBase = PeCoffSearchImageBase (Rip); >> > > When I saw the #PF when testing stack trace in SMM, I was running out of > time and I just saved the log file with the trace. I'm attaching the > log for you, but I'm still going to look into that issue when time > permits. Forgot to attach the log file. Done. :-) !!!! X64 Exception Type - 03(#BP - Breakpoint) CPU Apic ID - 00000000 !!!! RIP - 000000007FF48151, CS - 0000000000000038, RFLAGS - 0000000000000046 RAX - 0000000000000000, RCX - 000000007FEBB020, RDX - 0000000000040000 RBX - 0000000000000000, RSP - 000000007FF7B760, RBP - 000000007FF7B760 RSI - 000000007FF44018, RDI - 000000007FEBB018 R8 - 00000000000000FF, R9 - 0000000000048400, R10 - 00000000000000C9 R11 - 0000000000000069, R12 - 0000000000000000, R13 - 0000000000000000 R14 - 0000000000000000, R15 - 0000000000000000 DS - 0000000000000020, ES - 0000000000000020, FS - 0000000000000020 GS - 0000000000000020, SS - 0000000000000020 CR0 - 0000000080010033, CR2 - 0000000000000000, CR3 - 000000007FF6C000 CR4 - 0000000000000668, CR8 - 0000000000000000 DR0 - 0000000000000000, DR1 - 0000000000000000, DR2 - 0000000000000000 DR3 - 0000000000000000, DR6 - 00000000FFFF0FF0, DR7 - 0000000000000400 GDTR - 000000007FF6B000 000000000000004F, LDTR - 0000000000000000 IDTR - 000000007FF75000 00000000000001FF, TR - 0000000000000040 FXSAVE_STATE - 000000007FF7B3C0 Call trace: 0 0x000000007FF48151 @ 0x000000007FF46000+0x2151 (0x000000007FF7B760) in VariableSmm.dll 1 0x000000007FF54E35 @ 0x000000007FF46000+0xEE35 (0x000000007FF7B7D0) in VariableSmm.dll 2 0x000000007FF5659E @ 0x000000007FF46000+0x1059E (0x000000007FF7B800) in VariableSmm.dll 3 0x000000007FF4BCFE @ 0x000000007FF46000+0x5CFE (0x000000007FF7B840) in VariableSmm.dll 4 0x000000007FFE9BFA @ 0x000000007FFDB000+0xEBFA (0x000000007FF7B8A0) in PiSmmCore.dll 5 0x000000007FFEAAC4 @ 0x000000007FFDB000+0xFAC4 (0x000000007FF7B9C0) in PiSmmCore.dll 6 0x000000007FFEADD1 @ 0x000000007FFDB000+0xFDD1 (0x000000007FF7BA20) in PiSmmCore.dll 7 0x000000007FFE43F8 @ 0x000000007FFDB000+0x93F8 (0x000000007FF7BA90) in PiSmmCore.dll 8 0x000000007FFA8B9A @ 0x000000007FF9C000+0xCB9A (0x000000007FF7BD50) in PiSmmCpuDxeSmm.dll 9 0x000000007FFA9E92 @ 0x000000007FF9C000+0xDE92 (0x000000007FF7BDC0) in PiSmmCpuDxeSmm.dll !!!! X64 Exception Type - 0E(#PF - Page-Fault) CPU Apic ID - 00000000 !!!! ExceptionData - 0000000000000000 I:0 R:0 U:0 W:0 P:0 PK:0 S:0 RIP - 000000007FFA1BBE, CS - 0000000000000038, RFLAGS - 0000000000010006 RAX - 000000007FF89FFC, RCX - 000000007FF8E149, RDX - FFFFFFFFFFFFFFFF RBX - 0000000000000000, RSP - 000000007FF7B1A0, RBP - 000000007FF7B1E0 RSI - 000000007FFA9E92, RDI - 000000007FF9C000 R8 - 0000000000000001, R9 - 0000000000000000, R10 - 0000000000000000 R11 - 0000000000000069, R12 - 0000000000000000, R13 - 0000000000000000 R14 - 0000000000000000, R15 - 0000000000000000 DS - 0000000000000020, ES - 0000000000000020, FS - 0000000000000020 GS - 0000000000000020, SS - 0000000000000020 CR0 - 0000000080010033, CR2 - 000000007FF89FFC, CR3 - 000000007FF6C000 CR4 - 0000000000000668, CR8 - 0000000000000000 DR0 - 0000000000000000, DR1 - 0000000000000000, DR2 - 0000000000000000 DR3 - 0000000000000000, DR6 - 00000000FFFF0FF0, DR7 - 0000000000000400 GDTR - 000000007FF6B000 000000000000004F, LDTR - 0000000000000000 IDTR - 000000007FF75000 00000000000001FF, TR - 0000000000000040 FXSAVE_STATE - 000000007FF76C60 Paulo >>> -----Original Message----- >>> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of Paulo >>> Alcantara >>> Sent: Monday, January 15, 2018 8:23 AM >>> To: edk2-devel@lists.01.org >>> Cc: Rick Bramley <richard.bramley@hp.com>; Dong, Eric >>> <eric.dong@intel.com>; Kimon Berlin <kimon.berlin@hp.com>; Andrew Fish >>> <afish@apple.com>; Yao, Jiewen <jiewen.yao@intel.com>; Diego Medaglia >>> <diego.meaglia@hp.com>; Laszlo Ersek <lersek@redhat.com> >>> Subject: [edk2] [RFC v5 0/8] Stack trace support in X64 exception handling >>> >>> Hi, >>> >>> This series adds stack trace support during IA32 and X64 CPU exceptions. >>> >>> Informations like back trace, stack contents and image module names >>> (that were part of the call stack) will be dumped out. >>> >>> The current limitation is that it relies on available frame pointers >>> (GCC only) in order to successfully unwind the stack. >>> >>> Jiewen, >>> >>> Thank you very much for your time on this. I've applied the changes you >>> suggested, as well as tested it on IA32 PAE paging mode - it worked as >>> expected. >>> >>> Other than that, I also tested the stack trace in SMM code by manually >>> calling CpuBreakPoint() and then it broke with another exception >>> (page fault). I didn't have much time to look into that, but what I've >>> observed is that the page fault ocurred during the search of PE/COFF >>> image base address (in PeCoffSearchImageBase). The function attempts to >>> search for the image base from "Address" through 0, so any of those >>> dereferenced addresses triggers the page fault. >>> >>> Do you know how we could fix that issue? Perhaps introducing a >>> AddressValidationLib (as Brian suggested previously) and use it within >>> PeCoffSearchImageBase()? >>> >>> I'd also like to thank Brian & Jeff for all the support! >>> >>> Thanks >>> Paulo >>> >>> Repo: https://github.com/pcacjr/edk2.git >>> Branch: stacktrace_v5 >>> >>> Cc: Rick Bramley <richard.bramley@hp.com> >>> Cc: Kimon Berlin <kimon.berlin@hp.com> >>> Cc: Diego Medaglia <diego.meaglia@hp.com> >>> Cc: Andrew Fish <afish@apple.com> >>> Cc: Eric Dong <eric.dong@intel.com> >>> Cc: Laszlo Ersek <lersek@redhat.com> >>> Cc: Brian Johnson <brian.johnson@hpe.com> >>> Cc: Jeff Fan <vanjeff_919@hotmail.com> >>> Cc: Jiewen Yao <jiewen.yao@intel.com> >>> Cc: Paulo Alcantara <paulo@hp.com> >>> Contributed-under: TianoCore Contribution Agreement 1.1 >>> Signed-off-by: Paulo Alcantara <paulo@paulo.ac> >>> --- >>> >>> v1 -> v2: >>> * Add IA32 arch support (GCC toolchain only) >>> * Replace hard-coded stack alignment value (16) with >>> CPU_STACK_ALIGNMENT. >>> * Check for proper stack and frame pointer alignments. >>> * Fix initialization of UnwoundStacksCount to 1. >>> * Move GetPdbFileName() to common code since it will be used by both >>> IA32 and X64 implementations. >>> >>> v2 -> v3: >>> * Fixed wrong assumption about "RIP < ImageBase" to start searching >>> for another PE/COFF image. That is, RIP may point to lower and >>> higher addresses for any other PE/COFF images. Both IA32 & X64. >>> (Thanks Andrew & Jiewen) >>> * Fixed typo: unwond -> unwound. Both IA32 & X64. (Thanks Brian) >>> >>> v3 -> v4: >>> * Validate all frame/stack pointer addresses before dereferencing them >>> as requested by Brian & Jiewen. >>> * Correctly print out IP addresses during the stack traces (by Jeff) >>> >>> v4 -> v5: >>> * Fixed address calculations and improved code as suggested by Jiewen. >>> * Fixed parameter validation as suggested by Brian. >>> * Tested stack stack with IA32 PAE paging mode. >>> >>> Paulo Alcantara (8): >>> UefiCpuPkg/CpuExceptionHandlerLib/X64: Add stack trace support >>> UefiCpuPkg/CpuExceptionHandlerLib: Export GetPdbFileName() >>> UefiCpuPkg/CpuExceptionHandlerLib/Ia32: Add stack trace support >>> UefiCpuPkg/CpuExceptionHandlerLib: Add helper to validate memory >>> addresses >>> UefiCpuPkg/CpuExceptionHandlerLib: Ensure valid frame/stack pointers >>> UefiCpuPkg/CpuExceptionHandlerLib: Correctly print IP addresses >>> UefiCpuPkg/CpuExceptionHandlerLib: Validate memory address ranges >>> UefiCpuPkg/CpuExceptionHandlerLib: Add early check in >>> DumpStackContents >>> >>> UefiCpuPkg/Library/CpuExceptionHandlerLib/CpuExceptionCommon.c >>> | 537 ++++++++++++++++++-- >>> UefiCpuPkg/Library/CpuExceptionHandlerLib/CpuExceptionCommon.h >>> | 59 ++- >>> UefiCpuPkg/Library/CpuExceptionHandlerLib/Ia32/ArchExceptionHandler.c | >>> 483 +++++++++++++++++- >>> UefiCpuPkg/Library/CpuExceptionHandlerLib/X64/ArchExceptionHandler.c | >>> 426 +++++++++++++++- >>> 4 files changed, 1435 insertions(+), 70 deletions(-) >>> >>> -- >>> 2.14.3 >>> >>> _______________________________________________ >>> edk2-devel mailing list >>> edk2-devel@lists.01.org >>> https://lists.01.org/mailman/listinfo/edk2-devel _______________________________________________ edk2-devel mailing list edk2-devel@lists.01.org https://lists.01.org/mailman/listinfo/edk2-devel
Hi Jiewen, On 1/17/2018 10:57 AM, Yao, Jiewen wrote: > Thanks Paulo. > > The series looks really good. I give it thumb up. :-) > > The 8 patches keep updating 4 files, so I squash them when I review. The comment below is for the final code, not for a specific patch. > > 1) CpuExceptionCommon.c: IsLinearAddressRangeValid(). > // > // Check for valid input parameters > // > if (LinearAddressStart == 0 || LinearAddressEnd == 0 || > LinearAddressStart > LinearAddressEnd) { > return FALSE; > } > > I think LinearAddressStart is a valid case. In BIOS, we do update IVT in 0 address when CSM is enabled. > I do not think we need exclude it here. If we enabled ZeroPointer protection, it can be excluded in page table parsing. > > 2) I found below logic appears in 2 functions - DumpImageModuleNames() and DumpStacktrace(). Is that possible to merge them? > We can calculate in DumpImageAndCpuContent() and use Eip as parameter. Just in case there is 3rd function need use EIP, it won't miss the calculation. > (It is a bug fix we did recently.) > > // > // Set current EIP address > // > if ((ExceptionType == EXCEPT_IA32_PAGE_FAULT) && > ((SystemContext.SystemContextIa32->ExceptionData & IA32_PF_EC_ID) != 0)) { > // > // The EIP in SystemContext could not be used > // if it is page fault with I/D set. > // > Eip = *(UINT32 *)(UINTN)SystemContext.SystemContextIa32->Esp; > } else { > Eip = SystemContext.SystemContextIa32->Eip; > } > > 3) I am a little surprised on PeCoffSearchImageBase() issue. > > We have 4 PeCoffSearchImageBase() call in each arch. DumpImageModuleNames() calls twice and DumpStacktrace() calls twice. > Do you know which specific one triggers the zero address #PF issue? > > C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\Ia32\ArchExceptionHandler.c(547): ImageBase = PeCoffSearchImageBase (Eip); > C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\Ia32\ArchExceptionHandler.c(613): ImageBase = PeCoffSearchImageBase (Eip); > C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\Ia32\ArchExceptionHandler.c(682): ImageBase = PeCoffSearchImageBase (Eip); > C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\Ia32\ArchExceptionHandler.c(741): ImageBase = PeCoffSearchImageBase (Eip); > C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\X64\ArchExceptionHandler.c(540): ImageBase = PeCoffSearchImageBase (Rip); > C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\X64\ArchExceptionHandler.c(613): ImageBase = PeCoffSearchImageBase (Rip); > C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\X64\ArchExceptionHandler.c(710): ImageBase = PeCoffSearchImageBase (Rip); > C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\X64\ArchExceptionHandler.c(779): ImageBase = PeCoffSearchImageBase (Rip); > > The EIP from SystemContext seems good. I assume there is not a problem here. > > if ((ExceptionType == EXCEPT_IA32_PAGE_FAULT) && > ((SystemContext.SystemContextIa32->ExceptionData & IA32_PF_EC_ID) != 0)) { > // > // The EIP in SystemContext could not be used > // if it is page fault with I/D set. > // > Eip = *(UINT32 *)(UINTN)SystemContext.SystemContextIa32->Esp; > } else { > Eip = SystemContext.SystemContextIa32->Eip; > } > > // > // Get initial PE/COFF image base address from current EIP > // > ImageBase = PeCoffSearchImageBase (Eip); > > But the EIP from stack frame is unknown and risky. Especially for the code in mode-switch, such as PEI->DXE, UEFI->CSM, AP wakeup->AP C code, SMM entrypoint->SMM C code. > Should we add a check for EIP here? > > // > // Set EIP with return address from current stack frame > // > Eip = *(UINT32 *)((UINTN)Ebp + 4); > > // > // If EIP is zero, then stop unwinding the stack > // > if (Eip == 0) { > break; > } > > // > // Search for the respective PE/COFF image based on EIP > // > ImageBase = PeCoffSearchImageBase (Eip); > > If you can help us do some more investigation on the root-cause and narrow down the issue, I will appreciate that. > > We can decide how to fix once it is root-caused. > > Maybe we just do a simple IsLogicalAddressValid () check before we call PeCoffSearchImageBase(). :-) I was able to work a little bit on stack trace support yesterday, and here's what I came up with by following your suggestions: 1. Centralized the calculation of initial EIP in DumpImageAndCpuContent() 2. LinearAddressStart 0 is now a valid case 3. Validate all return addresses (EIP/RIP) from every frame pointer. Additionally, I run a few quick tests by generating exceptions from DXE and SMM and here's my result: OVMF X64: DXE & SMM worked. OVMF X64 IA32: DXE & SMM worked. OVMF IA32: DXE worked, but SMM not. SMM was failing in both X64 and IA32 - however when I started validating the return addresses from the frame pointers and, when we find an invalid return address (e.g. last valid frame), we stop the trace. With that, SMM is now working in X64. Since the page fault is only occurring in IA32 with no paging enabled (default case), I suspect that when we don't have paging, we are unable to successfully validate the memory address when it's i.e. outside SMRAM - that is, we don't know when to stop unwinding the stack. Any ideas on what might be causing that? Other than that, I'll try to do some more tests this weekend and come back with the results. (That's why I didn't send out the v6 yet) Repo: https://git.paulo.ac/pub/edk2.git Branch: stacktrace_v6 Thanks Paulo > > > Thank you > Yao Jiewen > >> -----Original Message----- >> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of Paulo >> Alcantara >> Sent: Monday, January 15, 2018 8:23 AM >> To: edk2-devel@lists.01.org >> Cc: Rick Bramley <richard.bramley@hp.com>; Dong, Eric >> <eric.dong@intel.com>; Kimon Berlin <kimon.berlin@hp.com>; Andrew Fish >> <afish@apple.com>; Yao, Jiewen <jiewen.yao@intel.com>; Diego Medaglia >> <diego.meaglia@hp.com>; Laszlo Ersek <lersek@redhat.com> >> Subject: [edk2] [RFC v5 0/8] Stack trace support in X64 exception handling >> >> Hi, >> >> This series adds stack trace support during IA32 and X64 CPU exceptions. >> >> Informations like back trace, stack contents and image module names >> (that were part of the call stack) will be dumped out. >> >> The current limitation is that it relies on available frame pointers >> (GCC only) in order to successfully unwind the stack. >> >> Jiewen, >> >> Thank you very much for your time on this. I've applied the changes you >> suggested, as well as tested it on IA32 PAE paging mode - it worked as >> expected. >> >> Other than that, I also tested the stack trace in SMM code by manually >> calling CpuBreakPoint() and then it broke with another exception >> (page fault). I didn't have much time to look into that, but what I've >> observed is that the page fault ocurred during the search of PE/COFF >> image base address (in PeCoffSearchImageBase). The function attempts to >> search for the image base from "Address" through 0, so any of those >> dereferenced addresses triggers the page fault. >> >> Do you know how we could fix that issue? Perhaps introducing a >> AddressValidationLib (as Brian suggested previously) and use it within >> PeCoffSearchImageBase()? >> >> I'd also like to thank Brian & Jeff for all the support! >> >> Thanks >> Paulo >> >> Repo: https://github.com/pcacjr/edk2.git >> Branch: stacktrace_v5 >> >> Cc: Rick Bramley <richard.bramley@hp.com> >> Cc: Kimon Berlin <kimon.berlin@hp.com> >> Cc: Diego Medaglia <diego.meaglia@hp.com> >> Cc: Andrew Fish <afish@apple.com> >> Cc: Eric Dong <eric.dong@intel.com> >> Cc: Laszlo Ersek <lersek@redhat.com> >> Cc: Brian Johnson <brian.johnson@hpe.com> >> Cc: Jeff Fan <vanjeff_919@hotmail.com> >> Cc: Jiewen Yao <jiewen.yao@intel.com> >> Cc: Paulo Alcantara <paulo@hp.com> >> Contributed-under: TianoCore Contribution Agreement 1.1 >> Signed-off-by: Paulo Alcantara <paulo@paulo.ac> >> --- >> >> v1 -> v2: >> * Add IA32 arch support (GCC toolchain only) >> * Replace hard-coded stack alignment value (16) with >> CPU_STACK_ALIGNMENT. >> * Check for proper stack and frame pointer alignments. >> * Fix initialization of UnwoundStacksCount to 1. >> * Move GetPdbFileName() to common code since it will be used by both >> IA32 and X64 implementations. >> >> v2 -> v3: >> * Fixed wrong assumption about "RIP < ImageBase" to start searching >> for another PE/COFF image. That is, RIP may point to lower and >> higher addresses for any other PE/COFF images. Both IA32 & X64. >> (Thanks Andrew & Jiewen) >> * Fixed typo: unwond -> unwound. Both IA32 & X64. (Thanks Brian) >> >> v3 -> v4: >> * Validate all frame/stack pointer addresses before dereferencing them >> as requested by Brian & Jiewen. >> * Correctly print out IP addresses during the stack traces (by Jeff) >> >> v4 -> v5: >> * Fixed address calculations and improved code as suggested by Jiewen. >> * Fixed parameter validation as suggested by Brian. >> * Tested stack stack with IA32 PAE paging mode. >> >> Paulo Alcantara (8): >> UefiCpuPkg/CpuExceptionHandlerLib/X64: Add stack trace support >> UefiCpuPkg/CpuExceptionHandlerLib: Export GetPdbFileName() >> UefiCpuPkg/CpuExceptionHandlerLib/Ia32: Add stack trace support >> UefiCpuPkg/CpuExceptionHandlerLib: Add helper to validate memory >> addresses >> UefiCpuPkg/CpuExceptionHandlerLib: Ensure valid frame/stack pointers >> UefiCpuPkg/CpuExceptionHandlerLib: Correctly print IP addresses >> UefiCpuPkg/CpuExceptionHandlerLib: Validate memory address ranges >> UefiCpuPkg/CpuExceptionHandlerLib: Add early check in >> DumpStackContents >> >> UefiCpuPkg/Library/CpuExceptionHandlerLib/CpuExceptionCommon.c >> | 537 ++++++++++++++++++-- >> UefiCpuPkg/Library/CpuExceptionHandlerLib/CpuExceptionCommon.h >> | 59 ++- >> UefiCpuPkg/Library/CpuExceptionHandlerLib/Ia32/ArchExceptionHandler.c | >> 483 +++++++++++++++++- >> UefiCpuPkg/Library/CpuExceptionHandlerLib/X64/ArchExceptionHandler.c | >> 426 +++++++++++++++- >> 4 files changed, 1435 insertions(+), 70 deletions(-) >> >> -- >> 2.14.3 >> >> _______________________________________________ >> edk2-devel mailing list >> edk2-devel@lists.01.org >> https://lists.01.org/mailman/listinfo/edk2-devel _______________________________________________ edk2-devel mailing list edk2-devel@lists.01.org https://lists.01.org/mailman/listinfo/edk2-devel
Thanks Paulo. ======================= OVMF IA32: DXE worked, but SMM not. Since the page fault is only occurring in IA32 with no paging enabled (default case), I suspect that when we don't have paging, we are unable to successfully validate the memory address when it's i.e. outside SMRAM - that is, we don't know when to stop unwinding the stack. ======================= For IA32 SMM, I am a little confused. We unconditionally setup page table for SMM, no matter it is IA32 or X64. If you find a SMM driver running in a page-disable environment, it means, the SMM CORE load the driver in SMRAM, but the SMM CPU driver has not rebased the CPU yet. SMM driver is still using the PageTable/GDT/IDT setup by DXE CPU driver, not SMM CPU driver. Would you please double confirm what you have observed? You can just check the boot log file to see if PiSmmCpu driver has run or not. Thank you Yao Jiewen > -----Original Message----- > From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of Paulo > Alcantara > Sent: Monday, January 29, 2018 9:38 PM > To: Yao, Jiewen <jiewen.yao@intel.com>; edk2-devel@lists.01.org > Cc: Rick Bramley <richard.bramley@hp.com>; Dong, Eric > <eric.dong@intel.com>; Kimon Berlin <kimon.berlin@hp.com>; Andrew Fish > <afish@apple.com>; Diego Medaglia <diego.meaglia@hp.com>; Laszlo Ersek > <lersek@redhat.com> > Subject: Re: [edk2] [RFC v5 0/8] Stack trace support in X64 exception handling > > Hi Jiewen, > > On 1/17/2018 10:57 AM, Yao, Jiewen wrote: > > Thanks Paulo. > > > > The series looks really good. I give it thumb up. :-) > > > > The 8 patches keep updating 4 files, so I squash them when I review. The > comment below is for the final code, not for a specific patch. > > > > 1) CpuExceptionCommon.c: IsLinearAddressRangeValid(). > > // > > // Check for valid input parameters > > // > > if (LinearAddressStart == 0 || LinearAddressEnd == 0 || > > LinearAddressStart > LinearAddressEnd) { > > return FALSE; > > } > > > > I think LinearAddressStart is a valid case. In BIOS, we do update IVT in 0 > address when CSM is enabled. > > I do not think we need exclude it here. If we enabled ZeroPointer protection, it > can be excluded in page table parsing. > > > > 2) I found below logic appears in 2 functions - DumpImageModuleNames() and > DumpStacktrace(). Is that possible to merge them? > > We can calculate in DumpImageAndCpuContent() and use Eip as parameter. > Just in case there is 3rd function need use EIP, it won't miss the calculation. > > (It is a bug fix we did recently.) > > > > // > > // Set current EIP address > > // > > if ((ExceptionType == EXCEPT_IA32_PAGE_FAULT) && > > ((SystemContext.SystemContextIa32->ExceptionData & > IA32_PF_EC_ID) != 0)) { > > // > > // The EIP in SystemContext could not be used > > // if it is page fault with I/D set. > > // > > Eip = *(UINT32 *)(UINTN)SystemContext.SystemContextIa32->Esp; > > } else { > > Eip = SystemContext.SystemContextIa32->Eip; > > } > > > > 3) I am a little surprised on PeCoffSearchImageBase() issue. > > > > We have 4 PeCoffSearchImageBase() call in each arch. > DumpImageModuleNames() calls twice and DumpStacktrace() calls twice. > > Do you know which specific one triggers the zero address #PF issue? > > > > > C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\Ia32\Arch > ExceptionHandler.c(547): ImageBase = PeCoffSearchImageBase (Eip); > > > C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\Ia32\Arch > ExceptionHandler.c(613): ImageBase = PeCoffSearchImageBase (Eip); > > > C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\Ia32\Arch > ExceptionHandler.c(682): ImageBase = PeCoffSearchImageBase (Eip); > > > C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\Ia32\Arch > ExceptionHandler.c(741): ImageBase = PeCoffSearchImageBase (Eip); > > > C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\X64\Arch > ExceptionHandler.c(540): ImageBase = PeCoffSearchImageBase (Rip); > > > C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\X64\Arch > ExceptionHandler.c(613): ImageBase = PeCoffSearchImageBase (Rip); > > > C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\X64\Arch > ExceptionHandler.c(710): ImageBase = PeCoffSearchImageBase (Rip); > > > C:\home\EdkIIGit\edk2\UefiCpuPkg\Library\CpuExceptionHandlerLib\X64\Arch > ExceptionHandler.c(779): ImageBase = PeCoffSearchImageBase (Rip); > > > > The EIP from SystemContext seems good. I assume there is not a problem here. > > > > if ((ExceptionType == EXCEPT_IA32_PAGE_FAULT) && > > ((SystemContext.SystemContextIa32->ExceptionData & > IA32_PF_EC_ID) != 0)) { > > // > > // The EIP in SystemContext could not be used > > // if it is page fault with I/D set. > > // > > Eip = *(UINT32 *)(UINTN)SystemContext.SystemContextIa32->Esp; > > } else { > > Eip = SystemContext.SystemContextIa32->Eip; > > } > > > > // > > // Get initial PE/COFF image base address from current EIP > > // > > ImageBase = PeCoffSearchImageBase (Eip); > > > > But the EIP from stack frame is unknown and risky. Especially for the code in > mode-switch, such as PEI->DXE, UEFI->CSM, AP wakeup->AP C code, SMM > entrypoint->SMM C code. > > Should we add a check for EIP here? > > > > // > > // Set EIP with return address from current stack frame > > // > > Eip = *(UINT32 *)((UINTN)Ebp + 4); > > > > // > > // If EIP is zero, then stop unwinding the stack > > // > > if (Eip == 0) { > > break; > > } > > > > // > > // Search for the respective PE/COFF image based on EIP > > // > > ImageBase = PeCoffSearchImageBase (Eip); > > > > If you can help us do some more investigation on the root-cause and narrow > down the issue, I will appreciate that. > > > > We can decide how to fix once it is root-caused. > > > > Maybe we just do a simple IsLogicalAddressValid () check before we call > PeCoffSearchImageBase(). :-) > > I was able to work a little bit on stack trace support yesterday, and > here's what I came up with by following your suggestions: > > 1. Centralized the calculation of initial EIP in DumpImageAndCpuContent() > > 2. LinearAddressStart 0 is now a valid case > > 3. Validate all return addresses (EIP/RIP) from every frame pointer. > > Additionally, I run a few quick tests by generating exceptions from DXE > and SMM and here's my result: > > OVMF X64: DXE & SMM worked. > OVMF X64 IA32: DXE & SMM worked. > OVMF IA32: DXE worked, but SMM not. > > SMM was failing in both X64 and IA32 - however when I started validating > the return addresses from the frame pointers and, when we find an > invalid return address (e.g. last valid frame), we stop the trace. With > that, SMM is now working in X64. > > Since the page fault is only occurring in IA32 with no paging enabled > (default case), I suspect that when we don't have paging, we are unable > to successfully validate the memory address when it's i.e. outside SMRAM > - that is, we don't know when to stop unwinding the stack. > > Any ideas on what might be causing that? > > Other than that, I'll try to do some more tests this weekend and come > back with the results. (That's why I didn't send out the v6 yet) > > Repo: https://git.paulo.ac/pub/edk2.git > Branch: stacktrace_v6 > > Thanks > Paulo > > > > > > > Thank you > > Yao Jiewen > > > >> -----Original Message----- > >> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of > Paulo > >> Alcantara > >> Sent: Monday, January 15, 2018 8:23 AM > >> To: edk2-devel@lists.01.org > >> Cc: Rick Bramley <richard.bramley@hp.com>; Dong, Eric > >> <eric.dong@intel.com>; Kimon Berlin <kimon.berlin@hp.com>; Andrew Fish > >> <afish@apple.com>; Yao, Jiewen <jiewen.yao@intel.com>; Diego Medaglia > >> <diego.meaglia@hp.com>; Laszlo Ersek <lersek@redhat.com> > >> Subject: [edk2] [RFC v5 0/8] Stack trace support in X64 exception handling > >> > >> Hi, > >> > >> This series adds stack trace support during IA32 and X64 CPU exceptions. > >> > >> Informations like back trace, stack contents and image module names > >> (that were part of the call stack) will be dumped out. > >> > >> The current limitation is that it relies on available frame pointers > >> (GCC only) in order to successfully unwind the stack. > >> > >> Jiewen, > >> > >> Thank you very much for your time on this. I've applied the changes you > >> suggested, as well as tested it on IA32 PAE paging mode - it worked as > >> expected. > >> > >> Other than that, I also tested the stack trace in SMM code by manually > >> calling CpuBreakPoint() and then it broke with another exception > >> (page fault). I didn't have much time to look into that, but what I've > >> observed is that the page fault ocurred during the search of PE/COFF > >> image base address (in PeCoffSearchImageBase). The function attempts to > >> search for the image base from "Address" through 0, so any of those > >> dereferenced addresses triggers the page fault. > >> > >> Do you know how we could fix that issue? Perhaps introducing a > >> AddressValidationLib (as Brian suggested previously) and use it within > >> PeCoffSearchImageBase()? > >> > >> I'd also like to thank Brian & Jeff for all the support! > >> > >> Thanks > >> Paulo > >> > >> Repo: https://github.com/pcacjr/edk2.git > >> Branch: stacktrace_v5 > >> > >> Cc: Rick Bramley <richard.bramley@hp.com> > >> Cc: Kimon Berlin <kimon.berlin@hp.com> > >> Cc: Diego Medaglia <diego.meaglia@hp.com> > >> Cc: Andrew Fish <afish@apple.com> > >> Cc: Eric Dong <eric.dong@intel.com> > >> Cc: Laszlo Ersek <lersek@redhat.com> > >> Cc: Brian Johnson <brian.johnson@hpe.com> > >> Cc: Jeff Fan <vanjeff_919@hotmail.com> > >> Cc: Jiewen Yao <jiewen.yao@intel.com> > >> Cc: Paulo Alcantara <paulo@hp.com> > >> Contributed-under: TianoCore Contribution Agreement 1.1 > >> Signed-off-by: Paulo Alcantara <paulo@paulo.ac> > >> --- > >> > >> v1 -> v2: > >> * Add IA32 arch support (GCC toolchain only) > >> * Replace hard-coded stack alignment value (16) with > >> CPU_STACK_ALIGNMENT. > >> * Check for proper stack and frame pointer alignments. > >> * Fix initialization of UnwoundStacksCount to 1. > >> * Move GetPdbFileName() to common code since it will be used by both > >> IA32 and X64 implementations. > >> > >> v2 -> v3: > >> * Fixed wrong assumption about "RIP < ImageBase" to start searching > >> for another PE/COFF image. That is, RIP may point to lower and > >> higher addresses for any other PE/COFF images. Both IA32 & X64. > >> (Thanks Andrew & Jiewen) > >> * Fixed typo: unwond -> unwound. Both IA32 & X64. (Thanks Brian) > >> > >> v3 -> v4: > >> * Validate all frame/stack pointer addresses before dereferencing them > >> as requested by Brian & Jiewen. > >> * Correctly print out IP addresses during the stack traces (by Jeff) > >> > >> v4 -> v5: > >> * Fixed address calculations and improved code as suggested by Jiewen. > >> * Fixed parameter validation as suggested by Brian. > >> * Tested stack stack with IA32 PAE paging mode. > >> > >> Paulo Alcantara (8): > >> UefiCpuPkg/CpuExceptionHandlerLib/X64: Add stack trace support > >> UefiCpuPkg/CpuExceptionHandlerLib: Export GetPdbFileName() > >> UefiCpuPkg/CpuExceptionHandlerLib/Ia32: Add stack trace support > >> UefiCpuPkg/CpuExceptionHandlerLib: Add helper to validate memory > >> addresses > >> UefiCpuPkg/CpuExceptionHandlerLib: Ensure valid frame/stack pointers > >> UefiCpuPkg/CpuExceptionHandlerLib: Correctly print IP addresses > >> UefiCpuPkg/CpuExceptionHandlerLib: Validate memory address ranges > >> UefiCpuPkg/CpuExceptionHandlerLib: Add early check in > >> DumpStackContents > >> > >> UefiCpuPkg/Library/CpuExceptionHandlerLib/CpuExceptionCommon.c > >> | 537 ++++++++++++++++++-- > >> UefiCpuPkg/Library/CpuExceptionHandlerLib/CpuExceptionCommon.h > >> | 59 ++- > >> > UefiCpuPkg/Library/CpuExceptionHandlerLib/Ia32/ArchExceptionHandler.c | > >> 483 +++++++++++++++++- > >> UefiCpuPkg/Library/CpuExceptionHandlerLib/X64/ArchExceptionHandler.c > | > >> 426 +++++++++++++++- > >> 4 files changed, 1435 insertions(+), 70 deletions(-) > >> > >> -- > >> 2.14.3 > >> > >> _______________________________________________ > >> edk2-devel mailing list > >> edk2-devel@lists.01.org > >> https://lists.01.org/mailman/listinfo/edk2-devel > > _______________________________________________ > edk2-devel mailing list > edk2-devel@lists.01.org > https://lists.01.org/mailman/listinfo/edk2-devel _______________________________________________ edk2-devel mailing list edk2-devel@lists.01.org https://lists.01.org/mailman/listinfo/edk2-devel
"Yao, Jiewen" <jiewen.yao@intel.com> writes: Hi Jiewen, > ======================= > OVMF IA32: DXE worked, but SMM not. > > Since the page fault is only occurring in IA32 with no paging enabled > (default case), I suspect that when we don't have paging, we are unable > to successfully validate the memory address when it's i.e. outside SMRAM > - that is, we don't know when to stop unwinding the stack. > ======================= > > For IA32 SMM, I am a little confused. > We unconditionally setup page table for SMM, no matter it is IA32 or X64. > > If you find a SMM driver running in a page-disable environment, it means, the SMM CORE load the driver in SMRAM, but the SMM CPU driver has not rebased the CPU yet. > SMM driver is still using the PageTable/GDT/IDT setup by DXE CPU > driver, not SMM CPU driver. OK - thanks for clarifying that. > Would you please double confirm what you have observed? > > You can just check the boot log file to see if PiSmmCpu driver has run > or not. Sure. I will do it and then get back to you once I got the results. Thanks Paulo _______________________________________________ edk2-devel mailing list edk2-devel@lists.01.org https://lists.01.org/mailman/listinfo/edk2-devel
© 2016 - 2024 Red Hat, Inc.