On 12/13/23 05:23, Wu, Jiaxin wrote: >> >> Thanks. This documentation (in the commit message and the lib class >> header file) seems really good (especially with the formatting updates >> suggested by Ray). >> >> (1) I think there is one typo: exist <-> exits. >> > > agree, I will fix this. > >>> +RETURN_STATUS >>> +EFIAPI >>> +SmmCpuSyncContextReset ( >>> + IN OUT SMM_CPU_SYNC_CONTEXT *SmmCpuSyncCtx >>> + ); >> >> (2) The file-top documentation says about this API: "ContextReset() is >> called before CPU exist SMI, which allows CPU to check into the next SMI >> from this point". >> >> It is not clear *which* CPU is supposed to call ContextReset (and the >> function does not take a CPU index). Can you explain this in the >> documentation? (Assuming my observation makes sense.) >> > > For SMM CPU driver, it shall the BSP to call the function since BSP will > gather all APs to exit SMM synchronously, it's the role to control the > overall SMI execution flow. > > For the API itself, I don't restrict which CPU can do that. It depends on the > consumer. So, it's not the mandatory, that's the reason I don't mention that. > > But you are right, here, it has a requirement: the elected CPU calling this > function need make sure all CPUs are ready to exist SMI. I can clear document > this requirement as below: > > "This function is called by one of CPUs after all CPUs are ready to exist > SMI, which allows CPU to check into the next SMI from this point." > > Besides, one comment from Ray: we can ASSERT the SmmCpuSyncCtx is not NULL, > don't need return status to handle all such case. if so, the RETURN_STATUS is > not required. > > So, based on all above, I will update the API as below: > > /** > Reset SMM CPU Sync context. SMM CPU Sync context will be reset to the > initialized state. > > This function is called by one of CPUs after all CPUs are ready to exist > SMI, which allows CPU to > check into the next SMI from this point. > > If Context is NULL, then ASSERT(). > > @param[in,out] Context Pointer to the SMM CPU Sync context object to > be reset. > > **/ > VOID > EFIAPI > SmmCpuSyncContextReset ( > IN OUT SMM_CPU_SYNC_CONTEXT *Context > );
Looks good, thanks -- except, there is again the same typo: "ready to exist SMI". Should be "ready to exit". I also agree that asserting (Context != NULL) is valid, as long as we document that passing in a non-NULL context is the caller's responsibility. (And we do that above, so fine.) > > >>> +RETURN_STATUS >>> +EFIAPI >>> +SmmCpuSyncGetArrivedCpuCount ( >>> + IN SMM_CPU_SYNC_CONTEXT *SmmCpuSyncCtx, >>> + IN OUT UINTN *CpuCount >>> + ); >> >> (3) Why is CpuCount IN OUT? I would think just OUT should suffice. >> >> > > Agree. I will correct all similar case. Besides, I also received the comments > from Ray offline: > 1. we can ASSERT the SmmCpuSyncCtx is not NULL, don't need return status to > handle that. > 2. we don't need RETURN_UNSUPPORTED, GetArrivedCpuCount() should be always > supported. > With above comments, I will update the API as below to return the count > directly, this is also aligned with the function name to get arrived CPU > Count: > > /** > Get current number of arrived CPU in SMI. > > BSP might need to know the current number of arrived CPU in SMI to make > sure all APs > in SMI. This API can be for that purpose. > > If Context is NULL, then ASSERT(). > > @param[in] Context Pointer to the SMM CPU Sync context object. > > @retval Current number of arrived CPU in SMI. > > **/ > UINTN > EFIAPI > SmmCpuSyncGetArrivedCpuCount ( > IN SMM_CPU_SYNC_CONTEXT *Context > ); Sure, if you can guarantee at the lib *class* level that this API always succeeds as long as Context is not NULL, then this update looks fine. > > >>> +RETURN_STATUS >>> +EFIAPI >>> +SmmCpuSyncCheckInCpu ( >>> + IN OUT SMM_CPU_SYNC_CONTEXT *SmmCpuSyncCtx, >>> + IN UINTN CpuIndex >>> + ); >> >> (4) Do we need an error condition for CpuIndex being out of range? >> > > Good idea. We can handle this check within ASSERT. Then I will update all > similar case by adding below comments in API: > > "If CpuIndex exceeds the range of all CPUs in the system, then ASSERT()." > > For example: > /** > Performs an atomic operation to check in CPU. > > When SMI happens, all processors including BSP enter to SMM mode by calling > SmmCpuSyncCheckInCpu(). > > If Context is NULL, then ASSERT(). > If CpuIndex exceeds the range of all CPUs in the system, then ASSERT(). > > @param[in,out] Context Pointer to the SMM CPU Sync context > object. > @param[in] CpuIndex Check in CPU index. > > @retval RETURN_SUCCESS Check in CPU (CpuIndex) successfully. > @retval RETURN_ABORTED Check in CPU failed due to > SmmCpuSyncLockDoor() has been called by one elected CPU. > > **/ > RETURN_STATUS > EFIAPI > SmmCpuSyncCheckInCpu ( > IN OUT SMM_CPU_SYNC_CONTEXT *Context, > IN UINTN CpuIndex > ); Works for me. My main points are: - if we consider a particular input condition a *programming error* (and we document it as such), then asserting the opposite of that condition is fine - if we consider an input condition a particular data / environment issue, then catching it / reporting it with an explicit status code is fine. - point is, for *any* problematic input condition, we should decide if we handle it with *either* assert (in which case the caller is responsible for preventing that condition upon input), *or* with a retval (in which case the caller is responsible for handling the circumstance after the call returns). Handling a given input state with *both* approaches at the same time is totally bogus. > > >> (5) Do we have a special CpuIndex value for the BSP? >> > > No, the elected BSP is also the part of CPUs with its own CpuIndex value. > > >>> + >>> +/** >>> + Performs an atomic operation to check out CPU. >>> + >>> + CheckOutCpu() can be called in error handling flow for the CPU who calls >> CheckInCpu() earlier. >>> + >>> + @param[in,out] SmmCpuSyncCtx Pointer to the SMM CPU Sync >> context object. >>> + @param[in] CpuIndex Check out CPU index. >>> + >>> + @retval RETURN_SUCCESS Check out CPU (CpuIndex) successfully. >>> + @retval RETURN_INVALID_PARAMETER SmmCpuSyncCtx is NULL. >>> + @retval RETURN_NOT_READY The CPU is not checked-in. >>> + @retval RETURN_UNSUPPORTED Unsupported operation. >>> + >>> +**/ >>> +RETURN_STATUS >>> +EFIAPI >>> +SmmCpuSyncCheckOutCpu ( >>> + IN OUT SMM_CPU_SYNC_CONTEXT *SmmCpuSyncCtx, >>> + IN UINTN CpuIndex >>> + ); >>> + >> >> (6) Looks good, my only question is again if we need a status code for >> CpuIndex being out of range. >> > > Yes, agree. The comments from Ray is that we don't need handle the > RETURN_INVALID_PARAMETER, just ASSERT for better performance, we can avoid > the if check in release version. Just document the assert. > > To better define the API behavior, I will remove RETURN_UNSUPPORTED, and > replaced with RETURN_ABORTED, which can align with the checkin behavior if we > don't want support the checkout after look door. RETURN_ABORTED clearly > document the behavior instead of RETURN_UNSUPPORTED. > > So, the API would be as below: > /** > Performs an atomic operation to check out CPU. > > This function can be called in error handling flow for the CPU who calls > CheckInCpu() earlier. > > If Context is NULL, then ASSERT(). > If CpuIndex exceeds the range of all CPUs in the system, then ASSERT(). > > @param[in,out] Context Pointer to the SMM CPU Sync context > object. > @param[in] CpuIndex Check out CPU index. > > @retval RETURN_SUCCESS Check out CPU (CpuIndex) successfully. > @retval RETURN_ABORTED Check out CPU failed due to > SmmCpuSyncLockDoor() has been called by one elected CPU. > > **/ > RETURN_STATUS > EFIAPI > SmmCpuSyncCheckOutCpu ( > IN OUT SMM_CPU_SYNC_CONTEXT *Context, > IN UINTN CpuIndex > ); > > >>> +/** >>> + Performs an atomic operation lock door for CPU checkin or checkout. >>> + >>> + After this function, CPU can not check in via SmmCpuSyncCheckInCpu(). >>> + >>> + The CPU specified by CpuIndex is elected to lock door. >>> + >>> + @param[in,out] SmmCpuSyncCtx Pointer to the SMM CPU Sync >> context object. >>> + @param[in] CpuIndex Indicate which CPU to lock door. >>> + @param[in,out] CpuCount Number of arrived CPU in SMI after look >> door. >>> + >>> + @retval RETURN_SUCCESS Lock door for CPU successfully. >>> + @retval RETURN_INVALID_PARAMETER SmmCpuSyncCtx or CpuCount is >> NULL. >>> + >>> +**/ >>> +RETURN_STATUS >>> +EFIAPI >>> +SmmCpuSyncLockDoor ( >>> + IN OUT SMM_CPU_SYNC_CONTEXT *SmmCpuSyncCtx, >>> + IN UINTN CpuIndex, >>> + IN OUT UINTN *CpuCount >>> + ); >> >> This is where it's getting tricky :) >> >> (7) error condition for CpuIndex being out of range? > > Yes, agree. The same case as above, it will be handled in the ASSERT and > documented in the comments. > >> >> (8) why is CpuCount IN OUT and not just OUT? (Other than that, I can see >> how outputting CpuCout at once can be useful.) >> > > Well, I agree it should only OUT. > CpuCout is to tell the number of arrived CPU in SMI after look door. For SMM > CPU driver, it needs to know the number of arrived CPU in SMI after look > door, it's for later rendezvous & sync usage. So, it returns the CpuCount. > > >> (9) do we need error conditions for: >> >> (9.1) CpuIndex being "wrong" (i.e., not the CPU that's "supposed" to >> lock the door)? >> >> (9.2) CpuIndex not having checked in already, before trying to lock the >> door? >> >> Now, I can imagine that problem (9.1) is undetectable, i.e., it causes >> undefined behavior. That's fine, but then we should mention that. >> > > Actually CpuIndex might not be cared by the lib instance. It puts here just > for the future extension that some lib instance might need to know which CPU > lock the door. It's the information provided by the API. > I don't add the error check for those because I don't want focus the > implementation to do this check. > > But I agree, we can document this undefined behavior. How about like below: > "The CPU specified by CpuIndex is elected to lock door. The caller shall > make sure the CpuIndex is the actual CPU calling this function to avoid the > undefined behavior." > > With above, I will update the API like below: > > /** > Performs an atomic operation lock door for CPU checkin and checkout. After > this function: > CPU can not check in via SmmCpuSyncCheckInCpu(). > CPU can not check out via SmmCpuSyncCheckOutCpu(). > > The CPU specified by CpuIndex is elected to lock door. The caller shall > make sure the CpuIndex > is the actual CPU calling this function to avoid the undefined behavior. > > If Context is NULL, then ASSERT(). > If CpuCount is NULL, then ASSERT(). > > @param[in,out] Context Pointer to the SMM CPU Sync context > object. > @param[in] CpuIndex Indicate which CPU to lock door. > @param[in,out] CpuCount Number of arrived CPU in SMI after look > door. > > **/ > VOID > EFIAPI > SmmCpuSyncLockDoor ( > IN OUT SMM_CPU_SYNC_CONTEXT *Context, > IN UINTN CpuIndex, > OUT UINTN *CpuCount > ); Works for me! > > > >>> + >>> +/** >>> + Used by the BSP to wait for APs. >>> + >>> + The number of APs need to be waited is specified by NumberOfAPs. The >> BSP is specified by BspIndex. >>> + >>> + Note: This function is blocking mode, and it will return only after the >> number of APs released by >>> + calling SmmCpuSyncReleaseBsp(): >>> + BSP: WaitForAPs <-- AP: ReleaseBsp >>> + >>> + @param[in,out] SmmCpuSyncCtx Pointer to the SMM CPU Sync >> context object. >>> + @param[in] NumberOfAPs Number of APs need to be waited by >> BSP. >>> + @param[in] BspIndex The BSP Index to wait for APs. >>> + >>> + @retval RETURN_SUCCESS BSP to wait for APs successfully. >>> + @retval RETURN_INVALID_PARAMETER SmmCpuSyncCtx is NULL or >> NumberOfAPs > total number of processors in system. >>> + >>> +**/ >>> +RETURN_STATUS >>> +EFIAPI >>> +SmmCpuSyncWaitForAPs ( >>> + IN OUT SMM_CPU_SYNC_CONTEXT *SmmCpuSyncCtx, >>> + IN UINTN NumberOfAPs, >>> + IN UINTN BspIndex >>> + ); >> >> The "NumberOfAPs > total number of processors in system" check is nice! >> >> (10) Again, do we need a similar error condition for BspIndex being out >> of range? >> > > Agree, I will handle the case in the same way as above in the ASSERT. If so, > no need return the status. > > >> (11) Do we need to document / enforce explicitly (status code) that the >> BSP and the APs must have checked in, and/or the door must have been >> locked? Again -- if we can't detect / enforce these conditions, that's >> fine, but then we should mention the expected call environment. The >> file-top description does not seem very explicit about it. >> > > Agree, if BspIndex is the actual CPU calling this function, it must be > checkin before. So, how about adding the comments as below: > " The caller shall make sure the BspIndex is the actual CPU calling this > function to avoid the undefined behavior." > > Based on above, I propose the API to be: > > /** > Used by the BSP to wait for APs. > > The number of APs need to be waited is specified by NumberOfAPs. The BSP is > specified by BspIndex. > The caller shall make sure the BspIndex is the actual CPU calling this > function to avoid the undefined behavior. > The caller shall make sure the NumberOfAPs have checked-in to avoid the > undefined behavior. > > If Context is NULL, then ASSERT(). > If NumberOfAPs > All CPUs in system, then ASSERT(). > If BspIndex exceeds the range of all CPUs in the system, then ASSERT(). > > Note: > This function is blocking mode, and it will return only after the number of > APs released by > calling SmmCpuSyncReleaseBsp(): > BSP: WaitForAPs <-- AP: ReleaseBsp > > @param[in,out] Context Pointer to the SMM CPU Sync context > object. > @param[in] NumberOfAPs Number of APs need to be waited by BSP. > @param[in] BspIndex The BSP Index to wait for APs. > > **/ > VOID > EFIAPI > SmmCpuSyncWaitForAPs ( > IN OUT SMM_CPU_SYNC_CONTEXT *Context, > IN UINTN NumberOfAPs, > IN UINTN BspIndex > ); OK, thanks. > >>> + >>> +/** >>> + Used by the BSP to release one AP. >>> + >>> + The AP is specified by CpuIndex. The BSP is specified by BspIndex. >>> + >>> + @param[in,out] SmmCpuSyncCtx Pointer to the SMM CPU Sync >> context object. >>> + @param[in] CpuIndex Indicate which AP need to be released. >>> + @param[in] BspIndex The BSP Index to release AP. >>> + >>> + @retval RETURN_SUCCESS BSP to release one AP successfully. >>> + @retval RETURN_INVALID_PARAMETER SmmCpuSyncCtx is NULL or >> CpuIndex is same as BspIndex. >>> + >>> +**/ >>> +RETURN_STATUS >>> +EFIAPI >>> +SmmCpuSyncReleaseOneAp ( >>> + IN OUT SMM_CPU_SYNC_CONTEXT *SmmCpuSyncCtx, >>> + IN UINTN CpuIndex, >>> + IN UINTN BspIndex >>> + ); >> >> (12) Same comments as elsewhere: >> >> - it's good that we check CpuIndex versus BspIndex, but do we also need >> to range-check each? >> > > Agree. > >> - document that both affected CPUs need to have checked in, with the >> door potentially locked? >> > > Yes, for SMM CPU driver, it shall be called after look door. For API itself, > it's better not restrict it. the only requirement I can see is need CpuIndex > must be checkin. So, I will refine it as below: > /** > Used by the BSP to release one AP. > > The AP is specified by CpuIndex. The BSP is specified by BspIndex. > The caller shall make sure the BspIndex is the actual CPU calling this > function to avoid the undefined behavior. > The caller shall make sure the CpuIndex has checked-in to avoid the > undefined behavior. > > If Context is NULL, then ASSERT(). > If CpuIndex == BspIndex, then ASSERT(). > If BspIndex and CpuIndex exceed the range of all CPUs in the system, then > ASSERT(). > > @param[in,out] Context Pointer to the SMM CPU Sync context > object. > @param[in] CpuIndex Indicate which AP need to be released. > @param[in] BspIndex The BSP Index to release AP. > > **/ > VOID > EFIAPI > SmmCpuSyncReleaseOneAp ( > IN OUT SMM_CPU_SYNC_CONTEXT *Context, > IN UINTN CpuIndex, > IN UINTN BspIndex > ); OK. (Small update: the comment should say: "If BspIndex *or* CpuIndex exceed the range ...". For the other functions too, below.) > > > >> >>> + >>> +/** >>> + Used by the AP to wait BSP. >>> + >>> + The AP is specified by CpuIndex. The BSP is specified by BspIndex. >>> + >>> + Note: This function is blocking mode, and it will return only after the >>> AP >> released by >>> + calling SmmCpuSyncReleaseOneAp(): >>> + BSP: ReleaseOneAp --> AP: WaitForBsp >>> + >>> + @param[in,out] SmmCpuSyncCtx Pointer to the SMM CPU Sync context >> object. >>> + @param[in] CpuIndex Indicate which AP wait BSP. >>> + @param[in] BspIndex The BSP Index to be waited. >>> + >>> + @retval RETURN_SUCCESS AP to wait BSP successfully. >>> + @retval RETURN_INVALID_PARAMETER SmmCpuSyncCtx is NULL or >> CpuIndex is same as BspIndex. >>> + >>> +**/ >>> +RETURN_STATUS >>> +EFIAPI >>> +SmmCpuSyncWaitForBsp ( >>> + IN OUT SMM_CPU_SYNC_CONTEXT *SmmCpuSyncCtx, >>> + IN UINTN CpuIndex, >>> + IN UINTN BspIndex >>> + ); >>> + >> >> (13) Same questions as under (12). >> > > See below proposed API: > > /** > Used by the AP to wait BSP. > > The AP is specified by CpuIndex. > The caller shall make sure the CpuIndex is the actual CPU calling this > function to avoid the undefined behavior. > The BSP is specified by BspIndex. > > If Context is NULL, then ASSERT(). > If CpuIndex == BspIndex, then ASSERT(). > If BspIndex and CpuIndex exceed the range of all CPUs in the system, then > ASSERT(). > > Note: > This function is blocking mode, and it will return only after the AP > released by > calling SmmCpuSyncReleaseOneAp(): > BSP: ReleaseOneAp --> AP: WaitForBsp > > @param[in,out] Context Pointer to the SMM CPU Sync context object. > @param[in] CpuIndex Indicate which AP wait BSP. > @param[in] BspIndex The BSP Index to be waited. > > **/ > VOID > EFIAPI > SmmCpuSyncWaitForBsp ( > IN OUT SMM_CPU_SYNC_CONTEXT *Context, > IN UINTN CpuIndex, > IN UINTN BspIndex > ); OK, thanks. > > >>> +/** >>> + Used by the AP to release BSP. >>> + >>> + The AP is specified by CpuIndex. The BSP is specified by BspIndex. >>> + >>> + @param[in,out] SmmCpuSyncCtx Pointer to the SMM CPU Sync >> context object. >>> + @param[in] CpuIndex Indicate which AP release BSP. >>> + @param[in] BspIndex The BSP Index to be released. >>> + >>> + @retval RETURN_SUCCESS AP to release BSP successfully. >>> + @retval RETURN_INVALID_PARAMETER SmmCpuSyncCtx is NULL or >> CpuIndex is same as BspIndex. >>> + >>> +**/ >>> +RETURN_STATUS >>> +EFIAPI >>> +SmmCpuSyncReleaseBsp ( >>> + IN OUT SMM_CPU_SYNC_CONTEXT *SmmCpuSyncCtx, >>> + IN UINTN CpuIndex, >>> + IN UINTN BspIndex >>> + ); >> >> (14) Same questions as under (12). >> > > See below proposed API: > > /** > Used by the AP to release BSP. > > The AP is specified by CpuIndex. > The caller shall make sure the CpuIndex is the actual CPU calling this > function to avoid the undefined behavior. > The BSP is specified by BspIndex. > > If Context is NULL, then ASSERT(). > If CpuIndex == BspIndex, then ASSERT(). > If BspIndex and CpuIndex exceed the range of all CPUs in the system, then > ASSERT(). > > @param[in,out] Context Pointer to the SMM CPU Sync context > object. > @param[in] CpuIndex Indicate which AP release BSP. > @param[in] BspIndex The BSP Index to be released. > > **/ > VOID > EFIAPI > SmmCpuSyncReleaseBsp ( > IN OUT SMM_CPU_SYNC_CONTEXT *Context, > IN UINTN CpuIndex, > IN UINTN BspIndex > ); > Thanks! Laszlo > > Thanks, > Jiaxin > > >>> + >>> +#endif >>> diff --git a/UefiCpuPkg/UefiCpuPkg.dec b/UefiCpuPkg/UefiCpuPkg.dec >>> index 0b5431dbf7..20ab079219 100644 >>> --- a/UefiCpuPkg/UefiCpuPkg.dec >>> +++ b/UefiCpuPkg/UefiCpuPkg.dec >>> @@ -62,10 +62,13 @@ >>> CpuPageTableLib|Include/Library/CpuPageTableLib.h >>> >>> ## @libraryclass Provides functions for manipulating smram savestate >> registers. >>> MmSaveStateLib|Include/Library/MmSaveStateLib.h >>> >>> + ## @libraryclass Provides functions for SMM CPU Sync Operation. >>> + SmmCpuSyncLib|Include/Library/SmmCpuSyncLib.h >>> + >>> [LibraryClasses.RISCV64] >>> ## @libraryclass Provides functions to manage MMU features on RISCV64 >> CPUs. >>> ## >>> RiscVMmuLib|Include/Library/BaseRiscVMmuLib.h >>> >> >> These interfaces look real nice, my comments/questions are all docs-related. >> >> Thanks! >> Laszlo > -=-=-=-=-=-=-=-=-=-=-=- Groups.io Links: You receive all messages sent to this group. View/Reply Online (#112484): https://edk2.groups.io/g/devel/message/112484 Mute This Topic: https://groups.io/mt/103010164/21656 Group Owner: devel+ow...@edk2.groups.io Unsubscribe: https://edk2.groups.io/g/devel/leave/9847357/21656/1706620634/xyzzy [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-