-
Notifications
You must be signed in to change notification settings - Fork 57
Manage ticks to suppress RCU CPU stall warning #75
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
--- utils.h
+++ utils.h
@@ -87,7 +87,7 @@ static inline void list_del_init(struct
})
#else
#define container_of(ptr, type, member) \
- ((type *) ((char *) (ptr) - offsetof(type, member)))
+ ((type *) ((char *) (ptr) -offsetof(type, member)))
#endif
#define list_entry(node, type, member) container_of(node, type, member) We shall migrate to newer versions later. |
Code Review Agent Run #9f603bActionable Suggestions - 6
Review Details
|
Changelist by BitoThis pull request implements the following key changes. |
@@ -133,6 +133,7 @@ E := | |||
S := $E $E | |||
|
|||
SMP ?= 1 | |||
CFLAGS += -D SEMU_BOOT_TARGET_TIME=10 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider making the boot target time configurable via environment variable instead of hardcoding it in the Makefile. This would allow users to override the value without modifying the Makefile. Could use something like: SEMU_BOOT_TARGET_TIME ?= 10
Code suggestion
Check the AI-generated fix before applying
CFLAGS += -D SEMU_BOOT_TARGET_TIME=10 | |
SEMU_BOOT_TARGET_TIME ?= 10 | |
CFLAGS += -D SEMU_BOOT_TARGET_TIME=$(SEMU_BOOT_TARGET_TIME) |
Code Review Run #9f603b
Is this a valid issue, or was it incorrectly flagged by the Agent?
- it was incorrectly flagged
static double ticks_increment; | ||
static double boot_ticks; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider initializing boot_ticks
and ticks_increment
with meaningful default values to avoid potential undefined behavior when these variables are first accessed.
Code suggestion
Check the AI-generated fix before applying
static double ticks_increment; | |
static double boot_ticks; | |
static double ticks_increment = 0.0; | |
static double boot_ticks = 0.0; |
Code Review Run #9f603b
Is this a valid issue, or was it incorrectly flagged by the Agent?
- it was incorrectly flagged
return t.tv_sec * freq + mult_frac(t.tv_nsec, freq, 1e9); | ||
struct timespec ts; | ||
clock_gettime(CLOCKID, &ts); | ||
return (uint64_t) ts.tv_sec * 1e9 + (uint64_t) ts.tv_nsec; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider using UINT64_C(1000000000)
instead of 1e9
for explicit integer constant. Using floating point literals for integer calculations could lead to precision issues.
Code suggestion
Check the AI-generated fix before applying
return (uint64_t) ts.tv_sec * 1e9 + (uint64_t) ts.tv_nsec; | |
return (uint64_t) ts.tv_sec * UINT64_C(1000000000) + (uint64_t) ts.tv_nsec; |
Code Review Run #9f603b
Is this a valid issue, or was it incorrectly flagged by the Agent?
- it was incorrectly flagged
return time(0) * freq; | ||
/* Fallback to non-HRT calls time(0) in seconds => convert to ns. */ | ||
time_t now_sec = time(0); | ||
return (uint64_t) now_sec * 1e9; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to above, consider using UINT64_C(1000000000)
instead of 1e9
in the fallback path for consistency and to avoid potential floating point precision issues.
Code suggestion
Check the AI-generated fix before applying
return (uint64_t) now_sec * 1e9; | |
return (uint64_t) now_sec * UINT64_C(1000000000); |
Code Review Run #9f603b
Is this a valid issue, or was it incorrectly flagged by the Agent?
- it was incorrectly flagged
static int64_t offset = 0; | ||
static bool first_switch = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider making offset
and first_switch
variables thread-safe since they are static variables accessed from multiple threads. Using atomic variables would prevent potential race conditions.
Code suggestion
Check the AI-generated fix before applying
static int64_t offset = 0; | |
static bool first_switch = true; | |
static atomic_int64_t offset = ATOMIC_VAR_INIT(0); | |
static atomic_bool first_switch = ATOMIC_VAR_INIT(true); |
Code Review Run #9f603b
Is this a valid issue, or was it incorrectly flagged by the Agent?
- it was incorrectly flagged
@@ -19,6 +20,10 @@ | |||
#endif | |||
#endif | |||
|
|||
bool boot_complete = false; | |||
static double ticks_increment; | |||
static double boot_ticks; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The boot_ticks
variable is declared as double
but cast to uint64_t
without checking for potential overflow or precision loss. Consider using uint64_t
type consistently.
Code suggestion
Check the AI-generated fix before applying
static double boot_ticks; | |
static uint64_t boot_ticks; |
Code Review Run #9f603b
Is this a valid issue, or was it incorrectly flagged by the Agent?
- it was incorrectly flagged
Since the emulator currently operates using sequential emulation, the execution time for the boot process is relatively long, which can result in the generation of RCU CPU stall warnings. To address this issue, there are several potential solutions: 1. Scale the frequency to slow down emulator time during the boot process, thereby eliminating RCU CPU stall warnings. 2. During the boot process, avoid using 'clock_gettime' to update ticks and instead manage the tick increment relationship manually. 3. Implement multi-threaded emulation to accelerate the emulator's execution speed. For the third point, while implementing multi-threaded emulation can significantly accelerate the emulator's execution speed, it cannot guarantee that this issue will not reappear as the number of cores increases in the future. Therefore, a better approach is to use methods 1 and 2 to allow the emulator to set an expected time for completing the boot process. The scale method requires three pieces of information to be implemented: - the cost of 'semu_timer_clocksource' - percentage of 'semu_timer_clocksource' within the boot process - the number of times 'semu_timer_clocksource' is called. In contrast, the increment method only requires the third information to be implemented, making its implementation simpler. Furthermore, through statistical analysis, we found that the other two values (cost and percentage) exhibit different distributions across different environments. Therefore, using the scale method would require an additional profiling step when the emulator starts, adding complexity to the implementation. Finally, since the increment method does not obtain real-time timestamps during the boot process, its overhead is lower compared to the scale method. Because 'semu_timer_clocksource' is called frequently, this reduction in overhead accumulates and results in a noticeable performance improvement. Therefore, this commit opts for the increment method to address this issue. This commit divides time into emulator time and real time. During the boot process, the timer manage the ticks increment to slow down the growth of emulator time, eliminating RCU CPU stall warnings. After the boot process is complete, the growth of emulator time aligns with real time. According to Using RCU’s CPU Stall Detector [1], the grace period for RCU CPU stalls is typically set to 21 seconds. By dividing this value by two as the expected completion time, we can provide a sufficient buffer to reduce the impact of errors and avoid RCU CPU stall warnings. By statistic, the number of times 'semu_timer_clocksource' called is approximately 'SMP count * 2.15 * 1e8'. By the time the boot process is completed, the emulator will have a total of 'boot seconds * frequency' ticks. Therefore, each time, '(boot seconds * frequency) / (2.15 * 1e8 * SMP count)' ticks need to be added. Close sysprog21#51 [1] https://docs.kernel.org/RCU/stallwarn.html#config-rcu-cpu-stall-timeout
e895724
to
cd565d0
Compare
Code Review Agent Run #cbf1c4Actionable Suggestions - 1
Review Details
|
Thank @Mes0903 for contributing! |
Since the emulator currently operates using sequential emulation, the execution time for the boot process is relatively long, which can result in the generation of RCU CPU stall warnings.
To address this issue, there are several potential solutions:
For the third point, while implementing multi-threaded emulation can significantly accelerate the emulator's execution speed, it cannot guarantee that this issue will not reappear as the number of cores increases in the future. Therefore, a better approach is to use methods 1 and 2 to allow the emulator to set an expected time for completing the boot process.
The scale method requires three pieces of information to be implemented:
In contrast, the increment method only requires the third information to be implemented, making its implementation simpler.
Furthermore, through statistical analysis, we found that the other two values (cost and percentage) exhibit different distributions across different environments. Therefore, using the scale method would require an additional profiling step when the emulator starts, adding complexity to the implementation.
Finally, since the increment method does not obtain real-time timestamps during the boot process, its overhead is lower compared to the scale method. Because 'semu_timer_clocksource' is called frequently, this reduction in overhead accumulates and results in a noticeable performance improvement.
Therefore, this commit opts for the increment method to address this issue. This commit divides time into emulator time and real time. During the boot process, the timer manage the ticks increment to slow down the growth of emulator time, eliminating RCU CPU stall warnings. After the boot process is complete, the growth of emulator time aligns with real time.
According to Using RCU’s CPU Stall Detector [1], the grace period for RCU CPU stalls is typically set to 21 seconds. By dividing this value by two as the expected completion time, we can provide a sufficient buffer to reduce the impact of errors and avoid RCU CPU stall warnings.
By statistic, the number of times 'semu_timer_clocksource' called is approximately
By the time the boot process is completed, the emulator will have a total of
ticks. Therefore, each time,
ticks need to be added.
Close #51
[1] https://docs.kernel.org/RCU/stallwarn.html#config-rcu-cpu-stall-timeout
Summary by Bito
Implementation of a sophisticated timer management system with dual-phase timing to prevent RCU CPU stall warnings during boot process. The system features manual tick increment management during boot, transitioning to real-time timing after detecting boot completion via U mode transition. The solution includes statistical calculations for optimal tick increments based on hart count, creating a robust timing infrastructure.Unit tests added: False
Estimated effort to review (1-5, lower is better): 2