-
Notifications
You must be signed in to change notification settings - Fork 18
Manage 2nd stack allocation in-library w/refcnts #2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
I recently had a new discovery, which brought some bad news to g_cont on sys stack: WPS is broken by this move, and maybe other SDK functions that involves heavy crypto. In my fork of bearssl+AsyncTCP, I was handling the 2nd stack the other way around:
Additional benefits are:
I also had some needs to use bearssl in non-async context (e.g. loading trusted certs before connecting), and I use the following method to delegate computation from g_cont to sys context:
There may be better/nicer ways to do the same thing. Just want to bring up this possibility in the discussion. |
@Adam5Wu Awesome finding, that means that the current stack handling can't be left as-is for every user.
Keep in mind that the main reason of the current stack move was that it gives 4K heap to every user. The caveat of not being able to yield, is that only during handshake? Or is the restriction also imposed on any communication? I'm thinking of e.g. secure webserver callbacks. |
Reading the discussion, I'm not sure that's quite right, @devyte . I think @Adam5Wu's observations are only if you're using AsyncTCP. The way I read it is that there is a stack overflow problem if and only if you're running AsyncTCP + BearSSL + WPS. Can you confirm or correct, @Adam5Wu ? BearSSL or not, an extra 4K for general use from the g_cont move is a big deal, I'd hate to pull that unless it's causing problems in the general case... |
@earlephilhower the wps crash has nothing to do with bearssl. Purely caused by moving g_cont to sys stack. But bearssl's need of 4.5KB extra memory increases overall memory pressure and makes moving g_cont more appealing. @devyte if bearssl uses sys stack, then any of its operation must run to natural completion before any other SDK function can run. Otherwise the data on sys stack gets destroyed. |
After the previous comments, the following questions come up in my head:
|
I was discussing this with @d-a-v earlier, and I think we need to:
The solutions that address the cases should try to maximize available heap. Also, the default case should be what applies to most users, which I think are users who do not use wps, which implies the current solution. BTW, my own use case is no wps, no ssl, and async libs, and the current solution works fine for me. As an aside, given this issue, and the potential pitfalls if we shuffle the stacks, I REALLY think we should prioritize the stack overflow canary (HW breakpoint). |
In my understanding, what stack to use really depends on the "ideology".
Although it is not exactly correct, but we can think of every esp8266 Arduino program are divided into two parts, the low-level part sort of like "kernel" and the high-level part sort of like "user".
Now, if we conceptually classify SSL to be "high-level", as we do now, then it runs in arduino task and uses g_cont stack, which is not enough. This is why we need a third "stack" but there is no third task -- we don't literally need a new stack, we just need some more space which g_cont does not have. But, if we conceptually classify SSL part of the logic to be "low-level", just like LwIP, then it supposedly should run in the sdk task and uses sys stack. And since there is enough space, then we don't need to allocate a third stack any more. As to placing g_cont on sys stack, I am becoming more skeptical after the incident -- yes it can bring some savings in certain cases. A PC can be made 5~15% faster by uninstalling/disabling antivirus; Or a car 20% more fuel efficient by replacing most metal parts with plastics... Anyway, the concerning part is that, the SDK is not open source, and is subject to future changes. Who knows how many corner cases there are can cause subtle problems with this practice; and how many more there will be in the future? (Engineers at Espressif didn't pick the sys stack size for no reason. They maybe conservative, but probably won't overprovision by over 100%, given such low amount of total RAM.) If we do this, then every time we upgrade SDK version, we need to literally run a set of regression tests on the actual hardware (there is not esp8266 hardware emulators, is there?) just for this feature, otherwise it may be too unreliable to be usable. Of course, if we can get blessing from Espressif, I will have a whole lot more confidence on this feature. :) |
@Adam5Wu it's really not as complicated as all that, about stability. The issue is that the ESP's stack is so small that it's easy to blow it up. That's not just due to ssl or wps, but in general. Many users have made that mistake, even I blew it up once when I started out. What I mean is that it's a common issue and cause for instability. |
Heads-up, @Adam5Wu. I've finally had the time to make a working stack thunking code (I believe, basic tests seem fine), and will probably remove all the STACK_PROXY_* code in the mainline branch going forward. It's an ugly hack now, and a real pain to merge in when @pornin updates BearSSL. I know you were using this with AsyncTCP, so I wanted to give you some advance warning. With the changes there will be a Do you see this giving you grief? You still have a "SetSecondStack"-type call once, and then things "just work." If you call the original |
Hi @earlephilhower, thanks for the heads up!
If it is the second usage, would it be possible if you could add a check in your thunking code so that, if no alternative stack region is setup, the stack pointer simply does not change? This could make your port work better with both sync and async scenario. Or better, you could define a special value for "do-not-switch", so that:
|
Howdy Right now it's the 2nd option, the "you set it up or else" because I'm trying to change as little as possible while refactoring, but it will move to your 1st option with I've also just committed a working(!) version of the Arduino core and the thunk-stack version of bssl. You can checkout https://github.com./earlephilhower/bearssl-esp8266/tree/to-thunks for the BSSL source and https://github.com./earlephilhower/Arduino/tree/thunks for the Arduino core bits. With this, the code changes between the original and the ported version are minimized, and I don't have to worry about modifying new functions being added (or worry about breaking existing ones). The thunks today live in Arduino space, not BSSL. Is that a problem for you? They're a single file, so it's not an issue to pull them elsewhere (but again, I want to minimize the BSSL diffs I have to maintain). |
I see what you mean -- so if I do not define the macro thunk_xxx it will just call the original functions, which will just directly use the current stack. I think it is great solution, it will work great for both worlds. Nice job! :) |
PR #4 removes all 2nd-stack stuff. If you need to go beyond the space in SYS in a callback, you can use the same thunking as the core WiFiClientSecureBearSSL uses (in its own PR in the Arduino repo) |
Second stack is now managed, if desired, in arduino core and not in the library. You an use SYS stack if desired (i.e. by not using any thunks). |
To allow AsyncTCP to work when combined with standard WiFiClientSecure, we need to handle the memory allocation in the library itself.
Add a refcnt API and memory allocation inside the lib. @Adam5Wu and the main Arduino libs to be updated once this is completed.
The text was updated successfully, but these errors were encountered: