
scheduled functions: calls from yield are now optional #6158


Merged
merged 8 commits from the notinyield branch into esp8266:master on May 29, 2019

Conversation

d-a-v (Collaborator) commented May 28, 2019

@hreintke This PR restores legacy behavior by default

dok-net (Contributor) commented May 28, 2019

@d-a-v Could yield (i.e. optimistic_yield()) be called in run_scheduled_functions() to keep the WDT from firing if there are many scheduled items?

d-a-v (Collaborator, Author) commented May 28, 2019

@dok-net done
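For illustration only (this is not the merged change, just the pattern being discussed): a minimal sketch of feeding the WDT while iterating a list of scheduled items, assuming the core's optimistic_yield(), which only actually yields if the given number of microseconds has elapsed since the last yield. The Item type and list below are made up for the example.

    #include <Arduino.h>
    #include <functional>

    // Stand-in item type and list, not the core's internals.
    struct Item {
      std::function<void()> fn;
      Item* next;
    };
    static Item* first_item = nullptr;

    void run_items() {
      for (Item* it = first_item; it; it = it->next) {
        it->fn();
        // Feed the soft WDT on long lists: this yields only if at least
        // 10 ms have passed since the last yield, otherwise it returns
        // immediately.
        optimistic_yield(10000);
      }
    }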

@d-a-v d-a-v merged commit 455583b into esp8266:master May 29, 2019
@d-a-v d-a-v deleted the notinyield branch May 29, 2019 09:10
dok-net (Contributor) commented May 29, 2019

@d-a-v @hreintke Besides the portability question (ESP32, FreeRTOS on ESP8266) that's keeping me engaged, I would like to pose a more obvious question: what's so wrong with the SDK scheduler that we're doing this Scheduler work in the first place? From a performance point of view, iterating the list of timeouts (esp8266::polledTimeout) over and over is no advantage either, and if there is nothing in the SDK scheduler supporting timed tasks, then this can be done as a wrapper around the user code just as well. What I am saying is: is this just a workaround for some itch that gets thrown out once the Non-OS SDK becomes unavailable? And aren't we throwing performance out the window with this Schedule scheduler? If everyone starts using it, things don't look shiny; I for one have reservations about the performance impact on EspSoftwareSerial once a few device drivers and the like start inserting their timed calls into the ring buffer.
Given the limitations on yield that result from this work, how hard would it be to do it with tasks the "right way"? I am already asking these questions about the correct yield code in my code remarks above.

hreintke (Contributor) commented

@dok-net: I think portability between ESP32, FreeRTOS ESP8266 (and NONOS ESP8266?) is preferable.
With the ESP32-SOLO-1 coming, more people will be moving to or combining with ESP32.
I am not fully aware of the functionality/options of the SDK scheduler. Are you thinking of using system_os_task(..), or is it something else?

devyte (Collaborator) commented May 29, 2019

@dok-net if by SDK scheduler you mean the SDK tasks, then the problem is that their code executes in the SYS context, and hence the user code can't be long-running or blocking, can't call yield or delay, etc. Ticker, for example, runs callbacks that way and so suffers from those limitations.
Consider all the code we have with delays and yields scattered through it. None of that can run in SYS.
This mechanism allows running the callbacks in the CONT context, so those restrictions are relaxed.
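To illustrate the SYS vs. CONT point, a minimal sketch assuming the core's Schedule.h API (schedule_function()), which queues a callback to be run later from the CONT context:

    #include <Arduino.h>
    #include <Schedule.h>   // schedule_function()
    #include <Ticker.h>

    Ticker ticker;

    void heavyWork() {
      // Runs in CONT context (like loop()), so delay()/yield() are allowed.
      delay(10);
      Serial.println("work done in CONT");
    }

    void setup() {
      Serial.begin(115200);
      // The Ticker callback itself runs in SYS context, so keep it short:
      // just hand the real work over to the scheduler.
      ticker.attach_ms(1000, []() {
        schedule_function(heavyWork);   // queued, executed later from CONT
      });
    }

    void loop() {
      // Scheduled functions are run by the core between loop() iterations.
    }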

d-a-v (Collaborator, Author) commented May 29, 2019

I originally needed these recurrent scheduled functions because of ethernet. I could call ethernet.handlePacket() in the main loop, and I was, but that is not compatible with some network inner loops that don't return to the main loop() for a while until some protocol handshaking is finished, like some websocket libraries do.

So I needed ethernet.handlePacket() to be called without modifying any external library over which I have no control. This is the reason for this PR.
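As an illustration of that use case, a minimal sketch assuming the recurrent-scheduling API from Schedule.h (schedule_recurrent_function_us()) and using a stand-in for the ethernet driver mentioned above:

    #include <Arduino.h>
    #include <Schedule.h>   // schedule_recurrent_function_us()

    // Stand-in for the real ethernet driver object.
    struct EthernetDriver {
      void handlePacket() { /* poll the hardware, feed lwIP, ... */ }
    };
    EthernetDriver ethernet;

    void setup() {
      // Poll the driver roughly every 1000 µs. The callback returns true to
      // stay scheduled, false to remove itself.
      schedule_recurrent_function_us([]() {
        ethernet.handlePacket();
        return true;
      }, 1000);
    }

    void loop() {
      // Even when a library spins in its own handshake loop and only calls
      // yield()/delay() instead of returning here, the recurrent function
      // still gets a chance to run (depending on the option this PR adds).
    }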

We ended up with this linked list of recurrent functions mixed with non-recurrent functions which, you seem to say, brings a performance penalty. I originally did not think they would be used that intensively, but only for occasional calls (using the lightweight polledTimeout lib).

@dok-net and @hreintke, you state there is a performance penalty because you are using them intensively. Can you be more specific? How many functions on each loop()?
Did you compare this model to a multi-threaded/task OS?

Regarding multithreading on ESP32 or the new single-core version, these beasts have at least 512 KB of RAM while we have 50 KB. They can afford a stack per task while we can't (well, esp8266-rtos-sdk can, and I wonder how many tasks can be spawned without running out of memory).

These recurrent scheduled functions are a useful addition (the non-recurrent version was already there). How could this be improved regarding the performance penalty?

  • two lists?
    • one for recurrent, not meant to be modified at every cycle
    • one for non-recurrent, very simple to manage
  • @dok-net's circular queue (said to be faster)
  • banning delay() from them? using instead an if (canBeDoneNow) { doit(); } pattern (this is starting to look like a scheduler; see the sketch after this list)
  • a multitask OS (but the programming model is different)
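The if (canBeDoneNow) { doit(); } pattern referenced above, as a minimal sketch assuming the core's esp8266::polledTimeout::periodicMs from PolledTimeout.h (doit() is just a placeholder for the periodic work):

    #include <Arduino.h>
    #include <PolledTimeout.h>

    // Placeholder for the actual periodic work; it must not block or delay().
    static void doit() {
      Serial.println("tick");
    }

    void setup() {
      Serial.begin(115200);
    }

    void loop() {
      // Check-and-run instead of delay(): a cheap test on every pass, real
      // work only once the 100 ms period has elapsed.
      static esp8266::polledTimeout::periodicMs canBeDoneNow(100);
      if (canBeDoneNow) {
        doit();
      }
    }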

dok-net (Contributor) commented May 29, 2019

@devyte It's great that you mention Ticker. I was beginning to plan in this direction, along the following lines: determine which delayed functions will not run during the next invocation of run_scheduled_functions and set them up with a Ticker INSTEAD of in the queue. Once the Ticker finds a function "runnable", the Ticker callback would insert the function into the scheduler queue. After it returns true, the scheduler would return the function object to the Ticker, and so forth. Is this possible without violating some system constraint?

About my performance wishes: it would be great to have enough performance AND precision to schedule the line signal changes of bit output at up to 8.6 µs resolution (1/115200 bps serial). Or anything below that: my algorithm determines the necessary signal inversions and could fall back (change/live-lock) for signal rates that are too high.

Probably that's asking too much from cooperative multitasking, but I'd like to try it.

hreintke (Contributor) commented

@d-a-v @devyte @dok-net
I personally do not have high performance requirements; I just depend on the "run in loop and not in yield" behavior.
To limit the performance impact, I think the best option is to avoid checking the complete list on every run_scheduled_functions. Having a Ticker for every scheduled function will quickly hit limits.
Having a list sorted by "next time to execute" and only one Ticker to move entries to a "run now" list is probably better.

For the functions that need to run/be scheduled at regular intervals, there is already the possibility to use the Ticker:

Ticker t;
t.attach_ms_scheduled(1000, [](){ /* runs via the scheduler, in CONT */ });

For higher precision, I think the ETSTimer provides microsecond precision, not yet used in Ticker.

devyte (Collaborator) commented May 29, 2019

Because the code is executed as though it were part of the loop, high timing precision between loops can't be assured. Precision depends on the application: consider wifimanager, where the execution path is completely hijacked for an extended period of time before returning. Consider File streaming, where the execution path doesn't return until the File is done. The precision in CONT depends on the frequency of loop calls and now on the frequency of calls to yield/delay.
If precision is needed, the Ticker should be used directly, or if high precision is needed, an os timer should be used. Top precision is assured with a hw timer.
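To illustrate the hardware-timer end of that scale, a minimal sketch assuming the core's timer1 API (timer1_isr_init(), timer1_attachInterrupt(), timer1_enable(), timer1_write()); the ISR runs outside CONT, so it must live in IRAM and must not block, allocate, or call delay()/yield():

    #include <Arduino.h>

    volatile uint32_t isrTicks = 0;

    // Hardware timer ISR: keep it short, no heap use, no delay/yield.
    void ICACHE_RAM_ATTR onTimer() {
      isrTicks++;
    }

    void setup() {
      Serial.begin(115200);
      timer1_isr_init();
      timer1_attachInterrupt(onTimer);
      // TIM_DIV16: 80 MHz / 16 = 5 ticks per µs; TIM_LOOP: auto-reload.
      timer1_enable(TIM_DIV16, TIM_EDGE, TIM_LOOP);
      timer1_write(5 * 100);   // fire every 100 µs
    }

    void loop() {
      static uint32_t last = 0;
      uint32_t now = isrTicks;
      if (now != last) {
        last = now;
        // React to timer ticks here, back in CONT context.
      }
    }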

About performance, if I understand correctly, the issue is that currently we're essentially polling the entire queue to find what to run, and that is done on every loop. Because the approach is polled, there is high CPU load. I see the following possibilities, including the suggestion above:

  1. Maintain 2 queues: one for 1-shot functions, which would work as before, and one for recurrent items.
  2. The recurrent items would not get polled like now. Instead, there would be a Ticker which would trigger every so often and copy the due item from the recurrent list into the 1-shot list.
  3. Alternative to 2: Use a FIFO queue for 1-shot and an ordered queue for recurring, where the first item is always the next to execute. Therefore you only need to poll the first item every loop instead of the entire queue (sketched after this list).
  4. Recurrence should be limited in frequency, i.e.: disallow periods smaller than some value, e.g.: 50ms.
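A sketch of option 3 above, not the core's implementation (the RecurringItem type and helper below are made up for the example): keep the recurring entries ordered by next due time so that each pass only has to look at the head:

    #include <Arduino.h>
    #include <functional>
    #include <queue>
    #include <vector>

    // Made-up entry type for illustration.
    struct RecurringItem {
      uint32_t nextDueUs;            // absolute time of next run (micros())
      uint32_t periodUs;
      std::function<bool()> fn;      // return false to unschedule
    };

    // Min-heap ordered by nextDueUs: the head is always the next to execute.
    struct Later {
      bool operator()(const RecurringItem& a, const RecurringItem& b) const {
        return (int32_t)(a.nextDueUs - b.nextDueUs) > 0;   // wrap-safe compare
      }
    };
    static std::priority_queue<RecurringItem, std::vector<RecurringItem>, Later> recurring;

    // Called once per loop(): O(1) when nothing is due, instead of scanning
    // the whole list.
    void runDueRecurring() {
      uint32_t now = micros();
      while (!recurring.empty() &&
             (int32_t)(now - recurring.top().nextDueUs) >= 0) {
        RecurringItem item = recurring.top();
        recurring.pop();
        if (item.fn()) {             // still wants to run?
          item.nextDueUs += item.periodUs;
          recurring.push(item);      // reinsert, keeping the order
        }
      }
    }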

dok-net (Contributor) commented May 29, 2019

@devyte I was typing the "wish" for µs between appointments - it's impossible for the obvious reasons you've restated for me. It's good that everyone is moving in the direction of acknowledging that polling all the timeout objects on each yield or loop has a noticeable impact. In the case of the circular queue that I am proposing, there's even a copy involved on each round-trip!
For your point 2 above, I have a question: is there a limit to the number of Ticker objects that can be created? Does the SDK itself keep the ETSTIMER instances in an ordered linked list for efficiency?
Please consider PR #6164, which fixes the heap fragmentation issues caused by the Ticker class.
