-
Notifications
You must be signed in to change notification settings - Fork 780
Refactor expression runner so it can be used via the C and JS APIs #2702
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Last commit adds a mechanism to support evaluating expressions like runner.runAndDispose(
module.i32.add(
module.block(null, [
module.local.set(0, module.i32.const(5)),
module.local.get(0, binaryen.i32)
], binaryen.i32),
module.i32.const(1)
)
); where a value is |
This now also implements what I mentioned earlier, in that constant local and global values known beforehand can be set explicitly. NFCI for existing code using the runner, but will open up new possibilities in that a generator does not have to inline these values manually, but can instead use locals as long as their constant value is known, e.g. Another interesting feature would be to utilize a sub-runner to traverse into lightweight function calls, so something like |
Last commit now also traverses into (simple) functions, which has some minor but beneficial effects on existing tests. However, for this to work I had to add a |
Hmm, appears this leads to non-determinism now, depending on the order that functions become optimized in, so sometimes a function is simple enough to evaluate and sometimes it's not (yet) :( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! I just have a few questions and small suggestions, but otherwise LGTM.
src/binaryen-c.h
Outdated
BINARYEN_API ExpressionRunnerFlags ExpressionRunnerFlagsDefault(); | ||
|
||
// Be very careful to preserve any side effects, like those of a `local.tee`, | ||
// for example when we are going to replace the expression afterwards. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to see more explanation of what preserving side effects means. Where/how are the side effects preserved?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This now reads
// Be very careful to preserve any side effects. For example, if we are
// intending to replace the expression with a constant afterwards, even if we
// can technically evaluate down to a constant, we still cannot replace the
// expression if it also sets a local, which must be preserved in this scenario
// so subsequent code keeps functioning.
src/binaryen-c.h
Outdated
// might or might not have been optimized already to something we can traverse | ||
// successfully, in turn leading to non-deterministic behavior. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// might or might not have been optimized already to something we can traverse | |
// successfully, in turn leading to non-deterministic behavior. | |
// might be concurrently modified, leading to undefined behavior. |
The problem here is not so much that we don't know what state the other function is in, but that it's state could be changing and inconsistent when we try to traverse it. It would also be good to mention how this flag interacts with the PreserveSideEffects
flag, if at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Used your suggested change and mentioned that traversing another function uses this runner's flags, which implies PreserveSideEffects
.
src/binaryen-c.h
Outdated
BinaryenIndex maxLoopIterations); | ||
|
||
// Sets a known local value to use. Order matters if expressions have side | ||
// effects. Returns `true` if the expression actually evaluates to a constant. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are the side effects of these expressions preserved even without the PreserveSideEffects
flag?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This now reads
// Sets a known local value to use. Order matters if expressions have side
// effects. For example, if the expression also sets a local, this side effect
// will also happen (not affected by any flags). Returns `true` if the
// expression actually evaluates to a constant.
// Check if a constant value has been set in the context of this runner. | ||
auto iter = localValues.find(curr->index); | ||
if (iter != localValues.end()) { | ||
return Flow(std::move(iter->second)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this a std::move
? What if the same local is gotten twice?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding here is that creating the std::pair<const wasm::Index, wasm::Literals>
will make a copy of the wasm::Literals
value, so using a std::move
here hints that we can move that volatile copy instead of copying twice and dumping one. Perfectly possible that I don't actually know what I'm doing. Please advise :)
} | ||
// Otherwise remember the constant value set, if any, for subsequent gets. | ||
if (!setFlow.breaking()) { | ||
setLocalValue(curr->index, setFlow.values); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couldn't there be subsequent gets if this is a tee, too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, updated the code accordingly
module.i32.const(1) | ||
) | ||
); | ||
assert(expr === 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be more idiomatic for the JS API to turn this into null
or something like that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Figured that one would typically test this as !expr
anyway, just doing an overly precise check here for testing purposes. Would imagine that not mixing 0
and null
has benefits for the JIT.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An alternative here is to return the unmodified original expression. Would that be an improvement?
test/binaryen.js/expressionrunner.js
Outdated
module.local.get(0, binaryen.i32) | ||
) | ||
); | ||
assert(JSON.stringify(binaryen.getExpressionInfo(expr)) === '{"id":14,"type":2,"value":8}'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would make these tests easier to understand if the JSON were not hardcoded with the raw numbers for id
and type
. Would it be possible to explicitly construct the expected expressions instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Used the respective constants now and added a little assertDeepEqual
to make it more easily readable.
@@ -258,7 +258,10 @@ | |||
(i64.const 42) | |||
) | |||
) | |||
(func $reftype-test (; 18 ;) (result nullref) | |||
(func $loop-precompute (; 18 ;) (result i32) | |||
(i32.const 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome 👍
} | ||
|
||
BinaryenExpressionRef | ||
ExpressionRunnerRunAndDispose(ExpressionRunnerRef runner, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the runner fails, it seems like it would be useful to expose more information to the caller about why it failed. That way the user could choose to increase the depth or loop count, if applicable. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, good question. Seems like this might be a bit too much, considering how it complicates the API. For instance, on the AssemblyScript side I expect to always use a reasonable maxDepth
(or none) and give up otherwise as there is no reason to make an exception using larger limits. Would have used that limit right away then.
src/binaryen-c.cpp
Outdated
BinaryenIndex maxDepth, | ||
BinaryenIndex maxLoopIterations) { | ||
if (tracing) { | ||
std::cout << " the_runner = ExpressionRunnerCreate(the_module, " << flags |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's unfortunate that this will only work correctly if there is at most one Runner created at a time, but it's probably not worth fixing urgently. Perhaps you could at least leave a TODO about it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implemented something working, but the code for it turned out to be a bit unattractive since, other than expressions etc., runners can be deleted leading to undefined behavior in tracing. Added comments.
Is this ready to land, or still waiting for review from @tlively ? |
I'll take a final look now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just two smallish nits, but I'd be happy to see this merged as-is and have those cleaned up in non-urgent follow-ups. @kripken feel free to merge if you'd like.
src/binaryen-c.cpp
Outdated
|
||
// Even though unlikely, it is possible that we are trying to use an id that is | ||
// still in use after wrapping around, which we must prevent. | ||
std::unordered_set<size_t> usedExpressionRunnerIds; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should probably be static
, too. You could get extra fancy by making both of these helpers static
variables inside of noteExpressionRunner
to limit their scope, but I'll leave that up to you. OTOH, it would probably be better to just say we don't support making more than max size_t expression runners and get rid of all this logic, especially since it is literally impossible to have than many expression runners recorded in expressionRunners
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly thinking in terms of a very long lived process using Binaryen, let's say where modules are being created as-a-service. While we can't store max size_t in the structure, we might at some point overflow, where the likely scenario is that this is just fine, yet guarding for not reusing something left over (i.e. from a module created and never disposed) seems like a good precaution to have. Unlikely that someone will do this with tracing enabled, ofc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aha, I had missed that the ExpressionRunner
s were removed from the expressionRunners
map when they were destroyed 👍
test/binaryen.js/expressionrunner.js
Outdated
function assertDeepEqual(x, y) { | ||
if (typeof x === "object") { | ||
for (var i in x) assertDeepEqual(x[i], y[i]); | ||
for (i in y) assertDeepEqual(x[i], y[i]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we do var i
here, too, or would that be unidiomatic or bad? Seeing the variable reused like this gives me the heebie jeebies.
On it, will fix these two real quick :) |
@dcodeIO How is this for a commit message? (This just copied from the opening description) Refactors most of the precompute pass's expression runner into its base class so it can also be used via the C and JS APIs. Also adds the option to populate the runner with known constant local and global values upfront, and remembers assigned intermediate values as well as traversing into functions if requested. C-API: JS-API: |
Looks good :) (have been trying for a while to keep the first post good for a commit message, sometimes divided by a horizontal line to indicate where additional comments start) |
@dcodeIO Looks like there is a merge conflict to resolve now :( |
Ok great, merging! Thanks for all the work here, and sorry it took this long, but sometimes more complex changes end up that way... |
I saw this now, and sorry for late questions. It looks this PR duplicates many of functionalities of |
Was under the impression that |
Thanks for the answer. Yes, I now think the functionalities duplicated are not in |
Tackles the concerns raised in #2797 directly related to #2702 by reverting merging all of `PrecomputeExpressionRunner` into the base `ExpressionRunner`, instead adding a common base for both the precompute pass and the new C-API to inherit. No functional changes. --- ### Current hierarchy after #2702 is ``` ExpressionRunner ├ [PrecomputeExpressionRunner] ├ [CExpressionRunner] ├ ConstantExpressionRunner └ RuntimeExpressionRunner ``` where `ExpressionRunner` contains functionality not utilized by `ConstantExpressionRunner` and `RuntimeExpressionRunner`. ### New hierarchy will be: ``` ExpressionRunner ├ ConstantExpressionRunner │ ├ [PrecomputeExpressionRunner] │ └ [CExpressionRunner] ├ InitializerExpressionRunner └ RuntimeExpressionRunner ``` with the precompute pass's and the C-API's shared functionality now moved out of `ExpressionRunner` into a new `ConstantExpressionRunner`. Also renames the previous `ConstantExpressionRunner` to `InitializerExpressionRunner` to [better represent its uses](https://webassembly.org/docs/modules/#initializer-expression) and to make its previous name usable for the new intermediate template, where it fits perfectly. Also adds a few comments answering some of the questions that came up recently. ### Old hierarchy before #2702 for comparison: ``` ExpressionRunner ├ [PrecomputeExpressionRunner] ├ ConstantExpressionRunner └ RuntimeExpressionRunner ```
Refactors most of the precompute pass's expression runner into its base class so it can also be used via the C and JS APIs. Also adds the option to populate the runner with known constant local and global values upfront, and remembers assigned intermediate values as well as traversing into functions if requested.
C-API:
JS-API: