Refactor expression runner so it can be used via the C and JS APIs #2702

dcodeIO · 2020-03-18T16:13:48Z

Refactors most of the precompute pass's expression runner into its base class so it can also be used via the C and JS APIs. Also adds the option to populate the runner with known constant local and global values upfront. The runner now also remembers assigned intermediate values and, if requested, traverses into functions.

C-API:

ExpressionRunnerFlagsDefault()
ExpressionRunnerFlagsPreserveSideeffects()
ExpressionRunnerFlagsTraverseCalls()
ExpressionRunnerCreate(module, flags, maxDepth, maxLoopIterations)
ExpressionRunnerSetLocalValue(runner, index, value)
ExpressionRunnerSetGlobalValue(runner, name, value)
ExpressionRunnerRunAndDispose(runner, expr)

JS-API:

binaryen.ExpressionRunner.Flags.Default
binaryen.ExpressionRunner.Flags.PreserveSideeffects
binaryen.ExpressionRunner.Flags.TraverseCalls
new binaryen.ExpressionRunner(module[, flags[, maxDepth[, maxLoopIterations]]])
binaryen.ExpressionRunner#setLocalValue(index, valueExpr)
binaryen.ExpressionRunner#setGlobalValue(name, valueExpr)
binaryen.ExpressionRunner#runAndDispose(expr)

dcodeIO · 2020-03-19T23:31:58Z

Last commit adds a mechanism to support evaluating expressions like

runner.runAndDispose(
  module.i32.add(
    module.block(null, [
      module.local.set(0, module.i32.const(5)),
      module.local.get(0, binaryen.i32)
    ], binaryen.i32),
    module.i32.const(1)
  )
);

where a value is local.set and later picked up again by a local.get, which is a common pattern in our code. Appears that precompute tests are unaffected by this, but chances are that I overlooked something. For instance, at one point I'm returning Flow() to indicate no value, instead of a breaking Flow(NONSTANDALONE_FLOW), which I'm not certain about.
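The set-then-get pattern described above can be sketched with a toy evaluator. This is only an illustration under stated assumptions: `Expr`, `Flow`, `ToyRunner`, and `make` are hypothetical names invented for this sketch and are not Binaryen's IR or its `ExpressionRunner`. The runner remembers the constant assigned by a `local.set` so that a later `local.get` of the same index can be resolved.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <optional>
#include <unordered_map>
#include <utility>
#include <vector>

// Toy expression tree (hypothetical, not Binaryen's IR).
struct Expr;
using ExprPtr = std::shared_ptr<Expr>;
struct Expr {
  enum Kind { Const, LocalGet, LocalSet, Add, Block };
  Kind kind;
  std::int32_t value;        // used by Const
  std::uint32_t index;       // used by LocalGet / LocalSet
  std::vector<ExprPtr> kids; // operands or block children
};

ExprPtr make(Expr::Kind kind, std::int32_t value = 0, std::uint32_t index = 0,
             std::vector<ExprPtr> kids = {}) {
  return std::make_shared<Expr>(Expr{kind, value, index, std::move(kids)});
}

// Result of evaluating a node: `constant` is false when the runner gave up;
// `value` is empty for statements such as local.set.
struct Flow {
  bool constant;
  std::optional<std::int32_t> value;
};

struct ToyRunner {
  // Constants known for locals, either preset or remembered from a set.
  std::unordered_map<std::uint32_t, std::int32_t> localValues;

  Flow visit(const Expr& e) {
    switch (e.kind) {
      case Expr::Const:
        return {true, e.value};
      case Expr::LocalGet: {
        auto it = localValues.find(e.index);
        if (it == localValues.end()) return {false, std::nullopt};
        return {true, it->second}; // copy, so the local can be read again
      }
      case Expr::LocalSet: {
        Flow v = visit(*e.kids[0]);
        if (!v.constant || !v.value) return {false, std::nullopt};
        localValues[e.index] = *v.value; // remember for subsequent gets
        return {true, std::nullopt};
      }
      case Expr::Add: {
        Flow l = visit(*e.kids[0]);
        Flow r = visit(*e.kids[1]);
        if (!l.constant || !l.value || !r.constant || !r.value)
          return {false, std::nullopt};
        // Wrapping addition, as for i32.add.
        return {true, static_cast<std::int32_t>(
                        static_cast<std::uint32_t>(*l.value) +
                        static_cast<std::uint32_t>(*r.value))};
      }
      case Expr::Block: {
        Flow last{true, std::nullopt};
        for (const auto& k : e.kids) {
          last = visit(*k);
          if (!last.constant) return last;
        }
        return last; // a block yields the value of its last child
      }
    }
    return {false, std::nullopt};
  }
};

// Mirrors the JS snippet: (block (local.set 0 (5)) (local.get 0)) + 1.
Flow runExample() {
  ToyRunner runner;
  auto block = make(Expr::Block, 0, 0,
                    {make(Expr::LocalSet, 0, 0, {make(Expr::Const, 5)}),
                     make(Expr::LocalGet, 0, 0)});
  return runner.visit(*make(Expr::Add, 0, 0, {block, make(Expr::Const, 1)}));
}
```

In this sketch a `local.get` of a local that was never set makes the whole evaluation non-constant, which corresponds to the runner giving up rather than guessing.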

dcodeIO · 2020-03-20T14:28:06Z

This now also implements what I mentioned earlier, in that constant local and global values known beforehand can be set explicitly. NFCI for existing code using the runner, but will open up new possibilities in that a generator does not have to inline these values manually, but can instead use locals as long as their constant value is known, e.g. const in JS where sometimes the value is known to be a compile time constant, and sometimes is just meant to be readonly.

Another interesting feature would be to utilize a sub-runner to traverse into lightweight function calls, so something like clamp or max functions can be evaluated.

dcodeIO · 2020-03-21T17:13:39Z

Last commit now also traverses into (simple) functions, which has some minor but beneficial effects on existing tests. However, for this to work I had to add a trapIfInvalid flag to the base class since it would otherwise abort on some test cases like test/unit.wat, which seems odd and might indicate that there is something wrong either in my code or in general.

dcodeIO · 2020-03-21T17:28:09Z

Hmm, appears this leads to non-determinism now, depending on the order that functions become optimized in, so sometimes a function is simple enough to evaluate and sometimes it's not (yet) :(

tlively

Looks great! I just have a few questions and small suggestions, but otherwise LGTM.

tlively · 2020-04-07T01:28:05Z

src/binaryen-c.h

+BINARYEN_API ExpressionRunnerFlags ExpressionRunnerFlagsDefault();
+
+// Be very careful to preserve any side effects, like those of a `local.tee`,
+// for example when we are going to replace the expression afterwards.


I would like to see more explanation of what preserving side effects means. Where/how are the side effects preserved?

This now reads

// Be very careful to preserve any side effects. For example, if we are
// intending to replace the expression with a constant afterwards, even if we
// can technically evaluate down to a constant, we still cannot replace the
// expression if it also sets a local, which must be preserved in this scenario
// so subsequent code keeps functioning.

tlively · 2020-04-07T01:31:33Z

src/binaryen-c.h

+// might or might not have been optimized already to something we can traverse
+// successfully, in turn leading to non-deterministic behavior.


Suggested change

-// might or might not have been optimized already to something we can traverse
-// successfully, in turn leading to non-deterministic behavior.
+// might be concurrently modified, leading to undefined behavior.

The problem here is not so much that we don't know what state the other function is in, but that its state could be changing and inconsistent when we try to traverse it. It would also be good to mention how this flag interacts with the PreserveSideEffects flag, if at all.

Used your suggested change and mentioned that traversing another function uses this runner's flags, which implies PreserveSideEffects.

tlively · 2020-04-07T01:33:25Z

src/binaryen-c.h

+                       BinaryenIndex maxLoopIterations);
+
+// Sets a known local value to use. Order matters if expressions have side
+// effects. Returns `true` if the expression actually evaluates to a constant.


Are the side effects of these expressions preserved even without the PreserveSideEffects flag?

This now reads

// Sets a known local value to use. Order matters if expressions have side
// effects. For example, if the expression also sets a local, this side effect
// will also happen (not affected by any flags). Returns `true` if the
// expression actually evaluates to a constant.

tlively · 2020-04-07T01:43:40Z

src/wasm-interpreter.h

+    // Check if a constant value has been set in the context of this runner.
+    auto iter = localValues.find(curr->index);
+    if (iter != localValues.end()) {
+      return Flow(std::move(iter->second));


Why is this a std::move? What if the same local is gotten twice?

My understanding here is that creating the std::pair<const wasm::Index, wasm::Literals> will make a copy of the wasm::Literals value, so using a std::move here hints that we can move that volatile copy instead of copying twice and dumping one. Perfectly possible that I don't actually know what I'm doing. Please advise :)
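The reviewer's question ("What if the same local is gotten twice?") can be illustrated with a small stand-alone sketch. It assumes `std::vector<int>` as a stand-in for `wasm::Literals`; `getByMove` and `getByCopy` are hypothetical helpers written for this example. Moving out of `iter->second` does not move a temporary copy: it steals the contents of the value stored in the map, so a second get sees a moved-from (for `std::vector`, empty) value. The follow-up fix referenced later in this thread removed the `std::move`s.

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// Stand-in for wasm::Literals (an assumption made for this sketch).
using Values = std::vector<int>;

// Mirrors the reviewed snippet: moves out of the map entry.
// Assumes `index` is present in the map.
Values getByMove(std::map<unsigned, Values>& localValues, unsigned index) {
  auto iter = localValues.find(index);
  return std::move(iter->second); // steals the stored value's contents
}

// Copies instead, leaving the stored value intact for later gets.
// Assumes `index` is present in the map.
Values getByCopy(const std::map<unsigned, Values>& localValues, unsigned index) {
  auto iter = localValues.find(index);
  return iter->second;
}
```

With a map holding one value for local 0, two calls to `getByCopy` both return that value, whereas the second call to `getByMove` returns an empty vector because the first call emptied the stored entry.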

tlively · 2020-04-07T01:46:29Z

src/wasm-interpreter.h

+      }
+      // Otherwise remember the constant value set, if any, for subsequent gets.
+      if (!setFlow.breaking()) {
+        setLocalValue(curr->index, setFlow.values);


Couldn't there be subsequent gets if this is a tee, too?

Good catch, updated the code accordingly

tlively · 2020-04-07T02:06:03Z

test/binaryen.js/expressionrunner.js

+    module.i32.const(1)
+  )
+);
+assert(expr === 0);


Would it be more idiomatic for the JS API to turn this into null or something like that?

Figured that one would typically test this as !expr anyway, just doing an overly precise check here for testing purposes. Would imagine that not mixing 0 and null has benefits for the JIT.

An alternative here is to return the unmodified original expression. Would that be an improvement?

tlively · 2020-04-07T02:09:01Z

test/binaryen.js/expressionrunner.js

+    module.local.get(0, binaryen.i32)
+  )
+);
+assert(JSON.stringify(binaryen.getExpressionInfo(expr)) === '{"id":14,"type":2,"value":8}');


I think it would make these tests easier to understand if the JSON were not hardcoded with the raw numbers for id and type. Would it be possible to explicitly construct the expected expressions instead?

Used the respective constants now and added a little assertDeepEqual to make it more easily readable.

tlively · 2020-04-07T02:10:08Z

test/passes/precompute_all-features.txt

@@ -258,7 +258,10 @@
   (i64.const 42)
  )
 )
- (func $reftype-test (; 18 ;) (result nullref)
+ (func $loop-precompute (; 18 ;) (result i32)
+  (i32.const 1)


Awesome 👍

tlively · 2020-04-07T02:12:11Z

src/binaryen-c.cpp

+}
+
+BinaryenExpressionRef
+ExpressionRunnerRunAndDispose(ExpressionRunnerRef runner,


If the runner fails, it seems like it would be useful to expose more information to the caller about why it failed. That way the user could choose to increase the depth or loop count, if applicable. What do you think?

Hmm, good question. Seems like this might be a bit too much, considering how it complicates the API. For instance, on the AssemblyScript side I expect to always use a reasonable maxDepth (or none) and give up otherwise as there is no reason to make an exception using larger limits. Would have used that limit right away then.

tlively · 2020-04-07T02:13:28Z

src/binaryen-c.cpp

+                                           BinaryenIndex maxDepth,
+                                           BinaryenIndex maxLoopIterations) {
+  if (tracing) {
+    std::cout << "  the_runner = ExpressionRunnerCreate(the_module, " << flags


It's unfortunate that this will only work correctly if there is at most one Runner created at a time, but it's probably not worth fixing urgently. Perhaps you could at least leave a TODO about it?

Implemented something working, but the code for it turned out to be a bit unattractive since, unlike expressions etc., runners can be deleted, which would lead to undefined behavior in tracing. Added comments.

kripken · 2020-04-16T20:31:01Z

Is this ready to land, or still waiting for review from @tlively ?

tlively · 2020-04-16T21:06:56Z

I'll take a final look now.

tlively

Just two smallish nits, but I'd be happy to see this merged as-is and have those cleaned up in non-urgent follow-ups. @kripken feel free to merge if you'd like.

tlively · 2020-04-16T21:19:54Z

src/binaryen-c.cpp

+
+// Even though unlikely, it is possible that we are trying to use an id that is
+// still in use after wrapping around, which we must prevent.
+std::unordered_set<size_t> usedExpressionRunnerIds;


This should probably be static, too. You could get extra fancy by making both of these helpers static variables inside of noteExpressionRunner to limit their scope, but I'll leave that up to you. OTOH, it would probably be better to just say we don't support making more than max size_t expression runners and get rid of all this logic, especially since it is literally impossible to have that many expression runners recorded in expressionRunners.

Mostly thinking in terms of a very long lived process using Binaryen, let's say where modules are being created as-a-service. While we can't store max size_t in the structure, we might at some point overflow, where the likely scenario is that this is just fine, yet guarding for not reusing something left over (i.e. from a module created and never disposed) seems like a good precaution to have. Unlikely that someone will do this with tracing enabled, ofc.

Aha, I had missed that the ExpressionRunners were removed from the expressionRunners map when they were destroyed 👍
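The wrap-around guard discussed in this thread can be sketched as a small id allocator. This is a hypothetical illustration, not the code in `src/binaryen-c.cpp`: the class name `IdAllocator` and its members are assumptions. Ids come from an unsigned counter that may overflow in a long-lived process, and a set of ids still in use prevents handing out one that was never released.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <unordered_set>

// Hypothetical allocator; templated on the id type so that wrap-around can be
// exercised with a small type such as std::uint8_t.
template <typename Id = std::size_t> class IdAllocator {
  static_assert(std::is_unsigned<Id>::value,
                "wrap-around relies on unsigned arithmetic");
  Id next = 0;
  std::unordered_set<Id> used; // ids handed out and not yet released

public:
  // Returns an id that is not currently in use. The counter wraps to 0 on
  // overflow. Assumes fewer live ids than the id type can represent;
  // otherwise this would loop forever.
  Id acquire() {
    while (used.count(next)) {
      ++next;
    }
    Id id = next++;
    used.insert(id);
    return id;
  }

  // Marks an id as free again, e.g. when its runner is disposed.
  void release(Id id) { used.erase(id); }
};
```

For example, with `IdAllocator<std::uint8_t>`, if id 0 is kept alive while ids 1 to 255 are acquired and released in turn, the counter wraps to 0, finds it still in use, and the next `acquire()` returns 1 instead.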

tlively · 2020-04-16T21:34:51Z

test/binaryen.js/expressionrunner.js

+function assertDeepEqual(x, y) {
+  if (typeof x === "object") {
+    for (var i in x) assertDeepEqual(x[i], y[i]);
+    for (i in y) assertDeepEqual(x[i], y[i]);


Can we do var i here, too, or would that be unidiomatic or bad? Seeing the variable reused like this gives me the heebie jeebies.

dcodeIO · 2020-04-16T21:43:09Z

On it, will fix these two real quick :)

tlively · 2020-04-17T00:45:34Z

@dcodeIO How is this for a commit message? (This is just copied from the opening description)

Refactors most of the precompute pass's expression runner into its base class so it can also be used via the C and JS APIs. Also adds the option to populate the runner with known constant local and global values upfront. The runner now also remembers assigned intermediate values and, if requested, traverses into functions.

C-API:
ExpressionRunnerFlagsDefault()
ExpressionRunnerFlagsPreserveSideeffects()
ExpressionRunnerFlagsTraverseCalls()
ExpressionRunnerCreate(module, flags, maxDepth, maxLoopIterations)
ExpressionRunnerSetLocalValue(runner, index, value)
ExpressionRunnerSetGlobalValue(runner, name, value)
ExpressionRunnerRunAndDispose(runner, expr)

JS-API:
binaryen.ExpressionRunner.Flags.Default
binaryen.ExpressionRunner.Flags.PreserveSideeffects
binaryen.ExpressionRunner.Flags.TraverseCalls
new binaryen.ExpressionRunner(module[, flags[, maxDepth[, maxLoopIterations]]])
binaryen.ExpressionRunner#setLocalValue(index, valueExpr)
binaryen.ExpressionRunner#setGlobalValue(name, valueExpr)
binaryen.ExpressionRunner#runAndDispose(expr)

dcodeIO · 2020-04-17T01:23:50Z

Looks good :) (have been trying for a while to keep the first post good for a commit message, sometimes divided by a horizontal line to indicate where additional comments start)

tlively · 2020-04-17T02:09:59Z

@dcodeIO Looks like there is a merge conflict to resolve now :(

kripken · 2020-04-20T21:00:33Z

Ok great, merging! Thanks for all the work here, and sorry it took this long, but sometimes more complex changes end up that way...

aheejin · 2020-04-23T11:02:30Z

I saw this just now, and sorry for the late questions. It looks like this PR duplicates much of the functionality of RuntimeExpressionRunner in ExpressionRunner. Why is that? If Binaryen and the C API want to make use of it, why can't they just use RuntimeExpressionRunner? I only took a brief look and I might well be mistaken, so please advise!

dcodeIO · 2020-04-23T15:14:07Z

Was under the impression that RuntimeExpressionRunner is a different beast and requires a lot of context, like memory and external interface, where what I wanted to achieve was to quickly evaluate an expression with limited context while the module is still being generated, for example to check if a static condition (i.e. in generics supporting varying types) is statically true or false, again affecting codegen. Some of these changes are also beneficial to the precompute pass (or previously lived in a separate PrecomputeExpressionRunner), which does something very similar.

Fixes #2788 found by the fuzzer, introduced in #2702, which turned out to be incorrect usage of std::move, by removing any std::moves introduced in that PR to be better safe than sorry. Also fixes problems with WASM_INTERPRETER_DEBUG spotted during debugging.

aheejin · 2020-04-23T22:19:30Z

Thanks for the answer. Yes, I now think the functionalities duplicated are not in RuntimeExpressionRunner but more in PrecomputeExpressionRunner. I opened #2797 for this.

Tackles the concerns raised in #2797 directly related to #2702 by reverting merging all of `PrecomputeExpressionRunner` into the base `ExpressionRunner`, instead adding a common base for both the precompute pass and the new C-API to inherit. No functional changes.

---

### Current hierarchy after #2702 is

```
ExpressionRunner
├ [PrecomputeExpressionRunner]
├ [CExpressionRunner]
├ ConstantExpressionRunner
└ RuntimeExpressionRunner
```

where `ExpressionRunner` contains functionality not utilized by `ConstantExpressionRunner` and `RuntimeExpressionRunner`.

### New hierarchy will be:

```
ExpressionRunner
├ ConstantExpressionRunner
│ ├ [PrecomputeExpressionRunner]
│ └ [CExpressionRunner]
├ InitializerExpressionRunner
└ RuntimeExpressionRunner
```

with the precompute pass's and the C-API's shared functionality now moved out of `ExpressionRunner` into a new `ConstantExpressionRunner`. Also renames the previous `ConstantExpressionRunner` to `InitializerExpressionRunner` to [better represent its uses](https://webassembly.org/docs/modules/#initializer-expression) and to make its previous name usable for the new intermediate template, where it fits perfectly. Also adds a few comments answering some of the questions that came up recently.

### Old hierarchy before #2702 for comparison:

```
ExpressionRunner
├ [PrecomputeExpressionRunner]
├ ConstantExpressionRunner
└ RuntimeExpressionRunner
```
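As a rough sketch of the new hierarchy, the class skeleton below places the state shared by the precompute pass and the C-API in an intermediate template. The class names follow the description above; everything else is an assumption made for illustration, including the CRTP-style `SubType` parameter, the `localValues` member, and the `std::int64_t` placeholder value type. It is not the actual contents of `src/wasm-interpreter.h`.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>
#include <unordered_map>

// Root: generic evaluation machinery shared by every runner (body omitted).
template <typename SubType> class ExpressionRunner {
public:
  virtual ~ExpressionRunner() = default;
};

// Intermediate template: state only needed when evaluating to constants.
// The members here are placeholders chosen for this sketch.
template <typename SubType>
class ConstantExpressionRunner : public ExpressionRunner<SubType> {
protected:
  std::unordered_map<std::uint32_t, std::int64_t> localValues;

public:
  void setLocalValue(std::uint32_t index, std::int64_t value) {
    localValues[index] = value;
  }
  bool hasLocalValue(std::uint32_t index) const {
    return localValues.count(index) != 0;
  }
};

// Users of the shared constant-evaluation functionality.
class PrecomputeExpressionRunner
  : public ConstantExpressionRunner<PrecomputeExpressionRunner> {};
class CExpressionRunner : public ConstantExpressionRunner<CExpressionRunner> {};

// Runners that derive from the root directly and do not carry that state.
class InitializerExpressionRunner
  : public ExpressionRunner<InitializerExpressionRunner> {};
class RuntimeExpressionRunner
  : public ExpressionRunner<RuntimeExpressionRunner> {};
```

The point of the arrangement is that `InitializerExpressionRunner` and `RuntimeExpressionRunner` no longer inherit members they never use, while the precompute pass and the C-API still share one implementation.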

dcodeIO added 4 commits March 18, 2020 17:03

Derive standalone expression runner from precompute pass

132cb1d

handle traps, format

bbe1dc5

fix?

3f3c02d

update tests

461576d

tlively reviewed Mar 18, 2020

refactor replaceExpresion to an enum

f5e3837

dcodeIO commented Mar 19, 2020

dcodeIO marked this pull request as ready for review March 19, 2020 18:18

fix expressions[0] in tracing, track temporary local values

2ccc81d

dcodeIO added 2 commits March 20, 2020 14:08

simplify

30bd10c

implement preset local/global values

ca3ed47

could need some format on save

a853db9

kripken reviewed Mar 20, 2020

dcodeIO added 3 commits March 20, 2020 18:24

address comments

66d23f1

more documentation for getValues

35d09c2

refactor runner to its own cpp file

f57cbdc

dcodeIO changed the title ~~Derive standalone expression runner from precompute pass~~ Derive context aware expression runner from precompute pass Mar 21, 2020

traverse into simple functions

63d7520

dcodeIO added 4 commits March 21, 2020 19:41

deal with non-determinism

5432156

update API, add test

ec0a93d

retrigger CI

1ceb5b5

fix comment

d145cfc

tlively reviewed Apr 7, 2020

dcodeIO added 3 commits April 7, 2020 12:44

address (most) comments

4c8fc7c

mention interaction between TraverseCalls and PreserveSideEffects

8586c57

Merge branch 'master' into expressionrunner

1184f58

tlively approved these changes Apr 16, 2020

address comments

17e5de0

Merge branch 'master' into expressionrunner

84c27ac

kripken merged commit 483d759 into WebAssembly:master Apr 20, 2020

This was referenced Apr 20, 2020

Tackle the case of === AssemblyScript/assemblyscript#1111

Closed

Update Binaryen and utilize new ExpressionRunner API AssemblyScript/assemblyscript#1237

Merged

Further enhancements to ExpressionRunner #2786

Open

kripken mentioned this pull request Apr 22, 2020

Fuzz failures after #2702 #2788

Closed

dcodeIO mentioned this pull request Apr 22, 2020

Fix ExpressionRunner issues found by the fuzzer #2790

Merged

This was referenced Apr 23, 2020

Handle drops of unknown values in ExpressionRunner #2787

Closed

ExpressionRunner class hierarchy #2797

Closed

dcodeIO mentioned this pull request Apr 24, 2020

Refactor ExpressionRunner #2804

Merged

kripken mentioned this pull request Mar 29, 2024

Remove the TRAVERSE_CALLS option in the ConstantExpressionRunner #6449

Merged

tlively mentioned this pull request Jan 15, 2025

A better API to evaluate or precompute an expression #2699

Closed
