Skip to content

nextest timeouts don't work because cargo-miri doesn't propagate signals? #2421

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
saethlin opened this issue Jul 23, 2022 · 11 comments · Fixed by #2426
Closed

nextest timeouts don't work because cargo-miri doesn't propagate signals? #2421

saethlin opened this issue Jul 23, 2022 · 11 comments · Fixed by #2426
Labels
A-cargo Area: affects the cargo wrapper (cargo miri) C-bug Category: This is a bug.

Comments

@saethlin
Copy link
Member

cc @sunshowers

If I try to run cargo miri nextest --no-fail-fast with a long-running test that should be timed out, the Miri process executing the test doesn't exit, only the cargo-miri process that nextest is managing exits. I think this is because cargo-miri isn't passing along the SIGTERM to its child.

That all sort of makes sense to me but I feel like I'm losing my mind because normal cargo miri test doesn't have this problem. If I kill any of the cargo-miri processes it takes down the Miri process as well. I've probably spent more time than I should trying to understand this, I think I'm just missing the Unix knowledge and understanding of how nextest works.

@sunshowers
Copy link
Contributor

sunshowers commented Jul 23, 2022

Huh that's weird! How are you configuring the timeout? Are you using nextest's support for timeouts?

Nextest sends SIGTERM to the child process (test), waits 10 seconds, then sends SIGKILL. This should be foolproof.

@saethlin
Copy link
Member Author

Configured per your suggestion with --tool-config-file. Yup, nextest sends SIGTERM to a cargo-miri process which immediately exits, but leaves its child miri process just cranking away.

@sunshowers
Copy link
Contributor

Ahh that makes sense. Killing grandchild processes is a bit of a nightmare especially on Unix, so nextest doesn't try and do that at the moment (though that may change in the future).

The cargo miri process run under nextest should probably forward the SIGTERM (and honestly do a SIGKILL after a couple of seconds) to the miri process.

See this link:

https://github.com./oconnor663/duct.py/blob/master/gotchas.md#killing-grandchild-processes

@RalfJung
Copy link
Member

RalfJung commented Jul 23, 2022

I guess the problem is that cargo-miri inserts itself between cargo (or, in this case, cargo-nextest) and the processes it spawns, to be able to adjust the flags. In an ideal world this would actually use POSIX exec, and then the final process graph would look like it does without cargo-miri. But I don't think the Rust standard library exposes a plain exec.

That all sort of makes sense to me but I feel like I'm losing my mind because normal cargo miri test doesn't have this problem. If I kill any of the cargo-miri processes it takes down the Miri process as well. I've probably spent more time than I should trying to understand this, I think I'm just missing the Unix knowledge and understanding of how nextest works.

Hm, that seems surprising? If you do Ctrl-C in the shell, I think the shell does some magic and sends SIGINT to a whole load of processes (all processes attached to this terminal, or something like that?), and that's what makes it work. But a single targeted signal to cargo-miri should behave the same whether you send it by hand or whether cargo-nextest sends it.

@saethlin
Copy link
Member Author

As usual, you're right. Don't know what I was doing before, but if I cargo miri test then pkill cargo-miri, it definitely leaves the miri process running. So at least the bad behavior is consistent, maybe I was just tired.

@RalfJung RalfJung added C-bug Category: This is a bug. A-cargo Area: affects the cargo wrapper (cargo miri) labels Jul 23, 2022
@sunshowers
Copy link
Contributor

On the nextest side I'm going to try setting up a process group (looks like Bazel does this for tests).

@sunshowers
Copy link
Contributor

In nextest-rs/nextest#393 I've switched nextest over to using process groups on Unix, which should address this on the nextest end.

I'm also going to do a similar patch on Windows using job handles.

@sunshowers
Copy link
Contributor

nextest-rs/nextest#396 is the fix for Windows. Should aim to get a new release out tomorrow.

@sunshowers
Copy link
Contributor

FWIW this is addressed on nextest's end now, so the change in #2426 (while probably good in case the cargo-miri process gets a SIGTERM from some other source) isn't necessary for nextest.

@saethlin
Copy link
Member Author

Yeah, I still definitely want the exec change because it doesn't make it seem like something is broken when you hit ctrl+c, and it also makes the experience looking at top or pgrep much more familiar.

@saethlin
Copy link
Member Author

saethlin commented Jul 25, 2022

However, this issue has been addressed in nextest :)
As-written, this is closed.

bors added a commit that referenced this issue Jul 28, 2022
Use real exec on cfg(unix) targets

Closes #2421

The standard library has a platform extension trait that lets us get the behavior we want on cfg(unix), so why not use it?

I tried this out and it produces the correct behavior in concert with nextest.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-cargo Area: affects the cargo wrapper (cargo miri) C-bug Category: This is a bug.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants