native-lib/trace: Intercept allocating calls, but do nothing#4792
native-lib/trace: Intercept allocating calls, but do nothing#4792nia-e wants to merge 2 commits intorust-lang:masterfrom
Conversation
|
Thank you for contributing to Miri! A reviewer will take a look at your PR, typically within a week or two. |
428bee0 to
966775e
Compare
|
☔ The latest upstream changes (possibly 4b59d7e) made this pull request unmergeable. Please resolve the merge conflicts. |
There was a problem hiding this comment.
Thanks for the PR! I can already see I have to brace myself for some more abstraction-breaking shenanigans.^^ :)
Here's a first round of comments. As you said there's also conflicts to be resolved and CI failures to be looked into.
@rustbot author
| nix = { version = "0.30.1", features = ["mman", "ptrace", "signal"], optional = true } | ||
| ipc-channel = { version = "0.20.0", optional = true } | ||
| capstone = { version = "0.13", optional = true } | ||
| proc-maps = { version = "0.4.0", optional = true } |
There was a problem hiding this comment.
That's a huge pile of new dependencies. :/ and some duplicated ones, like libloading.
How long does this build script take? We just managed to finally get rid of the capstone overhead for check builds, I don't want to add back any new slow build scripts.
| let alloc = (); | ||
|
|
||
| trace::Supervisor::do_ffi(alloc, || { | ||
| let ty_is_sized = dest.layout.ty.is_sized(*this.tcx, this.typing_env()); |
There was a problem hiding this comment.
We do not support unsized return types for FFI calls.^^ So what is this about?
EDIT: Ah, it is factoring out the RawPtr check below -- but this is incorrect, it checks sizedness of the wrong thing. Anyway this logic looks quite different on master so hopefully a rebase will make this all go away.
| return interp_ok(ImmTy::uninit(dest.layout)); | ||
| } | ||
| ty::RawPtr(ty, ..) if ty.is_sized(*this.tcx, this.typing_env()) => { | ||
| ty::RawPtr(ty, ..) if ty_is_sized => { |
There was a problem hiding this comment.
This changes what is being checked. Previously it checked whether the pointee is sized, now you are checking of the pointer is sized (which obviously it always is).
|
|
||
| if tracing { | ||
| this.tracing_apply_accesses(maybe_memevents.unwrap())?; | ||
| let mm = maybe_memevents.unwrap(); |
There was a problem hiding this comment.
What's mm for? After unwrapping they are not "maybe" any more.
|
|
||
| // Save the machine pointer to a location where the libc interceptors can use it, | ||
| // since we can't pass in arguments. | ||
| super::parent::MACHINE_PTR.store(machine_ptr.cast(), std::sync::atomic::Ordering::Relaxed); |
There was a problem hiding this comment.
This seems very similar to native_lib_ecx_interchange which has been added by a concurrent PR -- we don't need both of those, do we?
| // TODO: This should probably only log writes & not reads, since reads | ||
| // in these functions will never expose provenance to the rest of the native | ||
| // code. However, these functions likely won't even do any reads, so... | ||
| wait_for_signal(Some(pid), signal::SIGTRAP, InitialCont::Yes, Some(catch_segfaults)) | ||
| .unwrap(); |
There was a problem hiding this comment.
What are we even doing here? Who's logging stuff? What does that have to do with exposing anything? I am confused.
The commit message says "Handle accesses during libc fn calls also". Is this something we have to do for some reason, or something you thought would be better to do but it's not mandatory?
| wait_for_signal(Some(pid), signal::SIGTRAP, InitialCont::Yes, Some(catch_segfaults)) | ||
| .unwrap(); | ||
|
|
||
| // Unset the breakpoint stuff and move the ip back an instruction to compensate. |
There was a problem hiding this comment.
As in, undo the ret_addr editing we did above? Right?
| ptrace::write( | ||
| pid, | ||
| libc::malloc as *mut _, | ||
| (libc::malloc as *mut libc::c_long).read_volatile(), | ||
| ) | ||
| .unwrap(); |
There was a problem hiding this comment.
Lol you're just copying the bytes from the parent (where we didn't change them) back to the child? 😭 😂
|
|
||
| /// Set up SIGTRAPs on the first few bytes of malloc/free/etc. | ||
| #[expect(clippy::as_conversions)] | ||
| fn trap_libc(pid: unistd::Pid) { |
There was a problem hiding this comment.
I think it would be good to consistently use debugger terminology here: we are setting and unsetting breakpoints.
| // Override the return address to give us another sigtrap | ||
| // (but save the original bytes). |
There was a problem hiding this comment.
This is another case of setting (and later unsetting) a breakpoint, right?
This is part 1 of #4791; it adds the bits needed to intercept allocating/deallocating calls to libc, but silently ignores them for now. In a follow-up PR this will forward them to shims we control.