Re: More Rust code

Henri Sivonen Mon, 31 Jul 2017 03:03:31 -0700

On Tue, Jul 18, 2017 at 7:01 AM, Jim Blandy <jbla...@mozilla.com> wrote:
> BTW, speaking of training: Jason's and my book, "Programming Rust" will be
> available on paper from O'Reilly on August 29th!

And already available on Safari Books Online (access available via
ServiceNow request subject to managerial approval).

On Mon, Jul 17, 2017 at 10:43 PM, Ted Mielczarek <t...@mielczarek.org> wrote:
> From my perspective Rust is our
> single-biggest competitive advantage for shipping Firefox, and every
> time we choose C++ over Rust we throw that away.

I agree.

> but we are quickly
> going to hit a point where "I don't feel like learning Rust" is not
> going to cut it anymore.

Indeed.

On Tue, Jul 11, 2017 at 4:37 PM, smaug <sm...@welho.com> wrote:
> How is the performance when crossing Rust <-> C++ boundary? We need to make
> everything faster, not slower.
> If I understood emilio's explanation on IRC correctly having the performance
> of an inlined (C++) function requires
> handwritten rust bindings to access member variables of some C++ object.
> That doesn't sound too good - hard to maintain and possibly easy to forget
> to optimize.
>
> I don't claim to understand anything about the current setup, but
> has anyone written down what would be needed to have fast and easy to
> maintain Rust <-> C++ boundary in
> such way that also memory handling is easy to manage (aka, how to deal with
> CC/GC).
> I think it would be better to sort out this kind of low level issues rather
> soon before we have too much
> Rust code in tree, or perhaps we won't see much Rust usage before those
> issues are sorted out.
>
> (I'm looking this all from DOM point of view, where pretty much all the
> objects need to be cycle collectable JS holders, but perhaps Rust would fit
> better in code outside DOM)

Rust indeed is, at least at present, a better fit for code outside the
area of cycle-collectable DOM objects.

The performance issue you mention applies if the usage scenario is
that Rust code needs to get or set a lot of fields on a C++ object.
While we do have code that, if implemented in Rust, would have to do
performance-sensitive field access on C++ objects, we also have areas
for which that would not be a concern. For example, in the case of
encoding_rs, the data that crosses the FFI boundary is structurally
simple (mozilla::Span / Rust slices decomposing to pointer to an array
of primitives and a length for FFI crossing) and the amount of work
done on the Rust side is substantial compared to the frequency of
crossing the FFI boundary.
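
As a minimal sketch of what "decomposing to pointer and length" looks
like on the Rust side: the function names below are hypothetical and
are not the actual encoding_rs / encoding_c API. A real export would
also carry a no_mangle attribute so that the C++ linker can find the
symbol; it is left out here to keep the sketch independent of the Rust
edition.

```rust
use std::slice;

/// Hypothetical Rust callee: does the real work on a safe slice.
fn count_non_ascii(bytes: &[u8]) -> usize {
    bytes.iter().filter(|&&b| b >= 0x80).count()
}

/// Hypothetical C-ABI entry point. The C++ side would pass the
/// pointer and length obtained from a mozilla::Span.
///
/// # Safety
/// If `len` is non-zero, `ptr` must point to `len` readable bytes
/// that stay valid and unmodified for the duration of the call.
pub unsafe extern "C" fn example_count_non_ascii(ptr: *const u8, len: usize) -> usize {
    // slice::from_raw_parts requires a non-null pointer even for
    // empty slices, so the empty case is handled separately.
    let bytes: &[u8] = if len == 0 {
        &[]
    } else {
        unsafe { slice::from_raw_parts(ptr, len) }
    };
    count_non_ascii(bytes)
}
```

The boundary is crossed once per buffer rather than once per byte, so
the call overhead is amortized over all the work done on the slice.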

In the absence of the Stylo-like optimization of frequent
performance-sensitive access of fields of a foreign-language object,
the FFI story that one can expect involves three functions/methods per
logical method.

Either (for C++ caller and Rust callee)
 1) C++ method wrapping the C function to hide the unsafety and bad
ergonomics of raw C.
 2) C function declared in C++ and implemented in Rust.
 3) Rust method: the actual callee that does something useful.
Or (for Rust caller and C++ callee)
 1) Rust method wrapping the C function to hide the unsafety and bad
ergonomics of raw C.
 2) C function declared in Rust and implemented in C++.
 3) C++ method: the actual callee that does something useful.

So there's the real callee method, there's a small C function that
wraps that method in a C ABI-compatible way and then there is a
wrapper for the C function that provides the ergonomics that one would
expect in the calling language.
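
For the C++ caller / Rust callee direction, layers 2 and 3 could look
roughly like the following sketch. The type and function names are
made up for illustration. Layer 1 (the C++ method wrapping the C
function) is not shown, since it would live in a C++ header. As in any
real export, the C function would additionally need a no_mangle
attribute, which is omitted here.

```rust
/// Hypothetical Rust type whose method is the real callee.
pub struct Checksum {
    sum: u32,
}

impl Checksum {
    pub fn new() -> Self {
        Checksum { sum: 0 }
    }

    /// Layer 3: the Rust method that does something useful.
    /// `#[inline]` is only a hint; the compiler may or may not
    /// inline this into the C wrapper below.
    #[inline]
    pub fn update(&mut self, bytes: &[u8]) -> u32 {
        for &b in bytes {
            self.sum = self.sum.wrapping_add(u32::from(b));
        }
        self.sum
    }
}

/// Layer 2: the C function that C++ would declare and call.
/// It only converts raw pointers back into Rust types and forwards.
///
/// # Safety
/// `checksum` must be a valid, exclusive pointer to a `Checksum`.
/// If `len` is non-zero, `ptr` must point to `len` readable bytes.
pub unsafe extern "C" fn example_checksum_update(
    checksum: *mut Checksum,
    ptr: *const u8,
    len: usize,
) -> u32 {
    let checksum = unsafe { &mut *checksum };
    let bytes: &[u8] = if len == 0 {
        &[]
    } else {
        unsafe { std::slice::from_raw_parts(ptr, len) }
    };
    checksum.update(bytes)
}
```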

The caller-side wrapper around the C function is trivial to make
inline and as a matter of code size is likely harmless or even
strictly beneficial to make inline.

The compilers don't have visibility across the declaration/definition
boundary of the C function, since that's where the cross-language
linkage happens, so currently one needs to assume that the C function
always has the cost of an actual function call.

As for inlining the actual callee method of interest (written in the
language being called) into the implementation of the C function, it
may or may not happen automatically. When it doesn't happen
automatically, forcing it manually might be a problem in terms of code
size.

So when the callee that actually does the work we care about doesn't
get inlined into its C wrapper, one should approximate a call from
Rust to C++ or from C++ to Rust as having the cost of two
*non-virtual* function calls instead of one. (It would be interesting
to contrast this with the cost of over-use of virtual calls due to
nsIFoo interfaces.)

- -

Ideally, both the caller-language-side wrapper around the C function
and the C function itself would get inlined so that the cross-language
call would on the machine code level look just like a normal (if not
also inlined!) call to a method of the callee language. For that to
happen, we'd need link-time inlining across object files produced by
different-language compilers.

Naïvely, one would think that it should be possible to do that with
clang producing "object files" holding LLVM IR and rustc producing
"object files" holding LLVM IR and the "link" step involving mashing
those together, running LLVM optimizations again and then producing
machine code from a massive collection of mashed-together LLVM IR.

In London, so over a year ago, I asked people who (unlike me) actually
understand the issues involved how far off this kind of cross-language
inlining would be, and I was told that it was very far off. Most
obviously, it would require us to compile using clang instead of MSVC
on Windows.

Now that it's been over a year and two significant things have
happened, 1) we actually have (traditionally-linked for the FFI
boundary) Rust code in Firefox and 2) clang is ready enough on Windows
that Chrome has switched to it on Windows, I guess it's worthwhile to
ask again:

If we were compiling C++ using clang on all platforms, how far off
would such cross-language inlining be?

If we could have the cross-language inlining benefit from compiling
C++ using clang on all platforms, how far off would we be from being
able to switch to clang on all platforms?

- -

But to go back to Rust and DOM objects:

Even in the context of DOM objects, there are two very different
scenarios of relevance:
 1) Rust code participating in DOM mutations
 2) Rust code reading from the DOM when the DOM is guaranteed not to change.

Scenario #2 applies to Stylo, but Stylo isn't the only case where it
could be useful to have Rust code reading from the DOM when the DOM is
guaranteed not to change.

I've been talking about wishing to rewrite our DOM serializers (likely
excluding the one we use for innerHTML when the document is in the
HTML mode) in Rust. I have been assuming that such work could reuse
the code that Stylo has for viewing the DOM from Rust in a read-only
stop-the-world fashion.

I haven't actually examined how reusable that Stylo code is for
non-Stylo purposes. Is it usable for non-Stylo purposes?

- -

And on the topic of memory management:

DOM nodes themselves obviously have to be able to deal with multiple
references to them, but otherwise we have a lot of useless use of
refcounting attributable to the 1998/1999 mindset of making everything
an nsIFoo. In cases where mozilla::UniquePtr would suffice and
nsCOMPtr isn't truly needed considering the practical ownership
pattern, making the Rust objects holdable by mozilla::UniquePtr is
rather easy: mozilla::Decoder and mozilla::Encoder are real-world
examples.

The main thing is implementing operator delete for the C++ stand-in
class that has no fields, no virtual methods, an empty destructor and
deleted constructors and operator=:
https://searchfox.org/mozilla-central/source/intl/Encoding.h#903

For the rest of the boilerplate, see:
https://searchfox.org/mozilla-central/source/intl/Encoding.h#1069
https://searchfox.org/mozilla-central/source/intl/Encoding.h#661
https://searchfox.org/mozilla-central/source/third_party/rust/encoding_c/src/lib.rs#467
https://searchfox.org/mozilla-central/source/third_party/rust/encoding_c/include/encoding_rs.h#350
https://searchfox.org/mozilla-central/source/third_party/rust/encoding_c/src/lib.rs#677

This, of course, involves more boilerplate than scenarios that stay
completely within C++ or that stay completely within Rust, but in the
case of encoding_rs, the work needed to create the boilerplate was
trivial compared to the overall effort of implementing the bulk of the
library itself.
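
The Rust half of that ownership handoff can be sketched roughly as
follows. The names here are hypothetical and this is not the actual
encoding_c code; the assumption is that the operator delete of the C++
stand-in class would forward to the free function, so that
mozilla::UniquePtr ends up dropping the Rust object. The live-object
counter exists only to make the drop observable in this sketch, and
the no_mangle attributes that real exports would need are omitted.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts live objects; for illustration only.
static LIVE_DECODERS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical Rust object that C++ would own via its stand-in class.
pub struct ExampleDecoder {
    pub bytes_seen: usize,
}

impl Drop for ExampleDecoder {
    fn drop(&mut self) {
        LIVE_DECODERS.fetch_sub(1, Ordering::SeqCst);
    }
}

/// Allocates the object on the Rust heap and hands ownership to the
/// caller as a raw pointer.
pub extern "C" fn example_decoder_new() -> *mut ExampleDecoder {
    LIVE_DECODERS.fetch_add(1, Ordering::SeqCst);
    Box::into_raw(Box::new(ExampleDecoder { bytes_seen: 0 }))
}

/// Takes ownership back and drops the object. Accepts null, like
/// C++ `delete` does.
///
/// # Safety
/// `decoder` must be null or a pointer previously returned by
/// `example_decoder_new` that has not been freed yet.
pub unsafe extern "C" fn example_decoder_free(decoder: *mut ExampleDecoder) {
    if decoder.is_null() {
        return;
    }
    drop(unsafe { Box::from_raw(decoder) });
}
```

Freeing through Rust rather than through the C++ allocator matters
because the memory was allocated by Rust's Box and the object may have
a Drop implementation that needs to run.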

-- 
Henri Sivonen
hsivo...@hsivonen.fi
https://hsivonen.fi/
