Neat Rust Tricks: Passing Rust Closures to C

One of Rust’s biggest selling points is how well it can interoperate with C. It’s able to call into C libraries and produce APIs that C can call into with very little fuss. However, when dealing with sufficiently complex APIs, mismatches between language concepts can become a problem. In this post we’re going to look at how to handle callback functions when working with C from Rust.

Our hypothetical library has a Widget struct, which periodically generates events. We want to take a callback function from users that is called whenever one of these events occurs. More concretely, we want to provide this signature:

impl Widget {
    fn register_callback(&mut self, callback: impl FnMut(Event)) {
        // ...
    }
}

Unlike Rust, C has no concept of closures. Instead it has function pointers. To put this in Rust terms, you can take fn(Event), but not impl FnMut(Event). Function pointers can only work with the arguments passed to them (or global state like static variables), while closures can capture (or “close over”) arbitrary state from the environment in which they were created. Because of this, C APIs often let you pass a “data pointer” when registering a callback, and then pass that pointer back to your function whenever it’s called. For example, the C version of that signature might look like this:

void widget_register_callback(
    widget_t *widget,
    void *data,
    void (*callback)(void*, event_t)
);

If the callback is long lived, or the ownership semantics are complex, the C API may also have you provide a destructor function:

void widget_register_callback(
    widget_t *widget,
    void *data,
    void (*callback)(void*, event_t),
    void (*destroy)(void*)
);

If you’re not familiar with C’s syntax, here’s the equivalent signature in Rust:

fn widget_register_callback(
    widget: *mut Widget,
    data: *mut (),
    callback: fn(*mut (), Event),
    destroy: fn(*mut ()),
);
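To make the data-pointer pattern concrete, here is a minimal sketch of what it looks like without closures, written in Rust for readability. Everything in it is hypothetical: `Counter`, `count_event`, and `fire_events` are made-up names, `fire_events` merely stands in for a C library, events are plain `i32`s, and `std::ffi::c_void` is used in place of `libc::c_void` so the snippet needs no external crate.

```rust
use std::ffi::c_void;

// Hypothetical state that a closure would otherwise capture for us.
struct Counter {
    calls: u32,
}

// A plain function: everything it needs arrives through `data`.
// Safety: `data` must point to a live `Counter` that nobody else is using.
unsafe extern "C" fn count_event(data: *mut c_void, _event: i32) {
    let counter = unsafe { &mut *(data as *mut Counter) };
    counter.calls += 1;
}

// Stand-in for the C library: it only knows about an opaque pointer and a
// function pointer, and hands the former to the latter on every event.
fn fire_events(
    data: *mut c_void,
    callback: unsafe extern "C" fn(*mut c_void, i32),
    n: i32,
) {
    for event in 0..n {
        // Safety: the caller guarantees `data` matches what `callback` expects.
        unsafe { callback(data, event) };
    }
}

fn count_three_events() -> u32 {
    let mut counter = Counter { calls: 0 };
    fire_events(&mut counter as *mut Counter as *mut c_void, count_event, 3);
    counter.calls
}
```

Calling `count_three_events()` returns 3. The state struct and the cast inside the callback are exactly the bookkeeping a closure would otherwise do for us.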

So instead of taking a closure which automatically captures whatever state it needs to, in C we need to manually shove any state we want to keep into a struct, and pass that along with our callback function. Bridging these two APIs in Rust is surprisingly easy. The language does most of the work for us. The way we handle this is to pass the actual closure as our data pointer. Let’s look more concretely at what this means:

fn register_c_callback<F>(widget: &mut ffi::widget_t, callback: F)
where
    F: FnMut(ffi::event_t) + 'static,
{
    // Safety: We've carefully reviewed the docs for the C function
    // we're calling, and the invariants we need to uphold are:
    // - widget is a valid pointer
    //    - We're using Rust references so we know this is true.
    // - data is valid until its destructor is called
    //     - We've added a `'static` bound to ensure that is true.
    let data = Box::into_raw(Box::new(callback));
    unsafe {
        ffi::widget_register_callback(
            widget,
            data as *mut _,
            call_closure::<F>,
            drop_box::<F>,
        );
    }
}

// Safety: The pointer passed to this function must be
// a valid non-null pointer to a value of type `F`. We've
// carefully reviewed the documentation for our C lib and
// know that is the case.
unsafe extern "C" fn call_closure<F>(
    data: *mut libc::c_void,
    event: ffi::event_t,
)
where
    F: FnMut(ffi::event_t),
{
    let callback_ptr = data as *mut F;
    let callback = &mut *callback_ptr;
    callback(event);
}

unsafe extern "C" fn drop_box<T>(data: *mut libc::c_void) {
    Box::from_raw(data as *mut T);
}

There’s a lot going on here, so let’s look at each piece one at a time.

fn register_c_callback<F>(widget: &mut ffi::widget_t, callback: F)
where
    F: FnMut(ffi::event_t) + 'static,

The signature of this function is pretty standard for Rust. We don’t care if the function has mutable state, so we take FnMut instead of Fn. The 'static bound is needed since the callback given to us is going to be called after register_c_callback returns. This assumes that we need the function to be valid for some unknown period of time, which is the most common case in my experience. It’s possible to have less strict bounds for this, but it’s hard to do safely[1]. For this example, we’re working with a single-threaded application. If we were passing this to an API that might call it from another thread, we would need to add a Send bound as well.
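As a rough illustration of the Send case, the sketch below adds that bound to a registration function. It is not part of the widget API: `register_threaded_callback` is a hypothetical name, a spawned thread stands in for a C library that calls back from another thread, and events are plain `u32`s.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

// Hypothetical registration function for an API that invokes the callback
// on a different thread; a spawned thread plays the role of the C library.
fn register_threaded_callback<F>(mut callback: F) -> thread::JoinHandle<()>
where
    F: FnMut(u32) + Send + 'static,
{
    thread::spawn(move || {
        for event in 0..3 {
            callback(event);
        }
    })
}

fn count_on_other_thread() -> usize {
    let hits = Arc::new(AtomicUsize::new(0));
    let hits_in_callback = Arc::clone(&hits);
    // Capturing an `Rc` here instead of an `Arc` would fail to compile,
    // because `Rc` is not `Send`.
    let handle = register_threaded_callback(move |_event| {
        hits_in_callback.fetch_add(1, Ordering::SeqCst);
    });
    handle.join().expect("callback thread panicked");
    hits.load(Ordering::SeqCst)
}
```

The extra bound costs callers nothing when their captures are already thread-safe, and it turns a potential data race into a compile error when they are not.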

[1]: To have any shorter requirement on our closure than 'static, we need to know exactly how long it needs to be valid for. This will most likely look something like “the closure given must live until some other function is called”. Representing this in Rust usually means calling whatever that second function is in the same place we register the callback. For example:

  fn process_with_callback<'a, F>(
      widget: &'a mut ffi::widget_t,
      callback: F,
  )
  where
      F: FnMut(ffi::event_t) + 'a,
  {
      register_c_callback(widget, callback);
      process_events(widget);
  }

let data = Box::into_raw(Box::new(callback));

Since we need the closure to live for some unknown period of time, we need to move it onto the heap. We then immediately call Box::into_raw on it, which will give us a raw pointer to the closure, and prevent the memory from being de-allocated.
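The round trip through a raw pointer can be seen in isolation in the small sketch below. The function name `round_trip` and the closure are made up for illustration; only `Box::new`, `Box::into_raw`, and `Box::from_raw` come from the standard library.

```rust
// Move a closure to the heap, leak it as a raw pointer, then reclaim it.
fn round_trip() -> i32 {
    let offset = 40;
    let closure = move |x: i32| x + offset;

    // Box::into_raw consumes the Box without running its destructor, so the
    // allocation stays alive until we explicitly take ownership back.
    let raw = Box::into_raw(Box::new(closure));

    // Safety: `raw` came from Box::into_raw above and has not been freed.
    let borrowed = unsafe { &*raw };
    let result = borrowed(2);

    // Rebuild the Box so the allocation is freed when it is dropped.
    // Safety: `raw` is reclaimed exactly once and never used afterwards.
    drop(unsafe { Box::from_raw(raw) });
    result
}
```

Here `round_trip()` returns 42. Between `into_raw` and `from_raw` nothing in Rust owns the allocation, which is why the C API's destructor callback matters.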

ffi::widget_register_callback(
    widget,
    data as *mut _,
    call_closure::<F>,
    drop_box::<F>,
);

Here we’re actually calling the underlying C function. Since the type of data is *mut F, and our C API expects void *, we need to explicitly cast it. Finally, we’re passing our two function pointers. Both of these functions are generic, so we’re giving each one the concrete type of the closure it will operate on. This is one of the few times you’ll ever see an explicit turbofish without actually calling the function.
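One detail worth spelling out: closure types cannot be written by hand, so the turbofish only works inside a generic function where F is already in scope. The sketch below shows that shape under some assumptions: `trampoline_for` and `sum_events` are hypothetical helpers, events are plain `i32`s, and `std::ffi::c_void` replaces `libc::c_void`.

```rust
use std::ffi::c_void;

type CallbackFn = unsafe extern "C" fn(*mut c_void, i32);

unsafe extern "C" fn call_closure<F: FnMut(i32)>(data: *mut c_void, event: i32) {
    let callback = unsafe { &mut *(data as *mut F) };
    callback(event);
}

// `F` is inferred from the argument and then named in the turbofish, which
// selects one monomorphized instance and coerces it to a function pointer.
fn trampoline_for<F: FnMut(i32)>(_closure: &F) -> CallbackFn {
    call_closure::<F>
}

fn sum_events() -> i32 {
    let mut total = 0;
    let mut closure = |event: i32| total += event;
    let trampoline = trampoline_for(&closure);
    let data = &mut closure as *mut _ as *mut c_void;
    for event in 1..=3 {
        // Safety: `data` points to `closure`, which is still alive and has
        // the same type `trampoline` was instantiated with.
        unsafe { trampoline(data, event) };
    }
    total
}
```

`sum_events()` returns 6. The function pointer carries no trace of F at runtime, so pairing it with the wrong data pointer would be undefined behavior; keeping both behind one generic function is what makes the pairing hard to get wrong.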

unsafe extern "C" fn call_closure<F>(
    data: *mut libc::c_void,
    event: ffi::event_t,
)
where
    F: FnMut(ffi::event_t),

Here we declare the first of our two functions we’re passing to C. Since it’s meant to be called from C code, the function is defined as extern "C" to tell the Rust compiler to use C’s ABI here. Usually extern "C" functions also need #[no_mangle] to disable Rust’s automatic name mangling. However, this function is never called by name[2]. We give it to C by passing function pointers directly, so #[no_mangle] isn’t needed here. Even though this is a function meant to be called from C, we still need to mark it as unsafe; otherwise safe Rust could trigger undefined behavior by passing ptr::null_mut().

[2]: C code couldn’t call these functions by name even if we wanted it to. Since the functions are generic, they don’t correspond to a single symbol in the final binary, but to one symbol per concrete type they are instantiated with. That doesn’t mean you can’t use generic functions in C FFI, but it does mean the concrete type always has to be supplied from Rust code.

let callback_ptr = data as *mut F;
let callback = &mut *callback_ptr;
callback(event);

Since our C API wanted a function that takes void * as its first argument, that’s how we declared the function[3]. This means that the first thing we need to do is cast it to a pointer of the right type. We then turn that raw pointer into a Rust reference. Finally, with a &mut F, we can actually call the Rust closure with our event.

[3]: In theory we could have declared this function as taking *mut F directly, or even possibly &mut F. However, this would mean that call_closure::<F> is the wrong type, and needs to get cast to the right one (which we might be doing anyway since some C APIs take void* instead of function pointers with an explicit signature). Casting our function pointer will compile, but accidentally casting *mut c_void to a possibly unsized type will fail.

unsafe extern "C" fn drop_box<T>(data: *mut libc::c_void) {
    Box::from_raw(data as *mut T);
}

And lastly, we have our destructor function. Box::from_raw will recreate a new Box<T> from our raw pointer. Once we have a regular Rust Box, it’ll automatically be dropped when this function returns, freeing the underlying memory, and calling the destructors of any values captured by our closure.
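To check that captured values really are destroyed, here is a small sketch that captures a value with a Drop impl and then invokes the destructor function by hand. `Resource`, `leak`, and `captured_value_is_dropped` are hypothetical names, and `std::ffi::c_void` stands in for `libc::c_void`.

```rust
use std::cell::Cell;
use std::ffi::c_void;
use std::rc::Rc;

// Hypothetical captured resource that records when it is destroyed.
struct Resource {
    dropped: Rc<Cell<bool>>,
}

impl Drop for Resource {
    fn drop(&mut self) {
        self.dropped.set(true);
    }
}

unsafe extern "C" fn drop_box<T>(data: *mut c_void) {
    // Rebuilding the Box and dropping it frees the allocation and runs the
    // destructors of everything the closure captured.
    drop(unsafe { Box::from_raw(data as *mut T) });
}

// Boxes `closure`, leaks it, and returns the pair a C API would store.
fn leak<F: FnMut()>(closure: F) -> (*mut c_void, unsafe extern "C" fn(*mut c_void)) {
    (Box::into_raw(Box::new(closure)) as *mut c_void, drop_box::<F>)
}

// Returns the "was it dropped?" flag before and after calling the destructor.
fn captured_value_is_dropped() -> (bool, bool) {
    let dropped = Rc::new(Cell::new(false));
    let resource = Resource { dropped: Rc::clone(&dropped) };
    let (data, destroy) = leak(move || {
        let _keep_alive = &resource;
    });
    let before = dropped.get();
    // Safety: `data` came from `leak` and is destroyed exactly once.
    unsafe { destroy(data) };
    (before, dropped.get())
}
```

`captured_value_is_dropped()` returns `(false, true)`: the resource survives while only the raw pointer exists, and is dropped as soon as the destructor function runs.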

It’s surprisingly little code for such a complex operation[4]. And true to Rust, this is done as a zero-cost abstraction. Everything here uses monomorphized generic functions, so there’s no indirection or allocation beyond what’s absolutely required.

[4]: One thing I’ve left out here is panic handling. Unwinding through FFI is currently undefined behavior in Rust. You may wish to wrap all of this in catch_unwind and either report the error through whatever mechanisms the C API you’re interacting with gives or abort the process. However, there is some active discussion around precisely what the rules should be here! If you’re interested, check out the ffi-unwind project group.
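A minimal sketch of the catch_unwind approach might look like the following. It assumes the C API offers no way to report an error from a callback, so it simply aborts; `call_closure_guarded` and `dispatch` are hypothetical names, and events are plain `i32`s.

```rust
use std::ffi::c_void;
use std::panic::{catch_unwind, AssertUnwindSafe};

// Variant of `call_closure` that stops a panic at the FFI boundary.
unsafe extern "C" fn call_closure_guarded<F>(data: *mut c_void, event: i32)
where
    F: FnMut(i32),
{
    let callback = unsafe { &mut *(data as *mut F) };
    // AssertUnwindSafe is an assumption on our part: we never look at the
    // closure's state after a panic, because we abort straight away.
    let result = catch_unwind(AssertUnwindSafe(|| callback(event)));
    if result.is_err() {
        // This sketch has no error channel back to C, so give up entirely.
        std::process::abort();
    }
}

// Safe helper that drives the trampoline the way the C side would.
fn dispatch<F: FnMut(i32)>(callback: &mut F, event: i32) {
    let data = callback as *mut F as *mut c_void;
    // Safety: `data` points to a live `F` for the duration of the call.
    unsafe { call_closure_guarded::<F>(data, event) };
}
```

If the C API does have an error mechanism, the `is_err()` branch is where the panic would be translated into it instead of aborting.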

And this composes really well. At this point any additional code we write can operate purely in safe Rust. For example, if we want to wrap that ffi::event_t in a Rust abstraction, we can just add more closures!

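impl Widget {
    fn register_callback(&mut self, mut callback: impl FnMut(Event) + 'static) {
        register_c_callback(
            &mut self.ffi_widget,
            // The wrapper closure owns `callback`, so it needs the same
            // 'static bound that register_c_callback demands.
            move |ffi_event| {
                let event = Event::from_raw(ffi_event);
                callback(event);
            }
        );
    }
}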

Your closures can even have mutable state with no fuss!

let mut x = 0;
widget.register_callback(move |_| {
    x += 1;
    println!("I was called {} times", x);
});

This will print an incrementing number every time it’s called as you’d expect. With this little bit of code, we’ve got a full bridge between Rust closures and our C API. You can do anything you would be able to do with the same API written natively in Rust, and consumers of this code never need to know the difference.
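For readers who want to run the whole flow without a C library, here is a self-contained sketch. It is an approximation, not the Diesel code: `MockWidget` is a hypothetical stand-in for the C widget implemented in Rust, events are plain `i32`s, and the messages are collected into a Vec instead of being printed.

```rust
use std::cell::RefCell;
use std::ffi::c_void;
use std::rc::Rc;

type Callback = unsafe extern "C" fn(*mut c_void, i32);
type Destroy = unsafe extern "C" fn(*mut c_void);

// Hypothetical stand-in for the C widget, written in Rust.
struct MockWidget {
    registered: Option<(*mut c_void, Callback, Destroy)>,
}

impl MockWidget {
    fn new() -> Self {
        MockWidget { registered: None }
    }

    // Plays the role of `widget_register_callback`.
    fn register_raw(&mut self, data: *mut c_void, callback: Callback, destroy: Destroy) {
        self.unregister();
        self.registered = Some((data, callback, destroy));
    }

    fn emit(&mut self, event: i32) {
        if let Some((data, callback, _)) = self.registered {
            // Safety: `data` stays valid until `destroy` is called.
            unsafe { callback(data, event) };
        }
    }

    fn unregister(&mut self) {
        if let Some((data, _, destroy)) = self.registered.take() {
            // Safety: each registration is destroyed exactly once.
            unsafe { destroy(data) };
        }
    }

    // The safe, closure-based API built on top of the raw one.
    fn register_callback<F: FnMut(i32) + 'static>(&mut self, callback: F) {
        let data = Box::into_raw(Box::new(callback));
        self.register_raw(data as *mut c_void, call_closure::<F>, drop_box::<F>);
    }
}

impl Drop for MockWidget {
    fn drop(&mut self) {
        self.unregister();
    }
}

unsafe extern "C" fn call_closure<F: FnMut(i32)>(data: *mut c_void, event: i32) {
    let callback = unsafe { &mut *(data as *mut F) };
    callback(event);
}

unsafe extern "C" fn drop_box<T>(data: *mut c_void) {
    drop(unsafe { Box::from_raw(data as *mut T) });
}

fn run_demo() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));
    let sink = Rc::clone(&log);
    let mut widget = MockWidget::new();
    let mut x = 0;
    widget.register_callback(move |_| {
        x += 1;
        sink.borrow_mut().push(format!("I was called {} times", x));
    });
    widget.emit(10);
    widget.emit(20);
    drop(widget); // runs the destructor callback, freeing the closure
    let messages = log.borrow().clone();
    messages
}
```

`run_demo()` returns the two messages "I was called 1 times" and "I was called 2 times", in that order. All the unsafe code lives inside `MockWidget` and the two trampolines, while the caller only ever sees an ordinary closure.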

All of the code in this article is based on real code from Diesel, which allows you to use Rust closures as the implementation of custom SQL functions on SQLite. You can find the code here and here.

When I first started working on this feature for Diesel, I knew I wanted an API that let you use a closure, and I expected making that work with SQLite’s C library to be much harder than it actually was. But ultimately the code required boils down to a bit of pointer juggling wrapped in an extern "C" fn. The fact that the language can handle this with so little work really shows how powerful Rust’s abstractions are, and how well they can compose into unexpected use cases.

 