Orthogonal Persistence, WebAssembly, and Rust


I was reading about Urbit the other day. I recommend giving the wiki article a quick read if you're not familiar with it, but the core idea (and this is very watered-down) is that Urbit is how one might imagine computing in a world where computing power and bandwidth are infinite, so inefficiencies are irrelevant and one needn't worry about them.

Urbit takes many strange paths, but when viewed through this lens, I find it slightly more understandable.

Urbit is many things, but a core part is the concept of an Operating Function, a play on the term Operating System. The Operating Function fulfills the role of an operating system (orchestrating applications, talking to the outside world, and so on), but in a pure, functional manner:

(event, old state) ⟶ (effect, new state)

All the incoming events are automatically saved by the Urbit runtime. Occasionally, the list of events is squashed down and the most recent state is saved instead.
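To make that concrete, here's a minimal sketch of an operating function plus an event log that can be squashed. This is my own toy model, not Urbit's actual implementation: `Event`, `Effect`, `State`, `step`, and `Journal` are all hypothetical names, and the "disk" is just a struct in memory.

```rust
/// Hypothetical events and effects for a tiny counter "application".
#[derive(Clone, Debug, PartialEq)]
enum Event {
    Add(i64),
    Reset,
}

#[derive(Debug, PartialEq)]
enum Effect {
    Print(String),
}

/// The application state. Nothing in here knows how to save itself.
#[derive(Clone, Debug, PartialEq, Default)]
struct State {
    total: i64,
}

/// The pure operating function: (event, old state) -> (effect, new state).
fn step(event: &Event, old: &State) -> (Effect, State) {
    let new = match event {
        Event::Add(n) => State { total: old.total + n },
        Event::Reset => State::default(),
    };
    (Effect::Print(format!("total = {}", new.total)), new)
}

/// What the runtime keeps: a snapshot plus every event received since.
#[derive(Default)]
struct Journal {
    snapshot: State,
    events: Vec<Event>,
}

impl Journal {
    /// Record an incoming event. A real runtime would append it to disk.
    fn record(&mut self, event: Event) {
        self.events.push(event);
    }

    /// Rebuild the current state by replaying the log over the snapshot.
    fn replay(&self) -> State {
        self.events
            .iter()
            .fold(self.snapshot.clone(), |state, event| step(event, &state).1)
    }

    /// Squash the log: keep only the most recent state, drop the events.
    fn squash(&mut self) {
        self.snapshot = self.replay();
        self.events.clear();
    }
}
```

Because `step` is pure, replaying the same events over the same snapshot always lands in the same state, which is what makes squashing safe.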

The applications running inside of Urbit don't have to explicitly save their data. They just modify their state over time, receiving events and producing effects. Yet when the computer is shut down and started again, the Urbit system resumes in exactly the same state.

This is called orthogonal persistence.


Orthogonal persistence refers to persistence of data without explicitly saving it.

There's no need to use the filesystem or a database. Just store data the way you would in a running application: in lists, tables, or anything else.


So, I like the idea of orthogonal persistence. The "business" logic of the application is probably significantly clearer and simpler when saving and loading data isn't part of it.

Is it possible to have this without the obscurantist veneer of Urbit? Yes, it is.

WebAssembly provides a (mostly) deterministic execution environment and I have a hunch that it'll be just flexible enough for the cursed things we'll have to do to it. As a plus, wasm can run quite fast!


WebAssembly

What's in a wasm module anyhow?

Doesn't look like much, does it? (I'm not referring to my drawing skills.) The code section is self-explanatory, and there's nothing there we need to work with for this project. Globals, tables, and memory are the important parts here. However, because this is a proof-of-concept, we're not going to worry about tables, which hold things like function references.

With those in mind, there are two things we need to do to persist the state:

  1. Save the wasm memory
  2. Save the mutable globals (i.e. those that the program can change at runtime)
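Here's a toy model of those two steps, just to pin down what a snapshot contains. `ToyInstance` and `Snapshot` are hypothetical stand-ins that I made up for illustration. They are not wasmtime types, and a real instance would hand you its memory and globals through the runtime's API instead.

```rust
use std::collections::BTreeMap;

/// A toy stand-in for a wasm instance (hypothetical, not a runtime type).
#[derive(Clone, Debug, PartialEq)]
struct ToyInstance {
    memory: Vec<u8>,
    /// name -> (value, is_mutable)
    globals: BTreeMap<String, (i32, bool)>,
}

/// The two pieces of state from the list above.
#[derive(Clone, Debug, PartialEq)]
struct Snapshot {
    memory: Vec<u8>,
    mutable_globals: BTreeMap<String, i32>,
}

impl ToyInstance {
    /// Step 1: copy the linear memory. Step 2: copy every mutable global.
    /// Immutable globals are skipped: re-instantiating the module recreates them.
    fn save(&self) -> Snapshot {
        Snapshot {
            memory: self.memory.clone(),
            mutable_globals: self
                .globals
                .iter()
                .filter(|(_, (_, mutable))| *mutable)
                .map(|(name, (value, _))| (name.clone(), *value))
                .collect(),
        }
    }

    /// Restore into a freshly instantiated copy of the same module.
    /// Fails if the snapshot names a global that is missing or immutable.
    fn restore(&mut self, snapshot: &Snapshot) -> Result<(), String> {
        for (name, value) in &snapshot.mutable_globals {
            match self.globals.get_mut(name) {
                Some((slot, true)) => *slot = *value,
                _ => return Err(format!("no mutable global named `{}`", name)),
            }
        }
        self.memory = snapshot.memory.clone();
        Ok(())
    }
}
```

The key property is the round trip: instantiate fresh, `restore` a snapshot, and the instance should be indistinguishable from the one that was saved.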

However, we run into a hiccup here. Oftentimes, the module keeps many of its globals private and inaccessible from outside. Even native wasm runtimes, like wasmtime, don't let you get at them. The solution to this is to actually modify the wasm module itself to make the right items accessible. The walrus crate works well for this.

Let's get started with what we want the wasm module to be. I'll write it in Rust, though other languages would work fine.

static mut N: usize = 0;

#[no_mangle]
pub extern "C" fn set(x: usize) {
    unsafe { N = x }
}

#[no_mangle]
pub extern "C" fn get() -> usize {
    unsafe { N }
}

Rust really doesn't like mutable global variables, like N, so we have to use unsafe to access it.

Let's compile that to WebAssembly:

❯ rustc --target=wasm32-unknown-unknown --crate-type=cdylib -O test.rs

Now, for actually loading and unloading the wasm module, I threw together a Rust crate that makes it pretty easy. It uses wasmtime internally to load and execute the WebAssembly.

Here's how you instantiate a module and then save the state to the filesystem!

let store = Store::default();
let (_, instance) = PersistentInstance::new_from_file(
    &store,
    "test.wasm"
)?;

let set = instance
    .get_func("set")
    .ok_or(anyhow::format_err!("failed to find `set` function export"))?
    .get1::<u32, ()>()?;

println!("Calling exported wasm function: `set(42)`");
set(42)?;
instance.save("globals.json", "memory.bin")?;

Here's the contents of the globals.json file we just generated:

{"__heap_base":{"I32":1049664},"$probed_global:0":{"I32":1048576},"__data_end":{"I32":1049664}}

memory.bin ends up being a 1.1 MB file of mostly zeros. It could easily be compressed to save space.
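As a rough illustration of how cheap that compression could be, here's a sketch that only collapses runs of zero bytes. This is not what my crate does (it writes the memory out raw). The format and the names `compress_zero_runs` and `decompress_zero_runs` are made up for this example, and a general-purpose compressor would likely do better.

```rust
/// Encode `data` as repeated records of:
///   zero_run_len (u64 LE), literal_len (u64 LE), literal bytes.
fn compress_zero_runs(data: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut i = 0;
    while i < data.len() {
        // Count the run of zero bytes starting at `i` (may be empty).
        let zeros_start = i;
        while i < data.len() && data[i] == 0 {
            i += 1;
        }
        let zeros = (i - zeros_start) as u64;
        // Then take the non-zero bytes that follow it (may be empty).
        let literal_start = i;
        while i < data.len() && data[i] != 0 {
            i += 1;
        }
        let literal = &data[literal_start..i];
        out.extend_from_slice(&zeros.to_le_bytes());
        out.extend_from_slice(&(literal.len() as u64).to_le_bytes());
        out.extend_from_slice(literal);
    }
    out
}

/// Read a little-endian u64 at offset `at`, or None if out of bounds.
fn read_u64(bytes: &[u8], at: usize) -> Option<u64> {
    let mut buf = [0u8; 8];
    buf.copy_from_slice(bytes.get(at..at + 8)?);
    Some(u64::from_le_bytes(buf))
}

/// Inverse of `compress_zero_runs`. Returns None on truncated input.
fn decompress_zero_runs(encoded: &[u8]) -> Option<Vec<u8>> {
    let mut out = Vec::new();
    let mut i = 0;
    while i < encoded.len() {
        let zeros = read_u64(encoded, i)? as usize;
        let literal_len = read_u64(encoded, i + 8)? as usize;
        i += 16;
        out.resize(out.len() + zeros, 0);
        out.extend_from_slice(encoded.get(i..i + literal_len)?);
        i += literal_len;
    }
    Some(out)
}
```

Each record costs 16 bytes of header, so this only pays off when the zero runs are long, which is the case for a freshly instantiated memory.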

And here's how to reload it!

let store = Store::default();
let (_, instance) = PersistentInstance::load_from_file(
    &store,
    "test.wasm",
    "globals.json",
    "memory.bin"
)?;

let get = instance
    .get_func("get")
    .ok_or(anyhow::format_err!("failed to find `get` function export"))?
    .get0::<u32>()?;

let x = get()?;
println!("calling exported wasm function: `get()` => {}", x);
assert_eq!(x, 42);

Okay, so how is this Orthogonal Persistence?

Well, what I have here isn't orthogonal persistence exactly, but it's a necessary building block for a system that lets WebAssembly modules orthogonally persist their data.

I think there are some interesting possibilities.

For applications with an event loop, which is almost every application if you squint your eyes a little, that event loop could be moved outside the module into the runtime itself. When the program is not on the call stack, the runtime can simply persist it to disk (possibly through a memory-mapped file; more research is necessary).
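Here's roughly what I mean by moving the event loop into the runtime. It's a sketch under heavy assumptions: the `Guest` trait is a hypothetical stand-in for a wasm instance, `Counter` is a made-up guest, and a plain `Vec<u8>` plays the role of the disk.

```rust
/// Hypothetical interface a persistent guest exposes to the runtime.
trait Guest: Sized {
    /// Handle one event, then return control to the runtime.
    fn handle(&mut self, event: u32);
    /// Serialize all guest state (for wasm: memory plus mutable globals).
    fn snapshot(&self) -> Vec<u8>;
    /// Rebuild a guest from bytes produced by `snapshot`.
    fn resume(bytes: &[u8]) -> Option<Self>;
}

/// A made-up guest that just sums the events it receives.
struct Counter {
    total: u32,
}

impl Guest for Counter {
    fn handle(&mut self, event: u32) {
        self.total = self.total.wrapping_add(event);
    }

    fn snapshot(&self) -> Vec<u8> {
        self.total.to_le_bytes().to_vec()
    }

    fn resume(bytes: &[u8]) -> Option<Self> {
        let mut buf = [0u8; 4];
        buf.copy_from_slice(bytes.get(..4)?);
        Some(Counter { total: u32::from_le_bytes(buf) })
    }
}

/// The event loop lives in the runtime, not in the guest.
/// `disk` stands in for wherever the runtime persists snapshots.
fn run<G: Guest>(guest: &mut G, events: &[u32], disk: &mut Vec<u8>) {
    for &event in events {
        guest.handle(event);
        // The guest has returned, so it is off the call stack and its
        // state is quiescent. This is the safe point to persist it.
        *disk = guest.snapshot();
    }
}
```

The guest never calls a save function. It only handles events, and the runtime decides when to write it out.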

When WebAssembly interface types eventually land, I think a whole system, an operating system even, could consist of these orthogonally persistent programs all talking to each other through strongly-typed interfaces. It's a beautiful, elegant vision in my eyes.