I once read a heartbreaking story about a young woman who finished her MFA (Master of Fine Arts) degree in classical composition by listening to her master’s thesis work, a full-length 40-minute symphony, played by a professional orchestra. It was the high point of her life.

In her essay, she wrote about the profound sadness she felt that she’d never hear it played again onstage. She’d probably never write another one, and even if she did, the only way she’d hear it would be through a synthesizer, in the privacy of her own home. Perhaps shared on SoundCloud or YouTube or even Spotify.

But she’d never hear it again the way orchestral music is meant to be heard. She was a scholarship student; she didn’t have rich parents who could afford to put her onto the international composition circuit, sponsor her to expensive music festivals, and get her in front of critical symphony directors.

Software development is undergoing an equally problematic phase change. As a programming language dilettante, I have the luxury of sitting around all day, trying out my ideas in Haskell or Racket, and then implementing them in some language that’s more tractable for others, like Rust or C. And that’s pretty much been true since UCSD p-System Pascal was released in the late 1970s. All you needed was a computer. Just one. The one on your desktop.

But today, if you want to do anything substantial in some of the fundamental up-and-coming fields, such as artificial intelligence, data science, or cloud computing, you need far more money. You need to be able to rent several farms’ worth of compute, and if you’re doing anything at all interesting with AI, the cost of a single experiment can easily run into the hundreds, if not thousands, of dollars. Nobody except the wealthy can afford to do that, or to let their kids specialize in it.

Companies looking for that kind of experience just can’t get it off the street. The only way startups get people with cloud experience is by tempting someone away from an established company that put the time and effort into training someone. It’s not available in schools, and you’re certainly not going to be self-taught in systems reliability engineering or Kubernetes cloud deployments. Those skills come only through spending time with very large systems while someone else foots the bill.

What this means for the future of software development is anyone’s guess, but I’m already starting to see an awareness that the educational pipeline needs some kind of revamping to enable people to get these skills without wealthy sponsors.


The Sterling Bullet


One of the best books on software development ever written is Fred Brooks’s 1975 The Mythical Man-Month (whose later editions also include his famous 1986 essay “No Silver Bullet”). Brooks’s central argument is that software development in a team requires the constant, daily transfer of active knowledge of progress, and the cost overhead of this transfer exceeds the value of additional developers fairly quickly; adding more people to a late software project will actually slow the project down.

The classic joke around this book is that, if your manager proposes hiring more people to get the job done, you should buy him two copies of the book so he can read it twice as fast.

Alex Mills recently wrote an essay that got my attention (and that of a lot of other developers), in which he crystallized an emotion a lot of programming language enthusiasts feel: The World Might Be Missing A Programming Language. Mills’s impression is one I used to share. To some degree I still share it, and I still love hacking on programming languages, but the laundry list of features Mills wants in his perfect programming language is the wrong list.

Let’s talk about four programming languages that are awesome, first.

Racket (i.e. Lisp)

Lisp is the great-granddaddy of all great programming languages. Functional-first, mutation-capable, and possessing in call-with-current-continuation the ur-construct from which all the special cases (exceptions, parallel programming, concurrent programming) can be built, Lisp isn’t really so much a programming language as a language for writing other languages. It’s so malleable and powerful it can do almost anything.

Lisp’s malleability comes at a price: every developer writes it differently. Every Lisp program quickly becomes a hard-to-read pile of abstractions, because every developer’s brain works differently and Lisp has absolutely no guard rails.

Lisp also has exactly one syntactic construct: the parenthetical. Every Lisp function looks exactly like every other function; only the names change. It can be hard, in a deeply nested Lisp function, to discover what is doing what to what.

Haskell

Haskell is the functional programming opposite of Lisp: it has Magical Guardrails. Haskell programs are strongly typed, but they use types not just for validation or as a description of memory, but as actual guidance to the compiler about how the program should be assembled and run. Haskell is famous for the maxim “If it builds, it will run,” because the types guarantee that the program will do something. It may not always be correct (it’s possible you wrote a plus where you meant a minus somewhere, after all), but a huge class of possible errors simply doesn’t exist in Haskell.

Haskell’s syntax is clear and concise. Almost too concise: it can sometimes be unclear where functionality is coming from, because the developer’s type-based request can lead the compiler to go out into the library and, if there’s exactly one type match, wire it in automatically. This is great if you know the library, not so great for beginners.

Haskell also uses some concepts from advanced topics in mathematics. Phrases like "structure-preserving transforms" and "monoids in the category of endofunctors" get thrown around, giving beginners a sense of terror at all the learning that’s about to happen.

Python

Python isn’t either of those languages; Python is a dynamically typed language with a strict syntax and a huge standard library. Python is a hit because of that library: it was the first interpreted language whose initial install brought massive, ready-to-use functionality for many of the things programmers do daily. Python’s whitespace-oriented syntax annoyed people when it came out; now it’s considered something of a stroke of genius, as you ought to be indenting your scopes anyway.

Python is easy to learn, easy to teach, and at first glance "does what you mean." Its lack of types can lead to lots of debugging pain and unit testing, but for small programs it’s a wonder of fast productivity.

C

C gets a deservedly bad rap. It’s a language which pretends to have guardrails, but they’re mostly made of spaghetti. On the other hand, C creates predictable assembly language, and it’s possible to extract every last erg of power out of your CPU using it. But its “types” are mostly there to describe the layout of memory rather than to securely associate functions with objects, and the semantics of C let you change types on the fly. This can be useful for systems programming, where a blob of memory can be a database structure and the exact same blob just a stream of bytes to be passed over a network, without having to copy or transform the data; but the human overhead of debugging and the notorious security headaches have the world searching for an alternative.

Mills’s language already exists, sort of

Every time I read a bullet point in Mills’s list, I had the same reaction: Oh, you want Python, or Oh, you mean Haskell.

When he says “I want a language that makes writing asynchronous code easy,” my reaction is: Haskell. Haskell goes full math and says, “If you squint just right, everything looks like everything else.” A list is a container for a number of things; an exception is just a container for a thing that might fail; an asynchronous operation is just a container for a thing that will happen eventually. By leveraging this insight, Haskell can glue everything together into “an operation that might deliver a list eventually.” But Haskell requires exceptional levels of cleverness to put together its n-dimensional Lego pieces. It’s a great language to learn, and it is general-purpose, but it’s hard to master.

He wants a Haskell that doesn’t require all that thinking.

A lot of his requests made me think: Lisp. The closest thing we have is Clojure, a Lisp that runs on the Java VM. It resists the classic Lisp explosion by leveraging the Java Standard Library, thus requiring that every deviation from the stock language converge on “tools that still run on the JVM.” It’s a nifty restriction, and Rich Hickey’s secret weapon. But it’s still a Lisp.

He wants a Lisp that can’t blow up, but isn’t restricted to a VM-only platform.

Like a lot of developers, I gripe a lot about the languages I know best. I complain about Python, Javascript, and Go because I know those languages and am good at them. I love Haskell and Lisp and Rust, but I also know where their warts are; unlike the first three, I’ve never been asked to work in them professionally, and I suppose if I were using them for something other than recreational purposes I’d get tired of their faults just as quickly.

What Mills (and I, and a lot of other developers) really want is smarter programmers: self-aware developers who care more about the code than the usual hired guns we run across in this industry, the fly-by-night, $15/hour code-camp kids who just know how to search Stack Overflow.

When I first moved to Seattle in 1991, I had a year’s experience working as a Cobol & DB2 developer, an accounting degree, and a couple of men in dark suits and sunglasses following me around asking if I wanted to take an Ada job that entailed six months on-site and then a six-month vacation. [True story. I finally told one that “I don’t want to work at Pine Gap.” At the time, no one outside the government was supposed to know about Pine Gap, but the previous year’s revolt, in which every Ada developer at the place just up and quit, was all over Usenet. Why they’d quit was obviously top secret; it was just that there were a lot of Ada developers suddenly looking for work, and where they’d come from sorta leaked out. Just saying “Pine Gap” was enough to get me off their hiring list and probably into a file somewhere.] Desperate for work, I took a data entry position at a little cellular phone company that long ago merged with one of the bigger ones.

It wasn’t actually much of a data entry position; mostly I was playing sneakernet for a sales team that spent its time cold-calling subscribers and asking them if they wanted to upgrade. It was an eight-person team, five men and three women, and they were all surprisingly diligent and good at their jobs, given how much we all hate cold-call sales these days. I guess it was new back then. They were all very different people: Dave was a big man, loud and brash; Lauren was a confident woman who spoke softly but persistently; Alex was a little blond man who drank too much coffee. You could write novels around characters like them.

Work came in as a list of phone numbers encoded on some kind of primitive industrial digital tape cassette. First thing in the morning, the tape would go into a machine and the first number would roll out to the first salesperson, and then the second. When a call ended, the machine would click forward and send another number. Efficient and relentless, and yet the team could keep it up for two hours at a time, take a break, and start all over again. My job was to sneakernet their hits and misses, which they recorded on their own DATs, into my machine and collate the final report for the day.

These tapes were assembled regionally: one day would be “all Wisconsin,” another “all Colorado,” and the next, “Greater Los Angeles.” In-house marketing teams, driven by surveys and their own instincts, assembled the tapes by hand. Remember: this is 1992, before the Internet was a commercial venture. There was no “big data.” There were, at best, a couple of expensive database products out there, and Excel.

Having studied accounting, I was pretty damned good at Excel 3.0, which had been released the year prior. I started to track our team’s success at the individual contributor level.

After about three weeks, when the team manager and I were alone in my office, I said, “I have something interesting to show you.” I pulled up my spreadsheets and showed him my effort. Dave was fantastic in Los Angeles, but sold absolutely nothing in Wisconsin. Lauren had a steady and reliable hit rate in Colorado, but didn’t do well at all in LA. And so on. I showed how each salesperson’s style meshed or didn’t mesh with a given region, and how we could predict with some accuracy how well they’d do on a given day given a regional tape marking. I told him, “It would be a lot better if we could assign them to the markets they know and are good at. They’d be upselling much more every day.”

He first said, “Huh. Let me think about this.”

The next day he came back and said in an angry tone, “Delete it. Delete all of it. Throw those spreadsheets away and never mention them again. Especially don’t mention this to the team. That’s not what we hired you to do.”

I was surprised. He softened a bit and said, “Sorry. When I told my manager I got my ass chewed. There’s no mechanism for assigning someone to a specific region and marketing would cut off my balls if I asked them to build tapes that way. If you told anyone on the team that they were bad in a given region, they’d probably not try very hard that day.”

I pointed out that Dave probably already knew he’d have a good day, maybe eighteen hits, with an LA tape, but he’d be lucky to get two upsells if he had a Wisconsin or Colorado tape. The manager nodded and said, “Probably. But we can’t make it official.”

Despite having worked at a brokerage firm for a year, this was my first real taste of corporate bullshit.

After working on a take-home quiz, I have four numbers with regard to the “Twitter clone” project given to me by ${MEGACORP}.

2,572

2,572 different Javascript libraries ended up inside the build tree for the demonstration application. That number does not include the global instance of the Prisma toolkit that I ended up installing on the recommendation of the tutorial. That number does not include the Prisma back-end, running on AWS, over which I had no ownership or control.

That’s 2,572 potential cases of corruption, infiltration, or code unreliability. No one involved with this project could possibly have vetted all of them, and there’s very little chance anyone could have a web-of-trust that encompasses all of them.

2,371,451

The final build size for this project was 2,371,451 bytes. That’s the size of the payload that gets downloaded to your client— desktop, phone, tablet— to run on your browser. 2.4 Megabytes. That’s a lot of code to run when all you want is a simple scrolling twitter feed.

[NaN]

I don’t know how many random accounts I have out there. I’ve been on the Internet since 1991; I’ve had accounts on AOL, CompuServe, MySpace, and a zillion other places. Authorizing Prisma to have access to my GitHub account for the purposes of authentication was just one more, and it annoyed me that local-first isn’t even on some companies’ radars.

32

32 is the number of hours I spent on this coding challenge. That’s a ridiculous number; no one who actually has a job could have gotten through this assignment in anything like a reasonable amount of time.

On the other hand, learning five new technologies well enough to “make it go” in only 32 hours is awesome, and it was really good to see that my coding chops at the browser level are still as solid as they are at the back-end.

Afterward

The thing of it all is, I look at This is a Motherfucking Website and wonder how the hell we got from that to, well, to React. My story sites use native Javascript and treat the HTML as the source of truth, which is one of the reasons they’re so damn fast. The Javascript on those pages is minuscule… less than 1 kilobyte… but they do the job of helping people navigate the site fast and efficiently, and they’re still ARIA-compliant.

React is like Golang: both are enterprise-ready solutions that let developers do things fast enough, but they hem developers into pre-packaged tooling and solutions that limit thought and creativity. They’re both great tools in their roles, but they’re not what I would choose to work with recreationally.

Which is too bad, but what else can you do? They really do let companies throw stuff out fast, and if the consumer can’t keep up, well, it’s the consumer’s responsibility to upgrade their phone or laptop, isn’t it? Upgrade or die.

The world still deserves better.

So how did I do on the coding challenge? I have no idea. But here’s what I went through:

The README promised that there was a working, minimal implementation in the ZIP file. There was not. So the first thing I did was go through the How To GraphQL tutorial, which kinda gets you to the point of having a working Twitter clone. While going through it, I realized that the coding challenge was this tutorial, more or less.

Prisma is ridiculously difficult to run locally. It’s a cloud-first solution, written in Scala. In fact, ${MEGACORP} provided me with a Prisma instance running somewhere on AWS rather than ask me to try to run it on my laptop. So the first discovery was that part of the reason the example code they gave me wasn’t working was that I didn’t have a Prisma account; I had to log into Prisma and authenticate via Github.

The next thing I realized, looking at the schema they provided, was that it didn’t look right. I ran the prisma generate command and got back a pile of warnings and errors. Despite being only three months old, the schema provided was already deprecated: the @unique attribute had been superseded by the @id attribute for data keys, and the syntax for date/time fields had changed. With the schema fixed, the Prisma account activated, and the database seeded, I actually started to see something in my “Twitter” feed.

It was time to turn my attention to the client.

Figuring out how to do login / logout was the hardest part. It involved all the moving parts, including authentication, and it was tricky. I ended up wrapping the entire app in a hand-written React Authentication.Context object that provided hooks to the login and logout function, as well as a state that indicated if there was a user and who it was. I so missed Rust’s sum types here; I wish Javascript had them, and in that form. (One of the wildest things I saw recently was that Typescript has sum types and enums, and they’re not the same thing. I filed that under "What the whatting what?") But once it was working, it was working fine.

I did have one moment where I couldn’t figure out how to banish the log-in popup. It was a two-stage asynchronous operation, one to handle the ‘submit’ event and another to handle the ‘login’ event. On a hunch, I wrapped the login handler in a console.log() and discovered it returned a Promise containing the return value from the login, so I could use a successful Promise resolution as the event to call the .dismiss() method. It worked, and it was elegant as hell, but I get the feeling it’s rather un-React-y; it was a “do something” rather than a “be something” implementation.

I wrote out a few decision tables and decided I could live with it.

After that, things accelerated: I quickly sketched out how I wanted it to look, how I wanted the toolbar and profile sidebar to update, and getting them to work was interesting. I learned a megaton of Material and Material-UI in the process, although I’m still not keen on Material’s thematic engine.

I had a good time doing this. It’s not the best use of my skills, but it’s certainly satisfying to see things working in real-time, and I’m impressed with the tooling, at any rate.

I’ve just spent the past four days working on a “homework assignment,” a coding assignment given to me by ${MEGACORP} to assess my skills as a web developer and programmer. I was sent a zip file with a README in it and a pair of subfolders, one for the server and one for the client, and told to build it out and “make it work.”

It was broken as delivered. The install worked fine, but I winced when I realized that yarn had installed 2,572 different packages in the library trees, including the entire flow binary and the OCaml runtime that comes with it! Running the server threw a dozen errors, and the client showed only a blank white page with a red notification bar, completely unpopulated.

I was coming out of Splunk, where I’d been working with Splunk’s own server and client foundation for five years. That foundation was built on top of Django and Backbone, two skills I clearly had when Splunk hired me in 2013. In 2015, some of our teams had started to work with React, and I’d been fortunate enough to be part of that.

But the 2017-2018 development cycle had been all Go, all the time. I was woefully out of practice as a web developer.

Here’s what they threw at me:

GraphQL

GraphQL is a query language specified by Facebook that handles client/server transactions. You specify the possibly deeply nested data you want sent to the client, and when you make the request you get a JSON object pre-populated with that nested data. Instead of assembling all the data out of different requests, your client sends a single request and gets back all the data it needs to fill in a page.

It’s a pretty good idea. But it was new to me.

Yoga

Yoga is a proxy server in which you define the GraphQL endpoints for your client. You write your specification here, defining the structure of the data and the handlers for assembling that data out of your back-end. Also completely new to me. You write it in Javascript or one of the languages that compile to it (Typescript, Clojurescript).

Prisma

Prisma is yet another proxy server, this one written in Scala. Prisma is the big aggregate engine; it talks to all your back-end services and assembles the data for you.

The relationship between Yoga and Prisma is basically that of a firewall: Yoga limits the kinds of transactions that can happen and can restrict access based on authentication; Prisma takes whatever Yoga sends it and does the heavy lifting of assembling and aggregating the objects to send to the client.

JWT

JSON Web Tokens (JWTs) were being used for authentication. I straight up admit that I ended up using them badly and contrary to security advice. We hadn’t used them at Splunk until the “cloud era,” and by then I was working on Kubernetes deployment issues and didn’t get to learn much about them. They’re basically cryptographically signed bundles that describe “claims,” a word that encompasses both authentication and authorization, and they’re short, so they fit in the HTTP Authorization: header. Unfortunately, they’re also readable by anyone who intercepts them, so if you’re using them in an unsecured environment anyone can grab one and use it against you.

ES6 Javascript

Oh, and did I mention that this was entirely ES6 Javascript, which we hadn’t been using at Splunk at all? So I got to learn a whole bunch of new things! And actually, I was pretty stoked about that. It has real classes now, and the arrow-function shorthand with no this overrides (inheriting the lexical this is awesome!), and it does destructuring and lexical naming and a whole bunch of other stuff I enjoyed the hell out of.

React

I wasn’t new to React. In fact, I had a pretty solid working relationship with React, and even had a good idea of how to hack it to use with older, Backbone-based data sources. But that was two and a half years ago, and React has moved on. There are more libraries today.

Apollo

Apollo is a toolkit for GraphQL clients. The Apollo toolkit creates React Contexts in which you can write GraphQL queries and get them back as processed objects, ready to feed directly into your React components and draw at will. Apollo was interesting, but I found its relationship with React a bit awkward, especially when trying to handle post-event routing. Apollo wants to deliver data that you ultimately render, but in some cases, such as log-in, what you want is both the render event and some other behavior, such as dismissing the login pop-up. That was tricky.

Material & Material UI

Material is Google’s own CSS Design System, the one we’ve all come to recognize. It has paper and cards as metaphors, and has a whole bunch of guidelines for building out the best UIs. It’s fine as far as design systems go, and it’s certainly mature.

Material UI is an implementation of Material in React; it provides a whole bunch of widgets for all the things one might normally do with an application and makes them easy to bring to life.

… and that was just the basics. I had to learn all of those well enough to do the job, and this was for a code quiz. It was like a 32-hour whiteboarding session without any of the “cultural fit” stuff going on. It was grueling, and I hope I don’t have to do anything like it again any time soon.

Part 2: What I actually did on the project.

Kazanov’s fifth World’s Simplest Bytecode Interpreter (see the Fourth Simplest) isn’t quite so simple: it is an implementation of Thompson’s most basic table-driven regular expression handler. It handles a complete set of Kleene’s regular expressions without the closure operator (no * operator), and it does so by exploiting the host language’s stack to handle the nested recursion of alternatives.

This is a much different beast from the previous four, and it requires a little thinking to grok why it’s so different. For one thing, we’re going to be jumping back and forth inside the program: this is our first bytecode interpreter with a Jmp instruction. More than that, we may want to jump backwards, so we can’t use usize for our offsets; we’ll need isize. There are times we want to add a negative offset, after all!

Our opcode now looks like this:

<lib.rs>=
pub enum Opcode {
    Chr(char),
    Jmp(isize),
    Alt(isize, isize),
    End,
}

If you’re at all following my other project at home, there’s an operation missing: where is Seq? Well, it turns out that Seq is handled simply by incrementing the two pointers.

In order to facilitate rapid access to our opcodes, we’re going to convert it into a concrete Vec<Opcode>, and likewise the string we’ll be analyzing is going to be a Vec<char>. We’ll be passing references of these two objects to our interpreter, as neither the string nor the program will be mutated by the interpreter’s operation:

<lib.rs>+=
pub fn interpret_reentrant(
    program: &[Opcode],
    sample: &[char],
    mut pc: isize,
    mut sp: isize,
) -> bool {

We’re only going to bother with “did it match, or didn’t it?”— a boolean. We also have two new variables, the pc (program counter) and the sp (string pointer). As those can “fall off the world” at either end, we have to test for that. If we leave the world, obviously it didn’t match:

<lib.rs>+=
    loop {
        if sp < 0 || (sp as usize) > sample.len() || pc < 0 || (pc as usize) >= program.len() {
            return false;
        }
        

The meat of our interpreter is, erm, “fun.” If the character matches, we proceed; if it doesn’t, or if we’ve already consumed the whole sample, we return false. The Alt operator takes two addresses: an offset to the left child expression and another to the right child expression. Each child expression must end with a Jmp to the next element in the program sequence after the Alt. And again, because we’re mixing and matching pointers and offsets, we have to do benign casts on our indexes:

<lib.rs>+=
        match program[pc as usize] {
            Opcode::Chr(c) => {
                // A Chr past the end of the sample has nothing left to match.
                if (sp as usize) < sample.len() && c == sample[sp as usize] {
                    pc += 1;
                    sp += 1;
                } else {
                    return false;
                }
            },
            
            Opcode::Jmp(i) => {
                pc += i;
            },
            
            Opcode::Alt(l, r) => {
                if interpret_reentrant(program, sample, pc + l, sp) {
                    return true;
                }
                pc += r;
            },

            Opcode::End => {
                return true
            }
        }
    }
}

We now have to start our engine:

<lib.rs>+=
pub fn interpret(program: Vec<Opcode>, sample: &str) -> bool {
    let s: Vec<char> = sample.chars().collect();
    interpret_reentrant(&program, &s, 0, 0)
}

And… that’s it. That’s an entire regular expression engine in about 45 lines of Rust. It’s not really a good regular expression engine: a char in Rust isn’t really a character, as it only encodes a Unicode code point. A letter with an umlaut might be a single code point, or it could be a multi-code-point assembly of an undecorated letter followed by a “put an umlaut over the previous symbol” instruction. Discussing the difference between individual Unicode code points and grapheme clusters, and normalizing between them, is a very advanced topic, one I have not yet begun to tackle in my own work. And because the sp values saved on the host stack encode backtracking-on-failure, this code doesn’t work in a streaming fashion. But for a demo… eh. It does the job.

Let’s just say that performing regular expressions on Asian languages, apparently especially Korean, is a non-trivial exercise.
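To make the code-point problem concrete, here’s a tiny sketch of my own (separate from the interpreter) showing how a decomposed character defeats a Vec<char> matcher:

fn main() {
    // "é" written as 'e' followed by U+0301, the combining acute accent.
    let decomposed = "e\u{301}";
    // Two code points, so our Vec<char> sees two "characters"...
    assert_eq!(decomposed.chars().count(), 2);
    // ...and neither of them equals the precomposed 'é' (U+00E9),
    // so Chr('é') would never match this string.
    assert!(decomposed.chars().all(|c| c != '\u{e9}'));
}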

The unit tests are straightforward, but you can see how complex our programs quickly become. There’s also an interesting, well, I don’t want to call it a ‘failure mode,’ but look down at the Alt test. See those Jmp(1) operations? They’re a consequence of the rule that each arm of the Alt must end with a Jmp to the command after the Alt. That Jmp is completely unnecessary; it only advances the program counter by one, which our (virtual) Seq would do anyway. It’s a wasted operation, and hopefully any good compiler we write would recognize it for what it is.

<lib.rs>+=
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn basic() {
        use Opcode::*;
        // "a"
        assert!(interpret(vec![Chr('a'), End], "a"));
    }

    #[test]
    fn seq() {
        use Opcode::*;
        // "ab"
        assert!(interpret(vec![Chr('a'), Chr('b'), End], "ab"));
    }

    #[test]
    fn alt() {
        use Opcode::*;
        // "(a|b)c"
        assert!(interpret(
            vec![Alt(1, 3), Chr('a'), Jmp(2), Chr('b'), Jmp(1), Chr('c'), End],
            "ac"
        ));
        assert!(interpret(
            vec![Alt(1, 3), Chr('a'), Jmp(2), Chr('b'), Jmp(1), Chr('c'), End],
            "bc"
        ));
        assert_eq!(
            interpret(
                vec![Alt(1, 3), Chr('a'), Jmp(2), Chr('b'), Jmp(1), Chr('c'), End],
                "ab"
            ),
            false
        );
    }
}
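If you want to see multi-character arms in action, here’s one more test, my own construction rather than one of Kazanov’s, which would live inside the tests module above. It encodes “(ab|cd)” under the same convention: each arm ends with a Jmp that walks forward to the End.

    #[test]
    fn alt_multichar() {
        use Opcode::*;
        // "(ab|cd)"; built by a closure because Opcode doesn't derive Clone.
        let program = || vec![
            Alt(1, 4),  // left arm at pc+1, right arm at pc+4
            Chr('a'),   // 1: left arm...
            Chr('b'),   // 2
            Jmp(3),     // 3: hop to 6, the right arm's trailing Jmp
            Chr('c'),   // 4: right arm...
            Chr('d'),   // 5
            Jmp(1),     // 6: step past the Alt structure
            End,        // 7
        ];
        assert!(interpret(program(), "ab"));
        assert!(interpret(program(), "cd"));
        assert!(!interpret(program(), "ad"));
    }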

And here ends where I’m going to follow Kazanov. He does much more with regular expressions in later installments, exploiting some GCC-isms to create a full-powered context-threaded interpreter. It turns out Rust will recognize what you’re trying to do and do the right thing if you follow a fairly idiomatic design pattern for interpreters, so those wacky GCC-isms aren’t necessary.

I’m sure there’s much more we could go into, but I have other things to study next week, I’m in the middle of looking for work (hey, anyone need a skilled Rust developer?), and this more or less satisfies my goal of rebooting my knowledge of bytecode interpreter design.

All of the code for this project is now available on Github.

For the fourth World’s Simplest Bytecode Interpreter, Kazanov goes into a bit of esoterica without discussing why: he wants “authentic bytecode.” He’s doing C programming, and he dense-packs his instructions into a single 16-bit word. That’s fine if you’re emulating a real chunk of hardware, but for our purposes… I really don’t care.

With that in mind, let’s talk: a register-based machine is one in which every instruction names the few storage slots it operates on, and the entirety of the program’s live data has to fit in that small set of slots, the registers. In our case, that small number is 16, although in reality my tests use only three.

There are only two opcodes for moving values in or out of our system: Load, which moves an immediate value into a register, and Done, which returns the value of a specific register to the calling context. Since our example program is a simple calculator, every arithmetic operation takes three register names: two sources and a destination:

<lib.rs>=
use std::result;

const REGISTER_COUNT: usize = 16;

pub enum Opcode {
    Load(usize, i64),
    Add(usize, usize, usize),
    Sub(usize, usize, usize),
    Mul(usize, usize, usize),
    Div(usize, usize, usize),
    Done(usize),
}

pub struct Program {
    program: Vec<Opcode>,
    registers: [i64; REGISTER_COUNT],
}

Since there’s no stack, we don’t care about stack underflow. On the other hand, we’re requiring that the program have an explicit exit terminator; unlike the previous versions, falling off the end of the program is not acceptable, so we replace our stack underflow with an “unexpected termination” error.

<lib.rs>+=
#[derive(Debug)]
pub enum ProgramError {
    DivisionByZero,
    UnexpectedTermination,
}

type Result<T> = result::Result<T, ProgramError>;

Initializing our program remains familiar, although I’ve changed one of the initializers. The cargo clippy command enforces an even more strict programming style on Rust than cargo fmt, and it recommends this form of initialization. Here, the program entry means “find a value named program in the current lexical scope and use it to initialize the program field of the Program struct.” I am not fond of this— it strikes me as a kind of mystery-meat programming— but it seems to be popular with Rustaceans.

<lib.rs>+=
pub fn interpret(program: Vec<Opcode>) -> Result<i64> {
    let mut program = Program {
        program, 
        registers: [0; REGISTER_COUNT],
    };

And now the opcode handler. This is straightforward: we either introduce values into the program with Load, process them with an operation, or we’re Done and return the value in the register specified. We don’t need our macro this time; the register addresses are immediate and quickly available. We do still need to watch out for that divide-by-zero problem. And if the program ends without being Done, well, that’s an error, too. I should point out that the order of registers is Kazanov’s; if I had been designing this from the ground up, I would have chosen the leftmost register as the destination.

<lib.rs>+=
    for op in program.program {
        match op {
            Opcode::Load(r0, imm) => program.registers[r0] = imm,
            Opcode::Add(r0, r1, r2) => {
                program.registers[r2] = program.registers[r0] + program.registers[r1]
            }
            Opcode::Sub(r0, r1, r2) => {
                program.registers[r2] = program.registers[r0] - program.registers[r1]
            }
            Opcode::Mul(r0, r1, r2) => {
                program.registers[r2] = program.registers[r0] * program.registers[r1]
            }
            Opcode::Div(r0, r1, r2) => {
                if program.registers[r1] == 0 {
                    return Err(ProgramError::DivisionByZero);
                }
                program.registers[r2] = program.registers[r0] / program.registers[r1];
            },
            Opcode::Done(r0) => return Ok(program.registers[r0]),
        }
    }
    Err(ProgramError::UnexpectedTermination)
}

There are two things this VM should do that it does not. First, we do not verify that a register being read has been written to at least once by the program; otherwise we’re just using the default 0 value for whatever calculations come next. Second, we do not validate that the register IDs being passed in are valid; we could easily access a value outside the bounds of our static array.

In some ways, this is okay: a smart compiler would catch these mistakes before they could possibly happen.

But we’re now outside the realm of “things the compiler can do for us”; our interpreter is beginning to have its own error states, and while there are a number of things we can do to make sure the interpreter itself is well-formed, we’re going to have to start worrying about bugs in the programs our interpreter runs.
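As a sketch of what guarding against bad register IDs might look like (my own addition, and it assumes a hypothetical InvalidRegister variant added to ProgramError), a pre-flight pass over the program would let the main loop trust every index:

fn validate(program: &[Opcode]) -> Result<()> {
    let ok = |r: usize| r < REGISTER_COUNT;
    for op in program {
        let valid = match *op {
            Opcode::Load(r0, _) | Opcode::Done(r0) => ok(r0),
            Opcode::Add(r0, r1, r2)
            | Opcode::Sub(r0, r1, r2)
            | Opcode::Mul(r0, r1, r2)
            | Opcode::Div(r0, r1, r2) => ok(r0) && ok(r1) && ok(r2),
        };
        if !valid {
            // Assumes ProgramError::InvalidRegister, a variant the
            // enum above doesn't actually have.
            return Err(ProgramError::InvalidRegister);
        }
    }
    Ok(())
}

The written-before-read check would take only a little more: carry a [bool; REGISTER_COUNT] through the same loop, setting the destination’s flag on each write and checking the sources’ flags first.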

And the unit tests. You can start to see how complex a register-based compiler would have to be to keep track of which registers were free, which were in use, and how those registers should be allocated and used. And a “real” programming language would also need a stack for scopes and for tracking the evolution of complex function trees. For now, though, this is sufficient.

<lib.rs>+=
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn basic() {
        use Opcode::*;
        assert_eq!(interpret(vec![Load(0, 2), Done(0)]).unwrap(), 2);
        assert_eq!(interpret(vec![Load(1, 2), Load(2, 2), Mul(1, 2, 0), 
                                  Load(1, 3), Mul(1, 0, 0), Load(1, 4), 
                                  Mul(1, 0, 0), Done(0) ]).unwrap(),
                   48);
    }
}

And the results are the same:

running 1 test
test tests::basic ... ok

And that’s a register-based virtual machine.

For the previous step of Implementing The World’s Simplest Bytecode Interpreter in Rust, I followed Vladimir Kazanov as he added larger opcodes to his bytecode. In the third variant, he adds a stack, pushing values on as needed and allowing the operations to pop values off the stack as needed.

My variant of this operation follows his, but is slightly different. He continues to use the stack somewhat arbitrarily, but as I’ve pointed out, Rust doesn’t allow for that outside of unsafe code.

We’re going to swap out the accumulator for a stack, and we’ll return the last value pushed onto the stack as the result. After including the result type, which we’ll use for error messaging and forwarding, our program now looks like this:

<lib.rs>=
use std::result;

pub enum Opcode {
    Push(i64),
    Add,
    Sub,
    Mul,
    Div,
    Done
}

pub struct Program {
    program: Vec<Opcode>,
    stack: Vec<i64>,
}

One thing I wanted to do was add proper error handling. This isn’t exactly "proper," as I’ve added no pretty printing, but it does the job. One thing I have done is redefine Result to always have the types used by my library, and nothing else:

<lib.rs>+=
#[derive(Debug)]
pub enum ProgramError {
    DivisionByZero,
    StackUnderflow,
}

type Result<T> = result::Result<T, ProgramError>;

Okay, now things are gonna get… a little advanced. But only a little. What this next piece does is simple: I found myself writing the exact same seven lines of code for three of the four operations. So why not extract that into a function? Well, the biggest problem is that the maths symbols, +, -, etc., aren’t first-class functions; they can’t be passed around. But they are Rust tokens, which means they can be handled by a macro. So I created a macro that performs the operation, manipulating the object to pop() and push() the values as needed:

<lib.rs>+=
macro_rules! make_op {
    ($code:expr, $op:tt) => {{
        if let Some(a) = $code.stack.pop() {
            if let Some(b) = $code.stack.pop() {
                $code.stack.push(b $op a);
                None
            } else { Some(ProgramError::StackUnderflow) }
        } else { Some(ProgramError::StackUnderflow) }
    }};
}

This is a very simple substitution macro; it takes two items separated by a comma and builds a block that either returns None or Some(Error). We’ll use this return value as a signal to our processing loop.

The first item passed into the macro is a Rust expression, something that returns a value (in this case, the value is just the program struct, so access is a no-op), and the second is a Rust token, which is the most primitive object a macro can handle, but it’s perfect for arithmetic symbols.

There is one big oddity in the macro rule; can you spot it? The operation’s values will be extracted into a and b, but the operation itself is b $op a. Multiplication and addition are fully commutative, at least for integers, but subtraction is not; we need to do the operation in this order to ensure the result is “unsurprising”: 5 4 - is “5 - 4”, not “4 - 5”.

Initializing our program is unchanged:

<lib.rs>+=
pub fn interpret(program: Vec<Opcode>) -> Result<i64> {
    let mut code = Program {
        program: program,
        stack: Vec::new()
    };

And now that I have my macro handler, the actual opcode runner is fairly uncluttered, although the cyclomatic complexity of the error handling for division does make it ugly. This is one place where railway-oriented programming would be lovely, but that’s an advanced topic.

Basically, if any operation returns an error the program halts right there with the error value, otherwise it proceeds until it either hits Done or runs out of operations to perform, at which point it tries to extract the value from the stack and return it, or returns a StackUnderflow error if the stack is empty.

<lib.rs>+=
    for op in code.program {
        if let Some(err) = match op {
            Opcode::Push(i) => { code.stack.push(i); None }
            Opcode::Mul => make_op!(code, *),
            Opcode::Add => make_op!(code, +),
            Opcode::Sub => make_op!(code, -),
            Opcode::Div => {
                if let Some(a) = code.stack.pop() {
                    if a == 0 {
                        Some(ProgramError::DivisionByZero)
                    } else if let Some(b) = code.stack.pop() {
                        code.stack.push(b / a);
                        None
                    } else { Some(ProgramError::StackUnderflow) }
                } else { Some(ProgramError::StackUnderflow) }
            }
            Opcode::Done => break,
        } {
            return Err(err);
        }
    }

    if let Some(v) = code.stack.pop() {
        Ok(v)
    } else {
        Err(ProgramError::StackUnderflow)
    }
}

And the unit tests:

<lib.rs>+=
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn basic() {
        use Opcode::*;
        assert_eq!(interpret(vec![Push(2), Done]).unwrap(), 2);
        assert_eq!(interpret(vec![Push(2), Push(3), Mul, Done]).unwrap(), 6);
        assert_eq!(interpret(vec![Push(5), Push(3), Sub, Done]).unwrap(), 2);
        assert!(interpret(vec![Push(2), Mul, Done]).is_err());
        assert!(interpret(vec![Push(2), Push(0), Div, Done]).is_err());
        assert_eq!(
            interpret(vec![ Push(2), Push(2), Mul, Push(3), Mul,
                            Push(4), Mul, Done]).unwrap(), 48);
        assert_eq!(
            interpret(vec![ Push(5), Push(2), Mul, Push(5), Div,
                            Push(2), Div, Done ]) .unwrap(), 1);
    }
}

And the results are the same:

running 1 test
test tests::basic ... ok

One thing to note here is that we’ve now entered a realm where type errors are possible; if we could push more than one type of object onto the stack, we would need to differentiate among the types that our operations could handle. For a full-blown general-purpose interpreter, this sort of typechecking will get repetitive very quickly, but once you’ve identified the typechecking pattern in your code you can extract that repetition into a macro. Rust macros are already amazing, and I consider myself very much a beginner when it comes to them.
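Here’s a minimal sketch of that idea, entirely my own rather than part of Kazanov’s series: once the stack holds a Value enum instead of bare i64s, the macro grows a typechecking arm.

#[derive(Debug, PartialEq)]
enum Value {
    Int(i64),
    Bool(bool),
}

#[derive(Debug)]
enum VmError {
    StackUnderflow,
    TypeMismatch,
}

// Pops two values, insists both are Ints, and applies the operator in
// the same b-then-a order as make_op! above.
macro_rules! int_op {
    ($stack:expr, $op:tt) => {{
        match ($stack.pop(), $stack.pop()) {
            (Some(Value::Int(a)), Some(Value::Int(b))) => {
                $stack.push(Value::Int(b $op a));
                None
            }
            (None, _) | (_, None) => Some(VmError::StackUnderflow),
            _ => Some(VmError::TypeMismatch),
        }
    }};
}

fn main() {
    let mut stack = vec![Value::Int(5), Value::Int(3)];
    assert!(int_op!(stack, -).is_none()); // 5 - 3
    assert_eq!(stack.pop(), Some(Value::Int(2)));

    let mut bad = vec![Value::Bool(true), Value::Int(3)];
    assert!(matches!(int_op!(bad, +), Some(VmError::TypeMismatch)));
}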

In my last post, Implementing The World’s Simplest Bytecode Interpreter in Rust, I did exactly that, following Vladimir Kazanov’s World’s Simplest Bytecode Interpreter.

His first interpreter was basically dumb as a rock: starting from an integer value of zero, it allowed only increment and decrement. His second variant adds two new instructions which allow for the addition and subtraction of arbitrary integers. This is where the difference between C and Rust really begins to shine. I’ve added the two instructions to the Opcode:

pub enum Opcode {
    Inc,
    Dec,
    Add(i64),  // New
    Sub(i64),  // New
    Done
}
    
pub struct Program {
    program: Vec<Opcode>,
    accumulate: i64
}

The program definition is unchanged:

pub fn interpret(program: Vec<Opcode>) -> Option<i64> {
    let mut code = Program {
        program: program,
        accumulate: 0
    };

And the actual opcode runner is fairly straightforward:

    for i in code.program {
        code.accumulate = match i {
            Opcode::Inc => code.accumulate + 1,
            Opcode::Dec => code.accumulate - 1,
            Opcode::Add(i) => code.accumulate + i,  // New
            Opcode::Sub(i) => code.accumulate - i,  // New
            Opcode::Done => break
        }
    }

    return Some(code.accumulate)
}

And that’s the whole change. Kazanov is trying to show how the program’s array of instructions can keep values associated with an operation in-line with the operation itself, but in Rust that’s just… an enum. An enum in Rust is an authentic sum type; it can carry additional information. There’s no need to coerce the vector into holding “weird” values and casting them appropriately; Rust already guarantees that happened at compile time. And again, verifying the program’s correctness as it comes off storage is a separate responsibility; by the time the program has entered the interpreter, we’ve verified that it is syntactically correct at the bytecode level.

There is a cost to this: Rust’s Vec will consist of uniformly sized objects, each as big as the largest variant defined in the enum plus the management of the enum itself: at least one byte for the discriminant, padded out to the alignment of a 64-bit integer, even for the Inc and Dec operators. For a small DSL, that’s not a terrible price to pay. As an interpreter grows, alternative ways of describing an opcode may be desired. We shall see.
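If you want to see that cost on your own machine, std::mem::size_of will report it. A quick sketch using the Opcode enum above (the exact figure is a compiler layout decision, but 16 bytes is typical on 64-bit targets):

fn main() {
    // Typically 8 bytes of i64 payload plus 8 bytes of
    // discriminant-and-padding.
    println!("size of Opcode: {}", std::mem::size_of::<Opcode>());
}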

A basic unit test:

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn longer() {
        use Opcode::*;
        assert_eq!(interpret(vec![Inc, Inc, Inc, Add(4), Add(3), Sub(6), Done]), Some(4));
    }

}

And the results are the same:

running 1 test
test tests::longer ... ok

 
