metadata.title = "Part 12: Typed Variables"
metadata.tags = ["build a programming language", "rust"]
metadata.date = "2022-05-25 16:38:42 -0400"
metadata.shortDesc = ""
metadata.slug = "typed-variables"
metadata.preamble = `<p style="font-style: italic;">This post is part of a <a href="/build-a-programming-language/" data-link="/build-a-programming-language/">series</a> about learning Rust and building a small programming language.</p><hr>`

Hi. It's been a while. Though the pace of blog posts fell off a cliff last year[1], I've continued working on my toy programming language on and off.

Part 1: Type Theory is for Chumps

I spent a while thinking about what I wanted the type system to look like—I do want some level of static typing, I know that much—but it got to the point where I was tired of thinking about it and just wanted to get back to writing code. So, lo and behold, the world's simplest type system:

#[derive(Debug, PartialEq, Clone, Copy)]
enum Type {
	Integer,
	Boolean,
	String,
}

impl Type {
	fn is_assignable_to(&self, other: &Type) -> bool {
		self == other
	}
}

Then, in the Context, rather than variables just being a map of names to Values, the map now stores VariableDecls:

struct VariableDecl {
	variable_type: Type,
	value: Value,
}

So variable declaration and lookup now go through simple helper functions on the Context, and the declaration helper is what creates the VariableDecl.
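Here's a minimal sketch of what such helpers might look like, assuming the Context keeps a HashMap keyed by variable name. The lookup_variable name, the simplified Value enum, and the shape of Context are assumptions for illustration, not the actual code:

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq, Clone, Copy)]
enum Type {
    Integer,
    Boolean,
    String,
}

// Simplified stand-in for the interpreter's Value type (assumption).
#[derive(Debug, PartialEq, Clone)]
enum Value {
    Integer(i64),
    Boolean(bool),
    String(String),
}

struct VariableDecl {
    variable_type: Type,
    value: Value,
}

// Hypothetical Context: maps variable names to their declarations.
#[derive(Default)]
struct Context {
    variables: HashMap<String, VariableDecl>,
}

impl Context {
    // Builds the VariableDecl so callers never construct one by hand.
    fn declare_variable(&mut self, name: &str, variable_type: Type, value: Value) {
        let decl = VariableDecl { variable_type, value };
        self.variables.insert(name.to_string(), decl);
    }

    // Returns the stored declaration, if the name has been declared.
    fn lookup_variable(&self, name: &str) -> Option<&VariableDecl> {
        self.variables.get(name)
    }
}
```

Keeping construction inside the Context means the stored type and value can't get out of sync at call sites.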

For now, the type on a variable declaration is optional at parse time, since I haven't touched type inference yet and I didn't want to go back and update a bunch of unit tests. If one wasn't specified, it is instead inferred at evaluation time from the initial value.

fn parse_statement<'a, I: Iterator<Item = &'a Token>>(it: &mut Peekable<'a, I>) -> Option<Statement> {
	// ...
	let node = match token {
		Token::Let => {
			let name: String;
			if let Some(Token::Identifier(s)) = it.peek() {
				name = s.clone();
				it.next();
			} else {
				panic!("expected identifier after let");
			}
			let mut variable_type = None;
			if let Some(Token::Colon) = it.peek() {
				it.next();
				variable_type = Some(parse_type(it).expect("type after colon in variable declaration"));
			}
			expect_token!(it, Equals, "equals in variable declaration");
			let value = parse_expression(it).expect("initial value in variable declaration");
			Some(Statement::Declare {
				name,
				variable_type,
				value,
			})
		}
		// ...
	};
	// ...
}

The parse_type function is super simple, so I won't go over it; it just converts the tokens for string/int/bool into their respective Types. I call expect on the result of parse_type and then wrap it in a Some again, which seems redundant, but if whatever followed the colon wasn't a type, there's a syntax error and I don't want to continue.
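For the curious, a rough sketch of what a function like parse_type could look like. It assumes type names arrive from the lexer as plain identifiers spelled int, bool, and string, and it uses the standard library's Peekable. Both the cut-down Token enum and those spellings are assumptions, not the real lexer:

```rust
use std::iter::Peekable;

#[derive(Debug, PartialEq, Clone, Copy)]
enum Type {
    Integer,
    Boolean,
    String,
}

// Hypothetical subset of the lexer's tokens; the real names may differ.
#[derive(Debug, PartialEq)]
enum Token {
    Identifier(String),
    Colon,
    Equals,
}

// Consumes one token and returns its Type, or returns None (consuming
// nothing) if the next token does not name a type.
fn parse_type<'a, I: Iterator<Item = &'a Token>>(it: &mut Peekable<I>) -> Option<Type> {
    let ty = match it.peek() {
        Some(Token::Identifier(s)) => match s.as_str() {
            "int" => Type::Integer,
            "bool" => Type::Boolean,
            "string" => Type::String,
            _ => return None,
        },
        _ => return None,
    };
    it.next();
    Some(ty)
}
```

Returning an Option leaves the decision about whether a missing type is fatal to the caller, which is why the declaration parser calls expect on it.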

Actually evaluating the variable declaration is still pretty straightforward, though it now checks that the type the initialization expression evaluated to matches the declared type:

fn eval_declare_variable(
    name: &str,
    mutable: bool,
    variable_type: &Option<Type>,
    value: &Node,
    context: &ContextRef,
) {
    let val = eval_expr(value, context);
    let variable_type = match variable_type {
        Some(declared) => {
            assert!(
                val.value_type().is_assignable_to(declared),
                "variable value type is not assignable to declared type"
            );
            *declared
        }
        None => val.value_type(),
    };
    context
        .borrow_mut()
        .declare_variable(name, mutable, variable_type, val);
}
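The value_type method used above isn't shown in this post. A minimal sketch, assuming Value has one variant per Type, might look like the following. The simplified Value enum and the resolve_type helper (which just mirrors the match in eval_declare_variable) are hypothetical:

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum Type {
    Integer,
    Boolean,
    String,
}

impl Type {
    fn is_assignable_to(&self, other: &Type) -> bool {
        self == other
    }
}

// Simplified stand-in for the interpreter's Value type (assumption).
#[derive(Debug, PartialEq, Clone)]
enum Value {
    Integer(i64),
    Boolean(bool),
    String(String),
}

impl Value {
    // Maps each runtime value to its static Type.
    fn value_type(&self) -> Type {
        match self {
            Value::Integer(_) => Type::Integer,
            Value::Boolean(_) => Type::Boolean,
            Value::String(_) => Type::String,
        }
    }
}

// Hypothetical helper: check against the declared type if there is
// one, otherwise fall back to the type of the evaluated value.
fn resolve_type(declared: Option<Type>, val: &Value) -> Type {
    match declared {
        Some(declared) => {
            assert!(
                val.value_type().is_assignable_to(&declared),
                "variable value type is not assignable to declared type"
            );
            declared
        }
        None => val.value_type(),
    }
}
```

With this, declaring a boolean variable with an integer initializer panics at evaluation time, while an undeclared type simply takes on whatever the initializer evaluated to.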

Part 2: Variable Variables

The other bit I added was mutable variables, so that I could write a small program that did something non-trivial.

To do this, I changed the VariableDecl struct I showed above to hold a ValueStorage rather than a Value directly.

ValueStorage is an enum with variants for mutable and immutable variables. Immutable variables simply own their Value. Mutable ones, though, wrap it in a RefCell so that it can be mutated.

enum ValueStorage {
	Immutable(Value),
	Mutable(RefCell<Value>),
}

Setting the value is straightforward, but getting it is a bit annoying because Value isn't Copy, since it may own a string. So, there are a couple of helper functions: one to access the borrowed value and one to clone it.

impl ValueStorage {
    fn set(&self, value: Value) {
        match self {
            ValueStorage::Immutable(_) => panic!("cannot set immutable variable"),
            ValueStorage::Mutable(cell) => {
                *cell.borrow_mut() = value;
            }
        }
    }

    fn with_value<R, F: FnOnce(&Value) -> R>(&self, f: F) -> R {
        match self {
            ValueStorage::Immutable(val) => f(val),
            ValueStorage::Mutable(cell) => f(&cell.borrow()),
        }
    }

    fn clone_value(&self) -> Value {
        self.with_value(|v| v.clone())
    }
}
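As an example of why the borrowing helper is worth having, here's a small sketch that inspects a value in place instead of cloning it. The string_len function and the simplified Value enum are made up for illustration:

```rust
use std::cell::RefCell;

// Simplified stand-in for the interpreter's Value type (assumption).
#[derive(Debug, PartialEq, Clone)]
enum Value {
    Integer(i64),
    String(String),
}

enum ValueStorage {
    Immutable(Value),
    Mutable(RefCell<Value>),
}

impl ValueStorage {
    fn with_value<R, F: FnOnce(&Value) -> R>(&self, f: F) -> R {
        match self {
            ValueStorage::Immutable(val) => f(val),
            ValueStorage::Mutable(cell) => f(&cell.borrow()),
        }
    }
}

// Hypothetical helper: looks at the value through a borrow, so a
// string payload is never cloned just to measure it.
fn string_len(storage: &ValueStorage) -> Option<usize> {
    storage.with_value(|v| match v {
        Value::String(s) => Some(s.len()),
        _ => None,
    })
}
```

The closure only sees a &Value, so the same code path works for both variants and the RefCell borrow is released as soon as the closure returns.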

This works, but isn't ideal. At some point, the complex Value types should probably be changed to be reference-counted so that, even if they're still not Copy, cloning doesn't always involve an allocation.
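I haven't done this yet, but one possible shape for it is storing the string payload in an Rc<str>. This is only a sketch of the idea; the Value enum here is a hypothetical variant of the real one, and shares_storage exists purely to demonstrate the sharing:

```rust
use std::rc::Rc;

// Hypothetical variant of Value where the string payload is shared.
#[derive(Debug, PartialEq, Clone)]
enum Value {
    Integer(i64),
    Boolean(bool),
    String(Rc<str>),
}

// Returns true when both values are strings backed by the same
// allocation, which is what cloning an Rc produces.
fn shares_storage(a: &Value, b: &Value) -> bool {
    match (a, b) {
        (Value::String(x), Value::String(y)) => Rc::ptr_eq(x, y),
        _ => false,
    }
}
```

Cloning such a Value copies a pointer and bumps a reference count rather than copying the string bytes, though Value still can't be Copy because Rc isn't.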

Lexing and parsing I won't go into detail on, since they're trivial. There's a new token for var, and whether a declaration starts with that or with let controls the mutability.

Setting variables isn't complicated either: when parsing a statement, if there's an equals sign after an identifier, that turns into a SetVariable which is evaluated simply by calling the aforementioned set function on the ValueStorage for that variable.
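Roughly, evaluating a SetVariable could look like the sketch below. It assumes the right-hand side has already been evaluated and that the Context maps names straight to their ValueStorage. The eval_set_variable name, the cut-down Context, and the simplified Value enum are all assumptions:

```rust
use std::cell::RefCell;
use std::collections::HashMap;

// Simplified stand-in for the interpreter's Value type (assumption).
#[derive(Debug, PartialEq, Clone)]
enum Value {
    Integer(i64),
    String(String),
}

enum ValueStorage {
    Immutable(Value),
    Mutable(RefCell<Value>),
}

impl ValueStorage {
    fn set(&self, value: Value) {
        match self {
            ValueStorage::Immutable(_) => panic!("cannot set immutable variable"),
            ValueStorage::Mutable(cell) => *cell.borrow_mut() = value,
        }
    }

    fn clone_value(&self) -> Value {
        match self {
            ValueStorage::Immutable(val) => val.clone(),
            ValueStorage::Mutable(cell) => cell.borrow().clone(),
        }
    }
}

// Hypothetical Context holding only the storage for each variable.
#[derive(Default)]
struct Context {
    variables: HashMap<String, ValueStorage>,
}

// Hypothetical evaluator for a SetVariable statement whose right-hand
// side has already been evaluated to `value`.
fn eval_set_variable(name: &str, value: Value, context: &Context) {
    context
        .variables
        .get(name)
        .expect("cannot set undeclared variable")
        .set(value);
}
```

Note that eval_set_variable only needs a shared reference to the Context: the RefCell inside the Mutable variant is what makes the write possible, and assigning to a let-declared variable panics.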

And with that, I can write a little Fibonacci program:

$ cat fib.toy
var a = 0
var b = 1
var i = 0
while (i < 10) {
	print("iteration: " + toString(i) + ", a: " + toString(a));
	let tmp = a
	a = b
	b = tmp + a
	i = i + 1
}

$ cargo r -- fib.toy
iteration: 0, a: 0
iteration: 1, a: 1
iteration: 2, a: 1
iteration: 3, a: 2
iteration: 4, a: 3
iteration: 5, a: 5
iteration: 6, a: 8
iteration: 7, a: 13
iteration: 8, a: 21
iteration: 9, a: 34

I also added a small CLI using structopt so I didn't have to keep writing code inside a string in main.rs.


  1. During and after WWDC21, basically all of my non-work programming energy shifted onto iOS apps, and then never shifted back. I do recognize the irony of resuming mere weeks before WWDC22.