Add Part 12: Typed Variables

Shadowfacts 2022-05-25 17:32:31 -04:00
parent 7d8754ad82
commit a39a938d08

@ -0,0 +1,183 @@
```
metadata.title = "Part 12: Typed Variables"
metadata.tags = ["build a programming language", "rust"]
metadata.date = "2022-05-25 16:38:42 -0400"
metadata.shortDesc = ""
metadata.slug = "typed-variables"
metadata.preamble = `<p style="font-style: italic;">This post is part of a <a href="/build-a-programming-language/" data-link="/build-a-programming-language/">series</a> about learning Rust and building a small programming language.</p><hr>`
```
Hi. It's been a while. Though the pace of blog posts fell off a cliff last year[^1], I've continued working on my toy programming language on and off.
[^1]: During and after WWDC21, basically all of my non-work programming energy shifted onto iOS apps, and then never shifted back. I do recognize the irony of resuming mere weeks before WWDC22.
<!-- excerpt-end -->
## Part 1: Type Theory is for Chumps
I spent a while thinking about what I wanted the type system to look like—I do want some level of static typing, I know that much—but it got to the point where I was tired of thinking about it and just wanted to get back to writing code. So, lo and behold, the world's simplest type system:
```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum Type {
    Integer,
    Boolean,
    String,
}

impl Type {
    fn is_assignable_to(&self, other: &Type) -> bool {
        self == other
    }
}
```
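To make the behavior concrete, here is a minimal sketch of how that check could be used. The `check_assignable` helper is hypothetical (it is not part of the interpreter); it just wraps the boolean check in a `Result` with a readable message.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum Type {
    Integer,
    Boolean,
    String,
}

impl Type {
    // With no subtyping or conversions, "assignable" just means "equal".
    fn is_assignable_to(&self, other: &Type) -> bool {
        self == other
    }
}

// Hypothetical helper, for illustration only: reports a mismatch as an error
// instead of a bare `false`.
fn check_assignable(value_type: Type, declared: Type) -> Result<(), String> {
    if value_type.is_assignable_to(&declared) {
        Ok(())
    } else {
        Err(format!("cannot assign {:?} to {:?}", value_type, declared))
    }
}
```

Since every type is only assignable to itself, something like `check_assignable(Type::Boolean, Type::Integer)` returns an error, while matching types pass.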
Then, in the `Context`, rather than variables just being a map of names to `Value`s, the map now stores `VariableDecl`s:
```rust
struct VariableDecl {
    variable_type: Type,
    value: Value,
}
```
So variable declaration and lookup now go through simple helper functions on the `Context`; the declaration helper creates the `VariableDecl`.
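As a rough sketch of what those helpers might look like: the `Value` enum below is a simplified stand-in, and the `variables` field and the `resolve` method name are assumptions for illustration, not the interpreter's real definitions.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq, Clone, Copy)]
enum Type {
    Integer,
    Boolean,
    String,
}

// Simplified stand-in for the interpreter's `Value` type.
#[derive(Debug, PartialEq, Clone)]
enum Value {
    Integer(i64),
    Boolean(bool),
    String(String),
}

struct VariableDecl {
    variable_type: Type,
    value: Value,
}

// Assumed shape of the context: just a map from names to declarations.
#[derive(Default)]
struct Context {
    variables: HashMap<String, VariableDecl>,
}

impl Context {
    // Builds the `VariableDecl` so callers never construct one by hand.
    // In this sketch, redeclaring a name simply overwrites the old entry.
    fn declare_variable(&mut self, name: &str, variable_type: Type, value: Value) {
        let decl = VariableDecl { variable_type, value };
        self.variables.insert(name.to_string(), decl);
    }

    // Hypothetical lookup helper: `None` means the variable was never declared.
    fn resolve(&self, name: &str) -> Option<&VariableDecl> {
        self.variables.get(name)
    }
}
```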
For now, types at variable declarations are optional at parse time, since I haven't touched type inference yet and I didn't want to go back and update a bunch of unit tests. If a type isn't specified, though, it is inferred at evaluation time from the initial value.
```rust
fn parse_statement<'a, I: Iterator<Item = &'a Token>>(it: &mut Peekable<I>) -> Option<Statement> {
    // ...
    let node = match token {
        Token::Let => {
            let name: String;
            if let Some(Token::Identifier(s)) = it.peek() {
                name = s.clone();
                it.next();
            } else {
                panic!("expected identifier after let");
            }
            // The type annotation is optional: only parse one if there's a colon.
            let mut variable_type = None;
            if let Some(Token::Colon) = it.peek() {
                it.next();
                variable_type = Some(parse_type(it).expect("type after colon in variable declaration"));
            }
            expect_token!(it, Equals, "equals in variable declaration");
            let value = parse_expression(it).expect("initial value in variable declaration");
            Some(Statement::Declare {
                name,
                variable_type,
                value,
            })
        }
        // ...
    };
    // ...
}
```
The `parse_type` function is super simple, so I won't go over it; it just converts the tokens for string/int/bool into their respective `Type`s. I call `expect` on the result of that function and then wrap it in a `Some` again, which seems redundant, but if whatever followed the colon wasn't a type, that's a syntax error and I don't want to continue.
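For the curious, a function like that could look roughly like the sketch below. Everything about the token representation here is an assumption: it pretends type names are lexed as plain identifiers spelled `int`, `bool`, and `string`, which may not match the real lexer.

```rust
use std::iter::Peekable;

#[derive(Debug, PartialEq, Clone, Copy)]
enum Type {
    Integer,
    Boolean,
    String,
}

// Hypothetical, cut-down token set for this sketch only.
#[derive(Debug, PartialEq)]
enum Token {
    Identifier(String),
    Colon,
    Equals,
}

fn parse_type<'a, I: Iterator<Item = &'a Token>>(it: &mut Peekable<I>) -> Option<Type> {
    // Peek first so that nothing is consumed when the next token isn't a type.
    let ty = match it.peek() {
        Some(Token::Identifier(name)) => match name.as_str() {
            "int" => Type::Integer,
            "bool" => Type::Boolean,
            "string" => Type::String,
            _ => return None,
        },
        _ => return None,
    };
    // Only advance past the token once we know it named a type.
    it.next();
    Some(ty)
}
```

Returning an `Option` leaves the decision about whether a missing type is fatal to the caller, which is why the declaration parser can `expect` it.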
Actually evaluating the variable declaration is still pretty straightforward, though it now checks that the type the initialization expression evaluated to matches the declared type:
```rust
fn eval_declare_variable(
    name: &str,
    mutable: bool,
    variable_type: &Option<Type>,
    value: &Node,
    context: &ContextRef,
) {
    let val = eval_expr(value, context);
    let variable_type = match variable_type {
        Some(declared) => {
            assert!(
                val.value_type().is_assignable_to(declared),
                "variable value type is not assignable to declared type"
            );
            *declared
        }
        None => val.value_type(),
    };
    context
        .borrow_mut()
        .declare_variable(name, mutable, variable_type, val);
}
```
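The check-or-infer step in the middle can be pulled out and looked at on its own. This is only a sketch: `Value` is a simplified stand-in, and `resolve_variable_type` is a hypothetical function that returns a `Result` where the real code asserts.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum Type {
    Integer,
    Boolean,
    String,
}

impl Type {
    fn is_assignable_to(&self, other: &Type) -> bool {
        self == other
    }
}

// Simplified stand-in for the interpreter's `Value` type.
#[derive(Debug, PartialEq, Clone)]
enum Value {
    Integer(i64),
    Boolean(bool),
    String(String),
}

impl Value {
    // Every runtime value knows its own type, which is all "inference" needs here.
    fn value_type(&self) -> Type {
        match self {
            Value::Integer(_) => Type::Integer,
            Value::Boolean(_) => Type::Boolean,
            Value::String(_) => Type::String,
        }
    }
}

// Hypothetical helper: a declared type must accept the value's type;
// with no declared type, the value's own type is used.
fn resolve_variable_type(declared: Option<Type>, val: &Value) -> Result<Type, String> {
    match declared {
        Some(declared) if val.value_type().is_assignable_to(&declared) => Ok(declared),
        Some(declared) => Err(format!(
            "value of type {:?} is not assignable to declared type {:?}",
            val.value_type(),
            declared
        )),
        None => Ok(val.value_type()),
    }
}
```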
## Part 2: Variable Variables
The other bit I added was mutable variables, so that I could write a small program that did something non-trivial.
To do this, I changed the `VariableDecl` struct I showed above to hold a `ValueStorage` rather than a `Value` directly.
`ValueStorage` is an enum with variants for mutable and immutable variables. Immutable variables simply own their `Value`. Mutable ones, though, wrap it in a `RefCell` so that it can be mutated.
```rust
enum ValueStorage {
    Immutable(Value),
    Mutable(RefCell<Value>),
}
```
Setting the value is straightforward, but getting it is a bit annoying because `Value` isn't `Copy`, since it may own a string. So, there are a couple of helper functions: one to access the borrowed value and one to clone it.
```rust
impl ValueStorage {
    fn set(&self, value: Value) {
        match self {
            ValueStorage::Immutable(_) => panic!("cannot set immutable variable"),
            ValueStorage::Mutable(cell) => {
                *cell.borrow_mut() = value;
            }
        }
    }

    fn with_value<R, F: FnOnce(&Value) -> R>(&self, f: F) -> R {
        match self {
            ValueStorage::Immutable(val) => f(val),
            ValueStorage::Mutable(cell) => f(&cell.borrow()),
        }
    }

    fn clone_value(&self) -> Value {
        self.with_value(|v| v.clone())
    }
}
```
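Here is a small self-contained sketch of those helpers in use. The two-variant `Value` is a simplified stand-in, and `string_length` is a hypothetical function that shows `with_value` reading a string without cloning it.

```rust
use std::cell::RefCell;

// Simplified stand-in for the interpreter's `Value` type.
#[derive(Debug, PartialEq, Clone)]
enum Value {
    Integer(i64),
    String(String),
}

enum ValueStorage {
    Immutable(Value),
    Mutable(RefCell<Value>),
}

impl ValueStorage {
    fn set(&self, value: Value) {
        match self {
            ValueStorage::Immutable(_) => panic!("cannot set immutable variable"),
            ValueStorage::Mutable(cell) => {
                *cell.borrow_mut() = value;
            }
        }
    }

    fn with_value<R, F: FnOnce(&Value) -> R>(&self, f: F) -> R {
        match self {
            ValueStorage::Immutable(val) => f(val),
            ValueStorage::Mutable(cell) => f(&cell.borrow()),
        }
    }

    fn clone_value(&self) -> Value {
        self.with_value(|v| v.clone())
    }
}

// Hypothetical helper: inspects the stored value through a borrow, so a
// string's length can be read without copying the string itself.
fn string_length(storage: &ValueStorage) -> Option<usize> {
    storage.with_value(|v| match v {
        Value::String(s) => Some(s.len()),
        _ => None,
    })
}
```

One thing to keep in mind with this shape: the closure passed to `with_value` runs while the `RefCell` is borrowed, so calling `set` on the same mutable storage from inside it would panic at runtime.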
This works, but isn't ideal. At some point, the complex `Value` types should probably be changed to be reference-counted so that, even if they're still not copyable, cloning doesn't always involve an allocation.
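A minimal sketch of what that could look like, assuming strings are stored as `Rc<str>` (that choice, and the `shares_allocation` helper, are purely illustrative and not something the interpreter does yet):

```rust
use std::rc::Rc;

// Sketch of a `Value` whose string variant is reference-counted.
// `Value` still can't be `Copy`, but cloning a string only bumps a counter.
#[derive(Debug, PartialEq, Clone)]
enum Value {
    Integer(i64),
    Boolean(bool),
    String(Rc<str>),
}

// Hypothetical helper: true when both values are strings backed by the
// same heap allocation, i.e. one was produced by cloning the other.
fn shares_allocation(a: &Value, b: &Value) -> bool {
    match (a, b) {
        (Value::String(x), Value::String(y)) => Rc::ptr_eq(x, y),
        _ => false,
    }
}
```

With this representation, `Value::String(Rc::from("hello")).clone()` points at the same string data as the original rather than allocating a second copy.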
I won't go into detail on lexing and parsing, since they're trivial. There's a new token for `var`, and whether a declaration starts with that or `let` controls the mutability.
Setting variables isn't complicated either: when parsing a statement, if there's an equals sign after an identifier, that turns into a `SetVariable` which is evaluated simply by calling the aforementioned `set` function on the `ValueStorage` for that variable.
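Roughly, the evaluation side could be sketched like this. The `Context` here is a stripped-down assumption that maps names straight to a `ValueStorage` (the real one also tracks types), and `set_variable` and `get_variable` are hypothetical method names.

```rust
use std::cell::RefCell;
use std::collections::HashMap;

// Simplified stand-in for the interpreter's `Value` type.
#[derive(Debug, PartialEq, Clone)]
enum Value {
    Integer(i64),
}

enum ValueStorage {
    Immutable(Value),
    Mutable(RefCell<Value>),
}

impl ValueStorage {
    fn set(&self, value: Value) {
        match self {
            ValueStorage::Immutable(_) => panic!("cannot set immutable variable"),
            ValueStorage::Mutable(cell) => *cell.borrow_mut() = value,
        }
    }

    fn clone_value(&self) -> Value {
        match self {
            ValueStorage::Immutable(val) => val.clone(),
            ValueStorage::Mutable(cell) => cell.borrow().clone(),
        }
    }
}

// Assumed, stripped-down context for this sketch.
#[derive(Default)]
struct Context {
    variables: HashMap<String, ValueStorage>,
}

impl Context {
    // `mutable` would be true for `var` declarations and false for `let`.
    fn declare_variable(&mut self, name: &str, mutable: bool, value: Value) {
        let storage = if mutable {
            ValueStorage::Mutable(RefCell::new(value))
        } else {
            ValueStorage::Immutable(value)
        };
        self.variables.insert(name.to_string(), storage);
    }

    // What evaluating a `SetVariable` boils down to: find the storage, call `set`.
    // Note that `&self` is enough, because the `RefCell` provides the mutability.
    fn set_variable(&self, name: &str, value: Value) {
        match self.variables.get(name) {
            Some(storage) => storage.set(value),
            None => panic!("cannot set undeclared variable {}", name),
        }
    }

    fn get_variable(&self, name: &str) -> Option<Value> {
        self.variables.get(name).map(|storage| storage.clone_value())
    }
}
```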
And with that, I can write a little Fibonacci program:
```txt
$ cat fib.toy
var a = 0
var b = 1
var i = 0
while (i < 10) {
    print("iteration: " + toString(i) + ", a: " + toString(a));
    let tmp = a
    a = b
    b = tmp + a
    i = i + 1
}
$ cargo r -- fib.toy
iteration: 0, a: 0
iteration: 1, a: 1
iteration: 2, a: 1
iteration: 3, a: 2
iteration: 4, a: 3
iteration: 5, a: 5
iteration: 6, a: 8
iteration: 7, a: 13
iteration: 8, a: 21
iteration: 9, a: 34
```
I also added a small CLI using [`structopt`](https://lib.rs/structopt) so I didn't have to keep writing code inside a string in `main.rs`.