v6/site/posts/2021-04-19-cleaning-up-binary-operators.md

144 lines
5.0 KiB
Markdown
Raw Normal View History

2022-12-10 13:15:32 -05:00
```
title = "Part 7: Cleaning Up Binary Operators"
tags = ["build a programming language", "rust"]
date = "2021-04-19 17:00:42 -0400"
short_desc = "A minor fight with the Rust borrow checker."
slug = "cleaning-up-binary-operators"
preamble = '<p style="font-style: italic;">This post is part of a <a href="/build-a-programming-language/" data-link="/build-a-programming-language/">series</a> about learning Rust and building a small programming language.</p><hr>'
```
The code from [part 4](/2021/operator-precedence/) that checks whether a pair of binary operators should be grouped to the left or right works, but I'm not particularly happy with it. The issue is that it needs to pattern match on the right node twice: first in the `should_group_left` function, and then again in `combine_with_binary_operator` if `should_group_left` returned true.
<!-- excerpt-end -->
As a reminder, the code currently looks like this:
```rust
fn combine_with_binary_operator(left: Node, token: &Token, right: Node) -> Node {
let op: BinaryOp = match token {
// ...
};
if should_group_left(&op, &right) {
if let Node::BinaryOp {
left: right_left,
op: right_op,
right: right_right,
} {
Node::BinaryOp {
left: Box::new(Node::BinaryOp {
left: Box::new(left),
op,
right: right_left,
}),
op: right_op,
right: right_right,
}
} else {
panic!();
}
} else {
Node::BinaryOp {
left: Box::new(left),
op,
right: Box::new(right),
}
}
}
fn should_group_left(left_op: &BinaryOp, right: &Node) -> bool {
match right {
Node::BinaryOp { op: right_op, .. } => {
right_op.precedence() < left_op.precedence()
|| (right_op.precedence() == left_op.precedence()
&& left_op.associativity() == Associativity::Left)
}
_ => false,
}
}
```
See that `panic!()` in the else branch? The compiler thinks it (or some return value) is necessary there, because the pattern match could fail. But as the programmer, I know better. I know that if we're in the true branch of the outer if statement, then `should_group_left` returned true and the pattern match can never fail.
This is why I just call `panic!` without even a message: because I know that code is unreachable.
But it would be even better not to have it at all.
Basically, what I want the `should_group_left` function to do is pattern match on the right node, and if it meets the conditions for being left-grouped, to get the values inside the right binary operator node out without having to do another pattern match.
<aside class="inline">
Swift handles this rather nicely, because it allows you to combine multiple if... clauses? together with commas, including mixing and matching boolean conditions and pattern matching, requiring them all to succeed for the body of the if to be executed.
```swift
if should_group_left(op, right),
case let .binaryOp(rightLeft, rightOp, rightRight) = right {
// ...
}
```
Other languages with flow typing, like TypeScript or Kotlin, handle similar issues with things like custom [type guards](https://www.typescriptlang.org/docs/handbook/advanced-types.html#user-defined-type-guards), which inform the compiler "if this function returns true, the following type constraint holds".
</aside>
The best solution I was able to come up with was changing `should_group_left` to take ownership of the right node and return a `Result<(Box<Node>, BinaryOp, Box<Node>), Node>`[^1]. If it returns an Ok result, all the values are available. If it returns an "error", ownership of the right node is returned back to the caller.
[^1]: It doesn't really need to be a `Result` specifically, I just didn't bother writing my own enum just for this.
```rust
fn should_group_left(
left_op: &BinaryOp,
right: Node,
) -> Result<(Box<Node>, BinaryOp, Box<Node>), Node> {
match right {
Node::BinaryOp {
left: right_left,
op: right_op,
right: right_right,
} => {
let should_group = // ...
if should_group {
Ok((right_left, right_op, right_right))
} else {
Err(Node::BinaryOp {
left: right_left,
op: right_op,
right: right_right,
})
}
}
_ => Err(right),
}
}
```
Even this isn't ideal, because in the else branch in the first match arm, I still need to reconstruct the original `right` node, since it's been moved by the destructuring. I spent a while playing around with using references for various things in this function, but ultimately couldn't come up with anything better than this. If you have any ideas, let me know.
At any rate, at the call site in `combine_with_binary_operator`, this works pretty well:
```rust
fn combine_with_binary_operator(left: Node, token: &Token, right: Node) -> Node {
let op = match token {
// ...
};
match should_group_left(&op, right) {
Ok((right_left, right_op, right_right)) => Node::BinaryOp {
left: Box::new(Node::BinaryOp {
left: Box::new(left),
op,
right: right_left,
}),
op: right_op,
right: right_right,
},
Err(right) => Node::BinaryOp {
left: Box::new(left),
op,
right: Box::new(right),
},
}
}
```