shadowfacts.net/site/posts/2021-04-19-cleaning-up-bina...

5.0 KiB

metadata.title = "Part 7: Cleaning Up Binary Operators"
metadata.tags = ["build a programming language", "rust"]
metadata.date = "2021-04-19 17:00:42 -0400"
metadata.shortDesc = "A minor fight with the Rust borrow checker."
metadata.slug = "cleaning-up-binary-operators"
metadata.preamble = `<p style="font-style: italic;">This post is part of a <a href="/build-a-programming-language/" data-link="/build-a-programming-language/">series</a> about learning Rust and building a small programming language.</p><hr>`

The code from part 4 that checks whether a pair of binary operators should be grouped to the left or right works, but I'm not particularly happy with it. The issue is that it needs to pattern match on the right node twice: first in the should_group_left function, and then again in combine_with_binary_operator if should_group_left returned true.

As a reminder, the code currently looks like this:

fn combine_with_binary_operator(left: Node, token: &Token, right: Node) -> Node {
	let op: BinaryOp = match token {
		// ...
	};

	if should_group_left(&op, &right) {
		if let Node::BinaryOp {
			left: right_left,
			op: right_op,
			right: right_right,
		} {
			Node::BinaryOp {
				left: Box::new(Node::BinaryOp {
					left: Box::new(left),
					op,
					right: right_left,
				}),
				op: right_op,
				right: right_right,
			}
		} else {
			panic!();
		}
	} else {
		Node::BinaryOp {
			left: Box::new(left),
			op,
			right: Box::new(right),
		}
	}
}

fn should_group_left(left_op: &BinaryOp, right: &Node) -> bool {
	match right {
		Node::BinaryOp { op: right_op, .. } => {
			right_op.precedence() < left_op.precedence()
				|| (right_op.precedence() == left_op.precedence()
					&& left_op.associativity() == Associativity::Left)
		}
		_ => false,
	}
}

See that panic!() in the else branch? The compiler thinks it (or some return value) is necessary there, because the pattern match could fail. But as the programmer, I know better. I know that if we're in the true branch of the outer if statement, then should_group_left returned true and the pattern match can never fail.

This is why I just call panic! without even a message: because I know that code is unreachable.

But it would be even better not to have it at all.

Basically, what I want the should_group_left function to do is pattern match on the right node, and if it meets the conditions for being left-grouped, to get the values inside the right binary operator node out without having to do another pattern match.

The best solution I was able to come up with was changing should_group_left to take ownership of the right node and return a Result<(Box<Node>, BinaryOp, Box<Node>), Node>1. If it returns an Ok result, all the values are available. If it returns an "error", ownership of the right node is returned back to the caller.

fn should_group_left(
	left_op: &BinaryOp,
	right: Node,
) -> Result<(Box<Node>, BinaryOp, Box<Node>), Node> {
	match right {
		Node::BinaryOp {
			left: right_left,
			op: right_op,
			right: right_right,
		} => {
			let should_group = // ...
			if should_group {
				Ok((right_left, right_op, right_right))
			} else {
				Err(Node::BinaryOp {
					left: right_left,
					op: right_op,
					right: right_right,
				})
			}
		}
		_ => Err(right),
	}
}

Even this isn't ideal, because in the else branch in the first match arm, I still need to reconstruct the original right node, since it's been moved by the destructuring. I spent a while playing around with using references for various things in this function, but ultimately couldn't come up with anything better than this. If you have any ideas, let me know.

At any rate, at the call site in combine_with_binary_operator, this works pretty well:

fn combine_with_binary_operator(left: Node, token: &Token, right: Node) -> Node {
	let op = match token {
		// ...
	};

	match should_group_left(&op, right) {
		Ok((right_left, right_op, right_right)) => Node::BinaryOp {
			left: Box::new(Node::BinaryOp {
				left: Box::new(left),
				op,
				right: right_left,
			}),
			op: right_op,
			right: right_right,
		},
		Err(right) => Node::BinaryOp {
			left: Box::new(left),
			op,
			right: Box::new(right),
		},
	}
}

  1. It doesn't really need to be a Result specifically, I just didn't bother writing my own enum just for this. ↩︎