

title = "Rewritten in Rust"
tags = ["meta"]
date = "2023-01-05 14:30:42 -0500"
short_desc = "If you're reading this message, I'm being held hostage by the Rust Evangelism Strike Force at—"
slug = "rewritten-in-rust"

So, about six months ago I decided I wanted to rewrite my perfectly-working blog backend in Rust. Why? Because I was bored and wanted an excuse to use Rust more.

The fundamental architecture of my site is unchanged from the last rewrite. All of the HTML pages are generated up front and written to disk. The HTTP server can then handle any ActivityPub-specific requests and fall back to serving files straight from disk.
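The "ActivityPub first, files second" decision can be sketched roughly as below. This is an illustration only, not the site's actual code: the `Route` type, the `route` function, the `/inbox` path check, and the `index.html` convention are all assumptions made for the example.

```rust
/// Where a request should be handled. Hypothetical type, for illustration only.
#[derive(Debug, PartialEq)]
enum Route {
    /// Hand the request to the ActivityPub handlers.
    ActivityPub,
    /// Serve the pre-generated file at this path, relative to the output directory.
    StaticFile(String),
}

/// Media types that ActivityPub clients send in the `Accept` header.
const AP_TYPES: [&str; 2] = ["application/activity+json", "application/ld+json"];

fn route(method: &str, path: &str, accept: &str) -> Route {
    let wants_ap = AP_TYPES.iter().any(|t| accept.contains(t));
    // Assumption: deliveries from other servers are POSTed to a path ending in `/inbox`.
    let is_inbox = method == "POST" && path.ends_with("/inbox");
    if wants_ap || is_inbox {
        return Route::ActivityPub;
    }
    // Everything else falls back to the files the generator wrote to disk.
    // (A real server would also have to reject `..` path segments.)
    let trimmed = path.trim_start_matches('/');
    if trimmed.is_empty() || trimmed.ends_with('/') {
        Route::StaticFile(format!("{trimmed}index.html"))
    } else {
        Route::StaticFile(trimmed.to_string())
    }
}
```

With this sketch, a browser asking for `/` with `Accept: text/html` gets `index.html` from disk, while the same URL requested with `Accept: application/activity+json` is handed to the ActivityPub code.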

i look forward to finishing this rewrite and then being able to sit back and enjoy... *checks notes* the exact same website i had before

Because this project was undertaken with the deliberate goal of using Rust more, I let myself spend more time bikeshedding and working on pieces that I ordinarily would have ignored or left to third-party libraries. One of those was spending probably too much time writing a bunch of code to slugify post titles (that is, turning titles with lots of punctuation and things into a nice, URL-safe format). It handles a bunch of pet peeves I have when I look at URLs on other websites (e.g., non-ASCII characters getting blindly replaced, resulting in long sequences of hyphens), even if those are highly unlikely to ever arise here.
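For a sense of what slugifying involves, here is a minimal sketch. It is not the site's actual implementation, and the rules it applies (keep ASCII letters and digits, drop apostrophes, collapse every other run of characters into one hyphen) are assumptions chosen to avoid the long-hyphen-run problem described above.

```rust
/// Turns a post title into a lowercase, hyphen-separated, URL-safe slug.
/// Hypothetical helper; the rules here are assumptions, not the blog's real ones.
fn slugify(title: &str) -> String {
    let mut slug = String::with_capacity(title.len());
    // Set when we have skipped one or more separator characters since the last
    // kept character, so a whole run of them yields at most one hyphen.
    let mut pending_hyphen = false;
    for c in title.chars() {
        if c.is_ascii_alphanumeric() {
            // Only emit the hyphen between two kept characters, never at the start.
            if pending_hyphen && !slug.is_empty() {
                slug.push('-');
            }
            pending_hyphen = false;
            slug.push(c.to_ascii_lowercase());
        } else if c == '\'' || c == '\u{2019}' {
            // Drop apostrophes so "don't" becomes "dont" rather than "don-t".
        } else {
            // Punctuation, whitespace, and non-ASCII characters all act as separators.
            pending_hyphen = true;
        }
    }
    // A trailing separator run is never emitted, so the slug cannot end in a hyphen.
    slug
}
```

Under these rules, `slugify("Rewritten in Rust")` gives `rewritten-in-rust`, and a title that starts with a run of non-ASCII characters produces no leading hyphens.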

Another component I spent a great deal of time working on was the Markdown to HTML rendering. I'm using the pulldown-cmark crate, which handles a great deal for me, but not quite everything. In the previous implementation of my blog, I was using the markdown-it package, which has some other little niceties to make the generated HTML better, in addition to the custom plugins I was using for the Markdown decorations you see if you're reading this on my blog itself. I have custom code for handling the link and heading decorations, as before. I also override how footnote definitions are generated, to make them all appear at the end of the HTML rather than appearing in the same locations they're defined in the Markdown. As part of the changes to footnote definitions, I also generate backlinks which go from the footnote back to where it was referenced, to make reading them a bit easier.
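The footnote reordering can be pictured as a pass over the stream of parser events. The sketch below uses a tiny stand-in `Event` enum rather than pulldown-cmark's real types, and the `fnref-` anchor scheme in `backlink_html` is likewise an assumption, so treat it as an outline of the idea rather than how the site does it.

```rust
/// A tiny stand-in for a Markdown event stream. These variants are hypothetical
/// and much simpler than pulldown-cmark's real `Event` type.
#[derive(Debug, Clone, PartialEq)]
enum Event {
    Text(String),
    FootnoteRef(String),
    FootnoteDefStart(String),
    FootnoteDefEnd,
}

/// Moves every footnote definition (start, contents, end) to the end of the
/// stream, keeping the definitions in their original relative order.
fn hoist_footnotes(events: Vec<Event>) -> Vec<Event> {
    let mut body = Vec::new();
    let mut footnotes = Vec::new();
    let mut in_definition = false;
    for event in events {
        match event {
            Event::FootnoteDefStart(_) => {
                in_definition = true;
                footnotes.push(event);
            }
            Event::FootnoteDefEnd => {
                in_definition = false;
                footnotes.push(event);
            }
            // Anything between a start and an end belongs to the definition.
            _ if in_definition => footnotes.push(event),
            _ => body.push(event),
        }
    }
    body.extend(footnotes);
    body
}

/// Renders the link appended to a definition that jumps back to the reference.
/// Assumes the reference was rendered with the id `fnref-<label>`.
fn backlink_html(label: &str) -> String {
    format!("<a href=\"#fnref-{label}\" class=\"footnote-backref\">&#8617;</a>")
}
```

Buffering definitions separately keeps the body in document order, at the cost of holding all events in memory, which is fine for something the size of a blog post.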

Syntax Highlighting

The previous, Node.js implementation of my blog used highlight.js for syntax highlighting. This works decently well, but it doesn't produce the most accurate highlighting since it's essentially a big pile of regexes rather than parsing the syntax. Now, I'm using Tree Sitter which actually parses the language and so highlighting ends up being more accurate.

Unfortunately, software being what it is, this is not always the case. The syntax highlighting for Swift with the best available Tree Sitter grammar is substantially worse than the highlight.js results. It routinely misses keywords, misinterprets variables as functions, and, to top all that off, something about how the highlight query is structured is incredibly slow for Tree Sitter to load. It more than doubles the time it takes to generate my blog, from about 0.5 seconds when skipping Swift highlighting to 1.3 seconds when using tree-sitter-swift.

So, because I've never met a problem I couldn't yak-shave, I decided the solution was to use John Sundell's Splash library for highlighting Swift snippets. A number of Swift blogs I follow use it, and it seems to produce very good results. But, of course, it's written in Swift, so I need some way of accessing it from Rust. This was a little bit of an ordeal, and it ended up being very easy, then very difficult, then easy, then confusing, and finally not too bad. The details of how exactly everything works and what I went through are a subject for another time, but if you want to see how it works, the source code for the Rust/Swift bridge is available here.

ActivityPub

One of the big features from the last time I rewrote my blog was the ActivityPub integration. Almost nobody uses it, but I think it's a cool feature and so I wanted to keep it. I'm using the activitystreams crate, and what I've learned is that statically typed languages are maybe not so great for writing AP implementations. Lots of properties can be multiple different types, so you can't directly read anything; you have to check that it is in fact the type you expect. This does make for a more correct implementation, but it ends up being a big pain in the ass.
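To illustrate the "multiple different types" problem: a single ActivityStreams property may arrive as a bare IRI, an embedded object, or an array of either. The sketch below models that with a hypothetical `Value` enum (it is not the activitystreams crate's API) and shows the kind of checking every read ends up needing.

```rust
/// A hypothetical, simplified model of one ActivityStreams property value.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    /// Just a reference, e.g. "https://example.com/users/alice".
    Iri(String),
    /// An embedded object, which may or may not carry its own id.
    Object { id: Option<String> },
    /// Several values, each of which can again be any of the shapes.
    Many(Vec<Value>),
}

/// Returns the id of the first usable entry, whichever shape it arrived in.
fn first_id(value: &Value) -> Option<&str> {
    match value {
        Value::Iri(iri) => Some(iri.as_str()),
        Value::Object { id } => id.as_deref(),
        // Skip entries without an id and keep looking through the rest.
        Value::Many(items) => items.iter().find_map(|item| first_id(item)),
    }
}
```

The compiler forces every shape to be handled, which is where the extra correctness comes from, and also where the tedium comes from.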

ActivityPub support was undoubtedly the part of this project that took the most time. Just about all of the static generator was complete within a couple weeks. The AP support dragged on over the next 6 months because it was so unpleasant (both for the aforementioned reasons, and just that dealing with all the quirks of different AP implementations is a pain).

But, now that it's all done, I'm pretty happy with where it is. The ActivityPub support is mostly unchanged. Blog posts are still AP Article objects that you can interact with and comment on, and you can still follow the blog AP actor from Mastodon/etc. to get new posts to show up in your home feed. I did make a couple small quality-of-life changes:

  1. If you reply to a blog post with a non-public post, you'll get an automated reply back telling you that's not supported. The purpose of replying to a blog post is to make comments show up, and I don't want to display things that people didn't intend to be public. If you want to send a private comment, you can message me directly, rather than the blog itself.
  2. When there are new comments, the blog will automatically send me a notification (via AP, of course) with a digest of new posts. Previously, I had to manually check if there were any comments (if you ever commented and I never noticed it, sorry).
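The public/non-public check behind the first change could look something like the sketch below. In ActivityPub, a post is public when it is addressed to the special `https://www.w3.org/ns/activitystreams#Public` collection. The `ReplyAction` type and `handle_reply` function are hypothetical, and treating a post with the public collection only in `cc` as public is an assumption, not necessarily what the blog does.

```rust
/// The special collection that marks an object as public in ActivityPub.
const PUBLIC: &str = "https://www.w3.org/ns/activitystreams#Public";

fn is_public_addr(addr: &str) -> bool {
    // Some implementations send the compacted forms instead of the full IRI.
    addr == PUBLIC || addr == "as:Public" || addr == "Public"
}

/// What to do with an incoming reply. Hypothetical type, for illustration only.
#[derive(Debug, PartialEq)]
enum ReplyAction {
    /// Store the reply and display it as a comment on the post.
    ShowAsComment,
    /// Don't display it; send the automated "not supported" reply instead.
    SendNotSupportedNotice,
}

/// Decides based on the reply's `to` and `cc` addressing.
/// Assumption: the public collection in either field counts as public.
fn handle_reply(to: &[&str], cc: &[&str]) -> ReplyAction {
    if to.iter().chain(cc.iter()).any(|a| is_public_addr(a)) {
        ReplyAction::ShowAsComment
    } else {
        ReplyAction::SendNotSupportedNotice
    }
}
```

A reply addressed only to the blog's actor (or to the sender's followers) falls through to the notice, so nothing that wasn't meant to be public gets displayed.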

Misc

The only other notable change is the addition of the TV section, which is an archive of the various long-running commentary threads I've written on Mastodon.