From 60948118da0f70a13a42c7085ea16857fc3f9d76 Mon Sep 17 00:00:00 2001
From: Shadowfacts
Date: Wed, 22 May 2024 19:24:59 -0400
Subject: [PATCH] Fix typo

---
 site/posts/2024-05-20-parsing-html-slower.md | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/site/posts/2024-05-20-parsing-html-slower.md b/site/posts/2024-05-20-parsing-html-slower.md
index 89a08cd..75b9e04 100644
--- a/site/posts/2024-05-20-parsing-html-slower.md
+++ b/site/posts/2024-05-20-parsing-html-slower.md
@@ -7,13 +7,11 @@ slug = "parsing-html-slower"
 
 [Last time](/2023/parsing-html-fast/), I wrote about how to parse HTML and convert it to `NSAttributedString`s quickly. Unfortunately, in the time since then, it's gotten slower. It's still a good deal faster than it was before all that work, mind you. At fault is not any of the optimizations I discussed last time, fortunately. Rather, to get the correct behavior across a whole slew of edge cases, there was more work that needed to be done.
 
-
-
 The root of all this complexity is the fact that I'm essentially trying to replicate a portion of the CSS layout algorithm using only the information provided by the HTML tokenization process (that is, the text that is emitted and the start/end tags) while flattening into a single string all the structure used to achieve those results.
 
-The previous version of this—which did correctly handle the initial test cases that I threw at it, but not what cropped up in the wild—worked by trying to keep track of when you had just finished one block element and then, before starting a new one, emitting like breaks to approximate the spacing between them that would otherwise be specified by CSS. Here are an assortment of issues that arise when using this strategy with real input:
+The previous version of this—which did correctly handle the initial test cases that I threw at it, but not what cropped up in the wild—worked by trying to keep track of when you had just finished one block element and then, before starting a new one, emitting line breaks to approximate the spacing between them that would otherwise be specified by CSS. Here are an assortment of issues that arise when using this strategy with real input:
 
 ### Blocks can start after a closing non-block element