Whitespace in HTML parsed incorrectly #27
Reference: shadowfacts/Tusker#27
Posts authored in Markdown or HTML have extra whitespace at the beginning of paragraphs.
Title changed from "Extra whitespace in Markdown/HTML formatted posts" to "Whitespace in HTML parsed incorrectly".
SwiftSoup, the HTML parsing library we currently use, parses whitespace between HTML elements incorrectly. For example, it will parse <p>a</p>\n<p>b</p> as [paragraph element, TextNode containing a space, paragraph element]. The TextNode containing the space is what shows up when rendering. This issue only manifests with rich text posts because non-rich text posts don't have their text wrapped in paragraph tags.
The newlines in the raw HTML should be parsed correctly and then collapsed per the CSS whitespace collapsing rules.
Potential solutions to this are:
1. Use the NSAttributedString HTML initializer
   This has the downside of requiring two HTML parses (first with something else to sanitize the HTML, second to convert it into an attributed string), which would be slower than ideal.
2. Fix SwiftSoup
   good luck
3. Switch to a different HTML parsing library
   HTMLReader seems like it could work, but most libraries seem like crap
4. Manually scan through the attributed string after it's generated and collapse whitespace per the CSS rules
   just kind of a pain in the ass
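For reference, the last option above (collapsing whitespace after the fact) might look roughly like the sketch below. This is a simplified illustration, not code from Tusker: `collapseWhitespace` is a hypothetical helper that works on a plain String for a single paragraph, whereas a real version would need to walk the attributed string and keep its attributes intact. It assumes `white-space: normal` behavior only, treating spaces, tabs, and line breaks as collapsible and ignoring preformatted content.

```swift
/// Hypothetical helper: a simplified take on CSS `white-space: normal`
/// collapsing for the text of a single paragraph.
/// - Runs of spaces, tabs, and line breaks become one space.
/// - Whitespace at the start and end of the paragraph is dropped.
/// Non-breaking spaces (U+00A0) are left alone, since CSS does not collapse them.
func collapseWhitespace(_ text: String) -> String {
    var result = ""
    var pendingSpace = false

    // Iterate over unicode scalars so that "\r\n" is not treated as one Character.
    for scalar in text.unicodeScalars {
        if scalar == " " || scalar == "\t" || scalar == "\n" || scalar == "\r" {
            // Remember a space only if some visible text has already been emitted;
            // this is what removes leading whitespace.
            pendingSpace = !result.isEmpty
        } else {
            if pendingSpace {
                result.append(" ")
            }
            pendingSpace = false
            result.unicodeScalars.append(scalar)
        }
    }
    // A trailing pending space is never emitted, which drops trailing whitespace.
    return result
}

let cleaned = collapseWhitespace(" \n b  c\t")
print(cleaned) // prints "b c"
```

Applied per paragraph, this would strip the stray leading space described in this issue, at the cost of a second pass over the rendered text.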