When decompiled from binary, SVG.docList contains str (Unicode) values, decoded as UTF-8. lxml's fromstring accepts either bytes or str, but it rejects a str whose XML header declares an explicit encoding (since the header is lying). So we encode to bytes before calling fromstring, in case the SVG contains an explicit encoding declaration (UTF-8 is the only one allowed anyway). When serializing to XML with tostring, we similarly decode the result to str as UTF-8. This matches what SVG decompile gives us (str), and without it dumping to XML would fail, because XMLWriter.writecdata expects str, not bytes.
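The round trip described above can be sketched roughly as follows. This is a minimal illustration, not the actual subsetter code: the sample document `svg_doc` and the variable names are made up for the example, and it assumes lxml is installed.

```python
from lxml import etree

# Hypothetical SVG document, held as str the way a decompiled doc would be.
svg_doc = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<svg xmlns="http://www.w3.org/2000/svg"><g id="glyph1"/></svg>'
)

# Passing the str directly is rejected because of the encoding declaration.
try:
    etree.fromstring(svg_doc)
    rejected = False
except ValueError:
    rejected = True

# Encoding to UTF-8 bytes first makes the declaration truthful again.
root = etree.fromstring(svg_doc.encode("utf-8"))

# tostring returns bytes here; decode so downstream code receives str.
serialized = etree.tostring(root, encoding="utf-8").decode("utf-8")
```

Encoding unconditionally keeps the call site simple, since it also works for documents that have no XML declaration at all.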
With this I can finally follow xlink:href and url(#...) references
within the SVG doc and subset the elements accordingly, so that only
those reachable from the initial set of glyph elements are kept.
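The reachability idea can be illustrated with a small sketch. It is a simplification under stated assumptions, not the code from this change: elements are modelled as a plain dict mapping each id to its attributes, child elements are ignored, and the helper names `referenced_ids` and `reachable_ids` are hypothetical.

```python
import re
from collections import deque

# Matches url(#some-id), as used in attributes such as fill or clip-path.
URL_REF = re.compile(r"url\(\s*#([^)\s]+)\s*\)")


def referenced_ids(attributes):
    """Collect the ids that one element's attributes point at."""
    ids = set()
    for name, value in attributes.items():
        # Covers both "xlink:href" and a namespace-qualified "...}href".
        if name.endswith("href") and value.startswith("#"):
            ids.add(value[1:])
        ids.update(URL_REF.findall(value))
    return ids


def reachable_ids(elements, roots):
    """Breadth-first closure over references, starting from the root ids."""
    seen = set()
    queue = deque(r for r in roots if r in elements)
    while queue:
        current = queue.popleft()
        if current in seen:
            continue
        seen.add(current)
        for ref in referenced_ids(elements[current]):
            if ref in elements and ref not in seen:
                queue.append(ref)
    return seen


# Hypothetical document: glyph1 uses a gradient that inherits from another.
elements = {
    "glyph1": {"fill": "url(#grad1)"},
    "glyph2": {"xlink:href": "#shape"},
    "grad1": {"xlink:href": "#grad0"},
    "grad0": {},
    "shape": {},
    "unused": {},
}
kept = reachable_ids(elements, ["glyph1"])
```

With only glyph1 in the subset, the closure keeps glyph1, grad1 and grad0; glyph2, shape and unused are not reachable and could be dropped.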
Support for namespaces and XPath in the built-in ElementTree is insufficient, and supporting both lxml and ElementTree is too complicated. Let's simply require lxml to be able to subset SVG for now.
This drops SVG document records when they no longer intersect the subset. Records that still intersect the subset are kept in their entirety (for now); only the id='glyphXXX' attributes are renamed to point to the new glyph indices after subsetting. Unused, unreferenced elements are not pruned yet.
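A rough sketch of the drop-or-rename decision follows. The function name `remap_doc_glyph_ids` and the `old_to_new` mapping are hypothetical, and each document is reduced to a flat list of id strings. What happens to ids of glyphs that were not retained is not stated here, so the sketch leaves them untouched.

```python
import re

# Ids of the form "glyph123" identify glyph elements by glyph index.
GLYPH_ID = re.compile(r"^glyph(\d+)$")


def remap_doc_glyph_ids(doc_ids, old_to_new):
    """Return {old_id: new_id} for retained glyphs, or None to drop the doc.

    doc_ids: iterable of id strings found in one SVG document.
    old_to_new: dict mapping old glyph indices to new ones (assumed input).
    """
    renames = {}
    for id_ in doc_ids:
        match = GLYPH_ID.match(id_)
        if match is None:
            continue  # not a glyph id, e.g. a gradient or clip path
        old_index = int(match.group(1))
        if old_index in old_to_new:
            renames[id_] = "glyph%d" % old_to_new[old_index]
    # No retained glyph in this document: it no longer intersects the subset.
    return renames or None


old_to_new = {3: 1, 7: 2}
doc_a = remap_doc_glyph_ids(["glyph3", "glyph5", "grad1"], old_to_new)
doc_b = remap_doc_glyph_ids(["glyph9", "grad2"], old_to_new)
```

Here the first document is kept with glyph3 renamed to glyph1, while the second contains no retained glyph and would be dropped.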