This is based on bungeman's https://github.com/fonttools/fonttools/pull/2627
Previously, an entire `SVG ` table would be marked as compressed if any
of the decoded SVG documents in it were compressed. Then on encoding all
SVG documents would be considered for compression. The XML format had no
means to indicate if compression was desired.
Instead, mark each svgDoc with its compression status. When decoding
mark the svgDoc as compressed if the data was compressed. When encoding
try to compress the svgDoc if it is marked as compressed. In the XML
format the data itself is always uncompressed, but allow an optional
`compressed` boolean attribute (defaults to false) to indicate the
svgDoc should be compressed when encoded.
We also try to make sure that older code that relies on docList containing
sequences of three items (doc, startGID, endGID) will continue to work
without modification.
If a font has SinglePosFormat2 subtable with ValueFormat=0, then it contains a Value list with all None values.
Even though we don't build such inefficient tables, they do exist in the wild, as #2602 demostrates so we should handle those nicely, by downgrading them to Format1 with a single None value
if explicitly enabled, it will raise ImportError if uharfbuzz is not found, and will propagate the uharfbuzz error instead of silently falling back to the pure-python serializer
Two small changes that significantly speed up subsetting of large fonts
such as Noto Sans CJK:
1. When emptying a charstring, simply empty its program rather than
attempting to decompile it first. (Only relevant when retaining GIDs.)
2. When reindexing charstrings, swap an accidentally-quadratic
implementation for one that is linear in the number of retained glyphs.
* Add default auto doc options
* Ensure all references are unique
* Use anonymous links to avoid duplicate references
* Remove default options, fix wrong module name
* Don’t index repeated class
* Remove repeated classes included through automodule
* Fix warnings
* We don’t use our own static directory
* Correctly format XML in docs
* Fix indentation
* Fix overline
* Bring TOC to top
* Fix definition list
* Offset definition lists and examples
* Fix erroneous markup
* Fix markup
* Already included in automodule
* Fix args markup
* Correct markup for example
* Don’t reindex repeated module
* Correct XML code block markup
* Fix markup errors, change example to doctest
* Correct list markup
* Make ttx docstring both valid RST and valid help output
* Various other boring markup fixes
* Fix example indenting
* Make docstring valid RST and valid help output
* Mock import for reportlab
* It’s ok if manual links don’t appear in toctrees
* Oops typo, I guess doctests are useful
when decompiled from binary, the SVG.docList contains (unicode) strings, decoded as UTF-8. lxml fromstring accepts either bytes or str, but when given str with the xml header declaring an explicit encoding, it rejects them (since the header is lying). So we encode to bytes before calling fromstring in case the SVG contains an explicit encoding (UTF-8 is the only one allowed anyway). When serializing to XML with tostring, we similarly decode to str as UTF-8. Not only to match SVG decompile (which gives us str), but if we didn't do that, then attempting to dump to XML would fail, because XMLWriter.writecdata expects str, not bytes.
With this I can finally follow xlink:href and url(#...) sort of
references within the SVG doc and subset the elements accordingly so
that only those that are reachable from the initial set of glyph
elements are kept.
support for namespaces and xpath is insufficient in built-in ElementTree; supporting both lxml and ElementTree is too complicated, let's simply require lxml to be able to subset SVG for now
this drops svg document records when they no longer intersect the subset. It keeps them in their entirety (for now) when they still intersect the subset, only renaming all the id='glyphXXX' to point to the new glyph indices after subsetting. Unused, unreferenced elements are not pruned yet.
relying on ClipList.compile to drop unused clips based on updated glyphOrder won't work when font is loaded lazily (default for subsetter), because ClipList gets decompiled too late (after glyphOrder has already been modified) and this produces warnings about missing glyphIDs.
Better to make the subsetter explicilty prune unused clips.
This is a breaking change (but the COLRv1 API was already marked as unstable and subject to change)
The changes in this PR are meant to match the changes from the COLRv1 draft spec at:
https://github.com/googlefonts/colr-gradients-spec/pull/302