The method used to read the header from the input was inadvertently
dropping the first non-header line. This happens to be harmless in
some cases (where there is an empty line after the header), but with
newer versions of the ScriptExtensions.txt file it caused the
generated code to be missing the first entry, for U+00B7 (MIDDLE DOT).
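
One way to avoid losing the first data line is to rewind the stream
when the header ends. The sketch below is illustrative only: the
function name `read_header`, the assumption that the header is the
leading run of `#` lines, and the sample input are not taken from
buildUCD.py.

```python
import io


def read_header(infile):
    """Return the leading '#' comment block as a string.

    Hypothetical helper: the stream is left positioned at the first
    non-header line, so the caller can still read that line.
    """
    header = []
    while True:
        pos = infile.tell()          # remember where this line starts
        line = infile.readline()
        if not line.startswith("#"):
            infile.seek(pos)         # un-read the non-header line (or EOF)
            break
        header.append(line[1:].strip())
    return "\n".join(header)


# Made-up sample input with no blank line between header and data.
sample = io.StringIO("# Title line\n# Second line\n00B7 ; first entry\n")
header = read_header(sample)
first_data_line = sample.readline()  # still available after the header
```

Using readline() together with tell() and seek() matters here: a text
file object disables tell() while it is being iterated with next().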
Removed fontTools imports to prevent bootstrapping issues for
downstream package maintainers that wish to run buildUCD.py at
build time (i.e. when fontTools is not installed yet).
To use the standard library's bisect module we need two separate
tables: one with the ranges themselves (which we pass to bisect
to get an index) and another containing the script name for
each range.
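
A minimal sketch of the two-table lookup follows. The table names
`RANGES` and `NAMES`, the function `script_for`, the "Unknown"
fallback, and the sample data are assumptions made for illustration;
the ranges are a tiny hand-picked excerpt, not the generated tables.

```python
from bisect import bisect_right

# Parallel tables: RANGES[i] is an inclusive (start, end) pair and
# NAMES[i] is the script name for that range. RANGES must be sorted.
RANGES = [
    (0x0041, 0x005A),
    (0x0061, 0x007A),
    (0x0391, 0x03A1),
]
NAMES = [
    "Latin",
    "Latin",
    "Greek",
]

# Larger than any valid end value, so (cp, _SENTINEL) sorts after every
# range whose start equals cp.
_SENTINEL = 0x110000


def script_for(codepoint, default="Unknown"):
    """Return the script name for a code point, or `default` if no range matches."""
    # Index of the last range whose start is <= codepoint.
    i = bisect_right(RANGES, (codepoint, _SENTINEL)) - 1
    if i >= 0:
        start, end = RANGES[i]
        if start <= codepoint <= end:
            return NAMES[i]
    return default
```

bisect compares only the elements of the list it is given, which is
why the script names cannot live in the same tuples as the ranges
without affecting the search key.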
Also, allow the buildUCD.py script to load data files from a
local directory, e.g. to allow downstream maintainers to rebuild
the generated modules from local files instead of downloading
them from the Unicode website.
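
The local-or-remote choice could look roughly like the sketch below.
The function name `read_unidata_file`, the `local_dir` parameter, and
the `UNIDATA_URL` value are assumptions for illustration, not the
script's actual interface.

```python
import os
from urllib.request import urlopen

# Assumed base URL for the latest Unicode Character Database files.
UNIDATA_URL = "https://unicode.org/Public/UNIDATA/"


def read_unidata_file(filename, local_dir=None):
    """Return the lines of a UCD data file.

    Hypothetical helper: reads from `local_dir` when given, otherwise
    downloads the file from UNIDATA_URL.
    """
    if local_dir is not None:
        path = os.path.join(local_dir, filename)
        with open(path, encoding="utf-8") as f:
            return f.read().splitlines()
    with urlopen(UNIDATA_URL + filename) as response:
        return response.read().decode("utf-8").splitlines()
```

With a local directory supplied, no network access is attempted, which
is the property a build-time packaging environment needs.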
The script currently only parses the Scripts.txt file and
generates a new Python module `fontTools.unicodedata.scripts`
containing a `SCRIPT_RANGES` list of tuples, each containing
the range and the corresponding script name.
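
As a rough illustration of the parsing step, the sketch below turns
Scripts.txt-style lines into such a list. The function name
`parse_script_ranges`, the regular expression, and the flat
(start, end, name) tuple layout are assumptions; the real script may
structure its tuples differently and may merge adjacent ranges.

```python
import re

# Matches "XXXX ; Name" or "XXXX..YYYY ; Name"; a trailing "# comment" is ignored.
_LINE_RE = re.compile(
    r"^([0-9A-F]{4,6})(?:\.\.([0-9A-F]{4,6}))?\s*;\s*(\w+)"
)


def parse_script_ranges(lines):
    """Return a sorted list of (start, end, script_name) tuples."""
    ranges = []
    for line in lines:
        match = _LINE_RE.match(line)
        if match is None:
            continue  # blank line or comment
        first, last, script = match.groups()
        start = int(first, 16)
        end = int(last, 16) if last else start  # single code point
        ranges.append((start, end, script))
    ranges.sort()
    return ranges


SCRIPT_RANGES = parse_script_ranges([
    "# comment lines and blank lines are skipped",
    "",
    "0041..005A    ; Latin # L&  [26] LATIN CAPITAL LETTER A..LATIN CAPITAL LETTER Z",
    "00AA          ; Latin # Lo       FEMININE ORDINAL INDICATOR",
])
```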