26 Commits

Author SHA1 Message Date
Khaled Hosny
232b2ccbc4 Move the rest of py23 module to textTools
Change all imports to use textTools module, except the test_py23.py test
which is kept until we decide to remove the module (if ever).
2021-08-20 01:29:45 +02:00
Just van Rossum
5fc65d7168
Misc py23 cleanups (#2243)
* Replaced all from ...py23 import * with explicit name imports, or removed completely when possible.
* Replaced tounicode() with tostr()
* Changed all BytesIO and StringIO imports to from io import ..., replaced all UnicodeIO with StringIO.
* Replaced all unichr() with chr()
* Misc minor tweaks and fixes
2021-03-29 11:45:58 +02:00
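The replacements listed in this commit map onto plain Python 3 built-ins and the standard `io` module. A minimal sketch of the post-cleanup style, using only the standard library (the variable names are illustrative and not taken from the fontTools source):

```python
from io import BytesIO, StringIO  # instead of importing them from py23

# StringIO takes the place of the old UnicodeIO alias for text buffers.
text_buf = StringIO()
# chr() takes the place of unichr(); on Python 3 it accepts any code point.
text_buf.write(chr(0x0627))  # ARABIC LETTER ALEF

# BytesIO holds the encoded form of the same text.
data_buf = BytesIO(text_buf.getvalue().encode("utf-8"))
```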
David Corbett
ac44a7f61d unicodedata: Update RTL_SCRIPTS for Unicode 13.0 2020-05-06 17:16:37 -04:00
justvanrossum
98efa31ce6 fixed doctest 2020-03-20 14:38:59 +01:00
justvanrossum
b14e6ecf4c update to unicode 13.0 2020-03-20 14:15:00 +01:00
Nikolaus Waxweiler
d381609885 Update RTL_SCRIPTS for Unicode 11 and 12
Information taken from https://docs.google.com/spreadsheets/d/1Y90M0Ie3MUJ6UVCRDOypOtijlMDLNNyyLk36T6iMu0o.
2020-01-09 14:21:49 +00:00
Nikolaus Waxweiler
01328213c7 Remove __future__ imports 2019-08-09 12:20:13 +01:00
Cosimo Lupo
987798f82e
Update Blocks, Scripts and ScriptExtensions to latest Unicode Data 12.1 2019-06-18 11:33:53 +01:00
Cosimo Lupo
a526b7170c
tests: fix expected results after Unicode 11 update
fixes https://github.com/fonttools/fonttools/issues/1291
2018-07-12 11:35:16 +01:00
Cosimo Lupo
452c85ecef
Update Blocks, Scripts and ScriptExtensions for Unicode 11
I ran: python3 MetaTools/buildUCD.py
2018-07-12 11:35:16 +01:00
Cosimo Lupo
677954d5b9
unicodedata: add ot_tag_to_script function
returns the Unicode script code for a given OpenType script tag, or None if no match is found
2018-01-23 11:45:20 -08:00
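A rough sketch of the reverse lookup described in this commit: map an OpenType script tag back to a Unicode script code, or return None when nothing matches. The three tables below are small illustrative subsets and their names are assumptions; the real module derives its data from the full script list.

```python
# Illustrative subsets only; names and contents are assumptions.
SCRIPT_EXCEPTIONS_REVERSED = {"kana": "Hira", "lao ": "Laoo", "yi  ": "Yiii"}
NEW_SCRIPT_TAGS_REVERSED = {"dev2": "Deva", "bng2": "Beng"}
KNOWN_SCRIPT_CODES = {"Latn", "Arab", "Deva", "Beng", "Grek"}


def ot_tag_to_script(tag):
    """Return the script code for an OpenType script tag, or None."""
    if tag in SCRIPT_EXCEPTIONS_REVERSED:
        return SCRIPT_EXCEPTIONS_REVERSED[tag]
    if tag in NEW_SCRIPT_TAGS_REVERSED:
        return NEW_SCRIPT_TAGS_REVERSED[tag]
    # Most tags are the script code with a lowercase first letter.
    candidate = tag[:1].upper() + tag[1:]
    return candidate if candidate in KNOWN_SCRIPT_CODES else None
```

With these toy tables, "latn" resolves to "Latn", "dev2" to "Deva", and an unrecognised tag such as "zzzz" gives None.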
Cosimo Lupo
91a8cc33e7
unicodedata: add script_horizontal_direction function
same as harfbuzz hb_script_get_horizontal_direction.

We just hard-code the set of RTL scripts here, as it doesn't change often anyway.
The function is just syntactic sugar, as all it does is basically look up the
constant RTL_SCRIPTS set.
It's nice to have it here in a central place alongside 'script', 'script_name', etc.
2018-01-19 18:04:33 +00:00
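The lookup this commit describes can be sketched in a few lines. The set below is a small illustrative subset of right-to-left scripts, and returning the strings "RTL" and "LTR" is an assumption made for this sketch.

```python
# Illustrative subset; the hard-coded set in the module is longer.
RTL_SCRIPTS = {"Arab", "Hebr", "Syrc", "Thaa"}


def script_horizontal_direction(script_code):
    """Return "RTL" for right-to-left scripts and "LTR" otherwise."""
    return "RTL" if script_code in RTL_SCRIPTS else "LTR"
```

Note that this sketch does not validate its argument, so an unknown code is reported as "LTR".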
Cosimo Lupo
5e0bad94c5
export new ot_tags_from_script func in __all__ list [skip ci] 2018-01-18 20:26:44 +00:00
Cosimo Lupo
c9259c4723
unicodedata: add ot_tags_from_script function
Fixes https://github.com/fonttools/fonttools/issues/1112

This implements the same logic found in harfbuzz hb-ot-tag.cc to
convert from Unicode (or ISO 15924) script codes to OpenType script
tags as defined at:
https://www.microsoft.com/typography/otspec/scripttags.htm

461a605fde/src/hb-ot-tag.cc (L127)
2018-01-18 20:20:17 +00:00
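As a simplified sketch of the conversion described above: a few scripts have irregular OpenType tags, some Indic scripts have an additional "version 2" tag that is listed first, and everything else uses the script code with its first letter lowercased. The tables are illustrative subsets with assumed names, and the handling of unknown script codes is left out.

```python
# Illustrative subsets only; table names are assumptions.
SCRIPT_EXCEPTIONS = {"Hira": "kana", "Laoo": "lao ", "Yiii": "yi  "}
NEW_SCRIPT_TAGS = {"Deva": ("dev2",), "Beng": ("bng2",)}


def ot_tags_from_script(script_code):
    """Return a list of OpenType script tags, most specific first."""
    if script_code in SCRIPT_EXCEPTIONS:
        return [SCRIPT_EXCEPTIONS[script_code]]
    # Newer tags, when they exist, come before the old-style tag.
    tags = list(NEW_SCRIPT_TAGS.get(script_code, ()))
    # Default rule: lowercase the first letter of the script code.
    tags.append(script_code[0].lower() + script_code[1:])
    return tags
```

For example, "Latn" gives a single tag, while "Deva" gives the newer tag followed by the old one.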
Cosimo Lupo
1765ed772a [unicodedata] add script_name and script_code to __all__
and cast to str to avoid error with import * in python2.7

TypeError: Item in ``from list'' must be str, not unicode
2017-11-22 18:37:14 +01:00
Cosimo Lupo
99ea0a3986 [unicodedata] add script_code func and 'default' fallback arg
`script_code` does the reverse of `script_name`: it takes a long
script name and returns a 4-letter script code.

Both `script_name` and `script_code` raise KeyError by default,
but can optionally return a default value instead.
2017-11-22 17:46:44 +01:00
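The "raise KeyError unless a default is given" behaviour can be sketched with a sentinel object. The dictionaries below are tiny illustrative excerpts, and the `_MISSING` sentinel is an assumption of this sketch, not necessarily how the module implements the fallback.

```python
_MISSING = object()  # hypothetical sentinel meaning "no default supplied"

# Illustrative excerpt of the code-to-name mapping and its inverse.
NAMES = {"Latn": "Latin", "Arab": "Arabic", "Grek": "Greek"}
CODES = {name: code for code, name in NAMES.items()}


def _lookup(table, key, default):
    try:
        return table[key]
    except KeyError:
        if default is _MISSING:
            raise  # no fallback requested: propagate the KeyError
        return default


def script_name(code, default=_MISSING):
    """Four-letter script code -> long script name."""
    return _lookup(NAMES, code, default)


def script_code(name, default=_MISSING):
    """Long script name -> four-letter script code."""
    return _lookup(CODES, name, default)
```

A sentinel is used rather than `default=None` so that callers can still ask for None as their fallback value.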
Cosimo Lupo
afd2490a6c [unicodedata] add script_name function
Converts four-letter script codes to human-readable long names
2017-11-22 17:41:23 +01:00
Cosimo Lupo
012688ac20 [Tests] adjust unicodedata_test to expect short script codes 2017-11-22 17:41:23 +01:00
Cosimo Lupo
54fa00499e [Scripts] use short codes, add NAMES dict with aliases 2017-11-22 17:41:23 +01:00
Cosimo Lupo
697b8d9af5 [unicodedata] add block and script_extension functions 2017-11-20 18:16:02 +01:00
Cosimo Lupo
8b50ed56d9 add auto-generated Blocks.py and ScriptExtensions.py 2017-11-20 18:15:09 +01:00
Cosimo Lupo
1ed78b12f5 [unicodedata] rename scripts.py to Scripts.py
let's use the same names as the original UCD data files for simplicity
2017-11-20 17:37:45 +01:00
Cosimo Lupo
b53b878bdc [scripts] update auto-generated module
it now contains two lists, one for the ranges and another for the script names
2017-11-20 13:38:49 +01:00
Cosimo Lupo
3442da1529 [unicodedata] use bisect.bisect_right function
CPython comes with a fast C implementation of bisect module.
This gives a 4 to 5 times speed-up over my pure-python version.
2017-11-20 13:30:17 +01:00
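A minimal sketch of a range lookup built on `bisect.bisect_right`, using two parallel lists as the generated module is described to do. The tables here are a tiny hand-written excerpt keyed by range start only, which is an assumption of this sketch; the layout of the real generated data may differ.

```python
from bisect import bisect_right

# Toy excerpt: each range runs from its start up to the next start.
# The last range is open-ended here purely for brevity.
RANGE_STARTS = [0x0000, 0x0041, 0x005B, 0x0061, 0x007B]
SCRIPT_NAMES = ["Common", "Latin", "Common", "Latin", "Common"]


def script(char):
    """Return the script of a single character from the toy tables."""
    # bisect_right gives the index of the first start greater than the
    # code point, so the range containing it is the one just before.
    i = bisect_right(RANGE_STARTS, ord(char))
    return SCRIPT_NAMES[i - 1]
```

Because the list of starts is sorted, each lookup costs O(log n) comparisons, and on CPython the search itself runs in the C implementation of `bisect`.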
Cosimo Lupo
52d6131525 [unicodedata] add new module and 'script' function
The new `fontTools.unicodedata` module re-exports all the public
functions from the built-in `unicodedata` module, and also adds
additional functions.

The `script` function takes a Unicode character and returns the
script name as defined in the UCD "Scripts.txt" data file.

It's implemented as a simple binary search, plus a memoizing
decorator that caches the results to avoid searching for the same
character more than once.

The unicodedata2 backport is imported if present, otherwise
the unicodedata built-in is used.
2017-11-17 19:17:17 +00:00
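The approach described in this commit can be sketched as follows: an optional import of the `unicodedata2` backport, a hand-written binary search over sorted ranges, and a cache in front of it. `functools.lru_cache` stands in for the commit's own memoizing decorator, and the range table is a tiny illustrative excerpt, so both are assumptions of this sketch.

```python
from functools import lru_cache

try:
    import unicodedata2 as unicodedata  # backport, used if installed
except ImportError:
    import unicodedata  # built-in fallback

# Toy excerpt of (start, end, script) ranges, sorted by start.
RANGES = [
    (0x0000, 0x0040, "Common"),
    (0x0041, 0x005A, "Latin"),
    (0x005B, 0x0060, "Common"),
    (0x0061, 0x007A, "Latin"),
    (0x007B, 0x00A9, "Common"),
    (0x0391, 0x03A1, "Greek"),
]


def _binary_search(code):
    """Return the script whose range contains code, else "Unknown"."""
    lo, hi = 0, len(RANGES) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        start, end, name = RANGES[mid]
        if code < start:
            hi = mid - 1
        elif code > end:
            lo = mid + 1
        else:
            return name
    return "Unknown"


@lru_cache(maxsize=None)  # stand-in for the memoizing decorator
def script(char):
    """Look up a character's script, caching results per character."""
    return _binary_search(ord(char))
```

Repeated calls with the same character are answered from the cache, which matters when scanning long runs of text that reuse a small set of characters.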
Cosimo Lupo
96dafe4afc [unicodedata] add auto-generated 'scripts' module
containing the script ranges and names from Scripts.txt
2017-11-17 19:16:45 +00:00