36 Commits

Author SHA1 Message Date
Khaled Hosny
cf08265cd5 Black 2024-02-06 15:47:35 +02:00
Harry Dalton
9bfb72b055 Fix typing of script_horizontal_direction()
Without explicit annotation, some type checkers infer that the type of
the 'default' argument can only be type[KeyError]. This was the case
in unicodedata_test.py, where pyright disallowed "LTR". This commit adds
annotations to avoid this, fixing the issue in the test (and external
code dependent on the API).

Some of the other functions in this file have the same semantics and
suffer from the same type error, and so this fix could also be extended
to them as usage requires.
2023-07-13 13:53:36 +01:00
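The annotation issue described in this commit can be sketched as follows. This is a minimal illustration, not the actual fontTools source: `KNOWN_SCRIPTS` and `RTL_SCRIPTS` are tiny hypothetical stand-ins for the real tables, and the exact annotations used in the library may differ.

```python
from typing import Literal, Type, TypeVar, Union

T = TypeVar("T")

# Hypothetical, heavily truncated stand-ins for the real script tables.
KNOWN_SCRIPTS = {"Latn", "Arab", "Hebr"}
RTL_SCRIPTS = {"Arab", "Hebr"}


def script_horizontal_direction(
    script_code: str,
    default: Union[T, Type[KeyError]] = KeyError,
) -> Union[Literal["RTL", "LTR"], T]:
    """Return "RTL" or "LTR" for a known script code.

    For an unknown code, raise KeyError unless the caller passed another
    'default', in which case that value is returned instead.
    """
    if script_code not in KNOWN_SCRIPTS:
        # The KeyError class itself acts as the "no default given" sentinel.
        if isinstance(default, type) and issubclass(default, KeyError):
            raise default(script_code)
        return default
    return "RTL" if script_code in RTL_SCRIPTS else "LTR"
```

With the explicit `Union[T, Type[KeyError]]` annotation, a call such as `script_horizontal_direction("Xxxx", "LTR")` should type-check, because the checker no longer infers the parameter type from the default value alone.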
Harry Dalton
d61cad65fc Remove some of the str conversions that are redundant in Python 3 2023-07-13 13:50:55 +01:00
Nikolaus Waxweiler
d584daa8fd Blacken code 2022-12-13 11:26:36 +00:00
Cosimo Lupo
8697f91cdc
[unicodedata] map Zmth<->math in ot_tag_{to,from}_script
Fixes https://github.com/fonttools/fonttools/issues/1737
2022-11-11 12:20:37 +00:00
Cosimo Lupo
ea534a4ecf
unicodedata: Update Scripts/Blocks to Unicode 15.0
by re-running the MetaTools/buildUCD.py script using the current UCD
database.
2022-09-23 10:42:55 +01:00
Cosimo Lupo
98f5fd3d5a unicodedata: alias 'jamo' script tag to 'hang'
see https://github.com/googlefonts/ufo2ft/issues/575#issuecomment-1009962836
2022-01-11 17:13:20 +00:00
Cosimo Lupo
09e2d1d548 unicodedata: update the script direction list to Unicode 14.0
same as https://github.com/harfbuzz/harfbuzz/blob/3.2.0/src/hb-common.cc#L514-L613
2022-01-11 13:19:01 +00:00
Cosimo Lupo
bd58e66dbf bump unicodedata2 dependency to 14.0.0
now supports pypy3 and Unicode 14.0
2021-12-20 16:44:17 +00:00
Yao Wei (魏銘廷)
390640a357
update to unicode 14.0 2021-10-31 23:24:18 +08:00
Khaled Hosny
232b2ccbc4 Move the rest of py23 module to textTools
Change all imports to use textTools module, except the test_py23.py test
which is kept until we decide to remove the module (if ever).
2021-08-20 01:29:45 +02:00
Just van Rossum
5fc65d7168
Misc py23 cleanups (#2243)
* Replaced all from ...py23 import * with explicit name imports, or removed completely when possible.
* Replaced tounicode() with tostr()
* Changed all BytesIO and StringIO imports to from io import ..., replaced all UnicodeIO with StringIO.
* Replaced all unichr() with chr()
* Misc minor tweaks and fixes
2021-03-29 11:45:58 +02:00
David Corbett
ac44a7f61d unicodedata: Update RTL_SCRIPTS for Unicode 13.0 2020-05-06 17:16:37 -04:00
justvanrossum
98efa31ce6 fixed doctest 2020-03-20 14:38:59 +01:00
justvanrossum
b14e6ecf4c update to unicode 13.0 2020-03-20 14:15:00 +01:00
Nikolaus Waxweiler
d381609885 Update RTL_SCRIPTS for Unicode 11 and 12
Information taken from https://docs.google.com/spreadsheets/d/1Y90M0Ie3MUJ6UVCRDOypOtijlMDLNNyyLk36T6iMu0o.
2020-01-09 14:21:49 +00:00
Nikolaus Waxweiler
01328213c7 Remove __future__ imports 2019-08-09 12:20:13 +01:00
Cosimo Lupo
987798f82e
Update Blocks, Scripts and ScriptExtensions to latest Unicode Data 12.1 2019-06-18 11:33:53 +01:00
Cosimo Lupo
a526b7170c
tests: fix expected results after Unicode 11 update
fixes https://github.com/fonttools/fonttools/issues/1291
2018-07-12 11:35:16 +01:00
Cosimo Lupo
452c85ecef
Update Blocks, Scripts and ScriptExtensions for Unicode 11
I ran: python3 MetaTools/buildUCD.py
2018-07-12 11:35:16 +01:00
Cosimo Lupo
677954d5b9
unicodedata: add ot_tag_to_script function
returns the Unicode script code for a given OpenType script tag, or None if no match is found
2018-01-23 11:45:20 -08:00
Cosimo Lupo
91a8cc33e7
unicodedata: add script_horizontal_direction function
same as harfbuzz hb_script_get_horizontal_direction.

We just hard-code the set of RTL scripts here, as it doesn't change often anyway.
The function is just syntactic sugar, as all it does is basically look up the
constant RTL_SCRIPTS set.
It's nice to have it here in a central place alongside 'script', 'script_name', etc.
2018-01-19 18:04:33 +00:00
Cosimo Lupo
5e0bad94c5
export new ot_tags_from_script func in __all__ list [skip ci] 2018-01-18 20:26:44 +00:00
Cosimo Lupo
c9259c4723
unicodedata: add ot_tags_from_script function
Fixes https://github.com/fonttools/fonttools/issues/1112

This implements the same logic found in harfbuzz hb-ot-tag.cc to
convert Unicode (or ISO 15924) script codes to OpenType script
tags as defined at:
https://www.microsoft.com/typography/otspec/scripttags.htm

461a605fde/src/hb-ot-tag.cc (L127)
2018-01-18 20:20:17 +00:00
Cosimo Lupo
1765ed772a [unicodedata] add script_name and script_code to __all__
and cast to str to avoid error with import * in python2.7

TypeError: Item in ``from list'' must be str, not unicode
2017-11-22 18:37:14 +01:00
Cosimo Lupo
99ea0a3986 [unicodedata] add script_code func and 'default' fallback arg
`script_code` does the reverse of `script_name`: it takes a long
script name and returns a 4-letter script code.

Both `script_name` and `script_code` raise KeyError by default,
but can optionally return a default value instead.
2017-11-22 17:46:44 +01:00
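The name/code round trip described in this commit can be sketched as a pair of dictionary lookups. This is an illustration under stated assumptions: the `NAMES` table is a hypothetical three-entry excerpt, and the real functions may normalize their input in ways not shown here.

```python
# Hypothetical excerpt: four-letter script code -> long script name.
NAMES = {"Latn": "Latin", "Arab": "Arabic", "Zyyy": "Common"}
# Reverse mapping, built once: long script name -> four-letter code.
CODES = {name: code for code, name in NAMES.items()}


def _lookup(table, key, default):
    """Look up 'key', raising KeyError unless another default was given."""
    try:
        return table[key]
    except KeyError:
        if isinstance(default, type) and issubclass(default, KeyError):
            raise
        return default


def script_name(code, default=KeyError):
    """Map a four-letter script code to its long name."""
    return _lookup(NAMES, code, default)


def script_code(name, default=KeyError):
    """Map a long script name back to its four-letter code."""
    return _lookup(CODES, name, default)
```

Passing `default=None` turns the lookup into a soft failure, so a caller can write `script_code("Klingon", default=None)` and test the result instead of catching an exception.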
Cosimo Lupo
afd2490a6c [unicodedata] add script_name function
Converts four-letter script codes to human-readable long names
2017-11-22 17:41:23 +01:00
Cosimo Lupo
012688ac20 [Tests] adjust unicodedata_test to expect short script codes 2017-11-22 17:41:23 +01:00
Cosimo Lupo
54fa00499e [Scripts] use short codes, add NAMES dict with aliases 2017-11-22 17:41:23 +01:00
Cosimo Lupo
697b8d9af5 [unicodedata] add block and script_extension functions 2017-11-20 18:16:02 +01:00
Cosimo Lupo
8b50ed56d9 add auto-generated Blocks.py and ScriptsExtensions.py 2017-11-20 18:15:09 +01:00
Cosimo Lupo
1ed78b12f5 [unicodedata] rename scripts.py to Scripts.py
let's use the same names as the original UCD data files for simplicity
2017-11-20 17:37:45 +01:00
Cosimo Lupo
b53b878bdc [scripts] update auto-generated module
it now contains two lists, one for the ranges and another for the script names
2017-11-20 13:38:49 +01:00
Cosimo Lupo
3442da1529 [unicodedata] use bisect.bisect_right function
CPython comes with a fast C implementation of bisect module.
This gives 4 to 5 times speed-ups over my pure-python version.
2017-11-20 13:30:17 +01:00
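A range lookup built on `bisect.bisect_right` might look like the sketch below. The two parallel lists are a hypothetical toy table covering only ASCII, not the generated data, and the function name simply mirrors the one mentioned in these commits.

```python
from bisect import bisect_right

# Hypothetical toy table covering ASCII only. RANGE_STARTS holds the first
# code point of each range; RANGE_VALUES holds the script of that range.
RANGE_STARTS = [0x0000, 0x0041, 0x005B, 0x0061, 0x007B]
RANGE_VALUES = ["Zyyy", "Latn", "Zyyy", "Latn", "Zyyy"]


def script(char):
    """Return the script code of the range that contains 'char'."""
    code_point = ord(char)
    # bisect_right returns the insertion point to the right of any equal
    # entry, so index - 1 is the last range starting at or before code_point.
    index = bisect_right(RANGE_STARTS, code_point) - 1
    return RANGE_VALUES[index]
```

Keeping the range starts in their own sorted list is what lets the C-accelerated `bisect` module do the search directly, without a key function or a Python-level loop.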
Cosimo Lupo
52d6131525 [unicodedata] add new module and 'script' function
The new `fontTools.unicodedata` module re-exports all the public
functions from the built-in `unicodedata` module, and also adds
additional functions.

The `script` function takes a unicode character and returns the
script name as defined in the UCD "Script.txt" data file.

It's implemented as a simple binary search, plus a memoizing
decorator that caches the results to avoid searching for the same
character more than once.

The unicodedata2 backport is imported if present, otherwise
the unicodedata built-in is used.
2017-11-17 19:17:17 +00:00
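The approach described in this commit, a binary search over script ranges wrapped in a memoizing decorator, can be sketched as follows. The `RANGES` table is a hypothetical ASCII-only excerpt, the `memoize` helper is written for this illustration, and returning "Unknown" for uncovered code points is an assumption of the sketch.

```python
import functools

# Hypothetical toy table of (first, last, script name) ranges, sorted by
# first code point and covering ASCII only.
RANGES = [
    (0x0000, 0x0040, "Common"),
    (0x0041, 0x005A, "Latin"),
    (0x005B, 0x0060, "Common"),
    (0x0061, 0x007A, "Latin"),
    (0x007B, 0x007F, "Common"),
]


def memoize(func):
    """Cache results per character so each one is searched only once."""
    cache = {}

    @functools.wraps(func)
    def wrapper(char):
        if char not in cache:
            cache[char] = func(char)
        return cache[char]

    return wrapper


@memoize
def script(char):
    """Binary-search RANGES for the script name of 'char'."""
    code_point = ord(char)
    lo, hi = 0, len(RANGES) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        start, end, name = RANGES[mid]
        if code_point < start:
            hi = mid - 1
        elif code_point > end:
            lo = mid + 1
        else:
            return name
    # Assumption for this sketch: code points outside the table are "Unknown".
    return "Unknown"
```

In current Python, `functools.lru_cache(maxsize=None)` could stand in for the hand-written decorator.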
Cosimo Lupo
96dafe4afc [unicodedata] add auto-generated 'scripts' module
containing the script ranges and names from Scripts.txt
2017-11-17 19:16:45 +00:00