path: root/unidecode
Commit message  Author  Age  Files  Lines
* Fix U+204A "TIRONIAN SIGN ET"  Tomaz Solc  2020-12-06  1  -1/+5
|   See https://github.com/avian2/unidecode/issues/57
* Add some missing replacements in U+23xx page.  Tomaz Solc  2020-05-28  1  -43/+43
|   Content of this commit by Marcoffee (Marco Ribeiro) on GitHub.
|   https://github.com/marcoffee/unidecode/commit/705d91ad4c9c7755529d4be025170b11922f1dee
* Add more latin variants in U+1F1xx page.  Tomaz Solc  2019-01-19  1  -81/+81
|   This adds:
|   - SQUARED LATIN CAPITAL LETTERs,
|   - NEGATIVE CIRCLED LATIN CAPITAL LETTERs,
|   - NEGATIVE SQUARED LATIN CAPITAL LETTERs,
|   - TORTOISE SHELL BRACKETED LATIN CAPITAL LETTERs and
|   - CIRCLED ITALIC LATIN CAPITAL LETTERs
* Merge remote-tracking branch 'jdufresne/main'  Tomaz Solc  2019-01-19  1  -0/+3
|\
| * Add __main__.py file so the CLI can be executed as a module  Jon Dufresne  2018-12-31  1  -0/+3
| |   Allows running the following command to execute the CLI
| |
| |       $ python -m unidecode ...
| |
| |   https://docs.python.org/3/library/__main__.html
| |
| |   > For a package, the same effect can be achieved by including a
| |   > __main__.py module, the contents of which will be executed when the
| |   > module is run with -m.
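The `__main__.py` mechanism described in this commit can be tried in isolation. The sketch below does not touch unidecode itself: it builds a throwaway package (the name `demo_pkg` and the helper `run_package_as_module` are hypothetical) in a temporary directory and runs it with `-m`, assuming the interpreter puts the working directory on its import path as it does by default.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_package_as_module(package_name="demo_pkg"):
    """Create a throwaway package with a __main__.py and run it via -m."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / package_name
        pkg.mkdir()
        (pkg / "__init__.py").write_text("")
        # __main__.py is what the interpreter executes for "python -m <package>".
        (pkg / "__main__.py").write_text("print('running as a module')\n")
        result = subprocess.run(
            [sys.executable, "-m", package_name],
            cwd=tmp,  # the working directory is searched for the package
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout.strip()


if __name__ == "__main__":
    print(run_package_as_module())
```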
* | Merge remote-tracking branch 'jdufresne/b-literal'  Tomaz Solc  2019-01-19  1  -1/+1
|\ \
| * | Replace string literal + encode with bytes literal  Jon Dufresne  2018-12-31  1  -1/+1
| |/
| |   Simpler and more forward compatible. The b prefix syntax is available on
| |   all supported Pythons.
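A small illustration of the equivalence this commit relies on. The exact literal changed in the repository is not shown in the log, so the newline value here is only an assumed example.

```python
# Older style: build a text string, then encode it to bytes.
newline_encoded = "\n".encode("ascii")

# Bytes literal: the same value written directly, with no encode step.
newline_literal = b"\n"

# Both spellings produce identical bytes objects.
assert newline_encoded == newline_literal
assert isinstance(newline_literal, bytes)
```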
* | Merge remote-tracking branch 'jdufresne/argparse'  Tomaz Solc  2019-01-19  1  -12/+13
|\ \
| * | Replace use of deprecated optparse with argparse  Jon Dufresne  2018-12-31  1  -12/+13
| |/
| |   The Python project considers the optparse module as deprecated. See:
| |
| |   https://docs.python.org/3/library/optparse.html
| |
| |   > Deprecated since version 3.2: The optparse module is deprecated and
| |   > will not be developed further; development will continue with the
| |   > argparse module.
| |
| |   Replace the project's use with the newer argparse. The CLI is fully
| |   equivalent and should not result in any backwards compatibility concerns.
| |
| |   https://docs.python.org/3/library/argparse.html
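For readers unfamiliar with the migration, here is a minimal argparse sketch of a CLI of this kind. Only the `-c` option is attested elsewhere in this log; the `-e/--encoding` option, its default, the positional `path` argument, and the helper name `build_parser` are assumptions made for illustration, not the project's actual interface.

```python
import argparse


def build_parser():
    """Build a small parser in the style argparse encourages."""
    parser = argparse.ArgumentParser(description="Transliterate text to ASCII.")
    # Assumed option: encoding used to decode the input.
    parser.add_argument("-e", "--encoding", default="utf-8",
                        help="input encoding (assumed default)")
    # -c takes the text to transliterate directly from the command line.
    parser.add_argument("-c", dest="text",
                        help="transliterate TEXT instead of reading a file")
    # Assumed positional argument: optional input file.
    parser.add_argument("path", nargs="?",
                        help="file to transliterate; standard input if omitted")
    return parser


# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args(["-c", "hello", "-e", "latin-1"])
```

With optparse the same options would be declared through `OptionParser.add_option`, and positional arguments would have to be picked out of the leftover list by hand; argparse declares both kinds in one place.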
* | Remove unused import from unidecode/util.py  Jon Dufresne  2018-12-31  1  -1/+0
|/
* Fix "SQUARE V OVER M" and "SQUARE A OVER M".  Tomaz Solc  2018-06-19  1  -2/+2
|
* Use uA instead of microampere, etc.  Tomaz Solc  2018-06-19  1  -8/+8
|   These codepoints are defined as "Greek small letter mu" and a Latin capital
|   letter, not with spelled-out unit names. "u" is a common way of
|   representing the "micro" SI prefix in ASCII.
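The effect of this change can be sketched with a two-entry lookup table. The table `SQUARE_UNITS` and the function `transliterate_units` are hypothetical stand-ins; the real replacement tables are much larger and are organized differently.

```python
# Hypothetical excerpt: squared unit symbols mapped to short ASCII forms,
# with "u" standing in for the "micro" prefix.
SQUARE_UNITS = {
    "\u3382": "uA",  # SQUARE MU A
    "\u338c": "uF",  # SQUARE MU F
}


def transliterate_units(text):
    """Replace known unit symbols and leave every other character alone."""
    return "".join(SQUARE_UNITS.get(ch, ch) for ch in text)
```

For example, `transliterate_units("5\u3382")` gives `"5uA"` rather than a spelled-out `"5microampere"`.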
* Adds decoding for phonetic block 1D00-1D7F  olau  2018-04-03  1  -86/+86
|   https://unicode-table.com/en/blocks/phonetic-extensions/
* Improve Hebrew conversion  Alon Bar-Lev  2018-03-10  1  -15/+15
|   Convert double-letter transliterations to capital letters, because the
|   duplicates make it very hard to understand what the transliteration is.
|   For example:
|
|       kh - is it k and h or kh?
|       tskh - is it t,s,kh or ts,k,h or ts,kh, etc.
|
|   0xa2 Hebrew Bible punctuation mark, should be ignored.
|   0xc6 Opposite Nun, same as 'n'.
|   0xba Hulam Haser, vowel as 'o'.
|   0xbf Makaf Raphe, same as Makaf (0xbe).
|   0xc5 Hebrew Bible punctuation mark, should be ignored.
|   0xc7 Makaf katan, vowel as 'o'.
|   0xd0 Aleph, sounds as AHA, must exist to make the string readable.
|        Distinguish from '`'; use capital A to distinguish from the 'a' vowel.
|   0xf5 Splitted Vave, same as 'v'.
|   0xf6 Opposite Nun, same as 'n'.
|   0xf7 Small Kuf, same as 'q'.
|
|   Signed-off-by: Alon Bar-Lev <alon.barlev@gmail.com>
* Fix syntax error in an example  Jakub Wilk  2018-02-19  1  -1/+1
|
* Surround fractions with spaces  Jeffrey Gerard  2017-10-10  1  -3/+3
|   Goal is to avoid incorrect combination with adjacent numbers.
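The problem with adjacent numbers is easy to reproduce. In the sketch below both tables and the function `transliterate` are hypothetical, and the exact replacement strings used by the project are an assumption; the point is only that an unpadded fraction fuses with the digit before it.

```python
# Hypothetical one-entry tables for U+00BD VULGAR FRACTION ONE HALF.
WITHOUT_SPACES = {"\u00bd": "1/2"}
WITH_SPACES = {"\u00bd": " 1/2 "}


def transliterate(text, table):
    """Replace characters found in the table and keep the rest unchanged."""
    return "".join(table.get(ch, ch) for ch in text)


# "1½" is meant as one and a half.
fused = transliterate("1\u00bd", WITHOUT_SPACES)   # reads as eleven over two
padded = transliterate("1\u00bd", WITH_SPACES)     # keeps the two numbers apart
```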
* Add currency translations for U+20B0 through U+20BF  Mike Swanson  2017-09-22  1  -16/+16
|
* U+05be is a hyphen  Micha Moskovic  2017-06-23  1  -1/+1
|   U+05be is the Hebrew Maqaf character, which is equivalent to a hyphen, as
|   explained in
|   https://en.wikipedia.org/wiki/Hebrew_punctuation#Hyphen_and_maqaf.
* U+2116 is the numero sign  Alan Davidson  2017-01-16  1  -1/+1
|
* Add missing square unit symbols.  Tomaz Solc  2016-11-04  2  -10/+11
|
* Added latin variants in U+20xx and U+21xx pages.  Tomaz Solc  2016-11-04  2  -28/+28
|
* Fix U+02B1 MODIFIER LETTER SMALL H WITH HOOK  Tomaz Solc  2016-11-04  1  -1/+1
|
* Fix U+205F MEDIUM MATHEMATICAL SPACE  Tomaz Solc  2016-11-04  1  -1/+1
|
* Add U+1F1xx page.  Tomaz Solc  2016-10-12  1  -0/+258
|   Includes "DIGIT ... COMMA" and "PARENTHESIZED LATIN CAPITAL LETTER" characters.
* Add missing vulgar fractions.  Tomaz Solc  2016-10-12  1  -4/+4
|
* Add a/c, a/s, c/o, c/u  Tomaz Solc  2016-10-12  1  -4/+4
|
* Fix transliteration of enclosed alphanumerics  Krzysztof Jurewicz  2016-05-29  1  -46/+47
|
* Fix typos  Jakub Wilk  2015-12-10  1  -1/+1
|
* Fix docstrings  Tomaz Solc  2015-11-17  1  -9/+15
|
* Rename unidecode_fast to unidecode_expect_ascii  Tomaz Solc  2015-11-17  1  -4/+8
|   Also, add unidecode_expect_nonascii. "unidecode" is now an alias for
|   "unidecode_expect_ascii".
* Add unidecode_fast function to speed up mostly-ASCII transliterations.  dukebody  2015-11-14  1  -6/+30
|
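One plausible way to speed up mostly-ASCII input, sketched under assumptions: try to encode the string as ASCII first and fall back to the per-character path only when that fails. The names `slow_transliterate` and `transliterate_expect_ascii` and the two-entry table are hypothetical; the log does not show how the project's functions are actually written.

```python
def slow_transliterate(text):
    """Stand-in for a full per-character table lookup (hypothetical table)."""
    table = {"\u00e9": "e", "\u00fc": "u"}
    return "".join(ch if ord(ch) < 0x80 else table.get(ch, "") for ch in text)


def transliterate_expect_ascii(text):
    """Return pure-ASCII input untouched; otherwise take the slow path."""
    try:
        # A single encode call checks the whole string at once.
        text.encode("ascii")
    except UnicodeEncodeError:
        return slow_transliterate(text)
    return text
```

The trade-off is that input containing non-ASCII characters pays for the failed encode attempt before the slow path runs, which would explain offering a separate variant for input expected to be non-ASCII.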
* Add a newline if the string comes from commandline  Tomaz Solc  2015-05-14  1  -0/+4
|
* Don't append an extra new-line.  Tomaz Solc  2015-05-13  1  -1/+1
|
* Add -c command line option.  Tomaz Solc  2015-05-13  1  -6/+18
|
* Use optparse for Python 2.6 compatibility.  Tomaz Solc  2015-05-13  1  -21/+15
|
* Avoid reopening sys.stdin on Python 3.  Tomaz Solc  2015-05-13  1  -5/+4
|
* Remove unnecessary check for isatty()  Tomaz Solc  2015-05-13  1  -9/+7
|
* Use entry_points for the commandline utility.  Tomaz Solc  2015-05-13  1  -0/+51
|
* Merge branch 'unidecode-1.00'  [release-0.04.17]  Tomaz Solc  2014-12-18  1  -4/+4
|\
| * Remove '[?]' for some characters.  Tomaz Solc  2014-06-16  1  -4/+4
| |
* | Issue a warning if a surrogate char is encountered  Tomaz Solc  2014-12-07  1  -0/+5
| |   Also, improved the section in README regarding "narrow" Python builds.
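A sketch of what detecting surrogates and warning about them might look like. The function name `find_surrogates`, the message text, and the choice of `RuntimeWarning` are assumptions; the log only says that a warning is issued.

```python
import warnings


def find_surrogates(text):
    """Warn about each surrogate code point and return their positions."""
    positions = []
    for index, ch in enumerate(text):
        # U+D800 through U+DFFF are reserved for UTF-16 surrogate pairs.
        if 0xD800 <= ord(ch) <= 0xDFFF:
            warnings.warn(
                f"Surrogate character {ch!r} at position {index}",
                RuntimeWarning,
                stacklevel=2,
            )
            positions.append(index)
    return positions
```

On "narrow" Python builds, characters outside the Basic Multilingual Plane are stored as two such surrogates, so a string can contain them even when the original text was valid.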
* | Add missing double-struck italic capitals  Tomaz Solc  2014-12-07  1  -5/+5
| |
* | Add some missing script latin letters.  Tomaz Solc  2014-12-07  1  -5/+5
| |
* | Add missing double-struck capital letters.  Tomaz Solc  2014-12-07  1  -7/+7
| |
* | fix of importing definitions  Karol Sikora  2014-11-25  1  -1/+1
|/
* Transliterate U+4E00 as "Yi "  Tomaz Solc  2014-05-11  1  -1/+1
|   Thanks to Yao Zuo.
* Transliterate Euro sign as EUR.  Tomaz Solc  2013-12-24  1  -1/+1
|   Thanks to Dave Smith.
* Add some comments for commonly requested changes.  Tomaz Solc  2013-12-24  1  -0/+24
|
* Remove part of table that is equivalent to ASCII  Tomaz Solc  2013-12-24  1  -128/+11
|   Add comment about special case in the code.
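The kind of special case that makes ASCII table rows redundant could look like the sketch below: code points under 0x80 are copied through before any lookup happens. The function `transliterate_with_passthrough` and the table passed to it are hypothetical, and the project's real code may be structured differently.

```python
def transliterate_with_passthrough(text, table):
    """Copy ASCII through unchanged; look up everything else in the table."""
    out = []
    for ch in text:
        if ord(ch) < 0x80:
            # ASCII maps to itself, so the table needs no entry for it.
            out.append(ch)
        else:
            # Characters missing from the table are dropped in this sketch.
            out.append(table.get(ch, ""))
    return "".join(out)
```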
* Add vim modeline.  Tomaz Solc  2013-12-24  1  -0/+1
|