delta/python-packages/unidecode.git - www.tablix.org: ~avian/git/unidecode.git

	Commit message	Author	Age	Files	Lines
*	Improve Yiddish conversion	Alon Bar-Lev	2021-08-31	1	-22/+22
\| \| \| \| \| \|	Cleanup invalid characters and typos. Fixup special characters.
*	Improve Hebrew conversion	Alon Bar-Lev	2021-08-31	1	-14/+14
\| \| \| \| \| \| \| \|	Clean up special, rarely used characters. Bring regular characters closer to the formal document [1]. [1] https://hebrew-academy.org.il/wp-content/uploads/taatik-ivrit-latinit-1-1.pdf
*	Remove __init__.pyi	Tomaz Solc	2021-02-05	2	-10/+4
\|
*	More Python 2 compatibility code clean up.	Tomaz Solc	2021-02-05	2	-12/+2
\|
*	Move Py3.5-compatible type annotations inline.	Tomaz Solc	2021-02-05	2	-8/+4
\|
*	Drop support for Python 2 and 3.4.	Tomaz Solc	2021-02-05	1	-17/+7
\|
*	Add Typing stubs for the main API.	Pascal Corpet	2021-02-03	2	-0/+11
\| \| \| \|	See PEP 484 (for typing) and PEP 561 (for distributing types).
*	Avoid exception chaining on Python 3.	Tomaz Solc	2021-01-08	1	-4/+7
\| \| \| \| \| \|	This avoids exceptions raised by errors='strict' from displaying as "During handling of the above exception ..." in the backtrace which can be confusing.
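The chaining behaviour this commit message describes can be reproduced with a small sketch. It is not the code from the commit: `TransliterationError`, `TABLE` and both lookup functions are hypothetical names, and `raise ... from None` is only one common way to suppress the "During handling of the above exception ..." text on Python 3. The commit may have restructured the code differently.

```python
class TransliterationError(ValueError):
    """Hypothetical error type, used only for this illustration."""


TABLE = {"\u00e9": "e"}  # toy table, not the library's data


def lookup_chained(char):
    try:
        return TABLE[char]
    except KeyError:
        # Raising inside the handler makes Python record the KeyError as
        # __context__ and print both tracebacks, joined by
        # "During handling of the above exception, another exception occurred".
        raise TransliterationError("no replacement for %r" % char)


def lookup_unchained(char):
    try:
        return TABLE[char]
    except KeyError:
        # "from None" sets __suppress_context__, so only the second
        # exception is displayed in the traceback.
        raise TransliterationError("no replacement for %r" % char) from None
```

Both functions raise the same error type; only the traceback that the user sees differs.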
*	Rename argument replace_char -> replace_str	Tomaz Solc	2020-12-20	1	-9/+9
\|
*	More mass replace '' -> None	Tomaz Solc	2020-12-20	4	-757/+757
\| \| \| \|	See 35295352.
*	Mass replace '[?] ' -> None	Tomaz Solc	2020-12-20	79	-795/+795
\| \| \| \| \| \| \|	To make use of the new 'errors' argument. It seems that '[?] ' (with space) was used for code points that were assigned, but the replacement was not known.
*	Mass replace '' -> None.	Tomaz Solc	2020-12-20	4	-395/+395
\| \| \| \| \| \| \| \| \|	To make use of the new 'errors' argument. '' was used in the original Perl tables both to mean an unknown replacement and an intentional replacement with an empty string. Here I only replace it in ranges I've added later where I'm reasonably sure that '' means unknown replacement.
*	Mass replace '[?]' -> None	Tomaz Solc	2020-12-20	45	-3319/+3319
\| \| \| \| \| \|	To make use of the new 'errors' argument. It seems '[?]' was used in the original Perl tables for unassigned codepoints.
*	Add missing ligatures and quotes in U+1F6xx range	Tomaz Solc	2020-12-20	1	-0/+258
\|
*	Add missing quotation marks in the U+27xx range.	Tomaz Solc	2020-12-20	1	-5/+5
\|
*	Add errors parameter to unidecode()	Tomaz Solc	2020-12-20	1	-34/+76
\| \| \| \|	This implements the idea in https://github.com/avian2/unidecode/pull/53
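A minimal sketch of how an `errors` argument can work together with `None` table entries, as the neighbouring commit messages describe. This is a toy, not the library's implementation: `toy_unidecode`, `TOY_TABLE` and the mode names `ignore`, `replace`, `preserve` and `strict` are assumptions chosen for illustration, and the table holds only a few hand-picked entries.

```python
# Toy table: None marks "replacement unknown", while '' is an
# intentional replacement with an empty string.
TOY_TABLE = {
    "\u00e9": "e",     # e with acute accent
    "\u00df": "ss",    # sharp s
    "\u0301": "",      # combining acute accent: intentionally dropped
    "\u2603": None,    # snowman: no replacement known in this toy table
}


def toy_unidecode(text, errors="ignore", replace_str="?"):
    """Transliterate text to ASCII using TOY_TABLE (illustrative only)."""
    out = []
    for index, char in enumerate(text):
        if ord(char) < 0x80:
            out.append(char)  # ASCII passes through unchanged
            continue
        repl = TOY_TABLE.get(char)  # missing entries are also unknown
        if repl is None:
            if errors == "ignore":
                repl = ""
            elif errors == "replace":
                repl = replace_str
            elif errors == "preserve":
                repl = char
            elif errors == "strict":
                raise ValueError(
                    "no replacement for %r at index %d" % (char, index))
            else:
                raise ValueError("invalid value for errors: %r" % errors)
        out.append(repl)
    return "".join(out)
```

Keeping `None` distinct from `''` is what lets the caller decide what happens to unknown characters without affecting characters that are deliberately mapped to nothing.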
*	Fix U+204A "TIRONIAN SIGN ET"	Tomaz Solc	2020-12-06	1	-1/+5
\| \| \| \|	See https://github.com/avian2/unidecode/issues/57
*	Add some missing replacements in U+23xx page.	Tomaz Solc	2020-05-28	1	-43/+43
\| \| \| \| \| \|	Content of this commit by Marcoffee (Marco Ribeiro) on GitHub. https://github.com/marcoffee/unidecode/commit/705d91ad4c9c7755529d4be025170b11922f1dee
*	Add more latin variants in U+1F1xx page.	Tomaz Solc	2019-01-19	1	-81/+81
\| \| \| \| \| \| \| \| \| \|	This adds: - SQUARED LATIN CAPITAL LETTERs, - NEGATIVE CIRCLED LATIN CAPITAL LETTERs, - NEGATIVE SQUARED LATIN CAPITAL LETTERs, - TORTOISE SHELL BRACKETED LATIN CAPITAL LETTERs and - CIRCLED ITALIC LATIN CAPITAL LETTERs
*	Merge remote-tracking branch 'jdufresne/main'	Tomaz Solc	2019-01-19	1	-0/+3
\|\
\| *	Add __main__.py file so the CLI can be executed as a module	Jon Dufresne	2018-12-31	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Allows running the following command to execute the CLI $ python -m unidecode ... https://docs.python.org/3/library/__main__.html > For a package, the same effect can be achieved by including a > __main__.py module, the contents of which will be executed when the > module is run with -m.
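The `__main__.py` mechanism quoted in this commit message can be tried out with a throwaway package. The sketch below is an assumption-laden demonstration, not the file added by the commit: `toypkg`, its `main` function and `run_toy_package` are hypothetical names, and the package is written to a temporary directory only so that the example is self-contained.

```python
import os
import subprocess
import sys
import tempfile


def run_toy_package(argument):
    """Create a tiny package with __main__.py and run it with "python -m"."""
    with tempfile.TemporaryDirectory() as root:
        pkg = os.path.join(root, "toypkg")
        os.mkdir(pkg)
        with open(os.path.join(pkg, "__init__.py"), "w") as f:
            f.write("def main(argv):\n"
                    "    print('toypkg got', ' '.join(argv))\n")
        # __main__.py is what "python -m toypkg" executes.
        with open(os.path.join(pkg, "__main__.py"), "w") as f:
            f.write("import sys\n"
                    "from toypkg import main\n"
                    "main(sys.argv[1:])\n")
        env = dict(os.environ, PYTHONPATH=root)  # make toypkg importable
        result = subprocess.run(
            [sys.executable, "-m", "toypkg", argument],
            cwd=root, env=env, capture_output=True, text=True, check=True)
        return result.stdout.strip()
```

Without the `__main__.py` file, the same `-m` invocation fails because a package on its own cannot be executed directly.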
* \|	Merge remote-tracking branch 'jdufresne/b-literal'	Tomaz Solc	2019-01-19	1	-1/+1
\|\ \
\| * \|	Replace string literal + encode with bytes literal	Jon Dufresne	2018-12-31	1	-1/+1
\| \|/ \| \| \| \| \| \| \| \|	Simpler and more forward compatible. The b prefix syntax is available on all supported Pythons.
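For ASCII-only text the two spellings named in this commit produce the same `bytes` object, which a short check can confirm. The string used here is a placeholder; the commit message does not say which literal was changed.

```python
# Older spelling: build a str, then encode it to bytes at run time.
encoded = "[?]".encode("ascii")

# Newer spelling: a bytes literal, with no separate encode step.
literal = b"[?]"

same_value = encoded == literal
```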
* \|	Merge remote-tracking branch 'jdufresne/argparse'	Tomaz Solc	2019-01-19	1	-12/+13
\|\ \
\| * \|	Replace use of deprecated optparse with argparse	Jon Dufresne	2018-12-31	1	-12/+13
\| \|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The Python project considers the optparse module as deprecated. See: https://docs.python.org/3/library/optparse.html > Deprecated since version 3.2: The optparse module is deprecated and > will not be developed further; development will continue with the > argparse module. Replace the project's use with the newer argparse. The CLI is fully equivalent and should not result in any backwards compatibility concerns. https://docs.python.org/3/library/argparse.html
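A rough sketch of what an optparse-to-argparse migration looks like for a small CLI. The option names below (`-e`/`--encoding`, `-c` and a positional `path`) are assumptions for illustration and are not copied from the project's `util.py`. The optparse equivalents appear only as comments.

```python
import argparse


def build_parser():
    # optparse: parser = optparse.OptionParser('%prog [options] [FILE]')
    parser = argparse.ArgumentParser(
        description="Toy transliteration CLI (illustrative only)")
    # optparse: parser.add_option('-e', '--encoding', ...)
    parser.add_argument("-e", "--encoding", default="utf-8",
                        help="input encoding")
    parser.add_argument("-c", dest="text",
                        help="transliterate TEXT instead of reading a file")
    # optparse returned leftover positionals as a list; argparse declares them.
    parser.add_argument("path", nargs="?",
                        help="input file; standard input if omitted")
    return parser


# Passing an explicit list keeps the example independent of sys.argv.
args = build_parser().parse_args(["-c", "text", "-e", "latin-1"])
```

The main visible difference is that positional arguments are declared up front instead of being picked out of a leftover list after parsing.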
* \|	Remove unused import from unidecode/util.py	Jon Dufresne	2018-12-31	1	-1/+0
\|/
*	Fix "SQUARE V OVER M" and "SQUARE A OVER M".	Tomaz Solc	2018-06-19	1	-2/+2
\|
*	Use uA instead of microampere, etc.	Tomaz Solc	2018-06-19	1	-8/+8
\| \| \| \| \| \| \|	These codepoints are defined as "Greek small letter mu" and a Latin capital letter, not with spelled-out unit names. "u" is a common way of representing "micro" SI prefix in ASCII.
*	Adds decoding for phonetic block 1D00-1D7F	olau	2018-04-03	1	-86/+86
\| \| \| \|	https://unicode-table.com/en/blocks/phonetic-extensions/
*	Improve Hebrew conversion	Alon Bar-Lev	2018-03-10	1	-15/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Convert double-letter transliterations to capital letters, because the duplicates make it very hard to tell what the transliteration is, for example: kh - is it k and h or kh? tskh - is it t,s,kh or ts,k,h or ts,kh, etc. 0xa2 Hebrew Bible punctuation mark, should be ignored. 0xc6 Opposite Nun, same as 'n'. 0xba Hulam Haser, vowel as 'o'. 0xbf Makaf Raphe, same as Makaf (0xbe). 0xc5 Hebrew Bible punctuation mark, should be ignored. 0xc7 Makaf katan, vowel as 'o'. 0xd0 Aleph, sounds as AHA, must exist to make the string readable. Distinguish from '`': use capital A to distinguish from the 'a' vowel. 0xf5 Split Vave, same as 'v'. 0xf6 Opposite Nun, same as 'n'. 0xf7 Small Kuf, same as 'q'. Signed-off-by: Alon Bar-Lev <alon.barlev@gmail.com>
*	Fix syntax error in an example	Jakub Wilk	2018-02-19	1	-1/+1
\|
*	Surround fractions with spaces	Jeffrey Gerard	2017-10-10	1	-3/+3
\| \| \| \|	Goal is to avoid incorrect combination with adjacent numbers.
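The problem this commit addresses is easy to see with a toy table. In the sketch below, `TIGHT`, `SPACED` and `translit` are hypothetical names, and the exact replacement strings in the real tables may differ from the ones assumed here.

```python
TIGHT = {"\u00bd": "1/2"}      # fraction replaced without padding
SPACED = {"\u00bd": " 1/2 "}   # fraction surrounded with spaces


def translit(text, table):
    """Replace each character found in table, leave the rest unchanged."""
    return "".join(table.get(ch, ch) for ch in text)


recipe = "1\u00bd cups"                 # "one and a half cups"
ambiguous = translit(recipe, TIGHT)     # '11/2 cups', reads as eleven halves
readable = translit(recipe, SPACED)     # '1 1/2  cups'
```

The padded version can leave a doubled space after the fraction, which is a small cost compared with silently changing the number.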
*	Add currency translations for U+20B0 through U+20BF	Mike Swanson	2017-09-22	1	-16/+16
\|
*	U+05be is a hyphen	Micha Moskovic	2017-06-23	1	-1/+1
\| \| \|	U+05be is the Hebrew Maqaf character, which is equivalent to a hyphen, as explained in https://en.wikipedia.org/wiki/Hebrew_punctuation#Hyphen_and_maqaf.
*	U+2116 is the numero sign	Alan Davidson	2017-01-16	1	-1/+1
\|
*	Add missing square unit symbols.	Tomaz Solc	2016-11-04	2	-10/+11
\|
*	Added latin variants in U+20xx and U+21xx pages.	Tomaz Solc	2016-11-04	2	-28/+28
\|
*	Fix U+02B1 MODIFIER LETTER SMALL H WITH HOOK	Tomaz Solc	2016-11-04	1	-1/+1
\|
*	Fix U+205F MEDIUM MATHEMATICAL SPACE	Tomaz Solc	2016-11-04	1	-1/+1
\|
*	Add U+1F1xx page.	Tomaz Solc	2016-10-12	1	-0/+258
\| \| \| \|	Includes "DIGIT ... COMMA" and "PARENTHESIZED LATIN CAPITAL LETTER" characters.
*	Add missing vulgar fractions.	Tomaz Solc	2016-10-12	1	-4/+4
\|
*	Add a/c, a/s, c/o, c/u	Tomaz Solc	2016-10-12	1	-4/+4
\|
*	Fix transliteration of enclosed alphanumerics	Krzysztof Jurewicz	2016-05-29	1	-46/+47
\|
*	Fix typos	Jakub Wilk	2015-12-10	1	-1/+1
\|
*	Fix docstrings	Tomaz Solc	2015-11-17	1	-9/+15
\|
*	Rename unidecode_fast to unidecode_expect_ascii	Tomaz Solc	2015-11-17	1	-4/+8
\| \| \| \| \|	Also, add unidecode_expect_nonascii. "unidecode" is now an alias for "unidecode_expect_ascii"
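The idea behind an "expect ASCII" variant can be sketched as a fast path followed by a fallback. This is an assumption about the general technique, not the library's code: `expect_ascii`, `slow_translit` and the one-entry table are hypothetical.

```python
def slow_translit(text):
    """Per-character fallback using a toy table (illustrative only)."""
    table = {"\u00e9": "e"}
    return "".join(
        ch if ord(ch) < 0x80 else table.get(ch, "") for ch in text)


def expect_ascii(text):
    """Return text unchanged when it is already ASCII, else transliterate."""
    try:
        # Fast path: one encode call scans the whole string at C speed.
        text.encode("ascii")
    except UnicodeEncodeError:
        # Slow path: only taken when a non-ASCII character is present.
        return slow_translit(text)
    return text
```

For mostly-ASCII input the per-character loop is skipped entirely, which is where the speedup comes from. For mostly non-ASCII input the failed encode is wasted work, which is a plausible reason to offer a separate non-ASCII variant.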
*	Add unidecode_fast function to speedup mostly-ASCII transliterations.	dukebody	2015-11-14	1	-6/+30
\|
*	Add a newline if the string comes from commandline	Tomaz Solc	2015-05-14	1	-0/+4
\|
*	Don't append an extra new-line.	Tomaz Solc	2015-05-13	1	-1/+1
\|
*	Add -c command line option.	Tomaz Solc	2015-05-13	1	-6/+18
\|