author: Tomaz Solc <tomaz.solc@tablix.org> 2020-12-20 12:36:36 +0100
committer: Tomaz Solc <tomaz.solc@tablix.org> 2020-12-20 12:36:36 +0100
commit: cc60c01f991ebf0114c61ab28d81bc7d18474973 (patch)
tree: abec2013580e1a35cf2723457deb2f73dcd3e5d7
parent: a3f59f9ca4f980b336a9e9df4fa973afd24ac7f4 (diff)
Document the new errors argument.
-rw-r--r-- README.rst 11
1 files changed, 11 insertions, 0 deletions
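
The four *errors* modes described in the patch below can be illustrated with a small stand-in. This is only a sketch of the documented behaviour, not Unidecode's implementation: ``toy_transliterate``, ``ToyTransliterationError`` and the two-entry ``TABLE`` are hypothetical names invented for this example, and the ``replacement_char`` parameter simply mirrors the wording of the README text.

```python
# Toy stand-in for illustration only; this is NOT Unidecode's code.
# TABLE is a hypothetical two-entry transliteration table.
TABLE = {"\u00e9": "e", "\u00df": "ss"}


class ToyTransliterationError(ValueError):
    """Raised in 'strict' mode; ``index`` locates the offending character."""

    def __init__(self, message, index):
        super().__init__(message)
        self.index = index


def toy_transliterate(text, errors="ignore", replacement_char="?"):
    """Transliterate ``text`` to ASCII, handling unknown characters per ``errors``."""
    out = []
    for index, char in enumerate(text):
        if ord(char) < 128:
            out.append(char)                 # already ASCII
        elif char in TABLE:
            out.append(TABLE[char])          # known transliteration
        elif errors == "ignore":
            continue                         # drop the character
        elif errors == "strict":
            raise ToyTransliterationError(
                "no replacement for %r at index %d" % (char, index), index
            )
        elif errors == "replace":
            out.append(replacement_char)     # substitute a marker string
        elif errors == "preserve":
            out.append(char)                 # keep the non-ASCII character
        else:
            raise ValueError("unknown errors mode: %r" % (errors,))
    return "".join(out)


# U+00E9 is in the toy table, U+2603 (snowman) is not.
sample = "caf\u00e9 \u2603"
print(repr(toy_transliterate(sample)))                     # default: 'ignore'
print(repr(toy_transliterate(sample, errors="replace")))
print(repr(toy_transliterate(sample, errors="preserve")))
try:
    toy_transliterate(sample, errors="strict")
except ToyTransliterationError as exc:
    print("offending character:", repr(sample[exc.index]))
```

With ``'preserve'`` the result still contains the snowman character, so encoding it as ASCII would fail, which is the caveat the patch text ends on.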
diff --git a/README.rst b/README.rst
index 220a0bf..f3e35e5 100644
--- a/README.rst
+++ b/README.rst
@@ -64,6 +64,17 @@ Python 3.x)::
>>> unidecode(u"\u5317\u4EB0")
'Bei Jing '
+You can also specify an *errors* argument to ``unidecode()`` that determines
+what Unidecode does with characters that are not present in its transliteration
+tables. The default is ``'ignore'``, meaning that Unidecode will ignore those
+characters (replace them with an empty string). ``'strict'`` will raise a
+``UnidecodeError``. The exception object will contain an *index* attribute that
+can be used to find the offending character. ``'replace'`` will replace them
+with ``'?'`` (or another string, specified in the *replacement_char* argument).
+``'preserve'`` will keep the original, non-ASCII character in the string. Note
+that if ``'preserve'`` is used the string returned by ``unidecode()`` will not
+be ASCII-encodable!
+
A utility is also included that allows you to transliterate text from the
command line in several ways. Reading from standard input::