 Doc/library/codecs.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/Doc/library/codecs.rst b/Doc/library/codecs.rst
index b44dca4ac8..1bcd376cb1 100644
--- a/Doc/library/codecs.rst
+++ b/Doc/library/codecs.rst
@@ -816,7 +816,7 @@ Without external information it's impossible to reliably determine which
 encoding was used for encoding a Unicode string. Each charmap encoding can
 decode any random byte sequence. However that's not possible with UTF-8, as
 UTF-8 byte sequences have a structure that doesn't allow arbitrary byte
-sequence. To increase the reliability with which a UTF-8 encoding can be
+sequences. To increase the reliability with which a UTF-8 encoding can be
 detected, Microsoft invented a variant of UTF-8 (that Python 2.5 calls
 ``"utf-8-sig"``) for its Notepad program: Before any of the Unicode characters
 is written to the file, a UTF-8 encoded BOM (which looks like this as a byte
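
Reviewer note: the behaviour described in the patched paragraph can be checked with a short sketch. It is illustrative only and not part of the patch. It assumes a current Python 3 interpreter (the paragraph itself refers to Python 2.5), and the sample string and the choice of `latin-1` as the charmap codec are assumptions made for the example.

```python
import codecs

text = "héllo"  # sample string, chosen only for illustration

plain = text.encode("utf-8")
signed = text.encode("utf-8-sig")

# "utf-8-sig" prepends the UTF-8 encoded BOM (codecs.BOM_UTF8) when encoding.
has_sig = signed.startswith(codecs.BOM_UTF8)

# Decoding with "utf-8-sig" strips the signature; plain "utf-8" keeps it
# as a leading U+FEFF character.
decoded_sig = signed.decode("utf-8-sig")
decoded_plain = signed.decode("utf-8")

# A charmap codec such as latin-1 maps every byte to a character, so it
# accepts arbitrary bytes, while UTF-8 rejects ill-formed sequences.
arbitrary = bytes([0xFF, 0xFE, 0x80])
as_latin1 = arbitrary.decode("latin-1")
try:
    arbitrary.decode("utf-8")
    utf8_rejected = False
except UnicodeDecodeError:
    utf8_rejected = True
```

Data without a signature also decodes cleanly with `"utf-8-sig"`, so a reader can use that codec whether or not the BOM is present.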