diff options
| author | (no author) <(no author)@929543f6-e4f2-0310-98a6-ba3bd3dd1d04> | 2003-07-02 17:51:01 +0000 |
|---|---|---|
| committer | (no author) <(no author)@929543f6-e4f2-0310-98a6-ba3bd3dd1d04> | 2003-07-02 17:51:01 +0000 |
| commit | edf2f3bec1de42965e347eb808f8604151c35bdb (patch) | |
| tree | 783ec29ed403599a0717019ac974ed2fd7366207 /docutils/docs/dev/rst/problems.txt | |
| parent | 97e20d03d7a4f773f24b082b76a34411acc1fefd (diff) | |
| download | docutils-ax.tar.gz | |
This commit was manufactured by cvs2svn to create branch 'ax'.ax
git-svn-id: http://svn.code.sf.net/p/docutils/code/branches/ax@1551 929543f6-e4f2-0310-98a6-ba3bd3dd1d04
Diffstat (limited to 'docutils/docs/dev/rst/problems.txt')
| -rw-r--r-- | docutils/docs/dev/rst/problems.txt | 870 |
1 files changed, 0 insertions, 870 deletions
diff --git a/docutils/docs/dev/rst/problems.txt b/docutils/docs/dev/rst/problems.txt deleted file mode 100644 index a8747af88..000000000 --- a/docutils/docs/dev/rst/problems.txt +++ /dev/null @@ -1,870 +0,0 @@ -============================== - Problems With StructuredText -============================== -:Author: David Goodger -:Contact: goodger@users.sourceforge.net -:Revision: $Revision$ -:Date: $Date$ -:Copyright: This document has been placed in the public domain. - -There are several problems, unresolved issues, and areas of -controversy within StructuredText_ (Classic and Next Generation). In -order to resolve all these issues, this analysis brings all of the -issues out into the open, enumerates all the alternatives, and -proposes solutions to be incorporated into the reStructuredText_ -specification. - - -.. contents:: - - -Formal Specification -==================== - -The description in the original StructuredText.py has been criticized -for being vague. For practical purposes, "the code *is* the spec." -Tony Ibbs has been working on deducing a `detailed description`_ from -the documentation and code of StructuredTextNG_. Edward Loper's -STMinus_ is another attempt to formalize a spec. - -For this kind of a project, the specification should always precede -the code. Otherwise, the markup is a moving target which can never be -adopted as a standard. Of course, a specification may be revised -during lifetime of the code, but without a spec there is no visible -control and thus no confidence. - - -Understanding and Extending the Code -==================================== - -The original StructuredText_ is a dense mass of sparsely commented -code and inscrutable regular expressions. It was not designed to be -extended and is very difficult to understand. StructuredTextNG_ has -been designed to allow input (syntax) and output extensions, but its -documentation (both internal [comments & docstrings], and external) is -inadequate for the complexity of the code itself. - -For reStructuredText to become truly useful, perhaps even part of -Python's standard library, it must have clear, understandable -documentation and implementation code. For the implementation of -reStructuredText to be taken seriously, it must be a sterling example -of the potential of docstrings; the implementation must practice what -the specification preaches. - - -Section Structure via Indentation -================================= - -Setext_ required that body text be indented by 2 spaces. The original -StructuredText_ and StructuredTextNG_ require that section structure -be indicated through indentation, as "inspired by Python". For -certain structures with a very limited, local extent (such as lists, -block quotes, and literal blocks), indentation naturally indicates -structure or hierarchy. For sections (which may have a very large -extent), structure via indentation is unnecessary, unnatural and -ambiguous. Rather, the syntax of the section title *itself* should -indicate that it is a section title. - -The original StructuredText states that "A single-line paragraph whose -immediately succeeding paragraphs are lower level is treated as a -header." Requiring indentation in this way is: - -- Unnecessary. The vast majority of docstrings and standalone - documents will have no more than one level of section structure. - Requiring indentation for such docstrings is unnecessary and - irritating. - -- Unnatural. Most published works use title style (type size, face, - weight, and position) and/or section/subsection numbering rather - than indentation to indicate hierarchy. This is a tradition with a - very long history. - -- Ambiguous. A StructuredText header is indistinguishable from a - one-line paragraph followed by a block quote (precluding the use of - block quotes). Enumerated section titles are ambiguous (is it a - header? is it a list item?). Some additional adornment must be - required to confirm the line's role as a title, both to a parser and - to the human reader of the source text. - -Python's use of significant whitespace is a wonderful (if not -original) innovation, however requiring indentation in ordinary -written text is hypergeneralization. - -reStructuredText_ indicates section structure through title adornment -style (as exemplified by this document). This is far more natural. -In fact, it is already in widespread use in plain text documents, -including in Python's standard distribution (such as the toplevel -README_ file). - - -Character Escaping Mechanism -============================ - -No matter what characters are chosen for markup, some day someone will -want to write documentation *about* that markup or using markup -characters in a non-markup context. Therefore, any complete markup -language must have an escaping or encoding mechanism. For a -lightweight markup system, encoding mechanisms like SGML/XML's '*' -are out. So an escaping mechanism is in. However, with carefully -chosen markup, it should be necessary to use the escaping mechanism -only infrequently. - -reStructuredText_ needs an escaping mechanism: a way to treat -markup-significant characters as the characters themselves. Currently -there is no such mechanism (although ZWiki uses '!'). What are the -candidates? - -1. ``!`` (http://dev.zope.org/Members/jim/StructuredTextWiki/NGEscaping) -2. ``\`` -3. ``~`` -4. doubling of characters - -The best choice for this is the backslash (``\``). It's "the single -most popular escaping character in the world!", therefore familiar and -unsurprising. Since characters only need to be escaped under special -circumstances, which are typically those explaining technical -programming issues, the use of the backslash is natural and -understandable. Python docstrings can be raw (prefixed with an 'r', -as in 'r""'), which would obviate the need for gratuitous doubling-up -of backslashes. - -(On 2001-03-29 on the Doc-SIG mailing list, GvR endorsed backslash -escapes, saying, "'nuff said. Backslash it is." Although neither -legally binding nor irrevocable nor any kind of guarantee of anything, -it is a good sign.) - -The rule would be: An unescaped backslash followed by any markup -character escapes the character. The escaped character represents the -character itself, and is prevented from playing a role in any markup -interpretation. The backslash is removed from the output. A literal -backslash is represented by an "escaped backslash," two backslashes in -a row. - -A carefully constructed set of recognition rules for inline markup -will obviate the need for backslash-escapes in almost all cases; see -`Delimitation of Inline Markup`_ below. - -When an expression (requiring backslashes and other characters used -for markup) becomes too complicated and therefore unreadable, a -literal block may be used instead. Inside literal blocks, no markup -is recognized, therefore backslashes (for the purpose of escaping -markup) become unnecessary. - -We could allow backslashes preceding non-markup characters to remain -in the output. This would make describing regular expressions and -other uses of backslashes easier. However, this would complicate the -markup rules and would be confusing. - - -Blank Lines in Lists -==================== - -Oft-requested in Doc-SIG (the earliest reference is dated 1996-08-13) -is the ability to write lists without requiring blank lines between -items. In docstrings, space is at a premium. Authors want to convey -their API or usage information in as compact a form as possible. -StructuredText_ requires blank lines between all body elements, -including list items, even when boundaries are obvious from the markup -itself. - -In reStructuredText, blank lines are optional between list items. -However, in order to eliminate ambiguity, a blank line is required -before the first list item and after the last. Nested lists also -require blank lines before the list start and after the list end. - - -Bullet List Markup -================== - -StructuredText_ includes 'o' as a bullet character. This is dangerous -and counter to the language-independent nature of the markup. There -are many languages in which 'o' is a word. For example, in Spanish:: - - Llamame a la casa - o al trabajo. - - (Call me at home or at work.) - -And in Japanese (when romanized):: - - Senshuu no doyoubi ni tegami - o kakimashita. - - ([I] wrote a letter on Saturday last week.) - -If a paragraph containing an 'o' word wraps such that the 'o' is the -first text on a line, or if a paragraph begins with such a word, it -could be misinterpreted as a bullet list. - -In reStructuredText_, 'o' is not used as a bullet character. '-', -'*', and '+' are the possible bullet characters. - - -Enumerated List Markup -====================== - -StructuredText enumerated lists are allowed to begin with numbers and -letters followed by a period or right-parenthesis, then whitespace. -This has surprising consequences for writing styles. For example, -this is recognized as an enumerated list item by StructuredText:: - - Mr. Creosote. - -People will write enumerated lists in all different ways. It is folly -to try to come up with the "perfect" format for an enumerated list, -and limit the docstring parser's recognition to that one format only. - -Rather, the parser should recognize a variety of enumerator styles. -It is also recommended that the enumerator of the first list item be -ordinal-1 ('1', 'A', 'a', 'I', or 'i'), as output formats may not be -able to begin a list at an arbitrary enumeration. - -An initial idea was to require two or more consistent enumerated list -items in a row. This idea proved impractical and was dropped. In -practice, the presence of a proper enumerator is enough to reliably -recognize an enumerated list item; any ambiguities are reported by the -parser. Here's the original idea for posterity: - - The parser should recognize a variety of enumerator styles, mark - each block as a potential enumerated list item (PELI), and - interpret the enumerators of adjacent PELIs to decide whether they - make up a consistent enumerated list. - - If a PELI is labeled with a "1.", and is immediately followed by a - PELI labeled with a "2.", we've got an enumerated list. Or "(A)" - followed by "(B)". Or "i)" followed by "ii)", etc. The chances - of accidentally recognizing two adjacent and consistently labeled - PELIs, are acceptably small. - - For an enumerated list to be recognized, the following must be - true: - - - the list must consist of multiple adjacent list items (2 or - more) - - the enumerators must all have the same format - - the enumerators must be sequential - - -Definition List Markup -====================== - -StructuredText uses ' -- ' (whitespace, two hyphens, whitespace) on -the first line of a paragraph to indicate a definition list item. The -' -- ' serves to separate the term (on the left) from the definition -(on the right). - -Many people use ' -- ' as an em-dash in their text, conflicting with -the StructuredText usage. Although the Chicago Manual of Style says -that spaces should not be used around an em-dash, Peter Funk pointed -out that this is standard usage in German (according to the Duden, the -official German reference), and possibly in other languages as well. -The widespread use of ' -- ' precludes its use for definition lists; -it would violate the "unsurprising" criterion. - -A simpler, and at least equally visually distinctive construct -(proposed by Guido van Rossum, who incidentally is a frequent user of -' -- ') would do just as well:: - - term 1 - Definition. - - term 2 - Definition 2, paragraph 1. - - Definition 2, paragraph 2. - -A reStructuredText definition list item consists of a term and a -definition. A term is a simple one-line paragraph. A definition is a -block indented relative to the term, and may contain multiple -paragraphs and other body elements. No blank line precedes a -definition (this distinguishes definition lists from block quotes). - - -Literal Blocks -============== - -The StructuredText_ specification has literal blocks indicated by -'example', 'examples', or '::' ending the preceding paragraph. STNG -only recognizes '::'; 'example'/'examples' are not implemented. This -is good; it fixes an unnecessary language dependency. The problem is -what to do with the sometimes- unwanted '::'. - -In reStructuredText_ '::' at the end of a paragraph indicates that -subsequent *indented* blocks are treated as literal text. No further -markup interpretation is done within literal blocks (not even -backslash-escapes). If the '::' is preceded by whitespace, '::' is -omitted from the output; if '::' was the sole content of a paragraph, -the entire paragraph is removed (no 'empty' paragraph remains). If -'::' is preceded by a non-whitespace character, '::' is replaced by -':' (i.e., the extra colon is removed). - -Thus, a section could begin with a literal block as follows:: - - Section Title - ------------- - - :: - - print "this is example literal" - - -Tables -====== - -The table markup scheme in classic StructuredText was horrible. Its -omission from StructuredTextNG is welcome, and its markup will not be -repeated here. However, tables themselves are useful in -documentation. Alternatives: - -1. This format is the most natural and obvious. It was independently - invented (no great feat of creation!), and later found to be the - format supported by the `Emacs table mode`_:: - - +------------+------------+------------+--------------+ - | Header 1 | Header 2 | Header 3 | Header 4 | - +============+============+============+==============+ - | Column 1 | Column 2 | Column 3 & 4 span (Row 1) | - +------------+------------+------------+--------------+ - | Column 1 & 2 span | Column 3 | - Column 4 | - +------------+------------+------------+ - Row 2 & 3 | - | 1 | 2 | 3 | - span | - +------------+------------+------------+--------------+ - - Tables are described with a visual outline made up of the - characters '-', '=', '|', and '+': - - - The hyphen ('-') is used for horizontal lines (row separators). - - The equals sign ('=') is optionally used as a header separator - (as of version 1.5.24, this is not supported by the Emacs table - mode). - - The vertical bar ('|') is used for for vertical lines (column - separators). - - The plus sign ('+') is used for intersections of horizontal and - vertical lines. - - Row and column spans are possible simply by omitting the column or - row separators, respectively. The header row separator must be - complete; in other words, a header cell may not span into the table - body. Each cell contains body elements, and may have multiple - paragraphs, lists, etc. Initial spaces for a left margin are - allowed; the first line of text in a cell determines its left - margin. - -2. Below is a simpler table structure. It may be better suited to - manual input than alternative #1, but there is no Emacs editing - mode available. One disadvantage is that it resembles section - titles; a one-column table would look exactly like section & - subsection titles. :: - - ============ ============ ============ ============== - Header 1 Header 2 Header 3 Header 4 - ============ ============ ============ ============== - Column 1 Column 2 Column 3 & 4 span (Row 1) - ------------ ------------ --------------------------- - Column 1 & 2 span Column 3 - Column 4 - ------------------------- ------------ - Row 2 & 3 - 1 2 3 - span - ============ ============ ============ ============== - - The table begins with a top border of equals signs with a space at - each column boundary (regardless of spans). Each row is - underlined. Internal row separators are underlines of '-', with - spaces at column boundaries. The last of the optional head rows is - underlined with '=', again with spaces at column boundaries. - Column spans have no spaces in their underline. Row spans simply - lack an underline at the row boundary. The bottom boundary of the - table consists of '=' underlines. A blank line is required - following a table. - -3. A minimalist alternative is as follows:: - - ==== ===== ======== ======== ======= ==== ===== ===== - Old State Input Action New State Notes - ----------- -------- ----------------- ----------- - ids types new type sys.msg. dupname ids types - ==== ===== ======== ======== ======= ==== ===== ===== - -- -- explicit -- -- new True - -- -- implicit -- -- new False - None False explicit -- -- new True - old False explicit implicit old new True - None True explicit explicit new None True - old True explicit explicit new,old None True [1] - None False implicit implicit new None False - old False implicit implicit new,old None False - None True implicit implicit new None True - old True implicit implicit new old True - ==== ===== ======== ======== ======= ==== ===== ===== - - The table begins with a top border of equals signs with one or more - spaces at each column boundary (regardless of spans). There must - be at least two columns in the table (to differentiate it from - section headers). Each line starts a new row. The rightmost - column is unbounded; text may continue past the edge of the table. - Each row/line must contain spaces at column boundaries, except for - explicit column spans. Underlines of '-' can be used to indicate - column spans, but should be used sparingly if at all. Lines - containing column span underlines may not contain any other text. - The last of the optional head rows is underlined with '=', again - with spaces at column boundaries. The bottom boundary of the table - consists of '=' underlines. A blank line is required following a - table. - - This table sums up the features. Using all the features in such a - small space is not pretty though:: - - ======== ======== ======== - Header 2 & 3 Span - ------------------ - Header 1 Header 2 Header 3 - ======== ======== ======== - Each line is a new row. - Each row consists of one line only. - Row spans are not possible. - The last column may spill over to the right. - Column spans are possible with an underline joining columns. - ---------------------------- - The span is limited to the row above the underline. - ======== ======== ======== - -4. As a variation of alternative 3, bullet list syntax in the first - column could be used to indicate row starts. Multi-line rows are - possible, but row spans are not. For example:: - - ===== ===== - col 1 col 2 - ===== ===== - - 1 Second column of row 1. - - 2 Second column of row 2. - Second line of paragraph. - - 3 Second column of row 3. - - Second paragraph of row 3, - column 2 - ===== ===== - - Column spans would be indicated on the line after the last line of - the row. To indicate a real bullet list within a first-column - cell, simply nest the bullets. - -5. In a further variation, we could simply assume that whitespace in - the first column implies a multi-line row; the text in other - columns is continuation text. For example:: - - ===== ===== - col 1 col 2 - ===== ===== - 1 Second column of row 1. - 2 Second column of row 2. - Second line of paragraph. - 3 Second column of row 3. - - Second paragraph of row 3, - column 2 - ===== ===== - - Limitations of this approach: - - - Cells in the first column are limited to one line of text. - - - Cells in the first column *must* contain some text; blank cells - would lead to a misinterpretation. An empty comment ("..") is - sufficient. - -6. Combining alternative 3 and 4, a bullet list in the first column - could mean multi-line rows, and no bullet list means single-line - rows only. - -Alternatives 1 and 5 has been adopted by reStructuredText. - - -Delimitation of Inline Markup -============================= - -StructuredText specifies that inline markup must begin with -whitespace, precluding such constructs as parenthesized or quoted -emphatic text:: - - "**What?**" she cried. (*exit stage left*) - -The `reStructuredText markup specification`_ allows for such -constructs and disambiguates inline markup through a set of -recognition rules. These recognition rules define the context of -markup start-strings and end-strings, allowing markup characters to be -used in most non-markup contexts without a problem (or a backslash). -So we can say, "Use asterisks (*) around words or phrases to -*emphasisze* them." The '(*)' will not be recognized as markup. This -reduces the need for markup escaping to the point where an escape -character is *almost* (but not quite!) unnecessary. - - -Underlining -=========== - -StructuredText uses '_text_' to indicate underlining. To quote David -Ascher in his 2000-01-21 Doc-SIG mailing list post, "Docstring -grammar: a very revised proposal": - - The tagging of underlined text with _'s is suboptimal. Underlines - shouldn't be used from a typographic perspective (underlines were - designed to be used in manuscripts to communicate to the - typesetter that the text should be italicized -- no well-typeset - book ever uses underlines), and conflict with double-underscored - Python variable names (__init__ and the like), which would get - truncated and underlined when that effect is not desired. Note - that while *complete* markup would prevent that truncation - ('__init__'), I think of docstring markups much like I think of - type annotations -- they should be optional and above all do no - harm. In this case the underline markup does harm. - -Underlining is not part of the reStructuredText specification. - - -Inline Literals -=============== - -StructuredText's markup for inline literals (text left as-is, -verbatim, usually in a monospaced font; as in HTML <TT>) is single -quotes ('literals'). The problem with single quotes is that they are -too often used for other purposes: - -- Apostrophes: "Don't blame me, 'cause it ain't mine, it's Chris'."; - -- Quoting text: - - First Bruce: "Well Bruce, I heard the prime minister use it. - 'S'hot enough to boil a monkey's bum in 'ere your Majesty,' he - said, and she smiled quietly to herself." - - In the UK, single quotes are used for dialogue in published works. - -- String literals: s = '' - -Alternatives:: - - 'text' \'text\' ''text'' "text" \"text\" ""text"" - #text# @text@ `text` ^text^ ``text'' ``text`` - -The examples below contain inline literals, quoted text, and -apostrophes. Each example should evaluate to the following HTML:: - - Some <TT>code</TT>, with a 'quote', "double", ain't it grand? - Does <TT>a[b] = 'c' + "d" + `2^3`</TT> work? - - 0. Some code, with a quote, double, ain't it grand? - Does a[b] = 'c' + "d" + `2^3` work? - 1. Some 'code', with a \'quote\', "double", ain\'t it grand? - Does 'a[b] = \'c\' + "d" + `2^3`' work? - 2. Some \'code\', with a 'quote', "double", ain't it grand? - Does \'a[b] = 'c' + "d" + `2^3`\' work? - 3. Some ''code'', with a 'quote', "double", ain't it grand? - Does ''a[b] = 'c' + "d" + `2^3`'' work? - 4. Some "code", with a 'quote', \"double\", ain't it grand? - Does "a[b] = 'c' + "d" + `2^3`" work? - 5. Some \"code\", with a 'quote', "double", ain't it grand? - Does \"a[b] = 'c' + "d" + `2^3`\" work? - 6. Some ""code"", with a 'quote', "double", ain't it grand? - Does ""a[b] = 'c' + "d" + `2^3`"" work? - 7. Some #code#, with a 'quote', "double", ain't it grand? - Does #a[b] = 'c' + "d" + `2^3`# work? - 8. Some @code@, with a 'quote', "double", ain't it grand? - Does @a[b] = 'c' + "d" + `2^3`@ work? - 9. Some `code`, with a 'quote', "double", ain't it grand? - Does `a[b] = 'c' + "d" + \`2^3\`` work? - 10. Some ^code^, with a 'quote', "double", ain't it grand? - Does ^a[b] = 'c' + "d" + `2\^3`^ work? - 11. Some ``code'', with a 'quote', "double", ain't it grand? - Does ``a[b] = 'c' + "d" + `2^3`'' work? - 12. Some ``code``, with a 'quote', "double", ain't it grand? - Does ``a[b] = 'c' + "d" + `2^3\``` work? - -Backquotes (#9 & #12) are the best choice. They are unobtrusive and -relatviely rarely used (more rarely than ' or ", anyhow). Backquotes -have the connotation of 'quotes', which other options (like carets, -#10) don't. - -Analogously with ``*emph*`` & ``**strong**``, double-backquotes (#12) -could be used for inline literals. If single-backquotes are used for -'interpreted text' (context-sensitive domain-specific descriptive -markup) such as function name hyperlinks in Python docstrings, then -double-backquotes could be used for absolute-literals, wherein no -processing whatsoever takes place. An advantage of double-backquotes -would be that backslash-escaping would no longer be necessary for -embedded single-backquotes; however, embedded double-backquotes (in an -end-string context) would be illegal. See `Backquotes in -Phrase-Links`__ in `Record of reStructuredText Syntax Alternatives`__. - -__ alternatives.html#backquotes-in-phrase-links -__ alternatives.html - -Alternative choices are carets (#10) and TeX-style quotes (#11). For -examples of TeX-style quoting, see -http://www.zope.org/Members/jim/StructuredTextWiki/CustomizingTheDocumentProcessor. - -Some existing uses of backquotes: - -1. As a synonym for repr() in Python. -2. For command-interpolation in shell scripts. -3. Used as open-quotes in TeX code (and carried over into plaintext - by TeXies). - -The inline markup start-string and end-string recognition rules -defined by the `reStructuredText markup specification`_ would allow -all of these cases inside inline literals, with very few exceptions. -As a fallback, literal blocks could handle all cases. - -Outside of inline literals, the above uses of backquotes would require -backslash-escaping. However, these are all prime examples of text -that should be marked up with inline literals. - -If either backquotes or straight single-quotes are used as markup, -TeX-quotes are too troublesome to support, so no special-casing of -TeX-quotes should be done (at least at first). If TeX-quotes have to -be used outside of literals, a single backslash-escaped would suffice: -\``TeX quote''. Ugly, true, but very infrequently used. - -Using literal blocks is a fallback option which removes the need for -backslash-escaping:: - - like this:: - - Here, we can do ``absolutely'' anything `'`'\|/|\ we like! - -No mechanism for inline literals is perfect, just as no escaping -mechanism is perfect. No matter what we use, complicated inline -expressions involving the inline literal quote and/or the backslash -will end up looking ugly. We can only choose the least often ugly -option. - -reStructuredText will use double backquotes for inline literals, and -single backqoutes for interpreted text. - - -Hyperlinks -========== - -There are three forms of hyperlink currently in StructuredText_: - -1. (Absolute & relative URIs.) Text enclosed by double quotes - followed by a colon, a URI, and concluded by punctuation plus white - space, or just white space, is treated as a hyperlink:: - - "Python":http://www.python.org/ - -2. (Absolute URIs only.) Text enclosed by double quotes followed by a - comma, one or more spaces, an absolute URI and concluded by - punctuation plus white space, or just white space, is treated as a - hyperlink:: - - "mail me", mailto:me@mail.com - -3. (Endnotes.) Text enclosed by brackets link to an endnote at the - end of the document: at the beginning of the line, two dots, a - space, and the same text in brackets, followed by the end note - itself:: - - Please refer to the fine manual [GVR2001]. - - .. [GVR2001] Python Documentation, Release 2.1, van Rossum, - Drake, et al., http://www.python.org/doc/ - -The problem with forms 1 and 2 is that they are neither intuitive nor -unobtrusive (they break design goals 5 & 2). They overload -double-quotes, which are too often used in ordinary text (potentially -breaking design goal 4). The brackets in form 3 are also too common -in ordinary text (such as [nested] asides and Python lists like [12]). - -Alternatives: - -1. Have no special markup for hyperlinks. - -2. A. Interpret and mark up hyperlinks as any contiguous text - containing '://' or ':...@' (absolute URI) or '@' (email - address) after an alphanumeric word. To de-emphasize the URI, - simply enclose it in parentheses: - - Python (http://www.python.org/) - - B. Leave special hyperlink markup as a domain-specific extension. - Hyperlinks in ordinary reStructuredText documents would be - required to be standalone (i.e. the URI text inline in the - document text). Processed hyperlinks (where the URI text is - hidden behind the link) are important enough to warrant syntax. - -3. The original Setext_ introduced a mechanism of indirect hyperlinks. - A source link word ('hot word') in the text was given a trailing - underscore:: - - Here is some text with a hyperlink_ built in. - - The hyperlink itself appeared at the end of the document on a line - by itself, beginning with two dots, a space, the link word with a - leading underscore, whitespace, and the URI itself:: - - .. _hyperlink http://www.123.xyz - - Setext used ``underscores_instead_of_spaces_`` for phrase links. - -With some modification, alternative 3 best satisfies the design goals. -It has the advantage of being readable and relatively unobtrusive. -Since each source link must match up to a target, the odd variable -ending in an underscore can be spared being marked up (although it -should generate a "no such link target" warning). The only -disadvantage is that phrase-links aren't possible without some -obtrusive syntax. - -We could achieve phrase-links if we enclose the link text: - -1. in double quotes:: - - "like this"_ - -2. in brackets:: - - [like this]_ - -3. or in backquotes:: - - `like this`_ - -Each gives us somewhat obtrusive markup, but that is unavoidable. The -bracketed syntax (#2) is reminiscent of links on many web pages -(intuitive), although it is somewhat obtrusive. Alternative #3 is -much less obtrusive, and is consistent with interpreted text: the -trailing underscore indicates the interpretation of the phrase, as a -hyperlink. #3 also disambiguates hyperlinks from footnote references. -Alternative #3 wins. - -The same trailing underscore markup can also be used for footnote and -citation references, removing the problem with ordinary bracketed text -and Python lists:: - - Please refer to the fine manual [GVR2000]_. - - .. [GVR2000] Python Documentation, van Rossum, Drake, et al., - http://www.python.org/doc/ - -The two-dots-and-a-space syntax was generalized by Setext for -comments, which are removed from the (visible) processed output. -reStructuredText uses this syntax for comments, footnotes, and link -target, collectively termed "explicit markup". For link targets, in -order to eliminate ambiguity with comments and footnotes, -reStructuredText specifies that a colon always follow the link target -word/phrase. The colon denotes 'maps to'. There is no reason to -restrict target links to the end of the document; they could just as -easily be interspersed. - -Internal hyperlinks (links from one point to another within a single -document) can be expressed by a source link as before, and a target -link with a colon but no URI. In effect, these targets 'map to' the -element immediately following. - -As an added bonus, we now have a perfect candidate for -reStructuredText directives, a simple extension mechanism: explicit -markup containing a single word followed by two colons and whitespace. -The interpretation of subsequent data on the directive line or -following is directive-dependent. - -To summarize:: - - .. This is a comment. - - .. The line below is an example of a directive. - .. version:: 1 - - This is a footnote [1]_. - - This internal hyperlink will take us to the footnotes_ area below. - - Here is a one-word_ external hyperlink. - - Here is `a hyperlink phrase`_. - - .. _footnotes: - .. [1] Footnote text goes here. - - .. external hyperlink target mappings: - .. _one-word: http://www.123.xyz - .. _a hyperlink phrase: http://www.123.xyz - -The presence or absence of a colon after the target link -differentiates an indirect hyperlink from a footnote, respectively. A -footnote requires brackets. Backquotes around a target link word or -phrase are required if the phrase contains a colon, optional -otherwise. - -Below are examples using no markup, the two StructuredText hypertext -styles, and the reStructuredText hypertext style. Each example -contains an indirect link, a direct link, a footnote/endnote, and -bracketed text. In HTML, each example should evaluate to:: - - <P>A <A HREF="http://spam.org">URI</A>, see <A HREF="#eggs2000"> - [eggs2000]</A> (in Bacon [Publisher]). Also see - <A HREF="http://eggs.org">http://eggs.org</A>.</P> - - <P><A NAME="eggs2000">[eggs2000]</A> "Spam, Spam, Spam, Eggs, - Bacon, and Spam"</P> - -1. No markup:: - - A URI http://spam.org, see eggs2000 (in Bacon [Publisher]). - Also see http://eggs.org. - - eggs2000 "Spam, Spam, Spam, Eggs, Bacon, and Spam" - -2. StructuredText absolute/relative URI syntax - ("text":http://www.url.org):: - - A "URI":http://spam.org, see [eggs2000] (in Bacon [Publisher]). - Also see "http://eggs.org":http://eggs.org. - - .. [eggs2000] "Spam, Spam, Spam, Eggs, Bacon, and Spam" - - Note that StructuredText does not recognize standalone URIs, - forcing doubling up as shown in the second line of the example - above. - -3. StructuredText absolute-only URI syntax - ("text", mailto:you@your.com):: - - A "URI", http://spam.org, see [eggs2000] (in Bacon - [Publisher]). Also see "http://eggs.org", http://eggs.org. - - .. [eggs2000] "Spam, Spam, Spam, Eggs, Bacon, and Spam" - -4. reStructuredText syntax:: - - 4. A URI_, see [eggs2000]_ (in Bacon [Publisher]). - Also see http://eggs.org. - - .. _URI: http:/spam.org - .. [eggs2000] "Spam, Spam, Spam, Eggs, Bacon, and Spam" - -The bracketed text '[Publisher]' may be problematic with -StructuredText (syntax 2 & 3). - -reStructuredText's syntax (#4) is definitely the most readable. The -text is separated from the link URI and the footnote, resulting in -cleanly readable text. - -.. _StructuredText: - http://dev.zope.org/Members/jim/StructuredTextWiki/FrontPage -.. _Setext: http://docutils.sourceforge.net/mirror/setext.html -.. _reStructuredText: http://docutils.sourceforge.net/rst.html -.. _detailed description: - http://www.tibsnjoan.demon.co.uk/STNG-format.html -.. _STMinus: http://www.cis.upenn.edu/~edloper/pydoc/stminus.html -.. _StructuredTextNG: - http://dev.zope.org/Members/jim/StructuredTextWiki/StructuredTextNG -.. _README: http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/~checkout~/ - python/python/dist/src/README -.. _Emacs table mode: http://table.sourceforge.net/ -.. _reStructuredText Markup Specification: reStructuredText.html - - -.. - Local Variables: - mode: indented-text - indent-tabs-mode: nil - sentence-end-double-space: t - fill-column: 70 - End: |
