3 author: John MacFarlane
6 license: '[CC-BY-SA 4.0](http://creativecommons.org/licenses/by-sa/4.0/)'
13 Markdown is a plain text format for writing structured documents,
14 based on conventions used for indicating formatting in email and
15 usenet posts. It was developed in 2004 by John Gruber, who wrote
16 the first Markdown-to-HTML converter in Perl, and it soon became
17 ubiquitous. In the next decade, dozens of implementations were
18 developed in many languages. Some extended the original
19 Markdown syntax with conventions for footnotes, tables, and
20 other document elements. Some allowed Markdown documents to be
21 rendered in formats other than HTML. Websites like Reddit,
22 StackOverflow, and GitHub had millions of people using Markdown.
23 And Markdown started to be used beyond the web, to author books,
24 articles, slide shows, letters, and lecture notes.
26 What distinguishes Markdown from many other lightweight markup
27 syntaxes, which are often easier to write, is its readability.
30 > The overriding design goal for Markdown's formatting syntax is
31 > to make it as readable as possible. The idea is that a
32 > Markdown-formatted document should be publishable as-is, as
33 > plain text, without looking like it's been marked up with tags
34 > or formatting instructions.
35 > (<http://daringfireball.net/projects/markdown/>)
37 The point can be illustrated by comparing a sample of
38 [AsciiDoc](http://www.methods.co.nz/asciidoc/) with
39 an equivalent sample of Markdown. Here is a sample of
40 AsciiDoc from the AsciiDoc manual:
45 List item one continued with a second paragraph followed by an
53 List item continued with a third paragraph.
55 2. List item two continued with an open block.
58 This paragraph is part of the preceding list item.
60 a. This list is nested and does not require explicit item
63 This paragraph is part of the preceding list item.
67 This paragraph belongs to item two of the outer list.
71 And here is the equivalent in Markdown:
75 List item one continued with a second paragraph followed by an
81 List item continued with a third paragraph.
83 2. List item two continued with an open block.
85 This paragraph is part of the preceding list item.
87 1. This list is nested and does not require explicit item continuation.
89 This paragraph is part of the preceding list item.
93 This paragraph belongs to item two of the outer list.
96 The AsciiDoc version is, arguably, easier to write. You don't need
97 to worry about indentation. But the Markdown version is much easier
98 to read. The nesting of list items is apparent to the eye in the
99 source, not just in the processed document.
101 ## Why is a spec needed?
103 John Gruber's [canonical description of Markdown's
104 syntax](http://daringfireball.net/projects/markdown/syntax)
105 does not specify the syntax unambiguously. Here are some examples of
106 questions it does not answer:
108 1. How much indentation is needed for a sublist? The spec says that
109 continuation paragraphs need to be indented four spaces, but is
110 not fully explicit about sublists. It is natural to think that
111 they, too, must be indented four spaces, but `Markdown.pl` does
112 not require that. This is hardly a "corner case," and divergences
113 between implementations on this issue often lead to surprises for
114 users in real documents. (See [this comment by John
115 Gruber](http://article.gmane.org/gmane.text.markdown.general/1997).)
117 2. Is a blank line needed before a block quote or heading?
118 Most implementations do not require the blank line. However,
119 this can lead to unexpected results in hard-wrapped text, and
120 also to ambiguities in parsing (note that some implementations
121 put the heading inside the blockquote, while others do not).
122 (John Gruber has also spoken [in favor of requiring the blank
123 lines](http://article.gmane.org/gmane.text.markdown.general/2146).)
125 3. Is a blank line needed before an indented code block?
126 (`Markdown.pl` requires it, but this is not mentioned in the
127 documentation, and some implementations do not require it.)
134 4. What is the exact rule for determining when list items get
135 wrapped in `<p>` tags? Can a list be partially "loose" and partially
136 "tight"? What should we do with a list like this?
155 (There are some relevant comments by John Gruber
156 [here](http://article.gmane.org/gmane.text.markdown.general/2554).)
158 5. Can list markers be indented? Can ordered list markers be right-aligned?
166 6. Is this one list with a thematic break in its second item,
167 or two lists separated by a thematic break?
175 7. When list markers change from numbers to bullets, do we have
176 two lists or one? (The Markdown syntax description suggests two,
177 but the Perl scripts and many other implementations produce one.)
186 8. What are the precedence rules for the markers of inline structure?
187 For example, is the following a valid link, or does the code span
191 [a backtick (`)](/url) and [another backtick (`)](/url).
194 9. What are the precedence rules for markers of emphasis and strong
195 emphasis? For example, how should the following be parsed?
201 10. What are the precedence rules between block-level and inline-level
202 structure? For example, how should the following be parsed?
205 - `a long code span can contain a hyphen like this
206 - and it can screw things up`
209 11. Can list items include section headings? (`Markdown.pl` does not
210 allow this, but does allow blockquotes to include headings.)
216 12. Can list items be empty?
224 13. Can link references be defined inside block quotes or list items?
232 14. If there are multiple definitions for the same reference, which takes
242 In the absence of a spec, early implementers consulted `Markdown.pl`
243 to resolve these ambiguities. But `Markdown.pl` was quite buggy, and
244 gave manifestly bad results in many cases, so it was not a
245 satisfactory replacement for a spec.
247 Because there is no unambiguous spec, implementations have diverged
248 considerably. As a result, users are often surprised to find that
249 a document that renders one way on one system (say, a GitHub wiki)
250 renders differently on another (say, converting to DocBook using
251 pandoc). To make matters worse, because nothing in Markdown counts
252 as a "syntax error," the divergence often isn't discovered right away.
254 ## About this document
256 This document attempts to specify Markdown syntax unambiguously.
257 It contains many examples with side-by-side Markdown and
258 HTML. These are intended to double as conformance tests. An
259 accompanying script `spec_tests.py` can be used to run the tests
260 against any Markdown program:
262 python test/spec_tests.py --spec spec.txt --program PROGRAM
264 Since this document describes how Markdown is to be parsed into
265 an abstract syntax tree, it would have made sense to use an abstract
266 representation of the syntax tree instead of HTML. But HTML is capable
267 of representing the structural distinctions we need to make, and the
268 choice of HTML for the tests makes it possible to run the tests against
269 an implementation without writing an abstract syntax tree renderer.
271 This document is generated from a text file, `spec.txt`, written
272 in Markdown with a small extension for the side-by-side tests.
273 The script `tools/makespec.py` can be used to convert `spec.txt` into
274 HTML or CommonMark (which can then be converted into other formats).
276 In the examples, the `→` character is used to represent tabs.
280 ## Characters and lines
282 Any sequence of [characters] is a valid CommonMark
285 A [character](@) is a Unicode code point. Although some
286 code points (for example, combining accents) do not correspond to
287 characters in an intuitive sense, all code points count as characters
288 for purposes of this spec.
290 This spec does not specify an encoding; it treats lines as composed
291 of [characters] rather than bytes. A conforming parser may be limited
292 to a certain encoding.
294 A [line](@) is a sequence of zero or more [characters]
295 other than newline (`U+000A`) or carriage return (`U+000D`),
296 followed by a [line ending] or by the end of file.
298 A [line ending](@) is a newline (`U+000A`), a carriage return
299 (`U+000D`) not followed by a newline, or a carriage return and a
302 A line containing no characters, or a line containing only spaces
303 (`U+0020`) or tabs (`U+0009`), is called a [blank line](@).
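The definitions of line, line ending, and blank line can be sketched in Python as follows. This is an illustration only, not part of the spec; the names `split_lines` and `is_blank` are hypothetical, and the sketch assumes the input is already a decoded string.

```python
import re

# A line ending is CR+LF, a lone CR, or a lone LF. "\r\n" is listed
# first so that a CR followed by LF counts as a single line ending.
LINE_ENDING = re.compile(r"\r\n|\r|\n")

def split_lines(text):
    """Split text into lines, dropping the line endings themselves."""
    lines = LINE_ENDING.split(text)
    # A final line ending terminates the last line; it does not
    # start an extra empty one.
    if lines and lines[-1] == "":
        lines.pop()
    return lines

def is_blank(line):
    """A blank line has no characters, or only spaces and tabs."""
    return line.strip(" \t") == ""
```

Calling `split_lines("a\r\nb\rc\n")` yields the three lines `a`, `b`, and `c`, whichever line-ending convention each one uses.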
305 The following definitions of character classes will be used in this spec:
307 A [whitespace character](@) is a space
308 (`U+0020`), tab (`U+0009`), newline (`U+000A`), line tabulation (`U+000B`),
309 form feed (`U+000C`), or carriage return (`U+000D`).
311 [Whitespace](@) is a sequence of one or more [whitespace
314 A [Unicode whitespace character](@) is
315 any code point in the Unicode `Zs` class, or a tab (`U+0009`),
316 carriage return (`U+000D`), newline (`U+000A`), or form feed
319 [Unicode whitespace](@) is a sequence of one
320 or more [Unicode whitespace characters].
322 A [space](@) is `U+0020`.
324 A [non-whitespace character](@) is any character
325 that is not a [whitespace character].
327 An [ASCII punctuation character](@)
328 is `!`, `"`, `#`, `$`, `%`, `&`, `'`, `(`, `)`,
329 `*`, `+`, `,`, `-`, `.`, `/`, `:`, `;`, `<`, `=`, `>`, `?`, `@`,
330 `[`, `\`, `]`, `^`, `_`, `` ` ``, `{`, `|`, `}`, or `~`.
332 A [punctuation character](@) is an [ASCII
333 punctuation character] or anything in
334 the Unicode classes `Pc`, `Pd`, `Pe`, `Pf`, `Pi`, `Po`, or `Ps`.
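As a rough illustration, the character classes above can be expressed with Python's standard `unicodedata` module. The function names are hypothetical, and each function assumes it is given a single code point.

```python
import unicodedata

# The 32 ASCII punctuation characters listed above.
ASCII_PUNCTUATION = set("!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~")

# Unicode general categories that count as punctuation.
PUNCTUATION_CATEGORIES = {"Pc", "Pd", "Pe", "Pf", "Pi", "Po", "Ps"}

def is_whitespace_char(ch):
    """Space, tab, newline, line tabulation, form feed, or CR."""
    return ch in " \t\n\x0b\x0c\r"

def is_unicode_whitespace_char(ch):
    """Any Zs code point, or tab, CR, newline, or form feed."""
    return unicodedata.category(ch) == "Zs" or ch in "\t\r\n\x0c"

def is_punctuation_char(ch):
    """ASCII punctuation, or any code point in a P* class above."""
    return (ch in ASCII_PUNCTUATION
            or unicodedata.category(ch) in PUNCTUATION_CATEGORIES)
```

The explicit ASCII set is needed because several ASCII punctuation characters (for example `$`, `+`, and `~`) belong to Unicode symbol categories rather than punctuation categories.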
338 Tabs in lines are not expanded to [spaces]. However,
339 in contexts where whitespace helps to define block structure,
340 tabs behave as if they were replaced by spaces with a tab stop
343 Thus, for example, a tab can be used instead of four spaces
344 in an indented code block. (Note, however, that internal
345 tabs are passed through as literal tabs, not expanded to
348 ```````````````````````````````` example
351 <pre><code>foo→baz→→bim
353 ````````````````````````````````
355 ```````````````````````````````` example
358 <pre><code>foo→baz→→bim
360 ````````````````````````````````
362 ```````````````````````````````` example
369 ````````````````````````````````
371 In the following example, a continuation paragraph of a list
372 item is indented with a tab; this has exactly the same effect
373 as indentation with four spaces would:
375 ```````````````````````````````` example
386 ````````````````````````````````
388 ```````````````````````````````` example
400 ````````````````````````````````
402 Normally the `>` that begins a block quote may be followed
403 optionally by a space, which is not considered part of the
404 content. In the following case `>` is followed by a tab,
405 which is treated as if it were expanded into spaces.
406 Since one of these spaces is considered part of the
407 delimiter, `foo` is considered to be indented six spaces
408 inside the block quote context, so we get an indented
409 code block starting with two spaces.
411 ```````````````````````````````` example
418 ````````````````````````````````
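The tab rule can be sketched as a function that measures indentation in columns without rewriting the line. This is an illustration under stated assumptions: the name `leading_indent_width` and its `start_col` parameter are hypothetical, and the tab stop of 4 is passed as a default argument.

```python
def leading_indent_width(line, start_col=0, tab_stop=4):
    """Return the width, in columns, of the leading spaces and tabs.

    start_col is the column at which `line` begins, which matters
    when a container marker such as `>` has already been consumed.
    """
    col = start_col
    for ch in line:
        if ch == " ":
            col += 1
        elif ch == "\t":
            # A tab advances to the next multiple of tab_stop.
            col += tab_stop - (col % tab_stop)
        else:
            break
    return col - start_col
```

For the block quote case described above, the text after `>` starts at column 1, so `leading_indent_width("\t\tfoo", start_col=1)` gives 7 columns. One of them belongs to the delimiter, leaving the six spaces of indentation mentioned in the text.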
420 ```````````````````````````````` example
429 ````````````````````````````````
432 ```````````````````````````````` example
439 ````````````````````````````````
441 ```````````````````````````````` example
457 ````````````````````````````````
459 ```````````````````````````````` example
463 ````````````````````````````````
465 ```````````````````````````````` example
469 ````````````````````````````````
472 ## Insecure characters
474 For security reasons, the Unicode character `U+0000` must be replaced
475 with the REPLACEMENT CHARACTER (`U+FFFD`).
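A minimal sketch of this replacement in Python, assuming the document is already held as a decoded string (the function name is hypothetical):

```python
def replace_insecure(text):
    """Replace every U+0000 with U+FFFD (REPLACEMENT CHARACTER)."""
    return text.replace("\u0000", "\ufffd")
```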
479 We can think of a document as a sequence of
480 [blocks](@)---structural elements like paragraphs, block
481 quotations, lists, headings, rules, and code blocks. Some blocks (like
482 block quotes and list items) contain other blocks; others (like
483 headings and paragraphs) contain [inline](@) content---text,
484 links, emphasized text, images, code, and so on.
488 Indicators of block structure always take precedence over indicators
489 of inline structure. So, for example, the following is a list with
490 two items, not a list with one item containing a code span:
492 ```````````````````````````````` example
500 ````````````````````````````````
503 This means that parsing can proceed in two steps: first, the block
504 structure of the document can be discerned; second, text lines inside
505 paragraphs, headings, and other block constructs can be parsed for inline
506 structure. The second step requires information about link reference
507 definitions that will be available only at the end of the first
508 step. Note that the first step requires processing lines in sequence,
509 but the second can be parallelized, since the inline parsing of
510 one block element does not affect the inline parsing of any other.
512 ## Container blocks and leaf blocks
514 We can divide blocks into two types:
515 [container block](@)s,
516 which can contain other blocks, and [leaf block](@)s,
521 This section describes the different kinds of leaf block that make up a
526 A line consisting of 0-3 spaces of indentation, followed by a sequence
527 of three or more matching `-`, `_`, or `*` characters, each followed
528 optionally by any number of spaces, forms a
531 ```````````````````````````````` example
539 ````````````````````````````````
544 ```````````````````````````````` example
548 ````````````````````````````````
551 ```````````````````````````````` example
555 ````````````````````````````````
558 Not enough characters:
560 ```````````````````````````````` example
568 ````````````````````````````````
571 One to three spaces indent are allowed:
573 ```````````````````````````````` example
581 ````````````````````````````````
584 Four spaces is too many:
586 ```````````````````````````````` example
591 ````````````````````````````````
594 ```````````````````````````````` example
600 ````````````````````````````````
603 More than three characters may be used:
605 ```````````````````````````````` example
606 _____________________________________
609 ````````````````````````````````
612 Spaces are allowed between the characters:
614 ```````````````````````````````` example
618 ````````````````````````````````
621 ```````````````````````````````` example
625 ````````````````````````````````
628 ```````````````````````````````` example
632 ````````````````````````````````
635 Spaces are allowed at the end:
637 ```````````````````````````````` example
641 ````````````````````````````````
644 However, no other characters may occur in the line:
646 ```````````````````````````````` example
656 ````````````````````````````````
659 It is required that all of the [non-whitespace characters] be the same.
660 So, this is not a thematic break:
662 ```````````````````````````````` example
666 ````````````````````````````````
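The line-level rule for thematic breaks can be approximated with a single regular expression. The sketch below is illustrative only: the name `is_thematic_break` is hypothetical, the line is assumed to have had its line ending removed, and the interaction with setext headings and list items is deliberately left out.

```python
import re

# 0-3 spaces of indentation, then one of - _ *, then at least two
# more occurrences of that same character, each optionally
# followed by spaces, and nothing else on the line.
THEMATIC_BREAK = re.compile(r" {0,3}([-_*]) *(?:\1 *){2,}")

def is_thematic_break(line):
    """True if the line, taken alone, has the form of a thematic break."""
    return THEMATIC_BREAK.fullmatch(line) is not None
```

The backreference `\1` is what enforces the requirement that all non-whitespace characters be the same, so a line such as ` *-*` is rejected.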
669 Thematic breaks do not need blank lines before or after:
671 ```````````````````````````````` example
683 ````````````````````````````````
686 Thematic breaks can interrupt a paragraph:
688 ```````````````````````````````` example
696 ````````````````````````````````
699 If a line of dashes that meets the above conditions for being a
700 thematic break could also be interpreted as the underline of a [setext
701 heading], the interpretation as a
702 [setext heading] takes precedence. Thus, for example,
703 this is a setext heading, not a paragraph followed by a thematic break:
705 ```````````````````````````````` example
712 ````````````````````````````````
715 When both a thematic break and a list item are possible
716 interpretations of a line, the thematic break takes precedence:
718 ```````````````````````````````` example
730 ````````````````````````````````
733 If you want a thematic break in a list item, use a different bullet:
735 ```````````````````````````````` example
745 ````````````````````````````````
751 consists of a string of characters, parsed as inline content, between an
752 opening sequence of 1--6 unescaped `#` characters and an optional
753 closing sequence of any number of unescaped `#` characters.
754 The opening sequence of `#` characters must be followed by a
755 [space] or by the end of line. The optional closing sequence of `#`s must be
756 preceded by a [space] and may be followed by spaces only. The opening
757 `#` character may be indented 0-3 spaces. The raw contents of the
758 heading are stripped of leading and trailing spaces before being parsed
759 as inline content. The heading level is equal to the number of `#`
760 characters in the opening sequence.
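The rules above can be sketched as a small Python function that returns the heading level and raw contents. This is a hedged illustration rather than a reference implementation: the name `parse_atx_heading` is hypothetical, the line is assumed to have no line ending, and the returned contents still need to be parsed as inline content (which is where backslash escapes are resolved).

```python
import re

# 0-3 spaces, 1-6 '#' characters, then spaces or the end of the line.
ATX_OPEN = re.compile(r"^ {0,3}(#{1,6})(?: +|$)")

# Optional closing sequence: a run of '#' at the end of the contents,
# preceded by at least one space or making up the entire contents.
ATX_CLOSE = re.compile(r"(?:^| +)#+$")

def parse_atx_heading(line):
    """Return (level, raw_contents), or None if not an ATX heading."""
    m = ATX_OPEN.match(line)
    if m is None:
        return None
    level = len(m.group(1))
    contents = line[m.end():].rstrip(" ")
    contents = ATX_CLOSE.sub("", contents, count=1)
    return level, contents.strip(" ")
```

Because the closing pattern requires a space or the start of the contents immediately before the run of `#` characters, `# foo#` keeps its trailing `#`, and an escaped sequence such as `\###` is left for the inline parser.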
764 ```````````````````````````````` example
778 ````````````````````````````````
781 More than six `#` characters is not a heading:
783 ```````````````````````````````` example
787 ````````````````````````````````
790 At least one space is required between the `#` characters and the
791 heading's contents, unless the heading is empty. Note that many
792 implementations currently do not require the space. However, the
793 space was required by the
794 [original ATX implementation](http://www.aaronsw.com/2002/atx/atx.py),
795 and it helps prevent things like the following from being parsed as
798 ```````````````````````````````` example
805 ````````````````````````````````
808 This is not a heading, because the first `#` is escaped:
810 ```````````````````````````````` example
814 ````````````````````````````````
817 Contents are parsed as inlines:
819 ```````````````````````````````` example
822 <h1>foo <em>bar</em> *baz*</h1>
823 ````````````````````````````````
826 Leading and trailing blanks are ignored in parsing inline content:
828 ```````````````````````````````` example
832 ````````````````````````````````
835 One to three spaces indentation are allowed:
837 ```````````````````````````````` example
845 ````````````````````````````````
848 Four spaces are too much:
850 ```````````````````````````````` example
855 ````````````````````````````````
858 ```````````````````````````````` example
864 ````````````````````````````````
867 A closing sequence of `#` characters is optional:
869 ```````````````````````````````` example
875 ````````````````````````````````
878 It need not be the same length as the opening sequence:
880 ```````````````````````````````` example
881 # foo ##################################
886 ````````````````````````````````
889 Spaces are allowed after the closing sequence:
891 ```````````````````````````````` example
895 ````````````````````````````````
898 A sequence of `#` characters with anything but [spaces] following it
899 is not a closing sequence, but counts as part of the contents of the
902 ```````````````````````````````` example
906 ````````````````````````````````
909 The closing sequence must be preceded by a space:
911 ```````````````````````````````` example
915 ````````````````````````````````
918 Backslash-escaped `#` characters do not count as part
919 of the closing sequence:
921 ```````````````````````````````` example
929 ````````````````````````````````
932 ATX headings need not be separated from surrounding content by blank
933 lines, and they can interrupt paragraphs:
935 ```````````````````````````````` example
943 ````````````````````````````````
946 ```````````````````````````````` example
954 ````````````````````````````````
957 ATX headings can be empty:
959 ```````````````````````````````` example
967 ````````````````````````````````
972 A [setext heading](@) consists of one or more
973 lines of text, each containing at least one [non-whitespace
974 character], with no more than 3 spaces indentation, followed by
975 a [setext heading underline]. The lines of text must be such
976 that, were they not followed by the setext heading underline,
977 they would be interpreted as a paragraph: they cannot be
978 interpretable as a [code fence], [ATX heading][ATX headings],
979 [block quote][block quotes], [thematic break][thematic breaks],
980 [list item][list items], or [HTML block][HTML blocks].
982 A [setext heading underline](@) is a sequence of
983 `=` characters or a sequence of `-` characters, with no more than 3
984 spaces indentation and any number of trailing spaces. If a line
985 containing a single `-` can be interpreted as an
986 empty [list items], it should be interpreted this way
987 and not as a [setext heading underline].
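The shape of a setext heading underline can be checked with a short regular expression. The sketch below is illustrative: the name `setext_underline_level` is hypothetical, the line is assumed to have no line ending, and the returned level uses the mapping this section gives (`=` for level 1, `-` for level 2). It does not handle the empty-list-item exception for a single `-`, which depends on context.

```python
import re

# 0-3 spaces, a run of '=' or a run of '-', then optional trailing spaces.
SETEXT_UNDERLINE = re.compile(r" {0,3}(=+|-+) *")

def setext_underline_level(line):
    """Return 1 for '=', 2 for '-', or None if not an underline."""
    m = SETEXT_UNDERLINE.fullmatch(line)
    if m is None:
        return None
    return 1 if m.group(1)[0] == "=" else 2
```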
989 The heading is a level 1 heading if `=` characters are used in
990 the [setext heading underline], and a level 2 heading if `-`
991 characters are used. The contents of the heading are the result
992 of parsing the preceding lines of text as CommonMark inline
995 In general, a setext heading need not be preceded or followed by a
996 blank line. However, it cannot interrupt a paragraph, so when a
997 setext heading comes after a paragraph, a blank line is needed between
1002 ```````````````````````````````` example
1009 <h1>Foo <em>bar</em></h1>
1010 <h2>Foo <em>bar</em></h2>
1011 ````````````````````````````````
1014 The content of the heading may span more than one line:
1016 ```````````````````````````````` example
1023 ````````````````````````````````
1026 The underlining can be any length:
1028 ```````````````````````````````` example
1030 -------------------------
1037 ````````````````````````````````
1040 The heading content can be indented up to three spaces, and need
1041 not line up with the underlining:
1043 ```````````````````````````````` example
1056 ````````````````````````````````
1059 Four spaces indent is too much:
1061 ```````````````````````````````` example
1074 ````````````````````````````````
1077 The setext heading underline can be indented up to three spaces, and
1078 may have trailing spaces:
1080 ```````````````````````````````` example
1085 ````````````````````````````````
1088 Four spaces is too much:
1090 ```````````````````````````````` example
1096 ````````````````````````````````
1099 The setext heading underline cannot contain internal spaces:
1101 ```````````````````````````````` example
1112 ````````````````````````````````
1115 Trailing spaces in the content line do not cause a line break:
1117 ```````````````````````````````` example
1122 ````````````````````````````````
1125 Nor does a backslash at the end:
1127 ```````````````````````````````` example
1132 ````````````````````````````````
1135 Since indicators of block structure take precedence over
1136 indicators of inline structure, the following are setext headings:
1138 ```````````````````````````````` example
1149 <h2><a title="a lot</h2>
1150 <p>of dashes"/></p>
1151 ````````````````````````````````
1154 The setext heading underline cannot be a [lazy continuation
1155 line] in a list item or block quote:
1157 ```````````````````````````````` example
1165 ````````````````````````````````
1168 ```````````````````````````````` example
1178 ````````````````````````````````
1181 ```````````````````````````````` example
1189 ````````````````````````````````
1192 A blank line is needed between a paragraph and a following
1193 setext heading, since otherwise the paragraph becomes part
1194 of the heading's content:
1196 ```````````````````````````````` example
1203 ````````````````````````````````
1206 But in general a blank line is not required before or after
1209 ```````````````````````````````` example
1221 ````````````````````````````````
1224 Setext headings cannot be empty:
1226 ```````````````````````````````` example
1231 ````````````````````````````````
1234 Setext heading text lines must not be interpretable as block
1235 constructs other than paragraphs. So, the line of dashes
1236 in these examples gets interpreted as a thematic break:
1238 ```````````````````````````````` example
1244 ````````````````````````````````
1247 ```````````````````````````````` example
1255 ````````````````````````````````
1258 ```````````````````````````````` example
1265 ````````````````````````````````
1268 ```````````````````````````````` example
1276 ````````````````````````````````
1279 If you want a heading with `> foo` as its literal text, you can
1280 use backslash escapes:
1282 ```````````````````````````````` example
1287 ````````````````````````````````
1290 **Compatibility note:** Most existing Markdown implementations
1291 do not allow the text of setext headings to span multiple lines.
1292 But there is no consensus about how to interpret
1301 One can find four different interpretations:
1303 1. paragraph "Foo", heading "bar", paragraph "baz"
1304 2. paragraph "Foo bar", thematic break, paragraph "baz"
1305 3. paragraph "Foo bar --- baz"
1306 4. heading "Foo bar", paragraph "baz"
1308 We find interpretation 4 most natural, and it also
1309 increases the expressive power of CommonMark by allowing
1310 multiline headings. Authors who want interpretation 1 can
1311 put a blank line after the first paragraph:
1313 ```````````````````````````````` example
1323 ````````````````````````````````
1326 Authors who want interpretation 2 can put blank lines around
1329 ```````````````````````````````` example
1341 ````````````````````````````````
1344 or use a thematic break that cannot count as a [setext heading
1347 ```````````````````````````````` example
1357 ````````````````````````````````
1360 Authors who want interpretation 3 can use backslash escapes:
1362 ```````````````````````````````` example
1372 ````````````````````````````````
1375 ## Indented code blocks
1377 An [indented code block](@) is composed of one or more
1378 [indented chunks] separated by blank lines.
1379 An [indented chunk](@) is a sequence of non-blank lines,
1380 each indented four or more spaces. The contents of the code block are
1381 the literal contents of the lines, including trailing
1382 [line endings], minus four spaces of indentation.
1383 An indented code block has no [info string].
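As an illustration of how the contents are derived, the following sketch assumes the caller has already decided which lines belong to the code block (including any blank lines around or between chunks) and that indentation uses spaces only. The name `indented_code_content` is hypothetical.

```python
def indented_code_content(lines):
    """Return the literal contents of an indented code block.

    `lines` are the block's lines without their line endings.
    """
    body = list(lines)
    # Blank lines before the first chunk or after the last one
    # are not part of the code block.
    while body and body[0].strip(" \t") == "":
        body.pop(0)
    while body and body[-1].strip(" \t") == "":
        body.pop()
    out = []
    for line in body:
        # Remove four spaces of indentation; interior blank lines
        # may have fewer, in which case all of them are removed.
        i = 0
        while i < 4 and i < len(line) and line[i] == " ":
            i += 1
        out.append(line[i:] + "\n")
    return "".join(out)
```

Spaces beyond the first four are kept, so a chunk line indented six spaces contributes two leading spaces to the contents.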
1385 An indented code block cannot interrupt a paragraph, so there must be
1386 a blank line between a paragraph and a following indented code block.
1387 (A blank line is not needed, however, between a code block and a following
1390 ```````````````````````````````` example
1397 ````````````````````````````````
1400 If there is any ambiguity between an interpretation of indentation
1401 as a code block and as indicating that material belongs to a [list
1402 item][list items], the list item interpretation takes precedence:
1404 ```````````````````````````````` example
1415 ````````````````````````````````
1418 ```````````````````````````````` example
1431 ````````````````````````````````
1435 The contents of a code block are literal text, and do not get parsed
1438 ```````````````````````````````` example
1444 <pre><code><a/>
1449 ````````````````````````````````
1452 Here we have three chunks separated by blank lines:
1454 ```````````````````````````````` example
1471 ````````````````````````````````
1474 Any initial spaces beyond four will be included in the content, even
1475 in interior blank lines:
1477 ```````````````````````````````` example
1486 ````````````````````````````````
1489 An indented code block cannot interrupt a paragraph. (This
1490 allows hanging indents and the like.)
1492 ```````````````````````````````` example
1499 ````````````````````````````````
1502 However, any non-blank line with fewer than four leading spaces ends
1503 the code block immediately. So a paragraph may occur immediately
1504 after indented code:
1506 ```````````````````````````````` example
1513 ````````````````````````````````
1516 And indented code can occur immediately before and after other kinds of
1519 ```````````````````````````````` example
1534 ````````````````````````````````
1537 The first line can be indented more than four spaces:
1539 ```````````````````````````````` example
1546 ````````````````````````````````
1549 Blank lines preceding or following an indented code block
1550 are not included in it:
1552 ```````````````````````````````` example
1561 ````````````````````````````````
1564 Trailing spaces are included in the code block's content:
1566 ```````````````````````````````` example
1571 ````````````````````````````````
1575 ## Fenced code blocks
1577 A [code fence](@) is a sequence
1578 of at least three consecutive backtick characters (`` ` ``) or
1579 tildes (`~`). (Tildes and backticks cannot be mixed.)
1580 A [fenced code block](@)
1581 begins with a code fence, indented no more than three spaces.
1583 The line with the opening code fence may optionally contain some text
1584 following the code fence; this is trimmed of leading and trailing
1585 spaces and called the [info string](@).
1586 The [info string] may not contain any backtick
1587 characters. (The reason for this restriction is that otherwise
1588 some inline code would be incorrectly interpreted as the
1589 beginning of a fenced code block.)
1591 The content of the code block consists of all subsequent lines, until
1592 a closing [code fence] of the same type as the code block
1593 began with (backticks or tildes), and with at least as many backticks
1594 or tildes as the opening code fence. If the leading code fence is
1595 indented N spaces, then up to N spaces of indentation are removed from
1596 each line of the content (if present). (If a content line is not
1597 indented, it is preserved unchanged. If it is indented less than N
1598 spaces, all of the indentation is removed.)
1600 The closing code fence may be indented up to three spaces, and may be
1601 followed only by spaces, which are ignored. If the end of the
1602 containing block (or document) is reached and no closing code fence
1603 has been found, the code block contains all of the lines after the
1604 opening code fence until the end of the containing block (or
1605 document). (An alternative spec would require backtracking in the
1606 event that a closing code fence is not found. But this makes parsing
1607 much less efficient, and there seems to be no real downside to the
1608 behavior described here.)
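The fence rules above can be sketched with three small helpers. This is an illustration under stated assumptions, not a reference implementation: all three function names are hypothetical, lines are assumed to have no line endings, indentation is assumed to use spaces only, and the info string restriction is applied exactly as worded above (no backticks, for either fence type).

```python
import re

# 0-3 spaces, a run of 3+ backticks or 3+ tildes, then an info
# string that contains no backticks.
OPEN_FENCE = re.compile(r"^( {0,3})(`{3,}|~{3,})([^`]*)$")
# 0-3 spaces, a run of backticks or tildes, then only spaces.
CLOSE_FENCE = re.compile(r"^ {0,3}(`+|~+) *$")

def parse_open_fence(line):
    """Return (indent, fence_char, fence_length, info) or None."""
    m = OPEN_FENCE.match(line)
    if m is None:
        return None
    indent, fence, info = m.groups()
    return len(indent), fence[0], len(fence), info.strip(" ")

def is_closing_fence(line, fence_char, fence_length):
    """True if line closes a block opened with the given fence."""
    m = CLOSE_FENCE.match(line)
    if m is None:
        return False
    run = m.group(1)
    return run[0] == fence_char and len(run) >= fence_length

def strip_fence_indent(line, n):
    """Remove up to n leading spaces from a content line."""
    i = 0
    while i < n and i < len(line) and line[i] == " ":
        i += 1
    return line[i:]
```

A parser built on these helpers would collect content lines until `is_closing_fence` succeeds or the containing block ends, passing each content line through `strip_fence_indent` with the opening fence's indentation.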
1610 A fenced code block may interrupt a paragraph, and does not require
1611 a blank line either before or after.
1613 The content of a code fence is treated as literal text, not parsed
1614 as inlines. The first word of the [info string] is typically used to
1615 specify the language of the code sample, and rendered in the `class`
1616 attribute of the `code` tag. However, this spec does not mandate any
1617 particular treatment of the [info string].
1619 Here is a simple example with backticks:
1621 ```````````````````````````````` example
1630 ````````````````````````````````
1635 ```````````````````````````````` example
1644 ````````````````````````````````
1647 The closing code fence must use the same character as the opening
1650 ```````````````````````````````` example
1659 ````````````````````````````````
1662 ```````````````````````````````` example
1671 ````````````````````````````````
1674 The closing code fence must be at least as long as the opening fence:
1676 ```````````````````````````````` example
1685 ````````````````````````````````
1688 ```````````````````````````````` example
1697 ````````````````````````````````
1700 Unclosed code blocks are closed by the end of the document
1701 (or the enclosing [block quote][block quotes] or [list item][list items]):
1703 ```````````````````````````````` example
1706 <pre><code></code></pre>
1707 ````````````````````````````````
1710 ```````````````````````````````` example
1720 ````````````````````````````````
1723 ```````````````````````````````` example
1734 ````````````````````````````````
1737 A code block can have all empty lines as its content:
1739 ```````````````````````````````` example
1748 ````````````````````````````````
1751 A code block can be empty:
1753 ```````````````````````````````` example
1757 <pre><code></code></pre>
1758 ````````````````````````````````
1761 Fences can be indented. If the opening fence is indented,
1762 content lines will have equivalent opening indentation removed,
1765 ```````````````````````````````` example
1774 ````````````````````````````````
1777 ```````````````````````````````` example
1788 ````````````````````````````````
1791 ```````````````````````````````` example
1802 ````````````````````````````````
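The indentation rule for fenced content can be sketched in a few lines of Python. This is only an illustration, not part of the spec: the function name `strip_fence_indent` and its parameters are hypothetical, and tabs are ignored for simplicity.

```python
def strip_fence_indent(content_line: str, fence_indent: int) -> str:
    """Remove up to `fence_indent` leading spaces from a content line.

    `fence_indent` is the number of spaces (0-3) before the opening fence.
    Lines with less indentation than the fence simply lose what they have.
    """
    removable = 0
    while (
        removable < fence_indent
        and removable < len(content_line)
        and content_line[removable] == " "
    ):
        removable += 1
    return content_line[removable:]
```

A content line indented more than the opening fence keeps its extra spaces; a line indented less is left flush.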
1805 Four spaces of indentation produce an indented code block:
1807 ```````````````````````````````` example
1816 ````````````````````````````````
1819 Closing fences may be indented by 0-3 spaces, and their indentation
1820 need not match that of the opening fence:
1822 ```````````````````````````````` example
1829 ````````````````````````````````
1832 ```````````````````````````````` example
1839 ````````````````````````````````
1842 This is not a closing fence, because it is indented 4 spaces:
1844 ```````````````````````````````` example
1852 ````````````````````````````````
1856 Code fences (opening and closing) cannot contain internal spaces:
1858 ```````````````````````````````` example
1864 ````````````````````````````````
1867 ```````````````````````````````` example
1875 ````````````````````````````````
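The conditions on closing fences described above (same character, at least as long, indented at most three spaces, no internal spaces, no info string) might be checked as follows. This is a minimal sketch under stated assumptions: `is_closing_fence` is a hypothetical helper, the line is assumed to have its line ending already removed, and trailing spaces or tabs after the fence are tolerated.

```python
import re

# A candidate closing fence: 0-3 spaces of indentation, one unbroken run
# of backticks or tildes, then nothing but optional trailing spaces/tabs.
_CLOSING_FENCE = re.compile(r" {0,3}(`+|~+)[ \t]*")


def is_closing_fence(line: str, fence_char: str, fence_len: int) -> bool:
    """Return True if `line` closes a fence opened with `fence_len` copies
    of `fence_char`."""
    match = _CLOSING_FENCE.fullmatch(line)
    if match is None:
        return False
    run = match.group(1)
    return run[0] == fence_char and len(run) >= fence_len
```

Because the pattern must match the whole line, a line such as ```` ``` ``` ```` or ```` ``` aaa ```` is rejected and would be treated as content of the code block.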
1878 Fenced code blocks can interrupt paragraphs, and can be followed
1879 directly by paragraphs, without a blank line between:
1881 ```````````````````````````````` example
1892 ````````````````````````````````
1895 Other blocks can also occur before and after fenced code blocks
1896 without an intervening blank line:
1898 ```````````````````````````````` example
1910 ````````````````````````````````
1913 An [info string] can be provided after the opening code fence.
1914 Opening and closing spaces will be stripped, and the first word, prefixed
1915 with `language-`, is used as the value for the `class` attribute of the
1916 `code` element within the enclosing `pre` element.
1918 ```````````````````````````````` example
1925 <pre><code class="language-ruby">def foo(x)
1929 ````````````````````````````````
1932 ```````````````````````````````` example
1933 ~~~~ ruby startline=3 $%@#$
1939 <pre><code class="language-ruby">def foo(x)
1943 ````````````````````````````````
1946 ```````````````````````````````` example
1950 <pre><code class="language-;"></code></pre>
1951 ````````````````````````````````
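The mapping from info string to `class` attribute shown in these examples can be sketched as below. The helper name `code_class` is hypothetical, and the sketch deliberately leaves out backslash escapes, entity references, and HTML escaping of the result.

```python
from typing import Optional


def code_class(info_string: str) -> Optional[str]:
    """Return the `class` value for the `code` element, or None when the
    info string is empty after stripping surrounding whitespace."""
    words = info_string.strip().split()
    if not words:
        return None
    # Only the first word is used; the rest of the info string is ignored.
    return "language-" + words[0]
```

Since this spec does not mandate any particular treatment of the info string, a renderer is free to use the remaining words in some other way.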
1954 [Info strings] for backtick code blocks cannot contain backticks:
1956 ```````````````````````````````` example
1962 ````````````````````````````````
1965 Closing code fences cannot have [info strings]:
1967 ```````````````````````````````` example
1974 ````````````````````````````````
1980 An [HTML block](@) is a group of lines that is treated
1981 as raw HTML (and will not be escaped in HTML output).
1983 There are seven kinds of [HTML block], which can be defined
1984 by their start and end conditions. The block begins with a line that
1985 meets a [start condition](@) (after up to three spaces of
1986 optional indentation). It ends with the first subsequent line that
1987 meets a matching [end condition](@), or the last line of
1988 the document (or other [container block]), if no line is encountered that meets the
1989 [end condition]. If the first line meets both the [start condition]
1990 and the [end condition], the block will contain just that line.
1992 1. **Start condition:** line begins with the string `<script`,
1993 `<pre`, or `<style` (case-insensitive), followed by whitespace,
1994 the string `>`, or the end of the line.\
1995 **End condition:** line contains an end tag
1996 `</script>`, `</pre>`, or `</style>` (case-insensitive; it
1997 need not match the start tag).
1999 2. **Start condition:** line begins with the string `<!--`.\
2000 **End condition:** line contains the string `-->`.
2002 3. **Start condition:** line begins with the string `<?`.\
2003 **End condition:** line contains the string `?>`.
2005 4. **Start condition:** line begins with the string `<!`
2006 followed by an uppercase ASCII letter.\
2007 **End condition:** line contains the character `>`.
2009 5. **Start condition:** line begins with the string
2011 **End condition:** line contains the string `]]>`.
2013 6. **Start condition:** line begins with the string `<` or `</`
2014 followed by one of the strings (case-insensitive) `address`,
2015 `article`, `aside`, `base`, `basefont`, `blockquote`, `body`,
2016 `caption`, `center`, `col`, `colgroup`, `dd`, `details`, `dialog`,
2017 `dir`, `div`, `dl`, `dt`, `fieldset`, `figcaption`, `figure`,
2018 `footer`, `form`, `frame`, `frameset`,
2019 `h1`, `h2`, `h3`, `h4`, `h5`, `h6`, `head`, `header`, `hr`,
2020 `html`, `iframe`, `legend`, `li`, `link`, `main`, `menu`, `menuitem`,
2021 `meta`, `nav`, `noframes`, `ol`, `optgroup`, `option`, `p`, `param`,
2022 `section`, `source`, `summary`, `table`, `tbody`, `td`,
2023 `tfoot`, `th`, `thead`, `title`, `tr`, `track`, `ul`, followed
2024 by [whitespace], the end of the line, the string `>`, or
2026 **End condition:** line is followed by a [blank line].
2028 7. **Start condition:** line begins with a complete [open tag]
2029 or [closing tag] (with any [tag name] other than `script`,
2030 `style`, or `pre`) followed only by [whitespace]
2031 or the end of the line.\
2032 **End condition:** line is followed by a [blank line].
2034 All types of [HTML blocks] except type 7 may interrupt
2035 a paragraph. Blocks of type 7 may not interrupt a paragraph.
2036 (This restriction is intended to prevent unwanted interpretation
2037 of long tags inside a wrapped paragraph as starting HTML blocks.)
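The start conditions for block types 1 to 6 lend themselves to a table of patterns tried in order. The following Python sketch is one possible reading, not a reference implementation: the names `BLOCK_TAGS`, `START_CONDITIONS`, and `html_block_start` are hypothetical, type 7 is left out because it requires a full [open tag] or [closing tag] parser, and the input line is assumed to have no line ending.

```python
import re
from typing import Optional

# Tag names that trigger a type 6 HTML block.
BLOCK_TAGS = (
    "address article aside base basefont blockquote body caption center "
    "col colgroup dd details dialog dir div dl dt fieldset figcaption "
    "figure footer form frame frameset h1 h2 h3 h4 h5 h6 head header hr "
    "html iframe legend li link main menu menuitem meta nav noframes ol "
    "optgroup option p param section source summary table tbody td tfoot "
    "th thead title tr track ul"
).split()

# (block type, pattern matched at the start of the de-indented line)
START_CONDITIONS = [
    (1, re.compile(r"<(?:script|pre|style)(?:\s|>|$)", re.IGNORECASE)),
    (2, re.compile(r"<!--")),
    (3, re.compile(r"<\?")),
    (4, re.compile(r"<![A-Z]")),
    (5, re.compile(r"<!\[CDATA\[")),
    (6, re.compile(
        r"</?(?:" + "|".join(BLOCK_TAGS) + r")(?:\s|/?>|$)", re.IGNORECASE)),
]


def html_block_start(line: str) -> Optional[int]:
    """Return the HTML block type (1-6) that `line` starts, or None."""
    stripped = line.lstrip(" ")
    if len(line) - len(stripped) > 3:
        return None  # four or more spaces: not an HTML block start
    for kind, pattern in START_CONDITIONS:
        if pattern.match(stripped):
            return kind
    return None
```

For instance, `html_block_start("<DIV CLASS")` yields `6`, while `html_block_start("<del>")` yields `None` here, because `del` is not in the type 6 list and type 7 is not modelled.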
2039 Some simple examples follow. Here are some basic HTML blocks
2042 ```````````````````````````````` example
2061 ````````````````````````````````
2064 ```````````````````````````````` example
2072 ````````````````````````````````
2075 A block can also start with a closing tag:
2077 ```````````````````````````````` example
2083 ````````````````````````````````
2086 Here we have two HTML blocks with a Markdown paragraph between them:
2088 ```````````````````````````````` example
2096 <p><em>Markdown</em></p>
2098 ````````````````````````````````
2101 The tag on the first line can be partial, as long
2102 as it is split where there would be whitespace:
2104 ```````````````````````````````` example
2112 ````````````````````````````````
2115 ```````````````````````````````` example
2116 <div id="foo" class="bar
2120 <div id="foo" class="bar
2123 ````````````````````````````````
2126 An open tag need not be closed:
2127 ```````````````````````````````` example
2136 ````````````````````````````````
2140 A partial tag need not even be completed (garbage
2143 ```````````````````````````````` example
2149 ````````````````````````````````
2152 ```````````````````````````````` example
2158 ````````````````````````````````
2161 The initial tag doesn't even need to be a valid
2162 tag, as long as it starts like one:
2164 ```````````````````````````````` example
2170 ````````````````````````````````
2173 In type 6 blocks, the initial tag need not be on a line by
2176 ```````````````````````````````` example
2177 <div><a href="bar">*foo*</a></div>
2179 <div><a href="bar">*foo*</a></div>
2180 ````````````````````````````````
2183 ```````````````````````````````` example
2191 ````````````````````````````````
2194 Everything until the next blank line or end of document
2195 gets included in the HTML block. So, in the following
2196 example, what looks like a Markdown code block
2197 is actually part of the HTML block, which continues until a blank
2198 line or the end of the document is reached:
2200 ```````````````````````````````` example
2210 ````````````````````````````````
2213 To start an [HTML block] with a tag that is *not* in the
2214 list of block-level tags in (6), you must put the tag by
2215 itself on the first line (and it must be complete):
2217 ```````````````````````````````` example
2225 ````````````````````````````````
2228 In type 7 blocks, the [tag name] can be anything:
2230 ```````````````````````````````` example
2238 ````````````````````````````````
2241 ```````````````````````````````` example
2249 ````````````````````````````````
2252 ```````````````````````````````` example
2258 ````````````````````````````````
2261 These rules are designed to allow us to work with tags that
2262 can function as either block-level or inline-level tags.
2263 The `<del>` tag is a nice example. We can surround content with
2264 `<del>` tags in three different ways. In this case, we get a raw
2265 HTML block, because the `<del>` tag is on a line by itself:
2267 ```````````````````````````````` example
2275 ````````````````````````````````
2278 In this case, we get a raw HTML block that just includes
2279 the `<del>` tag (because it ends with the following blank
2280 line). So the contents get interpreted as CommonMark:
2282 ```````````````````````````````` example
2292 ````````````````````````````````
2295 Finally, in this case, the `<del>` tags are interpreted
2296 as [raw HTML] *inside* the CommonMark paragraph. (Because
2297 the tag is not on a line by itself, we get inline HTML
2298 rather than an [HTML block].)
2300 ```````````````````````````````` example
2303 <p><del><em>foo</em></del></p>
2304 ````````````````````````````````
2307 HTML tags designed to contain literal content
2308 (`script`, `style`, `pre`), comments, processing instructions,
2309 and declarations are treated somewhat differently.
2310 Instead of ending at the first blank line, these blocks
2311 end at the first line containing a corresponding end tag.
2312 As a result, these blocks can contain blank lines:
2316 ```````````````````````````````` example
2317 <pre language="haskell"><code>
2318 import Text.HTML.TagSoup
2321 main = print $ parseTags tags
2325 <pre language="haskell"><code>
2326 import Text.HTML.TagSoup
2329 main = print $ parseTags tags
2332 ````````````````````````````````
2335 A script tag (type 1):
2337 ```````````````````````````````` example
2338 <script type="text/javascript">
2339 // JavaScript example
2341 document.getElementById("demo").innerHTML = "Hello JavaScript!";
2345 <script type="text/javascript">
2346 // JavaScript example
2348 document.getElementById("demo").innerHTML = "Hello JavaScript!";
2351 ````````````````````````````````
2354 A style tag (type 1):
2356 ```````````````````````````````` example
2372 ````````````````````````````````
2375 If there is no matching end tag, the block will end at the
2376 end of the document (or the enclosing [block quote][block quotes]
2377 or [list item][list items]):
2379 ```````````````````````````````` example
2389 ````````````````````````````````
2392 ```````````````````````````````` example
2403 ````````````````````````````````
2406 ```````````````````````````````` example
2416 ````````````````````````````````
2419 The end tag can occur on the same line as the start tag:
2421 ```````````````````````````````` example
2422 <style>p{color:red;}</style>
2425 <style>p{color:red;}</style>
2427 ````````````````````````````````
2430 ```````````````````````````````` example
2436 ````````````````````````````````
2439 Note that anything on the last line after the
2440 end tag will be included in the [HTML block]:
2442 ```````````````````````````````` example
2450 ````````````````````````````````
2455 ```````````````````````````````` example
2467 ````````````````````````````````
2471 A processing instruction (type 3):
2473 ```````````````````````````````` example
2487 ````````````````````````````````
2490 A declaration (type 4):
2492 ```````````````````````````````` example
2496 ````````````````````````````````
2501 ```````````````````````````````` example
2503 function matchwo(a,b)
2505 if (a < b && a < 0) then {
2517 function matchwo(a,b)
2519 if (a < b && a < 0) then {
2529 ````````````````````````````````
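For block types 1 to 5, finding the end of the block is a forward scan for the first line that contains the end marker, starting with the opening line itself. A minimal sketch, assuming the block type is already known and the lines carry no line endings; `END_CONDITIONS` and `html_block_end_index` are hypothetical names, and enclosing containers are not modelled:

```python
import re
from typing import List

# End conditions for HTML block types 1-5: the line merely has to
# *contain* the marker, and for type 1 the end tag need not match the
# start tag.
END_CONDITIONS = {
    1: re.compile(r"</(?:script|pre|style)>", re.IGNORECASE),
    2: re.compile(r"-->"),
    3: re.compile(r"\?>"),
    4: re.compile(r">"),
    5: re.compile(r"\]\]>"),
}


def html_block_end_index(lines: List[str], start: int, kind: int) -> int:
    """Return the index of the last line of the block opening at `start`.

    The opening line is tested too, so a block can be a single line.
    If no line meets the end condition, the block runs to the last line.
    """
    pattern = END_CONDITIONS[kind]
    for index in range(start, len(lines)):
        if pattern.search(lines[index]):
            return index
    return len(lines) - 1
```

Blank lines play no role in this scan, which is why these blocks can contain them.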
2532 The opening tag can be indented 1-3 spaces, but not 4:
2534 ```````````````````````````````` example
2540 <pre><code><!-- foo -->
2542 ````````````````````````````````
2545 ```````````````````````````````` example
2551 <pre><code><div>
2553 ````````````````````````````````
2556 An HTML block of types 1--6 can interrupt a paragraph, and need not be
2557 preceded by a blank line.
2559 ```````````````````````````````` example
2569 ````````````````````````````````
2572 However, a following blank line is needed, except at the end of
2573 a document, and except for blocks of types 1--5, above:
2575 ```````````````````````````````` example
2585 ````````````````````````````````
2588 HTML blocks of type 7 cannot interrupt a paragraph:
2590 ```````````````````````````````` example
2598 ````````````````````````````````
2601 This rule differs from John Gruber's original Markdown syntax
2602 specification, which says:
2604 > The only restrictions are that block-level HTML elements —
2605 > e.g. `<div>`, `<table>`, `<pre>`, `<p>`, etc. — must be separated from
2606 > surrounding content by blank lines, and the start and end tags of the
2607 > block should not be indented with tabs or spaces.
2609 In some ways Gruber's rule is more restrictive than the one given
2612 - It requires that an HTML block be preceded by a blank line.
2613 - It does not allow the start tag to be indented.
2614 - It requires a matching end tag, which it also does not allow to
2617 Most Markdown implementations (including some of Gruber's own) do not
2618 respect all of these restrictions.
2620 There is one respect, however, in which Gruber's rule is more liberal
2621 than the one given here, since it allows blank lines to occur inside
2622 an HTML block. There are two reasons for disallowing them here.
2623 First, it removes the need to parse balanced tags, which is
2624 expensive and can require backtracking from the end of the document
2625 if no matching end tag is found. Second, it provides a very simple
2626 and flexible way of including Markdown content inside HTML tags:
2627 simply separate the Markdown from the HTML using blank lines:
2631 ```````````````````````````````` example
2639 <p><em>Emphasized</em> text.</p>
2641 ````````````````````````````````
2644 ```````````````````````````````` example
2652 ````````````````````````````````
2655 Some Markdown implementations have adopted a convention of
2656 interpreting content inside tags as text if the open tag has
2657 the attribute `markdown=1`. The rule given above seems a simpler and
2658 more elegant way of achieving the same expressive power, which is also
2659 much simpler to parse.
2661 The main potential drawback is that one can no longer paste HTML
2662 blocks into Markdown documents with 100% reliability. However,
2663 *in most cases* this will work fine, because the blank lines in
2664 HTML are usually followed by HTML block tags. For example:
2666 ```````````````````````````````` example
2686 ````````````````````````````````
2689 There are problems, however, if the inner tags are indented
2690 *and* separated by spaces, as then they will be interpreted as
2691 an indented code block:
2693 ```````````````````````````````` example
2708 <pre><code><td>
2714 ````````````````````````````````
2717 Fortunately, blank lines are usually not necessary and can be
2718 deleted. The exception is inside `<pre>` tags, but as described
2719 above, raw HTML blocks starting with `<pre>` *can* contain blank
2722 ## Link reference definitions
2724 A [link reference definition](@)
2725 consists of a [link label], indented up to three spaces, followed
2726 by a colon (`:`), optional [whitespace] (including up to one
2727 [line ending]), a [link destination],
2728 optional [whitespace] (including up to one
2729 [line ending]), and an optional [link
2730 title], which if it is present must be separated
2731 from the [link destination] by [whitespace].
2732 No further [non-whitespace characters] may occur on the line.
2734 A [link reference definition]
2735 does not correspond to a structural element of a document. Instead, it
2736 defines a label which can be used in [reference links]
2737 and reference-style [images] elsewhere in the document. [Link
2738 reference definitions] can come either before or after the links that use
2741 ```````````````````````````````` example
2746 <p><a href="/url" title="title">foo</a></p>
2747 ````````````````````````````````
2750 ```````````````````````````````` example
2757 <p><a href="/url" title="the title">foo</a></p>
2758 ````````````````````````````````
2761 ```````````````````````````````` example
2762 [Foo*bar\]]:my_(url) 'title (with parens)'
2766 <p><a href="my_(url)" title="title (with parens)">Foo*bar]</a></p>
2767 ````````````````````````````````
2770 ```````````````````````````````` example
2777 <p><a href="my%20url" title="title">Foo bar</a></p>
2778 ````````````````````````````````
2781 The title may extend over multiple lines:
2783 ```````````````````````````````` example
2792 <p><a href="/url" title="
2797 ````````````````````````````````
2800 However, it may not contain a [blank line]:
2802 ```````````````````````````````` example
2809 <p>[foo]: /url 'title</p>
2810 <p>with blank line'</p>
2812 ````````````````````````````````
2815 The title may be omitted:
2817 ```````````````````````````````` example
2823 <p><a href="/url">foo</a></p>
2824 ````````````````````````````````
2827 The link destination may not be omitted:
2829 ```````````````````````````````` example
2836 ````````````````````````````````
2839 Both title and destination can contain backslash escapes
2840 and literal backslashes:
2842 ```````````````````````````````` example
2843 [foo]: /url\bar\*baz "foo\"bar\baz"
2847 <p><a href="/url%5Cbar*baz" title="foo"bar\baz">foo</a></p>
2848 ````````````````````````````````
2851 A link can come before its corresponding definition:
2853 ```````````````````````````````` example
2858 <p><a href="url">foo</a></p>
2859 ````````````````````````````````
2862 If there are several matching definitions, the first one takes
2865 ```````````````````````````````` example
2871 <p><a href="first">foo</a></p>
2872 ````````````````````````````````
2875 As noted in the section on [Links], matching of labels is
2876 case-insensitive (see [matches]).
2878 ```````````````````````````````` example
2883 <p><a href="/url">Foo</a></p>
2884 ````````````````````````````````
2887 ```````````````````````````````` example
2892 <p><a href="/%CF%86%CE%BF%CF%85">αγω</a></p>
2893 ````````````````````````````````
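The two lookup rules just illustrated (the first definition wins, and labels match case-insensitively) can be sketched with a small map keyed on a normalized label. This is an illustration only: `normalize_label` and `ReferenceMap` are hypothetical names, and the normalization used here (strip, collapse internal whitespace, Unicode case fold via `str.casefold`) is an assumption standing in for the definition of [matches].

```python
import re
from typing import Dict, Optional, Tuple

Definition = Tuple[str, Optional[str]]  # (destination, title)


def normalize_label(label: str) -> str:
    """Normalize a link label (without brackets) for comparison."""
    return re.sub(r"\s+", " ", label.strip()).casefold()


class ReferenceMap:
    """Stores link reference definitions for a whole document."""

    def __init__(self) -> None:
        self._definitions: Dict[str, Definition] = {}

    def define(self, label: str, destination: str,
               title: Optional[str] = None) -> None:
        # setdefault keeps an existing entry, so the first definition
        # for a given label takes precedence over later ones.
        self._definitions.setdefault(
            normalize_label(label), (destination, title))

    def lookup(self, label: str) -> Optional[Definition]:
        return self._definitions.get(normalize_label(label))
```

Because definitions are collected for the entire document before links are resolved, a link can come before its corresponding definition.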
2896 Here is a link reference definition with no corresponding link.
2897 It contributes nothing to the document.
2899 ```````````````````````````````` example
2902 ````````````````````````````````
2905 Here is another one:
2907 ```````````````````````````````` example
2914 ````````````````````````````````
2917 This is not a link reference definition, because there are
2918 [non-whitespace characters] after the title:
2920 ```````````````````````````````` example
2921 [foo]: /url "title" ok
2923 <p>[foo]: /url "title" ok</p>
2924 ````````````````````````````````
2927 This is a link reference definition, but it has no title:
2929 ```````````````````````````````` example
2933 <p>"title" ok</p>
2934 ````````````````````````````````
2937 This is not a link reference definition, because it is indented
2940 ```````````````````````````````` example
2945 <pre><code>[foo]: /url "title"
2948 ````````````````````````````````
2951 This is not a link reference definition, because it occurs inside
2954 ```````````````````````````````` example
2961 <pre><code>[foo]: /url
2964 ````````````````````````````````
2967 A [link reference definition] cannot interrupt a paragraph.
2969 ```````````````````````````````` example
2978 ````````````````````````````````
2981 However, it can directly follow other block elements, such as headings
2982 and thematic breaks, and it need not be followed by a blank line.
2984 ```````````````````````````````` example
2989 <h1><a href="/url">Foo</a></h1>
2993 ````````````````````````````````
2996 Several [link reference definitions]
2997 can occur one after another, without intervening blank lines.
2999 ```````````````````````````````` example
3000 [foo]: /foo-url "foo"
3009 <p><a href="/foo-url" title="foo">foo</a>,
3010 <a href="/bar-url" title="bar">bar</a>,
3011 <a href="/baz-url">baz</a></p>
3012 ````````````````````````````````
3015 [Link reference definitions] can occur
3016 inside block containers, like lists and block quotations. They
3017 affect the entire document, not just the container in which they
3020 ```````````````````````````````` example
3025 <p><a href="/url">foo</a></p>
3028 ````````````````````````````````
3034 A sequence of non-blank lines that cannot be interpreted as other
3035 kinds of blocks forms a [paragraph](@).
3036 The contents of the paragraph are the result of parsing the
3037 paragraph's raw content as inlines. The paragraph's raw content
3038 is formed by concatenating the lines and removing initial and final
3041 A simple example with two paragraphs:
3043 ```````````````````````````````` example
3050 ````````````````````````````````
3053 Paragraphs can contain multiple lines, but no blank lines:
3055 ```````````````````````````````` example
3066 ````````````````````````````````
3069 Multiple blank lines between paragraphs have no effect:
3071 ```````````````````````````````` example
3079 ````````````````````````````````
3082 Leading spaces are skipped:
3084 ```````````````````````````````` example
3090 ````````````````````````````````
3093 Lines after the first may be indented any amount, since indented
3094 code blocks cannot interrupt paragraphs.
3096 ```````````````````````````````` example
3104 ````````````````````````````````
3107 However, the first line may be indented at most three spaces,
3108 or an indented code block will be triggered:
3110 ```````````````````````````````` example
3116 ````````````````````````````````
3119 ```````````````````````````````` example
3126 ````````````````````````````````
3129 Final spaces are stripped before inline parsing, so a paragraph
3130 that ends with two or more spaces will not end with a [hard line
3133 ```````````````````````````````` example
3139 ````````````````````````````````
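The way a paragraph's raw content is assembled can be sketched as follows. The function name `paragraph_raw_content` is hypothetical, the lines are assumed to have their line endings removed, and stripping leading whitespace from every line is an assumption based on the examples above.

```python
from typing import List


def paragraph_raw_content(lines: List[str]) -> str:
    """Join paragraph lines into the raw content handed to inline parsing."""
    # Leading whitespace on each line is skipped; trailing whitespace on
    # interior lines is kept so that inline parsing can still see it.
    joined = "\n".join(line.lstrip(" \t") for line in lines)
    # Initial and final whitespace of the whole paragraph is removed.
    return joined.strip()
```

Trailing spaces on an interior line survive and remain available to inline parsing, whereas those on the last line are removed before inline parsing begins.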
3144 [Blank lines] between block-level elements are ignored,
3145 except for the role they play in determining whether a [list]
3146 is [tight] or [loose].
3148 Blank lines at the beginning and end of the document are also ignored.
3150 ```````````````````````````````` example
3162 ````````````````````````````````
3168 A [container block] is a block that has other
3169 blocks as its contents. There are two basic kinds of container blocks:
3170 [block quotes] and [list items].
3171 [Lists] are meta-containers for [list items].
3173 We define the syntax for container blocks recursively. The general
3174 form of the definition is:
3176 > If X is a sequence of blocks, then the result of
3177 > transforming X in such-and-such a way is a container of type Y
3178 > with these blocks as its content.
3180 So, we explain what counts as a block quote or list item by explaining
3181 how these can be *generated* from their contents. This should suffice
3182 to define the syntax, although it does not give a recipe for *parsing*
3183 these constructions. (A recipe is provided below in the section entitled
3184 [A parsing strategy](#appendix-a-parsing-strategy).)
3188 A [block quote marker](@)
3189 consists of 0-3 spaces of initial indent, plus (a) the character `>` together
3190 with a following space, or (b) a single character `>` not followed by a space.
3192 The following rules define [block quotes]:
3194 1. **Basic case.** If a string of lines *Ls* constitute a sequence
3195 of blocks *Bs*, then the result of prepending a [block quote
3196 marker] to the beginning of each line in *Ls*
3197 is a [block quote](#block-quotes) containing *Bs*.
3199 2. **Laziness.** If a string of lines *Ls* constitute a [block
3200 quote](#block-quotes) with contents *Bs*, then the result of deleting
3201 the initial [block quote marker] from one or
3202 more lines in which the next [non-whitespace character] after the [block
3203 quote marker] is [paragraph continuation
3204 text] is a block quote with *Bs* as its content.
3205 [Paragraph continuation text](@) is text
3206 that will be parsed as part of the content of a paragraph, but does
3207 not occur at the beginning of the paragraph.
3209 3. **Consecutiveness.** A document cannot contain two [block
3210 quotes] in a row unless there is a [blank line] between them.
3212 Nothing else counts as a [block quote](#block-quotes).
3214 Here is a simple example:
3216 ```````````````````````````````` example
3226 ````````````````````````````````
3229 The spaces after the `>` characters can be omitted:
3231 ```````````````````````````````` example
3241 ````````````````````````````````
3244 The `>` characters can be indented 1-3 spaces:
3246 ```````````````````````````````` example
3256 ````````````````````````````````
3259 Four spaces give us a code block:
3261 ```````````````````````````````` example
3266 <pre><code>> # Foo
3270 ````````````````````````````````
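Recognizing and removing a [block quote marker] from a single line can be sketched with one regular expression. The helper name `strip_block_quote_marker` is hypothetical, and tabs after the `>` are not handled in this simplified version.

```python
import re
from typing import Optional

# 0-3 spaces of initial indent, the `>` character, and one optional
# following space that counts as part of the marker.
_BLOCK_QUOTE_MARKER = re.compile(r" {0,3}> ?")


def strip_block_quote_marker(line: str) -> Optional[str]:
    """Return the line with its block quote marker removed, or None if
    the line does not begin with a marker."""
    match = _BLOCK_QUOTE_MARKER.match(line)
    if match is None:
        return None
    return line[match.end():]
```

Only one space after the `>` belongs to the marker, so `strip_block_quote_marker(">     code")` returns the text `code` preceded by four spaces, which is then an indented code block inside the quote.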
3273 The Laziness clause allows us to omit the `>` before
3274 [paragraph continuation text]:
3276 ```````````````````````````````` example
3286 ````````````````````````````````
3289 A block quote can contain some lazy and some non-lazy
3292 ```````````````````````````````` example
3302 ````````````````````````````````
3305 Laziness only applies to lines that would have been continuations of
3306 paragraphs had they been prepended with [block quote markers].
3307 For example, the `> ` cannot be omitted in the second line of
3314 without changing the meaning:
3316 ```````````````````````````````` example
3324 ````````````````````````````````
3327 Similarly, if we omit the `> ` in the second line of
3334 then the block quote ends after the first line:
3336 ```````````````````````````````` example
3348 ````````````````````````````````
3351 For the same reason, we can't omit the `> ` in front of
3352 subsequent lines of an indented or fenced code block:
3354 ```````````````````````````````` example
3364 ````````````````````````````````
3367 ```````````````````````````````` example
3373 <pre><code></code></pre>
3376 <pre><code></code></pre>
3377 ````````````````````````````````
3380 Note that in the following case, we have a [lazy
3383 ```````````````````````````````` example
3391 ````````````````````````````````
3394 To see why, note that in
3401 the `- bar` is indented too far to start a list, and can't
3402 be an indented code block because indented code blocks cannot
3403 interrupt paragraphs, so it is [paragraph continuation text].
3405 A block quote can be empty:
3407 ```````````````````````````````` example
3412 ````````````````````````````````
3415 ```````````````````````````````` example
3422 ````````````````````````````````
3425 A block quote can have initial or final blank lines:
3427 ```````````````````````````````` example
3435 ````````````````````````````````
3438 A blank line always separates block quotes:
3440 ```````````````````````````````` example
3451 ````````````````````````````````
3454 (Most current Markdown implementations, including John Gruber's
3455 original `Markdown.pl`, will parse this example as a single block quote
3456 with two paragraphs. But it seems better to allow the author to decide
3457 whether two block quotes or one are wanted.)
3459 Consecutiveness means that if we put these block quotes together,
3460 we get a single block quote:
3462 ```````````````````````````````` example
3470 ````````````````````````````````
3473 To get a block quote with two paragraphs, use:
3475 ```````````````````````````````` example
3484 ````````````````````````````````
3487 Block quotes can interrupt paragraphs:
3489 ```````````````````````````````` example
3497 ````````````````````````````````
3500 In general, blank lines are not needed before or after block
3503 ```````````````````````````````` example
3515 ````````````````````````````````
3518 However, because of laziness, a blank line is needed between
3519 a block quote and a following paragraph:
3521 ```````````````````````````````` example
3529 ````````````````````````````````
3532 ```````````````````````````````` example
3541 ````````````````````````````````
3544 ```````````````````````````````` example
3553 ````````````````````````````````
3556 It is a consequence of the Laziness rule that any number
3557 of initial `>`s may be omitted on a continuation line of a
3560 ```````````````````````````````` example
3572 ````````````````````````````````
3575 ```````````````````````````````` example
3589 ````````````````````````````````
3592 When including an indented code block in a block quote,
3593 remember that the [block quote marker] includes
3594 both the `>` and a following space. So *five spaces* are needed after
3597 ```````````````````````````````` example
3609 ````````````````````````````````
3615 A [list marker](@) is a
3616 [bullet list marker] or an [ordered list marker].
3618 A [bullet list marker](@)
3619 is a `-`, `+`, or `*` character.
3621 An [ordered list marker](@)
3622 is a sequence of 1--9 arabic digits (`0-9`), followed by either a
3623 `.` character or a `)` character. (The reason for the length
3624 limit is that with 10 digits we start seeing integer overflows
3627 The following rules define [list items]:
3629 1. **Basic case.** If a sequence of lines *Ls* constitute a sequence of
3630 blocks *Bs* starting with a [non-whitespace character] and not separated
3631 from each other by more than one blank line, and *M* is a list
3632 marker of width *W* followed by 1 ≤ *N* ≤ 4 spaces, then the result
3633 of prepending *M* and the following spaces to the first line of
3634 *Ls*, and indenting subsequent lines of *Ls* by *W + N* spaces, is a
3635 list item with *Bs* as its contents. The type of the list item
3636 (bullet or ordered) is determined by the type of its list marker.
3637 If the list item is ordered, then it is also assigned a start
3638 number, based on the ordered list marker.
3640 Exceptions: When the first list item in a [list] interrupts
3641 a paragraph---that is, when it starts on a line that would
3642 otherwise count as [paragraph continuation text]---then (a)
3643 the lines *Ls* must not begin with a blank line, and (b) if
3644 the list item is ordered, the start number must be 1.
3646 For example, let *Ls* be the lines
3648 ```````````````````````````````` example
3658 <pre><code>indented code
3661 <p>A block quote.</p>
3663 ````````````````````````````````
3666 And let *M* be the marker `1.`, and *N* = 2. Then rule #1 says
3667 that the following is an ordered list item with start number 1,
3668 and the same contents as *Ls*:
3670 ```````````````````````````````` example
3682 <pre><code>indented code
3685 <p>A block quote.</p>
3689 ````````````````````````````````
3692 The most important thing to notice is that the position of
3693 the text after the list marker determines how much indentation
3694 is needed in subsequent blocks in the list item. If the list
3695 marker takes up two spaces, and there are three spaces between
3696 the list marker and the next [non-whitespace character], then blocks
3697 must be indented five spaces in order to fall under the list
3700 Here are some examples showing how far content must be indented to be
3701 put under the list item:
3703 ```````````````````````````````` example
3712 ````````````````````````````````
3715 ```````````````````````````````` example
3726 ````````````````````````````````
3729 ```````````````````````````````` example
3739 ````````````````````````````````
3742 ```````````````````````````````` example
3753 ````````````````````````````````
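The required indentation *W + N* from rule #1 can be computed directly from the first line of a list item. The sketch below is an illustration under simplifying assumptions: `content_indent` is a hypothetical name, the list marker is assumed to start at the beginning of the line, and only the basic case (1 ≤ *N* ≤ 4 spaces followed by content) is covered.

```python
import re
from typing import Optional

# A bullet or ordered list marker, the spaces after it, and a lookahead
# requiring some non-whitespace content on the same line.
_LIST_ITEM_START = re.compile(r"([-+*]|[0-9]{1,9}[.)])( +)(?=\S)")


def content_indent(first_line: str) -> Optional[int]:
    """Return W + N, the indentation that later blocks need in order to
    belong to the list item, or None if the basic case does not apply."""
    match = _LIST_ITEM_START.match(first_line)
    if match is None:
        return None
    marker_width = len(match.group(1))   # W
    spaces_after = len(match.group(2))   # N
    if spaces_after > 4:
        return None  # five or more spaces fall outside the basic case
    return marker_width + spaces_after
```

With the marker `1.` and *N* = 2, `content_indent("1.  A paragraph")` gives 4; a two-character marker followed by three spaces gives 5, matching the description above.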
3756 It is tempting to think of this in terms of columns: the continuation
3757 blocks must be indented at least to the column of the first
3758 [non-whitespace character] after the list marker. However, that is not quite right.
3759 The spaces after the list marker determine how much relative indentation
3760 is needed. Which column this indentation reaches will depend on
3761 how the list item is embedded in other constructions, as shown by
3764 ```````````````````````````````` example
3779 ````````````````````````````````
3782 Here `two` occurs in the same column as the list marker `1.`,
3783 but is actually contained in the list item, because there is
3784 sufficient indentation after the last containing blockquote marker.
3786 The converse is also possible. In the following example, the word `two`
3787 occurs far to the right of the initial text of the list item, `one`, but
3788 it is not considered part of the list item, because it is not indented
3789 far enough past the blockquote marker:
3791 ```````````````````````````````` example
3804 ````````````````````````````````
3807 Note that at least one space is needed between the list marker and
3808 any following content, so these are not list items:
3810 ```````````````````````````````` example
3817 ````````````````````````````````
3820 A list item may contain blocks that are separated by more than one blank line.
3823 ```````````````````````````````` example
3835 ````````````````````````````````
3838 A list item may contain any kind of block:
3840 ```````````````````````````````` example
3862 ````````````````````````````````
3865 A list item that contains an indented code block will preserve
3866 empty lines within the code block verbatim.
3868 ```````````````````````````````` example
3886 ````````````````````````````````
3888 Note that ordered list start numbers must be nine digits or fewer:
3890 ```````````````````````````````` example
3893 <ol start="123456789">
3896 ````````````````````````````````
3899 ```````````````````````````````` example
3902 <p>1234567890. not ok</p>
3903 ````````````````````````````````
3906 A start number may begin with 0s:
3908 ```````````````````````````````` example
3914 ````````````````````````````````
3917 ```````````````````````````````` example
3923 ````````````````````````````````
3926 A start number may not be negative:
3928 ```````````````````````````````` example
3932 ````````````````````````````````
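The three constraints on start numbers (at most nine digits, leading zeros allowed, no sign) can be captured by a single pattern. The following is a sketch, not a reference implementation; the name `ordered_start` is hypothetical and tabs after the marker are not considered.

```python
import re
from typing import Optional

# 0-3 spaces, 1-9 ASCII digits, a "." or ")" delimiter, then a space
# or the end of the line.
ORDERED = re.compile(r"^ {0,3}([0-9]{1,9})([.)])(?: |$)")

def ordered_start(line: str) -> Optional[int]:
    """Return the start number of an ordered list item, or None."""
    m = ORDERED.match(line)
    return int(m.group(1)) if m else None
```

Converting with `int` discards leading zeros, so `003.` yields a start number of 3.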
3936 2. **Item starting with indented code.** If a sequence of lines *Ls*
3937 constitutes a sequence of blocks *Bs* starting with an indented code
3938 block and not separated from each other by more than one blank line,
3939 and *M* is a list marker of width *W* followed by
3940 one space, then the result of prepending *M* and the following
3941 space to the first line of *Ls*, and indenting subsequent lines of
3942 *Ls* by *W + 1* spaces, is a list item with *Bs* as its contents.
3943 If a line is empty, then it need not be indented. The type of the
3944 list item (bullet or ordered) is determined by the type of its list
3945 marker. If the list item is ordered, then it is also assigned a
3946 start number, based on the ordered list marker.
3948 An indented code block will have to be indented four spaces beyond
3949 the edge of the region where text will be included in the list item.
3950 In the following case that is 6 spaces:
3952 ```````````````````````````````` example
3964 ````````````````````````````````
3967 And in this case it is 11 spaces:
3969 ```````````````````````````````` example
3981 ````````````````````````````````
3984 If the *first* block in the list item is an indented code block,
3985 then by rule #2, the contents must be indented *one* space after the
3988 ```````````````````````````````` example
3995 <pre><code>indented code
3998 <pre><code>more code
4000 ````````````````````````````````
4003 ```````````````````````````````` example
4012 <pre><code>indented code
4015 <pre><code>more code
4019 ````````````````````````````````
4022 Note that an additional space indent is interpreted as space
4023 inside the code block:
4025 ```````````````````````````````` example
4034 <pre><code> indented code
4037 <pre><code>more code
4041 ````````````````````````````````
4044 Note that rules #1 and #2 only apply to two cases: (a) cases
4045 in which the lines to be included in a list item begin with a
4046 [non-whitespace character], and (b) cases in which
4047 they begin with an indented code
4048 block. In a case like the following, where the first block begins with
4049 a three-space indent, the rules do not allow us to form a list item by
4050 indenting the whole thing and prepending a list marker:
4052 ```````````````````````````````` example
4059 ````````````````````````````````
4062 ```````````````````````````````` example
4071 ````````````````````````````````
4074 This is not a significant restriction, because when a block begins
4075 with 1-3 spaces indent, the indentation can always be removed without
4076 a change in interpretation, allowing rule #1 to be applied. So, in
4079 ```````````````````````````````` example
4090 ````````````````````````````````
4093 3. **Item starting with a blank line.** If a sequence of lines *Ls*
4094 starting with a single [blank line] constitutes a (possibly empty)
4095 sequence of blocks *Bs*, not separated from each other by more than
4096 one blank line, and *M* is a list marker of width *W*,
4097 then the result of prepending *M* to the first line of *Ls*, and
4098 indenting subsequent lines of *Ls* by *W + 1* spaces, is a list
4099 item with *Bs* as its contents.
4100 If a line is empty, then it need not be indented. The type of the
4101 list item (bullet or ordered) is determined by the type of its list
4102 marker. If the list item is ordered, then it is also assigned a
4103 start number, based on the ordered list marker.
4105 Here are some list items that start with a blank line but are not empty:
4107 ```````````````````````````````` example
4128 ````````````````````````````````
4130 When the list item starts with a blank line, the number of spaces
4131 following the list marker doesn't change the required indentation:
4133 ```````````````````````````````` example
4140 ````````````````````````````````
4143 A list item can begin with at most one blank line.
4144 In the following example, `foo` is not part of the list
4147 ```````````````````````````````` example
4156 ````````````````````````````````
4159 Here is an empty bullet list item:
4161 ```````````````````````````````` example
4171 ````````````````````````````````
4174 It does not matter whether there are spaces following the [list marker]:
4176 ```````````````````````````````` example
4186 ````````````````````````````````
4189 Here is an empty ordered list item:
4191 ```````````````````````````````` example
4201 ````````````````````````````````
4204 A list may start or end with an empty list item:
4206 ```````````````````````````````` example
4212 ````````````````````````````````
4214 However, an empty list item cannot interrupt a paragraph:
4216 ```````````````````````````````` example
4227 ````````````````````````````````
4230 4. **Indentation.** If a sequence of lines *Ls* constitutes a list item
4231 according to rule #1, #2, or #3, then the result of indenting each line
4232 of *Ls* by 1-3 spaces (the same for each line) also constitutes a
4233 list item with the same contents and attributes. If a line is
4234 empty, then it need not be indented.
4238 ```````````````````````````````` example
4250 <pre><code>indented code
4253 <p>A block quote.</p>
4257 ````````````````````````````````
4260 Indented two spaces:
4262 ```````````````````````````````` example
4274 <pre><code>indented code
4277 <p>A block quote.</p>
4281 ````````````````````````````````
4284 Indented three spaces:
4286 ```````````````````````````````` example
4298 <pre><code>indented code
4301 <p>A block quote.</p>
4305 ````````````````````````````````
4308 Four spaces indent gives a code block:
4310 ```````````````````````````````` example
4318 <pre><code>1. A paragraph
4325 ````````````````````````````````
4329 5. **Laziness.** If a string of lines *Ls* constitutes a [list
4330 item](#list-items) with contents *Bs*, then the result of deleting
4331 some or all of the indentation from one or more lines in which the
4332 next [non-whitespace character] after the indentation is
4333 [paragraph continuation text] is a
4334 list item with the same contents and attributes. The unindented lines are called
4336 [lazy continuation line](@)s.
4338 Here is an example with [lazy continuation lines]:
4340 ```````````````````````````````` example
4352 <pre><code>indented code
4355 <p>A block quote.</p>
4359 ````````````````````````````````
4362 Indentation can be partially deleted:
4364 ```````````````````````````````` example
4370 with two lines.</li>
4372 ````````````````````````````````
4375 These examples show how laziness can work in nested structures:
4377 ```````````````````````````````` example
4391 ````````````````````````````````
4394 ```````````````````````````````` example
4408 ````````````````````````````````
4412 6. **That's all.** Nothing that is not counted as a list item by rules
4413 #1--5 counts as a [list item](#list-items).
4415 The rules for sublists follow from the general rules above. A sublist
4416 must be indented the same number of spaces a paragraph would need to be
4417 in order to be included in the list item.
4419 So, in this case we need two spaces indent:
4421 ```````````````````````````````` example
4442 ````````````````````````````````
4447 ```````````````````````````````` example
4459 ````````````````````````````````
4462 Here we need four, because the list marker is wider:
4464 ```````````````````````````````` example
4475 ````````````````````````````````
4478 Three is not enough:
4480 ```````````````````````````````` example
4490 ````````````````````````````````
4493 A list may be the first block in a list item:
4495 ```````````````````````````````` example
4505 ````````````````````````````````
4508 ```````````````````````````````` example
4522 ````````````````````````````````
4525 A list item can contain a heading:
4527 ```````````````````````````````` example
4541 ````````````````````````````````
4546 John Gruber's Markdown spec says the following about list items:
4548 1. "List markers typically start at the left margin, but may be indented
4549 by up to three spaces. List markers must be followed by one or more
4552 2. "To make lists look nice, you can wrap items with hanging indents....
4553 But if you don't want to, you don't have to."
4555 3. "List items may consist of multiple paragraphs. Each subsequent
4556 paragraph in a list item must be indented by either 4 spaces or one
4559 4. "It looks nice if you indent every line of the subsequent paragraphs,
4560 but here again, Markdown will allow you to be lazy."
4562 5. "To put a blockquote within a list item, the blockquote's `>`
4563 delimiters need to be indented."
4565 6. "To put a code block within a list item, the code block needs to be
4566 indented twice — 8 spaces or two tabs."
4568 These rules specify that a paragraph under a list item must be indented
4569 four spaces (presumably, from the left margin, rather than the start of
4570 the list marker, but this is not said), and that code under a list item
4571 must be indented eight spaces instead of the usual four. They also say
4572 that a block quote must be indented, but not by how much; however, the
4573 example given has four spaces indentation. Although nothing is said
4574 about other kinds of block-level content, it is certainly reasonable to
4575 infer that *all* block elements under a list item, including other
4576 lists, must be indented four spaces. This principle has been called the *four-space rule*.
4579 The four-space rule is clear and principled, and if the reference
4580 implementation `Markdown.pl` had followed it, it probably would have
4581 become the standard. However, `Markdown.pl` allowed paragraphs and
4582 sublists to start with only two spaces indentation, at least on the
4583 outer level. Worse, its behavior was inconsistent: a sublist of an
4584 outer-level list needed two spaces indentation, but a sublist of this
4585 sublist needed three spaces. It is not surprising, then, that different
4586 implementations of Markdown have developed very different rules for
4587 determining what comes under a list item. (Pandoc and python-Markdown,
4588 for example, stuck with Gruber's syntax description and the four-space
4589 rule, while discount, redcarpet, marked, PHP Markdown, and others
4590 followed `Markdown.pl`'s behavior more closely.)
4592 Unfortunately, given the divergences between implementations, there
4593 is no way to give a spec for list items that will be guaranteed not
4594 to break any existing documents. However, the spec given here should
4595 correctly handle lists formatted with either the four-space rule or
4596 the more forgiving `Markdown.pl` behavior, provided they are laid out
4597 in a way that is natural for a human to read.
4599 The strategy here is to let the width and indentation of the list marker
4600 determine the indentation necessary for blocks to fall under the list
4601 item, rather than having a fixed and arbitrary number. The writer can
4602 think of the body of the list item as a unit which gets indented to the
4603 right enough to fit the list marker (and any indentation on the list
4604 marker). (The laziness rule, #5, then allows continuation lines to be
4605 unindented if needed.)
4607 This rule is superior, we claim, to any rule requiring a fixed level of
4608 indentation from the margin. The four-space rule is clear but
4609 unnatural. It is quite unintuitive that
4619 should be parsed as two lists with an intervening paragraph,
4631 as the four-space rule demands, rather than a single list,
4645 The choice of four spaces is arbitrary. It can be learned, but it is
4646 not likely to be guessed, and it trips up beginners regularly.
4648 Would it help to adopt a two-space rule? The problem is that such
4649 a rule, together with the rule allowing 1--3 spaces indentation of the
4650 initial list marker, allows text that is indented *less than* the
4651 original list marker to be included in the list item. For example,
4652 `Markdown.pl` parses
4660 as a single list item, with `two` a continuation paragraph:
4692 This is extremely unintuitive.
4694 Rather than requiring a fixed indent from the margin, we could require
4695 a fixed indent (say, two spaces, or even one space) from the list marker (which
4696 may itself be indented). This proposal would remove the last anomaly
4697 discussed. Unlike the spec presented above, it would count the following
4698 as a list item with a subparagraph, even though the paragraph `bar`
4699 is not indented as far as the first paragraph `foo`:
4707 Arguably this text does read like a list item with `bar` as a subparagraph,
4708 which may count in favor of the proposal. However, on this proposal indented
4709 code would have to be indented six spaces after the list marker. And this
4710 would break a lot of existing Markdown, which has the pattern:
4718 where the code is indented eight spaces. The spec above, by contrast, will
4719 parse this text as expected, since the code block's indentation is measured
4720 from the beginning of `foo`.
4722 The one case that needs special treatment is a list item that *starts*
4723 with indented code. How much indentation is required in that case, since
4724 we don't have a "first paragraph" to measure from? Rule #2 simply stipulates
4725 that in such cases, we require one space indentation from the list marker
4726 (and then the normal four spaces for the indented code). This will match the
4727 four-space rule in cases where the list marker plus its initial indentation
4728 takes four spaces (a common case), but diverge in other cases.
4732 A [list](@) is a sequence of one or more
4733 list items [of the same type]. The list items
4734 may be separated by any number of blank lines.
4736 Two list items are [of the same type](@)
4737 if they begin with a [list marker] of the same type.
4738 Two list markers are of the
4739 same type if (a) they are bullet list markers using the same character
4740 (`-`, `+`, or `*`) or (b) they are ordered list numbers with the same
4741 delimiter (either `.` or `)`).
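The "same type" test reduces to comparing a small key derived from each marker. Here is a minimal sketch; `marker_kind` and `same_type` are hypothetical names, and the markers are assumed to have been extracted already.

```python
from typing import Tuple

def marker_kind(marker: str) -> Tuple[str, str]:
    """Reduce a list marker to the pair that decides its type."""
    if marker in ("-", "+", "*"):
        # Bullet markers are distinguished by the bullet character.
        return ("bullet", marker)
    digits, delimiter = marker[:-1], marker[-1:]
    if digits.isascii() and digits.isdigit() and delimiter in (".", ")"):
        # Ordered markers are distinguished by the delimiter only;
        # the number itself is irrelevant.
        return ("ordered", delimiter)
    raise ValueError("not a list marker: %r" % marker)

def same_type(a: str, b: str) -> bool:
    return marker_kind(a) == marker_kind(b)
```

So `1.` and `7.` are of the same type, while `1.` and `1)` (or `-` and `+`) are not, which is why changing the delimiter starts a new list.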
4743 A list is an [ordered list](@)
4744 if its constituent list items begin with
4745 [ordered list markers], and a
4746 [bullet list](@) if its constituent list
4747 items begin with [bullet list markers].
4749 The [start number](@)
4750 of an [ordered list] is determined by the list number of
4751 its initial list item. The numbers of subsequent list items are disregarded.
4754 A list is [loose](@) if any of its constituent
4755 list items are separated by blank lines, or if any of its constituent
4756 list items directly contain two block-level elements with a blank line
4757 between them. Otherwise a list is [tight](@).
4758 (The difference in HTML output is that paragraphs in a loose list are
4759 wrapped in `<p>` tags, while paragraphs in a tight list are not.)
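The definition of looseness and its effect on HTML output can be illustrated with a toy model. Everything here is an assumption made for illustration: the `Item` record, its two boolean flags, and the bullet-only renderer stand in for whatever a real parser records about blank lines.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Item:
    paragraphs: List[str]
    blank_between_blocks: bool = False  # blank line between two direct children
    blank_after: bool = False           # blank line follows this item

def is_loose(items: List[Item]) -> bool:
    # A blank line after the final item does not separate two items.
    between = any(it.blank_after for it in items[:-1])
    inside = any(it.blank_between_blocks for it in items)
    return between or inside

def render(items: List[Item]) -> str:
    loose = is_loose(items)
    out = ["<ul>"]
    for it in items:
        if loose:
            body = "\n".join(f"<p>{p}</p>" for p in it.paragraphs)
            out.append(f"<li>\n{body}\n</li>")
        else:
            # Tight list: paragraph text appears without <p> tags.
            out.append("<li>" + "\n".join(it.paragraphs) + "</li>")
    out.append("</ul>")
    return "\n".join(out)
```

Note that looseness is a property of the whole list: one blank line between any two items wraps the paragraphs of every item in `<p>` tags.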
4761 Changing the bullet or ordered list delimiter starts a new list:
4763 ```````````````````````````````` example
4775 ````````````````````````````````
4778 ```````````````````````````````` example
4790 ````````````````````````````````
4793 In CommonMark, a list can interrupt a paragraph. That is,
4794 no blank line is needed to separate a paragraph from a following
4797 ```````````````````````````````` example
4807 ````````````````````````````````
4809 `Markdown.pl` does not allow this, through fear of triggering a list
4810 via a numeral in a hard-wrapped line:
4813 The number of windows in my house is
4814 14. The number of doors is 6.
4817 Oddly, though, `Markdown.pl` *does* allow a blockquote to
4818 interrupt a paragraph, even though the same considerations might
4821 In CommonMark, we do allow lists to interrupt paragraphs, for
4822 two reasons. First, it is natural and not uncommon for people
4823 to start lists without blank lines:
4832 Second, we are attracted to a
4834 > [principle of uniformity](@):
4835 > if a chunk of text has a certain
4836 > meaning, it will continue to have the same meaning when put into a
4837 > container block (such as a list item or blockquote).
4839 (Indeed, the spec for [list items] and [block quotes] presupposes
4840 this principle.) This principle implies that if
4849 is a list item containing a paragraph followed by a nested sublist,
4850 as all Markdown implementations agree it is (though the paragraph
4851 may be rendered without `<p>` tags, since the list is "tight"),
4861 by itself should be a paragraph followed by a nested sublist.
4863 Since it is well established Markdown practice to allow lists to
4864 interrupt paragraphs inside list items, the [principle of
4865 uniformity] requires us to allow this outside list items as
4866 well. ([reStructuredText](http://docutils.sourceforge.net/rst.html)
4867 takes a different approach, requiring blank lines before lists
4868 even inside other list items.)
4870 In order to solve the problem of unwanted lists in paragraphs with
4871 hard-wrapped numerals, we allow only lists starting with `1` to
4872 interrupt paragraphs. Thus,
4874 ```````````````````````````````` example
4875 The number of windows in my house is
4876 14. The number of doors is 6.
4878 <p>The number of windows in my house is
4879 14. The number of doors is 6.</p>
4880 ````````````````````````````````
4882 We may still get an unintended result in cases like
4884 ```````````````````````````````` example
4885 The number of windows in my house is
4886 1. The number of doors is 6.
4888 <p>The number of windows in my house is</p>
4890 <li>The number of doors is 6.</li>
4892 ````````````````````````````````
4894 but this rule should prevent most spurious list captures.
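The interruption rule can be sketched as a predicate on the line that follows a paragraph. This is a simplified illustration: `can_interrupt_paragraph` is a hypothetical name, tabs are ignored, and lines that other rules would claim first (thematic breaks, setext heading underlines) are not filtered out.

```python
import re

# A list marker followed by at least one space and some non-blank
# content; an empty list item cannot interrupt a paragraph.
LIST_START = re.compile(r"^ {0,3}(?:[-+*]|([0-9]{1,9})[.)]) +\S")

def can_interrupt_paragraph(line: str) -> bool:
    m = LIST_START.match(line)
    if m is None:
        return False
    number = m.group(1)
    # Bullet items may interrupt; ordered items only when they start at 1.
    return number is None or int(number) == 1
```

With this predicate, `14. The number of doors is 6.` continues the paragraph, while `1. The number of doors is 6.` starts a list, matching the two examples above.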
4896 There can be any number of blank lines between items:
4898 ```````````````````````````````` example
4917 ````````````````````````````````
4919 ```````````````````````````````` example
4941 ````````````````````````````````
4944 To separate consecutive lists of the same type, or to separate a
4945 list from an indented code block that would otherwise be parsed
4946 as a subparagraph of the final list item, you can insert a blank HTML
4949 ```````````````````````````````` example
4967 ````````````````````````````````
4970 ```````````````````````````````` example
4993 ````````````````````````````````
4996 List items need not be indented to the same level. The following
4997 list items will be treated as items at the same list level,
4998 since none is indented enough to belong to the previous list
5001 ```````````````````````````````` example
5023 ````````````````````````````````
5026 ```````````````````````````````` example
5044 ````````````````````````````````
5047 This is a loose list, because there is a blank line between
5048 two of the list items:
5050 ```````````````````````````````` example
5067 ````````````````````````````````
5070 So is this, with an empty second item:
5072 ```````````````````````````````` example
5087 ````````````````````````````````
5090 These are loose lists, even though there is no space between the items,
5091 because one of the items directly contains two block-level elements
5092 with a blank line between them:
5094 ```````````````````````````````` example
5113 ````````````````````````````````
5116 ```````````````````````````````` example
5134 ````````````````````````````````
5137 This is a tight list, because the blank lines are in a code block:
5139 ```````````````````````````````` example
5158 ````````````````````````````````
5161 This is a tight list, because the blank line is between two
5162 paragraphs of a sublist. So the sublist is loose while
5163 the outer list is tight:
5165 ```````````````````````````````` example
5183 ````````````````````````````````
5186 This is a tight list, because the blank line is inside the
5189 ```````````````````````````````` example
5203 ````````````````````````````````
5206 This list is tight, because the consecutive block elements
5207 are not separated by blank lines:
5209 ```````````````````````````````` example
5227 ````````````````````````````````
5230 A single-paragraph list is tight:
5232 ```````````````````````````````` example
5238 ````````````````````````````````
5241 ```````````````````````````````` example
5252 ````````````````````````````````
5255 This list is loose, because of the blank line between the
5256 two block elements in the list item:
5258 ```````````````````````````````` example
5272 ````````````````````````````````
5275 Here the outer list is loose, the inner list tight:
5277 ```````````````````````````````` example
5292 ````````````````````````````````
5295 ```````````````````````````````` example
5320 ````````````````````````````````
5325 Inlines are parsed sequentially from the beginning of the character
5326 stream to the end (left to right, in left-to-right languages).
5327 Thus, for example, in
5329 ```````````````````````````````` example
5332 <p><code>hi</code>lo`</p>
5333 ````````````````````````````````
5336 `hi` is parsed as code, leaving the backtick at the end as a literal backtick.
5339 ## Backslash escapes
5341 Any ASCII punctuation character may be backslash-escaped:
5343 ```````````````````````````````` example
5344 \!\"\#\$\%\&\'\(\)\*\+\,\-\.\/\:\;\<\=\>\?\@\[\\\]\^\_\`\{\|\}\~
5346 <p>!&quot;#$%&amp;'()*+,-./:;&lt;=&gt;?@[\]^_`{|}~</p>
5347 ````````````````````````````````
5350 Backslashes before other characters are treated as literal
5353 ```````````````````````````````` example
5356 <p>\→\A\a\ \3\φ\«</p>
5357 ````````````````````````````````
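The two rules so far (ASCII punctuation can be escaped, and a backslash before anything else stays literal) amount to one substitution. The sketch below applies it to a plain string; the name `unescape` is hypothetical, and the contexts where escapes are inactive, as well as the hard line break, are outside its scope.

```python
import re
import string

# string.punctuation is exactly the 32 ASCII punctuation characters.
ESCAPABLE = re.compile(r"\\([" + re.escape(string.punctuation) + r"])")

def unescape(text: str) -> str:
    """Drop the backslash in front of each escaped ASCII punctuation character."""
    return ESCAPABLE.sub(r"\1", text)
```

Because the substitution scans left to right, an escaped backslash consumes both characters, so the character after it is left unescaped.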
5360 Escaped characters are treated as regular characters and do
5361 not have their usual Markdown meanings:
5363 ```````````````````````````````` example
5371 \[foo]: /url "not a reference"
5374 &lt;br/&gt; not a tag
5380 [foo]: /url &quot;not a reference&quot;</p>
5381 ````````````````````````````````
5384 If a backslash is itself escaped, the following character is not:
5386 ```````````````````````````````` example
5389 <p>\<em>emphasis</em></p>
5390 ````````````````````````````````
5393 A backslash at the end of the line is a [hard line break]:
5395 ```````````````````````````````` example
5401 ````````````````````````````````
5404 Backslash escapes do not work in code blocks, code spans, autolinks, or
5407 ```````````````````````````````` example
5410 <p><code>\[\`</code></p>
5411 ````````````````````````````````
5414 ```````````````````````````````` example
5419 ````````````````````````````````
5422 ```````````````````````````````` example
5429 ````````````````````````````````
5432 ```````````````````````````````` example
5433 <http://example.com?find=\*>
5435 <p><a href="http://example.com?find=%5C*">http://example.com?find=\*</a></p>
5436 ````````````````````````````````
5439 ```````````````````````````````` example
5443 ````````````````````````````````
5446 But they work in all other contexts, including URLs and link titles,
5447 link references, and [info strings] in [fenced code blocks]:
5449 ```````````````````````````````` example
5450 [foo](/bar\* "ti\*tle")
5452 <p><a href="/bar*" title="ti*tle">foo</a></p>
5453 ````````````````````````````````
5456 ```````````````````````````````` example
5459 [foo]: /bar\* "ti\*tle"
5461 <p><a href="/bar*" title="ti*tle">foo</a></p>
5462 ````````````````````````````````
5465 ```````````````````````````````` example
5470 <pre><code class="language-foo+bar">foo
5472 ````````````````````````````````
5476 ## Entity and numeric character references
5478 All valid HTML entity references and numeric character
5479 references, except those occurring in code blocks and code spans,
5480 are recognized as such and treated as equivalent to the
5481 corresponding Unicode characters. Conforming CommonMark parsers
5482 need not store information about whether a particular character
5483 was represented in the source using a Unicode character or
5484 an entity reference.
5486 [Entity references](@) consist of `&` + any of the valid
5487 HTML5 entity names + `;`. The
5488 document <https://html.spec.whatwg.org/multipage/entities.json>
5489 is used as an authoritative source for the valid entity
5490 references and their corresponding code points.
5492 ```````````````````````````````` example
5493 &nbsp; &amp; &copy; &AElig; &Dcaron;
5494 &frac34; &HilbertSpace; &DifferentialD;
5495 &ClockwiseContourIntegral; &ngE;
5500 ````````````````````````````````
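Entity reference lookup can be sketched with Python's standard `html.entities.html5` table, which maps HTML5 named character references to their characters. Using that table in place of `entities.json` is an assumption of this sketch, and `decode_named` is a hypothetical name.

```python
import re
from html.entities import html5

# "&" + an entity name + ";". The semicolon is mandatory here.
ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")

def decode_named(text: str) -> str:
    """Replace valid entity references; leave anything else untouched."""
    def repl(m):
        # Keys in the html5 table include the trailing semicolon.
        return html5.get(m.group(1) + ";", m.group(0))
    return ENTITY.sub(repl, text)
```

Names missing from the table, and names written without the trailing semicolon, pass through unchanged.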
5503 [Decimal numeric character references](@)
5505 consist of `&#` + a string of 1--8 arabic digits + `;`. A
5506 numeric character reference is parsed as the corresponding
5507 Unicode character. Invalid Unicode code points will be replaced by
5508 the REPLACEMENT CHARACTER (`U+FFFD`). For security reasons,
5509 the code point `U+0000` will also be replaced by `U+FFFD`.
5511 ```````````````````````````````` example
5512 &#35; &#1234; &#992; &#98765432; &#0;
5515 ````````````````````````````````
5518 [Hexadecimal numeric character
5519 references](@) consist of `&#` +
5520 either `X` or `x` + a string of 1--8 hexadecimal digits + `;`.
5521 They too are parsed as the corresponding Unicode character (this
5522 time specified with a hexadecimal numeral instead of decimal).
5524 ```````````````````````````````` example
5525 &#X22; &#XD06; &#xcab;
5528 ````````````````````````````````
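Both kinds of numeric character reference can be decoded by one routine. The following is a sketch under stated assumptions: `decode_numeric` is a hypothetical name, and "invalid Unicode code points" is taken to mean values above `U+10FFFF` and surrogates, which the text above does not spell out.

```python
import re

# "&#" + 1-8 decimal digits + ";", or "&#" + X/x + 1-8 hex digits + ";".
NUMERIC = re.compile(r"&#(?:([0-9]{1,8})|[Xx]([0-9A-Fa-f]{1,8}));")

def decode_numeric(text: str) -> str:
    def repl(m):
        dec, hexa = m.groups()
        cp = int(dec) if dec is not None else int(hexa, 16)
        # U+0000 and invalid code points become U+FFFD.
        if cp == 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
            return "\uFFFD"
        return chr(cp)
    return NUMERIC.sub(repl, text)
```

Strings such as `&#;` and `&#x;` contain no digits, so the pattern does not match and they are left as literal text.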
5531 Here are some nonentities:
5533 ```````````````````````````````` example
5535 &ThisIsNotDefined; &hi?;
5537 <p>&amp;nbsp &amp;x; &amp;#; &amp;#x;
5538 &amp;ThisIsNotDefined; &amp;hi?;</p>
5539 ````````````````````````````````
5542 Although HTML5 does accept some entity references
5543 without a trailing semicolon (such as `&copy`), these are not
5544 recognized here, because it makes the grammar too ambiguous:
5546 ```````````````````````````````` example
5550 ````````````````````````````````
5553 Strings that are not on the list of HTML5 named entities are not
5554 recognized as entity references either:
5556 ```````````````````````````````` example
5559 <p>&amp;MadeUpEntity;</p>
5560 ````````````````````````````````
5563 Entity and numeric character references are recognized in any
5564 context besides code spans or code blocks, including
5565 URLs, [link titles], and [fenced code block][] [info strings]:
5567 ```````````````````````````````` example
5568 <a href="&ouml;&ouml;.html">
5570 <a href="&ouml;&ouml;.html">
5571 ````````````````````````````````
5574 ```````````````````````````````` example
5575 [foo](/f&ouml;&ouml; "f&ouml;&ouml;")
5577 <p><a href="/f%C3%B6%C3%B6" title="föö">foo</a></p>
5578 ````````````````````````````````
5581 ```````````````````````````````` example
5584 [foo]: /f&ouml;&ouml; "f&ouml;&ouml;"
5586 <p><a href="/f%C3%B6%C3%B6" title="föö">foo</a></p>
5587 ````````````````````````````````
5590 ```````````````````````````````` example
5595 <pre><code class="language-föö">foo
5597 ````````````````````````````````
5600 Entity and numeric character references are treated as literal
5601 text in code spans and code blocks:
5603 ```````````````````````````````` example
5606 <p><code>f&amp;ouml;&amp;ouml;</code></p>
5607 ````````````````````````````````
5610 ```````````````````````````````` example
5613 <pre><code>f&amp;ouml;f&amp;ouml;
5615 ````````````````````````````````
5620 A [backtick string](@)
5621 is a string of one or more backtick characters (`` ` ``) that is neither
5622 preceded nor followed by a backtick.
5624 A [code span](@) begins with a backtick string and ends with
5625 a backtick string of equal length. The contents of the code span are
5626 the characters between the two backtick strings, with leading and
5627 trailing spaces and [line endings] removed, and
5628 [whitespace] collapsed to single spaces.
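The normalization of code span contents can be sketched as two steps. The name `normalize_code_span` is hypothetical, and the function assumes the raw text between the two backtick strings has already been isolated.

```python
import re

def normalize_code_span(raw: str) -> str:
    """Strip outer spaces and line endings, then collapse whitespace runs."""
    stripped = raw.strip(" \r\n")
    # Only ASCII whitespace is collapsed (space, tab, newline, line
    # tabulation, form feed, carriage return); a non-breaking space
    # is left alone.
    return re.sub(r"[ \t\n\v\f\r]+", " ", stripped)
```

For example, the raw contents `" foo ` bar "` become `foo ` bar`, and a line ending inside the span turns into a single space.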
5630 This is a simple code span:
5632 ```````````````````````````````` example
5635 <p><code>foo</code></p>
5636 ````````````````````````````````
5639 Here two backticks are used, because the code contains a backtick.
5640 This example also illustrates stripping of leading and trailing spaces:
5642 ```````````````````````````````` example
5645 <p><code>foo ` bar</code></p>
5646 ````````````````````````````````
5649 This example shows the motivation for stripping leading and trailing
5652 ```````````````````````````````` example
5655 <p><code>``</code></p>
5656 ````````````````````````````````
5659 [Line endings] are treated like spaces:
5661 ```````````````````````````````` example
5666 <p><code>foo</code></p>
5667 ````````````````````````````````
5670 Interior spaces and [line endings] are collapsed into
5671 single spaces, just as they would be by a browser:
5673 ```````````````````````````````` example
5677 <p><code>foo bar baz</code></p>
5678 ````````````````````````````````
5681 Not all [Unicode whitespace] (for instance, non-breaking space) is
5684 ```````````````````````````````` example
5687 <p><code>a b</code></p>
5688 ````````````````````````````````
5691 Q: Why not just leave the spaces, since browsers will collapse them
5692 anyway? A: Because we might be targeting a non-HTML format, and we
5693 shouldn't rely on HTML-specific rendering assumptions.
5695 (Existing implementations differ in their treatment of internal
5696 spaces and [line endings]. Some, including `Markdown.pl` and
5697 `showdown`, convert an internal [line ending] into a
5698 `<br />` tag. But this makes things difficult for those who like to
5699 hard-wrap their paragraphs, since a line break in the midst of a code
5700 span will cause an unintended line break in the output. Others just
5701 leave internal spaces as they are, which is fine if only HTML is being
5704 ```````````````````````````````` example
5707 <p><code>foo `` bar</code></p>
5708 ````````````````````````````````
5711 Note that backslash escapes do not work in code spans. All backslashes
5712 are treated literally:
5714 ```````````````````````````````` example
5717 <p><code>foo\</code>bar`</p>
5718 ````````````````````````````````
5721 Backslash escapes are never needed, because one can always choose a
5722 string of *n* backtick characters as delimiters, where the code does
5723 not contain any strings of exactly *n* backtick characters.
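Choosing such an *n* is mechanical, as the following sketch suggests. The name `wrap_code_span` is hypothetical; the padding spaces rely on the stripping of leading and trailing spaces described earlier, so code that itself begins or ends with significant spaces would not round-trip.

```python
import re

def wrap_code_span(code: str) -> str:
    """Wrap code in the shortest backtick delimiters that cannot clash."""
    # Lengths of every maximal run of backticks inside the code.
    runs = {len(r) for r in re.findall(r"`+", code)}
    n = 1
    while n in runs:
        n += 1
    fence = "`" * n
    # Pad when the code touches a delimiter, so the backticks do not merge.
    pad = " " if code.startswith("`") or code.endswith("`") else ""
    return f"{fence}{pad}{code}{pad}{fence}"
```

For code consisting of two backticks, the only run has length 2, so a single backtick with padding suffices.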
5725 Code span backticks have higher precedence than any other inline
5726 constructs except HTML tags and autolinks. Thus, for example, this is
5727 not parsed as emphasized text, since the second `*` is part of a code
5730 ```````````````````````````````` example
5733 <p>*foo<code>*</code></p>
5734 ````````````````````````````````
5737 And this is not parsed as a link:
5739 ```````````````````````````````` example
5740 [not a `link](/foo`)
5742 <p>[not a <code>link](/foo</code>)</p>
5743 ````````````````````````````````
5746 Code spans, HTML tags, and autolinks have the same precedence.
5749 ```````````````````````````````` example
5752 <p><code><a href="</code>">`</p>
5753 ````````````````````````````````
5756 But this is an HTML tag:
5758 ```````````````````````````````` example
5761 <p><a href="`">`</p>
5762 ````````````````````````````````
5767 ```````````````````````````````` example
5768 `<http://foo.bar.`baz>`
5770 <p><code><http://foo.bar.</code>baz>`</p>
5771 ````````````````````````````````
5774 But this is an autolink:
5776 ```````````````````````````````` example
5777 <http://foo.bar.`baz>`
5779 <p><a href="http://foo.bar.%60baz">http://foo.bar.`baz</a>`</p>
5780 ````````````````````````````````
5783 When a backtick string is not closed by a matching backtick string,
5784 we just have literal backticks:
5786 ```````````````````````````````` example
5790 ````````````````````````````````
5793 ```````````````````````````````` example
5797 ````````````````````````````````
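The matching rule for backtick strings can be sketched as a small scanner. This Python example is a simplification, not a full inline parser: the name `find_code_span` is hypothetical, and the sketch ignores backslash escapes before the opening backticks as well as the interaction with HTML tags and autolinks described above.

```python
import re


def find_code_span(text: str):
    """Return (start, end, content) of the first code span, or None.

    A backtick string of length n opens a code span only if a later
    backtick string of exactly the same length closes it.
    """
    runs = [(m.start(), m.end()) for m in re.finditer(r"`+", text)]
    for i, (open_start, open_end) in enumerate(runs):
        n = open_end - open_start
        for close_start, close_end in runs[i + 1:]:
            if close_end - close_start == n:
                return open_start, close_end, text[open_end:close_start]
    # No opener found a matching closer: every backtick is literal.
    return None
```

When the function returns `None`, as it does for an unclosed backtick string, all backticks in the text are left as literal characters.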
5800 ## Emphasis and strong emphasis
5802 John Gruber's original [Markdown syntax
5803 description](http://daringfireball.net/projects/markdown/syntax#em) says:
5805 > Markdown treats asterisks (`*`) and underscores (`_`) as indicators of
5806 > emphasis. Text wrapped with one `*` or `_` will be wrapped with an HTML
5807 > `<em>` tag; double `*`'s or `_`'s will be wrapped with an HTML `<strong>`
5810 This is enough for most users, but these rules leave much undecided,
5811 especially when it comes to nested emphasis. The original
5812 `Markdown.pl` test suite makes it clear that triple `***` and
5813 `___` delimiters can be used for strong emphasis, and most
5814 implementations have also allowed the following patterns:
5818 ***strong** in emph*
5819 ***emph* in strong**
5820 **in strong *emph***
5821 *in emph **strong***
5824 The following patterns are less widely supported, but the intent
5825 is clear and they are useful (especially in contexts like bibliography
5829 *emph *with emph* in it*
5830 **strong **with strong** in it**
5833 Many implementations have also restricted intraword emphasis to
5834 the `*` forms, to avoid unwanted emphasis in words containing
5835 internal underscores. (It is best practice to put these in code
5836 spans, but users often do not.)
5839 internal emphasis: foo*bar*baz
5840 no emphasis: foo_bar_baz
5843 The rules given below capture all of these patterns, while allowing
5844 for efficient parsing strategies that do not backtrack.
5846 First, some definitions. A [delimiter run](@) is either
5847 a sequence of one or more `*` characters that is not preceded or
5848 followed by a `*` character, or a sequence of one or more `_`
5849 characters that is not preceded or followed by a `_` character.
5851 A [left-flanking delimiter run](@) is
5852 a [delimiter run] that is (a) not followed by [Unicode whitespace],
5853 and (b) either not followed by a [punctuation character], or
5854 preceded by [Unicode whitespace] or a [punctuation character].
5855 For purposes of this definition, the beginning and the end of
5856 the line count as Unicode whitespace.
5858 A [right-flanking delimiter run](@) is
5859 a [delimiter run] that is (a) not preceded by [Unicode whitespace],
5860 and (b) either not preceded by a [punctuation character], or
5861 followed by [Unicode whitespace] or a [punctuation character].
5862 For purposes of this definition, the beginning and the end of
5863 the line count as Unicode whitespace.
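The two definitions translate fairly directly into code. The Python sketch below is illustrative only. The helper names are hypothetical, backslash escapes are ignored, and the character classes are assumptions: whitespace is taken to be tab, line feed, form feed, carriage return, or any character in Unicode category `Zs`, and punctuation is taken to be ASCII punctuation or any character in a Unicode `P` category.

```python
import re
import string
import unicodedata


def _is_whitespace(ch: str) -> bool:
    # The beginning and end of the line are passed in as "".
    if ch == "":
        return True
    return ch in "\t\n\f\r" or unicodedata.category(ch) == "Zs"


def _is_punctuation(ch: str) -> bool:
    if ch == "":
        return False
    return ch in string.punctuation or unicodedata.category(ch).startswith("P")


def delimiter_runs(line: str):
    """Yield (start, end) for each maximal run of `*` or `_` characters."""
    for m in re.finditer(r"\*+|_+", line):
        yield m.start(), m.end()


def flanking(line: str, start: int, end: int):
    """Return (left_flanking, right_flanking) for the run line[start:end]."""
    before = line[start - 1] if start > 0 else ""
    after = line[end] if end < len(line) else ""
    left = not _is_whitespace(after) and (
        not _is_punctuation(after)
        or _is_whitespace(before)
        or _is_punctuation(before)
    )
    right = not _is_whitespace(before) and (
        not _is_punctuation(before)
        or _is_whitespace(after)
        or _is_punctuation(after)
    )
    return left, right
```

Only the single character on each side of a run is inspected, so the classification needs no lookahead beyond the run itself.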
5865 Here are some examples of delimiter runs.
5867 - left-flanking but not right-flanking:
5876 - right-flanking but not left-flanking:
5885 - both left- and right-flanking:
5892 - neither left- nor right-flanking:
5899 (The idea of distinguishing left-flanking and right-flanking
5900 delimiter runs based on the character before and the character
5901 after comes from Roopesh Chander's
5902 [vfmd](http://www.vfmd.org/vfmd-spec/specification/#procedure-for-identifying-emphasis-tags).
5903 vfmd uses the terminology "emphasis indicator string" instead of "delimiter
5904 run," and its rules for distinguishing left- and right-flanking runs
5905 are a bit more complex than the ones given here.)
5907 The following rules define emphasis and strong emphasis:
5909 1. A single `*` character [can open emphasis](@)
5910 iff (if and only if) it is part of a [left-flanking delimiter run].
5912 2. A single `_` character [can open emphasis] iff
5913 it is part of a [left-flanking delimiter run]
5914 and either (a) not part of a [right-flanking delimiter run]
5915 or (b) part of a [right-flanking delimiter run]
5916 preceded by punctuation.
5918 3. A single `*` character [can close emphasis](@)
5919 iff it is part of a [right-flanking delimiter run].
5921 4. A single `_` character [can close emphasis] iff
5922 it is part of a [right-flanking delimiter run]
5923 and either (a) not part of a [left-flanking delimiter run]
5924 or (b) part of a [left-flanking delimiter run]
5925 followed by punctuation.
5927 5. A double `**` [can open strong emphasis](@)
5928 iff it is part of a [left-flanking delimiter run].
5930 6. A double `__` [can open strong emphasis] iff
5931 it is part of a [left-flanking delimiter run]
5932 and either (a) not part of a [right-flanking delimiter run]
5933 or (b) part of a [right-flanking delimiter run]
5934 preceded by punctuation.
5936 7. A double `**` [can close strong emphasis](@)
5937 iff it is part of a [right-flanking delimiter run].
5939 8. A double `__` [can close strong emphasis] iff
5940 it is part of a [right-flanking delimiter run]
5941 and either (a) not part of a [left-flanking delimiter run]
5942 or (b) part of a [left-flanking delimiter run]
5943 followed by punctuation.
5945 9. Emphasis begins with a delimiter that [can open emphasis] and ends
5946 with a delimiter that [can close emphasis], and that uses the same
5947 character (`_` or `*`) as the opening delimiter. The
5948 opening and closing delimiters must belong to separate
5949 [delimiter runs]. If one of the delimiters can both
5950 open and close emphasis, then the sum of the lengths of the
5951 delimiter runs containing the opening and closing delimiters
5952 must not be a multiple of 3.
5954 10. Strong emphasis begins with a delimiter that
5955 [can open strong emphasis] and ends with a delimiter that
5956 [can close strong emphasis], and that uses the same character
5957 (`_` or `*`) as the opening delimiter. The
5958 opening and closing delimiters must belong to separate
5959 [delimiter runs]. If one of the delimiters can both open
5960 and close strong emphasis, then the sum of the lengths of
5961 the delimiter runs containing the opening and closing
5962 delimiters must not be a multiple of 3.
5964 11. A literal `*` character cannot occur at the beginning or end of
5965 `*`-delimited emphasis or `**`-delimited strong emphasis, unless it
5966 is backslash-escaped.
5968 12. A literal `_` character cannot occur at the beginning or end of
5969 `_`-delimited emphasis or `__`-delimited strong emphasis, unless it
5970 is backslash-escaped.
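Rules 1 through 8 reduce to two predicates on a delimiter run. The following Python sketch shows one way to express them; the `Run` record and the function names are hypothetical, and the flanking and punctuation facts are assumed to have been computed already. The same conditions serve for emphasis and strong emphasis. The requirement that a strong-emphasis delimiter be two characters long is not checked here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    """Facts about one delimiter run (a hypothetical record type)."""
    char: str                        # "*" or "_"
    left_flanking: bool
    right_flanking: bool
    preceded_by_punct: bool = False
    followed_by_punct: bool = False


def can_open(run: Run) -> bool:
    # Rules 1 and 5 cover "*"; rules 2 and 6 cover "_".
    if run.char == "*":
        return run.left_flanking
    return run.left_flanking and (
        not run.right_flanking or run.preceded_by_punct
    )


def can_close(run: Run) -> bool:
    # Rules 3 and 7 cover "*"; rules 4 and 8 cover "_".
    if run.char == "*":
        return run.right_flanking
    return run.right_flanking and (
        not run.left_flanking or run.followed_by_punct
    )
```

The extra conditions on `_` are what prevent intraword emphasis: a run between two alphanumerics is both left- and right-flanking with no adjacent punctuation, so for `_` it can neither open nor close.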
5972 Where rules 1--12 above are compatible with multiple parsings,
5973 the following principles resolve ambiguity:
5975 13. The number of nestings should be minimized. Thus, for example,
5976 an interpretation `<strong>...</strong>` is always preferred to
5977 `<em><em>...</em></em>`.
5979 14. An interpretation `<strong><em>...</em></strong>` is always
5980 preferred to `<em><strong>...</strong></em>`.
5982 15. When two potential emphasis or strong emphasis spans overlap,
5983 so that the second begins before the first ends and ends after
5984 the first ends, the first takes precedence. Thus, for example,
5985 `*foo _bar* baz_` is parsed as `<em>foo _bar</em> baz_` rather
5986 than `*foo <em>bar* baz</em>`.
5988 16. When there are two potential emphasis or strong emphasis spans
5989 with the same closing delimiter, the shorter one (the one that
5990 opens later) takes precedence. Thus, for example,
5991 `**foo **bar baz**` is parsed as `**foo <strong>bar baz</strong>`
5992 rather than `<strong>foo **bar baz</strong>`.
5994 17. Inline code spans, links, images, and HTML tags group more tightly
5995 than emphasis. So, when there is a choice between an interpretation
5996 that contains one of these elements and one that does not, the
5997 former always wins. Thus, for example, `*[foo*](bar)` is
5998 parsed as `*<a href="bar">foo*</a>` rather than as
5999 `<em>[foo</em>](bar)`.
6001 These rules can be illustrated through a series of examples.
6005 ```````````````````````````````` example
6008 <p><em>foo bar</em></p>
6009 ````````````````````````````````
6012 This is not emphasis, because the opening `*` is followed by
6013 whitespace, and hence not part of a [left-flanking delimiter run]:
6015 ```````````````````````````````` example
6019 ````````````````````````````````
6022 This is not emphasis, because the opening `*` is preceded
6023 by an alphanumeric and followed by punctuation, and hence
6024 not part of a [left-flanking delimiter run]:
6026 ```````````````````````````````` example
6029 <p>a*"foo"*</p>
6030 ````````````````````````````````
6033 Unicode nonbreaking spaces count as whitespace, too:
6035 ```````````````````````````````` example
6039 ````````````````````````````````
6042 Intraword emphasis with `*` is permitted:
6044 ```````````````````````````````` example
6047 <p>foo<em>bar</em></p>
6048 ````````````````````````````````
6051 ```````````````````````````````` example
6054 <p>5<em>6</em>78</p>
6055 ````````````````````````````````
6060 ```````````````````````````````` example
6063 <p><em>foo bar</em></p>
6064 ````````````````````````````````
6067 This is not emphasis, because the opening `_` is followed by
6070 ```````````````````````````````` example
6074 ````````````````````````````````
6077 This is not emphasis, because the opening `_` is preceded
6078 by an alphanumeric and followed by punctuation:
6080 ```````````````````````````````` example
6083 <p>a_"foo"_</p>
6084 ````````````````````````````````
6087 Emphasis with `_` is not allowed inside words:
6089 ```````````````````````````````` example
6093 ````````````````````````````````
6096 ```````````````````````````````` example
6100 ````````````````````````````````
6103 ```````````````````````````````` example
6104 пристаням_стремятся_
6106 <p>пристаням_стремятся_</p>
6107 ````````````````````````````````
6110 Here `_` does not generate emphasis, because the first delimiter run
6111 is right-flanking and the second left-flanking:
6113 ```````````````````````````````` example
6116 <p>aa_"bb"_cc</p>
6117 ````````````````````````````````
6120 This is emphasis, even though the opening delimiter is
6121 both left- and right-flanking, because it is preceded by
6124 ```````````````````````````````` example
6127 <p>foo-<em>(bar)</em></p>
6128 ````````````````````````````````
6133 This is not emphasis, because the closing delimiter does
6134 not match the opening delimiter:
6136 ```````````````````````````````` example
6140 ````````````````````````````````
6143 This is not emphasis, because the closing `*` is preceded by
6146 ```````````````````````````````` example
6150 ````````````````````````````````
6153 A newline also counts as whitespace:
6155 ```````````````````````````````` example
6161 ````````````````````````````````
6164 This is not emphasis, because the second `*` is
6165 preceded by punctuation and followed by an alphanumeric
6166 (hence it is not part of a [right-flanking delimiter run]):
6168 ```````````````````````````````` example
6172 ````````````````````````````````
6175 The point of this restriction is more easily appreciated
6178 ```````````````````````````````` example
6181 <p><em>(<em>foo</em>)</em></p>
6182 ````````````````````````````````
6185 Intraword emphasis with `*` is allowed:
6187 ```````````````````````````````` example
6190 <p><em>foo</em>bar</p>
6191 ````````````````````````````````
6197 This is not emphasis, because the closing `_` is preceded by
6200 ```````````````````````````````` example
6204 ````````````````````````````````
6207 This is not emphasis, because the second `_` is
6208 preceded by punctuation and followed by an alphanumeric:
6210 ```````````````````````````````` example
6214 ````````````````````````````````
6217 This is emphasis within emphasis:
6219 ```````````````````````````````` example
6222 <p><em>(<em>foo</em>)</em></p>
6223 ````````````````````````````````
6226 Intraword emphasis is disallowed for `_`:
6228 ```````````````````````````````` example
6232 ````````````````````````````````
6235 ```````````````````````````````` example
6236 _пристаням_стремятся
6238 <p>_пристаням_стремятся</p>
6239 ````````````````````````````````
6242 ```````````````````````````````` example
6245 <p><em>foo_bar_baz</em></p>
6246 ````````````````````````````````
6249 This is emphasis, even though the closing delimiter is
6250 both left- and right-flanking, because it is followed by
6253 ```````````````````````````````` example
6256 <p><em>(bar)</em>.</p>
6257 ````````````````````````````````
6262 ```````````````````````````````` example
6265 <p><strong>foo bar</strong></p>
6266 ````````````````````````````````
6269 This is not strong emphasis, because the opening delimiter is
6270 followed by whitespace:
6272 ```````````````````````````````` example
6276 ````````````````````````````````
6279 This is not strong emphasis, because the opening `**` is preceded
6280 by an alphanumeric and followed by punctuation, and hence
6281 not part of a [left-flanking delimiter run]:
6283 ```````````````````````````````` example
6286 <p>a**"foo"**</p>
6287 ````````````````````````````````
6290 Intraword strong emphasis with `**` is permitted:
6292 ```````````````````````````````` example
6295 <p>foo<strong>bar</strong></p>
6296 ````````````````````````````````
6301 ```````````````````````````````` example
6304 <p><strong>foo bar</strong></p>
6305 ````````````````````````````````
6308 This is not strong emphasis, because the opening delimiter is
6309 followed by whitespace:
6311 ```````````````````````````````` example
6315 ````````````````````````````````
6318 A newline counts as whitespace:
6319 ```````````````````````````````` example
6325 ````````````````````````````````
6328 This is not strong emphasis, because the opening `__` is preceded
6329 by an alphanumeric and followed by punctuation:
6331 ```````````````````````````````` example
6334 <p>a__"foo"__</p>
6335 ````````````````````````````````
6338 Intraword strong emphasis is forbidden with `__`:
6340 ```````````````````````````````` example
6344 ````````````````````````````````
6347 ```````````````````````````````` example
6351 ````````````````````````````````
6354 ```````````````````````````````` example
6355 пристаням__стремятся__
6357 <p>пристаням__стремятся__</p>
6358 ````````````````````````````````
6361 ```````````````````````````````` example
6362 __foo, __bar__, baz__
6364 <p><strong>foo, <strong>bar</strong>, baz</strong></p>
6365 ````````````````````````````````
6368 This is strong emphasis, even though the opening delimiter is
6369 both left- and right-flanking, because it is preceded by
6372 ```````````````````````````````` example
6375 <p>foo-<strong>(bar)</strong></p>
6376 ````````````````````````````````
6382 This is not strong emphasis, because the closing delimiter is preceded
6385 ```````````````````````````````` example
6389 ````````````````````````````````
6392 (Nor can it be interpreted as an emphasized `*foo bar *`, because of
6395 This is not strong emphasis, because the second `**` is
6396 preceded by punctuation and followed by an alphanumeric:
6398 ```````````````````````````````` example
6402 ````````````````````````````````
6405 The point of this restriction is more easily appreciated
6406 with these examples:
6408 ```````````````````````````````` example
6411 <p><em>(<strong>foo</strong>)</em></p>
6412 ````````````````````````````````
6415 ```````````````````````````````` example
6416 **Gomphocarpus (*Gomphocarpus physocarpus*, syn.
6417 *Asclepias physocarpa*)**
6419 <p><strong>Gomphocarpus (<em>Gomphocarpus physocarpus</em>, syn.
6420 <em>Asclepias physocarpa</em>)</strong></p>
6421 ````````````````````````````````
6424 ```````````````````````````````` example
6427 <p><strong>foo "<em>bar</em>" foo</strong></p>
6428 ````````````````````````````````
6433 ```````````````````````````````` example
6436 <p><strong>foo</strong>bar</p>
6437 ````````````````````````````````
6442 This is not strong emphasis, because the closing delimiter is
6443 preceded by whitespace:
6445 ```````````````````````````````` example
6449 ````````````````````````````````
6452 This is not strong emphasis, because the second `__` is
6453 preceded by punctuation and followed by an alphanumeric:
6455 ```````````````````````````````` example
6459 ````````````````````````````````
6462 The point of this restriction is more easily appreciated
6465 ```````````````````````````````` example
6468 <p><em>(<strong>foo</strong>)</em></p>
6469 ````````````````````````````````
6472 Intraword strong emphasis is forbidden with `__`:
6474 ```````````````````````````````` example
6478 ````````````````````````````````
6481 ```````````````````````````````` example
6482 __пристаням__стремятся
6484 <p>__пристаням__стремятся</p>
6485 ````````````````````````````````
6488 ```````````````````````````````` example
6491 <p><strong>foo__bar__baz</strong></p>
6492 ````````````````````````````````
6495 This is strong emphasis, even though the closing delimiter is
6496 both left- and right-flanking, because it is followed by
6499 ```````````````````````````````` example
6502 <p><strong>(bar)</strong>.</p>
6503 ````````````````````````````````
6508 Any nonempty sequence of inline elements can be the contents of an
6511 ```````````````````````````````` example
6514 <p><em>foo <a href="/url">bar</a></em></p>
6515 ````````````````````````````````
6518 ```````````````````````````````` example
6524 ````````````````````````````````
6527 In particular, emphasis and strong emphasis can be nested
6530 ```````````````````````````````` example
6533 <p><em>foo <strong>bar</strong> baz</em></p>
6534 ````````````````````````````````
6537 ```````````````````````````````` example
6540 <p><em>foo <em>bar</em> baz</em></p>
6541 ````````````````````````````````
6544 ```````````````````````````````` example
6547 <p><em><em>foo</em> bar</em></p>
6548 ````````````````````````````````
6551 ```````````````````````````````` example
6554 <p><em>foo <em>bar</em></em></p>
6555 ````````````````````````````````
6558 ```````````````````````````````` example
6561 <p><em>foo <strong>bar</strong> baz</em></p>
6562 ````````````````````````````````
6564 ```````````````````````````````` example
6567 <p><em>foo<strong>bar</strong>baz</em></p>
6568 ````````````````````````````````
6570 Note that in the preceding case, the interpretation
6573 <p><em>foo</em><em>bar<em></em>baz</em></p>
6577 is precluded by the condition that a delimiter that
6578 can both open and close (like the `*` after `foo`)
6579 cannot form emphasis if the sum of the lengths of
6580 the delimiter runs containing the opening and
6581 closing delimiters is a multiple of 3.
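The multiple-of-3 condition from rules 9 and 10 can be written as a single check. In this Python sketch the function name `may_pair` and its parameters are hypothetical. The lengths are those of the full delimiter runs, and the two flags record whether the opener could also close and whether the closer could also open.

```python
def may_pair(opener_len: int, closer_len: int,
             opener_can_close: bool, closer_can_open: bool) -> bool:
    """Return False when the multiple-of-3 condition forbids the pairing."""
    # The condition applies only if one of the delimiters can both
    # open and close.
    if opener_can_close or closer_can_open:
        return (opener_len + closer_len) % 3 != 0
    return True
```

In `*foo**bar**baz*`, the initial `*` belongs to a run of length 1 and the `**` after `foo` is a run of length 2 that can both open and close, so `may_pair(1, 2, False, True)` is false and the two cannot form emphasis. The two `**` runs have lengths summing to 4, so they may pair as strong emphasis.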
6583 The same condition ensures that the following
6584 cases are all strong emphasis nested inside
6585 emphasis, even when the interior spaces are
6589 ```````````````````````````````` example
6592 <p><em><strong>foo</strong> bar</em></p>
6593 ````````````````````````````````
6596 ```````````````````````````````` example
6599 <p><em>foo <strong>bar</strong></em></p>
6600 ````````````````````````````````
6603 ```````````````````````````````` example
6606 <p><em>foo<strong>bar</strong></em></p>
6607 ````````````````````````````````
6610 Indefinite levels of nesting are possible:
6612 ```````````````````````````````` example
6613 *foo **bar *baz* bim** bop*
6615 <p><em>foo <strong>bar <em>baz</em> bim</strong> bop</em></p>
6616 ````````````````````````````````
6619 ```````````````````````````````` example
6622 <p><em>foo <a href="/url"><em>bar</em></a></em></p>
6623 ````````````````````````````````
6626 There can be no empty emphasis or strong emphasis:
6628 ```````````````````````````````` example
6629 ** is not an empty emphasis
6631 <p>** is not an empty emphasis</p>
6632 ````````````````````````````````
6635 ```````````````````````````````` example
6636 **** is not an empty strong emphasis
6638 <p>**** is not an empty strong emphasis</p>
6639 ````````````````````````````````
6645 Any nonempty sequence of inline elements can be the contents of a
6646 strongly emphasized span.
6648 ```````````````````````````````` example
6651 <p><strong>foo <a href="/url">bar</a></strong></p>
6652 ````````````````````````````````
6655 ```````````````````````````````` example
6661 ````````````````````````````````
6664 In particular, emphasis and strong emphasis can be nested
6665 inside strong emphasis:
6667 ```````````````````````````````` example
6670 <p><strong>foo <em>bar</em> baz</strong></p>
6671 ````````````````````````````````
6674 ```````````````````````````````` example
6677 <p><strong>foo <strong>bar</strong> baz</strong></p>
6678 ````````````````````````````````
6681 ```````````````````````````````` example
6684 <p><strong><strong>foo</strong> bar</strong></p>
6685 ````````````````````````````````
6688 ```````````````````````````````` example
6691 <p><strong>foo <strong>bar</strong></strong></p>
6692 ````````````````````````````````
6695 ```````````````````````````````` example
6698 <p><strong>foo <em>bar</em> baz</strong></p>
6699 ````````````````````````````````
6702 ```````````````````````````````` example
6705 <p><strong>foo<em>bar</em>baz</strong></p>
6706 ````````````````````````````````
6709 ```````````````````````````````` example
6712 <p><strong><em>foo</em> bar</strong></p>
6713 ````````````````````````````````
6716 ```````````````````````````````` example
6719 <p><strong>foo <em>bar</em></strong></p>
6720 ````````````````````````````````
6723 Indefinite levels of nesting are possible:
6725 ```````````````````````````````` example
6729 <p><strong>foo <em>bar <strong>baz</strong>
6730 bim</em> bop</strong></p>
6731 ````````````````````````````````
6734 ```````````````````````````````` example
6735 **foo [*bar*](/url)**
6737 <p><strong>foo <a href="/url"><em>bar</em></a></strong></p>
6738 ````````````````````````````````
6741 There can be no empty emphasis or strong emphasis:
6743 ```````````````````````````````` example
6744 __ is not an empty emphasis
6746 <p>__ is not an empty emphasis</p>
6747 ````````````````````````````````
6750 ```````````````````````````````` example
6751 ____ is not an empty strong emphasis
6753 <p>____ is not an empty strong emphasis</p>
6754 ````````````````````````````````
6760 ```````````````````````````````` example
6764 ````````````````````````````````
6767 ```````````````````````````````` example
6770 <p>foo <em>*</em></p>
6771 ````````````````````````````````
6774 ```````````````````````````````` example
6777 <p>foo <em>_</em></p>
6778 ````````````````````````````````
6781 ```````````````````````````````` example
6785 ````````````````````````````````
6788 ```````````````````````````````` example
6791 <p>foo <strong>*</strong></p>
6792 ````````````````````````````````
6795 ```````````````````````````````` example
6798 <p>foo <strong>_</strong></p>
6799 ````````````````````````````````
6802 Note that when delimiters do not match evenly, Rule 11 determines
6803 that the excess literal `*` characters will appear outside of the
6804 emphasis, rather than inside it:
6806 ```````````````````````````````` example
6809 <p>*<em>foo</em></p>
6810 ````````````````````````````````
6813 ```````````````````````````````` example
6816 <p><em>foo</em>*</p>
6817 ````````````````````````````````
6820 ```````````````````````````````` example
6823 <p>*<strong>foo</strong></p>
6824 ````````````````````````````````
6827 ```````````````````````````````` example
6830 <p>***<em>foo</em></p>
6831 ````````````````````````````````
6834 ```````````````````````````````` example
6837 <p><strong>foo</strong>*</p>
6838 ````````````````````````````````
6841 ```````````````````````````````` example
6844 <p><em>foo</em>***</p>
6845 ````````````````````````````````
6851 ```````````````````````````````` example
6855 ````````````````````````````````
6858 ```````````````````````````````` example
6861 <p>foo <em>_</em></p>
6862 ````````````````````````````````
6865 ```````````````````````````````` example
6868 <p>foo <em>*</em></p>
6869 ````````````````````````````````
6872 ```````````````````````````````` example
6876 ````````````````````````````````
6879 ```````````````````````````````` example
6882 <p>foo <strong>_</strong></p>
6883 ````````````````````````````````
6886 ```````````````````````````````` example
6889 <p>foo <strong>*</strong></p>
6890 ````````````````````````````````
6893 ```````````````````````````````` example
6896 <p>_<em>foo</em></p>
6897 ````````````````````````````````
6900 Note that when delimiters do not match evenly, Rule 12 determines
6901 that the excess literal `_` characters will appear outside of the
6902 emphasis, rather than inside it:
6904 ```````````````````````````````` example
6907 <p><em>foo</em>_</p>
6908 ````````````````````````````````
6911 ```````````````````````````````` example
6914 <p>_<strong>foo</strong></p>
6915 ````````````````````````````````
6918 ```````````````````````````````` example
6921 <p>___<em>foo</em></p>
6922 ````````````````````````````````
6925 ```````````````````````````````` example
6928 <p><strong>foo</strong>_</p>
6929 ````````````````````````````````
6932 ```````````````````````````````` example
6935 <p><em>foo</em>___</p>
6936 ````````````````````````````````
6939 Rule 13 implies that if you want emphasis nested directly inside
6940 emphasis, you must use different delimiters:
6942 ```````````````````````````````` example
6945 <p><strong>foo</strong></p>
6946 ````````````````````````````````
6949 ```````````````````````````````` example
6952 <p><em><em>foo</em></em></p>
6953 ````````````````````````````````
6956 ```````````````````````````````` example
6959 <p><strong>foo</strong></p>
6960 ````````````````````````````````
6963 ```````````````````````````````` example
6966 <p><em><em>foo</em></em></p>
6967 ````````````````````````````````
6970 However, strong emphasis within strong emphasis is possible without
6971 switching delimiters:
6973 ```````````````````````````````` example
6976 <p><strong><strong>foo</strong></strong></p>
6977 ````````````````````````````````
6980 ```````````````````````````````` example
6983 <p><strong><strong>foo</strong></strong></p>
6984 ````````````````````````````````
6988 Rule 13 can be applied to arbitrarily long sequences of
6991 ```````````````````````````````` example
6994 <p><strong><strong><strong>foo</strong></strong></strong></p>
6995 ````````````````````````````````
7000 ```````````````````````````````` example
7003 <p><strong><em>foo</em></strong></p>
7004 ````````````````````````````````
7007 ```````````````````````````````` example
7010 <p><strong><strong><em>foo</em></strong></strong></p>
7011 ````````````````````````````````
7016 ```````````````````````````````` example
7019 <p><em>foo _bar</em> baz_</p>
7020 ````````````````````````````````
7023 ```````````````````````````````` example
7024 *foo __bar *baz bim__ bam*
7026 <p><em>foo <strong>bar *baz bim</strong> bam</em></p>
7027 ````````````````````````````````
7032 ```````````````````````````````` example
7035 <p>**foo <strong>bar baz</strong></p>
7036 ````````````````````````````````
7039 ```````````````````````````````` example
7042 <p>*foo <em>bar baz</em></p>
7043 ````````````````````````````````
7048 ```````````````````````````````` example
7051 <p>*<a href="/url">bar*</a></p>
7052 ````````````````````````````````
7055 ```````````````````````````````` example
7058 <p>_foo <a href="/url">bar_</a></p>
7059 ````````````````````````````````
7062 ```````````````````````````````` example
7063 *<img src="foo" title="*"/>
7065 <p>*<img src="foo" title="*"/></p>
7066 ````````````````````````````````
7069 ```````````````````````````````` example
7072 <p>**<a href="**"></p>
7073 ````````````````````````````````
7076 ```````````````````````````````` example
7079 <p>__<a href="__"></p>
7080 ````````````````````````````````
7083 ```````````````````````````````` example
7086 <p><em>a <code>*</code></em></p>
7087 ````````````````````````````````
7090 ```````````````````````````````` example
7093 <p><em>a <code>_</code></em></p>
7094 ````````````````````````````````
7097 ```````````````````````````````` example
7098 **a<http://foo.bar/?q=**>
7100 <p>**a<a href="http://foo.bar/?q=**">http://foo.bar/?q=**</a></p>
7101 ````````````````````````````````
7104 ```````````````````````````````` example
7105 __a<http://foo.bar/?q=__>
7107 <p>__a<a href="http://foo.bar/?q=__">http://foo.bar/?q=__</a></p>
7108 ````````````````````````````````
7114 A link contains [link text] (the visible text), a [link destination]
7115 (the URI that is the link destination), and optionally a [link title].
7116 There are two basic kinds of links in Markdown. In [inline links] the
7117 destination and title are given immediately after the link text. In
7118 [reference links] the destination and title are defined elsewhere in
7121 A [link text](@) consists of a sequence of zero or more
7122 inline elements enclosed by square brackets (`[` and `]`). The
7123 following rules apply:
7125 - Links may not contain other links, at any level of nesting. If
7126 multiple otherwise valid link definitions appear nested inside each
7127 other, the innermost definition is used.
7129 - Brackets are allowed in the [link text] only if (a) they
7130 are backslash-escaped or (b) they appear as a matched pair of brackets,
7131 with an open bracket `[`, a sequence of zero or more inlines, and
7132 a close bracket `]`.
7134 - Backtick [code spans], [autolinks], and raw [HTML tags] bind more tightly
7135 than the brackets in link text. Thus, for example,
7136 `` [foo`]` `` could not be a link text, since the second `]`
7137 is part of a code span.
7139 - The brackets in link text bind more tightly than markers for
7140 [emphasis and strong emphasis]. Thus, for example, `*[foo*](url)` is a link.
7142 A [link destination](@) consists of either
7144 - a sequence of zero or more characters between an opening `<` and a
7145 closing `>` that contains no spaces, line breaks, or unescaped
7146 `<` or `>` characters, or
7148 - a nonempty sequence of characters that does not include
7149 ASCII space or control characters, and includes parentheses
7150 only if (a) they are backslash-escaped or (b) they are part of
7151 a balanced pair of unescaped parentheses that is not itself
7152 inside a balanced pair of unescaped parentheses.
7154 A [link title](@) consists of either
7156 - a sequence of zero or more characters between straight double-quote
7157 characters (`"`), including a `"` character only if it is
7158 backslash-escaped, or
7160 - a sequence of zero or more characters between straight single-quote
7161 characters (`'`), including a `'` character only if it is
7162 backslash-escaped, or
7164 - a sequence of zero or more characters between matching parentheses
7165 (`(...)`), including a `)` character only if it is backslash-escaped.
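The three forms of link title differ only in their delimiters, so a single scanner can cover them. This Python sketch is illustrative: the name `is_link_title` is hypothetical, a backslash is assumed to escape only an ASCII punctuation character, and the check looks only at delimiters and escapes.

```python
import string

# Opening delimiter mapped to its required closing delimiter.
_CLOSERS = {'"': '"', "'": "'", "(": ")"}


def is_link_title(s: str) -> bool:
    """Check that `s` is delimited as a link title with no unescaped closer inside."""
    if len(s) < 2 or s[0] not in _CLOSERS:
        return False
    closer = _CLOSERS[s[0]]
    i = 1
    while i < len(s):
        ch = s[i]
        if ch == "\\" and i + 1 < len(s) and s[i + 1] in string.punctuation:
            i += 2  # skip the escaped character
            continue
        if ch == closer:
            return i == len(s) - 1  # the first unescaped closer must end the title
        i += 1
    return False  # never closed
```

A title such as `"a \" b"` passes because the inner quote is escaped, while `"a " b"` fails because the first unescaped quote does not end the string.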
7167 Although [link titles] may span multiple lines, they may not contain
7170 An [inline link](@) consists of a [link text] followed immediately
7171 by a left parenthesis `(`, optional [whitespace], an optional
7172 [link destination], an optional [link title] separated from the link
7173 destination by [whitespace], optional [whitespace], and a right
7174 parenthesis `)`. The link's text consists of the inlines contained
7175 in the [link text] (excluding the enclosing square brackets).
7176 The link's URI consists of the link destination, excluding enclosing
7177 `<...>` if present, with backslash-escapes in effect as described
7178 above. The link's title consists of the link title, excluding its
7179 enclosing delimiters, with backslash-escapes in effect as described above.
7182 Here is a simple inline link:
7184 ```````````````````````````````` example
7185 [link](/uri "title")
7187 <p><a href="/uri" title="title">link</a></p>
7188 ````````````````````````````````
7191 The title may be omitted:
7193 ```````````````````````````````` example
7196 <p><a href="/uri">link</a></p>
7197 ````````````````````````````````
7200 Both the title and the destination may be omitted:
7202 ```````````````````````````````` example
7205 <p><a href="">link</a></p>
7206 ````````````````````````````````
7209 ```````````````````````````````` example
7212 <p><a href="">link</a></p>
7213 ````````````````````````````````
7216 The destination cannot contain spaces or line breaks,
7217 even if enclosed in pointy brackets:
7219 ```````````````````````````````` example
7222 <p>[link](/my uri)</p>
7223 ````````````````````````````````
7226 ```````````````````````````````` example
7229 <p>[link](</my uri>)</p>
7230 ````````````````````````````````
7233 ```````````````````````````````` example
7239 ````````````````````````````````
7242 ```````````````````````````````` example
7248 ````````````````````````````````
7250 Parentheses inside the link destination may be escaped:
7252 ```````````````````````````````` example
7255 <p><a href="(foo)">link</a></p>
7256 ````````````````````````````````
7258 One level of balanced parentheses is allowed without escaping:
7260 ```````````````````````````````` example
7261 [link]((foo)and(bar))
7263 <p><a href="(foo)and(bar)">link</a></p>
7264 ````````````````````````````````
7266 However, if you have parentheses within parentheses, you need to escape
7267 them or use the `<...>` form:
7269 ```````````````````````````````` example
7270 [link](foo(and(bar)))
7272 <p>[link](foo(and(bar)))</p>
7273 ````````````````````````````````
7276 ```````````````````````````````` example
7277 [link](foo(and\(bar\)))
7279 <p><a href="foo(and(bar))">link</a></p>
7280 ````````````````````````````````
7283 ```````````````````````````````` example
7284 [link](<foo(and(bar))>)
7286 <p><a href="foo(and(bar))">link</a></p>
7287 ````````````````````````````````
7290 Parentheses and other symbols can also be escaped, as usual in Markdown:
7293 ```````````````````````````````` example
7296 <p><a href="foo):">link</a></p>
7297 ````````````````````````````````
7300 A link can contain fragment identifiers and queries:
7302 ```````````````````````````````` example
7305 [link](http://example.com#fragment)
7307 [link](http://example.com?foo=3#frag)
7309 <p><a href="#fragment">link</a></p>
7310 <p><a href="http://example.com#fragment">link</a></p>
7311 <p><a href="http://example.com?foo=3#frag">link</a></p>
7312 ````````````````````````````````
7315 Note that a backslash before a non-escapable character is just a backslash:
7318 ```````````````````````````````` example
7321 <p><a href="foo%5Cbar">link</a></p>
7322 ````````````````````````````````
7325 URL-escaping should be left alone inside the destination, as all
7326 URL-escaped characters are also valid URL characters. Entity and
7327 numerical character references in the destination will be parsed
7328 into the corresponding Unicode code points, as usual. These may
7329 be optionally URL-escaped when written as HTML, but this spec
7330 does not enforce any particular policy for rendering URLs in
7331 HTML or other formats. Renderers may make different decisions
7332 about how to escape or normalize URLs in the output.
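
Since the spec leaves URL rendering policy open, the following is only one possible policy, sketched with Python's standard `urllib.parse.quote`. The function name `escape_uri_for_html` and the `SAFE` character set are assumptions made for this illustration. Keeping `%` in the safe set leaves existing URL-escapes alone, while other characters outside the set are percent-encoded as UTF-8.

```python
from urllib.parse import quote

# Characters left untouched. "%" keeps existing escapes intact; the rest is
# common URL punctuation. This set is an assumption, not part of the spec.
SAFE = "%/?:@&=+$,-_.!~*'()#;"


def escape_uri_for_html(uri):
    """Percent-encode every character outside SAFE and ASCII alphanumerics."""
    return quote(uri, safe=SAFE)
```

HTML attribute escaping (for example, of `&`) would be a separate step and is not shown here.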
7334 ```````````````````````````````` example
7335 [link](foo%20bä)
7337 <p><a href="foo%20b%C3%A4">link</a></p>
7338 ````````````````````````````````
7341 Note that, because titles can often be parsed as destinations,
7342 if you try to omit the destination and keep the title, you'll
7343 get unexpected results:
7345 ```````````````````````````````` example
7348 <p><a href="%22title%22">link</a></p>
7349 ````````````````````````````````
7352 Titles may be in single quotes, double quotes, or parentheses:
7354 ```````````````````````````````` example
7355 [link](/url "title")
7356 [link](/url 'title')
7357 [link](/url (title))
7359 <p><a href="/url" title="title">link</a>
7360 <a href="/url" title="title">link</a>
7361 <a href="/url" title="title">link</a></p>
7362 ````````````````````````````````
7365 Backslash escapes and entity and numeric character references
7366 may be used in titles:
7368 ```````````````````````````````` example
7369 [link](/url "title \""")
7371 <p><a href="/url" title="title """>link</a></p>
7372 ````````````````````````````````
7375 Titles must be separated from the link destination by [whitespace].
7376 Other [Unicode whitespace], such as a non-breaking space, does not work.
7378 ```````````````````````````````` example
7379 [link](/url "title")
7381 <p><a href="/url%C2%A0%22title%22">link</a></p>
7382 ````````````````````````````````
7385 Nested balanced quotes are not allowed without escaping:
7387 ```````````````````````````````` example
7388 [link](/url "title "and" title")
7390 <p>[link](/url "title "and" title")</p>
7391 ````````````````````````````````
7394 But it is easy to work around this by using a different quote type:
7396 ```````````````````````````````` example
7397 [link](/url 'title "and" title')
7399 <p><a href="/url" title="title "and" title">link</a></p>
7400 ````````````````````````````````
7403 (Note: `Markdown.pl` did allow double quotes inside a double-quoted
7404 title, and its test suite included a test demonstrating this.
7405 But it is hard to see a good rationale for the extra complexity this
7406 brings, since there are already many ways---backslash escaping,
7407 entity and numeric character references, or using a different
7408 quote type for the enclosing title---to write titles containing
7409 double quotes. `Markdown.pl`'s handling of titles has a number
7410 of other strange features. For example, it allows single-quoted
7411 titles in inline links, but not reference links. And, in
7412 reference links but not inline links, it allows a title to begin
7413 with `"` and end with `)`. `Markdown.pl` 1.0.1 even allows
7414 titles with no closing quotation mark, though 1.0.2b8 does not.
7415 It seems preferable to adopt a simple, rational rule that works
7416 the same way in inline links and link reference definitions.)
7418 [Whitespace] is allowed around the destination and title:
7420 ```````````````````````````````` example
7424 <p><a href="/uri" title="title">link</a></p>
7425 ````````````````````````````````
7428 But it is not allowed between the link text and the
7429 following parenthesis:
7431 ```````````````````````````````` example
7434 <p>[link] (/uri)</p>
7435 ````````````````````````````````
7438 The link text may contain balanced brackets, but not unbalanced ones,
7439 unless they are escaped:
7441 ```````````````````````````````` example
7442 [link [foo [bar]]](/uri)
7444 <p><a href="/uri">link [foo [bar]]</a></p>
7445 ````````````````````````````````
7448 ```````````````````````````````` example
7451 <p>[link] bar](/uri)</p>
7452 ````````````````````````````````
7455 ```````````````````````````````` example
7458 <p>[link <a href="/uri">bar</a></p>
7459 ````````````````````````````````
7462 ```````````````````````````````` example
7465 <p><a href="/uri">link [bar</a></p>
7466 ````````````````````````````````
7469 The link text may contain inline content:
7471 ```````````````````````````````` example
7472 [link *foo **bar** `#`*](/uri)
7474 <p><a href="/uri">link <em>foo <strong>bar</strong> <code>#</code></em></a></p>
7475 ````````````````````````````````
7478 ```````````````````````````````` example
7479 [![moon](moon.jpg)](/uri)
7481 <p><a href="/uri"><img src="moon.jpg" alt="moon" /></a></p>
7482 ````````````````````````````````
7485 However, links may not contain other links, at any level of nesting.
7487 ```````````````````````````````` example
7488 [foo [bar](/uri)](/uri)
7490 <p>[foo <a href="/uri">bar</a>](/uri)</p>
7491 ````````````````````````````````
7494 ```````````````````````````````` example
7495 [foo *[bar [baz](/uri)](/uri)*](/uri)
7497 <p>[foo <em>[bar <a href="/uri">baz</a>](/uri)</em>](/uri)</p>
7498 ````````````````````````````````
7501 ```````````````````````````````` example
7502 ![[[foo](uri1)](uri2)](uri3)
7504 <p><img src="uri3" alt="[foo](uri2)" /></p>
7505 ````````````````````````````````
7508 These cases illustrate the precedence of link text grouping over emphasis grouping:
7511 ```````````````````````````````` example
7514 <p>*<a href="/uri">foo*</a></p>
7515 ````````````````````````````````
7518 ```````````````````````````````` example
7521 <p><a href="baz*">foo *bar</a></p>
7522 ````````````````````````````````
7525 Note that brackets that *aren't* part of links do not take precedence:
7528 ```````````````````````````````` example
7531 <p><em>foo [bar</em> baz]</p>
7532 ````````````````````````````````
7535 These cases illustrate the precedence of HTML tags, code spans,
7536 and autolinks over link grouping:
7538 ```````````````````````````````` example
7539 [foo <bar attr="](baz)">
7541 <p>[foo <bar attr="](baz)"></p>
7542 ````````````````````````````````
7545 ```````````````````````````````` example
7548 <p>[foo<code>](/uri)</code></p>
7549 ````````````````````````````````
7552 ```````````````````````````````` example
7553 [foo<http://example.com/?search=](uri)>
7555 <p>[foo<a href="http://example.com/?search=%5D(uri)">http://example.com/?search=](uri)</a></p>
7556 ````````````````````````````````
7559 There are three kinds of [reference link](@)s:
7560 [full](#full-reference-link), [collapsed](#collapsed-reference-link),
7561 and [shortcut](#shortcut-reference-link).
7563 A [full reference link](@)
7564 consists of a [link text] immediately followed by a [link label]
7565 that [matches] a [link reference definition] elsewhere in the document.
7567 A [link label](@) begins with a left bracket (`[`) and ends
7568 with the first right bracket (`]`) that is not backslash-escaped.
7569 Between these brackets there must be at least one [non-whitespace character].
7570 Unescaped square bracket characters are not allowed in
7571 [link labels]. A link label can have at most 999
7572 characters inside the square brackets.
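
The constraints on a link label can be checked in a single pass. The sketch below is illustrative only; `scan_link_label` and `WHITESPACE` are names assumed for this example.

```python
WHITESPACE = " \t\n\x0b\x0c\r"


def scan_link_label(text, start=0):
    """Return the index just past a link label beginning at `start`, or None."""
    if start >= len(text) or text[start] != "[":
        return None
    i = start + 1
    has_content = False
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            # A backslash and the character after it are consumed together,
            # so an escaped bracket neither ends nor invalidates the label.
            has_content = True
            i += 2
            continue
        if ch == "[":
            return None  # unescaped opening bracket is not allowed
        if ch == "]":
            inner_length = i - start - 1
            if has_content and inner_length <= 999:
                return i + 1
            return None
        if ch not in WHITESPACE:
            has_content = True
        i += 1
    return None
```

Note that `[bar\\]` is accepted with the label ending at the `]`, because the two backslashes form an escaped backslash and leave the bracket unescaped.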
7574 One label [matches](@)
7575 another if and only if their normalized forms are equal. To normalize a
7576 label, perform the *Unicode case fold* and collapse consecutive internal
7577 [whitespace] to a single space. If there are multiple
7578 matching [link reference definitions], the one that comes first in the
7579 document is used. (It is desirable in such cases to emit a warning.)
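
Label matching can be sketched with Python's `str.casefold`, which performs Unicode case folding. The helper names `normalize_label` and `build_reference_map` are hypothetical, and stripping leading and trailing whitespace before collapsing is an assumption of this sketch rather than something stated above.

```python
import re

_WS_RUN = re.compile(r"[ \t\n\x0b\x0c\r]+")


def normalize_label(label):
    """Normalize the text found between the brackets of a link label."""
    # Assumption: leading and trailing whitespace is dropped first.
    trimmed = label.strip(" \t\n\x0b\x0c\r")
    # Collapse internal whitespace runs, then apply the Unicode case fold.
    return _WS_RUN.sub(" ", trimmed).casefold()


def build_reference_map(definitions):
    """Map normalized labels to (destination, title); the first definition wins."""
    refs = {}
    for label, destination, title in definitions:
        # setdefault keeps an existing entry, so later duplicates are ignored.
        refs.setdefault(normalize_label(label), (destination, title))
    return refs
```

Using `casefold` rather than `lower` matters for labels such as `Straße` and `STRASSE`, which fold to the same string.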
7581 The contents of the first link label are parsed as inlines, which are
7582 used as the link's text. The link's URI and title are provided by the
7583 matching [link reference definition].
7585 Here is a simple example:
7587 ```````````````````````````````` example
7592 <p><a href="/url" title="title">foo</a></p>
7593 ````````````````````````````````
7596 The rules for the [link text] are the same as with
7597 [inline links]. Thus:
7599 The link text may contain balanced brackets, but not unbalanced ones,
7600 unless they are escaped:
7602 ```````````````````````````````` example
7603 [link [foo [bar]]][ref]
7607 <p><a href="/uri">link [foo [bar]]</a></p>
7608 ````````````````````````````````
7611 ```````````````````````````````` example
7616 <p><a href="/uri">link [bar</a></p>
7617 ````````````````````````````````
7620 The link text may contain inline content:
7622 ```````````````````````````````` example
7623 [link *foo **bar** `#`*][ref]
7627 <p><a href="/uri">link <em>foo <strong>bar</strong> <code>#</code></em></a></p>
7628 ````````````````````````````````
7631 ```````````````````````````````` example
7632 [![moon](moon.jpg)][ref]
7636 <p><a href="/uri"><img src="moon.jpg" alt="moon" /></a></p>
7637 ````````````````````````````````
7640 However, links may not contain other links, at any level of nesting.
7642 ```````````````````````````````` example
7643 [foo [bar](/uri)][ref]
7647 <p>[foo <a href="/uri">bar</a>]<a href="/uri">ref</a></p>
7648 ````````````````````````````````
7651 ```````````````````````````````` example
7652 [foo *bar [baz][ref]*][ref]
7656 <p>[foo <em>bar <a href="/uri">baz</a></em>]<a href="/uri">ref</a></p>
7657 ````````````````````````````````
7660 (In the examples above, we have two [shortcut reference links]
7661 instead of one [full reference link].)
7663 The following cases illustrate the precedence of link text grouping over emphasis grouping:
7666 ```````````````````````````````` example
7671 <p>*<a href="/uri">foo*</a></p>
7672 ````````````````````````````````
7675 ```````````````````````````````` example
7680 <p><a href="/uri">foo *bar</a></p>
7681 ````````````````````````````````
7684 These cases illustrate the precedence of HTML tags, code spans,
7685 and autolinks over link grouping:
7687 ```````````````````````````````` example
7688 [foo <bar attr="][ref]">
7692 <p>[foo <bar attr="][ref]"></p>
7693 ````````````````````````````````
7696 ```````````````````````````````` example
7701 <p>[foo<code>][ref]</code></p>
7702 ````````````````````````````````
7705 ```````````````````````````````` example
7706 [foo<http://example.com/?search=][ref]>
7710 <p>[foo<a href="http://example.com/?search=%5D%5Bref%5D">http://example.com/?search=][ref]</a></p>
7711 ````````````````````````````````
7714 Matching is case-insensitive:
7716 ```````````````````````````````` example
7721 <p><a href="/url" title="title">foo</a></p>
7722 ````````````````````````````````
7725 Unicode case fold is used:
7727 ```````````````````````````````` example
7728 [Толпой][Толпой] is a Russian word.
7732 <p><a href="/url">Толпой</a> is a Russian word.</p>
7733 ````````````````````````````````
7736 Consecutive internal [whitespace] is treated as one space for
7737 purposes of determining matching:
7739 ```````````````````````````````` example
7745 <p><a href="/url">Baz</a></p>
7746 ````````````````````````````````
7749 No [whitespace] is allowed between the [link text] and the [link label]:
7752 ```````````````````````````````` example
7757 <p>[foo] <a href="/url" title="title">bar</a></p>
7758 ````````````````````````````````
7761 ```````````````````````````````` example
7768 <a href="/url" title="title">bar</a></p>
7769 ````````````````````````````````
7772 This is a departure from John Gruber's original Markdown syntax
7773 description, which explicitly allows whitespace between the link
7774 text and the link label. It brings reference links in line with
7775 [inline links], which (according to both original Markdown and
7776 this spec) cannot have whitespace after the link text. More
7777 importantly, it prevents inadvertent capture of consecutive
7778 [shortcut reference links]. If whitespace is allowed between the
7779 link text and the link label, then in the following we will have
7780 a single reference link, not two shortcut reference links, as
7791 (Note that [shortcut reference links] were introduced by Gruber
7792 himself in a beta version of `Markdown.pl`, but never included
7793 in the official syntax description. Without shortcut reference
7794 links, it is harmless to allow space between the link text and
7795 link label; but once shortcut references are introduced, it is
7796 too dangerous to allow this, as it frequently leads to
7797 unintended results.)
7799 When there are multiple matching [link reference definitions], the first is used:
7802 ```````````````````````````````` example
7809 <p><a href="/url1">bar</a></p>
7810 ````````````````````````````````
7813 Note that matching is performed on normalized strings, not parsed
7814 inline content. So the following does not match, even though the
7815 labels define equivalent inline content:
7817 ```````````````````````````````` example
7823 ````````````````````````````````
7826 [Link labels] cannot contain brackets, unless they are backslash-escaped:
7829 ```````````````````````````````` example
7836 ````````````````````````````````
7839 ```````````````````````````````` example
7844 <p>[foo][ref[bar]]</p>
7845 <p>[ref[bar]]: /uri</p>
7846 ````````````````````````````````
7849 ```````````````````````````````` example
7855 <p>[[[foo]]]: /url</p>
7856 ````````````````````````````````
7859 ```````````````````````````````` example
7864 <p><a href="/uri">foo</a></p>
7865 ````````````````````````````````
7868 Note that in this example `]` is not backslash-escaped:
7870 ```````````````````````````````` example
7875 <p><a href="/uri">bar\</a></p>
7876 ````````````````````````````````
7879 A [link label] must contain at least one [non-whitespace character]:
7881 ```````````````````````````````` example
7888 ````````````````````````````````
7891 ```````````````````````````````` example
7902 ````````````````````````````````
7905 A [collapsed reference link](@)
7906 consists of a [link label] that [matches] a
7907 [link reference definition] elsewhere in the
7908 document, followed by the string `[]`.
7909 The contents of the first link label are parsed as inlines,
7910 which are used as the link's text. The link's URI and title are
7911 provided by the matching link reference definition. Thus,
7912 `[foo][]` is equivalent to `[foo][foo]`.
7914 ```````````````````````````````` example
7919 <p><a href="/url" title="title">foo</a></p>
7920 ````````````````````````````````
7923 ```````````````````````````````` example
7926 [*foo* bar]: /url "title"
7928 <p><a href="/url" title="title"><em>foo</em> bar</a></p>
7929 ````````````````````````````````
7932 The link labels are case-insensitive:
7934 ```````````````````````````````` example
7939 <p><a href="/url" title="title">Foo</a></p>
7940 ````````````````````````````````
7944 As with full reference links, [whitespace] is not
7945 allowed between the two sets of brackets:
7947 ```````````````````````````````` example
7953 <p><a href="/url" title="title">foo</a>
7955 ````````````````````````````````
7958 A [shortcut reference link](@)
7959 consists of a [link label] that [matches] a
7960 [link reference definition] elsewhere in the
7961 document and is not followed by `[]` or a link label.
7962 The contents of the first link label are parsed as inlines,
7963 which are used as the link's text. The link's URI and title
7964 are provided by the matching link reference definition.
7965 Thus, `[foo]` is equivalent to `[foo][]`.
7967 ```````````````````````````````` example
7972 <p><a href="/url" title="title">foo</a></p>
7973 ````````````````````````````````
7976 ```````````````````````````````` example
7979 [*foo* bar]: /url "title"
7981 <p><a href="/url" title="title"><em>foo</em> bar</a></p>
7982 ````````````````````````````````
7985 ```````````````````````````````` example
7988 [*foo* bar]: /url "title"
7990 <p>[<a href="/url" title="title"><em>foo</em> bar</a>]</p>
7991 ````````````````````````````````
7994 ```````````````````````````````` example
7999 <p>[[bar <a href="/url">foo</a></p>
8000 ````````````````````````````````
8003 The link labels are case-insensitive:
8005 ```````````````````````````````` example
8010 <p><a href="/url" title="title">Foo</a></p>
8011 ````````````````````````````````
8014 A space after the link text should be preserved:
8016 ```````````````````````````````` example
8021 <p><a href="/url">foo</a> bar</p>
8022 ````````````````````````````````
8025 If you just want bracketed text, you can backslash-escape the
8026 opening bracket to avoid links:
8028 ```````````````````````````````` example
8034 ````````````````````````````````
8037 Note that this is a link, because a link label ends with the first
8038 following closing bracket:
8040 ```````````````````````````````` example
8045 <p>*<a href="/url">foo*</a></p>
8046 ````````````````````````````````
8049 Full and collapsed references take precedence over shortcut references:
8052 ```````````````````````````````` example
8058 <p><a href="/url2">foo</a></p>
8059 ````````````````````````````````
8061 ```````````````````````````````` example
8066 <p><a href="/url1">foo</a></p>
8067 ````````````````````````````````
8069 Inline links also take precedence:
8071 ```````````````````````````````` example
8076 <p><a href="">foo</a></p>
8077 ````````````````````````````````
8079 ```````````````````````````````` example
8084 <p><a href="/url1">foo</a>(not a link)</p>
8085 ````````````````````````````````
8087 In the following case `[bar][baz]` is parsed as a reference,
8088 `[foo]` as normal text:
8090 ```````````````````````````````` example
8095 <p>[foo]<a href="/url">bar</a></p>
8096 ````````````````````````````````
8099 Here, though, `[foo][bar]` is parsed as a reference, since `[bar]` is defined:
8102 ```````````````````````````````` example
8108 <p><a href="/url2">foo</a><a href="/url1">baz</a></p>
8109 ````````````````````````````````
8112 Here `[foo]` is not parsed as a shortcut reference, because it
8113 is followed by a link label (even though `[bar]` is not defined):
8115 ```````````````````````````````` example
8121 <p>[foo]<a href="/url1">bar</a></p>
8122 ````````````````````````````````
8128 Syntax for images is like the syntax for links, with one
8129 difference. Instead of [link text], we have an
8130 [image description](@). The rules for this are the
8131 same as for [link text], except that (a) an
8132 image description starts with `![` rather than `[`, and
8133 (b) an image description may contain links.
8134 An image description has inline elements
8135 as its contents. When an image is rendered to HTML,
8136 this is conventionally used as the image's `alt` attribute.
8138 ```````````````````````````````` example
8139 ![foo](/url "title")
8141 <p><img src="/url" alt="foo" title="title" /></p>
8142 ````````````````````````````````
8145 ```````````````````````````````` example
8148 [foo *bar*]: train.jpg "train & tracks"
8150 <p><img src="train.jpg" alt="foo bar" title="train & tracks" /></p>
8151 ````````````````````````````````
8154 ```````````````````````````````` example
8155 ![foo ![bar](/url)](/url2)
8157 <p><img src="/url2" alt="foo bar" /></p>
8158 ````````````````````````````````
8161 ```````````````````````````````` example
8162 ![foo [bar](/url)](/url2)
8164 <p><img src="/url2" alt="foo bar" /></p>
8165 ````````````````````````````````
8168 Though this spec is concerned with parsing, not rendering, it is
8169 recommended that in rendering to HTML, only the plain string content
8170 of the [image description] be used. Note that in
8171 the above example, the alt attribute's value is `foo bar`, not `foo
8172 [bar](/url)` or `foo <a href="/url">bar</a>`. Only the plain string
8173 content is rendered, without formatting.
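
One way to obtain the plain string content is to walk the parsed inlines of the image description and keep only the literal text. The `Node` type below is a hypothetical stand-in for whatever tree a real parser produces, and `plain_text` is a name assumed for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical inline node; real parsers use their own tree types."""
    kind: str                      # e.g. "text", "emph", "link", "image"
    literal: str = ""              # text carried by leaf nodes
    children: list = field(default_factory=list)


def plain_text(nodes):
    """Concatenate the literal text of all nodes, ignoring formatting."""
    parts = []
    for node in nodes:
        parts.append(node.literal)
        parts.append(plain_text(node.children))
    return "".join(parts)
```

For a description parsed from `foo [bar](/url)`, the tree would hold a text node `foo ` followed by a link node wrapping the text `bar`, and `plain_text` returns `foo bar`.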
8175 ```````````````````````````````` example
8178 [foo *bar*]: train.jpg "train & tracks"
8180 <p><img src="train.jpg" alt="foo bar" title="train & tracks" /></p>
8181 ````````````````````````````````
8184 ```````````````````````````````` example
8185 ![foo *bar*][foobar]
8187 [FOOBAR]: train.jpg "train & tracks"
8189 <p><img src="train.jpg" alt="foo bar" title="train & tracks" /></p>
8190 ````````````````````````````````
8193 ```````````````````````````````` example
8196 <p><img src="train.jpg" alt="foo" /></p>
8197 ````````````````````````````````
8200 ```````````````````````````````` example
8201 My ![foo bar](/path/to/train.jpg "title" )
8203 <p>My <img src="/path/to/train.jpg" alt="foo bar" title="title" /></p>
8204 ````````````````````````````````
8207 ```````````````````````````````` example
8210 <p><img src="url" alt="foo" /></p>
8211 ````````````````````````````````
8214 ```````````````````````````````` example
8217 <p><img src="/url" alt="" /></p>
8218 ````````````````````````````````
8223 ```````````````````````````````` example
8228 <p><img src="/url" alt="foo" /></p>
8229 ````````````````````````````````
8232 ```````````````````````````````` example
8237 <p><img src="/url" alt="foo" /></p>
8238 ````````````````````````````````
8243 ```````````````````````````````` example
8248 <p><img src="/url" alt="foo" title="title" /></p>
8249 ````````````````````````````````
8252 ```````````````````````````````` example
8255 [*foo* bar]: /url "title"
8257 <p><img src="/url" alt="foo bar" title="title" /></p>
8258 ````````````````````````````````
8261 The labels are case-insensitive:
8263 ```````````````````````````````` example
8268 <p><img src="/url" alt="Foo" title="title" /></p>
8269 ````````````````````````````````
8272 As with reference links, [whitespace] is not allowed
8273 between the two sets of brackets:
8275 ```````````````````````````````` example
8281 <p><img src="/url" alt="foo" title="title" />
8283 ````````````````````````````````
8288 ```````````````````````````````` example
8293 <p><img src="/url" alt="foo" title="title" /></p>
8294 ````````````````````````````````
8297 ```````````````````````````````` example
8300 [*foo* bar]: /url "title"
8302 <p><img src="/url" alt="foo bar" title="title" /></p>
8303 ````````````````````````````````
8306 Note that link labels cannot contain unescaped brackets:
8308 ```````````````````````````````` example
8311 [[foo]]: /url "title"
8314 <p>[[foo]]: /url "title"</p>
8315 ````````````````````````````````
8318 The link labels are case-insensitive:
8320 ```````````````````````````````` example
8325 <p><img src="/url" alt="Foo" title="title" /></p>
8326 ````````````````````````````````
8329 If you just want bracketed text, you can backslash-escape the
8330 opening `!` and `[`:
8332 ```````````````````````````````` example
8338 ````````````````````````````````
8341 If you want a link after a literal `!`, backslash-escape the `!`:
8344 ```````````````````````````````` example
8349 <p>!<a href="/url" title="title">foo</a></p>
8350 ````````````````````````````````
8355 [Autolink](@)s are absolute URIs and email addresses inside
8356 `<` and `>`. They are parsed as links, with the URL or email address as the link label.
8359 A [URI autolink](@) consists of `<`, followed by an
8360 [absolute URI] not containing `<`, followed by `>`. It is parsed as
8361 a link to the URI, with the URI as the link's label.
8363 An [absolute URI](@),
8364 for these purposes, consists of a [scheme] followed by a colon (`:`)
8365 followed by zero or more characters other than ASCII
8366 [whitespace] and control characters, `<`, and `>`. If
8367 the URI includes these characters, they must be percent-encoded
8368 (e.g. `%20` for a space).
8370 For purposes of this spec, a [scheme](@) is any sequence
8371 of 2--32 characters beginning with an ASCII letter and followed
8372 by any combination of ASCII letters, digits, or the symbols plus
8373 ("+"), period ("."), or hyphen ("-").
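
The definitions of [scheme] and [absolute URI] translate fairly directly into a regular expression. The pattern below is a sketch, and the names `URI_AUTOLINK` and `match_uri_autolink` are assumptions introduced for illustration.

```python
import re

# "<", a scheme of 2-32 characters (a letter, then 1-31 letters, digits,
# "+", "." or "-"), a colon, any characters other than ASCII whitespace,
# control characters, "<" and ">", and finally ">".
URI_AUTOLINK = re.compile(
    r"<([A-Za-z][A-Za-z0-9+.\-]{1,31}:[^\x00-\x20\x7f<>]*)>"
)


def match_uri_autolink(text, start=0):
    """Return the URI of an autolink beginning at `start`, or None."""
    m = URI_AUTOLINK.match(text, start)
    return m.group(1) if m else None
```

The pattern deliberately accepts strings such as `localhost:5001/foo`, since the definition above does not require a registered scheme.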
8375 Here are some valid autolinks:
8377 ```````````````````````````````` example
8378 <http://foo.bar.baz>
8380 <p><a href="http://foo.bar.baz">http://foo.bar.baz</a></p>
8381 ````````````````````````````````
8384 ```````````````````````````````` example
8385 <http://foo.bar.baz/test?q=hello&id=22&boolean>
8387 <p><a href="http://foo.bar.baz/test?q=hello&id=22&boolean">http://foo.bar.baz/test?q=hello&id=22&boolean</a></p>
8388 ````````````````````````````````
8391 ```````````````````````````````` example
8392 <irc://foo.bar:2233/baz>
8394 <p><a href="irc://foo.bar:2233/baz">irc://foo.bar:2233/baz</a></p>
8395 ````````````````````````````````
8398 Uppercase is also fine:
8400 ```````````````````````````````` example
8401 <MAILTO:FOO@BAR.BAZ>
8403 <p><a href="MAILTO:FOO@BAR.BAZ">MAILTO:FOO@BAR.BAZ</a></p>
8404 ````````````````````````````````
8407 Note that many strings that count as [absolute URIs] for
8408 purposes of this spec are not valid URIs, because their
8409 schemes are not registered or because of other problems with their syntax:
8412 ```````````````````````````````` example
8415 <p><a href="a+b+c:d">a+b+c:d</a></p>
8416 ````````````````````````````````
8419 ```````````````````````````````` example
8420 <made-up-scheme://foo,bar>
8422 <p><a href="made-up-scheme://foo,bar">made-up-scheme://foo,bar</a></p>
8423 ````````````````````````````````
8426 ```````````````````````````````` example
8429 <p><a href="http://../">http://../</a></p>
8430 ````````````````````````````````
8433 ```````````````````````````````` example
8434 <localhost:5001/foo>
8436 <p><a href="localhost:5001/foo">localhost:5001/foo</a></p>
8437 ````````````````````````````````
8440 Spaces are not allowed in autolinks:
8442 ```````````````````````````````` example
8443 <http://foo.bar/baz bim>
8445 <p><http://foo.bar/baz bim></p>
8446 ````````````````````````````````
8449 Backslash-escapes do not work inside autolinks:
8451 ```````````````````````````````` example
8452 <http://example.com/\[\>
8454 <p><a href="http://example.com/%5C%5B%5C">http://example.com/\[\</a></p>
8455 ````````````````````````````````
8458 An [email autolink](@)
8459 consists of `<`, followed by an [email address],
8460 followed by `>`. The link's label is the email address,
8461 and the URL is `mailto:` followed by the email address.
8463 An [email address](@),
8464 for these purposes, is anything that matches
8465 the [non-normative regex from the HTML5
8466 spec](https://html.spec.whatwg.org/multipage/forms.html#e-mail-state-(type=email)):
8468 /^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?
8469 (?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/
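
The regex above can be used nearly verbatim in most regex engines. The Python sketch below drops the `^` and `$` anchors in favor of `fullmatch`; the names `EMAIL` and `email_autolink_href` are assumptions made for this example.

```python
import re

EMAIL = re.compile(
    r"[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+"
    r"@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?"
    r"(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*"
)


def email_autolink_href(text):
    """Return the mailto: URL for text such as "<user@host>", or None."""
    if len(text) >= 2 and text[0] == "<" and text[-1] == ">":
        address = text[1:-1]
        if EMAIL.fullmatch(address):
            return "mailto:" + address
    return None
```

A backslash is not among the characters permitted before the `@`, so an input like `<foo\+@bar.example.com>` is rejected rather than unescaped.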
8471 Examples of email autolinks:
8473 ```````````````````````````````` example
8474 <foo@bar.example.com>
8476 <p><a href="mailto:foo@bar.example.com">foo@bar.example.com</a></p>
8477 ````````````````````````````````
8480 ```````````````````````````````` example
8481 <foo+special@Bar.baz-bar0.com>
8483 <p><a href="mailto:foo+special@Bar.baz-bar0.com">foo+special@Bar.baz-bar0.com</a></p>
8484 ````````````````````````````````
8487 Backslash-escapes do not work inside email autolinks:
8489 ```````````````````````````````` example
8490 <foo\+@bar.example.com>
8492 <p><foo+@bar.example.com></p>
8493 ````````````````````````````````
8496 These are not autolinks:
8498 ```````````````````````````````` example
8502 ````````````````````````````````
8505 ```````````````````````````````` example
8508 <p>< http://foo.bar ></p>
8509 ````````````````````````````````
8512 ```````````````````````````````` example
8515 <p><m:abc></p>
8516 ````````````````````````````````
8519 ```````````````````````````````` example
8522 <p><foo.bar.baz></p>
8523 ````````````````````````````````
8526 ```````````````````````````````` example
8529 <p>http://example.com</p>
8530 ````````````````````````````````
8533 ```````````````````````````````` example
8536 <p>foo@bar.example.com</p>
8537 ````````````````````````````````
8542 Text between `<` and `>` that looks like an HTML tag is parsed as a
8543 raw HTML tag and will be rendered in HTML without escaping.
8544 Tag and attribute names are not limited to current HTML tags,
8545 so custom tags (and even, say, DocBook tags) may be used.
8547 Here is the grammar for tags:
8549 A [tag name](@) consists of an ASCII letter
8550 followed by zero or more ASCII letters, digits, or hyphens (`-`).
8553 An [attribute](@) consists of [whitespace],
8554 an [attribute name], and an optional
8555 [attribute value specification].
8557 An [attribute name](@)
8558 consists of an ASCII letter, `_`, or `:`, followed by zero or more ASCII
8559 letters, digits, `_`, `.`, `:`, or `-`. (Note: This is the XML
8560 specification restricted to ASCII. HTML5 is laxer.)
8562 An [attribute value specification](@)
8563 consists of optional [whitespace],
8564 a `=` character, optional [whitespace], and an [attribute value].
8567 An [attribute value](@)
8568 consists of an [unquoted attribute value],
8569 a [single-quoted attribute value], or a [double-quoted attribute value].
8571 An [unquoted attribute value](@)
8572 is a nonempty string of characters not
8573 including spaces, `"`, `'`, `=`, `<`, `>`, or `` ` ``.
8575 A [single-quoted attribute value](@)
8576 consists of `'`, zero or more
8577 characters not including `'`, and a final `'`.
8579 A [double-quoted attribute value](@)
8580 consists of `"`, zero or more
8581 characters not including `"`, and a final `"`.
8583 An [open tag](@) consists of a `<` character, a [tag name],
8584 zero or more [attributes], optional [whitespace], an optional `/`
8585 character, and a `>` character.
8587 A [closing tag](@) consists of the string `</`, a
8588 [tag name], optional [whitespace], and the character `>`.
8590 An [HTML comment](@) consists of `<!--` + *text* + `-->`,
8591 where *text* does not start with `>` or `->`, does not end with `-`,
8592 and does not contain `--`. (See the
8593 [HTML5 spec](http://www.w3.org/TR/html5/syntax.html#comments).)
8595 A [processing instruction](@)
8596 consists of the string `<?`, a string
8597 of characters not including the string `?>`, and the string
8600 A [declaration](@) consists of the
8601 string `<!`, a name consisting of one or more uppercase ASCII letters,
8602 [whitespace], a string of characters not including the
8603 character `>`, and the character `>`.
8605 A [CDATA section](@) consists of
8606 the string `<![CDATA[`, a string of characters not including the string
8607 `]]>`, and the string `]]>`.
8609 An [HTML tag](@) consists of an [open tag], a [closing tag],
8610 an [HTML comment], a [processing instruction], a [declaration],
8611 or a [CDATA section].
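As a rough, non-normative illustration, the grammar for open tags, closing tags, and HTML comments can be transcribed into Python regular expressions. All names here (`OPEN_TAG`, `CLOSING_TAG`, `is_html_comment`) are invented for this sketch. Two points are assumptions rather than statements of the grammar: `-` is permitted in tag names (suggested by the `responsive-image` example), and tabs and line endings are excluded from unquoted values along with spaces.

```python
import re

# Whitespace as used between attributes (assumption: space, tab, and line-ending characters).
WS = r"[ \t\n\v\f\r]"
TAG_NAME = r"[A-Za-z][A-Za-z0-9-]*"          # hyphen is an assumption, see lead-in
ATTR_NAME = r"[A-Za-z_:][A-Za-z0-9_.:-]*"
UNQUOTED = r"[^ \t\n\v\f\r\"'=<>`]+"
SINGLE_QUOTED = r"'[^']*'"
DOUBLE_QUOTED = r'"[^"]*"'
ATTR_VALUE = f"(?:{UNQUOTED}|{SINGLE_QUOTED}|{DOUBLE_QUOTED})"
ATTR_VALUE_SPEC = f"(?:{WS}*={WS}*{ATTR_VALUE})"
ATTRIBUTE = f"(?:{WS}+{ATTR_NAME}{ATTR_VALUE_SPEC}?)"

OPEN_TAG = re.compile(f"<{TAG_NAME}{ATTRIBUTE}*{WS}*/?>")
CLOSING_TAG = re.compile(f"</{TAG_NAME}{WS}*>")


def is_html_comment(s: str) -> bool:
    """Check `<!--` + text + `-->` with the restrictions on text given above."""
    if len(s) < 7 or not (s.startswith("<!--") and s.endswith("-->")):
        return False
    text = s[4:-3]
    return not (
        text.startswith(">")
        or text.startswith("->")
        or text.endswith("-")
        or "--" in text
    )
```

With these definitions, `OPEN_TAG.fullmatch('<responsive-image src="foo.jpg" />')` succeeds, while `OPEN_TAG.fullmatch("<a href='bar'title=title>")` fails because the second attribute is not preceded by whitespace.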
8613 Here are some simple open tags:
8615 ```````````````````````````````` example
8618 <p><a><bab><c2c></p>
8619 ````````````````````````````````
8624 ```````````````````````````````` example
8628 ````````````````````````````````
8631 [Whitespace] is allowed:
8633 ```````````````````````````````` example
8639 ````````````````````````````````
8644 ```````````````````````````````` example
8645 <a foo="bar" bam = 'baz <em>"</em>'
8646 _boolean zoop:33=zoop:33 />
8648 <p><a foo="bar" bam = 'baz <em>"</em>'
8649 _boolean zoop:33=zoop:33 /></p>
8650 ````````````````````````````````
8653 Custom tag names can be used:
8655 ```````````````````````````````` example
8656 Foo <responsive-image src="foo.jpg" />
8658 <p>Foo <responsive-image src="foo.jpg" /></p>
8659 ````````````````````````````````
8662 Illegal tag names, not parsed as HTML:
8664 ```````````````````````````````` example
8667 <p>&lt;33&gt; &lt;__&gt;</p>
8668 ````````````````````````````````
8671 Illegal attribute names:
8673 ```````````````````````````````` example
8676 <p>&lt;a h*#ref=&quot;hi&quot;&gt;</p>
8677 ````````````````````````````````
8680 Illegal attribute values:
8682 ```````````````````````````````` example
8683 <a href="hi'> <a href=hi'>
8685 <p>&lt;a href=&quot;hi'&gt; &lt;a href=hi'&gt;</p>
8686 ````````````````````````````````
8689 Illegal [whitespace]:
8691 ```````````````````````````````` example
8696 foo&gt;&lt;bar/ &gt;</p>
8697 ````````````````````````````````
8700 Missing [whitespace]:
8702 ```````````````````````````````` example
8703 <a href='bar'title=title>
8705 <p>&lt;a href='bar'title=title&gt;</p>
8706 ````````````````````````````````
8711 ```````````````````````````````` example
8715 ````````````````````````````````
8718 Illegal attributes in closing tag:
8720 ```````````````````````````````` example
8723 <p>&lt;/a href=&quot;foo&quot;&gt;</p>
8724 ````````````````````````````````
8729 ```````````````````````````````` example
8731 comment - with hyphen -->
8733 <p>foo <!-- this is a
8734 comment - with hyphen --></p>
8735 ````````````````````````````````
8738 ```````````````````````````````` example
8739 foo <!-- not a comment -- two hyphens -->
8741 <p>foo &lt;!-- not a comment -- two hyphens --&gt;</p>
8742 ````````````````````````````````
8747 ```````````````````````````````` example
8752 <p>foo &lt;!--&gt; foo --&gt;</p>
8753 <p>foo &lt;!-- foo---&gt;</p>
8754 ````````````````````````````````
8757 Processing instructions:
8759 ```````````````````````````````` example
8760 foo <?php echo $a; ?>
8762 <p>foo <?php echo $a; ?></p>
8763 ````````````````````````````````
8768 ```````````````````````````````` example
8769 foo <!ELEMENT br EMPTY>
8771 <p>foo <!ELEMENT br EMPTY></p>
8772 ````````````````````````````````
8777 ```````````````````````````````` example
8780 <p>foo <![CDATA[>&<]]></p>
8781 ````````````````````````````````
8784 Entity and numeric character references are preserved in HTML attributes:
8787 ```````````````````````````````` example
8788 foo <a href="&ouml;">
8790 <p>foo <a href="&ouml;"></p>
8791 ````````````````````````````````
8794 Backslash escapes do not work in HTML attributes:
8796 ```````````````````````````````` example
8799 <p>foo <a href="\*"></p>
8800 ````````````````````````````````
8803 ```````````````````````````````` example
8806 <p>&lt;a href=&quot;&quot;&quot;&gt;</p>
8807 ````````````````````````````````
8812 A line break (not in a code span or HTML tag) that is preceded
8813 by two or more spaces and does not occur at the end of a block
8814 is parsed as a [hard line break](@) (rendered
8815 in HTML as a `<br />` tag):
8817 ```````````````````````````````` example
8823 ````````````````````````````````
8826 For a more visible alternative, a backslash before the
8827 [line ending] may be used instead of two spaces:
8829 ```````````````````````````````` example
8835 ````````````````````````````````
8838 More than two spaces can be used:
8840 ```````````````````````````````` example
8846 ````````````````````````````````
8849 Leading spaces at the beginning of the next line are ignored:
8851 ```````````````````````````````` example
8857 ````````````````````````````````
8860 ```````````````````````````````` example
8866 ````````````````````````````````
8869 Line breaks can occur inside emphasis, links, and other constructs
8870 that allow inline content:
8872 ```````````````````````````````` example
8878 ````````````````````````````````
8881 ```````````````````````````````` example
8887 ````````````````````````````````
8890 Line breaks do not occur inside code spans:
8892 ```````````````````````````````` example
8896 <p><code>code span</code></p>
8897 ````````````````````````````````
8900 ```````````````````````````````` example
8904 <p><code>code\ span</code></p>
8905 ````````````````````````````````
8910 ```````````````````````````````` example
8916 ````````````````````````````````
8919 ```````````````````````````````` example
8925 ````````````````````````````````
8928 Hard line breaks are for separating inline content within a block.
8929 Neither syntax for hard line breaks works at the end of a paragraph or
8930 other block element:
8932 ```````````````````````````````` example
8936 ````````````````````````````````
8939 ```````````````````````````````` example
8943 ````````````````````````````````
8946 ```````````````````````````````` example
8950 ````````````````````````````````
8953 ```````````````````````````````` example
8957 ````````````````````````````````
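The rules in this section can be summarized in a small Python sketch that classifies the line ending following a line of paragraph text. The function name and its `ends_block` parameter are hypothetical. The sketch deliberately ignores code spans and HTML tags, and it assumes that only an unescaped backslash counts, so an even number of trailing backslashes (each pair being an escaped backslash) does not produce a hard break.

```python
def line_ending_kind(line: str, ends_block: bool = False) -> str:
    """Classify the line ending that follows `line` (given without its newline).

    Simplification: code spans and HTML tags are not taken into account.
    """
    if ends_block:
        # Neither hard-break syntax works at the end of a block.
        return "none"
    if line.endswith("  "):
        # Two or more trailing spaces.
        return "hardbreak"
    trailing_backslashes = len(line) - len(line.rstrip("\\"))
    if trailing_backslashes % 2 == 1:
        # An unescaped backslash directly before the line ending.
        return "hardbreak"
    return "softbreak"
```

For example, `line_ending_kind("foo  ")` and `line_ending_kind("foo\\")` both report a hard break, while `line_ending_kind("foo ")` reports a softbreak.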
8962 A regular line break (not in a code span or HTML tag) that is not
8963 preceded by two or more spaces or a backslash is parsed as a
8964 [softbreak](@). (A softbreak may be rendered in HTML either as a
8965 [line ending] or as a space. The result will be the same in
8966 browsers. In the examples here, a [line ending] will be used.)
8968 ```````````````````````````````` example
8974 ````````````````````````````````
8977 Spaces at the end of the line and beginning of the next line are
8980 ```````````````````````````````` example
8986 ````````````````````````````````
8989 A conforming parser may render a soft line break in HTML either as a
8990 line break or as a space.
8992 A renderer may also provide an option to render soft line breaks
8993 as hard line breaks.
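A renderer might expose this choice as an option. The following Python sketch is only an illustration: the function name and the mode names (`"newline"`, `"space"`, `"hardbreak"`) are invented here, and the function performs no HTML escaping or inline parsing.

```python
SOFTBREAK_OUTPUT = {
    "newline": "\n",          # render the softbreak as a line ending
    "space": " ",             # render the softbreak as a space
    "hardbreak": "<br />\n",  # optional mode: treat soft breaks as hard breaks
}


def render_paragraph(lines, softbreak="newline"):
    """Join paragraph lines with the chosen softbreak rendering.

    Spaces at the end of a line and the beginning of the next are dropped.
    """
    separator = SOFTBREAK_OUTPUT[softbreak]
    return "<p>" + separator.join(line.strip(" ") for line in lines) + "</p>"
```

Calling `render_paragraph(["foo ", " baz"])` gives `<p>foo\nbaz</p>` (with a real line ending), and passing `softbreak="space"` gives `<p>foo baz</p>`.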
8997 Any characters not given an interpretation by the above rules will
8998 be parsed as plain textual content.
9000 ```````````````````````````````` example
9003 <p>hello $.;'there</p>
9004 ````````````````````````````````
9007 ```````````````````````````````` example
9011 ````````````````````````````````
9014 Internal spaces are preserved verbatim:
9016 ```````````````````````````````` example
9019 <p>Multiple     spaces</p>
9020 ````````````````````````````````
9025 # Appendix: A parsing strategy
9027 In this appendix we describe some features of the parsing strategy
9028 used in the CommonMark reference implementations.
9032 Parsing has two phases:
9034 1. In the first phase, lines of input are consumed and the block
9035 structure of the document---its division into paragraphs, block quotes,
9036 list items, and so on---is constructed. Text is assigned to these
9037 blocks but not parsed. Link reference definitions are parsed and a
9038 map of links is constructed.
9040 2. In the second phase, the raw text contents of paragraphs and headings
9041 are parsed into sequences of Markdown inline elements (strings,
9042 code spans, links, emphasis, and so on), using the map of link
9043 references constructed in phase 1.
9045 At each point in processing, the document is represented as a tree of
9046 **blocks**. The root of the tree is a `document` block. The `document`
9047 may have any number of other blocks as **children**. These children
9048 may, in turn, have other blocks as children. The last child of a block
9049 is normally considered **open**, meaning that subsequent lines of input
9050 can alter its contents. (Blocks that are not open are **closed**.)
9051 Here, for example, is a possible document tree, with the open blocks
9058 "Lorem ipsum dolor\nsit amet."
9059 -> list (type=bullet tight=true bullet_char=-)
9062 "Qui *quodsi iracundia*"
9068 ## Phase 1: block structure
9070 Each line that is processed has an effect on this tree. The line is
9071 analyzed and, depending on its contents, the document may be altered
9072 in one or more of the following ways:
9074 1. One or more open blocks may be closed.
9075 2. One or more new blocks may be created as children of the
9077 3. Text may be added to the last (deepest) open block remaining
9080 Once a line has been incorporated into the tree in this way,
9081 it can be discarded, so input can be read in a stream.
9083 For each line, we follow this procedure:
9085 1. First we iterate through the open blocks, starting with the
9086 root document, and descending through last children down to the last
9087 open block. Each block imposes a condition that the line must satisfy
9088 if the block is to remain open. For example, a block quote requires a
9089 `>` character. A paragraph requires a non-blank line.
9090 In this phase we may match all or just some of the open
9091 blocks. But we cannot close unmatched blocks yet, because we may have a
9092 [lazy continuation line].
9094 2. Next, after consuming the continuation markers for existing
9095 blocks, we look for new block starts (e.g. `>` for a block quote).
9096 If we encounter a new block start, we close any blocks unmatched
9097 in step 1 before creating the new block as a child of the last
9100 3. Finally, we look at the remainder of the line (after block
9101 markers like `>`, list markers, and indentation have been consumed).
9102 This is text that can be incorporated into the last open
9103 block (a paragraph, code block, heading, or raw HTML).
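The three steps can be sketched in Python for a deliberately tiny subset of Markdown: block quotes, `-` bullet lists, and paragraphs. This is not the reference implementation. The `Block` class and all function names are invented for the sketch, list items are assumed to continue on lines indented by exactly two spaces, and headings, code blocks, raw HTML, and link reference definitions are left out. It needs Python 3.9 or later (for `str.removeprefix`).

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    kind: str  # document, block_quote, list, list_item, or paragraph
    children: list = field(default_factory=list)
    lines: list = field(default_factory=list)  # raw text (paragraphs only)
    open: bool = True


def last_open_child(block):
    """Return the last child if it is still open, else None."""
    if block.children and block.children[-1].open:
        return block.children[-1]
    return None


def close(block):
    """Close a block together with all of its descendants."""
    block.open = False
    for child in block.children:
        close(child)


def add_child(parent, kind):
    """Close whatever was left unmatched under parent, then open a new child."""
    unmatched = last_open_child(parent)
    if unmatched is not None:
        close(unmatched)
    child = Block(kind)
    parent.children.append(child)
    return child


def parse_blocks(lines):
    doc = Block("document")
    for line in lines:
        # Remember the deepest open block before this line changes anything.
        tip = doc
        while (child := last_open_child(tip)) is not None:
            tip = child
        # Step 1: descend through open blocks, checking continuation conditions.
        path, rest = [doc], line
        while (child := last_open_child(path[-1])) is not None:
            if child.kind == "paragraph":
                break  # paragraph continuation is decided in step 3
            if child.kind == "block_quote":
                stripped = rest.lstrip(" ")
                if not stripped.startswith(">"):
                    break
                rest = stripped[1:].removeprefix(" ")
            elif child.kind == "list_item":
                if rest.strip() and not rest.startswith("  "):
                    break
                rest = rest[2:]
            # A "list" imposes no condition of its own in this sketch.
            path.append(child)
        if path[-1].kind == "list" and not rest.lstrip(" ").startswith("- "):
            path.pop()  # the line does not continue the list
        container, started = path[-1], False
        # Step 2: look for new block starts; unmatched blocks get closed here.
        while True:
            stripped = rest.lstrip(" ")
            if stripped.startswith(">"):
                container = add_child(container, "block_quote")
                rest = stripped[1:].removeprefix(" ")
            elif stripped.startswith("- "):
                if container.kind != "list":
                    container = add_child(container, "list")
                container = add_child(container, "list_item")
                rest = stripped[2:]
            else:
                break
            started = True
        # Step 3: incorporate the remaining text.
        text = rest.strip()
        if not text:
            unmatched = last_open_child(container)
            if unmatched is not None:
                close(unmatched)  # a blank line ends the open paragraph
        elif not started and tip.kind == "paragraph":
            tip.lines.append(text)  # ordinary or lazy continuation
        else:
            add_child(container, "paragraph").lines.append(text)
    return doc
```

Note that unmatched blocks are only closed inside `add_child` or on a blank line, never at the end of step 1, which is what lets a lazy continuation line reach the still-open paragraph.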
9105 Setext headings are formed when we see a line of a paragraph
9106 that is a [setext heading underline].
9108 Reference link definitions are detected when a paragraph is closed;
9109 the accumulated text lines are parsed to see if they begin with
9110 one or more reference link definitions. Any remainder becomes a
9113 We can see how this works by considering how the tree above is
9114 generated by four lines of Markdown:
9119 > - Qui *quodsi iracundia*
9123 At the outset, our document model is just
9129 The first line of our text,
9135 causes a `block_quote` block to be created as a child of our
9136 open `document` block, and a `paragraph` block as a child of
9137 the `block_quote`. Then the text is added to the last open
9138 block, the `paragraph`:
9153 is a "lazy continuation" of the open `paragraph`, so it gets added
9154 to the paragraph's text:
9160 "Lorem ipsum dolor\nsit amet."
9166 > - Qui *quodsi iracundia*
9169 causes the `paragraph` block to be closed, and a new `list` block
9170 opened as a child of the `block_quote`. A `list_item` is also
9171 added as a child of the `list`, and a `paragraph` as a child of
9172 the `list_item`. The text is then added to the new `paragraph`:
9178 "Lorem ipsum dolor\nsit amet."
9179 -> list (type=bullet tight=true bullet_char=-)
9182 "Qui *quodsi iracundia*"
9191 causes the `list_item` (and its child the `paragraph`) to be closed,
9192 and a new `list_item` opened up as child of the `list`. A `paragraph`
9193 is added as a child of the new `list_item`, to contain the text.
9194 We thus obtain the final tree:
9200 "Lorem ipsum dolor\nsit amet."
9201 -> list (type=bullet tight=true bullet_char=-)
9204 "Qui *quodsi iracundia*"
9210 ## Phase 2: inline structure
9212 Once all of the input has been parsed, all open blocks are closed.
9214 We then "walk the tree," visiting every node, and parse raw
9215 string contents of paragraphs and headings as inlines. At this
9216 point we have seen all the link reference definitions, so we can
9217 resolve reference links as we go.
9223 str "Lorem ipsum dolor"
9226 list (type=bullet tight=true bullet_char=-)
9231 str "quodsi iracundia"
9237 Notice how the [line ending] in the first paragraph has
9238 been parsed as a `softbreak`, and the asterisks in the first list item
9239 have become an `emph`.
9241 ### An algorithm for parsing nested emphasis and links
9243 By far the trickiest part of inline parsing is handling emphasis,
9244 strong emphasis, links, and images. This is done using the following
9247 When we're parsing inlines and we hit either
9249 - a run of `*` or `_` characters, or
9252 we insert a text node with these symbols as its literal content, and we
9253 add a pointer to this text node to the [delimiter stack](@).
9255 The [delimiter stack] is a doubly linked list. Each
9256 element contains a pointer to a text node, plus information about
9258 - the type of delimiter (`[`, `![`, `*`, `_`)
9259 - the number of delimiters,
9260 - whether the delimiter is "active" (all are active to start), and
9261 - whether the delimiter is a potential opener, a potential closer,
9262 or both (which depends on what sort of characters precede
9263 and follow the delimiters).
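One possible shape for such a stack, sketched in Python, is shown here. The class and field names are illustrative only, and `text_node` stands in for whatever node type an implementation uses.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass(eq=False)
class Delimiter:
    text_node: Any            # the text node holding the delimiter characters
    kind: str                 # "[", "![", "*", or "_"
    count: int                # number of delimiters in the run
    can_open: bool            # potential opener
    can_close: bool           # potential closer
    active: bool = True       # all delimiters start out active
    prev: Optional["Delimiter"] = field(default=None, repr=False)
    next: Optional["Delimiter"] = field(default=None, repr=False)


class DelimiterStack:
    """A doubly linked list whose `top` is the most recently pushed entry."""

    def __init__(self):
        self.top = None

    def push(self, entry):
        entry.prev, entry.next = self.top, None
        if self.top is not None:
            self.top.next = entry
        self.top = entry

    def remove(self, entry):
        """Unlink an entry from anywhere in the stack."""
        if entry.prev is not None:
            entry.prev.next = entry.next
        if entry.next is not None:
            entry.next.prev = entry.prev
        else:
            self.top = entry.prev  # the entry was the top of the stack
        entry.prev = entry.next = None
```

The links in both directions are what make it cheap to remove an entry from the middle and to walk either toward the bottom (looking for openers) or toward the top (looking for closers).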
9265 When we hit a `]` character, we call the *look for link or image*
9266 procedure (see below).
9268 When we hit the end of the input, we call the *process emphasis*
9269 procedure (see below), with `stack_bottom` = NULL.
9271 #### *look for link or image*
9273 Starting at the top of the delimiter stack, we look backwards
9274 through the stack for an opening `[` or `![` delimiter.
9276 - If we don't find one, we return a literal text node `]`.
9278 - If we do find one, but it's not *active*, we remove the inactive
9279 delimiter from the stack, and return a literal text node `]`.
9281 - If we find one and it's active, then we parse ahead to see if
9282 we have an inline link/image, reference link/image, compact reference
9283 link/image, or shortcut reference link/image.
9285 + If we don't, then we remove the opening delimiter from the
9286 delimiter stack and return a literal text node `]`.
9290 * We return a link or image node whose children are the inlines
9291 after the text node pointed to by the opening delimiter.
9293 * We run *process emphasis* on these inlines, with the `[` opener
9296 * We remove the opening delimiter.
9298 * If we have a link (and not an image), we also set all
9299 `[` delimiters before the opening delimiter to *inactive*. (This
9300 will prevent us from getting links within links.)
9302 #### *process emphasis*
9304 Parameter `stack_bottom` sets a lower bound to how far we
9305 descend in the [delimiter stack]. If it is NULL, we can
9306 go all the way to the bottom. Otherwise, we stop before
9307 visiting `stack_bottom`.
9309 Let `current_position` point to the element on the [delimiter stack]
9310 just above `stack_bottom` (or the first element if `stack_bottom`
9313 We keep track of the `openers_bottom` for each delimiter
9314 type (`*`, `_`). Initialize this to `stack_bottom`.
9316 Then we repeat the following until we run out of potential
9319 - Move `current_position` forward in the delimiter stack (if needed)
9320 until we find the first potential closer with delimiter `*` or `_`.
9321 (This will be the potential closer closest
9322 to the beginning of the input -- the first one in parse order.)
9324 - Now, look back in the stack (staying above `stack_bottom` and
9325 the `openers_bottom` for this delimiter type) for the
9326 first matching potential opener ("matching" means same delimiter).
9330 + Figure out whether we have emphasis or strong emphasis:
9331 if both closer and opener spans have length >= 2, we have
9332 strong, otherwise regular.
9334 + Insert an emph or strong emph node accordingly, after
9335 the text node corresponding to the opener.
9337 + Remove any delimiters between the opener and closer from
9338 the delimiter stack.
9340 + Remove 1 (for regular emph) or 2 (for strong emph) delimiters
9341 from the opening and closing text nodes. If they become empty
9342 as a result, remove them and remove the corresponding element
9343 of the delimiter stack. If the closing node is removed, reset
9344 `current_position` to the next element in the stack.
9348 + Set `openers_bottom` to the element before `current_position`.
9349 (We know that there are no openers for this kind of closer up to and
9350 including this point, so this puts a lower bound on future searches.)
9352 + If the closer at `current_position` is not a potential opener,
9353 remove it from the delimiter stack (since we know it can't
9354 be a closer either).
9356 + Advance `current_position` to the next element in the stack.
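The loop above can be sketched in Python under several simplifying assumptions. The sketch works on a flat list of inline nodes (plain strings and `Delim` runs) instead of a separate linked stack, always behaves as if `stack_bottom` were NULL, and omits the `openers_bottom` bookkeeping, which only bounds the backward search. The `Delim` class, the tuple output format, and the requirement that the caller supply the `can_open` and `can_close` flags are all choices made for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Delim:
    char: str        # "*" or "_"
    count: int       # delimiters remaining in this run
    can_open: bool
    can_close: bool


def literal(node):
    """Turn a leftover delimiter run back into plain text."""
    return node.char * node.count if isinstance(node, Delim) else node


def process_emphasis(nodes):
    """Return nodes with matched runs replaced by ("emph" | "strong", children)."""
    nodes = list(nodes)
    i = 0
    while i < len(nodes):
        closer = nodes[i]
        if not (isinstance(closer, Delim) and closer.can_close):
            i += 1
            continue
        # Look back for the nearest potential opener with the same delimiter.
        j = i - 1
        while j >= 0 and not (
            isinstance(nodes[j], Delim)
            and nodes[j].can_open
            and nodes[j].char == closer.char
        ):
            j -= 1
        if j < 0:
            # No opener found: a run that cannot open either is plain text.
            if not closer.can_open:
                nodes[i] = literal(closer)
            i += 1
            continue
        opener = nodes[j]
        used = 2 if opener.count >= 2 and closer.count >= 2 else 1
        kind = "strong" if used == 2 else "emph"
        # Delimiters left between opener and closer become literal text.
        inner = [literal(n) for n in nodes[j + 1:i]]
        nodes[j + 1:i] = [(kind, inner)]
        opener.count -= used
        closer.count -= used
        i = j + 2  # the closer now sits right after the new node
        if opener.count == 0:
            del nodes[j]
            i -= 1
        if closer.count == 0:
            del nodes[i]  # i then points at the element after the closer
    return [literal(n) for n in nodes]
```

For the input `*foo **bar** baz*`, tokenized by hand into runs with the appropriate flags, the sketch returns a single `emph` node containing `"foo "`, a `strong` node around `"bar"`, and `" baz"`. Note that the `Delim` objects passed in are modified in place.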
9358 After we're done, we remove all delimiters above `stack_bottom` from the