3 author: John MacFarlane
6 license: '[CC-BY-SA 4.0](http://creativecommons.org/licenses/by-sa/4.0/)'
13 Markdown is a plain text format for writing structured documents,
14 based on conventions for indicating formatting in email
15 and usenet posts. It was developed by John Gruber (with
16 help from Aaron Swartz) and released in 2004 in the form of a
17 [syntax description](http://daringfireball.net/projects/markdown/syntax)
18 and a Perl script (`Markdown.pl`) for converting Markdown to
19 HTML. In the next decade, dozens of implementations were
20 developed in many languages. Some extended the original
21 Markdown syntax with conventions for footnotes, tables, and
22 other document elements. Some allowed Markdown documents to be
23 rendered in formats other than HTML. Websites like Reddit,
24 Stack Overflow, and GitHub had millions of people using Markdown.
25 And Markdown started to be used beyond the web, to author books,
26 articles, slide shows, letters, and lecture notes.
28 What distinguishes Markdown from many other lightweight markup
29 syntaxes, which are often easier to write, is its readability.
32 > The overriding design goal for Markdown's formatting syntax is
33 > to make it as readable as possible. The idea is that a
34 > Markdown-formatted document should be publishable as-is, as
35 > plain text, without looking like it's been marked up with tags
36 > or formatting instructions.
37 > (<http://daringfireball.net/projects/markdown/>)
39 The point can be illustrated by comparing a sample of
40 [AsciiDoc](http://www.methods.co.nz/asciidoc/) with
41 an equivalent sample of Markdown. Here is a sample of
42 AsciiDoc from the AsciiDoc manual:
47 List item one continued with a second paragraph followed by an
55 List item continued with a third paragraph.
57 2. List item two continued with an open block.
60 This paragraph is part of the preceding list item.
62 a. This list is nested and does not require explicit item
65 This paragraph is part of the preceding list item.
69 This paragraph belongs to item two of the outer list.
73 And here is the equivalent in Markdown:
77 List item one continued with a second paragraph followed by an
83 List item continued with a third paragraph.
85 2. List item two continued with an open block.
87 This paragraph is part of the preceding list item.
89 1. This list is nested and does not require explicit item continuation.
91 This paragraph is part of the preceding list item.
95 This paragraph belongs to item two of the outer list.
98 The AsciiDoc version is, arguably, easier to write. You don't need
99 to worry about indentation. But the Markdown version is much easier
100 to read. The nesting of list items is apparent to the eye in the
101 source, not just in the processed document.
103 ## Why is a spec needed?
105 John Gruber's [canonical description of Markdown's
106 syntax](http://daringfireball.net/projects/markdown/syntax)
107 does not specify the syntax unambiguously. Here are some examples of
108 questions it does not answer:
110 1. How much indentation is needed for a sublist? The spec says that
111 continuation paragraphs need to be indented four spaces, but is
112 not fully explicit about sublists. It is natural to think that
113 they, too, must be indented four spaces, but `Markdown.pl` does
114 not require that. This is hardly a "corner case," and divergences
115 between implementations on this issue often lead to surprises for
116 users in real documents. (See [this comment by John
117 Gruber](http://article.gmane.org/gmane.text.markdown.general/1997).)
119 2. Is a blank line needed before a block quote or heading?
120 Most implementations do not require the blank line. However,
121 this can lead to unexpected results in hard-wrapped text, and
122 also to ambiguities in parsing (note that some implementations
123 put the heading inside the blockquote, while others do not).
124 (John Gruber has also spoken [in favor of requiring the blank
125 lines](http://article.gmane.org/gmane.text.markdown.general/2146).)
127 3. Is a blank line needed before an indented code block?
128 (`Markdown.pl` requires it, but this is not mentioned in the
129 documentation, and some implementations do not require it.)
136 4. What is the exact rule for determining when list items get
137 wrapped in `<p>` tags? Can a list be partially "loose" and partially
138 "tight"? What should we do with a list like this?
157 (There are some relevant comments by John Gruber
158 [here](http://article.gmane.org/gmane.text.markdown.general/2554).)
160 5. Can list markers be indented? Can ordered list markers be right-aligned?
168 6. Is this one list with a thematic break in its second item,
169 or two lists separated by a thematic break?
177 7. When list markers change from numbers to bullets, do we have
178 two lists or one? (The Markdown syntax description suggests two,
179 but the Perl scripts and many other implementations produce one.)
188 8. What are the precedence rules for the markers of inline structure?
189 For example, is the following a valid link, or does the code span take precedence?
193 [a backtick (`)](/url) and [another backtick (`)](/url).
196 9. What are the precedence rules for markers of emphasis and strong
197 emphasis? For example, how should the following be parsed?
203 10. What are the precedence rules between block-level and inline-level
204 structure? For example, how should the following be parsed?
207 - `a long code span can contain a hyphen like this
208 - and it can screw things up`
211 11. Can list items include section headings? (`Markdown.pl` does not
212 allow this, but does allow blockquotes to include headings.)
218 12. Can list items be empty?
226 13. Can link references be defined inside block quotes or list items?
234 14. If there are multiple definitions for the same reference, which takes precedence?
244 In the absence of a spec, early implementers consulted `Markdown.pl`
245 to resolve these ambiguities. But `Markdown.pl` was quite buggy, and
246 gave manifestly bad results in many cases, so it was not a
247 satisfactory replacement for a spec.
249 Because there is no unambiguous spec, implementations have diverged
250 considerably. As a result, users are often surprised to find that
251 a document that renders one way on one system (say, a GitHub wiki)
252 renders differently on another (say, converting to DocBook using
253 pandoc). To make matters worse, because nothing in Markdown counts
254 as a "syntax error," the divergence often isn't discovered right away.
256 ## About this document
258 This document attempts to specify Markdown syntax unambiguously.
259 It contains many examples with side-by-side Markdown and
260 HTML. These are intended to double as conformance tests. An
261 accompanying script `spec_tests.py` can be used to run the tests
262 against any Markdown program:
264 python test/spec_tests.py --spec spec.txt --program PROGRAM
266 Since this document describes how Markdown is to be parsed into
267 an abstract syntax tree, it would have made sense to use an abstract
268 representation of the syntax tree instead of HTML. But HTML is capable
269 of representing the structural distinctions we need to make, and the
270 choice of HTML for the tests makes it possible to run the tests against
271 an implementation without writing an abstract syntax tree renderer.
273 This document is generated from a text file, `spec.txt`, written
274 in Markdown with a small extension for the side-by-side tests.
275 The script `tools/makespec.py` can be used to convert `spec.txt` into
276 HTML or CommonMark (which can then be converted into other formats).
278 In the examples, the `→` character is used to represent tabs.
282 ## Characters and lines
284 Any sequence of [characters] is a valid CommonMark document.
287 A [character](@) is a Unicode code point. Although some
288 code points (for example, combining accents) do not correspond to
289 characters in an intuitive sense, all code points count as characters
290 for purposes of this spec.
292 This spec does not specify an encoding; it treats lines as composed
293 of [characters] rather than bytes. A conforming parser may be limited
294 to a certain encoding.
296 A [line](@) is a sequence of zero or more [characters]
297 other than newline (`U+000A`) or carriage return (`U+000D`),
298 followed by a [line ending] or by the end of file.
300 A [line ending](@) is a newline (`U+000A`), a carriage return
301 (`U+000D`) not followed by a newline, or a carriage return and a following newline.
304 A line containing no characters, or a line containing only spaces
305 (`U+0020`) or tabs (`U+0009`), is called a [blank line](@).
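The definitions of line, line ending, and blank line above can be sketched in a few lines of Python. This is only an illustration, not part of the spec: the names `LINE_ENDING`, `split_lines`, and `is_blank` are hypothetical, and the sketch assumes the whole input is already available as a decoded string.

```python
import re

# A line ending is CR+LF, a lone CR, or a lone LF. Trying "\r\n" first
# keeps a CR+LF pair from being counted as two line endings.
LINE_ENDING = re.compile(r"\r\n|\r|\n")

def split_lines(text):
    """Split text into lines, dropping the line endings themselves."""
    lines = LINE_ENDING.split(text)
    # A final line ending terminates the last line; it does not start
    # an extra empty line.
    if lines and lines[-1] == "":
        lines.pop()
    return lines

def is_blank(line):
    """True for an empty line or a line of only spaces and tabs."""
    return all(ch in " \t" for ch in line)
```

Because the alternation tries `\r\n` before `\r`, mixed line endings within one document are handled uniformly.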
307 The following definitions of character classes will be used in this spec:
309 A [whitespace character](@) is a space
310 (`U+0020`), tab (`U+0009`), newline (`U+000A`), line tabulation (`U+000B`),
311 form feed (`U+000C`), or carriage return (`U+000D`).
313 [Whitespace](@) is a sequence of one or more [whitespace characters].
316 A [Unicode whitespace character](@) is
317 any code point in the Unicode `Zs` general category, or a tab (`U+0009`),
318 carriage return (`U+000D`), newline (`U+000A`), or form feed (`U+000C`).
321 [Unicode whitespace](@) is a sequence of one
322 or more [Unicode whitespace characters].
324 A [space](@) is `U+0020`.
326 A [non-whitespace character](@) is any character
327 that is not a [whitespace character].
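The two whitespace classes defined above differ in small but important ways, which a short Python sketch may make easier to see. The function names are hypothetical, and the sketch relies on the standard `unicodedata` module for general categories; each function expects a single-character string.

```python
import unicodedata

# The six code points this spec lists as whitespace characters.
WHITESPACE_CHARS = {"\u0020", "\u0009", "\u000A", "\u000B", "\u000C", "\u000D"}

# The control characters that count as Unicode whitespace in addition
# to everything in the Zs general category.
UNICODE_WHITESPACE_EXTRAS = {"\u0009", "\u000D", "\u000A", "\u000C"}

def is_whitespace_char(ch):
    return ch in WHITESPACE_CHARS

def is_unicode_whitespace_char(ch):
    return unicodedata.category(ch) == "Zs" or ch in UNICODE_WHITESPACE_EXTRAS

def is_non_whitespace_char(ch):
    return not is_whitespace_char(ch)
```

Note that, as the definitions are written, line tabulation (`U+000B`) is a whitespace character but not a Unicode whitespace character, while a no-break space (`U+00A0`) is the reverse.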
329 An [ASCII punctuation character](@)
330 is `!`, `"`, `#`, `$`, `%`, `&`, `'`, `(`, `)`,
331 `*`, `+`, `,`, `-`, `.`, `/`, `:`, `;`, `<`, `=`, `>`, `?`, `@`,
332 `[`, `\`, `]`, `^`, `_`, `` ` ``, `{`, `|`, `}`, or `~`.
334 A [punctuation character](@) is an [ASCII
335 punctuation character] or anything in
336 the general Unicode categories `Pc`, `Pd`, `Pe`, `Pf`, `Pi`, `Po`, or `Ps`.
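As a rough illustration of the two punctuation classes above, the following Python sketch tests a single character against them. The function names are hypothetical; the sketch uses `string.punctuation`, which contains the same 32 ASCII characters listed above, and `unicodedata.category` for the Unicode general categories.

```python
import string
import unicodedata

# string.punctuation is exactly the 32 ASCII punctuation characters
# enumerated in the definition above.
ASCII_PUNCTUATION = set(string.punctuation)

UNICODE_PUNCTUATION_CATEGORIES = {"Pc", "Pd", "Pe", "Pf", "Pi", "Po", "Ps"}

def is_ascii_punctuation(ch):
    return ch in ASCII_PUNCTUATION

def is_punctuation(ch):
    return (is_ascii_punctuation(ch)
            or unicodedata.category(ch) in UNICODE_PUNCTUATION_CATEGORIES)
```

The ASCII check has to come in addition to the category check: characters such as `$` and `+` are ASCII punctuation here even though their Unicode general categories are symbol categories (`Sc`, `Sm`), not punctuation categories.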
340 Tabs in lines are not expanded to [spaces]. However,
341 in contexts where whitespace helps to define block structure,
342 tabs behave as if they were replaced by spaces with a tab stop of 4 characters.
345 Thus, for example, a tab can be used instead of four spaces
346 in an indented code block. (Note, however, that internal
347 tabs are passed through as literal tabs, not expanded to spaces.)
350 ```````````````````````````````` example
353 <pre><code>foo→baz→→bim
355 ````````````````````````````````
357 ```````````````````````````````` example
360 <pre><code>foo→baz→→bim
362 ````````````````````````````````
364 ```````````````````````````````` example
371 ````````````````````````````````
373 In the following example, a continuation paragraph of a list
374 item is indented with a tab; this has exactly the same effect
375 as indentation with four spaces would:
377 ```````````````````````````````` example
388 ````````````````````````````````
390 ```````````````````````````````` example
402 ````````````````````````````````
404 Normally the `>` that begins a block quote may be followed
405 optionally by a space, which is not considered part of the
406 content. In the following case `>` is followed by a tab,
407 which is treated as if it were expanded into three spaces.
408 Since one of these spaces is considered part of the
409 delimiter, `foo` is considered to be indented six spaces
410 inside the block quote context, so we get an indented
411 code block starting with two spaces.
413 ```````````````````````````````` example
420 ````````````````````````````````
422 ```````````````````````````````` example
431 ````````````````````````````````
434 ```````````````````````````````` example
441 ````````````````````````````````
443 ```````````````````````````````` example
459 ````````````````````````````````
461 ```````````````````````````````` example
465 ````````````````````````````````
467 ```````````````````````````````` example
471 ````````````````````````````````
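The tab handling described in this section can be sketched as a column computation that never rewrites the line itself. This is a minimal illustration, assuming a tab stop of 4; the name `indent_columns` and its `start_col` parameter are hypothetical.

```python
TAB_STOP = 4

def indent_columns(line, start_col=0):
    """Measure the leading spaces and tabs of `line`.

    `start_col` is the column at which line[0] sits, which matters when
    the text follows a container marker such as `>`. Returns a pair:
    the width of the leading whitespace in columns, and the index of
    the first character that is not a space or tab.
    """
    col = start_col
    i = 0
    while i < len(line) and line[i] in " \t":
        if line[i] == "\t":
            # A tab advances to the next multiple of TAB_STOP.
            col += TAB_STOP - (col % TAB_STOP)
        else:
            col += 1
        i += 1
    return col - start_col, i
```

For the block quote case discussed above, the text after `>` starts at column 1, so `indent_columns("\t\tfoo", start_col=1)` measures 7 columns: the first tab contributes three and the second four. One column belongs to the delimiter, leaving the six columns of indentation mentioned in the text.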
474 ## Insecure characters
476 For security reasons, the Unicode character `U+0000` must be replaced
477 with the REPLACEMENT CHARACTER (`U+FFFD`).
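In Python, this replacement could be done in a single pass before any other processing. The function name is hypothetical.

```python
def replace_insecure_characters(text):
    """Replace every U+0000 with U+FFFD (REPLACEMENT CHARACTER)."""
    return text.replace("\u0000", "\uFFFD")
```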
481 We can think of a document as a sequence of
482 [blocks](@)---structural elements like paragraphs, block
483 quotations, lists, headings, rules, and code blocks. Some blocks (like
484 block quotes and list items) contain other blocks; others (like
485 headings and paragraphs) contain [inline](@) content---text,
486 links, emphasized text, images, code spans, and so on.
490 Indicators of block structure always take precedence over indicators
491 of inline structure. So, for example, the following is a list with
492 two items, not a list with one item containing a code span:
494 ```````````````````````````````` example
502 ````````````````````````````````
505 This means that parsing can proceed in two steps: first, the block
506 structure of the document can be discerned; second, text lines inside
507 paragraphs, headings, and other block constructs can be parsed for inline
508 structure. The second step requires information about link reference
509 definitions that will be available only at the end of the first
510 step. Note that the first step requires processing lines in sequence,
511 but the second can be parallelized, since the inline parsing of
512 one block element does not affect the inline parsing of any other.
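The two-step strategy can be illustrated with a deliberately tiny Python sketch. It is not a CommonMark parser: it knows only paragraphs separated by blank lines, one-line link reference definitions, and shortcut references of the form `[label]`. All names are hypothetical. What it does show is why the inline step has to wait for the block step: a reference may be used before the line that defines it.

```python
import re
from concurrent.futures import ThreadPoolExecutor

# Toy patterns, far simpler than the real syntax.
REF_DEF = re.compile(r"^\[([^\]]+)\]:\s*(\S+)\s*$")
REF_USE = re.compile(r"\[([^\]]+)\]")

def parse_blocks(lines):
    """Step 1 (sequential): find paragraphs and collect definitions."""
    blocks, refs, current = [], {}, []
    for line in lines:
        match = REF_DEF.match(line)
        if match and not current:
            # Keep the first definition of a label (a choice made
            # for this sketch).
            refs.setdefault(match.group(1).lower(), match.group(2))
        elif line.strip() == "":
            if current:
                blocks.append(" ".join(current))
                current = []
        else:
            current.append(line.strip())
    if current:
        blocks.append(" ".join(current))
    return blocks, refs

def parse_inlines(text, refs):
    """Step 2 (per block): resolve references using the full table."""
    def link(match):
        url = refs.get(match.group(1).lower())
        if url is None:
            return match.group(0)
        return '<a href="{}">{}</a>'.format(url, match.group(1))
    return REF_USE.sub(link, text)

def render(lines):
    blocks, refs = parse_blocks(lines)
    # Blocks do not affect one another, so they can be mapped in parallel.
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda block: parse_inlines(block, refs), blocks))
```

Calling `render(["See [foo].", "", "[foo]: /url"])` resolves the link even though the definition comes last, because the reference table is complete before any inline parsing starts.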
514 ## Container blocks and leaf blocks
516 We can divide blocks into two types:
517 [container block](@)s,
518 which can contain other blocks, and [leaf block](@)s, which cannot.
523 This section describes the different kinds of leaf block that make up a Markdown document.
528 A line consisting of 0-3 spaces of indentation, followed by a sequence
529 of three or more matching `-`, `_`, or `*` characters, each followed
530 optionally by any number of spaces, forms a [thematic break](@).
533 ```````````````````````````````` example
541 ````````````````````````````````
546 ```````````````````````````````` example
550 ````````````````````````````````
553 ```````````````````````````````` example
557 ````````````````````````````````
560 Not enough characters:
562 ```````````````````````````````` example
570 ````````````````````````````````
573 One to three spaces of indentation are allowed:
575 ```````````````````````````````` example
583 ````````````````````````````````
586 Four spaces is too many:
588 ```````````````````````````````` example
593 ````````````````````````````````
596 ```````````````````````````````` example
602 ````````````````````````````````
605 More than three characters may be used:
607 ```````````````````````````````` example
608 _____________________________________
611 ````````````````````````````````
614 Spaces are allowed between the characters:
616 ```````````````````````````````` example
620 ````````````````````````````````
623 ```````````````````````````````` example
627 ````````````````````````````````
630 ```````````````````````````````` example
634 ````````````````````````````````
637 Spaces are allowed at the end:
639 ```````````````````````````````` example
643 ````````````````````````````````
646 However, no other characters may occur in the line:
648 ```````````````````````````````` example
658 ````````````````````````````````
661 It is required that all of the [non-whitespace characters] be the same.
662 So, this is not a thematic break:
664 ```````````````````````````````` example
668 ````````````````````````````````
671 Thematic breaks do not need blank lines before or after:
673 ```````````````````````````````` example
685 ````````````````````````````````
688 Thematic breaks can interrupt a paragraph:
690 ```````````````````````````````` example
698 ````````````````````````````````
701 If a line of dashes that meets the above conditions for being a
702 thematic break could also be interpreted as the underline of a [setext
703 heading], the interpretation as a
704 [setext heading] takes precedence. Thus, for example,
705 this is a setext heading, not a paragraph followed by a thematic break:
707 ```````````````````````````````` example
714 ````````````````````````````````
717 When both a thematic break and a list item are possible
718 interpretations of a line, the thematic break takes precedence:
720 ```````````````````````````````` example
732 ````````````````````````````````
735 If you want a thematic break in a list item, use a different bullet:
737 ```````````````````````````````` example
747 ````````````````````````````````
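The line-level rule for thematic breaks given in this section can be approximated with one regular expression. The sketch below is an illustration only, with hypothetical names. It checks the shape of a single line (without its line ending) and leaves out the context-dependent precedence rules, such as the preference for a setext heading underline after paragraph text.

```python
import re

# 0-3 spaces, then one of - _ *, then at least two more of the same
# character; each may be followed by any number of spaces.
THEMATIC_BREAK = re.compile(r" {0,3}([-_*]) *(?:\1 *){2,}")

def is_thematic_break(line):
    return THEMATIC_BREAK.fullmatch(line) is not None
```

The backreference `\1` is what enforces that all of the non-whitespace characters are the same.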
753 An [ATX heading](@) consists of a string of characters, parsed as inline content, between an
754 opening sequence of 1--6 unescaped `#` characters and an optional
755 closing sequence of any number of unescaped `#` characters.
756 The opening sequence of `#` characters must be followed by a
757 [space] or by the end of line. The optional closing sequence of `#`s must be
758 preceded by a [space] and may be followed by spaces only. The opening
759 `#` character may be indented 0-3 spaces. The raw contents of the
760 heading are stripped of leading and trailing spaces before being parsed
761 as inline content. The heading level is equal to the number of `#`
762 characters in the opening sequence.
766 ```````````````````````````````` example
780 ````````````````````````````````
783 More than six `#` characters is not a heading:
785 ```````````````````````````````` example
789 ````````````````````````````````
792 At least one space is required between the `#` characters and the
793 heading's contents, unless the heading is empty. Note that many
794 implementations currently do not require the space. However, the
795 space was required by the
796 [original ATX implementation](http://www.aaronsw.com/2002/atx/atx.py),
797 and it helps prevent things like the following from being parsed as headings:
800 ```````````````````````````````` example
807 ````````````````````````````````
810 This is not a heading, because the first `#` is escaped:
812 ```````````````````````````````` example
816 ````````````````````````````````
819 Contents are parsed as inlines:
821 ```````````````````````````````` example
824 <h1>foo <em>bar</em> *baz*</h1>
825 ````````````````````````````````
828 Leading and trailing blanks are ignored in parsing inline content:
830 ```````````````````````````````` example
834 ````````````````````````````````
837 One to three spaces of indentation are allowed:
839 ```````````````````````````````` example
847 ````````````````````````````````
850 Four spaces are too much:
852 ```````````````````````````````` example
857 ````````````````````````````````
860 ```````````````````````````````` example
866 ````````````````````````````````
869 A closing sequence of `#` characters is optional:
871 ```````````````````````````````` example
877 ````````````````````````````````
880 It need not be the same length as the opening sequence:
882 ```````````````````````````````` example
883 # foo ##################################
888 ````````````````````````````````
891 Spaces are allowed after the closing sequence:
893 ```````````````````````````````` example
897 ````````````````````````````````
900 A sequence of `#` characters with anything but [spaces] following it
901 is not a closing sequence, but counts as part of the contents of the heading:
904 ```````````````````````````````` example
908 ````````````````````````````````
911 The closing sequence must be preceded by a space:
913 ```````````````````````````````` example
917 ````````````````````````````````
920 Backslash-escaped `#` characters do not count as part
921 of the closing sequence:
923 ```````````````````````````````` example
931 ````````````````````````````````
934 ATX headings need not be separated from surrounding content by blank
935 lines, and they can interrupt paragraphs:
937 ```````````````````````````````` example
945 ````````````````````````````````
948 ```````````````````````````````` example
956 ````````````````````````````````
959 ATX headings can be empty:
961 ```````````````````````````````` example
969 ````````````````````````````````
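The ATX heading rules in this section can be sketched as a small Python function that takes one line (without its line ending) and returns the heading level and raw content. The names are hypothetical, and the raw content would still need to be parsed as inline content, which is where backslash escapes are resolved.

```python
import re

# 0-3 spaces, 1-6 '#' characters, then either end of line or at least
# one space followed by the rest of the line.
ATX_OPENING = re.compile(r" {0,3}(#{1,6})(?: +(.*))?")

# An optional closing sequence: a run of '#' at the end that is either
# the whole content or preceded by at least one space.
ATX_CLOSING = re.compile(r"(?:^| +)#+$")

def parse_atx_heading(line):
    """Return (level, raw_content), or None if not an ATX heading."""
    match = ATX_OPENING.fullmatch(line)
    if match is None:
        return None
    level = len(match.group(1))
    content = (match.group(2) or "").strip(" ")
    content = ATX_CLOSING.sub("", content)
    return level, content
```

A backslash-escaped `#` at the end is left alone because the closing pattern requires the run of `#` characters to be preceded by a space or to make up the entire content.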
974 A [setext heading](@) consists of one or more
975 lines of text, each containing at least one [non-whitespace
976 character], with no more than 3 spaces indentation, followed by
977 a [setext heading underline]. The lines of text must be such
978 that, were they not followed by the setext heading underline,
979 they would be interpreted as a paragraph: they cannot be
980 interpretable as a [code fence], [ATX heading][ATX headings],
981 [block quote][block quotes], [thematic break][thematic breaks],
982 [list item][list items], or [HTML block][HTML blocks].
984 A [setext heading underline](@) is a sequence of
985 `=` characters or a sequence of `-` characters, with no more than 3
986 spaces indentation and any number of trailing spaces. If a line
987 containing a single `-` can be interpreted as an
988 empty [list item][list items], it should be interpreted this way
989 and not as a [setext heading underline].
991 The heading is a level 1 heading if `=` characters are used in
992 the [setext heading underline], and a level 2 heading if `-`
993 characters are used. The contents of the heading are the result
994 of parsing the preceding lines of text as CommonMark inline content.
997 In general, a setext heading need not be preceded or followed by a
998 blank line. However, it cannot interrupt a paragraph, so when a
999 setext heading comes after a paragraph, a blank line is needed between them.
1004 ```````````````````````````````` example
1011 <h1>Foo <em>bar</em></h1>
1012 <h2>Foo <em>bar</em></h2>
1013 ````````````````````````````````
1016 The content of the heading may span more than one line:
1018 ```````````````````````````````` example
1025 ````````````````````````````````
1028 The underlining can be any length:
1030 ```````````````````````````````` example
1032 -------------------------
1039 ````````````````````````````````
1042 The heading content can be indented up to three spaces, and need
1043 not line up with the underlining:
1045 ```````````````````````````````` example
1058 ````````````````````````````````
1061 Four spaces indent is too much:
1063 ```````````````````````````````` example
1076 ````````````````````````````````
1079 The setext heading underline can be indented up to three spaces, and
1080 may have trailing spaces:
1082 ```````````````````````````````` example
1087 ````````````````````````````````
1090 Four spaces is too much:
1092 ```````````````````````````````` example
1098 ````````````````````````````````
1101 The setext heading underline cannot contain internal spaces:
1103 ```````````````````````````````` example
1114 ````````````````````````````````
1117 Trailing spaces in the content line do not cause a line break:
1119 ```````````````````````````````` example
1124 ````````````````````````````````
1127 Nor does a backslash at the end:
1129 ```````````````````````````````` example
1134 ````````````````````````````````
1137 Since indicators of block structure take precedence over
1138 indicators of inline structure, the following are setext headings:
1140 ```````````````````````````````` example
1151 <h2><a title="a lot</h2>
1152 <p>of dashes"/></p>
1153 ````````````````````````````````
1156 The setext heading underline cannot be a [lazy continuation
1157 line] in a list item or block quote:
1159 ```````````````````````````````` example
1167 ````````````````````````````````
1170 ```````````````````````````````` example
1180 ````````````````````````````````
1183 ```````````````````````````````` example
1191 ````````````````````````````````
1194 A blank line is needed between a paragraph and a following
1195 setext heading, since otherwise the paragraph becomes part
1196 of the heading's content:
1198 ```````````````````````````````` example
1205 ````````````````````````````````
1208 But in general a blank line is not required before or after
1211 ```````````````````````````````` example
1223 ````````````````````````````````
1226 Setext headings cannot be empty:
1228 ```````````````````````````````` example
1233 ````````````````````````````````
1236 Setext heading text lines must not be interpretable as block
1237 constructs other than paragraphs. So, the line of dashes
1238 in these examples gets interpreted as a thematic break:
1240 ```````````````````````````````` example
1246 ````````````````````````````````
1249 ```````````````````````````````` example
1257 ````````````````````````````````
1260 ```````````````````````````````` example
1267 ````````````````````````````````
1270 ```````````````````````````````` example
1278 ````````````````````````````````
1281 If you want a heading with `> foo` as its literal text, you can
1282 use backslash escapes:
1284 ```````````````````````````````` example
1289 ````````````````````````````````
1292 **Compatibility note:** Most existing Markdown implementations
1293 do not allow the text of setext headings to span multiple lines.
1294 But there is no consensus about how to interpret the four consecutive lines `Foo`, `bar`, `---`, and `baz`.
1303 One can find four different interpretations:
1305 1. paragraph "Foo", heading "bar", paragraph "baz"
1306 2. paragraph "Foo bar", thematic break, paragraph "baz"
1307 3. paragraph "Foo bar --- baz"
1308 4. heading "Foo bar", paragraph "baz"
1310 We find interpretation 4 most natural, and it also
1311 increases the expressive power of CommonMark, by allowing
1312 multiline headings. Authors who want interpretation 1 can
1313 put a blank line after the first paragraph:
1315 ```````````````````````````````` example
1325 ````````````````````````````````
1328 Authors who want interpretation 2 can put blank lines around the thematic break,
1331 ```````````````````````````````` example
1343 ````````````````````````````````
1346 or use a thematic break that cannot count as a [setext heading underline], such as
1349 ```````````````````````````````` example
1359 ````````````````````````````````
1362 Authors who want interpretation 3 can use backslash escapes:
1364 ```````````````````````````````` example
1374 ````````````````````````````````
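The shape of a setext heading underline, and the heading level it implies, can be checked with a short Python sketch. The names are hypothetical. The sketch looks at one line in isolation; whether that line really ends a heading depends on context described in this section (the preceding lines must form paragraph text, and a lone `-` may instead be an empty list item).

```python
import re

# 0-3 spaces, a run of '=' or a run of '-', then optional trailing spaces.
SETEXT_UNDERLINE = re.compile(r" {0,3}(=+|-+) *")

def setext_underline_level(line):
    """Return 1 for '=', 2 for '-', or None if not an underline."""
    match = SETEXT_UNDERLINE.fullmatch(line)
    if match is None:
        return None
    return 1 if match.group(1)[0] == "=" else 2
```

Because the run must consist of a single character with no internal spaces, a line such as `--- -` fails this check, although it can still qualify as a thematic break.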
1377 ## Indented code blocks
1379 An [indented code block](@) is composed of one or more
1380 [indented chunks] separated by blank lines.
1381 An [indented chunk](@) is a sequence of non-blank lines,
1382 each indented four or more spaces. The contents of the code block are
1383 the literal contents of the lines, including trailing
1384 [line endings], minus four spaces of indentation.
1385 An indented code block has no [info string].
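As a rough sketch of how the content of an indented code block is derived, the Python function below removes four spaces of indentation from each line. The name is hypothetical, and the sketch makes simplifying assumptions: the lines passed in already form one code block (indented chunks plus the blank lines between them), blank lines before and after the block have been dropped, and indentation uses spaces only.

```python
def indented_code_content(lines):
    """Return the literal content of an indented code block."""
    content = []
    for line in lines:
        if line.strip(" ") != "" and not line.startswith("    "):
            raise ValueError("non-blank line indented fewer than four spaces")
        # Drop four spaces of indentation. On a blank line this keeps
        # only the spaces beyond the fourth, if there are any.
        content.append(line[4:] + "\n")
    return "".join(content)
```

Spaces beyond the first four are kept as part of the content, on blank lines as well as on non-blank ones.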
1387 An indented code block cannot interrupt a paragraph, so there must be
1388 a blank line between a paragraph and a following indented code block.
1389 (A blank line is not needed, however, between a code block and a following paragraph.)
1392 ```````````````````````````````` example
1399 ````````````````````````````````
1402 If there is any ambiguity between an interpretation of indentation
1403 as a code block and as indicating that material belongs to a [list
1404 item][list items], the list item interpretation takes precedence:
1406 ```````````````````````````````` example
1417 ````````````````````````````````
1420 ```````````````````````````````` example
1433 ````````````````````````````````
1437 The contents of a code block are literal text, and do not get parsed as Markdown:
1440 ```````````````````````````````` example
1446 <pre><code><a/>
1451 ````````````````````````````````
1454 Here we have three chunks separated by blank lines:
1456 ```````````````````````````````` example
1473 ````````````````````````````````
1476 Any initial spaces beyond four will be included in the content, even
1477 in interior blank lines:
1479 ```````````````````````````````` example
1488 ````````````````````````````````
1491 An indented code block cannot interrupt a paragraph. (This
1492 allows hanging indents and the like.)
1494 ```````````````````````````````` example
1501 ````````````````````````````````
1504 However, any non-blank line with fewer than four leading spaces ends
1505 the code block immediately. So a paragraph may occur immediately
1506 after indented code:
1508 ```````````````````````````````` example
1515 ````````````````````````````````
1518 And indented code can occur immediately before and after other kinds of blocks:
1521 ```````````````````````````````` example
1536 ````````````````````````````````
1539 The first line can be indented more than four spaces:
1541 ```````````````````````````````` example
1548 ````````````````````````````````
1551 Blank lines preceding or following an indented code block
1552 are not included in it:
1554 ```````````````````````````````` example
1563 ````````````````````````````````
1566 Trailing spaces are included in the code block's content:
1568 ```````````````````````````````` example
1573 ````````````````````````````````
1577 ## Fenced code blocks
1579 A [code fence](@) is a sequence
1580 of at least three consecutive backtick characters (`` ` ``) or
1581 tildes (`~`). (Tildes and backticks cannot be mixed.)
1582 A [fenced code block](@)
1583 begins with a code fence, indented no more than three spaces.
1585 The line with the opening code fence may optionally contain some text
1586 following the code fence; this is trimmed of leading and trailing
1587 spaces and called the [info string](@).
1588 The [info string] may not contain any backtick
1589 characters. (The reason for this restriction is that otherwise
1590 some inline code would be incorrectly interpreted as the
1591 beginning of a fenced code block.)
1593 The content of the code block consists of all subsequent lines, until
1594 a closing [code fence] of the same type as the code block
1595 began with (backticks or tildes), and with at least as many backticks
1596 or tildes as the opening code fence. If the leading code fence is
1597 indented N spaces, then up to N spaces of indentation are removed from
1598 each line of the content (if present). (If a content line is not
1599 indented, it is preserved unchanged. If it is indented less than N
1600 spaces, all of the indentation is removed.)
1602 The closing code fence may be indented up to three spaces, and may be
1603 followed only by spaces, which are ignored. If the end of the
1604 containing block (or document) is reached and no closing code fence
1605 has been found, the code block contains all of the lines after the
1606 opening code fence until the end of the containing block (or
1607 document). (An alternative spec would require backtracking in the
1608 event that a closing code fence is not found. But this makes parsing
1609 much less efficient, and there seems to be no real down side to the
1610 behavior described here.)
1612 A fenced code block may interrupt a paragraph, and does not require
1613 a blank line either before or after.
1615 The content of a fenced code block is treated as literal text, not parsed
1616 as inlines. The first word of the [info string] is typically used to
1617 specify the language of the code sample, and rendered in the `class`
1618 attribute of the `code` tag. However, this spec does not mandate any
1619 particular treatment of the [info string].
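The fenced code block rules above can be sketched as a function that scans forward from an opening fence. This is only an illustration with hypothetical names. It works on a list of lines with line endings removed, treats the end of the list as the end of the containing block, and does not handle tabs.

```python
import re

# Opening fence: 0-3 spaces, then 3+ backticks or 3+ tildes, then the
# rest of the line, from which the info string is taken.
OPEN_FENCE = re.compile(r"( {0,3})(`{3,}|~{3,})(.*)")

def is_closing_fence(line, fence_char, min_len):
    body = line.rstrip(" ")
    indent = len(body) - len(body.lstrip(" "))
    run = body[indent:]
    return indent <= 3 and len(run) >= min_len and set(run) == {fence_char}

def parse_fenced_block(lines, start=0):
    """Return (info_string, content_lines, next_index), or None."""
    match = OPEN_FENCE.fullmatch(lines[start])
    if match is None:
        return None
    indent, fence = len(match.group(1)), match.group(2)
    info = match.group(3).strip(" ")
    if "`" in info:
        return None  # the info string may not contain backticks
    content = []
    i = start + 1
    while i < len(lines):
        line = lines[i]
        if is_closing_fence(line, fence[0], len(fence)):
            return info, content, i + 1
        # Remove up to `indent` leading spaces from each content line.
        leading = len(line) - len(line.lstrip(" "))
        content.append(line[min(indent, leading):])
        i += 1
    # No closing fence: the block runs to the end of the input.
    return info, content, i
```

Because an unclosed block simply runs to the end of the input, the scan never has to backtrack.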
1621 Here is a simple example with backticks:
1623 ```````````````````````````````` example
1632 ````````````````````````````````
1637 ```````````````````````````````` example
1646 ````````````````````````````````
1649 The closing code fence must use the same character as the opening fence:
1652 ```````````````````````````````` example
1661 ````````````````````````````````
1664 ```````````````````````````````` example
1673 ````````````````````````````````
1676 The closing code fence must be at least as long as the opening fence:
1678 ```````````````````````````````` example
1687 ````````````````````````````````
1690 ```````````````````````````````` example
1699 ````````````````````````````````
1702 Unclosed code blocks are closed by the end of the document
1703 (or the enclosing [block quote][block quotes] or [list item][list items]):
1705 ```````````````````````````````` example
1708 <pre><code></code></pre>
1709 ````````````````````````````````
1712 ```````````````````````````````` example
1722 ````````````````````````````````
1725 ```````````````````````````````` example
1736 ````````````````````````````````
1739 A code block can have all empty lines as its content:
1741 ```````````````````````````````` example
1750 ````````````````````````````````
1753 A code block can be empty:
1755 ```````````````````````````````` example
1759 <pre><code></code></pre>
1760 ````````````````````````````````
1763 Fences can be indented. If the opening fence is indented,
1764 content lines will have equivalent opening indentation removed,
1767 ```````````````````````````````` example
1776 ````````````````````````````````
1779 ```````````````````````````````` example
1790 ````````````````````````````````
1793 ```````````````````````````````` example
1804 ````````````````````````````````
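
The indentation rule can be sketched in Python. This is only an illustration, not part of the spec: the name `strip_fence_indent` and its parameters are hypothetical, and the sketch assumes the indentation consists of spaces only. It removes at most as many leading spaces from a content line as the opening fence had, and leaves less-indented lines untouched.

```python
def strip_fence_indent(content_line, fence_indent):
    """Remove up to `fence_indent` leading spaces from a content line.

    `fence_indent` is the number of spaces (0-3) before the opening fence.
    """
    removable = 0
    while (
        removable < fence_indent
        and removable < len(content_line)
        and content_line[removable] == " "
    ):
        removable += 1
    return content_line[removable:]
```

For instance, with an opening fence indented two spaces, a content line indented three spaces keeps one space, while an unindented content line is returned unchanged.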
1807 Four spaces of indentation produce an indented code block:
1809 ```````````````````````````````` example
1818 ````````````````````````````````
1821 Closing fences may be indented by 0-3 spaces, and their indentation
1822 need not match that of the opening fence:
1824 ```````````````````````````````` example
1831 ````````````````````````````````
1834 ```````````````````````````````` example
1841 ````````````````````````````````
1844 This is not a closing fence, because it is indented 4 spaces:
1846 ```````````````````````````````` example
1854 ````````````````````````````````
1858 Code fences (opening and closing) cannot contain internal spaces:
1860 ```````````````````````````````` example
1866 ````````````````````````````````
1869 ```````````````````````````````` example
1877 ````````````````````````````````
1880 Fenced code blocks can interrupt paragraphs, and can be followed
1881 directly by paragraphs, without a blank line between:
1883 ```````````````````````````````` example
1894 ````````````````````````````````
1897 Other blocks can also occur before and after fenced code blocks
1898 without an intervening blank line:
1900 ```````````````````````````````` example
1912 ````````````````````````````````
1915 An [info string] can be provided after the opening code fence.
1916 Opening and closing spaces will be stripped, and the first word, prefixed
1917 with `language-`, is used as the value for the `class` attribute of the
1918 `code` element within the enclosing `pre` element.
1920 ```````````````````````````````` example
1927 <pre><code class="language-ruby">def foo(x)
1931 ````````````````````````````````
1934 ```````````````````````````````` example
1935 ~~~~ ruby startline=3 $%@#$
1941 <pre><code class="language-ruby">def foo(x)
1945 ````````````````````````````````
1948 ```````````````````````````````` example
1952 <pre><code class="language-;"></code></pre>
1953 ````````````````````````````````
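
The treatment of the info string described above can be sketched as follows. This is a simplified illustration under stated assumptions: the function name `code_class` is hypothetical, and the sketch ignores any escape or entity processing that a full implementation might apply to the info string. It strips surrounding whitespace, takes the first word, and prefixes it with `language-`.

```python
def code_class(info_string):
    """Return the `class` value for the `code` element, or None if the
    info string is empty after stripping."""
    words = info_string.strip().split()
    if not words:
        return None
    # Only the first word is used; the rest of the info string is ignored.
    return "language-" + words[0]
```

An info string with no words yields no `class` attribute at all in this sketch.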
1956 [Info strings] for backtick code blocks cannot contain backticks:
1958 ```````````````````````````````` example
1964 ````````````````````````````````
1967 Closing code fences cannot have [info strings]:
1969 ```````````````````````````````` example
1976 ````````````````````````````````
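
The closing-fence rules illustrated in this section (same character as the opening fence, at least as long, indented 0-3 spaces, no internal spaces, no info string) can be collected into one check. The following Python sketch is an illustration only: `is_closing_fence` and its parameters are hypothetical names, and allowing trailing spaces after the fence is an assumption not stated in the text above.

```python
import re

# 0-3 spaces of indentation, a run of one fence character,
# then (assumption) only trailing spaces.
_FENCE_LINE = re.compile(r" {0,3}(`+|~+) *")


def is_closing_fence(line, fence_char, fence_len):
    """Check whether `line` closes a block opened by `fence_len`
    repetitions of `fence_char` (a backtick or a tilde)."""
    match = _FENCE_LINE.fullmatch(line)
    if match is None:
        return False
    run = match.group(1)
    return run[0] == fence_char and len(run) >= fence_len
```

The line passed in is assumed to have its line ending already removed.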
1982 An [HTML block](@) is a group of lines that is treated
1983 as raw HTML (and will not be escaped in HTML output).
1985 There are seven kinds of [HTML block], which can be defined
1986 by their start and end conditions. The block begins with a line that
1987 meets a [start condition](@) (after up to three spaces of
1988 optional indentation). It ends with the first subsequent line that
1989 meets a matching [end condition](@), or the last line of
1990 the document (or other [container block]), if no line is encountered that meets the
1991 [end condition]. If the first line meets both the [start condition]
1992 and the [end condition], the block will contain just that line.
1994 1. **Start condition:** line begins with the string `<script`,
1995 `<pre`, or `<style` (case-insensitive), followed by whitespace,
1996 the string `>`, or the end of the line.\
1997 **End condition:** line contains an end tag
1998 `</script>`, `</pre>`, or `</style>` (case-insensitive; it
1999 need not match the start tag).
2001 2. **Start condition:** line begins with the string `<!--`.\
2002 **End condition:** line contains the string `-->`.
2004 3. **Start condition:** line begins with the string `<?`.\
2005 **End condition:** line contains the string `?>`.
2007 4. **Start condition:** line begins with the string `<!`
2008 followed by an uppercase ASCII letter.\
2009 **End condition:** line contains the character `>`.
2011 5. **Start condition:** line begins with the string
2013 **End condition:** line contains the string `]]>`.
2015 6. **Start condition:** line begins with the string `<` or `</`
2016 followed by one of the strings (case-insensitive) `address`,
2017 `article`, `aside`, `base`, `basefont`, `blockquote`, `body`,
2018 `caption`, `center`, `col`, `colgroup`, `dd`, `details`, `dialog`,
2019 `dir`, `div`, `dl`, `dt`, `fieldset`, `figcaption`, `figure`,
2020 `footer`, `form`, `frame`, `frameset`,
2021 `h1`, `h2`, `h3`, `h4`, `h5`, `h6`, `head`, `header`, `hr`,
2022 `html`, `iframe`, `legend`, `li`, `link`, `main`, `menu`, `menuitem`,
2023 `meta`, `nav`, `noframes`, `ol`, `optgroup`, `option`, `p`, `param`,
2024 `section`, `source`, `summary`, `table`, `tbody`, `td`,
2025 `tfoot`, `th`, `thead`, `title`, `tr`, `track`, `ul`, followed
2026 by [whitespace], the end of the line, the string `>`, or
2028 **End condition:** line is followed by a [blank line].
2030 7. **Start condition:** line begins with a complete [open tag]
2031 or [closing tag] (with any [tag name] other than `script`,
2032 `style`, or `pre`) followed only by [whitespace]
2033 or the end of the line.\
2034 **End condition:** line is followed by a [blank line].
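
The seven end conditions above can be sketched as a small lookup. This Python fragment is an illustration, not a reference implementation: the name `html_block_end` is hypothetical, the block kind (1-7) is assumed to have been determined already from the start condition, and enclosing containers are ignored, so "end of document" simply means the last line given.

```python
# Substrings whose presence on a line ends a block of kinds 1-5.
END_MARKERS = {
    1: ("</script>", "</pre>", "</style>"),
    2: ("-->",),
    3: ("?>",),
    4: (">",),
    5: ("]]>",),
}


def html_block_end(lines, start, kind):
    """Return the index of the last line of the HTML block that begins
    at `lines[start]` and has the given kind (1-7)."""
    if kind in END_MARKERS:
        for i in range(start, len(lines)):
            # Lowercasing gives the case-insensitive match needed for
            # kind 1 and is harmless for the other markers.
            lowered = lines[i].lower()
            if any(marker in lowered for marker in END_MARKERS[kind]):
                return i
        return len(lines) - 1
    # Kinds 6 and 7 end at a line that is followed by a blank line.
    for i in range(start, len(lines)):
        if i + 1 < len(lines) and lines[i + 1].strip(" \t") == "":
            return i
    return len(lines) - 1
```

Because the search starts at the first line itself, a line that meets both the start and the end condition yields a one-line block.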
2036 All types of [HTML blocks] except type 7 may interrupt
2037 a paragraph. Blocks of type 7 may not interrupt a paragraph.
2038 (This restriction is intended to prevent unwanted interpretation
2039 of long tags inside a wrapped paragraph as starting HTML blocks.)
2041 Some simple examples follow. Here are some basic HTML blocks
2044 ```````````````````````````````` example
2063 ````````````````````````````````
2066 ```````````````````````````````` example
2074 ````````````````````````````````
2077 A block can also start with a closing tag:
2079 ```````````````````````````````` example
2085 ````````````````````````````````
2088 Here we have two HTML blocks with a Markdown paragraph between them:
2090 ```````````````````````````````` example
2098 <p><em>Markdown</em></p>
2100 ````````````````````````````````
2103 The tag on the first line can be partial, as long
2104 as it is split where there would be whitespace:
2106 ```````````````````````````````` example
2114 ````````````````````````````````
2117 ```````````````````````````````` example
2118 <div id="foo" class="bar
2122 <div id="foo" class="bar
2125 ````````````````````````````````
2128 An open tag need not be closed:
2129 ```````````````````````````````` example
2138 ````````````````````````````````
2142 A partial tag need not even be completed (garbage
2145 ```````````````````````````````` example
2151 ````````````````````````````````
2154 ```````````````````````````````` example
2160 ````````````````````````````````
2163 The initial tag doesn't even need to be a valid
2164 tag, as long as it starts like one:
2166 ```````````````````````````````` example
2172 ````````````````````````````````
2175 In type 6 blocks, the initial tag need not be on a line by
2178 ```````````````````````````````` example
2179 <div><a href="bar">*foo*</a></div>
2181 <div><a href="bar">*foo*</a></div>
2182 ````````````````````````````````
2185 ```````````````````````````````` example
2193 ````````````````````````````````
2196 Everything until the next blank line or end of document
2197 gets included in the HTML block. So, in the following
2198 example, what looks like a Markdown code block
2199 is actually part of the HTML block, which continues until a blank
2200 line or the end of the document is reached:
2202 ```````````````````````````````` example
2212 ````````````````````````````````
2215 To start an [HTML block] with a tag that is *not* in the
2216 list of block-level tags in (6), you must put the tag by
2217 itself on the first line (and it must be complete):
2219 ```````````````````````````````` example
2227 ````````````````````````````````
2230 In type 7 blocks, the [tag name] can be anything:
2232 ```````````````````````````````` example
2240 ````````````````````````````````
2243 ```````````````````````````````` example
2251 ````````````````````````````````
2254 ```````````````````````````````` example
2260 ````````````````````````````````
2263 These rules are designed to allow us to work with tags that
2264 can function as either block-level or inline-level tags.
2265 The `<del>` tag is a nice example. We can surround content with
2266 `<del>` tags in three different ways. In this case, we get a raw
2267 HTML block, because the `<del>` tag is on a line by itself:
2269 ```````````````````````````````` example
2277 ````````````````````````````````
2280 In this case, we get a raw HTML block that just includes
2281 the `<del>` tag (because it ends with the following blank
2282 line). So the contents get interpreted as CommonMark:
2284 ```````````````````````````````` example
2294 ````````````````````````````````
2297 Finally, in this case, the `<del>` tags are interpreted
2298 as [raw HTML] *inside* the CommonMark paragraph. (Because
2299 the tag is not on a line by itself, we get inline HTML
2300 rather than an [HTML block].)
2302 ```````````````````````````````` example
2305 <p><del><em>foo</em></del></p>
2306 ````````````````````````````````
2309 HTML tags designed to contain literal content
2310 (`script`, `style`, `pre`), comments, processing instructions,
2311 and declarations are treated somewhat differently.
2312 Instead of ending at the first blank line, these blocks
2313 end at the first line containing a corresponding end tag.
2314 As a result, these blocks can contain blank lines:
2318 ```````````````````````````````` example
2319 <pre language="haskell"><code>
2320 import Text.HTML.TagSoup
2323 main = print $ parseTags tags
2327 <pre language="haskell"><code>
2328 import Text.HTML.TagSoup
2331 main = print $ parseTags tags
2334 ````````````````````````````````
2337 A script tag (type 1):
2339 ```````````````````````````````` example
2340 <script type="text/javascript">
2341 // JavaScript example
2343 document.getElementById("demo").innerHTML = "Hello JavaScript!";
2347 <script type="text/javascript">
2348 // JavaScript example
2350 document.getElementById("demo").innerHTML = "Hello JavaScript!";
2353 ````````````````````````````````
2356 A style tag (type 1):
2358 ```````````````````````````````` example
2374 ````````````````````````````````
2377 If there is no matching end tag, the block will end at the
2378 end of the document (or the enclosing [block quote][block quotes]
2379 or [list item][list items]):
2381 ```````````````````````````````` example
2391 ````````````````````````````````
2394 ```````````````````````````````` example
2405 ````````````````````````````````
2408 ```````````````````````````````` example
2418 ````````````````````````````````
2421 The end tag can occur on the same line as the start tag:
2423 ```````````````````````````````` example
2424 <style>p{color:red;}</style>
2427 <style>p{color:red;}</style>
2429 ````````````````````````````````
2432 ```````````````````````````````` example
2438 ````````````````````````````````
2441 Note that anything on the last line after the
2442 end tag will be included in the [HTML block]:
2444 ```````````````````````````````` example
2452 ````````````````````````````````
2457 ```````````````````````````````` example
2469 ````````````````````````````````
2473 A processing instruction (type 3):
2475 ```````````````````````````````` example
2489 ````````````````````````````````
2492 A declaration (type 4):
2494 ```````````````````````````````` example
2498 ````````````````````````````````
2503 ```````````````````````````````` example
2505 function matchwo(a,b)
2507 if (a < b && a < 0) then {
2519 function matchwo(a,b)
2521 if (a < b && a < 0) then {
2531 ````````````````````````````````
2534 The opening tag can be indented 1-3 spaces, but not 4:
2536 ```````````````````````````````` example
2542 <pre><code><!-- foo -->
2544 ````````````````````````````````
2547 ```````````````````````````````` example
2553 <pre><code><div>
2555 ````````````````````````````````
2558 An HTML block of types 1--6 can interrupt a paragraph, and need not be
2559 preceded by a blank line.
2561 ```````````````````````````````` example
2571 ````````````````````````````````
2574 However, a following blank line is needed, except at the end of
2575 a document, and except for blocks of types 1--5, above:
2577 ```````````````````````````````` example
2587 ````````````````````````````````
2590 HTML blocks of type 7 cannot interrupt a paragraph:
2592 ```````````````````````````````` example
2600 ````````````````````````````````
2603 This rule differs from John Gruber's original Markdown syntax
2604 specification, which says:
2606 > The only restrictions are that block-level HTML elements —
2607 > e.g. `<div>`, `<table>`, `<pre>`, `<p>`, etc. — must be separated from
2608 > surrounding content by blank lines, and the start and end tags of the
2609 > block should not be indented with tabs or spaces.
2611 In some ways Gruber's rule is more restrictive than the one given
2614 - It requires that an HTML block be preceded by a blank line.
2615 - It does not allow the start tag to be indented.
2616 - It requires a matching end tag, which it also does not allow to
2619 Most Markdown implementations (including some of Gruber's own) do not
2620 respect all of these restrictions.
2622 There is one respect, however, in which Gruber's rule is more liberal
2623 than the one given here, since it allows blank lines to occur inside
2624 an HTML block. There are two reasons for disallowing them here.
2625 First, it removes the need to parse balanced tags, which is
2626 expensive and can require backtracking from the end of the document
2627 if no matching end tag is found. Second, it provides a very simple
2628 and flexible way of including Markdown content inside HTML tags:
2629 simply separate the Markdown from the HTML using blank lines:
2633 ```````````````````````````````` example
2641 <p><em>Emphasized</em> text.</p>
2643 ````````````````````````````````
2646 ```````````````````````````````` example
2654 ````````````````````````````````
2657 Some Markdown implementations have adopted a convention of
2658 interpreting content inside tags as text if the open tag has
2659 the attribute `markdown=1`. The rule given above seems a simpler and
2660 more elegant way of achieving the same expressive power, which is also
2661 much simpler to parse.
2663 The main potential drawback is that one can no longer paste HTML
2664 blocks into Markdown documents with 100% reliability. However,
2665 *in most cases* this will work fine, because the blank lines in
2666 HTML are usually followed by HTML block tags. For example:
2668 ```````````````````````````````` example
2688 ````````````````````````````````
2691 There are problems, however, if the inner tags are indented
2692 *and* separated by blank lines, as then they will be interpreted as
2693 an indented code block:
2695 ```````````````````````````````` example
2710 <pre><code><td>
2716 ````````````````````````````````
2719 Fortunately, blank lines are usually not necessary and can be
2720 deleted. The exception is inside `<pre>` tags, but as described
2721 above, raw HTML blocks starting with `<pre>` *can* contain blank
2724 ## Link reference definitions
2726 A [link reference definition](@)
2727 consists of a [link label], indented up to three spaces, followed
2728 by a colon (`:`), optional [whitespace] (including up to one
2729 [line ending]), a [link destination],
2730 optional [whitespace] (including up to one
2731 [line ending]), and an optional [link
2732 title], which if it is present must be separated
2733 from the [link destination] by [whitespace].
2734 No further [non-whitespace characters] may occur on the line.
2736 A [link reference definition]
2737 does not correspond to a structural element of a document. Instead, it
2738 defines a label which can be used in [reference links]
2739 and reference-style [images] elsewhere in the document. [Link
2740 reference definitions] can come either before or after the links that use
2743 ```````````````````````````````` example
2748 <p><a href="/url" title="title">foo</a></p>
2749 ````````````````````````````````
2752 ```````````````````````````````` example
2759 <p><a href="/url" title="the title">foo</a></p>
2760 ````````````````````````````````
2763 ```````````````````````````````` example
2764 [Foo*bar\]]:my_(url) 'title (with parens)'
2768 <p><a href="my_(url)" title="title (with parens)">Foo*bar]</a></p>
2769 ````````````````````````````````
2772 ```````````````````````````````` example
2779 <p><a href="my%20url" title="title">Foo bar</a></p>
2780 ````````````````````````````````
2783 The title may extend over multiple lines:
2785 ```````````````````````````````` example
2794 <p><a href="/url" title="
2799 ````````````````````````````````
2802 However, it may not contain a [blank line]:
2804 ```````````````````````````````` example
2811 <p>[foo]: /url 'title</p>
2812 <p>with blank line'</p>
2814 ````````````````````````````````
2817 The title may be omitted:
2819 ```````````````````````````````` example
2825 <p><a href="/url">foo</a></p>
2826 ````````````````````````````````
2829 The link destination may not be omitted:
2831 ```````````````````````````````` example
2838 ````````````````````````````````
2841 Both title and destination can contain backslash escapes
2842 and literal backslashes:
2844 ```````````````````````````````` example
2845 [foo]: /url\bar\*baz "foo\"bar\baz"
2849 <p><a href="/url%5Cbar*baz" title="foo"bar\baz">foo</a></p>
2850 ````````````````````````````````
2853 A link can come before its corresponding definition:
2855 ```````````````````````````````` example
2860 <p><a href="url">foo</a></p>
2861 ````````````````````````````````
2864 If there are several matching definitions, the first one takes
2867 ```````````````````````````````` example
2873 <p><a href="first">foo</a></p>
2874 ````````````````````````````````
2877 As noted in the section on [Links], matching of labels is
2878 case-insensitive (see [matches]).
2880 ```````````````````````````````` example
2885 <p><a href="/url">Foo</a></p>
2886 ````````````````````````````````
2889 ```````````````````````````````` example
2894 <p><a href="/%CF%86%CE%BF%CF%85">αγω</a></p>
2895 ````````````````````````````````
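
Label matching and the first-definition-wins rule can be sketched together. This Python fragment is a hedged illustration: `normalize_label` and `add_definition` are hypothetical names, and the normalization shown (strip, collapse internal whitespace, Unicode case fold) is an assumption about what "matches" means, based on the case-insensitivity stated above.

```python
import re


def normalize_label(label):
    """Normalize a link label for comparison (assumed rule: strip,
    collapse runs of whitespace, then apply Unicode case folding)."""
    return re.sub(r"\s+", " ", label.strip()).casefold()


def add_definition(definitions, label, destination, title=None):
    """Record a link reference definition; an earlier definition with a
    matching label takes precedence over a later one."""
    definitions.setdefault(normalize_label(label), (destination, title))
```

Using `setdefault` means a second definition with a matching label is silently ignored, which mirrors the precedence rule.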
2898 Here is a link reference definition with no corresponding link.
2899 It contributes nothing to the document.
2901 ```````````````````````````````` example
2904 ````````````````````````````````
2907 Here is another one:
2909 ```````````````````````````````` example
2916 ````````````````````````````````
2919 This is not a link reference definition, because there are
2920 [non-whitespace characters] after the title:
2922 ```````````````````````````````` example
2923 [foo]: /url "title" ok
2925 <p>[foo]: /url "title" ok</p>
2926 ````````````````````````````````
2929 This is a link reference definition, but it has no title:
2931 ```````````````````````````````` example
2935 <p>"title" ok</p>
2936 ````````````````````````````````
2939 This is not a link reference definition, because it is indented
2942 ```````````````````````````````` example
2947 <pre><code>[foo]: /url "title"
2950 ````````````````````````````````
2953 This is not a link reference definition, because it occurs inside
2956 ```````````````````````````````` example
2963 <pre><code>[foo]: /url
2966 ````````````````````````````````
2969 A [link reference definition] cannot interrupt a paragraph.
2971 ```````````````````````````````` example
2980 ````````````````````````````````
2983 However, it can directly follow other block elements, such as headings
2984 and thematic breaks, and it need not be followed by a blank line.
2986 ```````````````````````````````` example
2991 <h1><a href="/url">Foo</a></h1>
2995 ````````````````````````````````
2998 Several [link reference definitions]
2999 can occur one after another, without intervening blank lines.
3001 ```````````````````````````````` example
3002 [foo]: /foo-url "foo"
3011 <p><a href="/foo-url" title="foo">foo</a>,
3012 <a href="/bar-url" title="bar">bar</a>,
3013 <a href="/baz-url">baz</a></p>
3014 ````````````````````````````````
3017 [Link reference definitions] can occur
3018 inside block containers, like lists and block quotations. They
3019 affect the entire document, not just the container in which they
3022 ```````````````````````````````` example
3027 <p><a href="/url">foo</a></p>
3030 ````````````````````````````````
3036 A sequence of non-blank lines that cannot be interpreted as other
3037 kinds of blocks forms a [paragraph](@).
3038 The contents of the paragraph are the result of parsing the
3039 paragraph's raw content as inlines. The paragraph's raw content
3040 is formed by concatenating the lines and removing initial and final
3043 A simple example with two paragraphs:
3045 ```````````````````````````````` example
3052 ````````````````````````````````
3055 Paragraphs can contain multiple lines, but no blank lines:
3057 ```````````````````````````````` example
3068 ````````````````````````````````
3071 Multiple blank lines between paragraphs have no effect:
3073 ```````````````````````````````` example
3081 ````````````````````````````````
3084 Leading spaces are skipped:
3086 ```````````````````````````````` example
3092 ````````````````````````````````
3095 Lines after the first may be indented any amount, since indented
3096 code blocks cannot interrupt paragraphs.
3098 ```````````````````````````````` example
3106 ````````````````````````````````
3109 However, the first line may be indented at most three spaces,
3110 or an indented code block will be triggered:
3112 ```````````````````````````````` example
3118 ````````````````````````````````
3121 ```````````````````````````````` example
3128 ````````````````````````````````
3131 Final spaces are stripped before inline parsing, so a paragraph
3132 that ends with two or more spaces will not end with a [hard line
3135 ```````````````````````````````` example
3141 ````````````````````````````````
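
The way a paragraph's raw content is assembled can be sketched in Python. This is one reading of the rules in this section, not a definitive implementation: the name `paragraph_raw_content` is hypothetical, and the sketch assumes that leading whitespace is dropped from every line while trailing whitespace is dropped only at the very end of the paragraph (so that interior trailing spaces remain available to inline parsing).

```python
def paragraph_raw_content(lines):
    """Join paragraph lines into the raw content handed to inline parsing."""
    # Leading spaces and tabs on each line are skipped.
    joined = "\n".join(line.lstrip(" \t") for line in lines)
    # Final whitespace is stripped before inline parsing.
    return joined.rstrip(" \t")
```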
3146 [Blank lines] between block-level elements are ignored,
3147 except for the role they play in determining whether a [list]
3148 is [tight] or [loose].
3150 Blank lines at the beginning and end of the document are also ignored.
3152 ```````````````````````````````` example
3164 ````````````````````````````````
3170 A [container block] is a block that has other
3171 blocks as its contents. There are two basic kinds of container blocks:
3172 [block quotes] and [list items].
3173 [Lists] are meta-containers for [list items].
3175 We define the syntax for container blocks recursively. The general
3176 form of the definition is:
3178 > If X is a sequence of blocks, then the result of
3179 > transforming X in such-and-such a way is a container of type Y
3180 > with these blocks as its content.
3182 So, we explain what counts as a block quote or list item by explaining
3183 how these can be *generated* from their contents. This should suffice
3184 to define the syntax, although it does not give a recipe for *parsing*
3185 these constructions. (A recipe is provided below in the section entitled
3186 [A parsing strategy](#appendix-a-parsing-strategy).)
3190 A [block quote marker](@)
3191 consists of 0-3 spaces of initial indent, plus (a) the character `>` together
3192 with a following space, or (b) a single character `>` not followed by a space.
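
As a rough illustration, the marker definition corresponds to a short regular expression. In this Python sketch the name `strip_block_quote_marker` is hypothetical, and tabs are not handled; it removes one block quote marker from the start of a line, or returns `None` when the line does not begin with one.

```python
import re

# 0-3 spaces of indent, the character `>`, and one optional following space.
_BLOCK_QUOTE_MARKER = re.compile(r" {0,3}> ?")


def strip_block_quote_marker(line):
    """Return `line` without its leading block quote marker, or None."""
    match = _BLOCK_QUOTE_MARKER.match(line)
    if match is None:
        return None
    return line[match.end():]
```

Since the marker absorbs one following space, a line such as `>` followed by five spaces and some text keeps four spaces after stripping.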
3194 The following rules define [block quotes]:
3196 1. **Basic case.** If a string of lines *Ls* constitute a sequence
3197 of blocks *Bs*, then the result of prepending a [block quote
3198 marker] to the beginning of each line in *Ls*
3199 is a [block quote](#block-quotes) containing *Bs*.
3201 2. **Laziness.** If a string of lines *Ls* constitute a [block
3202 quote](#block-quotes) with contents *Bs*, then the result of deleting
3203 the initial [block quote marker] from one or
3204 more lines in which the next [non-whitespace character] after the [block
3205 quote marker] is [paragraph continuation
3206 text] is a block quote with *Bs* as its content.
3207 [Paragraph continuation text](@) is text
3208 that will be parsed as part of the content of a paragraph, but does
3209 not occur at the beginning of the paragraph.
3211 3. **Consecutiveness.** A document cannot contain two [block
3212 quotes] in a row unless there is a [blank line] between them.
3214 Nothing else counts as a [block quote](#block-quotes).
3216 Here is a simple example:
3218 ```````````````````````````````` example
3228 ````````````````````````````````
3231 The spaces after the `>` characters can be omitted:
3233 ```````````````````````````````` example
3243 ````````````````````````````````
3246 The `>` characters can be indented 1-3 spaces:
3248 ```````````````````````````````` example
3258 ````````````````````````````````
3261 Four spaces give us a code block:
3263 ```````````````````````````````` example
3268 <pre><code>> # Foo
3272 ````````````````````````````````
3275 The Laziness clause allows us to omit the `>` before
3276 [paragraph continuation text]:
3278 ```````````````````````````````` example
3288 ````````````````````````````````
3291 A block quote can contain some lazy and some non-lazy
3294 ```````````````````````````````` example
3304 ````````````````````````````````
3307 Laziness only applies to lines that would have been continuations of
3308 paragraphs had they been prepended with [block quote markers].
3309 For example, the `> ` cannot be omitted in the second line of
3316 without changing the meaning:
3318 ```````````````````````````````` example
3326 ````````````````````````````````
3329 Similarly, if we omit the `> ` in the second line of
3336 then the block quote ends after the first line:
3338 ```````````````````````````````` example
3350 ````````````````````````````````
3353 For the same reason, we can't omit the `> ` in front of
3354 subsequent lines of an indented or fenced code block:
3356 ```````````````````````````````` example
3366 ````````````````````````````````
3369 ```````````````````````````````` example
3375 <pre><code></code></pre>
3378 <pre><code></code></pre>
3379 ````````````````````````````````
3382 Note that in the following case, we have a [lazy
3385 ```````````````````````````````` example
3393 ````````````````````````````````
3396 To see why, note that in
3403 the `- bar` is indented too far to start a list, and can't
3404 be an indented code block because indented code blocks cannot
3405 interrupt paragraphs, so it is [paragraph continuation text].
3407 A block quote can be empty:
3409 ```````````````````````````````` example
3414 ````````````````````````````````
3417 ```````````````````````````````` example
3424 ````````````````````````````````
3427 A block quote can have initial or final blank lines:
3429 ```````````````````````````````` example
3437 ````````````````````````````````
3440 A blank line always separates block quotes:
3442 ```````````````````````````````` example
3453 ````````````````````````````````
3456 (Most current Markdown implementations, including John Gruber's
3457 original `Markdown.pl`, will parse this example as a single block quote
3458 with two paragraphs. But it seems better to allow the author to decide
3459 whether two block quotes or one are wanted.)
3461 Consecutiveness means that if we put these block quotes together,
3462 we get a single block quote:
3464 ```````````````````````````````` example
3472 ````````````````````````````````
3475 To get a block quote with two paragraphs, use:
3477 ```````````````````````````````` example
3486 ````````````````````````````````
3489 Block quotes can interrupt paragraphs:
3491 ```````````````````````````````` example
3499 ````````````````````````````````
3502 In general, blank lines are not needed before or after block
3505 ```````````````````````````````` example
3517 ````````````````````````````````
3520 However, because of laziness, a blank line is needed between
3521 a block quote and a following paragraph:
3523 ```````````````````````````````` example
3531 ````````````````````````````````
3534 ```````````````````````````````` example
3543 ````````````````````````````````
3546 ```````````````````````````````` example
3555 ````````````````````````````````
3558 It is a consequence of the Laziness rule that any number
3559 of initial `>`s may be omitted on a continuation line of a
3562 ```````````````````````````````` example
3574 ````````````````````````````````
3577 ```````````````````````````````` example
3591 ````````````````````````````````
3594 When including an indented code block in a block quote,
3595 remember that the [block quote marker] includes
3596 both the `>` and a following space. So *five spaces* are needed after
3599 ```````````````````````````````` example
3611 ````````````````````````````````
3617 A [list marker](@) is a
3618 [bullet list marker] or an [ordered list marker].
3620 A [bullet list marker](@)
3621 is a `-`, `+`, or `*` character.
3623 An [ordered list marker](@)
3624 is a sequence of 1--9 arabic digits (`0-9`), followed by either a
3625 `.` character or a `)` character. (The reason for the length
3626 limit is that with 10 digits we start seeing integer overflows
3629 The following rules define [list items]:
3631 1. **Basic case.** If a sequence of lines *Ls* constitute a sequence of
3632 blocks *Bs* starting with a [non-whitespace character] and not separated
3633 from each other by more than one blank line, and *M* is a list
3634 marker of width *W* followed by 1 ≤ *N* ≤ 4 spaces, then the result
3635 of prepending *M* and the following spaces to the first line of
3636 *Ls*, and indenting subsequent lines of *Ls* by *W + N* spaces, is a
3637 list item with *Bs* as its contents. The type of the list item
3638 (bullet or ordered) is determined by the type of its list marker.
3639 If the list item is ordered, then it is also assigned a start
3640 number, based on the ordered list marker.
3642 Exceptions: When the first list item in a [list] interrupts
3643 a paragraph---that is, when it starts on a line that would
3644 otherwise count as [paragraph continuation text]---then (a)
3645 the lines *Ls* must not begin with a blank line, and (b) if
3646 the list item is ordered, the start number must be 1.
3648 For example, let *Ls* be the lines
3650 ```````````````````````````````` example
3660 <pre><code>indented code
3663 <p>A block quote.</p>
3665 ````````````````````````````````
3668 And let *M* be the marker `1.`, and *N* = 2. Then rule #1 says
3669 that the following is an ordered list item with start number 1,
3670 and the same contents as *Ls*:
3672 ```````````````````````````````` example
3684 <pre><code>indented code
3687 <p>A block quote.</p>
3691 ````````````````````````````````
3694 The most important thing to notice is that the position of
3695 the text after the list marker determines how much indentation
3696 is needed in subsequent blocks in the list item. If the list
3697 marker takes up two spaces, and there are three spaces between
3698 the list marker and the next [non-whitespace character], then blocks
3699 must be indented five spaces in order to fall under the list
3702 Here are some examples showing how far content must be indented to be
3703 put under the list item:
3705 ```````````````````````````````` example
3714 ````````````````````````````````
3717 ```````````````````````````````` example
3728 ````````````````````````````````
3731 ```````````````````````````````` example
3741 ````````````````````````````````
3744 ```````````````````````````````` example
3755 ````````````````````````````````
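
The arithmetic behind these examples (marker width *W* plus *N* following spaces) can be sketched in Python. This is an illustration of the basic case only: `content_indent` is a hypothetical name, the list marker is assumed to start at the beginning of the line, and lines with no space, or with more than four spaces, after the marker are simply reported as not matching.

```python
import re

# A bullet or ordered list marker, then 1-4 spaces, then content.
_LIST_ITEM_START = re.compile(r"([-+*]|[0-9]{1,9}[.)])( {1,4})(?=\S)")


def content_indent(first_line):
    """Return W + N, the indentation that later blocks need in order to
    belong to the list item, or None if the line is not a basic-case
    list item start."""
    match = _LIST_ITEM_START.match(first_line)
    if match is None:
        return None
    marker, spaces = match.groups()
    return len(marker) + len(spaces)
```

A two-character marker followed by three spaces gives an indent of five, matching the description above.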
3758 It is tempting to think of this in terms of columns: the continuation
3759 blocks must be indented at least to the column of the first
3760 [non-whitespace character] after the list marker. However, that is not quite right.
3761 The spaces after the list marker determine how much relative indentation
3762 is needed. Which column this indentation reaches will depend on
3763 how the list item is embedded in other constructions, as shown by
3766 ```````````````````````````````` example
3781 ````````````````````````````````
3784 Here `two` occurs in the same column as the list marker `1.`,
3785 but is actually contained in the list item, because there is
3786 sufficient indentation after the last containing blockquote marker.
3788 The converse is also possible. In the following example, the word `two`
3789 occurs far to the right of the initial text of the list item, `one`, but
3790 it is not considered part of the list item, because it is not indented
3791 far enough past the blockquote marker:
3793 ```````````````````````````````` example
3806 ````````````````````````````````
3809 Note that at least one space is needed between the list marker and
3810 any following content, so these are not list items:
3812 ```````````````````````````````` example
3819 ````````````````````````````````
3822 A list item may contain blocks that are separated by more than
3825 ```````````````````````````````` example
3837 ````````````````````````````````
3840 A list item may contain any kind of block:
3842 ```````````````````````````````` example
3864 ````````````````````````````````
3867 A list item that contains an indented code block will preserve
3868 empty lines within the code block verbatim.
3870 ```````````````````````````````` example
3888 ````````````````````````````````
3890 Note that ordered list start numbers must be nine digits or fewer:
3892 ```````````````````````````````` example
3895 <ol start="123456789">
3898 ````````````````````````````````
3901 ```````````````````````````````` example
3904 <p>1234567890. not ok</p>
3905 ````````````````````````````````
3908 A start number may begin with 0s:
3910 ```````````````````````````````` example
3916 ````````````````````````````````
3919 ```````````````````````````````` example
3925 ````````````````````````````````
3928 A start number may not be negative:
3930 ```````````````````````````````` example
3934 ````````````````````````````````
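The constraints on start numbers shown in the last few examples can be captured by a small pattern. The following Python sketch is illustrative only; the name `ordered_start` is an assumption of ours, and the sketch looks at a single line without considering its context.

```python
import re

# 0-3 spaces of indentation, 1-9 ASCII digits, a "." or ")" delimiter,
# then either a space or the end of the line.
_ORDERED = re.compile(r"^ {0,3}([0-9]{1,9})[.)](?: |$)")


def ordered_start(line):
    """Return the start number if the line opens an ordered list item,
    otherwise None."""
    m = _ORDERED.match(line)
    return int(m.group(1)) if m else None
```

Leading zeros are accepted and simply dropped when the digits are read as a number, while a leading `-` prevents the match altogether.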
3938 2. **Item starting with indented code.** If a sequence of lines *Ls*
3939 constitutes a sequence of blocks *Bs* starting with an indented code
3940 block and not separated from each other by more than one blank line,
3941 and *M* is a list marker of width *W* followed by
3942 one space, then the result of prepending *M* and the following
3943 space to the first line of *Ls*, and indenting subsequent lines of
3944 *Ls* by *W + 1* spaces, is a list item with *Bs* as its contents.
3945 If a line is empty, then it need not be indented. The type of the
3946 list item (bullet or ordered) is determined by the type of its list
3947 marker. If the list item is ordered, then it is also assigned a
3948 start number, based on the ordered list marker.
3950 An indented code block will have to be indented four spaces beyond
3951 the edge of the region where text will be included in the list item.
3952 In the following case that is 6 spaces:
3954 ```````````````````````````````` example
3966 ````````````````````````````````
3969 And in this case it is 11 spaces:
3971 ```````````````````````````````` example
3983 ````````````````````````````````
3986 If the *first* block in the list item is an indented code block,
3987 then by rule #2, the contents must be indented *one* space after the
3990 ```````````````````````````````` example
3997 <pre><code>indented code
4000 <pre><code>more code
4002 ````````````````````````````````
4005 ```````````````````````````````` example
4014 <pre><code>indented code
4017 <pre><code>more code
4021 ````````````````````````````````
4024 Note that an additional space indent is interpreted as space
4025 inside the code block:
4027 ```````````````````````````````` example
4036 <pre><code> indented code
4039 <pre><code>more code
4043 ````````````````````````````````
4046 Note that rules #1 and #2 only apply to two cases: (a) cases
4047 in which the lines to be included in a list item begin with a
4048 [non-whitespace character], and (b) cases in which
4049 they begin with an indented code
4050 block. In a case like the following, where the first block begins with
4051 a three-space indent, the rules do not allow us to form a list item by
4052 indenting the whole thing and prepending a list marker:
4054 ```````````````````````````````` example
4061 ````````````````````````````````
4064 ```````````````````````````````` example
4073 ````````````````````````````````
4076 This is not a significant restriction, because when a block begins
4077 with 1--3 spaces indent, the indentation can always be removed without
4078 a change in interpretation, allowing rule #1 to be applied. So, in
4081 ```````````````````````````````` example
4092 ````````````````````````````````
4095 3. **Item starting with a blank line.** If a sequence of lines *Ls*
4096 starting with a single [blank line] constitutes a (possibly empty)
4097 sequence of blocks *Bs*, not separated from each other by more than
4098 one blank line, and *M* is a list marker of width *W*,
4099 then the result of prepending *M* to the first line of *Ls*, and
4100 indenting subsequent lines of *Ls* by *W + 1* spaces, is a list
4101 item with *Bs* as its contents.
4102 If a line is empty, then it need not be indented. The type of the
4103 list item (bullet or ordered) is determined by the type of its list
4104 marker. If the list item is ordered, then it is also assigned a
4105 start number, based on the ordered list marker.
4107 Here are some list items that start with a blank line but are not empty:
4109 ```````````````````````````````` example
4130 ````````````````````````````````
4132 When the list item starts with a blank line, the number of spaces
4133 following the list marker doesn't change the required indentation:
4135 ```````````````````````````````` example
4142 ````````````````````````````````
4145 A list item can begin with at most one blank line.
4146 In the following example, `foo` is not part of the list
4149 ```````````````````````````````` example
4158 ````````````````````````````````
4161 Here is an empty bullet list item:
4163 ```````````````````````````````` example
4173 ````````````````````````````````
4176 It does not matter whether there are spaces following the [list marker]:
4178 ```````````````````````````````` example
4188 ````````````````````````````````
4191 Here is an empty ordered list item:
4193 ```````````````````````````````` example
4203 ````````````````````````````````
4206 A list may start or end with an empty list item:
4208 ```````````````````````````````` example
4214 ````````````````````````````````
4216 However, an empty list item cannot interrupt a paragraph:
4218 ```````````````````````````````` example
4229 ````````````````````````````````
4232 4. **Indentation.** If a sequence of lines *Ls* constitutes a list item
4233 according to rule #1, #2, or #3, then the result of indenting each line
4234 of *Ls* by 1--3 spaces (the same for each line) also constitutes a
4235 list item with the same contents and attributes. If a line is
4236 empty, then it need not be indented.
4240 ```````````````````````````````` example
4252 <pre><code>indented code
4255 <p>A block quote.</p>
4259 ````````````````````````````````
4262 Indented two spaces:
4264 ```````````````````````````````` example
4276 <pre><code>indented code
4279 <p>A block quote.</p>
4283 ````````````````````````````````
4286 Indented three spaces:
4288 ```````````````````````````````` example
4300 <pre><code>indented code
4303 <p>A block quote.</p>
4307 ````````````````````````````````
4310 Four spaces indent gives a code block:
4312 ```````````````````````````````` example
4320 <pre><code>1. A paragraph
4327 ````````````````````````````````
4331 5. **Laziness.** If a string of lines *Ls* constitutes a [list
4332 item](#list-items) with contents *Bs*, then the result of deleting
4333 some or all of the indentation from one or more lines in which the
4334 next [non-whitespace character] after the indentation is
4335 [paragraph continuation text] is a
4336 list item with the same contents and attributes. The unindented
4338 [lazy continuation line](@)s.
4340 Here is an example with [lazy continuation lines]:
4342 ```````````````````````````````` example
4354 <pre><code>indented code
4357 <p>A block quote.</p>
4361 ````````````````````````````````
4364 Indentation can be partially deleted:
4366 ```````````````````````````````` example
4372 with two lines.</li>
4374 ````````````````````````````````
4377 These examples show how laziness can work in nested structures:
4379 ```````````````````````````````` example
4393 ````````````````````````````````
4396 ```````````````````````````````` example
4410 ````````````````````````````````
4414 6. **That's all.** Nothing that is not counted as a list item by rules
4415 #1--5 counts as a [list item](#list-items).
4417 The rules for sublists follow from the general rules above. A sublist
4418 must be indented the same number of spaces a paragraph would need to be
4419 in order to be included in the list item.
4421 So, in this case we need two spaces indent:
4423 ```````````````````````````````` example
4444 ````````````````````````````````
4449 ```````````````````````````````` example
4461 ````````````````````````````````
4464 Here we need four, because the list marker is wider:
4466 ```````````````````````````````` example
4477 ````````````````````````````````
4480 Three is not enough:
4482 ```````````````````````````````` example
4492 ````````````````````````````````
4495 A list may be the first block in a list item:
4497 ```````````````````````````````` example
4507 ````````````````````````````````
4510 ```````````````````````````````` example
4524 ````````````````````````````````
4527 A list item can contain a heading:
4529 ```````````````````````````````` example
4543 ````````````````````````````````
4548 John Gruber's Markdown spec says the following about list items:
4550 1. "List markers typically start at the left margin, but may be indented
4551 by up to three spaces. List markers must be followed by one or more
4554 2. "To make lists look nice, you can wrap items with hanging indents....
4555 But if you don't want to, you don't have to."
4557 3. "List items may consist of multiple paragraphs. Each subsequent
4558 paragraph in a list item must be indented by either 4 spaces or one
4561 4. "It looks nice if you indent every line of the subsequent paragraphs,
4562 but here again, Markdown will allow you to be lazy."
4564 5. "To put a blockquote within a list item, the blockquote's `>`
4565 delimiters need to be indented."
4567 6. "To put a code block within a list item, the code block needs to be
4568 indented twice — 8 spaces or two tabs."
4570 These rules specify that a paragraph under a list item must be indented
4571 four spaces (presumably, from the left margin, rather than the start of
4572 the list marker, but this is not said), and that code under a list item
4573 must be indented eight spaces instead of the usual four. They also say
4574 that a block quote must be indented, but not by how much; however, the
4575 example given has four spaces indentation. Although nothing is said
4576 about other kinds of block-level content, it is certainly reasonable to
4577 infer that *all* block elements under a list item, including other
4578 lists, must be indented four spaces. This principle has been called the
4581 The four-space rule is clear and principled, and if the reference
4582 implementation `Markdown.pl` had followed it, it probably would have
4583 become the standard. However, `Markdown.pl` allowed paragraphs and
4584 sublists to start with only two spaces indentation, at least on the
4585 outer level. Worse, its behavior was inconsistent: a sublist of an
4586 outer-level list needed two spaces indentation, but a sublist of this
4587 sublist needed three spaces. It is not surprising, then, that different
4588 implementations of Markdown have developed very different rules for
4589 determining what comes under a list item. (Pandoc and python-Markdown,
4590 for example, stuck with Gruber's syntax description and the four-space
4591 rule, while discount, redcarpet, marked, PHP Markdown, and others
4592 followed `Markdown.pl`'s behavior more closely.)
4594 Unfortunately, given the divergences between implementations, there
4595 is no way to give a spec for list items that will be guaranteed not
4596 to break any existing documents. However, the spec given here should
4597 correctly handle lists formatted with either the four-space rule or
4598 the more forgiving `Markdown.pl` behavior, provided they are laid out
4599 in a way that is natural for a human to read.
4601 The strategy here is to let the width and indentation of the list marker
4602 determine the indentation necessary for blocks to fall under the list
4603 item, rather than having a fixed and arbitrary number. The writer can
4604 think of the body of the list item as a unit which gets indented to the
4605 right enough to fit the list marker (and any indentation on the list
4606 marker). (The laziness rule, #5, then allows continuation lines to be
4607 unindented if needed.)
4609 This rule is superior, we claim, to any rule requiring a fixed level of
4610 indentation from the margin. The four-space rule is clear but
4611 unnatural. It is quite unintuitive that
4621 should be parsed as two lists with an intervening paragraph,
4633 as the four-space rule demands, rather than a single list,
4647 The choice of four spaces is arbitrary. It can be learned, but it is
4648 not likely to be guessed, and it trips up beginners regularly.
4650 Would it help to adopt a two-space rule? The problem is that such
4651 a rule, together with the rule allowing 1--3 spaces indentation of the
4652 initial list marker, allows text that is indented *less than* the
4653 original list marker to be included in the list item. For example,
4654 `Markdown.pl` parses
4662 as a single list item, with `two` a continuation paragraph:
4694 This is extremely unintuitive.
4696 Rather than requiring a fixed indent from the margin, we could require
4697 a fixed indent (say, two spaces, or even one space) from the list marker (which
4698 may itself be indented). This proposal would remove the last anomaly
4699 discussed. Unlike the spec presented above, it would count the following
4700 as a list item with a subparagraph, even though the paragraph `bar`
4701 is not indented as far as the first paragraph `foo`:
4709 Arguably this text does read like a list item with `bar` as a subparagraph,
4710 which may count in favor of the proposal. However, on this proposal indented
4711 code would have to be indented six spaces after the list marker. And this
4712 would break a lot of existing Markdown, which has the pattern:
4720 where the code is indented eight spaces. The spec above, by contrast, will
4721 parse this text as expected, since the code block's indentation is measured
4722 from the beginning of `foo`.
4724 The one case that needs special treatment is a list item that *starts*
4725 with indented code. How much indentation is required in that case, since
4726 we don't have a "first paragraph" to measure from? Rule #2 simply stipulates
4727 that in such cases, we require one space indentation from the list marker
4728 (and then the normal four spaces for the indented code). This will match the
4729 four-space rule in cases where the list marker plus its initial indentation
4730 takes four spaces (a common case), but diverge in other cases.
4734 A [list](@) is a sequence of one or more
4735 list items [of the same type]. The list items
4736 may be separated by any number of blank lines.
4738 Two list items are [of the same type](@)
4739 if they begin with a [list marker] of the same type.
4740 Two list markers are of the
4741 same type if (a) they are bullet list markers using the same character
4742 (`-`, `+`, or `*`) or (b) they are ordered list numbers with the same
4743 delimiter (either `.` or `)`).
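As a rough illustration, the comparison can be written as follows in Python. The helper names `marker_kind` and `same_type` are hypothetical, and the marker is assumed to be passed in already stripped of indentation and trailing spaces.

```python
import re

# A bullet character, or 1-9 digits followed by a "." or ")" delimiter.
_MARKER = re.compile(r"^(?:([-+*])|[0-9]{1,9}([.)]))$")


def marker_kind(marker):
    """Classify a list marker by the feature that defines its type."""
    m = _MARKER.match(marker)
    if m is None:
        raise ValueError(f"not a list marker: {marker!r}")
    bullet, delimiter = m.groups()
    if bullet is not None:
        return ("bullet", bullet)
    # The number itself is irrelevant; only the delimiter matters.
    return ("ordered", delimiter)


def same_type(a, b):
    """True if two list markers are of the same type."""
    return marker_kind(a) == marker_kind(b)
```

Note that the list number plays no part in the comparison, so `1.` and `7.` are of the same type while `1.` and `1)` are not.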
4745 A list is an [ordered list](@)
4746 if its constituent list items begin with
4747 [ordered list markers], and a
4748 [bullet list](@) if its constituent list
4749 items begin with [bullet list markers].
4751 The [start number](@)
4752 of an [ordered list] is determined by the list number of
4753 its initial list item. The numbers of subsequent list items are
4756 A list is [loose](@) if any of its constituent
4757 list items are separated by blank lines, or if any of its constituent
4758 list items directly contain two block-level elements with a blank line
4759 between them. Otherwise a list is [tight](@).
4760 (The difference in HTML output is that paragraphs in a loose list are
4761 wrapped in `<p>` tags, while paragraphs in a tight list are not.)
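The definition can be restated as a check over already-parsed list items. The Python sketch below assumes a hypothetical `Item` record in which a parser has noted where blank lines occurred; neither the record nor its field names come from this spec.

```python
from dataclasses import dataclass


@dataclass
class Item:
    # A blank line separates this item from the next item in the list.
    blank_line_before_next_item: bool = False
    # Two block-level elements that are direct children of this item
    # have a blank line between them. Blank lines inside a code block
    # or inside a nested sublist do not count.
    blank_line_between_direct_children: bool = False


def is_loose(items):
    """A list is loose if either condition holds for any item;
    otherwise it is tight."""
    between_items = any(
        item.blank_line_before_next_item for item in items[:-1]
    )
    inside_item = any(
        item.blank_line_between_direct_children for item in items
    )
    return between_items or inside_item
```

A renderer following this sketch would wrap paragraphs in `<p>` tags only when `is_loose` returns true for the list that directly contains them.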
4763 Changing the bullet or ordered list delimiter starts a new list:
4765 ```````````````````````````````` example
4777 ````````````````````````````````
4780 ```````````````````````````````` example
4792 ````````````````````````````````
4795 In CommonMark, a list can interrupt a paragraph. That is,
4796 no blank line is needed to separate a paragraph from a following
4799 ```````````````````````````````` example
4809 ````````````````````````````````
4811 `Markdown.pl` does not allow this, through fear of triggering a list
4812 via a numeral in a hard-wrapped line:
4815 The number of windows in my house is
4816 14. The number of doors is 6.
4819 Oddly, though, `Markdown.pl` *does* allow a blockquote to
4820 interrupt a paragraph, even though the same considerations might
4823 In CommonMark, we do allow lists to interrupt paragraphs, for
4824 two reasons. First, it is natural and not uncommon for people
4825 to start lists without blank lines:
4834 Second, we are attracted to a
4836 > [principle of uniformity](@):
4837 > if a chunk of text has a certain
4838 > meaning, it will continue to have the same meaning when put into a
4839 > container block (such as a list item or blockquote).
4841 (Indeed, the spec for [list items] and [block quotes] presupposes
4842 this principle.) This principle implies that if
4851 is a list item containing a paragraph followed by a nested sublist,
4852 as all Markdown implementations agree it is (though the paragraph
4853 may be rendered without `<p>` tags, since the list is "tight"),
4863 by itself should be a paragraph followed by a nested sublist.
4865 Since it is well established Markdown practice to allow lists to
4866 interrupt paragraphs inside list items, the [principle of
4867 uniformity] requires us to allow this outside list items as
4868 well. ([reStructuredText](http://docutils.sourceforge.net/rst.html)
4869 takes a different approach, requiring blank lines before lists
4870 even inside other list items.)
4872 In order to solve the problem of unwanted lists in paragraphs with
4873 hard-wrapped numerals, we allow only lists starting with `1` to
4874 interrupt paragraphs. Thus,
4876 ```````````````````````````````` example
4877 The number of windows in my house is
4878 14. The number of doors is 6.
4880 <p>The number of windows in my house is
4881 14. The number of doors is 6.</p>
4882 ````````````````````````````````
4884 We may still get an unintended result in cases like
4886 ```````````````````````````````` example
4887 The number of windows in my house is
4888 1. The number of doors is 6.
4890 <p>The number of windows in my house is</p>
4892 <li>The number of doors is 6.</li>
4894 ````````````````````````````````
4896 but this rule should prevent most spurious list captures.
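A minimal Python sketch of this check follows. The name `can_interrupt_paragraph` is ours, tabs are not handled, and the sketch does not rule out other constructs that a full parser would test first. It combines the two conditions stated earlier: the item must not begin with a blank line, and an ordered item must start with 1.

```python
import re

# 0-3 spaces, a bullet or ordered marker, at least one space,
# and then some non-whitespace content on the same line.
_ITEM = re.compile(r"^ {0,3}(?:[-+*]|([0-9]{1,9})[.)]) +(\S.*)$")


def can_interrupt_paragraph(line):
    """True if this line may start a list in the middle of a paragraph."""
    m = _ITEM.match(line)
    if m is None:
        # Not a list item start, or an empty list item.
        return False
    number = m.group(1)
    # Bullet items qualify; ordered items only with start number 1.
    return number is None or int(number) == 1
```

With this check, the line `14. The number of doors is 6.` stays part of the paragraph, while `1. The number of doors is 6.` still starts a list.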
4898 There can be any number of blank lines between items:
4900 ```````````````````````````````` example
4919 ````````````````````````````````
4921 ```````````````````````````````` example
4943 ````````````````````````````````
4946 To separate consecutive lists of the same type, or to separate a
4947 list from an indented code block that would otherwise be parsed
4948 as a subparagraph of the final list item, you can insert a blank HTML
4951 ```````````````````````````````` example
4969 ````````````````````````````````
4972 ```````````````````````````````` example
4995 ````````````````````````````````
4998 List items need not be indented to the same level. The following
4999 list items will be treated as items at the same list level,
5000 since none is indented enough to belong to the previous list
5003 ```````````````````````````````` example
5025 ````````````````````````````````
5028 ```````````````````````````````` example
5046 ````````````````````````````````
5049 This is a loose list, because there is a blank line between
5050 two of the list items:
5052 ```````````````````````````````` example
5069 ````````````````````````````````
5072 So is this, with an empty second item:
5074 ```````````````````````````````` example
5089 ````````````````````````````````
5092 These are loose lists, even though there is no space between the items,
5093 because one of the items directly contains two block-level elements
5094 with a blank line between them:
5096 ```````````````````````````````` example
5115 ````````````````````````````````
5118 ```````````````````````````````` example
5136 ````````````````````````````````
5139 This is a tight list, because the blank lines are in a code block:
5141 ```````````````````````````````` example
5160 ````````````````````````````````
5163 This is a tight list, because the blank line is between two
5164 paragraphs of a sublist. So the sublist is loose while
5165 the outer list is tight:
5167 ```````````````````````````````` example
5185 ````````````````````````````````
5188 This is a tight list, because the blank line is inside the
5191 ```````````````````````````````` example
5205 ````````````````````````````````
5208 This list is tight, because the consecutive block elements
5209 are not separated by blank lines:
5211 ```````````````````````````````` example
5229 ````````````````````````````````
5232 A single-paragraph list is tight:
5234 ```````````````````````````````` example
5240 ````````````````````````````````
5243 ```````````````````````````````` example
5254 ````````````````````````````````
5257 This list is loose, because of the blank line between the
5258 two block elements in the list item:
5260 ```````````````````````````````` example
5274 ````````````````````````````````
5277 Here the outer list is loose, the inner list tight:
5279 ```````````````````````````````` example
5294 ````````````````````````````````
5297 ```````````````````````````````` example
5322 ````````````````````````````````
5327 Inlines are parsed sequentially from the beginning of the character
5328 stream to the end (left to right, in left-to-right languages).
5329 Thus, for example, in
5331 ```````````````````````````````` example
5334 <p><code>hi</code>lo`</p>
5335 ````````````````````````````````
5338 `hi` is parsed as code, leaving the backtick at the end as a literal
5341 ## Backslash escapes
5343 Any ASCII punctuation character may be backslash-escaped:
5345 ```````````````````````````````` example
5346 \!\"\#\$\%\&\'\(\)\*\+\,\-\.\/\:\;\<\=\>\?\@\[\\\]\^\_\`\{\|\}\~
5348 <p>!&quot;#$%&amp;'()*+,-./:;&lt;=&gt;?@[\]^_`{|}~</p>
5349 ````````````````````````````````
5352 Backslashes before other characters are treated as literal
5355 ```````````````````````````````` example
5358 <p>\→\A\a\ \3\φ\«</p>
5359 ````````````````````````````````
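The two rules so far can be sketched in Python as follows. This is an illustration only: the name `unescape` is an assumption, and the sketch deliberately ignores context, so it does not account for code blocks, code spans, autolinks, raw HTML, or a backslash at the end of a line.

```python
import re
import string

# string.punctuation is exactly the 32 ASCII punctuation characters.
_ESCAPE = re.compile("\\\\([" + re.escape(string.punctuation) + "])")


def unescape(text):
    """Replace each backslash-escaped ASCII punctuation character with
    the character itself; other backslashes are left as literal text."""
    return _ESCAPE.sub(r"\1", text)
```

Because matches are taken from left to right without overlap, an escaped backslash consumes both backslashes, and the character after it is left unescaped.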
5362 Escaped characters are treated as regular characters and do
5363 not have their usual Markdown meanings:
5365 ```````````````````````````````` example
5373 \[foo]: /url "not a reference"
5376 &lt;br/&gt; not a tag
5382 [foo]: /url &quot;not a reference&quot;</p>
5383 ````````````````````````````````
5386 If a backslash is itself escaped, the following character is not:
5388 ```````````````````````````````` example
5391 <p>\<em>emphasis</em></p>
5392 ````````````````````````````````
5395 A backslash at the end of the line is a [hard line break]:
5397 ```````````````````````````````` example
5403 ````````````````````````````````
5406 Backslash escapes do not work in code blocks, code spans, autolinks, or
5409 ```````````````````````````````` example
5412 <p><code>\[\`</code></p>
5413 ````````````````````````````````
5416 ```````````````````````````````` example
5421 ````````````````````````````````
5424 ```````````````````````````````` example
5431 ````````````````````````````````
5434 ```````````````````````````````` example
5435 <http://example.com?find=\*>
5437 <p><a href="http://example.com?find=%5C*">http://example.com?find=\*</a></p>
5438 ````````````````````````````````
5441 ```````````````````````````````` example
5445 ````````````````````````````````
5448 But they work in all other contexts, including URLs and link titles,
5449 link references, and [info strings] in [fenced code blocks]:
5451 ```````````````````````````````` example
5452 [foo](/bar\* "ti\*tle")
5454 <p><a href="/bar*" title="ti*tle">foo</a></p>
5455 ````````````````````````````````
5458 ```````````````````````````````` example
5461 [foo]: /bar\* "ti\*tle"
5463 <p><a href="/bar*" title="ti*tle">foo</a></p>
5464 ````````````````````````````````
5467 ```````````````````````````````` example
5472 <pre><code class="language-foo+bar">foo
5474 ````````````````````````````````
5478 ## Entity and numeric character references
5480 All valid HTML entity references and numeric character
5481 references, except those occurring in code blocks and code spans,
5482 are recognized as such and treated as equivalent to the
5483 corresponding Unicode characters. Conforming CommonMark parsers
5484 need not store information about whether a particular character
5485 was represented in the source using a Unicode character or
5486 an entity reference.
5488 [Entity references](@) consist of `&` + any of the valid
5489 HTML5 entity names + `;`. The
5490 document <https://html.spec.whatwg.org/multipage/entities.json>
5491 is used as an authoritative source for the valid entity
5492 references and their corresponding code points.
5494 ```````````````````````````````` example
5495 &nbsp; &amp; &copy; &AElig; &Dcaron;
5496 &frac34; &HilbertSpace; &DifferentialD;
5497 &ClockwiseContourIntegral; &ngE;
5502 ````````````````````````````````
5505 [Decimal numeric character
5507 consist of `&#` + a string of 1--8 arabic digits + `;`. A
5508 numeric character reference is parsed as the corresponding
5509 Unicode character. Invalid Unicode code points will be replaced by
5510 the REPLACEMENT CHARACTER (`U+FFFD`). For security reasons,
5511 the code point `U+0000` will also be replaced by `U+FFFD`.
5513 ```````````````````````````````` example
5514 &#35; &#1234; &#992; &#98765432; &#0;
5517 ````````````````````````````````
5520 [Hexadecimal numeric character
5521 references](@) consist of `&#` +
5522 either `X` or `x` + a string of 1--8 hexadecimal digits + `;`.
5523 They too are parsed as the corresponding Unicode character (this
5524 time specified with a hexadecimal numeral instead of decimal).
5526 ```````````````````````````````` example
5527 &#X22; &#XD06; &#xcab;
5530 ````````````````````````````````
5533 Here are some nonentities:
5535 ```````````````````````````````` example
5537 &ThisIsNotDefined; &hi?;
5539 <p>&amp;nbsp &amp;x; &amp;#; &amp;#x;
5540 &amp;ThisIsNotDefined; &amp;hi?;</p>
5541 ````````````````````````````````
5544 Although HTML5 does accept some entity references
5545 without a trailing semicolon (such as `&copy`), these are not
5546 recognized here, because it makes the grammar too ambiguous:
5548 ```````````````````````````````` example
5552 ````````````````````````````````
5555 Strings that are not on the list of HTML5 named entities are not
5556 recognized as entity references either:
5558 ```````````````````````````````` example
5561 <p>&amp;MadeUpEntity;</p>
5562 ````````````````````````````````
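The recognition rules above can be approximated with one pattern and a lookup table. The Python sketch below is not a reference implementation: the name `decode_references` is ours, the standard library's `html.entities.html5` table stands in for the `entities.json` document, and the shape assumed for entity names (a letter followed by letters or digits) is a simplification. Surrogate code points are treated as invalid, which is also an assumption on our part.

```python
import re
from html.entities import html5

# Hexadecimal reference, decimal reference, or named reference.
# The trailing semicolon is required in all three cases.
_REFERENCE = re.compile(
    r"&(?:#[xX]([0-9a-fA-F]{1,8})|#([0-9]{1,8})|([A-Za-z][A-Za-z0-9]*));"
)


def decode_references(text):
    """Replace entity and numeric character references with the
    corresponding Unicode characters."""

    def replace(match):
        hex_digits, dec_digits, name = match.groups()
        if name is not None:
            # Unknown names are left untouched as literal text.
            return html5.get(name + ";", match.group(0))
        if hex_digits is not None:
            code = int(hex_digits, 16)
        else:
            code = int(dec_digits)
        # U+0000 and invalid code points become U+FFFD.
        if code == 0 or code > 0x10FFFF or 0xD800 <= code <= 0xDFFF:
            return "\ufffd"
        return chr(code)

    return _REFERENCE.sub(replace, text)
```

A real implementation would apply this only outside code spans and code blocks.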
5565 Entity and numeric character references are recognized in any
5566 context besides code spans or code blocks, including
5567 URLs, [link titles], and [fenced code block][] [info strings]:
5569 ```````````````````````````````` example
5570 <a href="&ouml;&ouml;.html">
5572 <a href="&ouml;&ouml;.html">
5573 ````````````````````````````````
5576 ```````````````````````````````` example
5577 [foo](/f&ouml;&ouml; "f&ouml;&ouml;")
5579 <p><a href="/f%C3%B6%C3%B6" title="föö">foo</a></p>
5580 ````````````````````````````````
5583 ```````````````````````````````` example
5586 [foo]: /f&ouml;&ouml; "f&ouml;&ouml;"
5588 <p><a href="/f%C3%B6%C3%B6" title="föö">foo</a></p>
5589 ````````````````````````````````
5592 ```````````````````````````````` example
5597 <pre><code class="language-föö">foo
5599 ````````````````````````````````
5602 Entity and numeric character references are treated as literal
5603 text in code spans and code blocks:
5605 ```````````````````````````````` example
5608 <p><code>f&amp;ouml;&amp;ouml;</code></p>
5609 ````````````````````````````````
5612 ```````````````````````````````` example
5615 <pre><code>f&amp;ouml;f&amp;ouml;
5617 ````````````````````````````````
5622 A [backtick string](@)
5623 is a string of one or more backtick characters (`` ` ``) that is neither
5624 preceded nor followed by a backtick.
5626 A [code span](@) begins with a backtick string and ends with
5627 a backtick string of equal length. The contents of the code span are
5628 the characters between the two backtick strings, with leading and
5629 trailing spaces and [line endings] removed, and
5630 [whitespace] collapsed to single spaces.
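To make the matching procedure concrete, here is a small Python sketch that finds the first code span in a string and returns its normalized contents. The name `first_code_span` is hypothetical, and the sketch ignores the interaction with HTML tags and autolinks, which a full inline parser must also consider.

```python
import re

_BACKTICKS = re.compile(r"`+")
_WHITESPACE = re.compile(r"[ \t\n\r\f\v]+")


def first_code_span(text):
    """Return the contents of the first code span in text, or None."""
    pos = 0
    while True:
        opener = _BACKTICKS.search(text, pos)
        if opener is None:
            return None
        search_from = opener.end()
        while True:
            closer = _BACKTICKS.search(text, search_from)
            if closer is None:
                break
            if len(closer.group()) == len(opener.group()):
                content = text[opener.end():closer.start()]
                # Strip leading and trailing spaces and line endings,
                # then collapse runs of whitespace to single spaces.
                content = content.strip(" \r\n")
                return _WHITESPACE.sub(" ", content)
            # A backtick string of another length cannot close the
            # span; keep looking past it.
            search_from = closer.end()
        # No closer of equal length: the opener is literal backticks.
        pos = opener.end()
```

Since `` `+ `` is greedy, each match is a maximal run of backticks, which is what the definition of a backtick string requires.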
5632 This is a simple code span:
5634 ```````````````````````````````` example
5637 <p><code>foo</code></p>
5638 ````````````````````````````````
5641 Here two backticks are used, because the code contains a backtick.
5642 This example also illustrates stripping of leading and trailing spaces:
5644 ```````````````````````````````` example
5647 <p><code>foo ` bar</code></p>
5648 ````````````````````````````````
5651 This example shows the motivation for stripping leading and trailing
5654 ```````````````````````````````` example
5657 <p><code>``</code></p>
5658 ````````````````````````````````
5661 [Line endings] are treated like spaces:
5663 ```````````````````````````````` example
5668 <p><code>foo</code></p>
5669 ````````````````````````````````
5672 Interior spaces and [line endings] are collapsed into
5673 single spaces, just as they would be by a browser:
5675 ```````````````````````````````` example
5679 <p><code>foo bar baz</code></p>
5680 ````````````````````````````````
5683 Not all [Unicode whitespace] (for instance, non-breaking space) is
5686 ```````````````````````````````` example
5689 <p><code>a b</code></p>
5690 ````````````````````````````````
5693 Q: Why not just leave the spaces, since browsers will collapse them
5694 anyway? A: Because we might be targeting a non-HTML format, and we
5695 shouldn't rely on HTML-specific rendering assumptions.
5697 (Existing implementations differ in their treatment of internal
5698 spaces and [line endings]. Some, including `Markdown.pl` and
5699 `showdown`, convert an internal [line ending] into a
5700 `<br />` tag. But this makes things difficult for those who like to
5701 hard-wrap their paragraphs, since a line break in the midst of a code
5702 span will cause an unintended line break in the output. Others just
5703 leave internal spaces as they are, which is fine if only HTML is being
5706 ```````````````````````````````` example
5709 <p><code>foo `` bar</code></p>
5710 ````````````````````````````````
5713 Note that backslash escapes do not work in code spans. All backslashes
5714 are treated literally:
5716 ```````````````````````````````` example
5719 <p><code>foo\</code>bar`</p>
5720 ````````````````````````````````
5723 Backslash escapes are never needed, because one can always choose a
5724 string of *n* backtick characters as delimiters, where the code does
5725 not contain any strings of exactly *n* backtick characters.
5727 Code span backticks have higher precedence than any other inline
5728 constructs except HTML tags and autolinks. Thus, for example, this is
5729 not parsed as emphasized text, since the second `*` is part of a code
5732 ```````````````````````````````` example
5735 <p>*foo<code>*</code></p>
5736 ````````````````````````````````
5739 And this is not parsed as a link:
5741 ```````````````````````````````` example
5742 [not a `link](/foo`)
5744 <p>[not a <code>link](/foo</code>)</p>
5745 ````````````````````````````````
5748 Code spans, HTML tags, and autolinks have the same precedence.
5751 ```````````````````````````````` example
5754 <p><code><a href="</code>">`</p>
5755 ````````````````````````````````
5758 But this is an HTML tag:
5760 ```````````````````````````````` example
5763 <p><a href="`">`</p>
5764 ````````````````````````````````
5769 ```````````````````````````````` example
5770 `<http://foo.bar.`baz>`
5772 <p><code><http://foo.bar.</code>baz>`</p>
5773 ````````````````````````````````
5776 But this is an autolink:
5778 ```````````````````````````````` example
5779 <http://foo.bar.`baz>`
5781 <p><a href="http://foo.bar.%60baz">http://foo.bar.`baz</a>`</p>
5782 ````````````````````````````````
5785 When a backtick string is not closed by a matching backtick string,
5786 we just have literal backticks:
5788 ```````````````````````````````` example
5792 ````````````````````````````````
5795 ```````````````````````````````` example
5799 ````````````````````````````````
5801 The following case also illustrates the need for opening and
5802 closing backtick strings to be equal in length:
5804 ```````````````````````````````` example
5807 <p>`foo<code>bar</code></p>
5808 ````````````````````````````````
5811 ## Emphasis and strong emphasis
5813 John Gruber's original [Markdown syntax
5814 description](http://daringfireball.net/projects/markdown/syntax#em) says:
5816 > Markdown treats asterisks (`*`) and underscores (`_`) as indicators of
5817 > emphasis. Text wrapped with one `*` or `_` will be wrapped with an HTML
5818 > `<em>` tag; double `*`'s or `_`'s will be wrapped with an HTML `<strong>`
5821 This is enough for most users, but these rules leave much undecided,
5822 especially when it comes to nested emphasis. The original
5823 `Markdown.pl` test suite makes it clear that triple `***` and
5824 `___` delimiters can be used for strong emphasis, and most
5825 implementations have also allowed the following patterns:
5829 ***strong** in emph*
5830 ***emph* in strong**
5831 **in strong *emph***
5832 *in emph **strong***
5835 The following patterns are less widely supported, but the intent
5836 is clear and they are useful (especially in contexts like bibliography
5840 *emph *with emph* in it*
5841 **strong **with strong** in it**
5844 Many implementations have also restricted intraword emphasis to
5845 the `*` forms, to avoid unwanted emphasis in words containing
5846 internal underscores. (It is best practice to put these in code
5847 spans, but users often do not.)
5850 internal emphasis: foo*bar*baz
5851 no emphasis: foo_bar_baz
5854 The rules given below capture all of these patterns, while allowing
5855 for efficient parsing strategies that do not backtrack.
5857 First, some definitions. A [delimiter run](@) is either
5858 a sequence of one or more `*` characters that is not preceded or
5859 followed by a `*` character, or a sequence of one or more `_`
5860 characters that is not preceded or followed by a `_` character.
5862 A [left-flanking delimiter run](@) is
5863 a [delimiter run] that is (a) not followed by [Unicode whitespace],
5864 and (b) not followed by a [punctuation character], or
5865 preceded by [Unicode whitespace] or a [punctuation character].
5866 For purposes of this definition, the beginning and the end of
5867 the line count as Unicode whitespace.
5869 A [right-flanking delimiter run](@) is
5870 a [delimiter run] that is (a) not preceded by [Unicode whitespace],
5871 and (b) not preceded by a [punctuation character], or
5872 followed by [Unicode whitespace] or a [punctuation character].
5873 For purposes of this definition, the beginning and the end of
5874 the line count as Unicode whitespace.
5876 Here are some examples of delimiter runs.
5878 - left-flanking but not right-flanking:
5887 - right-flanking but not left-flanking:
5896 - Both left- and right-flanking:
5903 - Neither left nor right-flanking:
5910 (The idea of distinguishing left-flanking and right-flanking
5911 delimiter runs based on the character before and the character
5912 after comes from Roopesh Chander's
5913 [vfmd](http://www.vfmd.org/vfmd-spec/specification/#procedure-for-identifying-emphasis-tags).
5914 vfmd uses the terminology "emphasis indicator string" instead of "delimiter
5915 run," and its rules for distinguishing left- and right-flanking runs
5916 are a bit more complex than the ones given here.)
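The two flanking definitions can be sketched in Python as follows. This is an illustrative reading of the definitions, not a normative algorithm. The names `is_whitespace`, `is_punctuation`, and `flanking` are hypothetical, and the character classes are assumptions: Unicode whitespace is taken to be category `Zs` plus tab, newline, form feed, and carriage return, and punctuation is taken to be ASCII punctuation plus the Unicode `P*` categories.

```python
import unicodedata

ASCII_PUNCT = set("!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~")


def is_whitespace(ch):
    # None stands for the beginning or end of the line, which counts
    # as Unicode whitespace for the purposes of these definitions.
    return ch is None or ch in "\t\n\x0c\r" or unicodedata.category(ch) == "Zs"


def is_punctuation(ch):
    return ch is not None and (
        ch in ASCII_PUNCT or unicodedata.category(ch).startswith("P")
    )


def flanking(text, start, end):
    """Classify the delimiter run text[start:end]; return (left, right)."""
    before = text[start - 1] if start > 0 else None
    after = text[end] if end < len(text) else None
    left = not is_whitespace(after) and (
        not is_punctuation(after) or is_whitespace(before) or is_punctuation(before)
    )
    right = not is_whitespace(before) and (
        not is_punctuation(before) or is_whitespace(after) or is_punctuation(after)
    )
    return left, right
```

Only the single character on each side of the run is inspected, which is what allows the classification to be done in one pass without lookahead.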
5918 The following rules define emphasis and strong emphasis:
5920 1. A single `*` character [can open emphasis](@)
5921 iff (if and only if) it is part of a [left-flanking delimiter run].
5923 2. A single `_` character [can open emphasis] iff
5924 it is part of a [left-flanking delimiter run]
5925 and either (a) not part of a [right-flanking delimiter run]
5926 or (b) part of a [right-flanking delimiter run]
5927 preceded by punctuation.
5929 3. A single `*` character [can close emphasis](@)
5930 iff it is part of a [right-flanking delimiter run].
5932 4. A single `_` character [can close emphasis] iff
5933 it is part of a [right-flanking delimiter run]
5934 and either (a) not part of a [left-flanking delimiter run]
5935 or (b) part of a [left-flanking delimiter run]
5936 followed by punctuation.
5938 5. A double `**` [can open strong emphasis](@)
5939 iff it is part of a [left-flanking delimiter run].
5941 6. A double `__` [can open strong emphasis] iff
5942 it is part of a [left-flanking delimiter run]
5943 and either (a) not part of a [right-flanking delimiter run]
5944 or (b) part of a [right-flanking delimiter run]
5945 preceded by punctuation.
5947 7. A double `**` [can close strong emphasis](@)
5948 iff it is part of a [right-flanking delimiter run].
5950 8. A double `__` [can close strong emphasis] iff
5951 it is part of a [right-flanking delimiter run]
5952 and either (a) not part of a [left-flanking delimiter run]
5953 or (b) part of a [left-flanking delimiter run]
5954 followed by punctuation.
5956 9. Emphasis begins with a delimiter that [can open emphasis] and ends
5957 with a delimiter that [can close emphasis], and that uses the same
5958 character (`_` or `*`) as the opening delimiter. The
5959 opening and closing delimiters must belong to separate
5960 [delimiter runs]. If one of the delimiters can both
5961 open and close emphasis, then the sum of the lengths of the
5962 delimiter runs containing the opening and closing delimiters
5963 must not be a multiple of 3.
5965 10. Strong emphasis begins with a delimiter that
5966 [can open strong emphasis] and ends with a delimiter that
5967 [can close strong emphasis], and that uses the same character
5968 (`_` or `*`) as the opening delimiter. The
5969 opening and closing delimiters must belong to separate
5970 [delimiter runs]. If one of the delimiters can both open
5971 and close strong emphasis, then the sum of the lengths of
5972 the delimiter runs containing the opening and closing
5973 delimiters must not be a multiple of 3.
5975 11. A literal `*` character cannot occur at the beginning or end of
5976 `*`-delimited emphasis or `**`-delimited strong emphasis, unless it
5977 is backslash-escaped.
5979 12. A literal `_` character cannot occur at the beginning or end of
5980 `_`-delimited emphasis or `__`-delimited strong emphasis, unless it
5981 is backslash-escaped.
5983 Where rules 1--12 above are compatible with multiple parsings,
5984 the following principles resolve ambiguity:
5986 13. The number of nestings should be minimized. Thus, for example,
5987 an interpretation `<strong>...</strong>` is always preferred to
5988 `<em><em>...</em></em>`.
5990 14. An interpretation `<em><strong>...</strong></em>` is always
5991 preferred to `<strong><em>...</em></strong>`.
5993 15. When two potential emphasis or strong emphasis spans overlap,
5994 so that the second begins before the first ends and ends after
5995 the first ends, the first takes precedence. Thus, for example,
5996 `*foo _bar* baz_` is parsed as `<em>foo _bar</em> baz_` rather
5997 than `*foo <em>bar* baz</em>`.
5999 16. When there are two potential emphasis or strong emphasis spans
6000 with the same closing delimiter, the shorter one (the one that
6001 opens later) takes precedence. Thus, for example,
6002 `**foo **bar baz**` is parsed as `**foo <strong>bar baz</strong>`
6003 rather than `<strong>foo **bar baz</strong>`.
6005 17. Inline code spans, links, images, and HTML tags group more tightly
6006 than emphasis. So, when there is a choice between an interpretation
6007 that contains one of these elements and one that does not, the
6008 former always wins. Thus, for example, `*[foo*](bar)` is
6009 parsed as `*<a href="bar">foo*</a>` rather than as
6010 `<em>[foo</em>](bar)`.
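Rules 1 through 8 reduce to two small predicates once the flanking properties of a delimiter run are known. The following Python sketch is one possible reading; the function names and the boolean parameters (`left`, `right`, `before_is_punct`, `after_is_punct`) are hypothetical and are assumed to have been computed by the caller from the characters around the run.

```python
def can_open(char, left, right, before_is_punct):
    """Rules 1, 2, 5, 6: may a delimiter from this run open (strong) emphasis?

    char            -- "*" or "_"
    left, right     -- whether the run is left-/right-flanking
    before_is_punct -- whether the character before the run is punctuation
    """
    if char == "*":
        return left
    # For "_", a run that is also right-flanking must be preceded by punctuation.
    return left and (not right or before_is_punct)


def can_close(char, left, right, after_is_punct):
    """Rules 3, 4, 7, 8: may a delimiter from this run close (strong) emphasis?"""
    if char == "*":
        return right
    # For "_", a run that is also left-flanking must be followed by punctuation.
    return right and (not left or after_is_punct)
```

The extra condition on `_` is what rules out intraword emphasis such as `foo_bar_baz`, where every `_` run is both left- and right-flanking and is surrounded by alphanumerics.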
6012 These rules can be illustrated through a series of examples.
6016 ```````````````````````````````` example
6019 <p><em>foo bar</em></p>
6020 ````````````````````````````````
6023 This is not emphasis, because the opening `*` is followed by
6024 whitespace, and hence not part of a [left-flanking delimiter run]:
6026 ```````````````````````````````` example
6030 ````````````````````````````````
6033 This is not emphasis, because the opening `*` is preceded
6034 by an alphanumeric and followed by punctuation, and hence
6035 not part of a [left-flanking delimiter run]:
6037 ```````````````````````````````` example
6040 <p>a*"foo"*</p>
6041 ````````````````````````````````
6044 Unicode nonbreaking spaces count as whitespace, too:
6046 ```````````````````````````````` example
6050 ````````````````````````````````
6053 Intraword emphasis with `*` is permitted:
6055 ```````````````````````````````` example
6058 <p>foo<em>bar</em></p>
6059 ````````````````````````````````
6062 ```````````````````````````````` example
6065 <p>5<em>6</em>78</p>
6066 ````````````````````````````````
6071 ```````````````````````````````` example
6074 <p><em>foo bar</em></p>
6075 ````````````````````````````````
6078 This is not emphasis, because the opening `_` is followed by
6081 ```````````````````````````````` example
6085 ````````````````````````````````
6088 This is not emphasis, because the opening `_` is preceded
6089 by an alphanumeric and followed by punctuation:
6091 ```````````````````````````````` example
6094 <p>a_"foo"_</p>
6095 ````````````````````````````````
6098 Emphasis with `_` is not allowed inside words:
6100 ```````````````````````````````` example
6104 ````````````````````````````````
6107 ```````````````````````````````` example
6111 ````````````````````````````````
6114 ```````````````````````````````` example
6115 пристаням_стремятся_
6117 <p>пристаням_стремятся_</p>
6118 ````````````````````````````````
6121 Here `_` does not generate emphasis, because the first delimiter run
6122 is right-flanking and the second left-flanking:
6124 ```````````````````````````````` example
6127 <p>aa_"bb"_cc</p>
6128 ````````````````````````````````
6131 This is emphasis, even though the opening delimiter is
6132 both left- and right-flanking, because it is preceded by
6135 ```````````````````````````````` example
6138 <p>foo-<em>(bar)</em></p>
6139 ````````````````````````````````
6144 This is not emphasis, because the closing delimiter does
6145 not match the opening delimiter:
6147 ```````````````````````````````` example
6151 ````````````````````````````````
6154 This is not emphasis, because the closing `*` is preceded by
6157 ```````````````````````````````` example
6161 ````````````````````````````````
6164 A newline also counts as whitespace:
6166 ```````````````````````````````` example
6172 ````````````````````````````````
6175 This is not emphasis, because the second `*` is
6176 preceded by punctuation and followed by an alphanumeric
6177 (hence it is not part of a [right-flanking delimiter run]):
6179 ```````````````````````````````` example
6183 ````````````````````````````````
6186 The point of this restriction is more easily appreciated
6189 ```````````````````````````````` example
6192 <p><em>(<em>foo</em>)</em></p>
6193 ````````````````````````````````
6196 Intraword emphasis with `*` is allowed:
6198 ```````````````````````````````` example
6201 <p><em>foo</em>bar</p>
6202 ````````````````````````````````
6208 This is not emphasis, because the closing `_` is preceded by
6211 ```````````````````````````````` example
6215 ````````````````````````````````
6218 This is not emphasis, because the second `_` is
6219 preceded by punctuation and followed by an alphanumeric:
6221 ```````````````````````````````` example
6225 ````````````````````````````````
6228 This is emphasis within emphasis:
6230 ```````````````````````````````` example
6233 <p><em>(<em>foo</em>)</em></p>
6234 ````````````````````````````````
6237 Intraword emphasis is disallowed for `_`:
6239 ```````````````````````````````` example
6243 ````````````````````````````````
6246 ```````````````````````````````` example
6247 _пристаням_стремятся
6249 <p>_пристаням_стремятся</p>
6250 ````````````````````````````````
6253 ```````````````````````````````` example
6256 <p><em>foo_bar_baz</em></p>
6257 ````````````````````````````````
6260 This is emphasis, even though the closing delimiter is
6261 both left- and right-flanking, because it is followed by
6264 ```````````````````````````````` example
6267 <p><em>(bar)</em>.</p>
6268 ````````````````````````````````
6273 ```````````````````````````````` example
6276 <p><strong>foo bar</strong></p>
6277 ````````````````````````````````
6280 This is not strong emphasis, because the opening delimiter is
6281 followed by whitespace:
6283 ```````````````````````````````` example
6287 ````````````````````````````````
6290 This is not strong emphasis, because the opening `**` is preceded
6291 by an alphanumeric and followed by punctuation, and hence
6292 not part of a [left-flanking delimiter run]:
6294 ```````````````````````````````` example
6297 <p>a**"foo"**</p>
6298 ````````````````````````````````
6301 Intraword strong emphasis with `**` is permitted:
6303 ```````````````````````````````` example
6306 <p>foo<strong>bar</strong></p>
6307 ````````````````````````````````
6312 ```````````````````````````````` example
6315 <p><strong>foo bar</strong></p>
6316 ````````````````````````````````
6319 This is not strong emphasis, because the opening delimiter is
6320 followed by whitespace:
6322 ```````````````````````````````` example
6326 ````````````````````````````````
6329 A newline counts as whitespace:
6330 ```````````````````````````````` example
6336 ````````````````````````````````
6339 This is not strong emphasis, because the opening `__` is preceded
6340 by an alphanumeric and followed by punctuation:
6342 ```````````````````````````````` example
6345 <p>a__"foo"__</p>
6346 ````````````````````````````````
6349 Intraword strong emphasis is forbidden with `__`:
6351 ```````````````````````````````` example
6355 ````````````````````````````````
6358 ```````````````````````````````` example
6362 ````````````````````````````````
6365 ```````````````````````````````` example
6366 пристаням__стремятся__
6368 <p>пристаням__стремятся__</p>
6369 ````````````````````````````````
6372 ```````````````````````````````` example
6373 __foo, __bar__, baz__
6375 <p><strong>foo, <strong>bar</strong>, baz</strong></p>
6376 ````````````````````````````````
6379 This is strong emphasis, even though the opening delimiter is
6380 both left- and right-flanking, because it is preceded by
6383 ```````````````````````````````` example
6386 <p>foo-<strong>(bar)</strong></p>
6387 ````````````````````````````````
6393 This is not strong emphasis, because the closing delimiter is preceded
6396 ```````````````````````````````` example
6400 ````````````````````````````````
6403 (Nor can it be interpreted as an emphasized `*foo bar *`, because of
6406 This is not strong emphasis, because the second `**` is
6407 preceded by punctuation and followed by an alphanumeric:
6409 ```````````````````````````````` example
6413 ````````````````````````````````
6416 The point of this restriction is more easily appreciated
6417 with these examples:
6419 ```````````````````````````````` example
6422 <p><em>(<strong>foo</strong>)</em></p>
6423 ````````````````````````````````
6426 ```````````````````````````````` example
6427 **Gomphocarpus (*Gomphocarpus physocarpus*, syn.
6428 *Asclepias physocarpa*)**
6430 <p><strong>Gomphocarpus (<em>Gomphocarpus physocarpus</em>, syn.
6431 <em>Asclepias physocarpa</em>)</strong></p>
6432 ````````````````````````````````
6435 ```````````````````````````````` example
6438 <p><strong>foo "<em>bar</em>" foo</strong></p>
6439 ````````````````````````````````
6444 ```````````````````````````````` example
6447 <p><strong>foo</strong>bar</p>
6448 ````````````````````````````````
6453 This is not strong emphasis, because the closing delimiter is
6454 preceded by whitespace:
6456 ```````````````````````````````` example
6460 ````````````````````````````````
6463 This is not strong emphasis, because the second `__` is
6464 preceded by punctuation and followed by an alphanumeric:
6466 ```````````````````````````````` example
6470 ````````````````````````````````
6473 The point of this restriction is more easily appreciated
6476 ```````````````````````````````` example
6479 <p><em>(<strong>foo</strong>)</em></p>
6480 ````````````````````````````````
6483 Intraword strong emphasis is forbidden with `__`:
6485 ```````````````````````````````` example
6489 ````````````````````````````````
6492 ```````````````````````````````` example
6493 __пристаням__стремятся
6495 <p>__пристаням__стремятся</p>
6496 ````````````````````````````````
6499 ```````````````````````````````` example
6502 <p><strong>foo__bar__baz</strong></p>
6503 ````````````````````````````````
6506 This is strong emphasis, even though the closing delimiter is
6507 both left- and right-flanking, because it is followed by
6510 ```````````````````````````````` example
6513 <p><strong>(bar)</strong>.</p>
6514 ````````````````````````````````
6519 Any nonempty sequence of inline elements can be the contents of an
6522 ```````````````````````````````` example
6525 <p><em>foo <a href="/url">bar</a></em></p>
6526 ````````````````````````````````
6529 ```````````````````````````````` example
6535 ````````````````````````````````
6538 In particular, emphasis and strong emphasis can be nested
6541 ```````````````````````````````` example
6544 <p><em>foo <strong>bar</strong> baz</em></p>
6545 ````````````````````````````````
6548 ```````````````````````````````` example
6551 <p><em>foo <em>bar</em> baz</em></p>
6552 ````````````````````````````````
6555 ```````````````````````````````` example
6558 <p><em><em>foo</em> bar</em></p>
6559 ````````````````````````````````
6562 ```````````````````````````````` example
6565 <p><em>foo <em>bar</em></em></p>
6566 ````````````````````````````````
6569 ```````````````````````````````` example
6572 <p><em>foo <strong>bar</strong> baz</em></p>
6573 ````````````````````````````````
6575 ```````````````````````````````` example
6578 <p><em>foo<strong>bar</strong>baz</em></p>
6579 ````````````````````````````````
6581 Note that in the preceding case, the interpretation
6584 <p><em>foo</em><em>bar<em></em>baz</em></p>
6588 is precluded by the condition that a delimiter that
6589 can both open and close (like the `*` after `foo`)
6590 cannot form emphasis if the sum of the lengths of
6591 the delimiter runs containing the opening and
6592 closing delimiters is a multiple of 3.
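The multiple-of-3 condition can be written as a small check. The Python sketch below follows the wording of rules 9 and 10 as given in this document; the function name `may_pair` and its parameters are hypothetical, and the caller is assumed to know the length of each delimiter run and whether its delimiters can both open and close.

```python
def may_pair(opener_len, closer_len, opener_both, closer_both):
    """Return True if the multiple-of-3 condition lets the two runs pair up.

    opener_len, closer_len   -- lengths of the runs containing the delimiters
    opener_both, closer_both -- whether each delimiter can both open and close
    """
    if opener_both or closer_both:
        return (opener_len + closer_len) % 3 != 0
    return True
```

For `*foo**bar**baz*`, the first `*` (length 1) cannot pair with the `**` after `foo` (length 2, able to both open and close) because 1 + 2 = 3. The two `**` runs can pair with each other (2 + 2 = 4), which yields strong emphasis nested inside emphasis.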
6594 The same condition ensures that the following
6595 cases are all strong emphasis nested inside
6596 emphasis, even when the interior spaces are
6600 ```````````````````````````````` example
6603 <p><em><strong>foo</strong> bar</em></p>
6604 ````````````````````````````````
6607 ```````````````````````````````` example
6610 <p><em>foo <strong>bar</strong></em></p>
6611 ````````````````````````````````
6614 ```````````````````````````````` example
6617 <p><em>foo<strong>bar</strong></em></p>
6618 ````````````````````````````````
6621 Indefinite levels of nesting are possible:
6623 ```````````````````````````````` example
6624 *foo **bar *baz* bim** bop*
6626 <p><em>foo <strong>bar <em>baz</em> bim</strong> bop</em></p>
6627 ````````````````````````````````
6630 ```````````````````````````````` example
6633 <p><em>foo <a href="/url"><em>bar</em></a></em></p>
6634 ````````````````````````````````
6637 There can be no empty emphasis or strong emphasis:
6639 ```````````````````````````````` example
6640 ** is not an empty emphasis
6642 <p>** is not an empty emphasis</p>
6643 ````````````````````````````````
6646 ```````````````````````````````` example
6647 **** is not an empty strong emphasis
6649 <p>**** is not an empty strong emphasis</p>
6650 ````````````````````````````````
6656 Any nonempty sequence of inline elements can be the contents of a
6657 strongly emphasized span.
6659 ```````````````````````````````` example
6662 <p><strong>foo <a href="/url">bar</a></strong></p>
6663 ````````````````````````````````
6666 ```````````````````````````````` example
6672 ````````````````````````````````
6675 In particular, emphasis and strong emphasis can be nested
6676 inside strong emphasis:
6678 ```````````````````````````````` example
6681 <p><strong>foo <em>bar</em> baz</strong></p>
6682 ````````````````````````````````
6685 ```````````````````````````````` example
6688 <p><strong>foo <strong>bar</strong> baz</strong></p>
6689 ````````````````````````````````
6692 ```````````````````````````````` example
6695 <p><strong><strong>foo</strong> bar</strong></p>
6696 ````````````````````````````````
6699 ```````````````````````````````` example
6702 <p><strong>foo <strong>bar</strong></strong></p>
6703 ````````````````````````````````
6706 ```````````````````````````````` example
6709 <p><strong>foo <em>bar</em> baz</strong></p>
6710 ````````````````````````````````
6713 ```````````````````````````````` example
6716 <p><strong>foo<em>bar</em>baz</strong></p>
6717 ````````````````````````````````
6720 ```````````````````````````````` example
6723 <p><strong><em>foo</em> bar</strong></p>
6724 ````````````````````````````````
6727 ```````````````````````````````` example
6730 <p><strong>foo <em>bar</em></strong></p>
6731 ````````````````````````````````
6734 Indefinite levels of nesting are possible:
6736 ```````````````````````````````` example
6740 <p><strong>foo <em>bar <strong>baz</strong>
6741 bim</em> bop</strong></p>
6742 ````````````````````````````````
6745 ```````````````````````````````` example
6746 **foo [*bar*](/url)**
6748 <p><strong>foo <a href="/url"><em>bar</em></a></strong></p>
6749 ````````````````````````````````
6752 There can be no empty emphasis or strong emphasis:
6754 ```````````````````````````````` example
6755 __ is not an empty emphasis
6757 <p>__ is not an empty emphasis</p>
6758 ````````````````````````````````
6761 ```````````````````````````````` example
6762 ____ is not an empty strong emphasis
6764 <p>____ is not an empty strong emphasis</p>
6765 ````````````````````````````````
6771 ```````````````````````````````` example
6775 ````````````````````````````````
6778 ```````````````````````````````` example
6781 <p>foo <em>*</em></p>
6782 ````````````````````````````````
6785 ```````````````````````````````` example
6788 <p>foo <em>_</em></p>
6789 ````````````````````````````````
6792 ```````````````````````````````` example
6796 ````````````````````````````````
6799 ```````````````````````````````` example
6802 <p>foo <strong>*</strong></p>
6803 ````````````````````````````````
6806 ```````````````````````````````` example
6809 <p>foo <strong>_</strong></p>
6810 ````````````````````````````````
6813 Note that when delimiters do not match evenly, Rule 11 determines
6814 that the excess literal `*` characters will appear outside of the
6815 emphasis, rather than inside it:
6817 ```````````````````````````````` example
6820 <p>*<em>foo</em></p>
6821 ````````````````````````````````
6824 ```````````````````````````````` example
6827 <p><em>foo</em>*</p>
6828 ````````````````````````````````
6831 ```````````````````````````````` example
6834 <p>*<strong>foo</strong></p>
6835 ````````````````````````````````
6838 ```````````````````````````````` example
6841 <p>***<em>foo</em></p>
6842 ````````````````````````````````
6845 ```````````````````````````````` example
6848 <p><strong>foo</strong>*</p>
6849 ````````````````````````````````
6852 ```````````````````````````````` example
6855 <p><em>foo</em>***</p>
6856 ````````````````````````````````
6862 ```````````````````````````````` example
6866 ````````````````````````````````
6869 ```````````````````````````````` example
6872 <p>foo <em>_</em></p>
6873 ````````````````````````````````
6876 ```````````````````````````````` example
6879 <p>foo <em>*</em></p>
6880 ````````````````````````````````
6883 ```````````````````````````````` example
6887 ````````````````````````````````
6890 ```````````````````````````````` example
6893 <p>foo <strong>_</strong></p>
6894 ````````````````````````````````
6897 ```````````````````````````````` example
6900 <p>foo <strong>*</strong></p>
6901 ````````````````````````````````
6904 ```````````````````````````````` example
6907 <p>_<em>foo</em></p>
6908 ````````````````````````````````
6911 Note that when delimiters do not match evenly, Rule 12 determines
6912 that the excess literal `_` characters will appear outside of the
6913 emphasis, rather than inside it:
6915 ```````````````````````````````` example
6918 <p><em>foo</em>_</p>
6919 ````````````````````````````````
6922 ```````````````````````````````` example
6925 <p>_<strong>foo</strong></p>
6926 ````````````````````````````````
6929 ```````````````````````````````` example
6932 <p>___<em>foo</em></p>
6933 ````````````````````````````````
6936 ```````````````````````````````` example
6939 <p><strong>foo</strong>_</p>
6940 ````````````````````````````````
6943 ```````````````````````````````` example
6946 <p><em>foo</em>___</p>
6947 ````````````````````````````````
6950 Rule 13 implies that if you want emphasis nested directly inside
6951 emphasis, you must use different delimiters:
6953 ```````````````````````````````` example
6956 <p><strong>foo</strong></p>
6957 ````````````````````````````````
6960 ```````````````````````````````` example
6963 <p><em><em>foo</em></em></p>
6964 ````````````````````````````````
6967 ```````````````````````````````` example
6970 <p><strong>foo</strong></p>
6971 ````````````````````````````````
6974 ```````````````````````````````` example
6977 <p><em><em>foo</em></em></p>
6978 ````````````````````````````````
6981 However, strong emphasis within strong emphasis is possible without
6982 switching delimiters:
6984 ```````````````````````````````` example
6987 <p><strong><strong>foo</strong></strong></p>
6988 ````````````````````````````````
6991 ```````````````````````````````` example
6994 <p><strong><strong>foo</strong></strong></p>
6995 ````````````````````````````````
6999 Rule 13 can be applied to arbitrarily long sequences of
7002 ```````````````````````````````` example
7005 <p><strong><strong><strong>foo</strong></strong></strong></p>
7006 ````````````````````````````````
7011 ```````````````````````````````` example
7014 <p><em><strong>foo</strong></em></p>
7015 ````````````````````````````````
7018 ```````````````````````````````` example
7021 <p><em><strong><strong>foo</strong></strong></em></p>
7022 ````````````````````````````````
7027 ```````````````````````````````` example
7030 <p><em>foo _bar</em> baz_</p>
7031 ````````````````````````````````
7034 ```````````````````````````````` example
7035 *foo __bar *baz bim__ bam*
7037 <p><em>foo <strong>bar *baz bim</strong> bam</em></p>
7038 ````````````````````````````````
7043 ```````````````````````````````` example
7046 <p>**foo <strong>bar baz</strong></p>
7047 ````````````````````````````````
7050 ```````````````````````````````` example
7053 <p>*foo <em>bar baz</em></p>
7054 ````````````````````````````````
7059 ```````````````````````````````` example
7062 <p>*<a href="/url">bar*</a></p>
7063 ````````````````````````````````
7066 ```````````````````````````````` example
7069 <p>_foo <a href="/url">bar_</a></p>
7070 ````````````````````````````````
7073 ```````````````````````````````` example
7074 *<img src="foo" title="*"/>
7076 <p>*<img src="foo" title="*"/></p>
7077 ````````````````````````````````
7080 ```````````````````````````````` example
7083 <p>**<a href="**"></p>
7084 ````````````````````````````````
7087 ```````````````````````````````` example
7090 <p>__<a href="__"></p>
7091 ````````````````````````````````
7094 ```````````````````````````````` example
7097 <p><em>a <code>*</code></em></p>
7098 ````````````````````````````````
7101 ```````````````````````````````` example
7104 <p><em>a <code>_</code></em></p>
7105 ````````````````````````````````
7108 ```````````````````````````````` example
7109 **a<http://foo.bar/?q=**>
7111 <p>**a<a href="http://foo.bar/?q=**">http://foo.bar/?q=**</a></p>
7112 ````````````````````````````````
7115 ```````````````````````````````` example
7116 __a<http://foo.bar/?q=__>
7118 <p>__a<a href="http://foo.bar/?q=__">http://foo.bar/?q=__</a></p>
7119 ````````````````````````````````
7125 A link contains [link text] (the visible text), a [link destination]
7126 (the URI that is the link destination), and optionally a [link title].
7127 There are two basic kinds of links in Markdown. In [inline links] the
7128 destination and title are given immediately after the link text. In
7129 [reference links] the destination and title are defined elsewhere in
7132 A [link text](@) consists of a sequence of zero or more
7133 inline elements enclosed by square brackets (`[` and `]`). The
7134 following rules apply:
7136 - Links may not contain other links, at any level of nesting. If
7137 multiple otherwise valid link definitions appear nested inside each
7138 other, the inner-most definition is used.
7140 - Brackets are allowed in the [link text] only if (a) they
7141 are backslash-escaped or (b) they appear as a matched pair of brackets,
7142 with an open bracket `[`, a sequence of zero or more inlines, and
7143 a close bracket `]`.
7145 - Backtick [code spans], [autolinks], and raw [HTML tags] bind more tightly
7146 than the brackets in link text. Thus, for example,
7147 `` [foo`]` `` could not be a link text, since the second `]`
7148 is part of a code span.
7150 - The brackets in link text bind more tightly than markers for
7151 [emphasis and strong emphasis]. Thus, for example, `*[foo*](url)` is a link.
7153 A [link destination](@) consists of either
7155 - a sequence of zero or more characters between an opening `<` and a
7156 closing `>` that contains no spaces, line breaks, or unescaped
7157 `<` or `>` characters, or
7159 - a nonempty sequence of characters that does not include
7160 ASCII space or control characters, and includes parentheses
7161 only if (a) they are backslash-escaped or (b) they are part of
7162 a balanced pair of unescaped parentheses.
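The two forms of link destination can be checked with a simple scanner. The Python sketch below is one interpretation of the definition above, not a normative algorithm. The function names are hypothetical, and it assumes that a backslash escapes the next character only when that character is ASCII punctuation, and that "control characters" means code points below U+0020 plus U+007F.

```python
ASCII_PUNCT = set("!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~")


def _is_pointy(s):
    """First form: characters between < and >, with no spaces, line breaks,
    or unescaped < or > inside."""
    if len(s) < 2 or s[0] != "<":
        return False
    i = 1
    while i < len(s):
        ch = s[i]
        if ch == "\\" and i + 1 < len(s) and s[i + 1] in ASCII_PUNCT:
            i += 2  # skip the escaped character
            continue
        if ch == ">":
            return i == len(s) - 1  # the closing > must be the last character
        if ch in " \n\r<":
            return False
        i += 1
    return False


def _is_bare(s):
    """Second form: nonempty, no ASCII space or control characters,
    parentheses either escaped or balanced."""
    if not s:
        return False
    depth = 0
    i = 0
    while i < len(s):
        ch = s[i]
        if ch == "\\" and i + 1 < len(s) and s[i + 1] in ASCII_PUNCT:
            i += 2
            continue
        if ch == " " or ord(ch) < 0x20 or ord(ch) == 0x7F:
            return False
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
        i += 1
    return depth == 0


def is_link_destination(s):
    return _is_pointy(s) or _is_bare(s)
```

Tracking a nesting depth, rather than merely counting parentheses, rejects strings such as `)(` whose counts match but which are not balanced.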
7164 A [link title](@) consists of either
7166 - a sequence of zero or more characters between straight double-quote
7167 characters (`"`), including a `"` character only if it is
7168 backslash-escaped, or
7170 - a sequence of zero or more characters between straight single-quote
7171 characters (`'`), including a `'` character only if it is
7172 backslash-escaped, or
7174 - a sequence of zero or more characters between matching parentheses
7175 (`(...)`), including a `)` character only if it is backslash-escaped.
7177 Although [link titles] may span multiple lines, they may not contain
7180 An [inline link](@) consists of a [link text] followed immediately
7181 by a left parenthesis `(`, optional [whitespace], an optional
7182 [link destination], an optional [link title] separated from the link
7183 destination by [whitespace], optional [whitespace], and a right
7184 parenthesis `)`. The link's text consists of the inlines contained
7185 in the [link text] (excluding the enclosing square brackets).
7186 The link's URI consists of the link destination, excluding enclosing
7187 `<...>` if present, with backslash-escapes in effect as described
7188 above. The link's title consists of the link title, excluding its
enclosing delimiters, with backslash-escapes in effect as described
above.
7192 Here is a simple inline link:
7194 ```````````````````````````````` example
7195 [link](/uri "title")
7197 <p><a href="/uri" title="title">link</a></p>
7198 ````````````````````````````````
7201 The title may be omitted:
7203 ```````````````````````````````` example
7206 <p><a href="/uri">link</a></p>
7207 ````````````````````````````````
7210 Both the title and the destination may be omitted:
7212 ```````````````````````````````` example
7215 <p><a href="">link</a></p>
7216 ````````````````````````````````
7219 ```````````````````````````````` example
7222 <p><a href="">link</a></p>
7223 ````````````````````````````````
7226 The destination cannot contain spaces or line breaks,
7227 even if enclosed in pointy brackets:
7229 ```````````````````````````````` example
7232 <p>[link](/my uri)</p>
7233 ````````````````````````````````
7236 ```````````````````````````````` example
7239 <p>[link](</my uri>)</p>
7240 ````````````````````````````````
7243 ```````````````````````````````` example
7249 ````````````````````````````````
7252 ```````````````````````````````` example
7258 ````````````````````````````````
7260 Parentheses inside the link destination may be escaped:
7262 ```````````````````````````````` example
7265 <p><a href="(foo)">link</a></p>
7266 ````````````````````````````````
Any number of parentheses are allowed without escaping, as long as they are
balanced:
7271 ```````````````````````````````` example
7272 [link](foo(and(bar)))
7274 <p><a href="foo(and(bar))">link</a></p>
7275 ````````````````````````````````
However, if you have unbalanced parentheses, you need to escape them or use the
`<...>` form:
7280 ```````````````````````````````` example
7281 [link](foo\(and\(bar\))
7283 <p><a href="foo(and(bar)">link</a></p>
7284 ````````````````````````````````
7287 ```````````````````````````````` example
7288 [link](<foo(and(bar)>)
7290 <p><a href="foo(and(bar)">link</a></p>
7291 ````````````````````````````````
Parentheses and other symbols can also be escaped, as usual
in Markdown:
7297 ```````````````````````````````` example
7300 <p><a href="foo):">link</a></p>
7301 ````````````````````````````````
7304 A link can contain fragment identifiers and queries:
7306 ```````````````````````````````` example
7309 [link](http://example.com#fragment)
7311 [link](http://example.com?foo=3#frag)
7313 <p><a href="#fragment">link</a></p>
7314 <p><a href="http://example.com#fragment">link</a></p>
7315 <p><a href="http://example.com?foo=3#frag">link</a></p>
7316 ````````````````````````````````
Note that a backslash before a non-escapable character is
just a backslash:
7322 ```````````````````````````````` example
7325 <p><a href="foo%5Cbar">link</a></p>
7326 ````````````````````````````````
7329 URL-escaping should be left alone inside the destination, as all
7330 URL-escaped characters are also valid URL characters. Entity and
7331 numerical character references in the destination will be parsed
7332 into the corresponding Unicode code points, as usual. These may
7333 be optionally URL-escaped when written as HTML, but this spec
7334 does not enforce any particular policy for rendering URLs in
7335 HTML or other formats. Renderers may make different decisions
7336 about how to escape or normalize URLs in the output.
7338 ```````````````````````````````` example
7339 [link](foo%20bä)
7341 <p><a href="foo%20b%C3%A4">link</a></p>
7342 ````````````````````````````````
7345 Note that, because titles can often be parsed as destinations,
7346 if you try to omit the destination and keep the title, you'll
7347 get unexpected results:
7349 ```````````````````````````````` example
7352 <p><a href="%22title%22">link</a></p>
7353 ````````````````````````````````
7356 Titles may be in single quotes, double quotes, or parentheses:
7358 ```````````````````````````````` example
7359 [link](/url "title")
7360 [link](/url 'title')
7361 [link](/url (title))
7363 <p><a href="/url" title="title">link</a>
7364 <a href="/url" title="title">link</a>
7365 <a href="/url" title="title">link</a></p>
7366 ````````````````````````````````
7369 Backslash escapes and entity and numeric character references
7370 may be used in titles:
7372 ```````````````````````````````` example
7373 [link](/url "title \""")
7375 <p><a href="/url" title="title """>link</a></p>
7376 ````````````````````````````````
Titles must be separated from the link destination by [whitespace].
Other [Unicode whitespace], such as a non-breaking space, does not work:
7382 ```````````````````````````````` example
7383 [link](/url "title")
7385 <p><a href="/url%C2%A0%22title%22">link</a></p>
7386 ````````````````````````````````
7389 Nested balanced quotes are not allowed without escaping:
7391 ```````````````````````````````` example
7392 [link](/url "title "and" title")
7394 <p>[link](/url "title "and" title")</p>
7395 ````````````````````````````````
7398 But it is easy to work around this by using a different quote type:
7400 ```````````````````````````````` example
7401 [link](/url 'title "and" title')
7403 <p><a href="/url" title="title "and" title">link</a></p>
7404 ````````````````````````````````
7407 (Note: `Markdown.pl` did allow double quotes inside a double-quoted
7408 title, and its test suite included a test demonstrating this.
7409 But it is hard to see a good rationale for the extra complexity this
7410 brings, since there are already many ways---backslash escaping,
7411 entity and numeric character references, or using a different
7412 quote type for the enclosing title---to write titles containing
7413 double quotes. `Markdown.pl`'s handling of titles has a number
7414 of other strange features. For example, it allows single-quoted
7415 titles in inline links, but not reference links. And, in
7416 reference links but not inline links, it allows a title to begin
7417 with `"` and end with `)`. `Markdown.pl` 1.0.1 even allows
7418 titles with no closing quotation mark, though 1.0.2b8 does not.
7419 It seems preferable to adopt a simple, rational rule that works
7420 the same way in inline links and link reference definitions.)
7422 [Whitespace] is allowed around the destination and title:
7424 ```````````````````````````````` example
7428 <p><a href="/uri" title="title">link</a></p>
7429 ````````````````````````````````
7432 But it is not allowed between the link text and the
7433 following parenthesis:
7435 ```````````````````````````````` example
7438 <p>[link] (/uri)</p>
7439 ````````````````````````````````
7442 The link text may contain balanced brackets, but not unbalanced ones,
7443 unless they are escaped:
7445 ```````````````````````````````` example
7446 [link [foo [bar]]](/uri)
7448 <p><a href="/uri">link [foo [bar]]</a></p>
7449 ````````````````````````````````
7452 ```````````````````````````````` example
7455 <p>[link] bar](/uri)</p>
7456 ````````````````````````````````
7459 ```````````````````````````````` example
7462 <p>[link <a href="/uri">bar</a></p>
7463 ````````````````````````````````
7466 ```````````````````````````````` example
7469 <p><a href="/uri">link [bar</a></p>
7470 ````````````````````````````````
7473 The link text may contain inline content:
7475 ```````````````````````````````` example
7476 [link *foo **bar** `#`*](/uri)
7478 <p><a href="/uri">link <em>foo <strong>bar</strong> <code>#</code></em></a></p>
7479 ````````````````````````````````
7482 ```````````````````````````````` example
7483 [![moon](moon.jpg)](/uri)
7485 <p><a href="/uri"><img src="moon.jpg" alt="moon" /></a></p>
7486 ````````````````````````````````
7489 However, links may not contain other links, at any level of nesting.
7491 ```````````````````````````````` example
7492 [foo [bar](/uri)](/uri)
7494 <p>[foo <a href="/uri">bar</a>](/uri)</p>
7495 ````````````````````````````````
7498 ```````````````````````````````` example
7499 [foo *[bar [baz](/uri)](/uri)*](/uri)
7501 <p>[foo <em>[bar <a href="/uri">baz</a>](/uri)</em>](/uri)</p>
7502 ````````````````````````````````
7505 ```````````````````````````````` example
7506 ![[[foo](uri1)](uri2)](uri3)
7508 <p><img src="uri3" alt="[foo](uri2)" /></p>
7509 ````````````````````````````````
These cases illustrate the precedence of link text grouping over
emphasis grouping:
7515 ```````````````````````````````` example
7518 <p>*<a href="/uri">foo*</a></p>
7519 ````````````````````````````````
7522 ```````````````````````````````` example
7525 <p><a href="baz*">foo *bar</a></p>
7526 ````````````````````````````````
Note that brackets that *aren't* part of links do not take
precedence:
7532 ```````````````````````````````` example
7535 <p><em>foo [bar</em> baz]</p>
7536 ````````````````````````````````
7539 These cases illustrate the precedence of HTML tags, code spans,
7540 and autolinks over link grouping:
7542 ```````````````````````````````` example
7543 [foo <bar attr="](baz)">
7545 <p>[foo <bar attr="](baz)"></p>
7546 ````````````````````````````````
7549 ```````````````````````````````` example
7552 <p>[foo<code>](/uri)</code></p>
7553 ````````````````````````````````
7556 ```````````````````````````````` example
7557 [foo<http://example.com/?search=](uri)>
7559 <p>[foo<a href="http://example.com/?search=%5D(uri)">http://example.com/?search=](uri)</a></p>
7560 ````````````````````````````````
7563 There are three kinds of [reference link](@)s:
7564 [full](#full-reference-link), [collapsed](#collapsed-reference-link),
7565 and [shortcut](#shortcut-reference-link).
7567 A [full reference link](@)
7568 consists of a [link text] immediately followed by a [link label]
7569 that [matches] a [link reference definition] elsewhere in the document.
7571 A [link label](@) begins with a left bracket (`[`) and ends
7572 with the first right bracket (`]`) that is not backslash-escaped.
7573 Between these brackets there must be at least one [non-whitespace character].
7574 Unescaped square bracket characters are not allowed in
7575 [link labels]. A link label can have at most 999
7576 characters inside the square brackets.
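
The label rules can be sketched as a single left-to-right scan. This is only an illustration; `is_link_label` is a hypothetical name, and the sketch assumes backslash-escapes apply to ASCII punctuation only.

```python
import string

WHITESPACE = " \t\n\v\f\r"  # the spec's whitespace characters


def is_link_label(s: str) -> bool:
    """True if s is a complete link label, brackets included."""
    if not s.startswith("["):
        return False
    has_content = False  # becomes True once a non-whitespace character is seen
    i = 1
    while i < len(s):
        ch = s[i]
        if ch == "\\" and i + 1 < len(s) and s[i + 1] in string.punctuation:
            has_content = True  # an escaped character counts as content
            i += 2
            continue
        if ch == "[":
            return False  # unescaped opening bracket is not allowed
        if ch == "]":
            # The first unescaped `]` ends the label: it must be the last
            # character, and at most 999 characters may precede it.
            return i == len(s) - 1 and has_content and (i - 1) <= 999
        if ch not in WHITESPACE:
            has_content = True
        i += 1
    return False  # no unescaped closing bracket
```

Note that `is_link_label("[foo\\]")` is false because the only `]` is escaped, while `is_link_label("[bar\\\\]")` is true because the backslash itself is escaped.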
7578 One label [matches](@)
another if and only if their normalized forms are equal. To normalize a
7580 label, perform the *Unicode case fold* and collapse consecutive internal
7581 [whitespace] to a single space. If there are multiple
matching [link reference definitions], the one that comes first in the
7583 document is used. (It is desirable in such cases to emit a warning.)
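
A minimal sketch of normalization and first-definition-wins lookup follows. The names `normalize_label` and `build_reference_map` are hypothetical, and stripping leading and trailing whitespace is an added assumption that the paragraph above does not state.

```python
import re

# Runs of the spec's whitespace characters (space, tab, newline,
# line tabulation, form feed, carriage return).
_WS_RUN = re.compile(r"[ \t\n\v\f\r]+")


def normalize_label(label: str) -> str:
    """Normalize the text between the brackets of a link label."""
    stripped = label.strip(" \t\n\v\f\r")  # assumption: outer whitespace is ignored
    # str.casefold() performs Unicode case folding.
    return _WS_RUN.sub(" ", stripped).casefold()


def build_reference_map(definitions):
    """Map normalized labels to destinations; the first definition wins."""
    refs = {}
    for label, destination in definitions:
        refs.setdefault(normalize_label(label), destination)
    return refs
```

With this sketch, `normalize_label("Foo\n   BAR")` gives `foo bar`, and a map built from `[("foo", "/url1"), ("FOO", "/url2")]` resolves `foo` to `/url1`. An implementation that wants to emit the suggested warning could do so wherever `setdefault` finds an existing key.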
7585 The contents of the first link label are parsed as inlines, which are
7586 used as the link's text. The link's URI and title are provided by the
7587 matching [link reference definition].
7589 Here is a simple example:
7591 ```````````````````````````````` example
7596 <p><a href="/url" title="title">foo</a></p>
7597 ````````````````````````````````
7600 The rules for the [link text] are the same as with
7601 [inline links]. Thus:
7603 The link text may contain balanced brackets, but not unbalanced ones,
7604 unless they are escaped:
7606 ```````````````````````````````` example
7607 [link [foo [bar]]][ref]
7611 <p><a href="/uri">link [foo [bar]]</a></p>
7612 ````````````````````````````````
7615 ```````````````````````````````` example
7620 <p><a href="/uri">link [bar</a></p>
7621 ````````````````````````````````
7624 The link text may contain inline content:
7626 ```````````````````````````````` example
7627 [link *foo **bar** `#`*][ref]
7631 <p><a href="/uri">link <em>foo <strong>bar</strong> <code>#</code></em></a></p>
7632 ````````````````````````````````
7635 ```````````````````````````````` example
7636 [![moon](moon.jpg)][ref]
7640 <p><a href="/uri"><img src="moon.jpg" alt="moon" /></a></p>
7641 ````````````````````````````````
7644 However, links may not contain other links, at any level of nesting.
7646 ```````````````````````````````` example
7647 [foo [bar](/uri)][ref]
7651 <p>[foo <a href="/uri">bar</a>]<a href="/uri">ref</a></p>
7652 ````````````````````````````````
7655 ```````````````````````````````` example
7656 [foo *bar [baz][ref]*][ref]
7660 <p>[foo <em>bar <a href="/uri">baz</a></em>]<a href="/uri">ref</a></p>
7661 ````````````````````````````````
7664 (In the examples above, we have two [shortcut reference links]
7665 instead of one [full reference link].)
The following cases illustrate the precedence of link text grouping over
emphasis grouping:
7670 ```````````````````````````````` example
7675 <p>*<a href="/uri">foo*</a></p>
7676 ````````````````````````````````
7679 ```````````````````````````````` example
7684 <p><a href="/uri">foo *bar</a></p>
7685 ````````````````````````````````
7688 These cases illustrate the precedence of HTML tags, code spans,
7689 and autolinks over link grouping:
7691 ```````````````````````````````` example
7692 [foo <bar attr="][ref]">
7696 <p>[foo <bar attr="][ref]"></p>
7697 ````````````````````````````````
7700 ```````````````````````````````` example
7705 <p>[foo<code>][ref]</code></p>
7706 ````````````````````````````````
7709 ```````````````````````````````` example
7710 [foo<http://example.com/?search=][ref]>
7714 <p>[foo<a href="http://example.com/?search=%5D%5Bref%5D">http://example.com/?search=][ref]</a></p>
7715 ````````````````````````````````
7718 Matching is case-insensitive:
7720 ```````````````````````````````` example
7725 <p><a href="/url" title="title">foo</a></p>
7726 ````````````````````````````````
7729 Unicode case fold is used:
7731 ```````````````````````````````` example
7732 [Толпой][Толпой] is a Russian word.
7736 <p><a href="/url">Толпой</a> is a Russian word.</p>
7737 ````````````````````````````````
7740 Consecutive internal [whitespace] is treated as one space for
7741 purposes of determining matching:
7743 ```````````````````````````````` example
7749 <p><a href="/url">Baz</a></p>
7750 ````````````````````````````````
No [whitespace] is allowed between the [link text] and the
[link label]:
7756 ```````````````````````````````` example
7761 <p>[foo] <a href="/url" title="title">bar</a></p>
7762 ````````````````````````````````
7765 ```````````````````````````````` example
7772 <a href="/url" title="title">bar</a></p>
7773 ````````````````````````````````
7776 This is a departure from John Gruber's original Markdown syntax
7777 description, which explicitly allows whitespace between the link
7778 text and the link label. It brings reference links in line with
7779 [inline links], which (according to both original Markdown and
7780 this spec) cannot have whitespace after the link text. More
7781 importantly, it prevents inadvertent capture of consecutive
7782 [shortcut reference links]. If whitespace is allowed between the
7783 link text and the link label, then in the following we will have
7784 a single reference link, not two shortcut reference links, as
7795 (Note that [shortcut reference links] were introduced by Gruber
7796 himself in a beta version of `Markdown.pl`, but never included
7797 in the official syntax description. Without shortcut reference
7798 links, it is harmless to allow space between the link text and
7799 link label; but once shortcut references are introduced, it is
7800 too dangerous to allow this, as it frequently leads to
7801 unintended results.)
When there are multiple matching [link reference definitions],
the first is used:
7806 ```````````````````````````````` example
7813 <p><a href="/url1">bar</a></p>
7814 ````````````````````````````````
7817 Note that matching is performed on normalized strings, not parsed
7818 inline content. So the following does not match, even though the
7819 labels define equivalent inline content:
7821 ```````````````````````````````` example
7827 ````````````````````````````````
[Link labels] cannot contain brackets, unless they are
backslash-escaped:
7833 ```````````````````````````````` example
7840 ````````````````````````````````
7843 ```````````````````````````````` example
7848 <p>[foo][ref[bar]]</p>
7849 <p>[ref[bar]]: /uri</p>
7850 ````````````````````````````````
7853 ```````````````````````````````` example
7859 <p>[[[foo]]]: /url</p>
7860 ````````````````````````````````
7863 ```````````````````````````````` example
7868 <p><a href="/uri">foo</a></p>
7869 ````````````````````````````````
7872 Note that in this example `]` is not backslash-escaped:
7874 ```````````````````````````````` example
7879 <p><a href="/uri">bar\</a></p>
7880 ````````````````````````````````
7883 A [link label] must contain at least one [non-whitespace character]:
7885 ```````````````````````````````` example
7892 ````````````````````````````````
7895 ```````````````````````````````` example
7906 ````````````````````````````````
7909 A [collapsed reference link](@)
7910 consists of a [link label] that [matches] a
7911 [link reference definition] elsewhere in the
7912 document, followed by the string `[]`.
7913 The contents of the first link label are parsed as inlines,
7914 which are used as the link's text. The link's URI and title are
provided by the matching link reference definition. Thus,
7916 `[foo][]` is equivalent to `[foo][foo]`.
7918 ```````````````````````````````` example
7923 <p><a href="/url" title="title">foo</a></p>
7924 ````````````````````````````````
7927 ```````````````````````````````` example
7930 [*foo* bar]: /url "title"
7932 <p><a href="/url" title="title"><em>foo</em> bar</a></p>
7933 ````````````````````````````````
7936 The link labels are case-insensitive:
7938 ```````````````````````````````` example
7943 <p><a href="/url" title="title">Foo</a></p>
7944 ````````````````````````````````
7948 As with full reference links, [whitespace] is not
7949 allowed between the two sets of brackets:
7951 ```````````````````````````````` example
7957 <p><a href="/url" title="title">foo</a>
7959 ````````````````````````````````
7962 A [shortcut reference link](@)
7963 consists of a [link label] that [matches] a
7964 [link reference definition] elsewhere in the
7965 document and is not followed by `[]` or a link label.
7966 The contents of the first link label are parsed as inlines,
7967 which are used as the link's text. The link's URI and title
7968 are provided by the matching link reference definition.
7969 Thus, `[foo]` is equivalent to `[foo][]`.
7971 ```````````````````````````````` example
7976 <p><a href="/url" title="title">foo</a></p>
7977 ````````````````````````````````
7980 ```````````````````````````````` example
7983 [*foo* bar]: /url "title"
7985 <p><a href="/url" title="title"><em>foo</em> bar</a></p>
7986 ````````````````````````````````
7989 ```````````````````````````````` example
7992 [*foo* bar]: /url "title"
7994 <p>[<a href="/url" title="title"><em>foo</em> bar</a>]</p>
7995 ````````````````````````````````
7998 ```````````````````````````````` example
8003 <p>[[bar <a href="/url">foo</a></p>
8004 ````````````````````````````````
8007 The link labels are case-insensitive:
8009 ```````````````````````````````` example
8014 <p><a href="/url" title="title">Foo</a></p>
8015 ````````````````````````````````
8018 A space after the link text should be preserved:
8020 ```````````````````````````````` example
8025 <p><a href="/url">foo</a> bar</p>
8026 ````````````````````````````````
8029 If you just want bracketed text, you can backslash-escape the
8030 opening bracket to avoid links:
8032 ```````````````````````````````` example
8038 ````````````````````````````````
8041 Note that this is a link, because a link label ends with the first
8042 following closing bracket:
8044 ```````````````````````````````` example
8049 <p>*<a href="/url">foo*</a></p>
8050 ````````````````````````````````
Full and collapsed references take precedence over shortcut
references:
8056 ```````````````````````````````` example
8062 <p><a href="/url2">foo</a></p>
8063 ````````````````````````````````
8065 ```````````````````````````````` example
8070 <p><a href="/url1">foo</a></p>
8071 ````````````````````````````````
8073 Inline links also take precedence:
8075 ```````````````````````````````` example
8080 <p><a href="">foo</a></p>
8081 ````````````````````````````````
8083 ```````````````````````````````` example
8088 <p><a href="/url1">foo</a>(not a link)</p>
8089 ````````````````````````````````
8091 In the following case `[bar][baz]` is parsed as a reference,
8092 `[foo]` as normal text:
8094 ```````````````````````````````` example
8099 <p>[foo]<a href="/url">bar</a></p>
8100 ````````````````````````````````
Here, though, `[foo][bar]` is parsed as a reference, since
`[bar]` is defined:
8106 ```````````````````````````````` example
8112 <p><a href="/url2">foo</a><a href="/url1">baz</a></p>
8113 ````````````````````````````````
8116 Here `[foo]` is not parsed as a shortcut reference, because it
8117 is followed by a link label (even though `[bar]` is not defined):
8119 ```````````````````````````````` example
8125 <p>[foo]<a href="/url1">bar</a></p>
8126 ````````````````````````````````
8132 Syntax for images is like the syntax for links, with one
8133 difference. Instead of [link text], we have an
8134 [image description](@). The rules for this are the
8135 same as for [link text], except that (a) an
8136 image description starts with `![` rather than `[`, and
8137 (b) an image description may contain links.
8138 An image description has inline elements
8139 as its contents. When an image is rendered to HTML,
8140 this is standardly used as the image's `alt` attribute.
8142 ```````````````````````````````` example
8143 ![foo](/url "title")
8145 <p><img src="/url" alt="foo" title="title" /></p>
8146 ````````````````````````````````
8149 ```````````````````````````````` example
8152 [foo *bar*]: train.jpg "train & tracks"
8154 <p><img src="train.jpg" alt="foo bar" title="train & tracks" /></p>
8155 ````````````````````````````````
8158 ```````````````````````````````` example
8159 ![foo ![bar](/url)](/url2)
8161 <p><img src="/url2" alt="foo bar" /></p>
8162 ````````````````````````````````
8165 ```````````````````````````````` example
8166 ![foo [bar](/url)](/url2)
8168 <p><img src="/url2" alt="foo bar" /></p>
8169 ````````````````````````````````
8172 Though this spec is concerned with parsing, not rendering, it is
8173 recommended that in rendering to HTML, only the plain string content
8174 of the [image description] be used. Note that in
8175 the above example, the alt attribute's value is `foo bar`, not `foo
8176 [bar](/url)` or `foo <a href="/url">bar</a>`. Only the plain string
8177 content is rendered, without formatting.
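
One way a renderer might obtain the plain string content is to walk the parsed inlines and concatenate only their literal text. The sketch below uses a hypothetical `Node` type invented for illustration; it is not the data model of this spec or of any particular implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical inline node: `kind` might be "text", "code", "emph",
    "strong", "link", or "image"."""
    kind: str
    literal: str = ""
    children: list = field(default_factory=list)


def plain_text(node: Node) -> str:
    """Concatenate the literal text of a node and all its descendants,
    discarding formatting, link destinations, and titles."""
    if node.kind in ("text", "code"):
        return node.literal
    return "".join(plain_text(child) for child in node.children)


# Image description corresponding to `![foo [bar](/url)](/url2)`.
description = Node("image", children=[
    Node("text", "foo "),
    Node("link", children=[Node("text", "bar")]),
])
alt = plain_text(description)  # "foo bar"
```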
8179 ```````````````````````````````` example
8182 [foo *bar*]: train.jpg "train & tracks"
8184 <p><img src="train.jpg" alt="foo bar" title="train & tracks" /></p>
8185 ````````````````````````````````
8188 ```````````````````````````````` example
8189 ![foo *bar*][foobar]
8191 [FOOBAR]: train.jpg "train & tracks"
8193 <p><img src="train.jpg" alt="foo bar" title="train & tracks" /></p>
8194 ````````````````````````````````
8197 ```````````````````````````````` example
8200 <p><img src="train.jpg" alt="foo" /></p>
8201 ````````````````````````````````
8204 ```````````````````````````````` example
8205 My ![foo bar](/path/to/train.jpg "title" )
8207 <p>My <img src="/path/to/train.jpg" alt="foo bar" title="title" /></p>
8208 ````````````````````````````````
8211 ```````````````````````````````` example
8214 <p><img src="url" alt="foo" /></p>
8215 ````````````````````````````````
8218 ```````````````````````````````` example
8221 <p><img src="/url" alt="" /></p>
8222 ````````````````````````````````
8227 ```````````````````````````````` example
8232 <p><img src="/url" alt="foo" /></p>
8233 ````````````````````````````````
8236 ```````````````````````````````` example
8241 <p><img src="/url" alt="foo" /></p>
8242 ````````````````````````````````
8247 ```````````````````````````````` example
8252 <p><img src="/url" alt="foo" title="title" /></p>
8253 ````````````````````````````````
8256 ```````````````````````````````` example
8259 [*foo* bar]: /url "title"
8261 <p><img src="/url" alt="foo bar" title="title" /></p>
8262 ````````````````````````````````
8265 The labels are case-insensitive:
8267 ```````````````````````````````` example
8272 <p><img src="/url" alt="Foo" title="title" /></p>
8273 ````````````````````````````````
8276 As with reference links, [whitespace] is not allowed
8277 between the two sets of brackets:
8279 ```````````````````````````````` example
8285 <p><img src="/url" alt="foo" title="title" />
8287 ````````````````````````````````
8292 ```````````````````````````````` example
8297 <p><img src="/url" alt="foo" title="title" /></p>
8298 ````````````````````````````````
8301 ```````````````````````````````` example
8304 [*foo* bar]: /url "title"
8306 <p><img src="/url" alt="foo bar" title="title" /></p>
8307 ````````````````````````````````
8310 Note that link labels cannot contain unescaped brackets:
8312 ```````````````````````````````` example
8315 [[foo]]: /url "title"
8318 <p>[[foo]]: /url "title"</p>
8319 ````````````````````````````````
8322 The link labels are case-insensitive:
8324 ```````````````````````````````` example
8329 <p><img src="/url" alt="Foo" title="title" /></p>
8330 ````````````````````````````````
8333 If you just want a literal `!` followed by bracketed text, you can
8334 backslash-escape the opening `[`:
8336 ```````````````````````````````` example
8342 ````````````````````````````````
If you want a link after a literal `!`, backslash-escape the
`!`:
8348 ```````````````````````````````` example
8353 <p>!<a href="/url" title="title">foo</a></p>
8354 ````````````````````````````````
8359 [Autolink](@)s are absolute URIs and email addresses inside
`<` and `>`. They are parsed as links, with the URL or email address
as the link label.
8363 A [URI autolink](@) consists of `<`, followed by an
8364 [absolute URI] not containing `<`, followed by `>`. It is parsed as
8365 a link to the URI, with the URI as the link's label.
8367 An [absolute URI](@),
8368 for these purposes, consists of a [scheme] followed by a colon (`:`)
8369 followed by zero or more characters other than ASCII
8370 [whitespace] and control characters, `<`, and `>`. If
8371 the URI includes these characters, they must be percent-encoded
8372 (e.g. `%20` for a space).
8374 For purposes of this spec, a [scheme](@) is any sequence
8375 of 2--32 characters beginning with an ASCII letter and followed
8376 by any combination of ASCII letters, digits, or the symbols plus
8377 ("+"), period ("."), or hyphen ("-").
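
The scheme and absolute URI rules translate directly into a regular expression. The sketch below is one possible encoding; the names `URI_AUTOLINK` and `uri_autolink_target` are hypothetical.

```python
import re

# `<`, a scheme of 2 to 32 characters, `:`, then any characters other than
# ASCII whitespace, control characters, `<`, and `>`, then `>`.
URI_AUTOLINK = re.compile(
    r"<([A-Za-z][A-Za-z0-9+.\-]{1,31}:[^\x00-\x20\x7f<>]*)>"
)


def uri_autolink_target(text: str):
    """Return the URI if text is exactly one URI autolink, else None."""
    match = URI_AUTOLINK.fullmatch(text)
    return match.group(1) if match else None
```

For example, `uri_autolink_target("<localhost:5001/foo>")` returns `localhost:5001/foo`, while `uri_autolink_target("<m:abc>")` returns `None` because the scheme is only one character long.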
8379 Here are some valid autolinks:
8381 ```````````````````````````````` example
8382 <http://foo.bar.baz>
8384 <p><a href="http://foo.bar.baz">http://foo.bar.baz</a></p>
8385 ````````````````````````````````
8388 ```````````````````````````````` example
8389 <http://foo.bar.baz/test?q=hello&id=22&boolean>
8391 <p><a href="http://foo.bar.baz/test?q=hello&id=22&boolean">http://foo.bar.baz/test?q=hello&id=22&boolean</a></p>
8392 ````````````````````````````````
8395 ```````````````````````````````` example
8396 <irc://foo.bar:2233/baz>
8398 <p><a href="irc://foo.bar:2233/baz">irc://foo.bar:2233/baz</a></p>
8399 ````````````````````````````````
8402 Uppercase is also fine:
8404 ```````````````````````````````` example
8405 <MAILTO:FOO@BAR.BAZ>
8407 <p><a href="MAILTO:FOO@BAR.BAZ">MAILTO:FOO@BAR.BAZ</a></p>
8408 ````````````````````````````````
8411 Note that many strings that count as [absolute URIs] for
8412 purposes of this spec are not valid URIs, because their
schemes are not registered or because of other problems
with their syntax:
8416 ```````````````````````````````` example
8419 <p><a href="a+b+c:d">a+b+c:d</a></p>
8420 ````````````````````````````````
8423 ```````````````````````````````` example
8424 <made-up-scheme://foo,bar>
8426 <p><a href="made-up-scheme://foo,bar">made-up-scheme://foo,bar</a></p>
8427 ````````````````````````````````
8430 ```````````````````````````````` example
8433 <p><a href="http://../">http://../</a></p>
8434 ````````````````````````````````
8437 ```````````````````````````````` example
8438 <localhost:5001/foo>
8440 <p><a href="localhost:5001/foo">localhost:5001/foo</a></p>
8441 ````````````````````````````````
8444 Spaces are not allowed in autolinks:
8446 ```````````````````````````````` example
8447 <http://foo.bar/baz bim>
8449 <p><http://foo.bar/baz bim></p>
8450 ````````````````````````````````
8453 Backslash-escapes do not work inside autolinks:
8455 ```````````````````````````````` example
8456 <http://example.com/\[\>
8458 <p><a href="http://example.com/%5C%5B%5C">http://example.com/\[\</a></p>
8459 ````````````````````````````````
8462 An [email autolink](@)
8463 consists of `<`, followed by an [email address],
8464 followed by `>`. The link's label is the email address,
8465 and the URL is `mailto:` followed by the email address.
8467 An [email address](@),
8468 for these purposes, is anything that matches
8469 the [non-normative regex from the HTML5
8470 spec](https://html.spec.whatwg.org/multipage/forms.html#e-mail-state-(type=email)):
8472 /^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?
8473 (?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/
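
The regex above can be used almost verbatim in Python. The following sketch wraps it in a hypothetical helper, `email_autolink_href`, that returns the `mailto:` URL for a candidate autolink.

```python
import re

# The pattern quoted above, without the anchors; fullmatch() supplies them.
EMAIL_ADDRESS = re.compile(
    r"[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9]"
    r"(?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?"
    r"(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*"
)


def email_autolink_href(text: str):
    """Return the `mailto:` URL if text is an email autolink, else None."""
    if len(text) < 2 or text[0] != "<" or text[-1] != ">":
        return None
    address = text[1:-1]
    if EMAIL_ADDRESS.fullmatch(address) is None:
        return None
    return "mailto:" + address
```

Because a backslash is not in the allowed character set, a string such as `<foo\+@bar.example.com>` is rejected rather than unescaped.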
8475 Examples of email autolinks:
8477 ```````````````````````````````` example
8478 <foo@bar.example.com>
8480 <p><a href="mailto:foo@bar.example.com">foo@bar.example.com</a></p>
8481 ````````````````````````````````
8484 ```````````````````````````````` example
8485 <foo+special@Bar.baz-bar0.com>
8487 <p><a href="mailto:foo+special@Bar.baz-bar0.com">foo+special@Bar.baz-bar0.com</a></p>
8488 ````````````````````````````````
8491 Backslash-escapes do not work inside email autolinks:
8493 ```````````````````````````````` example
8494 <foo\+@bar.example.com>
8496 <p><foo+@bar.example.com></p>
8497 ````````````````````````````````
8500 These are not autolinks:
8502 ```````````````````````````````` example
8506 ````````````````````````````````
8509 ```````````````````````````````` example
8512 <p>< http://foo.bar ></p>
8513 ````````````````````````````````
8516 ```````````````````````````````` example
8519 <p><m:abc></p>
8520 ````````````````````````````````
8523 ```````````````````````````````` example
8526 <p><foo.bar.baz></p>
8527 ````````````````````````````````
8530 ```````````````````````````````` example
8533 <p>http://example.com</p>
8534 ````````````````````````````````
8537 ```````````````````````````````` example
8540 <p>foo@bar.example.com</p>
8541 ````````````````````````````````
8546 Text between `<` and `>` that looks like an HTML tag is parsed as a
8547 raw HTML tag and will be rendered in HTML without escaping.
8548 Tag and attribute names are not limited to current HTML tags,
8549 so custom tags (and even, say, DocBook tags) may be used.
8551 Here is the grammar for tags:
8553 A [tag name](@) consists of an ASCII letter
followed by zero or more ASCII letters, digits, or
hyphens (`-`).
8557 An [attribute](@) consists of [whitespace],
8558 an [attribute name], and an optional
8559 [attribute value specification].
8561 An [attribute name](@)
8562 consists of an ASCII letter, `_`, or `:`, followed by zero or more ASCII
8563 letters, digits, `_`, `.`, `:`, or `-`. (Note: This is the XML
8564 specification restricted to ASCII. HTML5 is laxer.)
8566 An [attribute value specification](@)
8567 consists of optional [whitespace],
a `=` character, optional [whitespace], and an [attribute
value].
8571 An [attribute value](@)
8572 consists of an [unquoted attribute value],
8573 a [single-quoted attribute value], or a [double-quoted attribute value].
8575 An [unquoted attribute value](@)
8576 is a nonempty string of characters not
8577 including spaces, `"`, `'`, `=`, `<`, `>`, or `` ` ``.
8579 A [single-quoted attribute value](@)
8580 consists of `'`, zero or more
8581 characters not including `'`, and a final `'`.
8583 A [double-quoted attribute value](@)
8584 consists of `"`, zero or more
8585 characters not including `"`, and a final `"`.
8587 An [open tag](@) consists of a `<` character, a [tag name],
8588 zero or more [attributes], optional [whitespace], an optional `/`
8589 character, and a `>` character.
8591 A [closing tag](@) consists of the string `</`, a
8592 [tag name], optional [whitespace], and the character `>`.
8594 An [HTML comment](@) consists of `<!--` + *text* + `-->`,
8595 where *text* does not start with `>` or `->`, does not end with `-`,
8596 and does not contain `--`. (See the
8597 [HTML5 spec](http://www.w3.org/TR/html5/syntax.html#comments).)
8599 A [processing instruction](@)
8600 consists of the string `<?`, a string
8601 of characters not including the string `?>`, and the string
8604 A [declaration](@) consists of the
8605 string `<!`, a name consisting of one or more uppercase ASCII letters,
8606 [whitespace], a string of characters not including the
8607 character `>`, and the character `>`.
8609 A [CDATA section](@) consists of
8610 the string `<![CDATA[`, a string of characters not including the string
8611 `]]>`, and the string `]]>`.
8613 An [HTML tag](@) consists of an [open tag], a [closing tag],
8614 an [HTML comment], a [processing instruction], a [declaration],
8615 or a [CDATA section].
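The open-tag and closing-tag parts of this grammar can be sketched as regular expressions. This is only an illustration, not the reference implementation: the constant names are hypothetical, `WS` assumes that [whitespace] means space, tab, newline, line tabulation, form feed, or carriage return, and the hyphen in `TAG_NAME` is an assumption suggested by the `responsive-image` example below.

```python
import re

# Hypothetical names; each pattern mirrors one rule of the grammar above.
WS = r"[ \t\n\v\f\r]+"                      # assumed definition of [whitespace]
TAG_NAME = r"[A-Za-z][A-Za-z0-9-]*"         # hyphen is an assumption
ATTR_NAME = r"[A-Za-z_:][A-Za-z0-9_.:-]*"
UNQUOTED = r"[^ \t\n\v\f\r\"'=<>`]+"
SINGLE_QUOTED = r"'[^']*'"
DOUBLE_QUOTED = r'"[^"]*"'
ATTR_VALUE = f"(?:{UNQUOTED}|{SINGLE_QUOTED}|{DOUBLE_QUOTED})"
ATTR_VALUE_SPEC = f"(?:(?:{WS})?=(?:{WS})?{ATTR_VALUE})"
ATTRIBUTE = f"(?:{WS}{ATTR_NAME}{ATTR_VALUE_SPEC}?)"

OPEN_TAG = re.compile(f"<{TAG_NAME}{ATTRIBUTE}*(?:{WS})?/?>")
CLOSING_TAG = re.compile(f"</{TAG_NAME}(?:{WS})?>")


def is_open_tag(text):
    """True if the whole string is an open tag under the sketch above."""
    return OPEN_TAG.fullmatch(text) is not None


def is_closing_tag(text):
    """True if the whole string is a closing tag under the sketch above."""
    return CLOSING_TAG.fullmatch(text) is not None
```

Under these assumptions, `is_open_tag` accepts the legal tags shown in the examples that follow and rejects the ones with illegal names, attribute values, or whitespace.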
8617 Here are some simple open tags:
8619 ```````````````````````````````` example
8622 <p><a><bab><c2c></p>
8623 ````````````````````````````````
8628 ```````````````````````````````` example
8632 ````````````````````````````````
8635 [Whitespace] is allowed:
8637 ```````````````````````````````` example
8643 ````````````````````````````````
8648 ```````````````````````````````` example
8649 <a foo="bar" bam = 'baz <em>"</em>'
8650 _boolean zoop:33=zoop:33 />
8652 <p><a foo="bar" bam = 'baz <em>"</em>'
8653 _boolean zoop:33=zoop:33 /></p>
8654 ````````````````````````````````
8657 Custom tag names can be used:
8659 ```````````````````````````````` example
8660 Foo <responsive-image src="foo.jpg" />
8662 <p>Foo <responsive-image src="foo.jpg" /></p>
8663 ````````````````````````````````
8666 Illegal tag names, not parsed as HTML:
8668 ```````````````````````````````` example
8671 <p>&lt;33&gt; &lt;__&gt;</p>
8672 ````````````````````````````````
8675 Illegal attribute names:
8677 ```````````````````````````````` example
8680 <p>&lt;a h*#ref=&quot;hi&quot;&gt;</p>
8681 ````````````````````````````````
8684 Illegal attribute values:
8686 ```````````````````````````````` example
8687 <a href="hi'> <a href=hi'>
8689 <p>&lt;a href=&quot;hi'&gt; &lt;a href=hi'&gt;</p>
8690 ````````````````````````````````
8693 Illegal [whitespace]:
8695 ```````````````````````````````` example
8700 foo&gt;&lt;bar/ &gt;</p>
8701 ````````````````````````````````
8704 Missing [whitespace]:
8706 ```````````````````````````````` example
8707 <a href='bar'title=title>
8709 <p>&lt;a href='bar'title=title&gt;</p>
8710 ````````````````````````````````
8715 ```````````````````````````````` example
8719 ````````````````````````````````
8722 Illegal attributes in closing tag:
8724 ```````````````````````````````` example
8727 <p>&lt;/a href=&quot;foo&quot;&gt;</p>
8728 ````````````````````````````````
8733 ```````````````````````````````` example
8735 comment - with hyphen -->
8737 <p>foo <!-- this is a
8738 comment - with hyphen --></p>
8739 ````````````````````````````````
8742 ```````````````````````````````` example
8743 foo <!-- not a comment -- two hyphens -->
8745 <p>foo &lt;!-- not a comment -- two hyphens --&gt;</p>
8746 ````````````````````````````````
8751 ```````````````````````````````` example
8756 <p>foo &lt;!--&gt; foo --&gt;</p>
8757 <p>foo &lt;!-- foo---&gt;</p>
8758 ````````````````````````````````
8761 Processing instructions:
8763 ```````````````````````````````` example
8764 foo <?php echo $a; ?>
8766 <p>foo <?php echo $a; ?></p>
8767 ````````````````````````````````
8772 ```````````````````````````````` example
8773 foo <!ELEMENT br EMPTY>
8775 <p>foo <!ELEMENT br EMPTY></p>
8776 ````````````````````````````````
8781 ```````````````````````````````` example
8784 <p>foo <![CDATA[>&<]]></p>
8785 ````````````````````````````````
8788 Entity and numeric character references are preserved in HTML
8791 ```````````````````````````````` example
8792 foo <a href="&ouml;">
8794 <p>foo <a href="&ouml;"></p>
8795 ````````````````````````````````
8798 Backslash escapes do not work in HTML attributes:
8800 ```````````````````````````````` example
8803 <p>foo <a href="\*"></p>
8804 ````````````````````````````````
8807 ```````````````````````````````` example
8810 <p>&lt;a href=&quot;\&quot;&quot;&gt;</p>
8811 ````````````````````````````````
8816 A line break (not in a code span or HTML tag) that is preceded
8817 by two or more spaces and does not occur at the end of a block
8818 is parsed as a [hard line break](@) (rendered
8819 in HTML as a `<br />` tag):
8821 ```````````````````````````````` example
8827 ````````````````````````````````
8830 For a more visible alternative, a backslash before the
8831 [line ending] may be used instead of two spaces:
8833 ```````````````````````````````` example
8839 ````````````````````````````````
8842 More than two spaces can be used:
8844 ```````````````````````````````` example
8850 ````````````````````````````````
8853 Leading spaces at the beginning of the next line are ignored:
8855 ```````````````````````````````` example
8861 ````````````````````````````````
8864 ```````````````````````````````` example
8870 ````````````````````````````````
8873 Line breaks can occur inside emphasis, links, and other constructs
8874 that allow inline content:
8876 ```````````````````````````````` example
8882 ````````````````````````````````
8885 ```````````````````````````````` example
8891 ````````````````````````````````
8894 Line breaks do not occur inside code spans:
8896 ```````````````````````````````` example
8900 <p><code>code span</code></p>
8901 ````````````````````````````````
8904 ```````````````````````````````` example
8908 <p><code>code\ span</code></p>
8909 ````````````````````````````````
8914 ```````````````````````````````` example
8920 ````````````````````````````````
8923 ```````````````````````````````` example
8929 ````````````````````````````````
8932 Hard line breaks are for separating inline content within a block.
8933 Neither syntax for hard line breaks works at the end of a paragraph or
8934 other block element:
8936 ```````````````````````````````` example
8940 ````````````````````````````````
8943 ```````````````````````````````` example
8947 ````````````````````````````````
8950 ```````````````````````````````` example
8954 ````````````````````````````````
8957 ```````````````````````````````` example
8961 ````````````````````````````````
8966 A regular line break (not in a code span or HTML tag) that is not
8967 preceded by two or more spaces or a backslash is parsed as a
8968 [softbreak](@). (A softbreak may be rendered in HTML either as a
8969 [line ending] or as a space. The result will be the same in
8970 browsers. In the examples here, a [line ending] will be used.)
8972 ```````````````````````````````` example
8978 ````````````````````````````````
8981 Spaces at the end of the line and beginning of the next line are
8984 ```````````````````````````````` example
8990 ````````````````````````````````
8993 A conforming implementation may render a soft line break in HTML either as a
8994 line break or as a space.
8996 A renderer may also provide an option to render soft line breaks
8997 as hard line breaks.
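The hard and soft line break rules can be sketched as a small rendering function. This is a simplified illustration under stated assumptions: the function name `render_breaks` and its `softbreak` parameter are hypothetical, the input is the raw text of a single paragraph, and code spans, raw HTML tags, and backslash-escaped backslashes are not handled.

```python
def render_breaks(raw, softbreak="\n"):
    """Render the line endings inside one paragraph's raw text.

    `softbreak` is what a soft line break becomes: "\n" (a line ending),
    " " (a space), or "<br />\n" to render soft breaks as hard breaks.
    """
    lines = raw.split("\n")
    out = []
    for i, line in enumerate(lines):
        if i > 0:
            line = line.lstrip(" ")      # leading spaces on the next line are ignored
        stripped = line.rstrip(" ")
        trailing_spaces = len(line) - len(stripped)
        if i == len(lines) - 1:
            out.append(stripped)         # no break at the end of the block
        elif trailing_spaces >= 2:
            out.append(stripped + "<br />\n")    # two or more spaces: hard break
        elif line.endswith("\\"):
            out.append(line[:-1] + "<br />\n")   # backslash before line ending
        else:
            out.append(stripped + softbreak)     # anything else: softbreak
    return "".join(out)
```

For example, `render_breaks("foo  \nbaz")` yields a hard break, while `render_breaks("foo\nbaz", softbreak=" ")` joins the two lines with a space.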
9001 Any characters not given an interpretation by the above rules will
9002 be parsed as plain textual content.
9004 ```````````````````````````````` example
9007 <p>hello $.;'there</p>
9008 ````````````````````````````````
9011 ```````````````````````````````` example
9015 ````````````````````````````````
9018 Internal spaces are preserved verbatim:
9020 ```````````````````````````````` example
9023 <p>Multiple     spaces</p>
9024 ````````````````````````````````
9029 # Appendix: A parsing strategy
9031 In this appendix we describe some features of the parsing strategy
9032 used in the CommonMark reference implementations.
9036 Parsing has two phases:
9038 1. In the first phase, lines of input are consumed and the block
9039 structure of the document---its division into paragraphs, block quotes,
9040 list items, and so on---is constructed. Text is assigned to these
9041 blocks but not parsed. Link reference definitions are parsed and a
9042 map of links is constructed.
9044 2. In the second phase, the raw text contents of paragraphs and headings
9045 are parsed into sequences of Markdown inline elements (strings,
9046 code spans, links, emphasis, and so on), using the map of link
9047 references constructed in phase 1.
9049 At each point in processing, the document is represented as a tree of
9050 **blocks**. The root of the tree is a `document` block. The `document`
9051 may have any number of other blocks as **children**. These children
9052 may, in turn, have other blocks as children. The last child of a block
9053 is normally considered **open**, meaning that subsequent lines of input
9054 can alter its contents. (Blocks that are not open are **closed**.)
9055 Here, for example, is a possible document tree, with the open blocks
9062 "Lorem ipsum dolor\nsit amet."
9063 -> list (type=bullet tight=true bullet_char=-)
9066 "Qui *quodsi iracundia*"
9072 ## Phase 1: block structure
9074 Each line that is processed has an effect on this tree. The line is
9075 analyzed and, depending on its contents, the document may be altered
9076 in one or more of the following ways:
9078 1. One or more open blocks may be closed.
9079 2. One or more new blocks may be created as children of the
9081 3. Text may be added to the last (deepest) open block remaining
9084 Once a line has been incorporated into the tree in this way,
9085 it can be discarded, so input can be read in a stream.
9087 For each line, we follow this procedure:
9089 1. First we iterate through the open blocks, starting with the
9090 root document, and descending through last children down to the last
9091 open block. Each block imposes a condition that the line must satisfy
9092 if the block is to remain open. For example, a block quote requires a
9093 `>` character. A paragraph requires a non-blank line.
9094 In this phase we may match all or just some of the open
9095 blocks. But we cannot close unmatched blocks yet, because we may have a
9096 [lazy continuation line].
9098 2. Next, after consuming the continuation markers for existing
9099 blocks, we look for new block starts (e.g. `>` for a block quote).
9100 If we encounter a new block start, we close any blocks unmatched
9101 in step 1 before creating the new block as a child of the last
9104 3. Finally, we look at the remainder of the line (after block
9105 markers like `>`, list markers, and indentation have been consumed).
9106 This is text that can be incorporated into the last open
9107 block (a paragraph, code block, heading, or raw HTML).
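The three steps above can be sketched for a tiny subset of Markdown. This is a minimal illustration, not the reference implementation: it assumes only `document`, `block_quote`, and `paragraph` blocks exist, ignores indentation limits, and uses hypothetical names (`Block`, `open_chain`, `add_line`) throughout.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Block:
    kind: str                                   # "document", "block_quote", "paragraph"
    children: list = field(default_factory=list)
    lines: list = field(default_factory=list)   # raw text, not parsed in phase 1
    open: bool = True


def open_chain(doc):
    """Open blocks from the root down through last children."""
    chain = [doc]
    while chain[-1].children and chain[-1].children[-1].open:
        chain.append(chain[-1].children[-1])
    return chain


def close_all(blocks):
    for block in blocks:
        block.open = False


def strip_quote_marker(text):
    """Text after a leading `>` marker, or None if there is no marker."""
    s = text.lstrip(" ")
    if not s.startswith(">"):
        return None
    s = s[1:]
    return s[1:] if s.startswith(" ") else s


def add_line(doc, line):
    chain = open_chain(doc)
    rest, matched = line, 1                     # the document always matches
    # Step 1: check each open block's continuation condition.
    for block in chain[1:]:
        if block.kind == "block_quote":
            after = strip_quote_marker(rest)
            if after is None:
                break
            rest = after
        elif rest.strip() == "":                # a paragraph needs a non-blank line
            break
        matched += 1
    container, unmatched = chain[matched - 1], chain[matched:]
    # Step 2: look for new block starts; close unmatched blocks first.
    while (after := strip_quote_marker(rest)) is not None:
        close_all(unmatched)
        unmatched = []
        if container.kind == "paragraph":       # a paragraph cannot contain blocks
            container.open = False
            container = chain[matched - 2]
        quote = Block("block_quote")
        container.children.append(quote)
        container, rest = quote, after
    # Step 3: incorporate the remaining text.
    if rest.strip() == "":
        close_all(unmatched)
        return
    tip = open_chain(doc)[-1]
    if tip.kind == "paragraph":                 # matched paragraph or lazy continuation
        tip.lines.append(rest.strip())
        return
    close_all(unmatched)
    para = Block("paragraph")
    container.children.append(para)
    para.lines.append(rest.strip())
```

Feeding the lines `> Lorem ipsum dolor` and `sit amet.` to `add_line` leaves both in one paragraph inside the block quote, because the second line is treated as a lazy continuation rather than closing the unmatched blocks.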
9109 Setext headings are formed when we see a line of a paragraph
9110 that is a [setext heading underline].
9112 Reference link definitions are detected when a paragraph is closed;
9113 the accumulated text lines are parsed to see if they begin with
9114 one or more reference link definitions. Any remainder becomes a
9117 We can see how this works by considering how the tree above is
9118 generated by four lines of Markdown:
9123 > - Qui *quodsi iracundia*
9127 At the outset, our document model is just
9133 The first line of our text,
9139 causes a `block_quote` block to be created as a child of our
9140 open `document` block, and a `paragraph` block as a child of
9141 the `block_quote`. Then the text is added to the last open
9142 block, the `paragraph`:
9157 is a "lazy continuation" of the open `paragraph`, so it gets added
9158 to the paragraph's text:
9164 "Lorem ipsum dolor\nsit amet."
9170 > - Qui *quodsi iracundia*
9173 causes the `paragraph` block to be closed, and a new `list` block
9174 opened as a child of the `block_quote`. A `list_item` is also
9175 added as a child of the `list`, and a `paragraph` as a child of
9176 the `list_item`. The text is then added to the new `paragraph`:
9182 "Lorem ipsum dolor\nsit amet."
9183 -> list (type=bullet tight=true bullet_char=-)
9186 "Qui *quodsi iracundia*"
9195 causes the `list_item` (and its child the `paragraph`) to be closed,
9196 and a new `list_item` opened up as a child of the `list`. A `paragraph`
9197 is added as a child of the new `list_item`, to contain the text.
9198 We thus obtain the final tree:
9204 "Lorem ipsum dolor\nsit amet."
9205 -> list (type=bullet tight=true bullet_char=-)
9208 "Qui *quodsi iracundia*"
9214 ## Phase 2: inline structure
9216 Once all of the input has been parsed, all open blocks are closed.
9218 We then "walk the tree," visiting every node, and parse raw
9219 string contents of paragraphs and headings as inlines. At this
9220 point we have seen all the link reference definitions, so we can
9221 resolve reference links as we go.
9227 str "Lorem ipsum dolor"
9230 list (type=bullet tight=true bullet_char=-)
9235 str "quodsi iracundia"
9241 Notice how the [line ending] in the first paragraph has
9242 been parsed as a `softbreak`, and the asterisks in the first list item
9243 have become an `emph`.
9245 ### An algorithm for parsing nested emphasis and links
9247 By far the trickiest part of inline parsing is handling emphasis,
9248 strong emphasis, links, and images. This is done using the following
9251 When we're parsing inlines and we hit either
9253 - a run of `*` or `_` characters, or
9256 we insert a text node with these symbols as its literal content, and we
9257 add a pointer to this text node to the [delimiter stack](@).
9259 The [delimiter stack] is a doubly linked list. Each
9260 element contains a pointer to a text node, plus information about
9262 - the type of delimiter (`[`, `![`, `*`, `_`)
9263 - the number of delimiters,
9264 - whether the delimiter is "active" (all are active to start), and
9265 - whether the delimiter is a potential opener, a potential closer,
9266 or both (which depends on what sort of characters precede
9267 and follow the delimiters).
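One possible shape for a delimiter stack element, and for the doubly linked list that holds it, is sketched below. The class and field names are hypothetical, and the text node is represented by a plain dictionary purely for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)
class Delimiter:
    text_node: dict                     # stand-in for a pointer to the text node
    kind: str                           # "[", "![", "*", or "_"
    count: int                          # number of delimiter characters
    can_open: bool                      # potential opener
    can_close: bool                     # potential closer
    active: bool = True                 # all delimiters start out active
    prev: Optional["Delimiter"] = None  # element below this one
    next: Optional["Delimiter"] = None  # element above this one


class DelimiterStack:
    def __init__(self):
        self.top = None

    def push(self, delim):
        """Add a delimiter on top of the stack."""
        delim.prev, delim.next = self.top, None
        if self.top is not None:
            self.top.next = delim
        self.top = delim

    def remove(self, delim):
        """Unlink a delimiter from anywhere in the stack."""
        if delim.prev is not None:
            delim.prev.next = delim.next
        if delim.next is not None:
            delim.next.prev = delim.prev
        else:
            self.top = delim.prev       # the removed element was the top
        delim.prev = delim.next = None
```

A doubly linked list suits the procedures described below, since they remove elements from the middle of the stack and walk it in both directions.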
9269 When we hit a `]` character, we call the *look for link or image*
9270 procedure (see below).
9272 When we hit the end of the input, we call the *process emphasis*
9273 procedure (see below), with `stack_bottom` = NULL.
9275 #### *look for link or image*
9277 Starting at the top of the delimiter stack, we look backwards
9278 through the stack for an opening `[` or `![` delimiter.
9280 - If we don't find one, we return a literal text node `]`.
9282 - If we do find one, but it's not *active*, we remove the inactive
9283 delimiter from the stack, and return a literal text node `]`.
9285 - If we find one and it's active, then we parse ahead to see if
9286 we have an inline link/image, reference link/image, compact reference
9287 link/image, or shortcut reference link/image.
9289 + If we don't, then we remove the opening delimiter from the
9290 delimiter stack and return a literal text node `]`.
9294 * We return a link or image node whose children are the inlines
9295 after the text node pointed to by the opening delimiter.
9297 * We run *process emphasis* on these inlines, with the `[` opener
9300 * We remove the opening delimiter.
9302 * If we have a link (and not an image), we also set all
9303 `[` delimiters before the opening delimiter to *inactive*. (This
9304 will prevent us from getting links within links.)
9306 #### *process emphasis*
9308 Parameter `stack_bottom` sets a lower bound to how far we
9309 descend in the [delimiter stack]. If it is NULL, we can
9310 go all the way to the bottom. Otherwise, we stop before
9311 visiting `stack_bottom`.
9313 Let `current_position` point to the element on the [delimiter stack]
9314 just above `stack_bottom` (or the first element if `stack_bottom`
9317 We keep track of the `openers_bottom` for each delimiter
9318 type (`*`, `_`). Initialize this to `stack_bottom`.
9320 Then we repeat the following until we run out of potential
9323 - Move `current_position` forward in the delimiter stack (if needed)
9324 until we find the first potential closer with delimiter `*` or `_`.
9325 (This will be the potential closer closest
9326 to the beginning of the input -- the first one in parse order.)
9328 - Now, look back in the stack (staying above `stack_bottom` and
9329 the `openers_bottom` for this delimiter type) for the
9330 first matching potential opener ("matching" means same delimiter).
9334 + Figure out whether we have emphasis or strong emphasis:
9335 if both closer and opener spans have length >= 2, we have
9336 strong, otherwise regular.
9338 + Insert an emph or strong emph node accordingly, after
9339 the text node corresponding to the opener.
9341 + Remove any delimiters between the opener and closer from
9342 the delimiter stack.
9344 + Remove 1 (for regular emph) or 2 (for strong emph) delimiters
9345 from the opening and closing text nodes. If they become empty
9346 as a result, remove them and remove the corresponding element
9347 of the delimiter stack. If the closing node is removed, reset
9348 `current_position` to the next element in the stack.
9352 + Set `openers_bottom` to the element before `current_position`.
9353 (We know that there are no openers for this kind of closer up to and
9354 including this point, so this puts a lower bound on future searches.)
9356 + If the closer at `current_position` is not a potential opener,
9357 remove it from the delimiter stack (since we know it can't
9358 be a closer either).
9360 + Advance `current_position` to the next element in the stack.
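The loop above can be sketched as follows. This is a simplified illustration, not the reference implementation: `stack_bottom` is fixed at NULL, links and images are not handled, the delimiter stack is a plain Python list rather than a linked list, and `tokenize` decides opener and closer status with a crude whitespace test that merely stands in for the real flanking rules. All names are hypothetical.

```python
import re
from dataclasses import dataclass, field


@dataclass(eq=False)
class Node:
    text: str = ""                       # literal content of a text node
    tag: str = ""                        # "em" or "strong" for emphasis nodes
    children: list = field(default_factory=list)


@dataclass(eq=False)
class Delim:
    node: Node
    char: str
    count: int
    can_open: bool
    can_close: bool


def tokenize(text):
    """Split text into text nodes and delimiter runs (crude open/close test)."""
    nodes, delims = [], []
    for m in re.finditer(r"\*+|_+|[^*_]+", text):
        run = m.group()
        node = Node(text=run)
        nodes.append(node)
        if run[0] in "*_":
            before, after = text[:m.start()][-1:], text[m.end():][:1]
            delims.append(Delim(node, run[0], len(run),
                                can_open=after != "" and not after.isspace(),
                                can_close=before != "" and not before.isspace()))
    return nodes, delims


def process_emphasis(nodes, delims):
    openers_bottom = {"*": None, "_": None}   # None plays the role of stack_bottom
    pos = 0                                   # current_position
    while pos < len(delims):
        closer = delims[pos]
        if not closer.can_close:
            pos += 1
            continue
        # Look back for the first matching potential opener.
        opener_idx, j = None, pos - 1
        while j >= 0 and delims[j] is not openers_bottom[closer.char]:
            if delims[j].char == closer.char and delims[j].can_open:
                opener_idx = j
                break
            j -= 1
        if opener_idx is None:
            openers_bottom[closer.char] = delims[pos - 1] if pos > 0 else None
            if closer.can_open:
                pos += 1
            else:
                del delims[pos]               # now points at the next element
            continue
        opener = delims[opener_idx]
        n = 2 if opener.count >= 2 and closer.count >= 2 else 1
        a, b = nodes.index(opener.node), nodes.index(closer.node)
        nodes[a + 1:b] = [Node(tag="strong" if n == 2 else "em",
                               children=nodes[a + 1:b])]
        del delims[opener_idx + 1:pos]        # drop delimiters in between
        pos = opener_idx + 1                  # the closer's new index
        for d in (opener, closer):
            d.count -= n
            d.node.text = d.node.text[n:]
        if closer.count == 0:
            nodes.remove(closer.node)
            del delims[pos]
        if opener.count == 0:
            nodes.remove(opener.node)
            del delims[opener_idx]
            pos -= 1
    delims.clear()


def render(nodes):
    return "".join(f"<{n.tag}>{render(n.children)}</{n.tag}>" if n.tag else n.text
                   for n in nodes)


def to_html(text):
    nodes, delims = tokenize(text)
    process_emphasis(nodes, delims)
    return render(nodes)
```

Because a closer that still has delimiters left stays at `current_position`, a run such as `***foo***` is matched twice, first as strong emphasis and then as regular emphasis around it.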
9362 After we're done, we remove all delimiters above `stack_bottom` from the