1 ---
2 title: CommonMark Spec
3 author: John MacFarlane
4 version: 0.27
5 date: '2016-11-18'
6 license: '[CC-BY-SA 4.0](http://creativecommons.org/licenses/by-sa/4.0/)'
7 ...
8
9 # Introduction
10
11 ## What is Markdown?
12
13 Markdown is a plain text format for writing structured documents,
14 based on conventions for indicating formatting in email
15 and usenet posts.  It was developed by John Gruber (with
16 help from Aaron Swartz) and released in 2004 in the form of a
17 [syntax description](http://daringfireball.net/projects/markdown/syntax)
18 and a Perl script (`Markdown.pl`) for converting Markdown to
19 HTML.  In the next decade, dozens of implementations were
20 developed in many languages.  Some extended the original
21 Markdown syntax with conventions for footnotes, tables, and
22 other document elements.  Some allowed Markdown documents to be
23 rendered in formats other than HTML.  Websites like Reddit,
24 StackOverflow, and GitHub had millions of people using Markdown.
25 And Markdown started to be used beyond the web, to author books,
26 articles, slide shows, letters, and lecture notes.
27
28 What distinguishes Markdown from many other lightweight markup
29 syntaxes, which are often easier to write, is its readability.
30 As Gruber writes:
31
32 > The overriding design goal for Markdown's formatting syntax is
33 > to make it as readable as possible. The idea is that a
34 > Markdown-formatted document should be publishable as-is, as
35 > plain text, without looking like it's been marked up with tags
36 > or formatting instructions.
37 > (<http://daringfireball.net/projects/markdown/>)
38
39 The point can be illustrated by comparing a sample of
40 [AsciiDoc](http://www.methods.co.nz/asciidoc/) with
41 an equivalent sample of Markdown.  Here is a sample of
42 AsciiDoc from the AsciiDoc manual:
43
44 ```
45 1. List item one.
46 +
47 List item one continued with a second paragraph followed by an
48 Indented block.
49 +
50 .................
51 $ ls *.sh
52 $ mv *.sh ~/tmp
53 .................
54 +
55 List item continued with a third paragraph.
56
57 2. List item two continued with an open block.
58 +
59 --
60 This paragraph is part of the preceding list item.
61
62 a. This list is nested and does not require explicit item
63 continuation.
64 +
65 This paragraph is part of the preceding list item.
66
67 b. List item b.
68
69 This paragraph belongs to item two of the outer list.
70 --
71 ```
72
73 And here is the equivalent in Markdown:
74 ```
75 1.  List item one.
76
77     List item one continued with a second paragraph followed by an
78     Indented block.
79
80         $ ls *.sh
81         $ mv *.sh ~/tmp
82
83     List item continued with a third paragraph.
84
85 2.  List item two continued with an open block.
86
87     This paragraph is part of the preceding list item.
88
89     1. This list is nested and does not require explicit item continuation.
90
91        This paragraph is part of the preceding list item.
92
93     2. List item b.
94
95     This paragraph belongs to item two of the outer list.
96 ```
97
98 The AsciiDoc version is, arguably, easier to write. You don't need
99 to worry about indentation.  But the Markdown version is much easier
100 to read.  The nesting of list items is apparent to the eye in the
101 source, not just in the processed document.
102
103 ## Why is a spec needed?
104
105 John Gruber's [canonical description of Markdown's
106 syntax](http://daringfireball.net/projects/markdown/syntax)
107 does not specify the syntax unambiguously.  Here are some examples of
108 questions it does not answer:
109
110 1.  How much indentation is needed for a sublist?  The spec says that
111     continuation paragraphs need to be indented four spaces, but is
112     not fully explicit about sublists.  It is natural to think that
113     they, too, must be indented four spaces, but `Markdown.pl` does
114     not require that.  This is hardly a "corner case," and divergences
115     between implementations on this issue often lead to surprises for
116     users in real documents. (See [this comment by John
117     Gruber](http://article.gmane.org/gmane.text.markdown.general/1997).)
118
119 2.  Is a blank line needed before a block quote or heading?
120     Most implementations do not require the blank line.  However,
121     this can lead to unexpected results in hard-wrapped text, and
122     also to ambiguities in parsing (note that some implementations
123     put the heading inside the blockquote, while others do not).
124     (John Gruber has also spoken [in favor of requiring the blank
125     lines](http://article.gmane.org/gmane.text.markdown.general/2146).)
126
127 3.  Is a blank line needed before an indented code block?
128     (`Markdown.pl` requires it, but this is not mentioned in the
129     documentation, and some implementations do not require it.)
130
131     ``` markdown
132     paragraph
133         code?
134     ```
135
136 4.  What is the exact rule for determining when list items get
137     wrapped in `<p>` tags?  Can a list be partially "loose" and partially
138     "tight"?  What should we do with a list like this?
139
140     ``` markdown
141     1. one
142
143     2. two
144     3. three
145     ```
146
147     Or this?
148
149     ``` markdown
150     1.  one
151         - a
152
153         - b
154     2.  two
155     ```
156
157     (There are some relevant comments by John Gruber
158     [here](http://article.gmane.org/gmane.text.markdown.general/2554).)
159
160 5.  Can list markers be indented?  Can ordered list markers be right-aligned?
161
162     ``` markdown
163      8. item 1
164      9. item 2
165     10. item 2a
166     ```
167
168 6.  Is this one list with a thematic break in its second item,
169     or two lists separated by a thematic break?
170
171     ``` markdown
172     * a
173     * * * * *
174     * b
175     ```
176
177 7.  When list markers change from numbers to bullets, do we have
178     two lists or one?  (The Markdown syntax description suggests two,
179     but the Perl scripts and many other implementations produce one.)
180
181     ``` markdown
182     1. fee
183     2. fie
184     -  foe
185     -  fum
186     ```
187
188 8.  What are the precedence rules for the markers of inline structure?
189     For example, is the following a valid link, or does the code span
190     take precedence?
191
192     ``` markdown
193     [a backtick (`)](/url) and [another backtick (`)](/url).
194     ```
195
196 9.  What are the precedence rules for markers of emphasis and strong
197     emphasis?  For example, how should the following be parsed?
198
199     ``` markdown
200     *foo *bar* baz*
201     ```
202
203 10. What are the precedence rules between block-level and inline-level
204     structure?  For example, how should the following be parsed?
205
206     ``` markdown
207     - `a long code span can contain a hyphen like this
208       - and it can screw things up`
209     ```
210
211 11. Can list items include section headings?  (`Markdown.pl` does not
212     allow this, but does allow blockquotes to include headings.)
213
214     ``` markdown
215     - # Heading
216     ```
217
218 12. Can list items be empty?
219
220     ``` markdown
221     * a
222     *
223     * b
224     ```
225
226 13. Can link references be defined inside block quotes or list items?
227
228     ``` markdown
229     > Blockquote [foo].
230     >
231     > [foo]: /url
232     ```
233
234 14. If there are multiple definitions for the same reference, which takes
235     precedence?
236
237     ``` markdown
238     [foo]: /url1
239     [foo]: /url2
240
241     [foo][]
242     ```
243
244 In the absence of a spec, early implementers consulted `Markdown.pl`
245 to resolve these ambiguities.  But `Markdown.pl` was quite buggy, and
246 gave manifestly bad results in many cases, so it was not a
247 satisfactory replacement for a spec.
248
249 Because there is no unambiguous spec, implementations have diverged
250 considerably.  As a result, users are often surprised to find that
251 a document that renders one way on one system (say, a GitHub wiki)
252 renders differently on another (say, converting to DocBook using
253 pandoc).  To make matters worse, because nothing in Markdown counts
254 as a "syntax error," the divergence often isn't discovered right away.
255
256 ## About this document
257
258 This document attempts to specify Markdown syntax unambiguously.
259 It contains many examples with side-by-side Markdown and
260 HTML.  These are intended to double as conformance tests.  An
261 accompanying script `spec_tests.py` can be used to run the tests
262 against any Markdown program:
263
264     python test/spec_tests.py --spec spec.txt --program PROGRAM
265
266 Since this document describes how Markdown is to be parsed into
267 an abstract syntax tree, it would have made sense to use an abstract
268 representation of the syntax tree instead of HTML.  But HTML is capable
269 of representing the structural distinctions we need to make, and the
270 choice of HTML for the tests makes it possible to run the tests against
271 an implementation without writing an abstract syntax tree renderer.
272
273 This document is generated from a text file, `spec.txt`, written
274 in Markdown with a small extension for the side-by-side tests.
275 The script `tools/makespec.py` can be used to convert `spec.txt` into
276 HTML or CommonMark (which can then be converted into other formats).
277
278 In the examples, the `→` character is used to represent tabs.
279
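The example layout used throughout this document can be read mechanically. The following is a minimal sketch, not the actual `spec_tests.py` logic: it assumes each test is delimited by a fence of 32 backticks (the opening one followed by ` example`), that a line containing only `.` separates the Markdown from the expected HTML, and that `→` stands for a tab. The name `extract_examples` is hypothetical.

```python
FENCE = "`" * 32  # the fence length used by the examples in this document


def extract_examples(spec_text):
    """Return a list of (markdown, html) pairs found in spec_text."""
    examples = []
    state = "text"  # one of: "text", "markdown", "html"
    markdown, html = [], []
    for line in spec_text.split("\n"):
        if state == "text":
            if line.rstrip() == FENCE + " example":
                state, markdown, html = "markdown", [], []
        elif line.rstrip() == FENCE:
            # Closing fence: store the pair, turning the arrow back into a tab.
            examples.append((
                "\n".join(markdown).replace("→", "\t"),
                "\n".join(html).replace("→", "\t"),
            ))
            state = "text"
        elif state == "markdown" and line == ".":
            state = "html"
        elif state == "markdown":
            markdown.append(line)
        else:
            html.append(line)
    return examples
```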
280 # Preliminaries
281
282 ## Characters and lines
283
284 Any sequence of [characters] is a valid CommonMark
285 document.
286
287 A [character](@) is a Unicode code point.  Although some
288 code points (for example, combining accents) do not correspond to
289 characters in an intuitive sense, all code points count as characters
290 for purposes of this spec.
291
292 This spec does not specify an encoding; it thinks of lines as composed
293 of [characters] rather than bytes.  A conforming parser may be limited
294 to a certain encoding.
295
296 A [line](@) is a sequence of zero or more [characters]
297 other than newline (`U+000A`) or carriage return (`U+000D`),
298 followed by a [line ending] or by the end of file.
299
300 A [line ending](@) is a newline (`U+000A`), a carriage return
301 (`U+000D`) not followed by a newline, or a carriage return and a
302 following newline.
303
304 A line containing no characters, or a line containing only spaces
305 (`U+0020`) or tabs (`U+0009`), is called a [blank line](@).
306
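As an illustration of the definitions above, here is a minimal sketch of splitting a document into lines and detecting blank lines. The helper names `split_lines` and `is_blank` are hypothetical, and the input is assumed to be already decoded into a Python string.

```python
import re

# A line ending is CR LF, a lone CR, or a lone LF. Trying "\r\n" first
# ensures a carriage return followed by a newline counts as one ending.
_LINE_ENDING = re.compile(r"\r\n|\r|\n")


def split_lines(text):
    """Split text into lines, without their line endings."""
    lines = _LINE_ENDING.split(text)
    # A final line ending (or empty input) leaves an empty trailing piece
    # that is not a line of its own.
    if lines and lines[-1] == "":
        lines.pop()
    return lines


def is_blank(line):
    """A blank line has no characters, or only spaces and tabs."""
    return all(ch in " \t" for ch in line)
```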
307 The following definitions of character classes will be used in this spec:
308
309 A [whitespace character](@) is a space
310 (`U+0020`), tab (`U+0009`), newline (`U+000A`), line tabulation (`U+000B`),
311 form feed (`U+000C`), or carriage return (`U+000D`).
312
313 [Whitespace](@) is a sequence of one or more [whitespace
314 characters].
315
316 A [Unicode whitespace character](@) is
317 any code point in the Unicode `Zs` general category, or a tab (`U+0009`),
318 carriage return (`U+000D`), newline (`U+000A`), or form feed
319 (`U+000C`).
320
321 [Unicode whitespace](@) is a sequence of one
322 or more [Unicode whitespace characters].
323
324 A [space](@) is `U+0020`.
325
326 A [non-whitespace character](@) is any character
327 that is not a [whitespace character].
328
329 An [ASCII punctuation character](@)
330 is `!`, `"`, `#`, `$`, `%`, `&`, `'`, `(`, `)`,
331 `*`, `+`, `,`, `-`, `.`, `/`, `:`, `;`, `<`, `=`, `>`, `?`, `@`,
332 `[`, `\`, `]`, `^`, `_`, `` ` ``, `{`, `|`, `}`, or `~`.
333
334 A [punctuation character](@) is an [ASCII
335 punctuation character] or anything in
336 the general Unicode categories  `Pc`, `Pd`, `Pe`, `Pf`, `Pi`, `Po`, or `Ps`.
337
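The character classes above can be sketched as simple predicates. This is an illustrative sketch only: the function names are hypothetical, each function assumes a single-character string, and it relies on Python's `unicodedata` module for general categories and on `string.punctuation`, which contains the same 32 ASCII punctuation characters listed above.

```python
import string
import unicodedata

_PUNCTUATION_CATEGORIES = {"Pc", "Pd", "Pe", "Pf", "Pi", "Po", "Ps"}


def is_whitespace_char(ch):
    # space, tab, newline, line tabulation, form feed, carriage return
    return ch in " \t\n\x0b\x0c\r"


def is_unicode_whitespace_char(ch):
    # Zs category, or tab, carriage return, newline, form feed
    return unicodedata.category(ch) == "Zs" or ch in "\t\r\n\x0c"


def is_ascii_punctuation(ch):
    return ch in string.punctuation


def is_punctuation(ch):
    return (is_ascii_punctuation(ch)
            or unicodedata.category(ch) in _PUNCTUATION_CATEGORIES)
```

Note that the two whitespace classes differ: line tabulation (`U+000B`) is a whitespace character but not a Unicode whitespace character, while a no-break space (`U+00A0`) is the reverse.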
338 ## Tabs
339
340 Tabs in lines are not expanded to [spaces].  However,
341 in contexts where whitespace helps to define block structure,
342 tabs behave as if they were replaced by spaces with a tab stop
343 of 4 characters.
344
345 Thus, for example, a tab can be used instead of four spaces
346 in an indented code block.  (Note, however, that internal
347 tabs are passed through as literal tabs, not expanded to
348 spaces.)
349
350 ```````````````````````````````` example
351 →foo→baz→→bim
352 .
353 <pre><code>foo→baz→→bim
354 </code></pre>
355 ````````````````````````````````
356
357 ```````````````````````````````` example
358   →foo→baz→→bim
359 .
360 <pre><code>foo→baz→→bim
361 </code></pre>
362 ````````````````````````````````
363
364 ```````````````````````````````` example
365     a→a
366     ὐ→a
367 .
368 <pre><code>a→a
369 ὐ→a
370 </code></pre>
371 ````````````````````````````````
372
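The tab-stop rule can be sketched as a column count over the leading whitespace of a line. This is a minimal sketch under the stated rule (tab stop of 4); the name `indent_columns` is hypothetical, and it only measures indentation without rewriting the line, so internal tabs stay literal.

```python
TAB_STOP = 4


def indent_columns(line):
    """Return the width, in columns, of the leading spaces and tabs."""
    column = 0
    for ch in line:
        if ch == " ":
            column += 1
        elif ch == "\t":
            # A tab advances to the next multiple of TAB_STOP.
            column += TAB_STOP - (column % TAB_STOP)
        else:
            break
    return column
```

With this count, a tab alone and two spaces followed by a tab both reach column 4, which is why the first two examples above produce the same indented code block.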
373 In the following example, a continuation paragraph of a list
374 item is indented with a tab; this has exactly the same effect
375 as indentation with four spaces would:
376
377 ```````````````````````````````` example
378   - foo
379
380 →bar
381 .
382 <ul>
383 <li>
384 <p>foo</p>
385 <p>bar</p>
386 </li>
387 </ul>
388 ````````````````````````````````
389
390 ```````````````````````````````` example
391 - foo
392
393 →→bar
394 .
395 <ul>
396 <li>
397 <p>foo</p>
398 <pre><code>  bar
399 </code></pre>
400 </li>
401 </ul>
402 ````````````````````````````````
403
404 Normally the `>` that begins a block quote may be followed
405 optionally by a space, which is not considered part of the
406 content.  In the following case `>` is followed by a tab,
407 which is treated as if it were expanded into three spaces.
408 Since one of these spaces is considered part of the
409 delimiter, `foo` is considered to be indented six spaces
410 inside the block quote context, so we get an indented
411 code block starting with two spaces.
412
413 ```````````````````````````````` example
414 >→→foo
415 .
416 <blockquote>
417 <pre><code>  foo
418 </code></pre>
419 </blockquote>
420 ````````````````````````````````
421
422 ```````````````````````````````` example
423 -→→foo
424 .
425 <ul>
426 <li>
427 <pre><code>  foo
428 </code></pre>
429 </li>
430 </ul>
431 ````````````````````````````````
432
433
434 ```````````````````````````````` example
435     foo
436 →bar
437 .
438 <pre><code>foo
439 bar
440 </code></pre>
441 ````````````````````````````````
442
443 ```````````````````````````````` example
444  - foo
445    - bar
446 → - baz
447 .
448 <ul>
449 <li>foo
450 <ul>
451 <li>bar
452 <ul>
453 <li>baz</li>
454 </ul>
455 </li>
456 </ul>
457 </li>
458 </ul>
459 ````````````````````````````````
460
461 ```````````````````````````````` example
462 #→Foo
463 .
464 <h1>Foo</h1>
465 ````````````````````````````````
466
467 ```````````````````````````````` example
468 *→*→*→
469 .
470 <hr />
471 ````````````````````````````````
472
473
474 ## Insecure characters
475
476 For security reasons, the Unicode character `U+0000` must be replaced
477 with the REPLACEMENT CHARACTER (`U+FFFD`).
478
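A minimal sketch of this replacement, assuming the input is already decoded into a Python string (the function name `replace_insecure` is hypothetical):

```python
def replace_insecure(text):
    """Replace every U+0000 with the REPLACEMENT CHARACTER (U+FFFD)."""
    return text.replace("\u0000", "\ufffd")
```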
479 # Blocks and inlines
480
481 We can think of a document as a sequence of
482 [blocks](@)---structural elements like paragraphs, block
483 quotations, lists, headings, rules, and code blocks.  Some blocks (like
484 block quotes and list items) contain other blocks; others (like
485 headings and paragraphs) contain [inline](@) content---text,
486 links, emphasized text, images, code spans, and so on.
487
488 ## Precedence
489
490 Indicators of block structure always take precedence over indicators
491 of inline structure.  So, for example, the following is a list with
492 two items, not a list with one item containing a code span:
493
494 ```````````````````````````````` example
495 - `one
496 - two`
497 .
498 <ul>
499 <li>`one</li>
500 <li>two`</li>
501 </ul>
502 ````````````````````````````````
503
504
505 This means that parsing can proceed in two steps:  first, the block
506 structure of the document can be discerned; second, text lines inside
507 paragraphs, headings, and other block constructs can be parsed for inline
508 structure.  The second step requires information about link reference
509 definitions that will be available only at the end of the first
510 step.  Note that the first step requires processing lines in sequence,
511 but the second can be parallelized, since the inline parsing of
512 one block element does not affect the inline parsing of any other.
513
514 ## Container blocks and leaf blocks
515
516 We can divide blocks into two types:
517 [container block](@)s,
518 which can contain other blocks, and [leaf block](@)s,
519 which cannot.
520
521 # Leaf blocks
522
523 This section describes the different kinds of leaf block that make up a
524 Markdown document.
525
526 ## Thematic breaks
527
528 A line consisting of 0-3 spaces of indentation, followed by a sequence
529 of three or more matching `-`, `_`, or `*` characters, each followed
530 optionally by any number of spaces, forms a
531 [thematic break](@).
532
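The rule above can be approximated with a single regular expression. This is a sketch, not a normative definition: the name `is_thematic_break` is hypothetical, the line is assumed to have its line ending removed, and tabs are accepted between and after the characters as well as spaces. It checks the line in isolation, so it ignores context-dependent rules such as the precedence of setext heading underlines described later in this section.

```python
import re

# 0-3 spaces, then one of - _ *, then at least two more of the SAME
# character, each optionally preceded by spaces or tabs, then optional
# trailing spaces or tabs.
_THEMATIC_BREAK = re.compile(r"^ {0,3}([-_*])(?:[ \t]*\1){2,}[ \t]*$")


def is_thematic_break(line):
    return _THEMATIC_BREAK.match(line) is not None
```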
533 ```````````````````````````````` example
534 ***
535 ---
536 ___
537 .
538 <hr />
539 <hr />
540 <hr />
541 ````````````````````````````````
542
543
544 Wrong characters:
545
546 ```````````````````````````````` example
547 +++
548 .
549 <p>+++</p>
550 ````````````````````````````````
551
552
553 ```````````````````````````````` example
554 ===
555 .
556 <p>===</p>
557 ````````````````````````````````
558
559
560 Not enough characters:
561
562 ```````````````````````````````` example
563 --
564 **
565 __
566 .
567 <p>--
568 **
569 __</p>
570 ````````````````````````````````
571
572
573 One to three spaces of indentation are allowed:
574
575 ```````````````````````````````` example
576  ***
577   ***
578    ***
579 .
580 <hr />
581 <hr />
582 <hr />
583 ````````````````````````````````
584
585
586 Four spaces is too many:
587
588 ```````````````````````````````` example
589     ***
590 .
591 <pre><code>***
592 </code></pre>
593 ````````````````````````````````
594
595
596 ```````````````````````````````` example
597 Foo
598     ***
599 .
600 <p>Foo
601 ***</p>
602 ````````````````````````````````
603
604
605 More than three characters may be used:
606
607 ```````````````````````````````` example
608 _____________________________________
609 .
610 <hr />
611 ````````````````````````````````
612
613
614 Spaces are allowed between the characters:
615
616 ```````````````````````````````` example
617  - - -
618 .
619 <hr />
620 ````````````````````````````````
621
622
623 ```````````````````````````````` example
624  **  * ** * ** * **
625 .
626 <hr />
627 ````````````````````````````````
628
629
630 ```````````````````````````````` example
631 -     -      -      -
632 .
633 <hr />
634 ````````````````````````````````
635
636
637 Spaces are allowed at the end:
638
639 ```````````````````````````````` example
640 - - - -    
641 .
642 <hr />
643 ````````````````````````````````
644
645
646 However, no other characters may occur in the line:
647
648 ```````````````````````````````` example
649 _ _ _ _ a
650
651 a------
652
653 ---a---
654 .
655 <p>_ _ _ _ a</p>
656 <p>a------</p>
657 <p>---a---</p>
658 ````````````````````````````````
659
660
661 It is required that all of the [non-whitespace characters] be the same.
662 So, this is not a thematic break:
663
664 ```````````````````````````````` example
665  *-*
666 .
667 <p><em>-</em></p>
668 ````````````````````````````````
669
670
671 Thematic breaks do not need blank lines before or after:
672
673 ```````````````````````````````` example
674 - foo
675 ***
676 - bar
677 .
678 <ul>
679 <li>foo</li>
680 </ul>
681 <hr />
682 <ul>
683 <li>bar</li>
684 </ul>
685 ````````````````````````````````
686
687
688 Thematic breaks can interrupt a paragraph:
689
690 ```````````````````````````````` example
691 Foo
692 ***
693 bar
694 .
695 <p>Foo</p>
696 <hr />
697 <p>bar</p>
698 ````````````````````````````````
699
700
701 If a line of dashes that meets the above conditions for being a
702 thematic break could also be interpreted as the underline of a [setext
703 heading], the interpretation as a
704 [setext heading] takes precedence. Thus, for example,
705 this is a setext heading, not a paragraph followed by a thematic break:
706
707 ```````````````````````````````` example
708 Foo
709 ---
710 bar
711 .
712 <h2>Foo</h2>
713 <p>bar</p>
714 ````````````````````````````````
715
716
717 When both a thematic break and a list item are possible
718 interpretations of a line, the thematic break takes precedence:
719
720 ```````````````````````````````` example
721 * Foo
722 * * *
723 * Bar
724 .
725 <ul>
726 <li>Foo</li>
727 </ul>
728 <hr />
729 <ul>
730 <li>Bar</li>
731 </ul>
732 ````````````````````````````````
733
734
735 If you want a thematic break in a list item, use a different bullet:
736
737 ```````````````````````````````` example
738 - Foo
739 - * * *
740 .
741 <ul>
742 <li>Foo</li>
743 <li>
744 <hr />
745 </li>
746 </ul>
747 ````````````````````````````````
748
749
750 ## ATX headings
751
752 An [ATX heading](@)
753 consists of a string of characters, parsed as inline content, between an
754 opening sequence of 1--6 unescaped `#` characters and an optional
755 closing sequence of any number of unescaped `#` characters.
756 The opening sequence of `#` characters must be followed by a
757 [space] or by the end of line. The optional closing sequence of `#`s must be
758 preceded by a [space] and may be followed by spaces only.  The opening
759 `#` character may be indented 0-3 spaces.  The raw contents of the
760 heading are stripped of leading and trailing spaces before being parsed
761 as inline content.  The heading level is equal to the number of `#`
762 characters in the opening sequence.
763
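The definition above can be sketched as a small function that recognizes an ATX heading line and returns its level and raw contents. This is an illustrative sketch, not a reference implementation: the name `parse_atx_heading` is hypothetical, the line is assumed to have its line ending removed, tabs are accepted wherever a space is required, and the returned contents are raw text that still has to be parsed as inline content (so backslash escapes are left in place).

```python
import re

# Opening sequence: 0-3 spaces, 1-6 "#", then whitespace or end of line.
_OPENING = re.compile(r"^ {0,3}(#{1,6})(?:[ \t]+|$)")
# Closing sequence: a run of "#" preceded by whitespace (or at the very
# start of the contents) and followed only by whitespace.
_CLOSING = re.compile(r"(?:^|[ \t]+)#+[ \t]*$")


def parse_atx_heading(line):
    """Return (level, raw_contents), or None if line is not an ATX heading."""
    match = _OPENING.match(line)
    if match is None:
        return None
    level = len(match.group(1))
    contents = _CLOSING.sub("", line[match.end():])
    return level, contents.strip(" \t")
```

A `#` that directly follows a backslash is never preceded by whitespace, so the closing pattern leaves escaped sequences such as `\###` in the contents.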
764 Simple headings:
765
766 ```````````````````````````````` example
767 # foo
768 ## foo
769 ### foo
770 #### foo
771 ##### foo
772 ###### foo
773 .
774 <h1>foo</h1>
775 <h2>foo</h2>
776 <h3>foo</h3>
777 <h4>foo</h4>
778 <h5>foo</h5>
779 <h6>foo</h6>
780 ````````````````````````````````
781
782
783 More than six `#` characters is not a heading:
784
785 ```````````````````````````````` example
786 ####### foo
787 .
788 <p>####### foo</p>
789 ````````````````````````````````
790
791
792 At least one space is required between the `#` characters and the
793 heading's contents, unless the heading is empty.  Note that many
794 implementations currently do not require the space.  However, the
795 space was required by the
796 [original ATX implementation](http://www.aaronsw.com/2002/atx/atx.py),
797 and it helps prevent things like the following from being parsed as
798 headings:
799
800 ```````````````````````````````` example
801 #5 bolt
802
803 #hashtag
804 .
805 <p>#5 bolt</p>
806 <p>#hashtag</p>
807 ````````````````````````````````
808
809
810 This is not a heading, because the first `#` is escaped:
811
812 ```````````````````````````````` example
813 \## foo
814 .
815 <p>## foo</p>
816 ````````````````````````````````
817
818
819 Contents are parsed as inlines:
820
821 ```````````````````````````````` example
822 # foo *bar* \*baz\*
823 .
824 <h1>foo <em>bar</em> *baz*</h1>
825 ````````````````````````````````
826
827
828 Leading and trailing blanks are ignored in parsing inline content:
829
830 ```````````````````````````````` example
831 #                  foo                     
832 .
833 <h1>foo</h1>
834 ````````````````````````````````
835
836
837 One to three spaces of indentation are allowed:
838
839 ```````````````````````````````` example
840  ### foo
841   ## foo
842    # foo
843 .
844 <h3>foo</h3>
845 <h2>foo</h2>
846 <h1>foo</h1>
847 ````````````````````````````````
848
849
850 Four spaces are too many:
851
852 ```````````````````````````````` example
853     # foo
854 .
855 <pre><code># foo
856 </code></pre>
857 ````````````````````````````````
858
859
860 ```````````````````````````````` example
861 foo
862     # bar
863 .
864 <p>foo
865 # bar</p>
866 ````````````````````````````````
867
868
869 A closing sequence of `#` characters is optional:
870
871 ```````````````````````````````` example
872 ## foo ##
873   ###   bar    ###
874 .
875 <h2>foo</h2>
876 <h3>bar</h3>
877 ````````````````````````````````
878
879
880 It need not be the same length as the opening sequence:
881
882 ```````````````````````````````` example
883 # foo ##################################
884 ##### foo ##
885 .
886 <h1>foo</h1>
887 <h5>foo</h5>
888 ````````````````````````````````
889
890
891 Spaces are allowed after the closing sequence:
892
893 ```````````````````````````````` example
894 ### foo ###     
895 .
896 <h3>foo</h3>
897 ````````````````````````````````
898
899
900 A sequence of `#` characters with anything but [spaces] following it
901 is not a closing sequence, but counts as part of the contents of the
902 heading:
903
904 ```````````````````````````````` example
905 ### foo ### b
906 .
907 <h3>foo ### b</h3>
908 ````````````````````````````````
909
910
911 The closing sequence must be preceded by a space:
912
913 ```````````````````````````````` example
914 # foo#
915 .
916 <h1>foo#</h1>
917 ````````````````````````````````
918
919
920 Backslash-escaped `#` characters do not count as part
921 of the closing sequence:
922
923 ```````````````````````````````` example
924 ### foo \###
925 ## foo #\##
926 # foo \#
927 .
928 <h3>foo ###</h3>
929 <h2>foo ###</h2>
930 <h1>foo #</h1>
931 ````````````````````````````````
932
933
934 ATX headings need not be separated from surrounding content by blank
935 lines, and they can interrupt paragraphs:
936
937 ```````````````````````````````` example
938 ****
939 ## foo
940 ****
941 .
942 <hr />
943 <h2>foo</h2>
944 <hr />
945 ````````````````````````````````
946
947
948 ```````````````````````````````` example
949 Foo bar
950 # baz
951 Bar foo
952 .
953 <p>Foo bar</p>
954 <h1>baz</h1>
955 <p>Bar foo</p>
956 ````````````````````````````````
957
958
959 ATX headings can be empty:
960
961 ```````````````````````````````` example
962 ## 
963 #
964 ### ###
965 .
966 <h2></h2>
967 <h1></h1>
968 <h3></h3>
969 ````````````````````````````````
970
971
972 ## Setext headings
973
974 A [setext heading](@) consists of one or more
975 lines of text, each containing at least one [non-whitespace
976 character], with no more than 3 spaces indentation, followed by
977 a [setext heading underline].  The lines of text must be such
978 that, were they not followed by the setext heading underline,
979 they would be interpreted as a paragraph:  they cannot be
980 interpretable as a [code fence], [ATX heading][ATX headings],
981 [block quote][block quotes], [thematic break][thematic breaks],
982 [list item][list items], or [HTML block][HTML blocks].
983
984 A [setext heading underline](@) is a sequence of
985 `=` characters or a sequence of `-` characters, with no more than 3
986 spaces indentation and any number of trailing spaces.  If a line
987 containing a single `-` can be interpreted as an
988 empty [list item][list items], it should be interpreted this way
989 and not as a [setext heading underline].
990
991 The heading is a level 1 heading if `=` characters are used in
992 the [setext heading underline], and a level 2 heading if `-`
993 characters are used.  The contents of the heading are the result
994 of parsing the preceding lines of text as CommonMark inline
995 content.
996
997 In general, a setext heading need not be preceded or followed by a
998 blank line.  However, it cannot interrupt a paragraph, so when a
999 setext heading comes after a paragraph, a blank line is needed between
1000 them.
1001
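The underline rule can be sketched as a line-level check that returns the heading level. This is a minimal sketch under the definitions above: the name `setext_underline_level` is hypothetical, the line is assumed to have its line ending removed, and trailing tabs are accepted as well as spaces. Whether the preceding lines form a paragraph, and whether a single `-` is instead an empty list item, depend on context that this check does not see.

```python
import re

# 0-3 spaces, a run of "=" or a run of "-", then optional trailing
# spaces or tabs. Internal spaces are not allowed.
_SETEXT_UNDERLINE = re.compile(r"^ {0,3}(=+|-+)[ \t]*$")


def setext_underline_level(line):
    """Return 1 for "=", 2 for "-", or None if line is not an underline."""
    match = _SETEXT_UNDERLINE.match(line)
    if match is None:
        return None
    return 1 if match.group(1)[0] == "=" else 2
```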
1002 Simple examples:
1003
1004 ```````````````````````````````` example
1005 Foo *bar*
1006 =========
1007
1008 Foo *bar*
1009 ---------
1010 .
1011 <h1>Foo <em>bar</em></h1>
1012 <h2>Foo <em>bar</em></h2>
1013 ````````````````````````````````
1014
1015
1016 The content of the heading may span more than one line:
1017
1018 ```````````````````````````````` example
1019 Foo *bar
1020 baz*
1021 ====
1022 .
1023 <h1>Foo <em>bar
1024 baz</em></h1>
1025 ````````````````````````````````
1026
1027
1028 The underlining can be any length:
1029
1030 ```````````````````````````````` example
1031 Foo
1032 -------------------------
1033
1034 Foo
1035 =
1036 .
1037 <h2>Foo</h2>
1038 <h1>Foo</h1>
1039 ````````````````````````````````
1040
1041
1042 The heading content can be indented up to three spaces, and need
1043 not line up with the underlining:
1044
1045 ```````````````````````````````` example
1046    Foo
1047 ---
1048
1049   Foo
1050 -----
1051
1052   Foo
1053   ===
1054 .
1055 <h2>Foo</h2>
1056 <h2>Foo</h2>
1057 <h1>Foo</h1>
1058 ````````````````````````````````
1059
1060
1061 Four spaces of indentation are too many:
1062
1063 ```````````````````````````````` example
1064     Foo
1065     ---
1066
1067     Foo
1068 ---
1069 .
1070 <pre><code>Foo
1071 ---
1072
1073 Foo
1074 </code></pre>
1075 <hr />
1076 ````````````````````````````````
1077
1078
1079 The setext heading underline can be indented up to three spaces, and
1080 may have trailing spaces:
1081
1082 ```````````````````````````````` example
1083 Foo
1084    ----      
1085 .
1086 <h2>Foo</h2>
1087 ````````````````````````````````
1088
1089
1090 Four spaces are too many:
1091
1092 ```````````````````````````````` example
1093 Foo
1094     ---
1095 .
1096 <p>Foo
1097 ---</p>
1098 ````````````````````````````````
1099
1100
1101 The setext heading underline cannot contain internal spaces:
1102
1103 ```````````````````````````````` example
1104 Foo
1105 = =
1106
1107 Foo
1108 --- -
1109 .
1110 <p>Foo
1111 = =</p>
1112 <p>Foo</p>
1113 <hr />
1114 ````````````````````````````````
1115
1116
1117 Trailing spaces in the content line do not cause a line break:
1118
1119 ```````````````````````````````` example
1120 Foo  
1121 -----
1122 .
1123 <h2>Foo</h2>
1124 ````````````````````````````````
1125
1126
1127 Nor does a backslash at the end:
1128
1129 ```````````````````````````````` example
1130 Foo\
1131 ----
1132 .
1133 <h2>Foo\</h2>
1134 ````````````````````````````````
1135
1136
1137 Since indicators of block structure take precedence over
1138 indicators of inline structure, the following are setext headings:
1139
1140 ```````````````````````````````` example
1141 `Foo
1142 ----
1143 `
1144
1145 <a title="a lot
1146 ---
1147 of dashes"/>
1148 .
1149 <h2>`Foo</h2>
1150 <p>`</p>
1151 <h2>&lt;a title=&quot;a lot</h2>
1152 <p>of dashes&quot;/&gt;</p>
1153 ````````````````````````````````
1154
1155
1156 The setext heading underline cannot be a [lazy continuation
1157 line] in a list item or block quote:
1158
1159 ```````````````````````````````` example
1160 > Foo
1161 ---
1162 .
1163 <blockquote>
1164 <p>Foo</p>
1165 </blockquote>
1166 <hr />
1167 ````````````````````````````````
1168
1169
1170 ```````````````````````````````` example
1171 > foo
1172 bar
1173 ===
1174 .
1175 <blockquote>
1176 <p>foo
1177 bar
1178 ===</p>
1179 </blockquote>
1180 ````````````````````````````````
1181
1182
1183 ```````````````````````````````` example
1184 - Foo
1185 ---
1186 .
1187 <ul>
1188 <li>Foo</li>
1189 </ul>
1190 <hr />
1191 ````````````````````````````````
1192
1193
1194 A blank line is needed between a paragraph and a following
1195 setext heading, since otherwise the paragraph becomes part
1196 of the heading's content:
1197
1198 ```````````````````````````````` example
1199 Foo
1200 Bar
1201 ---
1202 .
1203 <h2>Foo
1204 Bar</h2>
1205 ````````````````````````````````
1206
1207
1208 But in general a blank line is not required before or after
1209 setext headings:
1210
1211 ```````````````````````````````` example
1212 ---
1213 Foo
1214 ---
1215 Bar
1216 ---
1217 Baz
1218 .
1219 <hr />
1220 <h2>Foo</h2>
1221 <h2>Bar</h2>
1222 <p>Baz</p>
1223 ````````````````````````````````
1224
1225
1226 Setext headings cannot be empty:
1227
1228 ```````````````````````````````` example
1229
1230 ====
1231 .
1232 <p>====</p>
1233 ````````````````````````````````
1234
1235
1236 Setext heading text lines must not be interpretable as block
1237 constructs other than paragraphs.  So, the line of dashes
1238 in these examples gets interpreted as a thematic break:
1239
1240 ```````````````````````````````` example
1241 ---
1242 ---
1243 .
1244 <hr />
1245 <hr />
1246 ````````````````````````````````
1247
1248
1249 ```````````````````````````````` example
1250 - foo
1251 -----
1252 .
1253 <ul>
1254 <li>foo</li>
1255 </ul>
1256 <hr />
1257 ````````````````````````````````
1258
1259
1260 ```````````````````````````````` example
1261     foo
1262 ---
1263 .
1264 <pre><code>foo
1265 </code></pre>
1266 <hr />
1267 ````````````````````````````````
1268
1269
1270 ```````````````````````````````` example
1271 > foo
1272 -----
1273 .
1274 <blockquote>
1275 <p>foo</p>
1276 </blockquote>
1277 <hr />
1278 ````````````````````````````````
1279
1280
1281 If you want a heading with `> foo` as its literal text, you can
1282 use backslash escapes:
1283
1284 ```````````````````````````````` example
1285 \> foo
1286 ------
1287 .
1288 <h2>&gt; foo</h2>
1289 ````````````````````````````````
1290
1291
1292 **Compatibility note:**  Most existing Markdown implementations
1293 do not allow the text of setext headings to span multiple lines.
1294 But there is no consensus about how to interpret
1295
1296 ``` markdown
1297 Foo
1298 bar
1299 ---
1300 baz
1301 ```
1302
1303 One can find four different interpretations:
1304
1305 1. paragraph "Foo", heading "bar", paragraph "baz"
1306 2. paragraph "Foo bar", thematic break, paragraph "baz"
1307 3. paragraph "Foo bar --- baz"
1308 4. heading "Foo bar", paragraph "baz"
1309
1310 We find interpretation 4 most natural; it also increases
1311 the expressive power of CommonMark by allowing
1312 multiline headings.  Authors who want interpretation 1 can
1313 put a blank line after the first paragraph:
1314
1315 ```````````````````````````````` example
1316 Foo
1317
1318 bar
1319 ---
1320 baz
1321 .
1322 <p>Foo</p>
1323 <h2>bar</h2>
1324 <p>baz</p>
1325 ````````````````````````````````
1326
1327
1328 Authors who want interpretation 2 can put blank lines around
1329 the thematic break,
1330
1331 ```````````````````````````````` example
1332 Foo
1333 bar
1334
1335 ---
1336
1337 baz
1338 .
1339 <p>Foo
1340 bar</p>
1341 <hr />
1342 <p>baz</p>
1343 ````````````````````````````````
1344
1345
1346 or use a thematic break that cannot count as a [setext heading
1347 underline], such as
1348
1349 ```````````````````````````````` example
1350 Foo
1351 bar
1352 * * *
1353 baz
1354 .
1355 <p>Foo
1356 bar</p>
1357 <hr />
1358 <p>baz</p>
1359 ````````````````````````````````
1360
1361
1362 Authors who want interpretation 3 can use backslash escapes:
1363
1364 ```````````````````````````````` example
1365 Foo
1366 bar
1367 \---
1368 baz
1369 .
1370 <p>Foo
1371 bar
1372 ---
1373 baz</p>
1374 ````````````````````````````````
1375
1376
1377 ## Indented code blocks
1378
1379 An [indented code block](@) is composed of one or more
1380 [indented chunks] separated by blank lines.
1381 An [indented chunk](@) is a sequence of non-blank lines,
1382 each indented four or more spaces. The contents of the code block are
1383 the literal contents of the lines, including trailing
1384 [line endings], minus four spaces of indentation.
1385 An indented code block has no [info string].
1386
1387 An indented code block cannot interrupt a paragraph, so there must be
1388 a blank line between a paragraph and a following indented code block.
1389 (A blank line is not needed, however, between a code block and a following
1390 paragraph.)
1391
1392 ```````````````````````````````` example
1393     a simple
1394       indented code block
1395 .
1396 <pre><code>a simple
1397   indented code block
1398 </code></pre>
1399 ````````````````````````````````
1400
1401
1402 If there is any ambiguity between an interpretation of indentation
1403 as a code block and as indicating that material belongs to a [list
1404 item][list items], the list item interpretation takes precedence:
1405
1406 ```````````````````````````````` example
1407   - foo
1408
1409     bar
1410 .
1411 <ul>
1412 <li>
1413 <p>foo</p>
1414 <p>bar</p>
1415 </li>
1416 </ul>
1417 ````````````````````````````````
1418
1419
1420 ```````````````````````````````` example
1421 1.  foo
1422
1423     - bar
1424 .
1425 <ol>
1426 <li>
1427 <p>foo</p>
1428 <ul>
1429 <li>bar</li>
1430 </ul>
1431 </li>
1432 </ol>
1433 ````````````````````````````````
1434
1435
1436
1437 The contents of a code block are literal text, and do not get parsed
1438 as Markdown:
1439
1440 ```````````````````````````````` example
1441     <a/>
1442     *hi*
1443
1444     - one
1445 .
1446 <pre><code>&lt;a/&gt;
1447 *hi*
1448
1449 - one
1450 </code></pre>
1451 ````````````````````````````````
1452
1453
1454 Here we have three chunks separated by blank lines:
1455
1456 ```````````````````````````````` example
1457     chunk1
1458
1459     chunk2
1460   
1461  
1462  
1463     chunk3
1464 .
1465 <pre><code>chunk1
1466
1467 chunk2
1468
1469
1470
1471 chunk3
1472 </code></pre>
1473 ````````````````````````````````
1474
1475
1476 Any initial spaces beyond four will be included in the content, even
1477 in interior blank lines:
1478
1479 ```````````````````````````````` example
1480     chunk1
1481       
1482       chunk2
1483 .
1484 <pre><code>chunk1
1485   
1486   chunk2
1487 </code></pre>
1488 ````````````````````````````````
1489
1490
1491 An indented code block cannot interrupt a paragraph.  (This
1492 allows hanging indents and the like.)
1493
1494 ```````````````````````````````` example
1495 Foo
1496     bar
1497
1498 .
1499 <p>Foo
1500 bar</p>
1501 ````````````````````````````````
1502
1503
1504 However, any non-blank line with fewer than four leading spaces ends
1505 the code block immediately.  So a paragraph may occur immediately
1506 after indented code:
1507
1508 ```````````````````````````````` example
1509     foo
1510 bar
1511 .
1512 <pre><code>foo
1513 </code></pre>
1514 <p>bar</p>
1515 ````````````````````````````````
1516
1517
1518 And indented code can occur immediately before and after other kinds of
1519 blocks:
1520
1521 ```````````````````````````````` example
1522 # Heading
1523     foo
1524 Heading
1525 ------
1526     foo
1527 ----
1528 .
1529 <h1>Heading</h1>
1530 <pre><code>foo
1531 </code></pre>
1532 <h2>Heading</h2>
1533 <pre><code>foo
1534 </code></pre>
1535 <hr />
1536 ````````````````````````````````
1537
1538
1539 The first line can be indented more than four spaces:
1540
1541 ```````````````````````````````` example
1542         foo
1543     bar
1544 .
1545 <pre><code>    foo
1546 bar
1547 </code></pre>
1548 ````````````````````````````````
1549
1550
1551 Blank lines preceding or following an indented code block
1552 are not included in it:
1553
1554 ```````````````````````````````` example
1555
1556     
1557     foo
1558     
1559
1560 .
1561 <pre><code>foo
1562 </code></pre>
1563 ````````````````````````````````
1564
1565
1566 Trailing spaces are included in the code block's content:
1567
1568 ```````````````````````````````` example
1569     foo  
1570 .
1571 <pre><code>foo  
1572 </code></pre>
1573 ````````````````````````````````
1574
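The rules above for the content of an indented code block can be sketched in a few lines of Python.  This is only an illustration, not part of the spec: the function name `indented_code_content` is hypothetical, the input is assumed to be the block's lines without their line endings, and tabs are not handled.

```python
def indented_code_content(lines):
    """Return the literal content of an indented code block.

    `lines` are the raw lines of the block, without line endings.
    Hypothetical helper for illustration; tabs are not handled.
    """
    # Blank lines preceding or following the block are not part of it.
    start, end = 0, len(lines)
    while start < end and not lines[start].strip():
        start += 1
    while end > start and not lines[end - 1].strip():
        end -= 1

    content = []
    for line in lines[start:end]:
        # Remove four spaces of indentation.  An interior blank line with
        # fewer than four spaces simply becomes empty; spaces beyond four
        # are kept, as are trailing spaces.
        indent = len(line) - len(line.lstrip(" "))
        content.append(line[min(indent, 4):])
    # Every line keeps its line ending in the block's content.
    return "".join(text + "\n" for text in content)
```

Applied to the lines `    chunk1`, a line of six spaces, and `      chunk2`, the sketch returns `chunk1`, a line of two spaces, and `  chunk2`, each followed by a line ending, which agrees with the example above.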
1575
1576
1577 ## Fenced code blocks
1578
1579 A [code fence](@) is a sequence
1580 of at least three consecutive backtick characters (`` ` ``) or
1581 tildes (`~`).  (Tildes and backticks cannot be mixed.)
1582 A [fenced code block](@)
1583 begins with a code fence, indented no more than three spaces.
1584
1585 The line with the opening code fence may optionally contain some text
1586 following the code fence; this is trimmed of leading and trailing
1587 spaces and called the [info string](@).
1588 The [info string] may not contain any backtick
1589 characters.  (The reason for this restriction is that otherwise
1590 some inline code would be incorrectly interpreted as the
1591 beginning of a fenced code block.)
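
As a rough sketch of the two paragraphs above, an opening code fence and its info string might be recognized as follows.  The names `OPENING_FENCE` and `parse_opening_fence` are hypothetical, and the line is assumed to be passed without its line ending.

```python
import re

# Up to three spaces, then three or more backticks or three or more tildes
# (never mixed), then the rest of the line as a candidate info string.
OPENING_FENCE = re.compile(r"^( {0,3})(`{3,}|~{3,})(.*)$")


def parse_opening_fence(line):
    """Return (indent, fence, info_string), or None if no fence opens here."""
    match = OPENING_FENCE.match(line)
    if match is None:
        return None
    indent, fence, rest = match.groups()
    # The info string may not contain any backtick characters.
    if "`" in rest:
        return None
    # The info string is trimmed of leading and trailing spaces.
    return len(indent), fence, rest.strip(" ")
```

For instance, `parse_opening_fence("~~~~    ruby startline=3")` yields an indent of 0, the fence `~~~~`, and the info string `ruby startline=3`, while a line indented four spaces yields `None`.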
1592
1593 The content of the code block consists of all subsequent lines, until
1594 a closing [code fence] of the same type as the code block
1595 began with (backticks or tildes), and with at least as many backticks
1596 or tildes as the opening code fence.  If the leading code fence is
1597 indented N spaces, then up to N spaces of indentation are removed from
1598 each line of the content (if present).  (If a content line is not
1599 indented, it is preserved unchanged.  If it is indented less than N
1600 spaces, all of the indentation is removed.)
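
The indentation rule in the preceding paragraph amounts to removing at most N leading spaces from each content line.  A minimal sketch, with the hypothetical name `strip_fence_indent`:

```python
def strip_fence_indent(line, fence_indent):
    """Remove up to `fence_indent` leading spaces from a content line."""
    leading = len(line) - len(line.lstrip(" "))
    # A line indented less than the fence loses all of its indentation;
    # an unindented line is returned unchanged.
    return line[min(leading, fence_indent):]
```

With an opening fence indented three spaces, a content line indented four spaces keeps one space, and a content line indented two spaces keeps none.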
1601
1602 The closing code fence may be indented up to three spaces, and may be
1603 followed only by spaces, which are ignored.  If the end of the
1604 containing block (or document) is reached and no closing code fence
1605 has been found, the code block contains all of the lines after the
1606 opening code fence until the end of the containing block (or
1607 document).  (An alternative spec would require backtracking in the
1608 event that a closing code fence is not found.  But this makes parsing
1609 much less efficient, and there seems to be no real down side to the
1610 behavior described here.)
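
The closing-fence conditions described above (same character, at least as long as the opening fence, at most three spaces of indentation, only spaces afterwards) can be checked with a single pattern.  The function name `is_closing_fence` is hypothetical, and the line is assumed to have no line ending.

```python
import re


def is_closing_fence(line, opening_fence):
    """Report whether `line` closes a block opened by `opening_fence`.

    `opening_fence` is the run of backticks or tildes that opened the
    block, for example "```" or "~~~~".
    """
    char = re.escape(opening_fence[0])
    # 0-3 spaces, at least as many fence characters as the opening fence,
    # then nothing but spaces.
    pattern = r" {0,3}%s{%d,} *" % (char, len(opening_fence))
    return re.fullmatch(pattern, line) is not None
```

A parser built along these lines would scan forward from the opening fence and, if no line satisfies the check, simply treat every remaining line of the containing block as content, so no backtracking is needed.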
1611
1612 A fenced code block may interrupt a paragraph, and does not require
1613 a blank line either before or after.
1614
1615 The content of a fenced code block is treated as literal text, not parsed
1616 as inlines.  The first word of the [info string] is typically used to
1617 specify the language of the code sample, and rendered in the `class`
1618 attribute of the `code` tag.  However, this spec does not mandate any
1619 particular treatment of the [info string].
1620
1621 Here is a simple example with backticks:
1622
1623 ```````````````````````````````` example
1624 ```
1625 <
1626  >
1627 ```
1628 .
1629 <pre><code>&lt;
1630  &gt;
1631 </code></pre>
1632 ````````````````````````````````
1633
1634
1635 With tildes:
1636
1637 ```````````````````````````````` example
1638 ~~~
1639 <
1640  >
1641 ~~~
1642 .
1643 <pre><code>&lt;
1644  &gt;
1645 </code></pre>
1646 ````````````````````````````````
1647
1648
1649 The closing code fence must use the same character as the opening
1650 fence:
1651
1652 ```````````````````````````````` example
1653 ```
1654 aaa
1655 ~~~
1656 ```
1657 .
1658 <pre><code>aaa
1659 ~~~
1660 </code></pre>
1661 ````````````````````````````````
1662
1663
1664 ```````````````````````````````` example
1665 ~~~
1666 aaa
1667 ```
1668 ~~~
1669 .
1670 <pre><code>aaa
1671 ```
1672 </code></pre>
1673 ````````````````````````````````
1674
1675
1676 The closing code fence must be at least as long as the opening fence:
1677
1678 ```````````````````````````````` example
1679 ````
1680 aaa
1681 ```
1682 ``````
1683 .
1684 <pre><code>aaa
1685 ```
1686 </code></pre>
1687 ````````````````````````````````
1688
1689
1690 ```````````````````````````````` example
1691 ~~~~
1692 aaa
1693 ~~~
1694 ~~~~
1695 .
1696 <pre><code>aaa
1697 ~~~
1698 </code></pre>
1699 ````````````````````````````````
1700
1701
1702 Unclosed code blocks are closed by the end of the document
1703 (or the enclosing [block quote][block quotes] or [list item][list items]):
1704
1705 ```````````````````````````````` example
1706 ```
1707 .
1708 <pre><code></code></pre>
1709 ````````````````````````````````
1710
1711
1712 ```````````````````````````````` example
1713 `````
1714
1715 ```
1716 aaa
1717 .
1718 <pre><code>
1719 ```
1720 aaa
1721 </code></pre>
1722 ````````````````````````````````
1723
1724
1725 ```````````````````````````````` example
1726 > ```
1727 > aaa
1728
1729 bbb
1730 .
1731 <blockquote>
1732 <pre><code>aaa
1733 </code></pre>
1734 </blockquote>
1735 <p>bbb</p>
1736 ````````````````````````````````
1737
1738
1739 A code block can have all empty lines as its content:
1740
1741 ```````````````````````````````` example
1742 ```
1743
1744   
1745 ```
1746 .
1747 <pre><code>
1748   
1749 </code></pre>
1750 ````````````````````````````````
1751
1752
1753 A code block can be empty:
1754
1755 ```````````````````````````````` example
1756 ```
1757 ```
1758 .
1759 <pre><code></code></pre>
1760 ````````````````````````````````
1761
1762
1763 Fences can be indented.  If the opening fence is indented,
1764 content lines will have equivalent opening indentation removed,
1765 if present:
1766
1767 ```````````````````````````````` example
1768  ```
1769  aaa
1770 aaa
1771 ```
1772 .
1773 <pre><code>aaa
1774 aaa
1775 </code></pre>
1776 ````````````````````````````````
1777
1778
1779 ```````````````````````````````` example
1780   ```
1781 aaa
1782   aaa
1783 aaa
1784   ```
1785 .
1786 <pre><code>aaa
1787 aaa
1788 aaa
1789 </code></pre>
1790 ````````````````````````````````
1791
1792
1793 ```````````````````````````````` example
1794    ```
1795    aaa
1796     aaa
1797   aaa
1798    ```
1799 .
1800 <pre><code>aaa
1801  aaa
1802 aaa
1803 </code></pre>
1804 ````````````````````````````````
1805
1806
1807 Four spaces of indentation produce an indented code block:
1808
1809 ```````````````````````````````` example
1810     ```
1811     aaa
1812     ```
1813 .
1814 <pre><code>```
1815 aaa
1816 ```
1817 </code></pre>
1818 ````````````````````````````````
1819
1820
1821 Closing fences may be indented by 0-3 spaces, and their indentation
1822 need not match that of the opening fence:
1823
1824 ```````````````````````````````` example
1825 ```
1826 aaa
1827   ```
1828 .
1829 <pre><code>aaa
1830 </code></pre>
1831 ````````````````````````````````
1832
1833
1834 ```````````````````````````````` example
1835    ```
1836 aaa
1837   ```
1838 .
1839 <pre><code>aaa
1840 </code></pre>
1841 ````````````````````````````````
1842
1843
1844 This is not a closing fence, because it is indented 4 spaces:
1845
1846 ```````````````````````````````` example
1847 ```
1848 aaa
1849     ```
1850 .
1851 <pre><code>aaa
1852     ```
1853 </code></pre>
1854 ````````````````````````````````
1855
1856
1857
1858 Code fences (opening and closing) cannot contain internal spaces:
1859
1860 ```````````````````````````````` example
1861 ``` ```
1862 aaa
1863 .
1864 <p><code></code>
1865 aaa</p>
1866 ````````````````````````````````
1867
1868
1869 ```````````````````````````````` example
1870 ~~~~~~
1871 aaa
1872 ~~~ ~~
1873 .
1874 <pre><code>aaa
1875 ~~~ ~~
1876 </code></pre>
1877 ````````````````````````````````
1878
1879
1880 Fenced code blocks can interrupt paragraphs, and can be followed
1881 directly by paragraphs, without a blank line between:
1882
1883 ```````````````````````````````` example
1884 foo
1885 ```
1886 bar
1887 ```
1888 baz
1889 .
1890 <p>foo</p>
1891 <pre><code>bar
1892 </code></pre>
1893 <p>baz</p>
1894 ````````````````````````````````
1895
1896
1897 Other blocks can also occur before and after fenced code blocks
1898 without an intervening blank line:
1899
1900 ```````````````````````````````` example
1901 foo
1902 ---
1903 ~~~
1904 bar
1905 ~~~
1906 # baz
1907 .
1908 <h2>foo</h2>
1909 <pre><code>bar
1910 </code></pre>
1911 <h1>baz</h1>
1912 ````````````````````````````````
1913
1914
1915 An [info string] can be provided after the opening code fence.
1916 Leading and trailing spaces will be stripped, and the first word, prefixed
1917 with `language-`, is used as the value for the `class` attribute of the
1918 `code` element within the enclosing `pre` element.
1919
1920 ```````````````````````````````` example
1921 ```ruby
1922 def foo(x)
1923   return 3
1924 end
1925 ```
1926 .
1927 <pre><code class="language-ruby">def foo(x)
1928   return 3
1929 end
1930 </code></pre>
1931 ````````````````````````````````
1932
1933
1934 ```````````````````````````````` example
1935 ~~~~    ruby startline=3 $%@#$
1936 def foo(x)
1937   return 3
1938 end
1939 ~~~~~~~
1940 .
1941 <pre><code class="language-ruby">def foo(x)
1942   return 3
1943 end
1944 </code></pre>
1945 ````````````````````````````````
1946
1947
1948 ```````````````````````````````` example
1949 ````;
1950 ````
1951 .
1952 <pre><code class="language-;"></code></pre>
1953 ````````````````````````````````
1954
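The treatment of the [info string] shown in these examples can be sketched as below.  Recall that this spec does not mandate any particular treatment, so this is only one possible rendering; the name `code_open_tag` is hypothetical, and escaping the class value with `html.escape` is an assumption of the sketch.

```python
import html


def code_open_tag(info_string):
    """Build the opening <code> tag for a fenced code block."""
    words = info_string.split()
    if not words:
        # No info string: no class attribute.
        return "<code>"
    # Only the first word is used, prefixed with "language-".
    return '<code class="language-%s">' % html.escape(words[0], quote=True)
```

Given the info string `ruby startline=3 $%@#$`, the sketch produces `<code class="language-ruby">`; given an empty info string it produces a bare `<code>`.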
1955
1956 [Info strings] for backtick code blocks cannot contain backticks:
1957
1958 ```````````````````````````````` example
1959 ``` aa ```
1960 foo
1961 .
1962 <p><code>aa</code>
1963 foo</p>
1964 ````````````````````````````````
1965
1966
1967 Closing code fences cannot have [info strings]:
1968
1969 ```````````````````````````````` example
1970 ```
1971 ``` aaa
1972 ```
1973 .
1974 <pre><code>``` aaa
1975 </code></pre>
1976 ````````````````````````````````
1977
1978
1979
1980 ## HTML blocks
1981
1982 An [HTML block](@) is a group of lines that is treated
1983 as raw HTML (and will not be escaped in HTML output).
1984
1985 There are seven kinds of [HTML block], which can be defined
1986 by their start and end conditions.  The block begins with a line that
1987 meets a [start condition](@) (after up to three spaces of
1988 optional indentation).  It ends with the first subsequent line that
1989 meets a matching [end condition](@), or with the last line of
1990 the document (or other [container block]) if no line is encountered that
1991 meets the [end condition].  If the first line meets both the [start condition]
1992 and the [end condition], the block will contain just that line.
1993
1994 1.  **Start condition:**  line begins with the string `<script`,
1995 `<pre`, or `<style` (case-insensitive), followed by whitespace,
1996 the string `>`, or the end of the line.\
1997 **End condition:**  line contains an end tag
1998 `</script>`, `</pre>`, or `</style>` (case-insensitive; it
1999 need not match the start tag).
2000
2001 2.  **Start condition:** line begins with the string `<!--`.\
2002 **End condition:**  line contains the string `-->`.
2003
2004 3.  **Start condition:** line begins with the string `<?`.\
2005 **End condition:** line contains the string `?>`.
2006
2007 4.  **Start condition:** line begins with the string `<!`
2008 followed by an uppercase ASCII letter.\
2009 **End condition:** line contains the character `>`.
2010
2011 5.  **Start condition:**  line begins with the string
2012 `<![CDATA[`.\
2013 **End condition:** line contains the string `]]>`.
2014
2015 6.  **Start condition:** line begins with the string `<` or `</`
2016 followed by one of the strings (case-insensitive) `address`,
2017 `article`, `aside`, `base`, `basefont`, `blockquote`, `body`,
2018 `caption`, `center`, `col`, `colgroup`, `dd`, `details`, `dialog`,
2019 `dir`, `div`, `dl`, `dt`, `fieldset`, `figcaption`, `figure`,
2020 `footer`, `form`, `frame`, `frameset`,
2021 `h1`, `h2`, `h3`, `h4`, `h5`, `h6`, `head`, `header`, `hr`,
2022 `html`, `iframe`, `legend`, `li`, `link`, `main`, `menu`, `menuitem`,
2023 `meta`, `nav`, `noframes`, `ol`, `optgroup`, `option`, `p`, `param`,
2024 `section`, `source`, `summary`, `table`, `tbody`, `td`,
2025 `tfoot`, `th`, `thead`, `title`, `tr`, `track`, `ul`, followed
2026 by [whitespace], the end of the line, the string `>`, or
2027 the string `/>`.\
2028 **End condition:** line is followed by a [blank line].
2029
2030 7.  **Start condition:**  line begins with a complete [open tag]
2031 or [closing tag] (with any [tag name] other than `script`,
2032 `style`, or `pre`) followed only by [whitespace]
2033 or the end of the line.\
2034 **End condition:** line is followed by a [blank line].
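
The start conditions for types 1 through 6 can be approximated with one regular expression each, tried in order.  This is a sketch under stated assumptions: the names `START_CONDITIONS` and `html_block_type` are hypothetical, the line is passed without its line ending, Python's `\s` stands in for [whitespace], and type 7 is left out because it needs the full [open tag] and [closing tag] grammar.

```python
import re

# Tag names listed under start condition 6.
BLOCK_TAGS = (
    "address|article|aside|base|basefont|blockquote|body|caption|center|"
    "col|colgroup|dd|details|dialog|dir|div|dl|dt|fieldset|figcaption|"
    "figure|footer|form|frame|frameset|h1|h2|h3|h4|h5|h6|head|header|hr|"
    "html|iframe|legend|li|link|main|menu|menuitem|meta|nav|noframes|ol|"
    "optgroup|option|p|param|section|source|summary|table|tbody|td|tfoot|"
    "th|thead|title|tr|track|ul"
)

START_CONDITIONS = [
    (1, re.compile(r"<(script|pre|style)(\s|>|$)", re.IGNORECASE)),
    (2, re.compile(r"<!--")),
    (3, re.compile(r"<\?")),
    (4, re.compile(r"<![A-Z]")),
    (5, re.compile(r"<!\[CDATA\[")),
    (6, re.compile(r"</?(%s)(\s|/?>|$)" % BLOCK_TAGS, re.IGNORECASE)),
]


def html_block_type(line):
    """Return the HTML block type (1-6) that `line` starts, or None."""
    stripped = line.lstrip(" ")
    # At most three spaces of indentation are allowed.
    if len(line) - len(stripped) > 3:
        return None
    for kind, pattern in START_CONDITIONS:
        if pattern.match(stripped):
            return kind
    # Type 7 (or no HTML block at all) is not decided by this sketch.
    return None
```

For example, `html_block_type("<DIV CLASS=\"foo\">")` gives 6 and `html_block_type("<!DOCTYPE html>")` gives 4, while `<a href="foo">` gives `None` because `a` is not in the list for type 6.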
2035
2036 All types of [HTML blocks] except type 7 may interrupt
2037 a paragraph.  Blocks of type 7 may not interrupt a paragraph.
2038 (This restriction is intended to prevent unwanted interpretation
2039 of long tags inside a wrapped paragraph as starting HTML blocks.)
2040
2041 Some simple examples follow.  Here are some basic HTML blocks
2042 of type 6:
2043
2044 ```````````````````````````````` example
2045 <table>
2046   <tr>
2047     <td>
2048            hi
2049     </td>
2050   </tr>
2051 </table>
2052
2053 okay.
2054 .
2055 <table>
2056   <tr>
2057     <td>
2058            hi
2059     </td>
2060   </tr>
2061 </table>
2062 <p>okay.</p>
2063 ````````````````````````````````
2064
2065
2066 ```````````````````````````````` example
2067  <div>
2068   *hello*
2069          <foo><a>
2070 .
2071  <div>
2072   *hello*
2073          <foo><a>
2074 ````````````````````````````````
2075
2076
2077 A block can also start with a closing tag:
2078
2079 ```````````````````````````````` example
2080 </div>
2081 *foo*
2082 .
2083 </div>
2084 *foo*
2085 ````````````````````````````````
2086
2087
2088 Here we have two HTML blocks with a Markdown paragraph between them:
2089
2090 ```````````````````````````````` example
2091 <DIV CLASS="foo">
2092
2093 *Markdown*
2094
2095 </DIV>
2096 .
2097 <DIV CLASS="foo">
2098 <p><em>Markdown</em></p>
2099 </DIV>
2100 ````````````````````````````````
2101
2102
2103 The tag on the first line can be partial, as long
2104 as it is split where there would be whitespace:
2105
2106 ```````````````````````````````` example
2107 <div id="foo"
2108   class="bar">
2109 </div>
2110 .
2111 <div id="foo"
2112   class="bar">
2113 </div>
2114 ````````````````````````````````
2115
2116
2117 ```````````````````````````````` example
2118 <div id="foo" class="bar
2119   baz">
2120 </div>
2121 .
2122 <div id="foo" class="bar
2123   baz">
2124 </div>
2125 ````````````````````````````````
2126
2127
2128 An open tag need not be closed:
2129 ```````````````````````````````` example
2130 <div>
2131 *foo*
2132
2133 *bar*
2134 .
2135 <div>
2136 *foo*
2137 <p><em>bar</em></p>
2138 ````````````````````````````````
2139
2140
2141
2142 A partial tag need not even be completed (garbage
2143 in, garbage out):
2144
2145 ```````````````````````````````` example
2146 <div id="foo"
2147 *hi*
2148 .
2149 <div id="foo"
2150 *hi*
2151 ````````````````````````````````
2152
2153
2154 ```````````````````````````````` example
2155 <div class
2156 foo
2157 .
2158 <div class
2159 foo
2160 ````````````````````````````````
2161
2162
2163 The initial tag doesn't even need to be a valid
2164 tag, as long as it starts like one:
2165
2166 ```````````````````````````````` example
2167 <div *???-&&&-<---
2168 *foo*
2169 .
2170 <div *???-&&&-<---
2171 *foo*
2172 ````````````````````````````````
2173
2174
2175 In type 6 blocks, the initial tag need not be on a line by
2176 itself:
2177
2178 ```````````````````````````````` example
2179 <div><a href="bar">*foo*</a></div>
2180 .
2181 <div><a href="bar">*foo*</a></div>
2182 ````````````````````````````````
2183
2184
2185 ```````````````````````````````` example
2186 <table><tr><td>
2187 foo
2188 </td></tr></table>
2189 .
2190 <table><tr><td>
2191 foo
2192 </td></tr></table>
2193 ````````````````````````````````
2194
2195
2196 Everything until the next blank line or end of document
2197 gets included in the HTML block.  So, in the following
2198 example, what looks like a Markdown code block
2199 is actually part of the HTML block, which continues until a blank
2200 line or the end of the document is reached:
2201
2202 ```````````````````````````````` example
2203 <div></div>
2204 ``` c
2205 int x = 33;
2206 ```
2207 .
2208 <div></div>
2209 ``` c
2210 int x = 33;
2211 ```
2212 ````````````````````````````````
2213
2214
2215 To start an [HTML block] with a tag that is *not* in the
2216 list of block-level tags in (6), you must put the tag by
2217 itself on the first line (and it must be complete):
2218
2219 ```````````````````````````````` example
2220 <a href="foo">
2221 *bar*
2222 </a>
2223 .
2224 <a href="foo">
2225 *bar*
2226 </a>
2227 ````````````````````````````````
2228
2229
2230 In type 7 blocks, the [tag name] can be anything:
2231
2232 ```````````````````````````````` example
2233 <Warning>
2234 *bar*
2235 </Warning>
2236 .
2237 <Warning>
2238 *bar*
2239 </Warning>
2240 ````````````````````````````````
2241
2242
2243 ```````````````````````````````` example
2244 <i class="foo">
2245 *bar*
2246 </i>
2247 .
2248 <i class="foo">
2249 *bar*
2250 </i>
2251 ````````````````````````````````
2252
2253
2254 ```````````````````````````````` example
2255 </ins>
2256 *bar*
2257 .
2258 </ins>
2259 *bar*
2260 ````````````````````````````````
2261
2262
2263 These rules are designed to allow us to work with tags that
2264 can function as either block-level or inline-level tags.
2265 The `<del>` tag is a nice example.  We can surround content with
2266 `<del>` tags in three different ways.  In this case, we get a raw
2267 HTML block, because the `<del>` tag is on a line by itself:
2268
2269 ```````````````````````````````` example
2270 <del>
2271 *foo*
2272 </del>
2273 .
2274 <del>
2275 *foo*
2276 </del>
2277 ````````````````````````````````
2278
2279
2280 In this case, we get a raw HTML block that just includes
2281 the `<del>` tag (because it ends with the following blank
2282 line).  So the contents get interpreted as CommonMark:
2283
2284 ```````````````````````````````` example
2285 <del>
2286
2287 *foo*
2288
2289 </del>
2290 .
2291 <del>
2292 <p><em>foo</em></p>
2293 </del>
2294 ````````````````````````````````
2295
2296
2297 Finally, in this case, the `<del>` tags are interpreted
2298 as [raw HTML] *inside* the CommonMark paragraph.  (Because
2299 the tag is not on a line by itself, we get inline HTML
2300 rather than an [HTML block].)
2301
2302 ```````````````````````````````` example
2303 <del>*foo*</del>
2304 .
2305 <p><del><em>foo</em></del></p>
2306 ````````````````````````````````
2307
2308
2309 HTML tags designed to contain literal content
2310 (`script`, `style`, `pre`), comments, processing instructions,
2311 and declarations are treated somewhat differently.
2312 Instead of ending at the first blank line, these blocks
2313 end at the first line containing a corresponding end tag.
2314 As a result, these blocks can contain blank lines:
2315
2316 A pre tag (type 1):
2317
2318 ```````````````````````````````` example
2319 <pre language="haskell"><code>
2320 import Text.HTML.TagSoup
2321
2322 main :: IO ()
2323 main = print $ parseTags tags
2324 </code></pre>
2325 okay
2326 .
2327 <pre language="haskell"><code>
2328 import Text.HTML.TagSoup
2329
2330 main :: IO ()
2331 main = print $ parseTags tags
2332 </code></pre>
2333 <p>okay</p>
2334 ````````````````````````````````
2335
2336
2337 A script tag (type 1):
2338
2339 ```````````````````````````````` example
2340 <script type="text/javascript">
2341 // JavaScript example
2342
2343 document.getElementById("demo").innerHTML = "Hello JavaScript!";
2344 </script>
2345 okay
2346 .
2347 <script type="text/javascript">
2348 // JavaScript example
2349
2350 document.getElementById("demo").innerHTML = "Hello JavaScript!";
2351 </script>
2352 <p>okay</p>
2353 ````````````````````````````````
2354
2355
2356 A style tag (type 1):
2357
2358 ```````````````````````````````` example
2359 <style
2360   type="text/css">
2361 h1 {color:red;}
2362
2363 p {color:blue;}
2364 </style>
2365 okay
2366 .
2367 <style
2368   type="text/css">
2369 h1 {color:red;}
2370
2371 p {color:blue;}
2372 </style>
2373 <p>okay</p>
2374 ````````````````````````````````
2375
2376
2377 If there is no matching end tag, the block will end at the
2378 end of the document (or the enclosing [block quote][block quotes]
2379 or [list item][list items]):
2380
2381 ```````````````````````````````` example
2382 <style
2383   type="text/css">
2384
2385 foo
2386 .
2387 <style
2388   type="text/css">
2389
2390 foo
2391 ````````````````````````````````
2392
2393
2394 ```````````````````````````````` example
2395 > <div>
2396 > foo
2397
2398 bar
2399 .
2400 <blockquote>
2401 <div>
2402 foo
2403 </blockquote>
2404 <p>bar</p>
2405 ````````````````````````````````
2406
2407
2408 ```````````````````````````````` example
2409 - <div>
2410 - foo
2411 .
2412 <ul>
2413 <li>
2414 <div>
2415 </li>
2416 <li>foo</li>
2417 </ul>
2418 ````````````````````````````````
2419
2420
2421 The end tag can occur on the same line as the start tag:
2422
2423 ```````````````````````````````` example
2424 <style>p{color:red;}</style>
2425 *foo*
2426 .
2427 <style>p{color:red;}</style>
2428 <p><em>foo</em></p>
2429 ````````````````````````````````
2430
2431
2432 ```````````````````````````````` example
2433 <!-- foo -->*bar*
2434 *baz*
2435 .
2436 <!-- foo -->*bar*
2437 <p><em>baz</em></p>
2438 ````````````````````````````````
2439
2440
2441 Note that anything on the last line after the
2442 end tag will be included in the [HTML block]:
2443
2444 ```````````````````````````````` example
2445 <script>
2446 foo
2447 </script>1. *bar*
2448 .
2449 <script>
2450 foo
2451 </script>1. *bar*
2452 ````````````````````````````````
2453
2454
2455 A comment (type 2):
2456
2457 ```````````````````````````````` example
2458 <!-- Foo
2459
2460 bar
2461    baz -->
2462 okay
2463 .
2464 <!-- Foo
2465
2466 bar
2467    baz -->
2468 <p>okay</p>
2469 ````````````````````````````````
2470
2471
2472
2473 A processing instruction (type 3):
2474
2475 ```````````````````````````````` example
2476 <?php
2477
2478   echo '>';
2479
2480 ?>
2481 okay
2482 .
2483 <?php
2484
2485   echo '>';
2486
2487 ?>
2488 <p>okay</p>
2489 ````````````````````````````````
2490
2491
2492 A declaration (type 4):
2493
2494 ```````````````````````````````` example
2495 <!DOCTYPE html>
2496 .
2497 <!DOCTYPE html>
2498 ````````````````````````````````
2499
2500
2501 CDATA (type 5):
2502
2503 ```````````````````````````````` example
2504 <![CDATA[
2505 function matchwo(a,b)
2506 {
2507   if (a < b && a < 0) then {
2508     return 1;
2509
2510   } else {
2511
2512     return 0;
2513   }
2514 }
2515 ]]>
2516 okay
2517 .
2518 <![CDATA[
2519 function matchwo(a,b)
2520 {
2521   if (a < b && a < 0) then {
2522     return 1;
2523
2524   } else {
2525
2526     return 0;
2527   }
2528 }
2529 ]]>
2530 <p>okay</p>
2531 ````````````````````````````````
2532
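For blocks of types 1 through 5, the end of the block can be found by searching each line, starting with the first, for the matching [end condition].  The following sketch uses the hypothetical names `END_CONDITIONS` and `html_block_length`; it assumes the caller has already determined the block type and passes the lines of the containing block from the start line onward.

```python
import re

# End conditions for HTML block types 1-5.
END_CONDITIONS = {
    1: re.compile(r"</(script|pre|style)>", re.IGNORECASE),
    2: re.compile(r"-->"),
    3: re.compile(r"\?>"),
    4: re.compile(r">"),
    5: re.compile(r"\]\]>"),
}


def html_block_length(lines, kind):
    """Return how many of `lines` belong to an HTML block of type `kind`.

    `lines[0]` is the line that met the start condition.
    """
    pattern = END_CONDITIONS[kind]
    for count, line in enumerate(lines, start=1):
        # The end condition may be met anywhere on a line, including the
        # first one; the whole line stays in the block.
        if pattern.search(line):
            return count
    # No end condition met: the block runs to the end of its container.
    return len(lines)
```

Because the search looks only for the end condition, blank lines inside such a block do not end it, and anything following the end tag on the same line remains part of the block.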
2533
2534 The opening tag can be indented 1-3 spaces, but not 4:
2535
2536 ```````````````````````````````` example
2537   <!-- foo -->
2538
2539     <!-- foo -->
2540 .
2541   <!-- foo -->
2542 <pre><code>&lt;!-- foo --&gt;
2543 </code></pre>
2544 ````````````````````````````````
2545
2546
2547 ```````````````````````````````` example
2548   <div>
2549
2550     <div>
2551 .
2552   <div>
2553 <pre><code>&lt;div&gt;
2554 </code></pre>
2555 ````````````````````````````````
2556
2557
2558 An HTML block of types 1--6 can interrupt a paragraph, and need not be
2559 preceded by a blank line.
2560
2561 ```````````````````````````````` example
2562 Foo
2563 <div>
2564 bar
2565 </div>
2566 .
2567 <p>Foo</p>
2568 <div>
2569 bar
2570 </div>
2571 ````````````````````````````````
2572
2573
2574 However, a following blank line is needed, except at the end of
2575 a document, and except for blocks of types 1--5, above:
2576
2577 ```````````````````````````````` example
2578 <div>
2579 bar
2580 </div>
2581 *foo*
2582 .
2583 <div>
2584 bar
2585 </div>
2586 *foo*
2587 ````````````````````````````````
2588
2589
2590 HTML blocks of type 7 cannot interrupt a paragraph:
2591
2592 ```````````````````````````````` example
2593 Foo
2594 <a href="bar">
2595 baz
2596 .
2597 <p>Foo
2598 <a href="bar">
2599 baz</p>
2600 ````````````````````````````````
2601
2602
2603 This rule differs from John Gruber's original Markdown syntax
2604 specification, which says:
2605
2606 > The only restrictions are that block-level HTML elements —
2607 > e.g. `<div>`, `<table>`, `<pre>`, `<p>`, etc. — must be separated from
2608 > surrounding content by blank lines, and the start and end tags of the
2609 > block should not be indented with tabs or spaces.
2610
2611 In some ways Gruber's rule is more restrictive than the one given
2612 here:
2613
2614 - It requires that an HTML block be preceded by a blank line.
2615 - It does not allow the start tag to be indented.
2616 - It requires a matching end tag, which it also does not allow to
2617   be indented.
2618
2619 Most Markdown implementations (including some of Gruber's own) do not
2620 respect all of these restrictions.
2621
2622 There is one respect, however, in which Gruber's rule is more liberal
2623 than the one given here, since it allows blank lines to occur inside
2624 an HTML block.  There are two reasons for disallowing them here.
2625 First, it removes the need to parse balanced tags, which is
2626 expensive and can require backtracking from the end of the document
2627 if no matching end tag is found. Second, it provides a very simple
2628 and flexible way of including Markdown content inside HTML tags:
2629 simply separate the Markdown from the HTML using blank lines.
2630
2631 Compare:
2632
2633 ```````````````````````````````` example
2634 <div>
2635
2636 *Emphasized* text.
2637
2638 </div>
2639 .
2640 <div>
2641 <p><em>Emphasized</em> text.</p>
2642 </div>
2643 ````````````````````````````````
2644
2645
2646 ```````````````````````````````` example
2647 <div>
2648 *Emphasized* text.
2649 </div>
2650 .
2651 <div>
2652 *Emphasized* text.
2653 </div>
2654 ````````````````````````````````
2655
2656
2657 Some Markdown implementations have adopted a convention of
2658 interpreting content inside tags as Markdown if the open tag has
2659 the attribute `markdown=1`.  The rule given above seems a simpler and
2660 more elegant way of achieving the same expressive power, which is also
2661 much simpler to parse.
2662
2663 The main potential drawback is that one can no longer paste HTML
2664 blocks into Markdown documents with 100% reliability.  However,
2665 *in most cases* this will work fine, because the blank lines in
2666 HTML are usually followed by HTML block tags.  For example:
2667
2668 ```````````````````````````````` example
2669 <table>
2670
2671 <tr>
2672
2673 <td>
2674 Hi
2675 </td>
2676
2677 </tr>
2678
2679 </table>
2680 .
2681 <table>
2682 <tr>
2683 <td>
2684 Hi
2685 </td>
2686 </tr>
2687 </table>
2688 ````````````````````````````````
2689
2690
2691 There are problems, however, if the inner tags are indented
2692 *and* separated by blank lines, as then they will be interpreted as
2693 an indented code block:
2694
2695 ```````````````````````````````` example
2696 <table>
2697
2698   <tr>
2699
2700     <td>
2701       Hi
2702     </td>
2703
2704   </tr>
2705
2706 </table>
2707 .
2708 <table>
2709   <tr>
2710 <pre><code>&lt;td&gt;
2711   Hi
2712 &lt;/td&gt;
2713 </code></pre>
2714   </tr>
2715 </table>
2716 ````````````````````````````````
2717
2718
2719 Fortunately, blank lines are usually not necessary and can be
2720 deleted.  The exception is inside `<pre>` tags, but as described
2721 above, raw HTML blocks starting with `<pre>` *can* contain blank
2722 lines.
2723
2724 ## Link reference definitions
2725
2726 A [link reference definition](@)
2727 consists of a [link label], indented up to three spaces, followed
2728 by a colon (`:`), optional [whitespace] (including up to one
2729 [line ending]), a [link destination],
2730 optional [whitespace] (including up to one
2731 [line ending]), and an optional [link
2732 title], which if it is present must be separated
2733 from the [link destination] by [whitespace].
2734 No further [non-whitespace characters] may occur on the line.
2735
2736 A [link reference definition]
2737 does not correspond to a structural element of a document.  Instead, it
2738 defines a label which can be used in [reference links]
2739 and reference-style [images] elsewhere in the document.  [Link
2740 reference definitions] can come either before or after the links that use
2741 them.
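
As a rough illustration of the shape described above, here is a minimal
Python sketch that recognizes a link reference definition written on a
single line.  It is deliberately simplified and is not the full rule: the
names `_LINK_DEF` and `parse_link_definition` are hypothetical, and the
sketch ignores backslash escapes, brackets inside labels, and definitions
that span several lines.

```python
import re

# Simplified single-line matcher (an assumption of this sketch, not the
# complete rule): no backslash escapes, no multi-line definitions.
_LINK_DEF = re.compile(r"""
    ^[ ]{0,3}                                       # indented up to three spaces
    \[(?P<label>[^\[\]]+)\]:                        # link label, then a colon
    [ \t]*                                          # optional whitespace
    (?:<(?P<angle>[^<>\s]*)>|(?P<bare>[^\s<]\S*))   # link destination
    (?:[ \t]+                                       # whitespace before the title
       (?:"(?P<dq>[^"]*)"|'(?P<sq>[^']*)'|\((?P<par>[^()]*)\)))?
    [ \t]*$                                         # nothing else on the line
""", re.VERBOSE)


def parse_link_definition(line):
    """Return (label, destination, title) or None if the line does not match."""
    m = _LINK_DEF.match(line)
    if m is None:
        return None
    dest = m.group("angle") if m.group("angle") is not None else m.group("bare")
    titles = (m.group("dq"), m.group("sq"), m.group("par"))
    title = next((t for t in titles if t is not None), None)
    return m.group("label"), dest, title
```

A line with trailing text after the title, a missing destination, or a
four-space indent is rejected by this sketch, in line with the rule above.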
2742
2743 ```````````````````````````````` example
2744 [foo]: /url "title"
2745
2746 [foo]
2747 .
2748 <p><a href="/url" title="title">foo</a></p>
2749 ````````````````````````````````
2750
2751
2752 ```````````````````````````````` example
2753    [foo]: 
2754       /url  
2755            'the title'  
2756
2757 [foo]
2758 .
2759 <p><a href="/url" title="the title">foo</a></p>
2760 ````````````````````````````````
2761
2762
2763 ```````````````````````````````` example
2764 [Foo*bar\]]:my_(url) 'title (with parens)'
2765
2766 [Foo*bar\]]
2767 .
2768 <p><a href="my_(url)" title="title (with parens)">Foo*bar]</a></p>
2769 ````````````````````````````````
2770
2771
2772 ```````````````````````````````` example
2773 [Foo bar]:
2774 <my%20url>
2775 'title'
2776
2777 [Foo bar]
2778 .
2779 <p><a href="my%20url" title="title">Foo bar</a></p>
2780 ````````````````````````````````
2781
2782
2783 The title may extend over multiple lines:
2784
2785 ```````````````````````````````` example
2786 [foo]: /url '
2787 title
2788 line1
2789 line2
2790 '
2791
2792 [foo]
2793 .
2794 <p><a href="/url" title="
2795 title
2796 line1
2797 line2
2798 ">foo</a></p>
2799 ````````````````````````````````
2800
2801
2802 However, it may not contain a [blank line]:
2803
2804 ```````````````````````````````` example
2805 [foo]: /url 'title
2806
2807 with blank line'
2808
2809 [foo]
2810 .
2811 <p>[foo]: /url 'title</p>
2812 <p>with blank line'</p>
2813 <p>[foo]</p>
2814 ````````````````````````````````
2815
2816
2817 The title may be omitted:
2818
2819 ```````````````````````````````` example
2820 [foo]:
2821 /url
2822
2823 [foo]
2824 .
2825 <p><a href="/url">foo</a></p>
2826 ````````````````````````````````
2827
2828
2829 The link destination may not be omitted:
2830
2831 ```````````````````````````````` example
2832 [foo]:
2833
2834 [foo]
2835 .
2836 <p>[foo]:</p>
2837 <p>[foo]</p>
2838 ````````````````````````````````
2839
2840
2841 Both title and destination can contain backslash escapes
2842 and literal backslashes:
2843
2844 ```````````````````````````````` example
2845 [foo]: /url\bar\*baz "foo\"bar\baz"
2846
2847 [foo]
2848 .
2849 <p><a href="/url%5Cbar*baz" title="foo&quot;bar\baz">foo</a></p>
2850 ````````````````````````````````
2851
2852
2853 A link can come before its corresponding definition:
2854
2855 ```````````````````````````````` example
2856 [foo]
2857
2858 [foo]: url
2859 .
2860 <p><a href="url">foo</a></p>
2861 ````````````````````````````````
2862
2863
2864 If there are several matching definitions, the first one takes
2865 precedence:
2866
2867 ```````````````````````````````` example
2868 [foo]
2869
2870 [foo]: first
2871 [foo]: second
2872 .
2873 <p><a href="first">foo</a></p>
2874 ````````````````````````````````
2875
2876
2877 As noted in the section on [Links], matching of labels is
2878 case-insensitive (see [matches]).
2879
2880 ```````````````````````````````` example
2881 [FOO]: /url
2882
2883 [Foo]
2884 .
2885 <p><a href="/url">Foo</a></p>
2886 ````````````````````````````````
2887
2888
2889 ```````````````````````````````` example
2890 [ΑΓΩ]: /φου
2891
2892 [αγω]
2893 .
2894 <p><a href="/%CF%86%CE%BF%CF%85">αγω</a></p>
2895 ````````````````````````````````
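
The two matching rules just illustrated (the first definition wins, and
labels match case-insensitively) can be sketched as a small lookup table.
This is only a sketch: `ReferenceMap` and `normalize_label` are hypothetical
names, and the normalization is assumed to amount to Unicode case folding
plus trimming and collapsing whitespace.  Python's `str.split()` treats a
somewhat wider set of characters as whitespace than this spec does.

```python
def normalize_label(label):
    # Trim the ends, collapse internal whitespace runs, then case-fold.
    return " ".join(label.split()).casefold()


class ReferenceMap:
    """Maps normalized link labels to (destination, title) pairs."""

    def __init__(self):
        self._defs = {}

    def define(self, label, destination, title=None):
        # setdefault keeps an existing entry, so the first definition wins.
        self._defs.setdefault(normalize_label(label), (destination, title))

    def lookup(self, label):
        return self._defs.get(normalize_label(label))
```

Because lookups go through the same normalization as definitions, the order
in which links and definitions appear in the document does not matter.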
2896
2897
2898 Here is a link reference definition with no corresponding link.
2899 It contributes nothing to the document.
2900
2901 ```````````````````````````````` example
2902 [foo]: /url
2903 .
2904 ````````````````````````````````
2905
2906
2907 Here is another one:
2908
2909 ```````````````````````````````` example
2910 [
2911 foo
2912 ]: /url
2913 bar
2914 .
2915 <p>bar</p>
2916 ````````````````````````````````
2917
2918
2919 This is not a link reference definition, because there are
2920 [non-whitespace characters] after the title:
2921
2922 ```````````````````````````````` example
2923 [foo]: /url "title" ok
2924 .
2925 <p>[foo]: /url &quot;title&quot; ok</p>
2926 ````````````````````````````````
2927
2928
2929 This is a link reference definition, but it has no title:
2930
2931 ```````````````````````````````` example
2932 [foo]: /url
2933 "title" ok
2934 .
2935 <p>&quot;title&quot; ok</p>
2936 ````````````````````````````````
2937
2938
2939 This is not a link reference definition, because it is indented
2940 four spaces:
2941
2942 ```````````````````````````````` example
2943     [foo]: /url "title"
2944
2945 [foo]
2946 .
2947 <pre><code>[foo]: /url &quot;title&quot;
2948 </code></pre>
2949 <p>[foo]</p>
2950 ````````````````````````````````
2951
2952
2953 This is not a link reference definition, because it occurs inside
2954 a code block:
2955
2956 ```````````````````````````````` example
2957 ```
2958 [foo]: /url
2959 ```
2960
2961 [foo]
2962 .
2963 <pre><code>[foo]: /url
2964 </code></pre>
2965 <p>[foo]</p>
2966 ````````````````````````````````
2967
2968
2969 A [link reference definition] cannot interrupt a paragraph.
2970
2971 ```````````````````````````````` example
2972 Foo
2973 [bar]: /baz
2974
2975 [bar]
2976 .
2977 <p>Foo
2978 [bar]: /baz</p>
2979 <p>[bar]</p>
2980 ````````````````````````````````
2981
2982
2983 However, it can directly follow other block elements, such as headings
2984 and thematic breaks, and it need not be followed by a blank line.
2985
2986 ```````````````````````````````` example
2987 # [Foo]
2988 [foo]: /url
2989 > bar
2990 .
2991 <h1><a href="/url">Foo</a></h1>
2992 <blockquote>
2993 <p>bar</p>
2994 </blockquote>
2995 ````````````````````````````````
2996
2997
2998 Several [link reference definitions]
2999 can occur one after another, without intervening blank lines.
3000
3001 ```````````````````````````````` example
3002 [foo]: /foo-url "foo"
3003 [bar]: /bar-url
3004   "bar"
3005 [baz]: /baz-url
3006
3007 [foo],
3008 [bar],
3009 [baz]
3010 .
3011 <p><a href="/foo-url" title="foo">foo</a>,
3012 <a href="/bar-url" title="bar">bar</a>,
3013 <a href="/baz-url">baz</a></p>
3014 ````````````````````````````````
3015
3016
3017 [Link reference definitions] can occur
3018 inside block containers, like lists and block quotations.  They
3019 affect the entire document, not just the container in which they
3020 are defined:
3021
3022 ```````````````````````````````` example
3023 [foo]
3024
3025 > [foo]: /url
3026 .
3027 <p><a href="/url">foo</a></p>
3028 <blockquote>
3029 </blockquote>
3030 ````````````````````````````````
3031
3032
3033
3034 ## Paragraphs
3035
3036 A sequence of non-blank lines that cannot be interpreted as other
3037 kinds of blocks forms a [paragraph](@).
3038 The contents of the paragraph are the result of parsing the
3039 paragraph's raw content as inlines.  The paragraph's raw content
3040 is formed by concatenating the lines and removing initial and final
3041 [whitespace].
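
A minimal Python sketch of how the raw content might be assembled is given
here.  It assumes the paragraph's lines have already been collected without
their line endings, and it folds in the behavior shown later in this section
that leading spaces on each line are skipped.  The name
`paragraph_raw_content` is hypothetical.

```python
WHITESPACE = " \t\n\x0b\x0c\r"


def paragraph_raw_content(lines):
    # Leading spaces and tabs on each line are skipped, then lines are joined.
    joined = "\n".join(line.lstrip(" \t") for line in lines)
    # Initial and final whitespace of the whole paragraph is removed.
    return joined.strip(WHITESPACE)
```

Trailing spaces on interior lines are kept, since they may still produce a
hard line break during inline parsing; only the end of the paragraph is
trimmed.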
3042
3043 A simple example with two paragraphs:
3044
3045 ```````````````````````````````` example
3046 aaa
3047
3048 bbb
3049 .
3050 <p>aaa</p>
3051 <p>bbb</p>
3052 ````````````````````````````````
3053
3054
3055 Paragraphs can contain multiple lines, but no blank lines:
3056
3057 ```````````````````````````````` example
3058 aaa
3059 bbb
3060
3061 ccc
3062 ddd
3063 .
3064 <p>aaa
3065 bbb</p>
3066 <p>ccc
3067 ddd</p>
3068 ````````````````````````````````
3069
3070
3071 Multiple blank lines between paragraphs have no effect:
3072
3073 ```````````````````````````````` example
3074 aaa
3075
3076
3077 bbb
3078 .
3079 <p>aaa</p>
3080 <p>bbb</p>
3081 ````````````````````````````````
3082
3083
3084 Leading spaces are skipped:
3085
3086 ```````````````````````````````` example
3087   aaa
3088  bbb
3089 .
3090 <p>aaa
3091 bbb</p>
3092 ````````````````````````````````
3093
3094
3095 Lines after the first may be indented any amount, since indented
3096 code blocks cannot interrupt paragraphs.
3097
3098 ```````````````````````````````` example
3099 aaa
3100              bbb
3101                                        ccc
3102 .
3103 <p>aaa
3104 bbb
3105 ccc</p>
3106 ````````````````````````````````
3107
3108
3109 However, the first line may be indented at most three spaces,
3110 or an indented code block will be triggered:
3111
3112 ```````````````````````````````` example
3113    aaa
3114 bbb
3115 .
3116 <p>aaa
3117 bbb</p>
3118 ````````````````````````````````
3119
3120
3121 ```````````````````````````````` example
3122     aaa
3123 bbb
3124 .
3125 <pre><code>aaa
3126 </code></pre>
3127 <p>bbb</p>
3128 ````````````````````````````````
3129
3130
3131 Final spaces are stripped before inline parsing, so a paragraph
3132 that ends with two or more spaces will not end with a [hard line
3133 break]:
3134
3135 ```````````````````````````````` example
3136 aaa     
3137 bbb     
3138 .
3139 <p>aaa<br />
3140 bbb</p>
3141 ````````````````````````````````
3142
3143
3144 ## Blank lines
3145
3146 [Blank lines] between block-level elements are ignored,
3147 except for the role they play in determining whether a [list]
3148 is [tight] or [loose].
3149
3150 Blank lines at the beginning and end of the document are also ignored.
3151
3152 ```````````````````````````````` example
3153   
3154
3155 aaa
3156   
3157
3158 # aaa
3159
3160   
3161 .
3162 <p>aaa</p>
3163 <h1>aaa</h1>
3164 ````````````````````````````````
3165
3166
3167
3168 # Container blocks
3169
3170 A [container block] is a block that has other
3171 blocks as its contents.  There are two basic kinds of container blocks:
3172 [block quotes] and [list items].
3173 [Lists] are meta-containers for [list items].
3174
3175 We define the syntax for container blocks recursively.  The general
3176 form of the definition is:
3177
3178 > If X is a sequence of blocks, then the result of
3179 > transforming X in such-and-such a way is a container of type Y
3180 > with these blocks as its content.
3181
3182 So, we explain what counts as a block quote or list item by explaining
3183 how these can be *generated* from their contents. This should suffice
3184 to define the syntax, although it does not give a recipe for *parsing*
3185 these constructions.  (A recipe is provided below in the section entitled
3186 [A parsing strategy](#appendix-a-parsing-strategy).)
3187
3188 ## Block quotes
3189
3190 A [block quote marker](@)
3191 consists of 0-3 spaces of initial indent, plus (a) the character `>` together
3192 with a following space, or (b) a single character `>` not followed by a space.
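
The marker definition translates fairly directly into a regular expression.
The following Python sketch, with the hypothetical name
`strip_block_quote_marker`, removes one marker from the start of a line.  It
assumes indentation is written with spaces only and does not attempt tab
expansion.

```python
import re

# 0-3 spaces of indent, the character `>`, and one optional following space.
_BLOCK_QUOTE_MARKER = re.compile(r"^ {0,3}> ?")


def strip_block_quote_marker(line):
    """Return the line with one marker removed, or None if there is none."""
    m = _BLOCK_QUOTE_MARKER.match(line)
    return None if m is None else line[m.end():]
```

Because the optional space belongs to the marker, `>     code` leaves four
spaces of indentation behind, which is enough for an indented code block,
while `>    not code` leaves only three.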
3193
3194 The following rules define [block quotes]:
3195
3196 1.  **Basic case.**  If a string of lines *Ls* constitute a sequence
3197     of blocks *Bs*, then the result of prepending a [block quote
3198     marker] to the beginning of each line in *Ls*
3199     is a [block quote](#block-quotes) containing *Bs*.
3200
3201 2.  **Laziness.**  If a string of lines *Ls* constitute a [block
3202     quote](#block-quotes) with contents *Bs*, then the result of deleting
3203     the initial [block quote marker] from one or
3204     more lines in which the next [non-whitespace character] after the [block
3205     quote marker] is [paragraph continuation
3206     text] is a block quote with *Bs* as its content.
3207     [Paragraph continuation text](@) is text
3208     that will be parsed as part of the content of a paragraph, but does
3209     not occur at the beginning of the paragraph.
3210
3211 3.  **Consecutiveness.**  A document cannot contain two [block
3212     quotes] in a row unless there is a [blank line] between them.
3213
3214 Nothing else counts as a [block quote](#block-quotes).
3215
3216 Here is a simple example:
3217
3218 ```````````````````````````````` example
3219 > # Foo
3220 > bar
3221 > baz
3222 .
3223 <blockquote>
3224 <h1>Foo</h1>
3225 <p>bar
3226 baz</p>
3227 </blockquote>
3228 ````````````````````````````````
3229
3230
3231 The spaces after the `>` characters can be omitted:
3232
3233 ```````````````````````````````` example
3234 ># Foo
3235 >bar
3236 > baz
3237 .
3238 <blockquote>
3239 <h1>Foo</h1>
3240 <p>bar
3241 baz</p>
3242 </blockquote>
3243 ````````````````````````````````
3244
3245
3246 The `>` characters can be indented 1-3 spaces:
3247
3248 ```````````````````````````````` example
3249    > # Foo
3250    > bar
3251  > baz
3252 .
3253 <blockquote>
3254 <h1>Foo</h1>
3255 <p>bar
3256 baz</p>
3257 </blockquote>
3258 ````````````````````````````````
3259
3260
3261 Four spaces give us a code block:
3262
3263 ```````````````````````````````` example
3264     > # Foo
3265     > bar
3266     > baz
3267 .
3268 <pre><code>&gt; # Foo
3269 &gt; bar
3270 &gt; baz
3271 </code></pre>
3272 ````````````````````````````````
3273
3274
3275 The Laziness clause allows us to omit the `>` before
3276 [paragraph continuation text]:
3277
3278 ```````````````````````````````` example
3279 > # Foo
3280 > bar
3281 baz
3282 .
3283 <blockquote>
3284 <h1>Foo</h1>
3285 <p>bar
3286 baz</p>
3287 </blockquote>
3288 ````````````````````````````````
3289
3290
3291 A block quote can contain some lazy and some non-lazy
3292 continuation lines:
3293
3294 ```````````````````````````````` example
3295 > bar
3296 baz
3297 > foo
3298 .
3299 <blockquote>
3300 <p>bar
3301 baz
3302 foo</p>
3303 </blockquote>
3304 ````````````````````````````````
3305
3306
3307 Laziness only applies to lines that would have been continuations of
3308 paragraphs had they been prepended with [block quote markers].
3309 For example, the `> ` cannot be omitted in the second line of
3310
3311 ``` markdown
3312 > foo
3313 > ---
3314 ```
3315
3316 without changing the meaning:
3317
3318 ```````````````````````````````` example
3319 > foo
3320 ---
3321 .
3322 <blockquote>
3323 <p>foo</p>
3324 </blockquote>
3325 <hr />
3326 ````````````````````````````````
3327
3328
3329 Similarly, if we omit the `> ` in the second line of
3330
3331 ``` markdown
3332 > - foo
3333 > - bar
3334 ```
3335
3336 then the block quote ends after the first line:
3337
3338 ```````````````````````````````` example
3339 > - foo
3340 - bar
3341 .
3342 <blockquote>
3343 <ul>
3344 <li>foo</li>
3345 </ul>
3346 </blockquote>
3347 <ul>
3348 <li>bar</li>
3349 </ul>
3350 ````````````````````````````````
3351
3352
3353 For the same reason, we can't omit the `> ` in front of
3354 subsequent lines of an indented or fenced code block:
3355
3356 ```````````````````````````````` example
3357 >     foo
3358     bar
3359 .
3360 <blockquote>
3361 <pre><code>foo
3362 </code></pre>
3363 </blockquote>
3364 <pre><code>bar
3365 </code></pre>
3366 ````````````````````````````````
3367
3368
3369 ```````````````````````````````` example
3370 > ```
3371 foo
3372 ```
3373 .
3374 <blockquote>
3375 <pre><code></code></pre>
3376 </blockquote>
3377 <p>foo</p>
3378 <pre><code></code></pre>
3379 ````````````````````````````````
3380
3381
3382 Note that in the following case, we have a [lazy
3383 continuation line]:
3384
3385 ```````````````````````````````` example
3386 > foo
3387     - bar
3388 .
3389 <blockquote>
3390 <p>foo
3391 - bar</p>
3392 </blockquote>
3393 ````````````````````````````````
3394
3395
3396 To see why, note that in
3397
3398 ```markdown
3399 > foo
3400 >     - bar
3401 ```
3402
3403 the `- bar` is indented too far to start a list, and can't
3404 be an indented code block because indented code blocks cannot
3405 interrupt paragraphs, so it is [paragraph continuation text].
3406
3407 A block quote can be empty:
3408
3409 ```````````````````````````````` example
3410 >
3411 .
3412 <blockquote>
3413 </blockquote>
3414 ````````````````````````````````
3415
3416
3417 ```````````````````````````````` example
3418 >
3419 >  
3420
3421 .
3422 <blockquote>
3423 </blockquote>
3424 ````````````````````````````````
3425
3426
3427 A block quote can have initial or final blank lines:
3428
3429 ```````````````````````````````` example
3430 >
3431 > foo
3432 >  
3433 .
3434 <blockquote>
3435 <p>foo</p>
3436 </blockquote>
3437 ````````````````````````````````
3438
3439
3440 A blank line always separates block quotes:
3441
3442 ```````````````````````````````` example
3443 > foo
3444
3445 > bar
3446 .
3447 <blockquote>
3448 <p>foo</p>
3449 </blockquote>
3450 <blockquote>
3451 <p>bar</p>
3452 </blockquote>
3453 ````````````````````````````````
3454
3455
3456 (Most current Markdown implementations, including John Gruber's
3457 original `Markdown.pl`, will parse this example as a single block quote
3458 with two paragraphs.  But it seems better to allow the author to decide
3459 whether two block quotes or one are wanted.)
3460
3461 Consecutiveness means that if we put these block quotes together,
3462 we get a single block quote:
3463
3464 ```````````````````````````````` example
3465 > foo
3466 > bar
3467 .
3468 <blockquote>
3469 <p>foo
3470 bar</p>
3471 </blockquote>
3472 ````````````````````````````````
3473
3474
3475 To get a block quote with two paragraphs, use:
3476
3477 ```````````````````````````````` example
3478 > foo
3479 >
3480 > bar
3481 .
3482 <blockquote>
3483 <p>foo</p>
3484 <p>bar</p>
3485 </blockquote>
3486 ````````````````````````````````
3487
3488
3489 Block quotes can interrupt paragraphs:
3490
3491 ```````````````````````````````` example
3492 foo
3493 > bar
3494 .
3495 <p>foo</p>
3496 <blockquote>
3497 <p>bar</p>
3498 </blockquote>
3499 ````````````````````````````````
3500
3501
3502 In general, blank lines are not needed before or after block
3503 quotes:
3504
3505 ```````````````````````````````` example
3506 > aaa
3507 ***
3508 > bbb
3509 .
3510 <blockquote>
3511 <p>aaa</p>
3512 </blockquote>
3513 <hr />
3514 <blockquote>
3515 <p>bbb</p>
3516 </blockquote>
3517 ````````````````````````````````
3518
3519
3520 However, because of laziness, a blank line is needed between
3521 a block quote and a following paragraph:
3522
3523 ```````````````````````````````` example
3524 > bar
3525 baz
3526 .
3527 <blockquote>
3528 <p>bar
3529 baz</p>
3530 </blockquote>
3531 ````````````````````````````````
3532
3533
3534 ```````````````````````````````` example
3535 > bar
3536
3537 baz
3538 .
3539 <blockquote>
3540 <p>bar</p>
3541 </blockquote>
3542 <p>baz</p>
3543 ````````````````````````````````
3544
3545
3546 ```````````````````````````````` example
3547 > bar
3548 >
3549 baz
3550 .
3551 <blockquote>
3552 <p>bar</p>
3553 </blockquote>
3554 <p>baz</p>
3555 ````````````````````````````````
3556
3557
3558 It is a consequence of the Laziness rule that any number
3559 of initial `>`s may be omitted on a continuation line of a
3560 nested block quote:
3561
3562 ```````````````````````````````` example
3563 > > > foo
3564 bar
3565 .
3566 <blockquote>
3567 <blockquote>
3568 <blockquote>
3569 <p>foo
3570 bar</p>
3571 </blockquote>
3572 </blockquote>
3573 </blockquote>
3574 ````````````````````````````````
3575
3576
3577 ```````````````````````````````` example
3578 >>> foo
3579 > bar
3580 >>baz
3581 .
3582 <blockquote>
3583 <blockquote>
3584 <blockquote>
3585 <p>foo
3586 bar
3587 baz</p>
3588 </blockquote>
3589 </blockquote>
3590 </blockquote>
3591 ````````````````````````````````
3592
3593
3594 When including an indented code block in a block quote,
3595 remember that the [block quote marker] includes
3596 both the `>` and a following space.  So *five spaces* are needed after
3597 the `>`:
3598
3599 ```````````````````````````````` example
3600 >     code
3601
3602 >    not code
3603 .
3604 <blockquote>
3605 <pre><code>code
3606 </code></pre>
3607 </blockquote>
3608 <blockquote>
3609 <p>not code</p>
3610 </blockquote>
3611 ````````````````````````````````
3612
3613
3614
3615 ## List items
3616
3617 A [list marker](@) is a
3618 [bullet list marker] or an [ordered list marker].
3619
3620 A [bullet list marker](@)
3621 is a `-`, `+`, or `*` character.
3622
3623 An [ordered list marker](@)
3624 is a sequence of 1--9 Arabic digits (`0-9`), followed by either a
3625 `.` character or a `)` character.  (The reason for the length
3626 limit is that with 10 digits we start seeing integer overflows
3627 in some browsers.)
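
As an illustration, the two marker forms can be recognized with a single
regular expression.  This Python sketch uses the hypothetical name
`parse_list_marker` and makes two assumptions drawn from the examples later
in this section: the marker may be indented by up to three spaces, and it
must be followed by a space or the end of the line.  It does not
disambiguate markers from thematic breaks.

```python
import re

_LIST_MARKER = re.compile(
    r"^( {0,3})(?:([-+*])|([0-9]{1,9})([.)]))(?= |$)"
)


def parse_list_marker(line):
    """Return (kind, delimiter, start_number) or None if no marker is found."""
    m = _LIST_MARKER.match(line)
    if m is None:
        return None
    if m.group(2) is not None:
        return ("bullet", m.group(2), None)
    # int() discards leading zeros, so "003." starts at 3.
    return ("ordered", m.group(4), int(m.group(3)))
```

The `{1,9}` quantifier enforces the length limit on ordered list markers, so
a ten-digit number is not treated as a marker at all.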
3628
3629 The following rules define [list items]:
3630
3631 1.  **Basic case.**  If a sequence of lines *Ls* constitute a sequence of
3632     blocks *Bs* starting with a [non-whitespace character],
3633     and *M* is a list
3634     marker of width *W* followed by 1 ≤ *N* ≤ 4 spaces, then the result
3635     of prepending *M* and the following spaces to the first line of
3636     *Ls*, and indenting subsequent lines of *Ls* by *W + N* spaces, is a
3637     list item with *Bs* as its contents.  The type of the list item
3638     (bullet or ordered) is determined by the type of its list marker.
3639     If the list item is ordered, then it is also assigned a start
3640     number, based on the ordered list marker.
3641
3642     Exceptions: When the first list item in a [list] interrupts
3643     a paragraph---that is, when it starts on a line that would
3644     otherwise count as [paragraph continuation text]---then (a)
3645     the lines *Ls* must not begin with a blank line, and (b) if
3646     the list item is ordered, the start number must be 1.
3647
3648 For example, let *Ls* be the lines
3649
3650 ```````````````````````````````` example
3651 A paragraph
3652 with two lines.
3653
3654     indented code
3655
3656 > A block quote.
3657 .
3658 <p>A paragraph
3659 with two lines.</p>
3660 <pre><code>indented code
3661 </code></pre>
3662 <blockquote>
3663 <p>A block quote.</p>
3664 </blockquote>
3665 ````````````````````````````````
3666
3667
3668 And let *M* be the marker `1.`, and *N* = 2.  Then rule #1 says
3669 that the following is an ordered list item with start number 1,
3670 and the same contents as *Ls*:
3671
3672 ```````````````````````````````` example
3673 1.  A paragraph
3674     with two lines.
3675
3676         indented code
3677
3678     > A block quote.
3679 .
3680 <ol>
3681 <li>
3682 <p>A paragraph
3683 with two lines.</p>
3684 <pre><code>indented code
3685 </code></pre>
3686 <blockquote>
3687 <p>A block quote.</p>
3688 </blockquote>
3689 </li>
3690 </ol>
3691 ````````````````````````````````
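
The generative reading of rule #1 can be made concrete with a short Python
sketch.  The function name `make_list_item` is hypothetical.  The sketch
assumes *Ls* is given as a list of lines without line endings and that blank
lines are left empty rather than padded.

```python
def make_list_item(lines, marker, n):
    """Prepend `marker` and `n` spaces to the first line, indent the rest."""
    if not 1 <= n <= 4:
        raise ValueError("rule #1 requires 1 <= N <= 4")
    indent = " " * (len(marker) + n)  # W + N spaces
    first, rest = lines[0], lines[1:]
    out = [marker + " " * n + first]
    # Blank lines stay empty; all other lines are indented by W + N.
    out.extend(indent + line if line.strip() else "" for line in rest)
    return out
```

Applied to the lines *Ls* above with the marker `1.` and *N* = 2, this
produces the Markdown of the list item in the preceding example.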
3692
3693
3694 The most important thing to notice is that the position of
3695 the text after the list marker determines how much indentation
3696 is needed in subsequent blocks in the list item.  If the list
3697 marker takes up two spaces, and there are three spaces between
3698 the list marker and the next [non-whitespace character], then blocks
3699 must be indented five spaces in order to fall under the list
3700 item.
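
The arithmetic can be sketched in Python as follows.  The name
`content_indent` is hypothetical.  The sketch counts the marker's own leading
indent, its width *W*, and the *N* spaces that follow it.  When five or more
spaces follow the marker, it applies rule #2 (stated later in this section)
and counts only one of them.  Indentation is assumed to use spaces only,
measured from the start of the enclosing container's content.

```python
import re

_MARKER = re.compile(r"^( {0,3})([-+*]|[0-9]{1,9}[.)])( +)(?=\S)")


def content_indent(first_line):
    """Spaces a later block needs in order to fall under the list item."""
    m = _MARKER.match(first_line)
    if m is None:
        return None
    lead, w, n = len(m.group(1)), len(m.group(2)), len(m.group(3))
    if n >= 5:
        # The item starts with indented code: only one space counts (rule #2).
        n = 1
    return lead + w + n
```

For instance, a two-character marker followed by three spaces gives five, as
stated above, and ` -    one` gives six.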
3701
3702 Here are some examples showing how far content must be indented to be
3703 put under the list item:
3704
3705 ```````````````````````````````` example
3706 - one
3707
3708  two
3709 .
3710 <ul>
3711 <li>one</li>
3712 </ul>
3713 <p>two</p>
3714 ````````````````````````````````
3715
3716
3717 ```````````````````````````````` example
3718 - one
3719
3720   two
3721 .
3722 <ul>
3723 <li>
3724 <p>one</p>
3725 <p>two</p>
3726 </li>
3727 </ul>
3728 ````````````````````````````````
3729
3730
3731 ```````````````````````````````` example
3732  -    one
3733
3734      two
3735 .
3736 <ul>
3737 <li>one</li>
3738 </ul>
3739 <pre><code> two
3740 </code></pre>
3741 ````````````````````````````````
3742
3743
3744 ```````````````````````````````` example
3745  -    one
3746
3747       two
3748 .
3749 <ul>
3750 <li>
3751 <p>one</p>
3752 <p>two</p>
3753 </li>
3754 </ul>
3755 ````````````````````````````````
3756
3757
3758 It is tempting to think of this in terms of columns:  the continuation
3759 blocks must be indented at least to the column of the first
3760 [non-whitespace character] after the list marker. However, that is not quite right.
3761 The spaces after the list marker determine how much relative indentation
3762 is needed.  Which column this indentation reaches will depend on
3763 how the list item is embedded in other constructions, as shown by
3764 this example:
3765
3766 ```````````````````````````````` example
3767    > > 1.  one
3768 >>
3769 >>     two
3770 .
3771 <blockquote>
3772 <blockquote>
3773 <ol>
3774 <li>
3775 <p>one</p>
3776 <p>two</p>
3777 </li>
3778 </ol>
3779 </blockquote>
3780 </blockquote>
3781 ````````````````````````````````
3782
3783
3784 Here `two` occurs in the same column as the list marker `1.`,
3785 but is actually contained in the list item, because there is
3786 sufficient indentation after the last containing blockquote marker.
3787
3788 The converse is also possible.  In the following example, the word `two`
3789 occurs far to the right of the initial text of the list item, `one`, but
3790 it is not considered part of the list item, because it is not indented
3791 far enough past the blockquote marker:
3792
3793 ```````````````````````````````` example
3794 >>- one
3795 >>
3796   >  > two
3797 .
3798 <blockquote>
3799 <blockquote>
3800 <ul>
3801 <li>one</li>
3802 </ul>
3803 <p>two</p>
3804 </blockquote>
3805 </blockquote>
3806 ````````````````````````````````
3807
3808
3809 Note that at least one space is needed between the list marker and
3810 any following content, so these are not list items:
3811
3812 ```````````````````````````````` example
3813 -one
3814
3815 2.two
3816 .
3817 <p>-one</p>
3818 <p>2.two</p>
3819 ````````````````````````````````
3820
3821
3822 A list item may contain blocks that are separated by more than
3823 one blank line.
3824
3825 ```````````````````````````````` example
3826 - foo
3827
3828
3829   bar
3830 .
3831 <ul>
3832 <li>
3833 <p>foo</p>
3834 <p>bar</p>
3835 </li>
3836 </ul>
3837 ````````````````````````````````
3838
3839
3840 A list item may contain any kind of block:
3841
3842 ```````````````````````````````` example
3843 1.  foo
3844
3845     ```
3846     bar
3847     ```
3848
3849     baz
3850
3851     > bam
3852 .
3853 <ol>
3854 <li>
3855 <p>foo</p>
3856 <pre><code>bar
3857 </code></pre>
3858 <p>baz</p>
3859 <blockquote>
3860 <p>bam</p>
3861 </blockquote>
3862 </li>
3863 </ol>
3864 ````````````````````````````````
3865
3866
3867 A list item that contains an indented code block will preserve
3868 empty lines within the code block verbatim.
3869
3870 ```````````````````````````````` example
3871 - Foo
3872
3873       bar
3874
3875
3876       baz
3877 .
3878 <ul>
3879 <li>
3880 <p>Foo</p>
3881 <pre><code>bar
3882
3883
3884 baz
3885 </code></pre>
3886 </li>
3887 </ul>
3888 ````````````````````````````````
3889
3890 Note that ordered list start numbers must be nine digits or fewer:
3891
3892 ```````````````````````````````` example
3893 123456789. ok
3894 .
3895 <ol start="123456789">
3896 <li>ok</li>
3897 </ol>
3898 ````````````````````````````````
3899
3900
3901 ```````````````````````````````` example
3902 1234567890. not ok
3903 .
3904 <p>1234567890. not ok</p>
3905 ````````````````````````````````
3906
3907
3908 A start number may begin with 0s:
3909
3910 ```````````````````````````````` example
3911 0. ok
3912 .
3913 <ol start="0">
3914 <li>ok</li>
3915 </ol>
3916 ````````````````````````````````
3917
3918
3919 ```````````````````````````````` example
3920 003. ok
3921 .
3922 <ol start="3">
3923 <li>ok</li>
3924 </ol>
3925 ````````````````````````````````
3926
3927
3928 A start number may not be negative:
3929
3930 ```````````````````````````````` example
3931 -1. not ok
3932 .
3933 <p>-1. not ok</p>
3934 ````````````````````````````````
3935
3936
3937
3938 2.  **Item starting with indented code.**  If a sequence of lines *Ls*
3939     constitute a sequence of blocks *Bs* starting with an indented code
3940     block,
3941     and *M* is a list marker of width *W* followed by
3942     one space, then the result of prepending *M* and the following
3943     space to the first line of *Ls*, and indenting subsequent lines of
3944     *Ls* by *W + 1* spaces, is a list item with *Bs* as its contents.
3945     If a line is empty, then it need not be indented.  The type of the
3946     list item (bullet or ordered) is determined by the type of its list
3947     marker.  If the list item is ordered, then it is also assigned a
3948     start number, based on the ordered list marker.
3949
3950 An indented code block will have to be indented four spaces beyond
3951 the edge of the region where text will be included in the list item.
3952 In the following case that is 6 spaces:
3953
3954 ```````````````````````````````` example
3955 - foo
3956
3957       bar
3958 .
3959 <ul>
3960 <li>
3961 <p>foo</p>
3962 <pre><code>bar
3963 </code></pre>
3964 </li>
3965 </ul>
3966 ````````````````````````````````
3967
3968
3969 And in this case it is 11 spaces:
3970
3971 ```````````````````````````````` example
3972   10.  foo
3973
3974            bar
3975 .
3976 <ol start="10">
3977 <li>
3978 <p>foo</p>
3979 <pre><code>bar
3980 </code></pre>
3981 </li>
3982 </ol>
3983 ````````````````````````````````
3984
3985
3986 If the *first* block in the list item is an indented code block,
3987 then by rule #2, the contents must be indented *one* space after the
3988 list marker:
3989
3990 ```````````````````````````````` example
3991     indented code
3992
3993 paragraph
3994
3995     more code
3996 .
3997 <pre><code>indented code
3998 </code></pre>
3999 <p>paragraph</p>
4000 <pre><code>more code
4001 </code></pre>
4002 ````````````````````````````````
4003
4004
4005 ```````````````````````````````` example
4006 1.     indented code
4007
4008    paragraph
4009
4010        more code
4011 .
4012 <ol>
4013 <li>
4014 <pre><code>indented code
4015 </code></pre>
4016 <p>paragraph</p>
4017 <pre><code>more code
4018 </code></pre>
4019 </li>
4020 </ol>
4021 ````````````````````````````````
4022
4023
4024 Note that an additional space indent is interpreted as space
4025 inside the code block:
4026
4027 ```````````````````````````````` example
4028 1.      indented code
4029
4030    paragraph
4031
4032        more code
4033 .
4034 <ol>
4035 <li>
4036 <pre><code> indented code
4037 </code></pre>
4038 <p>paragraph</p>
4039 <pre><code>more code
4040 </code></pre>
4041 </li>
4042 </ol>
4043 ````````````````````````````````
4044
4045
4046 Note that rules #1 and #2 only apply to two cases:  (a) cases
4047 in which the lines to be included in a list item begin with a
4048 [non-whitespace character], and (b) cases in which
4049 they begin with an indented code
4050 block.  In a case like the following, where the first block begins with
4051 a three-space indent, the rules do not allow us to form a list item by
4052 indenting the whole thing and prepending a list marker:
4053
4054 ```````````````````````````````` example
4055    foo
4056
4057 bar
4058 .
4059 <p>foo</p>
4060 <p>bar</p>
4061 ````````````````````````````````
4062
4063
4064 ```````````````````````````````` example
4065 -    foo
4066
4067   bar
4068 .
4069 <ul>
4070 <li>foo</li>
4071 </ul>
4072 <p>bar</p>
4073 ````````````````````````````````
4074
4075
4076 This is not a significant restriction, because when a block begins
4077 with 1-3 spaces indent, the indentation can always be removed without
4078 a change in interpretation, allowing rule #1 to be applied.  So, in
4079 the above case:
4080
4081 ```````````````````````````````` example
4082 -  foo
4083
4084    bar
4085 .
4086 <ul>
4087 <li>
4088 <p>foo</p>
4089 <p>bar</p>
4090 </li>
4091 </ul>
4092 ````````````````````````````````
4093
4094
4095 3.  **Item starting with a blank line.**  If a sequence of lines *Ls*
4096     starting with a single [blank line] constitutes a (possibly empty)
4097     sequence of blocks *Bs*, not separated from each other by more than
4098     one blank line, and *M* is a list marker of width *W*,
4099     then the result of prepending *M* to the first line of *Ls*, and
4100     indenting subsequent lines of *Ls* by *W + 1* spaces, is a list
4101     item with *Bs* as its contents.
4102     If a line is empty, then it need not be indented.  The type of the
4103     list item (bullet or ordered) is determined by the type of its list
4104     marker.  If the list item is ordered, then it is also assigned a
4105     start number, based on the ordered list marker.
4106
4107 Here are some list items that start with a blank line but are not empty:
4108
4109 ```````````````````````````````` example
4110 -
4111   foo
4112 -
4113   ```
4114   bar
4115   ```
4116 -
4117       baz
4118 .
4119 <ul>
4120 <li>foo</li>
4121 <li>
4122 <pre><code>bar
4123 </code></pre>
4124 </li>
4125 <li>
4126 <pre><code>baz
4127 </code></pre>
4128 </li>
4129 </ul>
4130 ````````````````````````````````
4131
4132 When the list item starts with a blank line, the number of spaces
4133 following the list marker doesn't change the required indentation:
4134
4135 ```````````````````````````````` example
4136 -   
4137   foo
4138 .
4139 <ul>
4140 <li>foo</li>
4141 </ul>
4142 ````````````````````````````````
4143
4144
4145 A list item can begin with at most one blank line.
4146 In the following example, `foo` is not part of the list
4147 item:
4148
4149 ```````````````````````````````` example
4150 -
4151
4152   foo
4153 .
4154 <ul>
4155 <li></li>
4156 </ul>
4157 <p>foo</p>
4158 ````````````````````````````````
4159
4160
4161 Here is an empty bullet list item:
4162
4163 ```````````````````````````````` example
4164 - foo
4165 -
4166 - bar
4167 .
4168 <ul>
4169 <li>foo</li>
4170 <li></li>
4171 <li>bar</li>
4172 </ul>
4173 ````````````````````````````````
4174
4175
4176 It does not matter whether there are spaces following the [list marker]:
4177
4178 ```````````````````````````````` example
4179 - foo
4180 -   
4181 - bar
4182 .
4183 <ul>
4184 <li>foo</li>
4185 <li></li>
4186 <li>bar</li>
4187 </ul>
4188 ````````````````````````````````
4189
4190
4191 Here is an empty ordered list item:
4192
4193 ```````````````````````````````` example
4194 1. foo
4195 2.
4196 3. bar
4197 .
4198 <ol>
4199 <li>foo</li>
4200 <li></li>
4201 <li>bar</li>
4202 </ol>
4203 ````````````````````````````````
4204
4205
4206 A list may start or end with an empty list item:
4207
4208 ```````````````````````````````` example
4209 *
4210 .
4211 <ul>
4212 <li></li>
4213 </ul>
4214 ````````````````````````````````
4215
4216 However, an empty list item cannot interrupt a paragraph:
4217
4218 ```````````````````````````````` example
4219 foo
4220 *
4221
4222 foo
4223 1.
4224 .
4225 <p>foo
4226 *</p>
4227 <p>foo
4228 1.</p>
4229 ````````````````````````````````
4230
4231
4232 4.  **Indentation.**  If a sequence of lines *Ls* constitutes a list item
4233     according to rule #1, #2, or #3, then the result of indenting each line
4234     of *Ls* by 1-3 spaces (the same for each line) also constitutes a
4235     list item with the same contents and attributes.  If a line is
4236     empty, then it need not be indented.
4237
4238 Indented one space:
4239
4240 ```````````````````````````````` example
4241  1.  A paragraph
4242      with two lines.
4243
4244          indented code
4245
4246      > A block quote.
4247 .
4248 <ol>
4249 <li>
4250 <p>A paragraph
4251 with two lines.</p>
4252 <pre><code>indented code
4253 </code></pre>
4254 <blockquote>
4255 <p>A block quote.</p>
4256 </blockquote>
4257 </li>
4258 </ol>
4259 ````````````````````````````````
4260
4261
4262 Indented two spaces:
4263
4264 ```````````````````````````````` example
4265   1.  A paragraph
4266       with two lines.
4267
4268           indented code
4269
4270       > A block quote.
4271 .
4272 <ol>
4273 <li>
4274 <p>A paragraph
4275 with two lines.</p>
4276 <pre><code>indented code
4277 </code></pre>
4278 <blockquote>
4279 <p>A block quote.</p>
4280 </blockquote>
4281 </li>
4282 </ol>
4283 ````````````````````````````````
4284
4285
4286 Indented three spaces:
4287
4288 ```````````````````````````````` example
4289    1.  A paragraph
4290        with two lines.
4291
4292            indented code
4293
4294        > A block quote.
4295 .
4296 <ol>
4297 <li>
4298 <p>A paragraph
4299 with two lines.</p>
4300 <pre><code>indented code
4301 </code></pre>
4302 <blockquote>
4303 <p>A block quote.</p>
4304 </blockquote>
4305 </li>
4306 </ol>
4307 ````````````````````````````````
4308
4309
4310 Four spaces indent gives a code block:
4311
4312 ```````````````````````````````` example
4313     1.  A paragraph
4314         with two lines.
4315
4316             indented code
4317
4318         > A block quote.
4319 .
4320 <pre><code>1.  A paragraph
4321     with two lines.
4322
4323         indented code
4324
4325     &gt; A block quote.
4326 </code></pre>
4327 ````````````````````````````````
4328
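Taken together, rules #1 to #4 fix the column at which the content of a list item begins, and therefore how far later blocks must be indented to belong to the item. The following is a minimal sketch of that computation, not a reference implementation. It assumes that tabs have already been expanded and it ignores lines that would be parsed as thematic breaks; `LIST_ITEM` and `content_offset` are hypothetical names.

```python
import re

# Hypothetical pattern: indentation of 0-3 spaces, a bullet or ordered
# list marker, then the remainder of the line.
LIST_ITEM = re.compile(r"^( {0,3})([-+*]|[0-9]{1,9}[.)])(.*)$")


def content_offset(line):
    """Return the column where the list item's content starts, or None."""
    match = LIST_ITEM.match(line)
    if match is None:
        return None
    indent, marker, rest = match.groups()
    if rest and not rest.startswith(" "):
        return None  # a marker must be followed by a space or the line end
    spaces = len(rest) - len(rest.lstrip(" "))
    if rest.strip() == "" or spaces >= 5:
        # Blank first line (rule #3) or indented code (rule #2):
        # only one space after the marker is counted.
        spaces = 1
    return len(indent) + len(marker) + spaces
```

For `  10.  foo` this gives 7, so an indented code block inside the item needs 7 + 4 = 11 spaces, as in the example under rule #2. A line indented four spaces yields `None`, since it begins an indented code block instead.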
4329
4330
4331 5.  **Laziness.**  If a string of lines *Ls* constitutes a [list
4332     item](#list-items) with contents *Bs*, then the result of deleting
4333     some or all of the indentation from one or more lines in which the
4334     next [non-whitespace character] after the indentation is
4335     [paragraph continuation text] is a
4336     list item with the same contents and attributes.  The unindented
4337     lines are called
4338     [lazy continuation line](@)s.
4339
4340 Here is an example with [lazy continuation lines]:
4341
4342 ```````````````````````````````` example
4343   1.  A paragraph
4344 with two lines.
4345
4346           indented code
4347
4348       > A block quote.
4349 .
4350 <ol>
4351 <li>
4352 <p>A paragraph
4353 with two lines.</p>
4354 <pre><code>indented code
4355 </code></pre>
4356 <blockquote>
4357 <p>A block quote.</p>
4358 </blockquote>
4359 </li>
4360 </ol>
4361 ````````````````````````````````
4362
4363
4364 Indentation can be partially deleted:
4365
4366 ```````````````````````````````` example
4367   1.  A paragraph
4368     with two lines.
4369 .
4370 <ol>
4371 <li>A paragraph
4372 with two lines.</li>
4373 </ol>
4374 ````````````````````````````````
4375
4376
4377 These examples show how laziness can work in nested structures:
4378
4379 ```````````````````````````````` example
4380 > 1. > Blockquote
4381 continued here.
4382 .
4383 <blockquote>
4384 <ol>
4385 <li>
4386 <blockquote>
4387 <p>Blockquote
4388 continued here.</p>
4389 </blockquote>
4390 </li>
4391 </ol>
4392 </blockquote>
4393 ````````````````````````````````
4394
4395
4396 ```````````````````````````````` example
4397 > 1. > Blockquote
4398 > continued here.
4399 .
4400 <blockquote>
4401 <ol>
4402 <li>
4403 <blockquote>
4404 <p>Blockquote
4405 continued here.</p>
4406 </blockquote>
4407 </li>
4408 </ol>
4409 </blockquote>
4410 ````````````````````````````````
4411
4412
4413
4414 6.  **That's all.** Nothing that is not counted as a list item by rules
4415     #1--5 counts as a [list item](#list-items).
4416
4417 The rules for sublists follow from the general rules above.  A sublist
4418 must be indented the same number of spaces a paragraph would need to be
4419 in order to be included in the list item.
4420
4421 So, in this case we need two spaces indent:
4422
4423 ```````````````````````````````` example
4424 - foo
4425   - bar
4426     - baz
4427       - boo
4428 .
4429 <ul>
4430 <li>foo
4431 <ul>
4432 <li>bar
4433 <ul>
4434 <li>baz
4435 <ul>
4436 <li>boo</li>
4437 </ul>
4438 </li>
4439 </ul>
4440 </li>
4441 </ul>
4442 </li>
4443 </ul>
4444 ````````````````````````````````
4445
4446
4447 One is not enough:
4448
4449 ```````````````````````````````` example
4450 - foo
4451  - bar
4452   - baz
4453    - boo
4454 .
4455 <ul>
4456 <li>foo</li>
4457 <li>bar</li>
4458 <li>baz</li>
4459 <li>boo</li>
4460 </ul>
4461 ````````````````````````````````
4462
4463
4464 Here we need four, because the list marker is wider:
4465
4466 ```````````````````````````````` example
4467 10) foo
4468     - bar
4469 .
4470 <ol start="10">
4471 <li>foo
4472 <ul>
4473 <li>bar</li>
4474 </ul>
4475 </li>
4476 </ol>
4477 ````````````````````````````````
4478
4479
4480 Three is not enough:
4481
4482 ```````````````````````````````` example
4483 10) foo
4484    - bar
4485 .
4486 <ol start="10">
4487 <li>foo</li>
4488 </ol>
4489 <ul>
4490 <li>bar</li>
4491 </ul>
4492 ````````````````````````````````
4493
4494
4495 A list may be the first block in a list item:
4496
4497 ```````````````````````````````` example
4498 - - foo
4499 .
4500 <ul>
4501 <li>
4502 <ul>
4503 <li>foo</li>
4504 </ul>
4505 </li>
4506 </ul>
4507 ````````````````````````````````
4508
4509
4510 ```````````````````````````````` example
4511 1. - 2. foo
4512 .
4513 <ol>
4514 <li>
4515 <ul>
4516 <li>
4517 <ol start="2">
4518 <li>foo</li>
4519 </ol>
4520 </li>
4521 </ul>
4522 </li>
4523 </ol>
4524 ````````````````````````````````
4525
4526
4527 A list item can contain a heading:
4528
4529 ```````````````````````````````` example
4530 - # Foo
4531 - Bar
4532   ---
4533   baz
4534 .
4535 <ul>
4536 <li>
4537 <h1>Foo</h1>
4538 </li>
4539 <li>
4540 <h2>Bar</h2>
4541 baz</li>
4542 </ul>
4543 ````````````````````````````````
4544
4545
4546 ### Motivation
4547
4548 John Gruber's Markdown spec says the following about list items:
4549
4550 1. "List markers typically start at the left margin, but may be indented
4551    by up to three spaces. List markers must be followed by one or more
4552    spaces or a tab."
4553
4554 2. "To make lists look nice, you can wrap items with hanging indents....
4555    But if you don't want to, you don't have to."
4556
4557 3. "List items may consist of multiple paragraphs. Each subsequent
4558    paragraph in a list item must be indented by either 4 spaces or one
4559    tab."
4560
4561 4. "It looks nice if you indent every line of the subsequent paragraphs,
4562    but here again, Markdown will allow you to be lazy."
4563
4564 5. "To put a blockquote within a list item, the blockquote's `>`
4565    delimiters need to be indented."
4566
4567 6. "To put a code block within a list item, the code block needs to be
4568    indented twice — 8 spaces or two tabs."
4569
4570 These rules specify that a paragraph under a list item must be indented
4571 four spaces (presumably, from the left margin, rather than the start of
4572 the list marker, but this is not said), and that code under a list item
4573 must be indented eight spaces instead of the usual four.  They also say
4574 that a block quote must be indented, but not by how much; however, the
4575 example given has four spaces indentation.  Although nothing is said
4576 about other kinds of block-level content, it is certainly reasonable to
4577 infer that *all* block elements under a list item, including other
4578 lists, must be indented four spaces.  This principle has been called the
4579 *four-space rule*.
4580
4581 The four-space rule is clear and principled, and if the reference
4582 implementation `Markdown.pl` had followed it, it probably would have
4583 become the standard.  However, `Markdown.pl` allowed paragraphs and
4584 sublists to start with only two spaces indentation, at least on the
4585 outer level.  Worse, its behavior was inconsistent: a sublist of an
4586 outer-level list needed two spaces indentation, but a sublist of this
4587 sublist needed three spaces.  It is not surprising, then, that different
4588 implementations of Markdown have developed very different rules for
4589 determining what comes under a list item.  (Pandoc and python-Markdown,
4590 for example, stuck with Gruber's syntax description and the four-space
4591 rule, while discount, redcarpet, marked, PHP Markdown, and others
4592 followed `Markdown.pl`'s behavior more closely.)
4593
4594 Unfortunately, given the divergences between implementations, there
4595 is no way to give a spec for list items that will be guaranteed not
4596 to break any existing documents.  However, the spec given here should
4597 correctly handle lists formatted with either the four-space rule or
4598 the more forgiving `Markdown.pl` behavior, provided they are laid out
4599 in a way that is natural for a human to read.
4600
4601 The strategy here is to let the width and indentation of the list marker
4602 determine the indentation necessary for blocks to fall under the list
4603 item, rather than having a fixed and arbitrary number.  The writer can
4604 think of the body of the list item as a unit which gets indented to the
4605 right enough to fit the list marker (and any indentation on the list
4606 marker).  (The laziness rule, #5, then allows continuation lines to be
4607 unindented if needed.)
4608
4609 This rule is superior, we claim, to any rule requiring a fixed level of
4610 indentation from the margin.  The four-space rule is clear but
4611 unnatural. It is quite unintuitive that
4612
4613 ``` markdown
4614 - foo
4615
4616   bar
4617
4618   - baz
4619 ```
4620
4621 should be parsed as two lists with an intervening paragraph,
4622
4623 ``` html
4624 <ul>
4625 <li>foo</li>
4626 </ul>
4627 <p>bar</p>
4628 <ul>
4629 <li>baz</li>
4630 </ul>
4631 ```
4632
4633 as the four-space rule demands, rather than a single list,
4634
4635 ``` html
4636 <ul>
4637 <li>
4638 <p>foo</p>
4639 <p>bar</p>
4640 <ul>
4641 <li>baz</li>
4642 </ul>
4643 </li>
4644 </ul>
4645 ```
4646
4647 The choice of four spaces is arbitrary.  It can be learned, but it is
4648 not likely to be guessed, and it trips up beginners regularly.
4649
4650 Would it help to adopt a two-space rule?  The problem is that such
4651 a rule, together with the rule allowing 1--3 spaces indentation of the
4652 initial list marker, allows text that is indented *less than* the
4653 original list marker to be included in the list item. For example,
4654 `Markdown.pl` parses
4655
4656 ``` markdown
4657    - one
4658
4659   two
4660 ```
4661
4662 as a single list item, with `two` a continuation paragraph:
4663
4664 ``` html
4665 <ul>
4666 <li>
4667 <p>one</p>
4668 <p>two</p>
4669 </li>
4670 </ul>
4671 ```
4672
4673 and similarly
4674
4675 ``` markdown
4676 >   - one
4677 >
4678 >  two
4679 ```
4680
4681 as
4682
4683 ``` html
4684 <blockquote>
4685 <ul>
4686 <li>
4687 <p>one</p>
4688 <p>two</p>
4689 </li>
4690 </ul>
4691 </blockquote>
4692 ```
4693
4694 This is extremely unintuitive.
4695
4696 Rather than requiring a fixed indent from the margin, we could require
4697 a fixed indent (say, two spaces, or even one space) from the list marker (which
4698 may itself be indented).  This proposal would remove the last anomaly
4699 discussed.  Unlike the spec presented above, it would count the following
4700 as a list item with a subparagraph, even though the paragraph `bar`
4701 is not indented as far as the first paragraph `foo`:
4702
4703 ``` markdown
4704  10. foo
4705
4706    bar  
4707 ```
4708
4709 Arguably this text does read like a list item with `bar` as a subparagraph,
4710 which may count in favor of the proposal.  However, on this proposal indented
4711 code would have to be indented six spaces after the list marker.  And this
4712 would break a lot of existing Markdown, which has the pattern:
4713
4714 ``` markdown
4715 1.  foo
4716
4717         indented code
4718 ```
4719
4720 where the code is indented eight spaces.  The spec above, by contrast, will
4721 parse this text as expected, since the code block's indentation is measured
4722 from the beginning of `foo`.
4723
4724 The one case that needs special treatment is a list item that *starts*
4725 with indented code.  How much indentation is required in that case, since
4726 we don't have a "first paragraph" to measure from?  Rule #2 simply stipulates
4727 that in such cases, we require one space indentation from the list marker
4728 (and then the normal four spaces for the indented code).  This will match the
4729 four-space rule in cases where the list marker plus its initial indentation
4730 takes four spaces (a common case), but diverge in other cases.
4731
4732 ## Lists
4733
4734 A [list](@) is a sequence of one or more
4735 list items [of the same type].  The list items
4736 may be separated by any number of blank lines.
4737
4738 Two list items are [of the same type](@)
4739 if they begin with a [list marker] of the same type.
4740 Two list markers are of the
4741 same type if (a) they are bullet list markers using the same character
4742 (`-`, `+`, or `*`) or (b) they are ordered list numbers with the same
4743 delimiter (either `.` or `)`).
4744
4745 A list is an [ordered list](@)
4746 if its constituent list items begin with
4747 [ordered list markers], and a
4748 [bullet list](@) if its constituent list
4749 items begin with [bullet list markers].
4750
4751 The [start number](@)
4752 of an [ordered list] is determined by the list number of
4753 its initial list item.  The numbers of subsequent list items are
4754 disregarded.
4755
4756 A list is [loose](@) if any of its constituent
4757 list items are separated by blank lines, or if any of its constituent
4758 list items directly contain two block-level elements with a blank line
4759 between them.  Otherwise a list is [tight](@).
4760 (The difference in HTML output is that paragraphs in a loose list are
4761 wrapped in `<p>` tags, while paragraphs in a tight list are not.)
4762
4763 Changing the bullet or ordered list delimiter starts a new list:
4764
4765 ```````````````````````````````` example
4766 - foo
4767 - bar
4768 + baz
4769 .
4770 <ul>
4771 <li>foo</li>
4772 <li>bar</li>
4773 </ul>
4774 <ul>
4775 <li>baz</li>
4776 </ul>
4777 ````````````````````````````````
4778
4779
4780 ```````````````````````````````` example
4781 1. foo
4782 2. bar
4783 3) baz
4784 .
4785 <ol>
4786 <li>foo</li>
4787 <li>bar</li>
4788 </ol>
4789 <ol start="3">
4790 <li>baz</li>
4791 </ol>
4792 ````````````````````````````````
4793
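The test for whether two list items are of the same type can be sketched as follows. This is an illustration only, under the marker syntax given in this spec; `MARKER`, `marker_type`, and `same_type` are hypothetical names.

```python
import re

# Hypothetical pattern: captures either the bullet character or the
# delimiter that follows the digits of an ordered list marker.
MARKER = re.compile(r"^ {0,3}(?:([-+*])|[0-9]{1,9}([.)]))(?=[ \t]|$)")


def marker_type(line):
    """Return ("bullet", char) or ("ordered", delimiter), or None."""
    match = MARKER.match(line)
    if match is None:
        return None
    bullet, delimiter = match.groups()
    return ("bullet", bullet) if bullet else ("ordered", delimiter)


def same_type(first, second):
    """True if both lines begin with list markers of the same type."""
    kind = marker_type(first)
    return kind is not None and kind == marker_type(second)
```

The list number itself plays no part in the comparison, so `1. foo` and `2. bar` are of the same type, while `2. bar` and `3) baz` are not.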
4794
4795 In CommonMark, a list can interrupt a paragraph. That is,
4796 no blank line is needed to separate a paragraph from a following
4797 list:
4798
4799 ```````````````````````````````` example
4800 Foo
4801 - bar
4802 - baz
4803 .
4804 <p>Foo</p>
4805 <ul>
4806 <li>bar</li>
4807 <li>baz</li>
4808 </ul>
4809 ````````````````````````````````
4810
4811 `Markdown.pl` does not allow this, through fear of triggering a list
4812 via a numeral in a hard-wrapped line:
4813
4814 ``` markdown
4815 The number of windows in my house is
4816 14.  The number of doors is 6.
4817 ```
4818
4819 Oddly, though, `Markdown.pl` *does* allow a blockquote to
4820 interrupt a paragraph, even though the same considerations might
4821 apply.
4822
4823 In CommonMark, we do allow lists to interrupt paragraphs, for
4824 two reasons.  First, it is natural and not uncommon for people
4825 to start lists without blank lines:
4826
4827 ``` markdown
4828 I need to buy
4829 - new shoes
4830 - a coat
4831 - a plane ticket
4832 ```
4833
4834 Second, we are attracted to a
4835
4836 > [principle of uniformity](@):
4837 > if a chunk of text has a certain
4838 > meaning, it will continue to have the same meaning when put into a
4839 > container block (such as a list item or blockquote).
4840
4841 (Indeed, the spec for [list items] and [block quotes] presupposes
4842 this principle.) This principle implies that if
4843
4844 ``` markdown
4845   * I need to buy
4846     - new shoes
4847     - a coat
4848     - a plane ticket
4849 ```
4850
4851 is a list item containing a paragraph followed by a nested sublist,
4852 as all Markdown implementations agree it is (though the paragraph
4853 may be rendered without `<p>` tags, since the list is "tight"),
4854 then
4855
4856 ``` markdown
4857 I need to buy
4858 - new shoes
4859 - a coat
4860 - a plane ticket
4861 ```
4862
4863 by itself should be a paragraph followed by a list.
4864
4865 Since it is well-established Markdown practice to allow lists to
4866 interrupt paragraphs inside list items, the [principle of
4867 uniformity] requires us to allow this outside list items as
4868 well.  ([reStructuredText](http://docutils.sourceforge.net/rst.html)
4869 takes a different approach, requiring blank lines before lists
4870 even inside other list items.)
4871
4872 In order to solve the problem of unwanted lists in paragraphs with
4873 hard-wrapped numerals, we allow only lists starting with `1` to
4874 interrupt paragraphs.  Thus,
4875
4876 ```````````````````````````````` example
4877 The number of windows in my house is
4878 14.  The number of doors is 6.
4879 .
4880 <p>The number of windows in my house is
4881 14.  The number of doors is 6.</p>
4882 ````````````````````````````````
4883
4884 We may still get an unintended result in cases like
4885
4886 ```````````````````````````````` example
4887 The number of windows in my house is
4888 1.  The number of doors is 6.
4889 .
4890 <p>The number of windows in my house is</p>
4891 <ol>
4892 <li>The number of doors is 6.</li>
4893 </ol>
4894 ````````````````````````````````
4895
4896 but this rule should prevent most spurious list captures.
4897
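The two restrictions on interrupting a paragraph (the item must not be empty, and an ordered item must start with `1`) can be sketched as a single check. This is a simplified illustration: it does not recognize thematic breaks or setext heading underlines, and `ITEM` and `can_interrupt_paragraph` are hypothetical names.

```python
import re

# Hypothetical pattern: a list marker that is optionally followed by
# whitespace and some non-blank content.
ITEM = re.compile(r"^ {0,3}(?:[-+*]|([0-9]{1,9})[.)])(?:[ \t]+(\S.*))?$")


def can_interrupt_paragraph(line):
    """True if the line may start a list directly after paragraph text."""
    match = ITEM.match(line)
    if match is None:
        return False
    number, content = match.groups()
    if content is None:
        return False  # an empty list item cannot interrupt a paragraph
    # Bullet items always qualify; ordered items only when numbered 1.
    return number is None or int(number) == 1
```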
4898 There can be any number of blank lines between items:
4899
4900 ```````````````````````````````` example
4901 - foo
4902
4903 - bar
4904
4905
4906 - baz
4907 .
4908 <ul>
4909 <li>
4910 <p>foo</p>
4911 </li>
4912 <li>
4913 <p>bar</p>
4914 </li>
4915 <li>
4916 <p>baz</p>
4917 </li>
4918 </ul>
4919 ````````````````````````````````
4920
4921 ```````````````````````````````` example
4922 - foo
4923   - bar
4924     - baz
4925
4926
4927       bim
4928 .
4929 <ul>
4930 <li>foo
4931 <ul>
4932 <li>bar
4933 <ul>
4934 <li>
4935 <p>baz</p>
4936 <p>bim</p>
4937 </li>
4938 </ul>
4939 </li>
4940 </ul>
4941 </li>
4942 </ul>
4943 ````````````````````````````````
4944
4945
4946 To separate consecutive lists of the same type, or to separate a
4947 list from an indented code block that would otherwise be parsed
4948 as a subparagraph of the final list item, you can insert a blank HTML
4949 comment:
4950
4951 ```````````````````````````````` example
4952 - foo
4953 - bar
4954
4955 <!-- -->
4956
4957 - baz
4958 - bim
4959 .
4960 <ul>
4961 <li>foo</li>
4962 <li>bar</li>
4963 </ul>
4964 <!-- -->
4965 <ul>
4966 <li>baz</li>
4967 <li>bim</li>
4968 </ul>
4969 ````````````````````````````````
4970
4971
4972 ```````````````````````````````` example
4973 -   foo
4974
4975     notcode
4976
4977 -   foo
4978
4979 <!-- -->
4980
4981     code
4982 .
4983 <ul>
4984 <li>
4985 <p>foo</p>
4986 <p>notcode</p>
4987 </li>
4988 <li>
4989 <p>foo</p>
4990 </li>
4991 </ul>
4992 <!-- -->
4993 <pre><code>code
4994 </code></pre>
4995 ````````````````````````````````
4996
4997
4998 List items need not be indented to the same level.  The following
4999 list items will be treated as items at the same list level,
5000 since none is indented enough to belong to the previous list
5001 item:
5002
5003 ```````````````````````````````` example
5004 - a
5005  - b
5006   - c
5007    - d
5008     - e
5009    - f
5010   - g
5011  - h
5012 - i
5013 .
5014 <ul>
5015 <li>a</li>
5016 <li>b</li>
5017 <li>c</li>
5018 <li>d</li>
5019 <li>e</li>
5020 <li>f</li>
5021 <li>g</li>
5022 <li>h</li>
5023 <li>i</li>
5024 </ul>
5025 ````````````````````````````````
5026
5027
5028 ```````````````````````````````` example
5029 1. a
5030
5031   2. b
5032
5033     3. c
5034 .
5035 <ol>
5036 <li>
5037 <p>a</p>
5038 </li>
5039 <li>
5040 <p>b</p>
5041 </li>
5042 <li>
5043 <p>c</p>
5044 </li>
5045 </ol>
5046 ````````````````````````````````
5047
5048
5049 This is a loose list, because there is a blank line between
5050 two of the list items:
5051
5052 ```````````````````````````````` example
5053 - a
5054 - b
5055
5056 - c
5057 .
5058 <ul>
5059 <li>
5060 <p>a</p>
5061 </li>
5062 <li>
5063 <p>b</p>
5064 </li>
5065 <li>
5066 <p>c</p>
5067 </li>
5068 </ul>
5069 ````````````````````````````````
5070
5071
5072 So is this, with an empty second item:
5073
5074 ```````````````````````````````` example
5075 * a
5076 *
5077
5078 * c
5079 .
5080 <ul>
5081 <li>
5082 <p>a</p>
5083 </li>
5084 <li></li>
5085 <li>
5086 <p>c</p>
5087 </li>
5088 </ul>
5089 ````````````````````````````````
5090
5091
5092 These are loose lists, even though there are no blank lines between the items,
5093 because one of the items directly contains two block-level elements
5094 with a blank line between them:
5095
5096 ```````````````````````````````` example
5097 - a
5098 - b
5099
5100   c
5101 - d
5102 .
5103 <ul>
5104 <li>
5105 <p>a</p>
5106 </li>
5107 <li>
5108 <p>b</p>
5109 <p>c</p>
5110 </li>
5111 <li>
5112 <p>d</p>
5113 </li>
5114 </ul>
5115 ````````````````````````````````
5116
5117
5118 ```````````````````````````````` example
5119 - a
5120 - b
5121
5122   [ref]: /url
5123 - d
5124 .
5125 <ul>
5126 <li>
5127 <p>a</p>
5128 </li>
5129 <li>
5130 <p>b</p>
5131 </li>
5132 <li>
5133 <p>d</p>
5134 </li>
5135 </ul>
5136 ````````````````````````````````
5137
5138
5139 This is a tight list, because the blank lines are in a code block:
5140
5141 ```````````````````````````````` example
5142 - a
5143 - ```
5144   b
5145
5146
5147   ```
5148 - c
5149 .
5150 <ul>
5151 <li>a</li>
5152 <li>
5153 <pre><code>b
5154
5155
5156 </code></pre>
5157 </li>
5158 <li>c</li>
5159 </ul>
5160 ````````````````````````````````
5161
5162
5163 This is a tight list, because the blank line is between two
5164 paragraphs of a sublist.  So the sublist is loose while
5165 the outer list is tight:
5166
5167 ```````````````````````````````` example
5168 - a
5169   - b
5170
5171     c
5172 - d
5173 .
5174 <ul>
5175 <li>a
5176 <ul>
5177 <li>
5178 <p>b</p>
5179 <p>c</p>
5180 </li>
5181 </ul>
5182 </li>
5183 <li>d</li>
5184 </ul>
5185 ````````````````````````````````
5186
5187
5188 This is a tight list, because the blank line is inside the
5189 block quote:
5190
5191 ```````````````````````````````` example
5192 * a
5193   > b
5194   >
5195 * c
5196 .
5197 <ul>
5198 <li>a
5199 <blockquote>
5200 <p>b</p>
5201 </blockquote>
5202 </li>
5203 <li>c</li>
5204 </ul>
5205 ````````````````````````````````
5206
5207
5208 This list is tight, because the consecutive block elements
5209 are not separated by blank lines:
5210
5211 ```````````````````````````````` example
5212 - a
5213   > b
5214   ```
5215   c
5216   ```
5217 - d
5218 .
5219 <ul>
5220 <li>a
5221 <blockquote>
5222 <p>b</p>
5223 </blockquote>
5224 <pre><code>c
5225 </code></pre>
5226 </li>
5227 <li>d</li>
5228 </ul>
5229 ````````````````````````````````
5230
5231
5232 A single-paragraph list is tight:
5233
5234 ```````````````````````````````` example
5235 - a
5236 .
5237 <ul>
5238 <li>a</li>
5239 </ul>
5240 ````````````````````````````````
5241
5242
5243 ```````````````````````````````` example
5244 - a
5245   - b
5246 .
5247 <ul>
5248 <li>a
5249 <ul>
5250 <li>b</li>
5251 </ul>
5252 </li>
5253 </ul>
5254 ````````````````````````````````
5255
5256
5257 This list is loose, because of the blank line between the
5258 two block elements in the list item:
5259
5260 ```````````````````````````````` example
5261 1. ```
5262    foo
5263    ```
5264
5265    bar
5266 .
5267 <ol>
5268 <li>
5269 <pre><code>foo
5270 </code></pre>
5271 <p>bar</p>
5272 </li>
5273 </ol>
5274 ````````````````````````````````
5275
5276
5277 Here the outer list is loose, the inner list tight:
5278
5279 ```````````````````````````````` example
5280 * foo
5281   * bar
5282
5283   baz
5284 .
5285 <ul>
5286 <li>
5287 <p>foo</p>
5288 <ul>
5289 <li>bar</li>
5290 </ul>
5291 <p>baz</p>
5292 </li>
5293 </ul>
5294 ````````````````````````````````
5295
5296
5297 ```````````````````````````````` example
5298 - a
5299   - b
5300   - c
5301
5302 - d
5303   - e
5304   - f
5305 .
5306 <ul>
5307 <li>
5308 <p>a</p>
5309 <ul>
5310 <li>b</li>
5311 <li>c</li>
5312 </ul>
5313 </li>
5314 <li>
5315 <p>d</p>
5316 <ul>
5317 <li>e</li>
5318 <li>f</li>
5319 </ul>
5320 </li>
5321 </ul>
5322 ````````````````````````````````
5323
5324
5325 # Inlines
5326
5327 Inlines are parsed sequentially from the beginning of the character
5328 stream to the end (left to right, in left-to-right languages).
5329 Thus, for example, in
5330
5331 ```````````````````````````````` example
5332 `hi`lo`
5333 .
5334 <p><code>hi</code>lo`</p>
5335 ````````````````````````````````
5336
5337
5338 `hi` is parsed as code, leaving the backtick at the end as a literal
5339 backtick.
5340
5341 ## Backslash escapes
5342
5343 Any ASCII punctuation character may be backslash-escaped:
5344
5345 ```````````````````````````````` example
5346 \!\"\#\$\%\&\'\(\)\*\+\,\-\.\/\:\;\<\=\>\?\@\[\\\]\^\_\`\{\|\}\~
5347 .
5348 <p>!&quot;#$%&amp;'()*+,-./:;&lt;=&gt;?@[\]^_`{|}~</p>
5349 ````````````````````````````````
5350
5351
5352 Backslashes before other characters are treated as literal
5353 backslashes:
5354
5355 ```````````````````````````````` example
5356 \→\A\a\ \3\φ\«
5357 .
5358 <p>\→\A\a\ \3\φ\«</p>
5359 ````````````````````````````````
5360
5361
5362 Escaped characters are treated as regular characters and do
5363 not have their usual Markdown meanings:
5364
5365 ```````````````````````````````` example
5366 \*not emphasized*
5367 \<br/> not a tag
5368 \[not a link](/foo)
5369 \`not code`
5370 1\. not a list
5371 \* not a list
5372 \# not a heading
5373 \[foo]: /url "not a reference"
5374 .
5375 <p>*not emphasized*
5376 &lt;br/&gt; not a tag
5377 [not a link](/foo)
5378 `not code`
5379 1. not a list
5380 * not a list
5381 # not a heading
5382 [foo]: /url &quot;not a reference&quot;</p>
5383 ````````````````````````````````
5384
5385
5386 If a backslash is itself escaped, the following character is not:
5387
5388 ```````````````````````````````` example
5389 \\*emphasis*
5390 .
5391 <p>\<em>emphasis</em></p>
5392 ````````````````````````````````
5393
5394
5395 A backslash at the end of the line is a [hard line break]:
5396
5397 ```````````````````````````````` example
5398 foo\
5399 bar
5400 .
5401 <p>foo<br />
5402 bar</p>
5403 ````````````````````````````````
5404
5405
5406 Backslash escapes do not work in code blocks, code spans, autolinks, or
5407 raw HTML:
5408
5409 ```````````````````````````````` example
5410 `` \[\` ``
5411 .
5412 <p><code>\[\`</code></p>
5413 ````````````````````````````````
5414
5415
5416 ```````````````````````````````` example
5417     \[\]
5418 .
5419 <pre><code>\[\]
5420 </code></pre>
5421 ````````````````````````````````
5422
5423
5424 ```````````````````````````````` example
5425 ~~~
5426 \[\]
5427 ~~~
5428 .
5429 <pre><code>\[\]
5430 </code></pre>
5431 ````````````````````````````````
5432
5433
5434 ```````````````````````````````` example
5435 <http://example.com?find=\*>
5436 .
5437 <p><a href="http://example.com?find=%5C*">http://example.com?find=\*</a></p>
5438 ````````````````````````````````
5439
5440
5441 ```````````````````````````````` example
5442 <a href="/bar\/)">
5443 .
5444 <a href="/bar\/)">
5445 ````````````````````````````````
5446
5447
5448 But they work in all other contexts, including URLs and link titles,
5449 link references, and [info strings] in [fenced code blocks]:
5450
5451 ```````````````````````````````` example
5452 [foo](/bar\* "ti\*tle")
5453 .
5454 <p><a href="/bar*" title="ti*tle">foo</a></p>
5455 ````````````````````````````````
5456
5457
5458 ```````````````````````````````` example
5459 [foo]
5460
5461 [foo]: /bar\* "ti\*tle"
5462 .
5463 <p><a href="/bar*" title="ti*tle">foo</a></p>
5464 ````````````````````````````````
5465
5466
5467 ```````````````````````````````` example
5468 ``` foo\+bar
5469 foo
5470 ```
5471 .
5472 <pre><code class="language-foo+bar">foo
5473 </code></pre>
5474 ````````````````````````````````
5475
5476
5477
5478 ## Entity and numeric character references
5479
5480 All valid HTML entity references and numeric character
5481 references, except those occurring in code blocks and code spans,
5482 are recognized as such and treated as equivalent to the
5483 corresponding Unicode characters.  Conforming CommonMark parsers
5484 need not store information about whether a particular character
5485 was represented in the source using a Unicode character or
5486 an entity reference.
5487
5488 [Entity references](@) consist of `&` + any of the valid
5489 HTML5 entity names + `;`. The
5490 document <https://html.spec.whatwg.org/multipage/entities.json>
5491 is used as an authoritative source for the valid entity
5492 references and their corresponding code points.
5493
5494 ```````````````````````````````` example
5495 &nbsp; &amp; &copy; &AElig; &Dcaron;
5496 &frac34; &HilbertSpace; &DifferentialD;
5497 &ClockwiseContourIntegral; &ngE;
5498 .
5499 <p>  &amp; © Æ Ď
5500 ¾ ℋ ⅆ
5501 ∲ ≧̸</p>
5502 ````````````````````````````````
5503
5504
5505 [Decimal numeric character
5506 references](@)
5507 consist of `&#` + a string of 1--8 arabic digits + `;`. A
5508 numeric character reference is parsed as the corresponding
5509 Unicode character. Invalid Unicode code points will be replaced by
5510 the REPLACEMENT CHARACTER (`U+FFFD`).  For security reasons,
5511 the code point `U+0000` will also be replaced by `U+FFFD`.
5512
5513 ```````````````````````````````` example
5514 &#35; &#1234; &#992; &#98765432; &#0;
5515 .
5516 <p># Ӓ Ϡ � �</p>
5517 ````````````````````````````````
5518
5519
5520 [Hexadecimal numeric character
5521 references](@) consist of `&#` +
5522 either `X` or `x` + a string of 1--8 hexadecimal digits + `;`.
5523 They too are parsed as the corresponding Unicode character (this
5524 time specified with a hexadecimal numeral instead of decimal).
5525
5526 ```````````````````````````````` example
5527 &#X22; &#XD06; &#xcab;
5528 .
5529 <p>&quot; ആ ಫ</p>
5530 ````````````````````````````````
5531
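The decimal and hexadecimal forms can be decoded together. The following Python sketch is an illustration, not part of the spec; the name `decode_numeric_ref` is hypothetical. It treats surrogates and values above `U+10FFFF` as the "invalid" code points, which is an assumption, since the text above does not enumerate them.

```python
import re
from typing import Optional

# "&#" + 1-8 decimal digits + ";"  or  "&#" + X/x + 1-8 hex digits + ";"
_NUMERIC_REF = re.compile(r"&#(?:([0-9]{1,8})|[Xx]([0-9A-Fa-f]{1,8}));")


def decode_numeric_ref(ref: str) -> Optional[str]:
    """Return the character for a numeric reference, or None if malformed."""
    m = _NUMERIC_REF.fullmatch(ref)
    if m is None:
        return None
    if m.group(1) is not None:
        code = int(m.group(1))
    else:
        code = int(m.group(2), 16)
    # Assumption: "invalid" means out of range or a surrogate.
    # U+0000 is replaced as well, as required above.
    if code == 0 or code > 0x10FFFF or 0xD800 <= code <= 0xDFFF:
        return "\ufffd"
    return chr(code)
```

Strings such as `&#;` and `&#x;` contain no digits, so the sketch returns `None` for them and a caller would keep them as literal text.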
5532
5533 Here are some nonentities:
5534
5535 ```````````````````````````````` example
5536 &nbsp &x; &#; &#x;
5537 &ThisIsNotDefined; &hi?;
5538 .
5539 <p>&amp;nbsp &amp;x; &amp;#; &amp;#x;
5540 &amp;ThisIsNotDefined; &amp;hi?;</p>
5541 ````````````````````````````````
5542
5543
5544 Although HTML5 does accept some entity references
5545 without a trailing semicolon (such as `&copy`), these are not
5546 recognized here, because it makes the grammar too ambiguous:
5547
5548 ```````````````````````````````` example
5549 &copy
5550 .
5551 <p>&amp;copy</p>
5552 ````````````````````````````````
5553
5554
5555 Strings that are not on the list of HTML5 named entities are not
5556 recognized as entity references either:
5557
5558 ```````````````````````````````` example
5559 &MadeUpEntity;
5560 .
5561 <p>&amp;MadeUpEntity;</p>
5562 ````````````````````````````````
5563
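A lookup of this kind might be sketched in Python as follows. The sketch uses the standard library's `html.entities.html5` table as a stand-in for `entities.json`; that substitution, and the name `resolve_entity`, are assumptions of this example rather than anything the spec prescribes.

```python
from html.entities import html5
from typing import Optional


def resolve_entity(ref: str) -> Optional[str]:
    """Return the character(s) for a named entity reference, or None.

    The trailing semicolon is required, even for names that HTML5
    itself would accept without one.
    """
    if not (ref.startswith("&") and ref.endswith(";")):
        return None
    # Keys in html5 include the trailing semicolon, e.g. "copy;".
    return html5.get(ref[1:])
```

A `None` result means the text is not an entity reference and is kept literally, with its `&` escaped in HTML output.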
5564
5565 Entity and numeric character references are recognized in any
5566 context besides code spans or code blocks, including
5567 URLs, [link titles], and [fenced code block][] [info strings]:
5568
5569 ```````````````````````````````` example
5570 <a href="&ouml;&ouml;.html">
5571 .
5572 <a href="&ouml;&ouml;.html">
5573 ````````````````````````````````
5574
5575
5576 ```````````````````````````````` example
5577 [foo](/f&ouml;&ouml; "f&ouml;&ouml;")
5578 .
5579 <p><a href="/f%C3%B6%C3%B6" title="föö">foo</a></p>
5580 ````````````````````````````````
5581
5582
5583 ```````````````````````````````` example
5584 [foo]
5585
5586 [foo]: /f&ouml;&ouml; "f&ouml;&ouml;"
5587 .
5588 <p><a href="/f%C3%B6%C3%B6" title="föö">foo</a></p>
5589 ````````````````````````````````
5590
5591
5592 ```````````````````````````````` example
5593 ``` f&ouml;&ouml;
5594 foo
5595 ```
5596 .
5597 <pre><code class="language-föö">foo
5598 </code></pre>
5599 ````````````````````````````````
5600
5601
5602 Entity and numeric character references are treated as literal
5603 text in code spans and code blocks:
5604
5605 ```````````````````````````````` example
5606 `f&ouml;&ouml;`
5607 .
5608 <p><code>f&amp;ouml;&amp;ouml;</code></p>
5609 ````````````````````````````````
5610
5611
5612 ```````````````````````````````` example
5613     f&ouml;f&ouml;
5614 .
5615 <pre><code>f&amp;ouml;f&amp;ouml;
5616 </code></pre>
5617 ````````````````````````````````
5618
5619
5620 ## Code spans
5621
5622 A [backtick string](@)
5623 is a string of one or more backtick characters (`` ` ``) that is neither
5624 preceded nor followed by a backtick.
5625
5626 A [code span](@) begins with a backtick string and ends with
5627 a backtick string of equal length.  The contents of the code span are
5628 the characters between the two backtick strings, with leading and
5629 trailing spaces and [line endings] removed, and
5630 [whitespace] collapsed to single spaces.
5631
5632 This is a simple code span:
5633
5634 ```````````````````````````````` example
5635 `foo`
5636 .
5637 <p><code>foo</code></p>
5638 ````````````````````````````````
5639
5640
5641 Here two backticks are used, because the code contains a backtick.
5642 This example also illustrates stripping of leading and trailing spaces:
5643
5644 ```````````````````````````````` example
5645 `` foo ` bar  ``
5646 .
5647 <p><code>foo ` bar</code></p>
5648 ````````````````````````````````
5649
5650
5651 This example shows the motivation for stripping leading and trailing
5652 spaces:
5653
5654 ```````````````````````````````` example
5655 ` `` `
5656 .
5657 <p><code>``</code></p>
5658 ````````````````````````````````
5659
5660
5661 [Line endings] are treated like spaces:
5662
5663 ```````````````````````````````` example
5664 ``
5665 foo
5666 ``
5667 .
5668 <p><code>foo</code></p>
5669 ````````````````````````````````
5670
5671
5672 Interior spaces and [line endings] are collapsed into
5673 single spaces, just as they would be by a browser:
5674
5675 ```````````````````````````````` example
5676 `foo   bar
5677   baz`
5678 .
5679 <p><code>foo bar baz</code></p>
5680 ````````````````````````````````
5681
5682
5683 Not all [Unicode whitespace] (for instance, non-breaking space) is
5684 collapsed, however:
5685
5686 ```````````````````````````````` example
5687 `a  b`
5688 .
5689 <p><code>a  b</code></p>
5690 ````````````````````````````````
5691
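The normalization of code span contents can be sketched in Python. This is one possible reading, not a reference implementation: the name `normalize_code_span` is hypothetical, and the sketch collapses first and strips afterwards. The character class is spelled out explicitly because Python's `str.split()` would also collapse non-breaking spaces.

```python
import re

# Space, tab, newline, line tabulation, form feed, carriage return.
# Non-breaking space (U+00A0) is deliberately not in this class.
_WS_RUN = re.compile(r"[ \t\n\x0b\x0c\r]+")


def normalize_code_span(raw: str) -> str:
    """Collapse whitespace runs to one space, then strip the ends."""
    return _WS_RUN.sub(" ", raw).strip(" ")
```

The argument is the text between the two backtick strings, without the backticks themselves.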
5692
5693 Q: Why not just leave the spaces, since browsers will collapse them
5694 anyway?  A:  Because we might be targeting a non-HTML format, and we
5695 shouldn't rely on HTML-specific rendering assumptions.
5696
5697 (Existing implementations differ in their treatment of internal
5698 spaces and [line endings].  Some, including `Markdown.pl` and
5699 `showdown`, convert an internal [line ending] into a
5700 `<br />` tag.  But this makes things difficult for those who like to
5701 hard-wrap their paragraphs, since a line break in the midst of a code
5702 span will cause an unintended line break in the output.  Others just
5703 leave internal spaces as they are, which is fine if only HTML is being
5704 targeted.)
5705
5706 ```````````````````````````````` example
5707 `foo `` bar`
5708 .
5709 <p><code>foo `` bar</code></p>
5710 ````````````````````````````````
5711
5712
5713 Note that backslash escapes do not work in code spans. All backslashes
5714 are treated literally:
5715
5716 ```````````````````````````````` example
5717 `foo\`bar`
5718 .
5719 <p><code>foo\</code>bar`</p>
5720 ````````````````````````````````
5721
5722
5723 Backslash escapes are never needed, because one can always choose a
5724 string of *n* backtick characters as delimiters, where the code does
5725 not contain any strings of exactly *n* backtick characters.
5726
5727 Code span backticks have higher precedence than any other inline
5728 constructs except HTML tags and autolinks.  Thus, for example, this is
5729 not parsed as emphasized text, since the second `*` is part of a code
5730 span:
5731
5732 ```````````````````````````````` example
5733 *foo`*`
5734 .
5735 <p>*foo<code>*</code></p>
5736 ````````````````````````````````
5737
5738
5739 And this is not parsed as a link:
5740
5741 ```````````````````````````````` example
5742 [not a `link](/foo`)
5743 .
5744 <p>[not a <code>link](/foo</code>)</p>
5745 ````````````````````````````````
5746
5747
5748 Code spans, HTML tags, and autolinks have the same precedence.
5749 Thus, this is code:
5750
5751 ```````````````````````````````` example
5752 `<a href="`">`
5753 .
5754 <p><code>&lt;a href=&quot;</code>&quot;&gt;`</p>
5755 ````````````````````````````````
5756
5757
5758 But this is an HTML tag:
5759
5760 ```````````````````````````````` example
5761 <a href="`">`
5762 .
5763 <p><a href="`">`</p>
5764 ````````````````````````````````
5765
5766
5767 And this is code:
5768
5769 ```````````````````````````````` example
5770 `<http://foo.bar.`baz>`
5771 .
5772 <p><code>&lt;http://foo.bar.</code>baz&gt;`</p>
5773 ````````````````````````````````
5774
5775
5776 But this is an autolink:
5777
5778 ```````````````````````````````` example
5779 <http://foo.bar.`baz>`
5780 .
5781 <p><a href="http://foo.bar.%60baz">http://foo.bar.`baz</a>`</p>
5782 ````````````````````````````````
5783
5784
5785 When a backtick string is not closed by a matching backtick string,
5786 we just have literal backticks:
5787
5788 ```````````````````````````````` example
5789 ```foo``
5790 .
5791 <p>```foo``</p>
5792 ````````````````````````````````
5793
5794
5795 ```````````````````````````````` example
5796 `foo
5797 .
5798 <p>`foo</p>
5799 ````````````````````````````````
5800
5801 The following case also illustrates the need for opening and
5802 closing backtick strings to be equal in length:
5803
5804 ```````````````````````````````` example
5805 `foo``bar``
5806 .
5807 <p>`foo<code>bar</code></p>
5808 ````````````````````````````````
5809
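The matching of opening and closing backtick strings can be sketched in Python. This is a simplified illustration with a hypothetical function name: it ignores the interaction with HTML tags and autolinks described above, does not account for a backslash-escaped backtick before the opener, and assumes `start` does not fall inside a backtick string.

```python
import re
from typing import Optional, Tuple

# Greedy, so every match is a maximal run of backticks.
_BACKTICKS = re.compile(r"`+")


def find_code_span(text: str, start: int = 0) -> Optional[Tuple[int, int, str]]:
    """Return (begin, end, raw_contents) of the first code span, or None."""
    pos = start
    while True:
        opener = _BACKTICKS.search(text, pos)
        if opener is None:
            return None
        length = len(opener.group())
        scan = opener.end()
        while True:
            closer = _BACKTICKS.search(text, scan)
            if closer is None:
                break
            if len(closer.group()) == length:
                contents = text[opener.end():closer.start()]
                return opener.start(), closer.end(), contents
            scan = closer.end()
        # No closer of equal length: the opener is literal backticks.
        pos = opener.end()
```

For `` `foo``bar`` `` the single backtick finds no closer of length one and stays literal, and the search then succeeds with the two-backtick string that follows.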
5810
5811 ## Emphasis and strong emphasis
5812
5813 John Gruber's original [Markdown syntax
5814 description](http://daringfireball.net/projects/markdown/syntax#em) says:
5815
5816 > Markdown treats asterisks (`*`) and underscores (`_`) as indicators of
5817 > emphasis. Text wrapped with one `*` or `_` will be wrapped with an HTML
5818 > `<em>` tag; double `*`'s or `_`'s will be wrapped with an HTML `<strong>`
5819 > tag.
5820
5821 This is enough for most users, but these rules leave much undecided,
5822 especially when it comes to nested emphasis.  The original
5823 `Markdown.pl` test suite makes it clear that triple `***` and
5824 `___` delimiters can be used for strong emphasis, and most
5825 implementations have also allowed the following patterns:
5826
5827 ``` markdown
5828 ***strong emph***
5829 ***strong** in emph*
5830 ***emph* in strong**
5831 **in strong *emph***
5832 *in emph **strong***
5833 ```
5834
5835 The following patterns are less widely supported, but the intent
5836 is clear and they are useful (especially in contexts like bibliography
5837 entries):
5838
5839 ``` markdown
5840 *emph *with emph* in it*
5841 **strong **with strong** in it**
5842 ```
5843
5844 Many implementations have also restricted intraword emphasis to
5845 the `*` forms, to avoid unwanted emphasis in words containing
5846 internal underscores.  (It is best practice to put these in code
5847 spans, but users often do not.)
5848
5849 ``` markdown
5850 internal emphasis: foo*bar*baz
5851 no emphasis: foo_bar_baz
5852 ```
5853
5854 The rules given below capture all of these patterns, while allowing
5855 for efficient parsing strategies that do not backtrack.
5856
5857 First, some definitions.  A [delimiter run](@) is either
5858 a sequence of one or more `*` characters that is not preceded or
5859 followed by a `*` character, or a sequence of one or more `_`
5860 characters that is not preceded or followed by a `_` character.
5861
5862 A [left-flanking delimiter run](@) is
5863 a [delimiter run] that is (a) not followed by [Unicode whitespace],
5864 and (b) not followed by a [punctuation character], or
5865 preceded by [Unicode whitespace] or a [punctuation character].
5866 For purposes of this definition, the beginning and the end of
5867 the line count as Unicode whitespace.
5868
5869 A [right-flanking delimiter run](@) is
5870 a [delimiter run] that is (a) not preceded by [Unicode whitespace],
5871 and (b) not preceded by a [punctuation character], or
5872 followed by [Unicode whitespace] or a [punctuation character].
5873 For purposes of this definition, the beginning and the end of
5874 the line count as Unicode whitespace.
5875
5876 Here are some examples of delimiter runs.
5877
5878   - left-flanking but not right-flanking:
5879
5880     ```
5881     ***abc
5882       _abc
5883     **"abc"
5884      _"abc"
5885     ```
5886
5887   - right-flanking but not left-flanking:
5888
5889     ```
5890      abc***
5891      abc_
5892     "abc"**
5893     "abc"_
5894     ```
5895
5896   - Both left and right-flanking:
5897
5898     ```
5899      abc***def
5900     "abc"_"def"
5901     ```
5902
5903   - Neither left nor right-flanking:
5904
5905     ```
5906     abc *** def
5907     a _ b
5908     ```
5909
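The two definitions translate directly into code. The following Python sketch is an illustration under stated assumptions: the function names are made up, the caller supplies the bounds of the delimiter run, and Unicode whitespace and punctuation are approximated with `unicodedata` general categories (`Zs` plus tab, line feed, form feed, and carriage return for whitespace; ASCII punctuation plus the `P*` categories for punctuation).

```python
import string
import unicodedata
from typing import Tuple


def _is_unicode_whitespace(ch: str) -> bool:
    # An empty string stands for the beginning or end of the line,
    # which counts as Unicode whitespace.
    return ch == "" or ch in "\t\n\x0c\r" or unicodedata.category(ch) == "Zs"


def _is_punctuation(ch: str) -> bool:
    if ch == "":
        return False
    return ch in string.punctuation or unicodedata.category(ch).startswith("P")


def flanking(line: str, start: int, end: int) -> Tuple[bool, bool]:
    """Classify the run line[start:end] as (left_flanking, right_flanking)."""
    before = line[start - 1] if start > 0 else ""
    after = line[end] if end < len(line) else ""
    left = not _is_unicode_whitespace(after) and (
        not _is_punctuation(after)
        or _is_unicode_whitespace(before)
        or _is_punctuation(before)
    )
    right = not _is_unicode_whitespace(before) and (
        not _is_punctuation(before)
        or _is_unicode_whitespace(after)
        or _is_punctuation(after)
    )
    return left, right
```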
5910 (The idea of distinguishing left-flanking and right-flanking
5911 delimiter runs based on the character before and the character
5912 after comes from Roopesh Chander's
5913 [vfmd](http://www.vfmd.org/vfmd-spec/specification/#procedure-for-identifying-emphasis-tags).
5914 vfmd uses the terminology "emphasis indicator string" instead of "delimiter
5915 run," and its rules for distinguishing left- and right-flanking runs
5916 are a bit more complex than the ones given here.)
5917
5918 The following rules define emphasis and strong emphasis:
5919
5920 1.  A single `*` character [can open emphasis](@)
5921     iff (if and only if) it is part of a [left-flanking delimiter run].
5922
5923 2.  A single `_` character [can open emphasis] iff
5924     it is part of a [left-flanking delimiter run]
5925     and either (a) not part of a [right-flanking delimiter run]
5926     or (b) part of a [right-flanking delimiter run]
5927     preceded by punctuation.
5928
5929 3.  A single `*` character [can close emphasis](@)
5930     iff it is part of a [right-flanking delimiter run].
5931
5932 4.  A single `_` character [can close emphasis] iff
5933     it is part of a [right-flanking delimiter run]
5934     and either (a) not part of a [left-flanking delimiter run]
5935     or (b) part of a [left-flanking delimiter run]
5936     followed by punctuation.
5937
5938 5.  A double `**` [can open strong emphasis](@)
5939     iff it is part of a [left-flanking delimiter run].
5940
5941 6.  A double `__` [can open strong emphasis] iff
5942     it is part of a [left-flanking delimiter run]
5943     and either (a) not part of a [right-flanking delimiter run]
5944     or (b) part of a [right-flanking delimiter run]
5945     preceded by punctuation.
5946
5947 7.  A double `**` [can close strong emphasis](@)
5948     iff it is part of a [right-flanking delimiter run].
5949
5950 8.  A double `__` [can close strong emphasis] iff
5951     it is part of a [right-flanking delimiter run]
5952     and either (a) not part of a [left-flanking delimiter run]
5953     or (b) part of a [left-flanking delimiter run]
5954     followed by punctuation.
5955
5956 9.  Emphasis begins with a delimiter that [can open emphasis] and ends
5957     with a delimiter that [can close emphasis], and that uses the same
5958     character (`_` or `*`) as the opening delimiter.  The
5959     opening and closing delimiters must belong to separate
5960     [delimiter runs].  If one of the delimiters can both
5961     open and close emphasis, then the sum of the lengths of the
5962     delimiter runs containing the opening and closing delimiters
5963     must not be a multiple of 3.
5964
5965 10. Strong emphasis begins with a delimiter that
5966     [can open strong emphasis] and ends with a delimiter that
5967     [can close strong emphasis], and that uses the same character
5968     (`_` or `*`) as the opening delimiter.  The
5969     opening and closing delimiters must belong to separate
5970     [delimiter runs].  If one of the delimiters can both open
5971     and close strong emphasis, then the sum of the lengths of
5972     the delimiter runs containing the opening and closing
5973     delimiters must not be a multiple of 3.
5974
5975 11. A literal `*` character cannot occur at the beginning or end of
5976     `*`-delimited emphasis or `**`-delimited strong emphasis, unless it
5977     is backslash-escaped.
5978
5979 12. A literal `_` character cannot occur at the beginning or end of
5980     `_`-delimited emphasis or `__`-delimited strong emphasis, unless it
5981     is backslash-escaped.
5982
5983 Where rules 1--12 above are compatible with multiple parsings,
5984 the following principles resolve ambiguity:
5985
5986 13. The number of nestings should be minimized. Thus, for example,
5987     an interpretation `<strong>...</strong>` is always preferred to
5988     `<em><em>...</em></em>`.
5989
5990 14. An interpretation `<em><strong>...</strong></em>` is always
5991     preferred to `<strong><em>...</em></strong>`.
5992
5993 15. When two potential emphasis or strong emphasis spans overlap,
5994     so that the second begins before the first ends and ends after
5995     the first ends, the first takes precedence. Thus, for example,
5996     `*foo _bar* baz_` is parsed as `<em>foo _bar</em> baz_` rather
5997     than `*foo <em>bar* baz</em>`.
5998
5999 16. When there are two potential emphasis or strong emphasis spans
6000     with the same closing delimiter, the shorter one (the one that
6001     opens later) takes precedence. Thus, for example,
6002     `**foo **bar baz**` is parsed as `**foo <strong>bar baz</strong>`
6003     rather than `<strong>foo **bar baz</strong>`.
6004
6005 17. Inline code spans, links, images, and HTML tags group more tightly
6006     than emphasis.  So, when there is a choice between an interpretation
6007     that contains one of these elements and one that does not, the
6008     former always wins.  Thus, for example, `*[foo*](bar)` is
6009     parsed as `*<a href="bar">foo*</a>` rather than as
6010     `<em>[foo</em>](bar)`.
6011
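Rules 1--8 and the multiple-of-3 condition in rules 9 and 10 reduce to a few boolean tests. The Python sketch below is an illustration only: the function names and parameters are hypothetical, the flanking flags and the punctuation checks are taken as inputs rather than computed, and rules 11--17 are not covered.

```python
def can_open(char: str, left: bool, right: bool, preceded_by_punct: bool) -> bool:
    """Rules 1, 2, 5, 6: may a delimiter in this run open (strong) emphasis?"""
    if char == "*":
        return left
    # For "_", a run that is also right-flanking must follow punctuation.
    return left and (not right or preceded_by_punct)


def can_close(char: str, left: bool, right: bool, followed_by_punct: bool) -> bool:
    """Rules 3, 4, 7, 8: may a delimiter in this run close (strong) emphasis?"""
    if char == "*":
        return right
    # For "_", a run that is also left-flanking must precede punctuation.
    return right and (not left or followed_by_punct)


def may_pair(opener_len: int, closer_len: int,
             opener_both: bool, closer_both: bool) -> bool:
    """Multiple-of-3 condition from rules 9 and 10.

    opener_both / closer_both say whether that delimiter can both
    open and close; the lengths are those of the enclosing runs.
    """
    if opener_both or closer_both:
        return (opener_len + closer_len) % 3 != 0
    return True
```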
6012 These rules can be illustrated through a series of examples.
6013
6014 Rule 1:
6015
6016 ```````````````````````````````` example
6017 *foo bar*
6018 .
6019 <p><em>foo bar</em></p>
6020 ````````````````````````````````
6021
6022
6023 This is not emphasis, because the opening `*` is followed by
6024 whitespace, and hence not part of a [left-flanking delimiter run]:
6025
6026 ```````````````````````````````` example
6027 a * foo bar*
6028 .
6029 <p>a * foo bar*</p>
6030 ````````````````````````````````
6031
6032
6033 This is not emphasis, because the opening `*` is preceded
6034 by an alphanumeric and followed by punctuation, and hence
6035 not part of a [left-flanking delimiter run]:
6036
6037 ```````````````````````````````` example
6038 a*"foo"*
6039 .
6040 <p>a*&quot;foo&quot;*</p>
6041 ````````````````````````````````
6042
6043
6044 Unicode nonbreaking spaces count as whitespace, too:
6045
6046 ```````````````````````````````` example
6047 * a *
6048 .
6049 <p>* a *</p>
6050 ````````````````````````````````
6051
6052
6053 Intraword emphasis with `*` is permitted:
6054
6055 ```````````````````````````````` example
6056 foo*bar*
6057 .
6058 <p>foo<em>bar</em></p>
6059 ````````````````````````````````
6060
6061
6062 ```````````````````````````````` example
6063 5*6*78
6064 .
6065 <p>5<em>6</em>78</p>
6066 ````````````````````````````````
6067
6068
6069 Rule 2:
6070
6071 ```````````````````````````````` example
6072 _foo bar_
6073 .
6074 <p><em>foo bar</em></p>
6075 ````````````````````````````````
6076
6077
6078 This is not emphasis, because the opening `_` is followed by
6079 whitespace:
6080
6081 ```````````````````````````````` example
6082 _ foo bar_
6083 .
6084 <p>_ foo bar_</p>
6085 ````````````````````````````````
6086
6087
6088 This is not emphasis, because the opening `_` is preceded
6089 by an alphanumeric and followed by punctuation:
6090
6091 ```````````````````````````````` example
6092 a_"foo"_
6093 .
6094 <p>a_&quot;foo&quot;_</p>
6095 ````````````````````````````````
6096
6097
6098 Emphasis with `_` is not allowed inside words:
6099
6100 ```````````````````````````````` example
6101 foo_bar_
6102 .
6103 <p>foo_bar_</p>
6104 ````````````````````````````````
6105
6106
6107 ```````````````````````````````` example
6108 5_6_78
6109 .
6110 <p>5_6_78</p>
6111 ````````````````````````````````
6112
6113
6114 ```````````````````````````````` example
6115 пристаням_стремятся_
6116 .
6117 <p>пристаням_стремятся_</p>
6118 ````````````````````````````````
6119
6120
6121 Here `_` does not generate emphasis, because the first delimiter run
6122 is right-flanking and the second left-flanking:
6123
6124 ```````````````````````````````` example
6125 aa_"bb"_cc
6126 .
6127 <p>aa_&quot;bb&quot;_cc</p>
6128 ````````````````````````````````
6129
6130
6131 This is emphasis, even though the opening delimiter is
6132 both left- and right-flanking, because it is preceded by
6133 punctuation:
6134
6135 ```````````````````````````````` example
6136 foo-_(bar)_
6137 .
6138 <p>foo-<em>(bar)</em></p>
6139 ````````````````````````````````
6140
6141
6142 Rule 3:
6143
6144 This is not emphasis, because the closing delimiter does
6145 not match the opening delimiter:
6146
6147 ```````````````````````````````` example
6148 _foo*
6149 .
6150 <p>_foo*</p>
6151 ````````````````````````````````
6152
6153
6154 This is not emphasis, because the closing `*` is preceded by
6155 whitespace:
6156
6157 ```````````````````````````````` example
6158 *foo bar *
6159 .
6160 <p>*foo bar *</p>
6161 ````````````````````````````````
6162
6163
6164 A newline also counts as whitespace:
6165
6166 ```````````````````````````````` example
6167 *foo bar
6168 *
6169 .
6170 <p>*foo bar
6171 *</p>
6172 ````````````````````````````````
6173
6174
6175 This is not emphasis, because the second `*` is
6176 preceded by punctuation and followed by an alphanumeric
6177 (hence it is not part of a [right-flanking delimiter run]):
6178
6179 ```````````````````````````````` example
6180 *(*foo)
6181 .
6182 <p>*(*foo)</p>
6183 ````````````````````````````````
6184
6185
6186 The point of this restriction is more easily appreciated
6187 with this example:
6188
6189 ```````````````````````````````` example
6190 *(*foo*)*
6191 .
6192 <p><em>(<em>foo</em>)</em></p>
6193 ````````````````````````````````
6194
6195
6196 Intraword emphasis with `*` is allowed:
6197
6198 ```````````````````````````````` example
6199 *foo*bar
6200 .
6201 <p><em>foo</em>bar</p>
6202 ````````````````````````````````
6203
6204
6205
6206 Rule 4:
6207
6208 This is not emphasis, because the closing `_` is preceded by
6209 whitespace:
6210
6211 ```````````````````````````````` example
6212 _foo bar _
6213 .
6214 <p>_foo bar _</p>
6215 ````````````````````````````````
6216
6217
6218 This is not emphasis, because the second `_` is
6219 preceded by punctuation and followed by an alphanumeric:
6220
6221 ```````````````````````````````` example
6222 _(_foo)
6223 .
6224 <p>_(_foo)</p>
6225 ````````````````````````````````
6226
6227
6228 This is emphasis within emphasis:
6229
6230 ```````````````````````````````` example
6231 _(_foo_)_
6232 .
6233 <p><em>(<em>foo</em>)</em></p>
6234 ````````````````````````````````
6235
6236
6237 Intraword emphasis is disallowed for `_`:
6238
6239 ```````````````````````````````` example
6240 _foo_bar
6241 .
6242 <p>_foo_bar</p>
6243 ````````````````````````````````
6244
6245
6246 ```````````````````````````````` example
6247 _пристаням_стремятся
6248 .
6249 <p>_пристаням_стремятся</p>
6250 ````````````````````````````````
6251
6252
6253 ```````````````````````````````` example
6254 _foo_bar_baz_
6255 .
6256 <p><em>foo_bar_baz</em></p>
6257 ````````````````````````````````
6258
6259
6260 This is emphasis, even though the closing delimiter is
6261 both left- and right-flanking, because it is followed by
6262 punctuation:
6263
6264 ```````````````````````````````` example
6265 _(bar)_.
6266 .
6267 <p><em>(bar)</em>.</p>
6268 ````````````````````````````````
6269
6270
6271 Rule 5:
6272
6273 ```````````````````````````````` example
6274 **foo bar**
6275 .
6276 <p><strong>foo bar</strong></p>
6277 ````````````````````````````````
6278
6279
6280 This is not strong emphasis, because the opening delimiter is
6281 followed by whitespace:
6282
6283 ```````````````````````````````` example
6284 ** foo bar**
6285 .
6286 <p>** foo bar**</p>
6287 ````````````````````````````````
6288
6289
6290 This is not strong emphasis, because the opening `**` is preceded
6291 by an alphanumeric and followed by punctuation, and hence
6292 not part of a [left-flanking delimiter run]:
6293
6294 ```````````````````````````````` example
6295 a**"foo"**
6296 .
6297 <p>a**&quot;foo&quot;**</p>
6298 ````````````````````````````````
6299
6300
6301 Intraword strong emphasis with `**` is permitted:
6302
6303 ```````````````````````````````` example
6304 foo**bar**
6305 .
6306 <p>foo<strong>bar</strong></p>
6307 ````````````````````````````````
6308
6309
6310 Rule 6:
6311
6312 ```````````````````````````````` example
6313 __foo bar__
6314 .
6315 <p><strong>foo bar</strong></p>
6316 ````````````````````````````````
6317
6318
6319 This is not strong emphasis, because the opening delimiter is
6320 followed by whitespace:
6321
6322 ```````````````````````````````` example
6323 __ foo bar__
6324 .
6325 <p>__ foo bar__</p>
6326 ````````````````````````````````
6327
6328
6329 A newline counts as whitespace:
6330 ```````````````````````````````` example
6331 __
6332 foo bar__
6333 .
6334 <p>__
6335 foo bar__</p>
6336 ````````````````````````````````
6337
6338
6339 This is not strong emphasis, because the opening `__` is preceded
6340 by an alphanumeric and followed by punctuation:
6341
6342 ```````````````````````````````` example
6343 a__"foo"__
6344 .
6345 <p>a__&quot;foo&quot;__</p>
6346 ````````````````````````````````
6347
6348
6349 Intraword strong emphasis is forbidden with `__`:
6350
6351 ```````````````````````````````` example
6352 foo__bar__
6353 .
6354 <p>foo__bar__</p>
6355 ````````````````````````````````
6356
6357
6358 ```````````````````````````````` example
6359 5__6__78
6360 .
6361 <p>5__6__78</p>
6362 ````````````````````````````````
6363
6364
6365 ```````````````````````````````` example
6366 пристаням__стремятся__
6367 .
6368 <p>пристаням__стремятся__</p>
6369 ````````````````````````````````
6370
6371
6372 ```````````````````````````````` example
6373 __foo, __bar__, baz__
6374 .
6375 <p><strong>foo, <strong>bar</strong>, baz</strong></p>
6376 ````````````````````````````````
6377
6378
6379 This is strong emphasis, even though the opening delimiter is
6380 both left- and right-flanking, because it is preceded by
6381 punctuation:
6382
6383 ```````````````````````````````` example
6384 foo-__(bar)__
6385 .
6386 <p>foo-<strong>(bar)</strong></p>
6387 ````````````````````````````````
6388
6389
6390
6391 Rule 7:
6392
6393 This is not strong emphasis, because the closing delimiter is preceded
6394 by whitespace:
6395
6396 ```````````````````````````````` example
6397 **foo bar **
6398 .
6399 <p>**foo bar **</p>
6400 ````````````````````````````````
6401
6402
6403 (Nor can it be interpreted as an emphasized `*foo bar *`, because of
6404 Rule 11.)
6405
6406 This is not strong emphasis, because the second `**` is
6407 preceded by punctuation and followed by an alphanumeric:
6408
6409 ```````````````````````````````` example
6410 **(**foo)
6411 .
6412 <p>**(**foo)</p>
6413 ````````````````````````````````
6414
6415
6416 The point of this restriction is more easily appreciated
6417 with these examples:
6418
6419 ```````````````````````````````` example
6420 *(**foo**)*
6421 .
6422 <p><em>(<strong>foo</strong>)</em></p>
6423 ````````````````````````````````
6424
6425
6426 ```````````````````````````````` example
6427 **Gomphocarpus (*Gomphocarpus physocarpus*, syn.
6428 *Asclepias physocarpa*)**
6429 .
6430 <p><strong>Gomphocarpus (<em>Gomphocarpus physocarpus</em>, syn.
6431 <em>Asclepias physocarpa</em>)</strong></p>
6432 ````````````````````````````````
6433
6434
6435 ```````````````````````````````` example
6436 **foo "*bar*" foo**
6437 .
6438 <p><strong>foo &quot;<em>bar</em>&quot; foo</strong></p>
6439 ````````````````````````````````
6440
6441
6442 Intraword emphasis:
6443
6444 ```````````````````````````````` example
6445 **foo**bar
6446 .
6447 <p><strong>foo</strong>bar</p>
6448 ````````````````````````````````
6449
6450
6451 Rule 8:
6452
6453 This is not strong emphasis, because the closing delimiter is
6454 preceded by whitespace:
6455
6456 ```````````````````````````````` example
6457 __foo bar __
6458 .
6459 <p>__foo bar __</p>
6460 ````````````````````````````````
6461
6462
6463 This is not strong emphasis, because the second `__` is
6464 preceded by punctuation and followed by an alphanumeric:
6465
6466 ```````````````````````````````` example
6467 __(__foo)
6468 .
6469 <p>__(__foo)</p>
6470 ````````````````````````````````
6471
6472
6473 The point of this restriction is more easily appreciated
6474 with this example:
6475
6476 ```````````````````````````````` example
6477 _(__foo__)_
6478 .
6479 <p><em>(<strong>foo</strong>)</em></p>
6480 ````````````````````````````````
6481
6482
6483 Intraword strong emphasis is forbidden with `__`:
6484
6485 ```````````````````````````````` example
6486 __foo__bar
6487 .
6488 <p>__foo__bar</p>
6489 ````````````````````````````````
6490
6491
6492 ```````````````````````````````` example
6493 __пристаням__стремятся
6494 .
6495 <p>__пристаням__стремятся</p>
6496 ````````````````````````````````
6497
6498
6499 ```````````````````````````````` example
6500 __foo__bar__baz__
6501 .
6502 <p><strong>foo__bar__baz</strong></p>
6503 ````````````````````````````````
6504
6505
6506 This is strong emphasis, even though the closing delimiter is
6507 both left- and right-flanking, because it is followed by
6508 punctuation:
6509
6510 ```````````````````````````````` example
6511 __(bar)__.
6512 .
6513 <p><strong>(bar)</strong>.</p>
6514 ````````````````````````````````
6515
6516
6517 Rule 9:
6518
6519 Any nonempty sequence of inline elements can be the contents of an
6520 emphasized span.
6521
6522 ```````````````````````````````` example
6523 *foo [bar](/url)*
6524 .
6525 <p><em>foo <a href="/url">bar</a></em></p>
6526 ````````````````````````````````
6527
6528
6529 ```````````````````````````````` example
6530 *foo
6531 bar*
6532 .
6533 <p><em>foo
6534 bar</em></p>
6535 ````````````````````````````````
6536
6537
6538 In particular, emphasis and strong emphasis can be nested
6539 inside emphasis:
6540
6541 ```````````````````````````````` example
6542 _foo __bar__ baz_
6543 .
6544 <p><em>foo <strong>bar</strong> baz</em></p>
6545 ````````````````````````````````
6546
6547
6548 ```````````````````````````````` example
6549 _foo _bar_ baz_
6550 .
6551 <p><em>foo <em>bar</em> baz</em></p>
6552 ````````````````````````````````
6553
6554
6555 ```````````````````````````````` example
6556 __foo_ bar_
6557 .
6558 <p><em><em>foo</em> bar</em></p>
6559 ````````````````````````````````
6560
6561
6562 ```````````````````````````````` example
6563 *foo *bar**
6564 .
6565 <p><em>foo <em>bar</em></em></p>
6566 ````````````````````````````````
6567
6568
6569 ```````````````````````````````` example
6570 *foo **bar** baz*
6571 .
6572 <p><em>foo <strong>bar</strong> baz</em></p>
6573 ````````````````````````````````
6574
6575 ```````````````````````````````` example
6576 *foo**bar**baz*
6577 .
6578 <p><em>foo<strong>bar</strong>baz</em></p>
6579 ````````````````````````````````
6580
6581 Note that in the preceding case, the interpretation
6582
6583 ``` markdown
6584 <p><em>foo</em><em>bar<em></em>baz</em></p>
6585 ```
6586
6587
6588 is precluded by the condition that a delimiter that
6589 can both open and close (like the `*` after `foo`)
6590 cannot form emphasis if the sum of the lengths of
6591 the delimiter runs containing the opening and
6592 closing delimiters is a multiple of 3.
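
As a rough illustration of this condition, the following Python
sketch tests whether an opening and a closing delimiter run may pair.
The name `can_pair` and its parameters are hypothetical and not part
of this spec.  The sketch assumes the caller has already established
that the first run can open emphasis and the second can close it.

```python
def can_pair(opener_len: int, closer_len: int,
             opener_can_close: bool, closer_can_open: bool) -> bool:
    """Return False when the multiple-of-3 condition forbids a match.

    opener_len / closer_len are the lengths of the delimiter runs that
    contain the opening and closing delimiters.  The two flags say
    whether either run can *both* open and close emphasis.
    """
    either_is_both = opener_can_close or closer_can_open
    return not (either_is_both and (opener_len + closer_len) % 3 == 0)


# Delimiter runs in `*foo**bar**baz*`:
#   `*`  (length 1, can only open)
#   `**` after foo (length 2, can open and close)
#   `**` after bar (length 2, can open and close)
#   `*`  at the end (length 1, can only close)
print(can_pair(1, 2, False, True))   # leading `*` with `**` after foo
print(can_pair(2, 2, True, True))    # the two `**` runs
print(can_pair(1, 1, False, False))  # leading `*` with trailing `*`
```

Under these assumptions the first pairing is rejected (1 + 2 is a
multiple of 3), while the other two are allowed, which is consistent
with the expected output of the preceding example.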
6593
6594 The same condition ensures that the following
6595 cases are all strong emphasis nested inside
6596 emphasis, even when the interior spaces are
6597 omitted:
6598
6599
6600 ```````````````````````````````` example
6601 ***foo** bar*
6602 .
6603 <p><em><strong>foo</strong> bar</em></p>
6604 ````````````````````````````````
6605
6606
6607 ```````````````````````````````` example
6608 *foo **bar***
6609 .
6610 <p><em>foo <strong>bar</strong></em></p>
6611 ````````````````````````````````
6612
6613
6614 ```````````````````````````````` example
6615 *foo**bar***
6616 .
6617 <p><em>foo<strong>bar</strong></em></p>
6618 ````````````````````````````````
6619
6620
6621 Indefinite levels of nesting are possible:
6622
6623 ```````````````````````````````` example
6624 *foo **bar *baz* bim** bop*
6625 .
6626 <p><em>foo <strong>bar <em>baz</em> bim</strong> bop</em></p>
6627 ````````````````````````````````
6628
6629
6630 ```````````````````````````````` example
6631 *foo [*bar*](/url)*
6632 .
6633 <p><em>foo <a href="/url"><em>bar</em></a></em></p>
6634 ````````````````````````````````
6635
6636
6637 There can be no empty emphasis or strong emphasis:
6638
6639 ```````````````````````````````` example
6640 ** is not an empty emphasis
6641 .
6642 <p>** is not an empty emphasis</p>
6643 ````````````````````````````````
6644
6645
6646 ```````````````````````````````` example
6647 **** is not an empty strong emphasis
6648 .
6649 <p>**** is not an empty strong emphasis</p>
6650 ````````````````````````````````
6651
6652
6653
6654 Rule 10:
6655
6656 Any nonempty sequence of inline elements can be the contents of a
6657 strongly emphasized span.
6658
6659 ```````````````````````````````` example
6660 **foo [bar](/url)**
6661 .
6662 <p><strong>foo <a href="/url">bar</a></strong></p>
6663 ````````````````````````````````
6664
6665
6666 ```````````````````````````````` example
6667 **foo
6668 bar**
6669 .
6670 <p><strong>foo
6671 bar</strong></p>
6672 ````````````````````````````````
6673
6674
6675 In particular, emphasis and strong emphasis can be nested
6676 inside strong emphasis:
6677
6678 ```````````````````````````````` example
6679 __foo _bar_ baz__
6680 .
6681 <p><strong>foo <em>bar</em> baz</strong></p>
6682 ````````````````````````````````
6683
6684
6685 ```````````````````````````````` example
6686 __foo __bar__ baz__
6687 .
6688 <p><strong>foo <strong>bar</strong> baz</strong></p>
6689 ````````````````````````````````
6690
6691
6692 ```````````````````````````````` example
6693 ____foo__ bar__
6694 .
6695 <p><strong><strong>foo</strong> bar</strong></p>
6696 ````````````````````````````````
6697
6698
6699 ```````````````````````````````` example
6700 **foo **bar****
6701 .
6702 <p><strong>foo <strong>bar</strong></strong></p>
6703 ````````````````````````````````
6704
6705
6706 ```````````````````````````````` example
6707 **foo *bar* baz**
6708 .
6709 <p><strong>foo <em>bar</em> baz</strong></p>
6710 ````````````````````````````````
6711
6712
6713 ```````````````````````````````` example
6714 **foo*bar*baz**
6715 .
6716 <p><strong>foo<em>bar</em>baz</strong></p>
6717 ````````````````````````````````
6718
6719
6720 ```````````````````````````````` example
6721 ***foo* bar**
6722 .
6723 <p><strong><em>foo</em> bar</strong></p>
6724 ````````````````````````````````
6725
6726
6727 ```````````````````````````````` example
6728 **foo *bar***
6729 .
6730 <p><strong>foo <em>bar</em></strong></p>
6731 ````````````````````````````````
6732
6733
6734 Indefinite levels of nesting are possible:
6735
6736 ```````````````````````````````` example
6737 **foo *bar **baz**
6738 bim* bop**
6739 .
6740 <p><strong>foo <em>bar <strong>baz</strong>
6741 bim</em> bop</strong></p>
6742 ````````````````````````````````
6743
6744
6745 ```````````````````````````````` example
6746 **foo [*bar*](/url)**
6747 .
6748 <p><strong>foo <a href="/url"><em>bar</em></a></strong></p>
6749 ````````````````````````````````
6750
6751
6752 There can be no empty emphasis or strong emphasis:
6753
6754 ```````````````````````````````` example
6755 __ is not an empty emphasis
6756 .
6757 <p>__ is not an empty emphasis</p>
6758 ````````````````````````````````
6759
6760
6761 ```````````````````````````````` example
6762 ____ is not an empty strong emphasis
6763 .
6764 <p>____ is not an empty strong emphasis</p>
6765 ````````````````````````````````
6766
6767
6768
6769 Rule 11:
6770
6771 ```````````````````````````````` example
6772 foo ***
6773 .
6774 <p>foo ***</p>
6775 ````````````````````````````````
6776
6777
6778 ```````````````````````````````` example
6779 foo *\**
6780 .
6781 <p>foo <em>*</em></p>
6782 ````````````````````````````````
6783
6784
6785 ```````````````````````````````` example
6786 foo *_*
6787 .
6788 <p>foo <em>_</em></p>
6789 ````````````````````````````````
6790
6791
6792 ```````````````````````````````` example
6793 foo *****
6794 .
6795 <p>foo *****</p>
6796 ````````````````````````````````
6797
6798
6799 ```````````````````````````````` example
6800 foo **\***
6801 .
6802 <p>foo <strong>*</strong></p>
6803 ````````````````````````````````
6804
6805
6806 ```````````````````````````````` example
6807 foo **_**
6808 .
6809 <p>foo <strong>_</strong></p>
6810 ````````````````````````````````
6811
6812
6813 Note that when delimiters do not match evenly, Rule 11 determines
6814 that the excess literal `*` characters will appear outside of the
6815 emphasis, rather than inside it:
6816
6817 ```````````````````````````````` example
6818 **foo*
6819 .
6820 <p>*<em>foo</em></p>
6821 ````````````````````````````````
6822
6823
6824 ```````````````````````````````` example
6825 *foo**
6826 .
6827 <p><em>foo</em>*</p>
6828 ````````````````````````````````
6829
6830
6831 ```````````````````````````````` example
6832 ***foo**
6833 .
6834 <p>*<strong>foo</strong></p>
6835 ````````````````````````````````
6836
6837
6838 ```````````````````````````````` example
6839 ****foo*
6840 .
6841 <p>***<em>foo</em></p>
6842 ````````````````````````````````
6843
6844
6845 ```````````````````````````````` example
6846 **foo***
6847 .
6848 <p><strong>foo</strong>*</p>
6849 ````````````````````````````````
6850
6851
6852 ```````````````````````````````` example
6853 *foo****
6854 .
6855 <p><em>foo</em>***</p>
6856 ````````````````````````````````
6857
6858
6859
6860 Rule 12:
6861
6862 ```````````````````````````````` example
6863 foo ___
6864 .
6865 <p>foo ___</p>
6866 ````````````````````````````````
6867
6868
6869 ```````````````````````````````` example
6870 foo _\__
6871 .
6872 <p>foo <em>_</em></p>
6873 ````````````````````````````````
6874
6875
6876 ```````````````````````````````` example
6877 foo _*_
6878 .
6879 <p>foo <em>*</em></p>
6880 ````````````````````````````````
6881
6882
6883 ```````````````````````````````` example
6884 foo _____
6885 .
6886 <p>foo _____</p>
6887 ````````````````````````````````
6888
6889
6890 ```````````````````````````````` example
6891 foo __\___
6892 .
6893 <p>foo <strong>_</strong></p>
6894 ````````````````````````````````
6895
6896
6897 ```````````````````````````````` example
6898 foo __*__
6899 .
6900 <p>foo <strong>*</strong></p>
6901 ````````````````````````````````
6902
6903
6904 ```````````````````````````````` example
6905 __foo_
6906 .
6907 <p>_<em>foo</em></p>
6908 ````````````````````````````````
6909
6910
6911 Note that when delimiters do not match evenly, Rule 12 determines
6912 that the excess literal `_` characters will appear outside of the
6913 emphasis, rather than inside it:
6914
6915 ```````````````````````````````` example
6916 _foo__
6917 .
6918 <p><em>foo</em>_</p>
6919 ````````````````````````````````
6920
6921
6922 ```````````````````````````````` example
6923 ___foo__
6924 .
6925 <p>_<strong>foo</strong></p>
6926 ````````````````````````````````
6927
6928
6929 ```````````````````````````````` example
6930 ____foo_
6931 .
6932 <p>___<em>foo</em></p>
6933 ````````````````````````````````
6934
6935
6936 ```````````````````````````````` example
6937 __foo___
6938 .
6939 <p><strong>foo</strong>_</p>
6940 ````````````````````````````````
6941
6942
6943 ```````````````````````````````` example
6944 _foo____
6945 .
6946 <p><em>foo</em>___</p>
6947 ````````````````````````````````
6948
6949
6950 Rule 13 implies that if you want emphasis nested directly inside
6951 emphasis, you must use different delimiters:
6952
6953 ```````````````````````````````` example
6954 **foo**
6955 .
6956 <p><strong>foo</strong></p>
6957 ````````````````````````````````
6958
6959
6960 ```````````````````````````````` example
6961 *_foo_*
6962 .
6963 <p><em><em>foo</em></em></p>
6964 ````````````````````````````````
6965
6966
6967 ```````````````````````````````` example
6968 __foo__
6969 .
6970 <p><strong>foo</strong></p>
6971 ````````````````````````````````
6972
6973
6974 ```````````````````````````````` example
6975 _*foo*_
6976 .
6977 <p><em><em>foo</em></em></p>
6978 ````````````````````````````````
6979
6980
6981 However, strong emphasis within strong emphasis is possible without
6982 switching delimiters:
6983
6984 ```````````````````````````````` example
6985 ****foo****
6986 .
6987 <p><strong><strong>foo</strong></strong></p>
6988 ````````````````````````````````
6989
6990
6991 ```````````````````````````````` example
6992 ____foo____
6993 .
6994 <p><strong><strong>foo</strong></strong></p>
6995 ````````````````````````````````
6996
6997
6998
6999 Rule 13 can be applied to arbitrarily long sequences of
7000 delimiters:
7001
7002 ```````````````````````````````` example
7003 ******foo******
7004 .
7005 <p><strong><strong><strong>foo</strong></strong></strong></p>
7006 ````````````````````````````````
7007
7008
7009 Rule 14:
7010
7011 ```````````````````````````````` example
7012 ***foo***
7013 .
7014 <p><em><strong>foo</strong></em></p>
7015 ````````````````````````````````
7016
7017
7018 ```````````````````````````````` example
7019 _____foo_____
7020 .
7021 <p><em><strong><strong>foo</strong></strong></em></p>
7022 ````````````````````````````````
7023
7024
7025 Rule 15:
7026
7027 ```````````````````````````````` example
7028 *foo _bar* baz_
7029 .
7030 <p><em>foo _bar</em> baz_</p>
7031 ````````````````````````````````
7032
7033
7034 ```````````````````````````````` example
7035 *foo __bar *baz bim__ bam*
7036 .
7037 <p><em>foo <strong>bar *baz bim</strong> bam</em></p>
7038 ````````````````````````````````
7039
7040
7041 Rule 16:
7042
7043 ```````````````````````````````` example
7044 **foo **bar baz**
7045 .
7046 <p>**foo <strong>bar baz</strong></p>
7047 ````````````````````````````````
7048
7049
7050 ```````````````````````````````` example
7051 *foo *bar baz*
7052 .
7053 <p>*foo <em>bar baz</em></p>
7054 ````````````````````````````````
7055
7056
7057 Rule 17:
7058
7059 ```````````````````````````````` example
7060 *[bar*](/url)
7061 .
7062 <p>*<a href="/url">bar*</a></p>
7063 ````````````````````````````````
7064
7065
7066 ```````````````````````````````` example
7067 _foo [bar_](/url)
7068 .
7069 <p>_foo <a href="/url">bar_</a></p>
7070 ````````````````````````````````
7071
7072
7073 ```````````````````````````````` example
7074 *<img src="foo" title="*"/>
7075 .
7076 <p>*<img src="foo" title="*"/></p>
7077 ````````````````````````````````
7078
7079
7080 ```````````````````````````````` example
7081 **<a href="**">
7082 .
7083 <p>**<a href="**"></p>
7084 ````````````````````````````````
7085
7086
7087 ```````````````````````````````` example
7088 __<a href="__">
7089 .
7090 <p>__<a href="__"></p>
7091 ````````````````````````````````
7092
7093
7094 ```````````````````````````````` example
7095 *a `*`*
7096 .
7097 <p><em>a <code>*</code></em></p>
7098 ````````````````````````````````
7099
7100
7101 ```````````````````````````````` example
7102 _a `_`_
7103 .
7104 <p><em>a <code>_</code></em></p>
7105 ````````````````````````````````
7106
7107
7108 ```````````````````````````````` example
7109 **a<http://foo.bar/?q=**>
7110 .
7111 <p>**a<a href="http://foo.bar/?q=**">http://foo.bar/?q=**</a></p>
7112 ````````````````````````````````
7113
7114
7115 ```````````````````````````````` example
7116 __a<http://foo.bar/?q=__>
7117 .
7118 <p>__a<a href="http://foo.bar/?q=__">http://foo.bar/?q=__</a></p>
7119 ````````````````````````````````
7120
7121
7122
7123 ## Links
7124
7125 A link contains [link text] (the visible text), a [link destination]
7126 (the URI that is the link destination), and optionally a [link title].
7127 There are two basic kinds of links in Markdown.  In [inline links] the
7128 destination and title are given immediately after the link text.  In
7129 [reference links] the destination and title are defined elsewhere in
7130 the document.
7131
7132 A [link text](@) consists of a sequence of zero or more
7133 inline elements enclosed by square brackets (`[` and `]`).  The
7134 following rules apply:
7135
7136 - Links may not contain other links, at any level of nesting. If
7137   multiple otherwise valid link definitions appear nested inside each
7138   other, the inner-most definition is used.
7139
7140 - Brackets are allowed in the [link text] only if (a) they
7141   are backslash-escaped or (b) they appear as a matched pair of brackets,
7142   with an open bracket `[`, a sequence of zero or more inlines, and
7143   a close bracket `]`.
7144
7145 - Backtick [code spans], [autolinks], and raw [HTML tags] bind more tightly
7146   than the brackets in link text.  Thus, for example,
7147   `` [foo`]` `` could not be a link text, since the second `]`
7148   is part of a code span.
7149
7150 - The brackets in link text bind more tightly than markers for
7151   [emphasis and strong emphasis]. Thus, for example, `*[foo*](url)` is a link.
7152
7153 A [link destination](@) consists of either
7154
7155 - a sequence of zero or more characters between an opening `<` and a
7156   closing `>` that contains no spaces, line breaks, or unescaped
7157   `<` or `>` characters, or
7158
7159 - a nonempty sequence of characters that does not include
7160   ASCII space or control characters, and includes parentheses
7161   only if (a) they are backslash-escaped or (b) they are part of
7162   a balanced pair of unescaped parentheses.
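
The second form above can be checked with a single pass that tracks
parenthesis depth.  The Python sketch below is only an illustration:
the function name `is_bare_destination` is hypothetical, the
`<...>` form is not handled, and it assumes that a backslash escapes
only ASCII punctuation.

```python
import string


def is_bare_destination(dest: str) -> bool:
    """Check the non-bracketed form of a link destination."""
    if not dest:
        return False  # this form must be nonempty
    depth = 0
    i = 0
    while i < len(dest):
        ch = dest[i]
        # ASCII space and control characters are not allowed.
        if ord(ch) <= 0x20 or ord(ch) == 0x7F:
            return False
        # A backslash followed by ASCII punctuation is an escape;
        # skip both characters so an escaped parenthesis is not counted.
        if ch == "\\" and i + 1 < len(dest) and dest[i + 1] in string.punctuation:
            i += 2
            continue
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False  # closing parenthesis with no opener
        i += 1
    return depth == 0  # every opener must have been closed


print(is_bare_destination("foo(and(bar))"))
print(is_bare_destination("foo(and(bar)"))
print(is_bare_destination("/my uri"))
```

A depth counter is enough here because only balance matters, not
which opening parenthesis matches which closing one.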
7163
7164 A [link title](@)  consists of either
7165
7166 - a sequence of zero or more characters between straight double-quote
7167   characters (`"`), including a `"` character only if it is
7168   backslash-escaped, or
7169
7170 - a sequence of zero or more characters between straight single-quote
7171   characters (`'`), including a `'` character only if it is
7172   backslash-escaped, or
7173
7174 - a sequence of zero or more characters between matching parentheses
7175   (`(...)`), including a `)` character only if it is backslash-escaped.
7176
7177 Although [link titles] may span multiple lines, they may not contain
7178 a [blank line].
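
The three title forms and the blank-line restriction can be sketched
as a small validator.  This is a hedged illustration rather than a
reference implementation: `is_link_title` and `CLOSERS` are
hypothetical names, the input is assumed to be the raw title
including its delimiters, and line endings are assumed to be `\n`.

```python
import re

# Opening delimiter -> required closing delimiter.
CLOSERS = {'"': '"', "'": "'", "(": ")"}


def is_link_title(raw: str) -> bool:
    """Check a raw title, delimiters included, against the three forms."""
    if len(raw) < 2 or raw[0] not in CLOSERS:
        return False
    closer = CLOSERS[raw[0]]
    if raw[-1] != closer:
        return False
    # A title may span lines but may not contain a blank line.
    if re.search(r"\n[ \t]*\n", raw):
        return False
    inner = raw[1:-1]
    i = 0
    while i < len(inner):
        if inner[i] == "\\":
            if i + 1 == len(inner):
                return False  # the backslash would escape the final delimiter
            i += 2            # skip the backslash and the character after it
            continue
        if inner[i] == closer:
            return False      # unescaped closing delimiter inside the title
        i += 1
    return True


print(is_link_title('"title"'))
print(is_link_title("'title \"and\" title'"))
print(is_link_title('"title "and" title"'))
```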
7179
7180 An [inline link](@) consists of a [link text] followed immediately
7181 by a left parenthesis `(`, optional [whitespace], an optional
7182 [link destination], an optional [link title] separated from the link
7183 destination by [whitespace], optional [whitespace], and a right
7184 parenthesis `)`. The link's text consists of the inlines contained
7185 in the [link text] (excluding the enclosing square brackets).
7186 The link's URI consists of the link destination, excluding enclosing
7187 `<...>` if present, with backslash-escapes in effect as described
7188 above.  The link's title consists of the link title, excluding its
7189 enclosing delimiters, with backslash-escapes in effect as described
7190 above.
7191
7192 Here is a simple inline link:
7193
7194 ```````````````````````````````` example
7195 [link](/uri "title")
7196 .
7197 <p><a href="/uri" title="title">link</a></p>
7198 ````````````````````````````````
7199
7200
7201 The title may be omitted:
7202
7203 ```````````````````````````````` example
7204 [link](/uri)
7205 .
7206 <p><a href="/uri">link</a></p>
7207 ````````````````````````````````
7208
7209
7210 Both the title and the destination may be omitted:
7211
7212 ```````````````````````````````` example
7213 [link]()
7214 .
7215 <p><a href="">link</a></p>
7216 ````````````````````````````````
7217
7218
7219 ```````````````````````````````` example
7220 [link](<>)
7221 .
7222 <p><a href="">link</a></p>
7223 ````````````````````````````````
7224
7225
7226 The destination cannot contain spaces or line breaks,
7227 even if enclosed in pointy brackets:
7228
7229 ```````````````````````````````` example
7230 [link](/my uri)
7231 .
7232 <p>[link](/my uri)</p>
7233 ````````````````````````````````
7234
7235
7236 ```````````````````````````````` example
7237 [link](</my uri>)
7238 .
7239 <p>[link](&lt;/my uri&gt;)</p>
7240 ````````````````````````````````
7241
7242
7243 ```````````````````````````````` example
7244 [link](foo
7245 bar)
7246 .
7247 <p>[link](foo
7248 bar)</p>
7249 ````````````````````````````````
7250
7251
7252 ```````````````````````````````` example
7253 [link](<foo
7254 bar>)
7255 .
7256 <p>[link](<foo
7257 bar>)</p>
7258 ````````````````````````````````
7259
7260 Parentheses inside the link destination may be escaped:
7261
7262 ```````````````````````````````` example
7263 [link](\(foo\))
7264 .
7265 <p><a href="(foo)">link</a></p>
7266 ````````````````````````````````
7267
7268 Any number of parentheses are allowed without escaping, as long as they are
7269 balanced:
7270
7271 ```````````````````````````````` example
7272 [link](foo(and(bar)))
7273 .
7274 <p><a href="foo(and(bar))">link</a></p>
7275 ````````````````````````````````
7276
7277 However, if you have unbalanced parentheses, you need to escape them or use the
7278 `<...>` form:
7279
7280 ```````````````````````````````` example
7281 [link](foo\(and\(bar\))
7282 .
7283 <p><a href="foo(and(bar)">link</a></p>
7284 ````````````````````````````````
7285
7286
7287 ```````````````````````````````` example
7288 [link](<foo(and(bar)>)
7289 .
7290 <p><a href="foo(and(bar)">link</a></p>
7291 ````````````````````````````````
7292
7293
7294 Parentheses and other symbols can also be escaped, as usual
7295 in Markdown:
7296
7297 ```````````````````````````````` example
7298 [link](foo\)\:)
7299 .
7300 <p><a href="foo):">link</a></p>
7301 ````````````````````````````````
7302
7303
7304 A link can contain fragment identifiers and queries:
7305
7306 ```````````````````````````````` example
7307 [link](#fragment)
7308
7309 [link](http://example.com#fragment)
7310
7311 [link](http://example.com?foo=3#frag)
7312 .
7313 <p><a href="#fragment">link</a></p>
7314 <p><a href="http://example.com#fragment">link</a></p>
7315 <p><a href="http://example.com?foo=3#frag">link</a></p>
7316 ````````````````````````````````
7317
7318
7319 Note that a backslash before a non-escapable character is
7320 just a backslash:
7321
7322 ```````````````````````````````` example
7323 [link](foo\bar)
7324 .
7325 <p><a href="foo%5Cbar">link</a></p>
7326 ````````````````````````````````
7327
7328
7329 URL-escaping should be left alone inside the destination, as all
7330 URL-escaped characters are also valid URL characters. Entity and
7331 numerical character references in the destination will be parsed
7332 into the corresponding Unicode code points, as usual.  These may
7333 be optionally URL-escaped when written as HTML, but this spec
7334 does not enforce any particular policy for rendering URLs in
7335 HTML or other formats.  Renderers may make different decisions
7336 about how to escape or normalize URLs in the output.
7337
7338 ```````````````````````````````` example
7339 [link](foo%20b&auml;)
7340 .
7341 <p><a href="foo%20b%C3%A4">link</a></p>
7342 ````````````````````````````````
7343
7344
7345 Note that, because titles can often be parsed as destinations,
7346 if you try to omit the destination and keep the title, you'll
7347 get unexpected results:
7348
7349 ```````````````````````````````` example
7350 [link]("title")
7351 .
7352 <p><a href="%22title%22">link</a></p>
7353 ````````````````````````````````
7354
7355
7356 Titles may be in single quotes, double quotes, or parentheses:
7357
7358 ```````````````````````````````` example
7359 [link](/url "title")
7360 [link](/url 'title')
7361 [link](/url (title))
7362 .
7363 <p><a href="/url" title="title">link</a>
7364 <a href="/url" title="title">link</a>
7365 <a href="/url" title="title">link</a></p>
7366 ````````````````````````````````
7367
7368
7369 Backslash escapes and entity and numeric character references
7370 may be used in titles:
7371
7372 ```````````````````````````````` example
7373 [link](/url "title \"&quot;")
7374 .
7375 <p><a href="/url" title="title &quot;&quot;">link</a></p>
7376 ````````````````````````````````
7377
7378
7379 Titles must be separated from the link destination by [whitespace].
7380 Other [Unicode whitespace], such as a non-breaking space, does not work:
7381
7382 ```````````````````````````````` example
7383 [link](/url "title")
7384 .
7385 <p><a href="/url%C2%A0%22title%22">link</a></p>
7386 ````````````````````````````````
7387
7388
7389 Nested balanced quotes are not allowed without escaping:
7390
7391 ```````````````````````````````` example
7392 [link](/url "title "and" title")
7393 .
7394 <p>[link](/url &quot;title &quot;and&quot; title&quot;)</p>
7395 ````````````````````````````````
7396
7397
7398 But it is easy to work around this by using a different quote type:
7399
7400 ```````````````````````````````` example
7401 [link](/url 'title "and" title')
7402 .
7403 <p><a href="/url" title="title &quot;and&quot; title">link</a></p>
7404 ````````````````````````````````
7405
7406
7407 (Note:  `Markdown.pl` did allow double quotes inside a double-quoted
7408 title, and its test suite included a test demonstrating this.
7409 But it is hard to see a good rationale for the extra complexity this
7410 brings, since there are already many ways---backslash escaping,
7411 entity and numeric character references, or using a different
7412 quote type for the enclosing title---to write titles containing
7413 double quotes.  `Markdown.pl`'s handling of titles has a number
7414 of other strange features.  For example, it allows single-quoted
7415 titles in inline links, but not reference links.  And, in
7416 reference links but not inline links, it allows a title to begin
7417 with `"` and end with `)`.  `Markdown.pl` 1.0.1 even allows
7418 titles with no closing quotation mark, though 1.0.2b8 does not.
7419 It seems preferable to adopt a simple, rational rule that works
7420 the same way in inline links and link reference definitions.)
7421
7422 [Whitespace] is allowed around the destination and title:
7423
7424 ```````````````````````````````` example
7425 [link](   /uri
7426   "title"  )
7427 .
7428 <p><a href="/uri" title="title">link</a></p>
7429 ````````````````````````````````
7430
7431
7432 But it is not allowed between the link text and the
7433 following parenthesis:
7434
7435 ```````````````````````````````` example
7436 [link] (/uri)
7437 .
7438 <p>[link] (/uri)</p>
7439 ````````````````````````````````
7440
7441
7442 The link text may contain balanced brackets, but not unbalanced ones,
7443 unless they are escaped:
7444
7445 ```````````````````````````````` example
7446 [link [foo [bar]]](/uri)
7447 .
7448 <p><a href="/uri">link [foo [bar]]</a></p>
7449 ````````````````````````````````
7450
7451
7452 ```````````````````````````````` example
7453 [link] bar](/uri)
7454 .
7455 <p>[link] bar](/uri)</p>
7456 ````````````````````````````````
7457
7458
7459 ```````````````````````````````` example
7460 [link [bar](/uri)
7461 .
7462 <p>[link <a href="/uri">bar</a></p>
7463 ````````````````````````````````
7464
7465
7466 ```````````````````````````````` example
7467 [link \[bar](/uri)
7468 .
7469 <p><a href="/uri">link [bar</a></p>
7470 ````````````````````````````````
7471
7472
7473 The link text may contain inline content:
7474
7475 ```````````````````````````````` example
7476 [link *foo **bar** `#`*](/uri)
7477 .
7478 <p><a href="/uri">link <em>foo <strong>bar</strong> <code>#</code></em></a></p>
7479 ````````````````````````````````
7480
7481
7482 ```````````````````````````````` example
7483 [![moon](moon.jpg)](/uri)
7484 .
7485 <p><a href="/uri"><img src="moon.jpg" alt="moon" /></a></p>
7486 ````````````````````````````````
7487
7488
7489 However, links may not contain other links, at any level of nesting.
7490
7491 ```````````````````````````````` example
7492 [foo [bar](/uri)](/uri)
7493 .
7494 <p>[foo <a href="/uri">bar</a>](/uri)</p>
7495 ````````````````````````````````
7496
7497
7498 ```````````````````````````````` example
7499 [foo *[bar [baz](/uri)](/uri)*](/uri)
7500 .
7501 <p>[foo <em>[bar <a href="/uri">baz</a>](/uri)</em>](/uri)</p>
7502 ````````````````````````````````
7503
7504
7505 ```````````````````````````````` example
7506 ![[[foo](uri1)](uri2)](uri3)
7507 .
7508 <p><img src="uri3" alt="[foo](uri2)" /></p>
7509 ````````````````````````````````
7510
7511
7512 These cases illustrate the precedence of link text grouping over
7513 emphasis grouping:
7514
7515 ```````````````````````````````` example
7516 *[foo*](/uri)
7517 .
7518 <p>*<a href="/uri">foo*</a></p>
7519 ````````````````````````````````
7520
7521
7522 ```````````````````````````````` example
7523 [foo *bar](baz*)
7524 .
7525 <p><a href="baz*">foo *bar</a></p>
7526 ````````````````````````````````
7527
7528
7529 Note that brackets that *aren't* part of links do not take
7530 precedence:
7531
7532 ```````````````````````````````` example
7533 *foo [bar* baz]
7534 .
7535 <p><em>foo [bar</em> baz]</p>
7536 ````````````````````````````````
7537
7538
7539 These cases illustrate the precedence of HTML tags, code spans,
7540 and autolinks over link grouping:
7541
7542 ```````````````````````````````` example
7543 [foo <bar attr="](baz)">
7544 .
7545 <p>[foo <bar attr="](baz)"></p>
7546 ````````````````````````````````
7547
7548
7549 ```````````````````````````````` example
7550 [foo`](/uri)`
7551 .
7552 <p>[foo<code>](/uri)</code></p>
7553 ````````````````````````````````
7554
7555
7556 ```````````````````````````````` example
7557 [foo<http://example.com/?search=](uri)>
7558 .
7559 <p>[foo<a href="http://example.com/?search=%5D(uri)">http://example.com/?search=](uri)</a></p>
7560 ````````````````````````````````
7561
7562
7563 There are three kinds of [reference link](@)s:
7564 [full](#full-reference-link), [collapsed](#collapsed-reference-link),
7565 and [shortcut](#shortcut-reference-link).
7566
7567 A [full reference link](@)
7568 consists of a [link text] immediately followed by a [link label]
7569 that [matches] a [link reference definition] elsewhere in the document.
7570
7571 A [link label](@)  begins with a left bracket (`[`) and ends
7572 with the first right bracket (`]`) that is not backslash-escaped.
7573 Between these brackets there must be at least one [non-whitespace character].
7574 Unescaped square bracket characters are not allowed in
7575 [link labels].  A link label can have at most 999
7576 characters inside the square brackets.
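
The constraints on what may appear between the brackets of a link
label can be sketched as follows.  The helper name
`is_valid_label_content` is hypothetical, the argument is assumed to
be the text between the outer brackets, and the 999-character limit
is assumed to count Unicode code points.

```python
# The whitespace characters: space, tab, newline, line tabulation,
# form feed, and carriage return.
WHITESPACE = " \t\n\v\f\r"


def is_valid_label_content(content: str) -> bool:
    """Check the text between the `[` and `]` of a link label."""
    if len(content) > 999:
        return False
    # At least one non-whitespace character is required
    # (this also rejects the empty string).
    if all(ch in WHITESPACE for ch in content):
        return False
    i = 0
    while i < len(content):
        if content[i] == "\\":
            if i + 1 == len(content):
                return False  # would escape the closing bracket
            i += 2            # skip the backslash and the next character
            continue
        if content[i] in "[]":
            return False      # unescaped square bracket
        i += 1
    return True


print(is_valid_label_content("foo"))
print(is_valid_label_content("a[b"))
print(is_valid_label_content("a\\[b"))
```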
7577
7578 One label [matches](@)
7579 another if and only if their normalized forms are equal.  To normalize a
7580 label, perform the *Unicode case fold* and collapse consecutive internal
7581 [whitespace] to a single space.  If there are multiple
7582 matching reference link definitions, the one that comes first in the
7583 document is used.  (It is desirable in such cases to emit a warning.)
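
A minimal Python sketch of this matching rule is shown below.  The
names `normalize_label` and `build_reference_map` are hypothetical.
The sketch relies on `str.casefold` for the Unicode case fold and
does not trim leading or trailing whitespace, since the paragraph
above mentions only internal whitespace.

```python
import re


def normalize_label(label: str) -> str:
    """Case-fold a label and collapse whitespace runs to one space."""
    return re.sub(r"[ \t\n\v\f\r]+", " ", label.casefold())


def build_reference_map(definitions):
    """Map normalized labels to destinations; the first definition wins."""
    refs = {}
    for label, destination in definitions:
        # setdefault keeps an existing entry, so a later duplicate
        # definition is ignored.
        refs.setdefault(normalize_label(label), destination)
    return refs


refs = build_reference_map([("bar", "/first"), ("BAR", "/second")])
print(refs[normalize_label("BaR")])
print(normalize_label("Foo\n  bar") == normalize_label("Foo bar"))
```

An implementation that wants to emit the suggested warning could do
so at the point where `setdefault` finds an existing entry.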
7584
7585 The contents of the first link label are parsed as inlines, which are
7586 used as the link's text.  The link's URI and title are provided by the
7587 matching [link reference definition].
7588
7589 Here is a simple example:
7590
7591 ```````````````````````````````` example
7592 [foo][bar]
7593
7594 [bar]: /url "title"
7595 .
7596 <p><a href="/url" title="title">foo</a></p>
7597 ````````````````````````````````
7598
7599
7600 The rules for the [link text] are the same as with
7601 [inline links].  Thus:
7602
7603 The link text may contain balanced brackets, but not unbalanced ones,
7604 unless they are escaped:
7605
7606 ```````````````````````````````` example
7607 [link [foo [bar]]][ref]
7608
7609 [ref]: /uri
7610 .
7611 <p><a href="/uri">link [foo [bar]]</a></p>
7612 ````````````````````````````````
7613
7614
7615 ```````````````````````````````` example
7616 [link \[bar][ref]
7617
7618 [ref]: /uri
7619 .
7620 <p><a href="/uri">link [bar</a></p>
7621 ````````````````````````````````
7622
7623
7624 The link text may contain inline content:
7625
7626 ```````````````````````````````` example
7627 [link *foo **bar** `#`*][ref]
7628
7629 [ref]: /uri
7630 .
7631 <p><a href="/uri">link <em>foo <strong>bar</strong> <code>#</code></em></a></p>
7632 ````````````````````````````````
7633
7634
7635 ```````````````````````````````` example
7636 [![moon](moon.jpg)][ref]
7637
7638 [ref]: /uri
7639 .
7640 <p><a href="/uri"><img src="moon.jpg" alt="moon" /></a></p>
7641 ````````````````````````````````
7642
7643
7644 However, links may not contain other links, at any level of nesting.
7645
7646 ```````````````````````````````` example
7647 [foo [bar](/uri)][ref]
7648
7649 [ref]: /uri
7650 .
7651 <p>[foo <a href="/uri">bar</a>]<a href="/uri">ref</a></p>
7652 ````````````````````````````````
7653
7654
7655 ```````````````````````````````` example
7656 [foo *bar [baz][ref]*][ref]
7657
7658 [ref]: /uri
7659 .
7660 <p>[foo <em>bar <a href="/uri">baz</a></em>]<a href="/uri">ref</a></p>
7661 ````````````````````````````````
7662
7663
7664 (In the examples above, we have two [shortcut reference links]
7665 instead of one [full reference link].)
7666
7667 The following cases illustrate the precedence of link text grouping over
7668 emphasis grouping:
7669
7670 ```````````````````````````````` example
7671 *[foo*][ref]
7672
7673 [ref]: /uri
7674 .
7675 <p>*<a href="/uri">foo*</a></p>
7676 ````````````````````````````````
7677
7678
7679 ```````````````````````````````` example
7680 [foo *bar][ref]
7681
7682 [ref]: /uri
7683 .
7684 <p><a href="/uri">foo *bar</a></p>
7685 ````````````````````````````````
7686
7687
7688 These cases illustrate the precedence of HTML tags, code spans,
7689 and autolinks over link grouping:
7690
7691 ```````````````````````````````` example
7692 [foo <bar attr="][ref]">
7693
7694 [ref]: /uri
7695 .
7696 <p>[foo <bar attr="][ref]"></p>
7697 ````````````````````````````````
7698
7699
7700 ```````````````````````````````` example
7701 [foo`][ref]`
7702
7703 [ref]: /uri
7704 .
7705 <p>[foo<code>][ref]</code></p>
7706 ````````````````````````````````
7707
7708
7709 ```````````````````````````````` example
7710 [foo<http://example.com/?search=][ref]>
7711
7712 [ref]: /uri
7713 .
7714 <p>[foo<a href="http://example.com/?search=%5D%5Bref%5D">http://example.com/?search=][ref]</a></p>
7715 ````````````````````````````````
7716
7717
7718 Matching is case-insensitive:
7719
7720 ```````````````````````````````` example
7721 [foo][BaR]
7722
7723 [bar]: /url "title"
7724 .
7725 <p><a href="/url" title="title">foo</a></p>
7726 ````````````````````````````````
7727
7728
7729 Unicode case fold is used:
7730
7731 ```````````````````````````````` example
7732 [Толпой][Толпой] is a Russian word.
7733
7734 [ТОЛПОЙ]: /url
7735 .
7736 <p><a href="/url">Толпой</a> is a Russian word.</p>
7737 ````````````````````````````````
7738
7739
7740 Consecutive internal [whitespace] is treated as one space for
7741 purposes of determining matching:
7742
7743 ```````````````````````````````` example
7744 [Foo
7745   bar]: /url
7746
7747 [Baz][Foo bar]
7748 .
7749 <p><a href="/url">Baz</a></p>
7750 ````````````````````````````````
7751
7752
7753 No [whitespace] is allowed between the [link text] and the
7754 [link label]:
7755
7756 ```````````````````````````````` example
7757 [foo] [bar]
7758
7759 [bar]: /url "title"
7760 .
7761 <p>[foo] <a href="/url" title="title">bar</a></p>
7762 ````````````````````````````````
7763
7764
7765 ```````````````````````````````` example
7766 [foo]
7767 [bar]
7768
7769 [bar]: /url "title"
7770 .
7771 <p>[foo]
7772 <a href="/url" title="title">bar</a></p>
7773 ````````````````````````````````
7774
7775
7776 This is a departure from John Gruber's original Markdown syntax
7777 description, which explicitly allows whitespace between the link
7778 text and the link label.  It brings reference links in line with
7779 [inline links], which (according to both original Markdown and
7780 this spec) cannot have whitespace after the link text.  More
7781 importantly, it prevents inadvertent capture of consecutive
7782 [shortcut reference links]. If whitespace is allowed between the
7783 link text and the link label, then in the following we will have
7784 a single reference link, not two shortcut reference links, as
7785 intended:
7786
7787 ``` markdown
7788 [foo]
7789 [bar]
7790
7791 [foo]: /url1
7792 [bar]: /url2
7793 ```
7794
7795 (Note that [shortcut reference links] were introduced by Gruber
7796 himself in a beta version of `Markdown.pl`, but never included
7797 in the official syntax description.  Without shortcut reference
7798 links, it is harmless to allow space between the link text and
7799 link label; but once shortcut references are introduced, it is
7800 too dangerous to allow this, as it frequently leads to
7801 unintended results.)
7802
7803 When there are multiple matching [link reference definitions],
7804 the first is used:
7805
7806 ```````````````````````````````` example
7807 [foo]: /url1
7808
7809 [foo]: /url2
7810
7811 [bar][foo]
7812 .
7813 <p><a href="/url1">bar</a></p>
7814 ````````````````````````````````
7815
7816
7817 Note that matching is performed on normalized strings, not parsed
7818 inline content.  So the following does not match, even though the
7819 labels define equivalent inline content:
7820
7821 ```````````````````````````````` example
7822 [bar][foo\!]
7823
7824 [foo!]: /url
7825 .
7826 <p>[bar][foo!]</p>
7827 ````````````````````````````````
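
The matching rules illustrated above (case folding, whitespace collapsing, first definition wins, and comparison of raw strings rather than parsed inlines) can be sketched in Python as follows. This is only a minimal sketch: the names `normalize_label` and `build_reference_map` are hypothetical, and it assumes the label has already been extracted without its enclosing brackets.

```python
import re

# Whitespace as this spec defines it: space, tab, newline,
# line tabulation, form feed, carriage return.
_WHITESPACE_RUN = re.compile(r"[ \t\n\x0b\x0c\r]+")


def normalize_label(label: str) -> str:
    """Collapse whitespace runs to one space and apply Unicode case folding."""
    return _WHITESPACE_RUN.sub(" ", label).casefold()


def build_reference_map(definitions):
    """Map normalized labels to destinations; the first definition wins."""
    references = {}
    for label, destination in definitions:
        # setdefault keeps an existing entry, so later duplicates are ignored.
        references.setdefault(normalize_label(label), destination)
    return references
```

Because the comparison is made on the normalized source string, `foo\!` and `foo!` normalize to different keys even though they would render identically.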
7828
7829
7830 [Link labels] cannot contain brackets, unless they are
7831 backslash-escaped:
7832
7833 ```````````````````````````````` example
7834 [foo][ref[]
7835
7836 [ref[]: /uri
7837 .
7838 <p>[foo][ref[]</p>
7839 <p>[ref[]: /uri</p>
7840 ````````````````````````````````
7841
7842
7843 ```````````````````````````````` example
7844 [foo][ref[bar]]
7845
7846 [ref[bar]]: /uri
7847 .
7848 <p>[foo][ref[bar]]</p>
7849 <p>[ref[bar]]: /uri</p>
7850 ````````````````````````````````
7851
7852
7853 ```````````````````````````````` example
7854 [[[foo]]]
7855
7856 [[[foo]]]: /url
7857 .
7858 <p>[[[foo]]]</p>
7859 <p>[[[foo]]]: /url</p>
7860 ````````````````````````````````
7861
7862
7863 ```````````````````````````````` example
7864 [foo][ref\[]
7865
7866 [ref\[]: /uri
7867 .
7868 <p><a href="/uri">foo</a></p>
7869 ````````````````````````````````
7870
7871
7872 Note that in this example `]` is not backslash-escaped:
7873
7874 ```````````````````````````````` example
7875 [bar\\]: /uri
7876
7877 [bar\\]
7878 .
7879 <p><a href="/uri">bar\</a></p>
7880 ````````````````````````````````
7881
7882
7883 A [link label] must contain at least one [non-whitespace character]:
7884
7885 ```````````````````````````````` example
7886 []
7887
7888 []: /uri
7889 .
7890 <p>[]</p>
7891 <p>[]: /uri</p>
7892 ````````````````````````````````
7893
7894
7895 ```````````````````````````````` example
7896 [
7897  ]
7898
7899 [
7900  ]: /uri
7901 .
7902 <p>[
7903 ]</p>
7904 <p>[
7905 ]: /uri</p>
7906 ````````````````````````````````
7907
7908
7909 A [collapsed reference link](@)
7910 consists of a [link label] that [matches] a
7911 [link reference definition] elsewhere in the
7912 document, followed by the string `[]`.
7913 The contents of the first link label are parsed as inlines,
7914 which are used as the link's text.  The link's URI and title are
7915 provided by the matching link reference definition.  Thus,
7916 `[foo][]` is equivalent to `[foo][foo]`.
7917
7918 ```````````````````````````````` example
7919 [foo][]
7920
7921 [foo]: /url "title"
7922 .
7923 <p><a href="/url" title="title">foo</a></p>
7924 ````````````````````````````````
7925
7926
7927 ```````````````````````````````` example
7928 [*foo* bar][]
7929
7930 [*foo* bar]: /url "title"
7931 .
7932 <p><a href="/url" title="title"><em>foo</em> bar</a></p>
7933 ````````````````````````````````
7934
7935
7936 The link labels are case-insensitive:
7937
7938 ```````````````````````````````` example
7939 [Foo][]
7940
7941 [foo]: /url "title"
7942 .
7943 <p><a href="/url" title="title">Foo</a></p>
7944 ````````````````````````````````
7945
7946
7947
7948 As with full reference links, [whitespace] is not
7949 allowed between the two sets of brackets:
7950
7951 ```````````````````````````````` example
7952 [foo] 
7953 []
7954
7955 [foo]: /url "title"
7956 .
7957 <p><a href="/url" title="title">foo</a>
7958 []</p>
7959 ````````````````````````````````
7960
7961
7962 A [shortcut reference link](@)
7963 consists of a [link label] that [matches] a
7964 [link reference definition] elsewhere in the
7965 document and is not followed by `[]` or a link label.
7966 The contents of the first link label are parsed as inlines,
7967 which are used as the link's text.  The link's URI and title
7968 are provided by the matching link reference definition.
7969 Thus, `[foo]` is equivalent to `[foo][]`.
7970
7971 ```````````````````````````````` example
7972 [foo]
7973
7974 [foo]: /url "title"
7975 .
7976 <p><a href="/url" title="title">foo</a></p>
7977 ````````````````````````````````
7978
7979
7980 ```````````````````````````````` example
7981 [*foo* bar]
7982
7983 [*foo* bar]: /url "title"
7984 .
7985 <p><a href="/url" title="title"><em>foo</em> bar</a></p>
7986 ````````````````````````````````
7987
7988
7989 ```````````````````````````````` example
7990 [[*foo* bar]]
7991
7992 [*foo* bar]: /url "title"
7993 .
7994 <p>[<a href="/url" title="title"><em>foo</em> bar</a>]</p>
7995 ````````````````````````````````
7996
7997
7998 ```````````````````````````````` example
7999 [[bar [foo]
8000
8001 [foo]: /url
8002 .
8003 <p>[[bar <a href="/url">foo</a></p>
8004 ````````````````````````````````
8005
8006
8007 The link labels are case-insensitive:
8008
8009 ```````````````````````````````` example
8010 [Foo]
8011
8012 [foo]: /url "title"
8013 .
8014 <p><a href="/url" title="title">Foo</a></p>
8015 ````````````````````````````````
8016
8017
8018 A space after the link text should be preserved:
8019
8020 ```````````````````````````````` example
8021 [foo] bar
8022
8023 [foo]: /url
8024 .
8025 <p><a href="/url">foo</a> bar</p>
8026 ````````````````````````````````
8027
8028
8029 If you just want bracketed text, you can backslash-escape the
8030 opening bracket to avoid links:
8031
8032 ```````````````````````````````` example
8033 \[foo]
8034
8035 [foo]: /url "title"
8036 .
8037 <p>[foo]</p>
8038 ````````````````````````````````
8039
8040
8041 Note that this is a link, because a link label ends with the first
8042 following closing bracket:
8043
8044 ```````````````````````````````` example
8045 [foo*]: /url
8046
8047 *[foo*]
8048 .
8049 <p>*<a href="/url">foo*</a></p>
8050 ````````````````````````````````
8051
8052
8053 Full and collapsed references take precedence over shortcut
8054 references:
8055
8056 ```````````````````````````````` example
8057 [foo][bar]
8058
8059 [foo]: /url1
8060 [bar]: /url2
8061 .
8062 <p><a href="/url2">foo</a></p>
8063 ````````````````````````````````
8064
8065 ```````````````````````````````` example
8066 [foo][]
8067
8068 [foo]: /url1
8069 .
8070 <p><a href="/url1">foo</a></p>
8071 ````````````````````````````````
8072
8073 Inline links also take precedence:
8074
8075 ```````````````````````````````` example
8076 [foo]()
8077
8078 [foo]: /url1
8079 .
8080 <p><a href="">foo</a></p>
8081 ````````````````````````````````
8082
8083 ```````````````````````````````` example
8084 [foo](not a link)
8085
8086 [foo]: /url1
8087 .
8088 <p><a href="/url1">foo</a>(not a link)</p>
8089 ````````````````````````````````
8090
8091 In the following case `[bar][baz]` is parsed as a reference,
8092 `[foo]` as normal text:
8093
8094 ```````````````````````````````` example
8095 [foo][bar][baz]
8096
8097 [baz]: /url
8098 .
8099 <p>[foo]<a href="/url">bar</a></p>
8100 ````````````````````````````````
8101
8102
8103 Here, though, `[foo][bar]` is parsed as a reference, since
8104 `[bar]` is defined:
8105
8106 ```````````````````````````````` example
8107 [foo][bar][baz]
8108
8109 [baz]: /url1
8110 [bar]: /url2
8111 .
8112 <p><a href="/url2">foo</a><a href="/url1">baz</a></p>
8113 ````````````````````````````````
8114
8115
8116 Here `[foo]` is not parsed as a shortcut reference, because it
8117 is followed by a link label (even though `[bar]` is not defined):
8118
8119 ```````````````````````````````` example
8120 [foo][bar][baz]
8121
8122 [baz]: /url1
8123 [foo]: /url2
8124 .
8125 <p>[foo]<a href="/url1">bar</a></p>
8126 ````````````````````````````````
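
The precedence shown in the last few examples can be sketched as a left-to-right scan over a run of adjacent bracketed labels. This is a simplified illustration, not a full inline parser: the function name `resolve_bracket_run` is hypothetical, labels are assumed to be plain text that is already normalized, `""` stands for an empty `[]`, and no HTML escaping is performed.

```python
def resolve_bracket_run(labels, definitions):
    """Render a run of adjacent labels such as [foo][bar][baz].

    labels: bracket contents from left to right ("" stands for `[]`).
    definitions: mapping from label to link destination.
    """
    out = []
    i = 0
    while i < len(labels):
        text = labels[i]
        follower = labels[i + 1] if i + 1 < len(labels) else None
        if follower and follower in definitions:
            # Full reference link: text from this label, target from the next.
            out.append(f'<a href="{definitions[follower]}">{text}</a>')
            i += 2
        elif follower == "" and text in definitions:
            # Collapsed reference link: [foo][] behaves like [foo][foo].
            out.append(f'<a href="{definitions[text]}">{text}</a>')
            i += 2
        elif follower is None and text in definitions:
            # Shortcut reference link: only when no label follows.
            out.append(f'<a href="{definitions[text]}">{text}</a>')
            i += 1
        else:
            # Followed by an undefined label, or undefined itself: literal text.
            out.append(f"[{text}]")
            i += 1
    return "".join(out)
```

For instance, with only `baz` and `foo` defined, `resolve_bracket_run(["foo", "bar", "baz"], ...)` leaves `[foo]` as literal text, because it is followed by a link label, and links `bar` through `baz`.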
8127
8128
8129
8130 ## Images
8131
8132 Syntax for images is like the syntax for links, with one
8133 difference. Instead of [link text], we have an
8134 [image description](@).  The rules for this are the
8135 same as for [link text], except that (a) an
8136 image description starts with `![` rather than `[`, and
8137 (b) an image description may contain links.
8138 An image description has inline elements
8139 as its contents.  When an image is rendered to HTML,
8140 the description is typically used as the image's `alt` attribute.
8141
8142 ```````````````````````````````` example
8143 ![foo](/url "title")
8144 .
8145 <p><img src="/url" alt="foo" title="title" /></p>
8146 ````````````````````````````````
8147
8148
8149 ```````````````````````````````` example
8150 ![foo *bar*]
8151
8152 [foo *bar*]: train.jpg "train & tracks"
8153 .
8154 <p><img src="train.jpg" alt="foo bar" title="train &amp; tracks" /></p>
8155 ````````````````````````````````
8156
8157
8158 ```````````````````````````````` example
8159 ![foo ![bar](/url)](/url2)
8160 .
8161 <p><img src="/url2" alt="foo bar" /></p>
8162 ````````````````````````````````
8163
8164
8165 ```````````````````````````````` example
8166 ![foo [bar](/url)](/url2)
8167 .
8168 <p><img src="/url2" alt="foo bar" /></p>
8169 ````````````````````````````````
8170
8171
8172 Though this spec is concerned with parsing, not rendering, it is
8173 recommended that in rendering to HTML, only the plain string content
8174 of the [image description] be used.  Note that in
8175 the above example, the alt attribute's value is `foo bar`, not `foo
8176 [bar](/url)` or `foo <a href="/url">bar</a>`.  Only the plain string
8177 content is rendered, without formatting.
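
One way a renderer might compute this plain string content is to walk the parsed image description and keep only its text. The sketch below assumes a hypothetical, minimal tree representation in which a node is either a `str` or a `(kind, children)` tuple; real implementations will have their own node types.

```python
def plain_text(node) -> str:
    """Return the text content of a node, dropping all inline formatting."""
    if isinstance(node, str):
        return node
    _kind, children = node
    # Links, emphasis, nested images, and so on contribute only their text.
    return "".join(plain_text(child) for child in children)


# Hypothetical tree for the description in ![foo [bar](/url)](/url2)
description = ("image", ["foo ", ("link", ["bar"])])
alt = plain_text(description)
```

Here `alt` is the string `foo bar`, matching the `alt` attribute in the example above.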
8178
8179 ```````````````````````````````` example
8180 ![foo *bar*][]
8181
8182 [foo *bar*]: train.jpg "train & tracks"
8183 .
8184 <p><img src="train.jpg" alt="foo bar" title="train &amp; tracks" /></p>
8185 ````````````````````````````````
8186
8187
8188 ```````````````````````````````` example
8189 ![foo *bar*][foobar]
8190
8191 [FOOBAR]: train.jpg "train & tracks"
8192 .
8193 <p><img src="train.jpg" alt="foo bar" title="train &amp; tracks" /></p>
8194 ````````````````````````````````
8195
8196
8197 ```````````````````````````````` example
8198 ![foo](train.jpg)
8199 .
8200 <p><img src="train.jpg" alt="foo" /></p>
8201 ````````````````````````````````
8202
8203
8204 ```````````````````````````````` example
8205 My ![foo bar](/path/to/train.jpg  "title"   )
8206 .
8207 <p>My <img src="/path/to/train.jpg" alt="foo bar" title="title" /></p>
8208 ````````````````````````````````
8209
8210
8211 ```````````````````````````````` example
8212 ![foo](<url>)
8213 .
8214 <p><img src="url" alt="foo" /></p>
8215 ````````````````````````````````
8216
8217
8218 ```````````````````````````````` example
8219 ![](/url)
8220 .
8221 <p><img src="/url" alt="" /></p>
8222 ````````````````````````````````
8223
8224
8225 Reference-style:
8226
8227 ```````````````````````````````` example
8228 ![foo][bar]
8229
8230 [bar]: /url
8231 .
8232 <p><img src="/url" alt="foo" /></p>
8233 ````````````````````````````````
8234
8235
8236 ```````````````````````````````` example
8237 ![foo][bar]
8238
8239 [BAR]: /url
8240 .
8241 <p><img src="/url" alt="foo" /></p>
8242 ````````````````````````````````
8243
8244
8245 Collapsed:
8246
8247 ```````````````````````````````` example
8248 ![foo][]
8249
8250 [foo]: /url "title"
8251 .
8252 <p><img src="/url" alt="foo" title="title" /></p>
8253 ````````````````````````````````
8254
8255
8256 ```````````````````````````````` example
8257 ![*foo* bar][]
8258
8259 [*foo* bar]: /url "title"
8260 .
8261 <p><img src="/url" alt="foo bar" title="title" /></p>
8262 ````````````````````````````````
8263
8264
8265 The labels are case-insensitive:
8266
8267 ```````````````````````````````` example
8268 ![Foo][]
8269
8270 [foo]: /url "title"
8271 .
8272 <p><img src="/url" alt="Foo" title="title" /></p>
8273 ````````````````````````````````
8274
8275
8276 As with reference links, [whitespace] is not allowed
8277 between the two sets of brackets:
8278
8279 ```````````````````````````````` example
8280 ![foo] 
8281 []
8282
8283 [foo]: /url "title"
8284 .
8285 <p><img src="/url" alt="foo" title="title" />
8286 []</p>
8287 ````````````````````````````````
8288
8289
8290 Shortcut:
8291
8292 ```````````````````````````````` example
8293 ![foo]
8294
8295 [foo]: /url "title"
8296 .
8297 <p><img src="/url" alt="foo" title="title" /></p>
8298 ````````````````````````````````
8299
8300
8301 ```````````````````````````````` example
8302 ![*foo* bar]
8303
8304 [*foo* bar]: /url "title"
8305 .
8306 <p><img src="/url" alt="foo bar" title="title" /></p>
8307 ````````````````````````````````
8308
8309
8310 Note that link labels cannot contain unescaped brackets:
8311
8312 ```````````````````````````````` example
8313 ![[foo]]
8314
8315 [[foo]]: /url "title"
8316 .
8317 <p>![[foo]]</p>
8318 <p>[[foo]]: /url &quot;title&quot;</p>
8319 ````````````````````````````````
8320
8321
8322 The link labels are case-insensitive:
8323
8324 ```````````````````````````````` example
8325 ![Foo]
8326
8327 [foo]: /url "title"
8328 .
8329 <p><img src="/url" alt="Foo" title="title" /></p>
8330 ````````````````````````````````
8331
8332
8333 If you just want a literal `!` followed by bracketed text, you can
8334 backslash-escape the opening `[`:
8335
8336 ```````````````````````````````` example
8337 !\[foo]
8338
8339 [foo]: /url "title"
8340 .
8341 <p>![foo]</p>
8342 ````````````````````````````````
8343
8344
8345 If you want a link after a literal `!`, backslash-escape the
8346 `!`:
8347
8348 ```````````````````````````````` example
8349 \![foo]
8350
8351 [foo]: /url "title"
8352 .
8353 <p>!<a href="/url" title="title">foo</a></p>
8354 ````````````````````````````````
8355
8356
8357 ## Autolinks
8358
8359 [Autolink](@)s are absolute URIs and email addresses inside
8360 `<` and `>`. They are parsed as links, with the URL or email address
8361 as the link's text.
8362
8363 A [URI autolink](@) consists of `<`, followed by an
8364 [absolute URI] not containing `<`, followed by `>`.  It is parsed as
8365 a link to the URI, with the URI as the link's text.
8366
8367 An [absolute URI](@),
8368 for these purposes, consists of a [scheme] followed by a colon (`:`)
8369 followed by zero or more characters other than ASCII
8370 [whitespace] and control characters, `<`, and `>`.  If
8371 the URI includes these characters, they must be percent-encoded
8372 (e.g. `%20` for a space).
8373
8374 For purposes of this spec, a [scheme](@) is any sequence
8375 of 2--32 characters beginning with an ASCII letter and followed
8376 by any combination of ASCII letters, digits, or the symbols plus
8377 ("+"), period ("."), or hyphen ("-").
8378
8379 Here are some valid autolinks:
8380
8381 ```````````````````````````````` example
8382 <http://foo.bar.baz>
8383 .
8384 <p><a href="http://foo.bar.baz">http://foo.bar.baz</a></p>
8385 ````````````````````````````````
8386
8387
8388 ```````````````````````````````` example
8389 <http://foo.bar.baz/test?q=hello&id=22&boolean>
8390 .
8391 <p><a href="http://foo.bar.baz/test?q=hello&amp;id=22&amp;boolean">http://foo.bar.baz/test?q=hello&amp;id=22&amp;boolean</a></p>
8392 ````````````````````````````````
8393
8394
8395 ```````````````````````````````` example
8396 <irc://foo.bar:2233/baz>
8397 .
8398 <p><a href="irc://foo.bar:2233/baz">irc://foo.bar:2233/baz</a></p>
8399 ````````````````````````````````
8400
8401
8402 Uppercase is also fine:
8403
8404 ```````````````````````````````` example
8405 <MAILTO:FOO@BAR.BAZ>
8406 .
8407 <p><a href="MAILTO:FOO@BAR.BAZ">MAILTO:FOO@BAR.BAZ</a></p>
8408 ````````````````````````````````
8409
8410
8411 Note that many strings that count as [absolute URIs] for
8412 purposes of this spec are not valid URIs, because their
8413 schemes are not registered or because of other problems
8414 with their syntax:
8415
8416 ```````````````````````````````` example
8417 <a+b+c:d>
8418 .
8419 <p><a href="a+b+c:d">a+b+c:d</a></p>
8420 ````````````````````````````````
8421
8422
8423 ```````````````````````````````` example
8424 <made-up-scheme://foo,bar>
8425 .
8426 <p><a href="made-up-scheme://foo,bar">made-up-scheme://foo,bar</a></p>
8427 ````````````````````````````````
8428
8429
8430 ```````````````````````````````` example
8431 <http://../>
8432 .
8433 <p><a href="http://../">http://../</a></p>
8434 ````````````````````````````````
8435
8436
8437 ```````````````````````````````` example
8438 <localhost:5001/foo>
8439 .
8440 <p><a href="localhost:5001/foo">localhost:5001/foo</a></p>
8441 ````````````````````````````````
8442
8443
8444 Spaces are not allowed in autolinks:
8445
8446 ```````````````````````````````` example
8447 <http://foo.bar/baz bim>
8448 .
8449 <p>&lt;http://foo.bar/baz bim&gt;</p>
8450 ````````````````````````````````
8451
8452
8453 Backslash-escapes do not work inside autolinks:
8454
8455 ```````````````````````````````` example
8456 <http://example.com/\[\>
8457 .
8458 <p><a href="http://example.com/%5C%5B%5C">http://example.com/\[\</a></p>
8459 ````````````````````````````````
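
The definitions of [URI autolink], [absolute URI], and [scheme] given above translate fairly directly into a regular expression. The following Python sketch is one possible reading, not a normative pattern: the names `URI_AUTOLINK` and `parse_uri_autolink` are hypothetical, and "control characters" is assumed to mean the ASCII range U+0000 to U+001F plus U+007F.

```python
import re

URI_AUTOLINK = re.compile(
    r"<("
    r"[A-Za-z][A-Za-z0-9+.\-]{1,31}"  # scheme: 2 to 32 characters in total
    r":"
    r"[^\x00-\x20\x7f<>]*"            # no ASCII whitespace, controls, `<`, or `>`
    r")>"
)


def parse_uri_autolink(text: str):
    """Return the URI if text is exactly one URI autolink, else None."""
    match = URI_AUTOLINK.fullmatch(text)
    return match.group(1) if match else None
```

Note that the pattern never consults a list of registered schemes, and that a backslash is just another permitted character, which is why backslash escapes have no effect inside an autolink.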
8460
8461
8462 An [email autolink](@)
8463 consists of `<`, followed by an [email address],
8464 followed by `>`.  The link's text is the email address,
8465 and the URL is `mailto:` followed by the email address.
8466
8467 An [email address](@),
8468 for these purposes, is anything that matches
8469 the [non-normative regex from the HTML5
8470 spec](https://html.spec.whatwg.org/multipage/forms.html#e-mail-state-(type=email)):
8471
8472     /^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?
8473     (?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/
8474
8475 Examples of email autolinks:
8476
8477 ```````````````````````````````` example
8478 <foo@bar.example.com>
8479 .
8480 <p><a href="mailto:foo@bar.example.com">foo@bar.example.com</a></p>
8481 ````````````````````````````````
8482
8483
8484 ```````````````````````````````` example
8485 <foo+special@Bar.baz-bar0.com>
8486 .
8487 <p><a href="mailto:foo+special@Bar.baz-bar0.com">foo+special@Bar.baz-bar0.com</a></p>
8488 ````````````````````````````````
8489
8490
8491 Backslash-escapes do not work inside email autolinks:
8492
8493 ```````````````````````````````` example
8494 <foo\+@bar.example.com>
8495 .
8496 <p>&lt;foo+@bar.example.com&gt;</p>
8497 ````````````````````````````````
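
For illustration, the regex above can be used from Python roughly as follows. The pattern is the one quoted from the HTML5 spec, joined onto a single logical line; the `^` and `$` anchors are replaced by `re.fullmatch`, and the names `EMAIL_ADDRESS` and `parse_email_autolink` are hypothetical.

```python
import re

EMAIL_ADDRESS = re.compile(
    r"[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+"
    r"@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?"
    r"(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*"
)


def parse_email_autolink(text: str):
    """Return the mailto: URL if text is exactly one email autolink, else None."""
    if len(text) >= 2 and text[0] == "<" and text[-1] == ">":
        address = text[1:-1]
        if EMAIL_ADDRESS.fullmatch(address):
            return "mailto:" + address
    return None
```

A backslash is not in the permitted character set, so `<foo\+@bar.example.com>` is rejected rather than unescaped.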
8498
8499
8500 These are not autolinks:
8501
8502 ```````````````````````````````` example
8503 <>
8504 .
8505 <p>&lt;&gt;</p>
8506 ````````````````````````````````
8507
8508
8509 ```````````````````````````````` example
8510 < http://foo.bar >
8511 .
8512 <p>&lt; http://foo.bar &gt;</p>
8513 ````````````````````````````````
8514
8515
8516 ```````````````````````````````` example
8517 <m:abc>
8518 .
8519 <p>&lt;m:abc&gt;</p>
8520 ````````````````````````````````
8521
8522
8523 ```````````````````````````````` example
8524 <foo.bar.baz>
8525 .
8526 <p>&lt;foo.bar.baz&gt;</p>
8527 ````````````````````````````````
8528
8529
8530 ```````````````````````````````` example
8531 http://example.com
8532 .
8533 <p>http://example.com</p>
8534 ````````````````````````````````
8535
8536
8537 ```````````````````````````````` example
8538 foo@bar.example.com
8539 .
8540 <p>foo@bar.example.com</p>
8541 ````````````````````````````````
8542
8543
8544 ## Raw HTML
8545
8546 Text between `<` and `>` that looks like an HTML tag is parsed as a
8547 raw HTML tag and will be rendered in HTML without escaping.
8548 Tag and attribute names are not limited to current HTML tags,
8549 so custom tags (and even, say, DocBook tags) may be used.
8550
8551 Here is the grammar for tags:
8552
8553 A [tag name](@) consists of an ASCII letter
8554 followed by zero or more ASCII letters, digits, or
8555 hyphens (`-`).
8556
8557 An [attribute](@) consists of [whitespace],
8558 an [attribute name], and an optional
8559 [attribute value specification].
8560
8561 An [attribute name](@)
8562 consists of an ASCII letter, `_`, or `:`, followed by zero or more ASCII
8563 letters, digits, `_`, `.`, `:`, or `-`.  (Note:  This is the XML
8564 specification restricted to ASCII.  HTML5 is laxer.)
8565
8566 An [attribute value specification](@)
8567 consists of optional [whitespace],
8568 a `=` character, optional [whitespace], and an [attribute
8569 value].
8570
8571 An [attribute value](@)
8572 consists of an [unquoted attribute value],
8573 a [single-quoted attribute value], or a [double-quoted attribute value].
8574
8575 An [unquoted attribute value](@)
8576 is a nonempty string of characters not
8577 including spaces, `"`, `'`, `=`, `<`, `>`, or `` ` ``.
8578
8579 A [single-quoted attribute value](@)
8580 consists of `'`, zero or more
8581 characters not including `'`, and a final `'`.
8582
8583 A [double-quoted attribute value](@)
8584 consists of `"`, zero or more
8585 characters not including `"`, and a final `"`.
8586
8587 An [open tag](@) consists of a `<` character, a [tag name],
8588 zero or more [attributes], optional [whitespace], an optional `/`
8589 character, and a `>` character.
8590
8591 A [closing tag](@) consists of the string `</`, a
8592 [tag name], optional [whitespace], and the character `>`.
8593
8594 An [HTML comment](@) consists of `<!--` + *text* + `-->`,
8595 where *text* does not start with `>` or `->`, does not end with `-`,
8596 and does not contain `--`.  (See the
8597 [HTML5 spec](http://www.w3.org/TR/html5/syntax.html#comments).)
8598
8599 A [processing instruction](@)
8600 consists of the string `<?`, a string
8601 of characters not including the string `?>`, and the string
8602 `?>`.
8603
8604 A [declaration](@) consists of the
8605 string `<!`, a name consisting of one or more uppercase ASCII letters,
8606 [whitespace], a string of characters not including the
8607 character `>`, and the character `>`.
8608
8609 A [CDATA section](@) consists of
8610 the string `<![CDATA[`, a string of characters not including the string
8611 `]]>`, and the string `]]>`.
8612
8613 An [HTML tag](@) consists of an [open tag], a [closing tag],
8614 an [HTML comment], a [processing instruction], a [declaration],
8615 or a [CDATA section].
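
The grammar for open and closing tags can be assembled piece by piece into a regular expression, mirroring the definitions above. The Python sketch below is a hedged illustration: the constant and function names are hypothetical, "spaces" in the unquoted attribute value rule is read as any whitespace character, and comments, processing instructions, declarations, and CDATA sections are left out.

```python
import re

WS = r"[ \t\n\x0b\x0c\r]"
TAG_NAME = r"[A-Za-z][A-Za-z0-9-]*"
ATTRIBUTE_NAME = r"[A-Za-z_:][A-Za-z0-9_.:-]*"
UNQUOTED_VALUE = r"[^ \t\n\x0b\x0c\r\"'=<>`]+"
SINGLE_QUOTED_VALUE = r"'[^']*'"
DOUBLE_QUOTED_VALUE = r'"[^"]*"'
ATTRIBUTE_VALUE = (
    f"(?:{UNQUOTED_VALUE}|{SINGLE_QUOTED_VALUE}|{DOUBLE_QUOTED_VALUE})"
)
# Optional whitespace, `=`, optional whitespace, then a value.
ATTRIBUTE_VALUE_SPEC = f"(?:{WS}*={WS}*{ATTRIBUTE_VALUE})"
# Whitespace is mandatory before each attribute name.
ATTRIBUTE = f"(?:{WS}+{ATTRIBUTE_NAME}{ATTRIBUTE_VALUE_SPEC}?)"
OPEN_TAG = f"<{TAG_NAME}{ATTRIBUTE}*{WS}*/?>"
CLOSING_TAG = f"</{TAG_NAME}{WS}*>"
OPEN_OR_CLOSING_TAG = re.compile(f"(?:{OPEN_TAG}|{CLOSING_TAG})")


def is_open_or_closing_tag(text: str) -> bool:
    """Return True if text is exactly one open tag or closing tag."""
    return OPEN_OR_CLOSING_TAG.fullmatch(text) is not None
```

Building the pattern from named parts keeps each rule checkable against its prose definition.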
8616
8617 Here are some simple open tags:
8618
8619 ```````````````````````````````` example
8620 <a><bab><c2c>
8621 .
8622 <p><a><bab><c2c></p>
8623 ````````````````````````````````
8624
8625
8626 Empty elements:
8627
8628 ```````````````````````````````` example
8629 <a/><b2/>
8630 .
8631 <p><a/><b2/></p>
8632 ````````````````````````````````
8633
8634
8635 [Whitespace] is allowed:
8636
8637 ```````````````````````````````` example
8638 <a  /><b2
8639 data="foo" >
8640 .
8641 <p><a  /><b2
8642 data="foo" ></p>
8643 ````````````````````````````````
8644
8645
8646 With attributes:
8647
8648 ```````````````````````````````` example
8649 <a foo="bar" bam = 'baz <em>"</em>'
8650 _boolean zoop:33=zoop:33 />
8651 .
8652 <p><a foo="bar" bam = 'baz <em>"</em>'
8653 _boolean zoop:33=zoop:33 /></p>
8654 ````````````````````````````````
8655
8656
8657 Custom tag names can be used:
8658
8659 ```````````````````````````````` example
8660 Foo <responsive-image src="foo.jpg" />
8661 .
8662 <p>Foo <responsive-image src="foo.jpg" /></p>
8663 ````````````````````````````````
8664
8665
8666 Illegal tag names, not parsed as HTML:
8667
8668 ```````````````````````````````` example
8669 <33> <__>
8670 .
8671 <p>&lt;33&gt; &lt;__&gt;</p>
8672 ````````````````````````````````
8673
8674
8675 Illegal attribute names:
8676
8677 ```````````````````````````````` example
8678 <a h*#ref="hi">
8679 .
8680 <p>&lt;a h*#ref=&quot;hi&quot;&gt;</p>
8681 ````````````````````````````````
8682
8683
8684 Illegal attribute values:
8685
8686 ```````````````````````````````` example
8687 <a href="hi'> <a href=hi'>
8688 .
8689 <p>&lt;a href=&quot;hi'&gt; &lt;a href=hi'&gt;</p>
8690 ````````````````````````````````
8691
8692
8693 Illegal [whitespace]:
8694
8695 ```````````````````````````````` example
8696 < a><
8697 foo><bar/ >
8698 .
8699 <p>&lt; a&gt;&lt;
8700 foo&gt;&lt;bar/ &gt;</p>
8701 ````````````````````````````````
8702
8703
8704 Missing [whitespace]:
8705
8706 ```````````````````````````````` example
8707 <a href='bar'title=title>
8708 .
8709 <p>&lt;a href='bar'title=title&gt;</p>
8710 ````````````````````````````````
8711
8712
8713 Closing tags:
8714
8715 ```````````````````````````````` example
8716 </a></foo >
8717 .
8718 <p></a></foo ></p>
8719 ````````````````````````````````
8720
8721
8722 Illegal attributes in closing tag:
8723
8724 ```````````````````````````````` example
8725 </a href="foo">
8726 .
8727 <p>&lt;/a href=&quot;foo&quot;&gt;</p>
8728 ````````````````````````````````
8729
8730
8731 Comments:
8732
8733 ```````````````````````````````` example
8734 foo <!-- this is a
8735 comment - with hyphen -->
8736 .
8737 <p>foo <!-- this is a
8738 comment - with hyphen --></p>
8739 ````````````````````````````````
8740
8741
8742 ```````````````````````````````` example
8743 foo <!-- not a comment -- two hyphens -->
8744 .
8745 <p>foo &lt;!-- not a comment -- two hyphens --&gt;</p>
8746 ````````````````````````````````
8747
8748
8749 Not comments:
8750
8751 ```````````````````````````````` example
8752 foo <!--> foo -->
8753
8754 foo <!-- foo--->
8755 .
8756 <p>foo &lt;!--&gt; foo --&gt;</p>
8757 <p>foo &lt;!-- foo---&gt;</p>
8758 ````````````````````````````````
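
The conditions on comment *text* given in the definition of [HTML comment] can be checked directly, as in this small Python sketch. The function name `is_html_comment` is hypothetical, and the input is assumed to be the complete candidate string from `<!--` through `-->`.

```python
def is_html_comment(candidate: str) -> bool:
    """Return True if candidate is a well-formed HTML comment per the rule above."""
    # The shortest valid comment is `<!---->`, seven characters long.
    if len(candidate) < 7:
        return False
    if not (candidate.startswith("<!--") and candidate.endswith("-->")):
        return False
    text = candidate[4:-3]
    if text.startswith(">") or text.startswith("->"):
        return False
    if text.endswith("-"):
        return False
    return "--" not in text
```

This accepts a comment containing a single hyphen but rejects the "two hyphens" case and both of the non-comments shown above.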
8759
8760
8761 Processing instructions:
8762
8763 ```````````````````````````````` example
8764 foo <?php echo $a; ?>
8765 .
8766 <p>foo <?php echo $a; ?></p>
8767 ````````````````````````````````
8768
8769
8770 Declarations:
8771
8772 ```````````````````````````````` example
8773 foo <!ELEMENT br EMPTY>
8774 .
8775 <p>foo <!ELEMENT br EMPTY></p>
8776 ````````````````````````````````
8777
8778
8779 CDATA sections:
8780
8781 ```````````````````````````````` example
8782 foo <![CDATA[>&<]]>
8783 .
8784 <p>foo <![CDATA[>&<]]></p>
8785 ````````````````````````````````
8786
8787
8788 Entity and numeric character references are preserved in HTML
8789 attributes:
8790
8791 ```````````````````````````````` example
8792 foo <a href="&ouml;">
8793 .
8794 <p>foo <a href="&ouml;"></p>
8795 ````````````````````````````````
8796
8797
8798 Backslash escapes do not work in HTML attributes:
8799
8800 ```````````````````````````````` example
8801 foo <a href="\*">
8802 .
8803 <p>foo <a href="\*"></p>
8804 ````````````````````````````````
8805
8806
8807 ```````````````````````````````` example
8808 <a href="\"">
8809 .
8810 <p>&lt;a href=&quot;&quot;&quot;&gt;</p>
8811 ````````````````````````````````
8812
8813
8814 ## Hard line breaks
8815
8816 A line break (not in a code span or HTML tag) that is preceded
8817 by two or more spaces and does not occur at the end of a block
8818 is parsed as a [hard line break](@) (rendered
8819 in HTML as a `<br />` tag):
8820
8821 ```````````````````````````````` example
8822 foo  
8823 baz
8824 .
8825 <p>foo<br />
8826 baz</p>
8827 ````````````````````````````````
8828
8829
8830 For a more visible alternative, a backslash before the
8831 [line ending] may be used instead of two spaces:
8832
8833 ```````````````````````````````` example
8834 foo\
8835 baz
8836 .
8837 <p>foo<br />
8838 baz</p>
8839 ````````````````````````````````
8840
8841
8842 More than two spaces can be used:
8843
8844 ```````````````````````````````` example
8845 foo       
8846 baz
8847 .
8848 <p>foo<br />
8849 baz</p>
8850 ````````````````````````````````
8851
8852
8853 Leading spaces at the beginning of the next line are ignored:
8854
8855 ```````````````````````````````` example
8856 foo  
8857      bar
8858 .
8859 <p>foo<br />
8860 bar</p>
8861 ````````````````````````````````
8862
8863
8864 ```````````````````````````````` example
8865 foo\
8866      bar
8867 .
8868 <p>foo<br />
8869 bar</p>
8870 ````````````````````````````````
8871
8872
8873 Line breaks can occur inside emphasis, links, and other constructs
8874 that allow inline content:
8875
8876 ```````````````````````````````` example
8877 *foo  
8878 bar*
8879 .
8880 <p><em>foo<br />
8881 bar</em></p>
8882 ````````````````````````````````
8883
8884
8885 ```````````````````````````````` example
8886 *foo\
8887 bar*
8888 .
8889 <p><em>foo<br />
8890 bar</em></p>
8891 ````````````````````````````````
8892
8893
8894 Line breaks do not occur inside code spans
8895
8896 ```````````````````````````````` example
8897 `code  
8898 span`
8899 .
8900 <p><code>code span</code></p>
8901 ````````````````````````````````
8902
8903
8904 ```````````````````````````````` example
8905 `code\
8906 span`
8907 .
8908 <p><code>code\ span</code></p>
8909 ````````````````````````````````
8910
8911
8912 or HTML tags:
8913
8914 ```````````````````````````````` example
8915 <a href="foo  
8916 bar">
8917 .
8918 <p><a href="foo  
8919 bar"></p>
8920 ````````````````````````````````
8921
8922
8923 ```````````````````````````````` example
8924 <a href="foo\
8925 bar">
8926 .
8927 <p><a href="foo\
8928 bar"></p>
8929 ````````````````````````````````
8930
8931
8932 Hard line breaks are for separating inline content within a block.
8933 Neither syntax for hard line breaks works at the end of a paragraph or
8934 other block element:
8935
8936 ```````````````````````````````` example
8937 foo\
8938 .
8939 <p>foo\</p>
8940 ````````````````````````````````
8941
8942
8943 ```````````````````````````````` example
8944 foo  
8945 .
8946 <p>foo</p>
8947 ````````````````````````````````
8948
8949
8950 ```````````````````````````````` example
8951 ### foo\
8952 .
8953 <h3>foo\</h3>
8954 ````````````````````````````````
8955
8956
8957 ```````````````````````````````` example
8958 ### foo  
8959 .
8960 <h3>foo</h3>
8961 ````````````````````````````````
8962
8963
8964 ## Soft line breaks
8965
8966 A regular line break (not in a code span or HTML tag) that is not
8967 preceded by two or more spaces or a backslash is parsed as a
8968 [softbreak](@).  (A softbreak may be rendered in HTML either as a
8969 [line ending] or as a space. The result will be the same in
8970 browsers. In the examples here, a [line ending] will be used.)
8971
8972 ```````````````````````````````` example
8973 foo
8974 baz
8975 .
8976 <p>foo
8977 baz</p>
8978 ````````````````````````````````
8979
8980
8981 Spaces at the end of the line and beginning of the next line are
8982 removed:
8983
8984 ```````````````````````````````` example
8985 foo 
8986  baz
8987 .
8988 <p>foo
8989 baz</p>
8990 ````````````````````````````````
8991
8992
8993 A conforming parser may render a soft line break in HTML either as a
8994 line break or as a space.
8995
8996 A renderer may also provide an option to render soft line breaks
8997 as hard line breaks.
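
The hard and soft line break rules above can be summarized in a small sketch. The function name `render_line_breaks` and its `softbreak` parameter are our own and are not part of any reference implementation. The sketch assumes the paragraph contains no code spans, raw HTML, or backslash escapes, and it does no HTML escaping.

```python
def render_line_breaks(text, softbreak="\n"):
    """Render a paragraph's raw text as HTML, handling only line endings.

    Hypothetical helper: it ignores code spans, raw HTML, backslash escapes,
    and HTML escaping, all of which a real inline parser must handle.
    """
    lines = text.split("\n")
    out = []
    for i, line in enumerate(lines):
        if i > 0:
            line = line.lstrip(" ")              # leading spaces on the next line are ignored
        stripped = line.rstrip(" ")
        if i == len(lines) - 1:
            out.append(stripped)                 # no hard break at the end of a block
        elif line.endswith("\\"):
            out.append(line[:-1] + "<br />\n")   # backslash before the line ending
        elif len(line) - len(stripped) >= 2:
            out.append(stripped + "<br />\n")    # two or more trailing spaces
        else:
            out.append(stripped + softbreak)     # softbreak: line ending or space
    return "<p>" + "".join(out) + "</p>"
```

Passing `softbreak=" "` renders soft line breaks as spaces, and passing `softbreak="<br />\n"` corresponds to the option of rendering them as hard line breaks.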
8998
8999 ## Textual content
9000
9001 Any characters not given an interpretation by the above rules will
9002 be parsed as plain textual content.
9003
9004 ```````````````````````````````` example
9005 hello $.;'there
9006 .
9007 <p>hello $.;'there</p>
9008 ````````````````````````````````
9009
9010
9011 ```````````````````````````````` example
9012 Foo χρῆν
9013 .
9014 <p>Foo χρῆν</p>
9015 ````````````````````````````````
9016
9017
9018 Internal spaces are preserved verbatim:
9019
9020 ```````````````````````````````` example
9021 Multiple     spaces
9022 .
9023 <p>Multiple     spaces</p>
9024 ````````````````````````````````
9025
9026
9027 <!-- END TESTS -->
9028
9029 # Appendix: A parsing strategy
9030
9031 In this appendix we describe some features of the parsing strategy
9032 used in the CommonMark reference implementations.
9033
9034 ## Overview
9035
9036 Parsing has two phases:
9037
9038 1. In the first phase, lines of input are consumed and the block
9039 structure of the document---its division into paragraphs, block quotes,
9040 list items, and so on---is constructed.  Text is assigned to these
9041 blocks but not parsed. Link reference definitions are parsed and a
9042 map of links is constructed.
9043
9044 2. In the second phase, the raw text contents of paragraphs and headings
9045 are parsed into sequences of Markdown inline elements (strings,
9046 code spans, links, emphasis, and so on), using the map of link
9047 references constructed in phase 1.
9048
9049 At each point in processing, the document is represented as a tree of
9050 **blocks**.  The root of the tree is a `document` block.  The `document`
9051 may have any number of other blocks as **children**.  These children
9052 may, in turn, have other blocks as children.  The last child of a block
9053 is normally considered **open**, meaning that subsequent lines of input
9054 can alter its contents.  (Blocks that are not open are **closed**.)
9055 Here, for example, is a possible document tree, with the open blocks
9056 marked by arrows:
9057
9058 ``` tree
9059 -> document
9060   -> block_quote
9061        paragraph
9062          "Lorem ipsum dolor\nsit amet."
9063     -> list (type=bullet tight=true bullet_char=-)
9064          list_item
9065            paragraph
9066              "Qui *quodsi iracundia*"
9067       -> list_item
9068         -> paragraph
9069              "aliquando id"
9070 ```
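
One way to model such a tree is sketched below. This is only an illustration: the `Block` class, its field names, and the helpers `deepest_open` and `dump` are assumptions of ours, not the data structures of the reference implementations. The sketch encodes the rule that only the last child of a block can be open by closing the previous last child whenever a new child is added.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Block:
    kind: str                      # e.g. "document", "block_quote", "paragraph"
    text: str = ""                 # raw text, parsed into inlines only in phase 2
    is_open: bool = True
    children: List["Block"] = field(default_factory=list)

    def close(self):
        # Closing a block also closes everything below it.
        self.is_open = False
        for child in self.children:
            child.close()

    def add_child(self, child):
        # Only the last child may be open, so close the previous one first.
        if self.children:
            self.children[-1].close()
        self.children.append(child)
        return child

def deepest_open(block):
    """Follow open last children down to the deepest open block."""
    while block.children and block.children[-1].is_open:
        block = block.children[-1]
    return block

def dump(block, depth=0):
    """Return the tree as lines of text, marking open blocks with an arrow."""
    marker = "-> " if block.is_open else "   "
    lines = ["  " * depth + marker + block.kind]
    if block.text:
        lines.append("  " * depth + "     " + repr(block.text))
    for child in block.children:
        lines.extend(dump(child, depth + 1))
    return lines
```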
9071
9072 ## Phase 1: block structure
9073
9074 Each line that is processed has an effect on this tree.  The line is
9075 analyzed and, depending on its contents, the document may be altered
9076 in one or more of the following ways:
9077
9078 1. One or more open blocks may be closed.
9079 2. One or more new blocks may be created as children of the
9080    last open block.
9081 3. Text may be added to the last (deepest) open block remaining
9082    on the tree.
9083
9084 Once a line has been incorporated into the tree in this way,
9085 it can be discarded, so input can be read in a stream.
9086
9087 For each line, we follow this procedure:
9088
9089 1. First we iterate through the open blocks, starting with the
9090 root document, and descending through last children down to the last
9091 open block.  Each block imposes a condition that the line must satisfy
9092 if the block is to remain open.  For example, a block quote requires a
9093 `>` character.  A paragraph requires a non-blank line.
9094 In this phase we may match all or just some of the open
9095 blocks.  But we cannot close unmatched blocks yet, because we may have a
9096 [lazy continuation line].
9097
9098 2.  Next, after consuming the continuation markers for existing
9099 blocks, we look for new block starts (e.g. `>` for a block quote).
9100 If we encounter a new block start, we close any blocks unmatched
9101 in step 1 before creating the new block as a child of the last
9102 matched block.
9103
9104 3.  Finally, we look at the remainder of the line (after block
9105 markers like `>`, list markers, and indentation have been consumed).
9106 This is text that can be incorporated into the last open
9107 block (a paragraph, code block, heading, or raw HTML).
9108
9109 Setext headings are formed when we see a line of a paragraph
9110 that is a [setext heading underline].
9111
9112 Link reference definitions are detected when a paragraph is closed;
9113 the accumulated text lines are parsed to see if they begin with
9114 one or more link reference definitions.  Any remainder becomes a
9115 normal paragraph.
9116
9117 We can see how this works by considering how the tree above is
9118 generated by four lines of Markdown:
9119
9120 ``` markdown
9121 > Lorem ipsum dolor
9122 sit amet.
9123 > - Qui *quodsi iracundia*
9124 > - aliquando id
9125 ```
9126
9127 At the outset, our document model is just
9128
9129 ``` tree
9130 -> document
9131 ```
9132
9133 The first line of our text,
9134
9135 ``` markdown
9136 > Lorem ipsum dolor
9137 ```
9138
9139 causes a `block_quote` block to be created as a child of our
9140 open `document` block, and a `paragraph` block as a child of
9141 the `block_quote`.  Then the text is added to the last open
9142 block, the `paragraph`:
9143
9144 ``` tree
9145 -> document
9146   -> block_quote
9147     -> paragraph
9148          "Lorem ipsum dolor"
9149 ```
9150
9151 The next line,
9152
9153 ``` markdown
9154 sit amet.
9155 ```
9156
9157 is a "lazy continuation" of the open `paragraph`, so it gets added
9158 to the paragraph's text:
9159
9160 ``` tree
9161 -> document
9162   -> block_quote
9163     -> paragraph
9164          "Lorem ipsum dolor\nsit amet."
9165 ```
9166
9167 The third line,
9168
9169 ``` markdown
9170 > - Qui *quodsi iracundia*
9171 ```
9172
9173 causes the `paragraph` block to be closed, and a new `list` block
9174 opened as a child of the `block_quote`.  A `list_item` is also
9175 added as a child of the `list`, and a `paragraph` as a child of
9176 the `list_item`.  The text is then added to the new `paragraph`:
9177
9178 ``` tree
9179 -> document
9180   -> block_quote
9181        paragraph
9182          "Lorem ipsum dolor\nsit amet."
9183     -> list (type=bullet tight=true bullet_char=-)
9184       -> list_item
9185         -> paragraph
9186              "Qui *quodsi iracundia*"
9187 ```
9188
9189 The fourth line,
9190
9191 ``` markdown
9192 > - aliquando id
9193 ```
9194
9195 causes the `list_item` (and its child the `paragraph`) to be closed,
9196 and a new `list_item` opened up as a child of the `list`.  A `paragraph`
9197 is added as a child of the new `list_item`, to contain the text.
9198 We thus obtain the final tree:
9199
9200 ``` tree
9201 -> document
9202   -> block_quote
9203        paragraph
9204          "Lorem ipsum dolor\nsit amet."
9205     -> list (type=bullet tight=true bullet_char=-)
9206          list_item
9207            paragraph
9208              "Qui *quodsi iracundia*"
9209       -> list_item
9210         -> paragraph
9211              "aliquando id"
9212 ```
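
The walk-through above can be reproduced with a heavily simplified sketch of the three-step procedure. It is not the reference implementation. It assumes only block quotes, `-` bullet lists with a fixed content indent of two, and paragraphs, and it ignores setext headings, link reference definitions, and indented markers. All names (`Block`, `add_line`, `add_child`, and so on) are hypothetical.

```python
class Block:
    def __init__(self, kind, parent=None, indent=0):
        self.kind, self.parent, self.indent = kind, parent, indent
        self.children, self.lines, self.is_open = [], [], True
        if parent is not None:
            parent.children.append(self)

def close(block):
    block.is_open = False
    for child in block.children:
        close(child)

def deepest_open(block):
    while block.children and block.children[-1].is_open:
        block = block.children[-1]
    return block

def can_contain(parent, kind):
    if parent.kind == "paragraph":
        return False
    if parent.kind == "list":
        return kind == "list_item"
    return kind != "list_item"

def add_child(container, kind, indent=0):
    # Climb until the container accepts the new block, closing what we leave.
    while not can_contain(container, kind):
        close(container)
        container = container.parent
    if container.children:
        close(container.children[-1])     # close blocks left unmatched in step 1
    return Block(kind, container, indent)

def strip_quote_marker(rest):
    return rest[2:] if rest.startswith("> ") else rest[1:]

def add_line(doc, line):
    # Step 1: descend through open blocks, checking each continuation condition.
    container, rest = doc, line
    while container.children and container.children[-1].is_open:
        child = container.children[-1]
        if child.kind == "block_quote":
            if not rest.startswith(">"):
                break
            rest = strip_quote_marker(rest)
        elif child.kind == "list_item":
            if rest.strip() and not rest.startswith(" " * child.indent):
                break
            rest = rest[child.indent:]
        elif child.kind == "paragraph" and not rest.strip():
            break
        container = child                 # a `list` imposes no condition of its own
    tip = deepest_open(doc)
    all_matched = container is tip
    # Step 2: look for new block starts.
    opened_new = False
    while rest.startswith(">") or rest.startswith("- "):
        if rest.startswith(">"):
            rest = strip_quote_marker(rest)
            container = add_child(container, "block_quote")
        else:
            rest = rest[2:]
            if container.kind != "list":
                container = add_child(container, "list")
            container = add_child(container, "list_item", indent=2)
        opened_new = True
    # Step 3: incorporate the remaining text.
    if not rest.strip():
        if container.children:
            close(container.children[-1])  # a blank line closes unmatched blocks
        return
    if not opened_new and not all_matched and tip.kind == "paragraph":
        target = tip                       # lazy continuation line
    elif container.kind == "paragraph":
        target = container
    else:
        target = add_child(container, "paragraph")
    target.lines.append(rest)
```

Feeding the four example lines to `add_line`, one at a time, starting from `Block("document")`, yields a tree of the same shape as the final tree shown above.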
9213
9214 ## Phase 2: inline structure
9215
9216 Once all of the input has been parsed, all open blocks are closed.
9217
9218 We then "walk the tree," visiting every node, and parse raw
9219 string contents of paragraphs and headings as inlines.  At this
9220 point we have seen all the link reference definitions, so we can
9221 resolve reference links as we go.
9222
9223 ``` tree
9224 document
9225   block_quote
9226     paragraph
9227       str "Lorem ipsum dolor"
9228       softbreak
9229       str "sit amet."
9230     list (type=bullet tight=true bullet_char=-)
9231       list_item
9232         paragraph
9233           str "Qui "
9234           emph
9235             str "quodsi iracundia"
9236       list_item
9237         paragraph
9238           str "aliquando id"
9239 ```
9240
9241 Notice how the [line ending] in the first paragraph has
9242 been parsed as a `softbreak`, and the asterisks in the first list item
9243 have become an `emph`.
9244
9245 ### An algorithm for parsing nested emphasis and links
9246
9247 By far the trickiest part of inline parsing is handling emphasis,
9248 strong emphasis, links, and images.  This is done using the following
9249 algorithm.
9250
9251 When we're parsing inlines and we hit either
9252
9253 - a run of `*` or `_` characters, or
9254 - a `[` or `![`
9255
9256 we insert a text node with these symbols as its literal content, and we
9257 add a pointer to this text node to the [delimiter stack](@).
9258
9259 The [delimiter stack] is a doubly linked list.  Each
9260 element contains a pointer to a text node, plus information about
9261
9262 - the type of delimiter (`[`, `![`, `*`, `_`),
9263 - the number of delimiters,
9264 - whether the delimiter is "active" (all are active to start), and
9265 - whether the delimiter is a potential opener, a potential closer,
9266   or both (which depends on what sort of characters precede
9267   and follow the delimiters).
9268
9269 When we hit a `]` character, we call the *look for link or image*
9270 procedure (see below).
9271
9272 When we hit the end of the input, we call the *process emphasis*
9273 procedure (see below), with `stack_bottom` = NULL.
9274
9275 #### *look for link or image*
9276
9277 Starting at the top of the delimiter stack, we look backwards
9278 through the stack for an opening `[` or `![` delimiter.
9279
9280 - If we don't find one, we return a literal text node `]`.
9281
9282 - If we do find one, but it's not *active*, we remove the inactive
9283   delimiter from the stack, and return a literal text node `]`.
9284
9285 - If we find one and it's active, then we parse ahead to see if
9286   we have an inline link/image, reference link/image, collapsed reference
9287   link/image, or shortcut reference link/image.
9288
9289   + If we don't, then we remove the opening delimiter from the
9290     delimiter stack and return a literal text node `]`.
9291
9292   + If we do, then
9293
9294     * We return a link or image node whose children are the inlines
9295       after the text node pointed to by the opening delimiter.
9296
9297     * We run *process emphasis* on these inlines, with the `[` opener
9298       as `stack_bottom`.
9299
9300     * We remove the opening delimiter.
9301
9302     * If we have a link (and not an image), we also set all
9303       `[` delimiters before the opening delimiter to *inactive*.  (This
9304       will prevent us from getting links within links.)
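
The bookkeeping in this procedure can be sketched as follows. The sketch makes strong simplifying assumptions: it recognizes only inline links and images of the form `](destination)`, it tracks only `[` and `![` delimiters, and it marks the point where *process emphasis* would run without running it. The names `Node`, `Delim`, `parse_brackets`, and the `INLINE_DEST` pattern are hypothetical.

```python
import re

class Node:
    def __init__(self, kind, literal="", children=None, dest=None):
        self.kind, self.literal = kind, literal
        self.children, self.dest = children or [], dest

class Delim:
    def __init__(self, node, char):
        self.node, self.char, self.active = node, char, True

OPEN = re.compile(r"!?\[")
TEXT = re.compile(r"[^\[\]!]+|!")
INLINE_DEST = re.compile(r"\(([^()\s]*)\)")   # simplified: "(destination)" only

def look_for_link_or_image(text, i, nodes, stack):
    """Handle the `]` at text[i]; return the index at which to resume scanning."""
    opener = next((d for d in reversed(stack) if d.char in ("[", "![")), None)
    if opener is None:
        nodes.append(Node("text", "]"))
        return i + 1
    m = INLINE_DEST.match(text, i + 1) if opener.active else None
    if m is None:                          # inactive opener, or no link follows
        stack.remove(opener)
        nodes.append(Node("text", "]"))
        return i + 1
    start = nodes.index(opener.node)
    kind = "image" if opener.char == "![" else "link"
    # *process emphasis* would run on the children here, with the opener
    # as stack_bottom; this sketch tracks no emphasis delimiters.
    nodes[start:] = [Node(kind, children=nodes[start + 1:], dest=m.group(1))]
    stack.remove(opener)
    if kind == "link":
        for d in stack:                    # prevents links within links
            if d.char == "[":
                d.active = False
    return m.end()

def parse_brackets(text):
    nodes, stack, i = [], [], 0
    while i < len(text):
        if text[i] == "]":
            i = look_for_link_or_image(text, i, nodes, stack)
            continue
        m = OPEN.match(text, i)
        if m:
            node = Node("text", m.group())
            stack.append(Delim(node, m.group()))
        else:
            m = TEXT.match(text, i)
            node = Node("text", m.group())
        nodes.append(node)
        i = m.end()
    return nodes
```

With this sketch, `parse_brackets("[a [b](u) c](v)")` produces a link only for the inner brackets, because creating that link deactivates the outer `[`. In `parse_brackets("![a [b](u)](v)")` the outer `![` stays active, so the image contains the link.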
9305
9306 #### *process emphasis*
9307
9308 Parameter `stack_bottom` sets a lower bound to how far we
9309 descend in the [delimiter stack].  If it is NULL, we can
9310 go all the way to the bottom.  Otherwise, we stop before
9311 visiting `stack_bottom`.
9312
9313 Let `current_position` point to the element on the [delimiter stack]
9314 just above `stack_bottom` (or the first element if `stack_bottom`
9315 is NULL).
9316
9317 We keep track of the `openers_bottom` for each delimiter
9318 type (`*`, `_`).  Initialize this to `stack_bottom`.
9319
9320 Then we repeat the following until we run out of potential
9321 closers:
9322
9323 - Move `current_position` forward in the delimiter stack (if needed)
9324   until we find the first potential closer with delimiter `*` or `_`.
9325   (This will be the potential closer closest
9326   to the beginning of the input -- the first one in parse order.)
9327
9328 - Now, look back in the stack (staying above `stack_bottom` and
9329   the `openers_bottom` for this delimiter type) for the
9330   first matching potential opener ("matching" means same delimiter).
9331
9332 - If one is found:
9333
9334   + Figure out whether we have emphasis or strong emphasis:
9335     if both closer and opener spans have length >= 2, we have
9336     strong, otherwise regular.
9337
9338   + Insert an emph or strong emph node accordingly, after
9339     the text node corresponding to the opener.
9340
9341   + Remove any delimiters between the opener and closer from
9342     the delimiter stack.
9343
9344   + Remove 1 (for regular emph) or 2 (for strong emph) delimiters
9345     from the opening and closing text nodes.  If they become empty
9346     as a result, remove them and remove the corresponding element
9347     of the delimiter stack.  If the closing node is removed, reset
9348     `current_position` to the next element in the stack.
9349
9350 - If none is found:
9351
9352   + Set `openers_bottom` to the element before `current_position`.
9353     (We know that there are no openers for this kind of closer up to and
9354     including this point, so this puts a lower bound on future searches.)
9355
9356   + If the closer at `current_position` is not a potential opener,
9357     remove it from the delimiter stack (since we know it can't
9358     be a closer either).
9359
9360   + Advance `current_position` to the next element in the stack.
9361
9362 After we're done, we remove all delimiters above `stack_bottom` from the
9363 delimiter stack.
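
A minimal sketch of *process emphasis*, following only the steps listed above, might look like this. Several things here are assumptions made for illustration. The delimiter stack is a Python list whose entries carry a `removed` flag, rather than a doubly linked list. `stack_bottom` is an index, with `-1` standing for NULL. The helper `scan` decides opener and closer status from adjacent whitespace alone, which is much cruder than the flanking rules in the specification. The names `Node`, `Delim`, `scan`, `render`, and `emphasize` are hypothetical.

```python
import re

class Node:
    def __init__(self, kind, literal="", children=None):
        self.kind, self.literal, self.children = kind, literal, children or []

class Delim:
    def __init__(self, node, can_open, can_close):
        self.node, self.char = node, node.literal[0]
        self.can_open, self.can_close, self.removed = can_open, can_close, False

def scan(text):
    """Split text into text nodes and push `*`/`_` runs on the delimiter stack."""
    nodes, stack = [], []
    for m in re.finditer(r"\*+|_+|[^*_]+", text):
        node = Node("text", m.group())
        nodes.append(node)
        if m.group()[0] in "*_":
            before = text[m.start() - 1] if m.start() > 0 else " "
            after = text[m.end()] if m.end() < len(text) else " "
            # Simplified flanking test: whitespace only (an assumption).
            stack.append(Delim(node, not after.isspace(), not before.isspace()))
    return nodes, stack

def process_emphasis(nodes, stack, stack_bottom=-1):
    openers_bottom = {"*": stack_bottom, "_": stack_bottom}
    pos = stack_bottom + 1
    while pos < len(stack):
        closer = stack[pos]
        if closer.removed or not closer.can_close:
            pos += 1
            continue
        # Look back for the first matching potential opener.
        opener = None
        for j in range(pos - 1, openers_bottom[closer.char], -1):
            cand = stack[j]
            if not cand.removed and cand.char == closer.char and cand.can_open:
                opener, opener_idx = cand, j
                break
        if opener is None:
            openers_bottom[closer.char] = pos - 1
            if not closer.can_open:
                closer.removed = True
            pos += 1
            continue
        strong = len(opener.node.literal) >= 2 and len(closer.node.literal) >= 2
        n = 2 if strong else 1
        i0, i1 = nodes.index(opener.node), nodes.index(closer.node)
        inner = nodes[i0 + 1:i1]
        nodes[i0 + 1:i1] = [Node("strong" if strong else "emph", children=inner)]
        for d in stack[opener_idx + 1:pos]:
            d.removed = True               # delimiters between opener and closer
        for d in (opener, closer):
            d.node.literal = d.node.literal[n:]
            if not d.node.literal:         # text node became empty
                nodes.remove(d.node)
                d.removed = True
        if closer.removed:
            pos += 1
    del stack[stack_bottom + 1:]           # drop everything above stack_bottom

def render(nodes):
    tags = {"emph": "em", "strong": "strong"}
    return "".join(
        n.literal if n.kind == "text"
        else "<%s>%s</%s>" % (tags[n.kind], render(n.children), tags[n.kind])
        for n in nodes
    )

def emphasize(text):
    nodes, stack = scan(text)
    process_emphasis(nodes, stack)
    return render(nodes)
```

For example, `emphasize("*foo **bar** baz*")` nests a `strong` node inside an `emph` node, and in `emphasize("*foo _bar* baz_")` the `_` opener is discarded once the enclosing `*` pair is matched, so both underscores stay literal.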
9364