1 ---
2 title: CommonMark Spec
3 author: John MacFarlane
4 version: 0.25
5 date: '2016-03-24'
6 license: '[CC-BY-SA 4.0](http://creativecommons.org/licenses/by-sa/4.0/)'
7 ...
8
9 # Introduction
10
11 ## What is Markdown?
12
13 Markdown is a plain text format for writing structured documents,
14 based on conventions used for indicating formatting in email and
15 Usenet posts.  It was developed in 2004 by John Gruber, who wrote
16 the first Markdown-to-HTML converter in Perl, and it soon became
17 widely used in websites.  By 2014 there were dozens of
18 implementations in many languages.  Some of them extended basic
19 Markdown syntax with conventions for footnotes, definition lists,
20 tables, and other constructs, and some allowed output not just in
21 HTML but in LaTeX and many other formats.
22
23 ## Why is a spec needed?
24
25 John Gruber's [canonical description of Markdown's
26 syntax](http://daringfireball.net/projects/markdown/syntax)
27 does not specify the syntax unambiguously.  Here are some examples of
28 questions it does not answer:
29
30 1.  How much indentation is needed for a sublist?  The spec says that
31     continuation paragraphs need to be indented four spaces, but is
32     not fully explicit about sublists.  It is natural to think that
33     they, too, must be indented four spaces, but `Markdown.pl` does
34     not require that.  This is hardly a "corner case," and divergences
35     between implementations on this issue often lead to surprises for
36     users in real documents. (See [this comment by John
37     Gruber](http://article.gmane.org/gmane.text.markdown.general/1997).)
38
39 2.  Is a blank line needed before a block quote or heading?
40     Most implementations do not require the blank line.  However,
41     this can lead to unexpected results in hard-wrapped text, and
42     also to ambiguities in parsing (note that some implementations
43     put the heading inside the blockquote, while others do not).
44     (John Gruber has also spoken [in favor of requiring the blank
45     lines](http://article.gmane.org/gmane.text.markdown.general/2146).)
46
47 3.  Is a blank line needed before an indented code block?
48     (`Markdown.pl` requires it, but this is not mentioned in the
49     documentation, and some implementations do not require it.)
50
51     ``` markdown
52     paragraph
53         code?
54     ```
55
56 4.  What is the exact rule for determining when list items get
57     wrapped in `<p>` tags?  Can a list be partially "loose" and partially
58     "tight"?  What should we do with a list like this?
59
60     ``` markdown
61     1. one
62
63     2. two
64     3. three
65     ```
66
67     Or this?
68
69     ``` markdown
70     1.  one
71         - a
72
73         - b
74     2.  two
75     ```
76
77     (There are some relevant comments by John Gruber
78     [here](http://article.gmane.org/gmane.text.markdown.general/2554).)
79
80 5.  Can list markers be indented?  Can ordered list markers be right-aligned?
81
82     ``` markdown
83      8. item 1
84      9. item 2
85     10. item 2a
86     ```
87
88 6.  Is this one list with a thematic break in its second item,
89     or two lists separated by a thematic break?
90
91     ``` markdown
92     * a
93     * * * * *
94     * b
95     ```
96
97 7.  When list markers change from numbers to bullets, do we have
98     two lists or one?  (The Markdown syntax description suggests two,
99     but the Perl scripts and many other implementations produce one.)
100
101     ``` markdown
102     1. fee
103     2. fie
104     -  foe
105     -  fum
106     ```
107
108 8.  What are the precedence rules for the markers of inline structure?
109     For example, is the following a valid link, or does the code span
110     take precedence?
111
112     ``` markdown
113     [a backtick (`)](/url) and [another backtick (`)](/url).
114     ```
115
116 9.  What are the precedence rules for markers of emphasis and strong
117     emphasis?  For example, how should the following be parsed?
118
119     ``` markdown
120     *foo *bar* baz*
121     ```
122
123 10. What are the precedence rules between block-level and inline-level
124     structure?  For example, how should the following be parsed?
125
126     ``` markdown
127     - `a long code span can contain a hyphen like this
128       - and it can screw things up`
129     ```
130
131 11. Can list items include section headings?  (`Markdown.pl` does not
132     allow this, but does allow blockquotes to include headings.)
133
134     ``` markdown
135     - # Heading
136     ```
137
138 12. Can list items be empty?
139
140     ``` markdown
141     * a
142     *
143     * b
144     ```
145
146 13. Can link references be defined inside block quotes or list items?
147
148     ``` markdown
149     > Blockquote [foo].
150     >
151     > [foo]: /url
152     ```
153
154 14. If there are multiple definitions for the same reference, which takes
155     precedence?
156
157     ``` markdown
158     [foo]: /url1
159     [foo]: /url2
160
161     [foo][]
162     ```
163
164 In the absence of a spec, early implementers consulted `Markdown.pl`
165 to resolve these ambiguities.  But `Markdown.pl` was quite buggy, and
166 gave manifestly bad results in many cases, so it was not a
167 satisfactory replacement for a spec.
168
169 Because there is no unambiguous spec, implementations have diverged
170 considerably.  As a result, users are often surprised to find that
171 a document that renders one way on one system (say, a GitHub wiki)
172 renders differently on another (say, converting to DocBook using
173 pandoc).  To make matters worse, because nothing in Markdown counts
174 as a "syntax error," the divergence often isn't discovered right away.
175
176 ## About this document
177
178 This document attempts to specify Markdown syntax unambiguously.
179 It contains many examples with side-by-side Markdown and
180 HTML.  These are intended to double as conformance tests.  An
181 accompanying script `spec_tests.py` can be used to run the tests
182 against any Markdown program:
183
184     python test/spec_tests.py --spec spec.txt --program PROGRAM
185
186 Since this document describes how Markdown is to be parsed into
187 an abstract syntax tree, it would have made sense to use an abstract
188 representation of the syntax tree instead of HTML.  But HTML is capable
189 of representing the structural distinctions we need to make, and the
190 choice of HTML for the tests makes it possible to run the tests against
191 an implementation without writing an abstract syntax tree renderer.
192
193 This document is generated from a text file, `spec.txt`, written
194 in Markdown with a small extension for the side-by-side tests.
195 The script `tools/makespec.py` can be used to convert `spec.txt` into
196 HTML or CommonMark (which can then be converted into other formats).
197
198 In the examples, the `→` character is used to represent tabs.
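
As an illustration of the side-by-side test format, the following Python
sketch pulls (Markdown, HTML) pairs out of the text of `spec.txt`.  It is
not the `spec_tests.py` script itself.  The name `extract_examples` is
hypothetical, and the sketch assumes that every example opens with a fence
of 32 backticks followed by ` example`, separates Markdown from HTML with a
lone `.`, and closes with a bare fence of the same length, as the examples
in this document do.

```python
FENCE = "`" * 32  # assumed fence length, taken from the examples in this document


def extract_examples(spec_text):
    """Return a list of (markdown, html) pairs found in spec_text."""
    examples = []
    state = None          # None (outside), "markdown", or "html"
    md_lines, html_lines = [], []
    for line in spec_text.split("\n"):
        if state is None:
            if line == FENCE + " example":
                state, md_lines, html_lines = "markdown", [], []
        elif line == FENCE:
            # Closing fence: emit the pair, turning the visible arrow into a tab.
            md = "".join(l + "\n" for l in md_lines).replace("\u2192", "\t")
            html = "".join(l + "\n" for l in html_lines).replace("\u2192", "\t")
            examples.append((md, html))
            state = None
        elif state == "markdown" and line == ".":
            state = "html"
        elif state == "markdown":
            md_lines.append(line)
        else:
            html_lines.append(line)
    return examples
```

Each pair could then be fed to a Markdown program and its output compared
with the expected HTML.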
199
200 # Preliminaries
201
202 ## Characters and lines
203
204 Any sequence of [characters] is a valid CommonMark
205 document.
206
207 A [character](@) is a Unicode code point.  Although some
208 code points (for example, combining accents) do not correspond to
209 characters in an intuitive sense, all code points count as characters
210 for purposes of this spec.
211
212 This spec does not specify an encoding; it thinks of lines as composed
213 of [characters] rather than bytes.  A conforming parser may be limited
214 to a certain encoding.
215
216 A [line](@) is a sequence of zero or more [characters]
217 other than newline (`U+000A`) or carriage return (`U+000D`),
218 followed by a [line ending] or by the end of file.
219
220 A [line ending](@) is a newline (`U+000A`), a carriage return
221 (`U+000D`) not followed by a newline, or a carriage return and a
222 following newline.
223
224 A line containing no characters, or a line containing only spaces
225 (`U+0020`) or tabs (`U+0009`), is called a [blank line](@).
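
The definitions of line, line ending, and blank line can be sketched in
Python as follows.  The function names `split_lines` and `is_blank` are
hypothetical and chosen only for illustration.

```python
import re

# A line ending is CR LF, a CR not followed by LF, or a bare LF.
# Listing "\r\n" first keeps a CR LF pair from being split in two.
_LINE_ENDING = re.compile(r"\r\n|\r|\n")


def split_lines(text):
    """Split text into lines, without their line endings."""
    lines = _LINE_ENDING.split(text)
    # A final line ending terminates the last line; it does not start a new one.
    if lines and lines[-1] == "":
        lines.pop()
    return lines


def is_blank(line):
    """A blank line contains no characters, or only spaces and tabs."""
    return all(ch in " \t" for ch in line)
```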
226
227 The following definitions of character classes will be used in this spec:
228
229 A [whitespace character](@) is a space
230 (`U+0020`), tab (`U+0009`), newline (`U+000A`), line tabulation (`U+000B`),
231 form feed (`U+000C`), or carriage return (`U+000D`).
232
233 [Whitespace](@) is a sequence of one or more [whitespace
234 characters].
235
236 A [Unicode whitespace character](@) is
237 any code point in the Unicode `Zs` class, or a tab (`U+0009`),
238 carriage return (`U+000D`), newline (`U+000A`), or form feed
239 (`U+000C`).
240
241 [Unicode whitespace](@) is a sequence of one
242 or more [Unicode whitespace characters].
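
The two whitespace classes differ in small ways that are easy to miss, so
a short Python sketch may help.  The function names are hypothetical; the
standard `unicodedata` module supplies the general category of a code
point.

```python
import unicodedata


def is_whitespace_char(ch):
    """Space, tab, newline, line tabulation, form feed, or carriage return."""
    return ch in {" ", "\t", "\n", "\x0b", "\x0c", "\r"}


def is_unicode_whitespace_char(ch):
    """Any code point in class Zs, or tab, carriage return, newline, form feed."""
    return ch in {"\t", "\r", "\n", "\x0c"} or unicodedata.category(ch) == "Zs"
```

Note that, under these definitions, line tabulation (`U+000B`) is a
whitespace character but not a Unicode whitespace character, while a
no-break space (`U+00A0`) is the reverse.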
243
244 A [space](@) is `U+0020`.
245
246 A [non-whitespace character](@) is any character
247 that is not a [whitespace character].
248
249 An [ASCII punctuation character](@)
250 is `!`, `"`, `#`, `$`, `%`, `&`, `'`, `(`, `)`,
251 `*`, `+`, `,`, `-`, `.`, `/`, `:`, `;`, `<`, `=`, `>`, `?`, `@`,
252 `[`, `\`, `]`, `^`, `_`, `` ` ``, `{`, `|`, `}`, or `~`.
253
254 A [punctuation character](@) is an [ASCII
255 punctuation character] or anything in
256 the Unicode classes `Pc`, `Pd`, `Pe`, `Pf`, `Pi`, `Po`, or `Ps`.
257
258 ## Tabs
259
260 Tabs in lines are not expanded to [spaces].  However,
261 in contexts where indentation is significant for the
262 document's structure, tabs behave as if they were replaced
263 by spaces with a tab stop of 4 characters.
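
One way to apply this rule without rewriting the line is to compute the
width of the leading whitespace as a column count.  The following Python
sketch does that; the function name `indent_width` and its `tab_stop`
parameter are hypothetical.

```python
def indent_width(line, tab_stop=4):
    """Return the column reached by the leading spaces and tabs of line."""
    col = 0
    for ch in line:
        if ch == " ":
            col += 1
        elif ch == "\t":
            # A tab advances to the next multiple of tab_stop.
            col += tab_stop - (col % tab_stop)
        else:
            break
    return col
```

A tab preceded by two spaces therefore reaches column 4, the same as a tab
on its own.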
264
265 ```````````````````````````````` example
266 →foo→baz→→bim
267 .
268 <pre><code>foo→baz→→bim
269 </code></pre>
270 ````````````````````````````````
271
272
273 ```````````````````````````````` example
274   →foo→baz→→bim
275 .
276 <pre><code>foo→baz→→bim
277 </code></pre>
278 ````````````````````````````````
279
280
281 ```````````````````````````````` example
282     a→a
283     ὐ→a
284 .
285 <pre><code>a→a
286 ὐ→a
287 </code></pre>
288 ````````````````````````````````
289
290
291 ```````````````````````````````` example
292   - foo
293
294 →bar
295 .
296 <ul>
297 <li>
298 <p>foo</p>
299 <p>bar</p>
300 </li>
301 </ul>
302 ````````````````````````````````
303
304 ```````````````````````````````` example
305 - foo
306
307 →→bar
308 .
309 <ul>
310 <li>
311 <p>foo</p>
312 <pre><code>  bar
313 </code></pre>
314 </li>
315 </ul>
316 ````````````````````````````````
317
318 ```````````````````````````````` example
319 >→→foo
320 .
321 <blockquote>
322 <pre><code>  foo
323 </code></pre>
324 </blockquote>
325 ````````````````````````````````
326
327 ```````````````````````````````` example
328 -→→foo
329 .
330 <ul>
331 <li>
332 <pre><code>  foo
333 </code></pre>
334 </li>
335 </ul>
336 ````````````````````````````````
337
338
339 ```````````````````````````````` example
340     foo
341 →bar
342 .
343 <pre><code>foo
344 bar
345 </code></pre>
346 ````````````````````````````````
347
348 ```````````````````````````````` example
349  - foo
350    - bar
351 → - baz
352 .
353 <ul>
354 <li>foo
355 <ul>
356 <li>bar
357 <ul>
358 <li>baz</li>
359 </ul>
360 </li>
361 </ul>
362 </li>
363 </ul>
364 ````````````````````````````````
365
366
367
368 ## Insecure characters
369
370 For security reasons, the Unicode character `U+0000` must be replaced
371 with the REPLACEMENT CHARACTER (`U+FFFD`).
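
In Python, for instance, this replacement could be a single call (the
function name is hypothetical):

```python
def replace_insecure_characters(text):
    """Replace every U+0000 with U+FFFD (REPLACEMENT CHARACTER)."""
    return text.replace("\x00", "\ufffd")
```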
372
373 # Blocks and inlines
374
375 We can think of a document as a sequence of
376 [blocks](@)---structural elements like paragraphs, block
377 quotations, lists, headings, rules, and code blocks.  Some blocks (like
378 block quotes and list items) contain other blocks; others (like
379 headings and paragraphs) contain [inline](@) content---text,
380 links, emphasized text, images, code, and so on.
381
382 ## Precedence
383
384 Indicators of block structure always take precedence over indicators
385 of inline structure.  So, for example, the following is a list with
386 two items, not a list with one item containing a code span:
387
388 ```````````````````````````````` example
389 - `one
390 - two`
391 .
392 <ul>
393 <li>`one</li>
394 <li>two`</li>
395 </ul>
396 ````````````````````````````````
397
398
399 This means that parsing can proceed in two steps:  first, the block
400 structure of the document can be discerned; second, text lines inside
401 paragraphs, headings, and other block constructs can be parsed for inline
402 structure.  The second step requires information about link reference
403 definitions that will be available only at the end of the first
404 step.  Note that the first step requires processing lines in sequence,
405 but the second can be parallelized, since the inline parsing of
406 one block element does not affect the inline parsing of any other.
407
408 ## Container blocks and leaf blocks
409
410 We can divide blocks into two types:
411 [container block](@)s,
412 which can contain other blocks, and [leaf block](@)s,
413 which cannot.
414
415 # Leaf blocks
416
417 This section describes the different kinds of leaf block that make up a
418 Markdown document.
419
420 ## Thematic breaks
421
422 A line consisting of 0-3 spaces of indentation, followed by a sequence
423 of three or more matching `-`, `_`, or `*` characters, each followed
424 optionally by any number of spaces, forms a
425 [thematic break](@).
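
The line-level rule can be sketched as a regular expression in Python.
This is only an illustration with a hypothetical function name; it tests a
single line in isolation and does not model the precedence rules for
setext headings and list items that are discussed later in this section.

```python
import re

# 0-3 spaces, then one of - _ *, then at least two more of the same
# character, each optionally preceded by spaces, then trailing spaces.
_THEMATIC_BREAK = re.compile(r" {0,3}([-_*])(?: *\1){2,} *")


def is_thematic_break(line):
    """True if line, taken on its own, has the form of a thematic break."""
    return _THEMATIC_BREAK.fullmatch(line) is not None
```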
426
427 ```````````````````````````````` example
428 ***
429 ---
430 ___
431 .
432 <hr />
433 <hr />
434 <hr />
435 ````````````````````````````````
436
437
438 Wrong characters:
439
440 ```````````````````````````````` example
441 +++
442 .
443 <p>+++</p>
444 ````````````````````````````````
445
446
447 ```````````````````````````````` example
448 ===
449 .
450 <p>===</p>
451 ````````````````````````````````
452
453
454 Not enough characters:
455
456 ```````````````````````````````` example
457 --
458 **
459 __
460 .
461 <p>--
462 **
463 __</p>
464 ````````````````````````````````
465
466
467 One to three spaces of indentation are allowed:
468
469 ```````````````````````````````` example
470  ***
471   ***
472    ***
473 .
474 <hr />
475 <hr />
476 <hr />
477 ````````````````````````````````
478
479
480 Four spaces is too many:
481
482 ```````````````````````````````` example
483     ***
484 .
485 <pre><code>***
486 </code></pre>
487 ````````````````````````````````
488
489
490 ```````````````````````````````` example
491 Foo
492     ***
493 .
494 <p>Foo
495 ***</p>
496 ````````````````````````````````
497
498
499 More than three characters may be used:
500
501 ```````````````````````````````` example
502 _____________________________________
503 .
504 <hr />
505 ````````````````````````````````
506
507
508 Spaces are allowed between the characters:
509
510 ```````````````````````````````` example
511  - - -
512 .
513 <hr />
514 ````````````````````````````````
515
516
517 ```````````````````````````````` example
518  **  * ** * ** * **
519 .
520 <hr />
521 ````````````````````````````````
522
523
524 ```````````````````````````````` example
525 -     -      -      -
526 .
527 <hr />
528 ````````````````````````````````
529
530
531 Spaces are allowed at the end:
532
533 ```````````````````````````````` example
534 - - - -    
535 .
536 <hr />
537 ````````````````````````````````
538
539
540 However, no other characters may occur in the line:
541
542 ```````````````````````````````` example
543 _ _ _ _ a
544
545 a------
546
547 ---a---
548 .
549 <p>_ _ _ _ a</p>
550 <p>a------</p>
551 <p>---a---</p>
552 ````````````````````````````````
553
554
555 It is required that all of the [non-whitespace characters] be the same.
556 So, this is not a thematic break:
557
558 ```````````````````````````````` example
559  *-*
560 .
561 <p><em>-</em></p>
562 ````````````````````````````````
563
564
565 Thematic breaks do not need blank lines before or after:
566
567 ```````````````````````````````` example
568 - foo
569 ***
570 - bar
571 .
572 <ul>
573 <li>foo</li>
574 </ul>
575 <hr />
576 <ul>
577 <li>bar</li>
578 </ul>
579 ````````````````````````````````
580
581
582 Thematic breaks can interrupt a paragraph:
583
584 ```````````````````````````````` example
585 Foo
586 ***
587 bar
588 .
589 <p>Foo</p>
590 <hr />
591 <p>bar</p>
592 ````````````````````````````````
593
594
595 If a line of dashes that meets the above conditions for being a
596 thematic break could also be interpreted as the underline of a [setext
597 heading], the interpretation as a
598 [setext heading] takes precedence. Thus, for example,
599 this is a setext heading, not a paragraph followed by a thematic break:
600
601 ```````````````````````````````` example
602 Foo
603 ---
604 bar
605 .
606 <h2>Foo</h2>
607 <p>bar</p>
608 ````````````````````````````````
609
610
611 When both a thematic break and a list item are possible
612 interpretations of a line, the thematic break takes precedence:
613
614 ```````````````````````````````` example
615 * Foo
616 * * *
617 * Bar
618 .
619 <ul>
620 <li>Foo</li>
621 </ul>
622 <hr />
623 <ul>
624 <li>Bar</li>
625 </ul>
626 ````````````````````````````````
627
628
629 If you want a thematic break in a list item, use a different bullet:
630
631 ```````````````````````````````` example
632 - Foo
633 - * * *
634 .
635 <ul>
636 <li>Foo</li>
637 <li>
638 <hr />
639 </li>
640 </ul>
641 ````````````````````````````````
642
643
644 ## ATX headings
645
646 An [ATX heading](@)
647 consists of a string of characters, parsed as inline content, between an
648 opening sequence of 1--6 unescaped `#` characters and an optional
649 closing sequence of any number of unescaped `#` characters.
650 The opening sequence of `#` characters must be followed by a
651 [space] or by the end of line. The optional closing sequence of `#`s must be
652 preceded by a [space] and may be followed by spaces only.  The opening
653 `#` character may be indented 0-3 spaces.  The raw contents of the
654 heading are stripped of leading and trailing spaces before being parsed
655 as inline content.  The heading level is equal to the number of `#`
656 characters in the opening sequence.
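
The definition above can be sketched in Python as follows.  The function
name `parse_atx_heading` is hypothetical.  The sketch returns the heading
level and the raw contents before inline parsing, so backslash escapes are
still present in the result, and it looks at one line in isolation.

```python
import re

# Opening sequence: 0-3 spaces, 1-6 '#', then spaces or end of line.
_ATX_OPEN = re.compile(r"^ {0,3}(#{1,6})(?: +|$)")
# Closing sequence: a run of '#' preceded by a space (or by the start of
# the contents, when the heading is otherwise empty), then only spaces.
_ATX_CLOSE = re.compile(r"(?:^| +)#+ *$")


def parse_atx_heading(line):
    """Return (level, raw_contents), or None if line is not an ATX heading."""
    m = _ATX_OPEN.match(line)
    if m is None:
        return None
    level = len(m.group(1))
    contents = _ATX_CLOSE.sub("", line[m.end():])
    return level, contents.strip(" ")
```

A `#` run preceded by a backslash is not removed, because the character
before it is not a space.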
657
658 Simple headings:
659
660 ```````````````````````````````` example
661 # foo
662 ## foo
663 ### foo
664 #### foo
665 ##### foo
666 ###### foo
667 .
668 <h1>foo</h1>
669 <h2>foo</h2>
670 <h3>foo</h3>
671 <h4>foo</h4>
672 <h5>foo</h5>
673 <h6>foo</h6>
674 ````````````````````````````````
675
676
677 More than six `#` characters is not a heading:
678
679 ```````````````````````````````` example
680 ####### foo
681 .
682 <p>####### foo</p>
683 ````````````````````````````````
684
685
686 At least one space is required between the `#` characters and the
687 heading's contents, unless the heading is empty.  Note that many
688 implementations currently do not require the space.  However, the
689 space was required by the
690 [original ATX implementation](http://www.aaronsw.com/2002/atx/atx.py),
691 and it helps prevent things like the following from being parsed as
692 headings:
693
694 ```````````````````````````````` example
695 #5 bolt
696
697 #hashtag
698 .
699 <p>#5 bolt</p>
700 <p>#hashtag</p>
701 ````````````````````````````````
702
703
704 A tab will not work:
705
706 ```````````````````````````````` example
707 #→foo
708 .
709 <p>#→foo</p>
710 ````````````````````````````````
711
712
713 This is not a heading, because the first `#` is escaped:
714
715 ```````````````````````````````` example
716 \## foo
717 .
718 <p>## foo</p>
719 ````````````````````````````````
720
721
722 Contents are parsed as inlines:
723
724 ```````````````````````````````` example
725 # foo *bar* \*baz\*
726 .
727 <h1>foo <em>bar</em> *baz*</h1>
728 ````````````````````````````````
729
730
731 Leading and trailing blanks are ignored in parsing inline content:
732
733 ```````````````````````````````` example
734 #                  foo                     
735 .
736 <h1>foo</h1>
737 ````````````````````````````````
738
739
740 One to three spaces of indentation are allowed:
741
742 ```````````````````````````````` example
743  ### foo
744   ## foo
745    # foo
746 .
747 <h3>foo</h3>
748 <h2>foo</h2>
749 <h1>foo</h1>
750 ````````````````````````````````
751
752
753 Four spaces are too much:
754
755 ```````````````````````````````` example
756     # foo
757 .
758 <pre><code># foo
759 </code></pre>
760 ````````````````````````````````
761
762
763 ```````````````````````````````` example
764 foo
765     # bar
766 .
767 <p>foo
768 # bar</p>
769 ````````````````````````````````
770
771
772 A closing sequence of `#` characters is optional:
773
774 ```````````````````````````````` example
775 ## foo ##
776   ###   bar    ###
777 .
778 <h2>foo</h2>
779 <h3>bar</h3>
780 ````````````````````````````````
781
782
783 It need not be the same length as the opening sequence:
784
785 ```````````````````````````````` example
786 # foo ##################################
787 ##### foo ##
788 .
789 <h1>foo</h1>
790 <h5>foo</h5>
791 ````````````````````````````````
792
793
794 Spaces are allowed after the closing sequence:
795
796 ```````````````````````````````` example
797 ### foo ###     
798 .
799 <h3>foo</h3>
800 ````````````````````````````````
801
802
803 A sequence of `#` characters with anything but [spaces] following it
804 is not a closing sequence, but counts as part of the contents of the
805 heading:
806
807 ```````````````````````````````` example
808 ### foo ### b
809 .
810 <h3>foo ### b</h3>
811 ````````````````````````````````
812
813
814 The closing sequence must be preceded by a space:
815
816 ```````````````````````````````` example
817 # foo#
818 .
819 <h1>foo#</h1>
820 ````````````````````````````````
821
822
823 Backslash-escaped `#` characters do not count as part
824 of the closing sequence:
825
826 ```````````````````````````````` example
827 ### foo \###
828 ## foo #\##
829 # foo \#
830 .
831 <h3>foo ###</h3>
832 <h2>foo ###</h2>
833 <h1>foo #</h1>
834 ````````````````````````````````
835
836
837 ATX headings need not be separated from surrounding content by blank
838 lines, and they can interrupt paragraphs:
839
840 ```````````````````````````````` example
841 ****
842 ## foo
843 ****
844 .
845 <hr />
846 <h2>foo</h2>
847 <hr />
848 ````````````````````````````````
849
850
851 ```````````````````````````````` example
852 Foo bar
853 # baz
854 Bar foo
855 .
856 <p>Foo bar</p>
857 <h1>baz</h1>
858 <p>Bar foo</p>
859 ````````````````````````````````
860
861
862 ATX headings can be empty:
863
864 ```````````````````````````````` example
865 ## 
866 #
867 ### ###
868 .
869 <h2></h2>
870 <h1></h1>
871 <h3></h3>
872 ````````````````````````````````
873
874
875 ## Setext headings
876
877 A [setext heading](@) consists of one or more
878 lines of text, each containing at least one [non-whitespace
879 character], with no more than 3 spaces indentation, followed by
880 a [setext heading underline].  The lines of text must be such
881 that, were they not followed by the setext heading underline,
882 they would be interpreted as a paragraph:  they cannot be
883 interpretable as a [code fence], [ATX heading][ATX headings],
884 [block quote][block quotes], [thematic break][thematic breaks],
885 [list item][list items], or [HTML block][HTML blocks].
886
887 A [setext heading underline](@) is a sequence of
888 `=` characters or a sequence of `-` characters, with no more than 3
889 spaces indentation and any number of trailing spaces.  If a line
890 containing a single `-` can be interpreted as an
891 empty [list item][list items], it should be interpreted this way
892 and not as a [setext heading underline].
893
894 The heading is a level 1 heading if `=` characters are used in
895 the [setext heading underline], and a level 2 heading if `-`
896 characters are used.  The contents of the heading are the result
897 of parsing the preceding lines of text as CommonMark inline
898 content.
899
900 In general, a setext heading need not be preceded or followed by a
901 blank line.  However, it cannot interrupt a paragraph, so when a
902 setext heading comes after a paragraph, a blank line is needed between
903 them.
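
The underline rule and the level rule can be sketched in Python as
follows.  The function name is hypothetical, and the sketch checks only
the form of a single line: whether the preceding lines form a paragraph,
and whether the line should instead be read as an empty list item or a
thematic break, must be decided by the surrounding block parser.

```python
import re

# 0-3 spaces, a run of '=' or a run of '-', then trailing spaces.
_SETEXT_UNDERLINE = re.compile(r" {0,3}(=+|-+) *")


def setext_underline_level(line):
    """Return 1 for an '=' underline, 2 for a '-' underline, else None."""
    m = _SETEXT_UNDERLINE.fullmatch(line)
    if m is None:
        return None
    return 1 if m.group(1)[0] == "=" else 2
```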
904
905 Simple examples:
906
907 ```````````````````````````````` example
908 Foo *bar*
909 =========
910
911 Foo *bar*
912 ---------
913 .
914 <h1>Foo <em>bar</em></h1>
915 <h2>Foo <em>bar</em></h2>
916 ````````````````````````````````
917
918
919 The content of the heading may span more than one line:
920
921 ```````````````````````````````` example
922 Foo *bar
923 baz*
924 ====
925 .
926 <h1>Foo <em>bar
927 baz</em></h1>
928 ````````````````````````````````
929
930
931 The underlining can be any length:
932
933 ```````````````````````````````` example
934 Foo
935 -------------------------
936
937 Foo
938 =
939 .
940 <h2>Foo</h2>
941 <h1>Foo</h1>
942 ````````````````````````````````
943
944
945 The heading content can be indented up to three spaces, and need
946 not line up with the underlining:
947
948 ```````````````````````````````` example
949    Foo
950 ---
951
952   Foo
953 -----
954
955   Foo
956   ===
957 .
958 <h2>Foo</h2>
959 <h2>Foo</h2>
960 <h1>Foo</h1>
961 ````````````````````````````````
962
963
964 Four spaces of indentation is too much:
965
966 ```````````````````````````````` example
967     Foo
968     ---
969
970     Foo
971 ---
972 .
973 <pre><code>Foo
974 ---
975
976 Foo
977 </code></pre>
978 <hr />
979 ````````````````````````````````
980
981
982 The setext heading underline can be indented up to three spaces, and
983 may have trailing spaces:
984
985 ```````````````````````````````` example
986 Foo
987    ----      
988 .
989 <h2>Foo</h2>
990 ````````````````````````````````
991
992
993 Four spaces is too much:
994
995 ```````````````````````````````` example
996 Foo
997     ---
998 .
999 <p>Foo
1000 ---</p>
1001 ````````````````````````````````
1002
1003
1004 The setext heading underline cannot contain internal spaces:
1005
1006 ```````````````````````````````` example
1007 Foo
1008 = =
1009
1010 Foo
1011 --- -
1012 .
1013 <p>Foo
1014 = =</p>
1015 <p>Foo</p>
1016 <hr />
1017 ````````````````````````````````
1018
1019
1020 Trailing spaces in the content line do not cause a line break:
1021
1022 ```````````````````````````````` example
1023 Foo  
1024 -----
1025 .
1026 <h2>Foo</h2>
1027 ````````````````````````````````
1028
1029
1030 Nor does a backslash at the end:
1031
1032 ```````````````````````````````` example
1033 Foo\
1034 ----
1035 .
1036 <h2>Foo\</h2>
1037 ````````````````````````````````
1038
1039
1040 Since indicators of block structure take precedence over
1041 indicators of inline structure, the following are setext headings:
1042
1043 ```````````````````````````````` example
1044 `Foo
1045 ----
1046 `
1047
1048 <a title="a lot
1049 ---
1050 of dashes"/>
1051 .
1052 <h2>`Foo</h2>
1053 <p>`</p>
1054 <h2>&lt;a title=&quot;a lot</h2>
1055 <p>of dashes&quot;/&gt;</p>
1056 ````````````````````````````````
1057
1058
1059 The setext heading underline cannot be a [lazy continuation
1060 line] in a list item or block quote:
1061
1062 ```````````````````````````````` example
1063 > Foo
1064 ---
1065 .
1066 <blockquote>
1067 <p>Foo</p>
1068 </blockquote>
1069 <hr />
1070 ````````````````````````````````
1071
1072
1073 ```````````````````````````````` example
1074 > foo
1075 bar
1076 ===
1077 .
1078 <blockquote>
1079 <p>foo
1080 bar
1081 ===</p>
1082 </blockquote>
1083 ````````````````````````````````
1084
1085
1086 ```````````````````````````````` example
1087 - Foo
1088 ---
1089 .
1090 <ul>
1091 <li>Foo</li>
1092 </ul>
1093 <hr />
1094 ````````````````````````````````
1095
1096
1097 A blank line is needed between a paragraph and a following
1098 setext heading, since otherwise the paragraph becomes part
1099 of the heading's content:
1100
1101 ```````````````````````````````` example
1102 Foo
1103 Bar
1104 ---
1105 .
1106 <h2>Foo
1107 Bar</h2>
1108 ````````````````````````````````
1109
1110
1111 But in general a blank line is not required before or after
1112 setext headings:
1113
1114 ```````````````````````````````` example
1115 ---
1116 Foo
1117 ---
1118 Bar
1119 ---
1120 Baz
1121 .
1122 <hr />
1123 <h2>Foo</h2>
1124 <h2>Bar</h2>
1125 <p>Baz</p>
1126 ````````````````````````````````
1127
1128
1129 Setext headings cannot be empty:
1130
1131 ```````````````````````````````` example
1132
1133 ====
1134 .
1135 <p>====</p>
1136 ````````````````````````````````
1137
1138
1139 Setext heading text lines must not be interpretable as block
1140 constructs other than paragraphs.  So, the line of dashes
1141 in these examples gets interpreted as a thematic break:
1142
1143 ```````````````````````````````` example
1144 ---
1145 ---
1146 .
1147 <hr />
1148 <hr />
1149 ````````````````````````````````
1150
1151
1152 ```````````````````````````````` example
1153 - foo
1154 -----
1155 .
1156 <ul>
1157 <li>foo</li>
1158 </ul>
1159 <hr />
1160 ````````````````````````````````
1161
1162
1163 ```````````````````````````````` example
1164     foo
1165 ---
1166 .
1167 <pre><code>foo
1168 </code></pre>
1169 <hr />
1170 ````````````````````````````````
1171
1172
1173 ```````````````````````````````` example
1174 > foo
1175 -----
1176 .
1177 <blockquote>
1178 <p>foo</p>
1179 </blockquote>
1180 <hr />
1181 ````````````````````````````````
1182
1183
1184 If you want a heading with `> foo` as its literal text, you can
1185 use backslash escapes:
1186
1187 ```````````````````````````````` example
1188 \> foo
1189 ------
1190 .
1191 <h2>&gt; foo</h2>
1192 ````````````````````````````````
1193
1194
1195 **Compatibility note:**  Most existing Markdown implementations
1196 do not allow the text of setext headings to span multiple lines.
1197 But there is no consensus about how to interpret
1198
1199 ``` markdown
1200 Foo
1201 bar
1202 ---
1203 baz
1204 ```
1205
1206 One can find four different interpretations:
1207
1208 1. paragraph "Foo", heading "bar", paragraph "baz"
1209 2. paragraph "Foo bar", thematic break, paragraph "baz"
1210 3. paragraph "Foo bar --- baz"
1211 4. heading "Foo bar", paragraph "baz"
1212
1213 We find interpretation 4 most natural, and interpretation 4
1214 increases the expressive power of CommonMark, by allowing
1215 multiline headings.  Authors who want interpretation 1 can
1216 put a blank line after the first paragraph:
1217
1218 ```````````````````````````````` example
1219 Foo
1220
1221 bar
1222 ---
1223 baz
1224 .
1225 <p>Foo</p>
1226 <h2>bar</h2>
1227 <p>baz</p>
1228 ````````````````````````````````
1229
1230
1231 Authors who want interpretation 2 can put blank lines around
1232 the thematic break,
1233
1234 ```````````````````````````````` example
1235 Foo
1236 bar
1237
1238 ---
1239
1240 baz
1241 .
1242 <p>Foo
1243 bar</p>
1244 <hr />
1245 <p>baz</p>
1246 ````````````````````````````````
1247
1248
1249 or use a thematic break that cannot count as a [setext heading
1250 underline], such as
1251
1252 ```````````````````````````````` example
1253 Foo
1254 bar
1255 * * *
1256 baz
1257 .
1258 <p>Foo
1259 bar</p>
1260 <hr />
1261 <p>baz</p>
1262 ````````````````````````````````
1263
1264
1265 Authors who want interpretation 3 can use backslash escapes:
1266
1267 ```````````````````````````````` example
1268 Foo
1269 bar
1270 \---
1271 baz
1272 .
1273 <p>Foo
1274 bar
1275 ---
1276 baz</p>
1277 ````````````````````````````````
1278
1279
1280 ## Indented code blocks
1281
1282 An [indented code block](@) is composed of one or more
1283 [indented chunks] separated by blank lines.
1284 An [indented chunk](@) is a sequence of non-blank lines,
1285 each indented four or more spaces. The contents of the code block are
1286 the literal contents of the lines, including trailing
1287 [line endings], minus four spaces of indentation.
1288 An indented code block has no [info string].
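
The rule for the contents of the block can be sketched in Python as
follows.  The function name is hypothetical.  The sketch assumes the
caller has already decided which lines belong to the block (including the
blank lines between chunks) and that indentation uses spaces only; it does
not handle tabs.

```python
def indented_code_content(lines):
    """Return the literal contents of an indented code block.

    lines holds the block's lines without their line endings.  Up to four
    leading spaces are removed from each line, so blank lines shorter than
    four spaces become empty and extra indentation is kept.
    """
    out = []
    for line in lines:
        n = 0
        while n < 4 and n < len(line) and line[n] == " ":
            n += 1
        out.append(line[n:] + "\n")
    return "".join(out)
```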
1289
1290 An indented code block cannot interrupt a paragraph, so there must be
1291 a blank line between a paragraph and a following indented code block.
1292 (A blank line is not needed, however, between a code block and a following
1293 paragraph.)
1294
1295 ```````````````````````````````` example
1296     a simple
1297       indented code block
1298 .
1299 <pre><code>a simple
1300   indented code block
1301 </code></pre>
1302 ````````````````````````````````
1303
1304
1305 If there is any ambiguity between an interpretation of indentation
1306 as a code block and as indicating that material belongs to a [list
1307 item][list items], the list item interpretation takes precedence:
1308
1309 ```````````````````````````````` example
1310   - foo
1311
1312     bar
1313 .
1314 <ul>
1315 <li>
1316 <p>foo</p>
1317 <p>bar</p>
1318 </li>
1319 </ul>
1320 ````````````````````````````````
1321
1322
1323 ```````````````````````````````` example
1324 1.  foo
1325
1326     - bar
1327 .
1328 <ol>
1329 <li>
1330 <p>foo</p>
1331 <ul>
1332 <li>bar</li>
1333 </ul>
1334 </li>
1335 </ol>
1336 ````````````````````````````````
1337
1338
1339
1340 The contents of a code block are literal text, and do not get parsed
1341 as Markdown:
1342
1343 ```````````````````````````````` example
1344     <a/>
1345     *hi*
1346
1347     - one
1348 .
1349 <pre><code>&lt;a/&gt;
1350 *hi*
1351
1352 - one
1353 </code></pre>
1354 ````````````````````````````````
1355
1356
1357 Here we have three chunks separated by blank lines:
1358
1359 ```````````````````````````````` example
1360     chunk1
1361
1362     chunk2
1363   
1364  
1365  
1366     chunk3
1367 .
1368 <pre><code>chunk1
1369
1370 chunk2
1371
1372
1373
1374 chunk3
1375 </code></pre>
1376 ````````````````````````````````
1377
1378
1379 Any initial spaces beyond four will be included in the content, even
1380 in interior blank lines:
1381
1382 ```````````````````````````````` example
1383     chunk1
1384       
1385       chunk2
1386 .
1387 <pre><code>chunk1
1388   
1389   chunk2
1390 </code></pre>
1391 ````````````````````````````````
1392
1393
1394 An indented code block cannot interrupt a paragraph.  (This
1395 allows hanging indents and the like.)
1396
1397 ```````````````````````````````` example
1398 Foo
1399     bar
1400
1401 .
1402 <p>Foo
1403 bar</p>
1404 ````````````````````````````````
1405
1406
1407 However, any non-blank line with fewer than four leading spaces ends
1408 the code block immediately.  So a paragraph may occur immediately
1409 after indented code:
1410
1411 ```````````````````````````````` example
1412     foo
1413 bar
1414 .
1415 <pre><code>foo
1416 </code></pre>
1417 <p>bar</p>
1418 ````````````````````````````````
1419
1420
1421 And indented code can occur immediately before and after other kinds of
1422 blocks:
1423
1424 ```````````````````````````````` example
1425 # Heading
1426     foo
1427 Heading
1428 ------
1429     foo
1430 ----
1431 .
1432 <h1>Heading</h1>
1433 <pre><code>foo
1434 </code></pre>
1435 <h2>Heading</h2>
1436 <pre><code>foo
1437 </code></pre>
1438 <hr />
1439 ````````````````````````````````
1440
1441
1442 The first line can be indented more than four spaces:
1443
1444 ```````````````````````````````` example
1445         foo
1446     bar
1447 .
1448 <pre><code>    foo
1449 bar
1450 </code></pre>
1451 ````````````````````````````````
1452
1453
1454 Blank lines preceding or following an indented code block
1455 are not included in it:
1456
1457 ```````````````````````````````` example
1458
1459     
1460     foo
1461     
1462
1463 .
1464 <pre><code>foo
1465 </code></pre>
1466 ````````````````````````````````
1467
1468
1469 Trailing spaces are included in the code block's content:
1470
1471 ```````````````````````````````` example
1472     foo  
1473 .
1474 <pre><code>foo  
1475 </code></pre>
1476 ````````````````````````````````
1477
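The content rules for indented code blocks can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the helper name `indented_code_content` is hypothetical, the input is assumed to be the lines of a single candidate block (already split, without line endings), and tabs are not handled.

```python
def indented_code_content(lines):
    """Return the literal content of an indented code block.

    Assumes every non-blank line in `lines` is indented four or more
    spaces; tab handling is deliberately left out of this sketch.
    """
    start, end = 0, len(lines)
    # Blank lines before or after the block are not part of it.
    while start < end and not lines[start].strip():
        start += 1
    while end > start and not lines[end - 1].strip():
        end -= 1

    out = []
    for line in lines[start:end]:
        # Remove up to four leading spaces; interior blank lines may
        # have fewer, and anything beyond four is kept as content.
        n = 0
        while n < 4 and n < len(line) and line[n] == " ":
            n += 1
        out.append(line[n:] + "\n")
    return "".join(out)
```

Applied to the lines `"    chunk1"`, `"      "`, `"      chunk2"`, the function keeps the two extra spaces on the interior blank line, as in the example above.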
1478
1479
1480 ## Fenced code blocks
1481
1482 A [code fence](@) is a sequence
1483 of at least three consecutive backtick characters (`` ` ``) or
1484 tildes (`~`).  (Tildes and backticks cannot be mixed.)
1485 A [fenced code block](@)
1486 begins with a code fence, indented no more than three spaces.
1487
1488 The line with the opening code fence may optionally contain some text
1489 following the code fence; this is trimmed of leading and trailing
1490 spaces and called the [info string](@).
1491 The [info string] may not contain any backtick
1492 characters.  (The reason for this restriction is that otherwise
1493 some inline code would be incorrectly interpreted as the
1494 beginning of a fenced code block.)
1495
1496 The content of the code block consists of all subsequent lines, until
1497 a closing [code fence] of the same type as the code block
1498 began with (backticks or tildes), and with at least as many backticks
1499 or tildes as the opening code fence.  If the leading code fence is
1500 indented N spaces, then up to N spaces of indentation are removed from
1501 each line of the content (if present).  (If a content line is not
1502 indented, it is preserved unchanged.  If it is indented less than N
1503 spaces, all of the indentation is removed.)
1504
1505 The closing code fence may be indented up to three spaces, and may be
1506 followed only by spaces, which are ignored.  If the end of the
1507 containing block (or document) is reached and no closing code fence
1508 has been found, the code block contains all of the lines after the
1509 opening code fence until the end of the containing block (or
1510 document).  (An alternative spec would require backtracking in the
1511 event that a closing code fence is not found.  But this makes parsing
1512 much less efficient, and there seems to be no real down side to the
1513 behavior described here.)
1514
1515 A fenced code block may interrupt a paragraph, and does not require
1516 a blank line either before or after.
1517
1518 The content of a fenced code block is treated as literal text, not parsed
1519 as inlines.  The first word of the [info string] is typically used to
1520 specify the language of the code sample, and rendered in the `class`
1521 attribute of the `code` tag.  However, this spec does not mandate any
1522 particular treatment of the [info string].
1523
1524 Here is a simple example with backticks:
1525
1526 ```````````````````````````````` example
1527 ```
1528 <
1529  >
1530 ```
1531 .
1532 <pre><code>&lt;
1533  &gt;
1534 </code></pre>
1535 ````````````````````````````````
1536
1537
1538 With tildes:
1539
1540 ```````````````````````````````` example
1541 ~~~
1542 <
1543  >
1544 ~~~
1545 .
1546 <pre><code>&lt;
1547  &gt;
1548 </code></pre>
1549 ````````````````````````````````
1550
1551
1552 The closing code fence must use the same character as the opening
1553 fence:
1554
1555 ```````````````````````````````` example
1556 ```
1557 aaa
1558 ~~~
1559 ```
1560 .
1561 <pre><code>aaa
1562 ~~~
1563 </code></pre>
1564 ````````````````````````````````
1565
1566
1567 ```````````````````````````````` example
1568 ~~~
1569 aaa
1570 ```
1571 ~~~
1572 .
1573 <pre><code>aaa
1574 ```
1575 </code></pre>
1576 ````````````````````````````````
1577
1578
1579 The closing code fence must be at least as long as the opening fence:
1580
1581 ```````````````````````````````` example
1582 ````
1583 aaa
1584 ```
1585 ``````
1586 .
1587 <pre><code>aaa
1588 ```
1589 </code></pre>
1590 ````````````````````````````````
1591
1592
1593 ```````````````````````````````` example
1594 ~~~~
1595 aaa
1596 ~~~
1597 ~~~~
1598 .
1599 <pre><code>aaa
1600 ~~~
1601 </code></pre>
1602 ````````````````````````````````
1603
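One way to recognize opening and closing code fences is with two regular expressions. The sketch below is an assumption-laden illustration, not a reference implementation: the names `parse_opening_fence` and `is_closing_fence` are hypothetical, lines are assumed to arrive without line endings, and the backtick restriction on info strings is applied literally as the prose states it.

```python
import re

# Up to three spaces of indentation, then a run of 3+ backticks or tildes.
OPEN_FENCE = re.compile(r"^( {0,3})(`{3,}|~{3,})(.*)$")
# A closing fence may be followed only by spaces.
CLOSE_FENCE = re.compile(r"^ {0,3}(`{3,}|~{3,}) *$")


def parse_opening_fence(line):
    """Return (indent, fence_char, fence_length, info_string) or None."""
    m = OPEN_FENCE.match(line)
    if m is None:
        return None
    indent, fence, rest = m.groups()
    if "`" in rest:
        # The info string may not contain backticks.
        return None
    return len(indent), fence[0], len(fence), rest.strip(" ")


def is_closing_fence(line, char, length):
    """True if `line` closes a block opened with `length` copies of `char`."""
    m = CLOSE_FENCE.match(line)
    if m is None:
        return False
    fence = m.group(1)
    return fence[0] == char and len(fence) >= length
```

A parser would call `parse_opening_fence` on each candidate line and then test every following line with `is_closing_fence` until one matches or the containing block ends.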
1604
1605 Unclosed code blocks are closed by the end of the document
1606 (or the enclosing [block quote][block quotes] or [list item][list items]):
1607
1608 ```````````````````````````````` example
1609 ```
1610 .
1611 <pre><code></code></pre>
1612 ````````````````````````````````
1613
1614
1615 ```````````````````````````````` example
1616 `````
1617
1618 ```
1619 aaa
1620 .
1621 <pre><code>
1622 ```
1623 aaa
1624 </code></pre>
1625 ````````````````````````````````
1626
1627
1628 ```````````````````````````````` example
1629 > ```
1630 > aaa
1631
1632 bbb
1633 .
1634 <blockquote>
1635 <pre><code>aaa
1636 </code></pre>
1637 </blockquote>
1638 <p>bbb</p>
1639 ````````````````````````````````
1640
1641
1642 A code block can have all empty lines as its content:
1643
1644 ```````````````````````````````` example
1645 ```
1646
1647   
1648 ```
1649 .
1650 <pre><code>
1651   
1652 </code></pre>
1653 ````````````````````````````````
1654
1655
1656 A code block can be empty:
1657
1658 ```````````````````````````````` example
1659 ```
1660 ```
1661 .
1662 <pre><code></code></pre>
1663 ````````````````````````````````
1664
1665
1666 Fences can be indented.  If the opening fence is indented,
1667 content lines will have equivalent opening indentation removed,
1668 if present:
1669
1670 ```````````````````````````````` example
1671  ```
1672  aaa
1673 aaa
1674 ```
1675 .
1676 <pre><code>aaa
1677 aaa
1678 </code></pre>
1679 ````````````````````````````````
1680
1681
1682 ```````````````````````````````` example
1683   ```
1684 aaa
1685   aaa
1686 aaa
1687   ```
1688 .
1689 <pre><code>aaa
1690 aaa
1691 aaa
1692 </code></pre>
1693 ````````````````````````````````
1694
1695
1696 ```````````````````````````````` example
1697    ```
1698    aaa
1699     aaa
1700   aaa
1701    ```
1702 .
1703 <pre><code>aaa
1704  aaa
1705 aaa
1706 </code></pre>
1707 ````````````````````````````````
1708
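The indentation rule illustrated above (remove up to N spaces, where N is the indentation of the opening fence) is small enough to sketch directly. The function name `strip_fence_indent` is hypothetical, and only space characters are considered.

```python
def strip_fence_indent(line, n):
    """Remove at most `n` leading spaces from a content line.

    `n` is the indentation of the opening code fence. Lines with less
    indentation lose all of it; unindented lines are left unchanged.
    """
    k = 0
    while k < n and k < len(line) and line[k] == " ":
        k += 1
    return line[k:]
```

With an opening fence indented three spaces, a content line indented four spaces keeps exactly one leading space.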
1709
1710 Four spaces of indentation produce an indented code block:
1711
1712 ```````````````````````````````` example
1713     ```
1714     aaa
1715     ```
1716 .
1717 <pre><code>```
1718 aaa
1719 ```
1720 </code></pre>
1721 ````````````````````````````````
1722
1723
1724 Closing fences may be indented by 0-3 spaces, and their indentation
1725 need not match that of the opening fence:
1726
1727 ```````````````````````````````` example
1728 ```
1729 aaa
1730   ```
1731 .
1732 <pre><code>aaa
1733 </code></pre>
1734 ````````````````````````````````
1735
1736
1737 ```````````````````````````````` example
1738    ```
1739 aaa
1740   ```
1741 .
1742 <pre><code>aaa
1743 </code></pre>
1744 ````````````````````````````````
1745
1746
1747 This is not a closing fence, because it is indented 4 spaces:
1748
1749 ```````````````````````````````` example
1750 ```
1751 aaa
1752     ```
1753 .
1754 <pre><code>aaa
1755     ```
1756 </code></pre>
1757 ````````````````````````````````
1758
1759
1760
1761 Code fences (opening and closing) cannot contain internal spaces:
1762
1763 ```````````````````````````````` example
1764 ``` ```
1765 aaa
1766 .
1767 <p><code></code>
1768 aaa</p>
1769 ````````````````````````````````
1770
1771
1772 ```````````````````````````````` example
1773 ~~~~~~
1774 aaa
1775 ~~~ ~~
1776 .
1777 <pre><code>aaa
1778 ~~~ ~~
1779 </code></pre>
1780 ````````````````````````````````
1781
1782
1783 Fenced code blocks can interrupt paragraphs, and can be followed
1784 directly by paragraphs, without a blank line between:
1785
1786 ```````````````````````````````` example
1787 foo
1788 ```
1789 bar
1790 ```
1791 baz
1792 .
1793 <p>foo</p>
1794 <pre><code>bar
1795 </code></pre>
1796 <p>baz</p>
1797 ````````````````````````````````
1798
1799
1800 Other blocks can also occur before and after fenced code blocks
1801 without an intervening blank line:
1802
1803 ```````````````````````````````` example
1804 foo
1805 ---
1806 ~~~
1807 bar
1808 ~~~
1809 # baz
1810 .
1811 <h2>foo</h2>
1812 <pre><code>bar
1813 </code></pre>
1814 <h1>baz</h1>
1815 ````````````````````````````````
1816
1817
1818 An [info string] can be provided after the opening code fence.
1819 Leading and trailing spaces will be stripped, and the first word, prefixed
1820 with `language-`, is used as the value for the `class` attribute of the
1821 `code` element within the enclosing `pre` element.
1822
1823 ```````````````````````````````` example
1824 ```ruby
1825 def foo(x)
1826   return 3
1827 end
1828 ```
1829 .
1830 <pre><code class="language-ruby">def foo(x)
1831   return 3
1832 end
1833 </code></pre>
1834 ````````````````````````````````
1835
1836
1837 ```````````````````````````````` example
1838 ~~~~    ruby startline=3 $%@#$
1839 def foo(x)
1840   return 3
1841 end
1842 ~~~~~~~
1843 .
1844 <pre><code class="language-ruby">def foo(x)
1845   return 3
1846 end
1847 </code></pre>
1848 ````````````````````````````````
1849
1850
1851 ```````````````````````````````` example
1852 ````;
1853 ````
1854 .
1855 <pre><code class="language-;"></code></pre>
1856 ````````````````````````````````
1857
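The mapping from info string to `class` attribute shown in these examples might be written as follows. This is a sketch of the conventional treatment only, since the spec does not mandate any particular handling of the info string; the function name `code_open_tag` is hypothetical.

```python
import html


def code_open_tag(info_string):
    """Build the opening <code> tag for a fenced code block.

    Uses the first word of the info string, prefixed with "language-",
    as the class attribute; an empty info string yields a bare tag.
    """
    words = info_string.split()
    if not words:
        return "<code>"
    # Escape the word so it is safe inside a double-quoted attribute.
    language = html.escape(words[0], quote=True)
    return '<code class="language-{}">'.format(language)
```

Everything after the first word (such as `startline=3` above) is ignored by this sketch; a renderer is free to interpret it differently.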
1858
1859 [Info strings] for backtick code blocks cannot contain backticks:
1860
1861 ```````````````````````````````` example
1862 ``` aa ```
1863 foo
1864 .
1865 <p><code>aa</code>
1866 foo</p>
1867 ````````````````````````````````
1868
1869
1870 Closing code fences cannot have [info strings]:
1871
1872 ```````````````````````````````` example
1873 ```
1874 ``` aaa
1875 ```
1876 .
1877 <pre><code>``` aaa
1878 </code></pre>
1879 ````````````````````````````````
1880
1881
1882
1883 ## HTML blocks
1884
1885 An [HTML block](@) is a group of lines that is treated
1886 as raw HTML (and will not be escaped in HTML output).
1887
1888 There are seven kinds of [HTML block], which can be defined
1889 by their start and end conditions.  The block begins with a line that
1890 meets a [start condition](@) (after up to three spaces
1891 optional indentation).  It ends with the first subsequent line that
1892 meets a matching [end condition](@), or the last line of
1893 the document, if no line is encountered that meets the
1894 [end condition].  If the first line meets both the [start condition]
1895 and the [end condition], the block will contain just that line.
1896
1897 1.  **Start condition:**  line begins with the string `<script`,
1898 `<pre`, or `<style` (case-insensitive), followed by whitespace,
1899 the string `>`, or the end of the line.\
1900 **End condition:**  line contains an end tag
1901 `</script>`, `</pre>`, or `</style>` (case-insensitive; it
1902 need not match the start tag).
1903
1904 2.  **Start condition:** line begins with the string `<!--`.\
1905 **End condition:**  line contains the string `-->`.
1906
1907 3.  **Start condition:** line begins with the string `<?`.\
1908 **End condition:** line contains the string `?>`.
1909
1910 4.  **Start condition:** line begins with the string `<!`
1911 followed by an uppercase ASCII letter.\
1912 **End condition:** line contains the character `>`.
1913
1914 5.  **Start condition:**  line begins with the string
1915 `<![CDATA[`.\
1916 **End condition:** line contains the string `]]>`.
1917
1918 6.  **Start condition:** line begins with the string `<` or `</`
1919 followed by one of the strings (case-insensitive) `address`,
1920 `article`, `aside`, `base`, `basefont`, `blockquote`, `body`,
1921 `caption`, `center`, `col`, `colgroup`, `dd`, `details`, `dialog`,
1922 `dir`, `div`, `dl`, `dt`, `fieldset`, `figcaption`, `figure`,
1923 `footer`, `form`, `frame`, `frameset`, `h1`, `head`, `header`, `hr`,
1924 `html`, `iframe`, `legend`, `li`, `link`, `main`, `menu`, `menuitem`,
1925 `meta`, `nav`, `noframes`, `ol`, `optgroup`, `option`, `p`, `param`,
1926 `section`, `source`, `summary`, `table`, `tbody`, `td`,
1927 `tfoot`, `th`, `thead`, `title`, `tr`, `track`, `ul`, followed
1928 by [whitespace], the end of the line, the string `>`, or
1929 the string `/>`.\
1930 **End condition:** line is followed by a [blank line].
1931
1932 7.  **Start condition:**  line begins with a complete [open tag]
1933 or [closing tag] (with any [tag name] other than `script`,
1934 `style`, or `pre`) followed only by [whitespace]
1935 or the end of the line.\
1936 **End condition:** line is followed by a [blank line].
1937
1938 All types of [HTML blocks] except type 7 may interrupt
1939 a paragraph.  Blocks of type 7 may not interrupt a paragraph.
1940 (This restriction is intended to prevent unwanted interpretation
1941 of long tags inside a wrapped paragraph as starting HTML blocks.)
1942
1943 Some simple examples follow.  Here are some basic HTML blocks
1944 of type 6:
1945
1946 ```````````````````````````````` example
1947 <table>
1948   <tr>
1949     <td>
1950            hi
1951     </td>
1952   </tr>
1953 </table>
1954
1955 okay.
1956 .
1957 <table>
1958   <tr>
1959     <td>
1960            hi
1961     </td>
1962   </tr>
1963 </table>
1964 <p>okay.</p>
1965 ````````````````````````````````
1966
1967
1968 ```````````````````````````````` example
1969  <div>
1970   *hello*
1971          <foo><a>
1972 .
1973  <div>
1974   *hello*
1975          <foo><a>
1976 ````````````````````````````````
1977
1978
1979 A block can also start with a closing tag:
1980
1981 ```````````````````````````````` example
1982 </div>
1983 *foo*
1984 .
1985 </div>
1986 *foo*
1987 ````````````````````````````````
1988
1989
1990 Here we have two HTML blocks with a Markdown paragraph between them:
1991
1992 ```````````````````````````````` example
1993 <DIV CLASS="foo">
1994
1995 *Markdown*
1996
1997 </DIV>
1998 .
1999 <DIV CLASS="foo">
2000 <p><em>Markdown</em></p>
2001 </DIV>
2002 ````````````````````````````````
2003
2004
2005 The tag on the first line can be partial, as long
2006 as it is split where there would be whitespace:
2007
2008 ```````````````````````````````` example
2009 <div id="foo"
2010   class="bar">
2011 </div>
2012 .
2013 <div id="foo"
2014   class="bar">
2015 </div>
2016 ````````````````````````````````
2017
2018
2019 ```````````````````````````````` example
2020 <div id="foo" class="bar
2021   baz">
2022 </div>
2023 .
2024 <div id="foo" class="bar
2025   baz">
2026 </div>
2027 ````````````````````````````````
2028
2029
2030 An open tag need not be closed:
2031 ```````````````````````````````` example
2032 <div>
2033 *foo*
2034
2035 *bar*
2036 .
2037 <div>
2038 *foo*
2039 <p><em>bar</em></p>
2040 ````````````````````````````````
2041
2042
2043
2044 A partial tag need not even be completed (garbage
2045 in, garbage out):
2046
2047 ```````````````````````````````` example
2048 <div id="foo"
2049 *hi*
2050 .
2051 <div id="foo"
2052 *hi*
2053 ````````````````````````````````
2054
2055
2056 ```````````````````````````````` example
2057 <div class
2058 foo
2059 .
2060 <div class
2061 foo
2062 ````````````````````````````````
2063
2064
2065 The initial tag doesn't even need to be a valid
2066 tag, as long as it starts like one:
2067
2068 ```````````````````````````````` example
2069 <div *???-&&&-<---
2070 *foo*
2071 .
2072 <div *???-&&&-<---
2073 *foo*
2074 ````````````````````````````````
2075
2076
2077 In type 6 blocks, the initial tag need not be on a line by
2078 itself:
2079
2080 ```````````````````````````````` example
2081 <div><a href="bar">*foo*</a></div>
2082 .
2083 <div><a href="bar">*foo*</a></div>
2084 ````````````````````````````````
2085
2086
2087 ```````````````````````````````` example
2088 <table><tr><td>
2089 foo
2090 </td></tr></table>
2091 .
2092 <table><tr><td>
2093 foo
2094 </td></tr></table>
2095 ````````````````````````````````
2096
2097
2098 Everything until the next blank line or end of document
2099 gets included in the HTML block.  So, in the following
2100 example, what looks like a Markdown code block
2101 is actually part of the HTML block, which continues until a blank
2102 line or the end of the document is reached:
2103
2104 ```````````````````````````````` example
2105 <div></div>
2106 ``` c
2107 int x = 33;
2108 ```
2109 .
2110 <div></div>
2111 ``` c
2112 int x = 33;
2113 ```
2114 ````````````````````````````````
2115
2116
2117 To start an [HTML block] with a tag that is *not* in the
2118 list of block-level tags in (6), you must put the tag by
2119 itself on the first line (and it must be complete):
2120
2121 ```````````````````````````````` example
2122 <a href="foo">
2123 *bar*
2124 </a>
2125 .
2126 <a href="foo">
2127 *bar*
2128 </a>
2129 ````````````````````````````````
2130
2131
2132 In type 7 blocks, the [tag name] can be anything:
2133
2134 ```````````````````````````````` example
2135 <Warning>
2136 *bar*
2137 </Warning>
2138 .
2139 <Warning>
2140 *bar*
2141 </Warning>
2142 ````````````````````````````````
2143
2144
2145 ```````````````````````````````` example
2146 <i class="foo">
2147 *bar*
2148 </i>
2149 .
2150 <i class="foo">
2151 *bar*
2152 </i>
2153 ````````````````````````````````
2154
2155
2156 ```````````````````````````````` example
2157 </ins>
2158 *bar*
2159 .
2160 </ins>
2161 *bar*
2162 ````````````````````````````````
2163
2164
2165 These rules are designed to allow us to work with tags that
2166 can function as either block-level or inline-level tags.
2167 The `<del>` tag is a nice example.  We can surround content with
2168 `<del>` tags in three different ways.  In this case, we get a raw
2169 HTML block, because the `<del>` tag is on a line by itself:
2170
2171 ```````````````````````````````` example
2172 <del>
2173 *foo*
2174 </del>
2175 .
2176 <del>
2177 *foo*
2178 </del>
2179 ````````````````````````````````
2180
2181
2182 In this case, we get a raw HTML block that just includes
2183 the `<del>` tag (because it ends with the following blank
2184 line).  So the contents get interpreted as CommonMark:
2185
2186 ```````````````````````````````` example
2187 <del>
2188
2189 *foo*
2190
2191 </del>
2192 .
2193 <del>
2194 <p><em>foo</em></p>
2195 </del>
2196 ````````````````````````````````
2197
2198
2199 Finally, in this case, the `<del>` tags are interpreted
2200 as [raw HTML] *inside* the CommonMark paragraph.  (Because
2201 the tag is not on a line by itself, we get inline HTML
2202 rather than an [HTML block].)
2203
2204 ```````````````````````````````` example
2205 <del>*foo*</del>
2206 .
2207 <p><del><em>foo</em></del></p>
2208 ````````````````````````````````
2209
2210
2211 HTML tags designed to contain literal content
2212 (`script`, `style`, `pre`), comments, processing instructions,
2213 and declarations are treated somewhat differently.
2214 Instead of ending at the first blank line, these blocks
2215 end at the first line containing a corresponding end tag.
2216 As a result, these blocks can contain blank lines:
2217
2218 A pre tag (type 1):
2219
2220 ```````````````````````````````` example
2221 <pre language="haskell"><code>
2222 import Text.HTML.TagSoup
2223
2224 main :: IO ()
2225 main = print $ parseTags tags
2226 </code></pre>
2227 .
2228 <pre language="haskell"><code>
2229 import Text.HTML.TagSoup
2230
2231 main :: IO ()
2232 main = print $ parseTags tags
2233 </code></pre>
2234 ````````````````````````````````
2235
2236
2237 A script tag (type 1):
2238
2239 ```````````````````````````````` example
2240 <script type="text/javascript">
2241 // JavaScript example
2242
2243 document.getElementById("demo").innerHTML = "Hello JavaScript!";
2244 </script>
2245 .
2246 <script type="text/javascript">
2247 // JavaScript example
2248
2249 document.getElementById("demo").innerHTML = "Hello JavaScript!";
2250 </script>
2251 ````````````````````````````````
2252
2253
2254 A style tag (type 1):
2255
2256 ```````````````````````````````` example
2257 <style
2258   type="text/css">
2259 h1 {color:red;}
2260
2261 p {color:blue;}
2262 </style>
2263 .
2264 <style
2265   type="text/css">
2266 h1 {color:red;}
2267
2268 p {color:blue;}
2269 </style>
2270 ````````````````````````````````
2271
2272
2273 If there is no matching end tag, the block will end at the
2274 end of the document (or the enclosing [block quote][block quotes]
2275 or [list item][list items]):
2276
2277 ```````````````````````````````` example
2278 <style
2279   type="text/css">
2280
2281 foo
2282 .
2283 <style
2284   type="text/css">
2285
2286 foo
2287 ````````````````````````````````
2288
2289
2290 ```````````````````````````````` example
2291 > <div>
2292 > foo
2293
2294 bar
2295 .
2296 <blockquote>
2297 <div>
2298 foo
2299 </blockquote>
2300 <p>bar</p>
2301 ````````````````````````````````
2302
2303
2304 ```````````````````````````````` example
2305 - <div>
2306 - foo
2307 .
2308 <ul>
2309 <li>
2310 <div>
2311 </li>
2312 <li>foo</li>
2313 </ul>
2314 ````````````````````````````````
2315
2316
2317 The end tag can occur on the same line as the start tag:
2318
2319 ```````````````````````````````` example
2320 <style>p{color:red;}</style>
2321 *foo*
2322 .
2323 <style>p{color:red;}</style>
2324 <p><em>foo</em></p>
2325 ````````````````````````````````
2326
2327
2328 ```````````````````````````````` example
2329 <!-- foo -->*bar*
2330 *baz*
2331 .
2332 <!-- foo -->*bar*
2333 <p><em>baz</em></p>
2334 ````````````````````````````````
2335
2336
2337 Note that anything on the last line after the
2338 end tag will be included in the [HTML block]:
2339
2340 ```````````````````````````````` example
2341 <script>
2342 foo
2343 </script>1. *bar*
2344 .
2345 <script>
2346 foo
2347 </script>1. *bar*
2348 ````````````````````````````````
2349
2350
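For block types 1 through 5, the end condition is a plain substring test on a single line, which is why trailing text on that line stays inside the block. A minimal sketch, with the hypothetical names `END_MARKERS` and `meets_end_condition`:

```python
# Substrings that end HTML blocks of types 1-5. Types 6 and 7 end at a
# blank line instead, so they cannot be decided from one line alone.
END_MARKERS = {
    1: ("</script>", "</pre>", "</style>"),
    2: ("-->",),
    3: ("?>",),
    4: (">",),
    5: ("]]>",),
}


def meets_end_condition(kind, line):
    """True if `line` contains an end marker for block type `kind` (1-5)."""
    # Type 1 end tags are case-insensitive; the other markers contain
    # no letters, so lowercasing the line is harmless for them.
    lowered = line.lower()
    return any(marker in lowered for marker in END_MARKERS[kind])
```

Because the test is containment rather than equality, a line such as `</script>1. *bar*` ends the block and is included in it whole.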
2351 A comment (type 2):
2352
2353 ```````````````````````````````` example
2354 <!-- Foo
2355
2356 bar
2357    baz -->
2358 .
2359 <!-- Foo
2360
2361 bar
2362    baz -->
2363 ````````````````````````````````
2364
2365
2366
2367 A processing instruction (type 3):
2368
2369 ```````````````````````````````` example
2370 <?php
2371
2372   echo '>';
2373
2374 ?>
2375 .
2376 <?php
2377
2378   echo '>';
2379
2380 ?>
2381 ````````````````````````````````
2382
2383
2384 A declaration (type 4):
2385
2386 ```````````````````````````````` example
2387 <!DOCTYPE html>
2388 .
2389 <!DOCTYPE html>
2390 ````````````````````````````````
2391
2392
2393 CDATA (type 5):
2394
2395 ```````````````````````````````` example
2396 <![CDATA[
2397 function matchwo(a,b)
2398 {
2399   if (a < b && a < 0) then {
2400     return 1;
2401
2402   } else {
2403
2404     return 0;
2405   }
2406 }
2407 ]]>
2408 .
2409 <![CDATA[
2410 function matchwo(a,b)
2411 {
2412   if (a < b && a < 0) then {
2413     return 1;
2414
2415   } else {
2416
2417     return 0;
2418   }
2419 }
2420 ]]>
2421 ````````````````````````````````
2422
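The start conditions for types 1 through 6, as listed at the top of this section, can be approximated with one regular expression per type. The sketch below is illustrative only: `html_block_type` and the table names are hypothetical, type 7 is omitted because it needs the full open-tag and closing-tag grammar, and lines are assumed to arrive without line endings.

```python
import re

# Tag names from start condition 6.
BLOCK_TAGS = (
    "address|article|aside|base|basefont|blockquote|body|caption|center|"
    "col|colgroup|dd|details|dialog|dir|div|dl|dt|fieldset|figcaption|"
    "figure|footer|form|frame|frameset|h1|head|header|hr|html|iframe|"
    "legend|li|link|main|menu|menuitem|meta|nav|noframes|ol|optgroup|"
    "option|p|param|section|source|summary|table|tbody|td|tfoot|th|thead|"
    "title|tr|track|ul"
)

START_CONDITIONS = [
    (1, re.compile(r"<(script|pre|style)(\s|>|$)", re.IGNORECASE)),
    (2, re.compile(r"<!--")),
    (3, re.compile(r"<\?")),
    (4, re.compile(r"<![A-Z]")),
    (5, re.compile(r"<!\[CDATA\[")),
    (6, re.compile(r"</?(" + BLOCK_TAGS + r")(\s|/?>|$)", re.IGNORECASE)),
]


def html_block_type(line):
    """Return the HTML block type (1-6) that `line` starts, or None."""
    rest = line.lstrip(" ")
    if len(line) - len(rest) > 3:
        # Four or more spaces of indentation: not an HTML block start.
        return None
    for kind, pattern in START_CONDITIONS:
        if pattern.match(rest):
            return kind
    return None
```

Under this sketch `<div *???-&&&-<---` is classified as type 6, matching the earlier example, because only the tag name and the character after it are examined.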
2423
2424 The opening tag can be indented 1-3 spaces, but not 4:
2425
2426 ```````````````````````````````` example
2427   <!-- foo -->
2428
2429     <!-- foo -->
2430 .
2431   <!-- foo -->
2432 <pre><code>&lt;!-- foo --&gt;
2433 </code></pre>
2434 ````````````````````````````````
2435
2436
2437 ```````````````````````````````` example
2438   <div>
2439
2440     <div>
2441 .
2442   <div>
2443 <pre><code>&lt;div&gt;
2444 </code></pre>
2445 ````````````````````````````````
2446
2447
2448 An HTML block of types 1--6 can interrupt a paragraph, and need not be
2449 preceded by a blank line.
2450
2451 ```````````````````````````````` example
2452 Foo
2453 <div>
2454 bar
2455 </div>
2456 .
2457 <p>Foo</p>
2458 <div>
2459 bar
2460 </div>
2461 ````````````````````````````````
2462
2463
2464 However, a following blank line is needed, except at the end of
2465 a document, and except for blocks of types 1--5, above:
2466
2467 ```````````````````````````````` example
2468 <div>
2469 bar
2470 </div>
2471 *foo*
2472 .
2473 <div>
2474 bar
2475 </div>
2476 *foo*
2477 ````````````````````````````````
2478
2479
2480 HTML blocks of type 7 cannot interrupt a paragraph:
2481
2482 ```````````````````````````````` example
2483 Foo
2484 <a href="bar">
2485 baz
2486 .
2487 <p>Foo
2488 <a href="bar">
2489 baz</p>
2490 ````````````````````````````````
2491
2492
2493 This rule differs from John Gruber's original Markdown syntax
2494 specification, which says:
2495
2496 > The only restrictions are that block-level HTML elements —
2497 > e.g. `<div>`, `<table>`, `<pre>`, `<p>`, etc. — must be separated from
2498 > surrounding content by blank lines, and the start and end tags of the
2499 > block should not be indented with tabs or spaces.
2500
2501 In some ways Gruber's rule is more restrictive than the one given
2502 here:
2503
2504 - It requires that an HTML block be preceded by a blank line.
2505 - It does not allow the start tag to be indented.
2506 - It requires a matching end tag, which it also does not allow to
2507   be indented.
2508
2509 Most Markdown implementations (including some of Gruber's own) do not
2510 respect all of these restrictions.
2511
2512 There is one respect, however, in which Gruber's rule is more liberal
2513 than the one given here, since it allows blank lines to occur inside
2514 an HTML block.  There are two reasons for disallowing them here.
2515 First, it removes the need to parse balanced tags, which is
2516 expensive and can require backtracking from the end of the document
2517 if no matching end tag is found. Second, it provides a very simple
2518 and flexible way of including Markdown content inside HTML tags:
2519 simply separate the Markdown from the HTML using blank lines.
2520
2521 Compare:
2522
2523 ```````````````````````````````` example
2524 <div>
2525
2526 *Emphasized* text.
2527
2528 </div>
2529 .
2530 <div>
2531 <p><em>Emphasized</em> text.</p>
2532 </div>
2533 ````````````````````````````````
2534
2535
2536 ```````````````````````````````` example
2537 <div>
2538 *Emphasized* text.
2539 </div>
2540 .
2541 <div>
2542 *Emphasized* text.
2543 </div>
2544 ````````````````````````````````
2545
2546
2547 Some Markdown implementations have adopted a convention of
2548 interpreting content inside tags as Markdown text if the open tag has
2549 the attribute `markdown=1`.  The rule given above seems a simpler and
2550 more elegant way of achieving the same expressive power, which is also
2551 much simpler to parse.
2552
2553 The main potential drawback is that one can no longer paste HTML
2554 blocks into Markdown documents with 100% reliability.  However,
2555 *in most cases* this will work fine, because the blank lines in
2556 HTML are usually followed by HTML block tags.  For example:
2557
2558 ```````````````````````````````` example
2559 <table>
2560
2561 <tr>
2562
2563 <td>
2564 Hi
2565 </td>
2566
2567 </tr>
2568
2569 </table>
2570 .
2571 <table>
2572 <tr>
2573 <td>
2574 Hi
2575 </td>
2576 </tr>
2577 </table>
2578 ````````````````````````````````
2579
2580
2581 There are problems, however, if the inner tags are indented
2582 *and* separated by blank lines, as then they will be interpreted as
2583 an indented code block:
2584
2585 ```````````````````````````````` example
2586 <table>
2587
2588   <tr>
2589
2590     <td>
2591       Hi
2592     </td>
2593
2594   </tr>
2595
2596 </table>
2597 .
2598 <table>
2599   <tr>
2600 <pre><code>&lt;td&gt;
2601   Hi
2602 &lt;/td&gt;
2603 </code></pre>
2604   </tr>
2605 </table>
2606 ````````````````````````````````
2607
2608
2609 Fortunately, blank lines are usually not necessary and can be
2610 deleted.  The exception is inside `<pre>` tags, but as described
2611 above, raw HTML blocks starting with `<pre>` *can* contain blank
2612 lines.
2613
2614 ## Link reference definitions
2615
2616 A [link reference definition](@)
2617 consists of a [link label], indented up to three spaces, followed
2618 by a colon (`:`), optional [whitespace] (including up to one
2619 [line ending]), a [link destination],
2620 optional [whitespace] (including up to one
2621 [line ending]), and an optional [link
2622 title], which if it is present must be separated
2623 from the [link destination] by [whitespace].
2624 No further [non-whitespace characters] may occur on the line.
2625
2626 A [link reference definition]
2627 does not correspond to a structural element of a document.  Instead, it
2628 defines a label which can be used in [reference links]
2629 and reference-style [images] elsewhere in the document.  [Link
2630 reference definitions] can come either before or after the links that use
2631 them.
2632
2633 ```````````````````````````````` example
2634 [foo]: /url "title"
2635
2636 [foo]
2637 .
2638 <p><a href="/url" title="title">foo</a></p>
2639 ````````````````````````````````
2640
2641
2642 ```````````````````````````````` example
2643    [foo]: 
2644       /url  
2645            'the title'  
2646
2647 [foo]
2648 .
2649 <p><a href="/url" title="the title">foo</a></p>
2650 ````````````````````````````````
2651
2652
2653 ```````````````````````````````` example
2654 [Foo*bar\]]:my_(url) 'title (with parens)'
2655
2656 [Foo*bar\]]
2657 .
2658 <p><a href="my_(url)" title="title (with parens)">Foo*bar]</a></p>
2659 ````````````````````````````````
2660
2661
2662 ```````````````````````````````` example
2663 [Foo bar]:
2664 <my%20url>
2665 'title'
2666
2667 [Foo bar]
2668 .
2669 <p><a href="my%20url" title="title">Foo bar</a></p>
2670 ````````````````````````````````
2671
2672
2673 The title may extend over multiple lines:
2674
2675 ```````````````````````````````` example
2676 [foo]: /url '
2677 title
2678 line1
2679 line2
2680 '
2681
2682 [foo]
2683 .
2684 <p><a href="/url" title="
2685 title
2686 line1
2687 line2
2688 ">foo</a></p>
2689 ````````````````````````````````
2690
2691
2692 However, it may not contain a [blank line]:
2693
2694 ```````````````````````````````` example
2695 [foo]: /url 'title
2696
2697 with blank line'
2698
2699 [foo]
2700 .
2701 <p>[foo]: /url 'title</p>
2702 <p>with blank line'</p>
2703 <p>[foo]</p>
2704 ````````````````````````````````
2705
2706
2707 The title may be omitted:
2708
2709 ```````````````````````````````` example
2710 [foo]:
2711 /url
2712
2713 [foo]
2714 .
2715 <p><a href="/url">foo</a></p>
2716 ````````````````````````````````
2717
2718
2719 The link destination may not be omitted:
2720
2721 ```````````````````````````````` example
2722 [foo]:
2723
2724 [foo]
2725 .
2726 <p>[foo]:</p>
2727 <p>[foo]</p>
2728 ````````````````````````````````
2729
2730
2731 Both title and destination can contain backslash escapes
2732 and literal backslashes:
2733
2734 ```````````````````````````````` example
2735 [foo]: /url\bar\*baz "foo\"bar\baz"
2736
2737 [foo]
2738 .
2739 <p><a href="/url%5Cbar*baz" title="foo&quot;bar\baz">foo</a></p>
2740 ````````````````````````````````
2741
2742
2743 A link can come before its corresponding definition:
2744
2745 ```````````````````````````````` example
2746 [foo]
2747
2748 [foo]: url
2749 .
2750 <p><a href="url">foo</a></p>
2751 ````````````````````````````````
2752
2753
2754 If there are several matching definitions, the first one takes
2755 precedence:
2756
2757 ```````````````````````````````` example
2758 [foo]
2759
2760 [foo]: first
2761 [foo]: second
2762 .
2763 <p><a href="first">foo</a></p>
2764 ````````````````````````````````
2765
2766
2767 As noted in the section on [Links], matching of labels is
2768 case-insensitive (see [matches]).
2769
2770 ```````````````````````````````` example
2771 [FOO]: /url
2772
2773 [Foo]
2774 .
2775 <p><a href="/url">Foo</a></p>
2776 ````````````````````````````````
2777
2778
2779 ```````````````````````````````` example
2780 [ΑΓΩ]: /φου
2781
2782 [αγω]
2783 .
2784 <p><a href="/%CF%86%CE%BF%CF%85">αγω</a></p>
2785 ````````````````````````````````
2786
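The matching and precedence rules above can be sketched in Python. This is only an illustration, not part of the spec: the names `normalize_label`, `add_definition`, and `lookup` are hypothetical, and the normalization used here (Unicode case folding plus collapsing runs of whitespace) is assumed from the section on [Links].

```python
def normalize_label(label):
    """Case-fold and collapse whitespace so matching labels compare equal."""
    return " ".join(label.split()).casefold()


def add_definition(definitions, label, destination, title=None):
    """Record a definition; an earlier definition with a matching label wins."""
    definitions.setdefault(normalize_label(label), (destination, title))


def lookup(definitions, label):
    """Return (destination, title) for a label, or None if it is undefined."""
    return definitions.get(normalize_label(label))


definitions = {}
add_definition(definitions, "foo", "first")
add_definition(definitions, "FOO", "second")  # ignored: "foo" is already defined
add_definition(definitions, "ΑΓΩ", "/φου")
```

Because definitions are stored in a table keyed by the normalized label, a link can be resolved regardless of whether its definition appears earlier or later in the document, provided all definitions are collected before inline parsing.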
2787
2788 Here is a link reference definition with no corresponding link.
2789 It contributes nothing to the document.
2790
2791 ```````````````````````````````` example
2792 [foo]: /url
2793 .
2794 ````````````````````````````````
2795
2796
2797 Here is another one:
2798
2799 ```````````````````````````````` example
2800 [
2801 foo
2802 ]: /url
2803 bar
2804 .
2805 <p>bar</p>
2806 ````````````````````````````````
2807
2808
2809 This is not a link reference definition, because there are
2810 [non-whitespace characters] after the title:
2811
2812 ```````````````````````````````` example
2813 [foo]: /url "title" ok
2814 .
2815 <p>[foo]: /url &quot;title&quot; ok</p>
2816 ````````````````````````````````
2817
2818
2819 This is a link reference definition, but it has no title:
2820
2821 ```````````````````````````````` example
2822 [foo]: /url
2823 "title" ok
2824 .
2825 <p>&quot;title&quot; ok</p>
2826 ````````````````````````````````
2827
2828
2829 This is not a link reference definition, because it is indented
2830 four spaces:
2831
2832 ```````````````````````````````` example
2833     [foo]: /url "title"
2834
2835 [foo]
2836 .
2837 <pre><code>[foo]: /url &quot;title&quot;
2838 </code></pre>
2839 <p>[foo]</p>
2840 ````````````````````````````````
2841
2842
2843 This is not a link reference definition, because it occurs inside
2844 a code block:
2845
2846 ```````````````````````````````` example
2847 ```
2848 [foo]: /url
2849 ```
2850
2851 [foo]
2852 .
2853 <pre><code>[foo]: /url
2854 </code></pre>
2855 <p>[foo]</p>
2856 ````````````````````````````````
2857
2858
2859 A [link reference definition] cannot interrupt a paragraph.
2860
2861 ```````````````````````````````` example
2862 Foo
2863 [bar]: /baz
2864
2865 [bar]
2866 .
2867 <p>Foo
2868 [bar]: /baz</p>
2869 <p>[bar]</p>
2870 ````````````````````````````````
2871
2872
2873 However, it can directly follow other block elements, such as headings
2874 and thematic breaks, and it need not be followed by a blank line.
2875
2876 ```````````````````````````````` example
2877 # [Foo]
2878 [foo]: /url
2879 > bar
2880 .
2881 <h1><a href="/url">Foo</a></h1>
2882 <blockquote>
2883 <p>bar</p>
2884 </blockquote>
2885 ````````````````````````````````
2886
2887
2888 Several [link reference definitions]
2889 can occur one after another, without intervening blank lines.
2890
2891 ```````````````````````````````` example
2892 [foo]: /foo-url "foo"
2893 [bar]: /bar-url
2894   "bar"
2895 [baz]: /baz-url
2896
2897 [foo],
2898 [bar],
2899 [baz]
2900 .
2901 <p><a href="/foo-url" title="foo">foo</a>,
2902 <a href="/bar-url" title="bar">bar</a>,
2903 <a href="/baz-url">baz</a></p>
2904 ````````````````````````````````
2905
2906
2907 [Link reference definitions] can occur
2908 inside block containers, like lists and block quotations.  They
2909 affect the entire document, not just the container in which they
2910 are defined:
2911
2912 ```````````````````````````````` example
2913 [foo]
2914
2915 > [foo]: /url
2916 .
2917 <p><a href="/url">foo</a></p>
2918 <blockquote>
2919 </blockquote>
2920 ````````````````````````````````
2921
2922
2923
2924 ## Paragraphs
2925
2926 A sequence of non-blank lines that cannot be interpreted as other
2927 kinds of blocks forms a [paragraph](@).
2928 The contents of the paragraph are the result of parsing the
2929 paragraph's raw content as inlines.  The paragraph's raw content
2930 is formed by concatenating the lines and removing initial and final
2931 [whitespace].
2932
2933 A simple example with two paragraphs:
2934
2935 ```````````````````````````````` example
2936 aaa
2937
2938 bbb
2939 .
2940 <p>aaa</p>
2941 <p>bbb</p>
2942 ````````````````````````````````
2943
2944
2945 Paragraphs can contain multiple lines, but no blank lines:
2946
2947 ```````````````````````````````` example
2948 aaa
2949 bbb
2950
2951 ccc
2952 ddd
2953 .
2954 <p>aaa
2955 bbb</p>
2956 <p>ccc
2957 ddd</p>
2958 ````````````````````````````````
2959
2960
2961 Multiple blank lines between paragraphs have no effect:
2962
2963 ```````````````````````````````` example
2964 aaa
2965
2966
2967 bbb
2968 .
2969 <p>aaa</p>
2970 <p>bbb</p>
2971 ````````````````````````````````
2972
2973
2974 Leading spaces are skipped:
2975
2976 ```````````````````````````````` example
2977   aaa
2978  bbb
2979 .
2980 <p>aaa
2981 bbb</p>
2982 ````````````````````````````````
2983
2984
2985 Lines after the first may be indented any amount, since indented
2986 code blocks cannot interrupt paragraphs.
2987
2988 ```````````````````````````````` example
2989 aaa
2990              bbb
2991                                        ccc
2992 .
2993 <p>aaa
2994 bbb
2995 ccc</p>
2996 ````````````````````````````````
2997
2998
2999 However, the first line may be indented at most three spaces,
3000 or an indented code block will be triggered:
3001
3002 ```````````````````````````````` example
3003    aaa
3004 bbb
3005 .
3006 <p>aaa
3007 bbb</p>
3008 ````````````````````````````````
3009
3010
3011 ```````````````````````````````` example
3012     aaa
3013 bbb
3014 .
3015 <pre><code>aaa
3016 </code></pre>
3017 <p>bbb</p>
3018 ````````````````````````````````
3019
3020
3021 Final spaces are stripped before inline parsing, so a paragraph
3022 that ends with two or more spaces will not end with a [hard line
3023 break]:
3024
3025 ```````````````````````````````` example
3026 aaa     
3027 bbb     
3028 .
3029 <p>aaa<br />
3030 bbb</p>
3031 ````````````````````````````````
3032
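How a paragraph's raw content is assembled can be sketched as follows. The sketch assumes the lines have already been identified as belonging to a single paragraph, treats only spaces and tabs as leading whitespace, and uses the hypothetical name `paragraph_raw_content`.

```python
def paragraph_raw_content(lines):
    """Join paragraph lines, dropping leading whitespace on each line
    and any whitespace at the very end of the paragraph."""
    stripped = [line.lstrip(" \t") for line in lines]
    return "\n".join(stripped).rstrip()
```

Trailing spaces on an interior line are kept, which is why such a line can still produce a hard line break, while the trailing spaces on the last line are discarded before inline parsing.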
3033
3034 ## Blank lines
3035
3036 [Blank lines] between block-level elements are ignored,
3037 except for the role they play in determining whether a [list]
3038 is [tight] or [loose].
3039
3040 Blank lines at the beginning and end of the document are also ignored.
3041
3042 ```````````````````````````````` example
3043   
3044
3045 aaa
3046   
3047
3048 # aaa
3049
3050   
3051 .
3052 <p>aaa</p>
3053 <h1>aaa</h1>
3054 ````````````````````````````````
3055
3056
3057
3058 # Container blocks
3059
3060 A [container block] is a block that has other
3061 blocks as its contents.  There are two basic kinds of container blocks:
3062 [block quotes] and [list items].
3063 [Lists] are meta-containers for [list items].
3064
3065 We define the syntax for container blocks recursively.  The general
3066 form of the definition is:
3067
3068 > If X is a sequence of blocks, then the result of
3069 > transforming X in such-and-such a way is a container of type Y
3070 > with these blocks as its content.
3071
3072 So, we explain what counts as a block quote or list item by explaining
3073 how these can be *generated* from their contents. This should suffice
3074 to define the syntax, although it does not give a recipe for *parsing*
3075 these constructions.  (A recipe is provided below in the section entitled
3076 [A parsing strategy](#appendix-a-parsing-strategy).)
3077
3078 ## Block quotes
3079
3080 A [block quote marker](@)
3081 consists of 0-3 spaces of initial indent, plus (a) the character `>` together
3082 with a following space, or (b) a single character `>` not followed by a space.
3083
3084 The following rules define [block quotes]:
3085
3086 1.  **Basic case.**  If a string of lines *Ls* constitutes a sequence
3087     of blocks *Bs*, then the result of prepending a [block quote
3088     marker] to the beginning of each line in *Ls*
3089     is a [block quote](#block-quotes) containing *Bs*.
3090
3091 2.  **Laziness.**  If a string of lines *Ls* constitutes a [block
3092     quote](#block-quotes) with contents *Bs*, then the result of deleting
3093     the initial [block quote marker] from one or
3094     more lines in which the next [non-whitespace character] after the [block
3095     quote marker] is [paragraph continuation
3096     text] is a block quote with *Bs* as its content.
3097     [Paragraph continuation text](@) is text
3098     that will be parsed as part of the content of a paragraph, but does
3099     not occur at the beginning of the paragraph.
3100
3101 3.  **Consecutiveness.**  A document cannot contain two [block
3102     quotes] in a row unless there is a [blank line] between them.
3103
3104 Nothing else counts as a [block quote](#block-quotes).
3105
3106 Here is a simple example:
3107
3108 ```````````````````````````````` example
3109 > # Foo
3110 > bar
3111 > baz
3112 .
3113 <blockquote>
3114 <h1>Foo</h1>
3115 <p>bar
3116 baz</p>
3117 </blockquote>
3118 ````````````````````````````````
3119
3120
3121 The spaces after the `>` characters can be omitted:
3122
3123 ```````````````````````````````` example
3124 ># Foo
3125 >bar
3126 > baz
3127 .
3128 <blockquote>
3129 <h1>Foo</h1>
3130 <p>bar
3131 baz</p>
3132 </blockquote>
3133 ````````````````````````````````
3134
3135
3136 The `>` characters can be indented 1-3 spaces:
3137
3138 ```````````````````````````````` example
3139    > # Foo
3140    > bar
3141  > baz
3142 .
3143 <blockquote>
3144 <h1>Foo</h1>
3145 <p>bar
3146 baz</p>
3147 </blockquote>
3148 ````````````````````````````````
3149
3150
3151 Four spaces give us a code block:
3152
3153 ```````````````````````````````` example
3154     > # Foo
3155     > bar
3156     > baz
3157 .
3158 <pre><code>&gt; # Foo
3159 &gt; bar
3160 &gt; baz
3161 </code></pre>
3162 ````````````````````````````````
3163
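The definition of a block quote marker, together with the indentation limits shown in the examples above, can be sketched as a regular expression. This is a simplified illustration that ignores tabs; the names `BLOCK_QUOTE_MARKER` and `strip_block_quote_marker` are hypothetical.

```python
import re

# 0-3 spaces of indent, then `>`, then at most one following space.
BLOCK_QUOTE_MARKER = re.compile(r" {0,3}> ?")


def strip_block_quote_marker(line):
    """Return the line without its leading marker, or None if it has none."""
    match = BLOCK_QUOTE_MARKER.match(line)
    if match is None:
        return None
    return line[match.end():]
```

Only one space after the `>` belongs to the marker, so `strip_block_quote_marker(">     code")` leaves four spaces in front of `code`, enough for an indented code block inside the quote.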
3164
3165 The Laziness clause allows us to omit the `>` before a
3166 paragraph continuation line:
3167
3168 ```````````````````````````````` example
3169 > # Foo
3170 > bar
3171 baz
3172 .
3173 <blockquote>
3174 <h1>Foo</h1>
3175 <p>bar
3176 baz</p>
3177 </blockquote>
3178 ````````````````````````````````
3179
3180
3181 A block quote can contain some lazy and some non-lazy
3182 continuation lines:
3183
3184 ```````````````````````````````` example
3185 > bar
3186 baz
3187 > foo
3188 .
3189 <blockquote>
3190 <p>bar
3191 baz
3192 foo</p>
3193 </blockquote>
3194 ````````````````````````````````
3195
3196
3197 Laziness only applies to lines that would have been continuations of
3198 paragraphs had they been prepended with [block quote markers].
3199 For example, the `> ` cannot be omitted in the second line of
3200
3201 ``` markdown
3202 > foo
3203 > ---
3204 ```
3205
3206 without changing the meaning:
3207
3208 ```````````````````````````````` example
3209 > foo
3210 ---
3211 .
3212 <blockquote>
3213 <p>foo</p>
3214 </blockquote>
3215 <hr />
3216 ````````````````````````````````
3217
3218
3219 Similarly, if we omit the `> ` in the second line of
3220
3221 ``` markdown
3222 > - foo
3223 > - bar
3224 ```
3225
3226 then the block quote ends after the first line:
3227
3228 ```````````````````````````````` example
3229 > - foo
3230 - bar
3231 .
3232 <blockquote>
3233 <ul>
3234 <li>foo</li>
3235 </ul>
3236 </blockquote>
3237 <ul>
3238 <li>bar</li>
3239 </ul>
3240 ````````````````````````````````
3241
3242
3243 For the same reason, we can't omit the `> ` in front of
3244 subsequent lines of an indented or fenced code block:
3245
3246 ```````````````````````````````` example
3247 >     foo
3248     bar
3249 .
3250 <blockquote>
3251 <pre><code>foo
3252 </code></pre>
3253 </blockquote>
3254 <pre><code>bar
3255 </code></pre>
3256 ````````````````````````````````
3257
3258
3259 ```````````````````````````````` example
3260 > ```
3261 foo
3262 ```
3263 .
3264 <blockquote>
3265 <pre><code></code></pre>
3266 </blockquote>
3267 <p>foo</p>
3268 <pre><code></code></pre>
3269 ````````````````````````````````
3270
3271
3272 Note that in the following case, we have a paragraph
3273 continuation line:
3274
3275 ```````````````````````````````` example
3276 > foo
3277     - bar
3278 .
3279 <blockquote>
3280 <p>foo
3281 - bar</p>
3282 </blockquote>
3283 ````````````````````````````````
3284
3285
3286 To see why, note that in
3287
3288 ```markdown
3289 > foo
3290 >     - bar
3291 ```
3292
3293 the `- bar` is indented too far to start a list, and can't
3294 be an indented code block because indented code blocks cannot
3295 interrupt paragraphs, so it is a [paragraph continuation line].
3296
3297 A block quote can be empty:
3298
3299 ```````````````````````````````` example
3300 >
3301 .
3302 <blockquote>
3303 </blockquote>
3304 ````````````````````````````````
3305
3306
3307 ```````````````````````````````` example
3308 >
3309 >  
3310
3311 .
3312 <blockquote>
3313 </blockquote>
3314 ````````````````````````````````
3315
3316
3317 A block quote can have initial or final blank lines:
3318
3319 ```````````````````````````````` example
3320 >
3321 > foo
3322 >  
3323 .
3324 <blockquote>
3325 <p>foo</p>
3326 </blockquote>
3327 ````````````````````````````````
3328
3329
3330 A blank line always separates block quotes:
3331
3332 ```````````````````````````````` example
3333 > foo
3334
3335 > bar
3336 .
3337 <blockquote>
3338 <p>foo</p>
3339 </blockquote>
3340 <blockquote>
3341 <p>bar</p>
3342 </blockquote>
3343 ````````````````````````````````
3344
3345
3346 (Most current Markdown implementations, including John Gruber's
3347 original `Markdown.pl`, will parse this example as a single block quote
3348 with two paragraphs.  But it seems better to allow the author to decide
3349 whether two block quotes or one are wanted.)
3350
3351 Consecutiveness means that if we put these block quotes together,
3352 we get a single block quote:
3353
3354 ```````````````````````````````` example
3355 > foo
3356 > bar
3357 .
3358 <blockquote>
3359 <p>foo
3360 bar</p>
3361 </blockquote>
3362 ````````````````````````````````
3363
3364
3365 To get a block quote with two paragraphs, use:
3366
3367 ```````````````````````````````` example
3368 > foo
3369 >
3370 > bar
3371 .
3372 <blockquote>
3373 <p>foo</p>
3374 <p>bar</p>
3375 </blockquote>
3376 ````````````````````````````````
3377
3378
3379 Block quotes can interrupt paragraphs:
3380
3381 ```````````````````````````````` example
3382 foo
3383 > bar
3384 .
3385 <p>foo</p>
3386 <blockquote>
3387 <p>bar</p>
3388 </blockquote>
3389 ````````````````````````````````
3390
3391
3392 In general, blank lines are not needed before or after block
3393 quotes:
3394
3395 ```````````````````````````````` example
3396 > aaa
3397 ***
3398 > bbb
3399 .
3400 <blockquote>
3401 <p>aaa</p>
3402 </blockquote>
3403 <hr />
3404 <blockquote>
3405 <p>bbb</p>
3406 </blockquote>
3407 ````````````````````````````````
3408
3409
3410 However, because of laziness, a blank line is needed between
3411 a block quote and a following paragraph:
3412
3413 ```````````````````````````````` example
3414 > bar
3415 baz
3416 .
3417 <blockquote>
3418 <p>bar
3419 baz</p>
3420 </blockquote>
3421 ````````````````````````````````
3422
3423
3424 ```````````````````````````````` example
3425 > bar
3426
3427 baz
3428 .
3429 <blockquote>
3430 <p>bar</p>
3431 </blockquote>
3432 <p>baz</p>
3433 ````````````````````````````````
3434
3435
3436 ```````````````````````````````` example
3437 > bar
3438 >
3439 baz
3440 .
3441 <blockquote>
3442 <p>bar</p>
3443 </blockquote>
3444 <p>baz</p>
3445 ````````````````````````````````
3446
3447
3448 It is a consequence of the Laziness rule that any number
3449 of initial `>`s may be omitted on a continuation line of a
3450 nested block quote:
3451
3452 ```````````````````````````````` example
3453 > > > foo
3454 bar
3455 .
3456 <blockquote>
3457 <blockquote>
3458 <blockquote>
3459 <p>foo
3460 bar</p>
3461 </blockquote>
3462 </blockquote>
3463 </blockquote>
3464 ````````````````````````````````
3465
3466
3467 ```````````````````````````````` example
3468 >>> foo
3469 > bar
3470 >>baz
3471 .
3472 <blockquote>
3473 <blockquote>
3474 <blockquote>
3475 <p>foo
3476 bar
3477 baz</p>
3478 </blockquote>
3479 </blockquote>
3480 </blockquote>
3481 ````````````````````````````````
3482
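To see how many markers a line of a nested block quote actually carries, one can strip markers repeatedly. The sketch below counts only the markers written on the line; it does not decide whether a line with fewer markers is a lazy continuation, since that depends on whether a paragraph is open. The name `explicit_quote_depth` is hypothetical, and tabs are ignored.

```python
import re

# 0-3 spaces of indent, then `>`, then at most one following space.
MARKER = re.compile(r" {0,3}> ?")


def explicit_quote_depth(line):
    """Count the block quote markers at the start of a line and
    return (depth, remaining text)."""
    depth = 0
    match = MARKER.match(line)
    while match is not None:
        depth += 1
        line = line[match.end():]
        match = MARKER.match(line)
    return depth, line
```

In the two examples above, the first lines have depth 3, while the continuation lines `bar`, `> bar`, and `>>baz` have depths 0, 1, and 2; laziness is what allows all of them to join the innermost paragraph.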
3483
3484 When including an indented code block in a block quote,
3485 remember that the [block quote marker] includes
3486 both the `>` and a following space.  So *five spaces* are needed after
3487 the `>`:
3488
3489 ```````````````````````````````` example
3490 >     code
3491
3492 >    not code
3493 .
3494 <blockquote>
3495 <pre><code>code
3496 </code></pre>
3497 </blockquote>
3498 <blockquote>
3499 <p>not code</p>
3500 </blockquote>
3501 ````````````````````````````````
3502
3503
3504
3505 ## List items
3506
3507 A [list marker](@) is a
3508 [bullet list marker] or an [ordered list marker].
3509
3510 A [bullet list marker](@)
3511 is a `-`, `+`, or `*` character.
3512
3513 An [ordered list marker](@)
3514 is a sequence of 1--9 Arabic digits (`0-9`), followed by either a
3515 `.` character or a `)` character.  (The reason for the length
3516 limit is that with 10 digits we start seeing integer overflows
3517 in some browsers.)
3518
3519 The following rules define [list items]:
3520
3521 1.  **Basic case.**  If a sequence of lines *Ls* constitutes a sequence of
3522     blocks *Bs* starting with a [non-whitespace character] and not separated
3523     from each other by more than one blank line, and *M* is a list
3524     marker of width *W* followed by 0 < *N* < 5 spaces, then the result
3525     of prepending *M* and the following spaces to the first line of
3526     *Ls*, and indenting subsequent lines of *Ls* by *W + N* spaces, is a
3527     list item with *Bs* as its contents.  The type of the list item
3528     (bullet or ordered) is determined by the type of its list marker.
3529     If the list item is ordered, then it is also assigned a start
3530     number, based on the ordered list marker.
3531
3532 For example, let *Ls* be the lines
3533
3534 ```````````````````````````````` example
3535 A paragraph
3536 with two lines.
3537
3538     indented code
3539
3540 > A block quote.
3541 .
3542 <p>A paragraph
3543 with two lines.</p>
3544 <pre><code>indented code
3545 </code></pre>
3546 <blockquote>
3547 <p>A block quote.</p>
3548 </blockquote>
3549 ````````````````````````````````
3550
3551
3552 And let *M* be the marker `1.`, and *N* = 2.  Then rule #1 says
3553 that the following is an ordered list item with start number 1,
3554 and the same contents as *Ls*:
3555
3556 ```````````````````````````````` example
3557 1.  A paragraph
3558     with two lines.
3559
3560         indented code
3561
3562     > A block quote.
3563 .
3564 <ol>
3565 <li>
3566 <p>A paragraph
3567 with two lines.</p>
3568 <pre><code>indented code
3569 </code></pre>
3570 <blockquote>
3571 <p>A block quote.</p>
3572 </blockquote>
3573 </li>
3574 </ol>
3575 ````````````````````````````````
3576
3577
3578 The most important thing to notice is that the position of
3579 the text after the list marker determines how much indentation
3580 is needed in subsequent blocks in the list item.  If the list
3581 marker takes up two spaces, and there are three spaces between
3582 the list marker and the next [non-whitespace character], then blocks
3583 must be indented five spaces in order to fall under the list
3584 item.
3585
3586 Here are some examples showing how far content must be indented to be
3587 put under the list item:
3588
3589 ```````````````````````````````` example
3590 - one
3591
3592  two
3593 .
3594 <ul>
3595 <li>one</li>
3596 </ul>
3597 <p>two</p>
3598 ````````````````````````````````
3599
3600
3601 ```````````````````````````````` example
3602 - one
3603
3604   two
3605 .
3606 <ul>
3607 <li>
3608 <p>one</p>
3609 <p>two</p>
3610 </li>
3611 </ul>
3612 ````````````````````````````````
3613
3614
3615 ```````````````````````````````` example
3616  -    one
3617
3618      two
3619 .
3620 <ul>
3621 <li>one</li>
3622 </ul>
3623 <pre><code> two
3624 </code></pre>
3625 ````````````````````````````````
3626
3627
3628 ```````````````````````````````` example
3629  -    one
3630
3631       two
3632 .
3633 <ul>
3634 <li>
3635 <p>one</p>
3636 <p>two</p>
3637 </li>
3638 </ul>
3639 ````````````````````````````````
3640
3641
3642 It is tempting to think of this in terms of columns:  the continuation
3643 blocks must be indented at least to the column of the first
3644 [non-whitespace character] after the list marker. However, that is not quite right.
3645 The spaces after the list marker determine how much relative indentation
3646 is needed.  Which column this indentation reaches will depend on
3647 how the list item is embedded in other constructions, as shown by
3648 this example:
3649
3650 ```````````````````````````````` example
3651    > > 1.  one
3652 >>
3653 >>     two
3654 .
3655 <blockquote>
3656 <blockquote>
3657 <ol>
3658 <li>
3659 <p>one</p>
3660 <p>two</p>
3661 </li>
3662 </ol>
3663 </blockquote>
3664 </blockquote>
3665 ````````````````````````````````
3666
3667
3668 Here `two` occurs in the same column as the list marker `1.`,
3669 but is actually contained in the list item, because there is
3670 sufficient indentation after the last containing block quote marker.
3671
3672 The converse is also possible.  In the following example, the word `two`
3673 occurs far to the right of the initial text of the list item, `one`, but
3674 it is not considered part of the list item, because it is not indented
3675 far enough past the block quote marker:
3676
3677 ```````````````````````````````` example
3678 >>- one
3679 >>
3680   >  > two
3681 .
3682 <blockquote>
3683 <blockquote>
3684 <ul>
3685 <li>one</li>
3686 </ul>
3687 <p>two</p>
3688 </blockquote>
3689 </blockquote>
3690 ````````````````````````````````
3691
3692
3693 Note that at least one space is needed between the list marker and
3694 any following content, so these are not list items:
3695
3696 ```````````````````````````````` example
3697 -one
3698
3699 2.two
3700 .
3701 <p>-one</p>
3702 <p>2.two</p>
3703 ````````````````````````````````
3704
3705
3706 A list item may not contain blocks that are separated by more than
3707 one blank line.  Thus, two blank lines will end a list, unless the
3708 two blanks are contained in a [fenced code block].
3709
3710 ```````````````````````````````` example
3711 - foo
3712
3713   bar
3714
3715 - foo
3716
3717
3718   bar
3719
3720 - ```
3721   foo
3722
3723
3724   bar
3725   ```
3726
3727 - baz
3728
3729   + ```
3730     foo
3731
3732
3733     bar
3734     ```
3735 .
3736 <ul>
3737 <li>
3738 <p>foo</p>
3739 <p>bar</p>
3740 </li>
3741 <li>
3742 <p>foo</p>
3743 </li>
3744 </ul>
3745 <p>bar</p>
3746 <ul>
3747 <li>
3748 <pre><code>foo
3749
3750
3751 bar
3752 </code></pre>
3753 </li>
3754 <li>
3755 <p>baz</p>
3756 <ul>
3757 <li>
3758 <pre><code>foo
3759
3760
3761 bar
3762 </code></pre>
3763 </li>
3764 </ul>
3765 </li>
3766 </ul>
3767 ````````````````````````````````
3768
3769
3770 A list item may contain any kind of block:
3771
3772 ```````````````````````````````` example
3773 1.  foo
3774
3775     ```
3776     bar
3777     ```
3778
3779     baz
3780
3781     > bam
3782 .
3783 <ol>
3784 <li>
3785 <p>foo</p>
3786 <pre><code>bar
3787 </code></pre>
3788 <p>baz</p>
3789 <blockquote>
3790 <p>bam</p>
3791 </blockquote>
3792 </li>
3793 </ol>
3794 ````````````````````````````````
3795
3796
3797 A list item that contains an indented code block will preserve
3798 empty lines within the code block verbatim, unless there are two
3799 or more empty lines in a row (since as described above, two
3800 blank lines end the list):
3801
3802 ```````````````````````````````` example
3803 - Foo
3804
3805       bar
3806
3807       baz
3808 .
3809 <ul>
3810 <li>
3811 <p>Foo</p>
3812 <pre><code>bar
3813
3814 baz
3815 </code></pre>
3816 </li>
3817 </ul>
3818 ````````````````````````````````
3819
3820
3821 ```````````````````````````````` example
3822 - Foo
3823
3824       bar
3825
3826
3827       baz
3828 .
3829 <ul>
3830 <li>
3831 <p>Foo</p>
3832 <pre><code>bar
3833 </code></pre>
3834 </li>
3835 </ul>
3836 <pre><code>  baz
3837 </code></pre>
3838 ````````````````````````````````
3839
3840
3841 Note that ordered list start numbers must be nine digits or fewer:
3842
3843 ```````````````````````````````` example
3844 123456789. ok
3845 .
3846 <ol start="123456789">
3847 <li>ok</li>
3848 </ol>
3849 ````````````````````````````````
3850
3851
3852 ```````````````````````````````` example
3853 1234567890. not ok
3854 .
3855 <p>1234567890. not ok</p>
3856 ````````````````````````````````
3857
3858
3859 A start number may begin with 0s:
3860
3861 ```````````````````````````````` example
3862 0. ok
3863 .
3864 <ol start="0">
3865 <li>ok</li>
3866 </ol>
3867 ````````````````````````````````
3868
3869
3870 ```````````````````````````````` example
3871 003. ok
3872 .
3873 <ol start="3">
3874 <li>ok</li>
3875 </ol>
3876 ````````````````````````````````
3877
3878
3879 A start number may not be negative:
3880
3881 ```````````````````````````````` example
3882 -1. not ok
3883 .
3884 <p>-1. not ok</p>
3885 ````````````````````````````````
3886
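The constraints on ordered list markers illustrated above (one to nine digits, a `.` or `)` delimiter, at least one following space unless the line ends there, no sign) can be sketched with a regular expression. The name `ordered_start_number` is hypothetical, and allowing up to three spaces of leading indent is an assumption of this sketch.

```python
import re

# Optional indent, 1-9 digits, `.` or `)`, then a space or the end of the line.
ORDERED_MARKER = re.compile(r" {0,3}(\d{1,9})[.)](?: |$)")


def ordered_start_number(line):
    """Return the start number if the line begins with an ordered
    list marker, otherwise None."""
    match = ORDERED_MARKER.match(line)
    if match is None:
        return None
    return int(match.group(1))
```

Converting the digits with `int` discards leading zeros, so `003.` yields the start number 3.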
3887
3888
3889 2.  **Item starting with indented code.**  If a sequence of lines *Ls*
3890     constitutes a sequence of blocks *Bs* starting with an indented code
3891     block and not separated from each other by more than one blank line,
3892     and *M* is a list marker of width *W* followed by
3893     one space, then the result of prepending *M* and the following
3894     space to the first line of *Ls*, and indenting subsequent lines of
3895     *Ls* by *W + 1* spaces, is a list item with *Bs* as its contents.
3896     If a line is empty, then it need not be indented.  The type of the
3897     list item (bullet or ordered) is determined by the type of its list
3898     marker.  If the list item is ordered, then it is also assigned a
3899     start number, based on the ordered list marker.
3900
3901 An indented code block will have to be indented four spaces beyond
3902 the edge of the region where text will be included in the list item.
3903 In the following case that is 6 spaces:
3904
3905 ```````````````````````````````` example
3906 - foo
3907
3908       bar
3909 .
3910 <ul>
3911 <li>
3912 <p>foo</p>
3913 <pre><code>bar
3914 </code></pre>
3915 </li>
3916 </ul>
3917 ````````````````````````````````
3918
3919
3920 And in this case it is 11 spaces:
3921
3922 ```````````````````````````````` example
3923   10.  foo
3924
3925            bar
3926 .
3927 <ol start="10">
3928 <li>
3929 <p>foo</p>
3930 <pre><code>bar
3931 </code></pre>
3932 </li>
3933 </ol>
3934 ````````````````````````````````
3935
3936
3937 If the *first* block in the list item is an indented code block,
3938 then by rule #2, the contents must be indented *one* space after the
3939 list marker:
3940
3941 ```````````````````````````````` example
3942     indented code
3943
3944 paragraph
3945
3946     more code
3947 .
3948 <pre><code>indented code
3949 </code></pre>
3950 <p>paragraph</p>
3951 <pre><code>more code
3952 </code></pre>
3953 ````````````````````````````````
3954
3955
3956 ```````````````````````````````` example
3957 1.     indented code
3958
3959    paragraph
3960
3961        more code
3962 .
3963 <ol>
3964 <li>
3965 <pre><code>indented code
3966 </code></pre>
3967 <p>paragraph</p>
3968 <pre><code>more code
3969 </code></pre>
3970 </li>
3971 </ol>
3972 ````````````````````````````````
3973
3974
3975 Note that an additional space of indentation is interpreted as a space
3976 inside the code block:
3977
3978 ```````````````````````````````` example
3979 1.      indented code
3980
3981    paragraph
3982
3983        more code
3984 .
3985 <ol>
3986 <li>
3987 <pre><code> indented code
3988 </code></pre>
3989 <p>paragraph</p>
3990 <pre><code>more code
3991 </code></pre>
3992 </li>
3993 </ol>
3994 ````````````````````````````````
3995
3996
3997 Note that rules #1 and #2 only apply to two cases:  (a) cases
3998 in which the lines to be included in a list item begin with a
3999 [non-whitespace character], and (b) cases in which
4000 they begin with an indented code
4001 block.  In a case like the following, where the first block begins with
4002 a three-space indent, the rules do not allow us to form a list item by
4003 indenting the whole thing and prepending a list marker:
4004
4005 ```````````````````````````````` example
4006    foo
4007
4008 bar
4009 .
4010 <p>foo</p>
4011 <p>bar</p>
4012 ````````````````````````````````
4013
4014
4015 ```````````````````````````````` example
4016 -    foo
4017
4018   bar
4019 .
4020 <ul>
4021 <li>foo</li>
4022 </ul>
4023 <p>bar</p>
4024 ````````````````````````````````
4025
4026
4027 This is not a significant restriction, because when a block begins
4028 with 1-3 spaces of indentation, that indentation can always be removed without
4029 a change in interpretation, allowing rule #1 to be applied.  So, in
4030 the above case:
4031
4032 ```````````````````````````````` example
4033 -  foo
4034
4035    bar
4036 .
4037 <ul>
4038 <li>
4039 <p>foo</p>
4040 <p>bar</p>
4041 </li>
4042 </ul>
4043 ````````````````````````````````
4044
4045
4046 3.  **Item starting with a blank line.**  If a sequence of lines *Ls*
4047     starting with a single [blank line] constitutes a (possibly empty)
4048     sequence of blocks *Bs*, not separated from each other by more than
4049     one blank line, and *M* is a list marker of width *W*,
4050     then the result of prepending *M* to the first line of *Ls*, and
4051     indenting subsequent lines of *Ls* by *W + 1* spaces, is a list
4052     item with *Bs* as its contents.
4053     If a line is empty, then it need not be indented.  The type of the
4054     list item (bullet or ordered) is determined by the type of its list
4055     marker.  If the list item is ordered, then it is also assigned a
4056     start number, based on the ordered list marker.
4057
4058 Here are some list items that start with a blank line but are not empty:
4059
4060 ```````````````````````````````` example
4061 -
4062   foo
4063 -
4064   ```
4065   bar
4066   ```
4067 -
4068       baz
4069 .
4070 <ul>
4071 <li>foo</li>
4072 <li>
4073 <pre><code>bar
4074 </code></pre>
4075 </li>
4076 <li>
4077 <pre><code>baz
4078 </code></pre>
4079 </li>
4080 </ul>
4081 ````````````````````````````````
4082
4083 When the list item starts with a blank line, the number of spaces
4084 following the list marker doesn't change the required indentation:
4085
4086 ```````````````````````````````` example
4087 -   
4088   foo
4089 .
4090 <ul>
4091 <li>foo</li>
4092 </ul>
4093 ````````````````````````````````
4094
4095
4096 A list item can begin with at most one blank line.
4097 In the following example, `foo` is not part of the list
4098 item:
4099
4100 ```````````````````````````````` example
4101 -
4102
4103   foo
4104 .
4105 <ul>
4106 <li></li>
4107 </ul>
4108 <p>foo</p>
4109 ````````````````````````````````
4110
4111
4112 Here is an empty bullet list item:
4113
4114 ```````````````````````````````` example
4115 - foo
4116 -
4117 - bar
4118 .
4119 <ul>
4120 <li>foo</li>
4121 <li></li>
4122 <li>bar</li>
4123 </ul>
4124 ````````````````````````````````
4125
4126
4127 It does not matter whether there are spaces following the [list marker]:
4128
4129 ```````````````````````````````` example
4130 - foo
4131 -   
4132 - bar
4133 .
4134 <ul>
4135 <li>foo</li>
4136 <li></li>
4137 <li>bar</li>
4138 </ul>
4139 ````````````````````````````````
4140
4141
4142 Here is an empty ordered list item:
4143
4144 ```````````````````````````````` example
4145 1. foo
4146 2.
4147 3. bar
4148 .
4149 <ol>
4150 <li>foo</li>
4151 <li></li>
4152 <li>bar</li>
4153 </ol>
4154 ````````````````````````````````
4155
4156
4157 A list may start or end with an empty list item:
4158
4159 ```````````````````````````````` example
4160 *
4161 .
4162 <ul>
4163 <li></li>
4164 </ul>
4165 ````````````````````````````````
4166
4167
4168
4169 4.  **Indentation.**  If a sequence of lines *Ls* constitutes a list item
4170     according to rule #1, #2, or #3, then the result of indenting each line
4171     of *Ls* by 1--3 spaces (the same for each line) also constitutes a
4172     list item with the same contents and attributes.  If a line is
4173     empty, then it need not be indented.
4174
4175 Indented one space:
4176
4177 ```````````````````````````````` example
4178  1.  A paragraph
4179      with two lines.
4180
4181          indented code
4182
4183      > A block quote.
4184 .
4185 <ol>
4186 <li>
4187 <p>A paragraph
4188 with two lines.</p>
4189 <pre><code>indented code
4190 </code></pre>
4191 <blockquote>
4192 <p>A block quote.</p>
4193 </blockquote>
4194 </li>
4195 </ol>
4196 ````````````````````````````````
4197
4198
4199 Indented two spaces:
4200
4201 ```````````````````````````````` example
4202   1.  A paragraph
4203       with two lines.
4204
4205           indented code
4206
4207       > A block quote.
4208 .
4209 <ol>
4210 <li>
4211 <p>A paragraph
4212 with two lines.</p>
4213 <pre><code>indented code
4214 </code></pre>
4215 <blockquote>
4216 <p>A block quote.</p>
4217 </blockquote>
4218 </li>
4219 </ol>
4220 ````````````````````````````````
4221
4222
4223 Indented three spaces:
4224
4225 ```````````````````````````````` example
4226    1.  A paragraph
4227        with two lines.
4228
4229            indented code
4230
4231        > A block quote.
4232 .
4233 <ol>
4234 <li>
4235 <p>A paragraph
4236 with two lines.</p>
4237 <pre><code>indented code
4238 </code></pre>
4239 <blockquote>
4240 <p>A block quote.</p>
4241 </blockquote>
4242 </li>
4243 </ol>
4244 ````````````````````````````````
4245
4246
4247 Four spaces indent gives a code block:
4248
4249 ```````````````````````````````` example
4250     1.  A paragraph
4251         with two lines.
4252
4253             indented code
4254
4255         > A block quote.
4256 .
4257 <pre><code>1.  A paragraph
4258     with two lines.
4259
4260         indented code
4261
4262     &gt; A block quote.
4263 </code></pre>
4264 ````````````````````````````````
4265
4266
4267
4268 5.  **Laziness.**  If a string of lines *Ls* constitutes a [list
4269     item](#list-items) with contents *Bs*, then the result of deleting
4270     some or all of the indentation from one or more lines in which the
4271     next [non-whitespace character] after the indentation is
4272     [paragraph continuation text] is a
4273     list item with the same contents and attributes.  The unindented
4274     lines are called
4275     [lazy continuation line](@)s.
4276
4277 Here is an example with [lazy continuation lines]:
4278
4279 ```````````````````````````````` example
4280   1.  A paragraph
4281 with two lines.
4282
4283           indented code
4284
4285       > A block quote.
4286 .
4287 <ol>
4288 <li>
4289 <p>A paragraph
4290 with two lines.</p>
4291 <pre><code>indented code
4292 </code></pre>
4293 <blockquote>
4294 <p>A block quote.</p>
4295 </blockquote>
4296 </li>
4297 </ol>
4298 ````````````````````````````````
4299
4300
4301 Indentation can be partially deleted:
4302
4303 ```````````````````````````````` example
4304   1.  A paragraph
4305     with two lines.
4306 .
4307 <ol>
4308 <li>A paragraph
4309 with two lines.</li>
4310 </ol>
4311 ````````````````````````````````
4312
4313
4314 These examples show how laziness can work in nested structures:
4315
4316 ```````````````````````````````` example
4317 > 1. > Blockquote
4318 continued here.
4319 .
4320 <blockquote>
4321 <ol>
4322 <li>
4323 <blockquote>
4324 <p>Blockquote
4325 continued here.</p>
4326 </blockquote>
4327 </li>
4328 </ol>
4329 </blockquote>
4330 ````````````````````````````````
4331
4332
4333 ```````````````````````````````` example
4334 > 1. > Blockquote
4335 > continued here.
4336 .
4337 <blockquote>
4338 <ol>
4339 <li>
4340 <blockquote>
4341 <p>Blockquote
4342 continued here.</p>
4343 </blockquote>
4344 </li>
4345 </ol>
4346 </blockquote>
4347 ````````````````````````````````
4348
4349
4350
4351 6.  **That's all.** Nothing that is not counted as a list item by rules
4352     #1--5 counts as a [list item](#list-items).
4353
4354 The rules for sublists follow from the general rules above.  A sublist
4355 must be indented the same number of spaces a paragraph would need to be
4356 in order to be included in the list item.
4357
4358 So, in this case we need two spaces indent:
4359
4360 ```````````````````````````````` example
4361 - foo
4362   - bar
4363     - baz
4364 .
4365 <ul>
4366 <li>foo
4367 <ul>
4368 <li>bar
4369 <ul>
4370 <li>baz</li>
4371 </ul>
4372 </li>
4373 </ul>
4374 </li>
4375 </ul>
4376 ````````````````````````````````
4377
4378
4379 One is not enough:
4380
4381 ```````````````````````````````` example
4382 - foo
4383  - bar
4384   - baz
4385 .
4386 <ul>
4387 <li>foo</li>
4388 <li>bar</li>
4389 <li>baz</li>
4390 </ul>
4391 ````````````````````````````````
4392
4393
4394 Here we need four, because the list marker is wider:
4395
4396 ```````````````````````````````` example
4397 10) foo
4398     - bar
4399 .
4400 <ol start="10">
4401 <li>foo
4402 <ul>
4403 <li>bar</li>
4404 </ul>
4405 </li>
4406 </ol>
4407 ````````````````````````````````
4408
4409
4410 Three is not enough:
4411
4412 ```````````````````````````````` example
4413 10) foo
4414    - bar
4415 .
4416 <ol start="10">
4417 <li>foo</li>
4418 </ol>
4419 <ul>
4420 <li>bar</li>
4421 </ul>
4422 ````````````````````````````````
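
The indentation requirements shown in the sublist examples above can be
sketched in a few lines of Python.  This is only an illustration, not
part of the spec: the name `content_indent` is hypothetical, tabs are
not handled, and the sketch does not check whether the line is really a
thematic break (such as `* * *`) rather than a list item.  The limit of
nine digits comes from the definition of ordered list markers elsewhere
in the spec.

```python
import re

# Up to three spaces of indentation, a list marker, then optional spaces.
_MARKER = re.compile(r'^( {0,3})([-+*]|\d{1,9}[.)])( *)(.*)$')


def content_indent(first_line):
    """Return the column at which the item's content starts, or None.

    Later blocks must be indented at least this far to belong to the item.
    """
    m = _MARKER.match(first_line)
    if m is None:
        return None
    indent, marker, spaces, rest = m.groups()
    if rest and not spaces:
        return None  # the marker must be followed by a space
    if not rest or len(spaces) >= 5:
        # Rules #2 and #3: a blank start or indented code counts the
        # marker width plus exactly one space.
        following = 1
    else:
        # Rule #1: the marker width plus the 1--4 spaces that follow it.
        following = len(spaces)
    return len(indent) + len(marker) + following


print(content_indent("- foo"))    # 2
print(content_indent("10) foo"))  # 4
```

A sublist marker, or a continuation paragraph, then belongs to the item
only if it is indented by at least the returned number of spaces.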
4423
4424
4425 A list may be the first block in a list item:
4426
4427 ```````````````````````````````` example
4428 - - foo
4429 .
4430 <ul>
4431 <li>
4432 <ul>
4433 <li>foo</li>
4434 </ul>
4435 </li>
4436 </ul>
4437 ````````````````````````````````
4438
4439
4440 ```````````````````````````````` example
4441 1. - 2. foo
4442 .
4443 <ol>
4444 <li>
4445 <ul>
4446 <li>
4447 <ol start="2">
4448 <li>foo</li>
4449 </ol>
4450 </li>
4451 </ul>
4452 </li>
4453 </ol>
4454 ````````````````````````````````
4455
4456
4457 A list item can contain a heading:
4458
4459 ```````````````````````````````` example
4460 - # Foo
4461 - Bar
4462   ---
4463   baz
4464 .
4465 <ul>
4466 <li>
4467 <h1>Foo</h1>
4468 </li>
4469 <li>
4470 <h2>Bar</h2>
4471 baz</li>
4472 </ul>
4473 ````````````````````````````````
4474
4475
4476 ### Motivation
4477
4478 John Gruber's Markdown spec says the following about list items:
4479
4480 1. "List markers typically start at the left margin, but may be indented
4481    by up to three spaces. List markers must be followed by one or more
4482    spaces or a tab."
4483
4484 2. "To make lists look nice, you can wrap items with hanging indents....
4485    But if you don't want to, you don't have to."
4486
4487 3. "List items may consist of multiple paragraphs. Each subsequent
4488    paragraph in a list item must be indented by either 4 spaces or one
4489    tab."
4490
4491 4. "It looks nice if you indent every line of the subsequent paragraphs,
4492    but here again, Markdown will allow you to be lazy."
4493
4494 5. "To put a blockquote within a list item, the blockquote's `>`
4495    delimiters need to be indented."
4496
4497 6. "To put a code block within a list item, the code block needs to be
4498    indented twice — 8 spaces or two tabs."
4499
4500 These rules specify that a paragraph under a list item must be indented
4501 four spaces (presumably, from the left margin, rather than the start of
4502 the list marker, but this is not said), and that code under a list item
4503 must be indented eight spaces instead of the usual four.  They also say
4504 that a block quote must be indented, but not by how much; however, the
4505 example given has four spaces indentation.  Although nothing is said
4506 about other kinds of block-level content, it is certainly reasonable to
4507 infer that *all* block elements under a list item, including other
4508 lists, must be indented four spaces.  This principle has been called the
4509 *four-space rule*.
4510
4511 The four-space rule is clear and principled, and if the reference
4512 implementation `Markdown.pl` had followed it, it probably would have
4513 become the standard.  However, `Markdown.pl` allowed paragraphs and
4514 sublists to start with only two spaces indentation, at least on the
4515 outer level.  Worse, its behavior was inconsistent: a sublist of an
4516 outer-level list needed two spaces indentation, but a sublist of this
4517 sublist needed three spaces.  It is not surprising, then, that different
4518 implementations of Markdown have developed very different rules for
4519 determining what comes under a list item.  (Pandoc and python-Markdown,
4520 for example, stuck with Gruber's syntax description and the four-space
4521 rule, while discount, redcarpet, marked, PHP Markdown, and others
4522 followed `Markdown.pl`'s behavior more closely.)
4523
4524 Unfortunately, given the divergences between implementations, there
4525 is no way to give a spec for list items that will be guaranteed not
4526 to break any existing documents.  However, the spec given here should
4527 correctly handle lists formatted with either the four-space rule or
4528 the more forgiving `Markdown.pl` behavior, provided they are laid out
4529 in a way that is natural for a human to read.
4530
4531 The strategy here is to let the width and indentation of the list marker
4532 determine the indentation necessary for blocks to fall under the list
4533 item, rather than having a fixed and arbitrary number.  The writer can
4534 think of the body of the list item as a unit which gets indented to the
4535 right enough to fit the list marker (and any indentation on the list
4536 marker).  (The laziness rule, #5, then allows continuation lines to be
4537 unindented if needed.)
4538
4539 This rule is superior, we claim, to any rule requiring a fixed level of
4540 indentation from the margin.  The four-space rule is clear but
4541 unnatural. It is quite unintuitive that
4542
4543 ``` markdown
4544 - foo
4545
4546   bar
4547
4548   - baz
4549 ```
4550
4551 should be parsed as two lists with an intervening paragraph,
4552
4553 ``` html
4554 <ul>
4555 <li>foo</li>
4556 </ul>
4557 <p>bar</p>
4558 <ul>
4559 <li>baz</li>
4560 </ul>
4561 ```
4562
4563 as the four-space rule demands, rather than a single list,
4564
4565 ``` html
4566 <ul>
4567 <li>
4568 <p>foo</p>
4569 <p>bar</p>
4570 <ul>
4571 <li>baz</li>
4572 </ul>
4573 </li>
4574 </ul>
4575 ```
4576
4577 The choice of four spaces is arbitrary.  It can be learned, but it is
4578 not likely to be guessed, and it trips up beginners regularly.
4579
4580 Would it help to adopt a two-space rule?  The problem is that such
4581 a rule, together with the rule allowing 1--3 spaces indentation of the
4582 initial list marker, allows text that is indented *less than* the
4583 original list marker to be included in the list item. For example,
4584 `Markdown.pl` parses
4585
4586 ``` markdown
4587    - one
4588
4589   two
4590 ```
4591
4592 as a single list item, with `two` a continuation paragraph:
4593
4594 ``` html
4595 <ul>
4596 <li>
4597 <p>one</p>
4598 <p>two</p>
4599 </li>
4600 </ul>
4601 ```
4602
4603 and similarly
4604
4605 ``` markdown
4606 >   - one
4607 >
4608 >  two
4609 ```
4610
4611 as
4612
4613 ``` html
4614 <blockquote>
4615 <ul>
4616 <li>
4617 <p>one</p>
4618 <p>two</p>
4619 </li>
4620 </ul>
4621 </blockquote>
4622 ```
4623
4624 This is extremely unintuitive.
4625
4626 Rather than requiring a fixed indent from the margin, we could require
4627 a fixed indent (say, two spaces, or even one space) from the list marker (which
4628 may itself be indented).  This proposal would remove the last anomaly
4629 discussed.  Unlike the spec presented above, it would count the following
4630 as a list item with a subparagraph, even though the paragraph `bar`
4631 is not indented as far as the first paragraph `foo`:
4632
4633 ``` markdown
4634  10. foo
4635
4636    bar  
4637 ```
4638
4639 Arguably this text does read like a list item with `bar` as a subparagraph,
4640 which may count in favor of the proposal.  However, on this proposal indented
4641 code would have to be indented six spaces after the list marker.  And this
4642 would break a lot of existing Markdown, which has the pattern:
4643
4644 ``` markdown
4645 1.  foo
4646
4647         indented code
4648 ```
4649
4650 where the code is indented eight spaces.  The spec above, by contrast, will
4651 parse this text as expected, since the code block's indentation is measured
4652 from the beginning of `foo`.
4653
4654 The one case that needs special treatment is a list item that *starts*
4655 with indented code.  How much indentation is required in that case, since
4656 we don't have a "first paragraph" to measure from?  Rule #2 simply stipulates
4657 that in such cases, we require one space indentation from the list marker
4658 (and then the normal four spaces for the indented code).  This will match the
4659 four-space rule in cases where the list marker plus its initial indentation
4660 takes four spaces (a common case), but diverge in other cases.
4661
4662 ## Lists
4663
4664 A [list](@) is a sequence of one or more
4665 list items [of the same type].  The list items
4666 may be separated by single [blank lines], but two
4667 blank lines end all containing lists.
4668
4669 Two list items are [of the same type](@)
4670 if they begin with a [list marker] of the same type.
4671 Two list markers are of the
4672 same type if (a) they are bullet list markers using the same character
4673 (`-`, `+`, or `*`) or (b) they are ordered list numbers with the same
4674 delimiter (either `.` or `)`).
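
As a rough illustration of this definition, the comparison can be
written as follows.  The helper names `marker_type` and `same_type` are
hypothetical, and the sketch assumes the list marker has already been
separated from the rest of the line.

```python
def marker_type(marker):
    """Classify a list marker such as '-', '*', '3.' or '10)'."""
    if marker in ("-", "+", "*"):
        # Bullet markers are distinguished by their character.
        return ("bullet", marker)
    number, delimiter = marker[:-1], marker[-1:]
    if delimiter in (".", ")") and number.isascii() and number.isdigit():
        # Ordered markers are distinguished by delimiter only,
        # not by their number.
        return ("ordered", delimiter)
    raise ValueError(f"not a list marker: {marker!r}")


def same_type(marker_a, marker_b):
    """True if two list items with these markers belong to the same list."""
    return marker_type(marker_a) == marker_type(marker_b)


print(same_type("1.", "2."))  # True
print(same_type("2.", "3)"))  # False
```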
4675
4676 A list is an [ordered list](@)
4677 if its constituent list items begin with
4678 [ordered list markers], and a
4679 [bullet list](@) if its constituent list
4680 items begin with [bullet list markers].
4681
4682 The [start number](@)
4683 of an [ordered list] is determined by the list number of
4684 its initial list item.  The numbers of subsequent list items are
4685 disregarded.
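
A small sketch of how a renderer might apply this rule is given below.
The function name `ol_open_tag` is hypothetical; omitting the `start`
attribute when the start number is 1 follows the HTML shown in the
examples of this spec.

```python
def ol_open_tag(markers):
    """Build the opening tag for an ordered list from its item markers.

    Only the first marker matters; later numbers are disregarded.
    """
    start = int(markers[0][:-1])  # drop the trailing '.' or ')'
    return "<ol>" if start == 1 else f'<ol start="{start}">'


print(ol_open_tag(["1.", "2."]))  # <ol>
print(ol_open_tag(["3)", "7)"]))  # <ol start="3">
```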
4686
4687 A list is [loose](@) if any of its constituent
4688 list items are separated by blank lines, or if any of its constituent
4689 list items directly contain two block-level elements with a blank line
4690 between them.  Otherwise a list is [tight](@).
4691 (The difference in HTML output is that paragraphs in a loose list are
4692 wrapped in `<p>` tags, while paragraphs in a tight list are not.)
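
The definition can be illustrated with a simplified model in which a
parser has already recorded where the blank lines fall.  The `Item`
class and its fields are assumptions made for this sketch and do not
correspond to any particular implementation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    # gaps[i] is True when a blank line separates direct child block i
    # from direct child block i + 1 of this item.
    gaps: List[bool] = field(default_factory=list)
    # True when a blank line separates this item from the next item.
    blank_after: bool = False


def is_loose(items):
    """Return True if the list made of these items is loose."""
    for index, item in enumerate(items):
        is_last = index == len(items) - 1
        if item.blank_after and not is_last:
            return True  # blank line between two list items
        if any(item.gaps):
            return True  # blank line between two direct child blocks
    return False


print(is_loose([Item(), Item(), Item()]))                  # False
print(is_loose([Item(), Item(blank_after=True), Item()]))  # True
```

Blank lines inside nested containers, such as a sublist or a block
quote, are not recorded on the outer item in this model, so they do not
make the outer list loose.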
4693
4694 Changing the bullet or ordered list delimiter starts a new list:
4695
4696 ```````````````````````````````` example
4697 - foo
4698 - bar
4699 + baz
4700 .
4701 <ul>
4702 <li>foo</li>
4703 <li>bar</li>
4704 </ul>
4705 <ul>
4706 <li>baz</li>
4707 </ul>
4708 ````````````````````````````````
4709
4710
4711 ```````````````````````````````` example
4712 1. foo
4713 2. bar
4714 3) baz
4715 .
4716 <ol>
4717 <li>foo</li>
4718 <li>bar</li>
4719 </ol>
4720 <ol start="3">
4721 <li>baz</li>
4722 </ol>
4723 ````````````````````````````````
4724
4725
4726 In CommonMark, a list can interrupt a paragraph. That is,
4727 no blank line is needed to separate a paragraph from a following
4728 list:
4729
4730 ```````````````````````````````` example
4731 Foo
4732 - bar
4733 - baz
4734 .
4735 <p>Foo</p>
4736 <ul>
4737 <li>bar</li>
4738 <li>baz</li>
4739 </ul>
4740 ````````````````````````````````
4741
4742
4743 `Markdown.pl` does not allow this, through fear of triggering a list
4744 via a numeral in a hard-wrapped line:
4745
4746 ```````````````````````````````` example
4747 The number of windows in my house is
4748 14.  The number of doors is 6.
4749 .
4750 <p>The number of windows in my house is</p>
4751 <ol start="14">
4752 <li>The number of doors is 6.</li>
4753 </ol>
4754 ````````````````````````````````
4755
4756
4757 Oddly, `Markdown.pl` *does* allow a blockquote to interrupt a paragraph,
4758 even though the same considerations might apply.  We think that the two
4759 cases should be treated the same.  Here are two reasons for allowing
4760 lists to interrupt paragraphs:
4761
4762 First, it is natural and not uncommon for people to start lists without
4763 blank lines:
4764
4765     I need to buy
4766     - new shoes
4767     - a coat
4768     - a plane ticket
4769
4770 Second, we are attracted to a
4771
4772 > [principle of uniformity](@):
4773 > if a chunk of text has a certain
4774 > meaning, it will continue to have the same meaning when put into a
4775 > container block (such as a list item or blockquote).
4776
4777 (Indeed, the spec for [list items] and [block quotes] presupposes
4778 this principle.) This principle implies that if
4779
4780       * I need to buy
4781         - new shoes
4782         - a coat
4783         - a plane ticket
4784
4785 is a list item containing a paragraph followed by a nested sublist,
4786 as all Markdown implementations agree it is (though the paragraph
4787 may be rendered without `<p>` tags, since the list is "tight"),
4788 then
4789
4790     I need to buy
4791     - new shoes
4792     - a coat
4793     - a plane ticket
4794
4795 by itself should be a paragraph followed by a list.
4796
4797 Our adherence to the [principle of uniformity]
4798 thus inclines us to think that there are two coherent packages:
4799
4800 1.  Require blank lines before *all* lists and blockquotes,
4801     including lists that occur as sublists inside other list items.
4802
4803 2.  Require blank lines in none of these places.
4804
4805 [reStructuredText](http://docutils.sourceforge.net/rst.html) takes
4806 the first approach, for which there is much to be said.  But the second
4807 seems more consistent with established practice with Markdown.
4808
4809 There can be blank lines between items, but two blank lines end
4810 a list:
4811
4812 ```````````````````````````````` example
4813 - foo
4814
4815 - bar
4816
4817
4818 - baz
4819 .
4820 <ul>
4821 <li>
4822 <p>foo</p>
4823 </li>
4824 <li>
4825 <p>bar</p>
4826 </li>
4827 </ul>
4828 <ul>
4829 <li>baz</li>
4830 </ul>
4831 ````````````````````````````````
4832
4833
4834 As illustrated above in the section on [list items],
4835 two blank lines between blocks *within* a list item will also end a
4836 list:
4837
4838 ```````````````````````````````` example
4839 - foo
4840
4841
4842   bar
4843 - baz
4844 .
4845 <ul>
4846 <li>foo</li>
4847 </ul>
4848 <p>bar</p>
4849 <ul>
4850 <li>baz</li>
4851 </ul>
4852 ````````````````````````````````
4853
4854
4855 Indeed, two blank lines will end *all* containing lists:
4856
4857 ```````````````````````````````` example
4858 - foo
4859   - bar
4860     - baz
4861
4862
4863       bim
4864 .
4865 <ul>
4866 <li>foo
4867 <ul>
4868 <li>bar
4869 <ul>
4870 <li>baz</li>
4871 </ul>
4872 </li>
4873 </ul>
4874 </li>
4875 </ul>
4876 <pre><code>  bim
4877 </code></pre>
4878 ````````````````````````````````
4879
4880
4881 Thus, two blank lines can be used to separate consecutive lists of
4882 the same type, or to separate a list from an indented code block
4883 that would otherwise be parsed as a subparagraph of the final list
4884 item:
4885
4886 ```````````````````````````````` example
4887 - foo
4888 - bar
4889
4890
4891 - baz
4892 - bim
4893 .
4894 <ul>
4895 <li>foo</li>
4896 <li>bar</li>
4897 </ul>
4898 <ul>
4899 <li>baz</li>
4900 <li>bim</li>
4901 </ul>
4902 ````````````````````````````````
4903
4904
4905 ```````````````````````````````` example
4906 -   foo
4907
4908     notcode
4909
4910 -   foo
4911
4912
4913     code
4914 .
4915 <ul>
4916 <li>
4917 <p>foo</p>
4918 <p>notcode</p>
4919 </li>
4920 <li>
4921 <p>foo</p>
4922 </li>
4923 </ul>
4924 <pre><code>code
4925 </code></pre>
4926 ````````````````````````````````
4927
4928
4929 List items need not be indented to the same level.  The following
4930 list items will be treated as items at the same list level,
4931 since none is indented enough to belong to the previous list
4932 item:
4933
4934 ```````````````````````````````` example
4935 - a
4936  - b
4937   - c
4938    - d
4939     - e
4940    - f
4941   - g
4942  - h
4943 - i
4944 .
4945 <ul>
4946 <li>a</li>
4947 <li>b</li>
4948 <li>c</li>
4949 <li>d</li>
4950 <li>e</li>
4951 <li>f</li>
4952 <li>g</li>
4953 <li>h</li>
4954 <li>i</li>
4955 </ul>
4956 ````````````````````````````````
4957
4958
4959 ```````````````````````````````` example
4960 1. a
4961
4962   2. b
4963
4964     3. c
4965 .
4966 <ol>
4967 <li>
4968 <p>a</p>
4969 </li>
4970 <li>
4971 <p>b</p>
4972 </li>
4973 <li>
4974 <p>c</p>
4975 </li>
4976 </ol>
4977 ````````````````````````````````
4978
4979
4980 This is a loose list, because there is a blank line between
4981 two of the list items:
4982
4983 ```````````````````````````````` example
4984 - a
4985 - b
4986
4987 - c
4988 .
4989 <ul>
4990 <li>
4991 <p>a</p>
4992 </li>
4993 <li>
4994 <p>b</p>
4995 </li>
4996 <li>
4997 <p>c</p>
4998 </li>
4999 </ul>
5000 ````````````````````````````````
5001
5002
5003 So is this, with an empty second item:
5004
5005 ```````````````````````````````` example
5006 * a
5007 *
5008
5009 * c
5010 .
5011 <ul>
5012 <li>
5013 <p>a</p>
5014 </li>
5015 <li></li>
5016 <li>
5017 <p>c</p>
5018 </li>
5019 </ul>
5020 ````````````````````````````````
5021
5022
5023 These are loose lists, even though there is no blank line between the items,
5024 because one of the items directly contains two block-level elements
5025 with a blank line between them:
5026
5027 ```````````````````````````````` example
5028 - a
5029 - b
5030
5031   c
5032 - d
5033 .
5034 <ul>
5035 <li>
5036 <p>a</p>
5037 </li>
5038 <li>
5039 <p>b</p>
5040 <p>c</p>
5041 </li>
5042 <li>
5043 <p>d</p>
5044 </li>
5045 </ul>
5046 ````````````````````````````````
5047
5048
5049 ```````````````````````````````` example
5050 - a
5051 - b
5052
5053   [ref]: /url
5054 - d
5055 .
5056 <ul>
5057 <li>
5058 <p>a</p>
5059 </li>
5060 <li>
5061 <p>b</p>
5062 </li>
5063 <li>
5064 <p>d</p>
5065 </li>
5066 </ul>
5067 ````````````````````````````````
5068
5069
5070 This is a tight list, because the blank lines are in a code block:
5071
5072 ```````````````````````````````` example
5073 - a
5074 - ```
5075   b
5076
5077
5078   ```
5079 - c
5080 .
5081 <ul>
5082 <li>a</li>
5083 <li>
5084 <pre><code>b
5085
5086
5087 </code></pre>
5088 </li>
5089 <li>c</li>
5090 </ul>
5091 ````````````````````````````````
5092
5093
5094 This is a tight list, because the blank line is between two
5095 paragraphs of a sublist.  So the sublist is loose while
5096 the outer list is tight:
5097
5098 ```````````````````````````````` example
5099 - a
5100   - b
5101
5102     c
5103 - d
5104 .
5105 <ul>
5106 <li>a
5107 <ul>
5108 <li>
5109 <p>b</p>
5110 <p>c</p>
5111 </li>
5112 </ul>
5113 </li>
5114 <li>d</li>
5115 </ul>
5116 ````````````````````````````````
5117
5118
5119 This is a tight list, because the blank line is inside the
5120 block quote:
5121
5122 ```````````````````````````````` example
5123 * a
5124   > b
5125   >
5126 * c
5127 .
5128 <ul>
5129 <li>a
5130 <blockquote>
5131 <p>b</p>
5132 </blockquote>
5133 </li>
5134 <li>c</li>
5135 </ul>
5136 ````````````````````````````````
5137
5138
5139 This list is tight, because the consecutive block elements
5140 are not separated by blank lines:
5141
5142 ```````````````````````````````` example
5143 - a
5144   > b
5145   ```
5146   c
5147   ```
5148 - d
5149 .
5150 <ul>
5151 <li>a
5152 <blockquote>
5153 <p>b</p>
5154 </blockquote>
5155 <pre><code>c
5156 </code></pre>
5157 </li>
5158 <li>d</li>
5159 </ul>
5160 ````````````````````````````````
5161
5162
5163 A single-paragraph list is tight:
5164
5165 ```````````````````````````````` example
5166 - a
5167 .
5168 <ul>
5169 <li>a</li>
5170 </ul>
5171 ````````````````````````````````
5172
5173
5174 ```````````````````````````````` example
5175 - a
5176   - b
5177 .
5178 <ul>
5179 <li>a
5180 <ul>
5181 <li>b</li>
5182 </ul>
5183 </li>
5184 </ul>
5185 ````````````````````````````````
5186
5187
5188 This list is loose, because of the blank line between the
5189 two block elements in the list item:
5190
5191 ```````````````````````````````` example
5192 1. ```
5193    foo
5194    ```
5195
5196    bar
5197 .
5198 <ol>
5199 <li>
5200 <pre><code>foo
5201 </code></pre>
5202 <p>bar</p>
5203 </li>
5204 </ol>
5205 ````````````````````````````````
5206
5207
5208 Here the outer list is loose, the inner list tight:
5209
5210 ```````````````````````````````` example
5211 * foo
5212   * bar
5213
5214   baz
5215 .
5216 <ul>
5217 <li>
5218 <p>foo</p>
5219 <ul>
5220 <li>bar</li>
5221 </ul>
5222 <p>baz</p>
5223 </li>
5224 </ul>
5225 ````````````````````````````````
5226
5227
5228 ```````````````````````````````` example
5229 - a
5230   - b
5231   - c
5232
5233 - d
5234   - e
5235   - f
5236 .
5237 <ul>
5238 <li>
5239 <p>a</p>
5240 <ul>
5241 <li>b</li>
5242 <li>c</li>
5243 </ul>
5244 </li>
5245 <li>
5246 <p>d</p>
5247 <ul>
5248 <li>e</li>
5249 <li>f</li>
5250 </ul>
5251 </li>
5252 </ul>
5253 ````````````````````````````````
5254
5255
5256 # Inlines
5257
5258 Inlines are parsed sequentially from the beginning of the character
5259 stream to the end (left to right, in left-to-right languages).
5260 Thus, for example, in
5261
5262 ```````````````````````````````` example
5263 `hi`lo`
5264 .
5265 <p><code>hi</code>lo`</p>
5266 ````````````````````````````````
5267
5268
5269 `hi` is parsed as code, leaving the backtick at the end as a literal
5270 backtick.
5271
5272 ## Backslash escapes
5273
5274 Any ASCII punctuation character may be backslash-escaped:
5275
5276 ```````````````````````````````` example
5277 \!\"\#\$\%\&\'\(\)\*\+\,\-\.\/\:\;\<\=\>\?\@\[\\\]\^\_\`\{\|\}\~
5278 .
5279 <p>!&quot;#$%&amp;'()*+,-./:;&lt;=&gt;?@[\]^_`{|}~</p>
5280 ````````````````````````````````
5281
5282
5283 Backslashes before other characters are treated as literal
5284 backslashes:
5285
5286 ```````````````````````````````` example
5287 \→\A\a\ \3\φ\«
5288 .
5289 <p>\→\A\a\ \3\φ\«</p>
5290 ````````````````````````````````
5291
5292
5293 Escaped characters are treated as regular characters and do
5294 not have their usual Markdown meanings:
5295
5296 ```````````````````````````````` example
5297 \*not emphasized*
5298 \<br/> not a tag
5299 \[not a link](/foo)
5300 \`not code`
5301 1\. not a list
5302 \* not a list
5303 \# not a heading
5304 \[foo]: /url "not a reference"
5305 .
5306 <p>*not emphasized*
5307 &lt;br/&gt; not a tag
5308 [not a link](/foo)
5309 `not code`
5310 1. not a list
5311 * not a list
5312 # not a heading
5313 [foo]: /url &quot;not a reference&quot;</p>
5314 ````````````````````````````````
5315
5316
5317 If a backslash is itself escaped, the following character is not:
5318
5319 ```````````````````````````````` example
5320 \\*emphasis*
5321 .
5322 <p>\<em>emphasis</em></p>
5323 ````````````````````````````````
5324
5325
5326 A backslash at the end of the line is a [hard line break]:
5327
5328 ```````````````````````````````` example
5329 foo\
5330 bar
5331 .
5332 <p>foo<br />
5333 bar</p>
5334 ````````````````````````````````
5335
5336
5337 Backslash escapes do not work in code blocks, code spans, autolinks, or
5338 raw HTML:
5339
5340 ```````````````````````````````` example
5341 `` \[\` ``
5342 .
5343 <p><code>\[\`</code></p>
5344 ````````````````````````````````
5345
5346
5347 ```````````````````````````````` example
5348     \[\]
5349 .
5350 <pre><code>\[\]
5351 </code></pre>
5352 ````````````````````````````````
5353
5354
5355 ```````````````````````````````` example
5356 ~~~
5357 \[\]
5358 ~~~
5359 .
5360 <pre><code>\[\]
5361 </code></pre>
5362 ````````````````````````````````
5363
5364
5365 ```````````````````````````````` example
5366 <http://example.com?find=\*>
5367 .
5368 <p><a href="http://example.com?find=%5C*">http://example.com?find=\*</a></p>
5369 ````````````````````````````````
5370
5371
5372 ```````````````````````````````` example
5373 <a href="/bar\/)">
5374 .
5375 <a href="/bar\/)">
5376 ````````````````````````````````
5377
5378
5379 But they work in all other contexts, including URLs and link titles,
5380 link references, and [info strings] in [fenced code blocks]:
5381
5382 ```````````````````````````````` example
5383 [foo](/bar\* "ti\*tle")
5384 .
5385 <p><a href="/bar*" title="ti*tle">foo</a></p>
5386 ````````````````````````````````
5387
5388
5389 ```````````````````````````````` example
5390 [foo]
5391
5392 [foo]: /bar\* "ti\*tle"
5393 .
5394 <p><a href="/bar*" title="ti*tle">foo</a></p>
5395 ````````````````````````````````
5396
5397
5398 ```````````````````````````````` example
5399 ``` foo\+bar
5400 foo
5401 ```
5402 .
5403 <pre><code class="language-foo+bar">foo
5404 </code></pre>
5405 ````````````````````````````````
5406
5407
5408
5409 ## Entity and numeric character references
5410
5411 All valid HTML entity references and numeric character
5412 references, except those occurring in code blocks and code spans,
5413 are recognized as such and treated as equivalent to the
5414 corresponding Unicode characters.  Conforming CommonMark parsers
5415 need not store information about whether a particular character
5416 was represented in the source using a Unicode character or
5417 an entity reference.
5418
5419 [Entity references](@) consist of `&` + any of the valid
5420 HTML5 entity names + `;`. The
5421 document <https://html.spec.whatwg.org/multipage/entities.json>
5422 is used as an authoritative source for the valid entity
5423 references and their corresponding code points.
5424
5425 ```````````````````````````````` example
5426 &nbsp; &amp; &copy; &AElig; &Dcaron;
5427 &frac34; &HilbertSpace; &DifferentialD;
5428 &ClockwiseContourIntegral; &ngE;
5429 .
5430 <p>  &amp; © Æ Ď
5431 ¾ ℋ ⅆ
5432 ∲ ≧̸</p>
5433 ````````````````````````````````
5434
5435
5436 [Decimal numeric character
5437 references](@)
5438 consist of `&#` + a string of 1--8 arabic digits + `;`. A
5439 numeric character reference is parsed as the corresponding
5440 Unicode character. Invalid Unicode code points will be replaced by
5441 the REPLACEMENT CHARACTER (`U+FFFD`).  For security reasons,
5442 the code point `U+0000` will also be replaced by `U+FFFD`.
5443
5444 ```````````````````````````````` example
5445 &#35; &#1234; &#992; &#98765432; &#0;
5446 .
5447 <p># Ӓ Ϡ � �</p>
5448 ````````````````````````````````
5449
5450
5451 [Hexadecimal numeric character
5452 references](@) consist of `&#` +
5453 either `X` or `x` + a string of 1--8 hexadecimal digits + `;`.
5454 They too are parsed as the corresponding Unicode character (this
5455 time specified with a hexadecimal numeral instead of decimal).
5456
5457 ```````````````````````````````` example
5458 &#X22; &#XD06; &#xcab;
5459 .
5460 <p>&quot; ആ ಫ</p>
5461 ````````````````````````````````
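
The two numeric forms above can be decoded with a single pattern. The following is a minimal sketch, not a reference implementation: the names `NUMERIC_REF` and `decode_numeric_refs` are hypothetical, and treating surrogates and values above `U+10FFFF` as the "invalid Unicode code points" is an assumption made for illustration.

```python
import re

# `&#` + 1-8 decimal digits + `;`, or `&#` + `X`/`x` + 1-8 hex digits + `;`.
NUMERIC_REF = re.compile(r"&#(?:[Xx]([0-9A-Fa-f]{1,8})|([0-9]{1,8}));")
REPLACEMENT = "\ufffd"  # REPLACEMENT CHARACTER


def decode_numeric_refs(text):
    """Replace numeric character references with the characters they name."""

    def replace(match):
        hex_digits, dec_digits = match.groups()
        if hex_digits is not None:
            code_point = int(hex_digits, 16)
        else:
            code_point = int(dec_digits)
        # Assumption: "invalid" means zero, a surrogate, or beyond U+10FFFF.
        if (code_point == 0 or code_point > 0x10FFFF
                or 0xD800 <= code_point <= 0xDFFF):
            return REPLACEMENT
        return chr(code_point)

    return NUMERIC_REF.sub(replace, text)
```

Strings such as `&#;` and `&#x;` contain no digits, so the pattern does not match them and they are left untouched.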
5462
5463
5464 Here are some nonentities:
5465
5466 ```````````````````````````````` example
5467 &nbsp &x; &#; &#x;
5468 &ThisIsNotDefined; &hi?;
5469 .
5470 <p>&amp;nbsp &amp;x; &amp;#; &amp;#x;
5471 &amp;ThisIsNotDefined; &amp;hi?;</p>
5472 ````````````````````````````````
5473
5474
5475 Although HTML5 does accept some entity references
5476 without a trailing semicolon (such as `&copy`), these are not
5477 recognized here, because accepting them would make the grammar too ambiguous:
5478
5479 ```````````````````````````````` example
5480 &copy
5481 .
5482 <p>&amp;copy</p>
5483 ````````````````````````````````
5484
5485
5486 Strings that are not on the list of HTML5 named entities are not
5487 recognized as entity references either:
5488
5489 ```````````````````````````````` example
5490 &MadeUpEntity;
5491 .
5492 <p>&amp;MadeUpEntity;</p>
5493 ````````````````````````````````
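
As a rough illustration of the rules for named entity references, the sketch below looks names up in Python's standard `html.entities.html5` table, which is assumed here to mirror the HTML5 entity list. The helper name `decode_named_refs` is hypothetical. Only keys that end in `;` are consulted, so semicolon-less forms such as `&copy` stay literal.

```python
import re
from html.entities import html5

# `&` + a candidate entity name + `;` (the captured group keeps the `;`,
# because that is how the keys of the `html5` table are spelled).
NAMED_REF = re.compile(r"&([A-Za-z][A-Za-z0-9]*;)")


def decode_named_refs(text):
    """Replace known `&name;` references; leave everything else as written."""

    def replace(match):
        # Unknown names fall back to the original matched text.
        return html5.get(match.group(1), match.group(0))

    return NAMED_REF.sub(replace, text)
```

For example, `decode_named_refs("&copy; &copy &MadeUpEntity;")` changes only the first reference.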
5494
5495
5496 Entity and numeric character references are recognized in any
5497 context besides code spans or code blocks, including
5498 URLs, [link titles], and [fenced code block][] [info strings]:
5499
5500 ```````````````````````````````` example
5501 <a href="&ouml;&ouml;.html">
5502 .
5503 <a href="&ouml;&ouml;.html">
5504 ````````````````````````````````
5505
5506
5507 ```````````````````````````````` example
5508 [foo](/f&ouml;&ouml; "f&ouml;&ouml;")
5509 .
5510 <p><a href="/f%C3%B6%C3%B6" title="föö">foo</a></p>
5511 ````````````````````````````````
5512
5513
5514 ```````````````````````````````` example
5515 [foo]
5516
5517 [foo]: /f&ouml;&ouml; "f&ouml;&ouml;"
5518 .
5519 <p><a href="/f%C3%B6%C3%B6" title="föö">foo</a></p>
5520 ````````````````````````````````
5521
5522
5523 ```````````````````````````````` example
5524 ``` f&ouml;&ouml;
5525 foo
5526 ```
5527 .
5528 <pre><code class="language-föö">foo
5529 </code></pre>
5530 ````````````````````````````````
5531
5532
5533 Entity and numeric character references are treated as literal
5534 text in code spans and code blocks:
5535
5536 ```````````````````````````````` example
5537 `f&ouml;&ouml;`
5538 .
5539 <p><code>f&amp;ouml;&amp;ouml;</code></p>
5540 ````````````````````````````````
5541
5542
5543 ```````````````````````````````` example
5544     f&ouml;f&ouml;
5545 .
5546 <pre><code>f&amp;ouml;f&amp;ouml;
5547 </code></pre>
5548 ````````````````````````````````
5549
5550
5551 ## Code spans
5552
5553 A [backtick string](@)
5554 is a string of one or more backtick characters (`` ` ``) that is neither
5555 preceded nor followed by a backtick.
5556
5557 A [code span](@) begins with a backtick string and ends with
5558 a backtick string of equal length.  The contents of the code span are
5559 the characters between the two backtick strings, with leading and
5560 trailing spaces and [line endings] removed, and
5561 [whitespace] collapsed to single spaces.
5562
5563 This is a simple code span:
5564
5565 ```````````````````````````````` example
5566 `foo`
5567 .
5568 <p><code>foo</code></p>
5569 ````````````````````````````````
5570
5571
5572 Here two backticks are used, because the code contains a backtick.
5573 This example also illustrates stripping of leading and trailing spaces:
5574
5575 ```````````````````````````````` example
5576 `` foo ` bar  ``
5577 .
5578 <p><code>foo ` bar</code></p>
5579 ````````````````````````````````
5580
5581
5582 This example shows the motivation for stripping leading and trailing
5583 spaces:
5584
5585 ```````````````````````````````` example
5586 ` `` `
5587 .
5588 <p><code>``</code></p>
5589 ````````````````````````````````
5590
5591
5592 [Line endings] are treated like spaces:
5593
5594 ```````````````````````````````` example
5595 ``
5596 foo
5597 ``
5598 .
5599 <p><code>foo</code></p>
5600 ````````````````````````````````
5601
5602
5603 Interior spaces and [line endings] are collapsed into
5604 single spaces, just as they would be by a browser:
5605
5606 ```````````````````````````````` example
5607 `foo   bar
5608   baz`
5609 .
5610 <p><code>foo bar baz</code></p>
5611 ````````````````````````````````
5612
5613
5614 Q: Why not just leave the spaces, since browsers will collapse them
5615 anyway?  A:  Because we might be targeting a non-HTML format, and we
5616 shouldn't rely on HTML-specific rendering assumptions.
5617
5618 (Existing implementations differ in their treatment of internal
5619 spaces and [line endings].  Some, including `Markdown.pl` and
5620 `showdown`, convert an internal [line ending] into a
5621 `<br />` tag.  But this makes things difficult for those who like to
5622 hard-wrap their paragraphs, since a line break in the midst of a code
5623 span will cause an unintended line break in the output.  Others just
5624 leave internal spaces as they are, which is fine if only HTML is being
5625 targeted.)
5626
5627 ```````````````````````````````` example
5628 `foo `` bar`
5629 .
5630 <p><code>foo `` bar</code></p>
5631 ````````````````````````````````
5632
5633
5634 Note that backslash escapes do not work in code spans. All backslashes
5635 are treated literally:
5636
5637 ```````````````````````````````` example
5638 `foo\`bar`
5639 .
5640 <p><code>foo\</code>bar`</p>
5641 ````````````````````````````````
5642
5643
5644 Backslash escapes are never needed, because one can always choose a
5645 string of *n* backtick characters as delimiters, where the code does
5646 not contain any strings of exactly *n* backtick characters.
5647
5648 Code span backticks have higher precedence than any other inline
5649 constructs except HTML tags and autolinks.  Thus, for example, this is
5650 not parsed as emphasized text, since the second `*` is part of a code
5651 span:
5652
5653 ```````````````````````````````` example
5654 *foo`*`
5655 .
5656 <p>*foo<code>*</code></p>
5657 ````````````````````````````````
5658
5659
5660 And this is not parsed as a link:
5661
5662 ```````````````````````````````` example
5663 [not a `link](/foo`)
5664 .
5665 <p>[not a <code>link](/foo</code>)</p>
5666 ````````````````````````````````
5667
5668
5669 Code spans, HTML tags, and autolinks have the same precedence.
5670 Thus, this is code:
5671
5672 ```````````````````````````````` example
5673 `<a href="`">`
5674 .
5675 <p><code>&lt;a href=&quot;</code>&quot;&gt;`</p>
5676 ````````````````````````````````
5677
5678
5679 But this is an HTML tag:
5680
5681 ```````````````````````````````` example
5682 <a href="`">`
5683 .
5684 <p><a href="`">`</p>
5685 ````````````````````````````````
5686
5687
5688 And this is code:
5689
5690 ```````````````````````````````` example
5691 `<http://foo.bar.`baz>`
5692 .
5693 <p><code>&lt;http://foo.bar.</code>baz&gt;`</p>
5694 ````````````````````````````````
5695
5696
5697 But this is an autolink:
5698
5699 ```````````````````````````````` example
5700 <http://foo.bar.`baz>`
5701 .
5702 <p><a href="http://foo.bar.%60baz">http://foo.bar.`baz</a>`</p>
5703 ````````````````````````````````
5704
5705
5706 When a backtick string is not closed by a matching backtick string,
5707 we just have literal backticks:
5708
5709 ```````````````````````````````` example
5710 ```foo``
5711 .
5712 <p>```foo``</p>
5713 ````````````````````````````````
5714
5715
5716 ```````````````````````````````` example
5717 `foo
5718 .
5719 <p>`foo</p>
5720 ````````````````````````````````
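
The matching of opening and closing backtick strings can be sketched as follows. This is a simplified illustration with a hypothetical helper name, `first_code_span`; it ignores the precedence interplay with HTML tags and autolinks described above, and it returns the raw (not yet whitespace-normalized) contents.

```python
import re


def first_code_span(text):
    """Return the raw contents of the first code span, or None if none closes."""
    # Each maximal run of backticks is one backtick string.
    runs = [(m.start(), m.end()) for m in re.finditer(r"`+", text)]
    for index, (open_start, open_end) in enumerate(runs):
        open_length = open_end - open_start
        for close_start, close_end in runs[index + 1:]:
            # Only a backtick string of equal length can close the span.
            if close_end - close_start == open_length:
                return text[open_end:close_start]
    return None
```

An opener with no equal-length closer is skipped, so its backticks end up as literal text.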
5721
5722
5723 ## Emphasis and strong emphasis
5724
5725 John Gruber's original [Markdown syntax
5726 description](http://daringfireball.net/projects/markdown/syntax#em) says:
5727
5728 > Markdown treats asterisks (`*`) and underscores (`_`) as indicators of
5729 > emphasis. Text wrapped with one `*` or `_` will be wrapped with an HTML
5730 > `<em>` tag; double `*`'s or `_`'s will be wrapped with an HTML `<strong>`
5731 > tag.
5732
5733 This is enough for most users, but these rules leave much undecided,
5734 especially when it comes to nested emphasis.  The original
5735 `Markdown.pl` test suite makes it clear that triple `***` and
5736 `___` delimiters can be used for strong emphasis, and most
5737 implementations have also allowed the following patterns:
5738
5739 ``` markdown
5740 ***strong emph***
5741 ***strong** in emph*
5742 ***emph* in strong**
5743 **in strong *emph***
5744 *in emph **strong***
5745 ```
5746
5747 The following patterns are less widely supported, but the intent
5748 is clear and they are useful (especially in contexts like bibliography
5749 entries):
5750
5751 ``` markdown
5752 *emph *with emph* in it*
5753 **strong **with strong** in it**
5754 ```
5755
5756 Many implementations have also restricted intraword emphasis to
5757 the `*` forms, to avoid unwanted emphasis in words containing
5758 internal underscores.  (It is best practice to put these in code
5759 spans, but users often do not.)
5760
5761 ``` markdown
5762 internal emphasis: foo*bar*baz
5763 no emphasis: foo_bar_baz
5764 ```
5765
5766 The rules given below capture all of these patterns, while allowing
5767 for efficient parsing strategies that do not backtrack.
5768
5769 First, some definitions.  A [delimiter run](@) is either
5770 a sequence of one or more `*` characters that is not preceded or
5771 followed by a `*` character, or a sequence of one or more `_`
5772 characters that is not preceded or followed by a `_` character.
5773
5774 A [left-flanking delimiter run](@) is
5775 a [delimiter run] that is (a) not followed by [Unicode whitespace],
5776 and (b) either not followed by a [punctuation character], or
5777 preceded by [Unicode whitespace] or a [punctuation character].
5778 For purposes of this definition, the beginning and the end of
5779 the line count as Unicode whitespace.
5780
5781 A [right-flanking delimiter run](@) is
5782 a [delimiter run] that is (a) not preceded by [Unicode whitespace],
5783 and (b) either not preceded by a [punctuation character], or
5784 followed by [Unicode whitespace] or a [punctuation character].
5785 For purposes of this definition, the beginning and the end of
5786 the line count as Unicode whitespace.
5787
5788 Here are some examples of delimiter runs.
5789
5790   - left-flanking but not right-flanking:
5791
5792     ```
5793     ***abc
5794       _abc
5795     **"abc"
5796      _"abc"
5797     ```
5798
5799   - right-flanking but not left-flanking:
5800
5801     ```
5802      abc***
5803      abc_
5804     "abc"**
5805     "abc"_
5806     ```
5807
5808   - Both left- and right-flanking:
5809
5810     ```
5811      abc***def
5812     "abc"_"def"
5813     ```
5814
5815   - Neither left- nor right-flanking:
5816
5817     ```
5818     abc *** def
5819     a _ b
5820     ```
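
The two flanking definitions can be checked mechanically. The sketch below is an illustration under stated assumptions: the function names are hypothetical, `string.punctuation` stands in for the ASCII punctuation characters, and a newline is used as the sentinel for the beginning and end of the line.

```python
import string
import unicodedata


def is_unicode_whitespace(ch):
    # Tab, carriage return, newline, form feed, or any Zs code point.
    return ch in "\t\r\n\f" or unicodedata.category(ch) == "Zs"


def is_punctuation(ch):
    # ASCII punctuation, or any Unicode P* general category.
    return ch in string.punctuation or unicodedata.category(ch).startswith("P")


def flanking(text, start, end):
    """Return (left_flanking, right_flanking) for the run text[start:end]."""
    # The beginning and end of the line count as whitespace.
    before = text[start - 1] if start > 0 else "\n"
    after = text[end] if end < len(text) else "\n"
    left = not is_unicode_whitespace(after) and (
        not is_punctuation(after)
        or is_unicode_whitespace(before)
        or is_punctuation(before)
    )
    right = not is_unicode_whitespace(before) and (
        not is_punctuation(before)
        or is_unicode_whitespace(after)
        or is_punctuation(after)
    )
    return left, right
```

For instance, `flanking("abc***def", 3, 6)` reports the run as both left- and right-flanking, while `flanking("abc *** def", 4, 7)` reports it as neither.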
5821
5822 (The idea of distinguishing left-flanking and right-flanking
5823 delimiter runs based on the character before and the character
5824 after comes from Roopesh Chander's
5825 [vfmd](http://www.vfmd.org/vfmd-spec/specification/#procedure-for-identifying-emphasis-tags).
5826 vfmd uses the terminology "emphasis indicator string" instead of "delimiter
5827 run," and its rules for distinguishing left- and right-flanking runs
5828 are a bit more complex than the ones given here.)
5829
5830 The following rules define emphasis and strong emphasis:
5831
5832 1.  A single `*` character [can open emphasis](@)
5833     iff (if and only if) it is part of a [left-flanking delimiter run].
5834
5835 2.  A single `_` character [can open emphasis] iff
5836     it is part of a [left-flanking delimiter run]
5837     and either (a) not part of a [right-flanking delimiter run]
5838     or (b) part of a [right-flanking delimiter run]
5839     preceded by punctuation.
5840
5841 3.  A single `*` character [can close emphasis](@)
5842     iff it is part of a [right-flanking delimiter run].
5843
5844 4.  A single `_` character [can close emphasis] iff
5845     it is part of a [right-flanking delimiter run]
5846     and either (a) not part of a [left-flanking delimiter run]
5847     or (b) part of a [left-flanking delimiter run]
5848     followed by punctuation.
5849
5850 5.  A double `**` [can open strong emphasis](@)
5851     iff it is part of a [left-flanking delimiter run].
5852
5853 6.  A double `__` [can open strong emphasis] iff
5854     it is part of a [left-flanking delimiter run]
5855     and either (a) not part of a [right-flanking delimiter run]
5856     or (b) part of a [right-flanking delimiter run]
5857     preceded by punctuation.
5858
5859 7.  A double `**` [can close strong emphasis](@)
5860     iff it is part of a [right-flanking delimiter run].
5861
5862 8.  A double `__` [can close strong emphasis] iff
5863     it is part of a [right-flanking delimiter run]
5864     and either (a) not part of a [left-flanking delimiter run]
5865     or (b) part of a [left-flanking delimiter run]
5866     followed by punctuation.
5867
5868 9.  Emphasis begins with a delimiter that [can open emphasis] and ends
5869     with a delimiter that [can close emphasis], and that uses the same
5870     character (`_` or `*`) as the opening delimiter.  There must
5871     be a nonempty sequence of inlines between the open delimiter
5872     and the closing delimiter; these form the contents of the emphasis
5873     inline.
5874
5875 10. Strong emphasis begins with a delimiter that
5876     [can open strong emphasis] and ends with a delimiter that
5877     [can close strong emphasis], and that uses the same character
5878     (`_` or `*`) as the opening delimiter.
5879     There must be a nonempty sequence of inlines between the open
5880     delimiter and the closing delimiter; these form the contents of
5881     the strong emphasis inline.
5882
5883 11. A literal `*` character cannot occur at the beginning or end of
5884     `*`-delimited emphasis or `**`-delimited strong emphasis, unless it
5885     is backslash-escaped.
5886
5887 12. A literal `_` character cannot occur at the beginning or end of
5888     `_`-delimited emphasis or `__`-delimited strong emphasis, unless it
5889     is backslash-escaped.
5890
5891 Where rules 1--12 above are compatible with multiple parsings,
5892 the following principles resolve ambiguity:
5893
5894 13. The number of nestings should be minimized. Thus, for example,
5895     an interpretation `<strong>...</strong>` is always preferred to
5896     `<em><em>...</em></em>`.
5897
5898 14. An interpretation `<strong><em>...</em></strong>` is always
5899     preferred to `<em><strong>...</strong></em>`.
5900
5901 15. When two potential emphasis or strong emphasis spans overlap,
5902     so that the second begins before the first ends and ends after
5903     the first ends, the first takes precedence. Thus, for example,
5904     `*foo _bar* baz_` is parsed as `<em>foo _bar</em> baz_` rather
5905     than `*foo <em>bar* baz</em>`.  For the same reason,
5906     `**foo*bar**` is parsed as `<em><em>foo</em>bar</em>*`
5907     rather than `<strong>foo*bar</strong>`.
5908
5909 16. When there are two potential emphasis or strong emphasis spans
5910     with the same closing delimiter, the shorter one (the one that
5911     opens later) takes precedence. Thus, for example,
5912     `**foo **bar baz**` is parsed as `**foo <strong>bar baz</strong>`
5913     rather than `<strong>foo **bar baz</strong>`.
5914
5915 17. Inline code spans, links, images, and HTML tags group more tightly
5916     than emphasis.  So, when there is a choice between an interpretation
5917     that contains one of these elements and one that does not, the
5918     former always wins.  Thus, for example, `*[foo*](bar)` is
5919     parsed as `*<a href="bar">foo*</a>` rather than as
5920     `<em>[foo</em>](bar)`.
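
Rules 1--8 reduce to two small predicates once the flanking properties of a delimiter run are known. The sketch below is illustrative only: the function names and boolean parameters are hypothetical, and the caller is assumed to have computed the flanking flags and the neighbouring punctuation beforehand. The same conditions serve for both emphasis and strong emphasis, since rules 5--8 repeat the conditions of rules 1--4.

```python
def can_open(delimiter_char, left_flanking, right_flanking, preceded_by_punctuation):
    """Rules 1, 2, 5 and 6: may this delimiter open (strong) emphasis?"""
    if delimiter_char == "*":
        return left_flanking
    # `_` additionally must not sit inside a word.
    return left_flanking and (not right_flanking or preceded_by_punctuation)


def can_close(delimiter_char, left_flanking, right_flanking, followed_by_punctuation):
    """Rules 3, 4, 7 and 8: may this delimiter close (strong) emphasis?"""
    if delimiter_char == "*":
        return right_flanking
    return right_flanking and (not left_flanking or followed_by_punctuation)
```

The extra condition on `_` is what blocks intraword emphasis: in `foo_bar_` the first `_` is both left- and right-flanking and is preceded by a letter, so it cannot open emphasis, whereas the `*` in `foo*bar*` can.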
5921
5922 These rules can be illustrated through a series of examples.
5923
5924 Rule 1:
5925
5926 ```````````````````````````````` example
5927 *foo bar*
5928 .
5929 <p><em>foo bar</em></p>
5930 ````````````````````````````````
5931
5932
5933 This is not emphasis, because the opening `*` is followed by
5934 whitespace, and hence not part of a [left-flanking delimiter run]:
5935
5936 ```````````````````````````````` example
5937 a * foo bar*
5938 .
5939 <p>a * foo bar*</p>
5940 ````````````````````````````````
5941
5942
5943 This is not emphasis, because the opening `*` is preceded
5944 by an alphanumeric and followed by punctuation, and hence
5945 not part of a [left-flanking delimiter run]:
5946
5947 ```````````````````````````````` example
5948 a*"foo"*
5949 .
5950 <p>a*&quot;foo&quot;*</p>
5951 ````````````````````````````````
5952
5953
5954 Unicode nonbreaking spaces count as whitespace, too:
5955
5956 ```````````````````````````````` example
5957 * a *
5958 .
5959 <p>* a *</p>
5960 ````````````````````````````````
5961
5962
5963 Intraword emphasis with `*` is permitted:
5964
5965 ```````````````````````````````` example
5966 foo*bar*
5967 .
5968 <p>foo<em>bar</em></p>
5969 ````````````````````````````````
5970
5971
5972 ```````````````````````````````` example
5973 5*6*78
5974 .
5975 <p>5<em>6</em>78</p>
5976 ````````````````````````````````
5977
5978
5979 Rule 2:
5980
5981 ```````````````````````````````` example
5982 _foo bar_
5983 .
5984 <p><em>foo bar</em></p>
5985 ````````````````````````````````
5986
5987
5988 This is not emphasis, because the opening `_` is followed by
5989 whitespace:
5990
5991 ```````````````````````````````` example
5992 _ foo bar_
5993 .
5994 <p>_ foo bar_</p>
5995 ````````````````````````````````
5996
5997
5998 This is not emphasis, because the opening `_` is preceded
5999 by an alphanumeric and followed by punctuation:
6000
6001 ```````````````````````````````` example
6002 a_"foo"_
6003 .
6004 <p>a_&quot;foo&quot;_</p>
6005 ````````````````````````````````
6006
6007
6008 Emphasis with `_` is not allowed inside words:
6009
6010 ```````````````````````````````` example
6011 foo_bar_
6012 .
6013 <p>foo_bar_</p>
6014 ````````````````````````````````
6015
6016
6017 ```````````````````````````````` example
6018 5_6_78
6019 .
6020 <p>5_6_78</p>
6021 ````````````````````````````````
6022
6023
6024 ```````````````````````````````` example
6025 пристаням_стремятся_
6026 .
6027 <p>пристаням_стремятся_</p>
6028 ````````````````````````````````
6029
6030
6031 Here `_` does not generate emphasis, because the first delimiter run
6032 is right-flanking and the second left-flanking:
6033
6034 ```````````````````````````````` example
6035 aa_"bb"_cc
6036 .
6037 <p>aa_&quot;bb&quot;_cc</p>
6038 ````````````````````````````````
6039
6040
6041 This is emphasis, even though the opening delimiter is
6042 both left- and right-flanking, because it is preceded by
6043 punctuation:
6044
6045 ```````````````````````````````` example
6046 foo-_(bar)_
6047 .
6048 <p>foo-<em>(bar)</em></p>
6049 ````````````````````````````````
6050
6051
6052 Rule 3:
6053
6054 This is not emphasis, because the closing delimiter does
6055 not match the opening delimiter:
6056
6057 ```````````````````````````````` example
6058 _foo*
6059 .
6060 <p>_foo*</p>
6061 ````````````````````````````````
6062
6063
6064 This is not emphasis, because the closing `*` is preceded by
6065 whitespace:
6066
6067 ```````````````````````````````` example
6068 *foo bar *
6069 .
6070 <p>*foo bar *</p>
6071 ````````````````````````````````
6072
6073
6074 A newline also counts as whitespace:
6075
6076 ```````````````````````````````` example
6077 *foo bar
6078 *
6079 .
6080 <p>*foo bar</p>
6081 <ul>
6082 <li></li>
6083 </ul>
6084 ````````````````````````````````
6085
6086
6087 This is not emphasis, because the second `*` is
6088 preceded by punctuation and followed by an alphanumeric
6089 (hence it is not part of a [right-flanking delimiter run]):
6090
6091 ```````````````````````````````` example
6092 *(*foo)
6093 .
6094 <p>*(*foo)</p>
6095 ````````````````````````````````
6096
6097
6098 The point of this restriction is more easily appreciated
6099 with this example:
6100
6101 ```````````````````````````````` example
6102 *(*foo*)*
6103 .
6104 <p><em>(<em>foo</em>)</em></p>
6105 ````````````````````````````````
6106
6107
6108 Intraword emphasis with `*` is allowed:
6109
6110 ```````````````````````````````` example
6111 *foo*bar
6112 .
6113 <p><em>foo</em>bar</p>
6114 ````````````````````````````````
6115
6116
6117
6118 Rule 4:
6119
6120 This is not emphasis, because the closing `_` is preceded by
6121 whitespace:
6122
6123 ```````````````````````````````` example
6124 _foo bar _
6125 .
6126 <p>_foo bar _</p>
6127 ````````````````````````````````
6128
6129
6130 This is not emphasis, because the second `_` is
6131 preceded by punctuation and followed by an alphanumeric:
6132
6133 ```````````````````````````````` example
6134 _(_foo)
6135 .
6136 <p>_(_foo)</p>
6137 ````````````````````````````````
6138
6139
6140 This is emphasis within emphasis:
6141
6142 ```````````````````````````````` example
6143 _(_foo_)_
6144 .
6145 <p><em>(<em>foo</em>)</em></p>
6146 ````````````````````````````````
6147
6148
6149 Intraword emphasis is disallowed for `_`:
6150
6151 ```````````````````````````````` example
6152 _foo_bar
6153 .
6154 <p>_foo_bar</p>
6155 ````````````````````````````````
6156
6157
6158 ```````````````````````````````` example
6159 _пристаням_стремятся
6160 .
6161 <p>_пристаням_стремятся</p>
6162 ````````````````````````````````
6163
6164
6165 ```````````````````````````````` example
6166 _foo_bar_baz_
6167 .
6168 <p><em>foo_bar_baz</em></p>
6169 ````````````````````````````````
6170
6171
6172 This is emphasis, even though the closing delimiter is
6173 both left- and right-flanking, because it is followed by
6174 punctuation:
6175
6176 ```````````````````````````````` example
6177 _(bar)_.
6178 .
6179 <p><em>(bar)</em>.</p>
6180 ````````````````````````````````
6181
6182
6183 Rule 5:
6184
6185 ```````````````````````````````` example
6186 **foo bar**
6187 .
6188 <p><strong>foo bar</strong></p>
6189 ````````````````````````````````
6190
6191
6192 This is not strong emphasis, because the opening delimiter is
6193 followed by whitespace:
6194
6195 ```````````````````````````````` example
6196 ** foo bar**
6197 .
6198 <p>** foo bar**</p>
6199 ````````````````````````````````
6200
6201
6202 This is not strong emphasis, because the opening `**` is preceded
6203 by an alphanumeric and followed by punctuation, and hence
6204 not part of a [left-flanking delimiter run]:
6205
6206 ```````````````````````````````` example
6207 a**"foo"**
6208 .
6209 <p>a**&quot;foo&quot;**</p>
6210 ````````````````````````````````
6211
6212
6213 Intraword strong emphasis with `**` is permitted:
6214
6215 ```````````````````````````````` example
6216 foo**bar**
6217 .
6218 <p>foo<strong>bar</strong></p>
6219 ````````````````````````````````
6220
6221
6222 Rule 6:
6223
6224 ```````````````````````````````` example
6225 __foo bar__
6226 .
6227 <p><strong>foo bar</strong></p>
6228 ````````````````````````````````
6229
6230
6231 This is not strong emphasis, because the opening delimiter is
6232 followed by whitespace:
6233
6234 ```````````````````````````````` example
6235 __ foo bar__
6236 .
6237 <p>__ foo bar__</p>
6238 ````````````````````````````````
6239
6240
6241 A newline counts as whitespace:
6242 ```````````````````````````````` example
6243 __
6244 foo bar__
6245 .
6246 <p>__
6247 foo bar__</p>
6248 ````````````````````````````````
6249
6250
6251 This is not strong emphasis, because the opening `__` is preceded
6252 by an alphanumeric and followed by punctuation:
6253
6254 ```````````````````````````````` example
6255 a__"foo"__
6256 .
6257 <p>a__&quot;foo&quot;__</p>
6258 ````````````````````````````````
6259
6260
6261 Intraword strong emphasis is forbidden with `__`:
6262
6263 ```````````````````````````````` example
6264 foo__bar__
6265 .
6266 <p>foo__bar__</p>
6267 ````````````````````````````````
6268
6269
6270 ```````````````````````````````` example
6271 5__6__78
6272 .
6273 <p>5__6__78</p>
6274 ````````````````````````````````
6275
6276
6277 ```````````````````````````````` example
6278 пристаням__стремятся__
6279 .
6280 <p>пристаням__стремятся__</p>
6281 ````````````````````````````````
6282
6283
6284 ```````````````````````````````` example
6285 __foo, __bar__, baz__
6286 .
6287 <p><strong>foo, <strong>bar</strong>, baz</strong></p>
6288 ````````````````````````````````
6289
6290
6291 This is strong emphasis, even though the opening delimiter is
6292 both left- and right-flanking, because it is preceded by
6293 punctuation:
6294
6295 ```````````````````````````````` example
6296 foo-__(bar)__
6297 .
6298 <p>foo-<strong>(bar)</strong></p>
6299 ````````````````````````````````
6300
6301
6302
6303 Rule 7:
6304
6305 This is not strong emphasis, because the closing delimiter is preceded
6306 by whitespace:
6307
6308 ```````````````````````````````` example
6309 **foo bar **
6310 .
6311 <p>**foo bar **</p>
6312 ````````````````````````````````
6313
6314
6315 (Nor can it be interpreted as an emphasized `*foo bar *`, because of
6316 Rule 11.)
6317
6318 This is not strong emphasis, because the second `**` is
6319 preceded by punctuation and followed by an alphanumeric:
6320
6321 ```````````````````````````````` example
6322 **(**foo)
6323 .
6324 <p>**(**foo)</p>
6325 ````````````````````````````````
6326
6327
6328 The point of this restriction is more easily appreciated
6329 with these examples:
6330
6331 ```````````````````````````````` example
6332 *(**foo**)*
6333 .
6334 <p><em>(<strong>foo</strong>)</em></p>
6335 ````````````````````````````````
6336
6337
6338 ```````````````````````````````` example
6339 **Gomphocarpus (*Gomphocarpus physocarpus*, syn.
6340 *Asclepias physocarpa*)**
6341 .
6342 <p><strong>Gomphocarpus (<em>Gomphocarpus physocarpus</em>, syn.
6343 <em>Asclepias physocarpa</em>)</strong></p>
6344 ````````````````````````````````
6345
6346
6347 ```````````````````````````````` example
6348 **foo "*bar*" foo**
6349 .
6350 <p><strong>foo &quot;<em>bar</em>&quot; foo</strong></p>
6351 ````````````````````````````````
6352
6353
6354 Intraword strong emphasis with `**` is allowed:
6355
6356 ```````````````````````````````` example
6357 **foo**bar
6358 .
6359 <p><strong>foo</strong>bar</p>
6360 ````````````````````````````````
6361
6362
6363 Rule 8:
6364
6365 This is not strong emphasis, because the closing delimiter is
6366 preceded by whitespace:
6367
6368 ```````````````````````````````` example
6369 __foo bar __
6370 .
6371 <p>__foo bar __</p>
6372 ````````````````````````````````
6373
6374
6375 This is not strong emphasis, because the second `__` is
6376 preceded by punctuation and followed by an alphanumeric:
6377
6378 ```````````````````````````````` example
6379 __(__foo)
6380 .
6381 <p>__(__foo)</p>
6382 ````````````````````````````````
6383
6384
6385 The point of this restriction is more easily appreciated
6386 with this example:
6387
6388 ```````````````````````````````` example
6389 _(__foo__)_
6390 .
6391 <p><em>(<strong>foo</strong>)</em></p>
6392 ````````````````````````````````
6393
6394
6395 Intraword strong emphasis is forbidden with `__`:
6396
6397 ```````````````````````````````` example
6398 __foo__bar
6399 .
6400 <p>__foo__bar</p>
6401 ````````````````````````````````
6402
6403
6404 ```````````````````````````````` example
6405 __пристаням__стремятся
6406 .
6407 <p>__пристаням__стремятся</p>
6408 ````````````````````````````````
6409
6410
6411 ```````````````````````````````` example
6412 __foo__bar__baz__
6413 .
6414 <p><strong>foo__bar__baz</strong></p>
6415 ````````````````````````````````
6416
6417
6418 This is strong emphasis, even though the closing delimiter is
6419 both left- and right-flanking, because it is followed by
6420 punctuation:
6421
6422 ```````````````````````````````` example
6423 __(bar)__.
6424 .
6425 <p><strong>(bar)</strong>.</p>
6426 ````````````````````````````````
6427
6428
6429 Rule 9:
6430
6431 Any nonempty sequence of inline elements can be the contents of an
6432 emphasized span.
6433
6434 ```````````````````````````````` example
6435 *foo [bar](/url)*
6436 .
6437 <p><em>foo <a href="/url">bar</a></em></p>
6438 ````````````````````````````````
6439
6440
6441 ```````````````````````````````` example
6442 *foo
6443 bar*
6444 .
6445 <p><em>foo
6446 bar</em></p>
6447 ````````````````````````````````
6448
6449
6450 In particular, emphasis and strong emphasis can be nested
6451 inside emphasis:
6452
6453 ```````````````````````````````` example
6454 _foo __bar__ baz_
6455 .
6456 <p><em>foo <strong>bar</strong> baz</em></p>
6457 ````````````````````````````````
6458
6459
6460 ```````````````````````````````` example
6461 _foo _bar_ baz_
6462 .
6463 <p><em>foo <em>bar</em> baz</em></p>
6464 ````````````````````````````````
6465
6466
6467 ```````````````````````````````` example
6468 __foo_ bar_
6469 .
6470 <p><em><em>foo</em> bar</em></p>
6471 ````````````````````````````````
6472
6473
6474 ```````````````````````````````` example
6475 *foo *bar**
6476 .
6477 <p><em>foo <em>bar</em></em></p>
6478 ````````````````````````````````
6479
6480
6481 ```````````````````````````````` example
6482 *foo **bar** baz*
6483 .
6484 <p><em>foo <strong>bar</strong> baz</em></p>
6485 ````````````````````````````````
6486
6487
6488 But note:
6489
6490 ```````````````````````````````` example
6491 *foo**bar**baz*
6492 .
6493 <p><em>foo</em><em>bar</em><em>baz</em></p>
6494 ````````````````````````````````
6495
6496
6497 The difference is that in the preceding case, the internal delimiters
6498 [can close emphasis], while in the cases with spaces, they cannot.
6499
6500 ```````````````````````````````` example
6501 ***foo** bar*
6502 .
6503 <p><em><strong>foo</strong> bar</em></p>
6504 ````````````````````````````````
6505
6506
6507 ```````````````````````````````` example
6508 *foo **bar***
6509 .
6510 <p><em>foo <strong>bar</strong></em></p>
6511 ````````````````````````````````
6512
6513
6514 Note, however, that in the following case we get no strong
6515 emphasis, because the opening delimiter is closed by the first
6516 `*` before `bar`:
6517
6518 ```````````````````````````````` example
6519 *foo**bar***
6520 .
6521 <p><em>foo</em><em>bar</em>**</p>
6522 ````````````````````````````````
6523
6524
6525
6526 Indefinite levels of nesting are possible:
6527
6528 ```````````````````````````````` example
6529 *foo **bar *baz* bim** bop*
6530 .
6531 <p><em>foo <strong>bar <em>baz</em> bim</strong> bop</em></p>
6532 ````````````````````````````````
6533
6534
6535 ```````````````````````````````` example
6536 *foo [*bar*](/url)*
6537 .
6538 <p><em>foo <a href="/url"><em>bar</em></a></em></p>
6539 ````````````````````````````````
6540
6541
6542 There can be no empty emphasis or strong emphasis:
6543
6544 ```````````````````````````````` example
6545 ** is not an empty emphasis
6546 .
6547 <p>** is not an empty emphasis</p>
6548 ````````````````````````````````
6549
6550
6551 ```````````````````````````````` example
6552 **** is not an empty strong emphasis
6553 .
6554 <p>**** is not an empty strong emphasis</p>
6555 ````````````````````````````````
6556
6557
6558
6559 Rule 10:
6560
6561 Any nonempty sequence of inline elements can be the contents of a
6562 strongly emphasized span.
6563
6564 ```````````````````````````````` example
6565 **foo [bar](/url)**
6566 .
6567 <p><strong>foo <a href="/url">bar</a></strong></p>
6568 ````````````````````````````````
6569
6570
6571 ```````````````````````````````` example
6572 **foo
6573 bar**
6574 .
6575 <p><strong>foo
6576 bar</strong></p>
6577 ````````````````````````````````
6578
6579
6580 In particular, emphasis and strong emphasis can be nested
6581 inside strong emphasis:
6582
6583 ```````````````````````````````` example
6584 __foo _bar_ baz__
6585 .
6586 <p><strong>foo <em>bar</em> baz</strong></p>
6587 ````````````````````````````````
6588
6589
6590 ```````````````````````````````` example
6591 __foo __bar__ baz__
6592 .
6593 <p><strong>foo <strong>bar</strong> baz</strong></p>
6594 ````````````````````````````````
6595
6596
6597 ```````````````````````````````` example
6598 ____foo__ bar__
6599 .
6600 <p><strong><strong>foo</strong> bar</strong></p>
6601 ````````````````````````````````
6602
6603
6604 ```````````````````````````````` example
6605 **foo **bar****
6606 .
6607 <p><strong>foo <strong>bar</strong></strong></p>
6608 ````````````````````````````````
6609
6610
6611 ```````````````````````````````` example
6612 **foo *bar* baz**
6613 .
6614 <p><strong>foo <em>bar</em> baz</strong></p>
6615 ````````````````````````````````
6616
6617
6618 But note:
6619
6620 ```````````````````````````````` example
6621 **foo*bar*baz**
6622 .
6623 <p><em><em>foo</em>bar</em>baz**</p>
6624 ````````````````````````````````
6625
6626
6627 The difference is that in the preceding case, the internal delimiters
6628 [can close emphasis], while in the cases with spaces, they cannot.
6629
6630 ```````````````````````````````` example
6631 ***foo* bar**
6632 .
6633 <p><strong><em>foo</em> bar</strong></p>
6634 ````````````````````````````````
6635
6636
6637 ```````````````````````````````` example
6638 **foo *bar***
6639 .
6640 <p><strong>foo <em>bar</em></strong></p>
6641 ````````````````````````````````
6642
6643
6644 Indefinite levels of nesting are possible:
6645
6646 ```````````````````````````````` example
6647 **foo *bar **baz**
6648 bim* bop**
6649 .
6650 <p><strong>foo <em>bar <strong>baz</strong>
6651 bim</em> bop</strong></p>
6652 ````````````````````````````````
6653
6654
6655 ```````````````````````````````` example
6656 **foo [*bar*](/url)**
6657 .
6658 <p><strong>foo <a href="/url"><em>bar</em></a></strong></p>
6659 ````````````````````````````````
6660
6661
6662 There can be no empty emphasis or strong emphasis:
6663
6664 ```````````````````````````````` example
6665 __ is not an empty emphasis
6666 .
6667 <p>__ is not an empty emphasis</p>
6668 ````````````````````````````````
6669
6670
6671 ```````````````````````````````` example
6672 ____ is not an empty strong emphasis
6673 .
6674 <p>____ is not an empty strong emphasis</p>
6675 ````````````````````````````````
6676
6677
6678
6679 Rule 11:
6680
6681 ```````````````````````````````` example
6682 foo ***
6683 .
6684 <p>foo ***</p>
6685 ````````````````````````````````
6686
6687
6688 ```````````````````````````````` example
6689 foo *\**
6690 .
6691 <p>foo <em>*</em></p>
6692 ````````````````````````````````
6693
6694
6695 ```````````````````````````````` example
6696 foo *_*
6697 .
6698 <p>foo <em>_</em></p>
6699 ````````````````````````````````
6700
6701
6702 ```````````````````````````````` example
6703 foo *****
6704 .
6705 <p>foo *****</p>
6706 ````````````````````````````````
6707
6708
6709 ```````````````````````````````` example
6710 foo **\***
6711 .
6712 <p>foo <strong>*</strong></p>
6713 ````````````````````````````````
6714
6715
6716 ```````````````````````````````` example
6717 foo **_**
6718 .
6719 <p>foo <strong>_</strong></p>
6720 ````````````````````````````````
6721
6722
6723 Note that when delimiters do not match evenly, Rule 11 determines
6724 that the excess literal `*` characters will appear outside of the
6725 emphasis, rather than inside it:
6726
6727 ```````````````````````````````` example
6728 **foo*
6729 .
6730 <p>*<em>foo</em></p>
6731 ````````````````````````````````
6732
6733
6734 ```````````````````````````````` example
6735 *foo**
6736 .
6737 <p><em>foo</em>*</p>
6738 ````````````````````````````````
6739
6740
6741 ```````````````````````````````` example
6742 ***foo**
6743 .
6744 <p>*<strong>foo</strong></p>
6745 ````````````````````````````````
6746
6747
6748 ```````````````````````````````` example
6749 ****foo*
6750 .
6751 <p>***<em>foo</em></p>
6752 ````````````````````````````````
6753
6754
6755 ```````````````````````````````` example
6756 **foo***
6757 .
6758 <p><strong>foo</strong>*</p>
6759 ````````````````````````````````
6760
6761
6762 ```````````````````````````````` example
6763 *foo****
6764 .
6765 <p><em>foo</em>***</p>
6766 ````````````````````````````````
6767
6768
6769
6770 Rule 12:
6771
6772 ```````````````````````````````` example
6773 foo ___
6774 .
6775 <p>foo ___</p>
6776 ````````````````````````````````
6777
6778
6779 ```````````````````````````````` example
6780 foo _\__
6781 .
6782 <p>foo <em>_</em></p>
6783 ````````````````````````````````
6784
6785
6786 ```````````````````````````````` example
6787 foo _*_
6788 .
6789 <p>foo <em>*</em></p>
6790 ````````````````````````````````
6791
6792
6793 ```````````````````````````````` example
6794 foo _____
6795 .
6796 <p>foo _____</p>
6797 ````````````````````````````````
6798
6799
6800 ```````````````````````````````` example
6801 foo __\___
6802 .
6803 <p>foo <strong>_</strong></p>
6804 ````````````````````````````````
6805
6806
6807 ```````````````````````````````` example
6808 foo __*__
6809 .
6810 <p>foo <strong>*</strong></p>
6811 ````````````````````````````````
6812
6813
6814 ```````````````````````````````` example
6815 __foo_
6816 .
6817 <p>_<em>foo</em></p>
6818 ````````````````````````````````
6819
6820
6821 Note that when delimiters do not match evenly, Rule 12 determines
6822 that the excess literal `_` characters will appear outside of the
6823 emphasis, rather than inside it:
6824
6825 ```````````````````````````````` example
6826 _foo__
6827 .
6828 <p><em>foo</em>_</p>
6829 ````````````````````````````````
6830
6831
6832 ```````````````````````````````` example
6833 ___foo__
6834 .
6835 <p>_<strong>foo</strong></p>
6836 ````````````````````````````````
6837
6838
6839 ```````````````````````````````` example
6840 ____foo_
6841 .
6842 <p>___<em>foo</em></p>
6843 ````````````````````````````````
6844
6845
6846 ```````````````````````````````` example
6847 __foo___
6848 .
6849 <p><strong>foo</strong>_</p>
6850 ````````````````````````````````
6851
6852
6853 ```````````````````````````````` example
6854 _foo____
6855 .
6856 <p><em>foo</em>___</p>
6857 ````````````````````````````````
6858
6859
6860 Rule 13 implies that if you want emphasis nested directly inside
6861 emphasis, you must use different delimiters:
6862
6863 ```````````````````````````````` example
6864 **foo**
6865 .
6866 <p><strong>foo</strong></p>
6867 ````````````````````````````````
6868
6869
6870 ```````````````````````````````` example
6871 *_foo_*
6872 .
6873 <p><em><em>foo</em></em></p>
6874 ````````````````````````````````
6875
6876
6877 ```````````````````````````````` example
6878 __foo__
6879 .
6880 <p><strong>foo</strong></p>
6881 ````````````````````````````````
6882
6883
6884 ```````````````````````````````` example
6885 _*foo*_
6886 .
6887 <p><em><em>foo</em></em></p>
6888 ````````````````````````````````
6889
6890
6891 However, strong emphasis within strong emphasis is possible without
6892 switching delimiters:
6893
6894 ```````````````````````````````` example
6895 ****foo****
6896 .
6897 <p><strong><strong>foo</strong></strong></p>
6898 ````````````````````````````````
6899
6900
6901 ```````````````````````````````` example
6902 ____foo____
6903 .
6904 <p><strong><strong>foo</strong></strong></p>
6905 ````````````````````````````````
6906
6907
6908
6909 Rule 13 can be applied to arbitrarily long sequences of
6910 delimiters:
6911
6912 ```````````````````````````````` example
6913 ******foo******
6914 .
6915 <p><strong><strong><strong>foo</strong></strong></strong></p>
6916 ````````````````````````````````
6917
6918
6919 Rule 14:
6920
6921 ```````````````````````````````` example
6922 ***foo***
6923 .
6924 <p><strong><em>foo</em></strong></p>
6925 ````````````````````````````````
6926
6927
6928 ```````````````````````````````` example
6929 _____foo_____
6930 .
6931 <p><strong><strong><em>foo</em></strong></strong></p>
6932 ````````````````````````````````
6933
6934
6935 Rule 15:
6936
6937 ```````````````````````````````` example
6938 *foo _bar* baz_
6939 .
6940 <p><em>foo _bar</em> baz_</p>
6941 ````````````````````````````````
6942
6943
6944 ```````````````````````````````` example
6945 **foo*bar**
6946 .
6947 <p><em><em>foo</em>bar</em>*</p>
6948 ````````````````````````````````
6949
6950
6951 ```````````````````````````````` example
6952 *foo __bar *baz bim__ bam*
6953 .
6954 <p><em>foo <strong>bar *baz bim</strong> bam</em></p>
6955 ````````````````````````````````
6956
6957
6958 Rule 16:
6959
6960 ```````````````````````````````` example
6961 **foo **bar baz**
6962 .
6963 <p>**foo <strong>bar baz</strong></p>
6964 ````````````````````````````````
6965
6966
6967 ```````````````````````````````` example
6968 *foo *bar baz*
6969 .
6970 <p>*foo <em>bar baz</em></p>
6971 ````````````````````````````````
6972
6973
6974 Rule 17:
6975
6976 ```````````````````````````````` example
6977 *[bar*](/url)
6978 .
6979 <p>*<a href="/url">bar*</a></p>
6980 ````````````````````````````````
6981
6982
6983 ```````````````````````````````` example
6984 _foo [bar_](/url)
6985 .
6986 <p>_foo <a href="/url">bar_</a></p>
6987 ````````````````````````````````
6988
6989
6990 ```````````````````````````````` example
6991 *<img src="foo" title="*"/>
6992 .
6993 <p>*<img src="foo" title="*"/></p>
6994 ````````````````````````````````
6995
6996
6997 ```````````````````````````````` example
6998 **<a href="**">
6999 .
7000 <p>**<a href="**"></p>
7001 ````````````````````````````````
7002
7003
7004 ```````````````````````````````` example
7005 __<a href="__">
7006 .
7007 <p>__<a href="__"></p>
7008 ````````````````````````````````
7009
7010
7011 ```````````````````````````````` example
7012 *a `*`*
7013 .
7014 <p><em>a <code>*</code></em></p>
7015 ````````````````````````````````
7016
7017
7018 ```````````````````````````````` example
7019 _a `_`_
7020 .
7021 <p><em>a <code>_</code></em></p>
7022 ````````````````````````````````
7023
7024
7025 ```````````````````````````````` example
7026 **a<http://foo.bar/?q=**>
7027 .
7028 <p>**a<a href="http://foo.bar/?q=**">http://foo.bar/?q=**</a></p>
7029 ````````````````````````````````
7030
7031
7032 ```````````````````````````````` example
7033 __a<http://foo.bar/?q=__>
7034 .
7035 <p>__a<a href="http://foo.bar/?q=__">http://foo.bar/?q=__</a></p>
7036 ````````````````````````````````
7037
7038
7039
7040 ## Links
7041
7042 A link contains [link text] (the visible text), a [link destination]
7043 (the URI that the link points to), and optionally a [link title].
7044 There are two basic kinds of links in Markdown.  In [inline links] the
7045 destination and title are given immediately after the link text.  In
7046 [reference links] the destination and title are defined elsewhere in
7047 the document.
7048
7049 A [link text](@) consists of a sequence of zero or more
7050 inline elements enclosed by square brackets (`[` and `]`).  The
7051 following rules apply:
7052
7053 - Links may not contain other links, at any level of nesting. If
7054   multiple otherwise valid link definitions appear nested inside each
7055   other, the inner-most definition is used.
7056
7057 - Brackets are allowed in the [link text] only if (a) they
7058   are backslash-escaped or (b) they appear as a matched pair of brackets,
7059   with an open bracket `[`, a sequence of zero or more inlines, and
7060   a close bracket `]`.
7061
7062 - Backtick [code spans], [autolinks], and raw [HTML tags] bind more tightly
7063   than the brackets in link text.  Thus, for example,
7064   `` [foo`]` `` could not be a link text, since the second `]`
7065   is part of a code span.
7066
7067 - The brackets in link text bind more tightly than markers for
7068   [emphasis and strong emphasis]. Thus, for example, `*[foo*](url)` is a link.
7069
7070 A [link destination](@) consists of either
7071
7072 - a sequence of zero or more characters between an opening `<` and a
7073   closing `>` that contains no spaces, line breaks, or unescaped
7074   `<` or `>` characters, or
7075
7076 - a nonempty sequence of characters that does not include
7077   ASCII space or control characters, and includes parentheses
7078   only if (a) they are backslash-escaped or (b) they are part of
7079   a balanced pair of unescaped parentheses that is not itself
7080   inside a balanced pair of unescaped parentheses.
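
The second form of link destination can be checked with a small scanner.
The following is only a sketch in Python, not part of the spec: the name
`is_bare_link_destination` is hypothetical, and it assumes that a backslash
followed by any ASCII punctuation character counts as a backslash escape.

```python
import string


def is_bare_link_destination(dest: str) -> bool:
    """Check a candidate for the second (non-`<...>`) destination form."""
    if not dest:
        return False  # this form must be nonempty
    depth = 0
    i = 0
    while i < len(dest):
        ch = dest[i]
        if ch == " " or ord(ch) < 0x20 or ord(ch) == 0x7F:
            return False  # ASCII space or control character
        if ch == "\\" and i + 1 < len(dest) and dest[i + 1] in string.punctuation:
            i += 2  # backslash escape: the escaped character is never a delimiter
            continue
        if ch == "(":
            depth += 1
            if depth > 1:
                return False  # a pair nested inside another unescaped pair
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False  # closing parenthesis without an opener
        i += 1
    return depth == 0  # every unescaped "(" must have been closed
```

The scanner only validates a string that has already been isolated as a
candidate; finding where the destination ends inside an inline link is a
separate step.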
7081
7082 A [link title](@)  consists of either
7083
7084 - a sequence of zero or more characters between straight double-quote
7085   characters (`"`), including a `"` character only if it is
7086   backslash-escaped, or
7087
7088 - a sequence of zero or more characters between straight single-quote
7089   characters (`'`), including a `'` character only if it is
7090   backslash-escaped, or
7091
7092 - a sequence of zero or more characters between matching parentheses
7093   (`(...)`), including a `)` character only if it is backslash-escaped.
7094
7095 Although [link titles] may span multiple lines, they may not contain
7096 a [blank line].
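
The three title forms share one shape: an opening delimiter, content in
which the closing delimiter appears only when backslash-escaped, and the
closing delimiter.  The sketch below (Python, with the hypothetical name
`parse_link_title`) checks a string that is assumed to be the whole
candidate title.  It resolves backslash escapes but, as a simplification,
leaves entity and numeric character references untouched.

```python
import re
import string

# Opening delimiter -> closing delimiter for the three title forms.
CLOSERS = {'"': '"', "'": "'", "(": ")"}


def parse_link_title(s: str):
    """Return the title content if `s` is a complete link title, else None."""
    if len(s) < 2 or s[0] not in CLOSERS or s[-1] != CLOSERS[s[0]]:
        return None
    closer = CLOSERS[s[0]]
    body = s[1:-1]
    if re.search(r"\n[ \t]*\n", body):
        return None  # titles may span lines but may not contain a blank line
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\":
            if i + 1 == len(body):
                return None  # the final delimiter itself was escaped
            if body[i + 1] in string.punctuation:
                out.append(body[i + 1])  # backslash escape
                i += 2
                continue
        if ch == closer:
            return None  # unescaped closing delimiter inside the title
        out.append(ch)
        i += 1
    return "".join(out)
```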
7097
7098 An [inline link](@) consists of a [link text] followed immediately
7099 by a left parenthesis `(`, optional [whitespace], an optional
7100 [link destination], an optional [link title] separated from the link
7101 destination by [whitespace], optional [whitespace], and a right
7102 parenthesis `)`. The link's text consists of the inlines contained
7103 in the [link text] (excluding the enclosing square brackets).
7104 The link's URI consists of the link destination, excluding enclosing
7105 `<...>` if present, with backslash-escapes in effect as described
7106 above.  The link's title consists of the link title, excluding its
7107 enclosing delimiters, with backslash-escapes in effect as described
7108 above.
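
The sequence of components can be illustrated with a regular expression.
This is a deliberately restricted sketch in Python, not a conforming
parser: the names `INLINE_LINK` and `parse_simple_inline_link` are
hypothetical, the link text may not contain brackets, the destination must
be in the bare form without parentheses, only double-quoted titles without
escapes are recognized, and the input must consist of the link alone.

```python
import re

INLINE_LINK = re.compile(
    r'\[(?P<text>[^\[\]]*)\]'              # link text in square brackets
    r'\('                                  # "(" must follow immediately
    r'[ \t\n]*'                            # optional whitespace
    r'(?P<dest>[^ \t\n()<>]*)'             # optional destination
    r'(?:[ \t\n]+"(?P<title>[^"\\]*)")?'   # optional title, after whitespace
    r'[ \t\n]*'                            # optional whitespace
    r'\)'                                  # closing parenthesis
)


def parse_simple_inline_link(source: str):
    """Return (text, destination, title) or None if `source` does not match."""
    m = INLINE_LINK.fullmatch(source)
    if m is None:
        return None
    return m.group("text"), m.group("dest"), m.group("title")
```

Under these assumptions, `parse_simple_inline_link('[link](/uri "title")')`
yields the text `link`, the destination `/uri`, and the title `title`,
while `[link] (/uri)` is rejected because of the space before the
parenthesis.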
7109
7110 Here is a simple inline link:
7111
7112 ```````````````````````````````` example
7113 [link](/uri "title")
7114 .
7115 <p><a href="/uri" title="title">link</a></p>
7116 ````````````````````````````````
7117
7118
7119 The title may be omitted:
7120
7121 ```````````````````````````````` example
7122 [link](/uri)
7123 .
7124 <p><a href="/uri">link</a></p>
7125 ````````````````````````````````
7126
7127
7128 Both the title and the destination may be omitted:
7129
7130 ```````````````````````````````` example
7131 [link]()
7132 .
7133 <p><a href="">link</a></p>
7134 ````````````````````````````````
7135
7136
7137 ```````````````````````````````` example
7138 [link](<>)
7139 .
7140 <p><a href="">link</a></p>
7141 ````````````````````````````````
7142
7143
7144 The destination cannot contain spaces or line breaks,
7145 even if enclosed in pointy brackets:
7146
7147 ```````````````````````````````` example
7148 [link](/my uri)
7149 .
7150 <p>[link](/my uri)</p>
7151 ````````````````````````````````
7152
7153
7154 ```````````````````````````````` example
7155 [link](</my uri>)
7156 .
7157 <p>[link](&lt;/my uri&gt;)</p>
7158 ````````````````````````````````
7159
7160
7161 ```````````````````````````````` example
7162 [link](foo
7163 bar)
7164 .
7165 <p>[link](foo
7166 bar)</p>
7167 ````````````````````````````````
7168
7169
7170 ```````````````````````````````` example
7171 [link](<foo
7172 bar>)
7173 .
7174 <p>[link](<foo
7175 bar>)</p>
7176 ````````````````````````````````
7177
7178 Parentheses inside the link destination may be escaped:
7179
7180 ```````````````````````````````` example
7181 [link](\(foo\))
7182 .
7183 <p><a href="(foo)">link</a></p>
7184 ````````````````````````````````
7185
7186 One level of balanced parentheses is allowed without escaping:
7187
7188 ```````````````````````````````` example
7189 [link]((foo)and(bar))
7190 .
7191 <p><a href="(foo)and(bar)">link</a></p>
7192 ````````````````````````````````
7193
7194 However, if you have parentheses within parentheses, you need to escape
7195 or use the `<...>` form:
7196
7197 ```````````````````````````````` example
7198 [link](foo(and(bar)))
7199 .
7200 <p>[link](foo(and(bar)))</p>
7201 ````````````````````````````````
7202
7203
7204 ```````````````````````````````` example
7205 [link](foo(and\(bar\)))
7206 .
7207 <p><a href="foo(and(bar))">link</a></p>
7208 ````````````````````````````````
7209
7210
7211 ```````````````````````````````` example
7212 [link](<foo(and(bar))>)
7213 .
7214 <p><a href="foo(and(bar))">link</a></p>
7215 ````````````````````````````````
7216
7217
7218 Parentheses and other symbols can also be escaped, as usual
7219 in Markdown:
7220
7221 ```````````````````````````````` example
7222 [link](foo\)\:)
7223 .
7224 <p><a href="foo):">link</a></p>
7225 ````````````````````````````````
7226
7227
7228 A link can contain fragment identifiers and queries:
7229
7230 ```````````````````````````````` example
7231 [link](#fragment)
7232
7233 [link](http://example.com#fragment)
7234
7235 [link](http://example.com?foo=3#frag)
7236 .
7237 <p><a href="#fragment">link</a></p>
7238 <p><a href="http://example.com#fragment">link</a></p>
7239 <p><a href="http://example.com?foo=3#frag">link</a></p>
7240 ````````````````````````````````
7241
7242
7243 Note that a backslash before a non-escapable character is
7244 just a backslash:
7245
7246 ```````````````````````````````` example
7247 [link](foo\bar)
7248 .
7249 <p><a href="foo%5Cbar">link</a></p>
7250 ````````````````````````````````
7251
7252
7253 URL-escaping should be left alone inside the destination, as all
7254 URL-escaped characters are also valid URL characters. Entity and
7255 numerical character references in the destination will be parsed
7256 into the corresponding Unicode code points, as usual.  These may
7257 be optionally URL-escaped when written as HTML, but this spec
7258 does not enforce any particular policy for rendering URLs in
7259 HTML or other formats.  Renderers may make different decisions
7260 about how to escape or normalize URLs in the output.
7261
7262 ```````````````````````````````` example
7263 [link](foo%20b&auml;)
7264 .
7265 <p><a href="foo%20b%C3%A4">link</a></p>
7266 ````````````````````````````````
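
Since the spec leaves URL rendering policy to the renderer, the following
Python sketch shows just one possible policy, chosen so that it reproduces
the `href` values in the examples of this section.  The name `render_href`
and the `SAFE` character set are assumptions made for illustration, and
`html.unescape` is more lenient about entity syntax than this spec.
Backslash escapes are not processed here.

```python
import html
from urllib.parse import quote

# Characters left as-is in addition to letters, digits and "_.-~".
# "%" is included so that existing URL-escapes are left alone.
SAFE = "/:?#@!$&'()*+,;=%"


def render_href(destination: str) -> str:
    """Decode character references, then percent-encode unsafe characters."""
    decoded = html.unescape(destination)  # e.g. "&auml;" becomes "ä"
    return quote(decoded, safe=SAFE)      # non-ASCII is encoded as UTF-8
```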
7267
7268
7269 Note that, because titles can often be parsed as destinations,
7270 if you try to omit the destination and keep the title, you'll
7271 get unexpected results:
7272
7273 ```````````````````````````````` example
7274 [link]("title")
7275 .
7276 <p><a href="%22title%22">link</a></p>
7277 ````````````````````````````````
7278
7279
7280 Titles may be in single quotes, double quotes, or parentheses:
7281
7282 ```````````````````````````````` example
7283 [link](/url "title")
7284 [link](/url 'title')
7285 [link](/url (title))
7286 .
7287 <p><a href="/url" title="title">link</a>
7288 <a href="/url" title="title">link</a>
7289 <a href="/url" title="title">link</a></p>
7290 ````````````````````````````````
7291
7292
7293 Backslash escapes and entity and numeric character references
7294 may be used in titles:
7295
7296 ```````````````````````````````` example
7297 [link](/url "title \"&quot;")
7298 .
7299 <p><a href="/url" title="title &quot;&quot;">link</a></p>
7300 ````````````````````````````````
7301
7302
7303 Nested balanced quotes are not allowed without escaping:
7304
7305 ```````````````````````````````` example
7306 [link](/url "title "and" title")
7307 .
7308 <p>[link](/url &quot;title &quot;and&quot; title&quot;)</p>
7309 ````````````````````````````````
7310
7311
7312 But it is easy to work around this by using a different quote type:
7313
7314 ```````````````````````````````` example
7315 [link](/url 'title "and" title')
7316 .
7317 <p><a href="/url" title="title &quot;and&quot; title">link</a></p>
7318 ````````````````````````````````
7319
7320
7321 (Note:  `Markdown.pl` did allow double quotes inside a double-quoted
7322 title, and its test suite included a test demonstrating this.
7323 But it is hard to see a good rationale for the extra complexity this
7324 brings, since there are already many ways---backslash escaping,
7325 entity and numeric character references, or using a different
7326 quote type for the enclosing title---to write titles containing
7327 double quotes.  `Markdown.pl`'s handling of titles has a number
7328 of other strange features.  For example, it allows single-quoted
7329 titles in inline links, but not reference links.  And, in
7330 reference links but not inline links, it allows a title to begin
7331 with `"` and end with `)`.  `Markdown.pl` 1.0.1 even allows
7332 titles with no closing quotation mark, though 1.0.2b8 does not.
7333 It seems preferable to adopt a simple, rational rule that works
7334 the same way in inline links and link reference definitions.)
7335
7336 [Whitespace] is allowed around the destination and title:
7337
7338 ```````````````````````````````` example
7339 [link](   /uri
7340   "title"  )
7341 .
7342 <p><a href="/uri" title="title">link</a></p>
7343 ````````````````````````````````
7344
7345
7346 But it is not allowed between the link text and the
7347 following parenthesis:
7348
7349 ```````````````````````````````` example
7350 [link] (/uri)
7351 .
7352 <p>[link] (/uri)</p>
7353 ````````````````````````````````
7354
7355
7356 The link text may contain balanced brackets, but not unbalanced ones,
7357 unless they are escaped:
7358
7359 ```````````````````````````````` example
7360 [link [foo [bar]]](/uri)
7361 .
7362 <p><a href="/uri">link [foo [bar]]</a></p>
7363 ````````````````````````````````
7364
7365
7366 ```````````````````````````````` example
7367 [link] bar](/uri)
7368 .
7369 <p>[link] bar](/uri)</p>
7370 ````````````````````````````````
7371
7372
7373 ```````````````````````````````` example
7374 [link [bar](/uri)
7375 .
7376 <p>[link <a href="/uri">bar</a></p>
7377 ````````````````````````````````
7378
7379
7380 ```````````````````````````````` example
7381 [link \[bar](/uri)
7382 .
7383 <p><a href="/uri">link [bar</a></p>
7384 ````````````````````````````````
7385
7386
7387 The link text may contain inline content:
7388
7389 ```````````````````````````````` example
7390 [link *foo **bar** `#`*](/uri)
7391 .
7392 <p><a href="/uri">link <em>foo <strong>bar</strong> <code>#</code></em></a></p>
7393 ````````````````````````````````
7394
7395
7396 ```````````````````````````````` example
7397 [![moon](moon.jpg)](/uri)
7398 .
7399 <p><a href="/uri"><img src="moon.jpg" alt="moon" /></a></p>
7400 ````````````````````````````````
7401
7402
7403 However, links may not contain other links, at any level of nesting.
7404
7405 ```````````````````````````````` example
7406 [foo [bar](/uri)](/uri)
7407 .
7408 <p>[foo <a href="/uri">bar</a>](/uri)</p>
7409 ````````````````````````````````
7410
7411
7412 ```````````````````````````````` example
7413 [foo *[bar [baz](/uri)](/uri)*](/uri)
7414 .
7415 <p>[foo <em>[bar <a href="/uri">baz</a>](/uri)</em>](/uri)</p>
7416 ````````````````````````````````
7417
7418
7419 ```````````````````````````````` example
7420 ![[[foo](uri1)](uri2)](uri3)
7421 .
7422 <p><img src="uri3" alt="[foo](uri2)" /></p>
7423 ````````````````````````````````
7424
7425
7426 These cases illustrate the precedence of link text grouping over
7427 emphasis grouping:
7428
7429 ```````````````````````````````` example
7430 *[foo*](/uri)
7431 .
7432 <p>*<a href="/uri">foo*</a></p>
7433 ````````````````````````````````
7434
7435
7436 ```````````````````````````````` example
7437 [foo *bar](baz*)
7438 .
7439 <p><a href="baz*">foo *bar</a></p>
7440 ````````````````````````````````
7441
7442
7443 Note that brackets that *aren't* part of links do not take
7444 precedence:
7445
7446 ```````````````````````````````` example
7447 *foo [bar* baz]
7448 .
7449 <p><em>foo [bar</em> baz]</p>
7450 ````````````````````````````````
7451
7452
7453 These cases illustrate the precedence of HTML tags, code spans,
7454 and autolinks over link grouping:
7455
7456 ```````````````````````````````` example
7457 [foo <bar attr="](baz)">
7458 .
7459 <p>[foo <bar attr="](baz)"></p>
7460 ````````````````````````````````
7461
7462
7463 ```````````````````````````````` example
7464 [foo`](/uri)`
7465 .
7466 <p>[foo<code>](/uri)</code></p>
7467 ````````````````````````````````
7468
7469
7470 ```````````````````````````````` example
7471 [foo<http://example.com/?search=](uri)>
7472 .
7473 <p>[foo<a href="http://example.com/?search=%5D(uri)">http://example.com/?search=](uri)</a></p>
7474 ````````````````````````````````
7475
7476
7477 There are three kinds of [reference link](@)s:
7478 [full](#full-reference-link), [collapsed](#collapsed-reference-link),
7479 and [shortcut](#shortcut-reference-link).
7480
7481 A [full reference link](@)
7482 consists of a [link text] immediately followed by a [link label]
7483 that [matches] a [link reference definition] elsewhere in the document.
7484
7485 A [link label](@)  begins with a left bracket (`[`) and ends
7486 with the first right bracket (`]`) that is not backslash-escaped.
7487 Between these brackets there must be at least one [non-whitespace character].
7488 Unescaped square bracket characters are not allowed in
7489 [link labels].  A link label can have at most 999
7490 characters inside the square brackets.
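
These constraints can be expressed as a check on the text between the
brackets.  The Python sketch below is illustrative only: the name
`is_valid_label_content` is hypothetical, and it assumes that a backslash
followed by any ASCII punctuation character is a backslash escape.

```python
import string

WHITESPACE = " \t\n\v\f\r"


def is_valid_label_content(content: str) -> bool:
    """Check the text between the `[` and `]` of a candidate link label."""
    if len(content) > 999:
        return False  # at most 999 characters inside the brackets
    if all(ch in WHITESPACE for ch in content):
        return False  # needs a non-whitespace character (also rejects "")
    i = 0
    while i < len(content):
        ch = content[i]
        if ch == "\\" and i + 1 < len(content) and content[i + 1] in string.punctuation:
            i += 2  # skip the escaped character, which may be a bracket
            continue
        if ch in "[]":
            return False  # unescaped square bracket
        i += 1
    return True
```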
7491
7492 One label [matches](@)
7493 another if and only if their normalized forms are equal.  To normalize a
7494 label, perform the *Unicode case fold* and collapse consecutive internal
7495 [whitespace] to a single space.  If there are multiple
7496 matching [link reference definitions], the one that comes first in the
7497 document is used.  (It is desirable in such cases to emit a warning.)
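
Normalization and first-definition-wins lookup can be sketched in Python as
follows.  The function names are hypothetical, and Python's `str.casefold`
is assumed here to be an adequate stand-in for the Unicode case fold.
Labels are passed without their enclosing brackets.

```python
import re


def normalize_label(label: str) -> str:
    """Case-fold a label and collapse runs of whitespace to one space."""
    return re.sub(r"[ \t\n\v\f\r]+", " ", label.casefold())


def labels_match(a: str, b: str) -> bool:
    """Two labels match when their normalized forms are equal."""
    return normalize_label(a) == normalize_label(b)


def add_definition(definitions: dict, label: str, destination: str) -> None:
    """Record a definition, keeping the earliest one for a given label."""
    definitions.setdefault(normalize_label(label), destination)
```

Because the comparison is made on the raw label strings, a label written
with a backslash escape does not match the same label written without it.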
7498
7499 The contents of the link text are parsed as inlines, which are
7500 used as the link's text.  The link's URI and title are provided by the
7501 matching [link reference definition].
7502
7503 Here is a simple example:
7504
7505 ```````````````````````````````` example
7506 [foo][bar]
7507
7508 [bar]: /url "title"
7509 .
7510 <p><a href="/url" title="title">foo</a></p>
7511 ````````````````````````````````
7512
7513
7514 The rules for the [link text] are the same as with
7515 [inline links].  Thus:
7516
7517 The link text may contain balanced brackets, but not unbalanced ones,
7518 unless they are escaped:
7519
7520 ```````````````````````````````` example
7521 [link [foo [bar]]][ref]
7522
7523 [ref]: /uri
7524 .
7525 <p><a href="/uri">link [foo [bar]]</a></p>
7526 ````````````````````````````````
7527
7528
7529 ```````````````````````````````` example
7530 [link \[bar][ref]
7531
7532 [ref]: /uri
7533 .
7534 <p><a href="/uri">link [bar</a></p>
7535 ````````````````````````````````
7536
7537
7538 The link text may contain inline content:
7539
7540 ```````````````````````````````` example
7541 [link *foo **bar** `#`*][ref]
7542
7543 [ref]: /uri
7544 .
7545 <p><a href="/uri">link <em>foo <strong>bar</strong> <code>#</code></em></a></p>
7546 ````````````````````````````````
7547
7548
7549 ```````````````````````````````` example
7550 [![moon](moon.jpg)][ref]
7551
7552 [ref]: /uri
7553 .
7554 <p><a href="/uri"><img src="moon.jpg" alt="moon" /></a></p>
7555 ````````````````````````````````
7556
7557
7558 However, links may not contain other links, at any level of nesting.
7559
7560 ```````````````````````````````` example
7561 [foo [bar](/uri)][ref]
7562
7563 [ref]: /uri
7564 .
7565 <p>[foo <a href="/uri">bar</a>]<a href="/uri">ref</a></p>
7566 ````````````````````````````````
7567
7568
7569 ```````````````````````````````` example
7570 [foo *bar [baz][ref]*][ref]
7571
7572 [ref]: /uri
7573 .
7574 <p>[foo <em>bar <a href="/uri">baz</a></em>]<a href="/uri">ref</a></p>
7575 ````````````````````````````````
7576
7577
7578 (In the examples above, we have two [shortcut reference links]
7579 instead of one [full reference link].)
7580
7581 The following cases illustrate the precedence of link text grouping over
7582 emphasis grouping:
7583
7584 ```````````````````````````````` example
7585 *[foo*][ref]
7586
7587 [ref]: /uri
7588 .
7589 <p>*<a href="/uri">foo*</a></p>
7590 ````````````````````````````````
7591
7592
7593 ```````````````````````````````` example
7594 [foo *bar][ref]
7595
7596 [ref]: /uri
7597 .
7598 <p><a href="/uri">foo *bar</a></p>
7599 ````````````````````````````````
7600
7601
7602 These cases illustrate the precedence of HTML tags, code spans,
7603 and autolinks over link grouping:
7604
7605 ```````````````````````````````` example
7606 [foo <bar attr="][ref]">
7607
7608 [ref]: /uri
7609 .
7610 <p>[foo <bar attr="][ref]"></p>
7611 ````````````````````````````````
7612
7613
7614 ```````````````````````````````` example
7615 [foo`][ref]`
7616
7617 [ref]: /uri
7618 .
7619 <p>[foo<code>][ref]</code></p>
7620 ````````````````````````````````
7621
7622
7623 ```````````````````````````````` example
7624 [foo<http://example.com/?search=][ref]>
7625
7626 [ref]: /uri
7627 .
7628 <p>[foo<a href="http://example.com/?search=%5D%5Bref%5D">http://example.com/?search=][ref]</a></p>
7629 ````````````````````````````````
7630
7631
7632 Matching is case-insensitive:
7633
7634 ```````````````````````````````` example
7635 [foo][BaR]
7636
7637 [bar]: /url "title"
7638 .
7639 <p><a href="/url" title="title">foo</a></p>
7640 ````````````````````````````````
7641
7642
7643 Unicode case fold is used:
7644
7645 ```````````````````````````````` example
7646 [Толпой][Толпой] is a Russian word.
7647
7648 [ТОЛПОЙ]: /url
7649 .
7650 <p><a href="/url">Толпой</a> is a Russian word.</p>
7651 ````````````````````````````````
7652
7653
7654 Consecutive internal [whitespace] is treated as one space for
7655 purposes of determining matching:
7656
7657 ```````````````````````````````` example
7658 [Foo
7659   bar]: /url
7660
7661 [Baz][Foo bar]
7662 .
7663 <p><a href="/url">Baz</a></p>
7664 ````````````````````````````````
7665
7666
7667 No [whitespace] is allowed between the [link text] and the
7668 [link label]:
7669
7670 ```````````````````````````````` example
7671 [foo] [bar]
7672
7673 [bar]: /url "title"
7674 .
7675 <p>[foo] <a href="/url" title="title">bar</a></p>
7676 ````````````````````````````````
7677
7678
7679 ```````````````````````````````` example
7680 [foo]
7681 [bar]
7682
7683 [bar]: /url "title"
7684 .
7685 <p>[foo]
7686 <a href="/url" title="title">bar</a></p>
7687 ````````````````````````````````
7688
7689
7690 This is a departure from John Gruber's original Markdown syntax
7691 description, which explicitly allows whitespace between the link
7692 text and the link label.  It brings reference links in line with
7693 [inline links], which (according to both original Markdown and
7694 this spec) cannot have whitespace after the link text.  More
7695 importantly, it prevents inadvertent capture of consecutive
7696 [shortcut reference links]. If whitespace were allowed between the
7697 link text and the link label, the following would be parsed as
7698 a single reference link rather than the two shortcut reference
7699 links that were intended:
7700
7701 ``` markdown
7702 [foo]
7703 [bar]
7704
7705 [foo]: /url1
7706 [bar]: /url2
7707 ```
7708
7709 (Note that [shortcut reference links] were introduced by Gruber
7710 himself in a beta version of `Markdown.pl`, but never included
7711 in the official syntax description.  Without shortcut reference
7712 links, it is harmless to allow space between the link text and
7713 link label; but once shortcut references are introduced, it is
7714 too dangerous to allow this, as it frequently leads to
7715 unintended results.)
7716
7717 When there are multiple matching [link reference definitions],
7718 the first is used:
7719
7720 ```````````````````````````````` example
7721 [foo]: /url1
7722
7723 [foo]: /url2
7724
7725 [bar][foo]
7726 .
7727 <p><a href="/url1">bar</a></p>
7728 ````````````````````````````````
7729
7730
7731 Note that matching is performed on normalized strings, not parsed
7732 inline content.  So the following does not match, even though the
7733 labels define equivalent inline content:
7734
7735 ```````````````````````````````` example
7736 [bar][foo\!]
7737
7738 [foo!]: /url
7739 .
7740 <p>[bar][foo!]</p>
7741 ````````````````````````````````
7742
7743
7744 [Link labels] cannot contain brackets, unless they are
7745 backslash-escaped:
7746
7747 ```````````````````````````````` example
7748 [foo][ref[]
7749
7750 [ref[]: /uri
7751 .
7752 <p>[foo][ref[]</p>
7753 <p>[ref[]: /uri</p>
7754 ````````````````````````````````
7755
7756
7757 ```````````````````````````````` example
7758 [foo][ref[bar]]
7759
7760 [ref[bar]]: /uri
7761 .
7762 <p>[foo][ref[bar]]</p>
7763 <p>[ref[bar]]: /uri</p>
7764 ````````````````````````````````
7765
7766
7767 ```````````````````````````````` example
7768 [[[foo]]]
7769
7770 [[[foo]]]: /url
7771 .
7772 <p>[[[foo]]]</p>
7773 <p>[[[foo]]]: /url</p>
7774 ````````````````````````````````
7775
7776
7777 ```````````````````````````````` example
7778 [foo][ref\[]
7779
7780 [ref\[]: /uri
7781 .
7782 <p><a href="/uri">foo</a></p>
7783 ````````````````````````````````
7784
7785
7786 Note that in this example `]` is not backslash-escaped:
7787
7788 ```````````````````````````````` example
7789 [bar\\]: /uri
7790
7791 [bar\\]
7792 .
7793 <p><a href="/uri">bar\</a></p>
7794 ````````````````````````````````
7795
7796
7797 A [link label] must contain at least one [non-whitespace character]:
7798
7799 ```````````````````````````````` example
7800 []
7801
7802 []: /uri
7803 .
7804 <p>[]</p>
7805 <p>[]: /uri</p>
7806 ````````````````````````````````
7807
7808
7809 ```````````````````````````````` example
7810 [
7811  ]
7812
7813 [
7814  ]: /uri
7815 .
7816 <p>[
7817 ]</p>
7818 <p>[
7819 ]: /uri</p>
7820 ````````````````````````````````
7821
7822
7823 A [collapsed reference link](@)
7824 consists of a [link label] that [matches] a
7825 [link reference definition] elsewhere in the
7826 document, followed by the string `[]`.
7827 The contents of the first link label are parsed as inlines,
7828 which are used as the link's text.  The link's URI and title are
7829 provided by the matching link reference definition.  Thus,
7830 `[foo][]` is equivalent to `[foo][foo]`.
7831
7832 ```````````````````````````````` example
7833 [foo][]
7834
7835 [foo]: /url "title"
7836 .
7837 <p><a href="/url" title="title">foo</a></p>
7838 ````````````````````````````````
7839
7840
7841 ```````````````````````````````` example
7842 [*foo* bar][]
7843
7844 [*foo* bar]: /url "title"
7845 .
7846 <p><a href="/url" title="title"><em>foo</em> bar</a></p>
7847 ````````````````````````````````
7848
7849
7850 The link labels are case-insensitive:
7851
7852 ```````````````````````````````` example
7853 [Foo][]
7854
7855 [foo]: /url "title"
7856 .
7857 <p><a href="/url" title="title">Foo</a></p>
7858 ````````````````````````````````
7859
7860
7861
7862 As with full reference links, [whitespace] is not
7863 allowed between the two sets of brackets:
7864
7865 ```````````````````````````````` example
7866 [foo] 
7867 []
7868
7869 [foo]: /url "title"
7870 .
7871 <p><a href="/url" title="title">foo</a>
7872 []</p>
7873 ````````````````````````````````
7874
7875
7876 A [shortcut reference link](@)
7877 consists of a [link label] that [matches] a
7878 [link reference definition] elsewhere in the
7879 document and is not followed by `[]` or a link label.
7880 The contents of the first link label are parsed as inlines,
7881 which are used as the link's text.  The link's URI and title
7882 are provided by the matching link reference definition.
7883 Thus, `[foo]` is equivalent to `[foo][]`.
7884
7885 ```````````````````````````````` example
7886 [foo]
7887
7888 [foo]: /url "title"
7889 .
7890 <p><a href="/url" title="title">foo</a></p>
7891 ````````````````````````````````
7892
7893
7894 ```````````````````````````````` example
7895 [*foo* bar]
7896
7897 [*foo* bar]: /url "title"
7898 .
7899 <p><a href="/url" title="title"><em>foo</em> bar</a></p>
7900 ````````````````````````````````
7901
7902
7903 ```````````````````````````````` example
7904 [[*foo* bar]]
7905
7906 [*foo* bar]: /url "title"
7907 .
7908 <p>[<a href="/url" title="title"><em>foo</em> bar</a>]</p>
7909 ````````````````````````````````
7910
7911
7912 ```````````````````````````````` example
7913 [[bar [foo]
7914
7915 [foo]: /url
7916 .
7917 <p>[[bar <a href="/url">foo</a></p>
7918 ````````````````````````````````
7919
7920
7921 The link labels are case-insensitive:
7922
7923 ```````````````````````````````` example
7924 [Foo]
7925
7926 [foo]: /url "title"
7927 .
7928 <p><a href="/url" title="title">Foo</a></p>
7929 ````````````````````````````````
7930
7931
7932 A space after the link text should be preserved:
7933
7934 ```````````````````````````````` example
7935 [foo] bar
7936
7937 [foo]: /url
7938 .
7939 <p><a href="/url">foo</a> bar</p>
7940 ````````````````````````````````
7941
7942
7943 If you just want bracketed text, you can backslash-escape the
7944 opening bracket to avoid links:
7945
7946 ```````````````````````````````` example
7947 \[foo]
7948
7949 [foo]: /url "title"
7950 .
7951 <p>[foo]</p>
7952 ````````````````````````````````
7953
7954
7955 Note that this is a link, because a link label ends with the first
7956 following closing bracket:
7957
7958 ```````````````````````````````` example
7959 [foo*]: /url
7960
7961 *[foo*]
7962 .
7963 <p>*<a href="/url">foo*</a></p>
7964 ````````````````````````````````
7965
7966
7967 Full references take precedence over shortcut references:
7968
7969 ```````````````````````````````` example
7970 [foo][bar]
7971
7972 [foo]: /url1
7973 [bar]: /url2
7974 .
7975 <p><a href="/url2">foo</a></p>
7976 ````````````````````````````````
7977
7978
7979 In the following case `[bar][baz]` is parsed as a reference,
7980 `[foo]` as normal text:
7981
7982 ```````````````````````````````` example
7983 [foo][bar][baz]
7984
7985 [baz]: /url
7986 .
7987 <p>[foo]<a href="/url">bar</a></p>
7988 ````````````````````````````````
7989
7990
7991 Here, though, `[foo][bar]` is parsed as a reference, since
7992 `[bar]` is defined:
7993
7994 ```````````````````````````````` example
7995 [foo][bar][baz]
7996
7997 [baz]: /url1
7998 [bar]: /url2
7999 .
8000 <p><a href="/url2">foo</a><a href="/url1">baz</a></p>
8001 ````````````````````````````````
8002
8003
8004 Here `[foo]` is not parsed as a shortcut reference, because it
8005 is followed by a link label (even though `[bar]` is not defined):
8006
8007 ```````````````````````````````` example
8008 [foo][bar][baz]
8009
8010 [baz]: /url1
8011 [foo]: /url2
8012 .
8013 <p>[foo]<a href="/url1">bar</a></p>
8014 ````````````````````````````````
8015
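The precedence rules in the last few examples can be sketched with a simple left-to-right scan. This is only an illustration under strong assumptions: labels contain no brackets or escapes, link text is not parsed as inlines, titles are ignored, and the keys of `refmap` are already normalized. `resolve_references` is a hypothetical helper, not the bracket-matching algorithm used by the reference implementations.

```python
import re

# Simplifying assumption: a label is any non-empty run without brackets.
LABEL = re.compile(r"\[([^\[\]]+)\]")


def normalize(label):
    return " ".join(label.split()).casefold()


def resolve_references(text, refmap):
    """Render full, collapsed, and shortcut reference links in `text`."""
    out, pos = [], 0
    while pos < len(text):
        first = LABEL.match(text, pos)
        if first is None:
            out.append(text[pos])
            pos += 1
            continue
        second = LABEL.match(text, first.end())
        if second is not None:
            # Full reference: the second label selects the definition.
            target, end = second.group(1), second.end()
        elif text.startswith("[]", first.end()):
            # Collapsed reference: `[foo][]` behaves like `[foo][foo]`.
            target, end = first.group(1), first.end() + 2
        else:
            # Shortcut reference: `[foo]` behaves like `[foo][]`.
            target, end = first.group(1), first.end()
        url = refmap.get(normalize(target))
        if url is None:
            # No link: emit the first label literally and rescan after it.
            out.append(first.group(0))
            pos = first.end()
        else:
            out.append('<a href="%s">%s</a>' % (url, first.group(1)))
            pos = end
    return "".join(out)
```

Because a label followed by another link label is never tried as a shortcut, `[foo]` stays literal text whenever `[bar]` is undefined, even if `[foo]` itself has a definition.
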
8016
8017
8018 ## Images
8019
8020 Syntax for images is like the syntax for links, with one
8021 difference. Instead of [link text], we have an
8022 [image description](@).  The rules for this are the
8023 same as for [link text], except that (a) an
8024 image description starts with `![` rather than `[`, and
8025 (b) an image description may contain links.
8026 An image description has inline elements
8027 as its contents.  When an image is rendered to HTML,
8028 this is conventionally used as the image's `alt` attribute.
8029
8030 ```````````````````````````````` example
8031 ![foo](/url "title")
8032 .
8033 <p><img src="/url" alt="foo" title="title" /></p>
8034 ````````````````````````````````
8035
8036
8037 ```````````````````````````````` example
8038 ![foo *bar*]
8039
8040 [foo *bar*]: train.jpg "train & tracks"
8041 .
8042 <p><img src="train.jpg" alt="foo bar" title="train &amp; tracks" /></p>
8043 ````````````````````````````````
8044
8045
8046 ```````````````````````````````` example
8047 ![foo ![bar](/url)](/url2)
8048 .
8049 <p><img src="/url2" alt="foo bar" /></p>
8050 ````````````````````````````````
8051
8052
8053 ```````````````````````````````` example
8054 ![foo [bar](/url)](/url2)
8055 .
8056 <p><img src="/url2" alt="foo bar" /></p>
8057 ````````````````````````````````
8058
8059
8060 Though this spec is concerned with parsing, not rendering, it is
8061 recommended that in rendering to HTML, only the plain string content
8062 of the [image description] be used.  Note that in
8063 the above example, the alt attribute's value is `foo bar`, not `foo
8064 [bar](/url)` or `foo <a href="/url">bar</a>`.  Only the plain string
8065 content is rendered, without formatting.
8066
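One way a renderer might compute the plain string content is a recursive walk over the inline nodes of the description. The node layout below (dictionaries with `type`, `literal`, and `children` keys) is an assumption made for this sketch and is not prescribed by the spec.

```python
def plain_text(inlines):
    """Concatenate the string content of inline nodes, dropping all markup."""
    parts = []
    for node in inlines:
        if node["type"] == "text":
            parts.append(node["literal"])
        else:
            # Emphasis, links, nested images, ...: keep only their text.
            parts.append(plain_text(node.get("children", [])))
    return "".join(parts)


# Inline content of the description in `![foo [bar](/url)](/url2)`.
description = [
    {"type": "text", "literal": "foo "},
    {"type": "link", "destination": "/url",
     "children": [{"type": "text", "literal": "bar"}]},
]
alt = plain_text(description)
```
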
8067 ```````````````````````````````` example
8068 ![foo *bar*][]
8069
8070 [foo *bar*]: train.jpg "train & tracks"
8071 .
8072 <p><img src="train.jpg" alt="foo bar" title="train &amp; tracks" /></p>
8073 ````````````````````````````````
8074
8075
8076 ```````````````````````````````` example
8077 ![foo *bar*][foobar]
8078
8079 [FOOBAR]: train.jpg "train & tracks"
8080 .
8081 <p><img src="train.jpg" alt="foo bar" title="train &amp; tracks" /></p>
8082 ````````````````````````````````
8083
8084
8085 ```````````````````````````````` example
8086 ![foo](train.jpg)
8087 .
8088 <p><img src="train.jpg" alt="foo" /></p>
8089 ````````````````````````````````
8090
8091
8092 ```````````````````````````````` example
8093 My ![foo bar](/path/to/train.jpg  "title"   )
8094 .
8095 <p>My <img src="/path/to/train.jpg" alt="foo bar" title="title" /></p>
8096 ````````````````````````````````
8097
8098
8099 ```````````````````````````````` example
8100 ![foo](<url>)
8101 .
8102 <p><img src="url" alt="foo" /></p>
8103 ````````````````````````````````
8104
8105
8106 ```````````````````````````````` example
8107 ![](/url)
8108 .
8109 <p><img src="/url" alt="" /></p>
8110 ````````````````````````````````
8111
8112
8113 Reference-style:
8114
8115 ```````````````````````````````` example
8116 ![foo][bar]
8117
8118 [bar]: /url
8119 .
8120 <p><img src="/url" alt="foo" /></p>
8121 ````````````````````````````````
8122
8123
8124 ```````````````````````````````` example
8125 ![foo][bar]
8126
8127 [BAR]: /url
8128 .
8129 <p><img src="/url" alt="foo" /></p>
8130 ````````````````````````````````
8131
8132
8133 Collapsed:
8134
8135 ```````````````````````````````` example
8136 ![foo][]
8137
8138 [foo]: /url "title"
8139 .
8140 <p><img src="/url" alt="foo" title="title" /></p>
8141 ````````````````````````````````
8142
8143
8144 ```````````````````````````````` example
8145 ![*foo* bar][]
8146
8147 [*foo* bar]: /url "title"
8148 .
8149 <p><img src="/url" alt="foo bar" title="title" /></p>
8150 ````````````````````````````````
8151
8152
8153 The labels are case-insensitive:
8154
8155 ```````````````````````````````` example
8156 ![Foo][]
8157
8158 [foo]: /url "title"
8159 .
8160 <p><img src="/url" alt="Foo" title="title" /></p>
8161 ````````````````````````````````
8162
8163
8164 As with reference links, [whitespace] is not allowed
8165 between the two sets of brackets:
8166
8167 ```````````````````````````````` example
8168 ![foo] 
8169 []
8170
8171 [foo]: /url "title"
8172 .
8173 <p><img src="/url" alt="foo" title="title" />
8174 []</p>
8175 ````````````````````````````````
8176
8177
8178 Shortcut:
8179
8180 ```````````````````````````````` example
8181 ![foo]
8182
8183 [foo]: /url "title"
8184 .
8185 <p><img src="/url" alt="foo" title="title" /></p>
8186 ````````````````````````````````
8187
8188
8189 ```````````````````````````````` example
8190 ![*foo* bar]
8191
8192 [*foo* bar]: /url "title"
8193 .
8194 <p><img src="/url" alt="foo bar" title="title" /></p>
8195 ````````````````````````````````
8196
8197
8198 Note that link labels cannot contain unescaped brackets:
8199
8200 ```````````````````````````````` example
8201 ![[foo]]
8202
8203 [[foo]]: /url "title"
8204 .
8205 <p>![[foo]]</p>
8206 <p>[[foo]]: /url &quot;title&quot;</p>
8207 ````````````````````````````````
8208
8209
8210 The link labels are case-insensitive:
8211
8212 ```````````````````````````````` example
8213 ![Foo]
8214
8215 [foo]: /url "title"
8216 .
8217 <p><img src="/url" alt="Foo" title="title" /></p>
8218 ````````````````````````````````
8219
8220
8221 If you just want bracketed text, you can backslash-escape the
8222 opening `!` and `[`:
8223
8224 ```````````````````````````````` example
8225 \!\[foo]
8226
8227 [foo]: /url "title"
8228 .
8229 <p>![foo]</p>
8230 ````````````````````````````````
8231
8232
8233 If you want a link after a literal `!`, backslash-escape the
8234 `!`:
8235
8236 ```````````````````````````````` example
8237 \![foo]
8238
8239 [foo]: /url "title"
8240 .
8241 <p>!<a href="/url" title="title">foo</a></p>
8242 ````````````````````````````````
8243
8244
8245 ## Autolinks
8246
8247 [Autolink](@)s are absolute URIs and email addresses inside
8248 `<` and `>`. They are parsed as links, with the URL or email address
8249 as the link label.
8250
8251 A [URI autolink](@) consists of `<`, followed by an
8252 [absolute URI] not containing `<`, followed by `>`.  It is parsed as
8253 a link to the URI, with the URI as the link's label.
8254
8255 An [absolute URI](@),
8256 for these purposes, consists of a [scheme] followed by a colon (`:`)
8257 followed by zero or more characters other than ASCII
8258 [whitespace] and control characters, `<`, and `>`.  If
8259 the URI includes these characters, they must be percent-encoded
8260 (e.g. `%20` for a space).
8261
8262 For purposes of this spec, a [scheme](@) is any sequence
8263 of 2--32 characters beginning with an ASCII letter and followed
8264 by any combination of ASCII letters, digits, or the symbols plus
8265 ("+"), period ("."), or hyphen ("-").
8266
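The definitions of [scheme] and [absolute URI] translate fairly directly into a regular expression. The following is a sketch; the names `SCHEME`, `URI_AUTOLINK`, and `uri_autolink` are illustrative only.

```python
import re

# scheme: an ASCII letter followed by 1-31 letters, digits, "+", "." or "-"
SCHEME = r"[A-Za-z][A-Za-z0-9+.-]{1,31}"
# rest: anything except ASCII whitespace/control characters, "<" and ">"
URI_AUTOLINK = re.compile(r"<(" + SCHEME + r":[^\x00-\x20\x7f<>]*)>")


def uri_autolink(text):
    """Return the URI if `text` is exactly one URI autolink, else None."""
    match = URI_AUTOLINK.fullmatch(text)
    return match.group(1) if match else None
```

Note that the pattern checks only the shape described above; it does not check that the scheme is registered or that the URI is otherwise valid.
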
8267 Here are some valid autolinks:
8268
8269 ```````````````````````````````` example
8270 <http://foo.bar.baz>
8271 .
8272 <p><a href="http://foo.bar.baz">http://foo.bar.baz</a></p>
8273 ````````````````````````````````
8274
8275
8276 ```````````````````````````````` example
8277 <http://foo.bar.baz/test?q=hello&id=22&boolean>
8278 .
8279 <p><a href="http://foo.bar.baz/test?q=hello&amp;id=22&amp;boolean">http://foo.bar.baz/test?q=hello&amp;id=22&amp;boolean</a></p>
8280 ````````````````````````````````
8281
8282
8283 ```````````````````````````````` example
8284 <irc://foo.bar:2233/baz>
8285 .
8286 <p><a href="irc://foo.bar:2233/baz">irc://foo.bar:2233/baz</a></p>
8287 ````````````````````````````````
8288
8289
8290 Uppercase is also fine:
8291
8292 ```````````````````````````````` example
8293 <MAILTO:FOO@BAR.BAZ>
8294 .
8295 <p><a href="MAILTO:FOO@BAR.BAZ">MAILTO:FOO@BAR.BAZ</a></p>
8296 ````````````````````````````````
8297
8298
8299 Note that many strings that count as [absolute URIs] for
8300 purposes of this spec are not valid URIs, because their
8301 schemes are not registered or because of other problems
8302 with their syntax:
8303
8304 ```````````````````````````````` example
8305 <a+b+c:d>
8306 .
8307 <p><a href="a+b+c:d">a+b+c:d</a></p>
8308 ````````````````````````````````
8309
8310
8311 ```````````````````````````````` example
8312 <made-up-scheme://foo,bar>
8313 .
8314 <p><a href="made-up-scheme://foo,bar">made-up-scheme://foo,bar</a></p>
8315 ````````````````````````````````
8316
8317
8318 ```````````````````````````````` example
8319 <http://../>
8320 .
8321 <p><a href="http://../">http://../</a></p>
8322 ````````````````````````````````
8323
8324
8325 ```````````````````````````````` example
8326 <localhost:5001/foo>
8327 .
8328 <p><a href="localhost:5001/foo">localhost:5001/foo</a></p>
8329 ````````````````````````````````
8330
8331
8332 Spaces are not allowed in autolinks:
8333
8334 ```````````````````````````````` example
8335 <http://foo.bar/baz bim>
8336 .
8337 <p>&lt;http://foo.bar/baz bim&gt;</p>
8338 ````````````````````````````````
8339
8340
8341 Backslash-escapes do not work inside autolinks:
8342
8343 ```````````````````````````````` example
8344 <http://example.com/\[\>
8345 .
8346 <p><a href="http://example.com/%5C%5B%5C">http://example.com/\[\</a></p>
8347 ````````````````````````````````
8348
8349
8350 An [email autolink](@)
8351 consists of `<`, followed by an [email address],
8352 followed by `>`.  The link's label is the email address,
8353 and the URL is `mailto:` followed by the email address.
8354
8355 An [email address](@),
8356 for these purposes, is anything that matches
8357 the [non-normative regex from the HTML5
8358 spec](https://html.spec.whatwg.org/multipage/forms.html#e-mail-state-(type=email)):
8359
8360     /^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?
8361     (?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/
8362
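As an illustration, the regex above can be used as follows. The pattern is copied from the definition; the helper name `email_autolink` is hypothetical, and `fullmatch` takes the place of the `^` and `$` anchors.

```python
import re

EMAIL = re.compile(
    r"[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+"
    r"@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?"
    r"(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*"
)


def email_autolink(text):
    """Return the `mailto:` URL if `text` is an email autolink, else None."""
    if text.startswith("<") and text.endswith(">") and EMAIL.fullmatch(text[1:-1]):
        return "mailto:" + text[1:-1]
    return None
```
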
8363 Examples of email autolinks:
8364
8365 ```````````````````````````````` example
8366 <foo@bar.example.com>
8367 .
8368 <p><a href="mailto:foo@bar.example.com">foo@bar.example.com</a></p>
8369 ````````````````````````````````
8370
8371
8372 ```````````````````````````````` example
8373 <foo+special@Bar.baz-bar0.com>
8374 .
8375 <p><a href="mailto:foo+special@Bar.baz-bar0.com">foo+special@Bar.baz-bar0.com</a></p>
8376 ````````````````````````````````
8377
8378
8379 Backslash-escapes do not work inside email autolinks:
8380
8381 ```````````````````````````````` example
8382 <foo\+@bar.example.com>
8383 .
8384 <p>&lt;foo+@bar.example.com&gt;</p>
8385 ````````````````````````````````
8386
8387
8388 These are not autolinks:
8389
8390 ```````````````````````````````` example
8391 <>
8392 .
8393 <p>&lt;&gt;</p>
8394 ````````````````````````````````
8395
8396
8397 ```````````````````````````````` example
8398 < http://foo.bar >
8399 .
8400 <p>&lt; http://foo.bar &gt;</p>
8401 ````````````````````````````````
8402
8403
8404 ```````````````````````````````` example
8405 <m:abc>
8406 .
8407 <p>&lt;m:abc&gt;</p>
8408 ````````````````````````````````
8409
8410
8411 ```````````````````````````````` example
8412 <foo.bar.baz>
8413 .
8414 <p>&lt;foo.bar.baz&gt;</p>
8415 ````````````````````````````````
8416
8417
8418 ```````````````````````````````` example
8419 http://example.com
8420 .
8421 <p>http://example.com</p>
8422 ````````````````````````````````
8423
8424
8425 ```````````````````````````````` example
8426 foo@bar.example.com
8427 .
8428 <p>foo@bar.example.com</p>
8429 ````````````````````````````````
8430
8431
8432 ## Raw HTML
8433
8434 Text between `<` and `>` that looks like an HTML tag is parsed as a
8435 raw HTML tag and will be rendered in HTML without escaping.
8436 Tag and attribute names are not limited to current HTML tags,
8437 so custom tags (and even, say, DocBook tags) may be used.
8438
8439 Here is the grammar for tags:
8440
8441 A [tag name](@) consists of an ASCII letter
8442 followed by zero or more ASCII letters, digits, or
8443 hyphens (`-`).
8444
8445 An [attribute](@) consists of [whitespace],
8446 an [attribute name], and an optional
8447 [attribute value specification].
8448
8449 An [attribute name](@)
8450 consists of an ASCII letter, `_`, or `:`, followed by zero or more ASCII
8451 letters, digits, `_`, `.`, `:`, or `-`.  (Note:  This is the XML
8452 specification restricted to ASCII.  HTML5 is laxer.)
8453
8454 An [attribute value specification](@)
8455 consists of optional [whitespace],
8456 a `=` character, optional [whitespace], and an [attribute
8457 value].
8458
8459 An [attribute value](@)
8460 consists of an [unquoted attribute value],
8461 a [single-quoted attribute value], or a [double-quoted attribute value].
8462
8463 An [unquoted attribute value](@)
8464 is a nonempty string of characters not
8465 including spaces, `"`, `'`, `=`, `<`, `>`, or `` ` ``.
8466
8467 A [single-quoted attribute value](@)
8468 consists of `'`, zero or more
8469 characters not including `'`, and a final `'`.
8470
8471 A [double-quoted attribute value](@)
8472 consists of `"`, zero or more
8473 characters not including `"`, and a final `"`.
8474
8475 An [open tag](@) consists of a `<` character, a [tag name],
8476 zero or more [attributes], optional [whitespace], an optional `/`
8477 character, and a `>` character.
8478
8479 A [closing tag](@) consists of the string `</`, a
8480 [tag name], optional [whitespace], and the character `>`.
8481
8482 An [HTML comment](@) consists of `<!--` + *text* + `-->`,
8483 where *text* does not start with `>` or `->`, does not end with `-`,
8484 and does not contain `--`.  (See the
8485 [HTML5 spec](http://www.w3.org/TR/html5/syntax.html#comments).)
8486
8487 A [processing instruction](@)
8488 consists of the string `<?`, a string
8489 of characters not including the string `?>`, and the string
8490 `?>`.
8491
8492 A [declaration](@) consists of the
8493 string `<!`, a name consisting of one or more uppercase ASCII letters,
8494 [whitespace], a string of characters not including the
8495 character `>`, and the character `>`.
8496
8497 A [CDATA section](@) consists of
8498 the string `<![CDATA[`, a string of characters not including the string
8499 `]]>`, and the string `]]>`.
8500
8501 An [HTML tag](@) consists of an [open tag], a [closing tag],
8502 an [HTML comment], a [processing instruction], a [declaration],
8503 or a [CDATA section].
8504
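The grammar above can be transcribed into regular expressions, one per production. This is a sketch under two assumptions: [whitespace] is taken to be what `\s` matches under `re.ASCII` (space, tab, newline, line tabulation, form feed, carriage return), and an unquoted attribute value excludes all such whitespace rather than only spaces. All names are illustrative.

```python
import re

WS = r"\s+"
TAG_NAME = r"[A-Za-z][A-Za-z0-9-]*"
ATTR_NAME = r"[A-Za-z_:][A-Za-z0-9_.:-]*"
UNQUOTED = r"""[^\s"'=<>`]+"""
SINGLE_QUOTED = r"'[^']*'"
DOUBLE_QUOTED = r'"[^"]*"'
ATTR_VALUE = "(?:" + UNQUOTED + "|" + SINGLE_QUOTED + "|" + DOUBLE_QUOTED + ")"
ATTR_SPEC = r"(?:\s*=\s*" + ATTR_VALUE + ")"
ATTRIBUTE = "(?:" + WS + ATTR_NAME + ATTR_SPEC + "?)"

OPEN_TAG = "<" + TAG_NAME + ATTRIBUTE + "*" + r"\s*/?>"
CLOSING_TAG = "</" + TAG_NAME + r"\s*>"
# Comment text: no leading ">" or "->", no "--" inside, no trailing "-".
COMMENT = r"<!--(?!>|->)(?:-?[^-])*-->"
PROCESSING = r"<\?(?:(?!\?>).)*\?>"
DECLARATION = r"<![A-Z]+\s+[^>]*>"
CDATA = r"<!\[CDATA\[(?:(?!\]\]>).)*\]\]>"

HTML_TAG = re.compile(
    "|".join([OPEN_TAG, CLOSING_TAG, COMMENT, PROCESSING, DECLARATION, CDATA]),
    re.ASCII | re.DOTALL,
)


def is_html_tag(text):
    """True if `text` as a whole is an HTML tag in the sense defined above."""
    return HTML_TAG.fullmatch(text) is not None
```
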
8505 Here are some simple open tags:
8506
8507 ```````````````````````````````` example
8508 <a><bab><c2c>
8509 .
8510 <p><a><bab><c2c></p>
8511 ````````````````````````````````
8512
8513
8514 Empty elements:
8515
8516 ```````````````````````````````` example
8517 <a/><b2/>
8518 .
8519 <p><a/><b2/></p>
8520 ````````````````````````````````
8521
8522
8523 [Whitespace] is allowed:
8524
8525 ```````````````````````````````` example
8526 <a  /><b2
8527 data="foo" >
8528 .
8529 <p><a  /><b2
8530 data="foo" ></p>
8531 ````````````````````````````````
8532
8533
8534 With attributes:
8535
8536 ```````````````````````````````` example
8537 <a foo="bar" bam = 'baz <em>"</em>'
8538 _boolean zoop:33=zoop:33 />
8539 .
8540 <p><a foo="bar" bam = 'baz <em>"</em>'
8541 _boolean zoop:33=zoop:33 /></p>
8542 ````````````````````````````````
8543
8544
8545 Custom tag names can be used:
8546
8547 ```````````````````````````````` example
8548 Foo <responsive-image src="foo.jpg" />
8549 .
8550 <p>Foo <responsive-image src="foo.jpg" /></p>
8551 ````````````````````````````````
8552
8553
8554 Illegal tag names, not parsed as HTML:
8555
8556 ```````````````````````````````` example
8557 <33> <__>
8558 .
8559 <p>&lt;33&gt; &lt;__&gt;</p>
8560 ````````````````````````````````
8561
8562
8563 Illegal attribute names:
8564
8565 ```````````````````````````````` example
8566 <a h*#ref="hi">
8567 .
8568 <p>&lt;a h*#ref=&quot;hi&quot;&gt;</p>
8569 ````````````````````````````````
8570
8571
8572 Illegal attribute values:
8573
8574 ```````````````````````````````` example
8575 <a href="hi'> <a href=hi'>
8576 .
8577 <p>&lt;a href=&quot;hi'&gt; &lt;a href=hi'&gt;</p>
8578 ````````````````````````````````
8579
8580
8581 Illegal [whitespace]:
8582
8583 ```````````````````````````````` example
8584 < a><
8585 foo><bar/ >
8586 .
8587 <p>&lt; a&gt;&lt;
8588 foo&gt;&lt;bar/ &gt;</p>
8589 ````````````````````````````````
8590
8591
8592 Missing [whitespace]:
8593
8594 ```````````````````````````````` example
8595 <a href='bar'title=title>
8596 .
8597 <p>&lt;a href='bar'title=title&gt;</p>
8598 ````````````````````````````````
8599
8600
8601 Closing tags:
8602
8603 ```````````````````````````````` example
8604 </a></foo >
8605 .
8606 <p></a></foo ></p>
8607 ````````````````````````````````
8608
8609
8610 Illegal attributes in closing tag:
8611
8612 ```````````````````````````````` example
8613 </a href="foo">
8614 .
8615 <p>&lt;/a href=&quot;foo&quot;&gt;</p>
8616 ````````````````````````````````
8617
8618
8619 Comments:
8620
8621 ```````````````````````````````` example
8622 foo <!-- this is a
8623 comment - with hyphen -->
8624 .
8625 <p>foo <!-- this is a
8626 comment - with hyphen --></p>
8627 ````````````````````````````````
8628
8629
8630 ```````````````````````````````` example
8631 foo <!-- not a comment -- two hyphens -->
8632 .
8633 <p>foo &lt;!-- not a comment -- two hyphens --&gt;</p>
8634 ````````````````````````````````
8635
8636
8637 Not comments:
8638
8639 ```````````````````````````````` example
8640 foo <!--> foo -->
8641
8642 foo <!-- foo--->
8643 .
8644 <p>foo &lt;!--&gt; foo --&gt;</p>
8645 <p>foo &lt;!-- foo---&gt;</p>
8646 ````````````````````````````````
8647
8648
8649 Processing instructions:
8650
8651 ```````````````````````````````` example
8652 foo <?php echo $a; ?>
8653 .
8654 <p>foo <?php echo $a; ?></p>
8655 ````````````````````````````````
8656
8657
8658 Declarations:
8659
8660 ```````````````````````````````` example
8661 foo <!ELEMENT br EMPTY>
8662 .
8663 <p>foo <!ELEMENT br EMPTY></p>
8664 ````````````````````````````````
8665
8666
8667 CDATA sections:
8668
8669 ```````````````````````````````` example
8670 foo <![CDATA[>&<]]>
8671 .
8672 <p>foo <![CDATA[>&<]]></p>
8673 ````````````````````````````````
8674
8675
8676 Entity and numeric character references are preserved in HTML
8677 attributes:
8678
8679 ```````````````````````````````` example
8680 foo <a href="&ouml;">
8681 .
8682 <p>foo <a href="&ouml;"></p>
8683 ````````````````````````````````
8684
8685
8686 Backslash escapes do not work in HTML attributes:
8687
8688 ```````````````````````````````` example
8689 foo <a href="\*">
8690 .
8691 <p>foo <a href="\*"></p>
8692 ````````````````````````````````
8693
8694
8695 ```````````````````````````````` example
8696 <a href="\"">
8697 .
8698 <p>&lt;a href=&quot;&quot;&quot;&gt;</p>
8699 ````````````````````````````````
8700
8701
8702 ## Hard line breaks
8703
8704 A line break (not in a code span or HTML tag) that is preceded
8705 by two or more spaces and does not occur at the end of a block
8706 is parsed as a [hard line break](@) (rendered
8707 in HTML as a `<br />` tag):
8708
8709 ```````````````````````````````` example
8710 foo  
8711 baz
8712 .
8713 <p>foo<br />
8714 baz</p>
8715 ````````````````````````````````
8716
8717
8718 For a more visible alternative, a backslash before the
8719 [line ending] may be used instead of two spaces:
8720
8721 ```````````````````````````````` example
8722 foo\
8723 baz
8724 .
8725 <p>foo<br />
8726 baz</p>
8727 ````````````````````````````````
8728
8729
8730 More than two spaces can be used:
8731
8732 ```````````````````````````````` example
8733 foo       
8734 baz
8735 .
8736 <p>foo<br />
8737 baz</p>
8738 ````````````````````````````````
8739
8740
8741 Leading spaces at the beginning of the next line are ignored:
8742
8743 ```````````````````````````````` example
8744 foo  
8745      bar
8746 .
8747 <p>foo<br />
8748 bar</p>
8749 ````````````````````````````````
8750
8751
8752 ```````````````````````````````` example
8753 foo\
8754      bar
8755 .
8756 <p>foo<br />
8757 bar</p>
8758 ````````````````````````````````
8759
8760
8761 Line breaks can occur inside emphasis, links, and other constructs
8762 that allow inline content:
8763
8764 ```````````````````````````````` example
8765 *foo  
8766 bar*
8767 .
8768 <p><em>foo<br />
8769 bar</em></p>
8770 ````````````````````````````````
8771
8772
8773 ```````````````````````````````` example
8774 *foo\
8775 bar*
8776 .
8777 <p><em>foo<br />
8778 bar</em></p>
8779 ````````````````````````````````
8780
8781
8782 Line breaks do not occur inside code spans
8783
8784 ```````````````````````````````` example
8785 `code  
8786 span`
8787 .
8788 <p><code>code span</code></p>
8789 ````````````````````````````````
8790
8791
8792 ```````````````````````````````` example
8793 `code\
8794 span`
8795 .
8796 <p><code>code\ span</code></p>
8797 ````````````````````````````````
8798
8799
8800 or HTML tags:
8801
8802 ```````````````````````````````` example
8803 <a href="foo  
8804 bar">
8805 .
8806 <p><a href="foo  
8807 bar"></p>
8808 ````````````````````````````````
8809
8810
8811 ```````````````````````````````` example
8812 <a href="foo\
8813 bar">
8814 .
8815 <p><a href="foo\
8816 bar"></p>
8817 ````````````````````````````````
8818
8819
8820 Hard line breaks are for separating inline content within a block.
8821 Neither syntax for hard line breaks works at the end of a paragraph or
8822 other block element:
8823
8824 ```````````````````````````````` example
8825 foo\
8826 .
8827 <p>foo\</p>
8828 ````````````````````````````````
8829
8830
8831 ```````````````````````````````` example
8832 foo  
8833 .
8834 <p>foo</p>
8835 ````````````````````````````````
8836
8837
8838 ```````````````````````````````` example
8839 ### foo\
8840 .
8841 <h3>foo\</h3>
8842 ````````````````````````````````
8843
8844
8845 ```````````````````````````````` example
8846 ### foo  
8847 .
8848 <h3>foo</h3>
8849 ````````````````````````````````
8850
8851
8852 ## Soft line breaks
8853
8854 A regular line break (not in a code span or HTML tag) that is not
8855 preceded by two or more spaces or a backslash is parsed as a
8856 softbreak.  (A softbreak may be rendered in HTML either as a
8857 [line ending] or as a space. The result will be the same in
8858 browsers. In the examples here, a [line ending] will be used.)
8859
8860 ```````````````````````````````` example
8861 foo
8862 baz
8863 .
8864 <p>foo
8865 baz</p>
8866 ````````````````````````````````
8867
8868
8869 Spaces at the end of the line and beginning of the next line are
8870 removed:
8871
8872 ```````````````````````````````` example
8873 foo 
8874  baz
8875 .
8876 <p>foo
8877 baz</p>
8878 ````````````````````````````````
8879
8880
8881 A conforming parser may render a soft line break in HTML either as a
8882 line break or as a space.
8883
8884 A renderer may also provide an option to render soft line breaks
8885 as hard line breaks.
8886
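The rules for hard and soft line breaks can be sketched for the raw lines of a single paragraph. The sketch assumes the paragraph contains no code spans or HTML tags and leaves all other inline syntax (including other backslash escapes) unprocessed. `render_line_breaks` and its `soft_as_hard` option are hypothetical names.

```python
def render_line_breaks(lines, soft_as_hard=False):
    """Join the raw lines of one paragraph, rendering hard and soft breaks."""
    out = []
    for i, raw in enumerate(lines):
        line = raw.lstrip(" ")           # leading spaces are ignored
        text = line.rstrip(" ")          # trailing spaces are never output
        if i == len(lines) - 1:
            out.append(text)             # no break at the end of a block
            break
        trailing_spaces = len(line) - len(text)
        trailing_backslashes = len(text) - len(text.rstrip("\\"))
        if trailing_spaces == 0 and trailing_backslashes % 2 == 1:
            # An unescaped backslash directly before the line ending.
            out.append(text[:-1] + "<br />\n")
        elif trailing_spaces >= 2 or soft_as_hard:
            out.append(text + "<br />\n")
        else:
            out.append(text + "\n")      # soft break, rendered as a line ending
    return "".join(out)
```
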
8887 ## Textual content
8888
8889 Any characters not given an interpretation by the above rules will
8890 be parsed as plain textual content.
8891
8892 ```````````````````````````````` example
8893 hello $.;'there
8894 .
8895 <p>hello $.;'there</p>
8896 ````````````````````````````````
8897
8898
8899 ```````````````````````````````` example
8900 Foo χρῆν
8901 .
8902 <p>Foo χρῆν</p>
8903 ````````````````````````````````
8904
8905
8906 Internal spaces are preserved verbatim:
8907
8908 ```````````````````````````````` example
8909 Multiple     spaces
8910 .
8911 <p>Multiple     spaces</p>
8912 ````````````````````````````````
8913
8914
8915 <!-- END TESTS -->
8916
8917 # Appendix: A parsing strategy
8918
8919 In this appendix we describe some features of the parsing strategy
8920 used in the CommonMark reference implementations.
8921
8922 ## Overview
8923
8924 Parsing has two phases:
8925
8926 1. In the first phase, lines of input are consumed and the block
8927 structure of the document---its division into paragraphs, block quotes,
8928 list items, and so on---is constructed.  Text is assigned to these
8929 blocks but not parsed. Link reference definitions are parsed and a
8930 map of links is constructed.
8931
8932 2. In the second phase, the raw text contents of paragraphs and headings
8933 are parsed into sequences of Markdown inline elements (strings,
8934 code spans, links, emphasis, and so on), using the map of link
8935 references constructed in phase 1.
8936
8937 At each point in processing, the document is represented as a tree of
8938 **blocks**.  The root of the tree is a `document` block.  The `document`
8939 may have any number of other blocks as **children**.  These children
8940 may, in turn, have other blocks as children.  The last child of a block
8941 is normally considered **open**, meaning that subsequent lines of input
8942 can alter its contents.  (Blocks that are not open are **closed**.)
8943 Here, for example, is a possible document tree, with the open blocks
8944 marked by arrows:
8945
8946 ``` tree
8947 -> document
8948   -> block_quote
8949        paragraph
8950          "Lorem ipsum dolor\nsit amet."
8951     -> list (type=bullet tight=true bullet_char=-)
8952          list_item
8953            paragraph
8954              "Qui *quodsi iracundia*"
8955       -> list_item
8956         -> paragraph
8957              "aliquando id"
8958 ```
8959
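A tree like this could be represented with a small record type. The sketch below is one possible shape, not the data structure of any reference implementation; the `Block` fields and the `dump` helper are assumptions made for illustration. `dump` reproduces the arrow notation used above.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Block:
    kind: str                       # e.g. "document", "paragraph"
    children: list = field(default_factory=list)
    text: str = ""                  # raw, still unparsed text content
    open: bool = False


def dump(block, depth=0):
    """Return the tree as lines, marking open blocks with an arrow."""
    marker = "-> " if block.open else "   "
    lines = ["  " * depth + marker + block.kind]
    if block.text:
        lines.append("  " * (depth + 1) + "   " + json.dumps(block.text))
    for child in block.children:
        lines.extend(dump(child, depth + 1))
    return lines


doc = Block("document", open=True, children=[
    Block("block_quote", open=True, children=[
        Block("paragraph", text="Lorem ipsum dolor\nsit amet."),
    ]),
])
print("\n".join(dump(doc)))
```
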
8960 ## Phase 1: block structure
8961
8962 Each line that is processed has an effect on this tree.  The line is
8963 analyzed and, depending on its contents, the document may be altered
8964 in one or more of the following ways:
8965
8966 1. One or more open blocks may be closed.
8967 2. One or more new blocks may be created as children of the
8968    last open block.
8969 3. Text may be added to the last (deepest) open block remaining
8970    on the tree.
8971
8972 Once a line has been incorporated into the tree in this way,
8973 it can be discarded, so input can be read in a stream.
8974
8975 For each line, we follow this procedure:
8976
8977 1. First we iterate through the open blocks, starting with the
8978 root document, and descending through last children down to the last
8979 open block.  Each block imposes a condition that the line must satisfy
8980 if the block is to remain open.  For example, a block quote requires a
8981 `>` character.  A paragraph requires a non-blank line.
8982 In this phase we may match all or just some of the open
8983 blocks.  But we cannot close unmatched blocks yet, because we may have a
8984 [lazy continuation line].
8985
8986 2.  Next, after consuming the continuation markers for existing
8987 blocks, we look for new block starts (e.g. `>` for a block quote).
8988 If we encounter a new block start, we close any blocks unmatched
8989 in step 1 before creating the new block as a child of the last
8990 matched block.
8991
8992 3.  Finally, we look at the remainder of the line (after block
8993 markers like `>`, list markers, and indentation have been consumed).
8994 This is text that can be incorporated into the last open
8995 block (a paragraph, code block, heading, or raw HTML).
8996
8997 Setext headings are formed when we see a line of a paragraph
8998 that is a [setext heading underline].
8999
9000 Reference link definitions are detected when a paragraph is closed;
9001 the accumulated text lines are parsed to see if they begin with
9002 one or more reference link definitions.  Any remainder becomes a
9003 normal paragraph.
9004
9005 We can see how this works by considering how the tree above is
9006 generated by four lines of Markdown:
9007
9008 ``` markdown
9009 > Lorem ipsum dolor
9010 sit amet.
9011 > - Qui *quodsi iracundia*
9012 > - aliquando id
9013 ```
9014
9015 At the outset, our document model is just
9016
9017 ``` tree
9018 -> document
9019 ```
9020
9021 The first line of our text,
9022
9023 ``` markdown
9024 > Lorem ipsum dolor
9025 ```
9026
9027 causes a `block_quote` block to be created as a child of our
9028 open `document` block, and a `paragraph` block as a child of
9029 the `block_quote`.  Then the text is added to the last open
9030 block, the `paragraph`:
9031
9032 ``` tree
9033 -> document
9034   -> block_quote
9035     -> paragraph
9036          "Lorem ipsum dolor"
9037 ```
9038
9039 The next line,
9040
9041 ``` markdown
9042 sit amet.
9043 ```
9044
9045 is a "lazy continuation" of the open `paragraph`, so it gets added
9046 to the paragraph's text:
9047
9048 ``` tree
9049 -> document
9050   -> block_quote
9051     -> paragraph
9052          "Lorem ipsum dolor\nsit amet."
9053 ```
9054
9055 The third line,
9056
9057 ``` markdown
9058 > - Qui *quodsi iracundia*
9059 ```
9060
9061 causes the `paragraph` block to be closed, and a new `list` block
9062 opened as a child of the `block_quote`.  A `list_item` is also
9063 added as a child of the `list`, and a `paragraph` as a child of
9064 the `list_item`.  The text is then added to the new `paragraph`:
9065
9066 ``` tree
9067 -> document
9068   -> block_quote
9069        paragraph
9070          "Lorem ipsum dolor\nsit amet."
9071     -> list (type=bullet tight=true bullet_char=-)
9072       -> list_item
9073         -> paragraph
9074              "Qui *quodsi iracundia*"
9075 ```
9076
9077 The fourth line,
9078
9079 ``` markdown
9080 > - aliquando id
9081 ```
9082
9083 causes the `list_item` (and its child the `paragraph`) to be closed,
9084 and a new `list_item` opened up as a child of the `list`.  A `paragraph`
9085 is added as a child of the new `list_item`, to contain the text.
9086 We thus obtain the final tree:
9087
9088 ``` tree
9089 -> document
9090   -> block_quote
9091        paragraph
9092          "Lorem ipsum dolor\nsit amet."
9093     -> list (type=bullet tight=true bullet_char=-)
9094          list_item
9095            paragraph
9096              "Qui *quodsi iracundia*"
9097       -> list_item
9098         -> paragraph
9099              "aliquando id"
9100 ```
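
The walkthrough above can be reproduced with a small Python sketch of the three-step procedure. It is deliberately incomplete and rests on several assumptions: it knows only `>` block quotes, `- ` bullet items (with a fixed two-space content indent), and paragraphs; it ignores list tightness, headings, code blocks, and the finer blank-line rules. The names `Block`, `open_chain`, and `incorporate` are made up for this example.

```python
class Block:
    def __init__(self, kind):
        self.kind = kind
        self.children = []
        self.lines = []          # raw text lines (used by paragraphs only)
        self.is_open = True


def close(block):
    for child in block.children:
        close(child)
    block.is_open = False


def open_chain(doc):
    chain = [doc]
    while chain[-1].children and chain[-1].children[-1].is_open:
        chain.append(chain[-1].children[-1])
    return chain


def drop_space(text):
    return text[1:] if text.startswith(" ") else text


def incorporate(doc, line):
    chain, rest = open_chain(doc), line
    # Step 1: match open blocks from the root downwards.
    matched = 1                                   # the document always matches
    for block in chain[1:]:
        if block.kind == "block_quote":
            if not rest.startswith(">"):
                break
            rest = drop_space(rest[1:])
        elif block.kind == "list_item":
            if not rest.startswith("  "):         # content indent of a "- " item
                break
            rest = rest[2:]
        elif block.kind == "paragraph":
            break                                 # leaf block, handled in step 3
        matched += 1                              # a "list" imposes no condition here
    path, unmatched = chain[:matched], chain[matched:]

    # Step 2: look for new block starts in what is left of the line.
    while rest.startswith((">", "- ")):
        if unmatched:
            close(unmatched[0])                   # closes every unmatched block
            unmatched = []
        if rest.startswith(">"):
            kinds, rest = ["block_quote"], drop_space(rest[1:])
        else:
            kinds = ["list_item"] if path[-1].kind == "list" else ["list", "list_item"]
            rest = rest[2:]
        if path[-1].kind == "list" and kinds[0] != "list_item":
            close(path.pop())                     # a list holds only list items
        for kind in kinds:
            block = Block(kind)
            path[-1].children.append(block)
            path.append(block)

    # Step 3: deal with the remaining text.
    text = rest.strip()
    if unmatched and chain[-1].kind == "paragraph" and text:
        chain[-1].lines.append(text)              # (lazy) paragraph continuation
        return
    if unmatched:
        close(unmatched[0])
    if text:
        if path[-1].kind == "list":
            close(path.pop())
        paragraph = Block("paragraph")
        paragraph.lines.append(text)
        path[-1].children.append(paragraph)


doc = Block("document")
for line in ["> Lorem ipsum dolor", "sit amet.",
             "> - Qui *quodsi iracundia*", "> - aliquando id"]:
    incorporate(doc, line)
print([b.kind for b in open_chain(doc)])
```

Note that unmatched blocks are closed only once a new block start is found or the line turns out not to be a paragraph continuation; this delay is what lets the second line be absorbed as a lazy continuation.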
9101
9102 ## Phase 2: inline structure
9103
9104 Once all of the input has been parsed, all open blocks are closed.
9105
9106 We then "walk the tree," visiting every node, and parse raw
9107 string contents of paragraphs and headings as inlines.  At this
9108 point we have seen all the link reference definitions, so we can
9109 resolve reference links as we go.  For the example tree built in
phase 1, the result is:
9110
9111 ``` tree
9112 document
9113   block_quote
9114     paragraph
9115       str "Lorem ipsum dolor"
9116       softbreak
9117       str "sit amet."
9118     list (type=bullet tight=true bullet_char=-)
9119       list_item
9120         paragraph
9121           str "Qui "
9122           emph
9123             str "quodsi iracundia"
9124       list_item
9125         paragraph
9126           str "aliquando id"
9127 ```
9128
9129 Notice how the [line ending] in the first paragraph has
9130 been parsed as a `softbreak`, and the asterisks in the first list item
9131 have become an `emph`.
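
The tree walk itself is simple; the work is in the inline parser. The Python sketch below shows only the walk, with a stand-in inline parser that recognizes nothing but line endings (so it would not produce the `emph` node, and it ignores hard line breaks). The dictionary layout and the names `parse_inlines` and `walk` are assumptions made for this example.

```python
def parse_inlines(raw):
    """Stand-in inline parser: splits on line endings only."""
    inlines = []
    for n, line in enumerate(raw.split("\n")):
        if n > 0:
            inlines.append(("softbreak",))
        inlines.append(("str", line))
    return inlines


def walk(block):
    # Paragraphs and headings carry raw text; replace it with parsed inlines.
    if block["kind"] in ("paragraph", "heading"):
        block["inlines"] = parse_inlines(block.pop("text"))
    for child in block.get("children", []):
        walk(child)


tree = {"kind": "document", "children": [
    {"kind": "block_quote", "children": [
        {"kind": "paragraph", "text": "Lorem ipsum dolor\nsit amet."},
    ]},
]}
walk(tree)
print(tree["children"][0]["children"][0]["inlines"])
```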
9132
9133 ### An algorithm for parsing nested emphasis and links
9134
9135 By far the trickiest part of inline parsing is handling emphasis,
9136 strong emphasis, links, and images.  This is done using the following
9137 algorithm.
9138
9139 When we're parsing inlines and we hit either
9140
9141 - a run of `*` or `_` characters, or
9142 - a `[` or `![`
9143
9144 we insert a text node with these symbols as its literal content, and we
9145 add a pointer to this text node to the [delimiter stack](@).
9146
9147 The [delimiter stack] is a doubly linked list.  Each
9148 element contains a pointer to a text node, plus information about
9149
9150 - the type of delimiter (`[`, `![`, `*`, `_`)
9151 - the number of delimiters,
9152 - whether the delimiter is "active" (all are active to start), and
9153 - whether the delimiter is a potential opener, a potential closer,
9154   or both (which depends on what sort of characters precede
9155   and follow the delimiters).
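
One possible shape for such a stack entry, sketched in Python. The class and field names are hypothetical, and the text node is modeled as a plain dictionary purely for brevity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)
class Delimiter:
    text_node: dict                 # the text node holding the literal symbols
    kind: str                       # "[", "![", "*" or "_"
    count: int                      # number of delimiter characters in the run
    can_open: bool                  # potential opener?
    can_close: bool                 # potential closer?
    active: bool = True             # all delimiters start out active
    prev: Optional["Delimiter"] = None
    next: Optional["Delimiter"] = None


class DelimiterStack:
    def __init__(self):
        self.top = None

    def push(self, delim):
        delim.prev = self.top
        if self.top is not None:
            self.top.next = delim
        self.top = delim

    def remove(self, delim):
        # Unlink from both neighbours; this is why a doubly linked list helps.
        if delim.prev is not None:
            delim.prev.next = delim.next
        if delim.next is not None:
            delim.next.prev = delim.prev
        if delim is self.top:
            self.top = delim.prev
        delim.prev = delim.next = None


stack = DelimiterStack()
a = Delimiter({"literal": "["}, "[", 1, True, False)
b = Delimiter({"literal": "**"}, "*", 2, True, False)
c = Delimiter({"literal": "**"}, "*", 2, False, True)
for d in (a, b, c):
    stack.push(d)
stack.remove(b)                     # entries can be dropped from the middle
```

The doubly linked layout lets an entry be removed from the middle in constant time, which both procedures below rely on.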
9156
9157 When we hit a `]` character, we call the *look for link or image*
9158 procedure (see below).
9159
9160 When we hit the end of the input, we call the *process emphasis*
9161 procedure (see below), with `stack_bottom` = NULL.
9162
9163 #### *look for link or image*
9164
9165 Starting at the top of the delimiter stack, we look backwards
9166 through the stack for an opening `[` or `![` delimiter.
9167
9168 - If we don't find one, we return a literal text node `]`.
9169
9170 - If we do find one, but it's not *active*, we remove the inactive
9171   delimiter from the stack, and return a literal text node `]`.
9172
9173 - If we find one and it's active, then we parse ahead to see if
9174   we have an inline link/image, full reference link/image, collapsed reference
9175   link/image, or shortcut reference link/image.
9176
9177   + If we don't, then we remove the opening delimiter from the
9178     delimiter stack and return a literal text node `]`.
9179
9180   + If we do, then
9181
9182     * We return a link or image node whose children are the inlines
9183       after the text node pointed to by the opening delimiter.
9184
9185     * We run *process emphasis* on these inlines, with the `[` opener
9186       as `stack_bottom`.
9187
9188     * We remove the opening delimiter.
9189
9190     * If we have a link (and not an image), we also set all
9191       `[` delimiters before the opening delimiter to *inactive*.  (This
9192       will prevent us from getting links within links.)
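
The bookkeeping in this procedure can be sketched in Python as follows. This is a simplification under stated assumptions: the stack is a plain list of dictionaries (top of the stack last), the outcome of "parsing ahead" is passed in as the hypothetical flag `parses_as_link`, and the call to *process emphasis* is represented only by its effect on the stack (every delimiter above the opener is removed).

```python
def look_for_link_or_image(stack, parses_as_link):
    """Return "text", "link" or "image" and update the stack in place."""
    # Search from the top of the stack for the nearest `[` or `![`.
    for i in range(len(stack) - 1, -1, -1):
        if stack[i]["kind"] in ("[", "!["):
            break
    else:
        return "text"                    # no opener: `]` is literal text

    opener = stack[i]
    if not opener["active"] or not parses_as_link:
        del stack[i]                     # drop the useless opener
        return "text"

    # A full parser would run process emphasis here with the opener as
    # stack_bottom; afterwards everything above the opener is gone, and
    # then the opener itself is removed.
    del stack[i:]
    if opener["kind"] == "[":
        for delim in stack:              # everything left precedes the opener
            if delim["kind"] == "[":
                delim["active"] = False  # no links within links
        return "link"
    return "image"


stack = [
    {"kind": "[", "active": True},
    {"kind": "*", "active": True},
    {"kind": "[", "active": True},
    {"kind": "_", "active": True},
]
first = look_for_link_or_image(stack, parses_as_link=True)
second = look_for_link_or_image(stack, parses_as_link=True)
print(first, second)
```

In the usage example, the inner `[` becomes a link, which deactivates the outer `[`; the second `]` therefore yields literal text even though a link could otherwise be parsed there.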
9193
9194 #### *process emphasis*
9195
9196 Parameter `stack_bottom` sets a lower bound to how far we
9197 descend in the [delimiter stack].  If it is NULL, we can
9198 go all the way to the bottom.  Otherwise, we stop before
9199 visiting `stack_bottom`.
9200
9201 Let `current_position` point to the element on the [delimiter stack]
9202 just above `stack_bottom` (or the first element if `stack_bottom`
9203 is NULL).
9204
9205 We keep track of the `openers_bottom` for each delimiter
9206 type (`*`, `_`).  Initialize this to `stack_bottom`.
9207
9208 Then we repeat the following until we run out of potential
9209 closers:
9210
9211 - Move `current_position` forward in the delimiter stack (if needed)
9212   until we find the first potential closer with delimiter `*` or `_`.
9213   (This will be the potential closer closest
9214   to the beginning of the input -- the first one in parse order.)
9215
9216 - Now, look back in the stack (staying above `stack_bottom` and
9217   the `openers_bottom` for this delimiter type) for the
9218   first matching potential opener ("matching" means same delimiter).
9219
9220 - If one is found:
9221
9222   + Figure out whether we have emphasis or strong emphasis:
9223     if both closer and opener spans have length >= 2, we have
9224     strong, otherwise regular.
9225
9226   + Insert an emph or strong emph node accordingly, after
9227     the text node corresponding to the opener.
9228
9229   + Remove any delimiters between the opener and closer from
9230     the delimiter stack.
9231
9232   + Remove 1 (for regular emph) or 2 (for strong emph) delimiters
9233     from the opening and closing text nodes.  If they become empty
9234     as a result, remove them and remove the corresponding element
9235     of the delimiter stack.  If the closing node is removed, reset
9236     `current_position` to the next element in the stack.
9237
9238 - If none is found:
9239
9240   + Set `openers_bottom` to the element before `current_position`.
9241     (We know that there are no openers for this kind of closer up to and
9242     including this point, so this puts a lower bound on future searches.)
9243
9244   + If the closer at `current_position` is not a potential opener,
9245     remove it from the delimiter stack (since we know it can't
9246     be a closer either).
9247
9248   + Advance `current_position` to the next element in the stack.
9249
9250 After we're done, we remove all delimiters above `stack_bottom` from the
9251 delimiter stack.
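
A compact Python sketch of this procedure follows, under several assumptions. It handles only the case `stack_bottom` = NULL; the opener/closer flags are supplied by the caller rather than computed from the surrounding characters; and it leaves out the `openers_bottom` bookkeeping, which in the procedure as described here appears to serve only to bound the backward search. Inline nodes are modeled as a flat Python list of strings, `Delim` objects, and `(kind, children)` tuples; all of these names are hypothetical.

```python
class Delim:
    """A text node holding a run of `*` or `_`, doubling as its stack entry."""
    def __init__(self, char, count, can_open, can_close):
        self.char, self.count = char, count
        self.can_open, self.can_close = can_open, can_close


def process_emphasis(nodes):
    stack = [n for n in nodes if isinstance(n, Delim)]
    cur = 0
    while cur < len(stack):
        closer = stack[cur]
        if not closer.can_close:
            cur += 1
            continue
        # Look back for the nearest potential opener with the same delimiter.
        k = cur - 1
        while k >= 0 and not (stack[k].can_open and stack[k].char == closer.char):
            k -= 1
        if k < 0:                         # none found
            if closer.can_open:
                cur += 1                  # may still open something later
            else:
                del stack[cur]            # cur now refers to the next element
            continue
        opener = stack[k]
        use = 2 if opener.count >= 2 and closer.count >= 2 else 1
        i, j = nodes.index(opener), nodes.index(closer)
        # Wrap everything between opener and closer in a new node.
        nodes[i + 1:j] = [("strong" if use == 2 else "emph", nodes[i + 1:j])]
        del stack[k + 1:cur]              # delimiters in between leave the stack
        cur = k + 1                       # new position of the closer
        opener.count -= use
        closer.count -= use
        if opener.count == 0:
            nodes.remove(opener)
            del stack[k]
            cur -= 1
        if closer.count == 0:
            nodes.remove(closer)
            del stack[cur]                # cur now refers to the next element
    return nodes


def render(nodes):
    out = []
    for n in nodes:
        if isinstance(n, str):
            out.append(n)
        elif isinstance(n, Delim):
            out.append(n.char * n.count)  # leftover delimiters are literal text
        else:
            tag = "em" if n[0] == "emph" else "strong"
            out.append(f"<{tag}>{render(n[1])}</{tag}>")
    return "".join(out)


# Delimiter runs for the input: *foo **bar** baz*
nodes = [Delim("*", 1, True, False), "foo ", Delim("*", 2, True, False), "bar",
         Delim("*", 2, False, True), " baz", Delim("*", 1, False, True)]
print(render(process_emphasis(nodes)))    # <em>foo <strong>bar</strong> baz</em>
```

Because the local `stack` list is discarded when the function returns, the final step (removing all delimiters above `stack_bottom`) happens implicitly.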
9252