1 ---
2 title: CommonMark Spec
3 author: John MacFarlane
4 version: 0.26
5 date: '2016-07-15'
6 license: '[CC-BY-SA 4.0](http://creativecommons.org/licenses/by-sa/4.0/)'
7 ...
8
9 # Introduction
10
11 ## What is Markdown?
12
13 Markdown is a plain text format for writing structured documents,
14 based on conventions used for indicating formatting in email and
15 Usenet posts.  It was developed in 2004 by John Gruber, who wrote
16 the first Markdown-to-HTML converter in Perl, and it soon became
17 ubiquitous.  In the next decade, dozens of implementations were
18 developed in many languages.  Some extended the original
19 Markdown syntax with conventions for footnotes, tables, and
20 other document elements.  Some allowed Markdown documents to be
21 rendered in formats other than HTML.  Websites like Reddit,
22 StackOverflow, and GitHub had millions of people using Markdown.
23 And Markdown started to be used beyond the web, to author books,
24 articles, slide shows, letters, and lecture notes.
25
26 What distinguishes Markdown from many other lightweight markup
27 syntaxes, which are often easier to write, is its readability.
28 As Gruber writes:
29
30 > The overriding design goal for Markdown's formatting syntax is
31 > to make it as readable as possible. The idea is that a
32 > Markdown-formatted document should be publishable as-is, as
33 > plain text, without looking like it's been marked up with tags
34 > or formatting instructions.
35 > (<http://daringfireball.net/projects/markdown/>)
36
37 The point can be illustrated by comparing a sample of
38 [AsciiDoc](http://www.methods.co.nz/asciidoc/) with
39 an equivalent sample of Markdown.  Here is a sample of
40 AsciiDoc from the AsciiDoc manual:
41
42 ```
43 1. List item one.
44 +
45 List item one continued with a second paragraph followed by an
46 Indented block.
47 +
48 .................
49 $ ls *.sh
50 $ mv *.sh ~/tmp
51 .................
52 +
53 List item continued with a third paragraph.
54
55 2. List item two continued with an open block.
56 +
57 --
58 This paragraph is part of the preceding list item.
59
60 a. This list is nested and does not require explicit item
61 continuation.
62 +
63 This paragraph is part of the preceding list item.
64
65 b. List item b.
66
67 This paragraph belongs to item two of the outer list.
68 --
69 ```
70
71 And here is the equivalent in Markdown:
72 ```
73 1.  List item one.
74
75     List item one continued with a second paragraph followed by an
76     Indented block.
77
78         $ ls *.sh
79         $ mv *.sh ~/tmp
80
81     List item continued with a third paragraph.
82
83 2.  List item two continued with an open block.
84
85     This paragraph is part of the preceding list item.
86
87     1. This list is nested and does not require explicit item continuation.
88
89        This paragraph is part of the preceding list item.
90
91     2. List item b.
92
93     This paragraph belongs to item two of the outer list.
94 ```
95
96 The AsciiDoc version is, arguably, easier to write. You don't need
97 to worry about indentation.  But the Markdown version is much easier
98 to read.  The nesting of list items is apparent to the eye in the
99 source, not just in the processed document.
100
101 ## Why is a spec needed?
102
103 John Gruber's [canonical description of Markdown's
104 syntax](http://daringfireball.net/projects/markdown/syntax)
105 does not specify the syntax unambiguously.  Here are some examples of
106 questions it does not answer:
107
108 1.  How much indentation is needed for a sublist?  The spec says that
109     continuation paragraphs need to be indented four spaces, but is
110     not fully explicit about sublists.  It is natural to think that
111     they, too, must be indented four spaces, but `Markdown.pl` does
112     not require that.  This is hardly a "corner case," and divergences
113     between implementations on this issue often lead to surprises for
114     users in real documents. (See [this comment by John
115     Gruber](http://article.gmane.org/gmane.text.markdown.general/1997).)
116
117 2.  Is a blank line needed before a block quote or heading?
118     Most implementations do not require the blank line.  However,
119     this can lead to unexpected results in hard-wrapped text, and
120     also to ambiguities in parsing (note that some implementations
121     put the heading inside the blockquote, while others do not).
122     (John Gruber has also spoken [in favor of requiring the blank
123     lines](http://article.gmane.org/gmane.text.markdown.general/2146).)
124
125 3.  Is a blank line needed before an indented code block?
126     (`Markdown.pl` requires it, but this is not mentioned in the
127     documentation, and some implementations do not require it.)
128
129     ``` markdown
130     paragraph
131         code?
132     ```
133
134 4.  What is the exact rule for determining when list items get
135     wrapped in `<p>` tags?  Can a list be partially "loose" and partially
136     "tight"?  What should we do with a list like this?
137
138     ``` markdown
139     1. one
140
141     2. two
142     3. three
143     ```
144
145     Or this?
146
147     ``` markdown
148     1.  one
149         - a
150
151         - b
152     2.  two
153     ```
154
155     (There are some relevant comments by John Gruber
156     [here](http://article.gmane.org/gmane.text.markdown.general/2554).)
157
158 5.  Can list markers be indented?  Can ordered list markers be right-aligned?
159
160     ``` markdown
161      8. item 1
162      9. item 2
163     10. item 2a
164     ```
165
166 6.  Is this one list with a thematic break in its second item,
167     or two lists separated by a thematic break?
168
169     ``` markdown
170     * a
171     * * * * *
172     * b
173     ```
174
175 7.  When list markers change from numbers to bullets, do we have
176     two lists or one?  (The Markdown syntax description suggests two,
177     but the Perl scripts and many other implementations produce one.)
178
179     ``` markdown
180     1. fee
181     2. fie
182     -  foe
183     -  fum
184     ```
185
186 8.  What are the precedence rules for the markers of inline structure?
187     For example, is the following a valid link, or does the code span
188     take precedence?
189
190     ``` markdown
191     [a backtick (`)](/url) and [another backtick (`)](/url).
192     ```
193
194 9.  What are the precedence rules for markers of emphasis and strong
195     emphasis?  For example, how should the following be parsed?
196
197     ``` markdown
198     *foo *bar* baz*
199     ```
200
201 10. What are the precedence rules between block-level and inline-level
202     structure?  For example, how should the following be parsed?
203
204     ``` markdown
205     - `a long code span can contain a hyphen like this
206       - and it can screw things up`
207     ```
208
209 11. Can list items include section headings?  (`Markdown.pl` does not
210     allow this, but does allow blockquotes to include headings.)
211
212     ``` markdown
213     - # Heading
214     ```
215
216 12. Can list items be empty?
217
218     ``` markdown
219     * a
220     *
221     * b
222     ```
223
224 13. Can link references be defined inside block quotes or list items?
225
226     ``` markdown
227     > Blockquote [foo].
228     >
229     > [foo]: /url
230     ```
231
232 14. If there are multiple definitions for the same reference, which takes
233     precedence?
234
235     ``` markdown
236     [foo]: /url1
237     [foo]: /url2
238
239     [foo][]
240     ```
241
242 In the absence of a spec, early implementers consulted `Markdown.pl`
243 to resolve these ambiguities.  But `Markdown.pl` was quite buggy, and
244 gave manifestly bad results in many cases, so it was not a
245 satisfactory replacement for a spec.
246
247 Because there is no unambiguous spec, implementations have diverged
248 considerably.  As a result, users are often surprised to find that
249 a document that renders one way on one system (say, a GitHub wiki)
250 renders differently on another (say, converting to DocBook using
251 pandoc).  To make matters worse, because nothing in Markdown counts
252 as a "syntax error," the divergence often isn't discovered right away.
253
254 ## About this document
255
256 This document attempts to specify Markdown syntax unambiguously.
257 It contains many examples with side-by-side Markdown and
258 HTML.  These are intended to double as conformance tests.  An
259 accompanying script `spec_tests.py` can be used to run the tests
260 against any Markdown program:
261
262     python test/spec_tests.py --spec spec.txt --program PROGRAM
263
264 Since this document describes how Markdown is to be parsed into
265 an abstract syntax tree, it would have made sense to use an abstract
266 representation of the syntax tree instead of HTML.  But HTML is capable
267 of representing the structural distinctions we need to make, and the
268 choice of HTML for the tests makes it possible to run the tests against
269 an implementation without writing an abstract syntax tree renderer.
270
271 This document is generated from a text file, `spec.txt`, written
272 in Markdown with a small extension for the side-by-side tests.
273 The script `tools/makespec.py` can be used to convert `spec.txt` into
274 HTML or CommonMark (which can then be converted into other formats).
275
276 In the examples, the `→` character is used to represent tabs.
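
The side-by-side examples in `spec.txt` follow a simple layout: an opening
fence of 32 backticks followed by ` example`, the Markdown source, a line
containing only `.`, the expected HTML, and a closing fence.  As a rough
sketch (this is not the code of `spec_tests.py`; the name
`extract_examples` is hypothetical), the examples could be collected
like this:

```python
FENCE = "`" * 32  # the fence used around each example in spec.txt


def extract_examples(spec_text):
    """Return a list of (markdown, html) pairs found in spec_text."""
    examples = []
    state = "outside"  # one of "outside", "markdown", "html"
    markdown_lines, html_lines = [], []
    for line in spec_text.split("\n"):
        if state == "outside":
            if line == FENCE + " example":
                state = "markdown"
                markdown_lines, html_lines = [], []
        elif line == FENCE:
            # Closing fence: store the pair, turning the visible tab
            # marker back into a real tab character.
            markdown = "".join(l + "\n" for l in markdown_lines)
            html = "".join(l + "\n" for l in html_lines)
            examples.append((markdown.replace("→", "\t"),
                             html.replace("→", "\t")))
            state = "outside"
        elif state == "markdown" and line == ".":
            state = "html"  # the lone "." separates source from output
        elif state == "markdown":
            markdown_lines.append(line)
        else:
            html_lines.append(line)
    return examples
```

Each pair can then be fed to a Markdown program and the output compared
with the expected HTML.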
277
278 # Preliminaries
279
280 ## Characters and lines
281
282 Any sequence of [characters] is a valid CommonMark
283 document.
284
285 A [character](@) is a Unicode code point.  Although some
286 code points (for example, combining accents) do not correspond to
287 characters in an intuitive sense, all code points count as characters
288 for purposes of this spec.
289
290 This spec does not specify an encoding; it thinks of lines as composed
291 of [characters] rather than bytes.  A conforming parser may be limited
292 to a certain encoding.
293
294 A [line](@) is a sequence of zero or more [characters]
295 other than newline (`U+000A`) or carriage return (`U+000D`),
296 followed by a [line ending] or by the end of file.
297
298 A [line ending](@) is a newline (`U+000A`), a carriage return
299 (`U+000D`) not followed by a newline, or a carriage return and a
300 following newline.
301
302 A line containing no characters, or a line containing only spaces
303 (`U+0020`) or tabs (`U+0009`), is called a [blank line](@).
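
The definitions of line, line ending, and blank line above can be
sketched in a few lines of Python.  This is only an illustration; the
names `split_lines` and `is_blank_line` are made up for this sketch and
are not part of any reference implementation:

```python
import re

# A line ending is CR LF, a lone CR, or a lone LF.  The alternation tries
# "\r\n" first so that a CR LF pair counts as a single line ending.
LINE_ENDING = re.compile(r"\r\n|\r|\n")


def split_lines(text):
    """Split text into lines, without their line endings."""
    lines = LINE_ENDING.split(text)
    # A line ending at the very end of the input terminates the last
    # line; it does not start a new, empty one.
    if lines and lines[-1] == "":
        lines.pop()
    return lines


def is_blank_line(line):
    """True if the line has no characters, or only spaces and tabs."""
    return all(ch in " \t" for ch in line)
```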
304
305 The following definitions of character classes will be used in this spec:
306
307 A [whitespace character](@) is a space
308 (`U+0020`), tab (`U+0009`), newline (`U+000A`), line tabulation (`U+000B`),
309 form feed (`U+000C`), or carriage return (`U+000D`).
310
311 [Whitespace](@) is a sequence of one or more [whitespace
312 characters].
313
314 A [Unicode whitespace character](@) is
315 any code point in the Unicode `Zs` class, or a tab (`U+0009`),
316 carriage return (`U+000D`), newline (`U+000A`), or form feed
317 (`U+000C`).
318
319 [Unicode whitespace](@) is a sequence of one
320 or more [Unicode whitespace characters].
321
322 A [space](@) is `U+0020`.
323
324 A [non-whitespace character](@) is any character
325 that is not a [whitespace character].
326
327 An [ASCII punctuation character](@)
328 is `!`, `"`, `#`, `$`, `%`, `&`, `'`, `(`, `)`,
329 `*`, `+`, `,`, `-`, `.`, `/`, `:`, `;`, `<`, `=`, `>`, `?`, `@`,
330 `[`, `\`, `]`, `^`, `_`, `` ` ``, `{`, `|`, `}`, or `~`.
331
332 A [punctuation character](@) is an [ASCII
333 punctuation character] or anything in
334 the Unicode classes `Pc`, `Pd`, `Pe`, `Pf`, `Pi`, `Po`, or `Ps`.
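
As an illustration, the character classes defined above map directly
onto Python's standard `unicodedata` module.  The function and constant
names below are assumptions made for this sketch:

```python
import unicodedata

# The 32 ASCII punctuation characters listed in the definition above.
ASCII_PUNCTUATION = set("!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~")

# Space, tab, newline, line tabulation, form feed, carriage return.
WHITESPACE = set(" \t\n\x0b\x0c\r")

# Tab, carriage return, newline, form feed (line tabulation is absent).
UNICODE_WHITESPACE_EXTRA = set("\t\r\n\x0c")

PUNCTUATION_CLASSES = {"Pc", "Pd", "Pe", "Pf", "Pi", "Po", "Ps"}


def is_whitespace_char(ch):
    return ch in WHITESPACE


def is_unicode_whitespace_char(ch):
    return ch in UNICODE_WHITESPACE_EXTRA or unicodedata.category(ch) == "Zs"


def is_punctuation_char(ch):
    return (ch in ASCII_PUNCTUATION
            or unicodedata.category(ch) in PUNCTUATION_CLASSES)
```

Note that the two whitespace classes are not nested: a no-break space
(`U+00A0`) is a Unicode whitespace character but not a whitespace
character, while line tabulation (`U+000B`) is the reverse.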
335
336 ## Tabs
337
338 Tabs in lines are not expanded to [spaces].  However,
339 in contexts where whitespace helps to define block structure,
340 tabs behave as if they were replaced by spaces with a tab stop
341 of 4 characters.
342
343 Thus, for example, a tab can be used instead of four spaces
344 in an indented code block.  (Note, however, that internal
345 tabs are passed through as literal tabs, not expanded to
346 spaces.)
347
348 ```````````````````````````````` example
349 →foo→baz→→bim
350 .
351 <pre><code>foo→baz→→bim
352 </code></pre>
353 ````````````````````````````````
354
355 ```````````````````````````````` example
356   →foo→baz→→bim
357 .
358 <pre><code>foo→baz→→bim
359 </code></pre>
360 ````````````````````````````````
361
362 ```````````````````````````````` example
363     a→a
364     ὐ→a
365 .
366 <pre><code>a→a
367 ὐ→a
368 </code></pre>
369 ````````````````````````````````
370
371 In the following example, a continuation paragraph of a list
372 item is indented with a tab; this has exactly the same effect
373 as indentation with four spaces would:
374
375 ```````````````````````````````` example
376   - foo
377
378 →bar
379 .
380 <ul>
381 <li>
382 <p>foo</p>
383 <p>bar</p>
384 </li>
385 </ul>
386 ````````````````````````````````
387
388 ```````````````````````````````` example
389 - foo
390
391 →→bar
392 .
393 <ul>
394 <li>
395 <p>foo</p>
396 <pre><code>  bar
397 </code></pre>
398 </li>
399 </ul>
400 ````````````````````````````````
401
402 Normally the `>` that begins a block quote may be followed
403 optionally by a space, which is not considered part of the
404 content.  In the following case `>` is followed by a tab,
405 which is treated as if it were expanded into spaces.
406 Since one of these spaces is considered part of the
407 delimiter, `foo` is considered to be indented six spaces
408 inside the block quote context, so we get an indented
409 code block starting with two spaces.
410
411 ```````````````````````````````` example
412 >→→foo
413 .
414 <blockquote>
415 <pre><code>  foo
416 </code></pre>
417 </blockquote>
418 ````````````````````````````````
419
420 ```````````````````````````````` example
421 -→→foo
422 .
423 <ul>
424 <li>
425 <pre><code>  foo
426 </code></pre>
427 </li>
428 </ul>
429 ````````````````````````````````
430
431
432 ```````````````````````````````` example
433     foo
434 →bar
435 .
436 <pre><code>foo
437 bar
438 </code></pre>
439 ````````````````````````````````
440
441 ```````````````````````````````` example
442  - foo
443    - bar
444 → - baz
445 .
446 <ul>
447 <li>foo
448 <ul>
449 <li>bar
450 <ul>
451 <li>baz</li>
452 </ul>
453 </li>
454 </ul>
455 </li>
456 </ul>
457 ````````````````````````````````
458
459 ```````````````````````````````` example
460 #→Foo
461 .
462 <h1>Foo</h1>
463 ````````````````````````````````
464
465 ```````````````````````````````` example
466 *→*→*→
467 .
468 <hr />
469 ````````````````````````````````
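
The tab stop rule used in the examples above can be sketched as a small
helper that measures the indentation of a line in columns rather than
characters.  The name `indent_width` is hypothetical, and the sketch
only measures leading whitespace; it does not decide what block the
line belongs to:

```python
TAB_STOP = 4


def indent_width(line):
    """Return the width, in columns, of the line's leading whitespace."""
    column = 0
    for ch in line:
        if ch == " ":
            column += 1
        elif ch == "\t":
            # A tab advances to the next multiple of TAB_STOP.
            column += TAB_STOP - (column % TAB_STOP)
        else:
            break
    return column
```

With this rule, two spaces followed by a tab give the same width as a
single tab or as four spaces, which is why each of them is enough to
start an indented code block.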
470
471
472 ## Insecure characters
473
474 For security reasons, the Unicode character `U+0000` must be replaced
475 with the REPLACEMENT CHARACTER (`U+FFFD`).
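
In Python, for instance, this replacement is a single string operation
(the function name is chosen here only for illustration):

```python
def replace_insecure_characters(text):
    """Replace every U+0000 with U+FFFD, the REPLACEMENT CHARACTER."""
    return text.replace("\u0000", "\uFFFD")
```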
476
477 # Blocks and inlines
478
479 We can think of a document as a sequence of
480 [blocks](@)---structural elements like paragraphs, block
481 quotations, lists, headings, rules, and code blocks.  Some blocks (like
482 block quotes and list items) contain other blocks; others (like
483 headings and paragraphs) contain [inline](@) content---text,
484 links, emphasized text, images, code, and so on.
485
486 ## Precedence
487
488 Indicators of block structure always take precedence over indicators
489 of inline structure.  So, for example, the following is a list with
490 two items, not a list with one item containing a code span:
491
492 ```````````````````````````````` example
493 - `one
494 - two`
495 .
496 <ul>
497 <li>`one</li>
498 <li>two`</li>
499 </ul>
500 ````````````````````````````````
501
502
503 This means that parsing can proceed in two steps:  first, the block
504 structure of the document can be discerned; second, text lines inside
505 paragraphs, headings, and other block constructs can be parsed for inline
506 structure.  The second step requires information about link reference
507 definitions that will be available only at the end of the first
508 step.  Note that the first step requires processing lines in sequence,
509 but the second can be parallelized, since the inline parsing of
510 one block element does not affect the inline parsing of any other.
511
512 ## Container blocks and leaf blocks
513
514 We can divide blocks into two types:
515 [container block](@)s,
516 which can contain other blocks, and [leaf block](@)s,
517 which cannot.
518
519 # Leaf blocks
520
521 This section describes the different kinds of leaf block that make up a
522 Markdown document.
523
524 ## Thematic breaks
525
526 A line consisting of 0-3 spaces of indentation, followed by a sequence
527 of three or more matching `-`, `_`, or `*` characters, each followed
528 optionally by any number of spaces, forms a
529 [thematic break](@).
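
The line-level condition just stated can be approximated with a regular
expression.  This is a sketch under stated assumptions, not a reference
implementation: the name `is_thematic_break` is hypothetical, tabs are
accepted between the characters because an example in the Tabs section
shows them being accepted, and the check looks at a single line only, so
it knows nothing about precedence relative to setext headings or list
items:

```python
import re

# Up to three spaces of indentation, then three or more copies of the
# same marker character, each optionally followed by spaces or tabs.
THEMATIC_BREAK = re.compile(r" {0,3}([-_*])[ \t]*(?:\1[ \t]*){2,}")


def is_thematic_break(line):
    """True if the line, taken on its own, has the form of a thematic break."""
    return THEMATIC_BREAK.fullmatch(line) is not None
```

The backreference `\1` is what enforces that all of the marker
characters match.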
530
531 ```````````````````````````````` example
532 ***
533 ---
534 ___
535 .
536 <hr />
537 <hr />
538 <hr />
539 ````````````````````````````````
540
541
542 Wrong characters:
543
544 ```````````````````````````````` example
545 +++
546 .
547 <p>+++</p>
548 ````````````````````````````````
549
550
551 ```````````````````````````````` example
552 ===
553 .
554 <p>===</p>
555 ````````````````````````````````
556
557
558 Not enough characters:
559
560 ```````````````````````````````` example
561 --
562 **
563 __
564 .
565 <p>--
566 **
567 __</p>
568 ````````````````````````````````
569
570
571 An indent of one to three spaces is allowed:
572
573 ```````````````````````````````` example
574  ***
575   ***
576    ***
577 .
578 <hr />
579 <hr />
580 <hr />
581 ````````````````````````````````
582
583
584 Four spaces is too many:
585
586 ```````````````````````````````` example
587     ***
588 .
589 <pre><code>***
590 </code></pre>
591 ````````````````````````````````
592
593
594 ```````````````````````````````` example
595 Foo
596     ***
597 .
598 <p>Foo
599 ***</p>
600 ````````````````````````````````
601
602
603 More than three characters may be used:
604
605 ```````````````````````````````` example
606 _____________________________________
607 .
608 <hr />
609 ````````````````````````````````
610
611
612 Spaces are allowed between the characters:
613
614 ```````````````````````````````` example
615  - - -
616 .
617 <hr />
618 ````````````````````````````````
619
620
621 ```````````````````````````````` example
622  **  * ** * ** * **
623 .
624 <hr />
625 ````````````````````````````````
626
627
628 ```````````````````````````````` example
629 -     -      -      -
630 .
631 <hr />
632 ````````````````````````````````
633
634
635 Spaces are allowed at the end:
636
637 ```````````````````````````````` example
638 - - - -    
639 .
640 <hr />
641 ````````````````````````````````
642
643
644 However, no other characters may occur in the line:
645
646 ```````````````````````````````` example
647 _ _ _ _ a
648
649 a------
650
651 ---a---
652 .
653 <p>_ _ _ _ a</p>
654 <p>a------</p>
655 <p>---a---</p>
656 ````````````````````````````````
657
658
659 It is required that all of the [non-whitespace characters] be the same.
660 So, this is not a thematic break:
661
662 ```````````````````````````````` example
663  *-*
664 .
665 <p><em>-</em></p>
666 ````````````````````````````````
667
668
669 Thematic breaks do not need blank lines before or after:
670
671 ```````````````````````````````` example
672 - foo
673 ***
674 - bar
675 .
676 <ul>
677 <li>foo</li>
678 </ul>
679 <hr />
680 <ul>
681 <li>bar</li>
682 </ul>
683 ````````````````````````````````
684
685
686 Thematic breaks can interrupt a paragraph:
687
688 ```````````````````````````````` example
689 Foo
690 ***
691 bar
692 .
693 <p>Foo</p>
694 <hr />
695 <p>bar</p>
696 ````````````````````````````````
697
698
699 If a line of dashes that meets the above conditions for being a
700 thematic break could also be interpreted as the underline of a [setext
701 heading], the interpretation as a
702 [setext heading] takes precedence. Thus, for example,
703 this is a setext heading, not a paragraph followed by a thematic break:
704
705 ```````````````````````````````` example
706 Foo
707 ---
708 bar
709 .
710 <h2>Foo</h2>
711 <p>bar</p>
712 ````````````````````````````````
713
714
715 When both a thematic break and a list item are possible
716 interpretations of a line, the thematic break takes precedence:
717
718 ```````````````````````````````` example
719 * Foo
720 * * *
721 * Bar
722 .
723 <ul>
724 <li>Foo</li>
725 </ul>
726 <hr />
727 <ul>
728 <li>Bar</li>
729 </ul>
730 ````````````````````````````````
731
732
733 If you want a thematic break in a list item, use a different bullet:
734
735 ```````````````````````````````` example
736 - Foo
737 - * * *
738 .
739 <ul>
740 <li>Foo</li>
741 <li>
742 <hr />
743 </li>
744 </ul>
745 ````````````````````````````````
746
747
748 ## ATX headings
749
750 An [ATX heading](@)
751 consists of a string of characters, parsed as inline content, between an
752 opening sequence of 1--6 unescaped `#` characters and an optional
753 closing sequence of any number of unescaped `#` characters.
754 The opening sequence of `#` characters must be followed by a
755 [space] or by the end of line. The optional closing sequence of `#`s must be
756 preceded by a [space] and may be followed by spaces only.  The opening
757 `#` character may be indented 0-3 spaces.  The raw contents of the
758 heading are stripped of leading and trailing spaces before being parsed
759 as inline content.  The heading level is equal to the number of `#`
760 characters in the opening sequence.
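
As a rough sketch of these rules, the following function recognizes an
ATX heading line and returns its level together with the raw contents,
before any inline parsing.  The name `parse_atx_heading` is
hypothetical.  One assumption goes beyond the prose above: a tab is
accepted wherever a space separates the `#` sequences from the contents,
because an example in the Tabs section shows `#→Foo` producing a
heading:

```python
import re

# Opening sequence: up to three spaces of indentation, then 1-6 "#"
# characters followed by a space, a tab, or the end of the line.
ATX_OPENING = re.compile(r" {0,3}(#{1,6})(?=[ \t]|$)")

# Optional closing sequence: preceded by a space or tab, and followed
# by nothing but spaces or tabs.
ATX_CLOSING = re.compile(r"[ \t]#+[ \t]*$")


def parse_atx_heading(line):
    """Return (level, raw_contents), or None if line is not an ATX heading."""
    match = ATX_OPENING.match(line)
    if match is None:
        return None
    level = len(match.group(1))
    contents = ATX_CLOSING.sub("", line[match.end():])
    return level, contents.strip(" \t")
```

A backslash-escaped `#` at the end is left in the contents, since it is
not preceded by a space; turning `\#` into a literal `#` is the job of
the inline parser.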
761
762 Simple headings:
763
764 ```````````````````````````````` example
765 # foo
766 ## foo
767 ### foo
768 #### foo
769 ##### foo
770 ###### foo
771 .
772 <h1>foo</h1>
773 <h2>foo</h2>
774 <h3>foo</h3>
775 <h4>foo</h4>
776 <h5>foo</h5>
777 <h6>foo</h6>
778 ````````````````````````````````
779
780
781 More than six `#` characters is not a heading:
782
783 ```````````````````````````````` example
784 ####### foo
785 .
786 <p>####### foo</p>
787 ````````````````````````````````
788
789
790 At least one space is required between the `#` characters and the
791 heading's contents, unless the heading is empty.  Note that many
792 implementations currently do not require the space.  However, the
793 space was required by the
794 [original ATX implementation](http://www.aaronsw.com/2002/atx/atx.py),
795 and it helps prevent things like the following from being parsed as
796 headings:
797
798 ```````````````````````````````` example
799 #5 bolt
800
801 #hashtag
802 .
803 <p>#5 bolt</p>
804 <p>#hashtag</p>
805 ````````````````````````````````
806
807
808 This is not a heading, because the first `#` is escaped:
809
810 ```````````````````````````````` example
811 \## foo
812 .
813 <p>## foo</p>
814 ````````````````````````````````
815
816
817 Contents are parsed as inlines:
818
819 ```````````````````````````````` example
820 # foo *bar* \*baz\*
821 .
822 <h1>foo <em>bar</em> *baz*</h1>
823 ````````````````````````````````
824
825
826 Leading and trailing blanks are ignored in parsing inline content:
827
828 ```````````````````````````````` example
829 #                  foo                     
830 .
831 <h1>foo</h1>
832 ````````````````````````````````
833
834
835 An indentation of one to three spaces is allowed:
836
837 ```````````````````````````````` example
838  ### foo
839   ## foo
840    # foo
841 .
842 <h3>foo</h3>
843 <h2>foo</h2>
844 <h1>foo</h1>
845 ````````````````````````````````
846
847
848 Four spaces are too much:
849
850 ```````````````````````````````` example
851     # foo
852 .
853 <pre><code># foo
854 </code></pre>
855 ````````````````````````````````
856
857
858 ```````````````````````````````` example
859 foo
860     # bar
861 .
862 <p>foo
863 # bar</p>
864 ````````````````````````````````
865
866
867 A closing sequence of `#` characters is optional:
868
869 ```````````````````````````````` example
870 ## foo ##
871   ###   bar    ###
872 .
873 <h2>foo</h2>
874 <h3>bar</h3>
875 ````````````````````````````````
876
877
878 It need not be the same length as the opening sequence:
879
880 ```````````````````````````````` example
881 # foo ##################################
882 ##### foo ##
883 .
884 <h1>foo</h1>
885 <h5>foo</h5>
886 ````````````````````````````````
887
888
889 Spaces are allowed after the closing sequence:
890
891 ```````````````````````````````` example
892 ### foo ###     
893 .
894 <h3>foo</h3>
895 ````````````````````````````````
896
897
898 A sequence of `#` characters with anything but [spaces] following it
899 is not a closing sequence, but counts as part of the contents of the
900 heading:
901
902 ```````````````````````````````` example
903 ### foo ### b
904 .
905 <h3>foo ### b</h3>
906 ````````````````````````````````
907
908
909 The closing sequence must be preceded by a space:
910
911 ```````````````````````````````` example
912 # foo#
913 .
914 <h1>foo#</h1>
915 ````````````````````````````````
916
917
918 Backslash-escaped `#` characters do not count as part
919 of the closing sequence:
920
921 ```````````````````````````````` example
922 ### foo \###
923 ## foo #\##
924 # foo \#
925 .
926 <h3>foo ###</h3>
927 <h2>foo ###</h2>
928 <h1>foo #</h1>
929 ````````````````````````````````
930
931
932 ATX headings need not be separated from surrounding content by blank
933 lines, and they can interrupt paragraphs:
934
935 ```````````````````````````````` example
936 ****
937 ## foo
938 ****
939 .
940 <hr />
941 <h2>foo</h2>
942 <hr />
943 ````````````````````````````````
944
945
946 ```````````````````````````````` example
947 Foo bar
948 # baz
949 Bar foo
950 .
951 <p>Foo bar</p>
952 <h1>baz</h1>
953 <p>Bar foo</p>
954 ````````````````````````````````
955
956
957 ATX headings can be empty:
958
959 ```````````````````````````````` example
960 ## 
961 #
962 ### ###
963 .
964 <h2></h2>
965 <h1></h1>
966 <h3></h3>
967 ````````````````````````````````
968
969
970 ## Setext headings
971
972 A [setext heading](@) consists of one or more
973 lines of text, each containing at least one [non-whitespace
974 character], with no more than 3 spaces indentation, followed by
975 a [setext heading underline].  The lines of text must be such
976 that, were they not followed by the setext heading underline,
977 they would be interpreted as a paragraph:  they cannot be
978 interpretable as a [code fence], [ATX heading][ATX headings],
979 [block quote][block quotes], [thematic break][thematic breaks],
980 [list item][list items], or [HTML block][HTML blocks].
981
982 A [setext heading underline](@) is a sequence of
983 `=` characters or a sequence of `-` characters, with no more than 3
984 spaces indentation and any number of trailing spaces.  If a line
985 containing a single `-` can be interpreted as an
986 empty [list item][list items], it should be interpreted this way
987 and not as a [setext heading underline].
988
989 The heading is a level 1 heading if `=` characters are used in
990 the [setext heading underline], and a level 2 heading if `-`
991 characters are used.  The contents of the heading are the result
992 of parsing the preceding lines of text as CommonMark inline
993 content.
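
The shape of a setext heading underline, and the heading level it
implies, can be sketched as follows.  The name `setext_underline_level`
is hypothetical.  The function inspects one line in isolation: whether
that line really ends a heading depends on the preceding lines, and the
rule that a single `-` is preferably read as an empty list item is not
modelled here:

```python
import re

# Up to three spaces of indentation, a run of "=" or a run of "-",
# then any number of trailing spaces.  No internal spaces are allowed.
SETEXT_UNDERLINE = re.compile(r" {0,3}(=+|-+) *")


def setext_underline_level(line):
    """Return 1 for an "=" underline, 2 for a "-" underline, else None."""
    match = SETEXT_UNDERLINE.fullmatch(line)
    if match is None:
        return None
    return 1 if match.group(1).startswith("=") else 2
```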
994
995 In general, a setext heading need not be preceded or followed by a
996 blank line.  However, it cannot interrupt a paragraph, so when a
997 setext heading comes after a paragraph, a blank line is needed between
998 them.
999
1000 Simple examples:
1001
1002 ```````````````````````````````` example
1003 Foo *bar*
1004 =========
1005
1006 Foo *bar*
1007 ---------
1008 .
1009 <h1>Foo <em>bar</em></h1>
1010 <h2>Foo <em>bar</em></h2>
1011 ````````````````````````````````
1012
1013
1014 The content of the heading may span more than one line:
1015
1016 ```````````````````````````````` example
1017 Foo *bar
1018 baz*
1019 ====
1020 .
1021 <h1>Foo <em>bar
1022 baz</em></h1>
1023 ````````````````````````````````
1024
1025
1026 The underlining can be any length:
1027
1028 ```````````````````````````````` example
1029 Foo
1030 -------------------------
1031
1032 Foo
1033 =
1034 .
1035 <h2>Foo</h2>
1036 <h1>Foo</h1>
1037 ````````````````````````````````
1038
1039
1040 The heading content can be indented up to three spaces, and need
1041 not line up with the underlining:
1042
1043 ```````````````````````````````` example
1044    Foo
1045 ---
1046
1047   Foo
1048 -----
1049
1050   Foo
1051   ===
1052 .
1053 <h2>Foo</h2>
1054 <h2>Foo</h2>
1055 <h1>Foo</h1>
1056 ````````````````````````````````
1057
1058
1059 An indent of four spaces is too much:
1060
1061 ```````````````````````````````` example
1062     Foo
1063     ---
1064
1065     Foo
1066 ---
1067 .
1068 <pre><code>Foo
1069 ---
1070
1071 Foo
1072 </code></pre>
1073 <hr />
1074 ````````````````````````````````
1075
1076
1077 The setext heading underline can be indented up to three spaces, and
1078 may have trailing spaces:
1079
1080 ```````````````````````````````` example
1081 Foo
1082    ----      
1083 .
1084 <h2>Foo</h2>
1085 ````````````````````````````````
1086
1087
1088 Four spaces is too much:
1089
1090 ```````````````````````````````` example
1091 Foo
1092     ---
1093 .
1094 <p>Foo
1095 ---</p>
1096 ````````````````````````````````
1097
1098
1099 The setext heading underline cannot contain internal spaces:
1100
1101 ```````````````````````````````` example
1102 Foo
1103 = =
1104
1105 Foo
1106 --- -
1107 .
1108 <p>Foo
1109 = =</p>
1110 <p>Foo</p>
1111 <hr />
1112 ````````````````````````````````
1113
1114
1115 Trailing spaces in the content line do not cause a line break:
1116
1117 ```````````````````````````````` example
1118 Foo  
1119 -----
1120 .
1121 <h2>Foo</h2>
1122 ````````````````````````````````
1123
1124
1125 Nor does a backslash at the end:
1126
1127 ```````````````````````````````` example
1128 Foo\
1129 ----
1130 .
1131 <h2>Foo\</h2>
1132 ````````````````````````````````
1133
1134
1135 Since indicators of block structure take precedence over
1136 indicators of inline structure, the following are setext headings:
1137
1138 ```````````````````````````````` example
1139 `Foo
1140 ----
1141 `
1142
1143 <a title="a lot
1144 ---
1145 of dashes"/>
1146 .
1147 <h2>`Foo</h2>
1148 <p>`</p>
1149 <h2>&lt;a title=&quot;a lot</h2>
1150 <p>of dashes&quot;/&gt;</p>
1151 ````````````````````````````````
1152
1153
1154 The setext heading underline cannot be a [lazy continuation
1155 line] in a list item or block quote:
1156
1157 ```````````````````````````````` example
1158 > Foo
1159 ---
1160 .
1161 <blockquote>
1162 <p>Foo</p>
1163 </blockquote>
1164 <hr />
1165 ````````````````````````````````
1166
1167
1168 ```````````````````````````````` example
1169 > foo
1170 bar
1171 ===
1172 .
1173 <blockquote>
1174 <p>foo
1175 bar
1176 ===</p>
1177 </blockquote>
1178 ````````````````````````````````
1179
1180
1181 ```````````````````````````````` example
1182 - Foo
1183 ---
1184 .
1185 <ul>
1186 <li>Foo</li>
1187 </ul>
1188 <hr />
1189 ````````````````````````````````
1190
1191
1192 A blank line is needed between a paragraph and a following
1193 setext heading, since otherwise the paragraph becomes part
1194 of the heading's content:
1195
1196 ```````````````````````````````` example
1197 Foo
1198 Bar
1199 ---
1200 .
1201 <h2>Foo
1202 Bar</h2>
1203 ````````````````````````````````
1204
1205
1206 But in general a blank line is not required before or after
1207 setext headings:
1208
1209 ```````````````````````````````` example
1210 ---
1211 Foo
1212 ---
1213 Bar
1214 ---
1215 Baz
1216 .
1217 <hr />
1218 <h2>Foo</h2>
1219 <h2>Bar</h2>
1220 <p>Baz</p>
1221 ````````````````````````````````
1222
1223
1224 Setext headings cannot be empty:
1225
1226 ```````````````````````````````` example
1227
1228 ====
1229 .
1230 <p>====</p>
1231 ````````````````````````````````
1232
1233
1234 Setext heading text lines must not be interpretable as block
1235 constructs other than paragraphs.  So, the line of dashes
1236 in these examples gets interpreted as a thematic break:
1237
1238 ```````````````````````````````` example
1239 ---
1240 ---
1241 .
1242 <hr />
1243 <hr />
1244 ````````````````````````````````
1245
1246
1247 ```````````````````````````````` example
1248 - foo
1249 -----
1250 .
1251 <ul>
1252 <li>foo</li>
1253 </ul>
1254 <hr />
1255 ````````````````````````````````
1256
1257
1258 ```````````````````````````````` example
1259     foo
1260 ---
1261 .
1262 <pre><code>foo
1263 </code></pre>
1264 <hr />
1265 ````````````````````````````````
1266
1267
1268 ```````````````````````````````` example
1269 > foo
1270 -----
1271 .
1272 <blockquote>
1273 <p>foo</p>
1274 </blockquote>
1275 <hr />
1276 ````````````````````````````````
1277
1278
1279 If you want a heading with `> foo` as its literal text, you can
1280 use backslash escapes:
1281
1282 ```````````````````````````````` example
1283 \> foo
1284 ------
1285 .
1286 <h2>&gt; foo</h2>
1287 ````````````````````````````````
1288
1289
1290 **Compatibility note:**  Most existing Markdown implementations
1291 do not allow the text of setext headings to span multiple lines.
1292 But there is no consensus about how to interpret
1293
1294 ``` markdown
1295 Foo
1296 bar
1297 ---
1298 baz
1299 ```
1300
1301 One can find four different interpretations:
1302
1303 1. paragraph "Foo", heading "bar", paragraph "baz"
1304 2. paragraph "Foo bar", thematic break, paragraph "baz"
1305 3. paragraph "Foo bar --- baz"
1306 4. heading "Foo bar", paragraph "baz"
1307
1308 We find interpretation 4 most natural, and it also
1309 increases the expressive power of CommonMark by allowing
1310 multiline headings.  Authors who want interpretation 1 can
1311 put a blank line after the first paragraph:
1312
1313 ```````````````````````````````` example
1314 Foo
1315
1316 bar
1317 ---
1318 baz
1319 .
1320 <p>Foo</p>
1321 <h2>bar</h2>
1322 <p>baz</p>
1323 ````````````````````````````````
1324
1325
1326 Authors who want interpretation 2 can put blank lines around
1327 the thematic break,
1328
1329 ```````````````````````````````` example
1330 Foo
1331 bar
1332
1333 ---
1334
1335 baz
1336 .
1337 <p>Foo
1338 bar</p>
1339 <hr />
1340 <p>baz</p>
1341 ````````````````````````````````
1342
1343
1344 or use a thematic break that cannot count as a [setext heading
1345 underline], such as
1346
1347 ```````````````````````````````` example
1348 Foo
1349 bar
1350 * * *
1351 baz
1352 .
1353 <p>Foo
1354 bar</p>
1355 <hr />
1356 <p>baz</p>
1357 ````````````````````````````````
1358
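The reason `* * *` cannot be read as a heading underline while `---` can
follows from the shape of a [setext heading underline].  The check can be
sketched as below, assuming the definition given earlier in this spec (at
most three spaces of indentation, a run of only `=` or only `-`, optional
trailing spaces).  The name `setext_level` is hypothetical, and the sketch
ignores tabs and container blocks.

```python
import re
from typing import Optional

# Assumed rule: 0-3 spaces of indentation, then a run made only of '='
# or only of '-', then any number of trailing spaces.
SETEXT_UNDERLINE = re.compile(r"^ {0,3}(=+|-+) *$")


def setext_level(line: str) -> Optional[int]:
    """Return 1 for an '=' underline, 2 for a '-' underline, else None."""
    match = SETEXT_UNDERLINE.match(line)
    if match is None:
        return None
    return 1 if match.group(1)[0] == "=" else 2
```

Under these assumptions `---` qualifies as a level 2 underline, whereas
`* * *` and the escaped `\---` do not, so neither can turn the preceding
paragraph into a heading.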
1359
1360 Authors who want interpretation 3 can use backslash escapes:
1361
1362 ```````````````````````````````` example
1363 Foo
1364 bar
1365 \---
1366 baz
1367 .
1368 <p>Foo
1369 bar
1370 ---
1371 baz</p>
1372 ````````````````````````````````
1373
1374
1375 ## Indented code blocks
1376
1377 An [indented code block](@) is composed of one or more
1378 [indented chunks] separated by blank lines.
1379 An [indented chunk](@) is a sequence of non-blank lines,
1380 each indented four or more spaces. The contents of the code block are
1381 the literal contents of the lines, including trailing
1382 [line endings], minus four spaces of indentation.
1383 An indented code block has no [info string].
1384
1385 An indented code block cannot interrupt a paragraph, so there must be
1386 a blank line between a paragraph and a following indented code block.
1387 (A blank line is not needed, however, between a code block and a following
1388 paragraph.)
1389
1390 ```````````````````````````````` example
1391     a simple
1392       indented code block
1393 .
1394 <pre><code>a simple
1395   indented code block
1396 </code></pre>
1397 ````````````````````````````````
1398
1399
1400 If there is any ambiguity between an interpretation of indentation
1401 as a code block and as indicating that material belongs to a [list
1402 item][list items], the list item interpretation takes precedence:
1403
1404 ```````````````````````````````` example
1405   - foo
1406
1407     bar
1408 .
1409 <ul>
1410 <li>
1411 <p>foo</p>
1412 <p>bar</p>
1413 </li>
1414 </ul>
1415 ````````````````````````````````
1416
1417
1418 ```````````````````````````````` example
1419 1.  foo
1420
1421     - bar
1422 .
1423 <ol>
1424 <li>
1425 <p>foo</p>
1426 <ul>
1427 <li>bar</li>
1428 </ul>
1429 </li>
1430 </ol>
1431 ````````````````````````````````
1432
1433
1434
1435 The contents of a code block are literal text, and do not get parsed
1436 as Markdown:
1437
1438 ```````````````````````````````` example
1439     <a/>
1440     *hi*
1441
1442     - one
1443 .
1444 <pre><code>&lt;a/&gt;
1445 *hi*
1446
1447 - one
1448 </code></pre>
1449 ````````````````````````````````
1450
1451
1452 Here we have three chunks separated by blank lines:
1453
1454 ```````````````````````````````` example
1455     chunk1
1456
1457     chunk2
1458   
1459  
1460  
1461     chunk3
1462 .
1463 <pre><code>chunk1
1464
1465 chunk2
1466
1467
1468
1469 chunk3
1470 </code></pre>
1471 ````````````````````````````````
1472
1473
1474 Any initial spaces beyond four will be included in the content, even
1475 in interior blank lines:
1476
1477 ```````````````````````````````` example
1478     chunk1
1479       
1480       chunk2
1481 .
1482 <pre><code>chunk1
1483   
1484   chunk2
1485 </code></pre>
1486 ````````````````````````````````
1487
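The removal of indentation from an indented code block can be sketched as
follows.  This is a minimal illustration rather than a reference
implementation: the helper name `strip_code_indent` is hypothetical, the
input is assumed to be the lines already identified as belonging to one
code block (without line endings), and tabs are not handled.

```python
def strip_code_indent(lines):
    """Remove up to four leading spaces from each line of a code block."""
    stripped = []
    for line in lines:
        removed = 0
        # Count leading spaces, but never strip more than four of them.
        while removed < 4 and removed < len(line) and line[removed] == " ":
            removed += 1
        stripped.append(line[removed:])
    return stripped
```

Because at most four spaces are removed, an interior line consisting of
six spaces keeps its last two, which matches the example above.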
1488
1489 An indented code block cannot interrupt a paragraph.  (This
1490 allows hanging indents and the like.)
1491
1492 ```````````````````````````````` example
1493 Foo
1494     bar
1495
1496 .
1497 <p>Foo
1498 bar</p>
1499 ````````````````````````````````
1500
1501
1502 However, any non-blank line with fewer than four leading spaces ends
1503 the code block immediately.  So a paragraph may occur immediately
1504 after indented code:
1505
1506 ```````````````````````````````` example
1507     foo
1508 bar
1509 .
1510 <pre><code>foo
1511 </code></pre>
1512 <p>bar</p>
1513 ````````````````````````````````
1514
1515
1516 And indented code can occur immediately before and after other kinds of
1517 blocks:
1518
1519 ```````````````````````````````` example
1520 # Heading
1521     foo
1522 Heading
1523 ------
1524     foo
1525 ----
1526 .
1527 <h1>Heading</h1>
1528 <pre><code>foo
1529 </code></pre>
1530 <h2>Heading</h2>
1531 <pre><code>foo
1532 </code></pre>
1533 <hr />
1534 ````````````````````````````````
1535
1536
1537 The first line can be indented more than four spaces:
1538
1539 ```````````````````````````````` example
1540         foo
1541     bar
1542 .
1543 <pre><code>    foo
1544 bar
1545 </code></pre>
1546 ````````````````````````````````
1547
1548
1549 Blank lines preceding or following an indented code block
1550 are not included in it:
1551
1552 ```````````````````````````````` example
1553
1554     
1555     foo
1556     
1557
1558 .
1559 <pre><code>foo
1560 </code></pre>
1561 ````````````````````````````````
1562
1563
1564 Trailing spaces are included in the code block's content:
1565
1566 ```````````````````````````````` example
1567     foo  
1568 .
1569 <pre><code>foo  
1570 </code></pre>
1571 ````````````````````````````````
1572
1573
1574
1575 ## Fenced code blocks
1576
1577 A [code fence](@) is a sequence
1578 of at least three consecutive backtick characters (`` ` ``) or
1579 tildes (`~`).  (Tildes and backticks cannot be mixed.)
1580 A [fenced code block](@)
1581 begins with a code fence, indented no more than three spaces.
1582
1583 The line with the opening code fence may optionally contain some text
1584 following the code fence; this is trimmed of leading and trailing
1585 spaces and called the [info string](@).
1586 The [info string] may not contain any backtick
1587 characters.  (The reason for this restriction is that otherwise
1588 some inline code would be incorrectly interpreted as the
1589 beginning of a fenced code block.)
1590
1591 The content of the code block consists of all subsequent lines, until
1592 a closing [code fence] of the same type as the code block
1593 began with (backticks or tildes), and with at least as many backticks
1594 or tildes as the opening code fence.  If the leading code fence is
1595 indented N spaces, then up to N spaces of indentation are removed from
1596 each line of the content (if present).  (If a content line is not
1597 indented, it is preserved unchanged.  If it is indented less than N
1598 spaces, all of the indentation is removed.)
1599
1600 The closing code fence may be indented up to three spaces, and may be
1601 followed only by spaces, which are ignored.  If the end of the
1602 containing block (or document) is reached and no closing code fence
1603 has been found, the code block contains all of the lines after the
1604 opening code fence until the end of the containing block (or
1605 document).  (An alternative spec would require backtracking in the
1606 event that a closing code fence is not found.  But this makes parsing
1607 much less efficient, and there seems to be no real down side to the
1608 behavior described here.)
1609
1610 A fenced code block may interrupt a paragraph, and does not require
1611 a blank line either before or after.
1612
1613 The content of a fenced code block is treated as literal text, not parsed
1614 as inlines.  The first word of the [info string] is typically used to
1615 specify the language of the code sample, and rendered in the `class`
1616 attribute of the `code` tag.  However, this spec does not mandate any
1617 particular treatment of the [info string].
1618
1619 Here is a simple example with backticks:
1620
1621 ```````````````````````````````` example
1622 ```
1623 <
1624  >
1625 ```
1626 .
1627 <pre><code>&lt;
1628  &gt;
1629 </code></pre>
1630 ````````````````````````````````
1631
1632
1633 With tildes:
1634
1635 ```````````````````````````````` example
1636 ~~~
1637 <
1638  >
1639 ~~~
1640 .
1641 <pre><code>&lt;
1642  &gt;
1643 </code></pre>
1644 ````````````````````````````````
1645
1646
1647 The closing code fence must use the same character as the opening
1648 fence:
1649
1650 ```````````````````````````````` example
1651 ```
1652 aaa
1653 ~~~
1654 ```
1655 .
1656 <pre><code>aaa
1657 ~~~
1658 </code></pre>
1659 ````````````````````````````````
1660
1661
1662 ```````````````````````````````` example
1663 ~~~
1664 aaa
1665 ```
1666 ~~~
1667 .
1668 <pre><code>aaa
1669 ```
1670 </code></pre>
1671 ````````````````````````````````
1672
1673
1674 The closing code fence must be at least as long as the opening fence:
1675
1676 ```````````````````````````````` example
1677 ````
1678 aaa
1679 ```
1680 ``````
1681 .
1682 <pre><code>aaa
1683 ```
1684 </code></pre>
1685 ````````````````````````````````
1686
1687
1688 ```````````````````````````````` example
1689 ~~~~
1690 aaa
1691 ~~~
1692 ~~~~
1693 .
1694 <pre><code>aaa
1695 ~~~
1696 </code></pre>
1697 ````````````````````````````````
1698
1699
1700 Unclosed code blocks are closed by the end of the document
1701 (or the enclosing [block quote][block quotes] or [list item][list items]):
1702
1703 ```````````````````````````````` example
1704 ```
1705 .
1706 <pre><code></code></pre>
1707 ````````````````````````````````
1708
1709
1710 ```````````````````````````````` example
1711 `````
1712
1713 ```
1714 aaa
1715 .
1716 <pre><code>
1717 ```
1718 aaa
1719 </code></pre>
1720 ````````````````````````````````
1721
1722
1723 ```````````````````````````````` example
1724 > ```
1725 > aaa
1726
1727 bbb
1728 .
1729 <blockquote>
1730 <pre><code>aaa
1731 </code></pre>
1732 </blockquote>
1733 <p>bbb</p>
1734 ````````````````````````````````
1735
1736
1737 A code block can have all empty lines as its content:
1738
1739 ```````````````````````````````` example
1740 ```
1741
1742   
1743 ```
1744 .
1745 <pre><code>
1746   
1747 </code></pre>
1748 ````````````````````````````````
1749
1750
1751 A code block can be empty:
1752
1753 ```````````````````````````````` example
1754 ```
1755 ```
1756 .
1757 <pre><code></code></pre>
1758 ````````````````````````````````
1759
1760
1761 Fences can be indented.  If the opening fence is indented,
1762 content lines will have equivalent opening indentation removed,
1763 if present:
1764
1765 ```````````````````````````````` example
1766  ```
1767  aaa
1768 aaa
1769 ```
1770 .
1771 <pre><code>aaa
1772 aaa
1773 </code></pre>
1774 ````````````````````````````````
1775
1776
1777 ```````````````````````````````` example
1778   ```
1779 aaa
1780   aaa
1781 aaa
1782   ```
1783 .
1784 <pre><code>aaa
1785 aaa
1786 aaa
1787 </code></pre>
1788 ````````````````````````````````
1789
1790
1791 ```````````````````````````````` example
1792    ```
1793    aaa
1794     aaa
1795   aaa
1796    ```
1797 .
1798 <pre><code>aaa
1799  aaa
1800 aaa
1801 </code></pre>
1802 ````````````````````````````````
1803
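The indentation rule for content lines can be sketched as below.  The
function name `strip_fence_indent` and its parameters are hypothetical;
the sketch assumes the content lines have already been collected and that
`fence_indent` is the number of spaces (0 to 3) before the opening fence.

```python
def strip_fence_indent(content_lines, fence_indent):
    """Remove up to `fence_indent` leading spaces from each content line."""
    result = []
    for line in content_lines:
        # Number of leading spaces actually present on this line.
        leading = len(line) - len(line.lstrip(" "))
        # Strip whichever is smaller: the line's own indentation or the
        # indentation of the opening fence.
        result.append(line[min(leading, fence_indent):])
    return result
```

A line indented less than the fence loses all of its indentation, and a
line indented more keeps the excess.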
1804
1805 Four spaces of indentation produce an indented code block:
1806
1807 ```````````````````````````````` example
1808     ```
1809     aaa
1810     ```
1811 .
1812 <pre><code>```
1813 aaa
1814 ```
1815 </code></pre>
1816 ````````````````````````````````
1817
1818
1819 Closing fences may be indented by 0-3 spaces, and their indentation
1820 need not match that of the opening fence:
1821
1822 ```````````````````````````````` example
1823 ```
1824 aaa
1825   ```
1826 .
1827 <pre><code>aaa
1828 </code></pre>
1829 ````````````````````````````````
1830
1831
1832 ```````````````````````````````` example
1833    ```
1834 aaa
1835   ```
1836 .
1837 <pre><code>aaa
1838 </code></pre>
1839 ````````````````````````````````
1840
1841
1842 This is not a closing fence, because it is indented 4 spaces:
1843
1844 ```````````````````````````````` example
1845 ```
1846 aaa
1847     ```
1848 .
1849 <pre><code>aaa
1850     ```
1851 </code></pre>
1852 ````````````````````````````````
1853
1854
1855
1856 Code fences (opening and closing) cannot contain internal spaces:
1857
1858 ```````````````````````````````` example
1859 ``` ```
1860 aaa
1861 .
1862 <p><code></code>
1863 aaa</p>
1864 ````````````````````````````````
1865
1866
1867 ```````````````````````````````` example
1868 ~~~~~~
1869 aaa
1870 ~~~ ~~
1871 .
1872 <pre><code>aaa
1873 ~~~ ~~
1874 </code></pre>
1875 ````````````````````````````````
1876
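The rules for recognizing a closing code fence (same character, at least
as long as the opening fence, indented at most three spaces, followed
only by spaces) can be combined into one check.  The sketch below is an
illustration under those stated rules; `is_closing_fence` and its
parameters are hypothetical names, and the caller is assumed to have
recorded the opening fence's character and length.

```python
import re

# A candidate closing fence: 0-3 spaces, an unbroken run of backticks or
# an unbroken run of tildes, then nothing but spaces.
FENCE_LINE = re.compile(r"^ {0,3}(`+|~+) *$")


def is_closing_fence(line, fence_char, fence_len):
    """Return True if `line` closes a block opened by `fence_char` * `fence_len`."""
    match = FENCE_LINE.match(line)
    if match is None:
        return False
    run = match.group(1)
    return run[0] == fence_char and len(run) >= fence_len
```

A line with internal spaces such as `~~~ ~~` fails the pattern and is
therefore kept as ordinary content of the code block.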
1877
1878 Fenced code blocks can interrupt paragraphs, and can be followed
1879 directly by paragraphs, without a blank line between:
1880
1881 ```````````````````````````````` example
1882 foo
1883 ```
1884 bar
1885 ```
1886 baz
1887 .
1888 <p>foo</p>
1889 <pre><code>bar
1890 </code></pre>
1891 <p>baz</p>
1892 ````````````````````````````````
1893
1894
1895 Other blocks can also occur before and after fenced code blocks
1896 without an intervening blank line:
1897
1898 ```````````````````````````````` example
1899 foo
1900 ---
1901 ~~~
1902 bar
1903 ~~~
1904 # baz
1905 .
1906 <h2>foo</h2>
1907 <pre><code>bar
1908 </code></pre>
1909 <h1>baz</h1>
1910 ````````````````````````````````
1911
1912
1913 An [info string] can be provided after the opening code fence.
1914 Opening and closing spaces will be stripped, and the first word, prefixed
1915 with `language-`, is used as the value for the `class` attribute of the
1916 `code` element within the enclosing `pre` element.
1917
1918 ```````````````````````````````` example
1919 ```ruby
1920 def foo(x)
1921   return 3
1922 end
1923 ```
1924 .
1925 <pre><code class="language-ruby">def foo(x)
1926   return 3
1927 end
1928 </code></pre>
1929 ````````````````````````````````
1930
1931
1932 ```````````````````````````````` example
1933 ~~~~    ruby startline=3 $%@#$
1934 def foo(x)
1935   return 3
1936 end
1937 ~~~~~~~
1938 .
1939 <pre><code class="language-ruby">def foo(x)
1940   return 3
1941 end
1942 </code></pre>
1943 ````````````````````````````````
1944
1945
1946 ```````````````````````````````` example
1947 ````;
1948 ````
1949 .
1950 <pre><code class="language-;"></code></pre>
1951 ````````````````````````````````
1952
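The mapping from an opening fence line to the `class` attribute shown in
these examples can be sketched as follows.  Since this spec does not
mandate any particular treatment of the [info string], the sketch only
mirrors the examples above.  The helper name `code_class` is
hypothetical, the argument is assumed to be a line already recognized as
an opening fence, and HTML escaping of the result is left out.

```python
def code_class(opening_fence_line):
    """Return the class value for a fenced code block, or None if no info string."""
    stripped = opening_fence_line.lstrip(" ")
    fence_char = stripped[0]  # '`' or '~', assuming a valid opening fence
    # Drop the fence itself, then trim the surrounding spaces.
    info = stripped.lstrip(fence_char).strip(" ")
    if not info:
        return None
    # Only the first word of the info string is used.
    return "language-" + info.split()[0]
```

Everything after the first word, such as `startline=3`, is ignored by
this sketch, as it is in the expected output above.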
1953
1954 [Info strings] for backtick code blocks cannot contain backticks:
1955
1956 ```````````````````````````````` example
1957 ``` aa ```
1958 foo
1959 .
1960 <p><code>aa</code>
1961 foo</p>
1962 ````````````````````````````````
1963
1964
1965 Closing code fences cannot have [info strings]:
1966
1967 ```````````````````````````````` example
1968 ```
1969 ``` aaa
1970 ```
1971 .
1972 <pre><code>``` aaa
1973 </code></pre>
1974 ````````````````````````````````
1975
1976
1977
1978 ## HTML blocks
1979
1980 An [HTML block](@) is a group of lines that is treated
1981 as raw HTML (and will not be escaped in HTML output).
1982
1983 There are seven kinds of [HTML block], which can be defined
1984 by their start and end conditions.  The block begins with a line that
1985 meets a [start condition](@) (after up to three spaces of
1986 optional indentation).  It ends with the first subsequent line that
1987 meets a matching [end condition](@), or the last line of
1988 the document (or other [container block]), if no line is encountered that meets the
1989 [end condition].  If the first line meets both the [start condition]
1990 and the [end condition], the block will contain just that line.
1991
1992 1.  **Start condition:**  line begins with the string `<script`,
1993 `<pre`, or `<style` (case-insensitive), followed by whitespace,
1994 the string `>`, or the end of the line.\
1995 **End condition:**  line contains an end tag
1996 `</script>`, `</pre>`, or `</style>` (case-insensitive; it
1997 need not match the start tag).
1998
1999 2.  **Start condition:** line begins with the string `<!--`.\
2000 **End condition:**  line contains the string `-->`.
2001
2002 3.  **Start condition:** line begins with the string `<?`.\
2003 **End condition:** line contains the string `?>`.
2004
2005 4.  **Start condition:** line begins with the string `<!`
2006 followed by an uppercase ASCII letter.\
2007 **End condition:** line contains the character `>`.
2008
2009 5.  **Start condition:**  line begins with the string
2010 `<![CDATA[`.\
2011 **End condition:** line contains the string `]]>`.
2012
2013 6.  **Start condition:** line begins with the string `<` or `</`
2014 followed by one of the strings (case-insensitive) `address`,
2015 `article`, `aside`, `base`, `basefont`, `blockquote`, `body`,
2016 `caption`, `center`, `col`, `colgroup`, `dd`, `details`, `dialog`,
2017 `dir`, `div`, `dl`, `dt`, `fieldset`, `figcaption`, `figure`,
2018 `footer`, `form`, `frame`, `frameset`,
2019 `h1`, `h2`, `h3`, `h4`, `h5`, `h6`, `head`, `header`, `hr`,
2020 `html`, `iframe`, `legend`, `li`, `link`, `main`, `menu`, `menuitem`,
2021 `meta`, `nav`, `noframes`, `ol`, `optgroup`, `option`, `p`, `param`,
2022 `section`, `source`, `summary`, `table`, `tbody`, `td`,
2023 `tfoot`, `th`, `thead`, `title`, `tr`, `track`, `ul`, followed
2024 by [whitespace], the end of the line, the string `>`, or
2025 the string `/>`.\
2026 **End condition:** line is followed by a [blank line].
2027
2028 7.  **Start condition:**  line begins with a complete [open tag]
2029 or [closing tag] (with any [tag name] other than `script`,
2030 `style`, or `pre`) followed only by [whitespace]
2031 or the end of the line.\
2032 **End condition:** line is followed by a [blank line].
2033
2034 All types of [HTML blocks] except type 7 may interrupt
2035 a paragraph.  Blocks of type 7 may not interrupt a paragraph.
2036 (This restriction is intended to prevent unwanted interpretation
2037 of long tags inside a wrapped paragraph as starting HTML blocks.)
2038
2039 Some simple examples follow.  Here are some basic HTML blocks
2040 of type 6:
2041
2042 ```````````````````````````````` example
2043 <table>
2044   <tr>
2045     <td>
2046            hi
2047     </td>
2048   </tr>
2049 </table>
2050
2051 okay.
2052 .
2053 <table>
2054   <tr>
2055     <td>
2056            hi
2057     </td>
2058   </tr>
2059 </table>
2060 <p>okay.</p>
2061 ````````````````````````````````
2062
2063
2064 ```````````````````````````````` example
2065  <div>
2066   *hello*
2067          <foo><a>
2068 .
2069  <div>
2070   *hello*
2071          <foo><a>
2072 ````````````````````````````````
2073
2074
2075 A block can also start with a closing tag:
2076
2077 ```````````````````````````````` example
2078 </div>
2079 *foo*
2080 .
2081 </div>
2082 *foo*
2083 ````````````````````````````````
2084
2085
2086 Here we have two HTML blocks with a Markdown paragraph between them:
2087
2088 ```````````````````````````````` example
2089 <DIV CLASS="foo">
2090
2091 *Markdown*
2092
2093 </DIV>
2094 .
2095 <DIV CLASS="foo">
2096 <p><em>Markdown</em></p>
2097 </DIV>
2098 ````````````````````````````````
2099
2100
2101 The tag on the first line can be partial, as long
2102 as it is split where there would be whitespace:
2103
2104 ```````````````````````````````` example
2105 <div id="foo"
2106   class="bar">
2107 </div>
2108 .
2109 <div id="foo"
2110   class="bar">
2111 </div>
2112 ````````````````````````````````
2113
2114
2115 ```````````````````````````````` example
2116 <div id="foo" class="bar
2117   baz">
2118 </div>
2119 .
2120 <div id="foo" class="bar
2121   baz">
2122 </div>
2123 ````````````````````````````````
2124
2125
2126 An open tag need not be closed:
2127 ```````````````````````````````` example
2128 <div>
2129 *foo*
2130
2131 *bar*
2132 .
2133 <div>
2134 *foo*
2135 <p><em>bar</em></p>
2136 ````````````````````````````````
2137
2138
2139
2140 A partial tag need not even be completed (garbage
2141 in, garbage out):
2142
2143 ```````````````````````````````` example
2144 <div id="foo"
2145 *hi*
2146 .
2147 <div id="foo"
2148 *hi*
2149 ````````````````````````````````
2150
2151
2152 ```````````````````````````````` example
2153 <div class
2154 foo
2155 .
2156 <div class
2157 foo
2158 ````````````````````````````````
2159
2160
2161 The initial tag doesn't even need to be a valid
2162 tag, as long as it starts like one:
2163
2164 ```````````````````````````````` example
2165 <div *???-&&&-<---
2166 *foo*
2167 .
2168 <div *???-&&&-<---
2169 *foo*
2170 ````````````````````````````````
2171
2172
2173 In type 6 blocks, the initial tag need not be on a line by
2174 itself:
2175
2176 ```````````````````````````````` example
2177 <div><a href="bar">*foo*</a></div>
2178 .
2179 <div><a href="bar">*foo*</a></div>
2180 ````````````````````````````````
2181
2182
2183 ```````````````````````````````` example
2184 <table><tr><td>
2185 foo
2186 </td></tr></table>
2187 .
2188 <table><tr><td>
2189 foo
2190 </td></tr></table>
2191 ````````````````````````````````
2192
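The start condition for type 6 blocks explains why these partial and
crowded first lines still open an HTML block: only the beginning of the
line is inspected.  A rough sketch, assuming the tag list given in
condition 6 and treating only space and tab as whitespace (the names
`BLOCK_TAGS` and `starts_type6_block` are hypothetical):

```python
import re

# Tag names from start condition 6.
BLOCK_TAGS = (
    "address article aside base basefont blockquote body caption center "
    "col colgroup dd details dialog dir div dl dt fieldset figcaption "
    "figure footer form frame frameset h1 h2 h3 h4 h5 h6 head header hr "
    "html iframe legend li link main menu menuitem meta nav noframes ol "
    "optgroup option p param section source summary table tbody td tfoot "
    "th thead title tr track ul"
).split()

# 0-3 spaces, '<' or '</', a listed tag name, then whitespace, '>', '/>',
# or the end of the line.
TYPE6_START = re.compile(
    r"^ {0,3}</?(?:" + "|".join(BLOCK_TAGS) + r")(?:[ \t]|/?>|$)",
    re.IGNORECASE,
)


def starts_type6_block(line):
    """Return True if `line` meets the (assumed) type 6 start condition."""
    return TYPE6_START.match(line) is not None
```

Nothing after the tag name and its delimiter is examined, so an
incomplete or malformed tag still satisfies the condition.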
2193
2194 Everything until the next blank line or end of document
2195 gets included in the HTML block.  So, in the following
2196 example, what looks like a Markdown code block
2197 is actually part of the HTML block, which continues until a blank
2198 line or the end of the document is reached:
2199
2200 ```````````````````````````````` example
2201 <div></div>
2202 ``` c
2203 int x = 33;
2204 ```
2205 .
2206 <div></div>
2207 ``` c
2208 int x = 33;
2209 ```
2210 ````````````````````````````````
2211
2212
2213 To start an [HTML block] with a tag that is *not* in the
2214 list of block-level tags in (6), you must put the tag by
2215 itself on the first line (and it must be complete):
2216
2217 ```````````````````````````````` example
2218 <a href="foo">
2219 *bar*
2220 </a>
2221 .
2222 <a href="foo">
2223 *bar*
2224 </a>
2225 ````````````````````````````````
2226
2227
2228 In type 7 blocks, the [tag name] can be anything:
2229
2230 ```````````````````````````````` example
2231 <Warning>
2232 *bar*
2233 </Warning>
2234 .
2235 <Warning>
2236 *bar*
2237 </Warning>
2238 ````````````````````````````````
2239
2240
2241 ```````````````````````````````` example
2242 <i class="foo">
2243 *bar*
2244 </i>
2245 .
2246 <i class="foo">
2247 *bar*
2248 </i>
2249 ````````````````````````````````
2250
2251
2252 ```````````````````````````````` example
2253 </ins>
2254 *bar*
2255 .
2256 </ins>
2257 *bar*
2258 ````````````````````````````````
2259
2260
2261 These rules are designed to allow us to work with tags that
2262 can function as either block-level or inline-level tags.
2263 The `<del>` tag is a nice example.  We can surround content with
2264 `<del>` tags in three different ways.  In this case, we get a raw
2265 HTML block, because the `<del>` tag is on a line by itself:
2266
2267 ```````````````````````````````` example
2268 <del>
2269 *foo*
2270 </del>
2271 .
2272 <del>
2273 *foo*
2274 </del>
2275 ````````````````````````````````
2276
2277
2278 In this case, we get a raw HTML block that just includes
2279 the `<del>` tag (because it ends with the following blank
2280 line).  So the contents get interpreted as CommonMark:
2281
2282 ```````````````````````````````` example
2283 <del>
2284
2285 *foo*
2286
2287 </del>
2288 .
2289 <del>
2290 <p><em>foo</em></p>
2291 </del>
2292 ````````````````````````````````
2293
2294
2295 Finally, in this case, the `<del>` tags are interpreted
2296 as [raw HTML] *inside* the CommonMark paragraph.  (Because
2297 the tag is not on a line by itself, we get inline HTML
2298 rather than an [HTML block].)
2299
2300 ```````````````````````````````` example
2301 <del>*foo*</del>
2302 .
2303 <p><del><em>foo</em></del></p>
2304 ````````````````````````````````
2305
2306
2307 HTML tags designed to contain literal content
2308 (`script`, `style`, `pre`), comments, processing instructions,
2309 and declarations are treated somewhat differently.
2310 Instead of ending at the first blank line, these blocks
2311 end at the first line containing a corresponding end tag.
2312 As a result, these blocks can contain blank lines:
2313
2314 A pre tag (type 1):
2315
2316 ```````````````````````````````` example
2317 <pre language="haskell"><code>
2318 import Text.HTML.TagSoup
2319
2320 main :: IO ()
2321 main = print $ parseTags tags
2322 </code></pre>
2323 okay
2324 .
2325 <pre language="haskell"><code>
2326 import Text.HTML.TagSoup
2327
2328 main :: IO ()
2329 main = print $ parseTags tags
2330 </code></pre>
2331 <p>okay</p>
2332 ````````````````````````````````
2333
2334
2335 A script tag (type 1):
2336
2337 ```````````````````````````````` example
2338 <script type="text/javascript">
2339 // JavaScript example
2340
2341 document.getElementById("demo").innerHTML = "Hello JavaScript!";
2342 </script>
2343 okay
2344 .
2345 <script type="text/javascript">
2346 // JavaScript example
2347
2348 document.getElementById("demo").innerHTML = "Hello JavaScript!";
2349 </script>
2350 <p>okay</p>
2351 ````````````````````````````````
2352
2353
2354 A style tag (type 1):
2355
2356 ```````````````````````````````` example
2357 <style
2358   type="text/css">
2359 h1 {color:red;}
2360
2361 p {color:blue;}
2362 </style>
2363 okay
2364 .
2365 <style
2366   type="text/css">
2367 h1 {color:red;}
2368
2369 p {color:blue;}
2370 </style>
2371 <p>okay</p>
2372 ````````````````````````````````
2373
2374
2375 If there is no matching end tag, the block will end at the
2376 end of the document (or the enclosing [block quote][block quotes]
2377 or [list item][list items]):
2378
2379 ```````````````````````````````` example
2380 <style
2381   type="text/css">
2382
2383 foo
2384 .
2385 <style
2386   type="text/css">
2387
2388 foo
2389 ````````````````````````````````
2390
2391
2392 ```````````````````````````````` example
2393 > <div>
2394 > foo
2395
2396 bar
2397 .
2398 <blockquote>
2399 <div>
2400 foo
2401 </blockquote>
2402 <p>bar</p>
2403 ````````````````````````````````
2404
2405
2406 ```````````````````````````````` example
2407 - <div>
2408 - foo
2409 .
2410 <ul>
2411 <li>
2412 <div>
2413 </li>
2414 <li>foo</li>
2415 </ul>
2416 ````````````````````````````````
2417
2418
2419 The end tag can occur on the same line as the start tag:
2420
2421 ```````````````````````````````` example
2422 <style>p{color:red;}</style>
2423 *foo*
2424 .
2425 <style>p{color:red;}</style>
2426 <p><em>foo</em></p>
2427 ````````````````````````````````
2428
2429
2430 ```````````````````````````````` example
2431 <!-- foo -->*bar*
2432 *baz*
2433 .
2434 <!-- foo -->*bar*
2435 <p><em>baz</em></p>
2436 ````````````````````````````````
2437
2438
2439 Note that anything on the last line after the
2440 end tag will be included in the [HTML block]:
2441
2442 ```````````````````````````````` example
2443 <script>
2444 foo
2445 </script>1. *bar*
2446 .
2447 <script>
2448 foo
2449 </script>1. *bar*
2450 ````````````````````````````````
2451
2452
2453 A comment (type 2):
2454
2455 ```````````````````````````````` example
2456 <!-- Foo
2457
2458 bar
2459    baz -->
2460 okay
2461 .
2462 <!-- Foo
2463
2464 bar
2465    baz -->
2466 <p>okay</p>
2467 ````````````````````````````````
2468
2469
2470
2471 A processing instruction (type 3):
2472
2473 ```````````````````````````````` example
2474 <?php
2475
2476   echo '>';
2477
2478 ?>
2479 okay
2480 .
2481 <?php
2482
2483   echo '>';
2484
2485 ?>
2486 <p>okay</p>
2487 ````````````````````````````````
2488
2489
2490 A declaration (type 4):
2491
2492 ```````````````````````````````` example
2493 <!DOCTYPE html>
2494 .
2495 <!DOCTYPE html>
2496 ````````````````````````````````
2497
2498
2499 CDATA (type 5):
2500
2501 ```````````````````````````````` example
2502 <![CDATA[
2503 function matchwo(a,b)
2504 {
2505   if (a < b && a < 0) then {
2506     return 1;
2507
2508   } else {
2509
2510     return 0;
2511   }
2512 }
2513 ]]>
2514 okay
2515 .
2516 <![CDATA[
2517 function matchwo(a,b)
2518 {
2519   if (a < b && a < 0) then {
2520     return 1;
2521
2522   } else {
2523
2524     return 0;
2525   }
2526 }
2527 ]]>
2528 <p>okay</p>
2529 ````````````````````````````````
2530
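The start and end conditions for block types 1 through 5 can be sketched
as pairs of patterns.  This is an illustration only: the names
`HTML_BLOCK_RULES`, `html_block_type`, and `html_block_extent` are
hypothetical, the input is assumed to be the lines of a single container
without line endings, and types 6 and 7 are not covered.

```python
import re

# (start pattern, end pattern) for HTML block types 1 through 5.
HTML_BLOCK_RULES = {
    1: (re.compile(r"^ {0,3}<(?:script|pre|style)(?:\s|>|$)", re.I),
        re.compile(r"</(?:script|pre|style)>", re.I)),
    2: (re.compile(r"^ {0,3}<!--"), re.compile(r"-->")),
    3: (re.compile(r"^ {0,3}<\?"), re.compile(r"\?>")),
    4: (re.compile(r"^ {0,3}<![A-Z]"), re.compile(r">")),
    5: (re.compile(r"^ {0,3}<!\[CDATA\["), re.compile(r"\]\]>")),
}


def html_block_type(line):
    """Return the block type (1-5) whose start condition `line` meets, or None."""
    for block_type, (start, _end) in HTML_BLOCK_RULES.items():
        if start.match(line):
            return block_type
    return None


def html_block_extent(lines, start_index):
    """Return (type, index of the last line) of a block starting at start_index."""
    block_type = html_block_type(lines[start_index])
    if block_type is None:
        return None
    end = HTML_BLOCK_RULES[block_type][1]
    # The first line may itself meet the end condition.
    for i in range(start_index, len(lines)):
        if end.search(lines[i]):
            return block_type, i
    # No end condition met: the block runs to the end of the container.
    return block_type, len(lines) - 1
```

Because the end condition is a search for a closing string rather than a
blank line, blank lines inside these blocks do not terminate them.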
2531
2532 The opening tag can be indented 1-3 spaces, but not 4:
2533
2534 ```````````````````````````````` example
2535   <!-- foo -->
2536
2537     <!-- foo -->
2538 .
2539   <!-- foo -->
2540 <pre><code>&lt;!-- foo --&gt;
2541 </code></pre>
2542 ````````````````````````````````
2543
2544
2545 ```````````````````````````````` example
2546   <div>
2547
2548     <div>
2549 .
2550   <div>
2551 <pre><code>&lt;div&gt;
2552 </code></pre>
2553 ````````````````````````````````
2554
2555
2556 An HTML block of types 1--6 can interrupt a paragraph, and need not be
2557 preceded by a blank line.
2558
2559 ```````````````````````````````` example
2560 Foo
2561 <div>
2562 bar
2563 </div>
2564 .
2565 <p>Foo</p>
2566 <div>
2567 bar
2568 </div>
2569 ````````````````````````````````
2570
2571
2572 However, a following blank line is needed, except at the end of
2573 a document, and except for blocks of types 1--5, above:
2574
2575 ```````````````````````````````` example
2576 <div>
2577 bar
2578 </div>
2579 *foo*
2580 .
2581 <div>
2582 bar
2583 </div>
2584 *foo*
2585 ````````````````````````````````
2586
2587
2588 HTML blocks of type 7 cannot interrupt a paragraph:
2589
2590 ```````````````````````````````` example
2591 Foo
2592 <a href="bar">
2593 baz
2594 .
2595 <p>Foo
2596 <a href="bar">
2597 baz</p>
2598 ````````````````````````````````
2599
2600
2601 This rule differs from John Gruber's original Markdown syntax
2602 specification, which says:
2603
2604 > The only restrictions are that block-level HTML elements —
2605 > e.g. `<div>`, `<table>`, `<pre>`, `<p>`, etc. — must be separated from
2606 > surrounding content by blank lines, and the start and end tags of the
2607 > block should not be indented with tabs or spaces.
2608
2609 In some ways Gruber's rule is more restrictive than the one given
2610 here:
2611
2612 - It requires that an HTML block be preceded by a blank line.
2613 - It does not allow the start tag to be indented.
2614 - It requires a matching end tag, which it also does not allow to
2615   be indented.
2616
2617 Most Markdown implementations (including some of Gruber's own) do not
2618 respect all of these restrictions.
2619
2620 There is one respect, however, in which Gruber's rule is more liberal
2621 than the one given here, since it allows blank lines to occur inside
2622 an HTML block.  There are two reasons for disallowing them here.
2623 First, it removes the need to parse balanced tags, which is
2624 expensive and can require backtracking from the end of the document
2625 if no matching end tag is found. Second, it provides a very simple
2626 and flexible way of including Markdown content inside HTML tags:
2627 simply separate the Markdown from the HTML using blank lines.
2628
2629 Compare:
2630
2631 ```````````````````````````````` example
2632 <div>
2633
2634 *Emphasized* text.
2635
2636 </div>
2637 .
2638 <div>
2639 <p><em>Emphasized</em> text.</p>
2640 </div>
2641 ````````````````````````````````
2642
2643
2644 ```````````````````````````````` example
2645 <div>
2646 *Emphasized* text.
2647 </div>
2648 .
2649 <div>
2650 *Emphasized* text.
2651 </div>
2652 ````````````````````````````````
2653
2654
2655 Some Markdown implementations have adopted a convention of
2656 interpreting content inside tags as Markdown if the open tag has
2657 the attribute `markdown=1`.  The rule given above seems a simpler and
2658 more elegant way of achieving the same expressive power, and it is
2659 also much easier to parse.
2660
2661 The main potential drawback is that one can no longer paste HTML
2662 blocks into Markdown documents with 100% reliability.  However,
2663 *in most cases* this will work fine, because the blank lines in
2664 HTML are usually followed by HTML block tags.  For example:
2665
2666 ```````````````````````````````` example
2667 <table>
2668
2669 <tr>
2670
2671 <td>
2672 Hi
2673 </td>
2674
2675 </tr>
2676
2677 </table>
2678 .
2679 <table>
2680 <tr>
2681 <td>
2682 Hi
2683 </td>
2684 </tr>
2685 </table>
2686 ````````````````````````````````
2687
2688
2689 There are problems, however, if the inner tags are indented
2690 *and* separated by blank lines, as then they will be interpreted as
2691 an indented code block:
2692
2693 ```````````````````````````````` example
2694 <table>
2695
2696   <tr>
2697
2698     <td>
2699       Hi
2700     </td>
2701
2702   </tr>
2703
2704 </table>
2705 .
2706 <table>
2707   <tr>
2708 <pre><code>&lt;td&gt;
2709   Hi
2710 &lt;/td&gt;
2711 </code></pre>
2712   </tr>
2713 </table>
2714 ````````````````````````````````
2715
2716
2717 Fortunately, blank lines are usually not necessary and can be
2718 deleted.  The exception is inside `<pre>` tags, but as described
2719 above, raw HTML blocks starting with `<pre>` *can* contain blank
2720 lines.
2721
2722 ## Link reference definitions
2723
2724 A [link reference definition](@)
2725 consists of a [link label], indented up to three spaces, followed
2726 by a colon (`:`), optional [whitespace] (including up to one
2727 [line ending]), a [link destination],
2728 optional [whitespace] (including up to one
2729 [line ending]), and an optional [link
2730 title], which if it is present must be separated
2731 from the [link destination] by [whitespace].
2732 No further [non-whitespace characters] may occur on the line.
2733
2734 A [link reference definition]
2735 does not correspond to a structural element of a document.  Instead, it
2736 defines a label which can be used in [reference links]
2737 and reference-style [images] elsewhere in the document.  [Link
2738 reference definitions] can come either before or after the links that use
2739 them.
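
The grammar above can be approximated with a regular expression.  The
following is a minimal sketch, not a conforming parser: it assumes the
whole definition sits on one line, and it does not process backslash
escapes in the destination or title.  The names `LINK_REF_DEF` and
`parse_link_ref_def` are hypothetical.

```python
import re

# Simplified single-line pattern (assumption: no line endings inside the
# definition, no nested brackets in the label, no spaces in the destination).
LINK_REF_DEF = re.compile(
    r"^ {0,3}"                                        # up to three spaces of indent
    r"\[(?P<label>(?:\\.|[^\[\]\\])+)\]"              # link label, escapes allowed
    r":[ \t]*"                                        # colon, optional whitespace
    r"(?:<(?P<angle>[^<>\s]*)>|(?P<bare>[^\s<]\S*))"  # link destination
    r"""(?:[ \t]+(?P<title>"[^"]*"|'[^']*'|\([^()]*\)))?"""  # optional title
    r"[ \t]*$"                                        # nothing else on the line
)


def parse_link_ref_def(line):
    """Return (label, destination, title) or None if the line does not match."""
    m = LINK_REF_DEF.match(line)
    if m is None:
        return None
    angle = m.group("angle")
    destination = angle if angle is not None else m.group("bare")
    title = m.group("title")
    if title is not None:
        title = title[1:-1]  # drop the surrounding quotes or parentheses
    return m.group("label"), destination, title
```

Under these assumptions, `[foo]: /url "title"` yields the label `foo`, the
destination `/url`, and the title `title`, while a line indented four
spaces, or one with trailing text after the title, is rejected.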
2740
2741 ```````````````````````````````` example
2742 [foo]: /url "title"
2743
2744 [foo]
2745 .
2746 <p><a href="/url" title="title">foo</a></p>
2747 ````````````````````````````````
2748
2749
2750 ```````````````````````````````` example
2751    [foo]: 
2752       /url  
2753            'the title'  
2754
2755 [foo]
2756 .
2757 <p><a href="/url" title="the title">foo</a></p>
2758 ````````````````````````````````
2759
2760
2761 ```````````````````````````````` example
2762 [Foo*bar\]]:my_(url) 'title (with parens)'
2763
2764 [Foo*bar\]]
2765 .
2766 <p><a href="my_(url)" title="title (with parens)">Foo*bar]</a></p>
2767 ````````````````````````````````
2768
2769
2770 ```````````````````````````````` example
2771 [Foo bar]:
2772 <my%20url>
2773 'title'
2774
2775 [Foo bar]
2776 .
2777 <p><a href="my%20url" title="title">Foo bar</a></p>
2778 ````````````````````````````````
2779
2780
2781 The title may extend over multiple lines:
2782
2783 ```````````````````````````````` example
2784 [foo]: /url '
2785 title
2786 line1
2787 line2
2788 '
2789
2790 [foo]
2791 .
2792 <p><a href="/url" title="
2793 title
2794 line1
2795 line2
2796 ">foo</a></p>
2797 ````````````````````````````````
2798
2799
2800 However, it may not contain a [blank line]:
2801
2802 ```````````````````````````````` example
2803 [foo]: /url 'title
2804
2805 with blank line'
2806
2807 [foo]
2808 .
2809 <p>[foo]: /url 'title</p>
2810 <p>with blank line'</p>
2811 <p>[foo]</p>
2812 ````````````````````````````````
2813
2814
2815 The title may be omitted:
2816
2817 ```````````````````````````````` example
2818 [foo]:
2819 /url
2820
2821 [foo]
2822 .
2823 <p><a href="/url">foo</a></p>
2824 ````````````````````````````````
2825
2826
2827 The link destination may not be omitted:
2828
2829 ```````````````````````````````` example
2830 [foo]:
2831
2832 [foo]
2833 .
2834 <p>[foo]:</p>
2835 <p>[foo]</p>
2836 ````````````````````````````````
2837
2838
2839 Both title and destination can contain backslash escapes
2840 and literal backslashes:
2841
2842 ```````````````````````````````` example
2843 [foo]: /url\bar\*baz "foo\"bar\baz"
2844
2845 [foo]
2846 .
2847 <p><a href="/url%5Cbar*baz" title="foo&quot;bar\baz">foo</a></p>
2848 ````````````````````````````````
2849
2850
2851 A link can come before its corresponding definition:
2852
2853 ```````````````````````````````` example
2854 [foo]
2855
2856 [foo]: url
2857 .
2858 <p><a href="url">foo</a></p>
2859 ````````````````````````````````
2860
2861
2862 If there are several matching definitions, the first one takes
2863 precedence:
2864
2865 ```````````````````````````````` example
2866 [foo]
2867
2868 [foo]: first
2869 [foo]: second
2870 .
2871 <p><a href="first">foo</a></p>
2872 ````````````````````````````````
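
One way to get the first-definition-wins behavior is to refuse to
overwrite an existing entry when collecting definitions.  This is a
sketch only; the function name and the tuple layout are hypothetical, and
labels are assumed to be normalized already.

```python
def collect_definitions(definitions):
    """Build a label -> (destination, title) map in document order."""
    refmap = {}
    for label, destination, title in definitions:
        # setdefault leaves an existing entry untouched, so a later
        # definition with the same label is ignored.
        refmap.setdefault(label, (destination, title))
    return refmap
```

Because the map is built before links are resolved, a link may also
refer to a definition that appears later in the document.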
2873
2874
2875 As noted in the section on [Links], matching of labels is
2876 case-insensitive (see [matches]).
2877
2878 ```````````````````````````````` example
2879 [FOO]: /url
2880
2881 [Foo]
2882 .
2883 <p><a href="/url">Foo</a></p>
2884 ````````````````````````````````
2885
2886
2887 ```````````````````````````````` example
2888 [ΑΓΩ]: /φου
2889
2890 [αγω]
2891 .
2892 <p><a href="/%CF%86%CE%BF%CF%85">αγω</a></p>
2893 ````````````````````````````````
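
Label matching of this kind can be sketched as a normalization step
applied to both the definition and the reference.  The helper name
`normalize_label` is hypothetical, and stripping leading and trailing
whitespace is an assumption of this sketch.

```python
def normalize_label(label):
    """Case-fold a label and collapse runs of whitespace to one space."""
    # str.split() with no arguments splits on any Unicode whitespace and
    # drops leading and trailing whitespace; casefold() is Unicode-aware.
    return " ".join(label.split()).casefold()
```

Two labels then match when their normalized forms are equal, which is
why `[ΑΓΩ]` and `[αγω]` refer to the same definition.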
2894
2895
2896 Here is a link reference definition with no corresponding link.
2897 It contributes nothing to the document.
2898
2899 ```````````````````````````````` example
2900 [foo]: /url
2901 .
2902 ````````````````````````````````
2903
2904
2905 Here is another one:
2906
2907 ```````````````````````````````` example
2908 [
2909 foo
2910 ]: /url
2911 bar
2912 .
2913 <p>bar</p>
2914 ````````````````````````````````
2915
2916
2917 This is not a link reference definition, because there are
2918 [non-whitespace characters] after the title:
2919
2920 ```````````````````````````````` example
2921 [foo]: /url "title" ok
2922 .
2923 <p>[foo]: /url &quot;title&quot; ok</p>
2924 ````````````````````````````````
2925
2926
2927 This is a link reference definition, but it has no title:
2928
2929 ```````````````````````````````` example
2930 [foo]: /url
2931 "title" ok
2932 .
2933 <p>&quot;title&quot; ok</p>
2934 ````````````````````````````````
2935
2936
2937 This is not a link reference definition, because it is indented
2938 four spaces:
2939
2940 ```````````````````````````````` example
2941     [foo]: /url "title"
2942
2943 [foo]
2944 .
2945 <pre><code>[foo]: /url &quot;title&quot;
2946 </code></pre>
2947 <p>[foo]</p>
2948 ````````````````````````````````
2949
2950
2951 This is not a link reference definition, because it occurs inside
2952 a code block:
2953
2954 ```````````````````````````````` example
2955 ```
2956 [foo]: /url
2957 ```
2958
2959 [foo]
2960 .
2961 <pre><code>[foo]: /url
2962 </code></pre>
2963 <p>[foo]</p>
2964 ````````````````````````````````
2965
2966
2967 A [link reference definition] cannot interrupt a paragraph.
2968
2969 ```````````````````````````````` example
2970 Foo
2971 [bar]: /baz
2972
2973 [bar]
2974 .
2975 <p>Foo
2976 [bar]: /baz</p>
2977 <p>[bar]</p>
2978 ````````````````````````````````
2979
2980
2981 However, it can directly follow other block elements, such as headings
2982 and thematic breaks, and it need not be followed by a blank line.
2983
2984 ```````````````````````````````` example
2985 # [Foo]
2986 [foo]: /url
2987 > bar
2988 .
2989 <h1><a href="/url">Foo</a></h1>
2990 <blockquote>
2991 <p>bar</p>
2992 </blockquote>
2993 ````````````````````````````````
2994
2995
2996 Several [link reference definitions]
2997 can occur one after another, without intervening blank lines.
2998
2999 ```````````````````````````````` example
3000 [foo]: /foo-url "foo"
3001 [bar]: /bar-url
3002   "bar"
3003 [baz]: /baz-url
3004
3005 [foo],
3006 [bar],
3007 [baz]
3008 .
3009 <p><a href="/foo-url" title="foo">foo</a>,
3010 <a href="/bar-url" title="bar">bar</a>,
3011 <a href="/baz-url">baz</a></p>
3012 ````````````````````````````````
3013
3014
3015 [Link reference definitions] can occur
3016 inside block containers, like lists and block quotations.  They
3017 affect the entire document, not just the container in which they
3018 are defined:
3019
3020 ```````````````````````````````` example
3021 [foo]
3022
3023 > [foo]: /url
3024 .
3025 <p><a href="/url">foo</a></p>
3026 <blockquote>
3027 </blockquote>
3028 ````````````````````````````````
3029
3030
3031
3032 ## Paragraphs
3033
3034 A sequence of non-blank lines that cannot be interpreted as other
3035 kinds of blocks forms a [paragraph](@).
3036 The contents of the paragraph are the result of parsing the
3037 paragraph's raw content as inlines.  The paragraph's raw content
3038 is formed by concatenating the lines and removing initial and final
3039 [whitespace].
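
As a rough illustration, the raw content could be computed as follows.
This sketch assumes the lines have already been identified as belonging
to one paragraph; the function name is hypothetical.

```python
def paragraph_raw_content(lines):
    """Join paragraph lines, dropping leading whitespace on each line
    and any whitespace at the very start and end of the paragraph."""
    stripped = [line.lstrip(" \t") for line in lines]
    return "\n".join(stripped).strip()
```

Trailing spaces on interior lines are kept, since they may still form a
hard line break during inline parsing; only the end of the last line is
trimmed.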
3040
3041 A simple example with two paragraphs:
3042
3043 ```````````````````````````````` example
3044 aaa
3045
3046 bbb
3047 .
3048 <p>aaa</p>
3049 <p>bbb</p>
3050 ````````````````````````````````
3051
3052
3053 Paragraphs can contain multiple lines, but no blank lines:
3054
3055 ```````````````````````````````` example
3056 aaa
3057 bbb
3058
3059 ccc
3060 ddd
3061 .
3062 <p>aaa
3063 bbb</p>
3064 <p>ccc
3065 ddd</p>
3066 ````````````````````````````````
3067
3068
3069 Multiple blank lines between paragraphs have no effect:
3070
3071 ```````````````````````````````` example
3072 aaa
3073
3074
3075 bbb
3076 .
3077 <p>aaa</p>
3078 <p>bbb</p>
3079 ````````````````````````````````
3080
3081
3082 Leading spaces are skipped:
3083
3084 ```````````````````````````````` example
3085   aaa
3086  bbb
3087 .
3088 <p>aaa
3089 bbb</p>
3090 ````````````````````````````````
3091
3092
3093 Lines after the first may be indented any amount, since indented
3094 code blocks cannot interrupt paragraphs.
3095
3096 ```````````````````````````````` example
3097 aaa
3098              bbb
3099                                        ccc
3100 .
3101 <p>aaa
3102 bbb
3103 ccc</p>
3104 ````````````````````````````````
3105
3106
3107 However, the first line may be indented at most three spaces,
3108 or an indented code block will be triggered:
3109
3110 ```````````````````````````````` example
3111    aaa
3112 bbb
3113 .
3114 <p>aaa
3115 bbb</p>
3116 ````````````````````````````````
3117
3118
3119 ```````````````````````````````` example
3120     aaa
3121 bbb
3122 .
3123 <pre><code>aaa
3124 </code></pre>
3125 <p>bbb</p>
3126 ````````````````````````````````
3127
3128
3129 Final spaces are stripped before inline parsing, so a paragraph
3130 that ends with two or more spaces will not end with a [hard line
3131 break]:
3132
3133 ```````````````````````````````` example
3134 aaa     
3135 bbb     
3136 .
3137 <p>aaa<br />
3138 bbb</p>
3139 ````````````````````````````````
3140
3141
3142 ## Blank lines
3143
3144 [Blank lines] between block-level elements are ignored,
3145 except for the role they play in determining whether a [list]
3146 is [tight] or [loose].
3147
3148 Blank lines at the beginning and end of the document are also ignored.
3149
3150 ```````````````````````````````` example
3151   
3152
3153 aaa
3154   
3155
3156 # aaa
3157
3158   
3159 .
3160 <p>aaa</p>
3161 <h1>aaa</h1>
3162 ````````````````````````````````
3163
3164
3165
3166 # Container blocks
3167
3168 A [container block] is a block that has other
3169 blocks as its contents.  There are two basic kinds of container blocks:
3170 [block quotes] and [list items].
3171 [Lists] are meta-containers for [list items].
3172
3173 We define the syntax for container blocks recursively.  The general
3174 form of the definition is:
3175
3176 > If X is a sequence of blocks, then the result of
3177 > transforming X in such-and-such a way is a container of type Y
3178 > with these blocks as its content.
3179
3180 So, we explain what counts as a block quote or list item by explaining
3181 how these can be *generated* from their contents. This should suffice
3182 to define the syntax, although it does not give a recipe for *parsing*
3183 these constructions.  (A recipe is provided below in the section entitled
3184 [A parsing strategy](#appendix-a-parsing-strategy).)
3185
3186 ## Block quotes
3187
3188 A [block quote marker](@)
3189 consists of 0-3 spaces of initial indent, plus (a) the character `>` together
3190 with a following space, or (b) a single character `>` not followed by a space.
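
The marker definition translates directly into a short pattern.  The
sketch below ignores tabs, and the names `BLOCK_QUOTE_MARKER` and
`strip_block_quote_marker` are hypothetical.

```python
import re

# 0-3 spaces of indent, then `>`, then at most one following space.
BLOCK_QUOTE_MARKER = re.compile(r"^ {0,3}> ?")


def strip_block_quote_marker(line):
    """Return the line without its block quote marker, or None if the
    line does not begin with one."""
    m = BLOCK_QUOTE_MARKER.match(line)
    return None if m is None else line[m.end():]
```

A line indented four or more spaces returns `None` here, because it
cannot start with a block quote marker.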
3191
3192 The following rules define [block quotes]:
3193
3194 1.  **Basic case.**  If a string of lines *Ls* constitute a sequence
3195     of blocks *Bs*, then the result of prepending a [block quote
3196     marker] to the beginning of each line in *Ls*
3197     is a [block quote](#block-quotes) containing *Bs*.
3198
3199 2.  **Laziness.**  If a string of lines *Ls* constitute a [block
3200     quote](#block-quotes) with contents *Bs*, then the result of deleting
3201     the initial [block quote marker] from one or
3202     more lines in which the next [non-whitespace character] after the [block
3203     quote marker] is [paragraph continuation
3204     text] is a block quote with *Bs* as its content.
3205     [Paragraph continuation text](@) is text
3206     that will be parsed as part of the content of a paragraph, but does
3207     not occur at the beginning of the paragraph.
3208
3209 3.  **Consecutiveness.**  A document cannot contain two [block
3210     quotes] in a row unless there is a [blank line] between them.
3211
3212 Nothing else counts as a [block quote](#block-quotes).
3213
3214 Here is a simple example:
3215
3216 ```````````````````````````````` example
3217 > # Foo
3218 > bar
3219 > baz
3220 .
3221 <blockquote>
3222 <h1>Foo</h1>
3223 <p>bar
3224 baz</p>
3225 </blockquote>
3226 ````````````````````````````````
3227
3228
3229 The spaces after the `>` characters can be omitted:
3230
3231 ```````````````````````````````` example
3232 ># Foo
3233 >bar
3234 > baz
3235 .
3236 <blockquote>
3237 <h1>Foo</h1>
3238 <p>bar
3239 baz</p>
3240 </blockquote>
3241 ````````````````````````````````
3242
3243
3244 The `>` characters can be indented 1-3 spaces:
3245
3246 ```````````````````````````````` example
3247    > # Foo
3248    > bar
3249  > baz
3250 .
3251 <blockquote>
3252 <h1>Foo</h1>
3253 <p>bar
3254 baz</p>
3255 </blockquote>
3256 ````````````````````````````````
3257
3258
3259 Four spaces give us a code block:
3260
3261 ```````````````````````````````` example
3262     > # Foo
3263     > bar
3264     > baz
3265 .
3266 <pre><code>&gt; # Foo
3267 &gt; bar
3268 &gt; baz
3269 </code></pre>
3270 ````````````````````````````````
3271
3272
3273 The Laziness clause allows us to omit the `>` before
3274 [paragraph continuation text]:
3275
3276 ```````````````````````````````` example
3277 > # Foo
3278 > bar
3279 baz
3280 .
3281 <blockquote>
3282 <h1>Foo</h1>
3283 <p>bar
3284 baz</p>
3285 </blockquote>
3286 ````````````````````````````````
3287
3288
3289 A block quote can contain some lazy and some non-lazy
3290 continuation lines:
3291
3292 ```````````````````````````````` example
3293 > bar
3294 baz
3295 > foo
3296 .
3297 <blockquote>
3298 <p>bar
3299 baz
3300 foo</p>
3301 </blockquote>
3302 ````````````````````````````````
3303
3304
3305 Laziness only applies to lines that would have been continuations of
3306 paragraphs had they been prepended with [block quote markers].
3307 For example, the `> ` cannot be omitted in the second line of
3308
3309 ``` markdown
3310 > foo
3311 > ---
3312 ```
3313
3314 without changing the meaning:
3315
3316 ```````````````````````````````` example
3317 > foo
3318 ---
3319 .
3320 <blockquote>
3321 <p>foo</p>
3322 </blockquote>
3323 <hr />
3324 ````````````````````````````````
3325
3326
3327 Similarly, if we omit the `> ` in the second line of
3328
3329 ``` markdown
3330 > - foo
3331 > - bar
3332 ```
3333
3334 then the block quote ends after the first line:
3335
3336 ```````````````````````````````` example
3337 > - foo
3338 - bar
3339 .
3340 <blockquote>
3341 <ul>
3342 <li>foo</li>
3343 </ul>
3344 </blockquote>
3345 <ul>
3346 <li>bar</li>
3347 </ul>
3348 ````````````````````````````````
3349
3350
3351 For the same reason, we can't omit the `> ` in front of
3352 subsequent lines of an indented or fenced code block:
3353
3354 ```````````````````````````````` example
3355 >     foo
3356     bar
3357 .
3358 <blockquote>
3359 <pre><code>foo
3360 </code></pre>
3361 </blockquote>
3362 <pre><code>bar
3363 </code></pre>
3364 ````````````````````````````````
3365
3366
3367 ```````````````````````````````` example
3368 > ```
3369 foo
3370 ```
3371 .
3372 <blockquote>
3373 <pre><code></code></pre>
3374 </blockquote>
3375 <p>foo</p>
3376 <pre><code></code></pre>
3377 ````````````````````````````````
3378
3379
3380 Note that in the following case, we have a [lazy
3381 continuation line]:
3382
3383 ```````````````````````````````` example
3384 > foo
3385     - bar
3386 .
3387 <blockquote>
3388 <p>foo
3389 - bar</p>
3390 </blockquote>
3391 ````````````````````````````````
3392
3393
3394 To see why, note that in
3395
3396 ```markdown
3397 > foo
3398 >     - bar
3399 ```
3400
3401 the `- bar` is indented too far to start a list, and can't
3402 be an indented code block because indented code blocks cannot
3403 interrupt paragraphs, so it is [paragraph continuation text].
3404
3405 A block quote can be empty:
3406
3407 ```````````````````````````````` example
3408 >
3409 .
3410 <blockquote>
3411 </blockquote>
3412 ````````````````````````````````
3413
3414
3415 ```````````````````````````````` example
3416 >
3417 >  
3418
3419 .
3420 <blockquote>
3421 </blockquote>
3422 ````````````````````````````````
3423
3424
3425 A block quote can have initial or final blank lines:
3426
3427 ```````````````````````````````` example
3428 >
3429 > foo
3430 >  
3431 .
3432 <blockquote>
3433 <p>foo</p>
3434 </blockquote>
3435 ````````````````````````````````
3436
3437
3438 A blank line always separates block quotes:
3439
3440 ```````````````````````````````` example
3441 > foo
3442
3443 > bar
3444 .
3445 <blockquote>
3446 <p>foo</p>
3447 </blockquote>
3448 <blockquote>
3449 <p>bar</p>
3450 </blockquote>
3451 ````````````````````````````````
3452
3453
3454 (Most current Markdown implementations, including John Gruber's
3455 original `Markdown.pl`, will parse this example as a single block quote
3456 with two paragraphs.  But it seems better to allow the author to decide
3457 whether two block quotes or one are wanted.)
3458
3459 Consecutiveness means that if we put these block quotes together,
3460 we get a single block quote:
3461
3462 ```````````````````````````````` example
3463 > foo
3464 > bar
3465 .
3466 <blockquote>
3467 <p>foo
3468 bar</p>
3469 </blockquote>
3470 ````````````````````````````````
3471
3472
3473 To get a block quote with two paragraphs, use:
3474
3475 ```````````````````````````````` example
3476 > foo
3477 >
3478 > bar
3479 .
3480 <blockquote>
3481 <p>foo</p>
3482 <p>bar</p>
3483 </blockquote>
3484 ````````````````````````````````
3485
3486
3487 Block quotes can interrupt paragraphs:
3488
3489 ```````````````````````````````` example
3490 foo
3491 > bar
3492 .
3493 <p>foo</p>
3494 <blockquote>
3495 <p>bar</p>
3496 </blockquote>
3497 ````````````````````````````````
3498
3499
3500 In general, blank lines are not needed before or after block
3501 quotes:
3502
3503 ```````````````````````````````` example
3504 > aaa
3505 ***
3506 > bbb
3507 .
3508 <blockquote>
3509 <p>aaa</p>
3510 </blockquote>
3511 <hr />
3512 <blockquote>
3513 <p>bbb</p>
3514 </blockquote>
3515 ````````````````````````````````
3516
3517
3518 However, because of laziness, a blank line is needed between
3519 a block quote and a following paragraph:
3520
3521 ```````````````````````````````` example
3522 > bar
3523 baz
3524 .
3525 <blockquote>
3526 <p>bar
3527 baz</p>
3528 </blockquote>
3529 ````````````````````````````````
3530
3531
3532 ```````````````````````````````` example
3533 > bar
3534
3535 baz
3536 .
3537 <blockquote>
3538 <p>bar</p>
3539 </blockquote>
3540 <p>baz</p>
3541 ````````````````````````````````
3542
3543
3544 ```````````````````````````````` example
3545 > bar
3546 >
3547 baz
3548 .
3549 <blockquote>
3550 <p>bar</p>
3551 </blockquote>
3552 <p>baz</p>
3553 ````````````````````````````````
3554
3555
3556 It is a consequence of the Laziness rule that any number
3557 of initial `>`s may be omitted on a continuation line of a
3558 nested block quote:
3559
3560 ```````````````````````````````` example
3561 > > > foo
3562 bar
3563 .
3564 <blockquote>
3565 <blockquote>
3566 <blockquote>
3567 <p>foo
3568 bar</p>
3569 </blockquote>
3570 </blockquote>
3571 </blockquote>
3572 ````````````````````````````````
3573
3574
3575 ```````````````````````````````` example
3576 >>> foo
3577 > bar
3578 >>baz
3579 .
3580 <blockquote>
3581 <blockquote>
3582 <blockquote>
3583 <p>foo
3584 bar
3585 baz</p>
3586 </blockquote>
3587 </blockquote>
3588 </blockquote>
3589 ````````````````````````````````
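
To see how many markers a given line actually carries, one can strip
block quote markers repeatedly.  This is only a sketch with hypothetical
names: it counts explicit markers on a single line, whereas the real
nesting depth of a lazy continuation line depends on the open paragraph
that precedes it.

```python
import re

_MARKER = re.compile(r"^ {0,3}> ?")  # 0-3 spaces, `>`, optional space


def block_quote_depth(line):
    """Return (number of leading block quote markers, remaining text)."""
    depth = 0
    while True:
        m = _MARKER.match(line)
        if m is None:
            return depth, line
        depth += 1
        line = line[m.end():]
```

In the example above, `>>> foo` carries three markers, `> bar` one, and
`>>baz` two, yet all three lines end up in the same innermost paragraph
because of laziness.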
3590
3591
3592 When including an indented code block in a block quote,
3593 remember that the [block quote marker] includes
3594 both the `>` and a following space.  So *five spaces* are needed after
3595 the `>`:
3596
3597 ```````````````````````````````` example
3598 >     code
3599
3600 >    not code
3601 .
3602 <blockquote>
3603 <pre><code>code
3604 </code></pre>
3605 </blockquote>
3606 <blockquote>
3607 <p>not code</p>
3608 </blockquote>
3609 ````````````````````````````````
3610
3611
3612
3613 ## List items
3614
3615 A [list marker](@) is a
3616 [bullet list marker] or an [ordered list marker].
3617
3618 A [bullet list marker](@)
3619 is a `-`, `+`, or `*` character.
3620
3621 An [ordered list marker](@)
3622 is a sequence of 1--9 arabic digits (`0-9`), followed by either a
3623 `.` character or a `)` character.  (The reason for the length
3624 limit is that with 10 digits we start seeing integer overflows
3625 in some browsers.)
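
These definitions can be sketched as a single pattern.  The sketch
assumes the marker is indented at most three spaces and must be followed
by a space or the end of the line; it does not check for constructs that
take precedence over list items, such as thematic breaks.  The names
used are hypothetical.

```python
import re

LIST_MARKER = re.compile(
    r"^ {0,3}"
    r"(?:(?P<bullet>[-+*])"                          # bullet list marker
    r"|(?P<number>[0-9]{1,9})(?P<delim>[.)]))"       # ordered list marker
    r"(?= |$)"                                       # space or end of line
)


def parse_list_marker(line):
    """Return (kind, marker character, start number) or None."""
    m = LIST_MARKER.match(line)
    if m is None:
        return None
    if m.group("bullet"):
        return ("bullet", m.group("bullet"), None)
    # int() discards leading zeros, so `003.` starts at 3.
    return ("ordered", m.group("delim"), int(m.group("number")))
```

A ten-digit number fails to match because the pattern accepts at most
nine digits before the delimiter.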
3626
3627 The following rules define [list items]:
3628
3629 1.  **Basic case.**  If a sequence of lines *Ls* constitute a sequence of
3630     blocks *Bs* starting with a [non-whitespace character], and
3631     *M* is a list marker of width *W* followed by 1 ≤ *N* ≤ 4
3632     spaces, then the result of prepending *M* and the following
3633     spaces to the first line of *Ls*, and indenting subsequent
3634     lines of *Ls* by *W + N* spaces, is a list item with *Bs* as
3635     its contents.  The type of the list item
3636     (bullet or ordered) is determined by the type of its list marker.
3637     If the list item is ordered, then it is also assigned a start
3638     number, based on the ordered list marker.
3639
3640     Exceptions: When the first list item in a [list] interrupts
3641     a paragraph---that is, when it starts on a line that would
3642     otherwise count as [paragraph continuation text]---then (a)
3643     the lines *Ls* must not begin with a blank line, and (b) if
3644     the list item is ordered, the start number must be 1.
3645
3646 For example, let *Ls* be the lines
3647
3648 ```````````````````````````````` example
3649 A paragraph
3650 with two lines.
3651
3652     indented code
3653
3654 > A block quote.
3655 .
3656 <p>A paragraph
3657 with two lines.</p>
3658 <pre><code>indented code
3659 </code></pre>
3660 <blockquote>
3661 <p>A block quote.</p>
3662 </blockquote>
3663 ````````````````````````````````
3664
3665
3666 And let *M* be the marker `1.`, and *N* = 2.  Then rule #1 says
3667 that the following is an ordered list item with start number 1,
3668 and the same contents as *Ls*:
3669
3670 ```````````````````````````````` example
3671 1.  A paragraph
3672     with two lines.
3673
3674         indented code
3675
3676     > A block quote.
3677 .
3678 <ol>
3679 <li>
3680 <p>A paragraph
3681 with two lines.</p>
3682 <pre><code>indented code
3683 </code></pre>
3684 <blockquote>
3685 <p>A block quote.</p>
3686 </blockquote>
3687 </li>
3688 </ol>
3689 ````````````````````````````````
3690
3691
3692 The most important thing to notice is that the position of
3693 the text after the list marker determines how much indentation
3694 is needed in subsequent blocks in the list item.  If the list
3695 marker takes up two spaces, and there are three spaces between
3696 the list marker and the next [non-whitespace character], then blocks
3697 must be indented five spaces in order to fall under the list
3698 item.
3699
3700 Here are some examples showing how far content must be indented to be
3701 put under the list item:
3702
3703 ```````````````````````````````` example
3704 - one
3705
3706  two
3707 .
3708 <ul>
3709 <li>one</li>
3710 </ul>
3711 <p>two</p>
3712 ````````````````````````````````
3713
3714
3715 ```````````````````````````````` example
3716 - one
3717
3718   two
3719 .
3720 <ul>
3721 <li>
3722 <p>one</p>
3723 <p>two</p>
3724 </li>
3725 </ul>
3726 ````````````````````````````````
3727
3728
3729 ```````````````````````````````` example
3730  -    one
3731
3732      two
3733 .
3734 <ul>
3735 <li>one</li>
3736 </ul>
3737 <pre><code> two
3738 </code></pre>
3739 ````````````````````````````````
3740
3741
3742 ```````````````````````````````` example
3743  -    one
3744
3745       two
3746 .
3747 <ul>
3748 <li>
3749 <p>one</p>
3750 <p>two</p>
3751 </li>
3752 </ul>
3753 ````````````````````````````````
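
The indentation required in these examples can be computed from the
first line of the item.  The following sketch measures from the start of
the line, so it only applies once any enclosing container markers have
been removed.  It assumes the item does not begin with a blank line, and
it treats five or more spaces after the marker as one space followed by
indented code.  The function name is hypothetical.

```python
import re

_ITEM_START = re.compile(r"^( {0,3})([-+*]|[0-9]{1,9}[.)])( +)(?=\S)")


def continuation_indent(first_line):
    """Return how many spaces later lines need in order to belong to
    the list item that starts on first_line, or None if no item starts."""
    m = _ITEM_START.match(first_line)
    if m is None:
        return None
    indent, marker, spaces = m.groups()
    # With more than four spaces, only one counts as part of the marker.
    n = len(spaces) if len(spaces) <= 4 else 1
    return len(indent) + len(marker) + n
```

For ` -    one` the result is 6, which is why `two` indented five
spaces falls outside the item while six spaces puts it inside.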
3754
3755
3756 It is tempting to think of this in terms of columns:  the continuation
3757 blocks must be indented at least to the column of the first
3758 [non-whitespace character] after the list marker. However, that is not quite right.
3759 The spaces after the list marker determine how much relative indentation
3760 is needed.  Which column this indentation reaches will depend on
3761 how the list item is embedded in other constructions, as shown by
3762 this example:
3763
3764 ```````````````````````````````` example
3765    > > 1.  one
3766 >>
3767 >>     two
3768 .
3769 <blockquote>
3770 <blockquote>
3771 <ol>
3772 <li>
3773 <p>one</p>
3774 <p>two</p>
3775 </li>
3776 </ol>
3777 </blockquote>
3778 </blockquote>
3779 ````````````````````````````````
3780
3781
3782 Here `two` occurs in the same column as the list marker `1.`,
3783 but is actually contained in the list item, because there is
3784 sufficient indentation after the last containing block quote marker.
3785
3786 The converse is also possible.  In the following example, the word `two`
3787 occurs far to the right of the initial text of the list item, `one`, but
3788 it is not considered part of the list item, because it is not indented
3789 far enough past the block quote marker:
3790
3791 ```````````````````````````````` example
3792 >>- one
3793 >>
3794   >  > two
3795 .
3796 <blockquote>
3797 <blockquote>
3798 <ul>
3799 <li>one</li>
3800 </ul>
3801 <p>two</p>
3802 </blockquote>
3803 </blockquote>
3804 ````````````````````````````````
3805
3806
3807 Note that at least one space is needed between the list marker and
3808 any following content, so these are not list items:
3809
3810 ```````````````````````````````` example
3811 -one
3812
3813 2.two
3814 .
3815 <p>-one</p>
3816 <p>2.two</p>
3817 ````````````````````````````````
3818
3819
3820 A list item may contain blocks that are separated by more than
3821 one blank line.
3822
3823 ```````````````````````````````` example
3824 - foo
3825
3826
3827   bar
3828 .
3829 <ul>
3830 <li>
3831 <p>foo</p>
3832 <p>bar</p>
3833 </li>
3834 </ul>
3835 ````````````````````````````````
3836
3837
3838 A list item may contain any kind of block:
3839
3840 ```````````````````````````````` example
3841 1.  foo
3842
3843     ```
3844     bar
3845     ```
3846
3847     baz
3848
3849     > bam
3850 .
3851 <ol>
3852 <li>
3853 <p>foo</p>
3854 <pre><code>bar
3855 </code></pre>
3856 <p>baz</p>
3857 <blockquote>
3858 <p>bam</p>
3859 </blockquote>
3860 </li>
3861 </ol>
3862 ````````````````````````````````
3863
3864
3865 A list item that contains an indented code block will preserve
3866 empty lines within the code block verbatim.
3867
3868 ```````````````````````````````` example
3869 - Foo
3870
3871       bar
3872
3873
3874       baz
3875 .
3876 <ul>
3877 <li>
3878 <p>Foo</p>
3879 <pre><code>bar
3880
3881
3882 baz
3883 </code></pre>
3884 </li>
3885 </ul>
3886 ````````````````````````````````
3887
3888 Note that ordered list start numbers must be nine digits or fewer:
3889
3890 ```````````````````````````````` example
3891 123456789. ok
3892 .
3893 <ol start="123456789">
3894 <li>ok</li>
3895 </ol>
3896 ````````````````````````````````
3897
3898
3899 ```````````````````````````````` example
3900 1234567890. not ok
3901 .
3902 <p>1234567890. not ok</p>
3903 ````````````````````````````````
3904
3905
3906 A start number may begin with 0s:
3907
3908 ```````````````````````````````` example
3909 0. ok
3910 .
3911 <ol start="0">
3912 <li>ok</li>
3913 </ol>
3914 ````````````````````````````````
3915
3916
3917 ```````````````````````````````` example
3918 003. ok
3919 .
3920 <ol start="3">
3921 <li>ok</li>
3922 </ol>
3923 ````````````````````````````````
3924
3925
3926 A start number may not be negative:
3927
3928 ```````````````````````````````` example
3929 -1. not ok
3930 .
3931 <p>-1. not ok</p>
3932 ````````````````````````````````
3933
3934
3935
3936 2.  **Item starting with indented code.**  If a sequence of lines *Ls*
3937     constitute a sequence of blocks *Bs* starting with an indented code
3938     block, and *M* is a list marker of width *W* followed by one
3939     space, then the result of prepending *M* and the following
3940     space to the first line of *Ls*, and indenting subsequent
3941     lines of *Ls* by *W + 1* spaces, is a list item with *Bs* as
3942     its contents.
3943     If a line is empty, then it need not be indented.  The type of the
3944     list item (bullet or ordered) is determined by the type of its list
3945     marker.  If the list item is ordered, then it is also assigned a
3946     start number, based on the ordered list marker.
3947
3948 An indented code block will have to be indented four spaces beyond
3949 the edge of the region where text will be included in the list item.
3950 In the following case that is 6 spaces:
3951
3952 ```````````````````````````````` example
3953 - foo
3954
3955       bar
3956 .
3957 <ul>
3958 <li>
3959 <p>foo</p>
3960 <pre><code>bar
3961 </code></pre>
3962 </li>
3963 </ul>
3964 ````````````````````````````````
3965
3966
3967 And in this case it is 11 spaces:
3968
3969 ```````````````````````````````` example
3970   10.  foo
3971
3972            bar
3973 .
3974 <ol start="10">
3975 <li>
3976 <p>foo</p>
3977 <pre><code>bar
3978 </code></pre>
3979 </li>
3980 </ol>
3981 ````````````````````````````````
3982
3983
3984 If the *first* block in the list item is an indented code block,
3985 then by rule #2, the contents must be indented *one* space after the
3986 list marker:
3987
3988 ```````````````````````````````` example
3989     indented code
3990
3991 paragraph
3992
3993     more code
3994 .
3995 <pre><code>indented code
3996 </code></pre>
3997 <p>paragraph</p>
3998 <pre><code>more code
3999 </code></pre>
4000 ````````````````````````````````
4001
4002
4003 ```````````````````````````````` example
4004 1.     indented code
4005
4006    paragraph
4007
4008        more code
4009 .
4010 <ol>
4011 <li>
4012 <pre><code>indented code
4013 </code></pre>
4014 <p>paragraph</p>
4015 <pre><code>more code
4016 </code></pre>
4017 </li>
4018 </ol>
4019 ````````````````````````````````
4020
4021
4022 Note that an additional space indent is interpreted as space
4023 inside the code block:
4024
4025 ```````````````````````````````` example
4026 1.      indented code
4027
4028    paragraph
4029
4030        more code
4031 .
4032 <ol>
4033 <li>
4034 <pre><code> indented code
4035 </code></pre>
4036 <p>paragraph</p>
4037 <pre><code>more code
4038 </code></pre>
4039 </li>
4040 </ol>
4041 ````````````````````````````````
4042
4043
4044 Note that rules #1 and #2 only apply to two cases:  (a) cases
4045 in which the lines to be included in a list item begin with a
4046 [non-whitespace character], and (b) cases in which
4047 they begin with an indented code
4048 block.  In a case like the following, where the first block begins with
4049 a three-space indent, the rules do not allow us to form a list item by
4050 indenting the whole thing and prepending a list marker:
4051
4052 ```````````````````````````````` example
4053    foo
4054
4055 bar
4056 .
4057 <p>foo</p>
4058 <p>bar</p>
4059 ````````````````````````````````
4060
4061
4062 ```````````````````````````````` example
4063 -    foo
4064
4065   bar
4066 .
4067 <ul>
4068 <li>foo</li>
4069 </ul>
4070 <p>bar</p>
4071 ````````````````````````````````
4072
4073
4074 This is not a significant restriction, because when a block begins
4075 with 1-3 spaces indent, the indentation can always be removed without
4076 a change in interpretation, allowing rule #1 to be applied.  So, in
4077 the above case:
4078
4079 ```````````````````````````````` example
4080 -  foo
4081
4082    bar
4083 .
4084 <ul>
4085 <li>
4086 <p>foo</p>
4087 <p>bar</p>
4088 </li>
4089 </ul>
4090 ````````````````````````````````
4091
4092
4093 3.  **Item starting with a blank line.**  If a sequence of lines *Ls*
4094     starting with a single [blank line] constitutes a (possibly empty)
4095     sequence of blocks *Bs*, not separated from each other by more than
4096     one blank line, and *M* is a list marker of width *W*,
4097     then the result of prepending *M* to the first line of *Ls*, and
4098     indenting subsequent lines of *Ls* by *W + 1* spaces, is a list
4099     item with *Bs* as its contents.
4100     If a line is empty, then it need not be indented.  The type of the
4101     list item (bullet or ordered) is determined by the type of its list
4102     marker.  If the list item is ordered, then it is also assigned a
4103     start number, based on the ordered list marker.
4104
4105 Here are some list items that start with a blank line but are not empty:
4106
4107 ```````````````````````````````` example
4108 -
4109   foo
4110 -
4111   ```
4112   bar
4113   ```
4114 -
4115       baz
4116 .
4117 <ul>
4118 <li>foo</li>
4119 <li>
4120 <pre><code>bar
4121 </code></pre>
4122 </li>
4123 <li>
4124 <pre><code>baz
4125 </code></pre>
4126 </li>
4127 </ul>
4128 ````````````````````````````````
4129
4130 When the list item starts with a blank line, the number of spaces
4131 following the list marker doesn't change the required indentation:
4132
4133 ```````````````````````````````` example
4134 -   
4135   foo
4136 .
4137 <ul>
4138 <li>foo</li>
4139 </ul>
4140 ````````````````````````````````
4141
4142
4143 A list item can begin with at most one blank line.
4144 In the following example, `foo` is not part of the list
4145 item:
4146
4147 ```````````````````````````````` example
4148 -
4149
4150   foo
4151 .
4152 <ul>
4153 <li></li>
4154 </ul>
4155 <p>foo</p>
4156 ````````````````````````````````
4157
4158
4159 Here is an empty bullet list item:
4160
4161 ```````````````````````````````` example
4162 - foo
4163 -
4164 - bar
4165 .
4166 <ul>
4167 <li>foo</li>
4168 <li></li>
4169 <li>bar</li>
4170 </ul>
4171 ````````````````````````````````
4172
4173
4174 It does not matter whether there are spaces following the [list marker]:
4175
4176 ```````````````````````````````` example
4177 - foo
4178 -   
4179 - bar
4180 .
4181 <ul>
4182 <li>foo</li>
4183 <li></li>
4184 <li>bar</li>
4185 </ul>
4186 ````````````````````````````````
4187
4188
4189 Here is an empty ordered list item:
4190
4191 ```````````````````````````````` example
4192 1. foo
4193 2.
4194 3. bar
4195 .
4196 <ol>
4197 <li>foo</li>
4198 <li></li>
4199 <li>bar</li>
4200 </ol>
4201 ````````````````````````````````
4202
4203
4204 A list may start or end with an empty list item:
4205
4206 ```````````````````````````````` example
4207 *
4208 .
4209 <ul>
4210 <li></li>
4211 </ul>
4212 ````````````````````````````````
4213
4214 However, an empty list item cannot interrupt a paragraph:
4215
4216 ```````````````````````````````` example
4217 foo
4218 *
4219
4220 foo
4221 1.
4222 .
4223 <p>foo
4224 *</p>
4225 <p>foo
4226 1.</p>
4227 ````````````````````````````````
4228
4229
4230 4.  **Indentation.**  If a sequence of lines *Ls* constitutes a list item
4231     according to rule #1, #2, or #3, then the result of indenting each line
4232     of *Ls* by 1-3 spaces (the same for each line) also constitutes a
4233     list item with the same contents and attributes.  If a line is
4234     empty, then it need not be indented.
4235
4236 Indented one space:
4237
4238 ```````````````````````````````` example
4239  1.  A paragraph
4240      with two lines.
4241
4242          indented code
4243
4244      > A block quote.
4245 .
4246 <ol>
4247 <li>
4248 <p>A paragraph
4249 with two lines.</p>
4250 <pre><code>indented code
4251 </code></pre>
4252 <blockquote>
4253 <p>A block quote.</p>
4254 </blockquote>
4255 </li>
4256 </ol>
4257 ````````````````````````````````
4258
4259
4260 Indented two spaces:
4261
4262 ```````````````````````````````` example
4263   1.  A paragraph
4264       with two lines.
4265
4266           indented code
4267
4268       > A block quote.
4269 .
4270 <ol>
4271 <li>
4272 <p>A paragraph
4273 with two lines.</p>
4274 <pre><code>indented code
4275 </code></pre>
4276 <blockquote>
4277 <p>A block quote.</p>
4278 </blockquote>
4279 </li>
4280 </ol>
4281 ````````````````````````````````
4282
4283
4284 Indented three spaces:
4285
4286 ```````````````````````````````` example
4287    1.  A paragraph
4288        with two lines.
4289
4290            indented code
4291
4292        > A block quote.
4293 .
4294 <ol>
4295 <li>
4296 <p>A paragraph
4297 with two lines.</p>
4298 <pre><code>indented code
4299 </code></pre>
4300 <blockquote>
4301 <p>A block quote.</p>
4302 </blockquote>
4303 </li>
4304 </ol>
4305 ````````````````````````````````
4306
4307
4308 Four spaces indent gives a code block:
4309
4310 ```````````````````````````````` example
4311     1.  A paragraph
4312         with two lines.
4313
4314             indented code
4315
4316         > A block quote.
4317 .
4318 <pre><code>1.  A paragraph
4319     with two lines.
4320
4321         indented code
4322
4323     &gt; A block quote.
4324 </code></pre>
4325 ````````````````````````````````
4326
4327
4328
4329 5.  **Laziness.**  If a string of lines *Ls* constitutes a [list
4330     item](#list-items) with contents *Bs*, then the result of deleting
4331     some or all of the indentation from one or more lines in which the
4332     next [non-whitespace character] after the indentation is
4333     [paragraph continuation text] is a
4334     list item with the same contents and attributes.  The unindented
4335     lines are called
4336     [lazy continuation line](@)s.
4337
4338 Here is an example with [lazy continuation lines]:
4339
4340 ```````````````````````````````` example
4341   1.  A paragraph
4342 with two lines.
4343
4344           indented code
4345
4346       > A block quote.
4347 .
4348 <ol>
4349 <li>
4350 <p>A paragraph
4351 with two lines.</p>
4352 <pre><code>indented code
4353 </code></pre>
4354 <blockquote>
4355 <p>A block quote.</p>
4356 </blockquote>
4357 </li>
4358 </ol>
4359 ````````````````````````````````
4360
4361
4362 Indentation can be partially deleted:
4363
4364 ```````````````````````````````` example
4365   1.  A paragraph
4366     with two lines.
4367 .
4368 <ol>
4369 <li>A paragraph
4370 with two lines.</li>
4371 </ol>
4372 ````````````````````````````````
4373
4374
4375 These examples show how laziness can work in nested structures:
4376
4377 ```````````````````````````````` example
4378 > 1. > Blockquote
4379 continued here.
4380 .
4381 <blockquote>
4382 <ol>
4383 <li>
4384 <blockquote>
4385 <p>Blockquote
4386 continued here.</p>
4387 </blockquote>
4388 </li>
4389 </ol>
4390 </blockquote>
4391 ````````````````````````````````
4392
4393
4394 ```````````````````````````````` example
4395 > 1. > Blockquote
4396 > continued here.
4397 .
4398 <blockquote>
4399 <ol>
4400 <li>
4401 <blockquote>
4402 <p>Blockquote
4403 continued here.</p>
4404 </blockquote>
4405 </li>
4406 </ol>
4407 </blockquote>
4408 ````````````````````````````````
4409
4410
4411
4412 6.  **That's all.** Nothing that is not counted as a list item by rules
4413     #1--5 counts as a [list item](#list-items).
4414
4415 The rules for sublists follow from the general rules above.  A sublist
4416 must be indented the same number of spaces a paragraph would need to be
4417 in order to be included in the list item.
4418
4419 So, in this case we need two spaces indent:
4420
4421 ```````````````````````````````` example
4422 - foo
4423   - bar
4424     - baz
4425       - boo
4426 .
4427 <ul>
4428 <li>foo
4429 <ul>
4430 <li>bar
4431 <ul>
4432 <li>baz
4433 <ul>
4434 <li>boo</li>
4435 </ul>
4436 </li>
4437 </ul>
4438 </li>
4439 </ul>
4440 </li>
4441 </ul>
4442 ````````````````````````````````
4443
4444
4445 One is not enough:
4446
4447 ```````````````````````````````` example
4448 - foo
4449  - bar
4450   - baz
4451    - boo
4452 .
4453 <ul>
4454 <li>foo</li>
4455 <li>bar</li>
4456 <li>baz</li>
4457 <li>boo</li>
4458 </ul>
4459 ````````````````````````````````
4460
4461
4462 Here we need four, because the list marker is wider:
4463
4464 ```````````````````````````````` example
4465 10) foo
4466     - bar
4467 .
4468 <ol start="10">
4469 <li>foo
4470 <ul>
4471 <li>bar</li>
4472 </ul>
4473 </li>
4474 </ol>
4475 ````````````````````````````````
4476
4477
4478 Three is not enough:
4479
4480 ```````````````````````````````` example
4481 10) foo
4482    - bar
4483 .
4484 <ol start="10">
4485 <li>foo</li>
4486 </ol>
4487 <ul>
4488 <li>bar</li>
4489 </ul>
4490 ````````````````````````````````
4491
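The four sublist examples above can be reproduced with a small Python
sketch.  It is a simplification under stated assumptions, not the
parsing algorithm of any implementation: the name `nesting_depths` is
hypothetical, every input line must be a one-line list item with text
after the marker, only spaces are used for indentation, and indented
code, laziness, and blank lines are ignored.  A marker nests inside an
open item only if it is indented at least as far as that item's content.

```python
import re

# Leading spaces, a list marker, then 1-4 spaces followed by text.
_ITEM_START = re.compile(r"^( *)(?:[-+*]|\d{1,9}[.)]) {1,4}(?=\S)")


def nesting_depths(lines):
    """Return the nesting depth (0 = outermost) of each list item line."""
    open_items = []   # content columns of the list items still open
    depths = []
    for line in lines:
        m = _ITEM_START.match(line)
        if m is None:
            raise ValueError("not a simple list item: %r" % line)
        indent = len(m.group(1))
        # Close every item whose content column lies right of this marker.
        while open_items and indent < open_items[-1]:
            open_items.pop()
        depths.append(len(open_items))
        # The content of this item starts where the match ends.
        open_items.append(m.end())
    return depths


print(nesting_depths(["- foo", "  - bar", "    - baz", "      - boo"]))  # [0, 1, 2, 3]
print(nesting_depths(["10) foo", "   - bar"]))                           # [0, 0]
```
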
4492
4493 A list may be the first block in a list item:
4494
4495 ```````````````````````````````` example
4496 - - foo
4497 .
4498 <ul>
4499 <li>
4500 <ul>
4501 <li>foo</li>
4502 </ul>
4503 </li>
4504 </ul>
4505 ````````````````````````````````
4506
4507
4508 ```````````````````````````````` example
4509 1. - 2. foo
4510 .
4511 <ol>
4512 <li>
4513 <ul>
4514 <li>
4515 <ol start="2">
4516 <li>foo</li>
4517 </ol>
4518 </li>
4519 </ul>
4520 </li>
4521 </ol>
4522 ````````````````````````````````
4523
4524
4525 A list item can contain a heading:
4526
4527 ```````````````````````````````` example
4528 - # Foo
4529 - Bar
4530   ---
4531   baz
4532 .
4533 <ul>
4534 <li>
4535 <h1>Foo</h1>
4536 </li>
4537 <li>
4538 <h2>Bar</h2>
4539 baz</li>
4540 </ul>
4541 ````````````````````````````````
4542
4543
4544 ### Motivation
4545
4546 John Gruber's Markdown spec says the following about list items:
4547
4548 1. "List markers typically start at the left margin, but may be indented
4549    by up to three spaces. List markers must be followed by one or more
4550    spaces or a tab."
4551
4552 2. "To make lists look nice, you can wrap items with hanging indents....
4553    But if you don't want to, you don't have to."
4554
4555 3. "List items may consist of multiple paragraphs. Each subsequent
4556    paragraph in a list item must be indented by either 4 spaces or one
4557    tab."
4558
4559 4. "It looks nice if you indent every line of the subsequent paragraphs,
4560    but here again, Markdown will allow you to be lazy."
4561
4562 5. "To put a blockquote within a list item, the blockquote's `>`
4563    delimiters need to be indented."
4564
4565 6. "To put a code block within a list item, the code block needs to be
4566    indented twice — 8 spaces or two tabs."
4567
4568 These rules specify that a paragraph under a list item must be indented
4569 four spaces (presumably, from the left margin, rather than the start of
4570 the list marker, but this is not said), and that code under a list item
4571 must be indented eight spaces instead of the usual four.  They also say
4572 that a block quote must be indented, but not by how much; however, the
4573 example given has four spaces indentation.  Although nothing is said
4574 about other kinds of block-level content, it is certainly reasonable to
4575 infer that *all* block elements under a list item, including other
4576 lists, must be indented four spaces.  This principle has been called the
4577 *four-space rule*.
4578
4579 The four-space rule is clear and principled, and if the reference
4580 implementation `Markdown.pl` had followed it, it probably would have
4581 become the standard.  However, `Markdown.pl` allowed paragraphs and
4582 sublists to start with only two spaces indentation, at least on the
4583 outer level.  Worse, its behavior was inconsistent: a sublist of an
4584 outer-level list needed two spaces indentation, but a sublist of this
4585 sublist needed three spaces.  It is not surprising, then, that different
4586 implementations of Markdown have developed very different rules for
4587 determining what comes under a list item.  (Pandoc and python-Markdown,
4588 for example, stuck with Gruber's syntax description and the four-space
4589 rule, while discount, redcarpet, marked, PHP Markdown, and others
4590 followed `Markdown.pl`'s behavior more closely.)
4591
4592 Unfortunately, given the divergences between implementations, there
4593 is no way to give a spec for list items that will be guaranteed not
4594 to break any existing documents.  However, the spec given here should
4595 correctly handle lists formatted with either the four-space rule or
4596 the more forgiving `Markdown.pl` behavior, provided they are laid out
4597 in a way that is natural for a human to read.
4598
4599 The strategy here is to let the width and indentation of the list marker
4600 determine the indentation necessary for blocks to fall under the list
4601 item, rather than having a fixed and arbitrary number.  The writer can
4602 think of the body of the list item as a unit which gets indented to the
4603 right enough to fit the list marker (and any indentation on the list
4604 marker).  (The laziness rule, #5, then allows continuation lines to be
4605 unindented if needed.)
4606
4607 This rule is superior, we claim, to any rule requiring a fixed level of
4608 indentation from the margin.  The four-space rule is clear but
4609 unnatural. It is quite unintuitive that
4610
4611 ``` markdown
4612 - foo
4613
4614   bar
4615
4616   - baz
4617 ```
4618
4619 should be parsed as two lists with an intervening paragraph,
4620
4621 ``` html
4622 <ul>
4623 <li>foo</li>
4624 </ul>
4625 <p>bar</p>
4626 <ul>
4627 <li>baz</li>
4628 </ul>
4629 ```
4630
4631 as the four-space rule demands, rather than a single list,
4632
4633 ``` html
4634 <ul>
4635 <li>
4636 <p>foo</p>
4637 <p>bar</p>
4638 <ul>
4639 <li>baz</li>
4640 </ul>
4641 </li>
4642 </ul>
4643 ```
4644
4645 The choice of four spaces is arbitrary.  It can be learned, but it is
4646 not likely to be guessed, and it trips up beginners regularly.
4647
4648 Would it help to adopt a two-space rule?  The problem is that such
4649 a rule, together with the rule allowing 1--3 spaces indentation of the
4650 initial list marker, allows text that is indented *less than* the
4651 original list marker to be included in the list item. For example,
4652 `Markdown.pl` parses
4653
4654 ``` markdown
4655    - one
4656
4657   two
4658 ```
4659
4660 as a single list item, with `two` a continuation paragraph:
4661
4662 ``` html
4663 <ul>
4664 <li>
4665 <p>one</p>
4666 <p>two</p>
4667 </li>
4668 </ul>
4669 ```
4670
4671 and similarly
4672
4673 ``` markdown
4674 >   - one
4675 >
4676 >  two
4677 ```
4678
4679 as
4680
4681 ``` html
4682 <blockquote>
4683 <ul>
4684 <li>
4685 <p>one</p>
4686 <p>two</p>
4687 </li>
4688 </ul>
4689 </blockquote>
4690 ```
4691
4692 This is extremely unintuitive.
4693
4694 Rather than requiring a fixed indent from the margin, we could require
4695 a fixed indent (say, two spaces, or even one space) from the list marker (which
4696 may itself be indented).  This proposal would remove the last anomaly
4697 discussed.  Unlike the spec presented above, it would count the following
4698 as a list item with a subparagraph, even though the paragraph `bar`
4699 is not indented as far as the first paragraph `foo`:
4700
4701 ``` markdown
4702  10. foo
4703
4704    bar  
4705 ```
4706
4707 Arguably this text does read like a list item with `bar` as a subparagraph,
4708 which may count in favor of the proposal.  However, on this proposal indented
4709 code would have to be indented six spaces after the list marker.  And this
4710 would break a lot of existing Markdown, which has the pattern:
4711
4712 ``` markdown
4713 1.  foo
4714
4715         indented code
4716 ```
4717
4718 where the code is indented eight spaces.  The spec above, by contrast, will
4719 parse this text as expected, since the code block's indentation is measured
4720 from the beginning of `foo`.
4721
4722 The one case that needs special treatment is a list item that *starts*
4723 with indented code.  How much indentation is required in that case, since
4724 we don't have a "first paragraph" to measure from?  Rule #2 simply stipulates
4725 that in such cases, we require one space indentation from the list marker
4726 (and then the normal four spaces for the indented code).  This will match the
4727 four-space rule in cases where the list marker plus its initial indentation
4728 takes four spaces (a common case), but diverge in other cases.
4729
4730 ## Lists
4731
4732 A [list](@) is a sequence of one or more
4733 list items [of the same type].  The list items
4734 may be separated by any number of blank lines.
4735
4736 Two list items are [of the same type](@)
4737 if they begin with a [list marker] of the same type.
4738 Two list markers are of the
4739 same type if (a) they are bullet list markers using the same character
4740 (`-`, `+`, or `*`) or (b) they are ordered list numbers with the same
4741 delimiter (either `.` or `)`).
4742
4743 A list is an [ordered list](@)
4744 if its constituent list items begin with
4745 [ordered list markers], and a
4746 [bullet list](@) if its constituent list
4747 items begin with [bullet list markers].
4748
4749 The [start number](@)
4750 of an [ordered list] is determined by the list number of
4751 its initial list item.  The numbers of subsequent list items are
4752 disregarded.
4753
4754 A list is [loose](@) if any of its constituent
4755 list items are separated by blank lines, or if any of its constituent
4756 list items directly contain two block-level elements with a blank line
4757 between them.  Otherwise a list is [tight](@).
4758 (The difference in HTML output is that paragraphs in a loose list are
4759 wrapped in `<p>` tags, while paragraphs in a tight list are not.)
4760
4761 Changing the bullet or ordered list delimiter starts a new list:
4762
4763 ```````````````````````````````` example
4764 - foo
4765 - bar
4766 + baz
4767 .
4768 <ul>
4769 <li>foo</li>
4770 <li>bar</li>
4771 </ul>
4772 <ul>
4773 <li>baz</li>
4774 </ul>
4775 ````````````````````````````````
4776
4777
4778 ```````````````````````````````` example
4779 1. foo
4780 2. bar
4781 3) baz
4782 .
4783 <ol>
4784 <li>foo</li>
4785 <li>bar</li>
4786 </ol>
4787 <ol start="3">
4788 <li>baz</li>
4789 </ol>
4790 ````````````````````````````````
4791
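The definitions of "of the same type" and "start number" can be
sketched in Python as follows.  This is an illustration only: the names
`marker_type` and `split_into_lists` are hypothetical, and the input is
assumed to be a sequence of list markers that have already been
recognized, with nothing but blank lines between the items.

```python
import re
from itertools import groupby

_MARK = re.compile(r"^([-+*])$|^(\d{1,9})([.)])$")


def marker_type(marker):
    """Bullet markers are typed by character, ordered ones by delimiter."""
    m = _MARK.match(marker)
    if m is None:
        raise ValueError("not a list marker: %r" % marker)
    bullet, _number, delimiter = m.groups()
    if bullet is not None:
        return ("bullet", bullet)
    return ("ordered", delimiter)


def split_into_lists(markers):
    """Group consecutive markers of the same type into separate lists."""
    lists = []
    for (kind, _detail), group in groupby(markers, key=marker_type):
        group = list(group)
        entry = {"type": kind, "markers": group}
        if kind == "ordered":
            # Only the first item's number matters; later ones are ignored.
            entry["start"] = int(group[0][:-1])
        lists.append(entry)
    return lists


for lst in split_into_lists(["1.", "2.", "3)"]):
    print(lst["type"], lst["start"], lst["markers"])
# ordered 1 ['1.', '2.']
# ordered 3 ['3)']
```
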
4792
4793 In CommonMark, a list can interrupt a paragraph. That is,
4794 no blank line is needed to separate a paragraph from a following
4795 list:
4796
4797 ```````````````````````````````` example
4798 Foo
4799 - bar
4800 - baz
4801 .
4802 <p>Foo</p>
4803 <ul>
4804 <li>bar</li>
4805 <li>baz</li>
4806 </ul>
4807 ````````````````````````````````
4808
4809 `Markdown.pl` does not allow this, through fear of triggering a list
4810 via a numeral in a hard-wrapped line:
4811
4812 ``` markdown
4813 The number of windows in my house is
4814 14.  The number of doors is 6.
4815 ```
4816
4817 Oddly, though, `Markdown.pl` *does* allow a blockquote to
4818 interrupt a paragraph, even though the same considerations might
4819 apply.
4820
4821 In CommonMark, we do allow lists to interrupt paragraphs, for
4822 two reasons.  First, it is natural and not uncommon for people
4823 to start lists without blank lines:
4824
4825 ``` markdown
4826 I need to buy
4827 - new shoes
4828 - a coat
4829 - a plane ticket
4830 ```
4831
4832 Second, we are attracted to a
4833
4834 > [principle of uniformity](@):
4835 > if a chunk of text has a certain
4836 > meaning, it will continue to have the same meaning when put into a
4837 > container block (such as a list item or blockquote).
4838
4839 (Indeed, the spec for [list items] and [block quotes] presupposes
4840 this principle.) This principle implies that if
4841
4842 ``` markdown
4843   * I need to buy
4844     - new shoes
4845     - a coat
4846     - a plane ticket
4847 ```
4848
4849 is a list item containing a paragraph followed by a nested sublist,
4850 as all Markdown implementations agree it is (though the paragraph
4851 may be rendered without `<p>` tags, since the list is "tight"),
4852 then
4853
4854 ``` markdown
4855 I need to buy
4856 - new shoes
4857 - a coat
4858 - a plane ticket
4859 ```
4860
4861 by itself should be a paragraph followed by a list.
4862
4863 Since it is well-established Markdown practice to allow lists to
4864 interrupt paragraphs inside list items, the [principle of
4865 uniformity] requires us to allow this outside list items as
4866 well.  ([reStructuredText](http://docutils.sourceforge.net/rst.html)
4867 takes a different approach, requiring blank lines before lists
4868 even inside other list items.)
4869
4870 In order to solve the problem of unwanted lists in paragraphs with
4871 hard-wrapped numerals, we allow only lists starting with `1` to
4872 interrupt paragraphs.  Thus,
4873
4874 ```````````````````````````````` example
4875 The number of windows in my house is
4876 14.  The number of doors is 6.
4877 .
4878 <p>The number of windows in my house is
4879 14.  The number of doors is 6.</p>
4880 ````````````````````````````````
4881
4882 We may still get an unintended result in cases like
4883
4884 ```````````````````````````````` example
4885 The number of windows in my house is
4886 1.  The number of doors is 6.
4887 .
4888 <p>The number of windows in my house is</p>
4889 <ol>
4890 <li>The number of doors is 6.</li>
4891 </ol>
4892 ````````````````````````````````
4893
4894 but this rule should prevent most spurious list captures.
4895
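Together with the earlier remark that an empty list item cannot
interrupt a paragraph, the rule can be sketched as a Python predicate.
This is a hedged illustration: the name `can_interrupt_paragraph` is
hypothetical, tabs are not handled, and the sketch does not check for
lines that are really thematic breaks or setext heading underlines.

```python
import re

# 0-3 spaces of indent, a bullet or an ordered marker, then either
# end of line or at least one space followed by the item's text.
_ITEM = re.compile(r"^ {0,3}(?:([-+*])|(\d{1,9})[.)])(?: +(.*))?$")


def can_interrupt_paragraph(line):
    """Return True if this line may start a list right after paragraph text."""
    m = _ITEM.match(line)
    if m is None:
        return False              # not a list item at all
    _bullet, number, rest = m.groups()
    if rest is None or rest.strip() == "":
        return False              # empty list items never interrupt
    if number is not None and int(number) != 1:
        return False              # ordered lists must start with 1
    return True


print(can_interrupt_paragraph("14.  The number of doors is 6."))  # False
print(can_interrupt_paragraph("1.  The number of doors is 6."))   # True
print(can_interrupt_paragraph("- bar"))                           # True
```
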
4896 There can be any number of blank lines between items:
4897
4898 ```````````````````````````````` example
4899 - foo
4900
4901 - bar
4902
4903
4904 - baz
4905 .
4906 <ul>
4907 <li>
4908 <p>foo</p>
4909 </li>
4910 <li>
4911 <p>bar</p>
4912 </li>
4913 <li>
4914 <p>baz</p>
4915 </li>
4916 </ul>
4917 ````````````````````````````````
4918
4919 ```````````````````````````````` example
4920 - foo
4921   - bar
4922     - baz
4923
4924
4925       bim
4926 .
4927 <ul>
4928 <li>foo
4929 <ul>
4930 <li>bar
4931 <ul>
4932 <li>
4933 <p>baz</p>
4934 <p>bim</p>
4935 </li>
4936 </ul>
4937 </li>
4938 </ul>
4939 </li>
4940 </ul>
4941 ````````````````````````````````
4942
4943
4944 To separate consecutive lists of the same type, or to separate a
4945 list from an indented code block that would otherwise be parsed
4946 as a subparagraph of the final list item, you can insert a blank HTML
4947 comment:
4948
4949 ```````````````````````````````` example
4950 - foo
4951 - bar
4952
4953 <!-- -->
4954
4955 - baz
4956 - bim
4957 .
4958 <ul>
4959 <li>foo</li>
4960 <li>bar</li>
4961 </ul>
4962 <!-- -->
4963 <ul>
4964 <li>baz</li>
4965 <li>bim</li>
4966 </ul>
4967 ````````````````````````````````
4968
4969
4970 ```````````````````````````````` example
4971 -   foo
4972
4973     notcode
4974
4975 -   foo
4976
4977 <!-- -->
4978
4979     code
4980 .
4981 <ul>
4982 <li>
4983 <p>foo</p>
4984 <p>notcode</p>
4985 </li>
4986 <li>
4987 <p>foo</p>
4988 </li>
4989 </ul>
4990 <!-- -->
4991 <pre><code>code
4992 </code></pre>
4993 ````````````````````````````````
4994
4995
4996 List items need not be indented to the same level.  The following
4997 list items will be treated as items at the same list level,
4998 since none is indented enough to belong to the previous list
4999 item:
5000
5001 ```````````````````````````````` example
5002 - a
5003  - b
5004   - c
5005    - d
5006     - e
5007    - f
5008   - g
5009  - h
5010 - i
5011 .
5012 <ul>
5013 <li>a</li>
5014 <li>b</li>
5015 <li>c</li>
5016 <li>d</li>
5017 <li>e</li>
5018 <li>f</li>
5019 <li>g</li>
5020 <li>h</li>
5021 <li>i</li>
5022 </ul>
5023 ````````````````````````````````
5024
5025
5026 ```````````````````````````````` example
5027 1. a
5028
5029   2. b
5030
5031     3. c
5032 .
5033 <ol>
5034 <li>
5035 <p>a</p>
5036 </li>
5037 <li>
5038 <p>b</p>
5039 </li>
5040 <li>
5041 <p>c</p>
5042 </li>
5043 </ol>
5044 ````````````````````````````````
5045
5046
5047 This is a loose list, because there is a blank line between
5048 two of the list items:
5049
5050 ```````````````````````````````` example
5051 - a
5052 - b
5053
5054 - c
5055 .
5056 <ul>
5057 <li>
5058 <p>a</p>
5059 </li>
5060 <li>
5061 <p>b</p>
5062 </li>
5063 <li>
5064 <p>c</p>
5065 </li>
5066 </ul>
5067 ````````````````````````````````
5068
5069
5070 So is this, with an empty second item:
5071
5072 ```````````````````````````````` example
5073 * a
5074 *
5075
5076 * c
5077 .
5078 <ul>
5079 <li>
5080 <p>a</p>
5081 </li>
5082 <li></li>
5083 <li>
5084 <p>c</p>
5085 </li>
5086 </ul>
5087 ````````````````````````````````
5088
5089
5090 These are loose lists, even though there is no blank line between the items,
5091 because one of the items directly contains two block-level elements
5092 with a blank line between them:
5093
5094 ```````````````````````````````` example
5095 - a
5096 - b
5097
5098   c
5099 - d
5100 .
5101 <ul>
5102 <li>
5103 <p>a</p>
5104 </li>
5105 <li>
5106 <p>b</p>
5107 <p>c</p>
5108 </li>
5109 <li>
5110 <p>d</p>
5111 </li>
5112 </ul>
5113 ````````````````````````````````
5114
5115
5116 ```````````````````````````````` example
5117 - a
5118 - b
5119
5120   [ref]: /url
5121 - d
5122 .
5123 <ul>
5124 <li>
5125 <p>a</p>
5126 </li>
5127 <li>
5128 <p>b</p>
5129 </li>
5130 <li>
5131 <p>d</p>
5132 </li>
5133 </ul>
5134 ````````````````````````````````
5135
5136
5137 This is a tight list, because the blank lines are in a code block:
5138
5139 ```````````````````````````````` example
5140 - a
5141 - ```
5142   b
5143
5144
5145   ```
5146 - c
5147 .
5148 <ul>
5149 <li>a</li>
5150 <li>
5151 <pre><code>b
5152
5153
5154 </code></pre>
5155 </li>
5156 <li>c</li>
5157 </ul>
5158 ````````````````````````````````
5159
5160
5161 This is a tight list, because the blank line is between two
5162 paragraphs of a sublist.  So the sublist is loose while
5163 the outer list is tight:
5164
5165 ```````````````````````````````` example
5166 - a
5167   - b
5168
5169     c
5170 - d
5171 .
5172 <ul>
5173 <li>a
5174 <ul>
5175 <li>
5176 <p>b</p>
5177 <p>c</p>
5178 </li>
5179 </ul>
5180 </li>
5181 <li>d</li>
5182 </ul>
5183 ````````````````````````````````
5184
5185
5186 This is a tight list, because the blank line is inside the
5187 block quote:
5188
5189 ```````````````````````````````` example
5190 * a
5191   > b
5192   >
5193 * c
5194 .
5195 <ul>
5196 <li>a
5197 <blockquote>
5198 <p>b</p>
5199 </blockquote>
5200 </li>
5201 <li>c</li>
5202 </ul>
5203 ````````````````````````````````
5204
5205
5206 This list is tight, because the consecutive block elements
5207 are not separated by blank lines:
5208
5209 ```````````````````````````````` example
5210 - a
5211   > b
5212   ```
5213   c
5214   ```
5215 - d
5216 .
5217 <ul>
5218 <li>a
5219 <blockquote>
5220 <p>b</p>
5221 </blockquote>
5222 <pre><code>c
5223 </code></pre>
5224 </li>
5225 <li>d</li>
5226 </ul>
5227 ````````````````````````````````
5228
5229
5230 A single-paragraph list is tight:
5231
5232 ```````````````````````````````` example
5233 - a
5234 .
5235 <ul>
5236 <li>a</li>
5237 </ul>
5238 ````````````````````````````````
5239
5240
5241 ```````````````````````````````` example
5242 - a
5243   - b
5244 .
5245 <ul>
5246 <li>a
5247 <ul>
5248 <li>b</li>
5249 </ul>
5250 </li>
5251 </ul>
5252 ````````````````````````````````
5253
5254
5255 This list is loose, because of the blank line between the
5256 two block elements in the list item:
5257
5258 ```````````````````````````````` example
5259 1. ```
5260    foo
5261    ```
5262
5263    bar
5264 .
5265 <ol>
5266 <li>
5267 <pre><code>foo
5268 </code></pre>
5269 <p>bar</p>
5270 </li>
5271 </ol>
5272 ````````````````````````````````
5273
5274
5275 Here the outer list is loose, the inner list tight:
5276
5277 ```````````````````````````````` example
5278 * foo
5279   * bar
5280
5281   baz
5282 .
5283 <ul>
5284 <li>
5285 <p>foo</p>
5286 <ul>
5287 <li>bar</li>
5288 </ul>
5289 <p>baz</p>
5290 </li>
5291 </ul>
5292 ````````````````````````````````
5293
5294
5295 ```````````````````````````````` example
5296 - a
5297   - b
5298   - c
5299
5300 - d
5301   - e
5302   - f
5303 .
5304 <ul>
5305 <li>
5306 <p>a</p>
5307 <ul>
5308 <li>b</li>
5309 <li>c</li>
5310 </ul>
5311 </li>
5312 <li>
5313 <p>d</p>
5314 <ul>
5315 <li>e</li>
5316 <li>f</li>
5317 </ul>
5318 </li>
5319 </ul>
5320 ````````````````````````````````
5321
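The definition of loose and tight lists used in the examples above can
be summarized in a short Python sketch.  The `ListItem` record and the
`is_loose` function are hypothetical; they assume a parser has already
noted where blank lines fall.  Blank lines inside a nested list, block
quote, or code block are not recorded on the outer item, which is why a
tight list can contain a loose sublist.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ListItem:
    """Hypothetical summary of one list item, as a parser might record it."""
    # True when at least one blank line separates this item from the
    # previous item of the same list.
    blank_before: bool = False
    # One flag per gap between consecutive *direct* child blocks:
    # True when a blank line sits in that gap.
    gaps: List[bool] = field(default_factory=list)


def is_loose(items: List[ListItem]) -> bool:
    """A list is loose if items, or an item's direct children, are
    separated by blank lines; otherwise it is tight."""
    items_separated = any(item.blank_before for item in items[1:])
    children_separated = any(any(item.gaps) for item in items)
    return items_separated or children_separated


# "- a" holding a sublist, then "- d": no blank line at the outer level.
outer = [ListItem(gaps=[False]), ListItem()]
# The sublist item "- b", a blank line, then the paragraph "c".
inner = [ListItem(gaps=[True])]
print(is_loose(outer), is_loose(inner))  # False True
```
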
5322
5323 # Inlines
5324
5325 Inlines are parsed sequentially from the beginning of the character
5326 stream to the end (left to right, in left-to-right languages).
5327 Thus, for example, in
5328
5329 ```````````````````````````````` example
5330 `hi`lo`
5331 .
5332 <p><code>hi</code>lo`</p>
5333 ````````````````````````````````
5334
5335
5336 `hi` is parsed as code, leaving the backtick at the end as a literal
5337 backtick.
5338
5339 ## Backslash escapes
5340
5341 Any ASCII punctuation character may be backslash-escaped:
5342
5343 ```````````````````````````````` example
5344 \!\"\#\$\%\&\'\(\)\*\+\,\-\.\/\:\;\<\=\>\?\@\[\\\]\^\_\`\{\|\}\~
5345 .
5346 <p>!&quot;#$%&amp;'()*+,-./:;&lt;=&gt;?@[\]^_`{|}~</p>
5347 ````````````````````````````````
5348
5349
5350 Backslashes before other characters are treated as literal
5351 backslashes:
5352
5353 ```````````````````````````````` example
5354 \→\A\a\ \3\φ\«
5355 .
5356 <p>\→\A\a\ \3\φ\«</p>
5357 ````````````````````````````````
5358
5359
5360 Escaped characters are treated as regular characters and do
5361 not have their usual Markdown meanings:
5362
5363 ```````````````````````````````` example
5364 \*not emphasized*
5365 \<br/> not a tag
5366 \[not a link](/foo)
5367 \`not code`
5368 1\. not a list
5369 \* not a list
5370 \# not a heading
5371 \[foo]: /url "not a reference"
5372 .
5373 <p>*not emphasized*
5374 &lt;br/&gt; not a tag
5375 [not a link](/foo)
5376 `not code`
5377 1. not a list
5378 * not a list
5379 # not a heading
5380 [foo]: /url &quot;not a reference&quot;</p>
5381 ````````````````````````````````
5382
5383
5384 If a backslash is itself escaped, the following character is not:
5385
5386 ```````````````````````````````` example
5387 \\*emphasis*
5388 .
5389 <p>\<em>emphasis</em></p>
5390 ````````````````````````````````
5391
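The escaping rule illustrated so far can be sketched in a few lines of Python. This is an illustration only, not part of the spec: the name `unescape_backslashes` is hypothetical, and the sketch deliberately ignores hard line breaks and the contexts in which backslash escapes are not processed.

```python
import re
import string

# string.punctuation holds exactly the 32 ASCII punctuation characters.
_ESCAPED = re.compile(r"\\([" + re.escape(string.punctuation) + r"])")


def unescape_backslashes(text: str) -> str:
    """Drop the backslash from each backslash + ASCII punctuation pair.

    Backslashes before any other character are left in place, so they
    remain literal backslashes.
    """
    return _ESCAPED.sub(r"\1", text)
```

Because the pattern is applied from left to right, an escaped backslash consumes both backslashes, and the character after it is left unescaped.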
5392
5393 A backslash at the end of the line is a [hard line break]:
5394
5395 ```````````````````````````````` example
5396 foo\
5397 bar
5398 .
5399 <p>foo<br />
5400 bar</p>
5401 ````````````````````````````````
5402
5403
5404 Backslash escapes do not work in code blocks, code spans, autolinks, or
5405 raw HTML:
5406
5407 ```````````````````````````````` example
5408 `` \[\` ``
5409 .
5410 <p><code>\[\`</code></p>
5411 ````````````````````````````````
5412
5413
5414 ```````````````````````````````` example
5415     \[\]
5416 .
5417 <pre><code>\[\]
5418 </code></pre>
5419 ````````````````````````````````
5420
5421
5422 ```````````````````````````````` example
5423 ~~~
5424 \[\]
5425 ~~~
5426 .
5427 <pre><code>\[\]
5428 </code></pre>
5429 ````````````````````````````````
5430
5431
5432 ```````````````````````````````` example
5433 <http://example.com?find=\*>
5434 .
5435 <p><a href="http://example.com?find=%5C*">http://example.com?find=\*</a></p>
5436 ````````````````````````````````
5437
5438
5439 ```````````````````````````````` example
5440 <a href="/bar\/)">
5441 .
5442 <a href="/bar\/)">
5443 ````````````````````````````````
5444
5445
5446 But they work in all other contexts, including URLs and link titles,
5447 link references, and [info strings] in [fenced code blocks]:
5448
5449 ```````````````````````````````` example
5450 [foo](/bar\* "ti\*tle")
5451 .
5452 <p><a href="/bar*" title="ti*tle">foo</a></p>
5453 ````````````````````````````````
5454
5455
5456 ```````````````````````````````` example
5457 [foo]
5458
5459 [foo]: /bar\* "ti\*tle"
5460 .
5461 <p><a href="/bar*" title="ti*tle">foo</a></p>
5462 ````````````````````````````````
5463
5464
5465 ```````````````````````````````` example
5466 ``` foo\+bar
5467 foo
5468 ```
5469 .
5470 <pre><code class="language-foo+bar">foo
5471 </code></pre>
5472 ````````````````````````````````
5473
5474
5475
5476 ## Entity and numeric character references
5477
5478 All valid HTML entity references and numeric character
5479 references, except those occurring in code blocks and code spans,
5480 are recognized as such and treated as equivalent to the
5481 corresponding Unicode characters.  Conforming CommonMark parsers
5482 need not store information about whether a particular character
5483 was represented in the source using a Unicode character or
5484 an entity reference.
5485
5486 [Entity references](@) consist of `&` + any of the valid
5487 HTML5 entity names + `;`. The
5488 document <https://html.spec.whatwg.org/multipage/entities.json>
5489 is used as an authoritative source for the valid entity
5490 references and their corresponding code points.
5491
5492 ```````````````````````````````` example
5493 &nbsp; &amp; &copy; &AElig; &Dcaron;
5494 &frac34; &HilbertSpace; &DifferentialD;
5495 &ClockwiseContourIntegral; &ngE;
5496 .
5497 <p>  &amp; © Æ Ď
5498 ¾ ℋ ⅆ
5499 ∲ ≧̸</p>
5500 ````````````````````````````````
5501
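As a rough sketch, named entity references could be resolved as follows. It assumes that Python's `html.entities.html5` table, which is understood to follow the same HTML5 list, is an acceptable stand-in for `entities.json`; the function name `replace_entity_references` is hypothetical.

```python
import re
from html.entities import html5

# A candidate reference: "&", a name made of letters and digits, then ";".
_NAMED = re.compile(r"&([A-Za-z][A-Za-z0-9]*;)")


def replace_entity_references(text: str) -> str:
    """Replace known named references; leave everything else untouched."""
    # html5 is keyed by names such as "copy;" (the semicolon is part of the
    # key), so a name without a trailing semicolon is never looked up here.
    return _NAMED.sub(lambda m: html5.get(m.group(1), m.group(0)), text)
```

Unknown names fall through unchanged, so a renderer would still have to escape the remaining `&` when producing HTML.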
5502
5503 [Decimal numeric character
5504 references](@)
5505 consist of `&#` + a string of 1--8 arabic digits + `;`. A
5506 numeric character reference is parsed as the corresponding
5507 Unicode character. Invalid Unicode code points will be replaced by
5508 the REPLACEMENT CHARACTER (`U+FFFD`).  For security reasons,
5509 the code point `U+0000` will also be replaced by `U+FFFD`.
5510
5511 ```````````````````````````````` example
5512 &#35; &#1234; &#992; &#98765432; &#0;
5513 .
5514 <p># Ӓ Ϡ � �</p>
5515 ````````````````````````````````
5516
5517
5518 [Hexadecimal numeric character
5519 references](@) consist of `&#` +
5520 either `X` or `x` + a string of 1--8 hexadecimal digits + `;`.
5521 They too are parsed as the corresponding Unicode character (this
5522 time specified with a hexadecimal numeral instead of decimal).
5523
5524 ```````````````````````````````` example
5525 &#X22; &#XD06; &#xcab;
5526 .
5527 <p>&quot; ആ ಫ</p>
5528 ````````````````````````````````
5529
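The decimal and hexadecimal forms can be handled by one small routine. The following is a minimal sketch under stated assumptions: the name `replace_numeric_references` is hypothetical, and treating surrogate code points as invalid is an assumption of this sketch rather than something the text above spells out.

```python
import re

# "&#" + 1-8 decimal digits + ";"  or  "&#" + X/x + 1-8 hex digits + ";"
_NUMERIC = re.compile(r"&#(?:([0-9]{1,8})|[Xx]([0-9A-Fa-f]{1,8}));")

REPLACEMENT_CHARACTER = "\ufffd"


def replace_numeric_references(text: str) -> str:
    """Replace numeric character references with their characters."""

    def substitute(match: re.Match) -> str:
        if match.group(1) is not None:
            code = int(match.group(1), 10)
        else:
            code = int(match.group(2), 16)
        # U+0000, values beyond U+10FFFF, and (assumed here) surrogates
        # are all mapped to U+FFFD.
        if code == 0 or code > 0x10FFFF or 0xD800 <= code <= 0xDFFF:
            return REPLACEMENT_CHARACTER
        return chr(code)

    return _NUMERIC.sub(substitute, text)
```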
5530
5531 Here are some nonentities:
5532
5533 ```````````````````````````````` example
5534 &nbsp &x; &#; &#x;
5535 &ThisIsNotDefined; &hi?;
5536 .
5537 <p>&amp;nbsp &amp;x; &amp;#; &amp;#x;
5538 &amp;ThisIsNotDefined; &amp;hi?;</p>
5539 ````````````````````````````````
5540
5541
5542 Although HTML5 does accept some entity references
5543 without a trailing semicolon (such as `&copy`), these are not
5544 recognized here, because doing so would make the grammar too ambiguous:
5545
5546 ```````````````````````````````` example
5547 &copy
5548 .
5549 <p>&amp;copy</p>
5550 ````````````````````````````````
5551
5552
5553 Strings that are not on the list of HTML5 named entities are not
5554 recognized as entity references either:
5555
5556 ```````````````````````````````` example
5557 &MadeUpEntity;
5558 .
5559 <p>&amp;MadeUpEntity;</p>
5560 ````````````````````````````````
5561
5562
5563 Entity and numeric character references are recognized in any
5564 context besides code spans or code blocks, including
5565 URLs, [link titles], and [fenced code block][] [info strings]:
5566
5567 ```````````````````````````````` example
5568 <a href="&ouml;&ouml;.html">
5569 .
5570 <a href="&ouml;&ouml;.html">
5571 ````````````````````````````````
5572
5573
5574 ```````````````````````````````` example
5575 [foo](/f&ouml;&ouml; "f&ouml;&ouml;")
5576 .
5577 <p><a href="/f%C3%B6%C3%B6" title="föö">foo</a></p>
5578 ````````````````````````````````
5579
5580
5581 ```````````````````````````````` example
5582 [foo]
5583
5584 [foo]: /f&ouml;&ouml; "f&ouml;&ouml;"
5585 .
5586 <p><a href="/f%C3%B6%C3%B6" title="föö">foo</a></p>
5587 ````````````````````````````````
5588
5589
5590 ```````````````````````````````` example
5591 ``` f&ouml;&ouml;
5592 foo
5593 ```
5594 .
5595 <pre><code class="language-föö">foo
5596 </code></pre>
5597 ````````````````````````````````
5598
5599
5600 Entity and numeric character references are treated as literal
5601 text in code spans and code blocks:
5602
5603 ```````````````````````````````` example
5604 `f&ouml;&ouml;`
5605 .
5606 <p><code>f&amp;ouml;&amp;ouml;</code></p>
5607 ````````````````````````````````
5608
5609
5610 ```````````````````````````````` example
5611     f&ouml;f&ouml;
5612 .
5613 <pre><code>f&amp;ouml;f&amp;ouml;
5614 </code></pre>
5615 ````````````````````````````````
5616
5617
5618 ## Code spans
5619
5620 A [backtick string](@)
5621 is a string of one or more backtick characters (`` ` ``) that is neither
5622 preceded nor followed by a backtick.
5623
5624 A [code span](@) begins with a backtick string and ends with
5625 a backtick string of equal length.  The contents of the code span are
5626 the characters between the two backtick strings, with leading and
5627 trailing spaces and [line endings] removed, and
5628 [whitespace] collapsed to single spaces.
5629
5630 This is a simple code span:
5631
5632 ```````````````````````````````` example
5633 `foo`
5634 .
5635 <p><code>foo</code></p>
5636 ````````````````````````````````
5637
5638
5639 Here two backticks are used, because the code contains a backtick.
5640 This example also illustrates stripping of leading and trailing spaces:
5641
5642 ```````````````````````````````` example
5643 `` foo ` bar  ``
5644 .
5645 <p><code>foo ` bar</code></p>
5646 ````````````````````````````````
5647
5648
5649 This example shows the motivation for stripping leading and trailing
5650 spaces:
5651
5652 ```````````````````````````````` example
5653 ` `` `
5654 .
5655 <p><code>``</code></p>
5656 ````````````````````````````````
5657
5658
5659 [Line endings] are treated like spaces:
5660
5661 ```````````````````````````````` example
5662 ``
5663 foo
5664 ``
5665 .
5666 <p><code>foo</code></p>
5667 ````````````````````````````````
5668
5669
5670 Interior spaces and [line endings] are collapsed into
5671 single spaces, just as they would be by a browser:
5672
5673 ```````````````````````````````` example
5674 `foo   bar
5675   baz`
5676 .
5677 <p><code>foo bar baz</code></p>
5678 ````````````````````````````````
5679
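The normalization of code span contents shown in the examples above might be sketched like this. It assumes the spec's definition of [whitespace] (space, tab, newline, line tabulation, form feed, carriage return), which is given elsewhere in the document; the name `normalize_code_span` is hypothetical.

```python
import re

# Runs of whitespace characters as defined by the spec (an assumption of
# this sketch); other Unicode whitespace such as U+00A0 is not included.
_WHITESPACE_RUN = re.compile(r"[ \t\n\x0b\x0c\r]+")


def normalize_code_span(raw: str) -> str:
    """Normalize the text found between two matching backtick strings."""
    # Remove leading and trailing spaces and line endings first...
    stripped = raw.strip(" \n\r")
    # ...then collapse each interior whitespace run to a single space.
    return _WHITESPACE_RUN.sub(" ", stripped)
```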
5680
5681 Not all [Unicode whitespace] (for instance, non-breaking space) is
5682 collapsed, however:
5683
5684 ```````````````````````````````` example
5685 `a  b`
5686 .
5687 <p><code>a  b</code></p>
5688 ````````````````````````````````
5689
5690
5691 Q: Why not just leave the spaces, since browsers will collapse them
5692 anyway?  A:  Because we might be targeting a non-HTML format, and we
5693 shouldn't rely on HTML-specific rendering assumptions.
5694
5695 (Existing implementations differ in their treatment of internal
5696 spaces and [line endings].  Some, including `Markdown.pl` and
5697 `showdown`, convert an internal [line ending] into a
5698 `<br />` tag.  But this makes things difficult for those who like to
5699 hard-wrap their paragraphs, since a line break in the midst of a code
5700 span will cause an unintended line break in the output.  Others just
5701 leave internal spaces as they are, which is fine if only HTML is being
5702 targeted.)
5703
5704 ```````````````````````````````` example
5705 `foo `` bar`
5706 .
5707 <p><code>foo `` bar</code></p>
5708 ````````````````````````````````
5709
5710
5711 Note that backslash escapes do not work in code spans. All backslashes
5712 are treated literally:
5713
5714 ```````````````````````````````` example
5715 `foo\`bar`
5716 .
5717 <p><code>foo\</code>bar`</p>
5718 ````````````````````````````````
5719
5720
5721 Backslash escapes are never needed, because one can always choose a
5722 string of *n* backtick characters as delimiters, where the code does
5723 not contain any strings of exactly *n* backtick characters.
5724
5725 Code span backticks have higher precedence than any other inline
5726 constructs except HTML tags and autolinks.  Thus, for example, this is
5727 not parsed as emphasized text, since the second `*` is part of a code
5728 span:
5729
5730 ```````````````````````````````` example
5731 *foo`*`
5732 .
5733 <p>*foo<code>*</code></p>
5734 ````````````````````````````````
5735
5736
5737 And this is not parsed as a link:
5738
5739 ```````````````````````````````` example
5740 [not a `link](/foo`)
5741 .
5742 <p>[not a <code>link](/foo</code>)</p>
5743 ````````````````````````````````
5744
5745
5746 Code spans, HTML tags, and autolinks have the same precedence.
5747 Thus, this is code:
5748
5749 ```````````````````````````````` example
5750 `<a href="`">`
5751 .
5752 <p><code>&lt;a href=&quot;</code>&quot;&gt;`</p>
5753 ````````````````````````````````
5754
5755
5756 But this is an HTML tag:
5757
5758 ```````````````````````````````` example
5759 <a href="`">`
5760 .
5761 <p><a href="`">`</p>
5762 ````````````````````````````````
5763
5764
5765 And this is code:
5766
5767 ```````````````````````````````` example
5768 `<http://foo.bar.`baz>`
5769 .
5770 <p><code>&lt;http://foo.bar.</code>baz&gt;`</p>
5771 ````````````````````````````````
5772
5773
5774 But this is an autolink:
5775
5776 ```````````````````````````````` example
5777 <http://foo.bar.`baz>`
5778 .
5779 <p><a href="http://foo.bar.%60baz">http://foo.bar.`baz</a>`</p>
5780 ````````````````````````````````
5781
5782
5783 When a backtick string is not closed by a matching backtick string,
5784 we just have literal backticks:
5785
5786 ```````````````````````````````` example
5787 ```foo``
5788 .
5789 <p>```foo``</p>
5790 ````````````````````````````````
5791
5792
5793 ```````````````````````````````` example
5794 `foo
5795 .
5796 <p>`foo</p>
5797 ````````````````````````````````
5798
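The matching of backtick strings can be illustrated with a short sketch. It only pairs backtick strings of equal length and ignores the interaction with HTML tags and autolinks described above; the name `first_code_span_content` is hypothetical.

```python
import re

# A greedy match yields maximal runs, so every match is a backtick string:
# it is neither preceded nor followed by another backtick.
_BACKTICK_STRING = re.compile(r"`+")


def first_code_span_content(text: str):
    """Return the raw contents of the first code span, or None."""
    runs = [(m.start(), m.end()) for m in _BACKTICK_STRING.finditer(text)]
    for i, (open_start, open_end) in enumerate(runs):
        for close_start, close_end in runs[i + 1:]:
            # The closer must have the same length as the opener.
            if close_end - close_start == open_end - open_start:
                return text[open_end:close_start]
    # No backtick string found a partner: all backticks stay literal.
    return None
```

The returned text is the raw content; stripping and whitespace collapsing would be applied afterwards.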
5799
5800 ## Emphasis and strong emphasis
5801
5802 John Gruber's original [Markdown syntax
5803 description](http://daringfireball.net/projects/markdown/syntax#em) says:
5804
5805 > Markdown treats asterisks (`*`) and underscores (`_`) as indicators of
5806 > emphasis. Text wrapped with one `*` or `_` will be wrapped with an HTML
5807 > `<em>` tag; double `*`'s or `_`'s will be wrapped with an HTML `<strong>`
5808 > tag.
5809
5810 This is enough for most users, but these rules leave much undecided,
5811 especially when it comes to nested emphasis.  The original
5812 `Markdown.pl` test suite makes it clear that triple `***` and
5813 `___` delimiters can be used for strong emphasis, and most
5814 implementations have also allowed the following patterns:
5815
5816 ``` markdown
5817 ***strong emph***
5818 ***strong** in emph*
5819 ***emph* in strong**
5820 **in strong *emph***
5821 *in emph **strong***
5822 ```
5823
5824 The following patterns are less widely supported, but the intent
5825 is clear and they are useful (especially in contexts like bibliography
5826 entries):
5827
5828 ``` markdown
5829 *emph *with emph* in it*
5830 **strong **with strong** in it**
5831 ```
5832
5833 Many implementations have also restricted intraword emphasis to
5834 the `*` forms, to avoid unwanted emphasis in words containing
5835 internal underscores.  (It is best practice to put these in code
5836 spans, but users often do not.)
5837
5838 ``` markdown
5839 internal emphasis: foo*bar*baz
5840 no emphasis: foo_bar_baz
5841 ```
5842
5843 The rules given below capture all of these patterns, while allowing
5844 for efficient parsing strategies that do not backtrack.
5845
5846 First, some definitions.  A [delimiter run](@) is either
5847 a sequence of one or more `*` characters that is not preceded or
5848 followed by a `*` character, or a sequence of one or more `_`
5849 characters that is not preceded or followed by a `_` character.
5850
5851 A [left-flanking delimiter run](@) is
5852 a [delimiter run] that is (a) not followed by [Unicode whitespace],
5853 and (b) either not followed by a [punctuation character], or
5854 preceded by [Unicode whitespace] or a [punctuation character].
5855 For purposes of this definition, the beginning and the end of
5856 the line count as Unicode whitespace.
5857
5858 A [right-flanking delimiter run](@) is
5859 a [delimiter run] that is (a) not preceded by [Unicode whitespace],
5860 and (b) either not preceded by a [punctuation character], or
5861 followed by [Unicode whitespace] or a [punctuation character].
5862 For purposes of this definition, the beginning and the end of
5863 the line count as Unicode whitespace.
5864
5865 Here are some examples of delimiter runs.
5866
5867   - left-flanking but not right-flanking:
5868
5869     ```
5870     ***abc
5871       _abc
5872     **"abc"
5873      _"abc"
5874     ```
5875
5876   - right-flanking but not left-flanking:
5877
5878     ```
5879      abc***
5880      abc_
5881     "abc"**
5882     "abc"_
5883     ```
5884
5885   - both left- and right-flanking:
5886
5887     ```
5888      abc***def
5889     "abc"_"def"
5890     ```
5891
5892   - neither left- nor right-flanking:
5893
5894     ```
5895     abc *** def
5896     a _ b
5897     ```
5898
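The two definitions translate fairly directly into code. The sketch below is illustrative only: the helper names are hypothetical, and the tests for [Unicode whitespace] and [punctuation character] are approximations based on the definitions given elsewhere in this spec (Unicode class `Zs` plus tab, newline, form feed, and carriage return; ASCII punctuation plus the Unicode `P*` classes).

```python
import string
import unicodedata


def is_unicode_whitespace(ch: str) -> bool:
    # The empty string stands for the beginning or end of the line,
    # which counts as Unicode whitespace for this purpose.
    return ch == "" or ch in "\t\n\f\r" or unicodedata.category(ch) == "Zs"


def is_punctuation(ch: str) -> bool:
    return ch != "" and (
        ch in string.punctuation or unicodedata.category(ch).startswith("P")
    )


def flanking(text: str, start: int, end: int) -> tuple:
    """Classify the delimiter run text[start:end].

    Returns a pair (left_flanking, right_flanking).
    """
    before = text[start - 1] if start > 0 else ""
    after = text[end] if end < len(text) else ""
    left = not is_unicode_whitespace(after) and (
        not is_punctuation(after)
        or is_unicode_whitespace(before)
        or is_punctuation(before)
    )
    right = not is_unicode_whitespace(before) and (
        not is_punctuation(before)
        or is_unicode_whitespace(after)
        or is_punctuation(after)
    )
    return left, right
```

For instance, `flanking("abc***def", 3, 6)` classifies the run in the third group of examples above as both left- and right-flanking.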
5899 (The idea of distinguishing left-flanking and right-flanking
5900 delimiter runs based on the character before and the character
5901 after comes from Roopesh Chander's
5902 [vfmd](http://www.vfmd.org/vfmd-spec/specification/#procedure-for-identifying-emphasis-tags).
5903 vfmd uses the terminology "emphasis indicator string" instead of "delimiter
5904 run," and its rules for distinguishing left- and right-flanking runs
5905 are a bit more complex than the ones given here.)
5906
5907 The following rules define emphasis and strong emphasis:
5908
5909 1.  A single `*` character [can open emphasis](@)
5910     iff (if and only if) it is part of a [left-flanking delimiter run].
5911
5912 2.  A single `_` character [can open emphasis] iff
5913     it is part of a [left-flanking delimiter run]
5914     and either (a) not part of a [right-flanking delimiter run]
5915     or (b) part of a [right-flanking delimiter run]
5916     preceded by punctuation.
5917
5918 3.  A single `*` character [can close emphasis](@)
5919     iff it is part of a [right-flanking delimiter run].
5920
5921 4.  A single `_` character [can close emphasis] iff
5922     it is part of a [right-flanking delimiter run]
5923     and either (a) not part of a [left-flanking delimiter run]
5924     or (b) part of a [left-flanking delimiter run]
5925     followed by punctuation.
5926
5927 5.  A double `**` [can open strong emphasis](@)
5928     iff it is part of a [left-flanking delimiter run].
5929
5930 6.  A double `__` [can open strong emphasis] iff
5931     it is part of a [left-flanking delimiter run]
5932     and either (a) not part of a [right-flanking delimiter run]
5933     or (b) part of a [right-flanking delimiter run]
5934     preceded by punctuation.
5935
5936 7.  A double `**` [can close strong emphasis](@)
5937     iff it is part of a [right-flanking delimiter run].
5938
5939 8.  A double `__` [can close strong emphasis] iff
5940     it is part of a [right-flanking delimiter run]
5941     and either (a) not part of a [left-flanking delimiter run]
5942     or (b) part of a [left-flanking delimiter run]
5943     followed by punctuation.
5944
5945 9.  Emphasis begins with a delimiter that [can open emphasis] and ends
5946     with a delimiter that [can close emphasis], and that uses the same
5947     character (`_` or `*`) as the opening delimiter.  The
5948     opening and closing delimiters must belong to separate
5949     [delimiter runs].  If one of the delimiters can both
5950     open and close emphasis, then the sum of the lengths of the
5951     delimiter runs containing the opening and closing delimiters
5952     must not be a multiple of 3.
5953
5954 10. Strong emphasis begins with a delimiter that
5955     [can open strong emphasis] and ends with a delimiter that
5956     [can close strong emphasis], and that uses the same character
5957     (`_` or `*`) as the opening delimiter.  The
5958     opening and closing delimiters must belong to separate
5959     [delimiter runs].  If one of the delimiters can both open
5960     and close strong emphasis, then the sum of the lengths of
5961     the delimiter runs containing the opening and closing
5962     delimiters must not be a multiple of 3.
5963
5964 11. A literal `*` character cannot occur at the beginning or end of
5965     `*`-delimited emphasis or `**`-delimited strong emphasis, unless it
5966     is backslash-escaped.
5967
5968 12. A literal `_` character cannot occur at the beginning or end of
5969     `_`-delimited emphasis or `__`-delimited strong emphasis, unless it
5970     is backslash-escaped.
5971
5972 Where rules 1--12 above are compatible with multiple parsings,
5973 the following principles resolve ambiguity:
5974
5975 13. The number of nestings should be minimized. Thus, for example,
5976     an interpretation `<strong>...</strong>` is always preferred to
5977     `<em><em>...</em></em>`.
5978
5979 14. An interpretation `<strong><em>...</em></strong>` is always
5980     preferred to `<em><strong>...</strong></em>`.
5981
5982 15. When two potential emphasis or strong emphasis spans overlap,
5983     so that the second begins before the first ends and ends after
5984     the first ends, the first takes precedence. Thus, for example,
5985     `*foo _bar* baz_` is parsed as `<em>foo _bar</em> baz_` rather
5986     than `*foo <em>bar* baz</em>`.
5987
5988 16. When there are two potential emphasis or strong emphasis spans
5989     with the same closing delimiter, the shorter one (the one that
5990     opens later) takes precedence. Thus, for example,
5991     `**foo **bar baz**` is parsed as `**foo <strong>bar baz</strong>`
5992     rather than `<strong>foo **bar baz</strong>`.
5993
5994 17. Inline code spans, links, images, and HTML tags group more tightly
5995     than emphasis.  So, when there is a choice between an interpretation
5996     that contains one of these elements and one that does not, the
5997     former always wins.  Thus, for example, `*[foo*](bar)` is
5998     parsed as `*<a href="bar">foo*</a>` rather than as
5999     `<em>[foo</em>](bar)`.
6000
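Rules 1--8 reduce to two small predicates once the flanking properties of a delimiter run are known. The following sketch takes those properties as plain booleans; the function and parameter names are hypothetical, and since the same conditions apply to single and double delimiters, one pair of functions is assumed to cover both emphasis and strong emphasis.

```python
def can_open(char: str, left_flanking: bool, right_flanking: bool,
             preceded_by_punctuation: bool) -> bool:
    """Rules 1, 2, 5, and 6: may this delimiter open (strong) emphasis?"""
    if char == "*":
        return left_flanking
    # For "_": left-flanking, and either not right-flanking or
    # right-flanking but preceded by punctuation.
    return left_flanking and (not right_flanking or preceded_by_punctuation)


def can_close(char: str, left_flanking: bool, right_flanking: bool,
              followed_by_punctuation: bool) -> bool:
    """Rules 3, 4, 7, and 8: may this delimiter close (strong) emphasis?"""
    if char == "*":
        return right_flanking
    # For "_": right-flanking, and either not left-flanking or
    # left-flanking but followed by punctuation.
    return right_flanking and (not left_flanking or followed_by_punctuation)
```

The extra conditions for `_` are what rule out intraword emphasis with underscores while still permitting it with `*`.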
6001 These rules can be illustrated through a series of examples.
6002
6003 Rule 1:
6004
6005 ```````````````````````````````` example
6006 *foo bar*
6007 .
6008 <p><em>foo bar</em></p>
6009 ````````````````````````````````
6010
6011
6012 This is not emphasis, because the opening `*` is followed by
6013 whitespace, and hence not part of a [left-flanking delimiter run]:
6014
6015 ```````````````````````````````` example
6016 a * foo bar*
6017 .
6018 <p>a * foo bar*</p>
6019 ````````````````````````````````
6020
6021
6022 This is not emphasis, because the opening `*` is preceded
6023 by an alphanumeric and followed by punctuation, and hence
6024 not part of a [left-flanking delimiter run]:
6025
6026 ```````````````````````````````` example
6027 a*"foo"*
6028 .
6029 <p>a*&quot;foo&quot;*</p>
6030 ````````````````````````````````
6031
6032
6033 Unicode nonbreaking spaces count as whitespace, too:
6034
6035 ```````````````````````````````` example
6036 * a *
6037 .
6038 <p>* a *</p>
6039 ````````````````````````````````
6040
6041
6042 Intraword emphasis with `*` is permitted:
6043
6044 ```````````````````````````````` example
6045 foo*bar*
6046 .
6047 <p>foo<em>bar</em></p>
6048 ````````````````````````````````
6049
6050
6051 ```````````````````````````````` example
6052 5*6*78
6053 .
6054 <p>5<em>6</em>78</p>
6055 ````````````````````````````````
6056
6057
6058 Rule 2:
6059
6060 ```````````````````````````````` example
6061 _foo bar_
6062 .
6063 <p><em>foo bar</em></p>
6064 ````````````````````````````````
6065
6066
6067 This is not emphasis, because the opening `_` is followed by
6068 whitespace:
6069
6070 ```````````````````````````````` example
6071 _ foo bar_
6072 .
6073 <p>_ foo bar_</p>
6074 ````````````````````````````````
6075
6076
6077 This is not emphasis, because the opening `_` is preceded
6078 by an alphanumeric and followed by punctuation:
6079
6080 ```````````````````````````````` example
6081 a_"foo"_
6082 .
6083 <p>a_&quot;foo&quot;_</p>
6084 ````````````````````````````````
6085
6086
6087 Emphasis with `_` is not allowed inside words:
6088
6089 ```````````````````````````````` example
6090 foo_bar_
6091 .
6092 <p>foo_bar_</p>
6093 ````````````````````````````````
6094
6095
6096 ```````````````````````````````` example
6097 5_6_78
6098 .
6099 <p>5_6_78</p>
6100 ````````````````````````````````
6101
6102
6103 ```````````````````````````````` example
6104 пристаням_стремятся_
6105 .
6106 <p>пристаням_стремятся_</p>
6107 ````````````````````````````````
6108
6109
6110 Here `_` does not generate emphasis, because the first delimiter run
6111 is right-flanking and the second left-flanking:
6112
6113 ```````````````````````````````` example
6114 aa_"bb"_cc
6115 .
6116 <p>aa_&quot;bb&quot;_cc</p>
6117 ````````````````````````````````
6118
6119
6120 This is emphasis, even though the opening delimiter is
6121 both left- and right-flanking, because it is preceded by
6122 punctuation:
6123
6124 ```````````````````````````````` example
6125 foo-_(bar)_
6126 .
6127 <p>foo-<em>(bar)</em></p>
6128 ````````````````````````````````
6129
6130
6131 Rule 3:
6132
6133 This is not emphasis, because the closing delimiter does
6134 not match the opening delimiter:
6135
6136 ```````````````````````````````` example
6137 _foo*
6138 .
6139 <p>_foo*</p>
6140 ````````````````````````````````
6141
6142
6143 This is not emphasis, because the closing `*` is preceded by
6144 whitespace:
6145
6146 ```````````````````````````````` example
6147 *foo bar *
6148 .
6149 <p>*foo bar *</p>
6150 ````````````````````````````````
6151
6152
6153 A newline also counts as whitespace:
6154
6155 ```````````````````````````````` example
6156 *foo bar
6157 *
6158 .
6159 <p>*foo bar
6160 *</p>
6161 ````````````````````````````````
6162
6163
6164 This is not emphasis, because the second `*` is
6165 preceded by punctuation and followed by an alphanumeric
6166 (hence it is not part of a [right-flanking delimiter run]):
6167
6168 ```````````````````````````````` example
6169 *(*foo)
6170 .
6171 <p>*(*foo)</p>
6172 ````````````````````````````````
6173
6174
6175 The point of this restriction is more easily appreciated
6176 with this example:
6177
6178 ```````````````````````````````` example
6179 *(*foo*)*
6180 .
6181 <p><em>(<em>foo</em>)</em></p>
6182 ````````````````````````````````
6183
6184
6185 Intraword emphasis with `*` is allowed:
6186
6187 ```````````````````````````````` example
6188 *foo*bar
6189 .
6190 <p><em>foo</em>bar</p>
6191 ````````````````````````````````
6192
6193
6194
6195 Rule 4:
6196
6197 This is not emphasis, because the closing `_` is preceded by
6198 whitespace:
6199
6200 ```````````````````````````````` example
6201 _foo bar _
6202 .
6203 <p>_foo bar _</p>
6204 ````````````````````````````````
6205
6206
6207 This is not emphasis, because the second `_` is
6208 preceded by punctuation and followed by an alphanumeric:
6209
6210 ```````````````````````````````` example
6211 _(_foo)
6212 .
6213 <p>_(_foo)</p>
6214 ````````````````````````````````
6215
6216
6217 This is emphasis within emphasis:
6218
6219 ```````````````````````````````` example
6220 _(_foo_)_
6221 .
6222 <p><em>(<em>foo</em>)</em></p>
6223 ````````````````````````````````
6224
6225
6226 Intraword emphasis is disallowed for `_`:
6227
6228 ```````````````````````````````` example
6229 _foo_bar
6230 .
6231 <p>_foo_bar</p>
6232 ````````````````````````````````
6233
6234
6235 ```````````````````````````````` example
6236 _пристаням_стремятся
6237 .
6238 <p>_пристаням_стремятся</p>
6239 ````````````````````````````````
6240
6241
6242 ```````````````````````````````` example
6243 _foo_bar_baz_
6244 .
6245 <p><em>foo_bar_baz</em></p>
6246 ````````````````````````````````
6247
6248
6249 This is emphasis, even though the closing delimiter is
6250 both left- and right-flanking, because it is followed by
6251 punctuation:
6252
6253 ```````````````````````````````` example
6254 _(bar)_.
6255 .
6256 <p><em>(bar)</em>.</p>
6257 ````````````````````````````````
6258
6259
6260 Rule 5:
6261
6262 ```````````````````````````````` example
6263 **foo bar**
6264 .
6265 <p><strong>foo bar</strong></p>
6266 ````````````````````````````````
6267
6268
6269 This is not strong emphasis, because the opening delimiter is
6270 followed by whitespace:
6271
6272 ```````````````````````````````` example
6273 ** foo bar**
6274 .
6275 <p>** foo bar**</p>
6276 ````````````````````````````````
6277
6278
6279 This is not strong emphasis, because the opening `**` is preceded
6280 by an alphanumeric and followed by punctuation, and hence
6281 not part of a [left-flanking delimiter run]:
6282
6283 ```````````````````````````````` example
6284 a**"foo"**
6285 .
6286 <p>a**&quot;foo&quot;**</p>
6287 ````````````````````````````````
6288
6289
6290 Intraword strong emphasis with `**` is permitted:
6291
6292 ```````````````````````````````` example
6293 foo**bar**
6294 .
6295 <p>foo<strong>bar</strong></p>
6296 ````````````````````````````````
6297
6298
6299 Rule 6:
6300
6301 ```````````````````````````````` example
6302 __foo bar__
6303 .
6304 <p><strong>foo bar</strong></p>
6305 ````````````````````````````````
6306
6307
6308 This is not strong emphasis, because the opening delimiter is
6309 followed by whitespace:
6310
6311 ```````````````````````````````` example
6312 __ foo bar__
6313 .
6314 <p>__ foo bar__</p>
6315 ````````````````````````````````
6316
6317
6318 A newline counts as whitespace:
6319 ```````````````````````````````` example
6320 __
6321 foo bar__
6322 .
6323 <p>__
6324 foo bar__</p>
6325 ````````````````````````````````
6326
6327
6328 This is not strong emphasis, because the opening `__` is preceded
6329 by an alphanumeric and followed by punctuation:
6330
6331 ```````````````````````````````` example
6332 a__"foo"__
6333 .
6334 <p>a__&quot;foo&quot;__</p>
6335 ````````````````````````````````
6336
6337
6338 Intraword strong emphasis is forbidden with `__`:
6339
6340 ```````````````````````````````` example
6341 foo__bar__
6342 .
6343 <p>foo__bar__</p>
6344 ````````````````````````````````
6345
6346
6347 ```````````````````````````````` example
6348 5__6__78
6349 .
6350 <p>5__6__78</p>
6351 ````````````````````````````````
6352
6353
6354 ```````````````````````````````` example
6355 пристаням__стремятся__
6356 .
6357 <p>пристаням__стремятся__</p>
6358 ````````````````````````````````
6359
6360
6361 ```````````````````````````````` example
6362 __foo, __bar__, baz__
6363 .
6364 <p><strong>foo, <strong>bar</strong>, baz</strong></p>
6365 ````````````````````````````````
6366
6367
6368 This is strong emphasis, even though the opening delimiter is
6369 both left- and right-flanking, because it is preceded by
6370 punctuation:
6371
6372 ```````````````````````````````` example
6373 foo-__(bar)__
6374 .
6375 <p>foo-<strong>(bar)</strong></p>
6376 ````````````````````````````````
6377
6378
6379
6380 Rule 7:
6381
6382 This is not strong emphasis, because the closing delimiter is preceded
6383 by whitespace:
6384
6385 ```````````````````````````````` example
6386 **foo bar **
6387 .
6388 <p>**foo bar **</p>
6389 ````````````````````````````````
6390
6391
6392 (Nor can it be interpreted as an emphasized `*foo bar *`, because of
6393 Rule 11.)
6394
6395 This is not strong emphasis, because the second `**` is
6396 preceded by punctuation and followed by an alphanumeric:
6397
6398 ```````````````````````````````` example
6399 **(**foo)
6400 .
6401 <p>**(**foo)</p>
6402 ````````````````````````````````
6403
6404
6405 The point of this restriction is more easily appreciated
6406 with these examples:
6407
6408 ```````````````````````````````` example
6409 *(**foo**)*
6410 .
6411 <p><em>(<strong>foo</strong>)</em></p>
6412 ````````````````````````````````
6413
6414
6415 ```````````````````````````````` example
6416 **Gomphocarpus (*Gomphocarpus physocarpus*, syn.
6417 *Asclepias physocarpa*)**
6418 .
6419 <p><strong>Gomphocarpus (<em>Gomphocarpus physocarpus</em>, syn.
6420 <em>Asclepias physocarpa</em>)</strong></p>
6421 ````````````````````````````````
6422
6423
6424 ```````````````````````````````` example
6425 **foo "*bar*" foo**
6426 .
6427 <p><strong>foo &quot;<em>bar</em>&quot; foo</strong></p>
6428 ````````````````````````````````
6429
6430
6431 Intraword emphasis:
6432
6433 ```````````````````````````````` example
6434 **foo**bar
6435 .
6436 <p><strong>foo</strong>bar</p>
6437 ````````````````````````````````
6438
6439
6440 Rule 8:
6441
6442 This is not strong emphasis, because the closing delimiter is
6443 preceded by whitespace:
6444
6445 ```````````````````````````````` example
6446 __foo bar __
6447 .
6448 <p>__foo bar __</p>
6449 ````````````````````````````````
6450
6451
6452 This is not strong emphasis, because the second `__` is
6453 preceded by punctuation and followed by an alphanumeric:
6454
6455 ```````````````````````````````` example
6456 __(__foo)
6457 .
6458 <p>__(__foo)</p>
6459 ````````````````````````````````
6460
6461
6462 The point of this restriction is more easily appreciated
6463 with this example:
6464
6465 ```````````````````````````````` example
6466 _(__foo__)_
6467 .
6468 <p><em>(<strong>foo</strong>)</em></p>
6469 ````````````````````````````````
6470
6471
6472 Intraword strong emphasis is forbidden with `__`:
6473
6474 ```````````````````````````````` example
6475 __foo__bar
6476 .
6477 <p>__foo__bar</p>
6478 ````````````````````````````````
6479
6480
6481 ```````````````````````````````` example
6482 __пристаням__стремятся
6483 .
6484 <p>__пристаням__стремятся</p>
6485 ````````````````````````````````
6486
6487
6488 ```````````````````````````````` example
6489 __foo__bar__baz__
6490 .
6491 <p><strong>foo__bar__baz</strong></p>
6492 ````````````````````````````````
6493
6494
6495 This is strong emphasis, even though the closing delimiter is
6496 both left- and right-flanking, because it is followed by
6497 punctuation:
6498
6499 ```````````````````````````````` example
6500 __(bar)__.
6501 .
6502 <p><strong>(bar)</strong>.</p>
6503 ````````````````````````````````
6504
6505
6506 Rule 9:
6507
6508 Any nonempty sequence of inline elements can be the contents of an
6509 emphasized span.
6510
6511 ```````````````````````````````` example
6512 *foo [bar](/url)*
6513 .
6514 <p><em>foo <a href="/url">bar</a></em></p>
6515 ````````````````````````````````
6516
6517
6518 ```````````````````````````````` example
6519 *foo
6520 bar*
6521 .
6522 <p><em>foo
6523 bar</em></p>
6524 ````````````````````````````````
6525
6526
6527 In particular, emphasis and strong emphasis can be nested
6528 inside emphasis:
6529
6530 ```````````````````````````````` example
6531 _foo __bar__ baz_
6532 .
6533 <p><em>foo <strong>bar</strong> baz</em></p>
6534 ````````````````````````````````
6535
6536
6537 ```````````````````````````````` example
6538 _foo _bar_ baz_
6539 .
6540 <p><em>foo <em>bar</em> baz</em></p>
6541 ````````````````````````````````
6542
6543
6544 ```````````````````````````````` example
6545 __foo_ bar_
6546 .
6547 <p><em><em>foo</em> bar</em></p>
6548 ````````````````````````````````
6549
6550
6551 ```````````````````````````````` example
6552 *foo *bar**
6553 .
6554 <p><em>foo <em>bar</em></em></p>
6555 ````````````````````````````````
6556
6557
6558 ```````````````````````````````` example
6559 *foo **bar** baz*
6560 .
6561 <p><em>foo <strong>bar</strong> baz</em></p>
6562 ````````````````````````````````
6563
6564 ```````````````````````````````` example
6565 *foo**bar**baz*
6566 .
6567 <p><em>foo<strong>bar</strong>baz</em></p>
6568 ````````````````````````````````
6569
6570 Note that in the preceding case, the interpretation
6571
6572 ``` html
6573 <p><em>foo</em><em>bar<em></em>baz</em></p>
6574 ```
6575
6576
6577 is precluded by the condition that a delimiter that
6578 can both open and close (like the `*` after `foo`)
6579 cannot form emphasis if the sum of the lengths of
6580 the delimiter runs containing the opening and
6581 closing delimiters is a multiple of 3.
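
The condition just stated can be sketched as a small predicate.  This
is only an illustration, not part of the specification: the function
name and its parameters are hypothetical, and it assumes the caller has
already worked out the length of each delimiter run and whether each
delimiter can open or close emphasis.

``` python
def rule_of_three_blocks(opener_len: int, closer_len: int,
                         opener_can_close: bool, closer_can_open: bool) -> bool:
    """Return True if the multiple-of-3 condition forbids this pairing.

    opener_len / closer_len are the lengths of the delimiter runs that
    contain the candidate opening and closing delimiters.
    """
    # The condition only applies when one of the two delimiters
    # can both open and close emphasis.
    either_is_both = opener_can_close or closer_can_open
    return either_is_both and (opener_len + closer_len) % 3 == 0


# `*foo**bar**baz*`: the first `*` (run of 1) against the `**` after
# `foo` (run of 2, can both open and close) -> pairing is forbidden.
print(rule_of_three_blocks(1, 2, False, True))

# The two `**` runs around `bar` (2 + 2 = 4) may pair with each other.
print(rule_of_three_blocks(2, 2, True, True))
```

With the first pairing ruled out, the `**` runs match each other and
the outer `*` delimiters enclose the whole span, as in the example
above.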
6582
6583 The same condition ensures that the following
6584 cases are all strong emphasis nested inside
6585 emphasis, even when the interior spaces are
6586 omitted:
6587
6588
6589 ```````````````````````````````` example
6590 ***foo** bar*
6591 .
6592 <p><em><strong>foo</strong> bar</em></p>
6593 ````````````````````````````````
6594
6595
6596 ```````````````````````````````` example
6597 *foo **bar***
6598 .
6599 <p><em>foo <strong>bar</strong></em></p>
6600 ````````````````````````````````
6601
6602
6603 ```````````````````````````````` example
6604 *foo**bar***
6605 .
6606 <p><em>foo<strong>bar</strong></em></p>
6607 ````````````````````````````````
6608
6609
6610 Indefinite levels of nesting are possible:
6611
6612 ```````````````````````````````` example
6613 *foo **bar *baz* bim** bop*
6614 .
6615 <p><em>foo <strong>bar <em>baz</em> bim</strong> bop</em></p>
6616 ````````````````````````````````
6617
6618
6619 ```````````````````````````````` example
6620 *foo [*bar*](/url)*
6621 .
6622 <p><em>foo <a href="/url"><em>bar</em></a></em></p>
6623 ````````````````````````````````
6624
6625
6626 There can be no empty emphasis or strong emphasis:
6627
6628 ```````````````````````````````` example
6629 ** is not an empty emphasis
6630 .
6631 <p>** is not an empty emphasis</p>
6632 ````````````````````````````````
6633
6634
6635 ```````````````````````````````` example
6636 **** is not an empty strong emphasis
6637 .
6638 <p>**** is not an empty strong emphasis</p>
6639 ````````````````````````````````
6640
6641
6642
6643 Rule 10:
6644
6645 Any nonempty sequence of inline elements can be the contents of a
6646 strongly emphasized span.
6647
6648 ```````````````````````````````` example
6649 **foo [bar](/url)**
6650 .
6651 <p><strong>foo <a href="/url">bar</a></strong></p>
6652 ````````````````````````````````
6653
6654
6655 ```````````````````````````````` example
6656 **foo
6657 bar**
6658 .
6659 <p><strong>foo
6660 bar</strong></p>
6661 ````````````````````````````````
6662
6663
6664 In particular, emphasis and strong emphasis can be nested
6665 inside strong emphasis:
6666
6667 ```````````````````````````````` example
6668 __foo _bar_ baz__
6669 .
6670 <p><strong>foo <em>bar</em> baz</strong></p>
6671 ````````````````````````````````
6672
6673
6674 ```````````````````````````````` example
6675 __foo __bar__ baz__
6676 .
6677 <p><strong>foo <strong>bar</strong> baz</strong></p>
6678 ````````````````````````````````
6679
6680
6681 ```````````````````````````````` example
6682 ____foo__ bar__
6683 .
6684 <p><strong><strong>foo</strong> bar</strong></p>
6685 ````````````````````````````````
6686
6687
6688 ```````````````````````````````` example
6689 **foo **bar****
6690 .
6691 <p><strong>foo <strong>bar</strong></strong></p>
6692 ````````````````````````````````
6693
6694
6695 ```````````````````````````````` example
6696 **foo *bar* baz**
6697 .
6698 <p><strong>foo <em>bar</em> baz</strong></p>
6699 ````````````````````````````````
6700
6701
6702 ```````````````````````````````` example
6703 **foo*bar*baz**
6704 .
6705 <p><strong>foo<em>bar</em>baz</strong></p>
6706 ````````````````````````````````
6707
6708
6709 ```````````````````````````````` example
6710 ***foo* bar**
6711 .
6712 <p><strong><em>foo</em> bar</strong></p>
6713 ````````````````````````````````
6714
6715
6716 ```````````````````````````````` example
6717 **foo *bar***
6718 .
6719 <p><strong>foo <em>bar</em></strong></p>
6720 ````````````````````````````````
6721
6722
6723 Indefinite levels of nesting are possible:
6724
6725 ```````````````````````````````` example
6726 **foo *bar **baz**
6727 bim* bop**
6728 .
6729 <p><strong>foo <em>bar <strong>baz</strong>
6730 bim</em> bop</strong></p>
6731 ````````````````````````````````
6732
6733
6734 ```````````````````````````````` example
6735 **foo [*bar*](/url)**
6736 .
6737 <p><strong>foo <a href="/url"><em>bar</em></a></strong></p>
6738 ````````````````````````````````
6739
6740
6741 There can be no empty emphasis or strong emphasis:
6742
6743 ```````````````````````````````` example
6744 __ is not an empty emphasis
6745 .
6746 <p>__ is not an empty emphasis</p>
6747 ````````````````````````````````
6748
6749
6750 ```````````````````````````````` example
6751 ____ is not an empty strong emphasis
6752 .
6753 <p>____ is not an empty strong emphasis</p>
6754 ````````````````````````````````
6755
6756
6757
6758 Rule 11:
6759
6760 ```````````````````````````````` example
6761 foo ***
6762 .
6763 <p>foo ***</p>
6764 ````````````````````````````````
6765
6766
6767 ```````````````````````````````` example
6768 foo *\**
6769 .
6770 <p>foo <em>*</em></p>
6771 ````````````````````````````````
6772
6773
6774 ```````````````````````````````` example
6775 foo *_*
6776 .
6777 <p>foo <em>_</em></p>
6778 ````````````````````````````````
6779
6780
6781 ```````````````````````````````` example
6782 foo *****
6783 .
6784 <p>foo *****</p>
6785 ````````````````````````````````
6786
6787
6788 ```````````````````````````````` example
6789 foo **\***
6790 .
6791 <p>foo <strong>*</strong></p>
6792 ````````````````````````````````
6793
6794
6795 ```````````````````````````````` example
6796 foo **_**
6797 .
6798 <p>foo <strong>_</strong></p>
6799 ````````````````````````````````
6800
6801
6802 Note that when delimiters do not match evenly, Rule 11 determines
6803 that the excess literal `*` characters will appear outside of the
6804 emphasis, rather than inside it:
6805
6806 ```````````````````````````````` example
6807 **foo*
6808 .
6809 <p>*<em>foo</em></p>
6810 ````````````````````````````````
6811
6812
6813 ```````````````````````````````` example
6814 *foo**
6815 .
6816 <p><em>foo</em>*</p>
6817 ````````````````````````````````
6818
6819
6820 ```````````````````````````````` example
6821 ***foo**
6822 .
6823 <p>*<strong>foo</strong></p>
6824 ````````````````````````````````
6825
6826
6827 ```````````````````````````````` example
6828 ****foo*
6829 .
6830 <p>***<em>foo</em></p>
6831 ````````````````````````````````
6832
6833
6834 ```````````````````````````````` example
6835 **foo***
6836 .
6837 <p><strong>foo</strong>*</p>
6838 ````````````````````````````````
6839
6840
6841 ```````````````````````````````` example
6842 *foo****
6843 .
6844 <p><em>foo</em>***</p>
6845 ````````````````````````````````
6846
6847
6848
6849 Rule 12:
6850
6851 ```````````````````````````````` example
6852 foo ___
6853 .
6854 <p>foo ___</p>
6855 ````````````````````````````````
6856
6857
6858 ```````````````````````````````` example
6859 foo _\__
6860 .
6861 <p>foo <em>_</em></p>
6862 ````````````````````````````````
6863
6864
6865 ```````````````````````````````` example
6866 foo _*_
6867 .
6868 <p>foo <em>*</em></p>
6869 ````````````````````````````````
6870
6871
6872 ```````````````````````````````` example
6873 foo _____
6874 .
6875 <p>foo _____</p>
6876 ````````````````````````````````
6877
6878
6879 ```````````````````````````````` example
6880 foo __\___
6881 .
6882 <p>foo <strong>_</strong></p>
6883 ````````````````````````````````
6884
6885
6886 ```````````````````````````````` example
6887 foo __*__
6888 .
6889 <p>foo <strong>*</strong></p>
6890 ````````````````````````````````
6891
6892
6893 Note that when delimiters do not match evenly, Rule 12 determines
6894 that the excess literal `_` characters will appear outside of the
6895 emphasis, rather than inside it:
6896
6897 ```````````````````````````````` example
6898 __foo_
6899 .
6900 <p>_<em>foo</em></p>
6901 ````````````````````````````````
6902
6903
6904 ```````````````````````````````` example
6905 _foo__
6906 .
6907 <p><em>foo</em>_</p>
6908 ````````````````````````````````
6909
6910
6911 ```````````````````````````````` example
6912 ___foo__
6913 .
6914 <p>_<strong>foo</strong></p>
6915 ````````````````````````````````
6916
6917
6918 ```````````````````````````````` example
6919 ____foo_
6920 .
6921 <p>___<em>foo</em></p>
6922 ````````````````````````````````
6923
6924
6925 ```````````````````````````````` example
6926 __foo___
6927 .
6928 <p><strong>foo</strong>_</p>
6929 ````````````````````````````````
6930
6931
6932 ```````````````````````````````` example
6933 _foo____
6934 .
6935 <p><em>foo</em>___</p>
6936 ````````````````````````````````
6937
6938
6939 Rule 13 implies that if you want emphasis nested directly inside
6940 emphasis, you must use different delimiters:
6941
6942 ```````````````````````````````` example
6943 **foo**
6944 .
6945 <p><strong>foo</strong></p>
6946 ````````````````````````````````
6947
6948
6949 ```````````````````````````````` example
6950 *_foo_*
6951 .
6952 <p><em><em>foo</em></em></p>
6953 ````````````````````````````````
6954
6955
6956 ```````````````````````````````` example
6957 __foo__
6958 .
6959 <p><strong>foo</strong></p>
6960 ````````````````````````````````
6961
6962
6963 ```````````````````````````````` example
6964 _*foo*_
6965 .
6966 <p><em><em>foo</em></em></p>
6967 ````````````````````````````````
6968
6969
6970 However, strong emphasis within strong emphasis is possible without
6971 switching delimiters:
6972
6973 ```````````````````````````````` example
6974 ****foo****
6975 .
6976 <p><strong><strong>foo</strong></strong></p>
6977 ````````````````````````````````
6978
6979
6980 ```````````````````````````````` example
6981 ____foo____
6982 .
6983 <p><strong><strong>foo</strong></strong></p>
6984 ````````````````````````````````
6985
6986
6987
6988 Rule 13 can be applied to arbitrarily long sequences of
6989 delimiters:
6990
6991 ```````````````````````````````` example
6992 ******foo******
6993 .
6994 <p><strong><strong><strong>foo</strong></strong></strong></p>
6995 ````````````````````````````````
6996
6997
6998 Rule 14:
6999
7000 ```````````````````````````````` example
7001 ***foo***
7002 .
7003 <p><strong><em>foo</em></strong></p>
7004 ````````````````````````````````
7005
7006
7007 ```````````````````````````````` example
7008 _____foo_____
7009 .
7010 <p><strong><strong><em>foo</em></strong></strong></p>
7011 ````````````````````````````````
7012
7013
7014 Rule 15:
7015
7016 ```````````````````````````````` example
7017 *foo _bar* baz_
7018 .
7019 <p><em>foo _bar</em> baz_</p>
7020 ````````````````````````````````
7021
7022
7023 ```````````````````````````````` example
7024 *foo __bar *baz bim__ bam*
7025 .
7026 <p><em>foo <strong>bar *baz bim</strong> bam</em></p>
7027 ````````````````````````````````
7028
7029
7030 Rule 16:
7031
7032 ```````````````````````````````` example
7033 **foo **bar baz**
7034 .
7035 <p>**foo <strong>bar baz</strong></p>
7036 ````````````````````````````````
7037
7038
7039 ```````````````````````````````` example
7040 *foo *bar baz*
7041 .
7042 <p>*foo <em>bar baz</em></p>
7043 ````````````````````````````````
7044
7045
7046 Rule 17:
7047
7048 ```````````````````````````````` example
7049 *[bar*](/url)
7050 .
7051 <p>*<a href="/url">bar*</a></p>
7052 ````````````````````````````````
7053
7054
7055 ```````````````````````````````` example
7056 _foo [bar_](/url)
7057 .
7058 <p>_foo <a href="/url">bar_</a></p>
7059 ````````````````````````````````
7060
7061
7062 ```````````````````````````````` example
7063 *<img src="foo" title="*"/>
7064 .
7065 <p>*<img src="foo" title="*"/></p>
7066 ````````````````````````````````
7067
7068
7069 ```````````````````````````````` example
7070 **<a href="**">
7071 .
7072 <p>**<a href="**"></p>
7073 ````````````````````````````````
7074
7075
7076 ```````````````````````````````` example
7077 __<a href="__">
7078 .
7079 <p>__<a href="__"></p>
7080 ````````````````````````````````
7081
7082
7083 ```````````````````````````````` example
7084 *a `*`*
7085 .
7086 <p><em>a <code>*</code></em></p>
7087 ````````````````````````````````
7088
7089
7090 ```````````````````````````````` example
7091 _a `_`_
7092 .
7093 <p><em>a <code>_</code></em></p>
7094 ````````````````````````````````
7095
7096
7097 ```````````````````````````````` example
7098 **a<http://foo.bar/?q=**>
7099 .
7100 <p>**a<a href="http://foo.bar/?q=**">http://foo.bar/?q=**</a></p>
7101 ````````````````````````````````
7102
7103
7104 ```````````````````````````````` example
7105 __a<http://foo.bar/?q=__>
7106 .
7107 <p>__a<a href="http://foo.bar/?q=__">http://foo.bar/?q=__</a></p>
7108 ````````````````````````````````
7109
7110
7111
7112 ## Links
7113
7114 A link contains [link text] (the visible text), a [link destination]
7115 (the URI that is the link destination), and optionally a [link title].
7116 There are two basic kinds of links in Markdown.  In [inline links] the
7117 destination and title are given immediately after the link text.  In
7118 [reference links] the destination and title are defined elsewhere in
7119 the document.
7120
7121 A [link text](@) consists of a sequence of zero or more
7122 inline elements enclosed by square brackets (`[` and `]`).  The
7123 following rules apply:
7124
7125 - Links may not contain other links, at any level of nesting. If
7126   multiple otherwise valid link definitions appear nested inside each
7127   other, the inner-most definition is used.
7128
7129 - Brackets are allowed in the [link text] only if (a) they
7130   are backslash-escaped or (b) they appear as a matched pair of brackets,
7131   with an open bracket `[`, a sequence of zero or more inlines, and
7132   a close bracket `]`.
7133
7134 - Backtick [code spans], [autolinks], and raw [HTML tags] bind more tightly
7135   than the brackets in link text.  Thus, for example,
7136   `` [foo`]` `` could not be a link text, since the second `]`
7137   is part of a code span.
7138
7139 - The brackets in link text bind more tightly than markers for
7140   [emphasis and strong emphasis]. Thus, for example, `*[foo*](url)` is a link.
7141
7142 A [link destination](@) consists of either
7143
7144 - a sequence of zero or more characters between an opening `<` and a
7145   closing `>` that contains no spaces, line breaks, or unescaped
7146   `<` or `>` characters, or
7147
7148 - a nonempty sequence of characters that does not include
7149   ASCII space or control characters, and includes parentheses
7150   only if (a) they are backslash-escaped or (b) they are part of
7151   a balanced pair of unescaped parentheses that is not itself
7152   inside a balanced pair of unescaped parentheses.
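
The two forms of link destination can be checked with a short scanner.
The following is a minimal sketch under stated assumptions, not a
reference implementation: the names `is_link_destination`,
`_is_pointy` and `_is_bare` are hypothetical, "spaces" in the first
form is taken to mean the ASCII space character, and a backslash is
treated as an escape only before ASCII punctuation.

``` python
import string

# ASCII punctuation characters are the ones a backslash can escape.
ESCAPABLE = set(string.punctuation)


def _is_pointy(s: str) -> bool:
    """Form 1: `<...>` with no spaces, line breaks, or unescaped `<`/`>`."""
    if not s.startswith("<"):
        return False
    i = 1
    while i < len(s):
        c = s[i]
        if c == "\\" and i + 1 < len(s) and s[i + 1] in ESCAPABLE:
            i += 2  # skip the escaped character
            continue
        if c in " \n\r" or c == "<":
            return False
        if c == ">":
            # The first unescaped `>` must close the destination.
            return i == len(s) - 1
        i += 1
    return False


def _is_bare(s: str) -> bool:
    """Form 2: nonempty, no ASCII space or control characters, and
    unescaped parentheses balanced to a depth of at most one."""
    if not s:
        return False
    depth = 0
    i = 0
    while i < len(s):
        c = s[i]
        if c == "\\" and i + 1 < len(s) and s[i + 1] in ESCAPABLE:
            i += 2
            continue
        if c == " " or ord(c) < 0x20 or ord(c) == 0x7F:
            return False
        if c == "(":
            depth += 1
            if depth > 1:  # a pair nested inside another pair
                return False
        elif c == ")":
            if depth == 0:  # unmatched closing parenthesis
                return False
            depth -= 1
        i += 1
    return depth == 0


def is_link_destination(s: str) -> bool:
    return _is_pointy(s) or _is_bare(s)


print(is_link_destination("(foo)and(bar)"))
print(is_link_destination("foo(and(bar))"))
```

Note that the sketch only classifies a candidate string; it does not
locate the destination inside a larger inline link.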
7153
7154 A [link title](@)  consists of either
7155
7156 - a sequence of zero or more characters between straight double-quote
7157   characters (`"`), including a `"` character only if it is
7158   backslash-escaped, or
7159
7160 - a sequence of zero or more characters between straight single-quote
7161   characters (`'`), including a `'` character only if it is
7162   backslash-escaped, or
7163
7164 - a sequence of zero or more characters between matching parentheses
7165   (`(...)`), including a `)` character only if it is backslash-escaped.
7166
7167 Although [link titles] may span multiple lines, they may not contain
7168 a [blank line].
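
The three title forms and the blank-line restriction can be sketched in
the same style.  Again this is an illustration under assumptions rather
than a definitive implementation: `is_link_title` is a hypothetical
name, a blank line is taken to be a line containing only spaces or
tabs, and a backslash counts as an escape only before ASCII
punctuation.

``` python
import re
import string

# ASCII punctuation characters are the ones a backslash can escape.
ESCAPABLE = set(string.punctuation)

# Each opening delimiter and the delimiter that must close it.
CLOSERS = {'"': '"', "'": "'", "(": ")"}


def is_link_title(s: str) -> bool:
    if len(s) < 2 or s[0] not in CLOSERS:
        return False
    # A title may span lines but may not contain a blank line.
    if re.search(r"\n[ \t]*\n", s):
        return False
    close = CLOSERS[s[0]]
    i = 1
    while i < len(s):
        c = s[i]
        if c == "\\" and i + 1 < len(s) and s[i + 1] in ESCAPABLE:
            i += 2  # skip the escaped character
            continue
        if c == close:
            # The first unescaped closer must be the last character.
            return i == len(s) - 1
        i += 1
    return False


print(is_link_title('"title \\"quoted\\""'))
print(is_link_title('"title "and" title"'))
```

The second call fails because the first unescaped `"` after the opening
quote ends the title before the end of the string.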
7169
7170 An [inline link](@) consists of a [link text] followed immediately
7171 by a left parenthesis `(`, optional [whitespace], an optional
7172 [link destination], an optional [link title] separated from the link
7173 destination by [whitespace], optional [whitespace], and a right
7174 parenthesis `)`. The link's text consists of the inlines contained
7175 in the [link text] (excluding the enclosing square brackets).
7176 The link's URI consists of the link destination, excluding enclosing
7177 `<...>` if present, with backslash-escapes in effect as described
7178 above.  The link's title consists of the link title, excluding its
7179 enclosing delimiters, with backslash-escapes in effect as described
7180 above.
7181
7182 Here is a simple inline link:
7183
7184 ```````````````````````````````` example
7185 [link](/uri "title")
7186 .
7187 <p><a href="/uri" title="title">link</a></p>
7188 ````````````````````````````````
7189
7190
7191 The title may be omitted:
7192
7193 ```````````````````````````````` example
7194 [link](/uri)
7195 .
7196 <p><a href="/uri">link</a></p>
7197 ````````````````````````````````
7198
7199
7200 Both the title and the destination may be omitted:
7201
7202 ```````````````````````````````` example
7203 [link]()
7204 .
7205 <p><a href="">link</a></p>
7206 ````````````````````````````````
7207
7208
7209 ```````````````````````````````` example
7210 [link](<>)
7211 .
7212 <p><a href="">link</a></p>
7213 ````````````````````````````````
7214
7215
7216 The destination cannot contain spaces or line breaks,
7217 even if enclosed in pointy brackets:
7218
7219 ```````````````````````````````` example
7220 [link](/my uri)
7221 .
7222 <p>[link](/my uri)</p>
7223 ````````````````````````````````
7224
7225
7226 ```````````````````````````````` example
7227 [link](</my uri>)
7228 .
7229 <p>[link](&lt;/my uri&gt;)</p>
7230 ````````````````````````````````
7231
7232
7233 ```````````````````````````````` example
7234 [link](foo
7235 bar)
7236 .
7237 <p>[link](foo
7238 bar)</p>
7239 ````````````````````````````````
7240
7241
7242 ```````````````````````````````` example
7243 [link](<foo
7244 bar>)
7245 .
7246 <p>[link](<foo
7247 bar>)</p>
7248 ````````````````````````````````
7249
7250 Parentheses inside the link destination may be escaped:
7251
7252 ```````````````````````````````` example
7253 [link](\(foo\))
7254 .
7255 <p><a href="(foo)">link</a></p>
7256 ````````````````````````````````
7257
7258 One level of balanced parentheses is allowed without escaping:
7259
7260 ```````````````````````````````` example
7261 [link]((foo)and(bar))
7262 .
7263 <p><a href="(foo)and(bar)">link</a></p>
7264 ````````````````````````````````
7265
7266 However, if you have parentheses within parentheses, you need to escape
7267 or use the `<...>` form:
7268
7269 ```````````````````````````````` example
7270 [link](foo(and(bar)))
7271 .
7272 <p>[link](foo(and(bar)))</p>
7273 ````````````````````````````````
7274
7275
7276 ```````````````````````````````` example
7277 [link](foo(and\(bar\)))
7278 .
7279 <p><a href="foo(and(bar))">link</a></p>
7280 ````````````````````````````````
7281
7282
7283 ```````````````````````````````` example
7284 [link](<foo(and(bar))>)
7285 .
7286 <p><a href="foo(and(bar))">link</a></p>
7287 ````````````````````````````````
7288
7289
7290 Parentheses and other symbols can also be escaped, as usual
7291 in Markdown:
7292
7293 ```````````````````````````````` example
7294 [link](foo\)\:)
7295 .
7296 <p><a href="foo):">link</a></p>
7297 ````````````````````````````````
7298
7299
7300 A link can contain fragment identifiers and queries:
7301
7302 ```````````````````````````````` example
7303 [link](#fragment)
7304
7305 [link](http://example.com#fragment)
7306
7307 [link](http://example.com?foo=3#frag)
7308 .
7309 <p><a href="#fragment">link</a></p>
7310 <p><a href="http://example.com#fragment">link</a></p>
7311 <p><a href="http://example.com?foo=3#frag">link</a></p>
7312 ````````````````````````````````
7313
7314
7315 Note that a backslash before a non-escapable character is
7316 just a backslash:
7317
7318 ```````````````````````````````` example
7319 [link](foo\bar)
7320 .
7321 <p><a href="foo%5Cbar">link</a></p>
7322 ````````````````````````````````
7323
7324
7325 URL-escaping should be left alone inside the destination, as all
7326 URL-escaped characters are also valid URL characters. Entity and
7327 numerical character references in the destination will be parsed
7328 into the corresponding Unicode code points, as usual.  These may
7329 be optionally URL-escaped when written as HTML, but this spec
7330 does not enforce any particular policy for rendering URLs in
7331 HTML or other formats.  Renderers may make different decisions
7332 about how to escape or normalize URLs in the output.
7333
7334 ```````````````````````````````` example
7335 [link](foo%20b&auml;)
7336 .
7337 <p><a href="foo%20b%C3%A4">link</a></p>
7338 ````````````````````````````````
7339
7340
7341 Note that, because titles can often be parsed as destinations,
7342 if you try to omit the destination and keep the title, you'll
7343 get unexpected results:
7344
7345 ```````````````````````````````` example
7346 [link]("title")
7347 .
7348 <p><a href="%22title%22">link</a></p>
7349 ````````````````````````````````
7350
7351
7352 Titles may be in single quotes, double quotes, or parentheses:
7353
7354 ```````````````````````````````` example
7355 [link](/url "title")
7356 [link](/url 'title')
7357 [link](/url (title))
7358 .
7359 <p><a href="/url" title="title">link</a>
7360 <a href="/url" title="title">link</a>
7361 <a href="/url" title="title">link</a></p>
7362 ````````````````````````````````
7363
7364
7365 Backslash escapes and entity and numeric character references
7366 may be used in titles:
7367
7368 ```````````````````````````````` example
7369 [link](/url "title \"&quot;")
7370 .
7371 <p><a href="/url" title="title &quot;&quot;">link</a></p>
7372 ````````````````````````````````
7373
7374
7375 Titles must be separated from the link destination by [whitespace].
7376 Other [Unicode whitespace], such as a non-breaking space, does not work:
7377
7378 ```````````````````````````````` example
7379 [link](/url "title")
7380 .
7381 <p><a href="/url%C2%A0%22title%22">link</a></p>
7382 ````````````````````````````````
7383
7384
7385 Nested balanced quotes are not allowed without escaping:
7386
7387 ```````````````````````````````` example
7388 [link](/url "title "and" title")
7389 .
7390 <p>[link](/url &quot;title &quot;and&quot; title&quot;)</p>
7391 ````````````````````````````````
7392
7393
7394 But it is easy to work around this by using a different quote type:
7395
7396 ```````````````````````````````` example
7397 [link](/url 'title "and" title')
7398 .
7399 <p><a href="/url" title="title &quot;and&quot; title">link</a></p>
7400 ````````````````````````````````
7401
7402
7403 (Note:  `Markdown.pl` did allow double quotes inside a double-quoted
7404 title, and its test suite included a test demonstrating this.
7405 But it is hard to see a good rationale for the extra complexity this
7406 brings, since there are already many ways---backslash escaping,
7407 entity and numeric character references, or using a different
7408 quote type for the enclosing title---to write titles containing
7409 double quotes.  `Markdown.pl`'s handling of titles has a number
7410 of other strange features.  For example, it allows single-quoted
7411 titles in inline links, but not reference links.  And, in
7412 reference links but not inline links, it allows a title to begin
7413 with `"` and end with `)`.  `Markdown.pl` 1.0.1 even allows
7414 titles with no closing quotation mark, though 1.0.2b8 does not.
7415 It seems preferable to adopt a simple, rational rule that works
7416 the same way in inline links and link reference definitions.)
7417
7418 [Whitespace] is allowed around the destination and title:
7419
7420 ```````````````````````````````` example
7421 [link](   /uri
7422   "title"  )
7423 .
7424 <p><a href="/uri" title="title">link</a></p>
7425 ````````````````````````````````
7426
7427
7428 But it is not allowed between the link text and the
7429 following parenthesis:
7430
7431 ```````````````````````````````` example
7432 [link] (/uri)
7433 .
7434 <p>[link] (/uri)</p>
7435 ````````````````````````````````
7436
7437
7438 The link text may contain balanced brackets, but not unbalanced ones,
7439 unless they are escaped:
7440
7441 ```````````````````````````````` example
7442 [link [foo [bar]]](/uri)
7443 .
7444 <p><a href="/uri">link [foo [bar]]</a></p>
7445 ````````````````````````````````
7446
7447
7448 ```````````````````````````````` example
7449 [link] bar](/uri)
7450 .
7451 <p>[link] bar](/uri)</p>
7452 ````````````````````````````````
7453
7454
7455 ```````````````````````````````` example
7456 [link [bar](/uri)
7457 .
7458 <p>[link <a href="/uri">bar</a></p>
7459 ````````````````````````````````
7460
7461
7462 ```````````````````````````````` example
7463 [link \[bar](/uri)
7464 .
7465 <p><a href="/uri">link [bar</a></p>
7466 ````````````````````````````````
7467
7468
7469 The link text may contain inline content:
7470
7471 ```````````````````````````````` example
7472 [link *foo **bar** `#`*](/uri)
7473 .
7474 <p><a href="/uri">link <em>foo <strong>bar</strong> <code>#</code></em></a></p>
7475 ````````````````````````````````
7476
7477
7478 ```````````````````````````````` example
7479 [![moon](moon.jpg)](/uri)
7480 .
7481 <p><a href="/uri"><img src="moon.jpg" alt="moon" /></a></p>
7482 ````````````````````````````````
7483
7484
7485 However, links may not contain other links, at any level of nesting:
7486
7487 ```````````````````````````````` example
7488 [foo [bar](/uri)](/uri)
7489 .
7490 <p>[foo <a href="/uri">bar</a>](/uri)</p>
7491 ````````````````````````````````
7492
7493
7494 ```````````````````````````````` example
7495 [foo *[bar [baz](/uri)](/uri)*](/uri)
7496 .
7497 <p>[foo <em>[bar <a href="/uri">baz</a>](/uri)</em>](/uri)</p>
7498 ````````````````````````````````
7499
7500
7501 ```````````````````````````````` example
7502 ![[[foo](uri1)](uri2)](uri3)
7503 .
7504 <p><img src="uri3" alt="[foo](uri2)" /></p>
7505 ````````````````````````````````
7506
7507
7508 These cases illustrate the precedence of link text grouping over
7509 emphasis grouping:
7510
7511 ```````````````````````````````` example
7512 *[foo*](/uri)
7513 .
7514 <p>*<a href="/uri">foo*</a></p>
7515 ````````````````````````````````
7516
7517
7518 ```````````````````````````````` example
7519 [foo *bar](baz*)
7520 .
7521 <p><a href="baz*">foo *bar</a></p>
7522 ````````````````````````````````
7523
7524
7525 Note that brackets that *aren't* part of links do not take
7526 precedence:
7527
7528 ```````````````````````````````` example
7529 *foo [bar* baz]
7530 .
7531 <p><em>foo [bar</em> baz]</p>
7532 ````````````````````````````````
7533
7534
7535 These cases illustrate the precedence of HTML tags, code spans,
7536 and autolinks over link grouping:
7537
7538 ```````````````````````````````` example
7539 [foo <bar attr="](baz)">
7540 .
7541 <p>[foo <bar attr="](baz)"></p>
7542 ````````````````````````````````
7543
7544
7545 ```````````````````````````````` example
7546 [foo`](/uri)`
7547 .
7548 <p>[foo<code>](/uri)</code></p>
7549 ````````````````````````````````
7550
7551
7552 ```````````````````````````````` example
7553 [foo<http://example.com/?search=](uri)>
7554 .
7555 <p>[foo<a href="http://example.com/?search=%5D(uri)">http://example.com/?search=](uri)</a></p>
7556 ````````````````````````````````
7557
7558
7559 There are three kinds of [reference link](@)s:
7560 [full](#full-reference-link), [collapsed](#collapsed-reference-link),
7561 and [shortcut](#shortcut-reference-link).
7562
7563 A [full reference link](@)
7564 consists of a [link text] immediately followed by a [link label]
7565 that [matches] a [link reference definition] elsewhere in the document.
7566
7567 A [link label](@)  begins with a left bracket (`[`) and ends
7568 with the first right bracket (`]`) that is not backslash-escaped.
7569 Between these brackets there must be at least one [non-whitespace character].
7570 Unescaped square bracket characters are not allowed in
7571 [link labels].  A link label can have at most 999
7572 characters inside the square brackets.
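
These constraints on a link label can be sketched as a single check.
This is a minimal illustration with assumptions: `is_link_label` is a
hypothetical name, the 999-character limit is counted over the raw
characters between the brackets (backslashes included), and whitespace
means space, tab, newline, line tabulation, form feed, and carriage
return.

``` python
import string

# ASCII punctuation characters are the ones a backslash can escape.
ESCAPABLE = set(string.punctuation)
WHITESPACE = set(" \t\n\v\f\r")


def is_link_label(s: str) -> bool:
    if len(s) < 2 or s[0] != "[":
        return False
    i = 1
    end = -1
    while i < len(s):
        c = s[i]
        if c == "\\" and i + 1 < len(s) and s[i + 1] in ESCAPABLE:
            i += 2  # skip the escaped character
            continue
        if c == "[":
            return False  # unescaped brackets are not allowed inside
        if c == "]":
            end = i  # first unescaped right bracket ends the label
            break
        i += 1
    if end != len(s) - 1:
        return False
    inner = s[1:end]
    has_content = any(ch not in WHITESPACE for ch in inner)
    return has_content and len(inner) <= 999


print(is_link_label("[foo bar]"))
print(is_link_label("[ ]"))
```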
7573
7574 One label [matches](@)
7575 another if and only if their normalized forms are equal.  To normalize a
7576 label, perform the *Unicode case fold* and collapse consecutive internal
7577 [whitespace] to a single space.  If there are multiple
7578 matching reference link definitions, the one that comes first in the
7579 document is used.  (It is desirable in such cases to emit a warning.)
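
Label matching can be sketched in a few lines.  The sketch below is an
illustration, not a normative algorithm: `normalize_label` and
`labels_match` are hypothetical names, Python's `str.casefold` stands
in for the Unicode case fold, and stripping leading and trailing
whitespace is an assumption of the sketch that the paragraph above does
not state.  The labels are passed without their enclosing brackets.

``` python
import re

# Runs of space, tab, newline, line tabulation, form feed, carriage return.
_WHITESPACE_RUN = re.compile(r"[ \t\n\v\f\r]+")


def normalize_label(label: str) -> str:
    """Case-fold the label and collapse whitespace runs to one space."""
    folded = label.casefold()
    collapsed = _WHITESPACE_RUN.sub(" ", folded)
    # Assumption: leading and trailing whitespace is ignored.
    return collapsed.strip(" ")


def labels_match(a: str, b: str) -> bool:
    return normalize_label(a) == normalize_label(b)


print(labels_match("BaR", "bar"))
print(labels_match("Foo\n  bar", "Foo bar"))
print(labels_match("foo", "bar"))
```

To honor the first-definition-wins rule, an implementation could store
definitions in a dictionary keyed by the normalized label and skip any
key that is already present.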
7580
7581 The contents of the first link label are parsed as inlines, which are
7582 used as the link's text.  The link's URI and title are provided by the
7583 matching [link reference definition].
7584
7585 Here is a simple example:
7586
7587 ```````````````````````````````` example
7588 [foo][bar]
7589
7590 [bar]: /url "title"
7591 .
7592 <p><a href="/url" title="title">foo</a></p>
7593 ````````````````````````````````
7594
7595
7596 The rules for the [link text] are the same as with
7597 [inline links].  Thus:
7598
7599 The link text may contain balanced brackets, but not unbalanced ones,
7600 unless they are escaped:
7601
7602 ```````````````````````````````` example
7603 [link [foo [bar]]][ref]
7604
7605 [ref]: /uri
7606 .
7607 <p><a href="/uri">link [foo [bar]]</a></p>
7608 ````````````````````````````````
7609
7610
7611 ```````````````````````````````` example
7612 [link \[bar][ref]
7613
7614 [ref]: /uri
7615 .
7616 <p><a href="/uri">link [bar</a></p>
7617 ````````````````````````````````
7618
7619
7620 The link text may contain inline content:
7621
7622 ```````````````````````````````` example
7623 [link *foo **bar** `#`*][ref]
7624
7625 [ref]: /uri
7626 .
7627 <p><a href="/uri">link <em>foo <strong>bar</strong> <code>#</code></em></a></p>
7628 ````````````````````````````````
7629
7630
7631 ```````````````````````````````` example
7632 [![moon](moon.jpg)][ref]
7633
7634 [ref]: /uri
7635 .
7636 <p><a href="/uri"><img src="moon.jpg" alt="moon" /></a></p>
7637 ````````````````````````````````
7638
7639
7640 However, links may not contain other links, at any level of nesting:
7641
7642 ```````````````````````````````` example
7643 [foo [bar](/uri)][ref]
7644
7645 [ref]: /uri
7646 .
7647 <p>[foo <a href="/uri">bar</a>]<a href="/uri">ref</a></p>
7648 ````````````````````````````````
7649
7650
7651 ```````````````````````````````` example
7652 [foo *bar [baz][ref]*][ref]
7653
7654 [ref]: /uri
7655 .
7656 <p>[foo <em>bar <a href="/uri">baz</a></em>]<a href="/uri">ref</a></p>
7657 ````````````````````````````````
7658
7659
7660 (In the examples above, we have two [shortcut reference links]
7661 instead of one [full reference link].)
7662
7663 The following cases illustrate the precedence of link text grouping over
7664 emphasis grouping:
7665
7666 ```````````````````````````````` example
7667 *[foo*][ref]
7668
7669 [ref]: /uri
7670 .
7671 <p>*<a href="/uri">foo*</a></p>
7672 ````````````````````````````````
7673
7674
7675 ```````````````````````````````` example
7676 [foo *bar][ref]
7677
7678 [ref]: /uri
7679 .
7680 <p><a href="/uri">foo *bar</a></p>
7681 ````````````````````````````````
7682
7683
7684 These cases illustrate the precedence of HTML tags, code spans,
7685 and autolinks over link grouping:
7686
7687 ```````````````````````````````` example
7688 [foo <bar attr="][ref]">
7689
7690 [ref]: /uri
7691 .
7692 <p>[foo <bar attr="][ref]"></p>
7693 ````````````````````````````````
7694
7695
7696 ```````````````````````````````` example
7697 [foo`][ref]`
7698
7699 [ref]: /uri
7700 .
7701 <p>[foo<code>][ref]</code></p>
7702 ````````````````````````````````
7703
7704
7705 ```````````````````````````````` example
7706 [foo<http://example.com/?search=][ref]>
7707
7708 [ref]: /uri
7709 .
7710 <p>[foo<a href="http://example.com/?search=%5D%5Bref%5D">http://example.com/?search=][ref]</a></p>
7711 ````````````````````````````````
7712
7713
7714 Matching is case-insensitive:
7715
7716 ```````````````````````````````` example
7717 [foo][BaR]
7718
7719 [bar]: /url "title"
7720 .
7721 <p><a href="/url" title="title">foo</a></p>
7722 ````````````````````````````````
7723
7724
7725 Unicode case fold is used:
7726
7727 ```````````````````````````````` example
7728 [Толпой][Толпой] is a Russian word.
7729
7730 [ТОЛПОЙ]: /url
7731 .
7732 <p><a href="/url">Толпой</a> is a Russian word.</p>
7733 ````````````````````````````````
7734
7735
7736 Consecutive internal [whitespace] is treated as one space for
7737 purposes of determining matching:
7738
7739 ```````````````````````````````` example
7740 [Foo
7741   bar]: /url
7742
7743 [Baz][Foo bar]
7744 .
7745 <p><a href="/url">Baz</a></p>
7746 ````````````````````````````````
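
The matching rules illustrated above (case-insensitivity through Unicode case folding, and collapsing of consecutive whitespace) can be sketched in Python as follows. This is an illustration only, not part of the spec: the names `normalize_label` and `labels_match` are hypothetical, Python's `str.casefold` stands in for the Unicode case fold, and `\s` matches a somewhat larger set of characters than [whitespace] as defined in this spec. Stripping leading and trailing whitespace is also an assumption of the sketch.

```python
import re


def normalize_label(label: str) -> str:
    """Normalize a link label for matching (hypothetical helper)."""
    # Collapse every run of whitespace, including line endings, to one
    # space. Dropping leading/trailing whitespace is an assumption.
    collapsed = re.sub(r"\s+", " ", label).strip()
    # casefold() performs Unicode case folding, which goes further than
    # lower() (for example, it folds "ß" to "ss").
    return collapsed.casefold()


def labels_match(a: str, b: str) -> bool:
    """Two labels match when their normalized forms are identical."""
    return normalize_label(a) == normalize_label(b)
```

Under these assumptions, `labels_match("Foo\n  bar", "Foo bar")` and `labels_match("Толпой", "ТОЛПОЙ")` both hold, as in the examples above.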
7747
7748
7749 No [whitespace] is allowed between the [link text] and the
7750 [link label]:
7751
7752 ```````````````````````````````` example
7753 [foo] [bar]
7754
7755 [bar]: /url "title"
7756 .
7757 <p>[foo] <a href="/url" title="title">bar</a></p>
7758 ````````````````````````````````
7759
7760
7761 ```````````````````````````````` example
7762 [foo]
7763 [bar]
7764
7765 [bar]: /url "title"
7766 .
7767 <p>[foo]
7768 <a href="/url" title="title">bar</a></p>
7769 ````````````````````````````````
7770
7771
7772 This is a departure from John Gruber's original Markdown syntax
7773 description, which explicitly allows whitespace between the link
7774 text and the link label.  It brings reference links in line with
7775 [inline links], which (according to both original Markdown and
7776 this spec) cannot have whitespace after the link text.  More
7777 importantly, it prevents inadvertent capture of consecutive
7778 [shortcut reference links]. If whitespace were allowed between the
7779 link text and the link label, then the following would be parsed
7780 as a single reference link rather than as the two shortcut
7781 reference links that were intended:
7782
7783 ``` markdown
7784 [foo]
7785 [bar]
7786
7787 [foo]: /url1
7788 [bar]: /url2
7789 ```
7790
7791 (Note that [shortcut reference links] were introduced by Gruber
7792 himself in a beta version of `Markdown.pl`, but never included
7793 in the official syntax description.  Without shortcut reference
7794 links, it is harmless to allow space between the link text and
7795 link label; but once shortcut references are introduced, it is
7796 too dangerous to allow this, as it frequently leads to
7797 unintended results.)
7798
7799 When there are multiple matching [link reference definitions],
7800 the first is used:
7801
7802 ```````````````````````````````` example
7803 [foo]: /url1
7804
7805 [foo]: /url2
7806
7807 [bar][foo]
7808 .
7809 <p><a href="/url1">bar</a></p>
7810 ````````````````````````````````
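
One way to obtain this behavior is sketched below, assuming that definitions are collected in document order into a table keyed by normalized label. The function name `build_reference_map` and the `(label, destination, title)` tuple layout are hypothetical, and the inline normalization is a simplification.

```python
def build_reference_map(definitions):
    """Build a lookup table from (label, destination, title) tuples.

    The tuples are assumed to arrive in document order.
    """
    refmap = {}
    for label, destination, title in definitions:
        # Simplified normalization: collapse whitespace, then case fold.
        key = " ".join(label.split()).casefold()
        # setdefault() leaves an existing entry untouched, so the first
        # definition for a given label is the one that is kept.
        refmap.setdefault(key, (destination, title))
    return refmap
```

Because later duplicates are ignored when the table is built, lookups never need to know how many definitions a label had.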
7811
7812
7813 Note that matching is performed on normalized strings, not parsed
7814 inline content.  So the following does not match, even though the
7815 labels define equivalent inline content:
7816
7817 ```````````````````````````````` example
7818 [bar][foo\!]
7819
7820 [foo!]: /url
7821 .
7822 <p>[bar][foo!]</p>
7823 ````````````````````````````````
7824
7825
7826 [Link labels] cannot contain brackets, unless they are
7827 backslash-escaped:
7828
7829 ```````````````````````````````` example
7830 [foo][ref[]
7831
7832 [ref[]: /uri
7833 .
7834 <p>[foo][ref[]</p>
7835 <p>[ref[]: /uri</p>
7836 ````````````````````````````````
7837
7838
7839 ```````````````````````````````` example
7840 [foo][ref[bar]]
7841
7842 [ref[bar]]: /uri
7843 .
7844 <p>[foo][ref[bar]]</p>
7845 <p>[ref[bar]]: /uri</p>
7846 ````````````````````````````````
7847
7848
7849 ```````````````````````````````` example
7850 [[[foo]]]
7851
7852 [[[foo]]]: /url
7853 .
7854 <p>[[[foo]]]</p>
7855 <p>[[[foo]]]: /url</p>
7856 ````````````````````````````````
7857
7858
7859 ```````````````````````````````` example
7860 [foo][ref\[]
7861
7862 [ref\[]: /uri
7863 .
7864 <p><a href="/uri">foo</a></p>
7865 ````````````````````````````````
7866
7867
7868 Note that in this example `]` is not backslash-escaped:
7869
7870 ```````````````````````````````` example
7871 [bar\\]: /uri
7872
7873 [bar\\]
7874 .
7875 <p><a href="/uri">bar\</a></p>
7876 ````````````````````````````````
7877
7878
7879 A [link label] must contain at least one [non-whitespace character]:
7880
7881 ```````````````````````````````` example
7882 []
7883
7884 []: /uri
7885 .
7886 <p>[]</p>
7887 <p>[]: /uri</p>
7888 ````````````````````````````````
7889
7890
7891 ```````````````````````````````` example
7892 [
7893  ]
7894
7895 [
7896  ]: /uri
7897 .
7898 <p>[
7899 ]</p>
7900 <p>[
7901 ]: /uri</p>
7902 ````````````````````````````````
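
The two restrictions illustrated above (no unescaped brackets, and at least one [non-whitespace character]) can be checked in a single scan. The following Python sketch is illustrative only: `is_valid_label_content` is a hypothetical helper that receives the text between the outer brackets, and it does not check any other constraint on link labels.

```python
def is_valid_label_content(content: str) -> bool:
    """Check the text between the outer `[` and `]` of a candidate label."""
    # A label needs at least one non-whitespace character.
    if not content.strip():
        return False
    i = 0
    while i < len(content):
        ch = content[i]
        if ch == "\\" and i + 1 < len(content):
            # Skip the character after a backslash. Only brackets matter
            # here, so this is safe even when the backslash is literal.
            i += 2
            continue
        if ch in "[]":
            return False  # unescaped bracket
        i += 1
    return True
```

For instance, `is_valid_label_content("ref[")` is false, while the escaped form `is_valid_label_content("ref\\[")` is true.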
7903
7904
7905 A [collapsed reference link](@)
7906 consists of a [link label] that [matches] a
7907 [link reference definition] elsewhere in the
7908 document, followed by the string `[]`.
7909 The contents of the first link label are parsed as inlines,
7910 which are used as the link's text.  The link's URI and title are
7911 provided by the matching link reference definition.  Thus,
7912 `[foo][]` is equivalent to `[foo][foo]`.
7913
7914 ```````````````````````````````` example
7915 [foo][]
7916
7917 [foo]: /url "title"
7918 .
7919 <p><a href="/url" title="title">foo</a></p>
7920 ````````````````````````````````
7921
7922
7923 ```````````````````````````````` example
7924 [*foo* bar][]
7925
7926 [*foo* bar]: /url "title"
7927 .
7928 <p><a href="/url" title="title"><em>foo</em> bar</a></p>
7929 ````````````````````````````````
7930
7931
7932 The link labels are case-insensitive:
7933
7934 ```````````````````````````````` example
7935 [Foo][]
7936
7937 [foo]: /url "title"
7938 .
7939 <p><a href="/url" title="title">Foo</a></p>
7940 ````````````````````````````````
7941
7942
7943
7944 As with full reference links, [whitespace] is not
7945 allowed between the two sets of brackets:
7946
7947 ```````````````````````````````` example
7948 [foo] 
7949 []
7950
7951 [foo]: /url "title"
7952 .
7953 <p><a href="/url" title="title">foo</a>
7954 []</p>
7955 ````````````````````````````````
7956
7957
7958 A [shortcut reference link](@)
7959 consists of a [link label] that [matches] a
7960 [link reference definition] elsewhere in the
7961 document and is not followed by `[]` or a link label.
7962 The contents of the first link label are parsed as inlines,
7963 which are used as the link's text.  The link's URI and title
7964 are provided by the matching link reference definition.
7965 Thus, `[foo]` is equivalent to `[foo][]`.
7966
7967 ```````````````````````````````` example
7968 [foo]
7969
7970 [foo]: /url "title"
7971 .
7972 <p><a href="/url" title="title">foo</a></p>
7973 ````````````````````````````````
7974
7975
7976 ```````````````````````````````` example
7977 [*foo* bar]
7978
7979 [*foo* bar]: /url "title"
7980 .
7981 <p><a href="/url" title="title"><em>foo</em> bar</a></p>
7982 ````````````````````````````````
7983
7984
7985 ```````````````````````````````` example
7986 [[*foo* bar]]
7987
7988 [*foo* bar]: /url "title"
7989 .
7990 <p>[<a href="/url" title="title"><em>foo</em> bar</a>]</p>
7991 ````````````````````````````````
7992
7993
7994 ```````````````````````````````` example
7995 [[bar [foo]
7996
7997 [foo]: /url
7998 .
7999 <p>[[bar <a href="/url">foo</a></p>
8000 ````````````````````````````````
8001
8002
8003 The link labels are case-insensitive:
8004
8005 ```````````````````````````````` example
8006 [Foo]
8007
8008 [foo]: /url "title"
8009 .
8010 <p><a href="/url" title="title">Foo</a></p>
8011 ````````````````````````````````
8012
8013
8014 A space after the link text should be preserved:
8015
8016 ```````````````````````````````` example
8017 [foo] bar
8018
8019 [foo]: /url
8020 .
8021 <p><a href="/url">foo</a> bar</p>
8022 ````````````````````````````````
8023
8024
8025 If you just want bracketed text, you can backslash-escape the
8026 opening bracket to avoid links:
8027
8028 ```````````````````````````````` example
8029 \[foo]
8030
8031 [foo]: /url "title"
8032 .
8033 <p>[foo]</p>
8034 ````````````````````````````````
8035
8036
8037 Note that this is a link, because a link label ends with the first
8038 following closing bracket:
8039
8040 ```````````````````````````````` example
8041 [foo*]: /url
8042
8043 *[foo*]
8044 .
8045 <p>*<a href="/url">foo*</a></p>
8046 ````````````````````````````````
8047
8048
8049 Full and collapsed references take precedence over shortcut
8050 references:
8051
8052 ```````````````````````````````` example
8053 [foo][bar]
8054
8055 [foo]: /url1
8056 [bar]: /url2
8057 .
8058 <p><a href="/url2">foo</a></p>
8059 ````````````````````````````````
8060
8061 ```````````````````````````````` example
8062 [foo][]
8063
8064 [foo]: /url1
8065 .
8066 <p><a href="/url1">foo</a></p>
8067 ````````````````````````````````
8068
8069 Inline links also take precedence:
8070
8071 ```````````````````````````````` example
8072 [foo]()
8073
8074 [foo]: /url1
8075 .
8076 <p><a href="">foo</a></p>
8077 ````````````````````````````````
8078
8079 ```````````````````````````````` example
8080 [foo](not a link)
8081
8082 [foo]: /url1
8083 .
8084 <p><a href="/url1">foo</a>(not a link)</p>
8085 ````````````````````````````````
8086
8087 In the following case `[bar][baz]` is parsed as a reference,
8088 `[foo]` as normal text:
8089
8090 ```````````````````````````````` example
8091 [foo][bar][baz]
8092
8093 [baz]: /url
8094 .
8095 <p>[foo]<a href="/url">bar</a></p>
8096 ````````````````````````````````
8097
8098
8099 Here, though, `[foo][bar]` is parsed as a reference, since
8100 `[bar]` is defined:
8101
8102 ```````````````````````````````` example
8103 [foo][bar][baz]
8104
8105 [baz]: /url1
8106 [bar]: /url2
8107 .
8108 <p><a href="/url2">foo</a><a href="/url1">baz</a></p>
8109 ````````````````````````````````
8110
8111
8112 Here `[foo]` is not parsed as a shortcut reference, because it
8113 is followed by a link label (even though `[bar]` is not defined):
8114
8115 ```````````````````````````````` example
8116 [foo][bar][baz]
8117
8118 [baz]: /url1
8119 [foo]: /url2
8120 .
8121 <p>[foo]<a href="/url1">bar</a></p>
8122 ````````````````````````````````
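
The precedence among full, collapsed, and shortcut references shown in the examples above can be summarized in a small decision function. This Python sketch is not the reference algorithm: `resolve_reference` is a hypothetical name, the reference table is assumed to map normalized labels to destinations, and the caller is assumed to have already determined what follows the first bracketed label.

```python
from typing import Dict, Optional


def _norm(label: str) -> str:
    # Simplified label normalization (an assumption of this sketch).
    return " ".join(label.split()).casefold()


def resolve_reference(first: str, second: Optional[str],
                      refmap: Dict[str, str]) -> Optional[str]:
    """Return the destination for a bracketed `first` label, or None.

    second is None -> no `[]` or link label follows (shortcut reference)
    second == ""   -> `[]` follows (collapsed reference)
    otherwise      -> a link label follows (full reference)
    """
    if second:
        # Full reference: only the second label is looked up. If it is
        # undefined there is no fallback to a shortcut on `first`.
        return refmap.get(_norm(second))
    # Collapsed and shortcut references both look up the first label.
    return refmap.get(_norm(first))
```

With `refmap = {"baz": "/url1", "foo": "/url2"}`, `resolve_reference("foo", "bar", refmap)` returns `None` even though `[foo]` is defined, which corresponds to the last example above.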
8123
8124
8125
8126 ## Images
8127
8128 Syntax for images is like the syntax for links, with one
8129 difference. Instead of [link text], we have an
8130 [image description](@).  The rules for this are the
8131 same as for [link text], except that (a) an
8132 image description starts with `![` rather than `[`, and
8133 (b) an image description may contain links.
8134 An image description has inline elements
8135 as its contents.  When an image is rendered to HTML,
8136 this is standardly used as the image's `alt` attribute.
8137
8138 ```````````````````````````````` example
8139 ![foo](/url "title")
8140 .
8141 <p><img src="/url" alt="foo" title="title" /></p>
8142 ````````````````````````````````
8143
8144
8145 ```````````````````````````````` example
8146 ![foo *bar*]
8147
8148 [foo *bar*]: train.jpg "train & tracks"
8149 .
8150 <p><img src="train.jpg" alt="foo bar" title="train &amp; tracks" /></p>
8151 ````````````````````````````````
8152
8153
8154 ```````````````````````````````` example
8155 ![foo ![bar](/url)](/url2)
8156 .
8157 <p><img src="/url2" alt="foo bar" /></p>
8158 ````````````````````````````````
8159
8160
8161 ```````````````````````````````` example
8162 ![foo [bar](/url)](/url2)
8163 .
8164 <p><img src="/url2" alt="foo bar" /></p>
8165 ````````````````````````````````
8166
8167
8168 Though this spec is concerned with parsing, not rendering, it is
8169 recommended that in rendering to HTML, only the plain string content
8170 of the [image description] be used.  Note that in
8171 the above example, the alt attribute's value is `foo bar`, not `foo
8172 [bar](/url)` or `foo <a href="/url">bar</a>`.  Only the plain string
8173 content is rendered, without formatting.
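
A renderer might compute this plain string content with a recursive walk over the parsed [image description]. The Python sketch below assumes a hypothetical tree representation in which a node is either a string or a `(kind, children)` pair; real implementations will have their own node types.

```python
def plain_text(nodes) -> str:
    """Concatenate the text of an inline tree, discarding all formatting.

    Each node is either a str or a (kind, children) tuple, where kind
    is a name such as "emph", "link", or "image" (hypothetical layout).
    """
    parts = []
    for node in nodes:
        if isinstance(node, str):
            parts.append(node)
        else:
            _kind, children = node
            # Links, emphasis, and nested images contribute only
            # their textual content.
            parts.append(plain_text(children))
    return "".join(parts)
```

For the description in the example above, `plain_text(["foo ", ("link", ["bar"])])` gives `"foo bar"`.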
8174
8175 ```````````````````````````````` example
8176 ![foo *bar*][]
8177
8178 [foo *bar*]: train.jpg "train & tracks"
8179 .
8180 <p><img src="train.jpg" alt="foo bar" title="train &amp; tracks" /></p>
8181 ````````````````````````````````
8182
8183
8184 ```````````````````````````````` example
8185 ![foo *bar*][foobar]
8186
8187 [FOOBAR]: train.jpg "train & tracks"
8188 .
8189 <p><img src="train.jpg" alt="foo bar" title="train &amp; tracks" /></p>
8190 ````````````````````````````````
8191
8192
8193 ```````````````````````````````` example
8194 ![foo](train.jpg)
8195 .
8196 <p><img src="train.jpg" alt="foo" /></p>
8197 ````````````````````````````````
8198
8199
8200 ```````````````````````````````` example
8201 My ![foo bar](/path/to/train.jpg  "title"   )
8202 .
8203 <p>My <img src="/path/to/train.jpg" alt="foo bar" title="title" /></p>
8204 ````````````````````````````````
8205
8206
8207 ```````````````````````````````` example
8208 ![foo](<url>)
8209 .
8210 <p><img src="url" alt="foo" /></p>
8211 ````````````````````````````````
8212
8213
8214 ```````````````````````````````` example
8215 ![](/url)
8216 .
8217 <p><img src="/url" alt="" /></p>
8218 ````````````````````````````````
8219
8220
8221 Reference-style:
8222
8223 ```````````````````````````````` example
8224 ![foo][bar]
8225
8226 [bar]: /url
8227 .
8228 <p><img src="/url" alt="foo" /></p>
8229 ````````````````````````````````
8230
8231
8232 ```````````````````````````````` example
8233 ![foo][bar]
8234
8235 [BAR]: /url
8236 .
8237 <p><img src="/url" alt="foo" /></p>
8238 ````````````````````````````````
8239
8240
8241 Collapsed:
8242
8243 ```````````````````````````````` example
8244 ![foo][]
8245
8246 [foo]: /url "title"
8247 .
8248 <p><img src="/url" alt="foo" title="title" /></p>
8249 ````````````````````````````````
8250
8251
8252 ```````````````````````````````` example
8253 ![*foo* bar][]
8254
8255 [*foo* bar]: /url "title"
8256 .
8257 <p><img src="/url" alt="foo bar" title="title" /></p>
8258 ````````````````````````````````
8259
8260
8261 The labels are case-insensitive:
8262
8263 ```````````````````````````````` example
8264 ![Foo][]
8265
8266 [foo]: /url "title"
8267 .
8268 <p><img src="/url" alt="Foo" title="title" /></p>
8269 ````````````````````````````````
8270
8271
8272 As with reference links, [whitespace] is not allowed
8273 between the two sets of brackets:
8274
8275 ```````````````````````````````` example
8276 ![foo] 
8277 []
8278
8279 [foo]: /url "title"
8280 .
8281 <p><img src="/url" alt="foo" title="title" />
8282 []</p>
8283 ````````````````````````````````
8284
8285
8286 Shortcut:
8287
8288 ```````````````````````````````` example
8289 ![foo]
8290
8291 [foo]: /url "title"
8292 .
8293 <p><img src="/url" alt="foo" title="title" /></p>
8294 ````````````````````````````````
8295
8296
8297 ```````````````````````````````` example
8298 ![*foo* bar]
8299
8300 [*foo* bar]: /url "title"
8301 .
8302 <p><img src="/url" alt="foo bar" title="title" /></p>
8303 ````````````````````````````````
8304
8305
8306 Note that link labels cannot contain unescaped brackets:
8307
8308 ```````````````````````````````` example
8309 ![[foo]]
8310
8311 [[foo]]: /url "title"
8312 .
8313 <p>![[foo]]</p>
8314 <p>[[foo]]: /url &quot;title&quot;</p>
8315 ````````````````````````````````
8316
8317
8318 The link labels are case-insensitive:
8319
8320 ```````````````````````````````` example
8321 ![Foo]
8322
8323 [foo]: /url "title"
8324 .
8325 <p><img src="/url" alt="Foo" title="title" /></p>
8326 ````````````````````````````````
8327
8328
8329 If you just want bracketed text, you can backslash-escape the
8330 opening `!` and `[`:
8331
8332 ```````````````````````````````` example
8333 \!\[foo]
8334
8335 [foo]: /url "title"
8336 .
8337 <p>![foo]</p>
8338 ````````````````````````````````
8339
8340
8341 If you want a link after a literal `!`, backslash-escape the
8342 `!`:
8343
8344 ```````````````````````````````` example
8345 \![foo]
8346
8347 [foo]: /url "title"
8348 .
8349 <p>!<a href="/url" title="title">foo</a></p>
8350 ````````````````````````````````
8351
8352
8353 ## Autolinks
8354
8355 [Autolink](@)s are absolute URIs and email addresses inside
8356 `<` and `>`. They are parsed as links, with the URL or email address
8357 as the link label.
8358
8359 A [URI autolink](@) consists of `<`, followed by an
8360 [absolute URI] not containing `<`, followed by `>`.  It is parsed as
8361 a link to the URI, with the URI as the link's label.
8362
8363 An [absolute URI](@),
8364 for these purposes, consists of a [scheme] followed by a colon (`:`)
8365 followed by zero or more characters other than ASCII
8366 [whitespace] and control characters, `<`, and `>`.  If
8367 the URI includes these characters, they must be percent-encoded
8368 (e.g. `%20` for a space).
8369
8370 For purposes of this spec, a [scheme](@) is any sequence
8371 of 2--32 characters beginning with an ASCII letter and followed
8372 by any combination of ASCII letters, digits, or the symbols plus
8373 ("+"), period ("."), or hyphen ("-").
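
The definitions of [scheme], [absolute URI], and [URI autolink] above translate fairly directly into a regular expression. The following Python sketch is one possible reading, not a normative pattern: the name `uri_autolink` is hypothetical, and the excluded "whitespace and control characters" are taken to be U+0000 through U+0020 plus U+007F.

```python
import re

# "<", a scheme of 2-32 characters, ":", then any characters other than
# the excluded ones, then ">". The exact excluded range is an assumption.
_URI_AUTOLINK = re.compile(
    r"<([A-Za-z][A-Za-z0-9+.-]{1,31}:[^\x00-\x20\x7f<>]*)>"
)


def uri_autolink(text: str):
    """Return the URI if `text` is exactly one URI autolink, else None."""
    m = _URI_AUTOLINK.fullmatch(text)
    return m.group(1) if m else None
```

Note that the pattern makes no attempt to check that the scheme is registered or that the rest of the string is a well-formed URI.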
8374
8375 Here are some valid autolinks:
8376
8377 ```````````````````````````````` example
8378 <http://foo.bar.baz>
8379 .
8380 <p><a href="http://foo.bar.baz">http://foo.bar.baz</a></p>
8381 ````````````````````````````````
8382
8383
8384 ```````````````````````````````` example
8385 <http://foo.bar.baz/test?q=hello&id=22&boolean>
8386 .
8387 <p><a href="http://foo.bar.baz/test?q=hello&amp;id=22&amp;boolean">http://foo.bar.baz/test?q=hello&amp;id=22&amp;boolean</a></p>
8388 ````````````````````````````````
8389
8390
8391 ```````````````````````````````` example
8392 <irc://foo.bar:2233/baz>
8393 .
8394 <p><a href="irc://foo.bar:2233/baz">irc://foo.bar:2233/baz</a></p>
8395 ````````````````````````````````
8396
8397
8398 Uppercase is also fine:
8399
8400 ```````````````````````````````` example
8401 <MAILTO:FOO@BAR.BAZ>
8402 .
8403 <p><a href="MAILTO:FOO@BAR.BAZ">MAILTO:FOO@BAR.BAZ</a></p>
8404 ````````````````````````````````
8405
8406
8407 Note that many strings that count as [absolute URIs] for
8408 purposes of this spec are not valid URIs, because their
8409 schemes are not registered or because of other problems
8410 with their syntax:
8411
8412 ```````````````````````````````` example
8413 <a+b+c:d>
8414 .
8415 <p><a href="a+b+c:d">a+b+c:d</a></p>
8416 ````````````````````````````````
8417
8418
8419 ```````````````````````````````` example
8420 <made-up-scheme://foo,bar>
8421 .
8422 <p><a href="made-up-scheme://foo,bar">made-up-scheme://foo,bar</a></p>
8423 ````````````````````````````````
8424
8425
8426 ```````````````````````````````` example
8427 <http://../>
8428 .
8429 <p><a href="http://../">http://../</a></p>
8430 ````````````````````````````````
8431
8432
8433 ```````````````````````````````` example
8434 <localhost:5001/foo>
8435 .
8436 <p><a href="localhost:5001/foo">localhost:5001/foo</a></p>
8437 ````````````````````````````````
8438
8439
8440 Spaces are not allowed in autolinks:
8441
8442 ```````````````````````````````` example
8443 <http://foo.bar/baz bim>
8444 .
8445 <p>&lt;http://foo.bar/baz bim&gt;</p>
8446 ````````````````````````````````
8447
8448
8449 Backslash-escapes do not work inside autolinks:
8450
8451 ```````````````````````````````` example
8452 <http://example.com/\[\>
8453 .
8454 <p><a href="http://example.com/%5C%5B%5C">http://example.com/\[\</a></p>
8455 ````````````````````````````````
8456
8457
8458 An [email autolink](@)
8459 consists of `<`, followed by an [email address],
8460 followed by `>`.  The link's label is the email address,
8461 and the URL is `mailto:` followed by the email address.
8462
8463 An [email address](@),
8464 for these purposes, is anything that matches
8465 the [non-normative regex from the HTML5
8466 spec](https://html.spec.whatwg.org/multipage/forms.html#e-mail-state-(type=email)):
8467
8468     /^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?
8469     (?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/
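
As an illustration, the same expression can be used from Python to recognize an [email autolink] and build its `mailto:` URL. The helper name `email_autolink_href` is hypothetical, and `re.fullmatch` is used in place of the `^` and `$` anchors.

```python
import re

# The regex quoted above, without the surrounding slashes and anchors.
_EMAIL = re.compile(
    r"[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+"
    r"@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?"
    r"(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*"
)


def email_autolink_href(text: str):
    """Return the mailto: URL if `text` is an email autolink, else None."""
    if len(text) >= 2 and text[0] == "<" and text[-1] == ">":
        address = text[1:-1]
        if _EMAIL.fullmatch(address):
            return "mailto:" + address
    return None
```

Since a backslash is not in the set of characters allowed before the `@`, an input such as `<foo\+@bar.example.com>` is rejected.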
8470
8471 Examples of email autolinks:
8472
8473 ```````````````````````````````` example
8474 <foo@bar.example.com>
8475 .
8476 <p><a href="mailto:foo@bar.example.com">foo@bar.example.com</a></p>
8477 ````````````````````````````````
8478
8479
8480 ```````````````````````````````` example
8481 <foo+special@Bar.baz-bar0.com>
8482 .
8483 <p><a href="mailto:foo+special@Bar.baz-bar0.com">foo+special@Bar.baz-bar0.com</a></p>
8484 ````````````````````````````````
8485
8486
8487 Backslash-escapes do not work inside email autolinks:
8488
8489 ```````````````````````````````` example
8490 <foo\+@bar.example.com>
8491 .
8492 <p>&lt;foo+@bar.example.com&gt;</p>
8493 ````````````````````````````````
8494
8495
8496 These are not autolinks:
8497
8498 ```````````````````````````````` example
8499 <>
8500 .
8501 <p>&lt;&gt;</p>
8502 ````````````````````````````````
8503
8504
8505 ```````````````````````````````` example
8506 < http://foo.bar >
8507 .
8508 <p>&lt; http://foo.bar &gt;</p>
8509 ````````````````````````````````
8510
8511
8512 ```````````````````````````````` example
8513 <m:abc>
8514 .
8515 <p>&lt;m:abc&gt;</p>
8516 ````````````````````````````````
8517
8518
8519 ```````````````````````````````` example
8520 <foo.bar.baz>
8521 .
8522 <p>&lt;foo.bar.baz&gt;</p>
8523 ````````````````````````````````
8524
8525
8526 ```````````````````````````````` example
8527 http://example.com
8528 .
8529 <p>http://example.com</p>
8530 ````````````````````````````````
8531
8532
8533 ```````````````````````````````` example
8534 foo@bar.example.com
8535 .
8536 <p>foo@bar.example.com</p>
8537 ````````````````````````````````
8538
8539
8540 ## Raw HTML
8541
8542 Text between `<` and `>` that looks like an HTML tag is parsed as a
8543 raw HTML tag and will be rendered in HTML without escaping.
8544 Tag and attribute names are not limited to current HTML tags,
8545 so custom tags (and even, say, DocBook tags) may be used.
8546
8547 Here is the grammar for tags:
8548
8549 A [tag name](@) consists of an ASCII letter
8550 followed by zero or more ASCII letters, digits, or
8551 hyphens (`-`).
8552
8553 An [attribute](@) consists of [whitespace],
8554 an [attribute name], and an optional
8555 [attribute value specification].
8556
8557 An [attribute name](@)
8558 consists of an ASCII letter, `_`, or `:`, followed by zero or more ASCII
8559 letters, digits, `_`, `.`, `:`, or `-`.  (Note:  This is the XML
8560 specification restricted to ASCII.  HTML5 is laxer.)
8561
8562 An [attribute value specification](@)
8563 consists of optional [whitespace],
8564 a `=` character, optional [whitespace], and an [attribute
8565 value].
8566
8567 An [attribute value](@)
8568 consists of an [unquoted attribute value],
8569 a [single-quoted attribute value], or a [double-quoted attribute value].
8570
8571 An [unquoted attribute value](@)
8572 is a nonempty string of characters not
8573 including spaces, `"`, `'`, `=`, `<`, `>`, or `` ` ``.
8574
8575 A [single-quoted attribute value](@)
8576 consists of `'`, zero or more
8577 characters not including `'`, and a final `'`.
8578
8579 A [double-quoted attribute value](@)
8580 consists of `"`, zero or more
8581 characters not including `"`, and a final `"`.
8582
8583 An [open tag](@) consists of a `<` character, a [tag name],
8584 zero or more [attributes], optional [whitespace], an optional `/`
8585 character, and a `>` character.
8586
8587 A [closing tag](@) consists of the string `</`, a
8588 [tag name], optional [whitespace], and the character `>`.
8589
8590 An [HTML comment](@) consists of `<!--` + *text* + `-->`,
8591 where *text* does not start with `>` or `->`, does not end with `-`,
8592 and does not contain `--`.  (See the
8593 [HTML5 spec](http://www.w3.org/TR/html5/syntax.html#comments).)
8594
8595 A [processing instruction](@)
8596 consists of the string `<?`, a string
8597 of characters not including the string `?>`, and the string
8598 `?>`.
8599
8600 A [declaration](@) consists of the
8601 string `<!`, a name consisting of one or more uppercase ASCII letters,
8602 [whitespace], a string of characters not including the
8603 character `>`, and the character `>`.
8604
8605 A [CDATA section](@) consists of
8606 the string `<![CDATA[`, a string of characters not including the string
8607 `]]>`, and the string `]]>`.
8608
8609 An [HTML tag](@) consists of an [open tag], a [closing tag],
8610 an [HTML comment], a [processing instruction], a [declaration],
8611 or a [CDATA section].
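
The grammar for [open tags] and [closing tags] can be assembled piece by piece into regular expressions. The Python sketch below follows the definitions above one for one, but it is only an illustration: all names are hypothetical, and "spaces" in the definition of an [unquoted attribute value] is read here as any [whitespace] character.

```python
import re

WS = r"[ \t\n\x0b\x0c\r]"                      # one whitespace character
TAG_NAME = r"[A-Za-z][A-Za-z0-9-]*"
ATTR_NAME = r"[A-Za-z_:][A-Za-z0-9_.:-]*"
UNQUOTED = r"""[^ \t\n\x0b\x0c\r"'=<>`]+"""     # assumption: no whitespace
SINGLE_QUOTED = r"'[^']*'"
DOUBLE_QUOTED = r'"[^"]*"'
ATTR_VALUE = f"(?:{UNQUOTED}|{SINGLE_QUOTED}|{DOUBLE_QUOTED})"
ATTR_VALUE_SPEC = f"(?:{WS}*={WS}*{ATTR_VALUE})"
# An attribute starts with whitespace; its value specification is optional.
ATTRIBUTE = f"(?:{WS}+{ATTR_NAME}{ATTR_VALUE_SPEC}?)"

OPEN_TAG = re.compile(f"<{TAG_NAME}{ATTRIBUTE}*{WS}*/?>")
CLOSING_TAG = re.compile(f"</{TAG_NAME}{WS}*>")


def is_open_tag(s: str) -> bool:
    """True if `s` is exactly one open tag under the grammar above."""
    return OPEN_TAG.fullmatch(s) is not None


def is_closing_tag(s: str) -> bool:
    """True if `s` is exactly one closing tag under the grammar above."""
    return CLOSING_TAG.fullmatch(s) is not None
```

Building the pattern from named parts keeps each piece next to the definition it encodes, which makes it easier to compare with the prose.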
8612
8613 Here are some simple open tags:
8614
8615 ```````````````````````````````` example
8616 <a><bab><c2c>
8617 .
8618 <p><a><bab><c2c></p>
8619 ````````````````````````````````
8620
8621
8622 Empty elements:
8623
8624 ```````````````````````````````` example
8625 <a/><b2/>
8626 .
8627 <p><a/><b2/></p>
8628 ````````````````````````````````
8629
8630
8631 [Whitespace] is allowed:
8632
8633 ```````````````````````````````` example
8634 <a  /><b2
8635 data="foo" >
8636 .
8637 <p><a  /><b2
8638 data="foo" ></p>
8639 ````````````````````````````````
8640
8641
8642 With attributes:
8643
8644 ```````````````````````````````` example
8645 <a foo="bar" bam = 'baz <em>"</em>'
8646 _boolean zoop:33=zoop:33 />
8647 .
8648 <p><a foo="bar" bam = 'baz <em>"</em>'
8649 _boolean zoop:33=zoop:33 /></p>
8650 ````````````````````````````````
8651
8652
8653 Custom tag names can be used:
8654
8655 ```````````````````````````````` example
8656 Foo <responsive-image src="foo.jpg" />
8657 .
8658 <p>Foo <responsive-image src="foo.jpg" /></p>
8659 ````````````````````````````````
8660
8661
8662 Illegal tag names, not parsed as HTML:
8663
8664 ```````````````````````````````` example
8665 <33> <__>
8666 .
8667 <p>&lt;33&gt; &lt;__&gt;</p>
8668 ````````````````````````````````
8669
8670
8671 Illegal attribute names:
8672
8673 ```````````````````````````````` example
8674 <a h*#ref="hi">
8675 .
8676 <p>&lt;a h*#ref=&quot;hi&quot;&gt;</p>
8677 ````````````````````````````````
8678
8679
8680 Illegal attribute values:
8681
8682 ```````````````````````````````` example
8683 <a href="hi'> <a href=hi'>
8684 .
8685 <p>&lt;a href=&quot;hi'&gt; &lt;a href=hi'&gt;</p>
8686 ````````````````````````````````
8687
8688
8689 Illegal [whitespace]:
8690
8691 ```````````````````````````````` example
8692 < a><
8693 foo><bar/ >
8694 .
8695 <p>&lt; a&gt;&lt;
8696 foo&gt;&lt;bar/ &gt;</p>
8697 ````````````````````````````````
8698
8699
8700 Missing [whitespace]:
8701
8702 ```````````````````````````````` example
8703 <a href='bar'title=title>
8704 .
8705 <p>&lt;a href='bar'title=title&gt;</p>
8706 ````````````````````````````````
8707
8708
8709 Closing tags:
8710
8711 ```````````````````````````````` example
8712 </a></foo >
8713 .
8714 <p></a></foo ></p>
8715 ````````````````````````````````
8716
8717
8718 Illegal attributes in closing tag:
8719
8720 ```````````````````````````````` example
8721 </a href="foo">
8722 .
8723 <p>&lt;/a href=&quot;foo&quot;&gt;</p>
8724 ````````````````````````````````
8725
8726
8727 Comments:
8728
8729 ```````````````````````````````` example
8730 foo <!-- this is a
8731 comment - with hyphen -->
8732 .
8733 <p>foo <!-- this is a
8734 comment - with hyphen --></p>
8735 ````````````````````````````````
8736
8737
8738 ```````````````````````````````` example
8739 foo <!-- not a comment -- two hyphens -->
8740 .
8741 <p>foo &lt;!-- not a comment -- two hyphens --&gt;</p>
8742 ````````````````````````````````
8743
8744
8745 Not comments:
8746
8747 ```````````````````````````````` example
8748 foo <!--> foo -->
8749
8750 foo <!-- foo--->
8751 .
8752 <p>foo &lt;!--&gt; foo --&gt;</p>
8753 <p>foo &lt;!-- foo---&gt;</p>
8754 ````````````````````````````````
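
The conditions on [HTML comments] are simple enough to test without a regular expression. A minimal Python sketch, with the hypothetical name `is_html_comment`, applies them to a candidate string in the order the definition gives them:

```python
def is_html_comment(s: str) -> bool:
    """True if `s` is exactly one HTML comment as defined in this section."""
    # "<!--" and "-->" must not overlap, so at least 7 characters.
    if len(s) < 7 or not (s.startswith("<!--") and s.endswith("-->")):
        return False
    text = s[4:-3]
    return not (
        text.startswith(">")
        or text.startswith("->")
        or text.endswith("-")
        or "--" in text
    )
```

Applied to the examples above, the comment containing a single hyphen is accepted, while the one containing `--` and the two strings listed under "Not comments" are rejected.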
8755
8756
8757 Processing instructions:
8758
8759 ```````````````````````````````` example
8760 foo <?php echo $a; ?>
8761 .
8762 <p>foo <?php echo $a; ?></p>
8763 ````````````````````````````````
8764
8765
8766 Declarations:
8767
8768 ```````````````````````````````` example
8769 foo <!ELEMENT br EMPTY>
8770 .
8771 <p>foo <!ELEMENT br EMPTY></p>
8772 ````````````````````````````````
8773
8774
8775 CDATA sections:
8776
8777 ```````````````````````````````` example
8778 foo <![CDATA[>&<]]>
8779 .
8780 <p>foo <![CDATA[>&<]]></p>
8781 ````````````````````````````````
8782
8783
8784 Entity and numeric character references are preserved in HTML
8785 attributes:
8786
8787 ```````````````````````````````` example
8788 foo <a href="&ouml;">
8789 .
8790 <p>foo <a href="&ouml;"></p>
8791 ````````````````````````````````
8792
8793
8794 Backslash escapes do not work in HTML attributes:
8795
8796 ```````````````````````````````` example
8797 foo <a href="\*">
8798 .
8799 <p>foo <a href="\*"></p>
8800 ````````````````````````````````
8801
8802
8803 ```````````````````````````````` example
8804 <a href="\"">
8805 .
8806 <p>&lt;a href=&quot;&quot;&quot;&gt;</p>
8807 ````````````````````````````````
8808
8809
8810 ## Hard line breaks
8811
8812 A line break (not in a code span or HTML tag) that is preceded
8813 by two or more spaces and does not occur at the end of a block
8814 is parsed as a [hard line break](@) (rendered
8815 in HTML as a `<br />` tag):
8816
8817 ```````````````````````````````` example
8818 foo  
8819 baz
8820 .
8821 <p>foo<br />
8822 baz</p>
8823 ````````````````````````````````
8824
8825
8826 For a more visible alternative, a backslash before the
8827 [line ending] may be used instead of two spaces:
8828
8829 ```````````````````````````````` example
8830 foo\
8831 baz
8832 .
8833 <p>foo<br />
8834 baz</p>
8835 ````````````````````````````````
8836
8837
8838 More than two spaces can be used:
8839
8840 ```````````````````````````````` example
8841 foo       
8842 baz
8843 .
8844 <p>foo<br />
8845 baz</p>
8846 ````````````````````````````````
8847
8848
8849 Leading spaces at the beginning of the next line are ignored:
8850
8851 ```````````````````````````````` example
8852 foo  
8853      bar
8854 .
8855 <p>foo<br />
8856 bar</p>
8857 ````````````````````````````````
8858
8859
8860 ```````````````````````````````` example
8861 foo\
8862      bar
8863 .
8864 <p>foo<br />
8865 bar</p>
8866 ````````````````````````````````
8867
8868
8869 Line breaks can occur inside emphasis, links, and other constructs
8870 that allow inline content:
8871
8872 ```````````````````````````````` example
8873 *foo  
8874 bar*
8875 .
8876 <p><em>foo<br />
8877 bar</em></p>
8878 ````````````````````````````````
8879
8880
8881 ```````````````````````````````` example
8882 *foo\
8883 bar*
8884 .
8885 <p><em>foo<br />
8886 bar</em></p>
8887 ````````````````````````````````
8888
8889
8890 Line breaks do not occur inside code spans
8891
8892 ```````````````````````````````` example
8893 `code  
8894 span`
8895 .
8896 <p><code>code span</code></p>
8897 ````````````````````````````````
8898
8899
8900 ```````````````````````````````` example
8901 `code\
8902 span`
8903 .
8904 <p><code>code\ span</code></p>
8905 ````````````````````````````````
8906
8907
8908 or HTML tags:
8909
8910 ```````````````````````````````` example
8911 <a href="foo  
8912 bar">
8913 .
8914 <p><a href="foo  
8915 bar"></p>
8916 ````````````````````````````````
8917
8918
8919 ```````````````````````````````` example
8920 <a href="foo\
8921 bar">
8922 .
8923 <p><a href="foo\
8924 bar"></p>
8925 ````````````````````````````````
8926
8927
8928 Hard line breaks are for separating inline content within a block.
8929 Neither syntax for hard line breaks works at the end of a paragraph or
8930 other block element:
8931
8932 ```````````````````````````````` example
8933 foo\
8934 .
8935 <p>foo\</p>
8936 ````````````````````````````````
8937
8938
8939 ```````````````````````````````` example
8940 foo  
8941 .
8942 <p>foo</p>
8943 ````````````````````````````````
8944
8945
8946 ```````````````````````````````` example
8947 ### foo\
8948 .
8949 <h3>foo\</h3>
8950 ````````````````````````````````
8951
8952
8953 ```````````````````````````````` example
8954 ### foo  
8955 .
8956 <h3>foo</h3>
8957 ````````````````````````````````
8958
8959
8960 ## Soft line breaks
8961
8962 A regular line break (not in a code span or HTML tag) that is not
8963 preceded by two or more spaces or a backslash is parsed as a
8964 [softbreak](@).  (A softbreak may be rendered in HTML either as a
8965 [line ending] or as a space. The result will be the same in
8966 browsers. In the examples here, a [line ending] will be used.)
8967
8968 ```````````````````````````````` example
8969 foo
8970 baz
8971 .
8972 <p>foo
8973 baz</p>
8974 ````````````````````````````````
8975
8976
8977 Spaces at the end of the line and beginning of the next line are
8978 removed:
8979
8980 ```````````````````````````````` example
8981 foo 
8982  baz
8983 .
8984 <p>foo
8985 baz</p>
8986 ````````````````````````````````
8987
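As a rough illustration of the hard and soft break rules above, the following Python sketch classifies each line ending in a paragraph's raw text. The function name `split_line_breaks` and the `(kind, text)` tuples are hypothetical and not part of the spec. The sketch deliberately ignores code spans, HTML tags, and backslash escapes, all of which a real parser must handle first.

```python
def split_line_breaks(raw):
    """Split a paragraph's raw text into str / linebreak / softbreak nodes.

    ASSUMPTION: `raw` contains no code spans, raw HTML, or escaped
    backslashes; those cases are out of scope for this sketch.
    """
    lines = raw.split("\n")
    nodes = []
    for i, line in enumerate(lines):
        is_last = i == len(lines) - 1
        if i > 0:
            # Spaces at the beginning of a continuation line are ignored.
            line = line.lstrip(" ")
        if not is_last and line.endswith("\\"):
            # A backslash before the line ending is a hard break.
            kind, line = "linebreak", line[:-1]
        else:
            # Two or more trailing spaces make a hard break; otherwise soft.
            kind = "linebreak" if line.endswith("  ") else "softbreak"
            line = line.rstrip(" ")
        if line:
            nodes.append(("str", line))
        if not is_last:
            # Neither syntax produces a break at the end of the block.
            nodes.append((kind, ""))
    return nodes
```

For instance, `split_line_breaks("foo  \nbaz")` puts a `linebreak` node between the two strings, while `split_line_breaks("foo\\")` keeps the trailing backslash as literal text because no line follows it.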
8988
8989 A conforming parser may render a soft line break in HTML either as a
8990 [line ending] or as a space.
8991
8992 A renderer may also provide an option to render soft line breaks
8993 as hard line breaks.
8994
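The rendering choices described here can be sketched as a single renderer parameter. This is only an illustration: `render_paragraph`, its `softbreak` argument, and the `(kind, text)` node tuples are assumed names, not an API defined by the spec.

```python
from html import escape

def render_paragraph(nodes, softbreak="\n"):
    """Render a flat list of (kind, text) inline nodes as an HTML paragraph.

    `softbreak` is the string emitted for each softbreak node: "\n" for a
    line ending, " " for a space, or "<br />\n" to treat soft breaks as
    hard breaks (the optional renderer behavior).
    """
    out = []
    for kind, text in nodes:
        if kind == "str":
            out.append(escape(text, quote=False))
        elif kind == "softbreak":
            out.append(softbreak)
        elif kind == "linebreak":
            out.append("<br />\n")
        else:
            raise ValueError("unknown inline kind: " + kind)
    return "<p>" + "".join(out) + "</p>"

nodes = [("str", "foo"), ("softbreak", ""), ("str", "baz")]
as_line_ending = render_paragraph(nodes)
as_space = render_paragraph(nodes, softbreak=" ")
as_hard_break = render_paragraph(nodes, softbreak="<br />\n")
```

Keeping the choice in the renderer rather than the parser means the parsed tree is the same in all three cases.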
8995 ## Textual content
8996
8997 Any characters not given an interpretation by the above rules will
8998 be parsed as plain textual content.
8999
9000 ```````````````````````````````` example
9001 hello $.;'there
9002 .
9003 <p>hello $.;'there</p>
9004 ````````````````````````````````
9005
9006
9007 ```````````````````````````````` example
9008 Foo χρῆν
9009 .
9010 <p>Foo χρῆν</p>
9011 ````````````````````````````````
9012
9013
9014 Internal spaces are preserved verbatim:
9015
9016 ```````````````````````````````` example
9017 Multiple     spaces
9018 .
9019 <p>Multiple     spaces</p>
9020 ````````````````````````````````
9021
9022
9023 <!-- END TESTS -->
9024
9025 # Appendix: A parsing strategy
9026
9027 In this appendix we describe some features of the parsing strategy
9028 used in the CommonMark reference implementations.
9029
9030 ## Overview
9031
9032 Parsing has two phases:
9033
9034 1. In the first phase, lines of input are consumed and the block
9035 structure of the document---its division into paragraphs, block quotes,
9036 list items, and so on---is constructed.  Text is assigned to these
9037 blocks but not parsed. Link reference definitions are parsed and a
9038 map of links is constructed.
9039
9040 2. In the second phase, the raw text contents of paragraphs and headings
9041 are parsed into sequences of Markdown inline elements (strings,
9042 code spans, links, emphasis, and so on), using the map of link
9043 references constructed in phase 1.
9044
9045 At each point in processing, the document is represented as a tree of
9046 **blocks**.  The root of the tree is a `document` block.  The `document`
9047 may have any number of other blocks as **children**.  These children
9048 may, in turn, have other blocks as children.  The last child of a block
9049 is normally considered **open**, meaning that subsequent lines of input
9050 can alter its contents.  (Blocks that are not open are **closed**.)
9051 Here, for example, is a possible document tree, with the open blocks
9052 marked by arrows:
9053
9054 ``` tree
9055 -> document
9056   -> block_quote
9057        paragraph
9058          "Lorem ipsum dolor\nsit amet."
9059     -> list (type=bullet tight=true bullet_char=-)
9060          list_item
9061            paragraph
9062              "Qui *quodsi iracundia*"
9063       -> list_item
9064         -> paragraph
9065              "aliquando id"
9066 ```
9067
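One possible in-memory shape for such a tree is sketched below in Python. The class and field names (`Block`, `is_open`, `add_child`, `open_blocks`) are assumptions made for illustration; the reference implementations use their own node types.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Block:
    kind: str
    lines: List[str] = field(default_factory=list)
    children: List["Block"] = field(default_factory=list)
    is_open: bool = True

    def close(self):
        # Closing a block also closes its last child, which is the only
        # child that can still be open.
        if self.children:
            self.children[-1].close()
        self.is_open = False

    def add_child(self, child):
        # Only the last child of a block may be open, so the previous
        # last child is closed before the new one is appended.
        if self.children:
            self.children[-1].close()
        self.children.append(child)
        return child

def open_blocks(root):
    """Return the chain of open blocks from the root down to the deepest."""
    path, node = [], root
    while node is not None and node.is_open:
        path.append(node)
        node = node.children[-1] if node.children else None
    return path

# Rebuild the example tree shown above.
doc = Block("document")
quote = doc.add_child(Block("block_quote"))
quote.add_child(Block("paragraph", ["Lorem ipsum dolor", "sit amet."]))
items = quote.add_child(Block("list"))
first = items.add_child(Block("list_item"))
first.add_child(Block("paragraph", ["Qui *quodsi iracundia*"]))
second = items.add_child(Block("list_item"))
second.add_child(Block("paragraph", ["aliquando id"]))
open_kinds = [b.kind for b in open_blocks(doc)]
```

Here `open_kinds` lists the blocks marked with arrows in the tree above, from `document` down to the last `paragraph`.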
9068 ## Phase 1: block structure
9069
9070 Each line that is processed has an effect on this tree.  The line is
9071 analyzed and, depending on its contents, the document may be altered
9072 in one or more of the following ways:
9073
9074 1. One or more open blocks may be closed.
9075 2. One or more new blocks may be created as children of the
9076    last open block.
9077 3. Text may be added to the last (deepest) open block remaining
9078    on the tree.
9079
9080 Once a line has been incorporated into the tree in this way,
9081 it can be discarded, so input can be read in a stream.
9082
9083 For each line, we follow this procedure:
9084
9085 1. First we iterate through the open blocks, starting with the
9086 root document, and descending through last children down to the last
9087 open block.  Each block imposes a condition that the line must satisfy
9088 if the block is to remain open.  For example, a block quote requires a
9089 `>` character.  A paragraph requires a non-blank line.
9090 In this phase we may match all or just some of the open
9091 blocks.  But we cannot close unmatched blocks yet, because we may have a
9092 [lazy continuation line].
9093
9094 2.  Next, after consuming the continuation markers for existing
9095 blocks, we look for new block starts (e.g. `>` for a block quote).
9096 If we encounter a new block start, we close any blocks unmatched
9097 in step 1 before creating the new block as a child of the last
9098 matched block.
9099
9100 3.  Finally, we look at the remainder of the line (after block
9101 markers like `>`, list markers, and indentation have been consumed).
9102 This is text that can be incorporated into the last open
9103 block (a paragraph, code block, heading, or raw HTML).
9104
9105 Setext headings are formed when we see a line of a paragraph
9106 that is a [setext heading underline].
9107
9108 Link reference definitions are detected when a paragraph is closed;
9109 the accumulated text lines are parsed to see if they begin with
9110 one or more link reference definitions.  Any remainder becomes a
9111 normal paragraph.
9112
9113 We can see how this works by considering how the tree above is
9114 generated by four lines of Markdown:
9115
9116 ``` markdown
9117 > Lorem ipsum dolor
9118 sit amet.
9119 > - Qui *quodsi iracundia*
9120 > - aliquando id
9121 ```
9122
9123 At the outset, our document model is just
9124
9125 ``` tree
9126 -> document
9127 ```
9128
9129 The first line of our text,
9130
9131 ``` markdown
9132 > Lorem ipsum dolor
9133 ```
9134
9135 causes a `block_quote` block to be created as a child of our
9136 open `document` block, and a `paragraph` block as a child of
9137 the `block_quote`.  Then the text is added to the last open
9138 block, the `paragraph`:
9139
9140 ``` tree
9141 -> document
9142   -> block_quote
9143     -> paragraph
9144          "Lorem ipsum dolor"
9145 ```
9146
9147 The next line,
9148
9149 ``` markdown
9150 sit amet.
9151 ```
9152
9153 is a "lazy continuation" of the open `paragraph`, so it gets added
9154 to the paragraph's text:
9155
9156 ``` tree
9157 -> document
9158   -> block_quote
9159     -> paragraph
9160          "Lorem ipsum dolor\nsit amet."
9161 ```
9162
9163 The third line,
9164
9165 ``` markdown
9166 > - Qui *quodsi iracundia*
9167 ```
9168
9169 causes the `paragraph` block to be closed, and a new `list` block
9170 opened as a child of the `block_quote`.  A `list_item` is also
9171 added as a child of the `list`, and a `paragraph` as a child of
9172 the `list_item`.  The text is then added to the new `paragraph`:
9173
9174 ``` tree
9175 -> document
9176   -> block_quote
9177        paragraph
9178          "Lorem ipsum dolor\nsit amet."
9179     -> list (type=bullet tight=true bullet_char=-)
9180       -> list_item
9181         -> paragraph
9182              "Qui *quodsi iracundia*"
9183 ```
9184
9185 The fourth line,
9186
9187 ``` markdown
9188 > - aliquando id
9189 ```
9190
9191 causes the `list_item` (and its child the `paragraph`) to be closed,
9192 and a new `list_item` opened up as a child of the `list`.  A `paragraph`
9193 is added as a child of the new `list_item`, to contain the text.
9194 We thus obtain the final tree:
9195
9196 ``` tree
9197 -> document
9198   -> block_quote
9199        paragraph
9200          "Lorem ipsum dolor\nsit amet."
9201     -> list (type=bullet tight=true bullet_char=-)
9202          list_item
9203            paragraph
9204              "Qui *quodsi iracundia*"
9205       -> list_item
9206         -> paragraph
9207              "aliquando id"
9208 ```
9209
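The three-step procedure can be sketched in Python for a deliberately tiny subset of Markdown. This sketch assumes only block quotes and paragraphs exist (no lists, headings, code blocks, or link reference definitions), so it covers the first two lines of the walkthrough but not the list items. All names (`parse_blocks`, `new_block`, and the dictionary keys) are hypothetical.

```python
def new_block(kind):
    return {"kind": kind, "children": [], "lines": [], "open": True}

def last_open_child(block):
    kids = block["children"]
    return kids[-1] if kids and kids[-1]["open"] else None

def deepest_open(block):
    while last_open_child(block) is not None:
        block = last_open_child(block)
    return block

def close_open_descendants(block):
    child = last_open_child(block)
    while child is not None:
        child["open"] = False
        child = last_open_child(child)

def strip_quote_marker(text):
    """Return the text after a leading `>` marker, or None if there is none."""
    stripped = text.lstrip(" ")
    if not stripped.startswith(">"):
        return None
    rest = stripped[1:]
    return rest[1:] if rest.startswith(" ") else rest

def parse_blocks(source):
    doc = new_block("document")
    for line in source.split("\n"):
        tip = deepest_open(doc)  # deepest open block before this line
        container, rest = doc, line
        # Step 1: match `>` markers of block quotes that are already open.
        while True:
            child = last_open_child(container)
            after = strip_quote_marker(rest)
            if child is None or child["kind"] != "block_quote" or after is None:
                break
            container, rest = child, after
        # Step 2: each remaining `>` starts a new block quote; unmatched
        # blocks are closed only now that a new block start is certain.
        started_new = False
        after = strip_quote_marker(rest)
        while after is not None:
            close_open_descendants(container)
            quote = new_block("block_quote")
            container["children"].append(quote)
            container, rest, started_new = quote, after, True
            after = strip_quote_marker(rest)
        # Step 3: incorporate the remaining text.
        text = rest.strip()
        if not text:
            close_open_descendants(container)  # a blank line ends paragraphs
        elif tip["kind"] == "paragraph" and not started_new:
            tip["lines"].append(text)  # includes lazy continuation lines
        else:
            close_open_descendants(container)
            paragraph = new_block("paragraph")
            paragraph["lines"].append(text)
            container["children"].append(paragraph)
    return doc

doc = parse_blocks("> Lorem ipsum dolor\nsit amet.\n\nnext")
```

In this run the second line has no `>` marker, so the block quote is unmatched in step 1, yet the text still joins the open paragraph as a lazy continuation line. The blank line then closes the block quote, and `next` becomes a new top-level paragraph.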
9210 ## Phase 2: inline structure
9211
9212 Once all of the input has been parsed, all open blocks are closed.
9213
9214 We then "walk the tree," visiting every node, and parse the raw
9215 string contents of paragraphs and headings as inlines.  At this
9216 point we have seen all the link reference definitions, so we can
9217 resolve reference links as we go.
9218
9219 ``` tree
9220 document
9221   block_quote
9222     paragraph
9223       str "Lorem ipsum dolor"
9224       softbreak
9225       str "sit amet."
9226     list (type=bullet tight=true bullet_char=-)
9227       list_item
9228         paragraph
9229           str "Qui "
9230           emph
9231             str "quodsi iracundia"
9232       list_item
9233         paragraph
9234           str "aliquando id"
9235 ```
9236
9237 Notice how the [line ending] in the first paragraph has
9238 been parsed as a `softbreak`, and the asterisks in the first list item
9239 have become an `emph`.
9240
9241 ### An algorithm for parsing nested emphasis and links
9242
9243 By far the trickiest part of inline parsing is handling emphasis,
9244 strong emphasis, links, and images.  This is done using the following
9245 algorithm.
9246
9247 When we're parsing inlines and we hit either
9248
9249 - a run of `*` or `_` characters, or
9250 - a `[` or `![`
9251
9252 we insert a text node with these symbols as its literal content, and we
9253 add a pointer to this text node to the [delimiter stack](@).
9254
9255 The [delimiter stack] is a doubly linked list.  Each
9256 element contains a pointer to a text node, plus information about
9257
9258 - the type of delimiter (`[`, `![`, `*`, `_`),
9259 - the number of delimiters,
9260 - whether the delimiter is "active" (all are active to start), and
9261 - whether the delimiter is a potential opener, a potential closer,
9262   or both (which depends on what sort of characters precede
9263   and follow the delimiters).
9264
9265 When we hit a `]` character, we call the *look for link or image*
9266 procedure (see below).
9267
9268 When we hit the end of the input, we call the *process emphasis*
9269 procedure (see below), with `stack_bottom` = NULL.
9270
9271 #### *look for link or image*
9272
9273 Starting at the top of the delimiter stack, we look backwards
9274 through the stack for an opening `[` or `![` delimiter.
9275
9276 - If we don't find one, we return a literal text node `]`.
9277
9278 - If we do find one, but it's not *active*, we remove the inactive
9279   delimiter from the stack, and return a literal text node `]`.
9280
9281 - If we find one and it's active, then we parse ahead to see if
9282   we have an inline link/image, reference link/image, collapsed reference
9283   link/image, or shortcut reference link/image.
9284
9285   + If we don't, then we remove the opening delimiter from the
9286     delimiter stack and return a literal text node `]`.
9287
9288   + If we do, then
9289
9290     * We return a link or image node whose children are the inlines
9291       after the text node pointed to by the opening delimiter.
9292
9293     * We run *process emphasis* on these inlines, with the `[` opener
9294       as `stack_bottom`.
9295
9296     * We remove the opening delimiter.
9297
9298     * If we have a link (and not an image), we also set all
9299       `[` delimiters before the opening delimiter to *inactive*.  (This
9300       will prevent us from getting links within links.)
9301
9302 #### *process emphasis*
9303
9304 The parameter `stack_bottom` sets a lower bound on how far we
9305 descend in the [delimiter stack].  If it is NULL, we can
9306 go all the way to the bottom.  Otherwise, we stop before
9307 visiting `stack_bottom`.
9308
9309 Let `current_position` point to the element on the [delimiter stack]
9310 just above `stack_bottom` (or the first element if `stack_bottom`
9311 is NULL).
9312
9313 We keep track of the `openers_bottom` for each delimiter
9314 type (`*`, `_`).  Initialize this to `stack_bottom`.
9315
9316 Then we repeat the following until we run out of potential
9317 closers:
9318
9319 - Move `current_position` forward in the delimiter stack (if needed)
9320   until we find the first potential closer with delimiter `*` or `_`.
9321   (This will be the potential closer closest
9322   to the beginning of the input -- the first one in parse order.)
9323
9324 - Now, look back in the stack (staying above `stack_bottom` and
9325   the `openers_bottom` for this delimiter type) for the
9326   first matching potential opener ("matching" means same delimiter).
9327
9328 - If one is found:
9329
9330   + Figure out whether we have emphasis or strong emphasis:
9331     if both closer and opener spans have length >= 2, we have
9332     strong, otherwise regular.
9333
9334   + Insert an emph or strong emph node accordingly, after
9335     the text node corresponding to the opener.
9336
9337   + Remove any delimiters between the opener and closer from
9338     the delimiter stack.
9339
9340   + Remove 1 (for regular emph) or 2 (for strong emph) delimiters
9341     from the opening and closing text nodes.  If they become empty
9342     as a result, remove them and remove the corresponding element
9343     of the delimiter stack.  If the closing node is removed, reset
9344     `current_position` to the next element in the stack.
9345
9346 - If none is found:
9347
9348   + Set `openers_bottom` to the element before `current_position`.
9349     (We know that there are no openers for this kind of closer up to and
9350     including this point, so this puts a lower bound on future searches.)
9351
9352   + If the closer at `current_position` is not a potential opener,
9353     remove it from the delimiter stack (since we know it can't
9354     be a closer either).
9355
9356   + Advance `current_position` to the next element in the stack.
9357
9358 After we're done, we remove all delimiters above `stack_bottom` from the
9359 delimiter stack.
9360
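The *process emphasis* procedure can be sketched in Python as follows. This is a simplified illustration rather than the reference implementation. It uses a plain Python list in place of the doubly linked list, and it handles only `*` and `_` (no links or images). The `tokenize` helper decides opener/closer status by checking for adjacent whitespace only, which is an assumption standing in for the full flanking rules. All names are hypothetical.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass(eq=False)            # eq=False: nodes are compared by identity
class Node:
    kind: str                   # "text", "emph" or "strong"
    text: str = ""
    children: List["Node"] = field(default_factory=list)

@dataclass(eq=False)
class Delim:
    node: Node                  # text node holding the delimiter run
    char: str                   # "*" or "_"
    count: int
    can_open: bool
    can_close: bool

def tokenize(src):
    """ASSUMPTION: opener/closer status looks at adjacent whitespace only."""
    nodes, stack = [], []
    for m in re.finditer(r"\*+|_+|[^*_]+", src):
        node = Node("text", m.group())
        nodes.append(node)
        if node.text[0] in "*_":
            before = src[m.start() - 1] if m.start() > 0 else " "
            after = src[m.end()] if m.end() < len(src) else " "
            stack.append(Delim(node, node.text[0], len(node.text),
                               can_open=not after.isspace(),
                               can_close=not before.isspace()))
    return nodes, stack

def find_opener(stack, pos, start, bottom):
    # Look back from the closer, staying above stack_bottom and openers_bottom.
    for i in range(pos - 1, start - 1, -1):
        if stack[i] is bottom:
            return None
        if stack[i].char == stack[pos].char and stack[i].can_open:
            return i
    return None

def process_emphasis(nodes, stack, stack_bottom: Optional[Delim] = None):
    start = 0 if stack_bottom is None else stack.index(stack_bottom) + 1
    openers_bottom = {"*": stack_bottom, "_": stack_bottom}
    pos = start
    while pos < len(stack):
        closer = stack[pos]
        if closer.char not in "*_" or not closer.can_close:
            pos += 1
            continue
        found = find_opener(stack, pos, start, openers_bottom[closer.char])
        if found is None:
            openers_bottom[closer.char] = stack[pos - 1] if pos > start else stack_bottom
            if closer.can_open:
                pos += 1
            else:
                del stack[pos]          # neither opener nor usable closer
            continue
        opener = stack[found]
        strong = opener.count >= 2 and closer.count >= 2
        used = 2 if strong else 1
        a, b = nodes.index(opener.node), nodes.index(closer.node)
        nodes[a + 1:b] = [Node("strong" if strong else "emph", children=nodes[a + 1:b])]
        del stack[found + 1:pos]        # drop delimiters between the pair
        pos = found + 1                 # the closer now sits here
        for d in (opener, closer):
            d.count -= used
            d.node.text = d.char * d.count
        if opener.count == 0:
            nodes.remove(opener.node)
            del stack[found]
            pos -= 1
        if closer.count == 0:
            nodes.remove(closer.node)
            del stack[pos]              # pos now points at the next element
    del stack[start:]

def render(nodes):
    tags = {"emph": "em", "strong": "strong"}
    return "".join(n.text if n.kind == "text"
                   else "<{0}>{1}</{0}>".format(tags[n.kind], render(n.children))
                   for n in nodes)

def to_html(src):
    nodes, stack = tokenize(src)
    process_emphasis(nodes, stack)
    return render(nodes)
```

With these assumptions, `to_html("*foo **bar** baz*")` yields `<em>foo <strong>bar</strong> baz</em>`, and `to_html("**foo*")` yields `*<em>foo</em>`, since one delimiter is left over in the opening run. Because the sketch follows the wording of this appendix literally, its output for inputs that depend on the full flanking rules may differ from the reference implementations.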