1 ---
2 title: CommonMark Spec
3 author: John MacFarlane
4 version: 0.26
5 date: '2016-07-15'
6 license: '[CC-BY-SA 4.0](http://creativecommons.org/licenses/by-sa/4.0/)'
7 ...
8
9 # Introduction
10
11 ## What is Markdown?
12
13 Markdown is a plain text format for writing structured documents,
14 based on conventions used for indicating formatting in email and
15 usenet posts.  It was developed in 2004 by John Gruber, who wrote
16 the first Markdown-to-HTML converter in Perl, and it soon became
17 ubiquitous.  In the next decade, dozens of implementations were
18 developed in many languages.  Some extended the original
19 Markdown syntax with conventions for footnotes, tables, and
20 other document elements.  Some allowed Markdown documents to be
21 rendered in formats other than HTML.  Websites like Reddit,
22 StackOverflow, and GitHub had millions of people using Markdown.
23 And Markdown started to be used beyond the web, to author books,
24 articles, slide shows, letters, and lecture notes.
25
26 What distinguishes Markdown from many other lightweight markup
27 syntaxes, which are often easier to write, is its readability.
28 As Gruber writes:
29
30 > The overriding design goal for Markdown's formatting syntax is
31 > to make it as readable as possible. The idea is that a
32 > Markdown-formatted document should be publishable as-is, as
33 > plain text, without looking like it's been marked up with tags
34 > or formatting instructions.
35 > (<http://daringfireball.net/projects/markdown/>)
36
37 The point can be illustrated by comparing a sample of
38 [AsciiDoc](http://www.methods.co.nz/asciidoc/) with
39 an equivalent sample of Markdown.  Here is a sample of
40 AsciiDoc from the AsciiDoc manual:
41
42 ```
43 1. List item one.
44 +
45 List item one continued with a second paragraph followed by an
46 Indented block.
47 +
48 .................
49 $ ls *.sh
50 $ mv *.sh ~/tmp
51 .................
52 +
53 List item continued with a third paragraph.
54
55 2. List item two continued with an open block.
56 +
57 --
58 This paragraph is part of the preceding list item.
59
60 a. This list is nested and does not require explicit item
61 continuation.
62 +
63 This paragraph is part of the preceding list item.
64
65 b. List item b.
66
67 This paragraph belongs to item two of the outer list.
68 --
69 ```
70
71 And here is the equivalent in Markdown:
72 ```
73 1.  List item one.
74
75     List item one continued with a second paragraph followed by an
76     Indented block.
77
78         $ ls *.sh
79         $ mv *.sh ~/tmp
80
81     List item continued with a third paragraph.
82
83 2.  List item two continued with an open block.
84
85     This paragraph is part of the preceding list item.
86
87     1. This list is nested and does not require explicit item continuation.
88
89        This paragraph is part of the preceding list item.
90
91     2. List item b.
92
93     This paragraph belongs to item two of the outer list.
94 ```
95
96 The AsciiDoc version is, arguably, easier to write. You don't need
97 to worry about indentation.  But the Markdown version is much easier
98 to read.  The nesting of list items is apparent to the eye in the
99 source, not just in the processed document.
100
101 ## Why is a spec needed?
102
103 John Gruber's [canonical description of Markdown's
104 syntax](http://daringfireball.net/projects/markdown/syntax)
105 does not specify the syntax unambiguously.  Here are some examples of
106 questions it does not answer:
107
108 1.  How much indentation is needed for a sublist?  The spec says that
109     continuation paragraphs need to be indented four spaces, but is
110     not fully explicit about sublists.  It is natural to think that
111     they, too, must be indented four spaces, but `Markdown.pl` does
112     not require that.  This is hardly a "corner case," and divergences
113     between implementations on this issue often lead to surprises for
114     users in real documents. (See [this comment by John
115     Gruber](http://article.gmane.org/gmane.text.markdown.general/1997).)
116
117 2.  Is a blank line needed before a block quote or heading?
118     Most implementations do not require the blank line.  However,
119     this can lead to unexpected results in hard-wrapped text, and
120     also to ambiguities in parsing (note that some implementations
121     put the heading inside the blockquote, while others do not).
122     (John Gruber has also spoken [in favor of requiring the blank
123     lines](http://article.gmane.org/gmane.text.markdown.general/2146).)
124
125 3.  Is a blank line needed before an indented code block?
126     (`Markdown.pl` requires it, but this is not mentioned in the
127     documentation, and some implementations do not require it.)
128
129     ``` markdown
130     paragraph
131         code?
132     ```
133
134 4.  What is the exact rule for determining when list items get
135     wrapped in `<p>` tags?  Can a list be partially "loose" and partially
136     "tight"?  What should we do with a list like this?
137
138     ``` markdown
139     1. one
140
141     2. two
142     3. three
143     ```
144
145     Or this?
146
147     ``` markdown
148     1.  one
149         - a
150
151         - b
152     2.  two
153     ```
154
155     (There are some relevant comments by John Gruber
156     [here](http://article.gmane.org/gmane.text.markdown.general/2554).)
157
158 5.  Can list markers be indented?  Can ordered list markers be right-aligned?
159
160     ``` markdown
161      8. item 1
162      9. item 2
163     10. item 2a
164     ```
165
166 6.  Is this one list with a thematic break in its second item,
167     or two lists separated by a thematic break?
168
169     ``` markdown
170     * a
171     * * * * *
172     * b
173     ```
174
175 7.  When list markers change from numbers to bullets, do we have
176     two lists or one?  (The Markdown syntax description suggests two,
177     but the Perl scripts and many other implementations produce one.)
178
179     ``` markdown
180     1. fee
181     2. fie
182     -  foe
183     -  fum
184     ```
185
186 8.  What are the precedence rules for the markers of inline structure?
187     For example, is the following a valid link, or does the code span
188     take precedence?
189
190     ``` markdown
191     [a backtick (`)](/url) and [another backtick (`)](/url).
192     ```
193
194 9.  What are the precedence rules for markers of emphasis and strong
195     emphasis?  For example, how should the following be parsed?
196
197     ``` markdown
198     *foo *bar* baz*
199     ```
200
201 10. What are the precedence rules between block-level and inline-level
202     structure?  For example, how should the following be parsed?
203
204     ``` markdown
205     - `a long code span can contain a hyphen like this
206       - and it can screw things up`
207     ```
208
209 11. Can list items include section headings?  (`Markdown.pl` does not
210     allow this, but does allow blockquotes to include headings.)
211
212     ``` markdown
213     - # Heading
214     ```
215
216 12. Can list items be empty?
217
218     ``` markdown
219     * a
220     *
221     * b
222     ```
223
224 13. Can link references be defined inside block quotes or list items?
225
226     ``` markdown
227     > Blockquote [foo].
228     >
229     > [foo]: /url
230     ```
231
232 14. If there are multiple definitions for the same reference, which takes
233     precedence?
234
235     ``` markdown
236     [foo]: /url1
237     [foo]: /url2
238
239     [foo][]
240     ```
241
242 In the absence of a spec, early implementers consulted `Markdown.pl`
243 to resolve these ambiguities.  But `Markdown.pl` was quite buggy, and
244 gave manifestly bad results in many cases, so it was not a
245 satisfactory replacement for a spec.
246
247 Because there is no unambiguous spec, implementations have diverged
248 considerably.  As a result, users are often surprised to find that
249 a document that renders one way on one system (say, a GitHub wiki)
250 renders differently on another (say, converting to DocBook using
251 pandoc).  To make matters worse, because nothing in Markdown counts
252 as a "syntax error," the divergence often isn't discovered right away.
253
254 ## About this document
255
256 This document attempts to specify Markdown syntax unambiguously.
257 It contains many examples with side-by-side Markdown and
258 HTML.  These are intended to double as conformance tests.  An
259 accompanying script `spec_tests.py` can be used to run the tests
260 against any Markdown program:
261
262     python test/spec_tests.py --spec spec.txt --program PROGRAM
263
264 Since this document describes how Markdown is to be parsed into
265 an abstract syntax tree, it would have made sense to use an abstract
266 representation of the syntax tree instead of HTML.  But HTML is capable
267 of representing the structural distinctions we need to make, and the
268 choice of HTML for the tests makes it possible to run the tests against
269 an implementation without writing an abstract syntax tree renderer.
270
271 This document is generated from a text file, `spec.txt`, written
272 in Markdown with a small extension for the side-by-side tests.
273 The script `tools/makespec.py` can be used to convert `spec.txt` into
274 HTML or CommonMark (which can then be converted into other formats).
275
276 In the examples, the `→` character is used to represent tabs.
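
The side-by-side examples can be pulled out of `spec.txt` with a few lines of Python. The following is a minimal sketch, not the logic of `spec_tests.py` itself. The function name `extract_examples` is hypothetical, and the layout it assumes is the one visible in this file: a line of 32 backticks followed by ` example`, the Markdown source, a line containing only `.`, the expected HTML, and a closing line of 32 backticks.

```python
FENCE = "`" * 32  # assumed fence length, taken from the examples in this file


def extract_examples(spec_text):
    """Return a list of (markdown, html) pairs found in the spec text."""
    examples = []
    state = "text"
    markdown_lines, html_lines = [], []
    for line in spec_text.split("\n"):
        if state == "text":
            if line == FENCE + " example":
                state = "markdown"
                markdown_lines, html_lines = [], []
        elif state == "markdown":
            if line == ".":
                state = "html"
            else:
                markdown_lines.append(line)
        elif state == "html":
            if line == FENCE:
                # The arrow character stands for a tab in the examples.
                markdown = "".join(l + "\n" for l in markdown_lines)
                html = "".join(l + "\n" for l in html_lines)
                examples.append((markdown.replace("→", "\t"),
                                 html.replace("→", "\t")))
                state = "text"
            else:
                html_lines.append(line)
    return examples
```

Each pair can then be fed to a Markdown program and the program's output compared with the expected HTML.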
277
278 # Preliminaries
279
280 ## Characters and lines
281
282 Any sequence of [characters] is a valid CommonMark
283 document.
284
285 A [character](@) is a Unicode code point.  Although some
286 code points (for example, combining accents) do not correspond to
287 characters in an intuitive sense, all code points count as characters
288 for purposes of this spec.
289
290 This spec does not specify an encoding; it thinks of lines as composed
291 of [characters] rather than bytes.  A conforming parser may be limited
292 to a certain encoding.
293
294 A [line](@) is a sequence of zero or more [characters]
295 other than newline (`U+000A`) or carriage return (`U+000D`),
296 followed by a [line ending] or by the end of file.
297
298 A [line ending](@) is a newline (`U+000A`), a carriage return
299 (`U+000D`) not followed by a newline, or a carriage return and a
300 following newline.
301
302 A line containing no characters, or a line containing only spaces
303 (`U+0020`) or tabs (`U+0009`), is called a [blank line](@).
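
The definitions of line, line ending, and blank line translate directly into code. This is a minimal sketch under those definitions; the names `split_lines` and `is_blank_line` are hypothetical, and dropping the empty string that follows a final line ending is a choice made here, not something the definitions require.

```python
import re

# A carriage return followed by a newline counts as one line ending,
# so that alternative must be tried first.
LINE_ENDING = re.compile(r"\r\n|\n|\r")


def split_lines(text):
    """Split text into lines, without their line endings."""
    lines = LINE_ENDING.split(text)
    # A final line ending does not start another line.
    if lines and lines[-1] == "":
        lines.pop()
    return lines


def is_blank_line(line):
    """True if the line has no characters, or only spaces and tabs."""
    return all(ch in " \t" for ch in line)
```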
304
305 The following definitions of character classes will be used in this spec:
306
307 A [whitespace character](@) is a space
308 (`U+0020`), tab (`U+0009`), newline (`U+000A`), line tabulation (`U+000B`),
309 form feed (`U+000C`), or carriage return (`U+000D`).
310
311 [Whitespace](@) is a sequence of one or more [whitespace
312 characters].
313
314 A [Unicode whitespace character](@) is
315 any code point in the Unicode `Zs` class, or a tab (`U+0009`),
316 carriage return (`U+000D`), newline (`U+000A`), or form feed
317 (`U+000C`).
318
319 [Unicode whitespace](@) is a sequence of one
320 or more [Unicode whitespace characters].
321
322 A [space](@) is `U+0020`.
323
324 A [non-whitespace character](@) is any character
325 that is not a [whitespace character].
326
327 An [ASCII punctuation character](@)
328 is `!`, `"`, `#`, `$`, `%`, `&`, `'`, `(`, `)`,
329 `*`, `+`, `,`, `-`, `.`, `/`, `:`, `;`, `<`, `=`, `>`, `?`, `@`,
330 `[`, `\`, `]`, `^`, `_`, `` ` ``, `{`, `|`, `}`, or `~`.
331
332 A [punctuation character](@) is an [ASCII
333 punctuation character] or anything in
334 the Unicode classes `Pc`, `Pd`, `Pe`, `Pf`, `Pi`, `Po`, or `Ps`.
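
The character classes above can be checked with Python's standard `unicodedata` module. This is a sketch only: the function names are hypothetical, each function expects a single-character string, and the results depend on the version of the Unicode database shipped with the Python interpreter.

```python
import unicodedata

ASCII_PUNCTUATION = set("!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~")
PUNCTUATION_CLASSES = {"Pc", "Pd", "Pe", "Pf", "Pi", "Po", "Ps"}


def is_whitespace_char(ch):
    """Space, tab, newline, line tabulation, form feed, or carriage return."""
    return ch in " \t\n\x0b\x0c\r"


def is_unicode_whitespace_char(ch):
    """Any code point in class Zs, or tab, carriage return, newline, form feed."""
    return unicodedata.category(ch) == "Zs" or ch in "\t\r\n\x0c"


def is_punctuation_char(ch):
    """ASCII punctuation, or any code point in a listed Unicode class."""
    return (ch in ASCII_PUNCTUATION
            or unicodedata.category(ch) in PUNCTUATION_CLASSES)
```

Note that the two whitespace classes are not nested: line tabulation (`U+000B`) is a whitespace character but not a Unicode whitespace character, while a no-break space (`U+00A0`) is the reverse.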
335
336 ## Tabs
337
338 Tabs in lines are not expanded to [spaces].  However,
339 in contexts where whitespace helps to define block structure,
340 tabs behave as if they were replaced by spaces with a tab stop
341 of 4 characters.
342
343 Thus, for example, a tab can be used instead of four spaces
344 in an indented code block.  (Note, however, that internal
345 tabs are passed through as literal tabs, not expanded to
346 spaces.)
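
One way to honor this rule without rewriting the line is to compute the indentation column while leaving the text untouched. The following is a minimal sketch; `indent_width` is a hypothetical helper, and it only measures leading spaces and tabs.

```python
def indent_width(line, tab_stop=4):
    """Return the column of the first character that is not a space or tab.

    A tab advances to the next multiple of tab_stop; the line itself
    is not modified, so internal tabs stay literal.
    """
    column = 0
    for ch in line:
        if ch == " ":
            column += 1
        elif ch == "\t":
            column += tab_stop - (column % tab_stop)
        else:
            break
    return column
```

Under this calculation a single tab, two spaces followed by a tab, and four spaces all yield an indentation of four columns.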
347
348 ```````````````````````````````` example
349 →foo→baz→→bim
350 .
351 <pre><code>foo→baz→→bim
352 </code></pre>
353 ````````````````````````````````
354
355 ```````````````````````````````` example
356   →foo→baz→→bim
357 .
358 <pre><code>foo→baz→→bim
359 </code></pre>
360 ````````````````````````````````
361
362 ```````````````````````````````` example
363     a→a
364     ὐ→a
365 .
366 <pre><code>a→a
367 ὐ→a
368 </code></pre>
369 ````````````````````````````````
370
371 In the following example, a continuation paragraph of a list
372 item is indented with a tab; this has exactly the same effect
373 as indentation with four spaces would:
374
375 ```````````````````````````````` example
376   - foo
377
378 →bar
379 .
380 <ul>
381 <li>
382 <p>foo</p>
383 <p>bar</p>
384 </li>
385 </ul>
386 ````````````````````````````````
387
388 ```````````````````````````````` example
389 - foo
390
391 →→bar
392 .
393 <ul>
394 <li>
395 <p>foo</p>
396 <pre><code>  bar
397 </code></pre>
398 </li>
399 </ul>
400 ````````````````````````````````
401
402 Normally the `>` that begins a block quote may be followed
403 optionally by a space, which is not considered part of the
404 content.  In the following case `>` is followed by a tab,
405 which is treated as if it were expanded into spaces.
406 Since one of these spaces is considered part of the
407 delimiter, `foo` is considered to be indented six spaces
408 inside the block quote context, so we get an indented
409 code block starting with two spaces.
410
411 ```````````````````````````````` example
412 >→→foo
413 .
414 <blockquote>
415 <pre><code>  foo
416 </code></pre>
417 </blockquote>
418 ````````````````````````````````
419
420 ```````````````````````````````` example
421 -→→foo
422 .
423 <ul>
424 <li>
425 <pre><code>  foo
426 </code></pre>
427 </li>
428 </ul>
429 ````````````````````````````````
430
431
432 ```````````````````````````````` example
433     foo
434 →bar
435 .
436 <pre><code>foo
437 bar
438 </code></pre>
439 ````````````````````````````````
440
441 ```````````````````````````````` example
442  - foo
443    - bar
444 → - baz
445 .
446 <ul>
447 <li>foo
448 <ul>
449 <li>bar
450 <ul>
451 <li>baz</li>
452 </ul>
453 </li>
454 </ul>
455 </li>
456 </ul>
457 ````````````````````````````````
458
459 ```````````````````````````````` example
460 #→Foo
461 .
462 <h1>Foo</h1>
463 ````````````````````````````````
464
465 ```````````````````````````````` example
466 *→*→*→
467 .
468 <hr />
469 ````````````````````````````````
470
471
472 ## Insecure characters
473
474 For security reasons, the Unicode character `U+0000` must be replaced
475 with the REPLACEMENT CHARACTER (`U+FFFD`).
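
In Python this replacement is a single call. The sketch below assumes the input has already been decoded to a string; `replace_insecure_characters` is a hypothetical name.

```python
def replace_insecure_characters(text):
    """Replace every U+0000 with the REPLACEMENT CHARACTER (U+FFFD)."""
    return text.replace("\x00", "\ufffd")
```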
476
477 # Blocks and inlines
478
479 We can think of a document as a sequence of
480 [blocks](@)---structural elements like paragraphs, block
481 quotations, lists, headings, rules, and code blocks.  Some blocks (like
482 block quotes and list items) contain other blocks; others (like
483 headings and paragraphs) contain [inline](@) content---text,
484 links, emphasized text, images, code, and so on.
485
486 ## Precedence
487
488 Indicators of block structure always take precedence over indicators
489 of inline structure.  So, for example, the following is a list with
490 two items, not a list with one item containing a code span:
491
492 ```````````````````````````````` example
493 - `one
494 - two`
495 .
496 <ul>
497 <li>`one</li>
498 <li>two`</li>
499 </ul>
500 ````````````````````````````````
501
502
503 This means that parsing can proceed in two steps:  first, the block
504 structure of the document can be discerned; second, text lines inside
505 paragraphs, headings, and other block constructs can be parsed for inline
506 structure.  The second step requires information about link reference
507 definitions that will be available only at the end of the first
508 step.  Note that the first step requires processing lines in sequence,
509 but the second can be parallelized, since the inline parsing of
510 one block element does not affect the inline parsing of any other.
511
512 ## Container blocks and leaf blocks
513
514 We can divide blocks into two types:
515 [container block](@)s,
516 which can contain other blocks, and [leaf block](@)s,
517 which cannot.
518
519 # Leaf blocks
520
521 This section describes the different kinds of leaf block that make up a
522 Markdown document.
523
524 ## Thematic breaks
525
526 A line consisting of 0-3 spaces of indentation, followed by a sequence
527 of three or more matching `-`, `_`, or `*` characters, each followed
528 optionally by any number of spaces, forms a
529 [thematic break](@).
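
The rule can be approximated with a regular expression. This is a sketch under stated assumptions, not a reference implementation: `is_thematic_break` is a hypothetical name, the line is passed without its line ending, and tabs are accepted between and after the characters alongside spaces, which goes slightly beyond the wording of the rule but agrees with the tab example in the Tabs section. It tests the line in isolation and does not apply the precedence of setext headings.

```python
import re

# 0-3 spaces, then at least three copies of the same character
# (-, _ or *), each optionally followed by spaces or tabs.
THEMATIC_BREAK = re.compile(r" {0,3}([-_*])[ \t]*(?:\1[ \t]*){2,}")


def is_thematic_break(line):
    """True if the line, taken on its own, forms a thematic break."""
    return THEMATIC_BREAK.fullmatch(line) is not None
```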
530
531 ```````````````````````````````` example
532 ***
533 ---
534 ___
535 .
536 <hr />
537 <hr />
538 <hr />
539 ````````````````````````````````
540
541
542 Wrong characters:
543
544 ```````````````````````````````` example
545 +++
546 .
547 <p>+++</p>
548 ````````````````````````````````
549
550
551 ```````````````````````````````` example
552 ===
553 .
554 <p>===</p>
555 ````````````````````````````````
556
557
558 Not enough characters:
559
560 ```````````````````````````````` example
561 --
562 **
563 __
564 .
565 <p>--
566 **
567 __</p>
568 ````````````````````````````````
569
570
571 One to three spaces of indentation are allowed:
572
573 ```````````````````````````````` example
574  ***
575   ***
576    ***
577 .
578 <hr />
579 <hr />
580 <hr />
581 ````````````````````````````````
582
583
584 Four spaces is too many:
585
586 ```````````````````````````````` example
587     ***
588 .
589 <pre><code>***
590 </code></pre>
591 ````````````````````````````````
592
593
594 ```````````````````````````````` example
595 Foo
596     ***
597 .
598 <p>Foo
599 ***</p>
600 ````````````````````````````````
601
602
603 More than three characters may be used:
604
605 ```````````````````````````````` example
606 _____________________________________
607 .
608 <hr />
609 ````````````````````````````````
610
611
612 Spaces are allowed between the characters:
613
614 ```````````````````````````````` example
615  - - -
616 .
617 <hr />
618 ````````````````````````````````
619
620
621 ```````````````````````````````` example
622  **  * ** * ** * **
623 .
624 <hr />
625 ````````````````````````````````
626
627
628 ```````````````````````````````` example
629 -     -      -      -
630 .
631 <hr />
632 ````````````````````````````````
633
634
635 Spaces are allowed at the end:
636
637 ```````````````````````````````` example
638 - - - -    
639 .
640 <hr />
641 ````````````````````````````````
642
643
644 However, no other characters may occur in the line:
645
646 ```````````````````````````````` example
647 _ _ _ _ a
648
649 a------
650
651 ---a---
652 .
653 <p>_ _ _ _ a</p>
654 <p>a------</p>
655 <p>---a---</p>
656 ````````````````````````````````
657
658
659 It is required that all of the [non-whitespace characters] be the same.
660 So, this is not a thematic break:
661
662 ```````````````````````````````` example
663  *-*
664 .
665 <p><em>-</em></p>
666 ````````````````````````````````
667
668
669 Thematic breaks do not need blank lines before or after:
670
671 ```````````````````````````````` example
672 - foo
673 ***
674 - bar
675 .
676 <ul>
677 <li>foo</li>
678 </ul>
679 <hr />
680 <ul>
681 <li>bar</li>
682 </ul>
683 ````````````````````````````````
684
685
686 Thematic breaks can interrupt a paragraph:
687
688 ```````````````````````````````` example
689 Foo
690 ***
691 bar
692 .
693 <p>Foo</p>
694 <hr />
695 <p>bar</p>
696 ````````````````````````````````
697
698
699 If a line of dashes that meets the above conditions for being a
700 thematic break could also be interpreted as the underline of a [setext
701 heading], the interpretation as a
702 [setext heading] takes precedence. Thus, for example,
703 this is a setext heading, not a paragraph followed by a thematic break:
704
705 ```````````````````````````````` example
706 Foo
707 ---
708 bar
709 .
710 <h2>Foo</h2>
711 <p>bar</p>
712 ````````````````````````````````
713
714
715 When both a thematic break and a list item are possible
716 interpretations of a line, the thematic break takes precedence:
717
718 ```````````````````````````````` example
719 * Foo
720 * * *
721 * Bar
722 .
723 <ul>
724 <li>Foo</li>
725 </ul>
726 <hr />
727 <ul>
728 <li>Bar</li>
729 </ul>
730 ````````````````````````````````
731
732
733 If you want a thematic break in a list item, use a different bullet:
734
735 ```````````````````````````````` example
736 - Foo
737 - * * *
738 .
739 <ul>
740 <li>Foo</li>
741 <li>
742 <hr />
743 </li>
744 </ul>
745 ````````````````````````````````
746
747
748 ## ATX headings
749
750 An [ATX heading](@)
751 consists of a string of characters, parsed as inline content, between an
752 opening sequence of 1--6 unescaped `#` characters and an optional
753 closing sequence of any number of unescaped `#` characters.
754 The opening sequence of `#` characters must be followed by a
755 [space] or by the end of line. The optional closing sequence of `#`s must be
756 preceded by a [space] and may be followed by spaces only.  The opening
757 `#` character may be indented 0-3 spaces.  The raw contents of the
758 heading are stripped of leading and trailing spaces before being parsed
759 as inline content.  The heading level is equal to the number of `#`
760 characters in the opening sequence.
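
The rule can be sketched as a small line-level function. The name `parse_atx_heading` is hypothetical; the function takes a line without its line ending and returns the heading level together with the raw contents, before any inline parsing, so backslash escapes are left in place. Accepting a tab as well as a space after the opening sequence is an assumption based on the tab example in the Tabs section.

```python
import re

# 0-3 spaces, 1-6 '#' characters, then spaces/tabs or the end of the line.
ATX_OPENING = re.compile(r" {0,3}(#{1,6})(?:[ \t]+|$)")
# A run of '#' preceded by a space (or starting the contents) and
# followed only by spaces or tabs.
ATX_CLOSING = re.compile(r"(?:^|[ \t]+)#+[ \t]*$")


def parse_atx_heading(line):
    """Return (level, raw_contents), or None if the line is not an ATX heading."""
    match = ATX_OPENING.match(line)
    if match is None:
        return None
    level = len(match.group(1))
    contents = ATX_CLOSING.sub("", line[match.end():])
    return level, contents.strip(" \t")
```

A backslash before a `#` keeps it out of the closing sequence here simply because the backslash, not a space, precedes the run of `#` characters.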
761
762 Simple headings:
763
764 ```````````````````````````````` example
765 # foo
766 ## foo
767 ### foo
768 #### foo
769 ##### foo
770 ###### foo
771 .
772 <h1>foo</h1>
773 <h2>foo</h2>
774 <h3>foo</h3>
775 <h4>foo</h4>
776 <h5>foo</h5>
777 <h6>foo</h6>
778 ````````````````````````````````
779
780
781 More than six `#` characters is not a heading:
782
783 ```````````````````````````````` example
784 ####### foo
785 .
786 <p>####### foo</p>
787 ````````````````````````````````
788
789
790 At least one space is required between the `#` characters and the
791 heading's contents, unless the heading is empty.  Note that many
792 implementations currently do not require the space.  However, the
793 space was required by the
794 [original ATX implementation](http://www.aaronsw.com/2002/atx/atx.py),
795 and it helps prevent things like the following from being parsed as
796 headings:
797
798 ```````````````````````````````` example
799 #5 bolt
800
801 #hashtag
802 .
803 <p>#5 bolt</p>
804 <p>#hashtag</p>
805 ````````````````````````````````
806
807
808 This is not a heading, because the first `#` is escaped:
809
810 ```````````````````````````````` example
811 \## foo
812 .
813 <p>## foo</p>
814 ````````````````````````````````
815
816
817 Contents are parsed as inlines:
818
819 ```````````````````````````````` example
820 # foo *bar* \*baz\*
821 .
822 <h1>foo <em>bar</em> *baz*</h1>
823 ````````````````````````````````
824
825
826 Leading and trailing blanks are ignored in parsing inline content:
827
828 ```````````````````````````````` example
829 #                  foo                     
830 .
831 <h1>foo</h1>
832 ````````````````````````````````
833
834
835 One to three spaces of indentation are allowed:
836
837 ```````````````````````````````` example
838  ### foo
839   ## foo
840    # foo
841 .
842 <h3>foo</h3>
843 <h2>foo</h2>
844 <h1>foo</h1>
845 ````````````````````````````````
846
847
848 Four spaces are too much:
849
850 ```````````````````````````````` example
851     # foo
852 .
853 <pre><code># foo
854 </code></pre>
855 ````````````````````````````````
856
857
858 ```````````````````````````````` example
859 foo
860     # bar
861 .
862 <p>foo
863 # bar</p>
864 ````````````````````````````````
865
866
867 A closing sequence of `#` characters is optional:
868
869 ```````````````````````````````` example
870 ## foo ##
871   ###   bar    ###
872 .
873 <h2>foo</h2>
874 <h3>bar</h3>
875 ````````````````````````````````
876
877
878 It need not be the same length as the opening sequence:
879
880 ```````````````````````````````` example
881 # foo ##################################
882 ##### foo ##
883 .
884 <h1>foo</h1>
885 <h5>foo</h5>
886 ````````````````````````````````
887
888
889 Spaces are allowed after the closing sequence:
890
891 ```````````````````````````````` example
892 ### foo ###     
893 .
894 <h3>foo</h3>
895 ````````````````````````````````
896
897
898 A sequence of `#` characters with anything but [spaces] following it
899 is not a closing sequence, but counts as part of the contents of the
900 heading:
901
902 ```````````````````````````````` example
903 ### foo ### b
904 .
905 <h3>foo ### b</h3>
906 ````````````````````````````````
907
908
909 The closing sequence must be preceded by a space:
910
911 ```````````````````````````````` example
912 # foo#
913 .
914 <h1>foo#</h1>
915 ````````````````````````````````
916
917
918 Backslash-escaped `#` characters do not count as part
919 of the closing sequence:
920
921 ```````````````````````````````` example
922 ### foo \###
923 ## foo #\##
924 # foo \#
925 .
926 <h3>foo ###</h3>
927 <h2>foo ###</h2>
928 <h1>foo #</h1>
929 ````````````````````````````````
930
931
932 ATX headings need not be separated from surrounding content by blank
933 lines, and they can interrupt paragraphs:
934
935 ```````````````````````````````` example
936 ****
937 ## foo
938 ****
939 .
940 <hr />
941 <h2>foo</h2>
942 <hr />
943 ````````````````````````````````
944
945
946 ```````````````````````````````` example
947 Foo bar
948 # baz
949 Bar foo
950 .
951 <p>Foo bar</p>
952 <h1>baz</h1>
953 <p>Bar foo</p>
954 ````````````````````````````````
955
956
957 ATX headings can be empty:
958
959 ```````````````````````````````` example
960 ## 
961 #
962 ### ###
963 .
964 <h2></h2>
965 <h1></h1>
966 <h3></h3>
967 ````````````````````````````````
968
969
970 ## Setext headings
971
972 A [setext heading](@) consists of one or more
973 lines of text, each containing at least one [non-whitespace
974 character], with no more than 3 spaces indentation, followed by
975 a [setext heading underline].  The lines of text must be such
976 that, were they not followed by the setext heading underline,
977 they would be interpreted as a paragraph:  they cannot be
978 interpretable as a [code fence], [ATX heading][ATX headings],
979 [block quote][block quotes], [thematic break][thematic breaks],
980 [list item][list items], or [HTML block][HTML blocks].
981
982 A [setext heading underline](@) is a sequence of
983 `=` characters or a sequence of `-` characters, with no more than 3
984 spaces indentation and any number of trailing spaces.  If a line
985 containing a single `-` can be interpreted as an
986 empty [list item][list items], it should be interpreted this way
987 and not as a [setext heading underline].
988
989 The heading is a level 1 heading if `=` characters are used in
990 the [setext heading underline], and a level 2 heading if `-`
991 characters are used.  The contents of the heading are the result
992 of parsing the preceding lines of text as CommonMark inline
993 content.
994
995 In general, a setext heading need not be preceded or followed by a
996 blank line.  However, it cannot interrupt a paragraph, so when a
997 setext heading comes after a paragraph, a blank line is needed between
998 them.
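
Recognizing the underline itself is a purely line-level check. The following is a sketch with a hypothetical function name, `setext_underline_level`; it takes a line without its line ending. Whether a matching line actually produces a heading still depends on context: it must follow paragraph text, it must not be a lazy continuation line, and a single `-` may instead be an empty list item.

```python
import re

# 0-3 spaces, a run of '=' or a run of '-', then optional trailing spaces.
SETEXT_UNDERLINE = re.compile(r" {0,3}(=+|-+) *")


def setext_underline_level(line):
    """Return 1 for an '=' underline, 2 for a '-' underline, else None."""
    match = SETEXT_UNDERLINE.fullmatch(line)
    if match is None:
        return None
    return 1 if match.group(1).startswith("=") else 2
```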
999
1000 Simple examples:
1001
1002 ```````````````````````````````` example
1003 Foo *bar*
1004 =========
1005
1006 Foo *bar*
1007 ---------
1008 .
1009 <h1>Foo <em>bar</em></h1>
1010 <h2>Foo <em>bar</em></h2>
1011 ````````````````````````````````
1012
1013
1014 The content of the heading may span more than one line:
1015
1016 ```````````````````````````````` example
1017 Foo *bar
1018 baz*
1019 ====
1020 .
1021 <h1>Foo <em>bar
1022 baz</em></h1>
1023 ````````````````````````````````
1024
1025
1026 The underlining can be any length:
1027
1028 ```````````````````````````````` example
1029 Foo
1030 -------------------------
1031
1032 Foo
1033 =
1034 .
1035 <h2>Foo</h2>
1036 <h1>Foo</h1>
1037 ````````````````````````````````
1038
1039
1040 The heading content can be indented up to three spaces, and need
1041 not line up with the underlining:
1042
1043 ```````````````````````````````` example
1044    Foo
1045 ---
1046
1047   Foo
1048 -----
1049
1050   Foo
1051   ===
1052 .
1053 <h2>Foo</h2>
1054 <h2>Foo</h2>
1055 <h1>Foo</h1>
1056 ````````````````````````````````
1057
1058
1059 Four spaces of indentation is too much:
1060
1061 ```````````````````````````````` example
1062     Foo
1063     ---
1064
1065     Foo
1066 ---
1067 .
1068 <pre><code>Foo
1069 ---
1070
1071 Foo
1072 </code></pre>
1073 <hr />
1074 ````````````````````````````````
1075
1076
1077 The setext heading underline can be indented up to three spaces, and
1078 may have trailing spaces:
1079
1080 ```````````````````````````````` example
1081 Foo
1082    ----      
1083 .
1084 <h2>Foo</h2>
1085 ````````````````````````````````
1086
1087
1088 Four spaces is too much:
1089
1090 ```````````````````````````````` example
1091 Foo
1092     ---
1093 .
1094 <p>Foo
1095 ---</p>
1096 ````````````````````````````````
1097
1098
1099 The setext heading underline cannot contain internal spaces:
1100
1101 ```````````````````````````````` example
1102 Foo
1103 = =
1104
1105 Foo
1106 --- -
1107 .
1108 <p>Foo
1109 = =</p>
1110 <p>Foo</p>
1111 <hr />
1112 ````````````````````````````````
1113
1114
1115 Trailing spaces in the content line do not cause a line break:
1116
1117 ```````````````````````````````` example
1118 Foo  
1119 -----
1120 .
1121 <h2>Foo</h2>
1122 ````````````````````````````````
1123
1124
1125 Nor does a backslash at the end:
1126
1127 ```````````````````````````````` example
1128 Foo\
1129 ----
1130 .
1131 <h2>Foo\</h2>
1132 ````````````````````````````````
1133
1134
1135 Since indicators of block structure take precedence over
1136 indicators of inline structure, the following are setext headings:
1137
1138 ```````````````````````````````` example
1139 `Foo
1140 ----
1141 `
1142
1143 <a title="a lot
1144 ---
1145 of dashes"/>
1146 .
1147 <h2>`Foo</h2>
1148 <p>`</p>
1149 <h2>&lt;a title=&quot;a lot</h2>
1150 <p>of dashes&quot;/&gt;</p>
1151 ````````````````````````````````
1152
1153
1154 The setext heading underline cannot be a [lazy continuation
1155 line] in a list item or block quote:
1156
1157 ```````````````````````````````` example
1158 > Foo
1159 ---
1160 .
1161 <blockquote>
1162 <p>Foo</p>
1163 </blockquote>
1164 <hr />
1165 ````````````````````````````````
1166
1167
1168 ```````````````````````````````` example
1169 > foo
1170 bar
1171 ===
1172 .
1173 <blockquote>
1174 <p>foo
1175 bar
1176 ===</p>
1177 </blockquote>
1178 ````````````````````````````````
1179
1180
1181 ```````````````````````````````` example
1182 - Foo
1183 ---
1184 .
1185 <ul>
1186 <li>Foo</li>
1187 </ul>
1188 <hr />
1189 ````````````````````````````````
1190
1191
1192 A blank line is needed between a paragraph and a following
1193 setext heading, since otherwise the paragraph becomes part
1194 of the heading's content:
1195
1196 ```````````````````````````````` example
1197 Foo
1198 Bar
1199 ---
1200 .
1201 <h2>Foo
1202 Bar</h2>
1203 ````````````````````````````````
1204
1205
1206 But in general a blank line is not required before or after
1207 setext headings:
1208
1209 ```````````````````````````````` example
1210 ---
1211 Foo
1212 ---
1213 Bar
1214 ---
1215 Baz
1216 .
1217 <hr />
1218 <h2>Foo</h2>
1219 <h2>Bar</h2>
1220 <p>Baz</p>
1221 ````````````````````````````````
1222
1223
1224 Setext headings cannot be empty:
1225
1226 ```````````````````````````````` example
1227
1228 ====
1229 .
1230 <p>====</p>
1231 ````````````````````````````````
1232
1233
1234 Setext heading text lines must not be interpretable as block
1235 constructs other than paragraphs.  So, the line of dashes
1236 in these examples gets interpreted as a thematic break:
1237
1238 ```````````````````````````````` example
1239 ---
1240 ---
1241 .
1242 <hr />
1243 <hr />
1244 ````````````````````````````````
1245
1246
1247 ```````````````````````````````` example
1248 - foo
1249 -----
1250 .
1251 <ul>
1252 <li>foo</li>
1253 </ul>
1254 <hr />
1255 ````````````````````````````````
1256
1257
1258 ```````````````````````````````` example
1259     foo
1260 ---
1261 .
1262 <pre><code>foo
1263 </code></pre>
1264 <hr />
1265 ````````````````````````````````
1266
1267
1268 ```````````````````````````````` example
1269 > foo
1270 -----
1271 .
1272 <blockquote>
1273 <p>foo</p>
1274 </blockquote>
1275 <hr />
1276 ````````````````````````````````
1277
1278
1279 If you want a heading with `> foo` as its literal text, you can
1280 use backslash escapes:
1281
1282 ```````````````````````````````` example
1283 \> foo
1284 ------
1285 .
1286 <h2>&gt; foo</h2>
1287 ````````````````````````````````
1288
1289
1290 **Compatibility note:**  Most existing Markdown implementations
1291 do not allow the text of setext headings to span multiple lines.
1292 But there is no consensus about how to interpret
1293
1294 ``` markdown
1295 Foo
1296 bar
1297 ---
1298 baz
1299 ```
1300
1301 One can find four different interpretations:
1302
1303 1. paragraph "Foo", heading "bar", paragraph "baz"
1304 2. paragraph "Foo bar", thematic break, paragraph "baz"
1305 3. paragraph "Foo bar --- baz"
1306 4. heading "Foo bar", paragraph "baz"
1307
1308 We find interpretation 4 most natural, and it also
1309 increases the expressive power of CommonMark by allowing
1310 multiline headings.  Authors who want interpretation 1 can
1311 put a blank line after the first paragraph:
1312
1313 ```````````````````````````````` example
1314 Foo
1315
1316 bar
1317 ---
1318 baz
1319 .
1320 <p>Foo</p>
1321 <h2>bar</h2>
1322 <p>baz</p>
1323 ````````````````````````````````
1324
1325
1326 Authors who want interpretation 2 can put blank lines around
1327 the thematic break,
1328
1329 ```````````````````````````````` example
1330 Foo
1331 bar
1332
1333 ---
1334
1335 baz
1336 .
1337 <p>Foo
1338 bar</p>
1339 <hr />
1340 <p>baz</p>
1341 ````````````````````````````````
1342
1343
1344 or use a thematic break that cannot count as a [setext heading
1345 underline], such as
1346
1347 ```````````````````````````````` example
1348 Foo
1349 bar
1350 * * *
1351 baz
1352 .
1353 <p>Foo
1354 bar</p>
1355 <hr />
1356 <p>baz</p>
1357 ````````````````````````````````
1358
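The overlap between the two constructs can be made concrete with a
small Python sketch.  The patterns below are our own paraphrase of the
definitions of [setext heading underline] and [thematic break] given
elsewhere in this spec, handle spaces only (no tabs), and the name
`classify_line` is purely illustrative.  A line of dashes can play
either role, while `* * *` can only be a thematic break:

```python
import re

# Assumed paraphrase of the spec's definitions; spaces only, no tabs.
SETEXT_UNDERLINE = re.compile(r"^ {0,3}(?:=+|-+) *$")
THEMATIC_BREAK = re.compile(r"^ {0,3}([-_*])(?: *\1){2,} *$")


def classify_line(line):
    """Return the set of roles a line could play after paragraph text."""
    roles = set()
    if SETEXT_UNDERLINE.match(line):
        roles.add("setext underline")
    if THEMATIC_BREAK.match(line):
        roles.add("thematic break")
    return roles
```

When both roles are possible after paragraph text, the setext heading
interpretation wins, which is why `* * *` is the safer choice here.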
1359
1360 Authors who want interpretation 3 can use backslash escapes:
1361
1362 ```````````````````````````````` example
1363 Foo
1364 bar
1365 \---
1366 baz
1367 .
1368 <p>Foo
1369 bar
1370 ---
1371 baz</p>
1372 ````````````````````````````````
1373
1374
1375 ## Indented code blocks
1376
1377 An [indented code block](@) is composed of one or more
1378 [indented chunks] separated by blank lines.
1379 An [indented chunk](@) is a sequence of non-blank lines,
1380 each indented four or more spaces. The contents of the code block are
1381 the literal contents of the lines, including trailing
1382 [line endings], minus four spaces of indentation.
1383 An indented code block has no [info string].
1384
1385 An indented code block cannot interrupt a paragraph, so there must be
1386 a blank line between a paragraph and a following indented code block.
1387 (A blank line is not needed, however, between a code block and a following
1388 paragraph.)
1389
1390 ```````````````````````````````` example
1391     a simple
1392       indented code block
1393 .
1394 <pre><code>a simple
1395   indented code block
1396 </code></pre>
1397 ````````````````````````````````
1398
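As a rough illustration of the definition above, the following Python
sketch collects the content of an indented code block.  It is not a
reference implementation: it assumes that indentation uses spaces only
(no tabs), that the first line is already known to start the block, and
the name `indented_code_content` is hypothetical.

```python
def indented_code_content(lines):
    """Return the content of an indented code block starting at lines[0].

    Simplifying assumptions: spaces only (no tabs), and lines[0] is
    already known to begin an indented chunk.
    """
    block = []
    for line in lines:
        if line.strip() == "":
            # Blank line between chunks: keep any spaces beyond four.
            block.append(line[4:])
        elif line.startswith("    "):
            # Non-blank chunk line: remove exactly four spaces.
            block.append(line[4:])
        else:
            # A non-blank line with less indentation ends the block.
            break
    # Blank lines after the last chunk are not part of the block.
    while block and block[-1].strip() == "":
        block.pop()
    return "".join(text + "\n" for text in block)
```

Applied to the two lines of the example above, this returns
`a simple`, a line ending, two spaces followed by
`indented code block`, and a final line ending.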
1399
1400 If there is any ambiguity between an interpretation of indentation
1401 as a code block and as indicating that material belongs to a [list
1402 item][list items], the list item interpretation takes precedence:
1403
1404 ```````````````````````````````` example
1405   - foo
1406
1407     bar
1408 .
1409 <ul>
1410 <li>
1411 <p>foo</p>
1412 <p>bar</p>
1413 </li>
1414 </ul>
1415 ````````````````````````````````
1416
1417
1418 ```````````````````````````````` example
1419 1.  foo
1420
1421     - bar
1422 .
1423 <ol>
1424 <li>
1425 <p>foo</p>
1426 <ul>
1427 <li>bar</li>
1428 </ul>
1429 </li>
1430 </ol>
1431 ````````````````````````````````
1432
1433
1434
1435 The contents of a code block are literal text, and do not get parsed
1436 as Markdown:
1437
1438 ```````````````````````````````` example
1439     <a/>
1440     *hi*
1441
1442     - one
1443 .
1444 <pre><code>&lt;a/&gt;
1445 *hi*
1446
1447 - one
1448 </code></pre>
1449 ````````````````````````````````
1450
1451
1452 Here we have three chunks separated by blank lines:
1453
1454 ```````````````````````````````` example
1455     chunk1
1456
1457     chunk2
1458   
1459  
1460  
1461     chunk3
1462 .
1463 <pre><code>chunk1
1464
1465 chunk2
1466
1467
1468
1469 chunk3
1470 </code></pre>
1471 ````````````````````````````````
1472
1473
1474 Any initial spaces beyond four will be included in the content, even
1475 in interior blank lines:
1476
1477 ```````````````````````````````` example
1478     chunk1
1479       
1480       chunk2
1481 .
1482 <pre><code>chunk1
1483   
1484   chunk2
1485 </code></pre>
1486 ````````````````````````````````
1487
1488
1489 An indented code block cannot interrupt a paragraph.  (This
1490 allows hanging indents and the like.)
1491
1492 ```````````````````````````````` example
1493 Foo
1494     bar
1495
1496 .
1497 <p>Foo
1498 bar</p>
1499 ````````````````````````````````
1500
1501
1502 However, any non-blank line with fewer than four leading spaces ends
1503 the code block immediately.  So a paragraph may occur immediately
1504 after indented code:
1505
1506 ```````````````````````````````` example
1507     foo
1508 bar
1509 .
1510 <pre><code>foo
1511 </code></pre>
1512 <p>bar</p>
1513 ````````````````````````````````
1514
1515
1516 And indented code can occur immediately before and after other kinds of
1517 blocks:
1518
1519 ```````````````````````````````` example
1520 # Heading
1521     foo
1522 Heading
1523 ------
1524     foo
1525 ----
1526 .
1527 <h1>Heading</h1>
1528 <pre><code>foo
1529 </code></pre>
1530 <h2>Heading</h2>
1531 <pre><code>foo
1532 </code></pre>
1533 <hr />
1534 ````````````````````````````````
1535
1536
1537 The first line can be indented more than four spaces:
1538
1539 ```````````````````````````````` example
1540         foo
1541     bar
1542 .
1543 <pre><code>    foo
1544 bar
1545 </code></pre>
1546 ````````````````````````````````
1547
1548
1549 Blank lines preceding or following an indented code block
1550 are not included in it:
1551
1552 ```````````````````````````````` example
1553
1554     
1555     foo
1556     
1557
1558 .
1559 <pre><code>foo
1560 </code></pre>
1561 ````````````````````````````````
1562
1563
1564 Trailing spaces are included in the code block's content:
1565
1566 ```````````````````````````````` example
1567     foo  
1568 .
1569 <pre><code>foo  
1570 </code></pre>
1571 ````````````````````````````````
1572
1573
1574
1575 ## Fenced code blocks
1576
1577 A [code fence](@) is a sequence
1578 of at least three consecutive backtick characters (`` ` ``) or
1579 tildes (`~`).  (Tildes and backticks cannot be mixed.)
1580 A [fenced code block](@)
1581 begins with a code fence, indented no more than three spaces.
1582
1583 The line with the opening code fence may optionally contain some text
1584 following the code fence; this is trimmed of leading and trailing
1585 spaces and called the [info string](@).
1586 The [info string] may not contain any backtick
1587 characters.  (The reason for this restriction is that otherwise
1588 some inline code would be incorrectly interpreted as the
1589 beginning of a fenced code block.)
1590
1591 The content of the code block consists of all subsequent lines, until
1592 a closing [code fence] of the same type as the code block
1593 began with (backticks or tildes), and with at least as many backticks
1594 or tildes as the opening code fence.  If the leading code fence is
1595 indented N spaces, then up to N spaces of indentation are removed from
1596 each line of the content (if present).  (If a content line is not
1597 indented, it is preserved unchanged.  If it is indented less than N
1598 spaces, all of the indentation is removed.)
1599
1600 The closing code fence may be indented up to three spaces, and may be
1601 followed only by spaces, which are ignored.  If the end of the
1602 containing block (or document) is reached and no closing code fence
1603 has been found, the code block contains all of the lines after the
1604 opening code fence until the end of the containing block (or
1605 document).  (An alternative spec would require backtracking in the
1606 event that a closing code fence is not found.  But this makes parsing
1607 much less efficient, and there seems to be no real down side to the
1608 behavior described here.)
1609
1610 A fenced code block may interrupt a paragraph, and does not require
1611 a blank line either before or after.
1612
1613 The content of a fenced code block is treated as literal text, not parsed
1614 as inlines.  The first word of the [info string] is typically used to
1615 specify the language of the code sample, and rendered in the `class`
1616 attribute of the `code` tag.  However, this spec does not mandate any
1617 particular treatment of the [info string].
1618
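The rules above can be summarized in a short Python sketch.  It is only
an illustration under simplifying assumptions: lines are given without
their line endings, indentation uses spaces only, container blocks are
ignored, and the helper name `parse_fenced_block` is hypothetical.  It
returns the info string, the content lines, and the number of lines
consumed, or `None` if the first line is not an opening code fence.

```python
import re

# Opening fence: up to three spaces, then 3+ backticks or 3+ tildes,
# then an optional info string.
OPEN_FENCE = re.compile(r"^( {0,3})(`{3,}|~{3,})(.*)$")


def parse_fenced_block(lines):
    """Parse a fenced code block that starts at lines[0] (a sketch)."""
    m = OPEN_FENCE.match(lines[0])
    if m is None:
        return None
    indent, fence = len(m.group(1)), m.group(2)
    info = m.group(3).strip(" ")
    if "`" in info:
        # The info string may not contain backticks.
        return None
    # Closing fence: same character, at least as long as the opening
    # fence, indented up to three spaces, followed only by spaces.
    close = re.compile(
        r"^ {0,3}%s{%d,} *$" % (re.escape(fence[0]), len(fence))
    )
    content = []
    for i, line in enumerate(lines[1:], start=1):
        if close.match(line):
            return info, content, i + 1
        # Remove up to `indent` leading spaces from each content line.
        removed = 0
        while removed < indent and line[removed:removed + 1] == " ":
            removed += 1
        content.append(line[removed:])
    # No closing fence: the block runs to the end of the input.
    return info, content, len(lines)
```
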
1619 Here is a simple example with backticks:
1620
1621 ```````````````````````````````` example
1622 ```
1623 <
1624  >
1625 ```
1626 .
1627 <pre><code>&lt;
1628  &gt;
1629 </code></pre>
1630 ````````````````````````````````
1631
1632
1633 With tildes:
1634
1635 ```````````````````````````````` example
1636 ~~~
1637 <
1638  >
1639 ~~~
1640 .
1641 <pre><code>&lt;
1642  &gt;
1643 </code></pre>
1644 ````````````````````````````````
1645
1646
1647 The closing code fence must use the same character as the opening
1648 fence:
1649
1650 ```````````````````````````````` example
1651 ```
1652 aaa
1653 ~~~
1654 ```
1655 .
1656 <pre><code>aaa
1657 ~~~
1658 </code></pre>
1659 ````````````````````````````````
1660
1661
1662 ```````````````````````````````` example
1663 ~~~
1664 aaa
1665 ```
1666 ~~~
1667 .
1668 <pre><code>aaa
1669 ```
1670 </code></pre>
1671 ````````````````````````````````
1672
1673
1674 The closing code fence must be at least as long as the opening fence:
1675
1676 ```````````````````````````````` example
1677 ````
1678 aaa
1679 ```
1680 ``````
1681 .
1682 <pre><code>aaa
1683 ```
1684 </code></pre>
1685 ````````````````````````````````
1686
1687
1688 ```````````````````````````````` example
1689 ~~~~
1690 aaa
1691 ~~~
1692 ~~~~
1693 .
1694 <pre><code>aaa
1695 ~~~
1696 </code></pre>
1697 ````````````````````````````````
1698
1699
1700 Unclosed code blocks are closed by the end of the document
1701 (or the enclosing [block quote][block quotes] or [list item][list items]):
1702
1703 ```````````````````````````````` example
1704 ```
1705 .
1706 <pre><code></code></pre>
1707 ````````````````````````````````
1708
1709
1710 ```````````````````````````````` example
1711 `````
1712
1713 ```
1714 aaa
1715 .
1716 <pre><code>
1717 ```
1718 aaa
1719 </code></pre>
1720 ````````````````````````````````
1721
1722
1723 ```````````````````````````````` example
1724 > ```
1725 > aaa
1726
1727 bbb
1728 .
1729 <blockquote>
1730 <pre><code>aaa
1731 </code></pre>
1732 </blockquote>
1733 <p>bbb</p>
1734 ````````````````````````````````
1735
1736
1737 A code block can have all empty lines as its content:
1738
1739 ```````````````````````````````` example
1740 ```
1741
1742   
1743 ```
1744 .
1745 <pre><code>
1746   
1747 </code></pre>
1748 ````````````````````````````````
1749
1750
1751 A code block can be empty:
1752
1753 ```````````````````````````````` example
1754 ```
1755 ```
1756 .
1757 <pre><code></code></pre>
1758 ````````````````````````````````
1759
1760
1761 Fences can be indented.  If the opening fence is indented,
1762 content lines will have equivalent opening indentation removed,
1763 if present:
1764
1765 ```````````````````````````````` example
1766  ```
1767  aaa
1768 aaa
1769 ```
1770 .
1771 <pre><code>aaa
1772 aaa
1773 </code></pre>
1774 ````````````````````````````````
1775
1776
1777 ```````````````````````````````` example
1778   ```
1779 aaa
1780   aaa
1781 aaa
1782   ```
1783 .
1784 <pre><code>aaa
1785 aaa
1786 aaa
1787 </code></pre>
1788 ````````````````````````````````
1789
1790
1791 ```````````````````````````````` example
1792    ```
1793    aaa
1794     aaa
1795   aaa
1796    ```
1797 .
1798 <pre><code>aaa
1799  aaa
1800 aaa
1801 </code></pre>
1802 ````````````````````````````````
1803
1804
1805 Four spaces of indentation produces an indented code block:
1806
1807 ```````````````````````````````` example
1808     ```
1809     aaa
1810     ```
1811 .
1812 <pre><code>```
1813 aaa
1814 ```
1815 </code></pre>
1816 ````````````````````````````````
1817
1818
1819 Closing fences may be indented by 0-3 spaces, and their indentation
1820 need not match that of the opening fence:
1821
1822 ```````````````````````````````` example
1823 ```
1824 aaa
1825   ```
1826 .
1827 <pre><code>aaa
1828 </code></pre>
1829 ````````````````````````````````
1830
1831
1832 ```````````````````````````````` example
1833    ```
1834 aaa
1835   ```
1836 .
1837 <pre><code>aaa
1838 </code></pre>
1839 ````````````````````````````````
1840
1841
1842 This is not a closing fence, because it is indented 4 spaces:
1843
1844 ```````````````````````````````` example
1845 ```
1846 aaa
1847     ```
1848 .
1849 <pre><code>aaa
1850     ```
1851 </code></pre>
1852 ````````````````````````````````
1853
1854
1855
1856 Code fences (opening and closing) cannot contain internal spaces:
1857
1858 ```````````````````````````````` example
1859 ``` ```
1860 aaa
1861 .
1862 <p><code></code>
1863 aaa</p>
1864 ````````````````````````````````
1865
1866
1867 ```````````````````````````````` example
1868 ~~~~~~
1869 aaa
1870 ~~~ ~~
1871 .
1872 <pre><code>aaa
1873 ~~~ ~~
1874 </code></pre>
1875 ````````````````````````````````
1876
1877
1878 Fenced code blocks can interrupt paragraphs, and can be followed
1879 directly by paragraphs, without a blank line between:
1880
1881 ```````````````````````````````` example
1882 foo
1883 ```
1884 bar
1885 ```
1886 baz
1887 .
1888 <p>foo</p>
1889 <pre><code>bar
1890 </code></pre>
1891 <p>baz</p>
1892 ````````````````````````````````
1893
1894
1895 Other blocks can also occur before and after fenced code blocks
1896 without an intervening blank line:
1897
1898 ```````````````````````````````` example
1899 foo
1900 ---
1901 ~~~
1902 bar
1903 ~~~
1904 # baz
1905 .
1906 <h2>foo</h2>
1907 <pre><code>bar
1908 </code></pre>
1909 <h1>baz</h1>
1910 ````````````````````````````````
1911
1912
1913 An [info string] can be provided after the opening code fence.
1914 Leading and trailing spaces will be stripped, and the first word, prefixed
1915 with `language-`, is used as the value for the `class` attribute of the
1916 `code` element within the enclosing `pre` element.
1917
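A minimal Python sketch of this convention follows.  The function name
`code_class` is hypothetical, and a real renderer would also need to
escape the value before writing it into an HTML attribute, which is
omitted here.

```python
def code_class(info_string):
    """Return the `class` value for an info string, or None if empty."""
    # Strip surrounding spaces, then take the first word.
    words = info_string.strip(" ").split()
    if not words:
        return None
    return "language-" + words[0]
```
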
1918 ```````````````````````````````` example
1919 ```ruby
1920 def foo(x)
1921   return 3
1922 end
1923 ```
1924 .
1925 <pre><code class="language-ruby">def foo(x)
1926   return 3
1927 end
1928 </code></pre>
1929 ````````````````````````````````
1930
1931
1932 ```````````````````````````````` example
1933 ~~~~    ruby startline=3 $%@#$
1934 def foo(x)
1935   return 3
1936 end
1937 ~~~~~~~
1938 .
1939 <pre><code class="language-ruby">def foo(x)
1940   return 3
1941 end
1942 </code></pre>
1943 ````````````````````````````````
1944
1945
1946 ```````````````````````````````` example
1947 ````;
1948 ````
1949 .
1950 <pre><code class="language-;"></code></pre>
1951 ````````````````````````````````
1952
1953
1954 [Info strings] for backtick code blocks cannot contain backticks:
1955
1956 ```````````````````````````````` example
1957 ``` aa ```
1958 foo
1959 .
1960 <p><code>aa</code>
1961 foo</p>
1962 ````````````````````````````````
1963
1964
1965 Closing code fences cannot have [info strings]:
1966
1967 ```````````````````````````````` example
1968 ```
1969 ``` aaa
1970 ```
1971 .
1972 <pre><code>``` aaa
1973 </code></pre>
1974 ````````````````````````````````
1975
1976
1977
1978 ## HTML blocks
1979
1980 An [HTML block](@) is a group of lines that is treated
1981 as raw HTML (and will not be escaped in HTML output).
1982
1983 There are seven kinds of [HTML block], which can be defined
1984 by their start and end conditions.  The block begins with a line that
1985 meets a [start condition](@) (after up to three spaces of
1986 optional indentation).  It ends with the first subsequent line that
1987 meets a matching [end condition](@), or the last line of
1988 the document (or other [container block]), if no line is encountered that meets the
1989 [end condition].  If the first line meets both the [start condition]
1990 and the [end condition], the block will contain just that line.
1991
1992 1.  **Start condition:**  line begins with the string `<script`,
1993 `<pre`, or `<style` (case-insensitive), followed by whitespace,
1994 the string `>`, or the end of the line.\
1995 **End condition:**  line contains an end tag
1996 `</script>`, `</pre>`, or `</style>` (case-insensitive; it
1997 need not match the start tag).
1998
1999 2.  **Start condition:** line begins with the string `<!--`.\
2000 **End condition:**  line contains the string `-->`.
2001
2002 3.  **Start condition:** line begins with the string `<?`.\
2003 **End condition:** line contains the string `?>`.
2004
2005 4.  **Start condition:** line begins with the string `<!`
2006 followed by an uppercase ASCII letter.\
2007 **End condition:** line contains the character `>`.
2008
2009 5.  **Start condition:**  line begins with the string
2010 `<![CDATA[`.\
2011 **End condition:** line contains the string `]]>`.
2012
2013 6.  **Start condition:** line begins with the string `<` or `</`
2014 followed by one of the strings (case-insensitive) `address`,
2015 `article`, `aside`, `base`, `basefont`, `blockquote`, `body`,
2016 `caption`, `center`, `col`, `colgroup`, `dd`, `details`, `dialog`,
2017 `dir`, `div`, `dl`, `dt`, `fieldset`, `figcaption`, `figure`,
2018 `footer`, `form`, `frame`, `frameset`,
2019 `h1`, `h2`, `h3`, `h4`, `h5`, `h6`, `head`, `header`, `hr`,
2020 `html`, `iframe`, `legend`, `li`, `link`, `main`, `menu`, `menuitem`,
2021 `meta`, `nav`, `noframes`, `ol`, `optgroup`, `option`, `p`, `param`,
2022 `section`, `source`, `summary`, `table`, `tbody`, `td`,
2023 `tfoot`, `th`, `thead`, `title`, `tr`, `track`, `ul`, followed
2024 by [whitespace], the end of the line, the string `>`, or
2025 the string `/>`.\
2026 **End condition:** line is followed by a [blank line].
2027
2028 7.  **Start condition:**  line begins with a complete [open tag]
2029 or [closing tag] (with any [tag name] other than `script`,
2030 `style`, or `pre`) followed only by [whitespace]
2031 or the end of the line.\
2032 **End condition:** line is followed by a [blank line].
2033
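The start and end conditions for types 1 to 6 can be sketched in Python
as follows.  This is an illustration only: condition 7 is left out
because it needs the full [open tag] grammar, lines are assumed to be
given without line endings, container blocks are ignored, and the names
`html_block_type` and `html_block_length` are hypothetical.

```python
import re

# Tag names from start condition 6.
BLOCK_TAGS = (
    "address|article|aside|base|basefont|blockquote|body|caption|center|"
    "col|colgroup|dd|details|dialog|dir|div|dl|dt|fieldset|figcaption|"
    "figure|footer|form|frame|frameset|h1|h2|h3|h4|h5|h6|head|header|hr|"
    "html|iframe|legend|li|link|main|menu|menuitem|meta|nav|noframes|ol|"
    "optgroup|option|p|param|section|source|summary|table|tbody|td|tfoot|"
    "th|thead|title|tr|track|ul"
)

# Start conditions 1-6; each allows up to three spaces of indentation.
START_CONDITIONS = [
    (1, re.compile(r"^ {0,3}<(?:script|pre|style)(?:\s|>|$)", re.I)),
    (2, re.compile(r"^ {0,3}<!--")),
    (3, re.compile(r"^ {0,3}<\?")),
    (4, re.compile(r"^ {0,3}<![A-Z]")),
    (5, re.compile(r"^ {0,3}<!\[CDATA\[")),
    (6, re.compile(r"^ {0,3}</?(?:%s)(?:\s|/?>|$)" % BLOCK_TAGS, re.I)),
]

# End conditions 1-5 test the line itself; type 6 looks at the next line.
END_CONDITIONS = {
    1: lambda s: re.search(r"</(?:script|pre|style)>", s, re.I) is not None,
    2: lambda s: "-->" in s,
    3: lambda s: "?>" in s,
    4: lambda s: ">" in s,
    5: lambda s: "]]>" in s,
}


def html_block_type(line):
    """Return the HTML block type (1-6) that `line` starts, or None."""
    for kind, pattern in START_CONDITIONS:
        if pattern.match(line):
            return kind
    return None


def html_block_length(lines):
    """Return how many lines the HTML block starting at lines[0] spans."""
    kind = html_block_type(lines[0])
    if kind is None:
        return 0
    for i, line in enumerate(lines):
        if kind == 6:
            # Type 6 ends with a line that is followed by a blank line.
            if i + 1 < len(lines) and lines[i + 1].strip() == "":
                return i + 1
        elif END_CONDITIONS[kind](line):
            return i + 1
    # No end condition met: the block runs to the end of the input.
    return len(lines)
```
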
2034 All types of [HTML blocks] except type 7 may interrupt
2035 a paragraph.  Blocks of type 7 may not interrupt a paragraph.
2036 (This restriction is intended to prevent unwanted interpretation
2037 of long tags inside a wrapped paragraph as starting HTML blocks.)
2038
2039 Some simple examples follow.  Here are some basic HTML blocks
2040 of type 6:
2041
2042 ```````````````````````````````` example
2043 <table>
2044   <tr>
2045     <td>
2046            hi
2047     </td>
2048   </tr>
2049 </table>
2050
2051 okay.
2052 .
2053 <table>
2054   <tr>
2055     <td>
2056            hi
2057     </td>
2058   </tr>
2059 </table>
2060 <p>okay.</p>
2061 ````````````````````````````````
2062
2063
2064 ```````````````````````````````` example
2065  <div>
2066   *hello*
2067          <foo><a>
2068 .
2069  <div>
2070   *hello*
2071          <foo><a>
2072 ````````````````````````````````
2073
2074
2075 A block can also start with a closing tag:
2076
2077 ```````````````````````````````` example
2078 </div>
2079 *foo*
2080 .
2081 </div>
2082 *foo*
2083 ````````````````````````````````
2084
2085
2086 Here we have two HTML blocks with a Markdown paragraph between them:
2087
2088 ```````````````````````````````` example
2089 <DIV CLASS="foo">
2090
2091 *Markdown*
2092
2093 </DIV>
2094 .
2095 <DIV CLASS="foo">
2096 <p><em>Markdown</em></p>
2097 </DIV>
2098 ````````````````````````````````
2099
2100
2101 The tag on the first line can be partial, as long
2102 as it is split where there would be whitespace:
2103
2104 ```````````````````````````````` example
2105 <div id="foo"
2106   class="bar">
2107 </div>
2108 .
2109 <div id="foo"
2110   class="bar">
2111 </div>
2112 ````````````````````````````````
2113
2114
2115 ```````````````````````````````` example
2116 <div id="foo" class="bar
2117   baz">
2118 </div>
2119 .
2120 <div id="foo" class="bar
2121   baz">
2122 </div>
2123 ````````````````````````````````
2124
2125
2126 An open tag need not be closed:
2127 ```````````````````````````````` example
2128 <div>
2129 *foo*
2130
2131 *bar*
2132 .
2133 <div>
2134 *foo*
2135 <p><em>bar</em></p>
2136 ````````````````````````````````
2137
2138
2139
2140 A partial tag need not even be completed (garbage
2141 in, garbage out):
2142
2143 ```````````````````````````````` example
2144 <div id="foo"
2145 *hi*
2146 .
2147 <div id="foo"
2148 *hi*
2149 ````````````````````````````````
2150
2151
2152 ```````````````````````````````` example
2153 <div class
2154 foo
2155 .
2156 <div class
2157 foo
2158 ````````````````````````````````
2159
2160
2161 The initial tag doesn't even need to be a valid
2162 tag, as long as it starts like one:
2163
2164 ```````````````````````````````` example
2165 <div *???-&&&-<---
2166 *foo*
2167 .
2168 <div *???-&&&-<---
2169 *foo*
2170 ````````````````````````````````
2171
2172
2173 In type 6 blocks, the initial tag need not be on a line by
2174 itself:
2175
2176 ```````````````````````````````` example
2177 <div><a href="bar">*foo*</a></div>
2178 .
2179 <div><a href="bar">*foo*</a></div>
2180 ````````````````````````````````
2181
2182
2183 ```````````````````````````````` example
2184 <table><tr><td>
2185 foo
2186 </td></tr></table>
2187 .
2188 <table><tr><td>
2189 foo
2190 </td></tr></table>
2191 ````````````````````````````````
2192
2193
2194 Everything until the next blank line or end of document
2195 gets included in the HTML block.  So, in the following
2196 example, what looks like a Markdown code block
2197 is actually part of the HTML block, because no blank line
2198 separates it from the opening tag:
2199
2200 ```````````````````````````````` example
2201 <div></div>
2202 ``` c
2203 int x = 33;
2204 ```
2205 .
2206 <div></div>
2207 ``` c
2208 int x = 33;
2209 ```
2210 ````````````````````````````````
2211
2212
2213 To start an [HTML block] with a tag that is *not* in the
2214 list of block-level tags in (6), you must put the tag by
2215 itself on the first line (and it must be complete):
2216
2217 ```````````````````````````````` example
2218 <a href="foo">
2219 *bar*
2220 </a>
2221 .
2222 <a href="foo">
2223 *bar*
2224 </a>
2225 ````````````````````````````````
2226
2227
2228 In type 7 blocks, the [tag name] can be anything:
2229
2230 ```````````````````````````````` example
2231 <Warning>
2232 *bar*
2233 </Warning>
2234 .
2235 <Warning>
2236 *bar*
2237 </Warning>
2238 ````````````````````````````````
2239
2240
2241 ```````````````````````````````` example
2242 <i class="foo">
2243 *bar*
2244 </i>
2245 .
2246 <i class="foo">
2247 *bar*
2248 </i>
2249 ````````````````````````````````
2250
2251
2252 ```````````````````````````````` example
2253 </ins>
2254 *bar*
2255 .
2256 </ins>
2257 *bar*
2258 ````````````````````````````````
2259
2260
2261 These rules are designed to allow us to work with tags that
2262 can function as either block-level or inline-level tags.
2263 The `<del>` tag is a nice example.  We can surround content with
2264 `<del>` tags in three different ways.  In this case, we get a raw
2265 HTML block, because the `<del>` tag is on a line by itself:
2266
2267 ```````````````````````````````` example
2268 <del>
2269 *foo*
2270 </del>
2271 .
2272 <del>
2273 *foo*
2274 </del>
2275 ````````````````````````````````
2276
2277
2278 In this case, we get a raw HTML block that just includes
2279 the `<del>` tag (because it ends with the following blank
2280 line).  So the contents get interpreted as CommonMark:
2281
2282 ```````````````````````````````` example
2283 <del>
2284
2285 *foo*
2286
2287 </del>
2288 .
2289 <del>
2290 <p><em>foo</em></p>
2291 </del>
2292 ````````````````````````````````
2293
2294
2295 Finally, in this case, the `<del>` tags are interpreted
2296 as [raw HTML] *inside* the CommonMark paragraph.  (Because
2297 the tag is not on a line by itself, we get inline HTML
2298 rather than an [HTML block].)
2299
2300 ```````````````````````````````` example
2301 <del>*foo*</del>
2302 .
2303 <p><del><em>foo</em></del></p>
2304 ````````````````````````````````
2305
2306
2307 HTML tags designed to contain literal content
2308 (`script`, `style`, `pre`), comments, processing instructions,
2309 and declarations are treated somewhat differently.
2310 Instead of ending at the first blank line, these blocks
2311 end at the first line containing a corresponding end tag.
2312 As a result, these blocks can contain blank lines:
2313
2314 A pre tag (type 1):
2315
2316 ```````````````````````````````` example
2317 <pre language="haskell"><code>
2318 import Text.HTML.TagSoup
2319
2320 main :: IO ()
2321 main = print $ parseTags tags
2322 </code></pre>
2323 okay
2324 .
2325 <pre language="haskell"><code>
2326 import Text.HTML.TagSoup
2327
2328 main :: IO ()
2329 main = print $ parseTags tags
2330 </code></pre>
2331 <p>okay</p>
2332 ````````````````````````````````
2333
2334
2335 A script tag (type 1):
2336
2337 ```````````````````````````````` example
2338 <script type="text/javascript">
2339 // JavaScript example
2340
2341 document.getElementById("demo").innerHTML = "Hello JavaScript!";
2342 </script>
2343 okay
2344 .
2345 <script type="text/javascript">
2346 // JavaScript example
2347
2348 document.getElementById("demo").innerHTML = "Hello JavaScript!";
2349 </script>
2350 <p>okay</p>
2351 ````````````````````````````````
2352
2353
2354 A style tag (type 1):
2355
2356 ```````````````````````````````` example
2357 <style
2358   type="text/css">
2359 h1 {color:red;}
2360
2361 p {color:blue;}
2362 </style>
2363 okay
2364 .
2365 <style
2366   type="text/css">
2367 h1 {color:red;}
2368
2369 p {color:blue;}
2370 </style>
2371 <p>okay</p>
2372 ````````````````````````````````
2373
2374
2375 If there is no matching end tag, the block will end at the
2376 end of the document (or the enclosing [block quote][block quotes]
2377 or [list item][list items]):
2378
2379 ```````````````````````````````` example
2380 <style
2381   type="text/css">
2382
2383 foo
2384 .
2385 <style
2386   type="text/css">
2387
2388 foo
2389 ````````````````````````````````
2390
2391
2392 ```````````````````````````````` example
2393 > <div>
2394 > foo
2395
2396 bar
2397 .
2398 <blockquote>
2399 <div>
2400 foo
2401 </blockquote>
2402 <p>bar</p>
2403 ````````````````````````````````
2404
2405
2406 ```````````````````````````````` example
2407 - <div>
2408 - foo
2409 .
2410 <ul>
2411 <li>
2412 <div>
2413 </li>
2414 <li>foo</li>
2415 </ul>
2416 ````````````````````````````````
2417
2418
2419 The end tag can occur on the same line as the start tag:
2420
2421 ```````````````````````````````` example
2422 <style>p{color:red;}</style>
2423 *foo*
2424 .
2425 <style>p{color:red;}</style>
2426 <p><em>foo</em></p>
2427 ````````````````````````````````
2428
2429
2430 ```````````````````````````````` example
2431 <!-- foo -->*bar*
2432 *baz*
2433 .
2434 <!-- foo -->*bar*
2435 <p><em>baz</em></p>
2436 ````````````````````````````````
2437
2438
2439 Note that anything on the last line after the
2440 end tag will be included in the [HTML block]:
2441
2442 ```````````````````````````````` example
2443 <script>
2444 foo
2445 </script>1. *bar*
2446 .
2447 <script>
2448 foo
2449 </script>1. *bar*
2450 ````````````````````````````````
2451
2452
2453 A comment (type 2):
2454
2455 ```````````````````````````````` example
2456 <!-- Foo
2457
2458 bar
2459    baz -->
2460 okay
2461 .
2462 <!-- Foo
2463
2464 bar
2465    baz -->
2466 <p>okay</p>
2467 ````````````````````````````````
2468
2469
2470
2471 A processing instruction (type 3):
2472
2473 ```````````````````````````````` example
2474 <?php
2475
2476   echo '>';
2477
2478 ?>
2479 okay
2480 .
2481 <?php
2482
2483   echo '>';
2484
2485 ?>
2486 <p>okay</p>
2487 ````````````````````````````````
2488
2489
2490 A declaration (type 4):
2491
2492 ```````````````````````````````` example
2493 <!DOCTYPE html>
2494 .
2495 <!DOCTYPE html>
2496 ````````````````````````````````
2497
2498
2499 CDATA (type 5):
2500
2501 ```````````````````````````````` example
2502 <![CDATA[
2503 function matchwo(a,b)
2504 {
2505   if (a < b && a < 0) then {
2506     return 1;
2507
2508   } else {
2509
2510     return 0;
2511   }
2512 }
2513 ]]>
2514 okay
2515 .
2516 <![CDATA[
2517 function matchwo(a,b)
2518 {
2519   if (a < b && a < 0) then {
2520     return 1;
2521
2522   } else {
2523
2524     return 0;
2525   }
2526 }
2527 ]]>
2528 <p>okay</p>
2529 ````````````````````````````````
2530
2531
2532 The opening tag can be indented 1-3 spaces, but not 4:
2533
2534 ```````````````````````````````` example
2535   <!-- foo -->
2536
2537     <!-- foo -->
2538 .
2539   <!-- foo -->
2540 <pre><code>&lt;!-- foo --&gt;
2541 </code></pre>
2542 ````````````````````````````````
2543
2544
2545 ```````````````````````````````` example
2546   <div>
2547
2548     <div>
2549 .
2550   <div>
2551 <pre><code>&lt;div&gt;
2552 </code></pre>
2553 ````````````````````````````````
2554
2555
2556 An HTML block of types 1--6 can interrupt a paragraph, and need not be
2557 preceded by a blank line.
2558
2559 ```````````````````````````````` example
2560 Foo
2561 <div>
2562 bar
2563 </div>
2564 .
2565 <p>Foo</p>
2566 <div>
2567 bar
2568 </div>
2569 ````````````````````````````````
2570
2571
2572 However, a following blank line is needed, except at the end of
2573 a document, and except for blocks of types 1--5, above:
2574
2575 ```````````````````````````````` example
2576 <div>
2577 bar
2578 </div>
2579 *foo*
2580 .
2581 <div>
2582 bar
2583 </div>
2584 *foo*
2585 ````````````````````````````````
2586
2587
2588 HTML blocks of type 7 cannot interrupt a paragraph:
2589
2590 ```````````````````````````````` example
2591 Foo
2592 <a href="bar">
2593 baz
2594 .
2595 <p>Foo
2596 <a href="bar">
2597 baz</p>
2598 ````````````````````````````````
2599
2600
2601 This rule differs from John Gruber's original Markdown syntax
2602 specification, which says:
2603
2604 > The only restrictions are that block-level HTML elements —
2605 > e.g. `<div>`, `<table>`, `<pre>`, `<p>`, etc. — must be separated from
2606 > surrounding content by blank lines, and the start and end tags of the
2607 > block should not be indented with tabs or spaces.
2608
2609 In some ways Gruber's rule is more restrictive than the one given
2610 here:
2611
2612 - It requires that an HTML block be preceded by a blank line.
2613 - It does not allow the start tag to be indented.
2614 - It requires a matching end tag, which it also does not allow to
2615   be indented.
2616
2617 Most Markdown implementations (including some of Gruber's own) do not
2618 respect all of these restrictions.
2619
2620 There is one respect, however, in which Gruber's rule is more liberal
2621 than the one given here, since it allows blank lines to occur inside
2622 an HTML block.  There are two reasons for disallowing them here.
2623 First, it removes the need to parse balanced tags, which is
2624 expensive and can require backtracking from the end of the document
2625 if no matching end tag is found. Second, it provides a very simple
2626 and flexible way of including Markdown content inside HTML tags:
2627 simply separate the Markdown from the HTML using blank lines.
2628
2629 Compare:
2630
2631 ```````````````````````````````` example
2632 <div>
2633
2634 *Emphasized* text.
2635
2636 </div>
2637 .
2638 <div>
2639 <p><em>Emphasized</em> text.</p>
2640 </div>
2641 ````````````````````````````````
2642
2643
2644 ```````````````````````````````` example
2645 <div>
2646 *Emphasized* text.
2647 </div>
2648 .
2649 <div>
2650 *Emphasized* text.
2651 </div>
2652 ````````````````````````````````
2653
2654
2655 Some Markdown implementations have adopted a convention of
2656 interpreting content inside tags as Markdown text if the open tag has
2657 the attribute `markdown=1`.  The rule given above seems a simpler and
2658 more elegant way of achieving the same expressive power, which is also
2659 much simpler to parse.
2660
2661 The main potential drawback is that one can no longer paste HTML
2662 blocks into Markdown documents with 100% reliability.  However,
2663 *in most cases* this will work fine, because the blank lines in
2664 HTML are usually followed by HTML block tags.  For example:
2665
2666 ```````````````````````````````` example
2667 <table>
2668
2669 <tr>
2670
2671 <td>
2672 Hi
2673 </td>
2674
2675 </tr>
2676
2677 </table>
2678 .
2679 <table>
2680 <tr>
2681 <td>
2682 Hi
2683 </td>
2684 </tr>
2685 </table>
2686 ````````````````````````````````
2687
2688
2689 There are problems, however, if the inner tags are indented
2690 *and* separated by blank lines, as then they will be interpreted as
2691 an indented code block:
2692
2693 ```````````````````````````````` example
2694 <table>
2695
2696   <tr>
2697
2698     <td>
2699       Hi
2700     </td>
2701
2702   </tr>
2703
2704 </table>
2705 .
2706 <table>
2707   <tr>
2708 <pre><code>&lt;td&gt;
2709   Hi
2710 &lt;/td&gt;
2711 </code></pre>
2712   </tr>
2713 </table>
2714 ````````````````````````````````
2715
2716
2717 Fortunately, blank lines are usually not necessary and can be
2718 deleted.  The exception is inside `<pre>` tags, but as described
2719 above, raw HTML blocks starting with `<pre>` *can* contain blank
2720 lines.
2721
2722 ## Link reference definitions
2723
2724 A [link reference definition](@)
2725 consists of a [link label], indented up to three spaces, followed
2726 by a colon (`:`), optional [whitespace] (including up to one
2727 [line ending]), a [link destination],
2728 optional [whitespace] (including up to one
2729 [line ending]), and an optional [link
2730 title], which if it is present must be separated
2731 from the [link destination] by [whitespace].
2732 No further [non-whitespace characters] may occur on the line.
2733
2734 A [link reference definition]
2735 does not correspond to a structural element of a document.  Instead, it
2736 defines a label which can be used in [reference links]
2737 and reference-style [images] elsewhere in the document.  [Link
2738 reference definitions] can come either before or after the links that use
2739 them.
2740
2741 ```````````````````````````````` example
2742 [foo]: /url "title"
2743
2744 [foo]
2745 .
2746 <p><a href="/url" title="title">foo</a></p>
2747 ````````````````````````````````
2748
2749
2750 ```````````````````````````````` example
2751    [foo]: 
2752       /url  
2753            'the title'  
2754
2755 [foo]
2756 .
2757 <p><a href="/url" title="the title">foo</a></p>
2758 ````````````````````````````````
2759
2760
2761 ```````````````````````````````` example
2762 [Foo*bar\]]:my_(url) 'title (with parens)'
2763
2764 [Foo*bar\]]
2765 .
2766 <p><a href="my_(url)" title="title (with parens)">Foo*bar]</a></p>
2767 ````````````````````````````````
2768
2769
2770 ```````````````````````````````` example
2771 [Foo bar]:
2772 <my%20url>
2773 'title'
2774
2775 [Foo bar]
2776 .
2777 <p><a href="my%20url" title="title">Foo bar</a></p>
2778 ````````````````````````````````
2779
2780
2781 The title may extend over multiple lines:
2782
2783 ```````````````````````````````` example
2784 [foo]: /url '
2785 title
2786 line1
2787 line2
2788 '
2789
2790 [foo]
2791 .
2792 <p><a href="/url" title="
2793 title
2794 line1
2795 line2
2796 ">foo</a></p>
2797 ````````````````````````````````
2798
2799
2800 However, it may not contain a [blank line]:
2801
2802 ```````````````````````````````` example
2803 [foo]: /url 'title
2804
2805 with blank line'
2806
2807 [foo]
2808 .
2809 <p>[foo]: /url 'title</p>
2810 <p>with blank line'</p>
2811 <p>[foo]</p>
2812 ````````````````````````````````
2813
2814
2815 The title may be omitted:
2816
2817 ```````````````````````````````` example
2818 [foo]:
2819 /url
2820
2821 [foo]
2822 .
2823 <p><a href="/url">foo</a></p>
2824 ````````````````````````````````
2825
2826
2827 The link destination may not be omitted:
2828
2829 ```````````````````````````````` example
2830 [foo]:
2831
2832 [foo]
2833 .
2834 <p>[foo]:</p>
2835 <p>[foo]</p>
2836 ````````````````````````````````
2837
2838
2839 Both title and destination can contain backslash escapes
2840 and literal backslashes:
2841
2842 ```````````````````````````````` example
2843 [foo]: /url\bar\*baz "foo\"bar\baz"
2844
2845 [foo]
2846 .
2847 <p><a href="/url%5Cbar*baz" title="foo&quot;bar\baz">foo</a></p>
2848 ````````````````````````````````
2849
2850
2851 A link can come before its corresponding definition:
2852
2853 ```````````````````````````````` example
2854 [foo]
2855
2856 [foo]: url
2857 .
2858 <p><a href="url">foo</a></p>
2859 ````````````````````````````````
2860
2861
2862 If there are several matching definitions, the first one takes
2863 precedence:
2864
2865 ```````````````````````````````` example
2866 [foo]
2867
2868 [foo]: first
2869 [foo]: second
2870 .
2871 <p><a href="first">foo</a></p>
2872 ````````````````````````````````
2873
2874
2875 As noted in the section on [Links], matching of labels is
2876 case-insensitive (see [matches]).
2877
2878 ```````````````````````````````` example
2879 [FOO]: /url
2880
2881 [Foo]
2882 .
2883 <p><a href="/url">Foo</a></p>
2884 ````````````````````````````````
2885
2886
2887 ```````````````````````````````` example
2888 [ΑΓΩ]: /φου
2889
2890 [αγω]
2891 .
2892 <p><a href="/%CF%86%CE%BF%CF%85">αγω</a></p>
2893 ````````````````````````````````
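
The matching and precedence rules above can be sketched in a few lines
of Python.  This is only an illustration, not part of the spec: the
names `normalize_label` and `ReferenceMap` are hypothetical, and the
whitespace collapsing is assumed from the label normalization described
in the section on [Links].

``` python
import re


def normalize_label(label):
    """Collapse whitespace runs to one space and apply Unicode case folding."""
    return re.sub(r"\s+", " ", label.strip()).casefold()


class ReferenceMap:
    """Hypothetical store for link reference definitions (illustration only)."""

    def __init__(self):
        self._definitions = {}

    def define(self, label, destination):
        # setdefault keeps an existing entry, so the first definition wins.
        self._definitions.setdefault(normalize_label(label), destination)

    def lookup(self, label):
        # Returns None when no definition matches the label.
        return self._definitions.get(normalize_label(label))
```

With this sketch, defining `foo` twice and then looking it up yields the
first destination, and a definition for `ΑΓΩ` is found when looking up
`αγω`.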
2894
2895
2896 Here is a link reference definition with no corresponding link.
2897 It contributes nothing to the document.
2898
2899 ```````````````````````````````` example
2900 [foo]: /url
2901 .
2902 ````````````````````````````````
2903
2904
2905 Here is another one:
2906
2907 ```````````````````````````````` example
2908 [
2909 foo
2910 ]: /url
2911 bar
2912 .
2913 <p>bar</p>
2914 ````````````````````````````````
2915
2916
2917 This is not a link reference definition, because there are
2918 [non-whitespace characters] after the title:
2919
2920 ```````````````````````````````` example
2921 [foo]: /url "title" ok
2922 .
2923 <p>[foo]: /url &quot;title&quot; ok</p>
2924 ````````````````````````````````
2925
2926
2927 This is a link reference definition, but it has no title:
2928
2929 ```````````````````````````````` example
2930 [foo]: /url
2931 "title" ok
2932 .
2933 <p>&quot;title&quot; ok</p>
2934 ````````````````````````````````
2935
2936
2937 This is not a link reference definition, because it is indented
2938 four spaces:
2939
2940 ```````````````````````````````` example
2941     [foo]: /url "title"
2942
2943 [foo]
2944 .
2945 <pre><code>[foo]: /url &quot;title&quot;
2946 </code></pre>
2947 <p>[foo]</p>
2948 ````````````````````````````````
2949
2950
2951 This is not a link reference definition, because it occurs inside
2952 a code block:
2953
2954 ```````````````````````````````` example
2955 ```
2956 [foo]: /url
2957 ```
2958
2959 [foo]
2960 .
2961 <pre><code>[foo]: /url
2962 </code></pre>
2963 <p>[foo]</p>
2964 ````````````````````````````````
2965
2966
2967 A [link reference definition] cannot interrupt a paragraph.
2968
2969 ```````````````````````````````` example
2970 Foo
2971 [bar]: /baz
2972
2973 [bar]
2974 .
2975 <p>Foo
2976 [bar]: /baz</p>
2977 <p>[bar]</p>
2978 ````````````````````````````````
2979
2980
2981 However, it can directly follow other block elements, such as headings
2982 and thematic breaks, and it need not be followed by a blank line.
2983
2984 ```````````````````````````````` example
2985 # [Foo]
2986 [foo]: /url
2987 > bar
2988 .
2989 <h1><a href="/url">Foo</a></h1>
2990 <blockquote>
2991 <p>bar</p>
2992 </blockquote>
2993 ````````````````````````````````
2994
2995
2996 Several [link reference definitions]
2997 can occur one after another, without intervening blank lines.
2998
2999 ```````````````````````````````` example
3000 [foo]: /foo-url "foo"
3001 [bar]: /bar-url
3002   "bar"
3003 [baz]: /baz-url
3004
3005 [foo],
3006 [bar],
3007 [baz]
3008 .
3009 <p><a href="/foo-url" title="foo">foo</a>,
3010 <a href="/bar-url" title="bar">bar</a>,
3011 <a href="/baz-url">baz</a></p>
3012 ````````````````````````````````
3013
3014
3015 [Link reference definitions] can occur
3016 inside block containers, like lists and block quotations.  They
3017 affect the entire document, not just the container in which they
3018 are defined:
3019
3020 ```````````````````````````````` example
3021 [foo]
3022
3023 > [foo]: /url
3024 .
3025 <p><a href="/url">foo</a></p>
3026 <blockquote>
3027 </blockquote>
3028 ````````````````````````````````
3029
3030
3031
3032 ## Paragraphs
3033
3034 A sequence of non-blank lines that cannot be interpreted as other
3035 kinds of blocks forms a [paragraph](@).
3036 The contents of the paragraph are the result of parsing the
3037 paragraph's raw content as inlines.  The paragraph's raw content
3038 is formed by concatenating the lines and removing initial and final
3039 [whitespace].
3040
3041 A simple example with two paragraphs:
3042
3043 ```````````````````````````````` example
3044 aaa
3045
3046 bbb
3047 .
3048 <p>aaa</p>
3049 <p>bbb</p>
3050 ````````````````````````````````
3051
3052
3053 Paragraphs can contain multiple lines, but no blank lines:
3054
3055 ```````````````````````````````` example
3056 aaa
3057 bbb
3058
3059 ccc
3060 ddd
3061 .
3062 <p>aaa
3063 bbb</p>
3064 <p>ccc
3065 ddd</p>
3066 ````````````````````````````````
3067
3068
3069 Multiple blank lines between paragraphs have no effect:
3070
3071 ```````````````````````````````` example
3072 aaa
3073
3074
3075 bbb
3076 .
3077 <p>aaa</p>
3078 <p>bbb</p>
3079 ````````````````````````````````
3080
3081
3082 Leading spaces are skipped:
3083
3084 ```````````````````````````````` example
3085   aaa
3086  bbb
3087 .
3088 <p>aaa
3089 bbb</p>
3090 ````````````````````````````````
3091
3092
3093 Lines after the first may be indented any amount, since indented
3094 code blocks cannot interrupt paragraphs.
3095
3096 ```````````````````````````````` example
3097 aaa
3098              bbb
3099                                        ccc
3100 .
3101 <p>aaa
3102 bbb
3103 ccc</p>
3104 ````````````````````````````````
3105
3106
3107 However, the first line may be indented at most three spaces,
3108 or an indented code block will be triggered:
3109
3110 ```````````````````````````````` example
3111    aaa
3112 bbb
3113 .
3114 <p>aaa
3115 bbb</p>
3116 ````````````````````````````````
3117
3118
3119 ```````````````````````````````` example
3120     aaa
3121 bbb
3122 .
3123 <pre><code>aaa
3124 </code></pre>
3125 <p>bbb</p>
3126 ````````````````````````````````
3127
3128
3129 Final spaces are stripped before inline parsing, so a paragraph
3130 that ends with two or more spaces will not end with a [hard line
3131 break]:
3132
3133 ```````````````````````````````` example
3134 aaa     
3135 bbb     
3136 .
3137 <p>aaa<br />
3138 bbb</p>
3139 ````````````````````````````````
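
The way a paragraph's raw content is assembled can be sketched as
follows.  This is a minimal illustration under the assumption that the
lines have already been identified as belonging to one paragraph; the
function name `paragraph_raw_content` is hypothetical.

``` python
def paragraph_raw_content(lines):
    """Join paragraph lines into the raw content handed to inline parsing."""
    # Leading whitespace is skipped on every line of the paragraph.
    stripped = [line.lstrip(" \t") for line in lines]
    # Trailing spaces inside the paragraph are kept (they may form a hard
    # line break); only the whitespace at the very end is removed.
    return "\n".join(stripped).strip(" \t\n")
```

For the lines `"aaa     "` and `"bbb     "` this keeps the spaces after
`aaa`, which later produce the hard line break, and drops the spaces
after `bbb`.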
3140
3141
3142 ## Blank lines
3143
3144 [Blank lines] between block-level elements are ignored,
3145 except for the role they play in determining whether a [list]
3146 is [tight] or [loose].
3147
3148 Blank lines at the beginning and end of the document are also ignored.
3149
3150 ```````````````````````````````` example
3151   
3152
3153 aaa
3154   
3155
3156 # aaa
3157
3158   
3159 .
3160 <p>aaa</p>
3161 <h1>aaa</h1>
3162 ````````````````````````````````
3163
3164
3165
3166 # Container blocks
3167
3168 A [container block] is a block that has other
3169 blocks as its contents.  There are two basic kinds of container blocks:
3170 [block quotes] and [list items].
3171 [Lists] are meta-containers for [list items].
3172
3173 We define the syntax for container blocks recursively.  The general
3174 form of the definition is:
3175
3176 > If X is a sequence of blocks, then the result of
3177 > transforming X in such-and-such a way is a container of type Y
3178 > with these blocks as its content.
3179
3180 So, we explain what counts as a block quote or list item by explaining
3181 how these can be *generated* from their contents. This should suffice
3182 to define the syntax, although it does not give a recipe for *parsing*
3183 these constructions.  (A recipe is provided below in the section entitled
3184 [A parsing strategy](#appendix-a-parsing-strategy).)
3185
3186 ## Block quotes
3187
3188 A [block quote marker](@)
3189 consists of 0-3 spaces of initial indent, plus (a) the character `>` together
3190 with a following space, or (b) a single character `>` not followed by a space.
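
As a rough sketch, the marker definition can be written as a regular
expression.  The names `BLOCK_QUOTE_MARKER` and
`strip_block_quote_marker` are hypothetical, and the sketch only handles
spaces, not tabs.

``` python
import re

# 0-3 spaces of indent, a `>`, and at most one following space.
BLOCK_QUOTE_MARKER = re.compile(r" {0,3}> ?")


def strip_block_quote_marker(line):
    """Return the line without its block quote marker, or None if it has none."""
    match = BLOCK_QUOTE_MARKER.match(line)
    if match is None:
        return None
    return line[match.end():]
```

Because the marker consumes only one space after the `>`, the line
`>     code` leaves four spaces of indentation (enough for an indented
code block), while `>    not code` leaves only three.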
3191
3192 The following rules define [block quotes]:
3193
3194 1.  **Basic case.**  If a string of lines *Ls* constitute a sequence
3195     of blocks *Bs*, then the result of prepending a [block quote
3196     marker] to the beginning of each line in *Ls*
3197     is a [block quote](#block-quotes) containing *Bs*.
3198
3199 2.  **Laziness.**  If a string of lines *Ls* constitute a [block
3200     quote](#block-quotes) with contents *Bs*, then the result of deleting
3201     the initial [block quote marker] from one or
3202     more lines in which the next [non-whitespace character] after the [block
3203     quote marker] is [paragraph continuation
3204     text] is a block quote with *Bs* as its content.
3205     [Paragraph continuation text](@) is text
3206     that will be parsed as part of the content of a paragraph, but does
3207     not occur at the beginning of the paragraph.
3208
3209 3.  **Consecutiveness.**  A document cannot contain two [block
3210     quotes] in a row unless there is a [blank line] between them.
3211
3212 Nothing else counts as a [block quote](#block-quotes).
3213
3214 Here is a simple example:
3215
3216 ```````````````````````````````` example
3217 > # Foo
3218 > bar
3219 > baz
3220 .
3221 <blockquote>
3222 <h1>Foo</h1>
3223 <p>bar
3224 baz</p>
3225 </blockquote>
3226 ````````````````````````````````
3227
3228
3229 The spaces after the `>` characters can be omitted:
3230
3231 ```````````````````````````````` example
3232 ># Foo
3233 >bar
3234 > baz
3235 .
3236 <blockquote>
3237 <h1>Foo</h1>
3238 <p>bar
3239 baz</p>
3240 </blockquote>
3241 ````````````````````````````````
3242
3243
3244 The `>` characters can be indented 1-3 spaces:
3245
3246 ```````````````````````````````` example
3247    > # Foo
3248    > bar
3249  > baz
3250 .
3251 <blockquote>
3252 <h1>Foo</h1>
3253 <p>bar
3254 baz</p>
3255 </blockquote>
3256 ````````````````````````````````
3257
3258
3259 Four spaces give us a code block:
3260
3261 ```````````````````````````````` example
3262     > # Foo
3263     > bar
3264     > baz
3265 .
3266 <pre><code>&gt; # Foo
3267 &gt; bar
3268 &gt; baz
3269 </code></pre>
3270 ````````````````````````````````
3271
3272
3273 The Laziness clause allows us to omit the `>` before
3274 [paragraph continuation text]:
3275
3276 ```````````````````````````````` example
3277 > # Foo
3278 > bar
3279 baz
3280 .
3281 <blockquote>
3282 <h1>Foo</h1>
3283 <p>bar
3284 baz</p>
3285 </blockquote>
3286 ````````````````````````````````
3287
3288
3289 A block quote can contain some lazy and some non-lazy
3290 continuation lines:
3291
3292 ```````````````````````````````` example
3293 > bar
3294 baz
3295 > foo
3296 .
3297 <blockquote>
3298 <p>bar
3299 baz
3300 foo</p>
3301 </blockquote>
3302 ````````````````````````````````
3303
3304
3305 Laziness only applies to lines that would have been continuations of
3306 paragraphs had they been prepended with [block quote markers].
3307 For example, the `> ` cannot be omitted in the second line of
3308
3309 ``` markdown
3310 > foo
3311 > ---
3312 ```
3313
3314 without changing the meaning:
3315
3316 ```````````````````````````````` example
3317 > foo
3318 ---
3319 .
3320 <blockquote>
3321 <p>foo</p>
3322 </blockquote>
3323 <hr />
3324 ````````````````````````````````
3325
3326
3327 Similarly, if we omit the `> ` in the second line of
3328
3329 ``` markdown
3330 > - foo
3331 > - bar
3332 ```
3333
3334 then the block quote ends after the first line:
3335
3336 ```````````````````````````````` example
3337 > - foo
3338 - bar
3339 .
3340 <blockquote>
3341 <ul>
3342 <li>foo</li>
3343 </ul>
3344 </blockquote>
3345 <ul>
3346 <li>bar</li>
3347 </ul>
3348 ````````````````````````````````
3349
3350
3351 For the same reason, we can't omit the `> ` in front of
3352 subsequent lines of an indented or fenced code block:
3353
3354 ```````````````````````````````` example
3355 >     foo
3356     bar
3357 .
3358 <blockquote>
3359 <pre><code>foo
3360 </code></pre>
3361 </blockquote>
3362 <pre><code>bar
3363 </code></pre>
3364 ````````````````````````````````
3365
3366
3367 ```````````````````````````````` example
3368 > ```
3369 foo
3370 ```
3371 .
3372 <blockquote>
3373 <pre><code></code></pre>
3374 </blockquote>
3375 <p>foo</p>
3376 <pre><code></code></pre>
3377 ````````````````````````````````
3378
3379
3380 Note that in the following case, we have a [lazy
3381 continuation line]:
3382
3383 ```````````````````````````````` example
3384 > foo
3385     - bar
3386 .
3387 <blockquote>
3388 <p>foo
3389 - bar</p>
3390 </blockquote>
3391 ````````````````````````````````
3392
3393
3394 To see why, note that in
3395
3396 ```markdown
3397 > foo
3398 >     - bar
3399 ```
3400
3401 the `- bar` is indented too far to start a list, and can't
3402 be an indented code block because indented code blocks cannot
3403 interrupt paragraphs, so it is [paragraph continuation text].
3404
3405 A block quote can be empty:
3406
3407 ```````````````````````````````` example
3408 >
3409 .
3410 <blockquote>
3411 </blockquote>
3412 ````````````````````````````````
3413
3414
3415 ```````````````````````````````` example
3416 >
3417 >  
3418
3419 .
3420 <blockquote>
3421 </blockquote>
3422 ````````````````````````````````
3423
3424
3425 A block quote can have initial or final blank lines:
3426
3427 ```````````````````````````````` example
3428 >
3429 > foo
3430 >  
3431 .
3432 <blockquote>
3433 <p>foo</p>
3434 </blockquote>
3435 ````````````````````````````````
3436
3437
3438 A blank line always separates block quotes:
3439
3440 ```````````````````````````````` example
3441 > foo
3442
3443 > bar
3444 .
3445 <blockquote>
3446 <p>foo</p>
3447 </blockquote>
3448 <blockquote>
3449 <p>bar</p>
3450 </blockquote>
3451 ````````````````````````````````
3452
3453
3454 (Most current Markdown implementations, including John Gruber's
3455 original `Markdown.pl`, will parse this example as a single block quote
3456 with two paragraphs.  But it seems better to allow the author to decide
3457 whether two block quotes or one are wanted.)
3458
3459 Consecutiveness means that if we put these block quotes together,
3460 we get a single block quote:
3461
3462 ```````````````````````````````` example
3463 > foo
3464 > bar
3465 .
3466 <blockquote>
3467 <p>foo
3468 bar</p>
3469 </blockquote>
3470 ````````````````````````````````
3471
3472
3473 To get a block quote with two paragraphs, use:
3474
3475 ```````````````````````````````` example
3476 > foo
3477 >
3478 > bar
3479 .
3480 <blockquote>
3481 <p>foo</p>
3482 <p>bar</p>
3483 </blockquote>
3484 ````````````````````````````````
3485
3486
3487 Block quotes can interrupt paragraphs:
3488
3489 ```````````````````````````````` example
3490 foo
3491 > bar
3492 .
3493 <p>foo</p>
3494 <blockquote>
3495 <p>bar</p>
3496 </blockquote>
3497 ````````````````````````````````
3498
3499
3500 In general, blank lines are not needed before or after block
3501 quotes:
3502
3503 ```````````````````````````````` example
3504 > aaa
3505 ***
3506 > bbb
3507 .
3508 <blockquote>
3509 <p>aaa</p>
3510 </blockquote>
3511 <hr />
3512 <blockquote>
3513 <p>bbb</p>
3514 </blockquote>
3515 ````````````````````````````````
3516
3517
3518 However, because of laziness, a blank line is needed between
3519 a block quote and a following paragraph:
3520
3521 ```````````````````````````````` example
3522 > bar
3523 baz
3524 .
3525 <blockquote>
3526 <p>bar
3527 baz</p>
3528 </blockquote>
3529 ````````````````````````````````
3530
3531
3532 ```````````````````````````````` example
3533 > bar
3534
3535 baz
3536 .
3537 <blockquote>
3538 <p>bar</p>
3539 </blockquote>
3540 <p>baz</p>
3541 ````````````````````````````````
3542
3543
3544 ```````````````````````````````` example
3545 > bar
3546 >
3547 baz
3548 .
3549 <blockquote>
3550 <p>bar</p>
3551 </blockquote>
3552 <p>baz</p>
3553 ````````````````````````````````
3554
3555
3556 It is a consequence of the Laziness rule that any number
3557 of initial `>`s may be omitted on a continuation line of a
3558 nested block quote:
3559
3560 ```````````````````````````````` example
3561 > > > foo
3562 bar
3563 .
3564 <blockquote>
3565 <blockquote>
3566 <blockquote>
3567 <p>foo
3568 bar</p>
3569 </blockquote>
3570 </blockquote>
3571 </blockquote>
3572 ````````````````````````````````
3573
3574
3575 ```````````````````````````````` example
3576 >>> foo
3577 > bar
3578 >>baz
3579 .
3580 <blockquote>
3581 <blockquote>
3582 <blockquote>
3583 <p>foo
3584 bar
3585 baz</p>
3586 </blockquote>
3587 </blockquote>
3588 </blockquote>
3589 ````````````````````````````````
3590
3591
3592 When including an indented code block in a block quote,
3593 remember that the [block quote marker] includes
3594 both the `>` and a following space.  So *five spaces* are needed after
3595 the `>`:
3596
3597 ```````````````````````````````` example
3598 >     code
3599
3600 >    not code
3601 .
3602 <blockquote>
3603 <pre><code>code
3604 </code></pre>
3605 </blockquote>
3606 <blockquote>
3607 <p>not code</p>
3608 </blockquote>
3609 ````````````````````````````````
3610
3611
3612
3613 ## List items
3614
3615 A [list marker](@) is a
3616 [bullet list marker] or an [ordered list marker].
3617
3618 A [bullet list marker](@)
3619 is a `-`, `+`, or `*` character.
3620
3621 An [ordered list marker](@)
3622 is a sequence of 1--9 arabic digits (`0-9`), followed by either a
3623 `.` character or a `)` character.  (The reason for the length
3624 limit is that with 10 digits we start seeing integer overflows
3625 in some browsers.)
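
The two marker definitions can be sketched with regular expressions.
This is an illustration only: the function name `parse_list_marker` and
its return shape are assumptions, and the sketch recognizes just the
marker at the start of the text, not the spaces or content that follow.

``` python
import re

BULLET_MARKER = re.compile(r"[-+*]")
# 1-9 digits followed by `.` or `)`.
ORDERED_MARKER = re.compile(r"([0-9]{1,9})([.)])")


def parse_list_marker(text):
    """Return (kind, marker, start_number) for a leading list marker, else None."""
    ordered = ORDERED_MARKER.match(text)
    if ordered:
        return ("ordered", ordered.group(0), int(ordered.group(1)))
    bullet = BULLET_MARKER.match(text)
    if bullet:
        # Bullet markers have no start number.
        return ("bullet", bullet.group(0), None)
    return None
```

The width *W* of a marker is then simply the length of the returned
marker string.  A ten-digit number such as `1234567890.` is rejected,
since the digit run may not exceed nine characters.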
3626
3627 The following rules define [list items]:
3628
3629 1.  **Basic case.**  If a sequence of lines *Ls* constitute a sequence of
3630     blocks *Bs* starting with a [non-whitespace character] and not separated
3631     from each other by more than one blank line, and *M* is a list
3632     marker of width *W* followed by 1 ≤ *N* ≤ 4 spaces, then the result
3633     of prepending *M* and the following spaces to the first line of
3634     *Ls*, and indenting subsequent lines of *Ls* by *W + N* spaces, is a
3635     list item with *Bs* as its contents.  The type of the list item
3636     (bullet or ordered) is determined by the type of its list marker.