CommonMark Specification

Check out the Markdown Dingus to experiment with the CommonMark processor.

Contents

What is CommonMark?
Why CommonMark Exists
Key Differences from Standard Markdown
CommonMark Parsing Algorithm
Benefits of CommonMark
Implementation Notes
Resources

What is CommonMark?

CommonMark is a strongly defined, highly compatible specification of Markdown. It was created to resolve the ambiguities and inconsistencies in John Gruber’s original Markdown description, which led implementations on different platforms and tools to diverge.

Why CommonMark Exists

John Gruber’s original Markdown description left many cases unspecified, so implementations interpreted them differently. As a result, the same Markdown document could render differently on different platforms (GitHub, Stack Overflow, Reddit, etc.).

CommonMark provides:

Unambiguous specifications for all Markdown syntax
Comprehensive test suite to ensure consistent behavior
Clear precedence rules for conflicting syntax
Detailed parsing algorithm that can be implemented consistently

Key Differences from Standard Markdown

1. Stricter Parsing Rules

CommonMark enforces more consistent parsing behavior:

Blank Lines Before Block Elements

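CommonMark does not require a blank line before ATX headings, blockquotes, or bullet lists; each of these can interrupt a paragraph
An ordered list can interrupt a paragraph only if it starts with 1, and an indented code block can never interrupt one
Other Markdown implementations vary, and some require the blank line

Text
# Heading

CommonMark: Parses the second line as a heading; no blank line is needed

Other implementations: Some treat the second line as paragraph text unless a blank line precedes it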
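
Recognizing the heading line itself is a purely line-local check. The sketch below follows the ATX heading rule as the spec describes it (up to three spaces of indentation, one to six # characters, then a space, tab, or end of line). The function name parse_atx_heading is hypothetical, and backslash escapes are left for the inline phase.

```python
import re

# Up to 3 spaces, 1-6 '#', then either whitespace plus content or end of line.
ATX_HEADING = re.compile(r"^ {0,3}(#{1,6})(?:[ \t]+(.*?))?[ \t]*$")


def parse_atx_heading(line):
    """Return (level, text) for an ATX heading line, or None otherwise."""
    match = ATX_HEADING.match(line)
    if match is None:
        return None
    text = match.group(2) or ""
    # Drop an optional closing sequence such as the trailing '##' in "## Title ##".
    text = re.sub(r"(?:^|[ \t]+)#+$", "", text)
    return len(match.group(1)), text.strip()


results = [parse_atx_heading(line) for line in ["Text", "# Heading"]]
```

Here results holds None for the paragraph line and (1, "Heading") for the heading line. Lines such as "#Heading" (no space) or a run of seven # characters are rejected.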

2. List Item Parsing

Indentation Requirements

CommonMark has specific rules for list item indentation
A sublist must be indented at least to the column where the parent item’s content begins (the marker width plus the spaces after it), not by a fixed 4 spaces
Standard Markdown implementations vary on this

1. First item
   - Sublist item (3 spaces: the width of “1. ”)
2. Second item
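
To see where a sublist has to start, it helps to compute the content offset of the parent item. The following is a minimal sketch under simplifying assumptions: it handles spaces only (no tabs), does not distinguish thematic breaks such as "* * *", and the name content_offset is made up for illustration.

```python
import re

# Optional indent (0-3 spaces), a bullet or ordered marker, then spaces and the rest.
LIST_MARKER = re.compile(r"^( {0,3})([-+*]|[0-9]{1,9}[.)])( *)(.*)$")


def content_offset(line):
    """Column where a list item's content starts, or None if not a list item."""
    match = LIST_MARKER.match(line)
    if match is None:
        return None
    indent, marker, spaces, rest = match.groups()
    if rest and not spaces:
        return None  # "-foo" has no space after the marker, so it is not a list item
    # 1-4 spaces after the marker count toward the offset; 5 or more spaces
    # (content that begins with indented code) or an empty item count as one.
    gap = len(spaces) if rest and 1 <= len(spaces) <= 4 else 1
    return len(indent) + len(marker) + gap


parent = content_offset("1. First item")
child_indent = len("   - Sublist item") - len("   - Sublist item".lstrip(" "))
is_sublist = child_indent >= parent
```

For "1. First item" the offset is 3, so a child indented by three spaces qualifies as a sublist. After a "10. " marker, four spaces would be needed.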

List Continuation

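A list is “loose” if any of its items are separated by blank lines, or if an item directly contains two block-level elements with a blank line between them; otherwise it is “tight”
Loose lists wrap items in <p> tags, tight lists don’t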
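
The loose/tight decision can be sketched for the simplest case. This example assumes a flat list whose items are single paragraphs, so any interior blank line separates two items; nested lists and code blocks inside items need the full rule. Both function names are hypothetical.

```python
def is_loose(list_lines):
    """True if a flat, single-paragraph-per-item list contains an interior blank line."""
    lines = [line.strip() for line in list_lines]
    # Blank lines before the first item or after the last one do not count.
    while lines and not lines[0]:
        lines.pop(0)
    while lines and not lines[-1]:
        lines.pop()
    return any(not line for line in lines)


def render_items(items, loose):
    """Render item texts as <li> elements, wrapping them in <p> only when loose."""
    if loose:
        return "\n".join(f"<li>\n<p>{item}</p>\n</li>" for item in items)
    return "\n".join(f"<li>{item}</li>" for item in items)


tight_html = render_items(["a", "b"], is_loose(["- a", "- b"]))
loose_html = render_items(["a", "b"], is_loose(["- a", "", "- b"]))
```

The only difference between the two inputs is one blank line, yet the second renders every item with a <p> wrapper.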

3. Code Block Handling

Fenced Code Blocks

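CommonMark standardizes fenced code block syntax with backticks or tildes
The closing fence must use the same character as the opening fence and be at least as long; up to three spaces of indentation are allowed before either fence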

code here
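
The fence-matching rule can be sketched as follows. This is an illustration rather than a full implementation: it does not strip the opening fence's indentation from content lines, and extract_fenced_block is a hypothetical name.

```python
import re

# Opening fence: up to 3 spaces, then 3+ backticks or 3+ tildes, then an info string.
OPENING_FENCE = re.compile(r"^( {0,3})(`{3,}|~{3,})(.*)$")


def extract_fenced_block(lines):
    """Return (info_string, body_lines) if lines[0] opens a fenced block, else None."""
    match = OPENING_FENCE.match(lines[0])
    if match is None:
        return None
    fence, info = match.group(2), match.group(3)
    if fence[0] == "`" and "`" in info:
        return None  # a backtick fence may not have backticks in its info string
    # Closing fence: same character, at least as long, nothing but whitespace after it.
    closing = re.compile(
        r"^ {0,3}" + re.escape(fence[0]) + "{" + str(len(fence)) + r",}[ \t]*$"
    )
    body = []
    for line in lines[1:]:
        if closing.match(line):
            break
        body.append(line)  # an unclosed block simply runs to the end of the input
    return info.strip(), body


block = extract_fenced_block(["~~~~", "~~~", "still code", "~~~~", "after"])
```

Because the opening fence has four tildes, the three-tilde line is ordinary content, and block is ("", ["~~~", "still code"]).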


Indented Code Blocks

An indented code block cannot interrupt a paragraph, so CommonMark requires a blank line between a paragraph and an indented code block that follows it
Standard Markdown often allows them without blank lines

4. Link and Image Processing

Reference Link Precedence

When several definitions use the same label, CommonMark uses the first one
Labels are matched case-insensitively, so duplicate definitions are handled consistently

[link1]: /url1
[link1]: /url2
[link1]  <!-- Uses /url1 in CommonMark: the first definition wins -->
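
A minimal sketch of how definitions might be collected so that the first one wins. It assumes one-line definitions without titles, and the helper names are hypothetical. Labels are normalized by collapsing whitespace and case-folding before lookup.

```python
import re

# One-line definition: up to 3 spaces, [label]:, then a destination without spaces.
DEFINITION = re.compile(r"^ {0,3}\[([^\[\]]+)\]:[ \t]*(\S+)[ \t]*$")


def normalize_label(label):
    """Collapse internal whitespace and case-fold, so [Link1] and [link1] match."""
    return " ".join(label.split()).casefold()


def collect_definitions(lines):
    """Map normalized labels to destinations; later duplicates are ignored."""
    definitions = {}
    for line in lines:
        match = DEFINITION.match(line)
        if match:
            # setdefault keeps the existing entry, so the first definition wins.
            definitions.setdefault(normalize_label(match.group(1)), match.group(2))
    return definitions


definitions = collect_definitions(["[link1]: /url1", "[link1]: /url2", "[LINK1]: /url3"])
```

All three lines normalize to the same label, so definitions ends up as {"link1": "/url1"}.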

Link Parsing Order

Link brackets bind more tightly than emphasis markers
For example, *[bar*](/url) is parsed as a literal * followed by a link whose text is bar*, not as emphasis

5. Emphasis and Strong Emphasis

Nested Emphasis Rules

CommonMark has specific algorithms for handling nested * and _ markers
Prevents ambiguous parsing of complex emphasis patterns

*foo *bar* baz*  <!-- Nested emphasis: <em>foo <em>bar</em> baz</em> -->

Delimiter Processing

CommonMark uses a “delimiter stack” algorithm for consistent emphasis parsing
Standard Markdown implementations vary in their approach
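
Before the delimiter stack is processed, each run of * has to be classified as a potential opener, closer, both, or neither. The sketch below covers only that classification step for *, using the spec's left-flanking and right-flanking definitions as I understand them. It ignores backslash escapes and the extra rules for _, and classify_star_runs is a hypothetical name.

```python
import unicodedata


def is_whitespace(ch):
    # The start or end of the line (empty string) counts as whitespace.
    return ch == "" or ch in "\t\n\f\r" or unicodedata.category(ch) == "Zs"


def is_punctuation(ch):
    # Unicode punctuation (P*) and symbol (S*) categories.
    return ch != "" and unicodedata.category(ch)[0] in "PS"


def classify_star_runs(text):
    """Return (index, length, can_open, can_close) for each run of '*'."""
    runs, i = [], 0
    while i < len(text):
        if text[i] != "*":
            i += 1
            continue
        j = i
        while j < len(text) and text[j] == "*":
            j += 1
        before = text[i - 1] if i > 0 else ""
        after = text[j] if j < len(text) else ""
        left_flanking = not is_whitespace(after) and (
            not is_punctuation(after) or is_whitespace(before) or is_punctuation(before)
        )
        right_flanking = not is_whitespace(before) and (
            not is_punctuation(before) or is_whitespace(after) or is_punctuation(after)
        )
        # For '*', a left-flanking run can open and a right-flanking run can close.
        runs.append((i, j - i, left_flanking, right_flanking))
        i = j
    return runs


runs = classify_star_runs("*foo *bar* baz*")
```

For this input the first two runs can only open and the last two can only close, which is why the inner pair matches first and the outer pair wraps it.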

6. HTML Block Processing

HTML Block Detection

CommonMark has 7 different types of HTML blocks with specific rules
Each type has different requirements for start/end conditions

<div>
This is an HTML block in CommonMark
</div>
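
As a rough illustration of the start conditions, the sketch below checks a line against types 1 through 6. It makes two labeled simplifications: BLOCK_TAGS is only a subset of the spec's tag list for type 6, and type 7 (any other complete open or closing tag alone on a line) is not modeled. The names are hypothetical.

```python
import re

# SUBSET of the block-level tag names the spec lists for type 6 (assumption: abbreviated).
BLOCK_TAGS = (
    "address|article|aside|blockquote|details|div|dl|fieldset|figure|footer|"
    "form|h[1-6]|header|hr|li|main|nav|ol|p|section|table|ul"
)

START_CONDITIONS = [
    (1, re.compile(r"<(?:pre|script|style|textarea)(?:[ \t>]|$)", re.IGNORECASE)),
    (2, re.compile(r"<!--")),
    (3, re.compile(r"<\?")),
    (4, re.compile(r"<![A-Za-z]")),
    (5, re.compile(r"<!\[CDATA\[")),
    (6, re.compile(r"</?(?:" + BLOCK_TAGS + r")(?:[ \t]|/?>|$)", re.IGNORECASE)),
]


def html_block_type(line):
    """Return the HTML block type (1-6) that this line would start, or None."""
    stripped = line.lstrip(" ")
    if len(line) - len(stripped) > 3:
        return None  # four or more spaces of indentation means indented code
    for number, pattern in START_CONDITIONS:
        if pattern.match(stripped):
            return number
    return None


block_type = html_block_type("<div>")
```

The <div> example from above is a type 6 block, which ends at the next blank line. Types 1 through 5 instead end at a specific closing string such as --> or ?>.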

7. Line Break Handling

Hard Line Breaks

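CommonMark produces a hard break when a line ends with two or more spaces, or with a backslash
Single line breaks become soft breaks (rendered as a newline or a space, which browsers display as a space)

Line one␠␠
Line two

(␠ marks a trailing space; the two spaces after “Line one” produce the hard break)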
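
The break rules can be sketched for a single paragraph. This simplified example ignores code spans and inline HTML, where the rules do not apply, and render_line_breaks is a hypothetical name.

```python
def render_line_breaks(paragraph):
    """Turn the line endings inside one paragraph into hard or soft breaks."""
    lines = paragraph.split("\n")
    out = []
    for index, line in enumerate(lines):
        # Leading whitespace on continuation lines is dropped.
        text = line.lstrip(" \t") if index else line
        if index == len(lines) - 1:
            out.append(text.rstrip(" \t"))  # nothing to break at the end of the paragraph
            break
        stripped = text.rstrip(" ")
        trailing_spaces = len(text) - len(stripped)
        trailing_backslashes = len(stripped) - len(stripped.rstrip("\\"))
        if trailing_spaces >= 2:
            out.append(stripped + "<br />\n")
        elif trailing_spaces == 0 and trailing_backslashes % 2 == 1:
            # An unescaped backslash right before the newline is also a hard break.
            out.append(stripped[:-1] + "<br />\n")
        else:
            out.append(stripped + "\n")  # soft break
    return "".join(out)


hard = render_line_breaks("Line one  \nLine two")
soft = render_line_breaks("Line one\nLine two")
```

Here hard contains a <br /> between the two lines, while soft keeps only the newline.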

8. Entity and Character References

Numeric Character References

CommonMark supports both decimal and hexadecimal numeric references
Standard Markdown support varies

&#8212;  <!-- Decimal -->
&#x2014; <!-- Hexadecimal -->
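
Both forms name the same code point, which a short decoder makes concrete. This is a sketch that assumes the spec's limits (one to seven decimal digits, one to six hexadecimal digits, a required semicolon) and maps invalid values to U+FFFD. Treating surrogates as invalid is my assumption, and decode_numeric_refs is a hypothetical name.

```python
import re

# Decimal: &#NNNN;   Hexadecimal: &#xHHHH; or &#XHHHH;
NUMERIC_REFERENCE = re.compile(r"&#(?:([0-9]{1,7})|[xX]([0-9a-fA-F]{1,6}));")


def decode_numeric_refs(text):
    """Replace numeric character references with the characters they name."""

    def replace(match):
        if match.group(1) is not None:
            code_point = int(match.group(1))
        else:
            code_point = int(match.group(2), 16)
        # NUL, out-of-range values, and surrogates become the replacement character.
        if code_point == 0 or code_point > 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
            return "\ufffd"
        return chr(code_point)

    return NUMERIC_REFERENCE.sub(replace, text)


decoded = decode_numeric_refs("&#8212; &#x2014;")
```

Both references decode to U+2014, so decoded is two identical dash characters separated by a space. A reference without the trailing semicolon is left as literal text.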

CommonMark Parsing Algorithm

CommonMark uses a two-phase parsing approach:

Phase 1: Block Structure

Line Processing: Each line is analyzed for block-level markers
Container Blocks: Blockquotes, lists, and other containers are identified
Leaf Blocks: Headings, code blocks, paragraphs are processed
Reference Links: Link definitions are collected for later use

Phase 2: Inline Structure

Inline Processing: Text within blocks is parsed for inline elements
Emphasis Parsing: Uses delimiter stack algorithm for consistent emphasis
Link Resolution: Reference links are resolved using collected definitions
Entity Processing: Character references are converted to actual characters
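
One reason for the two phases is that a reference link can appear before its definition, so inline parsing has to wait until every definition has been collected. The toy pipeline below illustrates only that ordering. It assumes paragraphs and one-line definitions are the only block types, handles only shortcut references such as [label], and uses hypothetical function names.

```python
import re

DEFINITION = re.compile(r"^ {0,3}\[([^\[\]]+)\]:[ \t]*(\S+)[ \t]*$")
REFERENCE = re.compile(r"\[([^\[\]]+)\]")


def parse_blocks(text):
    """Phase 1: split into paragraphs and collect link reference definitions."""
    blocks, definitions, paragraph = [], {}, []

    def flush():
        if paragraph:
            blocks.append("\n".join(paragraph))
            paragraph.clear()

    for line in text.splitlines():
        match = DEFINITION.match(line)
        if match and not paragraph:  # a definition cannot interrupt a paragraph
            definitions.setdefault(match.group(1).casefold(), match.group(2))
        elif not line.strip():
            flush()
        else:
            paragraph.append(line.strip())
    flush()
    return blocks, definitions


def parse_inlines(block, definitions):
    """Phase 2: resolve shortcut references against the collected definitions."""

    def link(match):
        url = definitions.get(match.group(1).casefold())
        if url is None:
            return match.group(0)  # no definition: leave the brackets as text
        return f'<a href="{url}">{match.group(1)}</a>'

    return REFERENCE.sub(link, block)


def render(text):
    blocks, definitions = parse_blocks(text)
    return "\n".join(f"<p>{parse_inlines(block, definitions)}</p>" for block in blocks)


html = render("See [spec] and [nope].\n\n[spec]: /url")
```

The definition comes after the paragraph that uses it, yet [spec] still resolves because phase 2 runs only once phase 1 has finished. [nope] has no definition and stays literal.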

Benefits of CommonMark

Predictable Behavior: Same input always produces same output
Cross-Platform Compatibility: Works consistently across different tools
Comprehensive Testing: Extensive test suite ensures reliability
Clear Documentation: Detailed specification eliminates guesswork
Future-Proof: A versioned specification that extensions such as GitHub Flavored Markdown can build on

Implementation Notes

The CommonMark processor is designed to be:

Specification-compliant: Follows the official CommonMark spec exactly
Test-driven: Passes the official CommonMark test suite
Extensible: Can be extended with additional features while maintaining compatibility
Fast: Optimized parsing algorithms for performance

Resources

This documentation covers CommonMark 0.31.2 (2024-01-28). For the most current information, always refer to the official specification.
