Date: 2020-01-01

I went down a rabbit hole of learning how to parse Markdown to add custom syntax to it. I had to go through a few projects and learn how to work with a new type of abstract syntax tree (AST). Previously, my experience with ASTs was limited to writing codemods using jscodeshift. In the end, it turned out to be a worthwhile investment of time! In this post, I'll go over what I learned using a problem I wanted to solve.

The problem

I wanted to introduce a new syntax to Markdown that lets authors add stylised subtitles to their documents. Using smaller headings to give the appearance of subtitles is not a great solution. It is not considered semantic HTML and it also hurts accessibility because it introduces extra, unintended headings to the document's accessibility tree.

What I wanted was a new syntax that looked like this:

# I am a level 1 heading

#- I am a level 1 subtitle

##- I am a level 2 subtitle

######- Up to six levels are supported corresponding to heading levels

When rendered to HTML these should come out as:

<h1>I am a level 1 heading</h1>
<p class="subtitle subtitle--1">I am a level 1 subtitle</p>
<p class="subtitle subtitle--2">I am a level 2 subtitle</p>
<p class="subtitle subtitle--6">Up to six levels are supported corresponding to heading levels</p>
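
To make the mapping concrete, here is a minimal, dependency-free sketch of the rule for a single line of text. The function name renderSubtitleLine is hypothetical, and the line-by-line string handling is a simplification (no escaping, no inline formatting). It is not the remark plugin, only an illustration of how the number of # characters selects the subtitle level.

```javascript
// Hypothetical helper, for illustration only: maps one line of the proposed
// syntax to the HTML shown above. This is not a Markdown parser.
function renderSubtitleLine(line) {
  // One to six "#" characters, immediately followed by "-" and whitespace.
  const match = line.match(/^(#{1,6})-\s+(.*)$/);
  if (!match) {
    return null; // not a subtitle line
  }
  const depth = match[1].length;
  return `<p class="subtitle subtitle--${depth}">${match[2]}</p>`;
}

console.log(renderSubtitleLine("#- I am a level 1 subtitle"));
console.log(renderSubtitleLine("##- I am a level 2 subtitle"));
console.log(renderSubtitleLine("# I am a level 1 heading")); // a heading, so no match
```

Because a regular heading needs a space after the # characters, a line such as "# I am a level 1 heading" never matches the subtitle pattern.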

The ecosystem

Currently, the best tools for this task live in the Node.js ecosystem. These are:

unified - an interface for parsing, inspecting, transforming, and serializing content through syntax trees

remark - a Markdown processor powered by plugins, part of the unified collective

mdast - a specification for representing Markdown in a syntax tree (see this example)

hast - a specification for representing HTML (and embedded SVG or MathML) as an abstract syntax tree. You can use rehype to parse HTML text as hast.

unist - a specification for syntax trees. mdast and hast are unist-compliant syntax trees.
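
As a rough illustration of what these specifications describe, the sketch below writes out by hand the kind of mdast tree a Markdown parser would produce for a heading followed by a paragraph, then walks it. The tree is abbreviated (real trees also carry position information), and the walk function is a hypothetical, simplified stand-in for utilities such as unist-util-visit.

```javascript
// Hand-written, abbreviated mdast tree for:
//
//   # Hello
//
//   How are __you__?
//
// Every node has a `type`; parents have `children`, literals have `value`.
const tree = {
  type: "root",
  children: [
    {
      type: "heading",
      depth: 1,
      children: [{ type: "text", value: "Hello" }],
    },
    {
      type: "paragraph",
      children: [
        { type: "text", value: "How are " },
        { type: "strong", children: [{ type: "text", value: "you" }] },
        { type: "text", value: "?" },
      ],
    },
  ],
};

// Hypothetical, simplified depth-first walk (not the unist-util-visit API).
function walk(node, visitor) {
  visitor(node);
  (node.children || []).forEach((child) => walk(child, visitor));
}

// Collect node types in the order they are visited.
const types = [];
walk(tree, (node) => types.push(node.type));
console.log(types.join(" > "));
```

The same shape (type, children, value) applies to hast, which is why generic unist utilities can work on both kinds of tree.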

What do these tools look like in practice? The following code gives an idea of what a processing pipeline looks like:

const remark = require("remark");
const html = require("remark-html");
const subtitlePlugin = require("./remark-subtitles");

// Sample document: a heading, a level 3 subtitle, and a normal paragraph.
const text = `
# Hello

###- How are __you__?

Great!
`;

// Parse the Markdown, run the plugin on the tree, then serialize to HTML.
remark()
  .use(subtitlePlugin)
  .use(html)
  .process(text, function (err, file) {
    if (err) throw err;
    console.log(String(file));
  });

The solution

Starting from this, we can now write our plugin. The following snippet shows the plugin code with comments annotating the interesting parts.

const is = require("unist-util-is");
const visit = require("unist-util-visit");
const mdastToHast = require("mdast-util-to-hast");

module.exports = function subtitlePlugin() {
  return async function transform(tree) {
    // Subtitles are parsed as ordinary paragraphs, so visit every paragraph.
    visit(tree, "paragraph", (paragraphNode) => {
      const { children } = paragraphNode;
      const textNode = children && children[0];

      // Only paragraphs that start with plain text can be subtitles.
      if (!is(textNode, "text")) {
        return;
      }

      const text =
        typeof textNode.value === "string" ? textNode.value.trimStart() : "";

      // One to six "#" characters, then "-", then whitespace.
      const re = /^(#{1,6})-\s+/;
      const matches = text.match(re);
      if (!matches) {
        return;
      }

      const depth = matches[1].length;
      const newValue = text.replace(re, "");

      // The h* fields in `data` control how the node is converted to HTML.
      paragraphNode.data = {
        hName: "p",
        hProperties: {
          className: `subtitle subtitle--${depth}`,
          "data-remark-subtype": "subtitle",
          "data-subtitle": depth,
        },
        // Strip the marker from the first text node, keep the remaining
        // children, and convert each one to hast. The arrow function stops
        // map() from passing its index as a second argument.
        hChildren: [
          { ...textNode, value: newValue },
          ...children.slice(1),
        ].map((child) => mdastToHast(child)),
      };
    });
  };
};
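
One detail worth seeing in isolation is how the marker is removed while the rest of the paragraph's inline content is kept. The sketch below repeats only that step on a hand-written paragraph node, without remark or mdast-util-to-hast. The name stripSubtitleMarker is hypothetical, and the node is an abbreviated mdast paragraph for the text "###- How are __you__?".

```javascript
// Hypothetical helper: returns the subtitle depth and the paragraph's
// children with the "#-" marker stripped, or null if it is not a subtitle.
function stripSubtitleMarker(paragraphNode) {
  const children = paragraphNode.children || [];
  const first = children[0];
  if (!first || first.type !== "text") {
    return null;
  }
  const re = /^(#{1,6})-\s+/;
  const text = first.value.trimStart();
  const matches = text.match(re);
  if (!matches) {
    return null;
  }
  return {
    depth: matches[1].length,
    // Only the first text node changes; later siblings are reused as-is.
    children: [{ ...first, value: text.replace(re, "") }, ...children.slice(1)],
  };
}

// Abbreviated mdast paragraph for: ###- How are __you__?
const paragraph = {
  type: "paragraph",
  children: [
    { type: "text", value: "###- How are " },
    { type: "strong", children: [{ type: "text", value: "you" }] },
    { type: "text", value: "?" },
  ],
};

const result = stripSubtitleMarker(paragraph);
console.log(result.depth);
console.log(result.children.map((child) => child.type));
```

Copying the first text node with a spread, instead of mutating it, leaves the original tree untouched, so other plugins still see the source text.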

If you'd like to play with a runnable version of this code, check out this RunKit demo.

Wrap up: Why is this cool?

Beyond adding new syntax, being able to analyse Markdown unlocks a lot of automation and authoring enhancements. Here are some ideas of where you can go with this:

Write your incident response documents in Markdown and use - [ ] to create action items. Now, you can write a parser and pipeline that runs in CI to create tickets for these and update the document with links to them.

Use remark to lint Markdown documents for things like too many spaces (e.g. extra spaces between words)

Use remark to take links to an Excalidraw diagram and embed a preview that appears on hover
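
As a starting point for the first idea, here is a minimal sketch that finds unchecked action items (written as - [ ]) in a Markdown string. It scans lines with a regular expression instead of walking a remark syntax tree, which is a deliberate simplification: it would also match items inside code blocks. The name findActionItems and the sample document are hypothetical, and creating tickets is left out.

```javascript
// Hypothetical helper: returns unchecked task-list items with line numbers.
function findActionItems(markdown) {
  const items = [];
  markdown.split("\n").forEach((line, index) => {
    // A list marker (-, * or +), then "[ ]", then the item text.
    const match = line.match(/^\s*[-*+]\s+\[ \]\s+(.+)$/);
    if (match) {
      items.push({ line: index + 1, text: match[1].trim() });
    }
  });
  return items;
}

// Made-up sample document for illustration.
const doc = [
  "# Incident review",
  "",
  "- [ ] Add alerting for queue depth",
  "- [x] Roll back the deploy",
  "- [ ] Write the postmortem",
].join("\n");

console.log(findActionItems(doc));
```

A tree-based version would visit list items instead of lines, which avoids false matches and makes it easier to write ticket links back into the document.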

Check out these awesome remark plugins if you're looking for inspiration. I hope this helps you get started!