Punctuation as markup

Like many, I spent most of my days in markup. Sometimes to the point that I forget what it was invented for, to address what problem. During these times of doubt and confusion, I like to go back in time and read the works of the pioneers.

Yesterday night, I read this great article Markup Systems and the Future of Scholarly Text Processing, which dates back from before XML, HTML, SGML, or GML even!

There is a section on the different kinds of markup, in particular punctuational markup and descriptive markup. After reading this, it occurred to me that the following three representations are strictly equivalent (markup is highlighted in bold):

  • Plain text markup using the period “markup” to signify the end a sentence that is a statement (see. this section of HyperGrammar for a grammar refresher): The teacher asked who was chewing gum.
  • XML markup that specifies that a piece of text is a sentence and a statement (notice no end punctuation here): <sentence><statement>The teacher asked who was chewing gum</statement></sentence>
  • Plain old semantic HTML that specifies that a piece of text is a sentence and a statement (notice no end punctuation here): <span class="sentence statement">The teacher asked who was chewing gum</span>

In the last case, you can use CSS code to add a period at the end of each sentence that is a statement:

.sentence.statement:after {
content: '.'
}

This means also that if we are being strict, combining punctuation with HTML or XML markup, when descriptive markup and CSS styling suffice, is a bad practice since it is semantically redundant, or in other words, one of the two is useless.

There are plenty other examples to explore: for instance, quotes (“) as markup that is an alternative to the <q></q> or <blockquote></blockquote> HTML markup. This may be pushed to the extreme that each space in plain text is viewed as markup to distinguishes words from one another.

This is probably an epiphany just for me, but I thought I’d post it anyway!