Equations ≠ Math (Or: Equation layout as a print artifact)

Peter Krautzberger is an independent consultant, working primarily with scientific publishers on web-centric content architecture and the supporting toolchains. He manages the MathJax Consortium (a joint venture of the American Mathematical Society and the Society for Industrial and Applied Mathematics) known for the leading equation rendering solution for the web. Peter is also an invited expert at the W3C Publishing Working Group and co-chairs the Math-on-webpages Community Group. In another life, he received a PhD in mathematics from Freie Universitaet Berlin.

Peter will be giving a workshop at ebookcraft 2018 called MathML: Equation rendering in ebooks.

When people speak about math content in the context of the web they don't usually mean mathematics. Instead, they usually mean formulas or equations of some kind. That is, they don't mean content in mathematics (which often doesn't qualify as either of these), they simply mean something that looks like a mathematical equation. Now you might say a formula in physics is still essentially mathematics but both mathematicians and physicists might disagree with you (and each other, possibly explosively so); besides, it quickly gets hairy when considering chemical equations or notations from the life sciences.

It's not hard to understand why people conflate the two, though. For example, most typesetting tools with support for such content will dub it math mode or equation input. So I've recently begun to use equational content as a relatable, but less encumbered, term. And it helps identify a key problem of equational content: it doesn't help to confuse a field of study with a layout tradition.

Where were we?

Equational content is, well, simply pretty terrible all around. As you would expect from any content with millennia of tradition, equational content comes in many different forms, but over the past century its use in public education and publish-or-perish scientific research has led to a form of expression that is very rigid in both tools and traditions.

Theorem of Pythagoras, Oliver Byrne, 1847

Even a relatively recent publication, such as Oliver Byrne's version of Euclid's Elements, is full of creativity and playfulness yet it appears light-years away from the life of today's STEM students, teachers, and researchers.

Instead, our daily grind is convolutions of grey and boring letters, with content running rampant in tables, random vertical stacks, superscripts, and subscripts. Effectively, today's equation layout has become extremely compressed, archaic, convoluted, and frequently undecipherable. It destroys academic careers by the millions and it can often only be understood when you can see it written live (i.e., animated).

A lengthy expression about sheaves of modules

It is, of course, rarely animated as succinctly as it is by Jill Clayburgh in It's My Turn.

Snake lemma demonstration from the 1980 film It's My Turn, starring Jill Clayburgh and Michael Douglas.

At its best, equational content is a good abstract drawing. At its worst (usually?), it's deafening gibberish. How did we get here?

10 PRINT "Hello World!"; 20 GOTO 10

The critical problem of equational content is that it's not just rooted but stuck in the limitation of print media: We needed to adopt bad practices for such a long time that many people now consider them good. Here are my personal top three examples:

  • The abuse of fonts to imply meaning is a constant horror in equational content. Here's a very mild example from Wikipedia but you can easily come across a book where a dozen variations of, say, the letter A appear, denoting a convoluted set of somewhat related concepts, sometimes even changing meaning between chapters or sections. Unicode has even deemed this abuse of notation important enough to give us such wonders as the code point MATHEMATICAL BOLD ITALIC CAPITAL G in the Mathematical Alphanumeric Symbols block.
  • The abuse of sub- and superscripts is another great example. Do you need to add a variant of an object you've already introduced in your notation? Just slap some sub/superscripts around it, et voilà, a new object. Ideally, combine it with the previous point, because clearly it will be easy for readers to differentiate the letter c in various fonts when placed in superscript.
  • Another historic accident is the preference for stylistic separation. For example, in print it's abhorred to make math content bold when the surrounding content is bold (e.g., in a heading), yet on the web people complain that an equation in a link doesn't get the correct text decoration; what would that even be?
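
To make the first point concrete, here is a minimal Python sketch using only the standard `unicodedata` module. It looks up a handful of styled variants of the letter G from the Mathematical Alphanumeric Symbols block and then applies NFKC compatibility normalization, which folds each of them back to a plain "G". The helper name `font_variants` and the selection of styles are illustrative assumptions, not anything prescribed by Unicode or by this article.

```python
import unicodedata


def font_variants(letter, styles):
    """Return {style: character} for each styled variant that Unicode defines.

    Hypothetical helper for illustration: it builds names such as
    'MATHEMATICAL BOLD ITALIC CAPITAL G' and looks them up by name.
    """
    case = "CAPITAL" if letter.isupper() else "SMALL"
    variants = {}
    for style in styles:
        name = f"MATHEMATICAL {style} {case} {letter.upper()}"
        try:
            variants[style] = unicodedata.lookup(name)
        except KeyError:
            # Some style/letter combinations are not encoded under this name.
            pass
    return variants


# An arbitrary selection of styles; the block contains more.
STYLES = ["BOLD", "ITALIC", "BOLD ITALIC", "SANS-SERIF", "MONOSPACE"]

variants = font_variants("G", STYLES)
for style, ch in variants.items():
    folded = unicodedata.normalize("NFKC", ch)
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)} -> {folded!r}")
```

Whatever meaning an author attached to the bold italic variant lives only in the choice of glyph; once the text is normalized (as search indexes and similar tools commonly do), every variant is the same letter again.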

Obviously, there's little point in criticizing the historic development of equational content. Given that print was mostly limited to (at best) greyscale with a limited character set, naturally people had to be creative; this was a good thing and it's amazing what humanity has accomplished.

Where do we go now?

It only becomes a problem when you pretend that this tradition should do more than inform authors when they encounter a new medium. The web so far has developed without much influence from equational content. Every once in a while I wonder: What if Tim Berners-Lee had given the web some basic building blocks for equations? Just a fraction and a square root. Maybe instead of image renditions of print equations we'd have immediately seen the same creativity applied to equations that we saw with hacking general layout (1px GIF anyone?). Of course, that's hopelessly romanticizing the evolution of the web. But I can't stop wondering.

The web has adopted a rather different approach to separating content and presentation and the traditions of equational content are essentially incompatible with it. Obviously, it's not like you shouldn't be able to put traditional equational content on the web — you should (and you can very well today). But I've come to think it's perfectly fine and, in fact, appropriate that this continues to be a difficult problem. For example, traditional equational content is almost always inaccessible (without heuristic algorithms, i.e., guessing around); it's basically a bunch of glyphs placed in weird 2D patterns (like above and below a line that in turn is magically centered on some baseline and may or may not indicate it corresponds to the notion of a mathematical fraction). Pretending that this is a basis for accessible rendering on the web strikes me as foolish (or ridiculously zealous).
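
The "guessing around" can be sketched in a few lines of Python. Everything here is a hypothetical toy model: the `Glyph` and `Rule` classes, the coordinates, and the `guess_fraction` heuristic are assumptions made for illustration, not how any real renderer or screen reader works. The point is only that a layout of glyphs above and below a line looks the same whether it is a fraction or, say, an inference rule from logic (premise above the line, conclusion below).

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    char: str
    x: float  # horizontal centre
    y: float  # vertical centre (larger means higher on the page)


@dataclass
class Rule:
    x0: float  # left end of the horizontal line
    x1: float  # right end of the horizontal line
    y: float   # vertical position of the line


def guess_fraction(glyphs, rule):
    """Toy heuristic: glyphs both above and below a rule 'look like' a fraction."""
    spanned = [g for g in glyphs if rule.x0 <= g.x <= rule.x1]
    above = sorted((g for g in spanned if g.y > rule.y), key=lambda g: g.x)
    below = sorted((g for g in spanned if g.y < rule.y), key=lambda g: g.x)
    if above and below:
        return (
            "fraction?",
            "".join(g.char for g in above),
            "".join(g.char for g in below),
        )
    return None


rule = Rule(x0=0.0, x1=1.0, y=0.0)

# Layout of the fraction a/b.
fraction_layout = [Glyph("a", 0.5, 1.0), Glyph("b", 0.5, -1.0)]

# Layout of an inference rule: premise P above the line, conclusion Q below.
inference_layout = [Glyph("P", 0.5, 1.0), Glyph("Q", 0.5, -1.0)]

print(guess_fraction(fraction_layout, rule))
print(guess_fraction(inference_layout, rule))
```

Both layouts produce the same kind of guess, because the geometry carries no information about which notion the author had in mind.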

If you think that all equational content should be limited to the traditions of the print era, that's fine of course. I think humanity can do better on the web. Though I think we'd need to acknowledge that the (print) traditions enshrined in equational content have serious flaws and should (and invariably will) be replaced with better concepts and narratives that are appropriate for this medium.

If you'd like to hear more from Peter Krautzberger about MathML, register for ebookcraft, March 21-22, 2018 in Toronto. You can find more details about the conference here, or sign up for our mailing list to get all of the conference updates.