Tim Berners-Lee
Date: 2023-10-03, last change: $Date: 2023/10/05 12:58:52 $
Status: personal view only. Editing status: More or less complete.

Up to Design Issues


Pretty Printing

"Pretty Printing" is the practice of formatting code in a computer language so that it is easy to read by humans, and aesthetically pleasing. There is a Wikipedia article which goes over it well in general. In the case of data in the Solid ecosystem there is value not only in legibility but also in making small changes to data evident as small changes to the text, and, in tests, in having a simple, more or less canonical form with which to compare expected and actual results easily. Specifically for RDF data in N3 or Turtle, those languages provide powerful features for representing graph data in a form close to the English language. Legibility and aesthetic properties affect not only people reading code but also people writing it. If you are expressing instance data by hand, from random facts through configuration files to ontologies and rules, being able to write it in a clear, English-like way is also more pleasant, and makes it easier to check for mistakes. When a group is collaborating on the same data document, such as on a whiteboard or in a chat, legibility and writability are both important.


Introduction

Pretty Printing is the practice of formatting code in a computer language so that it is easy to read by humans, and aesthetically pleasing. There is a Wikipedia article which goes over it well in general.

Specifically in the Solid ecosystem of read-write linked data, legibility of the data in a Solid pod is important. It is important because developers who are playing with existing programs need to be able to see how they work, and then, when they are making their own applications, to understand what is happening and what has gone wrong. The "view source effect", which allowed the web to spread through people copying and adapting each other's web pages, is important for linked data in Solid pods too.

It has been the long communal experience of the internet protocol design community that simple text-based protocols and formats (like FTP, SMTP, HTTP, HTML, XML) have been key to adoption and understanding of new systems. So while you can ship a binary format and give everyone tools for translating it into something legible, it is actually better to ship legible formats directly across the net.

It turns out there are a few other benefits of legible serialization in the system.

Storage space, transmission time and bandwidth

When RDF triples are encoded using the syntax tools Turtle provides, they can take significantly less space than when each triple is written out in full. This may or may not be important. The text may get compressed at another stage.
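As a rough illustration of the space saving, here is a minimal sketch that writes the same three triples once in the long, one-triple-per-line style of N-Triples and once using Turtle prefixes with `;` and `,` grouping. The data, the `ex:` namespace and the helper names are made up for this example, and the serializer is deliberately simplistic (it does not escape strings or check that local names are legal), so treat it as a sketch rather than a real Turtle writer.

```python
# Hypothetical example data: (subject, predicate, object) as full IRIs,
# or as an already-quoted literal when the term starts with '"'.
EX = "http://example.org/"
FOAF = "http://xmlns.com/foaf/0.1/"
PREFIXES = {"ex": EX, "foaf": FOAF}

triples = [
    (EX + "alice", FOAF + "name", '"Alice"'),
    (EX + "alice", FOAF + "knows", EX + "bob"),
    (EX + "alice", FOAF + "knows", EX + "carol"),
]


def full(term):
    """Write a term in full: literals as-is, IRIs in angle brackets."""
    return term if term.startswith('"') else f"<{term}>"


def short(term):
    """Abbreviate an IRI with a known prefix, falling back to the full form."""
    if term.startswith('"'):
        return term
    for prefix, namespace in PREFIXES.items():
        if term.startswith(namespace):
            return f"{prefix}:{term[len(namespace):]}"
    return f"<{term}>"


def to_ntriples(triples):
    """One complete triple per line, nothing shared between lines."""
    return "".join(f"{full(s)} {full(p)} {full(o)} .\n" for s, p, o in triples)


def to_turtle(triples):
    """Prefix declarations, then one block per subject using ';' and ','."""
    lines = [f"@prefix {p}: <{ns}> ." for p, ns in PREFIXES.items()]
    lines.append("")
    by_subject = {}
    for s, p, o in triples:
        by_subject.setdefault(s, {}).setdefault(p, []).append(o)
    for s, predicates in by_subject.items():
        parts = [
            f"{short(p)} {', '.join(short(o) for o in objects)}"
            for p, objects in predicates.items()
        ]
        lines.append(f"{short(s)} " + " ;\n    ".join(parts) + " .")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(to_turtle(triples))
    print(len(to_ntriples(triples)), "characters long form,",
          len(to_turtle(triples)), "characters as Turtle")
```

The prefix declarations are a fixed cost, so the saving grows with the number of triples that share a namespace or a subject.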

Consistency and continuity

When data is automatically serialized, it is useful for the same data to consistently give the same text.

It also should be the case that small changes in the data produce small changes in the text.
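One simple way to get both properties, sketched here under the assumption that the terms are already in prefixed-name form and that the graph has no blank nodes, is to sort the triples before writing them out. The function names are hypothetical and the output is one statement per line rather than fully pretty-printed Turtle.

```python
import difflib


def serialize(triples):
    """Write one statement per line in sorted order.

    Sorting makes the text depend only on the set of triples,
    not on the order in which they happen to be held in memory.
    """
    return "".join(f"{s} {p} {o} .\n" for s, p, o in sorted(set(triples)))


def changed_lines(old_text, new_text):
    """Return only the added and removed lines between two serializations."""
    diff = difflib.unified_diff(
        old_text.splitlines(), new_text.splitlines(), lineterm="", n=0
    )
    return [
        line
        for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


before = [
    ("ex:alice", "foaf:name", '"Alice"'),
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:bob", "foaf:name", '"Bob"'),
]

# The same data held in a different order gives the same text.
same_data_other_order = list(reversed(before))

# One added triple shows up as one added line.
after = before + [("ex:alice", "foaf:knows", "ex:carol")]

if __name__ == "__main__":
    print(serialize(before) == serialize(same_data_other_order))  # True
    print(changed_lines(serialize(before), serialize(after)))
    # ['+ex:alice foaf:knows ex:carol .']
```

Blank nodes make this harder, because their labels are arbitrary; a real serializer needs some further rule to label or nest them consistently.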

In tests

Tests of a system involve performing a function, and then comparing the actual output to see if it matches the expected output. When the output is RDF data, it is good to compare the serialized, pretty-printed versions rather than the raw triples. This allows one to understand easily what has happened when they don't match.
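A test helper along these lines might look like the following sketch. It assumes both texts came out of the same consistent serializer, and the name `assert_same_text` is invented for this example; on a mismatch it reports a line-by-line diff instead of just saying that two graphs differ.

```python
import difflib


def assert_same_text(expected, actual):
    """Fail with a readable diff when two serialized documents differ."""
    if expected == actual:
        return
    diff = "\n".join(
        difflib.unified_diff(
            expected.splitlines(),
            actual.splitlines(),
            fromfile="expected",
            tofile="actual",
            lineterm="",
        )
    )
    raise AssertionError("serialized output differs:\n" + diff)


if __name__ == "__main__":
    expected = 'ex:alice foaf:name "Alice" ;\n    foaf:knows ex:bob .\n'
    actual = 'ex:alice foaf:name "Alice" ;\n    foaf:knows ex:carol .\n'
    try:
        assert_same_text(expected, actual)
    except AssertionError as error:
        # The message points straight at the line that changed.
        print(error)
```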

With Source Code Control

When the serialized data is checked in to a source code control system, such as Mercurial (hg), git, SVN or CVS, then it is very important for small changes in the underlying data to map to small changes in the serialized text. This makes the record of changes smaller, and makes it possible for someone investigating a problem which happened at a certain point to compare two versions of the code, before and after.

Legibility when writing

Computer languages which have features for being clear and legible also allow people writing code by hand to be clearer in what they are writing. It is easier to write code -- such as say configuration files, or business rules -- if you are writing in a legible, pretty, language.

Collaborative editing of code

Even more important is the clarity and simplicity of expression when more than one person is brainstorming over something in RDF, such as on a whiteboard or in a chat. Being able to share and co-design a bit of data is key to the remote collaborative design process. (This is what N3 was originally made for.)

Legibility tools in Turtle

Data in a pod is in RDF, typically in Turtle or JSON-LD. It may also be in a superset of Turtle called N3. Turtle, and the more powerful N3, have a set of syntax forms which make things more legible. Use them when you serialize a document.

The following are only available in the full N3 language.

Specific formatting parameters

The specific values of these parameters are famous for being very much a question of personal choice. Different people's brains work differently when it comes to white space, indent, line length, and breaking of long comment strings in RDF. The "spaces or tabs" question is infamous for bike-shedding, and indent and whitespace amounts are a question of taste. But if you are a developer and users may in some cases end up seeing the formatted text you generate, then you have to guess their preferences, not just insert your own. I won't give you my favorite settings here, to keep the topic on the principles.
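One way to avoid baking in your own taste is to pass the formatting parameters in rather than hard-coding them. The following is only a sketch: `FormatOptions`, `format_subject` and the default values are invented for illustration and are not recommended settings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FormatOptions:
    """Formatting choices supplied by the caller (hypothetical names)."""
    indent: str = "    "   # arbitrary default, not a recommendation
    line_width: int = 80   # arbitrary default, not a recommendation


def format_subject(subject, properties, options=FormatOptions()):
    """Format one subject block in Turtle style.

    `properties` is a non-empty list of (predicate, [objects]) pairs whose
    terms are already written as prefixed names or quoted literals.
    """
    if not properties or any(not objects for _, objects in properties):
        raise ValueError("each subject needs at least one predicate and object")
    lines = [subject]
    for i, (predicate, objects) in enumerate(properties):
        end = " ." if i == len(properties) - 1 else " ;"
        one_line = f"{options.indent}{predicate} {', '.join(objects)}{end}"
        if len(one_line) <= options.line_width:
            lines.append(one_line)
        else:
            # Too long for the chosen width: one object per line, one level deeper.
            lines.append(f"{options.indent}{predicate}")
            for j, obj in enumerate(objects):
                separator = end if j == len(objects) - 1 else ","
                lines.append(f"{options.indent * 2}{obj}{separator}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    props = [("foaf:name", ['"Alice"']), ("foaf:knows", ["ex:bob", "ex:carol"])]
    print(format_subject("ex:alice", props, FormatOptions(indent="  ")))
    print(format_subject("ex:alice", props, FormatOptions(indent="\t", line_width=20)))
```

The same data then comes out in whatever layout the user, or your best guess about the user, prefers, while the serializer itself stays the same.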



Tim BL