Unfancy API Documentation

Published 23 February 2013

Warning This article is obsolete, and is only kept here for historical purposes. You should look at Dollphie for recent developments.

I've been fiddling with different approaches to API documentation over the last few years, and found a sweet spot that actually fits my needs. However, I still faced a problem: there were no tools to support my approach to documentation. There were lots of tools out there taking different approaches, but I couldn't use any of them because they were either monolithic or language-specific, and not that easy to bend my way.

So, in this blog post I describe that approach in more detail, along with my way of solving not just my own problem, but alleviating the cost of coming up with different approaches to documentation, which should probably benefit more people (especially those who disagree with my dictatorial rules :3)

An overview of the documentation landscape

I work with a few different languages on a daily basis. Sometimes I'll code in Python, then hack up some stuff in Clojure, or invoke some clever computer spirits with JavaScript spells. Still, most of my time these days is spent working with LiveScript. At the end of the day, though, this context switching between a dozen different languages leads to a really, really annoying thing: they all use completely different tools and environments.

Some of that tooling is specific to a language because it needs to be, for example editor modes, the runtime library, static checkers and linters, etc. Other tools aren't really specific to a language, like editors, debugger interfaces, build tools or documentation toolkits. Except that, more often than not, people tend to make these non-language-specific tools language-specific anyway, which kills the idea of reusing an otherwise generic tool in different environments.

I really mean it: reusing generic tools across different languages is a big thing. And it'll be even more of a big thing as people start working in more than one environment at the same time — front-end developers usually work with at least three.

I lied: documentation isn't actually generic. There are plenty of language-specific parts. For example, language A might have run-time macros, whereas language B has a different concept of modules, and you want to capture those differences in the documentation to help the user grasp the meaning of your code quicker. However, these "language-specific" features are more often than not built on fundamental building blocks that programming languages have shared for years, and those can be generalised fairly well — at least as far as it concerns the programmer's understanding of what an API does, why it does it, how it does it, and how you should use it.

And please, let's not forget that API documentation can be "consumed" and visualised in lots of ways, all from the same basic meta-data, whether it's generalised or not. You could say documentation is made of several steps, and each of these steps produces something useful (or "consumable").

Even with this possibility of generalising things, I've found that the majority of mainstream documentation tools are language-specific, totally overlook the step part, or only let you change how the basic meta-data is visualised in a handful of ways. This is true for JSDoc, Docco, Marginalia, PyDoc, etc. Tools that can give you the meta-data they gather from the source code, like Doxygen, are often monolithic and generate this meta-data in annoying formats, like XML. Tools that don't rely on the structure of the language don't make that meta-information available to other tools, and there isn't a "visualisation only" format (well, you could kind of call Sphinx one, but then you're not really using meta-data).

For example, if you have a project that uses Scala and Java, but you want to present their APIs together because it actually makes sense, you can't easily do that. Or if you want to include the documentation directly in your project's README because you have only one or two public API methods, you also can't do that easily, because there's no tool that just says: "hey, take this documentation meta-data, and generate something I can use in a Markdown-formatted README." You have to reinvent the wheel all the time by parsing the whole source and generating the thing in the format you want, or be forever tied to a particular tool by writing a plugin.

You also can't say, "Oh, I want to parse the documentation in this language using its structure, because that's easy to determine statically, but for this other language I want a generic thing, because it's all too dynamic and you can only take guesses about what everything means. But I still want to display them using the same format." If that happens, you're basically out of luck.

Separating the concerns in documentation tools

Most of this can be fixed by separating the documentation process into some well-defined steps, and creating the whole picture by mixing-and-matching tools that fit those steps. Below is a list of concerns that could be addressed by any documentation tool, each independent of all the others:

  1. Analysing the entities that occur in a code-base (classes, modules, etc.)
  2. Resolving types in the code by way of type-inference.
  3. Resolving relationships between entities (multi-methods, inheritance, etc.)
  4. Analysing or guessing contextual relationships or abstractions.
  5. Merging documentation from different projects into one.
  6. Searching for a particular entity in a code-base.
  7. Visualising documentation.

This list is in no way authoritative; there might be plenty of concerns that aren't listed here, depending on the particular needs of a project, or these concerns might be too much for a particular project. But again, this just shows how we could all benefit from separating those concerns into tools that can be easily mixed and matched to achieve a greater goal: a nice way of documenting your APIs.

So, I did say that we could separate a documentation tool into a fixed number of steps. There are four steps in documenting an API, to be more precise:

  1. A developer annotates their code following certain rules. This helps analysis tools get a better sense of the code, and helps users of the API see more quickly how to use a particular part of the API, or what its intention is.
  2. A tool analyses and extracts meta-information about API entities. Entities are all the things that are visible or usable in some way by the end user. So, classes, modules and functions are examples of entities in most contexts, whereas syntactical constructions like `for` or `if` are not — but more on that later.
  3. Meta-information including structure, relationships and annotations is put together in a single place, in some easy-to-parse, easy-to-use format.
  4. A visualisation tool uses the previous meta-information to display the structure, relationships, annotations, usage examples and any other kind of information that might be associated with the entities in a way that makes sense in the context of the visualisation tool.

The graph below shows one way the concerns could fit in the steps above:

http://dl.dropbox.com/u/4429200/blog/2013-02-23-doc-steps.png

So, why would this work? Simple (despite what my failures at graphs would tell you): the annotated source is the canonical reference for everything; it's what tells us which entities we care about in the next stages.

Then, several tools analyse this annotated source from different angles: one tool may extract entities using a simple structural analysis of the sources, another can extract type annotations, another can make sense of annotation meta-data in the comments, or here-docs if the language supports those. At the end of the day, however, all of those tools are able to communicate with each other and collaborate because they use a standard format.

Then, all of this (so far disjoint) information about the entities in your code base moves on to the merging stage. At this point, all of the entity information that was gathered is put together as a single thing, and tools collaborate with each other to resolve relationship graphs; they could also provide better contextual organisation, resolve links between entities, automatically "suggest" similar or more abstract methods, load examples, etc.
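
To make that concrete, here's a toy sketch of the merging step in JavaScript. All of the data and the merge function below are made up; the point is only the mechanics: two tools have produced partial records about the same entity, and merging folds them into one:

// Hypothetical partial results from two independent analysis tools
var structural = [{ name: 'id', kind: 'Function', file: 'id.js' }]
var typed      = [{ name: 'id', signatures: ['a -> a'] }]

// Folds partial entity records into single records, matching them by name
function merge(as, bs) {
  return as.map(function(a) {
           var b = bs.filter(function(x){ return x.name === a.name })[0] || {}
           var r = {}
           Object.keys(a).forEach(function(k){ r[k] = a[k] })
           Object.keys(b).forEach(function(k){ r[k] = b[k] })
           return r })}

merge(structural, typed)
// => [{ name: 'id', kind: 'Function', file: 'id.js', signatures: ['a -> a'] }]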

At last, all the information that was put together in the merging phase passes on to the last phase: visualisation. Again, since everything uses a standard format, the visualisation tools don't need to care about which tools were run, they just care about the information they were given. At this stage, the information can be used to display the API documentation inline in README files, if it is small enough, or as an interactive web application, or even as a searchable resource within your text editor (Emacs has this).

Meet Dollphie, Calliope, Kalamoi and Papyr°

As an experimental implementation of the concept described above, I quickly hacked together three tools (Kalamoi, Papyr°, Calliope) over the course of this week. I needed a generic documentation format, since I want to use the same toolkit for documenting all the languages I work with, but I also needed to take into account the differences in each language's semantics and the fundamentals behind each kind of entity. And I wanted to use different kinds of visualisation depending on the context, so that small modules with one or two public functions would have all their API documentation in a README.md, whereas more complex ones would be displayed in an interactive web application.

These tools are far from production-ready (for example, there is no support for class hierarchies right now, nor field visibility, nor multi-methods), and they need some serious refactoring (for example, Kalamoi does everything from analysis to merging), but they're usable. Calliope is an opinionated wrapper over the other two tools for JavaScript projects that already use CommonJS packages, and it's a really thin wrapper.

The basic idea of this toolkit revolves around a simple and generic structural annotation format (Dollphie) that describes entities in a non-language-specific way. This alone gets us around the problem of relying on a language's structure in overly dynamic languages like JavaScript, and works as a fallback providing common ground for languages that don't have specialised tools. Then there's a parser and analyser (Kalamoi) for this format, which spits out a standard documentation JSON with the entities' meta-data, and finally the visualiser (Papyr°) takes that JSON and displays it to the user in an interactive way.

The generic structural annotation

Since most of my work is in an overly dynamic language like JavaScript (granted, I use LiveScript now, but it has the same semantics), parsing the source and trying to make sense of it isn't always easy. Say you have the following:

/// Given an ID, sanitises it, otherwise generates a unique ID.
var id = function() { var guid = 0

                      return function(name) {
                               return name?    sanitise(name)
                               :      /* _ */  (++guid).toString(16) }

                      // Sanitises an ID
                      function sanitise(name) {
                        return name.replace(/\s/g, '-')
                                   .replace(/[^\w\-]/g, '')
                                   .toLowerCase() }}()

Even though this use of the "Module Pattern" is all too common, it's not obvious right away that the documentation refers to the inner function, rather than to the immediately-invoked function expression that surrounds it. Or say you have something like this:

/// Takes the first `n` items of a list.
var take = curry(function(n, xs) {
                   return xs.slice(0, n) })

This is also fairly common (not just the currying, but decorating a function in general), not just in JavaScript, but also in Python. It doesn't make the type of `take` immediately obvious, unless you happen to have figured out the type of `curry` already, or your documentation tool evaluates a certain subset of JavaScript to figure these things out.
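
Incidentally, this is exactly the kind of case where explicit annotations pay off. Using the annotation format described below (and assuming // as the comment marker for JavaScript; the markers are configurable per language), the take example could carry its own, made-up type signature:

/// Function take
// Takes the first `n` items of a list.
//
// :: Number -> [a] -> [a]
var take = curry(function(n, xs) {
                   return xs.slice(0, n) })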

So, you have to annotate your code with types anyway, but there might be cases where your documentation tool will have trouble figuring out the structure of your code as well. Not to mention you'd need extremely language-specific tools to extract this information. In the end I ditched this approach and decided on a generalisation of the languages I mostly worked with (which at the time were mainly JavaScript and Python), not just in the way entities are annotated, but also in the way the structure itself is annotated.

The documentation format I came up with (and evolved informally over the course of this year) is heavily inspired by reStructuredText and Markdown. The basic idea is that you have sections with varying depth levels (nested sections, yay!), and each section can have a range of information belonging to it, including other sections. This way, entities are just special kinds of sections. The main difference from systems like Docco is that there are subtle annotations that make it possible to figure out the structure of the code and extract meta-data concerning a particular entity.

The whole parser fits in under 100 lines of LiveScript code. You can get a feel for the documentation format by reading the Kalamoi, Papyr°, Calliope or Boo source code (Boo is written in JavaScript, the others are all LiveScript). Adapting the parser to other languages can be done in three lines of code, though until I refactor things this is still a pain:

// in parser.ls
exports.luaSyntax = boo.derive(baseSyntax, { comment: '--' })

// in index.ls
var languageFor = function(extension){ ...
  case '.lua': return 'Lua'

var syntaxFor = function(language){ ...
  case 'lua': return parser.luaSyntax

You probably guessed that the parser relies on single-line comments, and yes: that's because the parser is line-based, for simplicity. Basically, every line that starts with <comment-character> is treated as special meta-data by the parser, and the rest is treated as code. Lines that start with a repetition of the <comment-character> are treated as sections, and the number of <comment-character>s defines the level of that section (just like Markdown headings):

## Module foo
# Description of `foo`.

### -- Internal functions

#### Function id
# Description of `id`
id = (x) -> x

### -- Public functions

#### Function k
# Description of `k`
k = (x) -> (y) -> x

As you can probably guess, the special "kind" of a section is defined by the first <word> after the comment characters. A <word> here is more in the Lisp sense, except the only thing that works as a word boundary is white-space. Sections whose kind is -- are special headers that have no entity meaning, but can be used to logically group things within the source code, to make it easier to navigate and understand the structure.

There are only two other special constructs that can be used inside a section to declare special meta-data about an entity: the <generic meta-data>, in the form :keys: optional value, and the <type signature>, in the Haskell vein of :: type (although the format of the type signatures themselves is not specified; that's not a job for this particular language):

### Function id
# :internal, deprecated:
# The identity function.
#
# :: a -> a
id = (x) -> x

This all works well for well-structured source code, and probably somewhat well for pseudo-literate code.
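
For the curious, the gist of this line-based classification fits in a handful of lines. The sketch below is not Kalamoi's actual parser, just an illustration of the rules above, assuming a single-character comment marker:

// Classifies one line of annotated source. Repeated comment characters
// open a section (the count is the depth, the first word is the kind);
// a single comment character marks prose, meta-data or type signatures;
// anything else is code.
function classify(line, comment) {
  var match = line.match(new RegExp('^(' + comment + '+)\\s*(\\S+)?'))
  if (!match) return { type: 'code', text: line }

  var depth = match[1].length
  var rest  = line.slice(depth)
  if (depth > 1) return { type: 'section', depth: depth, kind: match[2] }

  var sig = rest.match(/^\s*::\s*(.+)/)
  if (sig) return { type: 'signature', signature: sig[1] }
  var meta = rest.match(/^\s*:([^:]+):\s*(.*)/)
  if (meta) return { type: 'meta', keys: meta[1], value: meta[2] }
  return { type: 'text', text: rest.trim() }}

classify('#### Function id', '#')  // => { type: 'section', depth: 4, kind: 'Function' }
classify('# :: a -> a', '#')       // => { type: 'signature', signature: 'a -> a' }
classify('id = (x) -> x', '#')     // => { type: 'code', text: 'id = (x) -> x' }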

A format to exchange entity meta-data

I said before that all the benefits of this approach derive from a standard and simple format for exchanging meta-data about entities. And nowadays there isn't anything easier for this kind of structured data than JSON. It's available everywhere, easy to parse, easy to write, and it usually maps nicely to many languages' core data structures (at least the dynamic ones), since it uses Maps, Vectors, Strings and Numbers as types, rather than a document tree.

This format is devised from a broad generalisation of all entities, based on what users of the API might be looking for when they browse through API references. This includes ideas that are common in other documentation systems, and kind of obvious, like an entity's name, its type, or a description of what it means:

type URI = String

type EntityKind = Data
                | Module
                | Function
                | Object
                | Type

type Author = { name    :: String
              ; email   :: String
              ; website :: String
              }

type Entity = { id         :: URI -- A unique ID for the entity
              ; parent     :: URI -- The entity that contains this entity
              ; kind       :: EntityKind -- The type of the entity being represented
              ; name       :: String -- The name of this entity
              ; signatures :: [String] -- The types of the entity
              ; text       :: String -- A text describing this entity
              ; code       :: String -- The source code associated with this entity
              ; meta       :: { String -> String | Bool } -- Additional metadata
              ; language   :: String -- The language of the source code
              ; line       :: Number -- Where the source code starts
              ; end-line   :: Number -- Where the source code ends
              ; file       :: String -- The file where the entity appears
              ; copyright  :: String -- Short copyright notice (if any)
              ; repository :: String -- The canonical repository for the code
              ; authors    :: [Author] -- Who contributed to the code
              ; licence    :: String -- The terms in which the code is released
              }

type Entities = [Entity]
Note Yes, I do realise it makes more sense to create a new data structure to store all the information related to code; I'll likely do that when refactoring these tools. I'll probably make `code` a List as well, so it can handle multi-methods defined in different parts of the code and such.
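
To make the format concrete, here's what an entity record for the earlier id function could look like as JSON. Every value here is made up; it only instantiates the fields above:

{ "id":         "example://prelude/id"
, "parent":     "example://prelude"
, "kind":       "Function"
, "name":       "id"
, "signatures": ["a -> a"]
, "text":       "The identity function."
, "code":       "id = (x) -> x"
, "meta":       { "internal": true, "deprecated": true }
, "language":   "LiveScript"
, "line":       10
, "end-line":   10
, "file":       "src/prelude.ls"
, "copyright":  "(c) 2013 Someone"
, "repository": "git://example.org/prelude.git"
, "authors":    [ { "name": "Some Author"
                  , "email": "someone@example.org"
                  , "website": "http://example.org" } ]
, "licence":    "MIT"
}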

This, however, does not cover several kinds of entities, especially entities that have intrinsic relationships with others in a way that changes their behaviour, like Classes and SuperClasses/Mixins/Traits/Delegative & Concatenative Prototypes, etc. These relationships could just as well be encoded in the type annotation, but then you lose the ability to use them to convey additional information to the user when visualising the documentation; and encoding them as additional but non-specified meta-data means that, while a given tool can easily make sense of the data, we don't have a common ground for it to collaborate with every other tool in this environment, so that's a no-go.

As it currently stands, these things would need to be devised on top of the current standard and added to the specification, so we don't risk conflicts between different notations, and tools that don't understand a given "extension" can just ignore the additional meta-data.
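
For instance, a hypothetical inheritance extension could standardise a couple of extra fields that relate entities by their URIs. The field names below are invented, and the other Entity fields are elided; a tool that doesn't know about the extension just carries the extra data along untouched:

{ "id":       "example://collections/SortedList"
, "kind":     "Object"
, "name":     "SortedList"
, "inherits": ["example://collections/List"]
, "mixins":   ["example://collections/Comparable"]
}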

Visualising entity meta-data

Well, meta-data is good for applications to manipulate, but not so good for the end user, so we are left with the problem of how to visualise this meta-data. It turns out there are many different cases where we want different visualisations. The goal of Papyr° is to cover one of them: finding some entity that fits the problem the user is trying to solve in a potentially large and complex API, and discovering how to use that entity. In its current form, Papyr° achieves almost none of these goals, although it is a starting point.

http://dl.dropbox.com/u/4429200/blog/papyro.png

Helping the user find the right component for a certain problem involves first knowing how the user searches for information about that particular problem. In some cases it's easy to guess: the user might be looking for a stable sort, and as such they'll type "stable sort" in the search box and expect to see classes, functions, procedures, methods, whatever will do the thing they want. But perhaps the user isn't really looking for a stable sort; perhaps in their case some other sorting algorithm will do the job better or faster, because they don't actually have the constraints they gave in the first place. In this case, the system has to guess the intentions of the user from the provided context, and suggest things that fit the context but have different constraints, or different benefits.

This also kind of asks for a more expressive search, one where the user can state their constraints better. For example, if they know they're working with a list of Elements, they might be looking for a sorting function that is efficient for those, so they could provide a basic type, say sort :: [Element] -> [Element], and the search engine would attempt to match sorting functions with a type approximating that one. Hoogle does this kind of approximate type search. Other constraints might involve platforms, mutability, etc. Ranking the search results by favouring the ones that better fit the constraints the user provided would also be an interesting addition.
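
As a rough illustration of the ranking idea (and nothing like Hoogle's real, unification-based search), approximate type matching could start as simply as tokenising the signatures and scoring candidates by overlap. Everything below is made up for the example:

// Naively tokenises a type signature, keeping only words and brackets
function tokens(signature) {
  return signature.split(/[^\w\[\]]+/).filter(Boolean) }

// Scores a candidate signature by how many tokens it shares with the query
function similarity(query, candidate) {
  var qs = tokens(query), cs = tokens(candidate)
  return qs.filter(function(t){ return cs.indexOf(t) !== -1 }).length
       / Math.max(qs.length, cs.length) }

var entities = [ { name: 'sort',    signatures: ['[Element] -> [Element]'] }
               , { name: 'take',    signatures: ['Number -> [a] -> [a]'] }
               , { name: 'reverse', signatures: ['[Element] -> [Element]'] } ]

function search(query, entities) {
  return entities.slice()
                 .sort(function(a, b) {
                    return similarity(query, b.signatures[0])
                         - similarity(query, a.signatures[0]) })}

search('[Element] -> [Element]', entities)
// => sort and reverse rank above take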

Currently, Papyr° does not provide automatic contextual hints for potentially useful components, but this is definitely a feature I want to work on, and one that could perhaps be branched into other projects. For example, an analysis tool could be run on the entities' meta-data earlier on to determine possible logical tags for each entity, which the visualisation tool could use to relate different entities and rank them. Approximate type searches are planned but not yet implemented; this is something I need to give a little more thought, since the type signature format is not defined, so it's not as straightforward as Hoogle's type search. The problem is also more about analysing the user's input and determining the types, rather than about the entities using different type annotations, since you could probably determine that from the meta-data or the language.

The second goal of Papyr° is helping the user use a particular entity to solve their problem. This is partially done by automatically loading literate Markdown examples and displaying them alongside the source code and other details for an entity. The literate examples are, however, static. There could be some use in defining a format for interactive examples, or even examples that would be run by a particular engine when building the documentation JSON, for the cases where fully interactive examples in the browser aren't possible.

The only other visualisation tool I plan on writing is Reedox, a minimal visualisation format for overly small APIs (one or two functions), where it doesn't make much sense to have the user load an entire new page to search for something when the whole API fits in two paragraphs of your project's README. And this is the case for quite a few packages on NPM, including many I plan on writing in order to refactor the projects outlined above.
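
A Reedox-style renderer over the same meta-data could be tiny. The sketch below is not Reedox (which doesn't exist yet), just an illustration of how little it takes to go from the entity JSON to README-able Markdown:

// Renders a list of entities as a Markdown fragment for a README.
// Made-up code; only `name`, `signatures` and `text` are used here.
function toMarkdown(entities) {
  return entities.map(function(e) {
           return '### ' + e.name + '\n\n'
                + (e.signatures || []).map(function(s) {
                    return '    :: ' + s + '\n' }).join('')
                + '\n' + e.text })
                 .join('\n\n') }

toMarkdown([{ name: 'id', signatures: ['a -> a'], text: 'The identity function.' }])
// => "### id\n\n    :: a -> a\n\nThe identity function."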

Conclusion and future work

This is just the beginning; there's lots of stuff to do, not only in standardisation, so we can get better processing and analysis tools to extract the meta-data we need, but also in visualisation tools, so we can actually use that meta-data in a way that's useful for the user.

Again, none of the projects above is really finished, but they can work for you right now, and the core interfaces are unlikely to change. Calliope will surely keep the same interface, and it'll be updated to reflect any changes that might happen in the other projects. It's also quite easy to integrate with your existing NPM package, so you can give it a try :3

Nonetheless, I'll be actively working on these and other tools (testing and building, mostly). More crazy stuff to come ;3