For years, generating simple and elegant documentation has been a thorn in the side of C++ developers. There are some very established projects out there like Doxygen and Sphinx, but their complexity makes them notoriously hard to customise in any meaningful way, and sadly their default output format has always reminded me of a Windows 3.1 help file.

While recently searching for a simple C++ documentation solution for LibSourcey, I discovered that there simply wasn’t anything suitable out there. Crazy, right? The situation has been the same for years. Anyway, I did what any self-respecting hacker would do: I built one. It’s not perfect, but it does what I need it to do, and you can see the result for yourself. Hopefully, with the right contributions from the open source community, it will continue to improve over time.

The following sections are about the state of C++ parsers and documentation generators, so if you want to go straight to installing and using Moxygen, skip ahead to the Using Moxygen section.

Why Markdown?

There are those who dismiss Markdown as a viable solution for generating API documentation. Yes, it’s true that the format has its standardisation issues, but it’s so simple and nice to work with that if you don’t need to generate highly technical documentation, then I ask you, “why not”?

  • Markdown is a pervasive format that is only gaining traction
  • You can keep your README and other help docs in the same format as your API spec
  • It’s easy to read in source
  • There are many Markdown to HTML generators available
  • You can drop the spec directly into a Jekyll or Middleman type static website generator
  • GitBook can convert Markdown to PDF, Mobi and ePub

Blah blah blah, point made…

Parsing C++

The first requirement is to convert C++ code into a more parsable format, such as XML. There aren’t many solutions for parsing C++, probably because the language is so bloody hard to parse.

The only viable open source solutions that I was able to find were:

  • CastXML — CastXML is an XML output extension for GCC, and the successor to GCCXML. While GCC is an awesome compiler and C++ parser, it only works on unix systems, which makes cross platform support an issue. Also, by the time the C++ is processed by the parser the comments are long gone, so an extra process would be required to extract source code comments. Doable, but not ideal.

  • Clang — Clang has an awesome standards compliant C++ parser, it’s cross platform, but unfortunately it outputs the AST format which would be a Herculean feat to turn into something readable. Let’s keep looking.

  • Doxygen XML — I know I badmouthed Doxygen in the first paragraph, but Doxygen is actually pretty awesome (just the HTML output stinks). It’s cross platform, supports many languages (good for the future development of our Markdown generator), and also outputs raw XML that is relatively easy to parse. Perfect for Moxygen!

Generating Markdown

Markdown is a relatively new kid on the block, especially in the somewhat dusty world of C++, so one wouldn’t expect there to be a lot of tools available just yet.

I found two projects that were of some use:

  • cldoc is a Python project based on the Clang parser. It actually outputs pretty nice documentation, and apparently also outputs Markdown. Unfortunately, the issues are piling up on GitHub and it looks pretty unmaintained at this point in time.

  • doxygen2md is a super simple Node.js parser for Doxygen XML output. Unfortunately it’s only suitable for simple single page documentation. It borked at multiple points when parsing the Doxygen output from LibSourcey, and it also produced some pretty interesting output when I threw some nested namespaces at it.

As it stands, the best documentation parser available for C++ is currently Doxygen, and as Node.js is a quick and easy platform to work on, I opted to extend the parser from doxygen2md in order to build Moxygen.

Using Moxygen

  1. Install Doxygen.
  2. Add GENERATE_XML=YES to your Doxyfile.
  3. Run doxygen to generate the XML documentation.
  4. Install Moxygen like so: npm install moxygen -g
  5. Run moxygen, providing the folder location of the XML documentation as the first argument, i.e. {OUTPUT_DIRECTORY}/xml.

Usage: moxygen [options] <doxygen directory>

Options:

  -h, --help             output usage information
  -V, --version          output the version number
  -v, --verbose          verbose mode
  -a, --anchors          add anchors to internal links
  -g, --modules          output doxygen modules into separate files
  -l, --language <lang>  programming language
  -t, --templates <dir>  custom templates directory
  -o, --output <file>    output file (must contain %s when using modules)
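
As a point of reference for the steps above, here is a minimal sketch of the kind of Doxygen-commented C++ that this pipeline operates on. The `joinPath` function is purely hypothetical and exists only for illustration; `\brief`, `\param` and `\return` are standard Doxygen commands. Doxygen would extract these comments into its XML output, which is what Moxygen then reads.

```cpp
#include <string>

/// \brief Joins a directory and a file name into a single path.
///
/// \param dir   Directory portion, with or without a trailing slash.
/// \param file  File name to append.
/// \return The combined path, using '/' as the separator.
inline std::string joinPath(const std::string& dir, const std::string& file)
{
    // Hypothetical helper: nothing here is specific to Moxygen itself.
    if (dir.empty())
        return file;
    if (dir.back() == '/')
        return dir + file;
    return dir + "/" + file;
}
```

Anything documented this way should show up in the generated Markdown, provided the header is covered by the INPUT setting in your Doxyfile.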

Single Page Output

If you want single page Markdown output then you can run Moxygen like so:

moxygen --anchors --output api.md /path/to/doxygen/xml

Multi Page Output

Moxygen supports the Doxygen modules syntax for generating multi page documentation.

Every \defgroup in your source code will be parsed into a separate output file, with internal references updated accordingly.

Example:

moxygen --anchors --modules --output api-%s.md /path/to/doxygen/xml
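
To make the modules syntax concrete, here is a small sketch of how groups might be declared in a C++ header. The group names (`net`, `util`) and both functions are hypothetical; `\defgroup` and `\ingroup` are standard Doxygen commands. With the command above, I would expect one Markdown file per group, with `%s` replaced by the group name (an assumption based on the option description, so check the output on your own project).

```cpp
#include <cctype>
#include <string>

/// \defgroup net Networking
/// Hypothetical group holding network-related helpers.

/// \defgroup util Utilities
/// Hypothetical group holding general-purpose helpers.

/// \ingroup util
/// \brief Returns a copy of the input with ASCII letters upper-cased.
inline std::string toUpper(std::string text)
{
    // Cast to unsigned char first: std::toupper is undefined for negative values.
    for (char& c : text)
        c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    return text;
}

/// \ingroup net
/// \brief Formats a host and port as "host:port".
inline std::string formatEndpoint(const std::string& host, unsigned short port)
{
    return host + ":" + std::to_string(port);
}
```

Each documented entity is assigned to a group with `\ingroup`, so it is the group membership, not the source file layout, that decides which output file an entity lands in.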

Hosting Your Documentation

We use a combination of GitBook and Middleman for LibSourcey. It’s very easy to set up; the GitBook is located in the doc folder of the repository, with symlinks to the main README and LICENSE files so they can be reused in the book. Next, the static GitBook HTML is copied across to our Middleman website and deployed using GitHub Pages. All in all a very convenient (and cost effective!) solution.

You could also opt to store your documentation in a separate GitBook repository. That way you would just need to push your repository to update your live documentation.

Conclusion

It’s my hope that with Moxygen, C++ developers will now have a way to generate more aesthetically pleasing and readable documentation. Since we only ever look at the docs when we absolutely have to, let’s make the process as enjoyable as possible! :)