Code the Docs

Occasionally scribbling about my adventures in open source technical documentation. Subscribe to the Atom/RSS feed.

Latest Posts

Resumé-as-Code: Small Data Sourcing for Fun & Profit

What better way to show off what I do as a documentation platform developer (“DocOps engineer”) than by generating my own resumé with the technology stack I use professionally? Several prospective clients have remarked on the demonstrative novelty of this approach. If you also earn a crust from setting up documentation environments for engineering teams and the like, consider building your professional documentation from code to show off your skills.

With its singular source and dual HTML and PDF output, my resumé is a microcosmic model implementation of the AJYL tech stack, consisting of AsciiDoc, Jekyll/JAMstack, YAML, and Liquid.

In this post, I will walk you through how I set it all up, explaining how each component language and utility comes into play.

The Sources

Most of the specific content items in my resumé are stored as small data objects arranged in a YAML file, similar to fields in a database. A small-data file is simply a plaintext document made up of keys and values structured such that both people and machines can modify and consume the organized information.

Here is a YAML snippet drawn directly from the single-sourced file.

From data/dominick.yml — Resumé small data source file snippet
posts:
  paid:
    - org: Codewriting, LLC
      titles:
        - Principal
        - Codewriter
      dates:
        start: 2017-10-01
        end:
      description: maintain open-source technical documentation platform (https://www.ajyl.org/liquidoc-cmf[LiquiDoc CMF]); install, instruct, and maintain docs platforms for excellent engineering organizations
      url: https://codewriting.org
    - org: Rocana (formerly ScalingData)
      titles:
        - Technical Documentation Manager
        - Engineering Technical Lead
      dates:
        start: 2015-03-01
        end: 2017-10-01
      description: responsible for writing, editing, and managing all customer-facing technical docs, as well as overseeing all common internal documentation; create & maintain tooling to facilitate docs
      url: https://www.crunchbase.com/organization/rocana

If you inspect the whole file, you will see it contains several clusters of data, each analogous to a table in a database.

One advantage of maintaining small data in flat files is that you can build an entire, perfectly portable datasource with a text editor in mere minutes.
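To make the database analogy concrete, here is a minimal sketch of consuming that small data from Ruby, using only the standard library's YAML parser. The data is an abbreviated copy of the snippet above; the constant names are just for illustration and are not taken from the Codewriting codebase.

```ruby
require 'yaml'
require 'date'

# Abbreviated copy of the snippet above; in practice this would be read from the data file.
RESUME_YAML = <<~SMALL_DATA
  posts:
    paid:
      - org: Codewriting, LLC
        titles:
          - Principal
          - Codewriter
        dates:
          start: 2017-10-01
          end:
        url: https://codewriting.org
      - org: Rocana (formerly ScalingData)
        titles:
          - Technical Documentation Manager
          - Engineering Technical Lead
        dates:
          start: 2015-03-01
          end: 2017-10-01
SMALL_DATA

# Unquoted ISO dates are parsed as Date objects, so Date must be explicitly permitted.
RESUME = YAML.safe_load(RESUME_YAML, permitted_classes: [Date])

# Walk the "table" of paid posts; an empty end: value comes back as nil.
RESUME['posts']['paid'].each do |post|
  ended = post['dates']['end'] || 'present'
  puts "#{post['org']}: #{post['dates']['start']} to #{ended}"
end
```

Note that the empty end: key is what later lets the template print "present" for the current position.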

In this case, parameter values contain AsciiDoc markup. Notice that on the first line beginning with a description: key, the value text contains AsciiDoc hyperlink markup.

This data gets parsed with a Liquid template. This particular Liquid template (see complete source) weaves those AsciiDoc-formatted strings from the data file with more AsciiDoc markup.

From resume.asciidoc — Template mixes Liquid with AsciiDoc, generates new AsciiDoc source
== Employment History
{% for p in posts.paid %}
{{ p.org }} {separator} [dateline]#{{ p.dates['start'] | date: "%b %Y" | remove: "-00-00" }} - {% if p.dates['end'] %}{{ p.dates['end'] | date: "%b %Y" | remove: "-00-00" }}{% else %}present{% endif %}#::{% for title in p.titles %}
_{{ title }}_{% unless forloop.last %}, {% endunless %}{% endfor %} +
{{ p.description }}
{% if forloop.index == 2 %}
{% endif -%}
{%- endfor %}

Because Liquid templates are usually a mix of at least two markup languages (Liquid markup itself plus your target format), they can feel pretty busy. Such is the nature of content templating. Nevertheless, the code they generate can be quite elegant.

Don’t worry about the details of this code. Basically what it does is iterate (“loop”) through the posts.paid array in the data file, creating an entry for each item it comes across (objects describing my past paid gigs). As it churns, Liquid expresses the details of each object, mixing it in with some AsciiDoc markup you’ll also recognize in the output just below.
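If the Liquid syntax is hard to follow, the same loop can be sketched in plain Ruby. This is only an illustrative equivalent, not how LiquiDoc does it: the method name and the shortened descriptions are made up for the example, and the data is hard-coded rather than parsed from the YAML file.

```ruby
require 'date'

# Hypothetical stand-in for the parsed posts.paid array (descriptions shortened).
POSTS_PAID = [
  { 'org' => 'Codewriting, LLC',
    'titles' => %w[Principal Codewriter],
    'dates' => { 'start' => Date.new(2017, 10, 1), 'end' => nil },
    'description' => 'maintain open-source technical documentation platform' },
  { 'org' => 'Rocana (formerly ScalingData)',
    'titles' => ['Technical Documentation Manager', 'Engineering Technical Lead'],
    'dates' => { 'start' => Date.new(2015, 3, 1), 'end' => Date.new(2017, 10, 1) },
    'description' => 'responsible for all customer-facing technical docs' }
].freeze

# Builds one AsciiDoc description-list entry, mirroring the body of the template loop.
def resume_entry(post)
  start  = post['dates']['start'].strftime('%b %Y')
  # A missing end date means the position is current.
  finish = post['dates']['end'] ? post['dates']['end'].strftime('%b %Y') : 'present'
  # Italicize each title and separate them with a comma and line break.
  titles = post['titles'].map { |title| "_#{title}_" }.join(",\n")
  "#{post['org']} {separator} [dateline]##{start} - #{finish}#::\n#{titles} +\n#{post['description']}\n"
end

puts "== Employment History\n\n"
puts POSTS_PAID.map { |post| resume_entry(post) }.join("\n")
```

The {separator} text is left untouched on purpose: it is an AsciiDoc attribute reference that only gets resolved later, at render time.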

From _build/pages/resume.adoc — AsciiDoc source file output after parsing
== Employment History

Codewriting, LLC {separator} [dateline]#Oct 2017 - present#::
_Principal_,
_Codewriter_ +
maintain open-source technical documentation platform (https://www.ajyl.org/liquidoc-cmf[LiquiDoc CMF]); install, instruct, and maintain docs platforms for excellent engineering organizations

Rocana (formerly ScalingData) {separator} [dateline]#Mar 2015 - Oct 2017#::
_Technical Documentation Manager_,
_Engineering Technical Lead_ +
responsible for writing, editing, and managing all customer-facing technical docs, as well as overseeing all common internal documentation; create & maintain tooling to facilitate docs

The above snippet includes just the AsciiDoc generated for the first two entries, though the _build/pages/resume.adoc file is a complete AsciiDoc document source.

(Since this file is built and thus not tracked in the Git repository, to see the whole file you would have to clone and build the Codewriting repo, generating a copy into the _build/pages/ directory.)
Because that final, prebuilt source is generated before this blog, the above snippet is directly sourced from the resulting file, accessed while this blog post was being rendered and ensuring it always accurately reflects the current state of the underlying data. (See the bonus explainer at the end of this post.)

Finally, I render that AsciiDoc file into rich-text output. The PDF document is generated by the killer Prawn PDF utility, after which the Jekyll static site generator renders the same source as HTML. Both of these processes invoke the Asciidoctor rendering engine, as detailed below in The Build.

The Output

Here are images from the relevant details of both rendered artifacts.

Example 1. HTML version
[Image: screenshot detail of the resumé employment section, HTML]
Example 2. PDF version
[Image: screenshot detail of the resumé employment section, PDF]

The Build

Now that we see what happened, let’s explore how it was scripted.

All of this activity is coordinated by LiquiDoc, my free, open-source utility for combining AsciiDoc, YAML, and Liquid with Jekyll and other JAMstack services and utilities. LiquiDoc is a nascent API for the AJYL docstack, designed for building rich-text documents in many editions and formats, all from the same source, by invoking upstream tools (such as Asciidoctor, Liquid, and Jekyll processing engines).

The Codewriting project’s LiquiDoc build configuration file (config) for this site is pretty long, but essentially three relevant stages generate both the HTML and PDF versions of the resumé. Let’s look at the first stage.

From _configs/build-global.yml — LiquiDoc config file resume parsing stage
- stage: parse-resume
  action: parse
  message: Building BD resumé.
  data: data/dominick.yml
  builds:
  - template: _templates/liquid/resume.asciidoc
    output: _build/pages/resume.adoc

Here LiquiDoc is instructed to parse the dominick.yml data with the resume.asciidoc template. This generates resume.adoc, an AsciiDoc source file containing its own output instructions, saved to the ephemeral _build/pages/ directory.

Whenever the resume.adoc file is processed by Asciidoctor, header information (“frontmatter” in Jekyll parlance) designates the output settings and variables. First, it establishes settings and default variables, which will be used in the Jekyll site generation.

From _build/pages/resume.adoc — Key Jekyll frontmatter settings and content variables
// Jekyll-specific settings
:page-layout: resume
:page-permalink: /brian-dominick-resume
:edition: html
:otheredition: PDF
:otheredition_uri: http://codewriting.org/assets/files/brian-dominick-resume.pdf
:includes_path: assets/includes
:separator:
:header_cols: 1
:doctitle: Brian Dominick, Codewriter
:contact: link:http://codewriting.org/contact[contact Brian]
:contact_table_width: 100%

This built file’s header also contains information for the PDF rendering.

From _build/pages/resume.adoc — Key PDF settings and content variables
ifdef::backend-pdf[]
// PDF override settings
:notitle:
:edition: PDF
:otheredition: HTML
:otheredition_uri: http://codewriting.org/brian-dominick-resume
:includes_path: ../assets/includes
:contact: 315-254-0342 / [email protected]
:contact_table_width: 80%
:separator: |
:header_cols: 1,3
endif::backend-pdf[]

The first and last lines are conditionals; they ensure the enveloped settings are only applied during rendering procedures if the backend-pdf setting (“attribute” in AsciiDoc parlance) is defined. These will replace any previously set parameters with the same names.
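The override behavior can be modeled as a simple hash merge. This sketch only imitates the effect of the ifdef block on a handful of the attributes listed above; it is not how Asciidoctor tracks attributes internally, and the method name is hypothetical.

```ruby
# Attributes set unconditionally in the header (values from the listing above).
HEADER_ATTRIBUTES = {
  'edition' => 'html',
  'otheredition' => 'PDF',
  'includes_path' => 'assets/includes',
  'contact_table_width' => '100%',
  'header_cols' => '1'
}.freeze

# Attributes wrapped in ifdef::backend-pdf[] ... endif::backend-pdf[].
PDF_OVERRIDES = {
  'notitle' => '',
  'edition' => 'PDF',
  'otheredition' => 'HTML',
  'includes_path' => '../assets/includes',
  'contact_table_width' => '80%',
  'header_cols' => '1,3'
}.freeze

# Returns the attributes in effect once the header has been read for a given backend.
def effective_attributes(backend)
  return HEADER_ATTRIBUTES.dup unless backend == 'pdf'

  # Later definitions win, just as a later attribute entry replaces an earlier one.
  HEADER_ATTRIBUTES.merge(PDF_OVERRIDES)
end

puts effective_attributes('html5')['header_cols']
puts effective_attributes('pdf')['header_cols']
```

The same source file thus carries both sets of settings, and the backend doing the rendering decides which ones apply.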

Which gets us back to the LiquiDoc config. The next stage instructs LiquiDoc to actually render the PDF version of the output from the source file prebuilt in the last stage (_build/pages/resume.adoc).

From _configs/build-global.yml — LiquiDoc config file resume PDF render stage
- stage: render-resume
  action: render
  message: Rendering resumé files.
  source: _build/pages/resume.adoc
  data: _configs/asciidoctor.yml
  builds:
  - output: _build/assets/files/brian-dominick-resume.pdf
    style: theme/pdf/resume-theme.yml
    doctype: article
    attributes:
      edition: pdf
      icons: font
      pdf-fontsdir: theme/fonts
      imagesdir: assets/images
      safe: unsafe

This block invokes a very simple Asciidoctor conversion using the PDF backend extension, which engages Prawn for the final conversion. A customized YAML-formatted theme file enforces the style of the PDF output.

The resulting PDF artifact will later be copied into the Jekyll-generated website so it can be served, but first we have to build the rest of the site, including the HTML version of the resumé. LiquiDoc has us covered here, as well.

From _configs/build-global.yml — Jekyll build execution stage
- stage: build-site
  action: render
  builds:
    - backend: jekyll
      properties:
        files:
          - _configs/jekyll.yml
          - _configs/asciidoctor.yml
      attributes:
        stylesdir: theme/hyde/css
        source-highlighter: highlightjs

This section of the config script instructs Jekyll to generate a site, one page of which will be the HTML version of the resumé, as dictated by the _build/pages/resume.adoc file’s frontmatter settings referenced above. This conversion does not require a specific instruction, as Jekyll renders all files in _build/pages/ by default, and this prebuilt file is after all just another Jekyll-ready AsciiDoc file like all its siblings.

After the static site is built, LiquiDoc copies the PDF resumé into it, and the whole thing is ready to be served.

Conclusion

These are all the main features of an AJYL platform: small data in YAML flat files, content in AsciiDoc, templating and transformation with Liquid, and Jekyll to put it all together as a proper website.

The rest of the Codewriting site uses similar techniques to massage data and content together, either during prebuilding with Liquid or when rendering final artifacts. Asciidoctor uses parameters (AsciiDoc attributes) when converting AsciiDoc files to HTML, PDF, and other formats. Also during the render build, Jekyll engages Liquid’s template engine to construct the site’s HTML (or even to preprocess AsciiDoc content files containing Liquid markup). All this complementarity is key to the AJYL environment.

While a resumé is merely a novel implementation of this varied and powerful technology stack, imagine what this kind of flexibility can do for conveying complex product data in numerous permutations of editions and output formats.

Meta: How this Post was Made

All of the snippets included in this blog entry are derived from the original source. The Code the Docs blog is sourced in AsciiDoc, enabling dynamic embedding (“inclusion”) of content from the codebase itself.

For instance, look at the source that renders the final code listing above (From _configs/build-global.yml — Jekyll build execution stage).

From the AsciiDoc source for this blog entry
.From _configs/build-global.yml -- Jekyll build execution stage
[source,yaml]
----
include::assets/includes/build-global.yml[tags="build-jekyll-example-snippet"]
----

The original config file (_configs/build-global.yml) is tagged with comment code indicating what to snip for inclusion into another file. (This file is copied into the ephemeral build directory to make it available in this way, hence we are technically reading from a clone saved to _build/assets/includes/ early in the build procedure.) In the blog post source, the AsciiDoc include:: macro includes a request for the section of code tagged build-jekyll-example-snippet.
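For a sense of what tagged inclusion does, here is a simplified Ruby model of extracting a tagged region. The config excerpt is hypothetical (the copy-assets stage name in particular is invented), and the method is only a sketch of the idea; Asciidoctor's own include processing handles more cases than this.

```ruby
# Hypothetical excerpt of a tagged config file. The comment markers follow
# AsciiDoc's tag::name[] / end::name[] convention for tagged regions.
TAGGED_CONFIG = <<~CONFIG
  - stage: render-resume
    action: render
  # tag::build-jekyll-example-snippet[]
  - stage: build-site
    action: render
  # end::build-jekyll-example-snippet[]
  - stage: copy-assets
CONFIG

# Returns only the lines between the opening and closing markers for the given tag.
def tagged_region(text, tag)
  inside = false
  kept = []
  text.each_line do |line|
    if line.include?("end::#{tag}[]")
      inside = false
    elsif line.include?("tag::#{tag}[]")
      inside = true
    elsif inside
      kept << line
    end
  end
  kept.join
end

puts tagged_region(TAGGED_CONFIG, 'build-jekyll-example-snippet')
```

Because the marker lines are ordinary YAML comments, they do not change how the config file itself is parsed.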

Note the advantage over conventional blogging, which entails copying and pasting source code samples into a post that gets saved in a database (and forgotten). Once you’ve duplicated the source, it becomes a divergence risk; unless you update any docs referencing it, they will be inaccurate. How many bloggers do you know who update code samples in years-old posts to ensure they stay current? (Not me, I must admit.)

I will add that this system came in handy throughout the writing of this post, during which I manipulated each source multiple times in order to arrange good example snippets.

Imagine if your user docs always reflected current code examples and other canonical information, drawn straight from the product source.

Asciidoctor, Jekyll, and Hyde for Elegant Docs out of the Box

I swear I don’t just love this stack because of the title it gives me. I do admit “Hyde” wouldn’t have made it into the title if it didn’t just sound so good, but it’s also a nice theme, as you can see.

In any case, that’s what’s under the hood of this site. All pages and posts are written in AsciiDoc dynamic markup, converted to HTML by the Asciidoctor rendering engine, and built into a proper site by the Jekyll static site generator, with a lightweight theme called Hyde giving the pages structure and style. The result is then served on GitHub Pages.

Update!
Now hosted on the fantastic Netlify continuous docs deployment platform!

I’ve been working with Jekyll and Asciidoctor for well over a year now, and I am confident that they are robust, pliable utilities that can solve a very wide range of technical documentation problems. This is especially true when you consider the broad range of Asciidoctor output.

In this blog, I will journal some of my technical experiments in documentation tooling, much of it around Asciidoctor and my own free and open source tool, LiquiDoc. All of this tooling is FOSS, and it’s all written in Ruby, which I started learning some months back in order to hack a complex docs build toolchain. Now I’m looking for new projects and challenges, which I hope to trace here.

Not that I’m hacking or extending it, the other major tool all of this depends on is Git.

If you’re a technical documentarian of any kind, consider checking out my book Codewriting. I’ve set up an RSS feed, and I’ll be posting links on my nascent Twitter account (@_codewriter) as soon as I get my shit together.

Types of User Interface: A Documentarian’s Take

By my count, there are seven major types of user interfaces (UI) that developers make and tech writers document. I have found a few lists of computer interface types, most of them paltry. None speaks from a documentarian’s point of view, but lack of coverage has never meant something is not actually important to technical writers. We’re often just too busy writing about our products to stop and explore just how we do that.

Nevertheless, at risk of stirring up controversy, let me take a stab:

command-line interfaces (CLI)

includes multi-line editors in non-GUI systems

form-fill interfaces (FFI)

includes wizards and dialogs

graphical user interfaces (GUI)

operating systems, video games, and most other mixed-interface systems

augmented reality interfaces (AR)

screens that overlay real-world visualization with text, symbols, and interactive elements

application programming interfaces (API)

protocols and tooling for developers to interact with and extend a product or programming language, including integrated development environments (IDE) and software developer kits (SDK)

natural language user interfaces (LUI)

UIs that accept language commands, including conversational interfaces like Siri and Alexa, as well as chatbots

virtual reality (VR)

immersive experiences intended to maximize and naturalize the interface while minimizing interference from conventional UI elements

You may be tempted to add to these top-level categories — and the Wikipedia article would encourage this — but I challenge you to help me merge and reduce, if possible.

  • Some might say FFIs are a sub-category of GUIs, but I think they’re different enough to instruct on, and both so terribly common, that the distinction is significant.

  • Are augmented reality and virtual reality in the same category — reality-based interfaces?

I don’t know why I find this taxonomy comforting, but it helps me to approach each type of interface with its particular attributes front of mind.

HELP WANTED
Codewriting explores the first four types of UI, each at some length. I have no experience documenting the others, but I would welcome insights in appropriate places from those with relevant expertise.

Interface types are often associated with audience types. If I’m documenting a CLI, I know it’s not for my father. In fact, if my father is the product’s audience, we know a CLI is not the right interface choice. Aside from unforgiving syntax, the command line lacks visualization and only minimally abstracts underlying logic and routines. This makes the CLI a non-starter for huge swaths of less technically inclined users, as well as less frustration-inclined geeks.

While command-line interfaces are considered the most intimidating, they are by no means the most complex. In fact, the list of UI types above reflects my sense of the ascending order of potential complexity. The less burdensome the interface seems, the more open-ended it is, and thus the more complicated it would be to fully capture the range of its utilization.

Imagine the potential complexity of a natural language interface. Most limitations of Siri and Alexa are not in the interface, but in the intelligence. The range of commands such systems can already distinguish makes them spectacularly open-ended.

Drastically more complex than any mere language-based interface is VR, which can incorporate natural language commands, kinetic interface techniques, and eventually advanced brain and neural interfaces. One can almost imagine VR so immersive the user cannot distinguish the digital world from the natural world. Such an interface would be so intuitive, in the truest sense of that term, it would be essentially impossible to document.

Or maybe the instructions could be captured in one simple command: “Act naturally.”

This piece may wind up in the book, but it doesn’t have a home there yet. I’m copying and pasting its contents from topics/ui-types.adoc in my codebase, where the original awaits inclusion in other docs. It will likely get some edits before appearing in the book or elsewhere, but this entry will remain as it appeared when originally posted.

Equality of Docs and Code

Think about your current product. Whether you have a dedicated technical writer/docs team or your developers self-document the product, which domain is considered the canonical source of truth over product features? Let me put it another way. If I asked you to be absolutely certain that the default setting of a particular configuration property is true or false — not the stated value but the actual out-of-the-box value, and not what it’s supposed to be but what it actually is — where would you turn: the codebase or the docs?

If your answer is, “We keep our docs in our codebase, sucker!”, you definitely earn points with me. But you still have not addressed the spirit of my question.

Which part of your codebase contains the canonical answer to my question about that default setting?

Here you are probably wondering if I’m crazy, some megalomaniacal tech writer who doesn’t even understand programming. Surely the answer must be the product code. No matter which one we consider the primary source, no matter how badly some tech writer might wish that “the docs are always right” or some such fantasy, the product source defines the answer, and therefore it must be the prime source of truth.

What if it doesn’t need to be that way? There is only one way your product’s backend, API, user interfaces, and documentation will all reflect the same information about every detail of your product. That way is to single-source all reference matter, drawing on that prime source every time you generate a dependency library, an interface, or a document — basically, every time you build.

For developers, all this really means is self-consciously organizing “small data” related to the product in universally accessible formats such as JSON, YAML, XML, or CSV, rather than native data structures, whenever possible. Native structures can then be built from these prime sources at build time or runtime.
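As a rough sketch of the idea, the Ruby below builds both a native defaults table (product side) and an AsciiDoc reference table (docs side) from one YAML source. Everything specific here is an assumption for illustration: the property names, values, and method names are invented, and a real product would read the YAML from a shared file rather than an inline string.

```ruby
require 'yaml'

# Hypothetical prime source: property names and values are invented for illustration.
SETTINGS_YAML = <<~SMALL_DATA
  - name: cache.enabled
    default: true
    description: Whether responses are cached
  - name: cache.ttl_seconds
    default: 300
    description: How long a cached response stays valid
SMALL_DATA

SETTINGS = YAML.safe_load(SETTINGS_YAML)

# Product side: a native lookup table of defaults, built at load time.
def default_settings(settings)
  settings.to_h { |s| [s['name'], s['default']] }
end

# Docs side: an AsciiDoc table generated from the very same records.
def settings_table(settings)
  rows = settings.map { |s| "| `#{s['name']}` | `#{s['default']}` | #{s['description']}" }
  ['|===', '| Property | Default | Description', '', *rows, '|==='].join("\n")
end

puts default_settings(SETTINGS).inspect
puts settings_table(SETTINGS)
```

With this arrangement, changing a default in the YAML changes the product's behavior and its reference table in the same build, so neither can drift from the other.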

This manner of operating is more challenging for the documentation side, as currently even the more sophisticated documentation systems provide very limited support for deriving usable structured data from external sources. Unconventional tooling is required to generate and serve tables or pages from those cross-platform data structures, but these tools exist and are becoming accessible. And the benefits can be enormous.

The case I am making is that there is no room for divergence between the product’s documented behavior and its actual behavior, and every human-dependent step between the product code and the generated docs is a chance for exactly that divergence.

We get it right more often with APIs, because we tend to source API docs very much within the product codebase. When the codebase and the interface are more directly related and less abstracted, documentation pairs nicely with its underlying code. There are still ways to improve overly robotic API docs, but others are tackling this subject more concertedly than I. (For instance, techcomm blogger/guru Tom Johnson is delightfully obsessed with API docs, as is the network API the Docs.)

For end-user-facing interfaces and docs, as well as for docs that go beyond strict reference format — that is, exceptional docs — smarter tooling is needed to ensure currency and accuracy.

It may seem like I am writing the writer out of the documentation story, but I intend no such thing. Managing these details is not necessarily a huge part of our job, though it is among the more perilous and tedious. There’s a decent chance your bosses and subject matter experts already assume you have some great system or keep docs updated in your sleep somehow. It’s where we can be most objectively wrong, and where we are most workflow dependent, hoping that the communication between us and SMEs is bug-free.

And this makes sense beyond the engineering and support teams, if you think about it. Not only do interfaces and instructions have to correlate to the application’s existential truths, so do legal docs that explain users' legal rights and obligations pertaining to your product. These rules may vary depending on which version they are using, or from where they’re accessing your cloud. End user license agreements, enterprise software service contracts, system permissions requests, and privacy policies can be extremely difficult to coordinate across jurisdictions and product versions. Yet it is critical that any promises made in them are perfectly consistent with their associated product version.

Is anybody drawing product data and docs content from the same data sources? I am eager to learn more about how this is handled at different shops, as I intend to make it a central feature of the strategy I’m developing, unless I learn it is too difficult to implement in too many circumstances. So I’d love to hear struggles as well as success stories.

Codewriting launched at Write the Docs Portland

After six months of writing in private, I released a draft of my book-in-progress about collaborative software documentation practices and tools during the “Writing Day” pre-conference that kicked off “Write the Docs Portland 2017”. I will however be finishing the book in public, hopefully with contributions from others in the field.

Codewriting is my attempt at learning in public, documenting my observations of the field and my approach to problem solving as a self-certified DocOps hacker.

After a recent career shift, I found myself as a former developer employing a cutting-edge methodology in technical documentation, and eventually writing software to build better docs. Now I’m excited to start sharing with the broader software development and documentation community the lessons I’ve learned along the way.

One of those lessons is to operate collaboratively and let subject-matter experts (SMEs) contribute directly to the common wisdom in a shared source repository. In the case of Codewriting, SMEs are my fellow docs hackers and tech writers solving remarkably difficult tooling, workflow, and content problems in forward-thinking ways. In keeping with this observation, such people are strongly encouraged to fork and contribute to the book draft I’ve set forth, writing in public alongside me.

And everyone is invited to begin taking advantage of the book’s content (in draft form, for now) as well as the rudimentary (alpha pre-release) build tooling used to compile the document source into PDF and HTML. Codewriting is its own self-contained source codebase and build platform; anyone can clone or download the repo and use contained Ruby scripts to build the book from source. These same AsciiDoc-centric tools are featured in many of the book’s lessons.

As a former mediocre developer and modestly successful professional writer, in 2015 I found writing about a sophisticated, cutting-edge Big Data IT ops enterprise platform to be my domain. I was overjoyed to find myself writing Rocana’s product documentation using the same principles and processes employed by the engineers writing code. I sit on the Engineering team as a valued contributor, and my Reference Guide source code sits in the product codebase, as valued content.

I believe this system lets me do better work with less frustration and perhaps even fewer errors. What’s more, I think I can help teach it to programmers who want to write and maintain better docs, as well as tech writers who want to work more programmatically, perhaps in closer proximity to engineers.