Becoming a document engineer

Can software engineers teach us how to create documentation? Marjorie Jones thinks we can learn from them.

This article was originally published in Communicator (Spring 2017). Communicator is the authoritative, award-winning journal for UK technical communicators. It is home to high-quality, objective, and peer-reviewed features – current, relevant, and in-depth.

I started my working life as a computer programmer in the 1980s when software was unstructured and monolithic. There was little code reuse, few formal methodologies, and the tools and languages of the day often didn’t enforce good practices. It wasn’t uncommon for a software component to consist of 10,000 lines of source code in a single file.

In the last 30 years, there have been many changes in software development. Code is now structured and modular and many components are reusable. Modules are small and self-contained. Current tools and languages enforce good practices and software development methodologies are mature and well understood. And programmers are now called software engineers.

Similar changes are happening in technical communication. Historically we worked on unstructured, monolithic documents, with no component reuse, limited tool support and few formal methodologies. Now we are embracing structured content and reusable components. Sophisticated authoring tools are available and content creation methodologies are maturing.

This gives us flexibility to create vast amounts of content responsively and efficiently. It also makes the content creation process more like engineering (where each output is assembled from many component parts) than traditional technical writing (where each output is generated from a single source file).

In software engineering, the transformation didn’t happen overnight and the journey wasn’t always smooth. In this article, I’ll look at some of the things that software engineers learned along the way and suggest how technical communicators can apply this knowledge to their work to become document engineers.

An early failure

The company I was working for as a junior programmer in 1983 had already tried the new ‘structured programming’ methodology on one project. The result was disastrous. It was the buggiest system we had. The code was hard to understand and hard to maintain. Every time the code was changed, something unrelated stopped working. The IT manager decided that structured programming didn’t work so we went back to writing unstructured monolithic programs.

That conclusion seems surprising now. With hindsight, I can understand some of the reasons for the failure. It isn’t possible to take something big and complex, chop it into lots of little pieces, give them to multiple people to implement, and somehow expect that they will magically assemble themselves into the right thing. They won’t! You need the right pieces. You need tools and processes to manage the complexity and to identify problems promptly. You need a final integration and test phase to verify that the whole system works as expected.

Software engineers spent many years working out how to use structured programming effectively. As we become document engineers, we can learn from them. Often, what works for software will work for us, and some of the things that go wrong for software can now go wrong for us too. Additionally, if you are working in a software environment, the software engineers will have tools and processes that will help you too.

Software Engineering 101

The software development process has several stages:

Specification: defining what to build
Design: deciding how to build it
Implementation: building each component
Integration and Testing: assembling the components and checking the results.

In the traditional waterfall approach, each stage completes before the next stage starts. These stages are also present in iterative approaches like Agile, where software is developed incrementally. Each feature must still pass through all the stages, although, for some features, some stages may be trivial, and a single Agile sprint often includes tasks from more than one stage. Sometimes it’s hard to be sure one stage is correct until you’ve started the next stage, so there will always be iteration and rework, even in Agile. For example, you may not find a design problem until you try to implement the design. The earlier that problems are found, the cheaper it is to fix them. If they are found late, work in subsequent stages may be blocked, and work in the current stage may have to stop to make resources available to fix the problems.

Software engineers use various tools and processes to help them ensure that the output of each stage is ‘good enough’ to be used as the input for the next stage. This avoids expensive rework and helps keep the project on track.

Specification

In this stage, software engineers identify the user problem and specify what they need to create to solve it. For us, this involves identifying the needs of the reader and defining things like the structure, content and format of the output.

This stage is comprehensively covered by experts such as content strategists and user experience designers, so I won’t consider it in detail here.

Design

In this stage, software engineers decide how to build what was specified. This stage is crucial, as good designs rarely emerge without deliberate effort. Here are some things we can borrow from software engineers.

System design

The system design process identifies the main components of the solution and how they interact. For us, the components are our individual building blocks of content and vary with the authoring tool. For example, we may have topics, snippets, conditions, stylesheets, page layouts, output maps and more.

Like software engineers, we should identify and design our components and their interactions carefully to ensure we have the right components and can find them when we need them. This may include determining project and folder structures and defining naming conventions. If possible, test your proposed structure on a pilot project. It can be costly to change it later.
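A naming convention is easier to keep if it can be checked mechanically. The following is a minimal sketch, assuming a hypothetical convention in which every file name starts with its component type (topic, snippet or image), followed by lowercase words separated by hyphens. The pattern and the sample paths are illustrative assumptions, not a recommendation for any particular authoring tool.

```python
import re
from pathlib import PurePosixPath

# Hypothetical convention: <type>_<lowercase-words-joined-by-hyphens>.<extension>
NAME_PATTERN = re.compile(
    r"^(topic|snippet|image)_[a-z0-9]+(-[a-z0-9]+)*\.[a-z0-9]+$"
)


def find_badly_named(paths):
    """Return the paths whose file name does not match NAME_PATTERN."""
    return [p for p in paths if not NAME_PATTERN.match(PurePosixPath(p).name)]


if __name__ == "__main__":
    sample = [
        "topics/topic_install-overview.xml",
        "snippets/snippet_safety-warning.xml",
        "topics/Install Overview FINAL.xml",
    ]
    for path in find_badly_named(sample):
        print("Does not follow the naming convention:", path)
```

Running a check like this on a pilot project is one cheap way to find out whether the convention is practical before it becomes costly to change.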

Software engineers often hold a design review to verify the design. You can do this informally even if you’re a sole author. On one occasion, I simply asked a friendly developer to bring his coffee over to my desk and give his opinion on my proposed project structure. This helped me clarify what I needed and gave him a better insight into my world. The overhead was tiny, but my design got reviewed in a way that was timely and appropriate.

Component design

Software engineers also design each component to ensure it functions as required, and is easy to understand and maintain.

When software engineers design components, they aim for:

high cohesion and low coupling: keeping closely related things together, and separating things that aren’t closely related
no unnecessary dependencies: dependencies between components make components harder to reuse and maintain
no unnecessary duplication: duplicated information can easily get out of step and cause hard-to-find errors.

Some authoring tools enforce good design practices. Other tools make us work harder to avoid sub-optimal design and the tool may even allow us to create complex and confusing structures that will be hard to maintain later.

Always take extra care with common components, as they will be expensive to change later. For example, you may be able to use snippets to avoid unnecessary duplication. But each snippet should be highly cohesive and loosely coupled (that is, only contain information about one thing) to ensure it can be easily reused and, if necessary, can be modified later without unexpected side-effects.
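One way to find candidates for snippets is to look for paragraphs that already appear in more than one topic. The sketch below assumes the topics are available as plain text with paragraphs separated by blank lines. The function name and the minimum length of 40 characters are assumptions chosen for illustration.

```python
from collections import defaultdict


def find_duplicate_paragraphs(topics, min_length=40):
    """Map each repeated paragraph to the sorted list of topics containing it.

    topics: dict of topic name -> plain text, paragraphs separated by blank lines.
    min_length: paragraphs shorter than this (after normalising) are ignored.
    """
    seen = defaultdict(set)
    for name, text in topics.items():
        for paragraph in text.split("\n\n"):
            # Collapse whitespace and ignore case so trivial differences don't hide a match.
            normalised = " ".join(paragraph.split()).lower()
            if len(normalised) >= min_length:
                seen[normalised].add(name)
    return {p: sorted(names) for p, names in seen.items() if len(names) > 1}
```

Each paragraph reported is only a candidate. Whether it should become a shared snippet still depends on whether it is about one thing and can be changed without side-effects.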

Implementation

In this stage software engineers start building the system they have specified and designed. For many of us, this is our main activity: it’s what we do when we write our content. Here are some things that software engineers consider during implementation that can help us.

Source control

Software engineers use source control (sometimes called version control) to manage their files. It allows them to see who changed what, to revert to a previous version of a file if necessary, to manage multiple conflicting changes, to manage multiple releases, and to baseline versions of source files, for example, to record which versions of which files were used for each build.

For technical communicators working in a structured environment, some form of source control is essential. You need to be able to keep track of components and the changes made to them, and to record which versions of which components have been assembled into which outputs. If your authoring tool does not include source control features, and you don’t have a separate source control tool or equivalent, then finding a source control system should be a priority.

If you document software, I recommend that you use the same source control tool and the same versioning and branching structure as your software developers, if you can. It’s convenient if your chosen system integrates with your authoring tool, but if necessary, you can use the source control interface directly.

As a last resort, you could use a regular system of safe backups, with archived copies for each formal release of your content. This records the sources that were used for each release and allows you to revert to a known state if necessary, although with less granularity than a source control system.
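If you do rely on backups, a simple manifest can record exactly which version of each source file went into a release. This is a minimal sketch, assuming the sources live in one folder. The function names and the JSON manifest format are hypothetical, and this is not a substitute for a real source control system.

```python
import hashlib
import json
from pathlib import Path


def build_manifest(source_dir):
    """Return {relative path: SHA-256 hex digest} for every file under source_dir."""
    root = Path(source_dir)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest


def save_manifest(source_dir, manifest_path):
    """Write the manifest as JSON so it can be archived with the release."""
    manifest = build_manifest(source_dir)
    Path(manifest_path).write_text(
        json.dumps(manifest, indent=2, sort_keys=True), encoding="utf-8"
    )
    return manifest


def changed_files(old, new):
    """Return the sorted paths added, removed or modified between two manifests."""
    return sorted(p for p in old.keys() | new.keys() if old.get(p) != new.get(p))
```

Comparing the manifests of two releases with changed_files shows which components differ, although not who changed them or why.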

Comments

Software engineers use comments extensively in their code to add explanations about the implementation and to help engineers working on that code in the future. Comments are included in the source but don’t affect the executable code.

Comments can be useful for technical communicators too. Sometimes, it is useful to add explanatory comments about decisions you have made, or dependencies that exist. These comments can help anyone who works on the content in the future.

Note that here I’m referring to comments that explain the structure or content of a file, not review comments, which are typically removed when they are resolved, or source control check-in comments, which record the change history of the component.

It’s usually possible to find a way to include comment text in your source files but not in your output files. For example, if your tool supports conditional content, you could define a “Comment” condition that is always excluded from all outputs.

Make sure to put comments where they can be spotted when they are needed, and to keep them up to date.
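The idea of a “Comment” condition can be sketched in a few lines. This example assumes a simplified, hypothetical content model in which each block of text carries a set of condition tags. Real authoring tools store and apply conditions in their own ways.

```python
# Condition tags that are left out of every output, whatever else is selected.
ALWAYS_EXCLUDED = {"Comment"}


def build_output(blocks, excluded=frozenset()):
    """Return the text of every block that survives the condition filter.

    blocks: list of (set of condition tags, text); an empty set means unconditional.
    excluded: condition tags to leave out of this particular output.
    """
    drop = set(excluded) | ALWAYS_EXCLUDED
    return [text for tags, text in blocks if not (set(tags) & drop)]


if __name__ == "__main__":
    blocks = [
        (set(), "Install the battery."),
        ({"Comment"}, "Keep this step in line with the safety snippet."),
        ({"Pro"}, "Enable turbo mode."),
    ]
    print(build_output(blocks))
    print(build_output(blocks, excluded={"Pro"}))
```

The comment stays in the source for future authors but never reaches a reader, because the filter drops it from every output.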

Code reviews

Many software teams insist that code is reviewed by another engineer before it is approved. This review is not the final verification that the code functions correctly: it’s a check that the source code meets the coding standards and follows best practices, that it’s understandable and maintainable, and that the implementation has no obvious errors or omissions. It’s an additional step to ensure quality. It is not a substitute for software testing, which will follow later.

A code review can be useful for us too, but note that on its own, it is not enough. It’s a review of the source, not the output, and will usually be carried out by another technical writer. It can verify, for example, that the correct topics have been created and that styles and conditions have been applied correctly. But the source content may be hard to follow, especially if it is heavily marked up or conditionalised, or includes external components. A proper review of the output is necessary too, in the same way that software that has been code reviewed still needs to be tested.

Unit testing

In software engineering, unit testing helps ensure the component is ‘good enough’ to be built into the rest of the system. It’s usually done by the engineer who wrote the code, or by someone who knows the code well. It involves checking as many paths through the code as possible. This is known as white box testing. In some companies, evidence of unit testing must be supplied when the code is submitted for code review.

For us, the equivalent of unit testing includes a review of the output by a subject matter expert (SME) and perhaps by another writer. But this may not be enough for structured content.

A single source file may be reused in several different outputs, and may not appear correctly in all of them. For example, an incorrectly set condition may cause content to be incorrectly omitted from one product variant, or incorrectly included in another. This won’t be apparent unless each output is built and inspected too. Although we are unlikely (yet) to have a suite of formal, automated and repeatable unit tests as the software engineers do, we can methodically test all paths through our content, to cover all the output options. So build and check everything, however small your change is. If something is wrong, you are most likely to spot it while the changes are fresh in your mind.
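Testing all paths through conditional content can be made methodical by writing down, for each output, what must and must not appear. The sketch below assumes a simplified, hypothetical model in which each block is tagged with the product variants it applies to. The variant names and phrases are invented for illustration.

```python
def build_variant(blocks, variant):
    """Return the text of one output: untagged blocks plus blocks tagged for it."""
    return "\n".join(text for tags, text in blocks if not tags or variant in tags)


def check_variants(blocks, expectations):
    """Build every variant and report content that is wrongly missing or present.

    expectations: {variant: {"must_include": [...], "must_exclude": [...]}}
    """
    problems = []
    for variant, rules in expectations.items():
        output = build_variant(blocks, variant)
        for phrase in rules.get("must_include", []):
            if phrase not in output:
                problems.append(f"{variant}: missing '{phrase}'")
        for phrase in rules.get("must_exclude", []):
            if phrase in output:
                problems.append(f"{variant}: unexpected '{phrase}'")
    return problems


if __name__ == "__main__":
    blocks = [
        (set(), "Press the power button."),
        ({"Pro"}, "Open the Advanced tab."),
        # Deliberate mistake: this block should be tagged "Basic".
        ({"Pro"}, "Basic edition only: upgrade to unlock."),
    ]
    expectations = {
        "Basic": {"must_include": ["upgrade to unlock"], "must_exclude": ["Advanced tab"]},
        "Pro": {"must_include": ["Advanced tab"], "must_exclude": ["upgrade to unlock"]},
    }
    for problem in check_variants(blocks, expectations):
        print(problem)
```

In this example the wrongly set condition is reported twice: once because the text is missing from the Basic output, and once because it appears in the Pro output. That is the kind of error that stays hidden unless every variant is built and checked.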

Integration and system testing

In this stage, software engineers verify that they have built the system they intended to build. The unit-tested components are integrated, and the functionality is tested against the specification. Unlike unit testing, the testing in this stage is independent (carried out by someone who isn’t the original software engineer) and black box (done without knowledge of the internal implementation).

This stage is important for us too, even though the technical content has already been reviewed. In a world where we are writing individual components, what we see and what the SME reviews isn’t necessarily what appears in the final output. Even if we build and test our content locally as we write it, the final output may be different because of other changes or differences between our local environment and the build environment.

Here are some things software engineers do during integration and testing that may help us.

Regular, automated builds

Software engineers call this continuous integration. Often, builds are triggered as soon as updated source files are checked into source control.

If you are documenting software, try to get the documentation built with the software. If your authoring tool supports automated builds and you are using the same source control system and branching structure as the developers, this should be relatively easy.

Even if you document something other than software, regular builds, automated if possible, may be helpful. They give an early warning of build problems and early exposure of your content for others to review or test.

If your authoring tool doesn’t support automated builds, you could consider running a regular manual build, and making the results available in an agreed place. You may like to use a checklist to ensure the build process is consistent and repeatable and that the expected output has been generated.
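Part of such a checklist can be automated even when the build itself is manual. The following is a minimal sketch that checks whether the expected output files exist and are not empty. The function name and the file names in the usage note are hypothetical.

```python
from pathlib import Path


def missing_outputs(build_dir, expected):
    """Return the expected files that are absent or empty under build_dir.

    expected: list of paths relative to build_dir.
    """
    root = Path(build_dir)
    missing = []
    for relative in expected:
        target = root / relative
        if not target.is_file() or target.stat().st_size == 0:
            missing.append(relative)
    return missing
```

For example, missing_outputs("output", ["guide.pdf", "help/index.html"]) returns an empty list only if both files were generated and contain something. It says nothing about whether the content is right, so it supplements the checklist rather than replacing it.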

Automated tests

In software, many tests are automated, and run as soon as a build completes successfully. This is one area where documentation differs from software development. Most of the time, a real person needs to validate the documentation, although some automated verification is possible. For example, spelling, grammar and terminology checkers are available, and the build process can report broken links and other build errors.

Many continuous integration tools automatically report a build failure if the build output (in our case, the documentation) is significantly smaller or larger than in the last successful build, as that may mean something unexpected has occurred.
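If your build tool doesn’t offer a size check, the comparison is simple to sketch. The function name and the 20% tolerance below are assumptions for illustration. A suitable threshold depends on how much your documentation normally changes between builds.

```python
def size_changed_significantly(previous_bytes, current_bytes, tolerance=0.2):
    """Return True if the output size moved by more than `tolerance` (a fraction)."""
    if previous_bytes <= 0:
        # No usable baseline: flag it so that a person takes a look.
        return True
    change = abs(current_bytes - previous_bytes) / previous_bytes
    return change > tolerance


if __name__ == "__main__":
    # Sizes in bytes of the last successful build and the current build (made-up values).
    if size_changed_significantly(1_000_000, 650_000):
        print("Output size changed by more than 20%: check the build.")
```

A large change is not necessarily an error, since a release may legitimately add or remove a lot of content. The check only prompts someone to look.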

Independent tests

Most software teams include independent testers. If you have a test team, you can help them to help you. If you list the documentation that you have created or updated for each feature, the testers can check that the expected documentation is present when they test the related feature. They may even spot problems in the content.

If you don’t have a test team or a formal testing phase, you can ask someone independent to check your final content. You can provide a checklist of things to verify, as you don’t want or need another detailed technical review at this stage.

I used the latter approach when I worked on a project where the deliverables were enormous PDFs, manually created from multiple Word documents, with many potential points of failure. Sometimes the independent check caught an error I hadn’t spotted, like an incorrect date, or a missing chapter, and saved potential embarrassment. In another project, our test team spotted that some images weren’t being rendered properly in the output. This was due to a problem with the version of a software package used for the final build, whereas my local builds had rendered the images correctly.

None of these problems would have been spotted without an independent check.

The future

So that’s an overview of some things that we can learn from software engineers as we embrace the challenges of structured authoring and managing large numbers of content components, to become document engineers.

Here are some other exciting developments.

Agile documentation

Almost everyone working in software has heard of Agile, but many technical communicators are realising that Agile is rather light on how end-user documentation fits in. There are almost as many approaches to integrating documentation with Agile teams as there are technical communicators. Some of these approaches are more successful than others.

While we are still finding our way, I think it’s helpful to focus on a crucial Agile principle that has revolutionised software development, and that’s the iterative delivery of the minimum viable product. Instead of trying to deliver comprehensive documentation, we can use Agile principles to develop the documentation too. That is, we can create the minimum viable documentation for the product and iteratively improve it based on user feedback.

This keeps the documentation simple and focused directly on the needs of the user. It also ensures that user feedback on the documentation is actively solicited and acted on as part of the product development. As a result, the product interface may need to become clearer and more intuitive, to avoid including extra documentation to explain the complexity. Then everybody wins, especially the user.

Docs as code

Recently, some technical communicators have started talking about ‘docs as code’, which encompasses some of the ideas in this article.

Docs as code generally covers things like:

adopting software development tools like source control and syntax checkers
adopting software development practices like continuous integration and automated testing
using lightweight mark-up languages
adopting software development cultures like community authorship.

Even if your documentation projects are too complex for mark-up languages or too commercially sensitive for community authorship, or if you don’t document software, you can benefit from adopting appropriate software development tools and practices.
