Documenting

This guide is for developers who write API documentation. To build documentation run:

repo docs

in the repo and you will find the output under _build/docs/carbonite/latest/.

Documenting Python API

The best way to document our Python API is to do so directly in the code. That way it’s always extracted from a location where it’s closest to the actual code and most likely to be correct. We have two scenarios to consider:

  • Python code

  • C++ code that is exposed to Python

For both of these cases we need to write our documentation in the Python Docstring format (see PEP 257 for background). In a perfect world we would be able to use exactly the same approach, regardless of whether the Python API was written in Python or coming from C++ code that is exposing Python bindings via pybind11. Our world is unfortunately not perfect here but it’s quite close; most of the approach is the same - we will highlight when a different approach is required for the two cases of Python code and C++ code exposed to Python.

Instead of using the older and more cumbersome restructredText Docstring specification we have adopted the more streamlined Google Python Style Docstring format. This is how you would document an API function in Python:

from typing import Optional

def answer_question(question: str) -> Optional[str]:
    """This function can answer some questions.

    It currently only answers a limited set of questions so don't expect it to know everything.

    Args:
        question: The question passed to the function, trailing question mark is not necessary and
            casing is not important.

    Returns:
        The answer to the question or ``None`` if it doesn't know the answer.

    """
    if question.lower().startswith("what is the answer to life, universe, and everything"):
        return str(42)
    else:
        return None

After running the documentation generation system we will get this as the output (assuming the above was in a module named carb):

Answer documentation

There are a few things you will notice:

  1. We use the Python type hints (introduced in Python 3.5) in the function signature so we don’t need to write any of that information in the docstring. An additional benefit of this approach is that many Python IDEs can utilize this information and perform type checking when programming against the API. Notice that we always do from typing import ... so we never have to prefix with typing namespace when referring to List, Union, Dict, and friends. This is the common approach in the Python community.

  2. The high-level structure is essentially in four parts:

    • A one-liner describing the function (without details or corner cases)

    • A paragraph that gives more detail on the function behavior (if necessary)

    • An Args: section (if the function takes arguments, note that self is not considered an argument)

    • A Returns: section (if the function can return somethings other than None)

Before we discuss the other bits to document (modules and module attributes) let’s examine how we would document the very same function if it was written in C++ and exposed to Python using pybind11.

    m.def("answer_question", &answerQuestion, py::arg("question"), R"(
        This function can answer some questions.

        It currently only answers a limited set of questions so don't expect it to know everything.

        Args:
            question: The question passed to the function, trailing question mark is not necessary and
                casing is not important.

        Returns:
            The answer to the question or empty string if it doesn't know the answer.)");

The outcome is identical to what we saw from the Python source code, except that we cannot return optionally a string in C++.

Answer documentation C++

We want to draw you attention to the following:

  1. pybind11 generates the type information for you, based on the C++ types. The py::arg object must be used to get properly named arguments into the function signature (see pybind11 documentation) - otherwise you just get arg0 and so forth in the documentation.

  2. Indentation is key when writing docstrings. The documentation system is clever enough to remove uniform indentation. That is, as long as all the lines have the same amount of padding that padding will be ignored and not passed onto the restructured text processor. Fortunately clang-format leaves this funky formatting alone - respecting the raw string qualifier.

Let’s now turn our attention to how we document modules and their attributes. We should of course only document modules that are part of our API (not internal helper modules) and only public attributes. Below is a detailed example:

"""Example of Google style docstrings for module.

This module demonstrates documentation as specified by the `Google Python
Style Guide`_. Docstrings may extend over multiple lines. Sections are created
with a section header and a colon followed by a block of indented text.

Example:
    Examples can be given using either the ``Example`` or ``Examples``
    sections. Sections support any reStructuredText formatting, including
    literal blocks::

        $ python example.py

Section breaks are created by resuming unindented text. Section breaks
are also implicitly created anytime a new section starts.

Attributes:
    module_level_variable1 (int): Module level variables may be documented in
        either the ``Attributes`` section of the module docstring, or in an
        inline docstring immediately following the variable.

        Either form is acceptable, but the two should not be mixed. Choose
        one convention to document module level variables and be consistent
        with it.

    module_level_variable2 (Optional[str]): Use objects from typing,
        such as Optional, to annotate the type properly.

    module_level_variable4 (Optional[File]): We can resolve type references
        to other objects that are built as part of the documentation. This will link
        to `carb.filesystem.File`.

Todo:
    * For module TODOs if you want them
    * These can be useful if you want to communicate any shortcomings in the module we plan to address

.. _Google Python Style Guide:
   http://google.github.io/styleguide/pyguide.html

"""


module_level_variable1 = 12345
module_level_variable3 = 98765
"""int: Module level variable documented inline. The type hint should be specified on the first line, separated by a
colon from the text. This approach may be preferable since it keeps the documentation closer to the code and the default
assignment is shown. A downside is that the variable will get alphabetically sorted among functions in the module
so won't have the same cohesion as the approach above."""

module_level_variable2 = None
module_level_variable4 = None

This is what the documentation would look like:

Module documentation

As we have mentioned we should not mix the Attributes: style of documentation with inline documentation of attributes. Notice how module_level_variable3 appears in a separate block from all the other attributes that were documented. It is even after the TODO section. Chose one approach for your module and stick to it. There are valid reasons to pick one style above the other but don’t cross the streams! As before we use type hints from typing but we don’t use the typing syntax to attach them. We write:

"""...
Attributes:
    module_variable (Optional[str]): This is important ...
"""

or

module_variable = None
"""Optional[str]: This is important ..."""

But we don’t write:

from typing import Optional

module_variable: Optional[str] = None
"""This is important ..."""

This is because the last form (which was introduced in Python 3.6) is still poorly supported by tools - including our documentation system. It also doesn’t work with Python bindings generated from C++ code using pybind11.

For instructions on how to document classes, exceptions, etc please consult the Sphinx Napoleon Extension Guide.