Overview

A technical overview of customizing the theme, creating the skeleton template, and merging the two together

The NOAA Fisheries Southeast Fisheries Science Center (SEFSC) is actively developing a suite of new machine learning models designed to improve data processing efficiency across the Center’s mission portfolio. These projects include, but are not limited to, automating the processing of Gulf and South Atlantic video survey data to derive species counts, automating the aging of Gulf and Atlantic menhaden and red snapper, and automating the derivation of fish lengths from stereo video. The models were developed in Python using open-source libraries (e.g., PyTorch, cudatoolkit, Torchvision, and scikit-learn) by partners through the NOAA Northern Gulf Institute (NGI) and the Cooperative Institute for Climate, Ocean, and Ecosystem Studies (CICOES). They are or will be maintained on the SEFSC Organization GitHub account.

Introducing new computer models into the Center’s data processing procedures necessitates adequate documentation so that the end users will know how to install and execute the models in operational use. Indeed, all computer models and software should be documented by the developers before being released to the intended end user. In the absence of a universally accepted documentation template for use within the SEFSC, NOAA Fisheries, or even NOAA at large, staff across NOAA have established their own approaches and templates of varying degrees of complexity based on their immediate needs. Some examples of these solutions are summarized in Table 1 along with notable strengths and weaknesses of each.

Several features were prioritized when selecting a platform for SEFSC documentation:

  1. Content version control
  2. Documentation existing alongside the model source code
  3. Minimal learning curve for populating and maintaining content
  4. Layout and formatting customization to align with agency visual branding requirements

Table 1: A variety of resources have been used across NOAA for documenting procedures or applications, each having its own advantages and disadvantages. In the absence of guidance or directives, users currently select the approach best suited to their needs. This non-exhaustive list provides some examples of documentation solutions.
DOCUMENTATION SOLUTION ADVANTAGES DISADVANTAGES
Microsoft Word or Google Doc
  Advantages:
  • Creation is straightforward
  • Easily converted to PDF
  • Software readily available to NOAA staff
  Disadvantages:
  • Not co-located with software
  • Document owner often retains ownership
  • Difficult to share publicly
  • No version control
GitHub repository Wiki

Example: NOAA IOOS National Glider Data Assembly Center (NGDAC) Documentation (archived)
  Advantages:
  • Consistent formatting across all GitHub repositories (repos)
  • Connected to the repo in which the application is hosted
  • Public accessibility follows repo settings
  • No local software requirements
  Disadvantages:
  • Limited functionality
  • No ability to customize appearances, layouts, etc.
  • Limited version control
  • Learning curve with Markdown and HTML to populate content
Quarto webpage made with R

Example: Quarto webpage with book layout made in RStudio
  Advantages:
  • Consistent formatting across implementations
  • Hosted on GitHub Pages (gh-pages) alongside the software itself
  • GitHub version controlled
  • Template and setup guidance available via NOAA Fisheries Open Science initiative
  Disadvantages:
  • Learning curve with Markdown and HTML to populate content
  • Learning curve with R and RStudio to develop content locally
  • Learning curve with GitHub Pages and GitHub Actions if customization is needed
Quarto webpage made with Python

Example: Quarto webpage with book layout made with Python
  Advantages:
  • Consistent formatting across implementations
  • Can be customized for the organization with optional centralized format control
  • Easily templated to minimize learning curves
  • Hosted on gh-pages within the repo in which the application is hosted
  • Full GitHub version control
  • Template and setup guidance available via NOAA Fisheries Open Science initiative
  Disadvantages:
  • Learning curve with Markdown and HTML (optional) to populate content
  • Learning curve with GitHub Pages and GitHub Actions if customization is needed
GitHub Pages webpage with Jekyll theme

Example: NOAA IOOS Documentation Portal
  Advantages:
  • Consistent formatting across implementations
  • Can be customized for the organization with optional centralized format control
  • Easily templated to minimize learning curves
  • Hosted on gh-pages within the repo in which the application is hosted
  • Full GitHub version control
  • Template and setup guidance created by IOOS
  Disadvantages:
  • Learning curve with Markdown and HTML to populate content
  • Advanced customization requires knowledge of gh-pages, GitHub Actions, CSS, and other languages
  • Initial setup can be challenging

Google Docs, though straightforward to create and widely used across NOAA, present too many challenges. File ownership is a perpetual problem: the original author often retains ownership, and unless document URLs are preserved, access permissions are shared, and ownership is transferred when needed, files are lost during staffing changes or remain unknown to new staff. Documents also tend to remain in “draft” status indefinitely, which casts doubt on the accuracy of the content and on whether a final version exists.

A far better solution is one connected to GitHub, where the model code itself is hosted and maintained. One option is GitHub Wikis, which can be created for any repository and have been widely used by NOAA Integrated Ocean Observing System (IOOS) programs, such as the original National Glider Data Assembly Center documentation page. Yet Wikis offer limited version control and little customization beyond the content itself.

GitHub Pages, a service that hosts a project website for any repository at no cost, offers more options and flexibility: customizable themes that can be tailored to agency visual branding, the ability to control a theme in one repository and apply it across any number of documentation sites, full version control, and the option of creating a template repository that can be easily duplicated and integrated into new projects. This project uses Quarto, an open-source scientific and technical publishing system, to create a documentation website and uses GitHub Pages to host it with full version control. (For an alternative solution using Jekyll and GitHub Pages, see SEFSC Jekyll Documentation Theme.)
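
To make the Quarto approach more concrete, the following is a minimal sketch of how a skeleton Quarto book configuration could be generated with Python. It is an illustration only and not the SEFSC template itself: the function names, the example title, the chapter file names, and the choice of the built-in `cosmo` theme are all assumptions made for this example. The configuration keys it writes (`project: type: book`, `book: chapters`, `format: html: theme`) are standard Quarto project options.

```python
from pathlib import Path


def render_quarto_config(title: str, chapters: list[str], theme: str = "cosmo") -> str:
    """Return the text of a minimal _quarto.yml for a Quarto book project.

    All argument values are illustrative. The title is written inside double
    quotes, so it should not itself contain double-quote characters.
    """
    lines = [
        "project:",
        "  type: book",
        "",
        "book:",
        f'  title: "{title}"',
        "  chapters:",
    ]
    # One list entry per chapter file; Quarto books expect index.qmd first.
    lines += [f"    - {chapter}" for chapter in chapters]
    lines += [
        "",
        "format:",
        "  html:",
        f"    theme: {theme}",
    ]
    return "\n".join(lines) + "\n"


def write_quarto_config(project_dir: Path, title: str, chapters: list[str]) -> Path:
    """Create project_dir if needed and write _quarto.yml into it."""
    project_dir.mkdir(parents=True, exist_ok=True)
    config_path = project_dir / "_quarto.yml"
    config_path.write_text(render_quarto_config(title, chapters), encoding="utf-8")
    return config_path


if __name__ == "__main__":
    # Hypothetical project folder and chapter names, chosen for illustration.
    path = write_quarto_config(
        Path("example-docs"),
        "Example Model Documentation",
        ["index.qmd", "installation.qmd", "usage.qmd"],
    )
    print(path.read_text(encoding="utf-8"))
```

With the chapter files in place, running `quarto render` in that folder would build the site, and `quarto publish gh-pages` is one way to push the rendered output to a `gh-pages` branch. The theme customization and template repository used for actual SEFSC documentation may be organized differently.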