HOWTO - Documentation

As part of the development of sustainable software, it is important that code is well documented. The documentation informs developers who need to implement, extend or replace the code about what it does, its inputs and outputs, and any dependencies on other software or code. All classes and functions should have matching documentation.

There are two key parts to the documentation. The first covers the classes and functions: their documentation strings should match the PEP8 standard, an example of which is given in the MuG Coding Guidelines. The second part is the Architectural Design Record (ADR). The ADR should record why key choices have been made, especially when a choice does not match the norm or when there has been a major change to a function (addition, removal or complete rewrite). The ADR provides the reasoning behind the code, while the documentation strings in the functions describe the code. Between them they provide a log of the development of the project.

An example function description should therefore match the following:

"""
Assembly Index Manager

Manages the creation of indexes for a given genome assembly file. If the
downloaded file has not been unzipped then it will get unzipped here.
There are then 3 indexers that are available including BWA, Bowtie2 and
GEM. If the indexes already exist for the given file then the indexing
is not rerun.

Parameters
----------
file_name : str
   Location of the assembly FASTA file

Returns
-------
dict
   bowtie : str
      Location of the Bowtie index file
   bwa : str
      Location of the BWA index file
   gem : str
      Location of the gem index file

Example
-------
.. code-block:: python
  :linenos:

  from tool.common import common
  cf = common()

  indexes = cf.run_indexers('/<data_dir>/human_GRCh38.fa.gz')
  print(indexes)


"""
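
As a minimal sketch of where such a description lives: the docstring sits directly below the `def` line, and both `help()` and Sphinx read it from there. The function below is hypothetical. The name `run_indexers` is borrowed from the example above, the docstring is shortened, and the body is a placeholder that only derives a file name (the `.index` suffix is an assumption), not the real indexing code.

```python
import inspect


def run_indexers(file_name):
    """
    Assembly Index Manager

    Parameters
    ----------
    file_name : str
       Location of the assembly FASTA file

    Returns
    -------
    dict
       bowtie : str
          Location of the Bowtie index file
    """
    # Placeholder body (assumption): a real tool would build the index here.
    return {"bowtie": file_name + ".index"}


# inspect.getdoc() removes the leading blank line and common indentation,
# which is the cleaned form that documentation tools work with.
doc = inspect.getdoc(run_indexers)
print(doc.splitlines()[0])
```

Printing the first line of the cleaned docstring is a quick check that the description is attached to the function and not left floating as a stray string elsewhere in the module.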

Building the Documentation

Full documentation for a repository can be built using Sphinx. If the pipeline has been developed based on a fork of the mg-process-test repository it can be done by:

cd ${mg-process-test}
pip install sphinx

cd docs/
make html
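
The same build can also be driven from Python, which may be convenient in a test or automation script. The sketch below only assembles the command for Sphinx's module entry point (`python -m sphinx -b html <source> <output>`). The helper name `sphinx_html_command` and the output directory `docs/_build/html` are assumptions, not part of the repository.

```python
import sys


def sphinx_html_command(docs_dir="docs", out_dir="docs/_build/html"):
    """Return the argument list for an HTML build of ``docs_dir``."""
    # Uses the current interpreter so the build runs against the
    # environment in which Sphinx was installed.
    return [sys.executable, "-m", "sphinx", "-b", "html", docs_dir, out_dir]


cmd = sphinx_html_command()
print(" ".join(cmd[1:]))
# To run the build: import subprocess; subprocess.run(cmd, check=True)
```

Running the command is left commented out so that the sketch has no side effects. With `check=True`, a failed build would raise an exception rather than pass silently.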

Updating the documentation

If new pipelines or tools are added to the repository then it is important that they are included in the documentation.

Updates for a new tool - docs/tool.rst

A new section can be added to the docs/tool.rst file to reflect the new tool.

Before:

.. automodule:: tool

   Test Tool
   ---------
   .. autoclass:: tool.testTool.testTool
      :members:

After:

.. automodule:: tool

   Test Tool
   ---------
   .. autoclass:: tool.testTool.testTool
      :members:

   Test Tool 2
   -----------
   .. autoclass:: tool.testTool.testTool2
      :members:
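
Because every tool entry follows the same four-line pattern, the section can be generated instead of typed by hand. The helper below is a hypothetical sketch, not part of the repository. It assumes the three-space indentation used under `.. automodule::` in the example above and sizes the underline to match the title.

```python
def autoclass_section(title, class_path, indent="   "):
    """Build the reStructuredText lines documenting one tool class."""
    lines = [
        title,
        "-" * len(title),  # underline is exactly as long as the title
        ".. autoclass:: " + class_path,
        "   :members:",
    ]
    # Indent every line so the section nests under ".. automodule::".
    return "\n".join(indent + line for line in lines)


section = autoclass_section("Test Tool 2", "tool.testTool.testTool2")
print(section)
```

The printed text can be pasted at the end of docs/tool.rst, separated from the previous section by a blank line.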

Updates for a new pipeline - docs/pipelines.rst

A new section can be added to the docs/pipelines.rst file to reflect the new pipeline. This requires a fuller description of the input required to run the pipeline, what it returns, and examples of how to run the code locally and within the COMPSs environment.

An example of a pipeline block is as follows:

Test Tool
---------
.. automodule:: process_test

   This is a demonstration pipeline using the testTool.

   Running from the command line
   =============================

   Parameters
   ----------
   config : file
      Location of the config file for the workflow
   in_metadata : file
      Location of the input list of files required by the process
   out_metadata : file
      Location of the output results.json file for returned files

   Returns
   -------
   output : file
      Text file with a single entry

   Example
   -------
   The script can be run locally as follows:

   .. code-block:: none
      :linenos:

      cd ${mg-process-test}
      python mg_process_test/process_test.py --config mg_process_test/tests/json/process_test.json --in_metadata mg_process_test/tests/json/input_test.json --out_metadata mg_process_test/tests/results.json --local

   The `--local` parameter should be used if the script is being run within an environment where (py)COMPSs is not installed. It can also be used in an environment where (py)COMPSs is installed, but the script needs to be run locally for testing purposes.

   When using a local version of the `COMPSs virtual machine <http://www.bsc.es/computer-sciences/grid-computing/comp-superscalar/downloads-and-documentation>`_:

   .. code-block:: none
      :linenos:

      cd /home/compss/code/mg-process-test
      runcompss --lang=python mg_process_test/process_test.py --config /home/compss/code/mg-process-test/mg_process_test/tests/json/process_test.json --in_metadata /home/compss/code/mg-process-test/mg_process_test/tests/json/input_test.json --out_metadata /home/compss/code/mg-process-test/mg_process_test/tests/results.json

   Methods
   =======
   .. autoclass:: process_test.process_test
      :members:
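
The command-line parameters documented in the block above could be parsed along the following lines. This is a sketch using the standard `argparse` module. The option names come from the example, but the help texts, the `required` settings and the function name `build_parser` are assumptions, and the real `process_test.py` may handle its arguments differently.

```python
import argparse


def build_parser():
    """Create a parser for the options described in the pipeline docs."""
    parser = argparse.ArgumentParser(description="Test pipeline (sketch)")
    parser.add_argument("--config", required=True,
                        help="Location of the config file for the workflow")
    parser.add_argument("--in_metadata", required=True,
                        help="Location of the input list of files")
    parser.add_argument("--out_metadata", required=True,
                        help="Location of the output results.json file")
    # A flag without a value: present means run without (py)COMPSs.
    parser.add_argument("--local", action="store_true",
                        help="Run locally without (py)COMPSs")
    return parser


args = build_parser().parse_args([
    "--config", "process_test.json",
    "--in_metadata", "input_test.json",
    "--out_metadata", "results.json",
    "--local",
])
print(args.local)
```

Keeping the documented parameter names identical to the parser's option names means the `Parameters` list in docs/pipelines.rst can be checked directly against the output of `--help`.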