update June 11, 2026
NAME

birchdocdbkit.py - add, delete or change objects in a birchdocdb database

SYNOPSIS

birchdocdbkit.py  inputfile databasefile [--commit]

birchdocdbkit.py  inputfile databasefile --outfile outputfile

DESCRIPTION
birchdocdbkit.py is a command-line client designed to be run by BioLegato to make changes to the BIRCH documentation database. Based on the objects and commands in inputfile, birchdocdbkit.py generates SQL statements that make the corresponding changes in the database.
inputfile - TSV input file with queries in the form of database objects.
databasefile - SQLite database file.
--commit - By default, birchdocdbkit.py only reports the results of sanity checking and does not change the database. If --commit is specified, the changes are made, provided that sanity checking succeeded.
--outfile - Output file, used only with the SHOW directive.
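
As a hedged illustration of the two invocation forms in the SYNOPSIS, the sketch below assembles argument lists for a dry run, a committed run and a SHOW run. The helper name build_argv and the file names are hypothetical; only the positional arguments and the --commit and --outfile options come from this page. It assumes that birchdocdbkit.py is on the PATH.

```python
def build_argv(inputfile, databasefile, commit=False, outfile=None):
    """Return an argument list for birchdocdbkit.py (hypothetical helper)."""
    if commit and outfile:
        # The SYNOPSIS lists --commit and --outfile as separate forms.
        raise ValueError("use either commit or outfile, not both")
    argv = ["birchdocdbkit.py", inputfile, databasefile]
    if commit:
        argv.append("--commit")
    if outfile:
        argv.extend(["--outfile", outfile])
    return argv


# File names below are placeholders, not names used by BIRCH.
dry_run = build_argv("changes.tsv", "birchdocdb.sqlite")
committed = build_argv("changes.tsv", "birchdocdb.sqlite", commit=True)
show = build_argv("show.tsv", "birchdocdb.sqlite", outfile="objects.txt")
```

Any of these lists could be passed to subprocess.run(argv, check=True). Running the dry-run form first and the --commit form only after the report looks clean mirrors the default behaviour described above.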
RATIONALE

birchdocdbkit.py is a database client that represents BIRCH components as objects of four classes: Package, Category, Program and File.

Package
    Name
    Description
    Category
    Program
    Documentation
    Data
    Platform
    Installation

Program
    Name
    Description
    Launch
    Category
    Package
    Documentation
    Data
    SampleInput
    SampleOutput
    Platform
    Installation

Category
    Name
    Programs
    Packages

File
    Name
    Description
    Command

SQL databases are not object-oriented. Everything is a table. Each column is a field, which is expected to hold a single value. There is no provision for fields with multiple values. For example, there may be several documentation files for a Program object, each of which should point to an object of the File class. Similarly, a Program may belong to several categories, so the Category field needs a way to point to more than one Category object.

The typical solution in SQL databases is to create tables that represent relations by linking any two classes to each other. For example, membership of a Program object in a Package is implemented in the PkgProg link table. If you add a new Program object, a PkgProg object is also created, recording which package the program belongs to. The PkgProg object has two columns, Package and Program.
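
The link-table idea can be sketched with Python's sqlite3 module. The schema below is a simplified assumption made for illustration; the real birchdocdb tables may have different columns, and the package description is a placeholder.

```python
import sqlite3

# Hypothetical, simplified schema; not the actual birchdocdb schema.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE Package (Name TEXT PRIMARY KEY, Description TEXT);
    CREATE TABLE Program (Name TEXT PRIMARY KEY, Description TEXT);
    CREATE TABLE PkgProg (Package TEXT, Program TEXT);
""")

con.execute("INSERT INTO Package VALUES (?, ?)", ("FSAP", "placeholder"))

# Adding one Program object touches two tables: Program and PkgProg.
for name, description in [("numseq", "print a sequence"),
                          ("bachrest", "Restriction site search (batch)")]:
    con.execute("INSERT INTO Program VALUES (?, ?)", (name, description))
    con.execute("INSERT INTO PkgProg VALUES (?, ?)", ("FSAP", name))

# The link table answers "which programs belong to this package?"
programs = [row[0] for row in con.execute(
    "SELECT Program FROM PkgProg WHERE Package = ? ORDER BY Program",
    ("FSAP",))]
```

Because membership lives in PkgProg rather than in a column of Program, a program can be linked to a package without any multi-valued field.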

Consequently, many transactions may require changes in two or more tables, especially link tables. The table below summarizes tables that need to be changed if the parent table is changed. Indentation represents dependencies.

It is important to emphasize that birchdocdbkit.py is an abstraction layer in which the end user only has to think about the four classes of objects. The link tables are automatically referenced during transactions.

Class dependencies - For each class of objects, the table lists other child classes that may need to be changed if the parent object changes.
Package
    PkgProg
    PkgCat
    PkgDoc
    PkgPlat
Program
    PkgProg
    ProgCat
    ProgDoc
    ProgLaunch
    ProgPlat
    ProgSampleInp
    ProgSampleOut
Category
    PkgCat
    ProgCat
File
    PkgDoc
    PkgDat
    ProgDoc
    ProgDat
    ProgSampleInp
    ProgSampleOut
ComType
    ProgLaunch
Platform
    PkgPlat
    ProgPlat
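
The class dependencies listed above can be held in a small lookup structure. The sketch below is only an illustration: the names LINK_TABLES and parents_of are hypothetical, and the contents are transcribed from the table on this page rather than read from the database.

```python
# Class dependencies transcribed from the table above.
LINK_TABLES = {
    "Package": ["PkgProg", "PkgCat", "PkgDoc", "PkgPlat"],
    "Program": ["PkgProg", "ProgCat", "ProgDoc", "ProgLaunch", "ProgPlat",
                "ProgSampleInp", "ProgSampleOut"],
    "Category": ["PkgCat", "ProgCat"],
    "File": ["PkgDoc", "PkgDat", "ProgDoc", "ProgDat",
             "ProgSampleInp", "ProgSampleOut"],
    "ComType": ["ProgLaunch"],
    "Platform": ["PkgPlat", "ProgPlat"],
}


def parents_of(link_table):
    """Return the classes whose changes may affect the given link table."""
    return sorted(cls for cls, links in LINK_TABLES.items()
                  if link_table in links)


shared = parents_of("ProgDoc")
```

Here shared names both File and Program, which shows why deleting either a File or a Program object may require a change in ProgDoc.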



Viewing database objects

Input file

The input file contains one or more SHOW directives, each specifying an object to be written to outfile as text. Every directive must be terminated by a semicolon on a separate line. Any line beginning with two dashes (--) is a comment.

Transaction
 Write one or more objects to an output file
Syntax
SHOW<tab><class><tab><object>
;
Example
--------------------------------
-- show package object

SHOW    PACKAGE    NCBI
;


The example above would produce the output

----------------------------------------
Package    NCBI
    Description    Database tools - Natl. Ctr. for Biotech. Information
    Program    sequin
    Program    Cn3D
    Program    blastp
    Program    blastx
    Program    blastn
    Program    tblastn
    Program    tblastx
    Program    blast_formatter
    Program    makeblastdb
    Program    blastdbcmd
    Program    tbl2asn
    Program    magicblast
    Platform    linux-x86_64,osx-x86_64
    Installation    BIRCH


In another example, two SHOW directives are given.

--------------------------------
-- show program objects
SHOW    PROGRAM    bachrest
;
SHOW    PROGRAM    spades
;

 
would produce

----------------------------------------
Program    bachrest
    Description    Restriction site search (batch)
    Category    Sequence - Restriction Analysis
    Package    FSAP
    Launch    interactive    bachrest
    Launch    bldna    DNA/RNA --> BACHREST
    Documentation    $doc/fsap/rest.txt
    Documentation    $tutorials/bioLegato/sequence/sequence.html
    Data    $dat/REBASE/type2.lst
    Platform    linux-x86_64,osx-x86_64
    Installation    BIRCH

----------------------------------------
Program    spades
    Description    Genome assembler
    Category    Sequence - DNA Sequencing and Assembly
    Package    Saint Petersberg Algorithmic Biology Lab
    Launch    command    spades.py [options] -o <output_dir>
    Documentation    $doc/spades/manual.html
    Documentation    $doc/spades/spades.man
    Data    $dat/spades/test_dataset
    Data    $dat/spades/test_dataset_plasmid
    Platform    linux-x86_64
    Installation    BIRCH
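
For scripts that post-process SHOW output, a parser along the following lines may be useful. This is a sketch under assumptions: the function name parse_show_output is hypothetical, and columns are assumed to be separated by a tab or by two or more spaces, which matches the listings above but is not stated by this page.

```python
import re

SAMPLE = """\
----------------------------------------
Program    bachrest
    Description    Restriction site search (batch)
    Launch    interactive    bachrest
    Launch    bldna    DNA/RNA --> BACHREST
    Platform    linux-x86_64,osx-x86_64
"""


def parse_show_output(text):
    """Parse SHOW output into a list of dicts (hypothetical helper)."""
    objects = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or set(stripped) == {"-"}:
            continue  # skip blank lines and dashed separator lines
        # Assumption: columns are separated by a tab or by 2+ spaces.
        cols = re.split(r"\t| {2,}", stripped)
        if not line[0].isspace():
            # An unindented line starts a new object: <class> <name>.
            objects.append({"class": cols[0], "name": cols[1], "fields": {}})
        else:
            # Indented lines are fields; a field may repeat, so keep a list.
            objects[-1]["fields"].setdefault(cols[0], []).append(cols[1:])
    return objects


parsed = parse_show_output(SAMPLE)
```

Repeated fields such as Launch end up as several entries under one key, so no value is overwritten.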



Adding, deleting or modifying database objects

Input file

The input file specifies transactions as representations of database objects to be added, deleted or changed. The input file for birchdocdbkit.py may contain as many database objects as desired, up to and including the entire database. Objects may be in any order, and may reference other objects further down the file. Each object is terminated by a semicolon (;).

The syntax, while patterned after SQL, is not strictly SQL. It can be thought of as a layer that sits on top of SQL.

Transaction
 add an object to an existing class
 assumes that file objects referenced already exist
Syntax
INSERT_INTO<tab><class><tab><object>
Example
INSERT_INTO    PROGRAM    numseq
Description    print a sequence
Launch    interactive    numseq
Launch    bldna    DNA/RNA --> numseq
Documentation    $doc/fsap/numseq.txt
Documentation    $doc/fsap/fsap.txt
Package    FSAP
Category    Sequence
Platform    L:l:M:m
Installation    BIRCH

;
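
Because the columns must be separated by real tab characters, it can be convenient to generate the transaction text from a script. The sketch below is one possible approach; the helper name insert_transaction is hypothetical, and the field values are taken from the example above.

```python
def insert_transaction(cls, name, fields):
    """Format an INSERT_INTO transaction (hypothetical helper).

    fields is a list of tuples: (field, value) or, for Launch,
    (field, comtype, launch command). Repeated fields appear more than once.
    """
    lines = ["\t".join(["INSERT_INTO", cls.upper(), name])]
    for field in fields:
        lines.append("\t".join(field))
    lines.append(";")  # object terminator on its own line
    return "\n".join(lines) + "\n"


text = insert_transaction("Program", "numseq", [
    ("Description", "print a sequence"),
    ("Launch", "interactive", "numseq"),
    ("Launch", "bldna", "DNA/RNA --> numseq"),
    ("Documentation", "$doc/fsap/numseq.txt"),
    ("Package", "FSAP"),
    ("Category", "Sequence"),
    ("Platform", "L:l:M:m"),
    ("Installation", "BIRCH"),
])
```

Writing text to a file gives an input file that can be checked with a run without --commit before any change is made.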
 


Transaction
 delete an object, and its child objects, from a class.
Syntax
DELETE<tab><class><tab><object>
Example
DELETE    PROGRAM    numseq
;


Transaction
 add a field to an object
Syntax
UPDATE<tab><class><tab><object>
SET<tab><field><tab><value>
Example
UPDATE    PROGRAM    numseq
SET    Documentation    $BIRCH/doc/fsap/numseqprime.txt
;


Transaction
 set a field to NULL
Syntax
UPDATE<tab><class><tab><object>
UNSET<tab><field><tab><value>
Example
UPDATE    PROGRAM    numseq
UNSET    Documentation    $BIRCH/doc/fsap/numseqprime.txt
;
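
The DELETE and UPDATE forms can be generated in the same way. The helpers below are hypothetical and simply follow the syntax lines given above, with the terminating semicolon on its own line as in the examples.

```python
def delete_transaction(cls, name):
    """Format a DELETE transaction (hypothetical helper)."""
    return f"DELETE\t{cls.upper()}\t{name}\n;\n"


def update_transaction(cls, name, action, field, value):
    """Format an UPDATE transaction with a SET or UNSET line."""
    if action not in ("SET", "UNSET"):
        raise ValueError("action must be SET or UNSET")
    return f"UPDATE\t{cls.upper()}\t{name}\n{action}\t{field}\t{value}\n;\n"


doc = "$BIRCH/doc/fsap/numseqprime.txt"
script = (
    update_transaction("Program", "numseq", "SET", "Documentation", doc)
    + update_transaction("Program", "numseq", "UNSET", "Documentation", doc)
    + delete_transaction("Program", "numseq")
)
```

The resulting script holds three transactions in one input file, in the order in which they were concatenated.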



DEFINITIONS

Syntactic definitions are given in Backus-Naur notation.

Primitives
<tab>::= tab character (\t)
<NL>::= end-of-line character, i.e. newline (\n)
<object terminator>::= semicolon (;)

<input file> ::=<transaction>[<transaction>]
<comment>::="--"<string>
<transaction>::=<command><tab><objecttype><tab><objectname><NL><object terminator><NL>
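
To make the definitions above concrete, the following sketch splits input-file text into transactions. It is not the parser used by birchdocdbkit.py; the function name split_transactions is hypothetical, and ignoring blank lines is an assumption based on the blank line that precedes the terminator in the INSERT_INTO example.

```python
def split_transactions(text):
    """Split input-file text into transactions (hypothetical helper).

    Each transaction is returned as a list of rows, and each row as a
    list of tab-separated columns.
    """
    transactions, current = [], []
    for line in text.splitlines():
        if line.startswith("--") or not line.strip():
            continue  # comment line or blank line
        if line.strip() == ";":
            transactions.append(current)  # terminator closes the object
            current = []
        else:
            current.append(line.split("\t"))
    return transactions


sample = (
    "-- add a program, then delete it\n"
    "INSERT_INTO\tPROGRAM\tnumseq\n"
    "Launch\tbldna\tDNA/RNA --> numseq\n"
    "\n"
    ";\n"
    "DELETE\tPROGRAM\tnumseq\n"
    ";\n"
)
result = split_transactions(sample)
```

Only lines that begin with two dashes are treated as comments, so the "-->" inside the Launch value is kept as data.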


<token>::= string that does not contain blanks
<file path>::=<fully qualified file-path> may contain environment variables of the form $<string>
<launcher>::=<string>  command to be run using file as input. May contain environment variables.
<object>::=<Package object>|<Program object>|<Category object>|<File object>
<program name>::=<string> name field of a program object
<file name>::=<string> name field of a file object
<package name>::= <string> name field of a package object
<cat name>::=<string> CatName field of a category
[field]::=0 or more instances of a field are allowed
Fields not in brackets may have 0 or 1 instances


<File object>::=
Name<tab><token>
Description<tab><string>
Command:<launcher><file path>

<Category object>::=
CatName<tab><string>

<comtype>::="command"|"interactive"|"gui"|"birch"|"birchadmin"|"bldna"|
           "blnalign"|"blprotein"|"blpalign"|"bldata"|"blmarker"|"blnfetch"|
           "blpfetch"|"blreads"|"blncbi"|"bltable"|"bltree"
<launch command>::=<string>

<platform codes>::= a string with letters indicating which platforms are supported, separated by colons (:)
L - linux-x86_64, l - linux-arm64, M - osx-x86_64, m - macos-arm64
platform codes may be in any order
Example: l:m would be a platform code meaning that the program or package is supported on linux-arm64 and macos-arm64
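
A platform code string can be expanded with a small lookup table. The sketch below uses the four letters defined above; the names PLATFORM_CODES and decode_platforms are hypothetical.

```python
# Letter-to-platform mapping as defined on this page.
PLATFORM_CODES = {
    "L": "linux-x86_64",
    "l": "linux-arm64",
    "M": "osx-x86_64",
    "m": "macos-arm64",
}


def decode_platforms(codes):
    """Expand a colon-separated platform code string (hypothetical helper)."""
    names = []
    for code in codes.split(":"):
        if code not in PLATFORM_CODES:
            raise ValueError(f"unknown platform code: {code!r}")
        names.append(PLATFORM_CODES[code])
    return names


supported = decode_platforms("l:m")
```

The codes are case-sensitive, so "L" and "l" name different platforms, and the order of the codes in the string does not matter.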

<Program object>::=
Name<tab><token>
Description<tab><string>
[Category<tab><cat name>]
Package<tab><package name>
[Launch<tab><comtype><tab><launch command>]
[Documentation<tab><file name>]
[SampleInput<tab><file name>]
[SampleOutput<tab><file name>]
Platform<tab><platform codes>
Installation<tab>BIRCH|local
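
A few of the rules in the Program object definition can be checked before a file is submitted. The sketch below is an assumption-laden illustration, not the sanity checking performed by birchdocdbkit.py: the function name check_program is hypothetical, and it only tests the comtype of Launch lines, the Installation value, and the rule that fields not in brackets appear at most once.

```python
COMTYPES = {
    "command", "interactive", "gui", "birch", "birchadmin", "bldna",
    "blnalign", "blprotein", "blpalign", "bldata", "blmarker", "blnfetch",
    "blpfetch", "blreads", "blncbi", "bltable", "bltree",
}
# Fields not in brackets in the definition above: 0 or 1 instances.
SINGLE_VALUED = ("Description", "Package", "Platform", "Installation")


def check_program(rows):
    """Return a list of problems found in a Program object's field rows."""
    problems = []
    seen = set()
    for row in rows:
        field, values = row[0], list(row[1:])
        if field in SINGLE_VALUED and field in seen:
            problems.append(f"{field} given more than once")
        if field == "Launch" and (not values or values[0] not in COMTYPES):
            problems.append(f"unknown comtype in Launch: {values[:1]}")
        if field == "Installation" and values not in (["BIRCH"], ["local"]):
            problems.append(f"Installation must be BIRCH or local: {values}")
        seen.add(field)
    return problems


good = check_program([
    ("Description", "print a sequence"),
    ("Launch", "interactive", "numseq"),
    ("Installation", "BIRCH"),
])
bad = check_program([
    ("Launch", "shell", "numseq"),  # "shell" is not a listed comtype
    ("Package", "FSAP"),
    ("Package", "NCBI"),            # Package may appear at most once
])
```

An empty list means that none of these three checks failed; it does not guarantee that the object will pass the checks made by birchdocdbkit.py itself.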

<Package object>::=
Name<tab><token>
Description<tab><string>
[Category<tab><cat name>]
[Program<tab><program name>]
[Documentation<tab><file name>]
[SampleInput<tab><file name>]
[SampleOutput<tab><file name>]
Platform<tab><platform codes>
Installation<tab>BIRCH|local

SEE ALSO
sql2htmldoc.py - Produces HTML documentation pages for BIRCH, using the SQLite database as input.
AUTHOR
Dr. Brian Fristensky
Department of Plant Science
University of Manitoba
Winnipeg, MB  Canada R3T 2N2
brian.fristensky@umanitoba.ca
http://home.cc.umanitoba.ca/~frist