C API

#include <kloetzl/dna.h>

Most functions expect a begin and an end pointer. Together they delimit the half-open interval [begin, end) of the DNA sequence to work on. Compared with relying on null-terminated strings, an explicit end pointer makes SIMD processing easier and lets a function work on just a subsequence of a longer string. As genomes are long, you should avoid calling strlen repeatedly and instead store the length alongside the data in one place.
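To illustrate the begin/end convention, here is a minimal sketch of a function that walks a half-open interval. The name count_gc is a hypothetical helper made up for this example and is not part of the library; it only shows how such an interface is called.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, NOT part of libdna: counts the letters G and C
 * in the half-open interval [begin, end). The byte at `end` is never read,
 * so the input does not need to be null-terminated. */
static size_t count_gc(const char *begin, const char *end)
{
	assert(begin <= end);

	size_t count = 0;
	for (const char *p = begin; p < end; p++) {
		if (*p == 'G' || *p == 'C') {
			count++;
		}
	}
	return count;
}
```

Given a buffer seq of known length len, the whole sequence is count_gc(seq, seq + len), and a subsequence is simply count_gc(seq + 2, seq + 6). No copy and no call to strlen is needed in either case.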

DNA

Every symbol exported by this library starts with the prefix dna. The following utility functions do not work on strings but are still useful in a bioinformatics context.

DNA4

Functions beginning with dna4_ work exclusively on the uppercase letters A, C, G and T. Any other character (such as lowercase letters, null bytes, or U) may trigger arbitrary behavior.
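Since input outside the ACGT alphabet is not handled, it can be worth checking untrusted data before passing it to a dna4_ function. The following is a sketch of such a check. The name is_dna4 is a hypothetical helper for this example and is not provided by the library.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, NOT part of libdna: returns true if every byte in
 * the half-open interval [begin, end) is one of 'A', 'C', 'G', 'T'.
 * Lowercase letters, 'U', and null bytes all make it return false. */
static bool is_dna4(const char *begin, const char *end)
{
	assert(begin <= end);

	for (const char *p = begin; p < end; p++) {
		switch (*p) {
		case 'A':
		case 'C':
		case 'G':
		case 'T':
			break;
		default:
			return false;
		}
	}
	return true;
}
```

An empty interval counts as valid here. Whether that is the right choice depends on the caller, so treat it as an assumption of this sketch.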

DNAX

The following functions are not limited to any alphabet. They commonly use a table to allow custom behavior.
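As a rough picture of what table-driven behavior means, the sketch below maps every input byte through a 256-entry lookup table. Both function names and the table layout are assumptions made for this example. The layout the dnax_ functions actually expect is defined by the library's reference documentation, not by this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, NOT part of libdna: fills a table that maps
 * lowercase acgt to uppercase, maps U/u to T, and leaves every other
 * byte unchanged. */
static void make_normalize_table(char table[256])
{
	for (int i = 0; i < 256; i++) {
		table[i] = (char)i;
	}
	table['a'] = 'A';
	table['c'] = 'C';
	table['g'] = 'G';
	table['t'] = 'T';
	table['U'] = 'T';
	table['u'] = 'T';
}

/* Hypothetical helper, NOT part of libdna: writes table[byte] for every
 * byte in [begin, end) to dest and returns a pointer one past the last
 * byte written. dest must have room for end - begin bytes. */
static char *table_map(const char table[256], const char *begin,
                       const char *end, char *dest)
{
	assert(begin <= end);

	for (const char *p = begin; p < end; p++) {
		/* Cast to unsigned char so bytes above 127 index correctly. */
		*dest++ = table[(unsigned char)*p];
	}
	return dest;
}
```

Swapping in a different table changes the behavior without touching the loop, which is the point of the table-based design.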