Class Specialization
IndexQGram
An index based on an array of sorted q-grams. Especially useful for q-gram/k-mer searches.
Include Headers
seqan/index.h
Parameters
The text type. Types: String | |
The Shape specialization type. Note: This can be either a | |
The specializing type. Types: OpenAddressing Default: Default |
Remarks
The fibres (see Index and Fibre) of this index are a suffix array sorted by the first q characters (see QGramSA) and a q-gram directory (see QGramDir).
The size of the q-gram directory is |Σ|^q, where Σ is the alphabet of the text and q is the q-gram length.
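The two fibres described above can be illustrated with a counting sort over q-gram codes. The following is a minimal sketch in plain C++ that does not use the SeqAn API; the names `dnaRank`, `qgramHash`, `QGramIndexSketch` and `buildQGramIndex` are hypothetical, and a four-letter DNA alphabet with ungapped q-grams is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not SeqAn API): maps A, C, G, T to 0..3.
unsigned dnaRank(char c)
{
    switch (c)
    {
        case 'A': return 0;
        case 'C': return 1;
        case 'G': return 2;
        default:  return 3;  // 'T'; any other character is treated as 'T' for brevity
    }
}

// Code of the ungapped q-gram starting at text[pos], read as a base-4 number.
std::size_t qgramHash(const std::string & text, std::size_t pos, unsigned q)
{
    std::size_t h = 0;
    for (unsigned i = 0; i < q; ++i)
        h = h * 4 + dnaRank(text[pos + i]);
    return h;
}

struct QGramIndexSketch
{
    std::vector<std::size_t> dir;  // 4^q + 1 bucket borders (plays the role of the directory)
    std::vector<std::size_t> sa;   // text positions grouped by q-gram code
};

QGramIndexSketch buildQGramIndex(const std::string & text, unsigned q)
{
    assert(q > 0 && q < 16);  // keeps 4^q within a 32-bit size type
    QGramIndexSketch index;
    std::size_t buckets = std::size_t(1) << (2 * q);  // |Sigma|^q with |Sigma| = 4
    index.dir.assign(buckets + 1, 0);
    if (text.size() < q)
        return index;
    std::size_t n = text.size() - q + 1;  // number of q-grams in the text

    // 1. Count how often each q-gram occurs (shifted by one for the prefix sum).
    for (std::size_t p = 0; p < n; ++p)
        ++index.dir[qgramHash(text, p, q) + 1];
    // 2. Prefix sums: dir[h] becomes the start of bucket h, dir[h + 1] its end.
    for (std::size_t h = 1; h <= buckets; ++h)
        index.dir[h] += index.dir[h - 1];
    // 3. Write every position into the next free slot of its bucket.
    std::vector<std::size_t> next(index.dir.begin(), index.dir.end() - 1);
    index.sa.resize(n);
    for (std::size_t p = 0; p < n; ++p)
        index.sa[next[qgramHash(text, p, q)]++] = p;
    return index;
}
```

In this sketch, the occurrences of a q-gram with code `h` are the entries `sa[dir[h]]` up to, but not including, `sa[dir[h + 1]]`, so a lookup costs one hash computation and two directory reads.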
On a 32-bit system, the q-gram length is limited to 3 for char alphabets or 13-14 for Dna alphabets.
Consider using the OpenAddressing q-gram index for longer q-grams if you don't need the q-grams to be sorted.
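The limits above follow from the directory size growing as |Σ|^q. The sketch below computes the largest q whose directory still fits into a given memory budget. The function name `maxQGramLength`, the 4-byte directory entry and the 4 GiB budget are assumptions made for illustration; the source does not state how its figures were derived.

```cpp
#include <cassert>
#include <cstdint>

// Largest q such that alphabetSize^q * bytesPerEntry <= budgetBytes.
// Hypothetical helper, not part of SeqAn.
unsigned maxQGramLength(std::uint64_t alphabetSize,
                        std::uint64_t bytesPerEntry,
                        std::uint64_t budgetBytes)
{
    assert(alphabetSize >= 2 && bytesPerEntry > 0);
    std::uint64_t maxEntries = budgetBytes / bytesPerEntry;
    std::uint64_t size = 1;  // alphabetSize^q for the current q
    unsigned q = 0;
    // Compare against maxEntries / alphabetSize to avoid overflow in the product.
    while (size <= maxEntries / alphabetSize)
    {
        size *= alphabetSize;
        ++q;
    }
    return q;
}
```

Under these assumptions, `maxQGramLength(256, 4, 0xFFFFFFFFull)` yields 3 for a char alphabet and `maxQGramLength(4, 4, 0xFFFFFFFFull)` yields 14 for Dna. Shrinking the budget to just under 1 GiB lowers the Dna result to 13.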
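To show why open addressing helps with long q-grams, the following sketch stores q-gram counts in a hash table with linear probing instead of a directory with |Σ|^q entries. It is plain C++ and not SeqAn's implementation; `OpenAddressingCounts`, `encodeQGram`, `countQGrams`, the multiplicative hash constant and the caller-chosen capacity are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Marker for an unused slot; never a valid code because q is limited to 31.
const std::uint64_t kEmptySlot = ~std::uint64_t(0);

// Hypothetical helper: 2-bit code per DNA character (anything else counts as 'T').
std::uint64_t encodeQGram(const std::string & text, std::size_t pos, unsigned q)
{
    assert(q > 0 && q <= 31);
    std::uint64_t code = 0;
    for (unsigned i = 0; i < q; ++i)
    {
        char c = text[pos + i];
        std::uint64_t r = (c == 'A') ? 0 : (c == 'C') ? 1 : (c == 'G') ? 2 : 3;
        code = code * 4 + r;
    }
    return code;
}

struct OpenAddressingCounts
{
    std::vector<std::uint64_t> keys;   // q-gram code stored in each slot
    std::vector<std::size_t> counts;   // number of occurrences of that code

    explicit OpenAddressingCounts(std::size_t capacity)
        : keys(capacity, kEmptySlot), counts(capacity, 0)
    {
        assert(capacity > 0);
    }

    // Linear probing: returns the slot holding `code`, or the first empty slot.
    // Assumes the table never becomes completely full.
    std::size_t probe(std::uint64_t code) const
    {
        std::size_t i = static_cast<std::size_t>((code * 0x9E3779B97F4A7C15ull) % keys.size());
        while (keys[i] != kEmptySlot && keys[i] != code)
            i = (i + 1) % keys.size();
        return i;
    }

    void add(std::uint64_t code)
    {
        std::size_t i = probe(code);
        keys[i] = code;
        ++counts[i];
    }

    std::size_t count(std::uint64_t code) const
    {
        std::size_t i = probe(code);
        return keys[i] == code ? counts[i] : 0;
    }
};

// Counts all ungapped q-grams of the text; capacity must exceed the number of distinct q-grams.
OpenAddressingCounts countQGrams(const std::string & text, unsigned q, std::size_t capacity)
{
    OpenAddressingCounts table(capacity);
    for (std::size_t p = 0; p + q <= text.size(); ++p)
        table.add(encodeQGram(text, p, q));
    return table;
}
```

The table's memory depends on the number of distinct q-grams in the text rather than on |Σ|^q, so q = 20 over a Dna alphabet is feasible here. The price is that slots are ordered by hash value, so the q-grams are no longer sorted.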
Specialization of
Specializations
| An index based on a refined array of sorted q-grams. | |
| An index based on an array of sorted q-grams. |
Metafunctions
| Type of a specific container member (fibre). (Index) | |
| The default alphabet type of a suffix array, i.e. the type used to store a position in a string or string set. (Index) |
Functions
| Resets an object. (Index) | |
| Returns the number of occurrences of the representative substring or a q-gram in the index text. | |
| Returns the number of occurrences of a q-gram for every sequence of a StringSet. | |
| Returns the number of sequences in an index's underlying text. (Index) | |
| Builds a q-gram index on a sequence. | |
| Shortcut for | |
| The end of a container. (Index) | |
| Returns a specific fibre of a container. (Index) | |
| Creates a matrix storing the number of common q-grams between all pairs of sequences. | |
| Returns an occurrence of the representative substring or a q-gram in the index text. | |
| Returns all occurrences of the representative substring or a q-gram in the index text. | |
| Return the q-gram step size used for index creation. | |
| Shortcut for | |
| Shortcut for | |
| Shortcut for | |
| Creates a specific Fibre. (Index) | |
| Shortcut for | |
| On-demand creation of a specific Fibre. (Index) | |
| Shortcut for | |
| Shortcut for | |
| Returns whether a specific Fibre is present. (Index) | |
| Shortcut for | |
| Returns the number of characters in the underlying text of the index. (Index) | |
| This function opens an index from disk. (Index) | |
| Returns the suffix array interval borders of occurrences of the representative substring or a q-gram in the index text. | |
| This function saves an index to disk. (Index) | |
| Change the q-gram step size used for index creation. |
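The step size mentioned in the function list can be pictured as sampling the q-gram start positions. The sketch below assumes that a step size of s means only the q-grams starting at positions 0, s, 2s, ... are indexed; this interpretation and the helper name `sampledQGramPositions` are assumptions for illustration, not taken from the SeqAn API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Start positions of the q-grams that would be indexed for a given step size.
// Hypothetical helper, not part of SeqAn.
std::vector<std::size_t> sampledQGramPositions(std::size_t textLength,
                                               unsigned q,
                                               std::size_t stepSize)
{
    assert(q > 0 && stepSize > 0);
    std::vector<std::size_t> positions;
    // A q-gram starting at p needs the characters p .. p + q - 1.
    for (std::size_t p = 0; p + q <= textLength; p += stepSize)
        positions.push_back(p);
    return positions;
}
```

Under this assumption, a text of length 12 with q = 4 has nine q-grams at step size 1 but only three (positions 0, 4 and 8) at step size 4. The index shrinks accordingly, but a query only finds occurrences that start at a sampled position.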
Examples
The following code prints all occurrences of the gapped q-gram "AT-A" in "CATGATTACATA".
File "index_qgram.cpp"
4
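As a cross-check of what this example searches for, the following sketch finds the occurrences of the gapped q-gram "AT-A" with a plain linear scan. It is not the SeqAn listing and does not use the index; the function name `gappedOccurrences` and the convention that '-' marks a don't-care position are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns all 0-based text positions at which the gapped pattern matches.
// A '-' in the pattern matches any character. Hypothetical helper, not SeqAn API.
std::vector<std::size_t> gappedOccurrences(const std::string & text,
                                           const std::string & pattern)
{
    assert(!pattern.empty());
    std::vector<std::size_t> hits;
    for (std::size_t p = 0; p + pattern.size() <= text.size(); ++p)
    {
        bool match = true;
        for (std::size_t i = 0; i < pattern.size() && match; ++i)
            match = (pattern[i] == '-') || (pattern[i] == text[p + i]);
        if (match)
            hits.push_back(p);
    }
    return hits;
}
```

Calling `gappedOccurrences("CATGATTACATA", "AT-A")` returns the 0-based positions 1 and 4, corresponding to the substrings "ATGA" and "ATTA". A q-gram index answers the same query without scanning the text, by hashing only the non-gap positions of the shape and reading the matching bucket.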
Example Programs
See Also
SeqAn - Sequence Analysis Library - www.seqan.de
