Function
hash
Computes a (lower) hash value for a shape applied to a sequence.
The hash value (a.k.a. code) of a q-gram is the lexicographical rank of this q-gram in the set of all possible q-grams.
For example, the hash value of the Dna 3-gram AAG is 2 as there are only two 3-grams (AAA and AAC) having a smaller lexicographical rank.
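The rank definition above can be sketched in plain C++ without SeqAn. This is a minimal illustration, not the library's implementation: the helper names dnaOrd and qgramRank are hypothetical, and the sketch assumes the Dna alphabet order A < C < G < T and an ungapped q-gram given as a string.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Ordinal value of a Dna character under the assumed order A < C < G < T.
// (Hypothetical helper, not part of SeqAn.)
inline unsigned dnaOrd(char c)
{
    switch (c)
    {
        case 'A': return 0;
        case 'C': return 1;
        case 'G': return 2;
        case 'T': return 3;
    }
    assert(false && "not a Dna character");
    return 0;
}

// Lexicographical rank of an ungapped q-gram among all 4^q Dna q-grams.
// The q-gram is read as a base-4 number, most significant digit first.
inline std::uint64_t qgramRank(const std::string & qgram)
{
    assert(qgram.size() <= 32);  // 2 bits per character must fit into 64 bits
    std::uint64_t rank = 0;
    for (char c : qgram)
        rank = rank * 4 + dnaOrd(c);
    return rank;
}
```

Under these assumptions, qgramRank("AAG") evaluates to 2, matching the example: only AAA (rank 0) and AAC (rank 1) precede it.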
If hash is called with a gapped shape, the q-gram consists of the text characters at the non-gap positions of the shape, counted from the position of the text iterator. For example, the shape 1101 applied at the beginning of the text ACGT corresponds to the 3-gram ACT.
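How a gapped shape selects characters can be sketched as follows. This is a standalone illustration that does not use SeqAn; the function name gappedQGram and the representation of the shape as a string of '1' (non-gap) and '0' (gap) characters are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Collects the characters of `text` that lie under the '1' positions of
// `shape`, with the shape placed at offset `pos` of the text.
// (Hypothetical helper, not part of SeqAn.)
inline std::string gappedQGram(const std::string & shape,
                               const std::string & text,
                               std::size_t pos = 0)
{
    assert(pos + shape.size() <= text.size());  // the shape must fit into the text
    std::string qgram;
    for (std::size_t i = 0; i < shape.size(); ++i)
        if (shape[i] == '1')
            qgram += text[pos + i];
    return qgram;
}
```

With this sketch, gappedQGram("1101", "ACGT") yields ACT. The hash value of the gapped shape is then the lexicographical rank of the selected q-gram.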
Include Headers
seqan/index.h
Parameters
Shape to be used for hashing. Types: Shape
Sequence iterator pointing to the first character of the shape.
The distance of
Return Values
Hash value of the shape.
Member of
Examples
Code example that computes hash values of 4-grams with different shapes starting at the beginning of a text.
File "shape_hash.cpp"
The resulting hexadecimal hash values of the three 4-mers GATT, GATC and GATA are:
0x8f
0x8d
0x8c
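The hexadecimal values for the three 4-mers can be checked independently of SeqAn. The sketch below assumes the Dna order A < C < G < T, so that each character occupies two bits and the lexicographical rank equals the 2-bit packed q-gram. The function name hashHex is hypothetical and the code is not taken from shape_hash.cpp.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Packs a Dna q-gram into an integer, two bits per character, and returns
// the result formatted as a hexadecimal string such as "0x8f".
// (Hypothetical helper, not part of SeqAn.)
inline std::string hashHex(const std::string & qgram)
{
    assert(qgram.size() <= 32);  // 2 bits per character must fit into 64 bits
    std::uint64_t h = 0;
    for (char c : qgram)
    {
        std::uint64_t code = 0;  // 'A' and any unexpected character map to 0
        if (c == 'C') code = 1;
        else if (c == 'G') code = 2;
        else if (c == 'T') code = 3;
        h = (h << 2) | code;     // shift in the next 2-bit character code
    }
    std::ostringstream out;
    out << "0x" << std::hex << h;
    return out.str();
}
```

For GATT the packed value is 2·64 + 0·16 + 3·4 + 3 = 143, so hashHex("GATT") gives 0x8f; GATC and GATA give 0x8d and 0x8c in the same way.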
SeqAn - Sequence Analysis Library - www.seqan.de