Example Program
Constraint Iterator
Example of using node predicates on a deferred suffix tree.
Given a sequence, we want to find all substrings s that fulfill certain constraints.
The relative probability of seeing s should be at least p_min, and s should be no longer than
replen_max.
The latter constraint is an anti-monotonic pattern predicate and can be used in conjunction with the
first constraint to cut off the trunk of a suffix tree. Only the top of the suffix tree contains candidates
that might fulfill both predicates, so we can use an Index based on a deferred suffix tree (see IndexWotd).
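To make the two constraints concrete, here is a minimal brute-force sketch in plain C++ (standard library only, no SeqAn). It assumes that the relative probability of s is its occurrence count divided by the number of windows of length |s| in the text; the demo itself may normalize differently. The names Constraints and frequentSubstrings are hypothetical and only serve this illustration.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical parameter bundle; the member names mirror p_min and
// replen_max from the text above.
struct Constraints {
    double p_min;            // minimum relative frequency of a substring
    std::size_t replen_max;  // maximum allowed substring length
};

// Returns every non-empty substring of `text` that satisfies both
// constraints, mapped to its (overlapping) occurrence count.
// Assumption: relative frequency = occurrences / (n - |s| + 1).
std::map<std::string, std::size_t>
frequentSubstrings(const std::string& text, const Constraints& c) {
    const std::size_t n = text.size();
    std::map<std::string, std::size_t> counts;

    // Length constraint: only substrings of length 1..replen_max are counted.
    for (std::size_t len = 1; len <= c.replen_max && len <= n; ++len)
        for (std::size_t pos = 0; pos + len <= n; ++pos)
            ++counts[text.substr(pos, len)];

    // Frequency constraint: keep substrings that are frequent enough.
    std::map<std::string, std::size_t> result;
    for (const auto& entry : counts) {
        const double windows = static_cast<double>(n - entry.first.size() + 1);
        if (static_cast<double>(entry.second) / windows >= c.p_min)
            result.insert(entry);
    }
    return result;
}
```

For the text "abab" with p_min = 0.5 and replen_max = 2, this keeps "a", "ab" and "b" (two occurrences each) and rejects "ba". The sketch inspects every substring up to the length limit, which is exactly the work the suffix tree predicates are meant to avoid.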
The following example demonstrates how to iterate over all suffix tree nodes fulfilling the constraints and output them.
File "index_node_predicate.cpp"
A tutorial showing how to extend an index with a node predicate.
constraint parameters
SeqAn extensions
We begin with a String to store our sequence.
Then we create our customized index, which is a specialization
of the deferred wotd-Index.
To find all strings that fulfill our constraints,
we simply do a depth-first traversal via goBegin and goNext.
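The effect of such a constrained depth-first traversal can be sketched in plain C++ without SeqAn. The sketch below walks an implicit suffix trie, so unlike a suffix tree iterator it visits every substring rather than only branching nodes. It assumes a relative frequency of occurrences / (n - |s| + 1), and it prunes with a bound that uses the smallest possible denominator so that a pruned prefix cannot have a qualifying extension. Hit, dfs and constrainedDfs are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

struct Hit {
    std::string rep;  // the substring spelled out from the root
    std::size_t occ;  // number of occurrences in the text
};

// `starts` holds every start position of `prefix` in `text`.
void dfs(const std::string& text, const std::string& prefix,
         const std::vector<std::size_t>& starts,
         double pMin, std::size_t maxLen, std::vector<Hit>& out) {
    const std::size_t n = text.size();

    // Pruning test: occurrence counts never grow when a prefix is extended,
    // so if even the most favorable denominator fails, the subtree is skipped.
    if (prefix.size() > maxLen) return;
    if (static_cast<double>(starts.size()) / static_cast<double>(n - maxLen + 1) < pMin)
        return;

    // Exact test for the current node (the empty root is not reported here).
    if (!prefix.empty() &&
        static_cast<double>(starts.size()) / static_cast<double>(n - prefix.size() + 1) >= pMin)
        out.push_back({prefix, starts.size()});

    // Group the occurrences by their next character and descend in order.
    std::map<char, std::vector<std::size_t>> children;
    for (std::size_t s : starts)
        if (s + prefix.size() < n)
            children[text[s + prefix.size()]].push_back(s);
    for (const auto& child : children)
        dfs(text, prefix + child.first, child.second, pMin, maxLen, out);
}

std::vector<Hit> constrainedDfs(const std::string& text, double pMin,
                                std::size_t replenMax) {
    std::vector<Hit> out;
    if (text.empty()) return out;
    const std::size_t maxLen = std::min(replenMax, text.size());
    std::vector<std::size_t> starts(text.size());
    for (std::size_t i = 0; i < starts.size(); ++i) starts[i] = i;
    dfs(text, "", starts, pMin, maxLen, out);
    return out;
}
```

Calling constrainedDfs("abab", 0.5, 2) yields "a", "ab" and "b" in that order, each with two occurrences; the branch starting with "ba" is cut off before it is reported.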
countOccurrences returns the number of hits of the representative.
The representative string can be determined with representative.
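What these two accessors report can be mimicked on a sorted suffix array, where the suffixes below a node form one contiguous interval. The following sketch is a plain C++ illustration under that assumption and is not how SeqAn implements countOccurrences or representative; buildSuffixArray, findInterval, countHits and representativeOf are hypothetical helpers, and the suffix array is built naively by sorting.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <string>
#include <vector>

// Naive suffix array: start positions sorted by the suffix they begin.
std::vector<std::size_t> buildSuffixArray(const std::string& text) {
    std::vector<std::size_t> sa(text.size());
    std::iota(sa.begin(), sa.end(), std::size_t{0});
    std::sort(sa.begin(), sa.end(), [&](std::size_t a, std::size_t b) {
        return text.compare(a, std::string::npos, text, b, std::string::npos) < 0;
    });
    return sa;
}

struct Interval { std::size_t lo, hi; };  // half-open range [lo, hi) in sa

// Range of suffixes that start with `pattern` (binary search on prefixes).
Interval findInterval(const std::string& text, const std::vector<std::size_t>& sa,
                      const std::string& pattern) {
    auto lo = std::lower_bound(sa.begin(), sa.end(), pattern,
        [&](std::size_t pos, const std::string& p) {
            return text.compare(pos, p.size(), p) < 0;
        });
    auto hi = std::upper_bound(sa.begin(), sa.end(), pattern,
        [&](const std::string& p, std::size_t pos) {
            return text.compare(pos, p.size(), p) > 0;
        });
    return {static_cast<std::size_t>(lo - sa.begin()),
            static_cast<std::size_t>(hi - sa.begin())};
}

// Number of hits: every suffix in the interval is one occurrence.
std::size_t countHits(const Interval& iv) { return iv.hi - iv.lo; }

// Representative: the first `depth` characters of any suffix in the interval.
std::string representativeOf(const std::string& text,
                             const std::vector<std::size_t>& sa,
                             const Interval& iv, std::size_t depth) {
    if (iv.lo == iv.hi) return std::string();
    return text.substr(sa[iv.lo], depth);
}
```

For the text "abab", the interval of "ab" contains two suffixes, so countHits gives 2 and representativeOf with depth 2 returns "ab". In this sketch the empty pattern matches every suffix, so its count equals the text length.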
Output
weese@tanne:~/seqan/demos$ make index_node_predicate
weese@tanne:~/seqan/demos$ ./index_node_predicate
38x ""
6x " "
3x " wo"
2x " wood"
2x "a"
4x "c"
2x "chuck"
2x "ck"
3x "d"
2x "d "
2x "huck"
2x "k"
6x "o"
2x "od"
2x "ood"
3x "u"
2x "uck"
4x "w"
3x "wo"
2x "wood"
weese@tanne:~/seqan/demos$
See
SeqAn - Sequence Analysis Library - www.seqan.de