| bag_of_word_ify | Function to convert a record into a bag of tokens with a fieldwise flag |
| bag_signatures | Function that reduces a bag of words into a signature matrix using multiple random projections |
| block.ids.from.blocking | Returns the block ids associated with a blocking method |
| calc_idf | Function to calculate the inverse document frequency given a shingled bag of words |
| confusion.from.blocking | Perform evaluations (recall) for blocking |
| klsh | Function that reduces a bag of words into a signature matrix using multiple random projections |
| reduction.ratio | Returns the reduction ratio associated with a blocking method |
| reduction.ratio.from.blocking | Returns the reduction ratio associated with a blocking method |
| rproject_bags | Function that generates unit random vectors and takes (weighted) projections onto the random unit vectors given a bag of words |
| sacks_of_bags_of_words | Function to convert all records into a bag of tokens |
| tokenify | Function to tokenize a string into its k components |
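
Two of the ideas named in the index, splitting a string into length-k tokens and computing a reduction ratio, can be illustrated with a minimal base R sketch. The helpers `shingle` and `reduction_ratio` below are hypothetical names introduced for illustration only; they are not the package's functions and make no claim about its signatures. The sketch assumes non-overlapping blocks given as one block label per record, and uses the standard definition of the reduction ratio as one minus the fraction of record pairs that remain within blocks.

```r
# Hypothetical helpers for illustration; not the package's own functions.

# Split a string into its overlapping substrings of length k (k-shingles).
shingle <- function(x, k = 2) {
  n <- nchar(x)
  if (n < k) return(x)               # string too short: return it unchanged
  substring(x, 1:(n - k + 1), k:n)   # vectorised over start/stop positions
}

# Reduction ratio: 1 - (pairs compared within blocks) / (all possible pairs).
# Assumes block_ids[i] is the single block label assigned to record i.
reduction_ratio <- function(block_ids) {
  n <- length(block_ids)
  block_sizes <- as.vector(table(block_ids))
  1 - sum(choose(block_sizes, 2)) / choose(n, 2)
}

shingle("smith", k = 2)              # "sm" "mi" "it" "th"
reduction_ratio(c(1, 1, 2, 2, 3))    # 0.8
```

In the second call, five records give 10 possible pairs, and the two blocks of size two leave 2 pairs to compare, so the ratio is 1 - 2/10 = 0.8.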