| basic-tokenizers | Basic tokenizers |
| chunk_text | Chunk text into smaller segments |
| count_characters | Count words, sentences, characters |
| count_sentences | Count words, sentences, characters |
| count_words | Count words, sentences, characters |
| mobydick | The text of Moby Dick |
| ngram-tokenizers | N-gram tokenizers |
| tokenizers | Tokenizers |
| tokenize_characters | Basic tokenizers |
| tokenize_character_shingles | Character shingle tokenizers |
| tokenize_lines | Basic tokenizers |
| tokenize_ngrams | N-gram tokenizers |
| tokenize_paragraphs | Basic tokenizers |
| tokenize_ptb | Penn Treebank Tokenizer |
| tokenize_regex | Basic tokenizers |
| tokenize_sentences | Basic tokenizers |
| tokenize_skip_ngrams | N-gram tokenizers |
| tokenize_words | Basic tokenizers |
| tokenize_word_stems | Word stem tokenizer |