| Class Summary | |
|---|---|
| LuceneIndexHelper | Utility for checking whether a field exists in a Lucene index. |
| LuceneSegmentInputFormat | InputFormat implementation which splits a Lucene index at the segment level. |
| LuceneSegmentInputSplit | InputSplit implementation that represents a Lucene segment. |
| LuceneSegmentRecordReader | RecordReader implementation for Lucene segments. |
| LuceneStorageConfiguration | Holds all the configuration for SequenceFilesFromLuceneStorage, which generates a sequence file with id as the key and a content field as value. |
| MailArchivesClusteringAnalyzer | Custom Lucene Analyzer designed for aggressive feature reduction when clustering the ASF Mail Archives, using an extended set of stop words, exclusion of non-alphanumeric tokens, and Porter stemming. |
| MultipleTextFileInputFormat | Used with the WholeFileRecordReader class to combine a large number of text files into one text input reader. |
| PrefixAdditionFilter | Default parser for parsing text into sequence files. |
| ReadOnlyFileSystemDirectory | This class implements a read-only Lucene Directory on top of a general FileSystem. |
| SequenceFilesFromDirectory | Converts a directory of text documents into SequenceFiles of specified chunkSize. |
| SequenceFilesFromDirectoryFilter | Extend this class if you wish to extend SequenceFilesFromDirectory with your own parsing logic. |
| SequenceFilesFromDirectoryMapper | Map class for the SequenceFilesFromDirectory MapReduce job. |
| SequenceFilesFromLuceneStorage | Generates a sequence file from a Lucene index with a specified id field as the key and a content field as the value. |
| SequenceFilesFromLuceneStorageDriver | Driver class for the lucene2seq program. |
| SequenceFilesFromLuceneStorageMapper | Maps document IDs to key/value pairs with the ID field as the key and the concatenated stored field(s) as the value. |
| SequenceFilesFromLuceneStorageMRJob | Generates a sequence file from a Lucene index via MapReduce. |
| SequenceFilesFromMailArchives | Converts a directory of gzipped mail archives into SequenceFiles of specified chunkSize. |
| SequenceFilesFromMailArchivesMapper | Map class for the SequenceFilesFromMailArchives job. |
| TextParagraphSplittingJob | |
| TextParagraphSplittingJob.SplitMap | |
| WholeFileRecordReader | RecordReader used with the MultipleTextFileInputFormat class to read full files as key/value pairs and groups of files as single input splits. |
| WikipediaToSequenceFile | Create and run the Wikipedia Dataset Creator. |

| Enum Summary | |
|---|---|
| SequenceFilesFromLuceneStorageMapper.DataStatus | |
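
Several classes above (SequenceFilesFromDirectory, PrefixAdditionFilter, SequenceFilesFromMailArchives) share one idea: turn a directory of documents into key/value records grouped into chunks of bounded size. The sketch below illustrates that idea with the plain JDK only. It does not use or reproduce the Mahout or Hadoop APIs. The class `DirectoryToPairsSketch`, its methods, the `prefix + "/" + relativePath` key format, and the byte-based chunk limit are all hypothetical choices made for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical, JDK-only illustration; not part of Mahout. */
public class DirectoryToPairsSketch {

    /** One key/value record: a document id and the document text. */
    public static final class Doc {
        public final String key;
        public final String value;

        public Doc(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    /** Walks root and returns one Doc per regular file, keyed by keyPrefix + "/" + relative path. */
    public static List<Doc> readDirectory(Path root, String keyPrefix) throws IOException {
        List<Path> files;
        try (Stream<Path> walk = Files.walk(root)) {
            files = walk.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }
        List<Doc> docs = new ArrayList<>();
        for (Path file : files) {
            // Normalize separators so keys look the same on every platform.
            String relative = root.relativize(file).toString().replace('\\', '/');
            String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            docs.add(new Doc(keyPrefix + "/" + relative, text));
        }
        return docs;
    }

    /** Groups docs in order; a new chunk starts when adding a doc would exceed maxChunkBytes. */
    public static List<List<Doc>> chunk(List<Doc> docs, long maxChunkBytes) {
        List<List<Doc>> chunks = new ArrayList<>();
        List<Doc> current = new ArrayList<>();
        long currentBytes = 0;
        for (Doc doc : docs) {
            long size = doc.key.getBytes(StandardCharsets.UTF_8).length
                    + doc.value.getBytes(StandardCharsets.UTF_8).length;
            // An oversized document still gets a chunk of its own.
            if (!current.isEmpty() && currentBytes + size > maxChunkBytes) {
                chunks.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(doc);
            currentBytes += size;
        }
        if (!current.isEmpty()) {
            chunks.add(current);
        }
        return chunks;
    }

    public static void main(String[] args) throws IOException {
        // Build a tiny throwaway corpus so the sketch runs without external input.
        Path root = Files.createTempDirectory("docs");
        Files.write(root.resolve("a.txt"), "first document".getBytes(StandardCharsets.UTF_8));
        Files.write(root.resolve("b.txt"), "second document".getBytes(StandardCharsets.UTF_8));

        List<List<Doc>> chunks = chunk(readDirectory(root, "corpus"), 32);
        for (int i = 0; i < chunks.size(); i++) {
            for (Doc doc : chunks.get(i)) {
                System.out.println("chunk " + i + ": " + doc.key + " -> " + doc.value);
            }
        }
    }
}
```

In a real job each chunk would be written out as one sequence file rather than held in memory. The in-memory lists here only keep the grouping logic easy to follow.
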