java.lang.Object
  org.apache.hadoop.mapreduce.InputFormat<K,V>
    org.apache.hadoop.mapreduce.lib.input.FileInputFormat<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text>
      org.apache.hadoop.mapreduce.lib.input.TextInputFormat
        org.apache.mahout.text.wikipedia.XmlInputFormat
public class XmlInputFormat
extends org.apache.hadoop.mapreduce.lib.input.TextInputFormat
Reads records that are delimited by a specific begin/end tag.
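The idea of a record "delimited by a specific begin/end tag" can be illustrated without Hadoop. The following is a minimal sketch in plain Java, not Mahout's implementation: the class `TagDelimitedRecords` and its `extract` method are hypothetical names introduced here, and the sketch assumes the whole input fits in one string and that records are not nested.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of Mahout): pulls tag-delimited records out of a string. */
class TagDelimitedRecords {

    /**
     * Returns every substring that starts with startTag and ends with the next
     * endTag, tags included. Scans left to right; nesting is not supported.
     */
    static List<String> extract(String input, String startTag, String endTag) {
        List<String> records = new ArrayList<>();
        int pos = 0;
        while (true) {
            int begin = input.indexOf(startTag, pos);
            if (begin < 0) {
                break; // no further record starts
            }
            int end = input.indexOf(endTag, begin + startTag.length());
            if (end < 0) {
                break; // unterminated record: skip it
            }
            int stop = end + endTag.length();
            records.add(input.substring(begin, stop));
            pos = stop; // resume scanning after the record just emitted
        }
        return records;
    }

    public static void main(String[] args) {
        String xml = "<doc><page>one</page>junk<page>two</page></doc>";
        for (String record : extract(xml, "<page>", "</page>")) {
            System.out.println(record);
        }
    }
}
```

Text outside the begin/end tags (here `<doc>`, `junk`, `</doc>`) is discarded, which matches the class description of emitting only the delimited blocks as records. In an actual job the two tags would presumably be supplied through the configuration entries named by `START_TAG_KEY` and `END_TAG_KEY`.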
| Nested Class Summary | |
|---|---|
| static class | XmlInputFormat.XmlRecordReader: reads through a given XML document and outputs XML blocks as records, as delimited by the start tag and end tag. |
| Nested classes/interfaces inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat |
|---|
| org.apache.hadoop.mapreduce.lib.input.FileInputFormat.Counter |
| Field Summary | |
|---|---|
| static String | END_TAG_KEY |
| static String | START_TAG_KEY |
| Constructor Summary | |
|---|---|
| XmlInputFormat() | |
| Method Summary | |
|---|---|
| org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text> | createRecordReader(org.apache.hadoop.mapreduce.InputSplit split, org.apache.hadoop.mapreduce.TaskAttemptContext context) |
| Methods inherited from class org.apache.hadoop.mapreduce.lib.input.TextInputFormat |
|---|
| isSplitable |
| Methods inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat |
|---|
| addInputPath, addInputPaths, computeSplitSize, getBlockIndex, getFormatMinSplitSize, getInputPathFilter, getInputPaths, getMaxSplitSize, getMinSplitSize, getSplits, listStatus, setInputPathFilter, setInputPaths, setInputPaths, setMaxInputSplitSize, setMinInputSplitSize |
| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
|---|
public static final String START_TAG_KEY
public static final String END_TAG_KEY
| Constructor Detail |
|---|
public XmlInputFormat()
| Method Detail |
|---|
public org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text> createRecordReader(org.apache.hadoop.mapreduce.InputSplit split, org.apache.hadoop.mapreduce.TaskAttemptContext context)
Overrides: createRecordReader in class org.apache.hadoop.mapreduce.lib.input.TextInputFormat