public class AnchorExtract
extends java.lang.Object
Djoerd Hiemstra and Claudia Hauff. "MIREX: MapReduce Information Retrieval Experiments" Technical Report TR-CTIT-10-15, Centre for Telematics and Information Technology, University of Twente, ISSN 1381-3625, 2010
Modifier and Type | Class and Description |
---|---|
static class | AnchorExtract.Combine -- Combiner: Glues local anchor texts together. |
static class | AnchorExtract.Map -- Mapper: Extracts anchors. |
static class | AnchorExtract.Reduce -- Reducer: Glues anchor texts together, and recovers TREC-ID. |
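The division of labour among the three nested classes can be illustrated with a minimal, Hadoop-free sketch. It is not the MIREX source: the class name `AnchorSketch`, the `DOC-ID:` marker, the simplified anchor regex, the tab-separated output layout, and the choice to drop URLs without a known ID or without anchor text are all assumptions made for illustration. The idea shown is that the map step emits (target URL, anchor text) pairs plus one (own URL, marked ID) pair per document, so that the reduce step can both glue the texts together and recover the ID of the page they point to.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AnchorSketch {
    // Hypothetical marker that tags a value as a document ID instead of anchor text.
    static final String ID_MARKER = "DOC-ID:";

    // Simplified pattern (assumption): double-quoted href, no nested tags in the link text.
    static final Pattern ANCHOR = Pattern.compile(
        "<a\\s+[^>]*href=\"([^\"]+)\"[^>]*>([^<]*)</a>", Pattern.CASE_INSENSITIVE);

    // Map step: emit (own URL, marked ID) once, then (target URL, anchor text) per link.
    static List<String[]> map(String docId, String docUrl, String html) {
        List<String[]> out = new ArrayList<>();
        out.add(new String[] {docUrl, ID_MARKER + docId});
        Matcher m = ANCHOR.matcher(html);
        while (m.find()) {
            String text = m.group(2).trim();
            if (!text.isEmpty()) {
                out.add(new String[] {m.group(1), text});
            }
        }
        return out;
    }

    // Combine step: glue the anchor texts seen locally for one URL; pass ID values through.
    static List<String> combine(List<String> values) {
        List<String> out = new ArrayList<>();
        StringBuilder glued = new StringBuilder();
        for (String v : values) {
            if (v.startsWith(ID_MARKER)) {
                out.add(v);
            } else {
                if (glued.length() > 0) glued.append('\t');
                glued.append(v);
            }
        }
        if (glued.length() > 0) out.add(glued.toString());
        return out;
    }

    // Reduce step: glue all anchor texts for one URL and recover its ID.
    // Returns null when the ID is unknown or no anchor text was collected (assumption).
    static String reduce(String url, List<String> values) {
        String id = null;
        StringBuilder glued = new StringBuilder();
        for (String v : values) {
            if (v.startsWith(ID_MARKER)) {
                id = v.substring(ID_MARKER.length());
            } else {
                if (glued.length() > 0) glued.append('\t');
                glued.append(v);
            }
        }
        if (id == null || glued.length() == 0) return null;
        return id + "\t" + url + "\t" + glued;
    }

    public static void main(String[] args) {
        // Two toy documents: doc-1 links to doc-2's URL.
        List<String[]> pairs = new ArrayList<>();
        pairs.addAll(map("doc-1", "http://a.example/",
            "<a href=\"http://b.example/\">B page</a>"));
        pairs.addAll(map("doc-2", "http://b.example/", "<p>no links</p>"));

        // Collect the values for one key by hand (the framework's shuffle does this in a real job).
        List<String> forB = new ArrayList<>();
        for (String[] p : pairs) {
            if (p[0].equals("http://b.example/")) forB.add(p[1]);
        }
        System.out.println(reduce("http://b.example/", combine(forB)));
    }
}
```

Because the combine step keeps ID values separate and only concatenates anchor texts, its output can be fed to the reduce step unchanged, which is the property a combiner needs.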
Constructor and Description |
---|
AnchorExtract() |
Modifier and Type | Method and Description |
---|---|
static void | main(java.lang.String[] args) -- Runs the MapReduce job "anchor text extraction". |
public static void main(java.lang.String[] args) throws java.lang.Exception
Parameters:
args - 0: path to web collection on HDFS; 1: (non-existing) path that will contain anchor texts
Throws:
java.lang.Exception
hadoop jar mirex-0.2.jar nl.utwente.mirex.AnchorExtract /user/hadoop/ClueWeb09_English/*/ /user/hadoop/ClueWeb09_Anchors