outputformat - How to get multiple outputs in Hadoop
I'm new to Hadoop and have to process an input file. I want to process each line, and the output should be one file for each line.
I searched the internet and found MultipleOutputFormat and generateFileNameForKeyValue.
But the people who use them write against the JobConf class. I'm using Hadoop 0.20.1, so I think the Job class takes its place, and I don't know how to use the Job class to generate multiple output files by key.
Could anyone help me?
The Eclipse plugin can be used to submit and monitor jobs and to interact with HDFS, against either a real or a 'pseudo' cluster.
If you're running in local mode, I don't think the plugin gains you anything, seeing as your job will run in a single JVM. With that in mind, I'd include the most recent 1.x hadoop-core jar in your Eclipse project's classpath.
Either way, MultipleOutputFormat has not been ported to the new mapreduce package (in neither 1.1.2 nor 2.0.4-alpha), so you'll either need to port it yourself or find another way (maybe MultipleOutputs; its Javadoc page has notes on using MultipleOutputs).
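If you go the MultipleOutputs route, the part that replaces generateFileNameForKeyValue is just a function from the key to an output name. Here is a minimal sketch of that naming step in plain Java, so it runs without Hadoop on the classpath. The class name KeyedOutputNames and the helper baseOutputPathFor are hypothetical names made up for illustration, and the sanitising rule (anything outside letters, digits, '.', '_' and '-' becomes '_') is only an assumption about what makes a safe file name. The Hadoop call in the comment assumes a release that ships org.apache.hadoop.mapreduce.lib.output.MultipleOutputs, which stock 0.20.1 may not.

```java
public class KeyedOutputNames {

    // Hypothetical helper: maps a record key to a base output name.
    // Characters that are awkward in file names are replaced with '_'.
    static String baseOutputPathFor(String key) {
        String cleaned = key.trim().replaceAll("[^A-Za-z0-9._-]", "_");
        // Fall back to a fixed name so an empty key still gets a file.
        return cleaned.isEmpty() ? "unnamed" : cleaned;
    }

    // Assumed usage inside a reducer written against the new API
    // (requires Hadoop on the classpath, so it is left as a comment):
    //
    //   MultipleOutputs<Text, Text> mos = new MultipleOutputs<Text, Text>(context);
    //   mos.write(key, value, baseOutputPathFor(key.toString()));
    //   ...
    //   mos.close();   // typically in cleanup()

    public static void main(String[] args) {
        String[] keys = {"user 42", "a/b", ""};
        for (String key : keys) {
            System.out.println("[" + key + "] -> " + baseOutputPathFor(key));
        }
    }
}
```

With one distinct key per input line (for example the line's byte offset), this gives one output name per line. Note that the helper deliberately flattens '/' into '_', so every file lands directly in the job's output directory.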