java - Understanding the Hadoop File System Counters


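I want to understand the file system counters in Hadoop.

Below are the counters for a job that I ran.

In every job that I run, I observe that the map FILE_BYTES_READ is roughly equal to HDFS_BYTES_READ. I also observe that the FILE_BYTES_WRITTEN by the map is roughly the sum of the FILE_BYTES_READ and HDFS_BYTES_READ of the mapper. Please help! Is the same data being read from both the local file system and HDFS, and is all of it being written to the local file system by the map phase?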

    Counter               Map
    FILE_BYTES_READ       5,062,341,139
    HDFS_BYTES_READ       4,405,881,342
    FILE_BYTES_WRITTEN    9,309,466,964
    HDFS_BYTES_WRITTEN    0

Thanks!

So the answer is that what you are noticing is job specific. Depending on the job, the mappers/reducers will write more or fewer bytes to the local file system compared to HDFS.

In your mapper's case, you have a similar amount of data read from both the local and HDFS locations, and there is no problem there. Your mapper code just happens to need to read about the same amount of data locally as it reads from HDFS. Most of the time mappers are used to analyze an amount of data greater than their RAM, so it's not surprising to see them writing the data they get from HDFS to a local drive. The number of bytes read from HDFS and from local disk is not always going to sum to the local write size (which it doesn't even in your case).

Here is an example using TeraSort, with 100G of data, 1 billion key/value pairs.
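To check that last point against the numbers in the question, here is a minimal sketch in plain Java. It does not use any Hadoop API; the class and method names are made up for illustration, and the values are copied from the map column of the question.

```java
class MapCounterCheck {
    // Map-side counter values copied from the question.
    static final long FILE_BYTES_READ = 5_062_341_139L;
    static final long HDFS_BYTES_READ = 4_405_881_342L;
    static final long FILE_BYTES_WRITTEN = 9_309_466_964L;

    /** Sum of the local and HDFS bytes read by the map tasks. */
    static long totalBytesRead() {
        return FILE_BYTES_READ + HDFS_BYTES_READ;
    }

    /** How far the read total overshoots the local bytes written. */
    static long readMinusWritten() {
        return totalBytesRead() - FILE_BYTES_WRITTEN;
    }

    public static void main(String[] args) {
        System.out.println("read total    = " + totalBytesRead());
        System.out.println("local written = " + FILE_BYTES_WRITTEN);
        System.out.println("difference    = " + readMinusWritten());
    }
}
```

The read total comes to 9,468,222,481 bytes, which is 158,755,517 bytes more than the 9,309,466,964 bytes written locally, so the two figures are close but not equal.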

    File System Counters
        FILE: Number of bytes read=219712810984
        FILE: Number of bytes written=312072614456
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=100000061008
        HDFS: Number of bytes written=100000000000
        HDFS: Number of read operations=2976
        HDFS: Number of large read operations=0

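Things to notice: the number of bytes read from and written to HDFS is almost exactly 100G. This is because 100G needed to be sorted, and the final sorted files need to be written. Also notice that it needs to do a lot of local reads and writes to hold and sort the data, roughly 2x and 3x the amount of data it read from HDFS!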
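The 2x and 3x figures can be worked out directly from the TeraSort counters. The sketch below parses `name=value` counter lines and divides the local byte counts by the HDFS bytes read. It is only string parsing under the assumption that the lines look like the output above; the class and method names are hypothetical and it does not call any Hadoop API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class TeraSortCounterRatios {
    // Byte counters copied from the TeraSort run in this answer.
    static final String DUMP = String.join("\n",
        "File System Counters",
        "FILE: Number of bytes read=219712810984",
        "FILE: Number of bytes written=312072614456",
        "HDFS: Number of bytes read=100000061008",
        "HDFS: Number of bytes written=100000000000");

    /** Parses "name=value" lines into a map keyed by counter name. */
    static Map<String, Long> parse(String dump) {
        Map<String, Long> counters = new LinkedHashMap<>();
        for (String line : dump.split("\n")) {
            int eq = line.lastIndexOf('=');
            if (eq < 0) {
                continue; // skip lines without a value, such as the group header
            }
            String name = line.substring(0, eq).trim();
            long value = Long.parseLong(line.substring(eq + 1).trim());
            counters.put(name, value);
        }
        return counters;
    }

    /** Ratio of one counter to another, as a floating-point number. */
    static double ratio(Map<String, Long> counters, String numerator, String denominator) {
        long n = counters.get(numerator);
        long d = counters.get(denominator);
        return (double) n / d;
    }

    public static void main(String[] args) {
        Map<String, Long> c = parse(DUMP);
        String hdfsRead = "HDFS: Number of bytes read";
        System.out.println("local read / HDFS read    = "
            + ratio(c, "FILE: Number of bytes read", hdfsRead));
        System.out.println("local written / HDFS read = "
            + ratio(c, "FILE: Number of bytes written", hdfsRead));
    }
}
```

For this run the local reads come out at about 2.2 times the HDFS input and the local writes at about 3.1 times.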

As a final note, unless you just want to run a job without caring about the result, the number of HDFS bytes written should never be 0, and yours shows HDFS_BYTES_WRITTEN as 0.

