You have just executed a MapReduce job. Where is intermediate data written to after being emitted from the ....

0 votes
asked Aug 10, 2016 in CCD 470 Cloudera Certified Developer for Apache Hadoop CDH4 Upgrade Exam (CCDH) by John Hayes (470 points)
retagged Aug 14, 2016 by admin
You have just executed a MapReduce job. Where is intermediate data written to after being emitted from the Mapper’s map method?

A. Intermediate data is streamed across the network from the Mapper to the Reducer and is never written to disk.

B. Into in-memory buffers on the TaskTracker node running the Mapper that spill over and are written into HDFS.

C. Into in-memory buffers that spill over to the local file system of the TaskTracker node running the Mapper.

D. Into in-memory buffers that spill over to the local file system (outside HDFS) of the TaskTracker node running the Reducer.

E. Into in-memory buffers on the TaskTracker node running the Reducer that spill over and are written into HDFS.

1 Answer

0 votes
answered Aug 10, 2016 by Sandra Reeds (1,040 points)

Answer: C

Explanation:

The mapper output (intermediate data) is buffered in memory and spilled to the local file system (not HDFS) of the node running each mapper. This is typically a temporary directory whose location the Hadoop administrator can set in the configuration. The intermediate data is cleaned up after the Hadoop job completes.
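To make the buffer-and-spill behaviour concrete, here is a toy simulation. It is not Hadoop code: the class `SpillBuffer`, the function `run_map`, the `max_records` threshold, and the `spillN.out` file names are all hypothetical and chosen only for illustration. Real Hadoop sizes the buffer in bytes and also partitions the output, which this sketch leaves out.

```python
import os
import tempfile


class SpillBuffer:
    """Toy model of a map-side output buffer (illustrative, not Hadoop's implementation)."""

    def __init__(self, local_dir, max_records=3):
        self.local_dir = local_dir      # stands in for the mapper node's local scratch directory
        self.max_records = max_records  # stands in for the buffer size threshold
        self.records = []
        self.spill_files = []

    def emit(self, key, value):
        # Called for each (key, value) pair produced by the map function.
        self.records.append((key, value))
        if len(self.records) >= self.max_records:
            self.spill()

    def spill(self):
        # Sort the buffered records by key and write them to a local file.
        if not self.records:
            return
        path = os.path.join(self.local_dir, f"spill{len(self.spill_files)}.out")
        with open(path, "w") as f:
            for key, value in sorted(self.records):
                f.write(f"{key}\t{value}\n")
        self.spill_files.append(path)
        self.records = []


def run_map(words, local_dir, max_records=3):
    """Emit (word, 1) for every word and return the list of local spill files."""
    buf = SpillBuffer(local_dir, max_records)
    for word in words:
        buf.emit(word, 1)
    buf.spill()  # flush whatever is left when the map task finishes
    return buf.spill_files


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        for path in run_map("a b a c b a d".split(), scratch):
            print(path)
```

Every spill file lands under the local scratch directory passed in, never in a distributed file system, which is the point option C makes about the node running the Mapper.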

Reference:
24 Interview Questions & Answers for Hadoop MapReduce developers, Where is the Mapper Output (intermediate key-value data) stored?
