Hadoop works best on a large data set. True or False?

0 votes
asked Oct 20, 2018 in Hadoop by admin (21,060 points)
Is it true Hadoop works best on a large data set? Why?

1 Answer

0 votes
answered Oct 20, 2018 by admin (21,060 points)
The Hadoop Distributed File System (HDFS) is designed to handle very large files. The larger the file, the less time Hadoop spends seeking for the next data location on disk, and the more time it spends reading at the full bandwidth of your disks.

Seeks are generally expensive operations that only pay off when you need to analyze a small subset of your dataset. Since Hadoop is designed to scan over your entire dataset, it is best to minimize seeks by using large files.
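A back-of-the-envelope sketch of that trade-off (the 10 ms seek time and 100 MB/s transfer rate below are assumed illustrative figures, not HDFS defaults):

```python
def seek_overhead(block_size_mb, seek_ms=10.0, transfer_mb_per_s=100.0):
    """Fraction of total read time spent seeking when reading one block."""
    seek_s = seek_ms / 1000.0
    transfer_s = block_size_mb / transfer_mb_per_s
    return seek_s / (seek_s + transfer_s)

# Larger blocks amortize the fixed seek cost over more sequential I/O.
for size in (1, 64, 128):
    print(f"{size:4d} MB block -> {seek_overhead(size):.1%} of time seeking")
```

With these assumed numbers, a 1 MB block spends half its read time seeking, while a 128 MB block spends under 1%, which is why HDFS favors block sizes in the tens or hundreds of megabytes.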