Sanjay Sharma’s Weblog

August 27, 2009

Hadoop- some revelations

Filed under: Advanced computing, Java world, Tech. Posted by indoos @ 5:44 am

My recent experience of using Hadoop in production-grade applications was both good and bad.

Here are some of the bad points to start with:

  • Using commodity servers: this is not entirely true, as is even acknowledged somewhere on the Hadoop web site. Anything below 8 GB of RAM may not be enough for a heavy production application, particularly if each Map/Reduce task uses 1-2 GB of RAM
    • The task tracker and data node JVM instances take at least around 1 GB of RAM each, effectively leaving 5-6 GB of RAM for the Map Reduce JVMs
    • At 512 MB for each Map and Reduce JVM, that leaves room for 5-8 Map + 3-6 Reduce instances
  • Real-time applications usually rely on lookup data or metadata. Although Hadoop does offer the Distributed Cache and Configuration-based (pseudo) replication of small shared data, the heavy Java in-memory object handling (serialization/deserialization) and HDFS access do not allow performant lookups
  • I would love to see more, easier, and better default control over the various settings/parameters in the config files, as the current mechanism is really a pain
  • Hadoop uses a lot of temp space. It is easy NOT to notice that only about 1/4 of your total available disk space can be used for business data. This is because two parts go to the extra replicas (3 is the default, and a good, replication factor) and one part goes to temporary (working/intermediate) data. So to process, say, 1 TB of data, you may require around 4 TB or more of disk space. I learned this the hard way, after wasting precious time!
  • Last but not least: it is really easy to write Map Reduce code using Hadoop's ingenious framework, but really difficult to convert business logic into the Map Reduce paradigm
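
The memory and disk arithmetic in the points above can be sketched in a few lines of Python. This is only a back-of-the-envelope calculation, not something Hadoop itself provides: the function names, the optional OS reserve, and the "one part" temp-space factor are assumptions made for illustration, plugged with the figures quoted above (8 GB node, about 1 GB per daemon, 512 MB per task JVM, replication factor 3).

```python
# Rough capacity estimates for a single Hadoop node and for cluster disk.
# All names and defaults here are illustrative assumptions, not Hadoop APIs.


def task_jvm_slots(total_ram_mb, daemon_ram_mb, daemons, task_heap_mb,
                   os_reserve_mb=0):
    """Number of Map/Reduce task JVMs that fit in the RAM left over
    after the Hadoop daemons (and an optional OS reserve) are counted."""
    available_mb = total_ram_mb - daemons * daemon_ram_mb - os_reserve_mb
    return max(available_mb, 0) // task_heap_mb


def raw_disk_needed_tb(data_tb, replication=3, temp_factor=1.0):
    """Raw disk needed to hold data_tb of business data: one copy per
    replica plus temp_factor parts of temporary/intermediate space."""
    return data_tb * (replication + temp_factor)


def usable_fraction(replication=3, temp_factor=1.0):
    """Share of raw disk that ends up holding unique business data."""
    return 1.0 / (replication + temp_factor)


if __name__ == "__main__":
    # 8 GB node, task tracker + data node at ~1 GB each, 512 MB per task JVM
    print(task_jvm_slots(8192, 1024, 2, 512))
    # Same node, but keeping ~1 GB aside for the OS (an assumed reserve)
    print(task_jvm_slots(8192, 1024, 2, 512, os_reserve_mb=1024))
    # 1 TB of business data at replication factor 3 plus one part temp space
    print(raw_disk_needed_tb(1.0))
    print(usable_fraction())
```

With these inputs the node fits 12 task JVMs, or 10 once about 1 GB is set aside for the OS, which is in line with the 5-8 Map + 3-6 Reduce range above. The disk estimate comes out at 4 TB of raw space for 1 TB of data, i.e. a usable fraction of 1/4.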

To be continued...

1 Comment »

  1. Great write-up. Thanks for sharing the details.

    Haven't used Map Reduce in Hadoop yet. Just played around with Hadoop serialization. I plan to use more of Hadoop, and your write-up was definitely very helpful.

    Comment by Senthil — September 5, 2009 @ 8:07 am

