Apache Hadoop
Apache Hadoop Core is a software platform that makes it easy to write and run applications that process vast amounts of data.
Here's what makes Hadoop especially useful:
- Scalable: Hadoop can reliably store and process petabytes of data.
- Economical: It distributes the data and processing across clusters of commonly available computers. These clusters can number into the thousands of nodes.
- Efficient: By distributing the data, Hadoop can process it in parallel on the nodes where the data is located. This makes it extremely fast (see the sketch after this list).
- Reliable: Hadoop automatically maintains multiple copies of data and automatically redeploys computing tasks when failures occur.
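To make the efficiency point concrete, here is a minimal sketch of the classic word-count job written against Hadoop's MapReduce Java API. The class name and input/output paths are illustrative, not part of this page; a real job is packaged into a jar and submitted to the cluster.

```java
// Minimal word-count sketch: the map phase runs in parallel on the nodes
// that hold the input blocks; the reduce phase aggregates the results.
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Each mapper processes one input split locally, emitting (word, 1) pairs.
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // The framework groups mapper output by key; the reducer sums each group.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class); // combine locally to cut network traffic
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

Once compiled into a jar, a job like this would be submitted with something along the lines of `hadoop jar wordcount.jar WordCount /input /output` (paths here are hypothetical); Hadoop then schedules the map tasks on the nodes holding the input data.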
Applications:
- Apache Hama is a distributed matrix computation package.
- Apache HBase is the Hadoop data store.
- Apache Mahout develops scalable machine learning libraries.
Presentations:
- A Tour Of Apache Hadoop
- Hadoop Map-Reduce Tuning and Debugging
- Running Hadoop In The Cloud
- Introduction To Hadoop
- Hadoop Usage At Facebook
- Hadoop Distributed File System Architecture And Design
- MapReduce Vs SQL
- Meet Hadoop
- Petabyte Scale On Commodity Infrastructure
- Programming With Hadoop's Map-Reduce
- Understanding Map-Reduce With Hadoop