Presentations about Hadoop
This is a list of presentations about Hadoop, by event and paper (newest first):
Public Presentations
Many of these presentations were given at local user groups. If there is not one in your area, start one! Take one of the existing talks and give it! Don't be afraid. The only thing to fear is trying to do live demos of MapReduce against a remote cluster. Most presenters avoid this.
SF-JUG, July 2011
- Brisk: Truly Peer-to-Peer (SriSatish Ambati, DataStax)
BioAssist Programmers' day, January 2011
- Large-Scale Data Storage and Processing for Scientists with Hadoop (Evert Lammerts, SARA High Performance Computing and Networking)
Hadoop Hackathon Amsterdam, December 2010
- Hackathon Reader for MapReduce (Evert Lammerts, SARA High Performance Computing and Networking)
FOSDEM Brussels, February 2010
- Apache Hadoop - Large scale data analysis. (Isabel Drost, Apache Mahout)
HUG Korea, December 2009
- An Introduction to Bulk Synchronization Parallel on Hadoop. (Edward J. Yoon, Apache Hama)
Hadoop Get Together Berlin, December 2009
- How we made data processing scalable at nugg.ad. (Richard Hutton, nugg.ad)
- Market research powered by Hadoop. (Nickolaus Pohle, nurago)
- Hadoop. (Jörg Möllenkamp, Sun Microsystems)
DevHouse Berlin, October 2009
- Large scale data processing with Hadoop (Isabel Drost, Apache Mahout)
Lambda Lounge (St. Louis, USA), October 2009
- Hadoop In 45 Minutes or Less (Tom Wheeler, OCI)
Hadoop World, October 2009
- SecurityCompatibilityHadoopWorld2009.pdf (Owen O'Malley, Yahoo!)
- cutting-hwnyc2009.pdf (Doug Cutting, Cloudera)
Sunnyvale Hadoop User Group, September 2009
- HUG_Sep23.pdf (Owen O'Malley, Yahoo!)
Apache Hadoop Get Together, September 2009
- Solving Puzzles with Map Reduce (Thorsten Schütt, ZIB)
- An introduction to JAQL (Thilo Götz, IBM)
- Lucene 2.9 Developments (Uwe Schindler, Apache Lucene)
Bristol Hadoop Workshop, August 2009
The Bristol Hadoop Workshop was a small meeting; these presentations were intended to start discussion and thought
- Hadoop Futures (Tom White, Cloudera)
- Hadoop and High-Energy Physics (Simon Metson, Bristol University)
- HDFS (Johan Oskarsson, Last.fm)
- Graphs (Paolo Castagna, HP)
- Long Haul Hadoop (Steve Loughran, HP)
- Benchmarking Hadoop (Steve Loughran & Julio Guijarro, HP)
FrOSCon Sankt Augustin, August 2009
- From data to information - An overview of the Hadoop ecosystem with a close-up on Mahout (Isabel Drost, Apache Mahout); video (starts with a "Hello FrOSCon visitors" round)
Hadoop Technical Discussion Presented at Machine Learning group TU Berlin, July 2009
- Apache Hadoop - Large scale data processing (Isabel Drost, Apache Mahout)
Hadoop guest lecture at Beuth Hochschule Berlin, July 2009
- Apache Hadoop - Large scale data processing (Isabel Drost, Apache Mahout)
Apache Hadoop Get Together Berlin, June 2009
- Protocol Buffers vs. Apache Thrift (Thorsten Curdt, slides available from speaker)
- Lucene for Life Science Knowledge Discovery (Dr. Christoph M. Friedrich, Fraunhofer SCAI)
Usenix, June 2009
Usenix is one of the big computing conferences. The fact that Hadoop is now a subject of discussion there is a measure of its success.
- Hadoop-USENIX09.pdf (Marco Nicosia, USENIX, June 2009)
Hadoop Summit, June 2009
This was the west coast summit, hosted by Yahoo!
- HadoopSummit09_SortBenchmarks_ArunCMurthy.pdf (Arun C. Murthy and Owen O'Malley, Hadoop Summit, June 2009)
HUG UK, April 2009
London meeting of the UK Hadoop Users Group
- Practical MapReduce (Tom White, Cloudera)
- Introducing Apache Mahout (Isabel Drost, ASF)
- The Terrier Project (Iadh Ounis and Craig Macdonald, University of Glasgow)
- Apache HBase (Michael Stack, Powerset)
- Having Fun with PageRank and MapReduce (Paolo Castagna, HP)
- HADOOP-1722 and Typed Bytes (Klaas Bosteels, Last.fm)
- Hypercubes in HBase (Fredrik Möllerstrand, Last.fm)
- Scalable reasoning on RDF documents with Hadoop and HBase (Michele Catasta, Last.fm)
Apache Hadoop Get Together Berlin, March 2009
- HBase (Lars George, Worldlingo)
- CouchDB in 20 minutes (Jan Lehnardt, couch.io)
ApacheCon EU 2009, March 2009
The main ASF get-together in Europe
- TuningAndDebuggingMapReduce_ApacheConEU09.pdf (Arun C. Murthy, ApacheCon EU, March 2009)
- apachecon_eu_2009_hadoop_in_the_cloud.pdf (Tom White, ApacheCon EU, March 2009)
- aw-apachecon-eu-2009.pdf (Allen Wittenauer, ApacheCon EU, March 2009)
- Dynamic Hadoop Clusters (Steve Loughran, ApacheCon EU, March 2009)
- Application Architecture For The Cloud (Steve Loughran, ApacheCon EU, March 2009)
Apache Hadoop Get Together, December 2008
- BI over Text on the Cloud (Alexander Löser, DIMA TU Berlin)
- Katta (Stefan Groschupf, slides available from speaker)
ApacheCon 2008, November 2008
The annual Apache US conference
- YahooHadoopIntro-apachecon-us-2008.pdf (Owen O'Malley, ApacheCon, Nov 2008)
- dhruba_apachecon2008.pdf (Dhruba Borthakur, ApacheCon, Nov 2008)
Hadoop Technical Discussion Presented by Rapleaf, October 2008
- HadoopMapReduce_TuningAndDebugging.pdf, presentation to the Hadoop Technical Discussion presented by Rapleaf, San Francisco, California (Arun C. Murthy, October 2008)
East Bay Innovation Group, October 2008
- HadoopEBIG-Oct2008.pdf, Hadoop presentation to the East Bay Innovation Group, Oakland, California (Owen O'Malley, October 2008)
NY Hadoop User Group, October 2008
- Hadoop Namenode High Availability, NY Hadoop User Group Meeting, New York, August 2008 (Paul George, ContextWeb)
Apache Hadoop Get Together Berlin, September 2008
- Hadoop/HBase as Webstore (Rasmus Hahn, neofonie)
- UIMA on Hadoop (Marc Hofer, DIMA TU Berlin)
HUG UK Meeting, August 2008
Presentations from the Hadoop User Group UK Meeting, London, August 2008
- Hadoop overview (Doug Cutting)
- Hadoop Web Services on Amazon S3/EC2 (Tom White)
- Hadoop usage at Last.fm (Martin Dittus)
- Distributed Lucene for Hadoop (Mark Butler)
- Dumbo: Hadoop streaming made elegant and easy (Klaas Bosteels)
- Deploying Apache Hadoop with Smartfrog (Steve Loughran and Julio Guijarro)
- Hadoop at Last.fm: Radio Log Analysis for A/B Tests (Elias Pampalk)
- PostgreSQL to HBase Replication (Tim Sell)
- Hadoop: Lessons learned at Last.fm (Johan Oskarsson)
Apache Hadoop Get Together Berlin, June 2008
- Crawling the DNS (Gert Pfeifer, TU Dresden)
- Hadoop @ Semgine - Einsatz im NLP Umfeld (Sascha Kohlmann, Semgine)
- Mahout (Isabel Drost, Apache Mahout)
- How we use Apache Pig (Stefan Groschupf, 101tec)
IBM Almaden Research, June 2008
- hdfs_dhruba.pdf (Dhruba Borthakur, IBM Almaden Research, June 2008)
ApacheCon EU 2008, April 2008
- aw-apachecon-eu-2008.pdf (Allen Wittenauer, ApacheCon EU, April 2008)
- ApacheConEU2008HadoopTourTomWhite.pdf (Tom White, ApacheCon EU, April 2008)
- HadoopProgramming.pdf (Owen O'Malley, ApacheCon EU, April 2008)
SPA 2008, March 2008
- MapReduce-SPA2008.pdf (Tom White, SPA 2008, March 2008, feedback, MapReduce-SPA2008-answers.pdf)
Mailtrust Tech Talk, February 2008
- MapReduce vs SQL (Stu Hood, Mailtrust Tech Talk, February 2008)
Older talks
- radlab-hadoop.pdf (Owen O'Malley and Eric Baldeschwieler, October 2007)
- Meet Hadoop (oscon-part-1.pdf, oscon-part-2.pdf) (Doug Cutting and Eric Baldeschwieler, OSCON, July 25 2007)
- HDFSDescription.pdf (Dhruba Borthakur, June 2007)
- HadoopApacheConEu07.pdf (Owen O'Malley, May 2007)
- HadoopMapReduceArch.pdf (Owen O'Malley, July 2006)
- yahoo-sds.pdf (Doug Cutting, May 2006)
Teaching
Here are some courses that have used Hadoop to teach distributed computing (newest first):
- Google: MapReduce in a Week
- CS490h, Spring 2007, University of Washington (lecture notes & labs)
- Expanded UW course taught in Fall 2008
Presentations in other languages:
- hadoop_basarim09.pdf (Turkish) (Enis Söztutar, 1. Ulusal Yüksek Başarım ve Grid Konferansı, 04/2009)
- hadoop_ets_2juillet.pdf (Jean-Daniel Cryans, École de technologie supérieure de Montréal, July 2008)