Apache Pig is a platform for analyzing large data sets. Pig's language, Pig Latin, lets you specify a sequence of data transformations such as merging data sets, filtering them, and applying functions to records or groups of records. Pig comes with many built-in functions, but you can also write your own user-defined functions for special-purpose processing.
Pig Latin programs run in a distributed fashion on a cluster (programs are compiled into Map/Reduce jobs and executed using Hadoop). For quick prototyping, Pig Latin programs can also run in "local mode" without a cluster (all processing takes place in a single local JVM).
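To make the idea of a transformation sequence concrete, here is a minimal plain-Python sketch of the kind of pipeline described above: filter records, group them, then apply a function to each group. It is not Pig Latin and does not use any Pig API. The sample data, field layout, and function names are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical sample records: (user, url, seconds_on_page)
visits = [
    ("alice", "/home", 12),
    ("bob", "/docs", 45),
    ("alice", "/docs", 60),
    ("carol", "/home", 5),
    ("bob", "/home", 90),
]


def filter_records(records, min_seconds):
    """Filtering step: keep records whose duration meets the threshold."""
    return [r for r in records if r[2] >= min_seconds]


def group_by_user(records):
    """Grouping step: collect records into lists keyed by user."""
    groups = defaultdict(list)
    for record in records:
        groups[record[0]].append(record)
    return dict(groups)


def total_seconds(groups):
    """Per-group function: sum the durations for each user."""
    return {user: sum(r[2] for r in rows) for user, rows in groups.items()}


# Chain the steps in order, the way a data-flow script chains transformations.
long_visits = filter_records(visits, 30)
per_user = total_seconds(group_by_user(long_visits))
print(per_user)
```

Each step takes the previous step's output as its input, which is the data-flow style the text describes. In this sketch everything runs in one process, loosely analogous to prototyping in local mode; on a cluster, the same logical steps would instead be executed as distributed jobs.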
Do you Pig? At Yahoo! 40% of all Hadoop jobs are run with Pig. Come join us!
- General Information
- Why Pig Latin instead of SQL? Pig Latin: A Not-So-Foreign Language ...
- Official Apache Pig Website
- PigTalksPapers - Pig talks, papers, interviews
- PoweredBy - a (partial) list of companies using Pig
- User Documentation
- Developer Documentation
- Related Resources