- Nokia | Ovi
- We use Pig for exploring unstructured datasets coming from logs, database dumps, data feeds, etc.
- Several data pipelines that build product datasets and feed further analysis use Pig, tied to other jobs with Oozie.
- We have multiple Hadoop clusters, some for R&D and some for production jobs.
- In R&D we run on commodity hardware: 8 cores, 16 GB RAM, and 4x 1 TB disks per data node.
- We use Pig to analyze transaction data in order to prevent fraud.
- We are the main contributors to the Pig-Eclipse project.
- Realweb - an Internet advertising company based in Russia.
- We use Pig on Hadoop to compute statistics on banner views, clicks, post-click user behavior on target websites, etc.
- We chose the Cloudera Hadoop (http://www.cloudera.com/hadoop/) packages on Ubuntu Server 10.04. Each machine has 2 or 4 cores, 4 GB of RAM, and 1 TB of storage.
- All jobs are written in Pig Latin, and only a few user-defined functions were needed to meet our needs.