
This page tracks the users of Spark. To add yourself to the list, please email user@spark.apache.org with your organization name, URL, a list of which Spark components you are using, and a short description of your use case.

Companies & Organizations

  • UC Berkeley AMPLab - Big data research lab that initially launched Spark
    • We're building a variety of open source projects on Spark, including Shark, MLbase, and Spark Streaming, and developing new distributed systems techniques that improve the engine
    • We have both graduate students and a team of professional software engineers working on the stack
  • 4Quant
  • Act Now
    • Spark powers NOW APPS, a big data, real-time, predictive analytics platform. We use Spark SQL, MLlib and GraphX components for both batch ETL and analytics applied to telecommunication data, providing faster and more meaningful insights and actionable data to the operators.
  • Adatao, Inc. - Data Intelligence for All
    • Visual, Real-Time, Predictive Analytics on Spark+Hadoop, with built-in support for R, Python, SQL, and Natural Language.
    • Team of ex-Googlers & Yahoos with large-scale infrastructure experience (including both flavors of MapReduce at Google & Yahoo) & PhDs in ML/Data Mining
    • Determined that Spark, among the many alternatives, answered the right problem statements with the right design
  • Agile Lab
    • Enhancing big data: 360-degree customer view, log analysis, BI
  • Alibaba Taobao
  • Alpine Data Labs
  • Amazon
  • Amrita Center for Cyber Security Systems and Networks
  • Art.com
    • Trending analytics and personalization
  • AsiaInfo
    • We are using Spark Core, Streaming, MLlib and GraphX. We leverage Spark and the Hadoop ecosystem to build cost-effective data center solutions for our customers in the telecom industry as well as other industrial sectors.
  • Atigeo - integrated Spark into xPatterns, our big data analytics platform, as a replacement for Hadoop MR
  • Autodesk
  • Baidu
  • Bakdata – using Spark (and Shark) to perform interactive exploration of large datasets
  • Big Industries - using Spark Streaming: The Big Content Platform is a business-to-business content asset management service providing a searchable, aggregated source of live news feeds, public domain media and archives of content.
  • Bizo
  • Celtra
  • ClearStory Data – ClearStory's platform and integrated Data Intelligence application leverages Spark to speed analysis across internal and external data sources, driving holistic and actionable insights.
  • Concur
    • Spark SQL, MLlib
    • Using Spark for travel and expenses analytics and personalization
  • Conviva – Experience Live
  • Credit Karma
    • We create personalized experiences using Spark. 
  • Databricks
    • Formed by the creators of Apache Spark and Shark, Databricks is working to greatly expand these open source projects and transform big data analysis in the process. We're deeply committed to keeping all work on these systems open source.
    • Providing support for Apache Spark in partnership with Cloudera, MapR, and others, as well as the Databricks Cloud hosted service.
  • Dianping.com
  • Digby
  • Drawbridge
  • eBay Inc.
    • Using Spark Core for log transaction aggregation and analytics
  • EURECOM
  • Exabeam
  • Faimdata
    • Building eCommerce and data intelligence solutions for the retail industry on top of Spark/Shark/Spark Streaming
  • Falkonry
  • Flytxt
    • Big Data analytics for subscriber profiling and personalization in the telecommunications domain. We are using Spark Core and MLlib.
  • Freeman Lab at HHMI
    • We are using Spark for analyzing and visualizing patterns in large-scale recordings of brain activity in real time
  • Fundacion CTIC
  • GraphFlow, Inc.
  • Groupon
  • Guavus
    • Stream processing of network machine data
  • Hitachi Solutions
  • The Hive
  • IBM Almaden
  • InfoObjects
    • Award winning Big Data consulting company with focus on Spark and Hadoop
  • Inspur
  • Istanbul Sehir University
  • Kenshoo
    • Digital marketing solutions and predictive media optimization
  • Kelkoo
    • Using Spark Core, SQL, and Streaming. Product recommendations, BI & analytics, real-time malicious activity filtering, and data mining.
  • Knoldus Software LLC
  • Magine TV
  • MediaCrossing – Digital Media Trading Experts in the New York and Boston areas
    • We are using Spark as a drop-in replacement for Hadoop Map/Reduce to get the right answer to our queries in a much shorter amount of time.
  • MyFitnessPal
    • Using Spark to clean up user-entered food data using both explicit and implicit user signals, with the final goal of identifying high-quality food items.
    • Using Spark to build different recommendation systems for recipes & foods.
  • NASA JPL - Deep Space Network 
  • Netease
  • NFLabs
  • Nokia Solutions and Networks
  • NTT DATA
  • Nube Technologies
    • Nube provides solutions for data curation at scale, helping with customer targeting, accurate inventory, and efficient analysis.
  • Ooyala, Inc. – Powering personalized video experiences across all screens
  • Opentable
    • Using Apache Spark for log processing and ETL. The data obtained feeds the recommender system powered by Spark MLlib matrix factorization. We are evaluating the use of Spark Streaming for real-time analytics.
  • Peerialism
  • PlanBMedia
  • PredictionIO
    • PredictionIO currently offers two engine templates for Apache Spark MLlib: one for recommendation (MLlib ALS) and one for classification (MLlib Naive Bayes). With these templates, you can efficiently create a custom predictive engine for production deployment.
  • Premise
  • Quantifind
  • Radius Intelligence
    • Using Scala, Spark and MLlib for the Radius marketing and sales intelligence platform, including data aggregation, data processing, data clustering, data analysis and predictive modeling of all US businesses.
  • Real Impact Analytics
    • Building large scale analytics platforms for telecoms operators
  • RocketFuel
  • RONDHUIT
  • Sailthru
    • Uses Spark to build predictive models and recommendation systems for marketing automation and personalization.
  • Samsung Research America
  • Shazam Entertainment
    • Using Spark Core and Spark Streaming. We use Spark on EC2 as a replacement for EMR workflows.
  • Shopify
  • Simba Technologies
    • BI/reporting/ETL for Spark and beyond.
  • Sinnia
  • SK Telecom
    • SK Telecom analyses the mobile usage patterns of its customers with Spark and Shark.
  • Socialmetrix
  • Sohu
  • Stratio
    • Offers an open-source Big Data platform centered around Apache Spark.
  • Taboola – Powering "Content You May Like" around the web
  • Techbase
  • Tencent
  • Tetra Concepts
  • TrendMicro
  • truedash
    • Automatically pulls all your data into Spark for low-cost enterprise visualisation, predictive analytics and data exploration.
  • TruEffect Inc
  • Tuplejump
  • UC Santa Cruz
  • University of Missouri Data Analytics & Discover Lab
  • VideoAmp
    • Intelligent video ads for online and television viewing audiences.     
  • Vistar Media
    • Location technology company enabling brands to reach on-the-go consumers

  • Yahoo!
  • Yandex
    • Using Spark in Yandex Islands, to process islands identified from a search robot

Software Projects

See Supplemental Spark Projects.
