...
Spark Installation
Follow instructions hereto install latest spark: https://spark.apache.org/docs/latest/spark-standalone.html. In particular:
- Install spark (either download pre-built spark, or build assembly from source). Note that Spark has different distributions for different versions of Hadoop. Keep note of the spark-assembly-*.jar location.
- Start Spark cluster (Master and workers). Keep note of the Spark master URL. This can be found in Spark master WebUI.
...
Issue | Cause | Resolution |
---|---|---|
java.lang.NoSuchMethodError: com.google.common.hash.HashFunction.hashInt(I)Lcom/google/common/hash/HashCode | Guava library version conflict between Spark and Hadoop. See HIVE-7387 and SPARK-2420 for details. | Temporarily remove guava jars from HADOOP_HOME, until JIRA's are resolved. |
org.apache.spark.SparkException: Job aborted due to stage failure: Task 5.0:0 had a not serializable result: java.io.NotSerializableException: org.apache.hadoop.io.BytesWritable | Spark serializer not set to kryo | Set spark.serializer to be org.apache.spark.serializer.KryoSerializer as described above |
...