Please follow the style of the existing codebase.

For Python code, Apache Spark follows PEP 8 with one exception: lines can be up to 100 characters in length, not 79.

For Java code, Apache Spark follows Oracle's Java code conventions. Many of the Scala guidelines below also apply to Java.

For Scala code, Apache Spark follows the official Scala style guide, but with the following changes:


Line Length

Limit lines to 100 characters. The only exceptions are import statements (although even for those, try to keep them under 100 chars).
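
For example, a long method chain can be broken at each call rather than left on one over-length line (the variable names here are only illustrative, and `spark` is assumed to be a SparkSession):

```scala
// Wrong: the whole chain on a single line well over 100 characters.
// val result = spark.read.parquet(inputPath).filter(col("status") === "active").groupBy("userId").count()

// Correct: break the chain so each line stays under 100 characters.
val result = spark.read.parquet(inputPath)
  .filter(col("status") === "active")
  .groupBy("userId")
  .count()
```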

Indentation

Use 2-space indentation in general. For function declarations, use 4-space indentation for the parameters when they don't fit on a single line. For example:


// Correct:
if (true) {
  println("Wow!")
}

// Wrong:
if (true) {
    println("Wow!")
}

// Correct:
def newAPIHadoopFile[K, V, F <: NewInputFormat[K, V]](
    path: String,
    fClass: Class[F],
    kClass: Class[K],
    vClass: Class[V],
    conf: Configuration = hadoopConfiguration): RDD[(K, V)] = {
  // function body
}

// Wrong:
def newAPIHadoopFile[K, V, F <: NewInputFormat[K, V]](
  path: String,
  fClass: Class[F],
  kClass: Class[K],
  vClass: Class[V],
  conf: Configuration = hadoopConfiguration): RDD[(K, V)] = {
  // function body
}

Code documentation style

For Scaladoc/Javadoc comments before classes, objects, and methods, use the Javadoc style instead of the Scaladoc style.

/** This is a correct one-liner, short description. */
 
/**
 * This is correct multi-line JavaDoc comment. And
 * this is my second line, and if I keep typing, this would be
 * my third line.
 */
 
/** In Spark, we don't use the ScalaDoc style so this
  * is not correct.
  */

 

For inline comments within the code, use // and not /* .. */.

// This is a short, single line comment
 
// This is a multi line comment.
// Bla bla bla
 
/*
 * Do not use this style for multi line comments. This
 * style of comment interferes with commenting out
 * blocks of code, and also makes code comments harder
 * to distinguish from Scala doc / Java doc comments.
 */
 
/**
 * Do not use scala doc style for inline comments.
 */

 

Imports

Always import packages using absolute paths (e.g. scala.util.Random) instead of relative ones (e.g. util.Random).

In addition, sort imports in the following order (use alphabetical order within each group):

  • java.* and javax.*
  • scala.*
  • Third-party libraries (org.*, com.*, etc.)
  • Project classes (org.apache.spark.*)
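
A concrete sketch of this ordering (the specific third-party classes here are only illustrative):

```scala
import java.io.File
import java.util.concurrent.ConcurrentHashMap
import javax.annotation.Nullable

import scala.collection.mutable
import scala.util.Random

import com.google.common.io.Files
import org.slf4j.LoggerFactory

import org.apache.spark.SparkConf
import org.apache.spark.rdd.RDD
```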

You can also use the IntelliJ import organizer plugin to organize imports for you. Use this configuration for the plugin (configured under Preferences / Editor / Code Style / Scala Imports Organizer):

import java.*
import javax.*

import scala.*

import *

import org.apache.spark.*

 

Infix Methods

Don't use infix notation for methods that aren't operators. For example, instead of list map func, use list.map(func); instead of string contains "foo", use string.contains("foo"). This improves familiarity for developers coming from other languages.
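
The rule above, sketched as a small self-contained example (the object and method names are only for illustration):

```scala
object InfixExample {
  // Correct: dot notation with parentheses for ordinary (alphanumeric) methods.
  def doubleAll(xs: Seq[Int]): Seq[Int] = xs.map(_ * 2)   // not: xs map (_ * 2)

  def hasFoo(s: String): Boolean = s.contains("foo")      // not: s contains "foo"

  // Infix notation remains fine for symbolic operators.
  def plus(a: Int, b: Int): Int = a + b
}
```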

Curly Braces

Put curly braces even around one-line if, else, or loop statements. The only exception is when you use if/else as a one-line ternary operator.


// Correct:
if (true) {
  println("Wow!")
}

// Correct:
if (true) statement1 else statement2

// Wrong:
if (true)
  println("Wow!")
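
The same rule applies to loop bodies, for example:

```scala
// Correct:
for (i <- 0 until 3) {
  println(i)
}

// Wrong:
for (i <- 0 until 3)
  println(i)
```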

Return Types

Always specify the return types of methods where possible. If a method has no return value, specify Unit in accordance with the Scala style guide. Explicit types for variables are not required unless the definition involves large code blocks with potentially ambiguous return values.


// Correct:
def getSize(partitionId: String): Long = { ... }
def compute(partitionId: String): Unit = { ... }
 
// Wrong:
def getSize(partitionId: String) = { ... }
def compute(partitionId: String) = { ... }
def compute(partitionId: String) { ... }
 
// Correct:
val name = "black-sheep"
val path: Option[String] =
  try {
    Option(names)
      .map { ns => ns.split(",") }
      .flatMap { ns => ns.filter(_.nonEmpty).headOption }
      .map { n => "prefix" + n + "suffix" }
      .flatMap { n => if (n.hashCode % 3 == 0) Some(n + n) else None }
  } catch {
    case e: SomeSpecialException =>
      computePath(names)
  }

If in Doubt

If you're not sure about the right style for something, try to follow the style of the existing codebase. Look at whether there are other examples in the code that use your feature. Feel free to ask on the dev mailing list as well.

Moved permanently to http://spark.apache.org/contributing.html