

How do I configure SLF4J?

To configure SLF4J:

  1. Add a log4j configuration file under the directory of the Java test.

  2. Add the following snippets to your build.gradle file:

    test {
      systemProperty "log4j.configuration", ""
    }
    dependencies {
      // add the SLF4J binding here
      // or shadow
    }
  3. The second dependency (shadow) is not necessary if another library already provides it.
  4. To check which dependencies are included in the dependency tree, execute:

    ./gradlew dependencies
  5. If you encounter an error message like the following:

    SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
    SLF4J: Defaulting to no-operation (NOP) logger implementation

    1. If so, it means no SLF4J binding is on the classpath. Add the missing dependency in the build.gradle file.
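One common way to resolve the NOP-logger warning is to provide an SLF4J binding on the test runtime classpath. A hedged sketch for build.gradle; the coordinates and version below are illustrative, not necessarily the ones Beam pins:

```groovy
dependencies {
  // Illustrative SLF4J-to-log4j binding for tests; the version is
  // an example, not necessarily the one Beam pins.
  testRuntimeOnly "org.slf4j:slf4j-log4j12:1.7.30"
}
```

If another library on the classpath already provides a binding, adding a second one will produce a "multiple bindings" warning instead.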

How to format code automatically and avoid spotless errors?

  1. Set up a git pre-commit hook to always autoformat code: add a script at .git/hooks/pre-commit.

  2. Set its executable bit (chmod u+x .git/hooks/pre-commit).
  3. For more information about git hooks, see the git documentation on hooks.
  4. To skip the hook for a single commit, use git commit --no-verify.
  5. To disable it, use chmod u-x .git/hooks/pre-commit.
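A minimal hook sketch, assuming spotlessApply is the autoformatting task (it is used elsewhere on this page) and that the hook runs from the repository root:

```shell
#!/bin/sh
# .git/hooks/pre-commit - autoformat before every commit (sketch).
set -e
./gradlew spotlessApply
# Re-stage any files the formatter changed.
git add -u
```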

How to run a single test?

  1. Execute:

    ./gradlew :examples:java:test --tests org.apache.beam.examples.subprocess.ExampleEchoPipelineTest --info
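Gradle's --tests filter also accepts wildcard patterns, so you can narrow the run to a single test method; the method name below is a hypothetical placeholder:

```shell
./gradlew :examples:java:test --tests '*ExampleEchoPipelineTest.testSomething' --info
```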

How to run the Java Dataflow Hello World pipeline with a compiled Dataflow Java worker?

You pass multiple definitions of the GCP project name and temp folder because different targets use different property names.

  1. Before running the command, configure your gcloud credentials.
  2. Add GOOGLE_APPLICATION_CREDENTIALS to your environment variables.
  3. Execute:
    ./gradlew :runners:google-cloud-dataflow-java:examples:preCommitLegacyWorker -PdataflowProject=<GcpProjectName> -Pproject=<GcpProjectName> -PgcpProject=<GcpProjectName> -PgcsTempRoot=<Gcs location in format: gs://..., no trailing slash> -PdataflowTempRoot=<Gcs location in format: gs://...>
    ./gradlew :runners:google-cloud-dataflow-java:examples:preCommitFnApiWorker -PdataflowProject=<GcpProjectName> -Pproject=<GcpProjectName> -PgcpProject=<GcpProjectName> -PgcsTempRoot=<Gcs location in format: gs://..., no trailing slash> -PdataflowTempRoot=<Gcs location in format: gs://..., no trailing slash> -PdockerImageRoot=<docker image store location in format>

How to run a user-defined pipeline - Java Direct Runner example?

If you want to run your own pipeline while also changing Beam repository code for dev/testing purposes, here is an example for a simple runner such as the Direct Runner:

  1. Put your pipeline code under the example folder.
  2. Add the following build target to the related build.gradle file:

    task execute(type: JavaExec) {
      main = "org.apache.beam.examples.SideInputWordCount"
      classpath = configurations."directRunnerPreCommit"
    }
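Assuming the execute target was added to the examples build file, the pipeline could then be launched with something like the following; the project path is an assumption based on the single-test command earlier on this page:

```shell
./gradlew :examples:java:execute
```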

There are also alternative choices, with a slight difference:

Option 1

  1. Create a maven project. 

  2. Use the following command to publish changed code to the local repository. 

     ./gradlew -Ppublishing -PnoSigning publishMavenJavaPublicationToMavenLocal
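After publishing, the Maven project can depend on the locally built artifacts; a hedged pom.xml sketch, where the artifact and version are examples and must match what you actually published to the local repository:

```xml
<dependency>
  <groupId>org.apache.beam</groupId>
  <artifactId>beam-sdks-java-core</artifactId>
  <!-- Example version; use the snapshot version you published locally -->
  <version>2.24.0-SNAPSHOT</version>
</dependency>
```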

Option 2

  1. Make use of Integration tests.
  2. Make your user-defined pipeline part of the integration test. 

How to use a snapshot Beam Java SDK version?

To use new Beam features from a snapshot build prior to the next Beam release, you need to:

  1. Add the apache.snapshots repository to your pom.xml. Check this example.
  2. Set beam.version to a snapshot version, e.g. "2.24.0-SNAPSHOT" or later (listed here).
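The repository entry in pom.xml could look like the following sketch; the URL is the standard Apache snapshot repository, but verify it against the linked example:

```xml
<repository>
  <id>apache.snapshots</id>
  <name>Apache Development Snapshot Repository</name>
  <url>https://repository.apache.org/content/repositories/snapshots/</url>
  <releases><enabled>false</enabled></releases>
  <snapshots><enabled>true</enabled></snapshots>
</repository>
```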

Common Errors

Continue on error

Pass the --continue flag to have the compileJava task report all errors it finds instead of stopping at the first one.
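For example, to compile and see every compilation error at once:

```shell
./gradlew compileJava --continue
```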

IntelliJ Proto Intellisense doesn't work.

This can happen when you start IntelliJ or (in my case) after modifying protos.

This is not a solved problem yet. But here are some current approaches:

  1. Clean build from console
  2. Build from IntelliJ
  3. Refresh Gradle Project in IntelliJ
  4. Restart IntelliJ
  5. If the index is still not updated after steps 3 or 4, rebuild the project indexes. For more information, go to Rebuild IntelliJ project indexes.

The following workaround did the trick. Since many things were tried in the process and there is no clear way to reproduce the error, these might not be the correct or minimal steps. Update them if you find a shorter or cleaner way.

  1. Refresh the Gradle project in IntelliJ.

  2. Close IntelliJ.
  3. Clean build the project from the console. Execute:

    ./gradlew clean cleanTest build -x testWebsite -x :rat -x test
  4. Open IntelliJ.

What command should I run locally before creating a pull request?

We recommend running this command to catch common style issues, potential bugs (via code analysis), and Javadoc issues before creating a pull request. Running it takes 5 to 10 minutes.

./gradlew spotlessApply && ./gradlew -PenableCheckerFramework=true checkstyleMain checkstyleTest javadoc spotbugsMain compileJava compileTestJava

If you don't run these checks locally, Jenkins will run them during pre-submit. However, if they fail during pre-submit, you may not see the output of the test failures, so running them first makes your development process a bit smoother and lets you iterate on your PR until it passes pre-submit.

Dependency Upgrades

How to perform a dependency upgrade?

To perform a dependency upgrade we want to ensure that the PR is not introducing any new linkage errors. We do this by combining successful Jenkins test runs with analysis performed using a linkage checker. This allows us to gain confidence that we are minimizing the number of linkage issues that will arise for users. To perform a dependency upgrade:

  1. Find all Gradle subprojects that are impacted by the dependency change.
  2. For each Gradle subproject:
    1. Perform the before and after linkage checker analysis.
    2. Provide the results as part of your PR.
  3. For each Gradle subproject:
    1. Find and run relevant Jenkins test suites.

How to find all Gradle subprojects that are impacted by the dependency change

  1. Execute the command below; it prints a dependency report to a text file for each project:

    ./gradlew dependencyReport
  2. Grep for a specific maven artifact identifier such as guava in all the dependency reports with:

    grep -l "guava" `find ./ -name dependencies.txt`

Linkage checker analysis

This step relies on modifying your local maven repository, typically found in ~/.m2/.

  1. Use the shell script to do this on your behalf (note that it will run the manual command below on your current workspace and also on HEAD):

    /bin/bash sdks/java/build-tools/ origin/master <your branch name> "artifactId1,artifactId2,..."

    If you omit the artifactIds, it uses beam-sdks-java-core beam-sdks-java-io-google-cloud-platform beam-runners-google-cloud-dataflow-java beam-sdks-java-io-hadoop-format; these artifacts often suffer dependency conflicts.

  2. Copy and paste the output to the PR. If it is large, you may want to use a GitHub gist. For example PRs (1, 2, 3, 4, and 5).

  3. Note that you can manually run the linkage checker on your current workspace by invoking:

    ./gradlew -Ppublishing -PjavaLinkageArtifactIds=artifactId1,artifactId2,... :checkJavaLinkage
  4. Example output:

    Class org.brotli.dec.BrotliInputStream is not found;
      referenced by 1 class file (beam-sdks-java-core-2.20.0-SNAPSHOT.jar)
    Class com.github.luben.zstd.ZstdInputStream is not found;
      referenced by 1 class file (beam-sdks-java-core-2.20.0-SNAPSHOT.jar)
    Class com.github.luben.zstd.ZstdOutputStream is not found;
      referenced by 1 class file (beam-sdks-java-core-2.20.0-SNAPSHOT.jar)
    Class is not found;
      referenced by 1 class file (beam-vendor-bytebuddy-1_9_3-0.1.jar)
  5. Delete any installed Apache Beam SNAPSHOT artifacts:

    rm -rf ~/.m2/repository/org/apache/beam

Run relevant Jenkins test suites

You can find all Jenkins job configurations within and can request that the reviewer run the relevant test suites by providing them with a list of the relevant trigger phrases. You can make this request directly on your PR or on the dev mailing list.
