How do I configure SLF4J?

To configure SLF4J:

  1. Add a log4j-test.properties under the directory of the java test.

  2. Add the following snippets to your build.gradle file.
    test {
      systemProperty "log4j.configuration", "log4j-test.properties"
    }
    dependencies {
      shadow library.java.slf4j_api
      shadow library.java.slf4j_log4j12
      // or shadow library.java.slf4j_jdk14
    }
  3. The second dependency, shadow library.java.slf4j_log4j12, is not necessary if another library already provides it.
  4. To check which dependencies are included in the dependency tree, execute:

    ./gradlew dependencies
  5. Check whether you encounter an error message like the following:

    SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
    SLF4J: Defaulting to no-operation (NOP) logger implementation

    1. If so, no SLF4J binding is on the classpath. Add library.java.slf4j_log4j12 or library.java.slf4j_jdk14 to the build.gradle file.

How to format code automatically and avoid spotless errors?

  1. Set up a git pre-commit hook to always autoformat code by adding the following to .git/hooks/pre-commit.

  2. Set the executable bit.
  3. For more information about git hooks, go to: https://git-scm.com/docs/githooks.
  4. To skip it, use --no-verify.
  5. To disable it, clear the executable bit with chmod u-x.
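
The hook body itself is not preserved on this page. As a minimal sketch, assuming the hook only needs to run the spotlessApply task, the first two steps could be scripted as follows. The helper name install_spotless_hook and the hook contents are assumptions, not the project's official hook.

```shell
# Sketch: install a pre-commit hook that runs Spotless before each commit.
# install_spotless_hook is a hypothetical helper name; the hook body is an
# assumption (it simply runs spotlessApply from the repository root).
#   $1: path to the root of the git working tree
install_spotless_hook() {
  hook="$1/.git/hooks/pre-commit"
  mkdir -p "$1/.git/hooks"
  cat > "$hook" <<'EOF'
#!/bin/sh
# Abort the commit if formatting fails.
set -e
./gradlew spotlessApply
EOF
  # Set the executable bit so git runs the hook.
  chmod u+x "$hook"
}
```

Calling install_spotless_hook . from the beam root writes the hook and sets the executable bit. Git runs hooks from the root of the working tree, so the relative ./gradlew path resolves there.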

How to run a single test?

  • Example command (run from beam root):

    ./gradlew :examples:java:test --tests org.apache.beam.examples.subprocess.ExampleEchoPipelineTest --info
  • To break that line down a bit:

    • ./gradlew
      • the Gradle wrapper that runs your code. It lives in the beam root, so wherever you run your command from, this path needs to point there.
    • :examples:java:test 
      • Everything before the last colon is the path from the project root to the root of the subproject the test is in (this directory will contain a build.gradle file)
      • The last word after the colon will always be test because it isn't a directory name, but the name of the Gradle task you're asking the wrapper to perform
    • --tests 
      • this is the option that lets you declare which specific test(s) (or test suite(s)) to run, typically using their path(s) from the src/test/java folder of the subproject
    • --info (optional)
      • sets the log level to info
  • For more information see the documentation below on:
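
The breakdown of the single-test command can be sketched as a small shell helper that assembles it from the subproject path and the test name. The name single_test_cmd is hypothetical and not part of Beam or Gradle.

```shell
# Sketch: build the single-test command from its parts.
# single_test_cmd is a hypothetical helper, not part of Beam or Gradle.
#   $1: Gradle path of the subproject, e.g. :examples:java
#   $2: fully qualified test class (or another value accepted by --tests)
single_test_cmd() {
  # The task name "test" is appended after the last colon of the path.
  printf './gradlew %s:test --tests %s --info\n' "$1" "$2"
}
```

Calling single_test_cmd :examples:java org.apache.beam.examples.subprocess.ExampleEchoPipelineTest prints the example command shown above.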

How to run Java Dataflow Hello World pipeline with compiled Dataflow Java worker.

The commands below define the GCP project name and temp folder under several property names. The duplicates are present because different targets use different names.

  1. Before running the command, configure your gcloud credentials.
  2. Add GOOGLE_APPLICATION_CREDENTIALS to your env variables.
  3. Execute:
./gradlew :runners:google-cloud-dataflow-java:examples:preCommitLegacyWorker -PdataflowProject=<GcpProjectName> -Pproject=<GcpProjectName> -PgcpProject=<GcpProjectName> -PgcsTempRoot=<Gcs location in format: gs://..., no trailing slash> -PdataflowTempRoot=<Gcs location in format: gs://...>
./gradlew :runners:google-cloud-dataflow-java:examples:preCommitFnApiWorker -PdataflowProject=<GcpProjectName> -Pproject=<GcpProjectName> -PgcpProject=<GcpProjectName>  -PgcsTempRoot=<Gcs location in format: gs://..., no trailing slash> -PdataflowTempRoot=<Gcs location in format: gs://..., no trailing slash> -PdockerImageRoot=<docker image store location in format gcr.io/...>
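
Because the same two values are repeated under several property names, it may help to define them once. The following is a minimal sketch under that assumption; dataflow_flags is a hypothetical helper, and it covers only the project and temp-root properties, not -PdockerImageRoot.

```shell
# Sketch: expand one project name and one temp root into every property
# name used by the commands above. dataflow_flags is a hypothetical helper.
#   $1: GCP project name
#   $2: GCS temp location (gs://...); one trailing slash, if present, is removed
dataflow_flags() {
  project="$1"
  temp_root="${2%/}"
  # printf repeats the '%s ' format once per argument.
  printf '%s ' \
    "-PdataflowProject=$project" \
    "-Pproject=$project" \
    "-PgcpProject=$project" \
    "-PgcsTempRoot=$temp_root"
  printf '%s\n' "-PdataflowTempRoot=$temp_root"
}
```

The output can be passed to either target through an unquoted command substitution, for example ./gradlew :runners:google-cloud-dataflow-java:examples:preCommitLegacyWorker $(dataflow_flags <GcpProjectName> gs://...).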

How to run a User Defined Pipeline - Java Direct Runner example 

If you want to run your own pipeline while also changing Beam repo code for dev/testing purposes, here is an example for a simple runner such as the Direct Runner:

  1. Put your pipeline code under the example folder.
  2. Add the following build target to the related build.gradle:

    task execute(type:JavaExec) {
      main = "org.apache.beam.examples.SideInputWordCount"
      classpath = configurations."directRunnerPreCommit"
    }

There are also alternative choices, with a slight difference:

Option 1

  1. Create a maven project. 

  2. Use the following command to publish changed code to the local repository. 

     ./gradlew -Ppublishing -PnoSigning publishMavenJavaPublicationToMavenLocal

Option 2

  1. Make use of Integration tests.
  2. Make your user-defined pipeline part of the integration test. 

How to use a snapshot Beam Java SDK version?

To use new Beam features from a snapshot before the next Beam release, you need to:

  1. Add the apache.snapshots repository to your pom.xml. Check this example.
  2. Set beam.version to a snapshot version, e.g. "2.24.0-SNAPSHOT" or later (listed here).

Common Errors

Continue on error

Using the --continue flag makes the compileJava task report all errors found instead of stopping at the first one.

IntelliJ Proto Intellisense doesn't work.

This can happen when you start IntelliJ or (in my case) after modifying protos.

This is not a solved problem yet. But here are some current approaches:

  1. Clean build from console
  2. Build from IntelliJ
  3. Refresh Gradle Project in IntelliJ
  4. Restart IntelliJ
  5. If the index is not updated after steps 3 or 4, another option is to rebuild the project indexes. For more information, go to Rebuild IntelliJ project indexes.

The following workaround did the trick. Since many things were tried in the process and there is no clear way to reproduce the error, this might not be the correct or best sequence. Update the steps if you find a shorter or cleaner way.

  1. Refresh the Gradle project in IntelliJ.

  2. Close IntelliJ.
  3. Clean build the project from the console. Execute:

    ./gradlew clean cleanTest build -x testWebsite -x :rat -x test
  4. Open IntelliJ.

What command should I run locally before creating a pull request?

We recommend running this command, in order to catch common style issues, potential bugs (using code analysis), and Javadoc issues before creating a pull request. Running this takes 5 to 10 minutes.

./gradlew spotlessApply && ./gradlew -PenableCheckerFramework=true checkstyleMain checkstyleTest javadoc spotbugsMain compileJava compileTestJava

If you don't run this locally, Jenkins will run these checks during pre-submit. However, if they fail during pre-submit, you may not see the output of test failures. Running the command locally first is recommended: it makes your development process smoother and lets you iterate on your PR until it passes the pre-submit.

Dependency Upgrades

How to perform a dependency upgrade?

To perform a dependency upgrade we want to ensure that the PR is not introducing any new linkage errors. We do this by combining successful Jenkins test runs with analysis performed using a linkage checker. This allows us to gain confidence that we are minimizing the number of linkage issues that will arise for users. To perform a dependency upgrade:

  1. Find all Gradle subprojects that are impacted by the dependency change.
  2. For each Gradle subproject:
    1. Perform the before and after linkage checker analysis.
    2. Provide the results as part of your PR.
  3. For each Gradle subproject:
    1. Find and run relevant Jenkins test suites.

How to find all Gradle subprojects that are impacted by the dependency change

  1. Execute the command below to print a dependency report to a text file for each project:

    ./gradlew dependencyReport
  2. Grep for a specific maven artifact identifier such as guava in all the dependency reports with:

    grep -l "guava" `find ./ -name dependencies.txt`
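
As a variation on the grep step, the search can be wrapped in a small function that lets find pass the report files directly to grep, which also handles paths containing spaces. This is a sketch; reports_mentioning is a hypothetical helper name.

```shell
# Sketch: list the dependency reports that mention a given artifact.
# reports_mentioning is a hypothetical helper name.
#   $1: artifact identifier to search for, e.g. guava
#   $2: directory to search (defaults to the current directory)
reports_mentioning() {
  # grep -l prints only the names of files that contain a match.
  find "${2:-.}" -name dependencies.txt -exec grep -l -- "$1" {} +
}
```

Running reports_mentioning guava from the beam root after ./gradlew dependencyReport prints one report path per impacted subproject.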

Linkage checker analysis

This step relies on modifying your local maven repository, typically found in ~/.m2/.

  1. Use the shell script to do this on your behalf (note that it will run the manual command below on your current workspace and also on HEAD):

    /bin/bash sdks/java/build-tools/beam-linkage-check.sh origin/master <your branch name> "artifactId1,artifactId2,..."

    If you omit the artifactIds, it uses beam-sdks-java-core beam-sdks-java-io-google-cloud-platform beam-runners-google-cloud-dataflow-java beam-sdks-java-io-hadoop-format; these artifacts often suffer dependency conflicts.

  2. Copy and paste the output into the PR. If it is large, you may want to use a GitHub gist. For examples, see PRs (1, 2, 3, 4, and 5).

  3. Note that you can manually run the linkage checker on your current workspace by invoking:

    ./gradlew -Ppublishing -PjavaLinkageArtifactIds=artifactId1,artifactId2,... :checkJavaLinkage
  4. Example output:

    Class org.brotli.dec.BrotliInputStream is not found;
      referenced by 1 class file
        org.apache.beam.repackaged.core.org.apache.commons.compress.compressors.brotli.BrotliCompressorInputStream (beam-sdks-java-core-2.20.0-SNAPSHOT.jar)
    Class com.github.luben.zstd.ZstdInputStream is not found;
      referenced by 1 class file
        org.apache.beam.repackaged.core.org.apache.commons.compress.compressors.zstandard.ZstdCompressorInputStream (beam-sdks-java-core-2.20.0-SNAPSHOT.jar)
    Class com.github.luben.zstd.ZstdOutputStream is not found;
      referenced by 1 class file
        org.apache.beam.repackaged.core.org.apache.commons.compress.compressors.zstandard.ZstdCompressorOutputStream (beam-sdks-java-core-2.20.0-SNAPSHOT.jar)
    Class org.apache.beam.vendor.bytebuddy.v1_9_3.net.bytebuddy.jar.asm.commons.ModuleHashesAttribute is not found;
      referenced by 1 class file
        org.apache.beam.vendor.bytebuddy.v1_9_3.net.bytebuddy.jar.asm.commons.ClassRemapper (beam-vendor-bytebuddy-1_9_3-0.1.jar)
  5. Delete any installed Apache Beam SNAPSHOT artifacts:

    rm -rf ~/.m2/repository/org/apache/beam
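
Both the script and the manual invocation take the artifact IDs as a single comma-separated value. A minimal sketch of building that value from separate arguments follows; join_artifact_ids is a hypothetical helper name.

```shell
# Sketch: join artifact IDs into the comma-separated form used by
# -PjavaLinkageArtifactIds. join_artifact_ids is a hypothetical helper.
join_artifact_ids() {
  # "$*" joins the arguments with the first character of IFS;
  # the subshell keeps the IFS change from leaking to the caller.
  ( IFS=,; printf '%s\n' "$*" )
}
```

For example, ./gradlew -Ppublishing -PjavaLinkageArtifactIds="$(join_artifact_ids beam-sdks-java-core beam-sdks-java-io-google-cloud-platform)" :checkJavaLinkage checks those two artifacts.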

Run relevant Jenkins test suites

You can find all Jenkins job configurations within https://github.com/apache/beam/tree/master/.test-infra/jenkins and request that the reviewer run the relevant test suites by providing them with a list of all the relevant trigger phrases. You can perform this request directly on your PR or on the dev mailing list, for example.

Google Cloud-related dependency upgrades

To provide consistent dependencies to Beam users, follow these steps when upgrading Google Cloud-related dependencies:

  1. Set the Libraries BOM version.
    1. Find the latest release in https://github.com/googleapis/java-cloud-bom/releases and set libraries-bom value in BeamModulePlugin.groovy
  2. Find core Google Java library versions.
    1. Look up the versions of libraries such as gRPC, Protobuf, Guava, and Google Auth Library in the release notes of the Libraries BOM and set them in BeamModulePlugin.groovy
  3. Find the appropriate Netty version by checking the dependency declaration of io.grpc:grpc-netty. For example, you can tell that gRPC version 1.49.0 was built with Netty "4.1.77.Final" by reading https://search.maven.org/artifact/io.grpc/grpc-netty/1.49.0/jar
    Update netty_version in BeamModulePlugin.groovy
  4. Find the netty-tcnative version via the netty-parent artifact. For example, you can tell that Netty 4.1.77.Final was built with netty-tcnative "2.0.52.Final" by reading https://search.maven.org/artifact/io.netty/netty-parent/4.1.77.Final/jar
    Update netty_tcnative_boringssl_static version in BeamModulePlugin.groovy
