...

  1. Add the Pipeline: AWS Steps plugin to Jenkins in the CI prod account
  2. Create a bucket on S3 in the CI prod account
  3. Handle permissions
    1. Create a write policy specific for this bucket
      Currently, we only need to write to this one bucket. To keep access minimal, the policy grants only the "PutObject" permission, scoped to this bucket via its ARN.
    2. Attach the policy to the bucket
    3. Attach the policy to the IAM role

      IAM Role : jenkins_slave_role
      Policy Name : s3-ci-prod-upload-unit-test-artifact
      Action: s3:PutObject (Write access level)
      Resource connected (ARN of S3 bucket) : arn:aws:s3:::mxnet-ci-unittest-artifact-repository/*


  4. Set Global property on Jenkins console
    1. Jenkins → Manage Jenkins → Configure System → Global Properties (sets global environment variables that are visible in every job)
      Name : Value
      MXNET_CI_UNITTEST_ARTIFACT_BUCKET: mxnet-ci-unittest-artifact-repository

  5. Call S3 upload with required parameters
    1. Functions:
      1. ci/Jenkinsfile_utils.groovy → collect_test_results_unix()
      2. ci/Jenkinsfile_utils.groovy → collect_test_results_windows()
    2. PR : https://github.com/apache/incubator-mxnet/pull/16336
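
The write-only policy described in step 3 can be sketched as a small Python helper that builds the policy document. This is a minimal illustration, not the policy actually deployed: the bucket name and the "PutObject" action come from the steps above, while the `Sid` label and the helper name are assumptions made for the example.

```python
import json

# Bucket name taken from the steps above.
BUCKET = "mxnet-ci-unittest-artifact-repository"


def build_write_only_policy(bucket: str) -> dict:
    """Return an IAM policy document allowing only s3:PutObject on the bucket's objects."""
    return {
        "Version": "2012-10-17",  # standard IAM policy language version
        "Statement": [
            {
                "Sid": "AllowUnitTestArtifactUpload",  # hypothetical label
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                # The trailing /* scopes the permission to objects inside the bucket.
                "Resource": f"arn:aws:s3:::{bucket}/*",
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_write_only_policy(BUCKET), indent=2))
```

Because the statement lists a single action and a single resource, a role holding only this policy can upload objects to the bucket but cannot list, read, or delete them.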


Design Choices

  • S3 Upload will be triggered for every build, every job, every branch.

Choice : Specific branches vs All branches

Decision : All branches

Instead of uploading unit-test data to S3 only once per PR (on the PR-merge commit), we upload it for every commit on a PR.

Why?
More uploads → larger sample size → a stronger basis for identifying the cause of a unit-test slowdown.
More data also gives finer granularity and hence more clarity.

Instead of limiting uploads to PRs and the master branch, we let them run for all branches and leave the filtering to CloudWatch.

Earlier decision (superseded by the one above): instead of uploading unit-test data to S3 for every single PR, upload it only on PR merge (that is, when the PR is merged and a commit is made to master).

Why? Tracking every commit of every PR was considered infeasible and unstable, whereas PR merges are generally stable and relevant to tracking the health of the master build.

  • S3 Directory structure

We have 2 approaches available:

  1. Job / Branch / Build / File
  2. Branch / Build / Job / File

Decision: group by job.

The aim is to have per-job metrics; per-PR metrics are not relevant. Grouping by job ensures that comparable values lie in the same folder.
Individual PR runs act as sample points, and we ultimately rely on job-level metrics. (This is also useful for the CloudWatch dashboard design.)
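
To illustrate why the ordering of path components matters, here is a minimal Python sketch that builds object keys for both layouts and groups them by top-level folder. The job names, branch names, build numbers, and file name are made up for the example, and the helper names are hypothetical; none of this is taken from the actual Jenkinsfile.

```python
from collections import defaultdict


def job_first_key(job, branch, build, filename):
    """Layout: Job / Branch / Build / File."""
    return f"{job}/{branch}/{build}/{filename}"


def branch_first_key(job, branch, build, filename):
    """Layout: Branch / Build / Job / File."""
    return f"{branch}/{build}/{job}/{filename}"


def group_by_top_level(keys):
    """Group object keys by their first path component (the top-level folder)."""
    groups = defaultdict(list)
    for key in keys:
        groups[key.split("/", 1)[0]].append(key)
    return dict(groups)


# Hypothetical sample runs: (job, branch, build number, file name).
RUNS = [
    ("unix-cpu", "PR-1", 3, "results.xml"),
    ("unix-cpu", "master", 120, "results.xml"),
    ("windows-gpu", "PR-1", 3, "results.xml"),
]

if __name__ == "__main__":
    print(group_by_top_level(job_first_key(*run) for run in RUNS))
    print(group_by_top_level(branch_first_key(*run) for run in RUNS))
```

With the job-first layout, every result for a given job sits under one prefix regardless of branch, so a per-job metric can be computed from a single folder. With the branch-first layout, the same job's results are spread across one folder per branch.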

  • Context file

The context file would store context specific to this PR: user, source branch, target branch, timestamp, etc.
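
A context file along these lines could be produced as follows. This is only a sketch under assumptions: the JSON format, the field names, and the helper names are hypothetical, since the page lists the kinds of data to store but does not define a schema.

```python
import json
from datetime import datetime, timezone


def build_context(user, source_branch, target_branch, timestamp=None):
    """Collect PR-specific context into a plain dict (field names are hypothetical)."""
    if timestamp is None:
        timestamp = datetime.now(timezone.utc)
    return {
        "user": user,
        "source_branch": source_branch,
        "target_branch": target_branch,
        "timestamp": timestamp.isoformat(),
    }


def write_context_file(path, context):
    """Write the context as JSON so it can be uploaded next to the test results."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(context, handle, indent=2, sort_keys=True)


if __name__ == "__main__":
    # Example values are made up for illustration.
    context = build_context("some-user", "feature-branch", "master")
    write_context_file("context.json", context)
```

Storing the timestamp in ISO 8601 form with an explicit UTC offset keeps the file unambiguous across build agents in different time zones.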

...