Problem

As development of the Scala package accelerates, its performance is still not measured in CI. This makes it hard to verify that inference and training performance do not regress as more features are introduced. To track such changes, I propose a benchmark design that can be used to monitor the Scala package's performance on training and inference.

Goals

A complete package that measures the performance of the Scala package, containing:

  1. A Maven project that consumes the Scala package jar built from source
    1. Once automated publishing of the Scala package is in place, the project can instead use the published SNAPSHOT artifact.
  2. Several scripts that automatically download data and run the benchmark tests
    1. Image classification tests will use real image data
    2. Other examples will use synthetic data generated on the fly
  3. A standard output format for each test result
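To make the "synthetic data generated on the fly" item more concrete, here is a minimal Scala sketch. The object name `SyntheticData` and the method `batch` are hypothetical and not part of the Scala package. The sketch assumes a flat `Array[Float]` of uniform random values is an acceptable stand-in for real input, and that a fixed seed is wanted so that repeated runs see the same data.

```scala
import scala.util.Random

// Hypothetical helper, not part of the Scala package.
object SyntheticData {
  // Returns shape.product uniform random floats in [0, 1).
  // The same seed always yields the same values, which keeps runs comparable.
  def batch(shape: Seq[Int], seed: Long): Array[Float] = {
    val rng = new Random(seed)
    Array.fill(shape.product)(rng.nextFloat())
  }
}
```

For example, `SyntheticData.batch(Seq(1, 3, 224, 224), seed = 42L)` would produce the values for one image-sized input. How such an array is handed to the model is left open here.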

Design


Here is a walkthrough of the steps to be taken, with demo commands to run.

Preparation Step

make scalapkg
cd scala-package/benchmarks
make benchmark-setup

This should build the Scala package as well as the Maven project for the Scala benchmark. All benchmark code should be compiled and packaged as jar files.

Run Script Step

make <benchmark_test_name> <number_of_runs> <output_path>
# e.g. make image_classification_test 10 ./im_out.json

This is the general format for benchmark tests. It takes the number of times to run the test and the output file path.
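As a rough illustration of what a test behind this command might do, here is a minimal Scala sketch of a timing loop. The names `BenchmarkSketch`, `Result`, `runBenchmark`, and `writeResults` are hypothetical. The sketch also assumes that `time_cost` is wall-clock seconds and that `current_used_memory` is JVM heap usage in megabytes; the design does not pin down either unit.

```scala
import java.io.PrintWriter

// Hypothetical harness, not part of the Scala package.
object BenchmarkSketch {
  // One measurement; JSON field names follow the sample output in this document.
  final case class Result(testName: String, timeCost: Double, usedMemory: Long) {
    def toJsonLine: String =
      s"""{"test_name": "$testName", "time_cost": $timeCost, "current_used_memory": $usedMemory}"""
  }

  // Runs `body` numberOfRuns times and records one Result per run.
  def runBenchmark(testName: String, numberOfRuns: Int)(body: => Unit): Seq[Result] = {
    val runtime = Runtime.getRuntime
    (1 to numberOfRuns).map { _ =>
      val start = System.nanoTime()
      body
      // Assumption: time_cost is reported in seconds.
      val elapsedSeconds = (System.nanoTime() - start) / 1e9
      // Assumption: memory is JVM heap in use, in megabytes.
      val usedMb = (runtime.totalMemory() - runtime.freeMemory()) / (1024L * 1024L)
      Result(testName, elapsedSeconds, usedMb)
    }
  }

  // Writes one JSON object per line to outputPath.
  def writeResults(results: Seq[Result], outputPath: String): Unit = {
    val writer = new PrintWriter(outputPath, "UTF-8")
    try results.foreach(r => writer.println(r.toJsonLine))
    finally writer.close()
  }
}
```

Note that JVM heap usage does not include memory allocated by native code, so a real test may need a different memory measurement.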

The output file should be a JSON file containing one object per run, in the format shown below:

im_out.json
{"test_name": "im_classification", "time_cost": 1.0825282, "current_used_memory": 32}
{"test_name": "im_classification", "time_cost": 1.2485123, "current_used_memory": 33}
...

With this format, users can easily use the results for any analysis they need.

Note: The fields shown here are just a demo for image classification. Different tests may report different measurement fields.
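As one example of such analysis, the sketch below averages the `time_cost` values from an output file. The object `ResultSummary` and its methods are hypothetical. The sketch assumes one JSON object per line, as in the sample above, and uses a regular expression instead of a JSON library only to stay dependency-free.

```scala
// Hypothetical analysis helper, not part of the Scala package.
object ResultSummary {
  // Matches the numeric value of the "time_cost" field in one output line.
  private val TimeCost = """time_cost"\s*:\s*([-+0-9.eE]+)""".r

  // Extracts time_cost from every line that has one; other lines are skipped.
  def timeCosts(lines: Seq[String]): Seq[Double] =
    lines.flatMap(line => TimeCost.findFirstMatchIn(line).map(_.group(1).toDouble))

  // Arithmetic mean of the collected values.
  def mean(values: Seq[Double]): Double = {
    require(values.nonEmpty, "no time_cost values found")
    values.sum / values.size
  }
}
```

The lines could be read with `scala.io.Source.fromFile("./im_out.json").getLines().toSeq` and passed to `ResultSummary.timeCosts`. A real analysis script would more likely use a proper JSON parser.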

Clean up Step

This step cleans up by removing all built files and logs.

make benchmark-clean

Tests to add

Inference - CNN Image Classification with single input

Use a pre-built model to classify a single input image.

Inference - CNN Image Classification with batch input

Use the same model as in the previous test, with a batch of images as input.

Large-Scale Inference test

Inspired by issues reported by users, run large-scale batch inference and measure its performance.

CNN Training test

Test CNN training performance, including an accuracy measurement.

Progress


Open Questions


