Problem
As development of the Scala package accelerates, its performance is still not measured in CI. This makes it hard to verify that inference and training performance stays stable as more features are introduced. To track such changes, I propose a benchmark design that can be used to monitor the Scala package's training and inference performance.
Goals
A complete benchmark package for measuring the performance of the Scala package, containing:
- A Maven project that consumes the build-from-source jar of the Scala package
  - Once automated publishing of the Scala package is in place, the project can use the published SNAPSHOT instead
- Several scripts that automatically download data and run the benchmark tests
  - Image classification tests use real image data
  - Other examples use synthetic data generated on the fly
- A standard output format for each test result
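As a rough illustration of "synthetic data generated on the fly", the sketch below fills a flat array with seeded random values. The names `SyntheticData` and `randomBatch`, the NCHW shape (batch, channels, height, width), and the value range [0, 1) are assumptions made for illustration; the proposal does not specify them.

```scala
import scala.util.Random

object SyntheticData {
  // Generates one flat batch of random pixel-like values in [0, 1).
  // Assumption: an NCHW layout flattened into a single array.
  // A fixed seed keeps the generated input identical across benchmark runs.
  def randomBatch(batch: Int, channels: Int, height: Int, width: Int, seed: Long): Array[Float] = {
    val rng = new Random(seed)
    Array.fill(batch * channels * height * width)(rng.nextFloat())
  }

  def main(args: Array[String]): Unit = {
    val data = randomBatch(batch = 2, channels = 3, height = 224, width = 224, seed = 42L)
    println(s"generated ${data.length} values")
  }
}
```

Seeding the generator means every run measures the same input, so differences in timing are less likely to come from the data itself.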
Design
Here is a walkthrough of the steps to take, with demo commands to run.
Preparation Step
make scalapkg
cd scala-package/benchmarks
make benchmark-setup
This builds the Scala package as well as the Maven project for the Scala benchmark. All benchmark code is compiled and packaged as jar files.
Run Script Step
make <benchmark_test_name> <number_of_runs> <output_path>
# e.g. make image_classification_test 10 ./im_out.json
This is the general format for benchmark tests: it takes the test name, the number of times to run the test, and the output file path.
The output file should be a JSON file that follows the format shown below:
{"test_name": "im_classification", "time_cost": 1.0825282, "current_used_memory": 32}
{"test_name": "im_classification", "time_cost": 1.2485123, "current_used_memory": 33}
...
With this format, users can easily use the results for any analysis they need.
Note: The fields shown here are just a demo for image classification. Different tests may report different measurement fields.
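A minimal sketch of how the run step might time a test and write records in the format above. It assumes `time_cost` is in seconds and `current_used_memory` is the JVM heap in use, in MB; the proposal does not state units. `BenchmarkSketch`, `Record`, `measure`, and `run` are hypothetical names, and the summation in `main` is a placeholder workload, not a real inference call.

```scala
import java.io.{FileWriter, PrintWriter}
import java.util.Locale

object BenchmarkSketch {
  // One measurement; the JSON field names follow the sample output.
  final case class Record(testName: String, timeCost: Double, usedMemoryMb: Long) {
    def toJson: String =
      "{\"test_name\": \"%s\", \"time_cost\": %.7f, \"current_used_memory\": %d}"
        .formatLocal(Locale.ROOT, testName, timeCost, usedMemoryMb)
  }

  // Times a single execution of `body` and samples heap usage afterwards.
  def measure(testName: String)(body: => Unit): Record = {
    val start = System.nanoTime()
    body
    val seconds = (System.nanoTime() - start) / 1e9 // assumed unit: seconds
    val rt = Runtime.getRuntime
    val usedMb = (rt.totalMemory() - rt.freeMemory()) / (1024L * 1024L) // assumed unit: MB
    Record(testName, seconds, usedMb)
  }

  // Runs the test `runs` times and writes one JSON record per line to `outputPath`.
  def run(testName: String, runs: Int, outputPath: String)(body: => Unit): Seq[Record] = {
    val records = (1 to runs).map(_ => measure(testName)(body))
    val out = new PrintWriter(new FileWriter(outputPath))
    try records.foreach(r => out.println(r.toJson)) finally out.close()
    records
  }

  def main(args: Array[String]): Unit = {
    val records = run("im_classification", 3, "im_out.json") {
      // Placeholder workload standing in for a model call.
      val total = (1 to 1000000).map(_.toLong).sum
      require(total > 0L)
    }
    records.foreach(r => println(r.toJson))
  }
}
```

The first few iterations on a JVM typically include JIT compilation time, so a real harness would probably discard some warm-up runs before recording.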
Clean up Step
This step cleans up by removing all built files and logs.
make benchmark-clean
Tests to add
Inference - CNN Image Classification with single input
Use a pre-built model to classify a single input image.
Inference - CNN Image Classification with batch input
Use the same model as the previous test, with a batch of images as input.
Large-Scale Inference test
Inspired by issues reported by users, run a large-scale batch inference and measure its performance.
CNN Training test
Measure CNN training performance, including an accuracy measurement.
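Since different tests may report different fields, a training record could carry an accuracy value next to the time cost. The sketch below computes top-1 accuracy from predicted and true class indices and formats such a record. The `accuracy` field name, the `cnn_training` test name, and the object and method names are all hypothetical; the proposal does not define the training output.

```scala
import java.util.Locale

object TrainingRecordSketch {
  // Top-1 accuracy: the fraction of predictions that match their labels.
  def accuracy(predicted: Seq[Int], labels: Seq[Int]): Double = {
    require(labels.nonEmpty && predicted.length == labels.length,
      "predicted and labels must be non-empty and of equal length")
    predicted.zip(labels).count { case (p, l) => p == l }.toDouble / labels.length
  }

  // Formats one training result; "accuracy" is an assumed field name.
  def toJson(testName: String, timeCost: Double, acc: Double): String =
    "{\"test_name\": \"%s\", \"time_cost\": %.7f, \"accuracy\": %.4f}"
      .formatLocal(Locale.ROOT, testName, timeCost, acc)

  def main(args: Array[String]): Unit = {
    // Three of the four predictions match their labels.
    val acc = accuracy(Seq(1, 0, 2, 2), Seq(1, 0, 1, 2))
    println(toJson("cnn_training", 12.5, acc))
  }
}
```

Recording accuracy alongside time helps separate a genuine speed-up from a change that only trains faster because it learns less.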