...

    List<Context> context = new ArrayList<>();
    context.add(Context.cpu());
    String modelPathPrefix = "my_model";

    // Load the model along with the transformations (load_transforms = true).
    Predictor predictor = new Predictor(modelPathPrefix, context, true);

    // Inference
    List<NDArray> result = predictor.predictWithNDArray(inputNDArray);

Performance Consideration
During inference, initial benchmarks show a noticeable performance gain with end-to-end models, i.e., models in which the data transformations and the neural network are fused into a single model graph.
- Model - ResNet-18 pre-trained on ImageNet. https://s3.us-east-2.amazonaws.com/mxnet-public/end_to_end_models
- Pre-processing - Resize(224, 224), ToTensor, Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225))
- Each reported figure is the average of 500 runs.
- Single Request Inference - Input Data - Synthetic (random.uniform(0, 255, shape=(1, 300, 300, 3)))
- Batch Inference - Input Data - Synthetic (random.uniform(0, 255, shape=(25, 300, 300, 3)))
- The times below are the average prediction time per sample.
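
To make the pre-processing step in the list above concrete, here is a minimal sketch of the per-pixel arithmetic that ToTensor followed by Normalize usually implies: scale each value from [0, 255] to [0, 1], move from HWC to CHW layout, then subtract the channel mean and divide by the channel standard deviation. The class and method names are hypothetical, Resize is left out for brevity, and the sketch assumes the benchmark used these usual semantics; it is not the fused operator from this proposal.

```java
class PreprocessSketch {
    // Channel statistics quoted in the benchmark setup.
    static final float[] MEAN = {0.485f, 0.456f, 0.406f};
    static final float[] STD = {0.229f, 0.224f, 0.225f};

    /**
     * Hypothetical helper: ToTensor followed by Normalize for one image.
     * Input:  hwc[h][w][c] with values in [0, 255] (height, width, channel).
     * Output: chw[c][h][w] = (hwc[h][w][c] / 255 - MEAN[c]) / STD[c].
     */
    static float[][][] toTensorAndNormalize(float[][][] hwc) {
        int height = hwc.length;
        int width = hwc[0].length;
        int channels = hwc[0][0].length;
        float[][][] chw = new float[channels][height][width];
        for (int h = 0; h < height; h++) {
            for (int w = 0; w < width; w++) {
                for (int c = 0; c < channels; c++) {
                    float scaled = hwc[h][w][c] / 255f;          // ToTensor: [0, 255] -> [0, 1]
                    chw[c][h][w] = (scaled - MEAN[c]) / STD[c];  // Normalize per channel
                }
            }
        }
        return chw;
    }

    public static void main(String[] args) {
        // Assumed toy input: a 2x2 RGB image filled with mid-gray (127.5).
        float[][][] image = new float[2][2][3];
        for (float[][] row : image) {
            for (float[] pixel : row) {
                java.util.Arrays.fill(pixel, 127.5f);
            }
        }
        float[][][] tensor = toTensorAndNormalize(image);
        System.out.println("channels=" + tensor.length + ", red[0][0]=" + tensor[0][0][0]);
    }
}
```

In a non end-to-end setup this arithmetic runs in application code before the tensor is handed to the predictor. In an end-to-end model the same steps are part of the model graph.
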
Hardware | Workload | API | Non End to End Models (ms) | End to End Models (ms) | Boost % |
---|---|---|---|---|---|
CPU (C5.2X) | Single Request Inference | Python (Module API) | 17 | 14 | 17.65% |
CPU (C5.2X) | Single Request Inference | Java Inference APIs | 17.09 | 14.16 | 17.14% |
CPU (C5.2X) | Single Request Inference | Scala Inference APIs | 17.93 | 13.19 | 26.44% |
CPU (C5.2X) | Batch Inference (Batch size = 25) | Python (Module API) | 15.18 | 12.57 | 17.19% |
CPU (C5.2X) | Batch Inference (Batch size = 25) | Java Inference APIs | 18.54 | 13 | 29.88% |
CPU (C5.2X) | Batch Inference (Batch size = 25) | Scala Inference APIs | 17 | 13.26 | 22.00% |
GPU (P3.16X) | Single Request Inference | Python (Module API) | 5.78 | 3.14 | 45.67% |
GPU (P3.16X) | Single Request Inference | Java Inference APIs | 8.95 | 4.26 | 52.40% |
GPU (P3.16X) | Single Request Inference | Scala Inference APIs | 9.14 | 4.42 | 51.64% |
GPU (P3.16X) | Batch Inference (Batch size = 25) | Python (Module API) | 2.61 | 1.31 | 49.81% |
GPU (P3.16X) | Batch Inference (Batch size = 25) | Java Inference APIs | 8.03 | 5.53 | 31.13% |
GPU (P3.16X) | Batch Inference (Batch size = 25) | Scala Inference APIs | 7.86 | 5.52 | 29.77% |
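
The text does not define the Boost % column, but the figures in the table are consistent with the relative reduction in latency, (non end-to-end − end-to-end) / non end-to-end × 100. The sketch below recomputes two rows under that inferred formula; the class and method names are hypothetical.

```java
import java.util.Locale;

class BoostSketch {
    /** Latency reduction as a percentage of the non end-to-end baseline (inferred formula). */
    static double boostPercent(double nonEndToEndMs, double endToEndMs) {
        return (nonEndToEndMs - endToEndMs) / nonEndToEndMs * 100.0;
    }

    public static void main(String[] args) {
        // CPU, single request, Java Inference APIs: 17.09 ms vs 14.16 ms.
        System.out.println(String.format(Locale.ROOT, "%.2f%%", boostPercent(17.09, 14.16))); // prints 17.14%
        // GPU, batch inference, Python (Module API): 2.61 ms vs 1.31 ms.
        System.out.println(String.format(Locale.ROOT, "%.2f%%", boostPercent(2.61, 1.31)));   // prints 49.81%
    }
}
```

Read this way, a boost of 50% means the end-to-end model takes half the time per sample, not that throughput rose by 50%.
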
Backward compatibility
- All API changes are backward compatible.
- Models saved with older MXNet versions should still load without breakage in the new MXNet version.
- Models saved with the new MXNet version will not work on older MXNet versions. If a user tries to load a new model with an older MXNet version, they get errors such as "unknown field 'inputs', 'outputs' in the model", because the model JSON parser schema in older MXNet versions does not understand the new fields introduced as part of this work.
...