Major Features

API Changes

  1. Added `CachedOp`. You can now cache operators that are called frequently with the same set of arguments to reduce overhead.

  2. Added sample_multinomial for sampling from multinomial distributions.

  3. Added `trunc` operator for rounding towards zero.

  4. Added linalg_gemm, linalg_potrf, ... operators for LAPACK support.

  5. Added verbose option to Initializer for printing out initialization details.

  6. Added DeformableConvolution to contrib from the Deformable Convolutional Networks paper.

  7. Added float64 support for the dot and batch_dot operators.

  8. `allow_extra` is added to Module.set_params to ignore extra parameters.

  9. Added `mod` operator for modulo.

  10. Added `multi_precision` option to SGD optimizer to improve training with float16. Resnet50 now achieves the same accuracy when trained with float16 and gives 50% speedup on Titan XP.
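
The benefit of `multi_precision` can be illustrated without MXNet. The sketch below assumes the usual mixed-precision scheme, in which the optimizer keeps a higher-precision copy of each weight and applies updates to that copy. The helper `to_fp16`, the learning rate, and the gradient value are hypothetical, and float16 rounding is simulated with Python's `struct` module rather than with MXNet's SGD implementation.

```python
import struct

def to_fp16(x):
    """Round a Python float to the nearest float16 value (hypothetical helper)."""
    return struct.unpack("e", struct.pack("e", x))[0]

lr, grad, steps = 0.01, 0.01, 100   # illustrative values; each update is about 1e-4

# Plain float16 SGD: the weight is rounded to float16 after every update.
w_fp16 = to_fp16(1.0)
for _ in range(steps):
    w_fp16 = to_fp16(w_fp16 - lr * grad)

# Master-copy SGD: updates accumulate in a higher-precision copy
# (a Python float stands in for float32 here), and the float16 weight
# is derived from that copy.
w_master = 1.0
for _ in range(steps):
    w_master -= lr * grad
w_mixed = to_fp16(w_master)

print(w_fp16)   # the float16-only weight
print(w_mixed)  # the weight derived from the master copy
```

In this toy setup each update is smaller than half the float16 spacing just below 1.0 (the spacing is about 4.9e-4), so the float16-only weight rounds back to 1.0 on every step, while the master copy accumulates the updates.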

Performance Improvements

  1. ImageRecordIter now stores data in pinned memory to improve GPU memcopy speed.

Bugfixes


  1. Fixed the Cython interface. `make cython` and `python setup.py install --with-cython` should now install the Cython interface and reduce overhead in applications that use imperative/bucketing.

  2. Fixed various bugs in the Faster-RCNN example.

  3. Fixed various bugs in the SSD example.

  4. Fixed `out` argument not working for `zeros`, `ones`, `full`, etc.

  5. `expand_dims` now supports backward shape inference.

  6. Fixed a bug in `rnn.BucketingSentenceIter` that caused incorrect layout handling on multi-GPU.

  7. Fixed context mismatch when loading optimizer states.

  8. Fixed a bug in ReLU activation when using MKL.

  9. Fixed a few race conditions that caused crashes on shutdown.

Refactors


  1. Refactored TShape/TBlob to use int64 dimensions and DLTensor as internal storage, in preparation for migration to DLPack. As a result, TBlob::dev_mask_ and TBlob::stride_ are removed.

Known issues

  • Inception-V3 model can be converted into CoreML format using the converter but is unable to run on Xcode.
  • The Source Headers and License Files and the NOTICE File have minor errors.
  • The shape inference pass may fail because of the overly strict assumption that all unknown shapes of the forward inputs can be deduced after the first forward inference pass. Users will see the following error message: “Backward shape inconsistent with the forward shape”. To avoid this, users can deduce the shape themselves and provide it as an attribute of the symbolic variable. See this pull request for more details.

How to build MXNet

Please follow the build instructions, except do not clone the git repo; use the provided tarball instead.

Keras 1.2.2 with MXNet Backend


  1. Added an Apache MXNet backend for Keras 1.2.2.
  2. Easy-to-use multi-GPU training with the MXNet backend.
  3. High-performance model training in Keras with the MXNet backend.

Getting Started Resources

  1. Installation
  2. How to use Multi-GPU for training in Keras with MXNet backend
  3. For more examples, explore the keras/examples directory.
  4. Source Repo

For more details on unsupported functionality, known issues, and resources, refer to the release notes.


Apple CoreML Converter

You can now convert your MXNet models into Apple CoreML format so that they can run on Apple devices, which means that you can build your next iPhone app using your own MXNet model!

List of layers that can be converted:

  1. Activation
  2. Batchnorm
  3. Concat
  4. Convolution
  5. Deconvolution
  6. Dense
  7. Elementwise
  8. Flatten
  9. Pooling
  10. Reshape
  11. Softmax
  12. Transpose

With the above layers, this tool can convert models that are similar to:

  1. Inception
  2. Network-In-Network
  3. Squeezenet
  4. Resnet
  5. Vgg


In order to run the converter you need the following:
  1. macOS 10.11 (El Capitan) or higher. Running inference on the converted model requires macOS 10.13 or higher (on phones, iOS 11 or above).
  2. Python 2.7
  3. mxnet-to-coreml tool:
    pip install mxnet-to-coreml


To convert, say, a squeezenet model (which can be downloaded from here, and the synset file can be downloaded from here), run the converter with the following arguments (assuming you are in the directory where the converter script resides):
    --model-prefix='squeezenet_v1.1' --epoch=0 --input-shape='{"data":"3,227,227"}' --mode=classifier --pre-processing-arguments='{"image_input_names":"data"}' --class-labels synset.txt --output-file="squeezenetv11.mlmodel"

You can find explanations for each parameter along with more examples here.
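
The two JSON-valued options are easy to misquote in a shell. As a minimal sketch, the argument list from the command above can be assembled in Python so that `json.dumps` produces the JSON strings. The function name `build_converter_args` is hypothetical, and only flags that appear in the command above are used.

```python
import json
import shlex

def build_converter_args(model_prefix, epoch, input_shape, class_labels, output_file):
    """Assemble converter arguments (hypothetical helper, not part of the tool)."""
    compact = {"separators": (",", ":")}  # compact JSON, as in the command above
    return [
        "--model-prefix=" + model_prefix,
        "--epoch=" + str(epoch),
        "--input-shape=" + json.dumps(input_shape, **compact),
        "--mode=classifier",
        "--pre-processing-arguments="
        + json.dumps({"image_input_names": "data"}, **compact),
        "--class-labels", class_labels,
        "--output-file=" + output_file,
    ]

args = build_converter_args(
    model_prefix="squeezenet_v1.1",
    epoch=0,
    input_shape={"data": "3,227,227"},
    class_labels="synset.txt",
    output_file="squeezenetv11.mlmodel",
)

# shlex.quote adds shell quoting only where it is needed.
print(" ".join(shlex.quote(a) for a in args))
```

The resulting list can be appended to the converter invocation and passed to `subprocess.run` as a list, which avoids shell quoting altogether.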

To use the generated CoreML model file in your project, refer to Apple's tutorial here.