...
```python
def convert_model(sym, arg_params, aux_params, target_dtype="float16",
                  target_dtype_ops=None, fp32_ops=None, widest_dtype_ops=None,
                  conditional_fp32_ops=None, excluded_sym_names=None):
    """API for converting a model from an FP32 model to a mixed precision model.

    MXNet tries to convert the FP32 model to a mixed precision model by adding
    cast layers using the amp_cast and amp_multicast operators. The decision on
    which cast layer to add is based on hardcoded lists for Automatic Mixed
    Precision in MXNet. These lists can be overridden by the user by providing
    their own lists using: target_dtype_ops, fp32_ops, widest_dtype_ops,
    conditional_fp32_ops.

    Parameters
    ----------
    sym : str or Symbol
        Defines the structure of a neural network for FP32 types.
    arg_params : dict
        Dictionary of name to `NDArray`.
    aux_params : dict
        Dictionary of name to `NDArray`.
    target_dtype : str
        Currently only supports float16. The target dtype indicates to add cast
        layers when possible so that lower precision computation can be leveraged.
    target_dtype_ops : list of strs
        Override the list of operator names casted to target_dtype. If None,
        uses the framework's default list to be casted to target dtype.
    fp32_ops : list of strs
        Override the list of operator names casted to FP32. If None, uses the
        framework's default list to be casted to FP32.
    widest_dtype_ops : list of strs
        A list of op names provided by the user which should run in the widest
        precision among their inputs. If None, uses the framework's default
        list of widest_precision_ops.
    conditional_fp32_ops : list of (string, string, list of string)
        Override the list of operators to be casted to FP32. The format of the
        list is (name of the function, name of the parameter, list of values
        of the parameter that make the operator to be casted to FP32).
    excluded_sym_names : list of strs
        A list of strings that represent the names of symbols that users want
        to exclude from being quantized.
    """
```
target_dtype should decide which lists need to be overridden.
For example, in the future bfloat16 support may be added in which case these lists for operators running in bfloat16 will also be added to AMP.
In this case, target_dtype will allow users to choose the right dtype for the mixed precision model.
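For illustration, the override lists a user would pass to `convert_model` might be assembled as follows. This is a hedged sketch: the operator names, parameter values, and symbol names below are hypothetical examples of the documented formats, not MXNet's actual default lists.

```python
# Ops to force into the target dtype (float16); names are illustrative.
target_dtype_ops = ["Convolution", "FullyConnected"]

# Ops to keep in full precision regardless of target_dtype.
fp32_ops = ["softmax", "norm"]

# Each conditional_fp32_ops entry is a tuple:
# (operator name, parameter name, parameter values that force FP32 execution).
conditional_fp32_ops = [
    ("Activation", "act_type", ["softrelu"]),
]

# Symbols (by name) to leave entirely untouched by the conversion.
excluded_sym_names = ["fc_output"]
```

These lists would then be passed as the corresponding keyword arguments to `convert_model`, overriding the framework's built-in AMP lists.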
...
```python
def convert_block(block, target_dtype="float16", target_dtype_ops=None,
                  fp32_ops=None, widest_dtype_ops=None,
                  conditional_fp32_ops=None, excluded_sym_names=None,
                  input_names=['data']):
    """Given a hybrid block/symbol block representing a neural network of data
    type FP32 and a target_dtype, return a block with mixed precision support.

    Parameters
    ----------
    block : HybridBlock or SymbolBlock object
        FP32 HybridBlock or SymbolBlock object.
    target_dtype : str or numpy
        Currently only supports float16. The target dtype indicates to add cast
        layers when possible so that lower precision computation can be leveraged.
    target_dtype_ops : list of strs
        Override the list of operator names casted to target_dtype. If None,
        uses the framework's default list to be casted to target dtype.
    fp32_ops : list of strs
        Override the list of operator names casted to FP32. If None, uses the
        framework's default list to be casted to FP32.
    widest_dtype_ops : list of strs
        Override the list of operator names which should run in the widest
        precision among their input arguments. If None, uses the framework's
        default list of widest_precision_ops.
    conditional_fp32_ops : list of (string, string, list of string)
        Override the list of functions casted to FP32. The format of the list
        is (name of the function, name of the parameter, list of values of the
        parameter that make the operator to be casted to FP32).
    excluded_sym_names : list of strs
        A list of strings that represent the names of symbols that users want
        to exclude from being quantized.
    input_names : list of strs
        A list of strings representing the names of input variables.
    """
```
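The override lists in both APIs interact roughly as sketched below. This is a simplified, hypothetical illustration of the selection semantics implied by the docstrings, not MXNet's actual implementation; in particular, the precedence order between the lists is an assumption made here for clarity.

```python
def choose_dtype(op_name, attrs, target_dtype="float16",
                 target_dtype_ops=(), fp32_ops=(),
                 widest_dtype_ops=(), conditional_fp32_ops=()):
    """Illustrative sketch (not MXNet's code) of how the override lists
    decide the precision an operator runs in. `attrs` maps parameter
    names to their values for this operator instance."""
    # Conditional rules: force FP32 when a parameter takes a listed value.
    for name, param, values in conditional_fp32_ops:
        if op_name == name and attrs.get(param) in values:
            return "float32"
    if op_name in fp32_ops:
        return "float32"
    if op_name in widest_dtype_ops:
        # amp_multicast would run the op in the widest dtype of its inputs.
        return "widest_of_inputs"
    if op_name in target_dtype_ops:
        return target_dtype
    # Not on any list: the op's dtype is inferred from its inputs.
    return "unchanged"
```

For example, an `Activation` op with `act_type="softrelu"` listed in `conditional_fp32_ops` would be kept in FP32, while a `Convolution` listed in `target_dtype_ops` would be cast to float16.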
...
imagenet1k-resnet-152: JSON File, Params File
Results
Model | Batch Size | Original Model (Samples/sec) | Mixed Precision Model (Samples/sec) | Original Model with Implicit Type Conversion (MXNET_CUDA_TENSOR_OP_MATH_ALLOW_CONVERSION=1) (Samples/sec)
---|---|---|---|---
imagenet1k-resnet-152 | 1 | 85 | 72 | 72
 | 2 | 140 | 140 | 142
 | 4 | 240 | 270 | 228
 | 8 | 320 | 470 | 261
 | 16 | 405 | 680 | 315
resnet50_v1 | 1 | 215 | 165 | 205
 | 2 | 370 | 330 | 365
 | 4 | 560 | 600 | 545
 | 8 | 760 | 980 | 635
 | 16 | 935 | 1400 | 790
FAQ
Will the arg_params and aux_params be cast to FP16?
...