Softmax-based training is widely used in face recognition and object re-identification (ReID).

However, it is hard to train on large-scale data when GPU resources are limited.

Partial FC is one method to solve this problem.

I have implemented a prototype, and it has already been used in training.

It is built on mxnet_v0.8.0, and I want to port it to the master branch of MXNet.

The GitHub branch for single-machine training is: https://github.com/starimpact/mxnet_v0.8.0/tree/bLocalReset

The GitHub branch for distributed training is: https://github.com/starimpact/mxnet_v0.8.0/tree/bProxy_Weight

4 Comments

  1. Maybe row_sparse could replace it.

    1. Hi Rahul,

      Partial FC trains only part of the FC layer at each step.

      Distributed model-parallel FC, by contrast, divides the FC layer into parts, places them on different GPUs, and trains all of them at the same time.

  2. Recently, one of my interns has been developing a new kind of partial FC that could reach the same accuracy as the full FC.
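
To illustrate the point made in the comments that partial FC trains only part of the FC layer at each step, here is a minimal sketch in plain Python. It is not the code from the branches linked above and does not use MXNet. The function names `sample_class_subset` and `partial_fc_step` are hypothetical, and the selection policy (keep every class present in the batch, then pad with randomly chosen other classes) is an assumption made for illustration only.

```python
import math
import random


def sample_class_subset(labels, num_classes, subset_size, rng):
    """Choose which FC rows to train this step (assumed policy):
    every class that appears in the batch, padded with random other classes."""
    positives = sorted(set(labels))
    if len(positives) > subset_size:
        raise ValueError("subset_size must cover all labels in the batch")
    positive_set = set(positives)
    negatives = [c for c in range(num_classes) if c not in positive_set]
    extra = rng.sample(negatives, subset_size - len(positives))
    return positives + sorted(extra)


def partial_fc_step(weights, features, labels, subset, lr):
    """One SGD step of softmax cross-entropy restricted to the rows in `subset`.

    weights:  list of num_classes rows, each a list of dim floats (updated in place)
    features: list of samples, each a list of dim floats
    Returns the mean loss over the batch.
    """
    local_index = {c: i for i, c in enumerate(subset)}
    dim = len(features[0])
    grads = [[0.0] * dim for _ in subset]
    total_loss = 0.0
    for x, y in zip(features, labels):
        # Logits are computed only for the selected class rows.
        logits = [sum(w * v for w, v in zip(weights[c], x)) for c in subset]
        peak = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(z - peak) for z in logits]
        norm = sum(exps)
        probs = [e / norm for e in exps]
        target = local_index[y]
        total_loss += -math.log(probs[target])
        # Gradient of cross-entropy w.r.t. row i is (p_i - 1[i == target]) * x.
        for i, p in enumerate(probs):
            coeff = p - (1.0 if i == target else 0.0)
            for d in range(dim):
                grads[i][d] += coeff * x[d]
    batch = len(features)
    # Only the selected rows are updated; all other rows stay untouched.
    for i, c in enumerate(subset):
        for d in range(dim):
            weights[c][d] -= lr * grads[i][d] / batch
    return total_loss / batch


# Toy usage: 10 classes, 4-dimensional features, train 4 rows per step.
rng = random.Random(0)
num_classes, dim = 10, 4
weights = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(num_classes)]
features = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(3)]
labels = [2, 7, 2]
subset = sample_class_subset(labels, num_classes, 4, rng)
before = [row[:] for row in weights]
loss = partial_fc_step(weights, features, labels, subset, lr=0.1)
```

In this sketch the per-step cost scales with the subset size rather than with the total number of classes, which is the property that helps when GPU resources are limited. A model-parallel FC would instead keep all rows active and split them across devices.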