The idea is that the "profile classes" send an instance of the "stat classes" to AggregateStats. AggregateStats then processes the stats one by one and maps the names of the operator/API stats to their AggregateStats entries.

With that in mind, we can see what causes the duplication: when we profile operators, we use ProfileOperator; however, ProfileOperator also has a member variable "as_task_" of class ProfileTask. The intention is to generate two events/stats that fall into different domains for a single operator call. However, because those two events/stats carry the same operator name, they are counted twice in AggregateStats.

"MXNET_C_API" calls do not cause duplication, because they use ProfileTask only. In other words, only one event/stat is generated for each call.
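The duplication described above can be reproduced with a small stand-alone sketch. It assumes, for illustration only, that the aggregator keys its entries purely by stat name. SimpleStat, SimpleAggregate, ProfileOneOperatorCall and ProfileOneApiCall are hypothetical names and are not the real MXNet classes.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical, simplified stand-in for a stat object (not MXNet's ProfileStat).
struct SimpleStat {
  std::string name;    // operator or API name
  std::string domain;  // domain the event belongs to
};

// Hypothetical stand-in for AggregateStats: entries are keyed by name only,
// so the domain does not distinguish two stats with the same name.
class SimpleAggregate {
 public:
  void OnProfileStat(const SimpleStat& stat) { ++counts_[stat.name]; }

  std::size_t Count(const std::string& name) const {
    auto it = counts_.find(name);
    return it == counts_.end() ? 0 : it->second;
  }

 private:
  std::map<std::string, std::size_t> counts_;
};

// One operator call emits two stats with the same name in different domains,
// mirroring ProfileOperator plus its as_task_ member.
inline void ProfileOneOperatorCall(SimpleAggregate* agg, const std::string& op_name) {
  agg->OnProfileStat({op_name, "operator"});  // ProfileOperator-like stat
  agg->OnProfileStat({op_name, "task"});      // as_task_-like stat
}

// One C API call emits a single stat, mirroring the ProfileTask-only path.
inline void ProfileOneApiCall(SimpleAggregate* agg, const std::string& api_name) {
  agg->OnProfileStat({api_name, "MXNET_C_API"});
}
```

In this sketch a single operator call raises the count for its name by two, while a single API call raises its count by one, which matches the behavior described above.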


All the "stat classes" inherit from ProfileStat in profiler.h. There, we can add a new bool member variable "enable_aggregate_". This variable defaults to true and controls whether the stat is used or skipped in AggregateStats (an if statement is added in OnProfileStat()). We also add a second "enable_aggregate_" to ProfileTask. The idea is that we can set this bool and propagate its value to ProfileStat's "enable_aggregate_" through the lambda function in ProfileTask::SendStat(). Finally, in ProfileOperator, we set the "enable_aggregate_" of "as_task_"/ProfileTask to false. This way, we continue to produce two events/stats for each operator call, but only the one generated by ProfileOperator is registered in AggregateStats. The stat generated by "as_task_"/ProfileTask is skipped, so the duplication no longer occurs.
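A minimal sketch of the proposed flag is shown here, under simplifying assumptions. SketchStat, SketchAggregate, SketchTask and SketchOperator are hypothetical stand-ins for ProfileStat, AggregateStats, ProfileTask and ProfileOperator; the real classes carry more state and go through the profiler rather than calling the aggregator directly. Only the "enable_aggregate_" and "as_task_" names come from the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Stand-in for ProfileStat: the new flag defaults to true.
struct SketchStat {
  std::string name;
  bool enable_aggregate_ = true;
};

// Stand-in for AggregateStats: the added if statement skips disabled stats.
class SketchAggregate {
 public:
  void OnProfileStat(const SketchStat& stat) {
    if (!stat.enable_aggregate_) return;  // skip stats that opted out
    ++counts_[stat.name];
  }

  std::size_t Count(const std::string& name) const {
    auto it = counts_.find(name);
    return it == counts_.end() ? 0 : it->second;
  }

 private:
  std::map<std::string, std::size_t> counts_;
};

// Stand-in for ProfileTask: owns its own flag and copies it into the stat
// through a lambda, loosely mimicking the lambda in SendStat().
class SketchTask {
 public:
  explicit SketchTask(std::string name) : name_(std::move(name)) {}

  void SendStat(SketchAggregate* agg) const {
    auto fill = [this](SketchStat* stat) {
      stat->name = name_;
      stat->enable_aggregate_ = enable_aggregate_;  // propagate the flag
    };
    SketchStat stat;
    fill(&stat);
    agg->OnProfileStat(stat);
  }

  bool enable_aggregate_ = true;

 private:
  std::string name_;
};

// Stand-in for ProfileOperator: still emits two stats per call, but
// disables aggregation for the one produced by as_task_.
class SketchOperator {
 public:
  explicit SketchOperator(const std::string& name) : name_(name), as_task_(name) {
    as_task_.enable_aggregate_ = false;
  }

  void SendStat(SketchAggregate* agg) const {
    SketchStat stat;
    stat.name = name_;         // enable_aggregate_ keeps its default (true)
    agg->OnProfileStat(stat);  // counted
    as_task_.SendStat(agg);    // emitted, but skipped by the aggregator
  }

 private:
  std::string name_;
  SketchTask as_task_;
};
```

With this arrangement, a stand-alone task (the C API path) keeps the default of true and is still aggregated, while the task embedded in an operator is the only one that opts out.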