[jira] [Created] (IGNITE-6783) Create common mechanism for group training.

Anton Vinogradov (Jira)
Artem Malykh created IGNITE-6783:
------------------------------------

             Summary: Create common mechanism for group training.
                 Key: IGNITE-6783
                 URL: https://issues.apache.org/jira/browse/IGNITE-6783
             Project: Ignite
          Issue Type: Task
      Security Level: Public (Viewable by anyone)
            Reporter: Artem Malykh
            Assignee: Artem Malykh


In distributed ML, a common task is to train several models in parallel while letting them communicate with each other during training. A simple example is training a neural network with SGD on different chunks of data located on several nodes. Such training repeats the following loop: each worker node performs one or several SGD steps and sends its gradient to a central node, which averages the gradients from all worker nodes and sends the averaged gradient back. This procedure follows a pattern that can be applied to other ML algorithms, so it would be useful to extract it into a common mechanism.
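
As a rough illustration of the loop described above, here is a minimal single-process sketch in Python. It is not Ignite code and does not use any Ignite API: the names group_train, local_gradient and average are hypothetical, the "nodes" are plain lists of (x, y) pairs, and the model is a one-parameter linear fit y = w * x rather than a neural network. Because every worker would apply the same averaged gradient, the sketch keeps a single shared copy of w.

```python
from typing import Callable, List, Sequence, Tuple

# Assumption: each worker node holds one chunk of (x, y) training pairs.
Chunk = Sequence[Tuple[float, float]]


def local_gradient(w: float, chunk: Chunk) -> float:
    """Worker step: gradient of mean squared error for y = w * x on one chunk."""
    return sum(2.0 * (w * x - y) * x for x, y in chunk) / len(chunk)


def average(grads: List[float]) -> float:
    """Central-node step: average the gradients received from the workers."""
    return sum(grads) / len(grads)


def group_train(
    chunks: Sequence[Chunk],
    local_step: Callable[[float, Chunk], float],
    reduce: Callable[[List[float]], float],
    w0: float,
    lr: float,
    rounds: int,
) -> float:
    """Generic loop: local steps on every chunk, reduce centrally, apply the result."""
    w = w0
    for _ in range(rounds):
        # In a real cluster these calls would run in parallel, one per node.
        grads = [local_step(w, chunk) for chunk in chunks]
        # The central node combines the results and sends the value back.
        g = reduce(grads)
        # Every worker applies the same averaged gradient, so one copy of w suffices here.
        w -= lr * g
    return w


if __name__ == "__main__":
    # Two hypothetical nodes, each holding a chunk of data generated from y = 3 * x.
    chunks = [[(1.0, 3.0), (2.0, 6.0)], [(3.0, 9.0), (4.0, 12.0)]]
    print(group_train(chunks, local_gradient, average, w0=0.0, lr=0.05, rounds=50))
```

The point of passing local_step and reduce as parameters is that the loop itself does not depend on SGD: another algorithm could plug in a different local computation and a different way of combining results, which is the pattern the issue proposes to extract.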



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)