Thank you for the great work on this project.
I noticed that tasks are trained in a fixed order rather than being shuffled:

- In stage 2, tasks follow the sequence `t2m` -> `m2t` -> `predict` within each epoch (see the sketch below).
- In stage 3, tasks appear to be processed in the order defined in the JSON file.
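To make sure I'm reading the schedule correctly, here is a minimal plain-Python sketch of that fixed ordering; the loader contents and batch names are placeholders I made up, not identifiers from this repository:

```python
# Toy stand-ins for the per-task data loaders; batch names are invented
# for illustration and do not correspond to anything in the codebase.
loaders = {
    "t2m": ["t2m_0", "t2m_1"],
    "m2t": ["m2t_0", "m2t_1"],
    "predict": ["pred_0", "pred_1"],
}

for epoch in range(1):
    # Fixed task sequence per epoch: every t2m batch is consumed before
    # any m2t batch, and every m2t batch before any predict batch.
    for task in ("t2m", "m2t", "predict"):
        for batch in loaders[task]:
            print(epoch, task, batch)  # stand-in for the forward/backward pass
```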
Since motion is treated as discrete data, much like text, using the same loss function across all tasks should be possible. This makes me wonder about the following:
- Was there a specific reason for not shuffling tasks during training?
- Did you experiment with randomly selecting tasks during pretraining (stage 2) and instruction tuning (stage 3)? (A sketch of the interleaving I have in mind follows this list.)
- If so, did the fixed-order approach give better results than random task selection?
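For concreteness, this is the kind of shuffling I mean: pooling batches from all tasks and interleaving them within an epoch. Again, the names are placeholder toys, not the project's actual loaders:

```python
import random

# Same toy loaders as in the sketch above; names are invented placeholders.
loaders = {
    "t2m": ["t2m_0", "t2m_1"],
    "m2t": ["m2t_0", "m2t_1"],
    "predict": ["pred_0", "pred_1"],
}

# Pool every (task, batch) pair and shuffle, so tasks interleave randomly
# within a single epoch instead of running back to back.
mixed = [(task, batch) for task, batches in loaders.items() for batch in batches]
random.shuffle(mixed)

for task, batch in mixed:
    print(task, batch)  # stand-in for the forward/backward pass on one batch
```

Since motion and text share one discrete vocabulary and, presumably, one token-level loss, interleaving like this seems as though it should drop in without per-task loss changes, which is why I'm asking.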
I'm curious to learn more about the design decisions behind this approach. Looking forward to hearing your insights!