• Zero Bubble Pipeline Parallelism

  • 2024/07/08
  • Duration: less than 1 minute
  • Podcast


  • Summary

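  • The core idea is to split the backward pass into two flows: one that computes the gradient with respect to the parameters, and one that computes the gradient with respect to the layer's input (the output of the previous layer). Only the input gradient is needed by the preceding pipeline stage, so the parameter-gradient work can be scheduled into slots where a stage would otherwise sit idle waiting (a bubble). Read the full paper: https://arxiv.org/abs/2401.10241 Tags: Systems and Performance, Deep Learning, Machine Learning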
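The split described in the summary can be illustrated for a single linear layer. This is a minimal sketch, not the paper's implementation: it assumes NumPy, a toy layer `y = x @ w`, and a loss equal to `y.sum()`. The function names `backward_input` and `backward_weight` are hypothetical, chosen here to mirror the two backward flows.

```python
import numpy as np


def forward(x, w):
    """Toy linear layer y = x @ w; also returns the tensors backward needs."""
    return x @ w, (x, w)


def backward_input(grad_y, saved):
    """Gradient w.r.t. the layer input. The previous stage is waiting on this."""
    _, w = saved
    return grad_y @ w.T


def backward_weight(grad_y, saved):
    """Gradient w.r.t. the parameters. No other stage is waiting on this."""
    x, _ = saved
    return x.T @ grad_y


rng = np.random.default_rng(0)
x = rng.standard_normal((4, 3))   # batch of 4, 3 input features
w = rng.standard_normal((3, 2))   # 3 inputs -> 2 outputs

y, saved = forward(x, w)
grad_y = np.ones_like(y)          # gradient of the assumed loss, y.sum()

# Critical path: compute the input gradient right away so it can be
# handed to the previous pipeline stage.
grad_x = backward_input(grad_y, saved)

# Off the critical path: queue the weight-gradient work for later.
deferred = [(grad_y, saved)]

# ... later, in a slot where this stage would otherwise be idle ...
grad_w = backward_weight(*deferred.pop())
```

The two functions read the same saved tensors but do not depend on each other's results, which is what lets a scheduler run the weight-gradient step at any later point before the optimizer update.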

