BALANCING PIPELINE PARALLELISM WITH VOCABULARY PARALLELISM
This paper addresses imbalanced computation and memory in pipeline parallelism for large language models by partitioning vocabulary layers, reducing communication barriers, and achieving improved throughput and memory balance.
https://arxiv.org/abs//2411.05288
YouTube: https://www.youtube.com/@ArxivPapers
TikTok: https://www.tiktok.com/@arxiv_papers
Apple Podcasts: https://podcasts.apple.com/us/podcast/arxiv-papers/id1692476016
Spotify: https://podcasters.spotify.com/pod/show/arxiv-papers
Support this podcast: https://podcasters.spotify.com/pod/show/arxiv-papers/support