
OMNIA’s paper accepted to NSDI 2019

Activities December 4, 2018
Professor Myeongjae Jeon’s paper, titled “Tiresias: A GPU Cluster Manager for Distributed Deep Learning,” has been accepted to NSDI 2019. This work was done in collaboration with researchers at the University of Michigan, Microsoft Research, Alibaba, and ByteDance.
 
This paper presents Tiresias, a GPU cluster resource manager tailored for distributed deep learning training. It schedules and places deep learning jobs efficiently to reduce their completion times. The paper proposes (1) a scheduling algorithm called 2DAS, which generalizes the Least-Attained Service (LAS) and Gittins index policies by incorporating both the spatial and temporal aspects of deep learning jobs, and (2) a job placement policy based on profiling the skewness of tensor distributions.
 
NSDI (the USENIX Symposium on Networked Systems Design and Implementation) is a top-tier conference in networked and distributed systems.