Special Seminar: Jaeyoung Do (Microsoft Research) “The beginning of a new era of programmable solid-state storage in cloud data centers”
2018.08.28
Abstract
Today, there is a major disconnect in cloud data centers between the speed of innovation in application/operating system (OS) software and that in storage infrastructure. Application/OS software is patched with new or improved functionality every few weeks at “cloud speed,” while storage devices are off limits for such sustained innovation during their hardware life cycle of 3-5 years in the data center. Because the software inside a storage device is written by the storage vendor as proprietary firmware that general application developers cannot modify, we are stuck with devices whose functionality and capabilities are frozen in time, even though many of them could be modified in software.
In this talk, we introduce the concept of a software-defined storage substrate of solid-state drives (SSDs) that is as programmable, agile, and flexible as the applications and OS accessing it from servers in cloud data centers. A fully programmable storage substrate offers opportunities to better bridge the gap between application/OS needs and storage capabilities and limitations, while allowing us to innovate in-house at cloud speed. As an example of the value proposition, we explain how this concept can be applied to reduce the I/O cost of log-structured data caching systems without drastic changes.
Short Bio
Dr. Jaeyoung Do is a Researcher at Microsoft Research, where he works on AI, large-scale data management, cloud computing, and non-volatile-memory-based systems. He currently leads a project called SoftFlash, which proposes to create a software-defined storage substrate of flash SSDs in the data center that is as programmable, agile, and flexible as the applications and operating systems accessing it from servers. Prior to MSR, he was a Senior Scientist at the Microsoft Gray Systems Lab and one of the main contributors to the PolyBase project, a technology for running queries over both locally stored relational data and non-relational data in Hadoop/Cloud Storage, all from within Microsoft SQL Server. He received a Ph.D. in 2012 and an M.S. in 2009 from the University of Wisconsin-Madison, USA, and a B.S. in 2007 from the Korea Advanced Institute of Science and Technology (KAIST), Korea, all in Computer Sciences. During his Ph.D. studies, he explored several advanced architectures for effectively integrating flash SSDs into existing database management systems (DBMSs). His work on using flash SSDs to extend a main-memory DBMS buffer pool shipped in SQL Server 2014.