The 91st AICS Cafe
Date and Time: Wed. June 1, 2016, 15:30-16:30
Place: Workshop room (6th floor) at AICS
Title: Toward a General I/O Arbitration Framework for netCDF based Big Data Processing
Speaker: Liao Jianwei (System Software Development Team, FLAGSHIP 2020 Project)
Presentation Language: English
Presentation Material: English
As high performance computing (HPC) and Big Data processing converge, it has become increasingly common to deploy large-scale data analytics workloads on high-end supercomputers. Such applications often take the form of complex workflows with many different components, assimilating data from scientific simulations as well as measurements streamed from sensor networks, such as radars and satellites. For example, as part of Japan's next-generation flagship (post-K) supercomputer project, RIKEN is investigating the feasibility of a highly accurate weather forecasting system that would provide a real-time outlook for severe guerrilla rainstorms. One of the main performance bottlenecks of this application is the lack of efficient communication among workflow components, which currently takes place over the parallel file system. This presentation reports an initial study of a direct communication framework designed for complex workflows that eliminates unnecessary file I/O among components. Specifically, we propose an I/O arbitrator layer that provides direct parallel data transfer among job components that rely on the netCDF interface for their I/O operations, requiring only minimal modifications to application code. We present the design and a preliminary evaluation of the framework on the K Computer, using RIKEN’s experimental weather forecasting workflow as a case study.