IST LUNCH BUNCH
Customizable Computing — From Single-chip to Datacenters
In our 2008 proposal to the NSF Expeditions in Computing program, we argued that future computing systems would be customizable with extensive use of accelerators, as custom-designed accelerators often provide 10-100X gains in performance/energy efficiency over general-purpose processors. Such an accelerator-rich architecture presents a fundamental departure from the classical von Neumann architecture, which emphasizes efficient sharing of a common pipeline across the execution of different instructions, providing an elegant solution when computing resources are scarce. In contrast, the accelerator-rich architecture features heterogeneity and customization for energy efficiency, which is better suited for energy-constrained designs where silicon resources are abundant. Our research program on customizable computing turned out to be very timely and impactful: with Intel's $17B acquisition of Altera completed in December 2015, customizable computing is moving from advanced research projects into mainstream computing technologies.
In this talk, I shall first present an overview of our research on customizable computing, from single-chip, to server node, and to datacenters, with extensive use of composable accelerators and field-programmable gate arrays (FPGAs), and highlight our successes in several application domains, including medical imaging, machine learning, and computational genomics. Then, I shall present our ongoing work on enabling automation for customized computing. One effort is automated compilation that combines source-code-level transformations for high-level synthesis with efficient generation of parameterized architecture templates. Another direction is to develop efficient runtime support for scheduling and transparent resource management, so that FPGAs can be integrated for datacenter-scale acceleration while supporting existing programming interfaces for large-scale distributed computation, such as MapReduce, Hadoop, and Spark. I shall highlight the algorithmic and implementation challenges and our solutions to many of these compilation and runtime optimization problems.