Hi, I am Meng Zhang, a fifth-year Ph.D. student in the Department of Electrical and Computer Engineering at Duke University. My advisor is Dr. Daniel J. Sorin. I obtained my M.S. and B.S. degrees from Beihang University in China in 2008 and 2005, respectively.
My current research focuses on scalable, verifiable cache coherence design, especially for heterogeneous systems, such as systems with CPUs and GPUs. Our main goal is to make the cache coherence protocol more amenable to verification while still maintaining scalability with regard to performance, power, and area. I am also interested in fault-tolerant computer architectures and reliable system design.
Validate and evaluate hardware and software cache coherence mechanisms for the Echelon architecture under Steve Keckler and Zvika Guz
Design and implement a hardware cache coherence engine on next-generation GPU infrastructure under Steve Keckler and James Balfour
Fall 2009: Teaching Assistant for ECE 252 Advanced Computer Architecture
Fall 2010: Teaching Assistant for ECE 152 Introduction to Computer Architecture
-- Intel Corporation (Hudson, MA), Feb. 2011
-- CSAIL Angstrom Seminar, Massachusetts Institute of Technology, Feb. 2011
-- Computer Engineering Seminar, North Carolina State University, Feb. 2011
-- CSAIL Angstrom Student Lunch Seminar, Massachusetts Institute of Technology, Feb. 2011
Explored the design space of hardware cache coherence on a heterogeneous system with CPUs and GPUs to determine which design options are optimal with regard to performance, power, and area. The study is based on an innovative infrastructure in which CPUs and GPUs share a unified memory space.
Designed subsystems of multicore processors in a fractal (self-similar) way to ease design verification. The subsystems we have designed include cache coherence protocols and memory consistency models. By leveraging the fractal structure, we enable the verification of any N-node system using induction.
Here is the specification of an example cache coherence protocol designed using our method: TreeFractal Protocol
Analyzed the pre-fabrication design verification effort and post-fabrication chip testing effort of different fault-tolerant mechanisms. These mechanisms include dual modular redundancy (DMR), triple modular redundancy (TMR), re-execution, and parity codes. We found that re-execution is the most efficient mechanism when formal verification and testing efforts are considered together.
Implemented an error detection mechanism (Argus) on the Core Cannibalization Architecture (CCA), a self-reconfiguration mechanism for multicore processors with hard faults.
Last updated: 09/09/2012