ECE 552 / CPS 550

Advanced Computer Architecture I

Fall 2015
Professor Daniel J. Sorin


The objective of this course is to learn the fundamental aspects of computer architecture design and analysis.
The course focuses on processor design, pipelining, superscalar and out-of-order execution, caches (memory hierarchies), virtual memory, storage
systems, and simulation techniques. Advanced topics include a survey of parallel architectures and future directions in computer architecture.
Prerequisites: ECE/CS 250 or consent of instructor
Class Location and Hours


Class meets Monday/Wednesday/Friday from 10:20am - 11:10am.

Location: TBD

Instructor, Teaching Assistants, and News Group


Professor Daniel J. Sorin

Office: 209C Hudson Hall

Office Hours: TBD

Email: sorin AT ee DOT duke DOT edu 


Graduate Teaching Assistants:

Yijie Zhuang


Required Textbook
Computer Architecture: A Quantitative Approach, 5th edition, by Hennessy and Patterson
Assignments and Grading
This course will require readings from the textbook and from selected research papers.  While you will not be quizzed on the readings, you
should still be certain to have read them before class so that you can learn from the class.  And, to appeal to your practical side, all readings are
fair game for the exams.  Added bonus: you will be better at reading research papers at the end of this class than at the beginning.

Students are responsible for:

Note to Computer Science students: The qualifying grade is based only on the midterm and final exams.

Late policy for homework and project (except for dean's excuses):
        Homework: <1 day late = 50% off
                  >1 day late = 0
        Project: No late projects will be accepted!
Academic Misconduct: I will not tolerate academically dishonest work.  This includes cheating on the exams and plagiarism on the project.  
Be careful on the project to cite prior work and to give proper credit to others' research. 
Refer to the Duke Undergraduate Honor Code or to the instructor if you have any questions about misconduct.
Topics, Lecture Notes, and Reading Assignments (still in flux!!)

I will post lecture notes (in PDF format) on Sakai shortly before I cover them in class.  Click on a topic title for a link to its notes.

Readings in blue will be provided by the instructor (click on links below for PS or PDF).

Topic: Course Introduction & Computer Performance
Readings:
        H/P Chapter 1
        "Instruction Sets and Beyond: Computers, Complexity, and Controversy"

Topic: Pipelined Processors
Readings:
        H/P Appendix A
        "The Optimal Logic Depth Per Pipeline Stage is 6 to 8 FO4 Inverter Delays"

Topic: Hardware/Dynamic Exploitation of Instruction Level Parallelism
Readings:
        H/P Chapters 2 and 3 (Chapter 3 is optional)
        "The Microarchitecture of the Pentium 4 Processor"
        "Complexity-Effective Superscalar Processors"
        "Checkpoint Processing and Recovery: Towards Scalable Large Instruction Window Processors"

Topic: Software/Static Exploitation of Instruction Level Parallelism
Readings:
        H/P Chapter 2
        "EPIC: Explicitly Parallel Instruction Computing"

Topic: Advanced Cache/Memory Designs
Readings:
        H/P Chapter 5
        "An Adaptive, Non-Uniform Cache Structure for Wire-Delay Dominated On-Chip Caches"
        "The ZCache: Decoupling Ways and Associativity"
        "Exceeding the Dataflow Limit via Value Prediction"

Topic: Multithreading, Multicore, and Multiprocessors
          Motivations: Power Efficiency, ILP Limits, and TLP
          Multicore Processors
Readings:
        H/P Chapter 4 (and 3.5)
        "Power: A First Class Design Constraint"
        "Exploiting Choice: Instruction Fetch and Issue on an Implementable Simultaneous Multithreading Processor"
        "Multiscalar Processors"
        "Single-ISA Heterogeneous Multi-Core Architectures for Multithreaded Workload Performance"
        "Conservation Cores: Reducing the Energy of Mature Computations"
        "Amdahl's Law in the Multicore Era"
        "NVIDIA Tesla: A Unified Graphics and Computing Architecture"

Topic: Advanced Topics: Fault Tolerance, Virtual Machines, Security, etc.
Readings:
        "Argus: Low-Cost, Comprehensive Error Detection in Simple Cores"
        "Virtual Machine Monitors: Current Technology and Future Trends"
        "RIFLE: An Architectural Framework for User-Centric Information-Flow Security"
        datacenter paper(s) TBD

Homework policy: Each homework should be done in a group of 2 students.  For each homework, each group turns in ONE assignment to be graded, with the names of both students on it.  For electronic submission of simulator code (required for some homework questions), one member of the group should upload the code to Sakai.


The course project will be performed either individually or in groups of 2 or 3. 

Projects will explore a micro-architectural issue (of your choice) using SimpleScalar.  You will be expected to modify the code in sim-outorder as part of your project.   See Prof. Sorin for project guidelines and ideas.

Project proposals (2 pages max!!): Due TBD in class.  Proposals must contain the following information:

Project reports (15 pages max!!): Due TBD in class.  No exceptions!

Schedule (tentative)

This schedule is tentative and may change depending on time constraints and on which days the instructor is out of town.





Aug 24
Aug 31
Sept 7     Dynamic ILP | Dynamic ILP | Dynamic ILP
Sept 14    Dynamic ILP | Dynamic ILP | Dynamic ILP
Sept 21    Dynamic ILP | Dynamic ILP | Dynamic ILP
Sept 28    Static ILP | Static ILP | Static ILP
Oct 5
Oct 12
Oct 19
Oct 26
Nov 2      Heterogeneous Multicore
Nov 9      Fault Tolerance | Fault Tolerance | Virtual Machines
Nov 16     Datacenters | Datacenters
Nov 30
Dec 7
--------  EXAM WEEK  --------