CO101
Principle of Computer
Organization
Lecture 01: Introduction
Liang Yanyan
澳門科技大學
Macau University of Science and Technology
General References
• Instructor: Dr. Liang Yanyan (梁延研)
• Email: yyliang@must.edu.mo
• Tel: 88971997
• Office: A212
• TA: Mr. Lin Chi (林馳)
• Email: linantares@gmail.com
• You are encouraged to ask questions during the lecture
or after, or stop by my office.
• Classroom
• D1: Monday@C505, Friday@C408
• D2: Wednesday@C505, Friday@C408
General References
• Textbook: Computer
Organization and Design: the
Hardware/Software Interface –
4th Edition, David Patterson
and John Hennessy.
• Some resources are available via FTP
• Username: yyliang_stu
• Password: Ja7Hr3lW
• Link: ftp://ftp.must.edu.mo/
Grading Information
• Grade determinants
• Attendance 5%
• Assignments 10%
• Quizzes 15%
• Labs 15%
• Project 15%
• Final Exam 40%
• Late submissions are penalized 10% per day.
• A student must gain at least 40% of the full marks in
each part in order to pass the course.
Why Learn This Stuff?
• You want to call yourself a “computer scientist/engineer”.
• You want to build HW/SW systems that people use.
• You need to make a purchasing decision or offer “expert”
advice.
• So you need to know the relationship between performance and power.
• Both hardware and software affect performance and power, because:
• The algorithm determines the number of source-level statements.
• The language, compiler, and architecture determine the number of machine-level instructions.
• The processor and memory determine how fast, and at what power cost, machine-level instructions are executed.
Course Contents
• Introduction to the major components of a computer system and how they function together in executing a program.
• Introduction to CPU datapath and control unit design.
• Introduction to techniques to improve performance and energy
efficiency of computer systems.
• Introduction to multiprocessor architecture.
• The aim of this course is to learn what determines the capabilities and performance of computer systems,
• and to understand the interactions between the computer’s architecture
and its software,
• so that
• future software designers (compiler writers, operating system designers,
database programmers, application programmers, …) can achieve the best
cost-performance trade-offs,
• and so that
• future architects understand the effects of their design choices on software.
What You Will Learn
• How programs are translated into machine language,
• and how the hardware executes them.
• What determines program performance,
• and how it can be improved.
• How hardware designers improve performance.
What You Should Already Know
• Electronic circuits and digital logic.
• Knowledge of structured programming languages
• Create, compile, and run C (C++, Java) programs
Computer Organization
• This course is all about how computers work.
• But what do we mean by a computer?
• Different types: embedded, laptop, desktop, server.
• Different uses: automobiles, graphics, finance, genomics…
• Different manufacturers: Lenovo, Apple, IBM, HP, Sony…
• Analogy: Consider a course on “automotive vehicles”.
• Many similarities from vehicle to vehicle (e.g., wheels).
• Huge differences from vehicle to vehicle (e.g., gas vs. electric).
• Best way to learn:
• Focus on a specific instance and learn how it works,
• While learning general principles and historical perspectives.
A Computer
Are there other kinds of computers?
Classes of Computers
• Desktop computers
• Designed to deliver good performance to a single user at low cost, usually executing third-party software and usually incorporating a graphics display, a keyboard, and a mouse.
• Servers
• Used to run larger programs for multiple simultaneous users, typically accessed only via a network, with a greater emphasis on dependability and (often) security.
Classes of Computers
• Supercomputers
• A high performance, high cost class of servers with
hundreds to thousands of processors, terabytes of
memory and petabytes of storage that are used for
high-end scientific and engineering applications
• Embedded computers (processors)
• A computer inside another device used for running
one predetermined application
Supercomputers
• Tianhe-2 (天河-2)
• Over 3 million cores
• Power: 17.6 MW (24 MW with cooling)
• Speed: 33.86 PFLOPS (peta = $10^{15}$)
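Since this course stresses the relationship between performance and power, the two Tianhe-2 figures above can be combined into a performance-per-watt number. The sketch below is only illustrative arithmetic on the slide's values; the function name `gflops_per_watt` is made up for this example.

```c
#include <assert.h>

/* Performance per watt in GFLOPS/W, given speed in PFLOPS and power in MW.
   1 PFLOPS = 1e15 FLOPS, 1 MW = 1e6 W, 1 GFLOPS = 1e9 FLOPS.
   (Hypothetical helper, not part of the original slides.) */
double gflops_per_watt(double pflops, double megawatts)
{
    assert(megawatts > 0.0);
    double flops = pflops * 1e15;
    double watts = megawatts * 1e6;
    return flops / watts / 1e9;
}
```

Calling `gflops_per_watt(33.86, 17.6)` gives a little over 1.9 GFLOPS per watt; using the 24 MW figure that includes cooling lowers it to about 1.4.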
Embedded Computers in Your Car
PostPC Era
• Personal Mobile Device (PMD) and wearable devices.
• Where else are embedded processors found?
Growth in Cell Phone Sales (Embedded)
The Evolution of Computer Hardware
• When was the first transistor invented?
1947 – the bipolar transistor – by Bardeen et al. at Bell Laboratories
UNIVAC I (Universal Automatic Computer) – the first commercial computer in the USA
The Evolution of Computer Hardware
• When was the first IC (integrated circuit) invented?
1958, by Jack Kilby at Texas Instruments: built by hand, with several transistors, resistors, and capacitors on a single substrate.
IBM System/360: 2 MHz, 128 KB to 256 KB
The Evolution of Computer Hardware
• When was the first microprocessor introduced?
1971, Intel 4004
The Chip Manufacturing Process
a die
AMD Opteron X2 Wafer
300 mm wafer, 117 chips, 90 nm technology
Integrated Circuit Cost
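Ideal case:
$$\text{Die cost} \approx \frac{\text{Cost of wafer}}{\text{Dies per wafer}}$$
$$\text{Dies per wafer} = \frac{\pi \times (\text{Wafer diameter}/2)^2}{\text{Die area}} - \frac{\pi \times \text{Wafer diameter}}{\sqrt{2 \times \text{Die area}}} - \text{Test dies per wafer}$$
• Second term: number of dies at the edge ≈ circumference / diagonal of die.
• Third term: number of test dies used for characteristics testing.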
Integrated Circuit Cost (Die yield)
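$$\text{Die yield} = \left(1 + \frac{\text{Defects per unit area} \times \text{Die area}}{\alpha}\right)^{-\alpha}$$
• α corresponds to the number of critical processing steps in the manufacturing process.
$$\text{Die cost} = \frac{\text{Cost of wafer}}{\text{Dies per wafer} \times \text{Die yield}}$$
$$\text{Defects per unit area} = \frac{\alpha}{\text{Die area}} \left(\text{Die yield}^{-1/\alpha} - 1\right)$$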
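The die yield and die cost relations on this slide can be sketched as two small C functions. This is a minimal illustration, not a production cost model: the function names are invented here, and α is assumed to be a small positive integer so that the power can be computed by repeated multiplication.

```c
#include <assert.h>

/* Die yield = (1 + defects_per_area * die_area / alpha)^(-alpha).
   Assumption for this sketch: alpha is a small positive integer, so the
   power is computed by repeated multiplication. Both area-based inputs
   must use the same unit (e.g. defects per cm^2 and cm^2). */
double die_yield(double defects_per_area, double die_area, int alpha)
{
    assert(alpha > 0 && die_area > 0.0 && defects_per_area >= 0.0);
    double base = 1.0 + defects_per_area * die_area / alpha;
    double power = 1.0;
    for (int i = 0; i < alpha; i++)
        power *= base;
    return 1.0 / power;
}

/* Die cost = wafer cost / (dies per wafer * die yield). */
double die_cost(double wafer_cost, double dies_per_wafer, double yield)
{
    assert(dies_per_wafer > 0.0 && yield > 0.0);
    return wafer_cost / (dies_per_wafer * yield);
}
```

With no defects the yield is 1 and the die cost reduces to the ideal case, wafer cost divided by dies per wafer.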
Integrated Circuit Cost (Example)
• What is the approximate cost of a die in the wafer?
• An 8-inch wafer costs $1000
• Defect density is 1 per cm²
• Die area is 91 mm²
• Assume α = 2, test dies per wafer is 10
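$$\text{Die yield} = \left(1 + \frac{1 \times 0.91}{2}\right)^{-2} = 0.47$$
$$\text{Dies per wafer} = \frac{\pi \times (8 \times 2.54/2)^2}{0.91} - \frac{\pi \times 8 \times 2.54}{\sqrt{2 \times 0.91}} - 10$$
$$\text{Die cost} = \frac{1000}{(\text{Dies per wafer}) \times 0.47}$$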
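To finish the arithmetic of this example, the sketch below evaluates the dies-per-wafer expression and the resulting die cost. It takes the die yield of 0.47 directly from the slide. The function names are hypothetical, and `newton_sqrt` is a small helper introduced only so that the example does not depend on the math library.

```c
#include <assert.h>

#define PI 3.14159265358979323846

/* Square root by Newton's iteration; a helper so the sketch does not
   depend on the math library. */
static double newton_sqrt(double x)
{
    assert(x > 0.0);
    double r = x > 1.0 ? x : 1.0;   /* starting guess >= sqrt(x) */
    for (int i = 0; i < 60; i++)
        r = 0.5 * (r + x / r);
    return r;
}

/* Dies per wafer = pi*(d/2)^2 / A  -  pi*d / sqrt(2*A)  -  test dies,
   with wafer diameter d in cm and die area A in cm^2. */
double dies_per_wafer(double diameter_cm, double die_area_cm2, double test_dies)
{
    assert(diameter_cm > 0.0 && die_area_cm2 > 0.0);
    double radius = diameter_cm / 2.0;
    double gross = PI * radius * radius / die_area_cm2;
    double edge = PI * diameter_cm / newton_sqrt(2.0 * die_area_cm2);
    return gross - edge - test_dies;
}

/* Cost per good die for the slide's example: 8-inch wafer, $1000,
   0.91 cm^2 die, 10 test dies, die yield 0.47 (taken from the slide). */
double example_die_cost(void)
{
    double dies = dies_per_wafer(8.0 * 2.54, 0.91, 10.0);
    return 1000.0 / (dies * 0.47);
}
```

With these inputs the expression works out to roughly 299 dies per wafer and a die cost of about $7.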
Real World Examples
• Nonlinear relation to area and defect rate
• Wafer cost and area are fixed
• Defect rate determined by manufacturing process
• Die area determined by architecture and circuit design
Chip         Metal   Line width  Wafer   Defects  Area    Dies/  Yield  Die
             layers  (µm)        cost    /cm²     (mm²)   wafer         cost
386DX        2       0.90        $900    1.0      43      360    71%    $4
486DX2       3       0.80        $1200   1.0      81      181    54%    $12
PowerPC 601  4       0.80        $1700   1.3      121     115    28%    $53
HP PA 7100   3       0.80        $1300   1.0      196     66     27%    $73
DEC Alpha    3       0.70        $1500   1.2      234     53     19%    $149
SuperSPARC   3       0.70        $1700   1.6      256     48     13%    $272
Pentium      3       0.80        $1500   1.5      296     40     9%     $417
From “Estimating IC Manufacturing Costs,” by Linley Gwennap, Microprocessor Report, August 2, 1993, p. 15
Die cost goes up with the die area.
Impacts of Advancing Technology
• Processor
• logic capacity: increases about 30% per year
• performance: increases 2x every 1.5 years
• Memory
• DRAM capacity: increases 4x every 3 years, about 60% per
year
• memory speed: increases 1.5x every 10 years
• cost per bit: decreases about 25% per year
• Disk
• capacity: increases about 60% per year
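The per-year and multi-year rates on this slide are linked by compounding. As a rough check, the sketch below compounds a fixed annual rate over a whole number of years. The function name `growth_factor` is made up for this illustration.

```c
#include <assert.h>

/* Growth factor after `years` years at a fixed annual rate,
   e.g. annual_rate = 0.60 means +60% per year and
   annual_rate = -0.25 means -25% per year. */
double growth_factor(double annual_rate, int years)
{
    assert(years >= 0);
    double factor = 1.0;
    for (int i = 0; i < years; i++)
        factor *= 1.0 + annual_rate;
    return factor;
}
```

For example, `growth_factor(0.60, 3)` is 1.6 × 1.6 × 1.6 ≈ 4.1, which matches "4x every 3 years, about 60% per year" for DRAM capacity. Likewise, `growth_factor(-0.25, 3)` shows cost per bit falling to about 42% of its starting value in three years.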
Moore’s Law
• In 1965, Gordon Moore, who later co-founded Intel, predicted that the number of transistors that can be integrated on a single chip would double about every year; in 1975 he revised the rate to about every two years.
• Contributing factors: feature size and die size.
• Example: dual-core Itanium with 1.7B transistors (courtesy of Intel).
Moore’s Law for CPUs and DRAMs
From: “Facing the Hot Chips Challenge Again”, Bill Holt, Intel, presented at Hot Chips 17, 2005.
Main driver: device scaling …
Highest Clock Rate of Intel Processors
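• In CMOS (Complementary Metal-Oxide-Semiconductor) IC technology:
$$\text{Power} = \text{Capacitive load} \times \text{Voltage}^2 \times \text{Frequency}$$
• Over the period shown: power ×30, supply voltage 5 V → 1 V, clock frequency ×1000.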
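The CMOS relation on this slide (power = capacitive load × voltage² × frequency) can be applied to ratios to see why lowering the supply voltage mattered so much. The sketch below reads the slide's labels as frequency ×1000, power ×30, and voltage 5 V → 1 V. That reading is an interpretation of the annotations, and the function name is hypothetical.

```c
#include <assert.h>

/* Relative dynamic power when capacitive load, voltage and frequency each
   change by the given ratio (new / old):
   power ratio = load ratio * (voltage ratio)^2 * frequency ratio. */
double power_ratio(double load_ratio, double voltage_ratio, double freq_ratio)
{
    assert(load_ratio >= 0.0 && voltage_ratio >= 0.0 && freq_ratio >= 0.0);
    return load_ratio * voltage_ratio * voltage_ratio * freq_ratio;
}
```

With an unchanged capacitive load, `power_ratio(1.0, 1.0 / 5.0, 1000.0)` is about 40: the voltage drop alone turns a 1000-fold frequency increase into a 40-fold power increase. Under this simple model, a 30-fold increase would correspond to a capacitive load ratio of about 0.75.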
A Sea Change is at Hand
• The power challenge has forced a change in the design
of microprocessors.
• Since 2002 the rate of improvement in the response time of
programs on desktop computers has slowed from a factor of 1.5
per year to less than a factor of 1.2 per year.
• As of 2006 all desktop and server companies are
shipping microprocessors with multiple processors per
chip.
• Plan of record is to add two cores per chip per
generation (about every two years).
• Pentium 4, 2 cores, 2002-2005
• Core 2 Duo, 2-4 cores, 2006-2009
• Core i7, 4-8 cores, 2010-now
• Xeon, 1-15 cores, 1998-now
Intel Core i7 Processor
45 nm technology, 18.9 mm × 13.6 mm, 0.73 billion transistors, 2008
What is a Computer?
• Components:
• processor (datapath, control)
• input (mouse, keyboard)
• output (display, printer)
• memory (cache (SRAM), main memory (DRAM), disk drive, CD/DVD)
Four Issues about Machine Organization
• Capabilities and performance characteristics of the
principal Functional Units (FUs).
• Functional Unit: a hardware component that can perform specific
operations (functions). For example, Adders, Registers, ALU,
Shifters, Logic Units.
• The ways in which these FUs are interconnected.
• e.g., buses.
• Information flows between components.
• e.g., data is fetched from memory and transferred to the processor.
• Logic and means by which such information flow is
controlled.
Our Primary Focus
• Our primary focus: the processor (datapath and control)
and its interaction with memory systems.
• Implemented using tens/hundreds of millions of transistors.
• Impossible to understand by looking at each transistor.
• We need abstraction!
Processor Organization
• Control unit needs to have circuitry to
• Determine which instruction is next and fetch it from memory.
• Decode the instruction.
• Issue signals that control the way information flows between
datapath components.
• Control what operations the datapath’s functional units perform.
• Datapath needs to have circuitry to
• Execute instructions – functional units (e.g., adder) and storage
locations (e.g., register file).
• Interconnect the functional units so that the instructions can be
executed as required.
• Load data from and store data to memory.
Below the Program
• Application software
• Written in a high-level language, e.g., C, C++, Java, …
• System software
• Operating system – supervising program that interfaces the
user’s program with the hardware (e.g., Linux, iOS, Windows).
• Handles basic input and output operations.
• Allocates storage and memory.
• Provides for protected sharing among multiple applications.
• Compiler – translates programs written in a high-level language
(e.g., C, Java) into instructions that the hardware can execute.
Why Do We Use Higher-Level Languages?
• Higher-Level Languages
• Allow the programmer to think in a more natural language suited to the intended use (Fortran for scientific computation, Cobol for business programming, Lisp for symbol manipulation, Java for web programming, …).
• Improve programmer productivity – more understandable code
that is easier to debug and validate.
• Improve program maintainability.
• Allow programs to be independent of the computer on which
they are developed (compilers and assemblers can translate
high-level language programs to the binary instructions of any
machine).
• Emergence of optimizing compilers that produce very
efficient assembly code optimized for the target machine.
• As a result, very little programming is done today at the
assembler level.
You can become programmers programming programs that program programs!
— using AI!
Below the Program
• High-level language program (in C)
void swap(int v[], int k)
{
    int temp;
    temp = v[k];
    v[k] = v[k+1];
    v[k+1] = temp;
}
• Assembly language program (for MIPS)
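swap: sll $2, $5, 2      # $2 = k * 4 (byte offset of v[k]; k is in $5)
      add $2, $4, $2     # $2 = address of v[k] (v is in $4)
      lw  $15, 0($2)     # $15 = v[k]
      lw  $16, 4($2)     # $16 = v[k+1]
      sw  $16, 0($2)     # v[k] = old v[k+1]
      sw  $15, 4($2)     # v[k+1] = old v[k]
      jr  $31            # return to the caller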
• Machine (object) code (for MIPS)
000000 00000 00101 0001000010000000
000000 00100 00010 0001000000100000
. . .
Input Device Inputs Object Code
000000 00000 00101 0001000010000000
000000 00100 00010 0001000000100000
100011 00010 01111 0000000000000000
100011 00010 10000 0000000000000100
101011 00010 10000 0000000000000000
101011 00010 01111 0000000000000100
000000 11111 00000 0000000000001000
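The spaces in the object code above follow the MIPS field boundaries: a 6-bit opcode, two 5-bit register numbers, and a 16-bit remainder that is either an immediate or the rd/shamt/funct fields. The following sketch parses one of these bit strings and splits it into fields. The names `parse_bits`, `decode`, and `struct mips_fields` are invented for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of a 32-bit MIPS instruction word. rd/shamt/funct are meaningful
   for R-format words (opcode 0), imm for I-format words such as lw and sw. */
struct mips_fields {
    unsigned opcode;  /* bits 31..26 */
    unsigned rs;      /* bits 25..21 */
    unsigned rt;      /* bits 20..16 */
    unsigned rd;      /* bits 15..11 */
    unsigned shamt;   /* bits 10..6  */
    unsigned funct;   /* bits 5..0   */
    unsigned imm;     /* bits 15..0  */
};

/* Parse a string of '0'/'1' characters (spaces ignored) into a word. */
uint32_t parse_bits(const char *s)
{
    uint32_t word = 0;
    int count = 0;
    for (; *s != '\0'; s++) {
        if (*s == ' ')
            continue;
        assert(*s == '0' || *s == '1');
        word = (word << 1) | (uint32_t)(*s - '0');
        count++;
    }
    assert(count == 32);   /* every MIPS instruction is 32 bits wide */
    return word;
}

/* Split a word into its fields with shifts and masks. */
struct mips_fields decode(uint32_t word)
{
    struct mips_fields f;
    f.opcode = (word >> 26) & 0x3F;
    f.rs     = (word >> 21) & 0x1F;
    f.rt     = (word >> 16) & 0x1F;
    f.rd     = (word >> 11) & 0x1F;
    f.shamt  = (word >> 6)  & 0x1F;
    f.funct  = word & 0x3F;
    f.imm    = word & 0xFFFF;
    return f;
}
```

Decoding the third word, `100011 00010 01111 0000000000000000`, gives opcode 35 (lw), rs = 2, rt = 15, and an immediate of 0, which is `lw $15, 0($2)`.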
Object Code Stored in Memory
Processor Fetches an Instruction
• Processor fetches an instruction from memory.
Control Decodes the Instruction
• Control decodes the instruction to determine what to
execute.
Datapath Executes the Instruction
• Datapath executes the instruction as directed by control.
Processor Fetches the Next Instruction
• Processor fetches the next instruction from memory.
How does it know which location in memory to fetch from next?
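One way to picture the answer is a program counter (PC): a register holding the address of the next instruction. It normally advances by 4 bytes after each fetch and is overwritten by jumps such as `jr`. The sketch below is a toy model of the fetch, decode, and execute steps. It covers only the five instruction kinds used by swap, and it assumes non-negative load/store offsets. `struct machine`, `run`, `call_swap`, and the memory layout are all hypothetical choices for this illustration.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_WORDS 64

struct machine {
    uint32_t pc;               /* byte address of the next instruction */
    uint32_t reg[32];          /* general-purpose registers $0..$31 */
    uint32_t mem[MEM_WORDS];   /* one memory holding both code and data */
};

/* The seven object-code words of swap from the slides, in hexadecimal. */
static const uint32_t swap_code[7] = {
    0x00051080, /* sll $2, $5, 2  */
    0x00821020, /* add $2, $4, $2 */
    0x8C4F0000, /* lw  $15, 0($2) */
    0x8C500004, /* lw  $16, 4($2) */
    0xAC500000, /* sw  $16, 0($2) */
    0xAC4F0004, /* sw  $15, 4($2) */
    0x03E00008  /* jr  $31        */
};

/* Run until a jr is executed. */
void run(struct machine *m)
{
    for (;;) {
        assert(m->pc % 4 == 0 && m->pc / 4 < MEM_WORDS);
        uint32_t inst = m->mem[m->pc / 4];   /* fetch */
        m->pc += 4;                          /* default: next sequential instruction */

        uint32_t op = inst >> 26;            /* decode */
        uint32_t rs = (inst >> 21) & 31, rt = (inst >> 16) & 31;
        uint32_t rd = (inst >> 11) & 31, shamt = (inst >> 6) & 31;
        uint32_t funct = inst & 63, imm = inst & 0xFFFF;
        uint32_t addr = m->reg[rs] + imm;    /* used by lw and sw only */

        if (op == 0 && funct == 0) {         /* execute: sll */
            m->reg[rd] = m->reg[rt] << shamt;
        } else if (op == 0 && funct == 32) { /* add */
            m->reg[rd] = m->reg[rs] + m->reg[rt];
        } else if (op == 35) {               /* lw */
            assert(addr % 4 == 0 && addr / 4 < MEM_WORDS);
            m->reg[rt] = m->mem[addr / 4];
        } else if (op == 43) {               /* sw */
            assert(addr % 4 == 0 && addr / 4 < MEM_WORDS);
            m->mem[addr / 4] = m->reg[rt];
        } else if (op == 0 && funct == 8) {  /* jr: PC = register rs */
            m->pc = m->reg[rs];
            return;
        } else {
            assert(!"unsupported instruction");
        }
    }
}

/* Load swap at address 0 and call swap(v, k) with v[] at byte address 40. */
void call_swap(struct machine *m, uint32_t k)
{
    for (int i = 0; i < 7; i++)
        m->mem[i] = swap_code[i];
    m->reg[4] = 40;      /* $4 = address of v[0] */
    m->reg[5] = k;       /* $5 = k */
    m->reg[31] = 0x100;  /* $31 = return address (arbitrary here) */
    m->pc = 0;
    run(m);
}
```

After storing array values in `mem[10]`, `mem[11]`, and `mem[12]` and calling `call_swap(&m, 1)`, the last two words are exchanged and the PC holds the return address that was placed in `$31`.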
Output Data Stored in Memory
• At program completion the data to be output resides in
memory.
Output Device Outputs Data
00000100010100000000000000000000
00000000010011110000000000000100
00000011111000000000000000001000
The Instruction Set Architecture (ISA)
• The instruction set is a critical interface.
• The interface description separates the software from the hardware.
Instruction Set Architecture (ISA)
• ISA – the abstract interface between the hardware and
the lowest level software that includes all the information
necessary to write a machine language program,
including instructions, registers, memory access, I/O, …
• Enables implementations of varying cost and performance to run
identical software.
• The combination of the basic instruction set (the ISA)
and the operating system interface is called the
application binary interface (ABI).
• ABI – The user portion of the instruction set plus the operating
system interfaces used by application programmers. Defines a
standard for binary portability across computers.
How Do the Pieces Fit Together?
• Key Idea: levels of abstraction
[Figure: layers of abstraction, from top to bottom]
• Application (Netscape)
• Software: Compiler, Assembler, Operating System (Unix; Windows 9x)
• Instruction Set Architecture
• Hardware: Processor, Memory, I/O system
• Datapath & Control
• Digital Design
• Circuit Design
• Transistors, IC layout
• Label on the figure: CS 101
Abstractions
• Abstraction helps us deal with complexity of real systems, as
it hides unnecessary lower-level implementation details.
• Both hardware and software consist of hierarchical layers, with each
lower layer hiding details from the level above.
• One key interface between the levels of abstraction is the
instruction set architecture – the interface between the
hardware and low-level software.
• This abstract interface enables many implementations of varying
cost and performance to run identical software.
• An instruction set architecture allows computer designers to
talk about functions independently from the hardware that
performs them.
• Computer designers distinguish architecture from an
implementation of an architecture along the same lines: an
implementation is hardware that obeys the architecture abstraction.