Concepts in Brief

Short theory for the written 5-mark question. Know the definitions and the "why".

Response time (latency) = how long one job takes; the individual user cares. Throughput = jobs per second / total work; the systems manager cares. Upgrading to a faster CPU improves both response time and throughput; adding machines improves throughput only.
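A minimal sketch of the distinction, assuming identical independent jobs and machines that do not interfere with each other. The function name and the numbers are illustrative, not taken from the course material.

```python
def latency_and_throughput(seconds_per_job, machines):
    """Return (response time per job in s, throughput in jobs/s)."""
    response_time = seconds_per_job           # one job still runs on one machine
    throughput = machines / seconds_per_job   # machines process separate jobs in parallel
    return response_time, throughput

base = latency_and_throughput(10.0, machines=1)           # (10.0, 0.1)
more_machines = latency_and_throughput(10.0, machines=2)  # (10.0, 0.2)
faster_cpu = latency_and_throughput(5.0, machines=1)      # (5.0, 0.2)
print(base, more_machines, faster_cpu)
```

Under these assumptions the second machine doubles throughput but leaves each job's response time unchanged, while the faster CPU shortens the response time itself.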

Elapsed time = user CPU time + system CPU time + wait time (disk, I/O, other programs). Performance work focuses on user CPU time — time spent executing your program's own code.
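The breakdown can be checked with simple arithmetic. This sketch uses illustrative numbers in the style of Unix `time` output (90.7 s user, 12.9 s system, 2:39 elapsed); the helper names are hypothetical.

```python
def wait_time(elapsed, user_cpu, system_cpu):
    """Time not spent on the CPU: disk, I/O, other programs."""
    return elapsed - (user_cpu + system_cpu)

def cpu_fraction(elapsed, user_cpu, system_cpu):
    """Share of elapsed time in which the CPU worked on this program."""
    return (user_cpu + system_cpu) / elapsed

elapsed = 2 * 60 + 39          # 2:39 wall-clock = 159 s
user, system = 90.7, 12.9      # seconds of user and system CPU time

print(round(wait_time(elapsed, user, system), 1))        # 55.4
print(round(100 * cpu_fraction(elapsed, user, system)))  # 65
```

Here roughly a third of the elapsed time is waiting, which no amount of tuning the program's own code would remove.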

Why #cycles ≠ #instructions: different instructions take different numbers of cycles (multiply > add, floating-point > integer, memory access > register access). Changing the cycle time often changes how many cycles an instruction needs, because it means changing the hardware design.
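A small sketch of how the cycle count follows from the instruction mix. The per-class cycle costs and counts below are made-up values for illustration, not figures for any real processor.

```python
def total_cycles(counts, cpi):
    """Sum of (instructions in class) x (cycles per instruction of that class)."""
    return sum(counts[c] * cpi[c] for c in counts)

# Hypothetical cycle costs per instruction class.
cpi = {"add": 1, "multiply": 3, "load": 4}
# Hypothetical instruction mix of one program.
counts = {"add": 50, "multiply": 20, "load": 30}

cycles = total_cycles(counts, cpi)     # 50*1 + 20*3 + 30*4 = 230
instructions = sum(counts.values())    # 100
average_cpi = cycles / instructions    # 2.3

print(cycles, instructions, average_cpi)
```

The program executes 100 instructions but needs 230 cycles, so the average CPI is 2.3 rather than 1.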

Three sources of higher CPU performance: (1) increase clock rate, (2) improve processor organization to lower CPI, (3) compiler improvements that lower instruction count and/or average CPI.
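The three levers map onto the standard relation CPU time = instruction count × CPI / clock rate. The sketch below changes one factor at a time with invented numbers; it treats the factors as independent, which is a simplification, since a faster clock can raise CPI.

```python
def cpu_time(instruction_count, cpi, clock_rate_hz):
    """CPU time in seconds = instructions x cycles per instruction / cycles per second."""
    return instruction_count * cpi / clock_rate_hz

baseline = cpu_time(1e9, 2.0, 2e9)           # 1.0 s
faster_clock = cpu_time(1e9, 2.0, 4e9)       # (1) higher clock rate   -> 0.5 s
lower_cpi = cpu_time(1e9, 1.5, 2e9)          # (2) better organization -> 0.75 s
better_compiler = cpu_time(0.8e9, 2.0, 2e9)  # (3) fewer instructions  -> 0.8 s

print(baseline, faster_clock, lower_cpi, better_compiler)
```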

SPEC benchmark: a standard program set (Standard Performance Evaluation Corp.) for comparing computers. SPEC CPU2006 = 12 integer (CINT2006) + 17 floating-point (CFP2006) benchmarks.

Fallacy: "designing for performance and for energy efficiency are unrelated goals" (wrong: the two are linked). Pitfall: expecting an improvement to one part to speed up the whole proportionally; Amdahl's Law (a law of diminishing returns) shows the overall gain is limited by the fraction of time the improved part is actually used. Power can be reduced by lowering clock frequency; power is now the key limit on performance, especially for battery / embedded devices.
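A minimal sketch of Amdahl's Law, assuming a program that spends 80% of its time in the part being improved. The 80% figure and the function name are illustrative.

```python
def amdahl_speedup(fraction_improved, improvement_factor):
    """Overall speedup when only a fraction of the execution time gets faster."""
    unaffected = 1.0 - fraction_improved
    return 1.0 / (unaffected + fraction_improved / improvement_factor)

# Making 80% of the program 2x faster gives about 1.67x overall, not 2x.
print(amdahl_speedup(0.8, 2))

# Even a huge improvement to that 80% cannot exceed 1 / 0.2 = 5x overall.
print(amdahl_speedup(0.8, 1e9))
```

The untouched 20% sets a hard ceiling, which is why the returns diminish as the improvement factor grows.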

Two traps the exam loves:
• MIPS can rank a machine "faster" when its execution time is actually slower.
• Instruction count alone never decides performance — a sequence with more instructions can still run in fewer cycles and finish first.
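Both traps can be reproduced with a toy machine. The sketch assumes three instruction classes A, B, C costing 1, 2 and 3 cycles and a 100 MHz clock; all instruction counts are invented for illustration.

```python
CPI = {"A": 1, "B": 2, "C": 3}   # assumed cycles per instruction class

def cycles(mix):
    return sum(n * CPI[c] for c, n in mix.items())

def exec_time(mix, clock_hz):
    return cycles(mix) / clock_hz

def mips(mix, clock_hz):
    """Millions of instructions executed per second."""
    return sum(mix.values()) / (exec_time(mix, clock_hz) * 1e6)

clock = 100e6  # 100 MHz

# Trap 1: the second program has the higher MIPS rating but runs longer.
prog1 = {"A": 5_000_000, "B": 1_000_000, "C": 1_000_000}
prog2 = {"A": 10_000_000, "B": 1_000_000, "C": 1_000_000}
print(mips(prog1, clock), exec_time(prog1, clock))  # about 70 MIPS, 0.10 s
print(mips(prog2, clock), exec_time(prog2, clock))  # about 80 MIPS, 0.15 s

# Trap 2: the second sequence has more instructions but needs fewer cycles.
seq1 = {"A": 2, "B": 1, "C": 2}   # 5 instructions, 10 cycles
seq2 = {"A": 4, "B": 1, "C": 1}   # 6 instructions, 9 cycles
print(sum(seq1.values()), cycles(seq1))
print(sum(seq2.values()), cycles(seq2))
```

In both cases execution time (cycles divided by clock rate) is the measure that decides, not MIPS or instruction count.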