Computer Architecture: A Quantitative Approach, 4th Edition
John L. Hennessy, David A. Patterson
Format: PDF / Kindle (mobi) / ePub
The era of seemingly unlimited growth in processor performance is over: single chip architectures can no longer overcome the performance limitations imposed by the power they consume and the heat they generate. Today, Intel and other semiconductor firms are abandoning the single fast processor model in favor of multi-core microprocessors--chips that combine two or more processors in a single package. In the fourth edition of Computer Architecture, the authors focus on this historic shift, increasing their coverage of multiprocessors and exploring the most effective ways of achieving parallelism as the key to unlocking the power of multiple processor architectures. Additionally, the new edition has expanded and updated coverage of design topics beyond processor performance, including power, reliability, availability, and dependability.
CD System Requirements
The CD material includes PDF documents that you can read with a PDF viewer such as Adobe, Acrobat or Adobe Reader. Recent versions of Adobe Reader for some platforms are included on the CD.
The content is designed to be viewed in a browser window that is at least 720 pixels wide. You may find the content does not display well if your display is not set to at least 1024x768 pixel resolution.
This CD can be used under any operating system that includes an HTML browser and a PDF viewer. This includes Windows, Mac OS, and most Linux and Unix systems.
Increased coverage on achieving parallelism with multiprocessors.
Case studies of latest technology from industry including the Sun Niagara Multiprocessor, AMD Opteron, and Pentium 4.
Three review appendices, included in the printed volume, review the basic and intermediate principles the main text relies upon.
Eight reference appendices, collected on the CD, cover a range of topics including specific architectures, embedded systems, application specific processors--some guest authored by subject experts.
Next-Generation Applied Intelligence: 22nd International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems, IEA/AIE 2009, Tainan, Taiwan, June 2009, Proceedings
important response time requirements. (Chapter 6 discusses some file and I/O system benchmarks in detail.) SPECWeb is a Web server benchmark that simulates multiple clients requesting both static and dynamic pages from a server, as well as clients posting data to the server. Transaction-processing (TP) benchmarks measure the ability of a system to handle transactions, which consist of database accesses and updates. Airline reservation systems and bank ATM systems are typical simple examples of
performance can be understood. In the ideal case, the suite resembles a statistically valid sample of the application space, but such a sample requires more benchmarks than are typically found in most suites and requires a randomized sampling, which essentially no benchmark suite uses. Once we have chosen to measure performance with a benchmark suite, we would like to be able to summarize the performance results of the suite in a single number. A straightforward approach to computing a summary
SpeedupFP = ----------------------------------- = ---------------- = 1.23 0.5 0.8125 ( 1 – 0.5 ) + ------1.6 1.9 Quantitative Principles of Computer Design ■ 41 Improving the performance of the FP operations overall is slightly better because of the higher frequency. Amdahl’s Law is applicable beyond performance. Let’s redo the reliability example from page 27 after improving the reliability of the power supply via redundancy from 200,000-hour to 830,000,000-hour MTTF, or 4150X better.
and instruction count to segments of the code, which can be helpful to programmers trying to understand and tune the performance of an application. Often, a designer or programmer will want to understand performance at a more fine-grained level than what is available from the hardware counters. For example, they may want to know why the CPI is what it is. In such cases, simulation techniques like those used for processors that are being designed are used. 1.10 Putting It All Together:
in the buffer). ■ Hit rate in the buffer is 90% (for branches predicted taken). We compute the penalty by looking at the probability of two events: the branch is predicted taken but ends up being not taken, and the branch is taken but is not found in the buffer. Both carry a penalty of 2 cycles. Probability (branch in buffer, but actually not taken) = = Probability (branch not in buffer, but actually taken) = Branch penalty = Branch penalty = Percent buffer hit rate × Percent incorrect