- Measures of performance
- Introduction to an example processor SRC
- SRC: Notation
- SRC features and instruction formats
Performance testing
To test or compare the performance of machines, programs can be run and their execution times measured. However, execution speed depends on the particular program being run, and matching a test program exactly to a customer's actual needs can be quite complex. To overcome this problem, standard programs called “benchmark programs” have been devised. These programs are intended to approximate the real workload that the user will want to run on the machine. The execution time is then measured by running the same benchmark program on each machine being compared.
Commonly used measures of performance
The basic measure of performance of a machine is time. Some commonly used measures of this time, used to compare the performance of various machines, are:
- Execution time
- MIPS
- MFLOPS
- Whetstones
- Dhrystones
- SPEC
Execution time
Execution time is simply the time it takes a processor to execute a given program. The time a particular program takes depends on a number of factors other than the performance of the CPU, most of which are ignored in this measure. These factors include waits for I/O, instruction fetch times, pipeline delays, etc.
The execution time of a program with respect to the processor is defined as
Execution Time = IC × CPI × T
where
IC = instruction count
CPI = average number of system clock periods needed to execute an instruction
T = clock period
Strictly speaking, (IC × CPI) should be the sum of the clock periods needed to execute each instruction. Manufacturers usually provide this information for each instruction in the instruction set. Using the average is a simplification.
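The relationship between the exact sum and the average-CPI form can be sketched in a few lines of Python. The instruction mix, the per-instruction cycle counts and the 1 ns clock period below are hypothetical values chosen only for illustration; they do not describe any real machine.

```python
def execution_time(ic, cpi, t):
    """Execution time in seconds from IC x CPI x T."""
    return ic * cpi * t


def execution_time_exact(counts, cycles, t):
    """Sum the clock periods of every executed instruction, then scale by T."""
    total_cycles = sum(counts[op] * cycles[op] for op in counts)
    return total_cycles * t


# Hypothetical instruction mix: how many times each class was executed.
counts = {"alu": 600, "load": 300, "branch": 100}
# Hypothetical clock periods per instruction class.
cycles = {"alu": 1, "load": 3, "branch": 2}
t = 1e-9  # assumed clock period of 1 ns

ic = sum(counts.values())
total_cycles = sum(counts[op] * cycles[op] for op in counts)
average_cpi = total_cycles / ic  # the average that the simplified formula uses

exact = execution_time_exact(counts, cycles, t)
approx = execution_time(ic, average_cpi, t)
print(ic, average_cpi, exact, approx)
```

When the average CPI is computed from the same instruction mix, both forms give the same result; the simplification only loses accuracy when an average taken from one workload is applied to a different one.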
MIPS (Millions of Instructions per Second)
Another measure of performance is the number of millions of instructions executed by the processor per second. It is defined as
MIPS = IC / (ET × 10^6)
where ET is the execution time in seconds. This measure is not a very accurate basis for comparing different processors because of architectural differences between machines: some machines require more instructions than others to perform the same job. For example, RISC machines have simpler instructions, so the same job will require more instructions. This measure of performance was popular in the late 70s and early 80s, when the VAX 11/780 was treated as a reference.
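A small Python sketch makes the weakness concrete. The two machines and their instruction counts are hypothetical: both are assumed to finish the same job in the same time, with the RISC-style machine executing twice as many (simpler) instructions.

```python
def mips(ic, et):
    """MIPS = IC / (ET x 10^6), with ET in seconds."""
    return ic / (et * 10**6)


# Hypothetical figures for one job on two machines.
et = 0.5                 # both machines take 0.5 s
risc_ic = 100_000_000    # more, simpler instructions
cisc_ic = 50_000_000     # fewer, more complex instructions

risc_mips = mips(risc_ic, et)
cisc_mips = mips(cisc_ic, et)
print(risc_mips, cisc_mips)
```

Under these assumptions the first machine reports twice the MIPS rating even though neither machine finishes the job sooner, which is why MIPS alone is a poor basis for comparing different architectures.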
MFLOPS (Millions of Floating-Point Operations per Second)
For computation-intensive applications, the rate of floating-point instruction execution is a better measure than the rate of simple instructions. The measure MFLOPS was devised with this in mind. This measure has two advantages over MIPS:
- Floating point operations are complex, and therefore, provide a better picture of the hardware capabilities on which they are run
- Overheads (operand fetch from memory, result storage to the memory, etc.) are effectively lumped with the floating point operations they support
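The text gives no formula for MFLOPS; the sketch below assumes the usual definition, floating-point operations divided by (execution time × 10^6). It times a dot product, counting one multiply and one add per element, so the loop overhead and memory accesses are lumped into the floating-point rate as described above. The helper names and the vector length are illustrative, and in Python the interpreter overhead dominates, so the printed number says little about the hardware itself.

```python
import time


def mflops(flop_count, seconds):
    """Assumed definition: floating-point operations / (time x 10^6)."""
    return flop_count / (seconds * 10**6)


def dot(a, b):
    """Dot product: one floating-point multiply and one add per element."""
    total = 0.0
    for x, y in zip(a, b):
        total += x * y
    return total


n = 200_000              # illustrative vector length
a = [1.0] * n
b = [2.0] * n

start = time.perf_counter()
result = dot(a, b)
elapsed = time.perf_counter() - start

flop_count = 2 * n       # n multiplies plus n adds
rate = mflops(flop_count, elapsed)
print(f"{rate:.1f} MFLOPS (including loop and memory overhead)")
```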
Whetstones
Whetstone was the first program developed specifically as a benchmark for performance measurement. Named after the Whetstone Algol compiler, it was developed using statistics collected during the compiler's development. It was originally an Algol program, but it has been ported to FORTRAN, Pascal and C. This benchmark is specifically designed to test floating-point instructions. Performance is stated in MWIPS (millions of Whetstone instructions per second).
Dhrystones
Developed in 1984, this is a small benchmark program that measures the integer instruction performance of processors, as opposed to the Whetstone's emphasis on floating-point instructions. It is a very small program, about a hundred high-level-language statements, and compiles to about 1 to 1½ kilobytes of code.
Disadvantages of using Whetstones and Dhrystones
Both Whetstones and Dhrystones are now considered obsolete for the following reasons:
- Small, fit in cache
- Obsolete instruction mix
- Prone to compiler tricks
- Difficult to reproduce results
- Uncontrolled source code
SPEC
SPEC, the System Performance Evaluation Cooperative, is an association of computer companies formed to define standard benchmarks for fair evaluation and comparison of different processors. The standard SPEC benchmark suite includes:
- A compiler
- A Boolean minimization program
- A spreadsheet program
- A number of other programs that stress arithmetic processing speed
Advantages
- It provides for ease of publication.
- Each benchmark carries the same weight.
- SPEC ratio is dimensionless.
- It is not unduly influenced by long running programs.
- It is relatively immune to performance variation on individual benchmarks.
- It provides a consistent and fair metric.
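Several of these properties follow from how SPEC results are usually summarized. The text does not spell out the calculation; the sketch below assumes the common practice of dividing a reference machine's time by the measured time for each benchmark (the SPEC ratio) and combining the ratios with a geometric mean. All times are hypothetical.

```python
import math


def spec_ratios(reference_times, measured_times):
    """Ratio of reference time to measured time; seconds cancel out."""
    return [ref / meas for ref, meas in zip(reference_times, measured_times)]


def geometric_mean(values):
    """n-th root of the product, computed through logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical times in seconds for three benchmarks.
reference = [100.0, 400.0, 900.0]   # assumed reference machine
measured = [50.0, 100.0, 100.0]     # assumed machine under test

ratios = spec_ratios(reference, measured)
score = geometric_mean(ratios)

# Making the longest benchmark ten times longer on both machines
# leaves the ratios, and therefore the score, unchanged.
longer_reference = [100.0, 400.0, 9000.0]
longer_measured = [50.0, 100.0, 1000.0]
longer_score = geometric_mean(spec_ratios(longer_reference, longer_measured))
print(ratios, score, longer_score)
```

Because each ratio is a time divided by a time, the result is dimensionless, and because the geometric mean multiplies the ratios, every benchmark contributes with the same weight regardless of how long it runs.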
An example machine is introduced here to facilitate our understanding of various design steps and concepts in computer architecture. This example machine is quite simple, and leaves out a lot of details of a real machine, yet it is complex enough to illustrate the fundamentals.
SRC Introduction
Attributes of the SRC
- The SRC contains 32 General Purpose Registers: R0, R1, …, R31; each register is of size 32-bits.
- Two special purpose registers are included: Program Counter (PC) and Instruction Register (IR)
- Memory word size is 32 bits
- Memory space size is 2^32 bytes
- Memory organization is 2^32 × 8 bits; that is, the memory is byte-addressable
- Memory is accessed in 32-bit words (i.e., 4-byte chunks)
- Big-endian byte storage is used
The figure shows the attributes of the SRC: the 32 32-bit registers that are part of the CPU, the two additional CPU registers (PC and IR), and the main memory, which consists of 2^32 1-byte cells.
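The attributes listed above can be modelled with a small Python class. This is only a sketch: the class name `SRCState`, the method names, the sparse dictionary used for memory and the choice to read unwritten bytes as 0 are all assumptions made for illustration, not part of the SRC definition.

```python
class SRCState:
    """Toy model of the SRC's registers and memory (illustrative only)."""

    NUM_REGISTERS = 32
    WORD_MASK = 0xFFFFFFFF    # registers and memory words are 32 bits wide
    MEMORY_BYTES = 2**32      # 2^32 one-byte cells

    def __init__(self):
        self.R = [0] * self.NUM_REGISTERS   # general purpose registers R0..R31
        self.PC = 0                         # program counter
        self.IR = 0                         # instruction register
        # Sparse storage is a choice of this sketch: only written bytes
        # are kept, and unwritten bytes are assumed to read as 0.
        self.memory = {}

    def _check(self, address, size=1):
        if not 0 <= address <= self.MEMORY_BYTES - size:
            raise IndexError(f"address out of range: {address}")

    def write_register(self, index, value):
        self.R[index] = value & self.WORD_MASK

    def read_byte(self, address):
        self._check(address)
        return self.memory.get(address, 0)

    def write_byte(self, address, value):
        self._check(address)
        self.memory[address] = value & 0xFF

    def read_word(self, address):
        """Read 4 bytes; big-endian, so the lowest address is most significant."""
        self._check(address, 4)
        data = bytes(self.memory.get(address + i, 0) for i in range(4))
        return int.from_bytes(data, "big")

    def write_word(self, address, value):
        """Store a 32-bit word as 4 bytes, most significant byte first."""
        self._check(address, 4)
        data = (value & self.WORD_MASK).to_bytes(4, "big")
        for i, byte in enumerate(data):
            self.memory[address + i] = byte


src = SRCState()
src.write_word(8, 0x12345678)
print([hex(src.read_byte(8 + i)) for i in range(4)])
```

With big-endian storage, writing the word 0x12345678 at address 8 places 0x12 at address 8 and 0x78 at address 11.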
SRC Notation
We examine the notation used for the SRC with the help of some examples.
- R[3] means contents of register 3 (R for register)
- M[8] means contents of memory location 8 (M for memory)
- A memory word at address 8 is defined as the 32 bits at addresses 8, 9, 10 and 11 in the memory. This is shown in the figure.
- A special notation for 32-bit memory words is
Some more SRC Attributes
- All instructions are 32 bits long (i.e., instruction size is 1 word)
- All ALU instructions have three operands
- The only way to access memory is through load and store operations
- Only a few addressing modes are supported
SRC: Instruction Formats
Four types of instructions are supported by the SRC. Their representation is given in the figure shown. Before discussing these instruction types in detail, we take a look at the encoding of general purpose registers (the ra, rb and rc fields).
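Since there are 32 general purpose registers, each register field needs 5 bits to name one of them. The Python sketch below shows how such fields could be packed into and extracted from a 32-bit instruction word. The bit positions used here (opcode in bits 31..27, then ra, rb and rc in consecutive 5-bit fields) are an assumption for illustration only; the actual layout is the one given in the figure, which is not reproduced in this text. The opcode value 12 is arbitrary.

```python
def field(word, high, low):
    """Extract bits high..low (inclusive) from a 32-bit word."""
    width = high - low + 1
    return (word >> low) & ((1 << width) - 1)


def encode(op, ra, rb, rc):
    """Pack fields using the ASSUMED layout: op 31..27, ra 26..22, rb 21..17, rc 16..12."""
    for name, value in (("op", op), ("ra", ra), ("rb", rb), ("rc", rc)):
        if not 0 <= value < 32:
            raise ValueError(f"{name} must fit in 5 bits")
    return (op << 27) | (ra << 22) | (rb << 17) | (rc << 12)


def decode(word):
    """Unpack the same assumed layout back into its fields."""
    return {
        "op": field(word, 31, 27),
        "ra": field(word, 26, 22),
        "rb": field(word, 21, 17),
        "rc": field(word, 16, 12),
    }


# Arbitrary example: some opcode with ra = R1, rb = R2, rc = R3.
word = encode(op=12, ra=1, rb=2, rc=3)
fields = decode(word)
print(f"{word:032b}", fields)
```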
Next Part is General Purpose Register
Instruction of SRC Processor, reviewed by MCH on September 08, 2014