We have learned that all computers have similar capabilities and perform essentially the same functions, although some might be faster than others. We have also learned that a computer system has input, output, storage, and processing components; that the processor is the “intelligence” of a computer system; and that a single computer system may have several processors. We have discussed how data are represented inside a computer system in electronic states called bits. We are now ready to expose the inner workings of the nucleus of the computer system—the processor.

The internal operation of a computer is interesting, but there really is no mystery to it. The mystery is in the minds of those who listen to hearsay and believe science-fiction writers. The computer is a nonthinking electronic device that has to be plugged into an electrical power source, just like a toaster or a lamp.

Literally hundreds of different types of computers are marketed by scores of manufacturers[1]. The complexity of each type may vary considerably, but in the end each processor, sometimes called the central processing unit or CPU, has only two fundamental sections: the control unit and the arithmetic and logic unit. Primary storage also plays an integral part in the internal operation of a processor. These three components (primary storage, the control unit, and the arithmetic and logic unit) work together. Let's look at their functions and the relationships between them.

Unlike magnetic secondary storage devices, such as tape and disk, primary storage has no moving parts. With no mechanical movement, data can be accessed from primary storage at electronic speeds, or close to the speed of light. Most of today's computers use DRAM (Dynamic Random-Access Memory) technology for primary storage. A state-of-the-art DRAM chip about one eighth the size of a postage stamp[2] can store about 256,000,000 bits, or over 25,600,000 characters of data!

Primary storage, or main memory, provides the processor with temporary storage for programs and data. All programs and data must be transferred to primary storage from an input device (such as a VDT) or from secondary storage (such as a disk) before programs can be executed or data can be processed. Primary storage space is always at a premium; therefore, after a program has been executed, the storage space it occupied is reallocated to another program awaiting execution.

Figure 1-1 illustrates how all input/output (I/O) is “read to” or “written from” primary storage. In the figure, an inquiry (input) is made on a VDT. The inquiry, in the form of a message, is routed to primary storage over a channel (such as a coaxial cable). The message is interpreted, and the processor initiates action to retrieve the appropriate program and data from secondary storage[3]. The program and data are “loaded”, or moved, to primary storage from secondary storage. This is a nondestructive read process. That is, the program and data that are read reside in both primary storage (temporarily) and secondary storage (permanently). The data are manipulated according to program instructions, and a report is written from primary storage to a printer.

Figure 1-1 Interaction Between Primary Storage and Computer System Components. All programs and data must be transferred from an input device or from secondary storage before programs can be executed and data can be processed. During processing, instructions and data are passed between the various types of internal memories, the control unit, and the arithmetic and logic unit. Output is transferred to the printer from primary storage.

A program instruction or a piece of data is stored in a specific primary storage location called an address. Addresses permit program instructions and data to be located, accessed, and processed. The content of each address is constantly changing as different programs are executed and new data are processed.
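The idea of an address can be pictured with a toy model in which primary storage is a row of numbered cells. This is only a sketch: the memory size, the `store` and `load` helper names, and the values used are illustrative assumptions, not part of any real machine.

```python
# Toy model: primary storage as a row of numbered cells (addresses).
# MEMORY_SIZE and the stored values are illustrative assumptions.
MEMORY_SIZE = 16
memory = [0] * MEMORY_SIZE  # one cell per address, 0 through 15


def store(address, value):
    """Write a value into the cell at the given address."""
    memory[address] = value


def load(address):
    """Read the value held at the given address; the cell is unchanged."""
    return memory[address]


store(5, 42)          # one program leaves a piece of data at address 5
first = load(5)
store(5, 7)           # later, another program reuses the same address
second = load(5)
print(first, second)  # 42 7
```

The same address holds different contents at different times, which is the sense in which the content of each address is constantly changing.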

Another name for primary storage is random-access memory, or RAM. A special type of primary storage, called read-only memory (ROM), cannot be altered by the programmer. The contents of ROM are “hard-wired” (designed into the logic of the memory chip) by the manufacturer and can be “read only”. When you turn on a microcomputer system, a program in ROM automatically readies the computer system for use. Then the ROM program produces the initial display screen prompt.

A variation of ROM is programmable read-only memory (PROM). PROM is ROM into which you, the user, can load “read-only” programs and data. Once a program is loaded to PROM, it is seldom, if ever, changed[4]. However, if you need to be able to revise the contents of PROM, there is EPROM, erasable PROM. Before a write operation on EPROM, all the storage cells must be erased to the same initial state.

A more attractive form of read-mostly memory is electrically erasable programmable read-only memory (EEPROM). It can be written into at any time without erasing prior contents; only the byte or bytes addressed are updated[5].

The EEPROM combines the advantage of nonvolatility with the flexibility of being updatable in place[6], using ordinary bus control, address, and data lines.
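The difference between the two update styles can be sketched in a few lines. This is a simplified model, not a description of real chip behavior: the class names, the `ERASED` value, and the method names are assumptions made for illustration.

```python
ERASED = 0xFF  # assumed value of an erased cell in this sketch


class Eprom:
    """Model of EPROM: the whole chip is erased before it is rewritten."""

    def __init__(self, size):
        self.cells = [ERASED] * size
        self.blank = True

    def erase_all(self):
        self.cells = [ERASED] * len(self.cells)
        self.blank = True

    def program(self, data):
        if not self.blank:
            raise RuntimeError("erase the whole chip before writing")
        if len(data) > len(self.cells):
            raise ValueError("data does not fit")
        self.cells[: len(data)] = data
        self.blank = False


class Eeprom:
    """Model of EEPROM: only the addressed byte is updated."""

    def __init__(self, size):
        self.cells = [ERASED] * size

    def write_byte(self, address, value):
        self.cells[address] = value


eprom = Eprom(4)
eprom.program([1, 2, 3, 4])
eprom.erase_all()              # required before any revision
eprom.program([1, 9, 3, 4])    # rewrite everything to change one byte

eeprom = Eeprom(4)
for address, value in enumerate([1, 2, 3, 4]):
    eeprom.write_byte(address, value)
eeprom.write_byte(1, 9)        # revise a single byte in place

print(eprom.cells, eeprom.cells)  # [1, 9, 3, 4] [1, 9, 3, 4]
```

Both memories end up with the same contents, but the EPROM model had to erase and reload every cell to change one of them.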

Another form of semiconductor memory is flash memory (so named because of the speed with which it can be reprogrammed). Flash memory is intermediate between EPROM and EEPROM in both cost and functionality. Like EEPROM, flash memory uses an electrical erasing technology. An entire flash memory can be erased in one or a few seconds, which is much faster than EPROM. In addition, it is possible to erase just blocks of memory rather than an entire chip. However, flash memory does not provide byte-level erasure[7]. Like EPROM, flash memory uses only one transistor per bit, and so achieves the high density of EPROM.

Cache Memory

Programs and data are loaded to RAM from secondary storage because the time required to access a program instruction or piece of data from RAM is significantly less than from secondary storage. Thousands of instructions or pieces of data can be accessed from RAM in the time it would take to access a single piece of data from disk storage[8]. RAM is essentially a high-speed holding area for data and programs. In fact, nothing really happens in a computer system until the program instructions and data are moved to the processor. This transfer of instructions and data to the processor can be time-consuming, even at microsecond speeds. To facilitate an even faster transfer of instructions and data to the processor, most computers are designed with cache memory. Cache memory is employed by computer designers to increase the computer system throughput (the rate at which work is performed).
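A small simulation can show why a cache raises throughput. The sketch below is an illustration under stated assumptions: the `TinyCache` class is hypothetical, and the least-recently-used replacement policy, the capacity of four entries, and the access pattern are choices made for the example rather than anything the text prescribes.

```python
from collections import OrderedDict


class TinyCache:
    """A small holding area in front of a larger, slower memory."""

    def __init__(self, capacity, backing):
        self.capacity = capacity
        self.backing = backing        # stands in for RAM
        self.lines = OrderedDict()    # address -> value, oldest first
        self.hits = 0
        self.misses = 0

    def read(self, address):
        if address in self.lines:
            self.hits += 1
            self.lines.move_to_end(address)   # mark as recently used
            return self.lines[address]
        self.misses += 1
        value = self.backing[address]         # slow path: fetch from RAM
        self.lines[address] = value
        if len(self.lines) > self.capacity:
            self.lines.popitem(last=False)    # evict least recently used
        return value


ram = list(range(100, 164))                   # 64 made-up values
cache = TinyCache(capacity=4, backing=ram)

# A program loop that keeps touching the same few addresses.
for _ in range(10):
    for address in (0, 1, 2, 3):
        cache.read(address)

print(cache.hits, cache.misses)               # 36 4
```

Only the first pass through the loop goes to RAM; the remaining 36 reads are served from the cache.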

Like RAM, cache is a high-speed holding area for program instructions and data. However, cache memory uses SRAM (Static RAM) technology that is about 10 times faster than RAM and about 100 times more expensive. With only a fraction of the capacity of RAM, cache memory holds only those instructions and data that are likely to be needed next by the processor. Two types of cache memory appear widely in computers. The first is referred to as internal cache and is built into the CPU chip. The second, external cache, is located on chips placed close to the CPU chip. A computer can have several different levels of cache memory. Level 1 cache is virtually always built into the chip. Level 2 cache used to be external cache but is now typically also built into the CPU like level 1 cache.

Figure 1-2 Inside a typical PC system unit. The system unit houses the CPU, memory, and other important pieces of hardware.

Note: This section mainly covers the computer's CPU, main memory, and flash memory.

1.2 Bus Interconnection

A bus is a communication pathway connecting two or more devices. A key characteristic of a bus is that it is a shared transmission medium[9]. Multiple devices connect to the bus, and a signal transmitted by any one device is available for reception by all other devices attached to the bus. If two devices transmit during the same time period, their signals will overlap and become garbled. Thus, only one device at a time can successfully transmit.

Typically, a bus consists of multiple communication pathways, or lines. Each line is capable of transmitting signals representing binary 1 and binary 0. Over time, a sequence of binary digits can be transmitted across a single line. Taken together[10], several lines of a bus can be used to transmit binary digits simultaneously (in parallel). For example, an 8-bit unit of data can be transmitted over eight bus lines.
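The 8-bit example can be made concrete with a short sketch that places one bit on each of eight lines and then reassembles the value at the receiving end. The function names and the choice of line 0 as the least significant bit are assumptions made for illustration.

```python
def to_lines(value, width=8):
    """Split a value into one bit per bus line (line 0 = least significant)."""
    return [(value >> line) & 1 for line in range(width)]


def from_lines(bits):
    """Reassemble the value from the signals seen on the lines."""
    return sum(bit << line for line, bit in enumerate(bits))


signals = to_lines(0b01000001)   # the byte 0x41
print(signals)                   # [1, 0, 0, 0, 0, 0, 1, 0]
print(from_lines(signals))       # 65
```

All eight signals are present on the bus at the same moment, so the whole byte moves in a single step.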

Computer systems contain a number of different buses that provide pathways between components at various levels of the computer system hierarchy[11]. A bus that connects major computer components (processor, memory, I/O) is called a system bus.

A system bus typically consists of 50 to 100 separate lines. Each line is assigned a particular meaning or function. Although there are many different bus designs, on any bus the lines can be classified into three functional groups (Figure 1-3): data, address, and control lines. In addition, there may be power distribution lines that supply power to the attached modules[12].

Figure 1-3 Bus Interconnection Scheme

The data lines provide a path for moving data between system modules. These lines, collectively, are called the data bus. The data bus typically consists of 8, 16, or 32 separate lines, the number of lines being referred to as the width of the data bus[13]. Because each line can carry only 1 bit at a time, the number of lines determines how many bits can be transferred at a time. The width of the data bus is a key factor in determining overall system performance.
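As a rough illustration of why width matters, the sketch below counts how many bus transfers are needed to move a 64-bit item over data buses of different widths. The 64-bit item size and the function name are assumptions chosen for the example.

```python
import math


def transfers_needed(total_bits, bus_width):
    """Number of bus transfers needed to move total_bits over the data bus."""
    return math.ceil(total_bits / bus_width)


for width in (8, 16, 32):
    print(f"{width}-bit bus: {transfers_needed(64, width)} transfers")
# 8-bit bus: 8 transfers
# 16-bit bus: 4 transfers
# 32-bit bus: 2 transfers
```

Doubling the width halves the number of transfers for the same amount of data.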

The address lines are used to designate the source or destination of the data on the data bus. For example, if the processor wishes to read a word of data from memory, it puts the address of the desired word on the address lines. Clearly, the width of the address bus determines the maximum possible memory capacity of the system.
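The relationship is simple arithmetic: n address lines can form 2^n distinct addresses. The short sketch below works this out for a few widths; the widths shown are examples only, and the size of each addressable location (a byte or a word) is left open.

```python
def addressable_locations(address_lines):
    """Number of distinct addresses that n address lines can express."""
    return 2 ** address_lines


for lines in (16, 20, 32):
    print(f"{lines} address lines: {addressable_locations(lines):,} locations")
# 16 address lines: 65,536 locations
# 20 address lines: 1,048,576 locations
# 32 address lines: 4,294,967,296 locations
```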

The control lines are used to control the access to and the use of the data and address lines[14]. Because the data and address lines are shared by all components, there must be a means of controlling their use. Control signals transmit both command and timing information between system modules. Timing signals indicate the validity of data and address information. Command signals specify operations to be performed.
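How the three groups of lines cooperate can be sketched as a single bus cycle that carries a command, an address, and optionally data. This is a simplified model: the `BusCycle` and `Memory` names and the "READ" and "WRITE" command strings are assumptions for illustration, and timing signals are not modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BusCycle:
    command: str                 # control lines: "READ" or "WRITE"
    address: int                 # address lines: source or destination
    data: Optional[int] = None   # data lines: used only when writing


class Memory:
    """A memory module that responds to cycles placed on the bus."""

    def __init__(self, size):
        self.cells = [0] * size

    def respond(self, cycle):
        if cycle.command == "WRITE":
            self.cells[cycle.address] = cycle.data
            return None
        if cycle.command == "READ":
            return self.cells[cycle.address]
        raise ValueError(f"unknown command: {cycle.command}")


memory = Memory(256)
memory.respond(BusCycle("WRITE", 0x10, 99))     # processor writes a word
value = memory.respond(BusCycle("READ", 0x10))  # processor reads it back
print(value)                                    # 99
```

The command tells the module what to do, the address tells it where, and the data lines carry the word itself.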

Most computer systems use multiple buses, generally laid out in a hierarchy[15]. A typical high-performance architecture is shown in Figure 1-4. There is a local bus that connects the processor to a cache controller, which is in turn connected to a system bus that supports main memory. The cache controller is integrated into a bridge that connects to the high-speed bus[16]. This bus supports connections to high-speed LANs, video and graphics workstation controllers, as well as interface controllers to local peripheral buses, including SCSI and FireWire[17]. Lower-speed devices are still supported off an expansion bus, with an interface buffering traffic between the expansion bus and the high-speed bus[18].

Figure 1-4 Example Bus Configuration

PCI Express pumps up performance

In the past decade, PCI has served as the dominant I/O architecture for PCs and servers, carrying data generated by microprocessors, network adapters, graphics cards and other subsystems to which it is connected[19]. However, as the speed and capabilities of computing components increase, PCI's bandwidth limitations and the inefficiencies of its parallel architecture have increasingly become bottlenecks to system performance.

PCI is a unidirectional parallel bus architecture in which multiple adapters must contend for available bus bandwidth. Although performance of the PCI interface has been improved over the years, problems with signal skew (when bits of data arrive at their destination too late), signal routing and the inability to lower the voltage or increase the frequency strongly indicate that the architecture is running out of steam[20]. Additional attempts to improve its performance would be costly and impractical. In response, a group of vendors, including some of the largest and most successful system developers in the industry, unveiled an I/O architecture dubbed PCI Express (initially called Third Generation I/O, or 3GIO).

PCI Express is a point-to-point switching architecture that creates high-speed, bidirectional links between a CPU and system I/O (the switch is connected to the CPU by a host bridge). Each of these links can encompass one or more “lanes”, each comprising four wires: two for transmitting data and two for receiving data. The design of these lanes enables the use of lower voltages (resulting in lower power usage), reduces electromagnetic emissions, eliminates signal skew, lowers costs through simpler design and generally improves performance.

In its initial implementation, PCI Express can yield transfer speeds of 2.5 Gbit/sec in each direction, on each lane. By contrast, the version of the PCI architecture that is most common today, PCI-X 1.0, offers about 1 Gbyte/sec in throughput, shared among the adapters on the bus. PCI Express cards are available in four- or eight-lane configurations (called x4 and x8). An x4 PCI Express card can provide as much as 20 Gbit/sec in throughput, while an x8 PCI Express card can offer up to 40 Gbit/sec in throughput.
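The quoted card figures can be checked with a little arithmetic. The sketch below assumes, as the numbers suggest, that the 20 and 40 Gbit/sec totals count both directions of every lane; that reading is an inference from the figures, and the function name is hypothetical.

```python
LANE_RATE_GBIT = 2.5   # per lane, per direction (figure quoted in the text)


def aggregate_gbit(lanes, rate=LANE_RATE_GBIT):
    """Total throughput of a link, counting both directions of every lane."""
    return lanes * rate * 2


for lanes in (1, 4, 8):
    print(f"x{lanes}: {aggregate_gbit(lanes)} Gbit/sec")
# x1: 5.0 Gbit/sec
# x4: 20.0 Gbit/sec
# x8: 40.0 Gbit/sec
```

Under this assumption, throughput grows in direct proportion to the number of lanes.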

Earlier attempts to create a new PCI architecture failed in part because they required so many changes to the system and application software. Drivers, utilities and management applications all would have to be rewritten. PCI Express developers removed the dependency on new operating system support, letting PCI-compatible drivers and applications run unchanged on PCI Express hardware[21].

A bus for the future

Developers are working on increasing the scalability of PCI Express. While current server and desktop systems support PCI Express adapters and graphics cards with up to eight lanes (x8), the architecture will support as many as 32 lanes (x32) in the future.

The first Fibre Channel host bus adapters were designed to support four lanes instead of eight lanes, in part because server developers had designed their systems with four-lane slots. As even more bandwidth is required, implementing an eight-lane design potentially could double the performance, provided there were no other bottlenecks in the system.

This scalability, along with the expected doubling of the speed of each lane to 5 Gbit/sec, should keep PCI Express a viable solution for designers for the foreseeable future[22].

PCI Express is a significant improvement over PCI and is well on its way to becoming the new standard for PCs, servers and more. Not only can it lower costs and improve reliability, but it can also significantly improve performance. Applications such as music and video streaming, video on demand, VoIP and data storage will benefit from these improvements.

