Upgrading and Repairing PCs (17th Edition)

The CPU and motherboard architecture (chipset) dictates a particular computer's physical memory capacity and the types and forms of memory that can be installed. Over the years, two main changes have occurred in computer memory: it has gradually become faster and wider. The CPU and the memory controller circuitry dictate the speed and width requirements. The memory controller in a modern PC resides in the motherboard chipset. Even though a system might physically support a given amount of memory, the type of software you run could dictate whether all the memory can be used.

The 8088 and 8086 CPUs, with 20 address lines, can use as much as 1MB (1,024KB) of RAM. The 286 and 386SX CPUs have 24 address lines and can keep track of as much as 16MB of memory. The 386DX, 486, Pentium, and Pentium-MMX CPUs have a full set of 32 address lines, so they can keep track of 4GB of memory; the Pentium Pro, Pentium II/III, and Pentium 4, as well as the AMD Athlon and Duron, have 36 address lines and can manage an impressive 64GB. The Itanium processor, on the other hand, has 44-bit addressing, which allows for up to 16TB (terabytes) of physical RAM!

See "Processor Specifications," p. 39.

When the 286 and higher chips emulate the 8088 chip (as they do when running 16-bit software, such as DOS or Windows 3.x), they implement a hardware operating mode called real mode. Real mode is the only mode available on the 8086 and 8088 chips used in PC and XT systems. In real mode, all Intel processors, even the mighty Pentium family, are restricted to using only 1MB of memory, just as their 8086 and 8088 ancestors were, and the system design reserves 384KB of that amount. Only in protected mode can the 286 or better chips use their maximum potential for memory addressing.

See "Processor Modes," p. 47.

P5 class systems can address as much as 4GB of memory, and P6/P7 class systems can address up to 64GB. To put these memory-addressing capabilities into perspective, 64GB (65,536MB) of memory would cost more than $10,000! Even if you could afford all this memory, some of the largest memory modules available for desktop PCs today are 1GB DIMMs; installing 64GB of RAM would therefore require 64 of them, and most systems today support no more than four DIMM sockets.

Although memory sizes are increasing and some current desktop motherboards support 2GB modules, the real limitations on memory sizing in any system are the chipset and the number of sockets on the motherboard. Most desktop motherboards incorporate from two to four memory sockets, which allows a maximum of 4GB-8GB if all the sockets are filled. These limitations are imposed by the chipset, not the processor or the RAM modules. Some processors can address 64GB, but no chipset on the market will allow that!

Note

See the "Chipsets" section in Chapter 4 for the maximum cacheable limits on all the Intel and other motherboard chipsets.

SIMMs, DIMMs, and RIMMs

Originally, systems had memory installed via individual chips, often referred to as dual inline package (DIP) chips because of their design. The original IBM XT and AT had 36 sockets on the motherboard for these individual chips; even more could be installed on memory cards plugged into the bus slots. I remember spending hours populating boards with these chips, which was a tedious job.

Besides being a time-consuming and labor-intensive way to deal with memory, DIP chips had one notorious problem: they crept out of their sockets over time as the system went through thermal cycles. Every day, when you powered the system on and off, the system heated and cooled, and the chips gradually walked their way out of the sockets, a phenomenon called chip creep. Eventually, good contact was lost and memory errors resulted. Fortunately, reseating all the chips back in their sockets usually rectified the problem, but that method was labor intensive if you had a lot of systems to support.

The alternative to this at the time was to have the memory soldered into either the motherboard or an expansion card. This prevented the chips from creeping and made the connections more permanent, but it caused another problem. If a chip did go bad, you had to attempt desoldering the old one and resoldering a new one or resort to scrapping the motherboard or memory card on which the chip was installed. This was expensive and made memory troubleshooting difficult.

A chip was needed that was both soldered and removable, and that is exactly what was found in the module called a SIMM. For memory storage, most modern systems have adopted the single inline memory module (SIMM) or the more recent DIMM and RIMM module designs as an alternative to individual memory chips. These small boards plug into special connectors on a motherboard or memory card. The individual memory chips are soldered to the module, so removing and replacing them is impossible. Instead, you must replace the entire module if any part of it fails. The module is treated as though it were one large memory chip.

Two main types of SIMMs, three main types of DIMMs, and one type of RIMM have been commonly used in desktop systems. The various types are often described by their pin count, memory row width, or memory type.

SIMMs, for example, are available in two main physical types: 30-pin (8 bits plus an option for 1 additional parity bit) and 72-pin (32 bits plus an option for 4 additional parity bits), with various capacities and other specifications. The 30-pin SIMMs are physically smaller than the 72-pin versions, and either version can have chips on one or both sides. SIMMs were widely used from the late 1980s to the late 1990s but have become obsolete.

DIMMs are also available in three main types. DIMMs usually hold standard SDRAM or DDR SDRAM chips and are distinguished by different physical characteristics. Standard DIMMs have 168 pins, one notch on either side, and two notches along the contact area. DDR DIMMs, on the other hand, have 184 pins, two notches on each side, and only one offset notch along the contact area. DDR2 DIMMs have 240 pins, two notches on each side, and one in the center of the contact area. All DIMMs have either 64-bit (non-ECC/non-parity) or 72-bit (parity or error correcting code [ECC]) data paths. The main physical difference between SIMMs and DIMMs is that DIMMs have different signal pins on each side of the module. That is why they are called dual inline memory modules, and why, with only 1" of additional length, they have many more pins than a SIMM.

Note

There is confusion among users and even in the industry regarding the terms single-sided and double-sided with respect to memory modules. In truth, the single- or double-sided designation actually has nothing to do with whether chips are physically located on one or both sides of the module, and it has nothing to do with whether the module is a SIMM or DIMM (meaning whether the connection pins are single- or double-inline). Instead, the terms single-sided and double-sided are used to indicate whether the module has one or two banks of memory chips installed. A double-banked DIMM module has two complete 64-bit-wide banks of chips logically stacked so that the module is twice as deep (has twice as many 64-bit rows). In most (but not all) cases, this requires chips to be on both sides of the module; therefore, the term double-sided has often been used to indicate that a module has two banks, even though the term is technically incorrect. Single-banked modules (incorrectly referred to as single-sided) can have chips physically mounted on both sides of the module, and double-banked modules (incorrectly referred to as double-sided) can have chips physically mounted on only one side. I recommend using the terms single-banked and double-banked instead because they are much more accurate and easily understood.

RIMMs also have different signal pins on each side. Three different physical types of RIMMs are available: a 16/18-bit version with 184 pins, a 32/36-bit version with 232 pins, and a 64/72-bit version with 326 pins. Each of these plugs into the same sized connector, but the notches in the connectors and RIMMs are different to prevent a mismatch. A given board will accept only one type. By far the most common type is the 16/18-bit version. The 32-bit version was introduced in late 2002, and the 64-bit version was introduced in 2004.

The standard 16/18-bit RIMM has 184 pins, one notch on either side, and two notches centrally located in the contact area. 16-bit versions are used for non-ECC applications, whereas the 18-bit versions incorporate the additional bits necessary for ECC.

Figures 6.3-6.8 show a typical 30-pin (8-bit) SIMM, 72-pin (32-bit) SIMM, 168-pin SDRAM DIMM, 184-pin DDR SDRAM (64-bit) DIMM, 240-pin DDR2 DIMM, and 184-pin RIMM, respectively. The pins are numbered from left to right and are connected through to both sides of the module on the SIMMs. The pins on the DIMM are different on each side, but on a SIMM, each side is the same as the other and the connections carry through. Note that all dimensions are in both inches and millimeters (in parentheses), and modules are generally available in error correcting code (ECC) versions with 1 extra ECC (or parity) bit for every 8 data bits (multiples of 9 in data width) or versions that do not include ECC support (multiples of 8 in data width).

Figure 6.7. A typical 240-pin DDR2 DIMM.

Figure 6.8. A typical 184-pin RIMM.

All these memory modules are fairly compact considering the amount of memory they hold and are available in several capacities and speeds. Table 6.11 lists the various capacities available for SIMMs, DIMMs, and RIMMs.

Table 6.11. SIMM, DIMM, and RIMM Capacities

Capacity     Standard     Parity/ECC

30-Pin SIMM
256KB        256KBx8      256KBx9
1MB          1MBx8        1MBx9
4MB          4MBx8        4MBx9
16MB         16MBx8       16MBx9

72-Pin SIMM
1MB          256KBx32     256KBx36
2MB          512KBx32     512KBx36
4MB          1MBx32       1MBx36
8MB          2MBx32       2MBx36
16MB         4MBx32       4MBx36
32MB         8MBx32       8MBx36
64MB         16MBx32      16MBx36
128MB        32MBx32      32MBx36

168/184-Pin DIMM/DDR DIMM
8MB          1MBx64       1MBx72
16MB         2MBx64       2MBx72
32MB         4MBx64       4MBx72
64MB         8MBx64       8MBx72
128MB        16MBx64      16MBx72
256MB        32MBx64      32MBx72
512MB        64MBx64      64MBx72
1,024MB      128MBx64     128MBx72
2,048MB      256MBx64     256MBx72

240-Pin DDR2 DIMM
256MB        32MBx64      32MBx72
512MB        64MBx64      64MBx72
1,024MB      128MBx64     128MBx72
2,048MB      256MBx64     256MBx72

184-Pin RIMM
64MB         32MBx16      32MBx18
128MB        64MBx16      64MBx18
256MB        128MBx16     128MBx18
512MB        256MBx16     256MBx18
1,024MB      512MBx16     512MBx18

SIMMs, DIMMs, DDR/DDR2 DIMMs, and RIMMs of each type and capacity are available in various speed ratings. Consult your motherboard documentation for the correct memory speed and type for your system. It is usually best for the memory speed (also called throughput or bandwidth) to match the speed of the processor data bus (also called the front side bus or FSB).
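As a rough rule of thumb, a module's peak bandwidth is its transfer rate multiplied by the width of its data bus (8 bytes for a 64-bit SIMM or DIMM); that is how DDR266, DDR333, and DDR400 map to the PC2100, PC2700, and PC3200 throughput labels. The sketch below illustrates that arithmetic, plus the slowest-module rule described a couple of paragraphs further on. The speed grades shown are common JEDEC ratings, and the installed-module list is just an example, not a recommendation.

def bandwidth_mbs(transfers_per_sec_millions, bus_width_bits=64):
    """Peak bandwidth in MB/s = millions of transfers/sec x bus width in bytes."""
    return transfers_per_sec_millions * (bus_width_bits // 8)

for name, rate in (("DDR266 (PC2100)", 266), ("DDR333 (PC2700)", 333), ("DDR400 (PC3200)", 400)):
    print(f"{name}: ~{bandwidth_mbs(rate):.0f}MB/s")

# With mixed modules, the memory bus typically runs at the slowest module installed.
installed_ratings = [400, 333]        # MT/s ratings reported by each module's SPD
print(f"Effective speed: DDR{min(installed_ratings)}")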

If a system requires a specific speed and uses DDR, DDR2, or RIMM memory, you can almost always substitute faster speeds if the one specified is not available. Generally, no problems occur in mixing module speeds, as long as you use modules equal to or faster than what the system requires. Because there's little price difference between the various speed versions, I often buy faster modules than are necessary for a particular application. This might make them more usable in a future system that could require the faster speed.

Because DIMMs and RIMMs have an onboard serial presence detect (SPD) ROM that reports their speed and timing parameters to the system, most systems run the memory controller and memory bus at the speed matching the slowest DIMM/RIMM installed. Most DIMMs are SDRAM memory, which means they deliver data in very high-speed bursts using a clocked interface. DDR DIMMs are also SDRAM, but they transfer data two times per clock cycle and thus are twice as fast.

Note

A bank is the smallest amount of memory needed to form a single row of memory addressable by the processor. It is the minimum amount of physical memory that is read or written by the processor at one time and usually corresponds to the data bus width of the processor. If a processor has a 64-bit data bus, a bank of memory also is 64 bits wide. If the memory is interleaved or runs dual-channel, a virtual bank is formed that is twice the absolute data bus width of the processor.

You can't always replace a module with a higher-capacity unit and expect it to work. Systems might have specific design limitations for the maximum capacity of module they can take. A larger-capacity module works only if the motherboard is designed to accept it in the first place. Consult your system documentation to determine the correct capacity and speed to use.

Registered Modules

SDRAM and DDR DIMMs are available in buffered, unbuffered, and registered versions. A buffered module has additional buffer circuits between the memory chips and the connector to condition or buffer the signals. Virtually all PC motherboards designed to use SDRAM or DDR require unbuffered or registered modules instead. In fact, no PCs that I am aware of use plain buffered modules. Some of the early PowerPC Macs might have used buffered SDRAM, but no PCs do. Because so few systems ever used them, you will not find buffered modules available for sale.

Most PC motherboards are designed to use unbuffered modules, which allow the memory controller signals to pass directly to the memory chips on the module with no interference. This is not only the cheapest design, but also the fastest and most efficient. The only drawback is that the motherboard designer must place limits on how many modules (meaning module sockets) can be installed on the board, and possibly also limit how many chips can be on a module. So-called double-sided modules that really have two banks of chips (twice as many as normal) onboard might be restricted on some systems in certain combinations.

Systems designed to accept extremely large amounts of RAM often require registered modules. A registered module uses an architecture in which register chips on the module act as an interface between the actual RAM chips and the chipset. The registers temporarily hold data passing to and from the memory chips and enable many more RAM chips to be driven or otherwise placed on the module than the chipset could normally support. This allows for motherboard designs that can support many modules and enables each module to have a larger number of chips. In general, registered modules are required by server or workstation motherboards designed to support more than 1GB or 2GB of RAM. However, the initial version of the AMD Athlon 64 FX processor also uses registered memory because its design was based on the AMD Opteron workstation and server processor. Subsequent versions of the Athlon FX no longer require registered memory.

To provide the space needed for the buffer chips, a registered DIMM is often taller than a standard DIMM. Figure 6.9 compares a typical registered DIMM to a typical unbuffered DIMM.

Figure 6.9. A typical registered DIMM is taller than a typical unbuffered DIMM to provide room for buffer chips.

Tip

If you are installing registered DIMMs in a slimline case, clearance between the top of the DIMM and the case might be a problem. Some vendors sell low-profile registered DIMMs that are about the same height as an unbuffered DIMM. Use this type of DIMM if your system does not have enough head room for standard registered DIMMs. Some vendors sell only this type of DIMM for particular systems.

The important thing to note is that you can use only the type of module your motherboard (or chipset) is designed to support. For most, that is standard unbuffered modules or, in some cases, registered modules.

SIMM Pinouts

Table 6.12 shows the interface connector pinout for standard 72-pin SIMMs. A companion presence detect table (Table 6.13, later in this section) shows the configuration of the presence detect pins on various 72-pin SIMMs. The motherboard uses the presence detect pins to detect exactly what size and speed SIMM is installed. Industry-standard 30-pin SIMMs do not have a presence detect feature, but IBM did add this capability to its modified 30-pin configuration. Note that all SIMMs have the same pins on both sides of the module. SIMM pins are usually tin plated, and the plating on the module pins must match that on the socket pins; otherwise, corrosion will result.

Table 6.12. Standard 72-Pin SIMM Pinout

Pin  SIMM Signal Name      Pin  SIMM Signal Name          Pin  SIMM Signal Name
1    Ground                25   Data Bit 22               49   Data Bit 8
2    Data Bit 0            26   Data Bit 7                50   Data Bit 24
3    Data Bit 16           27   Data Bit 23               51   Data Bit 9
4    Data Bit 1            28   Address Bit 7             52   Data Bit 25
5    Data Bit 17           29   Address Bit 11            53   Data Bit 10
6    Data Bit 2            30   +5 Vdc                    54   Data Bit 26
7    Data Bit 18           31   Address Bit 8             55   Data Bit 11
8    Data Bit 3            32   Address Bit 9             56   Data Bit 27
9    Data Bit 19           33   Address Bit 12            57   Data Bit 12
10   +5 Vdc                34   Address Bit 13            58   Data Bit 28
11   Presence Detect 5     35   Parity Data Bit 2         59   +5 Vdc
12   Address Bit 0         36   Parity Data Bit 0         60   Data Bit 29
13   Address Bit 1         37   Parity Data Bit 1         61   Data Bit 13
14   Address Bit 2         38   Parity Data Bit 3         62   Data Bit 30
15   Address Bit 3         39   Ground                    63   Data Bit 14
16   Address Bit 4         40   Column Address Strobe 0   64   Data Bit 31
17   Address Bit 5         41   Column Address Strobe 2   65   Data Bit 15
18   Address Bit 6         42   Column Address Strobe 3   66   EDO
19   Address Bit 10        43   Column Address Strobe 1   67   Presence Detect 1
20   Data Bit 4            44   Row Address Strobe 0      68   Presence Detect 2
21   Data Bit 20           45   Row Address Strobe 1      69   Presence Detect 3
22   Data Bit 5            46   Reserved                  70   Presence Detect 4
23   Data Bit 21           47   Write Enable              71   Reserved
24   Data Bit 6            48   ECC Optimized             72   Ground

Notice that the 72-pin SIMMs use a set of four or five pins to indicate the type of SIMM to the motherboard. These presence detect pins are either grounded or left unconnected. On the SIMM, each presence detect output is tied to ground through a 0-ohm resistor or jumper, or left open, so that the motherboard sees a low logic level when the pin is grounded and a high logic level when the pin is open. This produces signals the memory interface logic can decode. If the motherboard uses presence detect signals, a power on self test (POST) procedure can determine the size and speed of the installed SIMMs and adjust control and addressing signals automatically. This enables autodetection of the memory size and speed.

Note

In many ways, the presence detect pin function is similar to the industry-standard DX coding used on modern 35mm film rolls to indicate the ASA (speed) rating of the film to the camera. When you drop the film into the camera, electrical contacts can read the film's speed rating via an industry-standard configuration.

Presence detect performs the same function for 72-pin SIMMs that the serial presence detect (SPD) chip does for DIMMs.

Table 6.13 shows the Joint Electron Device Engineering Council (JEDEC) industry-standard presence detect configuration listing for the 72-pin SIMM family. JEDEC is an organization of U.S. semiconductor manufacturers and users that sets semiconductor standards.

Table 6.13. Presence Detect Pin Configurations for 72-Pin SIMMs

Size

Speed

Pin 67

Pin 68

Pin 69

Pin 70

Pin 11

1MB

100ns

Gnd

Gnd

Gnd

1MB

80ns

Gnd

Gnd

1MB

70ns

Gnd

Gnd

1MB

60ns

Gnd

2MB

100ns

Gnd

Gnd

Gnd

2MB

80ns

Gnd

Gnd

2MB

70ns

Gnd

Gnd

2MB

60ns

Gnd

4MB

100ns

Gnd

Gnd

Gnd

Gnd

4MB

80ns

Gnd

Gnd

Gnd

4MB

70ns

Gnd

Gnd

Gnd

4MB

60ns

Gnd

Gnd

8MB

100ns

Gnd

Gnd

8MB

80ns

Gnd

8MB

70ns

Gnd

8MB

60ns

16MB

80ns

Gnd

Gnd

Gnd

16MB

70ns

Gnd

Gnd

Gnd

16MB

60ns

Gnd

Gnd

16MB

50ns

Gnd

Gnd

Gnd

Gnd

32MB

80ns

Gnd

Gnd

Gnd

32MB

70ns

Gnd

Gnd

Gnd

32MB

60ns

Gnd

Gnd

32MB

50ns

Gnd

Gnd

Gnd

Gnd

= No connection (open)

Gnd = Ground

Pin 67 = Presence detect 1

Pin 68 = Presence detect 2

Pin 69 = Presence detect 3

Pin 70 = Presence detect 4

Pin 11 = Presence detect 5

Unfortunately, unlike the film industry, not everybody in the computer industry follows established standards. As such, presence detect signaling is not a standard throughout the PC industry. Different system manufacturers sometimes use different configurations for what is expected on these four pins. Compaq, IBM (mainly PS/2 systems), and Hewlett-Packard are notorious for this type of behavior. Many of the systems from these vendors require special SIMMs that are basically the same as standard 72-pin SIMMs, except for special presence detect requirements. Table 6.14 shows how IBM defines these pins.

Table 6.14. Presence Detect Pins for IBM 72-Pin SIMMs

67

68

69

70

SIMM Type

IBM Part Number

Not a valid SIMM

n/a

Gnd

1MB 120ns

n/a

Gnd

2MB 120ns

n/a

Gnd

Gnd

2MB 70ns

92F0102

Gnd

8MB 70ns

64F3606

Gnd

Gnd

Reserved

n/a

Gnd

Gnd

2MB 80ns

92F0103

Gnd

Gnd

Gnd

8MB 80ns

64F3607

Gnd

Reserved

n/a

Gnd

Gnd

1MB 85ns

90X8624

Gnd

Gnd

2MB 85ns

92F0104

Gnd

Gnd

Gnd

4MB 70ns

92F0105

Gnd

Gnd

4MB 85ns

79F1003 (square notch) L40-SX

Gnd

Gnd

Gnd

1MB 100ns

n/a

Gnd

Gnd

Gnd

8MB 80ns

79F1004 (square notch) L40-SX

Gnd

Gnd

Gnd

2MB 100ns

n/a

Gnd

Gnd

Gnd

Gnd

4MB 80ns

87F9980

Gnd

Gnd

Gnd

Gnd

2MB 85ns

79F1003 (square notch) L40SX

= No connection (open)

Gnd = Ground

Pin 67 = Presence detect 1

Pin 68 = Presence detect 2

Pin 69 = Presence detect 3

Pin 70 = Presence detect 4

Because these pins can have custom variations, you often must specify IBM, Compaq, HP, or generic SIMMs when you order memory for systems using 72-pin SIMMs. Although very few (if any) of these systems are still in service, keep this information in mind if you are moving 72-pin modules from one system to another or are installing salvaged memory into a system. Also, be sure you match the metal used on the module connectors and sockets. SIMM pins can be tin- or gold-plated and the plating on the module pins must match that on the socket pins; otherwise, corrosion will result.

Caution

To have the most reliable system when using SIMM modules, you must install modules with gold-plated contacts into gold-plated sockets and modules with tin-plated contacts into tin-plated sockets only. If you mix gold contacts with tin sockets, or vice versa, you are likely to experience memory failures from six months to one year after initial installation because a type of corrosion known as fretting will take place. This has been a major problem with 72-pin SIMM-based systems because some memory and motherboard vendors opted for tin sockets and connectors while others opted for gold. According to connector manufacturer AMP's "Golden Rules: Guidelines for the Use of Gold on Connector Contacts" (available at http://www.amp.com/products/technology/aurulrep.pdf) and "The Tin Commandments: Guidelines for the Use of Tin on Connector Contacts" (available at http://www.amp.com/products/technology/sncomrep.pdf), you should match connector metals.

If you are maintaining systems with mixed tin/gold contacts in which fretting has already occurred, use a wet contact cleaner. After cleaning, to improve electrical contacts and help prevent corrosion, you should use a liquid contact enhancer and lubricant called Stabilant 22 from D.W. Electrochemicals when installing SIMMs or DIMMs. Its website (http://www.stabilant.com/llsting.htm) has detailed application notes on this subject that provide more technical details.

DIMM Pinouts

Table 6.15 shows the pinout configuration of a 168-pin standard unbuffered SDRAM DIMM. Note again that the pins on each side of the DIMM are different. All pins should be gold plated.

Table 6.15. 168-Pin SDRAM DIMM Pinouts

Pin

Signal

Pin

Signal

Pin

Signal

Pin

Signal

1

GND

43

GND

85

GND

127

GND

2

Data Bit 0

44

Do Not Use

86

Data Bit 32

128

Clock Enable 0

3

Data Bit 1

45

Chip Select 2#

87

Data Bit 33

129

Chip Select 3#

4

Data Bit 2

46

I/O Mask 2

88

Data Bit 34

130

I/O Mask 6

5

Data Bit 3

47

I/O Mask 3

89

Data Bit 35

131

I/O Mask 7

6

+3.3V

48

Do Not Use

90

+3.3V

132

Reserved

7

Data Bit 4

49

+3.3V

91

Data Bit 36

133

+3.3V

8

Data Bit 5

50

NC

92

Data Bit 37

134

NC

9

Data Bit 6

51

NC

93

Data Bit 38

135

NC

10

Data Bit 7

52

Parity Bit 2

94

Data Bit 39

136

Parity Bit 6

11

Data Bit 8

53

Parity Bit 3

95

Data Bit 40

137

Parity Bit 7

12

GND

54

GND

96

GND

138

GND

13

Data Bit 9

55

Data Bit 16

97

Data Bit 41

139

Data Bit 48

14

Data Bit 10

56

Data Bit 17

98

Data Bit 42

140

Data Bit 49

15

Data Bit 11

57

Data Bit 18

99

Data Bit 43

141

Data Bit 50

16

Data Bit 12

58

Data Bit 19

100

Data Bit 44

142

Data Bit 51

17

Data Bit 13

59

+3.3V

101

Data Bit 45

143

+3.3V

18

+3.3V

60

Data Bit 20

102

+3.3V

144

Data Bit 52

19

Data Bit 14

61

NC

103

Data Bit 46

145

NC

20

Data Bit 15

62

NC

104

Data Bit 47

146

NC

21

Parity Bit 0

63

Clock Enable 1

105

Parity Bit 4

147

NC

22

Parity Bit 1

64

GND

106

Parity Bit 5

148

GND

23

GND

65

Data Bit 21

107

GND

149

Data Bit 53

24

NC

66

Data Bit 22

108

NC

150

Data Bit 54

25

NC

67

Data Bit 23

109

NC

151

Data Bit 55

26

+3.3V

68

GND

110

+3.3V

152

GND

27

WE#

69

Data Bit 24

111

CAS#

153

Data Bit 56

28

I/O Mask 0

70

Data Bit 25

112

I/O Mask 4

154

Data Bit 57

29

I/O Mask 1

71

Data Bit 26

113

I/O Mask 5

155

Data Bit 58

30

Chip Select 0#

72

Data Bit 27

114

Chip Select 1#

156

Data Bit 59

31

Do Not Use

73

+3.3V

115

RAS#

157

+3.3V

32

GND

74

Data Bit 28

116

GND

158

Data Bit 60

33

Address Bit 0

75

Data Bit 29

117

Address Bit 1

159

Data Bit 61

34

Address Bit 2

76

Data Bit 30

118

Address Bit 3

160

Data Bit 62

35

Address Bit 4

77

Data Bit 31

119

Address Bit 5

161

Data Bit 63

36

Address Bit 6

78

GND

120

Address Bit 7

162

GND

37

Address Bit 8

79

Clock 2

121

Address Bit 9

163

Clock 3

38

Address Bit 10

80

NC

122

Bank Address 0

164

NC

39

Bank Address 1

81

SPD Write Protect

123

Address Bit 11

165

SPD Address 0

40

+3.3V

82

SPD Data

124

+3.3V

166

SPD Address 1

41

+3.3V

83

SPD Clock

125

Clock 1

167

SPD Address 2

42

Clock 0

84

+3.3V

126

Reserved

168

+3.3V

Gnd = Ground

SPD = Serial presence detect

NC = No connection

The DIMM uses a completely different type of presence detect than a SIMM, called serial presence detect (SPD). It consists of a small EEPROM or Flash memory chip on the DIMM that contains specially formatted data indicating the DIMM's features. This serial data can be read via the serial data pins on the DIMM, and it enables the motherboard to autoconfigure to the exact type of DIMM installed.
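Because the SPD data follows a JEDEC-defined byte layout, software that has already read the EEPROM contents (for example, over the SMBus with a hardware or OS-level SPD reader) can decode the module's type. The sketch below is only an illustration of that idea: it assumes you already have the raw dump as a byte string, and the byte offset and type codes shown (byte 2; 0x04 = SDRAM, 0x07 = DDR, 0x08 = DDR2) reflect the commonly documented JEDEC SPD definitions rather than anything taken from this chapter, so verify them against the actual specification before relying on them.

MEMORY_TYPES = {      # SPD byte 2 identifies the basic memory technology (assumed codes)
    0x04: "SDRAM",
    0x07: "DDR SDRAM",
    0x08: "DDR2 SDRAM",
}

def describe_spd(spd):
    """Very small SPD decoder: report the module type from a raw EEPROM dump."""
    if len(spd) < 3:
        return "SPD dump too short"
    mem_type = MEMORY_TYPES.get(spd[2], f"unknown (0x{spd[2]:02X})")
    return f"Module type: {mem_type}"

# Fabricated 3-byte stub for illustration; real SPD dumps are 128 or 256 bytes.
print(describe_spd(bytes([0x80, 0x08, 0x07])))   # -> Module type: DDR SDRAM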

DIMMs can come in several varieties, including unbuffered or buffered and 3.3V or 5V. Buffered DIMMs have additional buffer chips on them to interface to the motherboard. Unfortunately, these buffer chips slow down the DIMM and are not effective at higher speeds. For this reason, most PC systems (those that do not use registered DIMMs) use unbuffered DIMMs. The voltage issue is simple: DIMM designs for PCs are almost universally 3.3V. If you were to install a 5V DIMM in a 3.3V socket, it would be damaged; fortunately, keying in the socket and on the DIMM prevents that.

Modern PC systems use only unbuffered 3.3V DIMMs. Apple and other non-PC systems can use the buffered 5V versions. Fortunately, the key notches along the connector edge of a DIMM are spaced differently for buffered/unbuffered or 3.3V/5V DIMMs, as shown in Figure 6.10. This prevents inserting a DIMM of the wrong type into a given socket.

Figure 6.10. 168-pin DRAM DIMM notch key definitions.

DDR DIMM Pinouts

Table 6.16 shows the pinout configuration of a 184-pin DDR SDRAM DIMM. Note again that the pins on each side of the DIMM are different. All pins should be gold plated.

Table 6.16. 184-Pin DDR DIMM Pinouts

Pin

Signal

Pin

Signal

Pin

Signal

Pin

Signal

1

Reference +1.25V

47

Data Strobe 8

93

GND

139

GND

2

Data Bit 0

48

Address Bit 0

94

Data Bit 4

140

Data Strobe 17

3

GND

49

Parity Bit 2

95

Data Bit 5

141

Address Bit 10

4

Data Bit 1

50

GND

96

I/O +2.5V

142

Parity Bit 6

5

Data Strobe 0

51

Parity Bit 3

97

Data Strobe 9

143

I/O +2.5V

6

Data Bit 2

52

Bank Address 1

98

Data Bit 6

144

Parity Bit 7

7

+2.5 V

53

Data Bit 32

99

Data Bit 7

145

GND

8

Data Bit 3

54

I/O +2.5 V

100

GND

146

Data Bit 36

9

NC

55

Data Bit 33

101

NC

147

Data Bit 37

10

NC

56

Data Strobe 4

102

NC

148

+2.5V

11

GND

57

Data Bit 34

103

Address Bit 13

149

Data Strobe 13

12

Data Bit 8

58

GND

104

I/O +2.5V

150

Data Bit 38

13

Data Bit 9

59

Bank Address 0

105

Data Bit 12

151

Data Bit 39

14

Data Strobe 1

60

Data Bit 35

106

Data Bit 13

152

GND

15

I/O +2.5V

61

Data Bit 40

107

Data Strobe 10

153

Data Bit 44

16

Clock 1

62

I/O +2.5V

108

+2.5V

154

RAS#

17

Clock 1#

63

WE#

109

Data Bit 14

155

Data Bit 45

18

GND

64

Data Bit 41

110

Data Bit 15

156

I/O +2.5V

19

Data Bit 10

65

CAS#

111

Clock Enable 1

157

S0#

20

Data Bit 11

66

GND

112

I/O +2.5V

158

S1#

21

Clock Enable 0

67

Data Strobe 5

113

Bank Address 2

159

Data Strobe 14

22

I/O +2.5V

68

Data Bit 42

114

Data Bit 20

160

GND

23

Data Bit 16

69

Data Bit 43

115

Address Bit 12

161

Data Bit 46

24

Data Bit 17

70

+2.5V

116

GND

162

Data Bit 47

25

Data Strobe 2

71

S2#

117

Data Bit 21

163

S3#

26

GND

72

Data Bit 48

118

Address Bit 11

164

I/O +2.5V

27

Address Bit 9

73

Data Bit 49

119

Data Strobe 11

165

Data Bit 52

28

Data Bit 18

74

GND

120

+2.5V

166

Data Bit 53

29

Address Bit 7

75

Clock 2#

121

Data Bit 22

167

FETEN

30

I/O +2.5V

76

Clock 2

122

Address Bit 8

168

+2.5V

31

Data Bit 19

77

I/O +2.5V

123

Data Bit 23

169

Data Strobe 15

32

Address Bit 5

78

Data Strobe 6

124

GND

170

Data Bit 54

33

Data Bit 24

79

Data Bit 50

125

Address Bit 6

171

Data Bit 55

34

GND

80

Data Bit 51

126

Data Bit 28

172

I/O +2.5V

35

Data Bit 25

81

GND

127

Data Bit 29

173

NC

36

Data Strobe 3

82

+2.5VID

128

I/O +2.5V

174

Data Bit 60

37

Address Bit 4

83

Data Bit 56

129

Data Strobe 12

175

Data Bit 61

38

+2.5V

84

Data Bit 57

130

Address Bit 3

176

GND

39

Data Bit 26

85

+2.5V

131

Data Bit 30

177

Data Strobe 16

40

Data Bit 27

86

Data Strobe 7

132

GND

178

Data Bit 62

41

Address Bit 2

87

Data Bit 58

133

Data Bit 31

179

Data Bit 63

42

GND

88

Data Bit 59

134

Parity Bit 4

180

I/O +2.5V

43

Address Bit 1

89

GND

135

Parity Bit 5

181

SPD Address 0

44

Parity Bit 0

90

SPD Write Protect

136

I/O +2.5V

182

SPD Address 1

45

Parity Bit 1

91

SPD Data

137

Clock 0

183

SPD Address 2

46

+2.5V

92

SPD Clock

138

Clock 0#

184

SPD +2.5V

Gnd = Ground

SPD = Serial presence detect

NC = No connection

DDR DIMMs use a single key notch to indicate voltage, as shown in Figure 6.11.

Figure 6.11. 184-pin DDR SDRAM DIMM keying.

184-pin DDR DIMMs use two notches on each side to enable compatibility with both low- and high-profile latched sockets. Note that the key position is offset with respect to the center of the DIMM to prevent inserting it backward in the socket. The key notch is positioned to the left, centered, or to the right of the area between pins 52 and 53. This indicates the I/O voltage for the DDR DIMM and prevents installing the wrong type into a socket, which might damage the DIMM.

DDR2 DIMM Pinouts

Table 6.17 shows the pinout configuration of a 240-pin DDR2 SDRAM DIMM. Pins 1-120 are on the front side, and pins 121-240 are on the back. All pins should be gold plated.

Table 6.17. 240-Pin DDR2 DIMM Pinouts

Pin

Signal

Pin

Signal

Pin

Signal

Pin

Signal

1

VREF

61

A4

121

VSS

181

VDDQ

2

VSS

62

VDDQ

122

DQ4

182

A3

3

DQ0

63

A2

123

DQ5

183

A1

4

DQ1

64

VDD

124

VSS

184

VDD

5

VSS

65

VSS

125

DM0

185

CK0

6

-DQS0

66

VSS

126

NC

186

-CK0

7

DQS0

67

VDD

127

VSS

187

VDD

8

VSS

68

NC

128

DQ6

188

A0

9

DQ2

69

VDD

129

DQ7

189

VDD

10

DQ3

70

A10/-AP

130

VSS

190

BA1

11

VSS

71

BA0

131

DQ12

191

VDDQ

12

DQ8

72

VDDQ

132

DQ13

192

-RAS

13

DQ9

73

-WE

133

VSS

193

-CS0

14

VSS

74

-CAS

134

DM1

194

VDDQ

15

-DQS1

75

VDDQ

135

NC

195

ODT0

16

DQS1

76

-CS1

136

VSS

196

A13

17

VSS

77

ODT1

137

CK1

197

VDD

18

NC

78

VDDQ

138

-CK1

198

VSS

19

NC

79

SS

139

VSS

199

DQ36

20

VSS

80

DQ32

140

DQ14

200

DQ37

21

DQ10

81

DQ33

141

DQ15

201

VSS

22

DQ11

82

VSS

142

VSS

202

DM4

23

VSS

83

-DQS4

143

DQ20

203

NC

24

DQ16

84

DQS4

144

DQ21

204

VSS

25

DQ17

85

VSS

145

VSS

205

DQ38

26

VSS

86

DQ34

146

DM2

206

DQ39

27

-DQS2

87

DQ35

147

NC

207

VSS

28

DQS2

88

VSS

148

VSS

208

DQ44

29

VSS

89

DQ40

149

DQ22

209

DQ45

30

DQ18

90

DQ41

150

DQ23

210

VSS

31

DQ19

91

VSS

151

VSS

211

DM5

32

VSS

92

-DQS5

152

DQ28

212

NC

33

DQ24

93

DQS5

153

DQ29

213

VSS

34

DQ25

94

VSS

154

VSS

214

DQ46

35

VSS

95

DQ42

155

DM3

215

DQ47

36

-DQS3

96

DQ43

156

NC

216

VSS

37

DQS3

97

VSS

157

VSS

217

DQ52

38

VSS

98

DQ48

158

DQ30

218

DQ53

39

DQ26

99

DQ49

159

DQ31

219

VSS

40

DQ27

100

VSS

160

VSS

220

CK2

41

VSS

101

SA2

161

NC

221

-CK2

42

NC

102

NC

162

NC

222

VSS

43

NC

103

VSS

163

VSS

223

DM6

44

VSS

104

-DQS6

164

NC

224

NC

45

NC

105

DQS6

165

NC

225

VSS

46

NC

106

VSS

166

VSS

226

DQ54

47

VSS

107

DQ50

167

NC

227

DQ55

48

NC

108

DQ51

168

NC

228

VSS

49

NC

109

VSS

169

VSS

229

DQ60

50

VSS

110

DQ56

170

VDDQ

230

DQ61

51

VDDQ

111

DQ57

171

CKE1

231

VSS

52

CKE0

112

VSS

172

VDD

232

DM7

53

VDD

113

-DQS7

173

NC

233

NC

54

NC

114

DQS7

174

NC

234

VSS

55

NC

115

VSS

175

VDDQ

235

DQ62

56

VDDQ

116

DQ58

176

A12

236

DQ63

57

A11

117

DQ59

177

A9

237

VSS

58

A7

118

VSS

178

VDD

238

VDDSPD

59

VDD

119

SDA

179

A8

239

SA0

60

A5

120

SCL

180

A6

240

SA1

240-pin DDR2 DIMMs use two notches on each side to enable compatibility with both low- and high-profile latched sockets. The connector key is offset with respect to the center of the DIMM to prevent inserting it backward in the socket. The key notch is positioned in the center of the area between pins 64 and 65 on the front (184/185 on the back), and there is no voltage keying because all DDR2 DIMMs run on 1.8V.

RIMM Pinouts

RIMM modules and sockets are gold-plated and designed for 25 insertion/removal cycles. Each RIMM has 184 pins, split into two groups of 92 pins on opposite ends and sides of the module. The pinout of the RIMM is shown in Table 6.18.

Table 6.18. RIMM Pinout

Pin

Signal

Pin

Signal

Pin

Signal

Pin

Signal

A1

GND

B1

GND

A47

NC

B47

NC

A2

LData Bit A8

B2

LData Bit A7

A48

NC

B48

NC

A3

GND

B3

GND

A49

NC

B49

NC

A4

LData Bit A6

B4

LData Bit A5

A50

NC

B50

NC

A5

GND

B5

GND

A51

VREF

B51

VREF

A6

LData Bit A4

B6

LData Bit A3

A52

GND

B52

GND

A7

GND

B7

GND

A53

SPD Clock

B53

SPD Address 0

A8

LData Bit A2

B8

LData Bit A1

A54

+2.5V

B54

+2.5V

A9

GND

B9

GND

A55

SDA

B55

SPD Address 1

A10

LData Bit A0

B10

Interface Clock+

A56

SVDD

B56

SVDD

A11

GND

B11

GND

A57

SPD Write Protect

B57

SPD Address 2

A12

LCTMN

B12

Interface Clock-

A58

+2.5V

B58

+2.5V

A13

GND

B13

GND

A59

RSCK

B59

RCMD

A14

LCTM

B14

NC

A60

GND

B60

GND

A15

GND

B15

GND

A61

Rdata Bit B7

B61

RData Bit B8

A16

NC

B16

LROW2

A62

GND

B62

GND

A17

GND

B17

GND

A63

Rdata Bit B5

B63

Rdata Bit B6

A18

LROW1

B18

LROW0

A64

GND

B64

GND

A19

GND

B19

GND

A65

Rdata Bit B3

B65

Rdata Bit B4

A20

LCOL4

B20

LCOL3

A66

GND

B66

GND

A21

GND

B21

GND

A67

Rdata Bit B1

B67

Rdata Bit B2

A22

LCOL2

B22

LCOL1

A68

GND

B68

GND

A23

GND

B23

GND

A69

RCOL0

B69

Rdata Bit B0

A24

LCOL0

B24

LData Bit B0

A70

GND

B70

GND

A25

GND

B25

GND

A71

RCOL2

B71

RCOL1

A26

LData Bit B1

B26

LData Bit B2

A72

GND

B72

GND

A27

GND

B27

GND

A73

RCOL4

B73

RCOL3

A28

LData Bit B3

B28

LData Bit B4

A74

GND

B74

GND

A29

GND

B29

GND

A75

RROW1

B75

RROW0

A30

LData Bit B5

B30

LData Bit B6

A76

GND

B76

GND

A31

GND

B31

GND

A77

NC

B77

RROW2

A32

LData Bit B7

B32

LData Bit B8

A78

GND

B78

GND

A33

GND

B33

GND

A79

RCTM

B79

NC

A34

LSCK

B34

LCMD

A80

GND

B80

GND

A35

VCMOS

B35

VCMOS

A81

RCTMN

B81

RCFMN

A36

SOUT

B36

SIN

A82

GND

B82

GND

A37

VCMOS

B37

VCMOS

A83

Rdata Bit A0

B83

RCFM

A38

NC

B38

NC

A84

GND

B84

GND

A39

GND

B39

GND

A85

Rdata Bit A2

B85

RData Bit A1

A40

NC

B40

NC

A86

GND

B86

GND

A41

+2.5V

B41

+2.5V

A87

Rdata Bit A4

B87

RData Bit A3

A42

+2.5V

B42

+2.5V

A88

GND

B88

GND

A43

NC

B43

NC

A89

Rdata Bit A6

B89

RData Bit A5

A44

NC

B44

NC

A90

GND

B90

GND

A45

NC

B45

NC

A91

Rdata Bit A8

B91

RData Bit A7

A46

NC

B46

NC

A92

GND

B92

GND

16/18-bit RIMMs are keyed with two notches in the center. This prevents a backward insertion and prevents the wrong type (voltage) RIMM from being used in a system. Currently, all RIMMs run on 2.5V, but proposed 64-bit versions will run on only 1.8V. To allow for changes in the RIMMs, three keying options are possible in the design (see Figure 6.12). The left key (indicated as "DATUM A" in Figure 6.12) is fixed in position, but the center key can be in three different positions spaced 1mm or 2mm to the right, indicating different types of RIMMs. The current default is option A, as shown in Figure 6.12 and Table 6.19, which corresponds to 2.5V operation.

Figure 6.12. RIMM keying options.

Table 6.19. Possible Keying Options for RIMMs

Option    Notch Separation    Description
A         11.5mm              2.5V RIMM
B         12.5mm              Reserved
C         13.5mm              Reserved

RIMMs incorporate an SPD device, which is essentially a Flash ROM onboard. This ROM contains information about the RIMM's size and type, including detailed timing information for the memory controller. The memory controller automatically reads the data from the SPD ROM to configure the system to match the RIMMs installed.

Figure 6.13 shows a typical PC RIMM installation. The RDRAM controller and clock generator are typically in the motherboard chipset North Bridge component. As you can see, the Rambus memory channel flows from the memory controller through each of up to three RIMM modules in series. Each module contains 4, 8, 16, or more RDRAM devices (chips), also wired in series, with an onboard SPD ROM for system configuration. Any RIMM sockets without a RIMM installed must have a continuity module, shown in the last socket in Figure 6.13. This enables the memory bus to remain continuous from the controller through each module (and, therefore, each RDRAM device on the module) until the bus finally terminates on the motherboard. Note how the bus loops from one module to another. For timing purposes, the first RIMM socket must be 6" or less from the memory controller, and the entire length of the bus must not be more than it would take for a signal to go from one end to another in four data clocks, or about 5ns.

Figure 6.13. Typical RDRAM bus layout showing a RIMM and one continuity module.

Interestingly, Rambus does not manufacture the RDRAM devices (the chips) or the RIMMs; that is left to other companies. Rambus is merely a design company, and it has no chip fabs or manufacturing facilities of its own. It licenses its technology to other companies who then manufacture the devices and modules.

Determining a Memory Module's Size and Features

Most memory modules are labeled with a sticker indicating the module's type, speed rating, and manufacturer. If you are attempting to determine whether existing memory can be used in a new computer, or if you need to replace memory in an existing computer, this information can be essential. Figure 6.14 illustrates the markings on typical 512MB and 1GB DDR memory modules from Crucial Technologies.

Figure 6.14. Markings on 512MB (top) and 1GB (bottom) DDR memory modules from Crucial Technology.

However, if you have memory modules that are not labeled, you can still determine the module type, speed, and capacity if the memory chips on the module are clearly labeled. For example, assume you have a memory module with chips labeled as follows:

MT46V64M8TG-75

By using an Internet search engine such as Google and entering the number from one of the memory chips, you can usually find the data sheet for the memory chips. Consider the following example: Say you have a registered memory module and want to look up the part number for the memory chips (usually eight or more chips) rather than the buffer chips on the module (usually from one to three, depending on the module design). In this example, the part number turns out to be a Micron memory chip that decodes like this:

MT = Micron Technologies (the memory chip maker)

46 = DDR SDRAM

V = 2.5V DC

64M8 = 64 Meg x 8 organization (64 million addresses, each 8 bits wide, for 512Mb total per chip)

TG = 66-pin TSOP chip package

-75 = 7.5ns @ CL2 latency (DDR 266)

The full datasheet for this example is located at http://download.micron.com/pdf/datasheets/dram/ddr/512MBDDRx4x8x16.pdf.

From this information, you can determine that the module has the following characteristics:

  • The module runs at DDR266 speeds using standard 2.5V DC voltage.

  • The module has a latency of CL2, so it can be used on any system that requires CL2 or slower latency (such as CL2.5 or CL3).

  • Each chip has a capacity of 512Mb (64 x 8 = 512).

  • Each chip is 8 bits wide. Because it takes 8 bits to make 1 byte, the capacity of the module can be calculated by grouping the memory chips on the module into groups of 8. If each chip contains 512Mb, a group of 8 chips provides 512MB, so a single-bank module with 8 chips has a capacity of 512MB. A dual-bank module has 2 groups of 8 chips, for a capacity of 1,024MB, or 1GB.

If the module has 9 instead of 8 memory chips (or 18 instead of 16), the additional chips are used for parity checking and support ECC error correction on servers with this feature.

To determine the size of the module in MB or GB and to determine whether the module supports ECC, count the memory chips on the module and compare them to Table 6.20. Note that with 8-bit-wide chips, the capacity of each memory chip in Mb matches the capacity of each bank of chips in MB.

Table 6.20. Module Capacity Using 512Mb (64Mbit x 8) Chips

Number of Chips    Number of Bits in Each Bank    Module Size    Supports ECC?    Single or Dual-Bank
8                  64                             512MB          No               Single
9                  72                             512MB          Yes              Single
16                 64                             1GB            No               Dual
18                 72                             1GB            Yes              Dual

The additional chip used by each group of eight chips provides parity checking, which is used by the ECC function on most server motherboards to correct single-bit errors.

A registered module contains 9 or 18 memory chips for ECC plus additional memory buffer chips. These chips are usually smaller in size and located near the center of the module, as shown previously in Figure 6.9.

Note

Some modules use 16-bit-wide memory chips. In such cases, only 4 chips are needed for single-bank memory (5 with parity/ECC support) and 8 are needed for double-bank memory (10 with parity/ECC support). These memory chips use a design listed as capacity x 16, like this: 256Mb x 16.
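The chip-counting arithmetic above is easy to automate. The following sketch is a simplified illustration rather than a vendor tool: given the number of memory chips on a module and each chip's organization (depth in "Meg" and width in bits, such as 64 Meg x 8), it estimates the module capacity, the number of banks, and whether extra parity/ECC chips are present.

def module_info(chip_count, meg_per_chip, bits_wide):
    """Estimate (capacity in MB, has ECC, banks) from the chips on a module."""
    chip_mbits = meg_per_chip * bits_wide          # capacity of each chip in Mb
    chips_per_bank = 64 // bits_wide               # chips needed for one 64-bit bank
    has_ecc = chip_count % chips_per_bank != 0     # leftover chips imply parity/ECC
    data_chips = chip_count if not has_ecc else chip_count * chips_per_bank // (chips_per_bank + 1)
    banks = data_chips // chips_per_bank
    capacity_mb = banks * chips_per_bank * chip_mbits // 8
    return capacity_mb, has_ecc, banks

print(module_info(8, 64, 8))     # 8 chips of 64 Meg x 8   -> (512, False, 1), matching Table 6.20
print(module_info(18, 64, 8))    # 18 chips of 64 Meg x 8  -> (1024, True, 2)
print(module_info(4, 16, 16))    # 4 chips of 16 Meg x 16 (256Mb each) -> (128, False, 1)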

You can also see this information if you look up the manufacturer, the memory type, and the organization in a search engine. For example, a web search for Micron "64 Meg x 8" DDR DIMM locates a parts list for Micron's 512MB and 1GB modules at http://www.micron.com/products/modules/ddrsdram/partlist.aspxxpincount=184-pin&version=Registered&package=VLP%20DIMM. The Comp. Config column lists the chip design for each chip on the module.

As you can see, with a little detective work, you can determine the size, speed, and type of a memory module, even if the module isn't marked, as long as the markings on the memory chips themselves are legible.

Tip

If you are unable to decipher a chip part number, you can use a program such as HWiNFO or SiSoftware Sandra to identify your memory module, as well as many other facts about your computer, including chipset, processor, empty memory sockets, and much more. You can download shareware versions of HWiNFO from www.hwinfo.com and SiSoftware Sandra from www.sisoftware.net.

Memory Banks

Memory chips (DIPs, SIMMs, SIPPs, and DIMMs) are organized in banks on motherboards and memory cards. You should know the memory bank layout and position on the motherboard and memory cards.

You need to know the bank layout when adding memory to the system. In addition, memory diagnostics report error locations by byte and bit addresses, and you must use these numbers to locate which bank in your system contains the problem.

The banks usually correspond to the data bus capacity of the system's microprocessor. Table 6.21 shows the widths of individual banks based on the type of PC.

Table 6.21. Memory Bank Widths on Various Systems

Processor                                             Data Bus   Bank Size     Bank Size      30-Pin SIMMs   72-Pin SIMMs   168-Pin DIMMs
                                                                 (No Parity)   (Parity/ECC)   per Bank       per Bank       per Bank
8088                                                  8-bit      8 bits        9 bits         1              n/a            n/a
8086                                                  16-bit     16 bits       18 bits        2              n/a            n/a
286                                                   16-bit     16 bits       18 bits        2              n/a            n/a
386SX, SL, SLC                                        16-bit     16 bits       18 bits        2              n/a            n/a
486SLC, SLC2                                          16-bit     16 bits       18 bits        2              n/a            n/a
386DX                                                 32-bit     32 bits       36 bits        4              1              n/a
486SX, DX, DX2, DX4, 5x86                             32-bit     32 bits       36 bits        4              1              n/a
Pentium, K6 series                                    64-bit     64 bits       72 bits        8[1]           2[2]           1
PPro, PII, Celeron, PIII, P4, Pentium D,
  Pentium Extreme Edition, Athlon/Duron, Athlon XP,
  Athlon 64 (single-channel mode)                     64-bit     64 bits       72 bits                                      1
Athlon XP, P4, Pentium D, Pentium Extreme Edition,
  Athlon 64, Athlon 64 FX, Athlon 64 X2
  (dual-channel mode)                                 64-bit     128 bits[3]   144 bits[3]

[1] Very few, if any, motherboards using this type of memory were made for these processors.

[2] 72-pin SIMMs were used by some systems running Pentium, Pentium Pro, Pentium II, and Pentium II Xeon processors; they were replaced by SDRAM and newer types of DIMMs.

[3] Dual-channel mode requires matched pairs of memory inserted into the memory sockets designated for dual-channel mode. If a single module or two different-size modules are used, or the dual-channel sockets are not used, the system runs in single-channel mode.

The number of bits for each bank can be made up of single chips, SIMMs, or DIMMs. Modern systems don't use individual chips; instead, they use only SIMMs or DIMMs. If the system has a 16-bit processor, such as a 386SX, it probably uses 30-pin SIMMs and has two SIMMs per bank. All the SIMMs in a single bank must be the same size and type.

A 486 system requires four 30-pin SIMMs or one 72-pin SIMM to make up a bank. A single 72-pin SIMM is 32 bits wide, or 36 bits wide if it supports parity. You can often tell whether a SIMM supports parity by counting its chips. To make a 32-bit SIMM, you could use 32 individual 1-bit-wide chips, or you could use eight individual 4-bit-wide chips to make up the data bits. If the system uses parity, four extra bits are required (36 bits total), so you would see one more 4-bit-wide or four individual 1-bit-wide chips added to the bank for the parity bits.

As you might imagine, 30-pin SIMMs are less than ideal for 32-bit or 64-bit systems (that is, 486 or Pentium) because you must use them in increments of four or eight per bank. Consequently, only a few 32-bit systems were ever built using 30-pin SIMMs, and no 64-bit systems have ever used 30-pin SIMMs. If a 32-bit system (such as any PC with a 386DX or 486 processor) uses 72-pin SIMMs, each SIMM represents a separate bank and the SIMMs can be added or removed on an individual basis rather than in groups of four, as would be required with 30-pin SIMMs. This makes memory configuration much easier and more flexible. In 64-bit systems that use SIMMs, two 72-pin SIMMs are required per bank.
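As a quick illustration of the bank arithmetic in this section, the number of identical modules needed to fill one bank is simply the processor data bus width divided by the module data width. The sketch below reproduces the combinations discussed here and in Table 6.21; it is a convenience calculation only and does not account for chipset-specific restrictions.

def modules_per_bank(cpu_bus_bits, module_bits):
    """How many identical modules make up one memory bank."""
    if cpu_bus_bits % module_bits:
        raise ValueError("module width does not divide the CPU data bus evenly")
    return cpu_bus_bits // module_bits

print(modules_per_bank(16, 8))    # 386SX with 30-pin (8-bit) SIMMs -> 2 per bank
print(modules_per_bank(32, 8))    # 486 with 30-pin SIMMs           -> 4 per bank
print(modules_per_bank(32, 32))   # 486 with 72-pin (32-bit) SIMMs  -> 1 per bank
print(modules_per_bank(64, 32))   # Pentium with 72-pin SIMMs       -> 2 per bank
print(modules_per_bank(64, 64))   # Pentium with 64-bit DIMMs       -> 1 per bank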

DIMMs are ideal for Pentium and higher systems because the 64-bit width of the DIMM exactly matches the 64-bit width of the Pentium processor data bus. Therefore, each DIMM represents an individual bank, and they can be added or removed one at a time. Many recent systems have been designed to use matched pairs of memory modules for faster performance. So-called "dual-channel" designs treat a matched pair of modules as a single 128-bit (or 144-bit if parity or ECC memory is used) device. In those cases, although a single module can be used, modules must be installed in pairs to achieve best performance.

The physical orientation and numbering of the SIMMs or DIMMs used on a motherboard is arbitrary and determined by the board's designers, so documentation covering your system or card comes in handy. You can determine the layout of a motherboard or an adapter card through testing, but that takes time and might be difficult, particularly after you have a problem with a system.

Caution

If your system supports dual-channel memory, be sure you use the correct memory sockets to enable dual-channel operation. Check the documentation to ensure that you use the correct pair of sockets. Most dual-channel systems will still run if the memory is not installed in a way that permits dual-channel operation, but performance is lower than if the memory were installed properly. Some systems provide dual-channel support if an odd number of modules are installed, as long as the total capacity of two modules installed in one channel equals the size of the single module in the other channel and all modules are the same speed and latency. Again, check your documentation for details.

Memory Module Speed

When you replace a failed memory module or install a new module as an upgrade, you typically must install a module of the same type and speed as the others in the system. You can substitute a module with a different (faster) speed but only if the replacement module's speed is equal to or faster than that of the other modules in the system.

Some people have had problems when "mixing" modules of different speeds. With the wide variety of motherboards, chipsets, and memory types, few ironclad rules exist. When in doubt as to which speed module to install in your system, consult the motherboard documentation for more information.

Substituting faster memory of the same type doesn't result in improved performance if the system still operates the memory at the same speed. Systems that use DIMMs or RIMMs can read the speed and timing features of the module from a special SPD ROM installed on the module and then set chipset (memory controller) timing accordingly. In these systems, you might see an increase in performance by installing faster modules, to the limit of what the chipset will support.

To place more emphasis on timing and reliability, there are Intel and JEDEC standards governing memory types that require certain levels of performance. These standards certify that memory modules perform within Intel's timing and performance guidelines.

The same common symptoms result when the system memory has failed or is simply not fast enough for the system's timing. The usual symptoms are frequent parity check errors or a system that does not operate at all. The POST might report errors, too. If you're unsure of which chips to buy for your system, contact the system manufacturer or a reputable chip supplier.

See "Parity Checking," p. 516.

Parity and ECC

Part of the nature of memory is that it inevitably fails. These failures are usually classified as two basic types: hard fails and soft errors.

The best understood are hard fails, in which the chip is working and then, because of some flaw, physical damage, or other event, becomes damaged and experiences a permanent failure. Fixing this type of failure normally requires replacing some part of the memory hardware, such as the chip, SIMM, or DIMM. Hard error rates are known as HERs.

The other more insidious type of failure is the soft error, which is a nonpermanent failure that might never recur or could occur only at infrequent intervals. (Soft fails are effectively "fixed" by powering the system off and back on.) Soft error rates are known as SERs.

About 20 years ago, Intel made a discovery about soft errors that shook the memory industry. It found that alpha particles were causing an unacceptably high rate of soft errors or single event upsets (SEUs, as they are sometimes called) in the 16KB DRAMs that were available at the time. Because alpha particles are low-energy particles that can be stopped by something as thin and light as a sheet of paper, it became clear that for alpha particles to cause a DRAM soft error, they would have to be coming from within the semiconductor material. Testing showed trace elements of thorium and uranium in the plastic and ceramic chip packaging materials used at the time. This discovery forced all the memory manufacturers to evaluate their manufacturing processes to produce materials free from contamination.

Today, memory manufacturers have all but totally eliminated the alpha-particle source of soft errors. Many people believed that was justification for the industry trend to drop parity checking. The argument is that, for example, a 16MB memory subsystem built with 4MB technology would experience a soft error caused by alpha particles only about once every 16 years! The real problem with this thinking is that it is seriously flawed, and many system manufacturers and vendors were coddled into removing parity and other memory fault-tolerant techniques from their systems even though soft errors continue to be an ongoing problem. More recent discoveries prove that alpha particles are now only a small fraction of the cause of DRAM soft errors.

As it turns out, the biggest cause of soft errors today are cosmic rays. IBM researchers began investigating the potential of terrestrial cosmic rays in causing soft errors similar to alpha particles. The difference is that cosmic rays are very high-energy particles and can't be stopped by sheets of paper or other more powerful types of shielding. The leader in this line of investigation was Dr. J.F. Ziegler of the IBM Watson Research Center in Yorktown Heights, New York. He has produced landmark research into understanding cosmic rays and their influence on soft errors in memory.

One example of the magnitude of the cosmic ray soft-error phenomenon demonstrated that with a certain sample of non-IBM DRAMs, the SER at sea level was measured at 5950 FIT (failures in time, where 1 FIT equals one failure per billion device hours) per chip. This was measured under real-life conditions with the benefit of millions of device hours of testing. In an average system, this would result in a soft error occurring every six months or less. In power-user or server systems with a larger amount of memory, it could mean one or more errors per month! When the exact same test setup and DRAMs were moved to an underground vault shielded by more than 50 feet of rock, thus eliminating all cosmic rays, absolutely no soft errors were recorded. This not only demonstrates how troublesome cosmic rays can be, but it also proves that the packaging contamination and alpha-particle problem has indeed been solved.
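FIT figures translate directly into expected error counts, because 1 FIT is one failure per billion (10^9) device hours. The sketch below shows the arithmetic; the 5950 FIT value is the sea-level measurement quoted above, while the 16-chip count is simply an arbitrary example system, so the result is illustrative rather than a claim about any particular configuration.

HOURS_PER_YEAR = 24 * 365

def soft_errors_per_year(fit_per_chip, num_chips):
    """Expected soft errors per year; 1 FIT = 1 failure per 10**9 device-hours."""
    return fit_per_chip * num_chips * HOURS_PER_YEAR / 1e9

rate = soft_errors_per_year(5950, 16)   # 16 DRAM chips at the measured sea-level SER
print(f"~{rate:.2f} soft errors/year, or roughly one every {12 / rate:.1f} months")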

Cosmic-ray-induced errors are even more of a problem in SRAMs than DRAMs because the amount of charge required to flip a bit in an SRAM cell is less than is required to flip a DRAM cell capacitor. Cosmic rays are also more of a problem for higher-density memory. As chip density increases, it becomes easier for a stray particle to flip a bit. It has been predicted by some that the soft error rate of a 64MB DRAM will be double that of a 16MB chip, and a 256MB DRAM will have a rate four times higher.

Unfortunately, the PC industry has largely failed to recognize this cause of memory errors. Electrostatic discharge, power surges, or unstable software can much more easily explain away the random and intermittent nature of a soft error, especially right after a new release of an operating system or major application.

Studies have shown that the soft error rate for ECC systems is on the order of 30 times greater than the hard error rate. This is not surprising to those familiar with the full effects of cosmic-ray-generated soft errors. The number of errors experienced varies with the density and amount of memory present. Studies show that soft errors can occur from once a month or less to several times a week or more!

Although cosmic rays and other radiation events are the biggest cause of soft errors, soft errors can also be caused by the following:

  • Power glitches or noise on the line. This can be caused by a defective power supply in the system or by defective power at the outlet.

  • Incorrect type or speed rating. The memory must be the correct type for the chipset and match the system access speed.

  • RF (radio frequency) interference. Caused by radio transmitters in close proximity to the system, which can generate electrical signals in system wiring and circuits. Keep in mind that the increased use of wireless networks, keyboards, and mouse devices can lead to a greater risk of RF interference.

  • Static discharges. These cause momentary power spikes, which can alter data.

  • Timing glitches. Data doesn't arrive at the proper place at the proper time, causing errors. Often caused by improper settings in the BIOS Setup, by memory that is rated slower than the system requires, or by overclocked processors and other system components.

  • Heat buildup. High-speed memory modules run hotter than older modules. RDRAM RIMM modules were the first memory to include integrated heat spreaders, and many high-performance DDR and DDR2 memory modules now include heat spreaders to help fight heat buildup.

Most of these problems don't cause chips to permanently fail (although bad power or static can damage chips permanently), but they can cause momentary problems with data.

How can you deal with these errors? Just ignoring them is certainly not the best approach, but unfortunately that is what many system manufacturers and vendors are doing today. The best way to deal with this problem is to increase the system's fault tolerance. This means implementing ways of detecting and possibly correcting errors in PC systems. Three basic levels and techniques are used for fault tolerance in modern PCs:

  • Nonparity

  • Parity

  • ECC

Nonparity systems have no fault tolerance at all. The only reason they are used is that they have the lowest inherent cost: unlike parity or ECC techniques, no additional memory is necessary. Because a parity-type data byte has 9 bits versus 8 for nonparity, memory cost is approximately 12.5% higher. Also, the nonparity memory controller is simplified because it does not need the logic gates to calculate parity or ECC check bits. Portable systems that place a premium on minimizing power might benefit from the reduction in memory power resulting from fewer DRAM chips. Finally, the memory system data bus is narrower, which reduces the number of data buffers required. The statistical probability of memory failures in a modern office desktop computer is now estimated at about one error every few months. Errors will be more or less frequent depending on how much memory you have.

This error rate might be tolerable for low-end systems that are not used for mission-critical applications. In this case, the extreme market sensitivity to price probably can't justify the extra cost of parity or ECC memory, and such errors then must be tolerated.

At any rate, having no fault tolerance in a system is simply gambling that memory errors are unlikely. You further gamble that, if they do occur, the cost of those errors will be less than the cost of the additional hardware necessary to detect them. However, the risk is that these memory errors can lead to serious problems. A memory error in a calculation could cause the wrong value to go into a bank check. In a server, a memory error could force a system to hang and bring down all LAN-resident client systems, with a subsequent loss of productivity. Finally, with a nonparity or non-ECC memory system, tracing the problem is difficult, which is not the case with parity or ECC. These techniques at least isolate memory as the source of the problem, thus reducing both the time and cost of resolving it.

Parity Checking

One standard IBM set for the industry is that the memory chips in a bank of nine each handle 1 bit of data: 8 bits per character plus 1 extra bit called the parity bit. The parity bit enables memory-control circuitry to keep tabs on the other 8 bits, providing a built-in cross-check for the integrity of each byte in the system. If the circuitry detects an error, the computer stops and displays a message informing you of the malfunction. If you are running a GUI operating system, such as Windows or OS/2, a parity error generally manifests itself as a locked system. When you reboot, the BIOS should detect the error and display the appropriate error message.

SIMMs and DIMMs are available both with and without parity bits. Originally, all PC systems used parity-checked memory to ensure accuracy. Starting in 1994, a disturbing trend developed in the PC-compatible marketplace. Most vendors began shipping systems without parity checking or any other means of detecting or correcting errors! These systems can use cheaper nonparity SIMMs, which saves about 10% to 15% on memory costs for a system. Parity memory results in increased initial system cost, primarily because of the additional memory bits involved. Parity can't correct system errors, but because parity can detect errors, it can make the user aware of memory errors when they happen. This has two basic benefits:

  • Parity guards against the consequences of faulty calculations based on incorrect data.

  • Parity pinpoints the source of errors, which helps with problem resolution, thus improving system serviceability.

PC systems can easily be designed to function using either parity or nonparity memory. The cost of implementing parity as an option on a motherboard is virtually nothing; the only cost is in actually purchasing the parity SIMMs or DIMMs. This enables a system manufacturer to offer its system purchasers the choice of parity if the purchasers feel the additional cost is justified for their particular applications.

Unfortunately, several of the big names began selling systems without parity to reduce their prices, and they did not make it well known that the lower cost meant parity memory was no longer included as standard. This began happening mostly in 1994 and 1995, and it has continued until recently, with few people understanding the full implications. After one or two major vendors did this, most of the others were forced to follow to remain price-competitive.

Because nobody wanted to announce this information, it remained sort of a dirty little secret within the industry. Originally, when this happened you could still specify parity memory when you ordered a system, even though the default configurations no longer included it. There was a 10% to 15% surcharge on the memory, but those who wanted reliable, trustworthy systems could at least get them, provided they knew to ask, of course. Then a major bomb hit the industry, in the form of the Intel Triton 430FX Pentium chipset, which was the first major chipset on the market that did not support parity checking at all! It also became the most popular chipset of its time and was found in practically all Pentium motherboards sold in the 1995 timeframe. This set a disturbing trend for the next few years. All but one of Intel's Pentium processor chipsets after the 430FX did not support parity-checked memory; the only one that did was the 430HX Triton II.

Since then, Intel and other chipset manufacturers have put support for parity and ECC memory in most of their chipsets (especially in their higher-end models). The low-end chipsets, however, typically lack support for either parity or ECC. If reliability is important to you, make sure the systems you purchase have this support. In Chapter 4, you can learn which recent chipsets support parity and ECC memory and which ones do not.

Let's look at how parity checking works, and then examine in more detail its successor, ECC, which can not only detect but also correct memory errors on the fly.

How Parity Checking Works

IBM originally established the odd parity standard for error checking. The following explanation might help you understand what is meant by odd parity. As the 8 individual bits in a byte are stored in memory, a parity generator/checker, which is either part of the CPU or located in a special chip on the motherboard, evaluates the data bits by adding up the number of 1s in the byte. If an even number of 1s is found, the parity generator/checker creates a 1 and stores it as the ninth bit (parity bit) in the parity memory chip. That makes the sum for all 9 bits (including the parity bit) an odd number. If the original sum of the 8 data bits is an odd number, the parity bit created would be a 0, keeping the sum for all 9 bits an odd number. The basic rule is that the value of the parity bit is always chosen so that the sum of all 9 bits (8 data bits plus 1 parity bit) is stored as an odd number.

If the system used even parity, the example would be the same except the parity bit would be created to ensure an even sum. It doesn't matter whether even or odd parity is used; the system uses one or the other, and it is completely transparent to the memory chips involved. Remember that the 8 data bits in a byte are numbered 0 through 7. The following examples might make it easier to understand:

Data bit number:  0 1 2 3 4 5 6 7   Parity bit
Data bit value:   1 0 1 1 0 0 1 1   0

In this example, because the total number of data bits with a value of 1 is an odd number (5), the parity bit must have a value of 0 to ensure an odd sum for all 9 bits.

Here is another example:

Data bit number:  0 1 2 3 4 5 6 7   Parity bit
Data bit value:   1 1 1 1 0 0 1 1   1

In this example, because the total number of data bits with a value of 1 is an even number (6), the parity bit must have a value of 1 to create an odd sum for all 9 bits.

When the system reads memory back from storage, it checks the parity information. If a (9-bit) byte has an even number of 1 bits, that byte must have an error. The system can't tell which bit has changed or whether only a single bit has changed. If 3 bits changed, for example, the byte still flags a parity-check error; if 2 bits changed, however, the bad byte could pass unnoticed. Because multiple-bit errors (in a single byte) are rare, this scheme gives you a reasonable and inexpensive ongoing indication that memory is good or bad.
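The following short sketch in C mirrors the scheme just described. It is purely illustrative; in a real PC the parity generator/checker is hardware in the chipset or memory controller, not software:

#include <stdio.h>
#include <stdint.h>

/* Generate the odd-parity bit for one data byte: the bit is chosen so that
   the total number of 1s across all 9 bits (8 data + parity) is odd. */
static uint8_t odd_parity_bit(uint8_t data)
{
    int ones = 0;
    for (int bit = 0; bit < 8; bit++)
        ones += (data >> bit) & 1;
    return (ones % 2 == 0) ? 1 : 0;   /* even count of 1s -> store a 1 */
}

/* Check a byte read back from memory along with its stored parity bit.
   Returns 1 (error) if the 9-bit total is even, which means an odd number
   of bits flipped; an even number of flipped bits goes unnoticed. */
static int parity_error(uint8_t data, uint8_t parity)
{
    int ones = parity;
    for (int bit = 0; bit < 8; bit++)
        ones += (data >> bit) & 1;
    return (ones % 2) == 0;
}

int main(void)
{
    uint8_t byte = 0xCD;               /* bits 0-7 = 1,0,1,1,0,0,1,1 (first example above) */
    uint8_t p = odd_parity_bit(byte);  /* five 1s already, so the parity bit is 0 */
    printf("parity bit = %u, error = %d\n", p, parity_error(byte, p));

    byte ^= 0x10;                      /* flip one bit to simulate a soft error */
    printf("after a single-bit flip, error = %d\n", parity_error(byte, p));
    return 0;
}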

The following examples show parity-check messages for three types of older systems:

For the IBM PC:                      PARITY CHECK x
For the IBM XT:                      PARITY CHECK x yyyyy (z)
For the IBM AT and late model XT:    PARITY CHECK x yyyyy

where x is 1 or 2:

1 = Error occurred on the motherboard

2 = Error occurred in an expansion slot

yyyyy represents a number from 00000 through FFFFF that indicates, in hexadecimal notation, the address of the byte in which the error occurred.

where (z) is (S) or (E):

(S) = Parity error occurred in the system unit

(E) = Parity error occurred in an optional expansion chassis

Note

An expansion chassis was an option IBM sold for the original PC and XT systems to add more expansion slots.

When a parity-check error is detected, the motherboard parity-checking circuits generate a nonmaskable interrupt (NMI), which halts processing and diverts the system's attention to the error. The NMI causes a routine in the ROM to be executed. On some older IBM systems, the ROM parity-check routine halts the CPU. In such a case, the system locks up, and you must perform a hardware reset or a power-off/power-on cycle to restart the system. Unfortunately, all unsaved work is lost in the process.

Most systems do not halt the CPU when a parity error is detected; instead, they offer you the choice of rebooting the system or continuing as though nothing happened. Additionally, these systems might display the parity error message in a different format from IBM, although the information presented is basically the same. For example, most systems with a Phoenix BIOS display one of these messages:

Memory parity interrupt at xxxx:xxxx
Type (S)hut off NMI, Type (R)eboot, other keys to continue

or

I/O card parity interrupt at xxxx:xxxx
Type (S)hut off NMI, Type (R)eboot, other keys to continue

The first of these two messages indicates a motherboard parity error (Parity Check 1), and the second indicates an expansion-slot parity error (Parity Check 2). Notice that the address given in the form xxxx:xxxx for the memory error is in a segment:offset form rather than a straight linear address, such as with IBM's error messages. The segment:offset address form still gives you the location of the error to a resolution of a single byte.
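If you want to compare such an address with a linear one, the standard real-mode conversion is to shift the segment left by 4 bits (multiply by 16) and add the offset; the address below is just an illustrative value, not one from an actual error message:

linear address = (segment × 16) + offset
e.g., 1234:0022 → 12340h + 0022h = 12362h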

You have three ways to proceed after viewing this error message:

  • You can press S, which shuts off parity checking and resumes system operation at the point where the parity check first occurred.

  • You can press R to force the system to reboot, losing any unsaved work.

  • You can press any other key to cause the system to resume operation with parity checking still enabled.

If the problem recurs, it is likely to cause another parity-check interruption. It's usually prudent to press S, which disables the parity checking so you can then save your work. In this case, it's best to save your work to a floppy disk to prevent the possible corruption of the hard disk. You should also avoid overwriting any previous (still good) versions of whatever file you are saving because you could be saving a bad file caused by the memory corruption. Because parity checking is now disabled, your save operations will not be interrupted. Then, you should power the system off, restart it, and run whatever memory-diagnostics software you have to try to track down the error. In some cases, the POST finds the error on the next restart, but you usually need to run a more sophisticated diagnostics program, perhaps in a continuous mode, to locate the error.

Systems with an AMI BIOS display the parity error messages in the following forms:

ON BOARD PARITY ERROR ADDR (HEX) = (xxxxx)

or

OFF BOARD PARITY ERROR ADDR (HEX) = (xxxxx)

These messages indicate that an error in memory has occurred during the POST, and the failure is located at the address indicated. The first one indicates that the error occurred on the motherboard, and the second message indicates an error in an expansion slot adapter card. The AMI BIOS can also display memory errors in the following forms:

Memory Parity Error at xxxxx

or

I/O Card Parity Error at xxxxx

These messages indicate that an error in memory has occurred at the indicated address during normal operation. The first one indicates a motherboard memory error, and the second indicates an expansion slot adapter memory error.

Although many systems enable you to continue processing after a parity error, and even allow disabling further parity checking, continuing to use your system after a parity error is detected can be dangerous. The idea behind allowing you to continue (by either method) is to give you time to save any unsaved work before you diagnose and service the computer, but be careful how you do so.

Note that these messages can vary depending not only on the ROM BIOS but also on your operating system. Protected mode operating systems, such as most versions of Windows, trap these errors and run their own handler program that displays a message different from what the ROM would have displayed. The message might be associated with a blue screen or might be a trap error, but it usually indicates that it is memory or parity related. For example, Windows 98 displays a message indicating Memory parity error detected. System halted. when such an error has occurred.

Caution

When you are notified of a memory parity error, remember the parity check is telling you that memory has been corrupted. Do you want to save potentially corrupted data over the good file from the last time you saved? Definitely not! Be sure you save your work with a different filename. In addition, after a parity error, save only to a floppy disk if possible and avoid writing to the hard disk; there is a slight chance that the hard drive could become corrupt if you save the contents of corrupted memory.

After saving your work, determine the cause of the parity error and repair the system. You might be tempted to use an option to shut off further parity checking and simply continue using the system as though nothing were wrong. Doing so is like unscrewing the oil pressure warning indicator bulb on a car with an oil leak so the oil pressure light won't bother you anymore!

A few years ago, when memory was more expensive, a few companies marketed SIMMs with bogus parity chips. Instead of actually having the extra memory chips needed to store the parity bits, these "logic parity" or parity "emulation" SIMMs used an onboard parity generator chip. This chip ignored any parity the system was trying to store on the SIMM, but when data was retrieved, it always ensured that the correct parity was returned, thus making the system believe all was well even though there might have been a problem.

These bogus parity modules were sold because memory was much more expensive at the time, and a company could offer a "parity" SIMM for only a few dollars more by using the fake chip. Identifying them can be difficult, although the bogus parity generator doesn't look like a memory chip and carries different markings from the other memory chips on the SIMM. Most of them had a "GSM" logo, which indicated the original manufacturer of the parity logic device, not necessarily the SIMM itself.

One way to positively identify these fake parity SIMMs is by using a hardware SIMM-test machine, such as those made by Tanisys (www.tanisys.com), CST (www.simmtester.com), or Innoventions (www.memorytest.com). I haven't seen DIMMs or RIMMs with fake parity/ECC bits, and memory prices have come down far enough that it probably isn't worth the trouble anymore.

Error Correcting Code

ECC goes a big step beyond simple parity-error detection. Instead of just detecting an error, ECC allows a single-bit error to be corrected, which means the system can continue without interruption and without corrupting data. ECC, as implemented in most PCs, can only detect, not correct, double-bit errors. Because studies have indicated that approximately 98% of memory errors are the single-bit variety, the most commonly used type of ECC is one in which the memory controller detects and corrects single-bit errors in an accessed data word (double-bit errors can be detected but not corrected). This type of ECC is known as single-bit error-correction double-bit error detection (SEC-DED) and requires an additional 7 check bits over 32 bits in a 4-byte system and an additional 8 check bits over 64 bits in an 8-byte system.

ECC in a 4-byte (32-bit, such as a 486) system obviously costs more than nonparity or parity, but in an 8-byte-wide bus (64-bit, such as Pentium/Athlon) system, ECC and parity costs are equal because the same number of extra bits (8) is required for either parity or ECC. Because of this, you can purchase parity SIMMs (36-bit), DIMMs (72-bit), or RIMMs (18-bit) for your system and use them in ECC mode if the chipset supports ECC functionality. If the system uses SIMMs, two 36-bit (parity) SIMMs are added for each bank (for a total of 72 bits), and ECC is done at the bank level. If the system uses DIMMs, a single parity/ECC 72-bit DIMM is used as a bank and provides the additional bits. RIMMs are installed in singles or pairs, depending on the chipset and motherboard. They must be 18-bit versions if parity/ECC is desired.
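A quick way to see where those check-bit counts come from: in a Hamming-style single-error-correcting code, the check bits must be able to encode the position of any single flipped bit among the m data bits and k check bits, plus one "no error" state, so the smallest k is needed that satisfies

2^k ≥ m + k + 1

For m = 32 data bits, k = 6 suffices (2^6 = 64 ≥ 39), and adding one more overall parity bit for double-bit detection gives the 7 check bits quoted above. For m = 64, k = 7 suffices (2^7 = 128 ≥ 72), and the extra detection bit brings the total to 8.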

ECC entails the memory controller calculating the check bits on a memory-write operation, comparing the stored and recalculated check bits on a read operation, and, if necessary, correcting bad bits. The additional ECC logic in the memory controller is not very significant in this age of inexpensive, high-performance VLSI logic, but ECC does affect memory performance slightly: writes must wait for the check bits to be calculated, and reads are delayed whenever the system must wait for corrected data. On a partial-word write, the entire word must first be read, the affected byte(s) rewritten, and then new check bits calculated. This turns partial-word writes into slower read-modify-write operations. Fortunately, this performance hit is very small, on the order of a few percent at maximum, so the tradeoff for increased reliability is a good one.
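Here is a minimal sketch in C of that read-modify-write sequence. The single simulated memory word, the function names, and the placeholder XOR-based check_bits() routine are illustrative assumptions only; a real controller implements SEC-DED entirely in hardware:

#include <stdio.h>
#include <stdint.h>

static uint64_t data_word;     /* one simulated 64-bit memory word */
static uint8_t  stored_check;  /* its stored check bits            */

/* Placeholder check-bit calculation (a simple XOR fold), NOT real SEC-DED;
   the point here is the extra read and recalculation, not the code itself. */
static uint8_t check_bits(uint64_t word)
{
    uint8_t c = 0;
    for (int i = 0; i < 8; i++)
        c ^= (uint8_t)(word >> (i * 8));
    return c;
}

/* Full-word write: one memory operation plus the check-bit calculation. */
static void ecc_write_word(uint64_t word)
{
    data_word = word;
    stored_check = check_bits(word);
}

/* Partial-word write: the whole word must be read first, the affected byte
   merged in, new check bits calculated, and the word written back. */
static void ecc_write_byte(unsigned byte_index, uint8_t value)
{
    uint64_t word = data_word;                       /* extra read            */
    word &= ~((uint64_t)0xFF << (byte_index * 8));
    word |= (uint64_t)value << (byte_index * 8);     /* merge in the new byte */
    ecc_write_word(word);                            /* recompute and write   */
}

int main(void)
{
    ecc_write_word(0x1122334455667788ULL);
    ecc_write_byte(0, 0xAA);
    printf("word = %016llx, check = %02x\n",
           (unsigned long long)data_word, (unsigned)stored_check);
    return 0;
}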

Most memory errors are of a single-bit nature, which ECC can correct. Incorporating this fault-tolerant technique provides high system reliability and attendant availability. An ECC-based system is a good choice for servers, workstations, or mission-critical applications in which the cost of a potential memory error outweighs the additional memory and system cost required to correct it, and in which overall system reliability must be maintained. If you value your data and use your system for important (to you) tasks, you'll want ECC memory. No self-respecting manager would build or run a network server, even a lower-end one, without ECC memory.

When a system is designed to accept ECC, parity, or nonparity memory, users can choose the level of fault tolerance they want, as well as how much they want to gamble with their data.
