SMP CPUs

Intel - Xeon

Foster: Socket 603, 400MHz FSB
The "old" Xeon core is the server counterpart of the original Pentium 4 "Willamette" core in desktop PCs. Limited to a 400MHz FSB and with no hyperthreading, it is virtually impossible to obtain these days. You wouldn't want one anyway - support on most new motherboards is limited at best.

Prestonia: Socket 603, 400MHz FSB
The Prestonia core was a big step for the Xeon range. It was the first Xeon core to implement hyperthreading (note: this is back in 2001, way before the desktop P4 had hyperthreading). The new core also doubled L2 cache to 512KB and reduced voltage from the rather toasty 1.7V used by Foster to a more manageable 1.475V. The slower chips based on the new core (2.2GHz and below) are only available in 400MHz versions. The 2.4GHz part is available in both 400MHz and 533MHz FSB versions, thus making it a popular choice for overclockers (get the 400MHz part, set the bus speed to 533MHz, and you have an immediate 33% overclock).
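The overclocking arithmetic above can be sketched in a few lines. This is only an illustration: it assumes the multiplier is locked, so the core clock scales in direct proportion to the bus speed, and the function name is made up for this example.

```python
def overclock(core_mhz, stock_fsb_mhz, new_fsb_mhz):
    """Return (new core clock in MHz, percentage gain) after an FSB change.

    Assumption: the multiplier is locked, so the core clock scales
    linearly with the front-side bus.
    """
    ratio = new_fsb_mhz / stock_fsb_mhz
    new_core_mhz = core_mhz * ratio
    gain_pct = (ratio - 1) * 100
    return new_core_mhz, gain_pct


# The 2.4GHz / 400MHz FSB part run on a 533MHz bus.
new_core, gain = overclock(2400, 400, 533)
print(f"{new_core:.0f}MHz core, {gain:.0f}% overclock")
```

In other words the 400MHz part ends up running at roughly 3.2GHz, assuming the chip, board and cooling can cope.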

Prestonia: Socket 604, 533MHz FSB
Higher-speed Prestonias (>2.4GHz) are only available in the 533MHz version. They offer 512KB L2 cache like their slower 400MHz FSB counterparts. The extra pin on this socket is purely so people with older Socket 603 boards can't use the newer 533MHz chips. Yes, I'm a cynic.

Prestonia: 3.06GHz and 3.2GHz
When AMD's Opterons came on the market Intel had a problem: unlike its Pentium 4 counterpart (which is positively bathing in memory bandwidth) the Xeon is notoriously lacking in memory bandwidth. This is a limitation of the chipset design - both CPUs talk to the Northbridge over a shared bus, and the Northbridge talks to the RAM. As clock speeds have increased with much smaller increases in FSB (400MHz -> 533MHz) this has become even more of a problem. As a stopgap solution Intel essentially took the P4 Extreme Edition and made it into a server chip. The 3.06GHz Prestonia incorporates an additional 1MB Level 3 cache. The 3.2GHz part incorporates a 2MB L3 cache. It is hoped that these will prevent Opteron from running too far ahead in the benchmarks until the 800MHz Xeon platform becomes available.

Gallatin:
Branded as the Xeon MP, this chip is insanely expensive and designed for 4- and 8-way multiprocessing. The Gallatin core is essentially the Prestonia 400MHz FSB with *insane* amounts of Level 3 cache. Chips in this range are available with 1MB, 2MB and 4MB Level 3 caches in an attempt to make up for the criminal lack of memory bandwidth the platform suffers from.

AMD

Opteron 24x: Socket 940
The Athlon MP never really took off as a server chip, partly due to the lack of motherboard support and partly because major OEMs still didn't entirely trust AMD's corporate image. Remember AMD's previous chips were all low-cost desktop parts (K5, K6, K6-2) - AMD had no experience in the SMP market at this point. The Opteron will in all likelihood change AMD's image forever. It's the first chip designed from the ground up as a server part rather than a server version of a desktop chip.

The Opteron is available in 3 ranges: 14x, 24x and 84x. We'll ignore the 14x - it's not designed for multiprocessor operation. The 24x is capable of running in 2-way SMP, the 84x is capable of running anything up to 8-way SMP. Although raw core speeds are relatively low (the Opteron 240 runs @ 1.4GHz, the fastest Opteron 248 runs @ 2.2GHz) the Opteron seems to outperform Xeons clocked a good 1GHz higher. The Opteron also supports the x86-64 instruction set for 64-bit computing. As more and more systems are compiled for x86-64 the Opteron's performance lead over competing Intel platforms is likely to increase still further.

The Opteron doesn't suffer the same problems faced by the Xeon cores - each CPU has its own integrated memory controller and essentially its own DIMM bank. This means that the Opteron's memory bandwidth increases proportionally to the number of chips in the system, thus resulting in far superior scaling. Each chip has 3 HyperTransport links, each with 6.4GB/sec bandwidth for inter-CPU and system-wide communication. In most 2-way configurations one HT link on each CPU is reserved for inter-chip communication (including when one CPU needs to access the other's DIMMs); since both CPUs sit on the same link, that gives 6.4GB/sec between them (3.2GB/sec in each direction). Another HT link is usually used for AGP, often with a pass-through bridge for legacy components (PCI, USB, etc.). The other HT links are used by one or more PCI-X controllers for high-bandwidth peripherals (SCSI, SATA, GigE, etc.).
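To put rough numbers on the scaling argument, here is a back-of-the-envelope sketch comparing a shared front-side bus with per-CPU memory controllers. The figures are assumptions for illustration, not measurements: a 64-bit (8-byte) 533MHz FSB for the Xeon, and dual-channel DDR333 on each Opteron. These are theoretical peaks that assume each Opteron mostly reads from its own DIMMs; the function names are invented for this example.

```python
def shared_fsb_total(fsb_mts, bus_bytes=8):
    """Total GB/sec when every CPU shares one front-side bus to the Northbridge.

    Assumption: 64-bit (8-byte) bus, fsb_mts in millions of transfers per second.
    """
    return fsb_mts * bus_bytes / 1000


def per_cpu_controller_total(mem_mts, n_cpus, channels=2, channel_bytes=8):
    """Total GB/sec when each CPU brings its own memory controller and DIMM bank.

    Assumption: dual-channel, 64-bit channels, every CPU using local memory.
    """
    return mem_mts * channel_bytes * channels / 1000 * n_cpus


for n_cpus in (1, 2, 4):
    xeon = shared_fsb_total(533)                     # fixed, however many CPUs
    opteron = per_cpu_controller_total(333, n_cpus)  # grows with CPU count
    print(f"{n_cpus} CPU(s): shared FSB {xeon:.1f}GB/sec total "
          f"({xeon / n_cpus:.1f}GB/sec each), "
          f"per-CPU controllers {opteron:.1f}GB/sec total")
```

The shared-bus total stays flat as CPUs are added, so each CPU's share shrinks, while the per-CPU-controller total grows with the CPU count.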

An important note about cache: The Opteron has a 128KB L1 cache (64KB instruction, 64KB data) and a 1MB L2 cache. This gives a total of 1152KB, less than the equivalent Xeon platforms. But AMD's cache is exclusive, that is to say, the contents of L1 are not duplicated in L2. Intel's caches are inclusive, so everything in L1 is duplicated in L2, which is duplicated in L3. That Xeon cache suddenly got a whole lot smaller.
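The exclusive-versus-inclusive point can be made concrete with a small sketch. It follows the simplified model described above (exclusive levels add up; with inclusive levels only the largest counts). The Opteron sizes and the Xeon's 512KB L2 and 1MB L3 come from this article; the 8KB Xeon L1 data cache is an assumption added for the example, and the function name is hypothetical.

```python
def effective_cache_kb(levels_kb, inclusive):
    """Amount of unique data the cache hierarchy can hold, in KB.

    Exclusive: no level duplicates another, so the capacities add up.
    Inclusive: each level is duplicated in the next one up, so only the
    largest level contributes unique capacity.
    """
    return max(levels_kb) if inclusive else sum(levels_kb)


opteron_levels = [128, 1024]    # L1, L2 (from the article)
xeon_levels = [8, 512, 1024]    # L1 data (assumed), L2, L3 of the 3.06GHz part

print("Opteron nominal/effective:", sum(opteron_levels),
      effective_cache_kb(opteron_levels, inclusive=False))
print("Xeon nominal/effective:   ", sum(xeon_levels),
      effective_cache_kb(xeon_levels, inclusive=True))
```

On paper the Xeon has more cache, but under this model it holds less unique data than the Opteron.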

Athlon MP
AMD's first foray into the SMP market was a mixed bag. Never really accepted by the big OEMs or corporates, it gained respect amongst performance enthusiasts and hobbyists for its excellent performance at often a third of the price of a comparable Xeon configuration. The Athlon MP has kept pace clock-for-clock with the desktop Athlon XP line, though remaining at 133MHz FSB where the desktop variants jumped to 166MHz and later 200MHz.

The Athlon MP is essentially an XP with a different badge, multiplier unlocked, and the L5 bridge connected. Parts rated below 2200+ are based on the Palomino core (0.18um), parts rated below 2800+ are based on the Thoroughbred core (0.13um, 128KB L1, 256KB L2) and the final chip in the range, the 2800+, is based on the Barton core (128KB L1, 512KB L2). Note that these are AMD's model ratings rather than actual clock speeds. This makes the Athlon MP an ideal choice for the SMPer on a budget - pick up a couple of Athlon XPs, reconnect the L5 bridge and you've essentially got Athlon MPs at half the price.

These days the Athlon MP is unfortunately crippled by the limited memory bandwidth of the 266MHz (133MHz DDR) bus on which it runs. With uniprocessor chipsets this has been in part solved by a move to dual-channel DDR and higher bus speeds, but these changes have not been passed on to the MPX line. It can be offset in part by massive overclocking of the FSB (some people have the MPX chipset running in excess of 200MHz FSB - not a bad overclock for a chipset only certified for 133MHz).

Conclusion

As far as high-performance servers and workstations are concerned, Opteron is clearly the way to go. It scales better than anything else on the market, the chips are more competitively priced than the Xeon performance equivalents, and of course you can look forward to a nice additional performance boost when (insert choice of OS) is available for x86-64. Having said that, Opteron motherboards are ridiculously overpriced at the moment - you're looking at spending a good £400 or so on a board. In addition, the chips *require* the use of registered RAM, which means the RAM you've currently got is pretty much useless. Buying a gig or two of registered RAM is expensive.
On the other hand, the lower-speed Xeons are now available for some damn good prices. The 2.4GHz part is probably about the sweet spot and can be had for around the £120 mark. Cheaper Xeon boards are now also available in the form of the Iwill DPI533 (circa £180). Since Athlon MP chips are still around the same price as their Xeon counterparts, go for the Xeon over the Athlon MP since you'll get the benefit of the faster bus speed and dual-channel RAM access, *unless* you go down the Athlon XP to MP modification route. In that case you can pick up a fine SMP platform for around £235 (motherboard £135, CPUs 2x £50).
Written by Chris Bagnall
Last modified: 01/06/2005