Thursday, December 25, 2008

AMD Bets Big On CPU-GPU Fusion In 2009

ChannelWeb has offered its own series of predictions for 2009, covering everything from chips to networking to storage. But what do the real insiders think is going to happen in the coming year?

We asked three top executives at Advanced Micro Devices to break out the crystal ball and share their prognosticative insights with us. The upshot: Sunnyvale, Calif.-based AMD is betting its unique one-two punch as a leading maker of both x86-based central processors and cutting-edge graphics processors will really start to resonate in the market, that virtualization will go mainstream, and that general-purpose GPU (GPGPU) computing will move sharply away from proprietary interfaces.

Dirk Meyer, president and CEO:

The industry is embarking on the visual computing era, and computing trends in 2009 will amplify the role of graphics in delivering the enhanced experiences users crave. The next wave of industry innovation will come from the fusion of computing and graphics technologies. As the only company in the world shipping both x86-based CPUs and leading-edge graphics, AMD is in a leadership position to deliver the best end-user experiences at home, work or play.

Nigel Dessau, chief marketing officer:

Virtualization will jump the chasm from large enterprise projects to the mainstream. In addition, we should expect customers to demand multiarchitecture or heterogeneous solutions that enable live migrations between PC suppliers, chip and operating system types. This is the year where we will all finally agree: the speed of the processor is no longer the most significant factor in defining the experience of the user.

Rick Bergman, senior vice president and general manager, graphics products group:

A transition from proprietary to standard interfaces for GPGPU is ready to occur in 2009, as the first generation of proprietary APIs is quickly replaced by industry standard interfaces such as Havok, OpenCL and the compute shader features of DirectX 11. This will be a natural and much-needed evolution, allowing GPGPU acceleration to fully penetrate mainstream programming environments, and further unlocking the massive floating point computation of GPUs for processing general purpose applications from the desktop to the data center.

History has shown that proprietary programming interfaces are almost always replaced by freely distributed industry standards. This is because industry standards typically encourage greater and more widespread programmer productivity and innovation, while also enabling programmers to address the largest possible hardware installed base.
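
For a concrete sense of what a vendor-neutral GPGPU program looks like, here is a minimal sketch in OpenCL, whose 1.0 specification the Khronos Group ratified earlier this month. The host side uses the pyopencl bindings purely for brevity (our choice of tooling, not anything AMD has endorsed); the kernel itself is standard OpenCL C and is not tied to any one vendor's hardware:

```python
# A minimal, vendor-neutral GPGPU sketch using OpenCL via the pyopencl
# bindings (assumed installed). The kernel source is standard OpenCL C,
# so it is not tied to any one GPU vendor's proprietary API.
import numpy as np
import pyopencl as cl

a = np.random.rand(1 << 20).astype(np.float32)
b = np.random.rand(1 << 20).astype(np.float32)

ctx = cl.create_some_context()      # picks any available OpenCL device
queue = cl.CommandQueue(ctx)

mf = cl.mem_flags
a_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a)
b_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=b)
out_buf = cl.Buffer(ctx, mf.WRITE_ONLY, a.nbytes)

program = cl.Program(ctx, """
__kernel void vadd(__global const float *a,
                   __global const float *b,
                   __global float *out)
{
    int gid = get_global_id(0);   // one work-item per array element
    out[gid] = a[gid] + b[gid];
}
""").build()

program.vadd(queue, a.shape, None, a_buf, b_buf, out_buf)

result = np.empty_like(a)
cl.enqueue_copy(queue, result, out_buf)
assert np.allclose(result, a + b)
```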

And with a DirectX 10.1 hardware install base well into the tens of millions, we also expect to see an uptick in games using that API, as more visually complex games and an increasingly competitive game software landscape create the need to unlock advanced AA techniques and additional "free" game performance.

Server Wars: AMD's Opteron vs. Intel's Xeon

In the battle for 45nm dual-processor supremacy, AMD's Opteron 2384 is pitted against Intel's Xeon X5492. The winner comes down to the classic trade-off between power consumption and performance.

With the launch of "Shanghai," AMD has revamped its Opteron product line to do battle with Intel's Xeon in the multi-processor server space. Does Intel have anything to worry about?

Server CPUs tend to be unique beasts. Sure, you can build a server with a desktop CPU, but once you move into the multi-processor space, things begin to change.

Intel has dominated the multi-processor server (and workstation) market with its 45nm Xeon line, especially in the large and lucrative dual-processor segment.

Recognizing Intel's success, AMD recently launched its latest Opteron CPUs, code-named Shanghai, to compete with Intel in dual-processor servers.

Intel would win hands down if the dual-processor race were based purely on speed. However, there are many more factors to consider when building dual-processor servers, including power usage, performance per watt, heat generated and, of course, price.

In the dual-processor server market, Intel's 5400-series Xeon (Harpertown) CPUs seem to be the most popular. Xeon 5400-series processors are available in several different models, ranging from the quad-core E5405, which runs at 2GHz and has a thermal design power (TDP) rating of 80 watts, to the X5492, which runs at 3.4GHz and has a TDP of 150 watts.

AMD offers several models of the Opteron (45nm Shanghai) CPU for dual-processor systems, ranging from the 2376, with a clock speed of 2.3GHz, to the 2384, with a clock speed of 2.7GHz. All third-generation Opterons (45nm Shanghai) offer 4x512KB of L2 cache and 6MB of L3 cache, and carry an average CPU power (ACP) rating of 75 watts.

Comparing the best dual-processor server chips meant pitting Intel's 3.4GHz Xeon X5492 against AMD's 2.7GHz Opteron 2384. For a direct comparison, we turned to the SPECjbb2005 benchmark, which scored the Xeon X5492 at 324,451 and the Opteron 2384 at 311,471, showing that the Intel CPU has a speed advantage, at least in synthetic testing of elements such as the JVM (Java Virtual Machine), the JIT (just-in-time) compiler, garbage collection and threads.

AMD's Opteron 2384 has an average retail price of about $1,050, while Intel's Xeon X5492 retails for $1,700. That brings up an interesting point: how much Intel processor can $1,050 buy, and how would that compare with the Opteron 2384? The Intel Xeon E5450 sports a price of about $1,000 and tones the specs down a little compared with the X5492. The Xeon E5450 is an 80-watt, quad-core 3.0GHz CPU and uses significantly less power than the 150-watt X5492, which translates to less heat generated and lower electric bills. The E5450 is priced close to the AMD Opteron 2384 and has a comparable power rating (80 watts for the Intel, 75 watts for the AMD).

That said, initial performance comparisons give the edge to AMD, with the Xeon E5450 posting a SPECjbb2005 rating of 293,213, slightly behind the Opteron 2384's 311,471. Simply put, the Opteron 2384 offers about a 6 percent performance advantage over the Xeon E5450.

Of course, SPECjbb2005 is only one benchmark. When we switched over to PassMark to score the CPUs, we got additional ratings that mirrored the previous tests. PassMark's PerformanceTest offers a "CPUMark" score focused on CPU performance: here a pair of Intel Xeon E5450 CPUs scored a CPUMark of 9025, while a pair of AMD Opteron 2384 CPUs scored 9449, roughly a 4.7 percent performance advantage for AMD.
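
For the record, here is the arithmetic behind those two relative-performance figures, as a quick sketch using the scores quoted above:

```python
# Relative performance computed from the benchmark scores quoted above.
def advantage_pct(winner, loser):
    return (winner - loser) / loser * 100

# SPECjbb2005: Opteron 2384 (311,471) vs. Xeon E5450 (293,213)
print(f"SPECjbb2005 advantage: {advantage_pct(311_471, 293_213):.1f}%")  # ~6.2%

# PassMark CPUMark: dual Opteron 2384 (9449) vs. dual Xeon E5450 (9025)
print(f"CPUMark advantage:     {advantage_pct(9_449, 9_025):.1f}%")      # ~4.7%
```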

So what exactly does all of this mean for the system builder? The quick take is that despite the launch of new processors and changes in technology, little has changed in the market. It all comes down to two elements: speed vs. cost.

For system builders looking for the most bang for the buck, a dual-processor server using AMD's 45nm Opterons is the way to go. For those looking for maximum performance, the Intel Xeon proves to be the top dog.

There are some further complications coming down the pike. Intel will be expanding on the technology that makes its latest "Nehalem" CPUs impressive performers, while AMD is sure to learn a thing or two when its new Phenom II CPUs hit the streets early next year.

With the new Opterons, AMD has proved that it can play "catch-up" with Intel and still keep the market competitive. Simply put, AMD has put the ball in Intel's court for the dual-processor server market. Don't count Intel out, though, since it's sure to mount a strong reply to the third-generation Opteron.

Multicore doesn't mean equal core

As anyone who has worked on a group project knows all too well, not all team members contribute equally to the success of a project. And now Virginia Tech researchers have found the same holds true for the cores in multicore processors.

Depending on how your code is distributed across seemingly identical cores, the speed at which that code is executed on a multicore processor can vary by as much as 10 percent.

If you've ever had a program run slower than expected, or perform quickly one day and not as sprightly the next, you might want to examine how the CPU is executing the job.

"The solution to this is to dynamically map processes to the right cores," said Thomas Scogland, a Virginia Tech graduate student who summarized this quirk at the SC08 conference in Austin, Texas, last month. Scogland and fellow researchers, with help from the Energy Department's Argonne National Laboratory, developed prototype software that could one day help balance performance more equally across all cores. DOE also helped fund the work.

In the past few years, Advanced Micro Devices and Intel have developed multicore chips as a way of boosting performance over previous generations of commodity microprocessors. They have moved from two cores to four or even eight cores per chip.

Developers and systems engineers have mostly expected each core in a multicore processor to have the same effective capability. However, that is not necessarily the case for a variety of reasons, Scogland said.

In all fairness, it's not the cores' fault, technically speaking. Although the cores are identical, how a program is distributed among the cores can affect how quickly it runs. And in most cases, the operating system and hardware spread a program across multiple cores rather arbitrarily, which leads to varying performance.

A number of factors contribute to that variance, the researchers said. One factor is how the CPU hardware handles interrupts. In many cases, they could be directed to a single core, which could slow other applications on that core. However, if the interrupts are distributed across all the cores dynamically, there is no guarantee that the core handling the interrupt will be the same one that is running the program for which that interrupt was intended. Therefore, additional communication time is needed between the two cores.
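
On Linux you can see this interrupt skew for yourself: /proc/interrupts reports per-core service counts for every IRQ line, and /proc/irq/<n>/smp_affinity controls which cores may handle IRQ n. A small illustrative sketch that totals interrupts serviced per core:

```python
# An illustrative sketch: total up, per core, how many interrupts each
# CPU has serviced, by parsing /proc/interrupts on Linux. (Affinity for
# a given IRQ can be steered via /proc/irq/<n>/smp_affinity.)
with open("/proc/interrupts") as f:
    cpus = f.readline().split()            # header row: CPU0 CPU1 ...
    totals = [0] * len(cpus)
    for line in f:
        fields = line.split()
        if not fields or not fields[0].endswith(":"):
            continue                       # skip malformed lines
        # The first len(cpus) fields after the IRQ label are counts;
        # summary rows (ERR:, MIS:) carry fewer columns.
        for i, field in enumerate(fields[1:len(cpus) + 1]):
            if field.isdigit():
                totals[i] += int(field)

for cpu, total in zip(cpus, totals):
    print(f"{cpu}: {total} interrupts serviced")
```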

Wednesday, December 17, 2008

Crysis Warhead with 9950 and 4870x2 test

AMD 9950 unpacking

I found this on YouTube, for anyone wondering whether there's a heat sink in the 9950 Black Edition box.

How ATI stole Christmas

Yesterday we talked about Nvidia's Christmas lineup and its bargains, or rather the lack of them. Today it's ATI's turn, and things are looking good for the Canadian outfit, although considering what it's up against, this is hardly a surprise.

Let's start cheap and work our way up the ladder. At €33+ you can get the HD4350, while HD4550 prices start at €46. Don't let the numbers fool you: both are based on the RV710 core and differ only in memory size and type (256MB of DDR2 vs. 512MB of DDR3). While we can recommend the HD4350 for undemanding users or HTPC builders, the HD4550 doesn't look like a good deal. Here's the catch: for just seven euros more you can get the HD4650, an RV730 card with 512MB of DDR2. The DDR3 version of the RV730, branded HD4670, costs €69.

In the same price range you'll find ATI's top value card of yesteryear, the HD3850. The cheapest one comes from MSI and sells for €56. It's overclocked, but has just 256MB of memory. If you need more, you can get Sapphire's 512MB HD3850 for €69. HD3870 cards sell for €87+, and non-reference versions still cost over €100. Nvidia's G92-based 8800/9800GT cards have slipped into this price range, not to mention the HD4830, and both offer better performance. An overclocked dual-GPU HD3870 X2 from MSI costs €174. It still packs quite a punch, and the price is reasonable.

So much for the RV670; let's move on to Nvidia's nemesis, ATI's RV770-based cards. The lineup starts with the HD4830, probably the best bang-for-buck card on the market at the moment. The cheapest ones come from HIS and Sapphire and sell for €99. In our tests the HD4830 managed to outperform an overclocked 9800GT, not to mention the old HD3870, so even this crippled RV770 card packs quite a punch. As for the HD4850, it currently sells for €122+, a fair price, while the cheapest HD4870 costs €189. The latter seems a bit overpriced, especially following Nvidia's revamp of the GTX260 216 and the launch of the new 180-series ForceWare drivers.

The R700 was never meant to be a budget card, but in recent weeks it's gone up in price, not down. Thank the weak euro and the bankers who led us straight into a recession for that. The cheapest one now costs €418, whereas three months ago it was selling for €30 less. The good news is that someone at ATI finally realized that the HD4850 X2 has potential, and what's more, they also realized that it can't be sold for €350. This, along with a two-month delay, was the trouble with this rare card from the start. It was supposed to be a more sensible version of the R700, but it ended up costing almost as much as the regular one. Not anymore. As of today you can get the HD4850 X2 for €271, and it's finally starting to make sense.

ATI truly managed to steal Christmas this year. From the low end, through the mid-range, all the way to the top, it dominates the market with affordable yet powerful products. A year ago, Nvidia couldn't even meet demand for its G92-based cards, forcing ATI to fight for scraps, but the tables have turned, and we can only hope Nvidia makes a comeback soon.

AMD 7000 series

The new CPUs are part of the Athlon X2 product family, but carry a new 7000-series number sequence. The 7750 and 7550 (the latter available to OEMs only) depart from the old 65nm K8 Brisbane core and use the "Kuma" core, which is based on the K10 or Stars architecture (the K10 name does not officially exist, according to AMD). Built on a 65nm process, the chips arrive more than half a year after their initially planned release and offer clock speeds of 2.7 and 2.5GHz, respectively. Like the preceding Brisbane core, Kuma comes with 128KB of L1 data cache, 128KB of L1 instruction cache (64KB of each per core) and 1MB of L2 cache. However, Kuma adds a 2MB L3 cache, which Brisbane CPUs lack.


The Athlon X2 7750 is currently listed with a tray price of $79, just above the $76 the company charges for the Athlon X2 6000. A quick look over at Intel shows that the same $79 would not be enough to buy a Core 2 Duo-class processor: Intel currently sells the Pentium Dual-Core E2220 (2.4GHz, 65nm) for $74 and the E5200 (2.5GHz, 45nm) for $84, while the cheapest Core 2 Duo, the E7200, lists for $113.

Tuesday, December 02, 2008

AMD Phenom II listings begin to appear, priced from sub-£200

Today, a pair of those Phenom II processors have appeared on the pages of UK-based etailer MoreComputers.com. According to its product listings, Phenom II will arrive in two models - a 2.8GHz Phenom II 920 and a 3.0GHz Phenom II 940 Black Edition, priced at £190.65 and £231.32, respectively.

The 920 and 940 nomenclature may seem strangely familiar, but the quad-core Phenom II won't be challenging Intel's Core i7 in the performance stakes, we feel. Instead, it'll attempt to put AMD back in the desktop race with a set of affordable mid-range chips that should at the very least challenge Intel's ageing Core 2 range.

There may still be no win in sight for AMD, though. Should Phenom II pose a threat, we'd expect Intel to respond with a slight trim of its Core 2 pricing. Either way, the competition should result in well-priced mid-range processors for the masses.

Readers should also be aware that MoreComputers' prices are anything but definitive, and that the etailer doesn't offer any pre-sales or post-sales technical support whatsoever. Although the listings appear to be about right in our estimation, official pricing and availability won't be known until AMD lets it slip.


Source : www.hexus.net

Windows 7 to offer DirectX acceleration without a GPU

What would happen if you created a software wrapper that allowed a system without a graphics card to render DirectX 10 visuals on a CPU?

The folks at Microsoft decided to find out and developed WARP10 (Windows Advanced Rasterisation Platform 10), a software component to be used in Windows 7.

WARP10, a software rasteriser, allows for DirectX rendering to take place on the CPU, allowing users to take advantage of DirectX functionality when a GPU isn't present. The idea itself isn't anything new, and despite being able to achieve its goal, performance is severely limited.

GPUs have the distinct advantage of dedicated graphics hardware; features such as texturing units aren't available on today's CPUs. Similarly, a CPU's memory bandwidth is far lower than that of a high-end graphics card.
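
To get a feel for what a software rasteriser actually has to do, here is a toy sketch that fills a single triangle into an in-memory framebuffer with edge-function tests. It is only a caricature of WARP10, but every frame in the results below ultimately comes from CPU loops conceptually like these:

```python
# A toy software rasteriser: fill one triangle into an in-memory
# "framebuffer" using edge-function (half-space) tests. Purely
# illustrative - WARP10 is a far more sophisticated, heavily
# optimised implementation of the same fundamental job.
WIDTH, HEIGHT = 80, 40

def edge(ax, ay, bx, by, px, py):
    # Signed area: positive when (px, py) lies to the left of a->b.
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)

def fill_triangle(fb, v0, v1, v2, char="#"):
    xs, ys = [v0[0], v1[0], v2[0]], [v0[1], v1[1], v2[1]]
    # Only test pixels inside the triangle's bounding box.
    for y in range(max(min(ys), 0), min(max(ys), HEIGHT - 1) + 1):
        for x in range(max(min(xs), 0), min(max(xs), WIDTH - 1) + 1):
            if (edge(*v1, *v2, x, y) >= 0 and
                    edge(*v2, *v0, x, y) >= 0 and
                    edge(*v0, *v1, x, y) >= 0):
                fb[y][x] = char        # pixel is inside all three edges

fb = [[" "] * WIDTH for _ in range(HEIGHT)]
fill_triangle(fb, (10, 35), (40, 5), (70, 35))
print("\n".join("".join(row) for row in fb))
```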

Nonetheless, Microsoft found that WARP10 was able to run DirectX applications such as Crysis - a demanding 3D game - without any GPU at all. The performance results, however, highlight the strain placed on the CPU. At a low resolution of 800x600, the high-end 3GHz Intel Core i7 processor managed an average FPS of only 7.36 - higher than Intel's integrated graphics, mind you, but still far too low to worry any dedicated graphics card.

Hardware                        Ave FPS    Min FPS    Max FPS
Core i7 8 Core @ 3.0GHz            7.36       3.46      15.01
Penryn 4 Core @ 3.0GHz             5.69       2.49      10.95
Penryn 2 Core @ 3.0GHz             3.48       1.35       6.61
Phenom 9550 4 Core @ 2.2GHz        3.01       0.53       5.46
NVIDIA 8800 GTS                   84.80      60.78     130.83
NVIDIA 8400 GS                    33.89      21.22      51.82
ATI 3400                          37.18      22.97      59.77
Intel DX10 Integrated              5.17       1.74      16.22


So, if its performance is so severely limited, what exactly is its purpose? Well, there are a few suggestions floating about. The first is that WARP10 will allow Microsoft to make its Windows 7 requirements a whole lot simpler, as a GPU may no longer be required in order to attach the "Windows 7 Capable" sticker.

There could be simpler uses, too. What would a user do if a dedicated GPU in a system were to fail? With WARP10, there's a fallback, and the user could continue to use the system without the GPU. There's a problem with this theory, though: WARP10 might take over graphics duties without kicking up much of a fuss, but it'd still need a video output to do so, and that output is normally provided by the very integrated or dedicated graphics hardware in question.



Source : www.hexus.net

NVIDIA's GPGPU ambition coming to fruition?

Soft Body demo

The Great Kulu, a tech demo by Kenneth Bugeja, demonstrates the use of soft-body PhysX technology in a real gameplay environment and is based on the Unreal Tournament 3 engine.

The demo focuses on the behaviour of a soft-and-squishy creature, along with its torn pieces. It's all simulated on the GPU with the aid of PhysX and uses no pre-scripted animation.

Despite the promise of a "real gameplay" test, readers should be aware that the Soft Body demo is tailored to demonstrate the value of NVIDIA's PhysX technology. The advantage of PhysX in real-world gaming titles may vary.
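
PhysX's solver is proprietary, but the core idea behind soft-body simulation is easy to sketch: treat the body as point masses connected by springs and integrate the resulting forces every frame. The toy example below runs a one-dimensional chain on the CPU purely for illustration; the demo solves a vastly larger system of this general kind on the GPU:

```python
# Toy soft-body sketch: a vertical chain of unit point masses joined
# by springs, pinned at the top and integrated with semi-implicit
# Euler. Real engines such as PhysX solve far larger systems (with
# collisions, volume preservation and tearing) - here, on the GPU.
N = 8            # point masses
REST = 1.0       # spring rest length
K = 500.0        # spring stiffness
DAMP = 0.98      # velocity damping per step
DT = 0.016       # ~60 Hz timestep
GRAVITY = -9.8

pos = [-float(i) for i in range(N)]   # heights; mass 0 pinned at 0.0
vel = [0.0] * N

for step in range(600):
    forces = [GRAVITY] * N            # gravity acts on every mass
    for i in range(N - 1):
        stretch = (pos[i] - pos[i + 1]) - REST
        f = K * stretch               # Hooke's law, positive if stretched
        forces[i] -= f                # stretched spring pulls i down...
        forces[i + 1] += f            # ...and pulls i+1 up
    for i in range(1, N):             # mass 0 never moves
        vel[i] = (vel[i] + forces[i] * DT) * DAMP
        pos[i] += vel[i] * DT

print([round(p, 2) for p in pos])     # springs near the top stretch most
```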




No real surprises here: as you'd expect, the demo achieves a far higher frame rate when the GPU - as opposed to the CPU - is asked to do the PhysX calculations.

What's interesting is that at high detail, the raw power and throughput of the NVIDIA GeForce GTX 260 delivers a 70 per cent performance increase compared with PhysX running on a 3.2GHz quad-core CPU.


Source : www.hexus.net

Jasper Found in the Wild: 65nm Xbox 360s are Appearing

After entirely too long, Jasper has finally been discovered in the wild. As we originally predicted back in early October, the first Jaspers were identified by being labeled 12.1A on the 12V rail. Somewhat surprisingly, the first ones discovered also came with a 150W PSU. We of course knew this would happen eventually, but many believed that early Jaspers would use leftover 175W PSUs. It's still possible there are some Jaspers floating around labeled as 14.2A and including a 175W PSU that were simply never recognized for what they were. Also as predicted, the Jasper motherboard features a new PSU power connector that prevents plugging a newer, less powerful power supply into an older Xbox that needs more power. This information wasn't confirmation enough for some, but no doubt remained after one was finally dissected. One other interesting tidbit is the inclusion of 256MB of on-board flash, allowing Microsoft to forgo the memory card for Arcade systems and also to preload the NXE (New Xbox Experience) without a hard drive.

So far, it appears that lot number 0842 from teams FDOU and CSON, and lot 0843 from FDOU, have been confirmed as Jasper. Unfortunately, the somewhat confusing bit is that some of these same lot numbers have also been confirmed to be Falcon. Clearly, we are in the midst of a transition period, so there are still no guarantees based on external box markings. If you get a console that is labeled 12.1A on the 12V rail and comes with a 150W PSU, you can be assured that you have a Jasper-based console.

I would like to reiterate that Microsoft will likely never comment officially on Jasper or any other Xbox 360 revision, so don't waste your time trying to contact them. Along the same lines, your average retail employee is not going to have a clue what you're talking about if you ask, so don't bother.

It may take a few weeks for these to become more common - I still can't find one locally myself, and it would be a total crap shoot ordering online. So the hunt continues around here. We'll have our own dissection and analysis as soon as we can get our hands on one. And if you're hunting for one yourself, post all of your findings in the official thread in the AnandTech Forums.