GPU Gems 2

Martin Ecker writes "Following up on last year's first installment of the "GPU Gems" book series, NVIDIA has recently finished work on the second book in the series, titled GPU Gems 2 - Programming Techniques for High-Performance Graphics and General-Purpose Computation, published by Addison-Wesley. Just like the first book, GPU Gems 2 is a collection of articles by various authors from game development companies, academia, and tool developers on advanced techniques for programming graphics processing units (or GPUs for short). It is aimed at intermediate to advanced graphics developers who are familiar with the most common graphics APIs - in particular OpenGL and Direct3D - and high-level shading languages, such as GLSL, HLSL, or Cg. The reader should also be proficient in C++. As with GPU Gems, GPU Gems 2 is not for beginners. For professional graphics and game developers, however, it is an excellent collection of interesting techniques, tips, and tricks." Read on for Ecker's review.
GPU Gems 2: Programming Techniques for High-Performance Graphics and General-Purpose Computation
author: Matt Pharr, Randima Fernando (editors)
pages: 814
publisher: Addison-Wesley Publishing
rating: 9
reviewer: Martin Ecker
ISBN: 0321335597
summary: The second installment in NVIDIA's GPU Gems series shines with more "gems" on real-time graphics and general-purpose computation on GPUs.

The book is divided into six parts, each dealing with a different aspect of GPU programming. Compared to the first book, more emphasis is put on the quickly evolving area of general-purpose computation on GPUs, commonly known as GPGPU (for more information see http://www.gpgpu.org). To my knowledge, this is the first book to contain so much information on this relatively new field. In particular, three of the six parts of the book are about GPGPU and its applications. The first three parts, however, are about real-time computer graphics.

The first part of the book contains 8 chapters on photo-realistic rendering that mostly deal with how to efficiently render a large number of objects in a scene, which is a necessity for rendering convincing natural effects, such as grass or trees. Two chapters in this part of the book discuss geometry instancing and segment buffering, two techniques to render a large number of instances of the same object, and another chapter focuses on using occlusion queries to implement coherent hierarchical occlusion culling, which is also useful in scenes with high depth complexity.

Other interesting topics in this part of the book include adaptive tessellation of surfaces on the GPU, displacement mapping - an extension to the popular parallax mapping used in some current games - which makes it possible to render realistic bumps on a simple quad, and terrain rendering with geometry clipmaps. Geometry clipmaps can be used to render large terrains almost completely on the GPU. They were first introduced in a SIGGRAPH 2004 paper by Frank Losasso and Hugues Hoppe, and the algorithm is discussed in detail by Arul Asirvatham and Hoppe himself in chapter two of this book. This technique will most likely find wide application in next-generation games.
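
For readers who have not run into parallax mapping before, a minimal GLSL sketch of the basic technique that the displacement mapping chapter builds on might look like the following. This is not code from the book; the texture, uniform, and varying names as well as the offset scale are illustrative assumptions.

// Fragment shader: basic parallax mapping. The texture lookup is shifted
// along the tangent-space view direction by an amount read from a height
// map, which fakes surface relief on flat geometry.
uniform sampler2D colorMap;
uniform sampler2D heightMap;     // surface height stored in the red channel
uniform float heightScale;       // strength of the effect, e.g. 0.04
varying vec2 texCoord;
varying vec3 viewDirTangent;     // view direction in tangent space

void main()
{
    vec3 v = normalize(viewDirTangent);
    float height = texture2D(heightMap, texCoord).r;
    // Shift the texture coordinate toward the viewer in proportion to the height
    vec2 parallaxCoord = texCoord + v.xy * (height - 0.5) * heightScale;
    gl_FragColor = texture2D(colorMap, parallaxCoord);
}

The displacement mapping presented in the book goes further than this single offset, but the shader above shows the kind of per-fragment texture-coordinate adjustment involved.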

Part two of the book, consisting of 11 chapters, deals with shading and lighting. I found chapter 9 by Oles Shishkovtsov on deferred shading in the soon-to-be-released computer game S.T.A.L.K.E.R. quite interesting. The game features a full deferred shading renderer, which is probably a first for a commercial game. In his chapter, Shishkovtsov describes some of the tricks used and some of the pitfalls encountered while developing that renderer. Also highly interesting is Gary King's chapter on computing irradiance environment maps in real time on the GPU. These dynamically created irradiance maps can be used to approximate global illumination in dynamic scenes.
Furthermore, this part of the book has chapters on rendering atmospheric scattering, implementing bidirectional texture functions on the GPU, dynamic ambient occlusion culling, water rendering, and using shadow mapping with percentage-closer filtering to achieve soft shadows.
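
To make the last of these a bit more concrete, here is a minimal GLSL sketch of percentage-closer filtering, not taken from the book: it assumes the shadow map is bound as a sampler2DShadow with depth comparison enabled and that a projective shadow-map coordinate is passed in from the vertex shader; all names are illustrative.

// Fragment shader: 3x3 percentage-closer filtering (PCF). Nine depth
// comparisons against the shadow map are averaged, softening shadow edges.
uniform sampler2DShadow shadowMap;   // depth texture with compare mode enabled
uniform float texelSize;             // 1.0 / shadow map resolution
varying vec4 shadowCoord;            // projective shadow-map coordinate

float pcfShadow()
{
    float sum = 0.0;
    for (int y = -1; y <= 1; ++y)
    {
        for (int x = -1; x <= 1; ++x)
        {
            // Offset is scaled by w so it survives the projective divide
            vec4 offset = vec4(float(x), float(y), 0.0, 0.0) * texelSize * shadowCoord.w;
            // shadow2DProj returns 1.0 where the fragment is lit, 0.0 where occluded
            sum += shadow2DProj(shadowMap, shadowCoord + offset).r;
        }
    }
    return sum / 9.0;   // fraction of the nine samples that are lit
}

void main()
{
    gl_FragColor = vec4(vec3(pcfShadow()), 1.0);   // visualize the shadow factor
}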

The third part of the book consists of 9 chapters on high-quality rendering. Most chapters in this part deal with implementing high-quality filtering in fragment shaders. For example, there is an interesting chapter on filtered line rendering and another chapter on cubic texture filtering. In chapter 23, NVIDIA also provides interesting insights into the inner workings of their Nalu demo, the launch demo for the GeForce 6 series, which shows an animated mermaid underwater. In particular, the chapter describes the techniques used to implement the mermaid's hair. Finally, Simon Green of NVIDIA presents his GPU-only implementation of improved Perlin noise.

Whereas the first three parts of the book cover techniques for real-time computer graphics, the three final parts deal exclusively with GPGPU. Since GPUs nowadays offer a high level of programmability and are widely available in commodity PCs, they can be used as cost-efficient processors for general computation, working alongside and in parallel with the CPU.

Using the GPU for general computation efficiently enough to outperform conventional CPU implementations requires special care when mapping algorithms to the highly parallel architecture of the GPU pipeline. Part four of the book deals mostly with exactly this and serves as an introduction to the fascinating field of GPGPU. The 8 chapters of this part first describe the general streaming architecture of GPUs, with one chapter going into the details of the architecture of the GeForce 6 series in particular, and then move on to show how to map conventional CPU data structures and algorithms to the GPU. For example, textures can be regarded as the GPU equivalent of CPU data arrays. There is also a chapter on how to implement flow-control idioms on the GPU and a chapter on optimizing GPU programs.
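
As a small illustration of the "textures as arrays" mapping (my example, not the book's): the fragment shader below computes the classic a*x + y operation element-wise over two arrays stored in floating-point textures, writing the result into a third. It assumes the host application draws a full-screen quad into a float render target of the same size as the inputs, so that one fragment is generated per array element; all names are made up.

// GPGPU-style fragment shader: computes a*x + y element-wise over two
// "arrays" stored in textures. One fragment = one array element.
uniform sampler2D texX;    // input array x
uniform sampler2D texY;    // input array y
uniform float a;           // scalar multiplier
varying vec2 texCoord;     // element index, interpolated across the quad

void main()
{
    float x = texture2D(texX, texCoord).r;
    float y = texture2D(texY, texCoord).r;
    gl_FragColor = vec4(a * x + y);   // written to the output texture
}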

The 6 chapters of part five of the book are on image-oriented computing and describe a number of GPGPU algorithms for performing global illumination computations on the GPU, for example using radiosity. There is also a chapter on doing computer vision on the GPU, which I found quite exciting. Because of its high parallelism, the GPU can perform the tedious tasks of edge detection and marker recognition required in computer vision very efficiently, freeing the CPU for other tasks in the meantime. James Fung, the author of this chapter, is also involved in an open source project called OpenVIDIA (see http://openvidia.sourceforge.net) that is all about GPU-accelerated computer vision. The final chapter in this part of the book explains how to perform conservative rasterization, which is important for some GPGPU algorithms to achieve accurate results.
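
As a rough sketch of the kind of per-pixel computer vision work that maps well onto the GPU, here is a GLSL fragment shader performing Sobel edge detection. It is not code from the chapter; the texture and uniform names are assumptions.

// Fragment shader: Sobel edge detection over an input image.
uniform sampler2D image;      // input frame
uniform vec2 texelSize;       // 1.0 / image resolution
varying vec2 texCoord;

float lum(vec2 offset)
{
    vec3 c = texture2D(image, texCoord + offset * texelSize).rgb;
    return dot(c, vec3(0.299, 0.587, 0.114));   // luminance
}

void main()
{
    // Sample the 3x3 neighborhood
    float tl = lum(vec2(-1.0,  1.0));
    float t  = lum(vec2( 0.0,  1.0));
    float tr = lum(vec2( 1.0,  1.0));
    float l  = lum(vec2(-1.0,  0.0));
    float r  = lum(vec2( 1.0,  0.0));
    float bl = lum(vec2(-1.0, -1.0));
    float b  = lum(vec2( 0.0, -1.0));
    float br = lum(vec2( 1.0, -1.0));

    // Sobel gradients
    float gx = (tr + 2.0 * r + br) - (tl + 2.0 * l + bl);
    float gy = (tl + 2.0 * t + tr) - (bl + 2.0 * b + br);

    float edge = length(vec2(gx, gy));
    gl_FragColor = vec4(vec3(edge), 1.0);
}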

The final part of the book has 6 chapters that present GPGPU techniques to perform a variety of simulation and numerical algorithms on the GPU. One chapter shows how to map linear algebra operations onto the GPU and develops a GPU framework to solve systems of linear equations. In other chapters the GPU is used for protein structure prediction, options pricing, flow simulation, and medical image reconstruction. These chapters are good examples of how the GPU can be used for non-graphics-related tasks. Furthermore, Peter Kipfer and Rüdiger Westermann present algorithms for efficient sorting on the GPU. Since sorting is such a fundamental building block of many higher-level algorithms, having an efficient GPU implementation of it matters for many GPGPU applications.
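
For a taste of how such building blocks look in practice, here is a hypothetical GLSL sketch (not from the book) of one pass of a parallel sum reduction, the kind of operation that dot products and vector norms are built from when linear algebra is mapped to the GPU. It assumes ping-pong rendering between floating-point textures, each pass drawn into a target half the size of its source, with texture coordinates set up so that every output texel sits at the center of its 2x2 source block.

// Fragment shader: one pass of a parallel sum reduction. Each output
// fragment sums a 2x2 block of the source texture; repeating the pass
// into ever smaller render targets reduces an array to a single value.
uniform sampler2D srcTex;      // values from the previous pass
uniform vec2 srcTexelSize;     // 1.0 / source resolution
varying vec2 texCoord;         // center of this fragment's 2x2 source block

void main()
{
    vec2 o = 0.5 * srcTexelSize;
    float sum = texture2D(srcTex, texCoord + vec2(-o.x, -o.y)).r
              + texture2D(srcTex, texCoord + vec2( o.x, -o.y)).r
              + texture2D(srcTex, texCoord + vec2(-o.x,  o.y)).r
              + texture2D(srcTex, texCoord + vec2( o.x,  o.y)).r;
    gl_FragColor = vec4(sum);
}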

The book contains many illustrations and diagrams that visualize the results of certain techniques or explain the presented algorithms in more detail. All images in the book are in color, which is definitely advantageous for a graphics book. In my opinion, the excellent quality, as well as the sheer quantity, of the images and illustrations is one of this book's strongest points compared to other graphics books.

The book also comes with a CD-ROM with supplemental material, videos, and demo applications for some chapters. Most of the applications include the full source code, which makes it easy to experiment with the techniques presented in the book. Note that most of the applications run on Windows only and many of them require a shader model 3.0 graphics card, such as a GeForce 6600 or 6800. The latest ATI cards, such as the X800, are not sufficient for running some demos because they only support shader model 2.0.

I highly recommend this book to any professional working as a graphics or game developer. It is a valuable addition to my library of graphics books, and I will be coming back to a number of its articles in the near future. The focus on GPGPU in the second half of the book is a welcome addition, and we can expect to see more and more non-graphics-related applications make use of the processing power in today's GPUs.


Martin Ecker has been involved in real-time graphics programming for more than 10 years and works as a game developer for arcade games. In his rare spare time he works on a graphics-related open source project called XEngine (http://xengine.sourceforge.net). You can purchase GPU Gems 2 - Programming Techniques for High-Performance Graphics and General-Purpose Computation from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.

This discussion has been archived. No new comments can be posted.

  • Bring back glide!
    • Re:GLIDE (Score:3, Interesting)

      Hey, don't KNOCK GLIDE.

      From a developer standpoint, I enjoyed GLIDE a lot more than OpenGL, and multitudes more than Direct3D. I believe it didn't stick simply because, unlike the other two, it didn't progress much, and Creative made sure it wasn't a general hardware API.
      • Creative made sure it wasn't a general hardware API.

        Uh... you're going to have to explain this one.
        • Creative made sure it wasn't a general hardware API.

          Uh... you're going to have to explain this one.


          Weird... GLIDE was 3DFX's proprietary API. All I can think of is that Creative made a GLIDE wrapper for their TNT2 cards. It was bad. Very bad.
          • Re:GLIDE (Score:3, Interesting)

            by Anonymous Coward
            Perhaps he confused Glide (3dfx) with Rendition's Speedy3d (later "RRedline") API. Rendition had a relationship with Creative for the production of boards for their first chip, the Verite 1000. It was Rendition that pioneered the "ISV Evangelist" concept for the PC 3D industry, but 3dfx caught on fast to doing that. It was required, because with no common API back then (somehow nobody considered OpenGL an option right at the beginning), the only way to get games to run on your chip was to get games comp
  • I probably will get this book to read up on the general purpose parallel processing. Hook up a cluster of hacked X-Boxes, use the CPUs and GPUs together = Massively Parallel Processing.
  • by Anonymous Coward on Thursday May 12, 2005 @04:36PM (#12513802)
    One of the uses for the GPU that looks promising is audio processing. There was an article [nforcershq.com] about one company developing VSTs that use the GPU. It will be interesting to see how people will utilize their graphics cards in the future.
    • Some audio cards already use DSP chips developed for graphics processing (notably the UAD-1 DSP card,) but the main problem with using GPUs sitting on an video card to process audio is that the video card doesn't have any way of sending the processed data back to the rest of the system at a useful rate. So you can process audio at 40+ GFLOPS, but you can't send it back to your sequencer software and/or audio card. If/once video card manufacturers hack some sort of interface to send data back to the system
      • That's one of the main advantages of the new PCI-Express cards. AGP was built for video cards, so it's very lopsided. PCI-Express was built as a general-purpose high-speed interconnect and as such has balanced bandwidth to and from the (video) card. Also, a returning audio stream, even if you're doing stereo or surround sound, isn't all that bandwidth intensive.
      • Then what are those bus mastering ports for on graphics and audio cards?
    • by dustman ( 34626 ) <dlearyNO@SPAMttlc.net> on Thursday May 12, 2005 @05:19PM (#12514106)
      It will be interesting to see how people will utilize their graphics cards in the future.

      My prediction: In the future, people will use their graphics cards for graphics :)

      In the very near term, GPUs have so much more vector processing power that people say "hmm, we could use the GPU for x"... But the GPU is *not* a general purpose processor, and doing general "programming" for it isn't going to be a good use of it.

      GPU's are very powerful, so if you could bring a product/app to market within the next couple of years, you would have a performance advantage. But only for a little while.

      In the long term: The Cell processor has, on each CPU, 8 vector units. According to a quick google search, each one is a 32 GFLOPS vector processor... So, that's sort of like 256 GFLOPS of processing power available. A current NVidia GPU is more than 32, but a lot less than 100 GFLOPS. You don't really have "256 GFLOPS" available for any given task, because the processors are independent... You might be able to use one of the Cell vector units as a replacement for your "graphics card", but it won't be as special purpose as your graphics card is, so each GFLOP will get you less in terms of actual graphics goodness... And, they're independent, and although graphics is very parallelizable, no real-world problem comes even close to being 100% parallelizable. Witness the dismal results of SLI rendering.

      Cell looks awesome, but when we can hold the real thing in our hands, I wouldn't be surprised if it didn't live up to its hype (but I wouldn't be surprised if it did, either). My prediction for Cell is: the processor exists and is everything they say it is, which is just a PowerPC-based CPU with 8 attached decently powerful vector units... the whole concept of "automatic distributed computing" is probably not worth paying attention to.

      OK, this long post turned into an ad for Cell I think, but I am very psyched about it.

      A summary: You can use your graphics card for non-graphics processing, and you will have a performance advantage while doing it for a little while. In the long term, Cell or something like it will be the main CPU in your computer, and it will be much better for non-graphics-based vector processing than a GPU, and better for some kinds of graphics processing (raytracing, non-realtime, etc), but not for the current "standard" (raster-based graphics).
      • I'm gonna have to disagree with you about SLI...

        SLI actually provides a very significant performance increase over a single GPU solution. In actual applications (HL2, S.T.A.L.K.E.R.), an SLI configuration gains approximately 90% over a single GPU. This is not a "marginal" win.
      • Do you remember the "emotion engine"?

        I know for a fact it will not live up to anywhere near the hype.

      • I don't think you understand just what a processing monster current GPUs are. Basically you put a bunch of vector processors on a chip with very fast memory tied to it. These GPUs are typically on the same scale or slightly larger than a normal CPU. Now tell me how a general purpose CPU is going to compete with that?

        No you will never format rows in Word using a GPU. But if you are into scientific processing it would seem a reasonable investment. Even if CPUs eventually take back the dominance you'll have a
        • I don't think you understand just what a processing monster current GPUs are.

          I understand exactly what they are, it's my job.

          But, look at the numbers: Current GPUs are around 60 GFLOPS tops. The ATI hardware going into the new Xbox is 48 GFLOPS.

          Now tell me how a general purpose CPU is going to compete with that?

          Calling the Cell a "general purpose CPU" is not really correct. A Cell chip is *a* general purpose (PowerPC) CPU and *8* very capable 32 GFLOPS vector CPUs on the same die. It's not like a
    • Back in the day of 8086, 286, and even 386, I did stuff with audio processing on a DSP coprocessor board, machine language programming and all. It was good for its day, but the DSP coprocessor board became obsolete, the relentless march of 486, Pentium I, and Pentium II went on, and it just didn't pay to fiddle like that, especially when there was good code optimization with VC++.

      I guess we are stalled with a minor improvement with the Pentium 3, and step backward per clockrate with the Pentium 4 an

  • Yes, but... (Score:4, Insightful)

    by Telastyn ( 206146 ) on Thursday May 12, 2005 @04:40PM (#12513828)
    Does it come with a dictionary?

    I mean really, even the review contained so many domain-specific terms it was hard to follow. Still, I can't imagine this as being a requirement for game programmers. Certainly many games utilize these features on PCs, but the majority of games today aren't made for PCs. And the majority of programmers on a game dev team don't deal with the game's renderer.

    Sure, it might be good to have, and much of the parallel processing practices will translate to consoles, but required?
    • Most texts come with an assumption about the prior knowledge possessed by the reader. Children's books assume their readers are children, thus using a vocabulary suitable for them. Newspapers make the same kind of assumptions.

      This book, and this review, is obviously for people who have some amount of prior experience with 3D graphics programming. Thus both may assume that the reader knows those domain-specific terms.

      If the reader doesn't, then it is the reader's task to look up their meaning.
    • Consoles also have GPUs. ATI and Nvidia supplied current-generation consoles with their chipsets, so a lot of GPU programming techniques are useful on current-generation consoles. Next-generation consoles, too, are going to use graphics hardware by ATI and nVidia. Sony has already said [totalvideogames.com] the PS3 is going to use Cg (which is what this book focuses on).

      But you're right that only the graphics team needs to worry about shaders, unless (as discussed in the review) other areas start to use the GPU for
    • And the majority of those that deal with the renderer don't need the book...

      There are 3 things that could make the things you could do with a GPU much, much cooler:
      1) Spawning vertices from within a GPU program.
      2) Vertices interacting with each other (cloth, eg)
      3) Persisting the results of vertex ops without the horrible memory bottleneck.

      Now, all of those you'll be able to do with the next-gen of consoles... not on the GPU, but with the multi-core processors.

      I'm not saying that current GPU prog

  • by radiumhahn ( 631215 ) on Thursday May 12, 2005 @04:44PM (#12513871)
    Ok. What resolution should I set my monitor so that I can actually feel the guts splatter on my face in these first person shooters?
  • Slight clarification (Score:5, Informative)

    by The Optimizer ( 14168 ) on Thursday May 12, 2005 @04:56PM (#12513956)
    I found chapter 9 by Oles Shishkovtsov on deferred shading in the soon-to-be-released computer game S.T.A.L.K.E.R. quite interesting. The game features a full deferred shading renderer, which is probably a first for a commercial game.

    The first commercial game that I know of to use a full deferred shading engine was Shrek for the Xbox, which was released in Fall 2001.
    I also worked on an unannounced PC game in 2003 that had a fully deferred shading (lighting) renderer. Alas, that title was cancelled.

    Deferred shading on the PC is not very practical on pre-shader model 2.0 hardware, though possible I'm sure. The Xbox allows direct access to the register combiners, exposing more than 2x the fragment processing power of DX8 / Shader 1-1.3 on the PC.

    • by daVinci1980 ( 73174 ) on Thursday May 12, 2005 @06:18PM (#12514474) Homepage
      Deferred shading isn't really dependent on the version of hardware you are using. It's more a question of whether it provides value.

      The entire reason behind doing deferred shading (in games) is that lighting computations per-pixel--especially when you start using HDR--are extraordinarily expensive. Deferring the shading of these fragments until you are sure they will be visible saves the GPU a ton of work.

      To briefly explain what deferred shading is (for those who aren't graphics programmers)... Deferred shading is a technique where you lay down the entire scene in two passes. The first pass renders the entire scene but with all color writes disabled. This allows modern cards to draw data at about 2x the normal rate (plus you can generally avoid any shading--except shaders that do depth replacement--which provides an additional speed increase in this pass). The second pass is then rendered using the full / normal technique. The benefit is that since the depth buffer has already been filled, the staggering amount of graphics hardware devoted to rejecting fragments that are hidden is used to full advantage. This allows for some serious speed increases, especially if the shading of visible surfaces is very complex.

      The downside of this technique (and maybe what the parent was trying to get at) is that if your shaders are not particularly complex, then this technique is really not much of a win... In fact it can be slower than standard one-pass solutions in that respect.
      • by The Optimizer ( 14168 ) on Thursday May 12, 2005 @06:59PM (#12514783)
        Deferred shading isn't really dependent on the version of hardware you are using. It's more a question of whether it provides value.

        Sort of. You're right about the value proposition, but regarding hardware, here is an example:

        One of the most important things needed during the shading pass is to obtain the fragment's position in the space you are working in. This usually means getting its X, Y, and Z values and transforming them into a space such as view space. On PC video cards, you can't read Z info from the Z-buffer (it really would be nice), so you have to store it off somewhere. On a DX9 card such as a Radeon 9600 you usually write out the Z to a separate render target using a 32F format. On a DX8 card such as a GeForce 3, there is no support for float render targets; in fact nothing beyond 8888 and x555/x565 formats is supported. A lot more work is needed to get the fragment's position, and you have fewer pixel shader ops to do it in.

        You are not entirely right in saying that deferred shading saves computations by eliminating overdraw, though yes, fast-Z rejection on subsequent passes is a big help. Overdraw is saved, but your lighting costs now turn directly into fill, and it is possible to have situations that are worse than forward (traditional) shading. The biggest benefit of deferred shading IMHO is that you separate out attribute rendering from lighting. Other benefits include complete per-pixel lighting and normal mapping for the scene, and visual consistency across the entire scene.

        As an example, if you are going to render a character using forward shading, and there are several point lights that might impact the character, each frame you need to determine which lights can impact the character and select a shader (or multi-pass approach) that supports all N lights (and their types).

        With deferred shading, you just render out the character's attributes (albedo, position, normal, specular info, whatever) to buffers (DX9 multiple render targets are a big help). Then, in the lighting passes, you project the volume affected by the light into 2D screen space and render a lighting shader into that volume. The shader looks at each fragment, calculates the light's influence on that fragment, and accumulates it (separately for diffuse and specular, usually). Repeat for each light in the scene. The upshot is that the amount of fill/pixel shading needed depends on the projected area affected by the light. The possible downside is that many of the fragments in that area may be in front of or behind the light volume, in which case the lighting calculation is performed on a pixel that the light doesn't impact (it just accumulates zero light).

        For large numbers of small, dynamic lights, deferred can be a huge win. For static lighting it could also be a win. It all really depends on what the game is doing. There are other aspects of Deferred that are different or more difficult such as alpha-blending, but a full discussion is outside the scope of this forum.
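
        A minimal GLSL sketch of the kind of per-light shader described above might look as follows. This is not the renderer being discussed; it assumes the attribute pass wrote view-space position, normal, and albedo into three G-buffer textures, and the names and the simple attenuation model are illustrative.

        // Fragment shader for one point light in a deferred lighting pass,
        // rasterized over the light volume's projected screen-space area.
        uniform sampler2D gbufPosition;   // view-space position (xyz)
        uniform sampler2D gbufNormal;     // view-space normal
        uniform sampler2D gbufAlbedo;     // diffuse albedo
        uniform vec3 lightPosition;       // light position in view space
        uniform vec3 lightColor;
        uniform float lightRadius;
        uniform vec2 screenSize;          // render target resolution

        void main()
        {
            // Fetch this pixel's attributes from the G-buffer
            vec2 uv = gl_FragCoord.xy / screenSize;
            vec3 position = texture2D(gbufPosition, uv).xyz;
            vec3 normal   = normalize(texture2D(gbufNormal, uv).xyz);
            vec3 albedo   = texture2D(gbufAlbedo, uv).rgb;

            vec3 toLight = lightPosition - position;
            float dist = length(toLight);
            vec3 L = toLight / dist;

            // Simple linear falloff; fragments outside the radius accumulate
            // zero light, the case mentioned above for pixels in front of or
            // behind the light volume.
            float atten = max(1.0 - dist / lightRadius, 0.0);
            float ndotl = max(dot(normal, L), 0.0);

            // Additively blended into the light accumulation buffer
            gl_FragColor = vec4(albedo * lightColor * ndotl * atten, 1.0);
        }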
  • by moz25 ( 262020 ) on Thursday May 12, 2005 @05:03PM (#12514003) Homepage
    The book seems like a must-have. One question though: to which extent does it apply to other manufacturer's GPUs too? I'm not entirely comfortable with it being written by one specific manufacturer if I'm looking for information applicable to all/most potential users.
    • In general, all GPU-esque books will be applicable to all manufacturers that support the required shader model. All modern cards (current top of the line NVIDIA and ATI) support shader model 2.0. NVIDIA's top of the line also support shader model 3.0.

      I believe the next ATI card is supposed to support Shader model 3.0, so you should be okay in the near-term.
    • by Anonymous Coward
      This book is about cutting-edge graphics card capabilities, in which nVidia has been the leader for a long time. ATI's design team seems to design cards with the minimum features to meet a certain spec. They try to get the cheapest design to go as fast as possible. nVidia's design team seems to try to please the developers more. Neither approach is bad, but unless one or the other company changes, nVidia will continue to be the first to tout the latest bleeding-edge features.
    • I'd guess around half the code in the book will run directly on other current high-end video cards. Another third (or so) of the techniques apply to other cards, even if the code isn't directly usable on them.

      The book also has a chapter written by a couple of guys from Apple summarizing their experiences with making their code work on both ATI and nVidia cards -- if you're interested in supporting both, this chapter alone might justify buying the book.

    • I've bought the first GPU gems, and I can tell you that the book is far from being NVIDIA-centered. Each chapter is written by experts in the field, and that means you'll get a good deal of math and physics before showing a piece of code that proves the concept. About the code used, I don't know in GPU Gems 2, but in the first one they used mostly Cg, and some writers warned about the profiles you should use to run the code. It's understandable that they use Cg, as it is their brainchild, but the language
  • by samkass ( 174571 ) on Thursday May 12, 2005 @05:21PM (#12514128) Homepage Journal
    One of the most interesting aspects of GPU programming that I've been playing with recently is Quartz Composer released as part of MacOS X 10.4's dev tools (included on the install DVD.)

    It's a visual programming environment that lets you hook together "patches" that create, control, and present audio and video. You can include GL-slang kernels as well. Also, since MacOS X 10.4's Core Image will recompile GL shader language into Altivec code if the GPU isn't up to the task, it adds a lot of flexibility as to when you're okay using the shader language. You can synchronize audio effects with real-time video effects, and hook up iSights, still images, MIDI sound, audio input, mix them all together on the CPU and GPU, and present some stunning effects. I'm certainly going to be checking this book out to see if it helps with this sort of endeavor.

    I don't want to Slashdot anyone's site, since most people working on it are just publishing their creations on personal blogs, but a few google searches can turn up some really fantastic visual effects people have created in the couple weeks it's been out.

    Here is Apple's intro to the subject:
    http://developer.apple.com/documentation/GraphicsImaging/Conceptual/QuartzComposer/qc_intro/chapter_1_section_1.html [apple.com]
    and, specifically,
    http://developer.apple.com/documentation/GraphicsImaging/Conceptual/CoreImaging/ci_plugins/chapter_4_section_3.html [apple.com]

  • FFT's on GPU's? (Score:3, Interesting)

    by viva_fourier ( 232973 ) on Thursday May 12, 2005 @05:40PM (#12514241) Journal
    The very last chapter "Medical Image Reconstruction with the FFT" was really the only one that had caught my eye -- anyone out there know of any projects involving processing loads of FFT's on a GPU such as in image restoration? Just curious...
  • This sounds really useful, but I'm not sure I want to shell out too much for a book that goes out of date.

    Are there any wikis out there that contain these sorts of GPU tricks?

  • by Anonymous Coward
    I'm not a hardware person, but I am into 3D graphics - I've written my own raytracer, etc. While many of the routines depend on hardware, I found the algorithm in Chapter 14 (Dynamic Ambient Occlusion and Indirect Lighting) to be quite nifty. It presents a way to generate ambient occlusion (a popular way of faking radiosity) without having to do raytracing. It's also deterministic, so there's no sampling noise.

    But what's especially cool is the chapter is available for free download on the GPU Gems 2 [nvidia.com] sit

  • I wonder if Open Graphics Project can gain from this book?
  • Every line of text on the main page is in italics now.
  • It seems to me that the kind of linear algebra needed for LAME encoding is similar to the operations that GPUs provide in very cheap MFLOPS. The raw performance of a pair of $100 videocards vastly exceeds that of the $200 CPU that's busy running the kernel. Is there any software out there that lets me plug videocards into my server, and run dozens or hundreds of LAME encoders on them?
    • Re:LAME GPUs (Score:1, Informative)

      by Anonymous Coward
      Others have gone down this path - i.e., let's use the GPU for non-graphics processing. The big problem is getting the data to and from the GPU processor's memory. If you're gonna feed it with audio stuff, for instance, you gotta get the data in and out. This turned out to be the bottleneck, unfortunately.

      If the GPU had high-speed direct access to main memory - ironically, the kind of architecture that those crappy low-end integrated graphics subsystems use - then this problem would go away, but currently th
      • GPUs don't seem to have problems transferring actual video between IO and VRAM, which makes audio rates look slow. And BionicFX [bionicfx.com] is exploiting exactly this approach.
  • by abes ( 82351 )
    For obvious reasons, the GPU will be best at doing linear algebra related problems quickly. I am curious, however, whether anyone has tried numerical integration (I've gone through several sites before without seeing mention of this). While the GPU may not be the best suited for this, it seems like it might be possible to still offload some work done by the CPU to the GPU, and potentially speed up simulations.

    If anyone has any information on this, I would be very interested to hear (or perhaps other har
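
    One common approach (my sketch, not something from a particular project) is forward Euler integration with the state kept entirely in textures: positions and velocities stored one particle per texel in floating-point textures and updated with ping-pong rendering, so the whole integration stays on the GPU. The names below are illustrative.

    // Fragment shader: one forward Euler step per particle.
    uniform sampler2D positions;    // current positions (xyz)
    uniform sampler2D velocities;   // current velocities (xyz)
    uniform float dt;               // time step
    varying vec2 texCoord;          // which particle this fragment updates

    void main()
    {
        vec3 p = texture2D(positions,  texCoord).xyz;
        vec3 v = texture2D(velocities, texCoord).xyz;
        gl_FragColor = vec4(p + v * dt, 1.0);   // x(t + dt) = x(t) + v(t) * dt
    }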
