United States Hardware

On the Supercomputer Technology Crisis 347

scoobrs writes "Experts claim America has been eating our 'supercomputer feed corn' by developing clusters rather than new supercomputer processors and interconnects. Forbes says America is playing catch-up and that the new federal budget items are too little too late. Cray is laying people off due to decreased federal spending and claims lower margin products have forced them to create products based on commodity parts. Red Storm, one of their new Linux-based products, is being delayed to next year."
This discussion has been archived. No new comments can be posted.

  • it makes sense (Score:5, Insightful)

    by dncsky1530 ( 711564 ) on Wednesday July 28, 2004 @05:53PM (#9825822) Homepage
    When you can build a top-5 supercomputer for under 6 million dollars using off-the-shelf parts, why spend hundreds of millions of dollars?
    • Re:it makes sense (Score:5, Insightful)

      by Otter ( 3800 ) on Wednesday July 28, 2004 @06:01PM (#9825900) Journal
      If you RTFA, an administration panel on high-end computing claims that clusters are inappropriate for certain tasks. I don't necessarily trust the claims of what I assume is an industry-heavy panel, but then I don't necessarily trust the supercomputing expertise of a bunch of Lunix fanboys "administering a network" in their parents' basement either.

      My inclination is to let the market sort itself out, although if supercomputer makers go under, they won't necessarily reappear the moment they're needed.

      • Re:it makes sense (Score:5, Insightful)

        by badboy_tw2002 ( 524611 ) on Wednesday July 28, 2004 @06:13PM (#9825997)
        Then trust the fact that not all problems are easily attacked from a parallel perspective. This means problems where working on one section of the dataset affects large amounts of data in other sections. There's a lot of locking and waiting for tasks in other parts of the system to be completed, and a lot of data transfer/need for shared memory, which, if you're bussing between cluster components, is going to be slow.

        This doesn't mean that clusters don't have some use in these regards, it just means that for these types of problems no one has figured out an efficient parallel algorithm to use on them.
        • problems where working on one section of the dataset affects large amounts of data in other sections

          Can you give a specific example? When I think about it, the most CPU-heavy problems that occur to me are highly parallelizable. Things like solving partial differential equations, for instance, which means physical simulations. Or, in the case of research centers, I suppose neural networks might be one heavy user.

          I can't think of any super-computer application that doesn't involve lots of data being proces

          • Re:it makes sense (Score:3, Informative)

            by Anonymous Coward
            Partial differential equations are NOT necessarily highly parallelisable. Linear ones, maybe. But the interesting ones that simulate Da Bombs are all nonlinear elliptic P.D.E.s.
          • Re:it makes sense (Score:4, Informative)

            by Rostin ( 691447 ) on Wednesday July 28, 2004 @08:12PM (#9826902)
            Computational fluid dynamics. The most common manifestation of this would be weather models, although it can also be used to model petroleum reservoirs, airplane wings, and the list goes on and on. The system is broken up into little chunks. What's going on in each little chunk depends on what goes on in all the other chunks. And we are talking about millions and millions of chunks.
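A minimal sketch (mine, not from the thread; all names and numbers are assumptions) of the dependency Rostin describes: a 1-D diffusion stencil split into chunks, where every chunk needs one boundary ("halo") value from each neighbour every timestep. On a real cluster those halo values cross the interconnect each step, and that per-step communication is exactly the traffic a shared-memory vector machine avoids.

```python
# Hypothetical illustration: domain decomposition with halo exchange.
import numpy as np

def step(u, alpha=0.1):
    """One explicit diffusion update; each interior cell reads both neighbours."""
    new = u.copy()
    new[1:-1] = u[1:-1] + alpha * (u[2:] - 2 * u[1:-1] + u[:-2])
    return new

def step_decomposed(u, n_chunks=4, alpha=0.1):
    """Same update on a domain split into chunks, with 1-cell halos.
    On a cluster, gathering `left`/`right` is a network message per step."""
    chunks = np.array_split(u, n_chunks)
    out = []
    for i, c in enumerate(chunks):
        left = chunks[i - 1][-1:] if i > 0 else c[:1]
        right = chunks[i + 1][:1] if i < n_chunks - 1 else c[-1:]
        out.append(step(np.concatenate([left, c, right]), alpha)[1:-1])
    return np.concatenate(out)

u = np.zeros(64)
u[32] = 1.0  # a spike in the middle of the domain
# Interior cells agree exactly; only the communication pattern differs.
assert np.allclose(step(u)[1:-1], step_decomposed(u)[1:-1])
```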
        • Then trust the fact that not all problems are easily attacked from a parallel perspective. This means problems where working on one section of the dataset affects large amounts of data in other sections.

          Possible bad example- all I'd do with this is create a virtual shared memory store on a gigabit network and use a reasonable data engine such as MySQL in SQL Server Mode to create a shared memory space. To make it really handy, put the whole thing on a terabyte ramdisk with battery backup.

          The vector pro
      • but then I don't necessarily trust the supercomputing expertise of a bunch of Lunix fanboys "administering a network" in their parents' basement either.

        Aren't the people running these top 6 clusters part of various Universities and Research Labs?

        I would think they would know if their cluster is doing what they want it to do. And if they have something that does the job, why should they go out and hand Cray several million dollars to build them something that Cray says is better?

      • Re:it makes sense (Score:4, Insightful)

        by Rei ( 128717 ) on Wednesday July 28, 2004 @06:24PM (#9826109) Homepage
        I'll agree that it sounds like an industry-heavy panel; furthermore, it sounds like a supercomputer-industry-heavy panel. Furthermore, it sounds like a *nostalgic* supercomputer-industry-heavy panel. What else could explain lines such as "In contrast, classic supercomputers that rely on very fast, specially designed vector processors 'could be programmed in Fortran,' Scarafino said. 'They could be programmed in a language that mere mortals . . . could program in.'"?

        Yes, there are tasks where supercomputers are needed. Most tasks are not among these. If there is a single parallelizable task in a CPU-intensive process, odds are that a cluster is your best bet. For example, even if your core algorithm requires intensive memory locking and must be done in a completely serial manner, if you are going to be running that core algorithm over a range of possible inputs, a cluster will probably be your best choice.
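A rough sketch of that last point (the function names and numbers below are hypothetical): the core computation stays serial, but sweeping it over independent inputs is embarrassingly parallel, so a pool of cheap workers - local processes here, cluster nodes in practice - is the natural fit.

```python
# Hypothetical parameter sweep: serial core, trivially parallel across inputs.
from multiprocessing import Pool

def run_core(param: float) -> float:
    """Stand-in for an expensive, internally serial computation."""
    total = 0.0
    for i in range(1, 200_000):
        total += (param / i) ** 0.5
    return total

if __name__ == "__main__":
    params = [0.1 * k for k in range(1, 65)]      # 64 independent inputs
    with Pool() as pool:                           # one worker per local core
        results = pool.map(run_core, params)       # on a cluster: one job per node
    print(f"best result: {max(results):.3f}")
```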
      • While you may doubt the real difference between a traditional supercomputer and a cluster, let me assure you that it is not industry bull. A cluster can efficiently deal with tasks that are easily parallelizable and don't require much communication between the nodes. There are problems that are not easily parallelizable and/or require lots of communication between nodes. A cluster will not do in this case. There is more to the super in supercomputer than just MFLOPs.

        In previous years, clusters weren't
      • The Ford guy wants the Feds to cough up $200M in tax money to develop machines that solve problems that Ford and a couple of other people have, instead of either Ford spending the money themselves or the Feds spending money getting the industry to develop computers or software that solve problems that far more people have which could be much more benefit to industry as a whole. I'm not convinced that the Feds should be doing that either, but since the Weather Service and Nuclear Weapons Designers seem hap
        • Oh, and yes, I'm a Linux fanboy, but I was also reading comp.arch (remember Usenet?) back in the days when the Attack of The Killer Micros was starting to kill the minicomputer and mainframe industry ("careful with that Vax, Eugene!") and RISC vs. CISC was still a design issue, so I do have some perspective on the game.
    • It all depends on what you need the computer to do. If it's a special purpose machine for, let's say, primality testing for large Mersenne numbers, then standard x86 or PowerPC CPUs would be a waste of money. You'd want a bunch of chips that just do FFT, and do it quickly.

      Take for example Deep Crack (luminaries, remember that one?). Perfect example of specialized hardware for a single job.

      Although, you are probably right in that most of todays supercomputing needs can be met by clustering together
      • And there's absolutely no reason why you can't put a bunch of FFT chips on a specialized PCI card, mass produce it, and get a bunch of networked FFT enhanced supercomputers.
      • Yes, but (for example) let's say a specialized CPU could do FFTs 10x faster per clock cycle than your Opterons - but usually you are looking at paying so much more that buying 20x more CPUs (even if they are Opterons) and using them is cheaper, and you get more power.

        Not only that, you can still use that cluster in the future for other jobs and not have to reinvest millions of dollars again.
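Back-of-the-envelope version of that trade-off, with entirely made-up prices just to show the shape of the argument:

```python
# Hypothetical numbers: a 10x-faster specialised part can still lose on
# throughput-per-budget to a much cheaper commodity CPU bought in bulk.
budget = 1_000_000                                 # dollars (assumed)
commodity_speed, commodity_price = 1.0, 1_000      # arbitrary speed units, $ (assumed)
special_speed, special_price = 10.0, 200_000       # 10x faster per chip, far pricier (assumed)

commodity_total = (budget // commodity_price) * commodity_speed   # 1000 chips
special_total = (budget // special_price) * special_speed         # 5 chips

print(f"commodity: {commodity_total:.0f} units, specialised: {special_total:.0f} units")
```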
      • Re:it makes sense (Score:3, Informative)

        A lot of posters in this article decry commodity hardware for sucking, but they're not considering the long term. Commodity hardware is becoming more and more suited to scientific applications. We can already see how vector optimizations like AltiVec and other SIMD extensions have worked their way into every desktop chip around.

        As 3D rendering and sophisticated media codecs are becoming the primary reasons for upgrading a home PC, the front side bus and CPU (especially the CPU) have
    • Re:it makes sense (Score:3, Interesting)

      by gUmbi ( 95629 )

      When you can build a top-5 supercomputer for under 6 million dollars using off-the-shelf parts, why spend hundreds of millions of dollars?


      Because instead of fundamentally advancing the science of computing, the industry is simply scaling commodity technology. The American supercomputer industry has gone from innovator to an assembly operation.

      We're in need of a paradigm shift. Where's the next Seymour Cray?

      Jason.
      • The American supercomputer industry has gone from innovator to an assembly operation.

        Yes and so what?

        The American space industry is taking the same path with SpaceShipOne. Don't you think that's good?

        If computers hadn't commoditized, you'd still be posting your comment from a terminal connected to a mainframe, with a line of people waiting behind you at the library. But instead, you have a great computer at home, just for you, because you can afford it. Heck, if you wanted, you could afford a s
    • by Rosco P. Coltrane ( 209368 ) on Wednesday July 28, 2004 @06:12PM (#9825990)
      Well, I don't have that kind of money, but if I did, I'd rather get me an older Cray than a cluster of beige-box PCs. If nothing else, Cray machines are classy and impressive, and when clients visit the premises, we can go "oh, and this is the Cray computer, crunching numbers for you" and watch the customer be impressed, as opposed to "oh yeah, that pile of nondescript computers, that's the 1000-node beowulf cluster of AMD Cheaperon computers". I'm sure the extra value in marketing would be worth it...
      • Re:it makes sense (Score:2, Insightful)

        by DAldredge ( 2353 )
        That is why you write an app that looks pretty and shows the state of the system on several large flat panel monitors that are flush with the wall. Just make sure you program in several 'test' modes that really look cool so you can 'test' while the VIP's are around.

        Not that I have ever done anything like that :->
    • Re:it makes sense (Score:4, Insightful)

      by ch-chuck ( 9622 ) on Wednesday July 28, 2004 @06:16PM (#9826032) Homepage
      because sometimes you need two strong oxen instead of 10240 chickens.

    • Re:it makes sense (Score:5, Insightful)

      by grawk ( 107524 ) on Wednesday July 28, 2004 @06:40PM (#9826254)
      As someone who works for a supercomputing center, I can say that some things work VERY well on cheap Unix-based clusters. I am the primary admin on a 5 TFLOP cluster. We've also got a Cray X1, and while it's only 2.6 TFLOPs, it will eat my IBM's lunch when it comes to some specifically tuned tasks. Much in the same way that we can outperform Mac clusters that have significantly higher floating point performance, because of the speeds of the interconnects. Supercomputing is about a LOT more than just raw CPU power.
  • by beee ( 98582 ) on Wednesday July 28, 2004 @05:54PM (#9825840) Homepage
    This is an expected and predicted fallout from the recent rise in popularity of beowulf clusters. Slowly but surely managers are realizing, yes, it is possible to have a supercomputer on mass-market hardware, running a free OS.

    Don't see this as bad news... it's a sign that we're winning.
    • I agree, it's definitely good news. Our tax dollars can be much better spent on other things than buying a Cray when the same level of performance can be had for much less by building a cluster.
      • If you RTFA you'll find out that "the same level of performance" CAN NOT be had by building a cluster. Clusters only help when a problem is easily parallelized, meaning it can be broken into many small parts which can easily be handled by their own low-power processor. Other problems (many modeling applications) do not fit this description and require specialized hardware which can take a large, complex problem and deal with it in one massive chunk... A cluster will choke on these problems just like your
        • by susano_otter ( 123650 ) on Wednesday July 28, 2004 @06:34PM (#9826201) Homepage
          Since clusters are so much cheaper than mainframes, it's often the case that clusters still offer better performance for the money spent than a mainframe would, even if the cluster isn't really optimized the way the mainframe is, for the task at hand.

          That being the case, wouldn't it make more sense to invest heavily in R&D to solve the cluster's problems and remove its limitations, than to invest heavily in R&D into next-gen mainframes?
          • A cluster has poor latency and bandwidth compared to moving data register to register on a CPU. Big fast CPUs have lots of local bandwidth. Clusters have less. How in frack's sake do you expect to "fix" that? It is the inherent distinction of a cluster. Separate boxes with IO connecting them can never be faster at comms than the CPU itself. A 486 has more on-chip bandwidth and better latency than gigabit Ethernet. Sure, it only has 8 registers... Not a huge range of problems that it can solve entirely on-chip... :)
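Order-of-magnitude arithmetic behind that point. The figures below are assumptions, not measurements; the 6.4 GB/s bus figure is borrowed from another comment in this thread.

```python
# Time to move an 8 KB block of operands: gigabit Ethernet vs. a local memory bus.
gig_e_bw = 125e6       # ~1 Gbit/s expressed in bytes/s
gig_e_lat = 100e-6     # ~100 us per message with TCP over Ethernet (assumed)
bus_bw = 6.4e9         # ~6.4 GB/s processor/memory bus (as cited elsewhere here)
bus_lat = 100e-9       # ~100 ns to main memory (assumed)

msg = 8 * 1024
t_cluster = gig_e_lat + msg / gig_e_bw
t_local = bus_lat + msg / bus_bw
print(f"over the wire: {t_cluster * 1e6:.0f} us, locally: {t_local * 1e6:.2f} us, "
      f"ratio ~{t_cluster / t_local:.0f}x")
```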
        • by Aadain2001 ( 684036 ) on Wednesday July 28, 2004 @06:40PM (#9826253) Journal
          Then let the people who have those problems pay for the hardware to solve those problems. For the people who are doing parallelizable work, clusters make perfect sense. I think the big guys (Cray) are just unhappy to see that most of their business is going away because their hardware isn't needed as much, since people are figuring out how to use clusters to get the jobs done.
          • You pretty much hit the nail on the head.

            The problem with the big investments in supercomputing by the U.S. government in the last decade is that, time after time, they've put one huge white elephant system after another in one national lab after another. Some problems:

            - They take a long time to build, from first RFP, to contract award, to first delivery, till they are fully deployed. By the time they are done they are usually starting to look long in the tooth and the lab starts the whole process all ov
    • Dunno about that so much as we are supporting the status quo.

      One thing about technology is that each generation feeds off the last. If we get into a cycle where our expectations for hardware emphasize quantity rather than superiority, how will we ever achieve the ultimate ends of computing: to... uh... accomplish, uh, something... um...

      Maybe it's all headed somewhere.

      M

    • Don't see this as bad news... it's a sign that we're winning.

      Right. The Cray folks have just realized that they are about to go the way of the buggy whip and the slide rule. They don't like it one bit. They can only complain by making a lot of noise. But it won't work. When you're extinct, there is no coming back.
      • They can only complain by making a lot of noise.

        That's the way it's normally done.

      • by DarkMan ( 32280 ) on Wednesday July 28, 2004 @07:22PM (#9826550) Journal
        Uh, Cray have a backlog of orders. A backlog to the tune of $153 million, if I recall correctly.

        That's not the sign of a dying business model. If they are having problems, it's down to the management, not lack of demand.

        There are problems that don't work well on clusters but rocket on a proper supercomputer. These include a lot of interesting areas, so there will always be demand for a few pieces of big iron. At the risk of echoing the ghost of IBM CEOs past, I think there's a market for somewhere around 20-30 serious top-end supercomputers in the world [0]. Most of the rest of the jobs will do just fine on high-end clusters.

        If you read the article, there are no quotes from Cray people. What there are quotes from is the people who used to get to play with special hardware, who now admin those clusters.

        It's toys for the boys, not a buggy whip issue.

        [0] That's informed by being someone who uses high-performance computing, both cluster and supercomputer.
    • This is an expected and predicted fallout from the recent rise in popularity of beowulf clusters. Slowly but surely managers are realizing, yes, it is possible to have a supercomputer on mass-market hardware, running a free OS.

      Don't see this as bad news... it's a sign that we're winning.


      Not necessarily. There are plenty of computational problems that, so far, do not lend themselves well to parallelized solutions.

      The point of this post and the linked article is that the hype about Beowulf and similar che
  • Inevitable (Score:3, Insightful)

    by Marxist Hacker 42 ( 638312 ) <seebert42@gmail.com> on Wednesday July 28, 2004 @05:55PM (#9825848) Homepage Journal
    What most people don't seem to understand is that you don't need a supercomputer when a mesh of nodes on a network will do just as well. Just like most people don't understand that a 386 running Linux and WordPerfect 5.1 is just as good a word processor as a 2.5GHz Itanium running Windows and Word. Computer power has *useful* limits as well as technological limits.
    • Re:Inevitable (Score:5, Informative)

      by mfago ( 514801 ) on Wednesday July 28, 2004 @06:01PM (#9825908)
      a mesh of nodes on a network will do just as well

      In some cases.

      Unfortunately, some problems are particularly unsuitable for clusters of commercial computers, and really benefit from specialized architectures such as shared memory or vector processors.

      A while ago it was decided by the US government to essentially abandon such specializations, and buy COTS. It is certainly cheaper, but not necessarily effective.
      • by sterno ( 16320 ) on Wednesday July 28, 2004 @06:15PM (#9826019) Homepage
        If there truly is a demand for those kinds of processors, then somebody will likely meet that demand. Right now, it seems that actual demand is so low that they have to drum up this legislation as a sort of welfare for vector processor manufacturers.

        It's a simple cost tradeoff. If you can save millions in purchasing computers, it means more money to pay for people to run those computers and do the real work.
      • Unfortunately, some problems are particularly unsuitable for clusters of commercial computers, and really benefit from specialized architectures such as shared memory or vector processors.

        My guess is that most of these problems could be done massively parallel, it's just harder to program (and thus hasn't been pursued yet). You can buy a lot of programmer-years for $10 million, though, and unlike a big vector mainframe purchase, you can share the results if you spend the money on software development ins

      • Shared memory can easily be simulated on a gigabit network, but vector processing is slightly harder. To simulate vector processors on a network, you basically have to create an extra bit of hardware for every node on the network - mass produce a vector processor card, and distribute that. You'd still probably end up cheaper than a Cray.
      • Re:Inevitable (Score:4, Insightful)

        by Performer Guy ( 69820 ) on Wednesday July 28, 2004 @08:23PM (#9826986)
        No, the government didn't abandon these. In fact the government is one of the few remaining purchasers of this type of hardware. It just so happens that a lot of problems, including the government's, are solved by clusters.

        It could be argued that at least *some* of the ASCI (Advanced SuperComputing Initiative) computers had specialized architectures with loads of bandwidth & low latency interconnect (in their day).

        It's a bit of a joke complaining about a lack of vector computing when every Intel and AMD CPU sold today has floating point vector instruction set extensions with very interesting operators.

        I'd argue that if you take those lamented early-90s supercomputers, there's not a problem they can solve faster than a relatively small contemporary cluster or even a single desktop system. A standard 4-CPU single-box desktop system with the right architecture could also spank those legacy systems in memory bandwidth, shocking but true. They just didn't keep pace with the scale and cost reduction of small systems & clusters.

        The real problem here is *relative* performance of supercomputers and commodity components, but as it takes hundreds of millions if not billions to develop a new competitive CPU & architecture and manufacture it, scientists' pockets aren't deep enough to pay for those costs (and thank goodness, because it's our tax dollars). It is rather pathetic to lament that supercomputers have been outpaced by clusters. The economics make it impossible for supercomputers sold in low numbers to keep pace. Or, more reasonably stated, the economics of consumer PC systems make powerful computing ubiquitous and affordable to the point where it no longer makes economic sense to pursue specialized processors and architectures to try to outperform them.

        If anything is to be done it would be to increase the bandwidth and reduce the latency of cluster interconnect, and guess what, that's EXACTLY what smart people are working on right now.

        As for eating America's seed corn, it is Intel and AMD that sell most CPUs used in clusters today. It is that competition and the pressure of increased development costs that make custom hardware untenable.

        It is just false to imply that supercomputing technologies fed lower end development. It is a romantic vision of trickledown technology but it is not actually how technological development works. Look at computer graphics, since the commodity PC graphics cards beat big iron from SGI there has been more innovation and development in graphics hardware, not less. There is competition and a willingness to experiment with new features. The same is true with CPUs from Intel and AMD and the architectures and innovations in memory bandwidth they constantly drive forward.
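A small sketch of the "commodity chips already have vector hardware" point: NumPy's compiled inner loops (which on most builds are backed by SIMD-capable routines) against a plain interpreted loop doing the same multiply-accumulate. The exact speedup is machine-dependent; the gap, not the number, is the point.

```python
# Vectorised dot product vs. an interpreted scalar loop over the same data.
import time
import numpy as np

n = 2_000_000
a = np.random.rand(n)
b = np.random.rand(n)

t0 = time.perf_counter()
s_loop = 0.0
for x, y in zip(a, b):        # one scalar multiply-add per iteration
    s_loop += x * y
t1 = time.perf_counter()

s_vec = float(np.dot(a, b))   # compiled, vectorised kernel
t2 = time.perf_counter()

print(f"loop: {t1 - t0:.2f}s, vectorised: {t2 - t1:.4f}s, agree: {np.isclose(s_loop, s_vec)}")
```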
  • by Billobob ( 532161 ) <{billobob} {at} {gmail.com}> on Wednesday July 28, 2004 @05:57PM (#9825865) Homepage Journal
    It appears to me as if we have reached the point where supercomputers aren't really as practical as they were before. Fewer and fewer industries need and prefer supercomputers to a cluster of cheap PCs, and the market is simply heading in that direction - nothing really unique happening here other than capitalism.

    Of course people are going to cry that companies like Cray are falling by the wayside, but the truth is that their services simply aren't as needed as they were in years past.

    • by susano_otter ( 123650 ) on Wednesday July 28, 2004 @06:39PM (#9826247) Homepage
      I suppose the counter-argument would go something like this:

      It's true that supercomputers aren't really all that useful or necessary these days. However, it may be that a future computing problem shall arise, which requires a next-generation supercomputer to solve. So we'd be well-served to have a next-generation supercomputer fresh from R&D, to apply to the problem.

      We may only encounter one or two more supercomputer-class problems, but they might be important ones. We should be prepared.

      On the other hand, we may encounter a problem that can only be solved by horses. But we don't see a lot of buggy-whip subsidies these days...
  • "Feed' Corn? (Score:4, Insightful)

    by jmckinney ( 68044 ) on Wednesday July 28, 2004 @05:57PM (#9825868)
    I think that should have been "Seed Corn."

    • Re:"Feed' Corn? (Score:3, Informative)

      by jmckinney ( 68044 )
      Offtopic, my left testicle. You're SUPPOSED to eat feed corn. You save seed corn to plant.

    • Yes, the person who posted this to Slashdot mangled the analogy in the Computerworld article [computerworld.com]. The quote from the article:

      Scarafino compared it to eating one's seed corn.

      Makes more sense this way. You eat feed corn (or rather, livestock does); you save your seed corn to plant next year's crop. Eating your seed corn is thus a very bad, short-sighted thing.

  • Expert complains: (Score:4, Insightful)

    by Anonymous Coward on Wednesday July 28, 2004 @05:58PM (#9825872)
    Free market success might lead to us actually having to pay for our own supercomputer research that we use in profit-making ventures.
  • I Need A RAIS (Score:5, Interesting)

    by grunt107 ( 739510 ) on Wednesday July 28, 2004 @06:00PM (#9825894)
    Random Array of Inexpensive Servers.

    If the 'supercomputers' of today are increasing performance, does the design really matter?

    Maybe that is a signal that monolithic computer tasks are best handled in a hive mentality - have the Queen issue the big orders, have the warriors performing security, have the workers transporting the goodies (data), and have the requisite extra daughters and suitors to grow the hive and assure its viability (redundancy).

    The fact that it is cost-effective is even better.
  • It's seed corn. Seed, as in, what you don't eat, but save to plant next year.

    Kids these days.
  • by Anonymous Coward on Wednesday July 28, 2004 @06:07PM (#9825953)
    It's the fact that clusters require higher skill to program efficiently than do single-processor systems. Plus you have all of the wasted processing power used for communication between the nodes. Granted, many problems lend themselves well to distributed computing (essentially what a cluster is, but the nodes are closer and communicate faster), but there are also problems that are handled better by a smaller amount of specialized hardware. The other point is that by using off-the-shelf parts, we are not really innovating in this space like we should be. We are allowing the commodity computer market to determine the direction of the supercomputer market.
    • What's key here is the amount of processing power you get for a given dollar. Clusters of general purpose systems may not be as efficient as a vector system, but in the end, the price makes up for the inefficiencies.

      If the cost of the system plus the cost of the geek to run it is cheaper per unit of work than it is for a vector machine then that's all there is to it.

      We are innovating by squeezing more and more processing power into smaller and smaller spaces and by improving on the interfaces for interco
    • by CatOne ( 655161 ) on Wednesday July 28, 2004 @06:21PM (#9826086)
      Granted, it is more difficult to program something (from the ground up) that runs distributed, than it is to program something that runs on a giant 2048-way box.

      Just like it's more difficult to write multithreaded code than it is to write single-threaded code.

      That's where software, and platforms come in. There is a TON of research being done, which uses technologies like Infiniband and Myrinet as interconnects, and can make a cluster "look" like a big monolithic machine. If you as an end user write code that goes down into the TCP stack itself, you're working too hard, and you're going about it the wrong way.

      Put it this way: In 5 years the odds are overwhelming that there will be a good software platform that can let you pick 5000 servers and run your app 10,000 threaded, with everything appearing just like a single process, and running "as it would on a Cray." It's easier to solve this stuff with software -- take your problem (distributed computing) and solve the problem with a different set of technologies (high performance/low latency interconnects, shared address space/DMA across machines, etc).

      Apple's Xgrid is a step in this direction. It's missing a ton of "Supercomputer" functionality right now, but it's a nice cross-machine GUI scheduler. Right now this type of app can address maybe 20% of what supercomputer apps need... in the future maybe more like 98%.
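A sketch of that "write to one interface, let the platform place the work" idea, using only the standard library as a stand-in. Everything below is illustrative; a real cluster back-end (an MPI- or grid-backed executor) would slot in behind the same interface without changing the calling code.

```python
# Code against a generic executor; the backend decides where the work runs.
from concurrent.futures import ProcessPoolExecutor

def simulate(seed: int) -> float:
    """Stand-in for one shard of a larger job."""
    x = float(seed) + 1.0
    for _ in range(100_000):
        x = (x * 1.000001) % 1_000_003.0
    return x

def run(executor_factory, shards: int = 32) -> float:
    with executor_factory() as ex:
        return sum(ex.map(simulate, range(shards)))

if __name__ == "__main__":
    # Local processes today; the same run() could be handed a cluster-backed executor.
    print(run(ProcessPoolExecutor))
```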
      • There's one thing you touch upon in passing:
        The protocol stack. Many of the speed problems with clusters come from the fact that people still use TCP (and it doesn't help that a bunch of idiots are trying to get people to use TCP/IP even at the subnet level with InfiniBand... talk about crippling InfiniBand by doing that..), with all the performance hits that entails.
  • by xenocide2 ( 231786 ) on Wednesday July 28, 2004 @06:09PM (#9825968) Homepage
    One of my professors (everybody has one of these it seems) is working on cluster computing research, extensions of MOSIX. He's a guy with networking and operating systems expertise. I wouldn't hire him to build a new generation of super computing interconnects or processors. As the Republicans have taught us, federal budgets are not a zero sum game. Why divert focus from one to the other when we could have both?

    We have to be careful about measuring these things, however. One of the goals of cluster computing was to lower the cost of computing. If the government is spending less and still meeting needs, that's not necessarily an indicator of a problem. If that means that we aren't writing code to fit into a vector platform, so be it!
  • There is no crisis (Score:4, Interesting)

    by 0x0d0a ( 568518 ) on Wednesday July 28, 2004 @06:12PM (#9825993) Journal
    Cray has been engaging in scare tactics about "America being dominated by overseas competitors" for a while; because they're terrified of losing the lucrative business contracts from government and big business, they'll pull out all the stops. They've come up in the IT press recently a couple of times.

    Screw 'em. If there's a need, the market will provide. If it turns out that the important tasks can be parallelized and run on much less expensive clusters, then all that means is that we have a more efficient solution to the problem.
  • It isn't "feed corn" that's disappearing, it's the old Cold War-style Paranoia Pork Barrel. Companies that used to lap up obscene amounts of funding for exotic hardware now have to go face to face with fast and cheap clustered COTS hardware. 25+ years of commodity-scale engineering, in the case of commecial microprocessors, has vastly outstripped the achievements of specialty supercomputer technology by the metrics of bang for the buck and constancy of improvement through time.

    Poor little babies, now wher

  • by iabervon ( 1971 ) on Wednesday July 28, 2004 @06:13PM (#9826004) Homepage Journal
    If you really want a vector-processor supercomputer you can program in Fortran, get yourself a G5 and gcc. The PPC64 supports SIMD vector processing. For that matter, any problem which benefits from vector processing is trivial to parallelize with threads.
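A minimal illustration of that last sentence (a sketch, not a benchmark): the same SIMD-friendly operation, chunked across a thread pool. NumPy releases the GIL inside its compiled kernels, so the chunks can genuinely overlap.

```python
# Chunk a vectorisable reduction across threads; results match the single call.
from concurrent.futures import ThreadPoolExecutor
import numpy as np

n, chunk = 4_000_000, 1_000_000
a = np.random.rand(n)
b = np.random.rand(n)

def chunk_dot(lo: int) -> float:
    return float(np.dot(a[lo:lo + chunk], b[lo:lo + chunk]))

with ThreadPoolExecutor(max_workers=4) as ex:
    threaded = sum(ex.map(chunk_dot, range(0, n, chunk)))

assert np.isclose(threaded, float(np.dot(a, b)))
print(threaded)
```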
  • In the age of IP and patents it seems like it is very hard for companies to make major advances [in any field] without some other company crying foul and taking that company to court over patent/IP rights, especially if the alleged infringer is a smaller company (i.e., fewer lawyers). IBM and MS, among others, are filing dozens if not hundreds of patents a day. What we are seeing as an effect is that innovation is being stifled by litigation.

    (pat pending)
  • Trickle Down (Score:4, Interesting)

    by Anonymous Coward on Wednesday July 28, 2004 @06:14PM (#9826011)
    Technology first developed on the high end slowly works its way down into the low end. What happens when the high end is no longer there?

    Not that many people really need a race car, but advances in fuels, materials, and engineering in race cars eventually lead to better passenger cars. And for raw performance, strapping together a bunch of Festivas will not get you the same as an Indy racer.

  • by wintermute42 ( 710554 ) on Wednesday July 28, 2004 @06:15PM (#9826017) Homepage

    There seems to be some historical revisionism going on regarding the demise of the "supercomputer industry". People are coming out of the woodwork now saying that lack of government support caused the great supercomputer die off.

    As Eugene Brooks predicted in his paper Attack of the Killer Micros, the supercomputer die-off was caused by the increasing performance of microprocessor-based systems. Many of us now own what used to be called supercomputers (e.g., 3GHz Pentium processors, capable of hundreds of megaFLOPs).

    The problem with supercomputers is that high performance codes must be specially designed for the supercomputer. This is very expensive. As people were able to fill their needs with high performance microprocessors they quit buying supercomputers.

    Many people who need supercomputer levels of performance for specialized applications (e.g., rendering Finding Nemo or The Lord of the Rings) are able to use walls of processors or clusters.

    There are, of course, groups where putting together off-the-shelf supercomputers will not suffice. But these groups are few and far between. As far as I can tell they consist of the government and a few corporations doing complex simulations. The problem is that this is not much of a market. Even if the government funds computer and interconnect architectural research, there does not seem to be a market to sustain the fruits of this research.

    In the heyday of supercomputers there were those who argued that when cheap supercomputers were available the market would develop. The problem is, again, programming. High-performance supercomputer codes tend to be specialized for the architecture. Also, no supercomputer architecture is equally efficient for all applications. It is difficult to build a supercomputer that is good at doing fluid flow calculations for Boeing and VLSI netlist simulation for Intel (the first application tends to be SIMD, the second, MIMD). The end result of these problems tends to suppress any emerging supercomputer market.

    The reality right now seems to be that those who are doing massive computation must build specialized systems and throw a lot of talent into developing specialized codes.

  • by Bruce Perens ( 3872 ) <bruce@perens.com> on Wednesday July 28, 2004 @06:15PM (#9826021) Homepage Journal
    One of the nice things about clusters is that they encourage people to consider how to decompose a problem so that it can work without a large high-speed shared data memory. Some of the older supercomputers were important because scientists hadn't done this work because there wasn't the economic incentive back then. Now there is one.

    So, what tasks still require a high-speed shared data memory? Answer that, and you'll understand where you can still sell a supercomputer.

    Bruce

    • So, what tasks still require a high-speed shared data memory?

      A high-speed shared memory test program?
    • How about high-resolution physical simulation (whether that be climate modeling or plasmas)? One great thing about a supercomputer is that you can hold so much data at once in the active set. One node can only hold so much data, so the full simulation has to be distributed across the whole supercomputer. It is definitely not RC5 keys, the opposing end of the spectrum in this data/compute tradeoff.
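Rough arithmetic (all sizes below are assumptions) behind "one node can only hold so much data":

```python
# A modest 1000^3 grid with a handful of double-precision fields vs. one node's RAM.
cells = 1000 ** 3            # 10^9 grid cells (assumed resolution)
fields = 8                   # e.g. velocity components, pressure, temperature... (assumed)
bytes_per_value = 8          # double precision

total_gb = cells * fields * bytes_per_value / 2**30
node_gb = 4                  # typical per-node memory of the era (assumed)
print(f"simulation state: ~{total_gb:.0f} GB -> at least {total_gb / node_gb:.0f} nodes just to hold it")
```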
  • America isn't going to be the best at everything; it's just not possible. So what if the fastest supercomputer in the world is a Japanese creation? Other than some hurt pride among builders of these things, it doesn't mean anything. The reaction, "Oh my God, someone else made something better, we better dump money on the problem for no reason," doesn't do anything, so give up already.
  • And do interesting [wistechnology.com] things [unlimitedscale.com]. And try to keep in touch [excray.com].
  • About time... (Score:5, Informative)

    by 14erCleaner ( 745600 ) <FourteenerCleaner@yahoo.com> on Wednesday July 28, 2004 @06:16PM (#9826033) Homepage Journal
    The surprising thing about this is that there are still companies making big-iron vector supercomputers. I worked in this industry from about 1980 to 1995, and when I left it was dying already. Even then, the majority of scientific computer users would rather have their own mini or microcomputer than get a small share of some behemoth Cray mainframe. It provided them more flexibility, and if they could use it 24 hours per day it was also more effective.

    For things like weather forecasting, maybe big vector machines still have an edge, but I suspect that's changing as the weather guys get more experience in using machines with large numbers of micros. This seems to have already occurred, in fact; NCAR [ucar.edu] appears to have mostly IBM RS6000 and SGI computers these days, with nary a Cray in sight.

    The most common term I used to hear in the early 90's was Killer Micros [wikipedia.org]; I think the term dates back to David Bailey sometime in the 80's. If you want more evidence that the death of the supercomputer has been going on for a long time, check out The Dead Supercomputer Society [paralogos.com], which lists dozens of failed companies and projects over the years; this page was apparently last updated 6 years ago!

    • Yeah I went to NCAR a year or so ago and their new top of the line was racks and racks of IBM RS6000's. Sure, it doesn't look as sexy as the old Cray they now use as a couch in the museum, but it does appear to get the job done.
  • Comment removed based on user account deletion
  • by gmhowell ( 26755 ) <gmhowell@gmail.com> on Wednesday July 28, 2004 @06:30PM (#9826166) Homepage Journal
    And in other news today, buggy whip manufacturers demand increased government subsidies.
  • by peter303 ( 12292 ) on Wednesday July 28, 2004 @06:31PM (#9826178)
    Forbes has been complaining that federal support of advanced computing is too little? If the government over-stimulates an industry that has too small a market, it will just delay the failure.
    Of course the government should continue its current policy of funding a few leading-edge machines that are too costly to sell into the general market but will test new technology. The government itself is a customer, with energy testing, weather modeling, medicine development, etc.
  • Money spent researching Beowulf-type systems advances the state of the art of Linux, communication systems, and other stuff related to what I do. Supercomputer research only benefits me peripherally.

    Sorry, I'm selfish, but I like the previous status quo.

    Bryan
  • by Anonymous Coward on Wednesday July 28, 2004 @06:37PM (#9826231)
    I've been in this field over 25 years, been in public position at a major lab now for 8.

    If this were a simple issue, the HPC community would have completely moved to clusters 3 or 4 years ago and never looked back. But it's not, kiddies.

    Want to run a physics projection for more than 1 microsecond? Takes real horsepower that clusters cannot provide, even distributed. Just too much damn data. Chem codes that include REAL data for useable time slices? Too slow for clustered memory. Every auto maker in the world (almost) has been whining about the lack of BIG horsepower for a few years now (crash codes and FEA). I could go on forever. Sure, some problems work awesome on clusters, which is why we have them. But definitely not all of them.

    The problem is partly diminishing returns, partly the pathetic amount of useable memory on a cluster and its joke of memory throughput, partly the growth in power of the low end and clustered networking, and partly the ridiculously long development cycles involved in High Performance Computing and the low $ returns.

    One of the biggest things congress sees is that this country will more than likely NEVER again lead the world in computing power for defense and research.

    And that's something we ought to do as the last real Superpower.

    The national labs TRIED clusters; they don't get all the jobs done they wanted (see testimony before Congress, writings in HPC journals, the last couple RFPs from US gov. labs, heck, every auto maker in the world). People in HPC _know_ it now, but having let what little there was of the supercomputer industry die out, there isn't much of an industry left to turn to now. It just may be too darned late. HPC hasn't been a money making industry since the early 80s.
    Heck, even Intel abandoned the clustered machine they custom-built for the government.

    Most folks in HPC will readily admit the Top500 is kind of a joke. The HPC Challenge numbers are a little more realistic for the tests, but we really do need something that approximates real-world applications, not just a 70s CPU benchmark.

    For those that think this is a 'Linux wins' issue,
    consider that mostly it was fast interconnect networks that allowed clustering, not the OS. Examine the history of clusters and you'll see this is true. Btw, the last few SC companies are already mostly moving to Linux anyway (NEC, Fujitsu, Cray; IBM dabbles in HPC).

    Hopefully the industry will survive long enough to allow for even better mergers of supercomputing power with low-end cost, but at this point I doubt it. Cray has been on the ropes since '96, Fujitsu's SC division is a loss leader, and NEC has been trying to get out of it for a while for something with a margin.

    Ed -gov labs HPC research punk
    -former Cray-on
    -former CDC type
    • by jsac ( 71558 ) on Wednesday July 28, 2004 @08:10PM (#9826886) Journal
      Here's the problem. On codes which need lots of data interchange, communication speed becomes the bottleneck. I don't know of anyone running a serious fluid dynamics or weather code, which are this kind of data-interchange-limited application, who gets anything near peak performance on "real-world" problems using ASCI machines. Sure, ASCI White (a 10000-node cluster) was billed as a 10-Teraflops supercomputer. Who cares, when you get 10% of peak performance if you're lucky?

      NOAA wanted to buy a supercomputer in the mid-90s, for weather and climate simulations. They did the requirements analysis and decided that a Japanese vector supercomputer was what they needed -- nobody in the U.S. made them anymore. Seymour Cray flipped out -- a government organization buying foreign supercomputers? heresy! -- pulled a bunch of strings, and very soon thereafter Japanese supercomputers faced a stiff tariff because the Japanese were "dumping" their product on the U.S. market. Of course, that meant NOAA couldn't get their NEC. They ended up buying some American-made cluster and getting their piss-poor 5% of peak performance.

      Well, two years ago, Japan brought Earth Simulator online. It's a cluster of 5000 vector processors; it boasted 30 Teraflops peak performance, which was 3 times as fast as the then-current number one machine, ASCI White. And a group from NOAA went over to Japan on invitation to check the machine out. They spent on the order of a week adapting some of their current codes to the ES architecture and fired them up. And got 66% of peak performance right off the bat.

      How'd that happen? Well, ES cost on the order of $100 million. (By the way, as a rule, if your 'supercomputer' cost less than $10 million, it's not really a supercomputer.) Of that, about $50 million went into developing the processor interconnect -- it's a 5000-way(!) crossbar, for you EE types. With an interconnect that big and fast, the communication bottleneck which dooms the big physics codes suddenly disappears.

      So, yeah, the U.S. supercomputer market ate its own seed corn. To see Earth Simulator jump to the top of the Top 500 was something of a slap in the face; to see it get 20 Teraflops on real-world problems was a terrible blow to the prestige of the U.S. supercomputing community. And not one we're going to easily recover from.
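The sustained-versus-peak arithmetic behind those figures, taking the numbers in the comment above at face value rather than independently verifying them:

```python
# Percent of peak is what determines real-world throughput.
machines = {
    "US cluster, 10% of 10 TFLOPS peak": (10.0, 0.10),
    "US cluster, 5% of 10 TFLOPS peak": (10.0, 0.05),
    "Earth Simulator, 66% of 30 TFLOPS peak": (30.0, 0.66),
}
for name, (peak_tflops, efficiency) in machines.items():
    print(f"{name}: ~{peak_tflops * efficiency:.1f} TFLOPS sustained")
```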
  • by gillbates ( 106458 ) on Wednesday July 28, 2004 @06:38PM (#9826242) Homepage Journal

    I've seen a lot of naive comments suggesting that supercomputers are being replaced by clusters. The truth is, anyone who can replace their supercomputer with a cluster didn't need a supercomputer in the first place:

    Compared to a supercomputer:
    1. The prime advantage of an x86-based server is that it is cheap, and it has a fast processor. It is only fast for applications in which the whole dataset resides in memory - and even then, it is still the slowest of the group.
    2. Clusters are a little better, but suffer from severe scalability problems when driving IO-bound processes. As with the x86 server, if you can't put the full dataset into memory, you might as well forget using a cluster. The node-to-node throughput is several orders of magnitude slower than the processor bus in multiple-CPU systems (6.4GB/s vs 17MB/s for regular Ethernet, or 170MB/s for Gigabit; a rough comparison follows this list).
    3. Multiple-CPU servers do better, but still lack the massive storage capacity of the mainframe. They work better than clusters for parallel algorithms requiring frequent synchronization, but still suffer from a lack of overall data storage capacity and throughput.
    4. Mainframes, OTOH, possess relatively modest processors, but the combined effect of having several of them, and the massive IO capability, makes them very good for data processing. However, their processors aren't fast at anything, and often run at 1/2 or 1/3 the speed of their desktop counterparts.
    5. Supercomputers combine the IO throughput of a mainframe with the fast processors typically associated with RISC architectures (if you can still consider anything RISC or CISC nowadays). They have faster processors, more memory, and much greater IO throughput than any other category.
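Taking the bandwidth figures in the list above at face value, here is the rough comparison promised there: the time to stream a dataset that does not fit in memory through each link.

```python
# Streaming 100 GB through the three links cited in the parent comment.
dataset_bytes = 100 * 2**30
links = {
    "processor/memory bus (6.4 GB/s)": 6.4e9,
    "gigabit Ethernet (170 MB/s, as quoted)": 170e6,
    "regular Ethernet (17 MB/s, as quoted)": 17e6,
}
for name, bandwidth in links.items():
    minutes = dataset_bytes / bandwidth / 60
    print(f"{name}: {minutes:.1f} minutes")
```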
    It used to be that the primary demand for faster computers came from the scientific and business communities. But now that the internet has turned computers into glorified televisions, the challenges have gone from crunching numbers to serving content:
    1. Clusters are great for serving read-only content, because there's very little active synchronization required between nodes, and the aggregate IO capacity scales well.
    2. Mainframes reign when it comes to IO throughput - companies that formerly had use for a supercomputer are finding that their role is shifting to more of an information-provider role; faster processors are no longer as important as fast IO subsystems.
    3. Scientists aren't being trained to use the computer as a tool; most think of a computer more or less as a means of verifying their hypothesis, rather than a means of discovering possible explanations. Their primary work is done with a calculator and pencil, and only later, when they need something to back up their ideas, do they turn to a computer simulation. The computer is a verification tool, not a means of discovery.

    As our economy has shifted away from a technological base to an entertainment one, the need for supercomputers has begun to evaporate. We outsource innovation overseas so that we can lounge around on the couch watching tv and drinking beer (or surfing the net and drinking beer). The primary purpose of technological innovation has shifted from that of discovering the universe to merely bringing us better entertainment.

  • by Anonymous Coward
    One technology that I work with is called Artificial Life and is basically large evolutionary software simulations. (This is not exactly the same thing as genetic programming, but it's close.) This is an example of something that just plain doesn't cluster well. Try to cluster one of these, and you will max out a gigabit switched LAN in less than a second (I've done it!). I've even maxed out a gigabit "star configuration" LAN with this stuff. It just doesn't cluster.

    The problem is that these simulatio
  • Hammertime (Score:3, Insightful)

    by Graymalkin ( 13732 ) * on Wednesday July 28, 2004 @06:50PM (#9826325)
    There is no super computer technology crisis, there is however a paradigm shift happening in the supercomputer market. Twenty years ago building your own supercomputer, even a loosely coupled cluster, was not a very viable option for most research institutions. Today this option is not only viable but often exercised.

    Obviously the big SC vendors and designers are seeing less business roll their way; why pay them tons of money when you can have grad students assemble your cluster for the price of some pizzas? That isn't to say SC clusters are the end-all be-all of computing, but they're very useful and relatively inexpensive. Realistically they're simply an extension of what Cray started with their T3D supercomputer. The T3D was very impressive in its day, but now the technology to build such systems is in the hands of just about everyone.

    Taco: What the hell is up with the IT color scheme? This is even worse than the scheme for the Games section. I know the Slashdot editors don't actually read the site but other people try to and we're not all colorblind or reading from grayscale monitors.
  • by anactofgod ( 68756 ) on Wednesday July 28, 2004 @06:56PM (#9826368)
    Two...I have two words for you!

    Seriously, I don't see the problem, so long as companies like IBM [ibm.com] and (dare I say it) Microsoft [microsoft.com] continue to do research in this area. That is the real value of companies that are committed to *real* research in revolutionary sciences and technology.

    Of course, US companies don't have a hammerlock on this research. There is a lot of work being done internationally in the area, by corporations, and by educational/research institutions.

    ---anactofgod---
  • by twigles ( 756194 ) on Wednesday July 28, 2004 @07:01PM (#9826391)
    Is it because there is a perceived zero-sum game being played between Linux-based clusters and supercomputers? Hey, let's take a reality check here: a lot of research is not directly applicable. In fact, I've read numerous discussions on /. railing against the MBAs and the Bush regime for not funding anything that doesn't turn a profit within about 18 months or have something to do with killing brown people.

    Letting supercomputing die may be harmless; after all, the US doesn't have to be the best at everything in the world, and some other country will fund the research. But from some of the more coherent posts I've read, it seems like supercomputing has a definite niche in the natural sciences, something we should be pushing for a better society - learning for learning's sake - and paying for out of public coffers. My taxes go to a lot of shitty things I'd rather they not go to, like subsidizing Halliburton with no-bid contracts. Why is it so offensive to /.'rs that the country as a whole subsidizes advanced computing? Isn't computer science all about seeing what can be computed? Letting supercomputing die because it's expensive seems like an extraordinarily short-sighted thing to do.
  • So what (Score:4, Insightful)

    by DarkOx ( 621550 ) on Wednesday July 28, 2004 @07:46PM (#9826703) Journal
    What does it matter if we don't develop single-unit supercomputers? Clearly, in a free market, if these things had value they would be pursued. There are no predatory tax laws on supercomputers, or any other regulations on domestic use. The only reason development has slowed is there is not much market for the beasts.

    There are many reasons for that, too. For one, other than in stellar, nuclear, mathematical, and bio research fields, few industries need more computing power than can be had off the shelf any day of the week. That was not true yesterday: it took all sorts of custom hardware to make CGI happen in films that can now be done in my basement in reasonable time frames. So no more supercomputer market there; the ROI is gone. I am sure this plays out in all sorts of other engineering fields as well.

    Many places where you do need supercomputing power can be served with clustered systems that are cheap to build and cheap to maintain.

    At least people in the pure science and research fields have learned to be better thinkers and programmers; they found ways to do things in parallel that were traditionally serial. Things that still are serial can be made to work on a cluster; sure, it might take longer than on a single computer considered equal FLOPS-wise, but considering I could either spend all the money I saved making my cluster bigger and more powerful (so I can get back to equal time) or spend it on other profitable efforts while I wait, there is again no ROI.

    It so happens that many of the most interesting questions in math, physics, and computer science, such as quantum theory, need massive amounts of parallel work rather than serial, so that works better on a cluster anyway.

    If there is a real reason to do it, people will build supercomputers, because there is nothing stopping them other than economics. No need to fear: supercomputers are not going away. Everyone else that needs that kind of processing power will settle for clusters, as well they should. This is just another largely obsolete industry wanting someone to bail them out because they have failed to adapt to a changing market. If they are going to die we should let them, just like we should let the universities adapt or die, and the RIAA needs to adapt or die; we need to stop propping up obsolete industries so new ones can replace them!
  • Interesting... (Score:3, Insightful)

    by Xabraxas ( 654195 ) on Wednesday July 28, 2004 @08:11PM (#9826888)
    It's true these "parallel processing" machines can go fast--Virginia Tech built the third-fastest machine in the world for just $5.2 million with 1,100 G5 chips from Apple Computer (Nasdaq: AAPL). But they have proven "exceptionally difficult to program" and problematic at certain performance levels, according to a 2004 study by the President's High-End Computing Revitalization Task Force.

    Oh really. Don't blame me for not trusting a guy with that kind of potential bias.

  • by bombadillo ( 706765 ) on Wednesday July 28, 2004 @09:23PM (#9827342)
    This reminds me of a quote from Cray. It goes something like, "Would you rather have 1024 chickens pulling a wagon or a big ox?"
  • It is about cost (Score:4, Insightful)

    by deadline ( 14171 ) on Wednesday July 28, 2004 @10:24PM (#9827674) Homepage
    Sigh...

    There never really was a supercomputer market. There was a Cold War that subsidized the supercomputer market.

    Then there is the cost. Companies stopped making SCs because they were too expensive. If the guy from Ford wants to pay $1 billion for a supercomputer, I am sure someone will build him one. The cost to build a fab is over $4 billion. Why do you think HP teamed with Intel? Why do you think there are so few processor families? You have to make a living in the commodity market, where you can sell things in the millions, because supercomputers even in their heyday were sold in the hundreds.

    Then there is the problem that many problems are solvable on clusters. So those specialized problems cannot depend on other parts of the HPC market to help subsidize their corner of the market; i.e., clusters make the really hard problems more expensive.

    It is a question of how much you want to pay to solve your problem. Simple economics, actually. If the numbers don't work, the problem doesn't get solved. If the Gov. wants to solve some problems (and during the Cold War they did), then they can step in and subsidize the market.

    And don't cry about Japan and the Top500. When the Top500 has a price column, then it will start to be meaningful.

  • by barneyfoo ( 80862 ) on Thursday July 29, 2004 @06:10AM (#9829731)
    If you read the papers at the recent OLS (Ottawa Linux Symposium) you'll see that SGI is running Linux images (specially tuned) on 64, 128, 256, and in 2 cases 512 CPUs. Reading the paper is an interesting view into the problems of running kernels and OSes on such huge NUMA machines.

    http://www.finux.org/proceedings/ [finux.org]

"Protozoa are small, and bacteria are small, but viruses are smaller than the both put together."

Working...