United Kingdom Technology

GCHQ Builds a Raspberry Pi Super Computer Cluster

mikejuk writes: GCHQ, the UK equivalent of the NSA, has created a 66-node Raspberry Pi cluster called the Bramble for "educational" purposes. What the educational purposes are isn't exactly clear, but you do associate supercomputers with spooks and spies. It seems that there was an internal competition to invent something, and three unnamed GCHQ technologists decided that other Pi clusters were too ad hoc. They set themselves the target of creating a cluster that could be reproduced as a standard architecture to create a commodity cluster. The basic unit of the cluster is a set of eight networked Pis, called an "OctaPi". Each OctaPi can be used standalone or hooked up to make a bigger cluster. In the case of the Bramble, a total of eight OctaPis makes the cluster 64 processors strong. In addition there are two head control nodes, which couple the cluster to the outside world. Each head node has one Pi, a wired and a WiFi connection, a realtime clock, a touch screen and a camera. This is where the story becomes really interesting. Rather than just adopt a standard cluster application like Hadoop, OctaPi's creators decided to develop their own. After three iterations, the software to manage the cluster is now based on Node.js, Bootstrap and Angular. So what is it all for? The press release says that: "The initial aim for the cluster was as a teaching tool for GCHQ's software engineering community....The ultimate aim is to use the OctaPi concept in schools to help teach efficient and effective programming."

  • Now I know that I'm generating good ideas.
  • Easier to do this with VM software and a big PC.

    • Re:Virtual (Score:5, Interesting)

      by jandrese ( 485 ) <kensama@vt.edu> on Monday March 16, 2015 @04:17PM (#49270501) Homepage Journal
      I wonder which is faster: a box full of high-end ATI cards doing GPU processing, or 64 RPis doing GPU processing? I'm guessing the ATI cards are probably faster because the VideoCore IV on the Pi is pretty crappy. 66 Pis is roughly $2k. That would buy you 10 Radeon R9 280s, which is more than you can fit in a box. Let's assume you have 4 of them and use the other $1200 on the rest of the machine.

      This would get you 11,856 GFLOPS (4 * 2964) of raw performance. Those 64 Pis will crunch through roughly 1,536 GFLOPS [wikipedia.org] (64 * 24). Wow, it's not even a contest. The big caveat will be power consumption: the Pis will be a lot more efficient than a modern Radeon card, offset by the fact that they'll have to work a lot longer to get the job done.

      So let's try using the CPUs instead. We'll compare this cluster against a modern medium-high end Intel processor, the i7-2600K. The Pis use a 700MHz ARM processor that manages 0.041 GFLOPS, for a total of 2.624 GFLOPS across the 64 compute boards. The Intel chip pushes 8.5 GFLOPS [maxxpi.net].
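The totals in this comment can be reproduced in a few lines. This is only a sketch: the per-device figures are the peak numbers quoted in the comment (not measurements), the aggregate assumes perfect scaling with no communication cost, and the constant and function names are made up for illustration.

```python
# Peak GFLOPS per device, as quoted in the comment; not measured values.
R9_280_GFLOPS = 2964.0     # one Radeon R9 280
PI_GPU_GFLOPS = 24.0       # VideoCore IV on one Pi
PI_CPU_GFLOPS = 0.041      # 700MHz ARM core on one Pi
I7_2600K_GFLOPS = 8.5      # one Intel i7-2600K

GPU_CARDS = 4              # cards assumed to fit in one box
COMPUTE_PIS = 64           # the two head nodes are not counted


def aggregate(per_device: float, count: int) -> float:
    """Naive total: assumes perfect scaling and zero communication cost."""
    return per_device * count


gpu_box = aggregate(R9_280_GFLOPS, GPU_CARDS)
pi_gpus = aggregate(PI_GPU_GFLOPS, COMPUTE_PIS)
pi_cpus = aggregate(PI_CPU_GFLOPS, COMPUTE_PIS)

print(f"4x R9 280 GPUs : {gpu_box:,.0f} GFLOPS")
print(f"64x Pi GPUs    : {pi_gpus:,.0f} GFLOPS")
print(f"64x Pi CPUs    : {pi_cpus:.3f} GFLOPS")
print(f"i7 vs Pi CPUs  : {I7_2600K_GFLOPS / pi_cpus:.2f}x")
```

Real cluster workloads would land below these totals, since the naive sum ignores the network between the boards.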

      As an efficient use of money, this Pi cluster is a total failure. As a research toy it has some value, but total performance is less than a fairly ordinary PC that costs roughly the same. This doesn't even count all of the switches and power supplies and whatnot you need for the Pi cluster. Even if you aggressively overclock all of the Pis they just won't catch up. In general you top out at about doubling the CPU performance of a Pi with aggressive overclocking and the GPU generally only overclocks about 50% or so. The power consumption figures aren't even all that different when you consider that the Pi needs to be crunching for at least 4 times as long on any particular problem and that you have 66 of them to feed. Even a couple of watts add up across that many machines.
      • The Pi uses 4 watts, so a cluster of 64 Pis will use around 256 watts. An Nvidia GTX 960 will provide 2,308 GFLOPS at 120 watts, or around 20 GFLOPS per watt. The GTX 980 is even better with 28 GFLOPS per watt. The Adapteva Epiphany-IV is supposed to do 100 GFLOPS at 2 watts.
        The Tegra X1 can do 512 GFLOPS at likely something between 5 and 10 watts.

        But even if you built a Tegra X1 cluster, for many applications it would still be less power efficient than a smaller number of more powerful machines with a good interconnect:
        E

      • As an efficient use of money, this Pi cluster is a total failure.

        Depends what you buy it for. If you buy it to maximise FLOPS/$ then yeah, it's a failure. If you buy it to have a super cheap-ass cluster on which you can run actual cluster code with realistic latencies, the programming model of a networked cluster, etc., then it's a perfectly good use of money.

        It's cheap enough that each programmer can have their own private cluster on which to develop, test and optimize high performance codes before they go

  • Cluster? maybe. Super? hardly.
    • but Beowulf would approve!!!

    • by laird ( 2705 )

      Sure. But the goal was educational, not production, so what they did is pretty reasonable. That is, they built a large cluster of computers for kids to learn parallel programming on, using dirt-cheap commodity components accessible to kids. Sure, it's not a supercomputer in that it won't be on the Top 100 list, but it's a good educational "trainer" supercomputer, in that learning parallel programming teaches the programming models (though not the specific languages) used by the real supercomputers.

      Now, if

      • by Nerrd ( 1094283 )
        And that's how words lose their meaning. I don't mean to pooh-pooh the educational aspect of learning to program for a cluster, not at all; just don't label it something it clearly is not.
  • Can you really call something with the performance of a high-end desktop PC (or maybe a dual-processor workstation) a "super computer cluster"?

    • Re:Super computer? (Score:5, Informative)

      by bill_mcgonigle ( 4333 ) * on Monday March 16, 2015 @03:04PM (#49269759) Homepage Journal

      Can you really call something with the performance of a high-end desktop PC (or maybe a dual-processor workstation) a "super computer cluster"?

      Hey, some CS nerd had $4500 left in his budget for the year, and the PR dept at GCHQ was desperate for *anything* that didn't involve destroying the security of the UK people.

    • Re:Super computer? (Score:5, Insightful)

      by memph ( 4038105 ) on Monday March 16, 2015 @03:13PM (#49269849)
      That thing is a lot more powerful than a desktop. I agree it's no supercomputer, but it does have 264 900MHz Cortex-A7 cores, and would be a good test bed for bigger 10000+ core systems.
      • by Guspaz ( 556486 )

        Define "a lot". Those are 0.9 GHz Cortex-A7 cores. How much faster is a 4GHz latest-gen i7, considering the i7 is much faster clock-for-clock and also has a much higher clock speed? And what about when you've got eight of those cores?

        If we assume (and I'm pulling a number out of thin air here) that the Intel chip has four times the clock-for-clock performance, you'd get a dual-processor 8-core system having the performance of 284 RPi 2 cores, which is pretty close. Now, I pulled my numbers out of my ass, bu

      • by itzly ( 3699663 )

        Where are you getting 264? The article says 64.

        Raspberry Pi has about 850 DMIPS, or 54400 for the whole cluster, and an i7 has about 127000 DMIPS. And then you still have all the communication overhead for the cluster, and the divided memory. For floating point, it's a similar difference.

      • by memph ( 4038105 )
        I was just doing a rough CPU*cycles count. The 264 was from 66 CPUs * 4 cores each. From the DMIPS specs, one Intel i7 core seems to be about 20x faster than a Cortex-A7, so the system here would be about the same as a workstation with 13 Intel i7 cores. The 'about 20x' part is tricky and would likely take testing for a particular application. Also, speed depends on the bottleneck: the system here has 66 separate paths to 66 separate main memories, and the Intel workstation only has 1 (or 2).
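The two estimates in this sub-thread start from different assumptions about the hardware (64 single-core boards in one reply, 66 quad-core boards in the other), which is why they disagree. A small sketch that only replays the quoted numbers makes this explicit; the variable names are illustrative, and nothing here is a benchmark.

```python
# itzly's estimate: one 850-DMIPS core per board, 64 compute boards.
PI_DMIPS = 850
I7_DMIPS = 127_000
COMPUTE_BOARDS = 64

cluster_dmips = PI_DMIPS * COMPUTE_BOARDS
i7_advantage = I7_DMIPS / cluster_dmips       # how far ahead the single i7 is

# memph's estimate: 66 boards with 4 Cortex-A7 cores each, and one i7 core
# taken to be roughly 20x faster than one Cortex-A7 core (a rough guess).
ALL_BOARDS = 66
CORES_PER_BOARD = 4
I7_CORE_SPEEDUP = 20

cluster_cores = ALL_BOARDS * CORES_PER_BOARD
i7_core_equivalents = cluster_cores / I7_CORE_SPEEDUP

print(f"cluster DMIPS (itzly): {cluster_dmips}")
print(f"i7 / cluster (itzly) : {i7_advantage:.2f}x")
print(f"cluster cores (memph): {cluster_cores}")
print(f"i7-core equivalents  : {i7_core_equivalents:.1f}")
```

Neither figure accounts for communication overhead or divided memory, both of which the replies point out.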
  • Comment removed (Score:4, Interesting)

    by account_deleted ( 4530225 ) on Monday March 16, 2015 @02:35PM (#49269443)
    Comment removed based on user account deletion
  • Super Computer?? (Score:4, Interesting)

    by OzPeter ( 195038 ) on Monday March 16, 2015 @02:37PM (#49269457)

    Since when does a collection of low powered machines ever deserve the term "Supercomputer"?

    Even the TFA doesn't call it a Supercomputer.

    All I can assume is a click-bait headline

    • by joemck ( 809949 )

      The fact that it's a cluster of smaller compute elements does. Supercomputers are usually really powerful, but they don't need to be in order to qualify. It's about system architecture and parallelism, not power.

    • I think it is best to think of this as a scale model, just like those Eiffel towers some people have built out of matchsticks. Yes, it is not a real tower, because it is not really, err, towering, but it is still an Eiffel tower.

      The only difference is that this was done for training purposes (i.e. not letting newbies burn CPU time on the expensive real supercomputer cluster), rather than as a hobby.

  • Shocked and amazed? anyone?

  • by Aryeh Goretsky ( 129230 ) on Monday March 16, 2015 @03:43PM (#49270217) Homepage

    Hello,

    This actually isn't a bad idea... as a training tool. It exposes GCHQ's interns (or other programmers and IT pros) who are not used to programming or managing clusters to the underlying concepts, so they can then go and apply those to whatever real projects they have.

    Not everyone gets exposure to distributed computing concepts as part of their education, and having a small, simple system like this is a good and inexpensive means of introducing them to new hires. The homebrew cluster management software is another example of this.

    Regards,

    Aryeh Goretsky

    • by amiga3D ( 567632 )

      Sadly I think you're the only one I've seen that actually understood what it was all about. Everyone else seems to think they should have used 64 $1000 computers instead. Good idea for work, bad for a training project.

    • by itzly ( 3699663 )

      You can do the same training on 64 VMs running on a single desktop PC.

    • by AmiMoJo ( 196126 ) *

      It must be weird being an intern for a vast criminal enterprise.

  • I manage a large compute cluster for my job. I also have a Pi and love it for what it is. Building a Pi cluster could give people an opportunity to try parallel programming, and learn the sysadmin side, like getting a scheduler working or using Salt or a similar management tool to manage a cluster.

    However, I imagine a single Intel i5-4960 would smoke a 64-node Pi cluster. It's a worthwhile experiment, but probably not the best thing for most real-world use.

    • by AHuxley ( 892839 )
      GCHQ staff teach 'future spies' in schools (9 March 2011)
      http://www.bbc.com/news/educat... [bbc.com]
      "It is this decline which prompted GCHQ to start visiting schools to promote languages and also science and technology."
      The option to use a chipset that was gaining traction in the media for education would have been a consideration.
      Good optics and branding with a happy all UK message.
    • but probably not the best thing for most real-world use.

      Unless the real-world use is writing and optimizing HPC codes for a cluster. Given that's actually 99% of the hard work, I'd say that the RPi cluster is an excellent choice.

      Then once you've written, tested and optimized your code without hogging the expensive high performance cluster, you can ship it out to that cluster and be pretty sure it will work and work well.

  • by Anonymous Coward

    RPI is quite inefficient.

    2.4watts
    0.041gflop

    = 0.017 gflop/watt

    i7-4790k

    123.4 watts
    117.35 gflop
    = 0.951 gflop/watt

    0.951/0.017 = ~56x faster. You'd need to spend $35 * 56 = $1960 on RPI's to get the performance of a single i7.
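A quick check of the arithmetic in this comment, using only the figures it quotes. One thing the sketch makes visible: the roughly 56x figure is a ratio of GFLOPS per watt, while the ratio of raw GFLOPS on the same numbers is a separate and much larger value. The variable names are illustrative.

```python
import math

# Figures as quoted in the comment; not independently measured.
PI_WATTS, PI_GFLOPS = 2.4, 0.041
I7_WATTS, I7_GFLOPS = 123.4, 117.35
PI_PRICE_USD = 35

pi_efficiency = PI_GFLOPS / PI_WATTS          # GFLOPS per watt
i7_efficiency = I7_GFLOPS / I7_WATTS

efficiency_ratio = i7_efficiency / pi_efficiency   # per-watt advantage
throughput_ratio = I7_GFLOPS / PI_GFLOPS           # raw-speed advantage

# Boards needed to match one i7 on raw GFLOPS, assuming perfect scaling.
pis_for_equal_throughput = math.ceil(throughput_ratio)
cost_for_equal_throughput = pis_for_equal_throughput * PI_PRICE_USD

print(f"Pi : {pi_efficiency:.3f} GFLOPS/W")
print(f"i7 : {i7_efficiency:.3f} GFLOPS/W")
print(f"efficiency ratio : {efficiency_ratio:.1f}x")
print(f"throughput ratio : {throughput_ratio:.0f}x")
print(f"Pis to match one i7 on raw GFLOPS: {pis_for_equal_throughput}"
      f" (${cost_for_equal_throughput:,})")
```

On these inputs, the cost of matching a single i7 depends on which ratio is used, so the dollar estimate changes considerably if raw throughput rather than efficiency is the target.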

    • RPi: $35 per node * 64 nodes = $2240

      i7-4790: $500 per node * 64 nodes = $32000

      If you're testing cluster-computing algorithms, it doesn't really matter how fast your nodes are, but there are situations you'll encounter with a real networked cluster that you can't simulate on a single compute node, no matter how fast it is. It's much like how you can use multitasking on a single-threaded processor to simulate a multicore processor, but there are entire families of contention issues you'll never encounter.
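A toy illustration of that last point, not taken from the thread: two workers each perform a read-modify-write on a shared counter. If the scheduler only ever switches between whole tasks, no update is lost; once the steps can interleave, as they can with genuinely concurrent nodes, one update disappears. The `run` function and the schedules are hypothetical, and this is a deterministic replay rather than real parallel code.

```python
from typing import Dict, List


def run(schedule: List[str]) -> int:
    """Replay a schedule of worker steps against one shared counter.

    Each worker takes exactly two steps, in order: read the counter into
    a private copy, then write back that copy plus one. The schedule
    names which worker takes its next step.
    """
    counter = 0
    private: Dict[str, int] = {}        # value each worker has read so far
    for worker in schedule:
        if worker not in private:       # first step: read
            private[worker] = counter
        else:                           # second step: write back
            counter = private.pop(worker) + 1
    return counter


# Worker A finishes before worker B starts.
serial = run(["A", "A", "B", "B"])

# Both workers read before either writes, so one increment is lost.
interleaved = run(["A", "B", "A", "B"])

print("serial schedule     :", serial)
print("interleaved schedule:", interleaved)
```

A test environment that can only ever produce the first schedule will never reveal the bug that the second schedule exposes.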

  • The article doesn't have any details about the extra hardware that's connected to the RPi boards.

    From the low-res picture, you can see that in each 8-RPi unit, all Pis have a "PiGlow" connected and 4 of the 8 Pis have an extra network card (or another device with an RJ45 cable) plugged into the GPIO slot (with a pass-through for the PiGlow).

    Does anyone know what device that is?

    • by psergiu ( 67614 )

      After a bit of Google Image Search I think I found the answer:

      Those are Xtronix PoE adapters and are used to power the RPis.
      Notice how each RPi without this card is connected using a MicroUSB to MicroUSB cable to a nearby RPi that does have this card.
      That's stupid and wasteful:
      1) They are using a big honking switch with PoE in the back - those are expensive.
      2) Powering the next RPi using USB backfeed going over two fuses will ensure that that second RPi gets undervolted and will be very unstable. They cou

  • Dross and a misleading article. It states they couldn't disclose the source of how to build one, maybe because they stole it from a Southampton University professor's original work and would rather you didn't build one, putting their failure to shame on public record! See: http://www.southampton.ac.uk/~... [southampton.ac.uk] Beowulf clusters are a dime a dozen and easy enough to do. GCHQ is certainly not the first to think of it, nor is their HQ somewhere I would advocate my child should work. The USAF already did something
    • South Hampton or Southhampton, what-eva, it's not quite scouse and it's not quite cockney, it's Birmingham! (emphasis on the word ming as in minging or minger!) Seriously, ARM chains doing CPU intensive processing, I completely agree with the guy talking about GPU vs CPU but as to it being 64 bit over 32 shows what they know, it's never been 8 bits, it's actually 7 with 1 floating bit and the raspberry pi will never be popular because of the fact it's based on Prism (ARM) technology, seriously, as if pe
