GCHQ Builds a Raspberry Pi Super Computer Cluster
mikejuk writes GCHQ, the UK equivalent of the NSA, has created a 66 Raspberry Pi cluster called the Bramble for "educational" purposes. What educational purposes isn't exactly clear but you do associate super computers with spooks and spies. It seems that there was an internal competition to invent something and three, unnamed, GCHQ technologists decided that other Pi clusters were too ad-hoc. They set themselves the target of creating a cluster that could be reproduced as a standard architecture to create a commodity cluster. The basic unit of the cluster is a set of eight networked Pis, called an "OctaPi". Each OctaPi can be used standalone or hooked up to make a bigger cluster. In the case of the Bramble a total of eight OctaPis makes the cluster 64 processors strong. In addition there are two head control nodes, which couple the cluster to the outside world. Each head node has one Pi, a wired and WiFi connection, realtime clock, a touch screen and a camera. This is where the story becomes really interesting. Rather than just adopt a standard cluster application like Hadoop, OctaPi's creators decided to develop their own. After three iterations, the software to manage the cluster is now based on Node.js, Bootstrap and Angular. So what is it all for? The press release says that: "The initial aim for the cluster was as a teaching tool for GCHQ's software engineering community....The ultimate aim is to use the OctaPi concept in schools to help teach efficient and effective programming."
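The summary mentions both 66 Pis and 64 processors; the difference is the two head nodes. A minimal sketch of the layout as described, with constant and function names invented here for illustration:

```python
# Layout of the Bramble as described in the summary.
# The numbers come from the summary; the names are hypothetical.
PIS_PER_OCTAPI = 8
OCTAPIS = 8
HEAD_NODES = 2

def bramble_node_counts(octapis=OCTAPIS, pis_per_octapi=PIS_PER_OCTAPI,
                        head_nodes=HEAD_NODES):
    """Return (compute_nodes, total_boards) for a cluster built from OctaPis."""
    compute = octapis * pis_per_octapi
    return compute, compute + head_nodes

compute, total = bramble_node_counts()
print(f"{compute} compute Pis + {HEAD_NODES} head nodes = {total} boards")
```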
Re: (Score:3)
Re: (Score:2, Troll)
They reinvented a wheel (a cluster) that was useless (an "educational" cluster) and did it using low end, shitty parts (Raspberry Pis), then reinvented the software side of it, three fucking times, until they settled on a Frankenstein amalgam of "Node.js, Bootstrap and Angular".
Lego case 64 Raspberry Pi Cluster (Score:2)
They reinvented a wheel (a cluster)
Actually, they reinvented the wheel not just in the generic sense but also in the specific sense: someone else has already built a 64-node Raspberry Pi cluster [southampton.ac.uk], and instead of a custom-designed case, theirs used a home-built Lego case, which was definitely cooler. Of course, this should not be too surprising. It was made by GCHQ, after all, so they probably got the idea from reading this guy's email!
Re: (Score:2)
Re: (Score:2)
Heh, back in the first dotcom boom I was working for a company that was making supercomputing clusters that were recursively scalable. Back then you could get all the dumb, fast 8-port switches you wanted for cheap, but if you tried scaling up with a big flat Cisco backplane for more than a few dozen nodes, you'd easily start paying more for switches than for computers. Plus, if any of that infrastructure broke, you'd be out a huge part of your cluster.
Re:GCHQ Does Something Retarded (Score:5, Informative)
So, there's a better computer for the same price, which wouldn't have the unusually-strong requirement to avoid inter-node communication. The Pi's fine as a beginner's learning tool, but it's a bad model for scaling up to PC-type hardware that a "real" cluster would probably be built out of.
Educational Purposes (Score:1)
They said it's for educational purposes. The point isn't performance -- you're not getting that from RasPis. The point is either to train people on supercomputer programming or to test supercomputer programs on smaller data sets without using time on a real, expensive supercomputer.
I could see building a smaller scale one of these myself as a way to learn MPI.
Re: (Score:2)
If you're just wanting to learn MPI, then a regular multicore PC/laptop is perfectly fine, since it's basically a tiny HPC with a very fast interconnect. If you have access to more than one machine, even better, as you've then got two machines connected with a slow interconnect, so you properly feel the pain of communication costs, and how to distribute your workload best.
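The pattern you'd practise first in MPI is scatter, compute, gather. A rough sketch of that shape follows. This is NOT MPI: threads from Python's standard library stand in for ranks so the example runs anywhere, and the function names are made up for illustration. On a real cluster each chunk would travel over the interconnect, which is where the communication cost shows up.

```python
from concurrent.futures import ThreadPoolExecutor

def scatter(data, n_workers):
    """Split data into n_workers nearly equal contiguous chunks."""
    base, extra = divmod(len(data), n_workers)
    chunks, start = [], 0
    for rank in range(n_workers):
        size = base + (1 if rank < extra else 0)
        chunks.append(data[start:start + size])
        start += size
    return chunks

def partial_sum(chunk):
    """Work done independently by each worker: sum of squares of its chunk."""
    return sum(x * x for x in chunk)

def parallel_sum_of_squares(data, n_workers=4):
    """Scatter the data, compute partial sums in workers, then reduce."""
    chunks = scatter(data, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(partial_sum, chunks))  # the "gather" step
    return sum(partials)                                # the "reduce" step

print(parallel_sum_of_squares(list(range(1000)), n_workers=8))
```

Because of Python's global interpreter lock this gives no speedup; it only shows how the work is divided and recombined.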
Re: (Score:2)
The C1 is pretty impressive, but the Raspberry Pi 2 also has a quad-core ARMv7 and 1GB of RAM now. I might pick up a C1 to play with though; it has some neat stuff. It's the first $35 board I've seen that actually seems as good as or better than the RPi. I'd say the community around the Pi might still make it the better choice. I picked up a couple of the Pi A+ boards and camera modules, and at $45 for the combo it's hard to beat for surveillance cameras. The A+ uses next to nothing in the way o
Re: (Score:2, Insightful)
Yet still higher than yours.
Re: (Score:2)
You can get GeForces these days with up to 2,048 CUDA cores and the memory bandwidth to actually use them. A Raspberry PI is a cute toy--I have one--but anyone wanting to do massively-parallel computation has plenty of faster and cheaper choices out there. There's also the FPGA route if your routines are simple enough and you have enough of an EE background.
That's likely why he thinks it's retarded... it doesn't solve any problems that aren't better solved by other solution
Re: (Score:1)
Buy a single damn video card. [...] That's likely why he thinks it's retarded... it doesn't solve any problems that aren't better solved by other solutions.
I'm not seeing how a single video card will help solve the problem of teaching how to build a cluster out of multiple networked computers. Nor would it look nearly as cool, which is directly relevant to the purpose of (to quote TFA) "getting children interested in science and engineering". It sounds like what GCHQ came up with succeeds much better at achieving the goals in question.
I was going to do that! (Score:1)
Virtual (Score:2)
Easier to do this with VM software and a big PC.
Re:Virtual (Score:5, Interesting)
This would get you 11,856 GFLOPS (4 * 2964) of raw performance. Those 64 Pis will crunch through roughly 1,536 GFLOPS [wikipedia.org] (64 * 24). Wow, it's not even a contest. The big caveat is power consumption: the Pis will be a lot more efficient than a modern Radeon card, offset by the fact that they'll have to work a lot longer to get the job done.
So let's try using the CPUs instead. We'll compare this cluster against a modern medium-to-high-end Intel processor, the i7-2600K. The Pis use a 700 MHz ARM processor that manages 0.041 GFLOPS each, for a total of 2.624 GFLOPS across the 64 boards. The Intel chip pushes 8.5 GFLOPS [maxxpi.net].
As an efficient use of money, this Pi cluster is a total failure. As a research toy it has some value, but total performance is less than a fairly ordinary PC that costs roughly the same. This doesn't even count all of the switches and power supplies and whatnot you need for the Pi cluster. Even if you aggressively overclock all of the Pis they just won't catch up. In general you top out at about doubling the CPU performance of a Pi with aggressive overclocking and the GPU generally only overclocks about 50% or so. The power consumption figures aren't even all that different when you consider that the Pi needs to be crunching for at least 4 times as long on any particular problem and that you have 66 of them to feed. Even a couple of watts add up across that many machines.
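For anyone who wants to check the arithmetic, here is a small sketch using only the figures quoted in this comment. The variable names are made up, and the per-device GFLOPS numbers are the commenter's, not independently verified:

```python
# Figures as quoted in the comment above (not independently verified).
N_PIS = 64
PI_GPU_GFLOPS = 24.0       # per-Pi GPU figure
PI_CPU_GFLOPS = 0.041      # per-Pi CPU figure (700 MHz ARM)
GPU_CARD_GFLOPS = 2964.0   # per-card figure used in "4 * 2964"
N_GPU_CARDS = 4
I7_2600K_GFLOPS = 8.5

pi_gpu_total = N_PIS * PI_GPU_GFLOPS            # Pi cluster, GPUs only
gpu_rig_total = N_GPU_CARDS * GPU_CARD_GFLOPS   # four-card rig
pi_cpu_total = N_PIS * PI_CPU_GFLOPS            # Pi cluster, CPUs only

print(f"GPU: rig {gpu_rig_total:.0f} vs Pi cluster {pi_gpu_total:.0f} GFLOPS "
      f"({gpu_rig_total / pi_gpu_total:.1f}x)")
print(f"CPU: i7 {I7_2600K_GFLOPS} vs Pi cluster {pi_cpu_total:.3f} GFLOPS "
      f"({I7_2600K_GFLOPS / pi_cpu_total:.1f}x)")
```

On these numbers the GPU rig comes out roughly 7.7x ahead and the single i7 roughly 3.2x ahead on CPU, before counting any communication overhead between Pis.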
Education Tool for Cluster Programming (Score:2)
The Pi uses 4 watts, so a cluster of 64 Pis will use around 256 watts. An NVIDIA GTX 960 will provide 2,308 GFLOPS at 120 watts, or around 20 GFLOPS per watt. The GTX 980 is even better, at 28 GFLOPS per watt. The Adapteva Epiphany-IV is supposed to do 100 GFLOPS at 2 watts.
The Tegra X1 can do 512 GFLOPS at probably somewhere between 5 and 10 watts.
But even if you built a Tegra X1 cluster, for many applications it would still be less power efficient than a smaller number of more powerful machines with a good interconnect:
E
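The per-watt figures in this comment can be reproduced with a few lines. This is only a sketch over the numbers as quoted (the 5 W and 10 W Tegra cases are the two ends of the range the commenter guessed at), and the names are made up for illustration:

```python
def gflops_per_watt(gflops, watts):
    """Throughput per watt, from the figures quoted in the comment."""
    return gflops / watts

PI_WATTS = 4
N_PIS = 64
cluster_watts = PI_WATTS * N_PIS  # total draw of the 64 compute Pis

# (GFLOPS, watts) pairs exactly as quoted; not independently verified.
quoted = {
    "GTX 960": (2308, 120),
    "Epiphany-IV": (100, 2),
    "Tegra X1 @ 5 W": (512, 5),
    "Tegra X1 @ 10 W": (512, 10),
}
efficiency = {name: gflops_per_watt(g, w) for name, (g, w) in quoted.items()}

print(f"Pi cluster draw: {cluster_watts} W")
for name, value in efficiency.items():
    print(f"{name}: {value:.1f} GFLOPS/W")
```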
Re: (Score:2)
As an efficient use of money, this Pi cluster is a total failure.
Depends what you buy it for. If you buy it to maximise FLOPS/$, then yeah, it's a failure. If you buy it to have a super cheap-ass cluster on which you can run actual cluster code with realistic latencies and a realistic programming model for a networked cluster, then it's a perfectly good use of money.
It's cheap enough that each programmer can have their own private cluster on which to develop, test and optimize high performance codes before they go
Supercomputer Cluster? (Score:2)
Re: (Score:2)
The common definition of super computer, is a computer with top performance compared to other computers. This one doesn't even come close.
Re: (Score:2)
My desktop PC is also a scalable model of a super computer, if you look at it that way.
Re:Supercomputer Cluster? (Score:4)
Your machine does not scale out to 64 processors for $2k.
Yes, it is slower, and yes, it could probably be done with a better set of hardware like some video cards.
However, the point is likely that the people doing the experiment wanted to have about 64 processors, and they knew how to use the Pi as opposed to the instruction sets for GPU coding. That and the Pi should have all the networking and other pieces on it to make it a standalone node.
When you want to practice working on a computer with multiple compute nodes, it is more important to have more nodes, because that is where the complexity is. You then scale out on a better class of nodes using the same principles: either more Pis, or better ARM nodes with similar characteristics but better performance.
Re: (Score:2)
Your machine does not scale out to 64 processors for $2k.
But in order to make a super computer (according to my definition above), 64 is not enough.
BEOWULF (Score:2)
but Beowulf would approve!!!
Re: (Score:2)
Sure. But since the goal was educational, not production, what they did is pretty reasonable. That is, they built a large cluster of computers for kids to learn parallel programming on, using dirt-cheap commodity components accessible to kids. Sure, it's not a supercomputer in that it won't be on the Top 100 list, but it's a good educational "trainer" supercomputer, in that learning parallel programming on it teaches the programming models (though not the specific languages) used by the real supercomputers.
Now, if
Re: (Score:2)
Super computer? (Score:2)
Can you really call something with the performance of a high-end desktop PC (or maybe a dual-processor workstation) a "super computer cluster"?
Re:Super computer? (Score:5, Informative)
Can you really call something with the performance of a high-end desktop PC (or maybe a dual-processor workstation) a "super computer cluster"?
Hey, some CS nerd had $4500 left in his budget for the year, and the PR dept at GCHQ was desperate for *anything* that didn't involve destroying the security of the UK people.
Re:Super computer? (Score:5, Insightful)
Re: (Score:2)
Define "a lot". Those are 0.9 GHz Cortex A7 cores. How much faster is a 4GHz latest-gen i7, considering the i7 is much faster clock-for-clock, and also has a much-higher clockspeed? And what about when you've got eight of those cores?
If we assume (and I'm pulling a number out of thin air here) that the Intel chip has four times the clock-for-clock performance, you'd get a dual-processor 8-core system having the performance of 284 RPi 2 cores, which is pretty close. Now, I pulled my numbers out of my ass, bu
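The back-of-envelope estimate above can be written out as follows. Every input is the commenter's guess (in particular the 4x per-clock factor, which the comment itself says is pulled out of thin air), and the function name is invented here:

```python
def equivalent_pi_cores(intel_cores, intel_ghz=4.0, pi_ghz=0.9,
                        per_clock_factor=4.0):
    """Number of Pi 2 cores matching the Intel cores, under the guessed ratios."""
    return intel_cores * (intel_ghz / pi_ghz) * per_clock_factor

# Dual-processor system with 8 cores per processor = 16 cores.
print(round(equivalent_pi_cores(16)))
```

Changing `per_clock_factor` shows how sensitive the result is to that one guess: at 2x the same system matches only about 142 Pi 2 cores.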
Re: (Score:2)
Where are you getting 264? The article says 64.
Raspberry Pi has about 850 DMIPS, or 54400 for the whole cluster, and an i7 has about 127000 DMIPS. And then you still have all the communication overhead for the cluster, and the divided memory. For floating point, it's a similar difference.
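A quick check of that DMIPS comparison, using only the figures quoted above (not independently verified) and ignoring the communication overhead the comment mentions:

```python
PI_DMIPS = 850        # per Pi, as quoted
I7_DMIPS = 127_000    # single i7, as quoted
N_PIS = 64

cluster_dmips = PI_DMIPS * N_PIS
ratio = I7_DMIPS / cluster_dmips

print(f"cluster: {cluster_dmips} DMIPS, i7: {I7_DMIPS} DMIPS, "
      f"i7 is {ratio:.2f}x the whole cluster")
```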
Re: (Score:1)
Super Computer?? (Score:4, Interesting)
Since when does a collection of low powered machines ever deserve the term "Supercomputer"?
Even TFA doesn't call it a supercomputer.
All I can assume is a click-bait headline
Re: (Score:1)
The fact that it's a cluster of smaller compute elements does. Supercomputers are usually really powerful, but they don't need to be in order to qualify. It's about system architecture and parallelism, not power.
Re: (Score:2)
I think it is best to think of this as a scale model, just like those Eiffel towers some people have built out of matchsticks. Yes, it is not a real tower, because it is not really, err, towering, but it is still an Eiffel tower.
The only difference is that this was done for training purposes (i.e. not letting newbies burn CPU time on the expensive real supercomputer cluster), rather than as a hobby.
Spy agency puts camera in cluster controller (Score:2)
Shocked and amazed? anyone?
Not a bad teaching tool (Score:4, Insightful)
Hello,
This actually isn't a bad idea... as a training tool. It exposes GCHQ's interns (or other programmers and IT pros) who are not used to programming or managing clusters to the underlying concepts, so they can then go and apply those to whatever real projects they have.
Not everyone gets exposure to distributed computing concepts as part of their education, and having a small, simple system like this is a good and inexpensive means of introducing them to new hires. The homebrew cluster management software is another example of this.
Regards,
Aryeh Goretsky
Re: (Score:3)
Sadly I think you're the only one I've seen that actually understood what it was all about. Everyone else seems to think they should have used 64 $1000 computers instead. Good idea for work, bad for a training project.
Re: (Score:2)
You can do the same training on 64 VMs running on a single desktop PC.
Re: (Score:2)
It must be weird being an intern for a vast criminal enterprise.
Mostly academic... (Score:2)
I manage a large compute cluster for my job. I also have a Pi and love it for what it is. Building a Pi cluster could give people an opportunity to try parallel programming and learn the sysadmin side, like getting a scheduler working or using Salt or a similar tool to manage a cluster.
However, I imagine a single Intel i5-4960 would smoke a 64-node Pi cluster. It's a worthwhile experiment, but probably not the best thing for most real-world use.
Re: (Score:2)
http://www.bbc.com/news/educat... [bbc.com]
"It is this decline which prompted GCHQ to start visiting schools to promote languages and also science and technology."
The option to use a chipset that was gaining traction in the media for education would have been a consideration.
Good optics and branding with a happy all UK message.
Re: (Score:2)
but probably not the best thing for most real-world use.
Unless the real-world use is writing and optimizing HPC codes for a cluster. Given that's actually 99% of the hard work, I'd say that the RPi cluster is an excellent choice.
Then once you've written, tested and optimized your code without hogging the expensive high performance cluster, you can ship it out to that cluster and be pretty sure it will work and work well.
cost/gflop (Score:1)
RPI is quite inefficient.
2.4 watts
0.041 GFLOPS
= 0.017 GFLOPS/watt
i7-4790K
123.4 watts
117.35 GFLOPS
= 0.951 GFLOPS/watt
0.951/0.017 = ~56x more efficient per watt. In raw throughput the gap is far larger: 117.35/0.041 = ~2,862x, so you'd need to spend about $35 * 2,862 = ~$100,000 on RPis to match the performance of a single i7.
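Efficiency per watt and raw throughput are two different ratios, and only the second tells you how many boards you'd need. A sketch that computes both from the figures quoted in this comment (names are made up, figures not independently verified, and perfect scaling across Pis is assumed, which is generous):

```python
import math

PI_WATTS, PI_GFLOPS, PI_PRICE_USD = 2.4, 0.041, 35
I7_WATTS, I7_GFLOPS = 123.4, 117.35

pi_eff = PI_GFLOPS / PI_WATTS      # GFLOPS per watt
i7_eff = I7_GFLOPS / I7_WATTS
efficiency_ratio = i7_eff / pi_eff

# Boards needed to match one i7 on raw throughput, assuming perfect scaling.
pis_needed = math.ceil(I7_GFLOPS / PI_GFLOPS)
cost_usd = pis_needed * PI_PRICE_USD

print(f"per-watt advantage of the i7: {efficiency_ratio:.1f}x")
print(f"Pis needed to match one i7: {pis_needed} (about ${cost_usd:,})")
```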
Re: (Score:2)
RPi: $35 per node * 64 nodes = $2240
i7-4790: $500 per node * 64 nodes = $32000
If you're testing cluster-computing algorithms, it doesn't really matter how fast your nodes are, but there are situations you'll encounter with a real networked cluster that you can't simulate on a single compute node, no matter how fast it is. It's much like how you can use multitasking on a single-threaded processor to simulate a multicore processor, but there are entire families of contention issues you'll never encounter.
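A toy illustration of the kind of contention issue meant here: a lost update on a shared counter. This is a deterministic simulation, not real cluster code. Each "node" does a read-modify-write split into two steps, and an explicit schedule decides how the steps interleave. All names are invented for the example.

```python
def increment_steps(store):
    """One read-modify-write, split in two so a scheduler can interleave it."""
    seen = store["counter"]        # step 1: read the shared value
    yield
    store["counter"] = seen + 1    # step 2: write back the stale value + 1

def run(schedule, n_workers=2):
    """Advance workers one step at a time in the order given by schedule."""
    store = {"counter": 0}
    workers = [increment_steps(store) for _ in range(n_workers)]
    for idx in schedule:
        next(workers[idx], None)   # None default absorbs the end of a worker
    return store["counter"]

print(run([0, 0, 1, 1]))  # one worker finishes before the other starts
print(run([0, 1, 0, 1]))  # both read before either writes: one update is lost
```

The first schedule is what a strictly serialized run looks like; the second only shows up when two nodes can genuinely overlap, which is why testing on one fast machine can miss it.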
Hardware connected to the GPIO pins (Score:2)
The article doesn't have any details about the extra hardware that's connected to the RPi boards.
From the low-res picture, you can see that in each 8-RPi unit, all Pis have a "PiGlow" connected, and 4 of the 8 Pis have an extra network card (or another device with an RJ45 cable) plugged into the GPIO slot (with a pass-through for the PiGlow).
Does anyone know what device that is?
Re: (Score:2)
After a bit of Google Image Search, I think I found the answer:
Those are Xtronix PoE adapters, used to power the RPis.
Notice how each RPi without this card is connected by a MicroUSB-to-MicroUSB cable to a nearby RPi that does have one.
That's stupid and wasteful:
1) They are using a big honking switch with PoE in the back - those are expensive.
2) Powering the next RPi using USB backfeed going over two fuses will ensure that that second RPi gets undervolted and will be very unstable. They cou
Re: (Score:2)
Misleading FUD! (Score:1)
Re: (Score:1)