Distributed Computing Overview
Fruitiger writes: "Well, P2P / distributed computing is all the rage these days, so if you want a good breakdown of who's doing what when, check out this article at Network World Fusion. Focuses on Porivo Technologies and provides some glimpses of what's to come in the future. An interesting appetizer before Intel's P2P Working Group meeting later this week."
Network World Fusion (Score:1)
How many are interested though? (Score:2)
P2P worked for me! (Score:5)
Trust (Score:2)
It's like when you share a bathroom with someone. Generally it's ok to share a toilet, sink, and the same roll of toilet paper (as long as users aren't there at the same time). But you don't want to share your bathroom with some stinky loser. It might lead to your comfortable room with the reading stool looking like a public restroom.
this could be interesting (Score:3)
The concept of "sign up for our ISP and get a 'free' computer" was wildly popular... How about: run our software and get a free computer?
This has MANY advantages, including:
1) It really is free, you don't have to pay for the ISP service (which was more like financing)
2) Parents can get computers just for their kids, and while the kids are in school/asleep the computers can be running various routines and be paying themselves off
In turn, this would help increase the number of people with computers, since poorer families wouldn't have to pay for the computer, just give up some of its time... and in turn, it would increase computer literacy...
I'm also sure geeks/gamers would love this opportunity, since it's a way to get a powerhouse computer (or computers?) free, or at least relatively cheap
and that's only the beginning....
I'd like to see how this turns out, and how it gets used/abused...
--------------
Distributed computing (Score:1)
I have just one question about this: how well would it work with single processes, or is it still limited by the same data-dependency problems that most parallel-processing and multi-processing schemes encounter?
If someone has found a way to take a single application (one that is parallel-processing dumb) and get it to run across multiple machines, well, they could have some serious cash coming their way.
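A toy illustration of that data-dependency wall (my own sketch, nothing from the article): the first loop below is friendly to "parallel-processing dumb" splitting because its iterations are independent and could be farmed out to other machines, while the second can't be split because every step needs the previous one's result.

```java
// Toy illustration of the data-dependency problem (illustrative sketch).
public class Dependency {
    public static void main(String[] args) {
        double[] in = new double[1_000_000];
        java.util.Arrays.fill(in, 2.0);

        // Independent iterations: each element could be computed on a
        // different machine, so this parallelizes trivially.
        double[] out = new double[in.length];
        for (int i = 0; i < in.length; i++) {
            out[i] = Math.sqrt(in[i]);
        }

        // Dependent iterations: step i needs the result of step i-1,
        // so no number of spare machines speeds this up.
        double acc = 1.0;
        for (int i = 0; i < in.length; i++) {
            acc = Math.sqrt(acc + in[i]);
        }
        System.out.println(out[0] + " " + acc);
    }
}
```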
How do they figure... (Score:4)
Pardon me, but how do they figure P2P will implement itself? Any new tech needs a human being to set it up, maintain it, format the results and queries, and explain the whole thing to the boss. This tech will save companies money by letting them get more computing power for less cash, but the upkeep and back end are still needed. Maybe they figure this will be a nice little project for all those lazy IT folks just lounging around doing nothing... I'm certain the SETI thing isn't running itself...
"Porivo's Peer client, which resides on a user's desktop, works with the company's PeerPlane management software, which can reside on a dedicated server."
Great, we all have an extra dedicated server lying around..
They use Java! (Score:2)
Spiffy stuff. (Score:2)
Not only does this help the average Joe user (he can sell his CPU time to people who need to compile something), but think of it in a workplace.
Where I work, everyone has a workstation. Problem is, everyone does their CPU-heavy work on a central server, so those workstations are being underutilized. Sun (and a number of others) have addressed this problem halfway: they have software that queues up tasks and distributes them to idle CPUs as they free up.
What if we had a sort of "CPU NFS" in a way that individual instructions are handed off to remote machines, rather than entire jobs?
Mm, I want.
Of course, any such idea would be riddled with difficulties (I wager the complications would be like the ones NFS has, only worse), but the idea, again, is attractive.
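As a rough sketch of that queue-up-and-distribute idea (my own illustration, with a local thread pool standing in for idle remote workstations; this is not Sun's actual software):

```java
import java.util.concurrent.*;

// Sketch of "queue up tasks, hand them to idle CPUs": a thread pool
// stands in for the pool of idle remote machines. Illustrative only.
public class JobQueue {
    public static void main(String[] args) throws Exception {
        ExecutorService idleCpus = Executors.newFixedThreadPool(4);
        CompletionService<Long> results =
                new ExecutorCompletionService<>(idleCpus);

        for (int job = 0; job < 16; job++) {
            final long n = 40_000_000L + job;
            results.submit(() -> {          // queued until a CPU frees up
                long sum = 0;
                for (long i = 0; i < n; i++) sum += i;
                return sum;
            });
        }
        for (int job = 0; job < 16; job++) {
            System.out.println("done: " + results.take().get());
        }
        idleCpus.shutdown();
    }
}
```

The hard part of the real thing, of course, is that the "pool" is other people's machines, which come and go.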
Is this something Mojonation could expand to? They mention several times that you 'donate cpu cycles', but it never seems to directly state you can sell build time on your machine.
I'd like to see that.
Re:this could be interesting (Score:1)
If you use the computer too much, would you forfeit your *free* computer?
How would this affect the people who constantly refresh Slashdot trying to get a first post?
Re:Spiffy stuff. (Score:2)
I personally use Platform's LSF product to control all of my queueing. I have custom Perl scripts sitting around it, allowing jobs to requeue on certain exit codes, or even to validate that the output of a job is correct before declaring it done.
Hook all of this up to a spiffy database to track where things are in the queues and you have a semi-decent render-management toolkit.
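For flavor, the requeue-on-exit idea boils down to something like this (a hypothetical Java rendition of those wrapper scripts, not Platform's API; the job command and the validation hook are placeholders):

```java
import java.io.IOException;

// Hypothetical requeue-on-exit wrapper: rerun a job command until it
// exits 0 and its output passes a sanity check, then declare it done.
public class Requeue {
    public static void main(String[] args)
            throws IOException, InterruptedException {
        String[] cmd = {"/bin/sh", "-c", "exit 0"}; // placeholder job
        for (int attempt = 1; attempt <= 3; attempt++) {
            Process p = new ProcessBuilder(cmd).inheritIO().start();
            int exit = p.waitFor();
            if (exit == 0 && outputLooksValid()) {
                System.out.println("job done on attempt " + attempt);
                return;
            }
            System.out.println("exit " + exit + ", requeueing");
        }
        System.out.println("giving up after 3 attempts");
    }

    // Stand-in for validating the job's output before calling it done.
    static boolean outputLooksValid() { return true; }
}
```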
I have also read stuff on distributed fileserving - a PVFS system. Interesting ideas on ways to avoid needing one massive system and instead use smaller, cheaper boxes to get the job done.
Re:How many are interested though? (Score:2)
See http://www.parabon.com for more info.
Compute Against Cancer (Score:1)
Re:How do they figure... (Score:1)
You can read our new white papers (API and platform) at: http://www.parabon.com
Re:How many are interested though? (Score:1)
People get "discouraged" when they don't see the flashing numbers of their stats climb "high enough," which is pretty sad. What will kill distributed computing projects is morons... just like morons kill everything else.
I got your glimpse of the future right here... (Score:3)
------------
"John, what is the future of Distributed computing?"
I'm glad you asked. Coming up soon are three more business schemes and companies that will take applications for future testing and promise you'll make $$Big Bucks$$ for your spare CPU cycles. The companies will then stop updating their page about a month before disappearing altogether.
Next up: Slashdot re-runs an article from June's Issue of Wired!
Re:They use Java! (Score:2)
obvious (Score:2)
Re:How do they figure... (Score:2)
Re:I got your glimpse of the future right here... (Score:2)
Re:obvious (Score:1)
Re:this could be interesting (Score:2)
Buy a computer for someone else to use, so that you can have some part-time cycles, vs. buying a computer for yourself and setting it on your local network, where you can have the cycles always?
-Tommy
No private systems (Score:1)
Re:Compute Against Cancer (Score:1)
As to email addresses: no, we most definitely do *not* sell them.
Re:How do they figure... (Score:1)
Re:this could be interesting (Score:2)
Well, there are non-capital costs to running these machines: electricity for one, a physical place for the machine for another, and keeping the thing running.
It's a bit of a stretch, sure, but it's conceivable that the cost-benefit could work out for somebody.
-
Re:Spiffy stuff. (Score:2)
Re:They use Java! (Score:3)
They can then port their SDK to different platforms, and the Java plugins will work fine anywhere and execute much closer to native speed.
Plus, if you buy all the hype regarding JITs, Java isn't taking all that much of a performance hit over native code anyway.
-josh
Re:this could be interesting (Score:2)
And I'm using about 5% of the CPU. Right now, I could be doing something actually useful - but I'm not. If I start the distributed.net client, I then start using ALL my CPU.
The bottom line is, since my computer is basically being used as a development platform (httpd is for some web dev stuff, the ircd is just me playing around with stuff), I'm not actually using too much CPU time. If it were a desktop, I'd probably be using even less (well, not really - most of the processes right now are sleeping, waiting for I/O). I'm not using much processing power. Any modern OS is capable of multitasking - some better than others. Because of this, I can happily type away at this message while the dnet client runs happily in the background.
If I give a family a PC, and say "here - you get a free PC, a free Internet connection, and all you have to do is keep it on all the time" who wouldn't like the deal? They're paying for electricity - depending on how the networking's done, they aren't paying for anything else. And since the PC is doing actual work, they needn't be bothered by ads. Sounds like a nice setup to me.
As long as the distributed client didn't eat too many CPU cycles, they wouldn't notice - and probably wouldn't care. The only time I ever find the need to go and kill all unneeded processes is when I want to play CPU-intensive games. For someone who wants to do word processing and surf the web, that'd be a very nice deal.
.Net (Score:1)
Re:How many are interested though? (Score:1)
Certainly, those who are running our compute engine (with our benevolent projects) before then will be alerted when the money starts. :)
Re:They use Java! (Score:3)
Re:.Net (Score:1)
Re:this could be interesting (Score:2)
-- Michael Chermside
Nothing new here (Score:2)
Back in the day (1991? 92?), batch.uu.net was an expensive Sun 640MP that just couldn't cut it. Pushing USENET through the box was hard enough. Compressing batches of news articles for dialup customers was more than it could handle. Instead of buying more hardware, our fearless leader came up with the idea of using all of the idle machines in the office at night (Sparc SLC/ELC/SS1/SS2) to run compress on them through rsh pipes.
Moral - a good sysadmin in your hand is worth two P2P sales reps in the bush.
For the most free computing power at your fingertips, hire script kiddies.
Re:They use Java! (Score:1)
How much memory is a Java process going to take? I've heard different stories about how fast Java is, but I've never heard anybody claim that their JVM goes easy on the RAM. And how is the RAM used in a JVM? (All that memory has to be used for something!) Is it the sort of usage that could be easily swapped out when you really start using the computer? Even then, swapping out your JVM so you can get real work done is a lot slower than the simple rescheduling and context switch it takes to hand over the CPU.
I'm a lot more likely to notice a JVM running in the background than something spatially trivial like an RC5 client.
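You can at least eyeball the heap side of the question from inside the JVM (a quick sketch; note it only sees the Java heap, not the VM's own native footprint - code cache, thread stacks, the JIT itself):

```java
// Quick sketch: report the JVM's own view of its heap usage. This
// misses the VM's native overhead (code cache, stacks, JIT, etc.).
public class HeapPeek {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        System.out.printf("heap used: %d KB of %d KB committed (max %d KB)%n",
                used / 1024, rt.totalMemory() / 1024, rt.maxMemory() / 1024);
    }
}
```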
This sounds familiar... (Score:2)
slashdotted when I tried.. here's the text (Score:1)
...week to define how to share unused CPU, storage capacity across nets.

By APRIL JACOBS
Network World, 10/09/00

Intel, Hewlett-Packard, IBM and a slew of start-ups will meet this week to set up the structure for a working group that would give corporate customers a new way to harness the collective power of networked PCs, workstations and servers for computer- and storage-intensive jobs.

Instead of purchasing more hardware and software and hiring the IT staff needed to set up and support it, an emerging technology called peer-to-peer (P2P) computing will let users access valuable resources when they aren't being used. The result: Users could save millions of dollars by tapping unused processing and storage resources.

P2P basically sets up a virtual supercomputer by allowing the exchange of data among multiple computers connected via a network. The software that powers Napster and Gnutella is often held up as the best example of the power P2P can harness.

In addition to next week's meeting, at least two firms, Porivo Technologies and Mangosoft, will soon announce P2P products aimed at corporate network customers. Intel is testing a new peer-to-peer application that the company says will save WAN bandwidth and deliver applications and data more quickly than existing technologies.

Porivo will roll out Peer, a secure, Java-based application designed to let users harness spare PC computing capacity, says Will Holmes, CEO at Porivo. Porivo's Peer client, which resides on a user's desktop, works with the company's PeerPlane management software, which can reside on a dedicated server. PeerPlane essentially aggregates the computing resources of PCs connected to corporate networks, letting users distribute work among them.

Mangosoft next week plans to announce Mangomind, which it is billing as the first multiuser, Internet-based, file-sharing service that provides real-time file sharing for secure business communications. The new service is a secure way for multiple users to access, share and store files. Mangomind will let users work on their files offline. When users go back online, Mangomind automatically updates and synchronizes their files.

In the Groove

Another member of the working group - Groove Networks - plans a highly anticipated Oct. 24 rollout of its P2P technology, which will be aimed at collaborative computing. Groove's founder Ray Ozzie created Lotus Notes.

P2P could take many avenues in meeting the computing needs of end users, much as the Web has become more than a tool to deliver simple page requests, says Andrew Mahon, evangelist at Groove. Mahon declined to provide specifics about Groove's product (for more on Groove, see 'Net Buzz, page 98).

One company interested in Porivo and other P2P technologies is United Technologies Research Center - the research arm of United Technologies. Paul Kirschner, a senior project analyst at United Technologies, is looking at how his company can harness the power of computers across the company to do production work.

What Kirschner likes is the idea of being able to do massive compute jobs that might otherwise mean buying more expensive hardware and software.

"Obviously, if you look at the number of desktops across the company, there are tens of thousands" that could potentially be tapped, he says. "To use what is just sitting there doing nothing quite a bit of the time is what makes this attractive because if you looked at replacing that power with another box, another cluster, that would represent a significant investment."

As a result, Kirschner expects to have P2P technology up in some capacity by year-end.

Kirschner likes Porivo's offering because the desktop client works with Windows 95. Others, such as TurboLinux's EnFuzion software, only support Windows NT and various flavors of Unix.

But that doesn't mean he's ready to bet the farm on P2P.

"The technology is new, and how it is going to play in the corporate environment isn't certain yet," he says. "People will not tolerate it if their machines crash, slow down or get locked up, or if unusual things happen."

While EnFuzion may not fit into United Technologies' infrastructure, it has found a home elsewhere. TurboLinux announced earlier this year that J.P. Morgan is using the software to help power the firm's worldwide risk management system for fixed-income derivatives.

Cheryl Currid, president of the Currid & Company consultancy, says P2P's big draw for corporate customers is processing power that companies don't know they have. "What they can get from peer-to-peer is low-cost, high-capability processing and storage."

Currid says users can benefit from P2P to varying degrees - depending on how much effort they put into incorporating it into their infrastructure. While engineering and scientific jobs are a logical place for P2P, more commonplace financial applications are what could put it in the spotlight. "Imagine if your trades could come back to you three times faster because your company was using P2P to process them in real time, instead of having to do big periodic batch jobs," Currid says.

Intel is in

Intel is also using P2P. The company made a lot of noise recently when it talked about how it saved $500 million over the past 10 years using a P2P application called Netbatch. The application lets Intel engineers harness more than 10,000 workstations across Intel's network to do compute-intensive jobs for chip design, says Manny Vara, an Intel spokesman.

"Every time we were designing a new chip, we were buying a bunch of new mainframes to get the job done - and that was just one area," he says.

Vara says Intel is testing a new application that goes even further. He says Intel will try out a system that will detect when employees access the WAN to retrieve video files. If another employee at the same location has already downloaded a file, the P2P application will retrieve it from the system where it has been stored instead of going over the WAN to get it.

What network managers will likely debate as P2P gains momentum is how to use it without slowing systems. Currid says estimates are that 75% of the average PC and 60% of the average server go unused.

Busy signal

But what about when they are busy?

P2P software from companies such as Entropia, another member of the P2P Working Group, lets customers set policies that govern when computer resources can be harnessed. Using Entropia's screen saver makes it fairly easy. The computer's resources are only used when the screen saver comes on. The moment it turns off, indicating the machine is going to be used, the P2P processes are halted.

Many P2P questions will hopefully be answered by the working group set to meet in San Jose.

The meeting will be more organizational than anything else, according to Intel's Vara. The members will organize into task-related groups that will determine how to solve issues related to interoperability, standards and security.

Other members of the working group include Applied MetaComputing, CenterSpan, Distributed Science, Dotcast, Enfish Technology, Engenia Software, Flycode, Kalepa, Statis, United Devices, Uprizer and Vtel.
Re:They use Java! (Score:2)
The JVM in Windows appears to have an overhead of about 4 megs when I run a simple program, but this may be misleading, as IE might pre-load some of it along with whatever components - but still, this won't make a significant dent in the 512-meg laptop I buy next.
-josh
Microsoft announces P2P software (Score:4)
It was also revealed today that the first virus for P-On-ME has been discovered. It is contained in email messages with the subject line "Do Not Open - Virus Inside".
Re:They use Java! (Score:1)
Free Distributed OS available for download (Score:1)
Re:Free Distributed OS available for download (Score:1)
Re:They use Java! (Score:2)
What you are seeing with your "factor of 3:2" is a JIT vastly superior to your C compiler -- I'm guessing some version of a GNU compiler. Writing blindingly fast C code requires knowledge of the target platform. JIT creates an abstraction that places that knowledge on the VM designers instead of the application designers. This allows sloppy and "stupid" programmers to write surprisingly good applications.
I have nothing against Java as a language. However, I am categorically anti-interpreted-code. BASIC faded in light of C and C++ because interpreted languages give laughable performance, but writing everything in assembly made too many people sick.
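If you want to test the JIT claim yourself, remember that a just-in-time compiler needs warm-up before the numbers mean anything; a rough micro-benchmark sketch (timings are indicative only):

```java
// Rough micro-benchmark sketch: time a hot loop before and after the
// JIT has had a chance to compile it. The sink variable keeps the
// optimizer from discarding the work as dead code.
public class JitWarmup {
    static long work(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) sum += (long) i * i;
        return sum;
    }

    public static void main(String[] args) {
        long sink = 0;

        long t0 = System.nanoTime();
        sink += work(10_000_000);                       // cold-ish run
        long cold = System.nanoTime() - t0;

        for (int i = 0; i < 50; i++) sink += work(10_000_000); // warm up

        long t1 = System.nanoTime();
        sink += work(10_000_000);                       // warmed run
        long warm = System.nanoTime() - t1;

        System.out.printf("cold: %d us, warm: %d us (sink=%d)%n",
                cold / 1000, warm / 1000, sink);
    }
}
```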
Sun has already released theirs (Score:2)
http://www.sun.com/software/gridware/
Looks like it is available for download already, and the page says the code is slated to be released under an "industry-accepted open source license." I know that phrase will probably raise a few hackles, but it's better than what you'll get from most companies.
Re:Spiffy stuff. (Score:1)
Nope, not worth it. One of the basic tenets of distributed computing is that the gain from splitting up the computation has to outweigh the cost of communicating the results of that computation. That is, if it takes a couple nanoseconds to do the instruction, but it takes a few hundred microseconds to send off that instruction and get the results back, then it's a lose.
The only way to make fine-grained parallel computing a win is to make communication between nodes really, really cheap. And it doesn't sound like they're doing that here. Therefore, they must be doing coarse-grained parallel computation, probably on the scale of many seconds of computation between communications.
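Back-of-the-envelope version of that argument (all the figures below are illustrative assumptions, not measurements):

```java
// Break-even check: distributing only pays off when the computation
// in a chunk of work dwarfs the communication round trip. Both cost
// figures here are assumed for illustration, not measured.
public class BreakEven {
    public static void main(String[] args) {
        double roundTrip = 200e-6;  // ~200 us to ship work out and back
        double perInstr  = 2e-9;    // ~2 ns per instruction locally

        for (long chunk : new long[]{1, 1_000, 1_000_000, 1_000_000_000L}) {
            double compute = chunk * perInstr;
            System.out.printf(
                "%,13d-instruction chunk: compute %.2e s vs comms %.2e s -> %s%n",
                chunk, compute, roundTrip,
                compute > roundTrip ? "coarse enough to ship" : "a lose");
        }
    }
}
```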
Project Condor? (Score:1)
http://www.cs.wisc.edu/condor/
Here is an excerpt from their page.
What is Condor?
Condor is a software system that runs on a cluster of workstations to harness wasted CPU cycles. A Condor pool consists of any number of machines, of possibly different architectures and operating systems, that are connected by a network. To monitor the status of the individual computers in the cluster, certain Condor programs called the Condor "daemons" must run all the time. One daemon is called the "master". Its only job is to make sure that the rest of the Condor daemons are running. If any daemon dies, the master restarts it. If a daemon continues to die, the master sends mail to a Condor administrator and stops trying to start it. Two other daemons run on every machine in the pool, the "startd" and the "schedd". The schedd keeps track of all the jobs that have been submitted on a given machine. The startd monitors information about the machine that is used to decide if it is available to run a Condor job, such as keyboard and mouse activity, and the load on the CPU. Since Condor only uses idle machines to compute jobs, the startd also notices when a user returns to a machine that is currently running and removes the job.
Sounds quite similar.
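The startd's core trick is basically a polling loop; a toy version (my own sketch, not Condor code, with system load standing in for the keyboard/mouse-activity check) might look like:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Toy sketch of startd-style monitoring (not actual Condor code):
// poll a busyness signal and start or evict a guest job accordingly.
public class IdleMonitor {
    public static void main(String[] args) throws InterruptedException {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        boolean guestJobRunning = false;

        for (int tick = 0; tick < 10; tick++) {      // a few polls for demo
            double load = os.getSystemLoadAverage(); // -1 if unavailable
            boolean ownerBusy = load > 0.5;          // assumed threshold

            if (ownerBusy && guestJobRunning) {
                System.out.println("owner is back: evicting guest job");
                guestJobRunning = false;
            } else if (!ownerBusy && !guestJobRunning) {
                System.out.println("machine idle: starting guest job");
                guestJobRunning = true;
            }
            Thread.sleep(1000);                      // poll interval
        }
    }
}
```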
Beware the Hype !!! (Score:1)
There are lots of technologies that have existed for years and perform similar functions - distributed OSes, DBs and filesystems, and cluster technologies. Of course, not all of these implementations are focused on what this article's system wants to achieve, but some will do the exact same job.
There are things missing in the article, and some of the suggestions are humorous.
For example:
"Intel is testing a new peer-to-peer application that the company says will save WAN bandwidth and deliver applications and data more quickly than existing technologies"
This, by the way, is known as a distributed/hierarchical caching proxy.
"Mangosoft next week plans to announce Mangomind, which it is billing as the first multiuser, Internet-based, file-sharing service that provides real-time file sharing for secure business communications. The new service is a secure way for multiple users to access, share and store files. Mangomind will let users work on their files offline. When users go back online, Mangomind automatically updates and synchronizes their files."
This is already known as AFS or Coda. Coda allows caching and disconnected operation, and, believe it or not, is architecture-independent, which allows you to operate over the Internet.
"Kirschner likes Porivo's offering because the desktop client works with Windows 95. Others, such as TurboLinux's EnFuzion software, only support Windows NT and various flavors of Unix."
Who, except for Kirschner, would ever consider using Windows 9x for what is inherently concurrent computing? Windows 9x is not capable of proper concurrency.
But what is really lacking in this article is a description of the costs and consequences of implementing this in an environment.
Look at the top diagram: it points out that you can share/distribute CPU and disk usage. But for that to happen you need several upgrades on each client. First, you would need something other than an IDE disk in each client, since IDE is not capable of concurrency. Second, you would need to upgrade your network, because this won't work efficiently on a 10Mbit network; you would need a Gbit network, and that includes the rest of the network infrastructure. Third, you need user software capable of distributing processing jobs among the processors in the cluster, i.e. heavily threaded software. All this costs money, and still the software side might not be available for the applications the company is using.
There is also the issue of processing power gained: you have to analyse the load on the computers where you want to implement this to see how much can actually be gained. Of course there is lots of unused power lying around, but if the gain is only 10% then it is probably not worth it.
There is also the security aspect, especially for distribution of disks, and especially if you are using those disks to store company documents. The machines would be physically more accessible to thieves. Or what if the user turns off the power without shutting down? In the worst case that might make documents inaccessible, or lose computing data.
Therefore, data storage should be on server clusters instead, using the clients only for CPU/memory sharing.
Lastly, the article asks what to do if the computer is busy, and its suggested solution is that P2P should only run when the screensaver runs.
My question about that: how often does the screensaver run, compared to the load on the computer throughout the day? Not often, so you would not get much benefit from the "P2P" technology. A simple but much better solution is a priority configuration for local and networked processes; the simplest form would give local processes higher priority than networked processes. Of course, a priority system is by no means simple, so it can't be done out of the box - it depends on what really needs to run at higher priority than what else.
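In-process, the simplest form of that policy is just running the donated work at the lowest priority the scheduler offers (a sketch; a real deployment would use OS-level scheduling such as nice values rather than JVM thread priorities, which are only hints):

```java
// Sketch of "local work beats networked work" via thread priority.
// JVM thread priorities are only hints to the OS scheduler; a real
// system would use nice levels or an OS idle-priority class.
public class LowPriorityWorker {
    public static void main(String[] args) throws InterruptedException {
        Thread donated = new Thread(() -> {
            long sum = 0;
            for (long i = 0; i < 500_000_000L; i++) sum += i; // "network" job
            System.out.println("donated work done: " + sum);
        });
        donated.setPriority(Thread.MIN_PRIORITY); // hint: yield to local work
        donated.start();

        System.out.println("local work proceeds at normal priority meanwhile");
        donated.join();
    }
}
```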
There is probably a lot I forgot to mention, but what was stated certainly applies as a critique of this article and of P2P in general.
Re:Spiffy stuff. (Score:2)
Soon as I get two spare Linux machines at work, I'm gonna tear them apart and try this Mosix stuff... see if it can function in a software development environment.
Be neat to see if it works well enough that there's a net gain in efficiency.
Best part, it's free (which modern solutions aren't), so it's much easier to play with this sort of idea.
Re:.Net (Score:1)
Entropia (Score:1)