Running The Numbers: Why Gnutella Can't Scale
jordan (one of the founding developers of Napster) writes: "As the rumour mill churns over Napster's future, many folks see Gnutella as the next best hope for the music-loving file-sharing community. Problem is, Gnutella can't scale. [Note: if that URL doesn't work, try this mirror.] Almost all research on Gnutella up till now has been based on observations of the system in the wild, but this paper discusses the technical merits of that statement through a detailed mathematical analysis of the Gnutella architecture." These are the kind of numbers you may not like to read if you figure networks expand to accommodate traffic at a never-ending pace. Update: 02/15 12:24 AM by T: Jordan also points to this mirror for your reading pleasure.
Peer-to-Peer can scale (Score:2)
Re:Well what about Freenet then? (Score:2)
No, reread the parent. You cannot know where the information is actually stored.
A freenet node is basically a caching router, and AFAIK even the RIAA hasn't yet been able to repeal the common carrier status, so you should be ok.
OMG NO NAPSTER??? Come on (Score:2)
Bottom line though, you people seem to forget what it was like in the good ole days for those of us who pioneered this CRAZE that swept the net. I feel like I should be talking to my grandkids: "When I was your age we had to search the web for FTP servers and download files the old-fashioned way."
"I recall having access to a T1 at work when only the elite few had that and was running an MP3 site boasting 1 gig of tunes on a SCSI HD that was in a STATE OF THE ART P150 Dell server( I now have close to 20 gigs of MP3's)"
Sure Napster is/was great Gnutella although will continue to be trouble...We will all make it.
BTW if anyone wants to contact me, I will happily workl with you to upload my collection if you wanna open a site somewhere.
The argument about college bandwidth, although many will hate me for saying it, is legit. I work for a company that installs network management software, especially for universities, and the ones that have blocked Napster have seen a substantial drop in traffic. I do not know what the answer is, but I know several gamerz who HATE Napster etc. for the amount of bandwidth they lose on campus. Poor guys probably have a ping of 27 instead of 21.
Razzious Domini
Re:I challenge you... (Score:5)
I haven't started on d and f, but they could be added.
This project is called The ALPINE Network [cubicmetercrystal.com]
It scales linearly, and provides a query mechanism that rivals the performance of a centralized directory. (The bandwidth is more than a centralized query would use, but at least you have direct control over how much bandwidth you use and how.)
At any rate, I could use development assistance a great deal. Let me know if anyone is interested.
Regards...
Re:Well, duh (Score:2)
That's discussed on the site I mentioned, but essentially you each pick an ID to associate with a given peer. It's that simple.
2. Let's say 10% of those 10K people are doing searches. That saturates a 56K modem, assuming you can really get your packets down to 56 bytes
It would only saturate your link if all 10,000 searched at once. If they all searched within a 3-minute window (no more than 70 or so in any one second), your link will not saturate. And the packet is 56 bytes for an 8-character query; for a 16-character query it would be 64 bytes, etc.
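For anyone who wants to check that arithmetic, here is the back-of-the-envelope version in Python (the 56-byte packet and 56K link are the figures assumed above):

```python
# Back-of-the-envelope check of the claim (all figures assumed above).
link_bps = 56_000                 # 56K modem
query_bits = 56 * 8               # one 56-byte query packet

capacity = link_bps / query_bits
print(capacity)                   # 125.0 queries/sec before the link saturates

print(10_000 / 180)               # all 10,000 peers searching over 3 minutes:
                                  # ~55.6 queries/sec, well under the limit
```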
What happens when you try to have 100K people? One million? How about the 10 million+ of Napster? Your scheme would not scale.
That depends on how good a peer you are. If you don't respond much, and have a very low-bandwidth link, then you will be at the bottom of those 100,000 hosts' query lists, and will get queried infrequently. I cover this on the site, but this is not a problem. The only things limiting your use of the network are how much memory you have (you would need a hundred megs or so for a million connections) and your bandwidth.
OpenNap is flawed from a legal stance... (Score:2)
If the RIAA succeeds in suing Napster and blocking their service, which seems very likely at this point, it is not at all far-fetched that they will easily be able to obtain court orders against anyone else running the same type of service.
So your OpenNap is not a replacement service, because every index server is liable for a court-ordered shutdown.
That, and the index server requires bandwidth; bandwidth costs money, and how many people are going to donate full T3 lines to this? Thus the service is capped in terms of the number of connected users based on the bandwidth available.
Once Napster is dead, there will be nothing else to replace it at the same scale unless it is operated with the blessing of the RIAA.
Re:Well what about Freenet then? (Score:2)
Re:Freenet (Score:2)
There is FUD there. (Score:3)
All the author really does is take an example of a mathematical formula which grows exponentially and show how quickly he gets "scary" numbers. No effort is made to show whether or not the efficiency of Gnutella breaks down as the network increases in size. No effort is made to show how much work is done per search or per result. He just makes assumptions about the Gnutella network which result in exponential growth in the number of users, and then shows how the aggregate traffic also grows exponentially. Duh. What did you expect? By this logic, nothing scales.
Don't get me wrong, I don't think Gnutella scales either. But you don't need to wave around all the FUDdy math that this guy does to prove it. The argument why it doesn't scale is simple:
The reason it doesn't scale is that every search request (optimally) gets delivered to every client. We don't even have to look at how those searches get delivered. We'll completely ignore the amount of traffic in the backbone, and only count the traffic that has to exist on the last hop to each client. Let's assume that the requests are 100 bytes apiece, or about 1000 bits once we have all the overhead of UDP/IP/ethernet/PPP/ATM/whatever on top. If each search is 1000 bits, and the average client has a 56K modem, the whole thing falls apart when the search rate is 56 searches/second. If we assume 1 million users, each one can only perform a search about once every 5 hours on average before the modem links are 100% full.
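The same arithmetic, spelled out (the figures are the assumptions stated above):

```python
# The parent's argument in numbers (figures are the assumptions stated above).
link_bps = 56_000                  # average client: 56K modem
search_bits = 1_000                # ~100-byte request plus protocol overhead
users = 1_000_000

max_search_rate = link_bps / search_bits   # 56.0 searches/sec, network-wide,
                                           # since every search reaches every client
hours_between_searches = users / max_search_rate / 3600
print(hours_between_searches)              # ~4.96 hours per user per search
```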
The problem here is the broadcast of every search to every client. Any distributed search network needs to either assume very high-bandwidth connections for all the clients (because they are all servers to the whole network) or have some hierarchy of caches/servers. The amount of bandwidth being used at each client increases as more clients connect. If the number of users goes up by 1000%, the traffic on my local link goes up 1000%. This is why it doesn't scale. It has nothing to do with how many GB of traffic the network as a whole has to handle. It's the simple fact that the traffic at every client increases as more clients connect. This is the problem that has to be corrected, and Jordan's paper never even mentions this fact, relying instead on big scary numbers. His claim at the end that Gnutella generates 2.4GBps of traffic for 1 million users is the ultimate FUD. How much traffic does Napster generate when it has 1 million people connected? He probably doesn't know, because their servers go down first.
How About... (Score:3)
Of course, you'd have to work out how to prevent hostile clients and servers from corrupting your indexes, but I'm sure that's a much more easily solved problem than working out how to prevent some skript kiddie from flooding Napster's servers off the net with a DDoS.
Who needs Gnutella? Move Napster off-shore. (Score:4)
In all seriousness, I don't condone mass piracy, but the RIAA has been screwing people for decades and I have to admit that I enjoy watching them squirm. What could the RIAA conceivably do if Napster were located offshore, preferably in a country not bound by the terms of the Berne convention?
Re:Well, duh (Score:2)
Won't be hard to locate them.
If the RIAA can get Congress to pass a law which places a substantial fine on those convicted of running internet services for the purpose of piracy (which isn't that unlikely; the DMCA places something like a $1 million fine on creating tools to subvert copy protection), then who will be able to afford to risk running said service? Bill Gates, maybe Larry Ellison. Doubt that'll happen.
Re:Well, duh (Score:3)
Maybe Gnutella needs to take the meta-internet approach. A "new" internet on top of the current internet?
(I dunno. I ask because I'm curious. How is Gnutella in general different than the internet in particular?)
Re:Who needs Gnutella? Move Napster off-shore. (Score:2)
Re:RIAA is advertising piracy... (Score:2)
You know, I have to take some exception to being called "sheep" because I buy CDs. Do you honestly, deep down, feel that because the current RIAA-supported distribution model doesn't compensate artists fairly, you are striking an idealistic blow for artists by using a model which, by and large, provides no compensation at all?
Reality check: if you download music copied from a CD sold by an RIAA-affiliated label, you are not "boycotting RIAA-sanctioned music". Boycotting means you are willing to go without a product on principle. That's not what you're doing. What you're doing is, at the least, legally considered stealing the music (presuming you don't own the CD already or buy it later)--and I'd have to say it's philosophically pretty dubious. If you didn't "just want the music," you wouldn't be getting it for free.
If you want to boycott the RIAA, you have to support artists who make their work available through "non-RIAA-sanctioned methods." But trading their music for free through Napster is not support.
It's easy to defend Napster for what it might become. I think digital music distribution is coming, soon, and I suspect it will live without the RIAA. But it will require a viable business model for the artists, not the record companies, that allows an average, "second-tier" artist to get equal or better compensation than they would from a record company and provides a reasonable level of promotional support for concerts, merchandising, radio airplay, and the like. Napster does not provide this model. A future model might be free as in speech, but currently Napster is unequivocally free as in beer, and we're not doing ourselves or anyone else any good by pretending otherwise.
This is easy. (Score:2)
The problem with Napster is that it has a single point of failure. The problem with Gnutella is it doesn't have an index. What you want is an index of all files with no single point of failure.
An index is a root node, which points at branches, which point at leaves. So make 10 copies of the root, 10 of each branch, 10 of each leaf, and put each on a different transient machine. (If you think 10 roots is too few, have every user keep their own copy of the root. It's not big.)
Then here's your protocol: ping the roots one at a time and choose the first that responds. The chosen root pings the duplicates of the correct branch one at a time and chooses one. The chosen branch pings the duplicates of the correct leaf one at a time and chooses one. The leaf sends the results back to the user.
Updating the structure is the same, with the addition that nodes occasionally try to sync with their duplicates. You end up with duplicates never quite in sync, but so what.
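A minimal sketch of that ping-and-choose step, assuming UDP datagrams and treating any reply as liveness (the wire format here is invented):

```python
import socket

TIMEOUT = 0.5  # seconds to wait for each replica before trying the next

def first_responder(replicas, payload):
    """Ping each (host, port) replica in turn; return the first that answers.

    Hypothetical wire format: fire the query in a UDP datagram and treat any
    reply as 'alive'. A real protocol would need retries, message IDs, and
    authentication of replicas.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(TIMEOUT)
    try:
        for host, port in replicas:
            try:
                sock.sendto(payload, (host, port))
                reply, addr = sock.recvfrom(4096)
                return addr, reply        # first live duplicate wins
            except socket.timeout:
                continue                  # replica down -- try the next copy
        return None, None                 # every duplicate was unreachable
    finally:
        sock.close()

# The query flows root -> branch -> leaf, picking a live duplicate each level:
# root_addr, reply = first_responder(ROOT_REPLICAS, b"FIND metallica")
```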
Re:Peer-to-Peer can scale (Score:2)
http://www.npsnet.org/~howard/dis.pdf [npsnet.org]
Re:Well, duh (Score:2)
The protocol used by ALPINE is for messaging. The broadcasts are very small packets, usually 50-60 bytes. This makes a huge difference.
Re:RIAA is advertising piracy... (Score:2)
MyopicProwls
Re:Well, duh (Score:2)
-jon
Re:What about Usenet? (Score:2)
Now the problems (that haven't been mentioned yet): data on Usenet has a short lifetime, frequently less than 24 hours. If you don't keep on top of it, it is easy to miss things (like the fourth episode of a series). Second, you can't search out a particular song on Usenet; you have to more or less take what is available. If you are looking for a particular song, Usenet may not be for you (although you can certainly request it).
Down that path lies madness. On the other hand, the road to hell is paved with melting snowballs.
Re:RIAA is advertising piracy... (Score:2)
I really disagree. First of all, Napster unquestionably provides a distribution model that provides a reasonable level of promotional support for artists. It's really great how many new artists I've discovered because of MP3s. Not necessarily just because of Napster, but if a friend says (as often happens) "hey have you heard this new stuff from Boo Williams?" and I say "Boo who?" (no pun intended -- go download his music it's great) then all of a sudden Boo has an opportunity to have his music heard by someone who never would have heard it otherwise. Super!
Thankfully, MOST of the artists that I listen to have come down on the PRO NAPSTER side. This includes Ben Folds Five, Green Day, Limp Bizkit, The Offspring, Chuck D, and others. Unfortunately, some of my favorite artists have come down on the other side. These include the most vocal three: Metallica, Dr. Dre, and Eminem. That sucks.
MORALLY I get over the problem. Is it morally wrong for me to want free music? I don't think so. Is it morally wrong for an artist to produce work that I listen to for free, never buying his CD, never going to his concert, never buying his T-Shirts? Perhaps. Perhaps not. But certainly it is no worse than middlemen becoming so ridiculously rich by screwing me with $18 CDs. CDs should be between four and six dollars; about half should go to the artist (about what they get now, or a bit more).
Trust me, I sleep just fine at nite having spent the whole day listening to MP3s. And I do own CDs -- oh God do I own CDs. I counted once, and I probably gave the RIAA $4,000+ in my (short) lifetime. That's a lot. A whole lot. I figure they are still $3,500 ahead after all the free downloading I've done.
I also, by the way, have 'discovered' artists via MP3s and Napster, and subsequently bought their CDs and gone to their concerts (e.g. Cypress Hill and Lavay Smith -- don't laugh), so those artists are definitely ahead because otherwise they wouldn't have seen any of my money at all.
MyopicProwls
Re:Tracing (Score:2)
Re:Scaling... (Score:3)
I even share the equations and methodologies I used, and try to poke holes in my own conclusions.
Further, I'm not a competitor. I haven't worked for Napster in 3 months. Before Napster my background was in poking holes in things anyway. All I did was finish a personal project I started a long time ago.
You actually sound more like FUD than anything. :-)
--jordan
Re:Well, duh (Score:2)
Not so fast. Right now, the biggest problem with decentralized networks is that they all have some form of routing/forwarding. If you got rid of routing/forwarding, then they could scale.
For instance, let's say you have a Napster-style peer group of 10,000 peers. What if, to query these peers, you sent a small UDP packet to each of them directly? No routing, no forwarding. How long would this take?
Modem: 2.5 minutes
DSL: 13 seconds
I would say that this is an acceptable period of time. And the bandwidth used was all your own, nobody else's, except for the 56 bytes each peer received in the single packet they got from you.
I am working on such a network; it's called The ALPINE Network [cubicmetercrystal.com] and has all the features mentioned.
So, if you get rid of the forwarding/routing you can have a decentralized network that scales linearly.
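A rough sketch of what the no-forwarding broadcast could look like, assuming plain UDP and an invented query format (this is not ALPINE's actual code):

```python
import socket, time

def broadcast_query(peers, query: bytes, uplink_bps: int = 56_000):
    """Send one small datagram to every peer directly; no routing, no forwarding.

    `peers` is a list of (host, port) pairs. `uplink_bps` throttles the send
    loop so we never exceed our own link. Names and format are illustrative.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    per_packet = (len(query) * 8) / uplink_bps   # seconds each packet occupies
    for peer in peers:
        sock.sendto(query, peer)
        time.sleep(per_packet)   # pace ourselves; only our own bandwidth is spent

# 10,000 peers x 56-byte packets at 56 kbit/s is ~80 s of raw send time, so
# with modem overhead the couple-of-minutes figure above looks plausible.
```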
Re:Doubt It (Score:2)
There is no big database. There are lots and lots of little databases.
And the network adapts to load. I go into this somewhat on the site.
The important thing I want to point out is that this network is used mainly for locating content. Once you have found it, it may reside within Freenet, OpenCOLA, MojoNation, etc., and then you will benefit from their architecture for the actual delivery of the data.
The broadcasts are used solely for discovery of resources, with the delivery being a whole other scenario. UDP is a horrible bulk transfer protocol.
Re:A Future Alternative (and its scales linearly t (Score:2)
If you are getting swamped, you will respond to fewer and fewer queries, and then your quality in the eyes of those peers will drop; thus, you will receive fewer and fewer queries.
This is actually a balanced type of configuration, which handles load in an efficient manner.
Also note that over a DSL line, you could receive in excess of 10,000 queries a second.
Re:Scaling... (Score:3)
But... (Score:2)
Amber Yuan 2k A.D
Re:Well, duh (Score:2)
I did some monitoring of the Gnutella network for a few months, and the size of an average query is about 8-16 bytes at most. Many, many queries were even smaller.
Re:Well, duh. (Score:2)
No, you missed a major point; there is no routing and no forwarding.
This is what makes it so simple, and linear. You directly communicate with all the peers you want to query. Everyone directly communicates with each other. The only thing this implies is a transport service which can support a large number of concurrent connections efficiently. This is what DTCP does.
Re:What about Usenet? (Score:2)
Re:DNS Structure (Score:2)
(If the people running the RIAA and MPAA had been clueful, they would have been pursuing this strategy against anonymous file sharing from the very beginning. If 99 out of 100 requests for insert-top-forty-song-here on Napster return William Shatner singing "Lucy in the Sky with Diamonds", then most people would rather pay for the CD than sift through all the false results. But I digress.)
--
P2P on top of IRC - best of both (Score:2)
The "client" would be a bot. It would join a channel (say, "#bjork" or "#trancegoa") and to make a request, it would simply utter something on that channel in some protocolish language (eg "SEARCH 'Bachelorette'"), and other bots would respond in a P2P fashion (ie
This would deliver us from the scaling curse as it is described by Jordan's paper. This would also lead to a Usenet-like classification of available files among channels (if you like david bowie, you would
Think of this: Napster was made as a sharing system where people could chat. We have a chatting system. Why not allow people to share files on top of it?
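A toy sketch of such a bot, assuming a bare IRC connection and a made-up one-line SEARCH/HAVE protocol (a real client would still need DCC or HTTP for the actual transfer):

```python
# Toy bot: joins a channel, answers "SEARCH <term>" with "HAVE <term> <url>".
# The SEARCH/HAVE protocol is invented; server name and file list are examples.
import socket

HOST, PORT, CHANNEL, NICK = "irc.example.net", 6667, "#bjork", "sharebot"
SHARED = {"bachelorette": "http://my.host/bachelorette.mp3"}   # our file list

sock = socket.create_connection((HOST, PORT))
def send(line: str) -> None:
    sock.sendall((line + "\r\n").encode())

send("NICK " + NICK)
send("USER " + NICK + " 0 * :" + NICK)
send("JOIN " + CHANNEL)

for raw in sock.makefile(encoding="utf-8", errors="replace"):
    if raw.startswith("PING"):
        send("PONG " + raw[4:].strip())           # keep the connection alive
    elif "PRIVMSG " + CHANNEL in raw and "SEARCH" in raw:
        term = raw.rsplit("SEARCH", 1)[1].strip().strip("'\"").lower()
        if term in SHARED:                        # answer on-channel, peer to peer
            send("PRIVMSG " + CHANNEL + " :HAVE " + term + " " + SHARED[term])
```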
Re:Well, duh (Score:2)
You are wrong. Either you are speaking theoretically, in which case the salesman problem is trivially solvable, or you are talking practically, and you don't give a shit about the *best* route. A good one will be sufficient, and there are very good heuristics for that.
Cheers,
--fred
Re:Well what about Freenet then? (Score:2)
*vomit*
Re:Scaling... (Score:2)
I already had an intuitive grasp of what he was talking about, and his numbers seemed ballpark correct to me. I too thought the result set bandwidth numbers looked a little fishy, but the others seemed fine.
I've been thinking about this for months.
A solution (Score:3)
I really want to build this with my StreamModule system [omnifarious.org], but nobody is helping me with it, and I don't have the time to hack it out, especially since I'm so ridiculously methodical when it comes to code.
You build something that uses a distributed algorithm to build a spanning tree. The nodes near the top of the spanning tree become the servers. You build the algorithm so that parents in your spanning tree will naturally have more bandwidth than you do.
I've been thinking about this for a long while.
Building the spanning tree isn't hard. Every node selects one and only one parent node, and tells the parent that it's a child of that parent. You prevent cycles by having a node refuse to be a parent unless it also has a parent. If a node loses its connection to its parent, it tells all its children that it is no longer a parent. One node 'seeds' the network as the root by being allowed to parent without having a parent and not looking for one. Eventually it can delegate roothood to a child that has proven high bandwidth. It cannot cease being the root without doing this delegation.
You can have connections to nodes that are neither parents nor children, but search requests should not be propagated to those nodes unless you have no parent. Eventually a search request will make it onto the spanning tree and be efficiently distributed.
You can eventually elect servers who are near the top of the spanning tree. Nodes should, in general, elect parents that have more bandwidth than they do. This means that nodes near the top of the spanning tree should have the most bandwidth.
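For concreteness, here is a toy sketch of the parent-selection rule just described (messaging, failure handling, and root delegation are elided, and every name is invented):

```python
# Toy sketch of the parent-selection rule; all names here are invented.

class Node:
    def __init__(self, name, bandwidth, is_root=False):
        self.name = name
        self.bandwidth = bandwidth
        self.parent = None
        self.children = []
        self.is_root = is_root        # the seed may parent without having a parent

    def can_parent(self):
        # Refusing to parent until we have a parent ourselves (root excepted)
        # is what keeps cycles from forming.
        return self.is_root or self.parent is not None

    def choose_parent(self, candidates):
        # Prefer the highest-bandwidth eligible node; applied consistently,
        # this pushes the fattest pipes toward the top of the tree.
        eligible = [c for c in candidates
                    if c is not self and c.can_parent()
                    and c.bandwidth >= self.bandwidth]
        if eligible:
            self.parent = max(eligible, key=lambda c: c.bandwidth)
            self.parent.children.append(self)

# root = Node("seed", bandwidth=10_000, is_root=True)
# for n in nodes:
#     n.choose_parent(nodes + [root])
```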
Re:Scaling... (Score:3)
Ok, this time I did a bit more thorough check of the numbers. I agree with the first half, the traffic generated by the request half of the message. What I'm not as convinced of is the response side of the equation.
I don't know what the typical percentage of Gnutella users sharing files is, so I'll accept your figure of 30%. But 40% of those sharing files having a match? Even with your reduced number here I think it's high. If 40% of people sharing files had a match, then with default settings (N=4, T=5) you'd get 484*(0.3*0.4) = 58 people finding a match. And with the number you use later of 10 matches per person, you'd get 580 matching entries. I've never received anything near that high. But if I did, I certainly would have no motivation to increase T or N.
What happens if it's only 10% of those sharing that have a match? With the default settings you'd still get 14 people matching, or about 140 matching entries. That's still a *lot* of responses, more than I've ever received.
If all your default numbers are used, your nightmare scenario would yield 0.3*0.4*7,686,400*10 "found" responses to your query. That's 9,223,680 "grateful dead live" songs (though not unique) shared among 900 thousand deadheads who are all simultaneously online. Whoa.
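For anyone checking the figures, the arithmetic looks like this (484 appears to be the usual Gnutella flood-reach count for N connections and TTL T; the 7,686,400 host figure is taken from the paper as given):

```python
def reachable(n, t):
    # Hosts a query floods to with n connections and TTL t:
    # n * ((n-1)^t - 1) / (n - 2)
    return n * ((n - 1) ** t - 1) // (n - 2)

hosts = reachable(4, 5)                              # 484 with the defaults
sharing, matching, files_each = 0.30, 0.40, 10       # the percentages above
print(hosts * sharing * matching)                    # ~58 hosts with a match
print(hosts * sharing * matching * files_each)       # ~580 result entries

# The nightmare scenario, same percentages at the paper's host count:
print(0.3 * 0.4 * 7_686_400 * 10)                    # 9,223,680 responses
```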
I'm not an expert in human psychology by any means, but let me suggest this. With most tools, people don't feel any need to "tweak" them unless they're not working right. With 580 songs returned, I don't think many people would feel a need to tweak their settings. If someone was having a hard time finding something they might then change their settings; but if they were having a hard time finding it, they wouldn't get so many responses returned.
The only way I can imagine those monstrous amounts of data resulting from queries is if it happens through maliciousness or mistake.
Am I missing something?
Gnutella vs Bandwidth - best 2 out of 3 (Score:2)
With Napster, the bandwidth usage from the query is negligible. A single packet (your query) goes out to a single destination (the Napster index server). A small handful of packets (your listing of the places your desired song is located) comes back. A few K total, then you get your 4MB transfer.
With Gnutella, the bandwidth usage from the query is significant. Your query goes to several peers, which then forward it to other peers, etc., and each server with the song requested sends you back a packet. Looking at the numbers in the analysis shows that your query will quickly generate more bandwidth usage than the actual transfer (which you'll still have to do to get your song). The bandwidth hit is distributed, true, but it still adds up, and it grows faster than linearly with the user base.
Gnutella's success depends upon a significant portion of its users also being servers (i.e. making files available for download) -- being a provider as well as a consumer. There's a server-side hit, too... with Napster, a provider of files sends a few packets to the Napster index server advertising its wares. Aside from the bandwidth usage of the actual transfers the provider is serving, very little impact. With Gnutella, every query within your range will hit your server. Bandwidth usage from queries will quickly outstrip bandwidth usage from transfers, and this will tend to discourage people from being providers.
Please, don't get me wrong here. I think that peer-to-peer will be the future, but there are problems to be solved. Gnutella, as it stands now, will not scale well... the math in the paper in question is good, and matches real-world observations. The challenge is managing the queries, routing the queries intelligently, and keeping the bandwidth usage down "below the radar" of backbone providers and system administrators.
I don't know what can be done about the bandwidth usage of the transfer itself, but keeping the query traffic down will help in keeping administrators and providers no more filesharing-hostile than they already are. Now is the time to be treating these people well, instead of antagonizing them further. You don't want to bite the hand that feeds you your bandwidth :)
This problem has been solved before, by the way. Think "routing tables".
Re:Scaling... (Score:3)
Not everything is practical just because there is a need for it.
Great straw-man rebuttal! How about if you try a more rational analogy? Going from gas combustion engines to teleportation or fusion power is a tad bigger leap than going from Napster to a similar service! And Napster ceasing to exist versus gas prices climbing higher is not analogous either...
A better analogy would be:
"If we run out of petroleum-based fuel, a similar or better form of energy will come to the forefront."
And that's ABSOLUTELY TRUE, reasonably proven through a huge mound of empirical evidence.
Where have I seen this before? (Score:5)
P2P is more than Napster (Score:2)
Your system plays the role of file server by offering a list of available files, and plays the role of search server for you by collecting the lists of available files from different people. The key here is that only you search your own system's database, so only you get tagged with the cost of collecting the databases of too many different systems. Clearly, your system needs to figure out automatically which nodes it should track by remembering where you actually find stuff, but this should not present any real problem. You would also introduce a little randomization by tracking random nodes for a limited period of time.
This might work just as well as Napster for people who always DL the same type of music (like Tech for me). Clearly, you would not be able to show off to your friends by DLing any song they request, but that is not really that important.
Why things are developed (long) (Score:4)
Warning: Rant Ahead!
Partially true. In your example, you said that if the price of gasoline went up, teleportation or fusion-powered cars wouldn't be developed. I agree. However, if the price of gasoline went to $20/gallon tomorrow (an outrageous rate, but it's just an example), then we'd either see a changeover to natural gas/electric or some other alternative-energy vehicle, or cars would be developed that got 400 miles/gallon.
So why would gas/electric cars be implemented and not fusion or teleportation? Well, first we have a demand for transportation. The demand for transportation is rather high, at least in the developed world, and especially in the US, since all of us seem to want to live in the woods and commute to the city. Therefore, if the demand is high, we *will* find something to fulfill the need, as long as the cost of fulfilling the demand is not so great that we have to sacrifice other, equally important demands. We don't commute to work via helicopters because the time, money, and energy we would have to exert to be able to use them isn't worth the extra few minutes we'd shave from our commute time. We don't commute to work on buses because we prefer living in areas with lower population densities (i.e. suburbs), which make buses impractical, and we don't like the inconvenience of having to conform to the bus's schedule and having to interact with other members of our community. We are looking for something that fulfills our need to get from point A to point B with the lowest opportunity cost to us. This is the economic/social side of the scale.

On the other side of the scale are the harsh laws of science and technology, which dictate what has been done, what is possible, what is impossible, and what the costs of each are. Say we have a possible solution set such as this: { car (gasoline), car (electric), walking, teleportation, car (fusion) }. Science tells us teleportation looks impossible, so we eliminate it. Technology tells us that fusion-powered cars haven't been done yet, and considering everything we know about "hot" fusion, it's doubtful we could ever fit a fusion reactor in a vehicle the size of a car. We are now left with gasoline-powered cars, electric cars, and walking, in this simplified example. Walking is too much of an inconvenience; science doesn't have a problem with it, but human nature, the time it would take, and the distance that would have to be traveled make it impossible. On the economic/psychological/social side, walking isn't happening.

So what will it be, electric or gasoline? The technology that's in place makes gasoline-powered vehicles cheaper than electric, and gasoline, even at the high prices it has reached lately, is still an economical means of transport. Plus, there's human nature: gasoline is tried and true, electric isn't. Electric also has some problems with travelling long distances, and the infrastructure doesn't support electric right now. Therefore gas is the best solution to our problem. In the future, if electric becomes more ideal than gasoline (enough to override our habit of sticking with what we know), we will switch.
So, we learn this: each problem/solution pair depends on economics, human nature (psychological/social), science, and technology.
Let's apply this to Napster, OpenNap, Gnutella, and the rest of the field. Napster was nice and easy, a lot of us became accustomed to using it, and the technology (on our end) was cheap. However, Napster is either dead or moving towards a fee-based service. All of a sudden, from the economics viewpoint, Napster is less ideal. OpenNap is similar to Napster; there is the additional hassle of finding a server, but since Napster is having trouble, OpenNap seems a lot more attractive. However, OpenNap, from the social viewpoint, is insecure: it has a central server, and it can be attacked. Therefore, what do we have left? Gnutella is free of cost, and cannot be shut down through elimination of a central server. It is harder to use, and technology says it won't scale in its current format. Plus, it eats up bandwidth like a hog.
The above was a rant and presented simplified examples. I didn't mention gyro-driven cars, monorails, carts hauled by penguins, or bicycles, among other things, because I was trying to keep the examples simple (and carts hauled by penguins aren't really practical). I didn't mention things like how critical user mass applies to file-sharing systems because it didn't pertain to the topic of the comment. So please, don't flame me with a comment about how widget-driven cars are the ideal solution, or that file sharing also depends on bandwidth. Nitpicking just wastes both of our time. On the other hand, valid comments are appreciated.
Plus (Score:2)
Re:There is FUD there. (Score:2)
That's why I think your paper is FUD. It throws around big scary numbers which are technically correct, but which are in fact very modest levels of bandwidth when averaged over the userbase.
Re:Well, duh (Score:2)
That question bugged me so much, I would like to answer it for you. The answer is YES! I figured out a solution: after reading the paper yesterday, I spent my time in class scrawling and pondering over it, and I have a very simple, elegant solution. I can't believe it! So, I am going to perform some experiments first before I make a fool of myself, but I certainly think it can be done. If I told you how, you'd smack yourself in the forehead and say, "of course!"
Of course (Score:2)
On my campus, we've been using Limewire to make a private Gnutella network. We use it to trade files with each other. That way we're not all trying to get the same files from the internet. It's much faster. People at other colleges should try it.
It's overly optimistic. (Score:2)
Also, this means that the population P DOES have an effect on the number of reachable users, because as P increases the number of redundant connections will decrease. I don't have the math to prove it, but I think that's the way it works.
Also, is there an analysis of why Gnutella can't scale in terms of P? I can see why it won't scale in terms of the number of users I can reach, but why not in total users, IF users are content to let themselves be limited to a small fraction of the network? (This should be enforced by the clients. I know people can write their own, but they shouldn't write them to allow huge TTLs.)
Also, what of the reflectors? [clip2.com]
Well what about Freenet then? (Score:4)
Freenet is also very well architected, unlike bogus Gnutella. It's designed to scale up, so that popular stuff gets cached all over the place. Like, more people downloading means that your connections go FASTER. This is cool.
Mojo Nation aims to do exactly that (Score:3)
Yes!
By using an internal microcredit/payment system (called mojo) and localized reputations, Mojo Nation [mojonation.net] aims to do exactly that. Better-connected brokers (peers) will naturally become more "server-like" due to having better uptime, lower latency, and a lower overall mojo cost for other brokers (peers) to use.
The resources in the system are allocated dynamically. No strict hierarchy needs to be defined; it will establish itself appropriately for each individual peer as needed.
PS a new version (0.950) was released today.
OpenNap (Score:4)
possible solutions to gnutella problem (Score:2)
If you switch to a more Napster-like model where each user submits a file list, then freeloaders don't consume as much bandwidth. You develop a database over time as you stock up on file lists. The downside is that you can't just join and search (though maybe asking nearest neighbors to search could be part of the protocol). Since users might update only a few times per day or less, the overall bandwidth use isn't that high.
For the topology problem, I would suggest more of a ring-chain topology, with some redundancy (backup connections in case a link breaks, and multiple rings that are sparsely interconnected).
This is fun stuff to think about. Similar problems are present (self-organized networks) in "bottom-up" nanotechnology. Maybe I should ask for a DARPA or NSF grant for nanotech research and spend my time and money working on a p2p network...
Life will find a way..... (Score:3)
Um... No. (Score:2)
This could be from any number of causes.
1. People at a college might have more straightjacketed finances and can't afford to increase their CD spending as fast as the general public.
2. People at a college might tend to order online more often, thus satisfying their music consumption through non-local stores.
3. People at a college may be joining CD clubs or may be purchasing CD's at home where they have convenient access to a large collection and bringing them to college instead of purchasing them near college.
4. A statistical anomaly. A decline in sales isn't actually happening.
5. A million other possible reasons... Colleges are drugging their students so they purchase textbooks instead of CD's.
The conclusion: while such a correlation may exist (college CD purchases aren't increasing as fast as the national average), it could have been generated by any NUMBER of possible causes.
If you want statistics I'll believe: take universities whose student populations have similar demographics, that do and don't have (say) Napster, and ask them how many CD's they purchased in the last year. Or use some other technique that isn't susceptible to the flaws in #1-5 above, and give me numbers that don't have obvious artifacts.
Well, duh (Score:5)
Anyone who understands how Gnutella works (unfortunately, too few people) knows that Gnutella is horribly broken, will never work, and is basically unfixable.
The more relevant question is whether you can have a peer-to-peer network without central servers that *can* scale. And the answer is "no".
However, the REAL question is whether you can have a peer-to-peer network with decentralized servers, i.e., with clients that automatically establish a hierarchy among themselves, so that certain clients become more "server-like". The only way to make Gnutella work is by making it hierarchical, but the hierarchy needs to be automatic for it to have the same general "virtual network" aspect of Gnutella.
Is it possible? I don't know. You would probably have to have automatic bandwidth measurements, depth probes, all kinds of things to make it work. I simply don't know if it would be possible to automate something like that.
--
Re:Scaling... (Score:2)
You're saying they paid him off, or did you just not bother to read the header?
Re:A Future Alternative (and its scales linearly t (Score:2)
All of the communication is done through a single UDP socket. DTCP is a multiplexing transport protocol which operates over that one socket.
You are correct about the number of open UDP sockets, though. On any UNIX or NT variant the limit is usually 1024 to 2048 per process, and 64K per IP address (the port value in UDP or TCP is only 2 bytes).
This is why native UDP or TCP cannot support the required number of connections to perform direct queries to each peer in a large network.
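The multiplexing idea itself is simple; a sketch (this is not the real DTCP, just the general shape):

```python
# Thousands of logical peer sessions share ONE OS-level UDP socket, so the
# per-process descriptor limit never comes into play.
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("0.0.0.0", 9000))

sessions = {}   # (ip, port) -> tiny per-peer state; a few hundred bytes each

while True:
    data, addr = sock.recvfrom(2048)
    state = sessions.setdefault(addr, {"seq": 0})   # demultiplex on source address
    state["seq"] += 1
    # ... hand `data` to the logical connection for `addr` ...
```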
Audiogalaxy. (Score:3)
At first I was put off by the web interface, but:
1) It remembers everything you request in a queue and will get it when available. (A must for dial-up users)
2) Auto-resume using temp files.
3) A small app in your system tray/console only sends/receives when you have it running.
The greatest advantage is that ZDnet/CNet/MSNBC and the others DON'T mention Audiogalaxy in their "quest for the Napster clone" articles, so the quality of users, and therefore the music, is excellent.
Unfortunately, it is a centralized system, but so far, it seems the mainstream media/RIAA have ignored it.
Thanks for the pointer! (Score:2)
I found the original paper:
MICHAEL O RABIN : Efficient Dispersal of Information for Security, Load Balancing, and Fault Tolerance [google.com]
Basically, it means you can break a file of length L into N chunks each of length L/M, such that only M chunks are needed to reconstruct the file. It's exactly the right thing for these circumstances.
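To make the idea concrete, here is a toy dispersal scheme in Python. It is a systematic Reed-Solomon-style code over GF(257) rather than Rabin's exact matrix construction, but it has the same property: N shares, any M of them rebuild the file (this sketch requires N < 257):

```python
P = 257  # prime just above one byte; every file byte fits in GF(257)

def _lagrange(points, t):
    """Evaluate the polynomial through `points` [(x, y), ...] at x = t, mod P."""
    total = 0
    for xk, yk in points:
        num = den = 1
        for xj, _ in points:
            if xj != xk:
                num = num * (t - xj) % P
                den = den * (xk - xj) % P
        total = (total + yk * num * pow(den, P - 2, P)) % P
    return total

def disperse(data: bytes, n: int, m: int):
    """Split `data` into n shares such that any m of them reconstruct it."""
    pad = (-len(data)) % m
    data += b"\x00" * pad                       # pad to a multiple of m
    shares = [(i + 1, []) for i in range(n)]    # share i lives at x = i + 1
    for off in range(0, len(data), m):
        # Treat m consecutive bytes as the values of a degree < m polynomial
        # at x = 1..m; each share stores that polynomial's value at its own x.
        block = list(enumerate(data[off:off + m], start=1))
        for x, vals in shares:
            vals.append(_lagrange(block, x))
    return shares, pad

def reconstruct(any_m_shares, m: int, pad: int) -> bytes:
    """Rebuild the file from any m of the (x, values) shares."""
    out = bytearray()
    for idx in range(len(any_m_shares[0][1])):
        pts = [(x, vals[idx]) for x, vals in any_m_shares]
        out.extend(_lagrange(pts, t) for t in range(1, m + 1))
    return bytes(out[:-pad]) if pad else bytes(out)

# shares, pad = disperse(b"any file contents here", n=10, m=4)
# original = reconstruct([shares[0], shares[3], shares[6], shares[9]], 4, pad)
```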
--
Pfft... (Score:2)
Re:OMG NO NAPSTER??? Come on (Score:2)
MP3 first came out in 1996.
But it almost seems like forever doesn't it? To me it's encouraging that this stuff is so new because it means that in 4 years I'll be a "grandfather of the internet" too.
This is a truly fun time to be alive.
Re:OpenNap (Score:2)
--
Re:A solution (Score:2)
I'm not sure what you mean by this. The bandwidth constraint would only be a guideline on deciding on a parent, not a straightjacket. That guideline, consistently applied, will tend to push high bandwidth nodes towards the top of the tree. I'm considering connections here to be pretty fluid. Almost as fluid as current Gnutella connections.
The fact that connections are made and broken reasonably frequently will tend to cause the tree to become bushy.
What if the root goes down without performing any protocol? Are all children inaccessible when a branch node goes down until the spanning tree has been renegotiated?
I've thought about this. One solution would be to have the root's children hold an election as to the new root. Another would be to have the root designate an emergency root.
I consider this problem not too hard to solve. The much more interesting problem is when you have two nodes who think they're the root.
That was my intention initially. A later version of the protocol could use information about the spanning tree to designate caches that had all the information for either a subtree or the whole tree.
You also have to admit that going from n! to n is a huge improvement. :-)
Freenet (Score:3)
I understand that there are basically three reasons for Freenet:
AudioGalaxy! (Score:2)
It's definitely worth a try (and blocked by far fewer firewalls and ISPs than Napster!).
Error correction could make Usenet work (Score:2)
I haven't seen any P2P proposals which make use of error correction technology, and it does seem like it might be useful.
--
Re:Scaling... (Score:3)
--jordan
Re:There is FUD there. (Score:2)
As for why Gnutella can't scale, the point of my paper was not to duplicate other work or research. I don't mention a lot of the reasons because I think they are either irrelevant or different methodologies arriving at the same point. The premise of my paper wasn't to cover the practical limitations of Gnutella, since those have pretty much been beaten like a dead horse. The premise was to take an alternative angle at addressing the question "Can Gnutella Scale?", simply by calculating network impact with some math, and provoking some thought.
In other words, you look at 6GBps and say "FUD!!! That number is wrong!!" I look at it and say, "Hmm, well, 6GBps or 4GBps, makes no difference why or how, it ain't gonna work."
--jordan
RIAA is advertising piracy... (Score:4)
Now there will be media coverage (other than on the internet) mentioning other alternatives like IRC, Gnutella, search engines, etc. This is a really stupid move... not counting the many people who are going to be pissed off at the RIAA and stop buying CDs.
The RIAA should have worked closely with Napster to build a decent business model instead of bashing them; they might have actually profited from that. They've shown how much "copyrighted material" was leeched every second (around 10,000 songs), but did they show EVIDENCE that their sales decreased DUE TO Napster? No. They didn't have to, but if they had, things wouldn't be this way. You bet that after Napster shuts down their sales will decrease; I, for a start, will not buy any more CDs.
I hope a company signs big artists for digital distribution and does something like Stephen King did: a buck a download, with the money going STRAIGHT to the artists, which would put a stop to the record labels' own piracy (i.e. ripping many artists off and taking the public for complete morons).
For now Gnutella will do for most people, and if people SHARE, maths or not, it will work; not as nicely as Napster did, but there will be a bunchload of alternatives if Gnutella isn't doing the job.
Re:Peer-to-Peer can scale (Score:2)
Got a url to it? This sounds like an interesting read.
only part way through the paper... (Score:2)
I mean, each client is only passing a small amount of data to each of the others, so I don't know if the aggregation of the total bandwidth usage is a ... useful ... measurement...
tagline
Alternative Mirror (Score:3)
--jordan
A Future Alternative (and its scales linearly too) (Score:5)
The key aspects of this network will be:
- No forwarding. This is currently eating Gnutella alive. A UDP-based multiplexed transport protocol is used to maintain hundreds of thousands of direct connections to all the peers you want to communicate with. You can also tailor your peering groups precisely to what you desire, as far as quality, reliability, etc.
- Low Communication Overhead. All queries that are broadcast are performed with minimal overhead within UDP packets. A typical napster breadth query (10,000 peers) would take a few minutes on a modem, and seconds on a DSL line.
- Adaptive Configuration. Peers that have better or more responsive content will gravitate towards the top of your query list; thus, over time you will have a large collection of high-quality peers, which will greatly increase the chance of your finding what you need.
There are a number of other features, but too many to detail here.
Also, this is under heavy development, and not operational. I am going solo on this at the moment, and so progress is slow. However, once completed, it *should* be a scalable alternative to completely decentralized searching / location.
Re:Scaling... (Score:4)
But if Napster gets squeezed, you can bet your last dollar that it will be made to. Or something like freenet or audiogalaxy will take over.
But if the price of gasoline goes up, you can bet your last dollar that teleportation will be made practical. Or that cars that use fusion will be developed.
Not everything is practical just because there is a need for it.
--
Per-user bandwidth (Score:2)
Re:Scaling... (Score:2)
The gnutella network is improving... (Score:2)
Critics said man would never set foot on the moon. Now critics are saying Gnutella is doomed. Funny, they've been saying that since March of last year and I'm still happily downloading MP3s. Ignore the critics and keep the faith.
Shaun
Re:Well what about Freenet then? (Score:5)
Freenet is also very well architected, unlike bogus Gnutella.
The problem with Gnutella is not the transferring of files, it's the searching. You'll note that Freenet conspicuously avoids the subject of searching, except for "yeah, we're thinking about it... real soon now!"
--
Re:Well, duh (Score:2)
No, you connect to as many as you want. You can stop at 10,000, half a million, etc. Each peer is in direct control of how much bandwidth it uses, how many peers it connects to, and how many queries it performs.
Second, presumably other people are making queries, too. If there are even 20 queries per second, your modem link will be saturated even if you're not making any queries of your own.
That's where ALPINE comes into play. It allows the ordering of peers based on the quality and value of their responses. If you start getting busy, you simply quit replying to queries, your perceived value to those peers drops, and you then get queried less.
The details are more complex, but you should never encounter saturation unless you specifically configure your client to do so, and even then it's unlikely.
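One invented illustration of that quality-ordering idea (the scoring rule here is made up, not ALPINE's actual algorithm):

```python
# Rank peers by how useful their answers have been; silent or busy peers
# sink down the list on their own, so load balances itself.

class PeerList:
    def __init__(self):
        self.scores = {}                     # peer -> running quality score

    def record(self, peer, answered, useful=False):
        s = self.scores.get(peer, 0.0) * 0.9        # decay: old behaviour fades
        if answered:
            s += 2.0 if useful else 0.5             # good answers pull a peer up
        self.scores[peer] = s                       # silence alone sinks it

    def query_order(self):
        # Query the best-scoring peers first; swamped ones stop being asked.
        return sorted(self.scores, key=self.scores.get, reverse=True)
```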
Third, discovering and storing a list of 10,000 peers -- not to mention 1 million peers -- is prohibitively expensive. Remember, there's no centralized server dishing out lists of addresses
You build them up gradually, and continually refine your list over time, so that you eventually have a list of similar peers with quality content and service. You don't get one million peers all at once. There is a discovery protocol in place, where you can ask for a number of peers from one you are already connected to. No need to do it all at once.
Third, discovering and storing a list of 10,000 peers -- not to mention 1 million peers -- is prohibitively expensive
You can store the connection information for 10,000 peers in 2 megabytes of RAM. The DTCP protocol is specifically designed to be very compact with almost no overhead per connection.
Fourth, the amount of churn in a group of 10,000 peers is quite high -- nodes are arriving, leaving, and crashing all the time. Even if you could find out about all 10,000 peers, your link would be saturated keeping up with changes in group membership
There are protocols for resuming connections if your IP address and port change. Also, these peers would have a perceived low quality in comparison to more stable nodes, and thus would move down your list. You may not even need to maintain a connection to them at all.
The design of the server is also similar to a daemon process. Your GUI or client would interface with the server through a CORBA interface. You can shut down the client and the server is still running. You can reduce bandwidth usage if you wish, or shut down the server. However, it is designed for a more persistent presence than most peering services.
And last, your network inherits all of the d-o-s, spam, and privacy problems inherent in any broadcast-search network. Gnutella has demonstrated these problems (if not solutions to them) handily. Learn from the idiocy of others.
Actually, this should be less of a problem than you would suspect. DoS is still only as bad as it is with TCP. There is a connection protocol similar to TCP, with handshakes, etc.
Spam is even less of a problem, as you can ban peers which spam or attack you. Peers can share this information in a growing pool so that spammers and rogue clients are effectively ostracized from the network. Each peer can decide whom it communicates with and when. It puts the power back in your hands.
By the way, these were some very insightful questions. Thanks for the reply.
I challenge you... (Score:5)
Can we expect therefore to see an equally interesting and thorough discussion of how Napster/Gnutella can grow, evolve and perhaps merge, to provide the "ideal compromise" where we will not need 100Gb networks, but where:
a) The destruction of any significant %age of the network is transparently ignored or healed.
b) The network will not segment as GnutellaNet can.
c) Bandwidth requirements are low[er]
d) Anonymity of participants is maintained where required.
e) The law can't shut it down so easily.
f) Data can be secured, encrypted and/or signed (etc.) for specific users
And MY personal wish:
g) The end result is so globally accepted for file exchange and storage that FTP dies a death, and we all live without buffer-overflow exploits for the rest of our lives
Note that Napster and Gnutella were very one-sided in their freedom with files. There was no facility available to ensure that the law was honoured where desired.
--
Re:A Future Alternative (and its scales linearly t (Score:2)
I could very much use some additional C++ development talent to help with this project. Anyone who is interested please let me know.
Thanks...
Gnutella Vs. Napigator (Score:2)
If it's the trading of MP3's that's at stake, I believe that Napigator and nap servers like OpenNap will save the movement, not Gnutella.
This is not new news. (Score:2)
This included the fact that the load on each server grows proportionally to the total number of servers, so the total CPU usage for the whole system grows quadratically. There are also serious issues with naming, searching, tagging, and other things that could have been dealt with.
There didn't seem to be much interest in this so I moved to lurking the Freenet mailing list, which seems to be a much more grownup way of doing the same thing.
Engineering is cheap (for us) (Score:2)
I'd say there is no way to put the genie back in the bottle, either by products dying out or by legal action. Now that there has been a taste, there will eventually be one or more working models. None is likely to have the instant dominant position Napster had (except possibly Microsoft's offering, if they bind it into Windows), but that doesn't mean the concept will die.

File sharing is a simple concept and a very addictive one, so it's something with low market entry and lots of possible market share. That will drive companies to invest. Us geeks will invest our time just to keep the companies from sealing us in, and because we like to hack code. I myself was working with file-sharing concepts long before Napster existed and am sure I will be long after. The concept has no doubt been growing ever since the invention of email. As a species we like being able to communicate freely. That includes text messages, voice messages, movies, photos, music, games, etc. Therefore there is no way the idea of sharing these things will die out. They'll just get thought about some more, and new, better concepts will be tried over and over until we find the perfect one. Email, ftp, gopher, web, instant messaging, Napster, etc. are all steps we've taken.
Re:Well what about Freenet then? (Score:4)
A few months ago, I tried to find a simple, lucid discussion of exactly how Freenet achieves IP anonymity. On a technical level, but without having to plow through the code. Anybody want to try? I'm not disputing that it works... I just don't see how you can prevent someone from sitting on your router watching packets fly, and correlating the IP on the other end to a single system.
--
Evan
Re:Well, duh (Score:2)
Why wouldn't it be?
Why isn't the travelling salesman problem solvable? Why is pattern recognition such a difficult problem when humans do it so easily?
Don't underestimate the difficulty of the problem of a self-organizing network. It is definitely a non-trivial problem.
--
Re:Tracing (Score:2)
Yes. They can track your ISP, obtain a court order to search the ISP's logs, obtain your information, and arrest you.
is this likely to happen?
Short answer: no.
Long answer: Do you know how many people do this stuff? If the FBI went after every copyright violator in the nation, they would need an incredible amount of manpower. If you aren't reproducing and (important) selling bootlegs, nobody cares. You've been taking the "FBI Warning" at the beginning of videos way too seriously.
Re:Scaling... (Score:2)
Uh, right. Hands up everyone who actually needs to compile the latest, greatest kernel? Hands up everyone who did anyway?
Re: (Score:2)
Hey, dammit! (Score:3)
*sigh*
omega_rob -- friend of the dread pirate Napster
A solution :: semi-distributed database (Score:2)
The client software would have preference settings allowing a user connecting to the file-share system to indicate their "eligibility" to become a temporary Database Host, with options of Always, Ask, and No.
Client software would work like this:
Access specific IRC Channel and Query established hosts.
Hosts (the temp. Database Hosts) would respond stating who they are, and requesting the client's share list.
Database Hosts would negotiate which host would accept the new clients list.
Client would then be told which host to transmit its list to, and when its next update would be expected.
Search requests are then transmitted to the Hosts through IRC; results are returned directly to clients by Hosts. 1-to-1 transfers are then initiated using the client's choice of protocol.
When clients contact hosts indicating they are still online, the Hosts will ask the client program about server eligibility. Database Hosts will change to those who indicate a preferable host environment.
Of course there are specific things to work out, but what do you guys think? Use IRC as a central communications channel for everything, and use a randomized central group of systems as centralized databases: faster search returns than Gnutella can produce, but at the same time, no easily shut down central server.
Just a thought. Don't have the skills or time to write up a trial client.
More info: improving your connection (Score:3)
1. Never connect to more than 5 hosts at a time. There's no need for it and you'll only hurt yourself by doing so. I used to spend a lot of time in the gnutella.wego.com discussion area, and then the GnutellaNews boards, helping out new users. Time after time someone would come in and say, "Gnutella is shit! I type in a search and I don't get results for 10 minutes!" Me: "How many connections do you have open?" Them: "50, and if I try with 100, it goes even slower!!"
The more active connections you have, the slower your Gnutella experience will be... And by being a congested node, you're adding latency to the network for everyone else. Set your max connections to 5. That gives me, on average, an overhead of 6-10K/sec in background chatter, not counting uploads/downloads.
If you're on dialup, max your connections out at 2 and (it hurts to say this) don't share files or you won't be able to do anything else online. If you really want to share - and that's a good thing - cap your uploads at 1. Leave routing up to the people with the fatter pipes.
2. Go for diversity in your connections. If you load up your client and see that you're connected to 5 RoadRunner nodes, dump a few of them and try to connect to other networks. Peer-to-peer file sharing relies a lot on peering, after all. Connecting across ISPs, networks, and even across countries is a good thing.
3. Don't share junk files. Please. Every time I search for Pink Floyd and get a ton of under-1MB MP3s in the results, I want to kill someone. Know which directories, if any, you're sharing... And clean them out from time to time. All those incomplete downloads you made are being sent out as search results, but nobody is going to download them from you. Those are a lot of wasted bytes coming through your query hits.
4. Perhaps most importantly, use a good client. See the parent for details.
Shaun
Re:I challenge you... (Score:2)
But you're right, what's the next step? Well, I think there's a lot of great ideas out there already, but the technology in general is all still quite juvenile. I don't believe we'll see wide-scale adoption without a better Internet infrastructure to carry the traffic. Coming up with smart ways to ferry around data will always help, but in the end 14.4k is 14.4k, and 56k is 56k. Following the logic of Power-Law Phenomenon, fully distributed networks will probably never scale without the lowest-common-denominator amount of bandwidth being raised significantly.
--jordan
What about Usenet? (Score:3)
Tools like NewsShark and NewsGrabber make it easy to post or obtain binary formatted files such as multimedia and there is plenty of it available. No waiting for downloads, no acne-faced punk kids aborting them, and you can batch and resume at your convenience.
Usenet isn't that hard to use and there is a lot of music that can be found from your ISP's news server. Grab a client and check it out!
-Pat
Re:Well what about Freenet then? (Score:4)
The other thing you could do would be to take over a node on the network and request the material you're interested in, but since Freenet uses relay nodes, you can never be sure whether the information you're receiving came directly from the node you are talking to or through N relay nodes. Also, the data is encrypted on the hard drive of the node operator, so you provably cannot know if you are storing illegal data or a copy of Johnny's essay for school.
Hope that helps,
--
Remove the rocks to send email
Re:Audiogalaxy? (Score:3)
Downloads are generally horribly slow. Most of my downloads on Napster/OpenNap servers come in around 25-100Kps. Audiogalaxy claims you get the fastest for your location, but I can't see how that's really true if I'm getting 1-2.5Kps downloads.
The selection's not too bad, but you can't find the obscure stuff that you'll find on a network the size of Napster. Its organization is a step in the right direction, better than Napster's, but could be better.
What I really don't like is the fact that you have to choose which version each time. Sure, you're supposed to get the most popular version, but I don't like 128K MP3's; I prefer 192K files. So each time I download a song I have to choose which one I want. It would be nice if I could tell it I prefer 192K songs and have it default to those.
With Napster I can find an entire album in a search and queue it up quickly. With AudioGalaxy it takes several clicks. You also don't have the ability to browse a user's files. This is one of my favorite things about Napster: the ability to browse other users' files. Sure, AudioGalaxy gives you logical choices of other music, but I frequently find things I like that don't logically go with the song I initially searched for.
Basically I think AudioGalaxy is a good idea. I'd like to see a better client, maybe standalone or a Java client so it would have a little more flexibility, and I'd like to see more potential interaction between users.
Re:Well what about Freenet then? (Score:3)
Freenet is a totally peer-to-peer system. It is not possible to tell whether I'm sending the file directly to you or am just transmitting it at the request of a node behind me. And if it's possible to prosecute for that, then ISPs everywhere are in BIG trouble.
--
Remove the rocks to send email
Re:Well, duh (Score:3)
This is how server-on-demand would work. A client starts and searches for a server. It could be a local broadcast, searching a list of last-known servers, searching a list of servers from a config file, asking the user for a server address, etc. If no server is found, the client automatically becomes a server. If a server is located, the client can begin to request data. If the server is under heavy load, it will ask a client (meeting certain requirements) to become a server. Of course, clients have a choice, perhaps a little config switch to either allow server status or never allow it. Anyway, the client needs to be worthy of server status. Some criteria might be uptime, bandwidth, available storage space, number of hops, etc. The burdened server would give the invitation. One thing that might be implemented is an automatic server invitation for clients which are located along the same route as more distant clients (say, somewhere in the middle of a 50-hop route, based on response times). The middle server would then handle the requests for the more distant clients.
Obviously, we need to maintain a list of the quasi-centralized servers. Clients can maintain their own lists of servers. Servers can hold lists of other servers. Search requests are never handed to clients; only servers are searchable. Therefore, clients may publish available files to server databases. Servers may ignore these if clients do not meet certain guidelines.
We could incorporate a trust level into the server list. Once a client is deemed worthy to be a server, it is trusted. Certain bad events (dropping from the network too often, losing storage space, etc.) could reduce the trust rating. If that rating goes too low, server status is revoked and a new server can be appointed. Good events might raise the trust.
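A rough sketch of how those promotion and trust rules might look (all thresholds and field names invented, purely to illustrate):

```python
PROMOTE_TRUST, REVOKE_TRUST = 0.8, 0.3

class Peer:
    def __init__(self, uptime_hours, bandwidth_kbps, storage_mb):
        self.uptime = uptime_hours
        self.bandwidth = bandwidth_kbps
        self.storage = storage_mb
        self.trust = 0.5                  # new peers start at middling trust
        self.is_server = False

    def worthy(self):
        # Criteria a burdened server might check before inviting a client.
        return self.uptime > 12 and self.bandwidth > 256 and self.storage > 100

    def record_event(self, good):
        # Good events raise trust; bad ones (frequent drops, lost storage) cost more.
        self.trust = min(1.0, self.trust + 0.05) if good else max(0.0, self.trust - 0.1)
        if self.is_server and self.trust < REVOKE_TRUST:
            self.is_server = False        # trust too low: server status revoked

def maybe_promote(server_load, candidates):
    # A server under heavy load invites the best worthy, trusted candidate.
    if server_load > 0.9:
        for c in sorted(candidates, key=lambda p: p.bandwidth, reverse=True):
            if c.worthy() and c.trust >= PROMOTE_TRUST:
                c.is_server = True
                return c
```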
Anyway, that's a bit of my rambling. I'm not a network engineer, so I can't speak to the scalability of this method. It might work or it might not. Comments are welcome! ^_^