Deja For Sale

yet another coward writes: "According to a story in Internet Week, Deja.com is for sale. The company plans to sell the Usenet archive and the buying guide separately. This move might mean a comeback for the archive." So what's the going rate for 1,000,000 MMF posts?

  • Deja's sale has been inevitable since they changed the name from DejaNews. I've been thinking, if anyone were to buy Deja, VA Linux would probably be one of the best choices.

    So far VA hasn't screwed up Slashdot any worse, right? I think Deja's database would fit right in with VA's move to be a content provider.
  • make money fast
  • Because they're making money off of something they have no moral right to. I have never bothered appending "x-no-archive" to my Usenet posts: I don't mind what I write being archived so that others may find it useful/interesting/not particularly harmful. However, if some company decides to make money off of my words without my explicit consent, I feel completely justified in screaming bloody murder, and I will.

  • silly person, i wasn't trolling, i was being silly. There needs to be more silliness in this world. Silliness for all, I say!
  • bleh, maybe they'll be able to keep your service up....
  • Collecting tapes with the old articles took work which should be worth something. I don't know, but they possibly even had to pay for them.

    And it's not like it costs nothing for a news server to receive articles.

    As I see it, Dejanews is like any usenet news provider, only with a much longer expire period. (Unless they'd do things like edit articles to make words in them become hyperlinks and the like.) So yes, of course they can sell the database just like any other provider could.

  • Why would a spam company need to buy it? They can already grab everyone's email addresses.
  • "Hopefully, whoever buys the Usenet archive will 1)Keep the service free, and 2)Hire some programmers."

    Yeah, like anyone would pay for something if a free, better thing wasn't available.

    Uh, wait...

    My .02,

  • 1 - spam company buys it, suddenly has access to the emails of, well, everyone who has ever posted to the usenet.

    2 - a comeback for dejanews.com which was a decent site without the portal junk.

    Personally I'm voting for #2, because I have enough problems getting off the linux-ipsec list, and getting off every spammer's list with my multiple email addresses would surely drive me (more) insane...
  • Most serious ISPs have USENET service, access to which is free for their customers. Netcom and AT&T @HOME come to mind as examples.

    I have nothing against a WWW interface to USENET as yet another access method. What I do not like is being fed BS about expensive USENET software. It is the cost of storage and bandwidth which makes USENET expensive for ISPs. So putting all your USENET articles into a DB engine and serving them over HTTP will save you neither disk space nor bandwidth. (In fact it will take more bandwidth, due to HTML decorations and "sponsor messages".)

  • AOL would like to buy Deja.Com

    Me too!

    And me!

    Add me to the list!

    Me too!!!!

    Me as well!!!

  • Hadden, the old rich guy from "Contact", buys deja.com ... "Why buy one floundering web site when you can have two at twice the price?" :)
  • by Anne Marie ( 239347 ) on Monday October 16, 2000 @04:57PM (#701040)
    I've gotten sick of defending myself and my gender time and time again, but I'll do so one last time. Just because most people on slashdot are male doesn't make me male, just as having most people on slashdot be of a certain race or nationality or religion doesn't assure that any single individual shares those characteristics. But I can cope, since in the greater scheme of things, it's no big deal that a few ACs continue to have their doubts.

    There is a bigger problem, though. Go ahead and look at my previous comments. Nearly every one of them has one or five AC replies to the effect of "suck my dick" or "I want to fuck you in the ass". Throughout history, female authors have been denied recognition for their work, because it was commonly assumed that women were incapable of creating what they created. And throughout history, women have been spat upon, threatened, battered, and gangraped by the same men you'll find here on slashdot. For all I know, you yourself are one of those same ACs.

    Ask yourself what you gain by contributing to this climate of fear and hate. Ask yourself that question when you scurry off for your nightly porn fix. Ask yourself that question when you insult and harass people on slashdot.
  • by Anonymous Coward
    You are forgetting an important part. You don't just have to archive the posts. You also have to provide efficient search capabilities.

    How do you search 10 terabytes of data in a few seconds? I dare you to do it with $50,000 worth of equipment.

    Even if you restrict to title and e-mail, say, 20 bytes/message, it's still 20 GB. The only way you can do this is to have many computers each searching a piece of the archive concurrently.

    So, 200 computers each searching 100 MB of data? That still takes a few seconds. And that's a few seconds *per search*.

    Yes. I'd love to buy a copy of the archives. Whoever buys deja may want to market DVDs full of usenet posts (they may run into copyright issues since all posts are copyright of the authors).

    Building a fast search engine is always the problem.
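
The arithmetic in the comment above, and the idea of many machines each scanning its own slice of the index, can be sketched roughly as follows. This is only an illustration: the toy shard contents, the names `search_shard` and `search_all`, and the use of threads on one machine in place of 200 separate computers are all assumptions, not anything Deja is known to have run.

```python
from concurrent.futures import ThreadPoolExecutor

# Figures taken from the comment: a 20 GB title/e-mail index, 20 bytes per
# message, split evenly across 200 machines.
INDEX_BYTES = 20 * 10**9
BYTES_PER_MESSAGE = 20
MACHINES = 200

messages = INDEX_BYTES // BYTES_PER_MESSAGE    # number of messages implied
bytes_per_machine = INDEX_BYTES // MACHINES    # slice each machine must scan


def search_shard(shard, term):
    """Linear scan of one shard; returns matching (msg_id, header) pairs."""
    term = term.lower()
    return [(msg_id, header) for msg_id, header in shard if term in header.lower()]


def search_all(shards, term):
    """Scan every shard concurrently, then merge the hits in id order."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        results = list(pool.map(search_shard, shards, [term] * len(shards)))
    return sorted(hit for hits in results for hit in hits)


# Toy shards standing in for per-machine slices of the index (hypothetical data).
shards = [
    [(1, "Re: Deja for sale"), (2, "MAKE MONEY FAST")],
    [(3, "deja power search tips"), (4, "comp.lang.c FAQ")],
]
hits = search_all(shards, "deja")
```

Each worker still does a plain linear scan, so the per-query cost the comment worries about does not go away; splitting the index only divides it among the machines.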
  • In general, there has been a move away from Usenet and towards other fertile discussion forums within the last four years. I expect this trend to continue well into the next five years. Today, Usenet is nothing like what it was ten years ago. It'll be even less so, tomorrow.

    People keep predicting that usenet will die. People have been saying that usenet's been in decline for the last five years for at least as long as I've been on it (early 95).

    Usenet won't die because it's easy to use (news readers are a very mature technology), it's informative and there's a real sense of community there. No matter what happens in terms of technological advances, you'll never get those communities shifting en-masse to somewhere else.

    And ultimately that's what usenet is about, communities. I met my first girlfriend, my wife [wwwvelvetnet], a good portion of my close friends from one newsgroup. I learned Everything I Need To Know about programming from some uber-intelligent people on another.

    I was off usenet for a while recently, 6 months. I missed it terribly. I'd been checking usenet probably 355+ days a year. I missed the sense of community.

    Until you kill that, until you remove all my friends' internet connections, take away their newsreaders and burn the servers, news will survive. It's the most successful form of online community, it always has been and it always will be. When IRC is ancient history, when ICQ, AIM etc no longer exist, I'll still be checking news.

    - Aidan

  • by Anonymous Coward
    Not only that, deja seems to have 'shrunk' the number of years they go back. It used to go back to 1994, now it seems only to go back to 1999

    Who else has usenet archives?!?!! Arg!!! I feel my complacency in letting them manage this has bit us all in the ass.

  • Anne-Marie, it's sad but I think you're right. Many readers of Slashdot favour porn, and it really isn't something to improve male-female relationships. Any user of pornography does degrade women and supports their abuse - those who say it's ok should meet women whose marriages have been destroyed by their husbands' addiction to pornography. There are many of those. I have the unlucky privilege of knowing one such couple closely.
  • Instead of running SETIathome, adopt a text group and back it up. We could call it the "Search for Intelligent Life on Usenet."
  • by uradu ( 10768 ) on Monday October 16, 2000 @05:03PM (#701046)
    It would be cool for some IT-oriented company such as IBM to buy the comp.* and alt.comp.* branches, slap on a real search engine that lets you perform actually useful searches, and put it back on the web. The wildcard capabilities need to be greatly enhanced, as well as searching for special characters. I used to live in deja for development research, and it's still my first stop even today, even though my expectations are greatly lowered.

    It strikes me that IBM in particular could use it as a show piece for their technologies: DB2 (scalability, speed etc), their storage farms, search engine frontend etc. Make it part of their developerWorks and keep it really fast to show off their stuff.
  • by Detritus ( 11846 ) on Monday October 16, 2000 @02:56PM (#701047) Homepage
    http://www.deja.nsa.gov
  • Hmm.... Let me know when you sue what was RemarQ then. I am sure there are a few other people who would like to be on it. When RemarQ went commercial, I was pretty certain that deja would follow in the near to immediate future. (Off to start abusing my ISP for their crappy usenet server. Obviously vital now) Cheers Di
  • by Anonymous Coward
    now he knows there is a fast way to get rich!!! damn you!!
  • by jonfromspace ( 179394 ) <jonwilkins@gmai[ ]om ['l.c' in gap]> on Monday October 16, 2000 @02:59PM (#701050)
    Here is my offer for Deja

    3 Cans of Spam
    1 used Napkin (paper)
    My Slashdot and ICQ accounts
    The Head of Rob Malda


  • That was the most over-generalized, "This is what I have seen in my infinitesimal experience of the world, therefore it is true the universe over." statement I have ever seen. I would refer you to the movie "Orgasmo", even though you probably wouldn't watch it and even if you did your brain would go into offended mode and completely shut down, unable to see the social commentary it presents in places. One such thing is an exchange between Ron Jeremy and the supporting female role. Of course she goes on and on about the "Evil Perils of Porn" tirade and Ron Jeremy retorts that men are equally degraded and exploited in porn. To which the female replies that men are always the ones in a position of power. Ron Jeremy counters with the fact that it is mostly men who want the product, so it exploits men's desires. See now someone with your narrow viewpoint of the world would never parse that exchange because your brain will not rationalize past your pre-established thought process. And you know what, I wouldn't want to meet any wymyn who let porn break up their marriage, I don't associate with wymynz who would blame me personally for Joan of Arc's demise and every other "atrocity" that has befallen wymynz in the ages. I am not even gonna go into the nauseating victimz stance you are promoting. If you are weak enough to let pictures break up a marriage you got other problems more important than porn, sistah. Read my sig, be one with my sig, live my sig.

    DaPhreaker- Downloading usenet one piece of porn at a time
  • yeah, i'm going to buy a massive database of usenet that costs me out the anus to buy and maintain - solely for email addresses. no thanks.

    B1ood

  • When we all posted to usenet, it was under the implicit contract that our personal information wouldn't be bought and sold like so much cattle when the parent company was itself sold. Will Deja respect those promises now that they are under the knife? I'm personally surprised slashdot isn't raising a bigger stink about this; they sure did when toysmart was slapped by the FTC for similar behavior.
  • It's not just NOSPAM; you would also have to account for /N[a-z0-9]+O[a-z0-9]+S[a-z0-9]+P[a-z0-9]+A[a-z0-9]+M/ as well as the /at/\@/ and /dot/\./ translations. Nonetheless, currently usenet is not much more than a farm for spammers.
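
As a rough illustration of why this kind of address munging offers little protection, the translations listed above fit in a few lines. This is a minimal sketch, not any real harvester's code: the function name `demunge` and the sample addresses are made up, and requiring whitespace around "at" and "dot" is an assumption added so that ordinary words are left alone.

```python
import re

# Plain tag, matched case-insensitively.
PLAIN_TAG = re.compile(r"NOSPAM", re.IGNORECASE)
# The interleaved variant from the comment: the letters N, O, S, P, A, M
# separated by runs of lowercase letters or digits.
SPACED_TAG = re.compile(r"N[a-z0-9]+O[a-z0-9]+S[a-z0-9]+P[a-z0-9]+A[a-z0-9]+M")
# Spelled-out separators; surrounding whitespace is an assumption of this sketch.
AT_WORD = re.compile(r"\s+at\s+", re.IGNORECASE)
DOT_WORD = re.compile(r"\s+dot\s+", re.IGNORECASE)


def demunge(address: str) -> str:
    """Undo the common obfuscations named in the comment above."""
    address = AT_WORD.sub("@", address)
    address = DOT_WORD.sub(".", address)
    address = SPACED_TAG.sub("", address)
    address = PLAIN_TAG.sub("", address)
    return address


plain = demunge("jane at exampleNOSPAM dot org")
spaced = demunge("bobN1O2S3P4A5M@example.com")
```

Anything a person can undo by reading the address, a short script can undo as well, which is the commenter's point.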
  • You can probably work out the cost of archives as $n/Terabyte/year (not including capital costs of servers + storage). Not to mention costs of hiring at least one techie to keep the system running and the bandwidth charges. If your attitude is widespread, then the service is probably better off dead (as in dismembered) and the hardware given over to a more useful purpose and the people reassigned to more useful tasks. If a service can't justify itself as a valued social function or serve a market need then it will just degrade beyond recognition. If some entrepreneur keeps it alive and through their own efforts makes it functional (though you may consider their approach stupid or soulless) why should you complain? After all, you were not interested in keeping it around anyway, much less considered it relevant. A combination of idealism plus half a clue could resurrect the system but finding that type of person is hard (especially the half a clue part).

    TANSTAAFL

    LL
  • Sorry, but I think you mean "misandristic". "Misanthropic" means "having a dislike of the entire human race"; it is not the mere polar opposite of "misogynistic".

    Obviously, "misanthropic" would be the opposite of "misogynistic" if we were prepared to abandon the stupid convention that "men" means "people" rather than "men" (believe me, I'm no happier than you with an abortion like "misandristic"), but since it seems difficult to get any force behind non-phallo-generic language, you need to distinguish between misanthropy and hatred of men.

  • Now that's a headline I'd like to see. ;)

  • I don't support the abuse you have suffered on /. but ask yourself for a moment if you are in any way helping the situation with your nick.

    Look at the nicks used by pretty much every other poster here. The closest anyone else gets to revealing any information about their identity is to run their initials together with their surname, from which you cannot infer their sex. Your nick leaves no doubt what sex you are (or wish to portray yourself as).

    I know someone who enjoys bloking, which is to say he assigns himself a female nick then logs on to IRC. He gets all the abuse, and finds it funny. There are a lot of people out there that do it, and it leads me to be skeptical about anyone who advertises their sex, as your nick does.

    "On the Internet, nobody knows you are a dog" I believe the cartoon frame once said (probably Farside?). Your previous post was great, I really enjoyed it, I didn't look at your nick because it isn't and shouldn't be relevant. But to the trolls, it's just an open invitation. If your nick was BlackGeek they would call you a nigger. If it was Gay&Proud they would call you a fag. If it was SchoolNerd they would torment you about your age. If it was AMWhateverYourSurnameIs they would only be able to abuse you for the contents of your posting.
  • What Next? eBay.com For Sale? They could auction themselves off on... um... somewhere.

  • It's not FUNNY. My posts to USENET are in DejaNews. AFAIK, I still hold copyrights whether I marked them (Copyright 2000 ) or not. So does everyone else unless the copyrights in their country of posting don't give them copyright. Maybe even then, because US law does. IANAL.

    There are exemptions under copyright for the intended transmission, and reduced damages because I don't mark my posts (C), but they're still copyright. If DejaNews or whomever buys them starts charging, I want a cut!

    That said, other posters have commented on how an archive should be funded, and I agree it's a thorny issue. Much as I dislike gov't involvement, this seems a natural for the Library of Congress. Or maybe you would like an ICANN spinoff?

  • I asked them about this back in August and received the following reply:

    Recently we moved the Deja.com servers to a new facility in order to provide greater reliability and performance. The move is now complete and we thank you for your patience.

    Please note that currently our Usenet Discussion Service only retrieves messages from the past year (back through June 1999). As announced, we are reconfiguring the service that provides messages posted more than 1 year ago in order to provide greater reliability and performance. This will take some time though, possibly a few months. Have no fear: We're committed to bringing these messages back online as soon as possible.

    ... which doesn't say much really.

    Regards, Ralph.

  • It's very strange what you're telling her.

    If I'm not mistaken, you're saying that she should hide the fact that she's a woman, or else it's some kind of provocation...

    Think about it for a minute: if some people call her names because "she advertises her gender" (as you wrote) and not because of the content of her posting, it's their fault, not hers.

    "On the Internet nobody knows you're a dog?" Not at all: on the Internet, everybody assumes you're white, male and live in the US. Why should you hide the fact that you're not?
  • I have an alternate interface to the Deja power search; with this the subject headers and posters' names won't be truncated, and the search results can be displayed in a nested format. Spam-Free Deja Power Search [techwolf.org]
  • Might not be what you're looking for, however...

    Alan Cox posted this link [linux.org.uk] to LKML a few months ago. It contains the early LKML posts, dating back to 1993. This prompted a post from tytso, who gave out this [mit.edu] link to even earlier posts.
  • Their Usenet archive is THE most useful resource on the internet. It may be the only really useful resource on the internet.

    As someone who has to constantly solve problems involving a wide assortment of hardware and software, I can't begin to estimate the value of being able to go to a single web site and find the answer to almost any problem within minutes.

    It is the only web site I would consider paying a monthly fee to use.

    On the other hand, their me-too product ratings have no value to me at all. No doubt it will sell for a lot more money and be around for ever.
  • by 11223 ( 201561 ) on Monday October 16, 2000 @02:41PM (#701066)
    Spam: $0.01
    a.b.p.erotica posts: $0.10
    The perfect troll: Priceless
  • "I have the weirdest feeling I've seen this article somewhere before..."


    To continue browsing the archives, please log back into the NYTimes.com website


    (The Average Slashdotter's Nightmare)


    -------
    Our Fish Keep Dying! Try not to laugh at the results!
    http://udel.edu/~jgephart/fishcounter.htm

  • by MWoody ( 222806 ) on Monday October 16, 2000 @02:42PM (#701068)
    I've got some posts in that archive, I'm sure... Maybe I should sue for part of the asking price? ^_^
    ---
  • You are right, it's probably going to be hard to find that sort of archive, but even if micro$oft did buy deja I doubt they would go to the extreme that you mention.

    NOW WHERE DO I GO FOR KIDDIE PORN?

  • Seriously, I'm pretty sure there are many posts that cannot be sold so easily... Let's say I write a piece of GPL code and send it on comp.os.linux.whatever (or alt.sex.goat if you prefer!) with the license. If any company wants to redistribute it, they have to make it available for free download. Not knowing it's GPL is not an excuse (posting any copyrighted material would be).

    To me, the simple fact that they ask that you pay to access copyrighted information (GPL or not) that they don't own, seems illegal.
  • by karma_policeman ( 232005 ) on Monday October 16, 2000 @03:06PM (#701071)
    Hopefully, whoever buys the Usenet archive will 1)Keep the service free, and 2)Hire some programmers.

    Deja is the buggiest major site I've ever come across. If you've tried to use deja.com to read anything other than the most recent day or two worth of traffic, you probably know what I mean. Follow a link to a specific post, and there's a good chance you'll be directed to a totally different post. This state of affairs has held for at least the last year, maybe longer.

    Knowing that deja is up for sale, it now makes sense that they haven't put a lot of effort into fixing bugs. But whoever buys the usenet archive is going to have some serious work to do.

  • On the other hand.. maintaining such a behemoth for no profit would suck, and would take someone far more idealistic than me.

    I don't know just how hard a thing deja is to maintain. The code in itself seems like it hasn't undergone many changes in the last little while, including this #$@! bug that comes around every now and then asking it to search only for messages that contain the '*' character...

  • by devphil ( 51341 ) on Monday October 16, 2000 @03:09PM (#701074) Homepage

    I highly doubt that anyone will want to pay for an archive of usenet postings. Frankly, they are of limited use - most post threads offer very little useful information.

    Most of the books in the public library are crap, too, IMO, but I wouldn't once suggest that libraries are of limited use.

    Almost every single coding problem I've come up against, or configuration problem, or hardware problem, or VCR-clock-setting problem, has been asked already. All I need to do is a Deja Power Search, some thoughtful keywords, and I have my answer, courtesy of someone the previous year.

    Deja's archives may be of interest to an educational institution looking at the historical value of the posts, but the useful market value of the posts is zilch.

    Market value? Yeah, probably not. Usenet isn't there for market value; it's there to facilitate a huge meeting of the minds. And we need to preserve that information, so that those of us trying to write code and support the rest of you aren't forever asking the same questions.

    As for Deja as a product review site - what can you say?

    How about "It blew goats!"? Deja should have stuck to what it did best -- archiving Usenet -- and left that "portal" crap to places that believe in such things.

  • I'm assuming Make Money Fast, but I could be wrong.
  • by etherwalker ( 78824 ) on Monday October 16, 2000 @03:12PM (#701076)
    Shouldn't the Library of Congress (USA) maintain a Usenet archive? Anybody know if Congress has ever asked the LoC to do so? If not, why not? I would consider it a gross negligence of their duty if they're not.
  • Shoot me if I'm wrong, but I remember www.dejanews.com being _web accessible_ back in 1993 (which was the first time I used it, it may go back much further). The article claims that the company Deja was set up in 1995 in order to do the above. I think that is revisionist history. My version is that in 1995 they decided that they wanted to _make a profit_ from their web site, that's all.

    FatPhil
  • >The company believes the profitable Usenet business unit [...]

    Then we were right all along, and deja.com is truly fuckedcompany.com material.

    The reason their USENET archive was profitable (fuck, just think how many banner impressions get generated for a typical query - and I'm talking Deja Classic, not the "new" mode) was because it was useful, and people used it.

    I'm actually very relieved to see that they'll be selling the USENET archive to someone who gives a damn about USENET. Deja sure as hell didn't.

    And that the money-losing "product review" site will go to someone dumb enough to think that when I'm searching for "Frobozznitz 1996 specs", I want some FrobCo marketer's spiel about the latest and greatest, when the reality is merely that I found the circuit board for a Frobozznitz in a surplus store, the dates on the chips indicate it was made in 1996, and I wanna find out what it was!

    I just hope that the buyer of the USENET archive gets the full source tree for their code, so they can go back and dump the ass-sucking "frames" look, the nonproportional text fonts, the goofy colors (ugly shit-beige on white!?!), the tracking URLs (www.deja.com/wewatch/whatlinksyouclick/thenweredirectyouto/http://www.eatatjoes.com/oldfrobs), the spammish URLs (http://www.frobcoscompetitor.com) inserted into USENET posters' posts, and all the marketing shite they added to Deja's code over the past 3 years.

    A USENET archive. Profitable. Kick ass.

  • In conclusion, I don't want Deja, and anyone who does want it will either be A) A zealot we admire but secretly resent; or B) A big businessman with a stupid business plan and no soul.

    Paul Allen! [paulallen.com]

    -jon
  • You did ask for them to archive your post by omitting the X- header which turns off archiving...

    FatPhil
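
For reference, the opt-out mentioned above is conventionally written as the header `X-No-Archive: yes`. A minimal sketch of how an archiver might honor it is below; the function name `may_archive` and the sample article are hypothetical, and this is not a description of how Deja's own software handled the header.

```python
from email import message_from_string


def may_archive(raw_article: str) -> bool:
    """Return False when the poster opted out with an X-No-Archive: yes header."""
    headers = message_from_string(raw_article)
    # Header names are matched case-insensitively by the email package.
    value = headers.get("X-No-Archive", "")
    return value.strip().lower() != "yes"


# Hypothetical article text used only to exercise the check.
opted_out = (
    "From: poster@example.com\n"
    "Newsgroups: comp.misc\n"
    "Subject: please do not keep this\n"
    "X-No-Archive: Yes\n"
    "\n"
    "Body text.\n"
)
keep = may_archive(opted_out)
```

The check only looks at real headers; a line reading "X-No-Archive: yes" inside the body would be ignored by this sketch.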
  • I've gotten sick of defending myself and my gender time and time again, but I'll do so one last time.

    Why do you do it at all? I for one didn't take notice of what you *wrote* in your signature, before you defended yourself against that troll. I don't care whether you're male or female. I don't care whether you're short, tall, good-looking, ugly, black, white, male, female or *whatever* as long as you write intelligent things.

    The only thing you should do when someone attacks you, is to *ignore the idiots*. Don't answer the obvious far too stupid trolls. They'll go away, eventually, hopefully. :)

    Until then, ignore'em. They're not worth your time. Intelligent people don't care about your gender when they discuss with you.


    --
  • The small glimmer of truth in what you say is that maybe some large non-commercial center should manage the archives. Where? MIT perhaps? Whatever. That would be great. However, every American tax payer would then be paying for it. So Europeans and South Africans and Australians and Japanese and Brasilians and ... would be getting the service for free.

    There is no such thing as a free lunch.

    FatPhil
    (A European who thinks that we Europeans should pay our part to keep the archives up and running)
  • I don't get it. I keep reading posts by /.'ers saying "they can't sell the database, it has my copyrighted work/code/pictures/whatever on it." or "They can't sell the database, it has my code on it, which is GPL'd, so they can't make money off it." I won't even discuss whether or not that's legal under the GPL--others have done a better job than I could do.

    What I will discuss is the hypocrisy. Never mind that they're not actually selling your content, they're selling their business. (Technicality, but absolutely true, they're not really making money off the value of your content, which is what copyright is designed to protect, they're making money off its existence. Yeah, I ain't a lawyer, so I might be totally clueless there. It doesn't matter. I've got a real point here.)

    How about the hypocrisy (boy, the tangents...)? Here we are arguing that Napster should be legal cause it's not violating copyright, it's "sharing," and then when a company that has merely archived posts that we knew were going out into the public domain puts itself up for sale, we start screaming about our copyright. For shame!

    Jeff

  • ... doesn't give you the right to whine. Bouh bouh you're a victim of the nasty anon. cowards. Hell, guess what, I browse at +1 and I don't see those. So stop whining, or fuck off.

    --

  • Hell yes I would rather. There are a lot of things I would rather have disappear than persist in another form.

    1. Your mother dies and leaves instructions that her body go to a local teaching hospital. But the hospital is sold to an HMO chain. Your mother's corpse is put on display in the lobby, floating in a vat of formaldehyde under a banner that warns about the dangers of autoerotic asphyxiation.
    2. 299 of your most intelligent, penetrating posts on slashdot are printed in a little bound volume and distributed by MicroSoft as "Thoughts of Chairman Bill".
    3. You find yourself having to make micropayments to look up your own usenet posts. You spent hundreds of hours helping out newbies for the love of the community; now someone is selling your words.

    It's all about context. Just because you do something for free does not mean you don't value it. And if the law does not recognize that, who's wrong: you, or the law?
  • I know you're being funny, but I'm sure that the TLAs have as complete of an archive of USENET, public mailing lists, web sites and the like as they can possibly get. For a lot of interesting (to a government) topics you could get all the intelligence you'd ever want with some clever search techniques and decent analysts.
  • If you were so worried about your words, why didn't you archive them yourself? I understand the point you're trying to make, but there is a finite cost involved with maintaining and providing access to the data in question. Why shouldn't somebody be allowed to charge for that service?
  • Not a problem. Deja could just re-locate their servers to Russia (or any other country which doesn't have copyright), and there's no problem. Your copyright would be unenforceable. Then the sale could occur overseas, and they could move the servers back to the US. It would be legal, and there's absolutely nothing whatsoever you could do to impose your overbearing intellectual property laws.
  • I bet the trawlers sell off munged email addresses in the list regardless, anyway.
    I have a pretty large collection of spam sent to my personal domains. One recurring destination address is a Message-ID from a USENET post somewhen in 1991.

    A curious point, for me, is the number of spam pieces sent to usernames that not only don't exist, but never existed. And not just easy guesses like "sales@...", but plausible-looking usernames. These addresses could not have been trawled. I haven't seen any of them repeat, so they're probably not on lists. But I still wonder what the utility of sending spam to an address guaranteed to bounce might be. Are they spamming the postmaster through the bounce log?

  • by Anonymous Coward on Monday October 16, 2000 @03:13PM (#701095)
    If you have a DSL or Cable modem, you can roll your own archive. Since Sept 99, I have archived about 500 newsgroups, including most of comp.*, microsoft.*, linux.*, and a few rec.*, etc. Since then, I have archived 8.2 million messages. Each article is individually saved in a separate file and compressed with gzip. This still consumes 13 gigs of space over 3 partitions. I do not archive any "binaries" groups, either.

    When I get new articles, I store the subject of the article in a MySQL database. I wrote an Apache/PHP front end for the system. One nice thing is that I can do a SQL select/like statement which gives me great flexibility when searching. Searching the entire database can be slow (50 secs). However, searching in an individual group (or a small subset) only takes a second or two on a celeron 450. Since my connection has a slow uplink, I send the requested file over the wire compressed with a gzip mime-type, and netscape and/or IE will decompress it on the client side.

    I like my implementation much better than deja's. The interface to deja was horrible, IMHO, and was one of the main reasons I decided to roll my own. I'd suggest archiving stuff you think you will never need. Back then, I didn't know that I would be running Sybase on Linux, but since I archived most of the comp.* groups, including comp.databases.sybase, I was able to use the information with relative ease.
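
The layout described in this comment (one gzip file per article, subjects indexed in SQL, LIKE queries optionally limited to one group) can be sketched as below. This is not the poster's code: it swaps in Python and the standard-library `sqlite3` module for the PHP/MySQL setup mentioned, and the table layout, file naming, and function names are all assumptions made for illustration.

```python
import gzip
import sqlite3
from pathlib import Path


def open_index(db_path):
    """Open (or create) the subject index; the schema is a guess, not the poster's."""
    db = sqlite3.connect(db_path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS articles ("
        "id INTEGER PRIMARY KEY, newsgroup TEXT, subject TEXT, path TEXT)"
    )
    return db


def store_article(db, root, newsgroup, subject, body):
    """Write one article to its own gzip file and index its subject."""
    cur = db.execute(
        "INSERT INTO articles (newsgroup, subject, path) VALUES (?, ?, '')",
        (newsgroup, subject),
    )
    article_id = cur.lastrowid
    path = Path(root) / newsgroup / f"{article_id}.gz"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(gzip.compress(body.encode("utf-8")))
    db.execute("UPDATE articles SET path = ? WHERE id = ?", (str(path), article_id))
    db.commit()
    return article_id


def search_subjects(db, pattern, newsgroup=None):
    """SQL LIKE search over subjects, optionally restricted to one group."""
    sql = "SELECT id, newsgroup, subject FROM articles WHERE subject LIKE ?"
    args = [pattern]
    if newsgroup is not None:
        sql += " AND newsgroup = ?"
        args.append(newsgroup)
    return db.execute(sql + " ORDER BY id", args).fetchall()


def read_article(db, article_id):
    """Look up the file for an article and return its decompressed text."""
    row = db.execute("SELECT path FROM articles WHERE id = ?", (article_id,)).fetchone()
    return gzip.decompress(Path(row[0]).read_bytes()).decode("utf-8")
```

A call such as `search_subjects(db, "%sybase%", newsgroup="comp.databases.sybase")` corresponds to the per-group search the poster found fast. A front end could also send the stored `.gz` bytes unchanged and let the browser decompress them, as the comment describes, though that part is not shown here.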
  • if they invalidate the search results so dejafilter no longer parses away the junk, whatever shall I do?[0]

    So, are they selling the entire archive? Or just submissions after May 15, 1999? :)

    [0] okay, rhetorical.. I'll hack the fscking perl source just like I did when they started putting those fscking little arrows all over the place, but still... *sigh*

  • Simply because a minority of this site's readership is zealously in support of the saying "information wants to be free" doesn't mean we all are.

    Recall when Slashdot wanted to publish a book involving our comments on Katz' Hellmouth series: around fifty people, including myself, were extremely vocal in our protests against it. You'll generally not see these same people supporting Napster. I can choose to use the GPL on a throwaway program. The program is *not* under the GPL by default, until I say otherwise. Likewise with my writings and public domain. The post you are reading is copyrighted. However unlikely I am to enforce it, I certainly could - and probably would, were someone else using it in a way that I found objectionable, without permission. (ie, publishing it on a different website, in print, etc)

    I'm not sure how long you've been reading this site, but the editorial opinion is not written into stone - perhaps the editors are fairly hypocritical, but the majority of the readership is not.

    --
  • It became wayyy too e-commercy and less of a nice, free usenet reader. I mean, any search you do gives you a dumb link on shopping for it.

    Example:

    "Find great deals on KILLING YOUR PARENTS on deja.com!" Endless fun.
  • Ignoring them is the best thing to do. No, it won't make them go away. There will always be trolls, here on Slashdot and everywhere else in the world. When the current crop of trolls grows up (most of them anyway), there will be more to take their place. Unfortunately, the stupid people far outnumber the intelligent.

    A troll only wants one thing, to be fed. Replying to them or their ignorant remarks only feeds them and makes them grow. Browse at +1 or +2 and don't worry about it.

  • Does anyone here actually think Deja went down in usability when they included the product review section? I don't.

    Granted, I'm not really interested in the older archives (which I'm sure they have on file).

    Still, I use it every day (particularly when I'm having driver issues or want fan reviews of my favorite games).

  • deja only keeps text posts (not binaries) so the data volume and the copyright issues aren't that big of a deal. You're still talking about many megabytes a day, but it's far short of the gigabytes flowing in binary groups...

    I'm an investigator. I followed a trail there.
    Q.Tell me what the trail was.
  • by Zach Baker ( 5303 ) <zach@zachbaker.com> on Monday October 16, 2000 @04:02PM (#701121) Homepage
    Usenet would be a very natural fit for Google's search-focused, keyword-focused ad placement site. While Google buying out part of Deja as a company is sketchy, I can see them buying out their Usenet archive separately and integrating Usenet with some of their expertise. Man, with Usenet on Google I'd never leave my chair^H^H^H^H^H^H^H^Htheir site.
  • by isaac ( 2852 ) on Monday October 16, 2000 @04:04PM (#701122)
    One can mark the decline of Dejanews by the decline in their USENET archives. First the formerly clean (and quick-loading) search interface became cluttered with other "portal" crap (altavista, anyone?), then the old posts went away ("temporarily"), then USENET searches were relegated to other pages, then the cutoff for old posts was 12 months, and finally they started parsing their usenet posts to add links to their product review databases (does anyone use those? apparently not...) in the bodies of the messages. Now they're on the block... boo hoo hoo.

    I hope the archives get bought by someone who wants to make a usefully complete, freely-searchable USENET archive (my wet dream: Google buys the archives), but I fear that they'll just be snapped up by a company like Lexis-Nexis, who'll happily take the publicly contributed works of thousands and resell access at kilobucks-a-year.

    -Isaac

  • by Anne Marie ( 239347 ) on Monday October 16, 2000 @04:12PM (#701124)
    Cable companies contribute to C-SPAN because they fear congressional legislation and they want to appear like nice citizens to Congress. Where's the incentive with Usenet? In fact, ISPs would be grateful if Usenet died for all sorts of reasons:

    There'd be one fewer source of headaches for them. Usenet contributes to the perpetuation of spam, and it's the ISPs' majordomos that have to clean up when their users get led astray.

    Many ISPs are trying their best to set up their own proprietary bulletin boards accessible through their own channels. Usenet is an unnecessary source of competition.

    Usenet is a big gaping sucking legal wound waiting to happen, what with all the copyright infringements, obscenity, and violations of the DMCA being tossed around. Any prudent ISP wary of tort suits should be wary of affiliating itself with such an anarchic beast.

    Usenet still, above all, requires enormous resources to maintain. Especially binary groups.

    In general, there has been a move away from Usenet and towards other fertile discussion forums within the last four years. I expect this trend to continue well into the next five years. Today, Usenet is nothing like what it was ten years ago. It'll be even less so, tomorrow.

  • It seems to me reasonable that one of the major public libraries (British Library, Bibliotheque Nationale, Library of Congress, whatever) should take on the archiving of Usenet.
  • by cybrthng ( 22291 ) on Tuesday October 17, 2000 @03:31AM (#701127) Homepage Journal
    Google should buy them up. Sell off the product side or not even pick it up, but offer the usenet archives along with their advanced web querying.

    Surely they can spare some of them 6,000 computers :).

  • So what you're saying is that you like hearing all the same questions asked over, and over, and over again?

    No, what I'm saying is that I don't use usenet anymore and I wouldn't pay a nickel for Deja's service...neither would anyone else, evidently.

  • just as I was getting used to the old deja email (which was NICELY spam-free), they dumped it and went with some totally clueless SPAM-BY-THE-BOATLOADS web-based email service.

    and it 'only' took them 3 weeks to restore the saved mail from the previous provider to the current one...

    it's to the point where, since it's no longer spam-free, the web-based email service is totally USELESS. sheesh - didn't they know that's what the draw was?

    I loved the fact that dejanews.com (I prefer the old name to the shortened deja.com) archived usenet. and the fact that you could opt-out of it (x-no-archive: yes). who knows what level of service the new owner will exhibit. probably less than what is currently offered, I fear.

    --

  • It's an insurance policy against good luck.

    Imagine you selected 6 lottery numbers, but didn't buy the ticket. Disaster strikes: your numbers come out. Too bad you're uninsured.
  • so, post the source, man. let us all enjoy your code.

    today, it's cheap to build a fast system (athlon 1GHz), add 10rpm discs and hardware raid (for speed/reliability), sprinkle with lotsa mem., and you have a local deja of your own.

    'cache the info out towards the edges' is what I always say. power to the edges! praise be to he who caches reliably and frequently.

    --

  • [sheesh, bad typing day]

    obviously a 10rpm drive wouldn't help much.

    sed s/rpm/Krpm/

    --

  • by MustardMan ( 52102 ) on Monday October 16, 2000 @02:42PM (#701156)
    I have the weirdest feeling I've seen this article somewhere before...
  • Throughout history, female authors have been denied recognition for their work, because it was commonly assumed that women were incapable of creating what they created.

    As a guy whose name (Jean-Michel) often causes USENET readers to mistakenly believe they are corresponding with a woman, I can relate in some small fashion. The behavior of some of the cretins on the internet (be it USENET, irc, or slashdot) is enough to make one ill.

    Nevertheless, I would implore you to consider the source. These are Anonymous Cowards, the keyword being Coward. Were they confronted with an actual woman showing any interest in their sorry existence whatsoever, they would almost certainly soil their pants with fear before stuttering something inane and descending hopelessly into a seizure of insecurity and general social cluelessness.

    You have no need to defend yourself or your gender. The offensive posts to which you refer speak for themselves and identify their posters as the penultimate losers of society, whose only chance at either a sexual or interpersonal relationship is limited to their right hand.

    I know it probably doesn't make you feel any better to read this, being the target of such puerile harassment, but it is nevertheless true: your value as a contributor is in no way diminished or tarnished by these idiots.

    And throughout history, women have been spat upon, threatened, battered, and gangraped by the same men you'll find here on slashdot.

    I know you're angry, but this comment is very unfair to the vast, vast majority of men on slashdot.

    Throughout history women have seduced, betrayed, and murdered men for cheap material gain, to grasp power, to avenge a wrong (real or imagined), or even out of simple spite and jealousy.

    The sexes have a long history of using and abusing one another, just as they have an equally long history of nurturing and sustaining one another.

    It would be as wrong for me to paint the women who post to slashdot as "gold digging cunts" (or some other equally offensive characterization) as it is for you to paint the men here as would-be rapists, batterers, etc.

    Put simply, some human beings are scum, irrespective of sex. Most are not. And while I don't blame you for being angry, please try to resist the very natural, human tendency to overgeneralize about an entire population of people based upon the behavior of a few misogynist losers who will almost certainly remain sexually frustrated for the duration of their small, pitiful lives.
  • My girlfriend and I both enjoy watching porn immensely. It hasn't harmed our relationship at all. Indeed, it has at times added some rather interesting spice to our lives.

    And yes, we're both quite addicted to sex. :-)

    For all we know, it may have been your friend's wife's intolerance of porn that ruined their relationship, not the husband's "addiction." Even if it wasn't, the notion that a few losers can't control themselves or are so narcissistic that they can only get off to pornography does not even remotely imply that such would be true for the rest of us, who enjoy healthy lives and relationships while enjoying a little hardcore from time to time.
  • Is anyone actually sure they can do this? I don't recall ever signing a waiver for my posts. As it is they arrive on Deja, but also on other news servers. I believe Deja could sell their software, but the database of posts should go without cost.


    --
    Chief Frog Inspector
  • It strikes me that IBM in particular could use it as a show piece for their technologies: DB2 (scalability, speed etc), their storage farms, search engine frontend etc.

    After all, that was the whole point of Altavista, back when it was still altavista.digital.com -- it was intended to show off DEC's hardware.

    At least Altavista is still useful for its original function. Sure, they've crowded the window with crappy flashing ads, and put all the keyword crap in (search for "flying buttmonkeys", and it'll give you a link to "Comparison shop for flying buttmonkeys"!), but it still works as a web search site.

  • Market value? Yeah, probably not. Usenet isn't there for market value; it's there to facilitate a huge meeting of the minds. And we need to preserve that information

    I wasn't saying usenet had no worth, I was saying that Deja's archives had little monetary worth. I'm glad you have found tidbits in there from time to time, but you aren't going to pony up $100 million for it either, are you??

  • by CoughDropAddict ( 40792 ) on Monday October 16, 2000 @03:50PM (#701174) Homepage
    I always wished I could read usenet postings that were really old, say 8+ years. The old-timers always talk about the glory days of usenet, and we always see references to the famous postings of "Larry Wall" on April Fool's day, when the concept of Perl Poetry was first seen, or of course, the famous Linus posting when Linux first met the world. Anyone know of a place where you can read really old messages like these?

    --
  • So let's assume there are a billion Usenet posts at an average of 10Kbytes each (being conservative) in Deja's archives. That's 10 terabytes. A lot of storage, eh? Not really. Not anymore. Spinning magnetic store runs around $5/gigabyte. The Deja archive would require 10K gigabytes. That's $50,000 in spinning magnetic storage. The monthly payments (at 12% interest) on that capital would be around 1% of the $50,000 principal, which works out to be $500/month.

    Get it all into a fault-tolerant RAID system with hot swap and quadruple the cost. You still have only $2000/month debt service.
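
    The arithmetic above can be checked with a few lines. This is only a restatement of the post's own figures; decimal units (1 GB = 10^9 bytes) and interest-only payments at 12% a year are assumptions, and the function and key names are made up.

```python
def archive_costs(posts=1_000_000_000, bytes_per_post=10_000,
                  dollars_per_gb=5, annual_rate=0.12, raid_multiplier=4):
    """Recompute the storage and debt-service figures from the post."""
    total_gb = posts * bytes_per_post / 1e9        # 10 TB expressed in GB
    disk_cost = total_gb * dollars_per_gb          # bare drives only
    monthly_rate = annual_rate / 12                # 1% a month, interest only
    return {
        "total_gb": total_gb,
        "disk_cost": disk_cost,
        "monthly_disk": disk_cost * monthly_rate,
        "raid_cost": disk_cost * raid_multiplier,  # "quadruple the cost"
        "monthly_raid": disk_cost * raid_multiplier * monthly_rate,
    }


if __name__ == "__main__":
    for name, value in archive_costs().items():
        print(f"{name}: {value:,.0f}")
```

    Changing `dollars_per_gb` or `bytes_per_post` shows how sensitive the $2000/month figure is to those two guesses.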

    That's less than a fast food worker makes in Silicon Valley these days.

    If your objection to these figures is that the traffic is so high that the costs are dominated by bandwidth, then the problem is you've got too much business -- a terrible problem with which you must learn to cope.

    As the Buddha says: Life is suffering.

  • by pouwelse ( 118316 ) on Monday October 16, 2000 @04:40PM (#701182) Homepage

    Creating an open network of Deja.com-like servers is my dream, and I already have stable running code for it...

    The UsenetWeb project [sourceforge.net] at SourceForge.net is an Open Source implementation of Deja.com.

    Currently the software is stand-alone, but it could be expanded to form a network of OpenContent deja.com-like servers. With newsgroups shared across several servers, it could become a volunteers-only job... The software only supports text-only newsgroups for now.

    Are there people here who would like to run this software and build an Open Usenet Network?

    See a demonstration of Open Source deja.com at Usenet4free.com [usenet4free.com]

    Johan.

  • by Chiasmus_ ( 171285 ) on Monday October 16, 2000 @02:45PM (#701187) Journal
    I was just wondering to myself, "If I had $500,000,000 in my wallet, how much would I spend on the Deja archives?"

    I came to the conclusion that I would spend nothing at all. Why? Because I feel that ideally, these archives should be free to all and any attempt to charge for access would be somehow wrong. These are ideas in their purest form, when they were just first beginning to be transferred into digital format en masse. This stuff belongs in a museum, not a pay site.

    On the other hand.. maintaining such a behemoth for no profit would suck, and would take someone far more idealistic than me.

    In conclusion, I don't want Deja, and anyone who does want it will either be A) A zealot we admire but secretly resent; or B) A big businessman with a stupid business plan and no soul.
  • They can sell my words for all I care if they find someone dumb enough to buy usenet posts :) they just can't claim they wrote 'em- or change them and claim I wrote them.

    If I remember correctly they've flirted with doing the latter already- in that they are putting hotlinks in stuff as if I, in writing 'connect w box to x box and then to y and z boxes', had added a link like 'connect w box to x box [null.com] and then to y and z boxes'.

  • You are forgetting an important part. You don't just have to archive the posts. You also have to provide efficient search capabilities.

    Grasshopper, when I made my little comment about "bandwidth" and the Buddha, I did neglect to mention that bandwidth includes everything that scales with usage: internal CPU to memory bandwidth and disk bandwidth as well as wide area network bandwidth. The point is that the incremental costs of maintaining the pre-1999 archive for online access are minuscule compared to the rest of Deja's other expenses, yet Deja has used the costs of providing the earlier archives as the reason they are offline. The CEO even admitted they get hardly any incremental increase in bandwidth utilization (see definition above) from the older archives. So, since the incremental cost of simply having them online is so low, how can the CEO use this reason? Where are the real incremental costs of keeping the older archives coming from?

    Oh, also, I did quadruple the $50,000 to $200,000 for interfaces, etc.

  • Ok, when I signed up to deja, I agreed to its privacy policy presented to me at that time.

    Now that deja will be owned by someone else and that includes its 'customer' database, what happens to my personal information and yours?

    Anyone feel like playing a lawyer on /. today?
  • The kind of archive site I would like to see would be in the form of read-only nntp servers. Then people could access them with whatever newsreader interface they liked. To keep there from being too many articles at once for a newsreader to keep up with, call the servers news1991.deja.com, news1992.deja.com, or whatever, and stick just content from the one year on them. Or maybe use one server and rename each group 1991.news.group.name, or news.group.name.1991, etc.

    It'd be a damn sight easier to use than what Deja has now.
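
    The three naming schemes suggested above are easy to sketch. The hostnames are the hypothetical ones from this post, the helper names are made up, and the year is assumed to come from a Date header that Python's email parser can read (very old articles may use formats it cannot).

```python
from email.utils import parsedate_to_datetime


def year_of(date_header):
    """Year an article was posted, taken from its Date header."""
    return parsedate_to_datetime(date_header).year


def archive_host(year, domain="deja.com"):
    """One read-only NNTP server per year, e.g. news1991.deja.com."""
    return f"news{year}.{domain}"


def prefixed_group(group, year):
    """Single-server variant: 1991.news.group.name."""
    return f"{year}.{group}"


def suffixed_group(group, year):
    """Single-server variant: news.group.name.1991."""
    return f"{group}.{year}"
```

    For example, an article dated "Mon, 16 Oct 2000 16:02:00 -0400" in comp.lang.perl would be filed on news2000.deja.com, or under 2000.comp.lang.perl on a single server.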
    --

  • Hmmm.... this might be difficult - Google depends upon hyperlinks for its technology to work. There are no hyperlinks in Usenet. You could use the number of replies as a proxy, but then searches would give successful trolls high scores. Also, you lose a lot of info because a post that gets many replies can itself only be a reply to one post - whereas on the web, a page that is heavily linked to can link to many other sites.
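
    A minimal sketch of the reply-count proxy mentioned above, assuming each article's References header has already been parsed into a list of Message-IDs, oldest first, so that the last entry is the direct parent. The function name and the input shape are invented for illustration, and this is not how Google ranks anything.

```python
from collections import Counter


def direct_reply_counts(articles):
    """Count direct replies per Message-ID.

    `articles` is an iterable of (message_id, references) pairs, where
    `references` is the parsed References header, oldest ancestor first.
    """
    counts = Counter()
    for _message_id, references in articles:
        if references:
            # Each post credits exactly one parent: the asymmetry with
            # web pages, which can link out to many other pages.
            counts[references[-1]] += 1
    return counts


if __name__ == "__main__":
    thread = [
        ("<a>", []),
        ("<b>", ["<a>"]),
        ("<c>", ["<a>"]),
        ("<d>", ["<a>", "<b>"]),
    ]
    for message_id, replies in direct_reply_counts(thread).most_common():
        print(message_id, replies)
```

    Ranking by these counts would reward whatever provokes the most replies, which is exactly the troll problem raised above.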
  • If you can't write a program that takes out NOSPAM, you aren't a real programmer.

    And I suppose being able to do that makes you a real programmer? Wow and all this time here I thought I was just an average college level programmer who slapped together a few numerical computations and simulations for computational physics courses. Now, I see, 3ye 4m 4n 31337 h4x0r... excuse me while i r3w7 j00r b0x0rz!

    heh :)
  • > A curious point, for me, is the number of spam pieces sent to usernames that not only don't exist, but never existed [ ... ] plausible-looking usernames.

    Consult (deja.com :-) in news.admin.net-abuse.email for "dictionary attack".

    Both chickenboner ("guy in a trailer park") and mainsleaze (i.e. "legitimate" - at least, companies that *pretend* to be legit) marketroids are spamming any plausible username at any SMTP server they can get their hands on.

    The goal is, as your logs show, to cram spam down the throats of users who've never even used email. After all, the spammer just ignores the bounce, and your box has to consume bandwidth dealing with it. It's no skin off the spammer's nose if your box dies from the load.

    If you're seeing this, report it as a DoS attack. Because frankly, that's what it is.

  • by jimhill ( 7277 ) on Monday October 16, 2000 @02:52PM (#701219) Homepage
    For a couple of years now I've been arguing that ISPs ought to kick in for the maintenance of a Usenet archive in much the same manner that cable companies kick in to cover C-SPAN's costs. While some newsgroups are higher content than others, there's no denying that there is an absolute treasure-trove of information that passes through news servers every day. Since the entire online community benefits, the entire online community ought to pay to maintain an archive.

    Just today I saw an online article saying that over half the households in the US are online in some capacity. According to the Census Bureau, that means around 50 million households are online. A buck a month per customer routed through ISPs and you're looking at six hundred million dollars a year -- enough to cover an archive without even asking the rest of the world to kick in. We could pay for it ourselves as a token gesture of reconciliation for "Americanizing" the rest of the net through brute force.

    You run into the issue of censorship almost before the proposal hits paper. For every newsgroup there will almost certainly be someone or many someones who wants the content sifted or outright not in the archive. Beating these people into submission so that they will be silent forever will be difficult.

    Just a notion, make of it what you will. I'm sure there's a vast array of technical issues that would have to be worked out up front, but I'm absolutely convinced that this could work. Further, I think this is the only way a Usenet archive _can_ work (barring some well-funded philanthropic gesture from a dead billionaire).

    Comments?
  • The value of a web article, news story, or usenet post is inversely proportional to the square of its age in days.

    Or something like that.

    I highly doubt that anyone will want to pay for an archive of usenet postings. Frankly, they are of limited use - most post threads offer very little useful information.

    Deja's archives may be of interest to an educational institution looking at the historical value of the posts, but the useful market value of the posts is zilch.

    As for Deja as a product review site - what can you say? It lost the race.

    Epinions, Yahoo, and Amazon's product reviews are far out ahead, and Deja never really made a meaningful transition from being a usenet archive.

    The bigger question is whether NNTP is kaput altogether at this stage.

  • Yes, yes, yes... there are companies like EMC out there who have customers that will pay enormous amounts of money for "storage systems". I'm sure they'll take even more than $8 million from them if they'd give it to them. For that kind of money they might even be willing to stroke the customer ego by convincing them it really has to cost that much and that they are nobody's fool. Lots of technically illiterate suits have that kind of ego need as a consequence of posing as technical studs with their capital sources, which is a big market segment for companies like EMC, if not its biggest one.

    You ignored my comment about bandwidth which includes everything that scales with usage -- not storage. Furthermore, you say "no way this will cost 50k" when I quadrupled "the simple cost of the drives" to $200,000 to get the drives into an interface with some redundancy for hot-swap.

    As you, yourself, admit, inclusion of organizational costs is the cost of a from-scratch startup, not "the cost of maintaining an archive".

    When you want to go high availability (presumably when you want to go high performance), you may as well just duplicate the system in geographically remote locations -- this gets back to the scalability/bandwidth point I already made. The "transaction journal" for such a system is simply the Usenet feed. You don't need Oracle or even want it since it doesn't do the right kind of indexing anyway. The only backup you need is another server added to provide bandwidth. Recovery takes time, of course, even with a T3 retransmitting the archive from the redundant system(s), but it isn't too bad. At only 1Mbyte/sec you can recover a fully redundant nuked site in 4 months. You don't need or want a system like Veritas for high availability. When one system goes down, the other(s) keep(s) serving and feeds the log to the one being brought back online. This isn't up-front capital.
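
    The four-month recovery figure follows from simple division, assuming the 10 terabyte archive size estimated earlier in this thread and decimal units (1 Mbyte = 10^6 bytes). The function name is made up.

```python
SECONDS_PER_DAY = 86_400


def recovery_days(archive_bytes, bytes_per_second):
    """Days needed to retransmit a whole archive at a steady rate."""
    return archive_bytes / bytes_per_second / SECONDS_PER_DAY


if __name__ == "__main__":
    # 10 TB at 1 Mbyte/sec, the rate quoted above.
    print(f"{recovery_days(10e12, 1e6):.1f} days")
```

    At 1 Mbyte/sec that comes to a little under 116 days, which is where "4 months" comes from.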

    At the limit, this gets to the real solution to the Usenet archiving problem:

    Peer-to-peer redundant archiving.

    The main problem to be solved with peer-to-peer redundant archiving is query optimization, decomposition and routing within a distributed redundant index. Know of any good work in this area?

  • Personally, I think that the Usenet archive could be very attractive to an ambitious "P2P" company [mojonation.com] that wants to show that they can re-de-centralize USENET archives while maintaining searchability. Volunteers might be brought in to help host the millions of posts themselves.

    Distributed full-text searching, possibly with some sort of centralized assistance, but truly distributed access would make for a pretty mighty technology demo. I wonder if they're up to the task.
