Google To Create "Blog" Search; Potentially Remove From Main

Skyshadow writes "Google, search engine of choice for pretty much everyone, has announced that it will begin a separate index for blogs and remove them from the normal index, handling them instead in much the same way as their usenet archives. This will hopefully put an end to the recent difficulties locating primary source material among the mountains of blogs which are clogging the ratings system." There have been comments from elsewhere saying they won't be removing them, but that remains to be seen.
This discussion has been archived. No new comments can be posted.

  • journals (Score:5, Interesting)

    by asv108 ( 141455 ) * <asv@nOspam.ivoss.com> on Monday May 12, 2003 @11:04AM (#5936486) Homepage Journal
    Will /. journals be included in this?

    Is there any chance of having an RSS feature for journals, for everyone or even just subscribers?

  • 'Bout time (Score:4, Interesting)

    by Surak ( 18578 ) * <surakNO@SPAMmailblocks.com> on Monday May 12, 2003 @11:04AM (#5936494) Homepage Journal
    I, for one, am sick of searching material only to find that the page is some asshat's blog. Nothing against blogs, but you never know where this material came from.

    OTOH, what constitutes a 'blog'? Is Slashdot a blog? Is this a blog [witchvox.com]? The lines are constantly being blurred, and I'm not sure it'll be easy for google to make that distinction.
  • Re:journals (Score:5, Interesting)

    by jawtheshark ( 198669 ) * <{moc.krahsehtwaj} {ta} {todhsals}> on Monday May 12, 2003 @11:05AM (#5936499) Homepage Journal
    No... Check robots.txt [slashdot.org]
  • Good to weed out.... (Score:5, Interesting)

    by caffeinex36 ( 608768 ) on Monday May 12, 2003 @11:06AM (#5936519)
    Most of the useless information people put into blogs. Although, when you search for information, would you want to search 2 different locations? This is the whole claim to Google's fame. I have found that many times people post how-tos in their blogs along with other information.


    If it ain't broke...don't fix it

    -Rob
  • Re:journals (Score:3, Interesting)

    by jawtheshark ( 198669 ) * <{moc.krahsehtwaj} {ta} {todhsals}> on Monday May 12, 2003 @11:08AM (#5936527) Homepage Journal
    Oh, wait.... It says "User-agent: Mediapartners-Google*" can scan everything. This surprises me however. Still, that's not "GoogleBot", which I see from time to time in my apache logs.
    Anybody got an idea what "Mediapartners-Google*" exactly is?
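For readers wondering how per-crawler rules like the one described above behave, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt text, the `/journal/` path, and the `example.org` URL are made up for illustration and are not Slashdot's actual file; the user-agent line is also written without the trailing `*` quoted in the comment, since this parser matches agent names as plain substrings.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt, NOT Slashdot's real file: one named crawler is
# allowed everywhere, every other crawler is kept out of /journal/.
ROBOTS_TXT = """\
User-agent: Mediapartners-Google
Disallow:

User-agent: *
Disallow: /journal/
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text held in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
journal_url = "http://example.org/journal/123"  # placeholder URL

# An empty "Disallow:" means "nothing is disallowed" for that crawler.
media_ok = parser.can_fetch("Mediapartners-Google", journal_url)
# Any other crawler falls through to the "User-agent: *" rules.
googlebot_ok = parser.can_fetch("Googlebot", journal_url)

print("Mediapartners-Google may fetch journal:", media_ok)
print("Googlebot may fetch journal:", googlebot_ok)
```

Under these assumed rules the two crawlers get different answers for the same URL, which would explain seeing one crawler name granted full access while the regular crawler is still excluded from journals.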
  • ID? (Score:2, Interesting)

    by Acidic_Diarrhea ( 641390 ) on Monday May 12, 2003 @11:09AM (#5936534) Homepage Journal
    How are blogs being identified as opposed to non-blog pages? I can see how newsgroups could be moved to a separate search but blogs aren't easily identifiable. Will Google rely on bloggers to identify their sites to Google? I suppose that could work, as the article states that bloggers want to legitimize what they do through a move such as the one Google is considering.

    I also like the analogy made by the article to the voting system where a page votes for a topic: an expert site on turtles voting for turtles once a day every year vs. a blog mentioning turtles once in that same period leads to the expert site winning.

  • by Snowhare ( 263311 ) on Monday May 12, 2003 @11:10AM (#5936538)
    I wonder if this is also intended to stop Googlewashing [theregister.co.uk]? Google has a history of trying to 'play fair' - and the power of a few well connected blogs to basically 'take possession' of any term works against that philosophy.
  • /. is a blog, no? (Score:3, Interesting)

    by Eponymous Coward ( 6097 ) on Monday May 12, 2003 @11:13AM (#5936559)
    Am I the only one who thinks it is funny to see all the anti-blog comments every time a weblog-related story is posted? IMHO, Slashdot is a weblog.

    I think I originally found Slashdot on RobotWisdom-- yet another weblog. But that was a couple of years ago...

  • blogs (Score:5, Interesting)

    by Blocked By Sand ( 623943 ) on Monday May 12, 2003 @11:13AM (#5936563)
    One of the biggest newspapers in Norway, where I live, has recently said they believe blogs to be the new 'killer app' for delivering information on the net. The problem with that is that the threshold for publishing 'news' is so low, anybody can do it. This makes it very difficult for people to find the info they are looking for. At the same time there is no guarantee the info is useful or even correct. A good reputation will be more and more important for businesses and sites on the net.

    This move by Google tells me newspapers in Norway aren't the only ones seeing how influential blogs will/could become. This is a truly great step forward if Google could come up with a way of rating the different blogs. That way you could easily find serious tech-blogs.

    Wonder what rating /. would get though ;)
  • is ./ a blog? ebay? (Score:5, Interesting)

    by loomis ( 141922 ) on Monday May 12, 2003 @11:15AM (#5936575)
    As a previous poster briefly mentioned, what exactly is a blog? Would Slashdot forums be considered a blog? What about the myriad of ezboard message board forums out there, as well as other discussion websites? If the answer is no, it would be seemingly difficult and perhaps only of minor benefit to separate just the true "blog" sites while ignoring the other sites.

    And what about ebay? Quite often I am searching for info on an old piece of electronics I've picked up someplace, and I do a Google search, hoping to find information about the item. Well, all I get in return are ebay links to a similar item that was sold on ebay a few months ago. And even then, I click on the link, hoping to see what the item sold for (and thus get an appraisal), but the auction has been removed from the database due to it being several months old. Why index ebay pages? It's really frustrating.

    Loomis
  • Re:blogs.google.com? (Score:5, Interesting)

    by GT_Alias ( 551463 ) on Monday May 12, 2003 @11:16AM (#5936584)
    Which is why Google is not eliminating them entirely, just moving them over to their own search.

    It's a reasonable solution, I think. Is it worth tainting the vast majority of the search results with useless blog entries just so that the (very) few blogs with good information will still show up?

    This solves their problem with bloggers manipulating search results, yet still keeps the information available to those who want it. Granted, you have to know to look for it, but it seems to me like a fair trade-off.

  • by Thoguth ( 203384 ) * on Monday May 12, 2003 @11:16AM (#5936585) Homepage
    I really don't mind finding blog links when I search for something, as they usually at least link to some relevant sources.

    On the other hand, it is really a pain to search for help on something, and instead of getting a useful, authoritative document, I'll get a half-dozen archived unanswered mailing list posts from people with the same problem. I would much rather Google address this dilution from mailing lists.
  • by Anonymous Coward on Monday May 12, 2003 @11:16AM (#5936591)
    What exactly is a Blog? Can anyone answer? What makes it different from a personal web site full of links and comments, such as has existed on the Web for more than 5 years? What makes it something "new"?
  • by acarr0 ( 652849 ) on Monday May 12, 2003 @11:17AM (#5936598)

    The general consensus appears to view this tabbed filtering as a good thing. There are some valid concerns about missing out on good information as a result. Naturally one can go to the "Blog tab" to conduct a search but most people will likely tend not to do this.

    It seems to me that this may be an opportunity for Google to improve upon their user interface a bit. Since most folks use the simple interface provided on the main page, adding a few check boxes just below the text box would be a good idea. That would allow for the quick addition of groups and/or blogs to your search query.

  • by borkus ( 179118 ) on Monday May 12, 2003 @11:17AM (#5936600) Homepage
    One of the reasons that Google is filtering out blogs is "link-backs". These create huge rings of circular links, distorting page rankings. Because of this, I wonder if Google is going to look for template elements in the HTML of a page to determine if it is a blog. Or maybe anything that has a Radio Userland, Blogger, Gray Matter or Movable Type medallion on it will be ignored.
  • Re:'Bout time (Score:5, Interesting)

    by EinarH ( 583836 ) on Monday May 12, 2003 @11:20AM (#5936620) Journal
    The other day I searched Google for some radio stuff. (helping my father find some equipment).

    Then I noticed that Radio Userland appeared very high on Google. In fact, when you search for "radio"* they get a #5 at Google. As far as I know they have only existed for a year. And their popularity, as it appears on Google, looks very inflated because of extremely many links in blogs.

    Checked out Daypop.com, which ranks articles/links based on the number of links in blogs. This is what I got:
    Searching All Weblogs for link:radio.userland.com... Found 3260 pages matching query.

    That's insane. When so many blogs link to the same page, its ranking on Google gets very high based only on blog popularity.


    *Searching for only radio is obviously a bad idea, as Google returns some 40 million hits.
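The inflation effect described above can be illustrated with a toy link-based ranking. This is a minimal sketch of a simplified, textbook-style PageRank by power iteration, not Google's actual algorithm; the page names, the damping value of 0.85, and the graph itself are all assumptions made up for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    links maps each page name to the list of pages it links to.
    Returns a dict of page name -> rank, with ranks summing to 1.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly among its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Toy graph (hypothetical names): 20 blogs form a ring and all link to the
# same tool page; a single reference site links to a radio guide.
links = {f"blog{i}": ["blog_tool", f"blog{(i + 1) % 20}"] for i in range(20)}
links["reference_site"] = ["radio_guide"]
links["blog_tool"] = []
links["radio_guide"] = []

ranks = pagerank(links)
print("blog_tool  :", round(ranks["blog_tool"], 4))
print("radio_guide:", round(ranks["radio_guide"], 4))
```

In this toy graph the page that every blog links to ends up ranked well above the page with a single inbound link, purely because of the number of blogs pointing at it.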

  • by Zathrus ( 232140 ) on Monday May 12, 2003 @11:21AM (#5936632) Homepage
    If you know how to do serious web searches via Google then you're already searching at least 2 locations - the main Google search and the Google Groups search. You may also search Google News separately (although the info from there is usually in the main search as well).

    I'm looking forward to this, since most of the stuff Google hits in blogs is completely and utterly irrelevant to what I'm actually trying to find. Google will probably just have another tab to click on, or perhaps a few top links to blog-specific searches if they think it's relevant (like they do with cross links to Google News searches currently). Perhaps even a configurable "Include Blogs" on the preferences page. Whatever, I don't care, just let me exclude the damn things.

    If I don't get what I'm looking for in regular search then I may go search Blogs as well. After newsgroups.
  • Bad Idea (Score:5, Interesting)

    by rwiedower ( 572254 ) on Monday May 12, 2003 @11:22AM (#5936637) Homepage
    I work at a company [peyser.com] that has a blog-like recap [peyser.com] of political news of interest for our clients and friends. If google tries to separate all sites with blog-like content, won't this naturally reduce my rank without actually increasing the source of information? Or am I missing something? How is google going to search for blog-like sites?
  • Great idea. (Score:3, Interesting)

    by Musashi Miyamoto ( 662091 ) on Monday May 12, 2003 @11:25AM (#5936666)
    I love this idea... and I have been waiting for something like it for some time...

    Think about it... I would love to search the blogosphere to see how widespread certain news items have become, or how widespread a certain opinion is...

    You could use something like this to measure the spread of ideas (at least within a vocal and technologically suave minority).

  • by Bartmoss ( 16109 ) on Monday May 12, 2003 @11:31AM (#5936718) Homepage Journal
    Alright, fair enough - but how do you identify a weblog? They can do this for blogger/blogspot/whatever that they bought, and maybe standard software like Movable Type etc. But what about sites based on slash, phpnuke or totally custom code? And where does a weblog begin and a news site end?

    Filtering out usenet news is relatively easy, but weblogs? Mhhh, I shall remain sceptical until I see it implemented.
  • Re:Great! (Score:4, Interesting)

    by Joe the Lesser ( 533425 ) on Monday May 12, 2003 @11:36AM (#5936756) Homepage Journal
    Somehow I can't drop the feeling that this will be very similar to a spam filter...
  • Re:/. is a blog, no? (Score:5, Interesting)

    by RobotRunAmok ( 595286 ) * on Monday May 12, 2003 @11:40AM (#5936790)
    /. is a blog, no?

    No. SlashDot aggregates news stories. It's the Web generation of what the BBS guys had in CompuServe Forums and GEnie Roundtables. The staff is paid to aggregate and thread stories that are of interest to a particular community. (Sometimes they aggregate the really, really good ones more than once.) Technically, SlashDot staff don't submit the stories, members of the community do. Bottom line: it's a professional operation. (g'head, g'head, make the jokes, it's Monday, get 'em outta yer system...)

    Personally, I would use the litmus test of "professionalism" when doping out what is a blog versus what is "legitimate" content. If the "blogger" makes his living as a writer or journalist, then the blog is "supplemental online material." If the site is, as we referred to the vanity publishing phenomenon back in the early '90's, someone's "homepage," but with the added baggage of semi-regular diary entries, then it's a Blog.

    Use of "blogging software" doesn't make someone a writer, or a journalist, and it certainly doesn't automatically grant its user something worth saying, or even something factual to say.

    It's great to see Google realizing this and clamping down.
  • Re:journals (Score:5, Interesting)

    by cygnusx ( 193092 ) on Monday May 12, 2003 @11:45AM (#5936826)
    > Those text ads were quite tricky to filter out

    You're entitled to block them if you wish, of course, but if the ads don't consume too many bits, and bring the site-owner some moolah, and don't interfere with your browsing, how does blocking text ads help?

    Knee-jerk ad-blocking will only kill free content on the net, imho.
  • by Jace of Fuse! ( 72042 ) on Monday May 12, 2003 @11:45AM (#5936829) Homepage
    That's a great question. Does a site with "News and Commentary" fit in the blog category if only one or two people write it?

    What if it looks like a blog, but has nothing but on-topic posts (whatever the news-site's topic may be)? It has too many opinion spots, though, so it can't really be purely news. Does the fact that it's about a subject, and not some person mean it's no longer a blog?

    The line between Blog-NotBlog is so fuzzy at times, I don't see how they can fairly make a distinction.

    After all, in a way, Slashdot is just a blog for the editors. Certainly some people would consider my sites blogs.
  • by davids-world.com ( 551216 ) on Monday May 12, 2003 @11:54AM (#5936903) Homepage
    Categorization algorithms that combine different features would work quite well here, I believe!

    There is a wealth of categorization systems out there. Generally, they "position" the sites in an imaginary, high-dimensional space, depending on whether keywords occur (and how often/prominently etc.), and on certain structural properties of the documents. You can then try to define separating hyperplanes, which are functions that divide the ("feature") space into separate compartments, so you can group documents together.

    Usually, these systems are trained on a set of sample documents that are already categorized, in this case, for instance, a thousand blog pages and ten thousand non-blog pages.

    An example of this would be Support Vector Machines [kernel-machines.org] and Joachims' text classification algorithm.

    Relevant keywords (from the field) to look for include "Maximum Entropy Models", "classifiers", "categorization", "Bayesian *" (whatever), "Neural Network Classifiers", "Data Mining"...
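The idea of positioning documents in a feature space and learning a separating hyperplane can be sketched in a few lines. Note that this is a plain perceptron over keyword counts, a much simpler stand-in for the Support Vector Machines mentioned above, and the vocabulary and training texts are invented for illustration only.

```python
from collections import Counter


def featurize(text, vocab):
    """Map a document to a vector of keyword counts (one axis per vocab word)."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocab]


def score(x, weights, bias):
    """Signed position of the point x relative to the hyperplane w.x + b = 0."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def train_perceptron(samples, vocab, epochs=20):
    """Learn a separating hyperplane; label +1 means blog, -1 means not a blog."""
    weights = [0.0] * len(vocab)
    bias = 0.0
    for _ in range(epochs):
        for text, label in samples:
            x = featurize(text, vocab)
            if label * score(x, weights, bias) <= 0:
                # Misclassified: nudge the hyperplane toward this example.
                weights = [w + label * xi for w, xi in zip(weights, x)]
                bias += label
    return weights, bias


def predict(text, vocab, weights, bias):
    return 1 if score(featurize(text, vocab), weights, bias) > 0 else -1


# Hypothetical vocabulary and hand-labelled training pages.
vocab = ["posted", "comments", "today", "manual", "specification"]
samples = [
    ("posted by me today comments trackback", 1),
    ("my day today posted comments", 1),
    ("reference manual chapter index specification", -1),
    ("product specification manual download", -1),
]

weights, bias = train_perceptron(samples, vocab)
blog_guess = predict("posted today with comments open", vocab, weights, bias)
doc_guess = predict("installation manual and specification", vocab, weights, bias)
print("blog-like page ->", blog_guess)
print("reference page ->", doc_guess)
```

A real system would use far more features (including the structural properties mentioned above) and far more training data; the sketch only shows the mechanics of training on labelled examples and then classifying unseen pages.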

  • Re:ID? (Score:3, Interesting)

    by nycroft ( 653728 ) on Monday May 12, 2003 @11:55AM (#5936909) Homepage
    Probably by the source code located in the HTML template. For example, Blogger's code has to include case-sensitive tags like [Blogger][/Blogger] to format the web-based posting. I'm not sure how they would tell for other types like Blog*Spot or Movable Type. I assume they have similar tags. Or maybe by noting server applets related to the HTML template.

    Oh yeah, one more thing, Google bought Blogger, so that's another way they'll be able to tell.
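A template-based check along the lines suggested above might look like the following sketch. It assumes a page advertises its publishing tool either in a `<meta name="generator">` tag or in a "Powered by ..." badge; the product names are taken from this thread, but the exact strings any real page contains are an assumption, and nothing here reflects how Google actually identifies blogs.

```python
import re

# Illustrative hints only: product names mentioned in this discussion.
GENERATOR_HINTS = ("blogger", "movable type", "radio userland")

# Matches e.g. <meta name="generator" content="Some Tool 1.0">
META_GENERATOR = re.compile(
    r'<meta\s+name=["\']generator["\']\s+content=["\']([^"\']*)["\']',
    re.IGNORECASE,
)


def looks_like_blog(html: str) -> bool:
    """Heuristic: does the page's template mention a known blogging tool?"""
    match = META_GENERATOR.search(html)
    if match and any(hint in match.group(1).lower() for hint in GENERATOR_HINTS):
        return True
    lowered = html.lower()
    return any(f"powered by {hint}" in lowered for hint in GENERATOR_HINTS)


# Made-up sample pages.
tagged_page = '<html><head><meta name="generator" content="Movable Type 2.6"></head></html>'
badge_page = "<html><body><p>Powered by Blogger</p></body></html>"
plain_page = "<html><body><h1>Turtle care guide</h1></body></html>"

for name, page in [("tagged", tagged_page), ("badge", badge_page), ("plain", plain_page)]:
    print(name, "->", looks_like_blog(page))
```

As other posters point out, a check like this misses custom-built sites entirely and is trivially defeated by removing the marker, so at best it would be one signal among several.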
  • Re:Personally.. (Score:2, Interesting)

    by Fishstick ( 150821 ) on Monday May 12, 2003 @12:00PM (#5936955) Journal
    Hmm, I'm hoping the results are excluded, and blog is a "tab" just like the web, images, groups, directory, news are now.

    I've found this mechanism to be really effective in helping me find what I want.

    I use the google toolbar - this defaults to a 'web' search. 95% of the time what I'm looking for comes up on the first page. If not, I can click on the 'groups' tab, where my search is repeated (like when I'm trying to figure out an error message or somesuch).

    If the thing I'm looking for is a business, or a product or something likely to be listed, then the 'directory' tab will give me good results.

    Having a 'blog' tab (and keeping the results out of the main web results), seems like a good arrangement to me. Most of the time I'm not interested in results from blogs, and it doesn't seem too much extra work to just click one more time on the main results page to repeat the search in a blog-specific area.

    I've found some of the best information on blogs.

    I think it depends on the kind of info you are searching for. In my experience, most of the blog results aren't helpful. I've wanted a way to filter them out (usually putting in -comments -posted or similar helps).
  • ephemeral content (Score:4, Interesting)

    by esme ( 17526 ) on Monday May 12, 2003 @12:14PM (#5937047) Homepage
    i don't know that i have any particular need to have blogs filtered out of the google index (i don't see them very often in the searches i do...).

    but filtering out ephemeral content in general would be good -- blogs would be included in this. so would mailing list archives, news stories, online stores, auctions, discussion groups, etc.

    when i'm searching, i almost always prefer a page that somebody authored and put up as a permanent resource (or as permanent as the web allows). the top-level pages of the ephemeral sites would probably be good to keep in the main index, though i'm not sure how you index, e.g., the /. homepage.

    -esme

  • Re:blogs.google.com? (Score:3, Interesting)

    by Sethb ( 9355 ) <bokelman@outlook.com> on Monday May 12, 2003 @12:20PM (#5937097)
    I don't know, my blog has some very useful information that Google serves out to a lot of people needing help, for instance, this page [editthispage.com] is a lifesaver when you hose your Win2000 install using Easy CD Creator, and a lot of people still e-mail me, 2 years later, to thank me for writing it up.
  • Re:journals (Score:3, Interesting)

    by AndroidCat ( 229562 ) on Monday May 12, 2003 @12:24PM (#5937129) Homepage
    There Ain't No Such Thing As A Free Lunch. Google provides an excellent free service, and uses relevant text-only ads to pay for it. I look at most web sites as a package deal. If their ads are too much of a PITA, I tend to avoid the site.

    Ah well, your option. Some people do find ads matched to the search to be a useful feature.

  • Re:journals (Score:2, Interesting)

    by jonfelder ( 669529 ) on Monday May 12, 2003 @12:48PM (#5937302)
    Why did this get moderated up? With respect to Google, how else do you expect them to make money? Would you rather they charged you per search instead? In many cases ads are annoying; however, Google's are about as unintrusive as they get.

    In Google's case, I'd say the service is worth the slight inconvenience of the ads.
  • Re:Ummm... no (Score:2, Interesting)

    by neurostar ( 578917 ) <neurostarNO@SPAMprivon.com> on Monday May 12, 2003 @01:05PM (#5937418)

    Well, I guess I shouldn't have specified live journal. My guess would be that easy publishing websites (blogger, live journal, etc...) are more often (but not always) used by people who just want an online journal. Also, the name "live journal" implies that it's a journal, not a blog.

    In fact, my first 'blog' was hosted on blogger. It was mostly a journal. Then I switched to hosting it myself with more advanced software (movable type) and my blog migrated into a more news-oriented feature. As a result, I split [privon.com] my blog into a more journal-oriented blog [privon.com] and a news/science/politics blog [privon.com].

    I agree completely that a blog is about getting what you want to say out there. That's what I use mine for. I was merely responding to a comment that indicated that all blogs were just about mundane things that happened during the day.

    neurostar
  • by Tablizer ( 95088 ) on Monday May 12, 2003 @03:32PM (#5938519) Journal
    Rather than separating stuff, why not make it a series of choices using check-boxes. Example:

    Include Web-pages: [X]

    Include Blogs: [ ]

    Include Usenet: [X]

    And so forth. You can get better combos this way. If they add other "web types" in the future, you can combine searches without having to go to each one. They could still include a dedicated listing if they want, but I hope they don't hard-wire their data that way to prevent or reduce multi-factor searches in the future.

    Even more generic would be to have a pull-down list of the "strength" of each search. Thus, if you wanted weblogs included, but given less weight, you might assign it a lower number. Zero would be the same as a no-check above. However, this is perhaps too confusing to most users.
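The per-source "strength" idea above can be sketched as a weighted merge of result lists. The source names, URLs, relevance scores, and weights below are all hypothetical, and the merge rule (multiply each hit's score by its source weight, keep the best score per URL) is just one simple choice among many.

```python
def merge_results(results_by_source, weights):
    """Combine per-source (url, score) hits into one ranked list.

    A weight of zero (or a missing weight) drops the source entirely,
    which matches leaving its check-box unticked.
    """
    merged = {}
    for source, hits in results_by_source.items():
        weight = weights.get(source, 0.0)
        if weight <= 0:
            continue
        for url, relevance in hits:
            weighted = relevance * weight
            # Keep the best weighted score if a URL appears in several sources.
            if weighted > merged.get(url, 0.0):
                merged[url] = weighted
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(merged.items(), key=lambda item: (-item[1], item[0]))


# Made-up results from three separate indexes.
results = {
    "web": [("example.org/turtles", 0.9), ("example.org/faq", 0.5)],
    "blogs": [("example.net/my-turtle-diary", 0.8)],
    "usenet": [("rec.pets.example/thread-1", 0.6)],
}

# Blogs included at half strength, usenet switched off.
weights = {"web": 1.0, "blogs": 0.5, "usenet": 0.0}

ranked = merge_results(results, weights)
for url, weighted_score in ranked:
    print(f"{weighted_score:.2f}  {url}")
```

With these numbers the blog hit is still returned but sinks below both web hits, while the usenet hit disappears, which is the behaviour the pull-down proposal describes.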
  • How about if... (Score:2, Interesting)

    by Uncle Gropey ( 542219 ) on Monday May 12, 2003 @08:30PM (#5941158) Journal
    ...they remove the blogs from main, then re-incorporate the highest-hitting blogs from the new search back into the main? Then you may not miss a relevant and useful blog while avoiding the one that is mainly about some high school girl and funny text messages that she got from her friends?
