Google To Create "Blog" Search; Potentially Remove From Main
Skyshadow writes "Google, search engine of choice for pretty much everyone, has announced that it will begin a separate index for blogs and remove them from the normal index, handling them instead in much the same way as their usenet archives. This will hopefully put an end to the recent difficulties locating primary source material among the mountains of blogs which are clogging the ratings system." There have been comments from elsewhere saying they won't be removing them, but that remains to be seen.
blogs.google.com? (Score:5, Insightful)
yay and aaah (Score:3, Insightful)
Yes! But will there be a metasearch? (Score:5, Insightful)
However, I hope they maintain links between the main search and the blog search. Finding primary sources, then having a button that links to all blog comments on this topic, would be a great research tool.
Re:'Bout time (Score:3, Insightful)
Personally.. (Score:5, Insightful)
Will they at least link to the new search? (Score:4, Insightful)
Next Prime Minister of Canada has a Blog (Score:1, Insightful)
Re:'Bout time (Score:5, Insightful)
Re:'Bout time (Score:5, Insightful)
I was wondering about that too. It's not black and white, of course, especially when you want to automate it. I can think of several indications that a page is a blog; some weighted linear combination of these factors should work well enough in practice if you spend some time tweaking the weights:
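A weighted linear combination of this kind could be sketched roughly as follows. This is only an illustration: the signal names, the weights, and the 0.6 threshold are all assumptions made up for the example, not anything Google has described.

```python
from dataclasses import dataclass


@dataclass
class PageSignals:
    """Hypothetical per-page signals, each scaled to the range 0.0-1.0."""
    update_frequency: float     # how often the page changes
    has_posted_by_lines: float  # "posted by" bylines are present
    has_dated_entries: float    # entries carry dates
    on_blog_host: float         # hosted on a known blogging site
    links_with_blogs: float     # links to, and is linked from, other weblogs


# Illustrative weights only (assumed); they would need tweaking in practice.
WEIGHTS = {
    "update_frequency": 0.20,
    "has_posted_by_lines": 0.15,
    "has_dated_entries": 0.15,
    "on_blog_host": 0.30,
    "links_with_blogs": 0.20,
}


def blog_score(signals: PageSignals) -> float:
    """Weighted linear combination of the signals."""
    return sum(weight * getattr(signals, name) for name, weight in WEIGHTS.items())


def looks_like_blog(signals: PageSignals, threshold: float = 0.6) -> bool:
    """Classify a page as a blog when its score reaches the (assumed) threshold."""
    return blog_score(signals) >= threshold
```

Because every signal is a soft value rather than a yes/no flag, borderline pages end up with middling scores, which is where the weight tweaking would matter most.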
Ummm... no (Score:4, Insightful)
I think you're confusing a weblog with a "livejournal". A weblog is similar to slashdot (or warblogging.com [warblogging.com] and back-to-iraq.com [back-to-iraq.com]). In fact, my weblog (http://privon.com) deals with politics, science, and civil rights as well as opinion pieces I've written about various issues. A weblog is another source of information.
What you're thinking of is commonly called a "livejournal" and it's exactly that - a journal. Some blogs are also journals. For example, I've got two 'blogs'. One is the one I mentioned above. The other is slightly more journal oriented, with me posting about things I've done that my family and friends (and possibly others) might find interesting. For example, I've recently posted about visiting the Trek Bicycles Demo Day as well as some of my latest photography experiences.
It might be beneficial for you to review your definition of a blog. Blogs can be an excellent source of information, not just a diary.
neurostar
Wouldn't it be better (Score:2, Insightful)
I think this solution would make everyone happy.
Blur (Score:5, Insightful)
A comparison is being made between blogs and the newsgroups, which are worlds apart in a number of different ways, not the least of which is the threaded nature of the groups.
What defines a blog, anyway? What defines a not-blog? Is CNN.com a blog? Is it not a blog because many people write for it, because of the number of hits it gets or because it has press credentials? Which category does indymedia.org [indymedia.org] fit into?
Will I only get news results when I search for "ferret care?"
What if the source IS a blog? If the subject IS the blog, will a news site reporting on the blog wind up in the main search results while the subject itself -- the blog -- shows up only in the blog search?
Don't forget Google News... (Score:4, Insightful)
Ehh, the point of this message is to inform the uninformed of the wonderfulness of Google News [google.com]. It automatically features prominent headlines from all over the web, and you can search for topics, keywords, etc. in the search bar and have results sorted by relevance or date. News articles are mostly excluded from the normal index, which makes Google News the best headline locator on the Internet, by far.
Blogs removed from google = FUD (Score:5, Insightful)
Far more authoritative sources than I [weblogs.com] have already weighed in on this.
While there's certainly a lot of inane content available in blog form, this isn't really any different than it was before. I have never had to wade through 500 pages of results to find an original source either. The whole thing reeks of FUD to me. Methinks that Orlowski and Roddy have their own axes to grind.
Re:blogs.google.com? (Score:2, Insightful)
They should separate mailing list archives first (Score:5, Insightful)
Re:What is a Blog anyway? (Score:3, Insightful)
A Web Page created by a person is usually created for a task in mind - Showing off a project (case mods, hacks on furbys, peep surgery), a fan information page (Dr. Who, Anime, Star Trek, Babylon 5), or a page created for a group (Local SCA Group, Computer User's Group, MMORPG Guild Page).
A Blog is usually created as an online journal or diary, often for a group of friends.
What tends to trip up the search engines are the Blog sites that link to other people by common interests. WWW.Livejournal.com allows you to have links by friends and by common interests. Were I to have a blog with them and set up Star Trek as one of my interests, I'd likely end up with several hundred names of people who also like Star Trek.
Google goes out and farms new sites. It hits so-and-so's blog on Livejournal. It sees a link mentioning Star Trek and follows it...then it sees about 1000 more ST links... 1001 ST links that likely won't have a dang thing about Star Trek on the pages (unless someone happens to brag about how he scored ST:TNG season 1 on ebay for a song).
More and more people are blogging, which is why blogs (which have been around for quite a while) are now starting to become a concern for search engines trying to improve their signal-to-noise ratio.
I like Google's idea. One of the reasons blogs link together is so people can network with others who share common interests. If you don't want that and just want to learn about Star Trek, you can find real information by going to the main page, while the people looking for fellow ST fans can go to the blog page.
Makes sense to me
Phoenix
Re:'Bout time (Score:5, Insightful)
because what is important, in my point of view, is to GET THE ANSWER to what I'm looking for.
And if the answer is in a weblog that belongs to "Linux-freaks.Adhzerbahidjan", it still is the answer I'm looking for...
I mean things like "Proftpd doesn't seem to accept fxp connections" or "why the hell is this part of my distro not working as I wish": the answers can only come from people who had the same problem and discussed it in a blog.
Another reason I prefer weblogs to, say, IRC is that I don't have to humiliate myself asking "basic" questions to the 15-year-old guru nicknamed "EvilRootBeer". I just have to parse a few blogs and get my answer without ANY fine manual to read.
"Nothing against blogs, but you never know where this material came from." Because you KNOW where the news from CNN is coming from? I mean, they show proof and research material every time they air a show or a major groundbreaking story ("Weapons of mass destruction found in Iraq", "Terrorist Bretzel Fails Coup d'Etat"...).
At least with blogs and the net, you can try to cross-check the data, whereas with TV, you usually only gulp some more Mountain Dew.
I just wish you had to find your Linux docs using the manuals provided on the distro and absolutely no other access to raw data...
Re:journals (Score:2, Insightful)
Advertisements are intrusive no matter what form they take. Just because they use fewer bits and/or are smaller on the page doesn't change the fact that they are unwanted.
Re:Great! (Score:3, Insightful)
I welcome the change, and I'm glad people who don't want to see my journal won't be seeing it.
Bullshit. Please read. (Score:5, Insightful)
Slashdot, like other blogs, pollutes search engine searches with their "permalinks," which, although they might be useful, certainly constitute a blog. In fact, one of the problems with blogs and search engines is that they generate thousands of clickable hyperlinks effortlessly. It's great for someone reading a blog and trying to bookmark a certain section - it's terrible for the guy who wants information on combatting spam through more effective use of his SMTP server and has to search through 30 pages of
Certainly, Google's criteria for what defines a blog might be helpful, but it seems to me like you're subjectively deciding which blogs are legitimate news sources and which are "some kid rambling on." Say whatever you like about the legitimacy of
Re:/. is a blog, no? (Score:3, Insightful)
To me, it makes sense to separate the search for primary material (like slashdot's links and features) from the commentary on it (the comments).
I can't see how you could even begin to do this consistently. Most of the 'primary' (by your definition) material referred to on /. is summaries of or comments on something else. In many cases you could argue that it is 4 or 5 levels away from 'primary'.
On the other hand, you often get genuinely creative stuff in response to someone else's article. In the academic community, it is not unusual for the responses to or critiques of someone else's work to end up being rated more highly than the 'primary' stuff they are commenting on (IIRC, Chomsky's review of one of Skinner's long-forgotten books is a classic example: in the process of trashing Skinner, he floated a radical new theory on linguistics).
The Internet is all about linking content in non-linear ways. If we really want to go back to 100% primary sources, we are going to end up with "There is nothing new under the sun" as the only entry in the Google DB :-)
(On the other matter, the O'Reilly manual title "Running weblogs with Slash" would appear to support your case...)
Re:blogs.google.com? (Score:3, Insightful)
Mailing lists, on the other hand, sometimes just target one small part of the problem; however, they are both definitely useful. Of course I'm also nosy, so I do like to read about other people's lives occasionally.
rus
Re:I'd rather they do this for mailing list archiv (Score:3, Insightful)
As an aside, my most recent dead end involved a Win2K error that's been popping up on one of my boxes. Usenet is full of variations on this error reported over the years without any good answers to what causes it. That doesn't mean that my Linux and Solaris searches are always gems - but it does suggest that such dead ends can be found for almost any platform on a case by case basis.
Re:journals (Score:4, Insightful)
However, I think there is a potential problem with blogs that also contain real content or at least original content. A lot of people have regular webpages that they just update regularly in a blog fashion...will there be a separation?
Re:journals (Score:3, Insightful)
You might want to use 0.0.0.0 instead. That way you won't get an access attempt on localhost. I usually only block annoying ads (x10) or privacy problems (doubleclick). I don't see the point in blocking Google's text ads.
One day I'm going to put a mini-server on 127.0.0.1 that serves up cute cat pictures instead of blocked banner ads. :^)
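The hosts-file trick discussed above can be sketched as a small helper that produces the entries. This is a minimal example under stated assumptions: the domain names are placeholders, the function name is made up for illustration, and whether 0.0.0.0 really avoids a connection attempt depends on the operating system, as the comment only suggests.

```python
def hosts_lines(domains, sink="0.0.0.0"):
    """Return hosts-file lines mapping each domain to the sink address.

    Blank entries are skipped and names are lower-cased.
    """
    return [f"{sink} {d.strip().lower()}" for d in domains if d.strip()]


if __name__ == "__main__":
    # Placeholder domains (assumed), not real ad servers.
    for line in hosts_lines(["ads.example.com", "tracker.example.net"]):
        print(line)
```

Passing `sink="127.0.0.1"` instead would point the same names at a local server, which is what the cat-picture idea would need.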
Offtopic... (Score:4, Insightful)
what is a blog (Score:3, Insightful)
Determining what is and what is not a blog will be a lot harder than determining what is and is not in a newsgroup.
I think this is a bad idea. Google has made a mistake if they think what we currently call "blogs" are a novelty item. Blogs are the future of the web, even if a lot of people are using the technology for toy purposes today.
I want to be able to search the entire web in a single index, blogs and all. If PageRank is giving too much noise and not enough signal due to blogs, then fix PageRank.
Re:Ummm... no (Score:3, Insightful)
Re:How will they filter? (Score:3, Insightful)
Excellent point.
One thing this (the polluting of Google results with high-ranking, low-information blog comments) is proving is that ultimately evaluating the reliability of content is an AI problem. The blog issue is a problem in all web-of-trust models of evaluation: when one uses a consensus-based model to determine "truth," urban legends tend to rise to the top and detailed technicalities tend to sink to the bottom. Rating blogs can be done in two ways: intelligence, or statistics. And the rating of blogs by statistics would be as likely to be skewed by associations (cliques) as Google results are today by those same associations of bloggers.
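The clique effect can be illustrated with a toy link-based ranking. What follows is a simplified PageRank-style power iteration, not Google's actual algorithm; the page names, damping factor, and iteration count are assumptions chosen for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank over a dict mapping page -> list of outlinked pages.

    Every linked page must also appear as a key. A page with no outlinks
    spreads its rank evenly over all pages.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            targets = outlinks or pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Three blogs that all link to each other, plus one detailed page nobody links to.
links = {
    "blog_a": ["blog_b", "blog_c"],
    "blog_b": ["blog_a", "blog_c"],
    "blog_c": ["blog_a", "blog_b"],
    "reference": [],
}
ranks = pagerank(links)
```

In this toy graph each blog ends up ranked well above the unlinked reference page purely because of the mutual links, regardless of what any of the pages actually say.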
Re:journals (Score:3, Insightful)
A) The ad for HPC I/O: A brief history at the top of this slashdot page.
B) The ad I get when I search for slashdot on google (It says: "Google is hiring (expert software designers)". YMMV)
C) The ad on Dutch TV which has some bimbo checking if her white trousers are bloody around the crotch area. (Several variations, for both tampons and pads, she looks over her shoulders to check from behind in a mirror or kicks up in front of a mirror). Note that this occurs at maximum volume first thing into the ad break.
Now while I agree with you that ads can be intrusive, I personally don't mind even simple banners - my brain has learned to ignore them. As for pop-ups and flashing, Mozilla serves well. Interstitials (gamesp?) are rare as yet; we will work around those when we have to. Google ads are directed, and on the rare occasion I am searching for a product rather than just information, I may well use them. These are insignificant compared to TV ads.
Which is why I want a Tivo in Europe!
Is SourceForge.net a blog? (Score:3, Insightful)
Updated frequently ... "posted by" ... dates ... hosted on one of the popular blogging sites ... Links to and is linked from other weblogs
Sounds like the news sections of most SourceForge.net projects I've run into. They're updated frequently (release early, release often), the maintainers frequently post status updates on given dates, SourceForge.net has a lot of them, and they link to other projects that use their code or that contribute code that they use.
Is SourceForge.net a blog?
Re:'Bout time (Score:3, Insightful)
But sometimes I search for non-tech related information (shocking, I know). In fact, I was searching for information about a rare debilitating disease that a doctor told my friend that she might have (can't remember the name anymore off the top of my head) a couple of months ago and I wanted to learn about it... I typed the name of the disease into google and the first link that came up was some asshat's blog about how his aunt had the disease and little useful info, followed by a gazillion bloggers that all were referring to the first blogger's site (apparently this blogger was quite popular).
I was all like "Damn, I wish I could just tell google NOT to look at blogs." as I searched through tons of other pages before I found a site with *real* medical information about the disease.
As it turns out, my friend didn't have the disease. Although she had some of the symptoms, they turned out to be caused by normal fatigue or something and she was just advised to get lots of bed rest.
But anyway, that's just one case
Re:journals (Score:3, Insightful)
Then there are sites like mine [joshw.org], which is part blog and part my website as a singer/songwriter. How would Google determine which parts are which? I'd be pretty peeved if the whole site was tagged as a blog.