The Future of Google Search and Natural Language Queries
eldavojohn writes "You might know the name Peter Norvig from the classic big green book, 'AI: A Modern Approach.' He's been working for Google since 2001 as Director of Search Quality. An interview with Norvig at MIT's Technology Review has a few interesting insights into the 'search mindset' at the company. It's kind of surprising that he claims they have no intent to allow natural questions. Instead he posits, 'We think what's important about natural language is the mapping of words onto the concepts that users are looking for. But we don't think it's a big advance to be able to type something as a question as opposed to keywords ... understanding how words go together is important ... That's a natural-language aspect that we're focusing on. Most of what we do is at the word and phrase level; we're not concentrating on the sentence.'"
natural language is an oxymoron (Score:4, Insightful)
I tend to agree with Norvig's focus on keywords and less emphasis on natural language. Even trying to define a natural language on top of a query engine introduces a layer of complexity that is probably unnecessary. Natural language also introduces a level of noise that interferes with defining, as accurately as possible, what the user is asking for.
Google has done a good job, and with each iteration they get better at figuring out what the user is looking for. I find their suggestion [google.com] feature an effective way not only to constrain a query but also to spell check pre-emptively. If you've not used this, install the Firefox Google toolbar, or use the experimental Google "Suggest" [google.com]. Often Google will provide suggestions in the drop-down menu that refine your search in ways you hadn't considered and that lead to a more direct and accurate representation of your intended query. Of course, if their suggestions don't satisfy, you get to continue typing your keywords to your heart's content.
(I have to offer an example of suggestion's effectiveness. I often Google to get to the Chicago Tribune (I don't visit there often enough to have created a bookmark, plus it's easy to do this in anyone's browser). Simply typing the first four letters, "chic", I see the first suggestion is "Chicago Tribune". A simple TAB and RETURN, and I'm on the Google results page, where the first link or so is my link to the Tribune (with the added bonus of Google's breakout of sublinks).) Your mileage may vary (Google's ranking system may vary the order and options that appear in the drop-down over time), but I find it an amazingly effective research tool (suggestion, not the Trib).
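For anyone curious how prefix suggestion might work in principle, here is a minimal sketch. It is not Google's implementation: the query list, the popularity counts, and the function name `suggest` are all made up for illustration. It simply finds stored queries that start with what has been typed and ranks them by an assumed popularity score.

```python
from bisect import bisect_left

# Hypothetical query log with made-up popularity counts (illustration only).
QUERY_COUNTS = {
    "chicago tribune": 900,
    "chicago weather": 700,
    "chicago bears": 650,
    "chicken recipes": 400,
    "china map": 300,
}

# Keep the queries sorted so all matches for a prefix sit next to each other.
SORTED_QUERIES = sorted(QUERY_COUNTS)


def suggest(prefix, limit=3):
    """Return up to `limit` stored queries starting with `prefix`, most popular first."""
    prefix = prefix.lower()
    start = bisect_left(SORTED_QUERIES, prefix)  # first candidate position
    matches = []
    for query in SORTED_QUERIES[start:]:
        if not query.startswith(prefix):
            break  # sorted order means no later query can match
        matches.append(query)
    matches.sort(key=lambda q: QUERY_COUNTS[q], reverse=True)
    return matches[:limit]


print(suggest("chic"))
```

With this toy data, typing "chic" puts "chicago tribune" first only because it was given the highest count; a real system's ordering would depend on whatever ranking signal it uses.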
Natural language is mostly trying to guess intent with structure and key words (as opposed to keywords), but at the end of the day, if you filter out the natural language, and focus on keywords you're going to end up in close to the same place.
The problem with natural language searches... (Score:3, Insightful)
Comment removed (Score:3, Insightful)
this is also why (Score:5, Insightful)
the idea of interacting with a computer like a human is an artificial hangover from being introduced to the computer the first time. after using it for a while, you realize that interacting with a computer, in small limited ways, like searching information, is easier NOT using natural language
for the very simple reason that it takes more thought, and more typing to interact naturally. it is easier to train a human to interact with a computer than it is to train a computer to interact with a human. and for the human, it is more rewarding, because the human realizes he doesn't need to exert so much effort
"what is the capital of france?"
versus
"france capital"
if you were to shout "france capital" at someone, it would be rude and confusing. but for a computer, it's actually superior
it is the conservation of communication effort at work here that wins out over natural language in computer interaction
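A rough sketch of the same point in code: strip the filler words from the question and see what is left. The stop-word list and the function name `to_keywords` are assumptions chosen for this example, not anything a real search engine is known to use.

```python
import re

# Small, hand-picked stop-word list (an assumption for this sketch).
STOP_WORDS = {
    "what", "is", "the", "of", "a", "an", "who", "where",
    "when", "how", "are", "in", "to", "do", "does",
}


def to_keywords(question):
    """Lowercase the question, drop punctuation and stop words, keep word order."""
    words = re.findall(r"[a-z0-9']+", question.lower())
    return [w for w in words if w not in STOP_WORDS]


question = "what is the capital of france?"
keywords = to_keywords(question)
print(keywords)
print(len(question), "characters typed versus", len(" ".join(keywords)))
```

The keywords come out in the order they appear in the question ("capital france"), which for a bag-of-words search is equivalent to typing "france capital".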
Users have changed, too (Score:3, Insightful)
Just like "Click here to do X" isn't used as much on Web pages anymore. People now tend to know that they can click on underlined text to find out more.
what is google, freakin' jeeves? (Score:2, Insightful)
Maybe I'm just not up on my search engine technology (or, rather, I don't know anything about it). I just don't know anybody who'd think to put a regular question into google.
Re:this is also why (Score:1, Insightful)
The next evolution in computer search is understanding what documents really satisfy "what is the capital of france" versus returning anything with "france" and "capital" in its text such as "France should always be spelled with a capital letter". Google doesn't attempt to differentiate and they leave you to filter the results manually to find what you want.
The reason for natural language interfaces is not simply to collect the keywords, it is to understand the context within which you want results and filter out meaningless results. Google uses PageRank, which should bubble the more common meanings of the keywords to the top. But I still find myself having to filter out tons of irrelevant results to get to a very specific result that is 4 or 5 pages down. So I then have to learn to think like a computer and add other keyword context that differentiates the result I want. Like "france capital -letter -capitalization +city" and inevitably I end up filtering out results that fit my context, but happen to have terms that I filtered on.
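To make the over-filtering problem concrete, here is a toy sketch of include/exclude operators. It assumes every bare or "+" term is required and every "-" term is forbidden, which is a simplification and not how Google actually evaluates a query; the three documents are invented for the example.

```python
import re


def parse_query(query):
    """Split a query into required terms and excluded ("-term") terms."""
    required, excluded = set(), set()
    for token in query.lower().split():
        if token.startswith("-") and len(token) > 1:
            excluded.add(token[1:])
        else:
            term = token.lstrip("+")
            if term:
                required.add(term)
    return required, excluded


def matches(doc, required, excluded):
    """A document matches if it has every required word and no excluded word."""
    words = set(re.findall(r"[a-z0-9]+", doc.lower()))
    return required <= words and not (excluded & words)


# Invented documents for illustration.
DOCS = [
    "Paris is the capital city of France",
    "France should always be spelled with a capital letter",
    "Paris, the capital city of France, is written with a capital letter P",
]

required, excluded = parse_query("france capital -letter -capitalization +city")
for doc in DOCS:
    print(matches(doc, required, excluded), "-", doc)
```

The third document answers the question but is dropped because it happens to contain "letter", which is exactly the kind of relevant result lost to a keyword filter.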
So searching on meaning is still a holy grail. And in fact, I'm surprised this guy from Google said this when another Google engineer at Ideafest stated very matter-of-factly that Google's future was in natural language 'Star Trek'-like computer systems. This is completely contrary to what is being said here.
He's lying (Score:4, Insightful)
Me Tarzan; You Jane (Score:3, Insightful)
Re:What's really the story (Score:4, Insightful)
Even if you mastered natural language (and I'm not saying that's a surmountable task) I think people would be shocked to see that Google searches would still be frustrating.
I'm not just saying "blame the user", I'm saying that language itself is not even the last obstacle to overcome. You're going to need to figure out a program that not only understands natural language, but also context, culture, etc.
Getting an AI of near-human intelligence is not enough, because to be really good at getting people the answers to questions they can't ask you have to be of above-average capability.
Not worth processing sentences (Score:3, Insightful)
Re:The problem with natural language searches... (Score:4, Insightful)
The argument against universal grammar is, of course, non-Latin languages like Japanese (and possibly Russian), which don't play by the rules. I'm not really a language expert on either, but I've tried to learn Japanese and it's really tough.
Everything is based on the relationship between the speaker, the person or object he is talking about, and the audience. As in... If I'm talking about a pencil sitting on my desk, it has a different tense than a pencil on your desk and then a different tense in someone else's hand or a pencil that is sitting at a far off place (-sara or -kara? I can't remember). And we haven't even gotten to issues about ownership like if it was in my hand or your hand.
Whereas Latin-based languages are more concerned with action or tense of ownership, but not with the relationship to the speaker or audience. Hence... It is argued universal grammar does not apply in that respect.
What could possibly be wrong with that? (Score:4, Insightful)
Your query does not include a verb.
> find wii
Whose "wii" do you want me to find?
> find wii review
Unable to find any reviews authored by "wii".
> find review about wii
No reviews found concerning the common noun "wii".
> find review about Wii
Here is the most recent review about the proper noun "Wii": [url to a page full of keywords related to Wii]
> find review about Wii order by relevence
"relevence" is not an English word. Did you mean "relevance"?
> find review about Wii order by relevance
Here is the most relevant review about Wii: [url to a 2 year old pre-review of the Wii before it was launched]
> find review about Wii order by relevance then date
Here is the most recent and most relevant review about Wii: [url to a fanboy site]
> find all reviews about Wii order by relevance then date
Working...
> abort
Abort what?
> abort search
I am currently performing 1,231,415 searches. Which search do you want me to abort?
> abort last search
You do not have permission to abort others' searches.
> abort my last search
Last search aborted.
> find several reviews about Wii order by relevance then date
"Several" is not a quantifiable adjective. Do you mean "seven"?
> find seven reviews about Wii order by relevance then date
Here are your results. For better search results please capitalize the first word of sentences, and end sentences with proper punctuation.
Dan East