CMU Web-Scraping Learns English, One Word At a Time

Posted by timothy on Saturday January 16, 2010 @03:18PM from the hao-ubowt-hahmnimz dept.

blee37 writes "Researchers at Carnegie Mellon have developed a web-scraping AI program that never dies. It runs continuously, extracting information from the web and using that information to learn more about the English language. The idea is for a never-ending learner like this to one day be able to become conversant in the English language." It's not that the program couldn't stop running; the idea is that there's no fixed end-point. Rather, its progress in categorizing complex word relationships is the object of the research. See also CMU's "Read the Web" research project site.

This discussion has been archived. No new comments can be posted.

Machine learning algorithms (Score:4, Insightful)

by sakdoctor ( 1087155 ) writes: on Saturday January 16, 2010 @03:26PM (#30792374) Homepage

Only as good as current machine learning algorithms.
So not very.

Re:Finally, people are getting AI right. (Score:4, Insightful)

by sakdoctor ( 1087155 ) writes: on Saturday January 16, 2010 @03:31PM (#30792424) Homepage

"letting it grow into its own intelligence"
This is still weak AI. It isn't going to grow into anything, let alone strong AI.

Re:Uh oh... (Score:5, Insightful)

by Bragador ( 1036480 ) writes: on Saturday January 16, 2010 @03:36PM (#30792460)

Actually, it reminds me of a chatbot named Bucket. When people at 4chan heard of it, they started to use it and teach it. It became a complete mess filled with memes, bad jokes, racist comments, and everything you can think of.
http://www.encyclopediadramatica.com/Bucket
One response from the bot:
Bucket: I don't know what the fuck you just said, little kid, but you're special man. You reached out and touched my heart. I'm gonna give you up, never gonna make you cry, never gonna run around and desert you, never gonna let you down, never gonna let you down, never gonna make you cry, never gonna let me down?
The quality of the teachers is important when learning.

Re:Finally, people are getting AI right. (Score:4, Insightful)

by buswolley ( 591500 ) writes: on Saturday January 16, 2010 @04:20PM (#30792774) Journal

Of course. That is why it is important during human development that the infant has huge cognitive constraints (e.g. low working memory) in language learning; it limits the number of possible pairings of label and meaning. Of course, constraints can also be an impediment.

Re:Finally, people are getting AI right. (Score:4, Insightful)

by DMUTPeregrine ( 612791 ) writes: on Saturday January 16, 2010 @06:09PM (#30793578) Journal

The obligatory classic AI Koan:
In the days when Sussman was a novice Minsky once came to him as he sat hacking at the PDP-6. "What are you doing?", asked Minsky. "I am training a randomly wired neural net to play Tic-Tac-Toe." "Why is the net wired randomly?", asked Minsky. "I do not want it to have any preconceptions of how to play." Minsky shut his eyes. "Why do you close your eyes?", Sussman asked his teacher. "So the room will be empty." At that moment, Sussman was enlightened.

Re:Machine learning algorithms (Score:3, Insightful)

by poopdeville ( 841677 ) writes: on Saturday January 16, 2010 @06:14PM (#30793624)

It's not as if human use of "machine learning" algorithms is any faster. It takes about 12 months for our neural networks to figure out that the noises we make elicit a response from our parents. And according to people like Chomsky, our neural networks are designed for language acquisition.
AI "ought" to be an easy problem. But there's one big difference between the psychology of humans and that of computers. Humans have drives, like hunger, the sex drive, and so on. In particular, an infant's drive to eat is a major component in its will to learn language. But this drive to eat has other psychological manifestations.
It is difficult to imagine a programmatic "generalized goal system" that mirrors the role of human drives in learning. The "goals", usually, are to maximize fitness in a particular domain. A real human has to maintain sufficient fitness in multiple domains, in order to survive.
This should not be so surprising. Human evolution has about 300,000 generations of improvements on the brain since we first stood up. Our drives are clearly genetically programmed, and are just as hard-wired as a machine learning algorithm's "drive" to maximize. The human drive is just much more nuanced, and informed about the real world. There is a model of the world in our genes. It is unfair to expect that a computer will ever be "smart" without one.

Re:Uh oh... (Score:3, Insightful)

by Rocketship Underpant ( 804162 ) writes: on Sunday January 17, 2010 @02:42AM (#30796434)

Yes, database pollution sounds like a problem to me. Not only do you have to deal with AOL-speak and horrific spelling disasters of every kind, there's the issue of broken English and nonsensical English produced through machine translation, which shows up on corporate websites a lot more than it should.
