Education Software Technology

CMU Web-Scraping Learns English, One Word At a Time 148

blee37 writes "Researchers at Carnegie Mellon have developed a web-scraping AI program that never dies. It runs continuously, extracting information from the web and using that information to learn more about the English language. The idea is for a never ending learner like this to one day be able to become conversant in the English language." It's not that the program couldn't stop running; the idea is that there's no fixed end-point. Rather, its progress in categorizing complex word relationships is the object of the research. See also CMU's "Read the Web" research project site.
This discussion has been archived. No new comments can be posted.

  • by sakdoctor ( 1087155 ) on Saturday January 16, 2010 @03:26PM (#30792374) Homepage

    Only as good as current machine learning algorithms.
    So not very.

  • by sakdoctor ( 1087155 ) on Saturday January 16, 2010 @03:31PM (#30792424) Homepage

    letting it grow into it's own intelligence

    This is still weak AI. It isn't going to grow into anything, let alone strong AI.

  • Re:Uh oh... (Score:5, Insightful)

    by Bragador ( 1036480 ) on Saturday January 16, 2010 @03:36PM (#30792460)

    Actually, it reminds me of a chatbot named Bucket. When people at 4chan heard of it, they started to use it and teach it. It became a complete mess filled with memes, bad jokes, racist comments, and everything you can think of.

    http://www.encyclopediadramatica.com/Bucket

    One response from the bot:

    Bucket: I don't know what the fuck you just said, little kid, but you're special man. You reached out and touched my heart. I'm gonna give you up, never gonna make you cry, never gonna run around and desert you, never gonna let you down, never gonna let you down, never gonna make you cry, never gonna let me down?

    The quality of the teachers is important when learning.

  • by buswolley ( 591500 ) on Saturday January 16, 2010 @04:20PM (#30792774) Journal
    Of course. That is why it is important during human development that the infant has huge cognitive constraints (e.g. low working memory) in language learning; they limit the number of possible pairings of label and meaning. Of course, constraints can also be an impediment.
  • by DMUTPeregrine ( 612791 ) on Saturday January 16, 2010 @06:09PM (#30793578) Journal
    The obligatory classic AI Koan:

    In the days when Sussman was a novice Minsky once came to him as he sat hacking at the PDP-6. "What are you doing?", asked Minsky. "I am training a randomly wired neural net to play Tic-Tac-Toe." "Why is the net wired randomly?", asked Minsky. "I do not want it to have any preconceptions of how to play." Minsky shut his eyes. "Why do you close your eyes?", Sussman asked his teacher. "So the room will be empty." At that moment, Sussman was enlightened.

  • by poopdeville ( 841677 ) on Saturday January 16, 2010 @06:14PM (#30793624)

    It's not as if human use of "machine learning" algorithms is any faster. It takes about 12 months for our neural networks to figure out that the noises we make elicit a response from our parents. And according to people like Chomsky, our neural networks are designed for language acquisition.

    AI "ought" to be an easy problem. But there's one big difference between the psychology of humans and that of computers. Humans have drives, like hunger, the sex drive, and so on. In particular, an infant's drive to eat is a major component in its will to learn language. But this drive to eat has other psychological manifestations.

    It is difficult to imagine a programmatic "generalized goal system" that mirrors the role of human drives in learning. The "goals", usually, are to maximize fitness in a particular domain. A real human has to maintain sufficient fitness in multiple domains, in order to survive.

    This should not be so surprising. Human evolution has about 300,000 generations of improvements on the brain since we first stood up. Our drives are clearly genetically programmed, and are just as hard-wired as a machine learning algorithm's "drive" to maximize. The human drive is just much more nuanced, and informed about the real world. There is a model of the world in our genes. It is unfair to expect that a computer will ever be "smart" without one.

  • Re:Uh oh... (Score:3, Insightful)

    by Rocketship Underpant ( 804162 ) on Sunday January 17, 2010 @02:42AM (#30796434)

    Yes, database pollution sounds like a problem to me. Not only do you have to deal with AOL-speak and horrific spelling disasters of every kind, there's the issue of broken English and nonsensical English produced through machine translation, which shows up on corporate websites a lot more than it should.
