Education Technology

Indiana First With Computerized Grading 524

Mz6 writes "Computerized grading has been discussed here before; now the New York Times reports that Indiana has become the first state to grade high school English essays by computer. The computerized grading process, called 'e-rater', uses a 6-point rating scale and artificial intelligence to 'mimic the grading process of human readers'. The system was tested in a two-year pilot program and produced results virtually identical to those of trained readers. The big question: will other states emulate Indiana by tossing human grading?"
  • From the web site. (Score:4, Informative)

    by Bluesman ( 104513 ) on Thursday May 20, 2004 @01:28PM (#9205989) Homepage
    >If you would like to try out e-rater, you can obtain an ID and password and submit and original essay for scoring on the CriterionSM Web site.

    Submit "and" essay? I guess they haven't run the software on themselves.

    F.

  • by Kiriwas ( 627289 ) on Thursday May 20, 2004 @01:34PM (#9206084) Homepage
    In the Florida school system there's a writing test called Florida Writes. It's a standardized test, doesn't count for much, but we all had to take it. This is basically exactly what we had to do: write the most horrible piece of trash you could think of, but make it adhere to a few preset guidelines, and BAM, instant grade. My first time taking it I wrote a rather good piece of work, if I do say so myself. Problem was, it didn't adhere to their simple format: 5 paragraphs, introduction, 3 body, and conclusion. I did horribly. My second time, I wrote something I would barely call English but followed what they wanted perfectly and got top marks. I see this new computerized grading as being exactly the same.
  • by Dr. Spork ( 142693 ) on Thursday May 20, 2004 @01:36PM (#9206112)
    I don't think that automated, high-school-level grading is an all-bad thing. We can call it unfair if we like, but as someone who grades a lot of stuff, I can tell you that I'm nothing like fair. I don't always even know how to distinguish a B- from a C+; I just go with my gut, which, as far as I can tell, is much like flipping an internal coin. If we looked at human graders as an algorithm, we would find a whole lot wrong, even among those of us who try hard to be fair (cover author names, compare close grades for consistency, keep a constant mood, that sort of thing).

    But I think that if a computer grading program which is no worse than humans could be devised, it would be a great learning tool. A lot of people make it to college as borderline illiterates. I'm not kidding. I read a lot of their crap. That's because their HS teachers were too overworked to grade their writing, so they didn't assign much. If a computer program could auto-grade and give detailed comments on how to improve the writing, high school students could be assigned an essay per week, and really get the hang of writing well. Teachers could focus on teaching instead of tedium.

    Sure, the first grading applications are going to make a few serious errors. That's the first stage of every application where a computer is asked to interpret rich data. Early voice recognition sucked; now it sucks much less, and it will just keep getting better. Same with OCR, chess software, machine translation, etc. So the right debate to have is about when this will be good enough for school use, not whether. I'm prepared to admit that the answer to the right question is "not yet" (I'm not sure how deep the current problems go), but I fully support working on this system until it works right.

  • I took this test (Score:5, Informative)

    by tundog ( 445786 ) on Thursday May 20, 2004 @01:38PM (#9206140) Homepage
    I live in Indiana (no, NOT India) and took this test. Being a techie, I figured I'd try to fake out the system. This test works out to be 10% of the final grade, and since I had a 98 going into it, I figured I could afford to gamble a little; if it backfired I could blame it on a computer error, since everyone would figure the kid with a 98 MUST be telling the truth.

    I almost wimped out. I wrote about 80 percent of the essay (about the influence of pop culture on society; silly me, I always thought society influences pop culture, but anyway). I had 5 paragraphs: 1 intro, 3 body, 1 half-assed conclusion. I reordered the paragraphs, copied the one I felt was the best written, and pasted it into the body 3 times.

    Guess what I got... 6/6 (on a six-point grading scale, which is pretty messed up because a 5/6 is an 83%). Hopefully they won't audit mine...

  • by LnxAddct ( 679316 ) <sgk25@drexel.edu> on Thursday May 20, 2004 @01:46PM (#9206253)
    Well, I've read almost every post under this story and no one has mentioned how it actually works. From the e-rater site: "E-rater learns to score essays on a particular topic by processing a significant number of essays on the topic, each of which has been scored by two or more faculty readers. While e-rater is a powerful scoring engine, it is not meant to replace a teacher whose judgment is essential to helping students improve their writing ability."

    This means it's essentially a Bayesian filter that, instead of being fed spam and ham and told which is which, is fed good papers and bad papers and told which is which. You could literally reproduce this with something like SpamBayes, but instead of feeding it spam, feed it essays (a toy sketch follows below).

    Anyway, you can probably beat the system, but the fact that it is trained on preprocessed papers means that a) it will be much harder to game the system without knowing which papers were scanned, and b) all future grading is based solely on papers that previous classes have written. The good thing is that unless the teachers are willing to write 20 different papers all on the same subject and scan them in, the computer won't expect anything better than what was previously written. This also means, assuming the teacher won't write essays but rather will use ones from students in prior years, that the essay topics can't be changed; otherwise the teacher would have nothing to scan from prior years and thus nothing to base the grades on. Since the essay topic can't change, you can find someone a year above you and ask them to save their papers for you, or in the worst case, everyone must be given the same topic, so at least you can help each other out. Teachers often give everyone a different topic to keep people from "helping" each other, but with such a system that would be impossible. I sure wish I was back in high school.
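    In the spirit of that SpamBayes analogy, here's a minimal toy sketch of a naive Bayes essay scorer (the training sentences and "good"/"bad" labels are invented for illustration; this is not e-rater's actual algorithm):

    ```python
    # Toy naive Bayes essay scorer, mirroring the spam/ham analogy above.
    # NOT e-rater's real algorithm; the training data here is made up.
    from collections import Counter
    import math

    def tokenize(text):
        return text.lower().split()

    class NaiveBayesScorer:
        def __init__(self):
            self.counts = {"good": Counter(), "bad": Counter()}
            self.totals = {"good": 0, "bad": 0}

        def train(self, essay, label):
            # label is "good" or "bad", as assigned by human readers
            for word in tokenize(essay):
                self.counts[label][word] += 1
                self.totals[label] += 1

        def score(self, essay):
            # Log-odds that the essay resembles the "good" training set,
            # with add-one smoothing so unseen words don't break the math.
            vocab = set(self.counts["good"]) | set(self.counts["bad"])
            log_odds = 0.0
            for word in tokenize(essay):
                p_good = (self.counts["good"][word] + 1) / (self.totals["good"] + len(vocab))
                p_bad = (self.counts["bad"][word] + 1) / (self.totals["bad"] + len(vocab))
                log_odds += math.log(p_good / p_bad)
            return log_odds

    scorer = NaiveBayesScorer()
    scorer.train("the thesis is clearly argued with supporting evidence", "good")
    scorer.train("stuff happened and it was whatever i guess", "bad")
    # Positive score means the essay looks more like the "good" pile.
    print(scorer.score("a clearly argued thesis with strong evidence"))
    ```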
    Regards,
    Steve
  • by Krach42 ( 227798 ) on Thursday May 20, 2004 @02:57PM (#9207287) Homepage Journal
    I'll let you in on how it works: I ported some software that does exactly what this essay grader does, as work for a professor who worked on this stuff.

    It's not a Bayesian filter; it's Latent Semantic Analysis. LSA works by taking large amounts of text and comparing the usage and application of the words within paragraphs. It learns very quickly what words mean, and the interesting thing is that once it's trained far enough, it starts gaining more meaning for its words from where they're not than from where they are. (There's a toy sketch of the mechanics at the end of this comment.)

    LSA has been put through a variety of tests, and has even taken tests itself. It has been shown to produce "average" results on a synonym test. ("A Doctor is: A) Nurse, B) Practitioner, C) Politician, D) Numerologist") It produces incorrect results mainly when one of the given words is more strongly associated with the target word than another, more suitable word. In my example, it would pick "Nurse", not "Practitioner", because it has seen "Nurse" used more often with "Doctor".

    LSA has been seriously tested by the designers to see if they could write a bad essay that gets a good grade. The answer? Yes, it's possible. But you have to REALLY know the subject well (you'd have to produce garbage that relates words accurately to each other), and it takes a lot of time.

    Their recommended best way to cheat the system was to do your research, know your topic, and... write a good essay. Any other way requires too much effort and a vastly superior knowledge of the subject.

    Interestingly, this system can identify plagiarism: give it two papers, and it looks at how closely they match. This catches not just exact copies but also paraphrased plagiarism. The system doesn't really care what the exact words are, since it looks at their similarity. It could tell that "The doctor studied the patient." is just a paraphrase of "The practitioner examined his customer."

    If you train it right, it will even do this between two languages. It's also useful as a spam detector, as it will catch "Enlarge your member" after just one message marked "Make your dick bigger."

    (So I was told by the professor: Apple's Mail.app is supposed to use LSA.)

    For anyone interested: the professor at New Mexico State University was Peter Foltz, and some college up in Colorado was doing a lot of work on it as well.
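    And here's the promised minimal toy sketch of the LSA mechanics: build a term-document matrix, reduce it with SVD, and compare words and documents in the latent space. The corpus here is invented for illustration; real systems train on far larger text collections.

    ```python
    # Toy Latent Semantic Analysis: SVD over a tiny term-document matrix.
    # The corpus is made up; real LSA trains on huge amounts of text.
    import numpy as np

    corpus = [
        "the doctor examined the patient in the clinic",
        "the practitioner examined his patient carefully",
        "the nurse helped the doctor treat the patient",
        "the politician gave a speech about the economy",
        "the senator debated the politician on the economy",
    ]

    vocab = sorted({w for doc in corpus for w in doc.split()})
    index = {w: i for i, w in enumerate(vocab)}

    # Term-document matrix: rows are words, columns are documents.
    A = np.zeros((len(vocab), len(corpus)))
    for j, doc in enumerate(corpus):
        for w in doc.split():
            A[index[w], j] += 1

    # Keep the top k singular dimensions -- the "latent" meaning space.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    k = 2
    word_vecs = U[:, :k] * s[:k]   # word positions in latent space
    doc_vecs = Vt[:k].T * s[:k]    # document positions in latent space

    def cosine(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

    # "doctor" should land nearer "practitioner" than "politician",
    # even though "doctor" and "practitioner" never share a sentence.
    print(cosine(word_vecs[index["doctor"]], word_vecs[index["practitioner"]]))
    print(cosine(word_vecs[index["doctor"]], word_vecs[index["politician"]]))

    # Paraphrase detection: documents 0 and 1 use different words but
    # should score as similar, as in the plagiarism example above.
    print(cosine(doc_vecs[0], doc_vecs[1]))
    ```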
