Indiana First With Computerized Grading
Mz6 writes "Computerized grading has been talked about previously; however, the New York Times reports that Indiana has become the first state to grade high school English essays by computer. The computerized grading process, called 'e-rater', uses a 6-point rating scale and uses artificial intelligence to 'mimic the grading process of human readers'. The system was tested over a 2-year pilot program and produced results virtually identical to those of trained readers. The big question is, will other states begin to emulate Indiana by tossing human grading?"
OSS? (Score:3, Interesting)
I would have loved this as a kid (Score:5, Interesting)
Google Bombing (Score:1, Interesting)
I wonder if it will be as simple as repeating a high ranking sentence?
Stupid (Score:4, Interesting)
What about tricking the software? (Score:5, Interesting)
Re:I would have loved this as a kid (Score:4, Interesting)
Not the First (Score:5, Interesting)
It's About Time (Score:1, Interesting)
If he or she didn't like what you wrote, or took a point of view opposite to theirs, you would get a lower grade. Frequently, the "special" students would get the benefit of the doubt, and easy grading just for exceeding their own limitations. An 'A' paper in one English class could be a B- in another, etc, etc.
With this computer grading, these students now know that they will be treated equally, and not bitch about potential human biases. Then, everyone will have a fair shot.
In fact, (Score:3, Interesting)
Re:Stupid (Score:2, Interesting)
identical results to those of trained readers... (Score:4, Interesting)
Re:I smell lawsuits, how about you? (Score:5, Interesting)
What I see as being problematic is kids learning to beat the system. Typically these systems are predicated on grammatical analysis (use of punctuation and sentence completeness) and evidence of citing the text the question is based off. I'd bet it's a real easy system to beat.
In Other News (Score:5, Interesting)
Indiana Director of State Board of Ed comments: "Isn't it wonderful how technology is improving education?"
Completely offtopic papers (Score:2, Interesting)
Missing feedback (Score:3, Interesting)
How do they judge the content? What if you submit an excellent paper on middle ages history but the assignment was on socialism?
Human feedback is required in order to learn how to write well, you can't just expect a machine to tell you how to improve your writing. Grammar perhaps, but not ideas and how to let them flow coherently.
In order for these students to get that feedback someone has to read it, and since they're reading it anyway, why not just grade it then?
Seems like they are trying to solve the wrong problem with this system, or a problem that doesn't exist. (Are there really so many papers to mark you need a machine to do it?)
Patriot Act Junior (Score:2, Interesting)
Grammar, 90%
Spelling, 95%
Patriotism, 80%
also:
I'd love to see famous writings graded by this system.
Essay grading is harder than science grading (Score:3, Interesting)
I say this because there is an objective criterion for grading the solution to a physics or math problem: correctness. For essays I do not believe that we (and the current state of AI) can come up with an exact criterion like that. You might determine whether an essay is too different from essays which were written by experts, but can't a very different essay be just as good?
To my knowledge the AI programs can solve physics problems which are limited to some well defined domain (for example: http://www.cs.utexas.edu/users/novak/cgi/isaacdem
I will accept an essay grading program after they grade solutions to math and physics problems.
I conjecture that some writers would feel offended if their essay did well according to the program: they might think it means they are too conformist and conservative and not novel in their approach...
Matyas
Re:I smell lawsuits, how about you? (Score:4, Interesting)
You can't grade subjectively because those grades will be compared objectively down the line. You can't say "this is pretty good for Kevin, I'll give him an A", but then say "Josh's paper is way better than Kevin's paper, but Josh is a bright kid, so I'm giving him a C". Kevin will think he's mastered the English language while Josh will go insane trying to achieve perfection.
Grading, when used for anything other than helping the teacher learn about each student, just plain sucks, and is only used for competition.
Re:Stupid (Score:3, Interesting)
A computer can obviously not grade essays fairly, so it shouldn't be done.
The article states that comparisons between the computer grading and human grading revealed nearly identical scoring. By this data, I don't think it's obvious that a computer can't be as fair as a human.
I got a 5/6, which, according to the computer, was extremely well. However, this was an 83%, which brought down my grade significantly. This computerized grading isn't fair at all.
What does the computer have to do with it? You would have been in the same situation if a human had given you the 5/6. Perhaps what you mean to assert is that the entire test isn't a fair assessment of performance; but that sounds like sour grapes from someone who didn't get a 6. :-)
at least the system will be applied uniformly (Score:1, Interesting)
Re:I smell lawsuits, how about you? (Score:3, Interesting)
Re:No way this is sound (Score:3, Interesting)
Re:Computers can't grade "interesting"! (Score:1, Interesting)
Exactly! That's why a writer or journalist has an editor. A good, interesting writer who makes a few minor spelling, grammar or syntactic errors on occasion is going to have no problem finding work, while someone who never makes that type of mistake but whose writings are just plain boring never will. That's why the pros have editors, to catch the small things and offer suggestions on how to improve what's there. A good editor can take a good work with some errors, eliminate those errors and offer suggestions on how to make the piece great. A good editor cannot, however, take a boring piece and make it great. The work, ultimately, has to come from the writer. An editor can help improve a work, but he can't transform it, because that would make him a co-author of the material. Ultimately a good writer has to produce interesting and engaging material. Grammar and spelling and syntactic errors can be fixed. If you are incapable of producing something interesting and engaging, there isn't an editor on the planet that can help you.
A third source: TurnItIn.com-style relationships (Score:3, Interesting)
If it doesn't already, I would expect a service like this will eventually include plagiarism detection, due to marketing pressure if nothing else. This is something that human graders do, at least over the space of papers they grade and works they remember.
But if plagiarism detection is added, then the grading service would have to make and retain some encoding of each graded paper, a derivative work, in its database.
Once that happens, the grading service also becomes subject to all of the issues already raised with services like TurnItIn.com [turnitin.com], already discussed here [slashdot.org].
I also found this comment from ETS's site [ets.org] rather strange, to say the least:
Re:Stupid (Score:3, Interesting)
Do you have any evidence or thought behind your statement that "A computer can obviously not grade essays fairly, so it shouldn't be done"? Why is that obvious? Is it obvious only because the grading of your essay was, in your opinion, not fair?
I suggest you review difference in usage between "good" and "well." Proper grammar/word choice is a large part of what makes a good essay.
How is the conversion from the computer's 6 point scale to your teacher's/school's 100 point scale the fault of the computer? It appears that what you really mean is that the implementation of the computer grading system is not fair as it doesn't use an appropriate scale.
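To make the scale point concrete, here is a minimal sketch of two ways a 6-point score could be turned into a percentage. The linear conversion is what produces the 83% mentioned above; the band table is purely hypothetical, invented here for illustration, and is not anything e-rater or a particular school is known to use.

```python
def linear_percent(score, max_score=6):
    """Naive conversion: treat the rubric score as a fraction of the maximum."""
    if not 0 <= score <= max_score:
        raise ValueError("score out of range")
    return 100.0 * score / max_score


# HYPOTHETICAL band table: these numbers are made up for illustration only.
HYPOTHETICAL_BANDS = {6: 100, 5: 93, 4: 85, 3: 75, 2: 65, 1: 50}


def banded_percent(score):
    """Look the rubric score up in a teacher-chosen table instead of dividing."""
    return HYPOTHETICAL_BANDS[score]


if __name__ == "__main__":
    print(round(linear_percent(5), 1))  # the naive 5/6 conversion
    print(banded_percent(5))            # the same 5 under the made-up table
```

Either way the computer's output is the same 5; only the mapping applied afterwards changes the letter grade.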
tricking it (Score:2, Interesting)
"Experienced writers, teachers, and writing assessment specialists have tested e-rater to determine the extent to which it "understands" the content of essay responses. Some of these writers have submitted essays that have tricked e-rater into giving a score even though the essay does not make any sense. The individual words in these "challenge essays" are grammatically correct, but they are strung together in such a way that they create nonsense sentences."
That observation shouldn't be surprising because earlier it says: "An e-rater score will be most beneficial to students who make a good faith effort at using it to improve their writing skills."
The program works (grossly oversimplified) by mimicking the grading of humans on essay samples.
I scored 5/5 on the AP English exam... (Score:3, Interesting)
I just signed up for a userid so I can take the exam online, but after submitting my info it said I may have to wait up to two days to get an account.
Curious that they can grade essays with a computer but it looks like they have to have a human pass out the user ids.
Anyway, I'll see if I can submit one of my articles to the exam, and will post here how I did. Since I have to wait for my user ID, you'll have to look back here later to see how I did.
Comment removed (Score:5, Interesting)
AP Essay Rubric (Score:3, Interesting)
A friend of mine who teaches Biology said that she saw some pretty bad essays that she would have given a poor grade because the English was atrocious, but she had to follow the grading rubric and give them high scores because the keywords were present.
Re:Missing feedback (Score:3, Interesting)
They can't. But on a standardized test, you don't get any feedback anyway.
"In order for these students to get that feedback someone has to read it, and since they're reading it anyway, why not just grade it then?"
Because it takes too long.
"Are there really so many papers to mark you need a machine to do it?"
Yes. Human graders for standardized tests get about 1-3 minutes per paper. Human graders don't have time to really read your essay, so they grade you on the same kinds of criteria as this software does (grammar, spelling, a clear layout, etc.). Thus, it's not hard to create a computerized system which performs the same tasks as well as human graders.
"How do they judge the content? What if you submit an excellent paper on middle ages history but the assignment was on socialism?"
As I said, human graders don't have time to evaluate this anyway. The computer systems actually tend to be better at this.
Look, this system cannot and should not be used to replace English teachers grading papers. A good English teacher will spend at least 30-60 minutes on a paper, and will write lots of comments.
What this system *is* good for is standardized tests. When everyone takes a test, you have (in many cases) a million tests to grade. This system can blast through the data and can actually perform better than an underpaid, overworked employee who has 1-2 minutes to grade a paper (and who grades hundreds of papers per day).
Re:the triumph of mediocrity (Score:2, Interesting)
Nobody wants to read 5-paragraph themes all day long, even if they do get the point across. They are just a means to an end.
One of the best English teachers I've ever had would point to the use of alliteration, clever turns of phrase, humor, novel word choice (not just synonym-madness), and other completely subjective facets of writing as some of what makes the written word worth reading.
I had computerized grading over 20 yrs ago! (Score:1, Interesting)
Using my TRS-80 Color Computer and DMP-100 dot matrix printer, I offered an alternative scholarship program. For a $10 fee, I would print a report card that was identical to the real ones, except for the grades. My "clients" would take their real report card, pencil in their new grades, and my computerized grading system would do the rest. Each kid would go home and say that he forgot his report card in his locker and would bring it home tomorrow. I would deliver the new & improved report cards the next day and all was well.
The "official" grades remained unchanged, so it was up to each client to avoid flunking courses that would prevent graduation. Anyone who failed a mandatory course was ineligible for my "service". One client tried to blackmail me into providing the service for free, but I said, "Just try and get someone to believe that report cards are being manufactured in a student's house."
The only disappointment I had was when some kids decided to publish an underground newspaper. I wanted to take out an ad, and they refused.
Re:OSS? (Score:3, Interesting)
My guess is that they are talking only about things like this [gnu.org]. I used to use a similar program back when I was taking English classes, in order to bring my papers down to an 8th-grade reading level.
These are incredibly easy to mess around with. For example, the fog index is:
Fog Index = 0.4*(words/sentences+100*((words >= 3 syllables)/words))
Which is roughly equal to the school grade reading level required for the essay. If I remember correctly, Associated Press articles are written to a 4th-grade reading level, which is why all of the paragraphs are only one sentence long.
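For anyone who wants to poke at it, here is a rough sketch of the fog index formula above in Python. The syllable counter is an assumption of mine (it just counts runs of vowels), and the sentence splitting is equally crude, so the numbers will not match what a real readability tool reports; it is only meant to show how the formula responds to sentence length and long words.

```python
import re


def count_syllables(word):
    """Crude heuristic (an assumption, not a real syllable counter):
    count runs of consecutive vowels, with a minimum of one."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def fog_index(text):
    """Fog Index = 0.4 * (words/sentences + 100 * (words with >= 3 syllables / words))"""
    # Split on sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word and one sentence")
    complex_words = [w for w in words if count_syllables(w) >= 3]
    return 0.4 * (len(words) / len(sentences)
                  + 100 * len(complex_words) / len(words))


if __name__ == "__main__":
    one_long = "The cat sat on the mat and the dog ran away from the house."
    chopped = "The cat sat on the mat. The dog ran away. It left the house."
    print(fog_index(one_long))
    print(fog_index(chopped))  # shorter sentences give a lower score
```

Chopping one long sentence into several short ones lowers the score without changing what is said, which is exactly the kind of messing around described above.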
Re:OSS? (Score:3, Interesting)
readability grades:
Kincaid: 6.4
ARI: 6.6
Coleman-Liau: 9.2
Flesch Index: 77.8
Fog Index: 8.5
Lix: 35.8 = school year 5
SMOG-Grading: 8.0
sentence info:
408 characters
96 words, average length 4.25 characters = 1.33 syllables
6 sentences, average length 16.0 words
50% (3) short sentences (at most 11 words)
33% (2) long sentences (at least 26 words)
3 paragraphs, average length 2.0 sentences
0% (0) questions
100% (6) passive sentences
longest sent 28 wds at sent 2; shortest sent 6 wds at sent 4
word usage:
verb types:
to be (9) auxiliary (0)
types as % of total:
conjunctions 1(1) pronouns 9(9) prepositions 14(13)
nominalizations 1(1)
sentence beginnings:
pronoun (3) interrogative pronoun (0) article (0)
subordinating conjunction (1) conjunction (0) preposition (0)