Education Technology

Grading Software Fooled By Nonsense Essay Generator 187

Posted by samzenpus
from the it-is-good-report dept.
An anonymous reader writes "A former MIT instructor and students have come up with software that can write an entire essay in less than one second; just feed it up to three keywords. The essays, though grammatically correct and structurally sound, have no coherent meaning and have proved to be graded highly by automated essay-grading software. From The Chronicle of Higher Education article: 'Critics of automated essay scoring are a small but lively band, and Mr. Perelman is perhaps the most theatrical. He has claimed to be able to guess, from across a room, the scores awarded to SAT essays, judging solely on the basis of length. (It’s a skill he happily demonstrated to a New York Times reporter in 2005.) In presentations, he likes to show how the Gettysburg Address would have scored poorly on the SAT writing test. (That test is graded by human readers, but Mr. Perelman says the rubric is so rigid, and time so short, that they may as well be robots.)'"
This discussion has been archived. No new comments can be posted.

  • by litehacksaur111 (2895607) on Wednesday April 30, 2014 @08:39PM (#46885707)
    I though most schools don't even care about the essay. Also the elite schools nowadays prefer the ACT and SAT II subject tests to demonstrate real knowledge. The SAT is really a dumb test, especially with all the coaching resources available now.
    • Re: (Score:3, Funny)

      by Anonymous Coward

      Your post tells me that you didn't score all that well on the SAT. Bad grammar, incoherent thoughts.

  • Irrelevant (Score:3, Insightful)

    by Ol Olsoc (1175323) on Wednesday April 30, 2014 @08:50PM (#46885771)
    As long as Precious gets an 'A', Helicopter Daddy and Blackhawk Mommy won't try to have the school president fired for ruining Precious's permanent record.
    • Re: (Score:3, Funny)

      Hey, Helicopter Daddy and Blackhawk Mommy dropped good boodle for that 'A', mister!
      You can just stand down from all that meritocratic whinging right now, mister.
    • That's when I wish I had a SAM for such occasions...

    • Re:Irrelevant (Score:4, Interesting)

      by TheMeuge (645043) on Wednesday April 30, 2014 @09:08PM (#46885863)

      At least helicopter daddy and blackhawk mommy give a shit about the Precious. Or do you prefer the absent daddy and welfare mommy? People DO go overboard... but I feel like the pendulum is starting to swing entirely too far the other way.

      • by AK Marc (707885)
        It's swung so far the wrong way that in many places, aggressive parents are required to get the minimum education prescribed by law. If you don't have active parents, the school actively punishes the children. The thought is that they are already failures because of their parents, so better to get them used to failure and hope they drop out, so as not to harm the school's statistics.
      • I feel like the pendulum is starting to swing entirely too far the other way.

        What we have here is a superposition of extreme pendulum states.

        (Actually, no: we simply have multiple pendulums at opposite extremes. However, calling it a "superposition" is more fun!)

  • Since the essays are grading subject knowledge, and it takes subject knowledge to provide the keywords, it is fairly irrelevant if the essay happens to be structured in a manner that is nonsensical.

    Demonstration of deeper understanding, if it needs to be tested, can be achieved via other types of questions.

    • The next generation of the software will have a keyword database attached for every subject possible to ensure that every student takes different keywords (chosen randomly from the stock).

      Then your grades are pretty much dependent on whether the random number generator chooses keywords that the grading software likes.

      I fail to see the difference to now, to be honest, it's just way less work on the student's side.

    • by EmagGeek (574360)

      Did you happen to read TFA? In the TFA, it is said that the College Board does not take points off for factual errors. In fact, it says that it cares not for factual errors, because errors in fact seldom subtract from the quality of the essay being graded.

      WTF, right?

      • by MrBigInThePants (624986) on Wednesday April 30, 2014 @09:05PM (#46885851)
        Not being from the USA, every article I ever read about your education system just leaves me scratching my head.

        How on earth did you guys let it get so ridiculous??
        • by TubeSteak (669689)

          How on earth did you guys let it get so ridiculous??

          Never underestimate the power of Intelligent Design.

        • by hendrips (2722525)

          Well, not that it's much of a defense, but absolutely no one that I know of took the SAT essay section seriously. I have not heard of any university that actually considered that section of the SAT when making admission decisions. So our education establishment wasn't completely stupid, I guess.

  • by Anonymous Coward on Wednesday April 30, 2014 @08:56PM (#46885793)

    ... because Slashdot shows that humans already make evaluations about articles without reading them.

  • Quid pro quo (Score:4, Insightful)

    by Opportunist (166417) on Wednesday April 30, 2014 @08:57PM (#46885797)

    When you're too lazy to read my essay to grade me and let software do it, I don't really see no moral problem with doing the same to write the essay.

    • > I don't really see no moral problem

      I guess someone should have graded your essays a little more closely instead of relying on a robot.

    • Re:Quid pro quo (Score:5, Insightful)

      by Anubis IV (1279820) on Wednesday April 30, 2014 @09:56PM (#46886087)

      As someone who graded hundreds of essays while serving as a teaching assistant for a senior-level engineering ethics course, I have to say that I find your lack of integrity rather appalling. Your moral obligation to write the essay yourself is independent of the method they use for grading it. Just because someone else is doing a lousy job does not mean that you suddenly have a license to short-change them for what you're obligated to do.

      I would guess that I graded around 300-400 essays during the three semesters I served as a TA, and that I probably averaged around 20 minutes per essay, since I was a strong believer in providing useful feedback over things the students could improve, even if they weren't necessarily incorrect. That said, other TAs spent as little as a minute or two per essay, and barely provided any feedback at all. Regardless of how much time the TAs did or didn't spend on the essays, however, the students had the same obligations, and rightfully so.

      • by Anonymous Coward

        If I've been hired to build a Potemkin village, then it would be unethical of me to spend time constructing interiors for the buildings.

        The English department has some nice courses on compositional writing where I can get real feedback on my progress on those skills. As far as the machine-graded essays for any other Department -- either I understood the topic before writing the essay or I didn't and if I didn't then a no-feedback essay isn't going to fix the problem.

        • If I've been hired to build a Potemkin village, then it would be unethical of me to spend time constructing interiors for the buildings.

          Not unethical, idiotic.

      • Re:Quid pro quo (Score:4, Insightful)

        by number17 (952777) on Wednesday April 30, 2014 @10:24PM (#46886201)

        Your moral obligation to write the essay yourself is independent of the method they use for grading it.

        Students pay big bucks and expect to have experts in the field teach them and grade their work. It sounds like these schools are off-shoring their marking so that they can do other work (i.e., research). If the school was upfront, before paying tuition, that they were going to just send your essay to Bangladesh for marking then I would be ok with having a moral obligation to write the essay myself.

        • by thegarbz (1787294)

          Students pay big bucks and expect to have experts in the field teach them and grade their work.

          What has one got to do with the other? I'd rather have a fantastic teacher from whom I learn and receive zero feedback and a token passing grade than a shithouse teacher who makes me study for the test which I pass with flying colours and was well read.

          Not all effort put in by students and teachers is equally valuable.

        • Students pay big bucks and expect to have experts in the field teach them and grade their work.

          So if I feel I'm being shortchanged, I'll just not do the work, ensuring that I don't get the education I'm paying for. That'll show 'em!

      • Your moral obligation to write the essay yourself is independent of the method they use for grading it.

        That's an interesting claim. I'd be curious to hear you make an argument to support it.

      • "I have to say that I find your lack of integrity rather appalling."

        Unfortunately, engineering ethics is something that is normally taught at the undergrad level. With the onslaught of international graduate students and H-1B workers, engineering ethics becomes "baggage" that hurts the competitiveness of domestic students or workers who have to compete with people who treat plagiarism as an honorable activity. Any person with integrity will lose out to those who have no bottom line when it comes to achieving their goals.

      • by ruhri (1480067)

        As someone who graded hundreds of essays while serving as a teaching assistant for a senior-level engineering ethics course, I have to say that I find your lack of integrity rather appalling.

        As someone who served on the IEEE ethics committee I find your appeal to argumentum ab auctoritate [wikipedia.org] rather appalling. You should know the distinction between ethics and morals. One could make the Utilitarianist [wikipedia.org] case, in which (arguably) the behavior cited is morally OK. One could also make the Kantian [wikipedia.org] argument that (arguably) comes closer to what you were condoning.

        Regardless of how much time the TAs did or didn't spend on the essays, however, the students had the same obligations, and rightfully so.

        As an assignment for your ethics class: please elaborate, under which ethical systems, the above statement holds true or not, and why.

        • As someone who served on the IEEE ethics committee I find your appeal to argumentum ab auctoritate rather appalling.

          Fair enough. Truth be told, I was merely trying to mirror his opening statement by providing a contrast from the other side. It was not my intent to use my former position in such a manner, though I can certainly see how it comes across that way. The fault lies with me on that one. I should have been more careful.

          You should know the distinction between ethics and morals.

          I do, though depending on what systems we're talking about, that distinction evaporates.

          As an assignment for your ethics class: please elaborate, under which ethical systems, the above statement holds true or not, and why.

          I'll admit, I've always leaned quite a bit more towards deontological approaches to analyzing situations in m

          • by ruhri (1480067)

            Excellent reply. A+. ;-)

            I, of course, absolutely agree with your original statement, but I also think the GP wanted to point out the much more important ethical aspect: should we build and use machines for something that is such a profoundly human activity, i.e. the communication and exchange of ideas? Taking the Kantian approach here as you so eloquently pointed out in your post: Since I as an essay writer (and reader, FWIW) expect to communicate with humans, using machines for either or both of these tas

            • Hah, thanks!

              And I think a lot of that gets back towards the purpose of the essay. I'd suggest that we write essays for our courses, not for the purpose of communicating, but rather for the purpose of improving our communication (a subtle, but important, distinction).

              If it's possible to write software that can capably analyze how skilled we are at communicating, I'd have trouble coming up with any objections to using it, given that it could successfully serve the same purpose as the human grader. That said,

        The problem is that technology allows universities to take short cuts in education, and not to the students' advantage. Add to that some of the current goings on in the university system, and the future of the education system is a little worrisome (then again the future has always been worrisome and somehow we've muddled through).

        But, while before you might have a few bad apples not providing sufficient feedback to students (or not doing it in a useful way) you have, as matters of policy, short cuts.

        Why pa

      • "As someone who graded hundreds of essays while serving as a teaching assistant for a senior-level engineering ethics course, I have to say that I find your lack of integrity rather appalling. Your moral obligation to write the essay yourself is independent of the method they use for grading it."

        No, it isn't.

        Once you failed on your end of the contract (in this case, that you will do a serious attempt to grade my intimate knowledge on the issue by using experts to review my work) you shouldn't hold any assum

        I want to add a data point (anecdote). I'm an English professor. For several years, I've noticed that students will keep repeating the same easily-corrected mistakes in paper after paper. I offer corrections, advice, and instruction in class. I began to suspect that students were not reading the comments. I gradually came to believe that the phones out in class were not being used for note-taking. This semester I have taught a content-heavy lecture course. Since it's new, and I'm up for tenure, and student
      • by ultranova (717540)

        Your moral obligation to write the essay yourself is independent of the method they use for grading it.

        That is highly questionable, but let's start with a simpler one: on what basis would you have such a moral obligation in the first place? Simply because someone who has power over you said so?

        Regardless of how much time the TAs did or didn't spend on the essays, however, the students had the same obligations, and rightfully so.

        Go on, don't leave us hanging: "rightfully so, because..."?

      • by bmo (77928)

        Just because someone else is doing a lousy job does not mean that you suddenly have a license to short-change them for what you're obligated to do.

        TAs spent as little as a minute or two per essay

        Read what you just posted. Then read it again.

        The only person being shortchanged in this case is the student who is actually footing the fucking bill for an education. If the education is a fraud because grading is done on whim and in a slapdash manner, which is what you are describing, then what is the fucking p

      • by tlhIngan (30335)

        As someone who graded hundreds of essays while serving as a teaching assistant for a senior-level engineering ethics course, I have to say that I find your lack of integrity rather appalling. Your moral obligation to write the essay yourself is independent of the method they use for grading it. Just because someone else is doing a lousy job does not mean that you suddenly have a license to short-change them for what you're obligated to do.

        I would guess that I graded around 300-400 essays during the three se

    • Re:Quid pro quo (Score:4, Interesting)

      by clifyt (11768) <[moc.liamg] [ta] [rettamkinos]> on Thursday May 01, 2014 @12:15AM (#46886645) Homepage

      As someone that wrote software like this -- and disagreed with the subject of the story a decade ago when he tried to get us with both the Gettysburg Address as well as Kennedy's inaugural address (both of which are GREAT speeches with historical value, but shitty college entrance exams) -- you are looking at this entirely wrong.

      I can give you background of how these things are generally graded. 3 people get an essay, look at it for 30 to 45 seconds, throw a score at it, and if they are all within a margin of error, they move on. If not, a senior rater comes in, and if they can replace one other person and it is now within margin of error, they move on as well. If not, it is workshopped for 5 minutes.

      In 99% of the cases, you have less than 2 minutes of viewing on your essay between 3 people.
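
      The three-step process above can be sketched roughly as follows. This is only an illustration, not the actual scoring system: the function names, the outcome labels, and the default margin of 1 point are all assumptions, since the comment does not say what the margin of error was.

```python
def within_margin(scores, margin):
    """True if the highest and lowest scores differ by at most `margin`."""
    return max(scores) - min(scores) <= margin


def adjudicate(rater_scores, senior_score, margin=1):
    """Hypothetical sketch of the three-step process described above.

    Returns (outcome, scores). The margin of 1 point is an assumed value.
    """
    scores = list(rater_scores)
    # Step 1: the three original raters already agree closely enough.
    if within_margin(scores, margin):
        return "accepted", scores
    # Step 2: try letting the senior rater's score replace one rater's score.
    for i in range(len(scores)):
        trial = scores[:i] + [senior_score] + scores[i + 1:]
        if within_margin(trial, margin):
            return "accepted_with_senior", trial
    # Step 3: still no agreement, so the essay goes to a workshop.
    return "workshop", scores


print(adjudicate([4, 4, 5], senior_score=3))
print(adjudicate([4, 5, 2], senior_score=4))
print(adjudicate([1, 3, 6], senior_score=3))
```

      In this sketch the senior score is passed in up front for simplicity; in the process described, the senior rater would only be called in after the first check fails.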

      Enter the computer...the raters are told they are going to be rated themselves. We can throw a lot more prerated essays that had been normed by a large group of raters, and train the rater. They know they are being measured and the average rater spends two or more minutes reading through these. You actually have MORE time with eyes on your essay with a computer rater involved than you do without. Having a computer rater doesn't remove humans -- it adds a safeguard. It means one person spends more time and is verified with something that is unbiased (within reason...actually was able to figure out subtle racism and otherwise that wouldn't have been detected with purely human raters...'black' or 'hispanic' names and scores go down...'asian' names and the scores go up...give the same essay with the names switched and the humans change ratings...the computer was actually more objective).

      I haven't been involved with this sort of thing in a decade, and I can only assume it is much better than when I left my project...but lazy isn't the right word. Underpaid and overworked? Yeah...but not lazy.

      • Racism, sexism and other discrimination is quite effectively countered with anonymous grading. My university gave you a unique number before each exam and you put only that number on the sheets. Only afterwards did the administrators (not anyone involved in the course) look up and file the exam under your name. I found this helpful as a TA too because we really wanted to be fair both in grades and comments.

        You can still be biased by the handwriting but we tried to counter that ourselves. If someone in my TA

  • by Joe_Dragon (2206452) on Wednesday April 30, 2014 @09:11PM (#46885871)

    Student athletes need something like this; with 60 hours a week playing football, they don't have time for class.

  • If they're using some stupid automated grader, odds are a computer-generated essay could consistently grade higher than any humans (because it can focus on scoring without worrying about content).

  • by TsuruchiBrian (2731979) on Wednesday April 30, 2014 @09:25PM (#46885955)

    I don't see a problem with automated essay graders in principle. It's just that the current essay graders are no good. Once we are able to make computer software that can actually understand essays as well as a human, it should be perfectly competent to grade an essay.

    I certainly see the motivation to have a computer grade essays. Who wants to read multitudes of mediocre essays. I might rather be put in solitary confinement. I am all for the automated essay graders, but only after they can be proven to be as competent as a human.

    I have no idea how to make such a competent essay grader, but I do know how to grade an essay grader. You have a bunch of computer graders and human graders grading the same essays. If the computer graders show a more consistent performance than the humans (i.e. are the outlier less frequently), then the computer grader is better.

    If a paper is scored by 4 human judges and a computer, and the humans score the paper 1, 2, 3, 4, and the computer scores the paper as a 9, then it means that according to most of the human graders, the computer was way off. Essays are inherently subjective. Are the humans right or is the computer right? Who cares? It doesn't matter.

    If a paper is scored by 4 human judges and a computer, and the humans score the paper 4, 5, 7, 9, and the computer scores the paper as a 6, then it means that according to every human grader, the computer did better than half the humans.

    If a computer can do better than the humans even by human standards, then I think it's fair to say that a computer is good enough.
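
    One way to make the "outlier less frequently" test concrete is sketched below. It is only an illustration under stated assumptions: "outlier" is taken to mean the grader whose score is farthest from the median of all the other graders' scores, and the function names and panel layout are made up for the example.

```python
from statistics import median


def deviations(scores):
    """For each grader, distance from the median of all the *other* graders."""
    return [
        abs(s - median(scores[:i] + scores[i + 1:]))
        for i, s in enumerate(scores)
    ]


def outlier_index(scores):
    """Index of the grader farthest from everyone else (first one on ties)."""
    devs = deviations(scores)
    return devs.index(max(devs))


def outlier_rate(panels, grader):
    """Fraction of essays on which `grader` was the panel's outlier."""
    hits = sum(1 for scores in panels if outlier_index(scores) == grader)
    return hits / len(panels)


# Each panel lists four human scores followed by the computer's score (index 4).
# These are the two examples from the comment above.
panels = [
    [1, 2, 3, 4, 9],
    [4, 5, 7, 9, 6],
]
COMPUTER = 4
print(outlier_rate(panels, COMPUTER))
```

    With only these two essays the computer is the outlier on the first and not on the second. A real comparison would need many essays, and would compare the computer's outlier rate against each human's.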

    • by clifyt (11768)

      I helped design one of these essay graders a decade+ ago with Dr. Ellis 'Bo' Page (Duke and MIT).

      Even then, we were as good as humans in solely grammar and mechanics and all that sorta stuff. We were rating on a 6 point scale and something like 70% of the scores were a perfect match, and 85% were within 1 point.

      Given that we were using professional human raters that were trained on a weekly basis and had round tables to go over controversial papers, and these were considered some of the best in the US at the

      • If it becomes the case that writing style is able to be analyzed and produced by a computer algorithm, it seems to me that having a good writing style will become like having good arithmetic skills (i.e. less importance is placed on these skills as they become trivial for machines to replicate), and ironically this ability to automatically test and reproduce skills drives those very skills into obscurity.

        It seems like the skills that computers can't do yet are the only ones that it is worthwhile for humans

    • "I don't see a problem with automated essay graders in principle."

      I don't see a problem with automated essay creators then.

      "Who wants to read multitudes of mediocre essays."

      Nobody. That's why they attach a paycheck by the end of the week to that activity. If you think that's not fair, you can forego your paycheck at any time.

      "If the computer graders show a more consistent performance than the humans (i.e. are the outlier less frequently), then the computer grader is better."

      ON AVERAGE. It happens that it

      • Nobody. That's why they attach a paycheck by the end of the week to that activity. If you think that's not fair, you can forego your paycheck at any time.

        I think you missed my point. It's also boring to calculate logarithms by hand. Before we had digital computers, skilled human computers (usually women) were paid to tediously do this work. It wasn't fair or unfair. It was a waste of human effort to do something so tedious. With the advent of computers, that human effort could be spent on much more interesting things, like programming computers to perform more tedious tasks.

        ON AVERAGE. It happens that it is the outstanders the ones that have more potential and you are just conciously throwing all them by the bathtub.

        If a computer can score an essay between where all the human graders scored the

  • by sootman (158191) on Wednesday April 30, 2014 @09:37PM (#46885997) Homepage Journal

    Artificial intelligence, while seemingly tasty on the surface, tends to be underwhelmed by insufficient fish, with regard to warrantless searches.

    • by mpe (36238)
      Artificial intelligence, while seemingly tasty on the surface, tends to be underwhelmed by insufficient fish, with regard to warrantless searches.

      AI is HARD. Plenty of tasks which people can do easily are difficult to get machines to do, even throwing lots of processing resources at the problem.
      Natural Language Processing is one of these difficult problems. With "grading essays" also being nowhere near beginner level NLP.
      Quite possibly actual NLP experts would not attempt to write such software, because t
  • Each to each.

    I do not think that they will sing to me....

    • by bobjr94 (1120555)
      I have checked a bunch of websites and some searching and found no link to this babel generator or even a small excerpt from the submitted paper. I would have expected at least one if not both to be easily found.
  • Example from article: "Privateness has not been and undoubtedly never will be lauded, precarious, and decent.". There are too many comments on news sites which read like that.

  • Finally we know where some of these Slashdot articles are generated!
  • Quick! Where's the German version? I need to boost my sociology grades!

    Seriously, the first thing you have to thoroughly disable when doing sociology is your brain and any sense of logic or common sense in it. The bizarre bullshit that is put out in this field even at academic level is mind-boggling. The blatant nonsense that's in the books and readers of this subject is unbelievable. ... I need that generator to keep my braincells from killing themselves to end the agony.
