Stop, Historians! Don't Copy That Passage! Computers Are Watching

January 26, 2002

By EMILY EAKIN

 

These are boom times for muckrakers on the scholarship

beat. In the last month alone, not one but two of the

nation's most high-profile historians, Stephen Ambrose and

Doris Kearns Goodwin, stand accused of plagiarism in cases

that are generating headlines and hand-wringing.

Sensing an opportunity to uncover front-page-worthy fraud,

journalists armed with Post-It notes - and anonymous tips

about the thefts - have turned into literary gumshoes,

painstakingly combing through books in the library stacks.

But the job needn't be so taxing. Over the last decade,

plagiarism detection has gone high-tech. Today's software

market is flooded with programs designed to rout out

copycats with maximum efficiency and minimum effort.

Historians were among the first scholars to try to nail a

plagiarism suspect with a computer. In 1991, in a case that

became famous in academic circles, several historians filed

a complaint with the American Historical Association

charging Stephen B. Oates, a historian at the University of

Massachusetts at Amherst and the author of a well-regarded

1977 biography of Abraham Lincoln, with plagiarism.

As evidence, Mr. Oates's accusers pointed to passages in

his book that closely resembled passages in a 1952

biography of Lincoln by Benjamin P. Thomas. Mr. Oates

furiously denied the charges, attributing any similarities

between the two books to a reliance on the same historical

sources. Twenty-three colleagues signed a public statement

calling the plagiarism charges "totally unfounded." After

deliberating on the case for a year, the association ruled

that Mr. Oates had "failed to give Mr. Thomas sufficient

attribution for the material he used," but carefully

avoided the word plagiarism.

Some of Mr. Oates's opponents were convinced he was being

let off the hook too easily. One hit on the idea of having

a computer judge the case and approached Walter Stewart and

Ned Feder, scientists at the National Institutes of Health

in Bethesda who had developed what the media dubbed a

"plagiarism machine."

Mr. Stewart and Mr. Feder spent four months on the project.

By the time it was over, they had scanned more than 60

books into a computer and compared them not just to Mr.

Oates's Lincoln biography but to his subsequent biographies

of William Faulkner and the Rev. Dr. Martin Luther King Jr.

as well. Their software followed a simple rule: each time a

string of at least 30 characters in one of Mr. Oates's

books matched a string of 30 characters in one of the other

books, the computer made a note. (Strings of fewer than 30

characters were apt to turn up meaningless matches -

including common proper names and phrases.)
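
The 30-character rule described above can be sketched in a few lines of Python. This is only an illustration, not Mr. Stewart and Mr. Feder's actual program, whose internals the article does not describe. The function names, the lowercasing and whitespace normalization, and the merging of overlapping matches into one flagged passage are all assumptions made for the example.

```python
def normalize(text: str) -> str:
    # Assumption: ignore case and line breaks so that scanned text lines up.
    return " ".join(text.lower().split())


def shared_strings(suspect: str, source: str, min_len: int = 30) -> list[str]:
    """Flag passages of `suspect` built from windows of `min_len`
    characters that also appear somewhere in `source`."""
    suspect, source = normalize(suspect), normalize(source)
    # Index every min_len-character window of the source text.
    windows = {source[i:i + min_len] for i in range(len(source) - min_len + 1)}
    runs: list[list[int]] = []  # [start, end) offsets into suspect
    for i in range(len(suspect) - min_len + 1):
        if suspect[i:i + min_len] in windows:
            if runs and i <= runs[-1][1]:
                # Overlaps or touches the previous match: extend it.
                runs[-1][1] = i + min_len
            else:
                runs.append([i, i + min_len])
    return [suspect[start:end] for start, end in runs]


if __name__ == "__main__":
    # Invented sample sentences, not quotations from any of the books.
    earlier = "In the spring of 1861 the president walked alone along the river at dusk."
    later = "Oddly, the president walked alone along the river at dusk, or so it seems."
    for passage in shared_strings(later, earlier):
        print(repr(passage))
```

With the threshold at 30, a shared name such as "Abraham Lincoln" (15 characters) is never flagged by itself, which is the point of the cutoff. Every flagged passage still needs a human reader to decide whether it is a quotation, a common source or a theft.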

In February 1993, the scientists submitted a 1,400-page

report to the association, detailing what they claimed were

175 instances of plagiarism in the Lincoln biography, 200

instances in the Faulkner biography and 240 instances in

the King biography, all identified by their computer. But

once again the association found no evidence of plagiarism,

though it did state that Mr. Oates had depended to a degree

greater than recommended "on the structure, distinctive

language and rhetorical strategies of other scholars and

sources." The association also took pains to dismiss Mr.

Stewart and Mr. Feder's plagiarism machine, declaring that

"computer-assisted identification of similar words and

phrases in itself does not constitute a sufficient basis

for a plagiarism or misuse complaint."

The scientists' supervisors at the National Institutes of

Health were no more enthusiastic. When they caught wind of

Mr. Stewart and Mr. Feder's extracurricular activities,

they confiscated the plagiarism machine and had their

research lab shuttered.

For the nascent plagiarism detection business, this was an

inauspicious beginning, but hardly, it turned out, a major

setback. Nearly 10 years later, antiplagiarism software is

routinely used by dozens of colleges and universities -

even high schools - on student work.

At one end of the spectrum are companies like Turnitin.com,

based in Oakland, Calif., which uses a software program to

check the content of student work against millions of sites

around the Web and a database of papers from online

term-paper mills.

At the other end are companies like Glatt Plagiarism

Services in Chicago, which draw on techniques from

cognitive theory to verify authorship. The Glatt Plagiarism

Screening program, for example, relies on a method called

the "Cloze procedure," originally used in the reading

comprehension portion of standardized intelligence tests.

Sample passages from a suspect work - which can range in

size from a single essay to an entire book - are scanned

into a computer, which, following the Cloze procedure,

removes every fifth word. The sample passages are then

returned to the author, who is asked to fill in the missing

words.
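
The every-fifth-word deletion is simple to sketch in Python. The code below is a hypothetical illustration, not the Glatt program: the blank marker, the function names and the scoring rule (exact word match, ignoring case and surrounding punctuation) are assumptions, since the article does not say how responses are graded.

```python
def make_cloze(passage: str, step: int = 5, blank: str = "_____") -> tuple[str, list[str]]:
    """Replace every `step`-th word with a blank; return the test text
    and the list of removed words, in order."""
    words = passage.split()
    removed = []
    for i in range(step - 1, len(words), step):
        removed.append(words[i])
        words[i] = blank  # punctuation attached to the word is removed too
    return " ".join(words), removed


def _bare(word: str) -> str:
    # Compare words without case or surrounding punctuation (an assumption).
    return word.strip(".,;:!?\"'()").lower()


def score(removed: list[str], responses: list[str]) -> float:
    """Fraction of blanks the author filled in with the original word."""
    if not removed:
        return 0.0
    correct = sum(_bare(a) == _bare(b) for a, b in zip(removed, responses))
    return correct / len(removed)


if __name__ == "__main__":
    text, key = make_cloze("one two three four five six seven eight nine ten eleven")
    print(text)
    print(score(key, ["five", "Ten"]))
```

What fraction of correct answers separates an author from a plagiarist is a judgment the sketch leaves open; the article gives no threshold.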

Glatt's founder and president, Dr. Barbara Glatt, says that

if the work is authentic, the author will be able to recall

most of the missing words. A plagiarist, on the other hand,

will invariably flunk the test, or else fess up before

taking it. "It's a tough test to pass," Dr. Glatt said. "I

have never gotten 100 percent of them right."

Nevertheless, she insisted, the Cloze technique is

considered highly reliable. Scientists have tried removing

the third and fourth words instead, she said, but with much

less success. "So far," she added, "no one has ever been

falsely accused by the test."

Of course, neither of these approaches seems well suited

for catching scholarly plagiarists. Professional historians

of the stature of Mr. Ambrose and Ms. Goodwin, both of whom

deny plagiarism but concede carelessness, are unlikely to

be stealing from online term-paper mills. And though Dr.

Glatt's approach has the advantage of being able to detect

plagiarism when the identity of the plagiarized text is

unknown, it's hard to imagine scholars readily agreeing to

sit through a Cloze procedure exam at their accusers'

request.

The approach Mr. Stewart and Mr. Feder adopted - comparing

one book to another - may still be a literary sleuth's best

bet.

Last year, Louis Bloomfield, a physics professor at the

University of Virginia, created one such software program

that he uses to run quick checks on his students' work.

(When he first tried it last spring, he found 122 cases of

possible cheating, leading to 15 student expulsions and

voluntary departures so far.) "It would be interesting to

scan the world's libraries into electronic form and start

doing these kinds of comparisons," Mr. Bloomfield said with

a mischievous laugh. "I'm afraid you'd pop up all kinds of

trouble."

http://www.nytimes.com/2002/01/26/arts/26TANK.html?ex=1013062219&ei=1&en=6b227104b5224c93