Liz Wager: How should editors respond to plagiarism?

Gross plagiarism is easy to spot and most people agree it’s wrong, so it’s relatively easy to deal with. But while stealing somebody else’s paper and pretending it’s your own is obvious misconduct, it’s surprisingly hard to define exactly what plagiarism is, especially for more minor offences. It would be helpful if we could agree a definition of plagiarism (or a classification of different types) so that editors (and teachers) could decide how to handle each type. Editors now have access to powerful text-matching software (such as CrossCheck or even a simple Google™ search). It’s now easy to discover the percentage of text in one document that matches text in another (or several others). But it’s much more difficult to know what those numbers mean. In fact, one editor I know says that the numbers are meaningless (although she admits that the tools are helpful for flagging up possible problems and then looking for large matches).

I agree that it wouldn’t be helpful to rule that, say, anything above 50% matching text was major plagiarism, anything from 20-49% constituted minor plagiarism, and <20% was simply chance. While the amounts are helpful, they are only one aspect that should be considered. It’s also important to realise that data and figures can be plagiarised but won’t be picked up with text-matching software. Similarly, if work is translated and then appropriated, the words won’t match but this is clearly a form of plagiarism. And it is also possible to plagiarise somebody’s theory or analytical framework but express it in different words and claim it as your own, so the definition needs to cover more than simply identical text.

Another problem is that software can spot identical strings of words but can’t distinguish between common terms and sparks of original genius. To illustrate the problem, if you Google the phrase “p<0.005 was considered statistically significant” you’ll find 588,000 documents that contain it, and 410,000 stating that research was “performed according to the Declaration of Helsinki.” Nobody knows who first used these strings and probably nobody cares, but other, shorter strings such as “the winter of our discontent” (Shakespeare) or “the end of the beginning” (Churchill) are clearly quotations which ought to be attributed.
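To make the point concrete, here is a minimal sketch of how a matching percentage can arise from nothing more than shared stock phrases. It is not how CrossCheck or any real tool works; the function names, the five-word window, and the two sample sentences are all assumptions made up for illustration.

```python
import re

def word_ngrams(text, n=5):
    """Return the set of lowercase n-word sequences found in text."""
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def similarity_percent(submitted, source, n=5):
    """Percentage of the submitted text's n-word sequences that also occur in source.

    Hypothetical measure for illustration only; real tools are more sophisticated.
    """
    submitted_ngrams = word_ngrams(submitted, n)
    if not submitted_ngrams:
        return 0.0
    shared = submitted_ngrams & word_ngrams(source, n)
    return 100.0 * len(shared) / len(submitted_ngrams)

# Two invented sentences that share only a standard ethics statement.
paper_a = ("The study was performed according to the Declaration of Helsinki. "
           "Patients received drug A.")
paper_b = ("Our trial was performed according to the Declaration of Helsinki. "
           "Volunteers were given placebo.")

print(similarity_percent(paper_a, paper_b))
```

The two sample texts describe different work, yet a naive count like this one reports a substantial overlap purely because both contain the same routine statement. The number alone cannot say whether the matching words were boilerplate or somebody's original idea.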

So, it’s hard to define plagiarism but COPE (the Committee on Publication Ethics) wants to try, and would appreciate your help. We’ve produced a discussion document (available at:
and would like comments from anyone who’s interested (researchers, authors, students, academics, editors, and readers).

Conflict of interest: I wrote the COPE discussion document on plagiarism and am chair of COPE, so this is blatant advertising, but it isn’t plagiarised!

Liz Wager PhD is a freelance medical writer, editor, and trainer. She is the current chair of the Committee on Publication Ethics (COPE).

  • Bob Creutz

    Liz – We spend a lot of time at iThenticate (the developers of the software that powers CrossCheck) trying to discourage users from putting too great an emphasis on raw similarity index.  In fact, I have been referring many editors to IEEE's CrossCheck Information Page (specifically,…  At the end of the guide, you will see four iThenticate report classifications.  These classifications are tremendously useful, and will likely allow for greater efficiency in reviewing iThenticate reporting results.  Always love reading your blogs.

    Bob Creutz  

  • Liz Wager

    Thanks so much. iThenticate / CrossCheck is a fantastic tool (and I hope you didn't think my remarks were disparaging it in any way) but we do sometimes get calls from editors asking for COPE to set arbitrary limits (which we always resist). Thanks for the link to the classifications.

  • Tiago Villanueva

    How about text that is plagiarised from, say, English, and re-published in another language? How will it be possible to deal with that sort of global, trans-national and multi-lingual plagiarism? Has COPE thought about that?

  • Liz Wager

    Indeed, we mention this in our paper. This kind of plagiarism cannot be picked up automatically by software, so it is harder to detect, but it is definitely plagiarism!