View and understand word count in Turnitin

This guide helps instructors understand how Turnitin calculates word count, where to find it in the product, and what factors, such as file type and character set, can affect it.

In this guide:

  • Overview of word count
  • How to view word count in Turnitin
  • Word count and file type
  • Word count and character sets

Overview of word count

Turnitin calculates word count to help instructors and students assess the length and scope of submissions. However, word count values may differ depending on file type, character set, and how the content is processed.

Word count is also an important component of Turnitin’s file requirements. This guide will help you learn more about these file requirements.

How to view word count in Turnitin

Not sure which version of Turnitin you are using? Learn how to identify your version of the Similarity Report.

Classic report

Feedback Studio/Originality Check

In the classic Similarity Report, word count can be found in the submission information.

To open the submission information, select the “i” icon at the bottom of the layers side panel.

Similarity/SimCheck

In the Similarity and SimCheck classic Similarity Report, word count can be found in the submission details.

To open the submission details, select the Submission Details option in the top right-hand corner of the report.

Word count and file type

Turnitin processes file formats differently, which can lead to variations in word count.

File type and processing method:

  • .doc, .docx: Turnitin extracts word count directly from Microsoft Word. Formatting and hidden text are also parsed.
  • .pdf: Turnitin’s text extraction system calculates word count. Turnitin analyzes the visible text layer of the PDF, which may omit:
    • Hidden text
    • Scanned image text (unless OCR is applied)
    • Complex layout elements (e.g., columns, tables)
  • .txt: Word count is calculated by Turnitin’s text extraction system. Due to the basic nature of these files, there should be minimal discrepancies.

Word count and character sets

Turnitin uses internal parsing logic to count words. Non-Latin scripts (e.g., Chinese, Japanese, Korean) and certain symbols may affect this count.

  • Chinese, Japanese, Korean:
    • Turnitin often counts characters as individual words.
    • Punctuation and spacing rules can cause variation from traditional word processing tools.
  • Thai, Arabic, and similar languages:
    • Microsoft Word may not count words reliably due to script-specific rules.
    • PDF files that use Turnitin’s text extraction system may yield a more consistent count, or one that differs from Microsoft Word’s.
  • Special characters or symbols:
    • May be excluded unless attached to standard word structures.
  • Languages with compound words (e.g., German):
    • Long compound words are typically counted as one word.
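
To make the rules above concrete, here is a minimal Python sketch of a counter that behaves in the way described. It is not Turnitin’s parsing logic, which is internal and not documented here. The function name illustrative_word_count, the Unicode ranges used to detect Chinese, Japanese, and Korean characters, and the rule that a token counts only if it contains a letter or digit are all assumptions made for illustration.

```python
import re

# Assumption: these Unicode ranges roughly cover Chinese, Japanese, and Korean
# characters. They are chosen for illustration and are not exhaustive.
CJK_PATTERN = re.compile(
    "["
    "\u4e00-\u9fff"  # CJK Unified Ideographs
    "\u3040-\u30ff"  # Hiragana and Katakana
    "\uac00-\ud7af"  # Hangul syllables
    "]"
)


def illustrative_word_count(text: str) -> int:
    """Count words using the simplified, hypothetical rules described above."""
    count = 0
    for token in text.split():
        # Each CJK character is counted as an individual word.
        count += len(CJK_PATTERN.findall(token))
        # Whatever remains of the token is split where CJK characters were.
        remainder = CJK_PATTERN.sub(" ", token)
        for piece in remainder.split():
            # Standalone symbols are excluded; a piece counts only if it
            # contains at least one letter or digit. A long compound word
            # is a single piece, so it counts once.
            if any(ch.isalnum() for ch in piece):
                count += 1
    return count


print(illustrative_word_count("The quick brown fox"))                 # 4
print(illustrative_word_count("Donaudampfschifffahrtsgesellschaft"))  # 1
print(illustrative_word_count("我喜欢写作"))                            # 5
print(illustrative_word_count("A & B"))                               # 2
```

A word processor that splits only on spaces would likely report the Chinese example as a single word, which is the kind of discrepancy this section describes.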

If your institution uses Turnitin for multiple languages, word count expectations may vary based on academic or language norms.
