The Authorship Investigation Report

Investigation Recommendation

The top of the report provides an investigation recommendation and a summary based on the results of the report. If the report contains discrepancies that require a more thorough review, the recommendation will read Investigation Recommended. If the report does not contain any visible irregularities, the recommendation will read Investigation Not Recommended.


The summary contains the body of evidence that the report has used to make its recommendation. There are three sections in the summary: document information, readability, and punctuation. If one or more of these sections contains discrepancies that warrant further scrutiny, the section will contain a flag to draw attention to it and a brief summary of the evidence found.


Readability uses the Flesch-Kincaid reading ease scale. Assuming the text is grammatically correct, this scale estimates how easy the text is to read.


A score of less than 50 corresponds approximately to college-level writing, 50-70 to high school-level writing, and above 70 to grade school-level writing.
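The score bands above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the formula shown is the widely published Flesch reading ease formula, which is assumed here because this guide does not state the exact calculation the report uses. The word, sentence, and syllable counts are taken as inputs rather than computed.

```python
def flesch_reading_ease(total_words, total_sentences, total_syllables):
    """Standard Flesch reading ease formula (assumed, not confirmed by this guide)."""
    words_per_sentence = total_words / total_sentences
    syllables_per_word = total_syllables / total_words
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


def reading_level(score):
    """Map a score to the approximate bands described in this guide."""
    if score < 50:
        return "college"
    if score <= 70:
        return "high school"
    return "grade school"


# Hypothetical counts: 100 words, 5 sentences, 150 syllables.
score = flesch_reading_ease(100, 5, 150)
print(round(score, 3), reading_level(score))
```

Higher scores indicate text that is easier to read, which is why the college band sits at the low end of the scale.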


The readability score of the investigation file is visualized in orange, while the comparison files are visualized in grey. The numeric value will be shown in the Feature Comparison section of the report. This article will help you learn more about the Flesch-Kincaid readability score and how it is calculated.

Feature Comparison

This table will compare the stylistic features of the investigation file against the comparison file(s). The investigation file will be shown in the static left column. The comparison file(s) will be shown in the scrollable right columns.

Readability Score

This is the numeric readability score that is visualized above. Radically different scores between files are a cause for concern.



An author’s readability score should improve over time. This should be taken into consideration if any of the comparison file(s) are older examples of an author’s work.

Character Similarity

Character similarity is calculated using 3-grams. These are sequences of 3 characters; for example, “how”, “and”, “ise”, “ize”, etc.


The percentage indicates how often 3-grams from the investigation file appear in the comparison file(s).
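As a rough illustration of the idea, the sketch below counts how many of the investigation text's 3-grams also occur somewhere in the comparison text. The function names, the lowercasing step, and the exact way the percentage is formed are assumptions made for this example; the guide does not describe the report's actual calculation in this much detail.

```python
def char_trigrams(text):
    """Return every overlapping sequence of 3 characters, lowercased."""
    text = text.lower()
    return [text[i:i + 3] for i in range(len(text) - 2)]


def character_similarity(investigation, comparison):
    """Percentage of 3-grams in the investigation text that appear in the comparison text."""
    investigation_grams = char_trigrams(investigation)
    if not investigation_grams:
        return 0.0
    comparison_grams = set(char_trigrams(comparison))
    shared = sum(1 for gram in investigation_grams if gram in comparison_grams)
    return 100.0 * shared / len(investigation_grams)


# "abcd" has the 3-grams "abc" and "bcd"; only "abc" appears in "abcx".
print(character_similarity("abcd", "abcx"))
```

Because 3-grams capture habits such as "ise" versus "ize" spellings, two files by the same author would be expected to share a high proportion of them.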

Document Information

If any of the files submitted to Authorship Investigation is a .docx file, the report will pull the file’s metadata. This metadata is often used to identify potential issues in a student's work, for example, when the name of the file creator does not match the supposed author.


The author is the name given to the file creator.



An author may use a blank file created by an instructor or peer as the basis for their document.

Last Modified By

The name of the last person to modify the file before submission.



An author may have asked a peer to proofread and spell check a document, leading to modifications by someone other than themselves.

Date Created

The date that the file was created.

Date Last Modified

The last date the file was modified before submission.

Total Editing Time

The total time spent editing the file.



If the editing time is unusually short, this can be an indicator that the author has copied the content into the file from another file.


The revision count is the number of times the file has been revised (opened and changes made).
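For readers who want to inspect these fields themselves, the following is a minimal sketch using only the Python standard library. It relies on the fact that a .docx file is a ZIP package whose document properties are stored in `docProps/core.xml` and `docProps/app.xml` under the Office Open XML format. The function name and the dictionary keys are hypothetical, and this is not a description of how the report itself extracts metadata.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the Office Open XML property parts.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
    "ep": "http://schemas.openxmlformats.org/officeDocument/2006/extended-properties",
}


def _read_part(archive, name):
    """Parse one XML part of the package, or return None if it is absent."""
    if name not in archive.namelist():
        return None
    return ET.fromstring(archive.read(name))


def _text(root, path):
    """Return the text of the first matching child element, or None."""
    if root is None:
        return None
    node = root.find(path, NS)
    return node.text if node is not None else None


def read_docx_metadata(source):
    """Read document properties; source is a file path or binary file-like object."""
    with zipfile.ZipFile(source) as archive:
        core = _read_part(archive, "docProps/core.xml")
        app = _read_part(archive, "docProps/app.xml")
    return {
        "author": _text(core, "dc:creator"),
        "last_modified_by": _text(core, "cp:lastModifiedBy"),
        "date_created": _text(core, "dcterms:created"),
        "date_last_modified": _text(core, "dcterms:modified"),
        "revisions": _text(core, "cp:revision"),
        # Total editing time is recorded in minutes.
        "total_editing_minutes": _text(app, "ep:TotalTime"),
    }
```

Calling `read_docx_metadata("essay.docx")` (a hypothetical file name) returns the values as strings, or `None` for any property the file does not record.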


An individual's use of punctuation is often a stylistic feature that is consistent across all of their writing. A change in punctuation usage between documents can often signify a change in authorship.

Single or Double Space After Period

After a period (full stop), a document may use either a single or a double space before beginning the next sentence. For example, a single space. (Like this.) Or a double space.  (Like this.)
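One simple way to check this feature by hand is to count both patterns with regular expressions. The sketch below is an assumption about how such a count could be done, with a hypothetical function name; it is not the report's own method.

```python
import re

# A period, exactly one space, then a non-space character.
SINGLE_SPACE = re.compile(r"\. (?=\S)")
# A period, exactly two spaces, then a non-space character.
DOUBLE_SPACE = re.compile(r"\. {2}(?=\S)")


def period_spacing(text):
    """Count sentence breaks that use a single space versus a double space."""
    return {
        "single": len(SINGLE_SPACE.findall(text)),
        "double": len(DOUBLE_SPACE.findall(text)),
    }


print(period_spacing("One. Two.  Three. Four"))
```

A document that mixes the two styles, or that differs from an author's other files, may merit a closer look.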

Item/Item Words

Item/Item words refer to words which are separated by a slash (/).

For example, "he/she", "and/or".
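Words of this kind can be found with a short regular expression. The pattern and function name below are illustrative assumptions; in particular, treating only alphabetic items as matches (so that dates such as 1/2 are ignored) is a choice made for this sketch.

```python
import re

# Two or more alphabetic items joined by slashes, such as "and/or".
ITEM_ITEM = re.compile(r"\b[A-Za-z]+(?:/[A-Za-z]+)+\b")


def item_item_words(text):
    """Return every slash-separated word group found in the text."""
    return ITEM_ITEM.findall(text)


print(item_item_words("He/she may pay by cash and/or card."))
```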


The vocabulary section of the report contains various lexical features of the files.

Unique Word Usage

Unique word usage (type-token ratio) calculates the total number of unique words as a percentage of the total number of words used in a document.

For example, the following sentence contains 8 words with 6 unique words, resulting in a unique word usage score of 75%: “The white cat sat on the white mat.”
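The example sentence can be checked with a short Python sketch. The tokenizer here (lowercasing, then keeping runs of letters and apostrophes) is an assumption made for illustration, and the function names are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into words (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def unique_word_usage(text):
    """Type-token ratio: unique words as a percentage of all words."""
    words = tokenize(text)
    if not words:
        return 0.0
    return 100.0 * len(set(words)) / len(words)


print(unique_word_usage("The white cat sat on the white mat."))  # 75.0
```

Note that "The" and "the" count as the same word only because the text is lowercased first.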

Vocabulary Richness

Vocabulary richness (Hapax Legomena ratio) calculates the percentage of words in a document that only occur once.


For example, the following sentence contains 8 words with 4 words occurring only once, resulting in a vocabulary richness score of 50%: “The white cat sat on the white mat.”
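The same sentence can be used to check this score in Python. As before, the tokenizer and function name are assumptions for illustration only.

```python
import re
from collections import Counter


def vocabulary_richness(text):
    """Hapax legomena ratio: words occurring exactly once, as a percentage of all words."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    counts = Counter(words)
    once = sum(1 for count in counts.values() if count == 1)
    return 100.0 * once / len(words)


print(vocabulary_richness("The white cat sat on the white mat."))  # 50.0
```

Here "cat", "sat", "on", and "mat" each occur once, while "the" and "white" occur twice.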

Sentence Complexity

Sentence complexity is the average number of phrases per sentence in a document. Phrases are calculated by splitting the sentence on colons, semi-colons, and commas.


For example, in a document containing 100 sentences split up into 200 total phrases, the sentence complexity score would be 2.0.
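A minimal sketch of this calculation follows. Phrases are split on colons, semicolons, and commas as described above; splitting sentences on periods, question marks, and exclamation marks is an assumption of this example, since the guide does not say how sentences are detected.

```python
import re


def sentence_complexity(text):
    """Average number of phrases per sentence."""
    # Assumed sentence boundaries: runs of ".", "!" or "?".
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    phrase_count = 0
    for sentence in sentences:
        phrases = [p for p in re.split(r"[:;,]", sentence) if p.strip()]
        phrase_count += len(phrases)
    return phrase_count / len(sentences)


# 3 phrases in the first sentence, 1 in the second: 4 / 2 = 2.0
print(sentence_complexity("I came, I saw, I conquered. It rained."))
```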

Content Word Usage

Content word usage is the number of content words (nouns, adjectives, verbs, adverbs) as a percentage of the total number of words used in a document.

For example, the following sentence contains 10 words and 4 content words, resulting in a content word usage score of 40%: “A cat jumped over my wall and then ran away”.
