The Authorship Report

This page details each of the sections in the Authorship Report and how they have been calculated. To better understand how to interpret the report, please read this guide on interpreting the Authorship Report.

Report summary

The top of the report contains the investigation filename and provides an investigation recommendation and summary based on the results of the report. If the report contains discrepancies that require a more thorough review, the recommendation will read Investigation Recommended. If the report does not contain any visible irregularities, the recommendation will read Investigation Not Recommended.


There are three links below the recommendation. For further guidance on how to interpret the report, select the How do I interpret the report? link.

 

If you are viewing the report for the first time, you will be prompted to view a walkthrough tour. This tour is designed to help you understand the results of the report, and we recommend that all new users view it.

 


You can skip the tour, or close it at any time. It will always be available at the top of the report by selecting the 'View tour' link.

 

If you would like to print a hard copy of the investigation report, select the 'Print report' link. This will generate a print-friendly version of the report that you can use as a resource for further investigation (for example, as evidence in an interview or hearing). The print version of the report contains a glossary at the end.

 

If any of the files could not be processed and were left out of the report results, there will be a notification in this section to warn you.

 

The summary contains the body of evidence that the report has used to make its recommendation. The summary pulls in results from four sections: Readability, Document Information, Sentences, and Vocabulary. If one or more of these sections contains discrepancies that suggest contract cheating may have taken place, the section will contain a flag to draw attention to it and a brief summary of the evidence found.

 

The report details are shown in the top right. The filename, date created, and the number of comparison files are shown. If you are an administrator you can download the investigation file. Select the number of comparison files to open a popup showing the comparison files. If you are an administrator you can download any of the comparison files.

 


Readability

Readability uses the Flesch-Kincaid Grade Level Formula. Assuming the text is grammatically correct, this scale estimates the years of education needed to understand the document and gives you an indication of what grade level each file falls into. Radically different scores between files are a cause for concern.

 

This article will help you learn more about the Flesch-Kincaid readability tests and how the score is calculated.
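As a rough illustration of how a grade level score of this kind is produced, the sketch below applies the published Flesch-Kincaid Grade Level formula to word, sentence, and syllable counts. The function name and the idea of passing in pre-computed counts are assumptions made for this example; how the Authorship Report counts syllables or splits sentences is not described on this page.

```python
def flesch_kincaid_grade(total_words: int, total_sentences: int, total_syllables: int) -> float:
    """Return the Flesch-Kincaid Grade Level for pre-computed counts.

    Hypothetical helper: the counts are assumed to come from some
    tokenizer and syllable counter that is not shown here.
    """
    if total_words <= 0 or total_sentences <= 0:
        raise ValueError("word and sentence counts must be positive")
    words_per_sentence = total_words / total_sentences
    syllables_per_word = total_syllables / total_words
    # Published Flesch-Kincaid Grade Level coefficients.
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59
```

For example, a document with 100 words, 5 sentences, and 150 syllables scores roughly 9.9, which corresponds to about a ninth or tenth grade reading level.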

 

The readability scores can be viewed in two different ways: as a pin chart distribution with a 90th percentile prediction box, or as a submission time series. The submission time series is only available in reports generated using the submission ID method.

 

If you are viewing the distribution pin chart, the readability score of the investigation file is visualized in orange, while the comparison files are visualized in gray. The orange percentile box predicts where the student's future results will fall, based on their past submissions.

 


 

This percentile box is calculated from the comparison files using a prediction interval, so the more comparison files are uploaded, the more accurate the prediction will be. The Authorship Report will recommend investigation if the score of the investigation file falls outside this window.
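To make the idea of a prediction interval more concrete, here is a minimal sketch that builds a 90% interval from a list of comparison scores and checks whether a new score falls outside it. The exact method used by the Authorship Report is not documented on this page; the normal approximation, the function names, and the sample scores below are all assumptions made for illustration only.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def prediction_interval(scores, coverage=0.90):
    """Approximate prediction interval for one future score.

    Assumption: scores are treated as roughly normally distributed.
    This is an illustrative stand-in, not the report's actual method.
    """
    n = len(scores)
    if n < 2:
        raise ValueError("at least two comparison scores are needed")
    centre = mean(scores)
    # The sqrt(1 + 1/n) term shrinks as more comparison files are added.
    spread = stdev(scores) * sqrt(1 + 1 / n)
    z = NormalDist().inv_cdf(0.5 + coverage / 2)
    return centre - z * spread, centre + z * spread


def falls_outside(score, scores, coverage=0.90):
    """Return True if the score lies outside the prediction interval."""
    low, high = prediction_interval(scores, coverage)
    return score < low or score > high
```

With hypothetical comparison scores of 9.0, 10.0, 11.0, and 10.0, an investigation score of 15.0 would fall outside the interval, while a score of 10.5 would fall inside it.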

 

If you used the submission ID upload to create the Authorship Report, you will be able to view the readability score as a time series visualization. Select the button to switch between the two views. The time series will display all the submissions in chronological order. The data of the chart will be displayed on the x-axis, with the submission dates along the y-axis.

 

 

Beneath the visual scale, the readability results are displayed in a table. The results can be ordered alphabetically or numerically by selecting the column title.

 


 

Document Information

If any of the files submitted to Authorship Investigation are .docx or .pdf file types, the report will pull the file’s metadata.

 

The .docx files will provide the most complete metadata. The .pdf files will provide the author, date created, and date last modified.

 

If any of the files are not .docx or .pdf file types, or the report cannot find any metadata for them, then they will be italicized and in gray to signify no metadata could be found for them. These files will not affect the investigation recommendation.

Author Name

The author is the name given to the file creator. The results will be displayed in a table that will allow you to see the filename, the author and who that file was last modified by. The results can be ordered alphabetically (or reverse alphabetically) by selecting the column titles.

 


Dates

The report will collect dates that are pertinent to the investigation file and comparison files. The results will be displayed in a table that will allow you to see the filename, the date that the file was created, and the date the file was last modified. The table's results can be ordered chronologically (or reverse chronologically) by selecting the column titles.

 


Editing Time

This visual scale and table will show the total time spent editing the file. This is the amount of time spent with the document open and in front of other windows, whether you are typing or not. This time is saved and added up each time you save your changes.

 

The table can be ordered alphabetically (or reverse alphabetically) by filename, or by editing time, by selecting the column titles.

Warning: This feature might be disabled in some countries for privacy reasons. If that is the case, the value will always be shown as '0'.

Revisions

This visual scale and table will show how many times the file has been revised (opened and changes made). The table can be ordered alphabetically (or reverse alphabetically) by filename, or by number of revisions, by selecting the column titles.

Sentences

There are many nuanced features in our writing that indicate our unique style, and these may not be obvious when reading a single document. The report surfaces how each file has used the different sentence types, the average sentence length, and the number of phrases per sentence.

Sentence Type

The report will display how each submission has used sentence types. There are four main sentence structures: simple, compound, complex, and compound-complex. If a sentence does not fall into one of these types, the report will list it as “other”.

 

 

A simple sentence contains one independent clause (e.g., "I like cats.")

A compound sentence contains two or more independent clauses (e.g., "I like cats, but my friend likes dogs.")

A complex sentence contains one independent clause and one or more dependent clauses (e.g., "The cat ran inside because it was raining.")

A compound-complex sentence contains two or more independent clauses, and one or more dependent clauses (e.g., "The cat ran inside because it was raining, and it hates getting wet.")
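The four definitions above can be expressed as a small decision rule. The sketch below assumes the clauses in a sentence have already been counted by some parser, which is not shown; the function name is hypothetical and this is not the report's actual classifier.

```python
def sentence_type(independent_clauses: int, dependent_clauses: int) -> str:
    """Classify a sentence from its clause counts.

    Assumption: the clause counts come from a separate grammatical
    parser. Anything that does not match the four types is "other".
    """
    if independent_clauses == 1 and dependent_clauses == 0:
        return "simple"
    if independent_clauses >= 2 and dependent_clauses == 0:
        return "compound"
    if independent_clauses == 1 and dependent_clauses >= 1:
        return "complex"
    if independent_clauses >= 2 and dependent_clauses >= 1:
        return "compound-complex"
    return "other"
```

Using the example sentences above, "I like cats." has one independent clause and no dependent clauses, so it is classified as simple, while a fragment with no independent clause is classified as other.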

Phrases Per Sentence

Phrases per sentence is the average number of phrases per sentence in a document. Phrases are sets of words that form a single grammatical piece of a sentence.

 

There are several types of phrases (noun phrase, verb phrase, adjective phrase, prepositional phrase, adverb phrase, etc.). The score is calculated using top level phrases; that is, phrases that are not nested inside another phrase.

 

For example, the sentence "The cat sat on the mat" has two top-level phrases: a noun phrase (The cat) and a verb phrase (sat on the mat). Similarly, in a document containing 100 sentences split up into 200 total phrases, the phrases per sentence would be 2.0.
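The arithmetic behind this score is a simple average. The sketch below assumes the number of top-level phrases in each sentence has already been determined by a parser (not shown); the function name is hypothetical.

```python
def phrases_per_sentence(top_level_phrase_counts):
    """Average number of top-level phrases per sentence.

    Assumption: each entry is the top-level phrase count for one
    sentence, produced by a grammatical parser that is not shown.
    """
    if not top_level_phrase_counts:
        return 0.0
    return sum(top_level_phrase_counts) / len(top_level_phrase_counts)
```

A document of 100 sentences with two top-level phrases each has 200 phrases in total, giving a score of 2.0.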

Average Sentence Length

Average sentence length is the average number of words per sentence in a document.
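A minimal sketch of this calculation is shown below. It assumes that sentences end at a full stop, exclamation mark, or question mark and that words are separated by whitespace; the report's own sentence and word splitting rules are not described here and may differ.

```python
import re


def average_sentence_length(text: str) -> float:
    """Average number of words per sentence.

    Assumption: sentences are split on ., ! and ?, and words are
    split on whitespace. Real tokenizers handle more edge cases.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    total_words = sum(len(s.split()) for s in sentences)
    return total_words / len(sentences)
```

For the two-sentence text "I like cats. The cat ran inside because it was raining." the sentences contain 3 and 8 words, giving an average sentence length of 5.5.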

Vocabulary

The vocabulary section of the report contains various lexical features of the files. These results are displayed as a visual chart and a table. Variation in subject matter and/or assignment length can produce notably different measures from the same author. Discrepancies shouldn’t be presented as evidence without being reviewed.

Unique Word Usage

Unique word usage (type-token ratio) calculates the total number of unique words as a percentage of the total number of words used in a document.

 



For example, the following sentence contains 8 words with 6 unique words, resulting in a unique word usage score of 75%: “The white cat sat on the white mat.”
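The worked example above can be reproduced with a few lines of code. This sketch assumes that words are compared case-insensitively and that punctuation is ignored; the function name is hypothetical and the report's actual tokenization is not described on this page.

```python
import re


def unique_word_usage(text: str) -> float:
    """Type-token ratio as a percentage.

    Assumption: words are lowercased runs of letters and apostrophes,
    so "The" and "the" count as the same word.
    """
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    return 100 * len(set(words)) / len(words)
```

Calling `unique_word_usage("The white cat sat on the white mat.")` finds 8 words and 6 unique words, which gives 75.0.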

Vocabulary Richness

Vocabulary richness (Hapax Legomena ratio) calculates the percentage of words in a document that only occur once.

 


 

For example, the following sentence contains 8 words with 4 words occurring only once, resulting in a vocabulary richness score of 50%: “The white cat sat on the white mat.”
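The same example can be checked in code. This sketch counts the words that occur exactly once and divides by the total number of words, matching the example above. It assumes case-insensitive matching and ignores punctuation; the function name is hypothetical and the report's own tokenization may differ.

```python
import re
from collections import Counter


def vocabulary_richness(text: str) -> float:
    """Hapax legomena ratio as a percentage of all words.

    Assumption: words are lowercased runs of letters and apostrophes.
    """
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    counts = Counter(words)
    # Words that appear exactly once in the text.
    hapaxes = sum(1 for count in counts.values() if count == 1)
    return 100 * hapaxes / len(words)
```

For "The white cat sat on the white mat." the words cat, sat, on, and mat each occur once, so 4 of 8 words give a score of 50.0.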

 
