Text Analysis Software

Since 1999, in collaboration with the late Dr John Olsson, Seren Web has developed a range of tools to help with the analysis of texts. These include tools for counting word occurrences, comparing phrases in two separate texts, and measuring the percentage of words that texts have in common.

These are free for you to use.

Occurrence of Words

Occurrence of words in a text based on word length

This software will analyse your text for the occurrence of any word: lexical, non-lexical or both. You can specify the minimum length of the word.
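
As a rough illustration of this kind of analysis (not the code behind the tool itself), the sketch below counts word occurrences in Python. The function name word_occurrences, its parameters and the very short FUNCTION_WORDS list are all assumptions made for the example; a real analysis would need a much fuller list of non-lexical words.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny list of non-lexical (function) words.
# This is an assumption for illustration only.
FUNCTION_WORDS = {"a", "an", "the", "and", "or", "but", "of", "in", "on", "to", "is", "it"}


def word_occurrences(text, min_length=1, kind="both"):
    """Count words in text that are at least min_length letters long.

    kind is "lexical", "non-lexical" or "both" (names assumed for this sketch).
    """
    # Lower-case the text and keep runs of letters, allowing inner apostrophes.
    words = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    counts = Counter()
    for word in words:
        if len(word) < min_length:
            continue
        is_function_word = word in FUNCTION_WORDS
        if kind == "lexical" and is_function_word:
            continue
        if kind == "non-lexical" and not is_function_word:
            continue
        counts[word] += 1
    return counts


if __name__ == "__main__":
    sample = "The cat sat on the mat. The cat slept."
    # Print each word of three or more letters with its count, most frequent first.
    for word, count in word_occurrences(sample, min_length=3).most_common():
        print(word, count)
```

Treating upper and lower case as the same word is a design choice of this sketch, not something the description above specifies.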


Phrases in Common

Shared phrases of six words in length between two texts, then five, four, three and two

This software will analyse your two texts for shared phrases that are six words long. It will then check for shared phrases of five words, then four, three and finally two.
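
The descending search described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the tool's own implementation: the names tokenize, ngrams and shared_phrases are hypothetical, and the sketch reports shorter phrases even when they sit inside a longer shared phrase, which the real software may or may not do.

```python
import re


def tokenize(text):
    """Lower-case the text and split it into words (letters and inner apostrophes)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())


def ngrams(words, n):
    """Return the set of all runs of n consecutive words."""
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def shared_phrases(text_a, text_b, max_len=6, min_len=2):
    """Map each phrase length, from max_len down to min_len, to the shared phrases."""
    words_a, words_b = tokenize(text_a), tokenize(text_b)
    result = {}
    for n in range(max_len, min_len - 1, -1):
        # Phrases of length n that occur in both texts.
        common = ngrams(words_a, n) & ngrams(words_b, n)
        result[n] = sorted(" ".join(phrase) for phrase in common)
    return result


if __name__ == "__main__":
    first = "I saw the man run down the street"
    second = "She saw the man run away"
    for length, phrases in shared_phrases(first, second).items():
        print(length, phrases)
```

Using sets means each shared phrase is listed once per length, however many times it occurs in either text.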


Words in Common

Number of words and the number of instances of each word in common

This software analyses the number of words in common and the number of instances of each word in common between two texts.
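
A comparison of this kind might look like the Python sketch below. The function names words_in_common and percent_in_common are hypothetical, and the percentage shown (shared distinct words as a share of all distinct words across both texts) is only one possible definition, assumed here because the description above does not say how the tool calculates it.

```python
import re
from collections import Counter


def count_words(text):
    """Lower-case the text and count each word (letters and inner apostrophes)."""
    return Counter(re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower()))


def words_in_common(text_a, text_b):
    """Map each shared word to its number of instances in (text_a, text_b)."""
    counts_a, counts_b = count_words(text_a), count_words(text_b)
    shared = sorted(counts_a.keys() & counts_b.keys())
    return {word: (counts_a[word], counts_b[word]) for word in shared}


def percent_in_common(text_a, text_b):
    """Shared distinct words as a percentage of all distinct words (assumed definition)."""
    words_a, words_b = set(count_words(text_a)), set(count_words(text_b))
    all_words = words_a | words_b
    if not all_words:
        return 0.0
    return 100.0 * len(words_a & words_b) / len(all_words)


if __name__ == "__main__":
    first = "The cat sat on the mat"
    second = "The dog sat on the log"
    print(words_in_common(first, second))
    print(round(percent_in_common(first, second), 1))
```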


About the Corpus

In linguistics, a corpus is a large and structured set of texts used to do statistical analysis and hypothesis testing, checking occurrences or validating linguistic rules within a specific language territory.

The Seren Corpus is a growing collection of articles taken from the English-language pages of Wikinews, with an emphasis on the latest news items so that it reflects current online use of English.

The Corpus will be available again soon.