lemmarank demo

Help

Entering input

There are three ways to provide input: type or paste text into the text box, choose one of the demo texts, or upload a file. Supported file formats are plain UTF-8 text (.txt) and, unless the formatting is especially convoluted, .pdf, .doc, .docx, .csv, .epub, .html, .odt, .rtf and .xls.

Understanding output

The output is represented in two ways: a table and a dynamic graph.

The table has four columns. The first three list the base forms (lemmas) of nouns, verbs, and adjectives that are especially frequent in this text compared to a reference corpus. The fourth lists named entities (NERs: people, places, organisations, time expressions). Named entities are ranked by absolute frequency, i.e. not relative to a reference corpus.
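
The two kinds of ranking can be illustrated with a minimal Python sketch. This is not the demo's actual implementation: the scoring measure (a smoothed ratio of relative frequencies), the function names, and the parameters are all assumptions chosen for illustration, and the help text does not say which statistic the demo really uses.

```python
from collections import Counter


def rank_by_relative_frequency(text_lemmas, reference_lemmas, top_n=3, smoothing=1.0):
    """Rank lemmas by how over-represented they are in the text.

    Assumption: the score is a smoothed ratio of relative frequencies.
    The demo may use a different keyness measure.
    """
    text_counts = Counter(text_lemmas)
    ref_counts = Counter(reference_lemmas)
    text_total = sum(text_counts.values())
    ref_total = sum(ref_counts.values())
    vocab_size = len(set(text_counts) | set(ref_counts))

    scores = {}
    for lemma, count in text_counts.items():
        # Additive smoothing keeps lemmas absent from the reference finite.
        text_rate = (count + smoothing) / (text_total + smoothing * vocab_size)
        ref_rate = (ref_counts[lemma] + smoothing) / (ref_total + smoothing * vocab_size)
        scores[lemma] = text_rate / ref_rate

    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(scores, key=lambda lemma: (-scores[lemma], lemma))[:top_n]


def rank_by_absolute_frequency(entities, top_n=3):
    """Rank named entities by raw count, with no reference corpus involved."""
    return [entity for entity, _ in Counter(entities).most_common(top_n)]
```

In this sketch a very common word such as "be" scores low even when it is frequent in the text, because it is just as frequent in the reference corpus, while a named entity is ranked purely by how often it occurs.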

The graph initially plots the frequency of the three most representative lemmas, which are highlighted in blue in the table. Click a cell in the table to toggle its display in the graph on or off. The text is divided into 20 sections, each comprising 5% of the text, and the plot shows how often each selected item occurs in each section.
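
The per-section counts behind such a plot could be computed roughly as follows. This is a sketch under stated assumptions: the text is assumed to be already tokenised and lemmatised into a list, sections are assumed to be split by token position, and the function name `section_counts` is hypothetical rather than part of the demo.

```python
def section_counts(tokens, target, n_sections=20):
    """Count occurrences of `target` in each of `n_sections` equal slices.

    Assumption: sections are defined by token position, so with the
    default of 20 each section covers about 5% of the tokens.
    """
    counts = [0] * n_sections
    total = len(tokens)
    for position, token in enumerate(tokens):
        if token == target:
            # Integer arithmetic maps position 0..total-1 onto 0..n_sections-1.
            counts[position * n_sections // total] += 1
    return counts
```

The returned list has one value per section, which corresponds to one point per section on the graph for the chosen table cell.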

Rank representative lemmas and named entities from running text