Information about the corpora

From TermiKnowledge
Revision as of 13:57, 6 June 2022 by WeronikaSzeminska
Key data about the corpora used (description and size)
Corpus                                                  Number of words
Normative corpus in English                             336,908
Research corpus in English (COVID-19 on Sketch Engine)  224,061,570
Press corpus in English                                 506,089
Comments corpus in English                              1,340,603
Normative corpus in Czech                               331,038
Research corpus in Czech                                300,000
Press corpus in Czech                                   232,694
Comments corpus in Czech                                206,657
Normative corpus in German                              504,414
Research corpus in German                               350,908
Press corpus in German                                  754,707
Comments corpus in German                               181,851
Normative corpus in Italian                             458,116
Research corpus in Italian                              157,974
Press corpus in Italian                                 585,684
Comments corpus in Italian                              22,298
Normative corpus in Polish                              584,694
Research corpus in Polish                               73,185
Press corpus in Polish                                  277,387
Comments corpus in Polish                               112,969