Multilingual support
====================

The table below gives an overview of the multilingual support for each feature area in ELFEN.

.. list-table::
   :header-rows: 1

   * - Feature Area
     - Supported Languages
     - Notes
   * - surface
     - All languages with available spaCy/Stanza models
     - ``raw_sequence_length`` is language-agnostic.
   * - pos
     - All languages with available spaCy/Stanza models
     -
   * - lexical richness
     - All languages with available spaCy/Stanza models
     -
   * - readability
     - All languages with available spaCy models and ``spacy_syllables`` support
     - Most readability formulas were designed for longer English texts.
   * - information
     - Language-agnostic
     -
   * - semantic
     - Open WordNet-based features: all languages with available spaCy/Stanza models and OWN support. Hedges: English only.
     -
   * - entities
     - All languages with available spaCy/Stanza models
     -
   * - emotion
     - All languages with available spaCy/Stanza models and available lexicons. For the full specification of the available languages, check the original resources.
     -
   * - psycholinguistic
     - Language-dependent; EN, DE, FR, IT, ES, PL, and NL are currently supported. For more information, see the full table below.
     - More languages are planned. Psycholinguistic features depend on available norms, which have to be integrated first.
   * - morphological
     - All languages with available spaCy/Stanza models
     -
   * - dependency
     - All languages with available spaCy/Stanza models
     -

For psycholinguistic norms, the following table provides an overview of the available resources for each language:

.. csv-table::
   :header-rows: 1
   :file: ../psycholinguistic_norms.csv
   :widths: 30, 10, 10, 10, 10, 10, 10, 10