Custom configuration

The custom configuration is a dictionary. Its full specification, with each value shown as the expected type rather than a literal value, is as follows:

from typing import List

custom_config = {
    "backbone": str,  # Backbone to use for feature extraction. Either "spacy" or "stanza"
    "language": str,  # Language to use for feature extraction. E.g. "en" for English, "de" for German
    # NOTE: The language must be supported by the specified backbone
    "model": str,  # Model to use for feature extraction. E.g. "en_core_web_sm" for English, "de_dep_news_trf" for German
    "max_length": int,  # Maximum length (chars) of the text to process. Default is 100000
    "remove_constant_cols": bool,  # Remove feature columns with constant values, i.e. where all texts produce the same feature value. Default is True
    "text_column": str,  # Name of the text column in the DataFrame. Default is "text"
    "features": {  # Features to extract, grouped by feature area; each feature area is a list of feature names.
        "dependency": List[str],
        "emotion": List[str],
        "entities": List[str],
        "information": List[str],
        "lexical_richness": List[str],
        "morphological": List[str],
        "pos": List[str],
        "readability": List[str],
        "semantic": List[str],
        "surface": List[str]
    }
}

For an example of a fully specified configuration, see CONFIG_ALL in the elfen.config module.
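To make the specification concrete, here is a minimal sketch of a filled-in configuration together with a small shape check. The helper `check_custom_config` and the feature names `example_feature_a` and `example_feature_b` are hypothetical and not part of elfen; real feature names should be taken from CONFIG_ALL. The sketch also assumes that keys with documented defaults may be omitted, which the specification above does not state explicitly.

```python
from typing import Any, Dict, List

# Top-level keys and their expected types, copied from the specification above.
TOP_LEVEL_TYPES = {
    "backbone": str,
    "language": str,
    "model": str,
    "max_length": int,
    "remove_constant_cols": bool,
    "text_column": str,
}

# Feature areas listed under "features" in the specification above.
FEATURE_AREAS = {
    "dependency", "emotion", "entities", "information", "lexical_richness",
    "morphological", "pos", "readability", "semantic", "surface",
}


def check_custom_config(config: Dict[str, Any]) -> List[str]:
    """Hypothetical helper: return a list of shape problems (empty if none)."""
    problems: List[str] = []
    for key, value in config.items():
        if key == "features":
            continue  # checked separately below
        expected = TOP_LEVEL_TYPES.get(key)
        if expected is None:
            problems.append(f"unknown key: {key!r}")
        elif type(value) is not expected:  # strict, so True is not accepted as an int
            problems.append(
                f"{key!r} should be {expected.__name__}, got {type(value).__name__}"
            )
    if config.get("backbone") not in (None, "spacy", "stanza"):
        problems.append('"backbone" should be "spacy" or "stanza"')

    features = config.get("features", {})
    if not isinstance(features, dict):
        problems.append('"features" should be a dict')
        return problems
    for area, names in features.items():
        if area not in FEATURE_AREAS:
            problems.append(f"unknown feature area: {area!r}")
        elif not (isinstance(names, list) and all(isinstance(n, str) for n in names)):
            problems.append(f"features[{area!r}] should be a list of strings")
    return problems


example_config = {
    "backbone": "spacy",
    "language": "en",
    "model": "en_core_web_sm",
    "max_length": 100000,
    "remove_constant_cols": True,
    "text_column": "text",
    "features": {
        # Placeholder names for illustration only; not real elfen features.
        "surface": ["example_feature_a", "example_feature_b"],
    },
}

print(check_custom_config(example_config))
```

The check only validates the shape of the dictionary. It does not verify that the model is installed, that the language is supported by the chosen backbone, or that the feature names exist.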