Quickstart
The easiest way to get started with ELFEN is to use the Extractor class in with the standard configutation. This will use spaCy as a default backbone, and extract all implemented features for an English dataset using the en_core_web_sm model.
import polars as pl
from elfen.extractor import Extractor
# Load your dataset as a polars DataFrame
# example from csv
df = pl.read_csv("path/to/your/dataset.csv")
# Initialize the Extractor with your DataFrame
# This will automatically load the spaCy model
# and preprocess the text column
# Assumes the text column is named "text"
extractor = Extractor(data = df)
# Extract features
features = extractor.extract_features()
print(extractor.data.head())
To load a specific model in a different language, you can specify the language and model parameters in the Extractor class.
extractor = Extractor(data = df, language = "de", model = "de_dep_news_trf")
# Extract features
features = extractor.extract_features()
print(extractor.data.head())
For more advanced usage, check our Tutorials section.