Speech Emotion Recognition (RAVDESS)
Predicts one of eight emotions from a short spoken clip using a frozen WavLM-large encoder with a lightweight trained head. The model is evaluated with speaker-independent cross-validation: no actor appears in both the training and test split of any fold, so the reported accuracy measures generalization to voices the model has never heard rather than memorization of speakers.
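The speaker-independent split can be illustrated with a minimal sketch. It is not this project's actual evaluation code: the function name `actor_folds`, the round-robin assignment of actors to folds, and the default of 5 folds are all assumptions made for illustration.

```python
def actor_folds(actor_ids, n_folds=5):
    """Yield (train_idx, test_idx) index lists in which no actor is shared.

    actor_ids: one actor identifier per clip, in dataset order.
    n_folds:   hypothetical fold count; the real setup may differ.
    """
    actors = sorted(set(actor_ids))
    if n_folds < 2 or n_folds > len(actors):
        raise ValueError("n_folds must be between 2 and the number of actors")

    # Assign whole actors (not individual clips) to folds, round-robin.
    fold_of = {actor: i % n_folds for i, actor in enumerate(actors)}

    for k in range(n_folds):
        test_idx = [i for i, a in enumerate(actor_ids) if fold_of[a] == k]
        train_idx = [i for i, a in enumerate(actor_ids) if fold_of[a] != k]
        yield train_idx, test_idx


# Usage: six actors with two clips each, split into three folds.
clip_actors = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6]
for train_idx, test_idx in actor_folds(clip_actors, n_folds=3):
    train_actors = {clip_actors[i] for i in train_idx}
    test_actors = {clip_actors[i] for i in test_idx}
    assert train_actors.isdisjoint(test_actors)
```

Because folds are built from actors rather than clips, every clip from a given speaker lands on the same side of the split. If scikit-learn is available, `sklearn.model_selection.GroupKFold` with the actor IDs as `groups` gives the same guarantee.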
Classes: neutral, calm, happy, sad, angry, fearful, disgust, surprised.
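As a rough sketch of where these labels come from: RAVDESS filenames consist of seven hyphen-separated numeric fields, with the emotion code (01 to 08, in the order listed above) in the third field and the actor number in the seventh. The helper name `parse_ravdess_name` is hypothetical and is not taken from this project.

```python
from pathlib import Path

# Index 0 corresponds to RAVDESS emotion code 01, index 7 to code 08.
EMOTIONS = [
    "neutral", "calm", "happy", "sad",
    "angry", "fearful", "disgust", "surprised",
]


def parse_ravdess_name(path):
    """Return (emotion_label, actor_id) parsed from a RAVDESS filename."""
    fields = Path(path).stem.split("-")
    if len(fields) != 7:
        raise ValueError(f"expected 7 hyphen-separated fields, got {len(fields)}")

    emotion_code = int(fields[2])  # third field: emotion, 1..8
    actor_id = int(fields[6])      # seventh field: actor number
    if not 1 <= emotion_code <= len(EMOTIONS):
        raise ValueError(f"unknown emotion code: {emotion_code}")

    return EMOTIONS[emotion_code - 1], actor_id


# Usage: emotion code 06 maps to "fearful"; the clip is from actor 12.
label, actor = parse_ravdess_name("Actor_12/03-01-06-01-02-01-12.wav")
```

The actor number recovered here is the grouping key needed to keep each speaker on one side of a train/test split.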