- A model to classify the emotions of speech.
- Features were extracted with a modified pyAudioAnalysis library.
- The features are then preprocessed before training.
The pyAudioAnalysis library is modified by adding functions that extract the original features from .wav files and present them as 3D arrays.
- To use the modified code, overwrite the installed package with the files `MidTermFeatures.py` and `audioTrainTest.py`.
-
directory_feature_extraction_no_avg
This function extracts features from every file in a directory without averaging over each file.
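As a rough illustration (not the library's actual implementation), skipping the averaging step means each file yields one feature row per analysis window instead of a single averaged vector. In this sketch the window energy and zero-crossing rate stand in for pyAudioAnalysis's full feature set:

```python
import numpy as np

def short_term_features_no_avg(signal, fs, win_sec=0.05, step_sec=0.025):
    """Frame the signal and compute per-window features (energy, ZCR)
    WITHOUT averaging across windows, so one file yields a
    (num_windows, num_features) matrix rather than one vector."""
    win, step = int(win_sec * fs), int(step_sec * fs)
    feats = []
    for start in range(0, len(signal) - win + 1, step):
        frame = signal[start:start + win]
        energy = float(np.mean(frame ** 2))
        zcr = float(np.mean(np.abs(np.diff(np.sign(frame)))) / 2)
        feats.append([energy, zcr])
    return np.array(feats)

fs = 16000
sig = np.sin(2 * np.pi * 440 * np.arange(fs) / fs)  # 1 s of a 440 Hz tone
f = short_term_features_no_avg(sig, fs)  # shape: (num_windows, 2)
```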
-
multiple_directory_feature_extraction_no_avg
This function extracts features from multiple directories without averaging each file.
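A minimal sketch of how a multi-directory wrapper might work, assuming each emotion class has its own folder; `dir_feature_fn` and the toy extractor below are placeholders, not the repo's actual API:

```python
import numpy as np

def multiple_dirs_no_avg(dir_feature_fn, dir_paths):
    """Hypothetical wrapper: run a per-directory extractor over several
    class folders, collecting one feature matrix and one integer class
    label per file."""
    feats, labels = [], []
    for label, d in enumerate(dir_paths):
        for m in dir_feature_fn(d):  # one (steps, features) matrix per file
            feats.append(m)
            labels.append(label)
    return feats, np.array(labels)

# Toy extractor returning two dummy "files" per directory.
toy = lambda d: [np.zeros((10, 3)), np.ones((12, 3))]
F, y = multiple_dirs_no_avg(toy, ["angry", "happy"])
```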
-
directory_feature_extraction_no_avg_3D
This function extracts audio features from a directory and returns a 3D array of shape (batch, step, features).
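One common way to build such a (batch, step, features) array is to pad or truncate each file's per-window matrix to a fixed number of steps. This sketch assumes zero-padding, which may differ from the repo's actual scheme:

```python
import numpy as np

def to_3d_batch(per_file_feats, n_steps):
    """Pad or truncate each file's (steps, features) matrix to n_steps,
    then stack all files into one (batch, step, features) array."""
    n_features = per_file_feats[0].shape[1]
    batch = np.zeros((len(per_file_feats), n_steps, n_features))
    for i, m in enumerate(per_file_feats):
        t = min(n_steps, m.shape[0])
        batch[i, :t, :] = m[:t, :]  # rows beyond t stay zero-padded
    return batch

# Two files of different lengths become one fixed-shape batch.
mats = [np.ones((30, 4)), np.ones((50, 4))]
X = to_3d_batch(mats, n_steps=40)  # shape: (2, 40, 4)
```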
-
multiple_directory_feature_extraction_no_avg_3D
Multi-directory extraction returning a 3D array.
To choose the window size, window step, and window count, run the `read_audio_length` script: it reads the length of every audio file in the directories and plots a histogram of the lengths.
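Reading .wav durations can be sketched with only the Python standard library; the synthetic file and the `audio_length_seconds` helper below are illustrative, not the repo's actual script:

```python
import glob
import struct
import wave

def audio_length_seconds(path):
    """Duration of a .wav file in seconds: frame count / sample rate."""
    with wave.open(path, "rb") as w:
        return w.getnframes() / w.getframerate()

# Create a synthetic 2-second mono 16 kHz file just for demonstration.
sr = 16000
with wave.open("demo.wav", "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(sr)
    w.writeframes(struct.pack("<%dh" % (2 * sr), *([0] * (2 * sr))))

lengths = [audio_length_seconds(p) for p in glob.glob("*.wav")]
# A histogram of `lengths` (e.g. matplotlib's plt.hist) then guides the
# choice of window size, step, and count.
```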
Other helper functions used by the project are listed in `ultil.py`.
All models used for the project are listed in `model_training.py`.
There are three attention-based models: Residual Attention, Multiplicative Attention, and Multi-Head Attention.
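For reference, multiplicative (Luong-style) attention scores a query against each key through a learned matrix, then softmaxes the scores into weights over the values. This NumPy sketch shows only the scoring mechanism, with random placeholder weights rather than the trained models' parameters:

```python
import numpy as np

def multiplicative_attention(query, keys, values, W):
    """Luong-style multiplicative attention: score_i = q @ W @ k_i,
    softmax over scores, weighted sum of values."""
    scores = query @ W @ keys.T           # (n_keys,)
    scores = scores - scores.max()        # for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights @ values, weights

rng = np.random.default_rng(0)
q = rng.standard_normal(8)        # query vector
K = rng.standard_normal((5, 8))   # 5 keys
V = rng.standard_normal((5, 16))  # 5 values
W = rng.standard_normal((8, 8))   # learned bilinear weight (random here)
ctx, w = multiplicative_attention(q, K, V, W)
```

The attention weights `w` are non-negative and sum to 1, and `ctx` is the value vectors blended by those weights.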