This project classifies environmental sound signals from the ESC-10 dataset into its 10 classes. The figure above summarizes the model steps:
- The model reads all signals from the different classes and assigns a label number to each class.
- Each signal is converted from the time domain to the wavelet domain with a wavelet transform. Because the wavelet transform has a large number of dimensions, PCA is used to reduce them.
- A full convolutional neural network (CNN) is defined and used to classify the 10 classes of the ESC-10 dataset.
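
The PCA step above can be sketched with plain NumPy. This is a minimal illustration, not the notebook's actual code: the helper name `pca_reduce`, the number of components, and the random stand-in matrix are all assumptions. In practice the input would be wavelet coefficients (for example the output of `pywt.cwt`, which is shaped `(scales, samples)` and would be transposed so that each row is one time step).

```python
import numpy as np


def pca_reduce(coeffs, n_components):
    """Project the rows of `coeffs` onto their top principal components.

    Hypothetical helper for illustration; rows are observations
    (e.g. time steps) and columns are features (e.g. wavelet scales).
    """
    # Center each column so the SVD captures variance rather than the mean.
    centered = coeffs - coeffs.mean(axis=0, keepdims=True)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    # Keep only the first n_components directions.
    return centered @ vt[:n_components].T


rng = np.random.default_rng(0)
# Stand-in for wavelet coefficients: 1000 time steps x 64 scales (assumed sizes).
coeffs = rng.standard_normal((1000, 64))
reduced = pca_reduce(coeffs, n_components=8)
print(reduced.shape)  # (1000, 8)
```

The reduced array keeps one row per time step but only 8 columns, which is the kind of smaller input a CNN can then consume. The right number of components depends on the data and is not specified here.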
- Download the ESC-10 dataset from this link: ESC-10
- Rename the directory that contains the dataset to match the name used in the notebook file, or change the name in the notebook file in these three lines:
#here my directory name is "dataset"
data, samplerate = librosa.load("dataset/dog/1-30344-A.wav", sr=44000)
for filepath in glob.iglob('dataset/*'):
for j in glob.iglob('dataset/'+i+'/*'):
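
One way to avoid editing three separate lines is to keep the directory name in a single variable and build every path from it. The sketch below assumes a layout with one sub-folder per class (as in `dataset/dog/...`); the names `DATASET_DIR` and `list_labeled_files` are hypothetical and do not come from the notebook. It also shows how a label number could be assigned to each class folder.

```python
import glob
import os

DATASET_DIR = "dataset"  # assumed folder name; change it here only


def list_labeled_files(dataset_dir=DATASET_DIR):
    """Return (class_names, [(wav_path, label), ...]) for a class-per-folder layout."""
    # Sort the class folders so each class always gets the same label number.
    class_dirs = sorted(
        d for d in glob.glob(os.path.join(dataset_dir, "*")) if os.path.isdir(d)
    )
    class_names = [os.path.basename(d) for d in class_dirs]
    samples = []
    for label, class_dir in enumerate(class_dirs):
        # Pair every .wav file in the folder with that folder's label.
        for wav_path in sorted(glob.glob(os.path.join(class_dir, "*.wav"))):
            samples.append((wav_path, label))
    return class_names, samples


class_names, samples = list_labeled_files()
print(len(class_names), "classes,", len(samples), "files")
```

If the folder does not exist, the function simply returns empty lists, so a count of zero classes is a quick sign that `DATASET_DIR` does not match the actual directory name.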
- numpy
- keras
- matplotlib
- librosa
- pylab
- glob
- tensorflow
- scipy
- pywt
For more details on the wavelet transform and how to work with it, see this course.