InstrumentDetection

Based on SciPy, NumPy, and scikit-learn.
The code in mfcc.py partly originates from scikits.talkbox.
Developed on Python 3.

The program currently uses MFCC (Mel-Frequency Cepstral Coefficients), △MFCC, and △△MFCC as its feature coefficients.
An SVM is used as the classifier and is trained with the "one-against-one" approach.
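
As a rough illustration of how △MFCC and △△MFCC relate to the base coefficients, here is a minimal sketch of one common regression-style delta formula. It is an assumption for illustration only: the helper names `delta` and `stack_features`, the window `width`, and the edge handling are hypothetical and may differ from what mfcc.py actually does. Plain Python lists are used to keep the sketch self-contained; real code would more likely use NumPy arrays.

```python
def delta(frames, width=2):
    """Regression-style delta over time (hypothetical helper, not from mfcc.py).

    frames: list of frames, each a list of coefficients.
    Edges are handled by repeating the first and last frame.
    """
    n_frames = len(frames)
    denom = 2 * sum(n * n for n in range(1, width + 1))
    result = []
    for t in range(n_frames):
        row = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, n_frames - 1)][k]
                behind = frames[max(t - n, 0)][k]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        result.append(row)
    return result


def stack_features(mfcc_frames, width=2):
    """Concatenate MFCC, delta, and delta-delta for every frame."""
    d1 = delta(mfcc_frames, width)   # first-order difference (delta)
    d2 = delta(d1, width)            # delta of the delta (delta-delta)
    return [c + a + b for c, a, b in zip(mfcc_frames, d1, d2)]
```

With this formulation, a coefficient that rises linearly over time has a constant delta away from the edges and a delta-delta of zero there, which is the behaviour one would expect from first and second time derivatives.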

Follow these steps to run it:

  1. Convert all the training and testing audio files to 16-bit, 32-bit, or floating-point .wav files. pydub may help you convert MP3 to WAV.
  2. Arrange the training audio files in this structure:
    + Store the audio files played by the same instrument in the same folder.
    + Name each folder after its instrument.
    + Put all the folders under the same path.
    + Make sure the path contains no audio files other than training audio.
  3. Put the testing audio files together in one folder. The structure should look like this:
    training audios/
     |
     |-piano/
     |  |-*.wav
     |  |-*.wav
     |  |-...
     |
     |-guitar/
     |  |-*.wav
     |  |-*.wav
     |  |-...
     |
     |-violin/
     |  |-*.wav
     |  |-*.wav
     |  |-...
     |
     |-...
     
    testing audios/
     |-*.wav
     |-*.wav
     |-...
  4. Run generateMFCC.py. Follow the program's instructions and enter the path of the training audio files (in the above example, the path of the folder called "training audios"). The MFCC, △MFCC, and △△MFCC features are saved in instrument_name.npy files.
  5. Run trainmodel_SVM.py. This produces the SVM model, named model_svm, and a file named names, which stores the instrument names.
  6. Run test.py and enter the path of the testing audio files. The detection results will be shown.
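
Before running the scripts, it can help to check that the training folder follows the layout described above. The sketch below is a hypothetical helper (`collect_training_files` is not part of this repository) that uses only the standard library to map each instrument folder name to the .wav files it contains. It assumes, as in the steps above, that every subfolder of the training path is named after one instrument.

```python
from pathlib import Path


def collect_training_files(root):
    """Map each instrument folder under `root` to its sorted .wav files.

    Hypothetical helper for checking the layout; the repository's own
    scripts may scan the folders differently.
    """
    root = Path(root)
    dataset = {}
    for folder in sorted(p for p in root.iterdir() if p.is_dir()):
        wavs = sorted(
            f for f in folder.iterdir()
            if f.is_file() and f.suffix.lower() == ".wav"
        )
        if wavs:  # skip folders that contain no .wav files
            dataset[folder.name] = wavs
    return dataset
```

For example, calling `collect_training_files("training audios")` on the layout shown above would return a dictionary with keys such as `piano`, `guitar`, and `violin`. Instruments that are missing from the result, or that have unexpectedly few files, point to a folder that needs fixing before generateMFCC.py is run.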