DCASE 2019 Challenge: Audio Tagging with Noisy Labels and the Effect of Model Architecture

EasyChair Preprint no. 1274

5 pages. Date: July 10, 2019

Abstract

In this study, we investigate the use of noisy-labeled data for pretraining multi-label audio tagging models. The system is a neural network built from convolutional and dense layers applied to log-scale mel-frequency spectrograms of the input data. We compare three strategies: pretraining on the full noisy dataset; pretraining on a part of the noisy dataset selected automatically by a model trained on curated data; and using curated data only to build the audio tagging system. In addition, we assess the effect of the model architecture on system performance. In particular, a baseline model consisting of four convolutional blocks followed by a fully connected head is compared with models built from densely connected blocks. We also examine the effect of the number of blocks and of the pooling used to generate features for the fully connected part. An optimized architecture, together with 64 frequency channels in the preprocessed mel spectrograms, allowed us to build a fast baseline model whose 5-fold cross-validation label-weighted label-ranking average precision (lwlrap) score reaches 0.87 and which takes only 60-90 seconds (5 folds) to generate a prediction on the public portion of the test dataset. The final system is an ensemble of 15 models trained with slightly modified parameters and selected to minimize correlations between model predictions.
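
For readers unfamiliar with the lwlrap metric reported above, the following is a minimal pure-Python sketch of how it is commonly defined: for every (clip, true label) pair, take the precision at the rank where that label is retrieved, then average over all such pairs. The function name `lwlrap`, its argument layout, and the tie-breaking rule (ties keep class-index order) are assumptions made for illustration; this is not the authors' code or the official challenge scorer.

```python
def lwlrap(truth, scores):
    """Label-weighted label-ranking average precision (illustrative sketch).

    truth:  list of rows of 0/1 flags, one flag per class
    scores: list of rows of predicted scores, same shape as truth
    """
    precisions = []
    for labels, preds in zip(truth, scores):
        # Class indices sorted by descending score.
        # sorted() is stable, so tied scores keep class-index order (assumption).
        order = sorted(range(len(preds)), key=lambda c: -preds[c])
        hits = 0
        for rank, c in enumerate(order, start=1):
            if labels[c]:
                hits += 1
                # Precision at the rank where this true label is retrieved.
                precisions.append(hits / rank)
    # Every (clip, true label) pair contributes equally, so classes are
    # implicitly weighted by how often they occur.
    return sum(precisions) / len(precisions) if precisions else 0.0


# Toy example: two clips, three classes.
toy_truth = [[1, 0, 0], [0, 1, 1]]
toy_scores = [[0.9, 0.1, 0.2], [0.8, 0.7, 0.1]]
toy_result = lwlrap(toy_truth, toy_scores)
```

In the toy example the first clip's only true label is ranked first (precision 1), while the second clip's true labels are ranked second and third (precisions 1/2 and 2/3), so the result is 13/18, roughly 0.72.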

Keyphrases: audio classification, Convolutional Neural Network, deep learning, noisy data

BibTeX entry
BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:
@Booklet{EasyChair:1274,
  author = {Maxim Shugaev and Lehan Yang and Theo Viel and Khoi Nguyen},
  title = {DCASE 2019 Challenge: Audio Tagging with Noisy Labels and the Effect of Model Architecture},
  howpublished = {EasyChair Preprint no. 1274},

  year = {EasyChair, 2019}}