Data Format

  • Noise samples should be a variety of .wav files, each 10 to 240 minutes long
  • Ambient Examples:
    • 45-minute .wav file of coffee shop noise
    • 80-minute .wav file of street noise
  • Talk Examples:
    • 1-hour .wav file of a talk show
    • 30-minute .wav file of a podcast
  • Music Examples:
    • Several .wav files of music tracks in varying genres
  • Verify that these long files are 16 kHz mono and that their content is as varied as possible
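
The format check above can be scripted. The following is a minimal sketch using Python's standard `wave` module, assuming the files are uncompressed PCM .wav files (the only kind that module reads). The function name `check_noise_file` and its parameters are illustrative and not part of any AONDevices tool.

```python
import wave

EXPECTED_RATE_HZ = 16_000  # 16 kHz, as required above
EXPECTED_CHANNELS = 1      # mono


def check_noise_file(path, min_minutes=10, max_minutes=240):
    """Return a list of problems found in one .wav file (empty list = OK)."""
    with wave.open(str(path), "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        minutes = wav.getnframes() / rate / 60

    problems = []
    if rate != EXPECTED_RATE_HZ:
        problems.append(f"sample rate is {rate} Hz, expected {EXPECTED_RATE_HZ} Hz")
    if channels != EXPECTED_CHANNELS:
        problems.append(f"{channels} channels found, expected mono")
    if not min_minutes <= minutes <= max_minutes:
        problems.append(
            f"duration is {minutes:.1f} min, expected {min_minutes}-{max_minutes} min"
        )
    return problems
```

Calling `check_noise_file("coffee_shop.wav")` on a hypothetical file returns an empty list when the file meets all three conditions. Variability of content cannot be checked this way and still needs a manual review.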

Data Structure

  • User must provide a Noise Directory with “garbage” data
  • Data must be split in two ways:
    • By Dataset
    • By Noise Type (Ambient, Music, or Talk)
  • Data assigned to different datasets must be similar in content but come from different sources
  • Example: One-hour news clips from CNN, ABC, and BBC can be split as follows:
    • CNN → test/Talk
    • ABC → train/Talk
    • BBC → valid/Talk
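
The two-way split can be checked before running the tool. The sketch below assumes the Noise Directory is laid out as `<dataset>/<noise type>`, with the dataset names `train`, `valid`, and `test` taken from the example above. The function name `missing_noise_dirs` is hypothetical, and the tool's actual expected layout may differ.

```python
from pathlib import Path

DATASETS = ("train", "valid", "test")       # assumed from the example above
NOISE_TYPES = ("Ambient", "Music", "Talk")  # noise types named in the text


def missing_noise_dirs(noise_root):
    """List the '<dataset>/<noise type>' directories absent under noise_root."""
    root = Path(noise_root)
    return [
        f"{dataset}/{noise_type}"
        for dataset in DATASETS
        for noise_type in NOISE_TYPES
        if not (root / dataset / noise_type).is_dir()
    ]
```

An empty result means all nine directories exist. It does not confirm that the sources differ between datasets, which still has to be verified by hand.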

Data Length

  • NOTE: User can use the copyright-free sources in the Noise Dataset Composition provided by AONDevices
  • Noise files can be long segments of random talk, music, car/street noise, etc.
  • Noise data must be split into three different noise type directories:
    • Ambient (Freeway Noise, White Noise, Cocktail Noise, etc.)
    • Music (Rock Music, Jazz, Orchestra, etc.)
    • Talk (News, Movies, TV Shows, etc.)
  • Directories must be named exactly as shown: Ambient, Music, and Talk
  • Table below provides minimum required noise inputs split evenly between Music and Talk:

*Does not include required Unknown directory.

  • Below are the required noise inputs for Ambient for any number of classes:

Note: Tool will notify user if insufficient data is provided
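
To catch a shortfall before running the tool, the total audio per noise type can be summed and compared with the required minimums. This is only a sketch and not the tool's own check. The function names are hypothetical, and `required_minutes` is a caller-supplied mapping whose values should be taken from the tables referenced above. It assumes PCM .wav files placed directly inside each noise type directory.

```python
import wave
from pathlib import Path


def minutes_by_noise_type(dataset_dir, noise_types=("Ambient", "Music", "Talk")):
    """Sum the duration, in minutes, of the .wav files in each noise type directory."""
    totals = {}
    for noise_type in noise_types:
        type_dir = Path(dataset_dir) / noise_type
        seconds = 0.0
        if type_dir.is_dir():
            for wav_path in sorted(type_dir.glob("*.wav")):
                with wave.open(str(wav_path), "rb") as wav:
                    seconds += wav.getnframes() / wav.getframerate()
        totals[noise_type] = seconds / 60
    return totals


def find_shortfalls(totals, required_minutes):
    """Return {noise type: minutes still missing} for types below their minimum."""
    return {
        noise_type: needed - totals.get(noise_type, 0.0)
        for noise_type, needed in required_minutes.items()
        if totals.get(noise_type, 0.0) < needed
    }
```

An empty dictionary from `find_shortfalls` means every listed noise type meets the minimum passed in.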