

I have just released a new project! It is a VGG16-based audio classifier, intended to distinguish between scratch and hit sounds extracted from the Greatest Hits dataset by Andrew Owens et al. [1].

This VGGIshIsh follows the idea introduced by Vladimir Iashin and Esa Rahtu in their publication "Taming Visually Guided Sound Generation" [2]. See the project's GitHub repository for more detailed information and usage instructions ♥‿♥.

Find my project here.

  1. Visually Indicated Sounds by Andrew Owens et al. @ https://andrewowens.com/vis/

  2. Taming Visually Guided Sound Generation by Vladimir Iashin & Esa Rahtu @ https://github.com/v-iashin/SpecVQGAN