Audio Datasets for Machine Learning

Category: IT Technology · Published: 4 years ago

At Lionbridge, we have deep experience helping the world’s largest companies teach applications to understand audio. From virtual assistants to in-car navigation, all sound-activated machine learning systems rely on large sets of audio data.

This time, we at Lionbridge combed the web and compiled this ultimate cheat sheet of public audio datasets for machine learning.

Audio Speech Datasets for Machine Learning

AudioSet: AudioSet is an expanding ontology of 632 audio event classes and a collection of 2,084,320 human-labeled 10-second sound clips drawn from YouTube videos.

LibriSpeech: LibriSpeech is a carefully segmented and aligned corpus of approximately 1,000 hours of 16 kHz read English speech, derived from audiobooks from the LibriVox project.

Spoken Digit Dataset: This dataset was created to solve the task of identifying spoken digits in audio samples.

Flickr Audio Caption Corpus: This corpus includes 40,000 spoken captions of 8,000 natural images. It was collected in 2015 to investigate multimodal learning schemes for unsupervised speech pattern discovery.

Spoken Wikipedia Corpora: This is a corpus of aligned spoken Wikipedia articles from the English, German, and Dutch Wikipedia. It provides hundreds of hours of aligned audio, and the annotations can be mapped back to the original HTML.

VoxCeleb: VoxCeleb is an audio-visual dataset consisting of short clips of human speech, extracted from interview videos uploaded to YouTube.

Freesound: This is a platform for the collaborative creation of audio collections labeled by humans and based on Freesound content.
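
To get a feel for the scale of the corpora listed above, the quoted figures can be turned into rough totals. The sketch below is only a back-of-the-envelope estimate, not official dataset statistics. It assumes every AudioSet clip runs the full 10 seconds (so the result is an upper bound) and that LibriSpeech audio is held as uncompressed 16-bit mono PCM (an assumption made here for sizing; the article does not state a storage format). The function names are illustrative.

```python
# Rough sizing from the figures quoted in the list above.
# Assumptions (not from the article): every AudioSet clip is a full 10 s,
# and LibriSpeech is stored as uncompressed 16-bit mono PCM.

AUDIOSET_CLIPS = 2_084_320
CLIP_SECONDS = 10

LIBRISPEECH_HOURS = 1000
SAMPLE_RATE_HZ = 16_000
BYTES_PER_SAMPLE = 2  # assumed 16-bit PCM


def audioset_hours(clips=AUDIOSET_CLIPS, seconds=CLIP_SECONDS):
    """Upper bound on total AudioSet audio, in hours."""
    return clips * seconds / 3600


def pcm_gigabytes(hours, sample_rate=SAMPLE_RATE_HZ,
                  bytes_per_sample=BYTES_PER_SAMPLE):
    """Size of mono PCM audio in decimal gigabytes (1 GB = 10**9 bytes)."""
    return hours * 3600 * sample_rate * bytes_per_sample / 1e9


if __name__ == "__main__":
    print(f"AudioSet: about {audioset_hours():,.0f} hours of audio")
    print(f"LibriSpeech: about {pcm_gigabytes(LIBRISPEECH_HOURS):.1f} GB as raw PCM")
```

Under these assumptions, AudioSet comes to roughly 5,790 hours of audio and LibriSpeech to roughly 115 GB of raw samples, which is worth knowing before planning storage or download time.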

Acoustic Datasets for Machine Learning

Mivia Audio Events Dataset: This dataset includes 6,000 events for surveillance applications, namely glass breaking, gunshots, and screams. The events are divided into a training set of 4,200 events and a test set of 1,800 events.

DCASE 2017 Challenge Data: These are open datasets used and collected for the Detection and Classification of Acoustic Scenes and Events (DCASE) challenge.
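
The Mivia figures above (4,200 training and 1,800 test events out of 6,000) work out to a 70/30 split. If you need a split with the same proportions on your own event list, one possible approach is a seeded shuffle, sketched below. The event identifiers and the `split_events` helper are hypothetical and used only for illustration; this is not how the Mivia authors produced their published split.

```python
import random


def split_events(event_ids, train_fraction=0.7, seed=0):
    """Shuffle event IDs reproducibly and split them into train and test lists."""
    ids = list(event_ids)
    # A dedicated Random instance keeps the split repeatable for a given seed.
    random.Random(seed).shuffle(ids)
    cut = round(len(ids) * train_fraction)
    return ids[:cut], ids[cut:]


if __name__ == "__main__":
    # Hypothetical identifiers standing in for real audio events.
    events = [f"event_{i:04d}" for i in range(6000)]
    train, test = split_events(events)
    print(len(train), len(test))
```

Fixing the seed means the same events land in the same partition on every run, which keeps results comparable across experiments.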

Music Datasets for Machine Learning

Million Song Dataset: This is a freely available collection of audio features and metadata for a million contemporary popular music tracks.

Ballroom: This dataset includes data on ballroom dancing, such as that found in online lessons. It provides characteristic excerpts and tempi of dance styles in real audio format.

Free Music Archive (FMA): This is a dataset for music analysis that consists of full-length, high-quality audio, pre-computed features, and track-level and user-level metadata.

If you missed our previous articles, we’d recommend the 50 Best Datasets for Machine Learning, 12 Best Social Media Datasets, and more.

Still can’t find what you need? Lionbridge AI provides custom voice and sound data in 300 languages for your specific machine learning project needs.

