Machines that can see and hear

Category: IT Technology · Published: 4 years ago

Editor’s note: The Towards Data Science podcast’s “Climbing the Data Science Ladder” series is hosted by Jeremie Harris. Jeremie helps run a data science mentorship startup called SharpestMinds. You can listen to the podcast below:

One of the most interesting recent trends in machine learning has been the combination of different types of data to unlock new use cases for deep learning. If the 2010s were the decade of computer vision and voice recognition, the 2020s may very well be the decade we finally figure out how to make machines that can see and hear the world around them, making them far more context-aware and potentially even humanlike.

The push towards integrating diverse data sources has received a lot of attention from academics and companies alike. One of those companies is Twenty Billion Neurons, and its founder, Roland Memisevic, is our guest for this latest episode of the Towards Data Science podcast. Roland is a former academic who’s been knee-deep in deep learning since well before the hype that was sparked by AlexNet in 2012. His company has been working on deep learning-powered developer tools, as well as an automated fitness coach that combines video and audio data to keep users engaged throughout their workout routines.

Here are some of my favourite take-homes from today’s episode:

  • Academics who started down the deep learning path prior to 2012 were often ridiculed. The world of the 2000s was dominated by tabular data that simple models like decision trees and support vector machines were well suited for, so most people incorrectly generalized from this and assumed that the tools of classical, statistical machine learning were more promising than neural networks. What kept deep learning buffs moving despite all that pushback was the belief that deep learning should have the potential to process a type of information that humans consume all the time, but that machines rarely encountered, especially back then: video and audio data.
  • The computational constraints imposed by mobile devices are a big consideration for companies that are developing new consumer-facing applications for machine learning. When Twenty Billion Neurons got started, mobile devices couldn’t handle the on-device machine learning that their automated fitness trainer software required, so they were faced with a choice: find a way to compress their models so that they could be run on-device, or wait for the hardware to catch up with their software. Ultimately, Twenty Billion Neurons went with the second option and waited for the hardware, and that paid off: in 2018, Apple phones started carrying a chip that unlocked the on-device processing they needed.
  • If you’re interested in experimenting with datasets that contain multiple data types, Roland recommends checking out the “something something” dataset, publicly available from here.

You can follow Twenty Billion Neurons on Twitter here or on LinkedIn here, and you can follow me on Twitter here.

If you’re curious about their upcoming fitness app launch, you can also give them a follow on Instagram here.

