Machines that can see and hear

Editor’s note: The Towards Data Science podcast’s “Climbing the Data Science Ladder” series is hosted by Jeremie Harris. Jeremie helps run a data science mentorship startup called SharpestMinds.

One of the most interesting recent trends in machine learning has been the combination of different types of data to unlock new use cases for deep learning. If the 2010s were the decade of computer vision and voice recognition, the 2020s may very well be the decade in which we finally figure out how to make machines that can see and hear the world around them, making them more context-aware and potentially even humanlike.

The push towards integrating diverse data sources has received a lot of attention from academics as well as companies. One of those companies is Twenty Billion Neurons, and its founder, Roland Memisevic, is our guest for this latest episode of the Towards Data Science podcast. Roland is a former academic who’s been knee-deep in deep learning since well before the hype sparked by AlexNet in 2012. His company has been working on deep learning-powered developer tools, as well as an automated fitness coach that combines video and audio data to keep users engaged throughout their workout routines.

Here are some of my favourite takeaways from today’s episode:

  • Academics who started down the deep learning path prior to 2012 were often ridiculed. The world of the 2000s was dominated by tabular data, which simple models like decision trees and support vector machines were well suited for, so most people incorrectly generalized from this and assumed that the tools of classical, statistical machine learning were more promising than neural networks. What kept deep learning buffs moving despite all that pushback was the belief that deep learning should be able to process a type of information that humans consume all the time, but that machines rarely encountered, especially back then: video and audio data.
  • The computational constraints imposed by mobile devices are a big consideration for companies developing new consumer-facing applications of machine learning. When Twenty Billion Neurons got started, mobile devices couldn’t handle the on-device machine learning needed to run the company’s automated fitness trainer software, so the team faced a choice: find a way to compress their models so that they could run on-device, or wait for the hardware to catch up with their software. Ultimately, Twenty Billion Neurons went with the second option, waiting for the hardware, and that paid off: in 2018 Apple phones started carrying a chip that unlocked the on-device processing they needed.
  • If you’re interested in experimenting with datasets that contain multiple data types, Roland recommends checking out the “something something” dataset, which is publicly available.

You can follow Twenty Billion Neurons on Twitter or LinkedIn, and you can follow me on Twitter.

If you’re curious about their upcoming fitness app launch, you can also follow them on Instagram.

