What Will It Take To Usher In The Next Era Of Deep Learning?

"What are the major bottlenecks in making deep learning systems more effective?" originally appeared on Quora - the knowledge sharing network where compelling questions are answered by people with unique insights.

Answer by Rajat Monga, engineering director for TensorFlow, on Quora.

Deep learning seems to have gained success in recent years because of the convergence of three things:

  • Algorithms: This area has seen some improvements, but most of the early wins came from fairly old ideas. Now that deep learning is showing success, we are seeing some good advances as well.
  • Datasets: Training large networks is hard without large enough datasets. MNIST can only go so far in pushing the limits. Having datasets like ImageNet has really helped push the state of the art in vision.
  • Compute Engine: I believe this has been the biggest game changer in recent years. With my background in systems, I'm a little biased; however, Compute had a big role to play in some of the early big wins for deep learning. The cat paper from Google Brain in 2011 and the 2012 ImageNet results by Krizhevsky et al. brought deep learning to the forefront of computer vision.

That said, all of these have a long way to go to make deep learning more effective:

  • Compute Engine: This remains a challenge. Even as we are getting customized chips, there's continued demand for more.
  • Datasets: With the current algorithms being largely supervised, there is a need for larger datasets to continue to push the envelope on new research. At Google, we are actively working on improving this and have recently released a video dataset and two robotics datasets.
  • Algorithms: Not to downplay their role, ideas like ReLU, Dropout, Sequence to Sequence, and GANs have brought big changes. Going back to Compute, we aren't likely to get a 1000x improvement from traditional hardware improvements alone. It will need co-design of algorithms and Compute (e.g. can we create a model with 1000x more parameters, but using only 10x more Compute?). I believe that building sparse models that address this issue will be a game changer; a rough sketch of the idea follows after this list.
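To make that co-design point concrete, here is a minimal NumPy sketch of one way a sparse model can hold far more parameters than it uses per example: a mixture-of-experts style layer with top-k gating. The layer structure, the sizes, and the gating scheme are illustrative assumptions of mine, not something described in the answer itself.

import numpy as np

# Illustrative sizes only: the layer stores 128 experts' worth of parameters,
# but each input is routed to just its top 2 experts, so per-example compute
# grows much more slowly than the total parameter count.
rng = np.random.default_rng(0)
d_in, d_out = 64, 256
num_experts, top_k = 128, 2

expert_weights = rng.standard_normal((num_experts, d_in, d_out)) * 0.02
gate_weights = rng.standard_normal((d_in, num_experts)) * 0.02

def sparse_moe(x):
    """Route each input to its top_k experts and mix their ReLU outputs."""
    gate_logits = x @ gate_weights                        # (batch, num_experts)
    chosen = np.argsort(-gate_logits, axis=-1)[:, :top_k]  # indices of the best experts
    out = np.zeros((x.shape[0], d_out))
    for i, experts in enumerate(chosen):
        logits = gate_logits[i, experts]
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()                              # softmax over the chosen experts only
        for p, e in zip(probs, experts):
            # Only top_k of the 128 experts actually run for this input.
            out[i] += p * np.maximum(x[i] @ expert_weights[e], 0.0)
    return out

x = rng.standard_normal((4, d_in))
print(sparse_moe(x).shape)   # (4, 256): full model capacity, only 2 of 128 experts per input

Adding more experts grows the parameter count roughly linearly while the per-example compute stays fixed at top_k expert evaluations, which is the spirit of getting many times more parameters for only a modest increase in Compute.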

This question originally appeared on Quora. You can follow Quora on Twitter, Facebook, and Google+.
