
> Deep learning requires very large training sets, and these shouldn’t be shipped around a lot.

To the extent that this is true, companies that offer these services may be driven to integrate more closely with customer data. This may involve custom in-house deployments, or cost-effective ways of pulling the data from, say, Amazon S3, or wherever it lives (HDFS, etc.).

This leads me to speculate on an additional business model: "Behind the Firewall" software deployment. This would be somewhat different from the four suggested in the article: 1. Sell hardware; 2. Open source plus services; 3. Hosted API, "Deep Learning as a Service"; 4. Individual deep learning services.



I think there is something to be said for the "deep learning gold rush." I'm one of those trying to cash in myself, as an independent player in the space with my own distributed deep learning framework[1].

The problem of data accessibility can be solved by an abstraction layer that auto-vectorizes (transforms into matrices) the needed data at runtime, trains the nets on that particular mini-batch of data, and continues on.

That's what I'm trying to do with the concept of a DataSetIterator[2]. It understands how to pull in the data and handles all the logistics, while the runtime only knows about DataSetIterators.
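To make the idea concrete, here is a minimal sketch of that pattern in Java. This is not the actual deeplearning4j API; the `DataSet` and `ListDataSetIterator` names, fields, and signatures are hypothetical, illustrating how a training loop can consume mini-batches without ever touching the raw data source:

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical mini-batch container: vectorized features and labels,
// one row per example.
class DataSet {
    final double[][] features;
    final double[][] labels;
    DataSet(double[][] features, double[][] labels) {
        this.features = features;
        this.labels = labels;
    }
    int numExamples() { return features.length; }
}

// Hypothetical iterator: pulls raw records from wherever they live and
// hands the runtime one vectorized mini-batch at a time. The training
// loop only ever sees DataSets, never the underlying source.
class ListDataSetIterator implements Iterator<DataSet> {
    private final List<double[]> rawFeatures;
    private final List<double[]> rawLabels;
    private final int batchSize;
    private int cursor = 0;

    ListDataSetIterator(List<double[]> rawFeatures,
                        List<double[]> rawLabels, int batchSize) {
        this.rawFeatures = rawFeatures;
        this.rawLabels = rawLabels;
        this.batchSize = batchSize;
    }

    @Override public boolean hasNext() {
        return cursor < rawFeatures.size();
    }

    @Override public DataSet next() {
        int end = Math.min(cursor + batchSize, rawFeatures.size());
        double[][] f = new double[end - cursor][];
        double[][] l = new double[end - cursor][];
        for (int i = cursor; i < end; i++) {
            // In a real implementation, vectorization of the raw record
            // (text, image, CSV row) would happen here, lazily.
            f[i - cursor] = rawFeatures.get(i);
            l[i - cursor] = rawLabels.get(i);
        }
        cursor = end;
        return new DataSet(f, l);
    }
}
```

A training loop would then just be `while (it.hasNext()) { fit(it.next()); }`, regardless of whether the data came from local disk, S3, or HDFS.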

I'm also partnering with a former Cloudera engineer in the Hadoop space to take on in-process YARN deep learning[3]. Data should not be moved; it should be processed and left where it is. I'll be interested to see the innovations in this space in the coming years.

I don't believe deep learning as a service is the way to go; I think behind-the-firewall deep learning apps will win out here.

[1]: http://deeplearning4j.org/

[2]: http://deeplearning4j.org/customdatasets.html

[3]: https://github.com/jpatanooga/Metronome



