I didn’t read the linked paper, but I did read the linked blog post (http://karpathy.github.io/2014/09/02/what-i-learned-from-competing-against-a-convnet-on-imagenet/) with great interest. I like this quote:
“It is clear that humans will soon only be able to outperform state of the art image classification models by use of significant effort, expertise, and time.”
Originally shared by Jeff Dean
Rethinking the Inception Architecture for Computer Vision
An arXiv paper posted yesterday at http://arxiv.org/abs/1512.00567
by my colleagues Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jon Shlens, and Zbigniew Wojna details a bunch of improvements to the Inception image classification model that they’ve been working on. An ensemble of four of these models achieves 3.46% top-5 error on the validation set of the ImageNet whole-image ILSVRC2012 classification task, compared with the 6.66% top-5 error of the ensemble of the initial version of Inception that won last year’s 2014 ImageNet classification challenge (a 48% relative reduction in top-5 error rate).
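For readers unfamiliar with the metric: top-5 error counts a prediction as wrong only if the true label is not among the model’s five highest-scoring classes. Here’s a minimal sketch of that computation and of the relative-reduction arithmetic quoted above (NumPy-based; the function name and toy data are mine, not from the paper):

```python
import numpy as np

def top5_error(scores: np.ndarray, labels: np.ndarray) -> float:
    """Fraction of examples whose true label is NOT among the
    five highest-scoring classes."""
    # Indices of the five largest scores per row (order doesn't matter).
    top5 = np.argpartition(scores, -5, axis=1)[:, -5:]
    hits = (top5 == labels[:, None]).any(axis=1)
    return 1.0 - hits.mean()

# Toy sanity check: random scores over 1000 ImageNet classes
# should give a top-5 error near 1 - 5/1000 = 0.995.
rng = np.random.default_rng(0)
scores = rng.normal(size=(10_000, 1000))
labels = rng.integers(0, 1000, size=10_000)
print(f"random-guess top-5 error: {top5_error(scores, labels):.3f}")

# The relative improvement cited above, from the stated error rates:
print(f"relative reduction: {(6.66 - 3.46) / 6.66:.1%}")  # ~48.0%
```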
For comparison, Andrej Karpathy estimates that a well-trained human (himself) can achieve a top-5 error of about 5.1%, as detailed in his delightfully written “_What I learned from competing against a ConvNet on ImageNet_” blog post at http://karpathy.github.io/2014/09/02/what-i-learned-from-competing-against-a-convnet-on-imagenet/
The latest Inception models were trained using TensorFlow, our newly open-sourced machine learning system (see http://tensorflow.org).