"What I cannot create, I do not understand." – R. Feynman
Jonas Kubilius*, Kohitij Kar*, Kailyn Schmidt, James J. DiCarlo

Can deep neural networks rival human ability to generalize in core object recognition?

CCN, Philadelphia (PA, USA), 2018-09-06 16:30

Humans are thought to transfer their knowledge well to unseen domains. This putative ability to generalize is often contrasted with deep neural networks, which are believed to be mostly domain-specific. Here we assessed the extent of generalization abilities in humans and ImageNet-trained models along two axes of image variation: perturbations to images (e.g., shuffling, blurring) and changes in representation style (e.g., paintings, cartoons). We found that models often matched or exceeded human performance across most image perturbations, even without being exposed to such perturbations during training. Nonetheless, humans performed better than models when image styles were varied. We thus asked whether there was any linear decoder that, when applied to model features, would rectify model performance. By adding examples from all representation styles to decoder training, we found that models matched or surpassed human performance in all tested categories. Our results indicate that the encoding space of ImageNet-trained models is sufficiently rich to support suprahuman-level performance across multiple visual domains.
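The decoder experiment described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the "features" are synthetic vectors standing in for frozen model activations, the style change is modeled as a constant offset in feature space, and a multiclass perceptron is swapped in as one simple example of a linear decoder (the abstract does not say how the decoder was trained). All names (`make_features`, `train_linear_decoder`, `photo_shift`, `painting_shift`) are hypothetical.

```python
import random


def make_features(n_per_class, n_classes, dim, style_shift, rng):
    """Synthetic stand-in for frozen model features (an assumption, not real data).

    Each sample is a class prototype (unit step along one axis) plus a
    style-dependent offset plus Gaussian noise.
    """
    data = []
    for label in range(n_classes):
        for _ in range(n_per_class):
            x = [rng.gauss(0.0, 0.3) + style_shift[d] for d in range(dim)]
            x[label] += 1.0  # class prototype
            data.append((x, label))
    return data


def predict(weights, x):
    """Return the class whose linear score w . x + b is largest."""
    scores = [sum(w * v for w, v in zip(row, x)) + row[-1] for row in weights]
    return max(range(len(scores)), key=scores.__getitem__)


def train_linear_decoder(data, n_classes, dim, epochs, rng):
    """Multiclass perceptron; the last entry of each weight row is the bias."""
    weights = [[0.0] * (dim + 1) for _ in range(n_classes)]
    data = list(data)
    for _ in range(epochs):
        rng.shuffle(data)
        for x, label in data:
            pred = predict(weights, x)
            if pred != label:
                # Move the true class toward x and the wrong guess away from it.
                for d in range(dim):
                    weights[label][d] += x[d]
                    weights[pred][d] -= x[d]
                weights[label][dim] += 1.0
                weights[pred][dim] -= 1.0
    return weights


def accuracy(weights, data):
    """Fraction of samples the decoder classifies correctly."""
    return sum(predict(weights, x) == y for x, y in data) / len(data)


rng = random.Random(0)
N_CLASSES, DIM, EPOCHS = 4, 8, 20

# Hypothetical "styles": photos sit at the origin, paintings are shifted.
photo_shift = [0.0] * DIM
painting_shift = [rng.gauss(0.0, 1.0) for _ in range(DIM)]

photo_train = make_features(50, N_CLASSES, DIM, photo_shift, rng)
painting_train = make_features(50, N_CLASSES, DIM, painting_shift, rng)
painting_test = make_features(50, N_CLASSES, DIM, painting_shift, rng)

# Decoder fit on one style only vs. on examples from all styles.
decoder_photo_only = train_linear_decoder(photo_train, N_CLASSES, DIM, EPOCHS, rng)
decoder_all_styles = train_linear_decoder(
    photo_train + painting_train, N_CLASSES, DIM, EPOCHS, rng
)

acc_photo_only = accuracy(decoder_photo_only, painting_test)
acc_all_styles = accuracy(decoder_all_styles, painting_test)
print(f"painting accuracy, photo-only decoder: {acc_photo_only:.2f}")
print(f"painting accuracy, all-styles decoder: {acc_all_styles:.2f}")
```

The features are identical in both conditions; only the decoder's training set changes. That mirrors the logic of the abstract's claim: if a decoder trained on all styles succeeds where a single-style decoder fails, the limitation lies in the readout and not in the encoding space.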