"What I cannot create, I do not understand." – R. Feynman
Jonas Kubilius

Deep nets as a computational model for human shape sensitivity

Cox lab (Harvard), Cambridge, MA (USA), 2015-05-22 slides

Theories of object recognition agree that shape is of primordial importance, but there is no consensus about how shape might be represented, and so far attempts to implement a model of shape perception that would work with realistic stimuli have largely failed. Here we demonstrate that the sensitivity to shape features characteristic of human and primate vision emerges in state-of-the-art convolutional neural networks when they are trained for generic object recognition from natural photographs. We show that these models explain human shape judgments for several benchmark behavioral and neural stimulus sets on which earlier models failed. In particular, although never explicitly trained to do so, these models develop sensitivity to non-accidental properties, which have long been proposed to form the basis of object recognition. Even more strikingly, when tested with a challenging stimulus set in which shape and category membership are dissociated, the most complex model architectures capture human shape sensitivity as well as some aspects of the category structure that emerges from human judgments. As a whole, these results indicate that convolutional neural networks not only learn physically correct representations of object and scene categories but also develop perceptually accurate representational spaces of shapes. An even fuller model of human object representations might come within reach by training deep architectures on multiple tasks, as is characteristic of human development.