In this talk, I will present a theoretical synthesis of several ideas about how visual information is processed in mid-level areas. To support the proposed framework, I will present recent experimental evidence from our lab and show a proof-of-concept model that implements these ideas in practice. In the second part of the presentation, I will broaden the discussion to convolutional neural architectures, another interesting candidate for investigating visual processing, and describe several experiments that I recently conducted to explore their capacity for mid-level processing.