How do we obtain a rich understanding of the world from visual input? What visual cues enable us to recognize objects from their structure or texture, or to remember objects and scenes that we have encountered before? What is the future for intelligent systems that can drive a car or assist the visually impaired? This unit explores these questions from the perspectives of perception and cognition, brain imaging, and the engineering of intelligent systems.
From very rudimentary visual abilities present at birth, infants learn to recognize complex objects such as hands and how to detect the direction of gaze of a caregiver. Shimon Ullman’s first lecture shows how such capabilities can be learned from a stream of natural videos and without supervision.
Shimon Ullman’s second lecture explores the minimal configurations of image content needed to recognize an object category, revealing a human capability that far surpasses that of current recognition systems and yielding insights into the brain mechanisms underlying visual recognition.
What makes an image memorable? Using insights from perceptual experiments, fMRI studies, and computational modelling, Aude Oliva has identified some key factors that determine visual memorability.
(Image based on Isola, Phillip, Jianxiong Xiao, Devi Parikh, Antonio Torralba, and Aude Oliva. “What Makes a Photograph Memorable?” IEEE Trans. Pattern Anal. Mach. Intell . 36, no. 7 (July 2014): 1469–1482. From author’s final manuscript, license CC BY-NC-SA.)
Understanding what makes an image memorable can shed light on the representations of visual knowledge in the brain and neural basis of memory loss, and is important for applications such as data visualization and image retrieval. From Aude Oliva, you will learn about the visual cues that govern memorability, revealed from behavioural experiments, computational models, and brain imaging studies.
Guest speaker Eero Simoncelli presents a physiologically inspired model of the analysis of visual textures by the ventral pathway of the brain. The synthesis of texture metamers, stimuli that are physically different but appear the same to a human observer, provide a powerful tool for probing the underlying brain mechanisms.
We are rapidly approaching a time of fully autonomous vehicles and intelligent systems to assist the blind. Guest speaker Amnon Shashua shows how advanced computer vision technology created by Mobileye will change transportation, and how wearable devices created by OrCam can profoundly impact the lives of the visually impaired.
Unit Activities
Useful Background
- Introductions to machine learning, neuroscience, cognitive science