
We’ve discovered neurons in CLIP that respond to the same concept whether presented literally, symbolically, or conceptually. This may explain CLIP’s accuracy in classifying surprising visual renditions of concepts, and is also an important step toward understanding the associations and biases that CLIP and similar models learn.
Contents
Multimodal neurons in CLIP
Absent concepts
How multimodal neurons compose
Fallacies of abstraction
Attacks in the wild
Bias and overgeneralization
Conclusion
Fifteen years ago, Quiroga et al. discovered that the human brain possesses multimodal neurons. These neurons respond to clusters of abstract concepts centered around a common …





