Yale Perception & Cognition Lab

VSS '18 Abstracts
 
 
Chen, Y. -C., Colombatto, C., & Scholl, B. J. (2018). Looking into the future: An inward bias in aesthetic experience driven only by gaze cues. Poster presented at the annual meeting of the Vision Sciences Society, 5/23/18, St. Pete Beach, FL.  
When you aim your camera at a scene you wish to capture, you face a problem that artists have faced for centuries: How can you best frame your composition? There is no easy answer to this question, since people's aesthetic preferences vary dramatically, and are influenced by personal history and countless cultural factors. There are, nevertheless, some regularities that are powerful enough to persist across people, contexts, and time. For example, in framed images such as photographs, we prefer peripheral figures that face inward (vs. outward). Why does this "inward bias" exist? Since agents tend to act in the direction in which they are facing, one intriguing possibility is that the inward bias reflects a preference to view scenes from a perspective that will allow us to witness those predicted future actions. This account has been difficult to test with previous displays, in which facing direction was often confounded with either global shape profiles or the relative locations of salient features (since, e.g., someone's face is generally more visually interesting than the back of their head). But here we demonstrate a robust inward bias in aesthetic judgment driven by a cue that is socially powerful but visually subtle: averted gaze. Subjects adjusted the positions of people in images to maximize the images' aesthetic appeal. People with direct gaze were not placed preferentially in particular regions, but people with averted gaze were reliably placed so that they appeared to be looking inward. A second experiment with color-inverted images ruled out confounds related to lower-level visual properties, suggesting that the effect is driven by perceived gaze per se. These results demonstrate that the inward bias may be an adaptive feature of our minds: it can arise from visually subtle features, when those features signal how future events may unfold.
 
Colombatto, C., Chen, Y. -C., & Scholl, B. J. (2018). Gaze cueing is tuned to extract the mind behind the gaze: Investigations of 'gaze deflection'. Talk given at the annual meeting of the Vision Sciences Society, 5/19/18, St. Pete Beach, FL.  
The most salient 'social' visual stimuli we encounter are faces, and perhaps the most informative features of faces are eyes. Indeed, other people's eyes seem to be particularly meaningful to us, and perceived gaze can rapidly and automatically cause shifts of attention, as in the phenomenon of gaze cueing. But why is eye gaze so important? Presumably, gaze is meaningful not because of what it reveals about another person's eyes, but rather what it reveals about the *mind behind the gaze* -- e.g. about what someone is attending to, or is intending to do. When you turn to look at something, however, it is not always because you are attending to it. Consider, for example, the familiar but unexplored phenomenon of 'gaze deflection' -- when you are surreptitiously looking at someone and then suddenly look away when they catch you staring. In these cases, the 'deflected' gaze is not directed at anything in particular, but is only directed *away* from something (or someone) else. Do such 'deflected' gazes still orient other people's attention? To find out, we had subjects view videos of a person turning her head to look in a specific direction either to attend in that direction (Intentional gazes) or because she had just gotten caught staring at someone else and was looking away from that person (Deflected gazes). Gaze cueing (measured by the ability to identify a briefly flashed letter in the direction of a gaze) was stronger for Intentional gazes than for otherwise equivalent Deflected gazes -- and this difference disappeared in control videos in which gaze did not appear to be 'deflected', even while controlling for other low-level visual properties. This shows how the process of gaze cueing is especially sophisticated -- insofar as it is well-tuned to extract the 'mind' behind the gaze.
 
Kominsky, J., & Scholl, B. J. (2018). Retinotopically specific adaptation reveals different categories of causal events: Launching vs. entraining. Poster presented at the annual meeting of the Vision Sciences Society, 5/23/18, St. Pete Beach, FL.  
Visual processing recovers not only low-level properties such as color and motion, but also seemingly higher-level properties such as causality. In Michotte's 'launching effect', for example, an object (A) moves toward a stationary second object (B) until they are adjacent, at which point A stops and B starts moving in the same direction. In this situation, observers have a visceral visual impression that B's motion was *caused* by A's impact. And among the evidence that this truly reflects visual processing (as opposed to higher-level judgment) is the discovery that causal launching supports retinotopically specific adaptation (Rolfs et al., 2013, Current Biology): viewing causal launching causes a later ambiguous event (in which A and B may overlap to some degree before A stops and B starts moving) to be perceived as non-causal 'passing' -- but only if the two events occur in the same retinal location. Does this reflect the detection of some unitary phenomenon of causality, or might vision extract multiple distinct forms of causal perception (as explored by Kominsky et al., 2017, Psychological Science)? Here we use adaptation to ask whether launching is a fundamentally different category from *entraining* -- which is superficially identical to launching, except that A continues to move along with B once they make contact. In contrast to other sorts of causal events (Kominsky & Scholl, 2016, VSS), retinotopically specific adaptation did not transfer between launching and entraining. In particular, adapting to entraining events had no effect on the subsequent perception of ambiguous events as involving launching or passing. We conclude that there are indeed fundamentally distinct categories of causal perception in vision. Furthermore, this emphasizes the sensitivity of the adaptation effect, which is specific enough to distinguish not only between causal and non-causal events, but between different categories of causal events.
 
Lin, Q., Yousif, S., Scholl, B. J., & Chun, M. (2018). Visual memorability in the absence of semantic content. Poster presented at the annual meeting of the Vision Sciences Society, 5/23/18, St. Pete Beach, FL.  
What makes an image memorable? Recent work has characterized an intrinsic property of images, memorability, which predicts the likelihood of an image being remembered across observers (Isola et al., 2011; Bainbridge et al., 2013). Memorable images frequently contained objects and humans -- raising the question of whether there is memorability in the absence of semantic content. Here, we describe *visual memorability*: memorability that is driven not by semantic content but by low-level visual features per se. Participants viewed a sequence of natural scene images (sampled from Isola et al., 2014) and made a response whenever they saw an image that they had seen previously during the task. Replicating previous findings, memorability was reliable across individuals, and these memorability scores were significantly correlated with those from the original study. To eliminate semantic content, we then transformed the original natural scene images using transformations such as phase-scrambling or texture-scrambling, and tested their memorability using the same paradigm in independent samples. Unsurprisingly, transformed images were significantly less memorable than the original meaningful images. Critically, however, we still found reliable memorability for both types of scrambling. That is, certain images were more likely to be remembered across observers, even when they contained little-to-no semantic content. Interestingly, memorability scores for intact, phase-scrambled, and texture-scrambled images were unrelated: an image that was memorable once transformed was not necessarily memorable in the original sample, and vice versa. Furthermore, when we used a computer vision model previously trained to predict memorability (Khosla et al., 2015), the predictions for the scrambled images did not predict the memorability of the scrambled images themselves although they did predict the memorability of the original images, suggesting that scrambling preserves low-level features that predict memorability. Thus, our results expand prior work and suggest that there is pure visual memorability that operates independently of semantic content.
 
Ongchoco, J., & Scholl, B. J. (2018). The end of motion: How the structure of simple visual events impacts working memory and enumeration. Poster presented at the annual meeting of the Vision Sciences Society, 5/19/18, St. Pete Beach, FL.  
Beyond static scenes, our visual experience is populated by dynamic visual events -- and these frequently overlap in time, as events are continuously and asynchronously starting, unfolding, and ending all around us. How do we represent and remember events amidst this rush of things that are always *happening*? The mind could simply accumulate information largely regardless of how that information is bound into particular discrete events. Or information could be prioritized when it marks the *onset* of a new event. Or information may only be stored for as long as its 'parent' event is ongoing (as in models that posit 'memory flushing' at event boundaries). We explored such possibilities using maximally simple visual events. Observers viewed animations with a number of initially-static dots. A subset of dots then moved in random directions and speeds, and eventually (e.g. after 1s) were again static -- but in some conditions these motions could occur asynchronously, with each dot potentially beginning and ending its motion at a different moment. On each trial, observers simply had to estimate the number of dots that had moved. An animation's event structure -- i.e. just when the motions started and stopped -- had a strong impact on performance: asynchronous motions were consistently underestimated relative to synchronous motions. Further comparisons revealed that the (a)synchrony of motion *onsets* had little effect, whereas asynchrony of motion *offsets* always led to underestimation (regardless of whether onsets were synchronous or not). Thus, when attempting to answer questions of "how many moved?", the visual system seems to instead deliver information about "how many just stopped moving?". In other words, the *ends* of events seem to have an outsize influence on working memory: once a motion ends, it seems more difficult to recall that particular motion as having occurred as a distinct event.
 
Uddenberg, S., & Scholl, B. J. (2018). Ten angry men: Serial reproduction of faces reveals that angry faces are represented as more masculine. Poster presented at the annual meeting of the Vision Sciences Society, 5/20/18, St. Pete Beach, FL.  
Men are angry. That, at least, is a common stereotype relating gender and emotion. But how is this stereotype realized in the mind? It could reflect a judgmental bias, based on conceptual associations in high-level cognition. But another possibility is that it is (also) more deeply ingrained, such that we actually see male faces as angrier, as a consequence of relatively automatic aspects of visual perception. We explored this using the method of serial reproduction, where visual memory for a briefly presented face is passed through 'chains' of many different observers. Here, a single face was presented, with its gender selected from a smooth continuum between Female and Male. In an exceptionally simple task, the observer then just had to reproduce that face's gender by morphing a test face along the gender continuum using a slider. Critically, both the initially presented face and the test face could (independently) have an Angry or Happy expression, which the participant could not change. Within each chain of observers, these expressions were held constant, while the gender of each initially presented face was determined by the previous observer's response. In most cases, the chains merely converged on a region toward the midpoint of the gender continuum (even when they started out near the extremes). Strikingly, however, we observed a very different pattern -- with chains instead converging near the Male extreme -- when observers were shown an Angry face but then tested on a Happy face. This is exactly the pattern one would expect if Angry faces are perceived (and thus misremembered) as more Male than they actually were. (In contrast, when Angry faces are tested with Angry faces, this sort of bias effectively cancels out.) These results illustrate how prominent stereotypes have reflections in relatively low-level visual processing, during exceptionally simple tasks.
 
van Buren, B., & Scholl, B. J. (2018). The 'Blindfold Test' for deciding whether an effect reflects visual processing or higher-level judgment. Poster presented at the annual meeting of the Vision Sciences Society, 5/19/18, St. Pete Beach, FL.  
Beyond lower-level features such as color and orientation, visual processing also traffics in seemingly higher-level properties such as animacy and causality. For decades, researchers have studied these sorts of phenomena by asking observers to view displays and make subjective reports about such properties -- e.g. "How alive does this dot look on a scale from 1-7?" Do these experiments measure observers' visual impressions, or merely their judgments that certain features *should* reflect animacy? (Even if you accept that we can truly perceive properties such as animacy, of course we can and do think about them as well.) Here we introduce the 'Blindfold Test' for helping to determine whether an effect reflects perception or judgment. The logic of the test is simple: If an experimental result can be obtained not only with visual displays, but also using written *descriptions* of those displays -- i.e. without any visual stimuli at all (as if the subjects were wearing blindfolds) -- then the fact that subjects attest in some way to seeing a property cannot (and should not!) be taken as evidence for visual processing of that property. Here we apply the Blindfold Test to two past studies. In the first study, subjects reported that moving shapes looked more animate when they increased their speed or changed heading. In the second study, subjects reported that shapes in a collision event appeared to exert less force when they 'shattered' into many pieces. To find out whether these results implicate visual processing per se, we reran these experiments while replacing the visual stimuli with (mere) descriptions. Both experiments' key findings replicated -- in other words, they failed the Blindfold Test. As such, these studies do not license conclusions about perception, and this test may aid researchers interested in properly interpreting such results.
 
Yousif, S., Chen, Y. -C., & Scholl, B. J. (2018). The origin of spatial biases: Memory, perception, or action? Poster presented at the annual meeting of the Vision Sciences Society, 5/23/18, St. Pete Beach, FL.  
Spatial location is surely one of the most fundamental properties that is encoded in the mind. Yet it is striking how biases emerge even in the simplest spatial tasks we can imagine -- e.g. when an observer must merely identify the location in which a stimulus appeared. When a dot appears momentarily in a shape, for example, subsequent localization responses are biased away from the shape's primary horizontal and vertical axes (so that, for example, a dot on one of the midlines is mislocalized as having been slightly off the midline). Such spatial biases are powerful (and are clearly visible to the naked eye in aggregated response plots), but also somewhat mysterious. In particular, their underlying nature remains uncertain. Are these biases of spatial memory? Of spatial perception? Of spatial responses? We addressed these questions by looking for biases in tasks that minimize the demands of memory and perception. In a first study, observers completed a localization task, but one that minimized memory demands. Observers viewed two outlined shapes (e.g. circles) of different sizes, in different locations. A reference dot then appeared on each trial in one of the shapes. Observers' task was simply to place a response dot in the other shape, so that it was in the same relative location as the still-visible reference dot. Observers' responses were again biased away from the shape's horizontal and vertical axes. In a second study, we abandoned localization altogether. Observers completed an unrelated task, and entered their responses using a circular response wheel. The responses in this case were still biased away from the horizontal and vertical axes of the response space. Collectively these results suggest that spatial biases may be more general than has been previously supposed -- reflecting not memory, but perception or action instead.