[Frontiers in Bioscience 5, d169-193, January 1, 2000]
VISUAL SEARCH: BOTTOM-UP OR TOP-DOWN?
Gargi A. Patel and K. Sathian
Department of Neurology, Emory University School of Medicine, WMRB-6000, Atlanta, GA 30322
The aim of the experiments in this paper was to explore the relationship between top-down and bottom-up processes in visual search. Employing behavioral techniques, we first consider the possible role of the magnocellular visual pathway in visual search, and find that visual search does not necessarily depend on processing by this visual sub-system. We next use functional imaging (positron emission tomography) to explore the effect of varying top-down strategy during visual search. Our findings indicate that the neural processes underlying visual search are distributed over an extensive network of brain regions, with varying roles for different parts of the network as the dynamics of top-down vs. bottom-up influences shift. The conjunction of bottom-up processing with top-down attentional suppression of an irrelevant singleton could account for activity found in right primary visual cortex (V1). The conjunction of bottom-up processing with top-down attentional set could explain activity noted in the right superior temporal gyrus/insular cortex. The left lateral cerebellum appears to play a role in attention, either in signaling popout or in switching attention repeatedly between multiple visual attributes. Loci in left parietal cortex (parietal operculum/superior temporal gyrus, parieto-occipital fissure and precuneus) are implicated in attention-demanding search for a target shape. Returning to behavioral experiments, we find that, when multiple feature singletons compete for attention, interference between them is strongest for features closely related to the distinguishing target feature. This competition appears to be feature-level rather than object-level, and is characterized by a varying degree of specificity for different features. Task complexity modulates interference effects, even for abrupt visual onsets, which are often considered to capture attention involuntarily. 
Overall, our observations converge on the conclusion that visual search is extremely flexible and subject to considerable specificity of top-down control, although such specificity is clearly not absolute.
2.1. Bottom-up and top-down attentional modes
Beginning with the writings of William James (1) over a hundred years ago, attention has been recognized to involve a selection process. When multiple sensory stimuli or locations in space compete for attention, the brain processes certain types of stimuli more fully, at the expense of others. This is a consequence of the limited capacity of the brain to process multiple stimuli simultaneously (2). Accordingly, attention can be considered an information-processing filter. Important to our full comprehension of attention is an understanding of attentional selectivity. Why do we orient preferentially toward particular objects instead of others in the visual world? In some instances, items such as a bright light will almost instantly engage us; this exemplifies "bottom-up," "exogenous," or "stimulus-driven" attentional selectivity. At other times, such as when seeking a car in a parking lot, objects draw our attention because we search for them; this is an instance of "top-down," "endogenous," or "goal-driven" attentional selectivity involving an individual's deliberate intentions (2,3). Behavioral evidence suggests that visual attentional selection in everyday experience depends on the interaction of both components: a bottom-up, fast, mechanism that selects stimuli based on their perceptual saliency and a second, slower, top-down mechanism, that is under cognitive control (3). Both processes interact to select visual stimuli for detailed investigation.
2.2. Visual search
Visual search has been an especially valuable approach to the study of how the two types of attentional control interact. In this paradigm, observers search for a predefined target embedded in a multi-element array of non-targets. When an element differs significantly from nearby stimuli on one or more feature dimensions, and is thus perceptually salient, it is referred to as a feature singleton (4). Numerous studies have revealed a dichotomy in visual search for singleton and non-singleton targets. Treisman and colleagues (5,6) demonstrated that a singleton "pops out" of an otherwise homogeneous multi-element display when it is the target of visual search. For instance, a horizontal bar effortlessly pops out amongst vertical bars of the same color (5). Detection of a 4 amongst Cs or a C amongst 4s is similarly efficient (7). The reaction time (RT) to find a target in this kind of efficient search is essentially independent of display set size (number of elements in the display), usually increasing by no more than 10 ms/item (8). In contrast, search for a non-singleton target amongst heterogeneous non-targets does not yield popout. For example, a horizontal bar does not pop out amongst vertical, left oblique, and right oblique bars (5). Detection of a T amongst Ls is also relatively inefficient (9). Search for non-singleton targets such as these yields performance profiles with RT increasing as a function of display set size, slopes typically exceeding 50 ms/item (6).
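The slope criterion described above can be made concrete with a short sketch. The mean RTs below are hypothetical values chosen for illustration, and the function names and the "intermediate" label are assumptions introduced here; only the cut-offs of 10 ms/item and 50 ms/item come from the text.

```python
def search_slope(set_sizes, mean_rts):
    """Least-squares slope (ms/item) of mean RT against display set size."""
    n = len(set_sizes)
    mean_x = sum(set_sizes) / n
    mean_y = sum(mean_rts) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(set_sizes, mean_rts))
    sxx = sum((x - mean_x) ** 2 for x in set_sizes)
    return sxy / sxx


def classify_search(slope_ms_per_item):
    """Label a slope using the cut-offs quoted in the text."""
    if slope_ms_per_item <= 10:
        return "efficient"      # popout-like search
    if slope_ms_per_item > 50:
        return "inefficient"    # RT grows steeply with set size
    return "intermediate"       # illustrative label, not from the source


# Hypothetical mean RTs (ms) at three display set sizes
set_sizes = [4, 9, 16]
popout_rts = [520, 530, 545]
heterogeneous_rts = [600, 900, 1320]

for name, rts in [("singleton", popout_rts), ("non-singleton", heterogeneous_rts)]:
    slope = search_slope(set_sizes, rts)
    print(f"{name}: {slope:.1f} ms/item -> {classify_search(slope)}")
```

A flat RT function thus gives a slope near zero, whereas a search in which RT rises with each added item gives a slope well above the 50 ms/item mark.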
These and related findings have spawned a number of theories seeking to explain the basis of attentional guidance. Both feature integration theory (FIT) (6) and texton theory (9) share a similar conception in which a fast, preattentive, parallel process differentiates basic visual features, leading to popout of a singleton or rapid texture segregation, while a slower, attentive, serial process combines features to produce the more complex object representations required for searches in which RT increases as a function of display set size. In particular, FIT advances that only basic features such as color and orientation can be distinguished by parallel search, while serial search is required for targets defined by conjunctions of features such as a green T amongst brown Ts and green Xs (6). However, the subsequent literature reported several instances in which conjunction search was associated with little or no increase in RT as a function of display set size, suggestive of parallel search (10-14). Moreover, search for a triple conjunction (size, color, shape) target, such as a large red O amongst small red Xs, small green Os, and large green Xs, yields a negligible slope of RT vs. display set size (14). This is inconsistent with FIT, which predicts that such a search would be even more inefficient than a search for a conjunction of two features. In order to better explain such findings, the guided search model of Wolfe and colleagues elaborated on an earlier two-stage model (15) and proposed a modification of FIT in which information from a rapid, parallel process can guide a subsequent quasi-serial processing stage (13,16). The results from triple conjunction search are consistent with the guided search model, which predicts that increasing the number of conjoined features recruits a greater number of parallel processes to guide the serial stage, leading to more efficient search with shallower RT slopes (14).
Another theory by Duncan and Humphreys (17) was devised to account for the results of studies involving letter search. FIT does not clarify what elementary features of letters, such as letter size or line orientation, are encoded at the preattentive, parallel stage. For example, an L amongst Ts (the L is unique only in its within-object conjunction of horizontal and vertical line orientations) is normally difficult to find, but the L can be made to pop out by increasing letter size (17). These investigators also found that a target with a unique feature does not necessarily yield efficient search if the non-targets are heterogeneous, such as an R amongst Ps and Qs (the R contains a unique oblique line). Results of this nature led Duncan and Humphreys (17) to argue that there is no dichotomy between parallel and serial stages, but rather that a more unified mechanism exists in which visual search is always parallel, with efficiency varying along a continuum due to factors such as similarity between target and non-target items and heterogeneity of non-targets. Building on this is the "biased competition" model of Desimone and Duncan (2). In this model, when objects in a visual scene compete for attention, exogenous factors bias competition toward stimuli that differ considerably from their spatio-temporal background. Attentive, endogenous processes can then guide, and possibly override, the exogenous weighting to achieve behaviorally relevant goals. Neural studies have provided evidence supporting this model and are reviewed later in this article (Section 2.4).
2.3. Bottom-up and top-down factors in visual search
Experimental findings in the visual search literature raise an important question regarding attentional control: whether the efficient detection of a singleton in visual search, or the popout effect, is "bottom-up", depending on stimulus properties, or "top-down", depending on the observer's attentional set. In most studies of popout, singletons were themselves the target of search, inducing a goal-driven state of attentional readiness for them. Hence, popout does not necessarily instantiate complete dependence on stimulus-driven processes. In order to differentiate between the contributions of stimulus-driven and goal-driven processing, investigators have employed a visual search task for a task-relevant target singleton distinguished by one feature in the presence of a task-irrelevant distractor singleton distinguished by another feature.
A number of studies of this type have shown that an irrelevant singleton interferes with search for a target that is itself a singleton. Pashler (4) first reported this with the use of brief, masked displays in which the target was an O among tilted lines or vice versa. The accuracy of target localization on the left or right of the display was degraded by the presence of an irrelevant color singleton, more so when the target was unspecified in advance of the trial than when it was prespecified. Similarly, search for an uninformative brightness or color singleton was delayed by the presence of a distractor that was either identical to it or unique on the other dimension (8,18). Further, a highly conspicuous color singleton interfered with search for a shape singleton but not vice versa, while reducing the saliency of the color singleton reversed this effect (18,19). Theeuwes (18,19) therefore argued that attention was captured by the most salient item in the display, as measured by the shortest RT to find the item when it was the target of search in the absence of a distractor. Together, these data suggest that singletons can engage attention based on bottom-up factors. However, it appears that singletons can capture attention in this manner only during a brief temporal window: responses to a probe are faster at the location of a color distractor singleton than at the location of a shape target singleton, but only when the probe appears within 100 ms of display onset (20).
Bacon and Egeth (8) discouraged subjects from adopting a strategy of searching for a singleton in two different ways. In one experiment, they introduced several instances of the target feature so that the target shape was no longer a singleton. In another experiment, they added one or two additional unique shapes to the display on some trials, thus increasing non-target heterogeneity. In both cases, RT increased little with display set size, and an irrelevant (color or shape) singleton had no effect on RT. In displays that were completely heterogeneous with respect to shape and search for the non-singleton target was inefficient, feature singletons that coincided only rarely with the target, providing no consistent information about the target location (i.e., uninformative, irrelevant singletons), had little influence on search (21,22). However, search efficiency improved and the target popped out when it was always of a different color or brightness, i.e., an informative singleton (21,22). These studies illustrate how the task-relevance of a singleton dramatically influences visual search performance. Together, these data have led to the suggestion that an irrelevant singleton can capture attention only when the search target is itself a singleton, i.e., when subjects are in "singleton-search mode" (4,8,17-19).
One view emerging from experiments of this sort is that attentional set can be tuned for a singleton (4,8,17-19,23), leading to capture of attention by the most salient singleton in the display, but that further top-down selectivity is not possible (18,19). However, other evidence suggests that capture of attention depends not solely on stimulus distinctiveness (i.e., singletons), but on the relationship between stimulus properties and task demands. Folk and colleagues (24,25) used a spatial cuing task based on Posner's classic paradigm (26). Targets in a four-element display were feature singletons on dimensions such as color, abrupt onset, or motion. A valid precue (appearing at the target location) improved performance, and an invalid precue (appearing at a non-target location) diminished performance, but only when the cue and target shared the same feature. These findings led Folk and colleagues (24,25) to propose that there may be a basic difference between static and dynamic properties, so that during singleton-search for a static property such as color or shape, only singletons with static properties can interfere with performance. Conversely, during singleton-search for a dynamic property such as an abrupt onset or motion-defined object, only singletons with dynamic properties can interfere with performance. Contrary to this idea, however, Theeuwes (27) found that an abrupt onset distractor delayed search for a color target and vice versa, in a paradigm where items compete with one another in a near-simultaneous presentation. This issue is addressed empirically in a later section of this article (Section 5.4). Other evidence favoring top-down over bottom-up factors in controlling attentional selectivity comes from studies of texture segregation, in which the extent of interference from distractors varied directly with their similarity to targets (28).
Thus, we see that popout does not appear to reflect automatic, or strongly involuntary (3), attentional capture. It is fair to say that popout does not occur on the basis of bottom-up factors such as salience alone, but that the effect of a singleton in visual search is also affected by the cognitive strategy of the observer.
2.4. Neural Studies Relevant to Visual Search
2.4.1. Single-neuron studies
Attentional effects are, in general, stronger in higher-order visual areas (29). In the dorsal (occipito-parietal) visual stream of behaving macaque monkeys, the classic observation of enhancement of neuronal responsiveness due to spatial attention (30) has been elaborated by findings that neurons in area 7a of parietal cortex respond preferentially to stimuli outside the focus of attention (31) while neurons in a sub-region of 7a, the lateral intraparietal area (area LIP), respond only to stimuli of behavioral relevance or to abrupt visual onsets (32). Such neurons may play a role in orienting attention.
Neurophysiological studies in multiple visual cortical areas, including V1, ventral stream areas such as V2, V4 and inferotemporal cortex (IT), and dorsal stream areas such as MT and MST have adduced evidence for the biased competition model (2). These studies have demonstrated modulation of neuronal responses by attention in the presence of competing stimuli within the receptive field of the neuron under study. When attention is directed to one stimulus of a pair, the neuron's response tends toward the response evoked by that stimulus alone. This effect has been observed in V1, V2, V4 and IT based on whether or not attention is focussed on an object (33-37). It has also been found in instances where attention is directed to various object properties such as location (in V4 (38)), color or brightness (in V4 (39,40)) and motion (in MT/MST (41)). Memory-related activity in a delay period intervening between a cue and a subsequent sample stimulus occurs in IT (35) and could represent the neural substrate for the "attentional template" (2) to which a match is sought. The biasing signals for these mnemonic effects may originate in neurons of lateral prefrontal cortex (42).
Attention to a particular stimulus or spatial location has also been shown to increase the sensitivity of cells to that stimulus. For instance, in V4, attention to a stimulus increased sensitivity of the neurons to orientation and color (43). Related behavioral results in humans suggest that attention to a particular location serves to improve task performance, not by reducing noise or changing decision criteria, but rather by enhancing spatial resolution at that location (44).
2.4.2. Functional imaging and event-related potential (ERP) studies
In humans, a positron emission tomography (PET) study of a feature conjunction search (see Section 2.2) found activation of a region of superior parietal cortex (in the dorsal stream) implicated in spatially shifting attention, supporting the notion that serial attentional shifts are involved in conjunction search (45). Using functional magnetic resonance imaging (fMRI), others demonstrated activity in the parieto-occipital junction, superior parietal cortex and other areas of extrastriate visual cortex during a color/shape popout task, while conjunction search activated the same areas as well as the frontal eye fields (46). Recent fMRI studies (47,48) have obtained evidence for competitive interactions in human extrastriate cortex similar to those found in monkeys with single-neuron recordings. In the absence of directed attention, multiple stimuli were shown to interact in a mutually suppressive manner. This suppression was relieved when attention was directed to one particular stimulus. The data are in agreement with the idea that attention serves to bias competition between stimulus representations.
ERP studies have identified a peak known as the N2pc that is related to visual attention (49-51). Interestingly, the same peak was obtained during visual search for color, orientation and motion singletons, the motion singleton eliciting it even when task-irrelevant (49). The N2pc appears to reflect activity involved in filtering a target item from irrelevant non-targets and seems to be essential for conjunction but not feature search (50). A recent ERP study (51) has re-ignited the serial vs. parallel processing debate by demonstrating interhemispheric shifts of the N2pc in association with shifts of attention between more likely and less likely targets. Although this was taken as evidence favoring serial processing, the study design involved selection of potential targets based on color popout and therefore does not rule out parallel processing in completely heterogeneous displays. Using fMRI, spatial attention-related activity was found in V1 and multiple areas of extrastriate visual cortex (52). The time course of ERPs recorded in the same study, together with dipole source modeling, suggested that the attention-related signal in V1 arose not from bottom-up sources, but rather from feedback projections originating in higher-order areas. These data are consistent with the biased competition model of Desimone and Duncan (2) in which top-down signals can influence bottom-up processes.
2.4.3. Lesion studies
In monkeys, lesions of area V4 interfere with visual search, but only for non-salient items (53). Monkeys with lesions of V4 and TEO, areas whose neuronal responses have been shown to be modulated by attention, are unable to ignore salient, singleton-like stimuli, even if they are irrelevant to their behavioral goals (54). These results suggest that the lesions caused loss of a top-down attentional mechanism that normally would resolve bottom-up sensory competition by biasing representations of behaviorally relevant stimuli. Together, these data provide converging evidence for the biased competition model in which mutually suppressive, competitive sensory interactions occurring across networks of neurons are biased by attention to a given stimulus or spatial location. Relevant studies in humans are much more limited. A patient with biparietal lesions was reported to show impairments in conjunction but not feature search (55,56), and hemineglect in visual search is modulated by non-target density and discriminability after frontal but not parietal lesions (57).
2.5. Experimental Goals
The aim of the following series of experiments was to better characterize the neural basis of exogenous and endogenous attentional control in visual search, and to explore the relationship between these two modes. Employing behavioral techniques, we first consider the possible role of the magnocellular visual pathway in visual search, given evidence implicating this pathway in bottom-up processes. We next discuss functional imaging studies directed at understanding the effect of varying top-down strategy during visual search. We then return to behavioral experiments addressing interactions between feature singletons in visual search.
3. POSSIBLE ROLE OF THE MAGNOCELLULAR VISUAL SYSTEM IN STIMULUS-DRIVEN PROCESSES
3.1. Experiment 1: Singleton search vs. search in a heterogeneous display
The primate visual system is divided into two pathways, the magnocellular (M) and the parvocellular (P), whose segregation begins at the retinal ganglion cells and continues through the lateral geniculate nucleus (LGN) of the thalamus (58). Cross-talk between the pathways begins as early as layer 4B of the primary visual cortex (V1) (59), and continues to occur beyond V1 in extrastriate visual cortex (60). M system neurons are particularly sensitive to motion and flicker and are preferentially activated over P system neurons in response to stimuli of high temporal and low spatial frequencies, but are relatively insensitive to color stimuli (58). P system neurons, on the other hand, are highly sensitive to color and are preferentially activated over M system neurons by stimuli of high spatial and low temporal frequencies (58). There is a strong correlation between physiological properties of M and P system neurons and perceptual experience; damage to either M or P system neurons results in characteristic perceptual deficits. Monkeys with focal lesions of the M layers of the LGN demonstrate a compromised ability to perceive stimuli of high temporal frequency (61,62). Monkeys with lesions of the P layers of the LGN, on the other hand, demonstrate compromised chromatic vision, acuity, and contrast detection at low temporal and high spatial frequencies (63).
Consider the types of items that engage our attention in everyday experience: rapidly moving or flashing objects, large items, and brightly-colored objects. We generally do not orient as rapidly to slower, smaller and dimmer items. With the exception of color, the sensitivity of M, but not P, system neurons matches quite well with the kind of stimuli that ordinarily capture our attention. Color selectivity is a property of the P system, but note that colored objects in our world are also of varying brightness or luminance. It may well be the luminance property of a brightly colored object that is responsible for the capture of attention, rather than its color. Breitmeyer and Ganz (64) speculated that any stimulus, such as motion, that strongly activates the transient visual channels (i.e., M system) would capture attention. The ability to respond preferentially to stimuli which capture attention has a high survival value and thus, evolutionary implications; in fact, the M system appears to be both phylogenetically and ontogenetically older than the P system (58). It is reasonable to conjecture then that the M system may mediate automatic attentional capture. This conjecture would seem to be borne out by the finding that a visual element that is abruptly onset after a brief delay, relative to other elements, preferentially captures attention in a multi-element display (65). More recently, though, it has been argued that the key factor in attentional capture by abrupt visual onsets is not the visual transient, but, rather, the appearance of a new perceptual object in the visual scene (3). However, the role of the M system in the bottom-up processes of visual search has not been explored fully. We targeted this issue by exploiting the differential sensitivity of the M and P systems.
To assess the extent of involvement of the M system in attentional selectivity, we evaluated visual search performance in a condition where the M system was relatively disabled as compared to a control condition where both the M and P systems were comparably active. For stimuli that are presented under isoluminant conditions, luminance information is unavailable to the visual system, and the stimulus is defined solely by chromatic cues (58). As the M system is relatively color-blind, it is unable to detect these color cues and is hence impaired; thus, the P system dominates perception. In Experiment 1, we evaluated the contribution of the M system to bottom-up processes in visual search under control and M-impaired conditions (isoluminance). As reviewed earlier (Section 2.2), search for a singleton target is characterized by RT performance which is independent of the display set size. For instance, a horizontal bar effortlessly pops out amongst vertical bars of the same color (5). In such a task, bottom-up factors play a significant role although they do interact with top-down strategy. On the other hand, search for a non-singleton target, which is dominated by top-down factors, results in RT performance that increases significantly with display set size. For example, a horizontal bar does not pop out amongst vertical, left oblique, and right oblique bars (5).
Based on the literature, we chose two kinds of displays (Figure 1). Each had a vertical target bar; in one, amongst horizontal non-target bars (orientation contrast display, OCD, Figure 1a), and in the other, amongst randomly oriented bars (random orientation display, ROD, Figure 1b). We expected the OCD but not the ROD to lead to orientation popout. If the M system mediates the bottom-up processes involved in popout, then isoluminance should abolish popout in the OCD but should not affect performance in the ROD.
Figure 1. Visual displays used in Experiment 1, consisting of green bars on a red background. The vertical target (dotted box, used only for illustration and not actually present in the displays) is amongst horizontal non-targets in the OCD (a) and randomly oriented non-targets in the ROD (b).
In this and all subsequent experiments, undergraduate or graduate students between the ages of 18 and 32 years from Emory University and the Georgia Institute of Technology were studied after obtaining informed consent. Subjects had normal or corrected-to-normal visual acuity and normal color vision, as assessed using Ishihara plates. None of the subjects had any history of neurological injury or disease. All subjects were naïve to the task under study. Subjects participated in partial fulfillment of a psychology course requirement or for financial compensation. Institutional Human Investigations Committees approved the procedures.
Displays were presented on a Macintosh IIfx computer with a 13-inch color monitor, with display presentation and data acquisition being controlled by the Superlab package. Subjects (n=16) searched rectangular arrays (centered on fixation) of OCDs and RODs of 4, 9, or 16 green bars (0.25° x 0.14° visual angle) on a red background (brightness 90% of maximum). A fixation cross was presented for 750 ms prior to each display presentation and subjects were asked to maintain fixation at this location during display presentation. Subjects indicated target bar presence (50% of trials) or absence by pressing one of two keys. Each display was presented under two conditions: a control condition where the green bars were substantially brighter than background (brightness 95% of maximum) and isoluminance, determined as outlined below. Trials were presented in blocks within which the kind of display (OCD/ROD) was constant but display set size, luminance condition and target presence/absence were randomized. The blocks were interleaved in randomized order.
Each subject's perceptual red-green isoluminance point was individually determined by presenting green bars in an X-shaped pattern on the red background described above. The luminance of the bars was varied sequentially from 95% to 10% of maximum and back again, in increments of 5%. The subject was asked to compare the relative brightness of the stimuli and background and to judge the point at which the stimuli appeared to be of equal brightness to the background. Although this is subjective, subjects were able to make a confident decision over one or two sequences, often aided by the unique and well-known characteristics of perception at isoluminance, e.g., blurring of borders, a peculiar shimmering quality and haziness or instability of the images (58). In another study (66), we found that the mean of the distribution of subjective isoluminant points (occurring at a brightness of 35% of maximum) corresponded closely with the isoluminant point measured photometrically.
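The luminance sequence used in this procedure can be sketched as follows. This is a minimal illustration, not the presentation software actually used: the function name is hypothetical, and we assume that the lowest level (10%) is shown once at the turning point rather than repeated.

```python
def luminance_sweep(start=95, stop=10, step=5):
    """Bar luminance levels (% of maximum): down from start to stop, then back up.

    Assumption: the turning point (stop) is presented once, not twice.
    """
    down = list(range(start, stop - 1, -step))  # 95, 90, ..., 10
    up = down[-2::-1]                           # 15, 20, ..., 95
    return down + up


sweep = luminance_sweep()
print(len(sweep), sweep[:3], sweep[-3:])
```

Each full sequence passes through every 5% level twice (apart from the turning point), so a subject's judgment can be checked on both the descending and the ascending pass.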
In this and all subsequent experiments, subjects were instructed to respond as quickly and accurately as possible and RT was measured. In every experiment, a minimum of 20 trials were presented to each subject for each condition. Statistical analysis of RT data was performed after eliminating trials with erroneous responses and those in which the RT exceeded 2.5 SD of the mean for a given condition and subject. The number of trials thus rejected was usually small, so that the data set for statistical analysis typically included at least 15 trials per condition per subject. For all statistical analyses in this paper, alpha was 0.05. In the present experiment, display set size and luminance condition were constant within a block and blocks were interleaved in randomized order. Target presence/absence was randomized within each block. Statistical analysis for the present experiment used SPSS (for Macintosh).
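The trial-rejection rule can be illustrated with a short sketch. Several details are assumptions made here for illustration, since the text does not specify them: the criterion is treated as two-sided (more than 2.5 SD from the mean in either direction), the mean and sample SD are computed from correct trials only, and trimming is applied in a single pass. The function name and the trial data are hypothetical.

```python
from statistics import mean, stdev


def trim_trials(trials, criterion_sd=2.5):
    """Return the RTs retained for analysis for one subject and condition.

    trials: list of (rt_ms, correct) tuples.
    Assumptions: two-sided criterion, sample SD, single pass,
    statistics computed on correct trials only.
    """
    correct_rts = [rt for rt, ok in trials if ok]
    if len(correct_rts) < 2:
        return correct_rts  # SD is undefined for fewer than two values
    m = mean(correct_rts)
    sd = stdev(correct_rts)
    return [rt for rt in correct_rts if abs(rt - m) <= criterion_sd * sd]


# Hypothetical block of 20 trials: 18 typical, one very slow, one error
typical = [480, 490, 500, 510, 520, 495, 505, 485, 515, 500,
           490, 510, 500, 505, 495, 500, 510, 490]
trials = [(rt, True) for rt in typical] + [(2000, True), (450, False)]

kept = trim_trials(trials)
print(len(kept))
```

With these hypothetical values, the error trial and the single very slow response are dropped, leaving 18 of the 20 trials for analysis.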
3.1.4. Results and Discussion
For this and all subsequent experiments involving target detection, we restrict ourselves to presentation and discussion of results obtained with target-present displays, since the implications of findings with target-absent displays are unclear. Figure 2 illustrates that RT was longer at isoluminance than in the control condition for both the OCD and the ROD, but isoluminance did not seem to alter the nature of the relationship between RT and display set size in either case. ANOVA of the RT data was performed in a within-subject design, using display set size and luminance condition as factors. For both the OCD and ROD, there was a significant effect of display set size (OCD: F(2,30) = 8.93, p = 0.001; ROD: F(2,24) = 54.97, p < 0.001), luminance condition (OCD: F(1,15) = 67.17, p < 0.001; ROD: F(1,12) = 28.14, p < 0.001) and a significant interaction (OCD: F(2,30) = 4.92, p = 0.01; ROD: F(2,24) = 8.77, p = 0.001) between these two factors. Although the effect of display set size was significant in the OCD, the linear regression slopes of RT vs. display set size in both control (OCD: 1 ms/item; ROD: 42 ms/item) and isoluminant (OCD: 3 ms/item; ROD: 61 ms/item) conditions were minimal and non-significant for the OCD but much steeper and significant for the ROD. This indicates that, regardless of whether the displays were isoluminant or not, popout occurred in the OCD but not the ROD. The small increase in RT at isoluminance, relative to control, in both the OCD and ROD suggests a non-specific effect due to increased task difficulty or diminished target salience, rather than reliance of popout on the M system. The results of Experiment 1 thus argue against specific dependence on the M system of the bottom-up processes involved in popout.
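The degrees of freedom reported above can be checked against the usual formula for a fully within-subject design, in which the error df for an effect equals the effect df multiplied by (number of subjects - 1). The sketch below assumes a standard univariate repeated-measures ANOVA without sphericity correction; the function name is hypothetical.

```python
def within_subject_df(n_subjects, *factor_levels):
    """(effect df, error df) for an effect in a fully within-subject ANOVA.

    Pass one level count for a main effect, or several for an interaction.
    Assumes a univariate repeated-measures ANOVA with no sphericity correction.
    """
    effect_df = 1
    for levels in factor_levels:
        effect_df *= levels - 1
    return effect_df, effect_df * (n_subjects - 1)


# 16 subjects, 3 display set sizes, 2 luminance conditions
print(within_subject_df(16, 3))     # display set size
print(within_subject_df(16, 2))     # luminance condition
print(within_subject_df(16, 3, 2))  # interaction
```

With 16 subjects this reproduces the OCD values of F(2,30) and F(1,15). Under the same formula, the ROD values of F(2,24) and F(1,12) would correspond to 13 subjects; the text does not state the reason for the difference, so this is offered only as an arithmetic observation.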
Figure 2. Experiment 1, mean RTs to target detection in the OCD (a) and ROD (b). Ctrl: control; iso: isoluminant.
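The search slopes quoted for Experiment 1 are ordinary least-squares slopes of RT against display set size. The sketch below shows one way to compute such a slope. The function name `search_slope`, the set sizes and the RT values are hypothetical; the RTs were chosen only so that the result matches the 42 ms/item reported for the ROD control condition. The actual set sizes and mean RTs of Experiment 1 are not reproduced here.

```python
def search_slope(set_sizes, mean_rts):
    """Least-squares slope (ms/item) and intercept (ms) of RT vs. set size."""
    n = len(set_sizes)
    mean_x = sum(set_sizes) / n
    mean_y = sum(mean_rts) / n
    sxx = sum((x - mean_x) ** 2 for x in set_sizes)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(set_sizes, mean_rts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical set sizes and mean RTs (ms), for illustration only.
slope, intercept = search_slope([4, 8, 12], [600.0, 768.0, 936.0])
print(slope, intercept)
```

A slope near zero is the conventional signature of popout, whereas a slope of tens of ms/item indicates inefficient search.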
3.2. Experiment 2: Multiple top-down strategies--control experiment
In Experiments 2-5, we used a different visual search paradigm as a basis for exploring the role of the M system. This paradigm illustrates the influence of varying top-down search strategy on the bottom-up, attention-grabbing effect of a singleton. The paradigm used heterogeneous displays involving a difficult shape search. As reviewed earlier (Section 2.3), addition of a color singleton to such a display fails to draw attention if the singleton is rarely a target. The target in such cases does not pop out even on the rare trials in which the (task-irrelevant) singleton coincides with it (21,22,67). Here, the singleton is uninformative about target location. Its bottom-up effect is suppressed because it does not aid performance. In contrast, the target pops out from a heterogeneous display when it consistently differs from non-targets in a property such as color or brightness (21). In this case the color or brightness singleton is informative about target location and improves search efficiency, with popout resulting when the bottom-up effect of the singleton matches a top-down set for it. Thus, the effect of a visually compelling singleton varies depending on its relevance and the specific search strategy adopted.
Adapting the paradigm of Folk and Annett (21), we employed a difficult visual search in a display of heterogeneous shapes, varying the presence and task-relevance of a color singleton across several conditions. Displays, in this and all subsequent experiments, were presented on a high-resolution (1024 x 512 pixels) 21" graphics monitor with a 120 Hz frame rate, controlled by a 133 MHz Pentium PC running VisionWorks software (Vision Research Graphics, Inc., Durham, NH) customized for the series of experiments reported in this paper. Statistical analysis in this and all subsequent experiments was performed using the statistical tools in Microsoft Excel for linear regressions and t tests (two-tailed), and SAS for ANOVA and Scheffe tests.
In Experiments 2-5, subjects (n=9) searched for a target bar amongst displays consisting of 6 or 14 gray shapes (R=G=B=70; values percent of maximum for each gun), arranged around a fixation cross in a ring at 5° eccentricity, on a red background (R=90; G=B=0). The presence and relevance of a green singleton (R=B=0; G=90) were varied across 4 conditions, as described below. The heterogeneous non-target shapes, a circle and various polygons, were of comparable area to a target bar that measured 1° x 1.125° and was present on all trials, oriented at 5° to the right or left of vertical. Subjects indicated target orientation by pressing one of two keys. Subjects were trained to maintain gaze on the fixation cross in order to minimize eye movements. In each experiment, the sequence of conditions was counterbalanced across subjects. Trials occurred in blocks with target orientation being randomized within a block. Display set size and condition were constant within a block and blocks were interleaved in randomized order. All subjects ran in the four experiments in the same order: Experiment 5, Experiment 2, Experiment 3, Experiment 4.
Experiment 2 was run to verify expectations from the literature (21,22), and to provide a baseline for assessing the effect of M system manipulation. We expected that search for the gray target bar amongst gray heterogeneous shapes would be difficult in the absence of a singleton (Absent; condition A, Figure 3a), i.e., RT would increase substantially with increasing display set size. In this situation, there is little influence of bottom-up factors. Such a search can be simplified if the target is green, i.e., when the target shape is also always a color singleton (Figure 3b). This target-singleton coincidence should cause the target bar to pop out, facilitating search in the heterogeneous display so as to yield a low slope for RT as a function of display set size. In this Popout condition (condition P), the color singleton is task-relevant because it is reliably informative about the target location. Thus, bottom-up factors interact with the top-down set (in this case for the color singleton) to speed up search. If the green item is always a non-target (Figure 3c), visual search performance for the gray target bar remains unaffected. In this condition (condition N), when the green singleton Never coincides with the target shape, and therefore is not informative, it is unlikely to be actively sought; in fact, its bottom-up effects must be suppressed so as not to interfere with task execution, which should be dominated by the top-down set. Similar performance would be expected in the Rarely condition (condition R) when the singleton coincides with the target on a small fraction of the trials. In conditions R and N, as in condition A, RT should increase as a function of display set size. Prior to running in each experimental condition, subjects were informed of the relevance of the singleton to target detection. Consequently, they could use this information to maximize task efficiency.
Figure 3. Visual displays used in Experiments 2-6, consisting of heterogeneous gray shapes on a red background; target bar (dotted box) was oriented 5° to the left or right of the vertical. A green color singleton is absent (a), coincides with the target bar (b) or coincides with a non-target shape (c).
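The ring arrangement of the display items can be illustrated with a short calculation. The sketch below places items at equal angular steps on a ring of 5° eccentricity. Equal spacing, the starting angle and the function name `ring_positions` are assumptions made for illustration; the text specifies only that the shapes were arranged in a ring around the fixation cross.

```python
import math

def ring_positions(n_items, eccentricity_deg, start_angle_deg=0.0):
    """(x, y) offsets from fixation, in degrees of visual angle.

    Assumption: items are evenly spaced around the ring.
    """
    step = 360.0 / n_items
    positions = []
    for i in range(n_items):
        theta = math.radians(start_angle_deg + i * step)
        positions.append((eccentricity_deg * math.cos(theta),
                          eccentricity_deg * math.sin(theta)))
    return positions

small_display = ring_positions(6, 5.0)   # display set size 6
large_display = ring_positions(14, 5.0)  # display set size 14
```

Every position lies at the same distance from fixation, so items differ in direction but not in eccentricity.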
3.2.3. Results and Discussion
Figure 4 shows that, consistent with expectations based on the literature (21,22), the slope relating RT to display set size was relatively low in condition P, compared to conditions A, R and N. This reflects target popout only in condition P, when the target was coincident with the singleton, but not in conditions A, R, and N, when the singleton was either absent or coincident with a non-target. These conclusions were corroborated by statistical analysis. Linear regression slopes were relatively low and non-significant in condition P (17 ms/item) and substantially higher and significant in the remaining three conditions (A: 69 ms/item; R: 76 ms/item; N: 70 ms/item). Bonferroni-corrected paired t tests (Table 1) confirmed that RT in the P condition was significantly different from the A, R, and N conditions, which did not differ from one another. Thus, we verified that in our particular paradigm, the informative singleton led to popout in condition P, while the uninformative singleton had no effect on RT in the heterogeneous display in conditions A, R, and N, as expected based on earlier work (21,22).
Figure 4. Experiment 2, mean RTs for conditions A (singleton absent), P (singleton coincides with target and causes it to pop out), R (singleton rarely coincides with target), and N (singleton never coincides with target).
Table 1. Experiments 2 - 6, p-values of paired t tests.
Asterisks indicate significant p values after Bonferroni correction for 12 comparisons (Experiments 2 - 5 and Experiment 6: Behavioral) and for 6 comparisons (Experiment 6: Imaging). dss = display set size.
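The Bonferroni criterion behind the asterisks can be illustrated as follows. With alpha = 0.05 and 12 comparisons, each paired t test is judged against 0.05/12, or about 0.0042; with 6 comparisons the threshold is 0.05/6, or about 0.0083. The sketch is illustrative only. The function name `bonferroni_flags` and the p values are hypothetical and are not taken from Table 1, and the use of a strict inequality is an assumption.

```python
def bonferroni_flags(p_values, n_comparisons, alpha=0.05):
    """Return the per-test threshold and a significance flag per p value."""
    threshold = alpha / n_comparisons
    return threshold, [p < threshold for p in p_values]

# Hypothetical p values, for illustration only.
threshold_12, flags_12 = bonferroni_flags([0.0005, 0.003, 0.02, 0.4], 12)
threshold_6, flags_6 = bonferroni_flags([0.005, 0.03], 6)
print(threshold_12, flags_12)
print(threshold_6, flags_6)
```

Note that a p value of 0.02, nominally below 0.05, is not flagged once the correction for 12 comparisons is applied.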
3.3. Experiment 3: Multiple top-down strategies--isoluminance
In Experiments 3-5, we manipulated stimulus conditions to assess involvement of the M system in stimulus-driven processes. In Experiment 3, all four search conditions of Experiment 2 were presented at isoluminance, a condition in which the M system is relatively disabled. We reasoned that if the M system is important in stimulus-driven processes, then at isoluminance the target in condition P should not pop out even if it coincides with a salient green singleton. Search for the target bar should be difficult, resulting in a performance profile where RT increases with display set size. Performance in the remaining three conditions was not expected to show any differences from that observed in Experiment 2, as bottom-up processes are task-irrelevant in these conditions. Thus, the approach to disabling the M system was the same in this experiment as in Experiment 1 although the paradigm was different. We determined each subject's isoluminant point using the method employed in Experiment 1 except that a ring of bars (at 5° eccentricity) was substituted for the X-shaped pattern. (For most subjects, isoluminance was achieved at R=90, G=B=0 (red); R=B=0, G=35 (green); and R=G=B=25 (gray), comparable to Experiment 1 on a different computer system).
3.3.2. Results and Discussion
This experiment revealed a pattern of results very similar to that in Experiment 2. The slope of RT vs. display set size was quite shallow and non-significant in condition P (10 ms/item) but considerably steeper and significant in the remaining conditions (A: 76 ms/item; R: 107 ms/item; N: 76 ms/item) (Figure 5), reflecting target popout only in condition P, when the target was coincident with the singleton. Bonferroni-corrected paired comparisons (Table 1) showed that RT in the P condition was significantly different from the A, R, and N conditions (except for P vs. A at the smaller display set size), and that conditions A, R and N did not differ from one another. Hence, isoluminance did not interfere with popout search efficiency in condition P. Thus, the results of Experiment 3 confirmed those of Experiment 1, pointing against a selective role of the M system in exogenous processes in visual search.
Figure 5. Experiment 3, mean RTs. Details as for Figure 4.
3.4. Experiment 4: Multiple top-down strategies--background flicker
Whether isoluminant stimuli truly disable the M system has been debated, with arguments on both sides (68-70). It could therefore be argued that the results of Experiments 1 and 3 reflect simply a failure to effectively inactivate the M system at isoluminance. Hence, we sought an alternative method of impairing the M system. One method that has been employed exploits the sensitivity of the M system to stimuli of high temporal frequency. When the display surround is flickered at 12 Hz, the M system is preferentially activated by this flicker, rendering it incapable of responding to foreground stimuli (71). Under such conditions, abolition of popout would indicate a significant role for the M system in mediating popout.
Experimental conditions were as in Experiment 2, except that the red background was flickered sinusoidally at 12 Hz with a 70% modulation depth (71). Modulation depth was taken as (L1-L2)/(L1+L2) where L1 and L2 are the luminance values (of the red gun) at the peaks of the sinusoid (R=16,90, G=B=0). We predicted that if the M system is necessary in stimulus-driven processes, then the target in condition P would not pop out even if it coincides with a singleton because the M system is engaged by the flicker in the background. This would result in a performance profile where RT increases with display set size. Again, no performance changes were expected in the remaining three conditions.
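As a check on the arithmetic, the modulation depth formula given above can be applied to the stated peak values of the red gun (R=16 and R=90). The sketch follows the text in using gun values directly as the luminance values L1 and L2; whether gun value is linear in photometric luminance on the monitor used is not addressed here. The function name `modulation_depth` is hypothetical.

```python
def modulation_depth(l1, l2):
    """Modulation depth (L1 - L2) / (L1 + L2), with L1 the higher peak."""
    return (l1 - l2) / (l1 + l2)

# Peak red-gun values of the flickering background (percent of maximum).
background_depth = modulation_depth(90, 16)
print(round(background_depth, 3))
```

The result, 74/106, is approximately 0.70, in agreement with the stated 70% modulation depth.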
3.4.3. Results and Discussion
Figure 6 shows that this experiment essentially replicated the pattern of results found in Experiments 2 and 3. The search slope was shallow and non-significant in condition P (5 ms/item) but much steeper and significant in conditions A (45 ms/item), R (76 ms/item) and N (62 ms/item), i.e., target-singleton coincidence enabled popout in condition P while there was no popout in the remaining conditions. Paired comparisons (Table 1) confirmed that RT in the P condition was significantly different from the A, R, and N conditions, which did not differ from one another. Thus, background flicker did not interfere with search efficiency in condition P. The results of Experiment 4 therefore converged with those of Experiments 1 and 3 to negate a selective role of the M system in exogenous processes in visual search.
Figure 6. Experiment 4, mean RTs. Details as for Figure 4.
3.5. Experiment 5: Multiple top-down strategies--flickering singleton
Evidence from lesions in monkeys suggests that objects flickering at a high frequency (25 Hz) and low luminance contrast (22%) can be processed only by the M system (61). Introspection suggests that flicker confers powerful attention-grabbing capability on objects, which is consistent with the ability of abrupt visual onsets to capture attention (65), although the new perceptual object account (3) implies that visual transients such as onsets and flicker are not necessarily attention-grabbing. In order to test potential involvement of the M system in the bottom-up processes involved in visual search using a different approach from that in the preceding experiments, we employed a flickering singleton instead of a color singleton. Flicker characteristics were chosen so that the stimulus would be processed largely by the M system. Under such conditions, we expected that condition P in which the flickering singleton always marked the target location would yield popout as in previous experiments. The key question is whether the flickering singleton would automatically draw attention when it was not the target, in conditions R and N. If it did, then this would implicate the M system in bottom-up processing that could impact upon visual search. A negative answer, on the other hand, would further strengthen the conclusion of the preceding experiments, and rule out a specific role of the M system in visual search.
The paradigm of Experiment 2 was used, except that a flickering gray singleton was substituted for the color singleton. The flicker was sinusoidal at 25 Hz with a modulation depth of 24% (RGB values at peaks of sinusoid were R=G=B=55,90).
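The flicker parameters can be related to the 120 Hz frame rate of the monitor with a short sketch. It samples a sinusoid that swings between the two stated peak values (55 and 90) once per video frame. This is only an illustration of the stimulus arithmetic: how VisionWorks actually generated the flicker is not described in the text, and the function name `flicker_frames` and the sampling scheme are assumptions.

```python
import math

def flicker_frames(low, high, freq_hz, frame_rate_hz, n_frames):
    """Per-frame gun values of a sinusoid oscillating between low and high."""
    mean_level = (high + low) / 2.0
    amplitude = (high - low) / 2.0
    return [mean_level
            + amplitude * math.sin(2.0 * math.pi * freq_hz * k / frame_rate_hz)
            for k in range(n_frames)]

# 25 Hz flicker sampled at 120 Hz: 24 frames span exactly 5 cycles.
frames = flicker_frames(55, 90, 25, 120, 24)
depth = (max(frames) - min(frames)) / (max(frames) + min(frames))
print(round(depth, 3))
```

At 25 Hz and 120 Hz each cycle spans 4.8 frames, so the sampled waveform repeats only every 24 frames. The depth computed from the peaks, 35/145, is approximately 0.24, matching the stated 24% modulation.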
3.5.3. Results and Discussion
Once again, this experiment revealed the same pattern as in the preceding three. Search, as expected, was efficient in condition P and relatively inefficient in the other three conditions (Figure 7). This indicates that the target popped out only in condition P, when it was coincident with the singleton. As in the preceding three experiments, regression slopes were small and non-significant in condition P (2 ms/item) but large and significant in the other three conditions (A: 65 ms/item; R: 69 ms/item; N: 88 ms/item). RT in the P condition was significantly different from that in the A, R, and N conditions, none of which differed from one another (Table 1). Thus, the flickering singleton failed to involuntarily capture attention in conditions N and R. Hence, Experiment 5 also demonstrated no evidence for a preferential role of the M system in subserving bottom-up processes in visual search. The results of this experiment fit with the account that it is not visual transients per se, but rather the creation of a new perceptual object, that is responsible for attentional capture by abrupt visual onsets (3).
3.6. Summary of Experiments 1-5
In summary, we were unable to demonstrate interference with search efficiency at isoluminance, an M-impaired condition, using two different search paradigms (in Experiments 1 and 3). Experiment 4 confirmed this finding using another method of impairing the M system, background flicker. Finally, in Experiment 5, we were unable to demonstrate that a flickering object, a stimulus thought to engage the M system, could automatically capture attention regardless of cognitive strategy. Together, the data argue against the hypothesis of a selective role of the M system in exogenous processes in visual search. Two possible conclusions emerge from these results. One is that stimulus-driven processes cannot be attributed to a specific neural subsystem such as the M pathway, but may be more complex and distributed. Another possibility is that the role of bottom-up processes in visual search is quite limited. We favor the latter possibility, based on recent work from our laboratory that converges with evidence from others' investigations (Section 5).
4. VISUAL SEARCH: FUNCTIONAL IMAGING STUDIES
4.2. Experiment 6: Multiple top-down strategies--functional imaging
In the preceding experiments, we used behavioral means to investigate the involvement of a particular neural sub-system, the M system, in the bottom-up processes of visual search. In this experiment, we studied visual search using functional neuro-imaging with PET to measure changes in regional cerebral blood flow (rCBF). Our purpose was to localize brain regions affected by manipulations of the balance between bottom-up effects and top-down attentional set in visual search, while minimizing physical differences in stimulus conditions. We used the same paradigm as in Experiment 2, in which conditions A, P, N, and R share similar stimulus properties but singleton presence and relevance is varied.
Subjects (n=6) in this study were all males and strongly right-handed according to the 14 items of the Edinburgh handedness inventory with the highest validity (72). Subject selection was restricted to right-handed males in order to remove sources of variance due to gender and handedness. In addition to institutional Human Investigations Committees, the Radiation Safety Committee of Emory University approved the procedures in this experiment.
In a separate session prior to PET scanning, subjects ran in the four visual search conditions of Experiment 2 to verify that the expected performance patterns were obtained. All subjects were presented the same condition sequence (A, N, P, R); the smaller display set size of a condition preceded the larger. Other methodological details were as in Experiment 2.
PET scanning and image analysis
Our methods for PET scanning and image analysis have been described previously (73). PET scanning was performed in two-dimensional mode. Thirty-one contiguous planes were acquired, covering a 105 mm field of view, with nominal isotropic resolution of 5 mm at full-width half-maximum (FWHM). Subjects lay supine in the scanner with the head restrained by an individually fitted thermo-plastic mold. Task performance began 30 sec prior to a bolus intravenous injection of 35 mCi H₂¹⁵O, with scan acquisition beginning 10 sec after injection. Stimulation continued for a total period of 150 sec. Subjects performed the same four tasks during scanning as they did in the behavioral session, except that display set size was always 6 in the scanning session. Behavioral data were acquired during scanning for comparison with data from the pre-scanning session to ensure that performance was comparable in both sessions. A single visual search task was performed during each scan. During a 3-hour session of 12 scans, three repetitions of each of the four tasks were presented in a pseudo-random order, 10 minutes apart. About 180 trials were performed during each scan period for all tasks except the popout task, for which approximately 200 trials occurred during scanning.
PET images were reconstructed using calculated attenuation correction (74). After within-subject alignment of PET scans using an automated registration algorithm (75), PET images were mapped into Talairach coordinate space to allow between-subject averaging (76). Images were smoothed with a 3-dimensional Gaussian filter to a final isotropic resolution of 14.8 mm FWHM and then normalized for changes in global blood flow. Linear contrast analyses based on repeated-measures ANOVA (77) were used to produce t-statistic images (at an uncorrected threshold of p < 0.005) of rCBF differences between particular task-pairs. Activations thus identified were corrected for multiple comparisons within the entire volume of gray matter, using an algorithm that takes into account the size of the activation and the degree of image smoothness (78). To aid visualization in relation to cortical anatomy, the activations were superimposed on an "average" magnetic resonance image derived from a separate population of 18 subjects.
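The resolution figures can be related to the width of a Gaussian kernel with a short calculation. For a Gaussian, FWHM = 2·sqrt(2·ln 2)·sigma, and the FWHM values of two cascaded Gaussian blurs add in quadrature. The sketch below applies these standard relations to the stated numbers (5 mm nominal, 14.8 mm final). It assumes that the scanner point-spread function is approximately Gaussian; the filter width actually used is not stated in the text, so the computed value is only an estimate. The function names are hypothetical.

```python
import math

# Conversion factor between FWHM and standard deviation of a Gaussian.
FWHM_PER_SIGMA = 2.0 * math.sqrt(2.0 * math.log(2.0))

def fwhm_to_sigma(fwhm_mm):
    """Standard deviation (mm) of a Gaussian with the given FWHM."""
    return fwhm_mm / FWHM_PER_SIGMA

def filter_fwhm_for_target(native_fwhm_mm, target_fwhm_mm):
    """Filter FWHM needed to reach the target resolution.

    Assumption: native blur and filter are both Gaussian, so their
    FWHM values combine in quadrature.
    """
    return math.sqrt(target_fwhm_mm ** 2 - native_fwhm_mm ** 2)

filter_fwhm = filter_fwhm_for_target(5.0, 14.8)
final_sigma = fwhm_to_sigma(14.8)
print(round(filter_fwhm, 2), round(final_sigma, 2))
```

Under these assumptions, a filter of roughly 13.9 mm FWHM would bring 5 mm images to 14.8 mm, corresponding to a final Gaussian sigma of about 6.3 mm.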
4.2.4. Results and Discussion
4.2.4.1. Behavioral results
As in Experiments 2-5, RT was independent of display set size in condition P (4 ms/item, nonsignificant) but increased with display set size in the other three conditions (A: 69 ms/item; R: 78 ms/item; N: 63 ms/item; all significant) (Figure 8a). Paired comparisons (Table 1) showed that condition P was significantly different from the other three conditions, which did not differ from one another. Thus, as in the preceding experiments, the informative singleton led to popout, while the uninformative singleton had no effect on RT in the heterogeneous display.
Figure 8. Experiment 6, mean RTs in behavioral session (a) and imaging session (b). Details as for Figure 4.
4.2.4.2. Behavioral results during PET imaging
Figure 8b reveals that performance during scanning was similar to that in the preceding behavioral session, RT being shortest in condition P (consistent with popout) and comparably long in the other three conditions (A, R, N). Statistical analysis confirmed that the data replicated the findings obtained in the behavioral session (Table 1).
After the PET imaging session, subjects were debriefed regarding their strategies in the four conditions. In condition A, where the singleton was absent, subjects reported actively searching the display for the target bar. In condition P, subjects sought the salient color singleton that always coincided with the target, to facilitate the task, as expected. In condition N, subjects reported that they tried very hard to ignore the compelling color singleton, which they knew would never coincide with the target. In these three conditions, all subjects used a constant strategy throughout a given condition. Only in condition R did subjects' approach vary: because target-singleton coincidence was variable, subjects did not know on a given trial if the singleton necessarily indicated the target location. Hence, on most trials, subjects attempted to ignore the singleton, understanding that the singleton would usually coincide with a non-target. However, on some trials, they sought out the color singleton, on the chance that it would coincide with the target. This variation in strategy limits interpretation of differences in activity between condition R and the other conditions.
4.2.4.3. PET imaging results
The locations and magnitudes of significant changes in rCBF are tabulated in Table 2. Because of the complexity of the tasks and of interpretation of the results, we consider our conclusions tentative and to represent hypotheses that merit testing in future work.
Table 2. Experiment 6, subtractions yielding significant activations (after correction for multiple comparisons, see text).
Never - Absent
Compared to condition A (singleton absent), a significant rCBF increase in condition N (singleton was never the target) was found in right V1 (Figure 9a). Values of rCBF at this location were greater in conditions containing a singleton (P, R, N), relative to condition A which did not contain a singleton. This suggests that the V1 activation could reflect preattentive sensory processing of the color singleton. Another contributing factor emerges from consideration of the rCBF patterns (Figure 9b).
Figure 9. Experiment 6, Never - Absent. a: Activation of right primary visual cortex. b: Bar graph showing mean rCBF (ml/100 g/min) in each condition.
Amongst conditions containing the singleton, rCBF decreased as the probability of target-singleton coincidence increased (N > R > P). This decreasing rCBF pattern could reflect a decreasing tendency for top-down inhibitory activity to suppress the neural response to the singleton. The target and singleton never spatially coincided in condition N and subjects ignored the uninformative color singleton. At the other extreme, in condition P, because the target and singleton always coincided, subjects actively sought out the informative singleton. The situation in condition R was intermediate, since the singleton occasionally coincided with the target and subjects did search for the singleton on some trials. As the color singleton was presented with equal probability in all parts of the symmetrical visual display, the right hemispheric activity is not attributable to singleton location in the visual field. Rather, it is consistent with the involvement of attentional processes in top-down inhibition, given the well-known dominance of the right hemisphere in attentional control (79-82). Thus, the right V1 focus could represent both synaptic activity triggered bottom-up by the singleton (accounting for rCBF being greater in condition P than in condition A) and that due to top-down attentional suppression of the response to the singleton (explaining the lateralization and the graded decrease in rCBF with increasing probability of target-singleton coincidence).
Popout - Absent
Relative to condition A where the singleton was absent, condition P, in which the singleton always coincided with the target, demonstrated a significant activation in the right superior temporal gyrus/insula (Figure 10a). Figure 10b illustrates that the rCBF at this location was greater in conditions containing a singleton (P, R, N) relative to the singleton-absent condition (A), implying that the activation could be due to preattentive processing of the color singleton. Further, amongst conditions containing the singleton, rCBF was maximal when search was for the color singleton and decreased with decreasing probability of target-singleton coincidence (P > R > N). This pattern, which is the opposite of that noted for the V1 activation on the N-A subtraction, suggests that activity in this cortical region may reflect not only bottom-up, preattentive processing of the color singleton, but also its match to the top-down search strategy. It would be interesting to know whether the activation is specific to search for and detection of a color singleton, or is more general for any singleton, but this must await further investigation.
Figure 10. Experiment 6, Popout - Absent. a: Activation in right superior temporal gyrus/insula. b: Bar graph as in Figure 9b.
Popout - Never
Subtraction P-N revealed an active focus in the left lateral cerebellum (Figure 11a), with rCBF being roughly comparable in conditions A, R, N (Figure 11b). This rCBF difference points to mechanisms involved in P over and above those involved in the other three conditions. In condition P, the singleton was task-relevant in that it was informative about target location. Condition N contained a task-irrelevant singleton that provided no information about target location. In this subtraction, the task-pair was balanced with respect to the occurrence of the color singleton and its sensory effects, so that the only difference was singleton relevance and consequent differential attentional effects. The greater rCBF in condition P than in the other conditions suggests an attentional role for this part of the cerebellum. One possibility is that this cerebellar focus is specifically activated by visual popout, i.e., the match between top-down search for the singleton and its bottom-up detection. Another plausible explanation is suggested by observations that activity close to this region of the cerebellum is associated with repetitive attentional shifts between multiple visual features of foveal stimuli, such as color and shape (83). Switching attention between multiple visual attributes is especially important in condition P. In this condition, search is first set for one feature, singleton color. Once the singleton is found, subjects probably switch to a second feature, target shape, to confirm that it is a bar and then make a discrimination of a third feature, target orientation. In the other three conditions (A, R, N), switching of attention is limited to two attributes, shape and orientation. These two features appear more closely related, intuitively, than color and orientation (this introspective observation is borne out empirically by the results of our Experiments 7 and 8A, see Section 5), so that the requirement for switching attention between attributes is probably lower in these conditions than in condition P.
Figure 11. Experiment 6, Popout - Never. a: Activation in left lateral cerebellum. b: Bar graph as in Figure 9b.
Never - Popout
On the reciprocal subtraction N-P, multiple active sites were found in the left parietal cortex (Figure 12a). One locus was inferolateral in the parietal operculum/superior temporal gyrus, a second was inferomedial in the parieto-occipital fissure and a third was superomedial in the precuneus. The rCBF pattern across conditions was similar at all 3 sites, with greater rCBF in conditions A, R, N than in condition P (Figure 12b). In conditions A, R, N which did not contain informative color singletons, subjects attentively searched the display for a particular shape. However, in condition P, shape search was unnecessary since an informative color singleton readily indicated target location. Thus, the activations may reflect neural processing underlying the attentive shape search that is involved in A, R, N but not in P. Consistent with this, activity close to these regions has been found in other tasks involving attentive shape discrimination (84). It is interesting that this subtraction (and also Absent - Popout) failed to identify the biparietal and right inferior frontal activations that are associated with spatial shifts of attention (45,73,81,82), given that serial shifts of attention have been implicated by some workers in visual search in heterogeneous displays (5). The absence of these activations, however, is consistent with the "biased-competition" model in which search is always parallel but with continuously graded difficulty (2). Alternatively, the demand for spatial shifts of attention may have been relatively low in our case because of the small display set size of 6.
Figure 12. Experiment 6, Never - Popout. a: Foci of activation in left parietal operculum/superior temporal gyrus (A), left parieto-occipital fissure (B) and left precuneus (C). b: Bar graph (as in Figure 9b) shown only for left parieto-occipital fissure activation; rCBF pattern was similar at all three loci.
Overall, this study of visual search revealed that activity in the neural network responsible for controlling visual attention is widely distributed and that various components of the network are differentially active depending on the nature of the particular attentional strategy used. Though stimulus parameters were balanced across the various experimental conditions, alterations in behavioral strategy resulted in substantial changes in the pattern of brain activity, due to changes in the dynamics between endogenous and exogenous processes. The degree to which the findings are feature-specific or task-specific remains unresolved until studied with alternative paradigms.
5. COMPETITION BETWEEN SINGLETONS IN VISUAL SEARCH
5.1. Experiment 7: Search for an orientation singleton
A number of studies have shown that an irrelevant singleton interferes with search for a target that is itself a singleton (8,18,19), as reviewed earlier (Section 2.3). One view emerging from experiments using this paradigm is that attentional set can be tuned for a singleton (4,8,18,19,23). Attention is then captured by the most salient singleton in the display, and further top-down selectivity is not possible according to this view (18,19). But other evidence suggests that capture of attention is affected by task demands (24,25). During initial experiments using the irrelevant-singleton paradigm, we failed to find any consistent interference effect of an irrelevant color or brightness singleton on search for an orientation singleton. We also found no effect of an irrelevant orientation singleton on search for a color or brightness singleton. Since this was inconsistent with some studies of singleton interactions (8,18,19), we decided to systematically assess the effect of irrelevant singletons on a variety of dimensions, individually and in combination, for their ability to interfere with search for an orientation singleton.
In pilot experiments, four individuals viewed uniform displays of 6 and 14 bars. In different blocks, the target was distinguished from the non-targets on one of the following properties: orientation, length, color, shape, brightness, or because it flickered. Subjects indicated the presence or absence of this singleton target. All singleton feature targets popped out, the color target being the most salient by the criterion of shortest RT to target detection (18,19) while the orientation target was the least salient. Of these features, we chose to use an orientation singleton as the target.
The visual search experiment was divided into two parts. The first used a task requiring detection of the presence or absence of the orientation singleton target. In the second, discrimination of target orientation was required. Visual displays (Figure 13) consisted of either 14 or 18 gray bars (R=G=B=40), measuring 1.5° x 0.75°, on an isoluminant red background (R=60; G=B=0). The non-target bars were uniformly oriented at 45° from the vertical and arranged around fixation in a ring of eccentricity 7°. The singleton target bar, oriented at 5° from the vertical, was present on half the trials in the detection task. In the discrimination task, stimulus conditions were similar to the detection task except that the singleton target was present on all trials, oriented at +5° or -5°.
Figure 13. Visual display of Experiment 7. Gray non-target bars are oriented at 45° to the right of vertical on a red background; target bar (arrowhead) is oriented at 5°. Distractor singletons (arrows, clockwise order) differ from both target and non-targets in orientation, brightness, length, color, or shape. (Arrowhead and arrow are for illustration only and were not displayed.)
Irrelevant distractor singletons were absent on some blocks of trials. In other blocks of trials, one of six distractor singletons occurred at random on each trial (instead of one of the non-targets). Each singleton used was salient, as mentioned above. These distractors were distinguished from both target and non-targets by one of the following properties: orientation (a bar oriented at 85° from vertical), length (a bar of 3 times greater length), shape (a circle of same area as a bar), color (isoluminant green bar; R=B=0, G=60), brightness (a bar of 2.5 times greater brightness; R=G=B=100) or flicker (a bar with sinusoidal flicker of 25 Hz with 24% modulation, peak RGB values being R=G=B=55,90). On yet other blocks of trials, these distractors occurred in various combinations: four singleton pairs (orientation-length, length-shape, color-shape, and brightness-flicker), one set of four singletons (color-shape-brightness-flicker), and a set of all six singletons. Display set size was constant within a block and blocks were interleaved in randomized order. Subjects (n=12) were informed that the distractors were irrelevant and were to be ignored.
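As a bookkeeping aid, the distractor conditions listed above can be tallied in a short Python sketch. The variable and function names are illustrative only and are not taken from the original study software. Counting the distractor-absent condition, the six single distractors, the four pairs, the set of four and the set of six gives 13 conditions, i.e. 12 degrees of freedom for a distractor-condition factor.

```python
# Distractor conditions of Experiment 7, transcribed from the text.
# All names here are hypothetical labels chosen for illustration.
SINGLES = ["orientation", "length", "shape", "color", "brightness", "flicker"]
PAIRS = [
    ("orientation", "length"),
    ("length", "shape"),
    ("color", "shape"),
    ("brightness", "flicker"),
]
SET_OF_FOUR = ("color", "shape", "brightness", "flicker")
ALL_SIX = tuple(SINGLES)


def build_conditions():
    """Return every distractor condition as a tuple of feature names."""
    conditions = [()]  # empty tuple = distractor-absent control
    conditions += [(feature,) for feature in SINGLES]
    conditions += PAIRS
    conditions += [SET_OF_FOUR, ALL_SIX]
    return conditions


conditions = build_conditions()
n_conditions = len(conditions)
df_condition = n_conditions - 1  # degrees of freedom for the condition factor
print(n_conditions, df_condition)
```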
5.1.3. Results and Discussion
Figure 14a shows that, in the detection task, all distractor types had a tendency to prolong RT relative to the control, distractor-absent condition. However, the magnitude of the effect clearly varied depending on the distractor type. ANOVA, using display set size and distractor condition as factors, confirmed that RT was significantly affected by distractor condition (F(12,6210) = 40.32, p = 0.0001). Target popout was not affected by the presence of distractors, as shown by the lack of effect of display size (14 vs. 18 items) on RT (F(1,6210) = 0.09, p = 0.76). Moreover, slopes of the linear regressions relating RT to display size were non-significant (with the exception of a single negative slope) and measured 10 ms per item or less in each condition; slopes in this range are generally considered to be indicative of popout (8). There was also no significant interaction between distractor condition and display size (F(12,6210) = 0.78, p = 0.67). Post hoc comparisons with the distractor-absent condition using the Scheffe test indicated that the orientation, length and shape distractors were effective in prolonging RT, while the color, brightness and flickering distractors were ineffective. All tested distractor combinations were effective. The pairs of orientation-length and length-shape were more effective than any single distractor and just as effective as all six distractors together. Also, the orientation-length pair was more effective than the other four distractors together.
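The slope criterion used above can be illustrated with a minimal Python sketch. The RT values below are invented for illustration and are not data from this study; the function names are likewise hypothetical. The sketch fits an ordinary least-squares line relating RT to display set size and checks the slope against the 10 ms/item value cited in the text.

```python
POPOUT_LIMIT_MS_PER_ITEM = 10.0  # criterion quoted in the text (8)


def rt_slope(set_sizes, rts):
    """Ordinary least-squares slope of RT (ms) against display set size."""
    n = len(set_sizes)
    mean_x = sum(set_sizes) / n
    mean_y = sum(rts) / n
    sxx = sum((x - mean_x) ** 2 for x in set_sizes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(set_sizes, rts))
    return sxy / sxx


def in_popout_range(slope):
    """True when the slope is 10 ms/item or less (negative slopes included)."""
    return slope <= POPOUT_LIMIT_MS_PER_ITEM


# Hypothetical trials: three at set size 14 and three at set size 18.
set_sizes = [14, 14, 14, 18, 18, 18]
rts = [600.0, 620.0, 610.0, 615.0, 625.0, 620.0]

slope = rt_slope(set_sizes, rts)
print(slope, in_popout_range(slope))
```

With only two display set sizes, as here, the fitted slope reduces to the difference between the two mean RTs divided by the difference in set size.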
Figure 14. Experiment 7, mean RTs and SEMs (bars) for display set size of 14; the results were very similar for the display set size of 18. Asterisks indicate distractor conditions for which RT was significantly greater (by Scheffe testing) than in the distractor-absent condition (see text). Slopes (ms/item) of the linear regressions relating RT to display size are shown below each distractor condition; asterisks indicate slopes that were significantly different from zero. a: Detection of orientation singleton target. b: Discrimination of singleton target orientation. DA: distractor absent; other abbreviations refer to particular distractors - Or: orientation; Le: length; Sh: shape; Co: color; Br: brightness; Fl: flicker; OL: orientation-length; LS: length-shape; CS: color-shape; BF: brightness-flicker; 4D: color-shape-brightness-flicker; 6D: all six distractors.
In the discrimination task, the results were along similar lines as in the detection task, though the absolute RTs were longer and the magnitude of distractor effects was generally smaller (Figure 14b). Again, the effect of distractor condition was significant by ANOVA (F(12,6044) = 14.21, p = 0.0001). Regression slopes were non-significant with the exception of 3 significantly negative slopes which contributed to the marginally significant effect of display size (F(1,6044) = 3.73, p = 0.054); the interaction effect was not significant (F(12,6044) = 1.63, p = 0.076). In this case, the orientation singleton, the pairs orientation-length and length-shape and the set of all six distractors were the only ones to significantly prolong RT relative to the distractor-absent condition, while the six-distractor set was more effective than other distractor types except for orientation, orientation-length and length-shape (Scheffe test).
These findings are incompatible with the view that attention is obligately captured by the most salient singleton when in "singleton-search mode", moving serially to the target if the first singleton encountered is a distractor. Importantly, they refute the idea that saliency, as assessed by the speed of response to a particular target singleton in the absence of distractors, critically determines whether or not a singleton captures attention (18,19). By this criterion, the color singleton was the most salient, and the orientation singleton the least salient, in our set. Yet, the former was a relatively ineffective distractor while the latter was the most effective single distractor. Our results argue for considerable specificity of the matching process underlying visual search for a predefined target and suggest that top-down factors play a major role in such specificity.
The orientation distractor was very effective in delaying responses to the orientation target. Specificity was not absolute, since singletons distinguished by length and shape were also quite effective. A likely explanation is that orientation, length and shape are computed in similar neuronal pools and hence a salient difference on one of these dimensions interferes with segregation of a target on another one. Thus, the process of competition between singletons appears to be strongly biased in a top-down manner by the attentional template. Such selectivity was not found in many earlier studies because a large number of distractor types were not tested against a single target type, as in our experiments. Even relatively ineffective distractors in our study did tend to prolong RT, so that the difference in effectiveness between distractor types is likely to be quantitative rather than qualitative. Further evidence for top-down influences is provided by our finding that, in the discrimination task, fewer distractors were effective and the magnitude of the distractor effect was substantially less compared to the detection task, implying that the effectiveness of distractor singletons is sensitive to task complexity or attentional load.
Thus, Experiment 7 demonstrated that visual search for a unique element in a multi-element array is not based on automatic capture of attention by the most salient item (18,19) but, instead, is subject to strong top-down selectivity for the target feature. This is consistent with the findings of others (24,25,28) that attentional interference is dependent on the relationship between stimulus properties and task demands, even when potential distractors are feature singletons.
5.2. Experiment 8: Search for orientation, brightness and color singletons
In Experiment 7, distractor-absent trials and trials with single and multiple distractors were presented in separate blocks, with an excess of trials with the distractor absent. This precluded within-subject ANOVA (unless the set of distractor-absent trials was truncated to a small fraction, which was considered inappropriate). We therefore replicated the detection study of Experiment 7 using a within-subject design (Experiment 8A), in a new set of subjects.
A further aim of Experiment 8 was to explore whether the findings of Experiment 7 could be generalized to search for features other than orientation. In Experiment 8B, search was for a target distinguished by brightness while Experiment 8C employed search for a color target. We predicted that interference in these experiments would be greatest when the distractor and target shared common features, i.e. that brightness distractors would be most effective in search for a target distinguished from non-targets in its brightness (Experiment 8B) and that color distractors would be most disruptive for color search (Experiment 8C).
Fifteen subjects participated in all three parts of this experiment, in which only detection tasks were performed. Experimental conditions were similar to those of Experiment 7 except as noted. Displays consisted of 10 or 18 items. In Experiment 8A, non-target bars were gray (R=G=B=20) on a black background (R=G=B=0). The single distractors were of the same brightness and color as the non-targets except for the green color distractor (R=B=0, G=30; isoluminant to the gray bars). In addition to the single distractors, distractor combinations modified on the basis of the results of Experiment 7 were used, grouping together the most effective single distractors separately from the less effective distractors. There were five sets of distractor pairs (brightness-flicker, orientation-length, orientation-shape, shape-color, shape-length) and one set of three distractors (orientation-length-shape).
In Experiment 8B, gray non-target bars (R=G=B=20) were presented on a red background (R=90, G=B=0); search was for a brightness singleton target that was dimmer (R=G=B=5) than the non-targets. Distractors, similar to those of Experiment 8A, were the same brightness and color as the non-targets, except the color (R=B=0, G=30) and brightness (R=G=B=80) distractors. In addition to the single distractors, we used four sets of distractor pairs (brightness-flicker, brightness-color, orientation-length, shape-color) and two sets of three distractors (brightness-flicker-shape, orientation-length-shape).
In Experiment 8C, subjects sought a red-colored singleton target (R=90, G=B=0) that was isoluminant to the non-targets. In addition to single distractors, we used four sets of distractor pairs (length-color, brightness-color, orientation-length, shape-color) and two sets of three distractors (orientation-length-shape, shape-length-color).
In each part of this experiment, the distractor-absent condition, each condition with a single distractor and each condition with multiple distractors occurred with equal frequency and in randomized order within blocks of the same display set size; these trial blocks were interleaved in randomized order. The order of Experiments 8A-C was counterbalanced across subjects. Within-subject ANOVA was used to analyze data, separately in each part of this experiment.
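The within-block randomization described above can be sketched in Python as follows. The condition labels follow the abbreviations of Figures 14 and 15 for Experiment 8A; the number of repetitions per condition and the seed are hypothetical, since the text does not state them, and the helper name is ours.

```python
import random
from collections import Counter

# Experiment 8A conditions: distractor absent, six single distractors,
# five pairs and one set of three (13 in all), labelled as in the figures.
CONDITIONS_8A = [
    "DA", "Or", "Le", "Sh", "Co", "Br", "Fl",
    "BF", "OL", "OS", "CS", "LS", "OLS",
]


def make_block(conditions, repeats, seed=None):
    """Return a shuffled trial list with every condition equally frequent.

    `repeats` (trials per condition) is a hypothetical parameter.
    """
    rng = random.Random(seed)  # local generator, reproducible when seeded
    trials = [cond for cond in conditions for _ in range(repeats)]
    rng.shuffle(trials)
    return trials


block = make_block(CONDITIONS_8A, repeats=4, seed=0)
counts = Counter(block)
print(len(block), counts["DA"], counts["OLS"])
```

Building the full list first and then shuffling guarantees equal frequency of every condition within a block, which sampling a condition independently on each trial would not.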
5.2.3. Results and Discussion
The results of Experiment 8A closely mirrored those of Experiment 7 in the pattern of interference by distractor singletons (Figure 15a, Table 3). ANOVA, using display set size and distractor condition as factors, confirmed that RT was significantly affected by distractor condition (F(12,2112) = 24.64, p = 0.0001). The presence of distractors did not affect target popout, as shown by the lack of effect of display size on RT (F(1,176) = 0.35, p = 0.55) and the mostly non-significant slopes. There was no significant interaction between distractor condition and display size (F(12,2112) = 1.09, p = 0.36). As in Experiment 7, irrelevant singletons similar to the target feature were the most effective distractors, speaking to the specificity of search processes. The most effective single distractors were orientation and length singletons. The most effective distractor sets were comprised of singletons distinguished by features that were similar to the target feature: orientation-length, orientation-shape, shape-length, orientation-length-shape. Additionally, multiple distractors were more effective than single distractors. Thus, Experiment 8A demonstrates the reproducibility of the results of Experiment 7 in a different subject pool and with a slightly different design.
Figure 15. Experiment 8, mean RTs and SEMs (bars) for display set size of 18; the results were very similar for display set size of 10. Asterisks indicate distractor types for which RT was significantly greater (by Bonferroni-corrected paired t-testing) than in the distractor-absent condition (see text and Table 3). a: Experiment 8A, orientation singleton target. b: Experiment 8B, brightness singleton target. c: Experiment 8C, color singleton target. OS: orientation-shape; OLS: orientation-length-shape; BC: brightness-color; BFC: brightness-flicker-color; CL: color-length; CLS: color-length-shape; other abbreviations and details as for Figure 14.
Table 3. Experiment 8A, paired t tests.
Asterisks indicate significant p values after Bonferroni correction for 25 comparisons. DA = distractor absent.
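For reference, the Bonferroni criterion for 25 comparisons can be written out in a short Python sketch. The family-wise alpha of 0.05 is an assumption, as the text does not state the level used, and the function names are illustrative.

```python
ASSUMED_FAMILYWISE_ALPHA = 0.05  # assumption: not stated in the text


def bonferroni_threshold(alpha, n_comparisons):
    """Per-comparison p-value threshold under Bonferroni correction."""
    return alpha / n_comparisons


def is_significant(p_value, alpha, n_comparisons):
    """True when an uncorrected p value survives Bonferroni correction."""
    return p_value < bonferroni_threshold(alpha, n_comparisons)


threshold = bonferroni_threshold(ASSUMED_FAMILYWISE_ALPHA, 25)
print(threshold)

# Hypothetical p values, for illustration only.
print(is_significant(0.001, ASSUMED_FAMILYWISE_ALPHA, 25))
print(is_significant(0.01, ASSUMED_FAMILYWISE_ALPHA, 25))
```

Under this assumption, an uncorrected p value would need to fall below roughly 0.002 to be marked with an asterisk.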
Experiments 8B and 8C both failed to show any effect of irrelevant distractors of any type (Figure 15b, c). Statistical analysis of the data was therefore not pursued apart from computation of regression slopes, which were all in the popout range (< 10 ms/item (8)) and mostly non-significant. At first glance, these findings are surprising, as we had expected to find interference from distractors distinguished from non-targets on the same feature dimension as the target. However, the findings could be interpreted as being consistent with a highly specific attentional template in the case of brightness and color targets, or, alternatively, with a larger difference in the bottom-up signals attributable to target vs. distractor. For instance, if search is set for a target dimmer than the non-targets, as in Experiment 8B, then a distractor that is perceptibly brighter than all other items might fall outside the window of parameters specified by the attentional template. Similarly, if the target of search is a given color (Experiment 8C), then other colors might be effectively screened out. In the case of orientation targets (Experiments 7 and 8A), the attentional template might not specify the target orientation very precisely, or the bottom-up signals attributable to target vs. distractor might not differ as much, under the experimental conditions tested here.
We wondered if the results of Experiments 8B and 8C might be due to high salience of targets and potential distractors along the same feature dimension, making them relatively easy to tell apart. We therefore repeated Experiment 8C minimizing the differences of a reddish target and greenish distractor from gray non-targets in color space (non-target CIE coordinates x=.28, y=.31; R=G=B=20; target x=.38, y=.31; R=49.6, G=13.2, B=14.7; color distractor x=.28, y=.41; R=9.6, G=23.9, B=10), while taking care to ensure that both the target and distractor singletons were salient enough to pop out. However, the results were no different from those of Experiment 8C (data not shown). In a further variation on this experiment, we addressed whether color distractors could be effective if the distractor condition was invariant within a block, rather than variable as in the original experiment. Again this did not affect the results (data not shown).
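The chromaticity coordinates given above can be compared with a minimal Python sketch. Euclidean distance in the CIE xy plane is used here only as a rough index of separation; it is not perceptually uniform, and this calculation is our illustration rather than part of the original analysis.

```python
import math

# CIE (x, y) chromaticity coordinates quoted in the text.
NON_TARGET = (0.28, 0.31)   # gray non-targets
TARGET = (0.38, 0.31)       # reddish target
DISTRACTOR = (0.28, 0.41)   # greenish color distractor


def xy_distance(a, b):
    """Euclidean distance between two points in the CIE xy plane."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


d_target = xy_distance(TARGET, NON_TARGET)
d_distractor = xy_distance(DISTRACTOR, NON_TARGET)
d_target_distractor = xy_distance(TARGET, DISTRACTOR)

print(round(d_target, 3), round(d_distractor, 3), round(d_target_distractor, 3))
```

On this simple measure the target and the color distractor lie equally far (0.10) from the non-targets, displaced along x and y respectively.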
We therefore conclude that, for reasons that remain unclear, greater specificity is possible in matching candidate targets to the attentional template in the case of color and brightness than in the case of orientation. We explore this issue further in the next experiment, using shape search.
5.3. Experiment 9: Search for a shape singleton
This experiment was a follow-up to the previous two experiments. The aim was to test whether the results of Experiments 7 and 8A could be generalized to any other feature domain, or whether the results of Experiments 8B and 8C were in fact generally applicable and Experiments 7 and 8A represented a special case. The feature chosen for this experiment was shape, and the experiment was modeled on the preceding two.
The experiment was divided into two tasks. The first task (detection) was a visual search to detect the presence or absence of the target shape. The second task (discrimination) was a visual search to discriminate between two potential target shapes. For both tasks, visual displays were along the lines detailed for Experiment 8 and consisted of either 14 or 18 gray items, on a black background (Figure 16). The non-target bars were uniformly oriented at 45° from the vertical. The singleton target, created by displacing the upward-pointing vertex of an isosceles triangle slightly to the right, was present on half the trials in the detection task. In the discrimination task, the same singleton target as in the detection task, or its mirror-reversed image, was present on all trials.
We used 6 different singleton distractors. Each singleton used was salient, as assessed by its ability to pop out in a search task in the absence of a distractor in pilot subjects (data not shown). A distractor singleton was distinguished from both target and non-targets by one of the following properties: shape (a circle, pentagon, or equilateral triangle of same area as a bar), orientation, length, or color (properties of the latter three distractors were as in Experiment 7). The order of the detection and discrimination tasks was counterbalanced across subjects (n=14).
Figure 16. Visual display of Experiment 9. Gray non-target bars are oriented at 45° to the right of vertical on a black background; target (arrowhead) is a shape singleton. Distractor singletons (arrows, clockwise order) differ from both target and non-targets in shape (circle, equilateral triangle, pentagon), length, orientation, or color.
5.3.3. Results and Discussion
Figure 17 demonstrates that the equilateral triangle had a major effect in delaying search in both the detection and discrimination tasks. Within-subjects ANOVA for the detection task revealed no significant effect of display set size (F(1,294) = 0.07, p = 0.79) but significant effects of distractor condition (F(6,1764) = 50.49, p = 0.0001) and the interaction term (F(6,1764) = 4.42, p = 0.0002). Regression slopes were all clearly in the popout range, below 10 ms/item (8), in both tasks and were non-significant in all but one condition. In the discrimination task, significant effects were found for distractor condition (F(6,1680) = 51.2, p = 0.0001) and display set size (F(1,280) = 4.64, p = 0.032) but not their interaction (F(6,1680) = 0.3, p = 0.94). Bonferroni-corrected paired comparisons (Table 4) showed that, for the detection task, only the equilateral triangle produced significant interference relative to the distractor-absent condition. Similar comparisons for the discrimination task revealed an additional effect of length. It is curious that shape distractors other than the equilateral triangle were ineffective but that the length distractor was. The reason for this is not clear. In any case, the major finding of this experiment is that the likelihood of interference is greatest from a distractor that is quite similar to the target.
Figure 17. Experiment 9, mean RTs and SEMs (bars) for display set size of 18. a: Detection of shape singleton target. b: Discrimination of singleton target shape. Ci: circle, Pe: pentagon; Tri: triangle; other abbreviations as for Figure 14. Other details as for Figure 15.
Table 4. Experiment 9, paired t tests.
Asterisks indicate significant p values after Bonferroni correction for 6 comparisons. DA = distractor absent.
The findings of this experiment converge with those of the preceding two experiments in supporting the notion of a high degree of feature-specificity in visual search. It appears that the attentional template is specified clearly in a top-down manner, based on the requirements for search. The extent of feature-specificity in matching candidate targets to the template, however, seems to vary depending on the nature of the search target: distractors on related dimensions are effective in some cases while even distractors on the same dimension as the target are ineffective in other cases. Whether such variation is due to differences at the level of specification of the attentional template (i.e. top-down factors) or at the level of sensory representations of the stimulus (i.e. bottom-up factors) is not known. This is a question that is worth pursuing neurophysiologically.
5.4. Experiment 10: Effect of abrupt onsets on singleton search
The visual system seems to be biased towards new objects or objects that have not been seen recently. For example, responses of some stimulus-selective inferotemporal (IT) cells become suppressed with increasing experience of the stimulus (85,86); thus, the temporal context of a stimulus may contribute as much to its saliency as its spatial context (2). In contrast to singletons, abrupt luminance onsets have a seemingly strong tendency to capture visual attention. Yantis and Jonides (87) originally demonstrated this using uninformative abrupt onsets that coincided with a visual search target only rarely. Abruptly onset targets popped out, while RT for targets exposed by offset of masking elements increased as a function of the number of (heterogeneous) non-targets. On the other hand, targets that were uninformative brightness or color singletons showed no RT advantage, suggesting that abrupt luminance onsets are unique in their ability to draw attention (65). Abruptly onset location precues are hard to ignore even if they are detrimental to performance (88). However, if the target location is precued, the effect of abrupt onsets disappears (87,89). As reviewed earlier (Section 2.3), abrupt onsets interfere with search for a color target and vice versa (27) but are not effective as spatial cues unless the subsequent target is also of a dynamic type (24,25).
It appears to be the creation of a new perceptual object rather than the luminance change associated with an abrupt onset that captures attention, since appearance of isoluminant new objects causes them to pop out of heterogeneous displays while a salient luminance increment does not (90). Moreover, abrupt onsets are no more effective in capturing attention than abrupt offsets when neither results in creation of new objects, although both take priority over objects without such luminance changes (91). Also, offset transients and increase in non-target number can interfere with popout of abruptly onset targets (92). However, abrupt isoluminant color changes do not pop out as predicted by the new-object account (93). Thus, there may be a continuum of "attentional priority tagging", with abrupt onsets or new objects at one end of this continuum (91). At the single-neuron level, neurons in area LIP respond preferentially to abrupt visual onsets but equally to behaviorally relevant stimuli (32), as mentioned earlier.
The paradigm of Experiment 7 afforded an opportunity to test the efficacy of abrupt onsets in capturing attention, as compared to other singleton distractors. We reasoned that if an irrelevant abrupt luminance onset produced interference during search for an orientation singleton target in an OCD, this would suggest that onsets do capture attention and thus strengthen the case for onsets as a special visual feature category. On the other hand, if onsets demonstrated no such effect, this would be more consistent with specificity of top-down control and the lack of a unique role for abrupt onsets (24,25). We also reasoned that if onsets produced interference even during a difficult search for an orientation non-singleton target in an ROD, this would strongly favor the idea of involuntary attentional capture by abrupt onsets and their uniqueness in this regard.
Subjects (n=11) searched OCDs and RODs of either 10 or 18 gray bars. Display characteristics were similar to Experiment 8. In a given block of trials, one of six possible distractor conditions occurred at random on each trial. In the OCD (Figure 18a), either there was no distractor or the distractors were distinguished from both target and non-targets by one of the following properties: orientation, length, color, orientation-length (all with properties as in Experiment 7) or abrupt onset (onset by a luminance increment 50 ms after onset of the rest of the display). In the ROD (Figure 18b), a shape distractor substituted for the orientation distractor, which could not be used meaningfully in this display. The order of display presentation (OCD vs. ROD) was counterbalanced across subjects.
Figure 18. Visual displays of Experiment 10. Gray non-target bars are oriented at 45° to the right of vertical on a black background. The target bar (arrowhead) is oriented at 5° amongst non-target bars oriented uniformly at 45° in the OCD (a) and randomly in the ROD (b). Distractor singletons (arrows, clockwise order) differ from both target and non-targets on the dimension of color, length (both displays), orientation (OCD only) or shape (ROD only).
5.4.3. Results and Discussion
The abrupt onset was as effective as the orientation and length distractors but not as powerful as a pair of distractors on these two dimensions (Figure 19a). Within-subjects ANOVA for the OCD showed significant effects of display set size (F(1,253) = 35.4, p = 0.0001), distractor condition (F(5,1265) = 45.78, p = 0.0001), and the interaction term (F(5,1265) = 3.51, p = 0.0037). Slopes for all conditions in the OCD were within the popout range of 10 ms/item (8) and often significantly negative, which contributed to the significant effect of display set size in this condition. Paired t-tests with the Bonferroni correction (Table 5) revealed that the orientation and length distractors, individually or in combination, and the abrupt onset distractor significantly delayed search in the OCD, compared to the distractor-absent condition. The onset distractor did not differ significantly in efficacy from the orientation and length distractors. Furthermore, each of the single distractors (orientation, length, and onset) was significantly less effective than the pair orientation-length. Thus, abrupt onsets were as effective as distractors on the same dimension as the search target or a closely related dimension, but less effective than a pair of such distractors.
Table 5. Experiment 10, OCD, paired t tests.
Asterisks indicate significant p values after Bonferroni correction for 10 comparisons. DA = distractor absent.
Consistent with the literature reviewed in Section 2.3 (8,21,22), none of the singleton distractors were effective in interfering with performance in the ROD, a heterogeneous display in which search was for a non-singleton target (Figure 19b). Critically, the abrupt onset was also an ineffectual distractor in the ROD (Figure 19b). For the ROD, we found a significant effect of display set size (F(1,140) = 203.12, p = 0.0001); however, there was no effect of distractor condition (F(5,700) = 1.17, p = 0.32) or interaction between the factors (F(5,700) = 1.55, p = 0.17). As expected, slopes for the ROD were significant and much steeper than for the OCD.
Figure 19. Experiment 10, mean RTs and SEMs (bars) for display set size of 18. a: OCD; b: ROD. On: onset; other abbreviations as for Figure 14. Other details as for Figure 15.
Together, these findings indicate that abrupt onsets can involuntarily capture attention, but only under some circumstances. When search is set for a singleton and attention is presumably distributed over the display, an abruptly onset element can interfere with search for an orientation singleton. This contrasts with the conclusion of Folk and colleagues (24,25) (see Section 2.3). These workers found that dynamic cues such as abrupt onsets produced validity effects only when subsequent targets were also dynamic. The differences are probably attributable to the different experimental conditions. The finding that abrupt onsets in our study did not affect search in a heterogeneous display (ROD) is different from that obtained by Yantis and Jonides (87), also in a heterogeneous display but with much smaller display set sizes (2-4 elements) than we used. Increasing display set size, as in our experiment, appears to negate the ability of an abrupt onset to grab attention. Thus, though abrupt luminance onsets can interfere with search under certain conditions, the bottom-up effect exerted by such an onset is not absolute since it is subject to modulation by top-down effects and vanishes in sufficiently complex tasks.
5.5. Experiment 11: Feature-level vs. object-level competition between singletons
Many theories of attention focus on the critical role of spatial location in attentional guidance (94-96). FIT emphasizes feature-specific detectors and serial, spatial shifts of attention between items in a display when popout does not occur (5). In contrast, others have argued that attention is allocated, not to spatial locations, but to surfaces (97,98) or entire objects (99). For instance, completion of illusory contours across the midline can overcome hemispatial extinction in patients with parietal lesions, presumably via object-centered attention (100). In the context of visual search, popout depends in some cases on three-dimensional representations of objects (101) rather than more basic visual features.
Experiments 7, 8A and 10 showed that search for an orientation singleton was delayed more in the presence of a pair of distractors, such as orientation-length, than by either member of the pair alone. Against the background of object-oriented theories of attention, it is of interest whether the greater efficacy of a pair of distractors relative to one distractor depends upon feature-level or object-level competition. This issue was examined in this final experiment.
Subjects (n=8) searched displays for an orientation singleton target in detection and discrimination tasks. Displays were similar to those in Experiment 8A.
Irrelevant distractor singletons were absent on some blocks of trials while one of four distractor conditions occurred at random on other trials (the number of trials of each kind was balanced). The distractor singletons used were distinguished from both target and non-targets by one of the following properties: orientation, length, orientation and length features on 2 separate objects (all as in Experiment 8), and orientation-length on a single object (Figure 20). Display set size was constant within a block and blocks were interleaved in randomized order. If comparable distractor effects were observed from multiple distractor features, irrespective of the number of distractor objects, this would be consistent with feature-level competition. Alternatively, if the distractor effect for multiple features was smaller when the features were on a single object than when they were distributed across multiple objects, this would favor object-level competition.
Figure 20. Visual display of Experiment 11. Gray non-target bars are oriented at 45° to the right of vertical on a black background; the target bar (arrowhead) is an orientation singleton oriented at 5°. Distractor singletons (arrows, clockwise order) differ from both target and non-targets in orientation, length, or orientation and length.
5.5.3. Results and Discussion
Figure 21 illustrates that the effect of the combination of orientation and length distractors was greater than for either alone, as in our earlier experiments. It also shows that the effect of the pair of features was comparable whether they occurred on separate objects or the same object. Within-subject ANOVA revealed no significant effect of display set size (detection: F(1,144) = 0.27, p = 0.6; discrimination: F(1,144) = 0.82, p = 0.37) but significant effects of distractor condition (detection: F(4,576) = 65.07, p = 0.0001; discrimination: F(4,576) = 11.18, p = 0.0001) and the interaction term (detection: F(4,576) = 5.23, p = 0.0004; discrimination: F(4,576) = 3.18, p = 0.014), for both tasks. Slopes were all quite low, in the popout range in most cases and mostly non-significant. Paired Bonferroni-corrected comparisons (Table 6) revealed that all distractor conditions were significantly different from the distractor-absent condition. Additionally, both combinations of orientation-length were significantly more effective than either feature alone. The effect of the combinations did not differ significantly whether the two features occurred on the same object or separately.
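The search slopes mentioned above express how much reaction time (RT) grows per additional display item. As a minimal sketch, assuming only the two display set sizes used here (10 and 18), the slope can be computed as follows. The function name and the RT values are hypothetical and serve only to illustrate the arithmetic; they are not data from this experiment.

```python
def search_slope(rt_small_ms, rt_large_ms, set_small=10, set_large=18):
    """Return the RT x set-size slope in ms per item.

    With only two set sizes, the slope is the RT difference divided by
    the difference in the number of display items.
    """
    return (rt_large_ms - rt_small_ms) / (set_large - set_small)


# Hypothetical mean RTs in ms (illustrative only, not experimental data).
slope = search_slope(rt_small_ms=520.0, rt_large_ms=536.0)
print(f"{slope:.1f} ms/item")  # prints "2.0 ms/item"
```

A slope near zero, as in this illustration, is what would be expected when RT is largely independent of the number of display items.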
Figure 21. Experiment 11, mean RTs and SEMs (bars) for display set size of 10; the results were very similar for the display set size of 18. a: Detection of orientation singleton target; results shown for target-present displays. b: Discrimination of singleton target orientation. Tog: orientation and length features together on a single object; other abbreviations as in Figure 14. Other details as for Figure 15.
Table 6. Experiment 11, paired t tests.
Asterisks indicate significant p values after Bonferroni correction for 9 comparisons. DA = distractor absent.
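The Bonferroni correction for 9 comparisons can be sketched as follows: each uncorrected p value is compared against the family-wise alpha divided by the number of comparisons. This is a minimal illustration, not the analysis code used in the study. The alpha of 0.05 is an assumption (the level is not stated here), and the comparison labels and p values are hypothetical.

```python
def bonferroni_threshold(alpha=0.05, n_comparisons=9):
    """Per-comparison significance threshold under Bonferroni correction."""
    return alpha / n_comparisons


def flag_significant(p_values, alpha=0.05, n_comparisons=9):
    """Map each comparison label to True if its p value survives correction."""
    threshold = bonferroni_threshold(alpha, n_comparisons)
    return {label: p < threshold for label, p in p_values.items()}


# Hypothetical uncorrected p values (illustrative only, not from Table 6).
example = {
    "DA vs orientation": 0.0001,
    "orientation vs length": 0.02,
    "separate vs together": 0.4,
}
flags = flag_significant(example)
print(flags)
```

With 9 comparisons and an assumed alpha of 0.05, the per-comparison threshold is about 0.0056, so only the first of the three hypothetical comparisons would be marked significant.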
Thus, a single distractor that differs in both orientation and length delays search more effectively than a single orientation or length distractor and is as effective as the distractor pair of orientation-length. These findings are consistent with the idea that competition between singletons for attention in this visual search paradigm is feature-based rather than object-based.
In summary, our observations in the present article converge on the conclusion that visual search is extremely flexible and subject to considerable specificity of top-down control, although such specificity is clearly not absolute. Visual search does not necessarily depend on processing by the magnocellular (M) visual sub-system. The underlying neural processing is distributed over an extensive network of brain regions, with varying roles for different parts of the network as the dynamics of top-down vs. bottom-up influences shift. The conjunction of bottom-up processing with top-down attentional suppression of an irrelevant color singleton could account for activity in right primary visual cortex (V1). The conjunction of bottom-up processing with top-down attentional set could explain activity in the right superior temporal gyrus/insular cortex. The left lateral cerebellum appears to play a role in attention, either in signaling popout or in switching attention repeatedly between multiple visual attributes. Loci in left parietal cortex (parietal operculum/superior temporal gyrus, parieto-occipital fissure and precuneus) are implicated in attention-demanding search for a target shape. When multiple feature singletons compete for attention, interference between them is strongest for features closely related to the distinguishing target feature. This competition appears to be feature-level rather than object-level, and is characterized by a varying degree of specificity for different features. Task complexity modulates interference effects, even for abrupt visual onsets, which are often considered to capture attention involuntarily. Further study of visual search at the neural level, with regard to competition between features, is likely to be rewarding.
This work was funded in part by grants from the Emory University Research Committee and the Emory-Georgia Tech Biomedical Technology Research Center. We thank Scott Peterson and Tony Simon for their involvement in some of the early behavioral experiments, Delicia Votaw and Michael White for technological assistance during PET imaging, Scott Grafton for lending expertise with analysis of imaging data, and Ron Boothe and Jim Wilson for helpful comments. Finally, we are indebted to our subjects for their participation.
1. James W: Attention. In: The Principles of Psychology. Henry Holt and Co., New York, 402-458 (1890)
2. Desimone R, J Duncan: Neural mechanisms of selective visual attention. Annu.Rev.Neurosci. 18, 193-222 (1995)
3. Yantis S: Attentional capture in vision. In: Converging Operations in the Study of Visual Selective Attention. Eds: Kramer AF, Coles MGH, Logan GD, Amer. Psychol. Assoc., Washington, D.C., 45-76 (1996)
4. Pashler H: Cross-dimensional interaction and texture segregation. Percept.Psychophys. 43, 307-318 (1988)
5. Treisman A: Features and objects: the fourteenth Bartlett Memorial lecture. Q.J.Exp.Psychol. 40A, 201-237 (1988)
6. Treisman AM, G Gelade: A feature-integration theory of attention. Cognit.Psychol. 12, 97-136 (1980)
7. Egeth H, J Jonides, S Wall: Parallel processing of multielement displays. Cognit.Psychol. 3, 674-698 (1972)
8. Bacon WF, HE Egeth: Overriding stimulus-driven attentional capture. Percept.Psychophys. 55, 485-496 (1994)
9. Julesz B, JR Bergen: Textons, the fundamental elements in preattentive vision and perception of textures. Bell System Technical Journal 62, 1619-1645 (1983)
10. Nakayama K, GH Silverman: Serial and parallel processing of visual feature conjunctions. Nature 320, 264 (1986)
11. Steinman SB: Serial and parallel search in pattern vision. Perception 16, 389-398 (1987)
12. Sagi D: The combination of spatial frequency and orientation is effortlessly perceived. Percept.Psychophys. 43, 601-603 (1988)
13. McLeod P, J Driver, J Crisp: Visual search for conjunctions of movement and form is parallel. Nature 332, 154-155 (1988)
14. Wolfe JM, KR Cave, SL Franzel: Guided search: an alternative to the feature integration model for visual search. J.Exp.Psychol: Hum.Percept.Perf. 15, 419-433 (1989)
15. Hoffman JE: A two-stage model of visual search. Percept.Psychophys. 25, 319-327 (1979)
16. Cave KR, JM Wolfe: Modeling the role of parallel processing in visual search. Cognit.Psychol. 22, 225-271 (1990)
17. Duncan J, GW Humphreys: Visual search and stimulus similarity. Psych.Rev. 96, 433-458 (1989)
18. Theeuwes J: Cross-dimensional perceptual selectivity. Percept.Psychophys. 50, 184-193 (1991)
19. Theeuwes J: Perceptual selectivity for color and form. Percept.Psychophys. 51, 599-606 (1992)
20. Kim M-S, KR Cave: Top-down and bottom-up attentional control: On the nature of interference from a salient distractor. Percept.Psychophys. 61, 1009-1023 (1999)
21. Folk CL, S Annett: Do locally defined feature discontinuities capture attention? Percept.Psychophys. 56, 277-287 (1994)
22. Yantis S, HE Egeth: On the distinction between visual salience and stimulus-driven attentional capture. J.Exp.Psychol: Hum.Percept.Perf. 25, 661-676 (1999)
23. Egeth HE, S Yantis: Visual attention: Control, representation and time course. Annu.Rev.Psychol. 48, 269-297 (1997)
24. Folk CL, RW Remington, JC Johnston: Involuntary covert orienting is contingent on attentional control settings. J.Exp.Psychol: Hum.Percept.Perf. 18, 1030-1044 (1992)
25. Folk CL, RW Remington, JH Wright: The structure of attentional control: Contingent attentional capture by apparent motion, abrupt onset and color. J.Exp.Psychol: Hum.Percept.Perf. 20, 329 (1994)
26. Posner MI: Orienting of attention. Q.J.Exp.Psychol. 32, 3-25 (1980)
27. Theeuwes J: Stimulus-driven capture and attentional set: selective search for color and visual abrupt onsets. J.Exp.Psychol: Hum.Percept.Perf. 20, 799-806 (1994)
28. Ghirardelli TG, HE Egeth: Goal-directed and stimulus-driven attention in cross-dimensional texture segregation. Percept.Psychophys. 60, 826-838 (1998)
29. Colby CL: The neuroanatomy and neurophysiology of attention. J.Child Neurol. 6, S88-S116 (1991)
30. Bushnell MC, ME Goldberg, DL Robinson: Behavioral enhancement of visual responses in monkey cerebral cortex. I. Modulation in posterior parietal cortex related to selective visual attention. J.Neurophysiol. 46, 755-772 (1981)
31. Steinmetz MA, C Constantinidis: Neurophysiological evidence for a role of posterior parietal cortex in redirecting visual attention. Cerebral Cortex 5, 448-456 (1995)
32. Gottlieb JP, M Kusunoki, ME Goldberg: The representation of visual salience in monkey parietal cortex. Nature 391, 481-484 (1998)
33. Moran J, R Desimone: Selective attention gates visual processing in the extrastriate cortex. Science 229, 782-784 (1985)
34. Motter BC: Focal attention produces spatially selective processing in visual cortical areas V1, V2, and V4 in the presence of competing stimuli. J.Neurophysiol. 70, 909-919 (1993)
35. Chelazzi L, EK Miller, J Duncan, R Desimone: A neural basis for visual search in inferior temporal cortex. Nature 363, 345-347 (1993)
36. Luck SJ, L Chelazzi, SA Hillyard, R Desimone: Neural mechanisms of spatial selective attention in areas V1, V2, and V4 of macaque visual cortex. J.Neurophysiol. 77, 24-42 (1997)
37. Reynolds JH, L Chelazzi, R Desimone: Competitive mechanisms subserve attention in macaque areas V2 and V4. J.Neurosci. 19, 1736-1753 (1999)
38. Connor CE, DC Preddie, JL Gallant, DC Van Essen: Spatial attention effects in macaque area V4. J.Neurosci. 17, 3201-3214 (1997)
39. Motter BC: Neural correlates of attentive selection for color or luminance in extrastriate area V4. J.Neurosci. 14, 2178-2189 (1994)
40. Motter BC: Neural correlates of feature selective memory and pop-out in extrastriate area V4. J.Neurosci. 14, 2190-2199 (1994)
41. Treue S, JHR Maunsell: Attentional modulation of visual motion processing in cortical areas MT and MST. Nature 382, 539-541 (1996)
42. Rainer G, WF Asaad, EK Miller: Selective representation of relevant information by neurons in the primate prefrontal cortex. Nature 393, 577-579 (1998)
43. Spitzer H, R Desimone, J Moran: Increased attention enhances both behavioral and neuronal performance. Science 240, 338-340 (1988)
44. Yeshurun Y, M Carrasco: Attention improves or impairs visual performance by enhancing spatial resolution. Nature 396, 72-75 (1998)
45. Corbetta M, GL Shulman, FM Miezin, SE Petersen: Superior parietal cortex activation during spatial attention shifts and visual feature conjunction. Science 270, 802-805 (1995)
46. Miyauchi S, Y Sasaki, B Pütz, R Takino, H Imamizu, H Okamoto: Activation of parieto-occipital junction and superior parietal cortex during visual search task. Soc.Neurosci.Abstr. 22, 729.5 (1996)
47. Kastner S, P De Weerd, R Desimone, LG Ungerleider: Mechanisms of directed attention in the human extrastriate cortex as revealed by functional MRI. Science 282, 108-111 (1998)
48. Kastner S, MA Pinsk, P De Weerd, R Desimone, LG Ungerleider: Increased activity in human visual cortex during directed attention in the absence of visual stimulation. Neuron 22, 751-761 (1999)
49. Girelli M, SJ Luck: Are the same attentional mechanisms used to detect visual search targets defined by color, orientation and motion? J.Cognit.Neurosci. 9, 238-253 (1997)
50. Luck SJ, MA Ford: On the role of selective attention in visual perception. Proc.Natl.Acad.Sci.USA 95, 825-830 (1998)
51. Woodman GF, SJ Luck: Electrophysiological measurement of rapid shifts of attention during visual search. Nature 400, 867-869 (1999)
52. Martinez A, L Anllo-Vento, MI Sereno, LR Frank, RB Buxton, DJ Dubowitz, EC Wong, H Hinrichs, HJ Heinze, SA Hillyard: Involvement of striate and extrastriate visual cortical areas in spatial attention. Nature Neuroscience 2, 364-369 (1999)
53. Schiller PH, K Lee: The role of the primate extrastriate area V4 in vision. Science 251, 1251-1253 (1991)
54. De Weerd P, MR Peralta, III, R Desimone, LG Ungerleider: Loss of attentional stimulus selection after extrastriate cortical lesions in macaques. Nature Neuroscience 2, 753-758 (1999)
55. Friedman-Hill SR, LC Robertson, A Treisman: Parietal contributions to visual feature binding: Evidence from a patient with bilateral lesions. Science 269, 853-855 (1995)
56. Kim M-S, LC Robertson: Spatial attention in feature search: evidence from a patient with bilateral parietal lesions. Proc.Cognit.Neurosci.Soc. 119 (1997)
57. Husain M, C Kennard: Distractor-dependent frontal neglect. Neuropsychologia 35, 829-841 (1997)
58. Livingstone MS, DH Hubel: Psychophysical evidence for separate channels for the perception of form, color, movement and depth. J.Neurosci. 7, 3416-3468 (1987)
59. Sawatari A, EM Callaway: Convergence of magno- and parvocellular pathways in layer 4B of macaque primary visual cortex. Nature 380, 442-446 (1996)
60. Van Essen DC, EA DeYoe: Concurrent processing in the primate visual cortex. In: The Cognitive Neurosciences. Ed: Gazzaniga MS. MIT Press, Cambridge, MA, 383-400 (1995)
61. Merigan WH, JHR Maunsell: Macaque vision after magnocellular lateral geniculate lesions. Visual Neurosci. 5, 347-352 (1990)
62. Merigan WH, CE Byrne, JHR Maunsell: Does primate motion perception depend on the magnocellular pathway? J.Neurosci. 11, 3422-3429 (1991)
63. Merigan WH, LM Katz, JHR Maunsell: The effects of parvocellular lateral geniculate lesions on the acuity and contrast sensitivity of macaque monkeys. J.Neurosci. 11, 994-1001 (1991)
64. Breitmeyer BG, L Ganz: Implications of sustained and transient channels for theories of visual pattern masking, saccadic suppression, and information processing. Psych.Rev. 83, 1-36 (1976)
65. Jonides J, S Yantis: Uniqueness of abrupt visual onset in capturing attention. Percept.Psychophys. 43, 346-354 (1988)
66. Simon TJ, S Peterson, G Patel, K Sathian: Do the magnocellular and parvocellular visual pathways contribute differentially to subitizing and counting? Percept.Psychophys. 60, 451-464 (1998)
67. Todd S, AF Kramer: Attentional misguidance in visual search. Percept.Psychophys. 56, 198-210 (1994)
68. Schiller PH, CL Colby: The responses of single cells in the lateral geniculate nucleus of the rhesus monkey to color and luminance contrast. Vision Res. 23, 1631-1641 (1983)
69. Logothetis NK, PH Schiller, ER Charles, AC Hurlbert: Perceptual deficits and the role of color opponent and broad band channels in vision. Science 247, 214-217 (1990)
70. Shapley R: Parallel Neural Pathways and Visual Function. In: The Cognitive Neurosciences. Ed: Gazzaniga MS. MIT Press, Cambridge, MA, 315-324 (1995)
71. Lehmkuhle S, RP Garzia, L Turner, T Hash, JA Baro: A defective visual pathway in children with reading disability. N.Engl.J.Med. 328, 989-996 (1993)
72. Raczkowski D, JW Kalat, R Nebes: Reliability and validity of some handedness questionnaire items. Neuropsychologia 12, 43-47 (1974)
73. Sathian K, TJ Simon, S Peterson, GA Patel, JM Hoffman, ST Grafton: Neural evidence linking visual object enumeration and attention. J.Cognit.Neurosci. 11, 36-51 (1999)
74. Huang S-C, RE Carson, ME Phelps, EJ Hoffman, HR Schelbert, DE Kuhl: A boundary method for attenuation correction in positron computed tomography. J.Nucl.Med. 22, 627-637 (1981)
75. Woods RP, ST Grafton, CJ Holmes, SR Cherry, JC Mazziotta: Automated image registration: I. General methods and intrasubject validation. Journal of Computer Assisted Tomography 22, 139-152 (1998)
76. Woods RP, ST Grafton, JDG Watson, NL Sicotte, JC Mazziotta: Automated image registration: II. Intersubject validation of linear and nonlinear models. Journal of Computer Assisted Tomography 22, 153-165 (1998)
77. Neter J, W Wasserman, MH Kutner: Applied Linear Statistical Models. Irwin, Boston (1990)
78. Friston KJ, KJ Worsley, RSJ Frackowiak, JC Mazziotta, AC Evans: Assessing the significance of focal activations using their spatial extent. Human Brain Mapping 1, 210-220 (1994)
79. Heilman KM, RT Watson, E Valenstein, ME Goldberg: Attention: behavior and neural mechanisms. In: Higher functions of the brain. Ed: Plum F. American Physiological Society, Bethesda, MD, 461-481 (1987)
80. Whitehead R: Right hemisphere processing superiority during sustained visual attention. J.Cognit.Neurosci. 3, 329-334 (1991)
81. Corbetta M, FM Miezin, GL Shulman, SE Petersen: A PET study of visuospatial attention. J.Neurosci. 13, 1202-1226 (1993)
82. Nobre AC, GN Sebestyen, DR Gitelman, MM Mesulam, RSJ Frackowiak, CD Frith: Functional localization of the system for visuospatial attention using positron emission tomography. Brain 120, 515-533 (1997)
83. Le TH, JV Pardo, X Hu: 4T-fMRI study of nonspatial shifting of selective attention: cerebellar and parietal contributions. J.Neurophysiol. 79, 1535-1548 (1998)
84. Gulyas B, A Cowey, CA Heywood, D Popplewell, PE Roland: Visual form discrimination from texture cues: a PET study. Human Brain Mapping 6, 115-127 (1998)
85. Li L, EK Miller, R Desimone: The representation of stimulus familiarity in anterior inferior temporal cortex. J.Neurophysiol. 69, 1918-1929 (1993)
86. Miller EK, L Li, R Desimone: A neural mechanism for working and recognition memory in inferior temporal cortex. Science 254, 1377-1379 (1991)
87. Yantis S, J Jonides: Abrupt visual onsets and selective attention: Evidence from visual search. J.Exp.Psychol: Hum.Percept.Perf. 10, 601-621 (1984)
88. Remington RW, JC Johnston, S Yantis: Involuntary attentional capture by abrupt onsets. Percept.Psychophys. 51, 279-290 (1992)
89. Yantis S, J Jonides: Abrupt visual onsets and selective attention: Voluntary versus automatic allocation. J.Exp.Psychol: Hum.Percept.Perf. 16, 121-134 (1990)
90. Yantis S, AP Hillstrom: Stimulus-driven attentional capture: Evidence from equiluminant visual objects. J.Exp.Psychol: Hum.Percept.Perf. 20, 95-107 (1994)
91. Watson DG, GW Humphreys: Attention capture by contour onsets and offsets: No special role for onsets. Percept.Psychophys. 57, 583-597 (1995)
92. Martin-Emerson R, AF Kramer: Offset transients modulate attentional capture by sudden onsets. Percept.Psychophys. 59, 739-751 (1997)
93. Theeuwes J: Abrupt luminance change pops out; abrupt color change does not. Percept.Psychophys. 57, 637-644 (1995)
94. Posner MI, SE Petersen: The attention system of the human brain. Annu.Rev.Neurosci. 13, 25-42 (1990)
95. Posner MI, S Dehaene: Attentional networks. Trends Neurosci. 17, 75-79 (1994)
96. Mangun GR, SA Hillyard: Mechanism and models of selective attention. In: Electrophysiology of Mind. Event-related brain potentials and cognition. Eds: Rugg MD, Coles MGH. Oxford University Press, Oxford, 40-85 (1995)
97. Nakayama K, ZJ He, S Shimojo: Visual surface representation: A critical link between lower-level and higher-level vision. In: Invitation to cognitive science: Vision. Ed: Kosslyn SM. MIT Press, Cambridge, MA, 1-70 (1995)
98. He ZJ, K Nakayama: Visual attention to surfaces in 3D space. Proc.Natl.Acad.Sci.USA 92, 11155-11159 (1995)
99. Duncan J: Selective attention and the organization of visual information. J.Exp.Psychol: Gen. 113, 501-517 (1984)
100. Mattingley JB, G Davis, J Driver: Preattentive filling-in of visual surfaces in parietal extinction. Science 275, 671-674 (1997)
101. Enns JT, RA Rensink: Influence of scene-based properties on visual search. Science 247, 721-723 (1990)
102. Talairach J, P Tournoux: Co-planar stereotaxic atlas of the brain. Thieme Medical Publishers, New York (1988)