Theeuwes (2010) presents an array of evidence in support of the theoretical claim that “early” visual selection is completely stimulus-driven, and that top–down factors play a role only after attention has been allocated, through recurrent feedback processing. As noted by Theeuwes, however, this view represents one side of a heated debate over the role of top–down attentional “set” in the involuntary allocation of attention (i.e., attentional “capture”). On the other side of the debate is the “contingent capture” view, which states that the attention allocation system can be “configured” or “set” to respond to stimulus properties that are needed to distinguish target stimuli from non-target stimuli. With a top–down control setting in place, stimuli that match the setting will elicit a shift of attention, even when the stimuli are irrelevant (i.e., when the observer knows in advance that such shifts will likely impair relevant target processing). In addition, according to the contingent capture view, irrelevant, salient stimuli that do not match top–down control settings will not result in a shift of attention. As summarized in Theeuwes (2010), the primary evidence for the stimulus-driven view comes from Theeuwes’ additional singleton paradigm, while the primary evidence for the contingent capture view comes from the modified spatial cuing paradigm of Folk, Remington, and Johnston (1992). These two paradigms differ in the temporal presentation of distractors and targets: in the additional singleton paradigm distractors and targets are presented simultaneously, while in the spatial cuing paradigm the distractor precedes the target. Proponents of each theory have proposed mechanisms to account for the discrepant results from the other paradigm.
In this commentary, rather than providing a full critique of Theeuwes’ theoretical perspective (which is well beyond the scope of a short commentary), we will focus specifically on one critical aspect of his account: the notion that rapid disengagement of attention accounts for the results of the spatial cuing paradigm. The rapid disengagement construct is indeed essential to the stimulus-driven perspective because it is the only way to salvage a bottom–up account of the “contingent capture” pattern found in the spatial cuing task. We will argue, however, that a careful evaluation of the evidence suggests that the rapid disengagement account of contingent capture is at best unsupported and at worst unfalsifiable.