Journal of Behavioral Decision Making J. Behav. Dec. Making, 17: 395–408 (2004) Published online in Wiley InterScience (www.interscience.wiley.com) DOI: 10.1002/bdm.482

How Does Preference Reversal Appear and Disappear? Effects of the Evaluation Mode

CHRISTOPHE SCHMELTZER¹*, JEAN-PAUL CAVERNI¹ and MASSIMO WARGLIEN²
¹ Cognitive Psychology Laboratory, Aix-Marseille I University and CNRS, Marseille, France
² Department of Business, Economics and Management, Ca’ Foscari University, Venice, Italy

ABSTRACT

Preference reversal is a systematic change in the preference order between options when different response methods are used (e.g., choice vs. judgment). The present study focuses on procedures used to elicit preferences according to an evaluability hypothesis. Two experiments compared joint vs. separate evaluations and explicit vs. non-explicit joint evaluations. Subjects had to express preferences between high-variance gambles (HVGs) and low-variance gambles (LVGs) either by choosing one gamble to play in a lottery or by assigning gambles minimum selling prices. We show that HVGs are preferred in both choice and pricing conditions when gambles are evaluated separately, and LVGs are preferred in both choice and selling conditions when gambles are evaluated in pairs: i.e., when the evaluation mode is held constant, classic preference reversal disappears. These results support the evaluability hypothesis, and suggest that preferences depend on whether subjects are allowed to compare the options they are asked to choose from or judge, independently of the nature of the scale (i.e., attractiveness vs. minimum selling price) they are required to adopt. Copyright © 2004 John Wiley & Sons, Ltd.

key words: preference reversal; evaluability hypothesis; anchoring and adjustment; evaluation mode

INTRODUCTION

In this paper we explore the hypothesis that preference reversal (PR) in lotteries is driven by the evaluation mode, i.e., the way pairs of gambles are to be evaluated, whether jointly or separately, independently of the worth scale (i.e., the nature of the scale) under consideration (e.g., attractiveness or minimum selling

* Correspondence to: Christophe Schmeltzer, Université d’Aix-Marseille I, Laboratoire de Psychologie Cognitive, Case 66, 3 Place Victor Hugo, F-13331 Marseille Cedex 3, France. E-mail: [email protected]


price). We suggest that the relative difficulty in taking into account the variance of lottery outcomes in separate evaluation (SE) is largely responsible for the presence or absence of PR. The role of worth scales usually associated with the choice/judgment distinction may thus have been overemphasized, due to confusion between evaluation mode and worth scale in most classic experiments on PR. The PR phenomenon has been attracting interest for more than thirty years, i.e., since the pioneering work by Lichtenstein and Slovic (1971). It can be broadly defined as a change in the preference order between options when different procedures are used to elicit such preferences. The best-known example involves lottery gambles in choice vs. judgment procedures: subjects are presented with two different gambles of equal (or nearly equal) expected value. When choosing which of the two gambles is more attractive, subjects usually prefer low-variance gambles (LVGs)—called ‘‘P-bet’’ in the PR literature—with high probability of low winnings (e.g., 28/36 chances of winning $10). When evaluating the minimum selling price of each gamble, however, the same subjects prefer high-variance gambles (HVGs)—called ‘‘$-bet’’ in the PR literature—with low probability of high winnings (e.g., 3/36 chances of winning $100). We will refer to this baseline phenomenon as ‘‘classic’’ PR. Having resisted many experimental trials (cf. Lichtenstein & Slovic, 1971; Grether & Plott, 1979; Pommerehne, Schneider, & Zweifel, 1982; Reilly, 1982),1 over time, the PR phenomenon has demonstrated its robustness (usually from 40% to 50% of responses show classic PR), and has been replicated in a growing set of decision tasks. This has not come without paying the price of increasing complexity: multiple factors generating PR have emerged, together with different theoretical explanations trying to account for the phenomenon. 
For extensive reviews, see Slovic and Lichtenstein (1983), Tversky, Slovic, and Kahneman (1990), and Camerer (1995). If PR can be caused by many different factors, interactions between them may matter. These interactions have not been sufficiently investigated (Caverni, 1996). In this sense the most important exception is the work of Goldstein and Einhorn (1987), which proposed distinguishing two dimensions of the procedures involved in classic decision-making tasks: the ‘‘response method’’ (‘‘ . . . what subjects have been asked to do, i.e., choose or judge’’, p. 237) and the ‘‘worth scale’’ (‘‘ . . . the scale the subjects have been asked to do it with, i.e., attractiveness or minimum selling price’’, p. 237). This distinction led them to define different types of PR according to different combinations of the response method and the worth scale. Goldstein and Einhorn’s distinction has recently been further developed by Hsee et al. (1999) to clarify the role of the response method. Hsee et al. suggest that a basic source of differentiation between response methods is due to the degree of comparativeness of the evaluation process induced by the decision procedure. Joint evaluation (JE) occurs when the options are presented simultaneously, and they are easily compared; separate evaluation (SE) corresponds to opposite situations in which options are presented and evaluated one after the other. Hsee et al. show that a shift from JE to SE mode is sufficient to induce PR, with the worth scale constant. They explain the effect by resorting to the notion of evaluability: some attributes are easier to evaluate in isolation, while others will be fully apprehended and appreciated only by comparing the options. Easily evaluable attributes are likely to play a prominent role in SE, while attributes with weaker evaluability will enter the evaluation process only in JE. While Hsee et al. 
carefully restrict their explanation to the case in which the worth scale is held constant, it is tempting to extend their argument and suggest that in many cases the effect might persist over different worth scales (as in the case of classic PR). In particular, we expect that, independently of the worth scale under consideration, when lotteries are presented one by one, the variance of single lotteries is comparatively harder to evaluate than in JE conditions. Thus, variance should affect subjects’ decision making only in the JE mode, when a direct comparison of the variance in lotteries can be made.

¹ By using a procedure involving simultaneous tasks (i.e., choosing and pricing at the same time), which forced subjects to debias their judgements and make explicit their discrepancies, Ordóñez et al. (1995) observed that the phenomenon was significantly reduced. In this paper, we will refer only to the classic conditions (sequential tasks).

In the classic PR experimental


paradigm, subjects choose after the joint presentation of a pair of lotteries, while they evaluate separately the minimum selling price of each lottery. PR might therefore be caused by differences in the relative importance accorded to variance in different conditions. We suggest that when the evaluation mode is constant, classic PR should be considerably weakened or even disappear. Care must be taken in selecting an object for choice simple enough for a stringent formulation of our hypothesis. A lottery gamble is the most classic object in the PR literature. Gambles are quite complex informational objects: they contain information on probabilities and outcomes, and both are needed to compute (even approximately) expected values and variance. This might make gambles ill-suited to a study testing the evaluability hypothesis: too many factors might affect individual evaluation processes. Ganzach (1996), however, has recently introduced simpler gambles with equiprobable outcomes and nearly equal expected values (for the sake of simplicity, we will refer to such gambles as eq-gambles, for equiprobable outcomes and equal expected values). Ganzach showed that the PR phenomenon also holds with eq-gambles (31% of classic PR): although most subjects choose LVGs in which all outcomes are moderate (e.g., win one, and only one, of these five outcomes: $28, $44, $52, $56 or $72), they put a higher price on HVGs, with higher as well as lower outcomes (e.g., win one, and only one, of these five outcomes: $10, $18, $54, $80 or $90). These gambles considerably reduce cognitive difficulties connected to probabilities, leaving all variance to be determined by the values of different outcomes and making the calculation of expected values a relatively simple arithmetic task. We thus decided to use these kinds of gambles in our experiments. 
Applied to eq-gambles, our hypothesis can be stated as follows: since the role of probability is eliminated by the use of equiprobable outcomes, the main sources of differentiation between eq-gambles are the value of each outcome, the most salient outcomes, and the variance (or other risk-related parameters) of outcomes in each gamble. We expect, therefore, that in SE, variance would be the least evaluable attribute of an eq-gamble; in this case, salient values should play a key role in the evaluation process. Earlier experiments suggest that an anchoring and adjustment process should occur for the highest monetary outcomes of each gamble (cf. Slovic & Lichtenstein, 1968; Lichtenstein & Slovic, 1971; Schkade & Johnson, 1989; Ganzach, 1996). Consequently, HVGs that have the highest monetary outcomes should be preferred in SE. In JE, however, it should be fairly easy to compare the variance of gambles. Risk considerations should play a larger role in the evaluation process, and risk-averse behavior (i.e., LVG preference) should be expected to prevail, at least provided that gambles are defined in the domain of gains (Kahneman & Tversky, 1979). We also expect the worth scale to have less effect on evaluability: whether the worth scale is attractiveness or minimum selling price, variance should be the hardest attribute to evaluate, and the JE/SE distinction should dominate over worth-scale differences. Consequently, we predict that while PR will appear whenever JE and SE are compared, more consistent preferences should be revealed when the evaluation mode is held constant, even if different worth scales are compared. Moreover, according to our hypothesis, different kinds of consistency should emerge when different worth scales are compared in constant JE or SE conditions. 
In SE, subjects should consistently tend to prefer HVGs, since variance is hard to evaluate; in JE, variance is more likely to be taken into account, and consistent preferences for LVGs are expected to be the modal response.

Lastly, we also try to investigate process data that could test our proposals. The evaluability hypothesis emphasizes the relative difficulty in evaluating information about different attributes as a source of PR. This should be reflected in the processes by which subjects search and compare information on gambles. A self-paced display time paradigm (SDTP) enables us to follow these processes. Subjects are presented with slots of covered information on a computer screen and asked to uncover the screen by passing the mouse on the slots or by pressing keys. It is thus possible to record how much time subjects spend looking at each stimulus, and what sequence of information slots they go through (cf. Caverni, 1987; Schkade & Johnson, 1989; Payne, Bettman, & Johnson, 1992). If their preference for HVGs depends on an anchoring and adjustment process, subjects should focus more on the highest outcomes, while those consistently preferring LVGs should distribute their time more evenly among the outcomes.


EXPERIMENT 1: JOINT EVALUATION VERSUS SEPARATE EVALUATION WITH ATTRACTIVENESS AND MINIMUM SELLING PRICE WORTH SCALES

The aim of Experiment 1 was to check if: (1) the PR phenomenon still occurs with eq-gambles when making use of the classic elicitation procedures (control group); (2) classic PR is reduced when the same evaluation mode is used for both attractiveness and minimum selling price worth scales (JE and SE groups); and (3) both anchoring and evaluability have an effect on the allocation of attention (control, JE, and SE groups). In each group, every subject evaluated the stimuli under both attractiveness and minimum selling price conditions. Within each group, half of the subjects dealt with the attractiveness condition first, while the other half dealt with the minimum selling price condition first.

Method

Participants

Ninety-six psychology undergraduate students participated: 32 in each group. In the control and JE groups, 20 subjects were from the University of Quebec in Montreal and 12 from Aix-Marseille I University; in the SE group all subjects were from Aix-Marseille I University.

Stimuli

The stimuli were 16 eq-gambles. Each gamble involved four possible outcomes ranging from the lowest to the highest. The outcomes were either in French francs (FF) or in Canadian dollars ($C) according to the subjects’ nationality: the lowest outcome was FF3 ($C1), the highest was FF368 ($C109). The lottery consisted of selecting only one of the four possible outcomes for each gamble. Thus, for each gamble, subjects were sure to win one of the four outcomes, and the probability of winning each of these outcomes was 0.25. For example, if the first outcome was selected, subjects could only win FF7 with a gamble involving the outcomes FF7, FF31, FF56, or FF106 (HVG), while they could win FF34 with a gamble involving the outcomes FF34, FF46, FF53, or FF67 (LVG). If the fourth outcome was selected, subjects could win FF106 with the HVG, while they could only win FF67 with the LVG. Each pair of these gambles included an LVG and an HVG with equal expected values (FF50 ($C15) or FF150 ($C45)). The range of each HVG was always three times the range of the LVG, and the number of outcomes dominating the corresponding outcomes in the other gamble of the same pair was manipulated: for one pair, the LVG had three outcomes higher than the HVG; for a second pair, the LVG had three outcomes lower than the HVG; and for two others, the LVG and HVG had an equal number of dominating outcomes (cf. Table 1).

Material

The experiment was run on a PC, and the SDTP was carried out by a C++ program with sequential information processing. Gambles were presented randomly and masked. In order to look at the outcome of each gamble, subjects had to press the appropriate colored keys on the keyboard, each key corresponding to only one outcome. Each outcome remained visible as long as the subjects pressed the corresponding key. It was impossible to look at two outcomes at the same time, but subjects could look at them individually as many times and for as long as they wanted. Another key was pressed to make the response. When subjects made an outcome visible on the screen, the program recorded which outcome it was and the time spent on it. This material allowed us to study the visualization times for each outcome and the visualization times for each gamble.
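The recording logic described above can be illustrated with a minimal sketch. The original program was written in C++ and its source is not given here, so the event format, the function name `dwell_times`, and the sample log below are all hypothetical. The sketch only shows how per-outcome visualization times, per-gamble visualization time, and the inspection sequence could be derived from key press and release events.

```python
from collections import defaultdict


def dwell_times(events):
    """Sum the time each outcome stayed visible and record the viewing order.

    events: list of (timestamp_seconds, action, outcome_index) tuples, where
    action is "press" or "release" (hypothetical log format). Only one outcome
    can be visible at a time, so each press must be released before the next.
    """
    totals = defaultdict(float)
    sequence = []
    visible = None  # (outcome_index, press_time) while a key is held down
    for t, action, outcome in events:
        if action == "press":
            if visible is not None:
                raise ValueError("two outcomes cannot be visible at once")
            visible = (outcome, t)
            sequence.append(outcome)
        elif action == "release":
            if visible is None or visible[0] != outcome:
                raise ValueError("release without matching press")
            totals[outcome] += t - visible[1]
            visible = None
        else:
            raise ValueError(f"unknown action: {action}")
    return dict(totals), sequence


# Hypothetical log for one gamble: outcome 1, then outcome 4, then outcome 1 again.
log = [
    (0.0, "press", 1), (0.5, "release", 1),
    (0.75, "press", 4), (2.25, "release", 4),
    (2.5, "press", 1), (2.75, "release", 1),
]
per_outcome, order = dwell_times(log)
gamble_time = sum(per_outcome.values())  # visualization time for the gamble
print(per_outcome, order, gamble_time)
```

Summing the per-outcome totals gives the visualization time for the whole gamble, which is the quantity later compared across worth scales.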


Table 1. Stimuli (French francs): eight pairs of eq-gambles, each pair including a high-variance gamble (HVG) and a low-variance gamble (LVG) with four possible outcomes, with their expected value (EV), their range (R), and their dominated outcomes (> or <)

Pair  Gamble  Outcome 1  Outcome 2  Outcome 3  Outcome 4      R   EV
1     HVG         4 <       41 >       61 >       94 >       90   50
      LVG        37         39         57         67         30
2     HVG         7 <       31 <       56 >      106 >       99   50
      LVG        34         46         53         67         33
3     HVG         3 <       46 >       49 <      102 >       99   50
      LVG        31         43         62         64         33
4     HVG         6 <       32 <       39 <      123 >      117   50
      LVG        29         37         66         68         39
5     HVG        13 <      122 >      182 >      283 >      270  150
      LVG       112        117        169        202         90
6     HVG        22 <       92 <      167 >      319 >      297  150
      LVG       103        137        158        202         99
7     HVG        11 <      139 >      142 <      308 >      297  150
      LVG        92        131        186        191         99
8     HVG        17 <       96 <      119 <      368 >      351  150
      LVG        86        114        197        203        117

Note: R = outcome 4 − outcome 1 (with R of HVG / 3 = R of LVG); EV = Σ outcomes / number of outcomes. The sign after each HVG outcome compares it with the corresponding LVG outcome.
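The properties stated for the stimuli (equal expected values within a pair, an HVG range three times the LVG range, and a manipulated number of dominating outcomes) can be checked directly from the Table 1 values. The sketch below is only an illustration: the helper name `summarize` is ours, and the population variance it computes is our own addition, since the table reports ranges rather than variances.

```python
# (pair number, HVG outcomes, LVG outcomes) in French francs, from Table 1
PAIRS = [
    (1, (4, 41, 61, 94), (37, 39, 57, 67)),
    (2, (7, 31, 56, 106), (34, 46, 53, 67)),
    (3, (3, 46, 49, 102), (31, 43, 62, 64)),
    (4, (6, 32, 39, 123), (29, 37, 66, 68)),
    (5, (13, 122, 182, 283), (112, 117, 169, 202)),
    (6, (22, 92, 167, 319), (103, 137, 158, 202)),
    (7, (11, 139, 142, 308), (92, 131, 186, 191)),
    (8, (17, 96, 119, 368), (86, 114, 197, 203)),
]


def summarize(outcomes):
    """Return (expected value, range, population variance) for equiprobable outcomes."""
    n = len(outcomes)
    ev = sum(outcomes) / n
    variance = sum((x - ev) ** 2 for x in outcomes) / n  # not reported in Table 1
    return ev, outcomes[-1] - outcomes[0], variance


rows = []
for number, hvg, lvg in PAIRS:
    ev_h, r_h, var_h = summarize(hvg)
    ev_l, r_l, var_l = summarize(lvg)
    # Number of HVG outcomes that exceed the corresponding LVG outcome.
    hvg_dominating = sum(h > l for h, l in zip(hvg, lvg))
    rows.append((number, ev_h, ev_l, r_h, r_l, hvg_dominating, var_h > var_l))
    print(number, ev_h, ev_l, r_h, r_l, hvg_dominating)
```

Within each expected-value level, one pair has three dominating HVG outcomes, one has a single dominating HVG outcome, and two have two each, which matches the manipulation described in the Stimuli section.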

Procedure

In JE mode (stimuli presented pair by pair), subjects were asked to indicate, for each pair, either the gamble they would prefer in order to participate in the lottery (attractiveness condition of control and JE groups) or the gamble to which they would attribute the highest minimum selling price and consequently give up taking part in the lottery (pricing condition of the JE group). A new pair was presented only after a subject had completed the procedure for the previous pair of gambles. In SE mode (stimuli presented one by one), for each gamble, subjects were asked to indicate either its degree of attractiveness (this procedure is known as the ‘‘rating task’’ in the literature) on a scale from 0 to 10 (‘‘0’’ for ‘‘Not attractive’’ and ‘‘10’’ for ‘‘Very attractive’’) in order to participate in the lottery (attractiveness condition of the SE group) or the minimum selling price they would require to give up taking part in the lottery (pricing condition of control and SE groups). Again, a new pair was presented only after a subject had completed the procedure for the previous pair of gambles. The control group was used in order to check whether the PR phenomenon would still occur in our subject population using the material and procedure in the experiment. We can summarize our experimental design as follows: 3 (groups: control, JE, SE) × 2 (worth scales: attractiveness, minimum selling price) × 2 (orders: attractiveness first, minimum selling price first).

Results

In keeping with the PR literature, a ‘‘preferred gamble’’ is a gamble chosen from a pair presented simultaneously (JE), or a gamble given the higher value when the gambles are presented separately (SE). We undertook two kinds of analysis. One was conducted on preferences, the dependent variable being the percentages of the HVGs preferred out of the total number of preferences expressed by subjects. The other was conducted on information taking, i.e., the time spent visualizing each outcome for each gamble. We used single ANOVAs to analyze data. These revealed no effect of order, expected value, or gamble in any group.
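The definitions above can be made concrete with a short sketch. It is an illustration only, not the authors' analysis code: the function names and the sample subject are hypothetical, and treating a tie in SE as "no preference expressed" is our assumption, since the text does not say how ties were handled.

```python
def preferred_from_values(value_hvg, value_lvg):
    """Separate evaluation: the gamble given the higher value is preferred."""
    if value_hvg == value_lvg:
        return None  # assumption: a tie expresses no preference
    return "HVG" if value_hvg > value_lvg else "LVG"


def response_pattern(attractiveness_pref, pricing_pref):
    """Label one pair of preferences (attractiveness scale, pricing scale)."""
    if attractiveness_pref is None or pricing_pref is None:
        return None
    if attractiveness_pref == pricing_pref:
        return "consistent " + attractiveness_pref
    if attractiveness_pref == "LVG":
        return "classic PR"  # LVG on attractiveness, HVG on pricing
    return "inverse PR"      # HVG on attractiveness, LVG on pricing


def percent_hvg(prefs):
    """Percentage of HVGs preferred among the preferences actually expressed."""
    expressed = [p for p in prefs if p is not None]
    return 100 * sum(p == "HVG" for p in expressed) / len(expressed)


# Hypothetical control-group subject: choices made in JE, prices given in SE.
choices = ["LVG", "LVG", "HVG", "LVG"]
prices = [(60, 45), (50, 50), (70, 40), (30, 48)]  # (HVG price, LVG price)
pricing_prefs = [preferred_from_values(h, l) for h, l in prices]
patterns = [response_pattern(a, p) for a, p in zip(choices, pricing_prefs)]
print(pricing_prefs)
print(patterns)
print(percent_hvg(choices), round(percent_hvg(pricing_prefs), 1))
```

For this invented subject, the first pair is a classic PR response (LVG chosen, HVG priced higher), while the third and fourth pairs are consistent responses.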


Preferences

Table 2 summarizes the results. In the control group, the PR phenomenon was neatly reproduced. In the attractiveness condition, the LVGs were preferred, while in the pricing condition, preferences were reversed: the percentages (64 vs. 34) of HVGs preferred were significantly different (F(1, 31) = 21.52, p < 0.0001)² between the two worth scales. The JE and SE groups, on the other hand, show that consistent patterns of response can be obtained when the response method is held constant by the evaluation mode: the rates of HVGs preferred did not differ significantly between the two worth scales, either in the JE group (F(1, 31) = 2.79, p < 0.1) or in the SE group (F(1, 31) = 0.07, p < 0.79). Moreover, the percentages of HVGs preferred (36 vs. 63 and 46 vs. 62) were significantly different in the JE and SE groups (F(1, 62) = 16.43, p < 0.0001, in the attractiveness condition and F(1, 62) = 4.10, p < 0.04 in the pricing condition), independently of the worth scale. Lastly, these results reveal an evaluation-mode effect (F(1, 188) = 28.32, p < 0.0001) but not a worth-scale effect (F(1, 188) = 1.53, p < 0.3).

Between-subject comparisons of our different groups show the remarkable overall robustness of each experimental outcome. The comparison between control and JE groups shows that while there were no significant differences between the percentages of HVGs preferred (36 vs. 34) in the JE/attractiveness combinations, the percentage (64) in the control/minimum selling price combination was significantly different from the percentage (46) in the JE/minimum selling price combination (F(1, 62) = 5.49, p < 0.02). The comparison between control and SE groups shows a complementary pattern: while there were no significant differences between the percentages (64 vs. 62) of HVGs preferred in the SE/minimum selling price combinations, the percentage (34) in the JE/attractiveness combination was significantly different from the percentage (63) given in the SE/attractiveness combination (F(1, 62) = 20.71, p < 0.00001).

Table 2. Percentages of high-variance gambles (HVGs) preferred (gray), of preference reversal (bold), and of the modal response (*), for each group and each worth scale (attractiveness vs. minimum selling price) in Experiment 1

Control group
                    Minimum selling price
Attractiveness      HVG      LVG      Total
HVG                  23       11         34
LVG                  41*      25         66
Total                64       36        100

Joint evaluation group
                    Minimum selling price
Attractiveness      HVG      LVG      Total
HVG                  21       15         36
LVG                  25       39*        64
Total                46       54        100

Separate evaluation group
                    Minimum selling price
Attractiveness      HVG      LVG      Total
HVG                  40*      23         63
LVG                  22       15         37
Total                62       38        100

² All F-tests were performed at an alpha level of 0.01.
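The percentages quoted in the text follow from the four cells of each cross-tabulation in Table 2. As a hedged illustration (the dictionary layout and the function name `summarize_group` are ours, not the authors'), the marginal HVG rates, the classic and inverse PR rates, and the modal response can be recovered as follows.

```python
# Cell percentages from Table 2, keyed by (attractiveness preference, pricing preference)
TABLE_2 = {
    "control": {("HVG", "HVG"): 23, ("HVG", "LVG"): 11,
                ("LVG", "HVG"): 41, ("LVG", "LVG"): 25},
    "JE": {("HVG", "HVG"): 21, ("HVG", "LVG"): 15,
           ("LVG", "HVG"): 25, ("LVG", "LVG"): 39},
    "SE": {("HVG", "HVG"): 40, ("HVG", "LVG"): 23,
           ("LVG", "HVG"): 22, ("LVG", "LVG"): 15},
}


def summarize_group(cells):
    """Derive marginal and reversal rates from one 2 x 2 table of percentages."""
    return {
        "hvg_attractiveness": cells[("HVG", "HVG")] + cells[("HVG", "LVG")],
        "hvg_pricing": cells[("HVG", "HVG")] + cells[("LVG", "HVG")],
        "classic_pr": cells[("LVG", "HVG")],  # LVG on attractiveness, HVG on pricing
        "inverse_pr": cells[("HVG", "LVG")],  # HVG on attractiveness, LVG on pricing
        "modal": max(cells, key=cells.get),
    }


summary = {group: summarize_group(cells) for group, cells in TABLE_2.items()}
for group, stats in summary.items():
    print(group, stats)
```

Only in the control group is the modal cell the classic PR cell; in the JE and SE groups the modal cell is a consistent one (LVG/LVG and HVG/HVG respectively).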


Further insight can be gained by comparing the rate of classic PR responses in the three groups: the percentages of this inconsistency were significantly lower (25) in the JE group (F(1, 62) = 5.82, p < 0.02) and (22) in the SE group (F(1, 62) = 9.19, p < 0.004), than in the control group (41), while there was no difference (F(1, 62) = 0.40, p < 0.6) between those in the JE and SE groups. Moreover, while the modal pattern of response in the control group was a classic PR response, in the JE and SE groups the modal pattern of response was a consistent response. This demonstrates that the PR phenomenon effectively disappeared.³

Information taking

Process observations provided by the use of the SDTP offer insight into the actual patterns of attention related to subjects’ responses. Former process analyses of PR (Schkade & Johnson, 1989) have shown that different average response times correspond to different worth scales. The minimum selling price condition shows systematically higher response times than the attractiveness condition (choice or rating). We obtained similar findings in our experiment. In all three groups (cf. Table 3) the minimum selling price condition involved visualization times for each gamble (i.e., the sum of visualization times of the four possible outcomes for each gamble) approximately 50% higher than with the attractiveness condition: F(1, 454) = 28.88, p < 0.0001 for the control group, F(1, 508) = 22.53, p < 0.00001 for the JE group, and F(1, 358) = 32.35, p < 0.00001 for the SE group. But the most interesting observations may be made at a finer grain of analysis. The subjects’ attention spans on the outcomes of each gamble varied with the preference for LVGs or HVGs. The pattern of outcome visualization times shows a fairly flat distribution of attention spans when LVGs were preferred, but there was a sharply inclined curve, monotonically increasing in outcome values, when the HVGs were preferred (cf. Figure 1). The relation between preference and visualization times of each outcome is significant (F(3, 3951) = 19.50, p < 0.00001). This effect is observed irrespective of the worth scale and the evaluation mode.⁴
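The size of the worth-scale effect on visualization time can be read off the Table 3 means with simple arithmetic. The sketch below is our own illustration (the dictionary and variable names are hypothetical); taking the means as reported, the increase works out to roughly 65%, 45%, and 70% for the control, JE, and SE groups, which brackets the approximate figure given in the text.

```python
# Mean visualization time per gamble in seconds (Table 3):
# group -> (attractiveness, minimum selling price)
MEAN_TIMES = {
    "control": (4.62, 7.61),
    "JE": (4.74, 6.88),
    "SE": (5.14, 8.75),
}

# Relative increase of the pricing condition over the attractiveness condition.
increase = {
    group: 100 * (pricing - attractiveness) / attractiveness
    for group, (attractiveness, pricing) in MEAN_TIMES.items()
}
for group, pct in increase.items():
    print(f"{group}: pricing takes {pct:.0f}% longer than attractiveness")
```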

Table 3. Mean visualization time (in seconds) for each gamble for each group and each worth scale (attractiveness vs. minimum selling price) in Experiment 1

                         Control group   Joint evaluation group   Separate evaluation group
Attractiveness                4.62               4.74                      5.14
Minimum selling price         7.61               6.88                      8.75

Discussion

These results confirm the occurrence of the PR phenomenon even when highly simplified gambles such as eq-gambles are being evaluated. Furthermore, when both worth scales are combined with the JE mode, the modal response is consistently in favor of LVGs, whereas when both worth scales are combined with the SE mode, modal response behaviors shift towards being consistently for HVGs.⁵

³ Of course, several reversals still occurred, but they were no longer systematic classic PR: there were no significant differences between the rate of classic PR and the rate of inverse PR, or (which is the same thing with such ANOVA analyses) between the rates of HVGs preferred.

⁴ Results did not differ significantly between the three groups; thus, the analysis has been collapsed over this factor.

⁵ The results concerning the preference for HVG in SE/attractiveness are very different from those observed in the literature (rating task usually involving a strong preference for LVG options), but very similar to those obtained by Ganzach (1996). Ganzach explains these results by the fact that when both probabilities and outcomes have to be taken into account in the evaluation processes, the outcome dimension dominates in pricing while the probability dimension dominates in rating; but, when only outcomes have to be taken into account (as with eq-gambles), rating and pricing provide the same preferences. We concur with this interpretation.


Figure 1. Graph of interaction of mean visualization times (in seconds) of each outcome, between preference and outcome, for all groups of Experiment 1. Note: Outcomes 1 to 4 run from the lowest to the highest outcome; LVG and HVG denote low- and high-variance gambles

Our results suggest that when subjects take into account all dimensions of the stimuli (i.e., outcome variance and the value of each outcome), they prefer low-variance options. But when subjects take into account only one dimension of the stimuli (i.e., the value of each outcome), they prefer high-variance options. This is consistent with the evaluability hypothesis (Hsee et al., 1999), and fits nicely with the anchoring explanation for the classic PR phenomenon, originally proposed by Slovic and Lichtenstein (1968). We would add, however, that rather than suggesting that only the highest value is considered, our data reveal a smoother process of attraction towards high-value outcomes. Consequently, when the response method involves comparison between stimuli (and thus the outcome variance is easier to take into account, having a greater evaluability), low-variance options are preferred. Conversely, high-variance options are preferred when the response method does not allow comparisons between stimuli (involving the use of anchoring and adjustment processes). Accordingly, the PR phenomenon observed in the control group disappears in the experimental groups, independently of the worth scale. The minor role of the worth scale in the presence of the PR phenomenon is further demonstrated by the fact that visualization times of each gamble present similar patterns for the three groups. It could be argued that, as far as outcome visualization times are concerned, our main hypothesis should reveal a threefold interaction between outcome, preference, and evaluation mode. Although we agree that this threefold interaction might be expected, the variability of responses in each evaluation mode (i.e., PR, inverse PR, or consistent behaviors) did not yield significant results: in both JE and SE, all behaviors were observed, even when there was a modal response. However, we did observe an interaction in visualization times between outcome and preference (cf. 
Figure 1) indicating that the stronger preference for HVGs in SE can be explained by the fact that subjects are more liable to take into account the larger outcome (i.e., using an anchoring process) in this evaluation mode than in JE (when they are more liable to take into account the variance). Lastly, we observe that the average time spent on visualizing each gamble is strikingly similar for JE and SE groups with the attractiveness condition. This seems to support, at least at this level of aggregation, assigning both classic choices to a single attractiveness-worth scale type, as Goldstein and Einhorn (1987) have done. We must stress that we are not claiming that choice and rating are the same kind of activity (see Schkade & Johnson, 1989; Mellers et al., 1992, for some strong negative evidence). What we are suggesting is that it can justifiably be claimed that subjects resort to similar worth scales, implying similar attention spans.


EXPERIMENT 2: EXPLICIT COMPARISON VERSUS NON-EXPLICIT COMPARISON IN JOINT EVALUATION

Our second experiment was designed to provide an answer to a possible objection to Experiment 1, while refining the evaluability hypothesis. A possible weakness in the design of Experiment 1 concerns the evaluation of the minimum selling price in the JE group. In this condition, subjects were asked to indicate the gamble for which they would ask the highest minimum selling price for each pair of gambles. It might be objected that this kind of task does not necessarily require that subjects actually estimate the minimum selling prices of both gambles: they might just answer by indicating one gamble without using a numerical scale. In that case the task would be much closer to the attractiveness condition than to the pricing condition. The consistent behavior observed in the JE group might thus reflect a design fault. We thus designed a variation on the original JE group in which subjects, besides evaluating attractiveness as in Experiment 1, would have to indicate the gamble for which they would ask the highest minimum selling price in each pair of gambles. Moreover, they were also asked to indicate the minimum selling price for each gamble. Our hypothesis was, of course, that the behavior in this case would be very similar to that observed in the JE group in Experiment 1. We labeled this group the ‘‘JE-explicit comparison group.’’ At the same time, we wished to refine our understanding of the evaluability hypothesis. In particular, we were interested in assessing how much the evaluability effect is due to an evaluation mode effect (JE vs. SE) or to the comparative nature of the task. We thus presented two tasks to a second group of subjects, maintaining the JE mode. 
The first included the attractiveness condition of Experiment 1, while in the second task, subjects were asked to indicate the minimum selling price for each gamble (numerical scale only) in each pair of gambles, without having to make any explicit comparison between gambles. We labeled this group the ‘‘JE-non-explicit comparison group.’’ Our hypothesis was that in the absence of an explicitly comparative task, the effects of the JE mode would be significantly weakened.

Method The stimuli (only in FF) and the material were the same as in Experiment 1. Each subject evaluated the stimuli in both attractiveness and minimum selling price conditions.

Participants
Sixty psychology undergraduate students at Aix-Marseille I University participated (28 in the JE-explicit comparison group and 32 in the JE-non-explicit comparison group). Within each group, half of the subjects completed the attractiveness condition first, while the other half completed the minimum selling price condition first.
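
The counterbalancing of task order described above can be sketched as follows. This is only an illustration: the text does not say how subjects were allocated to the two orders, so the random allocation, the subject identifiers, and the function name `assign_task_order` are assumptions.

```python
import random


def assign_task_order(subject_ids, seed=0):
    """Split a group so that exactly half start with each worth scale.

    Random allocation is an assumption made for this sketch; the paper
    only states that each order was used by half of the subjects.
    """
    ids = list(subject_ids)
    if len(ids) % 2:
        raise ValueError("an even group size is needed for an exact half/half split")
    rng = random.Random(seed)  # seeded so the allocation is reproducible
    rng.shuffle(ids)
    half = len(ids) // 2
    order = {}
    for position, sid in enumerate(ids):
        if position < half:
            order[sid] = ("attractiveness", "minimum selling price")
        else:
            order[sid] = ("minimum selling price", "attractiveness")
    return order


# Group sizes taken from the Participants paragraph (28 and 32 subjects);
# the numeric identifiers are hypothetical.
explicit = assign_task_order(range(28), seed=1)
non_explicit = assign_task_order(range(28, 60), seed=2)
```

With the reported group sizes, this gives 14 and 16 subjects per order in the two groups, respectively.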

Procedure
The attractiveness condition in the JE-explicit and JE-non-explicit comparison groups was the same as in the control and JE groups in Experiment 1. In the pricing condition of the JE-explicit comparison group (stimuli presented pair by pair), subjects were asked, for each pair, to indicate both the gamble to which they would attribute the highest minimum selling price to give up taking part in the lottery and the minimum selling price for each gamble. In the pricing condition of the JE-non-explicit comparison group (stimuli presented pair by pair), subjects were asked, for each gamble, to indicate only the minimum selling price that they would require to give up taking part in the lottery.
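
To make the response categories concrete, the sketch below shows one way a pair of responses could be coded. It is an illustration, not the authors' scoring procedure: the function names, the label "reverse PR" for the non-classic inconsistency, and the treatment of equal prices as undetermined are all assumptions. In the JE-non-explicit comparison group, the pricing preference is not stated directly and would have to be derived from the two prices, as `price_preference` does here.

```python
def price_preference(price_hvg, price_lvg):
    """Return the gamble given the higher minimum selling price (None on a tie)."""
    if price_hvg > price_lvg:
        return "HVG"
    if price_lvg > price_hvg:
        return "LVG"
    return None  # assumption: equal prices express no preference


def classify(attractiveness_pref, price_pref):
    """Code one gamble pair from the preference expressed on each worth scale."""
    if attractiveness_pref is None or price_pref is None:
        return "undetermined"
    if attractiveness_pref == price_pref:
        return f"consistent {price_pref}"
    if attractiveness_pref == "LVG" and price_pref == "HVG":
        # LVG judged more attractive, yet HVG priced higher
        return "classic PR"
    return "reverse PR"  # hypothetical label for the opposite inconsistency


# Hypothetical responses: LVG rated more attractive, HVG priced at 12 vs. 8.
print(classify("LVG", price_preference(12.0, 8.0)))
```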


Results
Preferences
Table 4 summarizes the results. On the one hand, as expected, the JE-explicit comparison group behaves just like the corresponding JE group in Experiment 1, exhibiting consistent behavior as the modal pattern of response: there was no significant difference between the percentages (43 vs. 35) of HVGs preferred on the two worth scales (F(1, 27) = 2.15, p < 0.15). On the other hand, in the JE-non-explicit comparison group, the PR phenomenon reappears when the comparative aspects of the task are only implicit: the percentages (63 vs. 33) of HVGs preferred differed significantly between the two worth scales (F(1, 31) = 17.50, p < 0.0002).

Between-subject comparisons of the different groups in Experiment 2 again show that each observation is robust. Comparing the JE-non-explicit comparison and JE-explicit comparison groups, there was no significant difference between the percentages (33 vs. 35) of HVGs preferred in the attractiveness conditions, whereas the percentage (63) in the JE-non-explicit comparison/minimum selling price combination was significantly different from the percentage (43) in the JE-explicit comparison/minimum selling price combination (F(1, 58) = 5.95, p < 0.02), and also from the percentage (46) observed in the JE group (F(1, 62) = 4.32, p < 0.05). However, there were no significant differences between the percentages of HVGs preferred in the JE-non-explicit comparison and control groups, or in the JE-explicit comparison and JE groups. The comparison of the control and JE-explicit comparison groups shows a complementary pattern: while there was no significant difference between the percentages (34 vs. 35) of HVGs preferred in the attractiveness conditions, the percentage (64) in the minimum selling price condition of the control group was significantly different from the percentage (43) in the JE-explicit comparison/minimum selling price combination (F(1, 58) = 7.56, p < 0.01).

Comparing the rates of classic PR responses across groups yields the same picture: the percentage of this inconsistency was significantly lower in the JE-explicit comparison group than in the JE-non-explicit comparison group (25 vs. 42; F(1, 58) = 6.97, p < 0.01). It was also significantly lower than in the control group (25 vs. 41; F(1, 58) = 6.06, p < 0.02). There were no significant differences, however, between the classic PR percentages of the JE-explicit comparison group and those of the JE or SE groups, nor between those of the JE-non-explicit comparison and control groups, demonstrating that the PR phenomenon also effectively disappears in this experiment (footnote 6).

Table 4. Percentage of high-variance gambles (HVGs) preferred (HVG row and column totals; gray in the original), of classic preference reversal (attractiveness LVG / minimum selling price HVG cell; bold in the original), and of the modal response (*), for each group and each worth scale (attractiveness vs. minimum selling price) in Experiment 2

Joint evaluation-explicit comparison group

                    Minimum selling price
Attractiveness      HVG      LVG      Total
HVG                 18       17       35
LVG                 25       40*      65
Total               43       57       100

Joint evaluation-non-explicit comparison group

                    Minimum selling price
Attractiveness      HVG      LVG      Total
HVG                 21       12       33
LVG                 42*      25       67
Total               63       37       100

Footnote 6. Cf. footnote 3.
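
As a cross-check, the percentages quoted in the text can be recomputed from the four cells of each panel of Table 4. The sketch below assumes a particular reading of the table (rows are the attractiveness preference, columns are the minimum selling price preference); the dictionary layout and the function name `summarize` are illustrative choices, not part of the original analysis.

```python
# Cell percentages keyed by (attractiveness preference, price preference).
# The row/column assignment is an assumed reading of Table 4.
TABLE4 = {
    "JE-explicit": {
        ("HVG", "HVG"): 18, ("HVG", "LVG"): 17,
        ("LVG", "HVG"): 25, ("LVG", "LVG"): 40,
    },
    "JE-non-explicit": {
        ("HVG", "HVG"): 21, ("HVG", "LVG"): 12,
        ("LVG", "HVG"): 42, ("LVG", "LVG"): 25,
    },
}


def summarize(cells):
    """Recompute marginal HVG percentages, the classic PR rate, and the modal cell."""
    return {
        "hvg_attractiveness": sum(v for (a, _), v in cells.items() if a == "HVG"),
        "hvg_price": sum(v for (_, p), v in cells.items() if p == "HVG"),
        "classic_pr": cells[("LVG", "HVG")],  # LVG more attractive, HVG priced higher
        "modal": max(cells, key=cells.get),
    }


for group, cells in TABLE4.items():
    print(group, summarize(cells))
```

Under this reading, the recomputed marginals are 35 and 43 for the JE-explicit comparison group and 33 and 63 for the JE-non-explicit comparison group, which matches the percentages reported in the text. The modal cell is the consistent LVG/LVG pattern in the first group and the classic PR pattern in the second.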


Table 5. Mean visualization time (in seconds) for each gamble for each group and each worth scale (attractiveness vs. minimum selling price) in Experiment 2

                         JE-explicit comparison group    JE-non-explicit comparison group
Attractiveness           4.83                            5.01
Minimum selling price    10.21                           11.76

Figure 2. Graph of the interaction between preference and outcome on mean visualization times (in seconds) for each outcome, for all groups of Experiment 2. Note: Outcomes 1 to 4 = from the lowest to the highest outcome; LVG and HVG = low- and high-variance gamble

Information taking
As regards information taking, we find the same effects as in Experiment 1. First, concerning global visualization times: in both groups (F(1, 448) = 78.89, p < 0.00001, for the JE-explicit comparison group and F(1, 442) = 77.53, p < 0.00001, for the JE-non-explicit comparison group), visualization times in the minimum selling price condition were significantly, approximately 50%, higher than in the attractiveness condition (cf. Table 5). Second, concerning the time spent on each outcome: when subjects preferred LVGs, each outcome was visualized with a similar attention span, whereas we observed an increasing focalization on the highest values when subjects preferred HVGs (cf. Figure 2); the interaction between preference and outcome visualization time was significant (F(3, 2625) = 16.36, p < 0.00001).

GENERAL DISCUSSION
By showing both the presence of PR and its disappearance, our results support the hypothesis that the shift in the evaluation mode can be the major cause of this phenomenon, even across different worth scales, and thus suggest that the worth scales play only a minor role. Furthermore, the results show that, when influenced by response method, different individual preferences also have an impact on the distribution of attention, as revealed by visualization times. This confirms the usefulness of the SDTP, even beyond the pioneering results already obtained by Schkade and Johnson (1989), who found weaker relations between process recording measures and PR. We suggest that the gamble type used in our experiments (eq-gamble) might be responsible for this clearer evidence.

These observations reveal the effect of the evaluation mode (JE vs. SE) and are consistent with the evaluability hypothesis (Hsee et al., 1999): the way subjects evaluate stimuli is influenced by the information given and also by how they have to combine it. Furthermore, these results show that the anchoring process actually occurs, and suggest that its occurrence is related to the difficulty of evaluating variance: we noted that steep attention curves were frequent when HVGs were preferred. This observation supports the hypothesis that the anchoring process is facilitated by SE, independently of the worth scale. But although the anchoring process plays an important role in the occurrence of the PR phenomenon, it is not its ultimate cause. Indeed, as we saw in the SE group, when two different worth scales are used in the same SE mode, anchoring occurs with both worth scales and the modal response type is consequently a consistent preference for HVGs. The PR phenomenon thus disappears.

Experiment 2 reveals, however, that in JE the mere joint availability of information on pairs of options may not suffice to trigger the use of less evaluable attributes, since subjects may still ignore comparable information. The actual comparison needs to be made explicit in the JE task. This suggests that requirements on real comparability of pairs of options may be stricter than in Hsee et al. (1999), emphasizing that it is the actual presence or absence of a comparing process, rather than its possibility, which is responsible for classic PR.
It might be objected, on the other hand, that in the JE-explicit comparison group subjects could have chosen the gamble they preferred and then indicated prices justifying their choice; in this case, pricing would just be an ex-post rationalization of their choice. Analyses of visualization times, however, show that when the explicit comparison is primed (reflecting a preference for LVGs), subjects distribute their attention among outcomes much more evenly than in the "non-explicit" case. This is consistent with the evaluability hypothesis. Process data from Experiment 1 clearly suggest that the distribution of attention is related to the evaluation mode rather than to choice vs. pricing, lending support to our interpretation of the results from Experiment 2.

Lastly, Goldstein and Einhorn (1987) distinguish three stages of response processes (encoding, evaluation, and expression) and attribute the most critical role in determining classic PR to the last stage. With regard to this analysis, the relation between individual preferences and attentional patterns observed in our experiments suggests that much happens in the early stages. Once more, we suggest a cautious interpretation of our results. In particular, when more complex stimuli are used, the expression stage might be more relevant and even have significant feedback on the encoding and evaluation stages. This might explain some significant divergences between our results (and also those of Hsee et al., 1999) and the original findings of Goldstein and Einhorn (1987). For example, even with the same response method, Goldstein and Einhorn found patterns of response contrasting with those predicted and obtained by Hsee et al. and in our experiments.

CONCLUSION
The evaluability and anchoring hypotheses seem to be good explanations for the occurrence of classic PR with eq-gambles. The classic conditions used to elicit preferences reveal the PR phenomenon because different evaluation modes (JE vs. SE) are offered to subjects in each condition. Thus, when we ask them to compare gambles (JE mode), they can take into account all the dimensions of the stimuli. This is not the case when they evaluate gambles one by one (SE mode) and then focus on the high outcomes through an anchoring process. One important consequence is that different worth scales (attractiveness vs. minimum selling price) have no influence on the subjects’ preferences, even if visualization times reveal different processes for each worth scale.


A final consideration on our findings leads us to extend a doubt already implicit in the notion of evaluability: i.e., does the classic PR phenomenon genuinely occur with the construction of individual preferences, or does it reflect (at least in many cases) the use of different information in the response processing? The latter hypothesis is reinforced by the fact that the PR phenomenon tends to become less frequent with appropriate manipulations of the evaluation mode. If this is the case, the PR phenomenon may turn out to be an epiphenomenon of underlying information processing rather than a true phenomenon of cognitive inconsistency.

ACKNOWLEDGMENTS
We thank Nicolas Lipari for his C++ program conception, and David Kerr, Olivier Crémieux, and Héloïse Joly for their very helpful reading.

REFERENCES
Camerer, C. (1995). Individual decision making. In J. H. Kagel, & A. E. Roth (Eds.), The handbook of experimental economics (pp. 340–375). Princeton: Princeton University Press.
Caverni, J.-P. (1987). Self-paced display time for process-tracing in assessment of acquired knowledge. European Bulletin of Cognitive Psychology, 7, 633–651.
Caverni, J.-P. (1996). How to better understand the processes underlying the so-called "preference reversal phenomenon" if there is any reversal phenomenon? Journal of Behavioral Decision Making, 9, 111.
Ganzach, Y. (1996). Preference reversals in equal-probability gambles: a case for anchoring and adjustment. Journal of Behavioral Decision Making, 9, 95–109.
Goldstein, W. M., & Einhorn, H. J. (1987). Expression theory and the preference reversal phenomena. Psychological Review, 94, 236–254.
Grether, D. M., & Plott, C. R. (1979). Economic theory and the preference reversal phenomenon. American Economic Review, 69, 623–638.
Hsee, C. K., Loewenstein, G. F., Blount, S., & Bazerman, M. H. (1999). Preference reversals between joint and separate evaluations of options: a review and theoretical analysis. Psychological Bulletin, 125, 576–590.
Kahneman, D., & Tversky, A. (1979). Prospect theory: an analysis of decision under risk. Econometrica, 47, 263–291.
Lichtenstein, S., & Slovic, P. (1971). Reversal of preference between bids and choices in gambling decisions. Journal of Experimental Psychology, 89, 46–55.
Mellers, B. A., Chang, S.-J., Birnbaum, M. H., & Ordóñez, L. D. (1992). Preferences, prices and ratings in risky decision making. Journal of Experimental Psychology: Human Perception and Performance, 18, 347–361.
Ordóñez, L. D., Mellers, B. A., Chang, S.-J., & Roberts, J. (1995). Are preference reversals reduced when made explicit? Journal of Behavioral Decision Making, 8, 265–277.
Payne, J. W., Bettman, J. R., & Johnson, E. J. (1992). Behavioral decision research: a constructive processing perspective. Annual Review of Psychology, 43, 87–131.
Pommerehne, W., Schneider, F., & Zweifel, P. (1982). Economic theory of choice and the preference reversal phenomenon: a reexamination. American Economic Review, 72, 569–574.
Reilly, R. J. (1982). Preference reversal: further evidence and some suggested modifications in experimental design. American Economic Review, 72, 576–584.
Schkade, D. A., & Johnson, E. J. (1989). Cognitive processes in preference reversals. Organizational Behavior and Human Decision Processes, 44, 203–231.
Slovic, P., & Lichtenstein, S. (1968). Relative importance of probabilities and payoffs in risk taking. Journal of Experimental Psychology Monographs, 78, 165–182.
Slovic, P., & Lichtenstein, S. (1983). Preference reversals: a broader perspective. American Economic Review, 73, 596–605.
Tversky, A., Slovic, P., & Kahneman, D. (1990). The causes of preference reversal. American Economic Review, 80, 204–217.


Authors’ biographies: Christophe Schmeltzer has a PhD in psychology from the Aix-Marseille I University. His research interests are primarily in the area of judgment, decision making, and reasoning, with a special interest in judgment bias. Jean-Paul Caverni is a professor in cognitive/experimental psychology. He is the head of a research team on the study of the inferential processes in reasoning and decision making. He is currently working on hypothetico-deductive and probabilistic reasoning. Massimo Warglien is a professor of information and decision making at the Ca’ Foscari University of Venice, and a research director at the Cognitive Science Lab of Rovereto. Current research interests: neural networks and learning in games, economics of language; short-term memory capacity and decision-making under risk.

Authors’ addresses: Christophe Schmeltzer and Jean-Paul Caverni, Université d’Aix-Marseille I, Laboratoire de Psychologie Cognitive, Case 66, 3 Place Victor Hugo, F-13331 Marseille Cedex 3, France. Massimo Warglien, Università Ca’ Foscari di Venezia, Dipartimento di Economia e Direzione Aziendale, Dorsoduro 1075, 30123 Venezia, Italia.
