Lectures on Industrial Organization Ken Hendricks August 19, 2003

1

Oligopoly Pricing in Homogenous Markets.

An oligopoly is an industry with a small number of competing firms (greater than one), where "small" means that the decisions of each firm have a nonnegligible effect on the others, so that firms cannot ignore this effect in their considerations. In this lecture, we consider the following question: how are prices and output determined when there are a small number of firms producing a homogenous (identical) product? The monopoly models feature a simple decision problem, namely, optimization in environments that react in a very predictable way. The modeling of oligopoly is conceptually more complicated, since when an oligopolist considers an action, such as announcing a price or introducing a new product, it has to consider how its competitors might react. It is not reasonable to suppose that they will not react or that their reactions are fully predictable. Although the situations in which we are interested are inherently dynamic (i.e., repeated play), we initially look at a much simpler static model. The hope is that it will give us some insights into this complicated problem.

1.1

Cournot Model

We present the Cournot model formally as a game in normal form. A game in normal form consists of three elements: player set, strategies, and payoffs.
• Player set: $i = 1, 2, \dots, n$.
• Strategy for firm i: $y_i \in [0, \infty)$.
• Payoffs for firm i:

$$\pi_i(y_i, Y_{-i}) = P(y_i + Y_{-i})\,y_i - C(y_i),$$

where $y_{-i} = (y_1, \dots, y_{i-1}, y_{i+1}, \dots, y_n)$ and $Y_{-i} = \sum_{j \neq i} y_j$.

Interpretation: firms decide simultaneously what quantity to produce and supply to the market. Given the amounts supplied, price adjusts so as to clear the market. This model of price formation makes some sense for agricultural commodities, where farmers commit to production before prices are determined, or for resources such as oil. In Cournot's original example, the commodity was spring water.

To solve this game, we need a solution concept. Clearly, each firm should be optimizing given its belief about the other firms' actions. But to say anything about what is likely to happen in this market we need to posit beliefs for each firm. Can we say anything about these beliefs? In general, no. However, we can suggest a criterion that under certain circumstances makes some sense.

Definition 1 A Nash Equilibrium is a profile $y^* = (y_1^*, \dots, y_n^*)$ satisfying

$$\pi_i(y_i^*, y_{-i}^*) \geq \pi_i(y_i, y_{-i}^*)$$

for any $y_i \in [0, \infty)$, $i = 1, \dots, n$.

Interpretation: each firm is behaving optimally given its conjecture about its rivals' choice of quantity and, in equilibrium, their conjectures are correct. An alternative, more useful definition of Nash equilibrium is based on best reply functions. Let $R_i(y_{-i})$ denote firm i's best reply to its rivals' output choices $y_{-i}$. If $R_i$ is single-valued for all $y_{-i}$ then $R_i$ is a function; otherwise it is a correspondence. (See Fudenberg and Tirole, Game Theory, for details or Dasgupta and Maskin (1986).)

Definition 2 A Nash Equilibrium is a profile $y^* = (y_1^*, \dots, y_n^*)$ satisfying

$$y_i^* \in R_i(y_{-i}^*)$$

for $i = 1, \dots, n$.

This definition provides an algorithm for finding Nash equilibrium points. It is also useful for proving existence results, since $y^*$ represents a fixed point of $R$, where $R$ is the Cartesian product of the $R_i$ (where $R_i$ needs to be extended trivially to include $y_i$ even though it depends only upon $y_{-i}$). Existence of Nash equilibrium then reduces to checking that $R_i$ meets the conditions of a fixed point theorem.

Example 1 Assume demand is given by $P(y) = a - by$, $y = y_1 + \dots + y_n$, and technology by $C(y_i) = cy_i$, $i = 1, \dots, n$. Now consider the individual firm's optimization problem. When firm 1 contemplates what to supply, it has to think about what the other firms might supply. Let $Y_{-1}$ denote the combined output of all firms other than firm 1. Then its problem, anticipating the price formation process, is to find $y_1$ that maximizes its profit

$$\pi_1(y_1, Y_{-1}) = [a - b(y_1 + Y_{-1}) - c]\,y_1.$$

The solution to this problem is $y_1 = (a - c - bY_{-1})/(2b)$. The above relationship is known as firm 1's best reply. The best replies of the other firms are similar. Mathematically, one has n equations to solve for n unknowns. Clearly, given the symmetry of demand and costs, a symmetric solution exists in which each firm produces $y^*$, where

$$y^* = [a - b(n - 1)y^* - c]/(2b),$$

which yields the solution $y^* = (a - c)/[b(n + 1)]$. With linear demand and costs, it is not hard to show that this solution is the only solution. Note that, if there are only two firms, the Nash equilibrium can be represented geometrically as an intersection point of the two best reply curves.

How might the firms arrive at the Nash solution? If firms reach a non-binding agreement prior to production that these are the quantities they will produce, then the agreement will be self-enforcing (i.e., each firm will want to honor it if it believes that the other will). Such a situation may also be reached through repeated interaction, where firms come to expect each other to produce these quantities because of past experience. It may also be possible that each firm arrives at its conjecture by solving its own and its rival's problems and recognizing that the above solution is the only pair of quantities in which conjectures are mutually consistent.

1.1.1

Asymmetric Cournot Duopoly

Suppose $n = 2$ and $C_i(y_i) = c_i y_i$. The pair of best reply curves is given by

$$y_1 = (a - c_1 - by_2)/(2b), \qquad y_2 = (a - c_2 - by_1)/(2b),$$

from which it follows that

$$y_1^* = (a + c_2 - 2c_1)/(3b), \qquad y_2^* = (a + c_1 - 2c_2)/(3b).$$

1.1.2

Properties of the Cournot solution

The first-order condition determining firm i's best reply to $Y_{-i}$, evaluated at the equilibrium, can be expressed as follows:

$$[P(Y^*) - c_i]/P(Y^*) = s_i/\eta(Y^*).$$

Here $Y^* = \sum_{i=1}^{n} y_i^*$ is total output, $s_i = y_i^*/Y^*$ is firm i's market share, and $\eta(Y^*)$ is the absolute value of the elasticity of market demand evaluated at the equilibrium output (price). Several observations follow immediately from the above equation:

1. Oligopolists in the Cournot solution exercise market power since price is above marginal cost.
2. Market power is limited by the market elasticity. The more elastic demand, the less the mark-up.
3. Firms with lower marginal costs will have greater market shares.
4. The greater the number of competitors, the smaller is each firm's market share and the less its market power.
5. In the symmetric case, the solution lies between the competitive equilibrium and monopoly in terms of quantities, prices, profits and surplus.
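The closed-form results of this section can be checked numerically. The following is a minimal sketch, assuming the linear specification of Example 1 with firm-specific unit costs; the parameter values and the function names (`best_reply`, `cournot_equilibrium`, `markup_check`) are illustrative and not part of the lecture. Firms are updated one at a time, which is one simple way to search for a fixed point of the best replies.

```python
# Linear Cournot model with P(Y) = a - b*Y and C_i(y_i) = c_i * y_i.
# All numerical parameter values are illustrative assumptions.

def best_reply(a, b, c_i, rivals_output):
    """Best reply y_i = (a - c_i - b*Y_{-i}) / (2b), truncated at zero."""
    return max(0.0, (a - c_i - b * rivals_output) / (2.0 * b))


def cournot_equilibrium(a, b, costs, tol=1e-12, max_iter=10_000):
    """Approximate a fixed point of the best replies by updating firms in turn."""
    y = [0.0] * len(costs)
    for _ in range(max_iter):
        largest_change = 0.0
        for i, c_i in enumerate(costs):
            new_y = best_reply(a, b, c_i, sum(y) - y[i])
            largest_change = max(largest_change, abs(new_y - y[i]))
            y[i] = new_y
        if largest_change < tol:
            break
    return y


def markup_check(a, b, costs, y):
    """Return (Lerner index, s_i / eta) for each firm at the outputs y."""
    total = sum(y)
    price = a - b * total
    eta = price / (b * total)  # |elasticity| of linear demand at (price, total)
    return [((price - c_i) / price, (y_i / total) / eta)
            for c_i, y_i in zip(costs, y)]


a, b = 10.0, 1.0
symmetric = cournot_equilibrium(a, b, [1.0, 1.0, 1.0])  # compare with (a - c) / (b(n + 1))
asymmetric = cournot_equilibrium(a, b, [1.0, 2.0])       # compare with the duopoly formulas
for lerner, share_over_eta in markup_check(a, b, [1.0, 2.0], asymmetric):
    print(round(lerner, 6), round(share_over_eta, 6))
```

In the asymmetric run the two printed numbers on each line should coincide, which is the markup condition above, and the lower-cost firm should have the larger output.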

1.2

The Bertrand Duopoly Model

Let $D(p)$ denote demand in the market when price is p. In the Bertrand model, the normal form of the game is given as follows:
• Player set: two firms, indexed by $i = 1, 2$.
• Strategy for firm i: $p_i \in [0, \infty)$.
• Payoffs for firm i:

$$\pi_i(p_i, p_j) = \begin{cases} p_i D(p_i) - C(D(p_i)) & \text{if } p_i < p_j \\ p_i D(p_i)/2 - C(D(p_i)/2) & \text{if } p_i = p_j \\ 0 & \text{if } p_i > p_j \end{cases}$$

Interpretation: two (or more) firms decide simultaneously what price to charge and produce to demand. Consumers know the prices charged and buy from the firm that charges the lowest price. Each firm is assumed to have sufficient capacity to service all of the market demand at its quoted price. The Bertrand model captures an important feature of many markets, namely, that firms set prices.

Definition 3 A Nash Equilibrium is a pair of prices $(p_1^*, p_2^*)$ satisfying

$$\pi_i(p_i^*, p_j^*) \geq \pi_i(p_i, p_j^*)$$

for all $p_i \in [0, \infty)$, $i \neq j$, $i, j = 1, 2$.

Remark 1 This is an example of a game where payoffs are discontinuous on the diagonal (strategy space for the game is [...]

(a) Suppose $p_1^* > p_2^* > c$. Then $\pi_1 = 0$. But, if firm 1 sets $p_1 = p_2^* - \varepsilon$, it can earn profits $(p_2^* - \varepsilon - c)D(p_2^* - \varepsilon) > 0$ for $\varepsilon$ sufficiently small. Contradiction.

(b) Suppose $p_1^* = p_2^* > c$. Then $\pi_1 = (p_1^* - c)D(p_1^*)/2$. But, by lowering its price slightly, firm 1 can earn $(p_1^* - c - \varepsilon)D(p_1^* - \varepsilon)$, which exceeds profits at $p_1^*$ for $\varepsilon$ sufficiently small. Contradiction.

(c) Suppose $p_1^* > p_2^* = c$. Then $\pi_i = 0$ for $i = 1, 2$. But, by raising its price slightly, firm 2 can earn $\varepsilon D(c + \varepsilon) > 0$ for $\varepsilon$ sufficiently small that $c + \varepsilon < p_1^*$. Contradiction.

(d) This leaves only the possibility that $p_1^* = p_2^* = c$. Neither firm can make a profitable deviation, so this is an equilibrium. Q.E.D.
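The deviation arguments in cases (b) and (d) of the proof can be illustrated numerically. The sketch below assumes constant unit cost c and a linear demand D(p) = max(0, 1 - p); both the demand curve and the value c = 0.2 are illustrative choices, and `bertrand_profit` and `best_deviation_gain` are hypothetical helper names. Deviations are searched over a finite price grid, so this is a check on a grid rather than a proof.

```python
def demand(p):
    """Illustrative linear demand D(p) = max(0, 1 - p) (an assumption)."""
    return max(0.0, 1.0 - p)


def bertrand_profit(p_i, p_j, c):
    """Payoff to firm i with constant unit cost c under the Bertrand sharing rule."""
    if p_i < p_j:
        return (p_i - c) * demand(p_i)       # lowest price serves the whole market
    if p_i == p_j:
        return (p_i - c) * demand(p_i) / 2.0  # equal prices split the market
    return 0.0                                # the higher price sells nothing


def best_deviation_gain(p_i, p_j, c, grid_size=2001):
    """Largest gain firm i can obtain by moving to any price on a grid of [0, 1]."""
    current = bertrand_profit(p_i, p_j, c)
    grid = [k / (grid_size - 1) for k in range(grid_size)]
    return max(bertrand_profit(p, p_j, c) for p in grid) - current


c = 0.2
gain_at_cost = best_deviation_gain(c, c, c)          # case (d): both price at cost
gain_above_cost = best_deviation_gain(0.5, 0.5, c)   # case (b): both price above cost
print(gain_at_cost, gain_above_cost)
```

A non-positive gain at (c, c) is consistent with case (d), and a strictly positive gain at a common price above cost reflects the undercutting incentive in case (b).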

This result is known as the Bertrand paradox: "One is monopoly, two is perfect competition". It would seem to contradict observation in two ways. First, in markets with few sellers, firms typically do not sell at marginal cost. Possible exceptions: pricing of air travel in a city-pair market, long distance rates. Second, even in periods of technological and demand stability, prices in oligopolistic markets are not always stable. Prices may fluctuate, sometimes wildly (e.g., retail gasoline markets).

1.2.1

Asymmetric Duopoly

Suppose the unit costs of the firms are $c_1$ and $c_2$ with $c_1 < c_2 < p^M(c_1)$, where $p^M(c_1)$ is the monopoly price of a firm with unit cost $c_1$. Assuming prices are denominated in pennies, the undercutting logic yields a Nash equilibrium in which $p_1 = c_2 - 0.01$ (one penny below $c_2$) and $p_2 = c_2$. [Prove this claim.] That is, the lower cost firm prices just below the unit cost of the higher cost firm and captures all of the market. This result also appears to contradict observation. It implies that the entire market goes to the most efficient firm(s). If there is more than one efficient firm, then output is divided among those firms and the division is arbitrary.

1.2.2

Remarks

For many markets, price competition seems more natural than quantity competition. However, the Bertrand model yields predictions that appear inconsistent with observation. What is wrong? We need to focus on three crucial assumptions.

1. Unlimited capacity. In practice, firms typically do not have the capacity to service the entire market demand. That is, in the short run, capacity is fixed. Customers are rationed, which means that the higher price firm will face a residual demand curve against which it can optimize. This typically leads to equilibrium prices above marginal cost. But capacity itself should be viewed as a choice, one which firms will make anticipating how the outcome of the price competition will vary depending upon the capacity choices taken. If they anticipate correctly, how much capacity will duopolists choose? In a remarkable result, Kreps and Scheinkman have shown that, given certain regularity conditions and efficient rationing, duopolists will choose capacities equal to the Cournot quantities. In other words, they will commit not to engage in ruthless price competition. In fact, the equilibrium of the pricing game will be that each firm sets the same price and that price is equal to the Cournot price.

2. Homogenous good. Consumers care only about price and respond en masse to the slightest difference in price by buying from the lowest price firm. In reality, consumers often have preferences over suppliers and are willing to accept some price differential before responding. Costs of adjustment or differentiated products will allow a more "continuous" adjustment to price cuts. It will not be true that a penny difference in price can induce a huge change in demand.

3. Static game. The price game is played only once. In practice, firms compete against each other repeatedly.

1.3

Price Setting Duopoly with Capacity Constraints

Which is the "right" model of firm behavior? Do firms choose quantities or prices? Clearly, the answer matters. Kreps and Scheinkman propose an ingenious solution to the dilemma. They study a two-stage game in which firms first simultaneously choose capacities and then simultaneously choose prices after observing capacities. To illustrate the basic ideas, we consider the following simple example:

$$D(P) = 1 - P, \quad c = 0, \quad r > 0,$$

where r is the unit cost of capacity and c is the marginal cost of production. Let $k_i$ and $y_i$ denote the capacity and output of firm i.
• Player set: two firms, indexed by $i = 1, 2$.
• Pure strategy for firm i: $k_i \in [0, \infty)$; $p_i : (k_i, k_j) \to [0, \infty)$.

In order to define payoffs, we need to specify what happens when the low-price firm does not have the capacity to serve the entire market. Suppose firm i charges the lower price.

1. Efficient Rationing Rule: consumers with the higher valuations buy at the lower price. Hence, if $k_i$ is the amount that firm i can supply at $p_i$, then the residual demand curve for firm j at $p_j > p_i$ is $D_j(p_j) = \max\{0, D(p_j) - k_i\}$. For $p_j < p_i$, $D_j(p_j) = D(p_j)$.

2. Proportional Rationing Rule: every consumer has the same probability of buying at the lower price. In this case, the residual demand function for firm j at $p_j > p_i$ is

$$D_j(p_j) = \max\left\{0,\; D(p_j)\left(\frac{D(p_i) - k_i}{D(p_i)}\right)\right\}.$$
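The two rationing rules can be compared directly. The following is a small sketch using the example's demand D(P) = 1 - P; the function names and the numerical prices and capacity are illustrative assumptions. Equal prices are governed by a separate sharing rule and are deliberately not handled here.

```python
def market_demand(p):
    """Market demand from the example, D(p) = 1 - p, floored at zero."""
    return max(0.0, 1.0 - p)


def residual_efficient(p_j, p_i, k_i):
    """Demand facing firm j when rival i has price p_i and capacity k_i (efficient rule)."""
    if p_j == p_i:
        raise ValueError("equal prices are handled by a separate sharing rule")
    if p_j < p_i:
        return market_demand(p_j)            # firm j is the low-price firm
    return max(0.0, market_demand(p_j) - k_i)


def residual_proportional(p_j, p_i, k_i):
    """Demand facing firm j under the proportional rationing rule."""
    if p_j == p_i:
        raise ValueError("equal prices are handled by a separate sharing rule")
    if p_j < p_i:
        return market_demand(p_j)
    served_at_low_price = market_demand(p_i)
    if served_at_low_price == 0.0:
        return 0.0                           # nobody buys at p_i, hence nobody at p_j > p_i
    unserved_share = (served_at_low_price - k_i) / served_at_low_price
    return max(0.0, market_demand(p_j) * unserved_share)


p_low, k_low, p_high = 0.2, 0.3, 0.5
efficient = residual_efficient(p_high, p_low, k_low)
proportional = residual_proportional(p_high, p_low, k_low)
print(efficient, proportional)
```

For these values the proportional rule leaves the high-price firm more demand than the efficient rule, because some high-valuation consumers fail to buy at the low price.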

Remark 2 Geometrically, the efficient rationing rule yields a residual demand curve that is a parallel shift of the market demand. The proportional rationing rule yields a residual demand curve that is an inward rotation of the market demand curve through the vertical intercept.

The basic assumption of the Kreps-Scheinkman model is that rationing is efficient. Given this assumption, the payoffs in the second stage of the game can be expressed as follows:

$$\pi_i(p_i(k_i, k_j), p_j(k_i, k_j)) = \begin{cases} L_i(p_i) = p_i \min\{k_i, \max(0, 1 - p_i)\} & \text{if } p_i < p_j \\ T_i(p_i) = p_i \min\left\{k_i, \max(0, 1 - p_i)\dfrac{k_i}{k_i + k_j}\right\} & \text{if } p_i = p_j \\ F_i(p_i) = p_i \min\{k_i, \max(0, 1 - k_j - p_i)\} & \text{if } p_i > p_j \end{cases}$$

The tie-breaking rule is that the firms divide the market according to the relative size of their capacities. Let $\sigma_i = (k_i, p_i(k_i, \cdot))$ denote firm i's (pure) strategy. Then the payoffs to the two-stage game are given by

$$\Pi_i(\sigma_i, \sigma_j) = \pi_i(p_i(k_i, k_j), p_j(k_i, k_j)) - rk_i.$$

What is the solution concept? We could use Nash equilibrium. However, we will want to rule out incredible threats, so we use subgame perfect equilibrium. Subgame perfection requires that (i) for every subgame $(k_i, k_j)$, prices in the second stage must be a Nash equilibrium:

$$\pi_i(p_i^*(k_i, k_j), p_j^*(k_i, k_j)) \geq \pi_i(p_i(k_i, k_j), p_j^*(k_i, k_j))$$

and (ii) capacities in the first stage must be a Nash equilibrium given equilibrium play in the second stage:

$$\Pi_i(k_i^*, k_j^*) \geq \pi_i(p_i^*(k_i, k_j^*), p_j^*(k_i, k_j^*)) - rk_i.$$

[Problem: give an example of a pair of strategies that is a Nash equilibrium but is not subgame perfect.]

Subgame perfect equilibria can be calculated by solving the game backwards. First, we solve for the equilibrium prices for every possible state $(k_i, k_j)$. We then compute the equilibrium payoffs for every state and solve for the equilibrium capacities.

Region 1: $(k_1, k_2) \geq (1, 1)$. If each firm can serve the entire market, regardless of price, then the Bertrand solution holds: $p_1^* = p_2^* = 0$, $y_1^* = y_2^* = 1/2$. Furthermore, this is the only region in [...]

[...] by definition of Region 3, so firm 2 is unconstrained at p. If firm 1 sets [...], one can show that this would lead to a contradiction. Q.E.D.

Lemma 1.3 Suppose $k_2 > k_1$. Then $\bar{p} = (1 - k_1)/2$.

Lemma 1.4 Suppose $k_2 > k_1$. Then

$$\underline{p} = \max\left\{\frac{(1 - k_1)^2}{4k_2},\; \frac{1 - (k_1(2 - k_1))^{1/2}}{2}\right\}.$$

Proof. It follows from the previous lemma that firm 2 announces $\bar{p}$ with positive probability. Consequently, for firm 2 to be indifferent between $\underline{p}$ and $\bar{p}$, profits at these prices must be the same. If firm 2 is capacity constrained at $\underline{p}$, then

$$\pi_2(\underline{p}) = \pi_2(\bar{p}) \implies \underline{p} = \frac{(1 - k_1)^2}{4k_2}.$$

If firm 2 is unconstrained at $\underline{p}$, then

$$\pi_2(\underline{p}) = \pi_2(\bar{p}) \implies (1 - \underline{p})\underline{p} = \frac{(1 - k_1)^2}{4} \implies \underline{p} = \frac{1 - (k_1(2 - k_1))^{1/2}}{2}.$$

Q.E.D.

The boundary of the region in which firm 2 is capacity constrained at $\underline{p}$ is given by the equation

$$k_2 = 1 - \underline{p} = \frac{1 + (k_1(2 - k_1))^{1/2}}{2}.$$

(See Figure in class.)

Theorem 1.5 Suppose $k_2 > k_1$ and $(k_1, k_2)$ are in Region 3. Then there exists a unique Nash equilibrium in mixed strategies,

$$G_1(p) = \frac{L_2(p) - L_2(\underline{p})}{L_2(p) - F_2(p)}, \qquad G_2(p) = \frac{L_1(p) - L_1(\underline{p})}{L_1(p) - F_1(p)},$$

where

$$\bar{p} = \frac{1 - k_1}{2} \quad \text{and} \quad \underline{p} = \max\left\{\frac{(1 - k_1)^2}{4k_2},\; \frac{1 - (k_1(2 - k_1))^{1/2}}{2}\right\}.$$

Proof. We need to establish that $G_1$ and $G_2$ are distributions. By construction, $F_2(\bar{p}) = L_2(\underline{p})$, so $G_1(\bar{p}) = 1$. Furthermore, $G_1(\underline{p}) = 0$ and it is not difficult to show that $G_1$ is strictly increasing in p (recall that $p < 1/2$). To establish that $G_2$ is a distribution, straightforward calculations yield $L_1(\underline{p}) > F_1(\bar{p})$, which implies that $\lim_{p \uparrow \bar{p}} G_2(p) < 1$. Thus, $G_2$ has a mass point at $\bar{p}$. Clearly, $G_2(\underline{p}) = 0$. Q.E.D.

Having determined equilibrium profits for every pair of capacities, we can now turn to the first stage and determine the optimal capacities.
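Before turning to the first stage, the objects in Theorem 1.5 can be evaluated for a concrete pair of capacities. This is a rough sketch under the example's assumptions (D(P) = 1 - P, c = 0); the capacities (0.5, 0.8) are an illustrative choice that is assumed, not verified from the text, to lie in Region 3, and the function names are hypothetical.

```python
import math


def low_price_profit(p, k_own):
    """L_i(p): the low-price firm sells min(capacity, market demand 1 - p)."""
    return p * min(k_own, max(0.0, 1.0 - p))


def high_price_profit(p, k_own, k_rival):
    """F_i(p): the high-price firm sells against residual demand 1 - k_rival - p."""
    return p * min(k_own, max(0.0, 1.0 - k_rival - p))


def price_support(k1, k2):
    """Lower and upper bounds of the price support for k2 > k1 (Theorem 1.5)."""
    p_upper = (1.0 - k1) / 2.0
    p_lower = max((1.0 - k1) ** 2 / (4.0 * k2),
                  (1.0 - math.sqrt(k1 * (2.0 - k1))) / 2.0)
    return p_lower, p_upper


def G1(p, k1, k2):
    """Firm 1's price distribution, which keeps firm 2 indifferent."""
    p_lower, _ = price_support(k1, k2)
    return ((low_price_profit(p, k2) - low_price_profit(p_lower, k2))
            / (low_price_profit(p, k2) - high_price_profit(p, k2, k1)))


def G2(p, k1, k2):
    """Firm 2's price distribution below the upper bound, which keeps firm 1 indifferent."""
    p_lower, _ = price_support(k1, k2)
    return ((low_price_profit(p, k1) - low_price_profit(p_lower, k1))
            / (low_price_profit(p, k1) - high_price_profit(p, k1, k2)))


k1, k2 = 0.5, 0.8
p_lower, p_upper = price_support(k1, k2)
g1_top = G1(p_upper, k1, k2)            # should equal 1: no mass point for firm 1
g2_limit = G2(p_upper, k1, k2)          # limit from below; less than 1 for firm 2
mass_point = 1.0 - g2_limit             # probability firm 2 puts on the upper bound
print(p_lower, p_upper, g1_top, g2_limit, mass_point)
```

The formula for G2 is evaluated at the upper bound only to obtain its left limit; the gap between that limit and one is the mass that firm 2 places on the highest price.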

Theorem 1.6 The two-stage game has a unique subgame perfect equilibrium in which

$$k_1^* = k_2^* = \frac{1 - r}{3}.$$

Proof. Suppose $(k_1, k_2)$ is in Region 1. Then firm i's profits are $-rk_i$. Clearly, firm i wants to lower capacity to reduce losses. Now suppose $(k_1, k_2)$ is in Region 3 and assume that $k_2 > k_1$. Note that this implies that $k_1 < 1$ and $k_2 > 1/3$. Firm 2's profit in this region is

$$\Pi_2(k_1, k_2) = \frac{(1 - k_1)^2}{4} - rk_2.$$

Once again, it can increase its profits by lowering capacity to at least $k_1$. A similar calculation reveals that, when $k_2 = k_1$, each firm wants to lower its capacity. This rules out capacities in Regions 1 and 3 as equilibria. Now suppose $(k_1, k_2)$ lies in Region 2, where firms produce to capacity. In this region, firm i's reduced-form profit function is

$$\Pi_i(k_i, k_j) = (1 - k_i - k_j)k_i - rk_i,$$

which is the Cournot form. Deriving the optimal reply functions and solving for the fixed point, we obtain the solution

$$k_1^* = k_2^* = \frac{1 - r}{3},$$

which is the Cournot solution. Q.E.D.

Remark 4 The critical assumptions are (i) subgame perfection and (ii) efficient rationing. Deneckere and Davidson have analyzed the two-stage game with proportional rationing.
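The fixed-point step at the end of the proof can be written out explicitly. The following worked version uses only the reduced-form profit function of the example (D(P) = 1 - P, c = 0, unit capacity cost r, taken here to satisfy r < 1); the symbol $R_i$ for the capacity reply function and the implied price $P^*$ are introduced here for illustration.

```latex
\begin{align*}
\Pi_i(k_i, k_j) &= (1 - k_i - k_j)\,k_i - r k_i, \\
\frac{\partial \Pi_i}{\partial k_i} &= 1 - 2k_i - k_j - r = 0
  \quad\Longrightarrow\quad R_i(k_j) = \frac{1 - r - k_j}{2}, \\
k_1 = R_1(k_2),\;\; k_2 = R_2(k_1)
  &\quad\Longrightarrow\quad k_1^* = k_2^* = \frac{1 - r}{3}, \\
P^* &= 1 - k_1^* - k_2^* = \frac{1 + 2r}{3}.
\end{align*}
```

Subtracting the two reply equations gives $k_1 - k_2 = -(k_1 - k_2)/2$, so any solution is symmetric, which is why the fixed point is unique.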

2

Estimating Markups in Homogenous Good Markets.

In this lecture we discuss the New Empirical Industrial Organization (NEIO) approach to estimating market power. We present the theoretical context for the empirical model, discuss the estimation and identification issues, and then examine applications by Genesove and Mullin and by Porter. Before doing so, however, we briefly review some terminology and issues that arise in going from a theoretical model to an empirical model.

2.1

Review

A structural econometric model is a stochastic economic model of the behavior of economic agents. It gives rise to a conditional distribution of the endogenous variables of an economic interaction given the exogenous variables of the interaction. We refer to the latter as the reduced form model. To illustrate the basic ideas, consider the linear simultaneous equations model

$$y_t'\Gamma - x_t'\beta = \varepsilon_t' \qquad (1)$$

where $y_t$ is a vector of endogenous variables, $x_t$ is a vector of exogenous variables, and $\varepsilon_t$ is a vector of stochastic shocks. We shall assume

$$E(\varepsilon_t) = 0, \qquad E(\varepsilon_t \varepsilon_t') = \Sigma.$$

Here $\Gamma$, $\beta$, and $\Sigma$ are the structural parameters to be estimated. The reduced form model is obtained by solving for the endogenous variables as functions of the exogenous variables:

$$y_t' = x_t'\Pi + V_t' \qquad (2)$$

where

$$\Pi = \beta\Gamma^{-1}, \qquad V_t' = \varepsilon_t'\Gamma^{-1}, \qquad E(V_t V_t') = \Omega = (\Gamma^{-1})'\Sigma\Gamma^{-1}.$$

Equation (2) gives the conditional distribution of the endogenous variables $y_t$ given the exogenous variables $x_t$. The reduced form parameters are $\Pi$ and $\Omega$. Under the above assumptions, $E(y_t' | x_t) = x_t'\Pi$ and $Var(y_t | x_t) = \Omega$. In general, statistical analysis yields consistent estimates of the conditional distribution of $y_t$ given $x_t$.

There are two possible approaches to applied economic research. Reduced form analysis describes the conditional distribution of endogenous variables given exogenous variables. Structural econometric modeling formulates a stochastic economic model of the behavior of economic agents which generates the conditional distribution of endogenous variables given exogenous variables.

Do the data know that the above structural model is the data generating mechanism (DGM)? The answer is no. The data only have something to say about the reduced form parameters. Identification asks if there exists a unique set of structural parameters associated with a given set of reduced form parameters. In other words, given estimates $\hat{\Pi}$ and $\hat{\Omega}$, is it possible to uniquely solve for estimates of $\Gamma$, $\beta$, and $\Sigma$? The answer involves exclusion and inclusion restrictions. Exclusion restrictions are usually discussed, but inclusion restrictions are often ignored. (See Amemiya, pp. 229-30 for a discussion of the rank and order conditions that need to be satisfied for identifiability.)

Can a reduced form regression analysis determine causality? This is nothing more than a restatement of the question: does correlation imply causality? Suppose the true model is as follows:

$$X = Z\gamma + W\beta + \varepsilon_1, \qquad Y = Z\delta + \varepsilon_2.$$

Here X and Y are highly correlated, but X does not cause Y or vice versa. Causality is determined by economic theory. Theory specifies the variables that need to be included in an economic model, and which variables are endogenous and which are exogenous. This distinction establishes the direction of causality.

Economic theory combined with the underlying statistical model yields a structural econometric model. Note that theory is usually silent on the issue of functional form, although it usually predicts qualitative properties of the relationship between endogenous and exogenous variables.
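The mapping from structural to reduced form parameters can be made concrete with a small two-equation system. The sketch below assumes the sign convention of equation (1) as written, y'Γ - x'β = ε', so that solving for y' gives Π = βΓ⁻¹ and Ω = (Γ⁻¹)'ΣΓ⁻¹; all numerical values and helper names are illustrative. It uses plain Python lists to stay dependency-free.

```python
def inverse_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("Gamma must be nonsingular")
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul(left, right):
    """Product of two matrices given as nested lists."""
    return [[sum(left[i][k] * right[k][j] for k in range(len(right)))
             for j in range(len(right[0]))] for i in range(len(left))]


def transpose(m):
    return [list(col) for col in zip(*m)]


def reduced_form(gamma, beta, sigma):
    """Return (Pi, Omega) implied by y' Gamma - x' beta = eps'."""
    g_inv = inverse_2x2(gamma)
    pi = matmul(beta, g_inv)                                 # Pi = beta Gamma^{-1}
    omega = matmul(matmul(transpose(g_inv), sigma), g_inv)   # Omega = Gamma^{-1}' Sigma Gamma^{-1}
    return pi, omega


# Illustrative structural parameters: 2 endogenous and 3 exogenous variables.
gamma = [[1.0, -0.5], [0.8, 1.0]]
beta = [[2.0, 10.0], [0.0, 1.5], [1.0, 0.0]]
sigma = [[1.0, 0.2], [0.2, 2.0]]
pi, omega = reduced_form(gamma, beta, sigma)

# Check on one observation: y' built from the reduced form satisfies equation (1).
x = [[1.0, 3.0, 2.0]]
eps = [[0.3, -0.1]]
v = matmul(eps, inverse_2x2(gamma))
y = [[fit + shock for fit, shock in zip(matmul(x, pi)[0], v[0])]]
structural_residual = [lhs - rhs for lhs, rhs in
                       zip(matmul(y, gamma)[0], matmul(x, beta)[0])]
print(pi, omega, structural_residual)
```

The mapping runs in one direction only: many (Γ, β, Σ) triples can produce the same (Π, Ω), which is why identification requires the restrictions discussed above.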

2.1.1

Uses of Structural Econometric Models

1. To estimate a parameter or effect of economic interest that is not directly observable from the data at hand. Examples: in production theory, one wants to measure returns to scale, the elasticity of substitution between inputs, and the price elasticity of demand. In the supply and demand model, one wants to know the impact of a price increase of X dollars on the amount demanded or supplied.

2. To perform welfare analysis of a change to a new economic equilibrium. Examples: in demand analysis, this requires knowledge of the parameters required for computing compensating and equivalent variations. In many IO applications, one requires estimates of marginal cost and demand elasticity to measure the welfare loss and increased profits associated with market power.

3. To simulate changes in equilibrium outcomes due to changes in the underlying economic environment. Because structural parameters are primitives of economic agents, they should be invariant to the underlying economic environment in which agents operate. Examples: the impact of privatization and/or deregulation of a specific industry, the impact of a change in structure such as a merger.

4. To compare the relative predictive performance of two competing theories conditional on the same set of underlying economic primitives for the same dataset.

2.1.2

Uses of Reduced Form Model

Reduced form analysis is not invariant to the economic environment in which the economic agent operates. For example, consider a monopolist with constant marginal cost, zero fixed cost and linear demand, and suppose the firm's marginal cost increases by 10%. The monopolist's response depends upon whether the environment is regulated or not. If regulated, the 10% cost increase implies a 10% increase in price. But if the firm is a profit-maximizing monopolist, the 10% increase in marginal cost implies a less than 10% increase in price. (With demand $P = a - bQ$ and marginal cost c, the monopoly price is $(a + c)/2$, so a 10% rise in c raises price by the fraction $0.1c/(a + c) < 0.1$.)

That is not to say that reduced form analysis is not important. Many economically interesting questions can be answered without a structural model. Example: in the supply and demand model, the reduced form analysis will answer the question, what is the impact on equilibrium price and quantity of a rise in income of X dollars? The reduced form is often sufficient to predict "but-for" prices in collusion cases. In general, any prediction about the impact of an exogenous variable on an endogenous variable can be examined using reduced form analysis, since only the conditional distribution of the endogenous variables given the exogenous variables is needed.

2.1.3

Testing the Theory

The primary goal of the structuralist approach is to estimate the primitives. To do so, economic theory is assumed to hold. While a direct test of the economic theory embodied in the structural model is possible, most economists would not find such tests very interesting. The main reasons are (1) economic theory rarely specifies the structural econometric model completely; and (2) economic theory imposes restrictions on relationships but typically does not specify functional forms. One can always find a specific functional form to reject or fail to reject theory depending on the desired outcome. Thus it is not possible to make an unqualified rejection. All rejections of the theory have to be qualified.

Example 1 Economic theory informs us that the Slutsky matrix is negative semidefinite. There are at least three reasons why the data may reject this null hypothesis yet theory is still true.

1. Inappropriate data: the data may aggregate over individual consumers, variables may be measured with error, the data may not be complete (i.e., not all variables relevant to the agent's decision are observable to the econometrician).

2. The functional form may be too restrictive: Cobb-Douglas versus translog indirect utility function.

3. Economic versus statistical significance: suppose you find that

$$\frac{\partial^2 H}{\partial p_i \partial p_j} = 1.0000001 \quad \text{and} \quad \frac{\partial^2 H}{\partial p_j \partial p_i} = 1.01$$

and the difference is statistically significant. Should you reject the theory?

One can reject theory conditional on: (1) dataset, (2) functional form assumptions, (3) description of behavior of agents, (4) vector of exogenous or pre-determined variables on which agents condition their decisions.

Main Point: in any modeling exercise, one must condition nested hypothesis tests on a maintained statistical model, so that all tests of theories are conditional on this feature. In other words, the conditional distribution of observed endogenous variables given exogenous variables should be maintained in examining different structural models.

2.1.4

Framework for Specifying and Estimating Structural Econometric Models

1. What are the primitives of the economic environment? e.g., preferences, technology, endowments.

2. What are the behavioral assumptions which allow computation of an equilibrium? e.g., objective functions, choice variable (quantity vs. price).

[1] and [2] complete the specification of the theoretical model.

3. Choice of functional forms: what criteria should be used for selection? Three issues: (a) substitution between data and parametric flexibility: a form that is too restrictive does not allow the data to "speak"; (b) phenomenon under study: for example, it would not be appropriate to restrict attention to Cobb-Douglas when studying homotheticity of preferences; (c) modeling complexity: ease of estimation, difficulty in imposing restrictions of theory on the model. For example, the translog is more complicated than quadratic functions.

4. Econometric model: where does the stochastic structure come from? There are four generic types of errors in an econometric model: (a) measurement error in data; (b) shocks to the economic environment against which the agent optimizes; (c) the econometrician's uncertainty about underlying characteristics of the economic environment; (d) optimization error, which is apparent from the econometrician's perspective by failure of the first-order conditions of the agent's optimization problem. A crucial issue in answering this question is when the uncertainty is resolved and by whom it is known.

5. Distributional assumptions.

[1]-[5] yield a conditional joint density of endogenous variables given exogenous variables.

6. Estimation technique: (a) Maximum likelihood: this requires specific distributions. The benefits of MLE are that it is the most efficient estimator and more underlying economic structure can often be recovered. The cost is that it is susceptible to misspecification. (b) Instrumental variables: usually only specify properties of the first few moments of the stochastic shocks. The benefit is that it is more robust to misspecification. The cost is that it is a less efficient estimator and less underlying economic structure can be recovered.

7. Specification check: to the extent possible, check the assumptions of the model. In particular, check for (a) heteroscedasticity and autocorrelation, (b) omitted variables and functional form, (c) overidentifying restrictions (use a less restrictive version of the structural model against which to test).

2.2

Supply and Demand in Competitive Market

Consider the standard model of competitive markets. Let $Y_t^s$ and $Y_t^d$ denote the quantity supplied and demanded in market t and let $P_t$ denote the market t price. Then the market is described by three equations:

Aggregate Supply: The (aggregate) marginal cost function determining supply is given by

$$mc_t = \beta_{10} + \gamma_{11} Y_t^s + \beta_{11} w_t + \varepsilon_{1t}$$

where $w_t$ is an observed factor affecting costs, and $\varepsilon_{1t}$ are the unobserved (by the analyst) factors that cause differences in marginal costs. Examples of cost shifters: factor prices, productivity, capital stocks. In a competitive market, firms are price-takers, which implies that $P_t = mc_t$. Substituting price for marginal cost yields the competitive supply function.

Aggregate Demand:

$$Y_t^d = \beta_{20} + \gamma_{21} P_t + \beta_{21} x_t + \varepsilon_{2t}$$

where $x_t$ is an observed exogenous variable describing the state of demand in market t, and $\varepsilon_{2t}$ are unobserved factors that cause differences in demand at a given price. Examples of demand shifters: price of substitutes or complements, income, seasonal factors.

Equilibrium: In equilibrium, the quantity that consumers demand must be the same as the quantity that firms are willing to supply:

$$Y_t^s = Y_t^d = Y_t.$$

Note that this equilibrium condition assumes that the auctioneer has knowledge of the disturbance vector $\varepsilon$. Imposing the equilibrium condition on aggregate supply and demand yields two equations:

$$P_t = \beta_{10} + \gamma_{11} Y_t + \beta_{11} w_t + \varepsilon_{1t}, \qquad Y_t = \beta_{20} + \gamma_{21} P_t + \beta_{21} x_t + \varepsilon_{2t}.$$

These two equations represent a structural model of a competitive market. Why is the above model a structural model? Because the supply function specifies the behavioral response of firms in the market, and the demand function describes how consumers behave in the market. The structural parameters are $\beta = (\beta_{10}, \beta_{11}, \beta_{20}, \beta_{21})$ and $\gamma = (\gamma_{11}, \gamma_{21})$.

The simultaneous solution of supply and demand (equilibrium condition) yields values of the endogenous variables as functions of exogenous variables and shocks:

$$P_t = \pi_{10} + \pi_{11} x_t + \pi_{12} w_t + u_{1t}$$
$$Y_t = \pi_{20} + \pi_{21} x_t + \pi_{22} w_t + u_{2t}.$$

These two equations represent the reduced form model.

2.2.1

Estimation and Identification

• The conditions necessary for OLS estimation of the structural model are E( 1 |x, P ) = E( 2 |w, P ) = 0. Both of these conditions are unlikely to hold. If the “auctioneer” takes the disturbances into account in finding the market clearing price, then P and Y depend upon the disturbances. • However, estimation by 2SLS will work. Regress P on x and w and substitute the predicted value of P into the demand equation. In other words, use w, the cost shifter, as an instrument for P. It is correlated with P through market-clearing, and should be uncorrelated with 2 , the demand shock. Similarly, one can use x as an instrument for Y in the supply equation. • One could also estimate the reduced form by OLS since the necessary conditions E(u1 |x, w) = E(u2 |x, w) = 0 are satisfied. The relationships between the reduced form parameters Π and the structural parameters β and Γ are as follows:     β 10 + γ 11 β 20 γ 21 β 10 + β 20 π 10 π 20   π 11 π 21  = (1 − γ 11 γ 21 )−1  γ 11 β 21 β 21 π 12 π 22 β 11 γ 21 β 11

The stochastic terms of the reduced form are related to the structural errors as follows: u1t = (1 − γ 11 γ 21 )−1 [

1t

u2t = (1 − γ 11 γ 21 )−1 [γ 21

+ γ 11 1t

+

2t ] 2t ].

Therefore, the relationship between Ω and Σ is · ¸ · ¸ ω11 ω12 σ 21 + 2γ 11 σ 12 + γ 211 σ 22 γ 21 σ 21 + (1 + γ 11 γ 21 )σ 12 + γ 11 σ 22 −2 =α ω21 ω22 γ 21 σ 21 + (1 + γ 11 γ 21 )σ 12 + γ 11 σ 22 γ 221 σ 21 + 2γ 21 σ 21 + σ 22 where α = (1 − γ 11 γ 21 ).

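Given estimates $\hat\Pi$ and $\hat\Omega$ of the reduced form, estimates of the structural parameters can be obtained as follows. First, note that

$$\frac{\hat\pi_{22}}{\hat\pi_{12}} = \hat\gamma_{21}; \quad \frac{\hat\pi_{11}}{\hat\pi_{21}} = \hat\gamma_{11}; \quad \hat\alpha = 1 - \hat\gamma_{11}\hat\gamma_{21}.$$

Having obtained these estimates, one obtains

$$\hat\beta_{11} = \hat\alpha\,\hat\pi_{12}, \quad \hat\beta_{21} = \hat\alpha\,\hat\pi_{21}.$$

Finally, one solves the pair of equations

$$\hat\alpha\hat\pi_{10} = \beta_{10} + \hat\gamma_{11}\beta_{20}$$
$$\hat\alpha\hat\pi_{20} = \hat\gamma_{21}\beta_{10} + \beta_{20}$$

for estimates of $\beta_{10}$ and $\beta_{20}$. It is then straightforward to obtain estimates of the variance and covariance parameters $\sigma_1^2$, $\sigma_2^2$ and $\sigma_{12}$ from the estimates $\hat\omega_{ij}$, $i, j = 1, 2$.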

The exclusion restriction that allows for identification is that, for each equation, there is one exogenous variable that appears in that equation and not in the other equation. The inclusion restriction is that each equation possesses one exogenous variable. Since the number of excluded exogenous variables is exactly equal to the number of included endogenous variables in each equation, the system is exactly identified.
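The mapping between reduced form and structural parameters can be sketched in code. The following is a minimal illustration rather than part of the original notes: the parameter values, the sample size, and the function names `reduced_form` and `structural_from_reduced` are assumptions made for the example, and the supply equation is taken to be $P_t = \beta_{10} + \gamma_{11} Y_t + \beta_{11} w_t + \varepsilon_{1t}$, which is the form implied by the reduced form coefficients above.

```python
import numpy as np

def reduced_form(b10, g11, b11, b20, g21, b21):
    """Map structural parameters to the 3x2 reduced-form matrix Pi.

    Rows: constant, x, w. Columns: price equation, quantity equation.
    """
    alpha = 1.0 - g11 * g21
    return np.array([
        [b10 + g11 * b20, g21 * b10 + b20],
        [g11 * b21,       b21],
        [b11,             g21 * b11],
    ]) / alpha

def structural_from_reduced(Pi):
    """Invert the mapping in the exactly identified case."""
    (p10, p20), (p11, p21), (p12, p22) = Pi
    g21 = p22 / p12            # ratio of the w coefficients
    g11 = p11 / p21            # ratio of the x coefficients
    alpha = 1.0 - g11 * g21
    b11 = alpha * p12
    b21 = alpha * p21
    # Solve alpha*p10 = b10 + g11*b20 and alpha*p20 = g21*b10 + b20.
    A = np.array([[1.0, g11], [g21, 1.0]])
    b10, b20 = np.linalg.solve(A, alpha * np.array([p10, p20]))
    return dict(b10=b10, g11=g11, b11=b11, b20=b20, g21=g21, b21=b21)

# Hypothetical structural parameters (upward supply, downward demand).
true = dict(b10=1.0, g11=0.5, b11=1.0, b20=10.0, g21=-1.0, b21=1.0)
Pi_true = reduced_form(**true)
recovered = structural_from_reduced(Pi_true)   # population round trip

# Simulated sample: estimate the reduced form by OLS, then invert.
rng = np.random.default_rng(0)
T = 200_000
x, w = rng.normal(size=T), rng.normal(size=T)
e1, e2 = rng.normal(size=T), rng.normal(size=T)
alpha = 1.0 - true["g11"] * true["g21"]
Z = np.column_stack([np.ones(T), x, w])
U = np.column_stack([e1 + true["g11"] * e2, true["g21"] * e1 + e2]) / alpha
PY = Z @ Pi_true + U                           # columns: P_t, Y_t
Pi_hat, *_ = np.linalg.lstsq(Z, PY, rcond=None)
estimated = structural_from_reduced(Pi_hat)
```

With exact identification, this indirect least squares route and 2SLS use the same information; the sketch only makes the algebra of the inversion concrete.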

2.3

Cournot Model

We keep the aggregate demand but now assume that a finite number of firms set quantities in a homogenous good market. The profit function for firm i is $\pi_i(y_i) = p(Y)y_i - C_i(y_i)$. Assuming differentiability, and that all firms produce positive quantities, the first-order conditions (foc) for a Nash equilibrium provide a system of n equations in n unknowns:

$$p(Y) + y_i\frac{\partial p}{\partial Y} - mc_i(y_i) = 0.$$

Of course, even if $y_i > 0$, this is only a necessary condition. Second order conditions for an equilibrium are

$$2\frac{\partial p}{\partial Y} + y_i\frac{\partial^2 p}{\partial Y^2} - \frac{\partial mc_i(y_i)}{\partial y_i} < 0.$$

Sufficient conditions are (i) marginal revenue slopes down and (ii) marginal cost slopes up.
• When should we worry that there may be some firms with zero quantities?
• Assume marginal cost is constant and that the demand curve has a constant elasticity. What are sufficient conditions for a (symmetric) equilibrium to exist?

2.3.1

Firm Data

Suppose we have data on prices and firm outputs from a cross-section of markets. Then the econometric model consists of n supply equations, the demand equation and the identity $Y = \sum_{i=1}^n y_i$. Thus, we have n + 2 equations in n + 2 unknowns. We usually estimate by tacking independent errors on to each equation (except for the identity). One way to estimate is to solve the entire system of equations, express the endogenous variables in terms of the exogenous variables, and estimate the reduced form by OLS or ML. An alternative approach is to use GMM. Let the inverse demand curve be given by

$$P_t = \alpha_0 + \alpha_1 x_t - \gamma Y_t + \varepsilon_t.$$

For the supply side we now have a set of n foc for each market instead of a supply curve. Suppose the cost function is quadratic in output, so that marginal cost is linear in output with slope $\delta_i$:

$$mc_{t,i} = \beta w_{t,i} + \delta_i y_{t,i} + \eta_{t,i}.$$

Then the Cournot foc for firm i is

$$P_t - \beta w_{t,i} - (\gamma + \delta_i)y_{t,i} = \eta_{t,i}.$$

Once again, provided $E[\varepsilon_t | x, w] = 0$, we can estimate the demand equation with IV techniques (i.e., 2SLS). To estimate the supply equations, we need to assume that $E[\eta | x, w] = 0$. Estimation requires the following steps:
• First write $\eta_{t,i}(\theta) \equiv P_t - \beta w_{t,i} - (\gamma + \delta_i)y_{t,i}$, where $\theta_i = (\beta, \delta_i, \gamma)$.
• Find a sufficiently rich vector-valued function $f(x, w)$ and form the sample moment conditions $G_T(\theta) = (nT)^{-1}\sum_{t=1}^T\sum_{i=1}^n \eta_{t,i}(\theta)f(x_t, w_t)$.
• Search for the value of $\theta$ that makes $\|G_T(\theta)\|$ as close as possible to zero.

In the above example, we have "n + 1" parameters: $\beta$ and $(\gamma + \delta_i)$, i = 1, .., n. Demand and cost shifters are good candidates for instruments since they are correlated with $P_t$ and $y_{t,i}$ but independent of $\eta_{t,i}$. Since the equilibrium depends upon the entire set of cost shifters $\{w_{t,i}\}_{i=1}^n$, we have more than enough instruments for both the demand and supply equations. More instruments lead to overidentification and permit testing.

Technical Remarks
1. We need to assume that the support of $\eta_{t,i}$ is such that every firm produces positive output, so that the moment conditions hold at every realization of $\eta$.
2. Whether or not the optimal quantity is zero depends on fixed costs and on the parameters. If quantity is zero for some firms, then you only observe y for a particular subset of $\eta_{t,i}$. If the original sample is a random draw from a population that satisfies $E[\eta | x, w] = 0$, then generally $E[\eta | x, w, y > 0] \neq 0$, and you have a selection problem. You need to build a model of when and why firms do not produce and estimate this model.
3. The $\eta_{t,i}$'s may be correlated. In that case, you need to include a variance correction and estimate $\theta$ more efficiently.
4. You can also gain efficiency by jointly estimating the demand and supply equations, allowing for covariance between $\varepsilon$ and $\eta$.
5. If the demand function is nonlinear, 2SLS does not help but GMM still works.

Discussion
The foc for quantities do not per se determine the slope of either the marginal cost or the demand function. That is, estimation of the supply equations yields estimates only of $\gamma + \delta_i$. It is possible to identify the slope of each function only by combining estimates of demand and supply curves. Another way of interpreting this result is that, in the linear case, the foc cannot tell us whether the market acts "as if" it is populated by price-taking firms or by Cournot competitors. In the former case, the slope of the pricing equation is the slope of the mc curve and, in the latter case, it is the sum of the slopes of the two curves.

Note that if one assumes that marginal cost is constant, then

$$\eta_{t,i}(\theta) \equiv P_t - \beta w_{t,i} - \gamma y_{t,i} - \delta_i$$

where, in an abuse of notation, $\delta_i$ is firm i's marginal cost. Here we achieve identification by functional form.

When marginal cost is increasing in output, the two behavioral models can still be distinguished using only the supply equations if the shifters of the demand curve affect its slope (and not just the intercept). For example, suppose the inverse demand curve takes the form

$$P_t = \alpha_0 + \alpha_1 x_t - \gamma Y_t - \alpha_2 x_t Y_t + \varepsilon_t.$$

Then the first order condition in the Cournot model becomes

$$P_t - (\gamma + \alpha_2 x_t + \delta_i)y_{t,i} - \beta w_{t,i} = \eta_{t,i}$$

whereas in the perfectly competitive model

$$P_t - \delta_i y_{t,i} - \beta w_{t,i} = \eta_{t,i}.$$

Thus, if the demand slope shifters affect estimates of the supply equations, then this is evidence of market power. Note that this is a reduced form test. If we reject price-taking behavior, it does not mean that we should accept Cournot behavior.

2.3.2

Aggregate Data

Suppose we only observe total sales per market, $Y_t$, and the cost shifter, $w_t$, is common to all firms. In that case, you average over the first-order conditions of the firms and treat the resulting equation as the first-order condition of the representative firm. You need to be careful in this step. Different functional forms for demand and cost functions can have different implications for aggregation. In the above linear case, it is

$$P_t - \beta w_t - (\gamma + \delta)\frac{Y_t}{n} = \eta_t$$

where $\delta = \sum_{i=1}^n \delta_i s_{t,i}$, $s_{t,i} = y_{t,i}/Y_t$ is firm i's share of output in market t, and $\eta_t$ is the average of the $\eta_{t,i}$. Note that technically, $\delta$ should be subscripted by t, which would be a problem. In practice, one usually assumes that it is a constant, which is okay if shares are roughly constant across markets. Otherwise, you are also averaging across markets.

2.3.3

Conduct Parameter

In a Nash equilibrium, each firm's conjecture concerning the response of aggregate output to a change in its output is by definition equal to 1. That is, each firm takes the choices of the other firms as fixed in choosing its own output. Empirical economists, however, have frequently adopted the view that the first order condition of firm i should take the form

$$p(Y) + y_i\frac{\partial p}{\partial Y}\frac{\partial Y}{\partial y_i} - mc_i(y_i) = 0$$

where

$$\frac{\partial Y}{\partial y_i} = 1 + \phi_i$$

is firm i's conjecture about the response of total industry output to increases in its output. They then usually require that $\phi_i = \phi$ in what they call a conjectural variations equilibrium, and proceed to estimate $\phi$, which is interpreted as the industry's conduct parameter.
• if $\phi = 0$, firms behave as if the equilibrium is Cournot,
• if $\phi = -1$, the firms behave as if they are price-takers, and the outcome is Bertrand,
• if $\phi = 1$, the firms behave as if they are perfectly collusive (in a duopoly; with n symmetric firms, joint profit maximization corresponds to $\phi = n - 1$).

Clearly, this parameterization of behavior is a tempting proposition for empirical workers. The problem is that it is logically flawed. It is an attempt to introduce dynamics into a static model: firms should expect rivals to react to their output choices. But game theory argues that these reactions need to be made explicit by specifying the timing of decisions, and hence the expectations need to be determined within the model, and not introduced in an ad hoc way. The point here is quite similar to the criticism that the rational expectations school levied against short-run Keynesian models, which treated expectations as a parameter of the model.

Hendricks and McAfee (2000) provide a more rigorous theoretical defense. In their model, firms report their marginal cost curves to the auctioneer, who then chooses the price to clear the market and allocates aggregate output efficiently among suppliers (each firm's reported marginal cost is equal to the equilibrium price). However, in their model, the behavior of the firms is determined by the shape of the marginal cost curve: if steep, the firms behave like Cournot players; if flat, the firms behave like Bertrand players. In other words, conduct is not a parameter but is determined by the nature of the technology.

Nevertheless, the idea of introducing a conduct parameter has deep roots in empirical IO because it does provide a framework for evaluating the competitiveness of the industry. However, the conduct parameter is identified only if the demand shifters affect the slope of the demand function. In this case, the first-order condition in the linear marginal cost model becomes

$$P_t - \left((1 + \phi)(\gamma + \alpha_2 x_t) + \delta_i\right)y_{t,i} - \beta w_{t,i} = \eta_{t,i}.$$

Estimates of the demand equation yield $\gamma$ and $\alpha_2$. Thus dividing the coefficient on the cross-product term $x_t y_{t,i}$ by $\alpha_2$ yields an estimate of $1 + \phi$, and hence of $\phi$, which in turn means that $\delta_i$ is identified. However, without the cross-product term, the model with a conduct parameter and rising marginal cost is not identified.

Fundamentally, all of the above relies rather heavily on functional form. What we would like to do is compare price to marginal cost and ask what behavioral model could explain the difference. There are a couple of studies that have been able to do this and we will go over one of them, the paper by Genesove and Mullin. The problem is that data on marginal cost is typically not available. What is readily available from annual reports is average variable cost (wages + materials). But this is frequently not a good proxy for marginal cost. For example, it ignores costs of capital. It is possible to make adjustments for capital by writing

$$mc_i = \delta_i + (r + d)\frac{K_i}{y_i}$$

where r is the interest rate, d is the depreciation rate, and $K_i$ is firm i's capital. However, differences in accounting and economic treatments of depreciation and difficulties in measuring capital make this approach problematic.

Finally, while cost data are quite rare, sometimes input data are available. Provided factor markets are competitive, one replaces the first order conditions which equate marginal cost to marginal revenue with a system of first order conditions of the form

$$\left[P + y_{t,i}\frac{\partial P}{\partial Y}\right]\frac{\partial f_i(z)}{\partial z_k} = w_k$$

where $f_i$ is firm i's production function, z is a vector of inputs, and $w_k$ is the price of input k.
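The identification argument for the conduct parameter can be illustrated with a small numerical sketch. It is not from the original notes and makes several loud assumptions: `theta` denotes the conjectured response $\partial Y/\partial y_i$ that multiplies the demand slope, the demand parameters `gamma` and `alpha2` are treated as already estimated, the supply error is set to zero, and outputs are drawn arbitrarily rather than solved from an equilibrium. In real data $y_{t,i}$ is endogenous and the regression below would have to be replaced by an IV or GMM step.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 500

# Demand-side slope parameters, assumed known from the demand equation.
gamma, alpha2 = 1.0, 0.5
# Hypothetical supply-side values to be recovered.
theta_true, delta_true, beta_true = 1.0, 0.3, 2.0

x = rng.uniform(0.0, 2.0, size=T)   # demand shifter that also rotates demand
w = rng.uniform(1.0, 3.0, size=T)   # cost shifter
y = rng.uniform(0.5, 1.5, size=T)   # firm output, taken as given in this sketch

# Noise-free first-order condition:
# P = (theta * (gamma + alpha2 * x) + delta) * y + beta * w
P = (theta_true * (gamma + alpha2 * x) + delta_true) * y + beta_true * w

# Regress P on y, x*y and w. The coefficient on y is theta*gamma + delta,
# the coefficient on x*y is theta*alpha2.
X = np.column_stack([y, x * y, w])
coef_y, coef_xy, coef_w = np.linalg.lstsq(X, P, rcond=None)[0]

theta_hat = coef_xy / alpha2            # uses the demand-rotation term
delta_hat = coef_y - theta_hat * gamma  # slope of marginal cost
```

Without the `x * y` column only `coef_y` is available, and a single number cannot be split into the conduct term and the marginal cost slope, which is the non-identification result stated in the text.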

3

Oligopoly Pricing in Differentiated Markets.

Products produced by different firms are often differentiated by their characteristics or attributes. A useful classification is to distinguish between horizontal and vertical differentiation.
1. Horizontal: attributes are location or type. Examples: ice cream vendors on a beach are differentiated by their location; dark or light beer; different cereals.
2. Vertical: attributes make one of the products better than another, i.e., quality, durability. Examples: batteries - Duracell vs Energizer; diamonds - number of carats.

The key modeling difference lies in the specification of consumer preferences. If the product space is vertically differentiated, consumers have the same ordering over products: they unanimously agree on which product or brand is better (i.e., there is a "quality" ladder). On the other hand, if the product space is horizontally differentiated, consumers have diverse preferences over products: some will prefer A to B and others will prefer B to A. There is no agreement on which product or brand is preferred. For example, bathers on a beach will prefer ice cream vendor A to B if they are closer to A than B and B to A if they are closer to B. In this course, we focus primarily on horizontal differentiation.

3.1

Models with Fixed Number of Products

When the number of products is fixed, a simple approach to modeling differentiation is to specify demand functions that vary continuously with respect to own and rivals' prices. Suppose there are two goods with demand functions

$$q_1(p_1, p_2) = a - bp_1 + cp_2; \quad q_2(p_1, p_2) = a - bp_2 + cp_1; \quad b > 0,\ b > |c|.$$

The two goods are substitutes if c > 0 and complements if c < 0. The assumption that b > |c| implies that own price effects dominate cross price effects. It guarantees invertibility of the system. An equivalent representation of the system of demands is given by the inverse demands

$$p_1(q_1, q_2) = \alpha - \beta q_1 - \gamma q_2; \quad p_2(q_1, q_2) = \alpha - \beta q_2 - \gamma q_1,$$

where

$$a = \frac{\alpha(\beta - \gamma)}{\beta^2 - \gamma^2}, \quad b = \frac{\beta}{\beta^2 - \gamma^2}, \quad c = \frac{\gamma}{\beta^2 - \gamma^2}.$$

Remark: The restriction b > |c| rules out perfect substitutes and complements. In the former case, demands are given by $q_i = a - b\min[p_1, p_2]$, i = 1, 2 and, in the latter case, $q_i = a - b(p_1 + p_2)$, i = 1, 2. In both cases, the inverse demands do not exist.

The best reply functions in the pricing game are given by

$$p_i = \frac{a + cp_j}{2b}, \quad i = 1, 2,\ i \neq j$$

and, in the quantity game, they are given by

$$q_i = \frac{\alpha - \gamma q_j}{2\beta}, \quad i = 1, 2,\ i \neq j.$$

As we shall see in later lectures, the effect of market power on prices and quantities in differentiated product markets will depend critically upon whether best reply functions slope upward or downward.

Slope of best replies    Quantity game     Price game
Substitutes              negative slope    positive slope
Complements              positive slope    negative slope

Exercise: Show that competition between two firms is soft when best reply curves slope downward and tough when they slope upward.
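As a numerical companion to the linear duopoly, the sketch below checks the mapping between the direct and inverse demand parameters and computes the symmetric equilibria of the two games. The parameter values are arbitrary assumptions, marginal cost is taken to be zero as in the best replies above, and the variable names are introduced only for this illustration.

```python
import numpy as np

# Inverse-demand parameters (substitutes, since gamma > 0); assumed values.
alpha, beta, gamma = 10.0, 2.0, 1.0
den = beta**2 - gamma**2
a, b, c = alpha * (beta - gamma) / den, beta / den, gamma / den

# The direct and inverse systems should agree at any quantity pair.
q = np.array([1.5, 2.5])
p = alpha - beta * q - gamma * q[::-1]
q_back = a - b * p + c * p[::-1]

# Symmetric Nash equilibria with zero marginal cost.
p_bertrand = a / (2 * b - c)            # fixed point of p = (a + c p) / (2b)
q_cournot = alpha / (2 * beta + gamma)  # fixed point of q = (alpha - gamma q) / (2 beta)
p_cournot = alpha - (beta + gamma) * q_cournot

# Slopes of the best replies in each game.
slope_price = c / (2 * b)
slope_quantity = -gamma / (2 * beta)
```

For substitutes the price best reply slopes up and the quantity best reply slopes down, matching the first row of the table, and with these numbers the quantity game ends at a higher price than the pricing game.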

3.2

Turnpike Model

Consider a "private" road from point A to point B of length 1. Ownership of the road is divided among N individuals who each own 1/N of the road. Let $p_i$ denote the toll that individual i charges travelers for driving on its section of the road. Operating costs are zero. Total demand for traveling on the road is given by

$$D(P) = a - bP \quad \text{where} \quad P = \sum_{i=1}^N p_i.$$

Given the tolls charged by the other owners, $p_{-i} = (p_1, .., p_{i-1}, p_{i+1}, .., p_N)$, individual i's profits are

$$\pi(p_i, p_{-i}) = p_i\Big(a - bp_i - b\sum_{j \neq i} p_j\Big).$$

Differentiating with respect to $p_i$ and setting the derivative equal to zero yields the best reply function for individual i:

$$p_i = \frac{a - b\sum_{j \neq i} p_j}{2b}.$$

Assuming symmetry, the equilibrium price for each individual is

$$p^* = \frac{a}{(N + 1)b},$$

which yields equilibrium market price and quantity

$$P^* = \frac{aN}{(N + 1)b}, \quad Q^* = \frac{a}{N + 1}.$$

Equilibrium profit of each individual is

$$\pi^* = \frac{a^2}{b(N + 1)^2}.$$

Note that, in this equilibrium, price is above the monopoly price $a/(2b)$ whenever N > 1, since $P^*$ increases in N. In fact, in the limit, as $N \to \infty$, $P^* \to a/b$, the price at which demand falls to zero. Consumers are better off with a monopoly than with competition!
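The turnpike equilibrium can be verified by solving the system of best replies directly. The sketch below is an illustration under assumed values of a and b; the function name `turnpike_equilibrium` is hypothetical.

```python
import numpy as np

def turnpike_equilibrium(a, b, N):
    """Solve the N best-reply equations 2b*p_i + b*sum_{j != i} p_j = a."""
    # Matrix with 2b on the diagonal and b off the diagonal.
    A = b * (np.eye(N) + np.ones((N, N)))
    tolls = np.linalg.solve(A, a * np.ones(N))
    total_price = tolls.sum()
    quantity = a - b * total_price
    return tolls, total_price, quantity

a, b = 12.0, 2.0
results = {N: turnpike_equilibrium(a, b, N) for N in (1, 2, 5, 50)}
monopoly_price = results[1][1]   # N = 1 is the single-owner case
choke_price = a / b              # price at which demand is zero
```

With a single owner the total toll equals the monopoly price a/(2b); as N grows the total toll rises toward the choke price, which is the sense in which fragmentation of ownership hurts consumers here.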

3.3

Models of Horizontal Differentiation

The problem with the above model of product differentiation is that it is not a good model for addressing questions about the number and variety of products. To study these issues, we typically use a location model. We consider three such models: linear city, circular city, and the Dixit-Stiglitz-Spence model.

3.3.1

Linear City

Consider a beach or street of unit length (say a mile) with a vendor located at each end. Vendor 1 is located at the left endpoint, which we denote as 0, and vendor 2 is located at the right endpoint, which we denote as 1. Each vendor i posts a price $p_i$ at which it is willing to sell. For simplicity, marginal costs of production are zero. Location costs are fixed and sunk.

The Primitives
• M potential consumers are distributed continuously and uniformly along the beach. The uniform distribution means that the number of customers in any section of the beach of length z (measured as a fraction of a mile) is equal to $M \cdot z$. (Often, M is normalized to be equal to 1.) Each consumer is indexed by her location on the beach relative to 0. Thus, consumer x is the person who is located x miles from 0, where x lies between 0 and 1.
• Transport costs are quadratic in distance traveled.
• Each consumer wants only one unit. The utility of consumer x is given by

$$u(x) = \begin{cases} s - p_1 - tx^2 & \text{if } x \text{ purchases from vendor 1} \\ s - p_2 - t(1 - x)^2 & \text{if } x \text{ purchases from vendor 2} \\ 0 & \text{otherwise.} \end{cases}$$

• Production costs are zero.

Let $\tilde{x}_i$ denote the marginal consumer to vendor i. Given the locations of the vendors, at any pair of prices $(p_1, p_2)$, $\tilde{x}_i$ is either someone who is indifferent between buying from vendor 1 and buying from vendor 2, or someone who is indifferent between buying from the lowest cost vendor and not buying at all.

Case 1: Local monopoly
Suppose there is no overlap in the market coverage of the two vendors when they set price equal to the monopoly price. In that case, there is an interval of consumers who do not purchase from either vendor, and the marginal consumer for each vendor is someone who is indifferent between buying and not buying from them. Let $\tilde{x}_1$ denote the marginal consumer for vendor 1. By definition, $\tilde{x}_1$ is determined by the following equation:

$$s - p_1 - t\tilde{x}_1^2 = 0 \implies \tilde{x}_1(p_1) = \sqrt{(s - p_1)/t}.$$

Demand facing Vendor 1 is given by

$$D_1(p_1, p_2) = \int_0^{\tilde{x}_1(p_1)} dz = \tilde{x}_1(p_1).$$

The marginal consumer for vendor 2 is determined by

$$s - p_2 - t(1 - \tilde{x}_2)^2 = 0 \implies \tilde{x}_2(p_2) = 1 - \sqrt{(s - p_2)/t}.$$

Similarly, demand facing vendor 2 is

$$D_2(p_1, p_2) = \int_{\tilde{x}_2(p_2)}^1 dz = 1 - \tilde{x}_2(p_2).$$

Vendor 1's problem is to choose price to maximize

$$\pi(p_1) = p_1\sqrt{(s - p_1)/t}\,M.$$

The solution is $p_1^M = 2s/3$. Vendor 2's problem is identical to that of Vendor 1, so it sets the same price. The marginal consumer for vendor 1 is

$$\tilde{x}_1 = \sqrt{s/3t},$$

and it is $1 - \tilde{x}_1$ for Vendor 2. Therefore, if s < 3t/4, then $\tilde{x}_1 < 1/2$ and each vendor enjoys a local monopoly.

Case 2: Duopoly
Suppose s > 3t/4. If the two vendors try to set price equal to the monopoly price, everyone buys from either vendor 1 or vendor 2; there is no one who does not buy. In this case, the marginal consumer is someone who is indifferent between buying from vendor 1 and buying from vendor 2. Let $\hat{x}$ denote her location. It solves the following equation:

$$p_1 + t\hat{x}^2 = p_2 + t(1 - \hat{x})^2 \implies \hat{x} = (p_2 - p_1 + t)/2t.$$

Notice that the location of the marginal consumer depends upon the prices of both vendors. They compete for this person. Vendor 1's problem is to choose price to maximize

$$\pi_1(p_1, p_2) = p_1\hat{x}M = p_1[(p_2 - p_1 + t)/2t]M.$$

Vendor 2's problem is to choose her price to maximize

$$\pi_2(p_1, p_2) = p_2(1 - \hat{x})M = p_2[(p_1 - p_2 + t)/2t]M$$

with best reply function

$$p_2 = (p_1 + t)/2,$$

and symmetrically $p_1 = (p_2 + t)/2$ for vendor 1. The Nash equilibrium prices are $p_1 = p_2 = t$. The marginal consumer is $\hat{x} = 1/2$. Equilibrium profits are $\pi_1 = \pi_2 = tM/2$.

1. This model is easily extended to allow for different vendor locations (see problem set) and more vendors. With more than two vendors, each vendor competes directly only with its immediate neighbours but not with vendors further along the street. In that sense, competition is local.
2. The model can also be extended to allow each vendor to have more than one location (i.e., product).
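Both cases of the linear city can be checked numerically. The sketch below iterates the duopoly best replies and locates the local-monopoly price by a grid search. All numbers are assumptions for illustration; the duopoly part takes for granted that s is large enough for every consumer to buy.

```python
import numpy as np

t, M = 1.0, 1.0

def best_reply(p_other, t):
    """Duopoly best reply p_i = (p_j + t) / 2 with quadratic transport costs."""
    return (p_other + t) / 2

# Case 2 (duopoly): iterate best replies from an arbitrary starting point.
p1 = p2 = 0.0
for _ in range(60):
    p1, p2 = best_reply(p2, t), best_reply(p1, t)
x_hat = (p2 - p1 + t) / (2 * t)   # location of the indifferent consumer
profit1 = p1 * x_hat * M

# Case 1 (local monopoly): s below 3t/4, maximize p * sqrt((s - p)/t) * M.
s_low = 0.6
grid = np.linspace(0.0, s_low, 60_001)
monopoly_profit = grid * np.sqrt((s_low - grid) / t) * M
p_mono = grid[np.argmax(monopoly_profit)]
x_tilde = np.sqrt((s_low - p_mono) / t)   # reach of vendor 1's market
```

The iteration contracts toward p = t because each best reply has slope 1/2, and the grid search returns a price close to 2s/3 with a market reach below 1/2, so the two vendors' markets do not overlap.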

3.3.2

Location

We can endogenize the location of the vendors using a two-stage game in which they first choose locations (i.e., pay the fixed cost associated with location) and then compete in prices. Clearly, in making their location choices, the vendors will want to anticipate the kind of prices that will occur in the second stage. Will the vendors locate as far from each other as possible (maximal differentiation) or as close as possible (minimal differentiation)?

Assume that vendor 1 is located at point a and vendor 2 is located at point 1 − b where 1 − b ≥ a. In other words, b measures the distance vendor 2 is located from 1. Thus, a + b = 1 corresponds to minimal differentiation and a + b = 0 to maximal differentiation. We solve the two stage game by working backwards. Given any pair of locations a and 1 − b, the demand functions are

$$D_1(p_1, p_2) = \tilde{x} = a + \frac{1 - a - b}{2} + \frac{p_2 - p_1}{2t(1 - a - b)}$$

$$D_2(p_1, p_2) = 1 - \tilde{x} = b + \frac{1 - a - b}{2} + \frac{p_1 - p_2}{2t(1 - a - b)}.$$

The Nash equilibrium prices given locations are

$$p_1^*(a, b) = t(1 - a - b)\Big(1 + \frac{a - b}{3}\Big)$$

$$p_2^*(a, b) = t(1 - a - b)\Big(1 + \frac{b - a}{3}\Big).$$

Substituting the equilibrium prices into the vendors' profit functions yields the reduced form profit functions

$$\pi_i(a, b) = p_i^*(a, b)D_i(a, b, p_1^*(a, b), p_2^*(a, b))$$

for i = 1, 2. Now consider the first stage choice of vendor 1. Given b, she chooses a to maximize her profits. Differentiating with respect to a and applying the envelope theorem, we obtain

$$\frac{\partial\pi_1}{\partial a} = D_1\frac{\partial p_1^*}{\partial a} + p_1^*\left[\frac{\partial D_1}{\partial a} + \frac{\partial D_1}{\partial p_1}\frac{\partial p_1^*}{\partial a} + \frac{\partial D_1}{\partial p_2}\frac{\partial p_2^*}{\partial a}\right] = p_1^*\left[\frac{\partial D_1}{\partial a} + \frac{\partial D_1}{\partial p_2}\frac{\partial p_2^*}{\partial a}\right].$$

Using the above equations, one can verify that

$$\frac{\partial D_1}{\partial a} = \frac{3 - 5a - b}{6(1 - a - b)}, \quad \frac{\partial D_1}{\partial p_2}\frac{\partial p_2^*}{\partial a} = \frac{a - 2}{3(1 - a - b)}.$$
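The two derivative expressions above can be checked against a finite-difference derivative of the reduced form profit. The sketch below is only a verification aid; the function names and the evaluation points are assumptions introduced for the example.

```python
def eq_prices(a, b, t=1.0):
    """Second-stage equilibrium prices for locations a and 1 - b."""
    p1 = t * (1 - a - b) * (1 + (a - b) / 3)
    p2 = t * (1 - a - b) * (1 + (b - a) / 3)
    return p1, p2

def demand1(a, b, p1, p2, t=1.0):
    """Vendor 1's demand (market size normalized to one)."""
    return a + (1 - a - b) / 2 + (p2 - p1) / (2 * t * (1 - a - b))

def reduced_profit1(a, b, t=1.0):
    p1, p2 = eq_prices(a, b, t)
    return p1 * demand1(a, b, p1, p2, t)

def numeric_derivative(a, b, t=1.0, h=1e-6):
    """Central difference of the reduced-form profit with respect to a."""
    return (reduced_profit1(a + h, b, t) - reduced_profit1(a - h, b, t)) / (2 * h)

def envelope_derivative(a, b, t=1.0):
    """p1* times the sum of the direct and strategic effects in the text."""
    p1, _ = eq_prices(a, b, t)
    direct = (3 - 5 * a - b) / (6 * (1 - a - b))
    strategic = (a - 2) / (3 * (1 - a - b))
    return p1 * (direct + strategic)

points = [(0.0, 0.0), (0.2, 0.1), (0.3, 0.3), (0.1, 0.4)]
checks = [(numeric_derivative(a, b), envelope_derivative(a, b)) for a, b in points]
```

At each of the sampled location pairs the two numbers agree and are negative, since the sum of the two effects simplifies to $-(1 + 3a + b)/(6(1 - a - b))$.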

Combining the two expressions, we obtain the result that $\partial\pi_1/\partial a < 0$. This implies that vendor 1 wants to locate as far as possible from vendor 2. By symmetric reasoning, the same is true for vendor 2. Therefore the subgame perfect equilibrium outcome is $a^* = 0$, $b^* = 0$, $p_1^* = p_2^* = t$.

Basic conflict: (1) locate where the demand is $\implies$ move to the center; (2) stay away from the competition $\implies$ go to the ends. Which force dominates depends upon the transport cost function. In the quadratic case, the second force dominates $\implies$ maximal differentiation. In the linear case, the first dominates $\implies$ minimal differentiation, except that in this case there may be no equilibrium in pure strategies for certain locations. The problem is that the demand function may not be continuous and, as a result, the profit function is discontinuous and nonconcave.

To understand why this may be the case, suppose consumer x belongs to vendor 2's turf, that is, x lies to the right of 1 − b. Her choice is to pay $p_1 + t(x - a)$ or $p_2 + t(x - (1 - b))$ depending upon whether she buys from vendor 1 or 2. The distance that she has to travel to vendor 2 is common to both options: she also has to travel this segment if she buys from vendor 1. That is, $x - a = [x - (1 - b)] + [(1 - b) - a]$. Thus, her choice between the two vendors turns on whether $p_1 + t(1 - b - a) \gtrless p_2$. Furthermore, if the inequality holds for x, it also holds for all z ≥ 1 − b. Consequently, if $p_1$ falls below $p_2 - t(1 - b - a)$, vendor 1 experiences a large increase in demand, which can lead to existence problems. Note: this problem does not arise when firms are located at either end of the linear segment.

3.3.3

Circular City

The objective of the linear city was to study issues of product choice. The purpose of the circular city is to address the issue of product variety, i.e., the number of firms.

The Primitives
• Consumers are distributed uniformly on a circle with circumference 1.
• Each consumer demands one unit.
• Transport costs are linear; t per unit of distance traveled.
• n firms are symmetrically located around the circle at distance 1/n apart.
• Entry cost is f; production cost is c per unit.

What are the equilibrium prices? Assume that t is sufficiently large that a symmetric pure strategy equilibrium exists in which each firm services the immediate vicinity of its location. Then each firm i competes only with the neighboring firms i − 1 and i + 1. Assume that these two firms charge a common price p. Then the marginal consumers on either side of firm i are located the same distance from firm i's location, some $x \in (0, \frac{1}{n})$ where x satisfies $p_i + tx = p + t(\frac{1}{n} - x)$. Thus, demand for firm i is

$$D_i(p_i, p) = 2x = \frac{1}{t}\Big[p + \frac{t}{n} - p_i\Big].$$

Firm i chooses $p_i$ to maximize $\pi_i(p_i, p) = (p_i - c)D_i(p_i, p)$, which yields the best reply $p_i = \frac{1}{2}[p + c + \frac{t}{n}]$. Imposing symmetry yields the symmetric equilibrium price and profit,

$$p^* = \frac{t}{n} + c, \quad \pi^* = \frac{t}{n^2}.$$

Thus, the profit margin declines with n. To derive the equilibrium number of firms, we impose the zero profit condition. Ignoring integer problems, this leads to the following calculation:

$$\pi^* - f = 0 \implies n^* = \sqrt{t/f} \implies p^{**} = \sqrt{tf} + c.$$

As entry costs go to zero, the number of firms goes to infinity and each consumer gets the product at marginal cost with zero transport cost.

Does the market generate too much or too little product variety? Consider the social planner's problem. She will want to choose n to minimize the sum of fixed costs and consumers' transport costs since the gross consumer surplus is always s (assuming it is optimal for all consumers to purchase a unit). Therefore the social planner solves

$$\min_n\Big\{nf + t(2n)\int_0^{1/2n} x\,dx\Big\} \iff \min_n\Big\{nf + \frac{t}{4n}\Big\}.$$

Differentiating and solving for the socially optimal number of products, $\tilde{n}$, yields

$$\tilde{n} = \sqrt{\frac{t}{4f}} = \frac{n^*}{2}.$$
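The free-entry and planner calculations above can be reproduced in a few lines. This is a sketch under assumed values of t, f and c, ignoring the integer constraint as the text does; the function names are hypothetical.

```python
import math

def circular_city(t, f, c):
    """Free-entry number of firms, planner's optimum, and free-entry price."""
    n_free = math.sqrt(t / f)            # from the zero-profit condition t/n^2 = f
    n_planner = math.sqrt(t / (4 * f))   # minimizes n*f + t/(4n)
    price = c + math.sqrt(t * f)
    return n_free, n_planner, price

def social_cost(n, t, f):
    """Fixed costs plus aggregate transport costs with n equally spaced firms."""
    return n * f + t / (4 * n)

t, f, c = 1.0, 0.01, 0.5
n_free, n_planner, price = circular_city(t, f, c)
profit_at_free_entry = t / n_free**2 - f
excess_cost = social_cost(n_free, t, f) - social_cost(n_planner, t, f)
```

With these numbers free entry supports twice as many firms as the planner would choose, and the extra fixed costs outweigh the savings in transport costs.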

Conclusion 1 The market generates too much diversity. Intuition: entrants do not take into account the impact of their entry on the profits of incumbents.

Remark 1 We have assumed equi-distant spacing. This can be justified if transport costs are quadratic (Economides (1984)).

3.3.4

Dixit-Stiglitz-Spence Model

The Hotelling models assumed that each product competes directly only with its neighbors. An alternative model of the number of firms assumes that each product competes symmetrically with all of the products in the market.

The Primitives
• Consumers have CES preferences over a numeraire good $q_0$ and n differentiated goods given by $U(q_0, (\sum_{i=1}^n q_i^\sigma)^{1/\sigma})$, $\sigma \leq 1$. The consumer maximizes utility subject to the budget constraint $q_0 + \sum_{i=1}^n p_i q_i = I$.
• Each firm produces one good; production involves a fixed cost f and marginal cost c.

To compute demands, substitute out $q_0$ from the utility function using the budget equation and differentiate. Writing $U_1$ and $U_2$ for the partial derivatives of U with respect to its first and second arguments, this yields n first order conditions of the form

$$U_1 p_i - U_2\Big(\sum_{j=1}^n q_j^\sigma\Big)^{\frac{1}{\sigma} - 1} q_i^{\sigma - 1} = 0, \quad i = 1, .., n.$$

Now suppose that n is sufficiently large that changes in consumption of individual products have little effect on the partial utility of the quantity index of differentiated goods. Then

$$q_i \cong kp_i^{-\frac{1}{1 - \sigma}}, \quad \text{where } k \equiv \frac{U_1}{U_2}.$$

Firm i chooses $p_i$ to

$$\max_{p_i}\ (p_i - c)q_i(p_i) - f.$$

Differentiating,

$$(p_i - c)\Big(-\frac{1}{1 - \sigma}\Big)kp_i^{-\frac{1}{1 - \sigma} - 1} + kp_i^{-\frac{1}{1 - \sigma}} = 0 \implies p_i = \frac{c}{\sigma}.$$

Note that the equilibrium prices are independent of n for n large (since k is being treated as a constant). To determine $n^*$, $q^*$, we use the zero-profit condition and the first-order condition:

$$\Big(\frac{c}{\sigma} - c\Big)q^* = f$$

$$\frac{c}{\sigma} = \frac{U_2}{U_1}n^{\frac{1}{\sigma} - 1}.$$

Let q ∗ and n∗ denote the solution to the two equations. It can be shown that this solution is also the solution to the social planner’s problem. Thus, in this case, the competitive market gives the optimal solution.
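The constant markup result can be checked by maximizing the firm's variable profit on a grid. The sketch below assumes particular values for $\sigma$, c, f and the demand constant k, which is held fixed as in the large-n approximation; the function name is hypothetical.

```python
import numpy as np

sigma, c, f = 0.5, 1.0, 0.5

def variable_profit(p, k):
    """(p - c) * q(p) with the constant-elasticity demand q = k * p**(-1/(1 - sigma))."""
    return (p - c) * k * p ** (-1.0 / (1.0 - sigma))

grid = np.linspace(c, 10 * c, 900_001)
p_star_k3 = grid[np.argmax(variable_profit(grid, k=3.0))]
p_star_k7 = grid[np.argmax(variable_profit(grid, k=7.0))]

markup_price = c / sigma                   # closed form from the first-order condition
q_star = f / (markup_price - c)            # output per firm from the zero-profit condition
```

The maximizing price does not move when k changes, which is the sense in which the price is independent of the number of firms once k is treated as a constant.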

4

Estimating Demands in Differentiated Product Market

(This chapter is based on Lecture Notes of Ariel Pakes, Aviv Nevo, and on "Estimating Discrete Choice Models of Product Differentiation" by Steve Berry (RAND, 1994).)

The methodology for measuring markups in differentiated product markets is the same as in the homogenous good market: we need estimates of demand and supply. One way to proceed is to specify the system of demand equations

$$q = D(p; x)$$

where q is a J dimensional vector of quantities demanded of J products, p is a J dimensional vector of prices of the products, and x is a vector of exogenous variables that shift demand. One issue is to choose functional forms which are consistent with choice theory (e.g., translog models, Almost Ideal Demand System, Linear expenditure model) and sufficiently flexible to accommodate the data. Previous work on consumer theory has explored this issue at great length. However, these studies typically work at a much higher level of product aggregation than industrial organization economists do.

The problem that arises in applying these methods to estimating demands in differentiated product markets is the dimensionality problem. The number of products, J, is frequently quite large (e.g., think of the number of types of cars, or brands of beer, or varieties of cereals). When the number of products is large, there are too many parameters to estimate. For example, in a linear demand system, the number of parameters is on the order of $J^2$. Theoretical restrictions such as symmetry of the Slutsky matrix and adding up restrictions can reduce the number of parameters but it is still proportional to $J^2$.

There are essentially three solutions to the dimensionality problem:
1. Avoid the problem by focusing on an aggregate (e.g., Porter aggregates all eastbound shipments rather than differentiating across cities), or on a narrowly defined product (Borenstein and Shepard look at self-service 87 octane), or on a submarket (Baker and Bresnahan examine a particular segment of the beer industry). The issue here is: do you need to estimate the full demand system to answer the question in which you are interested?
2. Impose structure on preferences such as symmetry or separability to restrict the substitution patterns across products, reducing the number of parameters to be estimated.
3. Use discrete choice models in which products are defined as bundles of a limited number of characteristics and define preferences over characteristics.

4.1

Traditional Models

One widely used specification of preferences which you have already encountered is the CES utility function of Dixit/Stiglitz/Spence,

$$U(q_1, ..., q_J) = \Big(\sum_{j=1}^J q_j^\sigma\Big)^{1/\sigma}$$

which yields demand functions

$$q_k = \frac{p_k^{-1/(1 - \sigma)}}{\sum_{j=1}^J p_j^{-\sigma/(1 - \sigma)}}\,I, \quad k = 1, .., J,$$

where I denotes the consumer's income. Here $\sigma$ is the parameter that measures substitution across products. The dimensionality problem is solved by assuming symmetry, and the number of parameters to be estimated is one. The cost of using the CES utility function is that it imposes strong (and implausible) restrictions on own and cross-price elasticities. In particular,

$$\frac{\partial q_i}{\partial p_j}\frac{p_j}{q_i} = \frac{\partial q_k}{\partial p_j}\frac{p_j}{q_k}, \quad \text{for all } i, k \neq j.$$

That is, cross-price elasticities are equal. An alternative to the CES is

$$U(q_1, .., q_J) = \sum_{j=1}^J \delta_j q_j - \sum_{j=1}^J q_j\ln q_j$$

which Anderson, de Palma, and Thisse (1992) have shown yields the Logit demands

$$s_k = \frac{\exp\{\delta_k - p_k\}}{\sum_{j=1}^J \exp\{\delta_j - p_j\}}$$

where $s_k$ denotes the budget share of product k. Estimation involves only J parameters. It allows for richer substitution patterns than the CES but they are still quite restrictive. In particular, if the price of good i increases, then the ratio $q_j/q_k$ remains constant, for all $j, k \neq i$. The reason can be seen from the utility function. The first term implies that the consumer consumes only the product with the highest $\delta_j - p_j$. The second term expresses variety-seeking desires. It implies that the consumer always purchases a positive amount of every good. But, as in the CES utility function, all products enter this term in a symmetric way, which accounts for the cross-elasticity restrictions.
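The restriction on substitution patterns in the Logit demands can be seen in a short computation. The values of $\delta$ and p below are assumptions chosen for illustration, and `logit_shares` is a hypothetical helper name.

```python
import numpy as np

def logit_shares(delta, p):
    """Logit shares s_k = exp(delta_k - p_k) / sum_j exp(delta_j - p_j)."""
    v = delta - p
    e = np.exp(v - v.max())   # subtracting the max leaves the shares unchanged
    return e / e.sum()

delta = np.array([1.0, 0.5, 0.2])
p = np.array([1.0, 0.8, 0.6])

s_before = logit_shares(delta, p)

p_new = p.copy()
p_new[0] += 0.5               # raise the price of the first good only
s_after = logit_shares(delta, p_new)

ratio_before = s_before[1] / s_before[2]
ratio_after = s_after[1] / s_after[2]
```

The first good loses share and the other two gain in proportion to their initial shares, so the ratio between them is unchanged regardless of how similar either is to the good whose price rose.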

A different and more general approach to solving the dimensionality problem is to divide the products into smaller groups and allow for a flexible functional form within each group. This approach relies upon two ideas known in the literature (see Deaton and Muellbauer) as separability and multi-stage budgeting. Preferences are said to be (weakly) separable if the commodity vector q can be partitioned into $(q^A, q^B, .., q^N)$ such that

$$u = f(v_A(q^A), v_B(q^B), ...., v_N(q^N)),$$

where $v_K(q^K)$ is a subutility function (i.e., represents a preference ordering over $q^K$), and f is any function that is increasing in all of its arguments. It follows that maximization of u implies maximization of the subutilities subject to whatever is optimally spent on the groups. In other words, break up the optimization problem into two steps. First, maximize u given group expenditures $(I^A, .., I^N)$. This involves solving N optimization problems of the form

$$\max_{q^K} v_K(q^K) \quad \text{s.t.} \quad p^K q^K = I^K,$$

where $p^K$ is the vector of prices of the commodities $q^K$. The solution to each of these subproblems consists of subgroup demands $q_i^K = g_i^K(p^K, I^K)$. The second step consists of determining an allocation of income over the commodity groups. That is, after substituting the subgroup demands into the utility function, the household solves

$$\max_{(I^A, ..., I^N)} f(\psi_A(p^A, I^A), ..., \psi_N(p^N, I^N)) \quad \text{s.t.} \quad \sum_{K=A}^N I^K = x.$$

Weak separability is necessary and sufficient for the first step. Thus, if preferences are separable over time periods, commodity demand functions conditional on $x$ and $p$ for each time period are guaranteed to exist. Similarly, if goods are separable from leisure, commodity demand functions of the usual type are justified. Separability can be tested since it imposes restrictions on the substitution matrix. Of course, subgroup demand functions are only a part of what the applied econometrician needs from separability. Just as important is the question of whether it is possible to justify demand functions for composite commodity groups in terms of total expenditures, $x$, and price indices. For example, we often estimate demands for food, shelter, clothing and entertainment, each of which is a commodity composite, in terms of a food price index, a housing price index, a clothing price index and an entertainment price index. Suppose for example that

$$\psi^K(p^K, I^K) = \frac{I^K}{b^K(p^K)},$$

where $b^K$ is a linearly homogeneous function. Then $b^K(p^K)$ can be interpreted as the commodity $K$ price index and, defining $v_K \equiv \psi^K(p^K, I^K)$, the above optimization problem is equivalent to

$$\max_{(v_1, \dots, v_N)}\ f(v_1, \dots, v_N) \quad \text{s.t.} \quad \sum_{K=1}^{N} b^K(p^K)\, v_K = x.$$

Here $v_K$ can be interpreted as composite commodity group $K$. The above specification imposes a lot of restrictions on demands within each group. Essentially, the subutility functions have to be homothetic, which implies that the budget shares of each good within the group are independent of the expenditure on that group. This rules out groups that contain both luxuries and necessities. A more attractive alternative is to assume that preferences are additively separable across groups, that is,

$$u = \sum_{K=1}^{N} v^K(q^K).$$

When this is the case, we can weaken the restrictions on $v^K$: the subutility functions can be quasi-homothetic. In particular, the indirect utility function $\psi^K$ can take the generalized Gorman polar form,

$$\psi^K(p^K, I^K) = f^K\!\left(\frac{I^K}{b^K(p^K)}\right) + a^K(p^K),$$

where $f^K$ is monotone increasing. This permits very general forms of Engel curves for the individual commodities comprising each group. Excellent examples of empirical studies that exploit multi-stage budgeting are Hausman, Leonard, and Zona (1994) and Hausman (1996). They construct a multi-level demand system with three levels: the top level corresponds to overall demand for the product (e.g., beer or ready-to-eat cereal); the middle level involves demand for different market segments (e.g., in beer, lager, pilsner, and ales; in cereals, family, kids, and adult cereals); and the bottom level involves a system of demands for the different brands in each segment. A flexible parametric functional form has to be assumed for each level, and the choice has to be consistent with the conditions required for multi-stage budgeting. A popular choice for the bottom level is the AIDS model: the demand for brand $j$ within segment $g$ in city $c$ in quarter $t$ is

$$s_{jct} = \alpha_{jc} + \beta_j \log\!\left(\frac{I_{gct}}{P_{gct}}\right) + \sum_{k=1}^{J_g} \gamma_{jk} \log p_{kct} + \varepsilon_{jct}, \quad j = 1, \dots, J_g,\ c = 1, \dots, C,\ t = 1, \dots, T.$$

Here $s_{jct}$ is the segment expenditure share of brand $j$ in city $c$ in quarter $t$, $I_{gct}$ is overall expenditure on segment $g$ in city $c$ in quarter $t$, $P_{gct}$ is the price index for segment $g$ in city $c$ in quarter $t$, and $J_g$ is the number of brands in segment $g$. The AIDS model admits a wide variety of substitution patterns within each segment. It also has two additional advantages: (1) it aggregates well over individuals and (2) it is easy to impose theoretical restrictions such as adding up, homogeneity of degree zero, and symmetry. The price index, $P_{gct}$, can be computed in a number of ways. The Stone logarithmic price index is

$$\log P_{gct} = \sum_{j=1}^{J_g} s_{jct} \log p_{jct},$$

and the Deaton and Muellbauer exact price index is

$$\log P_{gct} = \alpha_0 + \sum_{j=1}^{J_g} \alpha_j \log p_{jct} + \frac{1}{2}\sum_{j=1}^{J_g}\sum_{k=1}^{J_g} \gamma_{jk} \log p_{kct} \log p_{jct}.$$
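As a small illustration of how the Stone index feeds the bottom-level share equation, the sketch below computes $\log P$ as the share-weighted sum of log prices and then forms the real-expenditure regressor $\log(I/P)$. The function name and the three-brand numbers are hypothetical; nothing here comes from the Hausman studies.

```python
import math

def stone_log_price_index(shares, prices):
    """Stone index in logs: log P = sum_j s_j * log p_j."""
    return sum(s * math.log(p) for s, p in zip(shares, prices))

# Hypothetical segment with three brands in one city-quarter.
shares = [0.5, 0.3, 0.2]      # segment expenditure shares (sum to one)
prices = [2.0, 2.5, 4.0]      # brand prices
expenditure = 100.0           # total segment expenditure I

log_P = stone_log_price_index(shares, prices)
log_real_expenditure = math.log(expenditure) - log_P   # regressor log(I / P)
print(log_P, log_real_expenditure)
```

Because the index uses only observed shares and prices, the regressor is data, which is why the share equation is then linear in its parameters.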

With the Stone price index, the estimation can be performed using linear methods; the Deaton and Muellbauer price index requires non-linear methods. The middle level of demand captures the allocation between segments and can be modeled using the AIDS model, in which case the demand specified by the above equation is used with expenditure shares and prices aggregated to the segment level (using either the Stone or the Deaton and Muellbauer index). An alternative is the log-log specification used by Hausman in his two articles:

$$\log q_{gct} = \beta_g \log I_{ct} + \sum_{k=1}^{G} \delta_{gk} \log \pi_{kct} + \alpha_{gc} + \varepsilon_{gct},$$

where $q_{gct}$ is the quantity of the $g$th segment in city $c$ in quarter $t$, $I_{ct}$ is total expenditure on all segments, and $\pi_{kct}$ are the segment price indices computed as above. Note that AIDS satisfies the generalized Gorman polar form, so preferences at the second level should be additively separable in order to be consistent with exact two-stage budgeting. Neither the second-level AIDS nor the log-log system satisfies this requirement. Finally, at the top level, the demand for the whole beer or ready-to-eat cereal category is specified as

$$\log q_{ct} = \beta_0 + \beta_1 \log I_{ct} + \beta_2 \log \pi_{ct} + \theta Z_{ct} + \varepsilon_{ct},$$

where $q_{ct}$ is the overall consumption of beer or cereal in city $c$ in quarter $t$, $I_{ct}$ is real income in city $c$ in quarter $t$, $\pi_{ct}$ is the price index for beer or cereal, and $Z_{ct}$ are variables that shift demand (e.g., demographic and time factors). This specification does satisfy additive separability (recall that Cobb-Douglas demands depend only upon income and own price).

4.2

Discrete Choice Models

Basic Setup:
• Products are bundles of characteristics; $j = 0, \dots, J$.
• Consumer preferences are defined on characteristics; each consumer chooses one unit of the good; $i = 1, \dots, M$.
• Aggregate demand: sum over individual demands; $s_j$ is the market share of good $j$.
• Consumers are heterogeneous; the heterogeneity is parameterized and its distribution specified.


4.2.1

Supply

There are $N$ firms in the market, and each is assumed to be a price setter. Let $t$ index firms and let $J_t$ denote the set of products supplied by firm $t$. Here $p_t$ is the vector of product prices set by firm $t$, and $p_{-t}$ is the vector of prices set by firm $t$'s rivals. Each firm $t$ chooses its prices to

$$\max_{p_t}\ \pi_t(p_t; p_{-t}) = \sum_{j \in J_t} [p_j - mc_j(q_j, w_j, \omega_j)]\, M s_j(x, p, \xi),$$

where $q_j$ is output of product $j$, $w_j$ is a vector of observables affecting the costs of producing $j$, and $\omega_j$ represents the unobserved factors affecting production costs of product $j$. Thus, after dividing through by $M$, the first-order condition for product $j$ is

$$s_j(\cdot) + \sum_{r \in J_t} (p_r - mc_r)\frac{\partial s_r(\cdot)}{\partial p_j} = 0.$$

Although firms may offer multiple products, the products are distinct. Hence, we have $J$ first-order conditions of the form given above. Rewrite the first-order conditions in vector notation as

$$s - \Delta(p - mc) = 0,$$

where $\Delta_{jr} = -\partial s_r/\partial p_j$ if goods $j$ and $r$ are produced by the same firm and $\Delta_{jr} = 0$ otherwise, so the nonzero elements of a row are those for goods produced by the same firm as the row good. We need to specify a functional form for marginal costs. Assume that $\ln mc_j = w_j\gamma + \omega_j$, where $\omega_j$ is a draw from a distribution with mean 0 and is uncorrelated with observed characteristics other than price. Solving the first-order conditions for marginal cost gives $mc = p - \Delta^{-1}s$, and equating,

$$\ln(p - \Delta^{-1}s) - w\gamma = \omega(\theta),$$

where $\theta$ denotes the unknown parameters of marginal costs and marginal revenue. Interact $\omega(\theta)$ with a function of the instruments $(x, w)$ and find the value of $\theta$ that makes the sample moments as close as possible to zero.

4.2.2

Demand

Utility of individual $i$ for product $j$ is given by

$$u_{ij} = U(x_j, p_j, \xi_j, v_i; \theta),$$

where
• $x_j$ = vector of observed characteristics of product $j$
• $\xi_j$ = unobserved characteristics of product $j$
• $p_j$ = price of product $j$
• $v_i$ = unobserved preference characteristics of consumer $i$.

Typically good 0 is the "outside" good. It is the (aggregate) commodity that does not compete with the goods in the industry, and hence whose price or quantity is set exogenously. If there is no outside good, then aggregate demand in the industry is assumed to be the population of buyers ($\leq M$), and one cannot study the effects of product prices on aggregate demand in the industry. One could treat $M$ as a parameter, but for this lecture we will treat it as data. The subset of preferences that lead to the choice of good $j$ is given by

$$A_j(\theta) = \{v \mid u_{ij} > u_{ik},\ \forall k \neq j\}.$$

Letting $f$ denote the density of $v$ in the population of interest, the choice probabilities, which are also the model's predictions for market shares, are

$$s_j(x, p, \xi; \theta) = \int_{v \in A_j(\theta)} f(v)\, dv,$$

where $x = (x_1, \dots, x_J)$ and $p = (p_1, \dots, p_J)$. Total demand is $M s_j(x, p, \xi)$. The choices of each individual are invariant to positive affine transformations of utility. This implies that some normalizations are possible. The usual ones are: (1) normalize the value of the outside good to zero, which is equivalent to subtracting $u_{i0}$ from all $u_{ij}$, and (2) normalize the coefficient of one of the variables to one, which measures utility in units of that variable (e.g., the coefficient on price). Note: estimation requires parametric assumptions on $U$ and $f$.

4.2.3

Generation I Models

Data: $\{s_j^0, p_j, x_j\}_{j=1}^{J}$. Here $s_j^0$ denotes the observed market share of product $j$. The data generating process is a choice model. The choice model determines predicted market shares (or choice probabilities) $\{s_j(\theta)\}_{j=0}^{J}$, where $\theta$ represents the unknown parameters of the utility function and of the distribution of $v_i$. To relate the predicted shares to the data, we assume that each individual is an independent draw, so that a sample of size $M$ generates a multinomial distribution of product choices. Let $m_j$ denote the number of consumers in the sample that select product $j$. Then the likelihood function for the data is

$$L \propto \prod_{j=0}^{J} s_j(\theta)^{m_j}.$$

Taking logs, choose $\theta$ to

$$\max_{\theta}\ M \sum_{j=0}^{J} s_j^0 \log[s_j(\theta)] \iff \min_{\theta}\ \sum_{j=0}^{J} \frac{(s_j^0 - s_j(\theta))^2}{s_j(\theta)}.$$

The equivalence statement follows from taking a Taylor series approximation of $\log s_j(\theta)$ around the point $s_j^0$, so it holds approximately when predicted shares are close to observed shares. The latter criterion is called minimum $\chi^2$. If the observed shares are put in the denominator, it is called modified $\chi^2$.
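The approximate equivalence of the two criteria can be seen numerically. Maximizing $\sum_j s_j^0 \log s_j(\theta)$ is the same as minimizing the shortfall $\sum_j s_j^0 \log(s_j^0/s_j(\theta))$, and to second order that shortfall equals one half of the minimum $\chi^2$ criterion. The sketch below uses hypothetical observed and predicted shares and hypothetical function names; it is an illustration, not an estimator.

```python
import math

def loglik_shortfall(s_obs, s_pred):
    """sum_j s0_j * log(s0_j / s_j): zero when predicted shares equal observed shares."""
    return sum(o * math.log(o / p) for o, p in zip(s_obs, s_pred))

def min_chi2(s_obs, s_pred):
    """Minimum chi-square criterion: sum_j (s0_j - s_j)^2 / s_j."""
    return sum((o - p) ** 2 / p for o, p in zip(s_obs, s_pred))

# Hypothetical observed shares and predicted shares that are close to them.
s_obs = [0.50, 0.30, 0.20]
s_pred = [0.49, 0.31, 0.20]

shortfall = loglik_shortfall(s_obs, s_pred)
half_chi2 = 0.5 * min_chi2(s_obs, s_pred)
print(shortfall, half_chi2)  # nearly identical for small deviations
```

The two numbers drift apart as predicted shares move away from observed shares, which is why the equivalence is a local one.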

Logit. In this model, utility takes the form

$$u_{ij} = x_j\beta - p_j + \epsilon_{ij}.$$

If $\epsilon_{ij}$ is distributed i.i.d. over both products and individuals, with $F(\epsilon) = \exp\{-\exp(-\epsilon)\}$ (known as the extreme value distribution), we have closed-form solutions for market shares. They are

$$s_j = \frac{\exp\{x_j\beta - p_j\}}{\sum_{k=0}^{J} \exp\{x_k\beta - p_k\}}, \quad j = 0, \dots, J.$$

Note that normalizing the utility of the outside good to zero implies

$$s_0 = \frac{1}{1 + \sum_{k=1}^{J} \exp\{x_k\beta - p_k\}}.$$

Thus,

$$\log s_j - \log s_0 = x_j\beta - p_j.$$

The extreme value distribution has no parameters, so $\theta = \beta$.

Vertical. In this model, the utility function takes the form

$$u_{ij} = v_i\varphi_j - p_j, \quad v_i > 0.$$

Here $\varphi_j$ measures the quality of good $j$, which is assumed to be strictly increasing in $j$. Bresnahan assumed no unobserved product characteristics and parameterized quality as $\varphi_j = x_j\beta$. To obtain market shares, first order the goods by price from lowest to highest. For each good $j > 1$, a necessary condition for demand to be positive is that $v\varphi_j - p_j > v\varphi_{j+1} - p_{j+1}$ and $v\varphi_j - p_j > v\varphi_{j-1} - p_{j-1}$. Combining these two inequalities implies

$$\frac{p_j - p_{j-1}}{\varphi_j - \varphi_{j-1}} < v < \frac{p_{j+1} - p_j}{\varphi_{j+1} - \varphi_j}.$$

Let $\Delta_j \equiv (p_j - p_{j-1})/(\varphi_j - \varphi_{j-1})$ for $j > 0$, and define $\Delta_0 = -\infty$ and $\Delta_{J+1} = \infty$. Then a necessary and sufficient condition for demands to be positive for all goods is that $\Delta_j$ is strictly increasing in $j$. The market share of good $j$ is given by

$$s_j = F(\Delta_{j+1}) - F(\Delta_j).$$

Since $\sum_{j=0}^{J} s_j = 1$, normalize $\varphi_0 = 0$. Also, one usually normalizes $p_0 = 0$. Substituting the predicted market shares into the objective function, choose $\beta$ and the parameters of $F$ to minimize the criterion function.
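A minimal sketch of the vertical-model shares follows. The taste distribution $F$ is assumed here to be unit exponential, and the qualities and prices are hypothetical; neither choice comes from the notes. The code builds the cutoffs $\Delta_1, \dots, \Delta_J$, checks that they are strictly increasing, and differences $F$ across adjacent cutoffs.

```python
import math

def exp_cdf(v):
    """Assumed taste distribution for illustration: unit exponential on v > 0."""
    return 1.0 - math.exp(-v) if v > 0 else 0.0

def vertical_shares(phi, p, cdf=exp_cdf):
    """Shares s_j = F(Delta_{j+1}) - F(Delta_j), goods ordered by price.

    phi[0] = p[0] = 0 is the outside good (the normalizations in the text).
    """
    J = len(phi) - 1
    # Cutoffs Delta_1..Delta_J: the taste at which a consumer is indifferent
    # between good j-1 and good j.
    cut = [(p[j] - p[j - 1]) / (phi[j] - phi[j - 1]) for j in range(1, J + 1)]
    if any(b <= a for a, b in zip(cut, cut[1:])):
        raise ValueError("cutoffs must be strictly increasing for all demands to be positive")
    # F(Delta_0) = 0 and F(Delta_{J+1}) = 1 at the two ends.
    F = [0.0] + [cdf(c) for c in cut] + [1.0]
    return [F[j + 1] - F[j] for j in range(J + 1)]

# Hypothetical qualities and prices; index 0 is the outside good.
phi = [0.0, 1.0, 2.0, 3.0]
p = [0.0, 0.5, 1.5, 3.0]
shares = vertical_shares(phi, p)
print(shares)
```

Only adjacent goods compete for the marginal consumer, so a small price change for good $j$ moves $\Delta_j$ and $\Delta_{j+1}$ and nothing else.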

4.2.4

Generation II Models

In the Generation I models, differences between the actual market shares and the choice probabilities can only be due to sampling error. As $M \to \infty$, sampling error vanishes, so the model should fit exactly, with no prediction error. Thus, the model is certain to be rejected by the data. That is, there will not exist a value of $\beta$ such that actual and predicted market shares are equal even when $M$ is very large. We need to develop a theory of the estimation error. Berry develops a model in which actual market shares can differ from predicted shares due to unobserved product characteristics. Assume that the mean utility of product $j$ takes the form

$$\delta_j \equiv x_j\beta - \alpha p_j + \xi_j.$$

This model includes the vertical model described above when $\varphi_j = x_j\beta + \xi_j$, $\alpha = 1$, and $v$ is assumed to be distributed with mean one (which is just a normalization on the units of quality). It also includes the logit model. The basic strategy is simple. Assume that $M$ is large, so that $s_j^0 = s_j(\delta^*)$, where $\delta^*$ solves this system of share equations, of which only $J$ are independent (since $\sum_{j=0}^{J} s_j = 1$). If we assume that $F$ is known, then $\delta^*$ can be treated as a known, nonlinear transformation of the market share data. Using $\{\delta_j^*\}$ as data, we can run the regression

$$\delta_j^*(s^0) = x_j\beta - \alpha p_j + \xi_j,$$

where $\xi_j$ is the unobserved error. Since $p_j$ is a function of $\xi_j$, we will need to find instruments for $p_j$. This is the usual simultaneity problem that arises in estimating supply/demand relations. Available instruments: cost shifters $w$ and characteristics of other products. The identifying moment conditions here are that $\xi_j$ is uncorrelated with these instruments. When $F$ is not known, $\delta^*$ depends upon the parameters $\sigma$ of $F$. In that case, we can continue to use the approach of inverting the system of equations to obtain the mean utilities. For each value of $\theta = (\beta, \sigma)$, there exists a unique solution for $\xi$ that makes the predicted shares equal to the actual shares. Let $\xi(\theta)$ denote this solution. Then use the moment conditions $E[\xi(\theta_0)w] = E[\xi(\theta_0)x] = 0$, where $\theta_0$ denotes the true value, to estimate $\theta$.

Logit Model.

$$\ln s_j - \ln s_0 = \delta_j.$$

In this case, there is no need to compute the $\delta_j$'s numerically, since they are simply equal to the difference in log shares. Simply regress the difference in log market shares on $(x_j, p_j)$ using a simple instrumental variables regression.
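The logit inversion can be illustrated with a round trip: start from mean utilities, compute shares with the outside good normalized to zero, and recover the mean utilities from log share differences. This is a sketch with hypothetical function names and values; in an application the recovered $\delta_j$ would then be the dependent variable of the instrumental variables regression.

```python
import math

def logit_shares_with_outside(delta):
    """Shares for inside goods 1..J; the outside good has mean utility 0."""
    denom = 1.0 + sum(math.exp(d) for d in delta)
    s0 = 1.0 / denom
    return s0, [math.exp(d) / denom for d in delta]

def invert_logit(s0, shares):
    """Mean utilities from observed shares: delta_j = ln s_j - ln s_0."""
    return [math.log(s) - math.log(s0) for s in shares]

# Hypothetical mean utilities for three inside goods.
true_delta = [0.4, -0.3, 1.1]
s0, s = logit_shares_with_outside(true_delta)
recovered = invert_logit(s0, s)
print(recovered)
```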

Vertical Model. Here $\delta_j = \varphi_j - p_j$. Compute the $\varphi_j$'s using the market share data as follows. Recall that $s_j = F(\Delta_{j+1}) - F(\Delta_j)$. Substituting market shares and inverting gives the recursive relation

$$\Delta_j = F^{-1}(F(\Delta_{j+1}) - s_j^0),$$

with initial value $\Delta_J = F^{-1}(1 - s_J^0)$. Normalizing the value and price of the outside good to zero, the values of $\varphi_j$ can be obtained from the recursion

$$\varphi_j = \varphi_{j-1} + (p_j - p_{j-1})/\Delta_j.$$

The final step is to treat the $\varphi_j$'s as data and regress them on product characteristics, $x_j$. Once again, $\xi_j$ is treated as the unobserved error in the regression.

4.2.5

Generation III Models

The main problem with the above models is that the pattern of substitution is too restrictive. In the vertical model, cross-price elasticities are zero for non-neighboring goods as defined by prices. Also, own-price elasticities are often not smaller for high-priced goods, even though one might expect this to be the case. In the logit model, the ratio of quantities demanded of goods $j$ and $k$ (i.e., $s_j/s_k = q_j/q_k$) is independent of the number of alternatives available. This is known as the independence of irrelevant alternatives (IIA) property. Own and cross price effects are

$$\frac{\partial s_j}{\partial p_j} = -s_j(1 - s_j), \qquad \frac{\partial s_j}{\partial p_k} = s_j s_k.$$

Let $\varepsilon_{ij}$ denote the cross-price elasticity between goods $i$ and $j$. Then the latter implies that $\varepsilon_{ik} = \varepsilon_{jk}$ for all $i, j \neq k$. This is all a little implausible.

One approach is to relax the IIA property for individuals and allow correlation among subsets of products. The idea is to partition the products into groups or "nests". Products within a group are closer substitutes than products across groups. An example of this approach is the nested logit. Let $g$ index the group and partition the products into $G + 1$ exhaustive and mutually exclusive groups, where group 0 consists of good 0. Consumer $i$'s utility of good $j$ is

$$u_{ij} = \delta_j + \varsigma_{ig} + (1 - \sigma)\epsilon_{ij},$$

where $\delta_j \equiv x_j\beta - \alpha p_j + \xi_j$, $\varsigma_{ig}$ is a shock common to every product $j$ in group $g$, and $\epsilon_{ij}$ is identically and independently distributed extreme value. The distribution of $\varsigma$ is the unique distribution with the property that, if $\epsilon$ is an extreme value random variable, then $\varsigma + (1 - \sigma)\epsilon$ is also an extreme value random variable. Here $\sigma$ is a measure of the within-group correlation among products. Note that it is symmetric across products within the same group. (It is also the same for all groups, although this can be relaxed.)

The nested logit yields closed-form solutions for the market shares. If product $j$ is in group $g$, its market share as a fraction of the total group share is

$$s_{j|g}(\delta, \sigma) = \frac{\exp[\delta_j/(1 - \sigma)]}{D_g}, \quad \text{where } D_g \equiv \sum_{j \in J_g} \exp[\delta_j/(1 - \sigma)].$$

The market share of group $g$ is given by

$$s_g(\delta, \sigma) = \frac{D_g^{(1-\sigma)}}{\sum_{g=0}^{G} D_g^{(1-\sigma)}}.$$

Combining these two expressions gives the market share of product $j$:

$$s_j = s_{j|g}\, s_g = \frac{\exp[\delta_j/(1 - \sigma)]}{D_g^{\sigma}\left[\sum_{g=0}^{G} D_g^{(1-\sigma)}\right]}.$$

Normalize $\delta_0 = 0$. Then, since group 0 consists only of the outside good, $D_0 = 1$. Therefore,

$$s_0(\delta, \sigma) = \frac{1}{1 + \sum_{g=1}^{G} D_g^{(1-\sigma)}}.$$

Taking logs of market shares, we get

$$\ln s_j^0 - \ln s_0^0 = \frac{\delta_j}{(1 - \sigma)} - \sigma \ln D_g.$$
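The nested logit share formulas can be checked numerically. The sketch below is an illustration under hypothetical mean utilities, grouping, and $\sigma$; the function and variable names are not from the notes. It computes the shares from the closed forms and then verifies the within-group-share form of the inversion, $\ln s_j - \ln s_0 = \delta_j + \sigma \ln s_{j|g}$, which follows from the log share equation once $D_g$ is substituted out using the group shares.

```python
import math

def nested_logit_shares(delta, groups, sigma):
    """Closed-form nested logit shares.

    delta:  dict product -> mean utility (inside goods only)
    groups: dict group -> list of products in that group
    The outside good forms its own group with delta_0 = 0, so D_0 = 1.
    Returns (s0, within-group shares, total shares).
    """
    D = {g: sum(math.exp(delta[j] / (1 - sigma)) for j in members)
         for g, members in groups.items()}
    denom = 1.0 + sum(Dg ** (1 - sigma) for Dg in D.values())
    s0 = 1.0 / denom
    within, total = {}, {}
    for g, members in groups.items():
        s_g = D[g] ** (1 - sigma) / denom              # group share
        for j in members:
            within[j] = math.exp(delta[j] / (1 - sigma)) / D[g]
            total[j] = within[j] * s_g                 # s_j = s_{j|g} * s_g
    return s0, within, total

# Hypothetical example: four products in two nests.
delta = {"a": 0.5, "b": 0.1, "c": -0.2, "d": 0.8}
groups = {1: ["a", "b"], 2: ["c", "d"]}
sigma = 0.6

s0, within, total = nested_logit_shares(delta, groups, sigma)
for j in delta:
    lhs = math.log(total[j]) - math.log(s0)
    rhs = delta[j] + sigma * math.log(within[j])
    print(j, round(lhs, 6), round(rhs, 6))
```

Setting `sigma = 0` collapses the within-group term, and the check reduces to the plain logit inversion.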

Taking logs of group shares, $\ln D_g = [\ln s_g^0 - \ln s_0^0]/(1 - \sigma)$. Substituting, and using $\ln s_j^0 - \ln s_g^0 = \ln s_{j|g}^0$, we obtain

$$\ln s_j^0 - \ln s_0^0 = x_j\beta - \alpha p_j + \sigma \ln s_{j|g}^0 + \xi_j.$$

Estimates of $(\beta, \alpha, \sigma)$ can be obtained from a linear instrumental variables regression of the difference in log market shares on product characteristics, prices, and the log of the within-group share. Note that the last term is endogenous and requires its own instruments.

The second approach is to allow individual consumers to value characteristics differently. This is known as the random coefficients model. The utility of consumer $i$ for product $j$ is given by

$$u_{ij} = x_j\beta_i - \alpha p_j + \xi_j + \epsilon_{ij},$$

where $\beta_{ik} = \beta_k + \sigma_k\varsigma_{ik}$. Thus,

$$u_{ij} = x_j\beta - \alpha p_j + \xi_j + \nu_{ij}, \quad \text{where } \nu_{ij} = \sum_{k=1}^{K} x_{jk}\sigma_k\varsigma_{ik} + \epsilon_{ij}.$$

Assume that $\epsilon_{ij}$ is identically and independently distributed extreme value and that $\varsigma_{ik}$ is identically and independently distributed standard normal across individuals and characteristics. Thus, IIA holds at the individual level, but market shares can exhibit a much richer pattern of substitution. An increase in the price of good $j$ continues to affect only those consumers who are purchasing good $j$. However, in the random coefficients model, the consumers who substitute away from good $j$ will tend to substitute towards goods with "similar" characteristics. The reason they were buying good $j$ initially is that they valued the characteristics of good $j$. The main difficulty in estimating the random coefficients model lies in obtaining the market shares. Conditional on $(\delta, \theta)$, the predicted market share of good $j$ is

$$s_j(\delta, \theta) = \int \frac{\exp\{\delta_j + \sum_{k=1}^{K} x_{jk}\sigma_k\varsigma_{ik}\}}{1 + \sum_{l=1}^{J} \exp\{\delta_l + \sum_{k=1}^{K} x_{lk}\sigma_k\varsigma_{ik}\}}\, g(\varsigma_i)\, d\varsigma_i.$$

This is a $K$-dimensional integral and is intractable. We need to use a simulation estimator for aggregation. Use $J_s$ simulation draws $\varsigma_r$, $r = 1, \dots, J_s$, to "aggregate" over $\varsigma$:

$$\hat{s}_j(\delta, \theta) = \frac{1}{J_s}\sum_{r=1}^{J_s} \frac{\exp\{\delta_j + \sum_{k=1}^{K} x_{jk}\sigma_k\varsigma_{ikr}\}}{1 + \sum_{l=1}^{J} \exp\{\delta_l + \sum_{k=1}^{K} x_{lk}\sigma_k\varsigma_{ikr}\}}.$$

Note that we still use the fact that we can integrate out over $\epsilon$ exactly. Having obtained the market shares, we proceed in much the same way as before. Equate actual market shares to the simulated market shares and invert to get the mean utilities or, equivalently, $\hat{\xi}(\theta, s^0)$. Interact $\hat{\xi}(\theta, s^0)$ with a function of the instruments $(x, w)$ and find the value of $\theta$ that makes the sample moments as close as possible to zero. Note that the simulated market shares enter non-linearly in the moment conditions, so the nice properties of simulation estimators are not valid.
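The simulation step can be sketched as follows. This is a minimal illustration, assuming standard normal taste draws from Python's `random` module and hypothetical values for the mean utilities, characteristics, and $\sigma_k$; the function name and defaults are not from the notes. Each draw is one simulated consumer, the logit formula integrates out $\epsilon$ exactly for that consumer, and the shares are averaged over draws.

```python
import math
import random

def simulated_shares(delta, x, sigma, n_draws=2000, seed=0):
    """Simulated random-coefficients logit shares for inside goods.

    delta[j]: mean utility of product j
    x[j][k]:  characteristic k of product j
    sigma[k]: standard deviation of the taste for characteristic k
    """
    rng = random.Random(seed)
    J, K = len(delta), len(sigma)
    shares = [0.0] * J
    for _ in range(n_draws):
        taste = [rng.gauss(0.0, 1.0) for _ in range(K)]   # one simulated consumer
        util = [delta[j] + sum(x[j][k] * sigma[k] * taste[k] for k in range(K))
                for j in range(J)]
        denom = 1.0 + sum(math.exp(u) for u in util)       # outside good has utility 0
        for j in range(J):
            shares[j] += math.exp(util[j]) / denom / n_draws
    return shares

# Hypothetical two-product, two-characteristic example.
delta = [0.5, -0.2]
x = [[1.0, 0.0], [0.0, 1.0]]

plain = simulated_shares(delta, x, [0.0, 0.0])   # no heterogeneity: plain logit
mixed = simulated_shares(delta, x, [1.0, 1.0])   # heterogeneous tastes
print(plain, mixed)
```

With all $\sigma_k = 0$ every simulated consumer is identical and the routine reproduces the plain logit shares, which is a convenient check on any implementation.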

4.3

Discrete Choice vs Multi-Stage Budgeting

The multi-stage model requires a priori segmentation of the market into relatively small groups. The grouping may not be obvious. Second, the choice of empirical specification is likely to require more flexibility than theory allows. Another problem with the AIDS model (and most of the others) is that it assumes no corner solutions: all consumers consume all products. This is an acceptable assumption when the goods are aggregates like food and clothing, but it is not appropriate in many differentiated product markets. Yet another problem is that, in practice, it is hard to find instrumental variables for all the prices in the system that are exogenous and do not generate moment conditions that are nearly collinear. On the plus side, it is more familiar than the characteristics approach and easier to understand.

Discrete choice models have plenty of instruments since one can use the characteristics of the products as instruments. However, the models are more computationally demanding and are less intuitive.

5

ENTRY DETERRENCE AND ACCOMMODATION

The existence of rents from market power will attract entry and, in the absence of effective entry deterrence, new firms will enter and compete away the economic profits of the incumbents. In fact, in a world in which entry is easy (i.e., not costly), the issues of market power and anticompetitive practices disappear. Thus, from an antitrust perspective, the question of entry deterrence is central to a determination of market power. Our focus in this lecture is to identify conditions under which an incumbent firm can expand capacity to deter entry or at least limit the size at which an entrant enters.

The prototypical case is DuPont. In 1972 DuPont was the largest producer of a chemical agent known as titanium dioxide, which is used to whiten paper and paint. It had approximately 35% of industry capacity, most of which used its proprietary chloride process. In the early 1970s two exogenous shocks gave DuPont's proprietary process a considerable cost advantage over the other two technologies that were available. One was the adoption of stricter pollution controls, which threatened the viability of the sulfate process. The second was that the price of the raw material used by the other chloride process tripled. This gave DuPont a substantial cost advantage over its rivals. The management of DuPont discussed two strategies for responding to their advantageous position. The "maintain status quo" strategy called for increasing DuPont's market share from 35% to 40% by 1985 as the sulfate process exited the industry, investing $192 million in new capacity, and making no real change in prices. The "growth" strategy was based upon exploiting the strategic advantage. The idea was to invest $394 million in new capacity, building a new plant and expanding several existing plants, and raising DuPont's share of capacity to 65%. The management team made estimates about several limit prices that would trigger competitors' expansions and imports. They proposed a pricing strategy that was high enough to generate the cash flow needed to finance the expansion but low enough to discourage entry. DuPont estimated the "growth" strategy to be more profitable and pursued it. By 1979, DuPont had achieved its goal and had 60% of the capacity. The FTC filed a complaint charging DuPont with monopolizing the market by engaging in limit pricing and holding excess capacity. The court dismissed the charges, primarily because the FTC failed to demonstrate that DuPont had invested in excess capacity. Subsequent econometric analysis by Hall found that DuPont's investments reduced its short-run marginal costs and that this strategic effect limited the output and capacity expansion of its rivals and prevented entry. However, DuPont's expansion did not involve excess capacity.

There are two issues that we need to address. First, does it make sense for an incumbent to invest in excess capacity, which it then holds in reserve until entry or expansion by rivals? In this case, the excess capacity is a warning to rivals that entry or expansion will be met by an aggressive price war. The second issue is whether an incumbent will overinvest in capacity, with the understanding that any such capacity will be used even if entry does not occur.

5.1

Capacity Deterrence

Consider a homogeneous good industry with inverse demand $P(y) = 1 - y$, where $y$ denotes industry output. Firm 1 has been servicing this industry as a monopolist because of a patent. However, its patent is about to expire and a rival, firm 2, is poised to enter when this happens. Can firm 1 use its capacity to deter entry? To produce one unit of output requires one unit of capacity. The cost of a unit of capacity is $c < 1/2$. Prior to the entry decision of firm 2, firm 1 chooses a level of capital $k_1$, which is then fixed. Firm 2 observes $k_1$ and then decides whether to enter or not. Entry costs are denoted by $F$. If firm 2 decides to enter, it chooses a level of capital $k_2$, which is also fixed. Given their capacities, the firms then choose outputs simultaneously subject to the restriction that $y_i \leq k_i$. Production costs are zero. We assume without loss of generality that firm 1 cannot augment its capital stock.

5.1.1

Indivisible Capital

Suppose capital is indivisible and can be purchased only in integer units. In our case, one unit is sufficient to service the entire market. Examples: small towns which can only support one fast food outlet, one car dealer, one hotel, one flight, etc. In these cases, purchasing the unit of capital represents the entry cost, so we shall assume that $F = 0$. Since each firm can service the entire market with a unit of capital, the post-entry game is likely to be Bertrand. We solve the game working backwards. If it enters, firm 2 knows that price will be undercut to marginal cost (which is zero in our example) and it will earn zero profits following entry. Hence, it will not enter as long as $c > 0$. This example illustrates an important point: if the incumbent's capital is sunk and it can commit to behaving aggressively in the post-entry game, entry is deterred. Clearly, firms have an incentive to acquire such reputations.

5.1.2

Divisible Capital

Suppose any indivisibilities in capital are negligible relative to demand, so $k_i$ is any real number in the interval $[0, 1]$. In this case, it makes a lot more sense to assume that the post-entry game is Cournot. We study the equilibrium to this game by once again working backwards. Suppose firm 1 has invested in capacity $k_1$ and firm 2 has entered (i.e., paid $F$). Firm 2 has to choose its capacity and output. Since it has no incentive to invest in more capacity than it needs to produce a given level of output, we may assume that $y_2 = k_2$. Firm 2 then chooses $y_2$ to

$$\max_{y_2}\ \pi_2(y_1, y_2) = (1 - c - y_1 - y_2)\, y_2.$$

Solving this optimization problem yields firm 2's best reply function,

$$R_2(y_1) = (1 - c - y_1)/2.$$

Firm 1 chooses its output to

$$\max_{y_1}\ \pi_1(y_1, y_2) = (1 - y_1 - y_2)\, y_1 \quad \text{s.t.} \quad y_1 \leq k_1.$$

This yields firm 1's best reply function,

$$R_1(y_2) = \begin{cases} (1 - y_2)/2 & \text{if } y_2 > 1 - 2k_1 \\ k_1 & \text{if } y_2 \leq 1 - 2k_1. \end{cases}$$

The important point to note about these best replies is that firm 1's best reply does not depend upon capacity costs. The reason is that its investment is sunk, so its marginal cost of production in the post-entry game is zero. By contrast, firm 2 has to invest in capacity to produce, so it faces a marginal cost of $c$ per unit of output. As we shall see, firm 1 can use this asymmetry to deter entry.

Idle Threats. Figure 1 depicts the best replies when $c = 1/5$ and $k_1 = 1/2$. The equilibrium to the post-entry game is $(y_1, y_2) = (2/5, 1/5)$, which yields a price of 2/5. Profits to firm 2 are 1/25; firm 1's profits on the units it produces, $(p - c)y_1$, are 2/25. Therefore, as long as entry costs are sufficiently low, that is, $F < 1/25$, firm 2 enters and earns positive profits. Note, however, that firm 1 produces less than capacity. If firm 1 produces to its capacity of 1/2 after firm 2 enters, firm 2's best reply is 3/20. But, if firm 2 produces that much, then firm 1 will want to reduce its output to 17/40. Hence, firm 1's threat to produce up to capacity is not credible. Furthermore, firm 2 should anticipate this fact in making its entry decision. This argument leads to a very important point. If firm 1 is not constrained in its output choice in the equilibrium of the post-entry game, it should reduce its capacity prior to firm 2's entry decision. The extra capacity is costly and has no benefit, since it does not affect the entrant's decision and is not used following entry. In the above example, firm 1's equilibrium output is 2/5 but its capacity is 1/2. If it reduces $k_1$ to 2/5, it lowers investment costs with no change in revenues. We therefore obtain the following conclusion: in the absence of any commitment to producing to capacity, it is never optimal for firm 1 to invest in idle capacity. Therefore, in equilibrium, $y_1 = k_1$.
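The idle-threat example can be verified with exact rational arithmetic. The sketch below is an illustration with hypothetical function names. It uses the fact that the two unconstrained best replies intersect at $y_1 = (1 + c)/3$, so the post-entry equilibrium has firm 1 producing the smaller of that amount and its capacity, with firm 2 on its best reply.

```python
from fractions import Fraction as Fr

def R1(y2, k1):
    """Incumbent's best reply: capacity is sunk, so marginal cost is zero up to k1."""
    return min((1 - y2) / 2, k1)

def R2(y1, c):
    """Entrant's best reply: it pays c per unit of capacity, so marginal cost is c."""
    return max((1 - c - y1) / 2, Fr(0))

def post_entry_equilibrium(k1, c):
    """Cournot equilibrium of the post-entry game given the incumbent's capacity."""
    y1_unconstrained = (1 + c) / 3      # solves y1 = (1 - R2(y1)) / 2
    y1 = min(k1, y1_unconstrained)
    return y1, R2(y1, c)

c, k1 = Fr(1, 5), Fr(1, 2)
y1, y2 = post_entry_equilibrium(k1, c)
price = 1 - y1 - y2
entrant_profit = (price - c) * y2       # gross of the entry cost F

print(y1, y2, price, entrant_profit)    # 2/5 1/5 2/5 1/25
print(R2(k1, c), R1(R2(k1, c), k1))     # 3/20 17/40: the capacity threat unravels
```

The second line of output reproduces the argument in the text: if firm 2 believed the threat and produced 3/20, firm 1 would prefer 17/40 to its full capacity of 1/2.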
It is worth emphasizing that firm 1 may want to commit to producing to capacity if it could. For example, if it could credibly threaten to produce 1/2 in the post-entry game, firm 2's profits would be much lower. The outcome of the post-entry game would be $(y_1, y_2) = (1/2, 3/20)$, which yields $p = 7/20$. Profits to firm 2 are 9/400. Hence, if $F > 9/400$, firm 2 would not enter in response to firm 1's threat and, in turn, firm 1 would earn monopoly revenues of 1/4 (gross of capacity costs). But how can firm 1 make its threat credible? There is no way in this model.

Equilibrium. In what follows, we denote the post-entry equilibrium outputs as $(y_1^*, y_2^*)$.

Case 1: High Entry Costs. Entry is said to be blockaded if the monopoly choice of capacity by firm 1 is sufficient to deter entry. The monopoly choice of output in our model is $k^m(c) = (1 - c)/2$. To determine the value of $F$ for which this choice blockades entry, we need to compute the profits of firm 2 in the post-entry equilibrium and check when they are less than $F$. Define $\bar{k}(c) = (1 + c)/3$. This is firm 1's output in the unconstrained Cournot equilibrium in which its marginal cost is zero and firm 2's is $c$. Recall that $\bar{k}$ is the maximum amount that firm 1 can credibly threaten to produce in the post-entry equilibrium. Note that $\bar{k}$ can be less than $k^m$. More precisely,

$$k^m \gtrless \bar{k} \iff c \lessgtr 1/5.$$

Therefore, in evaluating the profitability of entry, we need to distinguish between two cases. When $c < 1/5$, firm 2 computes its profits from entry assuming that firm 1 produces $\bar{k}$, which is less than its capacity of $k^m$. When $c > 1/5$, firm 2 computes its profits from entry assuming that firm 1 produces to capacity, $k^m$, which is less than $\bar{k}$. The post-entry equilibrium can be characterized as follows:

$$y_1^* = \begin{cases} \bar{k} & \text{if } c < 1/5 \\ k^m & \text{if } c \geq 1/5 \end{cases}, \qquad y_2^* = \begin{cases} (1 - 2c)/3 & \text{if } c < 1/5 \\ (1 - c)/4 & \text{if } c \geq 1/5 \end{cases}, \qquad p^* = \begin{cases} (1 + c)/3 & \text{if } c < 1/5 \\ (1 + 3c)/4 & \text{if } c \geq 1/5. \end{cases}$$

Therefore, entry is blockaded if

$$F > \begin{cases} (1 - 2c)^2/9 & \text{if } c < 1/5 \\ (1 - c)^2/16 & \text{if } c \geq 1/5. \end{cases}$$

The equilibrium outcome when $F$ satisfies the above inequality is that firm 1 invests in $k^m$, firm 2 does not enter, and firm 1 produces to capacity.

Case 2: Low Cost Advantage. Suppose $F$ does not satisfy the above inequality and $c < 1/5$. Recall that the maximum output that firm 1 can threaten to produce in the post-entry equilibrium is $\bar{k}$. Since firm 2's profits at this output level exceed $F$, firm 1 cannot deter entry. Knowing this, how much capacity should firm 1 invest in? Applying the no idle capacity result, the optimal choice of capacity for firm 1 is determined by the following problem:

$$\max_{k_1}\ \pi_1(k_1) = (1 - k_1 - c - R_2(k_1))\, k_1 \quad \text{s.t.} \quad k_1 \leq \bar{k}.$$

The solution to the unconstrained problem is $k_1 = (1 - c)/2$, which is, of course, the monopoly output, $k^m$. (This result is due to the linearity of the demand function and is not true in general.) But, for $c < 1/5$, $k^m > \bar{k}$. Hence, in this case, firm 1 should simply invest in capacity $\bar{k}$. Firm 2 enters with a capacity of $(1 - 2c)/3$, which is less than that of firm 1, and each firm produces to capacity.

Case 3: High Cost Advantage. Suppose $F$ does not satisfy the above inequality and $c \geq 1/5$. Firm 1 cannot deter entry by choosing its preferred capacity of $k^m$. But, in this case, $k^m < \bar{k}$, so it can increase capacity beyond $k^m$ (but not beyond $\bar{k}$) and credibly threaten to use that capacity in the event that firm 2 enters. Hence, in this range of capital costs, firm 1 may have two possible choices. One choice is to invest in capacity $k^m$, let entry occur at a capacity of $(1 - c)/4$, and then produce to capacity. The other choice is to choose a higher level of capacity, $k_d$, deter firm 2 from entering, and then produce to capacity as a monopolist. Moreover, it will use that capacity in the event that firm 2 does not enter, since $\bar{k} < 1/2$ for $c < 1/2$ and the revenue-maximizing monopoly output with sunk capacity is 1/2. Here $k_d$ is defined as the solution to the following equation:

$$\pi_2(k_1, R_2(k_1)) = [1 - c - k_1 - (1 - c - k_1)/2]\,(1 - c - k_1)/2 = F,$$

which is given by

$$k_d = 1 - c - 2\sqrt{F}.$$

Firm 1 will utilize this option when $c$ is relatively large, for then $k_m$ is much smaller than $k$ and an increase in output to $k$ substantially lowers firm 2's profits.

Example 1. Let $c = 2/5$. The post-entry equilibrium at $k_1 = k_m$ is

$$k_m = 3/10, \quad R_2(k_m) = 3/20, \quad p^* = 11/20, \quad \pi_2^* = 9/400, \quad \pi_1^* = 9/200.$$

The post-entry equilibrium at $k_1 = k$ is

$$k = 7/15, \quad R_2(k) = 1/15, \quad p^* = 7/15, \quad \pi_2^* = 1/225, \quad \pi_1^* = 7/225.$$

Now consider an entry cost such as $F = 4/400$. It is less than 9/400, the level required to blockade entry at $k_m$, and exceeds 1/225, the entry-deterring level at $k$. But firm 1 does not have to commit to a capacity of $k$ to deter entry; it need only expand to $k_d$, which is 2/5. If it does so, then firm 2 does not enter and firm 1 enjoys a monopoly. Output, price and profits are

$$y_1 = 2/5, \quad p^* = 3/5, \quad \pi_1^* = 2/25.$$

In this case, $k_d$ yields higher profits for firm 1 than $k_m$.
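The numbers in Example 1 can be reproduced with exact rational arithmetic. The following is a minimal sketch, assuming inverse demand $P = 1 - y$ and the entrant's best reply $R_2(k_1) = (1 - c - k_1)/2$ used above; the function and variable names are illustrative and not part of the notes.

```python
from fractions import Fraction as Fr
from math import isqrt

def entrant_reply(k1, c):
    """Entrant's capacity choice R2(k1) = (1 - c - k1)/2."""
    return (1 - c - k1) / 2

def post_entry(k1, c):
    """Return (entrant capacity, price, entrant profit, incumbent profit)."""
    k2 = entrant_reply(k1, c)
    price = 1 - k1 - k2
    return k2, price, (price - c) * k2, (price - c) * k1

c = Fr(2, 5)
F = Fr(4, 400)
k_m = (1 - c) / 2            # unconstrained (monopoly) capacity
k_max = Fr(7, 15)            # value of k quoted in Example 1

# F = 1/100 is a perfect square, so its root can be taken exactly.
sqrt_F = Fr(isqrt(F.numerator), isqrt(F.denominator))
k_d = 1 - c - 2 * sqrt_F     # entry-deterring capacity

accommodate = post_entry(k_m, c)
at_k_max = post_entry(k_max, c)
monopoly_profit_at_k_d = (1 - k_d - c) * k_d

print(accommodate)             # entrant capacity, price, profits at k_m
print(at_k_max)                # same quantities at k
print(k_d, monopoly_profit_at_k_d)
```

At $k_d$ the entrant's post-entry profit equals $F$ exactly, which is what makes it the smallest deterring capacity.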

5.2 Learning By Doing

In the 1970s, several consulting firms, including Boston Consulting Group, recommended that their clients sacrifice short-run profits early in the product life cycle in order to gain a strategic advantage over rivals later in the cycle. The argument is that by cutting price and producing a lot early, a firm slides down the learning curve and lowers its costs in subsequent periods. This in turn will allow the firm to enjoy a larger cost advantage vis-a-vis its rivals and perhaps deter entry. The key assumption here is that learning is not transferable (i.e., no spillovers); it can only be achieved through production. We examine this argument in this lecture.

The phenomenon of learning-by-doing was first discussed by Alchian (1950) in a study of wartime airframe production. Last year, a Yale Ph.D. economics student estimated the magnitude of learning-by-doing effects in airplane production and found that they were huge and were an important determinant of product exit and entry patterns. Lieberman (1982, 1984) estimated learning curves in the production of chemical products. More recently, Dick (1991) has studied strategic effects of learning in the semiconductor industry, and Jarmin (1994) has conducted a similar study of the US rayon industry in the period 1920-1938. He finds that the leading rayon producers overinvested in learning to move down their learning curves and reduce their rivals' market shares in subsequent periods.

5.2.1 Model

There are two periods. Demand in each period is given by $P(y) = 1 - y$. In period 1, firm A is a monopolist. Its unit cost in period 1 is $c$. Its unit cost in period 2 depends upon its output in period 1, denoted $y_1$, according to the equation

$$c_2 = c - \theta y_1,$$

where $\theta < c$. For simplicity, we assume that firm A does not discount profits, so its objective is to choose outputs $(y_1, y_2)$ to maximize profits summed over the two periods. Firm A is worried about firm B entering in period 2.


Monopoly. Before studying how the threat of entry affects firm A's choice of output in period 1, let us first establish the monopoly benchmark. Firm A's maximization problem is as follows:

$$\max_{y_1, y_2} \; (1 - y_1 - c)\,y_1 + (1 - y_2 - c + \theta y_1)\,y_2.$$

Let us first optimize with respect to second-period output. The first-order condition for optimality requires

$$y_2 = (1 - c + \theta y_1)/2.$$

Substituting this relation into the maximization problem given above and simplifying yields

$$\max_{y_1} \; (1 - c - y_1)\,y_1 + (1 - c + \theta y_1)^2/4.$$

Differentiating with respect to first-period output yields

$$1 - c - 2y_1 + \theta(1 - c + \theta y_1)/2 = 0.$$

The profit-maximizing solution is

$$y_1^M = \frac{1 - c}{2 - \theta}, \qquad y_2^M = \frac{1 - c}{2 - \theta}.$$
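The step from the first-order condition to the solution is short, but a worked version may help. Collecting the terms in $y_1$ and using $4 - \theta^2 = (2 - \theta)(2 + \theta)$:

```latex
\begin{align*}
1 - c - 2y_1 + \tfrac{\theta}{2}(1 - c) + \tfrac{\theta^2}{2}\, y_1 &= 0 \\
\left(2 - \tfrac{\theta^2}{2}\right) y_1 &= (1 - c)\left(1 + \tfrac{\theta}{2}\right) \\
y_1^M &= \frac{(1 - c)(2 + \theta)}{4 - \theta^2} = \frac{1 - c}{2 - \theta}, \\
y_2^M &= \frac{1 - c + \theta y_1^M}{2}
       = \frac{(1 - c)(2 - \theta) + \theta(1 - c)}{2(2 - \theta)}
       = \frac{1 - c}{2 - \theta}.
\end{align*}
```

For instance, with $c = \theta = 1/4$ this gives $y_1^M = (3/4)/(7/4) = 3/7$.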

Interpretation: If the monopolist cared only about first-period profits, it would produce $(1 - c)/2$. When it takes into account the effect of first-period production on second-period costs, it produces more. The larger is the value of $\theta$, the higher is the monopolist's first-period output.

Threat of Entry. Suppose now that firm A worries about entry by firm B in period 2. Let $x$ denote output by firm B. It produces at a unit cost of $c$. Assume that $y_1$ is observed by firm B. Then, if it enters, its optimization problem is

$$\max_x \; (1 - c - y_2 - x)\,x.$$

Differentiating with respect to $x$ yields its best reply $R_B(y_2) = (1 - c - y_2)/2$. Similarly, firm A's best reply in period 2 is $R_A(x) = (1 - c_2 - x)/2$. Solving for the intersection of the best replies yields the Cournot solution

$$y_2 = (1 + c - 2c_2)/3, \qquad x = (1 + c_2 - 2c)/3.$$

Equilibrium profits in period 2 are

$$\pi_2^A = (1 + c - 2c_2)^2/9, \qquad \pi_2^B = (1 + c_2 - 2c)^2/9.$$
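The period-2 Cournot outcome can be checked numerically for any first-period output. The sketch below assumes the linear demand and cost structure stated above; the function name and the illustrative inputs ($c = \theta = 1/4$, $y_1 = 1/2$) are chosen here for convenience.

```python
from fractions import Fraction as Fr

def period2_cournot(c, theta, y1):
    """Period-2 Cournot outcome when firm A produced y1 in period 1."""
    c2 = c - theta * y1                 # firm A's period-2 unit cost
    y2 = (1 + c - 2 * c2) / 3           # firm A's output
    x = (1 + c2 - 2 * c) / 3            # firm B's output
    # Each output must be a best reply to the other.
    assert y2 == (1 - c2 - x) / 2
    assert x == (1 - c - y2) / 2
    price = 1 - y2 - x
    return {
        "c2": c2, "y2": y2, "x": x, "price": price,
        "profit_A": (price - c2) * y2,
        "profit_B": (price - c) * x,
    }

out = period2_cournot(Fr(1, 4), Fr(1, 4), Fr(1, 2))
for name, value in out.items():
    print(name, value)
```

With linear demand each firm's profit equals the square of its own output, which is the form of the profit expressions above.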


Firm A's first-period optimization problem, assuming it accommodates the entrant, is

$$\max_{y_1} \; (1 - c - y_1)\,y_1 + (1 - c + 2\theta y_1)^2/9.$$

Solving yields

$$y_1^D = \frac{1 - c}{2}\left[\frac{9 + 4\theta}{9 - 4\theta}\right].$$

It is not difficult to show that $y_1^D$ is larger than $y_1^M$. Intuitively, firm A wants to be more aggressive in period 1 since it then commits itself to higher output in period 2. This in turn causes the entrant to choose a lower output, increasing profits to firm A. If firm B incurs an entry cost of $F$, and its profits at the above solution exceed $F$, firm A may want to increase output even further to deter entry. Once again, there may be two solutions: the output level which assumes accommodation and a larger output which deters entry.

Example 2. Suppose $c = \theta = .25$. Then $y_1^M = 3/7$ and $y_1^D = 15/32$. Since output in period 1 is approximately equal to 1/2, firm A's unit costs in period 2 are approximately 1/2 of its unit cost in period 1. Hence $y_2 \approx 1/3$ and $x \approx 5/24$. Total output in period 2 is approximately 13/24, which is only slightly more than the amount in period 1. Hence, prices do not decline very much, even though period 1 is monopoly and period 2 is duopoly.

5.3 Complementarities

Given the freedom to choose their own route structure and prices following deregulation in 1979, most airlines transformed their networks into hub and spoke networks. Interlining traffic (i.e., changing airlines at a connecting point) declined dramatically. Bamberger and Carlton (1993) report that interlining traffic as a share of connecting traffic fell from 38.8% in 1979 to 4.5% in 1989. This reflects the growth of single carrier hub airports. Pickrell and Oster report that, by 1986, virtually all of the commuter and regional airlines were tied contractually to one of the major airlines operating hub and spoke networks. In this lecture, we provide an explanation of this phenomenon and of why hub and spoke networks are a deterrent to entry by small airlines.

5.3.1 Model

There are $n \ge 4$ cities and individuals living in each city who wish to travel to other cities. Individuals who wish to travel from city g to city h are assumed to have no desire to travel anywhere else (i.e., no substitutability in demand across city-pair markets). Individuals care only about reaching their destination at the lowest price, not how this destination is reached. In particular, they are indifferent to distance traveled, number of stops incurred, or the airline that is flying them. Demand in each city-pair market is the same. If $p_{gh}$ is the price of the cheapest return ticket from city g to city h, then the number of g-h travelers is given by

$$D(p_{gh}) = 1 - p_{gh}.$$

Here the g-h market is distinct from the h-g market. For simplicity, we assume that transport costs per passenger per flight are 0. The fixed cost of offering a direct flight between cities g and h is $F$. The flight services both the g-h and h-g markets. Here we assume that $F$ is not sunk.

Monopoly Hub Operator. Suppose airline H is a monopolist and operates a hub and spoke network. Let city 1 denote the hub city. The total number of markets is $n(n-1)$. Of these, $2(n-1)$ are serviced by a direct flight, and the remaining $(n-1)(n-2)$ city-pair markets are serviced by a one-stop flight. Since length does not matter either to the airline or to travelers, the monopolist charges the same price in each city-pair market. Define

$$\pi^M \equiv \max_p \; p(1 - p)$$

to be the monopoly profits in a city-pair market. It is easily verified that $p^M = 1/2$ and $\pi^M = 1/4$. Network profits to the hub operator are

$$\Pi^M = (n - 1)[n/4 - F].$$

Threat of Spoke Entry. A low-cost regional airline is contemplating entry into one of the spoke markets, say the (1-2) and (2-1) markets. Its marginal costs are also 0 and its fixed cost is $F_E$. Will the hub operator concede the spoke market to the lower cost regional airline? To answer this question, we need to compute the equilibrium profits of the hub operator when it does and does not concede the market.

Suppose it does not concede the market. Then the pricing in the (1-2) and (2-1) markets is essentially Bertrand. Each airline will undercut the other until price is equal to marginal cost. Each airline earns zero profits in these markets. Prices in all other city-pair markets are unaffected. In particular, the hub operator continues to charge monopoly prices in each (g-2) and (2-g) market, $g \ne 1, 2$, and in each of its hub markets. Therefore, it loses operating profits of 1/2, the monopoly profit of 1/4 in each of the two contested markets.

Now, suppose the hub operator concedes the (1-2) and (2-1) markets by withdrawing its flights. By doing so, it saves the fixed cost $F$. What does it lose? Let $s$ denote the price charged by the regional carrier for traveling on its (1-2) or (2-1) flight. It can charge only one price on each flight since it cannot discriminate among travelers on the basis of their origin or destination. The hub operator faces a similar problem trying to discriminate between (g-1) and (g-2) travelers (as well as (1-g) and (2-g) travelers). Because the latter fly with the regional carrier, the hub operator is constrained to charge the (g-1) and (g-2) travelers the same price. Let $p$ denote this price. It is determined by the following optimization problem:

$$\pi^s \equiv \max_p \; p(1 - p) + p(1 - s - p)$$

which yields the best reply function $p = (1 - s)/2$. Note that $\pi^s$ consists of profits in the hub market, (g-1), and the connecting market, (g-2). Given the hub operator's price $p$ in these markets, and assuming this price is the same for every $g \ge 3$, the regional carrier solves

$$\max_s \; s(1 - s) + (n - 2)\,s(1 - p - s),$$

which yields the best reply

$$s = \frac{(n - 1) - (n - 2)p}{2(n - 1)}.$$

The equilibrium consists of

$$s = \frac{n}{3n - 2}, \qquad p = \frac{n - 1}{3n - 2}, \qquad \pi^s = \frac{n - 1}{3n - 2}.$$

There are $2(n-2)$ connecting markets associated with city 2. Therefore, the loss in operating profits from accommodation is

$$2(n - 2)(2\pi^M - \pi^s) - F = 2(n - 2)\left(\frac{1}{2} - \frac{n - 1}{3n - 2}\right) - F > 2(n - 2)\,\frac{1}{6} - F.$$

Clearly, given any $F$ such that the regional operator makes a profit, $n$ does not have to be very large for the hub operator to be better off not conceding the market. If it concedes the (1-2) and (2-1) markets, the hub operator has to lower prices in its hub markets and share profits in each connecting market involving city 2. Note that prices in the latter markets are higher with entry due to double marginalization.

The complementarities of hub-spoke networks make it difficult for a regional carrier to enter a spoke market or to survive as an independent operator. In the former case, the hub operator will not want to concede the market and in the latter case, the hub operator has a strong incentive to invade the regional carrier's markets. Therefore, it is not surprising that regional carriers quickly became allied with the hub operators. The alliance also allowed the airlines to codeshare and price discriminate.

It is not hard to see how the argument given above for airline networks extends to sets of complementary products such as computer software and hardware. The economic forces suggest that one firm will offer the entire set of products and that it will be difficult for any firm to invade a part of the product space. Furthermore, competition may not be a good thing in these circumstances. Prices tend to be lower when the entire set of products is offered by a single firm.
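The strict inequality in the accommodation loss given earlier in this section rests on $\pi^s < 1/3$. A short check, taking $\pi^M = 1/4$ and $\pi^s = (n - 1)/(3n - 2)$ as stated in the text:

```latex
\begin{align*}
\frac{1}{2} - \frac{n - 1}{3n - 2} - \frac{1}{6}
  &= \frac{3(3n - 2) - 6(n - 1) - (3n - 2)}{6(3n - 2)}
   = \frac{2}{6(3n - 2)}
   = \frac{1}{3(3n - 2)} > 0 .
\end{align*}
```

The gap shrinks as $n$ grows, so the lower bound $2(n - 2)/6 - F = (n - 2)/3 - F$ becomes nearly tight for large networks.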

5.4 Brand Proliferation

In 1972, the FTC filed a complaint against Kellogg, General Mills, General Foods and Quaker Oats, the four largest producers of ready-to-eat (RTE) breakfast cereals in the U.S. They were charged with violating Section I of the Sherman Act. These four firms accounted for 91% of the market in 1970. Despite substantial growth in industry sales in the period from 1947 to 1970, the number of competitors had declined from 55 in 1947 to 30 in 1967. Each of the four firms enjoyed rates of return which were double the rates enjoyed by other firms in the food industry. The FTC argued that the Big Four colluded on price and engaged in a number of exclusionary practices designed to restrict competition among themselves and exclude entry. The practices included (1) a shelf-space allocation plan designed to stabilize market shares, (2) intensive advertising to create the impression of product differentiation and to create barriers to entry, and (3) brand proliferation. According to the FTC, the Big Four occupied virtually every single profitable position in the product spectrum of the RTE breakfast cereals market. With so many brands available in the product spectrum, it was hard for a new entrant to find a niche to establish itself. Furthermore, when an entrant did manage to find a niche, the four firms immediately brought out new brands to compete in that location.

This lecture addresses the following question: is product proliferation on the part of an incumbent monopolist a profit-maximizing strategy?

5.4.1 Model

Products are differentiated along a single dimension, which we model by assuming that tastes of consumers are uniformly distributed along the unit interval, [0,1]. Each consumer is indexed by her location and each consumer wants only one unit of the good. The utility of consumer $x$ is given by

$$u(x) = s - p - t d^2,$$

where $p$ is the price of the product purchased and $d$ is the distance to the location of the product. Each product requires a setup cost of $f$. Marginal costs of production are zero.

There are two firms. Firm 1 is the incumbent firm. It has been servicing the entire market with a product located at 0. It has been charging customers a per unit price of $p^M = s - t$. This is the highest possible price that the monopolist can charge and still attract the consumer located at 1. The firm will want to do so as long as $s$ exceeds $t$. Its profits were

$$\Pi_0^M = M(s - t) - f,$$

where $M$ was the number of consumers. We shall assume that

$$Mt/2 < f < Mt.$$

This assumption implies that no one will enter when the state of demand is $M$. Recall that in the case of quadratic transport costs, if another firm enters, then it will do so at location 1. Furthermore, if it did so, equilibrium prices would be

$$p_1 = p_2 = t,$$

where $p_i$ denotes the price charged by firm $i$. Each firm would serve half of the market. Consequently, if firm 2 entered at location 1, it would earn a profit of

$$\Pi_0^D = \frac{Mt}{2} - f,$$

which is negative by the above assumption.

Suppose demand doubles so firm 2 can enter profitably. Firm 1 has the advantage of incumbency and can move first. If it locates a new product at location 1, then its profits are

$$\Pi_1^M = 2M(s - t/2) - 2f.$$

The marginal consumer is now located at 1/2. If firm 1 allows firm 2 to enter at location 1, then the two firms will divide the market at price $t$ and each will earn

$$\Pi_1^D = Mt - f.$$

Under our assumption about $f$, it is easily checked that $\Pi_1^M > \Pi_1^D$. Thus, firm 1 will try to preempt entry by firm 2 by locating a new product first. Notice that the argument here is exactly the same as in the R&D game. Entry competes away monopoly profits that the incumbent will want to protect by preemptive investment.

Judd (1985) has pointed out that the above argument depends critically upon the immobility of products. If exit costs are negligible, then firm 2 should ignore the fact that firm 1 has located a new product at location 1 and locate its product there as well. Why? Because firm 1 will be hurt by the competition at location 1 and will prefer to withdraw its product. To see why, let $p_{10}$ and $p_{11}$ denote firm 1's prices of products 0 and 1 and let $p_2$ denote firm 2's price of its product 1. If firm 1 does not withdraw its product, then Bertrand competition will drive the price of good 1 to zero, that is, $p_{11} = p_2 = 0$. This in turn means that the marginal consumer for good 0 is determined by

$$p_{10} + t\hat{x}^2 = 0 + t(1 - \hat{x})^2 \implies \hat{x} = (t - p_{10})/2t.$$

Firm 1 seeks to choose $p_{10}$ to maximize

$$\pi(p_{10}) = p_{10}\,\hat{x}(p_{10}),$$

which yields the solution

$$p_{10}^* = t/2.$$
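The marginal-consumer condition and the optimal price can be verified directly. This is a small sketch under the setup above (good 1 priced at zero, quadratic transport costs); the value $t = 1$ and the price grid are illustrative choices, not taken from the notes.

```python
from fractions import Fraction as Fr

def marginal_consumer(p10, t):
    """Location x_hat indifferent between good 0 at price p10 and good 1 at price 0."""
    return (t - p10) / (2 * t)

def revenue_per_consumer(p10, t):
    """Firm 1's revenue on good 0 per unit mass of consumers."""
    return p10 * marginal_consumer(p10, t)

t = Fr(1)                      # illustrative transport-cost parameter
p_star = t / 2
x_hat = marginal_consumer(p_star, t)

# Indifference: p10 + t*x^2 equals t*(1 - x)^2 at the marginal consumer.
assert p_star + t * x_hat**2 == t * (1 - x_hat)**2

# p_star does at least as well as every other price on a grid over [0, t].
grid = [t * Fr(i, 200) for i in range(201)]
best_on_grid = max(grid, key=lambda p: revenue_per_consumer(p, t))

print(x_hat, revenue_per_consumer(p_star, t), best_on_grid)
```

At $p_{10}^* = t/2$ the marginal consumer sits at $\hat{x} = 1/4$, so good 0 retains only a quarter of the market once good 1 is priced at zero.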

Total revenues for firm 1 if it insists on offering product 1 are $2M \cdot p_{10}^*\,\hat{x}(p_{10}^*) = tM/4$, since the marginal consumer is located at $\hat{x} = 1/4$ and product 1 earns nothing. On the other hand, if it withdraws product 1 and leaves that location to firm 2, then the equilibrium price for each product increases to $t$ and firm 1 earns revenues of $Mt$. Therefore, even if $f$ is sunk, firm 1 is better off withdrawing its product 1 provided exit costs are not too large. If setup costs are not sunk, then it gains even more. Notice that the argument in this case is exactly the opposite of the one given for hub-spoke networks. The reason is that here products are substitutes, not complements.

Finally, I would like to remark that the proliferation of brands may have nothing to do with entry deterrence. If a cartel fixes prices, then its members may try to compete in other dimensions such as advertising and the number of products offered. Rents are dissipated, not by lowering prices, but by offering an excessive number of products.

5.5 Tying

Tying exists when a seller of a product requires as a condition of sale that the customer purchase a second product (the tied product) as well. It is often viewed by antitrust authorities as an exclusionary tactic and a violation of Section 1 of the Sherman Act. The issue is whether a firm with a monopoly in one market, say market A, can monopolize a second market, say market B, by tying the sale of product B to the sale of product A.

5.5.1 Model

We consider a simple model due to Whinston (1987) and discussed by Tirole in his text. There are two firms, 1 and 2, and two markets, A and B. The potential customers for products A and B are the same and equal to $M$. Demand in market A, where firm 1 is a monopolist, is

$$D(q) = \begin{cases} M & \text{if } q \le v \\ 0 & \text{if } q > v \end{cases}$$

where $q$ is the price of good A. Market B is a differentiated market in which demand for firm $i$ (assuming $q \le v$) is denoted by $D_i(p_i, p_j)$. Note that, since the set of customers in each market is the same, $D_i(\cdot, \cdot) \le M$. Let $c_A$ denote the unit production cost of good A and let $c_B$ denote the (common) unit production cost of good B.

Tied Sales. Suppose firm 1 refuses to sell individual units of goods A and B. It only sells packages consisting of one unit of each good. Let $P$ denote the price of the package. Given firm 2's price for good B, firm 1 chooses $P$ to solve

$$\pi_1^* \equiv \max_P \; (P - c_A - c_B)\,D_1(P - v, p_2).$$

How many customers buy from firm 1? Only those who are willing to pay $v$ for good A and $P - v$ for a unit of good B. Customers who are willing to pay $v$ for a unit of good A but whose willingness to pay for good B from firm 1 is less will not buy the package. Let $P^*$ denote the profit-maximizing tying price.

Separate Sales. Suppose firm 1 offers to sell the goods separately. It offers good A at price $v$ and good B at price $\tilde{p}_1$, where $\tilde{p}_1$ solves the maximization problem

$$\max_p \; (p - c_B)\,D_1(p, p_2).$$

In this case, it sells good A to all $M$ customers and good B to $D_1(\tilde{p}_1, p_2)$ customers. Let $\tilde{\pi}_1 = (v - c_A)M + (\tilde{p}_1 - c_B)D_1(\tilde{p}_1, p_2)$ denote its total profits. Then

$$\tilde{\pi}_1 \ge (v - c_A)M + (p^* - c_B)D_1(p^*, p_2) \ge (v - c_A + p^* - c_B)D_1(p^*, p_2) = \pi_1^*.$$

The first inequality follows from the fact that $p^* = P^* - v$ does not maximize firm 1's profits in market B. The second follows from the fact that $M$ exceeds the number of customers buying good B from firm 1. Clearly, firm 1 is better off pricing each product separately and not tying. It can then sell units of good A to customers who prefer to buy good B from firm 2 or who just want good A at existing prices. This captures the intuition that firm 1 should be able to do better charging two prices than one price.

Conclusion 1. If firm 2 is in the market, tying is not optimal.

5.5.2 Tying to Foreclose

The rationale for tying is to foreclose market B to firm 2. The basic idea is that tying allows firm 1 to commit to charging a low price in market B, thereby making this market less profitable to firm 2 and deterring it from entering (assuming production involves fixed costs). The key is to recognize that $p^*$ is lower than $\tilde{p}_1$. By definition, $p^*$ maximizes

$$(p - [c_B - (v - c_A)])\,D_1(p, p_2).$$

In other words, tying has the effect of reducing firm 1's costs in market B from $c_B$ to $c_B - (v - c_A)$. Since the best reply is increasing in unit cost, it must be the case that $\tilde{p}_1$ exceeds $p^*$ at any $p_2$. Hence, tying causes firm 1's best reply to shift inward, which implies lower equilibrium prices in market B and less profit for firm 2. Conclusion: tying can be a useful tactic to foreclose competitors from the market.
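The inward shift of firm 1's best reply can be illustrated with a specific demand function. The sketch below assumes a hypothetical linear demand $D_1(p_1, p_2) = a - p_1 + b\,p_2$, which is not in the notes (they leave $D_1$ general), and the parameter values are purely illustrative.

```python
from fractions import Fraction as Fr

def best_reply(p2, unit_cost, a=Fr(1), b=Fr(1, 2)):
    """Firm 1's profit-maximizing price in market B against p2.

    ASSUMPTION: demand D1(p1, p2) = a - p1 + b*p2 is a hypothetical linear
    form chosen for illustration. Maximizing (p - unit_cost) * (a - p + b*p2)
    over p gives the closed form returned below.
    """
    return (a + b * p2 + unit_cost) / 2

v, c_A, c_B = Fr(1), Fr(1, 2), Fr(1, 4)   # illustrative values with v > c_A
tied_cost = c_B - (v - c_A)               # effective cost of good B under tying

shifts = []
for p2 in (Fr(0), Fr(1, 2), Fr(1)):
    p_separate = best_reply(p2, c_B)
    p_tied = best_reply(p2, tied_cost)
    shifts.append(p_separate - p_tied)
    print(p2, p_separate, p_tied)
```

Under this demand the best reply falls by half the margin on good A, $(v - c_A)/2$, at every rival price, so the tied firm prices more aggressively whatever firm 2 does.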


6 Empirical Models of Market Entry

Most of the entry models that have been estimated are two-period models. Firms first decide whether or not to enter a market and then they compete for customers by setting prices or quantities. Firms incur fixed set-up costs upon entry. Thus, equilibrium involves a finite (integer) number of firms, with the number determined by the size of the market and the amount of the fixed costs.

Mankiw and Whinston (RAND, 1986) study the question of whether free entry yields too many firms or too few relative to the socially optimal number. In contrast to previous models studied in this course (e.g., Dixit-Stiglitz), MW do not explicitly model the post-entry game but work with reduced form profit functions.

• In homogenous good markets with imperfect competition (i.e., prices exceed marginal cost), the business-stealing effect causes excessive entry in the absence of the integer constraint. The business-stealing effect is present if equilibrium output per firm declines with the number of firms. The intuition is that business-stealing causes the entrant to value entry more highly than the social planner. The social value of its entry is equal to its profits less the social value of the reduction in output of its rivals. Imposing the integer constraint can yield free entry outcomes in which the number of firms is less than the socially optimal number, but by no more than one firm.

• Product differentiation can reverse the free entry bias toward excessive entry. Now the entrant creates social surplus by increasing variety and it does not capture the full value of that surplus in profits. Hence, product diversity works in the opposite direction from the business-stealing effect. It is possible to specify functional forms in which one effect dominates (e.g., Spence (1976)) but in general there is a tradeoff.

The empirical models of entry do not attempt to estimate the primitives (i.e., demand and cost functions) but instead try to approximate the properties of the reduced form, post-entry profit function. The basic idea is simple: a firm enters if it expects to earn positive profits net of entry costs. Its post-entry profits depend upon the number of competitors and their behavior. In most models, post-entry profits decline with the number of competitors, but the rate of decline is different depending upon the nature of post-entry competition. The goal of the empirical work is to infer something about post-entry behavior from the entry choices of firms.

6.1 Model I: Symmetric Firms

Data consists of a cross-section of markets: $\{N_t, S_t, X_t\}_{t=1}^T$. Here $t$ indexes markets, $N_t$ is the number of firms in market $t$, $S_t$ is the size of market $t$, and $X_t$ is a vector of profit shifters (e.g., demand and cost shifters). Firms are identical so the value of entering market $t$ with $N_t$ firms is the same for all firms. Define

$$\Pi(N_t) = W(N_t, S_t, X_t, \theta) - F_t$$

as the value of entering market $t$ with $N_t$ firms; $W$ is the continuation value function: it represents the present discounted value of post-entry profits; $F_t$ is the fixed cost of entering market $t$. Note that the post-entry profit function and the fixed cost of entry are common to all firms. The fixed cost varies across markets. We assume that $W$ is decreasing in $N$ and increasing in $S$.

If market $t$ is large enough to support at least one firm (i.e., $W(1, S_t, X_t, \theta) > F_t$) but not all potential entrants, then the entry game has multiple pure strategy Nash equilibria. For example, consider an entry game with two potential entrants. A strategy for firm $i$ is given by $\delta_i \in \{0, 1\}$ where 1 denotes entry. The payoffs to the entry game are given by

$$\begin{array}{c|cc} & \delta_2 = 0 & \delta_2 = 1 \\ \hline \delta_1 = 0 & 0,\ 0 & 0,\ \Pi(1) \\ \delta_1 = 1 & \Pi(1),\ 0 & \Pi(2),\ \Pi(2) \end{array}$$

This game has two pure strategy equilibria when $\Pi(2) < 0 < \Pi(1)$. Firm 1 or firm 2 can enter but not both. (Note that in this case there is also a mixed strategy equilibrium in which each firm randomizes and earns zero expected profits.) The multiplicity of equilibria rules out using discrete choice models of firm behavior. Furthermore, the multiplicity problem is not the result of symmetry. Even if firms are asymmetric, multiple pure strategy equilibria are usually present. The basic problem is that the identity of entrants is not uniquely determined.

One approach to avoid this problem is to assume sequential entry. The difficulty in practice with this solution is to identify the order in which firms make their choices. The approach adopted by Bresnahan and Reiss, Berry, Berry and Waldfogel, and others is to reinterpret the model as one that predicts the number of firms (i.e., $N_t = \delta_1 + \delta_2$), not their identities. In the symmetric model, Nash equilibrium implies that the number of firms satisfies

$$\Pi(N_t, S_t, X_t, \theta) > 0 > \Pi(N_t + 1, S_t, X_t, \theta)$$

or equivalently,

$$W(N_t, S_t, X_t, \theta) > F_t > W(N_t + 1, S_t, X_t, \theta).$$

Conditional on market characteristics, the equilibrium number of firms varies with fixed costs.
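The equilibrium condition on the number of firms can be sketched as a simple search over a decreasing sequence of continuation values. The values of $W$ and $F$ below are hypothetical inputs chosen for illustration, and the function name is not from the notes.

```python
def equilibrium_number_of_firms(W, F):
    """Return N with W(N) > F > W(N + 1) for a decreasing list W.

    W[k] holds the continuation value with k + 1 firms, so W[0] = W(1).
    The list is a hypothetical input; a tie W(N) == F is treated as no entry,
    and N is capped at the number of potential entrants, len(W).
    """
    n = 0
    for value in W:
        if value > F:
            n += 1
        else:
            break
    return n

# Hypothetical continuation values for N = 1, ..., 4 in one market.
W = [10.0, 6.0, 3.5, 2.0]
fixed_costs = (12.0, 8.0, 5.0, 3.0, 1.0)
counts = [equilibrium_number_of_firms(W, F) for F in fixed_costs]
print(counts)
```

Holding market characteristics fixed, lowering the fixed cost moves the market through successive values of $N$, which is the variation the estimation exploits.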
Thus, the probability of observing $i$ firms in market $t$ depends on the probability that fixed costs in market $t$ lie between a lower and an upper bound. This is an ordered probability model and, if the distribution of $F_t$ is assumed to be normal, an ordered probit. Let $\bar{F}$ denote the mean of the distribution of $F_t$. The likelihood of $i$ firms is

$$P_{it} = \Pr\{N_t = i \mid S_t, X_t; \theta, \alpha\} = \Phi(W(i, S_t, X_t, \theta) - \bar{F}; \alpha) - \Phi(W(i + 1, S_t, X_t, \theta) - \bar{F}; \alpha),$$

and one can estimate the parameter vector $(\bar{F}, \theta, \alpha)$ by maximum likelihood. That is, define the indicator variable

$$y_{it} = \begin{cases} 1 & \text{if } N_t = i \\ 0 & \text{otherwise;} \end{cases}$$

then the log likelihood function is

$$\log L = \sum_{t=1}^{T} \sum_{i=1}^{n} y_{it} \log P_{it}.$$
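The ordered-probit probabilities and log likelihood can be sketched in a few lines. This is an illustration only: it assumes $F_t \sim N(\bar{F}, \sigma^2)$ with $\sigma$ standing in for $\alpha$, takes the continuation values as given numbers rather than a parametric $W$, and includes the outcome $N_t = 0$ so that the probabilities sum to one. All names and values are hypothetical.

```python
import math

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def count_probabilities(W, F_bar, sigma):
    """P(N = i) for i = 0..n, given decreasing W = [W(1), ..., W(n)].

    ASSUMPTION: F_t ~ Normal(F_bar, sigma^2); sigma plays the role of alpha.
    P(N = i) = Phi((W(i) - F_bar)/sigma) - Phi((W(i+1) - F_bar)/sigma),
    with the first term set to 1 for i = 0 and the second to 0 for i = n.
    """
    cum = [1.0] + [normal_cdf((w - F_bar) / sigma) for w in W] + [0.0]
    return [cum[i] - cum[i + 1] for i in range(len(W) + 1)]

def log_likelihood(observed_counts, W_by_market, F_bar, sigma):
    """Sum of log P(N_t = observed count) across markets."""
    total = 0.0
    for n_t, W in zip(observed_counts, W_by_market):
        total += math.log(count_probabilities(W, F_bar, sigma)[n_t])
    return total

W = [10.0, 6.0, 3.5, 2.0]                  # hypothetical W(1), ..., W(4)
probs = count_probabilities(W, F_bar=5.0, sigma=2.0)
ll = log_likelihood([1, 2], [W, W], F_bar=5.0, sigma=2.0)
print(probs, ll)
```

In an actual estimation the entries of `W` would be computed from $W(i, S_t, X_t, \theta)$ and the log likelihood maximized over $(\bar{F}, \theta, \alpha)$ with a numerical optimizer.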

The parameter estimates can be used to generate the set of cutoff points in market $t$: $\{W_t(1), W_t(2), W_t(3), \ldots, W_t(n)\}$. Here $n$ is the maximum number of firms. These cutoff points will vary with market characteristics. The interpretation of a cutoff point, $W_t(N)$, is that this is the value of fixed cost which makes the $N$th firm just indifferent between entering and staying out.

The parameter estimates can also be used to partition the range of market sizes. Instead of varying $F$ for fixed $S$ and $x$, we vary $S$ for fixed $F$ and $x$. Typically this is done by evaluating $F$ and $x$ at their mean values in the data, $\bar{F}$ and $\bar{x}$. Define $S_N$ as the population size which makes the $N$th entrant indifferent between entering and staying out. It solves

$$W(N, S_N, \bar{x}, \hat{\theta}) = \bar{F}.$$

The set of cutoff sizes $\{S_1, \ldots, S_N\}$ are called entry thresholds. Clearly, entry thresholds are increasing since $W$ is decreasing in $N$ and increasing in $S$. The question is what is happening to the per firm threshold, $s_N = S_N/N$. Is it increasing, constant, or decreasing? The answer to this question depends upon post-entry behavior and to some extent on functional form assumptions. Bresnahan and Reiss assume that the continuation value function takes the form

$$W(N_t, S_t, X_t, \theta) = V(N_t, X_t, \theta)\,S_t.$$

In this case, $W$ is increasing linearly in size and the thresholds are given by

$$S_N = \frac{\bar{F}}{V(N, \bar{x}, \hat{\theta})}.$$
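Given the multiplicative form, the per firm threshold ratios follow mechanically from $V(N)$. The sketch below computes $s_{N+1}/s_N$ for two hypothetical profiles of $V$: one where per-firm value simply splits a fixed total, $V(N) = V(1)/N$, and one where it falls faster than that. Both profiles and the function name are illustrative assumptions.

```python
from fractions import Fraction as Fr

def per_firm_threshold_ratios(V, F_bar):
    """Return s_{N+1}/s_N for N = 1..len(V)-1.

    Uses S_N = F_bar / V(N) and s_N = S_N / N, where V[k] is the per-capita
    continuation value with k + 1 firms (a hypothetical input).
    """
    s = [F_bar / v / (k + 1) for k, v in enumerate(V)]
    return [s[k + 1] / s[k] for k in range(len(s) - 1)]

F_bar = Fr(1)
split_total = [Fr(1, n) for n in range(1, 5)]            # V(N) = V(1)/N
falls_faster = [Fr(1), Fr(1, 4), Fr(1, 8), Fr(1, 12)]    # V(2) < V(1)/2

ratios_split = per_firm_threshold_ratios(split_total, F_bar)
ratios_faster = per_firm_threshold_ratios(falls_faster, F_bar)
print(ratios_split)
print(ratios_faster)
```

In the first profile every ratio equals one; in the second the ratios exceed one and decline toward one as $N$ grows.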

Suppose firms collude on the monopoly price and the good is homogenous; then $V(N) = V(1)/N$, and $s_N$ is constant in $N$. If prices fall with entry, then $V(N)$ should decrease more rapidly, particularly when the number is small (e.g., $V(2) < V(1)/2$). Therefore, if the per firm thresholds are increasing in $N$, that is,

$$\frac{s_{N+1}}{s_N} > 1,$$

then this is evidence of price competition. Note that product differentiation complicates the story since entry expands the market. In the extreme case of no substitutability, $V(N)$ is independent of $N$, which implies that entry thresholds are also constant across $N$. Thus, one cannot distinguish between collusion and product differentiation. In general, the two effects compete against each other.

Bresnahan and Reiss apply the above model to data on retail and professional firms in small isolated markets. The idea is to find towns that are sufficiently isolated that the firms do not have to compete with retail and professional firms in other towns, and small enough that they can support no more than two firms. The data are cross-sectional observations on town characteristics and on number of establishments. In their RES, 1990 paper, they focused on towns where the outcomes for $N$ are 0, 1 (monopoly), or 2 (duopoly). This allowed them to consider more general models than the one given above. In their later paper, they consider more outcomes for $N$. The following table is taken from their JPE, 1991 paper.

Profession      s2/s1   s3/s2   s4/s3   s5/s4
Doctors         1.98    1.10    1.00    0.95
Dentists        1.78    0.79    0.97    0.94
Plumbers        1.06    1.00    1.02    0.96
Tire dealers    1.81    1.28    1.04    1.03

Under the assumption of homogenous goods, prices appear to decline a lot in moving from monopoly to duopoly for doctors, dentists, and tire dealers, but further increases in $N$ do not seem to increase competition very much. Plumbers' prices never fall. The assumption of homogenous goods is not likely to be plausible. A threshold ratio near one may simply reflect the offsetting effects of price competition and market expansion. If price data are available, then one obvious way of quantifying the effects is to estimate the fall of prices with entry. B&R in their 1991 paper did observe prices for tire dealers and found that they do indeed fall with the first few entrants and then level off, which is consistent with the threshold size pattern. An interesting fact is that the prices appear to level out at a much higher level in small towns than in large cities.

6.1.1 Extensions

In their 1990 paper, BR considered the following extensions:

• The size of the market depends on variables other than town population size (TOWNPOP), such as the number of people living within ten miles of the town (OPOP10), growth rates in population (NGRW and PGRW), and the number of county residents who regularly commute to jobs outside the county (OCTY):

$$S(Y_t) = \mathrm{TOWNPOP}_t + \lambda_1 \mathrm{OPOP10}_t + \lambda_2 \mathrm{NGRW}_t + \lambda_3 \mathrm{PGRW}_t + \lambda_4 \mathrm{OCTY}_t.$$

Positive and negative growth rates are entered separately because of asymmetries between entry and exit decisions caused by sunk costs.

• The mean of the distribution of fixed costs varies with $N$. The difference may reflect entry barriers. Note that the model remains symmetric: if firm $i$ enters first, its fixed cost is a draw from $\Phi_1$, and if it enters second, its fixed cost is a draw from $\Phi_2$. They also include the retail wage (RETWAGE) and the per acre value of land (LANDVAL):

$$F_t = \gamma_1 + \gamma_2 D + \gamma_3 \mathrm{RETWAGE}_t + \gamma_4 \mathrm{LANDVAL}_t + \xi_t,$$

where $D$ is a dummy variable that is equal to 1 if the market is a duopoly.

• Genesove (2002) uses the BR model to study competition in the daily newspaper industry. He has panel data on the population of towns and on their number of daily newspapers. He uses a different estimation procedure. Define

$$Q_{it} = \Pr\{N_t \geq i \mid S_t; \alpha, \theta\} = \Phi(W(i, S_t, \theta); \alpha).$$

Instead of estimating an ordered probit, he estimates $n$ separate probits for $i = 1, 2, \ldots, n$. This procedure allows the marginal impact of size to vary across the thresholds. Genesove also does not impose any distributional assumptions on $F_t$, estimating the probits nonparametrically. We will hear more about this model in the student presentations.

• Berry and Waldfogel (RAND, ’9?) consider entry into the radio industry, where both price and quantity data are available. The additional information allows the authors to estimate some (or all, depending upon the model) of the parameters of $W(N, S, X)$. The entry data are used to estimate the remaining parameters (if any) of $W$ and the distribution of fixed costs.

• Mazzeo (1999) extends the BR model in a different direction. He considers a vertically differentiated market in which firms can choose to enter with a high- or a low-quality product. In this case, the continuation value function depends upon the number of firms offering the high- and low-quality products in a market. Thus, it is no longer possible to summarize the market conditions in terms of the total number of firms. One needs to know $M$, the number of firms offering the high-quality product, and $N$, the number of firms offering the low-quality product. Furthermore, the conditions under which entry is profitable are more complicated, since we need to consider two value functions, $W_{Lt}$ and $W_{Ht}$, which measure the returns from entering the low- and high-quality segments of market $t$. That is,

$$W_{it} = \beta_i X_t + g(M_t, N_t, \theta) + \xi_{it}, \quad i = H, L,$$

where $H$ denotes high quality and $L$ denotes low quality. The unobserved error is the same for all firms offering quality $i$. Firms are also assumed to have the same fixed cost. Thus, firms are symmetric ex ante and symmetric ex post conditional on location. We will hear more about this model in class presentations. Note: the simultaneous-move game may fail to have an equilibrium.
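The two BR extensions listed above (market size and fixed costs) can be turned into ordered-probit entry probabilities with a little arithmetic. The sketch below is illustrative only. It assumes that a market supports at least $n$ firms when $S \cdot v_n - F_n \geq 0$, with a standard normal error in fixed costs. The per-capita variable profits `v`, the function names, and every number are hypothetical and are not taken from BR.

```python
from statistics import NormalDist

def market_size(townpop, opop10, ngrw, pgrw, octy, lam):
    """S(Y_t); the coefficient on TOWNPOP is normalized to one."""
    l1, l2, l3, l4 = lam
    return townpop + l1 * opop10 + l2 * ngrw + l3 * pgrw + l4 * octy

def mean_fixed_cost(duopoly, retwage, landval, gam):
    """Mean of F_t (the xi_t term is omitted); `duopoly` is the dummy D."""
    g1, g2, g3, g4 = gam
    return g1 + g2 * duopoly + g3 * retwage + g4 * landval

def entry_probabilities(size, v, fbar):
    """Return [Pr{N=0}, ..., Pr{N=n-1}, Pr{N>=n}] for n = len(v).

    Assumption: Pr{N >= k} = Phi(size * v[k-1] - fbar[k-1]), and this
    index is decreasing in k so that all cell probabilities are >= 0.
    """
    phi = NormalDist().cdf
    q = [1.0] + [phi(size * vk - fk) for vk, fk in zip(v, fbar)] + [0.0]
    return [q[k] - q[k + 1] for k in range(len(q) - 1)]

# Hypothetical market (populations in thousands) and parameter values.
size = market_size(2.0, 1.0, 0.0, 0.1, 0.5, lam=(0.5, -1.0, 1.0, -0.2))
gam = (1.0, 0.4, 0.5, 0.2)
fbar = [mean_fixed_cost(d, retwage=0.4, landval=0.5, gam=gam) for d in (0, 1)]
probs = entry_probabilities(size, v=(0.8, 0.5), fbar=fbar)
print(size, probs)
```

Because the cell probabilities are differences of adjacent threshold probabilities, they sum to one by construction. They are nonnegative only if the thresholds are ordered, which is the restriction that a single ordered probit imposes and that separate probits relax.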

6.2

Model II: Heterogenous Firms

Berry (Econometrica, 1992) studies entry of airlines operating in city-pair markets. He assumes that each city-pair market is a homogenous good market in which each firm earns the same profits. However, he allows the distribution of fixed costs of entry to vary across firms. In other words, he assumes that

$$\Pi(N_t) = W(N_t, X_t, \theta) - F(Z_{ft}, \xi_{ft}; \theta)$$

where $Z_{ft}$ is a vector of observed firm-specific variables, and $\xi_{ft}$ is the unobserved firm/market-specific profit shock. For simplicity, we include the size of the market in $X$. The particular functional form assumed by Berry is

$$\Pi(N_t) = \beta X_t - \delta \ln(N_t) + \alpha Z_{ft} + \xi_{ft},$$

where

$$\xi_{ft} = \rho u_t + \sqrt{1 - \rho^2}\, u_{ft}$$

consists of a market-specific component and a firm-market-specific component. The disturbances $u_t$ and $u_{ft}$ are assumed to be i.i.d. standard normal across firms and markets. The correlation of the $\xi_{ft}$'s across firms within market $t$ is given by $\rho^2$. As is typically the case in discrete choice models, the units of $\Pi$ are not identified. Here we adopt the usual normalization that the variance of $\xi_{ft}$ is one, hence the weight of $\sqrt{1 - \rho^2}$ on $u_{ft}$.

As before, we work with the number of firms as the dependent variable and not with the individual firm's entry choice. Berry shows that, even though firms are heterogenous, the equilibrium number of firms in market $t$ is uniquely determined (although the identities of the entrants are not). The next step is to compute the probability of the event that $N_t = i$. Heterogeneity makes this step quite complicated since the unobservables are no longer one-dimensional. Their dimension is equal to the number of potential entrants, since each firm has its own unobserved term. Hence, computing the probability of the event $\{N_t = i\}$ involves as many integrals as there are firms. But even this is not the main difficulty: there are numerical routines that can compute high-dimensional integrals. The primary difficulty has to do with the irregular shape of the domain of integration.

To see why this is the case, consider a simple example with two firms, 1 and 2. Define

$$B_{ij} = \{\xi \mid \xi_i > -W(1),\ \xi_j < -W(2)\}, \quad i \neq j,\ i, j = 1, 2,$$

and

$$B_{00} = \{\xi \mid \xi_1 < -W(1),\ \xi_2 < -W(1)\}, \qquad B_{22} = \{\xi \mid \xi_1 > -W(2),\ \xi_2 > -W(2)\}.$$

Clearly, $B_{00}$ is the event that neither firm enters, $B_{22}$ is the event that both firms enter, and $B_{ij}$ is the event that firm $i$ enters and firm $j$ does not. The figure on the board partitions the plane according to the four possible outcomes. The regions where both or neither enter are simple rectangles.
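The partition in the two-firm example can be checked numerically. The following sketch classifies each shock pair by region and compares the simulated frequencies of the two rectangles with their closed forms. It assumes independent standard normal shocks (that is, $\rho = 0$) purely to keep the closed forms simple. The values chosen for $W(1)$ and $W(2)$ and all function names are hypothetical.

```python
import random
from statistics import NormalDist

W1, W2 = 0.5, -0.3  # hypothetical values with W(1) > W(2)

def in_B00(x1, x2):
    """Neither firm is profitable even as a monopolist."""
    return x1 < -W1 and x2 < -W1

def in_B22(x1, x2):
    """Both firms are profitable as duopolists."""
    return x1 > -W2 and x2 > -W2

def in_B(xi, xj):
    """B_ij: firm i is profitable alone, firm j is not as a duopolist."""
    return xi > -W1 and xj < -W2

def number_of_entrants(x1, x2):
    if in_B00(x1, x2):
        return 0
    if in_B22(x1, x2):
        return 2
    return 1

rng = random.Random(0)
draws = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(100_000)]

counts = [0, 0, 0]
for x1, x2 in draws:
    n = number_of_entrants(x1, x2)
    # Outside the two rectangles, the draw lies in B12 or B21 (or both).
    assert (n == 1) == (in_B(x1, x2) or in_B(x2, x1))
    counts[n] += 1
freq = [c / len(draws) for c in counts]

# With independent shocks, each rectangle is a product of normal CDFs.
Phi = NormalDist().cdf
p00 = Phi(-W1) ** 2
p22 = (1 - Phi(-W2)) ** 2
print(freq, p00, p22)
```

The two rectangles need only univariate normal CDFs. Everything left over is the one-entrant region, which the sketch obtains as a residual instead of integrating over it directly.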
However, the probability of the event that one firm enters is

$$\Pr\{N = 1\} = \Pr\{B_{12}\} + \Pr\{B_{21}\} - \Pr\{B_{12} \cap B_{21}\},$$

and the corresponding region, $B_{12} \cup B_{21}$, is not a simple rectangle. It is possible in the case of two firms to express the integral as the sum of simple integrals, but this is not generally possible for more than two firms.

Berry's solution to the problem is to use simulation methods to compute the relevant probabilities. Take $S$ draws on the underlying $R + 1$ random variables, where $R$ is the number of potential entrants. For each guess of $\theta$ and the market observables, construct the equilibrium number of firms, $\hat{N}_t(u_s)$, where $u_s$ is the realization of the $s$th draw. At the true value of $\theta$, the difference between the observed values of $N$ and the value of $N$ predicted by the model is mean independent of all the determinants of $N$. This lets us build a method of moments subroutine that looks for the value of $\theta$ that makes the covariance of the difference between the observed and predicted $N$'s and any function of its determinants equal to zero.
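The simulation step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not Berry's actual code. It assumes $\delta \geq 0$, takes the predicted number of firms to be the average of $\hat{N}_t(u_s)$ over the $S$ draws, and holds the draws fixed across guesses of $\theta$. The function names, the data layout, and all numbers are hypothetical.

```python
import math
import random

def equilibrium_n(phi, delta):
    """Largest n such that at least n firms earn phi_f - delta*ln(n) >= 0.

    phi holds each firm's profit excluding the competition term.
    Assumes delta >= 0, so the check is monotone in n.
    """
    n_star = 0
    for n, p in enumerate(sorted(phi, reverse=True), start=1):
        if p - delta * math.log(n) < 0:
            break
        n_star = n
    return n_star

def simulated_n(theta, x, z, draws):
    """Average equilibrium N over draws; each draw is (u_t, u_1t, ..., u_Rt)."""
    beta, delta, alpha, rho = theta
    w = math.sqrt(1 - rho ** 2)
    total = 0
    for u in draws:
        phi = [beta * x + alpha * zf + rho * u[0] + w * uf
               for zf, uf in zip(z, u[1:])]
        total += equilibrium_n(phi, delta)
    return total / len(draws)

def moment(theta, markets, instrument):
    """Sample covariance-type moment between (N - N_hat) and an instrument.

    markets: list of (n_obs, x, z, draws), with draws fixed across theta.
    """
    terms = [(n_obs - simulated_n(theta, x, z, d)) * instrument(x, z)
             for n_obs, x, z, d in markets]
    return sum(terms) / len(terms)

# Hypothetical market with R = 3 potential entrants and S = 500 draws.
rng = random.Random(1)
z = [0.3, 0.0, -0.4]
draws = [[rng.gauss(0, 1) for _ in range(len(z) + 1)] for _ in range(500)]
theta = (0.5, 1.0, 1.0, 0.5)  # (beta, delta, alpha, rho)
n_hat = simulated_n(theta, x=1.0, z=z, draws=draws)
print(n_hat)
```

An outer optimizer would search over $\theta$ to drive a vector of such moments toward zero. Reusing the same draws at every guess keeps the simulated objective from jumping around for reasons unrelated to $\theta$.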
