Discrete Multinomial Choices and Event Counts - NYU Stern




18


Discrete Multinomial Choices and Event Counts








18.1 Introduction

Chapter 17 presented most of the econometric issues that arise in analyzing
discrete dependent variables, including specification, estimation,
inference, and a variety of variations on the basic model. All of these
were developed in the context of a model of binary choice, the choice
between two alternatives. This chapter will use those results in extending
the choice model to three specific settings:

Multinomial Choice: The individual chooses among more than two alternatives,
once again making the choice that provides the greatest utility.
Applications include the choice among political candidates, how to commute
to work, which energy supplier to use, what health care plan to choose,
where to live, or what brand of car, appliance, or food product to buy.

Ordered Choice: The individual reveals the strength of their preferences
with respect to a single outcome. Familiar cases involve survey questions
about strength of feelings regarding a particular commodity such as a
movie, a book, or a consumer product, or self-assessments of social
outcomes such as health in general or self-assessed well-being. Although
preferences will probably vary continuously in the space of individual
utility, the expression of those preferences for purposes of analyses is
given in a discrete outcome on a scale with a limited number of choices,
such as the typical five-point scale used in marketing surveys.

Event Counts: The observed outcome is a count of the number of occurrences.
In many cases, this is similar to the preceding settings in that the
"dependent variable" measures an individual choice, such as the number of
visits to the physician or the hospital, the number of derogatory reports
in one's credit history, or the number of visits to a particular recreation
site. In other cases, the event count might be the outcome of some less
focused natural process, such as the prevalence of a disease in a
population, the number of defects per unit of time in a production
process, the number of traffic accidents that occur at a particular
location per month, the number of customers that arrive at a service point
per unit of time, or the number of messages that arrive at a switchboard per
unit of time over the course of a day. In this setting, we will be doing a
more familiar sort of regression modeling.

Most of the methodological underpinnings needed to analyze these cases
were presented in Chapter 17. In this chapter, we will be able to develop
variations on these basic model types that accommodate different choice
situations. As in Chapter 17, we are focused on discrete outcomes, so the
analysis is framed in terms of models of the probabilities attached to
those outcomes.

18.2 MODELS FOR UNORDERED MULTIPLE CHOICES

Some studies of multiple-choice settings include the following:
1. Hensher (1986, 1991), McFadden (1974), and many others have analyzed
the travel mode of urban commuters. In Greene (2007b), Hensher and Greene
analyze commuting between Sydney and Melbourne by a sample of individuals
who choose among air, train, bus, and car as the mode of travel.
2. Schmidt and Strauss (1975a, b) and Boskin (1974) have analyzed
occupational choice among multiple alternatives.
3. Rossi and Allenby (1999, 2003) studied consumer brand choices in a
repeated choice (panel data) model.
4. Train (2009) studied the choice of electricity supplier by a
sample of California electricity customers.
5. Michelsen and Madlener (2012) studied homeowners' choice of type of
heating appliance to install in a new home.
6. Hensher, Rose, and Greene (2015) analyzed choices of automobile
models by a sample of consumers offered a hypothetical menu of features.
7. Lagarde (2013) examined the choice among different sets of
guidelines for preventing malaria by a sample of individuals in Ghana.

In each of these cases, there is a single decision among two or more
alternatives. In this and the next section, we will encounter two broad
types of multinomial choice sets, unordered choices and ordered choices.
All of the choice sets listed above are unordered. In contrast, a bond
rating or a preference scale is, by design, a ranking; that is its
purpose. Quite different techniques are used for the two types of models.
We will examine models for ordered choices in Section 18.3. This section
will examine models for unordered choice sets. General references on the
topics discussed here include Hensher, Louviere, and Swait (2000), Train
(2009), and Hensher, Rose, and Greene (2015).

18.2.1 Random Utility Basis of the Multinomial Logit Model

Unordered choice models can be motivated by a random utility model. For the
[pic]th consumer faced with [pic] choices, suppose that the utility of
choice [pic] is
[pic]
If the consumer makes choice [pic] in particular, then we assume that [pic]
is the maximum among the [pic] utilities. Hence, the statistical model is
driven by the probability that choice [pic] is made, which is
[pic]
The model is made operational by a particular choice of distribution for
the disturbances. As in the binary choice case, two models are usually
considered, logit and probit. Because of the need to evaluate multiple
integrals of the normal distribution, the probit model has found rather
limited use in this setting. The logit model, in contrast, has been widely
used in many fields, including economics, market research, politics,
finance, and transportation engineering. Let [pic] be a random variable
that indicates the choice made. McFadden (1974a) has shown that if (and
only if) the [pic] disturbances are independent and identically distributed
with Gumbel (type 1 extreme value) distributions,
[pic] (18-1)
then
[pic] (18-2)
which leads to what is called the conditional logit model. (It is often
labeled the multinomial logit model, but this wording conflicts with the
usual name for the model discussed in the next section, which differs
slightly. Although the distinction turns out to be purely artificial, we
will maintain it for the present.)
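The link between Gumbel disturbances and the logit form can be checked numerically. The following is a minimal sketch, not the text's own computation: the attribute matrix `Z`, the coefficient vector `theta`, and the function name are hypothetical, and the probabilities are assumed to take the standard conditional logit form, exp(z_j'theta) divided by the sum of exp(z_k'theta) over the alternatives. The simulation draws i.i.d. type 1 extreme value disturbances, picks the alternative with the largest utility, and compares the choice frequencies with the closed form.

```python
import numpy as np

def conditional_logit_probs(Z, theta):
    """Choice probabilities exp(z_j'theta) / sum_k exp(z_k'theta) for one individual.

    Z is a (J, K) array with one row of attributes per alternative.
    """
    v = Z @ theta
    v = v - v.max()          # stabilise the exponentials; probabilities are unchanged
    expv = np.exp(v)
    return expv / expv.sum()

# Hypothetical data: J = 3 alternatives, K = 2 attributes (illustrative values only)
Z = np.array([[1.0, 0.5],
              [0.2, 1.5],
              [0.0, 0.0]])
theta = np.array([0.8, -0.4])
probs = conditional_logit_probs(Z, theta)

# Monte Carlo check: i.i.d. Gumbel disturbances, choose the maximum-utility alternative
rng = np.random.default_rng(0)
eps = rng.gumbel(loc=0.0, scale=1.0, size=(200_000, Z.shape[0]))
choices = np.argmax(Z @ theta + eps, axis=1)
simulated = np.bincount(choices, minlength=Z.shape[0]) / len(choices)

print("closed form:", probs)
print("simulated:  ", simulated)
```

With this many draws the simulated shares should agree with the closed form to roughly two decimal places; the agreement is a property of the Gumbel assumption and would not hold for, say, normal disturbances.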
Utility depends on [pic], which includes aspects specific to the
individual as well as to the choices. It is useful to distinguish them. Let
[pic] and partition [pic] conformably into [pic]. Then [pic] varies across
the choices and possibly across the individuals as well. The components of
[pic] are typically called the attributes of the choices. In contrast, [pic]
contains the characteristics of the individual and is, therefore, the same
for all choices. If we incorporate this fact in the model, then (18-2)
becomes
[pic] (18-3)
Terms that do not vary across alternatives (that is, those specific to the
individual) fall out of the probability. This is as expected in a model that
compares the utilities of the alternatives.
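A small numerical illustration of this cancellation, under assumed names and values: `X` holds attributes that vary across alternatives, `w` holds characteristics of the individual, and `beta` and `alpha` are hypothetical coefficient vectors. Because w'alpha adds the same scalar to every alternative's utility, the logit probabilities computed with and without it should coincide.

```python
import numpy as np

def logit_probs(v):
    """Logit probabilities from a vector of systematic utilities."""
    expv = np.exp(v - np.max(v))
    return expv / expv.sum()

# Hypothetical attributes (one row per alternative) and their coefficients
X = np.array([[2.0, 1.0],
              [1.0, 3.0],
              [0.5, 0.5]])
beta = np.array([0.6, -0.3])

# Hypothetical characteristics of one individual, e.g. a constant and income
w = np.array([1.0, 45.0])
alpha = np.array([0.25, 0.02])

v_attributes_only = X @ beta
v_with_characteristics = X @ beta + w @ alpha   # same scalar added to every row

p_without = logit_probs(v_attributes_only)
p_with = logit_probs(v_with_characteristics)

print(p_without)
print(p_with)
```

The two printed vectors are equal up to floating-point error, which is why a common set of coefficients on individual characteristics cannot be identified from choice data alone.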
Consider a model of shopping center choice by individuals in various
cities that depends on the number of stores at the mall, [pic], the
distance from the central business district, [pic], and the shoppers'
incomes, [pic]. The utilities for three choices would be
[pic]
[pic]
[pic]
The choice of alternative 1, for example, reveals that
[pic]
[pic]
The constant term and Income have fallen out of the comparison. The result
follows from the fact that the random utility model is ultimately based on
comparisons of pairs of alternatives, not the alternatives themselves.
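The algebra behind this cancellation can be sketched under an assumed notation: write S_j and D_j for the number of stores and the distance of mall j, I for the shopper's income, and suppose a common constant \alpha and a common income coefficient \gamma enter every utility. These symbols are illustrative labels and are not necessarily the ones used in the displays above.

```latex
\begin{aligned}
U_j &= \beta_1 S_j + \beta_2 D_j + \alpha + \gamma I + \varepsilon_j,
      \qquad j = 1, 2, 3, \\
U_1 - U_2 &= \beta_1 (S_1 - S_2) + \beta_2 (D_1 - D_2)
      + (\varepsilon_1 - \varepsilon_2) > 0, \\
U_1 - U_3 &= \beta_1 (S_1 - S_3) + \beta_2 (D_1 - D_3)
      + (\varepsilon_1 - \varepsilon_3) > 0.
\end{aligned}
```

Both \alpha and \gamma I appear in every utility with the same value, so they drop out of each difference.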
Evidently, if the model is to allow individual-specific effects, then it
must be modified. One method is to create a set of dummy variables
(alternative-specific constants), [pic], for the choices and multiply each
of them by the common w. We then allow the coefficients on these
choice-invariant characteristics, rather than the characteristics
themselves, to vary across the choices. Analogously to the linear model, a complete set of
interaction terms creates a singularity, so one of them must be dropped.
For this example, the matrix of attributes and characteristics would be
[pic]
The probabilities for this model would be
[pic]
[pic]
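The construction can be sketched in code. This is an illustration under assumptions rather than the layout used in the text: the column ordering, the choice of the last alternative as the omitted base, the helper names, and all numerical values are hypothetical. Each row holds an alternative's attributes followed by alternative-specific dummies and those dummies multiplied by income.

```python
import numpy as np

def build_design(stores, distance, income):
    """Stack attributes with alternative-specific constants and income interactions.

    The last alternative is the base: its dummy and its interaction are dropped
    to avoid the singularity that a complete set of dummies would create.
    """
    J = len(stores)
    dummies = np.eye(J)[:, : J - 1]      # J x (J-1); the base row is all zeros
    return np.column_stack([stores, distance, dummies, dummies * income])

def choice_probs(Z, theta):
    """Logit probabilities for one individual given design matrix Z."""
    v = Z @ theta
    expv = np.exp(v - v.max())
    return expv / expv.sum()

# Hypothetical malls and coefficients (illustrative values only)
stores = np.array([120.0, 80.0, 40.0])
distance = np.array([10.0, 5.0, 2.0])
theta = np.array([0.01, -0.10, 0.3, -0.2, 0.004, 0.001])

Z_low = build_design(stores, distance, income=30.0)
Z_high = build_design(stores, distance, income=90.0)
p_low = choice_probs(Z_low, theta)
p_high = choice_probs(Z_high, theta)

print(Z_low)
print(p_low, p_high)
```

Because the income coefficients now differ by alternative, changing income shifts the probabilities, which it could not do when income entered every utility with a single coefficient.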

18.2.2 THE MULTINOMIAL LOGIT MODEL

To set up the model that applies when data are individual specific, it will
help to consider an example. Schmidt and Strauss (1975a, b) estimated a
model of occupational choice based on a sample of 1,000 observations drawn
from the Public Use Sample for three years: 1960, 1967, and 1970. For each
sample, the data for each individual in the sample consist of the
following:
1. Occupation: [pic], [pic], [pic], [pic], [pic]. (Note the slightly
different numbering convention, starting at zero, which is standard.)
2. Characteristics: constant, education, experience, race, sex.
The multinomial logit model[1] for occupational choice is
[pic] (18-4)
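As a hedged illustration of how probabilities of this kind can be computed, the sketch below assumes the common form in which each outcome has its own coefficient vector on the individual's characteristics, with the coefficients of outcome 0 normalized to zero for identification. That normalization, the function name, and every numerical value are assumptions made for the example; only the five outcomes and the list of characteristics come from the description above.

```python
import numpy as np

def multinomial_logit_probs(w, alphas):
    """P(Y = j | w) for j = 0, ..., J, with outcome 0 as the base.

    w      : (K,) characteristics of one individual, including a constant
    alphas : (J, K) coefficient vectors for outcomes 1, ..., J
    """
    v = np.concatenate(([0.0], alphas @ w))   # index of the base outcome fixed at 0
    expv = np.exp(v - v.max())
    return expv / expv.sum()

# Hypothetical individual: constant, education, experience, race, sex
w = np.array([1.0, 12.0, 10.0, 0.0, 1.0])

# Hypothetical coefficients for occupations 1..4 (occupation 0 is the base)
alphas = np.array([
    [-1.0, 0.10, 0.02, 0.1, 0.3],
    [-2.0, 0.20, 0.03, -0.2, 0.5],
    [-3.0, 0.30, 0.01, 0.0, -0.4],
    [-4.0, 0.40, 0.02, 0.2, 0.1],
])
probs = multinomial_logit_probs(w, alphas)

# With a single non-base outcome the same function returns a binary logit
binary = multinomial_logit_probs(w, alphas[:1])

print(probs)
print(binary)
```

Unlike the conditional logit case, the regressors here do not vary across outcomes; the variation comes entirely from the outcome-specific coefficients.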
(The binomial logit model in Section 17.3 is conveniently produced as the