Please make sure to review the materials for this Module before addressing issues pertaining to the measurement of police performance. For this week's discussion, we have a very complex issue presented in a simple question:
How should police officer performance be assessed?
Please note that your opinion on this issue must be supported by relevant literature and research.
New Perspectives in Policing
Executive Session on Policing and Public Safety
This is one in a series of papers that will be published as a result of the Executive Session on Policing and Public Safety.
Harvard’s Executive Sessions are a convening of individuals of independent standing who take joint responsibility for rethinking and improving society’s responses to an issue. Members are selected based on their experiences, their reputation for thoughtfulness and their potential for helping to disseminate the work of the Session.
In the early 1980s, an Executive Session on Policing helped resolve many law enforcement issues of the day. It produced a number of papers and concepts that revolutionized policing. Thirty years later, law enforcement has changed and NIJ and the Harvard Kennedy School are again collaborating to help resolve law enforcement issues of the day.
Learn more about the Executive Session on Policing and Public Safety at:
www.NIJ.gov, keywords “Executive Session Policing”
www.hks.harvard.edu, keywords “Executive Session Policing”
Harvard Kennedy School, Program in Criminal Justice Policy and Management
National Institute of Justice
March 2015
Measuring Performance in a Modern Police Organization
Malcolm K. Sparrow
Introduction
Perhaps everything the modern police executive
needs to know about performance measurement
has already been written. But much of the best
work on the subject is both voluminous and now
more than a decade old, so there is no guarantee
that today’s police executives have read it. Indeed, it
appears that many police organizations have not yet
taken some of its most important lessons to heart.
I hope, in this paper, to offer police executives
some broad frameworks for recognizing the
value of police work, to point out some common
mistakes regarding performance measurement,
and to draw police executives’ attention to key
pieces of literature that they might not have
explored and may find useful. I also hope to
bring to the police profession some of the general
lessons learned in other security and regulatory
professions about the special challenges of
performance measurement in a risk-control or
harm-reduction setting.
A research project entitled “Measuring What
Matters,” funded jointly by the National Institute
of Justice (NIJ) and the Office of Community
Oriented Policing Services (COPS Office), led
to the publication in July 1999 of a substantial
collection of essays on the subject of measuring
performance.1 The 15 essays that make up that
collection are fascinating, not least for the
divergence of opinion they reveal among the
experts of the day. The sharpest disagreements
pit the champions of the New York Police
Department’s (NYPD’s) early CompStat model
(with its rigorous and almost single-minded
focus on reductions in reported crime as the
“bottom line” of policing) against a broad range
of scholars who mostly espoused more expansive
conceptions of the policing mission and pressed
the case for more inclusive and more nuanced
approaches to performance measurement.
Three years later, in 2002, the Police Executive
Research Forum (PERF) published another
major report on performance measurement,
Recognizing Value in Policing: The Challenge
of Recognizing Police Performance, authored
principally by Mark H. Moore.2 PERF followed
that up in 2003 with a condensed document, The
“Bottom Line” of Policing: What Citizens Should
Value (and Measure!) in Police Performance,3
authored by Moore and Anthony Braga.
Despite the richness of the frameworks presented
in these and other materials, a significant
proportion of today’s police organizations
seem to remain narrowly focused on the same
categories of indicators that have dominated the
field for decades:
(a) Reductions in the number of serious crimes
reported, most commonly presented as
local comparisons against an immediately
preceding time period.
(b) Clearance rates.
(c) Response times.
(d) Measures of enforcement productivity (e.g.,
numbers of arrests, citations or stop-and-frisk
searches).
Cite this paper as: Sparrow, Malcolm K. Measuring Performance in a Modern Police Organization. New Perspectives in Policing Bulletin. Washington, D.C.: U.S. Department of Justice, National Institute of Justice, 2015. NCJ 248476.
A few departments now use citizen satisfaction
surveys on a regular basis, but most do not.
Clearance rates are generally difficult to measure
in a standardized and objective fashion, so
category (b) tends to receive less emphasis than
the other three. Categories (c) and (d) — response
times and enforcement productivity metrics —
are useful in showing that police are getting to
calls fast and working hard but reveal nothing
about whether they are working intelligently,
using appropriate methods or having a positive
impact.
Therefore, category (a) — reductions in the
number of serious crime reports — tends to
dominate many departments’ internal and
external claims of success, being the closest
thing available to a genuine crime-control
outcome measure. These measures have retained
their prominence despite everything the field is
supposed to have learned in the last 20 years
about the limitations of reported crime statistics.
Those limitations (which this paper will explore in
greater detail later) include the following:
(1) The focus is narrow because crime control is just
one of several components of the police mission.
(2) The focus on serious crimes is narrower still, as
community concerns often revolve around other
problems and patterns of behavior.
(3) Relentless pressure to lower the numbers,
without equivalent pressure to preserve the
integrity of the recording and reporting systems,
invites manipulation of crime statistics —
suppression of reports and misclassification of
crimes — and other forms of corruption.
(4) Focusing on reported crime overlooks
unreported crimes. Overall levels of
victimization are generally two to three times
higher than reported crime rates.4 Particularly
low reporting rates apply to household thefts,
rape, other sexual assaults, crimes against
youths ages 12 to 17, violent crimes committed
at schools, and crimes committed by someone
the victim knows well.5
(5) Pressure to reduce the numbers is
counterproductive when dealing with invisible
crimes (classically unreported or underreported
crimes, such as crimes within the family, white
collar crimes, consensual crimes such as
drug dealing or bribery, and crimes involving
intimidation). Successful campaigns against
these types of crime often involve deliberate
attempts to expose the problem by first driving
reporting rates up, not down.6
(6) A focus on crime rate reductions does not
consider the costs or side effects of the strategies
used to achieve them.
(7) Emphasizing comparisons with prior time
periods affords a short-term and very local
perspective. It may give a department the
chance to boast, even while its crime rates
remain abysmal compared with other
jurisdictions. Conversely, best performers (with
low crime rates overall) might look bad when
random fluctuations on a quarterly or annual
basis raise their numbers. Genuine longer term
trends may be masked by temporary changes,
such as those caused by weather patterns or
special events. More important than local
short-term fluctuations are sustained longer
term trends and comparisons with crime rates
in similar communities. Pressure to beat one’s
own performance, year after year, can produce
bizarre and perverse incentives.
(8) Even if crime levels were once out of control,
the reductions achievable will inevitably run
out eventually, when rates plateau at more
acceptable levels. At this point, the department’s
normal crime-control success story — assuming
that reductions in reported crime rates had been
its heart and soul — evaporates. Some executives
fail to recognize the point at which legitimate
reductions have been exhausted. Continuing to
demand reductions at that point is like failing to
set the torque control on a power screwdriver:
first you drive the screw, which is useful work;
but then you rip everything to shreds and even
undo the value of your initial tightening. The
same performance focus that initially produced
legitimate gains becomes a destructive force if
pressed too hard or for too long.
(9) A number is just a number, and reliance on it
reduces all the complexity of real life to a zero
or a one. One special crime, or one particular
crime unsolved, may have a disproportionate
impact on a community’s sense of safety and
security. Aggregate numbers fail to capture
the significance of special cases.
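Limitation (7) above rests on simple arithmetic, which a minimal sketch can make concrete. The figures below are hypothetical and chosen only for illustration; the function name `pct_change` and both jurisdictions are assumptions, not data from any department.

```python
def pct_change(previous: float, current: float) -> float:
    """Percentage change from a prior period to the current period."""
    if previous == 0:
        raise ValueError("previous-period count must be nonzero")
    return (current - previous) / previous * 100.0


# Hypothetical serious-crime reports per 100,000 residents (illustrative only).
jurisdictions = {
    "High-crime city": {"previous": 2000, "current": 1900},
    "Low-crime town": {"previous": 20, "current": 26},
}

for name, counts in jurisdictions.items():
    change = pct_change(counts["previous"], counts["current"])
    print(f"{name}: {counts['current']} per 100,000 ({change:+.1f}% vs. prior period)")
```

Judged only against its own prior period, the hypothetical high-crime city reports a 5 percent reduction while the low-crime town reports a 30 percent increase, even though the town's rate remains a small fraction of the city's.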
Reported crime rates will always belong among
the suite of indicators relevant for managing a
complex police department, as will response
times, clearance rates, enforcement productivity,
community satisfaction and indicators of morale.
But what will happen if police executives stress
one or another of these to the virtual exclusion of
all else? What will happen if relentless pressure
is applied to lower the reported crime rate, but
no counterbalancing controls are imposed on
methods, the use of force, or the integrity of
the recording and reporting systems? From the
public’s perspective, the resulting organizational
behaviors can be ineffective, inappropriate and
even disastrous.
If we acknowledged the limitations of reported
crime rates and managed to lessen our
dependence on them, then how would we
recognize true success in crime control? And how
might we better capture and describe it?
I believe the answer is the same across the
full range of government’s risk-control
responsibilities, whether the harms to be
controlled are criminal victimization, pollution,
corruption, fraud, tax evasion, terrorism or other
potential and actual harms. The definition of
success in risk control or harm reduction is to
spot emerging problems early and suppress them
before they do much harm.7 This is a very different
idea from “allow problems to grow so hopelessly
out of control that we can then get serious, all of a
sudden, and produce substantial reductions year
after year after year.”
What do citizens expect of government agencies
entrusted with crime control, risk control, or
other harm-reduction duties? The public does not
expect that governments will be able to prevent
all crimes or contain all harms. But they do
expect government agencies to provide the best
protection possible, and at a reasonable price, by
being:
(a) Vigilant, so they can spot emerging threats
early, pick up on precursors and warning
signs, use their imaginations to work out what
could happen, use their intelligence systems
to discover what others are planning, and do
all this before much harm is done.
(b) Nimble, flexible enough to organize
themselves quickly and appropriately around
each emerging crime pattern rather than
being locked into routines and processes
designed for traditional issues.
(c) Skillful, masters of the entire intervention
toolkit, experienced (as craftsmen) in
picking the best tools for each task, and
adept at inventing new approaches when
existing methods turn out to be irrelevant or
insufficient to suppress an emerging threat.8
Real success in crime control — spotting
emerging crime problems early and suppressing
them before they do much harm — would not
produce substantial year-to-year reductions in
crime figures because genuine and substantial
reductions are available only when crime
problems have first grown out of control. Neither
would best practices produce enormous numbers
of arrests, coercive interventions or any other
specific activity because skill demands economy
in the use of force and financial resources and
rests on artful and well-tailored responses rather
than extensive and costly campaigns.
Ironically, therefore, the two classes of metrics
that still seem to wield the most influence in
many departments — crime reduction and
enforcement productivity — would utterly fail to
reflect the very best performance in crime control.
Furthermore, we must take seriously the fact
that other important duties of the police will
never be captured through crime statistics or
in measures of enforcement output. As NYPD
Assistant Commissioner Ronald J. Wilhelmy
wrote in a November 2013 internal NYPD strategy
document:
[W]e cannot continue to evaluate
personnel on the simple measure of
whether crime is up or down relative to a
prior period. Most importantly, CompStat
has ignored measurement of other core
functions. Chiefly, we fail to measure
what may be our highest priority: public
satisfaction. We also fail to measure
quality of life, integrity, community
relations, administrative efficiency, and
employee satisfaction, to name just a few
other important areas.9
Who Is Flying This Airplane, and What Kind of Training Have They Had?
At the most recent meeting of the Executive
Session, we asked the police chiefs present, “Do
you think your police department is more or less
complicated than a Boeing 737?” (see photograph
of Boeing 737 cockpit). They all concluded fairly
quickly that they considered their departments
more complicated and put forward various
reasons.
First, their departments were made up mostly of
people, whom they regarded as more complex
and difficult to manage than the electrical,
mechanical, hydraulic and software systems that
make up a modern commercial jetliner.
Second, they felt their departments’ missions
were multiple and ambiguous, rather than single
and clear. Picking Denver as a prototypical flight
destination, they wondered aloud, “What’s the
equivalent of Denver for my police department?”
Given a destination, flight paths can be mapped
out in advance and scheduled within a minute,
even across the globe. Unless something strange
or unusual happens along the way, the airline pilot
(and most likely an autopilot) follows the plan. For
police agencies, “strange and unusual” is normal.
Unexpected events happen all the time, often
shifting a department’s priorities and course. As
a routine matter, different constituencies have
different priorities, obliging police executives to
juggle conflicting and sometimes irreconcilable
demands.
Photograph of Boeing 737 Cockpit Source: Photograph by Christiaan van Heijst, www.jpcvanheijst.com, used with permission.
Assuming that for these or other reasons, the
answer is “more complicated,” then we might
want to know how the training and practices
of police executives compare with those of
commercial pilots when it comes to using
information in managing their enterprise.
The pilot of a Boeing 737 has access to at least 50
types of information on a continuing basis. Not
all of them require constant monitoring, as some
of the instruments in the cockpit beep or squeak
or flash when they need attention. At least 10 to
12 types of information are monitored constantly.
What do we expect of pilots? That they know,
through their training, how to combine different
types of information and interpret them in
context, so they can quickly recognize important
conditions of the plane and of the environment
and know how they should respond.
A simple question like, “Am I in danger of
stalling?” (i.e., flying too slowly to retain control
of the aircraft) requires at least seven types of
information to resolve: altitude, air temperature,
windspeed, engine power, flap deployment,
weight and weight distribution, together with
knowledge of the technical parameters that
determine the edge of the flight envelope. Some of
these factors relate to the plane, and some relate
to external conditions. All these indicators must
be combined to identify a potential stall.
Thanks to the availability of simulators,
commercial airline pilots are now trained to
recognize and deal with an amazing array of
possible scenarios, many of which they will never
encounter in real life.10 They learn how various
scenarios would manifest t