As regular visitors will know, I have almost completed the latest Thin Blue Line analysis on police detections. Before uploading the report to the site, I would like to take this opportunity to welcome Inspector Simon Guilfoyle of West Midlands Police to the world of blogging.
Simon has recently launched his own blog which can be viewed here. I will let Simon introduce himself . . . . . .
"I am a West Midlands Police Inspector, in charge of seven neighbourhood policing teams and a proactive team covering the North East area of Wolverhampton, West Midlands. The areas covered are: Wednesfield, Heathtown, Fallings Park, Low Hill, Bushbury, Pendeford and Oxley.
I am a practical, fair and upfront person. I care deeply about the area I am responsible for and am passionate about policing and doing the right thing. I have a big problem with unnecessary bureaucracy, numerical targets and anything else that stops my officers from providing the best service they can.
I joined West Midlands Police in 1995 and with the exception of 15 months at headquarters I have always been a frontline uniformed officer. This is the job I joined to do and I am proud of it. Even my 15 months at HQ was spent implementing new working practices designed to make frontline policing more ‘common sense’ orientated for my colleagues out there.
Most of my time is spent running my sector, but I also provide Duty Inspector cover for the shift on 24/7 duties and am a public order trained officer (PSU Commander) so I get to work a lot of football, demonstrations, large events etc.
I’ve recently figured out how to use Twitter and am doing my best to embrace social media, hence this blog. I consider myself to be a systems thinker and try to apply this philosophy to real world policing. I also like catching the bad guys".
Simon has written an excellent article about police recorded crime and detections, "Crime In Progress: The Impact Of Targets On Police Service Delivery". Its content is so relevant to the report I am compiling that he has kindly given permission for it to be reproduced here. You can read his report here, or continue to scroll down this page.
Over to you Simon . . .
A Very Brief Intro
This article looks at the effect of numerical targets in public services, with a particular focus on the police in the UK. For those of you who do not wish to read over 6,000 words, the article can be summarised as follows:
■Priorities are important.
■Performance measurement (when done properly) is useful.
■Numerical targets are bad.
Simon Guilfoyle,
April 2011
What is ‘Good’ Police Performance?
Good police performance means different things to different people. From the perspective of a victim of crime, good performance might mean a prompt police response and competent investigation. A police manager may judge good performance by counting the number of arrests or detected offences recorded by individual officers. A local politician may consider that a reduction in the overall crime rate indicates good performance. Others may have different interpretations.
So what is ‘good performance’ and how should it be measured? A helpful definition of good performance is:
“A combination of doing the right things (priorities), doing them well (quality) and doing the right amount (quantity)”. (Home Office, 2008a)
If we take these three elements of good performance as a starting point, it becomes apparent how difficult it is to quantify ‘quality’. Numerical data relating to response times, arrest figures and crime rates is comparatively easy to measure, and in the absence of scientifically robust qualitative measures, it is argued that the police service has become heavily dependent on quantitative measures to assess performance.
From the outset it is appropriate to make the distinction between priorities and targets. Aims such as catching criminals and working to prevent crime are clearly appropriate priorities for a police force. ‘Priorities’ is one of the three features of the Home Office’s definition of good performance, and it would be difficult to argue that the police should not strive to prevent crime or prosecute offenders. The rub comes where a priority is fixed to a numerical target.
Before attempting to deconstruct the argument that numerical targets can ever be appropriate or useful, it is necessary to voice strong support for the proper application of performance measurement. As long as it is used proportionately, the data is interpreted intelligently and, above all, numerical targets are never a feature, performance measurement can be a valuable tool for understanding the system and improving service delivery.
Performance measurement can assist managers in recognising areas that require improvement and provides a solid evidence base for identifying weaknesses in the system. This enables action to be taken to make systemic adjustments, redirect resources, or address poor performance. Managers can interpret the data obtained from the performance measurement system to understand how the organisation is performing and monitor improvement or deterioration over time. The transparency achieved through effective performance management also has the benefit of enhancing accountability. This is particularly important in the public services arena.
There are, however, a number of caveats. Bouckaert and van Dooren (2003) argue that “…performance measurement is only useful if it improves policy or management” (2003, p.135), and this is the test that should be applied when determining whether a particular performance measurement system is necessary or appropriate.
Numbers, Numbers, Numbers
Reliance on numerical outputs as a measure of performance can be traced back to Taylor’s Theory of Scientific Management. (Taylor, 1911) This involved measuring relatively simple inputs and outputs, such as the time taken to complete a unit of work, or the number of items produced. The methodology was originally intended for work environments such as factories; units produced per hour would be measured and this would act as a benchmark for all the workers. Taylor’s approach resulted in the standardisation of working practices and, in the right conditions, increased efficiency, but it is limited to those environments where it is easy to measure performance using numerical performance indicators. His methodology does not easily translate into more complex performance environments such as policing, where it is often difficult to measure activity accurately.
As numerical performance measurement systems are incapable of recognising quality, there is a danger that a large number of poor quality units would still give the appearance of good performance, despite resultant product failure, rework, additional cost and ultimately a reduction in efficiency. All of this would occur whilst numerical output targets were being achieved, under the veneer of apparently good performance.
A particular limitation associated with numerical performance data is that it is difficult to establish a causal link between the number of outputs and whether the job gets done well. This is particularly relevant where managers are forced to rely on a proxy measure of performance, for example measuring the number of potholes filled in a day. The intention would be to establish if a highway repair team was performing well, but variables such as the size and depth of potholes, amount of traffic management required at each site, and distance travelled between sites would all affect the number of repairs a team could complete within a given time. Aside from the quality argument, this system would be biased towards a crew who have a large number of small potholes to repair on quiet roads within a compact area.
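To make that bias concrete, here is a toy calculation (all figures invented purely for illustration; nothing below comes from real highways data) showing how a raw repair count flatters the crew with the easier workload:

```python
# Hypothetical crews: minutes per repair and minutes of travel between sites.
# Both crews work flat out for the same eight-hour shift.
crews = {
    "urban crew (small potholes, compact area)": (20, 5),
    "rural crew (large potholes, long drives)": (45, 25),
}

shift_minutes = 8 * 60
for name, (repair_time, travel_time) in crews.items():
    repairs = shift_minutes // (repair_time + travel_time)
    print(f"{name}: {repairs} repairs per shift")

# Output: 19 repairs versus 6. Both crews are equally diligent, but the raw
# count alone suggests the urban crew is roughly three times as 'productive'.
```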
In the public sector, accurate performance measurement is even more problematic. Pollitt (1999) argues that this is because many public service activities are geared towards dealing with variable circumstances that do not lend themselves to producing simple outputs. Caers et al (2006) also argue that unlike the private sector, it can be difficult to measure the outputs generated by public services. Furthermore, it is notoriously difficult to establish a causal link between a specific activity and an eventual outcome.
For example, in a policing context, the output measured may be the number of arrests made, but the intended outcome could be increased feelings of safety within the community. The number of arrests made does not necessarily equate to increased feelings of safety, and may even indicate that officers are being over-zealous, or that crime has increased. In either case, this could actually alarm the community and drive down perceptions of safety. It is therefore proposed that simply measuring the number of arrests is meaningless.
Since the 1990s, targets have proliferated within the public sector. A series of top-down targets introduced in 1997 marked the intensification of the target-driven performance culture within the police service. Over subsequent years the focus has shifted between detection and reduction targets, particular crime types, public satisfaction rates and other measures.
Such targets include:
■Reducing the overall levels of crime and disorder.
■Reducing the levels of specified offence types (e.g. vehicle crime).
■Reducing the fear of crime.
■Increasing the number of detections per officer.
Many of these targets include prescriptive numerical measures (e.g. 30% reduction in vehicle crime over 5 years). Comparative information on how police forces performed against the targets is publicly disseminated, and league tables have been published that attribute success or failure based entirely on numerical data.
Whilst the ever-growing list of targets pertains to much policing activity that one would rightly expect to be prioritised, it does not take into account the myriad external factors that can affect data outputs. For example, the overall crime rate can be affected by economic cycles, unemployment and social issues such as deprivation, substance abuse or poor personal security. None of these factors is within the gift of the police to control directly.
Furthermore, there has never been any obvious science behind why a target would be set at, for example, 30% instead of 32%, 27% or 80%. Some targets appear to have been set purely because they are slightly higher than whatever was achieved during the previous period, based on the unenlightened assumption that the last period’s performance must have been ‘normal’. In some cases, crime detection targets appear embarrassingly low; for example, the target for robbery detections in one police force is 13%. Why? Would the public think that this was impressive? Would the average police officer try harder (or conversely, expend less effort in catching robbers) if the target was 12% or 14% or 47%? Of course not. What is wrong with trying one’s best to catch as many robbers as possible, or in other words, to strive to achieve 100% all of the time?
Naturally, because of various external factors (e.g. lack of forensic evidence or no identification by witnesses), not every robber will be caught, but it is argued that there is absolutely no benefit in setting an arbitrary numerical target in these circumstances. There is even less sense in feeling a great sense of achievement if 13.1% of robberies are detected one month, or a sense of failure if 12.9% are detected during the next.
In both the public and private sectors, a further consideration relevant to performance measurement is the cost involved in setting up and maintaining the system. (Pidd, 2005) Both internal and external performance measurement systems involve additional processes, overheads and staff. This has the effect of building in additional cost to the original activity and risks generating a burdensome and disproportionate audit and inspection culture. Power (1996) observes that such regimes have proliferated to such an extent in recent years that he has coined the term ‘The Audit Explosion.’
Not only does audit and inspection increase costs in financial terms, but there is the very real consequence of human cost, in terms of damage to morale and strained relationships. Clarke (2003) for example, notes the effect on morale, pointing out that the “…high cost / low trust mix…” of a “…competitive, intrusive and interventionist mode of scrutiny creates potentially antagonistic relationships”. (2003, pp.153-154) Argyris (1964) warns that control through performance measurement can be counterproductive, especially in the case of those individuals who are predisposed to work hard, as it can adversely affect motivation and lower productivity. Western (2007), drawing on Weber (1930, 1947) also warns of the damage to morale and the dehumanizing effects of Taylorist methodology.
Control Freakery
It is argued here that the real danger with performance management systems arises when they are used as a means of control, and specifically when numerical targets are introduced into the system. Deming (1986) warned against the use of numerical targets, arguing that they are often a poor substitute for leadership and proper understanding of the system. Amongst his fourteen key principles he urged: “Eliminate management by numbers, numerical goals”. (1986, p.24)
As Simon Caulkin (2004) puts it, “Targets are only useful as long as you do not use them to manage by”. The danger of target-based performance measurement systems is that they do not merely measure performance; they affect it. Nowhere is the application of such regimes more inappropriate than in the public service environment, where the imposition of target-based performance management results in severe consequences, including inefficiency, poor service delivery and a demotivated workforce.
Goodhart’s Law warns that, “Any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes”. (Goodhart, 1975) In other words, the activity being measured will be skewed towards meeting targets, which results in an inaccurate picture of what true performance looks like. If sanctions are likely to result from failure to meet targets, then workers will ‘cheat’ to meet them, and the greater the pressure to meet the target, the greater the risk of gaming or cheating. (Bevan and Hood, 2006; de Bruijn, 2007; Seddon, 2003, 2008)
Not only does inappropriate use of performance measurement create perverse incentives and behaviours, it also diverts effort away from the task in hand, as well as from other equally important activities that do not happen to be the subject of performance targets. This is inherently inefficient and also results in systemic failure, as some areas are ignored whilst others receive disproportionate attention. Furthermore, the inflexible, process-driven approach that results from target-driven performance management restricts innovation, constrains professionalism, and turns the workforce into virtual automatons. (de Bruijn, 2007)
Bevan and Hood (2006) identify three main types of gaming that occur in target-based performance measurement systems:
■‘Ratcheting’ – where next year’s targets are based on the current year’s performance, and there is a perverse incentive for the manager to under-report current performance in order to secure a less demanding target for next year.
■‘Threshold effects’ – where performance across different functions is reported as a whole, thereby disguising departmental failure. In effect, the departments that exceed their targets vire their surplus across to the poorly performing sections. This also has the perverse incentive of encouraging those who exceed targets to allow their performance to deteriorate to the norm.
■‘Output distortions’ – where targets are achieved at the expense of important but unmeasured aspects of performance.
(Adapted from Bevan and Hood, 2006, p.9)
Other consequences of target-based performance measurement are:
■Tunnel vision – where managers select some targets (usually the easiest to achieve or measure) and ignore others.
■Sub-optimisation – where managers operate in a way that serves their own operation but damages the performance of the overall system. (This concept is synonymous with Hardin’s ‘Tragedy of the Commons’ (1968).)
■Myopia – where managers focus on achievable short-term goals at the expense of longer-term objectives.
■Ossification – where a performance indicator has become outdated yet has not been removed or revised, and energy is still directed towards achieving it.
(Adapted from Smith, 1990; Pidd, 2005)
It is argued that the pitfalls of target-based performance management are extensive and the consequences outlined above are practically guaranteed to occur when numerical targets are introduced into a performance measurement system. The result is that efficiency deteriorates, service delivery worsens, and operational effectiveness and morale are irrevocably damaged. In a private sector setting, this is bad; in a public services environment it is catastrophic.
The Folly of Total Reliance on Numerical Data
In the same way that Taylor was able to measure performance by assessing the input-output ratio of assembly line production, police performance systems count outputs such as the number of arrests per officer. The national key performance objectives set out for the police are only capable of recognising such outputs, and therefore neglect the quality aspects of policing that often have the greatest impact on people’s lives.
A serious limitation of using numerical data as a complete measure of good performance lies in how the data is subsequently interpreted. Striking newspaper headlines such as “UK’s worst police forces named” (Daily Mail, 2006a) do nothing but encourage public and media vilification, yet these judgements are based purely on published ‘league tables’ of performance against numerical criteria.
What is also worrying is the apparent inability of some managers to understand natural variation when interpreting statistical data. Seddon (2003) emphasises that any activity measured over a period of time will present varying results, and this is normal. Variation can also be attributed to external factors that are outside of the direct control of those working within the system. Furthermore, the overall capability of a system will naturally determine the parameters within which data should be anticipated, and a degree of variation within these parameters is normal statistical behaviour. (Shewhart, 1939; Wheeler, 2000, 2003)
Setting a rigid point against this natural systemic variation and making it a ‘target’ will therefore have consequences. If the target is set above the upper control limit, it will not be achievable. If it is set between the upper and lower control limits (i.e. within the range of natural variation), then sometimes the target will be met and other times it won’t, regardless of consistent effort. If the target is set below the lower control limit “there is no incentive for improvement; people slow down”. (Seddon, 2003, p.72)
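To illustrate where those control limits come from, the sketch below computes the natural process limits of an ‘XmR’ (individuals and moving range) chart of the kind Shewhart and Wheeler describe. The monthly figures are invented purely for illustration; the only borrowed ingredient is the standard XmR scaling constant of 2.66 applied to the average moving range.

```python
def xmr_limits(values):
    """Return (centre line, lower limit, upper limit) for an XmR chart."""
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    centre = sum(values) / len(values)
    average_mr = sum(moving_ranges) / len(moving_ranges)
    # 2.66 is the standard constant for converting the average moving
    # range into natural process limits on an individuals chart.
    return centre, centre - 2.66 * average_mr, centre + 2.66 * average_mr

# Hypothetical monthly percentages of urgent calls attended within target.
monthly = [87, 91, 84, 89, 93, 85, 88, 90, 82, 94, 86, 89]
centre, lower, upper = xmr_limits(monthly)
print(f"centre {centre:.1f}%, natural variation {lower:.1f}% to {upper:.1f}%")

# Any fixed target placed between the lower and upper limits will be met in
# some months and missed in others by chance alone; only a change to the
# system itself will move the limits.
```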
Targets for police response times provide an appropriate example to illustrate this point:
Although now officially defunct, the 2009 Policing Pledge set a target for police response times. Such time-based limits remain, although they vary from force to force. The Policing Pledge target was as follows:
“Answer 999 calls within 10 seconds, deploying to emergencies immediately giving an estimated time of arrival, getting to you safely, and as quickly as possible. In urban areas, we will aim to get to you within 15 minutes and in rural areas within 20 minutes”. (Home Office, 2009a)
As with other high-level aims attached to targets, at first glance this appears to be an appropriate aspiration for the police to aim for, but let us consider the limitations and ambiguities within this target:
■How is an ‘urban area’ defined?
■Which time target applies if the route traverses rural and urban areas?
■When does the ‘clock’ start? Is it at the point 999 is dialled, when the call is answered, or once all pertinent information has been passed to the operator, and the police unit is actually despatched?
Now, let us consider the factors that could affect response times:
■Availability of resources.
■Driving grade of response driver and vehicle capability.
■Distance from the incident.
■Road conditions.
■Volume of traffic.
■Weather.
■Accuracy of information presented by the caller.
There are no measures of quality within this target. It is entirely possible for a call to be answered quickly and a police unit to happen to be nearby, yet for the incident to be dealt with badly. This would still meet the target. Conversely, a well-managed incident with a fast response time (albeit where the call was answered after 11 seconds) would fail against this rigid numerical measure. Even if it were possible to strategically position permanently available police response vehicles so that the response times were almost guaranteed to be achieved, there will always be a degree of variation in the data; some units would arrive in 10 minutes, others in 8, others in 13.
Targets within emergency call centres pose their own problems. It is not unusual for large LCD screens on the walls of such places to show in real time the volume of calls coming in, the number of calls waiting, the speed with which calls are answered, and so on. These screens indicate whether every facet of call handling is on target or not, often with forbidding red text indicating ‘failure’. This can incentivise the call handlers to rush calls so they can get to the next one, with the result that they fail to obtain the information the control room requires to despatch a unit to the incident effectively. The control room staff then have to ring the caller back to obtain the information they require, which represents avoidable rework and diverts them away from their primary function. This also causes delays, which can ultimately mean a slower or less effective response to the incident.
Meanwhile, the call handlers are able to move on to the next call, under the glare of the monitoring screen which warns them that three more calls are waiting. If any of these go unanswered, this will have a negative effect on the target that relates to ‘dropped calls’. Of course, this situation is not limited to the emergency services – the private sector often finds itself in a similar position, and there have been many examples of the perverse effects of such control through targets. This type of pressure can encourage gaming to meet the target; for example, answering the call quickly but then putting the caller on hold, passing the call to another department, or offering a call back. Some call centres simply place the caller on hold automatically, or the hapless victim has to negotiate their way through labyrinthine menus to reach a human being. In each case, the clock stops, the target is met, and the caller receives a sub-optimal service.
Where there are insufficient staff in the first place, it will be impossible to meet these targets, regardless of effort. This is because, in effect, the capability of the system prevents it from performing to the levels demanded, and setting targets will not change the capacity of the system. Failure to meet the target generates pressure from management, which demotivates the staff who are trying their best, which then affects performance (perhaps one goes sick, reducing the workforce), which results in more failure to meet the targets, and so on. This downward cycle will intensify unless action is taken to improve the system, instead of setting arbitrary targets or browbeating the workers.
In the case of response time targets, there are other factors outside the control of the call handler, control room operator or police driver that can affect whether the target is met. A 999 call will usually be routed to a central call-handling centre, and the operator will create an incident log whilst the caller is on the line, adding information to it as they speak. The log will then be routed to a local divisional control room for staff to despatch a unit. If the incident is particularly complex or there is a lot of information to be gleaned, it may be a few minutes before the log is sent to the local control room. Unfortunately, the clock begins to tick at the point the incident log is created, eating into the time limit permitted for an officer to arrive at the incident. This means that if the initial call handler has a large amount of information to enter onto the log, then by the time a unit is despatched it may be impossible to meet the response time target.
These circumstances mean that thorough information capture at the first point of contact actually reduces the likelihood of meeting the target. In addition, where the 999 operator does not initially grade an incident as urgent, but an operator at the local control room subsequently reassesses its severity and upgrades it, it will often be too late to arrive at the incident within the target time. This perverse situation means that someone who is doing the right thing, and seeking to get a police officer to a caller as quickly as possible, can actually increase the likelihood of that division failing to meet its response time target. It is easy to see how the temptation to leave the incident at its original grading could creep in.
Worse still, it is also possible to downgrade urgent incident logs, meaning that a less stringent response time target applies. Such activity would be wholly unethical, but when dealing with human beings who are under pressure, it is possible to see how the information contained within a particular incident log may be interpreted as slightly less serious than first believed. When this does occur, it is important to understand that this is not because the operators are bad people.
One UK police force recently changed its self-imposed response time target for urgent calls from 10 minutes to 15 minutes. The 10-minute target had been in place for over fifteen years and, on average, was achieved between 80% and 95% of the time. This would indicate that the system was stable and that the 15% degree of variation (caused by the factors outlined above) was normal. Of course, divisional commanders would be held to account if their division was at the lower end of this scale during one month, but when (through natural systemic variation) the subsequent month showed an apparent ‘improvement’, they were able to comfort themselves in the knowledge that performance must have improved.
It is difficult to rationalise the reasoning behind swapping one such target for another (especially when the system itself remains untouched), so one wonders what the benefit will be in relaxing the response time target from 10 to 15 minutes. The only apparent advantage is the anticipated, exceptionally high proportion of incidents where the new, less challenging response time target is achieved. Of course, nothing will have actually changed on the ground, and there is absolutely no perceivable benefit to the public whatsoever.
Certainly, getting to an urgent call as quickly and safely as possible is an appropriate priority for the police, so why have a target at all? One would hope that any police response driver would get to a burglary-in-progress as quickly as they could regardless of whether there is a time-based target or not. It is suggested that if a police force experimented with a different response time target every month for a year, there would not be a great deal of difference between the actual response times. The data would purely indicate what the capabilities of the system were.
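The point can be illustrated with a crude simulation (invented numbers, not force data): if the capability of the system is fixed, moving the target changes only the headline percentage, never the response times themselves.

```python
import random

random.seed(1)
# Assume the system's capability fixes response times at roughly 11 minutes
# on average with a few minutes of spread (a deliberately simple model).
times = [max(1.0, random.gauss(11, 3)) for _ in range(1000)]
mean_time = sum(times) / len(times)

for target in (10, 12, 15, 20):
    achieved = sum(t <= target for t in times) / len(times)
    print(f"{target:>2}-minute target: {achieved:5.1%} 'achieved', "
          f"mean response still {mean_time:.1f} minutes")

# The reported pass rate swings wildly as the target moves, yet the mean
# response time on every line is identical: the system has not changed.
```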
The perversities around setting targets in this environment are truly frightening. It is entirely appropriate to prioritise incidents, but it is argued that there is no additional benefit in attaching a time-based response target to them once prioritised. It should be enough to aim to respond to an urgent incident as quickly and safely as possible.
As a colleague recently pointed out, “The public don’t grade incidents”.
Targets Can Seriously Damage Your Health
The introduction of performance targets in the public sector has had a significant impact, with examples of just about every one of the unintended consequences outlined above. Bevan and Hood (2006) expose examples of tampering with data in respect of ambulance response times, and of delaying treatment at hospital to meet time-based targets. Seddon (2008) notes that, “…there have been many examples of police officers reclassifying offences in order to meet targets”. (2008, pp.124-125) In 2008 the Home Affairs Select Committee reported that the Government’s statutory performance indicators had generated a culture amongst officers of pursuing minor offences in order to meet numerical targets; some would “abandon their professional discretion as to how they might best deal with these incidents”. (Home Office, 2008b, p.13)
Ironically, the experience of a victim of crime who felt that they received a sympathetic and competent response to a distressing incident (e.g. a sudden death in the family) would not register on the performance regime of a police force under the target system. In contrast, the arrest and cautioning of a 13-year-old child for committing an offence of Common Assault by throwing a water bomb at another child would count towards the sanction detection target. It is worth noting that a lengthy and complex investigation leading to an arrest and charge for murder also counts as ‘one point’ in this system.
This type of example is one of the many symptoms of officers ‘hitting the target but missing the point’. Frontline officers were sometimes given individual targets such as ‘make three arrests per month’; as long as the officer achieved this target, there was often little interest in what the arrests were for. This type of target-setting has resulted in otherwise law-abiding citizens being criminalised for extremely low-level or one-off offences. Often these ‘offences’ were little more than playground fights or name-calling between children. Under the target culture, such incidents provide rich opportunities for officers to achieve sanction detections for offences of Harassment, Public Order and Common Assault. Previously, these types of occurrences would have been dealt with by words of advice from a local officer.
Even when officers do not seek to meet targets by criminalising children, they have often had no choice. In 2002, the Government introduced the National Crime Recording Standard (NCRS), which was designed to ensure that crime was recorded ethically and consistently across all police forces. NCRS was supplemented by a prescriptive manual that set out exactly which crime should be recorded in which circumstances (the Home Office Counting Rules, or HOCR), and another set of rules relating to how all incidents must be classified (the National Standard for Incident Recording, or NSIR).
NCRS, HOCR and NSIR compliance is rigorously monitored by internal and external audit and inspection regimes. This has the effect of ensuring that compliance targets are achieved without necessarily adding any value to the service that the public receive. In some extreme examples, police forces have posted officers to a full-time role of retrospectively reviewing incident logs and changing classifications to ensure that they comply with the standard prior to audit. Again, this is not ‘value’ work and does nothing to enhance service delivery.
A further counter-productive effect of ‘ethical crime recording’ is the impression given of the levels of violent crime. Name-calling between 11-year-olds can be recorded as a criminal offence under Section 5 of the Public Order Act 1986. A push by one child on another, even where no injury whatsoever is caused, is still Common Assault. Both these offences contribute to the Government’s ‘Violent Crime’ classification. This results in sensationalist headlines such as ‘Violent crime on the increase’. (Daily Mail, 2006b) It also distorts the true picture of violent crime. (The Times, 2007a) Again, this does not enhance the public’s feelings of safety or decrease the overall fear of crime, which of course is another of the national key performance objectives!
The emphasis on technical compliance with standards, rather than doing the right thing, can lead to huge amounts of effort being focused on activity that has no direct benefit to the public. For example, it became common practice to have a big push for detections at the end of each month (and especially in the last month of the performance year) in order to meet targets. This meant that investigations risked being rushed and minor crimes with ‘easy prisoners’ were prioritised over more pressing matters. Admin staff who usually worked until 4pm would be paid overtime until midnight to ensure that all detections were entered into the system before the end of the performance year.
Localised police performance charts that count things such as the number of intelligence logs submitted also result in some of the consequences discussed earlier. If teams are pitted against each other to produce more intelligence logs, no one wants to be bottom of the league table, so invariably the volume increases. (What gets measured gets managed, after all.) The problem is that the quality of the intelligence logs does not necessarily increase alongside the volume, and enterprising officers find new and innovative ways to avoid being the one in the spotlight for apparent poor performance. Common tricks include:
■Submitting an intelligence log for the most mundane piece of information. (e.g. ‘The kids have been hanging around by the shops again’).
■Breaking one piece of information into multiple pieces to enable the submission of several logs for the same piece of intelligence. (e.g. Log 1: “John Smith is associating with Frank Jones”. Log 2: “John Smith and Frank Jones stole a car, registration number ABC123 three days ago”. Log 3: “Vehicle registration number ABC123 was involved in a burglary two days ago”).
■Duplicating information already captured by another process. (e.g. submitting an intelligence log as well as a stop / search form after conducting a search in the street).
■Two officers working together both submitting an intelligence log about the same incident.
Of course, the result of this sort of activity is that the volume of intelligence logs increases, whilst the intelligence of real value risks being lost in the ‘noise’. The intelligence department will also struggle to process the increased volume of logs and have to wade through excessive amounts of submissions that are of limited or no use. This causes delays, clogs the system, and quality suffers.
‘Gaming’ in how crimes are recorded (or not recorded) is another danger. “There have been many examples of police officers reclassifying offences in order to meet targets – for example, reclassifying shop theft as burglary”. (Seddon, 2008, pp.124-125) Whether a target focuses on crime reduction or crime detection will determine whether officers are encouraged to under-record a particular offence type (where there is little chance of detecting it) or to over-record it (where there is an easy arrest).
In extreme cases, proactively targeting a particular offence type (e.g. prostitution or drug activity) can have the undesirable consequence of increasing recorded crime. This paradox was recognised by the Centre for Crime and Justice Studies in a report that noted,
“It is a moot point whether it made sense for the government to set a target to reduce police recorded robbery in the first place, given that increases might well reflect enhanced police action in this area. Ironically, the government’s target on street crime has risked creating a perverse incentive for police forces to avoid identifying and recording robbery offences”. (Centre for Crime and Justice Studies, 2007, p.33)
There is also the risk that as confidence in the police’s ability to deal with such offences increases, the public are more likely to report incidents that may not have been reported previously. Of course, this gives the impression that the crime rate is increasing, which damages public confidence (a policing target), increases the fear of crime (another target) and prevents crime reduction targets from being met.
Another example of targets dictating how officers on the ground respond to crime is how they are incentivised to make arrests for Section 5 Public Order instead of Drunk and Disorderly, as the former counts towards sanction detection targets. (The Times, 2007b) Of course, this works in reverse if the focus for a local commander is to reduce crime, as officers can be persuaded to deal with an identical disorder-related incident by arresting for Drunk and Disorderly, as this does not count as a crime…
When performance data is publicised, this too can have adverse consequences. Often there is little interpretation of the data, and when it is accompanied by sensationalist headlines it is easy to present a negative impression of any public service. The publication of league tables for schools, hospitals and the police serves little purpose but to galvanise negative sentiment towards those who are apparently ‘failing’. The irony is that the quality of healthcare, schooling or policing does not necessarily correlate with a particular institution’s star rating or position in the league table.
The impact of targets is exacerbated when it is considered that police and CPS targets sometimes conflict with each other; for example, the police are under pressure to increase detections, whilst the CPS are judged on their ability to reduce failed prosecutions. (Home Office, 2008b, p.13) This causes the police to prefer charging a suspect in a borderline case, whilst the CPS are often unwilling to risk proceeding unless there is a very high likelihood of success at court. The only losers in this situation are victims of crime.
Conclusion
It is important to return to the assertion that performance measurement per se is not a bad thing. Indeed it is a valuable tool for enhancing accountability and encouraging continuous improvement. It enables managers to identify failing departments or organisations, and take action. Without it, genuine failings would not be exposed and sub-optimal performance would go unchallenged. A proportionate performance measurement system allows professionalism and innovation to flourish, whilst reminding the workforce that standards must be maintained in order to achieve organisational effectiveness and maximum efficiency. This is consistent with the systemic approach espoused by Deming (1986), Seddon (2003, 2008) and others.
It is also important to remember that this argument is against numerical targets, not priorities. Priorities such as detecting crime for the police, or promoting health for the NHS, are entirely appropriate. These principles are embedded within these organisations and form the bedrock of their raison d’être. Priorities should remain as organisational objectives, but without a numerical target attached, as a target obfuscates the original purpose and diverts activity away from it. The experience of recent years has demonstrated the toxic effect of target-based performance measurement being used as a management tool in the public sector.
It is argued that arbitrary numerical targets should be abandoned, particularly in the public services arena. Targets generate perverse incentives and behaviours, and do not add value to the service that is delivered. It is better to strive for 100% all of the time and concentrate on doing the right thing, instead of worrying about whether current ‘performance’ is a fraction of a percent above or below an arbitrary target that was created with all the science of a ‘finger in the air moment’.
The public have a right to expect an effective and accountable police service, but also one that is flexible enough to respond to a variety of circumstances. The target culture has not delivered this goal. Numerical targets are the most destructive feature of performance measurement systems, and when imposed in a public service setting they guarantee inefficiency, additional cost, lower morale and, ironically, sub-optimal performance. Performance measurement is vital when implemented properly, and priorities are crucial, but numerical targets must be eradicated.
Reference List
Argyris, C. (1964) Integrating the Individual and the Organization. New York: Wiley
Bevan, G. and Hood, C. (2006) ‘What’s measured is what matters: targets and gaming in the English public healthcare system’. Public Administration 84 (3): 517-538
Bouckaert, G. and van Dooren, W. (2003) ‘Performance measurement and management’. In Bovaird, A. and Loffler, E. (eds.) Public Management and Governance. London: Routledge
Caers, R., Du Bois, C., Jegers, M., De Gieter, S., Schepers, C. and Pepermans, R. (2006) ‘Principal-Agent Relationships on the Stewardship-Agency Axis’. Nonprofit Management and Leadership 17 (1): 25-47
Caulkin, S. (2004) ‘Against the Grain’. Society Guardian [Online] http://www.guardian.co.uk/society/2004/mar/31/guardiansocietysupplement.publicservices [Accessed 25th March 2011]
Centre for Crime and Justice Studies (2007) Ten Years of Criminal Justice Under Labour – An Independent Audit. London: Centre for Crime and Justice Studies
Clarke, J. (2003) ‘Scrutiny through inspection and audit’. In Bovaird, A. and Loffler, E. (eds.) Public Management and Governance. London: Routledge. pp. 153-154
Daily Mail (2006a) UK’s Worst Police Forces Named [Online] http://www.dailymail.co.uk/news/article-412255/UKs-worst-police-forces-named.html [Accessed 25th March 2011]
Daily Mail (2006b) Violent Crime on the Increase [Online] http://www.dailymail.co.uk/news/article-375164/Violent-crime-increase.html [Accessed 1st January 2010]
de Bruijn, H. (2007) Managing Performance in the Public Sector. London: Routledge
Deming, W. E. (1986) Out of the Crisis. Cambridge: MIT Press
Goodhart, C. A. E. (1975) ‘Problems of Monetary Management: The UK Experience’. Papers in Monetary Economics (Volume I). Reserve Bank of Australia
Hardin, G. (1968) ‘The Tragedy of the Commons’. Science 162: 1243-1248
Home Office (2008a) Improving Performance – A Practical Guide to Police Performance Management. London: HMSO
Home Office (2008b) Policing in the 21st Century – Volume 1. London: HMSO
Hughes, O. E. (2003) Public Management and Administration. 3rd ed. Basingstoke: Palgrave Macmillan
Pidd, M. (2005) ‘Perversity in Public Service Performance Measurement’. International Journal of Productivity and Performance Management 54 (5/6): 482-493
Pollitt, C. (1999) Integrating Financial Management and Performance Management. Paris: OECD/PUMA
Power, M. (1996) The Audit Explosion. London: White Dove Press
Seddon, J. (2003) Freedom from Command and Control. Buckingham: Vanguard
Seddon, J. (2008) Systems Thinking in the Public Sector. Axminster: Triarchy Press
Shewhart, W. (1939) Statistical Method from the Viewpoint of Quality Control. Washington DC: The Graduate School, US Department of Agriculture
Smith, P. C. (1990) ‘The Use of Performance Indicators in the Public Sector’. Journal of the Royal Statistical Society 153 (1): 53-72
Taylor, F. (1911) The Principles of Scientific Management. New York: Harper & Row
The Times (2007a) Police Chief Says Officers Chasing Targets Distort Picture of Crime [Online] http://www.timesonline.co.uk/tol/news/uk/crime/article2441818.ece [Accessed 1st January 2010]
The Times (2007b) We Are Making Ludicrous Arrests Just to Meet Our Targets [Online] http://www.timesonline.co.uk/tol/news/uk/crime/article1790515.ece [Accessed 1st January 2010]
Van Slyke, D. M. (2007) ‘Agents or stewards: Using theory to understand the government-nonprofit social service contracting relationship’. Journal of Public Administration Research and Theory 17 (2): 157-187
Weber, M. (1930) The Protestant Ethic and the Spirit of Capitalism. London: Allen and Unwin
Weber, M. (1947) The Theory of Social and Economic Organisation. New York: Oxford University Press
Western, S. (2007) Leadership: A Critical Text. London: Sage
Wheeler, D. J. (2000) Understanding Variation: The Key to Managing Chaos. 2nd ed. Knoxville: SPC Press
Wheeler, D. J. (2003) Making Sense of Data. Knoxville: SPC Press
COMMENT
A courageous and superbly written article Simon, congratulations! A timely reminder that the spectre of performance targeting and its effects on police recorded crime and detections has yet to be exorcised from the service.
Look out for our forthcoming report on police detections.