Performance of People II
Numerical Rating
Problems abound. Primarily there are four: definition, measurement, assumptions, and variation. The first three alone render any attempt to numerically rate or rank performance (or merit) impossible. Some would argue that it is merely done badly and could be improved. The fact is, it cannot be done at all.
Currently there is a fad in the land holding that good managers ought to get rid of the bottom 10% of performers every year. While this may be of questionable ethical merit, it is thought to be a necessity for long-term success in the marketplace. This is nothing less than a complete abdication of management’s responsibility to make the best use of the workforce they have and, worse, it is a practice devoid of the slightest bit of scientific or logical merit.
Perhaps worst of all, such a practice is ruinous of employee morale. If it is true that an organization’s most important resource is the people who work there, then this practice strikes at the heart of that resource. To advocate for the annual evisceration of the workforce and then expect anything like loyalty or top performance is ignorance at its worst.
Why can’t performance be numerically rated and ranked? It can’t be defined operationally, it can’t be measured with any degree of precision, it can’t be separated from other effects, and it is destined to vary over time in any case. Any one of these factors presents a significant (if not insurmountable) problem by itself. Combined, they create an impossible barrier.
Definitional Problems
Performance is ephemeral. It is an idea, a concept. In order to measure it, there must be a system of translation. That is the purpose of an operational definition. It allows a concept to be classified as a quantifiable entity, and it describes the operation by which this translation will take place.
It is likely that a group of ‘experts’ in management from a business school could reach an agreement on a listing of desirable attributes of a manager. The list might include, for example, communication skills, ability to give direction, ability to delegate, and so on.
A typical performance-rating scheme will list a set of attributes such as these and then ask the rater to classify the ratee according to them. The wording might be something like, “… can the person delegate authority well?” There will usually be some attempt at categorization such as, “… Always, Almost Always, Sometimes, Almost Never.” Each of these categories is assigned a numerical value and some kind of composite number is tallied.
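The tallying step described above can be sketched in a few lines. This is only an illustration: the two people, their ratings, and both numeric codings are invented for the example and are not taken from any real rating scheme. Both codings preserve the order of the four labels, yet they disagree about who ranks first.

```python
# Hypothetical ratings of two people on three attributes (illustrative only).
ratings = {
    "Person A": ["Always", "Always", "Almost Never"],
    "Person B": ["Almost Always", "Almost Always", "Sometimes"],
}

# Two assumed numeric codings of the same four labels. Both keep the labels
# in the same order; they differ only in how far apart the labels are placed.
linear_scale = {"Always": 4, "Almost Always": 3, "Sometimes": 2, "Almost Never": 1}
penalizing_scale = {"Always": 5, "Almost Always": 4, "Sometimes": 3, "Almost Never": 0}


def composite(labels, scale):
    """Tally a composite number by summing the numeric value of each label."""
    return sum(scale[label] for label in labels)


def ranking(scale):
    """Return the names ordered from highest composite to lowest."""
    return sorted(ratings, key=lambda name: composite(ratings[name], scale), reverse=True)


for scale_name, scale in [("linear", linear_scale), ("penalizing", penalizing_scale)]:
    totals = {name: composite(labels, scale) for name, labels in ratings.items()}
    print(scale_name, totals, ranking(scale))
```

Under the linear coding Person A totals 9 against 8 for Person B; under the penalizing coding the totals are 10 and 11, and the order reverses. Nothing about either person changed, only the numbers attached to the words.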
Despite the attempt to appear rigorous, such ‘rating’ schemes are actually based solely on subjective judgement. There is nothing wrong with subjective judgement per se. The difficulty comes when we dress it up in the trappings of science and statistics.
An operational definition will require some verifiable measurement criteria. For example, ‘ability to delegate’ might be operationally defined as the number of delegating opportunities that resulted in actual delegation. Of course, this is difficult to ascertain.
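As a rough sketch of what such a definition would demand, the count can be computed from a log of delegating opportunities. The log entries and the names `Opportunity` and `delegation_count` are hypothetical, made up for this illustration; deciding what belongs in the log in the first place is exactly the part that is difficult to ascertain.

```python
from dataclasses import dataclass


@dataclass
class Opportunity:
    """One recorded delegating opportunity (hypothetical record format)."""
    description: str
    delegated: bool


def delegation_count(log):
    """Count the opportunities that resulted in actual delegation."""
    return sum(1 for entry in log if entry.delegated)


# Invented log for illustration only.
log = [
    Opportunity("prepare weekly status report", True),
    Opportunity("schedule supplier visit", False),
    Opportunity("review test results", True),
]

print(delegation_count(log), "of", len(log), "opportunities were delegated")
```

Given the same log, any two raters arrive at the same count. That is what makes the definition operational, and it is what the “Always, Almost Always” approach lacks.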
But any attempt to create a rating scheme that is comparable for a given person from time to time, and that allows comparison from individual to individual, will require exactly this kind of detail. Otherwise, under the subjective judgement approach, the rating depends on who is doing it and is not comparable from time to time or from person to person.
Since this rating is to be used to dispense raises and promotions and to identify ‘low performers’ (for the axe), this comparability is critical.
A further problem is that any rating scheme will have multiple attributes. Not only must a consistent list of these attributes be used, but each one must also be assigned a weight, since some are more important than others. Declining to weight them is not a possibility, since that merely assigns them all equal weights.
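The arithmetic behind this point is short enough to sketch. The scores and weights below are invented for illustration, and `weighted_total` is a hypothetical helper, not part of any actual scheme. The sketch shows that an ‘unweighted’ sum is simply a weighted sum with every weight equal to 1, and that the choice of weights can decide who comes out ahead.

```python
# Hypothetical attribute scores for two people (illustrative only).
scores = {
    "Person A": {"communication": 4, "direction": 2, "delegation": 3},
    "Person B": {"communication": 2, "direction": 4, "delegation": 3},
}


def weighted_total(person_scores, weights):
    """Sum each attribute score multiplied by the weight given to that attribute."""
    return sum(weights[attr] * value for attr, value in person_scores.items())


# Three assumed weightings of the same attributes.
equal = {"communication": 1, "direction": 1, "delegation": 1}
favor_communication = {"communication": 2, "direction": 1, "delegation": 1}
favor_direction = {"communication": 1, "direction": 2, "delegation": 1}

for label, weights in [
    ("equal", equal),
    ("favor communication", favor_communication),
    ("favor direction", favor_direction),
]:
    totals = {name: weighted_total(s, weights) for name, s in scores.items()}
    print(label, totals)
```

With equal weights the two people tie at 9. Favoring communication puts Person A ahead, 13 to 11; favoring direction puts Person B ahead by the same margin. The ranking is produced by the weights as much as by the people.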
Reader Comments
When I first heard Dr. Deming talking about this, I didn’t understand the problem. I recall distinctly a lunch session at Nashua Country Club with Dr. Deming and Dr. Lloyd S. Nelson where Deming was heatedly pointing out how Ford’s performance ‘management’ system was holding them back. He said rating and ranking “… can’t be done. Can’t be done at all.” This would have been around 1981.
Frankly, I thought he was off base. As time went on, and my understanding improved, I came to realize he was exactly right. There is no valid way to numerically categorize performance in order to rank or differentiate among people. It cannot be done.
John D.