Employee surveys—what HR people need to know

Employers often use surveys to find out what their employees think about issues such as management policies (for example, flexible working) and working conditions. A survey can take the form of a questionnaire administered to the entire workforce (a census), to one corporate division, or sometimes to a carefully chosen subset of the employee population (a random sample survey).

These days the questionnaire is usually sent to every employee in an email containing a link to a central server that collects the responses. The responses are analysed and reported to management. Sometimes the data analysis and reporting are outsourced to a specialist company, so that management can reassure the workforce that no-one in management has seen the individual questionnaire responses, only the aggregate averages, proportions and crosstabulations.

However, there are a few things that HR people need to know when looking at such reports. 

Sample size

Where a sample is used instead of a census, the results carry some uncertainty simply because not everyone was asked. This is called sampling error. If the sample has been drawn by random sampling methods (for example, if every employee has an equal chance of being in the sample), there is a body of mathematical theory that can tell you how far off your results are likely to be for a given sample size. This enables the sample size to be chosen so as to keep the sampling error to an acceptable level. For example, management may wish to know what proportion of employees are satisfied with the standard of their accommodation. The answer will be a percentage, and management may want it to a precision of, say, ±2 percentage points.
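As a rough illustration of how that theory turns a target precision into a sample size, here is a minimal Python sketch. It assumes simple random sampling, a 95% confidence level (z = 1.96) and the worst-case proportion of 50%, and it applies the standard finite population correction. The function name and the workforce size of 5,000 are hypothetical, chosen only for the example.

```python
import math

def sample_size_for_proportion(margin, population=None, p=0.5, z=1.96):
    """Approximate simple-random-sample size needed to estimate a proportion.

    margin     -- desired margin of error as a fraction (0.02 means +/-2 points)
    population -- workforce size, used for the finite population correction;
                  None means the population is treated as very large
    p          -- assumed true proportion (0.5 is the most cautious choice)
    z          -- normal quantile for the confidence level (1.96 for about 95%)
    """
    # Sample size for a very large population.
    n0 = z ** 2 * p * (1 - p) / margin ** 2
    if population is not None:
        # Finite population correction: a small workforce needs fewer responses.
        n0 = n0 / (1 + (n0 - 1) / population)
    # Round before taking the ceiling so floating-point noise cannot add one.
    return math.ceil(round(n0, 6))

# Hypothetical example: +/-2 percentage points at roughly 95% confidence.
print(sample_size_for_proportion(0.02))                   # very large workforce
print(sample_size_for_proportion(0.02, population=5000))  # workforce of 5,000
```

Under these assumptions the requirement is 2,401 completed responses for a very large workforce and 1,623 for a workforce of 5,000. Note that these are numbers of responses received, not questionnaires sent out.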

Survey statisticians also worry about bias, which is where the survey results are systematically too high or too low and so don’t paint a true picture of what’s going on.

The pay grade gradient

Employee surveys usually garner more enthusiastic, positive responses from people in senior positions than from those in junior positions in the organisation. Failure to take this into account can make the reported questionnaire results misleading.



Nonresponse

Often there will be some employees who have not answered an employee survey. This is called nonresponse. Sometimes they don’t answer the survey at all (unit nonresponse). Sometimes they answer most of the questions but skip some particular ones (item nonresponse). And sometimes, particularly with a long questionnaire, they start the survey but give up after a certain point (partial nonresponse). When there is nonresponse, a census (which is a 100% sample) effectively becomes a sample survey, and the sample is not a representative one. Usually, unit and item nonresponse are more prevalent amongst junior grades than amongst senior grades. If you see a lot of partial nonresponse, it suggests that your survey is too long.

Effects of Nonresponse

Nonresponse has two main effects: greater uncertainty in the results and biased results. The statistical precision of a survey is determined by the number of responses actually received, not by the number of questionnaires sent out. Fewer responses give results that have more uncertainty (wider confidence intervals, i.e. a larger margin of error).
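To give a feel for the loss of precision, the following Python sketch computes the approximate 95% margin of error for a reported percentage. It assumes simple random sampling and the worst-case proportion of 50%, and it ignores the finite population correction. The function name and the response counts are hypothetical.

```python
import math

def margin_of_error(n_responses, p=0.5, z=1.96):
    """Approximate margin of error for a proportion estimated from n_responses."""
    return z * math.sqrt(p * (1 - p) / n_responses)

# Hypothetical example: 1,000 questionnaires sent out, with falling response.
for n in (1000, 700, 400):
    print(n, "responses: +/-", round(100 * margin_of_error(n), 1), "points")
```

Because the margin of error shrinks only with the square root of the number of responses, losing more than half of the responses (1,000 down to 400) widens it from roughly ±3 to roughly ±5 percentage points under these assumptions.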

If the nonresponders were effectively a random subset of the employees, this would be a minor matter. The results would simply be a bit less precise, but they wouldn’t tend to be biased, i.e. they wouldn’t be systematically too high or too low.

However, if the nonresponse is more common amongst junior employees than amongst senior ones, and if junior staff are also less satisfied with the working conditions (or whatever the question is about) then the reported results will systematically tend to be too optimistic. This is called bias and it’s a serious threat to the value of the survey.

Mitigation: Post-stratification

One of the first things to do when the survey results are in is to look at the counts of the number of people of each sex, age category and pay or seniority grade, and compare these with the known numbers of such people from HR records. This tells you who is over-represented and who is under-represented by differential nonresponse.

The survey counts can then be reweighted to line up with the actual headcounts. In effect this assumes that the missing people would, on average, have answered in the same way as respondents of the same grade, sex and age category, so it mitigates the pay grade gradient and any other imbalances in the key variables. This is known as post-stratification.
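The reweighting can be sketched in a few lines of Python. This is a minimal illustration rather than a production method: it stratifies by grade only, all the figures are invented, and the function names are hypothetical. Each respondent gets a weight equal to their stratum's share of the workforce divided by its share of the responses. In practice a stratum could be a combination such as (grade, sex, age band).

```python
from collections import Counter

def poststratification_weights(respondent_strata, population_counts):
    """Return one weight per stratum: population share / respondent share."""
    responses = Counter(respondent_strata)
    n = len(respondent_strata)
    total = sum(population_counts.values())
    weights = {}
    for stratum, headcount in population_counts.items():
        if responses[stratum] == 0:
            # No respondents to stand in for this group; it would have to be
            # merged with a similar stratum before weighting.
            raise ValueError(f"no respondents in stratum {stratum!r}")
        weights[stratum] = (headcount / total) / (responses[stratum] / n)
    return weights

def weighted_mean(values, strata, weights):
    """Weighted average of the answers, using each respondent's stratum weight."""
    total_weight = sum(weights[s] for s in strata)
    return sum(v * weights[s] for v, s in zip(values, strata)) / total_weight

# Invented data: HR records show 800 junior and 200 senior staff, but only
# 40 of each responded. 1 = satisfied, 0 = not satisfied.
population = {"junior": 800, "senior": 200}
strata = ["junior"] * 40 + ["senior"] * 40
answers = [1] * 20 + [0] * 20 + [1] * 32 + [0] * 8

weights = poststratification_weights(strata, population)
raw = sum(answers) / len(answers)
adjusted = weighted_mean(answers, strata, weights)
print(f"raw: {raw:.2f}, post-stratified: {adjusted:.2f}")
```

With these invented numbers the raw satisfaction rate is 65%, but after weighting it falls to 56%, because the under-represented junior staff (50% satisfied) are restored to their true 80% share of the workforce.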

Comparing Lines of Business

Often senior management is interested in how the result for a particular question differs between lines of business. If, as is usually the case, there is a pay grade gradient in the responses, it is not possible to compare directly two lines of business that have different grade structures. (In one large organisation I am aware of, Group Legal was often the best performing division on attitude surveys. They had a majority of well-paid lawyers and few junior staff.)

Mitigation: Standardised grade distributions

When comparing a specific measure (say, satisfaction with working conditions) between two different divisions of the same organisation you are not comparing like with like unless they have the same distribution of key variables such as pay grades or age category or sex.

For example, if the engineering division has a large number of junior field force staff and the marketing department consists largely of older, highly paid creatives, you are virtually bound to see a substantial difference in attitudes, and that difference is highly misleading.
 

This problem is very familiar to medical statisticians, who noticed that although the overall death rates in Bournemouth were higher than those in Derby, in every single age category those in Bournemouth were lower than those in Derby. (This surprising result is an example of Simpson’s paradox.) The solution in the medical case is to work out age-specific death rates for each town and apply them to a standard age distribution (usually the national population).

The solution to the problem of comparing two divisions in the same company with respect to some sampled measure is the same. Compute the grade-specific average responses for both divisions and apply them both to a hypothetical organisation with a standard distribution of pay grades, such as the pay grade distribution of the entire company. This puts both sets of results on the same footing and removes the effect of the differing grade structures from the comparison.
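The calculation can be sketched as follows in Python. The two divisions, their grade-specific satisfaction rates and all headcounts are invented for illustration, and the function name is hypothetical. The figures are chosen so that division B scores higher than division A in every grade, yet looks worse on the crude overall figure because most of its staff are junior.

```python
def standardised_score(grade_means, standard_grade_counts):
    """Apply grade-specific average scores to a standard grade distribution.

    Every grade in the standard distribution must have an average in
    grade_means; a missing grade raises KeyError.
    """
    total = sum(standard_grade_counts.values())
    return sum(grade_means[grade] * count
               for grade, count in standard_grade_counts.items()) / total

# Invented grade-specific satisfaction rates and headcounts.
means_a = {"junior": 0.50, "senior": 0.80}
means_b = {"junior": 0.55, "senior": 0.85}
headcount_a = {"junior": 10, "senior": 90}   # mostly senior staff
headcount_b = {"junior": 90, "senior": 10}   # mostly junior staff

# Standard distribution: here, the two divisions combined.
company = {"junior": 100, "senior": 100}

# The crude score is each division weighted by its own grade structure.
crude_a = standardised_score(means_a, headcount_a)
crude_b = standardised_score(means_b, headcount_b)
std_a = standardised_score(means_a, company)
std_b = standardised_score(means_b, company)

print(f"crude:        A {crude_a:.2f}, B {crude_b:.2f}")
print(f"standardised: A {std_a:.2f}, B {std_b:.2f}")
```

On the crude figures division A appears well ahead (77% against 58%), but once both are applied to the same grade distribution division B comes out ahead (70% against 65%), which matches the grade-by-grade picture.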

Goodhart’s Law

“When a measure becomes a target, it ceases to be a good measure.”

Attributed to Charles Goodhart (British economist), 1975. This popular wording is a later paraphrase of his original statement.


It can happen that senior management tracks some measure that is important to them by means of regular surveys, then penalises the managers whose teams yield a poor score. When this happens the measure becomes a target, and managers will attempt to influence it, not necessarily by the means that senior management had in mind.

A well-known story, possibly apocryphal, illustrates the point. When the British ruled India, cobras were a deadly menace and people were encouraged to kill them. The British offered a bounty for every dead cobra brought to them. However, some people started to breed cobras so that they could kill them and collect the bounty. When the British realised what was going on they cancelled the bounty, and the breeders released their remaining cobras into the wild, thus increasing the number of cobra incidents.

The moral is that poorly chosen targets can create perverse incentives.