The information that you need
Can a survey give you the information you need?
Be clear about what you want from your survey results. Are you looking for factual information? Are you interested in people's attitudes and opinions? Do you need a combination of both?
What you need to know should be guided by how you plan to use your survey outputs—for formulating policy, for research and development, for publicity, or for some other purpose. This will also help determine the kind of questions your survey asks.
To get the most meaningful survey results, define your information needs so that they are measurable and/or observable.
Avoid the expense of duplicating existing information. Once you know what information you need and how you plan to use it, explore alternative sources of information in case all or part of what you need is already available.
What kind of survey do you need?
Surveys are not "one-size-fits-all." The kind of survey that you choose depends on a combination of the amount, type, accuracy and scale of the information required. Here are some things to consider:
The amount of information you need:
- Short answers to a small number of simple questions
- Long answers with detailed content
- Complex answers to complex subjects
The type of information you need:
- Public opinions and attitudes
- Facts about social or economic events
The level of accuracy you need:
- A quick, but reasonable, approximation
- Highly precise estimates
The scale of responses you need:
- Survey estimates for the total population at a single level of aggregation such as the nation, a province or a municipality
- Detailed information broken down into categories such as location or age
- Information about a specific sector such as business, agriculture or some other distinct population
All of the above can help you determine the kind of survey you need and the survey organization best positioned to provide it.
Who can provide the survey?
No single survey provider will be the best choice for all the possible types of surveys. Know what kind, and what quality, of information you need before you approach a survey provider. And, know what services the provider can deliver.
- If your data needs are simple, or if you want information on public opinions, then a provider that can deliver the basics at low cost may be a good option.
- If you need highly accurate, detailed socio-economic information with in-depth analysis, the services of a large firm or a national or provincial statistics office would be more suitable.
- Some organizations conduct omnibus surveys with regular collection cycles to which a client can add a modest number of questions for a relatively low cost.
- If you are interested in quick pulse-taking on current topics, there are organizations that can act quickly. Some do so by maintaining continuing panels of respondents, so be cautious: these panels are often subject to weaknesses that make them inappropriate for providing detailed data.
Find out if the survey provider possesses the full range of skills, infrastructure and experience to deliver the results and quality you need. The survey provider should have the capacity to carry out all the survey steps from planning and design through to data collection and analysis, at an acceptable cost.
Find out to what extent the survey provider will support you in the analysis of the results, their interpretation and use, and documentation. The survey provider should provide you with sufficient support after the survey to ensure you can use the results to meet your needs.
How involved should you be? Verify the degree of direct participation your team will have in the different steps of the survey process. You may even decide to do the survey yourself and need advice on only certain parts of the process.
Data collection and questionnaire
How will the data be collected?
How the information will be collected is important because the method impacts the response rate as well as the quality of the responses.
The most common collection methods include the following:
- Interviewers ask the survey questions in a telephone interview or in a face-to-face (personal) interview.
- Respondents self-complete the questionnaire without the assistance of an interviewer via traditional mail, email or on-line.
A survey may use one or more of these approaches. For example, a paper questionnaire sent via traditional mail may use a telephone follow-up if responses are not received within a certain timeframe.
Each of these methods has advantages and disadvantages. If steps are not taken to reduce errors, inappropriate use can introduce unintended errors that make survey results less than reliable.
Will interviewers be fully trained for this particular survey?
A survey provider's interviewing staff is the backbone of its data collection effort. The interaction between interviewer and respondent is a crucial element in the success of your survey.
If your questions are unclear to interviewers, then they will likely be unclear to respondents, and interviewers will struggle to help respondents understand what they are being asked. Make sure everyone understands what the questions mean.
Verify that the organization employs experienced, well-trained interviewing staff.
- Check that training manuals are provided to the interviewers and cover all field procedures.
- Ask how long the interviewers have been working for the organization conducting the survey.
- Ask about the types of surveys that the interviewers are experienced in collecting.
- Ensure that interviewers are provided with a good introduction to the survey for their initial approach.
Will the survey provider apply a range of best practices to ensure the highest possible response rate?
Key points to focus on include the following:
- A well designed and planned survey should incorporate procedures for following up with the people who have not responded on the first attempt.
- If interviewers collect the survey information, they should make more than one attempt to contact respondents who are not available on the first try.
- Call-backs should be made on different days of the week and at different times of the day.
- The collection period should be long enough to ensure maximum response rates.
- The survey provider should use industry-standard methods to calculate response rates.
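Conventions for calculating response rates vary between organizations, but the basic idea is the number of completed interviews divided by the number of units presumed eligible. The sketch below, in Python with hypothetical figures, is a simplified illustration of that idea rather than any provider's official definition; many providers follow more detailed industry standards such as the AAPOR definitions.

```python
def basic_response_rate(completed, eligible_non_respondents, unknown_eligibility=0):
    """A simplified response rate: completed interviews divided by all units
    presumed eligible (completed + eligible non-respondents + unknowns).
    Individual providers may follow more detailed industry definitions."""
    denominator = completed + eligible_non_respondents + unknown_eligibility
    return completed / denominator

# Hypothetical collection results
completed = 1_000              # completed interviews
refusals_and_noncontacts = 450
unknown = 150                  # e.g., numbers that never answered, eligibility unknown

rate = basic_response_rate(completed, refusals_and_noncontacts, unknown)
print(f"Response rate: {rate:.1%}")   # 62.5%
```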
Will the information be kept confidential?
Determine what steps the survey provider will take to respect respondent confidentiality. These steps may take the form of an interviewer oath of secrecy, documentation of how personal information will be used, infrastructure that guards against unintended information uses or sharing, or a combination of these and other safeguards.
What to look for in your questionnaire?
Always ask for a copy of the survey questionnaire and take the time to review the exact wording of all the questions that will be asked.
The wording should be fair and unbiased. Look for any evidence of leading or loaded questions, and verify that the questionnaire presents a balanced set of response choices. The readability level is also important. Most people should be able to understand the questionnaire wording easily.
Pay attention to the order of the questions to make sure the sequence doesn't inadvertently bias the results. Seemingly minor points like this can seriously undermine the quality of your survey results.
Use short words and simple, direct sentences so the questions will be understood uniformly by most people. This will also ensure more accurate translation into official and, if applicable, minority languages.
Make the questionnaire as short as possible to meet your information needs. Keep the "need to know" questions. Remove extraneous questions that may distract from your survey's focus.
Thoroughly test the questionnaire in all language versions. Be prepared to edit the questionnaire for issues uncovered during testing.
Interpreting survey results
What are confidence intervals and margins of error?
These indicate the precision of a survey's results. The confidence level should always be reported as part of the margin of error statement. The confidence level is often stated as 19 times in 20 (95% confidence level) or 9 times in 10 (90% confidence level). For a given sample result, the higher the confidence level is, the larger the margin of error. But remember, the confidence level and margin of error only indicate sampling errors.
Example: A survey recently published by XYZ Consultants found that 73% of Canadians regularly watch ice hockey games on television but only 2% watch field hockey.
The survey interviewed a representative sample of 1,200 Canadian adults and has a margin of error of plus or minus 3 percentage points, 19 times out of 20, i.e., 95% of the time.
This means that 73% is our best estimate of the percentage of ice hockey viewers in the whole population, and the true value is expected to lie within 3 percentage points of that number, that is, between 70% and 76%, at a confidence level of 95%.
Strictly speaking, we can infer there are 95 chances in 100 that the sampling procedure, which generated the data, will produce a 95% confidence interval that includes the true value.
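To see how these figures fit together, here is a minimal sketch, in Python, of the standard formula for the margin of sampling error of a proportion under simple random sampling, using the numbers from the XYZ example. A provider's reported margin may differ somewhat because published figures often use the conservative worst case (p = 0.5) and may account for the survey's design.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Margin of sampling error for a proportion, assuming simple random sampling.
    z = 1.96 corresponds to a 95% confidence level (19 times in 20)."""
    return z * math.sqrt(p * (1 - p) / n)

n = 1200   # sample size from the XYZ example
p = 0.73   # estimated proportion of ice hockey viewers

moe = margin_of_error(p, n)
print(f"Margin at p = 0.73:        +/- {100 * moe:.1f} percentage points")       # about 2.5

# Published margins are often the conservative worst case at p = 0.5,
# where the margin is largest for a given sample size.
moe_max = margin_of_error(0.5, n)
print(f"Worst-case margin (p=0.5): +/- {100 * moe_max:.1f} percentage points")   # about 2.8

# The corresponding 95% confidence interval for the 73% estimate:
print(f"95% confidence interval: {100 * (p - moe):.0f}% to {100 * (p + moe):.0f}%")
```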
What is a coefficient of variation?
A coefficient of variation (CV) is simply the standard error expressed as a percentage of the estimate to which it refers. In the XYZ example, with an estimate of 73% and a standard error of 1.5%, the CV is 100*1.5/73, or about 2% of the estimated level of ice hockey viewing.
The CV is useful in the interpretation of relative levels of precision, especially when widely varying quantities are being compared.
Example: In a province there may be an estimated 50,000 people unemployed with a standard error of 1,300 people. At the same time, that province's estimated unemployment rate is 8% with a standard error of 0.2%. It is difficult to compare these numbers directly. However, the CV of the estimated number of unemployed is 2.6%, while the CV of the estimated unemployment rate is 2.5%. (They need not be equal.) This shows that the two estimates have essentially the same level of precision.
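The calculation itself is straightforward. As a minimal sketch in Python, using the figures from the two examples above:

```python
def coefficient_of_variation(estimate, standard_error):
    """The standard error expressed as a percentage of the estimate."""
    return 100 * standard_error / estimate

# The XYZ example: an estimate of 73% with a standard error of 1.5 percentage points
print(f"CV of the ice hockey estimate: {coefficient_of_variation(73, 1.5):.1f}%")        # about 2%

# The unemployment example: two estimates on very different scales
print(f"CV of the number unemployed:   {coefficient_of_variation(50_000, 1_300):.1f}%")  # 2.6%
print(f"CV of the unemployment rate:   {coefficient_of_variation(8, 0.2):.1f}%")         # 2.5%
```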
What was the achieved response rate for the survey?
Response rates are important for a number of reasons:
- Non-respondents may be different from respondents in ways that can affect the survey results. Determine what techniques were applied to maximize response rates.
- A low response rate can be more damaging to data quality than a small sample size by contributing to total survey error.
- An unexpectedly high response rate can be indicative of other problems, as might be the case in quota sampling.
Example: If the survey results are based on an apparent 100% response rate obtained by interviewing the first 1,000 people willing to respond, then the results should be interpreted with caution. Such quota sampling has no information about how many people were approached in total in order to get the 1,000 interviews. There is also no information about how the respondents may be different from those who did not respond.
Can statistics be misused?
Yes. For this reason you should request and use statistics that are produced with professional and scientific rigour, commensurate to their use. You should question what a statistic represents, how it was calculated, and its strengths and limitations. Some say that "some statistical information is better than none at all." This statement is true to the extent that the user is aware of the limitations of the statistics and the risk of using them in their particular context.
Here are a few examples where statistics are to be interpreted or used with caution:
Representing an average
When reporting on salaries in a company, Person A claims that the average salary is over $60,000, Person B claims that the average worker gets $28,000, and Person C claims that "most" employees get only $26,000. Any or all of these statements may be true at the same time. How to make sense of this? First, each person is trying to convey a single numerical representation of the salaries. Person A actually reports the mean, which is the sum of all salaries divided by the number of paid employees, including the CEO who makes $900,000. Person B reports the median, meaning that half of employees make less than $28,000 and half make more. Finally, Person C reports the mode, which is the most frequent or typical salary in the company. The mean, mode and median are clearly defined statistical concepts; the average is not.
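The sketch below, in Python with a hypothetical salary list (not the company in the example), shows how the three measures can diverge when one very large value is present.

```python
import statistics

# Hypothetical salary list: many modest salaries, a few higher ones,
# and one very large CEO salary that pulls the mean upward.
salaries = [26_000] * 5 + [28_000, 30_000, 32_000, 45_000, 900_000]

print(f"Mean:   ${statistics.mean(salaries):,.0f}")    # inflated by the CEO's salary
print(f"Median: ${statistics.median(salaries):,.0f}")  # half earn less, half earn more
print(f"Mode:   ${statistics.mode(salaries):,.0f}")    # the most common salary
```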
Exaggerating the precision
In a quick poll, 57.14% preferred X and 42.86% preferred Y. In fact, this could mean that 4 of the 7 persons interviewed preferred X over Y. If just one more person had preferred Y over X, the results would have been completely reversed. The size of the sample is far too small to support the level of precision expressed by the proportions.
Finding the answer you want
"Seven out of ten dentists prefer Toothpaste X." How many different times did Toothpaste Company X ask groups of 10 dentists about their preferences before finally finding one group with 7 in favour?"
Up and down
Mr. A's income dropped by 40% from 2009 to 2010, but in 2011 it rose by 50%, so he's better off than ever. Is this so? A 40% drop from $100,000 took him down to $60,000. Then an increase of 50% of that brought him back up to $90,000, so he's still down by 10%.
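Spelled out in a few lines of Python: the 50% rise applies to the reduced 2010 base, not to the original income.

```python
income_2009 = 100_000
income_2010 = income_2009 * (1 - 0.40)   # a 40% drop
income_2011 = income_2010 * (1 + 0.50)   # a 50% rise, applied to the smaller 2010 base

print(f"2010 income: ${income_2010:,.0f}")                  # $60,000
print(f"2011 income: ${income_2011:,.0f}")                  # $90,000
print(f"Net change:  {income_2011 / income_2009 - 1:+.0%}") # -10%
```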
Survey samples
From what population (area or group) will the sample be selected?
The population of interest for the survey, or target population, must be carefully identified. The information used to identify members of the target population, called the "sampling frame" by statistical agencies and the "call-list" by public opinion research organizations, should be up-to-date and well documented. If the sampling frame does not cover the desired target population accurately, the survey results may be severely biased. If the survey targets a specific group of the population or a specific geographical area, the results should not be interpreted as representing people outside of that group or area. A specific group might be men, women, Aboriginals, teachers, political party supporters and so on. A specific area might be a province, a region, a city, and so on.
How will people be selected for interviewing?
To avoid sample bias, some important questions must be asked about how people will be selected to participate in the survey. The survey documentation should indicate whether the sample will be chosen using a probability or non-probability sampling method.
If a probability sampling method is used, you should verify the following:
- Respondents will be selected objectively, that is, randomly
- All members of the target population will have a known chance to be selected in the survey
You should also enquire about the general structure of the sampling design, such as stratification, clustering, multi-stage or multi-phase design, as applicable.
If a non-probability approach is used, the way respondents are selected should also be explained.
- Will the selection of people to be interviewed be left up to an interviewer, such as in quota sampling?
- Will respondents select themselves in some way such as by participating in a phone-in poll, responding to a questionnaire in a book or magazine, or by joining an on-going panel?
Note that some surveys use a combination of probability and non-probability sampling. An example of this might be overlaying a quota sampling constraint onto an initially probability-based design.
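As a minimal illustration of what "a known chance of selection" means, the sketch below draws a simple random sample from a hypothetical frame; every unit has the same, known inclusion probability n/N. Real designs are usually more elaborate, involving stratification, clustering or multiple stages.

```python
import random

# Hypothetical sampling frame: identifiers for every unit in the target population.
frame = [f"unit-{i:05d}" for i in range(1, 10_001)]   # N = 10,000 units
n = 500                                               # desired sample size

random.seed(2024)                  # fixed seed so the draw is reproducible and auditable
sample = random.sample(frame, n)   # simple random sampling without replacement

# Under this design every unit on the frame has the same known inclusion probability.
inclusion_probability = n / len(frame)
print(f"Selected {len(sample)} units; inclusion probability = {inclusion_probability:.3f}")
```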
Will the sample selected from a population be representative of that population?
To ensure the sample selected for your survey represents the population, check that key characteristics within the selected sample are similar to those characteristics in the population. It is also important to verify that the characteristics among the actual survey respondents are similar to the characteristics in the selected sample. Key characteristics within a population might include age, sex, education, marital status, or any other available profiling information to help answer questions important to the survey subject.
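One simple check, sketched below with hypothetical figures, is to compare the distribution of a key characteristic among respondents with the known population distribution; large gaps are a warning that the respondents may not represent the population well.

```python
# Hypothetical age-group shares: population benchmarks versus the achieved respondents.
population_share = {"18-34": 0.27, "35-54": 0.33, "55+": 0.40}
respondent_share = {"18-34": 0.18, "35-54": 0.32, "55+": 0.50}

for group in population_share:
    gap = respondent_share[group] - population_share[group]
    print(f"{group:>5}: population {population_share[group]:.0%}, "
          f"respondents {respondent_share[group]:.0%}, gap {gap:+.0%}")
# Noticeable gaps (here, younger adults are under-represented) are a warning sign
# and are often addressed through weighting adjustments.
```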
Survey errors
What errors may affect the survey results?
Errors may occur at any stage during the collection and processing of survey data, whether it is a census or a sample survey. There are two main sources of survey error: Sampling error (errors associated directly with the sample design and estimation methods used) and non-sampling error (a blanket term used to cover all other errors). Non-sampling errors are usually sub-divided as follows:
- Coverage errors, which are mainly associated with the sampling frame, such as missing units, inclusion of units not in the population of interest, and duplication.
- Response errors, which are caused by problems related to the way questions were phrased, the order in which the questions were asked, or respondents' reporting errors (also referred to as measurement error if possible errors made by the interviewer are included in this category).
- Non-response errors, which are due to respondents either not providing information or providing incorrect information. Non-response increases the likelihood of bias in the survey estimates. It also reduces the effective sample size, thereby increasing the observed sampling error. However, the risk of bias when non-response rates are high is generally more dangerous than the reduction in sample size per se.
- Data capture errors, which are due to coding or data entry problems.
- Edit and imputation ("E&I") errors, which can be introduced during attempts to find and correct all the other non-sampling errors.
All of these sources may contribute to either, or both, of the two types of survey error. These are bias, or systematic error, and variance, or random error.
Sampling error is not an error in the sense of a mistake having been made in conducting the survey. Rather it indicates the degree of uncertainty about the 'true' value based on information obtained from the number of people that were surveyed.
It is reasonably straightforward for knowledgeable, experienced survey-taking organizations to control sampling error through the use of suitable sampling methods and to estimate its impact using information from the sample design and the achieved sample. Any statement about sampling errors, namely variance, standard error, margin of sampling error or coefficient of variation, can only be made if the survey data come from a probability sample.
The non-sampling errors, especially potential biases, are the most difficult to detect, to control and to measure, and require careful planning, training and testing.
How will the accuracy of the survey results be measured and reported?
The combined effect of bias and variance is the total survey error, which, if available, is the best measure of the overall accuracy of the survey results. For most surveys, however, only an estimate of sampling error is available. The most commonly presented measure is usually referred to as the margin of error: It should properly always be called the margin of sampling error because it does not incorporate any information about non-sampling errors. The same comment applies to confidence intervals as they are computed directly from the margin of sampling error.
For that reason, confidence intervals and margins of sampling error alone are not enough to judge the quality of survey results. If the quality of statistical estimates is important to you in the use of your survey results, then you should seek a survey provider that is able to calculate and report all aspects of survey reliability.
What influences the margin of sampling error?
The margin of sampling error is influenced by several factors:
- The homogeneity of the population: the more people differ from one another in relation to the variables measured, the larger the sample must be.
- The level or prevalence of the variables being measured: the rarer a characteristic is in the population, the harder it is to measure accurately.
- The efficiency of the sample design being used.
- Sample size, which is based on a sample design that will yield the most accurate estimates possible at a given cost.
- Response rate, which determines the achieved sample size.
How big should the sample be?
The sample size directly affects the margin of sampling error that is reported with the survey results.
The margin of sampling error provides a legitimate estimate of the error due to sampling only if a probability sampling method was used to select the sample. Generally speaking, the more people that are interviewed, the smaller the sampling error becomes.
Note: Don't put all your faith in the survey results simply because the margin of sampling error is relatively small. This is only one possible source of error in a survey.
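As a rough sketch of the relationship, assuming simple random sampling and the conservative p = 0.5: the margin of sampling error shrinks with the square root of the sample size, so quadrupling the sample only halves the margin.

```python
import math

def worst_case_margin(n, z=1.96, p=0.5):
    """Largest margin of sampling error for a proportion at a 95% confidence level,
    assuming simple random sampling from a large population."""
    return z * math.sqrt(p * (1 - p) / n)

for n in (300, 600, 1_200, 2_400, 4_800):
    print(f"n = {n:>5}: margin of error = +/- {100 * worst_case_margin(n):.1f} points")
# n =   300: +/- 5.7 points
# n =   600: +/- 4.0 points
# n =  1200: +/- 2.8 points
# n =  2400: +/- 2.0 points
# n =  4800: +/- 1.4 points
```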
Will the margin of sampling error be the same for all survey estimates?
The margin of sampling error depends on the size of the sample surveyed. Therefore, estimates for sub-groups of the survey population, for which the sample size is smaller by definition, will have a larger margin of sampling error than the overall estimate for the total survey population.
The margin of sampling error also depends on the behaviour of the variable being measured. So even under the same sample design and with the same sample size, the margin of sampling error may be larger for one variable than for another simply because its values are more widely dispersed in the population being surveyed.
Does a small margin of sampling error necessarily mean that the survey results are reliable?
If the survey estimate is relatively small, then even a margin of sampling error of only a few percentage points can be large relative to the estimate itself, and the estimate should be interpreted with caution. Base your interpretation on how the information will be used and the consequences that may result from making an incorrect decision based on that result.
What is the typical response rate for a survey?
Response rates vary widely depending on a number of factors. Virtually all surveys suffer from some non-response, and non-respondents may be different from respondents in ways that affect the survey results. A low response rate increases the potential impact of bias and can be much more damaging than a small sample with high response rate.
Previous experience and choice of data collection method should provide an estimate of likely response rates. Some of the techniques that can help to maximize response rates include the following:
Providing advance notification: An advance letter explains the background of the survey and encourages participation.
Including effective introductions in your material: This approach can increase the credibility and perceived importance of the survey. In your introduction, it's important to do the following:
- Identify the name of the organization conducting the survey
- Guarantee confidentiality to all your respondents
- Be honest about the length of the interview
- Explain the uses and the benefits of the survey
Ensuring your interviewers are well trained: Preparing your interviewers before they meet with respondents is a must. Before sending them out into the field, ensure that they are:
- Able to explain "random selection" (an often asked question)
- Professional in their approach
- Able to read out questions accurately
- Prepared to probe and clarify responses
If quota sampling will be used and the respondents will be, for example, the first 1,000 willing to respond, then the results of the survey should be interpreted with caution. To follow this example, there is no information about how many people were approached in total in order to get the 1,000 interviews. There is also no information about how the respondents may be different from those who did not respond.
Usefulness of the survey results
If the organization conducting the survey follows proper procedures, will the survey results be a true reflection of a population's characteristics, attitudes or opinions?
Yes, usually. However, remember that according to the laws of chance, the survey results may differ at times from the population's actual characteristics, attitudes or opinions simply because of chance variation in the selected sample of people, that is, because of sampling error.
Will the survey use external information sources to improve or to validate its results?
Comparisons to external sources of information, such as other surveys or administrative data, can be used to correct for biases, or simply to verify that the survey results make sense. The survey does not exist in isolation.
For example, many surveys of human populations calibrate their results to Census data totals or distributions or to other widely-accepted data sources.
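As a simplified illustration of what such calibration can look like, the sketch below applies a bare-bones post-stratification adjustment with hypothetical figures; it is not any provider's actual procedure. Survey weights are scaled so that the weighted counts in each group reproduce the census totals.

```python
# Hypothetical census totals and weighted survey counts by age group.
census_totals = {"18-34": 2_700_000, "35-54": 3_300_000, "55+": 4_000_000}
survey_weighted_counts = {"18-34": 2_100_000, "35-54": 3_200_000, "55+": 4_400_000}

# Post-stratification: one adjustment factor per group so the weighted survey
# counts reproduce the census totals exactly.
adjustment = {g: census_totals[g] / survey_weighted_counts[g] for g in census_totals}

for group, factor in adjustment.items():
    print(f"{group:>5}: multiply the weights of these respondents by {factor:.3f}")
# Each respondent's design weight in a group is multiplied by that group's factor,
# pulling the survey's age distribution into line with the census benchmark.
```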
Should survey results be believed?
A healthy degree of skepticism about survey results is desirable. If the survey methods and results can withstand skeptical scrutiny, then the properly conducted survey can be the best objective means for gathering information about a population.
What outputs (deliverables) can you expect from your survey provider?
At a minimum you should receive a report describing the purpose of the survey and its key findings. The report should also include a brief description of the methods used and a full set of tabular estimates.
You may have to negotiate with the provider to present the results to your organization, live and on site.
You may also have to arrange with the provider to prepare more elaborate analyses for you or to give you advice on what analytical methods to apply to the data. This of course depends on whether you have arranged to receive the complete data file.