Data quality

2006 Census of Agriculture — concepts, methodology and data quality

Using the following information will ensure a clear understanding of the basic concepts that define the data provided in this product, and of the underlying census methodology and key aspects of the data quality. It will give you a better understanding of how the data can be effectively used and analysed according to their strengths and limitations. The information may be particularly important when making comparisons with data from other surveys or sources of information, and in drawing conclusions regarding change over time.

Data sources and methodology
Concepts and variables measured
Data accuracy
Comparability of data and related sources
Other quality indicators and assessments

Data sources and methodology

The Census of Agriculture collects and disseminates a wide range of data on the agriculture industry such as number and type of farms, farm operator characteristics, business operating arrangements, land management practices, crop areas, numbers of livestock and poultry, farm capital, operating expenses and receipts, and farm machinery and equipment. These data provide a comprehensive picture of the agriculture industry across Canada every five years at the national, provincial and territorial levels as well as at lower levels of geography.

General methodology

Target population

The target population is all census farms in Canada. In 2006, a census farm was defined as an agricultural operation that produces at least one of the following products intended for sale: crops (hay, field crops, tree fruits or nuts, berries or grapes, vegetables, seed); livestock (cattle, pigs, sheep, horses, game animals, other livestock); poultry (hens, chickens, turkeys, chicks, game birds, other poultry); animal products (milk or cream, eggs, wool, furs, meat); or other agricultural products (Christmas trees, greenhouse or nursery products, mushrooms, sod, honey, maple syrup products). However, the definition of a census farm has changed over time; for a summary of these changes since 1921, please refer to Census farm.

The Census of Agriculture also collects and disseminates data pertaining to a related sub-population: farm operators. In 2006, "farm operators" were defined as those persons responsible for the management decisions made in the operation of a census farm or agricultural operation. Up to three farm operators could be reported per farm. Prior to the 1991 Census of Agriculture, the farm operator referred to only one person responsible for the day-to-day decisions made in running an agricultural operation.

Collection

In 2006, a Census of Agriculture questionnaire was dropped off along with a Census of Population questionnaire when someone in the household was a farm operator. Once completed, the questionnaire was mailed back for editing. If it was determined that a questionnaire had not been received, or if data were missing, a follow-up was conducted by telephone or personal visit. For a more detailed description of the collection process, please refer to Data collection.

Data processing

Once the questionnaires were received at head office, they were registered and electronically scanned, and the data were automatically captured from the images using intelligent character recognition (ICR) technology. The captured data were then subjected to many rigorous quality control and processing edits to identify and resolve problems related to inaccurate, missing or inconsistent data. Subject-matter analysts also reviewed the aggregated data and individual values so that any remaining errors due to coverage, misreporting, data capture or other reasons were identified and corrected. For a more thorough explanation, please refer to Data processing.

Reference period

The Census of Agriculture has been conducted concurrently with the Census of Population every five years since 1951. The 2006 Census of Agriculture was conducted on May 16, 2006.

Revisions

Data from the Census of Agriculture are not subject to revision.

Adjustments

Data from the Census of Agriculture are not subject to seasonal adjustments or benchmarking to other data sources.

Concepts and variables measured

For a full description of census concepts, derived variables and geographic levels, please refer to Census terms and Geographic definitions.

Data accuracy

An integral part of each Census of Agriculture is the implementation of new or enhanced methods, procedures and technologies that improve not only the collection, but also the processing, validation and dissemination of the data. New methods, procedures and technologies adopted for the 2006 Census of Agriculture included mailing questionnaires to the 6.5% of the farm population with a recognized mailing address, the option of completing the questionnaire online, and two follow-up surveys: the Missing Farms Follow-up Survey and the All-purpose Farms Follow-up Survey. In addition, to help ensure that data from the 2006 Census of Agriculture would be of consistently high quality, improved quality assurance and control procedures were incorporated into each of the collection and data processing stages.

Primarily as a result of adopting these methods, procedures and technologies, the 2006 Census of Agriculture data are of very good quality, with data for the major commodities generally being of the highest quality. A response rate of 95.7% and an estimated 3.4% undercoverage rate of farms indicate the overall success of the 2006 Census of Agriculture. Note that more than half of the estimated undercoverage was attributable to farms with sales below $10,000 in 2005. As a result, the undercoverage rate for major commodities is below 2%.

With projects as large and complex as the Censuses of Agriculture and Population, the estimates produced from them are inevitably subject to a certain degree of error. Knowing the types of errors that can occur and how they affect specific variables can help users assess the data's usefulness for their particular applications as well as assess the risks involved in basing conclusions or decisions on them.

Errors can arise at virtually every stage of the census process, from preparing materials, through collecting data, to processing. Moreover, errors may be more prevalent in certain areas of the country or vary according to the characteristic being measured. Some errors occur at random, and when individual responses are aggregated for a sufficiently large group they tend to cancel each other out. For errors of this nature, the larger the group, the more accurate the corresponding estimate. For this reason, data users are advised to be cautious when using estimates based on a small number of responses. Some errors, however, might occur more systematically and result in "biased" estimates. Because the bias from such errors persists no matter how large the group for which responses are aggregated, and because bias is particularly difficult to measure, systematic errors are a more serious problem for most data users than random errors.
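The contrast between random and systematic errors can be illustrated with a short calculation. The sketch below is not part of the census methodology; it assumes, for illustration only, that each response carries an independent random error with a common standard deviation, and that a systematic error shifts every response by the same amount. The function names and the numbers used are hypothetical.

```python
import math


def relative_random_error(n, mean_value, sd_error):
    """Relative standard error of a total built from n responses.

    Assumes independent, zero-mean errors: the error in the total has a
    standard deviation of sd_error * sqrt(n), while the total itself is
    about n * mean_value, so the ratio shrinks like 1 / sqrt(n).
    """
    return (sd_error * math.sqrt(n)) / (n * mean_value)


def relative_bias(mean_value, bias_per_response):
    """Relative bias of a total when every response is off by the same amount.

    The bias in the total is n * bias_per_response and the total is about
    n * mean_value, so the group size cancels out.
    """
    return bias_per_response / mean_value


# Hypothetical values: average response 100, random error SD 20, bias 2.
for n in (10, 1_000, 100_000):
    print(n, round(relative_random_error(n, 100.0, 20.0), 5),
          relative_bias(100.0, 2.0))
```

Under these assumptions the random component falls as the group grows, while the relative bias stays at the same level for every group size, which is why aggregation does not remove it.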

The most common types of errors are described below.

Coverage errors

In spite of efforts by census representatives to locate and enumerate all farm operations in Canada, each Census of Agriculture misses some farms, primarily because of the difficulty in correctly identifying an agricultural operation when none of its farm operators live on or near it. To reduce undercoverage, census representatives are instructed to ask a member of every household whether someone in the household is a farm operator. In addition, since 1991, an agriculture operator screening question has been on the Census of Population questionnaire to identify farm operators missed when the questionnaires were delivered. If a Census of Population questionnaire was returned with this question marked "yes," the household was contacted by phone through the Missing Farms Follow-up Survey to complete a Census of Agriculture questionnaire. This survey also made it possible to identify all large farms in each province on Statistics Canada's Farm Register (a regularly updated listing of farms in Canada) that may have been missed by the Census of Agriculture. The operators of these farms were contacted by phone to complete the questionnaire. Finally, the Coverage Evaluation Survey gave an estimated 3.4% undercoverage rate for the 2006 Census of Agriculture.

Non-response errors

Some Census of Agriculture and Census of Population questionnaires are only partially completed or not completed at all, usually because of the respondent's absence during the census period or unwillingness to complete the questionnaires. In either case, if the follow-up attempt to obtain the appropriate information is unsuccessful, missing responses are approximated using an automated imputation procedure during data processing. This procedure replaces a missing or inconsistent response, either with a value that is consistent with the other data provided on the questionnaire or with a response obtained from a similar agricultural operation. Data resulting from this procedure generally have little impact on the final figures released.
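The idea of replacing a missing response with one taken from a similar agricultural operation can be sketched as a simple nearest-donor lookup. This is a minimal illustration only and does not reproduce the actual census imputation procedure; the function name, field names, distance measure and farm records are all hypothetical.

```python
def impute_from_nearest_donor(recipient, donors, match_fields, target_field):
    """Return a copy of recipient with target_field filled from the closest donor.

    Similarity is the sum of absolute differences over match_fields.
    (An assumption for this sketch: the fields are not rescaled here,
    although a real procedure would normally standardize them.)
    """
    filled = dict(recipient)
    if filled.get(target_field) is not None:
        return filled  # nothing is missing, so nothing is imputed

    # Only records that actually report the target value can act as donors.
    candidates = [d for d in donors if d.get(target_field) is not None]
    if not candidates:
        return filled  # no usable donor, so the value stays missing

    def distance(donor):
        return sum(abs(donor[f] - recipient[f]) for f in match_fields)

    nearest = min(candidates, key=distance)
    filled[target_field] = nearest[target_field]
    return filled


# Hypothetical records, for illustration only.
farms = [
    {"id": "A", "total_area_ha": 120, "cattle": 80, "receipts": 95_000},
    {"id": "B", "total_area_ha": 400, "cattle": 10, "receipts": 310_000},
]
incomplete = {"id": "C", "total_area_ha": 130, "cattle": 75, "receipts": None}

result = impute_from_nearest_donor(
    incomplete, farms, ["total_area_ha", "cattle"], "receipts"
)
print(result)  # receipts is filled with 95000, taken from farm A
```

The original record is left unchanged, so imputed values can be flagged and reviewed separately from reported ones.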

Response errors

Respondents sometimes provide inaccurate responses on the questionnaire, perhaps as a result of misinterpretation of a question, incorrect placement of a response or approximation of a response. In the Census of Agriculture, implausible or inconsistent responses are confirmed or corrected by contacting the respondents, since they could have a significant impact on totals at either the provincial or the sub-provincial level.

Processing errors

Errors can arise at any stage of data processing, including scanning or character recognition errors during data capture, manual coding and classification errors, and errors due to limitations in the imputation procedure (to correct missing or inconsistent responses, as described in "Non-response errors"). A detailed set of computerized checks at each stage of processing identifies such errors for corrective action. In addition, quality assurance procedures were developed for all processing steps.

Matching errors

During the creation of the Agriculture–Population Linkage database, missing, incomplete, or incorrect operator identification information from either census has the potential to introduce errors into the matching process. As examples of false matches, the same operator on two different operations could be erroneously matched to two different persons, or two separate operators may be incorrectly linked to the same person on the Census of Population database. Errors in operator identification may also prevent some true matches from being made. The effects of these non-match situations are minimized through the use of imputation or weighting.

Sampling errors

Sampling errors apply to all data relating to those questions on the long Census of Population questionnaires which were asked of only a one-fifth sample of households. These errors arise from the fact that the data for these questions, when weighted up to represent the entire population, inevitably differ somewhat from the results that would have been obtained if all households had been asked these questions. When variables relating to 100% of the population (either Census of Agriculture or Census of Population) are presented within the same table as variables relating to a 20% sample, all figures in this table will necessarily be sample estimates and therefore subject to sampling error.

The potential error introduced by sampling will vary according to the relative scarcity of the characteristic in the population. For large values, the potential error due to sampling, as a proportion of the total value, will be relatively small. For small values, this potential error will be relatively large. The potential error due to sampling is usually expressed by the "standard error". Every population has an associated standard deviation, which is given as the square root of the average squared deviation of all population values about their mean. The standard error is an estimate of the population standard deviation corrected for the size of the sample relative to the size of the population.
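The two quantities described above can be illustrated with a short sketch. The first function follows the definition of the population standard deviation given in the text. The second is an assumption made for illustration only: it treats the one-fifth sample as a simple random sample drawn without replacement, which is a simplification and not the actual census sample design or weighting procedure. The function names and the population size used are hypothetical.

```python
import math


def population_sd(values):
    """Square root of the average squared deviation of all values about their mean."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


def srs_standard_error_of_count(estimated_count, population_size,
                                sampling_fraction=0.2):
    """Approximate standard error of a weighted count.

    Assumes simple random sampling without replacement (a simplification).
    With p the share of the population having the characteristic,
    n the sample size and f the sampling fraction:
        Var(N * p_hat) is approximately N^2 * (1 - f) * p * (1 - p) / n
    """
    p = estimated_count / population_size
    n = sampling_fraction * population_size
    variance = (population_size ** 2 * (1 - sampling_fraction)
                * p * (1 - p) / n)
    return math.sqrt(variance)


print(population_sd([2, 4, 4, 4, 5, 5, 7, 9]))  # 2.0

# Relative error for small and large counts in a hypothetical
# population of 1,000,000 units.
for count in (100, 1_000, 10_000):
    se = srs_standard_error_of_count(count, 1_000_000)
    print(count, round(se, 1), round(se / count, 3))
```

Under these assumptions the standard error grows with the size of the count, but more slowly than the count itself, so the error as a proportion of the value is larger for small values and smaller for large ones.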

The following table provides approximate measures of the standard error due to sampling based on the size of the data table cell values. They are intended as a general guide only. Note that these measures should not be used directly for estimates associated with averages of population, family or farm data (e.g., average size of census family).

Table 1 Approximate standard error due to sampling for 2006 Agriculture–Population linkage data

Users wishing to determine the approximate error due to sampling, based upon the Agriculture–Population Linkage, should choose the standard error corresponding to the value that is closest to that given in a particular Agriculture–Population Linkage table. With 95% certainty (i.e., 19 times out of 20), an interval constructed from the tabulated value plus or minus two times its standard error will contain the true value for the enumerated population (discounting all forms of error other than sampling). As an example using the approximate standard errors above, the user can be reasonably certain that for a value of 1,000, the range of 1,000 ± (2 × 60), or 1,000 ± 120, will include the true value of the characteristic being tabulated.

The effect of the particular sample design and weighting procedure used in the 2006 Census will vary, however, from one characteristic to another and from one geographic area to another. Therefore, the standard error values in the table may understate or overstate the error due to sampling.
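The interval rule described above (tabulated value plus or minus two standard errors) can be written as a small helper. This is a minimal sketch; the function name is hypothetical, and the only figures used are those from the worked example in the text, a value of 1,000 with an approximate standard error of 60.

```python
def approximate_interval(tabulated_value, standard_error, multiplier=2):
    """Return (low, high) for tabulated_value +/- multiplier * standard_error.

    A multiplier of 2 corresponds to the approximate 95% interval
    described in the text; only sampling error is accounted for.
    """
    margin = multiplier * standard_error
    return tabulated_value - margin, tabulated_value + margin


low, high = approximate_interval(1_000, 60)
print(low, high)  # prints: 880 1120
```

Because the tabulated standard errors are only a general guide, an interval computed this way should be read as approximate.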

Comparability of data and related sources

The data validation process identified some instances in which data either were not directly comparable to those from previous censuses or were of reduced quality, primarily because of coverage or response errors. After each case was thoroughly investigated, notes were developed to identify the variables affected and explain the situation associated with each.

Following each Census of Agriculture, other agricultural surveys use Census of Agriculture data as a basis, or benchmark, for the production of regularly published estimates of the agriculture industry.

Other quality indicators and assessments

Coverage Evaluation Survey

The purpose of the Coverage Evaluation Survey (CES) is to estimate the coverage of the 2006 Census of Agriculture that was conducted on May 16, 2006.

Coverage is a problem that affects the quality of estimates of all censuses. For the Census of Agriculture, coverage errors occur when farms are missed, incorrectly included or double counted. The CES measures the level of coverage and is one way to assess the quality of the Census of Agriculture estimates.

The CES selects a random sample of smaller farm operations from Statistics Canada's Farm Register for which no Census of Agriculture questionnaire was received. The CES also draws a random sample of households that were not contacted by the Missing Farms Follow-up Survey, that had identified a farm operator as part of the household, and that had not completed a Census of Agriculture questionnaire. The survey uses a short questionnaire to collect key information about the operating status and the size of the farm. Please note that there are no estimates of undercoverage for Yukon Territory, the Northwest Territories and Nunavut.
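To make the notion of an undercoverage rate concrete, the sketch below expresses it as the estimated number of missed farms divided by the estimated true total (enumerated plus missed). This definition is an assumption made for illustration; the document does not give the exact formula used by the CES. The function name and the counts are hypothetical and are not census results.

```python
def undercoverage_rate(enumerated, estimated_missed):
    """Share of the estimated true total that was missed by the census.

    Assumed definition (not confirmed by the source):
        missed / (enumerated + missed)
    """
    total = enumerated + estimated_missed
    if total <= 0:
        raise ValueError("enumerated + estimated_missed must be positive")
    return estimated_missed / total


# Hypothetical counts chosen only to illustrate the arithmetic.
rate = undercoverage_rate(enumerated=9_660, estimated_missed=340)
print(f"{rate:.1%}")  # prints: 3.4%
```

The same calculation could be applied to farm area or gross farm receipts by replacing the farm counts with the corresponding enumerated and estimated missed totals.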

Table 2 Farm undercoverage: breakdown by province

Table 3 Total farm area undercoverage: breakdown by province/region

Table 4 Total gross farm receipts undercoverage: breakdown by province/region

Table 5 Farm undercoverage: breakdown by total gross farm receipts