The Monthly Survey of Manufacturing (MSM) provides statistics on sales and inventories for Canada and each province. In recent years, there has been an increased interest in estimating sub-provincial sales. Direct estimates of sales can be obtained from the MSM, but they would be reliable only if the sample sizes are large enough. Therefore, a Small Area Estimation (SAE) methodology is now used to improve the quality of sub-provincial estimates, using Goods and Services Tax data provided by the Canada Revenue Agency. This document briefly describes this methodology.
1. Introduction
The demand for sales estimates at smaller geographical levels has greatly increased in recent years. Standard weighted estimates (or direct estimates) at sub-provincial levels can be obtained from the MSM. However, these direct estimates can be considered reliable only when the sample size in the area of interest is large enough. To address this issue, an SAE methodology is used to improve the quality of sub-provincial estimates by combining survey estimates with data from other sources.
SAE methods attempt to produce reliable estimates when the sample size in the area is small. In this application of the methodology, the small area estimate is a function of two quantities: the direct estimate from the survey data, and a prediction based on a model (sometimes referred to as the indirect, or synthetic, estimate). The model involves survey data from the geographical area of interest, but also incorporates data from other areas (as input to the model parameters) and auxiliary data. The auxiliary data must come from a source that is independent of the MSM, and it must be available at the appropriate levels of geography. The SAE model uses GST sales as the auxiliary data. More precisely, the GST sales, along with the direct survey estimates, are used to derive the small area estimates. For the smallest areas, the direct estimates are not reliable and the small area estimates are driven mostly by the predictions from the model. For the largest areas, the opposite holds and the small area estimates tend to be close to the direct estimates.
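The way the two quantities are combined can be written as a weighted average. The expression below is a sketch that assumes the standard basic area-level form described in Rao and Molina (2015); the symbols are introduced here for illustration and do not come from the MSM documentation. For area $i$, $\hat{\theta}_i$ denotes the direct estimate, $\mathbf{z}_i$ the vector of auxiliary variables, $\hat{\boldsymbol{\beta}}$ the estimated regression coefficients, $\psi_i$ the smoothed sampling variance and $\hat{\sigma}_v^2$ the estimated model variance.

```latex
\hat{\theta}_i^{\mathrm{SAE}}
  = \gamma_i \, \hat{\theta}_i + (1 - \gamma_i) \, \mathbf{z}_i^{\top} \hat{\boldsymbol{\beta}},
\qquad
\gamma_i = \frac{\hat{\sigma}_v^2}{\hat{\sigma}_v^2 + \psi_i}
```

Under this form, an area with a large sampling variance $\psi_i$ (typically a small sample) has $\gamma_i$ close to 0, so its estimate leans on the model prediction. An area with a small $\psi_i$ has $\gamma_i$ close to 1, so its estimate stays close to the direct estimate.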
There are two types of SAE models: area-level (or aggregate) models that relate small area means to area-specific auxiliary variables, and unit-level models that relate the unit values of the study variable to unit-specific auxiliary variables. The MSM uses an area-level model.
Section 2 describes the requirements to produce sub-provincial sales estimates. Section 3 briefly discusses the diagnostics used for model validation and the evaluation of the small area estimates.
2. Area-level model
The small area estimates were obtained using the small area estimation module of the generalized software G-Est, version 2.02 (Hidiroglou et al., 2019; Estevao et al., 2017). Three inputs need to be provided to G-Est for each area in order to obtain small area estimates:
Direct estimates from survey data
Smoothed variance estimates, which are obtained by applying a piecewise smoothing approach on the variance estimates
Vector of auxiliary variables
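To illustrate how these three inputs fit together, the following is a minimal Python sketch of a basic area-level (Fay-Herriot) fit. It is not the G-Est implementation: the function name, the argument names and the choice of the Fay-Herriot moment method for the model variance are all assumptions made for illustration, and G-Est may use a different estimation method.

```python
import numpy as np

def fit_area_level_model(direct, smoothed_var, Z, max_iter=100, tol=1e-8):
    """Sketch of a basic area-level (Fay-Herriot) fit with EBLUP estimates.

    direct       : (m,) direct survey estimates, one per area
    smoothed_var : (m,) smoothed sampling variances (must be > 0), treated as known
    Z            : (m, p) auxiliary variables (e.g. an intercept and GST sales)
    All names here are hypothetical; they are not G-Est parameters.
    """
    y = np.asarray(direct, dtype=float)
    psi = np.asarray(smoothed_var, dtype=float)
    Z = np.asarray(Z, dtype=float)
    m, p = Z.shape

    def wls(sigma2_v):
        # Weighted least squares for beta, given the model variance
        w = 1.0 / (sigma2_v + psi)
        beta = np.linalg.solve(Z.T @ (w[:, None] * Z), Z.T @ (w * y))
        return w, beta

    # Fay-Herriot moment method: solve sum_i w_i * resid_i^2 = m - p
    # by Newton-type updates, truncating the model variance at zero.
    sigma2_v = 0.0
    for _ in range(max_iter):
        w, beta = wls(sigma2_v)
        resid2 = (y - Z @ beta) ** 2
        h = np.sum(w * resid2) - (m - p)
        dh = -np.sum(w ** 2 * resid2)
        updated = max(sigma2_v - h / dh, 0.0)
        converged = abs(updated - sigma2_v) < tol
        sigma2_v = updated
        if converged:
            break

    _, beta = wls(sigma2_v)
    synthetic = Z @ beta                    # model-based prediction
    gamma = sigma2_v / (sigma2_v + psi)     # weight given to the direct estimate
    eblup = gamma * y + (1.0 - gamma) * synthetic
    return {"beta": beta, "sigma2_v": sigma2_v, "gamma": gamma,
            "synthetic": synthetic, "eblup": eblup}
```

In this sketch, each small area estimate lies between the direct estimate and the model prediction, and areas with larger smoothed variances are pulled further toward the prediction.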
For the estimation of sales, the domains of interest are defined as: 27 industry groups × 15 Census Metropolitan Areas (M=324).
The 27 industry groups are as follows:
Industry Group | Description |
---|---|
311 | Food manufacturing |
312 | Beverage and tobacco product manufacturing |
313 | Textile mills |
314 | Textile product mills |
315 | Clothing manufacturing |
316 | Leather and allied product manufacturing |
321 | Wood product manufacturing |
322 | Paper manufacturing |
323 | Printing and related support activities |
324 | Petroleum and coal product manufacturing |
325 | Chemical manufacturing |
326 | Plastics and rubber products manufacturing |
327 | Non-metallic mineral product manufacturing |
331 | Primary metal manufacturing |
332 | Fabricated metal product manufacturing |
333 | Machinery manufacturing |
334 | Computer and electronic product manufacturing |
335 | Electrical equipment, appliance and component manufacturing |
3361 | Motor vehicle manufacturing |
3362 | Motor vehicle body and trailer manufacturing |
3363 | Motor vehicle parts manufacturing |
3364 | Aerospace product and parts manufacturing |
3365 | Railroad rolling stock manufacturing |
3366 | Ship and boat building |
3369 | Other transportation equipment manufacturing |
337 | Furniture and related product manufacturing |
339 | Miscellaneous manufacturing |
The 15 Census Metropolitan Areas that are used in the SAE are shown in the following table.
Census Metropolitan Area | Description | Province |
---|---|---|
205 | Halifax | Nova Scotia |
421 | Québec | Québec |
433 | Sherbrooke | Québec |
462 | Montréal | Québec |
505 | Ottawa-Gatineau | Québec/Ontario |
535 | Toronto | Ontario |
537 | Hamilton | Ontario |
541 | Kitchener-Cambridge-Waterloo | Ontario |
559 | Windsor | Ontario |
602 | Winnipeg | Manitoba |
705 | Regina | Saskatchewan |
725 | Saskatoon | Saskatchewan |
825 | Calgary | Alberta |
835 | Edmonton | Alberta |
933 | Vancouver | British Columbia |
3. Evaluation of small area estimates
The accuracy of small area estimates depends on the reliability of the model. It is therefore essential to assess the validity of the model carefully before releasing estimates. For instance, it is important to verify that a linear relationship holds, at least approximately, between the direct estimates from the MSM and the GST sales data.
For the MSM, diagnostic plots and tests in G-Est are used to assess the model, and outliers are identified iteratively by examining the standardized residuals from that model.
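The iterative use of standardized residuals can be sketched as follows. This is an illustration only, not the G-Est procedure: the function name, the threshold of 3, the one-area-per-round rule and the choice to hold the model variance fixed across rounds are all assumptions.

```python
import numpy as np

def flag_outliers(direct, smoothed_var, Z, sigma2_v, threshold=3.0, max_rounds=10):
    """Iteratively flag areas with large standardized residuals.

    Each round refits the regression on the areas still kept, then drops the
    single kept area with the largest |standardized residual| if it exceeds
    `threshold`. `sigma2_v` (model variance) is held fixed, a simplification.
    All names and defaults are hypothetical.
    """
    y = np.asarray(direct, dtype=float)
    psi = np.asarray(smoothed_var, dtype=float)
    Z = np.asarray(Z, dtype=float)
    total_sd = np.sqrt(sigma2_v + psi)
    keep = np.ones(len(y), dtype=bool)

    for _ in range(max_rounds):
        # Weighted least squares using only the areas not yet flagged
        w = 1.0 / (sigma2_v + psi[keep])
        Zk = Z[keep]
        beta = np.linalg.solve(Zk.T @ (w[:, None] * Zk), Zk.T @ (w * y[keep]))

        # Standardized residuals for every area under the current fit
        std_resid = (y - Z @ beta) / total_sd

        # Only areas still kept are candidates for removal in this round
        candidates = np.where(keep, np.abs(std_resid), 0.0)
        worst = int(np.argmax(candidates))
        if candidates[worst] <= threshold:
            break
        keep[worst] = False

    # Flags, plus the standardized residuals from the last fit performed
    return ~keep, std_resid
```

Dropping one area per round is a deliberate choice in this sketch: a single extreme value can distort the fitted line enough to inflate the residuals of ordinary areas, so refitting after each removal avoids flagging them by mistake.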
A useful concept for evaluating the gains in efficiency resulting from the use of the small area estimate over the direct estimate is the Mean Square Error (MSE). The MSE is unknown but can be estimated (see Rao and Molina, 2015). Gains in efficiency over the direct estimate are expected when the MSE estimate is smaller than the smoothed variance estimate or the direct variance estimate. In general, the small area estimates in the MSM were significantly more efficient than the direct estimates, especially for the areas with the smallest sample size.
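One simple way to summarize this comparison is the ratio of the variance estimate to the MSE estimate for each area. The sketch below assumes both quantities are already available; the function name is hypothetical and the numbers are made up for illustration, not MSM results.

```python
import numpy as np

def efficiency_gain(variance_estimate, mse_estimate):
    """Ratio of the direct (or smoothed) variance to the estimated MSE, per area.

    Values above 1 suggest that the small area estimate is more efficient
    than the direct estimate for that area.
    """
    var = np.asarray(variance_estimate, dtype=float)
    mse = np.asarray(mse_estimate, dtype=float)
    return var / mse

# Illustrative values only (not MSM results)
smoothed_var = np.array([400.0, 90.0, 25.0])
mse_est = np.array([100.0, 60.0, 24.0])
gain = efficiency_gain(smoothed_var, mse_est)
```

With values like these, the first area (largest variance) shows the largest ratio, which mirrors the pattern described above for areas with the smallest sample sizes.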
References
Estevao, V., You, Y., Hidiroglou, M., and Beaumont, J.-F. (2017). Small Area Estimation - Area Level Model with EBLUP Estimation - Description of Function Parameters and User Guide. Statistics Canada document.
Hidiroglou, M.A., Beaumont, J.-F., and Yung, W. (2019). Development of a small area estimation system at Statistics Canada. Survey Methodology, Statistics Canada, Catalogue no. 12-001-X, Vol. 45, no. 1.
Rao, J.N.K., and Molina, I. (2015). Small Area Estimation. John Wiley & Sons, Inc., Hoboken, New Jersey.