Small Area Estimation for the Monthly Survey of Manufacturing

The Monthly Survey of Manufacturing (MSM) provides statistics on sales and inventories for Canada and each province. In recent years, there has been an increased interest in estimating sub-provincial sales. Direct estimates of sales can be obtained from the MSM, but they would be reliable only if the sample sizes are large enough. Therefore, a Small Area Estimation (SAE) methodology is now used to improve the quality of sub-provincial estimates, using Goods and Services Tax data provided by the Canada Revenue Agency. This document briefly describes this methodology.

1. Introduction

The demand for sales estimates at smaller geographical levels has greatly increased in recent years. Standard weighted estimates (or direct estimates) at sub-provincial levels can be obtained from the MSM. However, these direct estimates can be considered reliable only when the sample size in the area of interest is large enough. To address this issue, an SAE methodology is used to improve the quality of sub-provincial estimates by combining survey estimates with data from other sources.

SAE methods attempt to produce reliable estimates when the sample size in the area is small. In this application, the small area estimate is a function of two quantities: the direct estimate from the survey data, and a prediction based on a model, sometimes referred to as the indirect or synthetic estimate. The model uses survey data from the geographical area of interest, but it also draws on data from other areas (through the estimation of the model parameters) and on auxiliary data. The auxiliary data must come from a source that is independent of the MSM, and they must be available at the appropriate levels of geography. The SAE model uses GST sales as the auxiliary data: the GST sales and the direct survey estimates are used together to derive the small area estimates. For the smallest areas, the direct estimates are not reliable and the small area estimates are driven mostly by the model predictions. For the largest areas, the opposite holds and the small area estimates tend to be close to the direct estimates.
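
The balance between the direct estimate and the model prediction can be illustrated with a weighted average. The form below assumes the standard Fay-Herriot area-level model; this document does not spell out the exact specification used in G-Est, so the symbols $\hat{\gamma}_i$, $\hat{\beta}$, $\hat{\sigma}_v^2$ and $\psi_i$ are introduced here for illustration only.

```latex
\hat{\theta}_i^{SAE} = \hat{\gamma}_i \, \hat{\theta}_i + (1 - \hat{\gamma}_i) \, z_i^{\top} \hat{\beta},
\qquad
\hat{\gamma}_i = \frac{\hat{\sigma}_v^2}{\hat{\sigma}_v^2 + \psi_i}
```

Here $\hat{\theta}_i$ is the direct estimate for area $i$, $z_i^{\top}\hat{\beta}$ is the model prediction, $\psi_i$ is the sampling variance of the direct estimate, and $\hat{\sigma}_v^2$ is the estimated model variance. When $\psi_i$ is large (small sample), $\hat{\gamma}_i$ is close to 0 and the prediction dominates. When $\psi_i$ is small (large sample), $\hat{\gamma}_i$ is close to 1 and the small area estimate stays close to the direct estimate.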

There are two types of SAE models: area-level (or aggregate) models that relate small area means to area-specific auxiliary variables, and unit-level models that relate the unit values of the study variable to unit-specific auxiliary variables. The MSM uses an area-level model.

Section 2 describes the requirements to produce sub-provincial sales estimates. Section 3 briefly discusses the diagnostics used for model validation and for the evaluation of the small area estimates.

2. Area-level model

The small area estimates were obtained with the small area estimation module of the generalized software G-Est, version 2.02 (Hidiroglou et al., 2019; Estevao et al., 2017). Three inputs must be provided to G-Est for each area in order to obtain small area estimates:

Direct estimates from survey data, $\hat{\theta}_i$

Smoothed variance estimates, which are obtained by applying a piecewise smoothing approach on the variance estimates

Vector of auxiliary variables, $z_i$
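
To make the role of the three inputs concrete, the following is a minimal sketch of an area-level EBLUP computation. It is not the G-Est implementation: it assumes the standard Fay-Herriot model, estimates the model variance with a simple moment equation solved by bisection (G-Est may use a different estimator), and the function name `eblup_area_level` and the numbers in the example are hypothetical.

```python
import numpy as np

def eblup_area_level(theta_hat, psi, Z):
    """Sketch of a Fay-Herriot EBLUP (illustrative, not the G-Est code).

    theta_hat : direct estimates, one per area
    psi       : smoothed sampling variances, one per area (must be > 0)
    Z         : (m x p) matrix of auxiliary variables, one row per area
    """
    theta_hat = np.asarray(theta_hat, dtype=float)
    psi = np.asarray(psi, dtype=float)
    Z = np.asarray(Z, dtype=float)
    m, p = Z.shape

    def wls(sigma2_v):
        # Weighted least squares with weights 1 / (sigma2_v + psi_i)
        w = 1.0 / (sigma2_v + psi)
        ZtW = Z.T * w
        beta = np.linalg.solve(ZtW @ Z, ZtW @ theta_hat)
        resid = theta_hat - Z @ beta
        return beta, float(np.sum(w * resid**2))

    # Moment equation: weighted residual sum of squares equals m - p.
    # The left-hand side decreases as sigma2_v grows, so bisection works.
    target = m - p
    if wls(0.0)[1] <= target:
        sigma2_v = 0.0  # truncate at zero when no positive solution exists
    else:
        lo, hi = 0.0, float(np.mean(psi))
        while wls(hi)[1] > target:
            hi *= 2.0
        for _ in range(200):
            mid = 0.5 * (lo + hi)
            if wls(mid)[1] > target:
                lo = mid
            else:
                hi = mid
        sigma2_v = 0.5 * (lo + hi)

    beta, _ = wls(sigma2_v)
    synthetic = Z @ beta                      # model predictions
    gamma = sigma2_v / (sigma2_v + psi)       # weight on the direct estimate
    theta_sae = gamma * theta_hat + (1.0 - gamma) * synthetic
    return theta_sae, gamma, synthetic, sigma2_v

# Made-up inputs for six hypothetical areas (not MSM data)
theta_hat = np.array([120.0, 45.0, 300.0, 80.0, 15.0, 210.0])
psi = np.array([400.0, 900.0, 100.0, 625.0, 1600.0, 225.0])
gst = np.array([110.0, 60.0, 290.0, 70.0, 30.0, 220.0])
Z = np.column_stack([np.ones_like(gst), gst])  # intercept + GST sales

theta_sae, gamma, synthetic, sigma2_v = eblup_area_level(theta_hat, psi, Z)
```

In this sketch, areas with a larger smoothed variance receive a smaller weight `gamma` on their direct estimate, so their small area estimate moves toward the model prediction.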

For the estimation of sales, the domains of interest are defined by crossing 27 industry groups with 15 Census Metropolitan Areas (M=324).

The 27 industry groups are as follows:

Table 1: Industry groups
Industry group | Description
311 | Food manufacturing
312 | Beverage and tobacco product manufacturing
313 | Textile mills
314 | Textile product mills
315 | Clothing manufacturing
316 | Leather and allied product manufacturing
321 | Wood product manufacturing
322 | Paper manufacturing
323 | Printing and related support activities
324 | Petroleum and coal product manufacturing
325 | Chemical manufacturing
326 | Plastics and rubber products manufacturing
327 | Non-metallic mineral product manufacturing
331 | Primary metal manufacturing
332 | Fabricated metal product manufacturing
333 | Machinery manufacturing
334 | Computer and electronic product manufacturing
335 | Electrical equipment, appliance and component manufacturing
3361 | Motor vehicle manufacturing
3362 | Motor vehicle body and trailer manufacturing
3363 | Motor vehicle parts manufacturing
3364 | Aerospace product and parts manufacturing
3365 | Railroad rolling stock manufacturing
3366 | Ship and boat building
3369 | Other transportation equipment manufacturing
337 | Furniture and related product manufacturing
339 | Miscellaneous manufacturing

The 15 Census Metropolitan Areas that are used in the SAE are shown in the following table.

Table 2: Census Metropolitan Areas
Census Metropolitan Area | Description | Province
205 | Halifax | Nova Scotia
421 | Québec | Québec
433 | Sherbrooke | Québec
462 | Montréal | Québec
505 | Ottawa-Gatineau | Québec/Ontario
535 | Toronto | Ontario
537 | Hamilton | Ontario
541 | Kitchener-Cambridge-Waterloo | Ontario
559 | Windsor | Ontario
602 | Winnipeg | Manitoba
705 | Regina | Saskatchewan
725 | Saskatoon | Saskatchewan
825 | Calgary | Alberta
835 | Edmonton | Alberta
933 | Vancouver | British Columbia

3. Evaluation of small area estimates

The accuracy of small area estimates depends on the reliability of the model. It is therefore essential to assess the validity of the model carefully before releasing estimates. For instance, it is important to verify that a linear relationship holds, at least approximately, between the direct estimates from the MSM ($\hat{\theta}_i$) and the GST sales data ($z_i$).

For the MSM, diagnostic plots and tests in G-Est are used to assess the model, and outliers are identified iteratively by examining the standardized residuals from that model.
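
As an illustration of the residual check, the sketch below computes standardized residuals under an assumed Fay-Herriot model, where the difference between the direct estimate and the model prediction has variance equal to the model variance plus the sampling variance. The function names, the cut-off of 3 and the numbers are hypothetical; the actual G-Est diagnostics may be defined differently.

```python
import numpy as np

def standardized_residuals(theta_hat, synthetic, psi, sigma2_v):
    """Residuals of direct estimates around model predictions,
    scaled by the assumed marginal standard deviation sqrt(sigma2_v + psi_i)."""
    theta_hat = np.asarray(theta_hat, dtype=float)
    synthetic = np.asarray(synthetic, dtype=float)
    psi = np.asarray(psi, dtype=float)
    return (theta_hat - synthetic) / np.sqrt(sigma2_v + psi)

def flag_outliers(std_resid, threshold=3.0):
    """Indices of areas whose absolute standardized residual exceeds
    the threshold (the value 3.0 is an illustrative choice)."""
    return np.flatnonzero(np.abs(std_resid) > threshold)

# Made-up values for three hypothetical areas
theta_hat = np.array([100.0, 50.0, 400.0])   # direct estimates
synthetic = np.array([90.0, 55.0, 200.0])    # model predictions
psi = np.array([100.0, 100.0, 100.0])        # smoothed sampling variances
sigma2_v = 300.0                             # assumed model variance

std_resid = standardized_residuals(theta_hat, synthetic, psi, sigma2_v)
outliers = flag_outliers(std_resid)
```

An iterative procedure would typically set the flagged areas aside, refit the model and repeat the check until no new outliers appear.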

A useful concept for evaluating the efficiency gains from using the small area estimate $\hat{\theta}_i^{SAE}$ instead of the direct estimate is the Mean Square Error (MSE). The MSE is unknown but can be estimated (see Rao and Molina, 2015). Efficiency gains over the direct estimate are expected when the MSE estimate is smaller than the smoothed variance estimate or the direct variance estimate. In general, the small area estimates in the MSM were significantly more efficient than the direct estimates, especially for the areas with the smallest sample sizes.
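
The comparison described above can be expressed as a simple ratio per area. The sketch below assumes the MSE estimates and the variance estimates are already available; the function name `relative_efficiency` and the numbers are hypothetical and are not MSM results.

```python
import numpy as np

def relative_efficiency(var_direct, mse_sae):
    """Ratio of the direct (or smoothed) variance estimate to the estimated
    MSE of the small area estimate. Values above 1 suggest an efficiency gain."""
    var_direct = np.asarray(var_direct, dtype=float)
    mse_sae = np.asarray(mse_sae, dtype=float)
    return var_direct / mse_sae

# Made-up values for three hypothetical areas, ordered from small to large sample
var_direct = np.array([1600.0, 625.0, 100.0])
mse_sae = np.array([400.0, 250.0, 90.0])

ratio = relative_efficiency(var_direct, mse_sae)
gains = ratio > 1.0   # True where the small area estimate is expected to be more efficient
```

In this made-up example the ratio is largest for the first area, which mirrors the pattern reported for the MSM: the gain is greatest where the direct variance is largest.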

References

Estevao, V., You, Y., Hidiroglou, M., and Beaumont, J.-F. (2017). Small Area Estimation - Area Level Model with EBLUP Estimation - Description of Function Parameters and User Guide. Statistics Canada document.

Hidiroglou, M.A., Beaumont, J.-F., and Yung, W. (2019). Development of a small area estimation system at Statistics Canada. Survey Methodology, Statistics Canada, Catalogue no. 12-001-X, Vol. 45, no. 1.

Rao, J.N.K., and Molina, I. (2015). Small Area Estimation. John Wiley & Sons, Inc., Hoboken, New Jersey.