2021 Census Comment Classification

By: Joanne Yoon, Statistics Canada

Once every five years, the Census of Population provides a detailed and comprehensive statistical portrait of Canada and its population. The census is the only data source that provides consistent statistics for both small geographic areas and small population groups across Canada. Census information is central to planning at all levels. Whether starting a business, monitoring a government program, planning transportation needs or choosing the location for a school, Canadians use census data every day to inform their decisions.


Preparation for each cycle of the census requires several stages of engagement, as well as testing and evaluating data to recommend questionnaire content for the next census, as is the case for the upcoming 2021 Census. These steps include content consultations and discussions with stakeholders and census data users, as well as the execution of the 2019 Census Test (which validates respondent behaviours and ensures that questions and census materials are understood by all participants).

At the end of the Census of Population questionnaires, respondents are provided with a text box in which they can share concerns and suggestions, or make comments about the steps to follow, the content or the characteristics of the questionnaire. The information entered in this space is further analyzed by the Census Subject Matter Secretariat (CSMS) during and after the census collection period. Comments pertaining to specific questionnaire content are classified by subject matter area (SMA)—such as education, labour or demography—and shared with the corresponding expert analysts. The information is used to support decision making regarding content determination for the next census and to monitor factors such as respondent burden.

Using machine learning to classify comments

In an effort to improve the analysis of the 2021 Census of Population comments, Statistics Canada's Data Science Division (DScD) worked in collaboration with CSMS to create a proof of concept on the use of machine learning (ML) techniques to quickly and objectively classify census comments. As part of the project, CSMS identified fifteen possible comment classes and provided previous census comments labelled with one or more of these classes. The fifteen classes included the census SMAs as well as general census themes such as "experience with the electronic questionnaire," "burden of response," "positive census experience" and "unrelated to the census." Using ML techniques and the labelled data, the team trained a bilingual semi-supervised text classifier: comments can be in either French or English, and the model learns each class from labelled data while leveraging unlabelled data to understand its data space. DScD data scientists experimented with two ML models. The strengths of each model, along with the final model, are detailed in this article.

The data scientists trained the 2021 Census comment classifier using comments from the 2019 Census Test. The CSMS team manually labelled these comments using the fifteen identified comment classes and reviewed each other's coding in an effort to reduce coding biases. The classifier is multi-class since there are fifteen classes into which a comment can be classified. It is also multi-label since a respondent can address multiple topics within a single comment, so a comment can be coded to one or more classes.

Deterministic question and page number mapping

When a comment contains a question or page number, that number is deterministically mapped to the SMA class associated with the question and then combined with the ML class prediction to produce the final class prediction. For example, say that a respondent completes a questionnaire where question number 22 asks about the respondent's education. In the comment box, the respondent comments on question 22 by explicitly stating the question number and also mentions the sex and gender questions without stating any question numbers. The mapping outputs the education class, and the ML model predicts the sex and gender class based on the words used to mention the sex and gender questions. The program outputs the final prediction, which is the union of the two outputs: the education class and the sex and gender class. When no question or page number is explicitly mentioned, the program only outputs the ML prediction. The ML model is not trained to learn the page number mapping of each question since the location of a question can change depending on the questionnaire format. For example, questions fall on different pages in the regular font and large print questionnaires, as fewer questions fit per page in large print, and the electronic questionnaire does not show any page numbers.
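The combination of the deterministic mapping and the ML prediction can be sketched as follows. This is a minimal illustration, not the production program: the `QUESTION_TO_CLASS` table, the regular expression and the function names are all hypothetical, and only question numbers (not page numbers) are handled.

```python
import re

# Hypothetical mapping from question number to SMA class (illustration only).
QUESTION_TO_CLASS = {22: "education"}

# Assumed pattern: matches "question 22", "Question no. 22", "Q22", "q #22".
QUESTION_PATTERN = re.compile(
    r"\b(?:question|q)\s*(?:number|no\.?|#)?\s*(\d+)", re.IGNORECASE
)

def mapped_classes(comment):
    """Return the classes implied by question numbers stated in the comment."""
    classes = set()
    for match in QUESTION_PATTERN.finditer(comment):
        number = int(match.group(1))
        if number in QUESTION_TO_CLASS:
            classes.add(QUESTION_TO_CLASS[number])
    return classes

def final_prediction(comment, ml_classes):
    """Union of the deterministic mapping and the ML model's predicted classes."""
    return mapped_classes(comment) | set(ml_classes)

comment = "Question 22 was confusing, and the gender options felt limited."
# The ML prediction is supplied by hand here; in practice it comes from the model.
print(final_prediction(comment, {"sex and gender"}))
```

When the comment states no question number, `mapped_classes` returns an empty set and the result is simply the ML prediction.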

Text cleaning

Before training the classifier, the program first cleans the comments. It identifies the language of the comment (English or French) and then corrects the spelling of each unidentifiable word by replacing it with the word that requires the fewest edits and is most frequently found in the training data. For example, the word toqn can be corrected to the valid words torn or town, but is corrected to town because town was used more frequently in the training data. The words are also lemmatized into their root representation, so the machine understands the words walk and walked to have the same root meaning. Stop words are not removed since helper words have meaning and imply sentiment. For example, this should be better has a different meaning from this is better, but if the program dropped all stop words (including this, should, be and is), the two sentences would become identical with only one word left: better. Removing stop words can alter the meaning and the sentiment of a comment.
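The spelling correction rule described above (fewest edits, then highest frequency in the training data) can be sketched as below. This is an illustrative sketch that assumes a lowercase English alphabet and only considers candidates one edit away; the tiny `counts` corpus is made up for the example and the function names are hypothetical.

```python
from collections import Counter
import string

def edits1(word):
    """All strings one edit (delete, transpose, replace, insert) away from word."""
    letters = string.ascii_lowercase
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [left + right[1] + right[0] + right[2:]
                  for left, right in splits if len(right) > 1]
    replaces = [left + c + right[1:] for left, right in splits if right for c in letters]
    inserts = [left + c + right for left, right in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)

def correct(word, counts):
    """Keep known words; otherwise pick the most frequent known word one edit away."""
    if word in counts:
        return word
    candidates = [w for w in edits1(word) if w in counts]
    if not candidates:
        return word  # nothing close enough: leave the word unchanged
    return max(candidates, key=lambda w: counts[w])

# Made-up word frequencies standing in for the training data.
counts = Counter("the town hall is in town and the page was torn".split())
print(correct("toqn", counts))  # town
```

Both torn and town are one edit from toqn, so frequency decides between them.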

Bilingual semi-supervised text classifier

The bilingual semi-supervised text classifier learns from the labelled comments and is used to classify new comments. It is not a single technique but rather a combination of individual components chosen to best classify census comments.

The data scientists trained a bilingual model. According to a Python language detection program, 71% of the labelled comments were in English and 29% were in French (16,062 English labelled comments and 6,597 French labelled comments). Training the model on both languages let it leverage identical words (such as consultation, journal and restaurant) that have the same meaning in both languages, improving accuracy on French comments, which have fewer labels than English comments.

The model is semi-supervised. Labelled data define the knowledge that the machine needs to replicate. When given the labelled training data, the model uses maximum likelihood to learn its parameters and adversarial training to be robust to small perturbations. Unlabelled data are also used to expand the data space that the machine should handle with low confusion, but they do not teach the model the meaning of the classes. They are only used to lower the model's confusion, through entropy minimization, which minimizes the conditional entropy of the estimated class probabilities, and virtual adversarial training, which maximizes the local smoothness of the conditional label distribution against local perturbation.
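To make the entropy minimization term concrete, the sketch below computes the average conditional entropy of predicted class probabilities for a batch of unlabelled comments. It is an illustration only: the probabilities are invented, the function name is hypothetical, and in training this quantity would be one term of a loss that the optimizer drives down.

```python
import math

def conditional_entropy(prob_rows):
    """Mean entropy (in nats) of the predicted class distribution per comment."""
    total = 0.0
    for probs in prob_rows:
        total -= sum(p * math.log(p) for p in probs if p > 0.0)
    return total / len(prob_rows)

# Invented predictions for two unlabelled comments over two classes.
confident = [[0.98, 0.02], [0.01, 0.99]]
uncertain = [[0.50, 0.50], [0.60, 0.40]]

# Confident predictions have lower entropy, which is what the term rewards.
print(conditional_entropy(confident) < conditional_entropy(uncertain))  # True
```

No labels are needed to compute this value, which is why unlabelled comments can contribute to it.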

The text classifier starts with an embedding layer to accept words as input. A lookup table maps each word to a dense vector since the machine learns from numbers and not characters. The embedding layer represents a sequence of words as a sequence of vectors. With this sequence, the model looks for a pattern that is more generalizable and robust than learning individual words. Also, to prevent the machine from memorizing certain expressions rather than semantic meaning, a dropout layer directly follows the embedding layer. When training, the dropout layer drops random words from the training sentence. The proportion of words dropped is fixed but the dropped words are selected at random. The model is forced to learn without some words so that it generalizes better. When using the model to classify comments, no words are dropped and the model can use all identified knowledge and patterns to make a prediction.
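The embedding lookup and word-level dropout can be sketched in a few lines. This is a simplified illustration under stated assumptions: the two-dimensional vectors in `table` are invented, dropped words are represented by zero vectors, and the function names are hypothetical rather than taken from the project.

```python
import random

def embed(tokens, table, unknown):
    """Map each word to its dense vector; unseen words get the `unknown` vector."""
    return [table.get(token, unknown) for token in tokens]

def word_dropout(vectors, rate, rng, training):
    """Zero out a fixed proportion of word vectors, chosen at random, in training only."""
    if not training:
        return list(vectors)  # at prediction time every word is kept
    n_drop = int(len(vectors) * rate)
    dropped = set(rng.sample(range(len(vectors)), n_drop))
    return [[0.0] * len(v) if i in dropped else v for i, v in enumerate(vectors)]

# Invented embedding table for illustration.
table = {"census": [0.2, 0.7], "form": [0.5, 0.1], "was": [0.3, 0.3], "long": [0.9, 0.4]}
vectors = embed(["census", "form", "was", "long"], table, unknown=[0.1, 0.1])

rng = random.Random(0)
train_input = word_dropout(vectors, rate=0.5, rng=rng, training=True)
predict_input = word_dropout(vectors, rate=0.5, rng=rng, training=False)
print(sum(v == [0.0, 0.0] for v in train_input))  # 2
print(predict_input == vectors)                   # True
```

The number of dropped words is fixed by `rate`, while `rng` decides which positions are dropped on each pass.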

Comparing CNN to Bi-LSTM

The data scientists compared a convolutional neural network (CNN) to a bidirectional long short-term memory (Bi-LSTM) network. Both networks can classify text by automatically learning complex patterns, but they learn differently because of their different structures. In this proof of concept, the data scientists experimented with three different models to learn all fifteen classes: a single-headed Bi-LSTM model, a multi-headed Bi-LSTM model and a multi-headed CNN model. Overall, the single-headed Bi-LSTM model consistently predicted all the classes the most accurately and will thus be used in production.

An LSTM can capture long-term dependencies in a word sequence: its input, forget and output gates let it learn to retain or forget the previous state's information. The previous state's information is the context formed by the group of words that preceded the current word that the network is looking at. If the current word is an adjective, the network knows what the adjective is referring to because it retained that information from earlier in the sentence. If the sentence moves to a different topic, the network should forget the previous state's information. Since a Bi-LSTM is bidirectional, the model gathers both past and future information relative to each word.
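The role of the gates can be illustrated with a scalar LSTM step. This is a teaching sketch, not the project's model: real networks use weight matrices and vectors, and the weight names (`f_b`, `i_b` and so on) and helper functions here are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def make_weights(**overrides):
    """Hypothetical scalar weights: <gate>_x, <gate>_h, <gate>_b for gates f, i, o, g."""
    weights = {f"{gate}_{part}": 0.0 for gate in "fiog" for part in "xhb"}
    weights.update(overrides)
    return weights

def lstm_step(x, h_prev, c_prev, w):
    """One LSTM step on scalars: returns the new hidden state and cell state."""
    f = sigmoid(w["f_x"] * x + w["f_h"] * h_prev + w["f_b"])    # forget gate
    i = sigmoid(w["i_x"] * x + w["i_h"] * h_prev + w["i_b"])    # input gate
    o = sigmoid(w["o_x"] * x + w["o_h"] * h_prev + w["o_b"])    # output gate
    g = math.tanh(w["g_x"] * x + w["g_h"] * h_prev + w["g_b"])  # candidate content
    c = f * c_prev + i * g   # keep part of the old context, add part of the new
    h = o * math.tanh(c)     # expose part of the context as output
    return h, c

def run(sequence, w):
    """Hidden state after each element when reading the sequence left to right."""
    h, c, states = 0.0, 0.0, []
    for x in sequence:
        h, c = lstm_step(x, h, c, w)
        states.append(h)
    return states

def bi_lstm(sequence, w_fwd, w_bwd):
    """Pair each position's past context (forward) with its future context (backward)."""
    forward = run(sequence, w_fwd)
    backward = run(sequence[::-1], w_bwd)[::-1]
    return list(zip(forward, backward))

retain = make_weights(f_b=10.0, i_b=-10.0)   # forget gate open: context is kept
forget = make_weights(f_b=-10.0, i_b=-10.0)  # forget gate closed: context is dropped
print(lstm_step(0.5, 0.0, 1.0, retain)[1])   # close to 1.0
print(lstm_step(0.5, 0.0, 1.0, forget)[1])   # close to 0.0
```

In a trained network the gate values depend on the current word and the previous state, so the model learns when to retain context and when to discard it.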

The CNN model applies a convolution filter to a sliding window over groups of words, followed by max pooling, to select the most prominent information from a phrase rather than looking at each word independently. CNN defines the semantic context of a word using neighbouring words, whereas LSTM learns from a sequential pattern of words. Individual features are concatenated to form a single feature vector that summarizes the key characteristics of the input sentence.
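The sliding-window convolution and max pooling can be sketched on small hand-made word vectors. This is an illustration only: the vectors and the two filters are invented, no activation function is applied, and the function names are hypothetical.

```python
def conv_max_pool(vectors, filt):
    """Slide a filter over consecutive word vectors and keep the strongest response."""
    width = len(filt)
    scores = []
    for start in range(len(vectors) - width + 1):
        window = vectors[start:start + width]
        # Dot product between the filter and the window of word vectors.
        score = sum(
            weight * value
            for filt_row, word_vec in zip(filt, window)
            for weight, value in zip(filt_row, word_vec)
        )
        scores.append(score)
    return max(scores)  # max pooling over all window positions

def sentence_features(vectors, filters):
    """Concatenate one pooled feature per filter into a single feature vector."""
    return [conv_max_pool(vectors, filt) for filt in filters]

# Three invented 2-dimensional word vectors and two filters of width 2.
sentence = [[1, 0], [0, 1], [1, 1]]
filters = [
    [[1, 0], [0, 1]],
    [[-1, 0], [0, 0]],
]
print(sentence_features(sentence, filters))  # [2, 0]
```

Each filter looks only at two neighbouring words at a time, which is the local view that distinguishes the CNN from the sequential LSTM.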

A multi-headed classifier was tested with a final sigmoid layer giving a confidence distribution of the classes. The sigmoid layer represents each class prediction confidence score as a decimal between 0 and 1 (i.e., 0% to 100%), where each score is independent of the others. This is ideal for the multi-label problem of comments that talk about multiple topics.
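The independence of the sigmoid scores is what allows several classes to be predicted at once, as this small sketch shows. The raw scores (logits) are invented, and the 0.5 decision threshold and function names are assumptions made for the example; the article does not state which threshold was used.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def predict_labels(logits, classes, threshold=0.5):
    """Score each class independently and keep every class at or above the threshold."""
    scores = {name: sigmoid(z) for name, z in zip(classes, logits)}
    labels = {name for name, score in scores.items() if score >= threshold}
    return scores, labels

classes = ["education", "language", "burden of response"]
scores, labels = predict_labels([2.0, -1.0, 0.5], classes)
print(labels)                # education and burden of response
print(sum(scores.values()))  # not constrained to equal 1
```

Because the scores are not forced to sum to 1, a comment can score highly on two topics at the same time.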

The data scientists also tested a single-headed classifier, where a model only learns to identify whether a single class is present in the text using a softmax activation function. The number of single-headed classifiers is equal to the number of classes. An input comment can have multiple labels if multiple classifiers predict that their topic is mentioned in the comment. For example, if a comment talks about language and education, the language classifier and education classifier will predict 1 to signal the presence of the relevant SMA classes and the other classifiers will predict 0 to signal their absence.
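Combining one classifier per class into a multi-label prediction amounts to a loop over the classifiers. In the sketch below, simple keyword checks stand in for the trained single-headed models; these stand-ins, the class names chosen and the function name are purely hypothetical.

```python
def classify(comment, classifiers):
    """Run every single-class classifier and collect the classes predicted as present."""
    return {name for name, predict in classifiers.items() if predict(comment) == 1}

# Keyword checks used as stand-ins for trained binary classifiers (illustration only).
classifiers = {
    "education": lambda text: int("school" in text.lower()),
    "language": lambda text: int("french" in text.lower() or "english" in text.lower()),
    "labour": lambda text: int("job" in text.lower()),
}

comment = "The school question was only shown in English."
print(classify(comment, classifiers))  # education and language
```

Each stand-in returns 1 or 0, mirroring the presence and absence signals described above.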

A single-headed classifier learns each class better than a multi-headed classifier, which needs to learn fifteen different classes, but there is the added burden for programmers to maintain fifteen different classifiers. The burden of running the multiple classifiers is minimal since the program can easily run all classifiers in a loop and output the relevant classes. As shown below, the single-head Bi-LSTM model performs the best across the different classes and also in the weighted average.

Table 1: Test weighted average F1-score of different models.

Model | F1-score
Single-head Bi-LSTM | 90.2%
Multi-headed CNN | 76%
Multi-headed Bi-LSTM | 73%
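For readers unfamiliar with the metric, a weighted average F1-score weights each class's F1-score by the number of true comments in that class. The sketch below computes it for a multi-label setting on a tiny invented data set; it illustrates the metric only and does not reproduce the figures in Table 1. The function names are hypothetical.

```python
def f1(tp, fp, fn):
    """F1-score from true positives, false positives and false negatives."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0

def weighted_f1(true_sets, predicted_sets, classes):
    """Average of per-class F1-scores, weighted by each class's number of true labels."""
    weighted_sum, total_support = 0.0, 0
    for name in classes:
        pairs = list(zip(true_sets, predicted_sets))
        tp = sum(1 for t, p in pairs if name in t and name in p)
        fp = sum(1 for t, p in pairs if name not in t and name in p)
        fn = sum(1 for t, p in pairs if name in t and name not in p)
        support = tp + fn
        weighted_sum += support * f1(tp, fp, fn)
        total_support += support
    return weighted_sum / total_support if total_support else 0.0

# Invented labels for four comments.
true_sets = [{"education"}, {"education", "language"}, {"language"}, {"education"}]
predicted_sets = [{"education"}, {"education"}, {"language"}, {"language"}]
print(round(weighted_f1(true_sets, predicted_sets, ["education", "language"]), 2))  # 0.68
```

Here education has an F1-score of 0.8 over three true labels and language has 0.5 over two, giving (3 × 0.8 + 2 × 0.5) / 5 = 0.68.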

Amongst the multi-headed classifiers, CNN had a 4.6% higher average test F1-score than Bi-LSTM when classifying comments into SMA classes such as language and education. On the other hand, the Bi-LSTM model's average test F1-score on general census themed classes (i.e., "unrelated to the census," "positive census experience," "burden of response," "experience with the electronic questionnaire") was 9.0% higher than that of the CNN model. Bi-LSTM was better at predicting whether a comment was relevant to the Census program because it knew the overall context of where the comment was directed. For example, a respondent's positive opinion on a Canadian sports team is not relevant to the census, so this type of comment would be classified under the class "unrelated to the census." In this case, the CNN model predicted the comment to be positive in nature and thus assigned it to the "positive census experience" class, whereas Bi-LSTM tied the positive sentiment to the context (sports teams) and, since the context was unrelated to the census, correctly labelled it to be of no value for further analysis by CSMS. CNN, on the other hand, only looks at a smaller range of words, so it excels at extracting features in certain parts of the sentence that are relevant to certain classes.

Next steps

This proof of concept showed that an ML model can accurately classify bilingual census comments. The classifier is multi-class, meaning that there are multiple classes to classify a comment into. It is also multi-label, meaning that more than one class may be relevant to the input comment. The second phase of this project will be to transition this model into production. In production, French and English comments will be spell checked and stemmed to their root words depending on each comment's language. A bilingual semi-supervised text classifier will classify both the cleaned French and English comments. The labelled 2019 data will train the ML model to predict and label incoming comments from the new 2021 Census of Population and ensure that respondent comments are categorized and shared with the appropriate expert analysts. In the production phase, when 2021 Census comments come in, the CSMS team and data scientists will continue to validate the ML predictions and feed them back to the machine to further improve the model.

If you are interested in text analytics or want to find out more about this particular project, the Applied machine learning for text analysis community of practice (GC employees only) recently featured a presentation on this project. Join the community to ask questions or discuss other text analytics projects.


Monthly Survey of Manufacturing: National Level CVs by Characteristic - December 2020

National Level CVs by Characteristic
Month | Sales of goods manufactured (%) | Raw materials and components inventories (%) | Goods / work in process inventories (%) | Finished goods manufactured inventories (%) | Unfilled orders (%)
December 2019 | 0.58 | 0.98 | 1.16 | 1.39 | 1.06
January 2020 | 0.64 | 0.99 | 1.26 | 1.32 | 1.10
February 2020 | 0.63 | 1.02 | 1.22 | 1.36 | 1.08
March 2020 | 0.68 | 0.99 | 1.17 | 1.41 | 1.10
April 2020 | 0.87 | 0.99 | 1.20 | 1.41 | 1.10
May 2020 | 0.80 | 1.04 | 1.13 | 1.37 | 1.06
June 2020 | 0.69 | 1.05 | 1.19 | 1.38 | 1.06
July 2020 | 0.69 | 1.02 | 1.15 | 1.43 | 1.10
August 2020 | 0.64 | 1.05 | 1.23 | 1.50 | 1.20
September 2020 | 0.67 | 1.05 | 1.22 | 1.54 | 1.20
October 2020 | 0.69 | 1.02 | 1.18 | 1.53 | 1.15
November 2020 | 0.72 | 1.09 | 1.19 | 1.46 | 1.34
December 2020 | 0.70 | 1.03 | 1.22 | 1.45 | 1.37

Retail Commodity Survey: CVs for Total Sales (November 2020)

NAPCS-CANADA | Month 202008 | Month 202009 | Month 202010 | Month 202011
Total commodities, retail trade commissions and miscellaneous services | 0.69 | 0.58 | 1.23 | 0.59
Retail Services (except commissions) [561] | 0.68 | 0.58 | 1.21 | 0.59
Food at retail [56111] | 0.81 | 0.60 | 1.25 | 0.73
Soft drinks and alcoholic beverages, at retail [56112] | 0.52 | 0.55 | 0.76 | 0.64
Cannabis products, at retail [56113] | 0.00 | 0.00 | 0.05 | 0.00
Clothing at retail [56121] | 1.07 | 1.09 | 1.61 | 1.74
Footwear at retail [56122] | 2.17 | 1.66 | 1.73 | 2.15
Jewellery and watches, luggage and briefcases, at retail [56123] | 9.08 | 9.18 | 6.60 | 2.34
Home furniture, furnishings, housewares, appliances and electronics, at retail [56131] | 0.73 | 0.66 | 0.70 | 0.64
Sporting and leisure products (except publications, audio and video recordings, and game software), at retail [56141] | 3.00 | 3.31 | 2.74 | 2.18
Publications at retail [56142] | 8.50 | 8.32 | 6.44 | 7.28
Audio and video recordings, and game software, at retail [56143] | 7.86 | 5.40 | 6.87 | 6.13
Motor vehicles at retail [56151] | 2.58 | 1.95 | 4.73 | 2.02
Recreational vehicles at retail [56152] | 3.79 | 3.95 | 4.42 | 6.47
Motor vehicle parts, accessories and supplies, at retail [56153] | 1.67 | 1.49 | 2.47 | 1.39
Automotive and household fuels, at retail [56161] | 2.13 | 2.23 | 2.40 | 2.24
Home health products at retail [56171] | 2.26 | 2.53 | 3.32 | 3.73
Infant care, personal and beauty products, at retail [56172] | 2.70 | 2.30 | 3.35 | 2.96
Hardware, tools, renovation and lawn and garden products, at retail [56181] | 1.22 | 1.51 | 1.36 | 1.48
Miscellaneous products at retail [56191] | 2.37 | 2.43 | 2.77 | 2.33
Total retail trade commissions and miscellaneous services (see footnote 1) | 1.65 | 1.66 | 2.38 | 1.56

Footnotes

Footnote 1

Comprises the following North American Product Classification System (NAPCS): 51411, 51412, 53112, 56211, 57111, 58111, 58121, 58122, 58131, 58141, 72332, 833111, 841, 85131 and 851511.


Federal tax expenditures

Statistics Canada's Departmental Plan does not include information on tax expenditures that relate to its planned results for 2021–22.

Tax expenditures are the responsibility of the Minister of Finance, and the Department of Finance Canada publishes cost estimates and projections for government-wide tax expenditures each year in the Report on Federal Tax Expenditures. This report provides detailed information on tax expenditures, including objectives, historical background and references to related federal spending programs, as well as evaluations, research papers and gender-based analysis. The tax measures presented in this report are solely the responsibility of the Minister of Finance.

Supplementary information tables

Departmental Sustainable Development Strategy

2020 to 2023 Short-form Departmental Sustainable Development Strategy

Name of department

Statistics Canada


Date

January 2021


Context

Although Statistics Canada is not bound by the Federal Sustainable Development Act and is not required to develop a full departmental sustainable development strategy, Statistics Canada adheres to the principles of the Federal Sustainable Development Strategy (FSDS) by complying with the Policy on Green Procurement.

The Policy on Green Procurement supports the Government of Canada's effort to promote environmental stewardship. In keeping with the objectives of the policy, Statistics Canada supports sustainable development by integrating environmental performance considerations into the procurement decision making process through the actions described in the 2019 to 2022 FSDS "Greening Government" goal.


Commitments

Please refer to the table below.


Integrating sustainable development

Statistics Canada will continue to ensure that its decision-making process includes consideration of FSDS goals and targets through its Strategic Environmental Assessment (SEA) process. An SEA for policy, plan or program proposals includes an analysis of the impacts of the given proposal on the environment, including on FSDS goals and targets.

Public statements on the results of Statistics Canada's assessments will be made public and announced on its website when an initiative has undergone a detailed SEA. The purpose of the public statement is to demonstrate that the environmental effects, including the impacts on achieving the FSDS goals and targets, of the approved policy, plan or program have been considered during proposal development and decision making.

FSDS goal: Greening Government

FSDS target: Actions supporting the Greening Government goal and the Policy on Green Procurement

FSDS contributing action: Departments will use environmental criteria to reduce the environmental impact and ensure best value in government procurement decisions
Corresponding departmental action(s):
  • Integrate environmental considerations into procurement management processes and controls.
  • Ensure paper purchased by Statistics Canada is made from recycled material.
Contribution by each departmental action to the FSDS goal and target: Motivate suppliers to reduce the environmental impact of their goods, services and supply chains.
Starting point(s), target(s) and performance indicator(s) for departmental actions:
  • To reduce waste generated and minimize the environmental impacts of assets throughout their lifecycle, Statistics Canada will continue to embed environmental considerations in public procurement in accordance with the Policy on Green Procurement.
  • Copy paper purchased by Statistics Canada contains a minimum of 30% recycled content and has a forest certification, ECOLOGO certification or equivalent certification.
Link to the department's Program Inventory:
  • Economic and Environmental Statistics
  • Socio-economic Statistics
  • Censuses
  • Centres of Expertise
  • Cost-Recovered Statistical Services
  • Internal Services

FSDS contributing action: Support for green procurement will be strengthened, including guidance, tools and training for public service employees
Corresponding departmental action(s):
  • Ensure that decision makers and materiel management and procurement specialists have the necessary training and awareness to support green procurement.
  • Ensure that key officials include support for and contributions to the Government of Canada's Policy on Green Procurement objectives. (See table note 1.)
Contribution by each departmental action to the FSDS goal and target: Motivate suppliers to green their goods, services and supply chain.
Starting point(s), target(s) and performance indicator(s) for departmental actions:
  • 100% of specialists in procurement and materiel management have completed training on green procurement.
  • Performance evaluations of managers and functional heads of procurement and materiel management include support for and contributions to green procurement in the given fiscal year.
 
Table note 1

Reference to performance agreements of procurement materiel management senior officials has been removed. Green Procurement considerations are addressed at the requirements definition phase and have been built into templates each contracting officer must use. The templates are subject to peer review and sectional audit, with monitoring and oversight by the key official.


Gender-based analysis plus

Institutional GBA+ Capacity

Statistics Canada has established a Centre for Gender, Diversity and Inclusion Statistics to report on progress made towards gender equality and address gaps in the availability of disaggregated data and analysis on gender, race, class, sexual orientation, disability and other intersecting identities. The Centre enables data users to easily access and analyze a wealth of statistical information, relevant to the evaluation of programs, policies and initiatives from a gender, diversity and inclusion perspective.

Statistics Canada is committed to creating not only a diverse and inclusive workforce, but a safe place for all employees. Statistics Canada's Diversity and Inclusion Framework and goals include:

  1. Have a workforce that is representative of the Canadian population
  2. Attract and retain a talented, skilled and diverse workforce
  3. Understand inclusion barriers within the agency
  4. Create barrier free processes, policies, practices and programs
  5. Track progress and measure results

In 2021-22, the agency will continue to implement the Equity, Diversity and Inclusion Action Plan and support progress in areas under five pillars: recruitment, development, increasing awareness, visible leadership and accountability, and accessibility. A few examples of the Equity, Diversity and Inclusion actions include:

  • Increase hiring of racialized employees for all new recruitment, promotions and acting assignments for executive positions
  • Calculate projected gaps to inform potential hiring goals and staffing strategies with hiring managers
  • Develop an accountability mechanism for Champions
  • Develop and put in place a mandatory pledge for support and engagement from management in relation to diversity and inclusion
  • Identify and implement diversity and inclusion mandatory training for all employees

Statistics Canada plans to continue to diversify its hiring practices and staffing processes to ensure they are inclusive and accessible to all Canadians. As part of its commitment to inclusivity and accessibility, the agency is amending questions in the hiring process to give flexibility to candidates who may have been out of the workforce for some time, removing barriers at the outset of the hiring process, and encouraging women to apply to male-dominated fields where they are under-represented.

In addition, Statistics Canada is committed to tracking employment equity gaps and in 2021-22 a new Dashboard for management will be available with more indicators, giving management a better idea of the retention rate, promotion rate, and other key information regarding employment equity and diversity and inclusion. This will better equip management to understand and address the gaps in their division.


Highlights of GBA+ Results Reporting Capacity by Program

Economic and Environmental Statistics

National Economic Accounts Program:

To support the economic participation and prosperity framework, the human resources modules for the Infrastructure Accounts and selected satellite accounts (natural resources, environment) contain detailed breakdowns for men and women. For 2021-22, work is underway to link the labour productivity program to labour force characteristics which will provide further GBA+ insights. As well, estimates of the value of unpaid household work, for women and men, will be updated, thereby contributing to GBA+ and supporting analysis related to gender inequalities in Canada.

Corporations Return Act:

Data collected under this Act were used to construct a gender database for corporate Canada. Coupled with additional information, the database provides insight on gender distribution within the senior ranks of corporate Canada and therefore informs the Gender Results Framework pillar Leadership and Democratic Participation. Thus far the project has provided research and further insights into gender representation of decision makers within the corporate sector in Canada. For 2021-22, the agency would like to widen the scope of the project to include a broader diversity lens, especially within the immigrant population. This initiative will pursue a research agenda focusing on Canadian corporate sector developments in the area of diversity within the existing GBA+ framework. The goal is to table a report by March 31, 2022 that outlines the findings and proposes potential future initiatives for analytical studies and statistical outputs.

COVID-19

New disaggregated data based on gender will be released, particularly as they relate to the impact of the economic downturn in the context of the pandemic.


Socio-economic Statistics

With a joint goal to increase knowledge and literacy under five of the pillars of the Gender Results Framework (GRF) (Economic Participation and Prosperity; Poverty Reduction, Health and Well-being; Leadership and Democratic Participation; Education and Skills Development; Gender-based Violence and Access to Justice), the Department for Women and Gender Equality (WAGE) has engaged Statistics Canada to address important gaps in the availability of data and analysis related to gender, age, sexual orientation, disability, ethnocultural characteristics and their intersecting identities. Among projects supported by WAGE, the Centre will produce a report on the assessment of adding intersectionality to the Gender Results Framework indicators. The Centre will also release analytical products to inform on Canada's diversity: an analytical series on the Lesbian, Gay and Bisexual (LGB) population, including an article on the linguistic and ethnocultural diversity among lesbian, gay and bisexual Canadians, and a paper on the sociodemographic profile of women living in rural and remote areas of Canada (including immigrant status, Indigenous identity and ethnocultural characteristics). Further, a greater emphasis was placed on disaggregating data as much as possible, so all papers will include as much information on diverse population groups as the data will allow.

Since the onset of the pandemic, the Centre released a number of articles on the impact of COVID-19 on diverse population groups, including mental health and gender, mental health for population groups designated as visible minorities, the impact of COVID-19 on LGBTQ2+ Canadians and parenting through the pandemic. An article on statistical standards used to disaggregate data was also released. The Centre has also modified the Gender, Diversity and Inclusion Statistics Hub to highlight the COVID-19 articles that used disaggregated data and also organized them by diverse population group.

The Centre has also been developing a standard for measuring sexual orientation. The first round of consultations took place in winter 2020 with experts within the federal government, academia and community organizations. The next phase took place in summer 2020 where 17 focus groups were conducted. Since late January 2021, the proposed standards have been available for review by the public. A final round of qualitative testing will take place in late March and a final report with recommendations will be prepared in the first quarter of 2021-2022.

Social Inclusion Framework

With funding from Canada's Anti-Racism strategy, Diversity and Sociocultural Statistics will continue to work on the development of its conceptual framework on social inclusion, including a large number of social inclusion indicators based on 2016 Census data (to be later updated based on 2021 Census data) and other survey data such as the General Social Survey. These indicators will be presented on the Gender, Diversity and Inclusion Statistics (GDIS) Hub using a new disaggregated classification of ethnocultural groups that combines the population group question with the ethnic and cultural origin question. Indicators are currently in production, and a new interactive tool to present them on the GDIS Hub is in development through what is now called The Social Indicators Visualization Project.

General Social Survey on Social Identity

Work is also being undertaken with regard to the collection and dissemination of ethno-cultural statistics. For example, with support from Heritage Canada, the new cycle of the General Social Survey on Social Identity will allow for the disaggregation of some specific ethno-cultural groups, providing more data and more targeted policy analysis on the experiences of some ethno-cultural groups for most provinces/provincial regions of Canada.

Labour Force Survey

Starting with the July 2020 reference month, the Labour Force Survey began collecting information on visible minority status, which can be used to report on the labour market activities of persons belonging to population groups designated as visible minorities.

Justice Statistics

The Canadian Centre for Justice and Community Safety Statistics has released a number of articles and reports on gender-based violence. In addition, many projects are underway to report on the experiences of diverse population groups. Two such projects are:

  1. Statistics Canada and the Canadian Association of Chiefs of Police publicly announced a commitment to work with the policing community and key organizations to add Indigenous identity and ethno-cultural groups to police-reported crime data. This will help inform issues of system inequities and shine light on the experiences of these populations.
  2. A collaborative project with the Government of Saskatchewan was undertaken to respond to the growing need to better understand the pathways individuals take through and, often back into, the justice system. This includes understanding how certain population groups, such as Indigenous peoples, may be more vulnerable to repeat contacts with the system.

Census

2021 Census

Various ethnocultural concepts, such as immigration, language groups, ethnic origins, population groups designated as visible minorities and religion will be measured on the 2021 Census. The data will provide detailed and granular disaggregation of data on population groups designated as visible minorities. In addition, Statistics Canada is consulting with experts and data users with the objective of developing a more disaggregated classification of groups designated as visible minorities for dissemination and analytical purposes using Census data. This new classification (2021 Census derived variable) combines information from the question on population groups with information from the ethnic and cultural origin question.


Cost-recovered Statistical Services

Cost Recovery projects are reflected throughout the programs mentioned.

For example, a portion of the work being done to address important data gaps, in collaboration with the Department for Women and Gender Equity (WAGE), is a cost-recovery program.


Centre of Expertise

Economic Analysis Projects

Data has been collected and shared on Private Enterprises by gender of primary owner, age of primary owner and enterprise size. In 2021-22, the agency plans to undertake projects that update and expand its capacity to report on gender and diversity. Specifically, the statistics on Private Enterprises by gender of primary owner will be updated to the latest period possible (2018), and research projects will be undertaken to examine human capital by gender, gross domestic product by gender, the performance gaps between women-owned and men-owned enterprises, as well as Black business owners and persons with disabilities and business ownership in Canada.


Internal Services

Statistics Canada’s Employment Equity, Diversity and Inclusion team supports two different pillars of the Gender Results Framework:

Economic Participation and Prosperity: Increased labour market opportunities for women, especially women in under-represented groups.

Leadership and Democratic Participation: More women in senior management positions and more diversity in senior leadership positions.

Initiatives are targeted towards all employment equity groups, including Indigenous peoples, members of population groups designated as visible minorities and persons with disabilities.

Here are some of the initiatives the agency is currently working on that support the pillars and goals of the Gender Results Framework:

  • Increase diversity in staffing process. We are working on adding a paragraph to questions during the hiring process to give the flexibility for candidates to use their own personal experience rather than only work-related experience to meet the merit criteria.
  • Review of tools (Track Record) for our staffing team in order to remove barriers in the hiring process and be more diverse/inclusive at the outset.
  • Add specific paragraphs encouraging women to apply to under-represented fields or male dominated fields, such as IT.
  • Add a column in the screening board report to identify people that have self-declared during the process. We will be able to see if more women have applied to the jobs that had specific wording to encourage them to apply and self-declare with the new column on the screening board report.
  • Create and implement a new Dashboard for management with employment equity (EE) data and more indicators than only the gaps and the Work Force Availability. It will also give a better idea of the retention rate, promotion rate, and will contain official language information and other key information regarding EE, diversity and inclusion in order to better identify and address the gaps in their division.
  • Review the Self-ID form. The revamp will focus on the gender identity section, so that it includes more than just male or female. We will be able to collect more accurate data on gender, including for Two-Spirit and trans employees, and see where they are situated in the agency, for example, at the executive level.
  • Establish an integrated approach to development and talent management for career progression for equity-seeking groups, e.g. through mentoring, coaching, and sponsorship by senior leaders.
  • Partner with educational and training institutions to provide a direct pathway into public service jobs for Indigenous peoples in occupations and departments in which they are under-represented.

Corporate information

Organizational profile

Appropriate minister(s): The Honourable François-Philippe Champagne, P.C., M.P.

Institutional head: Anil Arora

Ministerial portfolio: Innovation, Science and Economic Development

Enabling instrument(s):

Year of incorporation / commencement: The Dominion Bureau of Statistics was established in 1918. In 1971, with the revision of the Statistics Act, the agency became Statistics Canada.

Other: Under the Statistics Act, Statistics Canada is required to collect, compile, analyze, abstract and publish statistical information relating to the commercial, industrial, financial, social, economic and general activities and condition of the people of Canada.

Statistics Canada has two primary objectives:

  • to provide statistical information and analysis of the economic and social structure and functioning of Canadian society, as a basis for developing, operating and evaluating public policies and programs; for public and private decision-making; and for the general benefit of all Canadians
  • to promote the quality, coherence and international comparability of Canada’s statistics through collaboration with other federal departments and agencies, with the provinces and territories, and in accordance with sound scientific standards and practices.

Statistics Canada’s head office is located in Ottawa. There are regional offices across the country in Halifax, Sherbrooke, Montréal, Toronto, Sturgeon Falls, Winnipeg, Edmonton and Vancouver. There are also 33 research data centres located throughout the country. These centres provide researchers with access to microdata from population and household survey programs in a secure university setting. Canadians can follow the agency on Twitter, Facebook, Instagram, Reddit, feeds and YouTube.

Raison d'être, mandate and role: who we are and what we do

“Raison d’être, mandate and role: who we are and what we do” is available on the Statistics Canada website.

For more information on the agency’s organizational mandate letter commitments, see the Minister’s mandate letter.

Operating context

A developed, democratic country such as Canada requires vast amounts of information to function effectively. Statistics provide Canadians with vital information to help monitor inflation, promote economic growth, plan cities and roads, adjust pensions, and develop employment and social programs. They help governments, businesses and individuals make informed decisions.

The value placed on data by every segment of society is growing at an exponential pace. At the same time, new tools and new computing power are emerging and multiplying the volume and types of information available.

As the demand for information increases along with its importance and availability, privacy concerns, call-screening technology and the busy lives of Canadians are making it harder to reach and obtain information from households. As a result, the agency is continually seeking out new and innovative approaches to meet emerging data needs.

As it innovates and modernizes, the agency will be well positioned to play a more active role in guiding and shaping this information age.

Reporting framework

The Statistics Canada approved Departmental Results Framework and Program Inventory for 2020–21 are as follows.

  • Core Responsibility: Statistical Information
    Statistics Canada produces objective high-quality statistical information for the whole of Canada. The statistical information produced relates to the commercial, industrial, financial, social, economic, environmental and general activities and conditions of the people of Canada.
    • Result 1: High quality statistical information is available to all Canadians.
      • Indicator 1: Number of post-release corrections due to accuracy.
      • Indicator 2: Percentage of international standards with which Statistics Canada conforms.
      • Indicator 3: Number of statistical products available on the website.
      • Indicator 4: Number of Statistics Canada data tables available on the Open Data Portal.
    • Result 2: High quality statistical information is accessed by Canadians.
      • Indicator 1: Number of visits to Statistics Canada website.
      • Indicator 2: Number of interactions on social media.
      • Indicator 3: Percentage of website visitors that found what they were looking for.
    • Result 3: High quality statistical information is relevant to Canadians.
      • Indicator 1: Percentage of users satisfied with statistical information.
      • Indicator 2: Number of media citations on Statistics Canada data.
      • Indicator 3: Number of journal citations.
  • Internal Services

Program Inventory

  • Economic and Environmental Statistics
  • Socio-economic Statistics
  • Censuses
  • Cost-Recovered Statistical Services
  • Centres of Expertise

Internal Services: planned results

Description

Internal Services are those groups of related activities and resources that the federal government considers to be services in support of Programs and/or required to meet corporate obligations of an organization. Internal Services refers to the activities and resources of the 10 distinct services that support Program delivery in the organization, regardless of the Internal Services delivery model in a department. These services are:

  • Management and Oversight Services
  • Communications Services
  • Legal Services
  • Human Resources Management Services
  • Financial Management Services
  • Information Technology Services
  • Real Property Management Services
  • Materiel Management Services
  • Acquisition Management Services.

Planning highlights

Statistics Canada's internal services will continue to evolve to meet the changing context by focusing on the agency's COVID-19 response, processes, controls and oversight practices. As the government continues to address public health and economic challenges, the agency's enabling corporate and internal services will provide support and solutions to meet business and employee needs. Decision making will be informed by a data infrastructure that continues to be more integrated, providing timely insights to foster the agency's cultural values and accountability for outcomes. Internal services will keep providing more user-centric and efficient services.

COVID-19 response

Over the last several months, the agency has prioritized its response to COVID-19, which entails delivering mission-critical programs while maintaining the safety and health of employees. As the situation continues to stabilize, the agency will remain focused on supporting employees adjusting to a new reality. This will include revising return-to-office plans, including Occupational Health and Safety programs to ensure that employees working remotely and on-site—whether in offices, the field or research data centres—are safe. The agency will also support employees by providing existing mental health and wellness training online and launching new initiatives in response to employee pulse survey results.

Internal services expertise and support will also enable the 2021 Census to be conducted successfully, with practices adapted for the pandemic.

Gender equity, diversity and inclusion

Over the next year, the agency will deliver on an ambitious gender equity, diversity and inclusion agenda, encompassing accessibility and official languages. Statistics Canada will continue to implement its diversity and inclusion action plan and support progress in five areas: recruitment, development, awareness-raising, visible leadership and accountability, and accessibility. A multi-year, multi-phase accessibility roadmap will be created. Many of the planned actions will be undertaken through focus groups held with groups designated as visible minorities and Indigenous people. The agency will also continue co-developing an accessibility index with the Office of Public Service Accessibility and other key departments and agencies.

Skill sets and talent management

Employee and manager learning and development programs will be emphasized. Required skills, for now and the future, will continue to be identified, and this will include piloting an approach for employees to self-identify skills and areas of interest. The objective is to identify existing skills and areas for development within the agency and allocate employee skill sets to agency priorities in an agile way.

The agency will also focus on talent management for all levels and will implement a leadership development program for executives. Furthermore, a data-driven approach to performance management will be established to make the agency's approach to performance management ratings and results-based management more consistent.

Leveraging data analytics

The agency's corporate services will continue to experiment with new ways of leveraging data analytics to inform decision-making. For instance, in 2021–22, data from Statistics Canada's human resources analytics tool will be expanded to include recent results from the Public Service Employee Survey and internal pulse surveys. Additionally, different corporate service data sources will begin to be integrated to generate new solutions to business problems, such as facilitating employees' return to work. These enhancements will strengthen the agency's business intelligence posture by providing critical and timely information to managers. Furthermore, the agency will continue to work with government organizations to improve their data analytics capacity and develop indexes for priority areas such as accessibility.

Transformation of processes

To ensure effective stewardship of public resources, as well as proactive and agile processes, the agency will continue to transform its processes in 2021–22 by developing coherent corporate business planning frameworks. The frameworks will support strategy-setting and investment, planning, and the use of performance indicators to monitor progress. They will ensure the agency is effectively meeting its objectives.

Internal audit and evaluation

In 2021–22, the Audit and Evaluation Branch will conduct audits and evaluations to yield insight into the appropriateness of decision-making and governance structures and processes that enable the agency's employees to operate effectively within a strong management regime. Moreover, evaluations with a user-centric focus will assess the relevance of programs. This will provide insight into the degree to which user needs are being considered and met in program design and delivery.

Governance

Statistics Canada is continuing to strengthen its governance structure by implementing a principled performance model. The governance team will ensure timely, relevant, actionable and integrated enterprise data are available to support evidence-based decision-making. The agency will also continue to operationalize the senior committees and will formalize additional principal officers (P-suite) roles for executive officers to further strengthen the horizontal perspective. The P-suite will have clearly defined compliance management roles, aligned with corporate risks, and will regularly assess adherence to policy requirements and inform senior management of required adjustments. In the coming year, the agency will also update its foundational framework, processes and procedures for governing instruments, while ensuring horizontal standardization.

Digital solutions

As the lead on one of the Government of Canada's cloud pathfinder projects, Statistics Canada is uniquely positioned to explore, develop and adopt new technologies. The agency will continue to draw on its talent as it charts a way forward for technology in government. This will eventually affect how the Government of Canada does business and will have a positive impact on the lives of Canadians.

Adopting cloud services is a crucial part of the agency's modernization efforts. Most existing technological solutions are migrating to the agency's secure cloud environment, and most new solutions are being developed in the cloud and are positioned for successful production deployment. The transition to the cloud environment will provide more agility to scale infrastructure up or down as needs change, and more robustness through added redundancy and fail-safe solutions.

The innovative Data Analytics as a Service platform, which leverages cloud delivery services, has been accessible for external user feedback using public data. It became an integral part of Statistics Canada's response to the pandemic, increasing access to critical data such as those on the PPE dashboard. Over the next year, more data, beyond what Statistics Canada collects, will be added to and integrated into the platform. This will facilitate the research needed to arrive at meaningful insights and support evidence-based decisions.

Planned budgetary financial resources for Internal Services
2021–22 budgetary spending (as indicated in Main Estimates) | 2021–22 planned spending | 2022–23 planned spending | 2023–24 planned spending
66,905,037 | 66,905,037 | 65,930,587 | 65,977,108
Note: Main Estimates, planned spending and full-time equivalent figures do not include Budget 2021 announcements. More information will be provided in the 2021–22 Supplementary Estimates and Departmental Results Report, as applicable.
Planned human resources for Internal Services
2021–22 planned full-time equivalents | 2022–23 planned full-time equivalents | 2023–24 planned full-time equivalents
563 | 546 | 546
Note: Main Estimates, planned spending and full-time equivalent figures do not include Budget 2021 announcements. More information will be provided in the 2021–22 Supplementary Estimates and Departmental Results Report, as applicable.

Spending and human resources

This section provides an overview of the department's planned spending and human resources for the next three consecutive fiscal years and compares planned spending for the upcoming year with the current and previous years' actual spending.

Planned spending

Departmental spending 2018–19 to 2023–24

The following graph presents planned (voted and statutory) spending over time.

Departmental spending graph (thousands of dollars)
Fiscal year | Total | Voted | Statutory | Cost recovery (netted revenue)
2018–19 | 507,744 | 438,134 | 69,610 | 124,201
2019–20 | 546,950 | 473,759 | 73,190 | 120,038
2020–21 | 631,926 | 552,084 | 79,842 | 113,157
2021–22 | 802,331 | 721,223 | 81,107 | 120,000
2022–23 | 512,533 | 440,480 | 72,053 | 120,000
2023–24 | 462,495 | 396,555 | 65,940 | 120,000
Note: Main Estimates, planned spending and full-time equivalent figures do not include Budget 2021 announcements. More information will be provided in the 2021–22 Supplementary Estimates and Departmental Results Report, as applicable.
Budgetary planning summary for core responsibilities and Internal Services (dollars)
The following table shows actual, forecast and planned spending for each of Statistics Canada's core responsibilities and for Internal Services for the years relevant to the current planning year.
Core responsibilities and Internal Services | 2018–19 expenditures | 2019–20 expenditures | 2020–21 forecast spending | 2021–22 budgetary spending (as indicated in Main Estimates) | 2021–22 planned spending | 2022–23 planned spending | 2023–24 planned spending
Statistical Information | 559,559,344 | 584,770,894 | 665,615,857 | 855,425,655 | 855,425,655 | 566,602,643 | 516,517,426
Internal Services | 72,385,465 | 82,217,225 | 79,467,863 | 66,905,037 | 66,905,037 | 65,930,587 | 65,977,108
Total gross expenditures | 631,944,809 | 666,988,119 | 745,083,720 | 922,330,692 | 922,330,692 | 632,533,230 | 582,494,534
Respendable revenue | -124,200,719 | -120,038,495 | -113,157,338 | -120,000,000 | -120,000,000 | -120,000,000 | -120,000,000
Total net expenditures | 507,744,090 | 546,949,624 | 631,926,382 | 802,330,692 | 802,330,692 | 512,533,230 | 462,494,534
Note: Main Estimates, planned spending and full-time equivalent figures do not include Budget 2021 announcements. More information will be provided in the 2021–22 Supplementary Estimates and Departmental Results Report, as applicable.

Statistics Canada is funded by two sources: direct parliamentary appropriations and cost-recovery activities. Statistics Canada has the authority to generate $120 million annually in respendable revenue, related to two streams: statistical surveys and related services, and custom requests and workshops. If exceeded, a request can be made to increase the authority, as was the case in 2018–19 and 2019–20.

In recent years, respendable cost-recovery revenue has contributed between $113 million and $124 million annually to the agency's total resources. A large portion of this respendable revenue comes from federal departments to fund specific statistical projects.

Spending fluctuations between the years shown in the graph and table above were mainly caused by the Census Program. Voted spending decreased in 2018–19 as the 2016 Census of Population and 2016 Census of Agriculture were winding down. This pattern is typical for the agency because of the cyclical nature of the Census Program. Spending will begin to ramp up and peak again in 2021–22 when the 2021 Census of Population and 2021 Census of Agriculture are conducted followed by a significant decrease in subsequent years as these activities wind down.
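
As a rough illustration of the cyclical pattern described above, the following Python sketch computes year-over-year changes in total net expenditures. The figures are copied from the budgetary planning summary table; the variable names are illustrative assumptions and not part of the Departmental Plan.

```python
# Total net expenditures (dollars), copied from the budgetary planning
# summary table. Variable names are illustrative assumptions.
net_expenditures = {
    "2018–19": 507_744_090,
    "2019–20": 546_949_624,
    "2020–21": 631_926_382,
    "2021–22": 802_330_692,
    "2022–23": 512_533_230,
    "2023–24": 462_494_534,
}

years = list(net_expenditures)

# Change in each year relative to the year before it.
changes = {
    year: net_expenditures[year] - net_expenditures[prev]
    for prev, year in zip(years, years[1:])
}

# Fiscal year with the highest net expenditures.
peak_year = max(net_expenditures, key=net_expenditures.get)

for year, change in changes.items():
    print(f"{year}: {change:+,} dollars versus the previous year")
print(f"Peak year: {peak_year}")
```

Under these figures, the largest increase and the peak both fall in the census year, followed by a sharp decrease the year after.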

Internal Services spending from 2018–19 to 2020–21 includes planned resources from temporary funding related to a new initiative approved in 2018–19 to migrate the agency's infrastructure to the cloud.

For additional details on year-over-year variances between 2018–19 and 2019–20 expenditures, see the 2019–20 Departmental Results Report.

2021–22 budgetary planned gross spending summary (dollars)
The following table reconciles gross planned spending with net planned spending for 2021–22.
Core responsibilities and Internal Services | 2021–22 planned gross spending | 2021–22 planned gross spending for specified purpose accounts | 2021–22 planned revenues netted against expenditures | 2021–22 planned net spending
Statistical Information | 855,425,655 | 0 | -120,000,000 | 735,425,655
Internal Services | 66,905,037 | 0 | 0 | 66,905,037
Total | 922,330,692 | 0 | -120,000,000 | 802,330,692
Note: Main Estimates, planned spending and full-time equivalent figures do not include Budget 2021 announcements. More information will be provided in the 2021–22 Supplementary Estimates and Departmental Results Report, as applicable.

Statistics Canada has the authority to generate $120 million annually in respendable revenue, which is reflected in the 2021–22 planned revenues netted against expenditures.
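
The reconciliation of gross to net planned spending for 2021–22 can be checked with a minimal Python sketch. It assumes that net spending equals gross spending plus the (negative) revenues netted against expenditures; the specified purpose accounts column is zero in the table and is left out. The dictionary layout and function name are hypothetical.

```python
# Figures (dollars) copied from the 2021–22 budgetary planned gross spending
# summary. The data layout and function name are illustrative assumptions.
planned_2021_22 = {
    "Statistical Information": {"gross": 855_425_655, "netted_revenue": -120_000_000},
    "Internal Services": {"gross": 66_905_037, "netted_revenue": 0},
}


def net_spending(row):
    """Net planned spending: gross spending plus (negative) netted revenues."""
    return row["gross"] + row["netted_revenue"]


net_by_line = {name: net_spending(row) for name, row in planned_2021_22.items()}
total_gross = sum(row["gross"] for row in planned_2021_22.values())
total_net = sum(net_by_line.values())

for name, value in net_by_line.items():
    print(f"{name}: {value:,}")
print(f"Total gross: {total_gross:,}")
print(f"Total net: {total_net:,}")
```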

Planned human resources

The following table shows actual, forecast and planned full-time equivalents (FTEs) for each core responsibility in Statistics Canada's Departmental Results Framework and for Internal Services for the years relevant to the current planning year.

Human resources planning summary for core responsibilities and Internal Services
Core responsibilities and Internal Services | 2018–19 actual FTEs | 2019–20 actual FTEs | 2020–21 forecast FTEs | 2021–22 planned FTEs | 2022–23 planned FTEs | 2023–24 planned FTEs
Statistical Information | 5,498 | 5,595 | 5,863 | 6,026 | 5,065 | 4,644
Internal Services | 645 | 626 | 615 | 563 | 546 | 546
Total gross FTEs | 6,143 | 6,221 | 6,478 | 6,589 | 5,611 | 5,190
Respendable revenue | -1,380 | -1,366 | -1,265 | -1,231 | -1,241 | -1,289
Total net FTEs | 4,763 | 4,856 | 5,212 | 5,358 | 4,370 | 3,901
Note: Main Estimates, planned spending and full-time equivalent figures do not include Budget 2021 announcements. More information will be provided in the 2021–22 Supplementary Estimates and Departmental Results Report, as applicable.

Similar to trends seen in planned spending, FTE changes from year to year are largely explained by the cyclical nature of the Census Program. Activity decreased in 2018–19 as the 2016 Census of Population and 2016 Census of Agriculture were winding down. Activity will begin to ramp up and peak again in 2021–22 when the 2021 Census of Population and 2021 Census of Agriculture are conducted.

Included in net expenditure FTEs are approximately 210 public servant FTEs based across Canada outside the National Capital Region (NCR). Also included are approximately 950 interviewer FTEs (representing approximately 1,800 interviewers) outside the NCR. These interviewers are part-time workers with assigned workweeks that are determined by the volume of collection work available; they are hired under the Statistics Act, by the authority of the Minister of Innovation, Science and Industry. Interviewers are covered by two separate collective agreements and are employed through Statistical Survey Operations. Many of Statistics Canada's main outputs rely heavily on data collection and the administration of these activities, which takes place in the regions.

Estimates by vote

Information on Statistics Canada's organizational appropriations is available in the 2021–22 Main Estimates.

Future-oriented condensed statement of operations

The future-oriented condensed statement of operations provides an overview of Statistics Canada's operations for 2020–21 to 2021–22.

The amounts for forecast and planned results in this statement of operations were prepared on an accrual basis. The amounts for forecast and planned spending presented in other sections of the Departmental Plan were prepared on an expenditure basis. Amounts may therefore differ.

A more detailed future-oriented statement of operations and associated notes, including a reconciliation of the net cost of operations to the requested authorities, are available on Statistics Canada's website.

Future-oriented condensed statement of operations for the year ending March 31, 2022 (dollars)
Financial information | 2020–21 forecast results | 2021–22 planned results | Difference (2021–22 planned results minus 2020–21 forecast results)
Total expenses | 867,639,406 | 1,048,174,102 | 180,534,696
Total revenues | 113,157,338 | 120,000,000 | 6,842,662
Net cost of operations before government funding and transfers | 754,482,068 | 928,174,102 | 173,692,034

The increase in planned expenses for 2021–22 is mainly explained by the approved funding to be received for the 2021 Census of Population and Census of Agriculture.
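
The arithmetic behind the condensed statement of operations can be sketched in Python as follows. The sketch assumes that the net cost of operations is total expenses minus total revenues, which is consistent with the figures in the table; the variable and function names are hypothetical.

```python
# Figures (dollars) copied from the future-oriented condensed statement of
# operations. Names are illustrative assumptions.
forecast_2020_21 = {"expenses": 867_639_406, "revenues": 113_157_338}
planned_2021_22 = {"expenses": 1_048_174_102, "revenues": 120_000_000}


def net_cost(results):
    """Net cost of operations before government funding and transfers."""
    return results["expenses"] - results["revenues"]


# Difference column: 2021–22 planned results minus 2020–21 forecast results.
difference = {
    key: planned_2021_22[key] - forecast_2020_21[key]
    for key in forecast_2020_21
}
net_cost_difference = net_cost(planned_2021_22) - net_cost(forecast_2020_21)

print(f"Change in expenses: {difference['expenses']:,}")
print(f"Change in revenues: {difference['revenues']:,}")
print(f"Change in net cost of operations: {net_cost_difference:,}")
```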

Statistics Canada expects to maintain its capacity in future years to deliver cost-recovered statistical services, with no significant shifts in resources.