The Open Database of Cultural and Art Facilities (ODCAF)
Metadata document: concepts, methodology and data quality

Version 1.0

Data Exploration and Integration Lab (DEIL)
Centre for Special Business Projects (CSBP)

October 2, 2020

Table of Contents

  1. Overview
  2. Data Sources
  3. Reference Period
  4. Target Population
  5. Compilation Methodology
  6. Database Coverage
  7. Data Quality
  8. Data Dictionary
  9. Contact Us

1. Overview

This experimental Open Database of Cultural and Art Facilities (ODCAF) is one of a number of datasets being created as part of the Linkable Open Data Environment (LODE). The LODE is an exploratory initiative of the Data Exploration and Integration Lab (DEIL) at Statistics Canada. It aims to enhance the use, accessibility and harmonization of open data from authoritative sources by providing a collection of datasets released under a single licence, as well as open-source code to link these datasets together. This initiative is also meant to explore open data for official statistics and to support geospatial research across various domains. The LODE datasets and code are available through the Statistics Canada website and can be found at: Linkable Open Data Environment.

The ODCAF is a database of cultural and art facilities released as open data. Data sources include various levels of government within Canada and professional associations. This document details the process of collecting, compiling, and standardizing the individual datasets of cultural and art facilities that were used to create the ODCAF. The ODCAF is made available under the Open Government Licence – Canada.

In its current version (Version 1.0), the ODCAF contains approximately 8,000 individual records. The database is expected to be updated periodically as new open datasets become available. The ODCAF is provided as a compressed comma separated values (CSV) file.

2. Data Sources

Multiple data sources were used to create the ODCAF. The sources used are detailed in a 'Data Sources' CSV file located within the zipped data folder available for download on the ODCAF webpage. The links to the original datasets, licences or terms of use, attribution statements and additional notes are also included in the Data Sources CSV file. For further information on the individual licences, users should consult the information provided on the open data portals of the various data providers directly. In addition to openly licensed databases, the ODCAF also includes a publicly available listing of cultural and art facilities.

The distinction between open and other publicly available data is based on the licensing terms (explicit or implicit) attached to each source dataset used. Open data licences permit, in varying degrees, use for any lawful purpose, redistribution (re-sharing), and modification and re-packaging of the data. However, open data licences can impose some restrictions, such as attribution of the original source, share-alike (re-sharing only under like conditions), and no commercial use. Examples of open data licences are Creative Commons, MIT, GPLv3, and Canada's Open Government Licence. In general, no warranty is provided and only minor conditions are stipulated by the provider.

Publicly available data that are not open data might be associated with proprietary licensing or terms of use that may restrict some of the aspects that would otherwise be permitted under open data licensing.

3. Reference Period

The Data Sources CSV provides, when known, either the update frequency or the date each underlying dataset was last updated by the provider (this information was collected at the time each dataset was accessed for this project). Additionally, the Data Sources CSV provides the date each dataset used in the ODCAF was downloaded or provided by the source organization. Data were gathered between January 2020 and July 2020. Users are cautioned that the download date should not be taken as the reference date of the data. To obtain specific information concerning the reference dates of the source datasets, users should contact the relevant data providers directly.

4. Target Population

For the purposes of the ODCAF database, cultural and art facilities are facilities wherein the primary activity is of a cultural nature or is related to the arts. The target population includes only brick-and-mortar cultural and art facilities that offer programs or services to the general public.

In terms of the North American Industry Classification System (NAICS), the facilities in the ODCAF are primarily in the following sub-sectors:

  • 711 - Performing arts, spectator sports and related industries
  • 712 - Heritage institutions

Facilities are included when their primary activities have a cultural or arts character, regardless of the source of funding, private or public status, operator type, location or other attributes. However, facilities that are not open to the general public and those that are primarily commercial in nature are not included. Thus, a theatre that offered ballet performances would be in scope, while a ballet school that offered training and performances only to paying students would not.

5. Compilation Methodology

This section provides an overview of the processing done to compile the ODCAF.

Data Standardization and Cleaning

The first processing component for compiling the ODCAF database comprised reformatting the source data to CSV format and mapping the original dataset attributes to standard variable (field) names. This was done using a version of the custom OpenTabulate software developed by the LODE team. A data dictionary of the variables used is provided in section 8.
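
The configuration files used by OpenTabulate are not reproduced here, but the following sketch, using the Python package pandas and a hypothetical column mapping for a single source, illustrates the kind of renaming this step performs.

    import pandas as pd

    # Hypothetical mapping from one provider's column names to ODCAF variable names;
    # the actual mappings are defined per source in OpenTabulate configuration files.
    COLUMN_MAP = {
        "NomInstallation": "Facility_Name",
        "TypeInstallation": "Source_Facility_Type",
        "NoCivique": "Street_No",
        "Rue": "Street_Name",
        "Municipalite": "City",
        "Province": "Prov_Terr",
    }

    def standardize(source_csv: str, provider: str) -> pd.DataFrame:
        """Load one source file, rename its columns to the standard variable
        names, and record the data provider."""
        df = pd.read_csv(source_csv, dtype=str)
        df = df.rename(columns=COLUMN_MAP)
        df["Data_Provider"] = provider
        keep = list(COLUMN_MAP.values()) + ["Data_Provider"]
        return df[[c for c in keep if c in df.columns]]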

Because the source datasets use different classification systems and data attributes, and several processing steps are required to standardize them, there is potential for errors to be introduced.

The methodology and limitations of the techniques used in each data cleaning step are described below. Trivial cleaning techniques, such as the removal of whitespace characters and punctuation, are omitted from the discussion.

Address Parsing

The libpostal address parser, an open-source natural language processing tool for parsing addresses, was used to split concatenated address strings into strings corresponding to address variables, such as street name and street number. Occasionally, addresses were split incorrectly due to unconventional formatting of the original address. While effort was made to identify and correct these entries in the final database, some incorrectly parsed entries may have remained undetected. Exceptions are entries with street numbers in the form of two numbers separated by a hyphen or space. Entries of this form usually indicate either that the address parser incorrectly parsed a numbered street name (e.g., "123 100 ave" is parsed into the street number "123 100" and the street name "ave") or that a unit has not been identified correctly (as in "3-100 main st"). Numbers of this form are automatically separated: the rightmost number is prepended to the street name if the street name is a variant of the word "street" or "avenue"; otherwise, the leftmost number is appended to the unit column.

A limited number of entries were manually edited when it was clear that the parsing had not been done correctly. An example is addresses with hyphenated numbers such as "1035-55 street nw", which may have been interpreted as having a civic number of "1035-55" and a street name of "street nw", rather than a civic number of 1035, and a street name of "55 street nw". While effort was made to ensure that the results are correct, it is possible that the scripts used to process and parse the addresses may unintentionally cause other, undetected, errors. Should any such errors be reported to or detected by the LODE team subsequently, they will be corrected in future versions of the ODCAF.
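
As an illustration of the parsing and fix-up rule described above, the following sketch uses the libpostal Python bindings (pypostal). The handling of hyphenated or space-separated street numbers is a simplified rendering of the rule, not the exact code used to build the ODCAF, and the list of street-name variants is hypothetical.

    import re
    from postal.parser import parse_address  # pypostal bindings for libpostal

    # Street-name variants used by the fix-up rule (illustrative list only)
    STREET_WORDS = {"street", "st", "avenue", "ave", "av"}

    def parse(address: str) -> dict:
        # libpostal returns (value, label) pairs, e.g. ("123", "house_number")
        parsed = {label: value for value, label in parse_address(address)}
        unit = parsed.get("unit", "")
        street_no = parsed.get("house_number", "")
        street_name = parsed.get("road", "")

        # Street numbers of the form "123 100" or "3-100" need a second look
        m = re.fullmatch(r"(\d+)[-\s](\d+)", street_no)
        if m:
            left, right = m.groups()
            if street_name.strip() in STREET_WORDS:
                # e.g. "123 100 ave": the rightmost number belongs to the street name
                street_no, street_name = left, f"{right} {street_name}"
            else:
                # e.g. "3-100 main st": the leftmost number is actually the unit
                unit, street_no = left, right

        return {"Unit": unit, "Street_No": street_no, "Street_Name": street_name,
                "City": parsed.get("city", ""), "Prov_Terr": parsed.get("state", "")}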

Removal of Duplicates

The removal of duplicates was done using both literal and fuzzy string matching on the facility name and street name, conditioned on the street number and province; by "conditioned," it is meant that a fuzzy comparison between two facilities is made provided that the street numbers and provinces agree. The fuzzy comparison is done using the Python package FuzzyWuzzy, which returns a similarity score between 0 and 100 for two strings, where a score of 100 indicates that the shorter string is a sub-string of the larger string. A threshold value for the returned score of the comparison is chosen empirically, indicating when an entry is marked as a duplicate.

If two entries contained identical street number and province information, their street names and facility names were compared. When these were nearly identical (defined as the sum of the similarity scores for the facility names and street names being at least 195 out of a possible 200), the entries were marked as duplicates. Recognized duplicates were deleted without manual intervention. The threshold was chosen close to the maximum score, which minimized the removal of false positives (entries incorrectly flagged as duplicates). When duplicates were found, the record containing more non-empty fields was retained. In total, 2,435 duplicates were removed.
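
A minimal sketch of this comparison is shown below. The specific FuzzyWuzzy scorer is not stated above; fuzz.partial_ratio is shown here as an assumption, because its behaviour matches the description of a score of 100 indicating that the shorter string is contained in the longer one.

    from fuzzywuzzy import fuzz

    THRESHOLD = 195  # out of a possible 200, as described above

    def is_duplicate(rec_a: dict, rec_b: dict) -> bool:
        """Fuzzy comparison of two records, conditioned on street number and province."""
        if (rec_a["Street_No"] != rec_b["Street_No"]
                or rec_a["Prov_Terr"] != rec_b["Prov_Terr"]):
            return False
        name_score = fuzz.partial_ratio(rec_a["Facility_Name"].lower(),
                                        rec_b["Facility_Name"].lower())
        street_score = fuzz.partial_ratio(rec_a["Street_Name"].lower(),
                                          rec_b["Street_Name"].lower())
        return name_score + street_score >= THRESHOLD

    def keep_richer(rec_a: dict, rec_b: dict) -> dict:
        """Of two duplicate records, retain the one with more non-empty fields."""
        def filled(rec):
            return sum(1 for v in rec.values() if str(v).strip())
        return rec_a if filled(rec_a) >= filled(rec_b) else rec_b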

Identification of Invalid Entries

A pair of filters was used to process the data after the address parsing stage. These filters captured entries with invalid postal code or province code information and wrote them to a file separate from the database for further processing. Most of these entries were manually corrected and added back into the database. These two filters were chosen for their ability to detect potential errors in postal codes and province codes.
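
The exact filter rules are not published, but the following sketch illustrates this kind of validation, assuming that postal codes follow the Canada Post "A1A 1A1" pattern and that provinces and territories use the standard two-letter codes.

    import re

    # Canadian postal code pattern (letters D, F, I, O, Q, U excluded, as per Canada Post)
    POSTAL_CODE_RE = re.compile(
        r"^[ABCEGHJ-NPRSTVXY]\d[ABCEGHJ-NPRSTV-Z] ?\d[ABCEGHJ-NPRSTV-Z]\d$", re.I)
    PROV_TERR_CODES = {"AB", "BC", "MB", "NB", "NL", "NS", "NT", "NU",
                       "ON", "PE", "QC", "SK", "YT"}

    def split_invalid(records):
        """Separate entries with invalid postal or province codes so they can be
        written to a separate file for manual review."""
        valid, invalid = [], []
        for rec in records:
            ok = (bool(POSTAL_CODE_RE.match(rec.get("Postal_Code", "").strip()))
                  and rec.get("Prov_Terr", "").strip().upper() in PROV_TERR_CODES)
            (valid if ok else invalid).append(rec)
        return valid, invalid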

Other Data Cleaning Steps

  • Formatting of data entries (removal of excess whitespace and punctuation) and removal of postal codes and province/territory names.
  • Separation, during processing, of entries with incorrectly formatted postal codes or two-letter province/territory codes from the cleaned data, followed by manual editing.

Selection of Record to Retain in Case of Duplicates

In some instances, a facility was present in more than one source. In such cases, the record with the most information available was retained. Where information between sources did not match, validation tools were used to decide which to retain.

Classification Used and Assignment of Cultural and Art Facility Type

The original data sources use a variety of standards, classifications and nomenclature to describe the type of cultural and art facility. Unfortunately, there is no classification for cultural and art facilities in Canada that is used universally. The following classification of cultural and art facilities is used for Version 1.0 of the ODCAF:

  • Arts or cultural centre: Establishments primarily engaged in promoting culture and arts
  • Artist: Individual artists engaged in creating artistic works
  • Festival site: Sites on which arts or cultural festivals are held
  • Gallery: Establishments primarily engaged in the display of artistic works
  • Heritage or historic site: Sites of cultural, artistic, or historic significance
  • Library or archive: Establishments primarily engaged in the display, curation, and sharing of primarily written material such as manuscripts, periodicals, and other items such as maps or images
  • Miscellaneous: Establishments associated in some way with promoting or providing culture or arts that do not fall into any of the above categories
  • Museum: Establishments primarily engaged in the display, curation, and sharing of collections of artifacts, fine arts, and other objects of artistic, cultural, or historical importance
  • Theatre/performance and concert hall: Establishments primarily engaged in the public performance of artistic or cultural works

The classification is intended to have broad categories that are helpful in distinguishing major types of facilities while enabling accurate mapping of source-specific facility types. Facility types are determined from source-specific facility types and source coverage metadata. Assignments are made using keywords and validated afterwards, with manual changes made whenever needed. When facilities were classified based on source metadata, this was done analytically on a case-by-case basis.
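
The keyword lists actually used are not reproduced in this document; the following sketch, with a hypothetical and abbreviated keyword mapping, illustrates the approach.

    # Hypothetical, abbreviated keyword mapping for illustration only
    ODCAF_TYPE_KEYWORDS = {
        "museum": "museum", "musée": "museum",
        "gallery": "gallery", "galerie": "gallery",
        "library": "library or archive", "bibliothèque": "library or archive",
        "archive": "library or archive",
        "theatre": "theatre/performance and concert hall",
        "théâtre": "theatre/performance and concert hall",
        "concert": "theatre/performance and concert hall",
        "heritage": "heritage or historic site", "historic": "heritage or historic site",
        "festival": "festival site",
        "artist": "artist",
        "cultural centre": "arts or cultural centre",
        "arts centre": "arts or cultural centre",
    }

    def assign_odcaf_type(source_type: str, facility_name: str) -> str:
        """Assign one of the nine ODCAF categories from the source facility type,
        falling back to the facility name, then to 'miscellaneous'."""
        for text in (source_type or "", facility_name or ""):
            text = text.lower()
            for keyword, odcaf_type in ODCAF_TYPE_KEYWORDS.items():
                if keyword in text:
                    return odcaf_type
        return "miscellaneous"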

Geocoding and Determination of Census Subdivision

In general, the data included in the ODCAF are what is available from the original sources without imputation. The exception to this is the geocoding and the imputation of CSD names and categories, discussed below.

Census subdivision (CSD) names were derived from two different attributes in the data.

The first attribute comprises the geographic coordinates, namely latitude and longitude. These are placed into the corresponding CSDs by linking the coordinate points to the CSD polygons through a spatial join operation using the Python package GeoPandas.

The second attribute is the city name: literal string matching was done between each cultural and art facility's municipality name and a list of CSD names. City names with at least ten entries that did not receive a CSD name through this process were manually assigned a CSD name using Place Names in GeoSuite.
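
A sketch of this two-step assignment is shown below. It assumes a CSD boundary layer with CSDNAME and CSDUID columns, as in the census boundary files, and facility coordinates expressed in latitude and longitude (WGS84); the manual GeoSuite assignments described above are not shown.

    import geopandas as gpd
    import pandas as pd

    def assign_csd(facilities: pd.DataFrame, csd: gpd.GeoDataFrame) -> pd.DataFrame:
        # Step 1: spatial join of coordinate points to CSD polygons
        points = gpd.GeoDataFrame(
            facilities,
            geometry=gpd.points_from_xy(facilities["Longitude"], facilities["Latitude"]),
            crs="EPSG:4326",
        ).to_crs(csd.crs)
        joined = gpd.sjoin(points, csd[["CSDNAME", "CSDUID", "geometry"]],
                           how="left", predicate="within")

        # Step 2: for facilities still without a CSD, literal matching of the
        # city name against the list of CSD names
        name_lookup = {str(n).lower(): n for n in csd["CSDNAME"]}
        missing = joined["CSDNAME"].isna()
        joined.loc[missing, "CSDNAME"] = (
            joined.loc[missing, "City"].str.lower().map(name_lookup))
        return joined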

Geocoding was carried out for some sources that provide address data but no geo-coordinates. Latitude and longitude were determined and validated using online tools. A subset of the source-provided geo-coordinates was also validated in the same way. Some coordinates were also removed from the original sources when it was determined that they were derived from postal codes or other aggregate geographic areas rather than from the street address.

While efforts have been made to ensure the accuracy of geo-coordinates, no guarantees are implied, and errors and inaccuracies are possible.

Inclusion in the ODCAF of Facility Type Provided in Source Datasets

The facility types as provided in the data sources (e.g., exhibition or cultural centre, community library, centre d'art, etc.) are also included in the ODCAF without any modification, reassignment, or mapping to a uniform classification.

6. Database Coverage

The current version of the ODCAF (Version 1.0) contains approximately 8,000 cultural and art facilities.

As the total number of all cultural and art facilities in the country is not known with a reasonable degree of certainty, the coverage obtained with the sources used was not quantitatively assessed. However, many of the sources purport to list all facilities of a certain type within a jurisdiction. Thus, within these facility type categories and jurisdictions, coverage would be expected to be fairly complete. However, if facilities of a certain category were omitted in a source, then these might be missing from the database, unless they were obtained from a different source.

7. Data Quality

All cultural and art facility data in the ODCAF were collected from government data sources, either from open data portals or publicly available webpages. In general, other than the processing required to harmonize the different sources into one database, the underlying datasets were taken "as is." The accuracy and completeness of the information is in general a function of the source datasets used.

Classifying facilities

Assignment of facility type was largely based on facility types provided by source datasets. In instances where facility type was either unclear or not defined by the source, facility type was classified based on further research or using meta-information, such as name of dataset.

Removing duplicates

Some source datasets do overlap; datasets which cover only a particular type of arts or cultural facility for an entire province, for example, may overlap with data provided only for specific towns. Although deduplication techniques are used, not all duplicates might have been removed. Modifying the deduplication methods to seek out the remaining duplicates would generate numerous false positives, which would require additional manual intervention. Further details are available in the sub-section Removal of duplicates above.

Correcting invalid entries

A few entries with erroneous province/territory names and postal codes were detected and manually corrected. Further details on the identification of erroneous entries are also reported in the sub-section Identification of invalid entries above.

Address parsing

Natural language processing methods were used for parsing and separation of address strings into address variables, such as street number and postal code (which is removed from the final released database). The methods are well regarded in the field for their performance and accuracy but, as with all statistical learning methods, they have limitations. Poor or unconventional formatting of addresses may result in incorrect parsing. At this stage, no further integration with other address sources was attempted; hence, although address records are generally expected to be correct, residual errors may be present in the current version of the database.

8. Data Dictionary

The data dictionary below describes the variables of the ODCAF.

Arts and cultural facilities variables

Variable – Index

Name: Index
Format: String
Source: Internally generated during data processing
Description: Unique number automatically generated during data processing

Variable – Facility Name

Name: Facility_Name
Format: String
Source: Provided as is from original data
Description: Cultural or arts facility name

Variable – Source Facility Type

Name: Source_Facility_Type
Format: String
Source: Provided as is from original data
Description: Facility type chosen by data provider

Variable – ODCAF Facility Type

Name: ODCAF_Facility_Type
Format: String
Source: Imputed from source data or metadata
Description: Facility type assigned from nine ODCAF categories

Location Variables

Variable – Unit Number

Name: Unit
Format: String
Source: Parsed from a full address string or provided as is
Description: Civic unit or suite number

Variable – Street Number

Name: Street_No
Format: String
Source: Parsed from a full address string or provided as is
Description: Civic street number

Variable – Street Name

Name: Street_Name
Format: String
Source: Parsed from a full address string or provided as is
Description: Civic street name

Variable – City

Name: City
Format: String
Source: Parsed from a full address string or provided as is
Description: City or municipality name (certain records may list the neighbourhood name)

Variable – Province/Territory

Name: Prov_Terr
Format: String
Source: Converted to two letter codes (internationally approved) after parsing from a full address string, or provided as is, or indicated by providers
Description: Province or territory name

Variable – Province Unique Identifier

Name: PRUID
Format: Integer
Source: Converted from province code
Description: Province unique identifier

Variable – CSD Name

Name: CSD_Name
Format: String
Source: Imputed from geographic coordinates and city names using GeoSuite 2016
Description: Census subdivision name

Variable – CSD Unique Identifier

Name: CSDUID
Format: Integer
Source: Imputed from either geographic coordinates or CSD name using GeoSuite 2016
Description: Census subdivision unique identifier

Variable – Longitude

Name: Longitude
Format: Float
Source: Provided as is from original data
Description: Longitude

Variable – Latitude

Name: Latitude
Format: Float
Source: Provided as is from original data
Description: Latitude

Variable – Data Provider

Name: Data_Provider
Format: String
Source: Created based on origins of input dataset
Description: Name of the entity that provided the dataset

9. Contact Us

The LODE open databases are built on a model of ongoing improvement. To provide information on additions, updates, corrections or omissions, or to request more information, please contact us at statcan.lode-ecdo.statcan@statcan.gc.ca. Please include the title of the open database in the subject line of the email.

The Open Database of Cultural and Art Facilities

Catalogue number: 21260001
Issue number: 2020001

The Open Database of Cultural and Art Facilities (ODCAF) is a collection of open data containing the names, types, and locations of cultural and art facilities across Canada. It is released under the Open Government Licence – Canada.

The ODCAF compiles open and publicly available data on cultural and art facilities across Canada. Data sources include provincial/territorial governments, municipal governments, and professional associations. This database aims to provide enhanced access to a harmonized listing of cultural and art facilities across Canada by making it available as open data. This database is a component of the Linkable Open Data Environment (LODE).

Data sources and methodology

The inputs for the ODCAF are datasets whose sources include provincial, territorial and municipal governments, and professional associations. These datasets were available either under one of the various types of open data licences, e.g., in an open government portal, or as publicly available data. Details of the sources used are available in a 'Data Sources' table located within the downloadable zipped ODCAF folder.

The data sources used do not deploy a uniform classification system. The ODCAF harmonizes facility type by assigning one of nine types to each facility. This was done based on the facility type provided in the source data as well as using other research carried out for that purpose.

The facility types used in the ODCAF are: art or cultural centre, artist, festival site, gallery, heritage or historic site, library or archive, museum, theatre/performance and concert hall, and miscellaneous.

The ODCAF does not claim exhaustive coverage and may not contain all in-scope facilities in its current version. Facility type classification and geolocation errors are also possible, although efforts have been made to minimize them. While all ODCAF data are released on the same date, the dates as of which the data are current depend on the update dates of the sources used.

A subset of the geo-coordinates available in the source data was validated using the internet and updated as needed. When latitude and longitude were not available, geocoding was performed for some sources using the street address provided in the source data.

Deduplication was done to remove duplicates for cases where sources overlapped in coverage.

This first version of the database (Version 1.0) contains approximately 8,000 records. Data were collected by accessing sources between January 2020 and July 2020.

The variables included in the ODCAF are as follows:

  • Index
  • Facility Name
  • Source Facility Type
  • ODCAF Facility Type
  • Provider
  • Unit
  • Street Number
  • Street Name
  • Postal Code
  • City
  • Province or Territory
  • Source-Format Street Address
  • Census Subdivision Name
  • Census Subdivision Unique Identifier
  • Province or Territory Unique Identifier
  • Latitude
  • Longitude

For more information on how the addresses and variables were compiled, see the metadata that accompanies the ODCAF.

Downloading the ODCAF

For ease of download, the ODCAF is provided as a compressed comma-separated values (CSV) file.

Visualizing the ODCAF

The ODCAF content is available for visualization on a map using the Linkable Open Data Environment Viewer.

Supplement to Statistics Canada's Generic Privacy Impact Assessment related to the Survey on COVID-19 and Mental Health

Date: September 2020

Program managers: Director, Centre for Social Data Integration and Development
Director General, Census Subject Matter, Social Insights, Integration and Innovation

Reference to Personal Information Bank (PIB)

Not applicable as there are no direct personal identifiers being collected and retained.

Description of statistical activity

In the context of the COVID-19 pandemic and the significant disruption in households across Canada, Statistics Canada is conducting the Survey on COVID-19 and Mental Health, under the authority of the Statistics Act, on behalf of the Public Health Agency of Canada (PHAC). The purpose of the survey is to gather information that will help governments assess the impacts of the pandemic on Canadians' mental health and well-being, and develop strategies to address these impacts. These could include programs and services for Canadians, namely vulnerable Canadians and their families. In addition, the data will provide insights on how the restrictions and provincial lockdowns have led to or exacerbated symptoms related to mental health. They can also be used to analyze the longer-term impacts of COVID-19 on mental health.

This voluntary household survey collects information from individuals aged 18 years and older who live in Canadian provinces and the territorial capitals. Topics include mental health behaviours and symptoms associated with depression, anxiety and post-traumatic stress disorder (PTSD), suicide risk, parenting style, substance use, household violence and general mental health. In addition, information such as age, gender, postal code, email address, Indigenous identity, visible minority status, immigration and citizenship, education and income will be collected. Responses will be aggregated and processed to ensure that no individual can be identified.

Reason for supplement

While the Generic Privacy Impact Assessment (PIA) addresses most of the privacy and security risks related to statistical activities conducted by Statistics Canada, this supplement describes additional measures being implemented due to the sensitivity of the information being collected. The Survey on COVID-19 and Mental Health will be collecting information on mental health and well-being, which is rendered even more sensitive when collected alongside personal information such as gender identity. This SPIA also describes how Statistics Canada has accounted for the unique impact on vulnerable populations when designing and deploying this survey, and integrates relevant principles of the Office of the Privacy Commissioner's Framework for the Government of Canada to Assess Privacy-Impactful Initiatives in Response to COVID-19.

Necessity and Proportionality

The collection and use of aggregated responses and personal information for the Survey on COVID-19 and Mental Health can be justified against the following four-part test from Statistics Canada's Necessity and Proportionality Framework:

  1. Necessity: Given the unprecedented nature of the COVID-19 pandemic and the measures put in place to contain it, the extent of the impacts on mental health and other aspects of life within households are currently in great part unknown. A quick and timely assessment of the mental health and well-being of Canadians will help inform government decision-making in order to support vulnerable Canadians and their families during this pandemic. In addition, the information will help governments assess how the COVID-19 restrictions and provincial lockdowns have led to or exacerbated symptoms of depression and PTSD, suicide risk, substance use, parenting style and household violence, and help inform future decisions.

    The survey data file, without direct identifiers, will be retained as long as required for statistical purposes, in order to conduct analysis of long-term impacts.

  2. Effectiveness (Working assumptions): Due to the urgent need for the information, a short questionnaire was developed that follows Statistics Canada's processes and methodology in an accelerated manner to produce timely results. The survey will be administered using a self-reported electronic questionnaire. A random sample of households from Statistics Canada's survey frame will receive an invitation letter to complete the survey and be provided with a secure access code to access the survey on Statistics Canada's secure survey infrastructure. Interviewers will follow up after three weeks with households that have not responded, to reiterate the invitation and follow a protocol to randomly select a household member aged 18 or older (age-order selection) to respond to the survey. The collection period will be approximately two months. All Statistics Canada directives and policies for the development, collection, and dissemination of the survey will be followed, and survey responses will not be attached to respondents' addresses or phone numbers. The data will be representative of the population and may be disaggregated by province, ethnicity, gender, age groupings, etc.

  3. Proportionality: Data on mental health, substance use and household violence are highly sensitive, and this sensitivity may be amplified by the isolation measures put in place during the COVID-19 pandemic. As such, experts at Statistics Canada and PHAC have been consulted on the scope and methodology of the survey. Wherever possible, questions about mental health and well-being from existing surveys have been used. These questions were taken from the Canadian Community Health Survey (CCHS), the Mental Health Survey (MHS), and the General Social Survey – Victimization (GSS). These questions have previously undergone qualitative testing, and the results of this survey can be compared to those of these other surveys, allowing for improved interpretation of the results.

    All the data to be collected are required for the purpose of the survey as described above. Careful consideration was made for each question and response category to ensure that it would measure the research questions and help inform future decisions related to mental health and the COVID-19 pandemic.

    The sample size of 18,000, which will represent people living in each province and in the three territorial capitals, has been assessed as the minimum required to meet quality estimates of the collected data. Increasing the sample size would not necessarily improve the findings for vulnerable populations.

    Statistics Canada directives and policies with respect to data collection and publication will be followed to ensure the confidentiality of the data. Individual responses will be grouped with those of others when reporting results. Individual responses and results for very small groups will not be published or shared with government departments or agencies. This will also reduce any potential impact on vulnerable populations or subsets of populations, as the grouping of results will make it impossible to identify individual responses. As permitted by the Statistics Act, with the consent of individual respondents, survey responses may be shared with PHAC strictly for statistical and research purposes, for example, to aid in future policy decisions for the pandemic, and in accordance with Statistics Canada's security and confidentiality requirements.

    The benefits of the findings, which are expected to support decision making at all levels of government aimed at improving mental health and well-being, are believed to be proportional to the potential risks to privacy.

  4. Alternatives: Currently, there are no other surveys that gather information on the impact of the COVID-19 pandemic on the mental health and well-being of Canadians that describe these conditions by provinces and territorial capitals. The possibility of using crowdsourcing or web-panel survey methodologies was explored. However, based on discussions between mental health and methodology experts within Statistics Canada and PHAC, it was determined that a survey with at least 18,000 units was necessary to produce reliable and accurate results by provinces and territorial capitals. Releasing data at these aggregated levels will reduce the potential to identify impacts on vulnerable populations, subsets of populations, and groups.

Mitigation factors

Some questions contained in the Survey on COVID-19 and Mental Health can be considered sensitive as they relate to an individual's mental health and well-being, but the overall risk of harm to the survey respondents has been deemed manageable with existing Statistics Canada safeguards as well as with the following measures:

Mental Health Resources

Transparency

Prior to the survey, respondents will be informed of the survey purpose and topics, allowing them to assess whether they wish to participate. Topics listed will include: behaviours and symptoms associated with depression, anxiety and post-traumatic stress disorder (PTSD), suicide risk, pressure on parents, substance use, household violence, as well as general mental health. This information will be provided via invitation and reminder letters, and will be reiterated at the beginning of the questionnaire. Respondents will also be informed, in both invitation and reminder letters as well as in the questionnaire itself, that their participation is voluntary before being asked any questions. Information about the survey, as well as the survey questionnaire, will also be available on Statistics Canada's website.

Confidentiality

Individual responses will be grouped with those of others when reporting results. Individual responses and results for very small groups will never be published or shared with government departments or agencies. Careful analysis of the data and consideration will be given prior to the release of aggregate data to ensure that marginalized and vulnerable communities are not disproportionately impacted. As permitted by the Statistics Act, survey responses may be shared with PHAC strictly for statistical and research purposes, in accordance with Statistics Canada's security and confidentiality requirements, and only with the consent of the respondent. The postal code will be used to derive the province or territory of the respondent and could also be used to identify regions that have been more impacted by the pandemic. It will not be used to identify respondents, as only aggregated data will be released. The email address may be used to send out survey invitations for participation in a follow-up survey or other mental health surveys. It will be removed and separated from the final data file and it will not be used to identify respondents.

Conclusion

This assessment concludes that, with the existing Statistics Canada safeguards and additional mitigation factors listed above, any remaining risks are such that Statistics Canada is prepared to accept and manage the risk.

Formal approval

This Supplementary Privacy Impact Assessment has been reviewed and recommended for approval by Statistics Canada's Chief Privacy Officer, Director General for Modern Statistical Methods and Data Science, and Assistant Chief Statistician for Social, Health and Labour Statistics.

The Chief Statistician of Canada has the authority for section 10 of the Privacy Act for Statistics Canada, and is responsible for the Agency's operations, including the program area mentioned in this Supplementary Privacy Impact Assessment.

This Privacy Impact Assessment has been approved by the Chief Statistician of Canada.

Requests for information – Housing

Under the authority of the Statistics Act, Statistics Canada is hereby requesting the following information, which will be used solely for statistical and research purposes and will be protected in accordance with the provisions of the Statistics Act and any other applicable law. This is a mandatory request for data.

Dwelling characteristics

Data on social and affordable housing

What information is being requested?

Statistics Canada is requesting data on social and affordable housing (SAH). These data include the residential addresses of SAH dwellings and the contact information for the managing institution and responsible manager. Information on the SAH program (type, last update, start and end dates, and program id), SAH dwelling record id numbers and the characteristics of the SAH dwellings is also being requested.

What personal information is included in this request?

The requested information includes contact information for the manager of each SAH institution. No personal information about SAH residents is being requested.

What years of data will be requested?

Annual data are being requested, beginning with 2018, on an ongoing basis.

From whom will the information be requested?

This information is being requested from the Canada Mortgage and Housing Corporation (CMHC), lessors of social housing projects and other Provincial and Territorial Public Administrations.

Why is this information being requested?

In 2017, the federal government introduced the National Housing Strategy (NHS). The NHS aims to ensure that Canadians across the country have access to affordable housing that meets their needs, with a particular focus on the most vulnerable populations. Research and policy making in support of this goal require high-quality data on SAH. This type of housing accounts for a relatively small share (5%) of the overall housing stock in Canada, making it difficult to target for inclusion in the Canadian Housing Survey (CHS), a key data source for the NHS. To overcome this issue, Statistics Canada built a satellite SAH dwelling register using administrative data from the Canada Mortgage and Housing Corporation and provincial and territorial housing authorities, and data from the census. The resulting National Social and Affordable Housing Database (NSAHD) enables the CHS to efficiently collect data on vulnerable populations living in SAH in order to have the best quality data for this segment of the population. Acquiring and integrating the requested SAH information will enhance the coverage of the NSAHD.

Statistics Canada may also use the information for other statistical and research purposes.

Why were these organizations selected as data providers?

Canada Mortgage and Housing Corporation, the lessors of social housing projects and other provincial and territorial public administrations collect and maintain up-to-date data for administrative purposes. This information will be used to improve coverage of the National Social and Affordable Housing Database.

When will this information be requested?

The data is requested on an annual basis.

What Statistics Canada programs will primarily use these data?

When was this request published?

June 4, 2021

Housing costs and affordability

One-time top-up to the Canada Housing Benefit

What information is being requested?

Information on renters, landlords and tenancy (address of rented property, period of rental, amount of rent paid) through the One-time top-up to the Canada Housing Benefit.

What personal information is included in this request?

Renter information (name, date of birth, social insurance number, phone number, marital status, official language of preferred correspondence, mailing address information, family net income) and landlord information (name of landlord or company, phone number) will be requested.

Personal identifiers are required to perform data linkages, for statistical purposes only. Once the data are linked, the personal identifiers will be replaced by an anonymized person key, meaning individuals will not be identifiable once the data has been linked.

What years of data will be requested?

2023

From whom will the information be requested?

This information is being requested from the Canada Revenue Agency.

Why is this information being requested?

Housing represents the most important asset, and the largest debt, held by Canadian households. Given its importance, a sound understanding of factors impacting the ownership and rental markets is critical for the design of policies that can address housing issues, and for the provision of high-quality information on homeowners and renters.

In recent years, there has been growing concern among Canadians regarding housing affordability and market concentration. The inclusion of data from the One-time top-up to the Canada Housing Benefit will provide a better understanding of the rental market and vulnerable populations. Understanding trends in the low-income rental housing market can help Canadians make more informed decisions on housing, and help the government understand the impacts of programs, such as the One-time top-up to the Canada Housing Benefit, on Canadians.

Statistics Canada may also use the information for other statistical and research purposes.

Why were these organizations selected as data providers?

Canada Revenue Agency administered the one-time top-up to the Canada Housing Benefit which was authorized by the Canada Mortgage and Housing Corporation (CMHC) under the authority of the Rental Housing Benefit Act.

When will this information be requested?

Fall 2023

When was this request published?

September 7, 2023

Properties and property owners

What information is being requested?

Information on residential and non-residential properties, as well as on individual and non-individual property owners such as corporations, trusts, state-owned entities, or related groups. This includes information on: property location, structure/land characteristics, land use, assessment value by tax class, taxation status, sale value, rental prices, and financing. Ownership information is also being requested, which includes the types of owners, their names, and contact information.

What personal information is included in this request?

Statistics Canada has requested access to personal information such as: property owner names, types of owners, and the legal name of the owner, along with mailing, property, and billing addresses and telephone numbers.

All information collected by Statistics Canada is strictly protected and anonymized. It is never possible to connect the data that is made public to the identity of any business, individual, or their household.

What years of data will be requested?

Ongoing.

From whom will the information be requested?

This information is requested from all provincial, territorial and municipal property assessment authorities or their operators/service providers, provincial and territorial land registry authorities or their operators/service providers and from rental websites.

Why is this information being requested?

The Canadian Housing Statistics Program (CHSP) provides municipal, provincial, territorial, and federal authorities, researchers, and industry stakeholders with relevant and timely data on housing stocks and home ownership.

Residential and non-residential property assessment values at current prices are primarily intended to meet data requirements from Finance Canada for Fiscal Arrangements, as part of the property tax base.

Housing represents the most important asset, and the largest debt, held by Canadian households. Given its importance, a sound understanding of factors impacting the housing market is critical for the design of policies that can address housing issues, and for the provision of high-quality information to homeowners, renters, and people seeking home ownership.

The CHSP, launched in 2017 as a response to emerging data needs, is the most comprehensive source of data in Canada on the housing sector. It provides a framework on demand, by describing the owners – their income, socio-demographic status, and whether they are residents of Canada – and on supply – the characteristics of the properties owned and built. By joining these factors, the program can produce information on market equilibrium, such as the values of properties and their use.

The program is an innovative data project that leverages existing administrative data sources and transforms them into new and timely indicators on Canadian housing. In an effort to provide complete coverage of housing in Canada, Statistics Canada is seeking to acquire annual property assessment roll and land registry data for all municipalities in Canada, in addition to rental data.

The CHSP produces granular property and owner statistics at the census subdivision level. Property characteristics include the structure type, period of construction, assessment value, sale value, rental price, living area, and property use. Owner characteristics include the number of owners, ownership type, residency status, income, age, sex, and immigration characteristics.

The program is unique in that it replaces the traditional survey methodology by combining administrative data sources to provide municipal, provincial/territorial and federal authorities, researchers and industry stakeholders with relevant and timely data on housing.

Property assessment roll data is also used to derive residential and non-residential property assessment values at current prices, according to a common stock date, by the Property Values Program. These estimates are designed to meet the data requirements of Finance Canada as part of the property tax base, in support of the Federal-Provincial Fiscal Arrangements Act. Statistics Canada may also use the information for other statistical and research purposes.

 

Why were these organizations selected as data providers?

Provincial, territorial, and municipal property assessment and land registry authorities collect and maintain up-to-date data for administrative purposes. By collaborating with other government departments, Statistics Canada avoids duplication of data collection, reducing the response burden on Canadians. Rental listing websites maintain a significant share of secondary rental market listings in Canada and provide high coverage of the Canadian rental market.

When will this information be requested?

January 2020, and onward. Rental data is requested as of March 2024 and onward.

When was this request published?

December 19th, 2019.

Housing price indexes

Property sale prices and property characteristics

What information is being requested?

Information on housing and commercial sale prices and other characteristics, such as the following: selling price, date of sale, sale type (new/resale) and addresses.

In addition, the following is being requested: information on property characteristics (e.g. property type, year built, square footage, property size, room sizes, lot size, renovation indicators, number of bedrooms, number of bathrooms, presence of finished basement), tax exemption status, condo status and condo information (e.g. fees, number of parking spots, building/unit amenities).

What personal information is included in this request?

This request does not include personal information. Only the property address is required, in order to perform data integration for statistical purposes. Once the data are integrated, the address will be replaced by an anonymized key.

What years of data will be requested?

Monthly data beginning January 2016.

From whom will the information be requested?

This information is being requested from the Canadian Real Estate Association.

Why is this information being requested?

Statistics Canada is requesting this information to improve the accuracy, quality and coverage of the statistics produced by the Residential Property Price Index as well as develop market value information for the Canadian housing stock. The data will help inform public policy on housing and will be used by policy makers, researchers, industry stakeholders and Canadians to better understand changes in housing prices. In addition, the agency will use the commercial sale information as a starting point in the development of a new commercial property price index, to serve as an important indicator of financial stability and wealth. Statistics Canada may also use the information for other statistical and research purposes.

Why were these organizations selected as data providers?

The Canadian Real Estate Association maintains the most robust and timely information on the sale price and characteristics of Canadian properties.

When will this information be requested?

September 2020 and onwards.

When was this request published?

September 23, 2020

Data on formal evictions

What information is being requested?

This information request is in response to a pilot project that Statistics Canada is undertaking with Canada Mortgage and Housing Corporation (CMHC), which aims to acquire data related to formal evictions in Canada.

More specifically, Statistics Canada will seek information related to the formal eviction application (e.g., length of tenancy, the reason for the application, amount of money owed), details surrounding the hearing process (e.g., tenant and landlord filings, access to legal resources, adjudicator decision), and details pertaining to appeals and the enforcement of orders (where available). Personal information will also be included in the request.

What personal information is included in this request?

This request includes personal information such as:

  • first name
  • last name
  • sex
  • date of birth
  • civic address
  • postal code
  • telephone numbers
  • email address of the tenant and landlord

Personal identifiers are required to perform data linkages, for statistical purposes only. Once the data are linked, the personal identifiers will be replaced by an anonymized person key, meaning individuals will not be identifiable once the data has been linked.

What years of data will be requested?

Statistics Canada is requesting data from the New Brunswick Residential Tenancies Tribunal from 2011 to 2022.

From whom will the information be requested?

This information is being requested from the New Brunswick Residential Tenancies Tribunal.

Why is this information being requested?

Evictions are a destabilizing force for individuals, households and communities. Given the shifting nature of evictions—which has anecdotally only been amplified during the COVID-19 pandemic—there is a growing need for centralized, standardized administrative data on evictions in order to better understand this issue and its impact. As a result, this pilot project intends to acquire data on formal eviction applications, decisions, appeals and enforcements.

These data will be central in formulating a study group of individuals who experienced a formal eviction, in order to evaluate the impact on the lives of tenants (and landlords, when possible). In addition, the data will illustrate the impact of evictions at various geographic levels, providing insight about where formal evictions take place, and the characteristics of communities most impacted by them. This project will benefit Canadians by filling an important data gap in housing research, helping to provide a better understanding of populations most vulnerable to evictions, circumstances leading to evictions, as well as the impact that evictions can have on other areas of life (e.g., housing outcomes, health, income, employment). These insights will help inform the development of evidence-based prevention measures and supports for those involved in formal evictions. Statistics Canada may also use the information for other statistical and research purposes.

Why were these organizations selected as data providers?

The New Brunswick Residential Tenancies Tribunal collects and maintains administrative data on formal eviction processes in its province. By collaborating with the New Brunswick Residential Tenancies Tribunal, Statistics Canada avoids duplication of data collection, reducing the response burden on Canadians.

When will this information be requested?

August 2022, one-time data request.

When was this request published?

August 3, 2022

Request for information – Business performance and ownership

Under the authority of the Statistics Act, Statistics Canada is hereby requesting the following information, which will be used solely for statistical and research purposes and will be protected in accordance with the provisions of the Statistics Act and any other applicable law. This is a mandatory request for data.

Business dynamics

Corporate insolvency microdata

What information is being requested?

Information on corporations (legal names, trade names and addresses) that have filed for corporate insolvency is being requested.

What personal information is included in this request?

This request does not include personal information.

What years of data will be requested?

Monthly data beginning in 2006 (ongoing)

From whom will the information be requested?

This information is being requested from the Office of the Superintendent of Bankruptcy.

Why is this information being requested?

Statistics Canada is requesting the Corporate Insolvency Microdata to help provide timely statistics on permanent firm closures. The COVID-19 pandemic has led the Government of Canada to introduce a number of measures such as the Canada Emergency Wage Subsidy, the Canada Emergency Business Account and the Canada Emergency Commercial Rent Assistance Program to support businesses and limit the number of business failures during the pandemic. Timely measures of permanent business closures will provide information on whether the objectives of these support programs are being met. The success of these programs will directly impact Canadians, as the survival of businesses during the pandemic directly affects the employment opportunities available to Canadians.

Statistics Canada may also use the information for other statistical and research purposes.

Why were these organizations selected as data providers?

As part of its mandate, the Office of the Superintendent of Bankruptcy is responsible for the administration of the Bankruptcy and Insolvency Act. As such, it maintains up-to-date data on corporate bankruptcies in Canada.

When will this information be requested?

November 2020 and onward (monthly)

When was this request published?

October 28, 2020

Business ownership

Co-operatives businesses

What information is being requested?

Statistics Canada is requesting information on active co-operatives in Canada. A non-financial co-operative is a corporation that is legally incorporated under specific federal, provincial or territorial co-operative acts and that is owned by an association of people seeking to satisfy common needs, such as access to products or services, sale of products or services, or employment.

This request includes business information such as the names and contact information of the co-operatives, information identifying the co-operative's area of activity, and a list of closures, dissolutions, amalgamations or name changes that may have taken place.

What personal information is included in this request?

This request does not contain any personal identifiers.

What years of data will be requested?

Data for all active co-operatives, as of December 31, 2020.

From whom will the information be requested?

This information is being requested from information services, business support services and other provincial and territorial public administrations.

Why is this information being requested?

Statistics Canada requires this information in order to produce custom tabulations as part of a joint project with Innovation, Science and Economic Development (ISED) Canada. The produced tabulations will replace ISED's longstanding survey on co-operatives in Canada.

Co-operative businesses have an important economic role to play in generating jobs and growth in communities across Canada. Existing in every sector of the economy, co-operatives provide needed infrastructure, goods and services to over 8 million members and jobs to more than 95,000 Canadians. This project offers Canadians, policymakers, researchers and industry stakeholders an accurate depiction of the size and makeup of this sector.

Statistics Canada may also use the information for other statistical and research purposes.

Why were these organizations selected as data providers?

The organizations manage and maintain the provincial co-operative registry for their respective provinces. The data providers are an entity of the provincial government, and the only source of the required data. In collaboration with Innovation, Science and Economic Development (ISED) Canada, the data are used to replace a longstanding ISED survey with more timely, accurate and cost effective statistics.

When will this information be requested?

February 2022 and onward (annually)

When was this request published?

February 21, 2022

Financial statements and performance

Financial sector data

What information is being requested?

The desired information includes financial information from federally regulated financial institutions, including assets and debts aggregated by institution, with counterparty information broken out where available, and loan level data containing associated characteristics such as type of loan and borrowing terms. The counterparty information will specify how much lending goes to each of the other sectors in the economy.

What personal information is included in this request?

This request does not contain any personal information.

What years of data will be requested?

All current data holdings, historical (as available), and on an ongoing basis.

From whom will the information be requested?

This information is being requested from the Office of the Superintendent of Financial Institutions (OSFI).

Why is this information being requested?

Statistics Canada is requesting this information to develop and publish statistics on financial activity and lender/borrower relationships in the Canadian economy. The National Economic Accounts, including the Financial and Wealth Accounts (FWA), contain estimates on financial services with related incomes, assets, and liabilities (i.e. debt) broken down by various levels of sector and instrument detail. The additional information will help validate and complement currently available data holdings.

As Canada's banking industry regulator, OSFI already collects this data. This acquisition will avoid duplication of efforts and prevent increased burden for respondents.

The overall result of acquiring these new data will be an increased level of quality and detail of national financial statistics. This means policymakers, researchers, and other data users will have a more precise and detailed portrait of the financial system in Canada.

Statistics Canada may also use the information for other statistical and research purposes.

Why were these organizations selected as data providers?

OSFI is the national regulator for the financial sector in Canada and thus has the legal authority to collect this type of detailed financial data.

When will this information be requested?

This information is being requested in September 2021.

What Statistics Canada programs will primarily use these data?

When was this request published?

August 18, 2021

Summary of Changes

February 2024 – Inclusion of additional details on requested loan level information.

Other content related to Business performance and ownership

Business financing and supporting programs data

What information is being requested?

The data requested are the names of the enterprises, business numbers, addresses of the enterprises, program data (projects, agreements), the value, date and type of support provided to the enterprise, and the name of the program stream.

What personal information is included in this request?

This request does not contain any personal information.

What years of data will be requested?

Annual data from January 2018 to latest year available.

From whom will the information be requested?

This information will be requested from:

  • Business Development Bank of Canada
  • Export Development Canada
  • Ministère de l'Économie, de l'Innovation et de l'Énergie du Québec
  • Institut de la statistique du Québec
  • Secrétariat du Conseil du trésor du Québec
  • Ministère de la Cybersécurité et du numérique du Québec
  • Conseil de l'innovation du Québec
  • Ontario Ministry of Economic Development, Job Creation and Trade
  • Canadian Commercial Corporation

Why is this information being requested?

Statistics Canada has been acquiring data on federal support to innovation and growth from all departments on an annual basis through the Business Innovation and Growth Support (BIGS) program since 2018. To complete this portrait and better understand business innovation in Canada, data from provincial organizations and Crown corporations are required.

Statistics Canada requires this information to create and publish statistics on innovation and growth support to businesses in Canada. These statistics will help provide a more accurate picture from which to design and optimize programs for the benefit of Canadians, and will be used by policy makers, researchers and industry stakeholders to demonstrate the extent to which governments are supporting Canadian businesses and the economy. Statistics Canada may also use the information for other statistical and research purposes.

Why were these organizations selected as data providers?

These organizations have been identified as having detailed information on business innovation and growth that will help fill current data gaps. For the provincial organizations, a pilot project is being conducted for this next cycle with the addition of data from Ontario and Québec. Future cycles will most likely include other provinces.

When will this information be requested?

September 2024.

What Statistics Canada programs will primarily use these data?

When was this request published?

June 14, 2024

Monthly Survey of Food Services and Drinking Places: CVs for Total Sales by Geography - July 2020

CVs for Total sales by Geography
Table summary
This table displays the coefficients of variation (CVs) for total sales by geography. Rows are geographies; columns are months, from 201907 (July 2019) to 202007 (July 2020). All values are percentages.
Geography 201907 201908 201909 201910 201911 201912 202001 202002 202003 202004 202005 202006 202007
Canada 0.69 0.57 0.59 0.56 0.58 0.61 0.67 0.59 0.63 1.22 1.29 1.13 1.21
Newfoundland and Labrador 2.87 2.49 3.13 3.19 2.77 3.06 2.94 3.17 3.10 4.99 4.02 3.97 5.25
Prince Edward Island 6.84 4.93 4.01 4.53 4.75 4.16 3.67 3.40 2.84 2.54 2.84 3.35 4.18
Nova Scotia 4.65 4.62 2.76 2.94 3.45 3.56 2.06 2.95 2.93 5.03 5.04 3.97 4.09
New Brunswick 2.28 1.30 1.56 1.87 1.45 1.40 1.35 2.16 2.47 4.36 4.44 3.89 3.43
Quebec 1.97 1.41 1.32 1.26 1.37 1.22 1.37 1.17 1.38 3.74 3.47 2.69 2.86
Ontario 1.11 0.94 1.04 0.96 0.99 1.02 1.05 0.97 1.03 1.97 2.14 1.89 1.99
Manitoba 2.43 2.74 2.18 2.42 1.95 2.00 1.92 1.80 2.18 4.91 4.17 3.73 5.00
Saskatchewan 1.92 1.92 1.58 1.59 1.79 1.56 1.51 1.68 1.98 3.68 3.32 2.66 3.19
Alberta 1.32 1.24 1.18 1.23 1.29 1.33 1.37 1.29 1.76 3.07 3.41 3.11 2.61
British Columbia 1.69 1.57 1.60 1.65 1.62 1.96 2.45 1.98 1.89 3.18 3.45 3.18 3.81
Yukon Territory 5.95 4.95 5.88 7.06 6.05 6.69 7.22 5.05 4.97 5.09 5.95 6.91 4.08
Northwest Territories 1.00 0.91 1.00 1.46 1.59 0.88 0.98 0.80 0.85 2.33 2.10 1.46 2.39
Nunavut 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

Analysis 101, part 4: Case study

Catalogue number: 892000062020012

Release date: September 23, 2020

In this video, we will review the steps of the analytical process.

You will obtain a better understanding of how analysts apply each step of the analytical process by walking through an example. The example that we will discuss is a project that examined the relationship between walkability in neighbourhoods, meaning how well they support physical activity, and actual physical activity for Canadians.

Data journey step
Analyze, model
Data competency
Data analysis
Audience
Basic
Suggested prerequisites
Length
9:01
Cost
Free

Watch the video

Analysis 101, part 4: Case study - Transcript

(The Statistics Canada symbol and Canada wordmark appear on screen with the title: "Analysis 101, part 4: Case study")

Analysis 101: part 4 - Case Study

Hi, welcome to our Analysis 101 case study. Before you watch this video, make sure you've watched videos 1, 2 and 3 so that you're familiar with the three stages of the analytical process.

Learning goals

In this video we will review the steps of the analytical process and you will obtain a better understanding of how analysts apply each step by walking through an example. The example that we will discuss is a project that examined the relationship between walkability in neighborhoods, meaning how well they support physical activity, and the actual physical activity of Canadians.

Steps in the analytical process

(Diagram of 6 images representing the steps involved in the analyze phase of the data journey where the first steps represent the making of an analytical plan, the middle steps represent the implementation of said plan and the final steps are the sharing of your findings.)

Throughout this video we will refer back to the six steps of the analytical process and illustrate these steps through our walkability example.

What do we already know?

For our analytical plan, let's start by understanding the broader context. What do we already know about the topic? Well, we already know that obesity is a problem in Canada. Insights from the Canadian Health Measures Survey show that 29% of Canadian children and youth are overweight or obese, while 60% of Canadian adults are overweight or obese. We also know that many Canadian adults and children are not active enough. Data from the Canadian Health Measures Survey show that 33% of Canadian children and youth are meeting the physical activity guidelines, meaning that about 66% do not meet requirements. Likewise, 18% of Canadian adults are meeting the physical activity guidelines.

(Text: "Without being aware of it, our neighbourhoods and how they are built influence how healthy we are.")

These challenges have led to increased attention around the idea of changing the environment in which we live to help Canadians make healthier lifestyle choices. This idea was the focus of the 2017 Chief Public Health Officer's report on the state of public health in Canada, which noted that shifting behaviours is challenging. What would help Canadians become more active? More parks, better walking paths, or safer streets? Should policy makers look at crime rates? The list is endless.

What do we already know? Environments shape our health

There are a number of ways that our environment can influence our health behaviors. For example, our built environment such as how walkable our neighborhood is, or our health behaviors like how long we commute, or how many sports we participate in, can have an impact on our mental and physical health. Think about your own neighborhood. Does the design of your neighborhood make it easy or hard for you to walk to and from places or to get outside to exercise or play with your kids?

What do we already know? Knowledge gaps

Now that we understand the broader topic, let's identify the knowledge gaps. Previous studies had already demonstrated that Canadian adults living in more walkable neighborhoods are more active. However, recent findings focused on a few Canadian cities and did not provide national estimates. Likewise, previous work focused on how to get adults more active, but analysis for children was limited.

What is the analytical question?

Identifying a relevant analytical question is important to defining the scope of your work. For this study, the main question was: does the relationship between walkability and physical activity in Canada differ by age? That's a clear, well-defined question, and it's written in plain language.

Prepare and check your data

(Text: Canadian Active Living Environments Database)

Now it's time to implement our plan. The first step is preparing and checking our data. Given that we had access to a new Canadian walkability dataset, we wanted to leverage this new data source. Before going any further, let me give you some more context on walkability. Essentially, walkability means how well a neighborhood supports physical activity. Walkability is higher in denser neighborhoods, such as those with more people living on one block. It's also higher in neighborhoods with more amenities, like access to transit, grocery stores or schools, or neighborhoods with well-connected streets. Each neighborhood was assigned a walkability score from one to five. If you live in a suburban area outside the city core, your neighborhood will likely have a walkability score of three. Downtown neighborhoods will likely have a score of four or five.

Perform the analysis

(Text: Canadian Active Living Environments; Canadian Community Health Survey (Ages: 12+ years); Canadian Health Measures Survey (Ages: 3 to 79 years))

For our analysis, we linked external walkability data to two major Statistics Canada health surveys. We made use of both surveys because they use different measurements for physical activity. One survey asked respondents to self-report their daily exercise while the other made use of accelerometers. Accelerometers capture minute-by-minute movement data. Think of it as a fancy pedometer.

Summarize and interpret your results

After some data cleaning, concept defining and lots of documenting of our analytical decisions, we started crafting a story based on our findings. Our main finding was that adults in more walkable neighborhoods are more active. However, different patterns were observed for children and youth. Their physical activity was pretty consistent across different levels of neighborhood walkability. When we started this work, there was a lot of evidence linking physical activity and neighborhood walkability in adults, but only a few studies examining children. Some studies found that children were more physically active in more walkable neighborhoods, while others stated the opposite. So we performed age-specific analysis to examine this in greater detail and found that children under 12 are more active in neighborhoods with low walkability, like car-oriented suburbs, which may have larger backyards, schoolyards and parks where they can run around and play safely. But the relationship for children 12 and over was similar to that of adults: they were more physically active in higher walkability neighborhoods. Summarizing your results in simple terms is key to getting your message across to various audiences. As you learned in previous videos, translating complex analysis into a cohesive story is important. It's your job to digest the information and guide your reader through your storyline.

Summarize and interpret your results: So what?

Interpreting the results also involves helping your audience understand the "so what" factor. For us, this meant highlighting that walkability is a relevant concept for adults, but we need to think differently about how to support physical activity in children. For example, what about parks, neighborhood safety, and crime rates? Explain to your reader how your findings fit within the existing body of literature. It's also a great practice to communicate what needs to be done going forward to advance our knowledge and flag any limitations to the study.

Disseminate your work

This project led to some very interesting analysis, which we shared in different ways with stakeholders, policy makers and Canadians. Two major research papers were published for a more expert audience, while we also created an infographic on key points for a more general audience.

Summary of key points

(Diagram of 6 images representing the steps involved in the analyze phase of the data journey where the first steps represent the making of an analytical plan, the middle steps represent the implementation of said plan and the final steps are the sharing of your findings.)

The analytical process is a journey. It often takes much longer than you anticipate. First understand your topic and take your time to develop a clear and relevant analytical question. Make sure to check and review your data throughout the process and strive to translate your findings into a meaningful and interesting narrative. That way people will remember your work.

(The Canada Wordmark appears.)

What did you think?

Please give us feedback so we can better provide content that suits our users' needs.

Analysis 101, part 3: Sharing your findings

Catalogue number: 892000062020011

Release date: September 23, 2020

In this video, you will learn how to summarize and interpret your data and share your findings. The key elements to communicating your findings are as follows:

  • select your essential findings,
  • summarize and interpret the results,
  • organize and assess your reviews and
  • prepare for dissemination
Data journey step
Analyze, model
Data competency
Data analysis
Audience
Basic
Suggested prerequisites
Length
11:38
Cost
Free

Watch the video

Analysis 101, part 3: Sharing your findings - Transcript

(The Statistics Canada symbol and Canada wordmark appear on screen with the title: "Analysis 101, part 3: Sharing your findings")

Analysis 101: Part 3 - Sharing your findings

Hi, welcome to Analysis 101, video 3. Now that we've learned how to plan an analytical project and perform the analysis, we'll discuss best practices for interpreting and sharing your findings.

Learning goals

In this video you will learn how to summarize and interpret your data and share your findings. The key elements to communicating your findings are as follows: select your essential findings, summarize and interpret the results, organize and assess reviews, and prepare for dissemination.

Steps in the analytical process

(Diagram of 6 images representing the steps involved in the analyze phase of the data journey where the first steps represent the making of an analytical plan, the middle steps represent the implementation of said plan and the final steps are the sharing of your findings.)

Going back to our six analytical steps, we'll focus on sharing our findings. If you've been watching the data literacy videos by Statistics Canada, you'll recognize that this work is part of the third step, which is the analyze phase of the data journey.

Step 5: Summarize and interpret your results

Let's start by discussing how to summarize and interpret your results.

Tell the story of your process

(Image of the 4 parts of the 5th step: Context - Evidence from other countries or anecdotal; Methods - Compare millennials (aged 25-34) to previous generations; Findings - Millennials have higher net worth and higher debt than Gen-X; Interpretation - Mortgages main contributor to debt for millennials.)

Presenting your findings clearly to others is one of the most challenging aspects of the analytical process. Let's use the millennial paper as an example. First we started with the context, where we highlighted previous findings for American millennials, which motivated our study on Canadian millennials. Then we discussed our data and methodology, defining millennials and explaining how we compared them with previous generations. Then we walked through the key findings of the storyline. For example, we explained that while millennials had higher net worth than Generation X did at the same age, millennials were also more indebted. Finally, we interpreted our findings, digging deeper into the why. For millennials, we found that mortgage debt, which reflects higher housing values, contributed to their higher debt load.

Carefully select findings that are essential to your story

You'll likely produce several data tables or estimates throughout your analytical journey. Carefully select the findings that are essential to telling your story. Revisit your analytical questions and select visuals that clearly help to answer these questions. Remember that your results are not the story, but the evidence that supports your story.

Summarize your findings and present a logical storyline

Once you've selected the key results, summarize your findings and present them according to a logical storyline. Identify the key messages. Often these messages will serve as subheadings in a report or study. Also, always make sure to discuss your findings within the broader context of the topic. You've done great work and you want people to remember what your analysis contributes to the literature. Creating a clear storyline will ensure that people remember your work.

Define concepts

(Text on screen: A millennial is anyone in our dataset between 25 to 34 years old in 2016)

As you may recall from video 2, project-specific definitions of key concepts may have been established before starting your analysis. It's worthwhile to include any relevant definitions in your written analysis, like our definition of a millennial. This will help the audience better understand your findings.

Avoid jargon and explain abbreviations

In your written analysis, avoid jargon and explain abbreviations clearly. For example, instead of using a statistical term such as synthetic birth cohort, explain your results in plain language. Define any acronyms that you use, like CSD, which stands for census subdivision, at the earliest possible opportunity.

Maintain neutrality

(Text on screen: Subjective - Large/small, High/Low, Only/A lot; Neutral - Rose or fell by X%, Higher or lower by X times.)

Ensure that you're maintaining neutrality by using plain language and not overstating your results or speculating when interpreting them. Avoid qualifiers like large, high, or only, which can be subjective, and focus on explaining things using neutral language.


Here are some examples that were not neutral and were improved by letting the data tell the story. Instead of "employment growth plummeted down by 2%", you can say "over the previous quarter, employment fell 2%, the largest decline in the past two years." The second statement maintains neutrality. Instead of "millennials are dealing with a significantly worse housing market and have a lot more debt", you can say "median mortgage debt for millennials aged 30 to 34 reached over 2.5 times their median after-tax income." Don't rely on exaggerations to make your point; stay neutral. These statements are robust and supported by the data.

Expect to make mistakes

Expect that you will make mistakes. It's a normal part of analytical work. Remember that you're the person most familiar with your project, which puts you in an ideal position to identify mistakes. When you complete your preliminary draft, leave it alone for a few days and review it with fresh eyes. Don't be afraid to ask others for help in correcting your errors, and remember that learning from your mistakes will strengthen your analytical skills.

Step 6: Disseminate your work

Next, we're going to review the last step, which is how to prepare your work for dissemination and communicate your findings successfully.

Ask others to review your work

An important part of preparing your work for dissemination is asking others to review your work. You can request feedback from a range of people such as colleagues, managers, subject matter experts and data or methodology experts.

Seek feedback on different aspects of your work

Ask your reviewers for feedback on different aspects of your work, such as the clarity of your analytical objectives, appropriateness of the data you've used, definition of concepts, review of literature, methodological approach, interpretation of your results and clarity and neutrality of your writing.

Organize and assess reviewers' comments

After receiving comments from your reviewers, organize and assess their feedback. Look for any concerns that are common across reviewers' comments and determine which concerns will require additional analysis. Make sure to clarify anything that reviewers struggled to understand.

Document how you addressed reviewers' comments

Document how you've addressed each of the reviewers' comments. If you're not able to address certain concerns, it's important to justify why. In some cases, your organization may require that you provide a formal response to reviewers' comments. However, even if this is not required, it is a best practice to make note of the decisions you make when revising your work.

Preparing your work for publication involves many people and processes

Typically many processes and many people are involved in helping to prepare your analytical product for dissemination. At Statistics Canada, analytical products undergo editing, formatting, translation, accessibility assessment, approval processes, and the preparation of a press release. You will want to consider the requirements that apply to your work, whether it's a briefing note, an infographic or information on your organization's website.

How your work is published depends on your intended audience

How your work is disseminated will depend on your intended audience. You need to think about who the intended audience is, what they already know and what they need to know. For example, the general public will want high-level key messages, while the media or policy analyst community will want more information, visuals and charts. Researchers, academics, or experts will want details about your data, methodology and the limitations of your work.

How your work is published depends on your intended audience: Media and the general public

For example, we often provide highlights visually through charts and infographics when communicating findings to the general public. For a study on the economic well being of millennials, the findings were communicated through Twitter, an infographic and a press release which summarized the key messages of the analysis.

How your work is published depends on your intended audience: Policy-makers

Other audiences such as policy makers may be interested in more detailed findings or a different venue where they can have their questions answered quickly. Results from the millennial study were shared with analysts and policy makers through a webinar, the publication of a study with detailed results, and other presentations.

How your work is published depends on your intended audience: Researchers, academics, experts

Findings are shared with researchers, academics or experts by publishing the analysis in detailed research papers or journal articles in peer-reviewed publications, as well as by presenting at conferences. This audience will be more invested in the specific details of the work and in knowing where the findings fit into the larger research field and knowledge base.

Communicating your work to the media requires preparation

Lastly, preparation is essential to successfully communicate your work to the media. Check to see if your organization offers media training. Prior to sharing your findings with the media, devote time to summarizing your main results and determining your key messages. Think about how to communicate your findings in simple terms. Anticipate potential questions and create a mock question and answer document.

Summary of key points

And that's a quick description of how to review and disseminate your work. First, tell the story of your process. Second, interpret your findings using clear and neutral language. Third, ask others to review your work. And fourth, preparation is key to communicating your findings. Remember to always stay true to your analytical question while telling a clear story. Next, take a look at our case study, where we provide an example of the analytical process through the lens of a study about neighborhood walkability and physical activity.

(The Canada Wordmark appears.)

What did you think?

Please give us feedback so we can better provide content that suits our users' needs.

Analysis 101, part 2: Implementing the analytical plan

Catalogue number: 892000062020010

Release date: September 23, 2020

By the end of this video, you will learn about the basic concepts of the analytical process:

  • the guiding principles of analysis,
  • the steps of the analytical process and
  • planning your analysis.
Data journey step
Analyze, model
Data competency
Data analysis
Audience
Basic
Suggested prerequisites
Analysis 101, part 1: Making an analytical plan
Length
6:11
Cost
Free

Watch the video

Analysis 101, part 2: Implementing the analytical plan - Transcript

(The Statistics Canada symbol and Canada wordmark appear on screen with the title: "Analysis 101, part 2: Implementing the analytical plan")

(The Statistics Canada symbol and Canada wordmark appear on screen with the title: "Analysis 101, part 2")

Implementing the analytical plan (Analysis 101: Part 2)

Hi, welcome to Analysis 101, video 2. Make sure you've watched video 1 before you start because we're diving right back in. Now that we've learned how to plan an analytical project, we'll discuss best practices for implementing your plan.

Learning goals

In this video you will learn how to implement your analytical plan. The key steps in implementing your plan include preparing and checking your data, performing your analysis and documenting your analytical decisions.

Steps in the analytical process

(Diagram of 6 images representing the steps involved in the analyze phase of the data journey where the first steps represent the making of an analytical plan, the middle steps represent the implementation of said plan and the final steps are the sharing of your findings.)

In the first video we went through how to plan your analysis. In this video we'll go through how to implement your plan. If you've been watching the data literacy videos by Statistics Canada, you'll recognize that this work is part of the third step, which is the analyze phase of the data journey.

Step 3: Prepare and check your data

The first step in implementing your plan is to prepare and check your data. Preparing and checking your data will make your analysis more straightforward and rigorous.

Define your concepts

Start by defining your concepts. In our previous example, which examined the economic status of millennials, we needed to determine how we would define millennials. In the literature, we found no official definition for that generation, but many different recommendations. It's important to make an analytical decision that's meaningful and defendable, to apply it consistently and to document your decision. In this paper, millennials were defined as those aged 25 to 34 in 2016, an age group that aligns with our typical definition of young workers.

Clean up the variables and the dataset

Now that the concepts are clear, we'll start digging into the data. Start by cleaning and preparing your dataset. You'll want to rename the variables so that they are meaningful and formatted in a consistent manner. For example, rather than using the name Var 3, which is confusing, we rename the variable "highest degree earned", which is much clearer. The effort you invest at this step will serve to make your life easier as you proceed with your analysis, especially if you document your decisions well.

Check your data

(Table presenting the economic well-being research by generation where the left column represents the generational groups. The middle and right columns represent the average age in 1999 (Gen-Xers = 26 years old & Millennials = 14 years old) and 2016 (Gen-Xers = 43 years old & Millennials = 66 years old), respectively.)

At this stage, check your data to ensure that it's of the highest quality. For our example, we should check the average age by generational group to make sure there is no issue with how age is calculated. The average age for Generation X is 26 years old in 1999 and in 2016 their average age is 43. This makes sense. However, while millennials are 14 years old on average in 1999, they are 66 on average in 2016. In this case we should check our program code, examine the data to fix the error, and document why this error occurred.
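A minimal sketch of this kind of check is shown below, assuming the survey data are in a pandas DataFrame; the file name and the column names ("generation", "year", "age") are hypothetical placeholders, not the actual study variables.

    # Minimal data check (hypothetical file and column names):
    # compute average age by generational group and survey year,
    # then eyeball the result for implausible values.
    import pandas as pd

    df = pd.read_csv("survey_1999_2016.csv")          # placeholder file name
    avg_age = df.groupby(["generation", "year"])["age"].mean()
    print(avg_age)
    # If millennials average 66 in 2016 when they should be near 30,
    # revisit the code that derives age and document the fix.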

Data checks throughout your analysis

To add rigor to your analysis, there are data checks that you should perform at different stages. In the early stages you can check the raw data to ensure that it's clean and ready for analysis. You can also check the frequency distributions of the variables to ensure that the data are consistent with past datasets. Then as you are checking the results of your analysis, you can verify whether your findings are consistent with the literature. All of this work should be done in well documented code that is saved for future reference.

Step 4: Perform the analysis

The second step in implementing your plan is to perform the analysis. As discussed in video one, your analysis should be planned out when creating your analytical plan. So once your data are clean and prepared, you're ready to perform the analysis.

Implementing your plan

Performing the analysis should be straightforward if you created a clear analytical plan and cleaned and prepared your data appropriately. You should conduct your analysis as planned and, as discussed previously, check your results as you go to ensure that the data and methods you are using are producing valid results. Another benefit of checking your results as you go is that you can flag unexpected findings.

Be flexible

If you have unexpected results, this may be due to an error in the data, or it might be some unexpected research finding. Be flexible and adjust your analytical plan to further investigate results that are not in line with your expectations or do not match up with theory. We will see an example of this in the case study video where additional analysis was necessary to disentangle a complex relationship.

Summary of key points

And that is a quick overview of how to implement your analytical plan. This involves preparing and checking your data. And then performing the analysis. Throughout this work, make sure to document your decisions. In the next video you'll be learning about interpreting and sharing your work.

(The Canada Wordmark appears.)

What did you think?

Please give us feedback so we can better provide content that suits our users' needs.

Video - Geoprocessing Tools (Part 1)

Catalogue number: 89200005

Issue number: 2020017

Release date: November 24, 2020

QGIS Demo 17

Geoprocessing Tools (Part 1) - Video transcript

(The Statistics Canada symbol and Canada wordmark appear on screen with the title: "Geoprocessing Tools (Part 1)")

So today we'll introduce geoprocessing tools, which enable layers to be spatially overlaid and integrated in a variety of ways. These tools epitomize the power of GIS and geospatial analysis, facilitating combining feature geometries and attributes, whether it be assessing spatial relations, distributions or proximities between layers and associated variables of interest. We'll demonstrate these tools with a simple case-study, examining land-cover conditions near water features, also known as riparian areas, in southern Manitoba. These tools can be reapplied and iterated with multiple layers, enabling you to combine, analyse and visualize spatial relations between any variables, geometries and layers of thematic relevance to your area of expertise.

So first, the Merged Census Division feature from the AOI layer was selected and subset to a new layer – CAOI – since Selected Features is not available when running tools as a batch process.

In addition to the interactive and attribute selection tools covered previously, there is one final type – Select by Location. This selects features from the input layer according to its spatial distribution relative to a second layer and the selected geometric predicates. The predicates define the particular spatial relations used when selecting features. We'll use Intersects, Overlaps and Are Within. Multiple predicates can be used, provided they do not conflict. And processing times increase with the number of selected predicates. At the bottom, the alternative selection options are available in the drop-down, but we'll run with the default.
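For readers who prefer to script these steps, the same selection can be reproduced from the QGIS Python console using the processing framework. This is a minimal sketch: the input layer name is a placeholder, and the algorithm ID and predicate codes reflect recent QGIS 3.x releases and may vary between versions.

    # Minimal sketch: Select by Location with the Intersects, Overlaps and Are Within predicates.
    # Predicate codes (recent QGIS 3.x): 0 = intersect, 5 = overlap, 6 = are within.
    import processing
    from qgis.core import QgsProject

    input_lyr = QgsProject.instance().mapLayersByName("input_layer")[0]  # placeholder name
    caoi = QgsProject.instance().mapLayersByName("CAOI")[0]              # comparison layer

    processing.run("native:selectbylocation", {
        "INPUT": input_lyr,
        "PREDICATE": [0, 5, 6],   # intersects, overlaps, are within
        "INTERSECT": caoi,
        "METHOD": 0               # 0 = create a new selection
    })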

So most selected features match the predicates but two spatially disconnected features were also returned due to a common attribute. So now we'll use the Multipart to Singlepart tool to break the multi-polygons into separate features, running with Selected Features Only.

Now we'll use a slight variation of Select by Location - Extract by Location. Instead of creating feature selections in our input layer, this will generate a new layer. So matching the predicates and comparison layer to those used in Select by Location, we'll click Run. In addition there is also Join by Location, which enables fields from the second layer to be joined to the first according to the predicates and the specified join type – as one-to-one or one-to-many. So these by Location tools enable features to be selected or extracted and field information joined between layers according to their relative spatial distributions.
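As a scripted equivalent, Extract by Location takes the same predicates but writes a new layer rather than a selection. The sketch below assumes the same placeholder layer names as above; Join by Location is available as a separate algorithm (native:joinattributesbylocation) and is omitted here for brevity.

    # Minimal sketch: Extract by Location creates a new layer instead of a selection.
    import processing
    from qgis.core import QgsProject

    input_lyr = QgsProject.instance().mapLayersByName("input_layer")[0]  # placeholder name
    caoi = QgsProject.instance().mapLayersByName("CAOI")[0]

    result = processing.run("native:extractbylocation", {
        "INPUT": input_lyr,
        "PREDICATE": [0, 5, 6],          # same predicates as Select by Location
        "INTERSECT": caoi,
        "OUTPUT": "TEMPORARY_OUTPUT"
    })
    extracted = result["OUTPUT"]          # in-memory layer holding the matching features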

So now we'll merge the land-cover 2000 layers into one file with the Merge Vector Layers tool. Open the Multiple Selection box and select the four land-cover files. We'll also switch the Destination Coordinate Reference System to WGS84 UTM Zone 14 for spatial analysis. Click run with a temporary file. So merge can be applied to vectors of the same geometry type. It works best when layers contain the same fields and cover distinct yet adjacent areas – making the land-cover layers highly suitable. Two additional fields specifying the originating layer and file path for each of the features are included in the output.
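The merge can also be scripted. In this sketch the four tile names are placeholders; WGS 84 / UTM zone 14N corresponds to EPSG:32614.

    # Minimal sketch: merge the four land-cover tiles and set the destination CRS.
    import processing
    from qgis.core import QgsProject, QgsCoordinateReferenceSystem

    tile_names = ["LC2000_a", "LC2000_b", "LC2000_c", "LC2000_d"]   # placeholder tile names
    tiles = [QgsProject.instance().mapLayersByName(n)[0] for n in tile_names]

    merged = processing.run("native:mergevectorlayers", {
        "LAYERS": tiles,
        "CRS": QgsCoordinateReferenceSystem("EPSG:32614"),   # WGS 84 / UTM zone 14N
        "OUTPUT": "TEMPORARY_OUTPUT"
    })["OUTPUT"]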

While Merge is running, we'll reproject the watershed layer to the same Coordinate Reference System for consistency in our spatial analysis.

Now we'll join the provided classification guide with the class names to the merged output, using the Joins tab. So "code" is the Join Field and COVTYPE is the Target Field. We'll join the Class field and remove the prefix. Now we can run the merged layer through the Fix Geometries tool to accomplish two tasks simultaneously. First it will fix invalid geometries – critical for adding spatial measures and applying geoprocessing tools – while also permanently joining the Class fields. The process may take a few minutes to complete.

 So now we'll rename the Reprojected and Fixed layers to PTWShed for projected tertiary watershed and FMLC2000 for fixed merged land-cover 2000. This will enable us to use the autofill settings to populate the file paths and names when running Clip as a Batch Process. So open Clip from the Toolbox and click Batch Process.

As we've covered, the Clip tool helps standardize the extent of analysis for multiple layers to an area of interest, or reduce processing times and file sizes in a workflow. The inputs can be of any geometry type while the Overlay Layer is always a polygon. Features and attributes that overlap with the Overlay Layer are retained, with the Overlay Layer acting like a cookie cutter on the input.

So select FMLC2000 and PTWShed as the inputs and select CAOI as the Overlay Layer. We can then copy and paste it into the next row – which we could repeat for as many entries as required. We'll click the plus icon and copy PTWShed for the Input to prepare this layer for an upcoming demo. Here we'll use Manitoba Outline as the Overlay layer. For the output files, we'll store them in a Scratch folder for intermediary outputs in our workflow, which can then be deleted at the end of Part 2 of the demo. Enter C for the filename, click Save, and then use Fill with Parameter Values in the Autofill settings drop-down. This adds a C prefix to our existing layer names. We'll store the last file in the Geoprocessing folder so that it is retained. Click Run and we'll pick back up once completed. The process takes around five minutes to complete.
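The batch run is equivalent to looping the Clip algorithm over the inputs in the Python console. In this sketch the output folder is a placeholder path, and the third batch entry (clipping PTWShed to Manitoba Outline) is omitted for brevity.

    # Minimal sketch: clip FMLC2000 and PTWShed to the CAOI boundary (batch-style loop).
    import processing
    from qgis.core import QgsProject

    caoi = QgsProject.instance().mapLayersByName("CAOI")[0]
    for name in ["FMLC2000", "PTWShed"]:
        layer = QgsProject.instance().mapLayersByName(name)[0]
        processing.run("native:clip", {
            "INPUT": layer,
            "OVERLAY": caoi,                                  # acts as the cookie cutter
            "OUTPUT": "C:/GIS/Scratch/C" + name + ".gpkg"     # placeholder scratch path
        })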

So with the clipped layers complete, load them into the Layers Panel. I'll move them back into the Processing Group for organization purposes and then zoom in on the layers.

We can load the provided symbology file to visualize the different land-cover classes.

Then we'll add an area field to the clipped land-cover file. Call it FAreaHA for field area, using a decimal field type with a length of 12 and a precision of 2. We'll reuse these parameters for adding subsequent numeric fields. Enter the appropriate expression - $area divided by 10000.
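A scripted version of this Field Calculator step might look like the sketch below. The layer name CFMLC2000 is assumed (the C-prefixed clipped land-cover), and geometry().area() returns the planar area in the layer's UTM metres, which corresponds only approximately to $area when ellipsoidal measurement is enabled in the project.

    # Minimal sketch: add FAreaHA (area in hectares) to the clipped land-cover layer.
    from qgis.core import QgsProject, QgsField
    from qgis.PyQt.QtCore import QVariant

    layer = QgsProject.instance().mapLayersByName("CFMLC2000")[0]   # assumed layer name
    layer.startEditing()
    layer.addAttribute(QgsField("FAreaHA", QVariant.Double, "double", 12, 2))
    layer.updateFields()
    idx = layer.fields().indexOf("FAreaHA")
    for f in layer.getFeatures():
        # planar area in square metres divided by 10,000 gives hectares
        layer.changeAttributeValue(f.id(), idx, f.geometry().area() / 10000.0)
    layer.commitChanges()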

Now we'll use Select by Expression to isolate 'Water' features using "COVTYPE" = 20 or "Class" LIKE 'Water' – and then click Select Features.
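The same attribute selection can be made from the console; the algorithm ID below is the one used in recent QGIS 3.x releases, and the layer name is again assumed.

    # Minimal sketch: select water features by attribute expression.
    import processing
    from qgis.core import QgsProject

    layer = QgsProject.instance().mapLayersByName("CFMLC2000")[0]   # assumed layer name
    processing.run("qgis:selectbyexpression", {
        "INPUT": layer,
        "EXPRESSION": '"COVTYPE" = 20 OR "Class" LIKE \'Water\'',
        "METHOD": 0    # 0 = create a new selection
    })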

Now we'll generate a Buffer around the selected features to begin creating the Riparian area layer. There are many Buffer tools available in the Processing Toolbox – which we'll demonstrate in Part II – here using the default tool.

We'll check 'Selected features only' box and enter 30 for the distance – a common riparian setback in land-use planning and policies. Change the End Cap Style to Flat and check Dissolve Results, so that any overlapping buffers are merged to avoid conflating total area estimates. Run with a temporary output file. We'll rerun the tool toggling back to the Parameters and changing the distance to 0, to output Water features as their own temporary layer – reducing processing times for the next tool.
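Scripted, the 30 metre buffer on the selected water features looks like the sketch below. The enum codes (END_CAP_STYLE 1 = flat) and the use of QgsProcessingFeatureSourceDefinition for "selected features only" reflect recent QGIS 3.x behaviour; rerunning with DISTANCE set to 0 reproduces the second, water-only output described above.

    # Minimal sketch: 30 m dissolved buffer of the selected water features, flat end caps.
    import processing
    from qgis.core import QgsProject, QgsProcessingFeatureSourceDefinition

    layer = QgsProject.instance().mapLayersByName("CFMLC2000")[0]   # assumed layer name
    buffered = processing.run("native:buffer", {
        "INPUT": QgsProcessingFeatureSourceDefinition(layer.id(), True),  # selected features only
        "DISTANCE": 30,          # metres, since the layer is in UTM zone 14
        "SEGMENTS": 5,
        "END_CAP_STYLE": 1,      # 1 = flat
        "JOIN_STYLE": 0,         # 0 = round
        "MITER_LIMIT": 2,
        "DISSOLVE": True,        # merge overlapping buffers to avoid double counting area
        "OUTPUT": "TEMPORARY_OUTPUT"
    })["OUTPUT"]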

Buffer tools can be applied to any vector geometry type. And they are used to assess the proximity of features to those in other layers. We can also use buffers to facilitate combining our geometries and attributes with other layers – like buffering lines or points to use them as a difference layer. The buffer contains the input layer's attributes, which can be used for further analysis. The outputs are often applied with other geoprocessing tools for further examination.

So we'll rename the outputs, naming the first B30W and the second LC2000Water, to facilitate their distinction.

Zooming in on the buffer, the input water features were also included in the output geometry. Since we are not interested in water features but the land-cover conditions around them, we'll run the water buffer through the Difference tool using LC2000Water as the Overlay Layer to retain only the buffered area. So Difference is the opposite of Clip – retaining only input features that do not overlap with the Overlay layer. Like Clip, the input can be any geometry type, while the overlay layer is always a polygon. Difference can be used whenever we are interested in features that do not overlap with a specific polygon, such as areas beyond a certain drive time or distance from hospitals, or farm fields, roads or grain elevators not impacted by historical flooding. So click Run and we'll continue once the output is complete.
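A scripted equivalent of this step, using the renamed layers from above, is sketched below.

    # Minimal sketch: subtract the water polygons from the 30 m buffer,
    # keeping only the riparian ring around the water features.
    import processing
    from qgis.core import QgsProject

    b30w = QgsProject.instance().mapLayersByName("B30W")[0]          # 30 m buffered water
    water = QgsProject.instance().mapLayersByName("LC2000Water")[0]  # water features (0 m buffer)
    riparian = processing.run("native:difference", {
        "INPUT": b30w,
        "OVERLAY": water,
        "OUTPUT": "TEMPORARY_OUTPUT"
    })["OUTPUT"]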

Toggling the water layer off, we can see that the Difference has retained only our 30 metre buffer. So now we've successfully generated our riparian area layer but need to follow up with the Intersection tool – running it twice to extract watershed codes and land-cover classes to our layer. Intersection retains the overlapping feature geometries of the input layers and any selected attributes of interest in the Fields to Keep parameter. If geometry types differ between layers, the first layer's geometry is used in the output. Thus, Intersection can help combine variables of interest from multiple layers.

For the first run we'll use the Difference and clipped watershed layers as the inputs to assign watershed codes to the riparian buffer. This will enable us to examine land-cover conditions by watershed in Part II of the demo. And for PTWShed check the sub-basin code field in the Multiple Selection box. For the Difference layer, we'll select an arbitrary field for the Fields to Keep parameter – here selecting the "layer" field, clicking OK and then clicking Run. This process takes around 5 minutes and we'll continue when complete.

Within the Attribute Table we can see watershed codes have been successfully assigned to the riparian layer. Now we'll run the tool again, using the intersect as the Input and the clipped land-cover file as the Overlay layer to integrate the land-cover features in the riparian areas. We'll retain the watershed code field from the first layer and the "Class" and "FAreaHA" fields from the land-cover. We'll save it to file, storing it in the main geoprocessing folder and calling it RipLC2000 for riparian land-cover 2000. If the tool fails, use Fix Geometries tool and rerun the Intersection with the fixed outputs. We'll pick back up after the layer is created, which may take up to 20 minutes.
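The second Intersection run can be scripted as below. The name of the first intersection output and the watershed code field ("SUBBASIN") are assumptions based on the steps described above, and the output path is a placeholder.

    # Minimal sketch: intersect the riparian buffer (with watershed codes) and the
    # clipped land-cover, keeping only the fields of interest from each layer.
    import processing
    from qgis.core import QgsProject

    rip_ws = QgsProject.instance().mapLayersByName("Intersection")[0]   # assumed name of first run's output
    lc = QgsProject.instance().mapLayersByName("CFMLC2000")[0]          # clipped land-cover
    processing.run("native:intersection", {
        "INPUT": rip_ws,
        "OVERLAY": lc,
        "INPUT_FIELDS": ["SUBBASIN"],              # watershed code field (assumed name)
        "OVERLAY_FIELDS": ["Class", "FAreaHA"],
        "OUTPUT": "C:/GIS/Geoprocessing/RipLC2000.gpkg"   # placeholder output path
    })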

With the riparian land-cover layer loaded, copy and paste the style from the clipped land-cover to visualize the different feature classes occupying these areas. Now we've successfully combined the riparian buffer by watershed with the land-cover layer. And for the final component of Part I we'll add four new fields with the Field Calculator, starting with the intersected area in hectares, to determine the area of each land-cover feature within the buffered riparian area. Use the same parameters and expression as applied for creating the FAreaHA field.

So next we'll calculate the percentage of each feature within the 30 metre buffer, to assess the relative distribution of the original features within the riparian setback and isolate any potential violating land-uses. We'll call the field PrcLCinRip, for percent land-cover in riparian area, with the same parameters as the previous fields. Expanding the fields drop-down, we'll divide IAreaHA by FAreaHA and multiply by 100.

The next two fields are to create an identifier which combines the subwatershed codes and land-cover class fields which we'll use to aggregate and assess riparian land-cover by watershed. First is an FID field or FeatureID, which we'll use for the Group_By parameter when using the concatenate function. Leave the parameters in their defaults and double-click the @row_number expression.

Now we can use Concatenate to combine our fields in creating the ID. This is extremely helpful for further processing and analysis, such as distinguishing and rejoining different processed layers to original features or aggregating datasets by different criteria. So we'll change to a text field type with a length of 100 and call it "UBasinLCID".

So type concatenate in the expression box – specifying the function to apply, and then open bracket and double-click SUBBASIN in the fields and values drop-down. Using the separators and adding a dash in single quotes will help separate the codes and class fields for interpretability. As noted, the FID field is used for the Group_By parameter, writing group underscore by, colon, equal sign and then double-clicking the FID field.
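The transcript describes this expression interactively rather than writing it out. One plausible written form, with the field names taken from the earlier steps, is shown below as a Python string for reference; the exact expression used in the demo may differ.

    # Hypothetical reconstruction of the Field Calculator expression for UBasinLCID.
    # concatenate() is a QGIS aggregate function; group_by:="FID" keeps one value per feature,
    # and the dash in single quotes separates the watershed code from the land-cover class.
    ubasin_lc_id_expr = 'concatenate("SUBBASIN" || \'-\' || "Class", group_by:="FID")'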

We can see the combined fields in the output preview. Given the number of features, the concatenate function can take up to 30 minutes to run. After it's complete, be sure to save the edits to the layer and the project file with a distinctive name for use in Part II of the demo.

(The words: "For comments or questions about this video, GIS tools or other Statistics Canada products or services, please contact us: statcan.sisagrequestssrsrequetesag.statcan@canada.ca" appear on screen.)

(Canada wordmark appears.)