- Message from the Chief Statistician
- Background
- Our journey
- Statistics Canada Open Science Action Plan
- Governance
"Solutions to the great challenges we face require more knowledge, more science and more applications to technology. Opening up federal science will help pave a quicker path to discovery and at the same time ensure that the results of research paid for by Canadians is fully available to them."
Dr. Mona Nemer, Chief Science Advisor of Canada
Message from the Chief Statistician
As Canada's national statistical agency, our mission is to provide high-quality statistical information that helps Canadians better understand their country: its population, resources, economy, society and culture. Applied research is an essential function of an effective statistical organization, allowing us to stay relevant and become more so over time.
The research conducted by the agency ranges from methodological research that develops new and innovative statistical techniques, to in-depth analysis that enhances the agency's vast data holdings. By identifying driving issues, presenting possible future scenarios, and developing new data sources to fill gaps, Statistics Canada research informs decision making by all levels of government, the private sector and Canadians more broadly.
Statistics Canada is committed to open government, working actively to make its data and research accessible and available to a wide range of users. These practices benefit Canadians by accelerating the pace of discovery and innovation and helping decision makers "see sooner and act faster." In February 2020, the Chief Science Advisor of Canada released the Roadmap for Open Science, calling on science-based departments to take the next steps to make federal science open to all, while respecting privacy, security, ethical considerations and appropriate intellectual property.
I am pleased to present the Open Science Action Plan for Statistics Canada. The plan outlines our ongoing and future actions to further advance open science in four key areas: FAIR (findable, accessible, interoperable, reusable) and open data, open publications, open communications, and open code. The plan will serve to make both our research and science more open, as well as support and advance open science among the many research communities we serve.
Anil Arora
Chief Statistician of Canada
Background
To be an effective national statistical organization, Statistics Canada relies on applied research to inform how it collects, manages and transforms data into value-added statistical information. The agency is actively engaged in conducting research in a range of subject-matter areas, including the economy, society, health, and themes in which these areas intersect.
In August 2021, Statistics Canada released the Multi-year Consolidated Plan for Research, Modelling and Data Development, 2021 to 2023 outlining research and modelling priorities over the next two years. The implementation of the plan is led by the Analytical Studies and Modelling Branch, the research arm of Statistics Canada. Agency researchers are subject-matter experts who leverage a broad range of data sources and modelling techniques to address the information needs of government, academic and public sector partners and stakeholders.
Research is also undertaken by agency statisticians and methodologists to develop new and innovative statistical techniques to gather, integrate and model data. The Modern Statistical Methods and Data Science Branch sponsors the Methodology Research and Development Program (MRDP). This program covers research and development activities in statistical methods with potentially broad application in the agency's statistical programs; these activities would otherwise be less likely to be carried out during the provision of regular methodology services to those programs. The MRDP also includes activities that provide client support in the application of past successful developments in order to promote the use of the results of research and development work.
The agency also supports government and academic researchers in their efforts to advance scientific research through its data access program. Accredited researchers can access a wide variety of anonymized microdata for research purposes, including social and business surveys, administrative data, and linked datasets. Researchers access the microdata in research data centres located in universities across Canada or secure environments located in federal and provincial government buildings. Starting in November 2021, microdata for selected government research projects may be accessed in Statistics Canada's Virtual Data Lab in the cloud environment. Biospecimen (blood, urine and DNA) samples from the Canadian Health Measures Survey are also available for approved research initiatives through the Canadian Health Measures Survey Biobank.
Additionally, the data access program's Data Liberation Initiative, a partnership with postsecondary institutions, allows faculty and students unlimited access to numerous public use data and geographical files. Aggregated data and public use microdata files are available to researchers through the Statistics Canada website and the Government of Canada Open Data Portal.
Statistics Canada is committed to the principles of open science as part of its broader commitment to enabling an open, transparent and accountable organization while ensuring rigorous data protection standards (for more information, see Statistics Canada's Trust Centre). Given its role as both a producer of science and a supporter of scientists, Statistics Canada is joining other science and research-based departments in responding to the Roadmap for Open Science. Released in February 2020 by the Chief Science Advisor of Canada, the roadmap outlines a set of overarching principles and recommendations to make federal science open and available to all Canadians, including the development of a departmental open science action plan.
Our journey
The development of the Statistics Canada Open Science Action Plan (OSAP) was led by the Open Science and Scientific Integrity Working Group (OS-SIWG), established with representation from the main research groups within the agency (i.e., Analytical Studies and Modelling Branch and Modern Statistical Methods and Data Science Branch). From the outset, the goal was to ensure alignment between open science and scientific integrity actions and processes. The OS-SIWG was guided by the Open Science-Scientific Integrity Steering Committee, composed of the branch directors general and the assistant chief statistician.
The development of the Statistics Canada OSAP was guided by the following goals:
- highlight the work Statistics Canada is already doing to make its data and research open
- identify areas where it could do more to enable open science
- reflect the views and aspirations of researchers
- align future activities with initiatives already underway or planned at the agency to support open science practices, including scientific integrity
- ensure the action plan is aligned with policies, procedures and legislation that govern Statistics Canada data and activities, including the principles of openness and transparency
- position the action plan as evergreen, ensuring updates align with agency directions and developments on open science within the Government of Canada.
The activities of the OS-SIWG to develop the action plan can be summarized as follows:
- Environmental scan – An internal environmental scan was conducted to identify policies, processes and activities currently underway at Statistics Canada that align with the four pillars of the action plan. Information on what is already being done and what more could be done to advance open science was documented.
- Communication and engagement – The OS-SIWG created internal communication products to inform Statistics Canada employees about open science activities. Details of the action plan were presented to various committees for feedback.
- Consultations with researchers – As per Recommendation #2 of the Roadmap for Open Science, the OS-SIWG conducted consultations with researchers within the agency. In spring 2020, a survey identified the current practices of more than 150 researchers across the agency. This information provided a better understanding of the challenges and opportunities related to open data, open publications, open communications and open code. It also produced ideas for new activities and supports required to overcome barriers to open science.
Statistics Canada Open Science Action Plan
The Statistics Canada OSAP is built on four key pillars: FAIR (findable, accessible, interoperable, reusable) and open data, open publications, open communications and open code. It is supported by an established governance structure. The plan reflects the policies, processes and activities that the agency already engages in to support open science and identifies new directions and activities that the agency commits to undertake over the next several years.
FAIR and open data
Federal departments and agencies should develop strategies and tools to implement FAIR data principles to ensure interoperability of scientific and research data and metadata standards by January 2023, with a phased plan for full implementation by January 2025.
(Roadmap for Open Science, 2020; Recommendation #5)
What is Statistics Canada currently doing?
- Statistics Canada continues to expand its mandate to make its data more findable and available to researchers through the Statistics Canada website, the Open Data Portal and the research data centres. Data on the agency's website, including public use microdata files, are now being offered in interoperable formats (i.e., CSV and SDMX). These data and files are also being made accessible through a Web data service (open application programming interfaces [APIs]), which increases Canadians' access to data while expanding the interoperability of Statistics Canada's information.
- The Enterprise Information and Data Management (EIDM) project that began in the spring of 2020 is implementing a holistic information management (IM) vision across the agency. This vision is anchored to specific IM principles that support a modern and comprehensive process to manage data and information. Moving forward, Statistics Canada will be adhering to FAIR data principles as it works to establish an integrated system for structuring, describing and governing information assets across the agency. This work will continue to improve the agency's efficiency, promote transparency and help enable further data insights for Canadians.
- The concrete business outcomes of the EIDM project are to increase the agency's statistical capacity, trust and user access, while aiming to reduce risk (both external and internal) to the agency.
- Various international data standards, tools and services are applied to information at the agency, along with metadata and data strategies that collectively improve its processes and facilitate analyses. Continuing to adhere to international data standards makes it easier for the agency to create, share and integrate its data with data sharing partners and other forms of publicly available data. The adoption of new standards will ensure data interoperability, an important tenet of FAIR. This will reduce the need for the agency to clean or transfer data it receives into usable forms. To date, the agency has adopted the following FAIR-based standards: SDMX (Statistical Data and Metadata eXchange), DDI (Data Documentation Initiative) and DCAT (Data Catalog Vocabulary).
- The agency is continually updating policies to reflect the current statistical environment. For example, the use of FAIR data was underscored and referenced throughout the new Policy on Information Resource Management. This particular policy enshrines how the agency will protect the authenticity, reliability, integrity and usability of its information and data, over time.
- Metadata management will also be leveraged to cumulatively increase the value of Statistics Canada data. The agency will use metadata to build up its corporate repositories (data storage receptacles) so that its information and data are reliably findable, accessible, interoperable, reusable, reproducible and only open to those requiring access.
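As an illustration of how standard metadata can make a dataset findable and interoperable, the following is a minimal sketch of a dataset description that uses terms from the DCAT and Dublin Core vocabularies, laid out as JSON-LD. It is not the agency's actual catalogue format: the function name, identifier, title and URLs are hypothetical values chosen for the example.

```python
import json

# Namespace prefixes for the W3C DCAT and Dublin Core Terms vocabularies.
CONTEXT = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
}


def build_dataset_record(identifier, title, description, distributions):
    """Return a DCAT-style dataset description as a JSON-LD-shaped dict.

    `distributions` is a list of (media_type, download_url) pairs,
    one pair per file format in which the dataset is offered.
    """
    return {
        "@context": CONTEXT,
        "@type": "dcat:Dataset",
        "dct:identifier": identifier,
        "dct:title": title,
        "dct:description": description,
        "dcat:distribution": [
            {
                "@type": "dcat:Distribution",
                "dcat:mediaType": media_type,
                "dcat:downloadURL": url,
            }
            for media_type, url in distributions
        ],
    }


# Hypothetical example values; not a real Statistics Canada table or URL.
record = build_dataset_record(
    identifier="example-table-0001",
    title="Example aggregate table",
    description="Illustrative dataset offered in two interoperable formats.",
    distributions=[
        ("text/csv", "https://example.org/data/example-table-0001.csv"),
        # SDMX-ML is XML-based, so a generic XML media type is assumed here.
        ("application/xml", "https://example.org/data/example-table-0001.xml"),
    ],
)

print(json.dumps(record, indent=2))
```

Because the record uses shared vocabulary terms rather than locally invented field names, a catalogue or data sharing partner that understands DCAT could read it without a custom mapping, which is the interoperability benefit described above.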
Moving forward…
Statistics Canada will further advance FAIR data by doing the following:
- Currently, Statistics Canada is building up its FAIR data infrastructure as part of a Data Analytics as a Service model to make its data and information products more easily findable and accessible to Canadians. Using an on-demand or "as a service" process, Statistics Canada is creating reusable and interoperable software components that will leverage service-oriented architectures and APIs to better deliver data and information to its scientific and research data communities. A standards-based data hub will be accessible to all users to help them seamlessly collaborate and ingest and manage data, metadata and paradata while adhering to FAIR.
- Another tool the agency is developing, called the FAIRness Assessment Process, will measure internal programs' FAIR maturity at any point in their development. This tool will identify programs' data and IM methods and quantify how well they are adhering to FAIR.
- The agency is also investigating how it can further protect respondents by masking identifiable markers on data files without reducing the efficiency or statistical quality of data. The aim of this process is to reduce the potential for unauthorized employees to access sensitive files (unintended or otherwise) while protecting its assets and respondents. Such a process would allow FAIR data to be created in a protected fashion.
- The agency is continually adopting new versions of open standards (such as SDMX) that will refine how it describes and exchanges both data and information (including geospatial data) with its data sharing partners and user community.
- Updates will continue to be made to various policies and governance instruments as other processes are brought online that relate to open standards, as well as information and data management.
- The agency is actively providing FAIR data training to its internal user community, Statistics Canada researchers and other federal government departments; this training outlines the significance of FAIR and how it relates to sound information and data management.
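The idea of quantifying how well a program adheres to FAIR can be sketched as a simple checklist score. The example below is purely illustrative and does not represent the agency's FAIRness Assessment Process: the checklist items, their names and the equal weighting are all assumptions made for the sketch.

```python
# Hypothetical checklist: these yes/no items are illustrative only and are
# not taken from any official FAIR assessment instrument.
CHECKLIST = {
    "findable": ["has_persistent_identifier", "has_catalogue_entry"],
    "accessible": ["has_documented_access_procedure"],
    "interoperable": ["uses_open_format", "uses_standard_metadata"],
    "reusable": ["has_licence", "has_provenance_notes"],
}


def fair_scores(answers):
    """Return the share of checklist items satisfied for each principle.

    `answers` maps item names to True/False; missing items count as False.
    """
    scores = {}
    for principle, items in CHECKLIST.items():
        met = sum(1 for item in items if answers.get(item, False))
        scores[principle] = met / len(items)
    return scores


# Hypothetical answers for one program at one point in its development.
example_answers = {
    "has_persistent_identifier": True,
    "has_catalogue_entry": True,
    "has_documented_access_procedure": True,
    "uses_open_format": True,
    "uses_standard_metadata": False,
    "has_licence": True,
    "has_provenance_notes": False,
}

scores = fair_scores(example_answers)
for principle, score in scores.items():
    print(f"{principle}: {score:.0%}")
```

Scoring each principle separately, rather than producing one overall number, keeps it visible where a program is weakest. A real assessment would likely use more detailed criteria and weights.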
Open publications
Federal departments and agencies should make federal science articles openly accessible by January 2022 and federal science publications openly accessible by January 2023, while respecting privacy, security, ethical considerations and appropriate intellectual property protection.
(Roadmap for Open Science, 2020; Recommendation #4)
What is Statistics Canada currently doing?
Statistics Canada researchers disseminate hundreds of research articles and publications per year addressing a range of social, economic and methodological questions. Research findings are published in both Statistics Canada publications and external peer-reviewed journals. There are several practices currently in place at the agency that support the open publication of its research:
- Peer and institutional review: To ensure quality and objectivity, research conducted at Statistics Canada undergoes peer and institutional review prior to dissemination in either internal or external journals. The review process is guided by the following policies:
  - Policy on Peer and Institutional Review: All interpretive information products, analytical products and methodological products for which Statistics Canada is solely or jointly responsible are subject to review prior to release outside the agency. The review should ensure that their content adheres to the generally accepted norms of good professional practice and that they are compatible with the agency's mandate.
  - Policy on Scientific Integrity, guidance for articles for external publications (7.5.1 to 7.5.5): The Policy on Scientific Integrity publishing guidelines provide further requirements for external publications, including the requirement for peer and institutional review; appropriate acknowledgement and affiliation for Statistics Canada researchers; publication in reputable open access journals; and appropriate management of conditions of publication, including copyright and intellectual property.
- Open publications: All Statistics Canada publications are openly available and free of charge on the Statistics Canada website. Publications are fully bilingual, meet accessibility requirements, and are available in both HTML and PDF. Examples of research publications produced by the agency include
- Open Licence: Access to Statistics Canada research is further enabled through Statistics Canada's Open Licence, allowing worldwide, royalty-free, non-exclusive licence to its publications, tables, graphs and reports, and enabling users to
  - use, reproduce, publish, freely distribute or sell the Information
  - use, reproduce, publish, freely distribute or sell value-added products
  - sublicence any or all such rights, under terms consistent with this licence.
Moving forward…
Statistics Canada will further advance the open publication of agency-led research by
- continuing to publish research in existing agency publications
- supporting the use of program budgets to pay publication fees to enable the publication of research in external open access journals, where appropriate
- developing a plan to ensure that research findings published in external peer-reviewed journals can be made available to Canadians in the official language of their choice, in accordance with the Official Languages Regulations, and accessible according to the Government of Canada Standard on Web Accessibility
- developing guidance for Statistics Canada to leverage the new Government of Canada Open Science Repository Platform as the whole-of-government solution to enable access to research published in external peer-reviewed journals.
Open communications
Statistics Canada will continue to use existing platforms to communicate research findings and explore additional opportunities to engage researchers in communicating their work to Canadians by March 2023.
Statistics Canada recognizes the importance of communicating its research and statistical insights among a broad range of stakeholder groups (i.e., academics, policy makers, public and private organizations, citizens). The agency uses a range of communication strategies and tools to raise awareness of its research and enable it to inform decision making.
What is Statistics Canada currently doing?
- Official release: As per Statistics Canada's Policy on Official Release, all new datasets, as well as analytical products and information products based on new datasets, are published in one of the agency's official release vehicles, such as The Daily. Official release vehicles are dissemination channels that have been approved by the Chief Statistician as official for the purposes of communicating new data products to Canadians. These channels are easily accessible and visible and ensure equitable access to the agency's statistical products. The Daily is released Monday to Friday at 8:30 a.m. Eastern time. It provides short, plain language descriptions of key research findings. Information on upcoming releases is provided via The Daily release schedule. Customized alerts can be generated through My StatCan for specific topics of interest.
- StatsCAN Plus: StatsCAN Plus is the first step in providing a platform through which the agency can release information at multiple times throughout the day. StatsCAN Plus was launched on November 15, 2021. This portal packages data and analyses in a way that is more accessible to the general public and business users. Through this, the power of data storytelling can be better leveraged. The ultimate aim of StatsCAN Plus is to bridge some of the identified gaps in user needs to ensure Statistics Canada continues to connect with Canadians.
- Mobile application: In April 2021, the agency began developing a mobile application to modernize the way Statistics Canada data are published and to keep pace with the way Canadians are accessing information and services through mobile devices, such as smartphones and tablets. The mobile app will respond to the ever-changing data landscape and to users' and stakeholders' requirements for more data, provided faster, and made available in multiple formats and from multiple access points. The agency is planning to launch it in early 2022.
- Social media: Statistics Canada uses a range of social media platforms to communicate its research and engage with a range of communities, including Twitter, Facebook, Reddit, Instagram, YouTube, and LinkedIn. The Web2Social team creates customized campaigns to help promote products and services to online audiences.
- Media: Researchers are encouraged to engage with media directly to communicate their research and address questions. The Media Relations team provides help to researchers through its "Encountering the Media" course. It also offers individual support to researchers with each release.
Moving forward…
Statistics Canada will further advance the open communication of agency-led research by
- leveraging existing communication processes, products and tools to communicate research findings
- expanding the existing communication products to include more plain language summaries to increase awareness of and access to Statistics Canada research
- developing guidance for researchers to encourage the use of personal social media accounts, where appropriate, for non-official communication to engage directly on their research with the external community (e.g., LinkedIn)
- developing training for researchers to build their knowledge translation and communication skills to better enable effective communication of their research.
Open code
Statistics Canada will develop guidelines, tools and training to support the open sharing of research and modelling code by March 2023.
What is Statistics Canada currently doing?
- The Open Source Office was established at Statistics Canada in May 2020 under the Enterprise Architecture, Strategy and Innovation Division to enable programs to modernize safely and securely through open source software (e.g., sharing and collaboration). The Office has developed guidelines and best practices on how to release software code on GitHub. The purpose of these guides is to support collaboration on code for projects maintained both by Statistics Canada and by external partners. The guides are currently in alpha version and include information on managing employee GitHub accounts, using GitHub for releasing code, contributing code to other open source projects and using organizational GitHub accounts.
- The Citizen Development Initiative was established to transition to a new, more open source way of working with an agency-wide approach and stream of information. Citizen development—or co-development—is defined as user applications for consumption by others, created by a broad community of practitioners from varying backgrounds using development and runtime environments supported by corporate IT. The ability to create and reuse open source code where possible, and integrate the latest automation, machine learning and visualization approaches already well established in the industry, is a foundational piece for providing better and more timely insights to Canadians. The Citizen Development Initiative is responsible for developing the Directives on Citizen Development, currently focused on production processes.
Moving forward…
Statistics Canada will further advance open code developed in the context of research by
- developing guidelines and tools to support the sharing of research and modelling code, when appropriate, based on best practices, and leveraging policies, processes and activities established at the agency to date
- developing training to support researchers seeking to make their code open and shareable.
Governance
Recommendation #7 - The Data Strategy Roadmap and the Open Science Action Plan should be aligned… To facilitate that, deputy heads should designate a Chief Scientific Data Officer by January 2021.
(Roadmap for Open Science, 2020)
The role of Chief Scientific Data Officer has been incorporated in the existing Statistics Canada P-Suite, which was established to manage risk and ensure compliance with a wide variety of external and internal acts and regulations, policies, directives, and standards. The principal accountability officers are the "gatekeepers", and provide the guardrails for the agency.
The role of Chief Scientific Data Officer has been incorporated with the role of the Principal Data Officer, which was established to provide governance and oversight, and to be an authoritative voice on all things related to the organization's data and information assets. The Chief Scientific Data Officer provides external leadership on data to help other organizations (and the Government of Canada as a whole) use data as a strategic asset.
The role of Open Science Champion was established to ensure that the Open Science Action Plan and related requirements and activities are communicated to all employees, and that progress is monitored. The Open Science Champion's responsibilities are to
- continue to develop and implement the additional procedures, policies, guidelines, tools, training and professional development opportunities necessary to support this policy
- ensure that alleged breaches of this policy are promptly and thoroughly reviewed and investigated by Statistics Canada
- abide by all accountabilities set by the central agencies and those delegated from the deputy head
- advance open science by addressing the recommendations of the Roadmap for Open Science while protecting Canadians' privacy and confidentiality as stated in the Statistics Act
- represent Statistics Canada on Government of Canada Open Science senior management committees.
The Chief Scientific Data Officer is currently André Loranger, Principal Data Officer, Assistant Chief Statistician.
The Open Science Champion is currently Yvan Clermont, Director General, Analytical Studies and Modelling Branch.