Data ethics: An introduction

Catalogue number: 892000062022001

Release date: May 24, 2022

In this video, you will be introduced to data ethics, why they are important, and the 6 guiding principles of data ethics implemented by Statistics Canada throughout the data journey.

Data journey step
Foundation
Data competency
  • Data security and governance
  • Data stewardship
Audience
Basic
Suggested prerequisites
N/A
Length
10:54
Cost
Free

Data ethics: An introduction - Transcript

(The Statistics Canada symbol and Canada wordmark appear on screen with the title: "Data Ethics An Introduction")

Slide 0: Data Ethics: An Introduction

Gathering, exploring, analyzing and interpreting data are essential steps in producing information that benefits society, the economy and the environment. To properly conduct these processes, data ethics must be upheld in order to ensure the appropriate use of data.

Slide 1: Learning Goals

(Text on screen: By the end of this video, you should have a better understanding of the following:

  • What "data ethics" means
  • Why data ethics are important
  • How Statistics Canada implements data ethics throughout the data journey)

In this video, you will be introduced to data ethics, why they are important, and the 6 guiding principles of data ethics implemented by Statistics Canada throughout the data journey.

Slide 2: Steps in the data journey

(Text on screen: Supported by a foundation of stewardship, metadata, standards and quality

Diagram of the Steps of the data journey: Step 1 - define, find, gather; Step 2 - explore, clean, describe; Step 3 - analyze, model; Step 4 - tell the story. The data journey is supported by a foundation of stewardship, metadata, standards and quality.)

This diagram is a visual representation of the data journey from collecting the data; to exploring, cleaning, describing and understanding the data; to analyzing the data; and lastly to communicating with others the story the data tell.

Slide 3: Steps in the data journey (Part 2)

Data ethics are relevant throughout all steps of the data journey.

Slide 4: What are data ethics?

So what are data ethics exactly? Data ethics allow data users to address questions about the appropriate use of data throughout all steps of the data journey.

This field of study is used to ensure collected data always have a specific purpose, and that each new project or data acquisition has the best interests of both society and the individual at heart.

Slide 5: There Are Lots Of Ways To Gather Data…

With the rapid growth of data associated with the digital age, data gathering approaches have also evolved.

Along with the more traditional survey-based approach, some alternative data gathering methods include:

  • Earth observation data;
  • Scanner data;
  • Administrative data.

Slide 6: … And Transform Data To Information

These data are then used to create useful information such as statistics, and to train algorithms for artificial intelligence and machine learning. But with big data comes big responsibility…

Slide 7: Responsibility to address ethical challenges such as:

When deciding to embrace such evolving data gathering methods as administrative sourcing, web scraping, apps and crowdsourcing, there is a responsibility to maintain focus on such perennial ethical challenges as:

  • Protecting privacy and confidentiality
  • Balancing privacy intrusion vs public good
  • Recognizing the potentially harmful impacts of using biased data
  • Ensuring data quality to avoid misinformation

Slide 8: Statistics Canada's 6 Guiding Principles of Data Ethics

There are many ways to address these ethical challenges. At Statistics Canada, we use the following 6 guiding principles:

  • Data are used to benefit Canadians
  • Data are used in a secure and private manner
  • Data acquisitions and processing methods are transparent and accountable
  • Data acquisitions and processing methods are trustworthy and sustainable
  • The data themselves are of high quality
  • Any information resulting from the data is reported fairly and does no harm

Let's look at these principles in more detail.

Slide 9: Benefits To Society

Benefits to society means that statistical activities must allow governments, businesses and communities to make informed decisions and manage resources effectively, ultimately aiming to clearly benefit the lives of Canadians.

Slide 10: Benefits To Society - Example

A census of population is fundamental to any country's statistical infrastructure. In Canada, the census is currently the only data source that provides high-quality population and dwelling counts based on common standards and at low levels of geography, as well as consistent and comparable information on various population groups.

Slide 11: Privacy and Security

(Text on screen: It is important to find a balance between respecting privacy and producing information

  • Ensure statistical activities are not intruding into the lives of Canadians any more than necessary
  • Always justify whatever intrusion might be considered necessary

It is also important to consider the practical aspects of security, and how potential breaches may affect the well-being of Canadians).

When statistical activities require personal information, the consideration of both privacy and security is mandatory. The appropriate measures must always be taken in order to protect personal information while still ensuring the data can be used to create meaningful information.

Firstly, there is a fine balance between respecting privacy and producing information. Projects that intrude into the private lives of Canadians must justify why this information is important enough to warrant this intrusion, and be able to explain how using these data will ultimately provide benefits. In other words, we must ensure that our statistical activities are not intruding into the lives of Canadians any more than necessary, and we must always justify whatever intrusion we consider necessary.

Furthermore, when designing a data-gathering approach, we have a moral obligation to protect the confidentiality and data of Canadians. Part of the data ethics exercise also consists in ensuring that projects have considered potential security threats and have prepared accordingly.

Slide 12: Privacy and Security – Example

(Text on screen: Study on the sexual orientation of individuals in management positions.

Questions related to gender, marital status and sex are pertinent, even if intrusive.

Questions about salary, criminal antecedents and health conditions are intrusive and not directly tied to the project, so they must be justified.

Strict IT and Information Management measures must be taken during all stages of working with these data, as they are personal and sensitive.)

Let's imagine we are trying to have a better picture of the sexual orientation of individuals in management positions. If we conduct a survey, then questions related to gender, marital status and sex are pertinent, even if intrusive. If we were to ask questions about salary, age and nationality, we would have to justify why these variables are necessary.

To avoid any breach of personal information, strict IT and Information Management measures must be taken during all stages of working with data - the collection, retention, use, disclosure and disposal of information, in order to protect the confidentiality of this vulnerable population as well as the integrity of the project.

Slide 13: Transparency and Accountability

Statistical activities undertaken for the benefit of society have the responsibility to be transparent about where the data come from, how they are used and the steps that are taken to ensure confidentiality.

Slide 14: Transparency and Accountability - Example

At Statistics Canada's Trust Centre, for example, you will find a list of all current surveys and statistical programs, together with their methodologies, goals and data sources. Making these projects available is important not only so that Canadians can see how statistical activities are conducted and determine if a project is in their best interest, but also so they can keep the agency accountable and point out whenever Statistics Canada encroaches upon the limits of its mandate.

Slide 15: Data Quality

The Data Quality principle means that the data used to create statistical information must be as representative and accurate as possible. Maintaining this expectation means ensuring that biases and errors do not compromise the potential benefits of a project or mislead data users.

Slide 16: Data Quality – Example

(Text on screen: Low response rates can lead to biased estimates or samples too small to meet the information need.

Statistics Canada decides to start using alternative data sources.

If sources are biased, they may lead to uninformed measures and policies.)

When conducting a survey, low response rates can lead to biased estimates or samples too small to meet the information need. Take, for example, data on employment among individuals with disabilities. If the response rate for the survey affects the quality of the estimates, Statistics Canada might decide to start using alternative data sources, such as administrative data acquired from industrial associations or labour unions.

If these new sources are biased, the unreliable information resulting from them may lead to uninformed measures and policies, which may cause more harm than good.

Slide 17: Fairness and Do No Harm

When conducting statistical activities, it is necessary to consider all the potential risks that a statistical activity may pose to the well-being of individuals or specific groups.

Slide 18: Fairness and Do No Harm - Example

When acquiring and linking a large amount of data, detailed descriptions of smaller sub-populations of society might become available for analysis. These detailed clusters can sometimes magnify what is happening at the lowest level of geography. While this may sound harmless, it is important to remember these clusters of data might reveal information such as ethnicity and socio-economic status. Putting any sub-population under a microscope can raise ethical issues. For instance, studies on criminality have to be worded carefully so as to not reinforce stereotypes, and results have to be shared with caution to ensure that the information is informative and not taken as an indictment of a specific population group.

Slide 19: Trust and Sustainability

In order to maintain the trust of the public, the use of data for the benefit of society should occur only by implementing such best practices as assuring confidentiality, protecting personal information, producing representative data, and being accountable. By making this our mandate, we can ensure that our statistical activities remain socially acceptable in the eyes of the public. If we have social acceptability, any partnership and any approach we undertake becomes an opportunity to show that we follow our mandate and helps the agency promote its objectives and maintain the trust of the public in the long term.

Slide 20: Trust and Sustainability - Example

To illustrate when trust really matters, imagine we are trying to gather information on recreational cannabis use by Canadian youth, via voluntary crowdsourcing, and that this is happening before cannabis was legalized. One can only expect respondents to provide accurate, reliable data if they trust the institution responsible for guarding their responses and preserving confidentiality. In this case, they must trust that their data are not going to be shared with anyone, including peers, parents and even legal authorities.

Slide 21: Recap of Key Points

(Figure 1 showing a table with the 6 guiding principles: Benefits Canadians, Trust and Sustainability, Privacy and Security, Data Quality, Transparency and Accountability and Fairness and Do No Harm)

In summary, Data Ethics is the field of study that addresses questions about the appropriate use of data.

With advances in data gathering techniques come ethical challenges regarding access to and use of data.

There are 6 guiding principles you can use to address ethical concerns:

  • Benefits to Canadians
  • Privacy and security
  • Transparency and accountability
  • Trust and sustainability
  • Data quality
  • Fairness and do no harm

(The Canada Wordmark appears.)
