Data Science Network for the Federal Public Service (DSNFPS)

The information in these articles is provided 'as-is' and Statistics Canada makes no warranty, either expressed or implied, including but not limited to, warranties of merchantability and fitness for a particular purpose. In no event will Statistics Canada be liable for any direct, special, indirect, consequential or other damages, however caused.

Recent articles

Data to Decisions: Visualizations and ML Modeling of Rental Property Data

Topics covered in this article: Data processing and engineering, Computer vision

According to the 2021 census, there were 5 million rental households in Canada, which means roughly one-third of Canadian households are renters. However, much of this rental activity occurs privately, leading to limited and inconsistent data. To bridge this knowledge gap, NorQuest College acquired, processed, analyzed, and visualized rental listings for Ontario from the stakeholder, the Community Data Program.


Adopting a High-Level MLOps Practice for the Production Applications of Machine Learning in the Canadian Consumer Price Index

Topics covered in this article: Data processing and engineering, Text analysis and generation, Ethics and responsible machine learning

Responsible application of Machine Learning (ML) within official statistics requires various processes to ensure that ML is developed in a robust and metric-driven fashion and is directly tied to solving the processing needs of a specific statistical program. These processes can be operationalized as a framework known as ML Operations, or MLOps. Focusing on the use case of the Canadian Consumer Price Index (CPI), this article provides an overview of how various MLOps processes can be built so that ML models that classify unique products to the categories of the CPI classification system adhere to best practices for quality assurance, transparency, governance, and provenance. These practices ensure that model decay is addressed and that price statistics calculated on administrative data are robust. The article also categorizes how MLOps could be implemented by providing an overview of a maturity model, and it focuses on several key components that are important for price statistics.


Identifying Personal Identifiable Information (PII) in Unstructured Data with Microsoft Presidio

Topics covered in this article: Ethics and responsible machine learning

In today’s digital age, organizations collect and store vast amounts of data about their customers, employees, and partners. This data often contains Personal Identifiable Information (PII). With the growing prevalence of data breaches and cyber attacks, protecting PII has become a critical concern for businesses and government agencies alike. In this article, Statistics Canada will take a detailed look at Microsoft Presidio and how it helps organizations in Canada comply with privacy laws.


Other recent articles

Browse articles by topic

Computer vision
Data processing and engineering
Predictive analytics
Text analysis and generation
Ethics and responsible machine learning
Other