National Travel Survey: C.V.s for Person-Trips by Duration of Trip, Main Trip Purpose and Country or Region of Trip Destination - Q1 2023

National Travel Survey: C.V.s for Person-Trips by Duration of Trip, Main Trip Purpose and Country or Region of Trip Destination, Q1 2023
Table summary
This table displays the results of C.V.s for Person-Trips by Duration of Trip, Main Trip Purpose and Country or Region of Trip Destination. The information is grouped by Duration of Trip and Main Trip Purpose (appearing as row headers) and by Country or Region of Trip Destination (Total, Canada, United States, Overseas), calculated using Person-Trips in Thousands (× 1,000) and C.V. as the units of measure (appearing as column headers).
Duration of Trip Main Trip Purpose Country or Region of Trip Destination
Total Canada United States Overseas
Person-Trips (x 1,000) C.V. Person-Trips (x 1,000) C.V. Person-Trips (x 1,000) C.V. Person-Trips (x 1,000) C.V.
Total Duration Total Main Trip Purpose 66,482 A 58,207 A 6,040 A 2,234 A
Holiday, leisure or recreation 23,290 A 18,738 A 3,018 A 1,533 A
Visit friends or relatives 26,594 B 24,786 B 1,291 B 517 B
Personal conference, convention or trade show 904 C 824 C 78 D 2 E
Shopping, non-routine 4,269 B 3,641 B 626 B 2 E
Other personal reasons 5,159 B 4,890 B 207 C 61 D
Business conference, convention or trade show 1,547 B 1,076 B 407 C 65 C
Other business 4,719 B 4,251 B 414 C 55 D
Same-Day Total Main Trip Purpose 42,594 A 40,756 A 1,838 B ..  
Holiday, leisure or recreation 13,039 A 12,431 A 609 C ..  
Visit friends or relatives 17,309 B 16,944 B 365 D ..  
Personal conference, convention or trade show 633 C 617 C 16 E ..  
Shopping, non-routine 4,036 B 3,448 B 588 C ..  
Other personal reasons 3,862 B 3,747 B 116 D ..  
Business conference, convention or trade show 446 C 437 C 9 E ..  
Other business 3,267 C 3,133 C 135 E ..  
Overnight Total Main Trip Purpose 23,888 A 17,452 A 4,202 A 2,234 A
Holiday, leisure or recreation 10,251 A 6,308 A 2,409 A 1,533 A
Visit friends or relatives 9,284 A 7,842 A 925 B 517 B
Personal conference, convention or trade show 271 C 208 C 62 E 2 E
Shopping, non-routine 232 C 193 D 37 D 2 E
Other personal reasons 1,296 B 1,144 B 91 D 61 D
Business conference, convention or trade show 1,101 B 639 B 398 C 65 C
Other business 1,452 B 1,118 B 279 C 55 D
.. data not available

Estimates contained in this table have been assigned a letter to indicate their coefficient of variation (c.v.) (expressed as a percentage). The letter grades represent the following coefficients of variation:

A: c.v. between 0.00% and 5.00% (inclusive). Excellent.
B: c.v. between 5.01% and 15.00% (inclusive). Very good.
C: c.v. between 15.01% and 25.00% (inclusive). Good.
D: c.v. between 25.01% and 35.00% (inclusive). Acceptable.
E: c.v. greater than 35.00%. Use with caution.
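
As a rough illustration only, the letter-grade thresholds listed above can be expressed as a small lookup function. The sketch below is a hypothetical Python helper that simply encodes those ranges; it is not part of the survey methodology.

def cv_letter_grade(cv_percent: float) -> str:
    """Map a coefficient of variation (in percent) to the letter grade defined above."""
    if cv_percent <= 5.00:
        return "A"   # Excellent
    if cv_percent <= 15.00:
        return "B"   # Very good
    if cv_percent <= 25.00:
        return "C"   # Good
    if cv_percent <= 35.00:
        return "D"   # Acceptable
    return "E"       # Use with caution

print(cv_letter_grade(12.3))  # prints "B": a c.v. of 12.3% falls in the Very good range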

Quarterly Survey of Financial Statements: Weighted Asset Response Rate - second quarter 2023

Weighted Asset Response Rate
Table summary
This table displays the results of Weighted Asset Response Rate. The information is grouped by Release date (appearing as row headers), 2022 Q2, Q3 and Q4, and 2023 Q1 and Q2, calculated using percentage as the unit of measure (appearing as column headers).
Release date 2022 2023
Q2 Q3 Q4 Q1 Q2
percentage
August 24, 2023 80.9 79.0 72.7 72.2 59.4
May 24, 2023 80.9 79.0 72.7 57.6  
February 23, 2023 79.2 76.9 55.2    
November 23, 2022 76.1 56.2      
August 25, 2022 55.7        
.. not available for a specific reference period
Source: Quarterly Survey of Financial Statements (2501)

Retail Trade Survey (Monthly): CVs for total sales by geography - June 2023

CVs for total sales by geography - June 2023
Geography Month
202306
%
Canada 0.6
Newfoundland and Labrador 2.0
Prince Edward Island 1.0
Nova Scotia 1.8
New Brunswick 2.3
Quebec 1.2
Ontario 1.1
Manitoba 1.2
Saskatchewan 2.6
Alberta 0.9
British Columbia 1.8
Yukon Territory 1.7
Northwest Territories 1.9
Nunavut 1.6

Monthly Survey of Food Services and Drinking Places: CVs for Total Sales by Geography – June 2023

Monthly Survey of Food Services and Drinking Places: CVs for Total Sales by Geography - June 2023
Table summary
This table displays the results of CVs for Total sales by Geography. The information is grouped by Geography (appearing as row headers), with Month and percentage appearing as column headers.
Geography Month
202206 202207 202208 202209 202210 202211 202212 202301 202302 202303 202304 202305 202306
percentage
Canada 0.66 0.49 0.14 0.13 0.17 0.24 0.88 0.32 0.33 0.26 0.14 0.18 0.20
Newfoundland and Labrador 0.53 0.50 0.47 0.49 0.73 0.49 0.93 2.43 0.81 0.70 0.84 0.99 1.23
Prince Edward Island 15.97 9.23 5.27 3.04 8.45 8.22 3.45 10.49 14.17 8.25 7.86 2.25 3.07
Nova Scotia 1.79 3.37 0.43 0.40 0.37 0.43 16.87 0.83 0.91 0.72 0.58 0.70 0.72
New Brunswick 0.67 0.53 0.52 0.50 0.56 0.73 12.18 1.21 1.77 0.76 0.73 0.78 1.71
Quebec 1.55 0.97 0.18 0.28 0.26 0.19 1.73 0.67 0.95 0.77 0.33 0.53 0.59
Ontario 1.30 0.95 0.25 0.25 0.21 0.53 0.73 0.67 0.64 0.48 0.25 0.26 0.30
Manitoba 0.68 3.49 0.48 0.40 0.37 0.58 9.72 0.78 0.75 0.80 0.68 0.84 0.93
Saskatchewan 6.45 4.85 1.30 0.73 1.31 1.44 7.51 0.62 0.89 0.51 0.55 0.76 0.93
Alberta 1.45 0.91 0.39 0.30 0.33 0.38 1.56 0.40 0.44 0.36 0.33 0.37 0.48
British Columbia 0.64 0.91 0.28 0.21 0.66 0.33 2.77 0.44 0.44 0.38 0.27 0.36 0.38
Yukon Territory 3.32 2.54 2.09 2.07 2.34 2.20 2.50 41.12 2.70 30.75 2.48 8.17 4.07
Northwest Territories 3.20 2.74 2.38 2.05 2.00 2.09 2.56 6.03 2.47 38.31 3.64 10.11 3.58
Nunavut 1.55 1.52 1.30 2.35 2.85 101.77 43.21 2.83 2.61 2.50 2.47 23.47 2.74

Computer vision models: seed classification project

By AI Lab, Canadian Food Inspection Agency

Introduction

The AI Lab team at the Canadian Food Inspection Agency (CFIA) is composed of a diverse group of experts, including data scientists, software developers, and graduate researchers, all working together to provide innovative solutions for the advancement of Canadian society. By collaborating with members from inter-departmental branches of government, the AI Lab leverages state-of-the-art machine learning algorithms to provide data-driven solutions to real-world problems and drive positive change.

At the CFIA's AI Lab, we harness the full potential of deep learning models. Our dedicated team of Data Scientists leverages the power of this transformative technology to develop customised solutions tailored to meet the specific needs of our clients.

In this article, we motivate the need for computer vision models for the automatic classification of seed species. We demonstrate how our custom models have achieved promising results using "real-world" seed images and describe our future directions for deploying a user-friendly SeedID application.

At the CFIA AI Lab, we strive not only to push the frontiers of science by leveraging cutting-edge models but also to make these services accessible to others and to foster knowledge sharing, for the continuous advancement of our Canadian society.

Computer vision

To understand how image classification models work, we first define what exactly computer vision tasks aim to address.

What is computer vision:

Computer Vision models are fundamentally trying to solve what are mathematically referred to as ill-posed problems. They seek to answer the question: what gave rise to the image?

As humans, we do this naturally. When photons enter our eyes, our brain is able to process the different patterns of light enabling us to infer the physical world in front of us. In the context of computer vision, we are trying to replicate our innate human ability of visual perception through mathematical algorithms. Successful computer vision models could then be used to address questions related to:

  • Object categorisation: the ability to classify objects in an image scene or recognise someone's face in pictures
  • Scene and context categorisation: the ability to understand what is going on in an image through its components (e.g. indoor/outdoor, traffic/no traffic, etc.)
  • Qualitative spatial information: the ability to qualitatively describe objects in an image, such as a rigid moving object (e.g. bus), a non-rigid moving object (e.g. flag), a vertical/horizontal/slanted object, etc.

Yet, while these appear to be simple tasks, computers still have difficulties in accurately interpreting and understanding our complex world.

Why is computer vision so hard:

To understand why computers seemingly struggle to perform these tasks, we must first consider what an image is.

Figure 1

Are you able to describe what this image is from these values?

Description - Figure 1

This image shows a brown and white pixelated image of a person’s face. The person's face is pixelated, with the pixels being white and the background being brown. Next to the image, there's a zoomed in image showing the pixel values corresponding to a small patch of the original image.

An image is a set of numbers, with typically three colour channels: Red, Green, Blue. In order to derive any meaning from these values, the computer must perform what is known as image reconstruction. In its most simplified form, we can mathematically express this idea through an inverse function:

x = F⁻¹(y)

Where:

y represents the data measurements (i.e. pixel values).
x represents the reconstruction of those measurements, y, into an image.

However, it turns out solving this inverse problem is harder than expected due to its ill-posed nature.
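
To make concrete the idea that an image is just an array of numbers, the short sketch below loads a picture and inspects its raw pixel values. It assumes Python with NumPy and Pillow installed, and the file name is a hypothetical placeholder; this is an illustration, not part of any production pipeline.

# An RGB image is a height x width x 3 array of numbers.
import numpy as np
from PIL import Image

img = Image.open("portrait.jpg").convert("RGB")   # hypothetical file
pixels = np.asarray(img)                          # y: the raw measurements

print(pixels.shape)                  # e.g. (480, 640, 3): rows, columns, RGB channels
print(pixels.dtype)                  # uint8: each value is an integer from 0 to 255
print(pixels[100:103, 200:203, 0])   # a small patch of the red channel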

What is an ill-posed problem

When an image is registered, there is an inherent loss of information as the 3D world gets projected onto a 2D plane. Even for us, collapsing the spatial information we get from the physical world can make it difficult to discern what we are looking at through photos.

Figure 2

Michelangelo (1475-1564). Occlusion caused by different viewpoints can make it difficult to recognise the same person.

Description - Figure 2

The image shows three paintings of different figures, each with a different expression on their faces. One figure appears to be in deep thought, while the other two appear to be in a state of contemplation. The paintings are made of a dark, rough material, and the details of their faces are well-defined. The overall effect of the image is one of depth and complexity. The paintings are rotated in each frame to create a sense of change.

Figure 3

Bottom of soda cans. Different orientations can make it impossible to identify what is contained in the can.

Description - Figure 3

The image shows five metal cans, four of them with a different patch of color on the lid. The colors are blue, green, red, and yellow. The cans are arranged on a countertop. The countertop is made of a dark surface, such as granite or concrete.

Figure 4

Yale Database of faces. Variations in lighting can make it difficult to recognise the same person (recall: all that computers “see” are pixel values).

Description - Figure 4

The image shows two images of the same face. The images are captured from different angles, resulting in two different perceived expressions of the face. In the left frame the man has a neutral facial expression, whereas in the right frame he has a serious and angry expression.

Figure 5

Rick Scuteri-USA TODAY Sports. Different scales can make it difficult to understand context from images.

Description - Figure 5

The image shows four different images, at different scales. The first image contains only what looks like the eye of a bird. The second image contains the head and neck of a goose. The third image shows the entire animal, and the fourth image shows a man standing in front of the bird pointing in a direction.

Figure 6

Different photos of chairs. Intra-class variation can make it difficult to categorise objects (we can discern a chair through its functional aspect)

Description - Figure 6

The image shows 5 different chairs. The first one is a red chair with a wooden frame. The second one is a black leather swivel chair. The third looks like an unconventional artistic chair. The fourth one looks like a minimalist office chair, and the last one looks like a bench.

It can be difficult to recognise objects in 2D pictures due to possible ill-posed properties, such as:

  • Lack of uniqueness: Several objects can give rise to the same measurement.
  • Uncertainty: Noise (e.g. blurring, pixelation, physical damage) in photos can make it difficult or impossible to reconstruct and identify an image.
  • Inconsistency: slight changes in images (e.g. different viewpoints, different lighting, different scales, etc.) can make it challenging to solve for the solution, x, from available data points, y.

While computer vision tasks may, at first glance, appear superficial, the underlying problem they are trying to address is quite challenging!

Next, we will look at some deep learning-driven solutions for tackling computer vision problems.

Convolutional Neural Networks (CNNs)

Figure 7

Graphical representation of a convolutional neural network (CNN) architecture for image recognition. (Hoeser and Kuenzer, 2020)

Description - Figure 7

This is a diagram of a convolutional neural network (ConvNet) architecture. The network consists of several layers, including an input layer, a convolutional layer, a pooling layer, and an output layer. The input layer takes in an image and passes it through the convolutional layer, which applies a set of filters to the image to extract features. The pooling layer reduces the size of the image by applying a pooling operation to the output of the convolutional layer. The output layer processes the image and produces a final output. The network is trained using a dataset of images and their corresponding labels.

Convolutional Neural Networks (CNNs) are a class of algorithms that have been highly successful in solving many of the computer vision problems described above. In order to classify or identify objects in images, a CNN model first learns to recognize simple features in the images, such as edges, corners, and textures. It does this by applying different filters to the image. These filters help the network focus on specific patterns. As the model learns, it starts recognizing more complex features, combining the simple features it learned in the previous step to create more abstract and meaningful representations. Finally, the CNN uses the learned features to classify images into the classes it has been trained on.
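
To illustrate the filter, pooling and classification steps described above, here is a deliberately small convolutional network written with PyTorch. It is a generic sketch, not the CFIA's model; the layer sizes, input resolution and number of classes are arbitrary assumptions.

# Minimal CNN sketch (PyTorch): convolution -> pooling -> classification.
import torch
import torch.nn as nn

class TinyCNN(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),   # filters that pick up edges and textures
            nn.ReLU(),
            nn.MaxPool2d(2),                              # pooling shrinks the feature maps
            nn.Conv2d(16, 32, kernel_size=3, padding=1),  # combines simple features into richer ones
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 56 * 56, num_classes)  # assumes 224 x 224 inputs

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)
        return self.classifier(x.flatten(1))

# One forward pass on a dummy batch of four 224 x 224 RGB images.
logits = TinyCNN()(torch.randn(4, 3, 224, 224))
print(logits.shape)  # torch.Size([4, 10])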

Figure 8

Evolution of CNN architectures and their accuracy, for image recognition tasks from 2012 to 2019. (Hoeser and Kuenzer, 2020).

Description - Figure 8

The image shows the plot of the size of different CNN architectures and models from the year 2012 until 2019. Each neural network is depicted as a circle, with the size of the circle corresponding to the size of the neural network in terms of number of parameters.

The first CNN was proposed by Yann LeCun in 1989 (LeCun, 1989) for the recognition of handwritten digits. Since then, CNNs have evolved significantly, driven by advancements in both model architecture and available computing power. To this day, CNNs continue to prove themselves as powerful architectures for various recognition and data analysis tasks.

Vision Transformers (ViTs)

Vision Transformers (ViTs) are a recent development in the field of computer vision that apply the concept of transformers, originally designed for natural language processing tasks, to visual data. Instead of treating an image as a 2D object, Vision Transformers view an image as a sequence of patches, similar to how transformers treat a sentence as a sequence of words.

Figure 9

An overview of a ViT as illustrated in An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. Since the publication of the original ViT, numerous variations and flavours have been proposed and studied.

Description - Figure 9

The image shows the diagram of the ViT architecture. It shows the input image being split into patches, with each patch fed into the neural network. The network consists of a transformer encoder block and an MLP Head block, followed by a classification head.

The process starts by splitting an image into a grid of patches. Each patch is then flattened into a vector of pixel values, and the patches together form a sequence. Positional encodings are added to retain positional information, as is done in transformers for language tasks. The resulting sequence is then processed through multiple layers of transformer encoders to create a model capable of understanding complex visual data.
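
One common way to implement that patch step in PyTorch is a strided convolution that carves the image into non-overlapping patches and projects each one to an embedding, followed by learned positional encodings. The sketch below covers only the input side of a ViT; the patch size and dimensions are arbitrary assumptions, not those of any specific model.

# Sketch of ViT-style patch embedding (input side only), assuming PyTorch.
import torch
import torch.nn as nn

image_size, patch_size, embed_dim = 224, 16, 192
num_patches = (image_size // patch_size) ** 2        # 14 x 14 = 196 patches

# A conv with kernel = stride = patch size splits the image into patches
# and linearly projects each patch to an embedding vector in one step.
to_patches = nn.Conv2d(3, embed_dim, kernel_size=patch_size, stride=patch_size)
pos_embedding = nn.Parameter(torch.zeros(1, num_patches, embed_dim))

x = torch.randn(2, 3, image_size, image_size)        # dummy batch of 2 images
tokens = to_patches(x).flatten(2).transpose(1, 2)    # (2, 196, 192): a "sentence" of patches
tokens = tokens + pos_embedding                      # add positional information

print(tokens.shape)  # torch.Size([2, 196, 192]), ready for the transformer encoder layers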

Just as Convolutional Neural Networks (CNNs) learn to identify patterns and features in an image through convolutional layers, Vision Transformers identify patterns by focusing on the relationships between patches in an image. They essentially learn to weigh the importance of different patches in relation to others to make accurate classifications. The ViT model was first introduced by Google's Brain team in a paper in 2020. While CNNs dominated the field of computer vision for years, the introduction of Vision Transformers demonstrated that methods developed for natural language processing could also be used for image classification tasks, often with superior results.

One significant advantage of Vision Transformers is that, unlike CNNs, they do not have a built-in assumption of spatial locality and shift invariance. This means they are better suited for tasks where global understanding of an image is required, or where small shifts can drastically change the meaning of an image.

However, ViTs typically require a larger amount of data and compute resources compared to CNNs. This factor has led to a trend of hybrid models that combine both CNNs and transformers to harness the strengths of both architectures.

Seed classification

Background:

Canada's multi-billion-dollar seed and grain industry has established a global reputation in the production, processing, and export of premium-grade seeds for planting and grains for food across a diverse range of crops. Its success is achieved through Canada's commitment to innovation and the development of advanced technologies, allowing for the delivery of high-quality, diagnostically certified products that meet both national and international standards.

Naturally, a collaboration was formed between a research group from the Seed Science and Technology Section and the AI Lab of the CFIA to maintain Canada's role as a reputable leader in the global seed and grain industries and their associated testing industries.

Background: Quality Control

The seed quality of a crop is reflected in a grading report, where the final grade indicates how well a seed lot conforms with Canada's Seeds Regulations and meets minimum quality standards. Factors used to determine crop quality include the presence of weed seeds listed in Canada's Weed Seeds Order, purity analysis, germination and disease. While germination indicates potential field performance, assessing physical purity is essential to ensure that the crop contains a high proportion of the desired seeds and is free from contaminants, such as prohibited and regulated species, other crop seeds, or other weed seeds. Seed inspection plays an important role in preventing the spread of prohibited and regulated species listed in the Weed Seeds Order. Canada is one of the biggest production bases for the global food supply, exporting large quantities of grains such as wheat, canola, lentils, and flax. To meet phytosanitary certification requirements and access a wide range of foreign markets, analysis of the weed seeds regulated by importing destinations is in high demand, with quick turnaround times and frequently changing requirements. Expanding testing capacity for weed seeds requires the support of advanced technologies, since traditional methods struggle to keep up with these demands.

Motivation

Presently, the evaluation of a crop's quality is done manually by human experts. However, this process is tedious and time consuming. At the AI Lab, we leverage advanced computer vision models to automatically classify seed species from images, rendering this process more efficient and reliable.

This project aims to develop and deploy a powerful computer vision pipeline for seed species classification. By automating this classification process, we are able to streamline and accelerate the assessment of crop quality. We build upon advanced algorithms and deep learning techniques, ensuring an unbiased and efficient evaluation of crop quality and paving the way for improved agricultural practices.

Project #1: Multispectral Imaging and Analysis

In this project, we employ a custom computer vision model to assess content purity by identifying and distinguishing desired seed species from undesired seed species.

We successfully recover and identify the contamination by three different weed species in a screening mixture of wheat samples.

Our model is customised to accept high-resolution, 19-channel multispectral image inputs and achieves greater than 95% accuracy on held-out testing data.
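
The exact architecture is not described here, but one common way to adapt an off-the-shelf CNN to multispectral input is to replace its first convolution so that it accepts 19 channels instead of 3. The sketch below does this with a torchvision ResNet-18 purely as an illustration; the backbone, channel count and class count are assumptions, not the Lab's actual model.

# Illustration only: adapting a standard backbone to 19-channel multispectral images.
import torch
import torch.nn as nn
from torchvision.models import resnet18

num_channels, num_classes = 19, 9     # 19 spectral bands; 9 example seed classes

model = resnet18(weights=None)
# Replace the stock 3-channel input layer with a 19-channel one.
model.conv1 = nn.Conv2d(num_channels, 64, kernel_size=7, stride=2, padding=3, bias=False)
# Replace the final layer to predict the desired number of seed classes.
model.fc = nn.Linear(model.fc.in_features, num_classes)

# One forward pass on a dummy batch of 19-channel images.
out = model(torch.randn(2, num_channels, 224, 224))
print(out.shape)  # torch.Size([2, 9])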

We further explored our model's potential to classify new species, by injecting five new canola species into the dataset and observing similar results. These encouraging findings highlight our model's potential for continual use even as new seed species are introduced.

Our model was trained to classify the following species:

  • Three different thistles (weed) species:
    • Cirsium arvense (regulated species)
    • Carduus nutans (Similar to the regulated species)
    • Cirsium vulgare (Similar to the regulated species)
  • Six crop seeds:
    • Triticum aestivum subspecies aestivum
    • Brassica napus subspecies napus
    • Brassica juncea
    • Brassica juncea (yellow type)
    • Brassica rapa subspecies oleifera
    • Brassica rapa subspecies oleifera (brown type)

Our model was able to correctly identify each seed species with an accuracy of over 95%.

Moreover, when the three thistle seeds were integrated with the wheat screening, the model achieved an average accuracy of 99.64% across 360 seeds. This demonstrated the model's robustness and ability to classify new images.

Finally, we introduced five new canola species and types and evaluated our model's performance. Preliminary results from this experiment showed a ~93% accuracy on the testing data.

Project #2: Digital Microscope RGB Imaging and Analysis

In this project, we employ a 2-step process to identify a total of 15 different seed species of regulatory significance that are morphologically challenging to distinguish, across varying magnification levels.

First, a seed segmentation model is used to identify each instance of a seed in the image. Then, a classification model classifies each seed species instance.
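
In Python-like pseudocode, the two steps amount to: run the segmentation model to get one region per seed, crop each region, and pass each crop to the classifier. The functions below are hypothetical placeholders standing in for the trained models, not the actual implementation.

# Sketch of the two-step pipeline with hypothetical placeholder models.
from typing import List, Tuple

def segment_seeds(image) -> List[Tuple[int, int, int, int]]:
    """Hypothetical segmentation model: returns one (x, y, w, h) box per detected seed."""
    raise NotImplementedError  # stands in for the trained segmentation model

def classify_seed(crop) -> str:
    """Hypothetical classifier: returns the predicted species for one cropped seed."""
    raise NotImplementedError  # stands in for the trained classification model

def identify_seeds(image) -> List[str]:
    """Step 1: find every seed instance. Step 2: classify each instance."""
    predictions = []
    for (x, y, w, h) in segment_seeds(image):
        crop = image[y:y + h, x:x + w]          # assumes a NumPy-style image array
        predictions.append(classify_seed(crop))
    return predictions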

We perform multiple ablation studies by training on one magnification profile then testing on seeds coming from a different magnification set. We show promising preliminary results of over 90% accuracy across magnification levels.

Three different magnification levels were provided for the following 15 species:

  • Ambrosia artemisiifolia
  • Ambrosia trifida
  • Ambrosia psilostachya
  • Brassica juncea
  • Brassica napus
  • Bromus hordeaceus
  • Bromus japonicus
  • Bromus secalinus
  • Carduus nutans
  • Cirsium arvense
  • Cirsium vulgare
  • Lolium temulentum
  • Solanum carolinense
  • Solanum nigrum
  • Solanum rostratum

Images of a mix of the 15 species were taken at varying magnification levels. The magnification level was reflected in the number of seed instances present in each image: 1, 2, 6, 8, or 15 seeds per image.

In order to establish a standardised image registration protocol, we independently trained separate models on a subset of data at each magnification, then evaluated each model's performance on a reserved test set spanning all magnification levels.
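
Conceptually, this evaluation is a grid of train-magnification versus test-magnification runs. The loop below sketches that idea; the data dictionaries and the train/evaluate helpers are hypothetical stand-ins, not the Lab's code.

# Sketch of the cross-magnification evaluation with hypothetical stand-ins.
magnifications = ["low", "medium", "high"]        # assumed labels for the three levels
train_data = {m: [] for m in magnifications}      # placeholder training sets per magnification
test_data = {m: [] for m in magnifications}       # placeholder reserved test sets

def train_model(dataset):
    """Hypothetical: fit a classifier on images from one magnification level."""
    return object()

def evaluate(model, dataset) -> float:
    """Hypothetical: return accuracy of the model on a reserved test set."""
    return 0.0

results = {}
for train_mag in magnifications:
    model = train_model(train_data[train_mag])
    for test_mag in magnifications:
        results[(train_mag, test_mag)] = evaluate(model, test_data[test_mag])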

Preliminary results demonstrated the model's ability to correctly identify seed species across magnifications with over 90% accuracy.

This revealed the model's potential to accurately classify previously unseen data at varying magnification levels.

Throughout our experiments, we tried and tested out different methodologies and models.

Advanced models equipped with a canonical form such as Swin Transformers fared much better and proved to be less perturbed by the magnification and zoom level.

Discussion + Challenges

Automatic seed classification is a challenging task. Training a machine learning model to classify seeds poses several challenges due to the inherent heterogeneity within and between different species. Consequently, large datasets are required to effectively train a model to learn species-specific features. Additionally, the high degree of similarity among species within some genera makes it challenging even for human experts to differentiate between closely related intra-genus species. Furthermore, the quality of image acquisition can also impact the performance of seed classification models, as low-quality images can result in the loss of important information necessary for accurate classification.

To address these challenges and improve model robustness, data augmentation techniques were applied as part of the preprocessing steps. Affine transformations, such as scaling and translating images, were used to increase the sample size, while Gaussian noise was added to increase variation and improve generalization to unseen data, preventing overfitting on the training data.
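
The sketch below shows the kind of augmentation pipeline described above, assuming torchvision: random affine transforms (scaling and translation) plus additive Gaussian noise, with the noise written as a small custom transform. The parameter values are illustrative assumptions, not the settings used in the project.

# Illustrative augmentation pipeline: affine transforms plus Gaussian noise.
import torch
from torchvision import transforms

def add_gaussian_noise(img: torch.Tensor, std: float = 0.02) -> torch.Tensor:
    """Add zero-mean Gaussian noise to a tensor image in [0, 1]."""
    return (img + torch.randn_like(img) * std).clamp(0.0, 1.0)

augment = transforms.Compose([
    transforms.RandomAffine(degrees=0, translate=(0.1, 0.1), scale=(0.9, 1.1)),  # scale and translate
    transforms.ToTensor(),                   # PIL image -> float tensor in [0, 1]
    transforms.Lambda(add_gaussian_noise),   # extra variation to help generalisation
])

# Usage: augmented = augment(pil_image) inside the training data loader.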

Selecting the appropriate model architecture was crucial in achieving the desired outcome. A model may fail to produce accurate results if end users do not adhere to a standardized protocol, particularly when given data that falls outside the expected distribution. Therefore, it was imperative to consider various data sources and utilize a model that can effectively generalize across domains to ensure accurate seed classification.

Conclusion

The seed classification project is an example of the successful and ongoing collaboration between the AI Lab and the Seed Science group at the CFIA. By pooling their respective knowledge and expertise, both teams contribute to the advancement of Canada's seed and grain industries. The project also showcases how leveraging advanced machine learning tools can significantly enhance the accuracy and efficiency of evaluating seed and grain quality in compliance with seed and plant protection regulations, ultimately benefiting the agricultural industry, consumers, Canadian biosecurity, and food safety.

As Data Scientists, we recognise the importance of open-source collaboration, and we are committed to upholding the principles of open science. Our objective is to promote transparency and engagement through open sharing with the public.

By making our application available, we invite fellow researchers, seed experts, and developers to contribute to its further improvement and customisation. This collaborative approach fosters innovation, allowing the community to collectively enhance the capabilities of the SeedID application and address specific domain requirements.

Meet the Data Scientist

If you have any questions about my article or would like to discuss this further, I invite you to Meet the Data Scientist, an event where authors meet the readers, present their topic and discuss their findings.

Register for the Meet the Data Scientist event. We hope to see you there!

MS Teams – link will be provided to the registrants by email

Subscribe to the Data Science Network for the Federal Public Service newsletter to keep up with the latest data science news.


Eh Sayers Episode 14 - I Got 99 Problems But Being Misgendered on the Census Isn't One

Release date: August 21, 2023

Catalogue number: 45200003
ISSN: 2816-2250

I got 99 problems but being misgendered on the census isn't one


Ladies, Gentlemen, and Gentlethem!

While every census is special, the 2021 Census was historic. It was the first to include a question about gender, making Canada the first country to collect and publish data on gender diversity from a national census.

In this episode, we explore gender with drag king Cyril Cinder and we talk Census 2021 with StatCan’s Anne Milan.

Join us for a new kind of gender reveal.

Host

Tegan Bridge

Guests

Cyril Cinder, Anne Milan


Eh Sayers Episode 14 - I Got 99 Problems But Being Misgendered on the Census Isn't One - Transcript

Cyril: Molly knew from experience that dresses were trouble. Dresses have tight places and zippers you can’t reach. Dresses mean troublesome tights and fancy shoes with no purpose. Dresses with no pockets mean nowhere to put interesting rocks, and nowhere to keep dog treats in case you find a stray. Dresses were not right on a regular day and they were definitely not right for something as important as picture day!

Tegan: (Stage whisper) Welcome to Eh Sayers, a podcast from Statistics Canada, where we meet the people behind the data and explore the stories behind the numbers. I'm your host, Tegan Bridge, and I'm whispering because we're listening to drag story time. Shh!

Cyril: Molly wanted to look like she was going on an adventure, not like she was going to a tea party. But she had an idea to save picture day. Her brother’s old tuxedo! It was perfect.

Dashing. Comfortable. Plenty of pockets. She had tried it on once when he was at chess club. It fit just right then, and Molly was sure it would look just as great today!

Tegan: You just heard part of Molly’s Tuxedo by Vicki Johnson, in which Molly has to decide whether to wear the dress her mom picked out for her school picture day or the tuxedo she wants to wear. It’s a book which explores gender in a kid-friendly way. You might hear Molly’s Tuxedo read at a drag story time, and if you’re in the Ottawa area, you might just hear it read by Cyril Cinder.

Cyril: My name is Cyril Cinder, and I'm an Ottawa based drag performer and drag king who has been performing since 2014.

Tegan: What's a drag king?

Cyril: Drag kings are drag artists who present or perform a masculine persona as part of their drag performance. That might involve parody, exploration or expansion of masculine gender norms and performance types. Um, they can be, you know, suave, they can be comedic, they can be big and extravagant and super over the top. Drag kings can be absolutely anyone and everything.

Tegan: Where does the name Cyril Cinder come from?

Cyril: Uh, I named myself in drag. That's not always a thing that happens. Sometimes you are given a name, but in the Canadian scene, usually we pick our own names, and I kind of wanted to sound like the alter ego of a super villain. So, and alliteration sounds cool. So I went with Cyril Cinder. I didn't wanna choose a pun name because I didn't trust myself to be clever enough to come up with something that nobody else would come up with. And the great benefit of it has been that it's also appropriate for all ages. It just sounds cool.

Tegan: I love it. What kind of performances do you do?

Cyril: I tend to lean into the age-old tradition of drag, which is the lip sync performance. I perform a lot in bars or in very different venues, music venues. I'm also a story time performer, so I will entertain and read books to children and audiences and families of all ages. I'm also a speaker and travel to conferences to talk about what I do as a drag performer, the increase in attacks against the 2SLGBTQIA+ community in Canada, and how that does tend to specifically target drag performers, as well as mental health because I also work as a registered psychotherapist.

Tegan: We mentioned drag story times at the top of the show. Why are drag story times important?

Cyril: For me, drag story time is a couple of important things. I mean, right off the bat, it's a pro-literacy initiative. Whenever you're doing something to make reading look a little bit more fun, you are encouraging children to be more interested in reading, to be more excited about books, and literacy is such an important foundation of our society. But the other part is that it's also an anti-bullying initiative, right? Whether or not we are, you know, exposing kids to a positive queer role model for the first time, or to someone who's maybe a bit more gender nonconforming, doesn't quite match the boys and girls archetypes and binaries that they're often exposed to at home and at school and in media. It's an opportunity for them to see that and see that that's not that strange actually, that that's okay, that there's nothing too weird about being fabulous and sparkly and excited and fun. And if we can introduce children to positive role models of all different kinds of diversity at a young age, as they get older, that becomes less and less of an axis of difference for them, something for them to isolate or pick out about their peers and say, oh, you're not like me in this way. It becomes something that they can say, oh yeah, there are people like this. This is normal. This is okay.

Tegan:  You mentioned binaries and archetypes for little boys and little girls. Could you say more about that? What's the gender binary and why is it a problem?

Cyril: So the gender binary is the idea that there are only two genders. Gender is distinct from sex.

Sex is a series of biological characteristics. Sex is also not an actual binary between male and female, and we don't only see this non-binary nature of sex represented in the human species, but in multiple different species. So that is just to say that there are many, many different biological expressions that don't fit within the male-female solid archetype.  

Gender is a different thing than sex, though. Gender is an experience of one's gender, role, and identity within society: the concept of being masculine, feminine, androgynous, the idea of rather than being male or female, of being man or woman. And as much as the archetypes and binary roles of male and female don't capture the full breadth of the experience of sex, the binary of woman and man also doesn't capture the full breadth of experience that humans can have of our gender and of our gender identities.

The gender binary can be used to control people. It can be used to force people into things that they don't want for themselves, right? We can see very strong, uh, expectations on men, for example, on what kind of emotions they're allowed to show, what kind of careers they're allowed to pursue, how they're supposed to feel about caretaking or sex or power or any of these things. And these are equally damaging to people of all genders. You know, women get told not to be too bossy, uh, that they're over emotional, that they can't be trusted to make decisions or be in positions of leadership, right? These sorts of boxes that we force people into. And people who don't feel like they fit into either of the binary gender options, male or female, people who fall under the very large umbrella of the non-binary spectrum really also deserve to have their experience of gender understood, respected, and validated.

Tegan: What are your preferred pronouns?

Cyril: Yeah, so I actually identify as non-binary out of drag, but my drag character, Cyril Cinder, is a man that is the gender identity of that character. So in drag, I exclusively use he/ him pronouns, but out of drag I use she/he/they pronouns. Any pronouns really I'm comfortable with. But whenever I'm referring to Cyril Cinder, the drag character, I always prefer to use he/him.

Tegan: Is it a challenge living as an out non-binary person?

Cyril: It can be. I think I certainly don't experience certain challenges that other people who might hold the same identity of non-binary as, as me actually experience, um, because of my flexibility with pronouns, with the she/he/they. I am comfortable if someone, you know, defaults to she looking at me because I was assigned female at birth and I, you know, when I'm not in drag, do have a somewhat feminine gender presentation and someone looks at me and goes, ah, a woman. It's not correct, but it's also not like the worst thing in the world for me.

At the same time, other people for whom that would feel really actually quite distressing and upsetting and invalidating for them. I think they face a greater challenge, right? Someone who exclusively uses they/them pronouns who might constantly have to correct people who would default to he or she or might overtly refuse to use their pronouns because of a belief that they have and a subsequent then desire to invalidate that person.

And that's really hard because then you're moving through the world and you're trying to tell people who you are. We have this innate human desire to be seen by the people around us. We are a social animal. We exist in a society. We don't do well independently on our own. We're not built for that. So when you're trying to say to someone like, “Hey, this is who I am!” And they go, “No, you're not. I know you better than you do, and in fact, this thing that you're doing, it's really a problem. It's really dangerous. It's really bad actually, and you should feel shame for that.” That's not an experience anyone seeks to have when they're trying to order a coffee or talk to their boss or just go about their day and go about their life.

Tegan: You're a drag king. You're a performer, and I think it goes without saying that drag is a performance. In what ways is all gender a performance?

Cyril: So gender is performance. It is something that we put on whether you are choosing to wear a dress or a three-piece suit or both at the same time. You were saying something about what feels right to you about who you are, about, you know, just different little elements of yourself that we can express in little ways, and that might be through comparison, through contrast, through exaggeration, through celebration. We're all doing little things and all of those expressions are a completely valid way of looking at everyone's individual experience of gender.

We have gender identity, we have gender expression. Those are also two different things. How someone identifies in their gender might be different than how they express it. I'm non-binary. And so that's my gender identity, but I have a very feminine gender expression. I also can exhibit a very masculine gender expression. That's quite fluid for me. That moves around a lot. Not everybody might have the same experience, but it's, it's important to be able to articulate these different parts of it because we're only just enriching our understanding of the human experience.

Tegan: A good friend of mine, whenever he meets someone who looks at the world in a very different way than he does, pauses, gets a thoughtful look on his face, and says, “Life is a rich tapestry!” The human experience and the diversity of that experience certainly is a rich tapestry. Gender identity, especially, is fascinating. But it's not something that we here at StatCan have measured. Until now.

Tegan: Why was the 2021 census a big deal?

Anne: Well, I would say each census is a big deal. The 2021 specifically, in terms of gender, is because it was the first time that we asked the gender question on the, on the census.

Tegan: This is, of course, our resident census expert.

Anne: I am Anne Milan. I am Chief of the Census Demographic Variables Section in the Centre for Demography at Statistics Canada.

Tegan: Yeah. So you said it's the gender question. Could you elaborate? So what's different about the 2021 census that we didn't ask before?

Anne: There's two, two main changes. There is a precision of “at birth” added to the sex question, and the question on gender is completely new. So that asked someone to identify whether they were male, female and there was an “or please specify” category where persons could write in their response.

It was historic to, to include this information.  It's the first time that, census data was released on the transgender and non-binary population among all the countries around the world. And so we're very proud of that.

Tegan: The census allowed people to write in whatever gender identity they would prefer. Why do that instead of boxes to tick?

Anne: With the gender question, we felt that having a write-in, or please specify this person's gender, was the most respectful and inclusive strategy to use so that people could select the gender that was most relevant to them.

Tegan: Do you have any idea how many different genders people put?

Anne: There were many.

Tegan: Many.

Anne: There were many,

Tegan: Many is a valid answer.

Tegan: Putting a blank space allowed respondents to describe themselves as they saw fit. And they did. StatCan uses the term non-binary as an umbrella term, but that's not necessarily how everyone describes themselves. Almost a third of those under this rainbow-coloured non-binary umbrella used a different word to describe their gender: androgynous, bigender, intergender, pangender, polygender, queer, and two spirit. These were all terms provided by census respondents, but to be sure these weren't all of them. As Anne said, there were many, many more.

Anne: If there was no gender question before, they wouldn't have had that opportunity to select their own gender. And that was one of the things that we noticed with the 2016 census when we were reviewing the comments that, uh, that people had put in, they were saying that the sex question, which had been there for many census cycles, was not precise enough for their needs. So some people felt that they were excluded. And of course, the goal of the census is to count all Canadians and have everybody see themselves in the data.

Tegan: Why is it important that people see themselves in the data?

Anne: Well, I think the, the census is a count of the total population in Canada. So of course it's important for people to see themselves in the data. We want people to have that experience and to feel that this information is relevant to them, that they're counted, that their voices matter. So really, that's the goal of the census: to include everybody and to have everybody feel included.

Tegan: So you said that people gave feedback in 2016, uh, saying that the census wasn't, asking necessarily the right question. Is that where the idea came from to make this change?

Anne: It was part of it. Each census cycle, we review all of the content, all of the questions, all of the definitions, and as part of that content determination process, there's an extensive consultation and engagement that takes place almost immediately after any census. And one of the common responses was that, gender was an information gap that, uh, that was needed.

And so following that, we had some more specific focus groups, individual, in-depth conversations. And this included all Canadians, cisgender, as well as transgender, non-binary persons. And so we, we took all that information into account. We developed the new content.

Tegan: What have we learned? What do the stats say about gender?

Anne: One in 300 Canadians aged 15 and over living in private households were transgender or non-binary individuals. Numerically, this is about a hundred thousand, so about 60,000 were transgender persons and about just over 40,000 were non-binary persons.

They tend to be younger on average than cisgender persons. So just to give you an idea about two thirds are less than age 35, so between age 15 and 34.

Tegan: There really is quite an age difference. Among those who were between 15 and 34, about one in 150 were transgender or non-binary. For those who were over the age of 35, it was only one in 550. That means, proportionally, there were more than three times as many transgender people aged 15 to 34 as there were 35 and over.  
 
Anne: And it might be that, you know, younger people are more comfortable reporting their gender. From a generational perspective, attitudes and behaviors of a particular generation are informed by historical context in which they're raised. So age differences were a big trend that we noticed.

Tegan: How comfortable people are reporting their gender on an official government form was something Cyril Cinder mentioned as well.

Cyril: I think a lot of people in the 2SLGBTQIA+ community are still somewhat apprehensive or nervous to, you know, really tell the government, yes, I am transgender, or, yes, I am non-binary due to historical systems of oppression and how they have impacted those communities.

Tegan: Age differences weren't the only notable finding in the data.

Anne: There were also regional differences. So, for example, among the largest urban areas, what we call census metropolitan areas, Victoria, Halifax, and Fredericton really stood out. And these three large urban areas had certain elements in common. They had stronger population growth between 2016 and 2021 compared to the national average. They had larger shares of people aged 15 to 34 than the national average. And they're all home to several major universities and colleges. And of course students tend to have a, a younger age profile.

So those are some of the high level findings. And we were so pleased with the reaction to the information. It was overwhelmingly positive. And so we were very, very proud of that.

Cyril: I think it's so important. I mean, other levels of government, we are recognizing that transgender and non-binary people exist, right? We have the option for different gender markers, on our IDs, and, you know, normally when you go to a doctor's office, or you fill out any other demographic form, you're gonna have an opportunity to indicate what your gender identity is. And when we're talking about something as large-scale as the census, which drives so many of the decisions that we make at various levels of government. It informs, you know, grant money, who's getting what, how many resources need to be allocated to which communities. It's important to have an accurate measure of those communities. You know, Canada was the first country to include this in the census, but these calls have been going on for over a decade to be able to access this information.

But even just asking the question, it, it indicates that the government does care about transgender and non-binary Canadians, that our experiences are important and that we are part of Canadian society.

Tegan: From your perspective, now that we have these data about transgender Canadians, what should we do with it? How can it be put to best use? And what would your hopes be for next steps?

Cyril: Mm-hmm. I think some of the important things is, is being able to use this information about like where are resources needed in particular, right. I was looking through the data and things like the vast majority of non-binary people live in six urban centers in Canada.

Right? That is huge to know, but also to know like how many aging trans, non-binary people do we have in Canada? What kind of services might they need within the elder care system in this country, which is dealing with a lot of struggles, but how might these people's needs be unique? Where are they? Where are those services needed to go? What can effectively serve these communities? How can we support these people who we know are more likely to struggle with negative mental health impacts? And other research also shows us more likely to live in poverty, more likely to deal with other axes of systemic oppression and various things like that. Making the information publicly available is also very helpful because it allows us to use it for advocacy work. And I think also just kind of sometimes putting in context how much vitriol is directed towards the trans non-binary community and how few of us there actually are. Right? We are a small community. Looking at the data, there's just about a hundred thousand of us in the country and knowing how to support that. Putting in context of how large our population is actually becoming and at the same time , in terms of next steps I think it’s important to get more accurate data.

Tegan: This is why the census is so crucial. StatCan doesn't just gather data. Our experts also analyze them, and Cyril's not the only one looking forward to getting more information.

Anne: It's very exciting and as an analyst this is the part that I enjoy because all of the census variables are available now: education, labor, income. So we can do a deeper dive into some of the patterns. And that's exactly what we're doing now. So there's a paper that's currently underway on the socioeconomic wellbeing of the transgender and non-binary population, looking at characteristics like education, labor force participation, income and housing. So that's underway, and that will be available in the coming months. That's one activity that we're working on: in-depth analysis.

And what's exciting about 2026 is we will have trends. 2021, it was, it was the excitement of having that data for the first time. But now we will have two time points so we can see what were the changes over time. And that will allow us to do even more interesting analysis.

Tegan: How often does the census change, and maybe more importantly, why does it change?

Anne: I would say that the census changes every census cycle. And that's what keeps it relevant. I mean, for, for over the past a hundred years, it continues to evolve as as society evolves, and that's what makes it exciting.

One example I can maybe give of content that has continued but also changed over time is the adult population 15 plus in couples. We've been measuring it for, for over a hundred years. So in 1921, a couple was a married couple. In 1981, we introduced the concept of common law couples. In 2001, we introduced the idea of same sex and opposite sex common law couples. Following national legislation that permitted same-sex couples to marry in 2005, we then counted same-sex married couples in 2006. So we have this increasing way to slice and dice the data, but we also have this continuity over time. And so then in 2021, we added this further element of being able to look at couples by gender. So whether a couple is comprised of one transgender person or one non-binary person. And so that ability to look at emerging family forms continues while maintaining that ability to look at historical trends as well.

Tegan: While on the subject of history and trends, it's important to make the point that even though 2021 was the first census to ask about gender, trans and non-binary Canadians have always been here.

Anne: I think there is a recognition that... that, for example, in this, in this situation, transgender and non-binary people have, have always existed, but it's our ability to measure it. And that's the, that's the new part.

Tegan: Just because you can't, you're not measuring something, doesn't mean it doesn't exist.

Anne: Exactly.

Tegan: Does the future look bright for trans and gender nonconforming Canadian kids? What opportunities and what challenges do you foresee?

Cyril: I, I think the future does look bright for Canadian transgender, non-binary, gender non-conforming youth. I, I think that there is something really, really wonderful ahead, but the path to that wonderful future currently has a lot of barriers in the way.

We have made incredible progress in recent decades as a community and we are seeing intense reactions to that progress from people who would like to see it clawed back and, you know, everything I ever let go had claw marks on it. So good luck with that initiative. I think a lot of us feel that way. But we cannot become complacent. We cannot, you know, pat ourselves on the back and say, job well done. The fight's over. We did it. And ignore what's actually happening on the ground. Because if we do that, we are going to lose that bright future.

We are going to repeat history and the repetition of history loses lives. People die in the circumstances that we have been living in for centuries at this point. And to me, that's not an acceptable way forward. It's, it's not okay to have our trans non-binary and gender non-conforming siblings lost in this fight.

Queer and trans kids should get to grow up to be queer and trans adults. And that should not be a matter of debate. That is asked and answered at this point. And we need to be firm in that and not fall into the paradox of tolerance, whereby tolerating intolerance, it is allowed to fester and grow and become cancerous and take over, and then all of a sudden, oh, where'd all those rights go that we fought so hard to win?

So, I do believe that the future is bright because I do believe that Canadians care about this, and I do believe that Canadians are intelligent and capable of understanding honest facts when they're placed in front of them, that we can dispel negative myths, that we can march forward together towards something that is better for all of us, but we need to put in the work to make sure that that happens.

Tegan: What is allyship to you and how can people be allies to the queer community?

Cyril: Allyship is active, not passive. A lot of people, you know, identify with the idea of allyship. They want to be an ally, and I think that's a wonderful thing. But when someone tells me like, oh, I'm an ally to the queer community, I'm like, great. What does that look like? What do you do to be an ally to the queer community? Because it's not enough to just not be homophobic, transphobic, and queer phobic. It's not enough to just not be a bigot. You have to oppose it in some way. You have to support the community in some way.

We can't be left alone just to fight for our own cause. We need our cisgender and heterosexual allies to also show up for us. And so allyship is an active thing. It is something you can be bestowed, something you can be granted from the community. You are an ally. You are showing up for us. You fight for us. You are willing to be uncomfortable if it means being able to protect our dignity and our personhood. That means a lot. It is not something that you can just claim.

Tegan: Is there anything you'd like to add?

Anne: Well, maybe just one more word about the value of the census in general. I see the value of it every day to us at StatCan, but also to broader Canadians. It's the best source of data for looking at smaller populations and subgroups, and of course, transgender and non-binary persons fall into that category. But there's many other smaller populations as well that's important to study. It's a valuable source for detailed and local geographies so that municipal planners can plan schools and hospitals and home care. As the concepts broaden, we often don't lose content, but it does allow us to, to integrate these new patterns that we're seeing. And so that makes it be able to evolve and maintain its relevance, and I think I can't finish without thanking all Canadians for their participation, for their input. It's very much appreciated and we certainly couldn't have the census without them.

Tegan: Thank you for your time. Thank you for sharing your expertise.

Anne: Thank you.

Tegan: If someone would like to learn more about you and your work, maybe they'd be interested in seeing what drag is firsthand. Where can they go?

Cyril: Oh, so I have a website, www.cyrilcinder.com. C-Y-R-I-L-C-I-N-D-E-R. I'm also all over all the social medias, uh, Instagram, TikTok, Facebook, all of those required things. They can come support their local drag. That is, to me, the greatest, most important thing. It is your local drag artists, the ones who maybe don't get to be on tv, who are maybe a little bit more different. Who are the ones who are out working in your community, who I think have the most valuable things to say. Um, I saw my first drag show in 2014 and it opened my eyes so, so much, and I just hope more people can go have that experience.

Tegan: And if maybe someone's listening and questioning their own gender. Do you have any suggestions or resources to recommend?

Cyril: If you're sort of questioning your own gender identity, there has been so much work and writing done to help you with that experience. There are really wonderful books. Um, You and Your Gender Identity by Hoffman-Fox is a great workbook that people can look through. It's often available at your local library. Your local library will have a lot of resources on gender identity and gender exploration for a variety of age groups. Um, you could look at Interligne, which is a 2SLGBTQIA+ listening service in Canada that is based out of Montreal. If you're Indigenous, there are Indigenous-focused resources for exploring two-spirit identity, you know? Open yourself up, ask questions. Go to your local queer bookstore or your local queer venue if you have one. If you don't have one, the internet is a fantastic place to find some good free educational resources and support from other people who feel like you, because I promise you, no matter what questions you have, no matter what feeling you are struggling with, you are not alone in that experience, and there is somebody out there who's asking the exact same questions, and you don't deserve to go through that journey alone.

Tegan: Thank you so much for joining us. We really appreciate it.

Cyril: Thank you for having me.

Tegan: You’ve been listening to Eh Sayers. Thank you to our guests, Cyril Cinder and Anne Milan. Molly’s Tuxedo was written by Vicki Johnson and illustrated by Gillian Reid. It was published by little bee books. Thank you for letting us share it on our show. It was read by Cyril Cinder. If you’re interested in learning more about our census gender data, check out the links in the show notes.

You can subscribe to this show wherever you get your podcasts. There you can also find the French version of our show, called Hé-coutez bien. If you liked this show, please rate, review, and subscribe. Thanks for listening!

Sources:

The Daily - Canada is the first country to provide census data on transgender and non-binary people

Filling the gaps: Information on gender in the 2021 Census

2021 Census: Sex at birth and gender - the whole picture

Quarterly Financial Report for the quarter ended June 30, 2023

Statement outlining results, risks and significant changes in operations, personnel and program

A) Introduction

Statistics Canada's mandate

Statistics Canada ("the agency") is a member of the Innovation, Science and Industry portfolio.

Statistics Canada's role is to ensure that Canadians have access to a trusted source of statistics on Canada that meets their highest priority needs.

The agency's mandate derives primarily from the Statistics Act. The Act requires that the agency collect, compile, analyze and publish statistical information on the economic, social, and general conditions of the country and its people. It also requires that Statistics Canada conduct the census of population and the census of agriculture every fifth year and protect the confidentiality of the information with which it is entrusted.

Statistics Canada also has a mandate to co-ordinate and lead the national statistical system. The agency is considered a leader, among statistical agencies around the world, in co-ordinating statistical activities to reduce duplication and reporting burden.

More information on Statistics Canada's mandate, roles, responsibilities and programs can be found in the 2023-2024 Main Estimates and in the Statistics Canada 2023-2024 Departmental Plan.

The Quarterly Financial Report:

  • should be read in conjunction with the 2023-2024 Main Estimates;
  • has been prepared by management, as required by Section 65.1 of the Financial Administration Act, and in the form and manner prescribed by Treasury Board of Canada Secretariat;
  • has not been subject to an external audit or review.

Statistics Canada has the authority to collect and spend revenue from other federal government departments and agencies, as well as from external clients, for statistical services and products.

Basis of presentation

This quarterly report has been prepared by management using an expenditure basis of accounting. The accompanying Statement of Authorities includes the agency's spending authorities granted by Parliament and those used by the agency consistent with the Main Estimates for the 2023-2024 fiscal year. This quarterly report has been prepared using a special purpose financial reporting framework designed to meet financial information needs with respect to the use of spending authorities.

The authority of Parliament is required before moneys can be spent by the Government. Approvals are given in the form of annually approved limits through appropriation acts or through legislation in the form of statutory spending authority for specific purposes.

The agency uses the full accrual method of accounting to prepare and present its annual departmental financial statements that are part of the departmental results reporting process. However, the spending authorities voted by Parliament remain on an expenditure basis.

B) Highlights of fiscal quarter and fiscal year-to-date results

This section highlights the significant items that contributed to the net increase in resources available for the year, as well as actual expenditures for the quarter ended June 30.

Comparison of gross budgetary authorities and expenditures as of June 30, 2022, and June 30, 2023, in thousands of dollars
Description for Chart 1: Comparison of gross budgetary authorities and expenditures as of June 30, 2022, and June 30, 2023, in thousands of dollars

This bar graph shows Statistics Canada's budgetary authorities and expenditures, in thousands of dollars, as of June 30, 2022 and 2023:

  • As at June 30, 2022
    • Net budgetary authorities: $576,698
    • Vote netting authority: $120,000
    • Total authority: $696,698
    • Net expenditures for the period ending June 30: $185,286
    • Year-to-date revenues spent from vote netting authority for the period ending June 30: $11,675
    • Total expenditures: $196,961
  • As at June 30, 2023
    • Net budgetary authorities: $619,835
    • Vote netting authority: $120,000
    • Total authority: $739,835
    • Net expenditures for the period ending June 30: $184,915
    • Year-to-date revenues spent from vote netting authority for the period ending June 30: $3,990
    • Total expenditures: $188,905

Chart 1 outlines the gross budgetary authorities, which represent the resources available for use for the year as of June 30.

Significant changes to authorities

Total authorities available for 2023-24 have increased by $43.1 million, or 6.2%, from the previous year, from $696.7 million to $739.8 million (Chart 1). The net increase is mostly the result of the following:

  • An increase of $87.2 million for funding received to cover the initial planning phase and development activities related to the 2026 Census of Population and 2026 Census of Agriculture programs;
  • A decrease of $48 million for the 2021 Census of Population and 2021 Census of Agriculture programs, reflecting the cyclical nature of the funding as it winds down;
  • A decrease of $1.8 million for the Disaggregated Data Action Plan;
  • An increase of $1.3 million for salary increases related to the latest rounds of collective bargaining;
  • An increase of $6.7 million for various initiatives including Statistical Survey Operations Modernization, Canada Dental Benefit, Federal Action Plan to Strengthen Internal Trade, Higher Education Intellectual Property Commercialization and Advancing a Circular Plastics Economy for Canada.

In addition to the appropriations allocated to the agency through the Main Estimates, Statistics Canada also has vote netting authority within Vote 1, which entitles the agency to spend revenues collected from other federal government departments, agencies, and external clients to provide statistical services. The vote netting authority is stable at $120 million when comparing the first quarter of fiscal years 2022-2023 and 2023-2024.

Significant changes to expenditures

Year-to-date net expenditures recorded to the end of the first quarter decreased by $0.4 million, or 0.2% from the previous year, from $185.3 million to $184.9 million (see Table A: Variation in Departmental Expenditures by Standard Object).

Statistics Canada spent approximately 29.8% of its authorities by the end of the first quarter, compared with 32.1% in the same quarter of 2022-2023.

Table A: Variation in Departmental Expenditures by Standard Object (unaudited)
This table displays the variance of departmental expenditures by standard object between fiscal years 2022-2023 and 2023-2024. The variance is calculated for year-to-date expenditures as at the end of the first quarter. The row headers provide information by standard object. The column headers provide information in thousands of dollars and percentage variance for the year-to-date variation.
Departmental Expenditures Variation by Standard Object: Q1 year-to-date variation between fiscal year 2022-2023 and 2023-2024
$'000 %
(01) Personnel -6,633 -3.9
(02) Transportation and communications 393 11.0
(03) Information 1 0.1
(04) Professional and special services 1,834 22.1
(05) Rentals -1,641 -16.2
(06) Repair and maintenance -69 -44.8
(07) Utilities, materials and supplies -140 -65.1
(08) Acquisition of land, buildings and works - N/A
(09) Acquisition of machinery and equipment -1,141 -72.5
(10) Transfer payments - N/A
(12) Other subsidies and payments -660 -79.8
Total gross budgetary expenditures -8,056 -4.1
Less revenues netted against expenditures:
Revenues -7,685 -65.8
Total net budgetary expenditures -371 -0.2
Note: Explanations are provided for variances of more than $1 million.

Personnel: The decrease is mainly due to lower spending on seasonal, casual, and student salaries, offset by a slight increase related to cost-recovery work following the dissemination of the 2021 Census of Population.

Professional and special services: The increase is mainly due to expenses for IT consultants and a timing difference in invoicing compared with the first quarter of 2022-2023.

Rentals: The decrease is mainly due to a one-time invoice for a software licence paid in the first quarter of 2022-2023.

Acquisition of machinery and equipment: The decrease is mainly due to the purchase of computers in the first quarter of 2022-2023.

Revenues: The decrease is mainly due to a timing difference in invoicing compared to last year.

C) Significant changes to operations, personnel and programs

In 2023-2024, the following changes in operations and program activities are underway:

  • The Census program is ramping down operations for the 2021 cycle and is in the planning phase for the 2026 Censuses of Population and Agriculture programs.
  • Budget 2023 announced funding for new initiatives such as the Canadian Dental Care program and the Official Languages Action Plan.
  • Budget 2023 announced a commitment to refocus government spending:
    • Budget 2023 proposes to reduce spending on consulting, other professional services, and travel by roughly 15 per cent starting in 2023-2024. The government will focus on targeting these reductions on professional services, particularly management consulting.
    • Budget 2023 proposes to phase in a roughly 3 per cent reduction of eligible spending by departments and agencies by 2026-2027.
  • Statistics Canada is committed to effective management of its programs and services. In anticipation of the announcement of pending reductions, Statistics Canada launched a review in 2022 to identify efficiencies and reductions to programs or services.

D) Risks and uncertainties

Statistics Canada will address the issues and uncertainties raised in this Quarterly Financial Report by implementing the corresponding risk mitigation measures captured in the 2023-2024 Corporate Risk Profile and at the program level.

Statistics Canada continues to pursue and invest in modernizing business processes and tools to maintain its relevance and maximize the value it provides to Canadians. To address uncertainties, the agency is implementing the Census of Environment, the Quality of Life Framework for Canada and the Disaggregated Data Action Plan initiatives to meet the evolving needs of users and remain relevant as an agency. The agency is also remaining vigilant to cyber threats while supporting the use of modern methods with a functional digital infrastructure.

Statistics Canada requires a skilled workforce to achieve its objectives; however, it is difficult to compete with other organizations in the data ecosystem given the current labour market. To address these uncertainties, Statistics Canada will create partnerships with other government departments, international organizations and IT industry partners to find innovative ways to collaborate on bridging gaps in digital skills and IT human resource shortfalls. The agency will continue promoting a strong workplace culture and a healthy work-life balance and advancing the Equity, Diversity and Inclusion Action Plan. In addition, it will focus on existing employees and continue its efforts to achieve greater diversity and inclusion across its workforce and to promote and support accessibility.

Statistics Canada continues its collaboration with federal partners to access the IT services and support needed to realize its modernization objectives and to implement the Cloud Optimization Activities. To address these uncertainties, the agency is working closely with its federal partners while adhering to its financial planning and management practices and its integrated strategic planning framework, as well as strengthening its financial stewardship.

Approval by senior officials

Approved by:

Anil Arora, Chief Statistician
Ottawa, Ontario
Signed on: August 23rd, 2023

Kathleen Mitchell, Chief Financial Officer
Ottawa, Ontario
Signed on: August 15th, 2023

Appendix

Statement of Authorities (unaudited)
This table displays the departmental authorities for fiscal years 2022-2023 and 2023-2024. The row headers provide information by type of authority: Vote 1 – Net operating expenditures, Statutory authority and Total budgetary authorities. The column headers provide information in thousands of dollars for the total available for use for the year ending March 31; used during the quarter ended June 30; and year-to-date used at quarter-end of both fiscal years.
  Fiscal year 2023-2024 Fiscal year 2022-2023
Total available for use for the year ending March 31, 2024 (Table note *) Used during the quarter ended June 30, 2023 Year-to-date used at quarter-end Total available for use for the year ending March 31, 2023 (Table note *) Used during the quarter ended June 30, 2022 Year-to-date used at quarter-end
in thousands of dollars
Vote 1 — Net operating expenditures 530,377 166,191 166,191 496,731 165,294 165,294
Statutory authority — Contribution to employee benefit plans 89,458 18,724 18,724 79,967 19,992 19,992
Total budgetary authorities 619,835 184,915 184,915 576,698 185,286 185,286
Table note *

Includes only Authorities available for use and granted by Parliament at quarter-end.


Departmental budgetary expenditures by Standard Object (unaudited)
This table displays the departmental expenditures by standard object for fiscal years 2022-2023 and 2023-2024. The row headers provide information by standard object for expenditures and revenues. The column headers provide information in thousands of dollars for planned expenditures for the year ending March 31; expended during the quarter ended June 30; and year to date used at quarter-end of both fiscal years.
  Fiscal year 2023-2024 Fiscal year 2022-2023
Planned expenditures for the year ending March 31, 2024 Expended during the quarter ended June 30, 2023 Year-to-date used at quarter-end Planned expenditures for the year ending March 31, 2023 Expended during the quarter ended June 30, 2022 Year-to-date used at quarter-end
in thousands of dollars
Expenditures:
(01) Personnel 636,127 164,220 164,220 613,079 170,853 170,853
(02) Transportation and communications 11,992 3,979 3,979 11,745 3,586 3,586
(03) Information 8,682 1,340 1,340 9,041 1,339 1,339
(04) Professional and special services 48,413 10,120 10,120 35,898 8,286 8,286
(05) Rentals 21,089 8,487 8,487 17,160 10,128 10,128
(06) Repair and maintenance 972 85 85 475 154 154
(07) Utilities, materials and supplies 1,642 75 75 1,736 215 215
(08) Acquisition of land, buildings and works 557 - - 555 - -
(09) Acquisition of machinery and equipment 10,304 432 432 6,962 1,573 1,573
(10) Transfer payments - - - - - -
(12) Other subsidies and payments 57 167 167 47 827 827
Total gross budgetary expenditures 739,835 188,905 188,905 696,698 196,961 196,961
Less revenues netted against expenditures:
Revenues 120,000 3,990 3,990 120,000 11,675 11,675
Total revenues netted against expenditures 120,000 3,990 3,990 120,000 11,675 11,675
Total net budgetary expenditures 619,835 184,915 184,915 576,698 185,286 185,286

Production level code in Data Science

By David Chiumera, Statistics Canada

In recent years, the field of data science has experienced explosive growth, with businesses across many sectors investing heavily in data-driven solutions to optimize decision-making processes. However, the success of any data science project relies heavily on the quality of the code that underpins it. Writing production-level code is crucial to ensure that data science models and applications can be deployed and maintained effectively, enabling businesses to realize the full value of their investment in data science.

Production-level code refers to code that is designed to meet the needs of the end user, with a focus on scalability, robustness, and maintainability. This contrasts with code that is written purely for experimentation and exploratory purposes, which may not be optimized for production use. Writing production-level code is essential for data science projects as it allows for the efficient deployment of solutions into production environments, where they can be integrated with other systems and used to inform decision-making.

Production-level code has several key benefits for data science projects. First, it ensures that data science solutions can be easily deployed and maintained. Second, it reduces the risk of errors, vulnerabilities, and downtime. Third, it facilitates collaboration between data scientists and software developers, enabling them to work together more effectively to deliver high-quality solutions. Finally, it promotes code reuse and transparency, allowing data scientists to share their work with others and build on existing code to improve future projects.

Overall, production-level code is an essential component of any successful data science project. By prioritizing the development of high-quality, scalable, and maintainable code, businesses can ensure that their investment in data science delivers maximum value, enabling them to make more informed decisions and gain a competitive edge in today's data-driven economy.

Scope of Data Science and its various applications

The scope of data science is vast, encompassing a broad range of techniques and tools used to extract insights from data. At its core, data science involves the collection, cleaning, and analysis of data to identify patterns and make predictions. Its applications are numerous, ranging from business intelligence and marketing analytics to healthcare and scientific research. Data science is used to solve a wide range of problems, such as predicting consumer behavior, detecting fraud, optimizing operations, and improving healthcare outcomes. As the amount of data generated continues to grow, the scope of data science is expected to expand further, with increasing emphasis on the use of advanced techniques such as machine learning and artificial intelligence.

Proper programming and software engineering practices for Data Scientists

Proper programming and software engineering practices are essential for building robust data science applications that can be deployed and maintained effectively. Robust applications are those that are reliable, scalable, and efficient, with a focus on meeting the needs of the end user. Several practices are particularly important in the context of data science, including version control, automated testing, documentation, security, code optimization, and proper use of design patterns.

By following these practices, data scientists can build applications that are reliable, scalable, and efficient and that meet the needs of the end user. This is critical for ensuring that data science solutions deliver maximum value to businesses and other organizations.

The Administrative Data Pre-processing (ADP) project and its purpose: an example

The ADP project is a Field 7 application that required involvement from the Data Science Division to refactor a citizen-developed component because of a variety of issues that were negatively affecting its production readiness. Specifically, the codebase used to integrate workflows external to the system did not adhere to established programming practices, leading to a cumbersome and difficult user experience. Moreover, the program gave no meaningful feedback upon failure, making it difficult to diagnose and address issues.

Further exacerbating the problem, the codebase also lacked documentation, error logging, and meaningful error messages for users. The code was overly coupled, making it difficult to modify or extend the program's functionality as needed, and there were no unit tests in place to ensure reliability or accuracy. Additionally, the code was overfitted to a single example, which made it challenging to generalize to other use cases, and several features desired by the client were missing.

Given these issues, the ability for the ADP project to pre-process semi-structured data was seriously compromised. The lack of feedback and documentation made it exceedingly difficult for the client to use the integrated workflows effectively, if at all, leading to frustration and inefficiencies. The program outputs were often inconsistent with expectations, and the absence of unit tests meant that reliability and accuracy were not assured. In summary, the ADP project's need for a refactor of the integrated workflows (a.k.a. clean-up or redesign) was multifaceted and involved addressing a range of programming and engineering challenges to ensure a more robust and production-ready application. To accomplish this, we used a Red Green refactoring approach to improve the quality of the product.

Red Green vs Green Red approach to refactoring

Refactoring is the process of restructuring existing code in order to improve its quality, readability, maintainability, and performance. This can involve a variety of activities, including cleaning up code formatting, eliminating code duplication, improving naming conventions, and introducing new abstractions and design patterns.

There are several reasons why refactoring is beneficial. Firstly, it can improve the overall quality of the codebase, making it easier to understand and maintain. This can save time and effort over the long term, especially as codebases become larger and more complex. Additionally, refactoring can improve performance and reduce the risk of bugs or errors, leading to a more reliable and robust application.

One popular approach to refactoring is the "Red Green" approach, as part of the test-driven development process. In the Red Green approach, a failing test case is written before any code is written or refactored. This failing test is then followed by writing the minimum amount of code required to make the test pass, before proceeding to refactor the code to a better state if necessary. In contrast, the Green Red approach is the reverse of this, where the code is written before the test cases are written and run.

The benefits of the Red Green approach include the ability to catch errors early in the development process, leading to fewer bugs and more efficient development cycles. The approach also emphasizes test-driven development, which can lead to more reliable and accurate code. Additionally, it encourages developers to consider the user experience from the outset, ensuring that the codebase is designed with the end user in mind.

Figure 1: Red Green Refactor

The first step, the Red component, refers to writing a test that fails. From there, the code is modified to make the test pass, which is the Green component. Lastly, any refactoring needed to further improve the codebase is done; another test is then written and run, which fails, and this is the Red component again. The cycle continues until the desired state is reached, terminating the feedback loop.
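
As a minimal sketch of this cycle (the file and function names below are hypothetical and not taken from the ADP codebase; pytest is assumed as the test runner), the Red step is a test written before the implementation exists, and the Green step is the smallest change that makes it pass:

# test_cleaning.py -- Red: written first, this test fails because
# normalize_postal_code() does not exist yet.
from cleaning import normalize_postal_code

def test_normalize_postal_code():
    assert normalize_postal_code(" k1a 0b1 ") == "K1A 0B1"

# cleaning.py -- Green: the minimal implementation that makes the test pass.
def normalize_postal_code(raw: str) -> str:
    """Return an upper-case postal code with a single internal space."""
    compact = raw.strip().upper().replace(" ", "")
    return f"{compact[:3]} {compact[3:]}"

Once the test passes, the implementation can be refactored (for example, to add input validation) while the test continues to guard its behavior, and the next failing test starts the cycle again.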

In the case of the ADP project, the Red Green approach was applied during the refactoring process. This led to a smooth deployment process, with the application being more reliable, robust, and easier to use. By applying this approach, we were able to address the various programming and engineering challenges facing the project, resulting in a more efficient, effective, stable, and production-ready application.

Standard Practices Often Missing in Data Science Work

While data science has become a critical field in many industries, it is not without its challenges. One of the biggest is that standard practices are often missing in data science work. While many standard practices can improve the quality, maintainability, and reproducibility of data science code, data scientists often overlook them in favor of quick solutions.

This section will cover some of the most important standard practices that are often missing in data science work. These include:

  • version control
  • testing code (unit, integration, system, acceptance)
  • documentation
  • code reviews
  • ensuring reproducibility
  • adhering to style guidelines (e.g., PEP standards)
  • using type hints
  • writing clear docstrings
  • logging errors
  • validating data
  • writing low-overhead code
  • implementing continuous integration and continuous deployment (CI/CD) processes

By following these standard practices, data scientists can improve the quality and reliability of their code, reduce errors and bugs, and make their work more accessible to others.

Documenting Code

Documenting code is crucial for making code understandable and usable by other developers. In data science, this can include documenting data cleaning, feature engineering, model training, and evaluation steps. Without proper documentation, it can be difficult for others to understand what the code does, what assumptions were made, and what trade-offs were considered. It can also make it difficult to reproduce results, which is a fundamental aspect of scientific research as well as building robust and reliable applications.

Writing Clear Docstrings

Docstrings are strings that provide documentation for functions, classes, and modules. They are typically written in a special format that can be easily parsed by tools like Sphinx to generate documentation. Writing clear docstrings can help other developers understand what a function or module does, what arguments it takes, and what it returns. It can also provide examples of how to use the code, which can make it easier for other developers to integrate the code into their own projects.

# Adapted from the multi-line docstring example in PEP 257; complex_zero is
# assumed to be defined elsewhere in the module.
def complex(real=0.0, imag=0.0):
    """Form a complex number.

    Keyword arguments:
    real -- the real part (default 0.0)
    imag -- the imaginary part (default 0.0)
    """
    if imag == 0.0 and real == 0.0:
        return complex_zero
    ...

Multi-Line Docstring Example

Adhering to Style Guidelines

Style guidelines in code play a crucial role in ensuring readability, maintainability, and consistency across a project. By adhering to these guidelines, developers can enhance collaboration and reduce the risk of errors. Consistent indentation, clear variable naming, concise commenting, and following established conventions are some key elements of effective style guidelines that contribute to producing high-quality, well-organized code. An example is the PEP (Python Enhancement Proposal) standards, which provide guidelines and best practices for writing Python code. Following them ensures that code can be understood by other Python developers, which is important in collaborative projects and for general maintainability. Some PEP standards address naming conventions, code formatting, and how to handle errors and exceptions.
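
As a small, hypothetical illustration of what adhering to PEP 8 looks like in practice, the same calculation is shown below written with and without the conventions:

# Harder to read: cryptic names, no spacing, everything on one line.
def f(x,y):return x*1.13+y

# PEP 8 style: descriptive names, a module-level constant, standard spacing, and a docstring.
SALES_TAX_RATE = 1.13

def total_with_tax(subtotal: float, shipping: float) -> float:
    """Return the order total, applying sales tax to the subtotal only."""
    return subtotal * SALES_TAX_RATE + shipping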

Using Type Hints

Type hints are annotations that indicate the type of a variable or function argument. They are not strictly necessary for Python code to run, but they can improve code readability, maintainability, and reliability. Type hints can help detect errors earlier in the development process and make code easier for other developers to understand. They also provide better integrated development environment (IDE) support, and tools that compile annotated Python can use them to generate more efficient code.
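
For example (a hypothetical helper, assuming Python 3.9 or later for the built-in generic syntax), type hints make the expected inputs and output explicit to both readers and tools such as an IDE or a static type checker:

from typing import Optional

def column_mean(rows: list[dict[str, float]], column: str) -> Optional[float]:
    """Return the mean of `column` across rows, or None if the column never appears."""
    values = [row[column] for row in rows if column in row]
    if not values:
        return None
    return sum(values) / len(values)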

Version Control

Version control is the process of managing changes to code and other files over time. It allows developers to track and revert changes, collaborate on code, and ensure that everyone is working with the same version of the code. In data science, version control is particularly important because experiments can generate large amounts of data and code. By using version control, data scientists can ensure that they can reproduce and compare results across different versions of their code and data. It also provides a way to track and document changes, which can be important for compliance and auditing purposes.

Figure 2: Version Control Illustration

A master branch (V1) is created as the main project. A new branch, branching off V1, is created in order to develop and test until the modifications are ready to be merged back into V1, creating V2 of the master branch. V2 is then released.

Testing Code

Testing code is the formal (and sometimes automated) verification of the completeness, quality, and accuracy of code against expected results. Testing code is essential for ensuring that the codebase works as expected and can be relied upon. In data science, testing can include unit tests for functions and classes, integration tests for models and pipelines, and validation tests for datasets. By testing code, data scientists can catch errors and bugs earlier in the development process and ensure that changes to the code do not introduce new problems. This can save time and resources in the long run by reducing the likelihood of unexpected errors and improving the overall quality of the code.
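
A minimal unit test sketch is shown below (pytest is assumed as the test runner, and the cleaning function is hypothetical); it checks the expected behavior, including edge cases, automatically:

from typing import Optional

import pytest

def drop_missing(values: list[Optional[float]]) -> list[float]:
    """Remove None entries from a list of readings."""
    return [v for v in values if v is not None]

@pytest.mark.parametrize(
    "raw, expected",
    [
        ([1.0, None, 2.5], [1.0, 2.5]),  # typical case with one missing value
        ([None, None], []),              # all values missing
        ([], []),                        # empty input
    ],
)
def test_drop_missing(raw, expected):
    assert drop_missing(raw) == expected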

Code Reviews

Code reviews are a process in which other developers review new code and code changes to ensure that they meet quality and style standards, are maintainable, and meet the project requirements. In data science, code reviews can be particularly important because experiments can generate complex code and data, and because data scientists often work independently or in small teams. Code reviews can catch errors, ensure that code adheres to best practices and project requirements, and promote knowledge sharing and collaboration among team members.

Ensuring Reproducibility

Reproducibility is a critical aspect of scientific research and data science. Reproducible results are necessary for verifying and building on previous research, and for ensuring that results are consistent, valid and reliable. In data science, ensuring reproducibility can include documenting code and data, using version control, rigorous testing, and providing detailed instructions for running experiments. By ensuring reproducibility, data scientists can make their results more trustworthy and credible and can increase confidence in their findings.
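
One small, commonly used step in that direction (a sketch; the seed value is arbitrary and NumPy is assumed to be the main source of randomness) is to fix random seeds and record the environment alongside the results:

import random

import numpy as np

SEED = 20230630  # recorded with the results so the run can be repeated exactly

def set_seeds(seed: int = SEED) -> None:
    """Fix the random seeds for the libraries used in the experiment."""
    random.seed(seed)
    np.random.seed(seed)

set_seeds()
# Capturing package versions (for example, `pip freeze > requirements.txt`)
# lets someone else rebuild the same environment later.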

Logging

Logging refers to the act of keeping a register of events that occur in a computer system. This is important for troubleshooting, information gathering, security, and providing audit information, among other purposes. It generally means writing messages to a log file. Logging is a crucial part of developing robust and reliable software, including data science applications. Logging errors helps identify issues with the application, which in turn helps to debug and improve it. By logging errors, developers gain visibility into what went wrong in the application, which can help them diagnose the problem and take corrective action.
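
A minimal sketch using Python's standard logging module is shown below (the log file name and message format are illustrative choices):

import logging

logging.basicConfig(
    filename="pipeline.log",
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
)
logger = logging.getLogger(__name__)

def load_input(path: str) -> str:
    """Read an input file, logging failures with enough context to diagnose them."""
    try:
        with open(path, encoding="utf-8") as handle:
            return handle.read()
    except OSError:
        logger.exception("Failed to read input file %s", path)
        raise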

Logging also enables developers to track the performance of the application over time, allowing them to identify potential bottlenecks and areas for improvement. This can be particularly important for data science applications that may be dealing with large datasets or complex algorithms.

Overall, logging is an essential practice for developing and maintaining high-quality data science applications.

Writing Low-Overhead Code

When it comes to data science applications, performance is often a key consideration. To ensure that the application is fast and responsive, it's important to write code that is optimized for speed and efficiency.

One way to achieve this is by writing low-overhead code. Low-overhead code is code that uses minimal resources and has a low computational cost. This can help to improve the performance of the application, particularly when dealing with large datasets or complex algorithms.
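
As a small illustration (the file path and length threshold are hypothetical), streaming a file line by line with a generator expression keeps memory use low compared with reading the whole file into a list first:

def count_long_records(path: str, min_length: int = 100) -> int:
    """Count records longer than min_length characters without loading the whole file."""
    with open(path, encoding="utf-8") as handle:
        # The generator yields one line at a time, so memory use stays roughly constant.
        return sum(1 for line in handle if len(line) >= min_length)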

Writing low-overhead code requires careful consideration of the algorithms and data structures used in the application, as well as attention to detail when it comes to memory usage and processing efficiency. Thought should be given up front to the system's needs and its overall architecture and design, to avoid major design changes down the road.

Additionally, low-overhead code tends to be easier to maintain, requiring less frequent reviews and updates. This is important because it reduces the cost of maintaining systems and allows development effort to be focused on improvements or new solutions.

Overall, writing low-overhead code is an important practice for data scientists looking to develop fast and responsive applications that can handle large datasets and complex analyses while keeping maintenance costs low.

Data Validation

Data validation is the process of checking that the input data meets certain requirements or standards. Data validation is another important practice in data science as it can help to identify errors or inconsistencies in the data before they impact the analysis or modeling process.

Data validation can take many forms, from checking that the data is in the correct format to verifying that it falls within expected ranges or values. Different types of data validation checks exist, such as type, format, correctness, consistency, and uniqueness. By validating data, data scientists can ensure that their analyses are based on accurate and reliable data, which can improve the accuracy and credibility of their results.
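
A simple validation sketch is shown below (the field names, allowed range, and province codes are illustrative assumptions, not taken from any survey or system discussed above):

VALID_PROVINCES = {"AB", "BC", "MB", "NB", "NL", "NS", "NT", "NU", "ON", "PE", "QC", "SK", "YT"}

def validate_record(record: dict) -> list[str]:
    """Return a list of validation problems for one input record (empty if valid)."""
    problems = []
    # Type check: age must be an integer.
    if not isinstance(record.get("age"), int):
        problems.append("age must be an integer")
    # Range check: age must fall within an expected interval.
    elif not 0 <= record["age"] <= 120:
        problems.append("age outside expected range 0-120")
    # Consistency check: province code must be one of the known values.
    if record.get("province") not in VALID_PROVINCES:
        problems.append("unknown province code")
    return problems

A record such as {"age": 34, "province": "ON"} returns an empty list, while {"age": "34", "province": "ZZ"} returns two problems that can be logged or surfaced before analysis.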

Continuous Integration and Continuous Deployment (CI/CD)

Continuous Integration and Continuous Deployment (CI/CD) is a set of best practices for automating the process of building, testing, and deploying software. CI/CD can help to improve the quality and reliability of data science applications by ensuring that changes are tested thoroughly and deployed quickly and reliably.

CI/CD involves automating the process of building, testing, and deploying software, often using tools and platforms such as Jenkins, GitLab, or GitHub Actions. By automating these processes, developers can ensure that the application is built and tested consistently, and that any errors or issues are identified and addressed quickly, before problematic code is deployed.

CI/CD can also help to improve collaboration among team members, by ensuring that changes are integrated and tested as soon as they are made, rather than waiting for a periodic release cycle.

Figure 3: CI/CD

The image illustrates a repeating process represented by an infinity symbol sectioned into eight unequal parts. Starting from the middle and moving counterclockwise, the first of these parts are plan, code, build, and continuous testing. Then, continuing from the last piece, which sits in the center, and moving clockwise, the parts are release, deploy, operate, and monitor, before returning to the original state of plan.

Overall, CI/CD is an important practice for data scientists looking to develop and deploy high-quality data science applications quickly and reliably.

Conclusion

In summary, production-level code is critical for data science projects and applications. Proper programming practices and software engineering principles such as adhering to PEP standards, using type hints, writing clear docstrings, version control, testing code, logging errors, validating data, writing low-overhead code, implementing continuous integration and continuous deployment (CI/CD), and ensuring reproducibility are essential for creating robust, maintainable, and scalable applications.

Not following these practices can result in difficulties such as a lack of documentation, no error logging, no meaningful error messages for users, highly coupled code, code overfitted to a single example, missing features desired by clients, and a failure to provide feedback upon failure. These issues can severely impact production readiness and frustrate users. Frustrated users are less productive, which in turn has negative downstream impacts on an organization's ability to deliver its mandate effectively.

The most practical tip for implementing production-level code is to work together, assign clear responsibilities and deadlines, and understand the importance of each of these concepts. By doing so, teams can embed these practices in their projects and create maintainable and scalable applications.

Meet the Data Scientist

Register for the Data Science Network's Meet the Data Scientist Presentation

If you have any questions about my article or would like to discuss this further, I invite you to Meet the Data Scientist, an event where authors meet the readers, present their topic and discuss their findings.

Register for the Meet the Data Scientist event. We hope to see you there!

MS Teams – link will be provided to the registrants by email

Subscribe to the Data Science Network for the Federal Public Service newsletter to keep up with the latest data science news.