Privacy Preserving Technologies Part Two: Introduction to Homomorphic Encryption

By Zachary Zanussi, Statistics Canada

Have you ever wished that there was a way to access data to perform analytics while preserving the privacy of the data itself? Homomorphic encryption is an emerging privacy preserving technique with potential applications that will allow for greater access while keeping data encrypted and secure.

The first article in the series, Brief Survey of Privacy Preserving Technologies, introduced privacy preserving techniques (PPTs) and how they are poised to enable analytics while protecting the privacy of the data. This article will build on that topic by taking a deeper look at one of these techniques, homomorphic encryption (HE), including what it is, how it works and what it can do for you.

This article begins with an overview of HE and introduces some common use cases. It gives an honest evaluation of HE's advantages and disadvantages. Then it will cover some of the more technical details to prepare you to dig into these techniques yourself! By the end of this article, hopefully you will be inspired to continue your learning by picking an HE library and making your own encrypted circuits.

Homomorphic encryption is currently being considered by international groups for standardization. The Government of Canada does not recommend that HE, or any cryptographic technique, be used in practice before standardization by experts. While HE is not yet ready for use on sensitive data, this is a great time to explore its functionality and potential use cases. Expect a future article on the standardization activities related to HE including expected timelines and schemes.

What is homomorphic encryption?

A traditional encryption scheme maps human-readable plaintexts into masked ciphertexts to protect data from prying eyes. Once masked, these ciphertexts are immutable; changing even a single bit in the ciphertext may return an unrecognizable plaintext message upon decryption. This makes traditional encryption quite static. By contrast, a homomorphic encryption scheme is dynamic; given two ciphertexts, you can perform operations on the underlying plaintexts. For example, a homomorphic 'add' operation will return a ciphertext that, upon decryption, returns the sum of the two original plaintext messages. This allows you to delegate computation to another party, who can manipulate the encrypted data without ever accessing the underlying plaintext.

A typical cloud computing protocol involves a client sending its data to the cloud. Since internet connections are inherently insecure, this transfer is facilitated by a form of transport security protocol that involves encryption, such as HTTPS. Upon receipt, the cloud decrypts and begins computation. However, what if you want to keep the data secret from the cloud? If you encrypted with a homomorphic scheme, not only would the data be protected during transport, but it would also be protected during the entire computation process. Upon completion, the cloud would forward the encrypted results back to the client, who could decrypt and view the results at their leisure.

The term "homomorphic" comes from Greek, roughly translating to "similar form." In mathematics, a homomorphism is a map from one mathematical structure to another that preserves the operations of the first structure. To construct a homomorphic encryption scheme, you need an encryption map that scrambles the data enough that no one can figure out what they are, while simultaneously preserving the structure of the data so that operations on ciphertexts result in predictable results in the plaintexts. These paradoxical goals underscore the difficulty in constructing such a scheme.

Figure 1: An illustration of the benefits of HE. On the left is ordinary encryption; to apply the desired analytics, the data need to first be decrypted using the private key. To make the results safe for transport, it must be re-encrypted. In addition, the data are vulnerable for the duration of the computation. On the right is HE; the computing party doesn't require any sensitive information to perform the calculation and the data and results are protected by encryption.

Description - Figure 1

An illustration of the difference between computations with ordinary and homomorphic encryption. In the case of ordinary encryption, the data, a box of lines with a padlock on it, must first be decrypted using some key, resulting in the same box with an unlocked padlock. If the results must be communicated to another party, they must then be encrypted again using another key. In the case of homomorphic encryption, the computation can be performed directly, without any secret information like keys.

What can you do with homomorphic encryption?

There are a number of different computing paradigms that can be enhanced with HE, including delegated computing, data sharing and data release. These different paradigms all revolve around the fact that the data holder, analyst and computing platforms are often different parties entirely and the aim is to reduce or remove the privacy concerns that arise when one of these parties shouldn't have access to the data. It is important to note that HE uses a weaker security model than traditional cryptography and that care will need to be taken to ensure that it is used securely in practice.Footnote 1

Possibly the simplest application involves a data holder delegating their computing to another party, such as the cloud. In this scenario, a client encrypts their data and sends them along with some instructions to the cloud. The cloud can carry out those instructions homomorphically and return the encrypted results, learning nothing about the input, output or intermediate values. These instructions are modeled as circuits, which are sequences of arithmetic operations applied to some input. It should be noted that creating correct and efficient circuits with HE is not always straightforward, but theoretically there is no limit to the computations that can be run. For example, Statistics Canada has completed proofs of conceptFootnote 2 applying statistical analysis and neural network training to encrypted data.

As an extension of the delegated computing scenario, consider a case where there are multiple data holders. These data sources want to share their data, but are prevented due to privacy issues. The exact outline depends on the trust model; however, HE may allow these different parties to each encrypt their data and share them with a central authority who has the power to compute homomorphically. These data sharing applications can allow for better analytics in scenarios where data are limited and sheltered. An example is an oncologist who wants to test their hypotheses; patient data are typically restricted to the treating hospitals and combining these sets not only increases the strength of the model, but removes geographic data biases. Therefore, allowing multiple hospitals to share their encrypted data and allowing the oncologist to compute on this joint encrypted dataset allows for better healthcare research and outcomes.

Consider also scenarios with a central data holder and several parties who want to perform analysis on these data. An example of this is Statistics Canada's Research Data Centres, which are hosted across Canada in secure facilities managed by the organization. Accredited researchers can gain special approval to access microdata within these secure sites. While secure, the approval process takes time and the researchers must be able to physically access these sites. With HE, the data centres may be able to host the data encrypted and give access to any party who requests it. This would cut down the administrative costs of adding a new researcher and would broaden access to data in line with Canada's Open Data Initiative.

Figure 2: Illustrations of the three paradigms. First is delegated computing; the data holder encrypts and sends their data to the cloud, who returns the encrypted results after performing homomorphic calculations. Second, multiple parties encrypt and send their share of a distributed dataset which the cloud can use to perform analytics without compromising the privacy of each data holder. Third, a central data holder can give analysts access to an encrypted dataset. The analysts can be subjected to less scrutiny and restrictions because they never have direct access to the data.

Description - Figure 2

An illustration of the three paradigms. In the "delegated computing" paradigm, the data holder sends their encrypted data to the cloud, who sends the encrypted results back. In the "multiple data holder" paradigm, multiple data holders can each send their encrypted data, allowing the cloud server to perform a joint computation on the union of their datasets, resulting in a stronger analytical result. In the "data bank" paradigm, the cloud holds the data and can send encryptions of it to any analyst they choose, without fear of the data being misused.

HE can help with more than numerical calculations. For example, Private Set Intersection (PSI) allows a client in possession of a sensitive dataset to learn its intersection with a server's dataset without the server learning the client's dataset and without the client learning anything about the server's data beyond the intersection. Private String Matching is a similar protocol that allows the client to query a textual database for a matching substring. Using these and other cryptographic primitives, you can envision a broad privacy-preserving suite linking data dispersed across different government departments and public institutions. While such a system is ambitious and the exact implementations are not yet clear, it gives a taste of the types of systems that you can aspire to as more complicated tasks are completed using HE and other PPTs.

Downsides of homomorphic encryption

While there are many benefits to the use of HE, as with any technology, there are potential downsides. The price of cryptographic security is the computational cost; depending on the analysis, encrypted computation can be several orders of magnitude more expensive than unencrypted. There is also a data expansion cost that can be quite significant. This data expansion cost is exacerbated by the fact that most HE protocols involve transferring encrypted data; while cloud storage is relatively inexpensive, data transfer can be costly and complicated.

There is also a restricted set of computations allowed natively by HE. Only addition, subtraction and multiplication are native to most arithmetic schemes, and all other computations (such as exponentials, activation functions, etc.) must be approximated by a polynomial. One should note that this is true in general with all computers, but while a modern computer hides this fact from the user, HE libraries currently require the user to specify how to compute these non-trivial functions.Footnote 3 In some schemes, one also has to be wary of the depth of computations attempted. Indeed, these schemes introduce noise into the encrypted data to protect it. This noise is compounded through successive computations and, unless reduced,Footnote 4 would eventually overtake the signal, at which point decryption will no longer return the expected output. One's choice of encryption parameters is important here. Given a circuit, there exists a parameter set large enough to accommodate it, but dealing with larger parameters increases the computational cost of the protocol.
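To make "approximated by a polynomial" concrete, here is a minimal plaintext sketch that evaluates a truncated Taylor series of the exponential using only additions and multiplications, which are the operations an arithmetic HE scheme offers natively. Nothing here is encrypted, and the function names `taylor_exp_coeffs` and `horner` are illustrative choices rather than part of any HE library.

```python
import math

def taylor_exp_coeffs(degree):
    """Public coefficients 1/k! of the Taylor polynomial of exp(x) around 0."""
    return [1.0 / math.factorial(k) for k in range(degree + 1)]

def horner(coeffs, x):
    """Evaluate sum(c_k * x**k) using only additions and multiplications."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result

coeffs = taylor_exp_coeffs(7)
for x in (-1.0, 0.0, 0.5, 1.0):
    approx = horner(coeffs, x)
    print(f"x={x:+.1f}  polynomial={approx:.6f}  exp={math.exp(x):.6f}")
```

Horner's rule performs one multiplication per degree, one after another, so a higher-degree (more accurate) approximation also means a deeper circuit. That is the accuracy-versus-depth trade-off described above.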

Can the extra costs in terms of computation and circuit creation be justified? Well, HE allows for computations that might not be possible otherwise. This is true with particularly sensitive datasets, such as health data. There is a huge cost inherent in obtaining permissions for an analyst to work on such data, as well as additional complications such as controlled computing environments. And once the data are shared, how do you verify that the analysts are following the rules? Some data holders may be reluctant to allow anyone access to their data at all; without some additional measures such as HE, this analysis might be impossible. The choice between "expensive computation" and "no computation" is much easier to make.

Moreover, the various schemes and their implementations are an active area of research and the library implementations regularly release improvements to their data compression and homomorphic computation algorithms. There has also been a significant amount of investment in hardware acceleration for HE recently. This is similar to the hardware that is installed on most computers, which contains specific electronic circuits designed to perform encryption and decryption operations as fast as possible. This could allow HE-accelerated cloud computers to perform analysis on encrypted data at speeds closer to that of unencrypted data.

In spite of the downsides, there are reasons to believe that HE will become an important tool for preserving privacy. That makes the present a fantastic time to begin to examine what can be done with these techniques.

The mathematics of homomorphic encryption

Now this article will delve into the inner mathematical workings of HE, including cryptographic details; hopefully even non-mathematical readers will be able to grasp the basics of how these schemes work. It should be noted that the rest of this section provides details pertaining to the scheme of Cheon, Kim, Kim and Song, which they named Homomorphic Encryption for Arithmetic of Approximate Numbers but the cryptographic community usually refers to as CKKS. That said, most of what is mentioned here applies to the other schemes with only slight modifications.

At the heart of every public key cryptosystem is a mathematical problem that is believed to be hard to solve unless you have access to a special piece of information called a secret (or private) key. A related public key can be used to encrypt plaintext data producing a ciphertext, but only knowledge of the secret key enables one to recover the original plaintext from this ciphertext. Since the public key cannot be used to decrypt, the public key can be shared with anyone wishing to encrypt data with confidence that only the secret key holder can decrypt the ciphertext to access the plaintext.

Most HE schemes use some variant of the Learning With Errors hardness assumption. This section describes the ring variant, called Ring-Learning With Errors (RLWE). Rather than integers, it deals with polynomials with integer coefficients. More precisely, you want the space of polynomials of degree less than $N$ with integer coefficients modulo $q$; this is denoted by $R_q = \mathbb{Z}_q[X]/(X^N - 1)$. You can think of this space simply as lists of $N$ integers, each less than $q$. Typically, you would take these values to be quite large; for example $N = 2^{14} = 16{,}384$ and $q \sim 2^{800}$. This makes $R_q$ large enough to hide secrets in! Figure 3 gives a toy example of the type of space we would work with.

Figure 3: A toy example of a ring of the type that might be used for HE, as well as a few of its elements. Note that the sum or product of these elements is another element in the ring.

Description - Figure 3

An example of a ring that may be of interest when working with homomorphic encryption.

$R_{17} = \mathbb{Z}_{17}[X]/(X^{16} - 1)$
$X^{15} + 11X^{14} + X^{12} + 5X^7 + 2X^6 + 4X^2 + X + 16$
$X^4 + 13X^3 + 5X^2 + X + 8$
$X^{10} + 16X^8 + X^6 + 16X^4 + X^2 + 16$

Here, the value of $q$ is 17 and the value of $N$ is 16. Also listed are some sample polynomials in the ring; one example is the polynomial $X^4 + 13X^3 + 5X^2 + X + 8$.

Given two polynomials, you can add them or multiply them. The result of these operations is always another polynomial.Footnote 5 This makes $R_q$ a kind of sandbox that you can move around freely within. Mathematicians call a set with this property a ring and the way that these operations affect the elements of the ring is what is meant by structure. The special property of homomorphic encryption is that there exist operations in the ciphertext space that correspond homomorphically to the operations on the underlying plaintext space. The use of polynomial rings is preferred because the operations are efficient and the RLWE problem is believed to be difficult.
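The ring arithmetic described above can be sketched in a few lines of Python. This is a toy illustration using the article's example ring (q = 17, N = 16, reduction modulo X^N - 1), with polynomials stored as coefficient lists, lowest degree first. The helper names are hypothetical, and the quadratic-time multiplication is chosen for readability rather than speed. Many RLWE-based libraries reduce modulo X^N + 1 instead, in which case the wrap-around below would also flip a sign.

```python
Q = 17  # coefficient modulus q of the toy ring
N = 16  # polynomial modulus degree N

def poly_add(f, g, q=Q):
    """Add two ring elements coefficient by coefficient, modulo q."""
    return [(x + y) % q for x, y in zip(f, g)]

def poly_mul(f, g, q=Q, n=N):
    """Multiply modulo X^n - 1: exponents wrap around because X^n = 1."""
    out = [0] * n
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            k = (i + j) % n
            out[k] = (out[k] + fi * gj) % q
    return out

def from_terms(terms, n=N):
    """Build a coefficient list (lowest degree first) from {exponent: coefficient}."""
    coeffs = [0] * n
    for exponent, c in terms.items():
        coeffs[exponent] = c
    return coeffs

# Two of the sample elements listed in the description of Figure 3.
f = from_terms({4: 1, 3: 13, 2: 5, 1: 1, 0: 8})
g = from_terms({10: 1, 8: 16, 6: 1, 4: 16, 2: 1, 0: 16})

print("f + g =", poly_add(f, g))
print("f * g =", poly_mul(f, g))
```

Both results are again lists of 16 integers between 0 and 16, which is the "sandbox" property: sums and products never leave the ring.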

How does one hide a secret in a mathematical space? Suppose you have four random polynomialsFootnote 6 in $R_q$, called $a$, $s$, $e$ and $b$. The RLWE hardness assumption states that it is very hard to distinguish a series of pairs that are either of the form $(a, as + e)$ or of the form $(a, b)$. Here, "very hard to distinguish" means "parameters can be set such that all the best computers in the world working together using the best known algorithms would still not be able to solve the problem." The polynomials $a$ and $b$ can be sampled uniformly at random from all of $R_q$, but the others have a special form. In CKKS, we take $s$ to have coefficients of $\pm 1$ or $0$, and sample the coefficients of $e$ from a discrete Gaussian distribution over $\mathbb{Z}_q$ centred around 0. For the rest of this post, we will just refer to these polynomials as "small", because in both cases their coefficients are close to 0.

The hardness of the RLWE problem allows you to keep a secret in the following way: notice that the first pair is correlated; there is a factor of $a$ in both polynomials, while in the second there is no correlation between the randomly selected $a$ and $b$. Now imagine someone handed you many pairs that are either all of the form $(a, as + e)$ for many different values of $e$ and a constant $s$, or all just completely random pairs. According to the hardness of RLWE, not only could you not reliably find $s$ when given the $(a, as + e)$ pairs, you couldn't even reliably determine which type of pairs you were given! Figure 4 gives a toy example of this problem for you to try at home.
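The two kinds of pairs can be generated with a short sketch, again in the toy ring with q = 17 and N = 16. The function names are illustrative, and the rounded Gaussian used for the error is an assumed stand-in for a proper discrete Gaussian sampler. Parameters this small offer no security at all; the point is only to show the shape of the data.

```python
import random

Q, N = 17, 16  # toy parameters matching Figure 4

def poly_mul(f, g):
    """Multiply in Z_Q[X]/(X^N - 1); exponents wrap around modulo N."""
    out = [0] * N
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            k = (i + j) % N
            out[k] = (out[k] + fi * gj) % Q
    return out

def uniform_poly(rng):
    return [rng.randrange(Q) for _ in range(N)]

def ternary_poly(rng):
    """'Small' secret: coefficients drawn from {-1, 0, 1}."""
    return [rng.choice((-1, 0, 1)) % Q for _ in range(N)]

def error_poly(rng, sigma=1.0):
    """'Small' error: rounded Gaussian (assumption, not a true discrete Gaussian)."""
    return [round(rng.gauss(0.0, sigma)) % Q for _ in range(N)]

def rlwe_sample(s, rng):
    """Return a pair (a, a*s + e) for a fresh uniform a and fresh small e."""
    a = uniform_poly(rng)
    e = error_poly(rng)
    b = [(x + y) % Q for x, y in zip(poly_mul(a, s), e)]
    return a, b

def random_pair(rng):
    """Return a pair (a, b) with no hidden relationship."""
    return uniform_poly(rng), uniform_poly(rng)

def centered(c):
    """Map a residue in [0, Q) to the representative closest to 0."""
    return c - Q if c > Q // 2 else c

rng = random.Random(2024)
s = ternary_poly(rng)
a, b = rlwe_sample(s, rng)
# Only someone who knows s can subtract a*s and see that what remains is small.
residual = [centered((bi - ci) % Q) for bi, ci in zip(b, poly_mul(a, s))]
print("b - a*s =", residual)
```

Without `s`, the second component of an RLWE pair is meant to look just like the output of `random_pair`; with `s`, the leftover after subtracting `a*s` is visibly close to zero.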

Figure 4: Four pairs of polynomials in $R_{17} = \mathbb{Z}_{17}[X]/(X^{16} - 1)$ broken into two groups. One group is of the form $(a, as + e)$ for some fixed "small" $s$ and two different random "small" $e$, and the other group is of the form $(a, b)$. Can you tell which is which? What if 17 is changed to $2^{800}$ and 16 to 16,384? Now imagine trying to figure out what $s$ is. Note that in the RLWE assumption, you would be given just one of these groups, not both.

Description - Figure 4

Four pairs of polynomials. This is a toy example of the RLWE problem for you to try at home. The polynomial pairs are separated into two groups. One group is distributed as $(a, as + e)$ for a fixed "small" polynomial $s$, and the other is of the form $(a, b)$ for random $a$ and $b$. Can you tell which is which? The point of this figure is to illustrate just how hard the RLWE problem is. The polynomials in the figure are repeated below:

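$(x^4 + 4x^3 + 10x + 1,\; x^8 + 6x^7 + x^6 + 8x^5 + 12x^4 + 4x^3 + 10x^2 + 8x + 14)$
$(x^4 + 12x^3 + 2x^2 + 5x + 11,\; x^8 + 14x^7 + 14x^6 + 12x^5 + 9x^4 + 13x^3 + 8x^2 + 6x + 7)$
$(x^4 + 5x^3 + 3x^2 + 8,\; x^8 + 4x^7 + 12x^6 + 16x^5 + 15x^4 + 3x^3 + 6x^2 + 9x + 8)$
$(x^4 + 9x^3 + 7x^2 + 14x + 1,\; x^8 + 413x^7 + 9x^6 + 14x^5 + 2x^4 + 8x^3 + x^2 + 13x + 12)$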

The security of schemes based on RLWE follows from the fact that given $a$, $s$ and $e$ it is easy to compute $as + e$, but it is practically impossible to find $s$ given $a$ and $as + e$. You can construct a public key encryption system as follows:

  • Fix your space $R_q$ by picking a coefficient modulus $q$ and a polynomial modulus degree $N$.
  • Pick a random "small" secret key $s$, a uniformly random $a$, and a random "small" $e$ to construct your public key $(a, -as + e)$. Note the negative in this pair; this makes the encryption process more straightforward but does not affect the security of RLWE.
  • Share your public key with the world and no one will be able to find your secret key! Hence, anyone in possession of this public key can encrypt the data and send them to some party to perform computations on it, homomorphically. In the end, the results also can only be decrypted and viewed using the secret key.

To encrypt the data, the data must first be encoded as a vector $v$ of real numbers. This is straightforward when you are working with numerical data and is a standard practice when working with textual or other types of data. To encrypt, the data vector $v$ is first encoded as a polynomialFootnote 7 $m$ in $R_q$ and then combined with the public key to get a ciphertext, which will be denoted by $[v]$. Now, send this off to the computing party who will perform homomorphic additions and multiplications to implement the calculation that is of interest. Figure 5 outlines a simple circuit computing a polynomial function. Once the computations are completed and output ciphertexts are returned, you can use your secret key to decrypt and view the results.
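The key generation, encryption and decryption steps above can be tied together in a toy RLWE-style sketch. This is not CKKS as implemented in any library. It skips the real encoding step and simply places small integers, scaled by an assumed factor `DELTA`, into the polynomial coefficients, and it supports only homomorphic addition. The parameters `Q`, `N` and `DELTA`, the helper names, and the rounded Gaussian error are all illustrative assumptions, and the parameters are far too small to be secure.

```python
import random

Q = 2**24      # toy coefficient modulus (assumption; insecure)
N = 16         # polynomial modulus degree
DELTA = 2**12  # scale that keeps the message above the noise

def add(f, g):
    return [(x + y) % Q for x, y in zip(f, g)]

def mul(f, g):
    """Multiply in Z_Q[X]/(X^N - 1)."""
    out = [0] * N
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            k = (i + j) % N
            out[k] = (out[k] + fi * gj) % Q
    return out

def small(rng):
    return [round(rng.gauss(0.0, 1.0)) % Q for _ in range(N)]

def ternary(rng):
    return [rng.choice((-1, 0, 1)) % Q for _ in range(N)]

def uniform(rng):
    return [rng.randrange(Q) for _ in range(N)]

def keygen(rng):
    s = ternary(rng)
    a = uniform(rng)
    b = add([-c % Q for c in mul(a, s)], small(rng))  # b = -a*s + e
    return s, (a, b)

def encrypt(pk, values, rng):
    a, b = pk
    m = [(DELTA * v) % Q for v in values] + [0] * (N - len(values))
    u = ternary(rng)                         # fresh randomness per ciphertext
    c0 = add(add(mul(b, u), small(rng)), m)  # b*u + e1 + m
    c1 = add(mul(a, u), small(rng))          # a*u + e2
    return c0, c1

def decrypt(s, ct):
    c0, c1 = ct
    noisy = add(c0, mul(c1, s))              # equals m + small noise
    centered = [c - Q if c > Q // 2 else c for c in noisy]
    return [round(c / DELTA) for c in centered]

def he_add(ct1, ct2):
    """Homomorphic addition: add the ciphertexts component by component."""
    return add(ct1[0], ct2[0]), add(ct1[1], ct2[1])

rng = random.Random(0)
s, pk = keygen(rng)
ct_x = encrypt(pk, [3, 1, 4, 1], rng)
ct_y = encrypt(pk, [2, 7, 1, 8], rng)
print(decrypt(s, he_add(ct_x, ct_y))[:4])    # prints [5, 8, 5, 9]
```

The party calling `he_add` never touches `s`. Decryption works because `c0 + c1*s` collapses to the message plus a small noise term, which the final rounding removes as long as the noise stays well below `DELTA`.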

Figure 5: A visualization of a homomorphic circuit. A vector of values can be encrypted into a single ciphertext and computed on at once. Pictured is just one realization of a circuit to compute the polynomial f(x). Values with padlocks are encrypted and are thus unreadable to the computing party.

Description - Figure 5

A homomorphic circuit that evaluates the function $f(x) = x^3 + 4x^2 + 2x + 1$ on a vector of values. Padlocks represent values that are encrypted and are thus unreadable to the computing party. Arrows and operations represent how one could actually encode the circuit in a homomorphic encryption library.
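One way to picture such a circuit in code is with a plaintext stand-in for a ciphertext. The `MockCiphertext` class below is hypothetical and performs no encryption; it only mimics the slot-wise additions and multiplications an HE library would expose, and counts how many ciphertext-by-ciphertext multiplications are chained. The Horner-style arrangement is just one possible realization of the circuit and is not necessarily the one drawn in Figure 5.

```python
class MockCiphertext:
    """Plaintext stand-in for an encrypted vector; tracks multiplicative depth."""

    def __init__(self, values, depth=0):
        self.values = list(values)
        self.depth = depth

    def __add__(self, other):
        if isinstance(other, MockCiphertext):
            summed = [a + b for a, b in zip(self.values, other.values)]
            return MockCiphertext(summed, max(self.depth, other.depth))
        # Adding a public constant to every slot.
        return MockCiphertext([a + other for a in self.values], self.depth)

    def __mul__(self, other):
        # Slot-wise product of two "ciphertexts"; each one adds a level of depth.
        product = [a * b for a, b in zip(self.values, other.values)]
        return MockCiphertext(product, max(self.depth, other.depth) + 1)

def f_circuit(x):
    """f(x) = x^3 + 4x^2 + 2x + 1, arranged as ((x + 4) * x + 2) * x + 1."""
    return ((x + 4) * x + 2) * x + 1

x = MockCiphertext([0, 1, 2, -1])
y = f_circuit(x)
print(y.values, "depth:", y.depth)  # prints [1, 8, 29, 2] depth: 2
```

All four inputs are processed at once, mirroring how a vector of values is packed into a single ciphertext, and the depth counter shows why circuit layout matters when choosing encryption parameters.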

While this article did not explore all of the details of how these operations are implemented mathematically, the description of HE given so far provides the background needed to further learn about HE.

How to get started with homomorphic encryption

To get started with HE, take a look at some of the available open-source HE libraries; you can try Microsoft SEAL, PALISADE Homomorphic Encryption Software Library, TFHE: Fast Fully Homomorphic Encryption over the Torus, or even Concrete: Open-source Homomorphic Encryption Library if you are a Rustacean (that is, someone who uses Rust). These different libraries implement multiple HE schemes between them and you can pick the one that's best for your use case. We reiterate that, until the standardization process has finished, the Government of Canada does not recommend using HE with any sort of sensitive data.

While all of the different HE schemes will implement most use cases, some schemes will perform better on some problems. The CKKS scheme is designed to work on real numbers; if you are interested in statistics or machine learning, you should probably start here! Brakerski/Fan-Vercauteren and Brakerski-Gentry-Vaikuntanathan are great for integer arithmetic and implementing the computer science primitives such as private set intersection or string matching. TFHE implements logical gates natively and refreshes the ciphertext noise with every operation, allowing improved efficiency with longer circuit depths. Readers who are interested are encouraged to try some simple circuits using each scheme and compare the results and performance!

If you would like more information on the cyber security aspects of homomorphic encryption, including standardization activities, contact the Canadian Centre for Cyber Security at contact@cyber.gc.ca, (613) 949-7048 or 1-833-CYBER-88.

Conclusion

This article took an in-depth look at homomorphic encryption, from its applications to the RLWE problem. Next, this series on privacy preserving techniques will look at some proofs-of-concept that have been completed by applying HE at Statistics Canada! It will also cover some of the more advanced aspects of the CKKS interface, including rotations, choice of parameters, packing, bootstrapping, scale and levels.

Want to keep in the loop about these emerging technologies, or want to share your work in the field of privacy? Check out the Privacy Preserving Technologies Community of Practice page (Government of Canada employees only) to discuss this series of privacy articles, connect with peers interested in privacy and share resources and ideas with the community. You can also give feedback on this topic or leave suggestions for future articles in this series.

Note: We wish to acknowledge the input provided on this article by the Canadian Centre for Cyber Security and the Tutte Institute for Mathematics and Computing, both part of Communications Security Establishment.

Federal government expenditures on COVID-19 response measures - Q2 - 2021

On March 11, 2020, the World Health Organization declared the COVID-19 pandemic. To address the consequences of the pandemic on the Canadian economy, the federal government of Canada announced and implemented various support and recovery measures for businesses, households, students, the vulnerable population and organizations helping individuals. The table Federal government expenditures on COVID-19 response measures presents the major federal measures announced and implemented, their treatment in the national accounts (in particular, in the Income and Expenditure Accounts), the table numbers where the pertinent series may be found and the amount of expenditure on a quarterly basis.

For comprehensive explanations of the treatment of COVID-19 government support measures in the national accounts, please refer to the documents Recording COVID-19 measures in the national account and Recording new COVID measures in the national accounts.

Treatment in national accounts: Subsidies on production, by quarter at quarterly rates ($ millions)

| COVID-19 measure | 2020 Q1 | 2020 Q2 | 2020 Q3 | 2020 Q4 | 2021 Q1 | 2021 Q2 |
| --- | --- | --- | --- | --- | --- | --- |
| Canada Emergency Wage Subsidy (CEWS) - business | 4,359 | 29,351 | 22,711 | 10,703 | 10,033 | 7,633 |
| Temporary Wage Subsidy (TWS) - business | 169 | 739 | | | | |
| Canada Emergency Rent Subsidy (CERS) - business | | | 52 | 1,558 | 1,714 | 1,096 |
| Lockdown Support (LS) - business | | | 5 | 209 | 341 | 237 |

Source: Statistics Canada, tables 36-10-0103, 36-10-0118, 36-10-0477.

Treatment in national accounts: Current transfers to non-profit institutions serving households (NPISH), by quarter at quarterly rates ($ millions)

| COVID-19 measure | 2020 Q1 | 2020 Q2 | 2020 Q3 | 2020 Q4 | 2021 Q1 | 2021 Q2 |
| --- | --- | --- | --- | --- | --- | --- |
| Canada Emergency Wage Subsidy (CEWS) - NPISH | 200 | 1,095 | 1,051 | 573 | 549 | 325 |
| Temporary Wage Subsidy (TWS) - NPISH | 13 | 46 | | | | |
| Canada Emergency Rent Subsidy (CERS) - NPISH | | | 1 | 36 | 38 | 22 |
| Lockdown Support (LS) - NPISH | | | 0 | 4 | 7 | 4 |

Source: Statistics Canada, tables 36-10-0118, 36-10-0477, 36-10-0115.

Treatment in national accounts: Subsidies on products and imports, by quarter at quarterly rates ($ millions)

| COVID-19 measure | 2020 Q1 | 2020 Q2 | 2020 Q3 | 2020 Q4 | 2021 Q1 |
| --- | --- | --- | --- | --- | --- |
| Canada Emergency Commercial Rent Assistance (CECRA) | | 1,130 | 904 | | |
| • Federal contribution | | 849 | 679 | | |
| • Provincial contribution | | 281 | 225 | | |

Source: Statistics Canada, tables 36-10-0103, 36-10-0118, 36-10-0477.

Treatment in national accounts: Current transfers to households - Employment Insurance benefits, by quarter at quarterly rates ($ millions)

| COVID-19 measure | 2020 Q1 | 2020 Q2 | 2020 Q3 | 2020 Q4 | 2021 Q1 | 2021 Q2 |
| --- | --- | --- | --- | --- | --- | --- |
| Canada Emergency Response Benefit (CERB) - EI stream | | 19,127 | 9,239 | 864 | | |

Source: Statistics Canada, tables 36-10-0118, 36-10-0477, 36-10-0112.

Treatment in national accounts: Transfers to households - Other federal transfers to households, by quarter at quarterly rates ($ millions)

| COVID-19 measure | 2020 Q1 | 2020 Q2 | 2020 Q3 | 2020 Q4 | 2021 Q1 | 2021 Q2 |
| --- | --- | --- | --- | --- | --- | --- |
| Canada Emergency Response Benefit (CERB) - CRA stream | | 29,002 | 15,597 | 704 | | |
| Canada Emergency Student Benefit (CESB) | | 1,386 | 1,550 | 8 | | |
| Canada Recovery Benefit (CRB) | | | | 6,073 | 7,280 | 6,516 |
| Canada Recovery Caregiving Benefit (CRCB) | | | | 900 | 960 | 933 |
| Canada Recovery Sickness Benefit (CRSB) | | | | 246 | 144 | 188 |

Source: Statistics Canada, tables 36-10-0118, 36-10-0477, 36-10-0112.

Life Expectancy and Deaths Statistics

What are provisional deaths in Canada?

Provisional deaths are not based on all deaths that are observed during a specific reference period because of reporting delays. Provisional death counts are based on what is reported to Statistics Canada by provincial and territorial vital statistics registries.

Provisional death estimates have been adjusted to account for incomplete data. As a result, the provisional death counts and estimates released may not match figures from other sources, such as media reports, or counts and estimates from provincial and territorial health authorities and other agencies.

Visualizing mortality in Canada

Explore the cause of death trends in Canada since 2000 with these interactive dashboards. Metrics visualized on the dashboards are: number of deaths, death rate per 100,000 people, and the proportion of deaths represented by each selected cause of death.

Rates and counts by age group for select causes of death

Cause of death trends in Canada broken down by sex and by several age groups from 0 to 90 years of age.

Rates and counts by sex and province or territory for select causes of death

Cause of death trends in Canada broken down by province or territory and by sex.

Focus on COVID-19

Learn more about provisional deaths and excess mortality:

Our partners

Data sources

Frequently asked questions

Death certification and classification

COVID-19 comorbidities in Canada

The Daily articles

COVID-19 insights

What’s trending in health? Visit Statistics Canada’s official release bulletin

Explore the mortality dashboard

The Provisional Deaths in Canada Dashboard allows users to examine recent mortality trends, by comparing the number of deaths being observed with previous years. Comparing provisional death counts and death estimates over time can be useful for understanding trends in mortality. As Canada's population grows and ages, the number of deaths is expected to increase from year to year. The Canadian Vital Statistics Death (CVS-D) database is the authoritative source for cause of death data in Canada. The CVS-D is an administrative survey that collects demographic and medical information from all provincial and territorial vital statistics registries on all deaths in Canada.

Modelling SARS-CoV-2 Dynamics to Forecast PPE Demand

By: Jihoon Choi, Deirdre Hennessy and Joel Barnes, Statistics Canada

Personal protective equipment (PPE) has become an important part of the lives of all Canadians as the pandemic changed the way we interact with one another and protect ourselves. The rapid rise of the novel coronavirus, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the virus that causes COVID-19, has put unprecedented demands on the Government of Canada to provide timely, accurate and relevant information to inform decision-making around a host of public health issues, including PPE procurement and deployment of PPE to the provinces and territories.

The global pandemic caused by SARS-CoV-2 poses a serious public health concern for Canadians.Footnote 1 As of October 2021, over 1.71 million diagnosed cases have been reported in Canada, making it essential that Canadians have access to PPE when they need it.

PPE refers to commodities such as masks, gloves and gowns that are worn to provide protection against potential exposure to infectious pathogens. The pandemic has brought severe stress to the supply chains for PPE in Canada, causing a significant disruption in supply among sectors where PPE stocks are essential (e.g., hospitals, long-term care facilities).Footnote 2 For this reason, forecasts on the pandemic trajectory and its effect on the supply, demand and inventory of PPE have become a crucial element in decision-making.Footnote 3, Footnote 4

Epidemiological models can contribute valuable insights to public health decision-making by generating a number of 'what-if' scenarios under different assumptions. Furthermore, they can help estimate how different public health intervention measures affect the outcome of the epidemic (e.g., deciding the critical timing for introducing lockdowns or reopening in each province).Footnote 5 Many variations of epidemiological models exist, and many of these are compartmental models, in which the population is divided into multiple compartments and individuals move from one compartment to another at a defined rate.Footnote 6

The Susceptible-Infectious-Recovered (SIR) model is one of the most basic forms of a compartmental model (Figure 1). This model consists of three compartments, where S is the number of susceptible individuals, I is the number of infected individuals and R is the number of recovered (and immune) individuals.

Figure 1 – Structure of a basic epidemiological model

Figure 1 shows the base structure of the SIR model. The initial population starts in the susceptible compartment and flows into the infectious compartment at an infection rate β, then moves into the recovered compartment at a recovery rate defined by λ.
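The flows in Figure 1 can be sketched as a small discrete-time simulation. This is a minimal illustration, not the model used in the project: the function name `simulate_sir`, the daily forward-Euler step and the parameter values are assumptions chosen for the example. It keeps the article's symbols, with β as the infection rate and λ as the recovery rate.

```python
def simulate_sir(s0, i0, r0, beta, lam, days):
    """Advance a basic SIR model one day at a time (forward Euler).

    beta: infection rate (S -> I); lam: recovery rate (I -> R).
    Returns a list of (S, I, R) tuples, one per day, including day 0.
    """
    n = s0 + i0 + r0
    s, i, r = float(s0), float(i0), float(r0)
    history = [(s, i, r)]
    for _ in range(days):
        new_infections = beta * s * i / n  # flow out of S into I
        new_recoveries = lam * i           # flow out of I into R
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history


if __name__ == "__main__":
    # Illustrative values only, not calibrated to any province.
    run = simulate_sir(s0=9990, i0=10, r0=0, beta=0.3, lam=0.1, days=120)
    peak_day = max(range(len(run)), key=lambda d: run[d][1])
    print(f"Infections peak on day {peak_day}")
```

Because every individual who leaves one compartment enters another, the total S + I + R stays constant over the run, which is a quick sanity check for any compartmental implementation.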

The origin of compartmental models in epidemiology dates back to the early 20th century. Specifically, the foundation was built on the theoretical work of Ronald Ross, William Hamer, Anderson McKendrick and William Kermack, along with important influences from a statistical perspective by John Brownlee.Footnote 7 Since their development, compartmental models have proven useful in modelling numerous communicable diseases, such as malaria and plague.Footnote 8, Footnote 9

As the SARS-CoV-2 outbreak became a serious public health concern for Canadians, Health Canada commissioned the Data Science Division (DScD) and the Health Analysis Division (HAD) at Statistics Canada to create an epidemiological model that could forecast the trajectories of the outbreak in Canadian provinces. The forecasted cases and hospitalizations produced from the epidemiological model are used in the PPE Project to estimate the PPE demand in various sectors across the provinces. The PPE Project aims to inform decisions related to procurement, allocation and domestic production investment in PPE through evidence-based reports on the current status and projections of PPE supply and demand, in diverse epidemiological scenarios.

Creating the initial model for PPE demand: Susceptible – Infected – Recovered – Death (SIRD) model

The initial SIRD model first used Bayesian methods to estimate the number of active infections in Canadian communities based on SARS-CoV-2 mortalities. The total number of SARS-CoV-2 infections (diagnosed and undiagnosed) was reverse-estimated from SARS-CoV-2 fatalities by province and territory, using a method similar to that used by Flaxman et al.Footnote 10 The estimated numbers of infections, deaths and recoveries were fed into a simple compartmental model composed of four compartments. The first three compartments are equivalent to the base SIR model (Susceptible, Infected and Recovered), but this model has an additional compartment, D, which represents the deceased population (Figure 2).

Figure 2 – Structure of a SIRD epidemiological model

Figure 2 shows the base structure of SIRD (Susceptible – Infected – Recovered – Death) model. The initial population starts in the susceptible compartment and flows into the infectious compartment at an infection rate β, then moves into the recovered compartment at a recovery rate defined by λ or into the deceased compartment at a mortality rate defined by γ.
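As a rough sketch of the structure in Figure 2, a single daily update of a SIRD model might look like the following. The function name `sird_step`, the one-day Euler step and the choice to divide by the living population are assumptions for illustration; the symbols follow the article (β infection rate, λ recovery rate, γ mortality rate).

```python
def sird_step(state, beta, lam, gamma):
    """Return the (S, I, R, D) state one day later.

    Infected individuals leave I either by recovering (rate lam)
    or by dying (rate gamma); D is a terminal compartment.
    """
    s, i, r, d = state
    living = s + i + r
    new_infections = beta * s * i / living if living > 0 else 0.0
    new_recoveries = lam * i
    new_deaths = gamma * i
    return (
        s - new_infections,
        i + new_infections - new_recoveries - new_deaths,
        r + new_recoveries,
        d + new_deaths,
    )


if __name__ == "__main__":
    state = (9990.0, 10.0, 0.0, 0.0)  # illustrative starting values
    for _ in range(30):
        state = sird_step(state, beta=0.3, lam=0.09, gamma=0.01)
    print("Cumulative deaths after 30 days:", round(state[3], 1))
```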

This model also produced a dynamic historical Reproduction Number, R(t). The R(t) is an important concept in infectious disease epidemiology, providing information about the transmission potential of an infectious agent. In other words, it shows how contagious an infectious disease is at time t in the study population. Generally, if R(t) is greater than 1, the disease will start to propagate in the population, whereas if R(t) is less than 1, the number of new cases will decrease.

R(t) is often estimated from observing the number of new infections across a time period. However, the number of SARS-CoV-2 cases was not traced accurately in the beginning of the pandemic, due to a limitation in resources such as insufficient availability of testing kits.Footnote 11 As a workaround, the SIRD model estimated the historical R(t) from the number of SARS-CoV-2 fatalities, which was a much more reliable measure than actual case counts during the initial periods of the outbreak. An infection fatality rate (IFR) for SARS-CoV-2 from the research literature was used to backwards-compute the historical R(t).
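The idea of working backwards from fatalities can be illustrated with a deliberately naive calculation. This is not the Bayesian method used in the project; it simply divides daily deaths by an assumed IFR and shifts them back by an assumed infection-to-death lag. The function name and both parameter values are hypothetical.

```python
def estimate_infections_from_deaths(daily_deaths, ifr, lag_days):
    """Naive back-calculation of daily infections from daily deaths.

    Entry t of the result estimates infections on day t as
    deaths on day t + lag_days divided by the IFR. The last
    lag_days days cannot be estimated and are omitted.
    """
    if not 0 < ifr <= 1:
        raise ValueError("ifr must be in (0, 1]")
    if lag_days < 0:
        raise ValueError("lag_days must be non-negative")
    return [deaths / ifr for deaths in daily_deaths[lag_days:]]


if __name__ == "__main__":
    # Hypothetical numbers: a 1% IFR and a 2-day lag, purely for illustration.
    print(estimate_infections_from_deaths([0, 0, 1, 2, 4], ifr=0.01, lag_days=2))
```

A real estimate would spread each death over a distribution of infection-to-death delays rather than a single fixed lag, which is one reason a Bayesian approach is preferred.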

To forecast the future R(t), the team generated different pandemic scenarios each with varying assumptions about public health intervention measures in effect:

  • The SARS-CoV-2 Containment scenario: attempts to model a situation where strict public health intervention measures are in place (e.g., lockdowns). Under this scenario, R(t) is always kept under 1.
  • The Resurgence Best Estimate scenario: allows the epidemic to resurge in tandem with the reopening of the economy and allows R(t) to stay high.
  • The Peaks and Valleys scenario: allows the epidemic to resurge in tandem with the reopening of the economy until hospital intensive care unit (ICU) occupancy reaches 30% of the provincial maximum. Then an intervention plan is triggered to bring R(t) back down to lockdown level.
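The trigger in the Peaks and Valleys scenario can be sketched as a simple rule. This is a stateless simplification: the function name, the two R(t) values and the choice to re-evaluate the rule at every step are assumptions, and the article does not say when the intervention is lifted again. Only the 30% ICU occupancy threshold comes from the scenario description.

```python
def peaks_and_valleys_r(icu_occupied, icu_capacity, r_reopening, r_lockdown,
                        trigger_fraction=0.30):
    """Pick R(t) for one time step of a Peaks and Valleys style scenario.

    Returns the lockdown-level R(t) once ICU occupancy reaches the
    trigger fraction of capacity, otherwise the reopening-level R(t).
    """
    if icu_capacity <= 0:
        raise ValueError("icu_capacity must be positive")
    occupancy = icu_occupied / icu_capacity
    return r_lockdown if occupancy >= trigger_fraction else r_reopening


if __name__ == "__main__":
    # Hypothetical R(t) levels for illustration.
    for beds in (10, 29, 30, 45):
        print(beds, peaks_and_valleys_r(beds, 100, r_reopening=1.3, r_lockdown=0.8))
```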

The SIRD model was used as the main epidemiological model for the PPE project until the beginning of 2021. The model showed reasonable accuracy in forecasting the pandemic during the initial phase of the outbreak. However, it had a number of limitations. In particular, it did not take the age structure of the population into account. These limitations led to the creation of another version of the epidemiological model with additional compartments that can take more complex characteristics of the pandemic into consideration.

The current model: Susceptible – Exposed – Infected – Recovered – Deceased – Vaccinated (SEIRDV) model

Early in the pandemic, DScD and HAD at Statistics Canada worked with the Public Health Agency of Canada (PHAC) to develop an age-structured, multi-compartmental SIR model. This collaboration yielded the SEIRDV model, which was adapted by the Statistics Canada PPE epidemiological team, in collaboration with Health Canada, for use in the main PPE demand and supply model. This model has been used as the main epidemiological model in the PPE project since January 2021 (Figure 3).

Figure 3 – Simplified structure of a SEIRDV epidemiological model

Figure 3 shows a simplified structure of the SEIRDV (Susceptible – Exposed – Infected – Recovered – Death – Vaccinated) model. The population starts in the susceptible compartment and then can flow into exposed and infectious compartments upon contracting the disease. Some of these infections are detected from contact-tracing efforts or SARS-CoV-2 testing. Individuals whose infections have been detected are sent to the quarantine path and will have a reduced likelihood of spreading the disease to others. Upon infection, individuals with severe symptoms will seek medical attention. The severely symptomatic population can end in two terminal states: deceased or recovered. People who are only mildly symptomatic or asymptomatic will flow into the recovered compartment over time. In addition, the population can be vaccinated in this model. If an individual is vaccinated, their chances of flowing into the infection compartments are reduced by the protection rate of the vaccine. Similarly, the vaccinated population has a reduced probability of developing severe cases, and therefore, of flowing into the health care system (i.e. Hospital/ICU).

The four major modifications made by introducing the SEIRDV model are:

1. The model allows the study population to be age stratified

In the SEIRDV model, the population is divided into six distinct age groups (0-9 years, 10-19 years, 20-39 years, 40-59 years, 60-74 years, 75+ years), which allows different parameters to be set for each age group and to take age-related differences into account.

For instance, reports show that younger age groups have a reduced likelihood of hospitalization and mortality compared to older age groups.Footnote 12 Since the SEIRDV model allows users to set different flow rates for each age group, it is capable of modelling this effect.

Similarly, certain age groups are known to interact at a higher frequency than others (e.g., parents with their children) and therefore have increased chances of transmitting the disease to each other. In the SEIRDV model, this effect can be taken into account by using an interaction matrix that models the average contact rate between two age groups.

2. Estimation of the transmission rate (β) has been improved

Instead of relying on a single measure, such as R(t), to estimate the transmission rate, the model now uses three different parameters to calculate the rate of transmission.

First is β, which in this model represents the "probability of transmission upon contact". This number is estimated from literature and calibrated in accordance with the dominant strain of SARS-CoV-2 in each province. This measure is multiplied by a contact matrix, which is a numeric matrix that illustrates the average number of contacts that people in each age group make with another age group. Lastly, a contact multiplier is applied to take variances in contact rates into account. When different public health intervention measures are in effect (e.g., lockdowns), the rate of contact among the population will change accordingly. These variances are captured by calibrating the contact-multiplier to the reported number of daily active cases in each province every week.
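One common way to combine these three parameters is sketched below. The exact formula used in the SEIRDV model is not given in this article, so the function name, the per-group prevalence term and all numeric values are assumptions; the sketch uses two age groups instead of six to stay short.

```python
def force_of_infection(beta, contact_matrix, contact_multiplier,
                       infectious, population):
    """Per-age-group daily infection hazard for susceptible individuals.

    beta: probability of transmission upon contact.
    contact_matrix[a][b]: average daily contacts a person in group a
        makes with people in group b.
    contact_multiplier: scales all contacts (e.g., below 1 under lockdown).
    infectious[b] / population[b]: share of group b that is infectious.
    """
    prevalence = [i / n for i, n in zip(infectious, population)]
    return [
        beta * contact_multiplier * sum(c * p for c, p in zip(row, prevalence))
        for row in contact_matrix
    ]


if __name__ == "__main__":
    # Hypothetical two-group example: younger and older.
    contacts = [[10.0, 2.0],
                [2.0, 4.0]]
    hazards = force_of_infection(
        beta=0.05, contact_matrix=contacts, contact_multiplier=0.5,
        infectious=[10, 0], population=[1000, 1000],
    )
    print(hazards)
```

Keeping the contact multiplier separate from β means a change in public health measures can be reflected by recalibrating one scalar, without touching the literature-based transmission probability or the contact matrix.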

3. The effect of vaccination is taken into consideration

Vaccination has two main effects: it reduces the stress on the health care system (by providing protection against developing a severe case requiring hospitalization) and it reduces transmission of the disease within the community (by providing protection against infection, ultimately promoting herd immunity). The current design of the SEIRDV model takes this into account by introducing a distinct vaccination pathway. The vaccinated population will flow into this pathway, where they will have reduced chances of contracting the disease as well as a reduced likelihood of developing severe symptoms requiring hospitalization.

The model also takes into account the two-dose vaccination plan set out by the National Advisory Committee on Immunization. The vaccination data were retrieved from PHAC and COVID-19 Canada Open Data Working Group (CCODWG) to estimate the number of doses that can be given out each day per province. In addition, the different rates of protection given by the two-stage vaccination plan were modelled by dividing our vaccination path into four distinctive compartments. This process is summarized in Figure 4.

Figure 4 – Design of the vaccination compartment

Description - Figure 4

Demonstrates how a population is divided into age groups, with vaccines distributed from oldest to youngest with accommodation for some high risk groups of all ages. The groups flow through first and second doses on their way to full vaccination.

The study population is divided into six distinct age groups (0-9 years, 10-19 years, 20-39 years, 40-59 years, 60-74 years, 75+ years) and vaccines are distributed in the order of older to younger age groups, while distributing a small number of doses to an age group that represents the health care professionals in the early phase. Upon receiving the first dose, the freshly vaccinated population flows into the first vaccination compartment which represents the population who have received their vaccine but have not had the chance to develop any immunity yet. Then this population flows into the second vaccination compartment after a set period, at which point they develop a partial protection against SARS-CoV-2. The population stays in this compartment until phase 1 (i.e. giving out first dose) completes. Once phase 2 of the vaccination plan starts, the population flows into the third vaccination compartment where they receive their second dose, then flows into the last vaccination compartment where they develop the maximum immunity that they can gain from the vaccination.
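The four vaccination compartments can be pictured as an ordered pathway with a protection rate attached to each stage. In the sketch below, the compartment labels and every protection value are hypothetical placeholders (the article does not report the rates used), and the rule of scaling the infection probability by one minus the protection rate is an assumed reading of "reduced by the protection rate of the vaccine".

```python
# Ordered vaccination pathway; protection values are hypothetical placeholders.
VACCINATION_PATHWAY = [
    ("V1_first_dose_no_immunity", 0.0),   # dose received, immunity not yet developed
    ("V2_partial_protection", 0.6),       # partial protection after a set period
    ("V3_second_dose_given", 0.6),        # second dose received in phase 2
    ("V4_maximum_protection", 0.9),       # maximum protection from vaccination
]
PROTECTION = dict(VACCINATION_PATHWAY)


def infection_probability(base_probability, compartment):
    """Scale the unvaccinated infection probability by the stage's protection."""
    return base_probability * (1.0 - PROTECTION[compartment])


def next_compartment(compartment):
    """Return the next stage in the pathway, or None at the final stage."""
    names = [name for name, _ in VACCINATION_PATHWAY]
    position = names.index(compartment)
    return names[position + 1] if position + 1 < len(names) else None


if __name__ == "__main__":
    stage = "V1_first_dose_no_immunity"
    while stage is not None:
        print(stage, round(infection_probability(0.10, stage), 3))
        stage = next_compartment(stage)
```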

4. Impact of variants of concern (VOCs) can be modelled

A number of different strains of SARS-CoV-2 have been sequenced around the world as a result of viral mutation, some having shown higher rates of transmission or mortality.Footnote 13 These variants are called variants of concern (VOCs) and have become a crucial factor to consider in epidemiological modelling of SARS-CoV-2. The SEIRDV model is capable of modelling these by altering the probability of transmission (β) to model the increased transmission rate, as well as altering the flow into the hospitalization or the deceased compartment to model the effect of increased symptom severity of the variant. Using this mechanism, the team has successfully modelled the effect of the B.1.1.7 (Alpha) variant in our model.
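One simple way to picture the adjustment to β is to blend the baseline value with a more transmissible variant according to the variant's share of infections. The article only states that β is altered, so the blending rule, the function name and the numbers below are all assumptions for illustration.

```python
def effective_beta(beta_baseline, variant_share, transmissibility_multiplier):
    """Blend baseline and variant transmission probabilities.

    variant_share: fraction of current infections caused by the variant (0 to 1).
    transmissibility_multiplier: how much more transmissible the variant is
        than the baseline strain (1.0 means no difference).
    """
    if not 0.0 <= variant_share <= 1.0:
        raise ValueError("variant_share must be between 0 and 1")
    blend = (1.0 - variant_share) + variant_share * transmissibility_multiplier
    return beta_baseline * blend


if __name__ == "__main__":
    # Hypothetical values: a variant 50% more transmissible at 40% prevalence.
    print(effective_beta(0.05, variant_share=0.4, transmissibility_multiplier=1.5))
```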

Conclusion

Through continuous development, enhancement and calibration efforts, the epidemiological model has yielded a valuable contribution in modelling the trend of the SARS-CoV-2 pandemic in Canada. Specifically, findings from this model have allowed the PPE Project to estimate the PPE demand across Canadian provinces to ensure that all sectors acquire sufficient PPE stocks in advance of large outbreaks.

Furthermore, this article demonstrates how applications of data science, combined with statistics, computer science and epidemiology, can be utilized in public health planning as well as decision making for resource requirements during the COVID-19 pandemic.

Areas of further study

Given that the SARS-CoV-2 pandemic is still ongoing, more work remains to be done. Some potential future areas of study include:

  • New variants
    With the high rate of mutation observed in SARS-CoV-2, new variants are constantly being sequenced around the world. While the effect of the B.1.1.7 variant has been considered in the model, there are still several other VOCs that may need to be considered (e.g., the Delta variant). The team is closely monitoring the spread of VOCs across the country to determine if other variants need to be taken into account in the model.
  • Waning immunity
    Studies have shown that immunity gained from vaccination (or infection) does not last indefinitely. Immunity will wane over time, causing a progressive loss of protective antibodies. This phenomenon is called waning immunity. This will need to be taken into account in the model to prepare for a future scenario, such as when a large portion of the population will require another dose of vaccination to maintain their immunity.

The PPE epidemiological modelling team:
Jihoon Choi (DScD), Deirdre Hennessy (HAD), Joel Barnes (HAD).

Project team and contributors:
Rubab Arim, Statistics Canada; Kayle Hatt, Health Canada

National Travel Survey: C.V.s for Visit-Expenditures by Duration of Visit, Main Trip Purpose and Country or Region of Expenditures – Q1 2021

National Travel Survey: C.V.s for Visit-Expenditures by Duration of Visit, Main Trip Purpose and Country or Region of Expenditures, including expenditures at origin and those for air commercial transportation in Canada, in Thousands of Dollars (x 1,000)
Duration of Visit | Main Trip Purpose | Total ($ '000) | C.V. | Canada ($ '000) | C.V. | United States ($ '000) | C.V. | Overseas ($ '000) | C.V.
Total Duration | Total Main Trip Purpose | 4,388,384 | B | 3,584,773 | A | 348,650 | E | 454,962 | C
Total Duration | Holiday, leisure or recreation | 2,095,885 | B | 1,642,628 | A | 265,540 | E | 187,717 | D
Total Duration | Visit friends or relatives | 762,081 | B | 715,270 | B | 10,745 | E | 36,065 | E
Total Duration | Personal conference, convention or trade show | 30,793 | E | 29,511 | E | 849 | E | 433 | E
Total Duration | Shopping, non-routine | 267,241 | B | 267,241 | B | .. | | .. |
Total Duration | Other personal reasons | 679,391 | C | 456,425 | B | 67,117 | E | 155,850 | E
Total Duration | Business conference, convention or trade show | 31,336 | E | 9,449 | E | 123 | E | 21,764 | E
Total Duration | Other business | 521,658 | B | 464,249 | B | 4,276 | E | 53,133 | E
Same-Day | Total Main Trip Purpose | 1,655,756 | B | 1,555,199 | A | 100,557 | E | .. |
Same-Day | Holiday, leisure or recreation | 573,716 | D | 473,915 | B | 99,802 | E | .. |
Same-Day | Visit friends or relatives | 278,527 | B | 277,949 | B | 578 | E | .. |
Same-Day | Personal conference, convention or trade show | 27,842 | E | 27,842 | E | .. | | .. |
Same-Day | Shopping, non-routine | 242,842 | B | 242,842 | B | .. | | .. |
Same-Day | Other personal reasons | 281,383 | B | 281,205 | B | 178 | E | .. |
Same-Day | Business conference, convention or trade show | 900 | E | 900 | E | .. | | .. |
Same-Day | Other business | 250,546 | B | 250,546 | B | .. | | .. |
Overnight | Total Main Trip Purpose | 2,732,628 | B | 2,029,574 | B | 248,092 | E | 454,962 | C
Overnight | Holiday, leisure or recreation | 1,522,168 | B | 1,168,713 | B | 165,738 | E | 187,717 | D
Overnight | Visit friends or relatives | 483,554 | B | 437,321 | B | 10,167 | E | 36,065 | E
Overnight | Personal conference, convention or trade show | 2,952 | E | 1,669 | E | 849 | E | 433 | E
Overnight | Shopping, non-routine | 24,399 | E | 24,399 | E | .. | | .. |
Overnight | Other personal reasons | 398,009 | D | 175,220 | B | 66,939 | E | 155,850 | E
Overnight | Business conference, convention or trade show | 30,435 | E | 8,548 | E | 123 | E | 21,764 | E
Overnight | Other business | 271,112 | C | 213,703 | C | 4,276 | E | 53,133 | E

.. : data not available

Estimates contained in this table have been assigned a letter to indicate their coefficient of variation (c.v.) (expressed as a percentage). The letter grades represent the following coefficients of variation:

A: c.v. from 0.00% to 5.00% (Excellent)
B: c.v. from 5.01% to 15.00% (Very good)
C: c.v. from 15.01% to 25.00% (Good)
D: c.v. from 25.01% to 35.00% (Acceptable)
E: c.v. greater than 35.00% (Use with caution)
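The letter grades above amount to a simple threshold lookup. The sketch below is one way to express it; the function name `cv_grade` is hypothetical, and it assumes the c.v. is supplied as a percentage rounded to two decimals, as in the published ranges.

```python
def cv_grade(cv_percent):
    """Map a coefficient of variation (in percent) to its letter grade."""
    if cv_percent < 0:
        raise ValueError("c.v. cannot be negative")
    thresholds = [
        (5.00, "A"),   # Excellent
        (15.00, "B"),  # Very good
        (25.00, "C"),  # Good
        (35.00, "D"),  # Acceptable
    ]
    for upper_bound, grade in thresholds:
        if cv_percent <= upper_bound:
            return grade
    return "E"  # Use with caution


if __name__ == "__main__":
    for value in (3.2, 5.01, 25.0, 40.0):
        print(value, cv_grade(value))
```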

National Travel Survey: C.V.s for Person-Trips by Duration of Trip, Main Trip Purpose and Country or Region of Trip Destination – Q1 2021

Duration of Trip | Main Trip Purpose | Total Person-Trips (x 1,000) | C.V. | Canada Person-Trips (x 1,000) | C.V. | United States Person-Trips (x 1,000) | C.V. | Overseas Person-Trips (x 1,000) | C.V.
Total Duration | Total Main Trip Purpose | 27,144 | A | 26,718 | A | 219 | E | 208 | B
Total Duration | Holiday, leisure or recreation | 9,612 | A | 9,450 | A | 83 | E | 80 | D
Total Duration | Visit friends or relatives | 8,048 | B | 7,980 | B | 15 | E | 53 | E
Total Duration | Personal conference, convention or trade show | 167 | D | 164 | D | 0 | E | 2 | E
Total Duration | Shopping, non-routine | 1,752 | B | 1,752 | B | .. | | .. |
Total Duration | Other personal reasons | 4,341 | B | 4,252 | B | 24 | E | 64 | E
Total Duration | Business conference, convention or trade show | 20 | E | 15 | E | 2 | E | 3 | E
Total Duration | Other business | 3,205 | B | 3,105 | B | 94 | E | 6 | E
Same-Day | Total Main Trip Purpose | 20,369 | A | 20,253 | A | 117 | E | .. |
Same-Day | Holiday, leisure or recreation | 6,315 | A | 6,263 | A | 51 | E | .. |
Same-Day | Visit friends or relatives | 5,795 | B | 5,793 | B | 2 | E | .. |
Same-Day | Personal conference, convention or trade show | 156 | D | 156 | D | .. | | .. |
Same-Day | Shopping, non-routine | 1,695 | B | 1,695 | B | .. | | .. |
Same-Day | Other personal reasons | 3,664 | B | 3,652 | B | 12 | E | .. |
Same-Day | Business conference, convention or trade show | 6 | E | 6 | E | .. | | .. |
Same-Day | Other business | 2,738 | B | 2,687 | B | 51 | E | .. |
Overnight | Total Main Trip Purpose | 6,775 | A | 6,465 | A | 102 | C | 208 | B
Overnight | Holiday, leisure or recreation | 3,298 | B | 3,186 | B | 32 | D | 80 | D
Overnight | Visit friends or relatives | 2,254 | B | 2,188 | B | 13 | E | 53 | E
Overnight | Personal conference, convention or trade show | 10 | E | 8 | E | 0 | E | 2 | E
Overnight | Shopping, non-routine | 57 | D | 57 | D | .. | | .. |
Overnight | Other personal reasons | 677 | B | 601 | B | 12 | E | 64 | E
Overnight | Business conference, convention or trade show | 13 | E | 9 | E | 2 | E | 3 | E
Overnight | Other business | 467 | C | 418 | C | 43 | E | 6 | E

.. : data not available

Estimates contained in this table have been assigned a letter to indicate their coefficient of variation (c.v.) (expressed as a percentage). The letter grades represent the following coefficients of variation:

A: c.v. from 0.00% to 5.00% (Excellent)
B: c.v. from 5.01% to 15.00% (Very good)
C: c.v. from 15.01% to 25.00% (Good)
D: c.v. from 25.01% to 35.00% (Acceptable)
E: c.v. greater than 35.00% (Use with caution)

National Travel Survey: Response Rate at the estimation stage – Q1 2021

Province of residence | Unweighted (%) | Weighted (%)
Newfoundland and Labrador | 10.7 | 9.8
Prince Edward Island | 7.4 | 7.3
Nova Scotia | 22.2 | 20.1
New Brunswick | 19.7 | 17.6
Quebec | 28.5 | 25.0
Ontario | 26.8 | 24.6
Manitoba | 19.2 | 17.2
Saskatchewan | 15.6 | 14.4
Alberta | 22.8 | 21.6
British Columbia | 28.3 | 26.4
Canada | 21.3 | 23.5

Preliminary Estimate for 2021 and Intentions for 2022

Integrated Business Statistics Program (IBSP)

This guide is designed to assist you as you complete the Annual Exploration, Development and Capital Expenditures Survey: Petroleum and Natural Gas Industry, Preliminary Estimate for 2021 and Intentions for 2022.

If you need more information, please call the Statistics Canada Help Line at the number below.

Help Line: 1-833-977-8287 (1-833-97STATS)

Table of contents

Reporting period information
Definitions

Reporting period information

For the purpose of this survey, please report information for your 12-month fiscal period whose final day falls on or between April 1, 2021 and March 31, 2022.

The following fiscal periods fall within the required dates:

  • May 1, 2020 – April 30, 2021
  • June 1, 2020 – May 31, 2021
  • July 1, 2020 – June 30, 2021
  • August 1, 2020 – July 31, 2021
  • September 1, 2020 – August 31, 2021
  • October 1, 2020 – September 30, 2021
  • November 1, 2020 – October 31, 2021
  • December 1, 2020 – November 30, 2021
  • January 1, 2021 – December 31, 2021
  • February 1, 2021 – January 31, 2022
  • March 1, 2021 – February 28, 2022
  • April 1, 2021 – March 31, 2022

Here are other examples of fiscal periods that fall within the required dates:

  • September 18, 2020 to September 15, 2021 (e.g., floating year-end)
  • June 1, 2021 to December 31, 2021 (e.g., a newly opened business)
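The rule behind these examples is that only the final day of the fiscal period matters. A minimal check, with a hypothetical function name, could look like this:

```python
from datetime import date

# Reporting window stated in the guide (inclusive on both ends).
WINDOW_START = date(2021, 4, 1)
WINDOW_END = date(2022, 3, 31)


def fiscal_period_in_scope(period_end):
    """Return True if the fiscal period's final day falls within the window."""
    return WINDOW_START <= period_end <= WINDOW_END


if __name__ == "__main__":
    print(fiscal_period_in_scope(date(2021, 9, 15)))  # floating year-end example
    print(fiscal_period_in_scope(date(2021, 3, 31)))  # ends one day too early
```

The start date of the period is not tested, which is why a newly opened business with a period shorter than 12 months still qualifies.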

Definitions

  • When there are partnerships and joint venture activities or projects, report the expenditures reflecting this corporation's net interest in such projects or ventures.
  • Report all dollar amounts in thousands of Canadian dollars ('000).
  • Do not include sales tax. Percentages should be rounded to whole numbers.
  • When precise figures are not available, please provide your best estimates.

If there are no capital expenditures, please enter '0'.

What are Capital Expenditures?

Capital Expenditures are the gross expenditures on fixed assets for use in the operations of your organization or for lease or rent to others. Gross expenditures are expenditures before deducting proceeds from disposals, and credits (capital grants, donations, government assistance and investment tax credits).

Include:

  • Cost of all new buildings, engineering, machinery and equipment which normally have a life of more than one year and are charged to fixed asset accounts
  • Modifications, acquisitions and major renovations
  • Capital costs such as feasibility studies, architectural, legal, installation and engineering fees
  • Subsidies received and used for capital expenditures
  • Capitalized interest charges on loans with which capital projects are financed
  • Work done by own labour force
  • Additions to capital work in progress.

Exclude:

  • transfers from capital work in progress (construction-in-progress) to fixed assets accounts
  • assets associated with the acquisition of companies
  • property developed for sale and machinery or equipment acquired for sale (inventory).

1. Oil and gas rights acquisition and retention costs (exclude inter-company sales or transfers):

Include acquisition costs and fees for oil and gas rights (include bonuses, legal fees and filing fees), and oil and gas retention costs.

2. Exploration and evaluation, capitalized or expensed (e.g., seismic, exploration drilling):

These expenditures include geological, geophysical and seismic expenses, exploration drilling, and other costs incurred during the reporting period in order to determine whether oil or gas reserves exist and can be exploited commercially. Report gross expenditures, before deducting any incentive grants, incurred for oil and gas activities on a contracted basis and/or by your own employees. Exclude the cost of land acquired from other oil and gas companies.

3. Building construction (e.g., process building, office building, camp, storage building, and maintenance garage):

Include capital expenditures on buildings such as office buildings, camps, warehouses, maintenance garages, workshops, and laboratories. Fixtures, facilities and equipment that are integral parts of the building are included.

4. Other construction assets (e.g., development drilling and completions, processing facilities, natural gas plants, upgraders):

Include all infrastructure, other than buildings, such as the cost of well pads, extraction and processing infrastructure and plants, upgrading units, transportation infrastructure, water and sewage infrastructure, tailings, pipelines and wellhead production facilities (pumpjacks, separators, etc.). Include all preconstruction planning and design costs such as development drilling, regulatory approvals, environmental assessments, engineering and consulting fees and any materials supplied to construction contractors for installation, as well as site clearance and preparation. Equipment which is installed as an integral or built-in feature of a fixed structure (e.g., casings, tanks, steam generators, pumps, electrical apparatus, separators, flow lines, etc.) should be reported with the construction asset; however, when the equipment is replaced within an existing structure, the replacement cost should be reported in machinery and equipment (sustaining capital).

5. Machinery and equipment purchases (e.g., trucks, shovels, computers, etc.):

Include transportation equipment for people and materials, computers, software, communication equipment, and processing equipment not included in the above categories.

Research and Development

Research and experimental development (R&D) comprise creative and systematic work undertaken in order to increase the stock of knowledge – including knowledge of humankind, culture and society – and to devise new applications of available knowledge.

For an activity to be an R&D activity, it must satisfy five core criteria:

  1. To be aimed at new findings (novel);
  2. To be based on original, not obvious, concepts and hypotheses (creative);
  3. To be uncertain about the final outcome (uncertainty);
  4. To be planned and budgeted (systematic);
  5. To lead to results that could possibly be reproduced (transferable and/or reproducible).

The term R&D covers three types of activity: basic research, applied research and experimental development. Basic research is experimental or theoretical work undertaken primarily to acquire new knowledge of the underlying foundations of phenomena and observable facts, without any particular application or use in view. Applied research is original investigation undertaken in order to acquire new knowledge. It is, however, directed primarily towards a specific, practical aim or objective. Experimental development is systematic work, drawing on knowledge gained from research and practical experience and producing additional knowledge, which is directed to producing new products or processes or to improving existing products or processes.

Integrated Business Statistics Program (IBSP)

This guide is designed to assist you as you complete the Annual Capital Expenditures Survey: Preliminary Estimate for 2021 and Intentions for 2022. If you need more information, please call the Statistics Canada Help Line at the number below.

Help Line: 1-833-977-8287 (1-833-97STATS)

Reporting period information

For the purpose of this survey, please report information for your 12-month fiscal period whose final day falls on or between April 1, 2021 and March 31, 2022.

The following fiscal periods fall within the required dates:

  • May 1, 2020 – April 30, 2021
  • June 1, 2020 – May 31, 2021
  • July 1, 2020 – June 30, 2021
  • August 1, 2020 – July 31, 2021
  • September 1, 2020 – August 31, 2021
  • October 1, 2020 – September 30, 2021
  • November 1, 2020 – October 31, 2021
  • December 1, 2020 – November 30, 2021
  • January 1, 2021 – December 31, 2021
  • February 1, 2021 – January 31, 2022
  • March 1, 2021 – February 28, 2022
  • April 1, 2021 – March 31, 2022

Here are other examples of fiscal periods that fall within the required dates:

  • September 18, 2020 to September 15, 2021 (e.g., floating year-end)
  • June 1, 2021 to December 31, 2021 (e.g., a newly opened business)

Dollar amounts

  • all dollar amounts reported should be rounded to thousands of Canadian dollars (e.g., $6,555,444.00 should be rounded to $6,555);
  • exclude sales tax;
  • your best estimates are acceptable when precise figures are not available;
  • if there are no capital expenditures, please enter '0'.
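The rounding rule can be expressed as a short helper. The function name is hypothetical, and the guide does not say how to treat amounts that fall exactly halfway between two thousands, so rounding half up is an assumption here.

```python
from decimal import Decimal, ROUND_HALF_UP


def to_thousands(amount_in_dollars):
    """Convert a dollar amount to whole thousands of dollars.

    Uses Decimal to avoid binary floating-point surprises and rounds
    half up (an assumption; the guide does not specify tie handling).
    """
    thousands = Decimal(str(amount_in_dollars)) / Decimal(1000)
    return int(thousands.quantize(Decimal("1"), rounding=ROUND_HALF_UP))


if __name__ == "__main__":
    print(to_thousands(6555444.00))  # the guide's example amount
    print(to_thousands(0))           # no capital expenditures
```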

Definitions

What are Capital Expenditures?

Capital Expenditures are the gross expenditures on fixed assets for use in the operations of your organization or for lease or rent to others. Gross expenditures are expenditures before deducting proceeds from disposals, and credits (capital grants, donations, government assistance and investment tax credits).

Include:

  • Cost of all new buildings, engineering, machinery and equipment which normally have a life of more than one year and are charged to fixed asset accounts
  • Modifications, acquisitions and major renovations
  • Capital costs such as feasibility studies, architectural, legal, installation and engineering fees
  • Subsidies received and used for capital expenditures
  • Capitalized interest charges on loans with which capital projects are financed
  • Work done by own labour force
  • Additions to capital work in progress.

Exclude:

  • transfers from capital work in progress (construction-in-progress) to fixed assets accounts
  • assets associated with the acquisition of companies
  • property developed for sale and machinery or equipment acquired for sale (inventory).

How to Treat Leases

Include:

  • assets acquired as a lessee through either a capital or financial lease;
  • assets acquired for lease to others as an operating lease.

Industry characteristics

Report the value of the projects expected to be put in place during the year. Include the gross expenditures (including subsidies) on fixed assets for use in the operations of your organization or for lease or rent to others. Include all capital costs such as feasibility studies, architectural, legal, installation and engineering fees as well as work done by your own labour force. Include all additions to work in progress.

New Assets, Renovation, Retrofit includes both upgrades to existing assets and acquisitions of new assets.

Purchase of Used Canadian Assets

Definition: Used fixed assets may be defined as existing buildings, structures or machinery and equipment which have been previously used by another organization in Canada that you have acquired during the time period being reported on this questionnaire.

Explanation: The objective of our survey is to measure gross annual acquisitions of new fixed assets separately from gross annual acquisitions of used fixed assets in the Canadian economy as a whole.

Hence, the acquisition of a used fixed Canadian asset should be reported separately. Such acquisitions do not change the aggregates of our domestic inventory of fixed assets; they simply represent a transfer of assets within Canada from one organization to another.

Imports of used assets, on the other hand, should be included with the new assets (Column 1) because they are newly acquired for the Canadian economy.

Work in Progress

Work in progress represents accumulated costs since the start of capital projects which are intended to be capitalized upon completion.

Land

Capital expenditures for land should include all costs associated with the purchase of the land that are not amortized or depreciated.

Residential Construction

Report the value of residential structures including the housing portion of multi-purpose projects and of townsites.

Exclude:

  • buildings that have accommodation units without self-contained or exclusive use of bathroom and kitchen facilities (e.g., some student and senior citizen residences)
  • the non-residential portion of multi-purpose projects and of townsites
  • associated expenditures on services

The exclusions should be included in the appropriate construction (e.g., non-residential) asset.

Non-Residential Building Construction (excluding land purchase and residential construction)

Building construction represents any permanent structure with walls and a roof affording protection and shelter from and for a social and/or physical environment for people and/or materials.

For example, building construction represents expenditures on aircraft hangars, factories, hospitals, hotels, office buildings, railway stations, schools and shopping centres.

Report the total cost incurred during the year of building construction (contract and by own employees) whether for your own use or rent to others.

Include also:

  • the cost of demolition of buildings, land servicing and of site-preparation
  • leasehold and land improvements
  • all preconstruction planning and design costs such as engineering and consulting fees and any materials supplied to construction contractors for installation, etc.
  • townsite facilities, such as streets, sewers, stores, schools.

Non-residential engineering construction

Engineering construction encompasses the direct or indirect conveyance of people, machinery, materials, gases, and/or electrical impulses. It also includes free standing structures which contain or restrain such objects either as part of such conveyance or separately and independently.

In addition, the cost associated with significantly altering any terrain in the preparation for specialized use of that terrain will fall under engineering construction.

Report the total cost incurred during the year of engineering construction (contract and by own employees) whether for your own use or rent to others. Include also:

  • the cost of demolition of buildings, land servicing and of site-preparation
  • leasehold and land improvements
  • all preconstruction planning and design costs such as engineering and consulting fees and any materials supplied to construction contractors for installation, etc.
  • oil or gas pipelines, including pipe and installation costs
  • communication engineering, including transmission support structures, cables and lines, etc.
  • electric power engineering, including wind and solar plants, nuclear production plants, power distribution networks, etc.

Machinery and Equipment

Report total cost incurred during the year of all new machinery, whether for your own use or for lease or rent to others. Any capitalized tooling should also be included. Include progress payments paid out before delivery in the year in which such payments are made. Receipts from the sale of your own fixed assets or allowance for scrap or trade-in should not be deducted from your total capital expenditures. Any balance owing or holdbacks should be reported in the year the cost is incurred.

Include:

  • automobiles, trucks, professional and scientific equipment, office and store furniture and appliances
  • computers (hardware and software), broadcasting, telecommunication and other information and communication technology equipment
  • motors, generators, transformers
  • any capitalized tooling expenses
  • progress payments paid out before delivery in the year in which such payments are made
  • any balance owing or holdbacks should be reported in the year the cost is incurred
  • leasehold improvements.

Software

Capital expenditures for software should include all costs associated with the purchase or development of software.

Include:

  • Pre-packaged software
  • Custom software developed in-house/own account
  • Custom software design and development, contracted out

Research and Development

Research and experimental development (R&D) comprise creative and systematic work undertaken in order to increase the stock of knowledge – including knowledge of humankind, culture and society – and to devise new applications of available knowledge.

For an activity to be an R&D activity, it must satisfy five core criteria:

  1. To be aimed at new findings (novel);
  2. To be based on original, not obvious, concepts and hypotheses (creative);
  3. To be uncertain about the final outcome (uncertain);
  4. To be planned and budgeted (systematic);
  5. To lead to results that could possibly be reproduced (transferable and/or reproducible).

The term R&D covers three types of activity: basic research, applied research and experimental development. Basic research is experimental or theoretical work undertaken primarily to acquire new knowledge of the underlying foundations of phenomena and observable facts, without any particular application or use in view. Applied research is original investigation undertaken in order to acquire new knowledge. It is, however, directed primarily towards a specific, practical aim or objective. Experimental development is systematic work, drawing on knowledge gained from research and practical experience and producing additional knowledge, which is directed to producing new products or processes or to improving existing products or processes.