Confidentiality Vetting Support: Rounding Proportions using Stata
(The Statistics Canada symbol and Canada wordmark appear on screen with the title: "Confidentiality Vetting Support: Rounding Proportions using Stata")
Welcome to Statistics Canada's data access training series. This video is part of the confidentiality vetting support series and presents examples of how to use different statistical software packages to perform the analyses required for researchers working with confidential data.
Today, we will show you an example of how to create rounded proportion output in Stata using sample data from the General Social Survey (GSS). Please note that this is a public use version of the GSS.
All components of proportions need to meet minimum unweighted count requirements. This includes the numerator, denominator and the difference between the denominator and the numerator. Some confidentiality vetting rules also require proportions to be calculated based on rounded components. To help you with this, we have prepared a rounded proportion tool in Stata which will automatically prepare a supporting document showing the minimum counts necessary and output rounded proportions when they are required. If you're unsure of the location of the Stata rounded proportion tool, please reach out to your local RDC analyst.
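As a rough illustration of the rule just described, the short Stata sketch below checks the three components against a minimum count and then computes a proportion from rounded components. The counts, the rounding base of 5 and the minimum cell size of 10 are hypothetical values chosen for the example. They are not taken from the GSS or from the RDC tool, and the vetting rules that apply to your survey may differ.

```stata
* Hypothetical values for illustration only
local num     = 37     // unweighted numerator count
local den     = 112    // unweighted denominator count
local base    = 5      // assumed rounding base
local mincell = 10     // assumed minimum cell size

* The numerator, the denominator and their difference must all reach the minimum
local resid = `den' - `num'
local pass  = (`num' >= `mincell') & (`den' >= `mincell') & (`resid' >= `mincell')

* Round each component to the nearest multiple of the base, then divide
local num_r  = round(`num', `base')    // 35
local den_r  = round(`den', `base')    // 110
local prop_r = `num_r' / `den_r'

display "all components meet the minimum (1 = yes): `pass'"
display "proportion from rounded components: " %6.4f `prop_r'
```

Note that the proportion is computed after the components are rounded, so it can differ slightly from the unrounded proportion.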
This do file is designed to be used with your data set once it is cleaned and ready for analysis.
The only place where any adjustments are required is here in the globals.
All we need to do is specify the directory our data is in and the name of our cleaned dataset. We are also required to specify our numerator and denominator. In this example, our numerator is employment status and our denominator is sex. So our proportions are the shares of men and of women in each employment status.
We specify our survey weight, our rounding base and the minimum cell size. I have already run the globals in Stata, so all that is required is to select the tool from lines 25 down to 53 and hit do.
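For readers following along without the do-file open, a globals section of this kind might look something like the sketch below. Every global name, the path, the variable names and the values are hypothetical placeholders. The actual tool defines its own names, so use the ones you find in the do-file provided by your RDC analyst.

```stata
* Hypothetical global names and values; the actual tool may use different ones
global dir     "C:/project/data"    // folder that holds the cleaned dataset
global data    "gss_clean"          // name of the cleaned dataset
global num     "empstat"            // numerator variable (employment status)
global den     "sex"                // denominator variable (sex)
global weight  "wght_per"           // survey weight variable
global base    5                    // rounding base
global mincell 10                   // minimum unweighted cell size
```

Keeping all user input in one block of globals means the rest of the do-file can be run unchanged for every new proportion.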
Everything has been run and will appear in our directory.
Please note this tool makes use of the collapse command which means there will be several intermediate datasets which will also show up in our directory.
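To see why collapse leaves intermediate datasets behind, here is a minimal sketch of the general pattern using a small made-up dataset. It is not the tool's own code. The variable names (sex, empstat, wt) and the file name den_counts are assumptions for the example, and the file is saved to the current working directory.

```stata
clear
input byte sex byte empstat double wt
1 1 2.0
1 1 1.5
1 2 3.0
2 1 1.0
2 2 2.5
2 2 2.0
end

* Denominator totals by sex, saved as an intermediate dataset
preserve
collapse (count) den_n = wt (sum) den_w = wt, by(sex)
save den_counts, replace
restore

* Numerator totals by sex and employment status
collapse (count) num_n = wt (sum) num_w = wt, by(sex empstat)

* Attach the denominators and compute the weighted proportion
merge m:1 sex using den_counts, nogenerate
generate prop_w = num_w / den_w
list, sepby(sex)
```

Because collapse replaces the data in memory with the summary, each set of totals has to be saved before the next one is built, which is where the extra files come from.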
The first thing we want to look at is our "for release" file.
Here we see men, the different employment statuses, and the proportion of men in each of these employment statuses. Further down, we see the proportion of women in each employment status as well.
Everything looks good. Now all we have to do is check our supporting documentation.
Here in our directory it will be called supporting. The supporting document gives us everything we need: our unweighted denominator counts and our unweighted numerator counts. We even have a variable that provides the difference between the denominator and the numerator, so we can confirm that our residuals are above the minimum cell size.
Column L is a 'fail' flag that marks any counts that fall below our minimum cell size. We also see the weighted and unweighted proportions, and the rounded weighted and unweighted proportions. And we check to make sure everything is good.
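The kind of check a 'fail' column performs can be sketched in a few lines of Stata. The counts, the variable names and the minimum cell size of 10 below are hypothetical, and the tool's own column may be built differently.

```stata
clear
input byte sex byte empstat int num_n int den_n
1 1 48 120
1 2 72 120
2 1  6  95
2 2 89  95
end

local mincell 10    // assumed minimum unweighted cell size

* Residual: denominator minus numerator
generate resid_n = den_n - num_n

* Flag any row where a component falls below the minimum
generate byte fail = (num_n < `mincell') | (den_n < `mincell') | (resid_n < `mincell')

list if fail
```

In this made-up data the last two rows are flagged, the first because its numerator is below 10 and the second because its residual is below 10.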
The final thing I will note is if you want to create more proportions and more output, all you have to do is change your numerator and denominator and run the tool again. You can create as many proportions as you would like.
Thank you for watching this video today. I hope you have an excellent day.
(Canada wordmark appears.)