Preparing a program

For the Real Time Remote Access (RTRA) system to automate the confidentiality processes, your programs must be written in a standard format. To write a SAS program in the correct format, users must apply information from the RTRA parameters document and create statistics by calling standard RTRA macros.

Parameters

The RTRA parameters document contains essential information that users need to write their SAS programs. The terms used in this document are explained below.

SAS Tag Name - A tag name is a unique reference term for each survey library available through the RTRA system. To ensure access to the correct survey library, the tag name must appear in the name of your SAS program. Please refer to Program name for the correct naming convention.

SAS datasets - The SAS dataset name must be referenced using the standard libname called RTRAData. To ensure access to the correct survey dataset, please refer to the RTRA data page for the complete list of dataset names.

Rounding base - Frequencies are rounded in accordance with the rounding base specified for each survey dataset. The rounding base is developed using information on the weight distribution, minimum-respondent rules and existing rounding practice for each survey dataset.

Variables renamed - For RTRA compatibility reasons, certain variables are renamed.

Deleted variables - Sensitive variables that pose a disclosure risk are deleted from the microdata files.

Weight - The weight variables for each survey dataset are listed in this document. Although sample weights do not exist for administrative datasets, the standard variable name “WEIGHT” must still be supplied so that the RTRA system can pass it to the macro. This “WEIGHT” variable is equal to 1 for administrative data files.

Execution time limit - The execution time limit specifies the maximum length of time a submitted program is allowed to run. This limit prevents a SAS program from running for an excessive amount of time and consuming unnecessary computing resources.

Program name

To ensure access to the correct survey library, the tag name must appear in the name of your SAS program. Please refer to the RTRA data page for the complete list of tag names.

Your SAS program must follow a standard naming convention. The name must begin with the appropriate survey "Tag Name", followed by an underscore and then a name of your choosing. For example, researchers submitting a program using the 2006 General Social Survey should name their program: GSS2006_anynameyouwant.sas. Note that program names have a 70-character limit and cannot include the characters & and %.
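The naming rules above can be checked before submission with a short data step. This is only a sketch: the variable names are made up for the example, and it assumes that the 70-character limit counts the whole file name, including the .sas extension, which the guidance above does not specify.

```sas
/* Illustrative check of a candidate program name; not part of RTRA. */
data _null_;
  length name $ 200 tag $ 32;
  name = 'GSS2006_anynameyouwant.sas';  /* candidate program name */
  tag  = 'GSS2006';                     /* tag name from the RTRA data page */

  /* The name must start with the tag name followed by an underscore. */
  prefix_ok = (index(name, cats(tag, '_')) = 1);

  /* Assumption: the 70-character limit applies to the full file name. */
  length_ok = (length(name) <= 70);

  /* The characters & and % are not allowed anywhere in the name. */
  chars_ok = (findc(name, '&%') = 0);

  if prefix_ok and length_ok and chars_ok then
    put 'Name follows the convention: ' name;
  else
    put 'Name breaks the convention: ' name prefix_ok= length_ok= chars_ok=;
run;
```

A flag printed as 0 in the log indicates which rule the name breaks.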

Program content: Statistics

Please ensure your program follows the structure in the sections below.

Part 1: Program element

  • Users need to reference a standard libname called RTRAData. The list of corresponding dataset names can be found in the RTRA parameters. For example, Set RTRAdata.GSS2007;.
  • Do not define your own SAS libref; including a libname statement will result in the termination of your program.
  • In this section you can manipulate the data using "proc sort" and "data steps".
  • When using the "keep" statement (to define which variables to include in the output) or the "keep=" dataset option in SAS, you must include the "ID" variable. For example, Set RTRAdata.GSS2007 (keep = AGE SEX ID);.
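Taken together, the points above suggest a Part 1 along the following lines. This is a minimal sketch rather than a required template: the output dataset name work.gss and the missing-value filter are assumptions made for illustration, while RTRAData.GSS2007 and the variables AGE, SEX and ID come from the examples above.

```sas
/* Part 1 sketch. No libname statement is used, since including one */
/* would terminate the program; RTRAData is referenced directly.    */
data work.gss;                               /* work.gss is an illustrative name */
  set RTRAData.GSS2007 (keep = AGE SEX ID);  /* ID must be kept */
  if not missing(AGE);                       /* hypothetical data manipulation */
run;

/* Data can also be manipulated with proc sort. */
proc sort data=work.gss;
  by SEX;
run;
```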

Part 2: Statistics

In this section, tabulations are created by calling the custom RTRA procedure macros. You can call these macros a maximum of 10 times per program.

There are three types of statistics that can be calculated in RTRA:

  • 1. Basic Statistics: Each basic statistic macro calculates only one statistic at a time. The basic statistics available in the RTRA system are: frequency, mean, percentiles, percent distribution, proportions, ratio and share.
  • 2. Level 5 (L5) Statistics: Also known as higher‐order statistics, these statistics calculate differences between the basic statistics available in the RTRA system.
    • There are three different types of L5 statistics:
      • 1. Level Change (LC): Level change is defined as the difference between the values of the statistic calculated within a table.
      • 2. Percent Change (PC): Percentage change is defined as the percent difference between the values of the statistic within a table. It is calculated by taking the difference of two values within a table and dividing by the original value.
      • 3. Significance Test (ST): Significance tests calculate whether two values in a table have a difference that is statistically significant.
    • There are three methods of calculating L5 statistics. These methods refer to how the values in the table’s cells are compared to one another:
      • 1. Global: For a global L5 statistic, every value in a cell is compared to the value for the entire domain that encompasses these cells.
      • 2. Base Value: A base value L5 statistic compares the value of every cell with another specified cell (the base value).
      • 3. Sequential: A sequential L5 statistic compares the value of every cell with the value of the cell directly below it in the table. Note: The order of the domains in a table matters when using a sequential L5 statistic.
  • 3. Level 5 Sequential Over Time (L5SOT) Statistics: Also known as higher-order statistics, these statistics calculate differences between the basic statistics available in the RTRA system. L5SOT statistics compare the value of every cell with the value of the cell directly below it in the table, in sequence over time. As such, the time dimension must be identified in the macro so that the sequence can be established; it can be yearly (L5YrVar), monthly (L5MonVar), quarterly (L5QtrVar) or a set time interval (L5TimeInt). Note: The order of the domains in a table matters when using L5SOT statistics.
    • There are three different types of L5SOT statistics:
      • 1. Level Change (LC): Level change is defined as the difference between the values of the statistic calculated within a table.
      • 2. Percent Change (PC): Percentage change is defined as the percent difference between the values of the statistic within a table. It is calculated by taking the difference of two values within a table and dividing by the original value.
      • 3. Significance Test (ST): Significance tests calculate whether two values in a table have a difference that is statistically significant.

Both L5 and L5SOT statistics require a basic statistic to be calculated before they can be used. As such, there is a field within the L5 and L5SOT macros where the basic statistic is identified.
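As a rough illustration of the arithmetic behind the base value method described above, the sketch below computes a level change and a percent change by hand on three made-up cell values. It is not RTRA macro syntax and does not reproduce the RTRA rounding or confidentiality processing. The dataset names, variable names and numbers are invented for the example, and the choice of the first cell as the base value and the direction of subtraction (cell minus base) are assumptions.

```sas
/* Toy values only: not RTRA output. */
data work.cells;
  input domain $ value;
  datalines;
A 200
B 250
C 300
;
run;

data work.base_value;
  set work.cells;
  retain base;
  if _n_ = 1 then base = value;    /* assumption: first cell is the base value */
  level_change   = value - base;                 /* LC: difference from the base */
  percent_change = 100 * (value - base) / base;  /* PC: difference divided by the base */
run;

proc print data=work.base_value noobs;
run;
```

With these toy values, domain B shows a level change of 50 and a percent change of 25, and domain C shows 100 and 50. A global or sequential comparison would follow the same arithmetic with a different reference value: the value for the whole domain, or the value of the cell directly below.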
