Video - Join by Attributes (Part 2): One to Many Joins

Catalogue number: Catalogue number: 89200005

Issue number: 2020013

Release date: November 20, 2020

QGIS Demo 13

Join by Attributes (Part 2) - One to Many Joins - Video transcript

(The Statistics Canada symbol and Canada wordmark appear on screen with the title: "Join by Attributes (Part 2) - One to Many Joins")

Hello everyone, following up from the previous tutorial on one-to-one join by attributes, today we'll discuss the second type – the one-to-many join, where there are many rows or features for one corresponding feature geometry. We'll demonstrate these procedures using the Farms Classified by Total Farm Capital table, which includes the numbers of farms reporting in six separate capital categories for each corresponding boundary. The table also requires some formatting in an external editor, including extracting the join information which is not unusual for performing joins or integrating Statistics Canada tables with vector data presently.

So within the table we'll begin by moving the first Census Agricultural Region name down to align with the first capital category and delete the row it was in. Then we'll copy the main table to a new workbook to continue editing. So now we must fill the Census Agricultural Region names for each capital category. So under Find and Select, we can click Go to Special and check Blanks. In the Formula Bar at the top we'll then define the cell used to fill the first blank cell and hit Ctrl and Enter to auto-fill the formula down the entire column.

Next we want to extract the Census Agricultural Region Unique Identifiers found within the square brackets of the "Geography" column. So to do so we can apply this formula, which will find and extract characters within square brackets. I've included this and other helpful formulas for extracting join information within the video description. So after pasting the formula we can use the tab in the bottom-right corner of the cell to apply the formula for the entire column.

Now we want to remove the CAR prefix so we'll use the RIGHT formula, specifying the cell and the number of characters to take, in this case 9.

The final procedure to isolate the Census Agricultural Region Unique Identifiers is to take the first four values from the LEFT.

To extract the join information for finer census geographies such as subdivisions, we would actually take the first three digits on the left and the last four digits on the right and concatenate them together, with the procedures otherwise being the same.

Now we'll copy the extracted IDs and paste them as values in the adjacent column. We can then delete the previous three columns with formulas. And the final step is to provide abbreviated field names less than 10 characters in length as we learned in the previous video. I'm adding FN in front of 2011 and 2016 to distinguish that these are farm numbers being reported within these columns. Then we'll save our table first as an excel workbook, which we'll call FarmCap and then we'll resave as a comma-separated values file selected from the Save As Type drop-down. Saving directly to .csv could've adversely affected our table formatting. So now we can close the program, and click Don't Save.

Now within QGIS we can refresh our browser panel and load in the formatted table along with our Census Agricultural Regions boundary file.

Rather than using the Joins tab which results in a more complex workflow, we'll go to the Processing Toolbox, and search Join Value - opening the Join Attributes by Field Value tool. Despite the slightly different appearance and format the tool contains the exact same parameters as the Joins tab. We specify the two layers that we'd like to join, here the Census Agricultural Regions boundary file and the formatted farm capital table – and the two fields with common entries used to link the datasets together – in this case both being CARUID. We can also specify which fields to add from our second layer. The main distinguishing feature of the tool is the ability to select the One-To-Many from the Join Type drop-down.

So we'll now save it to a permanent file – calling it JFarmCap – in our Joins folder. Run the tool and once complete refresh the Browser Panel and load the layer.

So we'll isolate the Census Agricultural Regions in Manitoba for a smaller area of analysis. With our selection we can then calculate the difference in the farm numbers reporting in each capital category between the two census years using the Field Calculator. So I'll call the field D-Farm-16-11 and expanding the Fields and Values drop-down we'll subtract the number of farms in 2016 from those in 2011.

So now we can apply a graduated symbology to our calculated difference field. I created a symbology file earlier for this purpose which I'll load from File. So Natural Jenks was the Mode to establish the ranges for visualization and the same field was also used to label the features . It's important that we are applying the visualization to this layer, as it contains the full range of values which might not be replicated by individual capital categories.

Now within the Processing Toolbox, the Split Vector Layer tool can be used to create many separate layers from a vector according to a unique ID. So if we left it as is, using the CARUID – each census agricultural region would be output as a separate file. However, since we're interested in examining the changing number of farms in each capital category - we'll apply it to the Farm Capital Class field. We'll create a new directory within our Joins folder for the output files, calling it Split Farm Capital. Ensuring it's only applied to the selected features we can then click Run.

Once complete, we can now expand the directory and for a selection of layers in the Browser Panel we can right-click and select Add Layers.

So with the symbology already applied to our original joined vector, we can simply right-click copy style, and selecting all style categories. Now we can select and right-click the split layers and click paste style. So this enables a uniform visualization, with the complete value ranges, to be rapidly applied across multiple layers.

I've loaded and ordered the split layers in ascending capital ranges in the Prepared Layers group. So now if we toggle through the separate Farm Capital Categories we can see a broad trend. In general there's a decline in the numbers of farms reporting in smaller capital categories – and as we approach larger capital categories, there's a general increase in the number of farms reporting. This is an established trend within the agricultural sector.

So joining table and vector data is a powerful tool for integrating tabulated variables in a geospatial format – facilitating analyzing and visualizing spatial and temporal variations in a wide-range of reported variables. With the skills developed in this demo, users should be able to: Extract or match join information from datasets, Perform one-to-one and one-to-many joins between tables and vector datasets, and visualize joined variables and examine the relations with the graphics tools.

Although we only joined one table to the vector datasets for each of these case-uses, these procedures can be repeated to combine multiple tables and explore relations between variables at a common level. Apply these skills to datasets of interest to you. So stay tuned for the next tutorial in which we'll cover creating Maps in QGIS.

And on one final note - Statistics Canada is increasingly releasing datasets in table and spatial formats on the Federal Geospatial Platform, such as Median Income after Tax and Farm Operator datasets. They can be readily loaded into QGIS and used to examine trends at multiple scales. So download these datasets when available to facilitate using Statistics Canada data in QGIS.

(The words: "For comments or questions about this video, GIS tools or other Statistics Canada products or services, please contact us:
statcan.sisagrequestssrsrequetesag.statcan@canada.ca" appear on screen.)

(Canada wordmark appears.)