Video - Geoprocessing Tools (Part 1)

Catalogue number: 89200005

Issue number: 2020017

Release date: November 24, 2020

QGIS Demo 17

Geoprocessing Tools (Part 1) - Video transcript

(The Statistics Canada symbol and Canada wordmark appear on screen with the title: "Geoprocessing Tools (Part 1)")

So today we'll introduce geoprocessing tools, which enable layers to be spatially overlaid and integrated in a variety of ways. These tools epitomize the power of GIS and geospatial analysis, facilitating combining feature geometries and attributes, whether it be assessing spatial relations, distributions or proximities between layers and associated variables of interest. We'll demonstrate these tools with a simple case-study, examining land-cover conditions near water features, also known as riparian areas, in southern Manitoba. These tools can be reapplied and iterated with multiple layers, enabling you to combine, analyse and visualize spatial relations between any variables, geometries and layers of thematic relevance to your area of expertise.

So first, the Merged Census Division feature from the AOI layer was selected and subset to a new layer – CAOI – since Selected Features is not available when running tools as a batch process.

In addition to the interactive and attribute selection tools covered previously, there is one final type – Select by Location. This selects features from the input layer according to their spatial distribution relative to a second layer and the selected geometric predicates. The predicates define the particular spatial relations used when selecting features. We'll use Intersects, Overlaps and Are Within. Multiple predicates can be used, provided they do not conflict, and processing times increase with the number of selected predicates. At the bottom, the alternative selection options are available in the drop-down, but we'll run with the default.
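To make the predicates a little more concrete, here is a minimal Python sketch of the same idea. It is not the QGIS implementation: features are simplified to axis-aligned rectangles, the feature names and coordinates are made up, and it assumes a feature is selected when any one of the checked predicates holds.

```python
from typing import NamedTuple

class Box(NamedTuple):
    """Simplified stand-in for a feature geometry (assumption: rectangles only)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float

def intersects(a, b):
    # True when the boxes share at least one point (touching edges count).
    return a.xmin <= b.xmax and b.xmin <= a.xmax and a.ymin <= b.ymax and b.ymin <= a.ymax

def within(a, b):
    # True when a lies completely inside b.
    return b.xmin <= a.xmin and a.xmax <= b.xmax and b.ymin <= a.ymin and a.ymax <= b.ymax

def overlaps(a, b):
    # True when the interiors share area but neither box contains the other.
    interiors_meet = a.xmin < b.xmax and b.xmin < a.xmax and a.ymin < b.ymax and b.ymin < a.ymax
    return interiors_meet and not within(a, b) and not within(b, a)

def select_by_location(features, reference, predicates):
    # Keep the names of features that satisfy at least one predicate.
    return [name for name, box in features.items()
            if any(pred(box, reference) for pred in predicates)]

# Hypothetical input features and comparison layer.
features = {
    "inside": Box(1, 1, 2, 2),
    "straddling": Box(4, 4, 6, 6),
    "outside": Box(10, 10, 11, 11),
}
aoi = Box(0, 0, 5, 5)
print(select_by_location(features, aoi, [intersects, overlaps, within]))
# ['inside', 'straddling']
```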

So most selected features match the predicates but two spatially disconnected features were also returned due to a common attribute. So now we'll use the Multipart to Singlepart tool to break the multi-polygons into separate features, running with Selected Features Only.

Now we'll use a slight variation of Select by Location - Extract by Location. Instead of creating feature selections in our input layer, this will generate a new layer. So matching the predicates and comparison layer to those used in Select by Location, we'll click Run. In addition there is also Join by Location, which enables fields from the second layer to be joined to the first according to the predicates and the specified join type – as one-to-one or one-to-many. So these by Location tools enable features to be selected or extracted and field information joined between layers according to their relative spatial distributions.

So now we'll merge the land-cover 2000 layers into one file with the Merge Vector Layers tool. Open the Multiple Selection box and select the four land-cover files. We'll also switch the Destination Coordinate Reference System to WGS84 UTM Zone 14 for spatial analysis. Click Run with a temporary file. So Merge can be applied to vectors of the same geometry type. It works best when layers contain the same fields and cover distinct yet adjacent areas – making the land-cover layers highly suitable. Two additional fields specifying the originating layer and file path for each of the features are included in the output.
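As a rough illustration of what Merge does to the attribute table, the Python sketch below stacks the features of several layers and adds the two bookkeeping fields. The file paths are hypothetical, geometries are left out, and the field names "layer" and "path" are an assumption about how the output fields are labelled.

```python
def merge_layers(layers):
    """Stack features from several layers, recording where each one came from.

    `layers` maps a file path to a list of attribute dictionaries.
    """
    merged = []
    for path, features in layers.items():
        # Derive a layer name from the file name without its folder or extension.
        layer_name = path.rsplit("/", 1)[-1].rsplit(".", 1)[0]
        for feature in features:
            merged.append({**feature, "layer": layer_name, "path": path})
    return merged

# Hypothetical land-cover tiles; only the attributes are modelled here.
tiles = {
    "data/lc_tile_a.shp": [{"COVTYPE": 20}],
    "data/lc_tile_b.shp": [{"COVTYPE": 50}, {"COVTYPE": 20}],
}
for row in merge_layers(tiles):
    print(row)
```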

While Merge is running, we'll reproject the watershed layer to the same Coordinate Reference System for consistency in our spatial analysis.

Now we'll join the provided classification guide with the class names to the merged output, using the Joins tab. So code is the Join Field and COVTYPE is the Target Field. We'll join the Class field and remove the prefix. Now we can run the merged layer through the Fix Geometries tool to accomplish two tasks simultaneously. It will fix invalid geometries – critical for adding spatial measures and applying geoprocessing tools – while also permanently joining the Class field. The process may take a few minutes to complete.
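The join itself is a simple lookup, which can be sketched in Python as follows. This is only an illustration of the logic, not what QGIS runs internally. The only code taken from the demo is 20 for Water; the other code and class name are placeholders.

```python
def join_class(features, guide, target_field="COVTYPE", join_field="code"):
    """Attach the Class name from the guide to each feature by matching codes."""
    lookup = {row[join_field]: row["Class"] for row in guide}
    # Features with no matching code get None, similar to a NULL in the table.
    return [{**feature, "Class": lookup.get(feature[target_field])}
            for feature in features]

# Classification guide: code 20 = Water comes from the demo, 99 is a placeholder.
guide = [
    {"code": 20, "Class": "Water"},
    {"code": 99, "Class": "Example class"},
]
land_cover = [{"COVTYPE": 20}, {"COVTYPE": 99}, {"COVTYPE": 7}]

for row in join_class(land_cover, guide):
    print(row)
```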

So now we'll rename the Reprojected and Fixed layers to PTWShed for projected tertiary watershed and FMLC2000 for fixed merged land-cover 2000. This will enable us to use the autofill settings to populate the file paths and names when running Clip as a Batch Process. So open Clip from the Toolbox and click Batch Process.

As we've covered, the Clip tool helps standardize the extent of analysis for multiple layers to an area of interest, or reduce processing times and file sizes in a workflow. The inputs can be of any geometry type while the Overlay Layer is always a polygon. Features and attributes that overlap with the Overlay Layer are retained, with the Overlay Layer acting like a cookie cutter on the input.

So select FMLC2000 and PTWShed as the input and select CAOI as the Overlay Layer. We can then copy and paste it into the next row – which we could repeat for as many entries as required. We'll click the plus icon and copy PTWShed for the Input to prepare this layer for an upcoming demo. Here we'll use Manitoba Outline as the Overlay Layer. For the output files we'll store them in a Scratch folder, for intermediary outputs in our workflow which can then be deleted at the end of Part II of the demo. Enter C for the filename, and click Save and then use Fill with Parameter Values in the Autofill settings drop-down. This adds a C prefix to our existing layer names. We'll store the last file in the Geoprocessing folder so that it is retained. Click Run and we'll pick back up once completed. The process takes around five minutes to complete.

So with the clipped layers complete, load them into the Layers Panel. I'll move them back into the Processing Group for organization purposes and then zoom in on the layers.

We can load the provided symbology file to visualize the different land-cover classes.

Then we'll add an area field to the clipped land-cover file. Call it FAreaHA for field area, using a decimal field type with a length of 12 and a precision of 2. We'll reuse these parameters for adding subsequent numeric fields. Enter the appropriate expression - $area divided by 10000.
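For reference, the arithmetic behind that expression can be sketched in plain Python. The sketch assumes planar coordinates in metres (as in a projected system like the UTM zone used here) and uses the shoelace formula as a stand-in for $area; the example polygon is made up.

```python
def polygon_area_m2(ring):
    """Planar area of a simple polygon from its vertices (shoelace formula)."""
    total = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]  # wrap around to close the ring
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def area_ha(ring):
    # 1 hectare = 10,000 square metres; round to the field precision of 2.
    return round(polygon_area_m2(ring) / 10000, 2)

# Hypothetical 200 m by 100 m field.
field = [(0, 0), (200, 0), (200, 100), (0, 100)]
print(area_ha(field))  # 2.0
```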

Now we'll use Select by Expression to isolate 'Water' features using "COVTYPE" = 20 or "Class" LIKE 'Water' – and then click Select Features.
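The same selection logic, written as a small Python filter for illustration. It assumes that LIKE without wildcards behaves as an exact match, and the non-water rows are placeholders.

```python
def select_water(features):
    """Return features whose code is 20 or whose class name is exactly 'Water'."""
    return [f for f in features
            if f.get("COVTYPE") == 20 or f.get("Class") == "Water"]

# Hypothetical attribute rows; only code 20 = Water comes from the demo.
rows = [
    {"COVTYPE": 20, "Class": "Water"},
    {"COVTYPE": 99, "Class": "Example class"},
    {"COVTYPE": None, "Class": "Water"},
]
print(len(select_water(rows)))  # 2
```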

Now we'll generate a Buffer around the selected features to begin creating the Riparian area layer. There are many Buffer tools available in the Processing Toolbox – which we'll demonstrate in Part II – here using the default tool.

We'll check the 'Selected features only' box and enter 30 for the distance – a common riparian setback in land-use planning and policies. Change the End Cap Style to Flat and check Dissolve Results, so that any overlapping buffers are merged to avoid inflating total area estimates. Run with a temporary output file. We'll rerun the tool toggling back to the Parameters and changing the distance to 0, to output Water features as their own temporary layer – reducing processing times for the next tool.
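To see why dissolving matters for area totals, here is a small Python sketch. It is a deliberate simplification: geometries are modelled as sets of unit grid cells rather than vector polygons, the buffer distance is counted in cells, and the two water features are made up.

```python
def buffer_cells(cells, distance):
    """Return every grid cell within `distance` cells of any input cell."""
    buffered = set()
    for x, y in cells:
        for dx in range(-distance, distance + 1):
            for dy in range(-distance, distance + 1):
                if dx * dx + dy * dy <= distance * distance:
                    buffered.add((x + dx, y + dy))
    return buffered

# Two hypothetical water features close enough for their buffers to overlap.
pond_a = {(0, 0)}
pond_b = {(2, 0)}

buffer_a = buffer_cells(pond_a, 1)
buffer_b = buffer_cells(pond_b, 1)

undissolved_area = len(buffer_a) + len(buffer_b)  # shared cell counted twice
dissolved_area = len(buffer_a | buffer_b)         # shared cell counted once

print(undissolved_area, dissolved_area)  # 10 9
```

With a distance of 0 the function returns the input cells unchanged, which mirrors the second run used to output the water features as their own layer.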

Buffer tools can be applied to any vector geometry type. And they are used to assess the proximity of features to those in other layers. We can also use buffers to facilitate combining our geometries and attributes with other layers – like buffering lines or points to use them as a difference layer. The buffer contains the input layer's attributes, which can be used for further analysis. The outputs are often applied with other geoprocessing tools for further examination.

So we'll rename the outputs, naming the first B30W and the second LC2000Water, to facilitate their distinction.

Zooming in on the buffer, the input water features were also included in the output geometry. Since we are not interested in water features but the land-cover conditions around them we'll run the water buffer through the Difference tool using LC2000Water as the Overlay Layer to retain only the buffered area. So Difference is the opposite of Clip – retaining only input features that do not overlap with the Overlay Layer. Like Clip – the input can be any geometry type, while the Overlay Layer is always a polygon. Difference can be used whenever we are interested in features that do not overlap with a specific polygon, such as areas external to a certain drive or distance from hospitals or farm fields, roads or grain elevators not impacted by historical flooding. So click Run and we'll continue once the output is complete.
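The relationship between Clip and Difference can be sketched with Python sets. Again this is only an illustration: geometries are simplified to sets of unit grid cells and the shapes are made up.

```python
def clip(input_cells, overlay_cells):
    # Keep only the part of the input that overlaps the overlay.
    return input_cells & overlay_cells

def difference(input_cells, overlay_cells):
    # Keep only the part of the input that does NOT overlap the overlay.
    return input_cells - overlay_cells

# Hypothetical water body (2 cells) and a buffer that also covers the water.
water = {(0, 0), (1, 0)}
buffered = {(x, y) for x in range(-1, 3) for y in range(-1, 2)}  # 12 cells

riparian = difference(buffered, water)
print(len(buffered), len(riparian))  # 12 10
```

Taken together, the clipped part and the difference part make up the whole input, which is one way to read Difference as the opposite of Clip.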

Toggling the water layer off, we can see that the Difference has retained only our 30 metre buffer. So now we've successfully generated our riparian area layer but need to follow up with the Intersection tool – running it twice to extract watershed codes and land-cover classes to our layer. Intersection retains the overlapping feature geometries of the input layers and any selected attributes of interest in the Fields to Keep parameter. If geometry types differ between layers, the first layer's geometry is used in the output. Thus, Intersection can help combine variables of interest from multiple layers.
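A minimal Python sketch of what Intersection does with geometries and attributes follows. It uses the same simplification of grid-cell sets in place of polygons. The sub-basin codes "A" and "B" and the feature shapes are placeholders, while the SUBBASIN and layer field names follow the demo.

```python
def intersection(input_layer, overlay_layer, input_fields, overlay_fields):
    """Return one output feature per overlapping pair, with the kept fields."""
    output = []
    for a in input_layer:
        for b in overlay_layer:
            shared = a["cells"] & b["cells"]
            if shared:  # only overlapping geometry is retained
                feature = {name: a[name] for name in input_fields}
                feature.update({name: b[name] for name in overlay_fields})
                feature["cells"] = shared
                output.append(feature)
    return output

# Hypothetical riparian strip crossing two watersheds.
riparian = [{"layer": "B30W", "cells": {(0, 0), (1, 0), (2, 0), (3, 0)}}]
watersheds = [
    {"SUBBASIN": "A", "cells": {(x, y) for x in range(0, 2) for y in range(-1, 2)}},
    {"SUBBASIN": "B", "cells": {(x, y) for x in range(2, 4) for y in range(-1, 2)}},
]

for feature in intersection(riparian, watersheds, ["layer"], ["SUBBASIN"]):
    print(feature["SUBBASIN"], sorted(feature["cells"]))
```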

For the first run we'll use the Difference and clipped watershed layers as the inputs to assign watershed codes to the riparian buffer. This will enable us to examine land-cover conditions by watershed in Part II of the demo. And for PTWShed check the sub-basin code field in the Multiple Selection box. For the Difference layer, we'll select an arbitrary field for the Fields to Keep parameter – here selecting the "layer" field, clicking OK and then clicking Run. This process takes around 5 minutes and we'll continue when complete.

Within the Attribute Table we can see watershed codes have been successfully assigned to the riparian layer. Now we'll run the tool again, using the intersect as the Input and the clipped land-cover file as the Overlay Layer to integrate the land-cover features in the riparian areas. We'll retain the watershed code field from the first layer and the "Class" and "FAreaHA" fields from the land-cover. We'll save it to file, storing it in the main geoprocessing folder and calling it RipLC2000 for riparian land-cover 2000. If the tool fails, use the Fix Geometries tool and rerun the Intersection with the fixed outputs. We'll pick back up after the layer is created, which may take up to 20 minutes.

With the riparian land-cover layer loaded, copy and paste the style from the clipped land-cover to visualize the different feature classes occupying these areas. Now we've successfully combined the riparian buffer by watershed with the land-cover layer. And for the final component of Part I we'll add four new fields with the Field Calculator. The first is the intersected area in hectares, IAreaHA, to determine the area of each land-cover feature within the buffered riparian area. Use the same parameters and expression as applied for creating the FAreaHA field.

So next we'll calculate the percentage of each feature within the 30 metre buffer, to assess the relative distribution of the original features within the riparian setback and isolate any potential violating land-uses. We'll call the field PrcLCinRip, for percent land-cover in riparian area, with the same parameters as the previous fields. Expanding the fields drop-down, we'll divide IAreaHA by FAreaHA and multiply by 100.
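The calculation is simple enough to check by hand. As a Python sketch, with made-up area values and a zero-area guard that is an addition of this example rather than part of the demo:

```python
def percent_in_riparian(i_area_ha, f_area_ha):
    """Share of the original feature area (FAreaHA) that falls in the buffer (IAreaHA)."""
    if not f_area_ha:
        return None  # avoid dividing by zero or a missing area
    return round(i_area_ha / f_area_ha * 100, 2)

# A hypothetical 2 ha feature with 0.5 ha inside the 30 metre buffer.
print(percent_in_riparian(0.5, 2.0))  # 25.0
```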

The next two fields are to create an identifier which combines the subwatershed codes and land-cover class fields which we'll use to aggregate and assess riparian land-cover by watershed. First is an FID field or FeatureID, which we'll use for the Group_By parameter when using the concatenate function. Leave the parameters in their defaults and double-click the @row_number expression.

Now we can use Concatenate to combine our fields in creating the ID. This is extremely helpful for further processing and analysis, such as distinguishing and rejoining different processed layers to original features or aggregating datasets by different criteria. So we'll change to a text field type with a length of 100 and call it "UBasinLCID".

So type concatenate in the expression box – specifying the function to apply, and then open bracket and double-click SUBBASIN in the fields and values drop-down. Using the separators and adding a dash in single quotes will help separate the codes and class fields for interpretability. As noted, the FID field is used for the Group_By parameter, writing group underscore by, colon, equal sign and then double-clicking the FID field.
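To illustrate why the FID field is used for grouping, here is a Python sketch of a grouped concatenation. It is not the QGIS expression itself. Because each FID is unique, every group holds a single feature, so each row receives only its own code and class rather than values from the whole layer. The sub-basin codes and the class name other than Water are placeholders.

```python
from collections import defaultdict

def concatenate_by_group(rows, parts, group_field, separator="-"):
    """Build an ID per row by joining `parts`, aggregated within each group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_field]].append(separator.join(str(row[p]) for p in parts))
    # Each row receives the concatenation of every value in its own group.
    return [{**row, "UBasinLCID": "".join(groups[row[group_field]])} for row in rows]

# Hypothetical rows from the riparian land-cover layer.
table = [
    {"FID": 1, "SUBBASIN": "A", "Class": "Water"},
    {"FID": 2, "SUBBASIN": "B", "Class": "Example class"},
]
for row in concatenate_by_group(table, ["SUBBASIN", "Class"], "FID"):
    print(row["UBasinLCID"])
# A-Water
# B-Example class
```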

We can see the combined fields in the output preview. Given the number of features, the concatenated field can take up to 30 minutes to create. After it's complete, be sure to save the edits to the layer and the project file with a distinctive name for use in Part II of the demo.

(The words: "For comments or questions about this video, GIS tools or other Statistics Canada products or services, please contact us: statcan.sisagrequestssrsrequetesag.statcan@canada.ca" appear on screen.)

(Canada wordmark appears.)