Computer vision models: seed classification project
By AI Lab, Canadian Food Inspection Agency
Introduction
The AI Lab team at the Canadian Food Inspection Agency (CFIA) is composed of a diverse group of experts, including data scientists, software developers, and graduate researchers, all working together to provide innovative solutions for the advancement of Canadian society. By collaborating with inter-departmental branches of government, the AI Lab leverages state-of-the-art machine learning algorithms to provide data-driven solutions to real-world problems and drive positive change.
At the CFIA's AI Lab, we harness the full potential of deep learning models. Our dedicated team of Data Scientists leverages the power of this transformative technology and develops customised solutions tailored to meet the specific needs of our clients.
In this article, we motivate the need for computer vision models for the automatic classification of seed species. We demonstrate how our custom models have achieved promising results using "real-world" seed images and describe our future directions for deploying a user-friendly SeedID application.
At the CFIA AI Lab, we strive not only to push the frontiers of science by leveraging cutting-edge models, but also to render these services accessible to others and to foster knowledge sharing, for the continuous advancement of Canadian society.
Computer vision
To understand how image classification models work, we first define what exactly computer vision tasks aim to address.
What is computer vision:
Computer Vision models are fundamentally trying to solve what is mathematically referred to as ill-posed problems. They seek to answer the question: what gave rise to the image?
As humans, we do this naturally. When photons enter our eyes, our brain is able to process the different patterns of light enabling us to infer the physical world in front of us. In the context of computer vision, we are trying to replicate our innate human ability of visual perception through mathematical algorithms. Successful computer vision models could then be used to address questions related to:
- Object categorisation: the ability to classify objects in an image scene or recognise someone's face in pictures
- Scene and context categorisation: the ability to understand what is going on in an image through its components (e.g. indoor/outdoor, traffic/no traffic, etc.)
- Qualitative spatial information: the ability to qualitatively describe objects in an image, such as a rigid moving object (e.g. bus), a non-rigid moving object (e.g. flag), a vertical/horizontal/slanted object, etc.
Yet, while these appear to be simple tasks, computers still have difficulties in accurately interpreting and understanding our complex world.
Why is computer vision so hard:
To understand why computers seemingly struggle to perform these tasks, we must first consider what an image is.
An image is a set of numbers, with typically three colour channels: Red, Green, Blue. In order to derive any meaning from these values, the computer must perform what is known as image reconstruction. In its most simplified form, we can mathematically express this idea through an inverse function:
x = F⁻¹(y)
Where:
y represents data measurements (i.e. pixel values).
x represents a reconstructed version of measurements, y, into an image.
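To make the measurements y concrete, here is a minimal sketch of what an image looks like to a computer: a hypothetical 2×2 RGB image is nothing more than a grid of numbers, one intensity value per colour channel.

```python
import numpy as np

# A hypothetical 2x2 RGB image: height x width x 3 colour channels.
# Each value is a light intensity measurement in the range [0, 255].
y = np.array([
    [[255, 0, 0], [0, 255, 0]],      # a red pixel, a green pixel
    [[0, 0, 255], [255, 255, 255]],  # a blue pixel, a white pixel
], dtype=np.uint8)

# All the computer "sees" is this array of numbers; recovering the
# scene x that produced them is the inverse problem described above.
print(y.shape)  # (2, 2, 3)
```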
However, it turns out solving this inverse problem is harder than expected due to its ill-posed nature.
What is an ill-posed problem
When an image is registered, there is an inherent loss of information as the 3D world gets projected onto a 2D plane. Even for us, collapsing the spatial information we get from the physical world can make it difficult to discern what we are looking at through photos.
It can be difficult to recognise objects in 2D pictures due to possible ill-posed properties, such as:
- Lack of uniqueness: Several objects can give rise to the same measurement.
- Uncertainty: Noise (e.g. blurring, pixelation, physical damage) in photos can make it difficult or impossible to reconstruct and identify an image.
- Inconsistency: slight changes in images (e.g. different viewpoints, different lighting, different scales, etc.) can make it challenging to solve for the solution, x, from available data points, y.
While computer vision tasks may, at first glance, appear superficial, the underlying problem they are trying to address is quite challenging!
Next, we discuss some deep learning-based solutions for tackling computer vision problems.
Convolutional Neural Networks (CNNs)
Convolutional Neural Networks (CNNs) are a class of deep learning algorithms that has been highly successful in solving many of the computer vision problems described above. In order to classify or identify objects in images, a CNN model first learns to recognize simple features, such as edges, corners, and textures. It does this by applying different filters to the image; these filters help the network focus on specific patterns. As the model learns, it starts recognizing more complex features, combining the simple features learned in earlier layers to create more abstract and meaningful representations. Finally, the CNN uses the learned features to classify images into the classes it has been trained on.
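To make the idea of a filter concrete, here is a minimal NumPy sketch (an illustration, not our production code) of convolving a tiny image with a vertical-edge filter, the kind of simple feature a CNN's early layers typically learn:

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide `kernel` over `image` (valid mode) and record the response."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A synthetic 6x6 grayscale image: dark left half, bright right half.
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# A vertical-edge filter: it responds where intensity changes left-to-right.
kernel = np.array([[-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0]])

response = convolve2d(image, kernel)
print(response)  # large values where the edge sits, zero in the flat regions
```

In a trained CNN these filter weights are not hand-crafted as here; they are learned from data, and later layers combine many such responses into higher-level features.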
The first CNN was proposed by Yann LeCun in 1989 (LeCun, 1989) for the recognition of handwritten digits. Since then, CNNs have evolved significantly, driven by advancements in both model architecture and available computing power. To this day, CNNs continue to prove themselves to be powerful architectures for various recognition and data analysis tasks.
Vision Transformers (ViTs)
Vision Transformers (ViTs) are a recent development in the field of computer vision that apply the concept of transformers, originally designed for natural language processing tasks, to visual data. Instead of treating an image as a 2D object, Vision Transformers view an image as a sequence of patches, similar to how transformers treat a sentence as a sequence of words.
The process starts by splitting an image into a grid of patches. Each patch is then flattened into a sequence of pixel vectors. Positional encodings are added to retain the positional information, as is done in transformers for language tasks. The transformed input is then processed through multiple layers of transformer encoders to create a model capable of understanding complex visual data.
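The patching step described above can be sketched as follows. This is a toy NumPy illustration: a real ViT applies a learned linear projection to each patch and uses learned (or sinusoidal) positional embeddings, whereas here we use raw pixels and a trivially simple positional signal.

```python
import numpy as np

def image_to_patches(image, patch_size):
    """Split an (H, W, C) image into a sequence of flattened patches."""
    h, w, c = image.shape
    patches = []
    for i in range(0, h, patch_size):
        for j in range(0, w, patch_size):
            patch = image[i:i + patch_size, j:j + patch_size, :]
            patches.append(patch.reshape(-1))  # flatten to a 1D vector
    return np.stack(patches)

# A hypothetical 8x8 RGB image split into 4x4 patches.
image = np.random.rand(8, 8, 3)
tokens = image_to_patches(image, patch_size=4)
print(tokens.shape)  # (4, 48): 4 patches, each a 4*4*3 = 48-dim vector

# Add a (deliberately simplistic) positional encoding so that patch
# order is retained when the sequence enters the transformer encoder.
positions = np.arange(tokens.shape[0])[:, None] / tokens.shape[0]
tokens_with_pos = tokens + positions
```

The resulting sequence of patch vectors plays the same role for a ViT that a sequence of word embeddings plays for a language transformer.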
Just as Convolutional Neural Networks (CNNs) learn to identify patterns and features in an image through convolutional layers, Vision Transformers identify patterns by focusing on the relationships between patches in an image. They essentially learn to weigh the importance of different patches in relation to others to make accurate classifications. The ViT model was first introduced by Google's Brain team in a 2020 paper. While CNNs dominated the field of computer vision for years, the introduction of Vision Transformers demonstrated that methods developed for natural language processing could also be used for image classification tasks, often with superior results.
One significant advantage of Vision Transformers is that, unlike CNNs, they do not have a built-in assumption of spatial locality and shift invariance. This means they are better suited for tasks where global understanding of an image is required, or where small shifts can drastically change the meaning of an image.
However, ViTs typically require a larger amount of data and compute resources compared to CNNs. This factor has led to a trend of hybrid models that combine both CNNs and transformers to harness the strengths of both architectures.
Seed classification
Background:
Canada's multi-billion-dollar seed and grain industry has established a global reputation in the production, processing, and exportation of premium-grade seeds for planting and grains for food across a diverse range of crops. This success is rooted in Canada's commitment to innovation and the development of advanced technologies, allowing for the delivery of high-quality products, with diagnostic certification, that meet both domestic and international standards.
Naturally, a collaboration was formed between a research group from the Seed Science and Technology Section and the AI Lab of the CFIA to maintain Canada's role as a reputable leader in the global seed and grain industries and their associated testing services.
Background: Quality Control
The seed quality of a crop is reflected in a grading report, whereby the final grade indicates how well a seed lot conforms with Canada's Seeds Regulations and meets minimum quality standards. Factors used to determine crop quality include contamination by weed seeds listed in Canada's Weed Seeds Order, purity analysis, germination, and disease. While germination indicates potential field performance, assessing physical purity is essential in ensuring that the crop contains a high proportion of the desired seeds and is free from contaminants, such as prohibited and regulated species, other crop seeds, or other weed seeds. Seed inspection therefore plays an important role in preventing the spread of the prohibited and regulated species listed in the Weed Seeds Order.
Canada is one of the biggest production bases for the global food supply, exporting large quantities of grains such as wheat, canola, lentils, and flax. To meet phytosanitary certification requirements and access a wide range of foreign markets, analysis of the weed seeds regulated by importing destinations is in high demand, with quick turnaround times and frequently changing requirements. Sustaining this testing capacity requires the support of advanced technologies, as traditional methods face great challenges in keeping up with demand.
Motivation
Presently, the evaluation of a crop's quality is done manually by human experts. However, this process is tedious and time consuming. At the AI Lab, we leverage advanced computer vision models to automatically classify seed species from images, rendering this process more efficient and reliable.
This project aims to develop and deploy a powerful computer vision pipeline for seed species classification. By automating this classification process, we are able to streamline and accelerate the assessment of crop quality. We build upon advanced algorithms and deep learning techniques while ensuring an unbiased and efficient evaluation of crop quality, paving the way for improved agricultural practices.
Project #1: Multispectral Imaging and Analysis
In this project, we employ a custom computer vision model to assess content purity, by identifying and classifying desired seed species from undesired seed species.
We successfully recover and identify contamination by three different weed species in a screening mixture of wheat samples.
Our model is customised to accept unique high-resolution, 19-channel multispectral image inputs and achieves greater than 95% accuracy on held-out testing data.
We further explored our model's potential to classify new species, by injecting five new canola species into the dataset and observing similar results. These encouraging findings highlight our model's potential for continual use even as new seed species are introduced.
Our model was trained to classify the following species:
- Three different thistles (weed) species:
- Cirsium arvense (regulated species)
- Carduus nutans (similar to the regulated species)
- Cirsium vulgare (similar to the regulated species)
- Six crop seeds:
- Triticum aestivum subspecies aestivum
- Brassica napus subspecies napus
- Brassica juncea
- Brassica juncea (yellow type)
- Brassica rapa subspecies oleifera
- Brassica rapa subspecies oleifera (brown type)
Our model was able to correctly identify each seed species with an accuracy of over 95%.
Moreover, when the three thistle seeds were integrated with the wheat screening, the model achieved an average accuracy of 99.64% across 360 seeds. This demonstrated the model's robustness and ability to classify new images.
Finally, we introduced five new canola species and types and evaluated our model's performance. Preliminary results from this experiment showed a ~93% accuracy on the testing data.
Project #2: Digital Microscope RGB Imaging and Analysis
In this project, we employ a 2-step process to identify a total of 15 different seed species with regulatory significance and morphological challenge across varying magnification levels.
First, a seed segmentation model is used to identify each instance of a seed in the image. Then, a classification model classifies each seed species instance.
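The two-step pipeline can be sketched as follows. The function names, stub models, and data structures here are hypothetical placeholders for illustration only, not our actual implementation:

```python
from dataclasses import dataclass

@dataclass
class SeedInstance:
    bbox: tuple        # (x, y, width, height) of one detected seed
    species: str = ""  # filled in by the classification step

def segment_seeds(image):
    """Step 1 (hypothetical stub): locate each seed instance in the image.

    A real instance-segmentation model would return one bounding box
    or mask per seed found in the image.
    """
    return [SeedInstance(bbox=(10, 10, 50, 50)),
            SeedInstance(bbox=(80, 20, 50, 50))]

def classify_seed(image, instance):
    """Step 2 (hypothetical stub): assign a species label to one seed crop."""
    return "Cirsium arvense"  # a real model predicts one of the 15 species

def classify_image(image):
    instances = segment_seeds(image)               # step 1: find every seed
    for inst in instances:
        inst.species = classify_seed(image, inst)  # step 2: label each one
    return instances

results = classify_image(image=None)  # the image itself is omitted in this sketch
print([(r.bbox, r.species) for r in results])
```

Separating detection from classification lets each model be retrained independently, for example when new species are added or a new magnification profile is introduced.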
We perform multiple ablation studies by training on one magnification profile then testing on seeds coming from a different magnification set. We show promising preliminary results of over 90% accuracy across magnification levels.
Three different magnification levels were provided for the following 15 species:
- Ambrosia artemisiifolia
- Ambrosia trifida
- Ambrosia psilostachya
- Brassica juncea
- Brassica napus
- Bromus hordeaceus
- Bromus japonicus
- Bromus secalinus
- Carduus nutans
- Cirsium arvense
- Cirsium vulgare
- Lolium temulentum
- Solanum carolinense
- Solanum nigrum
- Solanum rostratum
Images containing a mix of the 15 species were taken at varying magnification levels. The magnification level was denoted by the total number of seed instances present in the image: 1, 2, 6, 8, or 15 seeds per image.
In order to establish a standardised image registration protocol, we independently trained separate models from a subset of data at each magnification then evaluated the model performance across a reserved test set for all magnification levels.
Preliminary results demonstrated the model's ability to correctly identify seed species across magnifications with over 90% accuracy.
This revealed the model's potential to accurately classify previously unseen data at varying magnification levels.
Throughout our experiments, we tried and tested out different methodologies and models.
Advanced models equipped with a canonical form, such as Swin Transformers, fared much better and proved to be less perturbed by the magnification and zoom level.
Discussion + Challenges
Automatic seed classification is a challenging task. Training a machine learning model to classify seeds poses several challenges due to the inherent heterogeneity within and between different species. Consequently, large datasets are required to effectively train a model to learn species-specific features. Additionally, the high degree of similarity among species within some genera makes it challenging for even human experts to differentiate between closely related intra-genus species. Furthermore, the quality of image acquisition can also impact the performance of seed classification models, as low-quality images can result in the loss of important information necessary for accurate classification.
To address these challenges and improve model robustness, data augmentation techniques were applied as part of the preprocessing steps. Affine transformations, such as scaling and translating images, were used to increase the sample size, while adding Gaussian noise increases variation and improves generalization on unseen data, preventing overfitting on the training data.
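The augmentation steps described above can be sketched as follows. This is a minimal NumPy illustration (translation plus Gaussian noise; scaling is omitted for brevity), with hypothetical parameter values; in practice, libraries such as torchvision or Albumentations provide these transforms.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def augment(image, noise_std=0.05, max_shift=4):
    """Apply a random translation and additive Gaussian noise to one image."""
    # Random translation (a simple affine transform): roll the image
    # by a few pixels along the height and width axes.
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    shifted = np.roll(image, shift=(dy, dx), axis=(0, 1))
    # Additive Gaussian noise increases variation and discourages the
    # model from overfitting to pixel-exact training examples.
    noisy = shifted + rng.normal(0.0, noise_std, size=shifted.shape)
    return np.clip(noisy, 0.0, 1.0)

image = rng.random((64, 64, 3))                 # a hypothetical 64x64 RGB seed image
augmented = [augment(image) for _ in range(8)]  # 8 new training samples from one image
print(len(augmented), augmented[0].shape)
```

Each augmented copy keeps the same species label as the original image, so one labelled seed photograph yields several distinct training samples.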
Selecting the appropriate model architecture was crucial in achieving the desired outcome. A model may fail to produce accurate results if end users do not adhere to a standardized protocol, particularly when given data that falls outside the expected distribution. Therefore, it was imperative to consider various data sources and utilize a model that can effectively generalize across domains to ensure accurate seed classification.
Conclusion
The seed classification project is an example of the successful and ongoing collaboration between the AI Lab and the Seed Science group at the CFIA. By pooling their respective knowledge and expertise, both teams contribute to the advancement of Canada's seed and grain industries. The seed classification project showcases how leveraging advanced machine learning tools has the potential to significantly enhance the accuracy and efficiency of evaluating seed or grain quality in compliance with Seed and Plant Protection regulations, ultimately benefiting the agricultural industry, consumers, Canadian biosecurity, and food safety.
As Data Scientists, we recognise the importance of open-source collaboration, and we are committed to upholding the principles of open science. Our objective is to promote transparency and engagement through open sharing with the public.
By making our application available, we invite fellow researchers, seed experts, and developers to contribute to its further improvement and customisation. This collaborative approach fosters innovation, allowing the community to collectively enhance the capabilities of the SeedID application and address specific domain requirements.
Meet the Data Scientist
If you have any questions about my article or would like to discuss this further, I invite you to Meet the Data Scientist, an event where authors meet the readers, present their topic and discuss their findings.
Register for the Meet the Data Scientist event. We hope to see you there!
MS Teams – link will be provided to the registrants by email
Subscribe to the Data Science Network for the Federal Public Service newsletter to keep up with the latest data science news.