Fully automatic segmentation of bee wing images

Bee preservation is important because approximately 70% of all food crop pollination is performed by bees, a service worth more than $65 billion annually. To support this preservation, the identification of bee species is necessary.


Introduction
Bees are very important insects that inhabit various environments, such as deserts, mountains, and savannas. They are classified into different genera, species, and subspecies, each adapted to different environmental characteristics. Species identification was traditionally based on morphometric characteristics of the bee body; in recent years, however, wing characteristics have been used quite efficiently for the classification task. For this reason, several methods have been developed to perform automatic classification from bee wing images using their morphological characteristics. Some of them obtained good results in terms of computational time and reliability.
It is estimated that around 70% of all food crop pollination is done by bees (Drauschke et al., 2007). Economically, this amounts to about $65 billion annually (Pimentel et al., 1997). Additionally, bee pollination plays an important role in the preservation of ecosystems, and several plant species depend on it for survival (Drauschke et al., 2007, Michener, 2000). Thus, bees are key actors for both agribusiness and ecological preservation.
The correct identification of the bee species found in different regions and, in particular, of species that are found dead is a very important task. It allows protective measures to be taken and supports the development of environmental public policies.
Image-based recognition depends on well-defined image acquisition and processing techniques. The acquisition step is usually performed by selecting the insects under study and photographing them individually, preferably under controlled conditions to avoid noise and variations in background and lighting. The processing step is composed of several tasks that depend on the objectives of the study. In general, color images are converted to grayscale and then to binary images. The Region of Interest (ROI) is separated from the background of the image. Then, feature extraction techniques are usually applied and the features are sent to a classification algorithm. At this point, different techniques can produce very different results.
The computer vision area can offer different techniques to identify insects by images. The use of image processing and pattern recognition algorithms for automatic classification of insect species has changed the traditional manual descriptive model of morphological characteristics provided by taxonomic studies for their identification.
Only qualified taxonomists and skilled technicians can accurately identify insects using traditional models, as they require special knowledge gained through years of experience and study of insect taxonomy (Zhu and Zhang, 2010).
According to Martineau et al. (2017), a more flexible image-based insect capture and classification system can broaden this field of knowledge, as it can be used by more people.
Using computer models with automated artificial intelligence techniques, the identification of insect species can be performed by a layperson in less time than traditional models. Moreover, the new automatic classification approaches can achieve a higher accuracy than the manual ones, and be easily tested and replicated (Martineau et al., 2017).
The quality, quantity, and type of features extracted from insect images are the factors that most influence the accuracy of the classifier. There are several techniques and tools for feature extraction: some of them are generalist and others are optimized for a particular insect. However, relatively few studies in the literature are specific to bee classification.
The main objective of this paper is to propose segmentation and feature extraction techniques specific to bee wing images. It is important to mention that this work is part of a larger project in which the extracted features will be used by an automatic bee species classifier. We assume that an image segmentation approach designed specifically for bee wings may yield better results than the generalist techniques found in the literature.
This document is organized as follows: Section 2 presents the fundamental concepts related to image processing and segmentation; Section 3 presents briefly some related works; Section 4 describes the details of the approach proposed to properly segment the images; Section 5 presents the results, as well as a comparison with techniques inspired by the literature; and finally Section 6 contains the conclusions and final considerations.

Basic Concepts
This section presents the main concepts of image processing that were used in the developed segmentation approach.
Thresholding is an operation used to produce a binary image. It is the simplest method of image segmentation: starting from a color or grayscale image, each pixel is replaced with a black pixel if the image intensity I(i, j) is less than or equal to some fixed constant T (that is, I(i, j) <= T), or with a white pixel if the intensity is greater than this constant. In this work, we used an adaptive threshold, i.e., given a window size, the algorithm calculates a new threshold value for each window. This adaptive method typically produces better results, especially for images with varying illumination.
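As an illustration of the difference between a fixed and an adaptive threshold, the following minimal sketch uses plain Python lists as grayscale images. The function names, the per-pixel local mean, and the `offset` parameter are assumptions made for this example; they are not taken from the implementation used in this work.

```python
def global_threshold(image, t):
    """Return a binary image: 0 where I(i, j) <= t, 255 otherwise."""
    return [[0 if px <= t else 255 for px in row] for row in image]


def adaptive_mean_threshold(image, window=3, offset=0):
    """Threshold each pixel against the mean of its local window.

    Assumption: the threshold is recomputed for the window centred on
    every pixel, and the window is clipped at the image border.
    """
    h, w = len(image), len(image[0])
    r = window // 2
    out = [[0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            rows = range(max(0, i - r), min(h, i + r + 1))
            cols = range(max(0, j - r), min(w, j + r + 1))
            values = [image[y][x] for y in rows for x in cols]
            local_t = sum(values) / len(values) - offset
            out[i][j] = 0 if image[i][j] <= local_t else 255
    return out


# A bright, low-contrast row: the fixed threshold loses the structure,
# while the adaptive one still separates the local peak.
row = [[110, 120, 110]]
fixed = global_threshold(row, 100)
adaptive = adaptive_mean_threshold(row, window=3)
```

In this toy case the fixed threshold marks every pixel as white, whereas the adaptive version keeps only the locally brighter pixel, which is the behaviour that helps with uneven illumination.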
Dilation is one of the basic operations in mathematical morphology. Originally developed for binary images, it was extended first to grayscale images and then to complete lattices. The dilation operation usually uses a structuring element for probing and expanding the shapes contained in the input image (Silva, 2015). In our proposed approach, dilation is important to reconnect parts of the wings that were disconnected by the thresholding operation.
Erosion is the opposite of dilation. This operation removes details on objects' boundaries. It can be used, for example, to shrink an image.
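The two operations can be sketched for binary images (lists of 0/1 values) as follows. The square structuring element and the border handling (pixels outside the image are simply ignored) are assumptions of this example.

```python
def _window(image, i, j, radius):
    """Values under a square structuring element centred at (i, j)."""
    h, w = len(image), len(image[0])
    return [
        image[y][x]
        for y in range(i - radius, i + radius + 1)
        for x in range(j - radius, j + radius + 1)
        if 0 <= y < h and 0 <= x < w
    ]


def dilate(image, radius=1):
    """A pixel becomes foreground if any pixel under the element is foreground."""
    h, w = len(image), len(image[0])
    return [[1 if any(_window(image, i, j, radius)) else 0 for j in range(w)]
            for i in range(h)]


def erode(image, radius=1):
    """A pixel stays foreground only if all pixels under the element are foreground."""
    h, w = len(image), len(image[0])
    return [[1 if all(_window(image, i, j, radius)) else 0 for j in range(w)]
            for i in range(h)]


# A vein broken by thresholding is reconnected by a small dilation.
broken_vein = [[1, 0, 1]]
reconnected = dilate(broken_vein)
```

Here the one-pixel gap in `broken_vein` is closed by the dilation, which is the role dilation plays after thresholding in the proposed approach.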
Thinning is a method to draw a one-pixel-wide skeleton from a binary image while retaining the shape and structure of the full image. The Zhang-Suen thinning algorithm (Zhang and Suen, 1984) is probably the most used thinning algorithm. It works as a so-called two-pass algorithm, meaning that, for each iteration, it performs two sets of operations to remove pixels from the image. These operations are devised so that the first set removes pixels on the south-east boundaries (and north-west corners) of the shapes, and the second set removes pixels on the north-west boundaries (and south-east corners).
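A compact sketch of the two-pass scheme is given below for binary images stored as lists of 0/1 values. It follows the published Zhang-Suen conditions, but the function name and the assumption that the image has a one-pixel background border (border pixels are never examined) are choices made for this example.

```python
def zhang_suen_thinning(image):
    """Return a thinned copy of a binary image (foreground = 1)."""
    img = [row[:] for row in image]
    h, w = len(img), len(img[0])

    def neighbours(y, x):
        # P2..P9, clockwise, starting from the pixel directly above.
        return [img[y - 1][x], img[y - 1][x + 1], img[y][x + 1], img[y + 1][x + 1],
                img[y + 1][x], img[y + 1][x - 1], img[y][x - 1], img[y - 1][x - 1]]

    def transitions(n):
        # Number of 0 -> 1 patterns in the circular sequence P2, ..., P9, P2.
        ring = n + n[:1]
        return sum(a == 0 and b == 1 for a, b in zip(ring, ring[1:]))

    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            to_clear = []
            for y in range(1, h - 1):
                for x in range(1, w - 1):
                    if img[y][x] != 1:
                        continue
                    n = neighbours(y, x)
                    p2, _, p4, _, p6, _, p8, _ = n
                    if not 2 <= sum(n) <= 6 or transitions(n) != 1:
                        continue
                    if step == 0:
                        removable = p2 * p4 * p6 == 0 and p4 * p6 * p8 == 0
                    else:
                        removable = p2 * p4 * p8 == 0 and p2 * p6 * p8 == 0
                    if removable:
                        to_clear.append((y, x))
            # Pixels are cleared only after the whole pass has been evaluated.
            for y, x in to_clear:
                img[y][x] = 0
            changed = changed or bool(to_clear)
    return img


# A three-pixel-thick horizontal bar surrounded by background.
bar = [[0] * 9 for _ in range(5)]
for y in (1, 2, 3):
    for x in range(1, 8):
        bar[y][x] = 1
skeleton = zhang_suen_thinning(bar)
```

The thick bar is reduced to a single row of pixels, which is the form needed before looking for vein junctions.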
Hit-and-Miss is a general binary morphological operation that can be used to look for particular patterns of foreground and background pixels in an image. It is the basic operation of binary morphology, as almost all the other binary morphological operators can be derived from it.
As with other binary morphological operators it takes as input a binary image and a structuring element and produces another binary image as output.
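The pattern-matching behaviour of hit-and-miss can be sketched as follows. In this example a structuring element entry of 1 requires foreground, 0 requires background, and None means "don't care"; the T-shaped element is a hypothetical pattern chosen for illustration and is not one of the elements used in this work.

```python
def hit_or_miss(image, element):
    """Mark pixels whose neighbourhood matches the structuring element.

    Assumption: pixels outside the image are treated as background.
    """
    h, w = len(image), len(image[0])
    eh, ew = len(element), len(element[0])
    cy, cx = eh // 2, ew // 2
    out = [[0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            match = True
            for dy in range(eh):
                for dx in range(ew):
                    wanted = element[dy][dx]
                    if wanted is None:
                        continue
                    y, x = i + dy - cy, j + dx - cx
                    got = image[y][x] if 0 <= y < h and 0 <= x < w else 0
                    if got != wanted:
                        match = False
            out[i][j] = 1 if match else 0
    return out


# Hypothetical element describing a T-shaped junction.
t_junction = [[None, 0, None],
              [1, 1, 1],
              [None, 1, None]]

lines = [[0, 0, 0, 0, 0],
         [1, 1, 1, 1, 1],
         [0, 0, 1, 0, 0],
         [0, 0, 1, 0, 0],
         [0, 0, 0, 0, 0]]
hits = hit_or_miss(lines, t_junction)
```

Only the pixel where the vertical line meets the horizontal one matches the element. Detecting every junction shape requires one such element per shape and orientation.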

Noise Reduction and Filtering Techniques.
Image noise is defined as a random variation of brightness or color information in images. There are several kinds of noise reduction and filtering techniques. In this work, we use Gaussian, median, and bilateral filters (the last two are known as edge-preserving filters) in order to remove as much noise as possible from the input images without losing important information (Kaehler and Bradski, 2016).
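As a small illustration of why a median filter suits "salt and pepper" noise, the sketch below filters a grayscale image stored as a list of lists. The border handling (clipped windows, upper median for even counts) is an assumption of this example.

```python
def median_filter(image, radius=1):
    """Replace each pixel with the median of its (clipped) square window."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            window = sorted(
                image[y][x]
                for y in range(max(0, i - radius), min(h, i + radius + 1))
                for x in range(max(0, j - radius), min(w, j + radius + 1))
            )
            out[i][j] = window[len(window) // 2]
    return out


# An isolated bright pixel is removed...
noisy = [[100, 100, 100],
         [100, 255, 100],
         [100, 100, 100]]
cleaned = median_filter(noisy)

# ...while a sharp vertical edge is left untouched.
edge = [[0, 0, 255, 255] for _ in range(3)]
filtered_edge = median_filter(edge)
```

The isolated outlier disappears while the step edge keeps its position, which a plain averaging filter would not achieve.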

Related Work
Morphological characteristics extracted from the wings are an efficient way to classify bees (Santana et al., 2014, Francoy et al., 2008). The most effective characteristics are the wing vein junctions (Santana et al., 2014, Francoy et al., 2008); thus, some studies in the literature have taken advantage of this fact to try to automate the classification of bees (Strauss and Houck, 1994, Rojas et al., 2016, Silva, 2015) and of other winged insects, such as flies (Brkljač et al., 2012, Faria et al., 2014, Wang et al., 2011, Hatsuda et al., 2009) and wasps (Weeks et al., 1999).
Image pre-processing and segmentation are important processes for the automatic classification of bees, as they make it easier to extract features efficiently (Francoy et al., 2008) and thus improve classifier performance. Solutions that improve image pre-processing and segmentation have great potential to improve classification accuracy.
Moreover, there is an alternative approach: the use of Convolutional Neural Networks (CNN) (Schmidhuber, 2015, LeCun et al., 2015). These approaches rely on Deep Learning, which is increasingly used in studies related to image recognition (Schmidhuber, 2015). This approach can be used without the image pre-processing steps (LeCun et al., 2015, Nizam et al., 2019, Murali et al., 2019, Lim et al., 2018). We found no works comparing the use of high-quality segmented images, such as those produced by the method proposed in this study, with the use of the original images without (pre-)processing.
The next section presents an overview of our segmentation approach.

Materials and Methods
We used a dataset of 904 wing images from 48 bee species. The images were taken under different lighting conditions and at different resolutions, and were selected by a bee specialist because they pose different challenges for segmentation.
There are some clear challenges in these images, such as "salt and pepper" noise, two wings in one single image, dirty wings, and zoomed out images. Therefore, this data set allowed the evaluation of a comprehensive approach that tackles these types of problems.
The developed algorithm takes as input a bee wing image and extracts landmarks (vein junctions in the wing) and, to properly execute this task, the segmentation needs to be reliable and accurate.
The knowledge used in the specification and development of the proposed approach combines different approaches from the related literature on image segmentation, the expertise acquired by studying bee wings, and empirical experiments.
An overview of the proposed segmentation approach is described as follows, and Fig. 1 represents, graphically, each one of the steps in this approach.
It is possible to observe that the input image (Fig. 1a) has part of its wing missing, although it has good quality overall. Furthermore, the wing and the background contain little noise.
In the first step (Fig. 1b), the image is converted into grayscale to reduce the complexity of the next steps and facilitate visualization. Moreover, our approach applies two smoothing filters to the image, a bilateral filter and a median filter, in order to reduce noise.
In the second step (Fig. 1c), adaptive thresholding is used to binarize the image, reduce complexity, and find regions of interest. Despite the application of smoothing filters, noise is still present. An alternative for this step is to apply the Difference of Gaussian instead of adaptive thresholding, which is helpful to tackle image noise.
The third step is performed to remove noise and dilate the image (Fig. 1d). The idea here is to remove connected components based on their size, i.e., to remove small objects (noise) from the image, and to perform small dilations. The alternating execution of these two operations is important to ensure that unwanted components are excluded from the image while parts of the wing are preserved.
In the fourth step (Fig. 1e), the most centralized wing is cropped out of the image. This is extremely important for images with more than one wing or with large noise regions.
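To show why a difference of Gaussians responds to edges and ignores flat regions, the following sketch applies it to a one-dimensional intensity profile. The 1-D setting, the kernel truncation at three standard deviations, and the two sigma values are assumptions chosen to keep the example short; they are not the parameters used in this work.

```python
import math


def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel truncated at 3 sigma."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-(k * k) / (2 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur(signal, sigma):
    """Convolve a 1-D signal with a Gaussian, replicating the border samples."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    last = len(signal) - 1
    return [
        sum(w * signal[min(max(i + k - radius, 0), last)]
            for k, w in enumerate(kernel))
        for i in range(len(signal))
    ]


def difference_of_gaussians(signal, sigma_small=1.0, sigma_large=2.0):
    """Subtract a strongly blurred copy from a lightly blurred copy."""
    narrow = blur(signal, sigma_small)
    wide = blur(signal, sigma_large)
    return [n - w for n, w in zip(narrow, wide)]


# Intensity profile across a dark-to-bright boundary (e.g. vein to membrane).
profile = [0] * 10 + [100] * 10
dog = difference_of_gaussians(profile)
strongest = max(range(len(dog)), key=lambda i: abs(dog[i]))
```

The response is close to zero in the flat parts of the profile and largest next to the intensity step, so thresholding the response isolates edges regardless of the absolute brightness.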
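The size-based removal of connected components can be sketched as below for a binary image stored as lists of 0/1 values. The use of 8-connectivity and the `min_size` parameter are assumptions of this example; the size limits used in this work are not stated here.

```python
from collections import deque


def remove_small_components(image, min_size):
    """Clear every 8-connected foreground component with fewer than min_size pixels."""
    h, w = len(image), len(image[0])
    seen = [[False] * w for _ in range(h)]
    out = [row[:] for row in image]
    for i in range(h):
        for j in range(w):
            if image[i][j] != 1 or seen[i][j]:
                continue
            # Breadth-first search collects one connected component.
            component = []
            queue = deque([(i, j)])
            seen[i][j] = True
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            if len(component) < min_size:
                for y, x in component:
                    out[y][x] = 0
    return out


# A vein fragment (4 pixels) and an isolated noise pixel.
binary = [[1, 1, 1, 1, 0, 0],
          [0, 0, 0, 0, 0, 0],
          [0, 0, 0, 0, 0, 1]]
denoised = remove_small_components(binary, min_size=2)
```

Alternating this removal with small dilations, as described above, lets nearby wing fragments merge into larger components before the next removal pass, so that they are not mistaken for noise.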
In the fifth step (Fig. 1f), the white pixels are changed to the original grayscale value, and the black pixels are changed to a blurred pixel of the grayscale value. In summary, the output here is the grayscale image but sharpened and with less noise.
The sixth step, illustrated in Fig. 1g, essentially repeats steps 2 and 3 to get better results and then applies a Gaussian filter.
In the seventh step, we used the Zhang-Suen thinning algorithm to facilitate the task of detecting vein junctions. The result can be seen in Fig. 1h.
The last step (Fig. 1i) concerns detecting the landmarks. To do that, we used the hit-or-miss morphological operation to identify all shapes of line junctions. Furthermore, the algorithm removes some of these landmarks, based on the size of the line junction and its closeness to other junctions, because these are presumably not real landmarks.
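The need for the closeness-based filtering can be illustrated with a simplified sketch. Instead of hit-or-miss templates, this example marks every skeleton pixel with three or more foreground neighbours as a junction candidate (a simpler stand-in technique, not the method used in this work), and then keeps a candidate only if it is far enough from the candidates already kept. The `min_distance` value is hypothetical.

```python
import math


def junction_candidates(skeleton):
    """Skeleton pixels with three or more 8-connected foreground neighbours."""
    h, w = len(skeleton), len(skeleton[0])
    points = []
    for y in range(h):
        for x in range(w):
            if skeleton[y][x] != 1:
                continue
            count = sum(
                skeleton[ny][nx]
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
                if (ny, nx) != (y, x)
            )
            if count >= 3:
                points.append((y, x))
    return points


def suppress_close_points(points, min_distance):
    """Keep a point only if it is at least min_distance from all kept points."""
    kept = []
    for p in points:
        if all(math.dist(p, q) >= min_distance for q in kept):
            kept.append(p)
    return kept


# A single T-shaped vein junction on a thinned image.
skeleton = [[0, 0, 0, 0, 0, 0, 0],
            [0, 1, 1, 1, 1, 1, 0],
            [0, 0, 0, 1, 0, 0, 0],
            [0, 0, 0, 1, 0, 0, 0],
            [0, 0, 0, 0, 0, 0, 0]]
candidates = junction_candidates(skeleton)
landmarks = suppress_close_points(candidates, min_distance=3)
```

One real junction produces a small cluster of candidates, and the distance filter reduces the cluster to a single landmark. This sketch keeps the first candidate in scan order rather than the most central one, which a fuller implementation would likely refine.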
Algorithm 1 summarizes the main steps of our segmentation approach.
Input: A bee wing image I
Output: A segmented image
1 Convert I to grayscale;
2 Binarize I using adaptive thresholding;
3 Remove small connected components and apply a dilation operation;
4 Crop out the most centralized wing;
5 Apply a contrast enhancement operation;
6 Apply a Gaussian filter and repeat steps 2 and 3;
7 Apply a thinning algorithm;
Algorithm 1: Our algorithm
In the next section, the results of our proposed algorithm are presented and compared with related approaches.

Results and Discussion
First, it is important to define a few ground truth segmented images to understand what the ideal output would look like. Fig. 2 shows three manually segmented images that were used as a reference to evaluate the algorithms described in this paper (in total, 10 manually segmented images were used as ground truth). The left column presents the input images and the right column presents the corresponding manually segmented images.
The proposed approach was compared with three strategies that are frequently used in the related literature (Rojas et al., 2016) and are built from basic methods (Minichino and Howse, 2015). These strategies are briefly introduced as follows.
The first strategy, summarized in Algorithm 2, corresponds to a generic strategy for general image segmentation.
Input: A bee wing image I
Output: A segmented image
1 Perform Canny edge detection;
2 Execute contour detection;
3 Apply a thinning operation;
4 Apply corner detection;
Algorithm 2: Strategy One
Fig. 3 displays the results of strategy one for the input image (Fig. 2a). It can be noted in Fig. 3b that several points were incorrectly identified as vein junctions.
Algorithm 3 summarizes the steps of the second strategy. This algorithm extends the previous one with some more powerful methods/operators, such as the removal of connected components.
An example of the result of strategy two is shown in Fig. 4. It is possible to note that the result possibly had even more false positive points than the one in Fig. 3.
Strategy three, summarized in Algorithm 4, was inspired by a similar work (Rojas et al., 2016). The basic idea is to create pyramids in which the scale of the image changes and the image is convolved with a Gaussian function. The difference of Gaussians tends to highlight the edges in the resulting image (Moreno et al., 2009, Zahedi and Salehi, 2011).
Input: A bee wing image I
Output: A segmented image
1 Transform I into a grayscale image;
2 Apply a Gaussian filter;
3 Perform a Difference of Gaussian (Zhou et al., 2009);
4 Apply a median filter;
5 Execute a thinning algorithm;
6 Apply shape detection;
Algorithm 4: Strategy Three
Fig. 5 shows the results of strategy three with Fig. 2e as the input image. Especially at the top of the wing, where there are two slots that are almost parallel and very close to each other, the strategy mistakenly captured several junctions.
Fig. 6 displays the result of our developed approach applied to the input images (Fig. 2). Although we got some wrong junctions, our strategy returned images closer to the ground truth than the other three. We can assume the difference is considerable: the outputs of the proposed algorithm (Fig. 6) are more similar to the ground truth images than those of the other strategies. The segmentations produced by strategies one, two, and three were acceptable in the specific cases presented previously but inaccurate when the image was not of the best quality and clarity.
Thus, the segmentation of the proposed strategy was more prepared to handle images in different conditions. It also had a better feature extraction, thanks to the more reliable segmentation.
However, it's crucial to determine the difference of each strategy with metrics, to give grounds for the differences previously observed. In this study, we selected two metrics to evaluate the approaches: Modified Hausdorff Distance and F1 Score.
The modified Hausdorff distance (Marinov, 2012) was evaluated by Dubuisson and Jain (1994) and determined to be better than the other distance measures they examined for object matching. Accordingly, it was used to measure the accuracy of the segmentation produced by our algorithm compared to a ground truth segmentation.
Given two finite sets A = {a_1, ..., a_p} and B = {b_1, ..., b_q} in a metric space, the Hausdorff distance (H) (Gao et al., 2014) between the two sets is defined as:
\[ H(A, B) = \max\big( h(A, B),\; h(B, A) \big), \qquad h(A, B) = \max_{a \in A} \min_{b \in B} \lVert a - b \rVert \]
However, the Hausdorff distance is very sensitive to outlier points (Dubuisson and Jain, 1994). To reduce this sensitivity, we used the modified Hausdorff distance (MHD), which corresponds to the maximum of the arithmetic mean of the minimum distances from all points of the first set A to the second set B and the arithmetic mean of the minimum distances from all points of the second set B to the first set A. It can be defined as:
\[ \mathrm{MHD}(A, B) = \max\left( \frac{1}{p} \sum_{a \in A} \min_{b \in B} \lVert a - b \rVert,\; \frac{1}{q} \sum_{b \in B} \min_{a \in A} \lVert a - b \rVert \right) \]
We also normalized this distance to compare images of different sizes; the metric reaches its best score at 0 and its worst at 1. The results are displayed in Table 1.
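The effect of a single outlier on the two distances can be checked with a short sketch. Point sets are lists of (x, y) tuples and the Euclidean distance is assumed. The normalization applied in this work is not reproduced here.

```python
import math


def hausdorff(a, b):
    """Classical Hausdorff distance between two finite point sets."""
    def directed(s, t):
        return max(min(math.dist(p, q) for q in t) for p in s)
    return max(directed(a, b), directed(b, a))


def modified_hausdorff(a, b):
    """Modified Hausdorff distance: the directed maxima are replaced by means."""
    def directed_mean(s, t):
        return sum(min(math.dist(p, q) for q in t) for p in s) / len(s)
    return max(directed_mean(a, b), directed_mean(b, a))


# Two identical point sets, except for one outlier in the second.
set_a = [(0, 0), (1, 0)]
set_b = [(0, 0), (1, 0), (10, 0)]
h = hausdorff(set_a, set_b)              # dominated by the outlier
mhd = modified_hausdorff(set_a, set_b)   # outlier averaged with the exact matches
```

In this example the classical distance equals the full distance from the outlier to the first set (9), whereas the modified distance averages it over the three points of the second set (3), which illustrates the reduced sensitivity to outliers.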
Additionally, to facilitate the perception of the variation of the scores, we divided each value of Table 1 by its maximum row value (i.e., for the less accurate score) and multiplied by 100. Thus, in Table 2, the worst score of each image turned to 100, and the closer the value is to zero, the higher is the improvement compared to the worst case.
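The rescaling described above amounts to the following small computation. The scores in the example are hypothetical values chosen for illustration; they are not taken from Table 1.

```python
def relative_scores(row):
    """Express each score as a percentage of the worst (largest) score in the row."""
    worst = max(row)
    return [100 * value / worst for value in row]


# Hypothetical distances of four strategies on one image (lower is better).
example_row = [0.02, 0.04, 0.03, 0.01]
rescaled = relative_scores(example_row)
```

The worst strategy of the row maps to 100, and a value of 25 means that the distance is one quarter of the worst case.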
Although all the compared strategies presented relatively small modified Hausdorff distances, the developed solution was able to stand out in all the analyzed cases. This supports the premise of this work that segmentation algorithms developed specifically for bee wing segmentation could be more accurate and robust than related algorithms developed for more general purposes.
To assess the accuracy of the feature extraction, we chose the F1 score (also known as F-measure or balanced F-score), which is the harmonic mean of precision and recall. It is a widely applied measure in statistics, especially in binary classification problems (Goutte and Gaussier, 2005). The F1 score is defined as:
\[ F_1 = 2 \cdot \frac{\mathrm{precision} \cdot \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}} \]
where precision is the fraction of correctly classified instances among the retrieved instances, and recall is the fraction of positive instances that were retrieved. In this project, the vein junction was the feature of interest in the bee wing images. Thus, if the output has detected all and only the vein junctions, the score is 1. On the other hand, if the output contains only points that are not vein junctions, the score is 0.
Based on these results, we can infer that the initial assumptions are justified. Both metrics indicate the superiority of the proposed algorithm, while the other strategies obtained similar scores, with high variation depending on the image. Thus, strategies commonly used in the literature were not very accurate in the current imperfect scenario, with images containing many types of problems and noise. In contrast, the proposed algorithm was designed to be robust and to deal with the most frequent difficulties.
In the segmentation, strategies one, two, and three have problems reducing noise and identifying every part of the wing's veins. The first often loses part of the wing, the second repeatedly allows unwanted objects to connect with the wing, and the third does not have a proper solution for small connected components (noise). The proposed algorithm is better in these aspects, although, since it dilates the image, it might create connections that are not real.
Moreover, in the feature extraction step, strategies one and two miss evident vein junctions, while strategy three marks all junction points, including noise. The proposed algorithm is better because it starts by marking all junction points and then removes the junctions that do not seem to be real landmarks.
Nevertheless, even the proposed algorithm presents some limitations, especially with very dark or very bright images such as Figs. 7a and 7b in Tables 1 and 3. This specific problem can be minimized if the Difference of Gaussian is used in step c (Fig. 1c) instead of adaptive thresholding.
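A minimal sketch of how an F1 score could be computed for detected vein junctions is shown below. The greedy one-to-one matching and the `tolerance` (in pixels) used to decide whether a detection corresponds to a ground truth junction are assumptions of this example; the matching rule used in this work is not described here.

```python
import math


def match_landmarks(detected, truth, tolerance):
    """Greedily match detections to ground truth points; return (tp, fp, fn)."""
    unmatched = list(truth)
    tp = 0
    for p in detected:
        best = min(unmatched, key=lambda q: math.dist(p, q), default=None)
        if best is not None and math.dist(p, best) <= tolerance:
            unmatched.remove(best)
            tp += 1
    fp = len(detected) - tp   # detections with no ground truth partner
    fn = len(unmatched)       # ground truth junctions that were missed
    return tp, fp, fn


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall (0 when nothing is correct)."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical coordinates: two correct detections, two spurious, one missed.
truth = [(10, 10), (20, 20), (30, 30)]
detected = [(10, 11), (20, 20), (50, 50), (60, 60)]
score = f1_score(*match_landmarks(detected, truth, tolerance=2))
```

With two of four detections correct (precision 1/2) and two of three junctions found (recall 2/3), the score is 4/7, roughly 0.57.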
Although we have experimented with different species and different conditions, it is necessary to test the algorithm with even more species and different cameras to observe if the accuracy will remain similar to our results.
Besides, although we developed the algorithm to segment bee wings, the concept could be used in similar projects, since all steps are well defined and based on well-established methods or algorithms.
It is worth mentioning that this work was developed in the context of a bigger project which aims to perform the (automatic) species identification steps. Therefore, we expect the results accomplished in this article to be relevant to identifying bee species accurately.

Conclusions
In this study, we analyzed the effectiveness of an innovative approach to segment bee wing images and to perform feature extraction. We developed an automatic technique using image processing to reduce the cost and the time usually spent to identify the species.
We focused on working with flawed images, since related studies utilized "perfect" images. A significant benefit of our approach is that it deals with the challenges that unclear or flawed images might present.
The results suggest that our approach was more accurate than the related strategies, but more tests need to be performed. Regardless, the outcome of the study is positive and the result of the feature extraction is accurate. More studies are required to continue to develop a program that classifies the bee species, based on the wing shape, to help the bee species conservation efforts.
It is important to highlight that automatic segmentation of arbitrary images is not a completely solved problem in the specialized literature. Our contribution makes use of specific knowledge about the image content and, at least in the tested sets, presented very satisfactory results. In addition, this work is part of a larger project that is currently able to classify bee genera with 90% accuracy from wing images.