Weight and volume estimation of single and occluded tomatoes using machine vision

ABSTRACT The fundamental characteristics of agricultural products are appearance, size, and weight, which affect their market value and consumer preference. Thus, the food and agricultural industries seek rapid, simple, and nondestructive approaches for real-time measurement at the post-harvest stage, before packaging for the consumer market. Although sorting and grading may be performed by humans, manual inspection is unreliable, time-consuming, subjective, onerous, expensive, and easily influenced by the surroundings. Therefore, an intelligent sorting and grading method for tomato fruit is required. We evaluated two tomato configurations on a conveyor belt: single tomatoes (no occlusion) and multiple tomatoes (partially occluded). We used polygon approximation with concave and convex point extraction to segment the occluded tomatoes. We developed seven regression models using single-tomato image features. The Bayesian regularization artificial neural network outperformed all the other trained models in weight estimation, with a root-mean-square error (RMSE) of 1.468 g and R² of 0.971. For volume estimation, the RBF SVM performed best, with R² of 0.982 and RMSE of 1.2683 cm³. The proposed system is feasible to implement as a noninvasive in-line sorting technique for tomatoes.


INTRODUCTION
Tomatoes are among the world's most highly produced and consumed fruits. The tomato is an essential horticultural plant and the most exported fleshy fruit globally. [1][2][3] Of the world's total production of 180 million tons of tomatoes in 2019, China produced 35% (approximately 62 million tons), making it the top producer, followed by India, Turkey, and the United States of America, respectively. [4] Tomatoes are native to South America, Central America, and Mexico and can now be grown worldwide, especially in temperate climates and greenhouses. Thus, there is a need to automate the sorting and grading techniques for tomato processing and quality inspection before consumption. [5] In fruits, morphometric measurements, such as weight, size, density, surface area, and volume, are correlated and essential. [6,7] In the grading and sorting of vegetables and fruits, these measurements are often used as crucial descriptors. [8] Fruit size and weight directly influence storage and transport costs, eventually influencing sales and marketing prices. [9,10] Fruit volume and density relate to fruit consistency, carbon consumption, and water content, which are used to forecast harvesting periods. [7,11] The average density, critical in identifying hidden defects such as frost and internal damage, can be determined when the weight is combined with the volume. [9]

The volume of each tomato was determined by the drainage method based on Archimedes' principle (Lang and Thorpe, 1989) using a graduated cylinder (±1 cm³).
RGB images of tomato fruits were obtained using a digital camera (Panasonic Lumix DMC-FZ40, Panasonic Corporation, Osaka, Japan) installed 1.5 m directly above the imaging stage, with its optical axis perpendicular to the stage. The images were taken against a dark background under fluorescent lighting. The data were transferred to external storage for subsequent analysis. The marked tomato samples were placed on a conveyor belt that served as the scanning stage (0.30 m by 0.30 m), as shown in Figure 1. The conveyor line was a laboratory-built replica of an online production system, as shown in Figure 1(b). The conveyor speed was set at 0.017 m s⁻¹ to synchronize with the image acquisition rate of 10 frames per minute.
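The relationship between the conveyor speed, the frame rate, and the size of the scanning stage can be checked with a few lines of arithmetic. The sketch below uses only the three values quoted above; the variable names are illustrative, and the interpretation (how many frames a tomato receives while crossing the stage) is an assumption rather than something stated by the authors.

```python
# Values quoted in the text.
belt_speed_m_per_s = 0.017   # conveyor speed
frames_per_minute = 10       # image acquisition rate
stage_length_m = 0.30        # side length of the scanning stage

# Time between two consecutive frames.
seconds_per_frame = 60 / frames_per_minute

# Distance the belt moves between two consecutive frames.
travel_per_frame_m = belt_speed_m_per_s * seconds_per_frame

# Number of frame intervals that fit into one pass across the stage.
frames_per_stage = stage_length_m / travel_per_frame_m

print(f"belt travel between frames: {travel_per_frame_m:.3f} m")
print(f"frames while crossing the stage: {frames_per_stage:.2f}")
```

Under these numbers the belt advances roughly 0.10 m between frames, so a tomato is imaged about three times while it crosses the 0.30 m stage.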
Two image-capture modes, defined by the tomatoes' configuration (single and partially occluded), were used in the imaging stage, as displayed in Figure 2: (a) single-tomato images (M1), used for model creation, were acquired in the scanning stage at three different orientations; (b) images of partially occluded tomatoes (M2). Nine images were captured for each tomato in M1, three for every orientation and position in the imaging stage. The first orientation had the calyx facing upward, the second had the calyx facing downward, and the third had the calyx's longitudinal axis parallel to the surface of the imaging stage. A total of 1500 images was acquired in M1. The cluster sizes (the number of partially occluded tomatoes) in M2 were increased from 3 to 15 tomatoes per image (the number of tomatoes in each cluster was increased by one tomato at a time, from 3 to 14 tomatoes per cluster). Two hundred images (50 tomatoes) were obtained for the partially occluded tomatoes (creating different patterns, as in Figure 2).

Image pre-processing and volume/weight prediction model
This study's objective was to implement an automated algorithm for the segmentation of tomatoes and the prediction of their weight and volume under partial occlusion. First, tomatoes in M1 were used to establish weight and volume estimation models based on features extracted from pre-processed RGB images. Second, a polygon approximation-based tomato-splitting technique was applied to M2. Third, the weight and volume of the split tomatoes were predicted using the models developed in the first step. The tomato-splitting algorithm was assessed by comparing the total number of tomatoes in a cluster with the count of image objects after separation. In addition, a comparative accuracy analysis was made of the estimated volume after separation against the actual volume from the water displacement method, and of the estimated weight against the actual weight. Figure 3 displays the algorithmic flow of the introduced tomato weight and volume prediction system.

Image preprocessing algorithm
The raw RGB images were analyzed using the steps below:
Step 1: Because the background and the camera were static, image subtraction was used to remove the background: $S_{x,y} = B_{x,y} - T_{x,y}$, where $S_{x,y}$ is the resultant image from background separation, $B_{x,y}$ is the original image, and $T_{x,y}$ is the background image.
Step 2: The images were smoothed with a Gaussian kernel filter: $F_{x,y} = \sum_{s}\sum_{t} h_{s,t}\, S_{x-s,\,y-t}$, where $F_{x,y}$ is the resultant smoothed image and $h_{s,t} = \frac{1}{2\pi\sigma^{2}}\, e^{-\frac{1}{2}\left(\frac{s^{2}+t^{2}}{\sigma^{2}}\right)}$ is the Gaussian filter kernel.
Step 3: Binarization was subsequently applied using Otsu's method [25] to transform the image into a binary image.
Step 4: A morphological opening was performed using a disk structural element of size 9 pixels to eliminate holes in the image object. After morphological opening, $A_{x,y}$ was the resultant image, $E$ the structural element, and $z$ the translation.
Step 6: A region filter based on area was then applied so that only the tomato objects remained in the image.
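Two of the steps above, background subtraction and Otsu binarization, can be illustrated on a toy grayscale image. This is a minimal sketch in plain Python, not the authors' implementation: the function names and the 4 x 4 test image are hypothetical, the absolute difference is used instead of the signed subtraction so that pixel values stay non-negative (an assumption), and the Gaussian smoothing and morphological opening steps are omitted for brevity.

```python
def subtract_background(image, background):
    """Step 1 (sketch): per-pixel absolute difference to a static background."""
    return [[abs(p - b) for p, b in zip(row_i, row_b)]
            for row_i, row_b in zip(image, background)]

def otsu_threshold(image, levels=256):
    """Step 3 (sketch): threshold that maximises the between-class variance."""
    hist = [0] * levels
    for row in image:
        for p in row:
            hist[p] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(levels):
        w0 += hist[t]                  # pixels with value <= t
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0                # pixels with value > t
        if w1 == 0:
            break
        mu0, mu1 = sum0 / w0, (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

# Hypothetical 4 x 4 scene: dark background (10) with a bright 2 x 2 object (200).
background = [[10] * 4 for _ in range(4)]
image = [row[:] for row in background]
for r in (1, 2):
    for c in (1, 2):
        image[r][c] = 200

separated = subtract_background(image, background)
threshold = otsu_threshold(separated)
mask = [[1 if p > threshold else 0 for p in row] for row in separated]
object_area = sum(sum(row) for row in mask)   # pixel count of the foreground
print(threshold, object_area)
```

The pixel count of the mask is the quantity that a subsequent area-based region filter would compare against a minimum object size.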

Weight and volume prediction model
For tomato weight and volume estimation, we explored seven regression models: Support Vector Regression (SVR) with linear, quadratic, cubic, and RBF (Gaussian) kernels, and artificial neural networks (ANNs) trained with the Levenberg-Marquardt, Bayesian regularization, and scaled conjugate gradient algorithms. These models were established from the features extracted from the training dataset using a 10-fold cross-validation-based parameter search and subsequently evaluated on the testing dataset. For model evaluation, the whole dataset (1500 images) was split into a training dataset (70%) and a testing dataset (30%). Images for feature extraction were selected manually, on the condition that the tomatoes did not touch the edge of the image as a result of their movement during image capture. Feature extraction: The precision and performance of any machine learning model depend significantly on the feature variables used for training. [10] Only 2D features were extracted from the RGB images: area $\Upsilon_A$, perimeter $\Upsilon_P$, eccentricity $\Upsilon_E$, major-axis length $\Upsilon_1$, minor-axis length $\Upsilon_2$, and radial distance $\Upsilon_D$. These extracted features are summarized in Table 1.
$\Upsilon_A$ (Eq. (4)) is the number of pixels in an image object, i.e., the projected area. This feature has been extensively applied in weight and volume estimation for several agricultural products. [9,10] $\Upsilon_P$ (Eq. (5)) is the number of pixels around the closed contour of a 2D object. If the closed contour has $N$ vertices, the contour can be described as $C = \{(x_n, y_n)\}$, where $n = 1, 2, \ldots, N$ and $(x_1, y_1) = (x_N, y_N)$; $\Upsilon_P$ is then defined by Eq. (6). $\Upsilon_E = \Upsilon_1 / \Upsilon_2$ equals the ratio of the eigenvalues ($\Upsilon_1$ and $\Upsilon_2$) of the covariance matrix of a fitted ellipse. [9,10,26] $\Upsilon_1$ (Eq. (7)) is the pixel distance between the major-axis endpoints, whose coordinates are $(X_1, Y_1)$ and $(X_2, Y_2)$; the result is the Euclidean distance between the two points. $\Upsilon_2$ (Eq. (8)) is the pixel distance between the minor-axis endpoints, whose coordinates are $(M_1, N_1)$ and $(M_2, N_2)$; the result is again the Euclidean distance between the two points. $\Upsilon_D$ (Eq. (9)) is the average distance from the center of gravity of an image object to its boundary points. The radial distance is measured from the central point (centroid) of the object to each pixel $(x(n), y(n))$ on the boundary. Regression models: ANNs are non-linear statistical models whose learning is a function optimization problem: finding the network parameters that minimize the network error. [10,27] ANNs can respond to previously unseen inputs and require less formal statistical training. Training an ANN amounts to minimizing a loss function with an optimization algorithm, and training ends as soon as the optimization algorithm satisfies a particular requirement or stopping criterion. [9,10,27] ANNs essentially classify inputs into a set of target categories. The optimization (training) algorithms vary in speed, precision, and memory complexity.
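The geometric features described above can be computed directly from a closed contour. The following is a minimal sketch under stated assumptions: the contour is given as a list of vertices whose last vertex repeats the first, the area is computed with the shoelace formula as a stand-in for the pixel count used in the paper, the centroid is taken as the mean of the boundary points, and the axis endpoints are supplied by hand. All function names and the rectangular test contour are hypothetical.

```python
import math

def perimeter(contour):
    """Sum of Euclidean distances between successive vertices of a closed contour."""
    return sum(math.dist(p, q) for p, q in zip(contour, contour[1:]))

def polygon_area(contour):
    """Shoelace area; a stand-in for the pixel-count area of the paper."""
    twice = sum(x1 * y2 - x2 * y1
                for (x1, y1), (x2, y2) in zip(contour, contour[1:]))
    return abs(twice) / 2

def axis_length(end_a, end_b):
    """Euclidean distance between the two endpoints of an axis."""
    return math.dist(end_a, end_b)

def radial_distance(contour):
    """Mean distance from the centroid of the boundary points to the boundary."""
    boundary = contour[:-1]                      # drop the repeated closing vertex
    cx = sum(x for x, _ in boundary) / len(boundary)
    cy = sum(y for _, y in boundary) / len(boundary)
    return sum(math.dist((cx, cy), p) for p in boundary) / len(boundary)

# Hypothetical 4 x 2 rectangle, closed by repeating the first vertex.
contour = [(0, 0), (4, 0), (4, 2), (0, 2), (0, 0)]
area = polygon_area(contour)
perim = perimeter(contour)
major = axis_length((0, 1), (4, 1))
minor = axis_length((2, 0), (2, 2))
axis_ratio = major / minor                       # ratio form of the eccentricity
radial = radial_distance(contour)
print(area, perim, major, minor, axis_ratio, round(radial, 3))
```

On a real binary mask the contour and the axis endpoints would come from a contour tracer and an ellipse fit, which are outside the scope of this sketch.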
In this study, three ANNs were explored. Each was a feed-forward network with one hidden layer and a tangent sigmoid activation function, trained by backpropagation with one of three training algorithms: Bayesian regularization, scaled conjugate gradient, or Levenberg-Marquardt.
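To make the network structure concrete, the sketch below runs a forward pass through a feed-forward network with six inputs, ten hidden neurons, and one output, using the hyperbolic tangent (tangent sigmoid) in the hidden layer. The weights are random placeholders, the input values are hypothetical normalized features, and the linear output layer is an assumption; none of the three training algorithms is implemented here.

```python
import math
import random

N_INPUTS, N_HIDDEN = 6, 10   # six shape features, ten hidden neurons

def forward(features, w_hidden, b_hidden, w_out, b_out):
    """One forward pass: tanh hidden layer followed by a linear output neuron."""
    if len(features) != N_INPUTS:
        raise ValueError("expected six input features")
    hidden = [math.tanh(sum(w * x for w, x in zip(row, features)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out

# Placeholder (untrained) parameters, seeded for reproducibility.
rng = random.Random(0)
w_hidden = [[rng.uniform(-1, 1) for _ in range(N_INPUTS)] for _ in range(N_HIDDEN)]
b_hidden = [rng.uniform(-1, 1) for _ in range(N_HIDDEN)]
w_out = [rng.uniform(-1, 1) for _ in range(N_HIDDEN)]
b_out = rng.uniform(-1, 1)

# Hypothetical normalized features in the order:
# area, perimeter, eccentricity, major axis, minor axis, radial distance.
features = [0.2, 0.5, 0.1, 0.6, 0.4, 0.3]
estimate = forward(features, w_hidden, b_hidden, w_out, b_out)
print(estimate)
```

Training would adjust the 81 parameters of this network (60 + 10 hidden weights and biases, 10 + 1 output weights and bias) to minimize the error between the output and the measured weight or volume.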
SVR is an extension of the SVM that is employed in high-dimensional spaces to solve regression problems. SVR is considered a non-parametric statistical learning algorithm based on kernel functions. [28] The strengths of SVR models are that they are easy to train, are explicitly controlled, offer good generalization performance (accuracy), provide a direct geometrical interpretation, have no local optima, are mathematically tractable, and avoid overfitting without requiring a massive set of training samples. [9,29,30] Similarly, SVR models rely only on a subset of the training dataset, since the cost function does not consider the training points outside the margin during model construction, which helps to prevent overfitting. [31] SVR is typically posed as a quadratic programming (QP) problem that distinguishes support vectors from the other training data vectors. [31] The kernel functions, which map the input data into the required form, are used to solve this quadratic problem. Thus, the overall performance of an SVR model is greatly affected by the choice of kernel function. [9,10,26,29,32,33] For this study, we applied the linear (Eq. (10)), polynomial (Eq. (11)), and Gaussian (Eq. (12)) kernel functions. Table 2 describes the SVR kernel functions.
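As an illustration of the three kernel families named above, the sketch below implements their common textbook forms. The exact parameterization used in the paper's Eqs. (10) to (12) is not recoverable from the text, so the parameter names `degree`, `coef0`, and `gamma` and their default values are assumptions.

```python
import math

def dot(x, y):
    """Inner product of two feature vectors."""
    return sum(a * b for a, b in zip(x, y))

def linear_kernel(x, y):
    """K(x, y) = x . y"""
    return dot(x, y)

def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """K(x, y) = (x . y + coef0) ** degree; degree 2 is quadratic, 3 is cubic."""
    return (dot(x, y) + coef0) ** degree

def gaussian_kernel(x, y, gamma=1.0):
    """K(x, y) = exp(-gamma * ||x - y||^2), also called the RBF kernel."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

u, v = [1.0, 2.0], [3.0, 4.0]
print(linear_kernel(u, v))
print(polynomial_kernel(u, v, degree=2))
print(polynomial_kernel(u, v, degree=3))
print(gaussian_kernel(u, v, gamma=0.5))
```

The quadratic and cubic SVR models in the study correspond to polynomial kernels of degree 2 and 3, respectively.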

Tomato splitting algorithm
Table 1 (recoverable entries). Major-axis length: $\Upsilon_1 = \sqrt{(X_2 - X_1)^2 + (Y_2 - Y_1)^2}$; minor-axis length: $\Upsilon_2 = \sqrt{(M_2 - M_1)^2 + (N_2 - N_1)^2}$. The contour of a 2D image $I(x, y)$ is defined by $C = \{(x_n, y_n)\}$, $n = 1, 2, \ldots, N$; $\{(X_1, Y_1), (X_2, Y_2)\}$ and $\{(M_1, N_1), (M_2, N_2)\}$ are the endpoint coordinates of the major axis and minor axis, respectively.

Table 2. Descriptions of the kernel properties (columns: model parameters, kernel equations; rows: linear, polynomial, Gaussian).

Concave and convex points' extraction: The concave point detection method was applied to fit an object boundary with a sequence of line segments (polygon approximation). Concave points are divided into two types: obvious concave points and unobvious concave points. Obvious concave points connect two tomatoes and can easily be detected by angle features since their gradients change sharply, whereas the gradient change at an unobvious concave point is insignificant. [34]

Polygon approximation: Polygon approximation represents an object's contour by a sequence of dominant points. The method detects dominant points on a contour by suppressing redundant points. Its advantages are that it smooths the object contours, reduces complexity and calculation time, and avoids false concave point detection. [35,36] Given a sequence of extracted contour points $M = \{m_1, m_2, \ldots\}$, collinear suppression determines its dominant points: every contour point $m_i(a_i, b_i)$ is examined for collinearity with the previous and next contour points. The point $m_i$ is considered a dominant point if it does not lie on the line connecting $m_{i-1}(a_{i-1}, b_{i-1})$ and $m_{i+1}(a_{i+1}, b_{i+1})$ and its distance $n_i$ to that line exceeds the pre-set threshold, $n_i > n_{th}$. This distance is the perpendicular distance from $m_i$ to the line, $n_i = \frac{\left|(a_{i+1} - a_{i-1})(b_{i-1} - b_i) - (a_{i-1} - a_i)(b_{i+1} - b_{i-1})\right|}{\sqrt{(a_{i+1} - a_{i-1})^2 + (b_{i+1} - b_{i-1})^2}}$. After polygon approximation and detection of the dominant points, the concave points can be identified by the techniques suggested by Bai et al. [24] and Zhang et al.
[37] In the method adopted by Bai et al., [24] the dominant point $m_{n,i} \in M_{dom}$ is defined as concave if the concavity value of $m_{n,i}$ is within the range of $x_1$ to $x_2$ and the line connecting $m_{n,i+1}$ and $m_{n,i-1}$ does not pass through the inside of the object.
The concavity value of $m_{n,i}$ is defined as the angle between the lines $(m_{n,i-1}, m_{n,i})$ and $(m_{n,i+1}, m_{n,i})$ as follows: $x(m_{n,i-1}, m_{n,i}) - x(m_{n,i+1}, m_{n,i})$, where $x(m_{n,i-1}, m_{n,i}) = \tan^{-1}\left((b_{n,i-1} - b_{n,i}) / (a_{n,i-1} - a_{n,i})\right)$ and $x(m_{n,i+1}, m_{n,i}) = \tan^{-1}\left((b_{n,i+1} - b_{n,i}) / (a_{n,i+1} - a_{n,i})\right)$. In the technique proposed by Zhang et al., [37] the dominant point $m_{n,i} \in M_{dom}$ is considered to be a concave point if $\overrightarrow{m_{n,i-1} m_{n,i}} \times \overrightarrow{m_{n,i} m_{n,i+1}}$ is positive: $M_{con} = \{ m_{n,i} \in M_{dom} : \overrightarrow{m_{n,i-1} m_{n,i}} \times \overrightarrow{m_{n,i} m_{n,i+1}} > 0 \}$. Both Bai et al. [24] and Zhang et al. [37] define the threshold value $n_{th}$ manually. For this study, we employed the approach suggested by Zafari et al. [36] to extract both concave ($C_x$) and convex ($C_v$) points. This technique combines the approaches of Zhang et al. [37] and Prasad et al. [38] for parameter-independent polygon approximation and dominant point detection. In this approach, the line connecting $m_{i-1}(a_{i-1}, b_{i-1})$ and $m_{i+1}(a_{i+1}, b_{i+1})$ is first digitized, and the threshold value $n_{th}$ is then selected automatically based on the angular distance between the slope of the digitized line and that of the actual line. This makes the concave point detection fully parameter-free. [36] Both the $C_x$ and $C_v$ points were extracted for all the images under partial occlusion, for the inner and the outer contours separately, as seen in Figure 4. Therefore, each image was tested for inner and outer contours using the small-hole filling technique.
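The two stages described above, collinear suppression followed by cross-product classification of the dominant points, can be sketched as follows. This is an illustrative simplification, not the parameter-free method of Zafari et al.: the distance threshold is set by hand, and the sign test is made independent of the traversal direction by comparing the cross product with the sign of the polygon's signed area (for a clockwise traversal in Cartesian axes this reduces to the "positive means concave" rule quoted from Zhang et al.). The function names and the notched test polygon are hypothetical.

```python
import math

def point_line_distance(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    length = math.hypot(bx - ax, by - ay)
    if length == 0:
        return math.hypot(px - ax, py - ay)
    return abs((bx - ax) * (ay - py) - (ax - px) * (by - ay)) / length

def dominant_points(contour, threshold):
    """Keep points whose distance to the line through their neighbours exceeds threshold."""
    n = len(contour)
    return [contour[i] for i in range(n)
            if point_line_distance(contour[i], contour[i - 1], contour[(i + 1) % n]) > threshold]

def signed_area(polygon):
    """Positive for counterclockwise traversal in Cartesian axes."""
    n = len(polygon)
    return 0.5 * sum(polygon[i][0] * polygon[(i + 1) % n][1]
                     - polygon[(i + 1) % n][0] * polygon[i][1] for i in range(n))

def split_concave_convex(polygon):
    """Classify vertices by the sign of the cross product of adjacent edge vectors."""
    orientation = 1 if signed_area(polygon) > 0 else -1
    concave, convex = [], []
    n = len(polygon)
    for i in range(n):
        (x0, y0), (x1, y1), (x2, y2) = polygon[i - 1], polygon[i], polygon[(i + 1) % n]
        cross = (x1 - x0) * (y2 - y1) - (y1 - y0) * (x2 - x1)
        (concave if cross * orientation < 0 else convex).append(polygon[i])
    return concave, convex

# Hypothetical contour: a box with a notch at (2, 1) and a redundant point at (2, 0).
contour = [(0, 0), (2, 0), (4, 0), (4, 3), (2, 1), (0, 3)]
dominant = dominant_points(contour, threshold=0.5)
concave, convex = split_concave_convex(dominant)
print(dominant)
print(concave, convex)
```

In this toy contour the collinear point (2, 0) is suppressed and the notch at (2, 1) is the only concave point, which is where a split line between two touching objects would start.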
Split-line extraction: The $C_x$ and $C_v$ points were used to draw the separating line in the images (splitting two partially occluded tomatoes). Splitting was carried out in three phases. First, a k-nearest-neighbor (KNN) search was carried out for each $C_v$ point to find its corresponding $C_v$ points, based on a threshold on the Euclidean distance between the two points. If the Euclidean distance between a point $C_{vk}$ and its neighbor $C_{vl}$ was greater than the threshold, the points were not regarded as neighbors. Second, another KNN search was conducted to find the corresponding $C_x$ points for the remaining $C_v$ points. Lastly, a further KNN search was performed to find, based on the same threshold, the corresponding $C_x$ neighbors of the remaining $C_x$ points. The KNN search is carried out as follows: given that a point $T$ is distinctly connected to another (nearest) point $T_j \in W_{x,y}$ with respect to the Euclidean distance between them, the nearest neighbor of $T$ is calculated as $T^{*} = \arg\min_{T_j \in W_{x,y}} \lVert T - T_j \rVert$, where $T^{*}$ is the calculated nearest neighbor of $T$. For each $C_x$ and $C_v$ point and its corresponding neighbor, a split line was drawn linking the two points, as displayed in Figure 5, thus detaching two touching tomatoes.
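The core of the split-line step is a nearest-neighbor search limited by a distance threshold. The sketch below shows that idea in a single pass over one set of points; it does not reproduce the three-phase ordering (convex to convex, concave to convex, concave to concave) described above. The function names, the point coordinates, and the threshold value are hypothetical.

```python
import math

def nearest_neighbor(point, candidates, max_distance):
    """Return the closest candidate within max_distance, or None if there is none."""
    best, best_distance = None, max_distance
    for candidate in candidates:
        distance = math.dist(point, candidate)
        if candidate != point and distance <= best_distance:
            best, best_distance = candidate, distance
    return best

def pair_split_points(points, max_distance):
    """Greedily pair each point with its nearest unused neighbor to form split lines."""
    unused = list(points)
    split_lines = []
    while unused:
        point = unused.pop(0)
        partner = nearest_neighbor(point, unused, max_distance)
        if partner is not None:
            unused.remove(partner)
            split_lines.append((point, partner))
    return split_lines

# Hypothetical concave points (pixel coordinates) found on a cluster contour.
points = [(10, 12), (10, 20), (40, 15), (41, 22), (90, 90)]
split_lines = pair_split_points(points, max_distance=10)
print(split_lines)
```

Each returned pair defines the two endpoints of a line that would be drawn in the binary mask to detach two touching objects; the isolated point at (90, 90) has no partner within the threshold and produces no line.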

Multi-tomato volume and weight estimation
This paper aimed to efficiently and accurately determine the weight and volume of each tomato in a partially occluded tomato cluster based on the developed volume and weight prediction models. The accuracies of the estimated volume and weight were compared with the drainage (water displacement) method and the measured weight, respectively, in terms of RMSE, R-squared (R²), and relative error. Moreover, a dependent (paired) t-test was conducted to evaluate the differences between the actual weight $W_0$, the M1 estimated weight $W_1$, and the M2 estimated weight $W_2$, and the differences between the drainage method volume $V_0$, the M1 estimated volume $V_1$, and the M2 estimated volume $V_2$. For each model, two t-test analyses were performed: for weight estimation, $W_0$ against $W_1$ in M1 and $W_0$ against $W_2$ in M2; for volume estimation, $V_0$ against $V_1$ in M1 and $V_0$ against $V_2$ in M2. A dependent t-test compares the means of two related groups on the same continuous dependent variable. Data normality was assessed using the skewness value and its standard error before the t-test. [39] No data transformation was done for this research. All statistical analyses were computed using the Statistical Package for Social Sciences (IBM SPSS).
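The evaluation metrics named above have simple closed forms. The sketch below implements RMSE, the coefficient of determination, the relative error, and the paired t statistic using only the standard library. The sample values are hypothetical and are not data from the study; using the coefficient of determination for R² is an assumption, since the paper may instead report the squared correlation coefficient.

```python
import math
import statistics

def rmse(actual, predicted):
    """Root-mean-square error between measured and estimated values."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = statistics.fmean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

def relative_error_percent(actual, predicted):
    """Absolute relative error of one estimate, in percent."""
    return abs(predicted - actual) / actual * 100

def paired_t_statistic(x, y):
    """t statistic and degrees of freedom of a dependent (paired) t-test."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    t = statistics.fmean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return t, n - 1

# Hypothetical measured (w0) and estimated (w1) weights in grams.
w0 = [120.0, 150.0, 98.0, 210.0, 175.0]
w1 = [118.5, 152.0, 99.0, 207.0, 176.5]
print(rmse(w0, w1), r_squared(w0, w1))
print([round(relative_error_percent(a, p), 2) for a, p in zip(w0, w1)])
print(paired_t_statistic(w0, w1))
```

The t statistic would then be compared against the critical value of the t distribution for the given degrees of freedom, which a statistics package such as SPSS reports as a p-value.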

Dataset statistical description
In Table 3, descriptive statistics of the whole dataset are presented in terms of central tendency metrics and variability (spread) metrics for $W_0$ and $V_0$. For the 300 tomatoes, the mean weight was 163.18 g with a median of 121.50 g, ranging from 36.00 g (minimum) to 507.20 g (maximum), while the mean volume was 154.69 cm³ with a median of 114.00 cm³, ranging from 30.00 cm³ (minimum) to 505.00 cm³ (maximum). The weight dataset had a skewness of 0.94 g at a standard error of 6.33 g, while the volume dataset had a skewness of 0.94 cm³ and a standard error of 6.36 cm³.

Evaluation of the weight and volume estimation models
The model parameters for the SVR and the ANN are provided in Table 4. These parameters help enhance the overall accuracy and efficiency of the models. The SVR kernel function parameters were calculated iteratively through the 10-fold cross-validation-based parameter search during the model training phase. Cross-validation aims to find the most accurate kernel parameters while avoiding overfitting and underfitting and providing an acceptable generalization capability. Likewise, the number of neurons in the hidden layer was set to 10 (6-10-1 topology) for all the ANNs, a value chosen during model validation to balance computational complexity against the error term. The models were then applied to the testing dataset to determine their precision. Table 5 displays the models' accuracy in terms of R² and RMSE on the testing dataset.
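The data partitioning behind this evaluation, a 70/30 split of the 1500 images followed by 10-fold cross-validation on the training part, can be sketched with index lists. The function names, the random seed, and the interleaved fold assignment are assumptions; the paper does not state how its folds were drawn.

```python
import random

def train_test_split_indices(n_samples, train_fraction=0.7, seed=0):
    """Shuffle sample indices and split them into training and testing parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]

def k_fold_indices(indices, k=10):
    """Yield (training, validation) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, validation

train_idx, test_idx = train_test_split_indices(1500, train_fraction=0.7)
splits = list(k_fold_indices(train_idx, k=10))
print(len(train_idx), len(test_idx), len(splits))
print(len(splits[0][0]), len(splits[0][1]))
```

In a parameter search, each candidate setting (for example a kernel scale) would be trained on the training part of every fold and scored on the matching validation part, and the setting with the best average score would be refitted on all 1050 training images before the final test on the held-out 450.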
The Bayesian regularization ANN had the best regression results for weight estimation, with an R² of 0.971 at an RMSE of 1.468 g, while the linear SVM had the lowest R² of 0.813 at an RMSE of 3.841 g. For volume estimation, the RBF SVM had the best regression results, with an R² of 0.982 at an RMSE of 1.283 cm³, while the linear SVM had the lowest R² of 0.835 at an RMSE of 2.917 cm³. Figure 6 displays linear regression plots of $W_0$ against the estimated weight $W_1$ and of $V_0$ against the estimated volume $V_1$ for the best performing models.
It was also observed that the average relative error for $W_1$ in M1 was 4.78%, with a maximum and minimum of 5.04% and 0.47%, respectively. The average relative error for $V_1$ in M1 was 3.18%, with a maximum and minimum of 4.27% and 0.96%, respectively, as shown in Table 6.

Evaluation of the tomato splitting algorithm
The efficiency of the splitting algorithm was analyzed through the splitting accuracy (object count) and the accuracy of $W_2$ and $V_2$ in comparison to $W_0$ and $V_0$ for the same tomatoes. The splitting technique correctly segmented all the tomato images in clusters, i.e., the number of image objects after splitting was equal to the number of tomatoes in all the images for M2. Moreover, to evaluate how the accuracy of the splitting process varied with cluster size, the average relative error (in the estimation of $W_2$ and $V_2$) was assessed for each cluster size, as displayed in Figure 7. The results demonstrated that the introduced system's performance was invariant to the number of tomatoes in a cluster. In the estimation of $W_2$, the maximum and minimum mean relative errors of 2.632% and 1.324% were reported when the cluster sizes were 8 and 6 tomatoes, respectively. In the estimation of $V_2$, the maximum and minimum mean relative errors of 3.088% and 1.435% were reported when the cluster sizes were 10 and 3 tomatoes, respectively. A dependent t-test established that there was no statistically significant difference between $V_0$ and $V_1$ at the 0.05 significance level. There was, however, a statistically significant difference ($p < 0.05$) between $V_0$ and $V_2$. It can be seen in Table 7 that there was a mean decrease in $V_2$ of 2.822 cm³ compared to $V_0$. For weight, there was no statistically significant difference at the 0.05 significance level between $W_0$ and $W_2$ or between $W_1$ and $W_2$, as seen in Table 8.

Table 4 (recoverable entries). ANN topology (input - hidden layer - output): Bayesian regularization ANN, 6-10-1, 6-10-1; Levenberg-Marquardt ANN, 6-10-1, 6-10-1; scaled conjugate gradient, 6-10-1, 6-10-1.

DISCUSSION
This paper proposes a computer vision-based automated tomato volume and weight prediction system implemented in a tomato production-line system. An RGB image preprocessing algorithm was introduced to segment tomatoes from their background. Afterward, the image feature extraction algorithm was applied to the tomato images. According to prior research [9,13,14] on volume and weight determination of tomatoes, geometric shape features are generally favored because they are accurate and descriptive. Thus, in our research, tomato image features were represented by six standard geometric shape parameters: $\Upsilon_A$, $\Upsilon_P$, $\Upsilon_E$, $\Upsilon_1$, $\Upsilon_2$, and $\Upsilon_D$. The developed models are compared in Table 5, where the Bayesian regularization ANN surpassed the other ANN and SVM models in weight estimation, and the RBF SVM performed better than the other SVM and ANN models in volume estimation. This can be attributed to the Bayesian ANN's need for less statistical data and its better optimization, and to the RBF SVM's greater ability to generalize to the testing dataset. All of the models provided good estimates of weight and volume, with an R² of at least 0.81.
Calculation of tomato weight and volume in real time is a time-consuming and cumbersome task. The drainage approach is the most common method of measuring the volume of irregular objects or objects with different densities. For weight, weighing scales are the primary method of obtaining the weight of fruits. Nevertheless, a volume-based sorting system is a cost-effective approach, since it is more accurate and offers an alternative to weight sorting, given that fruits are usually sorted and graded with reference to size. [6,9,40] Based on previous research on weight, volume, and mass prediction for multiple fruits, such as tangerine, orange, lime, and lemon, the limit of agreement was set to 95%, such that the predicted volume ought to be below 5% error. [6,9] This study's results [...] [6,9,40,41] in terms of maximum and minimum errors.

Table footnote: a, b: mean ± std values within a row with no superscript in common differ significantly ($p < 0.05$).
In an in-line tomato production environment, tomatoes move entirely at random in clusters. Thus, a robust weight and volume determination system is expected to predict each tomato's volume and weight at any location across the line, with partial or no occlusion. This work introduces a touching-tomatoes separating technique for weight and volume prediction. The implemented separating method provides a remedy to the tomato occlusion problem. Algorithms used to split partially occluded objects include global minimization-based methods, watershed-based algorithms, and shape information-based algorithms. However, the watershed-based algorithms are frequently unreliable because occlusion removes the strong gradient between objects on which they depend. [10,24] Shape information-based algorithms separate occluded objects by evaluating an object contour to determine convex and concave points. [23,24] In shape representation, occlusion points are identified by high curvature points on an object boundary. Thus, our objective was to accurately detect occlusion points across the shape contour for split-line extraction. In contrast to Lin et al., [23] it was shown that extracting high curvature points centered on a curvature curve function requires the obtained curvature peaks to be transformed to be invariant to translation and rotation. However, according to Bai et al., [24] concave points can be established using a set threshold on the angles between two successive contour points. Our research presents a robust, precise, and determinant-based approach that extracts high curvature points from contour primitives, providing noise invariance and reducing computational complexity.
For a splitting algorithm, the method's consistency and efficiency with regard to the number of occluded objects are evaluated by the accuracy of the developed system. Figure 7 indicates that the average relative error after the splitting technique is independent of the cluster size and depends only on the initial image processing procedure. Hence, the proposed system, with an average error range of 1.324-2.632% for weight estimation and 1.435-3.088% for volume estimation, can accurately estimate the weight and volume of partially occluded tomatoes (tomatoes in a cluster).
However, this study had some limitations; the system is only applicable to the CM20 tomato variety. Other tomato fruit types, particularly those with large fruits, may have different weight and volume than CM20 fruits due to differences in the internal fruit structure (locule number and flesh percentage). As such, this technique requires validation on various tomato varieties and fruit types to ensure its feasibility for estimating weight and volume for other axis-symmetric fruits. Furthermore, a higher frame rate and superior computing resources are suggested to increase the proposed system's processing capability.

CONCLUSION
In this article, a novel tomato weight and volume prediction method was developed for tomatoes with no and partial occlusions using image processing and machine learning techniques. The findings revealed that the weight and volume of tomato fruits could be accurately measured under two conditions: occlusion and no occlusion. Based on the image processing steps presented, errors can be easily minimized. This new approach met the study's primary objectives of splitting tomatoes during partial occlusion and estimating weight/volume with acceptable accuracy on the test dataset. The RGB-based system can be used in a production line to calculate the volume and weight of tomatoes as a single item or with partial occlusion with reduced errors. The masses and volumes of tomatoes are accurately measured, allowing for more efficient handling and sorting processes after harvest. Since most customers and food retailers prefer homogeneous lots of a specific size, weight, and volume, producers can boost their trade agreements by having a greater understanding of classified sales volume. This technique can minimize labor costs and sorting time at any point in the production line. Hence, this computer vision system can calculate the volume and weight of tomato fruits in a nondestructive manner.