Non-Invasive Abalone Sex Classification from External Measurements using Interpretable Machine Learning

Ramtin Dabiri

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 187 - Number 58

Year of Publication: 2025

Authors: Ramtin Dabiri

10.5120/ijca2025925985

Ramtin Dabiri. Non-Invasive Abalone Sex Classification from External Measurements using Interpretable Machine Learning. International Journal of Computer Applications. 187, 58 (Nov 2025), 65-72. DOI=10.5120/ijca2025925985

@article{ 10.5120/ijca2025925985,
  author = { Ramtin Dabiri },
  title = { Non-Invasive Abalone Sex Classification from External Measurements using Interpretable Machine Learning },
  journal = { International Journal of Computer Applications },
  issue_date = { Nov 2025 },
  volume = { 187 },
  number = { 58 },
  month = { Nov },
  year = { 2025 },
  issn = { 0975-8887 },
  pages = { 65-72 },
  numpages = {9},
  url = { https://ijcaonline.org/archives/volume187/number58/non-invasive-abalone-sex-classification-from-external-measurements-using-interpretable-machine-learning/ },
  doi = { 10.5120/ijca2025925985 },
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

%0 Journal Article
%1 2025-11-18T21:11:20.814647+05:30
%A Ramtin Dabiri
%T Non-Invasive Abalone Sex Classification from External Measurements using Interpretable Machine Learning
%J International Journal of Computer Applications
%@ 0975-8887
%V 187
%N 58
%P 65-72
%D 2025
%I Foundation of Computer Science (FCS), NY, USA

Abstract

Accurate sex classification of abalone is essential for selective breeding and ethical harvesting, yet many existing studies rely on invasive measurements (e.g., internal weights), limiting real-world deployment. This study contributes two innovations motivated by practical field constraints. First, a strictly non-invasive framework is adopted, using only external traits—length, diameter, height, and whole weight—so specimens are not opened. Second, instead of the common rank-then-select approach, a ranking-guided combinatorial search over polynomial and interaction terms (degree ≤ 5) is applied for multinomial logistic regression. This design is motivated by three considerations: (1) standard ranking methods (ANOVA, Mutual Information, Random Forest) evaluate variables largely in isolation, whereas sex signal emerges from feature–feature interactions; (2) relationships among external measurements are partly non-linear, so higher-order terms capture structure missed by base features or linear models; and (3) rankings can be unstable under collinearity and outliers, making empirical validation of feature sets more robust. Under an outlier-inclusive protocol, a compact model excluding diameter attains 0.5689 test accuracy, while an all-four-measurements model reaches 0.5641—both exceeding the commonly reported 0.50–0.55 range for this dataset and avoiding invasive measurements. The curated interaction design enables logistic regression to outperform more complex models (e.g., tuned SVM and XGBoost), indicating that interaction construction, rather than model complexity, is the key driver of accuracy under non-invasive constraints. The resulting pipeline is interpretable, field-deployable, and supported by fully reproducible code.
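The abstract does not spell out the search procedure, so the following is only a minimal sketch of how a ranking-guided combinatorial search over polynomial and interaction terms might be organized. The function names (`polynomial_terms`, `evaluate_term`, `guided_search`), the `top_k` and `max_size` parameters, and the snake_case feature names are illustrative assumptions, not the author's code. Scoring is left to a caller-supplied function, which in practice could be the validation accuracy of a multinomial logistic regression fitted on the selected terms.

```python
from itertools import combinations, combinations_with_replacement
from math import prod

# The four external measurements named in the abstract (names are assumed).
FEATURES = ("length", "diameter", "height", "whole_weight")


def polynomial_terms(features, max_degree=5):
    """Enumerate every monomial of degree 1..max_degree over `features`.

    Each term is a tuple of feature names; repeats encode powers, e.g.
    ("length", "length", "height") stands for length**2 * height.
    """
    terms = []
    for degree in range(1, max_degree + 1):
        terms.extend(combinations_with_replacement(features, degree))
    return terms


def evaluate_term(term, row):
    """Compute the value of one monomial for a single specimen (a dict)."""
    return prod(row[name] for name in term)


def guided_search(ranked_terms, score, top_k=8, max_size=3):
    """Exhaustively score subsets drawn from the top_k ranked terms.

    `ranked_terms` is assumed to be sorted best-first by some ranking method.
    `score` is a hypothetical caller-supplied function mapping a tuple of
    terms to a number (for example, held-out accuracy of a fitted model).
    """
    shortlist = ranked_terms[:top_k]
    best_subset, best_score = None, float("-inf")
    for size in range(1, max_size + 1):
        for subset in combinations(shortlist, size):
            current = score(subset)
            if current > best_score:  # ties keep the earlier subset
                best_subset, best_score = subset, current
    return best_subset, best_score


terms = polynomial_terms(FEATURES)
print(len(terms))  # 125 monomials of degree 1 to 5 in four variables
```

Restricting the exhaustive step to a ranked shortlist keeps the number of candidate subsets manageable: with 125 candidate terms, scoring every subset is infeasible, whereas subsets of up to three terms from a shortlist of eight require only 92 evaluations.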

Index Terms

Computer Science

Information Sciences

Keywords

Abalone; sex classification; aquaculture; non-invasive measurement; logistic regression