Research Article

Non-Invasive Abalone Sex Classification from External Measurements using Interpretable Machine Learning

by Ramtin Dabiri
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 187 - Number 58
Year of Publication: 2025
Authors: Ramtin Dabiri
10.5120/ijca2025925985

Ramtin Dabiri. Non-Invasive Abalone Sex Classification from External Measurements using Interpretable Machine Learning. International Journal of Computer Applications. 187, 58 (Nov 2025), 65-72. DOI=10.5120/ijca2025925985

@article{ 10.5120/ijca2025925985,
author = { Ramtin Dabiri },
title = { Non-Invasive Abalone Sex Classification from External Measurements using Interpretable Machine Learning },
journal = { International Journal of Computer Applications },
issue_date = { Nov 2025 },
volume = { 187 },
number = { 58 },
month = { Nov },
year = { 2025 },
issn = { 0975-8887 },
pages = { 65-72 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume187/number58/non-invasive-abalone-sex-classification-from-external-measurements-using-interpretable-machine-learning/ },
doi = { 10.5120/ijca2025925985 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2025-11-18T21:11:20.814647+05:30
%A Ramtin Dabiri
%T Non-Invasive Abalone Sex Classification from External Measurements using Interpretable Machine Learning
%J International Journal of Computer Applications
%@ 0975-8887
%V 187
%N 58
%P 65-72
%D 2025
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Accurate sex classification of abalone is essential for selective breeding and ethical harvesting, yet many existing studies rely on invasive measurements (e.g., internal weights), limiting real-world deployment. This study contributes two innovations motivated by practical field constraints. First, a strictly non-invasive framework is adopted, using only four external traits (length, diameter, height, and whole weight), so specimens are not opened. Second, instead of the common rank-then-select approach, a ranking-guided combinatorial search over polynomial and interaction terms (degree ≤ 5) is applied for multinomial logistic regression. This design is motivated by three considerations: (1) standard ranking methods (ANOVA, Mutual Information, Random Forest) evaluate variables largely in isolation, whereas the sex signal emerges from feature–feature interactions; (2) relationships among external measurements are partly non-linear, so higher-order terms capture structure missed by base features or linear models; and (3) rankings can be unstable under collinearity and outliers, making empirical validation of feature sets more robust. Under an outlier-inclusive protocol, a compact model excluding diameter attains 0.5689 test accuracy, while an all-four-measurements model reaches 0.5641; both exceed the commonly reported 0.50–0.55 range for this dataset while avoiding invasive measurements. The curated interaction design enables logistic regression to outperform more complex models (e.g., tuned SVM and XGBoost), indicating that interaction construction, rather than model complexity, is the key driver of accuracy under non-invasive constraints. The resulting pipeline is interpretable, field-deployable, and supported by fully reproducible code.
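
The abstract does not spell out how the ranking-guided combinatorial search is organized, so the following is only a minimal sketch of one plausible reading, not the author's implementation. It enumerates every polynomial and interaction term of total degree at most 5 over the four external measurements, shortlists the best-ranked terms, and then scores every small subset of that shortlist. All names here (`FEATURES`, `monomial_terms`, `evaluate_term`, `guided_search`, `top_k`, `max_size`) are hypothetical, and the two scoring callbacks are placeholders: `rank_score` stands in for a ranking method such as ANOVA or mutual information, and `subset_score` for the validation accuracy of a multinomial logistic regression fitted on the chosen terms.

```python
from itertools import combinations, combinations_with_replacement
from math import prod

# The four external measurements named in the abstract (identifiers are assumed).
FEATURES = ("length", "diameter", "height", "whole_weight")


def monomial_terms(features, max_degree=5):
    """Return every monomial of total degree 1..max_degree as a tuple of exponents.

    For example, (1, 0, 2, 0) stands for length * height**2.
    """
    n = len(features)
    terms = []
    for degree in range(1, max_degree + 1):
        for combo in combinations_with_replacement(range(n), degree):
            terms.append(tuple(combo.count(i) for i in range(n)))
    return terms


def evaluate_term(exponents, row):
    """Evaluate one monomial on one specimen's measurements."""
    return prod(x ** e for x, e in zip(row, exponents))


def guided_search(terms, rank_score, subset_score, top_k=8, max_size=3):
    """Shortlist the top_k terms by rank_score, then score all small subsets.

    Both callbacks are placeholders supplied by the caller (assumptions):
    rank_score(term) -> float, subset_score(tuple_of_terms) -> float.
    Returns (best_score, best_subset).
    """
    shortlist = sorted(terms, key=rank_score, reverse=True)[:top_k]
    best = None
    for size in range(1, max_size + 1):
        for subset in combinations(shortlist, size):
            score = subset_score(subset)
            if best is None or score > best[0]:
                best = (score, subset)
    return best
```

The ranking only prunes the candidate pool; the final choice rests on the empirical score of each subset, which matches the abstract's argument that validating feature sets is more robust than trusting rankings alone. In practice `top_k` and `max_size` would have to be kept small, since the number of subsets grows combinatorially.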

Index Terms

Computer Science
Information Sciences

Keywords

Abalone; sex classification; aquaculture; non-invasive measurement; logistic regression