Automated classification of schools of the Silver Cyprinid Rastrineobola argentea in Lake Victoria acoustic survey data using random forests

Proud, Roland and Mangeni-Sande, Richard and Kayanda, Robert J. and Cox, Martin J. and Nyamweya, Chrisphine and Ongore, Collins and Natugonza, Vianny and Everson, Inigo and Elison, Mboni and Hobbs, Laura and Kashindye, Benedicto Boniphace and Mlaponi, Enock W. and Taabu-Munyaho, Anthony and Mwainge, Venny M. and Kagoya, Esther and Pegado, Antonio and Nduwayesu, Evarist and Brierley, Andrew S. (2020) Automated classification of schools of the Silver Cyprinid Rastrineobola argentea in Lake Victoria acoustic survey data using random forests. ICES Journal of Marine Science, 77 (4). pp. 1379-1390. ISSN 1054-3139

Text (Proud-etal-ICESJMS-2020-Automated-classification-of-schools-of-the-Silver-Cyprinid-Rastrineobola)
Proud_etal_ICESJMS_2020_Automated_classification_of_schools_of_the_Silver_Cyprinid_Rastrineobola.pdf
Final Published Version
License: Creative Commons Attribution 4.0

    Abstract

    Biomass of the schooling fish Rastrineobola argentea (dagaa) is presently estimated in Lake Victoria by acoustic survey following the simple “rule” that dagaa is the source of most echo energy returned from the top third of the water column. Dagaa have, however, been caught in the bottom two-thirds, and other species occur towards the surface: a more robust discrimination technique is required. We explored the utility of a school-based random forest (RF) classifier applied to 120 kHz data from a lake-wide survey. Dagaa schools were first identified manually using expert opinion informed by fishing. These schools contained a lake-wide biomass of 0.68 million tonnes (MT). Only 43.4% of identified dagaa schools occurred in the top third of the water column, and 37.3% of all schools in the bottom two-thirds were classified as dagaa. School metrics (e.g. length, echo energy) for 49 081 manually classified dagaa and non-dagaa schools were used to build an RF school classifier. The best RF model had a classification test accuracy of 85.4%, driven largely by school length, and yielded a biomass of 0.71 MT, only c. 4% different from the manual estimate. The RF classifier offers an efficient method to generate a consistent dagaa biomass time series.
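The arithmetic behind two of the abstract's statements can be sketched in a few lines. This is only an illustration, not the authors' code: the function names are hypothetical, and treating a school's position as a single depth value compared against one third of the local water depth is an assumption about how the top-third "rule" might be applied. The biomass figures are those quoted in the abstract.

```python
def relative_difference_pct(estimate, reference):
    """Percentage difference of `estimate` relative to `reference`."""
    return 100.0 * (estimate - reference) / reference


def top_third_rule(school_depth_m, water_depth_m):
    """Return True if a school lies in the top third of the water column.

    Assumption: the school is represented by one depth value (e.g. its mean depth).
    """
    return school_depth_m < water_depth_m / 3.0


manual_mt = 0.68  # manual (expert) biomass estimate from the abstract, million tonnes
rf_mt = 0.71      # random forest biomass estimate from the abstract, million tonnes

diff_pct = relative_difference_pct(rf_mt, manual_mt)
print(f"RF vs manual biomass: {diff_pct:.1f}%")
```

With the abstract's figures the difference is about 4.4% relative to the manual estimate, consistent with the reported "c. 4%". Under the depth rule, a hypothetical school at 10 m in 60 m of water would be counted as dagaa and one at 30 m would not, which is the simplification the RF classifier is meant to replace.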