Project number: 2017-085
Project Status:
Budget expenditure: $397,750.00
Principal Investigator: Karina C. Hall
Organisation: NSW Department Of Primary Industries
Project start/end date: 21 Dec 2017 - 29 Jun 2020


During a recent national Fisheries Statistics Working Group meeting, data managers from all Australian states discussed the likely high prevalence of inaccurate or fraudulent data supplied by fishers and of errors introduced during data entry. Current data quality control measures vary widely between jurisdictions, are largely undocumented, and often rely on manual checks by clerks or analysts that are labour intensive, costly and not routinely executed. Because many of these checks occur during manual data entry of paper-based records, they are likely to become obsolete as reliance on electronic reporting increases and fishers enter data directly through online portals or mobile applications.

There is a need to develop automated data cleansing and diagnostic procedures that can be applied retrospectively to large fisheries databases to detect and flag errors and outliers, and to provide subsets of reliable catch and effort data for stock assessments and other analyses. This project will help address these issues by developing automated processes to routinely assess newly entered fisheries catch and effort data for errors, retrospectively quantify error rates in existing data, and assess the likely influence of those errors on the outputs of stock assessment analyses. The outcomes will help improve the quality and accuracy of catch and effort data used in routine stock assessments, and in turn lead to more sustainable management of wild capture fisheries resources.


1. Review existing data quality control and cleansing processes applied to fisheries catch and effort databases in all state and Commonwealth jurisdictions.
2. Develop a suite of generic algorithmic and statistical approaches to detect and flag different error types (e.g., anomalous, missing and outlying values) in fisheries catch and effort relational databases.
3. Trial the above approaches with several case-study fisheries datasets to assess the performance of different data cleansing approaches, quantify error rates and types and assess the sensitivity of catch and effort statistics to these errors and outliers.
4. On the basis of the above findings, recommend a standard national approach for data cleansing and validation of fisheries catch and effort data.
5. Customise and integrate the generic approaches into NSW fisheries database systems to implement automated data cleansing processes.
6. Extend the results of the project to fishers and industry representatives to encourage greater accuracy in fisheries catch and effort data reporting.
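
As a rough illustration of the kind of automated flagging described in objective 2, the sketch below marks missing, anomalous and outlying values in a small set of catch and effort records. It is a minimal example under stated assumptions, not the project's method: the field names `catch_kg` and `effort_days`, the function `flag_records`, the use of catch per unit effort, and the modified z-score rule (median absolute deviation with a cut-off of 3.5) are all hypothetical choices made for illustration.

```python
from statistics import median


def flag_records(records, threshold=3.5):
    """Return one set of flags per record (hypothetical illustration).

    Flags used here:
      "missing"   - catch or effort is absent
      "anomalous" - negative catch or non-positive effort
      "outlier"   - catch per unit effort far from the median,
                    judged by a modified z-score based on the
                    median absolute deviation (MAD)
    """
    flags = [set() for _ in records]
    cpue = {}  # record index -> catch per unit effort

    for i, rec in enumerate(records):
        catch = rec.get("catch_kg")      # assumed field name
        effort = rec.get("effort_days")  # assumed field name
        if catch is None or effort is None:
            flags[i].add("missing")
        elif catch < 0 or effort <= 0:
            flags[i].add("anomalous")
        else:
            cpue[i] = catch / effort

    # Outlier screening needs a few valid records to be meaningful.
    if len(cpue) >= 3:
        med = median(cpue.values())
        mad = median(abs(v - med) for v in cpue.values())
        if mad > 0:
            for i, v in cpue.items():
                if 0.6745 * abs(v - med) / mad > threshold:
                    flags[i].add("outlier")

    return flags


records = [
    {"catch_kg": 120.0, "effort_days": 2.0},
    {"catch_kg": 95.0, "effort_days": 2.0},
    {"catch_kg": 110.0, "effort_days": 2.0},
    {"catch_kg": 100.0, "effort_days": 2.0},
    {"catch_kg": 9000.0, "effort_days": 2.0},  # implausibly large catch
    {"catch_kg": None, "effort_days": 1.0},    # missing catch
    {"catch_kg": 50.0, "effort_days": 0.0},    # catch with zero effort
]

for rec, rec_flags in zip(records, flag_records(records)):
    print(rec, sorted(rec_flags))
```

Flagging records rather than deleting them keeps the original data intact, so analysts can build a "reliable" subset for stock assessment while still being able to quantify error rates by type. A median-based rule is used in this sketch because a single extreme value would distort a mean and standard deviation computed from the same data.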
