Developing automated data cleansing and validation processes for fisheries catch and effort data
NSW Department of Primary Industries, Orange
Karina C. Hall
1. Review existing data quality control and cleansing processes applied to fisheries catch and effort databases in all State and Commonwealth jurisdictions.
2. Develop a suite of generic algorithmic and statistical approaches to detect and flag different error types (e.g., anomalous, missing and outlying values) in fisheries catch and effort relational databases.
3. Trial the above approaches with several case-study fisheries datasets to evaluate the performance of different data cleansing approaches, quantify error rates and types, and assess the sensitivity of catch and effort statistics to these errors and outliers.
4. On the basis of the above findings, recommend a standard national approach for data cleansing and validation of fisheries catch and effort data.
5. Customise and integrate the generic approaches into NSW fisheries database systems to implement automated data cleansing processes.
6. Extend the results of the project to fishers and industry representatives to encourage greater accuracy in fisheries catch and effort data reporting.
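
As an illustration of the kind of generic check described in objective 2, the following is a minimal sketch only, not the project's actual method. It assumes each catch record is a dictionary with a hypothetical `catch_kg` field, and it flags missing values, anomalous (negative) values, and outliers using a modified z-score based on the median absolute deviation. The function name, field name, and the 3.5 cut-off are assumptions chosen for illustration.

```python
from statistics import median


def flag_records(records, field="catch_kg", threshold=3.5):
    """Return one flag per record: 'missing', 'anomalous', 'outlier' or 'ok'.

    Hypothetical sketch: `field` and `threshold` are illustrative assumptions.
    """
    # Only present, non-negative values contribute to the robust statistics.
    valid = [r[field] for r in records
             if r.get(field) is not None and r[field] >= 0]
    if valid:
        med = median(valid)
        # Median absolute deviation (MAD): a spread measure robust to outliers.
        mad = median(abs(v - med) for v in valid)
    else:
        med = mad = 0

    flags = []
    for r in records:
        value = r.get(field)
        if value is None:
            flags.append("missing")
        elif value < 0:
            # A negative catch weight is impossible, so treat it as anomalous.
            flags.append("anomalous")
        elif mad > 0 and 0.6745 * abs(value - med) / mad > threshold:
            # Modified z-score above the cut-off: flag for review, do not delete.
            flags.append("outlier")
        else:
            flags.append("ok")
    return flags
```

Under these assumptions, a record with no catch weight is flagged `missing`, a negative weight is flagged `anomalous`, and a weight far from the median of the other records is flagged `outlier`. Records are flagged rather than removed, so that a reviewer can decide how each one should be treated.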