TY - JOUR
T1 - In silico prediction of acute chemical toxicity of biocides in marine crustaceans using machine learning
AU - Jha, Awadhesh N.
AU - Krishnan, Rama
AU - Howard, Ian S.
AU - Comber, Sean
PY - 2023/8/20
Y1 - 2023/8/20
N2 - Biocides are a heterogeneous group of chemical substances intended to control the growth of, or kill, undesired organisms. Due to their extensive use, they enter marine ecosystems via non-point sources and may pose a threat to ecologically important non-target organisms. Consequently, industries and regulatory agencies have recognized the ecotoxicological hazard potential of biocides. However, the prediction of biocide chemical toxicity on marine crustaceans has not been previously evaluated. This study aims to provide in silico models capable of classifying structurally diverse biocidal chemicals into different toxicity categories and predicting acute chemical toxicity (LC50) in marine crustaceans using a set of calculated 2D molecular descriptors. The models were built following the guidelines recommended by the OECD (Organisation for Economic Co-operation and Development) and validated through stringent processes (internal and external validation). Six machine learning (ML) models were built and compared (linear regression: LR; support vector machine: SVM; random forest: RF; feed-forward backpropagation-based artificial neural network: ANN; decision trees: DT; and naïve Bayes: NB) for regression and classification analysis to predict toxicities. All the models displayed encouraging results with high generalisability: the feed-forward backpropagation-based ANN showed the best results, with coefficient of determination (R2) values of 0.82 and 0.94 for the training set (TS) and validation set (VS), respectively. For classification-based modelling, the DT model performed best, with an accuracy (ACC) of 100 % and an area under the curve (AUC) value of 1 for both TS and VS. These models showed the potential to replace animal testing for the chemical hazard assessment of untested biocides if they fall within the applicability domain of the proposed models. In general, the models are highly interpretable and robust, with good predictive performance. The models also displayed a trend indicating that toxicity is largely influenced by factors such as lipophilicity, branching, non-polar bonding and saturation of molecules.
AB - Biocides are a heterogeneous group of chemical substances intended to control the growth of, or kill, undesired organisms. Due to their extensive use, they enter marine ecosystems via non-point sources and may pose a threat to ecologically important non-target organisms. Consequently, industries and regulatory agencies have recognized the ecotoxicological hazard potential of biocides. However, the prediction of biocide chemical toxicity on marine crustaceans has not been previously evaluated. This study aims to provide in silico models capable of classifying structurally diverse biocidal chemicals into different toxicity categories and predicting acute chemical toxicity (LC50) in marine crustaceans using a set of calculated 2D molecular descriptors. The models were built following the guidelines recommended by the OECD (Organisation for Economic Co-operation and Development) and validated through stringent processes (internal and external validation). Six machine learning (ML) models were built and compared (linear regression: LR; support vector machine: SVM; random forest: RF; feed-forward backpropagation-based artificial neural network: ANN; decision trees: DT; and naïve Bayes: NB) for regression and classification analysis to predict toxicities. All the models displayed encouraging results with high generalisability: the feed-forward backpropagation-based ANN showed the best results, with coefficient of determination (R2) values of 0.82 and 0.94 for the training set (TS) and validation set (VS), respectively. For classification-based modelling, the DT model performed best, with an accuracy (ACC) of 100 % and an area under the curve (AUC) value of 1 for both TS and VS. These models showed the potential to replace animal testing for the chemical hazard assessment of untested biocides if they fall within the applicability domain of the proposed models. In general, the models are highly interpretable and robust, with good predictive performance. The models also displayed a trend indicating that toxicity is largely influenced by factors such as lipophilicity, branching, non-polar bonding and saturation of molecules.
U2 - 10.1016/j.scitotenv.2023.164072
DO - 10.1016/j.scitotenv.2023.164072
M3 - Article
SN - 0048-9697
VL - 887
JO - Science of the Total Environment
JF - Science of the Total Environment
IS - 0
ER -