TY - JOUR
T1 - Improving the identification of the source of faecal pollution in water using a modelling approach
T2 - From multi-source to aged and diluted samples
AU - Ballesté, Elisenda
AU - Belanche-Muñoz, Luis A
AU - Farnleitner, Andreas H
AU - Linke, Rita
AU - Sommer, Regina
AU - Santos, Ricardo
AU - Monteiro, Silvia
AU - Maunula, Leena
AU - Oristo, Satu
AU - Tiehm, Andreas
AU - Stange, Claudia
AU - Blanch, Anicet R
N1 - Publisher Copyright: © 2019 Elsevier Ltd
PY - 2020/3/15
Y1 - 2020/3/15
N2 - The last decades have seen the development of several source tracking (ST) markers to determine the source of pollution in water, but none of them show 100% specificity and sensitivity. Thus, a combination of several markers might provide a more accurate classification. In this study Ichnaea® software was improved to generate predictive models, taking into account ST marker decay rates and dilution factors to reflect the complexity of ecosystems. A total of 106 samples from 4 sources were collected in 5 European regions and 30 faecal indicators and ST markers were evaluated, including E. coli, enterococci, clostridia, bifidobacteria, somatic coliphages, host-specific bacteria, human viruses, host mitochondrial DNA, host-specific bacteriophages and artificial sweeteners. Models based on linear discriminant analysis (LDA) able to distinguish between human and non-human faecal pollution and identify faecal pollution of several origins were developed and tested with 36 additional laboratory-made samples. Almost all the ST markers showed the potential to correctly target their host in the 5 areas, although some were equivalent and redundant. The LDA-based models developed with fresh faecal samples were able to differentiate between human and non-human pollution with 98.1% accuracy in leave-one-out cross-validation (LOOCV) when using 2 molecular human ST markers (HF183 and HMBif), whereas 3 variables resulted in 100% correct classification. With 5 variables the model correctly classified all the fresh faecal samples from 4 different sources. Ichnaea® is a machine-learning software developed to improve the classification of the faecal pollution source in water, including in complex samples. In this project the models were developed using samples from a broad geographical area, but they can be tailored to determine the source of faecal pollution for any user.
KW - Ecosystem
KW - Environmental Monitoring
KW - Escherichia coli
KW - Feces
KW - Humans
KW - Water
KW - Water Microbiology
KW - Water Pollution
UR - http://www.scopus.com/inward/record.url?scp=85076531162&partnerID=8YFLogxK
U2 - 10.1016/j.watres.2019.115392
DO - 10.1016/j.watres.2019.115392
M3 - Journal article
C2 - 31865126
SN - 0043-1354
VL - 171
SP - 115392
JO - Water Research
JF - Water Research
M1 - 115392
ER -