alexa Effects of the training dataset characteristics on the performance of nine species distribution models: application to Diabrotica virgifera virgifera.
General Science

General Science

Forest Research: Open Access

Author(s): Dupin M, Reynaud P, Jarok V, Baker R, Brunel S,

Abstract Share this page

Abstract Many distribution models developed to predict the presence/absence of invasive alien species need to be fitted to a training dataset before practical use. The training dataset is characterized by the number of recorded presences/absences and by their geographical locations. The aim of this paper is to study the effect of the training dataset characteristics on model performance and to compare the relative importance of three factors influencing model predictive capability; size of training dataset, stage of the biological invasion, and choice of input variables. Nine models were assessed for their ability to predict the distribution of the western corn rootworm, Diabrotica virgifera virgifera, a major pest of corn in North America that has recently invaded Europe. Twenty-six training datasets of various sizes (from 10 to 428 presence records) corresponding to two different stages of invasion (1955 and 1980) and three sets of input bioclimatic variables (19 variables, six variables selected using information on insect biology, and three linear combinations of 19 variables derived from Principal Component Analysis) were considered. The models were fitted to each training dataset in turn and their performance was assessed using independent data from North America and Europe. The models were ranked according to the area under the Receiver Operating Characteristic curve and the likelihood ratio. Model performance was highly sensitive to the geographical area used for calibration; most of the models performed poorly when fitted to a restricted area corresponding to an early stage of the invasion. Our results also showed that Principal Component Analysis was useful in reducing the number of model input variables for the models that performed poorly with 19 input variables. DOMAIN, Environmental Distance, MAXENT, and Envelope Score were the most accurate models but all the models tested in this study led to a substantial rate of mis-classification.
This article was published in PLoS One and referenced in Forest Research: Open Access

Relevant Expert PPTs

Recommended Conferences

Relevant Topics

Peer Reviewed Journals
 
Make the best use of Scientific Research and information from our 700 + peer reviewed, Open Access Journals
International Conferences 2017-18
 
Meet Inspiring Speakers and Experts at our 3000+ Global Annual Meetings

Contact Us

 
© 2008-2017 OMICS International - Open Access Publisher. Best viewed in Mozilla Firefox | Google Chrome | Above IE 7.0 version
adwords