Proper Data Management as a Scientific Foundation for Reliable Species Distribution Modeling
C. Ashton Drew, Yolanda F. Wiersma, Falk Huettmann
Data management, storage, curation, and dissemination are mainstays of computer modeling. Indeed, a traditional view of computer modeling has perpetuated the notion of “garbage in, garbage out” (GIGO), which serves as a constant reminder that, no matter how sophisticated the analysis, computers will “unquestioningly process” whatever data are provided, regardless of their quality or suitability (Pearson 2007). In ecology, the datasets used in computer modeling are inherently complex and often characterized by missing values, dynamic environmental variables, and other factors that lead to numerous data anomalies (Michener et al. 1997; Michener and Brunt 2000). Ecologists have long recognized, however, that although data quality is undoubtedly important, using different types of data, even messy data, can still prove informative and can facilitate new questions, methods, and synergies in science and society.