Mining the Himalayan Uplands Plant Database for a Conservation Baseline Using the Public GMBA Webportal
This chapter shows how a synthesis of heterogeneous biological field observation data, robust taxonomic methods, and data mining leads to up-to-date scientific information that is important for sustainability and conservation management. The core of this type of research is a database of field observations. Here we use the Himalayan Uplands Plant Database (HUP), which consists of extensive botanical survey information gathered by the senior author in the Himalayas and in renowned public herbaria over more than 25 years. The HUP database is primarily based on preserved herbarium specimens and presently holds more than 164,000 occurrence records of vascular plants. It contains the records of more than 2,000 collectors and observers who either contributed directly or indirectly or whose records were derived from herbarium label information. Consistent taxonomic information and the sound use of taxonomy are the key to success in any exercise with large amounts of heterogeneous biological collection data. Taxonomy, especially at the scale of developing consistent cross-border registries, still constitutes one of the most obvious bottlenecks to our understanding of biodiversity. In the absence of consistent backbone taxonomies, physical documentation (collecting, preserving, and curating good and representative herbarium specimens or other vouchers) and quality control must be stressed as necessary preconditions for vegetation and ecology-related studies. Although inherent synonymy rates are obviously quite variable among different taxonomic groups, there is no logical, automated, or permanent procedure that could identify or constrain synonyms. A wide range of Floras, monographs, taxonomic treatments, original publications, and databases has been consulted in HUP to identify and verify specimens, and to develop taxonomies that are, at least internally, consistent.
Other challenges of using such a large collection are the long time span covered and the diversity and inconsistency of spatial and altitudinal information. As a result, large parts of the data are not covered by current georeferencing databases such as BioGeomancer or by international taxonomic databases such as ITIS (Integrated Taxonomic Information System). The history of modern biodiversity exploration is brief (in the Himalayas, a mere 200 years), whereas dramatic ecological change and disturbance, including deforestation, land degradation, melting glaciers, and increasing severity of natural hazards, have occurred during the period of collection. Historic data are thus precious not only on account of the “priority principle” in biological taxonomy. To ensure the highest level of usage of such precious data, we regard the availability of the data for similar and potentially even larger exercises as critically important. Here we show that a new culture needs to develop and mature for sharing, exploiting, and improving primary biodiversity data and for taxonomic work in progress. The example of HUP is used to give step-by-step best-practice guidance for making biological data digitally available online using existing and rapidly developing data-sharing infrastructures. The content of the database columns was transferred into the Darwin Core 2 format and uploaded to the publicly accessible Global Biodiversity Information Facility (GBIF; www.gbif.org). Through GBIF it is also available via the Mountain Biodiversity Portal (MBP; www.mountainbiodiversity.org), which allows users to query, filter, and download GBIF data specific to mountain areas along a horizontal (region) and a vertical (elevation, climate) dimension, and which offers many options.
In addition, a first version of the XML metadata was created and uploaded to the National Biological Information Infrastructure (NBII) metadata clearinghouse (National Biological Information Infrastructure 2010; http://metadata.nbii.gov/clearinghouse). Thus, the HUP data are made accessible worldwide by searching for metadata in the NBII clearinghouse database or contacting the authors, by searching for original biological data at GBIF, or by searching for mountain-specific information at the Global Mountain Biodiversity Assessment (GMBA) mountain biodiversity portal.
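The transfer of database columns into the Darwin Core format described above can be illustrated with a minimal sketch. It is not the procedure used for HUP: the source column names (taxon, collector, coll_date, and so on) are hypothetical, the choice of Darwin Core terms is one plausible mapping, and treating every record as a preserved specimen is a simplifying assumption. The sample row is invented purely for illustration.

```python
import csv
import io

# Hypothetical source column names (left) mapped to Darwin Core terms (right).
# The left-hand names are assumptions; they are not the actual HUP schema.
COLUMN_TO_DWC = {
    "taxon": "scientificName",
    "collector": "recordedBy",
    "coll_date": "eventDate",
    "lat": "decimalLatitude",
    "lon": "decimalLongitude",
    "alt_min_m": "minimumElevationInMeters",
    "alt_max_m": "maximumElevationInMeters",
    "country": "country",
    "locality": "locality",
}


def to_darwin_core(row):
    """Rename one source row to Darwin Core terms; missing values become ''."""
    record = {
        dwc: (row.get(src) or "").strip()
        for src, dwc in COLUMN_TO_DWC.items()
    }
    # Assumption: all records are herbarium specimens.
    record["basisOfRecord"] = "PreservedSpecimen"
    return record


def convert(rows):
    """Return the converted rows as CSV text with Darwin Core column headers."""
    out = io.StringIO()
    fieldnames = list(COLUMN_TO_DWC.values()) + ["basisOfRecord"]
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow(to_darwin_core(row))
    return out.getvalue()


if __name__ == "__main__":
    # Invented example row, for illustration only.
    sample = [{
        "taxon": "Rhododendron arboreum",
        "collector": "Example Collector",
        "alt_min_m": "2500",
        "country": "Nepal",
    }]
    print(convert(sample))
```

Keeping the mapping in a single dictionary makes it easy to review against the Darwin Core term list before upload, and fields that are absent in older records are written as empty values rather than guessed.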