SpeciesgeocodeR is an R package to facilitate the use of “big data” (e.g. millions of occurrence records from GBIF) in biogeography. The package includes functions to classify species occurrence records into discrete areas, in a format ready to use with common software for ancestral area reconstruction, to estimate ranges and range sizes from occurrence records, and to automatically clean records of geographic errors common in biological collection databases. The stable version of speciesgeocodeR is available from CRAN, the latest version from GitHub. Several tutorials are available here. There is also a similar package in Python, developed by Mats Töpel.
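To illustrate what classifying occurrence records into discrete areas involves, here is a minimal Python sketch. It is not the speciesgeocodeR API (the package itself is written in R). The function names, the two square areas and the example records are all hypothetical and chosen only for illustration. The sketch assigns each point to every polygon that contains it, using a standard ray-casting test, and then derives a species-by-area presence/absence table of the kind used as input for ancestral area reconstruction.

```python
from collections import defaultdict


def point_in_polygon(lon, lat, polygon):
    """Ray-casting test; polygon is a list of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed
        # by a horizontal ray starting at the point.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def classify_records(records, areas):
    """Count records per species and area: {species: {area: count}}."""
    counts = defaultdict(lambda: defaultdict(int))
    for species, lon, lat in records:
        per_species = counts[species]  # register species even without hits
        for name, polygon in areas.items():
            if point_in_polygon(lon, lat, polygon):
                per_species[name] += 1
    return counts


def presence_matrix(counts, area_names):
    """Turn counts into 0/1 presence per area for each species."""
    return {
        species: [1 if counts[species].get(area, 0) > 0 else 0
                  for area in area_names]
        for species in counts
    }


# Hypothetical areas (simple squares) and occurrence records.
areas = {
    "A": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "B": [(10, 0), (20, 0), (20, 10), (10, 10)],
}
records = [
    ("sp1", 5.0, 5.0),
    ("sp1", 15.0, 5.0),
    ("sp2", 2.0, 3.0),
    ("sp3", 50.0, 50.0),  # falls outside both areas
]

counts = classify_records(records, areas)
matrix = presence_matrix(counts, ["A", "B"])
for species, row in matrix.items():
    print(species, row)
```

Real data would use geographic polygons (e.g. from a shapefile) and a spatial library rather than hand-written geometry. The sketch only shows the logic of the classification step.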
Overly imprecise or erroneous geo-references are a major issue when using species distribution data from large databases compiled from various sources, such as public data aggregators (e.g. GBIF). CoordinateCleaner is a tool to automatically flag records with potentially erroneous coordinates in such databases. It accounts for the errors most commonly found in biological collections, for instance invalid coordinates, coordinates in the sea, and coordinates assigned to country centroids, capitals or biodiversity institutions. Additionally, CoordinateCleaner can check for imprecise dating in fossils and identify whether records in a data set have undergone decimal rounding. CoordinateCleaner is available as an R package via CRAN or GitHub.
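The kind of test described above can be sketched in a few lines of Python. This is an illustration of the idea only, not CoordinateCleaner's interface (the package is written in R). The function names, the 1 km radius and the single reference point are assumptions made for this example, and the reference coordinates are made up rather than a real country centroid. The sketch covers three checks: invalid coordinates, zero coordinates, and proximity to a reference point such as a centroid, capital or institution.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance between two points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def flag_record(lon, lat, reference_points, radius_km=1.0):
    """Return a list of flags raised for one record (empty if clean)."""
    try:
        lon, lat = float(lon), float(lat)
    except (TypeError, ValueError):
        return ["invalid"]
    # NaN fails both range comparisons and is therefore flagged as invalid.
    if not (-180.0 <= lon <= 180.0 and -90.0 <= lat <= 90.0):
        return ["invalid"]
    flags = []
    if lon == 0.0 and lat == 0.0:
        flags.append("zero")
    for label, (ref_lon, ref_lat) in reference_points.items():
        if haversine_km(lon, lat, ref_lon, ref_lat) <= radius_km:
            flags.append(label)
    return flags


# Hypothetical reference point standing in for a country centroid.
reference_points = {"centroid": (10.0, 50.0)}

records = [
    (10.001, 50.001),  # about 0.1 km from the reference point
    (200.0, 10.0),     # longitude out of range
    (0.0, 0.0),        # zero coordinates
    (12.5, 48.2),      # no flag expected
    (None, 48.2),      # missing longitude
]

flags = [flag_record(lon, lat, reference_points) for lon, lat in records]
for record, record_flags in zip(records, flags):
    print(record, record_flags or "clean")
```

Flagging rather than deleting records leaves the decision to the user, which matters because a record near a capital or institution may still be genuine.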
Sampbias is a method and tool to 1) visualize the distribution of occurrence records and species in any user-provided dataset, 2) quantify the biasing effect of geographic features related to human accessibility, such as proximity to cities, rivers or roads, and 3) create publication-quality graphs of these biasing effects in space. Find the Sampbias R package here, the shiny web application here and a detailed description here.
I contributed to the development of Infomap Bioregions, an interactive web application for delimiting biogeographical regions based on species occurrence data. The app uses the network delimitation algorithm developed by Vilhena and Antonelli (2015, Nature Communications). Species distributions may be provided as georeferenced point occurrences or range maps, and can be of local, regional or global scale. The application uses a novel adaptive resolution method to make the best use of often incomplete species distribution data. The application is fully described here.
I also contributed to the DES model (Dispersal-Extinction-Sampling) implemented in PyRate. The DES model estimates biogeographic parameters (dispersal, extinction and sampling) using fossil occurrences instead of phylogenetic trees. PyRate is a Python program developed by Daniele Silvestro to estimate speciation, extinction and preservation rates from fossil occurrence data. It uses a probabilistic framework to jointly estimate species-specific times of speciation and extinction and the rates of the underlying birth-death process. Check out PyRate here and the description of the DES model here.