GBIF2

Documentation for GBIF2.

Overview

GBIF2.GBIF2 Module

GBIF2

The goal of GBIF2 is to follow the GBIF API as completely and correctly as possible.

Its main design features are:

  • Single results are returned with type Occurrence or Species, with all of the GBIF fields available using object.fieldname. Fields return missing if they are not returned by a specific query.
  • Multiple results are returned as a Tables.jl compatible Table of Occurrence or Species rows. This Table can be converted to a DataFrame or written directly to disk using CSV.jl and similar packages.
  • All GBIF enum keys are checked for correctness before querying so that only correct queries can be sent. Clear messages point to errors in queries.
  • A limit above 300 items at a time is allowed, unlike in the original API, by making multiple requests and joining the results.
  • For even larger queries, download requests are handled with gbif.org account authentication.
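
The multiple-request behaviour described in the list above can be pictured with a small sketch. This is not GBIF2's implementation: `page_requests` is a hypothetical helper, and the only fact taken from the documentation is the cap of 300 results per request. It splits a requested limit into (offset, limit) pairs that each fit under the cap.

```julia
# Hypothetical helper (not part of GBIF2): split a requested `limit` into
# (offset, limit) pages no larger than `pagesize`, the per-request cap.
function page_requests(limit::Integer; offset::Integer=0, pagesize::Integer=300)
    limit >= 0 || throw(ArgumentError("limit must be non-negative, got $limit"))
    pages = Tuple{Int,Int}[]
    remaining = limit
    while remaining > 0
        n = min(remaining, pagesize)   # size of this page
        push!(pages, (offset, n))
        offset += n                    # the next page starts where this one ends
        remaining -= n
    end
    return pages
end

pages = page_requests(2000)   # six pages of 300 followed by one page of 200
```

A real client would also stop early once the server reports that no more records exist, since a query may match fewer records than the requested limit.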

A quick example

julia> using GBIF2, DataFrames

# Basic species match with `species_match`:
julia> sp = species_match("Lalage newtoni");

julia> sp.species
"Coracina newtoni"

julia> sp.synonym
true

julia> sp.vernacularName
missing

# Get a more detailed object with `species`:
julia> sp_detailed = species(sp);

julia> sp_detailed.vernacularName
"Reunion Cuckooshrike"

# Get the first 2000 occurrences for the species from 2000 to 2020, on Réunion:
julia> oc = occurrence_search(sp;
           limit=2000, country=:RE, hasCoordinate=true, year=(2000,2020)
       ) |> DataFrame
2000×83 DataFrame
  Row │ decimalLongitude  decimalLatitude  year    month   day
      │ Float64?          Float64?         Int64?  Int64?  Int64?
──────┼────────────────────────────────────────────────────────────
    1 │          55.5085         -21.0192    2020       1      14
    2 │          55.4133         -20.928     2020       1      23
    3 │          55.4133         -20.928     2020       1      16
    4 │          55.5085         -21.0192    2020       1      14
    5 │          55.4123         -21.0184    2020       1      13
    6 │          55.4133         -20.928     2020       1      28
    7 │          55.4133         -20.928     2020       1      16
  ⋮   │        ⋮                 ⋮           ⋮       ⋮       ⋮
 1994 │          55.4133         -20.928     2017      10      29
 1995 │          55.4123         -21.0184    2017      10      25
 1996 │          55.4123         -21.0184    2017      10      25
 1997 │          55.4123         -21.0184    2017      10      17
 1998 │          55.4123         -21.0184    2017      10      25
 1999 │          55.4123         -21.0184    2017      10      25
 2000 │          55.4123         -21.0184    2017      10      25

There are a number of ways to query the GBIF database for species, returning different numbers of results and amounts of data.

Species

Species objects and queries correspond closely to the GBIF species api.

GBIF2.Species Type
Species

Wrapper object for information returned by species, species_list, species_match or species_search queries. These are often species, but are more correctly taxa, as a result may be e.g. "Aves" for all birds. We use Species for naming consistency with the GBIF API.

Species also serve as rows in Table, and are converted to rows in a DataFrame or CSV automatically by the Tables.jl interface.

Species properties are accessed with ., e.g. sp.kingdom. Note that these queries do not all return all properties, and not all records contain all properties in any case. Missing properties simply return missing.
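
The "missing properties return missing" behaviour can be illustrated with a small stand-in type. This is only a sketch under stated assumptions: `Record` is hypothetical and is not how GBIF2 stores its data. It shows the access pattern and how `ismissing` and `coalesce` from Base handle absent fields.

```julia
# Hypothetical stand-in for a record wrapper; not the GBIF2 implementation.
struct Record
    data::Dict{Symbol,Any}
end

# Return the stored value, or `missing` when the field was not returned.
function Base.getproperty(r::Record, name::Symbol)
    return get(getfield(r, :data), name, missing)
end

rec = Record(Dict{Symbol,Any}(:species => "Coracina newtoni", :synonym => true))

name = rec.species                                  # a field that is present
absent = ismissing(rec.vernacularName)              # a field that is not
label = coalesce(rec.vernacularName, rec.species)   # fall back to the species name
```

The same `coalesce` pattern applies to any field that a particular query may leave out.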

The possible properties of a Species object are: (:kingdom, :phylum, :class, :order, :family, :genus, :species, :key, :nubKey, :nameKey, :taxonID, :sourceTaxonKey, :kingdomKey, :phylumKey, :classKey, :orderKey, :familyKey, :genusKey, :speciesKey, :datasetKey, :constituentKey, :scientificName, :canonicalName, :vernacularName, :parentKey, :parent, :basionymKey, :basionym, :authorship, :nameType, :rank, :origin, :taxonomicStatus, :nomenclaturalStatus, :remarks, :publishedIn, :numDescendants, :lastCrawled, :lastInterpreted, :issues, :synonym)

GBIF2.species_match Function
species_match(; kw...)

Query the GBIF species/match api, returning the single closest Species using fuzzy search.

The results are not particularly detailed. To get the full record, call species(res) on the result of species_match.

Example

using GBIF2
sp = species_match("Lalage newtoni")

# output
GBIF2.Species({
           "usageKey": 8385394,
   "acceptedUsageKey": 2486791,
     "scientificName": "Lalage newtoni (Pollen, 1866)",
      "canonicalName": "Lalage newtoni",
               "rank": "SPECIES",
             "status": "SYNONYM",
         "confidence": 98,
          "matchType": "EXACT",
            "kingdom": "Animalia",
             "phylum": "Chordata",
              "order": "Passeriformes",
             "family": "Campephagidae",
              "genus": "Coracina",
            "species": "Coracina newtoni",
         "kingdomKey": 1,
          "phylumKey": 44,
           "classKey": 212,
           "orderKey": 729,
          "familyKey": 9284,
           "genusKey": 2482359,
         "speciesKey": 2486791,
            "synonym": true,
              "class": "Aves"
})

Keywords

We use keywords exactly as in the GBIF api.

You can find keyword enum values with the GBIF2.enum function.

  • rank: Filters by taxonomic rank as given in our Rank enum
  • name: Name of the species
  • strict: If true it (fuzzy) matches only the given name, but never a taxon in the upper classification
  • verbose: If true it shows alternative matches which were considered but then rejected
  • kingdom: Optional kingdom classification accepting a canonical name.
  • phylum: Optional phylum classification accepting a canonical name.
  • class: Optional class classification accepting a canonical name.
  • order: Optional order classification accepting a canonical name.
  • family: Optional family classification accepting a canonical name.
  • genus: Optional genus classification accepting a canonical name.
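
The overview states that enum keys are checked before a query is sent. A minimal sketch of that kind of check follows. The names `ALLOWED` and `check_enum` are hypothetical, and the value lists are a small illustrative subset rather than the full GBIF enums.

```julia
# Hypothetical, incomplete value lists; the real ones come from the GBIF enums.
const ALLOWED = Dict(
    :continent => [:AFRICA, :ASIA, :EUROPE],
    :rank      => [:KINGDOM, :FAMILY, :GENUS, :SPECIES],
)

# Return `value` if it is acceptable for `keyword`, otherwise throw a clear error.
function check_enum(keyword::Symbol, value::Symbol)
    allowed = get(ALLOWED, keyword, nothing)
    allowed === nothing && return value   # not an enum keyword: nothing to check
    value in allowed && return value
    options = join(allowed, ", ")
    throw(ArgumentError("$value is not a valid value for $keyword. Use one of: $options"))
end

checked = check_enum(:rank, :SPECIES)   # passes and returns :SPECIES
```

Checking on the client side means a mistyped key fails immediately with a list of valid options, instead of producing an empty or misleading response from the server.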

GBIF2.species Function
species(key; kw...)
species(key, resulttype; kw...)

Query the GBIF species api, returning a single Species.

  • key: a species key, or Species object from another search that a key can be obtained from.
  • resulttype: set this so that, instead of a Species, species returns the result for one of (:verbatim, :name, :parents, :children, :related, :synonyms, :combinations, :descriptions, :distributions, :media, :references, :speciesProfiles, :vernacularNames, :typeSpecimens). The return value will be a raw JSON3.Object, but its propertynames can be checked and used to access data.

Example

Here we find a species with species_search, and then obtain the complete record with species.

julia> using GBIF2
julia> tbl = species_search("Falco punctatus")
20-element GBIF2.Table{GBIF2.Species, JSON3.Array{JSON3.Object, Vector{UInt8}, SubArray{
UInt64, 1, Vector{UInt64}, Tuple{UnitRange{Int64}}, true}}}
┌──────────┬──────────┬───────────────┬───────────────┬────────────┬─────────┬──────────
│  kingdom │   phylum │         class │         order │     family │   genus │         ⋯
│  String? │  String? │       String? │       String? │    String? │ String? │         ⋯
├──────────┼──────────┼───────────────┼───────────────┼────────────┼─────────┼──────────
│ Animalia │  missing │          Aves │ Falconiformes │ Falconidae │ missing │ Falco p ⋯
│ Animalia │  missing │          Aves │       missing │ Falconidae │   Falco │ Falco p ⋯
│  missing │  missing │          Aves │ Falconiformes │ Falconidae │   Falco │ Falco p ⋯
│  missing │ Chordata │       missing │       missing │ Falconidae │   Falco │ Falco p ⋯
│ Animalia │ Chordata │          Aves │ Falconiformes │ Falconidae │   Falco │ Falco p ⋯
│ Animalia │ Chordata │          Aves │ Falconiformes │ Falconidae │   Falco │ Falco p ⋯
│  Metazoa │ Chordata │          Aves │ Falconiformes │ Falconidae │   Falco │ Falco p ⋯
│ Animalia │  missing │       missing │ Falconiformes │ Falconidae │ missing │ Falco p ⋯
│  Metazoa │ Chordata │          Aves │ Falconiformes │ Falconidae │   Falco │ Falco p ⋯
│    ⋮     │    ⋮     │       ⋮       │       ⋮       │     ⋮      │    ⋮    │         ⋱
└──────────┴──────────┴───────────────┴───────────────┴────────────┴─────────┴──────────
                                                           35 columns and 11 rows omitted

And retrieve all the fields for one of the matches.

julia> species(tbl[6])
GBIF2.Species({
                   "key": 102091853,
                "nubKey": 2481005,
               "nameKey": 4400647,
               "taxonID": "175650",
               "kingdom": "Animalia",
                "phylum": "Chordata",
                 "order": "Falconiformes",
                "family": "Falconidae",
                 "genus": "Falco",
               "species": "Falco punctatus",
            "kingdomKey": 101683523,
             "phylumKey": 102017110,
              "classKey": 102085317,
              "orderKey": 102091762,
             "familyKey": 102091763,
              "genusKey": 102091765,
            "speciesKey": 102091853,
            "datasetKey": "9ca92552-f23a-41a8-a140-01abaa31c931",
             "parentKey": 102091765,
                "parent": "Falco",
        "scientificName": "Falco punctatus Temminck, 1821",
         "canonicalName": "Falco punctatus",
        "vernacularName": "Mauritius Kestrel",
            "authorship": "Temminck, 1821",
              "nameType": "SCIENTIFIC",
                  "rank": "SPECIES",
                "origin": "SOURCE",
       "taxonomicStatus": "ACCEPTED",
   "nomenclaturalStatus": [],
        "numDescendants": 0,
           "lastCrawled": "2022-10-10T18:15:33.989+00:00",
       "lastInterpreted": "2022-10-10T19:16:16.841+00:00",
                "issues": [
                            "SCIENTIFIC_NAME_ASSEMBLED"
                          ],
               "synonym": false,
                 "class": "Aves"
})

Keyword arguments

  • language: can be specified when calling with a single argument, or with a second argument in (:parents, :children, :related, :synonyms).
  • datasetKey: can be specified when the second argument is :related.

GBIF2.species_search Function
species_search([q]; kw...)

Query the GBIF species/search api, returning many results in a GBIF2.Table.

Example

using GBIF2
sp = species_search("Psittacula eques")

# output
20-element GBIF2.Table{GBIF2.Species, JSON3.Array{JSON3.Object, Vector{UInt8}, SubArray{UInt64, 1,
Vector{UInt64}, Tuple{UnitRange{Int64}}, true}}}
┌──────────┬──────────┬────────────────┬────────────────┬───────────────┬──────────────┬──
│  kingdom │   phylum │          class │          order │        family │        genus │ ⋯
│  String? │  String? │        String? │        String? │       String? │      String? │ ⋯
├──────────┼──────────┼────────────────┼────────────────┼───────────────┼──────────────┼──
│ Animalia │ Chordata │           Aves │ Psittaciformes │ Psittaculidae │   Psittacula │ ⋯
│ Animalia │  missing │           Aves │ Psittaciformes │ Psittaculidae │      missing │ ⋯
│  Metazoa │ Chordata │           Aves │ Psittaciformes │   Psittacidae │   Psittacula │ ⋯
│ Animalia │  missing │           Aves │ Psittaciformes │ Psittaculidae │      missing │ ⋯
│ Animalia │  missing │           Aves │ Psittaciformes │ Psittaculidae │      missing │ ⋯
│ Animalia │  missing │        missing │ Psittaciformes │ Psittaculidae │      missing │ ⋯
│  Metazoa │ Chordata │           Aves │ Psittaciformes │   Psittacidae │   Psittacula │ ⋯
│  missing │  missing │        missing │        missing │       missing │   Psittacula │ ⋯
│ Animalia │ Chordata │           Aves │ Psittaciformes │ Psittaculidae │   Psittacula │ ⋯
│ Animalia │  missing │           Aves │        missing │   Psittacidae │   Psittacula │ ⋯
│  Metazoa │ Chordata │           Aves │ Psittaciformes │   Psittacidae │   Psittacula │ ⋯
│ Animalia │ Chordata │           Aves │ Psittaciformes │ Psittaculidae │   Psittacula │ ⋯
│ ANIMALIA │ CHORDATA │ PSITTACIFORMES │           AVES │   PSITTACIDAE │ Alexandrinus │ ⋯
│  Metazoa │ Chordata │           Aves │ Psittaciformes │   Psittacidae │   Psittacula │ ⋯
│ Animalia │ Chordata │           Aves │ Psittaciformes │   Psittacidae │   Psittacula │ ⋯
│ Animalia │  missing │           Aves │ Psittaciformes │ Psittaculidae │   Psittacula │ ⋯
│    ⋮     │    ⋮     │       ⋮        │       ⋮        │       ⋮       │      ⋮       │ ⋱
└──────────┴──────────┴────────────────┴────────────────┴───────────────┴──────────────┴──
                                                              37 columns and 4 rows omitted

Keyword arguments

We use keywords exactly as in the GBIF api.

  • class: Optional class classification accepting a canonical name.
  • datasetKey: Filters by the checklist dataset key (a uuid)
  • facet: A list of facet names used to retrieve the 100 most frequent values for a field. Allowed facets are :datasetKey, :higherTaxonKey, :rank, :status, :nomenclaturalStatus, :isExtinct, :habitat, :threat and :nameType.
  • facetMincount: Used in combination with the facet parameter. Set facetMincount=N to exclude facets with a count less than N, e.g. facet=type, limit=0, facetMincount=10000 only shows the type value OCCURRENCE because :CHECKLIST and :METADATA have counts less than 10000.
  • facetMultiselect: Used in combination with the facet parameter. Set facetMultiselect=true to still return counts for values that are not currently filtered, e.g. facet=type, limit=0, type=CHECKLIST, facetMultiselect=true still shows type values OCCURRENCE and METADATA even though type is being filtered by type=:CHECKLIST.
  • family: Optional family classification accepting a canonical name.
  • genus: Optional genus classification accepting a canonical name.
  • habitat: Filters by the habitat. Currently only 3 major biomes are accepted in our Habitat enum
  • highertaxonKey: Filters by any of the higher Linnean rank keys. Note this is within the respective checklist and not searching nub keys across all checklists.
  • hl: Set hl=true to highlight terms matching the query when in fulltext search fields. The highlight will be an emphasis tag of class 'gbifH1', e.g. q="plant", hl=true. Fulltext search fields include title, keyword, country, publishing country, publishing organization title, hosting organization title, and description. One additional full text field is searched which includes information from metadata documents, but the text of this field is not returned in the response.
  • isExtinct: Filters by extinction status (a boolean, e.g. isExtinct=true)
  • issue: A specific indexing issue as defined in our NameUsageIssue enum
  • kingdom: Optional kingdom classification accepting a canonical name.
  • language: Language for vernacular names, given as an ISO 639-1 two-letter code.
  • nameType: Filters by the name type as given in our NameType enum
  • nomenclaturalStatus: Not yet implemented, but will eventually allow for filtering by a nomenclatural status enum
  • order: Optional order classification accepting a canonical name.
  • phylum: Optional phylum classification accepting a canonical name.
  • q: Simple full text search parameter. The value for this parameter can be a simple word or a phrase. Wildcards are not supported
  • rank: Filters by taxonomic rank as given in our Rank enum
  • sourceId: Filters by the source identifier
  • status: Filters by the taxonomic status as given in our TaxonomicStatus enum
  • strict: If true it (fuzzy) matches only the given name, but never a taxon in the upper classification
  • threat: Filters by the taxonomic threat status as given in our ThreatStatus enum
  • verbose: If true it shows alternative matches which were considered but then rejected
  • offset: Offset to start results from
  • limit: The maximum number of results to return. This can't be greater than 300; any greater value is set to 300.

GBIF2.species_list Function
species_list(; kw...)
species_list(key; kw...)
species_list(key, resulttype; kw...)

Query the GBIF species_list api, returning a table of Species that exactly match your query.

Example

using GBIF2
species_list(; name="Lalage newtoni")

# output
8-element GBIF2.Table{GBIF2.Species, JSON3.Array{JSON3.Object, Vector{UInt8}, SubArray{UInt64, 1, Vector{UInt64}, Tuple{UnitRange{Int64}}, true}}}
┌──────────┬──────────┬───────────────┬───────────────┬───────────────┬──────────┬──────────────────┬───────────┬─────────┬──────────┬──────────────┬────────────────┬───────
│  kingdom │   phylum │         class │         order │        family │    genus │          species │       key │  nubKey │  nameKey │      taxonID │ sourceTaxonKey │ king ⋯
│  String? │  String? │       String? │       String? │       String? │  String? │          String? │    Int64? │  Int64? │   Int64? │      String? │         Int64? │      ⋯
├──────────┼──────────┼───────────────┼───────────────┼───────────────┼──────────┼──────────────────┼───────────┼─────────┼──────────┼──────────────┼────────────────┼───────
│ Animalia │ Chordata │          Aves │ Passeriformes │ Campephagidae │ Coracina │ Coracina newtoni │   8385394 │ missing │ 18882488 │ gbif:8385394 │      176651982 │      ⋯
│ Animalia │  missing │          Aves │       missing │ Campephagidae │   Lalage │   Lalage newtoni │ 100144670 │ 8385394 │  5976204 │        06014 │        missing │  128 ⋯
│ Animalia │  missing │          Aves │ Passeriformes │ Campephagidae │  missing │   Lalage newtoni │ 133165079 │ 8385394 │  5976204 │        18380 │        missing │  135 ⋯
│ Animalia │ Chordata │          Aves │ Passeriformes │ Campephagidae │   Lalage │   Lalage newtoni │ 161400685 │ 8385394 │ 18882488 │       895898 │        missing │  134 ⋯
│  missing │  missing │       missing │       missing │       missing │ Bossiaea │   Lalage newtoni │ 165585935 │ missing │ 18882488 │      6924877 │        missing │    m ⋯
│ Animalia │  missing │          Aves │ Passeriformes │ Campephagidae │   Lalage │   Lalage newtoni │ 165923305 │ 8385394 │ 18882488 │        19393 │        missing │  100 ⋯
│ Animalia │ Chordata │          Aves │ Passeriformes │ Campephagidae │   Lalage │   Lalage newtoni │ 168010293 │ 8385394 │  5976204 │       181376 │        missing │  167 ⋯
│ Animalia │ Chordata │ Passeriformes │          Aves │ Campephagidae │   Lalage │   Lalage newtoni │ 176651982 │ 8385394 │ 18882488 │     22706569 │        missing │  202 ⋯
└──────────┴──────────┴───────────────┴───────────────┴───────────────┴──────────┴──────────────────┴───────────┴─────────┴──────────┴──────────────┴────────────────┴───────

Keyword arguments

We use keywords exactly as in the GBIF api.

You can find keyword enum values with the GBIF2.enum function.

  • language: Language for vernacular names, given as an ISO 639-1 two-letter code.
  • datasetKey: Filters by the checklist dataset key (a uuid)
  • sourceId: Filters by the source identifier
  • name: Name of the species
  • offset: Offset to start results from
  • limit: The maximum number of results to return. This can't be greater than 300; any greater value is set to 300.

Occurrence

Occurrence objects and queries correspond closely to the GBIF occurrence api.

GBIF2.Occurrence Type
Occurrence

Wrapper object for information returned about an occurrence by occurrence and occurrence_search queries. Occurrence also serves as rows in Table, and is converted to rows in a DataFrame or CSV automatically by the Tables.jl interface.

Occurrence properties are accessed with ., e.g. oc.country. Note that these queries do not all return all properties, and not all records contain all properties in any case. Missing properties simply return missing.

The possible properties of an Occurrence object are: (:geometry, :year, :month, :day, :kingdom, :phylum, :class, :order, :family, :genus, :species, :genericName, :taxonRank, :taxonomicStatus, :iucnRedListCategory, :elevation, :continent, :stateProvince, :eventDate, :decimalLongitude, :decimalLatitude, :key, :datasetKey, :publishingOrgKey, :installationKey, :publishingCountry, :protocol, :lastCrawled, :lastParsed, :crawlId, :hostingOrganizationKey, :extensions, :basisOfRecord, :individualCount, :occurrenceStatus, :taxonKey, :kingdomKey, :phylumKey, :classKey, :orderKey, :familyKey, :genusKey, :acceptedTaxonKey, :scientificName, :acceptedScientificName, :issues, :lastInterpreted, :license, :identifiers, :media, :facts, :relations, :gadm, :institutionKey, :isInCluster, :datasetName, :recordedBy, :inCluster, :geodeticDatum, :countryCode, :recordedByIDs, :identifiedByIDs, :country, :rightsHolder, :nomenclaturalStatus, :recordNumber, :identifier, :nomenclaturalCode, :county, :locality, :fieldNumber, :collectionCode, :gbifID, :occurrenceID, :type, :taxonID, :catalogNumber, :institutionCode, :ownerInstitutionCode, :bibliographicCitation, :collectionID, :earliestEraOrLowestErathem, :earliestPeriodOrLowestSystem, :higherClassification)

GBIF2.occurrence_search Function
occurrence_search(species::Species; kw...)
occurrence_search([q]; kw...)
occurrence_search(q, returntype; limit...)

Search for occurrences, returning a Table{Occurrence} table.

Example

Here we find a species with species_match, and then retrieve all the occurrences with occurrence_search.

julia> using GBIF2

julia> sp = species_match("Falco punctatus");

julia> ocs = occurrence_search(sp; continent=:AFRICA, limit=1000)
[ Info: 522 occurrences found, limit was 1000
522-element GBIF2.Table{GBIF2.Occurrence, Vector{JSON3.Object}}
┌──────────────────┬─────────────────┬────────┬────────┬────────┬────────────
│ decimalLongitude │ decimalLatitude │   year │  month │    day │  kingdom  ⋯
│         Float64? │        Float64? │ Int64? │ Int64? │ Int64? │  String?  ⋯
├──────────────────┼─────────────────┼────────┼────────┼────────┼────────────
│          missing │         missing │   2012 │      8 │     18 │ Animalia  ⋯
│          missing │         missing │   2010 │      1 │     29 │ Animalia  ⋯
│          57.2452 │        -20.2239 │   2009 │     10 │     26 │ Animalia  ⋯
│          57.2452 │        -20.2239 │   2009 │     11 │      5 │ Animalia  ⋯
│          57.2452 │        -20.2239 │   2009 │     11 │      5 │ Animalia  ⋯
│          57.2452 │        -20.2239 │   2009 │     11 │      4 │ Animalia  ⋯
│          57.2452 │        -20.2239 │   2009 │     11 │      5 │ Animalia  ⋯
│          57.2452 │        -20.2239 │   2009 │     11 │      4 │ Animalia  ⋯
│          57.7667 │          -19.85 │   2007 │      6 │     19 │ Animalia  ⋯
│          57.7667 │          -19.85 │   2007 │      6 │     19 │ Animalia  ⋯
│          57.7667 │          -19.85 │   2007 │      6 │     19 │ Animalia  ⋯
│          57.7667 │          -19.85 │   2007 │      6 │     19 │ Animalia  ⋯
│          57.7667 │          -19.85 │   2007 │      6 │     19 │ Animalia  ⋯
│          57.7667 │          -19.85 │   2007 │      6 │     19 │ Animalia  ⋯
│          57.7667 │          -19.85 │   2007 │      6 │     19 │ Animalia  ⋯
│        ⋮         │        ⋮        │   ⋮    │   ⋮    │   ⋮    │    ⋮      ⋱
└──────────────────┴─────────────────┴────────┴────────┴────────┴────────────
                                              78 columns and 507 rows omitted
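
The first rows of the table above have missing coordinates. One way to drop such rows locally is sketched here, using plain NamedTuples as stand-ins for table rows, with values copied from the output above. The stand-in rows and the helper `has_coords` are assumptions for illustration, not GBIF2 API.

```julia
# Stand-in rows with a few of the columns shown in the output above.
rows = [
    (decimalLongitude = missing, decimalLatitude = missing,  year = 2012),
    (decimalLongitude = 57.2452, decimalLatitude = -20.2239, year = 2009),
    (decimalLongitude = 57.7667, decimalLatitude = -19.85,   year = 2007),
]

# Keep only rows where both coordinates are present.
has_coords(r) = !ismissing(r.decimalLongitude) && !ismissing(r.decimalLatitude)
located = filter(has_coords, rows)

# Collect (longitude, latitude) pairs for plotting or spatial analysis.
points = [(r.decimalLongitude, r.decimalLatitude) for r in located]
```

Passing hasCoordinate=true to occurrence_search, as in the quick example, asks GBIF to apply this filter on the server, which avoids downloading records that would be discarded anyway.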

Arguments

  • q: a search query.

  • species: if the first value is a species, search keywords will be retrieved from it.

  • returntype: modify the return type, with a Symbol from:

    • :catalogNumber: Search that returns matching catalog numbers. Results are ordered by relevance.

    • :collectionCode: Search that returns matching collection codes. Results are ordered by relevance.

    • :occurrenceId: Search that returns matching occurrence identifiers. Results are ordered by relevance.

    • :recordedBy: Search that returns matching collector names. Results are ordered by relevance.

    • :recordNumber: Search that returns matching record numbers. Results are ordered by relevance.

    • :institutionCode: Search that returns matching institution codes. Results are ordered by relevance.

Keywords

We use parameters exactly as in the GBIF api.

You can find keyword enum values with the GBIF2.enum function.

GBIF range queries work by putting values in a Tuple, e.g. elevation=(10, 100).
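
In the GBIF web API a range is written as two comma-separated values, e.g. year=2000,2020. The sketch below shows how Tuple keywords could be turned into that form. `query_value` and `query_string` are hypothetical helpers, not GBIF2 functions, and percent-encoding of values is left out for brevity.

```julia
# Hypothetical helpers showing how keyword values could map to query values.
query_value(x) = string(x)
query_value(x::Symbol) = String(x)
# A two-element Tuple is treated as a range: (lo, hi) becomes "lo,hi".
query_value(r::Tuple{Any,Any}) = string(query_value(r[1]), ",", query_value(r[2]))

# Join keyword arguments into a query string, in the order they were given.
function query_string(; kw...)
    return join(("$k=$(query_value(v))" for (k, v) in pairs(kw)), "&")
end

qs = query_string(; year=(2000, 2020), elevation=(10, 100), country=:RE)
# qs == "year=2000,2020&elevation=10,100&country=RE"
```

Using a Tuple for ranges keeps single values and ranges visually distinct in Julia code, while both end up as ordinary query parameters.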

  • basisOfRecord: Basis of record, as defined in our BasisOfRecord enum
  • catalogNumber: An identifier of any form assigned by the source within a physical collection or digital dataset for the record which may not be unique, but should be fairly unique in combination with the institution and collection code.
  • classKey: Class classification key.
  • collectionCode: An identifier of any form assigned by the source to identify the physical collection or digital dataset uniquely within the context of an institution.
  • continent: Continent, as defined in our Continent enum
  • coordinateUncertaintyInMeters: The horizontal distance (in meters) from the given decimalLatitude and decimalLongitude describing the smallest circle containing the whole of the Location. Supports range queries.
  • country: The 2-letter country code (as per ISO-3166-1) of the country in which the occurrence was recorded.
  • crawlId: Crawl attempt that harvested this record.
  • datasetId: The ID of the dataset.
  • datasetKey: The occurrence dataset key (a uuid).
  • datasetName: The name of the dataset.
  • decimalLatitude: Latitude in decimals between -90 and 90 based on WGS 84. Supports range queries.
  • decimalLongitude: Longitude in decimals between -180 and 180 based on WGS 84. Supports range queries.
  • depth: Depth in meters relative to altitude. For example 10 meters below a lake surface with given altitude. Supports range queries.
  • elevation: Elevation (altitude) in meters above sea level. Supports range queries.
  • establishmentMeans: EstablishmentMeans, as defined in our EstablishmentMeans enum
  • eventDate: Occurrence date in ISO 8601 format: yyyy, yyyy-MM, yyyy-MM-dd, or MM-dd. Supports range queries.
  • eventId: An identifier for the information associated with a sampling event.
  • facet: A facet name used to retrieve the most frequent values for a field. Facets are allowed for all the parameters except for: eventDate, geometry, lastInterpreted, locality, organismId, stateProvince, waterBody. This parameter may be repeated to request multiple facets, as in this example: /occurrence/search?facet=datasetKey&facet=basisOfRecord&limit=0
  • facetMincount: Used in combination with the facet parameter. Set facetMincount=N to exclude facets with a count less than N, e.g. /search?facet=type&limit=0&facetMincount=10000 only shows the type value 'OCCURRENCE' because 'CHECKLIST' and 'METADATA' have counts less than 10000.
  • facetMultiselect: Used in combination with the facet parameter. Set facetMultiselect=true to still return counts for values that are not currently filtered, e.g. /search?facet=type&limit=0&type=CHECKLIST&facetMultiselect=true still shows type values 'OCCURRENCE' and 'METADATA' even though type is being filtered by type=CHECKLIST
  • facetOffset: Used together with facetLimit to page through facet values, as described under facetLimit.
  • facetLimit: Facet values can be paged using the parameters facetOffset and facetLimit, as in this example: /occurrence/search?facet=datasetKey&datasetKey.facetLimit=5&datasetKey.facetOffset=10&limit=0
  • familyKey: Family classification key.
  • format: Export format, accepts TSV (default) and CSV
  • fromDate: Start partial date of a date range, accepts the format yyyy-MM, for example: 2015-11
  • gadmGid: A GADM geographic identifier at any level, for example AGO, AGO.11, AGO.1.11 or AGO.1.1.1_1
  • gadmLevel: A GADM region level, valid values range from 0 to 3
  • gadmLevel0Gid: A GADM geographic identifier at the zero level, for example AGO
  • gadmLevel1Gid: A GADM geographic identifier at the first level, for example AGO.1_1
  • gadmLevel2Gid: A GADM geographic identifier at the second level, for example AFG.1.1_1
  • gadmLevel3Gid: A GADM geographic identifier at the third level, for example AFG.1.1.1_1
  • genusKey: Genus classification key.
  • geoDistance: Filters to match occurrence records with coordinate values within a specified distance of a coordinate. It supports the units in (inch), yd (yards), ft (feet), km (kilometers), mmi (nautical miles), mm (millimeters), cm (centimeters), mi (miles) and m (meters), for example /occurrence/search?geoDistance=90,100,5km
  • geometry: Searches for occurrences inside a polygon described in Well Known Text (WKT) format. Only POINT, LINESTRING, LINEARRING, POLYGON and MULTIPOLYGON are accepted WKT types. For example, a shape written as POLYGON ((30.1 10.1, 40 40, 20 40, 10 20, 30.1 10.1)) would be queried as is, i.e. /occurrence/search?geometry=POLYGON((30.1 10.1, 40 40, 20 40, 10 20, 30.1 10.1)). Polygons must have anticlockwise ordering of points, or will give unpredictable results. (A clockwise polygon represents the opposite area: the Earth's surface with a 'hole' in it. Such queries are not supported.)
  • hasCoordinate: Limits searches to occurrence records which contain a value in both latitude and longitude (i.e. hasCoordinate=true limits to occurrence records with coordinate values and hasCoordinate=false limits to occurrence records without coordinate values).
  • hasGeospatialIssue: Includes/excludes occurrence records which contain spatial issues (as determined in our record interpretation), i.e. hasGeospatialIssue=true returns only those records with spatial issues while hasGeospatialIssue=false includes only records without spatial issues. The absence of this parameter returns any record with or without spatial issues.
  • hl: Set hl=true to highlight terms matching the query when in fulltext search fields. The highlight will be an emphasis tag of class 'gbifH1' e.g. /search?q=plant&hl=true. Fulltext search fields include: title, keyword, country, publishing country, publishing organization title, hosting organization title, and description. One additional full text field is searched which includes information from metadata documents, but the text of this field is not returned in the response.
  • identifiedBy: The person who provided the taxonomic identification of the occurrence.
  • identifiedByID: Identifier (e.g. ORCID) for the person who provided the taxonomic identification of the occurrence.
  • institutionCode: An identifier of any form assigned by the source to identify the institution the record belongs to. Not guaranteed to be unique.
  • issue: A specific interpretation issue as defined in our OccurrenceIssue enum
  • kingdomKey: Kingdom classification key.
  • lastInterpreted: The date the record was last modified in GBIF, in ISO 8601 format: yyyy, yyyy-MM, yyyy-MM-dd, or MM-dd. Supports range queries. Note that this is the date the record was last changed in GBIF, not necessarily the date the record was first/last changed by the publisher. Data is re-interpreted when we change the taxonomic backbone, geographic data sources, or interpretation processes.
  • license: The type of license applied to the dataset or record.
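  • limit: The maximum number of results to return. The GBIF API returns at most 300 results per request; as described in the overview, GBIF2 accepts larger limits by making multiple requests and joining the results.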
  • locality: The specific description of the place.
  • mediaType: The kind of multimedia associated with an occurrence as defined in our MediaType enum
  • modified: The most recent date-time on which the resource was changed, according to the publisher
  • month: The month of the year, starting with 1 for January. Supports range queries.
  • networkKey: The GBIF Network to which the occurrence belongs.
  • occurrenceId: A single globally unique identifier for the occurrence record as provided by the publisher.
  • occurrenceStatus: Either 'ABSENT' or 'PRESENT'; the presence or absence of the occurrence.
  • offset: Offset to start results from
  • orderKey: Order classification key.
  • organismId: An identifier for the Organism instance (as opposed to a particular digital record of the Organism). May be a globally unique identifier or an identifier specific to the data set.
  • organismQuantity: A number or enumeration value for the quantity of organisms.
  • organismQuantityType: The type of quantification system used for the quantity of organisms.
  • otherCatalogNumbers: Previous or alternate fully qualified catalog numbers.
  • phylumKey: Phylum classification key.
  • preparations: Preparation or preservation method for a specimen.
  • programme: A group of activities, often associated with a specific funding stream, such as the GBIF BID programme.
  • projectId: The identifier for a project, which is often assigned by a funded programme.
  • protocol: Protocol or mechanism used to provide the occurrence record.
  • publishingCountry: The 2-letter country code (as per ISO-3166-1) of the owning organization's country.
  • publishingOrg: The publishing organization key (a uuid).
  • publishingOrgKey: The publishing organization key (a uuid).
  • q: Simple search parameter. The value for this parameter can be a simple word or a phrase.
  • recordedBy: The person who recorded the occurrence.
  • recordedByID: Identifier (e.g. ORCID) for the person who recorded the occurrence.
  • recordNumber: An identifier given to the record at the time it was recorded in the field.
  • relativeOrganismQuantity: The relative measurement of the quantity of the organism (i.e. without absolute units).
  • repatriated: Searches for records whose publishing country differs from the country in which the record was recorded.
  • sampleSizeUnit: The unit of measurement of the size (time duration, length, area, or volume) of a sample in a sampling event.
  • sampleSizeValue: A numeric value for a measurement of the size (time duration, length, area, or volume) of a sample in a sampling event.
  • samplingProtocol: The name of, reference to, or description of the method or protocol used during a sampling event
  • scientificName: A scientific name from the GBIF backbone. All included and synonym taxa are included in the search. Under the hood, a call to the species match service is made first to retrieve a taxonKey. Only unique scientific names will return results; homonyms (many monomials) return nothing! Consider using the taxonKey parameter instead, together with the species match service directly.
  • speciesKey: Species classification key.
  • stateProvince: The name of the next smaller administrative region than country (state, province, canton, department, region, etc.) in which the Location occurs.
  • subgenusKey: Subgenus classification key.
  • taxonKey: A taxon key from the GBIF backbone. All included and synonym taxa are included in the search, so a search for aves with taxonKey=212 (i.e. occurrence_search(; taxonKey=212)) will match all birds, no matter which species.
  • toDate: End partial date of a date range, accepts the format yyyy-MM, for example: 2019-12
  • typeStatus: Nomenclatural type (type status, typified scientific name, publication) applied to the subject.
  • userCountry: Country of the user who made the request.
  • verbatimScientificName: The scientific name provided to GBIF by the data publisher, before interpretation and processing by GBIF.
  • verbatimTaxonId: The taxon identifier provided to GBIF by the data publisher.
  • waterBody: The name of the water body in which the Location occurs.
  • year: The 4 digit year. A year of 98 will be interpreted as AD 98. Supports range queries.
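The limit and offset keywords interact whenever more than 300 records are wanted. As a rough sketch of how a large limit might be split into page-sized requests (the helper name page_ranges and the PAGE_SIZE constant are hypothetical and not part of GBIF2, whose internals may differ):

```julia
# Hypothetical helper, not part of GBIF2: split a large `limit` into
# (offset, limit) pairs that each respect the 300-record page cap.
const PAGE_SIZE = 300

function page_ranges(limit::Integer; offset::Integer=0, page_size::Integer=PAGE_SIZE)
    pages = Tuple{Int,Int}[]
    remaining = limit
    while remaining > 0
        n = min(remaining, page_size)   # size of this page
        push!(pages, (offset, n))
        offset += n                     # next page starts where this one ends
        remaining -= n
    end
    return pages
end

page_ranges(700)   # three pages: offsets 0, 300 and 600
```

Each pair could then be sent as one request and the results concatenated in order.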
GBIF2.occurrenceFunction
occurrence(key; [returntype])
occurrence(occurrence::Occurrence; [returntype])
occurrence(datasetKey, occurrenceID; [returntype])

Retrieve a single Occurrence by its key, by datasetKey and occurrenceID or by passing in an Occurrence object.

Keyword

  • returntype modifies the return value, and can be :fragment or :verbatim.

Example

using GBIF2
sp = species_match("Falco punctatus")
ocs = occurrence_search(sp)
oc = occurrence(ocs[1]; returntype=:verbatim)

# output
GBIF2.Occurrence({
                                              "key": 3556750430,
                                       "datasetKey": "4fa7b334-ce0d-4e88-aaae-2e0c138d049e",
                                 "publishingOrgKey": "e2e717bf-551a-4917-bdc9-4fa0f342c530",
                                  "installationKey": "7182d304-b0a2-404b-baba-2086a325c221",
                                "publishingCountry": "MU",
                                         "protocol": "DWC_ARCHIVE",
                                      "lastCrawled": "2022-03-02T17:41:33.833+00:00",
                                       "lastParsed": "2022-09-08T14:55:01.342+00:00",
                                          "crawlId": 15,
                                       "extensions": {},
   "http://rs.gbif.org/terms/1.0/publishingCountry": "MU",
             "http://rs.tdwg.org/dwc/terms/country": "Mauritius",
      "http://rs.tdwg.org/dwc/terms/collectionCode": "EBIRD",
               "http://rs.tdwg.org/dwc/terms/order": "Falconiformes",
                "http://rs.tdwg.org/dwc/terms/year": "2021",
      "http://rs.tdwg.org/dwc/terms/vernacularName": "Mauritius Kestrel",
            "http://rs.tdwg.org/dwc/terms/locality": "Ebony Forest Reserve Chamarel",
       "http://rs.tdwg.org/dwc/terms/basisOfRecord": "HumanObservation",
              "http://rs.tdwg.org/dwc/terms/family": "Falconidae",
               "http://rs.tdwg.org/dwc/terms/month": "07",
     "http://rs.tdwg.org/dwc/terms/decimalLatitude": "-20.436033",
      "http://rs.tdwg.org/dwc/terms/taxonConceptID": "avibase-D1069C26",
      "http://rs.tdwg.org/dwc/terms/scientificName": "Falco punctatus",
          "http://rs.tdwg.org/dwc/terms/recordedBy": "obsr2637790",
       "http://rs.tdwg.org/dwc/terms/stateProvince": "Black River",
              "http://rs.tdwg.org/dwc/terms/phylum": "Chordata",
              "http://rs.gbif.org/terms/1.0/gbifID": "3556750430",
                 "http://rs.tdwg.org/dwc/terms/day": "15",
               "http://rs.tdwg.org/dwc/terms/genus": "Falco",
             "http://rs.tdwg.org/dwc/terms/kingdom": "Animalia",
              "http://purl.org/dc/terms/identifier": "OBS1201437854",
               "http://rs.tdwg.org/dwc/terms/class": "Aves",
     "http://rs.tdwg.org/dwc/terms/individualCount": "1",
     "http://rs.tdwg.org/dwc/terms/specificEpithet": "punctatus",
        "http://rs.tdwg.org/dwc/terms/occurrenceID": "URN:catalog:CLO:EBIRD:OBS1201437854",
       "http://rs.tdwg.org/dwc/terms/catalogNumber": "OBS1201437854",
    "http://rs.tdwg.org/dwc/terms/decimalLongitude": "57.37246",
     "http://rs.tdwg.org/dwc/terms/institutionCode": "CLO",
       "http://rs.tdwg.org/dwc/terms/geodeticDatum": "WGS84",
    "http://rs.tdwg.org/dwc/terms/occurrenceStatus": "PRESENT"
})
GBIF2.occurrence_requestFunction
occurrence_request(sp::Species; kw...)
occurrence_request(; kw...)

Request an occurrence download, returning a token that will later provide a download url. You can call occurrence_download(token) when it is ready. Prior to that, you will get 404 errors.

Example

Here we request to download all of the occurrences of the Common Myna, Acridotheres tristis.

julia> sp = species_match("Acridotheres tristis");

julia> occurrence_count(sp)
1936341

julia> token = occurrence_request(sp, username="my_gbif_username")

This will prompt for your password, and either throw an error or return a value for the token to use later in occurrence_download.

If you forgot to store the token and your session is still open, you can simply use occurrence_download() to download the most recent request.

Keywords

  • username: String username for a gbif.org account
  • password: String password for a gbif.org account. The password will be entered in the REPL if this keyword is not used.
  • type: choose from an :and or :or query.

Allowed query keywords are: (:datasetKey, :year, :month, :decimalLatitude, :decimalLongitude, :elevation, :depth, :institutionCode, :collectionCode, :catalogNumber, :scientificName, :occurrenceID, :establishmentMeans, :degreeOfEstablishment, :pathway, :eventDate, :modified, :lastInterpreted, :basisOfRecord, :countryCode, :continent, :publishingCountry, :recordedBy, :identifiedBy, :recordNumber, :typeStatus, :hasCoordinate, :hasGeospatialIssues, :mediaType, :issue, :kingdomKey, :phylumKey, :classKey, :orderKey, :familyKey, :genusKey, :subgenusKey, :speciesKey, :acceptedTaxonKey, :taxonomicStatus, :repatriated, :organismID, :locality, :coordinateUncertaintyInMeters, :stateProvince, :waterBody, :level0Gid, :level1Gid, :level2Gid, :level3Gid, :protocol, :license, :publishingOrgKey, :hostingOrganizationKey, :crawlId, :installationKey, :networkKey, :eventID, :parentEventID, :samplingProtocol, :projectId, :programmeAcronym, :verbatimScientificName, :taxonID, :sampleSizeUnit, :sampleSizeValue, :organismQuantity, :organismQuantityType, :relativeOrganismQuantity, :collectionKey, :institutionKey, :recordedByID, :identifiedByID, :occurrenceStatus, :lifeStage, :isInCluster, :dwcaExtension, :iucnRedListCategory, :datasetID, :datasetName, :otherCatalogNumbers, :preparations)

Modifiers

Parameter values can modify the kind of match by using a pair: elevation = :lessThan => 100, or by using Julia Fix2 operators like elevation = >(100).

Pair key               Fix2     Description
:equals                ==(x)    equality comparison
:lessThan              <(x)     is less than
:lessThanOrEquals      <=(x)    is less than or equals
:greaterThan           >(x)     is greater than
:greaterThanOrEquals   >=(x)    is greater than or equals
:in                    in(x)    specify multiple values to be compared
:not                   !=(x)    logical negation
:and                   &(x)     logical AND (conjunction)
:or                    |(x)     logical OR (disjunction)
:like                           search for a pattern, ? matches one character, * matches zero or more characters

To pass instead of a value:

:isNull      has an empty value
:isNotNull   has a non-empty value
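The correspondence between Fix2 operators and pair keys can be sketched in plain Julia. This is only an illustration of the idea, assuming the comparison operators listed above; the names PREDICATE_NAMES and to_pair are hypothetical and GBIF2's own conversion may be implemented differently. It relies on the fact that a Base.Fix2 object stores its function in the field f and its fixed argument in the field x.

```julia
# Hypothetical sketch, not GBIF2's implementation: normalise a keyword value
# into a `predicate => value` pair.
const PREDICATE_NAMES = Dict{Function,Symbol}(
    (==) => :equals,
    (<)  => :lessThan,
    (<=) => :lessThanOrEquals,
    (>)  => :greaterThan,
    (>=) => :greaterThanOrEquals,
    in   => :in,
    (!=) => :not,
)

# `<(100)` is a `Base.Fix2` holding the function `<` and the value `100`.
to_pair(f::Base.Fix2) = PREDICATE_NAMES[f.f] => f.x
# An explicit pair such as `:lessThan => 100` is passed through unchanged.
to_pair(p::Pair{Symbol}) = p
# A bare value is treated as an equality match.
to_pair(x) = :equals => x

to_pair(<(100))            # same result as to_pair(:lessThan => 100)
```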

GBIF2.occurrence_downloadFunction
occurrence_download([key::String]; [filename])

Download the data for an occurrence key returned by occurrence_request, or without arguments download the last result of occurrence_request.

Note that occurrence_download depends on gbif.org preparing the download. Until the download is ready, it will give a 404 error because the page will not be found.

The filename keyword can be used to name the resulting file.

Example

Request all the Common Myna occurrences below 100 m of elevation:

sp = species_match("Acridotheres tristis");
token = occurrence_request(sp, username="my_gbif_username", elevation=<(100))
write("mydownloadtoken", string(token)) # save it just in case
# wait for your download to be prepared
# If you need to, read the token again:
token = readlines("mydownloadtoken")[1]
# And download it
filename = GBIF2.occurrence_download(token)
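
Because the download returns 404 errors until gbif.org has prepared it, one option is to retry at intervals. The following is a minimal sketch under stated assumptions: retry_until_ready is a hypothetical helper, not part of GBIF2, and for simplicity it retries on any error, whereas a more careful version would inspect the error and retry only on a 404.

```julia
# Hypothetical retry helper, not part of GBIF2. `f` is any zero-argument
# function that throws while the download is not ready and returns a value
# once it is.
function retry_until_ready(f; attempts::Integer=10, delay::Real=60)
    for i in 1:attempts
        try
            return f()
        catch
            # Give up and surface the last error after the final attempt.
            i == attempts && rethrow()
            @info "Download not ready yet, retrying" attempt=i
            sleep(delay)   # seconds between attempts
        end
    end
end

# Intended usage with the token from the example above:
# filename = retry_until_ready(() -> GBIF2.occurrence_download(token))
```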
GBIF2.occurrence_countFunction
occurrence_count(species::Species; kw...)
occurrence_count(; kw...)

Count the number of occurrences for a taxon.

Example

julia> sp = species_match("Pteropus niger");

julia> occurrence_count(sp)
559

Keywords

  • taxonKey: is the most useful key when a Species is not passed.

Occurrence counts have a complicated schema of allowed keyword combinations. You can access these from the GBIF api using GBIF2.occurrence_count_schema().

GBIF2.occurrence_count_schemaFunction
occurrence_count_schema()

Return the raw schema of allowed keyword/parameter combinations to be used with occurrence_count.

GBIF2.occurrence_inventoryFunction
occurrence_inventory(type::Symbol; kw...)

Return the number of occurrences for a taxon based on certain criteria. The return value is a JSON3.jl object.

Example

julia> country_counts = occurrence_inventory(:countries)
JSON3.Object{Vector{UInt8}, Vector{UInt64}} with 252 entries:
  :UNITED_STATES  => 816855696
  :CANADA         => 133550183
  :FRANCE         => 119683053
  :SWEDEN         => 116451484
  :AUSTRALIA      => 115239040
  :UNITED_KINGDOM => 108417142
  :NETHERLANDS    => 85750415
  :SPAIN          => 54804973
  :DENMARK        => 49334935
  :GERMANY        => 49290658
  :NORWAY         => 48337823
  :FINLAND        => 36970838
  :BELGIUM        => 35064346
  :SOUTH_AFRICA   => 33185318
  :INDIA          => 32071907
  :MEXICO         => 26295593
  :BRAZIL         => 24558941
  :COSTA_RICA     => 21103286
  :COLOMBIA       => 20143253
  :SWITZERLAND    => 17727317
  :PORTUGAL       => 17688228
  ⋮               => ⋮

julia> country_counts.INDIA
32071907

Keywords

  • type: inventory across categories, with additional keywords from:

(basisOfRecord = (), countries = :publishingCountry, datasets = (:country, :taxonKey), publishingCountry = (:publishingCountry,), year = (:year,))

Occurrence counts have a complicated schema of allowed keyword combinations. You can access these from the GBIF api using occurrence_count_schema().

Low level

Species and occurrences are returned in a generalised Table object.

GBIF2.TableType
Table <: AbstractVector

A generic Vector and Tables.jl compatible table to hold both Occurrence and Species data.

Use it like any Julia AbstractArray to access species or occurrence records, or use it with the Tables.jl interface to convert to a DataFrame or write to e.g. CSV.
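
To illustrate what a vector of records with missing-aware field access looks like, here is a small self-contained sketch. MiniRecord and MiniTable are hypothetical stand-ins written for this example only; GBIF2's real Table, Occurrence and Species types are implemented differently and also support the Tables.jl interface, which this sketch omits.

```julia
# Hypothetical stand-ins for illustration only; GBIF2's real types differ.
struct MiniRecord
    fields::Dict{Symbol,Any}
end

# Return `missing` for any field the query did not supply.
Base.getproperty(r::MiniRecord, name::Symbol) =
    name === :fields ? getfield(r, :fields) : get(getfield(r, :fields), name, missing)

struct MiniTable <: AbstractVector{MiniRecord}
    rows::Vector{MiniRecord}
end

# `size` and `getindex` are enough for indexing, iteration and `length`.
Base.size(t::MiniTable) = size(t.rows)
Base.getindex(t::MiniTable, i::Int) = t.rows[i]

t = MiniTable([
    MiniRecord(Dict(:year => 2020, :country => "RE")),
    MiniRecord(Dict(:year => 2019)),
])

years = [r.year for r in t]          # every row has a year
countries = [r.country for r in t]   # second entry is `missing`
```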

GBIF2.enumFunction
enum()
enum(k::Symbol)

Get the enum values for the keyword k, or for all enums in a NamedTuple.

Enum keywords are in (:basisOfRecord, :continent, :establishmentMeans, :habitat, :issue, :nameType, :rank, :threat, :taxonomicStatus).
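
Checking a keyword value against its allowed set before sending a query can be sketched as follows. check_enum is a hypothetical function, not GBIF2's implementation, and the allowed values are supplied by hand from the occurrenceStatus description above; in practice they would come from enum(k).

```julia
# Hypothetical validator, not GBIF2's implementation.
function check_enum(key::Symbol, value, allowed)
    v = Symbol(value)
    v in allowed && return v
    # Fail early with a message listing the valid choices.
    throw(ArgumentError(
        "$value is not a valid value for $key. Choose from: $(join(allowed, ", "))"
    ))
end

# Allowed values taken from the occurrenceStatus keyword description.
allowed_status = (:PRESENT, :ABSENT)

check_enum(:occurrenceStatus, "PRESENT", allowed_status)
```

Passing a value outside the allowed set, such as "MAYBE", throws an ArgumentError instead of sending an invalid query.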