`neotomapydoi`: Minting and Managing Neotoma PIDs

Note

This project is only intended for Neotoma data administrators.

Although anyone can access the code, it will not mint Neotoma DOIs without the proper authorization. Minting requires valid credentials for the DataCite system and for the production Neotoma database server. Without these, you may be able to assemble DOI metadata from a database snapshot and examine or manipulate it yourself, but you will not be able to pull the most current Neotoma data or mint DOIs using the Neotoma DOI shoulder (10.21233/).

Introduction

Neotoma stores information about tens of thousands of datasets from around the world, including spatial, temporal and observational data about the taxa, chemistry and physical properties of samples found in sedimentary archives. These records are exposed through the Neotoma R package (neotoma2), the Neotoma API, Neotoma Explorer, and, more broadly, through their Digital Object Identifiers (DOIs).

DOIs managed by DataCite have a defined metadata schema, which allows Neotoma to provide dataset terms in a form that can be easily searched and returned by users around the globe, without the need for detailed knowledge of Neotoma or its database schema. The DOIs (e.g., https://doi.org/10.21233/znex-sp94) return users to the Neotoma Landing Pages, where they can download records in JSON format and examine additional metadata about the records.
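As a small illustration of how a DOI breaks down, the sketch below splits a DOI into its prefix (the shoulder) and suffix, then builds the resolver URL and a DataCite REST API URL for its metadata record. The helper names `split_doi` and `doi_urls` are hypothetical and are not part of this project; the code only builds strings and makes no network calls.

```python
from urllib.parse import quote

DOI_RESOLVER = "https://doi.org"
# Public DataCite REST API endpoint for DOI records.
DATACITE_API = "https://api.datacite.org/dois"


def split_doi(doi: str) -> tuple[str, str]:
    """Split a DOI (bare, 'doi:' prefixed, or a doi.org URL) into (prefix, suffix)."""
    bare = doi.strip()
    for lead in ("https://doi.org/", "http://doi.org/", "doi:"):
        if bare.lower().startswith(lead):
            bare = bare[len(lead):]
            break
    prefix, sep, suffix = bare.partition("/")
    if not sep or not prefix.startswith("10.") or not suffix:
        raise ValueError(f"Not a valid DOI: {doi!r}")
    return prefix, suffix


def doi_urls(doi: str) -> dict[str, str]:
    """Return the resolver URL and the DataCite metadata URL for a DOI."""
    prefix, suffix = split_doi(doi)
    bare = f"{prefix}/{suffix}"
    return {
        "resolver": f"{DOI_RESOLVER}/{bare}",
        "metadata": f"{DATACITE_API}/{quote(bare, safe='/')}",
    }


urls = doi_urls("https://doi.org/10.21233/znex-sp94")
```

Here `urls["resolver"]` is the link that redirects to the landing page, and `urls["metadata"]` is where the DataCite record for the same DOI could be requested.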

sequenceDiagram
    participant Batch@{ "type":"boundary" }
    participant neotomaPyDOI@{"type":"control"}
    participant NeotomaDB@{ "type" : "database" }
    participant DataCite@{"type":"boundary"}

    Note left of Batch: Trigger to run daily using Fargate
    Batch->>neotomaPyDOI: Trigger a run (cron)
    neotomaPyDOI->>NeotomaDB: Check for new datasets without DOIs
    NeotomaDB->>neotomaPyDOI: Return datasetids
    neotomaPyDOI->>NeotomaDB: Prepare dataset metadata (freeze datasets)
    NeotomaDB->>neotomaPyDOI: Query returns metadata
    neotomaPyDOI->>+DataCite: POST DOI metadata
    DataCite->>-neotomaPyDOI: Return DOIs and formatted metadata
    neotomaPyDOI->>NeotomaDB: INSERT dois and DOI metadata
    create participant AWS_S3_Logs
    neotomaPyDOI->>AWS_S3_Logs: Log results
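
The control flow in the diagram can be sketched roughly as follows. This is an in-memory illustration only: `FakeNeotoma`, `fake_datacite_post` and `run_once` are hypothetical stand-ins for the database, DataCite and the scheduled job, and the placeholder prefix 10.00000 is used instead of the real shoulder.

```python
from dataclasses import dataclass, field


@dataclass
class FakeNeotoma:
    """In-memory stand-in for the Neotoma database (hypothetical)."""
    datasets: dict[int, dict]                           # datasetid -> metadata
    dois: dict[int, str] = field(default_factory=dict)  # datasetid -> DOI

    def datasets_without_doi(self) -> list[int]:
        """Return dataset ids that have no DOI yet."""
        return sorted(d for d in self.datasets if d not in self.dois)

    def frozen_metadata(self, datasetid: int) -> dict:
        """Return a copy of the metadata, standing in for a frozen dataset."""
        return dict(self.datasets[datasetid])

    def store_doi(self, datasetid: int, doi: str) -> None:
        """Record the DOI, standing in for the INSERT step."""
        self.dois[datasetid] = doi


def fake_datacite_post(metadata: dict) -> str:
    """Pretend to POST metadata and return a DOI under a placeholder prefix."""
    return f"10.00000/fake-{metadata['datasetid']}"


def run_once(db: FakeNeotoma, post, log: list[str]) -> dict[int, str]:
    """One scheduled run: find datasets, mint DOIs, store them, log results."""
    minted = {}
    for datasetid in db.datasets_without_doi():
        metadata = db.frozen_metadata(datasetid)
        doi = post(metadata)
        db.store_doi(datasetid, doi)
        log.append(f"dataset {datasetid}: minted {doi}")
        minted[datasetid] = doi
    return minted


db = FakeNeotoma(
    datasets={1: {"datasetid": 1}, 2: {"datasetid": 2}},
    dois={1: "10.00000/existing"},
)
log: list[str] = []
minted = run_once(db, fake_datacite_post, log)
```

Because each run starts by asking only for datasets without DOIs, running the job again immediately afterwards mints nothing in this sketch.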

Development

Simon Goring: University of Wisconsin - Madison

Contribution

We welcome user contributions to this project. All contributors are expected to follow the code of conduct. Contribution guidelines can be found in the Contributing document. Contributors should fork this project and make a pull request indicating the nature of the changes and the intended utility. Further information for this workflow can be found on the GitHub Pull Request Tutorial webpage.

Using `neotomadoi`

Requirements

Python 3.12 in a Linux or macOS environment.
A valid connection to the Neotoma Paleoecology Database, either in the cloud (AWS) or locally (see the Neotoma Snapshot documentation).
All packages defined in the pyproject.toml file (install them with uv, for example with the uv sync command).
Valid DataCite credentials.

Credential Storage

All credentials should be stored in a .env file. We provide .env-template as an example. Copy it to .env and replace the placeholder values with your own credentials and connection strings for these environment variables:

DBAUTH={"host":"localhost","port":5432,"user":"postgres","password":"postgres","database":"neotoma"}
DCITE={"user": "USER","mode": {"test": {"handle": "10.00000","pw": "SANDBOX_PASSWORD"},"prod": {"handle": "10.00001","pw": "PRODUCTION_PASSWORD"}}}
DBAUTH_TEST={"host":"localhost","port":5432,"user":"postgres","password":"postgres","database":"neotoma_test"}
UV_PUBLISH_TOKEN="pypi-LONG_TEST_STRING"
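
Since DBAUTH and DCITE hold JSON strings, they can be decoded with the standard library. The sketch below is one possible way to do that, not the project's actual loader: `load_settings` is a hypothetical helper, and the values are the placeholders from the template above. In real use the mapping passed in would typically be `os.environ` after the .env file has been loaded.

```python
import json

# Placeholder values copied from the template; never commit real credentials.
ENV_EXAMPLE = {
    "DBAUTH": '{"host":"localhost","port":5432,"user":"postgres",'
              '"password":"postgres","database":"neotoma"}',
    "DCITE": '{"user": "USER","mode": {'
             '"test": {"handle": "10.00000","pw": "SANDBOX_PASSWORD"},'
             '"prod": {"handle": "10.00001","pw": "PRODUCTION_PASSWORD"}}}',
}


def load_settings(env: dict[str, str], mode: str = "test") -> dict:
    """Decode the JSON-valued variables and pick DataCite credentials for a mode."""
    db = json.loads(env["DBAUTH"])
    dcite = json.loads(env["DCITE"])
    if mode not in dcite["mode"]:
        raise ValueError(f"Unknown DataCite mode: {mode!r}")
    creds = dcite["mode"][mode]
    return {
        "db": db,
        "datacite": {
            "user": dcite["user"],
            "handle": creds["handle"],
            "pw": creds["pw"],
        },
    }


settings = load_settings(ENV_EXAMPLE, mode="test")
```

Defaulting to the "test" mode in this sketch means production credentials are only selected when they are asked for explicitly.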

Minting DOIs

Getting Help

uv run ndbdoi.py -h

The ndbdoi.py module uses argparse to manage command-line arguments. You can get help at any time with the -h flag. The help output shows one main operation, minting, selected with -m. At times, however, we simply want to test that the minting process runs and sends data to the DataCite Sandbox. In that case, use the -t or --tank flag.
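
For readers unfamiliar with argparse, a parser with these flags might look something like the sketch below. It is an illustration, not the contents of ndbdoi.py: the help strings, the `dest` name and the choice to make the two flags mutually exclusive are all assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with a minting flag and a sandbox ("tank") flag."""
    parser = argparse.ArgumentParser(
        prog="ndbdoi.py",
        description="Mint DOIs for Neotoma datasets (illustrative sketch).",
    )
    # Assumption: a run either mints for real or tests against the sandbox.
    group = parser.add_mutually_exclusive_group()
    group.add_argument(
        "-m", dest="mint", action="store_true",
        help="mint DOIs (hypothetical help text)",
    )
    group.add_argument(
        "-t", "--tank", action="store_true",
        help="test minting against the Holding Tank and the DataCite Sandbox",
    )
    return parser


args = build_parser().parse_args(["--tank"])
```

argparse adds the -h flag automatically, so the help text is generated from the argument definitions without extra code.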

Sandbox Minting

Before minting DOIs for datasets, test the minting process using the Neotoma Holding Tank and the DataCite Sandbox:

uv run ndbdoi.py --tank
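
To give a sense of what a sandbox submission involves, the sketch below prepares (but does not send) a POST request to the DataCite test API. It assumes HTTP basic authentication and a JSON:API body, as used by the DataCite REST API; `sandbox_request` is a hypothetical helper, the credentials and prefix are the placeholders from .env-template, and the attributes shown are not the metadata this project actually submits.

```python
import base64
import json
import urllib.request

# DataCite test (sandbox) endpoint for DOI records.
SANDBOX_DOIS = "https://api.test.datacite.org/dois"


def sandbox_request(user: str, pw: str, prefix: str,
                    attributes: dict) -> urllib.request.Request:
    """Build an unsent POST request carrying DOI metadata for the sandbox."""
    payload = {
        "data": {
            "type": "dois",
            "attributes": {"prefix": prefix, **attributes},
        }
    }
    token = base64.b64encode(f"{user}:{pw}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        SANDBOX_DOIS,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/vnd.api+json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


request = sandbox_request(
    "USER", "SANDBOX_PASSWORD", "10.00000",
    {"titles": [{"title": "Example dataset (placeholder)"}]},
)
```

Building the request separately from sending it makes it possible to inspect the URL, headers and body before anything reaches DataCite.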

Neotoma DOI Metadata

The Neotoma API metadata can be seen on all Neotoma Landing Pages with minted DOIs.
