Meet the FREYA partners: PANGAEA
Many different organisations are involved in FREYA and in this blog post series we take a closer look at the partners and their work. This time you can read about PANGAEA, an Open Access library aimed at archiving, publishing and distributing geo-referenced data from earth system research.
What is the mission of your organisation?
The World Data Center PANGAEA is operated as an Open Access library aimed at archiving, publishing and distributing geo-referenced data from earth system research. The system guarantees long-term availability of its content through a commitment of the hosting institutions, the Center for Marine Environmental Sciences, University of Bremen (MARUM) in Bremen and the Alfred Wegener Institute, Helmholtz Center for Polar and Marine Research (AWI) in Bremerhaven, Germany.
Most of the data in PANGAEA are freely available and can be used under the terms of the license stated in the data set description. A few password-protected data sets from ongoing projects are under moratorium. The description of each data set is always visible and includes the principal investigator (PI), who may be asked for access to the data.
Each dataset can be identified, shared, published and cited by using a Digital Object Identifier (DOI Name). PANGAEA also allows data to be published as supplements to scientific articles or as citable data collections in combination with data journals like ESSD, Geoscience Data Journal, Scientific Data, or others.
The PANGAEA data editorial team ensures the integrity and authenticity as well as high usability of your data. Archived data are machine readable and mirrored into our data warehouse, which allows efficient compilations of data.
PANGAEA is a member of the World Data System (WDS) of the International Science Council (ISC). It also hosts the World Radiation Monitoring Center (WRMC) of the Baseline Surface Radiation Network (BSRN) and is as such accredited as a "Data Collection and Processing Center" (DCPC) of the World Meteorological Organisation (WMO) Information System (WIS). PANGAEA is a CoreTrustSeal certified repository.
PANGAEA is open to any project, institution, or individual scientist to use or to archive and publish data.
PANGAEA is operated by a team of data editors, project managers, and IT specialists. Our editors are scientists with expertise in all fields of earth and environmental science and have a profound knowledge of the review and processing of scientific data. Most PANGAEA staff are located at the University Campus in Bremen, at the AWI in Bremerhaven, or at GEOMAR in Kiel. However, several PANGAEA staff members are located as far north as Finland and as far east as China.
Project managers Tina Dohna (Marine Ecologist, PhD) and Ketil Koop-Jakobsen (Biochemist, PhD) and IT specialist Uwe Schindler are responsible for the work carried out in the FREYA project. They ensure that FREYA output is integrated with the PANGAEA system and that relevant feedback from our marine and earth systems user community is fed back into the project.
Why are PIDs important (for your organisation)?
As data publishers, we are very much focused on offering FAIR and sustainable data publications to all our authors and users. FAIR stands for Findable, Accessible, Interoperable, and Reusable, and represents a global effort to increase the value and impact of the large amounts of data collected daily. By making sure that data is FAIR when published, we support this effort and provide the scientific community with the data needed to help answer pressing environmental questions. PIDs can help dataset authors to ensure their data is properly acknowledged and cited and can also increase the findability, interoperability and reusability of their data. Once the appropriate PIDs are used to describe the different metadata components of a dataset (e.g. authors, institutes, samples, sensors), other users can more readily find the dataset and assess its utility for their own purposes. ‘Big data’ analysis relies on machine readability of information, and here too PIDs, with underlying standardized metadata schemas, are crucial.
What do you do in FREYA?
The PANGAEA team is involved in tasks for PID landscaping and maturation, as well as implementation of new PIDs and the PID graph in pilot applications. We adopt new PIDs (like ROR institutional IDs) by including them in our metadata schema, which is used to describe datasets found in PANGAEA. We also integrate them as part of our internal and more extensive PANGAEA schema. Our role is to help mature and implement new PIDs and to develop tools that enable users to ‘surf’ the PID graph, looking for similar or related data or running queries to produce statistics on data use/reuse and impact. For example, we have developed an app that can help scientists who use or plan to use the institute's sediment core repository. This cooled storage center (in the photo below) houses thousands of marine sediment core fragments, which are collected during ocean drilling expeditions.
The app can be used to scan the sample barcode (or enter the sample PID) to find other fragments collected from the same bore (drilling) hole or other cores that were collected in the same area. It can also help users find related publications and people who have worked on the cores, in addition to funding information. To access the tool, go to https://dataportals.pangaea.de/freya/igsn/.
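As a rough illustration of the idea (not the code behind the FREYA app), the lookup described above can be thought of as a walk over a graph whose nodes are PIDs. The sketch below is in Python and rests on several assumptions: all identifiers are invented placeholders, the relation names and the "hole:" prefix are hypothetical, and the graph is held in memory rather than queried from a service.

```python
from collections import defaultdict, deque

# Hypothetical PID graph. Every identifier below is an invented placeholder,
# not a real IGSN, DOI, ORCID iD, or funder ID. The "hole:" prefix and the
# relation names are assumptions made for this sketch.
EDGES = [
    ("igsn:EXAMPLE-CORE-1", "collected_from", "hole:EXAMPLE-HOLE-A"),
    ("igsn:EXAMPLE-CORE-2", "collected_from", "hole:EXAMPLE-HOLE-A"),
    ("igsn:EXAMPLE-CORE-3", "collected_from", "hole:EXAMPLE-HOLE-B"),
    ("doi:EXAMPLE-DATASET-1", "describes", "igsn:EXAMPLE-CORE-1"),
    ("orcid:EXAMPLE-AUTHOR-1", "authored", "doi:EXAMPLE-DATASET-1"),
    ("funder:EXAMPLE-FUNDER-1", "funded", "doi:EXAMPLE-DATASET-1"),
]


def build_graph(edges):
    """Store each link in both directions so the graph can be walked either way."""
    graph = defaultdict(set)
    for source, relation, target in edges:
        graph[source].add((relation, target))
        graph[target].add((relation, source))
    return graph


def siblings(graph, sample_pid):
    """Return the other samples collected from the same hole(s) as sample_pid."""
    holes = {
        node for relation, node in graph.get(sample_pid, ())
        if relation == "collected_from"
    }
    found = set()
    for hole in holes:
        for relation, node in graph.get(hole, ()):
            if relation == "collected_from" and node != sample_pid:
                found.add(node)
    return sorted(found)


def related(graph, start, max_hops=2):
    """Breadth-first search: every PID reachable within max_hops links of start."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue
        for _relation, neighbour in graph.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, hops + 1))
    seen.discard(start)
    return sorted(seen)


if __name__ == "__main__":
    graph = build_graph(EDGES)
    print(siblings(graph, "igsn:EXAMPLE-CORE-1"))
    print(related(graph, "igsn:EXAMPLE-CORE-1"))
```

In this toy graph, starting from one core fragment reaches the second fragment from the same hole, the dataset describing the fragment, and that dataset's author and funder within two links. A fragment from a different hole stays out of reach.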
Central to this type of tool are the sample PIDs we have for each core fragment (IGSNs) and other PIDs available for authors (ORCIDs), dataset and literature publications (DOIs), and funders (Crossref Funder IDs). Here is an example output of the app from scanning a core barcode:
What would your perfect (PID) world look like?
For us as a data publisher, PIDs and the linkage of PIDs are of great importance, because they give our data users easy access to linked metadata about specific datasets. Hence, the current expansion and tighter linkage of the PID landscape, which is generated through the work of FREYA, provides new opportunities for PANGAEA to implement PIDs as part of the metadata in our published datasets. In the ideal PID world, all aspects necessary to generate a scientific dataset would have a PID. This includes not only the researchers, but also the organization behind the research, the funder paying for the research, the instrumentation used to generate the data, the protocol for operating the instrument, and many more.
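To make this ideal concrete, here is a minimal Python sketch of a dataset description in which each contributing component may or may not carry a PID, together with a check that reports the gaps. The `Component` class, the role names, and all identifier values are hypothetical placeholders chosen for illustration; they do not reflect the actual PANGAEA metadata schema.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Component:
    """One contributor to a dataset (hypothetical structure for this sketch)."""
    role: str                  # e.g. "researcher", "funder", "instrument"
    name: str                  # human-readable label
    pid: Optional[str] = None  # persistent identifier, if one has been assigned


def missing_pids(components: List[Component]) -> List[str]:
    """Return the roles of all components that do not yet have a PID."""
    return [component.role for component in components if not component.pid]


# All names and identifiers below are invented placeholders.
EXAMPLE_RECORD = [
    Component("researcher", "Example Author", "orcid:EXAMPLE-AUTHOR-1"),
    Component("organization", "Example Institute", "ror:EXAMPLE-ORG-1"),
    Component("funder", "Example Funder", "funder:EXAMPLE-FUNDER-1"),
    Component("instrument", "Example Sensor"),          # no PID assigned yet
    Component("protocol", "Example Operating Manual"),  # no PID assigned yet
]

if __name__ == "__main__":
    print(missing_pids(EXAMPLE_RECORD))
```

In the ideal PID world described above, the returned list would be empty for every published dataset.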
More information
Website: https://www.pangaea.de/
Follow us on Twitter: @PANGAEAdataPubl