Information Management and Data Science at the Jornada

Resources for researchers and data managers at the Jornada Basin LTER and USDA-ARS Jornada Experimental Range. Contact: jornada.data@nmsu.edu

This page was written by Jornada information managers (IMs) to provide a concise but fairly comprehensive guide to research administration and data management for Jornada students and investigators, including PIs, postdocs, graduate students, undergraduate researchers, and staff scientists.

Research approval and policies

Most new research projects are initiated through the Jornada Research Site Manager, John Anderson (janderso@nmsu.edu). The JRN and JER websites are the best sources of information on this process and have the current forms. To begin the approval process, a potential researcher submits a Research Notification (ResNotif) form to John Anderson. The form describes the planned location, research activities, personnel, and other important project information. After review, and after any necessary changes to the plan are made, the project may be approved to begin.

All approved research projects at the Jornada must agree to a data sharing and acknowledgement policy. The requirements for Jornada researchers are, again, detailed on the JRN or JER websites, but to summarize, researchers should:

  1. Submit research data and metadata for publication
    • We require most researchers to submit data and metadata yearly to our research data archive. We publish this data in open-access repositories at the time research results are peer-reviewed and published, or no later than 2 years after collection. This keeps the program in step with the LTER data policy and other relevant guidelines.
  2. Keep research project information up to date
    • We ask that approved projects submit any major post-approval changes to the project as a revised ResNotif form, or at a minimum regularly provide updates on the project, its personnel, and progress.
  3. Acknowledge Jornada LTER and USDA-ARS support
    • The support received, name of the program, and relevant grant numbers should be included in the acknowledgement section of any journal article, dissertation, or other publication. Support is defined as direct funding of research or personnel costs allocated from the Jornada grants, fellowship support to graduate students, logistical or data collection support from the Jornada field crew, or data management services from the Jornada Information Management team.
  4. Cite the Jornada LTER and USDA-ARS data used in your research
    • When research findings involving Jornada data appear in a publication (journal article, dissertation, website, etc.), please include a citation and link to the original data. For already-published long-term datasets include a DOI link at a minimum, and if possible, a full citation in the reference section (e.g. in a journal article). If the data are new and unpublished, please publish the dataset AND cite it in the publication.

The Jornada IM team strives to uphold high data publishing standards, so Jornada students and investigators who contribute to research at the Jornada should be aware of, or be willing to learn about, some best practices for collecting and describing their research data.

Cleaning and describing data

The Jornada IM team has a significant amount of experience and a variety of tools to draw on for quality assurance and quality control (QA/QC) of long-term Jornada datasets. For data managed by individual researchers, the Jornada IM team leaves most data QA/QC up to the research group or individual, but we are happy to advise when asked. For a simple overview and some resources useful for QA/QC of tabular data, see EDI’s recommendations.
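As a rough illustration of the kind of checks a research group might run on a tabular data file before submitting it, here is a minimal Python sketch. It is not a Jornada or EDI tool: the function name `qc_report`, the column names, the missing-value codes, and the plausible temperature range are all hypothetical and should be replaced with values appropriate to your own dataset.

```python
import csv
import io


def qc_report(csv_text, required, ranges, missing_codes=("", "NA", "-9999")):
    """Return a list of (row_number, column, problem) tuples for a CSV table.

    required      -- columns that must not be empty or a missing-value code
    ranges        -- {column: (low, high)} plausible numeric limits (assumed)
    missing_codes -- strings treated as missing values (assumed convention)
    """
    problems = []
    seen = set()
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=2):  # row 1 is the header
        # Flag exact duplicate rows.
        key = tuple(row.values())
        if key in seen:
            problems.append((row_number, "*", "duplicate row"))
        seen.add(key)
        # Flag missing values in required columns.
        for column in required:
            if (row.get(column) or "").strip() in missing_codes:
                problems.append((row_number, column, "missing value"))
        # Flag non-numeric or out-of-range values in numeric columns.
        for column, (low, high) in ranges.items():
            raw = (row.get(column) or "").strip()
            if raw in missing_codes:
                continue
            try:
                value = float(raw)
            except ValueError:
                problems.append((row_number, column, "not numeric"))
                continue
            if not low <= value <= high:
                problems.append((row_number, column, "out of range"))
    return problems


# Hypothetical example table with one implausible value, one duplicate row,
# and one missing site identifier.
sample = """date,site,air_temp_c
2021-06-01,A,31.5
2021-06-02,A,310.5
2021-06-02,A,310.5
2021-06-03,,NA
"""

report = qc_report(
    sample,
    required=["date", "site"],
    ranges={"air_temp_c": (-40, 60)},
)
for item in report:
    print(item)
```

Checks like these only flag candidate problems. Deciding whether a flagged value is an error, and how to document it in the metadata, still takes knowledge of the study.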

Once data are clean and ready to analyze or publish, it is best to describe the data with metadata that is as detailed as needed to allow interpretation and re-use. The EDI repository has more guidance here, and the Jornada IM team tries to offer extensive assistance to researchers who need to describe their datasets for publication. It's a good idea to start collecting and organizing metadata as soon as you start collecting data. There are two recommended ways to collect and organize metadata: EDI’s ezEML tool or Jornada metadata templates. These are described below.

ezEML

The EDI repository has created a web app called ezEML for describing research datasets and creating standardized metadata documents for publication (EML). The tool is new but has rapidly developed to become an excellent method to author well-documented datasets. There is a Jornada EML template available on the site, so the recommended process for Jornada researchers is:

  1. Log in to ezEML using your Google, GitHub, or ORCID account (whichever is easiest).
  2. Start a new EML document using the “EML Documents > New from Template” menu item.
  3. Navigate to and select the “LTER/JRN/JRN_template_general” template to open a document template pre-populated with Jornada metadata.
  4. Give the document a unique name. You can save your metadata and then return to this document anytime.
  5. Follow the sequence of forms on the left, and ezEML’s prompts, to upload data files and enter metadata for your dataset. Each section of your metadata will have help available (“?” icons) and several fields will already be filled if you are using the JRN template.
  6. Use the “Check metadata” and “Check data tables” tools at the bottom left to check the completeness and validity of your dataset. Green lights mean your dataset is well described and ready to share.
  7. When ready, click “Submit/Share Package” and then “Collaborate with Colleagues”. DO NOT USE “Submit Package to EDI” or we may miss your dataset.
  8. On the “Invite a Collaborator” screen, share the dataset with a Jornada data manager (jornada.data@nmsu.edu).

At this point, the Jornada IM Team will receive a notification and can access your dataset in ezEML to review, edit, and publish to EDI.

Metadata templates

A metadata template is a document with a structure and cues that help you collect the essential metadata needed to describe a published dataset. We have created Jornada metadata templates in MS Word (.docx) and Excel (.xlsx) formats. These templates contain sections for all critical pieces of metadata, along with instructions on what to include and how to structure the information. The Excel version is slightly more detailed and may be useful for complex datasets. Completed templates and accompanying data files should be sent to the Jornada IM team (jornada.data@nmsu.edu).

General Jornada metadata guidelines

While writing metadata, the Jornada metadata standards (.docx) and keyword thesauri (.xlsx) documents are helpful, but not required.

Publishing datasets

Datasets generated from Jornada research projects, which include both data and the metadata describing that data, should be submitted to the IM team (jornada.data@nmsu.edu) regularly according to the policy outlined above. Once submitted, preparation and publication of a Jornada dataset is usually an iterative process (Figure 1). Researchers submit data and metadata to a Jornada Information Manager, who then securely archives the data and checks these items for quality and consistency. Usually there is a period of communication and updates between the IM and the researcher until the dataset is ready for publication. Once it is, the IM encodes the metadata into an EML file and sends it, with the data, to the Environmental Data Initiative repository (EDI) as a published dataset. There are other variations on this process, depending on the data, but this is the most common.

Figure 1: A simplified schematic of how to publish a Jornada dataset.



Updating project information

Please update the Jornada Research Site Manager (janderso@nmsu.edu) with changes to projects using updated ResNotif forms at least once per year. Send updates about project personnel or website information to the IM team (jornada.data@nmsu.edu) as necessary.

Jornada data discovery

One goal of the Jornada IM system is to make the vast majority of Jornada data accessible through the Jornada website and other avenues. On the Jornada website there are two primary data access points:

  1. The JRN LTER primary data catalog currently provides access to Jornada data held in multiple repositories with a faceted interface. The five tabs in the data catalog are:

    • Tab 1: A list of links to “signature” long-term datasets collected at JRN, arranged by important research themes.
    • Tab 2: A searchable listing of all datasets held in our primary data catalog, hosted at the EDI repository.
    • Tab 3: A searchable listing of Jornada data held in other repositories (not EDI).
    • Tab 4: A listing of a few of the most useful spatial datasets the Jornada has. This catalog will be updated soon.
    • Tab 5: A listing of Jornada “data partnerships”, with relevant links to data catalogs provided and managed by collaborating research networks working in the Jornada Basin area.
  2. The interactive data viewer allows map-based browsing of some of our long-term meteorology and ecology datasets collected at the Jornada.

Citing the Jornada

Please cite Jornada data when you use it. You can do this in the reference section of a journal article, which is preferable, or in the data availability statement that some journals provide. We are encouraging this practice more and more, and will be developing better guidelines and, we hope, data citation statistics in the future.
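For researchers who keep dataset details in a script or spreadsheet, the following Python sketch shows one way to assemble a reference-list entry that includes a DOI link. The field names, the citation layout, and the example values (including the placeholder DOI) are assumptions for illustration only, not an official Jornada or EDI citation format. Follow your journal's style and the citation shown on the dataset's landing page where they differ.

```python
from dataclasses import dataclass


@dataclass
class DatasetRecord:
    """Hypothetical container for the pieces of a dataset citation."""
    creators: list
    year: int
    title: str
    version: int
    publisher: str
    doi: str


def format_citation(record):
    """Build a single-line citation string ending in a DOI link."""
    # Join creator names and drop a trailing period so the sentence
    # punctuation added below is not doubled.
    authors = ", ".join(record.creators).rstrip(".")
    return (
        f"{authors}. {record.year}. {record.title} ver {record.version}. "
        f"{record.publisher}. https://doi.org/{record.doi}"
    )


# Placeholder values only; this is not a real dataset or DOI.
example = DatasetRecord(
    creators=["Researcher, A.", "Collaborator, B."],
    year=2021,
    title="Example dataset title (placeholder)",
    version=1,
    publisher="Environmental Data Initiative",
    doi="10.0000/placeholder",
)

citation = format_citation(example)
print(citation)
```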

Learn more about Jornada data management

The Jornada IM team consists of both Jornada Basin LTER (JRN) and USDA-ARS Jornada Experimental Range (JER) staff. If you need to know more about how we manage data, or how you or your lab can do a better job with data, there are a number of opportunities to learn more.