Project Overview

Geodetic measurements with GNSS are a prime example of an observational data set that becomes far more valuable when scientists can access data without regard to national boundaries. Many governmental and research organizations are increasingly setting up GNSS stations worldwide, and are often open to sharing these data but lack the resources or expertise to deploy systems and software to manage data, metadata, and user tools that would support data sharing. UNAVCO developed the Dataworks for GNSS software with NSF funding to the COCONet project to address this need. Dataworks provides end-to-end, instrument-to-dissemination capability, and has been deployed on systems that are now operating as Regional Data Centers as part of the NSF-funded TLALOCNet and COCONet projects.

Dataworks consists of subsystems, or modules, that provide the functionality required for data acquisition, management, and sharing. There are modules for GNSS receiver control and data download, a database for metadata and tools for metadata handling, ingest software to manage file metadata, data file management scripts, GSAC (Geodesy Seamless Archive Centers software) for managing data search and access including web services for advanced access, scripts for mirroring station data and metadata from partner GSACs, guidelines and scripts for backup, and extensive software and operator documentation. Modules are written in Python and Java, and configured for Linux. The database used is MySQL.

Dataworks for GNSS is available through UNAVCO to any organization desiring to use its capabilities. UNAVCO plans to provide an Amazon AWS image of Dataworks that would allow standing up a Dataworks-enabled data center without requiring upfront investment in server hardware. Further refinement of the modules and expansion of their capabilities depend on our ability to secure additional funding for this effort.

Significance

Dataworks provides data management and distribution software subsystems as open source modules that can be employed by regional GNSS managers for small- to medium-scale networks (e.g. 10-100 stations).

Recognizing that many organizations operate GNSS stations but do not have the expertise to write their own software systems for the fundamental tasks of GNSS data and metadata management, UNAVCO created Dataworks for GNSS. These software modules are intended to keep the fundamental tasks of handling and ingesting incoming data, storing metadata, and presenting data to users manageable for smaller institutions. In addition to fundamental data management, Dataworks offers specialized functionality such as mirroring and federation.

Mirroring and Federation

Mirroring and federation capabilities of Dataworks and GSAC facilitate institutional data management functions where data sharing can occur among projects that span national boundaries. Data centers can tune access control to fit their needs. Dataworks Mirroring is being used within the COCONet and TLALOCNet Regional Data Centers to ensure that these projects have complete data sets within their respective regions by mirroring the segment of project data managed at UNAVCO.
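The idea behind mirroring can be illustrated with a minimal Python sketch. It is not the Dataworks mirroring script: the function names, file names, and checksum values below are hypothetical, and the catalogs are plain in-memory dictionaries standing in for listings obtained from a partner center. The sketch only shows the comparison step, deciding which files a mirror still needs.

```python
from typing import Dict, List


def plan_mirror(remote: Dict[str, str], local: Dict[str, str]) -> List[str]:
    """Return names of files to fetch: missing locally or with a different checksum."""
    return sorted(
        name for name, checksum in remote.items() if local.get(name) != checksum
    )


def apply_mirror(remote: Dict[str, str], local: Dict[str, str]) -> Dict[str, str]:
    """Return an updated copy of the local catalog.

    A real mirror would download each planned file here; this sketch
    only records the new checksum.
    """
    updated = dict(local)
    for name in plan_mirror(remote, local):
        updated[name] = remote[name]
    return updated


# Hypothetical catalogs mapping file name to checksum.
remote_catalog = {"file_a.dat": "c1", "file_b.dat": "c2", "file_c.dat": "c3"}
local_catalog = {"file_a.dat": "c1", "file_b.dat": "stale"}

to_fetch = plan_mirror(remote_catalog, local_catalog)
print(to_fetch)
```

Files that exist only in the local catalog are left untouched, so a mirror built this way can hold its own data alongside the mirrored segment.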

When multiple Regional Data Centers are running GSAC software, a Federated GSAC server can allow all of the collections to be queried through a single GSAC Application Programming Interface (or its associated web user interface). Each Data Center remains in control of its own collections, while sharing metadata and access information. With federation, each data and metadata collection is kept distinct, yet users can use the familiar GSAC queries and web user interface to investigate and access data without needing to know which data center holds any particular part of the overall collection.
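The federation pattern described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the GSAC implementation or its API: the names `StationRecord`, `make_repository`, and `federated_search` are hypothetical, the station IDs and coordinates are invented, and each data center is simulated by an in-memory function rather than a web service. The sketch shows one query being sent to every center and the results being merged while each record keeps a label for the center that holds it.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# (min_lat, max_lat, min_lon, max_lon) in decimal degrees
BoundingBox = Tuple[float, float, float, float]


@dataclass(frozen=True)
class StationRecord:
    station_id: str
    latitude: float
    longitude: float
    data_center: str  # which center holds this record


SearchFn = Callable[[BoundingBox], List[StationRecord]]


def make_repository(name: str, stations: List[Tuple[str, float, float]]) -> SearchFn:
    """Build an in-memory stand-in for one data center's search service."""
    def search(bbox: BoundingBox) -> List[StationRecord]:
        min_lat, max_lat, min_lon, max_lon = bbox
        return [
            StationRecord(sid, lat, lon, name)
            for sid, lat, lon in stations
            if min_lat <= lat <= max_lat and min_lon <= lon <= max_lon
        ]
    return search


def federated_search(repositories: Dict[str, SearchFn], bbox: BoundingBox) -> List[StationRecord]:
    """Send the same query to every center and merge the results."""
    merged: List[StationRecord] = []
    for search in repositories.values():
        try:
            merged.extend(search(bbox))
        except Exception:
            # One unavailable center should not fail the whole query.
            continue
    return sorted(merged, key=lambda r: (r.station_id, r.data_center))


# Hypothetical centers and stations, for illustration only.
repositories = {
    "center_a": make_repository("center_a", [("STA1", 13.1, -59.6), ("STA2", 45.0, 10.0)]),
    "center_b": make_repository("center_b", [("STB1", 12.1, -86.2)]),
}
results = federated_search(repositories, (0.0, 30.0, -100.0, -50.0))
for record in results:
    print(record.station_id, record.data_center)
```

The user issues a single query and does not choose a center; the `data_center` field on each result reflects the point that every collection stays distinct and under its owner's control.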

COCONet Regional Data Centers

The NSF-funded COCONet project provided support for Dataworks development and for three Regional Data Centers (RDCs), selected via a competitive proposal process, to receive computer servers and Dataworks software, plus modest funding to support operations. The RDCs were awarded to:

  • CIMH, Barbados
  • INETER, Nicaragua
  • SGC, Colombia

RDC technical staff received hands-on training in Boulder on running the servers and Dataworks software. The RDCs provide new or expanded capabilities within the region. A focus on regional data sharing is a benefit of the COCONet RDCs.

TLALOCNet Regional Data Center

TLALOCNet, a combined atmospheric and tectonic cGPS-Met network in Mexico for the interrogation of climate, atmospheric processes, the earthquake cycle, and tectonic processes of Mexico, leverages NSF and UNAM funding to augment and enhance existing cGPS network stations. When completed, TLALOCNet will span all of Mexico and will link existing GPS infrastructure in North America and the Caribbean to create a continuous, federated network of networks spanning from Alaska to South America. NSF provided support for the TLALOCNet Data Center, hosted at the Universidad de Guadalajara, Mexico, to hold all project data.

More Information and Future Plans

While Dataworks is a functioning data management system, there are opportunities to enhance its capabilities. Further development will depend on securing additional resources. Development tasks include:

  • Improve reliability and integration
  • Improve scalability
  • Add tools for operator metadata management
  • Expand the receivers and formats of the Download Module
  • Test in a cloud VM environment and distribute as a VM image

Documentation and software access are available on the Dataworks for GNSS web site.

Invitation to Collaborate

We seek new global partners to extend the federation capability of GPS networks and archives. If you have an interest in collaborating, please contact us at data [at] unavco.org.

Project Information

  • UNAVCO staff: Fran Boler, James Matykiewicz, Mike Rost, Stuart Wier
  • Location(s): Boulder, Mexico, Colombia, Barbados, Nicaragua
  • Funding Source: NSF
