The purpose of this document is to describe the data integration architecture for the New Mexico Water Data Initiative (NMWDI). It takes a “bottom-up” approach and proceeds as follows:
Outline of the overall multi-agency architecture
Description of the data model and API standard through which all agency data will ideally be integrated and served to users
Provision of template procedures for how agencies can provide their data in the standard data model and API standard.
Elaboration of the overall multi-agency architecture in detail
Key terms are linked directly to their entries in the project Glossary of Terms.
The New Mexico Water Data Initiative Architecture
The goal of the New Mexico Water Data Initiative is to make data about water resources in New Mexico, collected by multiple agencies, available to the public in a common format. Many (but not all) agency data are already published online through services such as ESRI web maps, Excel files, or in some cases public APIs. However, for a given data type (such as water table level measurements from wells), important aspects such as date/time formats, geospatial projections, column names, and units vary from agency to agency and even from dataset to dataset within agencies. To allow users to access data from multiple agencies in one format, the NMWDI architecture will route all agency data through one Web API standard with one corresponding underlying data model that references one common statewide water data controlled vocabulary. As long as each agency serves its data through the common Web API, data storage can be federated (i.e. not centralized), although some degree of centralization can be accommodated if that is most convenient for a given dataset. Each agency’s standardized API will be published through a central portal with an NMWDI-administered API Management Platform. Users can send API requests to the management platform, which will route these requests to the agency APIs and in turn forward the responses to users. Whether data storage is federated across agencies or centralized, all contributing agency data must be mapped to the common data model and transformed into the common format before being delivered to users. This basic data flow is illustrated in Figure 1.
Figure 1. Basic data flow.
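
The transformation step in this data flow can be illustrated with a minimal sketch. It assumes two hypothetical agencies that report depth to water with different column names, timestamp formats, and units; the agency names, column names, and common output fields below are illustrative assumptions and not part of any NMWDI specification.

```python
from datetime import datetime, timezone

FEET_TO_METERS = 0.3048

# Hypothetical per-agency mappings: source column names, timestamp format,
# and the factor that converts the source depth unit to meters.
AGENCY_MAPPINGS = {
    "agency_a": {
        "site_column": "WellID",
        "time_column": "MeasDate",
        "time_format": "%m/%d/%Y %H:%M",
        "value_column": "DTW_ft",
        "to_meters": FEET_TO_METERS,
    },
    "agency_b": {
        "site_column": "site_no",
        "time_column": "timestamp",
        "time_format": "%Y-%m-%dT%H:%M:%S",
        "value_column": "depth_m",
        "to_meters": 1.0,
    },
}


def to_common_format(agency, record):
    """Map one raw agency record onto common column names and units."""
    mapping = AGENCY_MAPPINGS[agency]
    observed = datetime.strptime(record[mapping["time_column"]], mapping["time_format"])
    # Assumption: source timestamps are already expressed in UTC.
    observed = observed.replace(tzinfo=timezone.utc)
    depth_m = float(record[mapping["value_column"]]) * mapping["to_meters"]
    return {
        "agency": agency,
        "site_id": str(record[mapping["site_column"]]),
        "phenomenon_time": observed.isoformat().replace("+00:00", "Z"),
        "depth_to_water_m": round(depth_m, 3),
    }


# Two differently shaped source records end up with identical keys and units.
row_a = to_common_format(
    "agency_a", {"WellID": 101, "MeasDate": "03/15/2021 14:30", "DTW_ft": "100"}
)
row_b = to_common_format(
    "agency_b", {"site_no": "B-7", "timestamp": "2021-03-15T14:30:00", "depth_m": "30.48"}
)
```

Keeping the per-agency differences in a mapping table rather than in code means a new dataset can, in principle, be onboarded by adding one mapping entry.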
An aspirational demonstration application can be found here, where monitoring wells from multiple agencies can be visualized, and parameters and measurements can be searched for in a common interface and returned in a CSV file with common column names. This application is based on a workflow in which multiple agencies' data have been transformed and served through independent instances of a standard API. The single application can then allow users to interact seamlessly with data from multiple agencies being provided by multiple APIs.
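
A workflow of this kind might look roughly like the sketch below, which is not the demonstration application's actual code. It assumes each agency exposes a SensorThings-style endpoint whose Datastreams responses carry a `value` array with expanded `Thing`, `ObservedProperty`, and `Observations` entities. The endpoint URLs, the CSV column names, and the stubbed `fake_fetch` response are hypothetical, and pagination via `@iot.nextLink` is ignored for brevity.

```python
import csv
import io
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical agency endpoints; real base URLs would come from the portal.
AGENCY_ENDPOINTS = {
    "agency_a": "https://sta.agency-a.example/v1.1",
    "agency_b": "https://sta.agency-b.example/v1.1",
}

COMMON_COLUMNS = ["agency", "thing", "observed_property", "unit", "phenomenon_time", "result"]


def datastreams_url(base_url, max_observations=100):
    """Build a Datastreams request that inlines related entities via $expand."""
    expand = f"Thing,ObservedProperty,Observations($top={max_observations})"
    return f"{base_url}/Datastreams?" + urlencode({"$expand": expand}, safe="$,()=")


def fetch_json(url):
    """Fetch and decode one JSON document (performs a network call)."""
    with urlopen(url) as response:
        return json.load(response)


def rows_from_response(agency, payload):
    """Flatten one Datastreams response into rows with the common columns."""
    for datastream in payload.get("value", []):
        for observation in datastream.get("Observations", []):
            yield {
                "agency": agency,
                "thing": datastream.get("Thing", {}).get("name"),
                "observed_property": datastream.get("ObservedProperty", {}).get("name"),
                "unit": datastream.get("unitOfMeasurement", {}).get("symbol"),
                "phenomenon_time": observation.get("phenomenonTime"),
                "result": observation.get("result"),
            }


def merged_csv(fetch=fetch_json):
    """Query every agency endpoint and merge the results into one CSV string."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COMMON_COLUMNS, lineterminator="\n")
    writer.writeheader()
    for agency, base_url in AGENCY_ENDPOINTS.items():
        writer.writerows(rows_from_response(agency, fetch(datastreams_url(base_url))))
    return buffer.getvalue()


def fake_fetch(url):
    """Stand-in for fetch_json so the sketch runs without network access."""
    return {"value": [{
        "unitOfMeasurement": {"name": "meter", "symbol": "m"},
        "Thing": {"name": "Well 1"},
        "ObservedProperty": {"name": "DepthBelowSurface"},
        "Observations": [{"phenomenonTime": "2021-03-15T14:30:00Z", "result": 30.48}],
    }]}


sample_csv = merged_csv(fetch=fake_fetch)
```

Because every endpoint returns the same entity structure, the flattening function needs no agency-specific branches; only the base URL differs between agencies.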
The Data Model & API Standard: OGC SensorThings API (STA)
The above basic data flow requires a state-wide data model and API standard. The NMWDI has chosen the OGC SensorThings API as the model and standard. The OGC is the Open Geospatial Consortium, an international standards organization that creates and publishes open standards for geospatial data management, processing, and sharing.
The STA data model
The STA data model is based on the Observations and Measurements data model of the OGC, which itself underlies many environmental science data systems that integrate data from many independent organizations. Examples include the CUAHSI HydroClient, which provides centralized access to global streamgage, monitoring well, and meteorological networks; and the National Groundwater Monitoring Network, which provides centralized access to standardized high-frequency groundwater level and quality data from federal, state, and local agencies. The STA data model provides a unifying metadata standard and data structure standard that can model any data generated about point or polygonal locations on Earth. It is important to be able to map agency data to this data model in order to structure each agency’s data in a compatible format and to provide a seamless data request experience to users. The STA data model is shown in Figure 2 below and fully specified in this OGC Specification.
Figure 2: The STA data model entity-relationship diagram
Table 2 below provides definitions for the entities and key properties, as well as example mappings to some agency data. Mapping agency data to this data model is an important prerequisite for the subsequent, more functional data integration steps.
Group | SensorThings Entity | Description | Example: NMBGMR Aquifer Monitoring Well | Example: NMED Drinking Water Quality Monitoring | Example: NMOSE Water Withdrawal Monitoring
---|---|---|---|---|---
Metadata | Location | A unique coordinate or area on the surface of the earth | Location in latitude and longitude | Street address (with associated latitude and longitude) | Location in easting and northing
Metadata | Thing | Some real-world thing with which one or more Sensors are associated | Well | Sampling Point | |
Metadata | Datastream | A collection of Observations about an ObservedProperty produced by a Sensor associated with a Thing | Time series, Hydrograph | | |
Metadata | Datastream/observationType | The type of observation, codified in the Observations and Measurements data standard. Types include Categorical (defined text), Count (integer), Measurement (continuous number), Observation (free text), and TruthObservation (True/False) | Continuous number (Measurement) | | |
Metadata | Datastream/unitOfMeasurement | A three-item definition of the unit of measurement, including its name, symbol, and link to the definition | meters | | |
Metadata | Sensor | The procedure used to provide a Datastream. Can be a particular data recording device model, or a defined procedure followed by a human observer | Diver, Wellintel, Steel tape | | |
Metadata | ObservedProperty | The raw or processed phenomenon (quantitative or qualitative) being measured for the Datastream | Water Pressure Head, DepthBelowSurface | | |
Metadata | OPTIONAL: FeatureOfInterest | The real-world feature that the Observations are about. This may be different from the Thing on which the Sensor is mounted. Can be a point location or a polygon or collections thereof | Aquifer (polygon) | Public Water System (head office location or service area boundary) | Water Right (set of relevant points of diversion)
Data | Observation | A single measurement value including the result, time values, and other metadata. The Datastream that an Observation belongs to identifies which ObservedProperty was measured and by which Sensor. Each Observation is also linked to its FeatureOfInterest | Depth measurement | | |
Data | Observation/result | The actual measured value, with valid values defined in observationType and units defined in unitOfMeasurement, both provided by the Datastream | 3252 | | |
Data | Observation/phenomenonTime | The date+time (or interval) in ISO 8601 format (e.g. YYYY-MM-DDThh:mm:ssZ) when the observation occurred | | | |
Data | OPTIONAL: Observation/resultTime | The date+time that the result was generated | | | |
Data | OPTIONAL: Observation/validTime | The date+time interval during which the Observation can be used (often used for provisional values that are replaced by QA/QC’d observations) | | | |
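
To make the mapping in the table more concrete, the following sketch expresses an aquifer monitoring well as nested SensorThings entities in JSON form, plus one Observation with an ISO 8601 `phenomenonTime`. The entity and property names follow the table above; the well name, coordinates, descriptions, and the vocabulary and procedure URLs are placeholder assumptions, and the encoding types a given server accepts may differ.

```python
import json
from datetime import datetime, timezone

OM_MEASUREMENT = "http://www.opengis.net/def/observationType/OGC-OM/2.0/OM_Measurement"


def well_thing(name, longitude, latitude):
    """Build a Thing with a nested Location and one depth-to-water Datastream."""
    return {
        "name": name,
        "description": "Aquifer monitoring well",
        "Locations": [{
            "name": f"{name} location",
            "description": "Wellhead position",
            "encodingType": "application/geo+json",
            # GeoJSON orders coordinates as [longitude, latitude].
            "location": {"type": "Point", "coordinates": [longitude, latitude]},
        }],
        "Datastreams": [{
            "name": f"{name} depth to water",
            "description": "Manual depth-to-water measurements",
            "observationType": OM_MEASUREMENT,
            "unitOfMeasurement": {
                "name": "meter",
                "symbol": "m",
                # Illustrative unit definition link.
                "definition": "http://qudt.org/vocab/unit/M",
            },
            "Sensor": {
                "name": "Steel tape",
                "description": "Manual measurement by a human observer",
                "encodingType": "application/pdf",
                # Hypothetical link to a written measurement procedure.
                "metadata": "https://agency.example/procedures/steel-tape.pdf",
            },
            "ObservedProperty": {
                "name": "DepthBelowSurface",
                "description": "Depth to water below ground surface",
                # Hypothetical controlled-vocabulary entry.
                "definition": "https://vocab.example/DepthBelowSurface",
            },
        }],
    }


def observation(result, observed_at):
    """Build an Observation with an ISO 8601 UTC phenomenonTime."""
    if observed_at.tzinfo is None:
        raise ValueError("observed_at must be timezone-aware")
    stamp = observed_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {"phenomenonTime": stamp, "result": result}


# Placeholder values for illustration only.
thing = well_thing("Example Well", -106.9, 34.06)
reading = observation(12.5, datetime(2021, 3, 15, 14, 30, tzinfo=timezone.utc))
payload = json.dumps(thing, indent=2)
```

Nesting the Location and Datastream inside the Thing mirrors the relationships in the entity-relationship diagram: the metadata entities are described once, and each Observation then only carries its time and result.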