There are multiple types of water quality data, which differ in why the data are collected (i.e., purpose) and who collects them. Each type is briefly described below, with notes on these differences.

  • Drinking water quality

    • NMED (SDWIS - captures 1000+ PWS)

      • Theoretically has machine-readable data available

      • SDWIS is accessible online but cannot be queried by criteria (only viewable by well)

      • Connection between SDWIS and Drinking Water Watch (both EPA-driven) is somewhat locked under the EPA setup

  • Surface water quality (SWQ)

    • NMED regulates surface water quality under the Clean Water Act (CWA).

      • Approximately 700 water bodies assessed by NMED (for various criteria)

      • Potentially have machine-readable data available

      • Most data are accessible through the USGS and EPA Water Quality Portal

      • Considers many categories and criteria, such as cold water aquatic life, recreational contact, and irrigation uses.

    • USGS collects intermittent and continuous surface water quality data for research (available on NWIS)

    • NMBGMR collects intermittent surface water quality for research

    • Universities collect intermittent SWQ for research

    • NGOs also collect SWQ

  • Groundwater quality

    • NMED regulates protection of groundwater from discharges

      • Mostly spreadsheets and some paper files

      • Monitoring well data, as required for discharge permits (such as petroleum storage tanks, dairy farms, or landfills); these data include measurements of water levels

    • NMBGMR collects groundwater quality for hydrogeologic research

      • SQL database, can be made machine readable

    • USGS collects groundwater quality for hydrogeologic research

  • Produced water quality

    • Not currently regulated by NMED; no state agency database

    • Eventually PRRC/NMPWRC hope to take in some quality data for treated produced water (that’s used outside of O&G)

  • Precipitation water quality

    • Occasionally collected for research purposes (NMBGMR)

...

Table 1: The STA data model definitions and example mappings to NM agency datasets regarding water quality. Note: text in red indicates content actively being discussed.

SensorThings Entity

Description

NMED Drinking Water Quality Monitoring

Chemistry Surface Water Data (e.g., NMED_Surface_Water_Quality_Chemistry_Example.xlsx, chemistry_data_report_field_definitions.xlsx)

Groundwater Data (e.g., report generated from database: H2 2020 LALF All GW.xlsx)

Biological Data (e.g., CKAN dataset)

Metadata

Location

A unique coordinate or area on the surface of the earth

Street Address (possibly with associated latitude and longitude). (e.g. 3960 PRINCE ST) - geoCoding required

Coordinates - SiteID

sys_loc_code - also has x,y,z coordinate (have it in both state plane (US survey foot) and wgs84)

Coordinates - everything’s recorded in UTM; also a lat-long uploaded with it

Thing

Some real-world entity with which one or more Sensors are associated

Sample Pt RT236I

SiteID

WaterBody ID

sys_loc_code (e.g., LAF-24)

usually has a FieldID tied to a particular site, but does vary with sampling event

Datastream

A collection of Observations about an ObservedProperty produced by a Sensor associated with a Thing

Sample Results

Sample Results

report_result_value

Sample Results

Datastream/observationType

The type of observation, codified in the Observations and Measurements data standard. Types include Categorical (defined text), Count (integer), Measurement (continuous number), Observation (free text), and TruthObservation (True/False)

Categorical or TruthObservation

MultiDatastream if there are multiple analytes

Measurement (discrete values)

Measurement

Measurement

Datastream/unitOfMeasurement

A three-item definition of the unit of measurement, including its name, symbol, and link to the definition (preferably to one provided in an established ontology such as http://unitsofmeasure.org/ucum.html or http://qudt.org/)

TCR Result (binary indicator: absence/presence)

ug/l (micrograms/liter; also = ppb), mg/l (also = ppm), nTu, deg C, %, us/cm (microSiemens/centimeter), “pH” (technically dimensionless), rads, mpn (most probable number)

report_result_units

ug/l, mg/l, % recovered (for surrogates), deg C, mg CO2/liter, mg CaCO3/liter, mV, nTu, pH, us/cm

Sometimes rads

number of fish by species

species identified by taxa code (6-letter code)

sometimes temp (deg C)

Sensor

The procedure used to provide a Datastream. Can be a particular data recording device model, or a defined procedure followed by a human observer. If applicable, a specific instance (e.g. a sensor model and serial number)

9223B-PA (https://www.standardmethods.org/doi/10.2105/SMWW.2882.194)

Lab results - 100+ methods

Field results - sondes, data loggers

2 methods fields: standard analytical method (~15 that are run) and a prep method

Highly variable - could be traps, net, electrofishing, …

ObservedProperty

The raw or processed phenomenon (quantitative or qualitative) being measured for the Datastream, preferably including a link to a definition provided by an established ontology or controlled vocabulary (such as the ODM2 Controlled Vocabularies or http://qudt.org/)

Analyte (e.g. Coliform (TCR) (3100))

Turbidity

Characteristic

cas_rn

(could also use: chemical_name; there may be variants here)

Number

OPTIONAL: FeatureOfInterest

The real-world feature that the Observations are about. This may or may not be different from the Location of the Thing on which the Sensor is mounted. Can include a JSON-formatted point location or a polygon or collections thereof.

Public Water System (head office location or service area boundary) (e.g. Albuquerque Water System PWSID NM3510701)

Linear Feature - WaterBody ID

Matrix code: W (for water)

screen interval/depth sample was collected at

(z_from, z_to)

Matrix code: W (for water)

HUC2, HUC8, Waterbody name

OPTIONAL: FeatureOfInterest

The real-world feature that the Observations are about. This may or may not be different from the Location of the Thing on which the Sensor is mounted. Can include a JSON-formatted point location or a polygon or collections thereof.

Qualifiers on observedproperty:

1) dilution value?

2) was water sample filtered (happens for select analytes, e.g., metals)? if so, at what microns?

3) detection limits

4) only ingesting normal samples (not dups, trip blanks, etc.)

1) detection limits; quantitation (QDL), method detection (MDL), and reporting detection (RDL). Usually QDL and RDL are the same.

2) dilution factor

3) test type (some are “retest” if there’s an error in the lab)

4) sample filtering - D (xx), N (xx), and T (total)

5) fields for purge volume/time

6) lab id code

7) sample type code - N (normal), D (duplicate), … → only pull in Normal data?

Data

Observation

A single measurement value including the result, time values, and other metadata. Information on the ObservedProperty that was measured by what Sensor is provided by the Datastream these observations are in. Features of Interest are linked for each observation as well. Observations are linked to (collected in) Datastreams

Sample (e.g. 763391)

Sample (~7 digits, e.g., 2555000)

2 sample Ids: 1) CABQ (e.g., sys_sample_code:LAF24-20201110W-N) - trip blanks are also captured this way

2) lab sample id

3) test id

FieldID

Observation/result

The actual measured value, with valid values defined in observationType and units defined in unitOfMeasurement, both provided by the Datastream

Sample Result (P (Positive/ Coliform found) A (Negative/ Coliform not found))

Sample Result

report_result_value

number

Observation/phenomenonTime

The date+time (or interval) in ISO 8601 format (YYYY-MM-DDThh:mm:ssZ) when the observation occurred

MP (Monitoring Period) (e.g. 01-01-2020 to 01-31-2020)

(might need to discuss this further to figure out value for entry; does coincide with entry within Drinking Water Watch)

y-m-d h-m-s

sample_date (y-m-d h-m-s)

y-m-d

OPTIONAL: Observation/resultTime

The date+time that the result was generated. May be the same as phenomenonTime

Date (e.g., 01-06-2020)

(refers to sampling time, not time of lab analysis)

also an analysis date + time that’s available: y-m-d h-m-s

also an analysis date + time that’s available: y-m-d h-m-s

OPTIONAL: Observation/validTime

The date+time interval during which the Observation can be used (often used for provisional values that are replaced by QA/QC’d observations)

OPTIONAL: Observation/resultQuality

A description of the result Quality. Will vary according to agency practice. Can use ODM2 controlled vocabulary for data quality types as a guide.

3 different fields: lab qualifier (describes matrix spikes, replicates, etc.); internal qualifier (result of validation process); STORET qualifier (translation of internal qualifier to federal code)

result_type_code: most are TRG, some are SUR (surrogates)

2 qualifiers: 1) detection flag - yes or no; 2) organic - yes or no

also have lab qualifiers (J, …) to show which detection band a result falls within, etc.

Quality control checks involve data entry verification (from standard data entry sheets into the database). After database verification, entry sheets are archived either in the office or at UNM.
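
To make the groundwater column of Table 1 more concrete, the sketch below maps one flattened groundwater record onto a SensorThings Datastream and Observation. It is only an illustration under stated assumptions: the record values are invented, the `UNITS` lookup, the `urn:cas:` definition placeholder, and the helper names are hypothetical, and `sample_date` is assumed to carry no time zone (UTC is used here as a stand-in until the actual zone is confirmed). Only samples with sample type code N are kept, following the open question in the qualifiers row.

```python
from datetime import datetime, timezone

# Standard O&M URI used by SensorThings for continuous numeric results.
OM_MEASUREMENT = "http://www.opengis.net/def/observationType/OGC-OM/2.0/OM_Measurement"

# Hypothetical lookup from report_result_units to the three-item unit definition.
UNITS = {
    "ug/l": {
        "name": "microgram per liter",
        "symbol": "ug/L",
        "definition": "http://unitsofmeasure.org/ucum.html",
    },
}

# Illustrative record only; field names follow Table 1, values are not real data.
row = {
    "sys_loc_code": "LAF-24",
    "sys_sample_code": "LAF24-20201110W-N",
    "sample_type_code": "N",
    "cas_rn": "7440-38-2",
    "chemical_name": "Arsenic",
    "report_result_value": "1.2",
    "report_result_units": "ug/l",
    "sample_date": "2020-11-10 09:30:00",
}


def to_iso8601(text, tz=timezone.utc):
    """Convert 'y-m-d h-m-s' text to ISO 8601; the time zone is an assumption."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S").replace(tzinfo=tz).isoformat()


def to_datastream(row):
    """Build a Datastream body: one per location (Thing) and analyte."""
    return {
        "name": f"{row['sys_loc_code']} {row['chemical_name']}",
        "description": f"{row['chemical_name']} at {row['sys_loc_code']}",
        "observationType": OM_MEASUREMENT,
        "unitOfMeasurement": UNITS[row["report_result_units"]],
        "ObservedProperty": {
            "name": row["chemical_name"],
            "description": f"CAS RN {row['cas_rn']}",
            # Placeholder only; a real link to a controlled vocabulary is preferred.
            "definition": f"urn:cas:{row['cas_rn']}",
        },
    }


def to_observation(row, tz=timezone.utc):
    """Build an Observation body, or None for non-normal samples (dups, blanks)."""
    if row["sample_type_code"] != "N":
        return None
    return {
        "phenomenonTime": to_iso8601(row["sample_date"], tz),
        "result": float(row["report_result_value"]),
        "parameters": {"sys_sample_code": row["sys_sample_code"]},
    }


datastream = to_datastream(row)
observation = to_observation(row)
```

Keeping the agency sample identifier in the Observation's `parameters` is one possible way to preserve traceability back to the source database; whether lab sample IDs and test IDs belong there as well is left open.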