Summary

This metric estimates a station’s data completeness as the average percent_availability of its broadband (BH) channels. It also returns a quality_flag equal to the number of BH channels with open metadata epochs minus the number with percent_availability measurements in MUSTANG. A quality_flag of zero (the best case) indicates that station_completeness is based on all BH channels described in the metadata. A quality_flag greater than zero means that some percent_availability measurements are missing and station_completeness may be reported too low. Recalculating both percent_availability and station_completeness remedies the problem.

The assumptions on which this approach is based are:

- each station has an instrument type that characterizes its primary purpose,
- stations having broadband channels are primarily broadband stations, and
- a station is “complete” if its primary instrument(s) are sending complete data.

Although this metric is currently focused on broadband stations, it can be expanded to stations having other purposes if the primary purpose can be determined.

Uses

The value, the average percent_availability of the primary sensor type, gives a quick summary of data completeness for users who need multiple components of a station’s main sensor type (in this case, broadband). A non-zero quality_flag indicates that additional channels have yet to be averaged in, so the station_completeness measurement is not yet up to date.

Note that there are currently no measurements for this metric, so we do not advise using it as a constraint in rrds requests.

Data Analyzed

Traces – all BH? channels for an N.S (Network.Station) per measurement
Window – 24 hours starting at 00:00:00 UTC
Data Source – IRIS SEED archive

SEED Channel Types – None

Algorithm

For a station having BH? channels:
- Request the N.S.L.C names, where C matches BH?, that have data for the current 24-hour window.
- Request all percent_availability measurements for the N.S.L.C names returned.
- Average these percent_availability measurements and report the result, in percent, as station_completeness:
```
station_completeness = average(BH channel percent_availability)
```
- Calculate the quality_flag as the number of BH channels with open metadata epochs minus the number with percent_availability measurements:
```
quality_flag = (# BH channels with open metadata) - (# BH channels with percent_availability measurements)
```
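
The two formulas above can be sketched in Python as follows. This is a minimal illustration, not the MUSTANG implementation: the function name `station_completeness`, its arguments, the `XX.STA` channel names, and the choice to return `None` when no measurements exist are all assumptions made for the example. It also assumes the channel list and the percent_availability values have already been retrieved.

```python
from statistics import mean


def station_completeness(open_channels, availability):
    """Return (value, quality_flag) for one station-day.

    open_channels: set of BH channel names (N.S.L.C) with open metadata epochs.
    availability: dict mapping channel name -> percent_availability (0-100).
    Both inputs are hypothetical stand-ins for the metadata and MUSTANG queries.
    """
    # Only channels that have a percent_availability measurement are averaged.
    measured = [availability[c] for c in sorted(open_channels) if c in availability]
    # Assumption: with no measurements at all there is nothing to average.
    value = mean(measured) if measured else None
    # quality_flag: channels with open metadata minus channels with measurements.
    quality_flag = len(open_channels) - len(measured)
    return value, quality_flag


# Hypothetical station with three BH channels, one still unmeasured.
channels = {"XX.STA.00.BHE", "XX.STA.00.BHN", "XX.STA.00.BHZ"}
measurements = {"XX.STA.00.BHE": 100.0, "XX.STA.00.BHN": 90.0}
value, flag = station_completeness(channels, measurements)
print(value, flag)  # 95.0 1
```

In this example the quality_flag of 1 signals that one channel has not yet been averaged in, so the value of 95.0 is provisional.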

Metric Values Returned

value – average of percent_availability measurements for BH channels with open metadata epochs
quality_flag – difference in count between BH channels with open metadata and those with percent_availability measurements
target – the trace analyzed, labeled as N.S…Q (Network.Station…Quality)
start – beginning of the data day requested (00:00:00 UTC)
end – end of the data day requested (truncated as 23:59:59 UTC)
lddate – date/time the measurement was made and loaded into the MUSTANG database (UTC)
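
As an illustration of the start and end values, the data-day window could be computed as below. This is a sketch only; the helper name `data_day_window` and the example date are hypothetical.

```python
from datetime import date, datetime, time, timedelta, timezone


def data_day_window(day):
    """Return (start, end) datetimes in UTC for one data day."""
    start = datetime.combine(day, time(0, 0, 0), tzinfo=timezone.utc)
    # The end is reported truncated to 23:59:59 rather than the next midnight.
    end = start + timedelta(days=1) - timedelta(seconds=1)
    return start, end


start, end = data_day_window(date(2020, 1, 1))
print(start.isoformat(), end.isoformat())
```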

Notes

To avoid underestimating station_completeness when triggered channels are present, this metric assumes that BH channels with percent_availability < 2% are triggered channels and treats them as 100% complete. As a result, station_completeness can be overestimated on days when a continuous channel returns very little data.
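
The effect of this rule can be sketched as follows. The function name `adjust_for_triggered` and the sample values are hypothetical; only the 2% threshold and the substitution of 100% come from the note above.

```python
TRIGGERED_THRESHOLD = 2.0  # percent; taken from the note above


def adjust_for_triggered(percent_availability, threshold=TRIGGERED_THRESHOLD):
    """Treat a channel below the threshold as triggered, i.e. 100% complete."""
    return 100.0 if percent_availability < threshold else percent_availability


# Hypothetical station-day: two continuous channels and one with 1.2% data.
raw = [99.5, 98.0, 1.2]
adjusted = [adjust_for_triggered(p) for p in raw]
print(sum(adjusted) / len(adjusted))
```

With these sample values the adjusted average is about 99.2, whereas averaging the raw values would give about 66.2. If the third channel were in fact a continuous channel with a large gap, the adjusted figure would overstate completeness.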

Author(s)

Robert Casey

Contact

dmc_qa@iris.washington.edu

Updated

2019-01-22

NSF SAGE Facility MUSTANG metrics Web Service Documentation

station_completeness Station Percent Available Per Day