Help: noise-pdf v.1

Description

The noise-pdf web service returns Probability Density Functions (PDFs) for seismic channels.

Summary

Probability Density Functions (PDFs) can be useful for visualizing the characteristic noise levels of seismic data. The noise-pdf web service uses the Power Spectral Density (PSD) results from the noise-psd web service to construct PDFs. This service can return two formats:

  1. text or xml tabulating frequency/power bins and the count of PSD occurrences within those bins (“frequency, power, hits”), suitable for creating a histogram.
  2. PDF plot as described in Ambient Noise levels in the Continental United States by Daniel E. McNamara and Raymond P. Buland.
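
As a rough illustration of how a request for one of these formats might be assembled, the sketch below builds a query string from the target, starttime and endtime options discussed later on this page. The base URL is a placeholder, and the "format" parameter name and its values are assumptions made for illustration; check the service's own parameter list before relying on them.

```python
from urllib.parse import urlencode

# Placeholder endpoint: NOT the real service address.
BASE_URL = "https://example.org/noise-pdf/1/query"

def build_query(target, starttime=None, endtime=None, fmt="plot"):
    """Assemble a request URL; 'format' is an assumed parameter name."""
    params = {"target": target}
    if starttime:
        params["starttime"] = starttime
    if endtime:
        params["endtime"] = endtime
    params["format"] = fmt
    # Keep '*' readable so wildcard targets stay legible in the URL.
    return BASE_URL + "?" + urlencode(params, safe="*")

print(build_query("AA.*.*.BH*.M", "2013-11-29", "2015-02-07", fmt="text"))
```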

Algorithm

  1. Gather PSDs for requested station(s)-channel(s) and time period.
  2. Remove instrument transfer function from each PSD.
  3. Tabulate number of PSDs that fall into each combination of period (1/8 octave intervals) and power (1 dB intervals) bins. This information is stored in the database as described in the section Understanding Query Latency below.
  4. If text or xml output is requested, return tabulated values as “frequency, power, hits”.
  5. If plot output is requested, calculate the probability distribution and return a PDF plot.
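
Steps 3 and 5 can be sketched as follows. This is a minimal illustration, not the service's implementation: it assumes each PSD has already been reduced to one power value per period bin, that bin edges fall on whole decibels and on multiples of 1/8 octave relative to a 1 second period, and that the probability for a cell is its hit count divided by all hits in the same period bin. The function names are hypothetical.

```python
import math
from collections import Counter

def period_bin(period_s, per_octave=8):
    """Index of the 1/8-octave bin containing a period given in seconds."""
    return math.floor(math.log2(period_s) * per_octave)

def tabulate(psds):
    """Count hits per (period bin, 1 dB power bin).

    psds: iterable of PSDs, each a list of (frequency_hz, power_db) pairs.
    """
    hits = Counter()
    for psd in psds:
        for freq_hz, power_db in psd:
            hits[(period_bin(1.0 / freq_hz), math.floor(power_db))] += 1
    return hits

def to_probability(hits):
    """Convert counts to the fraction of hits within each period bin."""
    totals = Counter()
    for (p_bin, _power_bin), n in hits.items():
        totals[p_bin] += n
    return {key: n / totals[key[0]] for key, n in hits.items()}
```

The Counter returned by tabulate corresponds to the "frequency, power, hits" rows of the text and xml output, while to_probability gives the values that a plot would map onto the color scale.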

Example PDF Plot
Example PDF plot produced by the noise-pdf web service.

PDF Plot Color Scale

The PDF plot color scale is ordered from low to high probabilities as a gradient from white (0%) – magenta (>0%) – blue (6%) – turquoise (12%) – green (18%) – yellow (24%) – red (30%+).
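
To make the scale concrete, the sketch below locates a probability between the color stops listed above. The stop values come from the description; the assumption that colors blend linearly between neighboring stops, and the function name, are illustrative only.

```python
# Color stops from the scale above, as (probability in percent, color name).
# White is reserved for exactly 0%; magenta is the color just above 0%.
STOPS = [(0, "magenta"), (6, "blue"), (12, "turquoise"),
         (18, "green"), (24, "yellow"), (30, "red")]

def color_position(percent):
    """Return (lower color, upper color, fraction of the way between them)."""
    if percent <= 0:
        return ("white", "white", 0.0)
    if percent >= 30:
        return ("red", "red", 0.0)
    for (lo, c_lo), (hi, c_hi) in zip(STOPS, STOPS[1:]):
        if percent < hi:
            return (c_lo, c_hi, (percent - lo) / (hi - lo))

print(color_position(9))  # halfway between blue and turquoise
```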

Plot Options

There are many options available to customize the PDF plot output:

  1. New High and Low Noise Models (default=yes)
  2. Minimum, maximum, mode curves (default=yes)
  3. Legend for noise models, min/mode/max curves (default=yes)
  4. Interpolation type (bicubic, bilinear, or none)
  5. Plot size in pixels
  6. Title, subtitle
  7. Axes limits for frequency/period and power
  8. Choice of frequency or period x-axis labels (default=both)
  9. Font size for title, subtitle, or axes labels

Constraints

Channels are currently limited to those matching [BH]H? (for example, BHZ or HHZ).

Understanding Query Latency

When a request is made, the web service sends a series of queries to a backend SQL database. The results of those queries are compiled into a histogram, which is returned as a plot, XML document, or text document. The amount of time that a query takes is roughly proportional to the number of rows of data that must be returned from the database. There are two factors that determine the number of database rows that must be processed:

  1. The number of “targets” selected
  2. The time interval selected.

The number of targets processed depends on the target query option and on which targets have actually been measured in the given time range. Asking for targets that do not have measurements does not incur a significant penalty. For example, if a query had the target selection target=AA.*.*.*.M and the AA network had many different channel codes, but only BHZ, BH1, and BH2 were measured by the MUSTANG system, only these channels would be processed by the web service. Requesting target=AA.*.*.BH*.M would result in the same number of rows being processed and would take the same amount of time.
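
The effect of the two target selections can be illustrated with ordinary wildcard matching. In this sketch the station code STA1, the location code 00, and the list of measured targets are made up for the example, and Python's fnmatch rules merely stand in for whatever matching the service performs.

```python
from fnmatch import fnmatchcase

# Hypothetical set of targets that actually have measurements.
measured = ["AA.STA1.00.BHZ.M", "AA.STA1.00.BH1.M", "AA.STA1.00.BH2.M"]

def effective_targets(pattern, measured_targets):
    """Targets that both match the pattern and have measurements."""
    return [t for t in measured_targets if fnmatchcase(t, pattern)]

broad = effective_targets("AA.*.*.*.M", measured)
narrow = effective_targets("AA.*.*.BH*.M", measured)
print(len(broad), len(narrow))  # both selections reach the same targets
```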

Understanding how the data is stored in the database helps to make sense of the effect of the selected time interval (starttime and endtime). The data is stored in 5 date range tiers:

  1. Day
  2. Week
  3. Month
  4. Year
  5. All-time

When an arbitrary time interval is selected, rows are retrieved from these 5 tiers to fill the selected time range in the most efficient manner possible. For example, consider the request:

starttime=2013-11-29&endtime=2015-02-07

Assuming data is available to completely cover this time interval, this request would be filled by retrieving rows for the following dates:

Interval   Dates                     Rows
Day        2013-11-29, 2013-11-30    2
Week       2015-02-01                1
Month      2013-12, 2015-01          2
Year       2014                      1
Total rows: 6

If the selected time interval were:

starttime=2013-01-01&endtime=2015-01-01

only 2 rows would need to be processed: 2013 and 2014.

If no time interval is selected or if the time interval is wider than the available measurements, only the all-time interval row needs to be processed.
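
One way to picture the tier selection is a greedy walk through the interval that always takes the largest tier starting at the current date and fitting inside the remaining range. The sketch below is an illustration under several assumptions, not the service's actual algorithm: the end date is treated as exclusive, weeks are assumed to start on Sunday and not to cross a month boundary, and the all-time tier is left out.

```python
from datetime import date, timedelta

def next_month(d):
    """First day of the month following d."""
    return date(d.year + 1, 1, 1) if d.month == 12 else date(d.year, d.month + 1, 1)

def decompose(start, end):
    """Split [start, end) into (tier, label) rows, preferring larger tiers."""
    rows = []
    cur = start
    while cur < end:
        week_end = cur + timedelta(days=7)
        if cur.month == 1 and cur.day == 1 and date(cur.year + 1, 1, 1) <= end:
            rows.append(("year", str(cur.year)))
            cur = date(cur.year + 1, 1, 1)
        elif cur.day == 1 and next_month(cur) <= end:
            rows.append(("month", cur.strftime("%Y-%m")))
            cur = next_month(cur)
        elif cur.weekday() == 6 and week_end <= end and week_end <= next_month(cur):
            rows.append(("week", cur.isoformat()))  # weekday() == 6 is Sunday
            cur = week_end
        else:
            rows.append(("day", cur.isoformat()))
            cur += timedelta(days=1)
    return rows

first = decompose(date(2013, 11, 29), date(2015, 2, 8))
second = decompose(date(2013, 1, 1), date(2015, 1, 1))
print(len(first), len(second))
```

Because the sketch treats the end date as exclusive, the first example is written with an end of 2015-02-08 so that the week beginning 2015-02-01 is fully covered.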

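The actual total number of rows processed is the product of the date range row count and the number of effective targets selected. For example, the first request above (6 date range rows) applied to 3 effective targets would process 18 rows.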

Scholarly Articles

The following articles describe the use of Probability Density Function (PDF) plots.

Ambient Noise levels in the Continental United States by Daniel E. McNamara and Raymond P. Buland
Ambient Noise Probability Density Functions by D. McNamara and R. Boaz