diff --git a/.README.md.swp b/.README.md.swp new file mode 100644 index 0000000..6e0fec4 Binary files /dev/null and b/.README.md.swp differ diff --git a/content/post/pmu-siting.md b/content/post/pmu-siting.md deleted file mode 100644 index c7e4590..0000000 --- a/content/post/pmu-siting.md +++ /dev/null @@ -1,74 +0,0 @@ ---- -date: '2021-02-21T01:00:00Z' -description: Wondering where to put PMUs? Here's a guide to siting and installation -featuredImage: '/assets/images/post/default.jpg' -tags: -- explainers -- ni4ai-data - -title: Practical Guide to PMU Siting -author: laurel ---- - -Sensor siting affects the nature and quality of the data that’s generated. Dense sensor coverage can provide broad visibility, while reducing uncertainty associated with estimating state variables at different nodes on the system. Many applications, however, can be supported with data from only one sensor that’s strategically placed to provide data relevant to the problem at hand. - - -This blog post summarizes theoretical and practical considerations relevant to siting PMUs on a distribution grid. - -# Theoretical Considerations -The main distinction between a transmission PMU and a distribution PMU (e.g., the microPMU) is that time synchronization on distribution PMUs must be more accurate in order to capture phase angle differences typical in distribution systems. The microPMU, for example, reports phase angles that are accurate down to 0.01 degrees. - -## Separation -The phase angle difference between two nodes on a distribution system depends on two characteristics of the network: -1. Power flow -2. Line impedance - -We can relate phase angle differences to these quantities using the power flow equations. 
Alternately, the LinDistFlow equations [1] offer a linear approximation of power flow based on assumptions characteristic of distribution systems: -$$\delta_a-\delta_{b} \approx xP-rQ$$ -Where $\delta_a-\delta_{b}$ describes the phase angle difference between two nodes, $x+jr$ describes the impedance of the line between them, and $P+jQ$ describes power flow along the line. Given phase angle measurement precision of 0.01 degrees, the second equation coupled with what’s known about the impedance and power flow of the network to compute the minimum physical separation between sensors. - -## Placement -When placing PMUs, it is convenient to think about the value of the current and voltage measurements independently. A current measurement provides visibility on a single, specific flow: that of the line or load being measured. A voltage measurement provides broader visibility across an area, since the voltage at one network node is affected by multiple current flows within the network. - -Consider, for example, a radial network with one PMU at the substation or root node, and one at a line end or leaf node. The voltage difference between these two PMUs will be affected by a change in load at any point in the network. Voltages at nearby nodes are highly correlated, therefore the value of placing two PMUs near each other comes from the additional current rather than voltage measurement. PMU voltage measurements spread out throughout a network, with one PMU per rough cluster of nodes, provide the greatest visibility on events occurring across the network. In radial networks, a measurement at the substation, and a comprehensive set of leaf measurements provides thorough visibility into load or network changes. 
When choosing an installation point within a cluster of nodes, or among the full set of leaves, the current measurement is a useful differentiator: preference should be given to locations where more important current measurements (of large or interesting flows) can be made. - -## Measurement error -In distribution networks, estimating impedances from PMU measurements is bedeviled by transducer error. This further reduces the incentives for proximal installations---for example at two ends of a line. Instead, spreading PMUs through a network as described enables their high-resolution measurements to provide broad visibility into significant system changes. Depending on the precise installation configuration and coverage, changes can then be localized to specific parts of the network, with greater specificity as measurement coverage increases. - -# Installation -PMUs can be installed on any current transformer, electrical panel, or wall outlet. Different deployment strategies can affect which quantities are measurable, as well as the nature of the dynamics that can be observed. - -## On the Grid -PMUs can be connected directly to the grid at service voltages up to 750V (line-to-neutral). To monitor networks at a higher voltage, the sensor must be connected through a potential transformer (PT). It is important to note that measurements taken through the PT is an additional source of measurement noise. This should be taken into account in choosing a transformer, and in calibrating the sensor. - - -Current measurements are recorded by a high-precision current transformer (CT) purchased along with the sensor. Depending on the rated capacity of the grid, it may be necessary to connect the sensor’s CT to the secondary side of an existing CT installed for metering or protection. It is important to again note that placing a CT between the measurement site and the grid itself contributes to measurement noise. 
- -## Electrical Panels -The sensor can be installed directly on an electrical panel to measure service voltage and current at the customer. For customers that receive single-phase service, measurements will of course be limited to single-phase voltage and current measurements. - - -Despite ease of installation, it is important to note that measurements recorded on secondary distribution may be much noisier than measurements recorded on medium-voltage grids due largely to volatility in load. Furthermore, signals that are interesting or relevant to monitor in one region of the network may be obfuscated by transformers. Sensors should be installed as close to the area of interest as possible, as sensors installed on secondary distribution may not capture dynamics relevant to distribution grid operation. Depending on the nature of the dynamics that are of interest to monitor, measurements that are sensitive to changes in customer load may make it difficult to differentiate changes in load from events on the grid. - -## Wall Outlets -Sensors installed in a wall outlet measure single-phase voltage, but not current (as at the electrical panel). Sensors installed in wall outlets are also subject to the same limitations on data quality and measurement noise as sensors installed on electrical panels. - - -Wall outlet data can, however, be used to detect voltage sags or spikes happening on the distribution grid, which can be traced to arcs and faults related to risk factors like vegetation contact or equipment degradation. - - -Wall outlet measurements can also provide insights valuable for wide-area monitoring purposes. Installing PMUs in wall outlets allows for independent grid monitoring efforts which circumvent institutional barriers to making data accessible across service territories. 
After filtering out measurement noise, wall outlet data can be used to monitoring frequency on the grid, and to study phase angle differences which indicate the direction of power flow across different regions of an interconnect. - -# Practical considerations -Practically speaking, choosing where to site sensors and how many to install really depends on which applications you wish to enable, or what dynamics you want to be able to see. - - -While state estimation and fault location require dense sensor coverage, practitioners willing to accept some uncertainty may find that less coverage is sufficient for their purposes. For example, comparing measurements from sensors on different laterals may allow operators to determine in which general area of the grid a fault occurred, without locating it precisely. - - -For control and monitoring applications, the heuristics are much simpler. Control applications may require a PMU sited at each decision point on the network -- for example at each inverter, battery, or controllable device. Monitoring purposes may also be supported by simply installing a sensor in each area of the system that is of interest. Examples could include siting sensors at the substation, at each generator, and at each major branch point on the network. - - -# References -[1] Powerside. microPMU tech sheet (2020). https://powerside.com/wp-content/uploads/2020/12/MicroPMU-LV-Data-Sheet.V1_En.pdf -[2] A Ostfeld, K Brady, L Mehrmanesh and A von Meier. Reference document for microPMU installation (2018). 
https://www.naspi.org/sites/default/files/reference_documents/uPMU%20Installation%20Ref%20Manual%20Mar%2023%202018.pdf \ No newline at end of file diff --git a/content/post/point-on-wave-1.md b/content/post/point-on-wave-1.md new file mode 100644 index 0000000..07bc825 --- /dev/null +++ b/content/post/point-on-wave-1.md @@ -0,0 +1,42 @@ +--- +date: '2021-08-06T01:00:00Z' +description: Exploring point on wave data using spectral analysis +featuredImage: '/assets/images/post/point-on-wave-2/fig1.png' +tags: ["btrdb", "python", "angles", "wams", "analytics", "phasors"] + +title: Point on Wave Data (Part 1) +author: miles +--- + + +PMU phasor data is created from point-on-wave (POW) data, which is the raw voltage waveform of a node on the power grid. Below is a window of POW data from the GridSweep sensor, which has a sampling rate of 4.3kHz, measuring the voltage at a household outlet in Oakland, CA [1]. This means we are measuring the voltage at a node of the distribution grid in that area. This data can be found at POW/GridSweep in the NI4AI database. + +![png](/assets/images/post/point-on-wave-1/fig1.png) + +As you can see, raw POW data is visually hard to interpret. One thing we can do is look at its frequency spectrum using the discrete Fourier transform (DFT): + +![png](/assets/images/post/point-on-wave-1/fig2.png) + +From the spectrum, we see most of the signal power at the base frequency of 60Hz, as well as significant power in the odd harmonics at 180Hz, 300Hz, 420Hz, etc. + +One thing we can do with point-on-wave data that we cannot do with phasor data is compute the harmonic distortion present in the voltage signal. The total harmonic distortion (THD) compares the power present in the fundamental frequency of 60Hz to the power present in the higher-order harmonics. The formula for THD is as follows: + +$$ +THD = \frac{\sqrt{V_2^2 + V_3^2 + V_4^2 + ... + V_n^2}}{V_1} +$$ + +Where $V_1$, $V_2$, and $V_n$ are the amplitudes of the 1st (fundamental), 2nd, and nth harmonics, respectively. 
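As a rough illustration of the THD definition, the sketch below computes the ratio from a list of harmonic amplitudes. It assumes the common convention in which the square root of the summed squared harmonic amplitudes is divided by the fundamental amplitude. The function name `total_harmonic_distortion` and the amplitude values are hypothetical and are not taken from the GridSweep data.

```python
import math


def total_harmonic_distortion(fundamental, harmonics):
    """Return THD: root-sum-square of the harmonic amplitudes over the fundamental.

    fundamental -- amplitude of the 1st harmonic (V_1), must be positive
    harmonics   -- amplitudes of the higher harmonics (V_2, V_3, ..., V_n)
    """
    if fundamental <= 0:
        raise ValueError("fundamental amplitude must be positive")
    return math.sqrt(sum(v ** 2 for v in harmonics)) / fundamental


# Hypothetical amplitudes in volts, chosen only to make the arithmetic easy to follow.
v1 = 120.0
higher = [0.0, 3.0, 0.0, 4.0]  # V_2 .. V_5

thd = total_harmonic_distortion(v1, higher)
print(f"THD = {thd:.4f}")  # sqrt(3^2 + 4^2) / 120 = 5 / 120
```

With real data, the amplitudes would come from the peaks of the DFT at each harmonic rather than from hand-picked numbers.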
+ +To compute the THD, we need to look at the peaks of the signal’s DFT. To get a more accurate estimate of the peak magnitude, we interpolate between the DFT data points by fitting the data to cubic polynomials, as shown below. This is done using the [spline interpolation](https://docs.scipy.org/doc/scipy/reference/generated/scipy.interpolate.splrep.html) functions provided by SciPy. + +![png](/assets/images/post/point-on-wave-1/fig3.png) + +The plot above of the 7th harmonic near 420Hz illustrates that the true peak of the spectrum lies somewhere between the discrete frequencies computed by the DFT. Using an interpolation allows us to get a closer estimate of the maximum. In this way, we find the 7th harmonic, $V_7$, of the formula for the THD above. + +To compute the THD, we find the maximum amplitude for each harmonic in the spectrum. Computing the THD this way, we get the following value: + +```python +THD = 0.038 +``` + +This means the combined amplitude of the harmonics is 3.8% of the amplitude of the 60Hz fundamental. diff --git a/content/post/point-on-wave-2.md b/content/post/point-on-wave-2.md new file mode 100644 index 0000000..0615422 --- /dev/null +++ b/content/post/point-on-wave-2.md @@ -0,0 +1,31 @@ +--- +date: '2021-08-06T01:00:00Z' +description: Visualizing point on wave data with spectrograms +featuredImage: '/assets/images/post/point-on-wave-2/fig2.png' +tags: ["btrdb", "python", "angles", "wams", "analytics", "phasors"] + +title: Point on Wave Data (Part 2) +author: miles +--- + + +We can also visualize how the spectrum changes in time by computing the spectrogram. After taking a spectrogram, we see the following: + +![png](/assets/images/post/point-on-wave-2/fig1.png) + + +We see the harmonics present in the spectrogram as well. + +Additionally, because this data has a high sampling frequency as well as a long time window, we see the frequency domain with a high level of detail. 
Zooming into the low frequencies we see some interesting features: + +![png](/assets/images/post/point-on-wave-2/fig2.png) + + +Horizontal lines are sustained oscillations, and we see some sustained oscillation at subharmonics 30Hz and 15Hz. Vertical lines are transient events that contain many frequency components over a short amount of time. +We see pings, which are transient events with frequencies centered around 20Hz. +We also see a sustained oscillation that changes its frequency in a seemingly random manner. +Since this data is initially unlabelled, a next step is to identify these different signals in the spectrogram. Having information about devices that may be affecting the local household voltage will help to associate these signals with their sources. + + +## References +[1] Data recorded during on-going U.S. Dept of Energy Project "GridSweep: Frequency Response of Low-Inertial Bulk Grids": 24 hours of GPS-time-stamped 4.3kHz point-on-wave sampling at 29-bit resolution, single-phase 120-volt nominal, recorded at Alex McEachern's residential kitchen in Alameda, California, USA on 2021/05/10 - 2021/05/11. diff --git a/content/post/visualizing-aggregates.md b/content/post/visualizing-aggregates.md new file mode 100644 index 0000000..e5cb1ee --- /dev/null +++ b/content/post/visualizing-aggregates.md @@ -0,0 +1,393 @@ +--- +date: '2021-06-29T01:00:00Z' +description: Exploring patterns observed over months or years of data +featuredImage: '/assets/images/post/visualizing-aggregates/output_11_0.png' +tags: ["btrdb", "python", "angles", "wams", "analytics", "phasors"] + +title: Visualizing Aggregates +author: laurel +--- + + +# Visualizing Aggregates + +This blog post showcases how to use statistical aggregates (or StatPoints) to visualize trends and anomalies in data over very long time series. 
Visualizing aggregates rather than raw point values can provide valuable high-level perspective about what values or patterns are "typical" to see in data, and what statistical properties may characterize events that are more unusual. + +New to StatPoints? Start with Tutorial 5 - [Working with Statpoints](https://github.com/PingThingsIO/ni4ai-notebooks/blob/main/tutorials/5%20-%20Working%20with%20StatPoints.ipynb). + + +```python +import btrdb +import pandas as pd +import numpy as np + +import time + + +from datetime import datetime, timedelta + +from matplotlib import pyplot as plt +from btrdb.utils import timez + +db = btrdb.connect() +``` + + +```python +streams = db.streams_in_collection('sunshine/PMU1', tags={'unit': 'volts'}) + +pd.DataFrame([[s.name, s.unit, s.collection] for s in streams], + columns=['name','unit','collection']) +``` + + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
nameunitcollection
0L3MAGvoltssunshine/PMU1
1L1MAGvoltssunshine/PMU1
2L2MAGvoltssunshine/PMU1
+
+ + + + +```python +stream = db.stream_from_uuid(streams[1].uuid) + +def get_time(stream, func): + return timez.ns_to_datetime(getattr(stream, func)()[0].time) + +print('start:', get_time(streams[0], 'earliest')) +print('end:', get_time(streams[0], 'latest')) +print(str(get_time(streams[0], 'latest') - get_time(streams[0], 'earliest'))) +``` + + start: 2015-10-01 16:08:24.008333+00:00 + end: 2017-04-15 01:41:35.999999+00:00 + 561 days, 9:33:11.991666 + + +See it in the plotter: https://plot.ni4ai.org/permalink/VKne4LTTl + + +```python +start_time = datetime(2016,4,1) +end_time = datetime(2017,4,1) + +start_ns = timez.datetime_to_ns(start_time) +end_ns = timez.datetime_to_ns(end_time) +``` + +https://plot.ni4ai.org/permalink/2KN5iCXw5 + + +```python +window = timez.ns_delta(days=30) +pw = int(np.log2(window)) + +points, _ = zip(*stream.aligned_windows(start_ns, end_ns, pointwidth=pw)) +points +``` + + + + + (StatPoint(1459166279268040704, 6825.37109375, 7157.580867829119, 7301.88525390625, 269591739, 36.51418677591111), + StatPoint(1461418079081725952, 6580.9541015625, 7161.23304578515, 7300.8623046875, 269692154, 35.61453444076186), + StatPoint(1463669878895411200, 6796.833984375, 7160.458736401789, 7286.55126953125, 147383982, 34.1646855496053), + StatPoint(1465921678709096448, 6964.91455078125, 7159.378036144785, 7325.37158203125, 206497053, 40.14920297145494), + StatPoint(1468173478522781696, 5558.57421875, 7167.452925690963, 7294.76318359375, 269810992, 39.238088874956624), + StatPoint(1470425278336466944, 5780.056640625, 7161.174368865373, 7307.212890625, 268549058, 41.321696619794494), + StatPoint(1472677078150152192, 6392.52099609375, 7161.894405845441, 7318.228515625, 268988146, 39.96141042153979), + StatPoint(1474928877963837440, 5955.3212890625, 7158.670968887105, 7300.72509765625, 255962541, 39.97700500308349), + StatPoint(1477180677777522688, 5097.31298828125, 7156.974233908252, 7283.45361328125, 268787149, 36.33054088798969), + 
StatPoint(1479432477591207936, 6591.23486328125, 7158.855179375172, 7281.74755859375, 270215978, 33.48565115468457), + StatPoint(1481684277404893184, 6501.1142578125, 7162.591062269605, 7282.77734375, 270216211, 34.53212742881303), + StatPoint(1483936077218578432, 6076.43017578125, 7162.1296525714415, 7262.462890625, 270089894, 35.17972671790351), + StatPoint(1486187877032263680, 6127.64404296875, 7153.473267831473, 7292.42724609375, 234816262, 37.10013475124324), + StatPoint(1488439676845948928, 5907.42041015625, 7160.925875368738, 7297.984375, 270196174, 34.90643606628646)) + + + + +```python +def points_to_dataframe(points, + aggregates=['time','min','max','mean','stddev','count'], + use_datetime_index=True): + df = pd.DataFrame([[getattr(p, agg) for agg in aggregates] for p in points], + columns=aggregates) + if use_datetime_index: + df['datetime'] = [timez.ns_to_datetime(t) for t in df.time] + df = df.set_index('datetime') + return df + +df = points_to_dataframe(points) +df +``` + + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
timeminmaxmeanstddevcount
datetime
2016-03-28 11:57:59.268041+00:0014591662792680407046825.3710947301.8852547157.58086836.514187269591739
2016-04-23 13:27:59.081726+00:0014614180790817259526580.9541027300.8623057161.23304635.614534269692154
2016-05-19 14:57:58.895411+00:0014636698788954112006796.8339847286.5512707160.45873634.164686147383982
2016-06-14 16:27:58.709096+00:0014659216787090964486964.9145517325.3715827159.37803640.149203206497053
2016-07-10 17:57:58.522782+00:0014681734785227816965558.5742197294.7631847167.45292639.238089269810992
2016-08-05 19:27:58.336467+00:0014704252783364669445780.0566417307.2128917161.17436941.321697268549058
2016-08-31 20:57:58.150152+00:0014726770781501521926392.5209967318.2285167161.89440639.961410268988146
2016-09-26 22:27:57.963837+00:0014749288779638374405955.3212897300.7250987158.67096939.977005255962541
2016-10-22 23:57:57.777523+00:0014771806777775226885097.3129887283.4536137156.97423436.330541268787149
2016-11-18 01:27:57.591208+00:0014794324775912079366591.2348637281.7475597158.85517933.485651270215978
2016-12-14 02:57:57.404893+00:0014816842774048931846501.1142587282.7773447162.59106234.532127270216211
2017-01-09 04:27:57.218578+00:0014839360772185784326076.4301767262.4628917162.12965335.179727270089894
2017-02-04 05:57:57.032264+00:0014861878770322636806127.6440437292.4272467153.47326837.100135234816262
2017-03-02 07:27:56.845949+00:0014884396768459489285907.4204107297.9843757160.92587534.906436270196174
+
+ + + + +```python +def plot_aggregates(df, vlines=[], hlines=[]): + fig, ax = plt.subplots(figsize=(15,3)) + df['min'].plot(ax=ax, ls=' ', marker='_', color='black', markersize=5, label='minimum') + df['max'].plot(ax=ax, ls=' ', marker='_', color='black', markersize=5, label='maximum') + df['mean'].plot(ax=ax, label='average', ls=' ', marker='.') + ax.fill_between(df.index, df['mean']-df['stddev'], df['mean'] + df['stddev'], alpha=0.5, label=r'$+/- 1\times\sigma$') + plt.legend() + + ax.vlines(vlines, *ax.get_ylim(), color='0.5', alpha=0.5, zorder=10, lw=3, label='events') + ax.hlines(hlines, *ax.get_xlim(), color='0.5', zorder=10, lw=1, ls='--', label='threshold') + return fig + +plot_aggregates(df) +plt.show() +``` + + +![png](/assets/images/post/visualizing-aggregates/output_9_0.png) + + + +```python +window = timez.ns_delta(days=7) +pw = int(np.log2(window)) + +points, _ = zip(*stream.aligned_windows(start_ns, end_ns, pointwidth=pw)) +df = points_to_dataframe(points) +fig = plot_aggregates(df) +``` + + +![png](/assets/images/post/visualizing-aggregates/output_10_0.png) + + + +```python +window = timez.ns_delta(days=1) +pw = int(np.log2(window)) + +points, _ = zip(*stream.aligned_windows(start_ns, end_ns, pointwidth=pw)) +df = points_to_dataframe(points) +fig = plot_aggregates(df) +``` + + +![png](/assets/images/post/visualizing-aggregates/output_11_0.png) + + + +```python + +``` diff --git a/content/post/windows.md b/content/post/windows.md index 3cdf3ef..a002c02 100644 --- a/content/post/windows.md +++ b/content/post/windows.md @@ -1,21 +1,24 @@ --- date: '2021-06-23T14:00:00+0000' -description: This blog post explores how to maximize your efficiency in working with large datasetes using `windows`, `aligned_windows` and `values` queries (Photo credit Roald Dahl). 
+description: This blog post explores three API calls for querying time series data - `windows`, `aligned_windows` and `values` featuredImage: '/assets/images/post/windows/bfg_book.png' tags: - ni4ai-tutorials -title: Working with big data +title: Querying Time Series author: laurel --- +The term "Big Data" refers to data too large to be analyzed using conventional methods. A conventional technique might be, for example, to load data from CSV into memory in order to perform operations on it. With big data, this approach will quickly overwhelm most computing environments. -# Windows, aligned windows, and values +This blog post describes three user-friendly approaches to querying time series data, including workflows that will make working with big data more approachable for users of all skill levels. + +Photo credit: The BFG (Big Friendly Giant) by Roald Dahl. -This tutorial offers a guide on using the PredictiveGrid to work wtih VERY big data sets. +# Windows, aligned windows, and values -When working with high-resolution time series data, seemingly simple tasks can quickly become intractable. The reason for this is that the volume of data exceeds the computational limits of most most computing environments. +When working with high-resolution time series data, performing operations using seemingly small windows of data (e.g., one week) can quickly overwhelm most computing environments. -Here, we'll describe three methods for querying data in PredictiveGrid. In practice none of these is "better" than another -- there is a time and a place for each. This post will weigh the relative advantages of each approach. +Below, we'll describe three methods for querying data in PredictiveGrid. In practice none of these is "better" than another -- there is a time and a place for each. This post will weigh the relative advantages of each approach. 
### Functions used - `stream.values()` diff --git a/static/assets/images/post/point-on-wave-1/fig1.png b/static/assets/images/post/point-on-wave-1/fig1.png new file mode 100644 index 0000000..2cb4bdd Binary files /dev/null and b/static/assets/images/post/point-on-wave-1/fig1.png differ diff --git a/static/assets/images/post/point-on-wave-1/fig2.png b/static/assets/images/post/point-on-wave-1/fig2.png new file mode 100644 index 0000000..565f6e4 Binary files /dev/null and b/static/assets/images/post/point-on-wave-1/fig2.png differ diff --git a/static/assets/images/post/point-on-wave-1/fig3.png b/static/assets/images/post/point-on-wave-1/fig3.png new file mode 100644 index 0000000..315d4e2 Binary files /dev/null and b/static/assets/images/post/point-on-wave-1/fig3.png differ diff --git a/static/assets/images/post/point-on-wave-2/fig1.png b/static/assets/images/post/point-on-wave-2/fig1.png new file mode 100644 index 0000000..8f149fa Binary files /dev/null and b/static/assets/images/post/point-on-wave-2/fig1.png differ diff --git a/static/assets/images/post/point-on-wave-2/fig2.png b/static/assets/images/post/point-on-wave-2/fig2.png new file mode 100644 index 0000000..52004ab Binary files /dev/null and b/static/assets/images/post/point-on-wave-2/fig2.png differ diff --git a/static/assets/images/post/visualizing-aggregates/output_10_0.png b/static/assets/images/post/visualizing-aggregates/output_10_0.png new file mode 100644 index 0000000..bd8b11c Binary files /dev/null and b/static/assets/images/post/visualizing-aggregates/output_10_0.png differ diff --git a/static/assets/images/post/visualizing-aggregates/output_11_0.png b/static/assets/images/post/visualizing-aggregates/output_11_0.png new file mode 100644 index 0000000..1515837 Binary files /dev/null and b/static/assets/images/post/visualizing-aggregates/output_11_0.png differ diff --git a/static/assets/images/post/visualizing-aggregates/output_9_0.png 
b/static/assets/images/post/visualizing-aggregates/output_9_0.png new file mode 100644 index 0000000..1368f90 Binary files /dev/null and b/static/assets/images/post/visualizing-aggregates/output_9_0.png differ