Abstract

This research was motivated by the desire to understand Australian smart meter data collected half-hourly over two years at the household level. As temporal data become available at ever finer scales, exploring periodicity can become overwhelming because there are so many possible temporal deconstructions to consider. Analysts are expected to explore comprehensively the many ways of viewing temporal data, but the plethora of choices and the lack of a systematic approach for doing so quickly can make the task daunting.

This work investigates how time may be dissected to produce alternative segmentations of the data and, in turn, different visualizations that can aid in identifying underlying patterns. The first contribution (Chapter 2) describes classes of time deconstructions using linear and cyclic time granularities. It provides tools to compute possible cyclic granularities from an ordered (usually temporal) index, as well as a framework to systematically explore the distribution of a univariate variable conditional on two cyclic time granularities by defining “harmony.” A “harmony” is a pair of granularities that can be effectively analyzed together; restricting attention to harmonies reduces the search from all possible pairs. Even so, the number of harmonies remaining is still too large for human consumption. The second contribution (Chapter 3) refines the search for informative granularities by identifying those for which the differences between the displayed distributions are greatest and ranking them in order of importance for capturing the most variation. The third contribution (Chapter 4) builds upon the first two to provide methods for exploring heterogeneities in repetitive behavior across many time series and multiple granularities. It accomplishes this by clustering time series based on their probability distributions across informative cyclic granularities. Although we were motivated by the smart meter example, the problem and the solutions we propose are practically relevant to any temporal data observed more than once per year.

2021-11-17
