5.2 Software development

This thesis emphasizes implementing its research methods in open-source R packages, for reproducibility and ease of use in other applications. A significant amount of work has therefore been devoted to developing the R packages gravitas, hakear, and gracsr, each of which corresponds to a chapter of this thesis.

5.2.1 gravitas

The gravitas package provides general tools to compute and manipulate cyclic granularities and to plot distributions conditional on those granularities. The functions search_gran(), create_gran(), harmony(), gran_advice() and prob_plot() together provide a complete workflow for an analyst to systematically explore large quantities of temporal data across different harmonies (pairs of granularities that can be analyzed together). The package was developed during my Google Summer of Code 2019 internship and has been on CRAN since January 2020. The website (https://sayani07.github.io/gravitas) includes full documentation and two vignettes on package usage. The package was downloaded a total of 12K times from the RStudio mirror between 2020-11-01 and 2021-11-01. It supplements the paper corresponding to Chapter 2, which won the ACEMS Business Analytics Award 2021. The package can be generalized to non-temporal applications for which a hierarchical structure similar to that of time can be construed.
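The workflow formed by these five functions can be sketched as follows. This is a minimal illustration rather than a definitive recipe: it assumes the gravitas and dplyr packages are installed, uses the smart_meter10 data set that ships with gravitas, and takes the argument names and the household ID from the package documentation, so they should be checked against the installed version.

```r
# Assumes gravitas and dplyr are installed; smart_meter10 ships with gravitas.
library(gravitas)
library(dplyr)

# 1. List the cyclic granularities available up to a weekly cycle.
grans <- smart_meter10 %>%
  search_gran(highest_unit = "week")

# 2. Append one cyclic granularity (hour of the day) as a new column.
with_gran <- smart_meter10 %>%
  create_gran("hour_day")

# 3. Tabulate harmonies, i.e. pairs of granularities that can be analyzed together.
harmonies <- smart_meter10 %>%
  harmony(ugran = "day", filter_in = "wknd_wday")

# 4. Ask for advice on plotting one particular pair for a single household.
one_household <- smart_meter10 %>%
  filter(customer_id == "10017936")

one_household %>%
  gran_advice(gran1 = "wknd_wday", gran2 = "hour_day")

# 5. Plot the distribution of the response conditional on that pair.
p <- one_household %>%
  prob_plot(
    gran1 = "wknd_wday",
    gran2 = "hour_day",
    response = "general_supply_kwh",
    plot_type = "quantile",
    quantile_prob = c(0.1, 0.25, 0.5, 0.75, 0.9)
  )
p
```

The choice of "wknd_wday" and "hour_day" here is only an example pair; in practice the output of harmony() would guide which pairs to plot.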

5.2.2 hakear

The R package hakear (https://github.com/Sayani07/hakear) implements the ideas presented in Chapter 3. The function wpd() computes the weighted pairwise distances (\(wpd\)) for each cyclic granularity or pair of granularities, and select_harmonies() selects those with significant patterns and ranks them from highest to lowest \(wpd\). The selected harmonies can then be plotted with the gravitas package to obtain potentially interesting displays. The package relies on parallel processing across multiple multi-core computers to speed up the computation of \(wpd\).
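The intuition behind a distance-based measure of this kind can be illustrated with a small base R sketch. It is a deliberate simplification and not the implementation in hakear: it uses simulated data, summarizes each category of a single granularity by a vector of quantiles, and takes the largest Euclidean distance between categories, whereas the definition, weighting and normalization of \(wpd\) are those given in Chapter 3. All variable names are hypothetical.

```r
# Simplified illustration only: this is NOT hakear's wpd() implementation.
set.seed(1)

# Hypothetical data: a response observed with an hour-of-day label (0 to 23).
hour <- rep(0:23, times = 60)
# The response has a daytime shift, so hour-of-day categories differ.
y <- rnorm(length(hour), mean = ifelse(hour >= 8 & hour <= 20, 2, 0))

# Largest pairwise distance between the quantile vectors of the categories.
max_pairwise_dist <- function(response, category,
                              probs = seq(0.1, 0.9, by = 0.1)) {
  # One row of quantiles per category.
  q_by_cat <- t(sapply(split(response, category), quantile, probs = probs))
  # Euclidean distances between every pair of categories.
  d <- as.matrix(dist(q_by_cat))
  max(d)
}

# Distance for the structured data.
d_pattern <- max_pairwise_dist(y, hour)

# Distance after shuffling the response, which destroys the hourly pattern.
d_shuffled <- max_pairwise_dist(sample(y), hour)

c(pattern = d_pattern, shuffled = d_shuffled)
```

A granularity with a genuine pattern yields a clearly larger distance than its shuffled counterpart, which is the kind of contrast a ranking by \(wpd\) is meant to capture.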

5.2.3 gracsr

The open-source R package gracsr, available on GitHub (https://github.com/Sayani07/gracsr), implements the ideas presented in Chapter 4. It contains functions for the complete clustering methodology, with the choices of scaling and distance metrics discussed in the chapter. The package has received a grant (AUD 3000) as part of the ACEMS Business Analytics Prize towards polishing its functions and preparing it for CRAN.