swmmr - an R package to interface SWMM

ABSTRACT The stormwater management model SWMM of the US EPA is widely used to analyse, design or optimise urban drainage systems. This technical note introduces the R package swmmr, which supports advanced analysis and visualisation of model data. It contains functions to read and write SWMM files, to initiate simulations from the R console and to convert SWMM model files to and from GIS data. Additionally, model data can be transformed to produce high quality visualisations. In accordance with SWMM's open source policy, the package can be obtained through github.com or the Comprehensive R Archive Network (CRAN).


Introduction
Modelling urban drainage systems has become essential to develop and assess resilient urban stormwater management strategies. Analysing the impact of different climatic or demographic scenarios on urban water infrastructure or optimising urban drainage networks are only some of the applications. Various software products are available to model urban drainage systems. Amongst others, the stormwater management model SWMM (Rossman 2010) is widely used by researchers and practitioners to simulate dynamic hydrological, hydraulic and water quality processes. Its source code is released into the public domain and is available online from the US EPA (see Note 1). Besides the open source engine of SWMM, pre-compiled software for Microsoft Windows operating systems is available. The software also provides a graphical user interface (GUI) to design drainage networks and to assign attributes to elements of the system. While the open source software facilitates basic analysis and visualisation of model data, advanced features such as time series data management, parameter uncertainty analysis or extended statistics are reserved to commercial versions of SWMM.
In this respect, the free software environment for statistical computing and graphics R (R Core Team 2017) is frequently used by both scientists and engineers. It provides a huge variety of add-on packages which also cover issues related to hydrology in general and urban water modelling more specifically. For example, hydrology-specific packages support process-based modelling (e.g. reservoir; Turner and Galelli 2016), spatial data processing (e.g. Watersheds; Torres-Matallana 2016), model performance analysis (e.g. hydroGOF; Zambrano-Bigiarini 2017), or data exploration (e.g. wql; Jassby, Cloern, and Stachalek 2017). Moreover, the packages epanet2toolkit (Arandia and Eck 2018) and epanetReader (Eck 2016) interface R with EPANET 2 (Rossman 2010), a widely used water distribution systems model. A more comprehensive list is given in the CRAN Task View 'Hydrological Data and Modeling' (see Note 3). Further packages, not explicitly related to (urban) hydrology, provide functions to perform model parameter optimisation (e.g. DEoptim; Ardia et al. 2016), visualise data (e.g. dygraphs; Vanderkam et al. 2017; ggplot2; Wickham 2016), or manage time series data (e.g. xts; Ryan and Ulrich 2017). Additionally, with the development of the packages sp (Pebesma and Bivand 2005) and sf (simple features) (Pebesma 2018), R's spatial data processing capabilities have been significantly advanced. Consequently, as modelling in general involves both pre- and post-processing of different types of data such as spatial or time series data, the availability of these packages enables efficient model data management and allows modelling tasks of varying complexity to be addressed.
To bridge the gap between urban drainage modelling and advanced model analytics, we herein introduce the freely available R package swmmr, which provides functions to interface with SWMM. Core functions of the package comprise fast reading and writing of SWMM files, conversion between GIS data and the SWMM input file format, and model data transformation to produce expressive visualisations. This technical note describes the design principles of the swmmr package and exemplifies its usage. This includes a demonstration of how to produce high quality figures of model results and model structures with the help of further R packages.

What is the package useful for?
The main purpose of the swmmr package is to assist the modeller during the modelling process. Typically, this includes processing and visualisation of measurement and spatial data, for which the R ecosystem provides mature packages. However, R's capabilities for interactively creating and modifying spatial data are limited and should not yet be compared to specialised GIS software, though remarkable progress can be observed (mapview; Appelhans et al. 2018; mapedit; Appelhans and Russell 2017). Thus, the package is especially useful to modellers who use R for model data management and/or need to perform advanced analysis, visualisation or optimisation tasks on a given model or on model results.

Package design and core functions
At its core, the package relies on the tidy data concept (Wickham 2014), which is expressed through a set of harmonised packages sharing common data representation principles ('tidyverse'; Wickham 2017). Although most tasks could have been addressed with base R (see Note 4), packages from the 'tidyverse' tend to simplify both the programming and the data analysis. For example, swmmr uses tibbles (Müller and Wickham 2017) instead of R's built-in data.frame to represent SWMM sections because tibbles have a convenient print method which only shows the first 10 rows of data, and all the columns that fit on screen (Wickham and Grolemund 2016). This becomes especially useful when dealing with large SWMM data using functions such as read_inp(), read_rpt() and read_lid_rpt() (Table 1), as the console output remains readable even when large data sets are printed. Generally, these functions take the path to a corresponding SWMM file (*.inp or *.rpt) and parse its content to a named list of tibbles or a single tibble, respectively. read_inp() creates an object of class inp, whose list element names are identical to the names of the SWMM input sections present in the file, in lower case (e.g. options, subcatchments, etc.). To print a summary or to quickly visualise the model structure of the inp object, two generic functions, summary() and autoplot(), are implemented for inp objects. read_rpt() creates a named list of class rpt containing summary sections from the report file of SWMM (e.g. subcatchment_runoff_summary). While both of the aforementioned functions maintain the original SWMM file structure, read_lid_rpt() interprets text files from specific LID elements. Either a single tibble or index-based time series data as an xts object is returned. The latter option is provided because xts objects, which are introduced with the xts package and build upon R's built-in matrix data type, efficiently represent time series data and offer index-focused data subsetting methods.
Reading simulation data from the binary .out file is supported by read_out(). Because .out files can become very large, the function design aims for fast data processing and embeds modern C++ code through Rcpp (Eddelbuettel and Francois 2011). Output data per system element and model variable is always represented as an xts object and conveniently stored in a list environment.
The function write_inp() writes an inp object to disk, which addresses cases where an inp object has been modified within R and changes need to be saved back to disk (e.g. model parameter calibration). Thus, it takes an existing inp object and creates a model file on disk which can be read and run by the original SWMM executable. However, a SWMM simulation run can also be initiated from the R console with run_swmm(). It requires the path to an .inp file to be specified and calls the SWMM executable. The function conveniently returns a 3-element list containing paths to the .inp, .rpt and .out file.
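The modify-and-save workflow described above can be sketched as follows, using the Example1 model that is attached to the package. This is an illustration only: the column name Perc_Imperv and the scaling factor 0.9 are assumptions and should be checked against names(inp$subcatchments) in the installed version.

```r
library(swmmr)

# Path to the example model attached to the package
inp_path <- system.file("extdata", "Example1.inp",
                        package = "swmmr", mustWork = TRUE)

# Parse the input file into an object of class inp
inp <- read_inp(inp_path)

# Assumption: the subcatchments section has a column "Perc_Imperv"
stopifnot("Perc_Imperv" %in% names(inp$subcatchments))

# Hypothetical modification: reduce imperviousness by 10 %
inp$subcatchments$Perc_Imperv <-
  as.numeric(inp$subcatchments$Perc_Imperv) * 0.9

# Write the modified model to a new file that SWMM can read
new_inp_path <- tempfile(fileext = ".inp")
write_inp(inp, file = new_inp_path)
```

The file at new_inp_path can then be passed to run_swmm() or opened in the SWMM GUI.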
Moreover, converting SWMM input sections with spatial reference to sf objects is supported with *_to_sf() functions. Based on the conversion of SWMM input sections to sf objects, an inp object can be converted to the popular .shp format with inp_to_files(). Additionally, .txt files containing simulation settings, storage and pumping curves are returned as well as files containing SWMM time series data. As a counterpart the function shp_to_inp() converts spatial data given in .shp files into an object of class inp. Information on simulation settings, rainfall time series etc. can be given in .txt files to complete the model data. While the conversion to sf objects already enables common spatial analysis of SWMM model data in R, this also allows using the plotting interface of ggplot2 through geom_sf(). Alternatively, it is also attached to the package (cf. Listing 1). In addition, the reader is referred to three package vignettes which cover topics beyond the scope of this technical note. For example, instructions on how to auto-calibrate a SWMM model with swmmr or how to convert GIS and SWMM model data with swmmr are given.
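As an illustration of the conversion to sf objects and plotting with geom_sf(), the following sketch draws the structure of the Example1 model attached to the package. It is not one of the original listings; the colours and object names are arbitrary choices.

```r
library(swmmr)
library(ggplot2)  # geom_sf() requires ggplot2 >= 3.0.0

inp_path <- system.file("extdata", "Example1.inp",
                        package = "swmmr", mustWork = TRUE)
inp <- read_inp(inp_path)

# Convert input sections with spatial reference to sf objects
sub_sf <- subcatchments_to_sf(inp)
lin_sf <- links_to_sf(inp)
jun_sf <- junctions_to_sf(inp)

# Layer the geometries: subcatchments, links, junctions
p <- ggplot() +
  geom_sf(data = sub_sf, fill = "grey90") +
  geom_sf(data = lin_sf, colour = "blue") +
  geom_sf(data = jun_sf, colour = "black") +
  theme_bw()

print(p)
```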

Setup and model execution
To install swmmr from CRAN and to add its namespace to R's search list, the following commands need to be executed from the R command line (Listing 1). In this example, the model file attached to the package is used and its path is assigned to the variable inp_path. Subsequently, run_swmm() initiates a model run.
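The original listing is not reproduced here. A minimal sketch of the steps it describes might look as follows, assuming that a SWMM executable is installed where run_swmm() can find it (otherwise its path can be supplied via the exec argument). Writing the report and output files to temporary locations is a choice made for this sketch.

```r
# One-off installation from CRAN (uncomment on first use)
# install.packages("swmmr")

library(swmmr)

# Path to the example model attached to the package
inp_path <- system.file("extdata", "Example1.inp",
                        package = "swmmr", mustWork = TRUE)

# Run SWMM; report and binary output go to temporary files
swmm_files <- run_swmm(
  inp = inp_path,
  rpt = tempfile(fileext = ".rpt"),
  out = tempfile(fileext = ".out")
)

# Named list with the paths to the .inp, .rpt and .out files
str(swmm_files)
```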

Analysis of model data
SWMM's model files (.inp, .rpt and .out) can be accessed from the named list variable swmm_files. Since the results of both the read_inp() and read_rpt() functions comprise a list of named tibbles (Listings 2 and 3), elements can be accessed via R's common extracting mechanism.
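Reading the input and report files and extracting single sections can be sketched as below. The sketch assumes an installed SWMM executable and repeats the model run so that it is self-contained; the section names used for extraction are the examples mentioned in the text.

```r
library(swmmr)

inp_path <- system.file("extdata", "Example1.inp",
                        package = "swmmr", mustWork = TRUE)
swmm_files <- run_swmm(
  inp = inp_path,
  rpt = tempfile(fileext = ".rpt"),
  out = tempfile(fileext = ".out")
)

# Input file: object of class inp, a named list of tibbles
inp <- read_inp(swmm_files$inp)
summary(inp)
names(inp)          # available input sections
inp$subcatchments   # extract one section with `$`

# Report file: object of class rpt, a named list of tibbles
rpt <- read_rpt(swmm_files$rpt)
names(rpt)          # available summary sections
rpt[["subcatchment_runoff_summary"]]  # extract with `[[`
```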
Time index-based model results from an .out file are imported as given in Listing 4. Here, model variables total rainfall (in/hr or mm/hr, vIndex = 1) and total runoff (in flow units, vIndex = 4) from the system (iType = 3) are read. A general dictionary covering the mapping between variable and index number is included in the package documentation.
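A sketch of such an import is given below, assuming an installed SWMM executable. The index values are those stated above; how the returned list is nested is inspected with str() rather than assumed.

```r
library(swmmr)

inp_path <- system.file("extdata", "Example1.inp",
                        package = "swmmr", mustWork = TRUE)
swmm_files <- run_swmm(
  inp = inp_path,
  rpt = tempfile(fileext = ".rpt"),
  out = tempfile(fileext = ".out")
)

# System-wide results (iType = 3):
# total rainfall (vIndex = 1) and total runoff (vIndex = 4)
out <- read_out(
  file   = swmm_files$out,
  iType  = 3,
  vIndex = c(1, 4)
)

# Inspect the nested list of xts objects
str(out, max.level = 2)
```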

Convert between GIS and SWMM model data
inp_to_files() utilises the conversion functions *_to_sf() for all SWMM sections containing spatial data (Table 1). Sections without spatial information are returned and saved separately. Thus, sub-folders containing .shp, .txt and .dat files are created in a specified directory (Listing 5). Information on supported SWMM sections for both inp_to_files() and shp_to_inp() is given in the package manual.
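The export can be sketched as follows for the Example1 model attached to the package. The model name and the temporary output directory are choices made for this illustration.

```r
library(swmmr)

inp_path <- system.file("extdata", "Example1.inp",
                        package = "swmmr", mustWork = TRUE)
inp <- read_inp(inp_path)

# Writable target directory (a temporary one for this sketch)
out_dir <- file.path(tempdir(), "swmm_export")
dir.create(out_dir, showWarnings = FALSE)

# Spatial sections are written as .shp files,
# the remaining sections as .txt and .dat files
inp_to_files(x = inp, name = "Example1", path_out = out_dir)

# List everything that was created
list.files(out_dir, recursive = TRUE)
```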

Visualisation with ggplot2 and mapview
Modelling involves visualisation of spatial and temporal data. With base (R Core Team 2017), lattice (Sarkar 2008) and ggplot2 (Wickham 2016), R currently offers three different plotting systems. Because of ggplot2's flexibility and declarative way of constructing graphics, a demonstration of how to create expressive and customisable figures of model data is given in Listings 7 and 8. Listing 7 aims to visualise rainfall and simulated runoff data. Temporal data is read from an .out file, initially merged into a single xts object with two columns ('total_rainfall' and 'total_runoff') and converted to a tibble which can be processed by ggplot2. The two variables are plotted as different geometric objects (geom_col(), geom_line()) and separated into facets. The result is shown in Figure 1.
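The plotting idea can be sketched as follows; this is not the original Listing 7. It assumes an installed SWMM executable and that the two requested variables are returned in the order of vIndex. Instead of merging the xts objects, the sketch builds a long data frame with base R, which is a simplification.

```r
library(swmmr)
library(ggplot2)

inp_path <- system.file("extdata", "Example1.inp",
                        package = "swmmr", mustWork = TRUE)
swmm_files <- run_swmm(
  inp = inp_path,
  rpt = tempfile(fileext = ".rpt"),
  out = tempfile(fileext = ".out")
)
out <- read_out(swmm_files$out, iType = 3, vIndex = c(1, 4))

# List of xts objects, one per requested variable
vars <- out[[1]]
rain_name   <- names(vars)[1]  # assumed: total rainfall
runoff_name <- names(vars)[2]  # assumed: total runoff

# Long data frame: one row per time step and variable
df <- do.call(rbind, lapply(names(vars), function(nm) {
  data.frame(
    time     = zoo::index(vars[[nm]]),
    variable = nm,
    value    = as.numeric(zoo::coredata(vars[[nm]]))
  )
}))

# Rainfall as columns, runoff as a line, one facet each
p <- ggplot(df, aes(x = time, y = value)) +
  geom_col(data = df[df$variable == rain_name, ]) +
  geom_line(data = df[df$variable == runoff_name, ]) +
  facet_grid(variable ~ ., scales = "free_y") +
  theme_bw()

print(p)
```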
Since sf objects are supported by the mapview package, a SWMM model structure converted to simple feature geometries can also be interactively visualised. Figure 3 shows a screenshot of a browser-based visualisation of the 'Example1' model, obtained by executing Listing 9.
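A minimal sketch of such an interactive view, limited to the subcatchments of the Example1 model attached to the package, is given below. It is not the original Listing 9.

```r
library(swmmr)
library(mapview)

inp_path <- system.file("extdata", "Example1.inp",
                        package = "swmmr", mustWork = TRUE)
inp <- read_inp(inp_path)

# Convert the subcatchments to simple feature geometries
sub_sf <- subcatchments_to_sf(inp)

# Build the interactive map; printing it in an interactive
# session opens the viewer or browser
m <- mapview(sub_sf)
m
```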

Model calibration using DEoptim
Calibration of model parameters is an essential part within the modelling chain to improve the model quality. During calibration, model parameter values are systematically modified to optimise an objective function, which numerically expresses the difference between observed and simulated data.
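As an illustration, one possible objective function is the root mean square error (RMSE) between observed and simulated values. The choice of RMSE and the function name rmse() are assumptions made for this sketch; the text does not prescribe a particular measure.

```r
# Root mean square error between observed and simulated series;
# smaller values indicate a better fit
rmse <- function(obs, sim) {
  stopifnot(length(obs) == length(sim))
  sqrt(mean((sim - obs)^2))
}

# Hypothetical values, for demonstration only
obs <- c(1, 2, 3)
sim <- c(1, 2, 5)
rmse(obs, sim)
```

An optimiser would repeatedly modify the model parameters, rerun the simulation and minimise the value returned by such a function.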
Because swmmr provides the functions write_inp() to save an inp object to disk and run_swmm() to potentially run the written model file afterwards, it especially facilitates autocalibration of model parameters. swmmr, however, does not depend on particular optimisation packages. The package vignette 'How to autocalibrate a SWMM model with swmmr' exemplifies the application of the DEoptim package (Ardia et al. 2016) for single objective optimisation.
Figure 3. Interactive visualisation of SWMM Example 1 model structure using the mapview package.

Conclusions
A brief introduction of the R package swmmr is given. swmmr interfaces the stormwater management model SWMM with R and bridges the gap between modelling and advanced model analytics. It offers functions to represent SWMM models in R which subsequently can be modified or visualised with modern technologies. Simulation results are efficiently read with help of Rcpp to streamline further time series analysis. This facilitates efficient model calibration and parameter uncertainty analysis. The package is freely available and is especially open to both the SWMM and R community. The authors would like to promote the open source project and welcome any contribution to the package through the project page on GitHub.

Notes
1. https://www.epa.gov/water-research/storm-water-management-model-swmm
2. https://www.epa.gov/water-research/epanet
3. https://cran.R-project.org/view=Hydrology
4. 'base R' refers to a set of default packages which R is actually based upon without any additional packages loaded.
5. Note that ggplot2 ≥ 3.0.0 is required.