How accurate is the regional travel demand model in mimicking real-world travel times?

ABSTRACT This paper provides insights pertaining to the validity of a regional travel demand model in mimicking real-world travel times. The estimated travel times from the regional travel demand model, for the base year 2015, for Mecklenburg County in the city of Charlotte, North Carolina (NC) were compared with travel time statistics from a private data source, for the same year. The results indicate that the estimated travel times from the regional travel demand model are typically lower than the 85th percentile travel times, irrespective of the link speed limit. The estimated travel times for the Central Business District (CBD) area type are moderately correlated with the travel time statistics from the private data source, irrespective of the time of the day. For all the other area types, stronger correlations were observed when the estimated travel times from the regional travel demand model are compared with 10th to 50th percentile travel times. The calculated Pearson correlation coefficients are low for morning and evening peak periods compared to mid-day and night-time period, indicating the inability of the regional travel demand model in mimicking congested traffic conditions accurately.


Introduction
Travel demand forecasting is a process that helps estimate future trends in travel patterns and travel times from current trends in travel patterns and travel times. Transportation planners and system managers work to estimate future travel demand as well as to effectively manage the current travel demand for a transportation system. Metropolitan transportation plans and long-range transportation plans are developed for a region using outputs from the travel demand forecasting process. These plans contribute to the efficiency of the transportation system, improve mobility, reduce congestion, and have an indirect effect on other aspects such as air pollution. The traditional regional travel demand forecasting process of a metropolitan area is a four-step process. The region is typically divided into various traffic analysis zones (TAZs) based on the socio-economic and demographic data. These TAZs should exhibit homogeneous trip-making patterns (Meyer & Miller, 2000). After a rigorous feedback process involving calibration of the trip generation, trip distribution, mode split, and trip assignment steps, the travel time, link volume, and volume-to-capacity ratio (V/C) are estimated for each link, typically on major roads, in the network. The regional travel demand model predicts the link flow and travel time based on the routes and mode opted by the traveler from an origin to the destination (Florian & Hearn, 1995). The algorithm used for trip assignment is of no use if the resulting flow does not match with the observed flow (Tobin, 1979).
Trips are traditionally assigned based on Wardrop's first principle for equilibrium flows (Wardrop, 1952). Many algorithms were developed to satisfy the equilibrium flow condition. Nguyen (1974), LeBlanc (1973), Dafermos (1971), and Bruynooghe and Gilbert (1968) outline a few example algorithms that work with a fixed origin-destination matrix for a road network. Slavin et al. (2006) compared different approaches to compute outputs from the user equilibrium trip assignment method. Their results showed that the origin-based approach converges better than the other approaches considered. Dial (1971) developed a path-based user equilibrium algorithm that shifted the flow from the costliest path to the cheapest path.
Similarly, Bell & Cassir (2002) developed a user equilibrium traffic assignment method based on game theory. For the equilibrium condition, the probability of selection of each path is the same, i.e., the cost is the same. The model distributed the flow based on minimum trip cost during the congested and uncongested periods. Lo & Chen (2000) used the Facchinei & Soares (1997) gap function for addressing the traffic equilibrium problem. They developed a model 1) to perform traffic assignment based on a general route cost structure, and 2) to ensure that every point of minimization corresponds to an overall minimum. The vast and abundant literature on the travel demand forecasting process and resulting outputs is an indicator of its significance in decision-making and in developing transportation plans.
Travelers, on the other hand, plan their trip based on trip purpose, mode, travel time, travel cost, and route. The flow pattern in an urban network depends on two mechanisms. It depends on the route selected by a traveler to minimize the travel time from an origin to the destination. It could also depend on congestion, which can be explained as the use of the transportation system. A good regional travel demand model should predict the traveler's choices with accuracy (Carrion & Levinson, 2012).
The travel time helps travelers make decisions regarding route choice, departure time choice, and mode choice (if options are available). It varies by the time of the day, the day of the week, and season of the year. It also depends on factors that influence demand for travel (for example, an increase in gas price could decrease the demand for travel and therefore reduce travel time). The advancements in technology, growing awareness to make the world more sustainable, congestion pricing, and incentives to use alternate modes of transportation have a catalytic effect on the trip-making patterns, and hence, travel demand. Therefore, the accuracy of the travel demand forecasting model in mimicking real-world conditions is often debated. The model could overestimate or underestimate outputs such as travel demand and travel time, if not calibrated considering accurate and appropriate data. At times, decisions made in the past may not be applicable for future purposes.
In spite of their limitations, transportation planners and system managers rely heavily on travel demand forecast model outputs for developing metropolitan transportation plans, long-range transportation plans, and for evaluating large-scale transportation project alternatives. Researching the relationship between outputs from the regional travel demand model and real-world data could help these transportation planners and system managers make informed decisions, or proactively anticipate and work to address any limitations from such forecasts. Therefore, the focus of this paper is to evaluate the relationship between outputs from a regional travel demand model and travel time statistics based on real-world data by the time of the day.
Continuous travel time data, for all links along major roads, is needed to assess travel time statistics and compare with outputs from the regional travel demand model. Global Positioning Systems (GPS) installed in vehicles, commercial devices, and smartphones with apps are currently used to capture travel time information, in near real-time (Haghani et al., 2009). Many studies were conducted to compare the reliability and accuracy of such probe-sourced data for collecting travel times. The Federal Highway Administration (FHWA) conducted a study to compare the private and public source data as well as services (Turner et al., 2011). Kim & Coifman (2014) and Lindveld et al. (2009) compared the speed feed against the concurrent loop detector data. Kim & Coifman (2014) observed that INRIX speed data lags the loop detector speed data. Their results showed that a shortage in the INRIX (confidence score = 30) data affects the accuracy, especially during off-peak hours on arterial streets. Haghani et al. (2009) and Pulugurtha et al. (2015a, 2015b) compared INRIX travel time data with the data captured from Bluetooth detectors and other sources. The results from past research indicate that travel times are comparable for freeways, while caution must be exercised when using data for arterial streets. The regional travel demand model network typically includes major roads in the region. These include functional classes such as interstates, expressways, major and minor arterial streets, and some collector roads. The design standards, area type, and speed limits along these road functional classes vary. The characteristics (lane width, shoulder width, number of lanes, grade, traffic volume and composition, ramps, etc.) that influence travel times on interstates and expressways differ from the characteristics that influence travel time on major and minor arterials (lane width, grade, traffic volume and composition, parking, bus-stops, intersection controls, driveways, signal phasing/timing, etc.). Therefore, investigating the relationship between outputs from the regional travel demand model and travel time statistics by speed limit (an indicator of road functional class) will provide valuable insights.
Land use characteristics influence traffic patterns. Trips generated from a TAZ are influenced by the surrounding land use characteristics. Matt et al. (2005) explored the activity-based and utility-based theories to study the relationship between travel behavior and land use. Further, Bagley & Mokhtarian (2002) investigated the effect of a residential neighborhood on the travel pattern in the San Francisco Bay area, California. They reported that neighborhood type has little influence when variables such as socio-demographics and lifestyle are considered. Similar results were reported by Crane & Crepeau (1998). The land use characteristics vary by area type. Therefore, investigating the relationship between outputs from the regional travel demand model and travel time statistics by area type will also provide valuable insights.

Methodology
The regional travel demand model data was obtained for the Metrolina region from the city of Charlotte Department of Transportation (CDoT). The regional network consists of ten counties, located in North Carolina and South Carolina. However, for this research, only Mecklenburg County, North Carolina was considered for analysis (due to access to travel time data). The base regional travel demand model was developed by CDoT using demographic, socio-economic, land use, and on-network characteristics for the year 2015. It is calibrated by comparing with traffic volumes and travel times for selected links in the Metrolina region. The estimated travel times from the four-step process, for the year 2015, were obtained for each link within the region. CDoT adopted the user equilibrium approach as the trip assignment method.
The travel times are estimated using the four-step process, for the year 2015, for four different time periods. They are morning peak (6:30 AM to 9:30 AM), mid-day (9:30 AM to 3:30 PM), evening peak (3:30 PM to 6:30 PM), and night-time (6:30 PM to 6:30 AM) periods. The estimated travel times considered from the regional travel demand model for this research are the travel times for each time period.
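The four time periods can be expressed as a simple lookup on the clock time of a sample. The following is a minimal sketch, not the procedure used in this research; the function name, the period labels, and the treatment of boundary minutes (start-inclusive, end-exclusive) are assumptions made for illustration.

```python
from datetime import time

# Period boundaries as stated in the text. Names and labels are illustrative.
MORNING_START = time(6, 30)
MIDDAY_START = time(9, 30)
EVENING_START = time(15, 30)
NIGHT_START = time(18, 30)


def time_period(t: time) -> str:
    """Map a clock time to one of the four model time periods.

    Each period is treated as start-inclusive and end-exclusive. The paper
    does not state how boundary minutes are handled, so this is an assumption.
    """
    if MORNING_START <= t < MIDDAY_START:
        return "morning_peak"
    if MIDDAY_START <= t < EVENING_START:
        return "mid_day"
    if EVENING_START <= t < NIGHT_START:
        return "evening_peak"
    # 6:30 PM to 6:30 AM, wrapping past midnight
    return "night_time"
```

Because the night-time period wraps past midnight, it is handled as the fall-through case rather than as an explicit range.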
The real-world raw travel time data, at one-minute intervals, was downloaded from the Regional Integrated Transportation Information System (RITIS) website and imported into Microsoft Structured Query Language (SQL) server for the year 2015. The raw travel time database includes the time of the day, the day of the week, link-based speed, sample size, link length, travel time, and score (30 indicates real-time data; 20 indicates real-time data across multiple segments; 10 indicates historical data). Only raw travel time samples with a score of 30 were used in this research. Data for Saturdays, Sundays, and federal holidays were excluded from this research.
For every Traffic Message Channel (TMC) or link, travel time statistics were then computed by the time of the day and day of the week. The raw travel time data obtained from the private data source is acquired from sources, such as traffic sensors, probe vehicles, and the smart dust network (Regional Integrated Transportation Information System (RITIS), 2018). Since the regional travel demand model estimates travel time for the four selected time periods during a weekday, the raw travel time data was processed to compute travel time statistics for the same time periods. As an example, travel time data for each link with a score of 30 during a weekday in the year 2015, from 6:30 AM to 9:30 AM, was used to compute travel time statistics for the morning peak period.
The travel time statistics considered for analysis in this research include minimum travel time (MinTT), average travel time (AvgTT), percentiles of travel times indicated as 'TT' followed by the number representing the percentile (TT05, TT10, TT15, TT25, TT50, TT75, TT85, TT90 and TT95), and maximum travel time (MaxTT). The estimated travel times from the regional travel demand model were integrated with the computed travel times from the private data source, using the TMC code, which is a common field in both the databases.
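As an illustration of how these statistics could be derived for one link and one time period, the sketch below filters samples to a score of 30 and computes the minimum, average, maximum, and percentile travel times. It is an assumption-laden example, not the SQL processing used in this research: the function names and the input layout are hypothetical, the percentile rule (linear interpolation between closest ranks) is one of several common definitions, and TT15 is included because the results refer to it.

```python
from statistics import mean


def percentile(sorted_values, p):
    """Percentile (0 <= p <= 100) by linear interpolation between closest ranks.

    The paper does not state which percentile definition was used; this
    interpolation rule is an assumption.
    """
    if not sorted_values:
        raise ValueError("no samples")
    rank = (len(sorted_values) - 1) * p / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = rank - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


def travel_time_statistics(samples):
    """Compute travel time statistics for one link and one time period.

    samples: iterable of (travel_time, score) pairs (hypothetical layout).
    Only samples with a score of 30 are used, as described in the text.
    """
    usable = sorted(tt for tt, score in samples if score == 30)
    if not usable:
        raise ValueError("no samples with a score of 30")
    stats = {"MinTT": usable[0], "AvgTT": mean(usable), "MaxTT": usable[-1]}
    for p in (5, 10, 15, 25, 50, 75, 85, 90, 95):
        stats[f"TT{p:02d}"] = percentile(usable, p)
    return stats
```

For example, `travel_time_statistics([(1.0, 30), (2.0, 30), (3.0, 30), (9.0, 10)])` ignores the historical sample with a score of 10 and reports statistics for the three real-time samples only.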
The length of a link (TMC) in the regional travel demand model and the private data source differ for some links. The difference between the link lengths was calculated. The links with a difference in link length greater than 0.1 miles were excluded from the analysis. Additionally, links with the speed limit equal to or less than 30 mph were also excluded from the analysis. Overall, 920 links with estimated travel times from the regional travel demand model and data to compute travel time statistics were considered in this research. Figure 1 shows the distribution of the links considered across the study area. The light-colored thin lines in the background indicate the street network of Mecklenburg County, NC.
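The link selection described above amounts to a join on the TMC code followed by two exclusion rules. The sketch below shows one way to express it; the dictionary layout, field names, and function name are hypothetical and are not taken from the model or the private data source.

```python
def select_links(model_links, probe_links, max_length_diff=0.1, min_speed_limit=30):
    """Join model and probe records on TMC code and apply the two exclusions.

    model_links: dict of TMC code -> {"length_mi", "speed_limit_mph", "model_tt"}
    probe_links: dict of TMC code -> {"length_mi", "stats"}
    Both layouts are assumptions made for this example.
    """
    selected = {}
    for tmc, model in model_links.items():
        probe = probe_links.get(tmc)
        if probe is None:
            continue  # no travel time data for this link
        if abs(model["length_mi"] - probe["length_mi"]) > max_length_diff:
            continue  # link lengths differ by more than 0.1 miles
        if model["speed_limit_mph"] <= min_speed_limit:
            continue  # speed limit equal to or less than 30 mph
        selected[tmc] = {**model, "stats": probe["stats"]}
    return selected
```

Keeping both thresholds as parameters makes it straightforward to check how sensitive the number of retained links is to the 0.1-mile and 30 mph cut-offs.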
The speed limit of the links was considered in order to classify and analyze by the road functional class. Figure 2 shows the speed limit distribution across the considered links. The majority of links with a higher speed limit (greater than 50 mph) are present at the outer edges of the network, whereas the links with moderate (>40 & ≤50 mph) and lower speed limits (>30 & ≤40 mph) are concentrated in the core/inner part of the network.
Similarly, the area type was also considered to classify the links into different groups. The study area consisted of four area types; the Central Business District (CBD), fringe, urban, and suburban areas. In the regional travel demand model for the Metrolina region, the area types are calculated based on the residential and employment density within a 1.5-mile radius of the corresponding TAZ. The TAZ with population density (population/square mile) greater than or equal to 375 and an employment density (employment/square mile) below 2,600 is categorized as urban and suburban areas. The area type fringe is categorized as the TAZ with an employment density greater than 2,600 while CBD is categorized as the TAZ with an employment density greater than 10,500. Figure 3 represents the area type distribution of the considered links. The links in CBD and fringe area types are concentrated in the inner part of the network whereas the links in suburban, urban and rural area types are present at the outer parts and edges of the network.
Scatter plots were generated to observe trends in the relationships. The Kolmogorov-Smirnov (K-S) test was performed to check for normality. If the test is not significant, then the data is said to be normally distributed. The significance value obtained for the sample size of 920 from the K-S test is greater than 0.05, indicating that the travel times are normally distributed for the selected time periods. Therefore, Pearson correlation coefficients were calculated to examine the relationships.
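The normality check can be sketched as follows. This is a simplified illustration rather than the test as run in this research: it computes the one-sample K-S statistic D against a normal distribution whose mean and standard deviation are estimated from the sample, and compares it with the large-sample 5 percent critical value 1.358/sqrt(n). That critical value applies to a fully specified reference distribution; when the parameters are estimated from the same data it is conservative, and a Lilliefors-type correction is generally appropriate. The function names are hypothetical.

```python
import math


def ks_statistic_normal(values):
    """One-sample K-S statistic D against a normal distribution.

    The mean and standard deviation are estimated from the sample itself,
    which is an assumption of this sketch.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mu = sum(values) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / (n - 1))
    if sigma == 0:
        raise ValueError("values are constant")
    d = 0.0
    for i, v in enumerate(sorted(values), start=1):
        # Normal cumulative distribution function evaluated at v
        cdf = 0.5 * (1.0 + math.erf((v - mu) / (sigma * math.sqrt(2.0))))
        # Largest gap between the empirical and the normal distribution
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    return d


def ks_critical_value(n, coefficient=1.358):
    """Large-sample critical value of D at the 5 percent level."""
    return coefficient / math.sqrt(n)
```

Under this sketch, the hypothesis of normality would not be rejected at the 5 percent level when `ks_statistic_normal(values)` is below `ks_critical_value(len(values))`.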
Pearson correlation coefficients were calculated between the estimated travel times from the regional travel demand model and the computed travel time statistics, for each time period, to check the degree of correlation between the two values. The calculated Pearson correlation coefficient is between −1.0 and +1.0. The Pearson correlation coefficient is considered to be very strong if the calculated value is closer to −1.0 or +1.0.
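For reference, the Pearson correlation coefficient between the estimated travel times and one travel time statistic can be computed directly from its definition. The sketch below is illustrative only and is not the software procedure used in this research; the function name and argument names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences.

    x: estimated travel times from the model, one value per link
    y: one travel time statistic (for example TT50), one value per link
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)
```

Repeating the call once per travel time statistic and time period yields the set of coefficients that is compared across statistics.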
Three cases were considered for the analysis. In the first case, the Pearson correlation coefficients were calculated using data for all the links. In the second case, the Pearson correlation coefficients were calculated by categorizing data by the speed limit. They were also calculated by categorizing data by the area type in the third case. Due to the sample size constraints, the suburban and rural area types were considered as one category.
Plots showing trends in Pearson correlation coefficients, with the Pearson correlation coefficient on the y-axis and the travel time statistics on the x-axis, were generated. These plots were used to interpret the significance of the relationships and trends between the estimated travel times from the regional travel demand model and the computed travel time statistics from the private data source, by the time of the day, speed limit, and area type.

Results and discussion
Descriptive statistics such as the mean and standard deviation were first computed for the estimated travel times from the regional travel demand model and the travel time statistics, by the time of the day. Table 1 summarizes the means and standard deviations, by the time of the day, considering data for all the 920 links.
The mean of the estimated travel times from the regional travel demand model (TT) is between the mean of the 75th percentile travel time (TT75) and the 85th percentile travel time (TT85) for the morning peak, mid-day, and evening peak periods. However, it is between the mean of the 15th percentile travel time (TT15) and the 25th percentile travel time (TT25) for the night-time period. The standard deviation of the estimated travel times from the regional travel demand model (TT) is greater than the standard deviation of the travel time statistics (in most cases, except for the maximum travel time, MaxTT) for the morning peak, mid-day, and evening peak periods.
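The comparison above locates the mean of the model estimates between the means of two adjacent travel time statistics. A minimal sketch of that lookup is given below; the function name is hypothetical, and any values passed to it in an example are placeholders rather than values from Table 1.

```python
def bracketing_statistics(model_mean, statistic_means):
    """Return the adjacent pair of statistics whose means bracket model_mean.

    statistic_means: dict of statistic name -> mean travel time, for example
    the percentile columns of one time period (hypothetical layout).
    Returns None when model_mean lies outside the range of the given means.
    """
    ordered = sorted(statistic_means.items(), key=lambda item: item[1])
    for (low_name, low), (high_name, high) in zip(ordered, ordered[1:]):
        if low <= model_mean <= high:
            return low_name, high_name
    return None
```

With placeholder means of 1.0, 1.2, and 1.4 for TT50, TT75, and TT85, a model mean of 1.3 would be reported as lying between TT75 and TT85.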
To compare and assess the variations by the speed limit, the data were categorized into three groups: links with speed limit >50 mph, links with speed limit >40 & ≤50 mph, and links with speed limit >30 & ≤40 mph. Descriptive statistics such as the mean and standard deviation were then computed for the estimated travel times from the regional travel demand model and the travel time statistics by the time of the day and speed limit. Table 2 summarizes the means and standard deviations by the time of the day and speed limit.
When links with speed limit >50 mph are considered, the mean of the estimated travel times from the regional travel demand model (TT) is between the mean of the 50th percentile travel time (TT50) and the 75th percentile travel time (TT75) for the morning peak, mid-day, and evening peak periods. However, it is equal to the mean of the 25th percentile travel time (TT25) for the night-time period.
When links with speed limit >40 & ≤50 mph are considered, the mean of the estimated travel times from the regional travel demand model (TT) is between the mean of the 85th percentile travel time (TT85) and the 90th percentile travel time (TT90) for the morning peak and evening peak periods. It is between the mean of the 75th percentile travel time (TT75) and the 85th percentile travel time (TT85) for the mid-day period, and between the mean of the 10th percentile travel time (TT10) and the 15th percentile travel time (TT15) for the night-time period.
When links with speed limit >30 & ≤40 mph are considered, the mean of the estimated travel times from the regional travel demand model (TT) is between the mean of the 50th percentile travel time (TT50) and the 75th percentile travel time (TT75) for the morning peak, mid-day, and evening peak periods. However, it is between the mean of the 25th percentile travel time (TT25) and the 50th percentile travel time (TT50) for the night-time period.
The standard deviations of the estimated travel times from the regional travel demand model (TT) and the travel time statistics in Table 2 are lower than those obtained when data for all the links are considered (Table 1).
Note: TT is the estimated travel time from the regional travel demand model. All other statistics are computed from the private data source.
The variations based on the area type were also assessed. The links were categorized by area type into four classes: CBD, fringe, urban, and suburban and rural. The descriptive statistics for the links by the area type categories were also computed. Table 3 summarizes the means and standard deviations for the considered area types, by the time of the day. For the CBD area type, the mean of the estimated travel time is between the 25th percentile travel time (TT25) and the 50th percentile travel time (TT50) for the morning peak, mid-day, and evening peak periods. However, the mean of the estimated travel time is between the 15th percentile travel time (TT15) and the 25th percentile travel time (TT25) for the night-time period.
In the case of the fringe area type, the mean of the estimated travel time is between the 75th percentile travel time (TT75) and the 85th percentile travel time (TT85) for the morning and evening peak periods. However, the mean of the estimated travel time is between the 50th percentile travel time (TT50) and the 75th percentile travel time (TT75) for the mid-day period, and between the 25th percentile travel time (TT25) and the 50th percentile travel time (TT50) for the night-time period.
For the urban area type, the mean of the estimated travel time is between the 75th percentile travel time (TT75) and the 85th percentile travel time (TT85) for the morning peak period. However, the mean of the estimated travel time is between the 85th percentile travel time (TT85) and the 90th percentile travel time (TT90) for the mid-day and evening peak periods. The mean of the estimated travel time for the night-time period is between the 15th percentile travel time (TT15) and the 25th percentile travel time (TT25).
For the suburban and rural area type, the mean of the estimated travel times is between the 85th percentile travel time (TT85) and the 90th percentile travel time (TT90) for the morning peak period. However, it is between the 90th percentile travel time (TT90) and the 95th percentile travel time (TT95) for the evening peak period. Likewise, the mean of the estimated travel time lies between the 75th percentile travel time (TT75) and the 85th percentile travel time (TT85) for the mid-day period. It is equal to the mean of the computed 25th percentile travel time (TT25) for the night-time period.
Scatter plots were generated for all the speed limit categories and all the times of the day to examine general trends and the distribution of samples, with the estimated travel times from the regional travel demand model on the y-axis and the travel time statistic on the x-axis. Figures 4-6 show the distribution of samples by the speed limit category, with the estimated travel times from the regional travel demand model on the y-axis and the 50th percentile travel time (TT50) from the private data source on the x-axis. A best-fit line was added to each scatter plot to indicate the general data trend. Figure 4(a-d) presents the scatter plots for the links with the speed limit >50 mph. The estimated travel times from the regional travel demand model are greater than the 50th percentile travel time (TT50) from the private data source for a majority of the links during the morning peak, mid-day, and evening peak periods. However, they are relatively closer to each other for most of the links during the night-time period. Figure 5(a-d) presents the scatter plots for the links with the speed limit >40 & ≤50 mph. The estimated travel times from the regional travel demand model are greater than the 50th percentile travel time (TT50) from the private data source for a majority of the links during the morning peak, mid-day, and night-time periods. However, the estimated travel times from the regional travel demand model are lower than the 50th percentile travel time (TT50) from the private data source for a majority of the links during the evening peak period. Figure 6(a-d) presents the scatter plots for the links with the speed limit >30 & ≤40 mph. The estimated travel times from the regional travel demand model are greater than the 50th percentile travel time (TT50) from the private data source for a majority of the links irrespective of the time of the day.

Figures 4-6, panels: (a) Morning Peak, (b) Mid-day, (c) Evening Peak, (d) Night-time.
The generated scatter plots with best-fit lines indicate that the relationships between the estimated travel times from the regional travel demand model and travel time statistics from the private data source are linear in nature. While the general trend is linear, the number of links for which the estimated travel times from the regional travel demand model are greater than, equal to, or less than the travel time statistics varies by the posted speed limit and time of the day.
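A best-fit line of the kind added to the scatter plots can be obtained by ordinary least squares. The paper does not state how the lines were fitted, so the sketch below is an assumption; the function name and argument names are hypothetical.

```python
def best_fit_line(x, y):
    """Ordinary least-squares line y = slope * x + intercept.

    x: travel time statistic from the private data source (x-axis)
    y: estimated travel time from the model (y-axis)
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("x values are constant")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

A slope near one with an intercept near zero would indicate that the model estimates track the chosen statistic closely, while links above the line y = x are those for which the model estimate exceeds the statistic.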
Statistical Package for the Social Sciences (SPSS) (SPSS Inc., 2008) was used to calculate the Pearson correlation coefficients and examine the strength of the relationships. The analysis was first performed using data for all the 920 links. Figure 7 shows the variations in calculated Pearson correlation coefficients between the estimated travel times from the regional travel demand model and the travel time statistics by the time of the day. The Pearson correlation coefficients are the highest for the night-time period, followed by the mid-day period, irrespective of the travel time statistic (except in the case of maximum travel time, MaxTT). They are the lowest for the evening peak period (except in the case of maximum travel time, MaxTT). While the calculated Pearson correlation coefficients are marginally lower for the minimum travel time (MinTT) and 5th percentile travel time (TT05), they are very close to each other for 10th to 75th percentile travel times (TT10 to TT75). Overall, Figure 7 indicates that the calculated Pearson correlation coefficients vary with the travel time statistics considered for comparison or validation. Further, the calculated Pearson correlation coefficients indicate stronger correlations with 5th to 95th percentile travel time statistics. They are the lowest for the maximum travel time (MaxTT).
The calculation of Pearson correlation coefficients using data for all the links was followed by calculation of Pearson correlation coefficients using data by speed limit category.

Speed limit >50 mph
Figure 8 shows the variations in calculated Pearson correlation coefficients between the estimated travel times and travel time statistics by the time of the day. Only links with a speed limit greater than 50 mph are considered when calculating the Pearson correlation coefficients in this case. The Pearson correlation coefficients are relatively higher for the night-time period and mid-day period, irrespective of the travel time statistic. The variations in the calculated Pearson correlation coefficients with respect to travel time statistics, for these two time periods, are marginal.
The calculated Pearson correlation coefficients are the lowest for the evening peak period and morning peak period. The calculated Pearson correlation coefficients are marginally close to each other and consistent up to the 25th percentile travel time (TT25). They then follow a decreasing trend. The larger differences between the 50th percentile travel time (TT50) and the 95th percentile travel time (TT95) could be attributed to the effect of variations in traffic volumes and travel times on links with a speed limit greater than 50 mph.

Speed limit >40 & ≤50 mph
Figure 9 shows the variations in calculated Pearson correlation coefficients between the estimated travel times and travel time statistics by the time of the day. Only links with a speed limit greater than 40 mph but less than or equal to 50 mph are considered when calculating the Pearson correlation coefficients in this case. The Pearson correlation coefficients are relatively higher for the night-time period, followed by the mid-day period, irrespective of the travel time statistic (except in the case of maximum travel time, MaxTT). They are the lowest for the evening peak period. The difference in the calculated Pearson correlation coefficients for morning and evening peak periods is relatively less from 5th to 50th percentile travel times (TT05 to TT50). The variations in the calculated Pearson correlation coefficients with respect to travel time statistics are also marginal, except for the maximum travel time (MaxTT). They are very close to each other from 10th to 75th percentile travel times (TT10 to TT75).
Figure 9. Pearson correlation coefficients for links with speed limit >40 & ≤50 mph.

Speed limit >30 & ≤40 mph
Figure 10 shows the variations in calculated Pearson correlation coefficients between the estimated travel times and travel time statistics by the time of the day. Only links with a speed limit greater than 30 mph but less than or equal to 40 mph are considered when calculating the Pearson correlation coefficients in this case. The Pearson correlation coefficients are the highest for the night-time period, followed by the mid-day period, irrespective of the travel time statistic (except in the case of maximum travel time, MaxTT). They are the lowest for the evening peak period. The variations in the calculated Pearson correlation coefficients with respect to travel time statistics are more noticeable from the 50th percentile travel time (TT50) to the maximum travel time (MaxTT). They are very close to each other from 10th to 25th percentile travel times (TT10 to TT25). In general, the lines are closer (indicating smaller relative differences in calculated Pearson correlation coefficients) and tend to converge for the four different time periods and merge at MaxTT.
In Figures 8, 9 and 10, the calculated Pearson correlation coefficients for the average travel time (AvgTT) are typically lower than the calculated Pearson correlation coefficients for the 50th percentile travel time (TT50), irrespective of the time period.
Figure 11(a-d) shows the variations in Pearson correlation coefficients between the estimated travel times and the travel time statistics by area type. The links in the CBD, fringe, urban, and suburban and rural area types are segregated and considered separately for the analysis. Figure 11(a) presents the trends for the morning peak period. The estimated travel times and the travel time statistics are highly correlated for the suburban and rural area type during the morning peak period, except in the case of maximum travel time (MaxTT). The Pearson correlation coefficients for the fringe, urban, suburban and rural area types increased until the 50th percentile travel time (TT50). They then gradually decreased, followed by a drop at the maximum travel time (MaxTT). The Pearson correlation coefficients for the CBD area type are the lowest for all the travel time statistics, in particular for the 10th percentile travel time (TT10), 15th percentile travel time (TT15) [...].
When the mid-day period is considered (Figure 11(b)), the Pearson correlation coefficients for fringe, urban, suburban and rural area types followed a similar trend irrespective of the travel time statistic. No specific trend could be observed in the case of Pearson correlation coefficients for the CBD area. The Pearson correlation coefficients are the lowest for the maximum travel time (MaxTT) irrespective of the area type.
Figure 11(c) shows the Pearson correlation coefficients by the area type during the evening peak period. The Pearson correlation coefficients are consistent from the minimum travel time (MinTT) to the average travel time (AvgTT), followed by a reduction until the maximum travel time (MaxTT). They are the lowest for the maximum travel time (MaxTT) irrespective of the area type.
Figure 11(d) shows the Pearson correlation coefficients by area type during the night-time period. The trends are similar to the mid-day period except for the CBD area type. However, the values are slightly higher than those of the mid-day period for fringe, urban, suburban and rural area types.
Overall, the Pearson correlation coefficients for the fringe, urban, and suburban and rural area types are lower for the peak periods (morning and evening peak) than for the off-peak periods (mid-day and night-time). For the CBD area type, however, the trends remain similar at all times of the day, with lower Pearson correlation coefficients than the other area types.
All the calculated Pearson correlation coefficients were significant at a 95 percent confidence level or higher. Therefore, non-linear relationships were not explored.
Figure 11. Pearson correlation coefficients by area type.
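The correlation and significance check described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the travel time values are hypothetical, the helper names `pearson_r` and `t_statistic` are introduced here for illustration, and the test assumes the usual two-sided t-test of zero correlation with n - 2 degrees of freedom.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length with at least 3 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t statistic for the null hypothesis of zero correlation (n - 2 degrees of freedom)."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

# Hypothetical link travel times in minutes (illustrative values only).
model_tt = [1.2, 2.5, 0.8, 3.1, 1.9, 2.2, 4.0, 1.1]       # model estimates
observed_tt50 = [1.4, 2.9, 1.0, 3.0, 2.4, 2.1, 4.6, 1.5]  # observed TT50

r = pearson_r(model_tt, observed_tt50)
t = t_statistic(r, len(model_tt))

# Two-sided critical t value at the 95 percent level for 6 degrees of freedom.
T_CRITICAL_DF6 = 2.447
significant = abs(t) > T_CRITICAL_DF6
print(f"r = {r:.3f}, t = {t:.2f}, significant at 95%: {significant}")
```

In practice the same calculation would be repeated for every travel time statistic (MinTT through MaxTT) and every time period, which yields one line per period in plots such as Figures 8 to 11.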

Conclusions
The calculated Pearson correlation coefficients indicate that the estimated travel times from the regional travel demand model and the travel time statistics from the private data source are positively correlated with each other. This implies that both datasets exhibit similar trends (increasing-increasing or decreasing-decreasing). The Pearson correlation coefficients between the estimated travel times from the regional travel demand model and the maximum travel times from the private data source are the lowest, irrespective of the time of the day and area type. This is as expected, as travel times are high under non-recurring congestion conditions such as a fatal or severe crash, a special event, inclement weather, or other similar incidents. Regional travel demand models do not consider such factors in modeling and analysis.
Likewise, the calculated Pearson correlation coefficients between the estimated travel times from the regional travel demand model and the computed minimum travel times from the private data source are marginally lower than those for the 5th to 95th percentile travel time statistics. The minimum travel time could be due to aggressive, reckless, or speeding drivers. It does not represent the general traffic condition during a time period.
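To make the distinction between the extreme values and the percentile statistics concrete, the following sketch computes the travel time statistics for a single link from a set of observations. The observations are hypothetical, and the inclusive interpolation method is an assumption; the private data source may compute percentiles differently.

```python
import statistics

# Hypothetical travel time observations (seconds) for one link and time period.
# The smallest value stands in for a speeding driver, the largest for an incident.
observations = [41, 55, 56, 57, 58, 58, 59, 60, 61, 62, 63, 65, 68, 72, 80, 190]

# With n=100, statistics.quantiles returns the 99 cut points P1..P99.
cuts = statistics.quantiles(observations, n=100, method="inclusive")

def percentile(p):
    """Return the p-th percentile travel time for 1 <= p <= 99."""
    return cuts[p - 1]

stats = {
    "MinTT": min(observations),
    "TT05": percentile(5),
    "TT10": percentile(10),
    "TT25": percentile(25),
    "AvgTT": statistics.mean(observations),
    "TT50": percentile(50),
    "TT85": percentile(85),
    "TT95": percentile(95),
    "MaxTT": max(observations),
}

for name, value in stats.items():
    print(f"{name}: {value:.1f}")
```

MinTT and MaxTT each depend on a single observation, whereas the percentile statistics are far less sensitive to one unusual trip.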
The trends in the calculated Pearson correlation coefficients are relatively similar (in most cases) from the 10th percentile travel time to the 50th percentile travel time. They then typically follow a decreasing trend up to the maximum travel time (MaxTT). Also, the calculated Pearson correlation coefficients are the highest for the night-time period, followed by the mid-day period. They are the lowest for the evening peak period. These findings indicate that regional travel demand models capture normal traffic conditions but fail to accurately reflect congested traffic conditions. This could be attributed to the use of static trip assignment techniques in the regional travel demand model. It also raises the question of how effectively such models reflect future growth patterns.
The calculated Pearson correlation coefficients for the average travel time (AvgTT) are generally lower than those for the 50th percentile travel time (TT50), irrespective of the time period. This indicates some skewness in the observed travel times, which is evident from the descriptive statistics and scatter plots.
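The effect of skewness on the average can be illustrated with a small sketch. The observations below are hypothetical, and Pearson's second skewness coefficient is used here only as a convenient summary; the paper does not state which skewness measure, if any, was computed.

```python
import statistics

# Hypothetical observed travel times (seconds) on one link; a few slow trips
# create a long right tail.
observed = [52, 54, 55, 55, 56, 57, 58, 60, 63, 70, 95, 140]

avg_tt = statistics.mean(observed)   # AvgTT, pulled upward by the slow trips
tt50 = statistics.median(observed)   # TT50, largely unaffected by the tail

# Pearson's second skewness coefficient: 3 * (mean - median) / standard deviation.
skew = 3 * (avg_tt - tt50) / statistics.stdev(observed)

print(f"AvgTT = {avg_tt:.1f} s, TT50 = {tt50:.1f} s, skewness = {skew:.2f}")
```

A positive coefficient means the mean exceeds the median, so AvgTT reflects occasional slow trips that a model of typical conditions is unlikely to reproduce.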
Some variations were observed when the results were compared by speed limit and time of day. The calculated Pearson correlation coefficients are close to each other for the night-time and mid-day periods on roads with speed limit >50 mph. They differ significantly from those for the evening and morning peak periods. On the other hand, the calculated Pearson correlation coefficients are relatively close to each other for the morning peak, mid-day, and evening peak periods on roads with speed limit >40 & ≤50 mph. The differences seem to be marginal for all the considered time periods on roads with speed limit >30 & ≤40 mph. These trends seem to reflect traffic conditions on various road functional classes.
Variations corresponding to the classification of the links by area type were also observed. The trends in Pearson correlation coefficients are similar for all the times of the day in the CBD area type. A marginal increase followed by a decrease in the Pearson correlation coefficients was observed for the fringe, urban, and suburban and rural area types. In general, there was a reduction in the Pearson correlation coefficient values from the 95th percentile travel time (TT95) to the maximum travel time (MaxTT). Overall, the suburban and rural area type tends to have the highest Pearson correlation coefficient values, followed by the urban and fringe area types. The values are the lowest for the CBD area type during all the times of the day, indicating a relatively weaker relationship between the estimated travel times from the regional travel demand model and the travel time statistics from the private data source for links in the CBD area type. The regional travel demand model mimics real-world travel times best during the night-time period, when the Pearson correlation coefficients are higher for all the area types except the CBD area type.
Overall, the findings from this research indicate that the estimated travel times from the regional travel demand model are typically lower than the computed 85th percentile travel times from the private data source, irrespective of the time of the day, link speed limit, and area type. Stronger correlations are observed for the off-peak periods of the day, and relatively lower Pearson correlation coefficients are observed for the peak periods.
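The comparison against the 85th percentile travel time can be expressed as a simple share of links. The sketch below uses hypothetical link values and only illustrates the form of the comparison, not the reported results.

```python
# Hypothetical (model estimate, observed TT85) pairs in minutes, one per link.
links = [(1.2, 1.8), (2.5, 3.4), (0.8, 0.7), (3.1, 4.0), (1.9, 2.6)]

# Count links where the model estimate falls below the observed TT85.
below = sum(1 for estimate, tt85 in links if estimate < tt85)
share_below = below / len(links)

print(f"{below} of {len(links)} links ({share_below:.0%}) are below TT85")
```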
The estimated travel times from the regional travel demand model are based on the static user equilibrium approach used by the regional planning agency. Hence, the travel paths of the users are based on the lowest cost, i.e., travel impedance. These conditions might not reflect the actual travel patterns across a region. Estimating travel times using other trip assignment techniques and comparing them with the computed travel time statistics from the private data source may provide a better understanding of travelers' behavior and travel patterns. This, as well as evaluating trends using data for other urban areas, merits investigation.
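For readers unfamiliar with static user equilibrium, the idea can be sketched on a toy network of two parallel routes: flows settle where no traveler can reduce travel time by switching routes, so both used routes have equal travel time. The sketch below is an assumption-laden illustration and not the regional model's implementation. The BPR-type volume-delay function with alpha = 0.15 and beta = 4 is assumed (the source does not name a volume-delay function), and the route parameters and demand are hypothetical.

```python
def bpr_time(free_flow_time, volume, capacity, alpha=0.15, beta=4.0):
    """BPR-type volume-delay function (assumed, not taken from the paper)."""
    return free_flow_time * (1.0 + alpha * (volume / capacity) ** beta)

def two_route_equilibrium(demand, route_a, route_b, tol=1e-9):
    """Flow on route A at user equilibrium; each route is (free_flow_time, capacity)."""
    def gap(flow_a):
        # Travel time on A minus travel time on B; increasing in flow_a.
        time_a = bpr_time(route_a[0], flow_a, route_a[1])
        time_b = bpr_time(route_b[0], demand - flow_a, route_b[1])
        return time_a - time_b

    if gap(0.0) >= 0.0:      # A is slower even when empty: everyone uses B
        return 0.0
    if gap(demand) <= 0.0:   # A is faster even when fully loaded: everyone uses A
        return demand

    low, high = 0.0, demand
    while high - low > tol:  # bisection on the travel time gap
        mid = (low + high) / 2.0
        if gap(mid) > 0.0:
            high = mid
        else:
            low = mid
    return (low + high) / 2.0

# Hypothetical routes: (free-flow time in minutes, capacity in vehicles per hour).
route_a = (10.0, 2000.0)
route_b = (12.0, 3000.0)
demand = 4000.0  # vehicles per hour between one origin-destination pair

flow_a = two_route_equilibrium(demand, route_a, route_b)
time_a = bpr_time(route_a[0], flow_a, route_a[1])
time_b = bpr_time(route_b[0], demand - flow_a, route_b[1])
print(f"flow A = {flow_a:.0f}, time A = {time_a:.2f} min, time B = {time_b:.2f} min")
```

Because the flows and the volume-delay function are fixed for the whole analysis period, this kind of assignment cannot represent queue build-up or within-period variation, which is consistent with the weaker peak-period correlations discussed above.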
Travel times vary considerably during peak hours. They also vary by area type. The actual duration of the peak period may be shorter or longer than three hours and may differ from what is traditionally used in the regional travel demand model. Additionally, regional travel demand models are based on land use, socio-economic, and network characteristics, and are used to predict future patterns. The relationships between these characteristics and the travel time statistics, the effect of the analysis duration, and the implications for predicting future patterns need to be explored in future research.