Commuter Rail and Hybrid Rail Efficiency

Better, or Just Different?
Examining Operational Efficiency on Commuter Rail and Hybrid Rail Systems in the US
Sandy Johnston
APLN 504
12/11/2015
Contents
Introduction
Hypothesis and Research Question
About the Data
Descriptive Statistics
Relationships Between Variables (t-tests)
Correlations
Correlation Matrix (n for all=27)
Linear Regression Modeling
    Combined/Comparative Models
    Differential Models
        YR
        CR
Discussion
Conclusions and Further Research Needed
Appendix A: Systems Studied
Appendix B: Variables
Appendix C: Visual Presentations of Descriptive Data
Appendix D: Analysis Dataset
Appendix E: SPSS Outputs
Introduction
The last several decades have seen a remarkable resurgence in public transit in the United
States. As traffic congestion increases and many metropolitan areas continue to sprawl,
policymakers have increasingly looked to increase the number of mobility options available to
their constituents. One of the most popular ways to do this has been to implement a regional, or
commuter, rail system.
Commuter rail is a uniquely American mode that evolved to cope with high peak-hour
demand from low-density areas surrounding a major urban center. Commuter rail trains can
cover long distances at high speeds, and are relatively cheap to implement if using existing
rights of way. However, commuter rail trains are expensive to operate because of staffing
requirements, and as a result generally run infrequently at off-peak times, leading to a
significant emphasis on peak service.
Figure 1: Existing commuter rail systems in North America, from The Transport Politic
(http://www.thetransportpolitic.com/existing-systems/existing-commuter-rail-systems)
The US has five major legacy commuter rail systems (systems of significant size that have been in continuous
operation from the pre-World War II era to today), in New York City, Boston, Philadelphia,
Washington, DC, and Chicago. While other systems (including those in Detroit, Pittsburgh, and
Milwaukee) have come and gone over the years, since the 1980s a number of new commuter
rail systems have opened in cities like Seattle, Los Angeles, Albuquerque, and Miami. Ridership
levels on these systems, however, remain uneven, leading some metro areas to seek other
solutions.
Recently, several cities have experimented with a form of transit known to the Federal
Transit Administration as hybrid rail. The foundations for this kind of operation were laid with
the release of Transit Cooperative Research Program (TCRP) Report 52, Joint Operation of
Light Rail Transit or Diesel Multiple Unit Vehicles with Railroads, in 1999. Often known
popularly as diesel light rail, and first defined by the FTA in 2011 [1] (although systems were in
operation before then), hybrid rail is best understood as a cross between light rail and commuter
rail. Hybrid rail systems generally run with self-propelled cars (like light rail), but propelled by
diesel, rather than electric, motors (like most commuter rail). For a variety of technical and
regulatory reasons, hybrid rail systems generally import European vehicles known as Diesel
Multiple Units, or DMUs [2]. With streamlined staffing and lower fuel consumption, these systems
can and do operate more frequently than commuter rail, though they generally serve a
suburb-to-city routing and do not run as frequently as urban light rail. As a result, many transit
advocates have hailed DMU-based hybrid rail as the wave of the future in American transit [3].
Figure 2: Coaster commuter rail (left) and Sprinter hybrid rail (right) share a station, but not
tracks, at Oceanside, CA. Difference in size and design between the two modes is apparent.
Source: http://www.trainweb.org/chris/13nps4.JPG
[1] Federal Register /Vol. 76, No. 103 / Friday, May 27, 2011 /Notices
[2] Several of the systems have used FRA-compliant American-made DMUs, but these have generally
been unsuccessful in the market.
Hypothesis and Research Question

Hybrid rail systems have now been operating in the US long enough to begin examining their
efficiency benefits. The earliest such line, New Jersey Transit's River Line between Trenton and
Camden, opened in 2004. It was followed by the North County Transit District (CA) Sprinter in
2008, Oregon's Westside Express in 2009, Capital MetroRail in Austin, TX in 2010, and the
A-Train in Denton County, TX in 2011. Three more California projects, SMART in Sonoma County,
eBART in the East Bay, and the Redlands Line from San Bernardino to Redlands, will open using
the mode in coming years. As hybrid rail proliferates, the time has come to examine to what
extent its promises of cost efficiency relative to commuter rail have been borne out.
This paper examines a snapshot of data from the National Transit Database (NTD)
related to commuter and hybrid rail systems, with the goal of measuring relative efficiencies
given a number of physical and operational factors. Given the expectations of advocates and the
growing popularity of the mode, it seems reasonable to hypothesize that hybrid rail systems
will be more efficient on an operational cost basis than commuter rail systems. This paper
uses the statistical software SPSS to conduct several analyses on the dataset, including
descriptive statistics, hypothesis testing, and creation of a correlation matrix. The paper also
seeks to establish regression models that can be used not just to observe, but to predict,
operational costs and efficiencies. One set of regression models will help stakeholders decide
between commuter rail and hybrid rail systems based on expected dimensions of service, and the
other will predict service costs and efficiencies based on mode.
[3] See for example http://seattletransitblog.com/2014/01/03/the-cheaper-brighter-future-of-american-passenger-rail/
and http://capntransit.blogspot.com/2009/03/feds-relax-restrictions-for-light-rail.html
About the Data

This paper relies on data compiled from the National Transit Database. Established by
Congress in 1974, NTD collects annual transit performance and financial data, monthly
ridership, and safety and security data [4]. The data is used to support benchmarking and research
and to calculate federal funding; all urban and rural transit agencies that receive federal funding
are required to report data to NTD. The data tables (currently up to date through 2013) are
accessible online through the Federal Transit Administration [5] or the American Public
Transportation Association [6] and can be downloaded in Excel format.
NTD tables allow sorting and filtering by a number of variables, including mode
(meaning, in transit parlance, roughly what kind of vehicle is being used). For the following
analysis, results from several tables were filtered to present only the CR (commuter rail) and
YR (hybrid rail) modes. The filtering returns 28 results, of which one, representing the
Downeaster Amtrak service from Boston to Portland, ME, was manually excluded because it is
an intercity, not a commuter, service (despite being classified as CR in NTD) and presented as an
extreme outlier in data analysis. It is presumably included in NTD because it receives some FTA
funding. Another semi-intercity Amtrak route, the Keystone Service between Philadelphia and
Harrisburg, is also presented in the 2013 NTD data, but was retained because its stop spacing and
frequency are more equivalent to a commuter rail route and fall within the norms of such
operations. Thus the full dataset of commuter rail and hybrid rail operations in the US contains
27 operations, 22 classified as CR and 5 as YR; a full list may be found in Appendix A below.
[4] Background information on NTD from:
http://www.apta.com/members/memberprogramsandservices/international/Documents/U.S.%20National%20Transit%20Database.pdf
[5] www.ntdprogram.gov
[6] http://www.apta.com/resources/statistics/Pages/NTDDataTables.aspx
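The filtering step described above can be sketched in a few lines of Python. This is only an illustration of the logic, not the procedure actually followed (the analysis worked from NTD's Excel tables and SPSS); the record layout, the field names `agency` and `mode`, and the sample rows are all hypothetical.

```python
# Hypothetical rows standing in for an NTD table export; field names are assumptions.
records = [
    {"agency": "Example Commuter Railroad", "mode": "CR"},
    {"agency": "Example Hybrid Line", "mode": "YR"},
    {"agency": "Example Light Rail", "mode": "LR"},
    {"agency": "Downeaster", "mode": "CR"},
]

RAIL_MODES = {"CR", "YR"}   # commuter rail and hybrid rail mode codes
EXCLUDED = {"Downeaster"}   # intercity service that the paper removes by hand


def select_systems(rows):
    """Keep CR and YR rows, dropping manually excluded services."""
    return [r for r in rows
            if r["mode"] in RAIL_MODES and r["agency"] not in EXCLUDED]


selected = select_systems(records)
counts = {m: sum(1 for r in selected if r["mode"] == m) for m in sorted(RAIL_MODES)}
print(counts)
```

Applied to the real 2013 tables, the same two conditions would be expected to leave the 22 CR and 5 YR operations listed in Appendix A.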
This paper uses a number of variables from NTD to define and analyze operational
factors and efficiency. Some of these variables are taken directly from NTD tables, and others
are secondarily computed from variables contained in NTD tables. Future versions of this work
could expand the list of variables to include measurements and factors not included in NTD,
especially crew requirements and density of the area along the route. A full list of variables may
be found in Appendix B below. This paper will particularly stress three dependent variables that
measure operational efficiency: operational expense per vehicle hour, operational expense per
passenger mile, and operational expense per passenger trip (unlinked). The most important
independent variables are stop spacing (the distance between stops on a given line) and trains per
route mile, a crude proxy for frequency of service, which is not directly measured by NTD.
Another variable, passenger trips per vehicle revenue hour, can occupy either a dependent or an
independent role.
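As a rough sketch of how the secondarily computed variables might be derived, the snippet below builds each ratio from raw annual figures. The field names are illustrative rather than actual NTD column names, the input numbers are made up, and the stop spacing formula (route miles divided by number of stations) is an assumption, since the paper defines stop spacing only as the distance between stops on a line.

```python
from dataclasses import dataclass


@dataclass
class SystemYear:
    """One system's annual figures; field names are illustrative, not NTD columns."""
    route_miles: float
    stations: int
    weekday_trains: int           # trains in operation on an average weekday
    vehicle_revenue_hours: float
    passenger_trips: float        # unlinked trips
    passenger_miles: float
    operating_expense: float      # dollars


def derived_measures(s: SystemYear) -> dict:
    return {
        # Assumption: average spacing approximated as route miles per station.
        "stop_spacing": s.route_miles / s.stations,
        # Crude frequency proxy: weekday trains divided by route mileage.
        "trains_per_route_mile": s.weekday_trains / s.route_miles,
        "pax_trips_per_rev_hour": s.passenger_trips / s.vehicle_revenue_hours,
        "opex_per_vehicle_hour": s.operating_expense / s.vehicle_revenue_hours,
        "opex_per_pax_trip": s.operating_expense / s.passenger_trips,
        "opex_per_pax_mile": s.operating_expense / s.passenger_miles,
    }


# Made-up inputs for demonstration only; not drawn from NTD.
example = SystemYear(route_miles=30.0, stations=10, weekday_trains=3,
                     vehicle_revenue_hours=20_000, passenger_trips=1_000_000,
                     passenger_miles=10_000_000, operating_expense=12_000_000)
measures = derived_measures(example)
```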
Descriptive Statistics
This section provides an overview and numeric and visual presentations of the data
covered in this paper. Not all variables present in Appendix B are presented here; some are
filtered out based on irrelevance to the research question. Data is presented with a particular eye
towards defining the differences between CR and YR systems. Visual representations are
available in Appendix C below.
Variable / Mode      Mean      Median    Variance     SD        Min       Max       Range     IQR
VOMS (Vehicles Operated in Maximum Service)
  CR                 277.05    64.0      171667.0     414.33    7.0       1230.0    1223.0    330.0
  YR                 7.4       6.0       20.800       4.56      4         15        11        7.5
Number of Trains
  CR                 39.46     12.0      2410.74      49.1      2.0       143.0     141.0     61.75
  YR                 5.4       4.0       13.80        3.72      3.0       12.0      9.0       4.50
Stop Spacing
  CR                 4.60      4.63      3.69         1.92      1.45      8.60      7.15      2.95
  YR                 2.79      2.92      1.41         1.19      1.47      4.26      2.79      2.31
Trains per Route Mile
  CR                 .0928     .0701     .004         .06429    .02       .26       .24       .09
  YR                 .1044     .0939     .002         .04081    .06       .17       .11       .06
Passenger Trips per Revenue Hour
  CR                 46.06     43.66     217.1        14.73     17.0      87.34     70.34     14.33
  YR                 58.80     58.92     508.447      22.55     22.90     82.68     59.78     37.10
Operational Expense per Vehicle Hour
  CR                 548.16    505.65    29978.69     173.143   326.30    1087.50   761.20    200.20
  YR                 688.82    674.30    27225.76     165.0     465.60    868.60    403.0     313.0
Operational Expense per Passenger Trip
  CR                 14.66     12.95     45.796       6.77      6.20      30.8      24.6      5.32
  YR                 14.72     15.90     30.72        5.45      7.40      22.20     14.80     9.75
Operational Expense per Passenger Mile
  CR                 .541      .40       .065         .256      .30       1.30      1.00      .20
  YR                 1.22      1.00      .272         .522      .8        2.0       1.2       .95
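For readers who want to reproduce this kind of summary outside SPSS, the following sketch computes the same eight measures with Python's standard library. The input values are hypothetical, not NTD data, and the quartile rule used (the `statistics.quantiles` default) is an assumption; statistical packages differ in how they define quartiles, so IQR values may not match SPSS exactly.

```python
import statistics


def describe(values):
    """Summary measures matching the table's columns (sample variance and SD)."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # default 'exclusive' method
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "variance": statistics.variance(values),
        "sd": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
        "range": max(values) - min(values),
        "iqr": q3 - q1,
    }


# Hypothetical values for five systems; not taken from NTD.
summary = describe([3, 5, 6, 9, 12])
```

Two identities are useful for checking a table of this kind: the standard deviation is the square root of the variance, and the range is the maximum minus the minimum.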
Relationships Between Variables (t-tests)

We have seen thus far that many, but not all, of the variables examined show apparently
large differences between YR and CR systems. But are these differences statistically significant?
We use independent-samples t-tests, grouped by the type variable, to find out. It should be kept in
mind that the sample size is relatively small (22 CR systems and just 5 YR systems), so
statistical significance at high levels of confidence will be hard to achieve.
Variable                   Levene's Test                              t         DF    Sig. (2-tailed)
Stop Spacing               Sig.=.226; equal variances assumed         1.992     25    0.057
Pax Trips/Revenue Hour     Sig.=.434; equal variances assumed         -1.583    25    0.126
Trains/Route Mile          Sig.=.162; equal variances assumed         -0.381    25    0.706
OpEx/Vehicle Hour          Sig.=.972; equal variances assumed         -1.652    25    0.111
PaxTrips/Vehicle Hour      Sig.=.434; equal variances assumed         -1.583    25    0.126
OpEx/PaxTrip               Sig.=.737; equal variances assumed         -0.2      25    0.984
OpEx/PaxMile               Sig.=.013; equal variances NOT assumed     -2.835    25    .042
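The t statistics can be approximated directly from the rounded means, variances, and group sizes reported in the descriptive statistics. The sketch below is not the SPSS procedure itself; it implements the textbook pooled-variance and Welch formulas, and because the inputs are rounded its results will differ slightly from the SPSS output. A p-value would additionally require the t distribution's CDF, which is omitted here.

```python
import math


def pooled_t(mean1, var1, n1, mean2, var2, n2):
    """Student's t for two independent samples, equal variances assumed."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df


def welch_t(mean1, var1, n1, mean2, var2, n2):
    """Welch's t with Welch-Satterthwaite degrees of freedom
    (equal variances not assumed)."""
    a, b = var1 / n1, var2 / n2
    t = (mean1 - mean2) / math.sqrt(a + b)
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, df


# Rounded summary values from the descriptive statistics (CR first, YR second).
t_spacing, df_spacing = pooled_t(4.60, 3.69, 22, 2.79, 1.41, 5)      # stop spacing
t_paxmile, df_paxmile = welch_t(0.541, 0.065, 22, 1.22, 0.272, 5)    # OpEx per pax mile
print(round(t_spacing, 2), df_spacing)
print(round(t_paxmile, 2), round(df_paxmile, 1))
```

Under these formulas, the Welch degrees of freedom for a comparison with unequal variances are generally well below n1 + n2 - 2, which is worth keeping in mind when reading the operational expense per passenger mile row.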
Correlations
Constructing a correlation matrix allows us to immediately see statistically significant
relationships between ratio variables. While the sample size is small and statistical significance
could therefore be hard to tease out, this exercise is important for two reasons:
a) It allows us to see relationships between dependent and independent variables,
previewing the construction of linear regression models in the next section
b) It establishes relationships or lack thereof between independent variables, warning
about potential multicollinearity problems.
It is important to recognize that this matrix represents correlations for all of the data points in the
set, and is not sorted by type (YR vs. CR). Statistically significant correlations are marked with asterisks.
Correlation Matrix (n for all=27)

Cells show the Pearson correlation, with two-tailed significance in parentheses.

                              OpEx per        OpEx per        OpEx per        Stop            Trains per      PaxTrips per
                              Vehicle Hour    Unlinked        Pax Mile        Spacing         Route Mile      Revenue Hour
                                              Pax Trip                                        (proxy for
                                                                                              frequency)
OpEx per Vehicle Hour         1               .386* (.047)    .421* (.029)    .170 (.396)     -.134 (.505)    .353 (.071)
OpEx per Unlinked Pax Trip    .386* (.047)    1               .405* (.036)    .553** (.003)   -.442* (.021)   -.611** (.001)
OpEx per Pax Mile             .421* (.029)    .405* (.036)    1               -.094 (.643)    -.095 (.637)    -.060 (.765)
Stop Spacing                  .170 (.396)     .553** (.003)   -.094 (.643)    1               -.705** (.000)  -.477* (.012)
Trains per Route Mile         -.134 (.505)    -.442* (.021)   -.095 (.637)    -.705** (.000)  1               .323 (.101)
PaxTrips per Revenue Hour     .353 (.071)     -.611** (.001)  -.060 (.765)    -.477* (.012)   .323 (.101)     1

*. Correlation is significant at the 0.05 level (2-tailed).
**. Correlation is significant at the 0.01 level (2-tailed).
Operational expense per vehicle hour:
    - Moderately positively correlated with operational expense per passenger trip (r=.386)
    - Moderately positively correlated with operational expense per passenger mile (r=.421)
Operational expense per unlinked passenger trip:
    - Moderately positively correlated with operational expense per vehicle hour (r=.386)
    - Moderately positively correlated with operational expense per passenger mile (r=.405)
    - Moderately positively correlated with stop spacing (r=.553; as distance between stops
      INCREASES, costs increase)
    - Moderately negatively correlated with frequency (r=-.442)
    - Moderately to strongly negatively correlated with passenger trips per revenue
      hour (r=-.611)
Operational expense per passenger mile:
    - Moderately positively correlated with operational expense per vehicle hour (r=.421)
    - Moderately positively correlated with operational expense per unlinked passenger
      trip (r=.405)
Stop spacing:
    - Moderately positively correlated with operational expense per passenger trip (r=.553)
    - Strongly negatively correlated with frequency (r=-.705)
    - Moderately negatively correlated with passenger trips per revenue hour (r=-.477)
Passenger trips per revenue hour:
    - Moderately to strongly negatively correlated with operational expense per
      unlinked passenger trip (r=-.611)
    - Moderately negatively associated with stop spacing (r=-.477)
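The link between a correlation coefficient and its significance level can be made concrete with a short sketch. With n=27 systems, a correlation r is tested against zero using t = r * sqrt(n - 2) / sqrt(1 - r^2) on 25 degrees of freedom. The code below is a generic illustration rather than the SPSS routine; the critical value 2.060 is the standard two-tailed 5% value of t for 25 degrees of freedom.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_from_r(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


T_CRIT_25DF_05 = 2.060  # two-tailed 5% critical value of t with 25 df

# Two coefficients from the matrix: one flagged as significant, one not.
t_sig = t_from_r(0.386, 27)   # OpEx per vehicle hour vs. OpEx per passenger trip
t_not = t_from_r(0.353, 27)   # OpEx per vehicle hour vs. passenger trips per revenue hour
print(abs(t_sig) > T_CRIT_25DF_05, abs(t_not) > T_CRIT_25DF_05)
```

With a sample this small, a correlation needs an absolute value of roughly .38 or more to clear the 5% threshold, which is why coefficients such as .353 fall just short.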
Linear Regression Modeling

With an idea of the relative efficiencies of the two modes from descriptive statistics, and
having established between which variables statistically significant correlations exist, we now
turn to predictive functions. Creating linear regression models that can predict our various
dependent variables will allow future policymakers who wish to establish a commuter rail or
DMU service to predict the operational efficiency (and therefore, costs) with some accuracy,
given the several inputs.
There are two kinds of regression models in this section. The first includes regression
models for each of the three primary dependent variables measuring operational efficiency, using
up to four independent variables: stop spacing, frequency of service, passengers per revenue
hour, and a dummy variable representing the binary choice between YR and CR service, where
YR=1 and CR=0. These models allow direct prediction of the relative efficiencies of YR and CR
service. A fourth model calculates the anomalous variable passengers per vehicle revenue hour,
here treated as a dependent variable though it may be regarded as an input as well. Each model
incorporates the full sample size of data from NTD, so n=27 in all cases.
The second set of nested models represents an attempt to create operational cost and
efficiency tests for each of the two modes separately, using exclusively their own data. For each
of the three dependent variables measuring operational efficiency, we have created one set of
nested models based on YR data exclusively and one set of nested models based on CR data
exclusively, all using the same independent variables. Although the sample sizes are very small
(n=5 for YR and n=22 for CR), this is at least a beginning to work that will allow future decision
makers to predict operational costs. Using nested models allows us to control for different
variables and make observations about the relative importance of various independent variables.
Ultimately the goal is the selection of the best model(s) for operational efficiency for both YR
and CR; since the dependent variables are largely interchangeable in terms of predictive value,
this can be any of them.
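The nested-model approach with a mode dummy can be sketched without any statistical package. The example below fits ordinary least squares through the normal equations, first with stop spacing alone and then with the YR/CR dummy added. It is a minimal illustration under stated assumptions: the data are synthetic (generated from an exact linear rule, not drawn from NTD), the function name `ols_fit` is hypothetical, and no significance values or R-squared are computed.

```python
def ols_fit(predictors, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    X = [[1.0, *row] for row in predictors]   # prepend the intercept column
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * yi for x, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(k):
            if r != col:
                factor = A[r][col]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


# Synthetic systems: (stop spacing in miles, type dummy with YR=1 and CR=0).
systems = [(1.5, 1), (2.5, 1), (4.0, 1), (3.0, 0), (4.5, 0), (6.0, 0), (8.0, 0)]
# Synthetic cost rule used only to check the fit: 10 + 2 * spacing + 5 * type.
costs = [10 + 2 * s + 5 * t for s, t in systems]

model_1 = ols_fit([(s,) for s, _ in systems], costs)   # stop spacing only
model_2 = ols_fit(systems, costs)                      # stop spacing + type dummy
print([round(b, 3) for b in model_2])
```

In the second model, the coefficient on the dummy is the estimated cost difference between a YR and a CR system with the same stop spacing, which is how the dummy's slope is read in the combined models.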
Combined/Comparative Models [7]
Dependent Variable:
Independent Variable
Stop Spacing
Trains per Route Mile
Passengers/Revenue
Hour
Dummy for Type
Constant
R2
Operational Expense per Vehicle Hour

Model 1
sig.
Model 2 sig.
Model 3 sig.
Model 4 sig.
15.698
0.396
13.987
0.6
39.626
0.125
59.555
0.976
-81.908
0.923
-36.388
0.961 383.871
0.027
Dependent Variable:
Operational Expense per Unlinked Passenger Trip

Model 1
sig.
Model 2 sig.
Model 3 sig.
Model 4 sig.
1.854
0.003
1.607
0.056
0.856
0.281
1.437
0.084
Stop Spacing
507.346
0.029
522.787
0.029
-11.199

Passengers/Revenue
Hour
0.008
0.666
5.975
0.011
119.582
0.273
0.584
6.772
0.15
0.305
R2
Dependent Variable:
Stop Spacing
8.885
0.121
0.311
0.592
-0.277
0.99
-0.174
0.016
-0.193
0.006
5.118
0.006
16.99
0.018
20.66
0.005
0.468
0.544
Operational Expense per Passenger Mile

Model 1
sig.
Model 2 sig.
Model 3 sig.
Model 4 sig.
-0.02
0.643
-0.068
0.264
-0.083
0.214
0.005
0.066
-2.176

Passengers/Revenue
Hour
0.262
-2.202
0.263
-0.353
0.924
-0.003
0.541
-0.006
0.815
0.773
0.147
0.841
0.066
Dummy for Type

0.751
Constant
0.009
R2
0.6
0.05
0.976
-12.525
Dummy for Type

Constant
5.327
175.615
-6.355
0.393
0.001
1.162
0.009
0.06
[7] Full SPSS output for all tables is attached in Appendix XXX below
1.396
0.076
0.02
0.511
Dependent Variable:
Stop Spacing
Dummy for Type
Constant
R2
Dependent Variable:
Stop Spacing
Model 1
Passengers/Revenue Hour
sig.
Model 2 sig.
Model 3 sig.
-4.139
0.012
-4.306
0.061
-3.481
0.173
-7.602
0.915
7.842
0.916
6.379
0.464
66.047
0
67.482
0
61.319
0.002
0.228
0.228
0.246
Differential Models
YR
Model 1
sig.
Model 2
sig.
Model 3
-8.015
0.927
-31.015
0.806 282.484
-1275.091

Passengers/Revenue
Hour
711.197
Constant
0.059
0.003
R2
Dependent Variable:
Stop Spacing
Model 1
4.493
R2
Dependent Variable:
Stop Spacing
2.176
0.926
0.367
15.667
0.256
-1530.83
0.393
0.731
0.858
Operational Expense per Passenger Trip

sig.
Model 2
sig.
Model 3
0.009
5.237
0.004
4.549
41.256
0.395
-4.208
0.306
0.731 4883.675
0.075

Passengers/Revenue
Hour
Constant
908.511
sig.
0.052
0.138
0.992
sig.
0.118
27.741
0.374
-0.034
0.538
1.145
0.886
0.996

Model 1
sig.
Model 2 sig.
Model 3 sig.
0.238
0.347
0.261
0.482
0.114
0.929
1.312
0.896
-1.588
0.896
Passengers/Revenue
Hour
Constant
R2
0.556
Stop Spacing
R2
Dependent Variable:
Stop Spacing
Passengers/Revenue
Hour
0.844
0.3
0.899
1.502
0.875
0.317
Passenger Trips per Vehicle Hour

sig.
Model 2 sig.
-12.919
0.207
-20.01
0.092
94.869
0.462
0.029
-393.106
155.7
0.829
0.174
0.044
CR
Model 1
sig.
Model 2 sig.
Model 3 sig.
33.392
0.089
50.39
0.092
56.626
0.05
681.071

Passengers/Revenue
Hour
Constant
0.353
0.293
Dependent Variable:
Independent Variable Model 1
Stop Spacing
Trains per Route
Mile
Constant
R2
Dependent Variable:
0.447
-0.007
394.805
.137
253.507
0.432
0.219
.166
360.294
0.665
4.745
0.083
36.082
0.873
.298
Operational Expense per Passenger Trip

Model 1
sig.
Model 2 sig.
Model 3 sig.
1.984
0.006
1.745
0.097
1.455
0.116
-9.58
0.752
5.319
0.845
-0.22
0.085
Constant
5.544
0.101
0.317
R2
Dependent Variable:
Stop Spacing
Dependent Variable:
Stop Spacing
Constant
R2
17.631
0.026
0.507

Model 1
sig.
Model 2 sig.
Model 3 sig.
0.022
0.456
0.002
0.97
-0.004
0.929
-0.824
0.439
0.028
R2
0.297
0.321

Passengers/Revenue
Hour
Constant
7.532
Model 1
0.007
0.609
0.047
0.545
0.068
-0.526
0.705
-0.004
0.321
0.812
0.042
0.099
0.028
Passengers per Revenue Hour

sig.
Model 2
sig.
-2.001
0.072
-1.314
0.586
67.601
0.353
59.845
0
45.821
0.013
0.153
0.192
Discussion
This research has examined the comparative operational efficiencies of commuter rail
and hybrid rail, using data from the National Transit Database. We hypothesized that, given the
expectations of its backers and proponents, hybrid rail would, as a mode, be found to be more
efficient in operation than commuter rail. This was analyzed using three primary dependent
variables: operational expense per vehicle hour, operational expense per passenger trip, and
operational expense per passenger mile. These variables were analyzed using two primary
independent variables, stop spacing and trains per route mile, a proxy for frequency of service.
Analysis was also conducted using the important ridership measure of passenger trips per vehicle
revenue hour, which can serve as either an independent or a dependent variable, since ridership
is both a result of good service and an input into the calculation of how much service is required.
The results of this statistical examination are, on the whole, mixed. We expected that YR
systems would show closer stop spacing than CR systems, to take advantage of the lightweight
and (in theory) faster-accelerating nature of their equipment. Indeed, YR stop spacing is
considerably closer than that of commuter rail systems, which is logical considering the proposed
benefits of the mode. The difference between the two modes comes very close to achieving
significance at the 95% confidence level (p=.057). With a mean of 2.79 miles, YR stop spacing
does not, however, approach the generally accepted best practice for urban rapid transit of
stations located every half mile to one mile. Indeed, the longer end of YR stop spacing overlaps
with CR stop spacing, again suggesting a convergence between the modes. The minimum stop
spacing on a CR system, SEPTA's 1.45 miles, is actually the single lowest result regardless of
mode, and suggests that that entire system should be run as a rapid transit system rather than
commuter rail, a longtime cause among transit advocates. It is worth remembering that, according
to our hypothesis, stop spacing would be expected to show an inverse relationship with efficiency
measures; that is, closer (smaller) stop spacing should make for lower costs.
Perhaps the most important result is that YR systems outperform CR systems on the cherished
operating efficiency measure of passengers carried per vehicle hour. Though the difference does
not achieve significance at a conventional level of confidence (p=.126), that level is hard to
achieve with such small sample sizes. The YR mean (58.80) is well above that of CR (46.06), and
indeed, aside from one low outlier (DCTA), the entire distribution of YR systems lies above the
CR mean. The single most heavily used system in the country, though, is #4, Caltrain on the San
Francisco Peninsula, a strong corridor anchored by San Francisco on one end and San Jose on the
other, running through Silicon Valley in between. On the whole, though, YR systems appear to make
better use of their equipment than do CR systems. This is not a surprise given that CR systems
often run long trains at off-peak times with only one or two cars open, since breaking up
trainsets midday is difficult, while YR systems use self-propelled cars that can be more easily
mixed and matched to meet demand.
NTD does not measure frequency of service directly (and indeed, that would be difficult
to do on a system-wide basis for systems that have more than one line). Since frequency of
service is an important determinant of efficiency of service and of passenger utility, this paper
uses the number of trains in operation on an average weekday divided by the system's overall
route mileage as a crude proxy for frequency of service. The results are interesting: YR systems
are, on the whole, slightly more frequent than CR services (a mean of .1044 trains per route mile
versus .0928), although the difference is far from statistically significant (p=.706). That is
how it should be; the promise of YR is that it can offer more frequent service at lower cost. The
single most frequent system, though, is point #16, New York and Connecticut's Metro-North
Railroad. Since it is one of the two largest systems in the country, that is not a surprise. When
measuring all 27 systems, frequency of service is strongly negatively correlated with stop
spacing (r=-.705). In other words, systems with closer stop spacing generally have more frequent
service, although it is hard to state the direction of causation. Frequency is also moderately
associated with lower operational expense on one measure, expense per passenger trip (r=-.442), a
potentially important result. However, regression shows that slopes related to the crude
frequency proxy used here generally struggle to achieve statistical significance, so a more
thorough analysis using actual schedule data to more accurately estimate frequency, though
outside the scope of this project, would likely prove a strong next step.
If passengers per vehicle revenue hour indicated that YR systems are more productive,
the various dependent variables measuring operational expense show that the mode has not yet
conquered the bug of massive operational expense that plagues American commuter rail. YR's
mean operating expense per vehicle hour is higher than that of CR systems (688.82 versus 548.16),
as is almost the entire distribution (though the highest single expense belongs to Minnesota's
Northstar commuter rail, a prime example of wasteful commuter rail spending). Based on
averages and distributions, operational expense per passenger trip is virtually identical for CR
and YR systems. Cost per passenger mile, too, is much higher, both in averages and in
distribution, for YR than CR systems. In part, this is surely because YR lines are typically
shorter than CR equivalents, which typically carry passengers for long distances. The cost
efficiency measures (our dependent variables) suggest that, on the whole, YR systems have not
accomplished the cost control they have the potential to provide.
Of the variables examined, the difference in only one, operational expense per passenger
mile, achieves full statistical significance at the 95% confidence level (p=.042). One other, stop
spacing, comes very close (p=.057), while several others (passenger trips per revenue hour,
operational expense per vehicle hour, and passenger trips per vehicle hour) come close to
achieving significance at the 90% confidence level. This is a fascinating result, as it seems to
indicate that operational practices on YR systems are not very different from those on CR
systems, perhaps accounting for some of the YR mode's apparent operational inefficiencies.
Analysis of descriptive statistics and hypothesis tests allows us to analyze currently
existing differences between YR and CR systems; regression allows us to project those
differences into the future. Since all of the dependent variables are significantly correlated
with each other, and largely interchangeable in planning for overall costs, we can afford to pick
the strongest model of each type to represent overall costs. For the comparative models, those
directly measuring the differences between YR and CR systems, this takes the form of Model 4
for operational expense per unlinked passenger trip, all of whose slopes are highly
significant by the standards of this exercise, and whose R² is .544:

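\[
\text{OpEx per Pax Trip} = 16.99 + (1.437 \times \text{Stop Spacing}) - (0.277 \times \text{Trains per Route Mile}) - (0.193 \times \text{Pax Trips per Revenue Hour}) + (5.118 \times \text{TYPE})
\]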
Where TYPE is a dummy variable representing mode type, with CR=0 and YR=1. In this
function, operational expense has a positive relationship with stop spacing, meaning that as stop
spacing gets wider, expenses will go up. Expense has a negative relationship with frequency,
meaning that as frequency grows, expense goes down (although at a low rate), which would be
somewhat surprising to operators, though not to advocates. And, of course, expense goes up as
ridership goes down, which is to be expected, since expenses are largely fixed for a given level
of operation. When the mean input variables from our data are plugged into this equation, YR
expense per passenger trip comes to $14.74, and CR to $14.69, virtually identical to the means
of the variable in NTD data. Regression thus again confirms that YR has, on this measure, not
achieved the significant operational savings promised, despite higher productivity in terms of
ridership.
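The plug-in calculation described above can be reproduced with a few lines of Python. The coefficients are those of combined Model 4; the minus signs on the frequency and ridership terms are taken from the negative relationships described in the text, the function name is hypothetical, and the inputs are the rounded group means from the descriptive statistics, so results agree with the quoted figures only to within rounding.

```python
def predict_opex_per_trip(stop_spacing, trains_per_route_mile, pax_per_rev_hour, is_yr):
    """Combined Model 4: operational expense per unlinked passenger trip.

    The frequency and ridership slopes are entered as negative, following the
    relationships described in the text.
    """
    return (16.99
            + 1.437 * stop_spacing
            - 0.277 * trains_per_route_mile
            - 0.193 * pax_per_rev_hour
            + 5.118 * (1 if is_yr else 0))


# Rounded group means: stop spacing, trains per route mile, pax trips per revenue hour.
cr_estimate = predict_opex_per_trip(4.60, 0.0928, 46.06, is_yr=False)
yr_estimate = predict_opex_per_trip(2.79, 0.1044, 58.80, is_yr=True)
print(round(cr_estimate, 2), round(yr_estimate, 2))
```

The same function can be used for a what-if comparison: holding stop spacing, frequency, and ridership equal, switching `is_yr` from False to True raises the predicted expense per trip by the dummy's coefficient.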
This research also seeks to present regression models tied directly to the individual types,
to allow policymakers who have already decided on their mode type to predict costs to some
extent. Given the small sample sizes, the models struggle to achieve much significance. Of all
the YR models presented, it seems that #2 of operational expense per passenger trip is the overall
strongest. The model boasts an impressive R² of .992 and looks like this:

\[
\text{OpEx per Pax Trip} = -4.208 + (5.237 \times \text{Stop Spacing}) + (41.256 \times \text{Trains per Route Mile})
\]
This conclusion suggests that policymakers must establish a sense of what ridership will be
before seeking to measure future efficiency on a YR service. There is also significant room for
additional research on the effect of frequency and span of service on efficiency, beyond the use
of a crude proxy such as NTD is able to provide.
For CR systems, it is clear that the most reliable relationship is between stop spacing and
operational efficiency. Overall, the best model of those tested is likely model #1 of those
measuring operational expense per passenger trip. The resulting equation would be:
\[
\text{OpEx per Pax Trip} = 5.544 + (1.984 \times \text{Stop Spacing})
\]
Between R² and its adjusted equivalent, we can surmise that stop spacing accounts for around
30% of the variation in operational expense per passenger trip, a not insignificant amount. As
with the YR systems, there is clearly much more work to be done here, particularly with regard
to the effects of frequency on efficiency. Interestingly, the lack of reliable results with regard to
operational cost per passenger mile suggests that the wide variability within CR systems on
distance may make constructing cost-predictive functions difficult.
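For readers less familiar with the adjusted statistic, the sketch below shows the standard adjustment of r² for sample size and number of predictors. The sample size of 22 is the count of CR systems in Appendix A and the single predictor is stop spacing; the r² value of 0.33 is purely illustrative and is not taken from the SPSS output.

```python
def adjusted_r_squared(r_squared, n_observations, n_predictors):
    """Standard adjusted r-squared: penalizes r-squared for small samples and extra predictors."""
    degrees_of_freedom = n_observations - n_predictors - 1
    if degrees_of_freedom <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1 - (1 - r_squared) * (n_observations - 1) / degrees_of_freedom


# Illustrative only: 0.33 is a hypothetical r-squared, not the paper's result.
example = adjusted_r_squared(r_squared=0.33, n_observations=22, n_predictors=1)
print(f"adjusted r-squared: {example:.4f}")
```

With a sample this small, the adjustment removes a few percentage points of explained variation, which is why the two statistics are best read together.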
Conclusions and Further Research Needed

This analysis has come to several primary and important, but limited, conclusions:
• Hybrid rail systems can and do outperform their commuter rail counterparts on a ridership-per-vehicle-hour basis
• Hybrid rail operational costs are equivalent to or higher than commuter rail costs
• With all systems analyzed together, closer stop spacing generally correlates to more efficiency (reduced costs and higher ridership)
• While the crudeness of the representation used may obscure the results, frequency of service may also correlate with more cost-efficient service
Taken together, these conclusions point in the direction that technically-minded transit advocates
have long urged: commuter and regional rail systems in the US need significant labor reform
to increase operating efficiency.8 Systems that break the 9-to-5, peak-focused mold of typical
US commuter rail can and do perform well on ridership metrics, but they have not yet solved
the problem of high operational costs. DMU advocates often point to the mode as being
lightweight, easy, and cheap to implement, and that can be true in terms of capital costs,
although due to their rarity DMUs still often cost significantly more in the US than in Europe,
where they are more common.9 It seems, though, that hybrid rail systems have not yet broken
through the cost barrier of reducing crewing requirements, the single largest piece of the transit
expenditure puzzle.
Trying to track labor efficiency, then, is probably the single largest piece of research that
could supplement this analysis. NTD tracks a variable known as Operating Expense per
Employee Hour, but agencies are not required to report it, and in 2013 data only five agencies
did so. One potential avenue forward on this measure would be to cobble together data from
multiple years of NTD reporting and try to compile a larger sample size. Alternatively, an
ambitious researcher could try to compile the data from agencies' own annual reports and other
documentation.
8 See, for example, Alon Levy's recent post "Why Labor Efficiency is Important,"
https://pedestrianobservations.wordpress.com/2015/07/26/why-labor-efficiency-is-important/
9 See in particular https://systemicfailure.wordpress.com/2010/11/13/the-six-million-dollar-train/
The second primary way forward, as has been stated multiple times, is to better quantify the
concept of frequency of service. In times past this would have required manual examination of
timetables and schedules, and still might; but the introduction of General Transit Feed
Specification (GTFS) tools might allow automated quantification of frequency. On larger
systems with more than one route, especially those with multiple service patterns on the same
route (say, the Long Island Rail Road, which has very frequent service on the inner half of its
network and relatively infrequent service on the outer part), there would be numerous
complicating factors, but an enterprising researcher could surely make something work. A better
measure for frequency than this paper's crude proxy would likely make the models much more
robust.
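As a rough indication of what such automation might look like, the sketch below computes an average headway from GTFS-style departure times at a single station in one direction. It assumes the times have already been pulled from the departure_time field of a feed's stop_times.txt for one service day; the function names and sample times are hypothetical, and a real analysis would also need to handle service calendars, branches, and express patterns.

```python
from statistics import mean


def parse_gtfs_time(value):
    """Convert a GTFS HH:MM:SS string to seconds after the start of the service day.

    GTFS allows hours above 23 for trips that run past midnight, so the value
    is split by hand rather than parsed as a clock time.
    """
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def mean_headway_minutes(departure_times):
    """Average gap, in minutes, between consecutive departures at one stop."""
    seconds = sorted(parse_gtfs_time(time) for time in departure_times)
    if len(seconds) < 2:
        raise ValueError("need at least two departures to compute a headway")
    gaps = [later - earlier for earlier, later in zip(seconds, seconds[1:])]
    return mean(gaps) / 60


# Hypothetical morning departures from a single station, one direction.
sample_departures = ["06:00:00", "06:30:00", "07:00:00", "08:00:00"]
print(f"mean headway: {mean_headway_minutes(sample_departures):.1f} minutes")
```

A headway measured this way, averaged over stations or over peak and off-peak periods, could stand in for the trains-per-route-mile proxy used in this paper.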
It may be ironic that this statistical analysis of operational efficiency ultimately comes down
to, in large part, a qualitative rather than a quantitative measure. Yet it does seem that labor
policy (in particular, the question of how many crew members must ride a particular train) is
the single most important remaining question in the comparative analysis of hybrid rail vs.
commuter rail systems. It is a question that remains unquantified because of NTD's (lack of)
reporting practices, and one that is highly politicized. Labor unions remain extremely strong in
the railroad sector, and often provide crucial political support for transit projects. That makes any
talk of reducing crew sizes extremely touchy. Ultimately, it seems that the question of efficiency
remains not just a technical one, but a political one, perhaps even more political than technical.
Research on that front will continue in this author's senior paper.
Appendix A: Systems Studied

Service | Metro Area | Type | Dataset ID
Altamont Commuter Express | San Jose-Stockton | CR | 1
Sprinter | San Diego | YR | 2
Coaster | San Diego | CR | 3
Caltrain | San Francisco-San Jose | CR | 4
Metrolink | Los Angeles | CR | 5
Shore Line East | Connecticut Shoreline | CR | 6
Tri-Rail | Miami | CR | 7
Metra | Chicago | CR | 8
South Shore | Chicago/Northwest Indiana | CR | 9
MBTA | Boston | CR | 10
MARC | Washington, DC/Baltimore | CR | 11
Northstar | Minneapolis/St. Paul | CR | 12
River Line | Philadelphia/Trenton | YR | 13
New Jersey Transit | NYC/Trenton | CR | 14
RoadRunner | Albuquerque/Santa Fe | CR | 15
Metro-North | NYC | CR | 16
LIRR | NYC | CR | 17
Westside Express | Portland, OR | YR | 18
Keystone Service | Philadelphia/Harrisburg | CR | 19
SEPTA | Philadelphia | CR | 20
Music City Star | Nashville | CR | 21
Capital MetroRail | Austin | YR | 22
DART | Dallas | CR | 23
A-Train | Dallas | YR | 24
FrontRunner | Salt Lake City | CR | 25
Virginia Railway Express | Washington, DC | CR | 26
Sounder | Seattle | CR | 27
Appendix B: Variables
Variable | NTD Table | Description | Units | Notes
Independent Variables
VOMS | 19 | Vehicles Operated in Maximum Service: most vehicles (coaches) operated at busiest point of the day | |
VehMi | 19 | Annual Vehicle Miles | Thousands |
RevMi | 19 | Annual Vehicle Revenue (in service, carrying passengers) Miles | Thousands |
VeHr | 19 | Annual Vehicle Hours | Thousands |
RevHr | 19 | Annual Vehicle Revenue Hours | Thousands |
PaxTrips | 19 | Annual Unlinked Passenger Trips | Thousands |
PaxMiles | 19 | Annual Passenger Miles | Thousands |
NumTrains | 20 | Number of trains in operation (Average weekday) | | All lines in system
Stations | 21 | Total Number of Stations | |
RouteMiles | 23 | Round Trip Route Miles | |
StopSpace | n/a | (RouteMiles/2)/Stations | | Average stop spacing for a one-way trip (entire system)
TrainsPerRouteMile | n/a | NumTrains/RouteMiles | | Proxy for frequency
Dependent Variables
OpExVoms | 27 | Operating Expense per Vehicles Operated in Maximum Service | Single dollars |
OpExVeHR | 27 | Operating Expense per Vehicle Hour | |
OpExPaxTrip | 27 | Operating Expense per Passenger Trip | |
OpExPaxMi | 27 | Operating Expense per Passenger Mile | |
OpExEmHr | 27 | Operating Expense per Employee Hour | | Only some agencies report
PaxTripPerRevHR | n/a | Unlinked Passenger Trips Per Vehicle Revenue Hour | | Considered one of the most reliable indicators of performance efficiency
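The three derived variables can be reproduced from the raw NTD figures. The sketch below is a minimal illustration using the Caltrain row of the Appendix D dataset; the function name and return structure are assumptions made for the example. StopSpace is computed as one-way route miles per station, which is the form the tabulated values follow (for Caltrain, 76.84 / 32 ≈ 2.40 miles).

```python
def derived_variables(stations, route_miles, num_trains, pax_trips, rev_hours):
    """Compute the derived variables used in the analysis from raw NTD inputs.

    route_miles is round-trip mileage, so it is halved for one-way spacing.
    pax_trips and rev_hours are both in thousands, so their ratio is trips per hour.
    """
    one_way_miles = route_miles / 2
    return {
        "StopSpace": one_way_miles / stations,
        "TrainsPerRouteMile": num_trains / route_miles,
        "PaxTripPerRevHR": pax_trips / rev_hours,
    }


# Caltrain inputs as listed in the Appendix D dataset.
caltrain = derived_variables(
    stations=32,
    route_miles=153.68,
    num_trains=20,
    pax_trips=16384.6,
    rev_hours=187.6,
)
rounded = {name: round(value, 2) for name, value in caltrain.items()}
print(rounded)
```

Rounded to two decimals, the results match the StopSpace, TrainsPerRouteMile, and PaxTripPerRevHr entries recorded for Caltrain in the dataset.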
Appendix C: Visual Presentations of Descriptive Data
Appendix D: Analysis Dataset

Name | Type | VOMS | VehMi | RevMi | VeHr | RevHr | PaxTrips | PaxMiles | NumTrains | Stations | RouteMiles | OpExVOMS | OpExVeHr | OpExPaxTrip | OpExPaxMi | OpExEmHr | StopSpace | TrainsPerRouteMile | PaxTripPerRevHr | TypeDummy
Altamont Corridor Express(ACE) | CR | 22.00 | 944.10 | 914.70 | 28.70 | 23.30 | 940.80 | 42140.30 | 4.00 | 10.00 | 172.00 | 678709.00 | 521.10 | 15.90 | 0.40 | #NULL! | 8.60 | 0.02 | 40.38 | 0.00
North County Transit District(NCTD) | YR | 6.00 | 533.70 | 530.60 | 24.30 | 24.20 | 2000.90 | 18103.00 | 4.00 | 15.00 | 44.00 | 2454214.00 | 606.30 | 7.40 | 0.80 | #NULL! | 1.47 | 0.09 | 82.68 | 1.00
North County Transit District(NCTD) | CR | 25.00 | 1470.70 | 1392.40 | 40.50 | 35.00 | 1629.20 | 44875.30 | 4.00 | 8.00 | 82.20 | 750679.00 | 462.80 | 11.50 | 0.40 | #NULL! | 5.14 | 0.05 | 46.55 | 0.00
Peninsula Corridor Joint Powers Board dba: Caltrain(PCJPB) | CR | 100.00 | 6845.00 | 6590.70 | 199.40 | 187.60 | 16384.60 | 357919.10 | 20.00 | 32.00 | 153.68 | 1019919.20 | 511.50 | 6.20 | 0.30 | #NULL! | 2.40 | 0.13 | 87.34 | 0.00
Southern California Regional Rail Authority dba: Metrolink(M | CR | 185.00 | 13460.00 | 13162.90 | 374.20 | 338.00 | 13444.80 | 464643.10 | 37.00 | 55.00 | 777.80 | 1023318.70 | 505.90 | 14.10 | 0.40 | #NULL! | 7.07 | 0.05 | 39.78 | 0.00
Connecticut Department of Transportation(CDOT) | CR | 28.00 | 2008.90 | 1467.60 | 41.50 | 30.30 | 871.50 | 20872.20 | 6.00 | 9.00 | 101.20 | 957772.50 | 645.60 | 30.80 | 1.30 | #NULL! | 5.62 | 0.06 | 28.76 | 0.00
South Florida Regional Transportation Authority(TRI-Rail) | CR | 40.00 | 3258.00 | 3164.50 | 115.70 | 102.50 | 4201.00 | 116122.40 | 10.00 | 18.00 | 142.24 | 1451297.30 | 501.80 | 13.80 | 0.50 | #NULL! | 3.95 | 0.07 | 40.99 | 0.00
Northeast Illinois Regional Commuter Railroad Corporation db | CR | 1043.00 | 45217.40 | 43197.70 | 1458.60 | 1410.00 | 73603.20 | 1665749.70 | 141.00 | 241.00 | 975.40 | 636697.60 | 455.30 | 9.00 | 0.40 | #NULL! | 2.02 | 0.14 | 52.20 | 0.00
Northern Indiana Commuter Transportation District(NICTD) | CR | 66.00 | 3835.90 | 3736.40 | 107.50 | 104.70 | 3606.90 | 104240.20 | 14.00 | 20.00 | 179.80 | 598529.10 | 367.40 | 11.00 | 0.40 | 80.00 | 4.50 | 0.08 | 34.45 | 0.00
Massachusetts Bay Transportation Authority(MBTA) | CR | 416.00 | 22530.50 | 22072.60 | 753.60 | 742.30 | 35228.80 | 729585.70 | 63.00 | 137.00 | 776.08 | 844611.00 | 466.20 | 10.00 | 0.50 | 73.70 | 2.83 | 0.08 | 47.46 | 0.00
Maryland Transit Administration(MTA) | CR | 175.00 | 6110.90 | 5687.40 | 156.80 | 147.10 | 9030.00 | 274231.00 | 28.00 | 42.00 | 400.40 | 694907.10 | 775.80 | 13.50 | 0.40 | #NULL! | 4.77 | 0.07 | 61.39 | 0.00
Metro Transit | CR | 23.00 | 543.30 | 536.90 | 16.30 | 15.10 | 787.20 | 19877.40 | 4.00 | 7.00 | 77.90 | 771893.90 | 1087.50 | 22.60 | 0.90 | #NULL! | 5.56 | 0.05 | 52.13 | 0.00
New Jersey Transit Corporation(NJ TRANSIT) | YR | 15.00 | 1253.30 | 1230.30 | 49.70 | 49.70 | 2859.20 | 41231.10 | 12.00 | 20.00 | 69.70 | 2236150.30 | 674.30 | 11.70 | 0.80 | #NULL! | 1.74 | 0.17 | 57.53 | 1.00
New Jersey Transit Corporation(NJ TRANSIT) | CR | 1135.00 | 64130.40 | 60753.20 | 2193.40 | 1792.10 | 80136.40 | 2224999.20 | 131.00 | 164.00 | 1001.80 | 808051.30 | 418.10 | 11.40 | 0.40 | 94.40 | 3.05 | 0.13 | 44.72 | 0.00
Rio Metro Regional Transit District(RMRTD) | CR | 25.00 | 1426.70 | 1398.30 | 38.10 | 36.10 | 1089.50 | 48413.10 | 7.00 | 13.00 | 193.10 | 1083428.20 | 711.50 | 24.90 | 0.60 | #NULL! | 7.43 | 0.04 | 30.18 | 0.00
Metro-North Commuter Railroad Company, dba: MTA Metro-North | CR | 1230.00 | 73724.40 | 65213.20 | 2173.70 | 1955.20 | 83290.90 | 2501154.20 | 143.00 | 112.00 | 545.74 | 1205723.50 | 509.30 | 12.30 | 0.60 | #NULL! | 2.44 | 0.26 | 42.60 | 0.00
MTA Long Island Rail Road(MTA LIRR) | CR | 1011.00 | 74456.10 | 64819.90 | 2393.40 | 2113.10 | 99256.00 | 2161002.90 | 113.00 | 124.00 | 638.20 | 871251.60 | 493.00 | 12.90 | 0.40 | 98.40 | 2.57 | 0.18 | 46.97 | 0.00
Tri-County Metropolitan Transportation District of Oregon(Tr | YR | 4.00 | 164.30 | 162.10 | 8.50 | 7.50 | 441.90 | 3552.60 | 3.00 | 5.00 | 29.22 | 1759008.30 | 829.30 | 15.90 | 2.00 | 106.20 | 2.92 | 0.10 | 58.92 | 1.00
Pennsylvania Department of Transportation(PENNDOT) | CR | 20.00 | 2146.10 | 2146.10 | 35.90 | 35.90 | 610.20 | 44623.40 | 4.00 | 12.00 | 144.40 | 936733.80 | 521.80 | 30.70 | 0.40 | #NULL! | 6.02 | 0.03 | 17.00 | 0.00
Southeastern Pennsylvania Transportation Authority(SEPTA) | CR | 334.00 | 19990.20 | 18679.00 | 740.40 | 694.40 | 37167.70 | 502346.10 | 80.00 | 154.00 | 446.94 | 738994.10 | 333.40 | 6.60 | 0.50 | #NULL! | 1.45 | 0.18 | 53.52 | 0.00
Regional Transportation Authority(RTA) | CR | 7.00 | 205.30 | 200.00 | 8.30 | 6.70 | 252.20 | 3917.50 | 2.00 | 6.00 | 62.80 | 597208.30 | 505.40 | 16.60 | 1.10 | 70.30 | 5.23 | 0.03 | 37.64 | 0.00
Capital Metropolitan Transportation Authority(CMTA) | YR | 4.00 | 331.10 | 279.40 | 15.80 | 11.60 | 834.70 | 13281.90 | 4.00 | 9.00 | 64.24 | 3428112.30 | 868.60 | 16.40 | 1.00 | #NULL! | 3.57 | 0.06 | 71.96 | 1.00
Dallas Area Rapid Transit(DART) | CR | 23.00 | 1351.60 | 1144.50 | 55.80 | 49.50 | 2092.80 | 40170.30 | 6.00 | 10.00 | 72.30 | 1172514.90 | 483.10 | 12.90 | 0.70 | #NULL! | 3.62 | 0.08 | 42.28 | 0.00
Denton County Transportation Authority(DCTA) | YR | 8.00 | 624.60 | 598.10 | 24.30 | 22.30 | 510.70 | 7637.40 | 4.00 | 5.00 | 42.60 | 1414881.30 | 465.60 | 22.20 | 1.50 | #NULL! | 4.26 | 0.09 | 22.90 | 1.00
Utah Transit Authority(UTA) | CR | 36.00 | 5126.10 | 5068.10 | 109.50 | 99.40 | 3816.40 | 108921.20 | 9.00 | 16.00 | 174.46 | 992619.30 | 326.30 | 9.40 | 0.30 | #NULL! | 5.45 | 0.05 | 38.39 | 0.00
Virginia Railway Express(VRE) | CR | 89.00 | 2427.60 | 2081.20 | 81.00 | 66.50 | 4550.10 | 149745.10 | 32.00 | 18.00 | 161.48 | 682010.50 | 749.00 | 13.30 | 0.40 | #NULL! | 4.49 | 0.20 | 68.42 | 0.00
Central Puget Sound Regional Transit Authority(ST) | CR | 62.00 | 1671.90 | 1636.80 | 54.50 | 49.30 | 2968.00 | 64702.00 | 10.00 | 12.00 | 163.84 | 622467.80 | 707.70 | 13.00 | 0.60 | 78.00 | 6.83 | 0.06 | 60.20 | 0.00
Appendix E: SPSS Outputs

<see digital attachments>
