From the point of view of a Caltrain rider, quality can be defined in terms of just four Metrics That Matter.
- What is the average trip time between my origin and my destination?
- What is the best trip time between my origin and my destination?
- At my origin, how long is the average waiting time between trains that go to my destination?
- At my origin, what is the longest waiting time between trains that go to my destination?
Figuring the Metrics
The four metrics that matter are objective measures that are reasonably straightforward to extract from a timetable. A computer can churn through a timetable to extract the metrics for every possible origin & destination pair, by making a few simple assumptions about rider behaviors as shown in the graphic at left.
For example, we can crunch the current Caltrain timetable (with 90 trains per weekday and 5 trains per peak hour), with the result shown at right. For simplicity, the graphic shows a limited sample of ten stations; simply follow the blue lines to find the intersection of the desired origin and destination, and read off the four metrics. A more complete version of this table is provided below.
Effective Trip Time
The four metrics are useful to consider separately, but ultimately we'd like to compare entire timetables to determine which timetable is better. To do this, we need to consolidate the four metrics that matter into a single "effective trip time" metric for each origin and destination pair. Thus far, the four metrics required very few assumptions and could be quantified quite objectively. As we construct an effective trip time metric, things get a bit more subjective and debatable.
The effective trip time is not the trip time experienced by any particular passenger; rather, it is a single figure of merit that reflects a global average of trip times taken by all passengers between a given origin and destination.
If a passenger showed up randomly, the effective trip time would simply be the average trip time plus 50% of the average wait time. However, most passengers don't show up randomly. They tend to show up before their train, and they also tend to prefer faster express trips. Therefore, we can create a reasonable measure of effective trip time as the sum of:
- 70% of the average trip time
- 30% of the best trip time (to favor express service)
- 20% of the mean wait between trains (far less than the random arrival figure of 50%)
- 15% of the maximum service gap (to penalize very large gaps between trains)
Notice that the effective trip time metric is constructed so as to punish very large service gaps. Due to the speed / frequency trade-off that Caltrain currently must contend with, many smaller stops are severely under-served during rush hour, commonly with gaps of 40 minutes or more, to clear the tracks for Baby Bullets. This is reflected in the table: for example, Palo Alto to 4th & King (the highest traffic and best-served O&D pair) is covered in 47 minutes, but the similar distance between California Ave and 22nd Street (an under-served O&D pair) effectively takes 67 minutes.
Whether or not you agree with the exact weighting used to construct the effective trip time metric, the fact remains that with some optimally chosen weights, the metric is a valid one. The numbers can easily be recalculated using different weighting assumptions.
Measuring the Quality of an Entire Timetable
Now that we have a single number that describes the effective trip time between any given O&D pair, it's time to generalize the approach to encompass all O&D pairs and to construct a single figure of merit that captures the service performance of an entire timetable. Obviously, not all O&D pairs can be served optimally: any timetable is inherently a trade-off between minimizing trip times for most riders at the cost of longer trip times for some riders. Figuring out what works best thus requires ridership weighting.
Ridership weights can be derived from actual weekday ridership data, shown as blue bars in the chart at right (these weights have been scaled such that they add up to one). Unfortunately, actual ridership is not always an exact reflection of underlying travel demand. Some stations suffer from a vicious circle; they have poor ridership in large part because they are poorly served. A good example of such a station is California Avenue in Palo Alto, where average weekday ridership was 1225 riders in 2002 before service was cut to make way for the Baby Bullet, dropping to 891 riders in 2010. This 27% drop occurred at the same time as overall ridership increased by 19%, and is unlikely to have been caused by any shifts in employment or residential patterns in the rather thriving vicinity of this station.
The most desirable approach would be to implement a full-featured ridership model, of the sort that has recently generated so much controversy for the state-wide high-speed rail project. That is unfortunately beyond our means, so we will simply use direct ridership weighting, with some filling in where service currently isn't provided (e.g. Transbay or Atherton).
The ridership weight of an O&D pair is the product of origin ridership and destination ridership (a measure of how many people travel on that O&D pair) and is shown by the light blue circles in the figure at left. Big circles mean heavy ridership, small circles mean light ridership. An optimal timetable will seek to provide the best effective trip time for O&D pairs where a big circle represents heavy ridership.
To construct a single figure of merit for an entire timetable, we first need to come up with a new, ridership-weighted service score for each O&D pair. This score, where higher is better, is the product of origin and destination ridership (the size of the blue circle), divided by effective trip time. Dividing by effective trip time means that shorter trip times increase the score for that O&D pair. Now all that's left to do is to sum up all the O&D service scores, and presto, we've got ourselves a single figure of merit. This allows us to compare timetables and, provided that we agree on the method used to construct that figure of merit, to determine objectively which of two timetables is the better one.
Any disagreements about which is the best timetable can then be reduced to disagreements over the scoring method. Planners tend to fall in love with their favorite solution, so taking the discussion away from the solution and focusing instead on the scoring method allows a dispassionate debate that is less colored by subjective preferences. Indeed, without an agreed-upon scoring framework (clearly defined metrics), comparing timetables is a subjective and useless exercise.
The Great Timetable Shoot-Out
Enough with metrics, are we ready for some fun, or rather, as much fun as can be had with timetables? Let's put three different timetables through their paces, and see how they stack up in terms of service quality.
Contestant #1: today's 90-train-per-day, 5-train-per-hour Caltrain timetable, to which we will assign a score of 100 for the purpose of comparison.
- Input timetable file (tab delimited text)
- Metrics that matter table (318 kB PDF)
- Effective trip time table (35 kB PDF)
- Origin & Destination service score table (539 kB PDF)
- Overall service quality score: 100
- Input timetable file (tab delimited text)
- Metrics that matter table (345 kB PDF) -- also compared with Caltrain 2010 (329 kB PDF)
- Effective trip time table (35 kB PDF) -- also compared with Caltrain 2010 (38 kB PDF)
- Origin & Destination service score table (671 kB PDF)
- Overall service quality score: 147
- Input timetable (tab delimited text)
- Metrics that matter table (342 kB PDF) -- also compared to Caltrain 2010 (328 kB PDF) and compared to Caltrain 2025 (317 kB PDF)
- Effective trip time table (34 kB PDF) -- also compared to Caltrain 2010 (38 kB PDF) and compared to Caltrain 2025 (37 kB PDF)
- Origin & Destination service score table (674 kB PDF)
- Overall service quality score: 145
The secret weapon: the mid-line overtake.
The Trickle Down Effect
With the peninsula being reconfigured to four tracks for high-speed rail, it would be a terrible shame not to take advantage of some of that new track capacity to run better and more efficient local service, providing measurable benefits to local peninsula communities. Better service just might be the sugar coating to make the bitter pill of high-speed rail go down a little bit easier in places like Palo Alto, Atherton, Belmont, or Burlingame. Otherwise, why even bother?