06 January 2012

Peninsula Rail Corridor Census

The U.S. Census Bureau provides an astonishing array of fine-grained statistics on population and jobs along the peninsula rail corridor.  When thinking about the future of peninsula rail service, and especially in deciding quantitatively how good a proposed timetable might be, or where stations should be placed, or how HSR should mesh with Caltrain in a 'blended' scenario, the basic consideration should be where people live and work.

Annual ridership counts provide one way of planning your timetable: simply add more service to the stops that get a lot of ridership.  This becomes a self-fulfilling prophecy with ridership patterns becoming distorted by the timetable, as observed with the Baby Bullet Effect.  Teasing apart the timetable-induced distortion from the underlying (and often untapped) ridership demand is impossible, so it is necessary to go back to the raw population and jobs data to build the full picture.  That is where the census really delivers.

Where People Live

Figure 1
The 2010 census provides the most recent snapshot of the population distribution on the peninsula, on a block-by-block basis that includes over 45,000 locations in the three Caltrain counties.  By tallying up how many people live within 1/4, 1/2, 1 and 2 miles of each Caltrain station location, you can build Figure 1.  This chart reveals where the population is densely concentrated around stations (e.g. San Mateo), or sprawled out (e.g. Sunnyvale).
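The tallying step can be sketched in a few lines of Python. This is only an illustration under assumptions: the block list, the station coordinates and the function names are made up here, and each census block is treated as a single point at its centroid, which may differ from how the figures in this post were actually computed.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))

RADII_MILES = (0.25, 0.5, 1.0, 2.0)

def population_within_radii(station, blocks, radii=RADII_MILES):
    """Return {radius: population of all blocks whose centroid lies within it}."""
    totals = {r: 0 for r in radii}
    for lat, lon, pop in blocks:
        d = haversine_miles(station[0], station[1], lat, lon)
        for r in radii:
            if d <= r:
                totals[r] += pop
    return totals

# Made-up illustrative data (lat, lon, population); not census figures.
station = (37.5000, -122.3000)
blocks = [
    (37.5000, -122.3000, 100),  # at the station
    (37.5050, -122.3000, 200),  # roughly a third of a mile north
    (37.5200, -122.3000, 300),  # roughly 1.4 miles north
    (37.6000, -122.3000, 400),  # several miles north, outside every ring
]
print(population_within_radii(station, blocks))
```

The counts are cumulative, so the 2-mile total includes everyone already counted in the smaller rings.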

Observations on the population numbers:
  • The new Oakdale station long proposed by San Francisco (with little support from Caltrain) could tap into more residential population than just about any other stop along the peninsula, or even 22nd Street.
  • The population density doesn't suddenly drop off at the southern end of the Caltrain-owned right of way in San Jose, where service suddenly drops off.  There are large concentrations of under-served population within a mile of the Tamien and Capitol stops, accounting for more than 3 times as many people as live within a mile of the San Jose Diridon station.
  • A stop like Broadway (Burlingame) with zero weekday rail service has more people living near it than Millbrae, site of the all-important BART intermodal station.  Other stations with poor Caltrain service (San Antonio, Cal Ave, San Bruno, Burlingame, Belmont, Santa Clara) have more people living nearby than stops with the best service, such as Palo Alto.
Figure 2
To assign to each station location a single weighting factor that quantifies that station's accessibility for nearby residents, regardless of distance, one can sum up each person divided by the square of how far away they live.  This inverse-square relationship is empirical, but captures the fact that people who live far away from a station are less likely to use it; its use in ridership modeling is not unprecedented.  A 1/r law would fall off too slowly, with the same number of people using the station from 1/2 mile away as 2 miles away (assuming constant population density).  A 1/r cubed law would fall off too quickly, with only 1/16th as many riders from 2 miles away as from 1/2 mile away.  As it turns out, the precise value of the exponent--if not exactly two--doesn't really drive the relative weights that strongly.  Only one small tweak has been applied to prevent people who live very close to a station from skewing the results: anyone living closer than 1/4 mile is considered 1/4 mile away.  The resulting inverse-square population weights for each station location are shown in Figure 2.
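As a rough sketch of the weighting just described, the following Python sums population divided by distance squared, with the 1/4 mile floor applied. The function name and the sample numbers are illustrative assumptions, and the exponent is left as a parameter so that the 1/r and 1/r cubed alternatives can be tried.

```python
MIN_DISTANCE_MILES = 0.25  # anyone closer is treated as living 1/4 mile away

def inverse_square_weight(people_at_distance, exponent=2.0,
                          floor=MIN_DISTANCE_MILES):
    """Sum population / distance**exponent, clamping distance to the floor."""
    total = 0.0
    for population, distance_miles in people_at_distance:
        r = max(distance_miles, floor)
        total += population / r ** exponent
    return total

# Made-up (population, distance in miles) pairs for a single station.
sample = [(100, 0.1), (200, 0.5), (400, 2.0)]
print(inverse_square_weight(sample))                # inverse-square weight
print(inverse_square_weight(sample, exponent=1.0))  # 1/r alternative
```

In the sample, the 100 people living 0.1 miles away are counted as if they lived at 0.25 miles, so they contribute 100 / 0.0625 = 1600 rather than 100 / 0.01 = 10000.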
Where People Work

Figure 3
The Census Bureau publishes extensive statistics on local employment dynamics, providing block-by-block data on the number and distribution of jobs, pay levels, and industries.  The latest data set as of this writing is from 2009 (based on geographical data from the 2000 census covering over 32,000 locations in the three Caltrain counties).  The analysis presented here is based on raw data files, but the data can also be analyzed interactively using the Census Bureau's OnTheMap application.  Figure 3 shows how many jobs are located within 1/4, 1/2, 1 and 2 miles of each Caltrain station location.  Only jobs paying more than $40k a year are shown, since lower-income jobs are less likely to require commuting (only about 15% of Caltrain riders earn less than $40k, and the average household income of a weekday peak Caltrain rider is over $100k).

Observations on the jobs numbers:
  • Not so surprisingly, there is a concentration of jobs in the vicinity of the future Transbay Transit Center, adjacent to the financial district.  What is more surprising is just how massive that concentration is: Transbay has more jobs within a half-mile radius (over 100,000) than all the other Caltrain stations combined, from 4th & King all the way down to Gilroy!
  • Job sprawl shows up in Santa Clara and southern Palo Alto (and most of Silicon Valley, really) in the form of few jobs near stations but many jobs within a mile or two.  Mountain View, despite its status as a major Baby Bullet stop, and home of Google, is not a particularly large job center.
Figure 4
Again, assigning to each station a weighting factor that quantifies that station's accessibility to nearby jobs, we apply the same inverse square relationship to obtain the job weights for each station location shown in Figure 4.  Note that Transbay goes way off the chart.
The Ridership Potential Matrix

Since 86% of riders during the weekday peak are commuters, the distribution of population and jobs can be used to construct a relative weight for the ridership that could potentially be generated between any given origin and destination (O&D) pair.  This is the ridership potential matrix.  The eventual purpose of this matrix is to help derive a single figure of merit for timetables, on an apples-to-apples basis, for how much of the potential ridership is tapped based on the service metrics for each O&D pair.  When considering any given timetable, this weighting scheme ensures that O&D pairs that have a lot of population and jobs at each end (such as 4th & King and Palo Alto) are given more importance compared to O&D pairs with lower population and fewer jobs (such as Atherton and Bayshore).

It is important to note that this ridership potential matrix is completely independent of how each O&D pair is connected by rail service; it holds true for any timetable.  It is solely a product of census data and the geographic location of each station.  A timetable must then be designed to unlock the maximum potential ridership.

The ridership potential matrix works like this: take for example station 1 and station 2, with respective population and job weights P1, P2, J1 and J2.  The weight for morning peak trips from origin 1 to destination 2 is P1*J2 (for people living near station 1 and working near station 2).  Conversely, the weight for morning peak trips from origin 2 to destination 1 is P2*J1 (for people living near station 2 and working near station 1).  When you multiply all the population weights from Figure 2 by all the job weights from Figure 4, you get a basic ridership potential matrix.  But there's a bit more to it than just people and jobs.
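Multiplying every population weight by every job weight is an outer product. A minimal Python sketch, with made-up weights for two hypothetical stations and an illustrative function name:

```python
def basic_potential_matrix(pop_weights, job_weights):
    """Morning-peak matrix: entry [i][j] = P_i * J_j (origin i, destination j)."""
    return [[p * j for j in job_weights] for p in pop_weights]

# Made-up weights for two stations (index 0 is station 1, index 1 is station 2).
P = [3.0, 1.0]   # population weights P1, P2
J = [2.0, 5.0]   # job weights J1, J2

matrix = basic_potential_matrix(P, J)
print(matrix[0][1])  # P1 * J2: live near station 1, work near station 2
print(matrix[1][0])  # P2 * J1: live near station 2, work near station 1
```

The diagonal entries (same origin and destination) are left in place here; they carry no meaning as trips.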

Distance Considerations

Regardless of where people live and work, there are upper and lower limits to how far they will typically commute by rail.  Extremely short trips are less likely because of the overhead of access and egress to and from the station, at each end of the journey.  Conversely, extremely long trips are less likely because of their sheer duration; regional commute patterns are not just a function of train service considered in isolation, but also of driving times.  That's why we will make the assumption that the distance distribution of commutes, generally speaking, is independent of the quality of train service--and that no foreseeable rail service pattern could significantly alter it.  Good service might lead to greater market share for rail, but the underlying distance distribution will be assumed not to budge.  This allows us to apply a (timetable-independent) distance distribution to the ridership potential matrix.

Figure 5
Caltrain ridership surveys show that the average trip length on the peninsula rail corridor during the weekday peak is about 25 miles.  The distance weighting function will be modeled as a Rayleigh distribution with a value of 0 at 0 miles and a peak of 1 at 25 miles-- for no particular statistical reason other than it ends up looking about right, as shown in Figure 5.
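One way to write such a function is to take the Rayleigh shape x * exp(-x^2 / 2), with x the trip length divided by 25 miles, and rescale it so that the maximum equals 1. The sketch below assumes that scaling; the exact parameters behind Figure 5 are not stated, so treat the function name and constants as illustrative.

```python
import math

PEAK_MILES = 25.0  # trip length at which the weight peaks

def distance_weight(miles, peak=PEAK_MILES):
    """Rayleigh-shaped weight scaled so weight(0) = 0 and weight(peak) = 1."""
    x = miles / peak
    return x * math.exp((1.0 - x * x) / 2.0)

for d in (0, 5, 10, 25, 40, 50):
    print(d, round(distance_weight(d), 3))
```

The weight rises almost linearly for short trips, reaches 1 at 25 miles, and decays for longer trips.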

Each element of the ridership potential matrix is now the product of three factors: the distance weight based on the distance between origin and destination; the population weight at the origin station; and the job weight at the destination station.  This simple formulation yields the morning peak values shown in Figure 6 as a bubble graph (numerical values are available as a tab-delimited text file).  The evening peak is described by the transpose of the matrix, i.e. origin and destination switch places.  The distance-weighted ridership potential matrix is now ready for use in the quantitative analysis of past, present and future timetables, a topic that will be covered in upcoming posts revisiting the topic of service metrics.
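The three-factor product and the evening transpose can be sketched as follows. The station mileposts and weights are made up, distance is taken as the difference in milepost along the line (an assumption), and the distance weight is a Rayleigh shape scaled to peak at 1 at 25 miles.

```python
import math

def distance_weight(miles, peak=25.0):
    """Rayleigh-shaped weight: 0 at 0 miles, 1 at `peak` miles."""
    x = miles / peak
    return x * math.exp((1.0 - x * x) / 2.0)

def morning_matrix(mileposts, pop_weights, job_weights):
    """Entry [i][j] = distance weight(i, j) * P_i * J_j for origin i, destination j."""
    n = len(mileposts)
    return [
        [distance_weight(abs(mileposts[i] - mileposts[j]))
         * pop_weights[i] * job_weights[j]
         for j in range(n)]
        for i in range(n)
    ]

def transpose(matrix):
    """Swap origins and destinations to get the evening peak."""
    return [list(row) for row in zip(*matrix)]

# Made-up values for three hypothetical stations.
mileposts = [0.0, 25.0, 50.0]
pop = [1.0, 2.0, 3.0]
jobs = [10.0, 1.0, 2.0]

am = morning_matrix(mileposts, pop, jobs)
pm = transpose(am)
print(am[1][0])  # morning: live near station index 1, work near station index 0
print(pm[0][1])  # evening: the same commuters heading home
```

Because the distance weight is zero at zero miles, the diagonal of the matrix vanishes automatically.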
Figure 6
Figure 8
Figure 7
In the meantime, we can explore other interesting aspects of the ridership potential matrix.  For example, summing the nth row together with the nth column of the matrix allows us to build a single weighting factor for the potential ridership at each stop including both the morning and evening peaks, i.e. a measure of the ridership distribution that could exist if it were tapped with excellent service, shown in Figure 7.  These weights can then be compared to the actual Caltrain ridership realized in 2011, yielding the scatter plot in Figure 8.  This comparison provides another more fundamental way (much better than historical ridership patterns) to visualize which groupings of Caltrain stops are under-served, and is amazingly accurate considering that it was constructed without ever looking at a timetable.
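The row-plus-column sum can be sketched as below. The matrix values are made up, and expressing the result as a percentage of the total is an assumption about how per-station weights might be normalized.

```python
def station_potential(matrix):
    """For each station k, add row k (trips starting there in the morning)
    to column k (trips ending there in the morning)."""
    n = len(matrix)
    return [sum(matrix[k]) + sum(matrix[i][k] for i in range(n))
            for k in range(n)]

def as_percent(values):
    """Scale a list of weights so that they sum to 100."""
    total = sum(values)
    return [100.0 * v / total for v in values]

# Made-up morning-peak matrix for three hypothetical stations.
matrix = [
    [0, 4, 1],
    [2, 0, 3],
    [5, 6, 0],
]
potential = station_potential(matrix)
print(potential)
print([round(p, 1) for p in as_percent(potential)])
```

Every trip is counted twice, once at its origin and once at its destination, so the potentials sum to twice the sum of the matrix.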

Key conclusions:
  • Access to Transbay would provide a step-change improvement in Caltrain service, with probable ridership gains of more than 25%.  Terminating any weekday peak train at 4th & King, as is inexplicably planned by Caltrain, is a huge mistake.  Agency turf battles with BART and the CHSRA regarding whether or how to pay for the downtown extension tunnel, and how to share platforms at Transbay, must be fought and won.
  • Underlying ridership demand is not accurately reflected by realized ridership, which suffers from severe timetable distortion.  Future service planning, and in particular the timetables assumed for the ongoing 'blended' operations analysis, must be based less on realized ridership and more on fresh census data--even if not using the simplified approach described here.
  • For the same reason that every Caltrain should serve Transbay (the huge concentration of jobs in San Francisco), HSR service that does not provide a one-seat ride into Transbay is a non-starter.


  1. First, this is a great analysis.

    I wonder how it looks without the TBT. It would be interesting to see what the current line looks capable of.

    Also I think you need to include some effect for transit connections. Mt View and Millbrae are going to have significant numbers of users who connect to BART/VTA. You need some recursive look at how many people live/work near the stations on those lines, with some distance penalty for the connection.

  2. Nice work Clem, I'm going to have to check out these census data sets when I get a chance.

    But I have one question: isn't using ridership counts for "average distance" weights susceptible to the same problem that afflicts timetable-planning according to ridership surveys?

    You're baking in the idea that most people are commuting between the San Mateo (17mi) - Redwood City (25mi) - Palo Alto (30mi) area and San Francisco.

    Distance weights, if used, should probably be based on something like people's willingness to commute for X amount of time -- translated into Caltrain travel distance.

  3. What a great post to start the new year. The jobs data (specifically within 0.5 miles, unless there's an effort to improve connecting transit) is particularly nice, because they practically determine the destinations of the riders. Basically, any station with a >2% relative job weighting needs to have minimum 3 tph, and all trains should stop at a station with >3%.

    Population weighting is trickier. Because there's no overtake (i.e. any station with 4 tracks and 2 platforms to allow a transfer), this makes it harder to provide a fully local trip along the corridor.

    The Caltrain schedule could use a nice refreshing now, so I'll attempt to craft a schedule without taking into account electrification, overtakes, or TBT. I'll admit that traditional peak is much easier to schedule than reverse peak.

  4. My analysis suffers from a few shortcomings and simplifications, which I might as well list here:

    (1) The inverse square weights are calculated without regard to the proximity of other nearby stations. Assigning people and jobs to the closest station is computationally easy, but the problem is that it changes the weights based on which stops are assumed to be present (e.g. Transbay or Oakdale). As it turns out, the stops are spread out evenly and far enough apart that this does not have a huge effect on the weights, but it does introduce a few unrealistic distortions, such as the high weight of Hayward Park, a stone's throw from Hillsdale. I accept this shortcoming because it makes the weighting matrix independent of the list of stops. (@Ze Ace: for today's situation, simply delete rows and columns for the stops that don't exist.)

    (2) The inverse square function may not be appropriate for both population and job weights. It's known that riders are willing to connect from much further out at the residential end of their trip than at the job end, where for example the flexibility of a car is not available. This would suggest that the exponent would be less than 2 at the home stop and more than 2 at the work stop. How much less or more? Can of worms.

    (3) The distance weighting is probably not completely timetable independent, as Matthew points out. This is borne out in ridership surveys that indicate that BB riders average ~28 miles vs. local riders ~21 miles, a ratio that corresponds very closely to the ratio of average speeds. This indicates that riders are sensitive to time, not distance. This makes a great deal of sense but breaks my aim of creating a weighting matrix that is 100% timetable-independent. They're never totally independent in reality, but I have to simplify the problem to make it tractable. I guess we can leave this refinement to the pros.

    (4) As Ze Ace points out, the effect of connecting transit and shuttles is not modeled. Again something for the pros.

    With a timetable-independent ridership potential matrix, we can compute a single score for any given timetable, something I plan to do soon (and will also be built in to the service pattern generator that Richard built). I am hoping that we can somehow build a system that can digest a timetable in text format and spit out an overall score, based on the O&D time metrics and ridership weights. Now you know my evil scheme...

  5. Clem, in your final table, there's somewhat of a sleight of hand, making all stations appear underserved. Because downtown commuters currently use 4th and King where Transbay would be more appropriate, the weight you should assign to 4th and King is larger. Of course the method is independent of the existence of other stops, but in reality people with no options (road congestion, parking difficulties) will make a transfer at the downtown end if they have to.

    The upshot of this is that Millbrae and Palo Alto would both turn out to get much more ridership than predicted. This is not surprising, since both are Baby Bullet stops, so they're made to collect ridership that would go to nearby stations in an all-local setup. But on top of that, Millbrae gets some hapless people connecting from BART, and Palo Alto gets students, who are less likely to own a car.

  6. Off topic, but for those who are interested and perhaps haven't checked the site real recently, The Infrastructurist announced it was suspending operations as of Jan. 6, 2011, apparently at least in part because its writers and editors found other work. You may want to check what's on it before it goes away entirely.

  7. @Clem

    Unless I missed it, you don't mention anything about the GO PASS program.

    As far as I have been able to determine, Caltrain really has no idea how many passengers use the GO PASS program. I'm not sure how this would affect your numbers here, but for sure, estimating GO PASS users by just doing a survey rather than having a real quantitative method leads me to believe that their passenger counts may not be up to snuff.

    Caltrain is of the opinion that they are getting a fair return on the GO PASS program, but they really don't know, and apparently don't care. I've given up trying to get real information.

  8. On the topic of ridership in South San Jose: it's not just the fact that service there is terrible, but also that it costs more, because Capitol and Blossom Hill are in Zone 5. It's cheaper to transfer to VTA, and definitely cheaper to just drive to Diridon. Tamien ridership is hurt by the lack of midday service, and the fact that the off-ramps from the 87 are pointing in the wrong direction to capture ridership from the south. Moving Blossom Hill and Capitol into Zone 4 would cost Caltrain maybe $50k in fare revenue (assuming anyone actually pays a Zone 5 fare), if ridership stays constant, but that could be made up if just 50 more people start riding as a result. It's a cheap enough experiment that it's probably worth trying, to make better use of spare capacity.

    Also, Caltrain really needs to look into banner repeaters or something like that to avoid the Delay in Block rule that slows things down on the section south of Tamien. There's literally about 3 or 4 minutes of travel time that can be recovered by some minor changes in operating practices, and some minor improvements to the signal system (to remove the DIB stations). For those who don't know what that is, it's a rule that states that if a train stops somewhere between signals, and the next signal down the line isn't visible, then once the train starts again, that signal must be assumed to be red, and therefore approached no faster than 35 mph. That's why southbound Caltrains are so slow after departing Capitol.

  9. @Alon: good point, I had no business making up a normalization of apples to oranges. I have now fixed this shortcoming with Figure 8, which I think turned out very nicely. Thanks for making me do it.

    Figure 8 also made me alter one of my conclusions. While southern SJ has a high residential density, there are very few jobs there, so Capitol and Blossom Hill don't rate particularly high after all, compared to other stops that have much higher potential.

    @Morris: While the number of GO pass holders who actually ride is unknown, they are definitely counted. Every February, Caltrain conducts a labor-intensive manual count of every passenger that enters and exits a train. That count includes GO passes, fare evaders, children who pay no fare, and even pet monkeys.

  10. As I'm realizing how hard it is to make a takt timetable with only 2 tracks (give RWC 4 tracks and my job becomes much easier), here are some things I'm trying to build into the schedule. Note that I first figure out where the people want to go (the job centers), then I attempt to provide 4tph to there from the major population centers (>2.5%).

    Also, it's important to realize that the headways at the employment stations matter. For example, if a local train and an express train arrive at Palo Alto within 5 minutes of each other, everyone will crowd on the express train. It's important to make sure that the express and local trains don't compete with each other as much as possible.

    I consider a stop with >3% a must-serve, >2% a most-stops.

    Peak (all stops after Mountain View, where driving becomes more annoying):
    All trains must stop at SF, 22nd, RWC, and PA. Most trains should stop at Millbrae, California Avenue, and surprisingly SSF (though the 2002 numbers hint at that development).

    Reverse Peak:
    All trains need to stop at RWC, PA, Sunnyvale, Lawrence, SC, and SJ. Most trains should stop at the other stations south of MP.

    It's also important to distinguish reverse peak and traditional peak ridership. When service was relatively good at Bayshore back in 2002, hardly anybody took the train to SF. On the other hand, there were over 200 reverse commuters there, a substantial amount. Same is true of San Bruno. The only way to fix this is to build the TBT, whenever that will happen.

    Finally, while we're on the subject of HSR compatibility, RWC would make a really good HSR station based on the census data.

  11. "if a local train and an express train arrive at Palo Alto within 5 minutes of each other, everyone will crowd on the express train."

    Um, how long do people who are waiting patiently at the local station have to wait for the express to arrive? I suspect they will have a very very long wait.

  12. @Caelestor: “if a local train and an express train arrive at Palo Alto within 5 minutes of each other, everyone will crowd on the express train. It's important to make sure that the express and local trains don't compete with each other as much as possible.”
    A loaded 5-car express train contains 80 kWh of kinetic energy at 79 mph, which costs $8.00 at 10 cents per kWh for each maximum speed acceleration from each stop. A two-car local’s energy costs $4.60 per maximum speed acceleration from each stop.
    It is desirable that passengers do prefer express trains over local trains for two reasons: (1) It costs less per passenger mile offered to provide express service, and (2) the faster smoother express train riding experience might encourage the passengers to ride again in the future.

  13. Great work again, Clem.

    Ze Ace already mentioned my main concern -- the need to consider transit.

    You could probably model the passenger load due to each mode of transit independently.

    It would be easy to do fits on actual data to find appropriate models for each mode, if we had data.

    Another important aspect to consider is that these travel modes are asymmetric -- a person is very unlikely to use a car at both ends of a trip involving a train commute.

    Of course, the results may not be that significantly different than what you already have...

    Plus, another way of looking at it is to say that the population is the potential, and the transit "just" needs to be filled in... so perhaps your model not taking transit into account is more correct in a forward-looking way.

  14. Here's a study that backs the assumption that residents do drive transit ridership, even if the case study is BART. ; ) It also gives tradeoffs based on replacing parking with other uses and an insight into new methods of modeling ridership.

  15. "The need to consider transit"...

    Hey, outside of SF, for all practical purposes, there is no transit.

    Even in SF you have a hard case arguing that Muni serves 22nd in a way that isn't a rounding error for Caltrain trips.

    BART in Millbrae adds a few hundred extra riders a day from south-eastern SF who are all of insensitive to high cost, insensitive to long travel times, and not rational enough to take I-280 (mostly uncongested, easy access.) Maybe we need to adjust for rider IQ? (Just kidding!)

    VTA and SamTrans offer "service" that is bad everywhere and slightly less bad in places with slightly more density. It seems like that effect is already in there! Density is a pretty good first order proxy for nearly everything related to transit.

    Dedicated, timed (unlike Muni, SamTrans, VTA, sadly) employer shuttles again can be "swept under the rug" of local employment density. If there are lots of employees those things seem to run, if not, enjoy the bike ride.

    Stanford's is the only stand-out trip generator that punches above its weight, the nice Marguerite shuttles responding to that. But once Clem starts special-casing Palo Alto, every stop will have Special Needs and it gets out of control.

    The existing level of rough and ready seems surprisingly useful, and my bet is that it is in fact as good as or better than the opaque secret models that consultancies are paid to tweak to produce the results that "justify" big projects.

  16. "since lower-income jobs are less likely to require commuting"

    Wouldn't lower-income jobs in higher-income areas be more likely to require commuting?

  17. If you want to add a variable that explains Palo Alto, look for the percentage of households with no car, and the percentage with just one car. It's used elsewhere to explain why light rail lines that go out to boonies where everyone owns a car underperform the models whereas the ones in cities, with fewer vehicles per capita, overperform.

  18. Another excellent piece of analysis, thanks for breaking this down!

    Some people have already commented on the need to account for transit connections, but I'd like to suggest another addition - ease of car access to a station. The ease of people arriving at a station from a distance cannot be measured based solely on distance - some stations, such as Millbrae, are both very close to a major freeway (101), and have underused parking lots (although that lot is technically a BART lot, it serves Caltrain as well). When I lived on the East side of San Mateo, I would often drive up to Millbrae to head to the city, because it was a very easy drive. If Burlingame (to pick an arbitrary nearby example) had been the major bullet-stop, I probably still would have driven to Millbrae. Even though it was farther away, the reduced time spent on surface streets and cheaper and more reliable parking offset that distance handily.

    That being said, I'm really not sure how easy it would be to model that, but it may provide some interesting information about stations that draw in more distant riders who would rather split their commute between driving and riding, instead of trying to park in San Francisco.

    Keep up the great analysis! It's refreshing to see these things independently broken down.

  19. Excellent analysis!

    The data presented here, and the realization from looking at it that everywhere south of the San Francisco CBD is essentially one big suburb, prompted me to put together the following future takt-timetable:

    Caltrain A/B/C timetable

    The timetable assumes level boarding and electrification, but only two tracks throughout. I figure, if this service pattern works for the 38-Geary, why can't it work for Caltrain too? ;-)

  20. "Hey, outside of SF, for all practical purposes, there is no transit."

    That's just nonsense. One gripe I have with Clem's model is that ignoring job-end local transit is skewing his numbers. For example, Stanford runs the free Marguerite shuttle from the Palo Alto station, to around campus. The shuttle gets packed morning and evening rush-hour by Stanford employees (mostly not faculty). The buses for that service just keep getting bigger and bigger. Plus, that segment is going to grow hugely with Stanford Hospital's expansion.

    Similarly, the free shuttles to and from various hi-tech office parks -- Oracle; EA/Redwood Shores - are very well patronized, though at lower volumes.

  21. "Hey, outside of SF, for all practical purposes, there is no transit."

    As somebody who has always relied on public transportation and my own power to get around the SF Peninsula and has done so for a couple decades, I can confirm that that statement is perfectly accurate.

    Try it, for real, then report back. You have to be a complete masochist to even try, let alone persist with it.

    People in Santa Clara and San Mateo Counties use (crap-quality) transit if they are either (a) one of a handful of enviro-martyrs or (b) poor (and hence, by contemporary reasoning, deserving of nothing but the worst.)

  22. Looking at the ridership counts makes for a slightly different analysis. For instance the "February 2011 AM Peak Passenger Activity" shows that only 58% of the passengers who get on northbound trains, get off in San Francisco (22nd or 4th). Totaling the AM peak northbound and southbound riders, 34% get off in SF, so 66% of the AM Peak commute riders are going someplace else. Broken down a bit, 15% are commuting to Palo Alto, 10% to Mountain View, and about 5% to each of Millbrae, Redwood City, Menlo Park. Also of interest, 732 people get on trains in Palo Alto during AM peak, but 2364 get off.

    As a Stanford University employee I receive $300/yr for not buying a parking permit, plus free Go and VTA Passes. I assume various company incentives are also skewing the Mountain View numbers, though not quite as dramatically.

    Caltrain is NOT simply commuter transit taking people from their suburban bedroom communities to SF...

  23. Marc, I don't think you're managing to poke any holes here.

    Exclude Transbay from the mix -- as MTC, BART, Caltrain, former "Rail Transportation Chief" Bob "CBOSS" Doty, Quentin Kopp and PBQD seek to do, with all of their effort -- and much of the hugely overwhelming SF job market is positioned out of reach.

    I live in SF, I commuted on Caltrain to jobs down in sprawl hell for decades, I know full well that the trains heading south aren't empty, but if you think that Stanford Hospital expansion (I've "commuted" there, also!) is a market within three orders of the SF CBD, well, you're just not thinking straight and you're letting anecdotal personal experience trump objective reality.

    PS To be *really* cheesy with the numbers, Clem's "6.96" and "3.12" MissionBay/22nd numbers together are 12% of a cheesy hypothetical total of 81.13 (exclude Transbay's 18.87). By this cheesy figuring, the numbers vastly *overstate* non-SF ridership, contra your claims.

    PPS A reminder re density that PBQD "predicts" as many transit riders at its Milpitas shopping mall parking lot stop on their highly lucrative public-funded extension to nowhere as use the BART SF CBD stations. The level of outright systematic naked fraud committed by the parties involved is simply breathtaking.

  24. So, I crafted a reasonable 6tph takt that makes schedule reading easier.

    - A stop is about a 2-minute penalty right now.
    - North of RWC, SF and 22nd Street dwarf all other employment centers; hence there needs to be fast service there.
    - OTOH, nearly all stations south of RWC are job centers, so reverse peak needs to run local in Santa Clara County.
    - Some stations are better suited in one direction, so they only need 2 tph in one direction for now. Otherwise, every stop has 4 tph.

    Thus, here's a timed transfer takt. Caltrain probably made the right decision to put a timed transfer at RWC during the peak, but RWC is not a good timed transfer for the reverse peak. That job is better served by Millbrae, which has a fair amount of commuters coming in from BART. This schedule also avoids skip-stop in San Mateo to make everyone's life easier.

    It's easy to see who benefits, so who loses? San Jose riders obviously (but they probably want Palo Alto and RWC based on distance considerations, so it's not too bad).

  25. Caelestor, out of curiosity, why do trains skip San Antonio in your schedule?

  26. San Antonio isn't really a destination stop, but rather an origin station for people headed north. The thing about scheduling is that I had to make so many tradeoffs. Basically, the reverse peak schedule sacrifices trip times to allow more people in San Mateo County to access Mountain View, Sunnyvale, Lawrence, and Santa Clara. San Antonio is tiny compared to the four stops after it, so skipping it saves a tiny bit of time (2 minutes).

    I also made some reasonable assumptions (7% padding, 30 min second boarding times), but Caltrain may think otherwise.

    The point is, the traditional peak schedule isn't too bad. The reverse peak needs to be substantially reworked.

  27. @Richard:

    you're rapidly losing credibility, to me at least.

    The first 2 years I was in the Bay Area, I survived without a car. I even survived 2 months without even using a bicycle, after breaking an arm. (On Pasteur Drive, of all places. Drivers, meh.) Conceded, that was over a decade ago; but SamTrans still has a bus route there.

    Try a little less invective and outrageous hyperbole, and a bit more fact.

  28. Dear Mr Kiwi,

    Congratulations on being an enviro-martyr. That makes two of us. Between us we'll save the planet, somehow.

    Now look around you, look at the other people on the public agency bus (do they look like your fellow tech employees or like the dishwashers at the places you eat lunch?), count the number of cars, and try to say with a straight face that transit is a credible or attractive alternative in the sprawlburbs of the peninsula.

    You don't have to convince me. You have to convince the two million other peninsula residents that transit is an alternative in anything but name.

  29. Dear Caelestor,

    So pleased that somebody is getting some mileage out of the ultra-professional quality web timetable tool!

    And thanks for sharing your ideas. "Thinking" and "ideas" tend to be in short supply, so all contributions are appreciated and welcomed, by me at least.

  30. Richard,

    I have several former co-workers who commuted from the Peninsula to Mountain View by Caltrain. Some even took VTA south from Mtn View.

    They've all given it up. Caltrain has gotten more and more expensive, compared to driving; and more and more often, unacceptably behind schedule.

  31. Dear Jonathan,

    A few of the final years of my (thankfully past) SF-"Silicon Valley" commute were as a paying-but-not-driving part of a carpool.

    The fact that driving was over twice as fast as transit (or even bike+transit+bike) is pathetic.

    TWICE AS FAST, even when 101 was constipated!

    Even if I valued my time at $0/hour (which I do, or I wouldn't be writing blog comments!), it was also much cheaper.

    So perhaps we're working toward agreement that transit on the peninsula isn't practical, except for martyrs and the (large and increasing) economic underclass? I'd like nothing more than for this to not be true, believe me!

  32. One interesting thing I've noticed about the conversation is how the complaints are mainly coming from reverse peak riders. That's where Caltrain should first improve.

    Also, Lawrence's ridership will continue to be depressed, as it lies too close to Sunnyvale in another zone (i.e. $4 cheaper roundtrip).

  33. The Caltrain fare zone system is terrible: riding one stop south from Redwood City costs the same as riding Redwood City to SF.

    Caltrain needs to go to distance-based fares, which, given their TVM/Clipper-based ticketing, is easy to implement.

    Back when conductors sold/punched tickets on board, the zone-based fare table was more arguably a practical -- if not necessary -- simplification/approximation of a distance-based fare system.

    Caltrain has no (good) reason not to do it now. Reprogram the TVMs to issue station-to-station (instead of zone-to-zone) tickets, print up a BART-style fare matrix and be done with it. No more bizarre short-trip-discouraging/penalizing fare cliffs/steps.
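
    A station-to-station fare matrix of the kind described above can be sketched in a few lines. Everything numeric here is a made-up placeholder: the rounded distances are rough assumptions rather than official mileposts, and the base charge and per-mile rate are hypothetical, not Caltrain figures.

```python
from itertools import combinations

# Rough, rounded distances from SF in miles (assumed for illustration only).
MILEPOSTS = {
    "San Francisco": 0,
    "Millbrae": 14,
    "Redwood City": 25,
    "Menlo Park": 29,
    "Palo Alto": 30,
}

BASE_CENTS = 200       # hypothetical flat boarding charge
CENTS_PER_MILE = 15    # hypothetical distance rate


def fare_cents(origin, destination):
    """Distance-based fare in cents: flat charge plus a per-mile component."""
    miles = abs(MILEPOSTS[origin] - MILEPOSTS[destination])
    return BASE_CENTS + CENTS_PER_MILE * miles


def fare_matrix():
    """BART-style table covering every unordered station pair."""
    return {(a, b): fare_cents(a, b) for a, b in combinations(MILEPOSTS, 2)}


for (a, b), cents in fare_matrix().items():
    print(f"{a} - {b}: ${cents / 100:.2f}")
```

    With fares computed this way, a short hop from Redwood City always costs less than the long ride to SF, and the price rises smoothly with distance instead of jumping at a zone boundary.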

  34. Back in the "olden days" of conductors selling tickets on board, there were 10 fare zones. At least, that's what I infer from the old Translink validators that had buttons for zones 1-9 and "SF" on them. So fares were actually finer grained, and the problem at fare zone boundaries not as big. Also, Gilroy was still 2 zones away from San Jose, but they were smaller zones, with a proportionately much smaller fare increment. Changing the fare structure to almost double fares at the same time as a competing freeway expansion opened pretty much killed any hope of Gilroy ridership. At this point, it's both slower and more expensive than the VTA express service.

  35. Richard,

    as for "enviro-martyrs" [sic]: most of the time I rode SamTrans was when I'd broken a wrist, bicycling (on Pasteur Drive!) when I was a research assistant on soft funding. At that time, most of the other cyclists I knew were in similar straits.
    At that time, the other bus passengers I saw appeared to be retirees.

    Have you considered being less caustic to those who mostly agree with you? Or are you afflicted with the syndrome that anyone who disagrees with you is "dumb"?
    Or, to coin your own phrase, "should have their hands cut off"?

  36. @ Arcady: “Back in the "olden days" of conductors selling tickets on board, there were 10 fare zones.”

    It used to be nine zones, which did make the system finer grained, but there were still glaring disparities for some short distances.

    Prior to implementation of POP, in 2003, Caltrain conducted a “study” to revamp the zone fare system. The *only* reason for this so-called “study” was to determine how to make the Caltrain fare from Millbrae (to SF) comparable to the (soon to be open) overpriced BART fare from Millbrae to SF, so they came up with 13 mile zones. Caltrain claimed the changes were to “simplify” the fare system because customers were “confused” by far too many zones and ticket options. The result is a person(s) travelling 25 miles pays the same fare as a person(s) travelling just 2 miles… Stupid, Stupid, Stupid……

    Some suggested point-to-point pricing, I suggested shorter 5 mile zones, but Caltrain claimed that these options would confuse riders and that TVMs could not handle such a system. Can you believe that?

  37. Too confusing and the TVMs wouldn't be able to do it? I'm sure someone somewhere, in some exotic far off location, has a complex fare structure


    Page 3 of the brochure or page 5 of the PDF.

  38. Nice analysis Clem…

    @ Clem: “A stop like Broadway (Burlingame) with zero weekday rail service has more people living near it than Millbrae, site of the all-important BART intermodal station.”

    I worked with my (Burlingame) city council (I now despise them) on the Broadway issue. Broadway was a station with decent ridership (567 in 2001), and good service including some express service. However, Caltrain instigated what I call systematic destruction of ridership at Broadway. First, they changed the zones, making Broadway the same zone as Millbrae. This took away the cost savings for southbound riders that may have chosen to use Broadway instead of Millbrae. Second, Caltrain reduced service to just once per hour, which really killed off the ridership, giving Caltrain the perfect excuse to suspend weekday service at Broadway.

    Back in 2005, Burlingame City Council also presented Caltrain with a study of census data (similar to Clem’s) showing that more people live near Broadway than near some other Caltrain stations, but to no avail.

    Also notable is that Broadway (and Burlingame) had the highest percentage of customers walking to the station to access Caltrain.

    Caltrain’s answer to the suspension of service at Broadway was to provide “robust” shuttle service between Broadway and Millbrae. The problem is that this “robust” shuttle service only ran during peak hours and, of course, there are additional operating costs associated with running the shuttle.

  39. Clem, do you (or does anyone else here) have off-peak and weekend ridership numbers? Those could provide some check on your numbers.

    That said, off-peak and weekend ridership tends to be more intercity and less commuter. I don't know how it plays in the Bay Area, but over here, Providence seems to account for one third of the people on the weekend trains to Boston, even though its actual proportion of Providence Line ridership is closer to one tenth.

    Given the convenience, usual speed, and affordability of driving (for the $104K median income of the Caltrain user), Clem’s inverse-square relationship for probable rail transit use seems plausible.
    “To assign to each station location a single weighting factor that quantifies that station's accessibility for nearby residents, regardless of distance, one can sum up each person divided by the square of how far away they live. This inverse-square relationship is empirical, but captures the fact that people who live far away from a station are less likely to use it; its use in ridership modeling is not unprecedented.”
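
    The weighting described in the quoted passage can be sketched as follows. This is a minimal illustration, not Clem's actual code: the function name, the input format, and the minimum-distance floor (added only to avoid dividing by nearly zero for blocks right on top of a station) are all assumptions.

```python
def accessibility_weight(blocks, min_distance=0.05):
    """Sum population / distance^2 over census blocks near one station.

    blocks: iterable of (population, distance_in_miles) pairs.
    min_distance: assumed floor in miles, not part of the quoted method.
    """
    total = 0.0
    for population, distance in blocks:
        d = max(distance, min_distance)  # clamp very small distances
        total += population / d ** 2
    return total


# Two hypothetical blocks of 100 people each, at 1 mile and 2 miles.
print(accessibility_weight([(100, 1.0), (100, 2.0)]))
```

    In this toy input the block at 2 miles contributes a quarter as much as the block at 1 mile, which is the falloff the inverse-square weighting is meant to capture.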
    To make the best use of the inverse-square distance relationship and induce the maximum increase in ridership from ‘transit-oriented development’, the following scheme would minimize the walking distance between many Silicon Valley offices and an open train door:
    A 1987 Peninsula Rail Transit Study, available at the Mountain View Library during 1999, suggested “rerouting peninsula rail transit through Sunnyvale and Santa Clara on the Central Expressway right-of-way”. Diverting parallel automobile traffic to the 101 Freeway and a vacated Southern Pacific right-of-way would leave a largely-car-free-corridor bisected by a CHSR-Caltrain open cut. Caltrain could offer frequent stops along this dense professional employment corridor in exchange for a low cost four-track peninsula rail right-of-way organized by a Silicon Valley Development District. (The current Brown Administration’s effort to discontinue state funding for development districts would not necessarily prevent their formation in the future.)
    Below-grade, short spans between rail attachment points (to increase the resonant frequency of the rail-wheel couple and thus reduce the susceptibility to rail corrugation), supported by ground structures different from those supporting adjacent buildings, would be required in order to minimize noise and vibration transmission to nearby structures and at least not discourage commercial real-estate entrepreneurs from constructing high-density neighborhoods close to Caltrain stations. An ideal transit-oriented development would be two equal-height tall buildings on opposite sides of a center-platform train station. Using air rights over the station, aerial walkways connecting both buildings to an elevator bank centered over the station platform would encourage their inhabitants to use transit.
    If the SF to SJ rail right-of-way were completely grade separated with four tracks through stations, and noting that train throughput along non-stop sections is several times that of local-stop sections, frequent driverless trains, mostly expresses, whose length was carefully tailored to demand could be run with little mutual interference and be affordable. For little added expense, spare driverless rolling stock could be distributed between stations along the center of an FSSF topology, set up to fill service gaps and meet demand surges the second their movement appears useful.
    When approaching expensive-to-construct subways through high-ridership areas, combining en-route trains can become a cost-effective railway system design. For example, doubling train lengths will increase rail-car throughput at a station by 87% if the dwell time is nearly equal to the close-up period. This is true if current transit industry safety margin norms constrain a train separation system that is continuously aware of all train positions. The considerable savings possible for increasing the capacity of large, heavily used systems, plus the rapidly improving effectiveness of electronic controls, suggest that moving-block high-capacity systems may become the lowest-cost train separation system available for any rail transit system. One estimate for the New York Subway is that a moving-block system's capital cost per additional rider capacity would be 2% of that of building parallel subways.

  41. Alon, Caltrain publishes pretty good ridership statistics on their website. I don't know if they break out off-peak from peak, but they definitely have weekend ridership, and that does make for an interesting comparison.

  42. As the weekend timetable has barely changed over the past decade, it's interesting to see what stations deserve more than 1tph on the weekends. There should be additional service to the following Tier 1 and 2 stops.

    Tier 1 Stops: SJ, Mtn View, Palo Alto, RWC, Millbrae, SF

    Tier 2 Stops: SC, Sunnyvale, San Antonio, California Ave, Menlo Park, Hillsdale, San Mateo, Burlingame

    Tier 3 Stops: All other stations w/ <300 riders in one direction

    Should be closed: Atherton

  43. "As the weekend timetable has barely changed over the past decade"

    Sunday used to have two hour headways before the 2004 "Baby Bullet" revamp.

    Incredible, antediluvian, but true.

    So, yeah, something changed for the better. (Doty finally got it that they weren't even saving much money by not running trains.)

    "it's interesting to see what stations deserve more than 1tph on the weekends. "

    The answer is "all of them (except those that should be closed at all times: Hayward Park, Atherton, College Park, all south of Tamien)".

    Service at less than 1tph means "no service", for all practical purposes.

    Caltrain's big problem for providing service remains overstaffing (thank you FRA, BLE, UTU) and dismal performance (thank you FRA, and non-level boarding that is 100% the fault of Caltrain staff) that means that half-hour headway service requires eight trains+crew (240 min round trip for an all-stops local) and a bunch of "conductor" and "assistant sub-auxiliary co-conductor" deadweight, not six (150 min round trip) trains and train drivers.
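
    The train-count arithmetic in the paragraph above follows from dividing round-trip time by headway. A minimal sketch, assuming no layover or spare allowance (the function name is illustrative):

```python
import math


def trains_needed(round_trip_min, headway_min):
    """Trains required to sustain a headway, ignoring layover and spares."""
    return math.ceil(round_trip_min / headway_min)


print(trains_needed(240, 30))  # 240 min all-stops local, half-hourly
print(trains_needed(150, 30))  # 150 min round trip, half-hourly
```

    The bare quotient gives 8 trains for the 240-minute round trip and 5 for the 150-minute one; the figure of six quoted above presumably includes some allowance that this sketch does not model.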

    And naturally Caltrain/PTG's genius electrification "plan" -- featuring FRA regulation, no level boarding, continued over-staffing, low-acceleration and technically obsolete non-competitively-procured trains and higher operating and maintenance costs -- does not rectify this overriding issue, which is the improvement that a multi hundred million dollar "investment" would be required to provide anywhere else in the world ... anywhere that gave even lip service to the concepts of "value for money" or "investment", that is.

  44. Note re a few of Clem's underserved stations:

    * Oakdale

    First, SF has recently "invested" (in the loosest sense of the term, more accurate descriptions being "wasted" and "thrown money at corrupt construction interests") $650 million on building a high-floor "light" rail line along Third Street in SF, paralleling Caltrain in the Mission Bay-22nd-Bayshore corridor.

    Assuming the Muni T-Third were to be operated remotely non-incompetently, it's hard to see how the outrageous capital cost of an Oakdale Caltrain stop could gain enough riders beyond those gathered by the Muni line to justify its existence.

    Second, note that without an Oakdale stop, it is possible to operate Bayshore-Mission Bay with two tracks, with few timetabling compromises. Add that station, and one gets substantial train capacity loss in the Bayshore-Mission Bay-TTT section of the line, and generally ends up with the need to quadruple track precisely where it is most expensive to do so: urban area, tunnels. Run away!

    Third, nearly all the population in the corridor is to the east of the Caltrain line, more or less precisely bisected by Third Street and hence the T-Third line, and to the south of the Oakdale location. It's possible to make the argument that a north-south transit collector line along Third connecting at Bayshore might even be a superior solution.

    So while in some abstract ideal world a new Oakdale stop would make sense, the massive practical downsides -- either in line capacity or in the capital cost to preserve line capacity -- together with a semi-plausible story about extant parallel local transit service, make me extremely dubious about the desirability of adding this stop.

    * Tamien-Capitol-Blossom Hill.

    The 800lb gorilla is that All Your Tracks Are Belong to UP starting a mile or so north of Capitol (Control Point Lick, in the midst of TOD-tastic piles of gravel.) Running more Caltrain service south of Tamien by necessity involves UP (non-)co-operation, and by necessity involves either FRA regulation or acquisition of new and non-FRA separate right of way.

    Doing anything more ambitious than operating a limited number of FRA "commuter rail" type shuttle trains Capitol-Tamien-SJ-(Santa Clara?)-(Great America?) seems wildly fiscally improbable, and doing more than that -- ie integrating with mainline, by-the-grace-of-god non-FRA mainline Caltrain traffic -- is as good as impossible.

    Of course given (a) non-UIC HSR SJ-Tamien-Capitol-Gilroy-Los Banos (highly undesirable) and (b) integrated HSR/Caltrain operations (highly desirable, but actively opposed by both Caltrain and HSR!!), the difficulties and costs shrink to "only" those of building the stations and associated stopping-train tracks and turnback-train tracks near the stations.

    Such capital costs are still in the several hundreds of millions of dollars range: again it's hard to make any economic or social justification for this level of spending. Nice to have, maybe, but one can't afford everything, and some markets are just too expensive to serve.

    Regardless of stations, the rail line itself is hardly ideally placed: hemmed in by light industrial and highways, access seems like it's always going to be a challenge and limitation.

    Lastly, there's no shortage of parallel highways and arterial roads -- 280, 101, 87, 85, 82, etc. Better-than-rail bus service isn't inconceivable.

    I don't see Caltrain ever pencilling out, at least not in the less than 30 year timeframe.