New York City Transit (NYCT) implemented an automated algorithm that uses transaction data from an entry-only automated fare collection (AFC) system to estimate daily unlinked bus trips, infer passenger miles, and compute average trip lengths by route. Total onboard miles are inferred from symmetries in bus passengers’ daily activity patterns. NYCT’s algorithm uses rigorously tested engineering assumptions to detect common data errors caused by mechanical failures, imperfect driver-farebox interactions, and operational reality, and it applies statistically measured adjustment factors to correct or interpolate for passengers missing because of non-AFC boardings and malfunctions. Surveys revealed that under typical operating conditions, non-AFC passengers and farebox data transmission errors accounted for 12% and 5.5% of missing ridership, respectively. The fault-tolerant algorithm uses non-geographic transaction data from an AFC system without automated vehicle locator functionality. It computes aggregate passenger miles directly by inferring origin locations from transaction time stamps under scheduled average speed assumptions, without assigning each passenger’s precise destination. NYCT focused on fully automatic, production-ready algorithms, rejecting alternatives that required excessive coding effort, processor time, difficult-to-obtain data, or manual intervention in favor of logical inference, statistical estimation, and symmetry. Meticulous parallel testing demonstrated that the resulting average trip lengths were stable across days and correlated well with manually collected stop-by-stop ridership data. Annual passenger miles were within -1% to +4% of the National Transit Database (NTD) ±10% sample data and were approved by FTA for NTD Section 15 submission.
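The two inference steps described above can be illustrated with a minimal Python sketch. It is one simplified reading of the idea, not NYCT's production algorithm: the function names, the single constant speed per trip, and the mirroring of opposite-direction boardings to stand in for alightings are all assumptions made here for illustration, and the survey-based adjustment factors are deliberately left out.

```python
from datetime import datetime


def infer_boarding_mile(tap_time, trip_start, speed_mph, route_miles):
    """Estimate how far along the route the bus was when the fare was paid.

    Assumption (hypothetical): the bus moves at one scheduled average speed,
    so position = speed * time elapsed since the trip started.
    """
    elapsed_hours = (tap_time - trip_start).total_seconds() / 3600.0
    # Clamp so clock drift or late transactions cannot fall off the route.
    return min(max(elapsed_hours * speed_mph, 0.0), route_miles)


def symmetric_passenger_miles(outbound_miles, inbound_miles, route_miles):
    """Aggregate outbound passenger miles without per-passenger destinations.

    outbound_miles: boarding positions measured from the outbound origin.
    inbound_miles:  boarding positions measured from the inbound origin.

    Assumption (hypothetical): daily travel is symmetric, so the places where
    riders board inbound approximate where outbound riders alighted.
    """
    if not outbound_miles or not inbound_miles:
        return 0.0
    mean_board = sum(outbound_miles) / len(outbound_miles)
    # Mirror inbound boardings into outbound coordinates.
    mean_alight = sum(route_miles - m for m in inbound_miles) / len(inbound_miles)
    # Average trip length times outbound boardings, floored at zero.
    return len(outbound_miles) * max(mean_alight - mean_board, 0.0)


if __name__ == "__main__":
    start = datetime(2010, 1, 4, 8, 0)
    tap = datetime(2010, 1, 4, 8, 15)
    print(infer_boarding_mile(tap, start, speed_mph=8.0, route_miles=5.0))
    print(symmetric_passenger_miles([1.0, 2.0], [1.0, 1.0], route_miles=5.0))
```

Working with direction-level means rather than matched origin-destination pairs is what lets a sketch like this avoid assigning a destination to each passenger; a real implementation would also need the error detection and ridership adjustments the abstract describes.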
Transportation Research Record: Journal of the Transportation Research Board, 2216(-1). DOI: 10.3141/2216-03