Data Sets
Dec 18, 2025

Greater Kuala Lumpur Mobilities

Authors
Gregory Ho Wai Son
Kelvin Ling Shyan Seng
Key Takeaways
Data Sets Overview

A multi-year research project analyzing mobility patterns, public transport performance, and accessibility in Greater KL using large-scale administrative, geospatial and qualitative data.

Disclaimer
As we transition to digital-first communication and continue building our knowledge hub, publications released before October 2025 are preserved in their original format. Publications released from October 2025 onward adopt a new, digitally friendly format for easier online reading. The official versions of earlier publications, including their original language and formatting, remain available in the downloadable PDF.

Introduction

The Greater Kuala Lumpur Mobilities (GKLMOB) project is a multi-year research initiative examining how people move across the Greater Kuala Lumpur region, with a focus on public transport performance, accessibility, and user experience. The project integrates large-scale mobility data, administrative transport datasets, geospatial analysis, and qualitative interviews to build an evidence base for transport planning and policy reform.

A central component of GKLMOB is the analysis of bus system reliability and efficiency, including punctuality, headway regularity, and spatial coverage. This is complemented by qualitative interviews with public transport users to capture lived experiences of waiting times, comfort, safety, affordability, and convenience. Together, these quantitative and qualitative components aim to identify systemic barriers to mode shift and inform interventions to improve public transport outcomes.

Outputs from GKLMOB include datasets, technical notes, performance indices, policy briefs, and interactive data visualisations designed for policymakers, researchers, and the public.

Methodology

The dataset is constructed by integrating GTFS Static schedules with GTFS Realtime vehicle position feeds published by Rapid KL. GTFS Static data provides route, trip, and schedule definitions, while GTFS Realtime supplies high-frequency observations of bus locations and movement.

Vehicle position data are collected programmatically via the GTFS Realtime API. API calls are made every 15 seconds within a daily collection window from 5:00 a.m. to 10:00 p.m., capturing bus movements during active service hours. Each API response is timestamped upon retrieval and stored as an individual vehicle position observation.
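The collection schedule described above can be sketched as follows. This is a minimal illustration, not the project's actual pipeline: it assumes the 5:00 a.m. to 10:00 p.m. window is expressed in Malaysia time (UTC+8) and treats the window as half-open. The names `in_collection_window` and `poll_times` are hypothetical, and the API request itself is left out.

```python
from datetime import datetime, time, timedelta, timezone

# Assumption: the collection window is defined in Malaysia time (fixed UTC+8).
MYT = timezone(timedelta(hours=8))
WINDOW_START = time(5, 0)    # 5:00 a.m.
WINDOW_END = time(22, 0)     # 10:00 p.m. (treated as exclusive here)
POLL_INTERVAL = timedelta(seconds=15)


def in_collection_window(moment: datetime) -> bool:
    """Return True if a timezone-aware `moment` falls inside the daily window."""
    local = moment.astimezone(MYT).time()
    return WINDOW_START <= local < WINDOW_END


def poll_times(start: datetime, count: int) -> list[datetime]:
    """Return the next `count` poll instants on a 15-second grid from `start`,
    skipping any instants that fall outside the collection window."""
    times = []
    current = start
    while len(times) < count:
        if in_collection_window(current):
            times.append(current)
        current += POLL_INTERVAL
    return times
```

Starting `poll_times` shortly before 10:00 p.m. yields the remaining polls for that evening and then resumes at 5:00 a.m. the next day, which mirrors the overnight pause in the data.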

Raw vehicle position records include vehicle identifiers, geographic coordinates, and movement attributes such as speed and bearing. Vehicle positions are matched to scheduled trips using trip identifiers from the GTFS Static feed.
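The matching step can be pictured as a lookup on `trip_id`. The sketch below is illustrative only: `match_positions_to_trips` is a hypothetical helper, the `trips` dictionary stands in for rows of the GTFS Static `trips.txt` file with made-up attribute values, and the second position uses an invented trip identifier to show what an unmatched record looks like.

```python
def match_positions_to_trips(positions, trips):
    """Attach scheduled-trip attributes to each position by trip_id.

    Returns (matched, unmatched). Scheduled attributes are prefixed with
    'sched_' so they do not overwrite fields reported by the realtime feed.
    """
    matched, unmatched = [], []
    for pos in positions:
        trip = trips.get(pos["trip_id"])
        if trip is None:
            unmatched.append(pos)
        else:
            sched = {f"sched_{key}": value for key, value in trip.items()}
            matched.append({**pos, **sched})
    return matched, unmatched


# Illustrative stand-in for trips.txt, keyed by trip_id (values are assumptions).
trips = {
    "weekday_U2000_U200002_0": {"route_id": "U2000", "service_id": "weekday"},
}

positions = [
    {"timestamp": 1735680581, "trip_id": "weekday_U2000_U200002_0", "vehicle_id": "WVB4123"},
    {"timestamp": 1735680601, "trip_id": "unknown_trip", "vehicle_id": "WVC4291"},  # hypothetical id
]

matched, unmatched = match_positions_to_trips(positions, trips)
```

Keeping the unmatched records separate, rather than dropping them silently, makes it easier to see how often the realtime feed reports trips that are absent from the static schedule.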

The resulting dataset supports analysis of bus movement patterns, service regularity, headway variability, and temporal–spatial performance metrics. The dataset represents vehicle-level operational data only and does not include passenger boarding, alighting, or load information.
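As one example of a headway variability measure, the sketch below computes the coefficient of variation of headways from a list of times at which successive buses pass a fixed point. It is an assumption-laden illustration: the function names are hypothetical, and deriving passage times from raw GPS pings is a separate step not shown here.

```python
from statistics import mean, pstdev


def headways(passage_times):
    """Return gaps (in seconds) between consecutive passage times."""
    ordered = sorted(passage_times)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def headway_cv(passage_times):
    """Coefficient of variation of headways: standard deviation / mean.

    0 means perfectly regular service; larger values mean more bunching
    and longer gaps. Needs at least three passage times (two headways).
    """
    gaps = headways(passage_times)
    if len(gaps) < 2:
        raise ValueError("need at least three passage times")
    return pstdev(gaps) / mean(gaps)
```

For instance, buses passing at 0, 600, 1200 and 1800 seconds give a coefficient of 0, while passages at 0, 300 and 1200 seconds give headways of 300 and 900 seconds and a coefficient of 0.5.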

Caveats

Data collection is limited to the 5:00 a.m. to 10:00 p.m. window and does not capture late-night or early-morning services operating outside this period.

GTFS Realtime data availability depends on the upstream API provided by Rapid KL. There were periods when the API was temporarily unavailable or unresponsive, resulting in short-term gaps in data collection.

Changes to the structure or schema of the GTFS Realtime feed occurred during the collection period. These changes required updates to the data ingestion pipeline and may have introduced gaps in the dataset, typically spanning several hours or, in some cases, multiple days.

Data gaps are not systematically imputed. Users should account for periods of missing data when conducting temporal analyses or aggregating statistics over time.
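One way to account for missing periods is to list them before aggregating. The sketch below scans observation timestamps (assumed to be epoch seconds) and reports stretches with no data. The function name `find_gaps` and the 60-second threshold, equal to four missed 15-second polls, are assumptions chosen for illustration.

```python
def find_gaps(timestamps, max_gap_seconds=60):
    """Return (gap_start, gap_end, duration_seconds) for every pair of
    consecutive observation times further apart than max_gap_seconds."""
    ordered = sorted(set(timestamps))
    gaps = []
    for earlier, later in zip(ordered, ordered[1:]):
        duration = later - earlier
        if duration > max_gap_seconds:
            gaps.append((earlier, later, duration))
    return gaps
```

Applied across several days, this will also report the nightly pause outside the collection window, so those expected gaps need to be separated from genuine outages.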

Vehicle position data reflects reported GPS locations and may be affected by signal loss, reporting delays, or device-level inaccuracies.
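A simple screen for such GPS artefacts is to compare consecutive pings from the same vehicle and flag pairs whose implied speed is not physically plausible for a bus. The sketch below uses the haversine great-circle distance. The helper names and the 120 km/h ceiling are assumptions for illustration, not part of the dataset's processing.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two WGS84 points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def implied_speed_kmh(prev, curr):
    """Speed implied by two (timestamp, latitude, longitude) pings.

    Returns None when the timestamps are not strictly increasing.
    """
    elapsed = curr[0] - prev[0]
    if elapsed <= 0:
        return None
    distance = haversine_km(prev[1], prev[2], curr[1], curr[2])
    return distance / (elapsed / 3600)


def is_plausible(prev, curr, max_speed_kmh=120.0):
    """Return False when the implied speed exceeds the assumed ceiling."""
    speed = implied_speed_kmh(prev, curr)
    return speed is not None and speed <= max_speed_kmh
```

Flagged pairs are better reviewed than deleted outright, since a long reporting delay can produce the same signature as a position error.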

The dataset contains vehicle-level operational data only and does not capture passenger boarding, alighting, or load factors.

Metadata

Data Source(s): Rapid KL, GTFS Static and GTFS Realtime API
Last Updated: 5th January 2026
Frequency: Monthly
Format: Parquet

Datasets

Rapid KL Bus Positions

Dataset Name

vehicle_positions

Dataset Brief Description

This dataset is derived from the General Transit Feed Specification (GTFS) Realtime feed for the Rapid KL bus service, obtained from Malaysia’s Official Open API. It contains raw GPS-based vehicle position records for Rapid KL buses operating on scheduled routes. Each row represents a single GPS ping from a bus at a specific timestamp, and each column is defined in the Column Definitions section.

The dataset is updated monthly and can be downloaded in Parquet format from the Download Dataset section.

Download Dataset

Bulk Download
Download all 2025 data (ZIP)
Includes all monthly Parquet datasets
Monthly Downloads (2025)
Data Previews
| timestamp | trip_id | route_id | latitude | longitude | vehicle_id | license_plate | bearing | speed | start_time | start_date |
|---|---|---|---|---|---|---|---|---|---|---|
| 1735680581 | weekday_U2000_U200002_0 | U2000 | 3.245983 | 101.727005 | WVB4123 | WVB4123 | 0 | 0.00 | 05:30:00 | 2025-01-01 |
| 1735680601 | weekday_U1800_U180002_0 | U1800 | 3.193283 | 101.694176 | WVC4291 | WVC4291 | 304 | 18.52 | 05:30:00 | 2025-01-01 |
| 1735680613 | weekday_U4210_U421002_0 | U4210 | 3.144493 | 101.732994 | WVP2522 | WVP2522 | 265.29 | 12.41 | 05:20:10 | 2025-01-01 |
| 1735680600 | weekday_U3000_U300002_0 | U3000 | 3.134097 | 101.770930 | VFG2095 | VFG2095 | 270.9 | 24.08 | 05:19:39 | 2025-01-01 |
| 1735680611 | weekday_U2000_U200002_0 | U2000 | 3.246000 | 101.727005 | WVB4123 | WVB4123 | 0 | 0.00 | 05:30:00 | 2025-01-01 |
| 1735680609 | weekday_U3030_U303002_0 | U3030 | 3.132257 | 101.788376 | WVL734 | WVL734 | 32.4 | 27.78 | 05:30:09 | 2025-01-01 |
| 1735680614 | weekday_U2200_U220002_0 | U2200 | 3.226933 | 101.744660 | WUU6711 | WUU6711 | 218 | 25.93 | 05:30:14 | 2025-01-01 |
| 1735680614 | weekday_U2500_U250002_0 | U2500 | 3.205170 | 101.732030 | WVD4028 | WVD4028 | 310.1 | 20.74 | 05:30:14 | 2025-01-01 |
| 1735680654 | weekday_U4500_U450002_0 | U4500 | 2.956690 | 101.792150 | VHB446 | VHB446 | 53.4 | 17.39 | 05:30:50 | 2025-01-01 |
| 1735680631 | weekday_U1800_U180002_0 | U1800 | 3.195167 | 101.693990 | WVC4291 | WVC4291 | 349 | 16.67 | 05:30:00 | 2025-01-01 |
Column Definitions

| Column Name | Data Type | Description | Valid Values / Units | Example Value |
|---|---|---|---|---|
| timestamp | integer | Timestamp of vehicle position observation | Unix epoch seconds (UTC) | 1747689056 |
| route_id | string | Vehicle route | Alphanumeric | U2020 |
| trip_id | string | GTFS trip identifier | GTFS-defined | weekday_U2020_U202002_0 |
| latitude | float | Vehicle latitude (WGS84) | -90 to 90 | 3.243067 |
| longitude | float | Vehicle longitude (WGS84) | -180 to 180 | 101.70716 |
| vehicle_id | string | Unique vehicle identifier from GTFS Realtime | Alphanumeric | WUW8422 |
| license_plate | string | Reported vehicle license plate | Alphanumeric | WUW8422 |
| bearing | float | Direction of travel in degrees | 0–360 | 88 |
| speed | float | Vehicle speed | km/h | 25.93 |
| start_date | date | Service date on which the trip runs | YYYY-MM-DD | 2025-05-20 |
| start_time | string | Scheduled start time of the trip | HH:MM:SS | 05:00:00 |
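The example `timestamp` values are 10-digit integers, which suggests Unix epoch seconds rather than formatted date strings. Under that assumption, and taking Malaysia time as a fixed UTC+8 offset, a minimal conversion sketch looks like this (`to_utc` and `to_myt` are hypothetical helper names):

```python
from datetime import datetime, timedelta, timezone

# Malaysia time is a fixed UTC+8 offset with no daylight saving.
MYT = timezone(timedelta(hours=8))


def to_utc(epoch_seconds: int) -> datetime:
    """Interpret an integer timestamp as Unix epoch seconds in UTC."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


def to_myt(epoch_seconds: int) -> datetime:
    """Convert an epoch timestamp to Malaysia local time."""
    return to_utc(epoch_seconds).astimezone(MYT)


first_ping = to_myt(1735680581)  # first row of the data preview
```

Here `first_ping` is 2025-01-01 05:29:41 in Malaysia time, which lines up with that row's `start_date` of 2025-01-01 and `start_time` of 05:30:00.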

Rapid KL MRT Feeder Bus Positions

Dataset Name

vehicle_positions_feeder

Dataset Brief Description

This dataset is derived from the General Transit Feed Specification (GTFS) Realtime feed for the Rapid KL MRT Feeder service, obtained from Malaysia’s Official Open API. It contains raw GPS-based vehicle position records for MRT Feeder buses operating on scheduled routes. Each row represents a single GPS ping from a bus at a specific timestamp, and each column is defined in the Column Definitions section.

The dataset is updated monthly and can be downloaded in Parquet format from the Download Dataset section.

Download Dataset

Bulk Download
Download all 2025 data (ZIP)
Includes all monthly Parquet datasets
Monthly Downloads (2025)
Data Previews
| timestamp | trip_id | route_id | latitude | longitude | vehicle_id | license_plate | bearing | speed |
|---|---|---|---|---|---|---|---|---|
| 1736327932 | 241209010218S12 | T415 | 3.054465 | 101.786250 | VAG4673 | VAG4673 | 196 | 1 |
| 1736327929 | 241209010013S12 | T106 | 3.195752 | 101.610010 | VG6519 | VG6519 | 152 | 24 |
| 1736327922 | 241209010926S8 | T401 | 3.111718 | 101.719480 | VAG4729 | VAG4729 | 16 | 0 |
| 1736327930 | 241209010089S7 | T114 | 3.214353 | 101.639540 | VH1570 | VH1570 | 82 | 0 |
| 1736327924 | 241209010205S11 | T543 | 3.031315 | 101.677150 | VX2864 | VX2864 | 155 | 32 |
| 1736327917 | 241209010080S7 | T817 | 3.136412 | 101.672640 | VAF5750 | VAF5750 | 13 | 3 |
| 1736327906 | 241209010112S9 | T852 | 3.166052 | 101.664276 | VAJ2895 | VAJ2895 | 77 | 43 |
| 1736327912 | 241209010063S11 | T418 | 3.136834 | 101.707720 | VX2919 | VX2919 | 340 | 13 |
| 1736327925 | 241209010030S11 | T110 | 3.206914 | 101.616330 | VH4985 | VH4985 | 130 | 35 |
| 1736327913 | 241209010231S9 | T807 | 3.128712 | 101.593710 | VAJ9136 | VAJ9136 | 335 | 0 |
Column Definitions

| Column Name | Data Type | Description | Valid Values / Units | Example Value |
|---|---|---|---|---|
| timestamp | integer | Timestamp of vehicle position observation | Unix epoch seconds (UTC) | 1746134878 |
| trip_id | string | GTFS trip identifier | GTFS-defined | 250414010004S2 |
| route_id | string | Vehicle route | Alphanumeric | T101 |
| latitude | float | Vehicle latitude (WGS84) | -90 to 90 | 3.206544 |
| longitude | float | Vehicle longitude (WGS84) | -180 to 180 | 101.616330 |
| vehicle_id | string | Unique vehicle identifier from GTFS Realtime | Alphanumeric | VH1568 |
| license_plate | string | Reported vehicle license plate | Alphanumeric | VH1568 |
| bearing | float | Direction of travel in degrees | 0–360 | 106 |
| speed | float | Vehicle speed | km/h | 0 |

Credits

Author(s): Gregory Ho Wai Son, Kelvin Ling Shyan Seng  

Team: Knowledge, Innovation & Data Hub (KID)

This page was prepared and maintained by the Knowledge, Innovation & Data Hub (KID) team at Khazanah Research Institute. The team is responsible for the structuring, documentation, and ongoing maintenance of the dataset.
