Urban Environments
Footage of cities captured at street level and from elevated or aerial vantage points, showing the built environment in everyday use: pedestrians, vehicles, signage, infrastructure, and open space. The set covers dense commercial cores, residential neighborhoods, and transit hubs across a range of cities, seasons, and times of day, giving perception and navigation models broad exposure to urban visual complexity.
Hours: 38K+
District types: 7+
Regions: 20+
Training Use Cases
✓ Urban scene understanding and semantic segmentation
✓ Pedestrian and vehicle detection in complex street environments
✓ Outdoor navigation and path planning in urban spaces
✓ Geographic and cultural diversity training for city-scene models
Key Highlights
✓ 7+ district types covering commercial, residential, industrial, transit, mixed-use, waterfront, and park zones
✓ Street-level, elevated, rooftop, and aerial perspectives all represented
✓ Global coverage across cities in multiple continents and climates
✓ Footage captured across times of day from early morning through night
Metadata Fields
duration: Length of clip in HH:MM:SS
resolution: Pixel dimensions (e.g., 1920x1080, 3840x2160)
frame_rate: Frames per second (e.g., 24, 30, 60, 120)
contains_audio: Whether the clip carries an audio track (boolean)
primary_category: Dominant content category assigned to a video
district_type: commercial | residential | industrial | transit | mixed_use | waterfront | park
capture_perspective: street_level | elevated | aerial
time_of_day: morning | afternoon | evening | night
region: Geographic region or continent of origin
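As a rough illustration of how these fields might be consumed, the Python sketch below checks one clip's metadata against the enumerated values listed above and parses the duration and resolution strings. It assumes each record arrives as a plain dictionary keyed by the field names; the delivery format is not specified here, and the function names (duration_to_seconds, parse_resolution, validate_record) are hypothetical helpers, not part of any published tooling.

```python
# Allowed values copied from the Metadata Fields list above.
ENUM_FIELDS = {
    "district_type": {
        "commercial", "residential", "industrial", "transit",
        "mixed_use", "waterfront", "park",
    },
    "capture_perspective": {"street_level", "elevated", "aerial"},
    "time_of_day": {"morning", "afternoon", "evening", "night"},
}


def duration_to_seconds(duration: str) -> int:
    """Convert an HH:MM:SS string to a total number of seconds."""
    hours, minutes, seconds = (int(part) for part in duration.split(":"))
    if hours < 0 or not (0 <= minutes < 60 and 0 <= seconds < 60):
        raise ValueError(f"invalid duration: {duration!r}")
    return hours * 3600 + minutes * 60 + seconds


def parse_resolution(resolution: str) -> tuple[int, int]:
    """Split a 'WIDTHxHEIGHT' string such as '1920x1080' into integers."""
    width, height = resolution.lower().split("x")
    return int(width), int(height)


def validate_record(record: dict) -> list[str]:
    """Return a list of problems found; an empty list means no issues.

    Assumption: the record is a dict keyed by the field names above.
    Free-form fields (primary_category, region) are not checked.
    """
    problems = []
    for field, allowed in ENUM_FIELDS.items():
        value = record.get(field)
        if value not in allowed:
            problems.append(f"{field}: unexpected value {value!r}")
    try:
        duration_to_seconds(record["duration"])
    except (KeyError, ValueError, AttributeError):
        problems.append("duration: expected HH:MM:SS")
    try:
        parse_resolution(record["resolution"])
    except (KeyError, ValueError, AttributeError):
        problems.append("resolution: expected WIDTHxHEIGHT")
    if not isinstance(record.get("contains_audio"), bool):
        problems.append("contains_audio: expected a boolean")
    return problems


# Hypothetical record, made up for illustration only.
sample = {
    "duration": "00:02:30",
    "resolution": "3840x2160",
    "frame_rate": 30,
    "contains_audio": True,
    "primary_category": "urban",
    "district_type": "transit",
    "capture_perspective": "street_level",
    "time_of_day": "evening",
    "region": "Europe",
}
print(validate_record(sample))
```

Collecting problems in a list, instead of raising on the first one, makes it easier to audit a whole batch of clips in one pass before filtering by district_type, capture_perspective, or time_of_day.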