Urban Environments

Footage of cities captured at street level and from elevated or aerial vantage points, covering the built environment in active use: pedestrians, vehicles, signage, infrastructure, and open space. The set spans dense commercial cores, residential neighborhoods, and transit hubs across a range of cities, seasons, and times of day, giving perception and navigation models broad exposure to urban visual complexity.

Hours: 38K+
District types: 7+
Regions: 20+

Training Use Cases

Urban scene understanding and semantic segmentation
Pedestrian and vehicle detection in complex street environments
Outdoor navigation and path planning in urban spaces
Geographic and cultural diversity training for city-scene models

Key Highlights

7+ district types covering commercial, residential, industrial, transit, mixed-use, waterfront, and park zones
Street-level, elevated, rooftop, and aerial perspectives all represented
Global coverage across cities in multiple continents and climates
Footage captured across times of day from early morning through night

Metadata Fields

duration: Length of clip in HH:MM:SS
resolution: Pixel dimensions (e.g., 1920x1080, 3840x2160)
frame_rate: Frames per second (e.g., 24, 30, 60, 120)
contains_audio: Whether the clip carries an audio track (boolean)
primary_category: Dominant content category assigned to a video
district_type: commercial | residential | industrial | transit | mixed_use | waterfront | park
capture_perspective: street_level | elevated | aerial
time_of_day: morning | afternoon | evening | night
region: Geographic region or continent of origin
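
The fields above could be modeled and filtered roughly as follows. This is a minimal sketch, not the dataset's actual delivery format or API: the class name ClipMetadata, the helpers duration_to_seconds and select_clips, the Python types chosen for each field, and the sample records are all assumptions made for illustration. Only the field names and the enumerated values come from the list above.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional

# Allowed values copied from the metadata field list.
DISTRICT_TYPES = {"commercial", "residential", "industrial", "transit",
                  "mixed_use", "waterfront", "park"}
CAPTURE_PERSPECTIVES = {"street_level", "elevated", "aerial"}
TIMES_OF_DAY = {"morning", "afternoon", "evening", "night"}


def duration_to_seconds(duration: str) -> int:
    """Convert an HH:MM:SS string into a number of seconds."""
    hours, minutes, seconds = (int(part) for part in duration.split(":"))
    return hours * 3600 + minutes * 60 + seconds


@dataclass(frozen=True)
class ClipMetadata:
    """Hypothetical record type; field types are assumed, not documented."""
    duration: str              # HH:MM:SS
    resolution: str            # e.g. "1920x1080"
    frame_rate: float          # frames per second
    contains_audio: bool
    primary_category: str
    district_type: str
    capture_perspective: str
    time_of_day: str
    region: str

    def __post_init__(self) -> None:
        # Reject values outside the enumerations listed for each field.
        checks = (
            ("district_type", self.district_type, DISTRICT_TYPES),
            ("capture_perspective", self.capture_perspective, CAPTURE_PERSPECTIVES),
            ("time_of_day", self.time_of_day, TIMES_OF_DAY),
        )
        for name, value, allowed in checks:
            if value not in allowed:
                raise ValueError(f"unexpected {name}: {value!r}")


def select_clips(
    clips: Iterable[ClipMetadata],
    *,
    district_type: Optional[str] = None,
    time_of_day: Optional[str] = None,
    min_seconds: int = 0,
) -> List[ClipMetadata]:
    """Keep clips matching the given filters; None means 'do not filter'."""
    return [
        clip
        for clip in clips
        if (district_type is None or clip.district_type == district_type)
        and (time_of_day is None or clip.time_of_day == time_of_day)
        and duration_to_seconds(clip.duration) >= min_seconds
    ]


# Hypothetical records for illustration only; not taken from the dataset.
clips = [
    ClipMetadata("00:02:15", "1920x1080", 30, True, "urban",
                 "transit", "street_level", "night", "Asia"),
    ClipMetadata("00:00:40", "3840x2160", 60, False, "urban",
                 "transit", "aerial", "night", "Europe"),
    ClipMetadata("00:05:00", "1920x1080", 24, True, "urban",
                 "park", "elevated", "morning", "Europe"),
]

# Night-time transit clips that run for at least one minute.
night_transit = select_clips(
    clips, district_type="transit", time_of_day="night", min_seconds=60
)
print([clip.region for clip in night_transit])
```

Validating the enumerated fields at construction time means a mistyped value such as "mixed-use" fails at load time instead of silently dropping out of later filters.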