Video

Talking Head

People speaking to camera or across from an interviewer, framed tightly with controlled lighting that isolates face and voice from surroundings. The footage captures clean audio alongside unobstructed front-of-face framing across a wide range of speakers.

Hours: 449K+

Countries: 30+

Languages: 10+

Training Use Cases

✓ Audio-driven facial animation and lip-sync

✓ Avatar and digital human generation

✓ Visual speech recognition and audio-visual ASR

✓ Expression and affect recognition during natural speech

Key Highlights

✓ 30+ countries of origin and 10+ languages, including English, Spanish, Mandarin, French, German, Hindi, Japanese, and Arabic

✓ 5+ framing conventions: tight close-up, medium close, two-shot, over-the-shoulder, and multi-camera cut coverage

✓ Solo direct-to-camera and two-person sit-down conversation formats across the set

✓ Controlled lighting throughout, with face and voice prioritized over surroundings

Metadata Fields

duration: Length of clip in HH:MM:SS

resolution: Pixel dimensions (e.g., 1920x1080, 3840x2160)

frame_rate: Frames per second (e.g., 24, 30, 60, 120)

contains_audio: Whether the clip carries an audio track (boolean)

primary_category: Dominant content category assigned to a video

style: tight_close_up | medium_close | two_shot | over_the_shoulder | multi_camera_cut

conversation_format: solo_to_camera | one_on_one_interview | sit_down_two_shot | hosted_segment

speaker_count: 1 | 2

language: Primary spoken language (ISO 639-1 code)

country_of_origin: Country where footage was produced (ISO 3166 code)
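
The fields above could be checked on ingestion with a few lines of Python. This is a minimal sketch, not the dataset's actual delivery format: it assumes each clip's metadata arrives as a plain dictionary keyed by the field names listed here, and the helper names (`duration_to_seconds`, `parse_resolution`, `validate_record`) and the sample record are hypothetical.

```python
# Allowed values copied from the `style` and `conversation_format` fields above.
STYLES = {
    "tight_close_up", "medium_close", "two_shot",
    "over_the_shoulder", "multi_camera_cut",
}
CONVERSATION_FORMATS = {
    "solo_to_camera", "one_on_one_interview",
    "sit_down_two_shot", "hosted_segment",
}


def duration_to_seconds(duration: str) -> int:
    """Convert an HH:MM:SS string to a whole number of seconds."""
    hours, minutes, seconds = (int(part) for part in duration.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def parse_resolution(resolution: str) -> tuple[int, int]:
    """Split a 'WIDTHxHEIGHT' string such as '1920x1080' into two integers."""
    width, height = resolution.lower().split("x")
    return int(width), int(height)


def validate_record(record: dict) -> list[str]:
    """Return a list of problems found in one metadata record (empty if none)."""
    problems = []
    if record.get("style") not in STYLES:
        problems.append(f"unknown style: {record.get('style')!r}")
    if record.get("conversation_format") not in CONVERSATION_FORMATS:
        problems.append(
            f"unknown conversation_format: {record.get('conversation_format')!r}"
        )
    if record.get("speaker_count") not in (1, 2):
        problems.append(f"speaker_count must be 1 or 2: {record.get('speaker_count')!r}")
    if not isinstance(record.get("contains_audio"), bool):
        problems.append("contains_audio must be a boolean")
    return problems


# Hypothetical record, invented for illustration only.
example = {
    "duration": "00:01:30",
    "resolution": "1920x1080",
    "frame_rate": 30,
    "contains_audio": True,
    "style": "medium_close",
    "conversation_format": "solo_to_camera",
    "speaker_count": 1,
    "language": "en",
    "country_of_origin": "US",
}

print(validate_record(example))
print(duration_to_seconds(example["duration"]))
print(parse_resolution(example["resolution"]))
```

The same checks can double as a filter, for example keeping only records where `contains_audio` is true and `speaker_count` is 1 when assembling a lip-sync training subset.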