Casual Conversations v2 Dataset

Name: Casual Conversations v2 Dataset
Published: 2023-04-07
License: Data is available through Meta AI at https://ai.facebook.com/datasets/casual-conversations-v2-downloads/. Download the file "Casual Conversations V2 Dataset License Agreement" for full dataset terms and conditions.

Description

Casual Conversations v2 comprises 26,467 videos of 5,567 participants and is intended mainly for assessing the performance of already trained models in computer vision and audio applications, for the purposes permitted in our data license agreement. The videos feature paid individuals who agreed to participate in the project and who themselves provided labels for Age, Gender, Language/Dialect, Geo-location, Disability, Physical adornments, and Physical attributes. The videos were recorded in Brazil, India, Indonesia, Mexico, the Philippines, the United States, and Vietnam with a diverse set of adults in various categories. A group of trained annotators labeled the participants' apparent skin tone using the Fitzpatrick scale and the Monk scale, and also annotated Voice timbre, Activity, and Recording setup. Spoken words in all videos are either scripted (a sample paragraph from The Idiot by Fyodor Dostoevsky, provided with the dataset) or nonscripted (answering one of five predetermined questions).

Details

Title

Casual Conversations v2 Dataset

Variant Title

CCv2

Creator

Porgali, Bilal Data Collector (Meta AI)
Albiero, Vitor Data Collector (Meta AI)
Ryda, Jordan Data Collector (Meta AI)
Ferrer, Cristian Canton Data Collector (Meta AI)
Hazirbas, Caner Data Collector (Meta AI)

Subject

artificial intelligence
machine learning

Issued Date

2023-04-07

Version

v1

Alternate Identifiers

URL: https://ai.facebook.com/datasets/casual-conversations-v2-dataset/

Status

Published

Access Rights

Data are available through Meta AI at https://ai.facebook.com/datasets/casual-conversations-v2-downloads/.
Download the file "Casual Conversations V2 Dataset License Agreement" for full dataset terms and conditions.

Citation

Porgali, Bilal, Albiero, Vitor, Ryda, Jordan, Ferrer, Cristian Canton, and Hazirbas, Caner. Casual Conversations v2 Dataset. Inter-university Consortium for Political and Social Research [distributor], 2023-04-07. https://socialmediaarchive.org/record/18

Record Appears in

Datasets

Geographic Coverage

Brazil
India
Indonesia
Mexico
Philippines
United States
Vietnam

Platform

Facebook

Collection Modes

coded video observation

Data Formats

video: film, animation, etc.

Purpose

Assist in measuring algorithmic fairness and robustness in terms of age, gender, apparent skin tone, language/dialect, geo-location, disability, physical adornment, physical attributes, voice timbre, and activity/recording setup conditions.
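
As an illustration of the kind of measurement this purpose describes, the following minimal Python sketch computes a model's accuracy separately for each value of one attribute and reports the largest gap between groups. The field names (`age_group`, `prediction`, `label`), the helper functions, and the sample rows are hypothetical and are not taken from the dataset or its annotation files; a real evaluation would use the dataset's own annotation format and a metric suited to the task.

```python
from collections import defaultdict


def accuracy_by_group(records, attribute):
    """Return {group value: accuracy} for one annotation attribute."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        group = rec[attribute]
        total[group] += 1
        correct[group] += int(rec["prediction"] == rec["label"])
    return {group: correct[group] / total[group] for group in total}


def max_accuracy_gap(per_group):
    """Largest accuracy difference between any two groups."""
    values = list(per_group.values())
    return max(values) - min(values)


# Made-up rows for illustration only; not real dataset records.
records = [
    {"age_group": "18-30", "prediction": 1, "label": 1},
    {"age_group": "18-30", "prediction": 0, "label": 1},
    {"age_group": "31-45", "prediction": 1, "label": 1},
    {"age_group": "31-45", "prediction": 0, "label": 0},
]

per_group = accuracy_by_group(records, "age_group")
gap = max_accuracy_gap(per_group)
print(per_group, gap)
```

The same function can be called once per attribute (for example gender or apparent skin tone) to compare how evenly a trained model performs across groups.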

Design

Video recordings of individuals giving nonscripted answers to predetermined questions from a pre-approved list, together with video recordings of the same individuals reading from a scripted text.

Universe

Total number of subjects/actors: 5,567
Total number of video recordings: 26,467
Average video length: ~1 minute
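
For a rough sense of scale, the short Python sketch below derives two figures from the totals listed here. The total duration is only an estimate, because it assumes every video is exactly the stated approximate length of one minute; the variable names are illustrative.

```python
participants = 5_567
videos = 26_467
avg_minutes_per_video = 1.0  # the record gives "~1 minute", so this is approximate

# Average number of recordings contributed by each participant.
videos_per_participant = videos / participants

# Estimated total footage, assuming the approximate average length above.
approx_total_hours = videos * avg_minutes_per_video / 60

print(f"{videos_per_participant:.2f} videos per participant")
print(f"about {approx_total_hours:.0f} hours of video")
```

Under that assumption, this works out to roughly 4.75 videos per participant and on the order of 441 hours of video.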

Variables

Age (self-provided)
Gender (self-provided)
Language/Dialect (self-provided)
Geo-location (self-provided)
Disability (self-provided)
Physical adornment (self-provided)
Physical attributes (self-provided)
Voice timbre (human labeled)
Apparent skin tone (human labeled)
Activity (human labeled)
Recording setup (human labeled)

Analysis Units

media unit: video

Additional Notes

To download the dataset, please visit the following webpage: https://ai.facebook.com/datasets/casual-conversations-v2-downloads/

Related Resources

The Casual Conversations v2 Dataset, Document, Is Source Of, https://doi.org/10.48550/arXiv.2303.04838, DOI
Casual Conversations v2: Designing a large consent-driven dataset to measure algorithmic bias and robustness, Conference Presentation, Is Source Of, https://doi.org/10.48550/arXiv.2211.05809, DOI
