Chicago Fingerspelling in the Wild Data Set (ChicagoFSWild)

Overview

This page hosts a collaborative data collection effort by researchers at U. Chicago and TTI-Chicago. To our knowledge, it is the first collection of American Sign Language (ASL) fingerspelling data "in the wild," that is, in naturally occurring (online) video. The ChicagoFSWild data set contains 7304 ASL fingerspelling sequences signed by 160 signers, carefully annotated by students who have studied ASL.

Citing the data set

@article{fs18slt,
  author = {B. Shi and A. Martinez Del Rio and J. Keane and J. Michaux and D. Brentari and G. Shakhnarovich and K. Livescu},
  title = {American Sign Language fingerspelling recognition in the wild},
  journal = {SLT},
  year = {2018},
  month = {December}
}

Publication

SLT'18 paper

Download

You can download the dataset here:

ChicagoFSWild.tgz (14 GB)
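
Once downloaded, the archive can be unpacked with any tar tool. The following is a minimal Python sketch, assuming the archive was saved locally as a gzip-compressed tar file; the helper name extract_archive and the paths in the usage note are illustrative and not part of the data set.

```python
import tarfile
from pathlib import Path


def extract_archive(archive_path, dest_dir):
    """Unpack a gzip-compressed tar archive and return the top-level entry names."""
    root = Path(dest_dir)
    root.mkdir(parents=True, exist_ok=True)
    root = root.resolve()
    with tarfile.open(archive_path, "r:gz") as archive:
        for member in archive.getmembers():
            target = (root / member.name).resolve()
            # Refuse entries that would be written outside dest_dir.
            if target != root and root not in target.parents:
                raise ValueError(f"unsafe path in archive: {member.name}")
        archive.extractall(root)
    return sorted(entry.name for entry in root.iterdir())
```

For example, extract_archive("ChicagoFSWild.tgz", "ChicagoFSWild") would unpack into a local ChicagoFSWild directory. The archive is scanned once before extraction, so expect this to take a while on a 14 GB file.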

Dataset format

Files are structured as follows:

  • ChicagoFSWild.csv - This is the main data description file. Each line corresponds to a single fingerspelling sequence and has the following fields:

    • filename - name of the fingerspelling sequence
    • url - URL of the video from which the sequence was obtained
    • start_time - start time of the sequence in that video, in the format HH:MM:SS.xxx
    • number_of_frames - number of frames of the fingerspelling sequence
    • width - frame width
    • height - frame height
    • label_raw - raw labels from the annotators
    • label_notes - annotator notes
    • label_proc - processed labels, used for training and testing
    • partition - partition (train/dev/test) the sequence belongs to
    • signer - signer identity for this sequence
  • ChicagoFSWild-Frames.tgz - This file contains the sequences of image frames (.jpg files), identified by the filename field in ChicagoFSWild.csv.

  • annotation_instructions.txt - This text file provides the instructions used by the annotators, which define the conventions used for the raw labels. It is provided for completeness; reproducing our results requires only the label_proc field of ChicagoFSWild.csv.

  • HandAnnotation.csv - Annotations of hand bounding boxes for a subset of the fingerspelling sequences in ChicagoFSWild, with the following fields:

    • filename - name of the fingerspelling sequence
    • partition - partition (train/dev) the sequence belongs to, used to train and tune the hand detector
  • BBox - A folder of hand bounding boxes

    • F/X.txt - hand bounding boxes in frame X of fingerspelling sequence F
      • x0, y0, x1, y1, L - top-left corner (x0, y0), bottom-right corner (x1, y1), and label L (L=1: signing hand(s), L=2: non-signing hand(s))
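
As an illustration of how the files above fit together, the following Python sketch reads ChicagoFSWild.csv, converts start_time to seconds, groups sequences by partition, and parses one bounding-box record. It assumes that the CSV is comma-delimited with a header row matching the field names listed above, and that each line of a BBox file holds five values separated by commas or whitespace; neither detail is specified here, so adjust as needed. The function names are illustrative and not part of the data set.

```python
import csv
import re
from collections import defaultdict


def parse_start_time(text):
    """Convert an 'HH:MM:SS.xxx' string to seconds as a float."""
    hours, minutes, seconds = text.strip().split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def load_sequences(csv_path):
    """Read the description file and group its rows by partition.

    Assumption: comma-delimited, with a header row that uses the
    field names from the list above (filename, start_time, ...).
    """
    by_partition = defaultdict(list)
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            row["start_seconds"] = parse_start_time(row["start_time"])
            row["number_of_frames"] = int(row["number_of_frames"])
            by_partition[row["partition"]].append(row)
    return dict(by_partition)


def parse_bbox_line(line):
    """Parse one 'x0, y0, x1, y1, L' record from a BBox file.

    Assumption: five numeric fields separated by commas and/or whitespace.
    """
    fields = [f for f in re.split(r"[,\s]+", line.strip()) if f]
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}: {line!r}")
    x0, y0, x1, y1 = (float(value) for value in fields[:4])
    label = int(float(fields[4]))  # 1: signing hand(s), 2: non-signing hand(s)
    return {"box": (x0, y0, x1, y1), "label": label, "signing": label == 1}
```

A typical use would be sequences = load_sequences("ChicagoFSWild.csv") followed by sequences["train"] to obtain the training rows, with label_proc as the target for each row.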