
What is the dataset?

  • This dataset contains 61 categories of food items, a subset of the 101 categories. There are three instances of each food item, each taken on a different day. We selected the 61 categories carefully to ensure that a food item cannot be recognized from the background or lighting alone.
  • Baseline approaches

  • We evaluate the accuracy of standard computer vision recognition algorithms on this dataset. Specifically, we examine the accuracy with which two popular representations, color histograms and SIFT, are able to capture the image content in our fast food images. The goal is to provide standard baselines for image processing and computer vision researchers who are working in this area rather than to propose such methods as the state of the art in automated fast food recognition.
  • We use the same methodology in both experiments. Twelve images (different views of two instances) of each of the 61 food types form the training set, while the six images of the third instance are held out for testing. Each instance is held out in turn, and results are averaged over this three-fold cross-validation. In particular, we ensure that no instance of a food item ever appears in both the training and test sets. We train a multi-class SVM classifier on the training set using the popular libsvm package with standard parameters.
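
    The hold-one-instance-out protocol described above can be sketched as follows. This is only an illustrative sketch, not the authors' code: the `(category, instance, view)` indexing, the function names, and the value of six views per instance (inferred from the 12 training and 6 test images per food type) are assumptions. Feature extraction and the libsvm training step are deliberately left out.

```python
from itertools import product

NUM_CATEGORIES = 61      # food categories in the dataset
NUM_INSTANCES = 3        # instances per category, taken on different days
VIEWS_PER_INSTANCE = 6   # assumption: inferred from 12 training + 6 test images

def build_index():
    """Enumerate a hypothetical (category, instance, view) key for every image."""
    return list(product(range(NUM_CATEGORIES),
                        range(NUM_INSTANCES),
                        range(VIEWS_PER_INSTANCE)))

def instance_folds(index):
    """Yield (held_out_instance, train, test), holding out one instance per fold."""
    for held_out in range(NUM_INSTANCES):
        train = [img for img in index if img[1] != held_out]
        test = [img for img in index if img[1] == held_out]
        yield held_out, train, test

def mean_accuracy(per_fold_accuracy):
    """Average the per-fold accuracies, as in three-fold cross-validation."""
    return sum(per_fold_accuracy) / len(per_fold_accuracy)

index = build_index()
folds = list(instance_folds(index))
for held_out, train, test in folds:
    # No (category, instance) pair may appear on both sides of the split.
    train_pairs = {(c, i) for c, i, _ in train}
    test_pairs = {(c, i) for c, i, _ in test}
    assert not train_pairs & test_pairs
    print(held_out, len(train), len(test))
```

    Splitting by instance rather than by image keeps all views of one physical food item on the same side of the split, which is what prevents an item from appearing in both the training and test sets.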
  • Results of the baselines

  • Classification accuracy on the 61 categories:
    Color histogram: 11.3%
    Bag of SIFT features: 9.2%


    Baseline Image Data: Caution! Large zip file!

    Lab Still Shots [9MB]
  • Mei Chen
    Rahul Sukthankar
    Dean Pomerleau
    Casey Helfrich
    Intel Labs Pittsburgh

    Jie Yang
    Wen Wu
    Lei Yang
    Franziska Kraus
    Anlu Wang

    Carnegie Mellon University

    Kapil Dev Dhingra
    Columbia University