Introduction

We provide RGB-D video and reconstructed 3D models for a number of scenes. The data can be used for any purpose with proper attribution. If you just want a 3D scene for some purpose, feel free to use these. If you use any of the data, cite the corresponding paper:

  • If you use Fountain, cite the SIGGRAPH 2014 paper "Color Map Optimization for 3D Reconstruction with Consumer Depth Cameras"
  • If you use Reading Room, cite the ICCV 2013 paper "Elastic Fragments for Dense Scene Reconstruction"
  • If you use Augmented ICL-NUIM Dataset, cite the CVPR 2015 paper "Robust Reconstruction of Indoor Scenes"
  • If you use A Large Dataset of Object Scans, cite the arXiv technical report "A Large Dataset of Object Scans"
  • If you use other data that has not been mentioned above, cite the SIGGRAPH 2013 paper "Dense Scene Reconstruction with Points of Interest"
  • In general, if you do something interesting with the data, we'll be happy to know about it. Contact us at: Qianyi Zhou (Qianyi.Zhou@gmail.com) and Vladlen Koltun (vkoltun@gmail.com)

    Data

    The data was originally hosted on a server at Stanford University, so it was known as "Stanford 3D Scene Data" at the time. It has been hosted in a Google Drive folder since I left Stanford.

    Data on Google Drive 

    In our latest project, "Robust Reconstruction of Indoor Scenes", we have also published a synthetic RGB-D dataset (thanks to my friend Sungjoon Choi) and reconstructed models from a set of SUN3D scans. Separately, we have collected 10,000 dedicated 3D scans and reconstructed 398 mesh models.

    Augmented ICL-NUIM Dataset, SUN3D Scenes, A Large Dataset of Object Scans

    Format

    Most scenes include a reconstructed model in .ply file format, an RGB-D video in either the .oni file format (requires OpenNI) or a zip file of individual color and depth images (both in .png format), and an estimated camera trajectory in a customized .log format (explained later).

    Each RGB-D video is a continuous shot, taken with an Asus Xtion Pro Live camera at VGA resolution (640x480) and full speed (30Hz). Except for "fountain" and "readingroom", the depth data is calibrated using the technique presented in "Unsupervised Intrinsic Calibration of Depth Sensors via SLAM", by Alex Teichman, Stephen Miller, and Sebastian Thrun, RSS 2013. The calibration partially removes low-frequency distortion from the depth data. In our SIGGRAPH 2013 paper this calibration step is crucial. In our later papers, we developed an auto-calibration approach and no longer require explicit calibration.

    The estimated camera trajectory is stored in a .log file. See this page for details. The following C++ code first converts a depth pixel (u,v,d) into a point (x,y,z) in local camera coordinates, then transforms it to world coordinates (xw,yw,zw). We assume that the transformation matrix Tk has been extracted from the .log file and that the Eigen library is used. All lengths are measured in meters, which is why the raw depth value d is divided by 1000.

    #include <Eigen/Dense>

    double fx = 525.0, fy = 525.0; // default focal length
    double cx = 319.5, cy = 239.5; // default optical center
    Eigen::Matrix4d Tk; // k-th transformation matrix from trajectory.log

    // back-project depth pixel (u,v,d) to a point (x,y,z); d / 1000.0 gives meters
    double z = d / 1000.0;
    double x = (u - cx) * z / fx;
    double y = (v - cy) * z / fy;

    // transform (x,y,z) to (xw,yw,zw)
    Eigen::Vector4d w = Tk * Eigen::Vector4d(x, y, z, 1.0);
    double xw = w(0), yw = w(1), zw = w(2);
    

    Preview

    The Burghers of Calais
    RGB video, Depth video, Model

    Cactus garden
    RGB video, Depth video

    Stone wall
    RGB video, Depth video

    Totem pole
    RGB video, Depth video

    Reading Room
    RGB-D video, Model

    Fountain
    RGB-D video, Model