Although 2D camera data is used to teach autonomous vehicles to find their way from point A to point B, it has its own drawbacks. For example, camera images are of limited use in the dark or when strong sunlight causes reflections. LiDAR (Light Detection and Ranging), a type of sensor that uses infrared lasers, is therefore used to mitigate such problems in scene perception. Below are descriptions of open LiDAR datasets for autonomous vehicles.

Ford Campus LiDAR dataset

  • This dataset, created in 2009 at the Perceptual Robotics Laboratory of the University of Michigan, was collected with a pickup truck carrying multiple LiDAR devices and an omnidirectional camera system. It was recorded in an urban environment and contains time-synchronized 2D image, 3D LiDAR, and IMU (Inertial Measurement Unit) data. The dataset has two parts, collected in different urban environments. A smaller preview dataset can be downloaded here.
  • This dataset has been used for multi-modal sensor registration methods. Once the sensors are registered, the 2D and 3D data, which share significant mutual information, can be fused. The fused data can give autonomous vehicles better perception capabilities than a single input can. The dataset does not include annotations for the objects present in the driving scenes.
  • Data formats of different sensors
    • 3D Point Cloud Data
    • 2D Image Data
      • .ppm images which are spherically undistorted
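
The 2D/3D fusion described above depends on projecting LiDAR points into a camera image. The following is a minimal sketch of that step, not the dataset's own tooling. It assumes a simple pinhole camera model, whereas the omnidirectional camera used here would need its own per-camera calibration. The names `project_points`, `R`, `t`, and `K` and all numeric values are hypothetical.

```python
import numpy as np

def project_points(points_lidar, R, t, K):
    """Project (N, 3) LiDAR points into pixel coordinates.

    R (3x3) and t (3,) are an assumed LiDAR-to-camera rotation and
    translation; K (3x3) is an assumed pinhole intrinsic matrix.
    Returns (pixels, depths) for the points in front of the camera.
    """
    points_cam = points_lidar @ R.T + t       # LiDAR frame -> camera frame
    in_front = points_cam[:, 2] > 0           # drop points behind the camera
    points_cam = points_cam[in_front]
    homogeneous = points_cam @ K.T            # apply camera intrinsics
    pixels = homogeneous[:, :2] / homogeneous[:, 2:3]  # perspective divide
    return pixels, points_cam[:, 2]

# Illustrative values only: identity extrinsics and a made-up intrinsic matrix.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
points = np.array([[0.0, 0.0, 10.0],    # on the optical axis
                   [1.0, 0.0, 10.0],    # 1 m to the side
                   [0.0, 0.0, -5.0]])   # behind the camera, discarded
pixels, depths = project_points(points, np.eye(3), np.zeros(3), K)
# pixels is [[320, 240], [370, 240]] and depths is [10, 10]
```

Each projected pixel can then be paired with the image color at that location, which gives every retained LiDAR point both a depth and an appearance value.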


KITTI Dataset

The KITTI (Karlsruhe Institute of Technology and Toyota Technological Institute) dataset is a widely used computer vision benchmark released in 2012. A Volkswagen station wagon was fitted with grayscale and color cameras, a Velodyne 3D laser scanner, and a GPS/IMU system. The dataset covers various scenarios, including urban, residential, highway, and campus scenes. It has been used in object detection papers such as VoxelNet, MV3D, and Vote3Deep.

Data Formats of different sensors:

  • 3D Point Cloud Data (Velodyne HDL-64E @ 10Hz spin)
  • 2D Image Data (2 × PointGray Flea2 grayscale cameras + 2 × PointGray Flea2 color cameras)
    • .png (in the image_00/01/02/03 folder)
  • One can use this repo to browse through the data.
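
As a rough illustration of working with the point cloud data listed above, the sketch below loads a single Velodyne scan. It assumes the KITTI raw-data layout, in which each scan is a flat binary file of 32-bit floats holding x, y, z, and reflectance for every point. The function name `load_velodyne_scan` and the example path are illustrative, not part of any official toolkit.

```python
import numpy as np

def load_velodyne_scan(path):
    """Load one Velodyne scan as an (N, 4) float32 array.

    Assumed layout: consecutive float32 values, four per point,
    in the order x, y, z, reflectance.
    """
    scan = np.fromfile(path, dtype=np.float32)
    return scan.reshape(-1, 4)

def points_within_range(scan, max_range):
    """Keep points whose horizontal distance from the sensor is below max_range (meters)."""
    distance = np.hypot(scan[:, 0], scan[:, 1])
    return scan[distance < max_range]

# Example usage (hypothetical path; adjust to where the data was extracted):
# scan = load_velodyne_scan("velodyne_points/data/0000000000.bin")
# nearby = points_within_range(scan, max_range=30.0)
```

Filtering by range is only one possible preprocessing step; it is shown here because distant returns are sparse and are often cropped before further processing.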

Other Popular LiDAR Datasets

Sydney Urban Objects

  • This dataset was built with a focus on analyzing individual objects, such as vehicles and pedestrians, as 3D point clouds. It has been used in the VoxNet paper for object recognition. It aims to provide non-ideal sensing conditions that are representative of practical urban sensing systems, with large variability in viewpoint and occlusion.

Stanford Track Collection

  • This dataset also focuses on individual object recognition. It contains about 14,000 labeled tracks of objects as observed in natural street scenes by a Velodyne HDL-64E S2 LiDAR.

Oakland 3D Point Cloud Dataset

  • This is a smaller dataset created at Carnegie Mellon University. The repo contains labeled 3D point cloud laser data collected from a moving platform in an urban environment.

With the current wave of innovations in computer vision for object detection and their application in autonomous driving, there is huge scope for more annotated LiDAR datasets.

Learn more about semantic segmentation datasets here.