This is Post #1 in my 2020 personal goal to read and summarize a paper each week.

A Theory of Fermat Paths for Non-Line-of-Sight Shape Reconstruction [paper]

Why this is interesting:

The paper proposes a way to not only detect but also accurately reconstruct the shape of objects that are outside a viewer's line of sight. It won the Best Paper award at CVPR 2019.

How it works:

The (first) experiment setup looks like this:

The light source projects a matrix of laser dots onto the visible side wall, and the light detector records a transient graph for each dot, i.e. a histogram of observed light intensity over time.
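To make the idea of a transient concrete, here is a minimal sketch of binning photon arrival times into a histogram. The function name, the picosecond units and the sample arrival times are my own assumptions for illustration, not something taken from the paper.

```python
def transient_histogram(arrival_times_ps, bin_width_ps, num_bins):
    """Bin photon arrival times (in picoseconds) into a transient histogram."""
    counts = [0] * num_bins
    for t in arrival_times_ps:
        index = int(t // bin_width_ps)
        # Ignore arrivals that fall outside the recorded time window.
        if 0 <= index < num_bins:
            counts[index] += 1
    return counts


# Hypothetical arrival times for a single dot on the wall.
arrivals = [12.0, 13.5, 14.1, 14.3, 14.8, 15.2, 31.0]
histogram = transient_histogram(arrivals, bin_width_ps=4.0, num_bins=10)
```

Each entry of `histogram` is the number of photons that arrived within one time bin.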

A transient graph looks somewhat like this:

The paper claims and proves that the discontinuities in the graph correspond to two circumstances that depend only on the geometry (and not the reflectance) of the measured object. These occur when a ray of light hits the object:

  • Along the surface normal; or
  • At a boundary.

The shape of the transient graph around a discontinuity also tells us whether the point is a local minimum, a local maximum or a saddle point.
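As a rough illustration of what "finding the discontinuities" means, here is a naive sketch that flags large jumps between neighbouring histogram bins. This simple threshold is my own stand-in, not the detection procedure used in the paper, and the sample transient and `min_jump` value are made up.

```python
def find_discontinuities(transient, bin_width_ps, min_jump):
    """Return the times (ps) at which consecutive bins differ by at least min_jump."""
    times = []
    for i in range(1, len(transient)):
        if abs(transient[i] - transient[i - 1]) >= min_jump:
            times.append(i * bin_width_ps)
    return times


# Hypothetical transient with two sharp rises, each followed by a smooth decay.
transient = [0, 0, 1, 40, 30, 22, 16, 12, 35, 27, 20]
jump_times = find_discontinuities(transient, bin_width_ps=4.0, min_jump=15)
```

A real transient is noisy, so a fixed threshold like this would need smoothing or a more robust detector in practice.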

To reconstruct a given point, we now need:

  1. The Fermat length - derived from the time of the discontinuity in the transient graph.
  2. The gradient of the Fermat length at the visible side wall point, which lies along the line (visible side wall point -> object surface point) - we can estimate this by interpolating the measurements across points on the visible surface.
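The two ingredients above can be sketched as follows. This is a minimal sketch under my own assumptions: the wall is the plane z = 0 sampled on a regular grid, the hidden object sits on the +z side, and the path is a round trip, so the Fermat length is twice the wall-to-object distance and its gradient has norm 2. I use central finite differences in place of the interpolation mentioned above, and all function names are hypothetical.

```python
import math

SPEED_OF_LIGHT_MM_PER_PS = 0.299792458  # speed of light in mm per picosecond


def fermat_length_mm(time_ps):
    """Convert a discontinuity time (ps) into an optical path length (mm)."""
    return SPEED_OF_LIGHT_MM_PER_PS * time_ps


def fermat_gradient(lengths, i, j, spacing):
    """Estimate the Fermat length gradient at interior grid cell (i, j).

    lengths[i][j] is the Fermat length measured at wall point
    (x, y, z) = (j * spacing, i * spacing, 0).
    """
    # In-plane components from central differences over neighbouring dots.
    gx = (lengths[i][j + 1] - lengths[i][j - 1]) / (2.0 * spacing)
    gy = (lengths[i + 1][j] - lengths[i - 1][j]) / (2.0 * spacing)
    # Assumption: round-trip path, so the full gradient has norm 2.
    gz_squared = max(0.0, 4.0 - gx * gx - gy * gy)
    # Assumption: object on the +z side, so the length shrinks as z grows.
    gz = -math.sqrt(gz_squared)
    return (gx, gy, gz)


# Synthetic check: a single hidden point at (1, 1, 5) above a 3x3 grid of dots.
hidden = (1.0, 1.0, 5.0)
lengths = [[2.0 * math.dist((float(j), float(i), 0.0), hidden) for j in range(3)]
           for i in range(3)]
gradient = fermat_gradient(lengths, 1, 1, spacing=1.0)
```

At the central dot the hidden point is directly above the wall, so the estimated gradient points straight down the z axis.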

By intersecting the sphere of radius (fermat_length/2), centered at the point of reflection (on the visible wall), with the line through that point along the gradient, we can obtain the coordinates of the point on the NLOS object.
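Here is a minimal sketch of that intersection, assuming a round-trip path so that the Fermat length grows as the wall point moves away from the object. Under that assumption the gradient points away from the hidden point, so we step against it by fermat_length/2. The function name and the synthetic sphere used as a check are my own.

```python
import math


def reconstruct_point(wall_point, fermat_length, gradient):
    """Intersect the sphere of radius fermat_length / 2 with the gradient line."""
    norm = math.sqrt(sum(g * g for g in gradient))
    if norm == 0.0:
        raise ValueError("gradient must be non-zero")
    radius = fermat_length / 2.0
    # Step from the wall point against the normalized gradient.
    return tuple(p - radius * g / norm for p, g in zip(wall_point, gradient))


# Synthetic check: hidden sphere with center (0, 0, 5) and radius 1.
center, sphere_radius = (0.0, 0.0, 5.0), 1.0
wall_point = (3.0, 0.0, 0.0)
distance = math.dist(center, wall_point)
# Round-trip length to the closest point of the sphere, and its exact gradient.
fermat_length = 2.0 * (distance - sphere_radius)
gradient = tuple(2.0 * (w - c) / distance for w, c in zip(wall_point, center))
hidden_point = reconstruct_point(wall_point, fermat_length, gradient)
```

The recovered `hidden_point` should land on the surface of the synthetic sphere, at the spot where the ray from the wall point meets it along the normal.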

Repeating this for each branch of a particular dot in the projected light matrix, and then for each dot in the matrix, gives us a point cloud.
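The double loop can be sketched like this. The data layout (a dict mapping each wall point to a list of (fermat_length, gradient) branches) and the sample values are hypothetical, and the helper assumes the same round-trip convention as the single-point reconstruction described above.

```python
import math


def reconstruct_point(wall_point, fermat_length, gradient):
    """Step against the normalized gradient by half the Fermat length."""
    norm = math.sqrt(sum(g * g for g in gradient))
    return tuple(p - 0.5 * fermat_length * g / norm
                 for p, g in zip(wall_point, gradient))


def build_point_cloud(measurements):
    """Reconstruct one hidden point per branch, for every dot on the wall."""
    cloud = []
    for wall_point, branches in measurements.items():
        for fermat_length, gradient in branches:
            cloud.append(reconstruct_point(wall_point, fermat_length, gradient))
    return cloud


# Hypothetical measurements: two dots, the second one with two branches.
measurements = {
    (0.0, 0.0, 0.0): [(4.0, (0.0, 0.0, -2.0))],
    (1.0, 0.0, 0.0): [(4.0, (0.0, 0.0, -2.0)), (6.0, (0.0, 0.0, -2.0))],
}
cloud = build_point_cloud(measurements)
```

Each branch contributes one point, so a dot whose transient has several discontinuities adds several points to the cloud.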

When a picosecond-scale imaging system is used, we can get results at the millimeter scale for everyday objects:

When a femtosecond-scale system is used, the authors obtain even finer resolution. With this they can reconstruct depth maps even for objects like a coin.

How this is better than previous efforts:

  1. This approach relies only on geometry. It is invariant to the reflectance of the NLOS object. It is thus agnostic to the specific transient imaging system used.
  2. It applies to both reflective cases (looking around the corner) and transmissive cases (looking through a diffuser).


Limitations:

  1. The output is sensitive to inaccurate discontinuity detection in the transient graph.
  2. Reconstruction quality can suffer with sparse measurements on the visible surface. This is because Fermat gradients are estimated via interpolation.