- GL

# Beautiful 3D Point Clouds with camcalib, OpenCV & RAFT Stereo

This article will show you how to use camcalib YAML files to calculate beautiful dense 3D point clouds with a stereo-camera image pair.

Camcalib is just the first step for most computer vision applications. It is an essential step, yet only a means to your ends. With this article, we want to take you on the next steps of the journey – the steps after you have used camcalib – and show you how easy it is to compute precise 3D point clouds. We aim to get you started fast and with as little trouble as possible.

We will cover the following topics:

Loading and parsing the camcalib YAML result file

Removing lens distortion

Rectification of a stereo image pair

Converting the rectified image pair into a dense 3D point cloud with stereo matching

To keep the examples simple and the corresponding code short, we will be using Python and OpenCV to do the heavy lifting.

## Download our examples source code

You don't need to copy and paste the code snippets we show here and puzzle them all together. Just check out the code for the entire example from our examples __repository__ and follow the instructions in the accompanying __README.md__ to get up and running with ease.

## Loading the Camera Calibration YAML

Let’s get started! Using __camcalib__ in a __previous blog post__, we calibrated a stereo camera pair and saved the result as a YAML file.

Tip: installing the PyYAML package

You may need to install the **PyYAML** package using **apt-get** or **pip3**

```
# Either
sudo apt-get install -y python3-yaml
# OR
pip3 install PyYAML
```

Alternatively, just check out the code for the entire example from our examples __repository__ and follow the instructions in the accompanying __README.md__ to get up and running with ease.

With Python, we import the **yaml** package to read the result file.

```
import yaml
filename = "calibration_result.yaml"
with open(filename,"r") as f:
calibration_parameters = yaml.load(f, Loader=yaml.FullLoader)
```

We print the loaded data to the console to verify that everything works.

```
print(type(calibration_parameters))
$ <class 'dict'>
print(calibration_parameters)
$ {'sensors': {'cam1': {'extrinsics': {'axis_angle': [0.0, 0.0, 0.0], 'translation': [0.0, 0.0, 0.0]}, 'intrinsics': {'parameters': {'cx': 631.2595405625116, 'cy': 491.5258233725439, 'fx': 2282.3920867826905, 'fy': 2282.5710507986964, 'image_size': [1280, 1024], 'k1': 0.19408319800216575, 'k2': -0.0032982721692749804, 'k3': -0.018097059724866766, 'k4': -0.2839958241290381}, 'type': 'KannalaBrandt'}}, 'cam2': {'extrinsics': {'axis_angle': [-0.0017098846275279435, 0.1681972700321447, 0.04623455960511724], 'translation': [-0.19068022336063817, -0.0017535410879024739, 0.004499676517813966]}, 'intrinsics': {'parameters': {'cx': 615.0347480681874, 'cy': 472.525057978465, 'fx': 2301.3314301312444, 'fy': 2301.140570503219, 'image_size': [1280, 1024], 'k1': 0.18254782022559218, 'k2': 0.4203174919560141, 'k3': -5.89114449668284, 'k4': 25.664870652441312}, 'type': 'KannalaBrandt'}}}}
```

The second command shows that the calibration data is loaded into a Python dictionary, enabling us to access any part of the loaded object we need quickly. For later reference, the YAML file contains the following basic structure.

```
sensors:
cam1:
extrinsics: ...
intrinsics: ...
cam2:
extrinsics: ...
intrinsics: ...
```

Before we begin processing the loaded data, let's visualize the result to get a better understanding of our extrinsics. Check out the visualization below.

If you look closely, you will notice several details reflected in the YAML result:

**cam2**is rotated ~10° countcam2er-clockwise around the**y-axis**.The 31° FOV is evident from the frustum visualization.

The image center coordinate offset in

**cam1**and**cam2**is not zero. It is subtle in the visualization, but it is there.Cam1s coordinate system has the

**x-axis**directed**right**,**y-axis**oriented**down**, and**z-axis**pointing**forward**. The coordinate system xy-orientation matches the xy-orientation of the image plane coordinates and is a computer vision convention.

Stay tuned for our follow-up article, where we show you how to visualize the extrinsics of stereo and complex multi-view constructions with ease.

## Undistortion

With camcalib, we have determined the distortion parameters and offset errors of the lens and sensor geometry. But how can we use the camcalib results to remove the lens distortion from our images?

OpenCV provides a vast library of well-tested computer vision algorithms that will help us accelerate the implementation process and reach our goals with only a few lines of code. So let's get started!

To undistort an image efficiently, we will take the following steps:

Fetch and restructure the intrinsic parameters from the YAML we loaded before.

Use the intrinsic parameters to construct a 2D look-up table – a so-called undistortion map.

Apply a remap function with the undistortion map on every image the camera captures during the capture loop.

It may sound more logical to pack this into a single undistort image function, but you would repeat large amounts of the same computations every time you take a new snapshot. It is better to do the heavy calculations – computing the undistortion map – only once and store the result in memory for repeated later use. This way, you have more CPU resources left to do more valuable computations, like computing 3D point clouds.

Tip: installing the NumPy package.

You may need to install the **numpy** package using **apt-get** or **pip3**

```
# Either
sudo apt-get install -y python3-numpy
# OR
pip3 install numpy
```

Alternatively, just check out the code for the entire example from our examples __repository__ and follow the instructions in the accompanying __README.md__ to get up and running with ease.

To create the undistortion map, we will need to fetch the intrinsic parameters and restructure them into a camera matrix and a list of distortion parameters.

```
import numpy as np
# Extract the calibration data for camera 1 from the
# calibration_parameters variable
cam1_parameters = calibration_parameters['sensors']['cam1']
# Extract camera 1 intrinsic parameters from cam1_parameters
cam1_intrinsics = cam1_parameters['intrinsics']['parameters']
# Prepare variables map creation
image_size = cam1_intrinsics['image_size']
cam1_camera_matrix = np.array(
[[cam1_intrinsics['fx'], 0, cam1_intrinsics['cx']],
[0, cam1_intrinsics['fy'], cam1_intrinsics['cy']],
[0, 0, 1]])
cam1_distortion = np.array(
[cam1_intrinsics[k] for k in ['k1', 'k2', 'k3', 'k4']])
```

A detailed review of what the code does

Before we continue, a few notes on what we did in the code.

We use Python dictionaries to easily extract only the relevant parts of the tree structure in the YAML file. For example, we extract all parameters relating to

**cam1**by looking into**sensors**->**cam1**or in Python with**['sensors']['cam1']**. We do the same to get only the intrinsic parameters.We create

**cam1_camera_matrix**from the**cam1_intrinsics**parameters.**cam1_camera_matrix**is a 3x3 matrix that we save in the form of a NumPy array. NumPy is a powerful tool that will benefit us greatly later on. It takes care of linear algebra as well as advanced filtering operations.The camera matrix we define here will be the basis for our pinhole camera model.

The variable

**cam1_distortion**is an array with a particular sequence of distortion parameters. The sequence of the parameters depends on the camera distortion model you selected during calibration. We set__KannalaBrandt__during calibration, determining the sequence we used in the code.To construct cam_distortion use Pythons list comprehension to shorten the code. We could have also directly listed each parameter in sequence as follows

**cam1_distortion = np.array([cam1_intrinsics['k1'], cam1_intrinsics['k2'], cam1_intrinsics['k3'], cam1_intrinsics['k4']]])**but that is quite tedious to write, and visually noisy for anyone to read.

After executing the code, we have all the parameters needed to create our undistort map – **image_size**, **cam1_camera_matrix**, and **cam1_distortion**. Now let's build the undistortion map.

Tip: installing the OpenCV package

You may need to install the **opencv-python** package using **apt-get** or **pip3**

```
# Either
sudo apt-get install -y python3-opencv
# OR
pip3 install opencv-python
```

Alternatively, just check out the code for the entire example from our examples __repository__ and follow the instructions in the accompanying __README.md__ to get up and running with ease.

```
import cv2
cv = cv2.fisheye
undistort_map1 = cv.initUndistortRectifyMap(cam1_camera_matrix,
cam1_distortion,
np.eye(3),
cam1_camera_matrix,
image_size,
cv2.CV_32F)
```

That's it! The variable **undistort_map1** will contain a tuple. The first element of the tuple will contain all x-coordinates of the look-up table, and the second element will contain all y-coordinates.

More details on what the undistortion map contains, how it is used, and how remap works.

Let’s briefly discuss the parameters of initUndistortRectifyMap and then look at the undistort map itself:

We import OpenCV with

**import cv2**Keen eyes will notice that we are using

**initUndistortRectifyMap**. This function can create undistortion maps as well as undistort and rectification maps. We will re-use this function later on to its full extent. For now, we will ignore the rectification component. Check out the__OpenCV documentation__for more details.To undistort using KannalaBrandt, we need to access the

**cv=cv2.fisheye**module.The first and second parameters are the camera matrix and distortion coefficient variables we previously constructed.

The third parameter

**np.eye(3)**passes the Identity matrix as extrinsic data to disable the rectification component. Remember, we only care about undistortion at the moment.For the fourth parameter, we repeat

**cam1_camera_matrix**to preserve the pinhole parameters. You can also specify a different camera or projection matrix if you like.

Further, **undistort_map1** tells us how to construct a distortion-free image by informing us where, in the distorted image (**source coordinate**), we need to fetch **a pixel** from and where, in the undistorted image (**destination coordinate**), to place it. **undistort_map1** contains two arrays, one for **x** and one for **y**, with the same shape as our image. The source coordinate is encoded in each of the array's values. The coordinate we read the source coordinates from is simultaneously the destination coordinate. The picture below illustrates the application of the undistort map.

Put differently; you must, for each pixel in the **undistorted image**,

fetch, from the x-map using the

**destination coordinate**, the**source coordinates' x-value**fetch, from the y-map using the

**destination coordinate**, the**source coordinates' y-value**then fetch the

**brightness value**from the**distorted image (RAW)**at the**source coordinate**X_srcand finally, write the

**brightness value**to the**undistorted image**at the**destination coordinate**X_dst

In short:

where the source coordinate X_src is the sum of the destination coordinate X_dst and the distortion vector dX

The plot below shows the magnitude of the distortion vectors for **cam1** and **cam2**. The distortion magnitudes can reach hundreds or thousands of pixels in some cases. In our case, the distortion is almost invisible but will cause significant errors if left uncorrected. Note that the lenses used in this example are low-distortion lenses.

How to plot the distortion map's magnitudes

If you want to reproduce the undistortion magnitude plot we show above, subtract the destination coordinate from the source coordinate and compute the vector's magnitude as follows:

```
xy = np.dstack(np.meshgrid(range(1280),range(1024)))
ud_map1_abs = np.linalg.norm(
np.dstack(undistort_map1) - xy, axis=2)
```

Plot the results as follows:

```
import matplotlib.pyplot as plt
plt.imshow(ud_map1_abs)
plt.colorbar()
plt.show()
```

Note: you may need to install the **matplotlib** package using **apt-get** or **pip3**

```
# Either
sudo apt-get install -y python3-matplotlib
# OR
pip3 install matplotlib
```

Now we apply the undistortion map:

```
# Load raw cam 1 image data from png file
img1_raw = cv2.imread("camcalib_tutorial_data/cam1/001.png")
# undistort raw cam 1 image data using undistort_map1
# img1_undist = cv2.remap(img1_raw, *undistort_map1,
# cv2.INTER_LANCZOS4) #short version
img1_undist = cv2.remap(img1_raw,
undistort_map1[0],
undistort_map1[1],
cv2.INTER_LANCZOS4)
```

The steps are short and straightforward. We use the OpenCV **imread** function to load **cam1/001.png** and apply the previously computed undistortion map using OpenCV's **remap** function. The remap function is a powerful resampling function that can take any type of 2D lookup table and resample image data with the most common sub-pixel interpolation types (we use Lanczos resampling here). You can see the results of remapping our left and right images with the undistortion maps below. Note, that you will probably not be able to tell the difference as the distortion magnitude, as mentioned before, is below 16 pixels at worst.

Example of a strongly distorted fish-eye image

If, however, we subtract the raw and undistorted images, we can see a difference, which will lead to significant 3D errors later on if we leave it unchecked. Imperceptible distortion errors are why we calibrate our cameras.

## Rectification

For computational efficiency, we carry out undistortion and rectification at once in the same look-up table (the so-called undistort-rectify map). Still, it is more convenient to think of undistortion and rectification as separate processes carried out one after the other. With distortion errors taken care of, we can focus on rectification alone. Thus, for discussion, we will now assume that undistortion has been applied and perform rectification next.

Recall the visualization of our extrinsic calibration earlier; we found that **cam2** is tilted (rotated) towards **cam1**. **cam2** is also slightly pitched and rolled w.r.t. **cam1**, but that is more subtle and difficult to read from the extrinsic parameters. To triangulate 3D points for all pixels of **cam1**, we will need to find their corresponding pixels in **cam2**. Finding all correspondences presents us with a 2D search problem that is computationally intense. This is where rectification helps us out big time! The following two sketches, visually illustrate why rectification is useful. In case you are not familiar with the visualizations, check out the following “Visualization hint” expand where we describe the abstract representation of a pinhole camera that is widespread in the computer vision industry.

Visualization hint: abstract representation of the pinhole camera

We will be using virtual pinhole cameras a lot in the coming sections, so it is worth showing you how we draw simplified pinhole cameras. A pinhole camera is made of a dark box with a tiny hole, the “pinhole”, on the opposite side of our “real image plane” which is our “photosensitive surface” or “image sensor”. The scene is projected upside-down onto our “real image plane” because every ray from the scene must pass through the pinhole to enter the camera. Note, that the focal length of our pin-hole “lens” is simply the distance of the pinhole to the image plane.

As a thought experiment, we can introduce a virtual image plane that is exactly the focal length away from the pinhole, but on the outside (in front) of the camera. If we register the intersections of all scene rays that pass through the pinhole and draw their colors onto the virtual plane, we get exactly the same image that is projected onto the image plane with one difference: **the image is upright instead of upside down**. This helps us a lot when thinking of projections of the scene, rectification, undistortion, and triangulation. So without loss of generality, we use the virtual image plane as our pinhole camera representation in most computer vision applications. The illustration below summarizes what we have just discussed.

To understand why, think of the undistorted camera image as an image created by a pinhole camera. Each pixel's horizontal and vertical ray-angle is precisely known. Consequently, we can virtually rotate our camera simply by moving the pixels around. Virtually rotating our camera allows us to remove the yaw, pitch, and roll of **cam2** w.r.t. **cam1**. By doing this, we effectively make both cameras' optical axes perfectly parallel. Perfectly parallel optical axes imply that pixels in any row of **cam1** will correspond only to pixels of the same row in **cam2**. Consequently, our 2D search problem becomes a 1D search problem that we can compute faster or even at live-video speeds. Thus, undistorting and rectifying our images leads to many simultaneous efficiency gains!

Let's get to the implementation! To speed up things, we will repeat a few things for cam2 that we left out for the undistortion before without going into too much detail and only discuss the new stuff.

```
# Extract the calibration data for camera 1 & 2 from calibration_parameters variable
cam1_parameters = calibration_parameters['sensors']['cam1']
cam2_parameters = calibration_parameters['sensors']['cam2']
# Extract camera 1 & 2 intrinsic parameters from cam1_parameters
cam1_intrinsics = cam1_parameters['intrinsics']['parameters']
cam2_intrinsics = cam2_parameters['intrinsics']['parameters']
# Prepare variables map creation
image_size = cam1_intrinsics['image_size']
cam1_camera_matrix = np.array([
[cam1_intrinsics['fx'], 0, cam1_intrinsics['cx']],
[0, cam1_intrinsics['fy'], cam1_intrinsics['cy']],
[0, 0, 1]])
cam2_camera_matrix = np.array([
[cam2_intrinsics['fx'], 0, cam2_intrinsics['cx']],
[0, cam2_intrinsics['fy'], cam2_intrinsics['cy']],
[0, 0, 1]])
cam1_distortion = np.array(
[cam1_intrinsics[k] for k in ['k1', 'k2', 'k3', 'k4']])
cam2_distortion = np.array(
[cam2_intrinsics[k] for k in ['k1', 'k2', 'k3', 'k4']])
```

That takes care of the intrinsic parameters. Now, let's extract the extrinsic parameters

```
# Extract camera 1 & 2 extrinsic parameters from cam1_parameters
P_c1 = Pose.from_axis_angle(
cam1_parameters['extrinsics']['axis_angle'],
cam1_parameters['extrinsics']['translation'])
P_c2 = Pose.from_axis_angle(
cam2_parameters['extrinsics']['axis_angle'],
cam2_parameters['extrinsics']['translation'])
# Compute pose difference between camera 1 and 2
P_c2_c1 = P_c2 @ P_c1.I
```

The Pose class

To simplify the usage of the extrinsic poses, we provide a minimal Pose class. The Pose class is initialized with an axis-angle rotation vector and a translation vector and works well with camcalib extrinsic data. The class also provides a multiply operation with the overloaded NumPy matrix multiply operator **@** and an inverse ** I ** of the Pose.

```
class Pose:
@classmethod
def from_axis_angle(cls, r, t):
return Pose(
cv2.Rodrigues(np.array(r))[0], np.array(t))
def __init__(self, r, t):
self.r = r
self.t = t
@property
def I(self):
return Pose(self.r.T, - (self.r.T @ self.t))
def __matmul__(self, other):
return Pose(self.r @ other.r,
self.r @ other.t + self.t)
```

There are a few things of note here.

The Pose class takes the extrinsic data and allows you to treat it as a pose matrix with little effort. You can think of the pose as a transformation that applies rotation and translation to a set of 3D points. It can also be chained with other poses to represent a sequence of camera movements.

Cameras 1 and 2 each have an extrinsic pose assigned to them. The extrinsic pose is the position and orientation that we

__determined earlier__during the extrinsic calibration.For our rectification task, we only care about the position and orientation of

**cam2**w.r.t.**cam1**. This is why we subtract the pose of**cam1**from**cam2**by performing**P_c2_c1 = P_c2 @ P_c1.I**. Note that we mean the abstract sense of subtraction for poses.

Now we are prepared for rectification.

```
# Compute rectification parameters and undistort-rectify maps
cv = cv2.fisheye
R1, R2, P1, P2, Q = cv.stereoRectify(cam1_camera_matrix,
cam1_distortion,
cam2_camera_matrix,
cam2_distortion,
image_size,
P_c2_c1.r,
P_c2_c1.t,
cv2.CALIB_ZERO_DISPARITY)
# Compute undistort-rectify maps
rect_map1 = cv.initUndistortRectifyMap(cam1_camera_matrix,
cam1_distortion,
R1,
P1,
image_size,
cv2.CV_32F)
rect_map2 = cv.initUndistortRectifyMap(cam2_camera_matrix,
cam2_distortion,
R2,
P2,
image_size,
cv2.CV_32F)
```

If you recall, the undistort-rectify map construction is very much like the undistort map construction process. The only difference is that we also take the extrinsic data into account. We only have to compute the pose change from **cam1** to **cam2** with **P_c2_c1**. We let OpenCV do the heavy lifting by using the **stereoRectify** function. **stereoRectify** takes all the intrinsics and extrinsics we prepared and gives us the transformation parameters we need to construct the undistort-rectify maps. In particular, **stereoRectify** provides us with the rotation matrices **R1** and **R2**. **R1** and **R2** are the rotations that orient cam1 and cam2 in the same direction. **stereoRectify** also provides **P1** and **P2**, which are our new projection matrices. When applied, the new projection matrices determine the virtual camera parameters that the undistort-rectify maps cause. In other words, **P1** and **P2** describe how our virtual pinhole cameras (undistorted and rectified cameras) project 3D points onto the image plane – or "see" the world.

Put yet another way, **R1** and **P1** (and corresponding intrinsics) define a map that converts the distorted and rotated view of cam1 into a pinhole view of the world. **R2**, **P2**, and corresponding intrinsics do the same for **cam2**. And as a bonus, our virtual pinhole views will also have parallel optical axes and aligned rows that allow us to perform 1D searches to find correspondences.

With the **rect_maps** we can compute our virtual aligned (rectified) pinhole views

```
# load a stereo image pair
img1_raw = cv2.imread("camcalib_tutorial_data/cam1/001.png")
img2_raw = cv2.imread("camcalib_tutorial_data/cam2/001.png")
# Apply the undistort-rectify maps to the stereo image pair
img1_rect = cv2.remap(img1_raw, *rect_map1, cv2.INTER_LANCZOS4)
img2_rect = cv2.remap(img2_raw, *rect_map2, cv2.INTER_LANCZOS4)
```

That's it. We have rectified views that we can use to compute 3D point clouds. Before we triangulate, though, let's review our results so far to build intuition.

Check out the difference between the raw image pair and the rectified image pair in the plots below. The black lines in the raw image pair do not connect pixel correspondences, while the black lines do so for all pixels of the rectified pair.

Also, check out the zoomed-in plot, where the red bars show the errors of the raw image pair.

The red bars vanish in the rectified image pair.

Now that we have prepared our image data for efficient 1D correspondence search, let's do what we all came here for – computing 3D point clouds!

## 3D Point Clouds

We have two steps left until we have useful point clouds.

Run a stereo matcher on the rectified image pair. Stereo matching is the 1D correspondence search we mentioned before.

Triangulate pixel correspondences on the left and right images to construct the 3D point cloud.

For the first step, there are tons of algorithms to choose from. Your choice will depend on the trade-offs of your application. For example,

if you target a low-power ARM device, you might go with a simple block matcher algorithm.

If you have more power, you can go with a Semi-Global Block Matcher (SGBM) implementation.

The sky is the limit if your budget and application allow for a powerful GPU and longer run times. Check out the

__Middlebury Stereo Benchmark__to see how different algorithms perform and what they require.

We compute our results using __RAFT stereo__, but we also will provide a minimal SGBM wrapper around the OpenCV SGBM implementation to get you started.

Tip: installing the Open3D package to visualize point clouds.

You may need to install the **open3d** package using **pip3**

`pip3 install open3d`

__repository__ and follow the instructions in the accompanying __README.md__ to get up and running with ease.

The SGBM wrapper

```
import cv2
import numpy as np
import matplotlib.pyplot as plt
import open3d as o3d
class StereoMatcherSGBM(cv2.StereoSGBM):
"""
A simple wrapper for OpenCV SGBM that is also configured
for our example. The aim of this class is to simplify the
presented code while retaining reproducibility for more
interested readers. This also allows us to switch out
the matcher with more advanced algorithms easily.
"""
def __init__(self):
# configure stereo matcher
window_size = 11
min_disp = 0
num_disp = 320-min_disp
self.matcher = \
cv2.StereoSGBM.create(
minDisparity=min_disp,
numDisparities=num_disp,
uniquenessRatio = 5,
speckleRange = 5,
disp12MaxDiff = 1,
P1 = 8*2*window_size**2,
P2 =32*2*window_size**2)
def match(self,img1,img2):
# compute disparity
disparity = self.matcher.compute(img1,img2)\
.astype(np.float32)/16.0
return disparity
def reconstruct(self, img_disp0, img_rect0, Q):
# reproject disparity to 3D
xyz = cv2.reprojectImageTo3D(img_disp0, Q)
# construct validity masks based on distance
# and brightness
mask_depth = (xyz[:, :, 2] < 5.0) & \
(xyz[:, :, 2] > 0.1)
mask_brght = (img_rect0 > 30) & (img_rect0 < 250)
# create linear point and color lists
xyz_linear = xyz.reshape((-1, 3))
colors_linear = img_rect0.reshape((-1, 3))
mask_linear = (mask_brght[:, :, 0] & \
mask_depth).flatten()
# create open3d geometry
pcd = o3d.geometry.PointCloud()
pcd.points = o3d.utility.Vector3dVector(
xyz_linear[mask_linear])
pcd.colors = o3d.utility.Vector3dVector(
colors_linear[mask_linear] / 255.0)
return pcd
```

The following code will take you the rest of the way and take care of stereo matching, triangulation, and visualization.

```
import open3d as o3d
# Setup the stereo matcher
matcher = StereoMatcherSGBM()
# Compute the pixel correspondences - called the disparity map
disparity = matcher.match(img1_rect, img2_rect)
# Construct open3d geometry
pointcloud = matcher.reconstruct(disparity, img1_rect, Q)
# Perform a statistical outlier removal to make
# the results even prettier.
pointcloud, _ = pointcloud.remove_statistical_outlier(
nb_neighbors=20, std_ratio=1.5)
# Create a coordinate frame to visualize the camera
# coordinate frame
origin_frame = o3d.geometry.TriangleMesh\
.create_coordinate_frame(size=0.10, origin=[0, 0, 0])
# display open3d geometry
o3d.visualization.draw_geometries([pointcloud, origin_frame])
```

Let’s take a closer look at what is happening here:

We imported

**open3d as o3d**for short to help with visualizing the point cloud.**matcher.match(img1_rect, img2_rect)**uses the rectified images as input to find pixel correspondences. The correspondence is encoded in the form of a disparity. The disparity is the distance between the left and right images correspondences measured in pixels. You can think of the pixel location and disparity as encoding the two angles we need to triangulate a 3D point.**reconstruct(disparity, img1_rect, Q)**is provided by our wrapped**matcher**to compute the 3D point cloud. We will discuss this in more detail below. Just note the parameters.**disparity**contains the disparity map shown in the image below,**img1_rect**holds corresponding color values, and**Q**is the 4x4 perspective-transform or disparity-to-depth matrix used to perform the triangulation step.Note that the perspective-transform matrix

**Q**was the 5th parameter of the**stereoRectify**function we called during rectification.

Before we go into the details of **matcher.reconstruct()**, check out the point clouds below. The high-quality point clouds result from excellent camera calibration and the RAFT-stereo correspondences.

This last image also shows the camera's coordinate system at the origin. Again note the direction of the **x-axis** and **y-axis**. The orientation of the coordinate system matches the image-planes coordinate system.

Let’s close this article with a final discussion on the **reconstruct** function.

```
def reconstruct(self, img_disp0, img_rect0, Q):
# reproject disparity to 3D
xyz = cv2.reprojectImageTo3D(img_disp0, Q)
# construct validity masks based on distance and brightness
mask_depth = (xyz[:, :, 2] < 5.0) & (xyz[:, :, 2] > 0.1)
mask_brght = (img_rect0[:, :, 0] > 30) & \
(img_rect0[:, :, 0] < 250)
# create linear point and color lists
xyz_linear = xyz.reshape((-1, 3))
colors_linear = img_rect0.reshape((-1, 3))
mask_linear = (mask_brght & mask_depth).flatten()
# create open3d geometry
pcd = o3d.geometry.PointCloud()
pcd.points = o3d.utility.Vector3dVector(
xyz_linear[mask_linear])
pcd.colors = o3d.utility.Vector3dVector(
colors_linear[mask_linear] / 255.0)
return pcd
```

We use OpenCVs

**reprojectImageTo3D**function to keep the code light and convert the disparity map into 3D coordinates. In principle, we could also do this by hand, and it would take us four lines of code with NumPy. Check out the expand below if you would like to see it.We build a simple distance and brightness filter

**mask_brght & mask_depth**to remove errors.The open3d

**PointCloud()**object**pcd**has the properties**points**and**colors**.**points**is a**Vector3dVector**constructed with an Nx3 NumPy array containing our triangulated valid 3D points.**colors**is also a**Vector3dVector**constructed with an Nx3 NumPy array.**colors**columns represent**red**,**green,**and**blue**with values ranging from 0.0 for darkest and 1.0 for brightest, so we normalize it with 255.

NumPy implementation of **reprojectImageTo3D**

The function's core is generating u- and v-coordinate values that match up with the disparity map in homogeneous form and multiplying each uvd1 coordinate with the Q matrix. Also, note the normalization step to complete the process.

```
def reprojectImageTo3D(img_disp0, Q):
U, V = np.meshgrid(np.arange(0, img_disp0.shape[1]),
np.arange(0, img_disp0.shape[0]))
UVD = np.dstack((U, V, img_disp0, np.ones_like(img_disp0)))
XYZW = UVD @ Q.T
xyz = XYZW[:, :, :3] / XYZW[:, :, 3, np.newaxis]
return xyz
```

In essence, this is the same as carrying out the following operations for each disparity map pixel.

Where B is the baseline or distance between **cam1** and **cam2**. f, c_x, and c_y are the focal length and image center coordinates given in the new camera matrices **P1** and **P2** after rectification. u, v, and d are the x-, y- and disparity coordinates of the disparity map **img_disp0**.

## Putting it all together

For your convenience, we provide this example in full in our __examples repository__. Check it out and simply follow the instructions in the accompanying __README.md__ to get up and running with ease.

## Conclusions

This article has taken us quite far beyond the scope of camcalib. But we hope that it helped you get going and gave you an insight into the various moving parts of a computer vision application that relies on precise camera calibration. We are constantly adding new pieces to help you get familiar with camcalib as well as help you get started with your applications faster. Please also return to check out our other articles that will take you even further.