Sometimes, when the calibration results just don't seem to make any sense and the pure numbers help you even less in understanding your setup, you need a visualization tool. Look no further!
This article and its accompanying repository take any camcalib result containing any number of sensors with intrinsic and extrinsic data and draw them on your screen.
Download our examples source code
You don’t need to copy and paste the code snippets we show here and puzzle them together. Just check out the code for the example from our examples repository and follow the instructions in the accompanying README.md to get up and running with ease.
Before we dig into the code, let's review the general contents of a camcalib calibration YAML file. The root level of the YAML file contains the keyword sensors, indicating that named sensors with calibration data will follow. On the level below sensors, each sensor is named. Camcalib only knows about a sensor's intrinsic and extrinsic calibration and puts these in the tree for each sensor below intrinsics and extrinsics. Note that the sensor model type is stored in sensors.sensor_name.intrinsics.type. Cameras will have the type Pinhole, PinholeRadTan, KannalaBrandt, or DoubleSphere, whereas inertial sensors are of type IMU.
```yaml
sensors:
  sensor_name1:
    extrinsics:
      axis_angle: ...
      translation: ...
    intrinsics:
      parameters: ...
      type: ...
  sensor_name2:
    extrinsics:
      axis_angle: ...
      translation: ...
    intrinsics:
      parameters: ...
      type: ...
  ...
```
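To make the structure concrete, here is a minimal sketch that walks such a tree once it has been parsed into nested Python dictionaries, for example by a YAML parser. The helper name `split_sensors_by_type` and the shortened sample data are illustrative assumptions; they are not part of camcalib or the example repository.

```python
# Hypothetical, shortened stand-in for a parsed camcalib YAML file.
calibration_parameters = {
    "sensors": {
        "cam0": {
            "extrinsics": {"axis_angle": [0.0, 0.0, 0.0],
                           "translation": [0.0, 0.0, 0.0]},
            "intrinsics": {"parameters": {"fx": 700.0, "fy": 700.0},
                           "type": "KannalaBrandt"},
        },
        "imu1": {
            "extrinsics": {"axis_angle": [0.0, 0.0, 0.0],
                           "translation": [0.0, 0.0, 0.0]},
            "intrinsics": {"parameters": {"bias_a": [0.0, 0.0, 0.0]},
                           "type": "IMU"},
        },
    }
}


def split_sensors_by_type(parameters):
    """Return (camera_names, imu_names) based on intrinsics.type."""
    cameras, imus = [], []
    for name, sensor in parameters["sensors"].items():
        # Everything that is not an IMU is treated as a camera model here.
        if sensor["intrinsics"]["type"] == "IMU":
            imus.append(name)
        else:
            cameras.append(name)
    return cameras, imus


cameras, imus = split_sensors_by_type(calibration_parameters)
print("cameras:", cameras)
print("IMUs:", imus)
```

Treating every non-IMU type as a camera keeps the sketch short; with other sensor types in the file, an explicit list of camera model names would be the safer check.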
The full YAML file below contains multiple cameras and IMUs. We will be using this YAML file in our example, but you can use your own if you prefer.
The YAML file we use in this article and example
```yaml
sensors:
  cam0:
    extrinsics:
      axis_angle:
      - 1.2117727229251698
      - -1.2130879874616947
      - 1.210011388236652
      translation:
      - 0.04597450727993618
      - -0.005838564481711161
      - -0.0512708409110581
    intrinsics:
      parameters:
        cx: 703.4184470287223
        cy: 531.3233320547156
        fx: 701.9338931901199
        fy: 701.9198711846919
        image_size:
        - 1440
        - 1080
        k1: -0.043828058913568155
        k2: 0.008739784977397793
        k3: -0.010315540156750557
        k4: 0.003301491901077731
      type: KannalaBrandt
  cam1:
    extrinsics:
      axis_angle:
      - -1.2051586966320844
      - -1.2144923847331992
      - -1.2092688010798944
      translation:
      - 0.061455281994088264
      - 0.005439360217392036
      - -0.04976140130335088
    intrinsics:
      parameters:
        cx: 709.0899978478627
        cy: 535.815525540387
        fx: 696.909241975068
        fy: 696.8023227375285
        image_size:
        - 1440
        - 1080
        k1: -0.041475680818875205
        k2: 0.0014162984286327878
        k3: -0.0025200962367925867
        k4: 0.0005834346664595179
      type: KannalaBrandt
  cam2:
    extrinsics:
      axis_angle:
      - -0.010935541421966512
      - 3.13254126801532
      - 0.0005989964728145517
      translation:
      - 0.008228232036358979
      - 0.00705053272291295
      - -0.03561450010265822
    intrinsics:
      parameters:
        cx: 718.2502798340198
        cy: 534.3967834027355
        fx: 702.181653981154
        fy: 701.9899133700088
        image_size:
        - 1440
        - 1080
        k1: -0.04062232949626832
        k2: 0.00034368526664624365
        k3: -0.001719296403524369
        k4: 0.00027240579795875835
      type: KannalaBrandt
  cam3:
    extrinsics:
      axis_angle:
      - 1.5707821516259428
      - -0.001794296867394402
      - -0.0009339004796569021
      translation:
      - 0.006569612361542141
      - -0.005952289522851722
      - -0.06592580748391123
    intrinsics:
      parameters:
        cx: 703.6162852705492
        cy: 536.00252400961
        fx: 702.5175094304196
        fy: 702.3365885533029
        image_size:
        - 1440
        - 1080
        k1: -0.04122193502555801
        k2: 0.0035171247707347147
        k3: -0.006674594018057502
        k4: 0.0027772629896513816
      type: KannalaBrandt
  cam4:
    extrinsics:
      axis_angle:
      - -1.5675548679274078
      - -0.010731198085364272
      - 0.006375542799480938
      translation:
      - 0.00962018755774569
      - 0.006423120933193033
      - -0.07984116864787359
    intrinsics:
      parameters:
        cx: 699.9737172923118
        cy: 541.4029215446449
        fx: 701.1060084629287
        fy: 701.3402415958489
        image_size:
        - 1440
        - 1080
        k1: -0.040391667653718266
        k2: 0.0005239676865717209
        k3: -0.0014238682500836899
        k4: 8.150188005469019e-05
      type: KannalaBrandt
  imu1:
    extrinsics:
      axis_angle:
      - 0.0
      - 0.0
      - 0.0
      translation:
      - 0.0
      - 0.0
      - 0.0
    intrinsics:
      parameters:
        bias_a:
        - 0.054845791090596216
        - 0.08487371857383741
        - 0.046242459364215775
        bias_g:
        - -0.0015414507179768058
        - -0.0013533857270691307
        - 0.0002798779211343726
        gravity:
        - 0.015376257669965263
        - 9.79238560551554
        - -0.0062017596285653806
      type: IMU
  imu2:
    extrinsics:
      axis_angle:
      - -2.2103282091047545
      - 2.2259342355315304
      - 0.006352840952155146
      translation:
      - -0.006587878077190073
      - -0.028448876489710395
      - -0.03216176566993722
    intrinsics:
      parameters:
        bias_a:
        - 0.04487119545085117
        - 0.0011934199202234463
        - -0.035186962905322146
        bias_g:
        - 0.0013372800051996568
        - 0.00034351260696414824
        - -0.003250192045435073
        gravity:
        - 0.015376257669965263
        - 9.79238560551554
        - -0.0062017596285653806
      type: IMU
  imu3:
    extrinsics:
      axis_angle:
      - 3.1377123268857288
      - 0.00433263016964078
      - -0.005211021082020662
      translation:
      - 0.012079789455707347
      - -0.018433326337923397
      - 0.11003063340552045
    intrinsics:
      parameters:
        bias_a:
        - 0.13119507189926777
        - -0.10620434067776963
        - -0.1661974985277128
        bias_g:
        - -0.006750852020882154
        - 0.017124357966291758
        - 0.010620605788657082
        gravity:
        - 0.015376257669965263
        - 9.79238560551554
        - -0.0062017596285653806
      type: IMU
```
This example will visualize the following data:
The frustum of each camera's pinhole representation.
The image center coordinates.
The extrinsic position and orientation of every sensor.
The following image illustrates all the details of a pinhole camera model representation we will be visualizing. The focal length, image width, image height, and image center coordinates are all stated in pixels by camcalib. Consequently, all we need is a conversion factor to render a true or scaled representation of the camera.
The note below shows how our rendered representation compares with the typical visualization of pinhole cameras in the computer vision literature.
Visualization hint: abstract representation of the pinhole camera
A pinhole camera is made of a dark box with a tiny hole, the “pinhole”, on the side opposite our “real image plane”, which is our “photosensitive surface” or “image sensor”. The scene is projected upside-down onto our “real image plane” because every ray from the scene must pass through the pinhole to enter the camera. Note that the focal length of our pinhole “lens” is simply the distance from the pinhole to the image plane.
As a thought experiment, we can introduce a virtual image plane that is exactly the focal length away from the pinhole, but in front of the camera. If we register the intersections of all scene rays that pass through the pinhole and draw their colors onto the virtual plane, we get exactly the same image that is projected onto the image plane with one difference: the image is upright instead of upside down. This helps us a lot when thinking of projections of the scene, rectification, undistortion, and triangulation. So without loss of generality, we use the virtual image plane as our pinhole camera representation in most computer vision applications. The illustration below summarizes what we have just discussed.
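The relationship between the two planes can be checked numerically. The following is a minimal sketch, assuming the pinhole sits at the origin of the camera frame, the optical axis is the z-axis, and the scene point is hypothetical. The function names are ours and do not come from the example repository.

```python
def project_virtual(point, f):
    """Project a 3D point (camera frame, z > 0) onto the virtual plane at z = +f."""
    x, y, z = point
    return (f * x / z, f * y / z)


def project_real(point, f):
    """Project the same point through the pinhole onto the real plane at z = -f."""
    x, y, z = point
    return (-f * x / z, -f * y / z)


point = (0.5, -0.25, 2.0)  # hypothetical scene point, in meters
f = 1.0                    # focal length, in the same unit

u_virtual = project_virtual(point, f)
u_real = project_real(point, f)

print("virtual image plane:", u_virtual)
print("real image plane:   ", u_real)
```

Both projections have the same magnitude and differ only in sign, which is the point reflection through the pinhole that makes the real image appear upside down.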
The examples file structure
Now let's dig into the examples codebase. The folder structure of the example looks as follows:
```
camcalib_visualization_example
├── camcalib_tutorial_data
│   └── calibration_result.yaml
├── modules
│   ├── Pose.py
│   ├── calib_viz_utils.py
│   └── camcalib_loader.py
└── main.py
```
camcalib_tutorial_data contains calibration_result.yaml, the calibration result file that we want to visualize. You can use your own data here instead if you like.
modules contains essential modules and helper classes.
Pose.py provides a minimal container for 6D pose transforms. We use this to efficiently handle combined rotation and translation operations on 3D data.
calib_viz_utils.py helps us visualize the intrinsic and extrinsic calibration alongside our multiview point clouds. Consider this a simple helper utility for now. We will dive into its details in the following sections.
camcalib_loader.py will aid us with loading the YAML file. It also constructs undistort-rectify maps for all camera pairings, but we will not need that feature here.
main.py, when run, launches our example. Check out the README.md to see how to set everything up and run the example.
Step 1: Loading the YAML file
This is where we make use of the camcalib_loader.py module.
```python
# 1. Import the CamcalibLoader module.
from modules.camcalib_loader import CamcalibLoader

# 2. Specify the calibration file.
calibration_file_name = "camcalib_tutorial_data/calibration_result.yaml"

# 3. Specify the camera pairs as an empty list. This prevents the module
#    from generating any undistort-rectify maps. We will not need them.
camera_pairs = []

# 4. Load the calibration data.
calibration = CamcalibLoader(calibration_file_name, camera_pairs)
```
With that, the calibration data is loaded.
To make use of the calibration object we created, let’s discuss its member variables:
.cameras is a list of all camera names contained within the YAML file. If the YAML file contains IMUs and cameras, this list will only contain the names of the cameras.
.sensors is a list of all sensor names contained within the YAML file. If the YAML file contains IMUs and cameras, this list will contain the names of all cameras and IMUs.
.camera_pairs either contains
the list of the camera pairs we specified in camera_pairs or
if we specify camera_pairs=None, .camera_pairs contains a list of all unique pairings of the cameras listed in the .cameras member variable.
.camera_poses contains the extrinsic pose for each camera listed in .cameras.
.sensor_poses contains the extrinsic pose for each camera and IMU listed in .sensors.
.camera_pair_undistort_rectify_maps is a dictionary that contains, for each pair in the member variable .camera_pairs, the corresponding undistort-rectify maps and rectification data.
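To illustrate what "all unique pairings" means for the five cameras in our YAML file, here is a small sketch using the standard library. How CamcalibLoader builds its list internally is not shown in this article, so treat the element type (tuples here) and the ordering as assumptions.

```python
from itertools import combinations

# Camera names as they appear in the example YAML file.
cameras = ["cam0", "cam1", "cam2", "cam3", "cam4"]

# Every unordered pair appears exactly once: (cam0, cam1) but not (cam1, cam0).
all_pairs = list(combinations(cameras, 2))

print(len(all_pairs), "unique pairs")
print(all_pairs[:3])
```

With n cameras there are n(n-1)/2 such pairs, which is why passing an empty list is worthwhile when the undistort-rectify maps are not needed.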
Note the difference compared to the camcalib_loader.py module we used in our previous example. Here we added the member variables .sensors and .sensor_poses while preserving the members .cameras and .camera_poses. The reason for this is to preserve compatibility with our previous examples while adding the ability to handle other sensor types as well. This may change in other examples but it's convenient for us now.
Step 2: generate geometry for each camera
To produce the following 3D visualization of a camera, we simply construct a set of lines in open3d.
The structure open3d provides is open3d.geometry.LineSet() and requires the developer to set three members.
.points is a set of 3D points that specify each vertex of our desired geometry.
.lines is a list of index pairs that tells open3d which vertices to connect to make a line.
.colors is a list of RGB colors, one for each line in .lines, with each channel ranging from 0 to 1 in brightness.
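Before handing data to open3d, it can help to sanity-check that the three lists fit together. The sketch below works on plain Python lists and does not call open3d at all. The helper `check_line_set_data` is hypothetical, and it assumes one color per entry in lines, matching how the code later in this article fills .colors.

```python
def check_line_set_data(points, lines, colors):
    """Return a list of problems found in plain LineSet-style data."""
    problems = []
    for i, point in enumerate(points):
        if len(point) != 3:
            problems.append(f"point {i} is not 3D")
    for i, (start, end) in enumerate(lines):
        # Each line must reference two existing vertices by index.
        if not (0 <= start < len(points) and 0 <= end < len(points)):
            problems.append(f"line {i} references a missing vertex")
    if len(colors) != len(lines):
        problems.append("expected one color per line")
    for i, color in enumerate(colors):
        if len(color) != 3 or any(not 0.0 <= c <= 1.0 for c in color):
            problems.append(f"color {i} is not an RGB triple in [0, 1]")
    return problems


# A simple triangle: three vertices, three lines, all green.
points = [[0, 0, 0], [1, 0, 0], [0, 1, 0]]
lines = [[0, 1], [1, 2], [2, 0]]
colors = [[0, 1, 0]] * len(lines)

ok_report = check_line_set_data(points, lines, colors)
bad_report = check_line_set_data(points, [[0, 3]], [[0, 1, 0]])
print(ok_report)
print(bad_report)
```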
Before we can construct the vertices, we need to prepare our camera parameters and scale them to something useful. We assume a metric space that we are rendering our geometry into. So, arbitrarily, a length of 1 in open3d means a length of 1 meter to us. If we are to render a camera with a focal length of 1000 pixels and an image size of 1280x1024, the rendering would be impractically large. For this, we will need a scale parameter that we specify later on.
```python
# Import open3d and numpy (numpy is needed for the transforms below)
import numpy as np
import open3d as o3d

# Specify our scale-free camera parameters.
# The variables _f, _w, _h, _cx, and _cy are the intrinsic parameters.
# The corresponding f, w, h, cx, and cy are normalized by _f so we
# can scale them later.
f = 1
w = _w/_f
h = _h/_f
cx = _cx/_f
cy = _cy/_f

# Parameters to draw the image center vector
offset_cx = cx - w/2.0
offset_cy = cy - h/2.0
```
With the scale-free parameters defined, we can start creating our vertices and lines.
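```python
points = [[        0,         0, 0],   # 0: pinhole (camera origin)
          [offset_cx, offset_cy, f],   # 1: image center
          [-0.5 * w, -0.5 * h, f],     # 2: image corner
          [ 0.5 * w, -0.5 * h, f],     # 3: image corner
          [ 0.5 * w,  0.5 * h, f],     # 4: image corner
          [-0.5 * w,  0.5 * h, f],     # 5: image corner
          [-0.5 * w, -0.5 * h, f]]     # 6: same as 2, closes the rectangle

lines = [[0, 1], [2, 3], [3, 4], [4, 5], [5, 6],
         [0, 2], [0, 3], [0, 4], [0, 5], [2, 4], [3, 5]]
```
Note how the lines variable only states which vertices are connected to each other. Lines reference vertices by index, so there is no need to repeat coordinates for every line.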
Now that our geometry data is prepared, it's time to apply scale and pose transforms.
```python
# Rescale the camera symbol to a visible size
points = np.array(points) * size

# Apply the camera pose transform
points = (R @ points.T).T + T
```
Because we set f=1 in the beginning and expressed the parameters normalized by _f we can use the variable size to define how large we want the cameras to be rendered. Remember, we don't want them to be impractically large, or too small. A good value for size is 1/10th to 1/3rd of the typical baseline of your setup.
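If you would rather derive size from your setup than pick it by hand, the rule of thumb can be turned into a few lines of code. This is only a sketch: the camera centers below are made-up values in meters, and taking the median of all pairwise distances as the "typical baseline" is our assumption, not something the example repository does.

```python
from itertools import combinations
from math import dist
from statistics import median

# Hypothetical camera centers in world coordinates, in meters.
camera_centers = {
    "cam0": (0.00, 0.00, 0.00),
    "cam1": (0.10, 0.00, 0.00),
    "cam2": (0.05, 0.08, 0.00),
}

# Distance between every unique pair of camera centers.
baselines = [dist(a, b) for a, b in combinations(camera_centers.values(), 2)]
typical_baseline = median(baselines)

# Pick a value inside the suggested range of 1/10 to 1/3 of the baseline.
size = typical_baseline / 5.0

print(f"typical baseline: {typical_baseline:.3f} m, size: {size:.3f}")
```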
The variables R and T used above are Pose.r and Pose.t of the inverse extrinsic pose, calibration.sensor_poses[sensor_name].I. We need the inverse because the extrinsic pose specified by camcalib transforms points from the world coordinate frame into the camera or sensor coordinate frame. Our geometry is expressed in the local coordinate frame of the camera or sensor, but we want to visualize everything in one consistent world coordinate frame, so we apply the inverse of the extrinsic pose to the geometry.
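The math behind the inverse can be sketched with numpy alone. How the Pose module implements .I is not shown in this article, so the functions below are illustrative stand-ins. The sketch also assumes that axis_angle is a rotation vector (unit axis times angle in radians), which is the common convention but is not confirmed here.

```python
import numpy as np


def axis_angle_to_matrix(axis_angle):
    """Rodrigues' formula: rotation vector (axis * angle in radians) -> 3x3 matrix."""
    rotation_vector = np.asarray(axis_angle, dtype=float)
    angle = np.linalg.norm(rotation_vector)
    if angle < 1e-12:
        return np.eye(3)
    kx, ky, kz = rotation_vector / angle
    # Skew-symmetric cross-product matrix of the unit rotation axis.
    K = np.array([[0.0, -kz, ky],
                  [kz, 0.0, -kx],
                  [-ky, kx, 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)


def invert_pose(R, t):
    """Inverse of x_sensor = R @ x_world + t is x_world = R.T @ x_sensor - R.T @ t."""
    return R.T, -R.T @ t


# Extrinsics of cam0 from the example YAML file.
R = axis_angle_to_matrix([1.2117727229251698, -1.2130879874616947, 1.210011388236652])
t = np.array([0.04597450727993618, -0.005838564481711161, -0.0512708409110581])

R_inv, t_inv = invert_pose(R, t)

# Round trip: world -> sensor -> world should return the original point.
x_world = np.array([0.1, 0.2, 0.3])
x_sensor = R @ x_world + t
print(np.allclose(R_inv @ x_sensor + t_inv, x_world))
```

Under this convention, t_inv is also the position of the sensor origin in world coordinates, which is where the tip of the frustum ends up in the rendering.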
Finally, we create the open3d LineSet object:
```python
_color = [0, 1, 0]  # Green
colors = [_color for _ in range(len(lines))]

line_set = o3d.geometry.LineSet()
line_set.points = o3d.utility.Vector3dVector(points)
line_set.lines = o3d.utility.Vector2iVector(lines)
line_set.colors = o3d.utility.Vector3dVector(colors)
```
Now, if you render line_set using open3d, you should see a visualization of a pinhole camera with exactly the orientation and position specified in the YAML file. For your convenience, we put all of this into the function construct_camera(size, intrinsics, extrinsic_pose, color=[0,1,0]) that you can use as follows:
```python
from modules.calib_viz_utils import *

# ... extract intrinsics and P_world_sensor from loaded calibration data

construct_camera(size=0.05, intrinsics=intrinsics,
                 extrinsic_pose=P_world_sensor)
```
Step 3: generate geometry for each IMU
For the IMU geometry, we simply use a small coordinate frame mesh to indicate the orientation of the accelerometer's x, y, and z-axes. So all we need to do is create a coordinate frame object and apply a pose transform:
```python
mesh = o3d.geometry.TriangleMesh.create_coordinate_frame(size=size,
                                                         origin=[0, 0, 0])
# Build a 4x4 pose matrix from the rotation and translation
P = np.eye(4)
P[0:3, 3] = extrinsic_pose.t
P[0:3, 0:3] = extrinsic_pose.r
mesh.transform(P)
```
There are two minor things of note here:
we use size again to scale the coordinate frame mesh to something sensible that also matches the size of the cameras.
we use .r and .t of the Pose module to construct a 4x4 pose transform matrix as required by open3d's mesh.transform() function.
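To see why the 4x4 matrix is equivalent to applying the rotation and translation separately, consider this small numpy sketch. The rotation and translation are made-up values standing in for extrinsic_pose.r and extrinsic_pose.t. No open3d call is involved.

```python
import numpy as np

# Hypothetical pose: 90 degree rotation about the z-axis plus a translation.
r = np.array([[0.0, -1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]])
t = np.array([0.1, 0.2, 0.3])

# Assemble the 4x4 homogeneous transform the same way as in the snippet above.
P = np.eye(4)
P[0:3, 0:3] = r
P[0:3, 3] = t

point = np.array([1.0, 0.0, 0.0])

# Append 1 to make the point homogeneous, transform, then drop the 1 again.
via_matrix = (P @ np.append(point, 1.0))[:3]
via_r_and_t = r @ point + t

print(via_matrix)
print(via_r_and_t)
```

Both results are identical; the homogeneous form simply packs the rotation and translation into a single matrix.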
For your convenience, we put all of this into the function construct_imu(size, extrinsic_pose) that you can use as follows:
```python
from modules.calib_viz_utils import *

# ... extract P_world_sensor from loaded calibration data

construct_imu(size=0.02, extrinsic_pose=P_world_sensor)
```
Step 4: draw the geometry
Now that we have defined our helper functions, we can loop over all our sensors, generate the appropriate geometry and throw the geometry list at our renderer.
```python
# Start with the world reference frame
scene_geometry = [o3d.geometry.TriangleMesh.create_coordinate_frame(
    size=0.05, origin=[0, 0, 0])]

for sensor_name in calibration.sensors:
    try:
        P_world_sensor = calibration.sensor_poses[sensor_name].I
        if sensor_name in calibration.cameras:
            # Fetch intrinsic parameters so we can properly render the 3D
            # representation of the cameras.
            cam_calib_data = calibration.calibration_parameters\
                ["sensors"][sensor_name]
            intrinsics = cam_calib_data["intrinsics"]["parameters"]
            camera_model = cam_calib_data["intrinsics"]["type"]

            # Generate the camera geometry (frustum, image center vector,
            # and camera name)
            scene_geometry.append(
                construct_camera(size=0.05, intrinsics=intrinsics,
                                 extrinsic_pose=P_world_sensor))
            scene_geometry.append(
                text_3d(text=sensor_name, scale=3,
                        extrinsic_pose=P_world_sensor))
        else:
            # Generate IMUs as coordinate frame meshes and indicate it
            # is an IMU by writing the sensor name parallel to the x-axis.
            scene_geometry.append(construct_imu(0.02, P_world_sensor))
            scene_geometry.append(
                text_3d(text=sensor_name, scale=3,
                        extrinsic_pose=P_world_sensor))
        print("Generated scene geometry for", sensor_name)
    except Exception as error:
        # Report the failure and continue with the remaining sensors
        print("Failed to generate geometry for", sensor_name, "-", error)

# Render scene geometry
o3d.visualization.draw_geometries(scene_geometry)
```
When we run the code, we get the following output:
There are a few things to note here:
The orientations of the sensor names are aligned with the x-axis of the sensor pointing towards the right of the text and the y-axis pointing downwards from it.
Notice that cam1 is upside down compared to cam0 and the IMUs are oriented in many different directions. This is not an accident but the result of design choices by the hardware designers. It does not matter if cameras are mounted upside-down as long as they are facing in the right direction because the image can always be flipped later. But it is important that this information is reflected in the calibration data. Our previous articles on multiview point clouds automatically take this into account during the rectification of the image data so no extra care needs to be taken during image data read-out to flip the images.
imu1 appears to have a larger coordinate frame than imu2 and imu3. This is because imu1 is the reference coordinate frame for all other sensors: the reference frame is drawn larger and covers the coordinate frame of imu1.
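The remark about upside-down cameras can be illustrated in isolation. A camera rotated by 180 degrees about its optical axis produces an image that is flipped along both axes, and reversing both array axes undoes that. This is a toy sketch with a made-up 3x4 array standing in for an image. It is not how our multiview examples handle the situation, since they rely on rectification instead.

```python
import numpy as np

# Made-up 3x4 "image" whose pixel values are just running indices.
image = np.arange(12).reshape(3, 4)

# Rotating by 180 degrees is the same as reversing rows and columns.
upright = image[::-1, ::-1]

print(upright)
print(np.array_equal(upright, np.rot90(image, 2)))
```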
Putting it all together
With this article, we have shown you how to easily visualize your calibration data and gain valuable information on how your sensors are mounted and positioned with respect to each other. Stay tuned for further articles that will help you bootstrap your CV applications even faster with camcalib!