

Chicken or the egg?

In robotics, it is common to use a map of the environment as a tool for navigation and motion planning. Specifically, maps are really good at telling you where you are in the environment: this is called localization. The ability to localize, though, is completely dependent on the quality of the map. Humans could construct this map, but you can imagine how much work it would take to measure out every detail. If the map is off by a little bit, how long will it take to correct that error and improve precision? What if something changes and you need to update the map? Since robots will use the map anyway, there is the idea of using the robot’s own sensing to build the map: this is called mapping.

One issue that you may see is that in a realistic situation, there is a conflicting workflow: in order to create a map, the robot needs a precise estimate of its current location (how can you tell the size of a room if you aren’t confident in how far you walked from one end to the other?). But in order to get a precise estimate of its current location, the robot must have a map to help it localize. In this sense, Simultaneous Localization and Mapping (SLAM) is often described as a chicken-and-egg problem, where the solution of one process requires the end result of the other. How do you solve such a problem?

The answer is not too complicated: you must solve both problems at the same time – hence the word “simultaneous”. The idea is that, taken separately, mapping and localization are essentially the same process of estimating positions under some error. In localization, you estimate your current position by knowing the exact positions of landmarks on the map; in mapping, you estimate the locations of landmarks given a precise estimate of your current location. SLAM leans into the mathy jungle of Bayesian estimation by recognizing that since the estimation process is the same, you can effectively lump all the positions that you want to estimate (robot and landmarks) into the same vector. By iteratively finding estimates for this vector, you solve both localization and mapping at the same time.

This blog post will focus on the most conceptually simple type of SLAM, called EKF SLAM. If you are not aware, the EKF (Extended Kalman Filter) is a way of estimating any variable that (1) follows a differential equation of motion through time, allowing you to predict the current value from previous values, and (2) is periodically observed, allowing you to correct your prediction with measurements. This post will not be an explanation of how EKF SLAM works, per se, as there are many works that already do this quite well (here is a video by Cyrill Stachniss, whose YouTube channel is a rich resource for any aspiring roboticist). Instead, I’ll explore some questions that are often left unexplored when EKF SLAM is discussed, as well as show you my personal implementation of 2D EKF SLAM for robots with an odometry motion model.

The 1D SLAM Problem

First, let’s look at the simple 1D case, i.e., robot movement along a straight line. It would look funny if the landmarks sat on the robot’s line of motion, so without loss of generality I’ll put the landmarks and the robot at different y positions, and allow the robot to change its x position by moving left and right. Below is a picture of the setup with two landmarks:

Since the robot can only move left and right, the motion model of the robot is quite simple. Here are the discrete motion dynamics that describe how the x position of the robot $x_r$ changes under commanded velocity $u$ over a time step of $dt$:

\begin{equation}\label{eq:motion_model} x_r^+ = x_r + u\times dt \end{equation}

The robot can only observe a landmark if it is within some sensing radius. Because this is a 1D simulation, I am only concerned with the x position of landmark $i$, denoted as $x_{li}$. When a landmark is observed, the robot observes the relative distance between itself and the landmark:

\begin{equation}\label{eq:meas_model} z_{li} = x_{li} - x_r \end{equation}
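The two models above are small enough to write down directly. Here is a minimal sketch of both (the function names are my own, not from any library):

```python
def motion_model(x_r, u, dt):
    """Predict the robot's next x position under commanded velocity u over time step dt."""
    return x_r + u * dt

def measurement_model(x_r, x_l):
    """Relative distance from the robot at x_r to a landmark at x_l."""
    return x_l - x_r
```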

Both the robotic motion \ref{eq:motion_model} and the measurement model \ref{eq:meas_model} are linear, so the EKF performs the same as the regular KF in this situation. This won’t be the case for the 2D EKF SLAM in the next section, but for now the linearity keeps the problem simple enough to see the concepts at play without too much complication.

In this simplified SLAM problem, we wish to estimate the x positions of the robot and the two landmarks, so three variables in total. In the EKF algorithm, our belief takes the form of a multivariate Gaussian. This Gaussian is defined by its mean $\mathbf{\mu}$, where we actually estimate the x positions to be, and its covariance matrix $\Sigma$, which encodes the uncertainty in this belief (along the diagonal) as well as the correlations between the x positions (in the off-diagonal elements). In general, these are defined as:

\begin{equation} \mathbf{\mu} = \begin{pmatrix} \mu_{rx} \\
\mu_{1x} \\
\mu_{2x} \end{pmatrix} \end{equation}

\begin{equation} \mathbf{\Sigma} = \begin{pmatrix} \Sigma_{rr} & \Sigma_{r1} & \Sigma_{r2} \\
\Sigma_{1r} & \Sigma_{11} & \Sigma_{12} \\
\Sigma_{2r} & \Sigma_{21} & \Sigma_{22} \end{pmatrix} \end{equation}

To be clear, $\Sigma_{ri}$ is the covariance between the robot’s x position and the $i$th landmark’s x position.

Initializing the EKF SLAM process is pretty simple. First, we provide an estimate of the locations of the robot and landmarks, $\mathbf{\mu}$. As is the case with regular EKF estimation, these starting estimates are not so critical. I chose $\mathbf{\mu} = \begin{pmatrix} 0 & 0 & 0 \end{pmatrix}^T$. There is a small but important point when choosing $\Sigma$: since SLAM is concerned with building a map of the environment, the origin can be defined however we choose. We can take advantage of this freedom by declaring that the origin is precisely where the robot starts. In terms of the multivariate Gaussian, $\Sigma_{rr}=0$. The landmarks are initialized with a large error, say $\Sigma_{ii}=100$.
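In code, this initialization is just a few lines (the value 100 for the landmark variance is the illustrative choice from above, not a magic constant):

```python
import numpy as np

# Initial belief: robot and both landmarks estimated at x = 0.
mu = np.zeros(3)                      # [x_r, x_l1, x_l2]

# The origin is *defined* as the robot's starting point, so the robot's
# variance is exactly 0; the landmarks start with a large variance.
Sigma = np.diag([0.0, 100.0, 100.0])
```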

Below is a gif showing how the estimates and uncertainties change over time in the simulation. A line is drawn between the robot and any landmark that it is close enough to observe.

Here are some observations that one can make:

  • When the robot moves without sensing a landmark, its uncertainty grows over time.
  • Unless observed by the robot, the uncertainty of a landmark does not change over time.
  • When the robot sees a landmark for the first time, the landmark inherits the robot’s uncertainty at that moment. At the same time, the uncertainty of the robot doesn’t really change.
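All three behaviors fall out of the standard EKF predict/update equations. Here is a minimal sketch of one cycle for the 1D linear case; the process noise `q` and measurement noise `r_var` are assumed values I picked for illustration:

```python
import numpy as np

def predict(mu, Sigma, u, dt, q=0.01):
    """Motion step: only the robot moves, so only its entries change."""
    mu = mu.copy()
    mu[0] += u * dt          # motion model: x_r+ = x_r + u*dt
    Sigma = Sigma.copy()
    Sigma[0, 0] += q         # robot uncertainty grows; landmarks untouched
    return mu, Sigma

def update(mu, Sigma, z, i, r_var=0.1):
    """Measurement step for landmark i, where z = x_li - x_r."""
    H = np.zeros((1, mu.size))
    H[0, 0], H[0, i] = -1.0, 1.0          # Jacobian of z w.r.t. the state
    S = H @ Sigma @ H.T + r_var           # innovation covariance
    K = Sigma @ H.T / S                   # Kalman gain
    mu = mu + (K * (z - (mu[i] - mu[0]))).ravel()
    Sigma = (np.eye(mu.size) - K @ H) @ Sigma
    return mu, Sigma
```

Running `predict` repeatedly grows only $\Sigma_{rr}$, and a first `update` on a landmark transfers the robot’s current uncertainty onto it, matching the bullet points above.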

There are two “magic” things that happen in EKF SLAM. The first happens when the robot sees a landmark it has already seen before. Since the uncertainty of a landmark doesn’t grow over time, sensing the landmark a second time helps reduce the robot’s own uncertainty. It is unlikely that this exchange will reduce the uncertainty of the landmark, since the robot’s uncertainty only grows as it moves without observing any landmarks.

The second bit of EKF SLAM magic is even more interesting. In the above gif, you may notice that the uncertainty of landmark 2 is fairly large when it is first observed. This makes sense, since the robot itself has a large uncertainty at that point. Keep watching, and you’ll see that the uncertainty of landmark 2 decreases when landmark 1 is observed for a second time! This is strange at first: why should the uncertainty of one landmark be reduced when another landmark is observed? Further thought reveals the intuition: seeing a landmark for a second time helps contextualize the locations of the other landmarks you observed in the past. Imagine walking around an unfamiliar neighborhood; seeing a landmark for a second time not only helps you understand where you are, but also helps place the landmarks you recently walked past. Since you typically see landmarks again only after travelling around a complete loop, this process is called closing the loop.

In the parlance of Gaussians, we say that the locations of the landmarks are correlated. These correlations live in the off-diagonal elements of $\Sigma$ (technically the off-diagonal elements are covariances, but they are closely related to correlations). Initially, the locations of the landmarks are uncorrelated in the EKF, i.e., $\Sigma$ is a diagonal matrix. As landmarks are observed, the off-diagonal entries of $\Sigma$ are filled in by the EKF algorithm. It’s these off-diagonal correlations that reduce the uncertainty of one landmark when another landmark is observed.
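This effect is just Gaussian conditioning at work, and you can see it in isolation. In the sketch below, the two landmarks share a strong covariance (the numbers are illustrative, not from the simulation); a direct, accurate measurement of landmark 1 shrinks landmark 2’s variance too:

```python
import numpy as np

# Joint covariance over (landmark 1, landmark 2) after loop travel,
# with a large off-diagonal term built up by the filter (illustrative values).
Sigma = np.array([[4.0, 3.5],
                  [3.5, 4.0]])

# Observe landmark 1 directly with small measurement variance r.
r = 0.1
H = np.array([[1.0, 0.0]])            # measurement picks out landmark 1
S = H @ Sigma @ H.T + r               # innovation covariance
K = Sigma @ H.T / S                   # Kalman gain
Sigma_post = (np.eye(2) - K @ H) @ Sigma

# Landmark 2's variance drops as well, purely through the off-diagonal term.
```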

The correlations between the robot location and the landmark locations aren’t shown in the above figure, since they aren’t as straightforward to visualize. To help show how these correlations are built up, below is the simulation along with the covariance matrix $\Sigma$ over time.

The 2D SLAM Problem

A more practical example is when the robot can move in 2 dimensions. For this simulation, I’ll use the differential drive kinematic model. Basically, the robot has three state variables $(x_r,y_r,\theta_r)$ and two inputs, linear velocity $v$ and rotational velocity $\omega$. The continuous time equations of motion are:

\begin{equation*} \dot{x_r} = v\cos(\theta_r) \end{equation*} \begin{equation*} \dot{y_r} = v\sin(\theta_r) \end{equation*} \begin{equation*} \dot{\theta_r } = \omega \end{equation*}
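For simulation purposes, these continuous dynamics can be stepped forward with a simple Euler discretization (Thrun’s book derives a slightly different exact arc-based form for $\omega \neq 0$; this sketch is the cruder version):

```python
import math

def diff_drive_step(x, y, theta, v, omega, dt):
    """Euler-integrate the differential-drive kinematics over one time step."""
    x += v * math.cos(theta) * dt
    y += v * math.sin(theta) * dt
    theta += omega * dt
    return x, y, theta
```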

This motion model is commonly used to teach EKF SLAM because (1) it is a simple model that applies to a lot of practical situations and (2) it is nonlinear, meaning we need the “Extended” in EKF. In order to set up the EKF, matrices need to be constructed that represent the locally linear kinematics around any generic operating point. Thrun’s chapter on EKF SLAM derives these matrices for us, so we’ll use those. When the robot observes a landmark in the 2D case, it observes the distance to the landmark $r_i$ and the angle of the landmark relative to its heading $\theta_i$. These are also nonlinear:

\begin{equation*} r_i = \sqrt{(x_r-x_i)^2 + (y_r-y_i)^2} \end{equation*} \begin{equation*} \theta_i = \arctan{((y_i-y_r)/(x_i-x_r))} - \theta_r \end{equation*}
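A small implementation note: in code, the bearing should be computed with `atan2` rather than the arctangent of the ratio, so the correct quadrant is recovered when the landmark is behind or to the left of the robot. A minimal sketch (the function name is mine):

```python
import math

def observe(x_r, y_r, theta_r, x_i, y_i):
    """Range and bearing from the robot pose to landmark i."""
    dx, dy = x_i - x_r, y_i - y_r
    r = math.hypot(dx, dy)                    # sqrt(dx^2 + dy^2)
    bearing = math.atan2(dy, dx) - theta_r    # atan2 handles all quadrants
    return r, bearing
```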

As is the case with the nonlinear motion model, EKF SLAM requires matrices that linearize around some generic operating point. These are also derived in Thrun’s book.

A run of the 2D simulation is shown below. The same observations that were made in the 1D case can also be made in the 2D case as well.

Below is the same run with the covariance matrix $\Sigma$ alongside for comparison with the 1D case. When the loop is closed, you can see a clear checkerboard pattern emerge in $\Sigma$. This is due to the fact that the x coordinates of the landmarks are correlated with each other, and likewise the y coordinates, but correlations between x and y don’t exist. This makes sense, since these are orthogonal directions: knowing the x coordinate of a landmark tells me nothing about its y coordinate.

Note: when looking at the colors in $\Sigma$, also check the color bar on the right to get a sense of scale. Until landmark 4 is observed, the largest value in $\Sigma$ is 100, so the nonzero terms that seem to “appear” after landmark 4 is observed were there all along; the changing colors are just the plot adjusting to a much smaller scale. The same goes for colors that seem to “fade”, where the scale may be growing to accommodate the growing uncertainty in the robot’s x/y position.


This blog post is intended to show EKF SLAM in a little more detail than standard classes typically do, giving the big ideas a chance to breathe. In a future blog post, I hope to explore what it means when we close the loop. For anyone who wishes to understand the material better, I suggest implementing your own EKF SLAM algorithm. There are several practical implementation points that I didn’t go over here, and they are best learned by struggling through this stuff yourself. But seeing SLAM work in the end is always worth the pain!