Sensor Fusion Using a Kalman Filter

5 minute read


How do you fuse measurements together?

The first time you learn about probabilistic robotics, you will probably hear about the Kalman filter. The Kalman filter is a way of estimating the state of a system that has both process noise and measurement noise. Founded in probability theory, it gives an optimal estimate based on the relative size of the process and measurement noise. If the process noise is very large and the measurement noise is very small, then the Kalman filter returns an estimated state that is closer to the measurement, and vice versa. For most students that first encounter the Kalman filter, you’re told the intuition, shown some complicated multivariate Gaussian math and are told to use it in a homework exercise.

Most introductory examples of the Kalman filter have only one measurement to use. What if there are more than one, though? How do you handle two different measurements of the same exact thing with a Kalman filter? This is a form of sensor fusion, and when I first learned the Kalman filter I had only a shaky understanding of the solution. The purpose of this blog post is to show how the Kalman filter can perform sensor fusion, and hopefully clarify the machinery of the Kalman filter in the process.

The two steps of the Kalman filter

Let us define $p(\mathbf{x})$ as the belief probability distribution of state space vector $\mathbf{x}$. A Kalman filter has two important steps when providing an estimate of the value of $\mathbf{x}$:

  • Prediction update step (use input $\mathbf{u}$ to update $p(\mathbf{x})$)
  • Measurement update step (use measurement $\mathbf{y}$ to update $p(\mathbf{x})$)

Since the Kalman filter assumes multivariate Gaussian probability distributions, only two quantities are recorded over each step $k$ in the Kalman filter: an estimate vector $\hat{\mathbf{x}}_k$ (corresponding to the mean of the Gaussian), and a covariance matrix $P_k$ (corresponding to the confidence of the measurement). Together, these quantities fully define the probability distribution $p(\mathbf{x})$, so when the Kalman filter updates $p(\mathbf{x})$ in the prediction and the measurement step, it really just updates these two variables.

What if we have two separate measurements $\mathbf{y}^a$ and $\mathbf{y}^b$, modeled with their own Gaussian noise? Cutting to the punchline, you perform two measurement steps, one with each measurement:

  • Prediction update step (use input $\mathbf{u}$ to update $p(\mathbf{x})$)
  • First measurement update step (use measurement $\mathbf{y}^a$ to update $p(\mathbf{x})$)
  • Second measurement update step (use measurement $\mathbf{y}^b$ to update $p(\mathbf{x})$)

This seems like a natural thing to do given two different measurements. Let’s see why this is so.

The Bayes Update

The Kalman filter is a specific type of filter called a Bayes filter. The Bayes filter also has two steps: one prediction step and one measurement step. In order to change the Kalman filter to incorporate more than one measurement step, we need to understand what each step means in terms of Bayesian estimation.

Define the belief distribution at iteration $k-1$ as $p_{k-1|k-1}(\mathbf{x})$. The $k-1|k-1$ part can be read as the probability distribution at time $k-1$, given information up to time $k-1$. The prediction step requires some process model that describes the probability of reaching state $\mathbf{x}^+$ given current state $\mathbf{x}$ and input $\mathbf{u}_{k-1}$:

\begin{equation*} p(\mathbf{x}^+|\mathbf{x},\mathbf{u}_{k-1}) \end{equation*}

The prediction update is thus described in the Bayes filter as:

\begin{equation} p_{k|k-1}(\mathbf{x}) = \sum_{\mathbf{x}’} p(\mathbf{x}|\mathbf{x}’,\mathbf{u}_{k-1}) p_{k-1|k-1}(\mathbf{x}) \end{equation}

At time step $k$, we then receive two measurements $\mathbf{y}^a_k$ and $\mathbf{y}^b_k$. The measurement update of the Bayes filter is then used to find the probability distribution of the state given these two measurements, or $p_{k|k}(\mathbf{x})=p(\mathbf{x}|\mathbf{y}^a_k,\mathbf{y}^b_k)$. In the case of a single measurement, Bayes’ rule is used to “switch” the random variable and the conditioned variable:

\begin{equation*} p(\mathbf{x}|\mathbf{y}) = \eta p(\mathbf{y}|\mathbf{x})p_{k|k-1}(\mathbf{x}) \end{equation*}

The $\eta$ term is a normalization factor, and we are usually unconcerned with it in almost all Bayes filters. The measurement process $p(\mathbf{y}^a|\mathbf{x})$ includes gaussian noise in the Kalman filter, and the probability distribution $p_{k|k-1}(\mathbf{x})$ is found in the prediction update.

With two state measurements, this update is only slightly altered. First, perform Bayes’ rule between the state variable $\mathbf{x}$ and a single measurement, say $\mathbf{y}^b$:

\begin{equation*} p(\mathbf{x}|\mathbf{y}^a_k,\mathbf{y}^b_k) = \eta^b p(\mathbf{y}^b_k|\mathbf{x},\mathbf{y}^a_k)p_{k|k-1}(\mathbf{x}|\mathbf{y}^a_k) \end{equation*}

We can simplify this expression by making the reasonable assumption that the measurement model for $\mathbf{y}^b_k$ is independent of (1) the iteration step $k$ and (2) the value of the other measurement $\mathbf{y}^a$, so that $p(\mathbf{y}^b_k|\mathbf{x},\mathbf{y}^a_k) = p(\mathbf{y}^b|\mathbf{x})$. This results in:

\begin{equation} \label{eq:second_measurement} p(\mathbf{x}|\mathbf{y}^a_k,\mathbf{y}^b_k) = \eta^b p(\mathbf{y}^b|\mathbf{x})p_{k|k-1}(\mathbf{x}|\mathbf{y}^a_k) \end{equation}

This is great, but we still need $p_{k|k-1}(\mathbf{x}|\mathbf{y}^a_k)$, or the probability distribution of $\mathbf{x}$ conditioned on the value for $\mathbf{y}^a_k$. This term can be found performing Bayes’ rule a second time:

\begin{equation} \label{eq:first_measurement} p(\mathbf{x}|\mathbf{y}^a_k) = \eta^a p(\mathbf{y}^a|\mathbf{x})p_{k|k-1}(\mathbf{x}) \end{equation}

Eq. \ref{eq:first_measurement} can be thought of as the first measurement update to find the a posteriori probability distribution with respect to measurement $\mathbf{y}^a$. We then take this updated distribution and perform a second measurement update with this update distribution and $\mathbf{y}^b$ and in Eq. \ref{eq:second_measurement}. The result is an estimate that “fuses” two different measurements from different sensor, verifying our intuition.

One final note

It should be that incorporating two different measurements should arrive at the same answer, independent of the order in which you incorporate the measurements. Indeed, you can see this by substituting Eq. \ref{eq:first_measurement} into Eq. \ref{eq:second_measurement}:

\begin{equation} p(\mathbf{x}|\mathbf{y}^a_k,\mathbf{y}^b_k) = \eta^a\eta^b p(\mathbf{y}^a|\mathbf{x})p(\mathbf{y}^b|\mathbf{x})p_{k|k-1}(\mathbf{x}) \end{equation}

The measurement update is symmetric in the label for measurements $a$ and $b$, so it cannot depend on the order in which the measurement updates are applied.