We present a particle filtering algorithm for robustly tracking the contours of multiple deformable objects through severe occlusions. Our algorithm combines a multiple blob tracker with a contour tracker in a manner that keeps the required number of samples small. This is a natural combination because the two algorithms have complementary strengths. The multiple blob tracker is an effective solution to the model-data association problem and searches a smaller and simpler space. Contour tracking, on the other hand, gives more fine-tuned results and relies on cues that are available during severe occlusions. Our combination of these two algorithms accentuates the advantages of each. We demonstrate good performance on a challenging video sequence of three identical mice that contains multiple instances of severe occlusion.


In this paper, we address the problem of tracking the contours of multiple identical mice from video of the side of their cage; see Figure 3 for example video frames. Although existing tracking algorithms might work well from an overhead view of the cage, the majority of vivaria are set up in a way that prohibits this view. A solution to the side view tracking problem would be extremely useful for medical researchers wishing to automatically monitor the health and behavior of lab animals. The contour tracking problem is also interesting from a computer vision standpoint. Tracking mouse contours is difficult because mice are extremely deformable 3D objects with unconstrained motion. Thus, any mouse contour model general enough to represent most possible deformations is necessarily complex. The biggest challenge to tracking mice from a side view is that the mice occlude one another severely and often. Tracking the mice independently would inevitably result in two trackers following the same mouse. Thus, the data association problem must be handled explicitly by a multitarget tracking algorithm. These algorithms involve searching a space whose size increases exponentially with the number of objects. Directly searching the contour space for all mice at once is prohibitively expensive.

In addition, tracking individual mouse identities is difficult because the mice are indistinguishable. We cannot rely on object-specific identity models and must instead accurately track the mice during occlusions. This is challenging because mice have few if any trackable features, their behavior is erratic, and edges (particularly between two mice) are hard to detect. Other features of the mouse tracking problem that make it difficult are clutter (the cage bedding, scratches on the cage, and the mice's tails), inconsistent lighting throughout the cage, and moving reflections and shadows cast by the mice.

Our algorithm is of general interest to the tracking community because the challenges to successful mouse tracking are common to many real world tracking applications. While many video sequence testbeds are constructed to show off the novelty of an algorithm, our algorithm is constructed to address the challenges of a specific tracking problem. Thus, our feature extraction algorithm must be powerful, our objects' state representation must be detailed, and our algorithm must be able to search the complex parameter space with a limited number of samples.

We propose a combination of existing blob and contour tracking algorithms, each of which addresses a subset of the challenges above. Our combination accentuates the strengths of each algorithm to address all the challenges specific to mouse tracking. In addition, we capitalize on the independence assumptions made by our model, so that most of our search can be done independently for each mouse. This reduces the size and complexity of the search space, and allows our Monte Carlo sampling algorithm to search the complex state parameter space with a reasonable number of samples. Our algorithm works with a detailed representation of the mouse contour to achieve positive results on a video sequence of three mice exploring a cage.

The paper is organized as follows. In Section 2, we describe the algorithms we build on: the Bayesian Multiple Blob (BraMBLe) tracker [5] and MacCormick et al.'s contour likelihood model [6]. In Section 3, we describe the model assumed for the blob and contour tracking problem. In Section 4, we describe our particle filtering algorithm for fitting contours given this model. In Section 5, we present specific details of our algorithm and the results for a challenging video sequence.

2. Related Work

Our algorithm builds on the BraMBLe tracker [5] and MacCormick et al.'s contour tracking framework [6, 1]. Both of these approaches are based on particle filtering. In this section, we first introduce standard particle filtering with the purpose of introducing notation; see [2] for a complete treatment of particle filtering. Then, we describe the blob and contour tracking algorithms briefly, and point out their strengths and weaknesses for our application.

2.1. Particle Filtering

Particle filtering is a Monte Carlo sampling algorithm for estimating properties of hidden variables given observations in a hidden Markov model. For tracking from video, $x_t$, the state at time $t$, represents the positions of the objects in frame $t$, and $y_t$, the observation at time $t$, is a function of video frame $t$. Particle filtering assumes that we can directly sample from $p(x_t \mid x_{t-1})$, the probability of transitioning to state $x_t$ from state $x_{t-1}$. It also assumes that we can easily evaluate the observation likelihood, $p(y_t \mid x_t)$, the likelihood of observing $y_t$ in state $x_t$.

Particle filtering sequentially constructs particle set representations of $p(x_t \mid y_{1:t})$, the posterior probability of state $x_t$ given the sequence of observations $y_{1:t} = (y_1, \ldots, y_t)$, using the following recursive factorization:

$$p(x_t \mid y_{1:t}) \propto p(y_t \mid x_t)\, p(x_t \mid y_{1:t-1}) = p(y_t \mid x_t) \int p(x_t \mid x_{t-1})\, p(x_{t-1} \mid y_{1:t-1})\, dx_{t-1}.$$

Let $\{(x_t^{(i)}, w_t^{(i)})\}_{i=1}^{N}$ be the set of $N$ particles representing the filtering distribution at frame $t$. Particle filtering uses importance sampling to sequentially create the current particle set, $\{(x_t^{(i)}, w_t^{(i)})\}_{i=1}^{N}$, from the previous set, $\{(x_{t-1}^{(i)}, w_{t-1}^{(i)})\}_{i=1}^{N}$. In its most standard form, the importance function and importance weights are:

$$x_t^{(j)} \sim p(x_t \mid y_{1:t-1}) \approx \sum_{i=1}^{N} w_{t-1}^{(i)}\, p(x_t \mid x_{t-1}^{(i)}),$$

$$w_t^{(j)} \propto p(y_t \mid x_t^{(j)}),$$

where the weights are normalized to sum to one.
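The recursion above can be sketched in a few lines. The following is a minimal illustration, not the implementation used in this paper: the function names (`bootstrap_step`, `gaussian_pdf`) and the one-dimensional toy model used in the demo are assumptions made for the example.

```python
import math
import random


def bootstrap_step(particles, weights, y, transition_sample, likelihood, rng):
    """One step of the standard (bootstrap) particle filter.

    particles, weights: the previous set {(x_{t-1}^(i), w_{t-1}^(i))}
    y: the current observation y_t
    transition_sample(x_prev, rng): draws x_t ~ p(x_t | x_{t-1})
    likelihood(y, x): evaluates p(y_t | x_t)
    """
    n = len(particles)
    # Sample from sum_i w_{t-1}^(i) p(x_t | x_{t-1}^(i)):
    # pick an ancestor by weight, then push it through the dynamics.
    ancestors = rng.choices(particles, weights=weights, k=n)
    new_particles = [transition_sample(x, rng) for x in ancestors]
    # The importance weights are the observation likelihoods, normalized.
    raw = [likelihood(y, x) for x in new_particles]
    total = sum(raw)
    if total == 0.0:
        # Degenerate case: fall back to uniform weights.
        return new_particles, [1.0 / n] * n
    return new_particles, [w / total for w in raw]


def gaussian_pdf(v, mean, std):
    return math.exp(-0.5 * ((v - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical toy model: random-walk dynamics, noisy position observations.
    walk = lambda x, r: x + r.gauss(0.0, 1.0)
    obs_lik = lambda y, x: gaussian_pdf(y, x, 0.5)
    particles = [rng.gauss(0.0, 5.0) for _ in range(500)]
    weights = [1.0 / 500] * 500
    for y in [1.0, 1.5, 2.2, 3.0]:
        particles, weights = bootstrap_step(particles, weights, y, walk, obs_lik, rng)
    estimate = sum(w * x for x, w in zip(particles, weights))
    print(f"posterior mean estimate: {estimate:.2f}")
```

Resampling ancestors by weight and then applying the dynamics is one common way to draw from the mixture $\sum_i w_{t-1}^{(i)} p(x_t \mid x_{t-1}^{(i)})$; other resampling schemes would serve equally well here.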

2.2. BraMBLe Likelihood

The Bayesian Multiple Blob (BraMBLe) tracker [5] provides a solution to tracking multiple occluding blobs/objects from video. The novelty of BraMBLe is its observation likelihood model, which provides a natural and robust solution to the data association problem. Given a hypothesized set of blob positions $x$, BraMBLe first computes the label $l_g(x)$ of each image location $g$ on a grid as either foreground (within some blob) or background (outside of all blobs). The likelihood of an observed pixel value $y_g$ for each label is modeled as a Gaussian mixture model (a separate background GMM is learned for each grid location; a common GMM is learned for the foreground). The pixel values are assumed to be independent given the state, so the likelihood of the entire frame is assumed to be the product of the individual pixel likelihoods:




$$p(y \mid x) = \prod_{g} p_g(y_g \mid l_g(x)),$$

where $p(y_g \mid \text{fore})$ and $p_g(y_g \mid \text{back})$ are the GMMs.
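As a rough illustration of this product of per-location likelihoods, the sketch below labels each grid location by testing whether it falls inside any hypothesized ellipse and sums the log-likelihoods. It is a simplified sketch under stated assumptions: pixel values are scalars rather than filtered color vectors, the GMMs are one-dimensional, and all names (`gmm_pdf`, `inside_ellipse`, `blob_log_likelihood`) are hypothetical.

```python
import math


def gmm_pdf(y, components):
    """Density of a 1-D Gaussian mixture; components = [(weight, mean, std), ...]."""
    return sum(
        w * math.exp(-0.5 * ((y - mu) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))
        for w, mu, sd in components
    )


def inside_ellipse(px, py, blob):
    """blob = (mu_x, mu_y, a, b, theta): centre, semi-axes, rotation in radians."""
    mx, my, a, b, theta = blob
    dx, dy = px - mx, py - my
    # Rotate the offset into the ellipse's own axes.
    u = math.cos(theta) * dx + math.sin(theta) * dy
    v = -math.sin(theta) * dx + math.cos(theta) * dy
    return (u / a) ** 2 + (v / b) ** 2 <= 1.0


def blob_log_likelihood(grid, blobs, fore_gmm, back_gmms):
    """log p(y | x) = sum over grid locations g of log p_g(y_g | l_g(x)).

    grid: {(gx, gy): y_g}
    back_gmms: {(gx, gy): background GMM learned for that location}
    fore_gmm: the single GMM shared by all foreground locations
    """
    total = 0.0
    for (gx, gy), y in grid.items():
        # l_g(x): foreground if the location lies within some blob.
        is_fore = any(inside_ellipse(gx, gy, blob) for blob in blobs)
        gmm = fore_gmm if is_fore else back_gmms[(gx, gy)]
        total += math.log(gmm_pdf(y, gmm) + 1e-300)  # guard against log(0)
    return total
```

Because the label of a location depends on all blobs at once, this likelihood cannot be evaluated one object at a time, which is what makes it a joint model over the $k$ objects.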

BraMBLe does suffer from the curse of dimensionality; the space searched by BraMBLe grows exponentially with the number of objects tracked. However, the number of parameters required to describe a blob is minimal. In addition, the likelihood function is smooth, in comparison to likelihood functions used for contour tracking (e.g. [6]). This makes the search simpler and more robust.

Empirically, we found that BraMBLe performs well on the mouse tracking application in locating the approximate positions of the mice (up to permutations of the identity labels). The main failure is that our blob representation of the state of a mouse (an ellipse) was not detailed enough to give precise object positions. This approximate model makes the likelihood higher for some reasonable fits than others, based on the relationship between the model and the object's shape (in the common case that the 2D mouse shape is not precisely elliptical). Because of the invalid independence assumptions made in the likelihood model, some reasonable fits will be evaluated as orders of magnitude better than others. This is particularly evident during occlusions, when the set of reasonable blob fits is large. BraMBLe will choose one fit as best early in the occlusion, and give this fit a disproportionately large weight. The particle sets produced are thus extremely sparse, and BraMBLe cannot later recover from its mistake.

2.3. MacCormick et al.'s Contour Likelihood

We combine the blob likelihood from BraMBLe with the "generic contour likelihood" described in [6]. In contour tracking, the state is represented by parameters which define a B-spline. To determine the likelihood of a video frame given the hypothesized state, edges in the video frame are detected along a sparse set of short measurement lines normal to and centered on the B-spline. The intersection between the hypothesized contour state and the measurement line is the center of the measurement line.

Let $y$ be the binary vector indicating where edges are detected on a measurement line, $n$ be the number of edges detected, and $L$ be the measurement line length. If $n = 0$, then the measurement line likelihood is $p(y \mid x) = p_{01}$. If $n \geq 1$, with probability $1 - p_{01}$, one of these edge detections was produced by the hypothesized intersection, and the rest were produced by background clutter. With probability $p_{01}$, they were all produced by background clutter. The location of the edge produced by the hypothesized intersection is Gaussian around the line center with variance $\sigma^2$. Clutter edge detections are uniformly distributed along the line. The number $n'$ of clutter detections is a Poisson random variable, $b(n') = e^{-\lambda L} (\lambda L)^{n'} / n'!$. Thus, for $n \geq 1$,





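$$p(y \mid x) = (1 - p_{01})\, \frac{b(n-1)}{n L^{n-1}} \sum_{i} y_i\, N(i; L/2, \sigma^2) + p_{01}\, \frac{b(n)}{L^{n}}.$$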




The measurement line observations are assumed to be independent given the state, so the total likelihood is the product of the individual likelihoods.
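The two-case description of a single measurement line can be written as a short function. This is a sketch of one way to read that description, not a transcription of the likelihood in [6]: the $1/n$ factor (each detection equally likely to be the true one) and the $1/L$ density for each clutter detection are assumptions, as are the function names.

```python
import math


def poisson_pmf(n, rate):
    """b(n): probability of n clutter detections at the given rate (lambda * L)."""
    return math.exp(-rate) * rate ** n / math.factorial(n)


def gaussian_pdf(v, mean, std):
    return math.exp(-0.5 * ((v - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))


def line_likelihood(y, p01, sigma, lam):
    """Likelihood of a binary edge vector y observed on one measurement line.

    y[i] == 1 where an edge was detected at position i. The line has length
    L = len(y) - 1 and the hypothesized contour crosses it at L / 2.
    """
    L = len(y) - 1
    n = sum(y)
    if n == 0:
        return p01  # no detections at all
    # Case 1 (probability 1 - p01): one detection comes from the contour and
    # the other n - 1 are clutter. Assumed: each detection is equally likely
    # to be the true one, hence the average over detections.
    from_contour = sum(gaussian_pdf(i, L / 2.0, sigma) for i, yi in enumerate(y) if yi) / n
    hit = (1.0 - p01) * poisson_pmf(n - 1, lam * L) * from_contour / L ** (n - 1)
    # Case 2 (probability p01): all n detections are clutter, uniform on the line.
    miss = p01 * poisson_pmf(n, lam * L) / L ** n
    return hit + miss


def contour_likelihood(lines, p01, sigma, lam):
    """Product over measurement lines, which are assumed independent given the state."""
    result = 1.0
    for y in lines:
        result *= line_likelihood(y, p01, sigma, lam)
    return result
```

With these definitions, a single detection at the line center scores higher than a single detection at the end of the line, which is the behavior the Gaussian term is meant to produce.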

We found that, even for one mouse, the search performed by contour tracking is much more difficult than that performed by BraMBLe. In our occlusion-free training sequence of 300 frames, the generic contour tracker lost one of three mice completely. This is because, for a detailed contour model, the parameter space is larger than for a blob model. In addition, the contour observation likelihood is much less smooth than the blob likelihood: if neither of a pair of contours is a very good fit to the video frame, then the rankings given to the contours are often not meaningful. However, for contours that are very close to fitting the data, the contour likelihood is usually peaked around a meaningful fit. This is in contrast to BraMBLe, in which the rankings are usually meaningful on a large scale but not meaningful on a small scale. Even for one object, low-level blob tracking is useful for guiding the contour search [4]. Thus, while a generalization of this contour likelihood to multiple targets exists [7], we found that it alone did not work well for our complex multitarget contour model. We include contour tracking because there is more contour signal available during occlusions, both in the silhouette of the occlusion and the boundary between pairs of mice.

3. A Blob and Contour Object Model

We use a particle filtering algorithm to approximate $p(x_t \mid y_{1:t})$, the posterior distribution of the state of all $k$ mice in frame $t$, $x_t = x_{t,1:k}$, given blob and contour observations for frames 1 through $t$, $y_{1:t} = (y_{c,1:t}, y_{b,1:t})$. To perform the blob and contour particle filtering described in Section 4, we must be able to generate samples from and evaluate the transition distributions. These are:

• $p(x_{tm} \mid x_{t-1,m})$: the distribution of the contour state at time $t$ for mouse $m$ given the contour state at time $t-1$ for mouse $m$.

• $p(x_{btm} \mid x_{t-1,m})$: the distribution of the blob state at time $t$ for mouse $m$ given the contour state at time $t-1$ for mouse $m$.

• $p(x_{tm} \mid x_{btm}, x_{t-1,m})$: the distribution of the contour state at time $t$ for mouse $m$ given the blob state at time $t$ for mouse $m$ and the contour state at time $t-1$ for mouse $m$.

We must also be able to evaluate the observation likelihoods. These are:

• $p(y_{ctm} \mid x_{tm})$: the likelihood of observing the contour observations at time $t$ for mouse $m$ given the contour state at time $t$ for mouse $m$.

• $p(y_{bt} \mid x_{t,1:k}) = p(y_{bt} \mid x_{b,t,1:k})$: the likelihood of observing the blob observations at time $t$ given the state at time $t$ for all $k$ mice.

Notice that the only model that depends on all $k$ mice is the blob observation likelihood. This is because of the many independence assumptions made. First, we assume that the mice move independently of one another, so the transition probability for $k$ mice can be factored as the product of the transition probabilities for each mouse. Second, we assume that the blob and contour observations are independent given the state, so we can factor the observation likelihood as the product of the blob and the contour likelihoods. Third, the assumptions made by the generic contour likelihood allow it to be factored as the product of the likelihoods of the relevant measurement lines for each mouse.
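Written out in the notation of this section, the three assumptions amount to the following factorizations. This is a restatement of the text above rather than an additional result; $y_{ct}$ denotes the contour observations for all mice at time $t$.

```latex
\begin{align*}
p(x_{t,1:k} \mid x_{t-1,1:k}) &= \prod_{m=1}^{k} p(x_{tm} \mid x_{t-1,m}), \\
p(y_{bt}, y_{ct} \mid x_{t,1:k}) &= p(y_{bt} \mid x_{t,1:k})\, p(y_{ct} \mid x_{t,1:k}), \\
p(y_{ct} \mid x_{t,1:k}) &= \prod_{m=1}^{k} p(y_{ctm} \mid x_{tm}).
\end{align*}
```

Only $p(y_{bt} \mid x_{t,1:k})$ fails to split into per-mouse terms, which is why it is the sole factor that must be evaluated jointly.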

In this section, we describe the assumed forms of these distributions. We first describe the representation and parameterization of the blob and contour states in Section 3.1. In Section 3.2, we describe the transition distributions, which make many independence assumptions based on the state parameterization. In Section 3.3, we describe the assumed observation model.

3.1. The Blob-Contour State

We have a combined blob and contour state model. An individual mouse blob is modeled as an ellipse, which we describe by five shape and three velocity parameters. We chose to parameterize the blob by the center x-coordinate, $\mu_x$, the center y-coordinate, $\mu_y$, the semimajor axis length, $a$, the semiminor axis length, $b$, the rotation after scaling, $\theta$, the center x-velocity, $v_x$, the center y-velocity, $v_y$, and the semimajor axis length velocity, $v_a$:

$$x_{bm} = (\mu_{xm}, \mu_{ym}, a_m, b_m, \theta_m, v_x, v_y, v_a)^\top.$$

Figure 1: The 12 B-spline contour templates chosen to represent a mouse contour. The circles are the locations of the measurement lines.

An individual mouse contour is modeled as any affine transformation of any of a discrete set of closed B-spline contour templates. Figure 1 shows the 12 templates we chose. These contours each have 12 knots, but there is no assumed correspondence between these knots. This was necessary because the shape of the 2D mouse image varies so much. Thus, there is no sense of a linear combination of contours. This means we cannot use the model described in [1]. The locations of the measurement lines along each contour template are set by hand, as shown in Figure 1. In addition, the minimum and maximum allowed eccentricity, pre-scaling rotation, and post-scaling rotation are set for each contour (no limits on post-scaling rotation are given for mice hanging from the ceiling). Each contour is also labeled as facing left, right, or both left and right.

The contour state was parameterized by the eight parameters describing the ellipse and its velocity, the rotation before scaling, $\phi$, the contour template, $c$, and whether the template is flipped, $f$: $x_m = (x_{bm}, \phi_m, c_m, f_m)^\top$. The state of all $k$ mice is the concatenation of the individual mouse state vectors, $x_{1:k} = (x_1^\top, \ldots, x_k^\top)^\top$.
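For readers who prefer code to index notation, the state described in this section can be pictured as two small record types. The class and field names below are illustrative assumptions; only the list of parameters comes from the text.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class BlobState:
    """x_bm: five ellipse shape parameters and three velocities."""
    mu_x: float   # center x-coordinate
    mu_y: float   # center y-coordinate
    a: float      # semimajor axis length
    b: float      # semiminor axis length
    theta: float  # rotation after scaling
    v_x: float    # center x-velocity
    v_y: float    # center y-velocity
    v_a: float    # semimajor axis length velocity


@dataclass(frozen=True)
class ContourState:
    """x_m: the blob parameters plus the contour-only parameters."""
    blob: BlobState
    phi: float      # rotation before scaling
    template: int   # index c of the contour template (0..11 for 12 templates)
    flipped: bool   # f, whether the template is mirrored


def joint_state(mice: Sequence[ContourState]) -> Tuple[ContourState, ...]:
    """x_{1:k}: the concatenation of the individual mouse states."""
    return tuple(mice)
```

Nesting the blob inside the contour state mirrors the fact that a blob state can always be read off a contour state, which the contour-to-blob transition model relies on.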

3.2. Model Dynamics

We now describe the assumed transition distributions. We assume simple Gaussian transition models for continuous parameters. For the discrete contour parameters, we calculate a probability for each contour template based on whether the contour template is the same ($(c_{tm}, f_{tm}) = (c_{t-1,m}, f_{t-1,m})$) and whether the contour template is facing the same direction ($\text{direction}(c_{tm}, f_{tm}) = \text{direction}(c_{t-1,m}, f_{t-1,m})$). All our transition models are similar.

We describe how to generate a sample from $p(x_{tm} \mid x_{t-1,m})$. We first generate the continuous shape parameters $x_s = (\mu_x, \mu_y, a, b, \theta, \phi)^\top$. These follow independent damped constant velocity and/or autoregressive models:

$$p(x_{st} \mid x_{t-1}) = N\big(x_{st};\, (I - \Gamma)(x_{s,t-1} + \Lambda v_{s,t-1}) + \Gamma \tilde{x}_s,\, \Sigma_s\big),$$

where $\Gamma$ is the diagonal autoregressive constant (which is set to 0 for $\mu_x$, $\mu_y$, and $\theta$), $\Lambda$ is the diagonal damping constant (which is set to 0 for $b$, $\theta$, and $\phi$), and $\Sigma_s$ is the assumed diagonal covariance matrix. The velocities are then set by subtracting the previous shape state from the current shape state.
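Because all of the matrices are diagonal, a sample of the shape parameters can be drawn one parameter at a time. The sketch below assumes that the term multiplied by $\Gamma$ is a fixed reference value toward which the autoregressive part pulls each parameter (called `rest` here); that interpretation, the function name, and the dictionary layout are assumptions for illustration.

```python
import random

PARAMS = ("mu_x", "mu_y", "a", "b", "theta", "phi")


def sample_shape(prev, vel, gamma, lam, rest, std, rng):
    """Draw the shape parameters for frame t, one parameter at a time.

    mean = (1 - gamma) * (prev + lam * vel) + gamma * rest, with diagonal
    Gamma, Lambda and Sigma_s stored as dicts keyed by parameter name.
    Parameters without a stored velocity are treated as having velocity 0.
    """
    new = {}
    for p in PARAMS:
        predicted = prev[p] + lam[p] * vel.get(p, 0.0)
        mean = (1.0 - gamma[p]) * predicted + gamma[p] * rest[p]
        new[p] = rng.gauss(mean, std[p])
    # The velocities are the current shape minus the previous shape.
    new_vel = {p: new[p] - prev[p] for p in PARAMS}
    return new, new_vel
```

Setting `gamma` to 0 for a parameter gives a pure damped constant velocity model, and setting `lam` to 0 gives a pure autoregressive model, matching the two special cases named in the text.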

Next, we generate a contour. There is a high probability of the contour in the current frame being the same as the contour in the previous frame. The probability of changing to a different contour is based on whether the current contour and the new contour face the same direction. We first determine which contours are allowed given the generated shape. If no contour is allowed, we rotate the shape after scaling by the minimum amount to allow at least one contour. We then decide which direction (left or right) the generated contour should face (if contours facing both directions are allowed). We flip the direction with probability proportional to the squared eccentricity. Given the direction, we choose an allowed contour facing that direction. If the direction has not changed, we choose the same contour with high probability. All other contours allowed in a given direction are given equal weight.
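The discrete part of this procedure can be sketched as follows. This is a simplified illustration: it assumes each template faces exactly one direction (templates labeled as facing both ways, and the flip flag, are ignored), it assumes the list of allowed templates is non-empty, and it takes the flip probability and the probability of keeping the same template as inputs. All names are hypothetical.

```python
import random


def sample_template(prev, allowed, facing, p_flip, p_same, rng):
    """Choose the contour template for the current frame.

    prev: template id used in the previous frame
    allowed: non-empty list of template ids permitted by the generated shape
    facing: {template id: "left" or "right"}
    p_flip: probability of changing direction when both directions are allowed
    p_same: probability of keeping the previous template when it is still allowed
    """
    directions = sorted({facing[c] for c in allowed})
    direction = facing[prev]
    if direction not in directions:
        # The previous direction is no longer possible, so the change is forced.
        direction = directions[0]
    elif len(directions) > 1 and rng.random() < p_flip:
        direction = [d for d in directions if d != direction][0]
    candidates = [c for c in allowed if facing[c] == direction]
    if prev in candidates:
        if len(candidates) == 1 or rng.random() < p_same:
            return prev
        candidates = [c for c in candidates if c != prev]
    # All remaining templates facing the chosen direction get equal weight.
    return rng.choice(candidates)
```

In the setting described above, `p_flip` would be computed by the caller from the squared eccentricity of the generated shape.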

The contour to blob transition model, $p(x_{btm} \mid x_{t-1,m})$, is the same as $p(x_{tm} \mid x_{t-1,m})$ for all relevant parameters. The assumed model of the contour given the blob and previous contour, $p(x_{tm} \mid x_{btm}, x_{t-1,m})$, is the same as the contour to contour transition model for parameters used only to describe the contour. The other parameters are assumed to be Gaussian around the blob parameters, with the same variance as the contour to contour transition model.

3.3. The Observation Likelihood

We use an observation likelihood model that is a combination of the BraMBLe likelihood and a soft version of the generic contour likelihood. Both likelihoods are dependent on the observation features extracted from the raw video frame. In Section 3.3.1, we describe the observation features. In Section 3.3.2, we describe our soft generalization of the contour likelihood. In Section 3.3.3, we describe our combined blob and contour likelihood.

3.3.1. Feature Extraction

BraMBLe relies on the foreground and background models of pixel value being distinguishable from one another. The bedding and the mice are similar in color, so using just the color of the pixel does not work. BraMBLe therefore filters the three color channels with both Gaussian and Laplacian of Gaussian filters at a set scale. The Laplacian of Gaussian filter is useful in differentiating the smooth mouse texture from the variable bedding texture. The only noticeable failure for this choice of features is the mouse's shadow on the bedding. This is in between the mouse and lighted bedding in color, and the training data we used to learn the background model did not accurately represent both lighted and shaded modes at many locations in the image. Figure 3(a) shows the log-likelihood ratio of foreground over background for some example frames.

Contour tracking relies on an accurate edge detection algorithm. This is a challenge for our video sequence because there is a large amount of clutter in the scene. Much of the bedding has a high gradient in image intensity and there are scratches on the cage. It is also difficult to detect edges between the mice and the bedding, because the bedding in the shadow of the mouse is very similar in color to the mouse. In addition, the edges between pairs of mice are also subtle (if visible at all). We tried numerous edge detection methods, and the only algorithm that gave reasonable results was the boundary detection algorithm used by the Berkeley Segmentation Engine (BSE) [8]. BSE computes the posterior probability of an edge based on the brightness, texture, and color gradient using a classifier trained on 12,000 manually labeled images. We credit BSE's superior performance in part to the texture gradient, which is robust to the types of clutter described. Figure 3(b) shows example images illustrating BSE's performance. The major downside of the BSE boundary detector is that it is expensive: processing one entire image took over five minutes on a 2.8 GHz machine. We hypothesize that this algorithm can be optimized for tracking applications to reduce this cost.

3.3.2. A Soft Contour Likelihood

Because the BSE boundary detector outputs meaningful probabilities rather than hard edge classifications, we used a soft version of the generic contour likelihood. This was essential for detecting edges between pairs of mice, as BSE often output a weak response for these edges (see Figure 3(b)). The BSE output for location $i$ is $p(\text{edge} \mid y_i)$, the probability that $i$ is an edge given the observations $y_i$. We then modeled the probability of a binary classification $z$ of each pixel along the line, given the edge features $y$ along a measurement line, as the product

$$p(z \mid y) = \prod_{i=0}^{L} p(\text{edge} \mid y_i)^{z_i}\, \big(1 - p(\text{edge} \mid y_i)\big)^{1 - z_i}.$$

We assume equal priors for all $z$, so this is equal to $p(y \mid z)$. The probability of observing measurement line $y$ given the hypothesized contour is the sum over all these possibilities:

$$p(y \mid x) = \sum_{z \in \{0,1\}^{L+1}} p(y \mid z)\, p(z, n \mid x),$$

where $p(z, n \mid x)$ is the generic contour likelihood described in Section 2.3. While this computation is extremely fast for small $L$, it grows exponentially with $L$. To combat this, the sum can be taken only over $z$ such that $p(y \mid z)$ is large.

3.3.3. The Blob-Contour Observation Likelihood

We make the simple and reasonable assumption that the blob observations $y_b$ and the contour observations $y_c$ are independent given the state of the object: $p(y_b, y_c \mid x) = p(y_b \mid x)\, p(y_c \mid x)$. Here, $p(y_b \mid x)$ is the BraMBLe blob likelihood and $p(y_c \mid x) = \prod_{m=1}^{k} p(y_{cm} \mid x_m)$ is the soft contour likelihood. The ICondensation algorithm avoids this assumption by using a blob posterior (note that this blob posterior is not that from BraMBLe) as the importance function for contour tracking [4]. For our algorithm, it is essential that $p(y_b \mid x)$ be included in our likelihood in order to use BraMBLe to solve the data association problem.
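The soft contour likelihood of Section 3.3.2 can be illustrated by brute-force enumeration over the binary vectors $z$ for a short measurement line. The sketch below is an assumption-laden illustration: the generic line likelihood it sums over uses an assumed normalization ($1/n$ for the choice of true detection, $1/L$ per clutter detection), the pruning threshold `min_pyz` is a hypothetical parameter, and none of the names come from the paper.

```python
import itertools
import math


def generic_line_likelihood(z, p01, sigma, lam):
    """Hard-detection likelihood of binary vector z on one measurement line (assumed form)."""
    L, n = len(z) - 1, sum(z)
    if n == 0:
        return p01
    b = lambda m: math.exp(-lam * L) * (lam * L) ** m / math.factorial(m)
    gauss = lambda i: math.exp(-0.5 * ((i - L / 2.0) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
    hit = (1.0 - p01) * b(n - 1) * sum(gauss(i) for i, zi in enumerate(z) if zi) / (n * L ** (n - 1))
    return hit + p01 * b(n) / L ** n


def soft_line_likelihood(edge_probs, p01, sigma, lam, min_pyz=0.0):
    """Sum over all z of p(y | z) * p(z | x) for one measurement line.

    edge_probs[i] is the detector's p(edge | y_i) at position i.
    Terms with p(y | z) below min_pyz are skipped, which trades accuracy for speed.
    """
    total = 0.0
    for z in itertools.product((0, 1), repeat=len(edge_probs)):
        # p(y | z): product of p_i where z_i = 1 and (1 - p_i) where z_i = 0.
        pyz = 1.0
        for p, zi in zip(edge_probs, z):
            pyz *= p if zi else 1.0 - p
        if pyz < min_pyz:
            continue
        total += pyz * generic_line_likelihood(z, p01, sigma, lam)
    return total
```

When every entry of `edge_probs` is exactly 0 or 1, only one term of the sum survives and the soft likelihood reduces to the hard one, which is a convenient sanity check. The enumeration visits $2^{L+1}$ vectors, so it is only practical for short lines or with pruning.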

4. A Blob and Contour Particle Filter

In this section, we describe our algorithm for efficiently sampling from the combined blob and contour posterior distribution for the $k$ mice, $p(x_{t,1:k} \mid y_{b,1:t}, y_{c,1:t})$, given the models and independence assumptions described in Section 3. We do this in a manner that accentuates the strengths of the blob and contour tracking algorithms. Our sampling algorithm also capitalizes on our independence assumptions: we assume that the contour likelihood and the state dynamics are independent for each mouse.

At each iteration of particle filtering, we begin by sampling using only the blob observations. This step localizes the search space for each mouse. This has two effects. It decreases the amount of space that must be searched in the contour tracking step, like the blob importance function of [4]. It also separates the mice so that we can run contour tracking independently for each mouse. Then, independently for each mouse, we incorporate the contour observations to fine-tune the fit. These two sampling steps result in an algorithm that samples from the product of the posterior marginals,


$$\prod_{m=1}^{k} p(x_{tm} \mid y_{1:t,m}),$$

where $y_{1:t,m} = (y_{b,1:t}, y_{c,1:t-1}, y_{ctm})$. In order to obtain a particle set representation of the posterior distribution of all the mice, we reweight the samples by the importance weight. In Section 4.1, we show how the marginal posterior can be decomposed into the distributions described in Section 3. We also present our particle filtering algorithm that uses this decomposition to generate a particle set representing the marginal posterior for each mouse. In Section 4.2, we describe our algorithm for combining these particle sets to produce a particle set representation of the joint posterior.

Our algorithm is similar to the partitioned sampling algorithm [6], in that sampling is done for one object at a time to decrease the number of samples needed. However, the method and the assumptions made are different. Partitioned sampling requires a model we do not have: the complete observation likelihood given the position of only one object. To avoid creating this model, we exploit independence assumptions that are not available in the partitioned sampling framework.

4.1. Sampling from the Marginal Posterior

Given $\{(x_{t-1,1:k}^{(i)}, w_{t-1}^{(i)})\}$ representing $p(x_{t-1,1:k} \mid y_{1:t-1})$, we sample from the marginal

$$p(x_{tm} \mid y_{1:t,m}) = \int p(x_{tm} \mid y_{tm}, x_{t-1,1:k})\, dp(x_{t-1,1:k} \mid y_{1:t-1}) \approx \sum_{i=1}^{N} p(x_{tm} \mid y_{tm}, x_{t-1,1:k}^{(i)})\, w_{t-1}^{(i)},$$

where $y_{tm} = (y_{bt}, y_{ctm})$. The integrand is the probability that the state at frame $t$ for mouse $m$ is $x_{tm}$ given the state of all $k$ mice in the previous frame, $x_{t-1,1:k}$, and the observations for mouse $m$. We only have a model of the blob likelihood given the positions of all $k$ mice, $p(y_{bt} \mid x_{t,1:k})$, so we introduce dummy state vectors in an integral to get:

$$p(x_{tm} \mid y_{1:t,m}, x_{t-1,1:k}) = \int p(x_{tm}, b_{1:k} \mid y_{tm}, x_{t-1,1:k})\, db_{1:k},$$

where $b_{1:k}$ are the blob state vectors at time $t$. Using the HMM and observation likelihood independence assumptions, we can rewrite this as

$$\int p(x_{tm} \mid b_m, x_{t-1,m}, y_{ctm})\, p(b_{1:k} \mid x_{t-1,1:k}, y_{bt})\, db_{1:k} = p(y_{ctm} \mid x_{tm}) \int p(x_{tm} \mid b_m, x_{t-1,m})\, p(y_{bt} \mid b_{1:k})\, p(b_{1:k} \mid x_{t-1,1:k})\, db_{1:k}.$$

This decomposition is in terms of the distributions described in Section 3, for which we have assumed simple models. We can therefore generate particles from this distribution from the previous state's particle set, $\{(x_{t-1,1:k}^{(i)}, w_{t-1}^{(i)})\}$, using standard particle filtering techniques. There are many ways that this can be done; pseudocode for our choice is shown in Figure 2. The particle set $\{(\tilde{x}_{tm}^{(i)}, w_{tm}^{(i)})\}$ generated in this algorithm represents the marginal posterior distribution for mouse $m$. We resample in steps 1a (as in standard particle filtering) and 1d. The extra resampling step in 1d localizes the contour search space to parts deemed important by blob tracking.

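For $t = 1, 2, \ldots$:

1. Sample from the marginal posteriors:

   For $i = 1, \ldots, N$:

      a. Choose $\bar{x}_{t-1,1:k}^{(i)} \sim \{(x_{t-1,1:k}^{(j)}, w_{t-1}^{(j)})\}$.

      b. Sample $\bar{b}_{1:k}^{(i)} \sim p(b_{1:k} \mid \bar{x}_{t-1,1:k}^{(i)})$.

      c. Compute the weight $w_b^{(i)} \propto p(y_{bt} \mid \bar{b}_{1:k}^{(i)})$.

      d. Choose $(b_{1:k}^{(i)}, x_{t-1,1:k}^{(i)}) \sim \{((\bar{b}_{1:k}^{(j)}, \bar{x}_{t-1,1:k}^{(j)}), w_b^{(j)})\}$.

      e. For $m = 1, \ldots, k$:

            Sample $\tilde{x}_{tm}^{(i)} \sim p(x_{tm} \mid b_m^{(i)}, x_{t-1,m}^{(i)})$.

            Compute the weight $w_{tm}^{(i)} = p(y_{ctm} \mid \tilde{x}_{tm}^{(i)})$.

2. Sample from the joint posterior:

   For $i = 1, \ldots, N$:

      a. For $m = 1, \ldots, k$:

            Choose $x_{tm}^{(i_m)} \sim \{(\tilde{x}_{tm}^{(j)}, w_{tm}^{(j)})\}$.

      b. Set $x_{t,1:k}^{(i)} = (x_{t1}^{(i_1)}, \ldots, x_{tk}^{(i_k)})$.

      c. Compute the importance weight:

            $w_t^{(i)} \propto p(y_{bt} \mid x_{t,1:k}^{(i)}) \prod_{m=1}^{k} \sum_{j=1}^{N} w_{t-1}^{(j)}\, p(x_{tm}^{(i)} \mid x_{t-1,m}^{(j)})$.

Figure 2: Blob and Contour Particle Filtering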
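To make the flow of the two sampling stages concrete, the following sketch runs them on a toy one-dimensional scene in which each object is an interval of cells. It is an illustration under heavy assumptions, not the tracker used in this paper: the scene, the likelihoods, the constants, and all names are invented for the example; the contour state is reduced to a single position, so the refinement step depends only on the blob; and the final weight is taken to be the joint blob likelihood times the per-object predictive densities.

```python
import math
import random

HALF_WIDTH = 3   # object half-width in cells (toy value)
DYN_STD = 1.0    # std of p(b_m | x_{t-1,m}) and p(x_tm | x_{t-1,m})
FINE_STD = 0.5   # std of p(x_tm | b_m, x_{t-1,m}) around the blob center


def normalise(log_w):
    """Turn log-weights into weights that sum to one, stable against underflow."""
    top = max(log_w)
    w = [math.exp(v - top) for v in log_w]
    total = sum(w)
    return [v / total for v in w]


def blob_loglik(occ, centres):
    """log p(y_b | b_{1:k}): a cell is foreground if it lies inside any object."""
    ll = 0.0
    for g, p_fore in enumerate(occ):
        fore = any(abs(g - c) <= HALF_WIDTH for c in centres)
        ll += math.log(p_fore if fore else 1.0 - p_fore)
    return ll


def contour_loglik(edges, x):
    """log p(y_cm | x_m): edge strength at the two hypothesized object boundaries."""
    ends = (round(x - HALF_WIDTH), round(x + HALF_WIDTH))
    if min(ends) < 0 or max(ends) >= len(edges):
        return -50.0
    return sum(math.log(edges[e]) for e in ends)


def log_gauss(v, mean, std):
    return -0.5 * ((v - mean) / std) ** 2 - math.log(std * math.sqrt(2.0 * math.pi))


def blob_contour_step(prev, prev_w, occ, edges, rng):
    n, k = len(prev), len(prev[0])
    # Steps 1a-1c: resample, propose blobs for all objects, weight them jointly.
    x_bar = rng.choices(prev, weights=prev_w, k=n)
    b_bar = [[x + rng.gauss(0.0, DYN_STD) for x in xs] for xs in x_bar]
    w_b = normalise([blob_loglik(occ, b) for b in b_bar])
    # Step 1d: the extra resampling that localizes the contour search.
    blobs = rng.choices(b_bar, weights=w_b, k=n)
    # Step 1e: refine each object independently using its contour likelihood.
    marginals = []
    for m in range(k):
        xs = [b[m] + rng.gauss(0.0, FINE_STD) for b in blobs]
        ws = normalise([contour_loglik(edges, x) for x in xs])
        marginals.append((xs, ws))
    # Steps 2a-2b: draw each object from its own marginal and concatenate.
    cols = [rng.choices(xs, weights=ws, k=n) for xs, ws in marginals]
    joint = [[cols[m][i] for m in range(k)] for i in range(n)]
    # Step 2c: blob likelihood times per-object predictive densities (assumed form).
    log_w = []
    for x in joint:
        lw = blob_loglik(occ, x)
        for m in range(k):
            pred = sum(w * math.exp(log_gauss(x[m], p[m], DYN_STD)) for p, w in zip(prev, prev_w))
            lw += math.log(pred + 1e-300)
        log_w.append(lw)
    return joint, normalise(log_w)


if __name__ == "__main__":
    rng = random.Random(0)
    truth = [15, 40]
    occ = [0.9 if any(abs(g - c) <= HALF_WIDTH for c in truth) else 0.1 for g in range(60)]
    edges = [0.9 if any(abs(g - c) == HALF_WIDTH for c in truth) else 0.05 for g in range(60)]
    parts = [[c + rng.gauss(0.0, 1.0) for c in truth] for _ in range(200)]
    weights = [1.0 / 200] * 200
    for _ in range(3):
        parts, weights = blob_contour_step(parts, weights, occ, edges, rng)
    print([round(sum(w * p[m] for p, w in zip(parts, weights)), 1) for m in range(2)])
```

The point of the structure is visible in the cost: the joint blob likelihood is evaluated once per particle, while the contour refinement runs separately for each object, so the contour weights of different objects never multiply together until the final reweighting.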

4.2. Sampling from the Joint Posterior

We use the product of the marginal posteriors as the importance function for sampling from the joint posterior. The algorithm for sampling from the joint posterior given particle sets $\{(\tilde{x}_{tm}^{(i)}, w_{tm}^{(i)})\}$ representing the marginal posteriors is shown in step 2 of Figure 2. To sample from the product of marginals, we simply choose samples independently from each of the marginals and concatenate. We then reweight each sample $i$ by the ratio of the joint posterior and the product of marginals evaluated at the sample:




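$$w_t^{(i)} \propto \frac{\int p(x_{t-1,1:k} \mid y_{1:t-1})\, p(x_{t,1:k}^{(i)} \mid x_{t-1,1:k}, y_t)\, dx_{t-1,1:k}}{\prod_{m=1}^{k} p(x_{tm}^{(i)} \mid y_{1:t,m})}.$$

The numerator factors as proportional to

$$p(y_{bt} \mid x_{t,1:k}^{(i)}) \int p(x_{t-1,1:k} \mid y_{1:t-1}) \prod_{m=1}^{k} p(y_{ctm} \mid x_{tm}^{(i)})\, p(x_{tm}^{(i)} \mid x_{t-1,m})\, dx_{t-1,1:k} \approx p(y_{bt} \mid x_{t,1:k}^{(i)}) \prod_{m=1}^{k} p(y_{ctm} \mid x_{tm}^{(i)}) \sum_{j=1}^{N} w_{t-1}^{(j)}\, p(x_{tm}^{(i)} \mid x_{t-1,m}^{(j)}).$$

We approximate the denominator by

$$\prod_{m=1}^{k} p(y_{ctm} \mid x_{tm}^{(i_m)}).$$

Thus, the importance weight is the ratio

$$w_t^{(i)} \propto p(y_{bt} \mid x_{t,1:k}^{(i)}) \prod_{m=1}^{k} \sum_{j=1}^{N} w_{t-1}^{(j)}\, p(x_{tm}^{(i)} \mid x_{t-1,m}^{(j)}).$$

5. Experiments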

We evaluated our blob and contour tracking algorithm on a video sequence of three identical mice exploring a cage. This sequence contained 11 occlusions of varying difficulty. The model parameters were chosen by hand using a separate video sequence. Many of the parameters were set using our knowledge about the problem. These include the variance of the transition models and the constraints on the state. Other parameters, including the damping and autoregressive constants, were set to values used in [1]. The contour likelihood parameters were set so that the ranking of the probability of each vector of observed edge detections seemed reasonable. Some of the parameters were chosen somewhat arbitrarily and never varied; these include the number of measurement lines and the parameters used in the BraMBLe likelihood. The number of samples was chosen to be $N = 2000$. While we had qualitatively similar performance with $N = 1000$ samples, the results returned by particle filtering varied quite a bit. We thus chose to present results with $N = 2000$ samples, for which the output of our particle filtering algorithm was stable. This number of samples compares favorably to the 4000 samples used to track a pair of leaves in [7] and the 1000 samples used to track two people/blobs in [5].

We provide with this paper a video of our results on a short sequence. Summary still frames are shown in Figure 4. These results demonstrate the following strengths of our algorithm:

• Our contour tracking algorithm is robust to erratic mouse behavior: we never lose a mouse. For instance, we follow mice that jump, drop from the ceiling, and make quick turns and accelerations that are not fit by our simple dynamics model (see Figure 4(a)).

• Our algorithm succeeds in solving the data association problem: two contours never match the same mouse.

• Our algorithm is rarely distracted by background clutter. This implies that our feature extraction methods and the blob and contour combination provide robust observation likelihoods. The only exceptions are when both algorithms make mistakes: when the blob tracker mistakes shaded bedding for foreground and the contour tracker fits to the edge of a tail (see Figure 4(b) for an example).

• Perhaps the most impressive result is that our algorithm accurately tracks the mice through 7 out of 11 occlusions and partway through the other 4. This is because of the detailed fit provided by the contour tracking algorithm and its ability to use features available during occlusions. Example successful frames are shown in Figure 4(c).

• In general, our algorithm usually found very good contour fits outside of occlusions, much better than those obtained using contour tracking alone.

Our algorithm has a couple of failure modes which we plan on addressing in future work. First, it occasionally gets stuck in local optima in which the contour fit is facing the wrong direction (see Figure 4(b)). We plan to address this problem with a better model of the probability of the direction changing. Second, our algorithm swaps identity labels in four occlusions (see Figure 4(d)). The reason for this is that the fit of our algorithm is heavily biased by the fit of the BraMBLe algorithm. For occlusions in which the contour observation signals are weak, this bias from BraMBLe can dominate. We propose a solution to this in Section 6.

For comparison, we also implemented a combined blob-contour tracking algorithm that did not exploit the contour likelihood independence assumption. This algorithm has the disadvantage that samples of all $k$ mice are weighted by the product of the contour likelihood for each mouse. Thus, the number of samples fitting blobs is the same, but the effective number of samples used when fitting contours is much smaller. We tested this algorithm on 500 frames with 6 occlusions (our algorithm works on 4). The results fit our theory. While this algorithm was resistant to drift (blob tracking is the same in both algorithms), the contour fits found were less satisfactory. In general, they were less fine-tuned and more variable from frame to frame, suggesting that more samples are needed. This is particularly evident during occlusions. The fits during occlusions are less fine-tuned to the contour data, and therefore more influenced by the blob tracking results. This causes worse fits in every occlusion in the sequence. This algorithm swapped identities twice more than the algorithm proposed in this paper.

6. Conclusions and Future Work

We have presented a combined blob and contour particle filtering algorithm that performs well on a difficult video sequence of three identical mice exploring a cage. Our algorithm leverages the strengths of multiple blob tracking, contour tracking, and the independence assumptions of our model to accurately track the mouse contours without too many particles. This combination allows robust tracking of multiple occluding objects on real data.

In future work, we plan on exploring algorithms that use information from both the past and the future to determine the positions of the mice during an occlusion. We hope that this will solve the main failure of the algorithm proposed in this paper by making BraMBLe return a more global fit. We are exploring heuristic solutions to this problem, in the direction of [3].


[1] A. Blake and M. Isard. Active Contours. Springer, Great Britain.

[2] A. Doucet, N. de Freitas, and N. Gordon, editors. Sequential Monte Carlo Methods in Practice. Springer-Verlag, New York, 2001.

[3] B. for anonymous review.

[4] M. Isard and A. Blake. ICONDENSATION: Unifying low-level and high-level tracking in a stochastic framework. Lecture Notes in Computer Science, 1406:893–908, 1998.

[5] M. Isard and J. MacCormick. BraMBLe: A Bayesian multiple-blob tracker. In ICCV, 2001.

[6] J. MacCormick. Stochastic Algorithms for Visual Tracking. Distinguished Dissertations. Springer, Great Britain, 2002.

[7] J. MacCormick and A. Blake. A probabilistic exclusion principle for tracking multiple objects. IJCV, 39(1):57–71, 2000.

[8] D. Martin, C. Fowlkes, and J. Malik. Learning to detect natural image boundaries using local brightness, color, and texture cues. PAMI, 26(5):530–549, May 2004.

(a) The log-likelihood ratio of foreground over background for selected frames. Image ii shows an occlusion. Image iii shows the tail and shadow of the mice.

(b) The second column shows the BSE output; the third shows the Canny edge detector's output. In image i, an edge between a pair of mice is found. In image ii, BSE gives a weak response to an edge between a pair of mice. In the bottom frame, BSE is robust to scratches.

Figure 3: Features extracted for the (a) blob and (b) contour observation

(a) Tracking results for a mouse jumping, a mouse falling from the ceiling, and a mouse turning

(b) The contour is fit to a tail and the blob is fit to a shadow; the tracker is robust to scratches on the cage; the contour is

(c) The first three occlusion sequences in which our tracking algorithm performs

(d) The first two occlusion sequences on which our algorithm swaps mouse identities.

Figure 4: Still frame summary of our results. We plot the average affine transformation applied to the contour with the most total weight.


