Image and Vision Computing 23, pp. 227-236, 2005.

Optimal Partial Shape Similarity

Longin Jan Latecki¹, Rolf Lakaemper¹, and Diedrich Wolter²

¹ Dept. of Computer and Information Sciences, Temple University, Philadelphia, PA 19094, USA
latecki@temple.edu and lakamper@temple.edu

² FB3 - Cognitive Systems, Universität Bremen, Germany
dwolter@informatik.uni-bremen.de

Abstract. Humans are able to recognize objects in the presence of significant amounts of occlusion and changes in the view angle. In human and robot vision, these conditions are normal situations and not exceptions. In digital images one more problem occurs due to unstable outcomes of the segmentation algorithms. Thus, a normal case is that a given shape is only partially visible, and the visible part is distorted. To our knowledge there does not exist a shape representation and similarity approach that could work under these conditions. However, such an approach is necessary to solve the object recognition problem. The main contribution of this paper is the definition of an optimal partial shape similarity measure that works under these conditions. In particular, the presented novel approach to shape-based object recognition works even if only a small part of a given object is visible and the visible part is significantly distorted, assuming the visible part is distinctive.

Keywords: shape similarity, visual parts, shape representation, object recognition

1 INTRODUCTION AND MOTIVATION

When trying to compare natural shapes, one is seldom successful in finding an identical match. How can shape similarity then be defined in accordance with human perception? Our intuitive definition of similarity emphasizes the contribution of the common features of the compared objects relative to those distinguishing them. As an example, imagine the shape similarity of a centaur to a human and a horse; although there is no global similarity and even the partial dissimilarity is evident, everyone will easily agree on the similarity between the particular similar parts: the upper body makes the centaur similar to a human, while the lower body makes it similar to a horse. As a result, we interpret a centaur to be similar to a human as well as to a horse. This example illustrates that humans are able to focus on similar parts and to disregard the dissimilar parts. Humans judge two objects as being similar if they have common parts that are similar

and that are significant for the shape of both objects. The question arises how to identify the common parts. A conceptually simple procedure to answer this question is to try to remove certain parts and see whether the objects become more similar without them. This leads to a first approximation of the proposed definition of shape similarity:

The optimal partial similarity of Q to T is the similarity of Q to a modified T, denoted T_Q, where T_Q is T with all parts that make T distinct from Q removed.

We need to state clearly that there are two different shape similarity concepts in this definition. The first one, which we will call 'global shape similarity', is the measure that compares Q to the modified T (without the process of modification). All shape similarity measures presented in the literature define this global shape similarity. However, it is not what humans understand under shape similarity. The second concept is the defined concept of optimal partial shape similarity. It abstracts from distinct parts before comparing the modified shapes using the global shape similarity. Hence it is much closer to human perception. Let s be any global shape similarity measure; more formally, s is a distance measure (the smaller its values, the greater the similarity of the compared shapes). Now we restate the proposed definition with the conceptual confusion removed:

The optimal partial similarity of Q to T, ops(Q, T), is the global minimum of the global similarities s(Q, T − S_Q) taken over all parts S_Q of T.

A subpart S*_Q of T that yields the global minimum identifies the parts that make T most dissimilar to Q. Consequently, the subpart T_Q = T − S*_Q of T is the part of T that makes T most similar to Q. We call T_Q the optimal subpart of T with respect to Q. Thus, ops measures how similar to Q the part of T is that makes T most similar to Q or, more simply stated, how much of the shape of Q is contained in T. Clearly, we could additionally introduce a penalty for the removed parts. To keep the proposed definition as simple as possible, and since this definition in its simple form is sufficient for many applications, we consider such extensions later in the paper (Section 7). Observe also that this definition is not symmetric, i.e., the query part Q is used in its entirety. This implies that the query part should be carefully selected. It should be sufficiently distinctive to allow us to distinguish the shape of the object we want to retrieve from other objects. It should also be descriptive in that it should match well the part of the object we are looking for. The process of query-part selection is similar to keyword selection in text-based searches, where a selected keyword must be distinctive and descriptive. The main contribution of this paper, then, can be made clear using the analogy of text-based retrieval. Existing shape similarity measures force the user to submit a description of the entire object shape (i.e., the whole object contour) as the query, which corresponds to submitting a whole sentence as the query, making it very unlikely to find a good match. The proposed partial similarity allows the user to submit key parts of the shape (parts of object contours) as queries in the same way keywords are used in text-based searches.

Although we focus here on the shape of 2D objects, our approach is also related to shape recognition of 3D objects. The shape of 3D objects can be recognized using their 2D projections [22,10]. The key observation is that the contour variation in the projection of a single visual part is significantly smaller than the variation in the projection of the whole 3D object. Moreover, a significant occlusion makes the recognition of the whole contour impossible, while a part of the contour can still be recognized as long as it is visible.

2 TECHNICAL INTRODUCTION

We use a boundary representation of shape. Since we work with the boundaries of objects in digital images, we can assume without loss of generality that shape is represented by polylines (polygonal curves) that form these boundaries. Given a query part Q (that is, a visually significant part) and a target object T, the goal is to find the part P of T that is most similar to the query Q (see Fig. 1). An initial, simple thing to do is to try out all subparts P of T and compare them to Q. Since we work with polylines, and each polyline can be viewed as an ordered set of vertices, this goal can easily be achieved by comparing all subsets of vertices of T to Q using some global shape similarity measure s and then selecting as P the subset of vertices that yields the global minimum of s. However, this strategy leads to a combinatorial explosion, since we consider all subsets of vertices of T. Clearly, the combinatorial explosion can easily be avoided if only connected subsets P of vertices of T are considered, which reduces the complexity to O(n²), where n is the number of vertices of T. This strategy is known as a sliding window with variable size. However, it does not produce the desired results. The problem is that the subpart P of the target shape T that correctly corresponds to Q has an additional shape feature (the horn on the snout), Fig. 1. Since the horn is a significant shape feature of P (viewed as a single shape), it makes P significantly dissimilar to Q. Therefore, any global shape similarity measure applied to Q and P yields a value indicating that Q and P are dissimilar. Consequently, a sliding window approach with any global shape similarity measure may fail to find P as the subpart of T most similar to the query Q.

In contrast, the proposed optimal partial similarity never directly compares Q and P; instead we first simplify P in the context of Q to obtain P_Q, and then compare Q and P_Q using a global shape similarity measure. Hence the partial shape similarity provides a solution to the problem of additional shape features without running into the problem of combinatorial explosion. The simplification of T to P followed by the simplification of P to P_Q is obtained by a single process that recursively removes carefully selected vertices of T. In each step a vertex that makes T most distinct from Q is selected for removal. This process is computed by the algorithm introduced in Section 4. The outcome is shown by two arrows in Fig. 1. After removing a first set of vertices, T is transformed into P (horizontal arrow). The second set of removed vertices transforms P into P_Q (vertical arrow). Thus, what remains is P_Q = T_Q, a modified T that is composed only of the features of T that make T similar to Q. Observe that not all features removed from P to obtain P_Q can be viewed as noise, in particular the horn on the snout. Note that the length of the most similar part P_Q of T to the query Q, in this example, is only about 20% of the length of T.

3 RELEVANT WORK

We use the term visual parts more loosely than parts of visual form as defined in [12,13]. In the context of this paper, visual parts mean significant parts of object contours that are good candidates for identifying the objects. Our definition includes parts of visual form, but we also allow a composition of visual parts to be called a visual part; e.g., the contour of a hand is a visual part of the human body, according to our definition. In any case, an important feature of our approach is that we use parts of objects as query shapes. According to Siddiqi et al. [26], part-based representations allow for robust object recognition and play an important role in theories of object categorization and classification. There is also strong evidence for part-based representations in human vision, see e.g. [12,13,25,26].

There is a huge variety of shape descriptors, most of them requiring the presence of the whole shape, some of them tolerating minor missing or distorted parts, e.g., due to occlusion. Even feature-based approaches, although potentially based on local features, require the presence of most of the object to compute the statistics of the features. This statement applies to all shape descriptors presented in the recent overview articles [17,18], as well as to the new shape descriptors presented in [4,11]. There exist feature-based approaches that allow for object recognition under occlusion and/or perspective distortion. However, in all these approaches only a relatively small part of the object may be occluded. Among them is the probabilistic model of the spatial arrangement of local features ([8,27,28]), where the object shape is modeled as a spatial arrangement of point features. Due to a sophisticated Gaussian mixture model and the correspondence computation of point configurations that employs an EM algorithm, this approach works even if a small part of the feature points is not present. This means that a small part of the object may be occluded. In contrast, the presented approach works even if only a small part of the object is visible.

The process of simplification of the target shape T in the context of the query part Q is a context-sensitive extension of discrete curve evolution (DCE) [15,16]. We simplify polyline T by removing the set of vertices whose removal yields the highest gain in the similarity value of the simplified T to query Q. This way we check how much of the shape of Q is contained in the shape of P. This is in contrast to deformation energy approaches (Basri et al. [3], Sebastian et al. [24]), where a query shape is deformed until it becomes the target shape and the amount of deformation defines the similarity value. In our case the whole process is driven by a shape similarity measure, but the process itself does not define the shape similarity value. Further differences are that we allow only for shape simplification, and that we simplify until a global minimum of similarity is reached.

Deformation energy approaches use penalties for length differences of matched curves. This makes it impossible to recognize the same shape at different scales. The proposed approach is scale invariant, since before query Q is compared to the simplified part T − S_Q of target T, both curves are normalized to length one without any penalty. Observe that scaling both curves to the same length alone does not solve the problem of different scales when both curves are not very similar. The simplification of T combined with scaling to the same length provides a solution to the problem of different scales.

In contrast to early AI and CV approaches, we do not regard objects as composed of primitive parts from a specific shape vocabulary such as generalized cylinders, superquadrics, or geons [5-7,23]. The visual parts we are interested in are not built of any primitive elements. Therefore, we place no restriction on shape, which may be arbitrarily complex.

Our approach also applies to the recognition of 3D objects. There is substantial cognitive evidence that recognition of 3D objects can be based on 2D projections, and a planar shape similarity measure can be used in this context (Cyr and Kimia [10]).

4 COMPUTATION OF THE OPTIMAL PARTIAL SHAPE SIMILARITY

Given a query polyline Q and a target polyline T, we face two related goals: (1) to localize the part P of the target polyline T that is most similar to Q, and (2) to simplify P with respect to the query polyline Q such that the simplified version of P is most similar to Q. Both goals will be achieved by a single process of simplification of T in the context of Q described in this section. To achieve these goals we need a global shape similarity measure of high quality, which we call s, that can be applied to compare two polylines. We use an improved version of the cognitively motivated shape similarity measure introduced in [16], which we describe in Section 6.

A polyline T can be defined as an ordered set of vertices T = {t_1, ..., t_n}. Our goal is to find and remove a subset S*_Q of vertices of T so that the polyline T_Q = T − S*_Q is the subpolyline of T most similar to Q. Thus, we find S*_Q as the argument of the global minimum

S*_Q = argmin{ s(Q, T − S_Q) : S_Q ⊆ T },

and the optimal partial similarity between Q and T is defined by

ops(Q, T) = min{ s(Q, T − S_Q) : S_Q ⊆ T }.

The length of both polylines Q and T − S_Q is normalized to one before s(Q, T − S_Q) is computed.

Observe again that there are fundamental differences between our approach and the deformation energy approaches (Basri et al. [3], Sebastian et al. [24]). First, we only permit the simplification of a given shape, i.e., we do not allow arbitrary deformations. Second, we do not measure the cost of deforming a shape, but instead the shape similarity after deformation. A nice feature of this definition is that we are always guaranteed to obtain a global minimum of our shape similarity (therefore we are parameter free), but its computation may lead to a combinatorial explosion.
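To make the combinatorial nature of the exact definition concrete, the following sketch enumerates every vertex subset of T directly, as in the formula for ops above. It is our illustration, not the authors' code; polylines are plain vertex lists, and s stands for any global shape distance applied to length-normalized polylines.

```python
from itertools import combinations

def ops_bruteforce(Q, T, s):
    """Exact optimal partial similarity by exhaustive search.

    Q, T : polylines given as lists of 2D vertices.
    s    : global shape distance (smaller = more similar), assumed to
           normalize both polylines to length one internally.
    Enumerates every removable vertex subset S_Q of T, so the running time
    is O(2^n); usable only for very small n, which is exactly the
    combinatorial explosion discussed above.
    """
    n = len(T)
    best_value, best_part = float("inf"), list(T)
    for r in range(n - 1):                          # remove 0 .. n-2 vertices, keep >= 2
        for removed in combinations(range(n), r):
            kept = [T[i] for i in range(n) if i not in removed]
            value = s(Q, kept)                      # s(Q, T - S_Q)
            if value < best_value:
                best_value, best_part = value, kept # candidate for T_Q
    return best_value, best_part                    # ops(Q, T) and the optimal subpart T_Q
```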

Therefore, we now introduce a suboptimal algorithm to compute ops. It is suboptimal in the sense that we select in an optimal way a single vertex to be removed in each step, but the subset of all removed vertices may not be optimal. First, we recursively generate a sequence of polylines

T = T_n^Q, T_{n−1}^Q, ..., T_2^Q

in which T_{k−1}^Q is obtained by removing a single vertex from T_k^Q such that

T_{k−1}^Q = argmin{ s(Q, T_k^Q − {x}) : x ∈ T_k^Q }.

Then we compute the global minimum of the similarities between Q and T_k^Q:

ps(Q, T) = min{ s(Q, T_k^Q) : k = 2, ..., n }

and

T_Q = argmin{ s(Q, T_k^Q) : k = 2, ..., n }.

The length of both polylines Q and T_k^Q is normalized to one before s(Q, T_k^Q) is computed.

We call the obtained similarity measure ps a partial similarity to stress the fact that it is no longer globally optimal. The computation of the global minimum over all possible vertex subsets of T is replaced by a global minimum over the sequence of optimally removed single vertices.

An important property of the defined partial similarities ops and ps is the fact that they are invariant to scale differences between Q and T, since we normalize the length of the polylines compared by s. Observe that if Q and T are at different scales, making their arc lengths equal does not solve the problem of different scales. The reason is that Q is actually only similar to a subpolyline T_Q of T. Hence making Q and T_Q the same length solves the problem of different scales. This is what happens during the proposed computation of ops and ps, because when the global similarity measure s is used to compare the query Q to the simplified target T, both curves are scaled to length one. To summarize, the simplification of T combined with scaling to the same length provides a solution to the problem of different scales.

The vertex removal process has complexity O(n²), where n is the number of vertices of T. This does not account for the complexity of the global shape similarity measure s that is used in each step (defined in Section 6). Since s can be computed in O(n log n), the total complexity of partial shape matching is O(n³ log n).
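As an illustration of the greedy procedure just described, here is a minimal sketch (ours, not the authors' implementation): in each step the single vertex whose removal makes the remaining polyline most similar to Q is deleted, and the best intermediate polyline over the whole sequence is returned.

```python
def ps_greedy(Q, T, s, min_vertices=2):
    """Suboptimal partial similarity ps(Q, T) by greedy vertex removal.

    Generates the sequence T = T_n, T_{n-1}, ..., removing in each step the
    vertex whose deletion yields the smallest distance s(Q, .), and returns
    the minimum over the sequence together with the polyline T_Q attaining it.
    s is assumed to length-normalize both polylines before comparing them.
    With s computable in O(n log n) this runs in O(n^3 log n), matching the
    complexity stated above.
    """
    current = list(T)
    best_value, best_part = s(Q, current), list(current)
    while len(current) > min_vertices:
        # Optimal single-step choice: try removing each vertex in turn.
        candidates = [(s(Q, current[:i] + current[i + 1:]), i)
                      for i in range(len(current))]
        value, i = min(candidates)
        current = current[:i] + current[i + 1:]
        if value < best_value:
            best_value, best_part = value, list(current)
    return best_value, best_part
```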

5 EXPERIMENTAL ILLUSTRATION

The best matching part T_Q of T to query Q shown in Fig. 1 was computed by the proposed algorithm for partial similarity ps. A few simplification stages of P that led to the identification of T_Q are shown in Fig. 2. Only a subset of all simplified versions of P is depicted. The polylines shown are marked with letters A to F, where A is the part P from Fig. 1. A plot of the values of s(Q, T_k^Q) is shown in Fig. 3. The global minimum of the similarity values is obtained for k = 15, which is for polyline E = P_Q = T_Q.

Fig. 2. Illustration of the computation of partial similarity ps(Q, A), where A is the part P from Fig. 1. Shown are a few simplified versions of A, labeled A to F. The global minimum is obtained for P_Q = E.

As stated before, not all features removed from polyline P = A to obtain polyline E can be viewed as noise, in particular the horn on the snout. This feature can only be removed in the context of the query Q. Since the horn is a significant shape feature of P (viewed as a single object), it makes P significantly dissimilar to Q. Therefore, any classical shape similarity measure yields a value that indicates that Q and P are dissimilar.

Fig. 3. The plot of the values of the similarity measure s(Q, T_k^Q). The global minimum is obtained for k = 15, which is for polyline E = P_Q = T_Q (shown in Fig. 2).

In addition to several successful experiments like the one presented in Figures 1 to 3, we also evaluated the retrieval rate of our optimal shape similarity measure on the dataset created by the MPEG-7 committee for the evaluation of shape similarity measures [9,17]. The test set consists of 70 different classes of shapes, each class containing 20 similar objects, usually (heavily) distorted versions of a single base shape. The whole dataset therefore consists of 1400 shapes. For example, each row in Fig. 4 shows four shapes from the same class.

Fig. 4. Some shapes used in part B of the MPEG-7 Core Experiment CE-Shape-1. Shapes in each row belong to the same class, i.e., the first row shows four different shapes (out of 20) of the class 'bone'.

Fig. 5. Top: The 21 most similar shapes retrieved from the MPEG-7 shape database for the query part Q (shown on top). Bottom: the subparts of the top shapes that are similar to Q.

The goal of this experiment is to demonstrate the ability of our approach to correctly retrieve the searched objects when only a single part is given as the query part Q. As in the case of a single-keyword query in text-based search, the query part Q must be a descriptive feature that on the one hand is common to all searched shapes and on the other hand is able to discriminate the searched shapes from shapes of other classes. For this experiment a sufficiently descriptive part Q is given by the cross-like top part of the shape 'fountain-01', shown in Fig. 5, top left. Every shape of the MPEG dataset was compared to Q using the partial similarity measure ps(Q, T_i), for i = 1, ..., 1400. The top section of Fig. 5 shows the first 21 shapes that contain parts most similar to Q. The bottom section shows the corresponding parts, i.e., the optimal subparts (T_i)_Q ⊆ T_i. The algorithm found the desired 20 shapes of class 'fountain-01' among the top 21 results. The only misclassification (object 'device0-16', rank 19, marked with '*') contains a part visually similar to Q. Note that this result was achieved with the query being a single part only.

We performed a similar experiment on a small database composed of the 27 shapes shown in Fig. 6. The shapes are grouped into 9 classes with 3 shapes each. The set of our query parts is shown in Fig. 7; two query parts were selected for each of the 9 classes. We measured the retrieval rate following the MPEG-7 Bulls-Eye test [9]: the number of objects from the same class that are contained among the first N objects most similar to the query part, where N is equal to double the number of objects in the given class (N = 6 in our test).
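For reference, the Bulls-Eye protocol just described can be scored with a few lines; the sketch below is our reading of the protocol (the shape and label containers are illustrative, not from the paper), with the partial similarity ps plugged in as the distance.

```python
def bullseye_rate(queries, database, distance):
    """MPEG-7 Bulls-Eye retrieval rate in percent.

    queries  : list of (query_shape, class_label) pairs.
    database : list of (shape, class_label) pairs.
    distance : dissimilarity function (smaller = more similar), e.g. ps.
    For each query, the N most similar database shapes are inspected, where N
    is twice the number of database objects of the query's class; the score is
    the fraction of same-class objects found among them.
    """
    hits, possible = 0, 0
    for q_shape, q_class in queries:
        class_size = sum(1 for _, c in database if c == q_class)
        ranked = sorted(database, key=lambda item: distance(q_shape, item[0]))
        top = ranked[:2 * class_size]                     # N = 2 x class size
        hits += sum(1 for _, c in top if c == q_class)
        possible += class_size
    return 100.0 * hits / possible
```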

Fig. 6. Our small test database composed of 27 objects grouped into 9 classes (shown in rows).

The overall retrieval rate for the selected query parts (Fig. 7) is 100%. When the complete object contours of the objects in the database are submitted as queries and we use the global shape similarity s, the retrieval rate is 91.95%, which is worse than the rate for query by parts. This experiment demonstrates the gain obtained by using the proposed partial shape similarity with parts as queries. The result of this test is particularly significant, since the partial shape similarity is based on the global shape similarity s. There are two points that must be addressed here. Clearly, the excellent performance of the partial shape similarity depends on the selection of query parts. However, it is easier for humans submitting the queries to sketch a query part than to sketch the whole contour, and the successful experience with keywords shows that their selection is not a problem for humans. The second point is whether a different global shape similarity measure would do a better job. The performance of several global shape similarity measures on the MPEG-7 Shape-1 dataset [9] (with queries consisting of the whole shapes) strongly indicates that no global shape similarity measure is capable of yielding a 100% retrieval rate, due to variations in global shape. This is justified by the fact that our underlying global shape similarity measure s has a retrieval rate of 76.45% on the MPEG-7 dataset. The best known retrieval rate for the MPEG-7 Shape-1 dataset is 83.19% [1]. The global shape similarity measure selected for the MPEG-7 standard has a retrieval rate of 81.12% [21]. It is based on the global measure introduced by Mokhtarian and Mackworth in [20].

Fig. 7. The set of 18 query parts grouped into 9 classes corresponding to the 9 classes in the database in Fig. 6.

The best six published retrieval rates for global shape measures (and consequently, for queries being whole shapes) on the MPEG-7 dataset are shown in Fig. 8. Further retrieval results, all with retrieval rates lower than those in Fig. 8, for various global shape similarity approaches are recorded in the report on the original MPEG-7 Shape-1 experiment in [9].

Fig. 8. The best six published retrieval rates in percent on the MPEG-7 CE-Shape-1 part B dataset, among them 76.51 (Belongie et al.), 78.38 (Grigorescu et al.), and 83.19 (Attalla).

6 GLOBAL SHAPE SIMILARITY MEASURE

A global shape similarity measure is the underlying basis of the partial similarity. There exist numerous approaches to define the similarity between polygonal curves, some of which we mentioned in Section 3. However, since this measure is used to find the optimal window that drives the simplification process, and finally to provide the numerical value representing the partial similarity of arbitrarily complex natural shapes, it must be robust to non-uniform contour deformations. We use an improved version of the visual-part based shape similarity measure (VPS) introduced in [16]. It yielded excellent performance in the MPEG-7 experiments reported in the previous section, which can also be verified by querying our online shape database at [14]. Like all global shape similarity measures, VPS requires that the whole object is present as input. Since it is contour based, this means that the complete contour is given. (Although our experimental results show that our measure performs well in the presence of minor occlusions, or to be more precise: if some significant parts of the contour are absent and other significant parts are present, then the results are in accordance with our intuition.)

To compute VPS between two polygonal curves, we establish the best possible correspondence of maximal convex arcs. To achieve this, we first decompose the polygonal curves into maximal convex subarcs. Note that a convex subarc of an object contour may be either convex or concave with respect to the object area. Since a simple one-to-one comparison of the maximal convex arcs of two polygonal curves is of little use, due to the fact that the curves may consist of a different number of such arcs and even similar shapes may differ in small features, we allow for 1-to-1, 1-to-many, and many-to-1 correspondences of the maximal convex arcs. The main idea here is that on at least one of the contours we have a maximal convex arc that corresponds to a part of the other contour composed of adjacent maximal convex arcs. In this context the corresponding parts of contours can be identified with visual object parts. Two example correspondences obtained by our approach are shown in Figure 9.

Fig. 9. Correspondence of visual parts as computed by our global shape similarity measure. Corresponding arcs are labeled with the same numbers.

The main advantage of VPS is that it is based on a correspondence of visual parts. A correspondence of visual parts is defined as a sequence of pairs of parts of the contours A and B

C = ((p_1A, p_1B), ..., (p_nA, p_nB))

such that A = p_1A ∪ ... ∪ p_nA and B = p_1B ∪ ... ∪ p_nB, the order of parts is respected, and at least one part in each pair (p_iA, p_iB) is a convex arc. The global shape similarity s is defined as the global minimum of the sum of distances d_arcs(p_iA, p_iB) over all possible correspondences, where the polygonal similarity measure d_arcs is defined below. VPS uses dynamic programming to find the corresponding parts.

The basic similarity of polylines, d_arcs, is defined using their tangent function representation. The tangent function, also called the turning function, is a step function mapping a polyline into the interval [0, 2π) by representing the angular directions of its line segments only. Furthermore, arc lengths are normalized to one prior to mapping into tangent space. This representation was previously used in computer vision, in particular in [2]. Denoting the tangent function by Tg, the similarity is defined as follows:

d_arcs(C, D) = ∫₀¹ (Tg_C(s) − Tg_D(s) + Θ(C, D))² ds · max( l(C)/l(D), l(D)/l(C) ),

where l(C) denotes the arc length of C and the integral is taken over the normalized arc length s. The constant Θ(C, D) is chosen to minimize the integral (it accounts for the different orientations of the curves) and is given by

Θ(C, D) = ∫₀¹ (Tg_D(s) − Tg_C(s)) ds.
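A compact sketch of this turning-function distance follows (our illustration; the function names are ours, and the handling of the 2π wrap-around of angles as well as of closed contours is omitted for brevity):

```python
import numpy as np

def turning_function(poly):
    """Tangent (turning) function of an open polyline as a step function.

    poly : (n, 2) array of vertices.  Returns (breaks, angles): normalized
    arc-length positions 0 = s_0 < ... < s_m = 1 where segments start/end,
    and the angular direction of each segment.
    """
    poly = np.asarray(poly, dtype=float)
    seg = np.diff(poly, axis=0)
    lengths = np.hypot(seg[:, 0], seg[:, 1])
    breaks = np.concatenate(([0.0], np.cumsum(lengths) / lengths.sum()))
    angles = np.arctan2(seg[:, 1], seg[:, 0])
    return breaks, angles

def d_arcs(C, D):
    """Turning-function distance between polylines C and D following the
    formula above: integral of the squared angle difference shifted by the
    optimal orientation offset Theta, weighted by the arc-length ratio."""
    bC, aC = turning_function(C)
    bD, aD = turning_function(D)
    # Merge breakpoints so both step functions are constant on each interval.
    s = np.unique(np.concatenate((bC, bD)))
    mids, widths = 0.5 * (s[:-1] + s[1:]), np.diff(s)
    fC = aC[np.searchsorted(bC, mids) - 1]      # angle of C on each interval
    fD = aD[np.searchsorted(bD, mids) - 1]      # angle of D on each interval
    theta = np.sum((fD - fC) * widths)          # Theta(C, D): minimizes the integral
    integral = np.sum((fC - fD + theta) ** 2 * widths)
    lC = np.hypot(*np.diff(np.asarray(C, float), axis=0).T).sum()
    lD = np.hypot(*np.diff(np.asarray(D, float), axis=0).T).sum()
    return integral * max(lC / lD, lD / lC)     # penalize different relative lengths
```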

7 GENERAL CASE OF THE OPTIMAL SHAPE SIMILARITY

The aforementioned experiments and definitions did not introduce any penalty for removing vertices of T in our partial similarity measure ps. Introducing a penalty makes sense when some estimate of the noise model is known, as we will illustrate in Section 8. The penalty can easily be added to obtain an optimal partial similarity measure with penalty as a linear combination of the similarity between the query Q and the simplified T and the similarity between the simplified T and T:

opsp(Q, T) = min{ α·s(Q, T′) + β·s(T′, T) : T′ ⊆ T },

where α + β = 1 are weights. The penalty s(T′, T) measures how much T must be simplified in order to become more similar to the query Q. The weights depend on the importance of both terms and are application dependent. As was the case for the computation of the optimal partial shape similarity, we can use a suboptimal method to compute the optimal partial similarity measure with penalty. We will call the obtained measure the partial similarity measure with penalty and denote it psp. The same applies to the extensions defined in the remainder of this section.
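In the suboptimal (greedy) setting of Section 4, psp can be evaluated along the same vertex-removal sequence; a minimal sketch of this variant (ours, with s again a global shape distance that normalizes lengths internally):

```python
def psp_greedy(Q, T, s, alpha=0.5, beta=0.5):
    """Partial similarity with penalty along the greedy simplification sequence:
    the minimum over k of alpha * s(Q, T_k) + beta * s(T_k, T), alpha + beta = 1."""
    current = list(T)
    best = alpha * s(Q, current) + beta * s(current, T)
    while len(current) > 2:
        # Same greedy step as for ps: remove the vertex that most improves s(Q, .).
        i = min(range(len(current)),
                key=lambda j: s(Q, current[:j] + current[j + 1:]))
        current = current[:i] + current[i + 1:]
        best = min(best, alpha * s(Q, current) + beta * s(current, T))
    return best
```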

Our definition of ops in Section 4 assumes that the query part Q does not contain any significant distortions. This assumption is true for many applications, in particular when Q belongs to a set of learned visual parts that are used for shape-based image retrieval. However, there exist applications in which both Q and T should be simplified (e.g., when both T and P may be corrupted by noise). Therefore, it makes sense to define an optimal partial symmetric similarity measure with penalty

opssp(Q, T) = min{ α·s(Q′, T′) + β·s(T′, T) + γ·s(Q′, Q) : T′ ⊆ T ∧ Q′ ⊆ Q },

where α + β + γ = 1 are weights.

The proposed opssp consists of two parts: a global shape similarity measure of the simplified shapes and a simplification measure modeling the noise. Computation of the optimal similarity is an iterative process. An initial similarity of the contours is computed. As long as a simplification of either one of the contours involved exists that reduces the sum of the dissimilarity and the overall simplification cost carried out so far, the simplification is performed. The resulting similarity plus the overall sum of simplification costs yields the final similarity value. The reason this process is symmetric is that only vertices may be removed, not introduced. So, if a shape feature is missing in one contour due to noise, the corresponding feature may be removed from the original contour in order to yield a better similarity. The challenging task here is the appropriate selection of the weights.

In the above definition of opssp we use our global shape similarity measure in both parts. When the extent of the noise can be estimated, we may use a different measure for the second part (which computes the cost of simplification). The estimated noise level allows for deriving a measure judging the likelihood that an individual vertex was caused by noise. Thus, simplifications of a shape by removal of vertices may be measured according to their noise plausibility. A simplification is performed whenever the noise plausibility cost is lower than the gain in shape similarity. Therefore, when comparing two shapes, exactly the differing shape features caused by noise can be removed. This leads us to the following definition of an optimal partial symmetric similarity measure with simplification measure, which differs from opssp in that we use a relevance measure r in the second part to measure the amount of simplification of the two shapes:

opsss(Q, T) = min{ α·s(Q′, T′) + β·r(T′, T) + γ·r(Q′, Q) : T′ ⊆ T ∧ Q′ ⊆ Q },

where α + β + γ = 1 are weights.

Expressing a simplification measure r that accounts for noise in the context of the target shape T allows us to differentiate between simplifications that just cancel out noise and those that would remove shape features. For example, if we remove a given vertex that was displaced by noise, the penalty r for its removal will be compensated by the decrease in the value of the similarity measure s. In the next section we show experimental results that demonstrate the usefulness of the opsss measure.
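One possible realization of the iterative symmetric scheme is sketched below; the acceptance rule (remove a vertex from either contour while the combined objective decreases) is our reading of the description above, and r(X', X) is assumed to return the accumulated noise-plausibility cost of the removals that turned X into X'.

```python
def opsss_greedy(Q, T, s, r, alpha=0.4, beta=0.3, gamma=0.3):
    """Greedy approximation of the symmetric measure opsss.

    Repeatedly removes a single vertex from either Q' or T' as long as this
    lowers the combined objective
        alpha * s(Q', T') + beta * r(T', T) + gamma * r(Q', Q).
    s is a global shape distance; r a simplification (noise-plausibility) cost.
    """
    Qp, Tp = list(Q), list(T)

    def objective(Qc, Tc):
        return alpha * s(Qc, Tc) + beta * r(Tc, T) + gamma * r(Qc, Q)

    best = objective(Qp, Tp)
    improved = True
    while improved:
        improved, best_move = False, None
        # Candidate moves: drop one vertex from T' or from Q'.
        for which, poly in (("T", Tp), ("Q", Qp)):
            if len(poly) <= 2:
                continue
            for i in range(len(poly)):
                cand = poly[:i] + poly[i + 1:]
                value = objective(Qp, cand) if which == "T" else objective(cand, Tp)
                if value < best:
                    best, best_move, improved = value, (which, i), True
        if improved:
            which, i = best_move
            if which == "T":
                Tp = Tp[:i] + Tp[i + 1:]
            else:
                Qp = Qp[:i] + Qp[i + 1:]
    return best, Qp, Tp
```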

8 EXPERIMENTAL RESULTS FOR OPTIMAL PARTIAL SYMMETRIC SIMILARITY MEASURES

Within this section we present results from two different domains, namely shape retrieval and scan matching as used in robot mapping. In both domains the application of the optimal symmetric similarity measure yields improvements compared to classical shape similarity measures.

8.1 Recognizing digitally scaled shapes

Suppose shapes (contours) extracted from images of varying sizes should be recognized. The relative amount of noise, e.g., caused by segmentation, varies with the image size. Single pixels may be negligible in large images, while providing valuable shape information in tiny images. The key idea is based on the observation that the noise level can often be estimated. Let us assume that contours can be extracted with a noise of ±1 pixel. This is the case, for instance, when scaling bitmap images digitally. We introduce a term resolution r′, which is the distance a vertex may be translated such that the translation is still plausible. One possible simplification measure judging the likelihood that a contour vertex is caused by noise is based on area: the triangular area defined by three consecutive points of the shape contour is computed and related to the resolution,

Δ_r′(u, v, w) = area(u, v, w) / (r′ · c)².

The cost function Δ_r′ induces the simplification measure r used in opsss. To demonstrate the performance of opsss, a test on recognizing digitally scaled contours is performed. Recognition is successful when the similarity between the scaled query Q and its original contour C1 is higher than between Q and another contour C2. Therefore, we define a relative similarity

S′(Q, C1) = 100 · Sim(Q, C1) / (Sim(Q, C1) + Sim(Q, C2)).

Values above 50% represent a successful recognition (the original contour is the most similar one); values below 50% represent a confusion of the contours.
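A small sketch of these two ingredients follows; the exact normalization constant c in Δ_r′ could not be recovered from the text, so the area-based cost below should be read as one plausible instantiation of the description above, with illustrative names.

```python
def triangle_area(u, v, w):
    """Area of the triangle spanned by three consecutive contour points."""
    (ux, uy), (vx, vy), (wx, wy) = u, v, w
    return 0.5 * abs((vx - ux) * (wy - uy) - (vy - uy) * (wx - ux))

def noise_cost(u, v, w, resolution):
    """Noise-plausibility cost of removing the middle vertex v: small when the
    triangle u-v-w is small relative to the assumed resolution r' (a plausible
    form of the cost Delta_r', not necessarily the paper's exact formula)."""
    return triangle_area(u, v, w) / (resolution ** 2)

def relative_similarity(sim_q_c1, sim_q_c2):
    """Relative similarity S'(Q, C1) in percent; values above 50 mean the
    original contour C1 is recognized as the more similar one."""
    return 100.0 * sim_q_c1 / (sim_q_c1 + sim_q_c2)
```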

In a test, we selected two shapes from the MPEG-7 dataset. Query shapes were then generated by reducing the original shapes' resolution. The queries span a range of resolutions from 10×10 pixels up to 100×100 pixels. Whereas a query corresponding to the highest resolution shows basically no distortion, the query obtained from a 10×10 resolution is heavily distorted. The two original contours C1 and C2, together with example images at reduced resolutions, are depicted in Fig. 11. For each query (of reduced resolution) the relative similarity to its original contour is computed. The computation of shape similarity is performed using the global shape similarity measure s, marked with black columns in Fig. 11, and the novel opsss measure, marked with gray columns. Examining the retrieval rates shown in Fig. 11, it can be observed that opsss outperforms the classical shape similarity. opsss is able to recognize the scaled contours of both queries correctly, whereas the classical similarity is not able to recognize shapes at a resolution of less than 80×80 pixels (cp. Fig. 11(d)). This example shows that opsss can significantly improve shape retrieval in contexts where the noise level can be estimated. We will discuss another example in the context of robot mapping.

Fig. 11. (a), (b): Two shapes from the MPEG-7 database as extracted from their images at resolutions of 100×100, 40×40, and 10×10 pixels each. (c) Relative similarity of matching the shape depicted in (a) against the shapes in (a) and (b). Each column represents the relative similarity for a query at the resolutions 100×100, 90×90, ..., down to 10×10 pixels. Black columns denote the results of the global similarity measure s, gray columns the results of opsss. The higher the columns, the more reliable the retrieval. (d) The same as (c), but for retrieval of shape (b).

8.2 Matching shapes extracted from laser rangefinder data

In the field of robot mapping, mobile robots (typically equipped with a laser rangefinder as sensor) must construct an internal spatial representation, a map of the environment, from sensory data. A key technique here is to match two sensor readings against each other. Range information obtained from a laser rangefinder can be interpreted as a description of the robot's surroundings by means of polylines that represent obstacle boundaries. Therefore, the matching may be formulated as a shape matching problem (cp. [19]).

Polylines extracted from laser rangefinder data suffer from sensor noise that cannot be removed easily. This is due to the relative size of the noise, which changes with the amount of shape information present. To be more precise, the typical noise of the laser rangefinder used in our experiments is about ±2 cm, which is just about the size of a door frame in a wall or of table legs. Thus, removing noise up to the magnitude of ±2 cm would also remove shape features. However, losing any shape features cannot be afforded, as the shapes perceived in typical office environments are only weakly salient features. To overcome this problem we apply opsss in a similar way as explained above. Expressing a simplification measure that accounts for noise allows us to differentiate between simplifications that just cancel out noise and those that would remove shape features. For example, if we remove a given vertex that was displaced by noise, the penalty r for its removal will be compensated by the decrease in the value of the similarity measure s. As the cost for removing a vertex likely to be introduced by noise is very low, shape similarity can still be detected reliably, even when the shapes involved are, relative to the overall shape information, strongly distorted. An example is presented in Fig. 12, where two exemplary polylines are shown. Computing the similarity of these polylines by means of standard shape similarity yields a similarity value of more than 12, which means for this application that they are dissimilar. Since while computing the opsss measure we remove the vertices that make them dissimilar, the computed similarity value is just about 1, which means for this application that they are similar.

9 CONCLUSIONS

Global similarity measures fail to cope with partial visibility (due to occlusion or a change of the point of view) and non-uniformly distributed noise, which is actually the normal situation for shapes extracted from digital images. Therefore, we introduce a partial similarity measure that is capable of finding the part of a target object most similar to a given query shape, even if the query shape is only a small part of the target object. The partial shape similarity measure is optimal in the sense that it is able to focus on similar shape features and to abstract from distinct features in an optimal way. We are motivated by the fact that human perception of shape is based on the similarity of common parts, to the extent that a single, significant visual part is sufficient to recognize the whole object. For example, if you see a hand in front of a door, you expect there to be a human behind the door.

Fig. 12. (a) Two exemplary polylines as could be obtained when sensing an object with a laser rangefinder. While the upper polyline is free of noise, the lower one suffers from distortions of the same magnitude as the shape features present. The grid shown denotes 1 cm distances. (b) Applying opsss to compare the two polylines, differing shape features are removed prior to computing shape similarity (dashed lines). The cost for removal is low, as the removed vertices are judged likely to be caused by noise.

For a given query part Q, the proposed partial shape similarity measure allows us not only to retrieve a target object T from a database of shapes but also to identify the part P of T that is most similar to Q under the following conditions:

1. The location of P in T that corresponds best to the query part Q is unknown.

2. Part P is a distorted version of Q.

3. Part P may be at a different scale than Q.

Our experimental results verify that the introduced partial shape similarity yields excellent results under partial visibility (e.g., occlusion at the level of 80%) if the visible parts are sufficiently distinctive to identify the given objects. Moreover, in contexts where the amount of distortion can be estimated, the proposed measure is able to account for noise plausibility.

10 ACKNOWLEDGEMENTS

This work was supported in part by the National Science Foundation under grant INT-0331786 and by grant 161811705 from the Temple University Office of the Vice President for Research and Graduate Studies. It was carried out in collaboration with the SFB/TR 8 Spatial Cognition, project R3 [Q-Shape]. Financial support by the Deutsche Forschungsgemeinschaft is gratefully acknowledged.

References

1. E. Attalla. Shape Based Digital Image Similarity Retrieval. Ph.D. Thesis, Wayne State University, Detroit, 2004.
2. E. M. Arkin, L. P. Chew, D. P. Huttenlocher, K. Kedem, and J. S. B. Mitchell. An efficiently computable metric for comparing polygonal shapes. IEEE Trans. PAMI, 13:209-216, 1991.
3. R. Basri, L. Costa, D. Geiger, and D. Jacobs. Determining the Similarity of Deformable Shapes. Vision Research 38, p. 2365-2385, 1998.
4. S. Belongie, J. Malik, and J. Puzicha. Shape matching and object recognition using shape contexts. IEEE PAMI 24, p. 509-522, 2002.
5. I. Biederman. Human image understanding: Recent research and a theory. Computer Vision, Graphics, and Image Processing 32, 29-73, 1985.
6. T. Binford. Visual Perception by Computer. IEEE Conf. on Systems and Control, 1971.
7. R. Brooks. Symbolic Reasoning Among 3D Models and 2D Images. Artificial Intelligence 17, p. 285-348, 1981.
8. M. C. Burl and P. Perona. Recognition of Planar Object Classes. CVPR, 1996.
9. M. Bober, J. D. Kim, H. K. Kim, Y. S. Kim, W.-Y. Kim, and K. Muller. Summary of the results in shape descriptor core experiment. MPEG-7, ISO/IEC JTC1/SC29/WG11/MPEG99/M4869, Vancouver, July 1999.
10. C. M. Cyr and B. B. Kimia. 3D Object Recognition Using Shape Similarity-Based Aspect Graph. ICCV, 2001.
11. C. Grigorescu and N. Petkov. Distance Sets for Shape Filters and Shape Recognition. IEEE Trans. Image Processing 12(9), 2003.
12. D. D. Hoffman and W. A. Richards. Parts of Recognition. Cognition 18, 65-96, 1984.
13. D. D. Hoffman and M. Singh. Salience of Visual Parts. Cognition 63, p. 29-78, 1997.
14. R. Lakaemper and L. J. Latecki. Shape Search Engine. Online shape database.
15. L. J. Latecki and R. Lakaemper. Convexity Rule for Shape Decomposition Based on Discrete Contour Evolution. CVIU 73, 441-454, 1999.
16. L. J. Latecki and R. Lakaemper. Shape Similarity Measure Based on Correspondence of Visual Parts. PAMI 22, p. 1185-1190, 2000.
17. L. J. Latecki, R. Lakaemper, and U. Eckhardt. Shape descriptors for non-rigid shapes with a single closed contour. CVPR, p. 424-429, 2000.
18. L. J. Latecki, A. Gross, and R. Melter (eds.). Special Issue on Shape Representation and Similarity for Image Databases. Pattern Recognition, 35(1), 2002.
19. L. J. Latecki, R. Lakaemper, and D. Wolter. Shape Similarity and Visual Parts. DGCI, Naples, Italy, November 2003.
20. F. Mokhtarian and A. K. Mackworth. A Theory of Multiscale, Curvature-Based Shape Representation for Planar Curves. IEEE PAMI 14, p. 789-805, 1992.
21. F. Mokhtarian and M. Bober. Curvature Scale Space Representation: Theory, Applications and MPEG-7 Standardization. Kluwer Academic, 2003.
22. K. Mueller, J.-R. Ohm, J. Cooper, and M. Bober. Results of 2D/3D shape core experiments MS-4. ISO/IEC/JTC1/SC29/WG11, MPEG/M6190, July 2000.
23. A. Pentland. Recognition by Parts. ICCV, p. 612-620, 1987.
24. T. B. Sebastian, P. Klein, and B. B. Kimia. On aligning curves. PAMI, 25, p. 116-125, 2003.
25. K. Siddiqi and B. B. Kimia. Parts of visual form: Computational aspects. PAMI, 17, p. 239-251, 1995.
26. K. Siddiqi, K. J. Tresness, and B. B. Kimia. Parts of visual form: Ecological and psychophysical aspects. Perception, 25, p. 399-424, 1996.
27. M. Weber, M. Welling, and P. Perona. Unsupervised Learning of Models for Recognition. ECCV, p. 18-32, 2000.
28. M. Weber, M. Welling, and P. Perona. Towards Automatic Discovery of Object Categories. CVPR, p. 101-109, 2000.
