文档库 最新最全的文档下载
当前位置:文档库 › Similarity search for adaptive ellipsoid queries using spatial transformation

Similarity search for adaptive ellipsoid queries using spatial transformation

Similarity Search for Adaptive Ellipsoid Queries Using

Spatial Transformation

Yasushi Sakurai?Masatoshi Yoshikawa§Ryoji Kataoka?Shunsuke Uemura§

?NTT Cyber Space Laboratories {ysakurai,kataoka}@dq.isl.ntt.co.jp §Nara Institute of Science and Technology {yosikawa,uemura}@is.aist-nara.ac.jp

Abstract

Similarity retrieval mechanisms should utilize

generalized quadratic form distance functions

as well as the Euclidean distance function

since ellipsoid queries parameters may vary

with the user and situation.In this paper,we

present the spatial transformation technique

that yields a new search method for adap-

tive ellipsoid queries with quadratic form dis-

tance functions.The basic idea is to transform

the bounding rectangles in the original space,

wherein distance from a query point is mea-

sured by quadratic form distance functions,

into spatial objects in a new space wherein

distance is measured by Euclidean distance

functions.Our method signi?cantly reduces

CPU cost due to the distance approximation

by the spatial transformation;exact distance

evaluations are avoided for most of the ac-

cessed bounding rectangles in the index struc-

tures.We also present the multiple spatial

transformation technique as an extension of

the spatial transformation technique.The

multiple spatial transformation technique ad-

justs the tree structures to suit typical ellip-

soid queries;the search algorithm utilizes the

adjusted structure.This technique reduces

both page accesses and CPU time for ellip-

soid queries.Experiments using various ma-

trices and index structures demonstrate the

superiority of the proposed methods.

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage,the VLDB copyright notice and the title of the publication and its date appear,and notice is given that copying is by permission of the Very Large Data Base Endowment.To copy otherwise,or to republish,requires a fee and/or special permission from the Endowment. Proceedings of the27th VLDB Conference, Roma,Italy,20011Introduction

Multimedia content-based retrieval systems use fea-ture values extracted from multimedia data;they?nd

data objects whose feature values are most similar to those of the query object.These systems in-clude various pattern recognition mechanisms,and the databases on which they operate continue to grow in size.This means that,multimedia systems and spa-tial databases require(1)information retrieval meth-ods with more general distance functions,and(2)im-proved search performance.

Since the Euclidean distance space makes all dimen-sions independent of each other,it fails to adequately represent the user’s intention.Therefore,multime-dia systems require the use of generalized quadratic form distance functions as well as the Euclidean dis-tance function.Since quadratic form distance func-tions can represent correlations between dimensions, retrieval mechanisms using them have high search quality[HSE+95].The quadratic form distance func-tion d2

M

(p,q)=(p?q)·M·(p?q)t is calculated from a query matrix M which is positive de?nite(i.e.

d2

M

(p,q)>0),where q is a query point and p is a data object in a data set.In d-dimensional spaces, the Euclidean distance function has circles for isosur-faces,and weighted Euclidean distance functions cor-respond to iso-oriented ellipsoids,whose major axis is aligned to the coordinate axis.Quadratic form dis-tance functions have arbitrarily oriented ellipsoids that are not necessarily aligned to the coordinate axis(see Figure1).Quadratic form distance functions are re-garded as a generalization of the Euclidean distance function and weighted Euclidean distance functions. MindReader[ISF98]is an example of the application of quadratic form distance functions;based on relevance feedback,it guesses the correlations between dimen-sions,which re?ect the user’s preference.Unlike the Euclidean distance function,quadratic form distance functions more faithfully re?ect the user’s intention.

Given that the size of multimedia databases will continue to grow and that the dimensionality of fea-ture data will continue to increase,high-performance data retrieval methods are essential.Many index methods have been proposed so far[GG98].They

(a)Euclidean(b)weighted Euclidean(c)quadratic form Figure1:Isosurfaces for various distance functions. include data-partitioning index trees(e.g.the R*-tree[BKSS90],the X-tree[BKK96]and the A-tree [SYUK00]).Nearest neighbor search methods using such indices have also been proposed[RKV95][HS95]. In particular,the A-tree is reported to o?er good per-formance for high-dimensional data[SYUK00].Unfor-tunately,most spatial access methods were designed for searches based on the Euclidean distance function, so new spatial access methods that are suited to el-lipsoid queries based on quadratic form distance func-tions are needed.In addition,image retrieval mecha-nisms using various user-adaptable distance functions [FSA+95][HSE+95]and relevance feedback mecha-nisms that guess the user’s desires,such as MARS [RHM97]and MindReader[ISF98],deal with queries whose parameters can vary with the user and situa-tion.These mechanisms require search methods that can support adaptive queries.

The goal of our work is to create a search method for adaptive ellipsoid queries that can?nd similar ob-jects e?ciently.Various metric indices(e.g.the M-tree[CPZ97]and the mvp-tree[BO97])have been proposed as indexing methods for arbitrary distance functions.However,these indices cannot be applied to systems that handle changeable distance functions and,thus,they are not functionally adequate to sup-port adaptive ellipsoid queries.In[SK97],Seidl et al.presented a search algorithm for adaptive ellip-soid queries on index structures that calculates exact distances between query points and MBRs(Minimum Bounding Rectangles).In[ABKS98],Ankerst et al. presented a search method that reduces the number of exact quadratic form distance calculations needed and so reduces the CPU time by using MBB(Minimum Bounding Box)distance functions and MBS(Mini-mum Bounding Sphere)distance functions,which we call the MBB-MBS approximation technique in this paper.However,this technique’s search cost increases as dimensionality grows or as the query ellipsoid be-comes?atter.

To overcome the disadvantages of the MBB-MBS approximation technique for ellipsoid queries,we have developed an approximation technique,the Spatial Transformation Technique(STT).The basic idea of STT is to transform the MBRs,whose distance from a query point is measured by the quadratic form dis-tance functions,into rectangles,whose distance is mea-sured by the Euclidean distance functions.Its ap-proximation quality is high even when the?atness of query ellipsoids and dimensionality are high,and yet STT has a low CPU cost.We also developed the Multiple Spatial Transformation Technique(MSTT).MSTT adjusts the tree structures to suit typical ellip-soid queries;this technique reduces the number of page accesses as well as the CPU cost because the search al-gorithm utilizes the adjusted structure.

The remainder of this paper is organized as follows. Section2is a summary of the analysis of the MBB-MBS approximation technique.Section3describes the motivation,de?nitions and algorithms of STT.Sec-tion4describes MSTT.Section5gives the results of a performance evaluation of STT and MSTT.Finally, Section6concludes the paper.

2Problems of Search Methods for Adaptive Ellipsoid Queries

This section summarizes the properties and problems of the MBB-MBS approximation technique.

2.1Search Methods for Adaptive Ellipsoid

Queries

The search algorithm of[SK97]for ellipsoid queries on index structures calculates exact distances between query points and MBRs(Minimum Bounding Rect-angles).This search method supports adaptive ellip-soid queries with variable distance functions by using index structures.However,the calculations of dis-tance between query points and MBRs incurs CPU costs as high as O(ω·d2)time,where d is dimension-ality andωdenotes the number of iterations.The CPU time represents a high percentage of the overall search time.Ankerst et al.[ABKS98]developed the MBB-MBS approximation technique that reduces the number of exact quadratic form distance calculations needed(and,in so doing,reduces the CPU time for ellipsoid queries)by using MBB(Minimum Bounding Box)distance functions and MBS(Minimum Bound-ing Sphere)distance functions.The following de?ni-tions formalize the MBB distance function and the MBS distance function for d-dimensional spaces: d2

MBB(M)

(p,q)=d

max

i=1

(p i?q i)2

(M?1)ii

,(1)

d2

MBS(M)

(p,q)=λ2M

min

·(p?q)2,(2)

whereλM

i

(i=1,···,d)are the eigenvalues of M, andλM

min

is the lowest eigenvalue of M.The MBB distance functions approximate an ellipsoid query area by a bounding box that totally encloses the query area. The MBS distance functions use a bounding sphere for the approximation.Both approximation techniques require O(d)time for their calculations.

2.2Summary of Experimental Evaluation and

Analysis

We have performed extensive experiments to analyze the MBB-MBS approximation technique.In the eval-uation,we varied matrix?atness and dimensionality.

The?atness of a query matrix M(det(M)=1)is evaluated as follows:

σ2M=

d

i=1

(λM

i

?λM)2,λM=

d

j=0

λM

j

d

,

whereλM

i is the i-th dimensional eigenvalue andλM

is the average of the eigenvalues of M.In this paper,

the varianceσ2

M is called the?atness of M.Before

calculatingσ2

M ,all matrices were normalized1with

det(M)=1.Here,the?atness of the unit matrix that represents searching in the Euclidean space,is0.

The details of the experiments are described in [SYKU01].Our experiments revealed that the MBB-MBS approximation technique has the following prob-lems:

(P1)CPU time

For both MBB and MBS approximation func-

tions,approximation quality decreases as either

dimensionality or matrix?atness grows.As a re-

sult,the number of exact quadratic form distance

calculations increases,thus leading to a high CPU

time.

(P2)Node accesses

Index structures are constructed to?nd target ob-

jects in the Euclidean space e?ciently,and the

search algorithms utilize the resulting index struc-

tures.Therefore,as matrix?atness grows,the

number of node accesses increases.This leads to

increases in both CPU time and number of page

accesses.

The problems revealed serve as the basis for devel-oping our proposed methods.To cope with(P1),we developed the spatial transform technique;this tech-nique achieves high quality approximations and supe-rior performance.To overcome(P2),we developed an extension of the spatial transformation technique, called the multiple spatial transformation technique. This technique reduces both CPU time and the num-ber of page accesses.

3Spatial Transformation Technique

In this section,we describe the Spatial Transformation Technique(STT),which approximates the quadratic form distance between a query point and a bounding rectangle.Like the MBB-MBS approximation tech-nique,STT guarantees no false drops and so returns exact answers to any query.

3.1Basic Ideas

In computing the exact distance for quadratic form distance functions,calculating the distance between a query point and MBRs in the index structures incurs a high CPU cost,and moreover,the calculations need to be iterated.That is,the complexity is O(ω·d2),where 1Normalization of matrices is described in Section4.4.ωdenotes the number of iterations.The basic idea of STT is to transform the MBRs,whose distance from a query point is measured by the quadratic form distance functions,into rectangles whose distance is measured by the Euclidean distance functions.STT requires no iteration.The transformation contributes to reducing CPU time.As shown in the problem(P1),the MBB-MBS approximation technique is not e?ective when ei-ther the dimensionality or?atness of query matrices is high.STT requires less CPU time and o?ers great ef-?ciency even when dimensionality and matrix?atness are high,because of its high approximation quality.In this section,we?rst de?ne the spatial transformation and then describe a spatial transformation technique for bounding rectangles in index structures.

3.2De?nition of Spatial Transformation Given a query matrix M and a query point q,the quadratic form distance between q and a point p in a d-dimensional space S is de?ned as follows:

d2M(p,q)=(p?q)·M·(p?q)t.(3) Since M is positive de?nite,the spectral decomposi-tion of M can be calculated as:

M=E M·ΛM·E t M,(4) where E M is the set of the eigenvectors of M,and the diagonal matrix

ΛM=diag(λM

1

,λM

2

,...,λM

d

)consists of the eigen-valuesλM

1

,λM

2

,...,λM

d

of M.From Equations(3) (4),we obtain:

d2M(p,q)=(p?q)·E M·ΛM·E t M·(p?q)t.(5)

When considering point p =(p?q)·E M·Λ12M in the Euclidean space S ,Equation(5)denotes that the Euclidean distance between the origin O and p in S

is equal to the quadratic form distance d2

M

(p,q)(i.e.

d2

M

(p,q)=p ·p t).Here,the transformation matrix of M is de?ned as:

A M=E M·Λ12M.(6)

A M transforms the quadratic form distances within S into the Euclidean distances within S .It yields the so called spatial transformation of p into p .

3.3Spatial Transformation of Rectangles for

Distance Calculation

STT gives the spatial transformation of rectangles in index structures.Figure2illustrates the spatial trans-formation of a rectangle.In this?gure,the bounding rectangle P in S is transformed into the d-dimensional parallelogram P in S .Since the calculation of dis-tance between the origin O and polygons in high-dimensional spaces incurs a high CPU cost,STT ap-proximates P by rectangle R as shown in Figure2(b).

c

a

(b)A rectangle calculated by STT

Figure2:An example of spatial transformation. This approximation reduces the calculation cost of el-lipsoid queries.

We assume a rectangle P within S and a query point q.Let p a and p b be endpoints of the major diagonal of P and l i be the i-th dimensional edge length of P. It follows that point p a in S can be calculated by the spatial transformation of p a:

p a=(p a?q)·A M.(7) We extract the following components from the compo-nents a ij of A M:

φij=

a ij(a ij<0)

0(otherwise),

ψij=

a ij(a ij>0)

0(otherwise).

(8)

From Equations(7)(8),the rectangle R that totally encloses the d-dimensional parallelogram with respect to the spatial transformation of P can be calculated as2:

R=(r a,r b),(9)

r a

j =p a

j

+

d

i=1

l i·φij,r b j=p a

j

+

d

i=1

l i·ψij (1≤j≤d),

where r a and r b are endpoints of the major diagonal of R.Since R totally encloses P in S ,the search al-2The proof of Equation(9)is described in[SYKU01].gorithm can use the Euclidean distance d2(R,O)in-stead of the quadratic form distance d2

M

(P,q)(i.e.

d2(R,O)≤d2

M

(P,q)).

For example,as shown in Figure2,the query point q=(2,2)and matrix:

M=

1.25?0.75

?0.751.25

are given.When using M,the vertices p a,p b,p c and p d of the bounding rectangle P in S are transformed

into the vertices p a,p

b

,p c and p

d

of the parallelogram P in S ,respectively.Also,R=(r a,r b)encloses P .

d2

M

(q,P)is approximated by d2(R,O),and we can uti-

lize d2(R,O)instead of d2

M

(q,P).

3.4Search Algorithm

Range queries and k-nearest neighbor queries are use-ful for multi-dimensional databases.An algorithm based on spatial transformation can e?ciently support both types of queries.Since k-nearest neighbor queries are more complex and require higher cost than range queries,we will focus on the k-nearest neighbor search in this paper and describe one such algorithm for STT. Note that the idea of STT can be applied to any range query.

The search algorithm implements Equations(6)(7) (8)(9)for spatial transformation.However,its CPU cost would become excessive if it required these for-mulas to be used in the spatial transformation of all accessed rectangles.Therefore,we use the following two ideas to reduce the CPU cost.

First,the result of Equations(6)(8)does not de-pend on the position of the bounding rectangles ac-cessed.Thus,the search algorithm solves these formu-las before accessing the rectangles.The result can be applied to the spatial transformation of all rectangles visited.

Second,we reduce the calculation time relative to Equation(9).Note that,on average,half of the com-ponentsφij andψij are0.Therefore,in the imple-mentation,the algorithm searches for all pairs of row number i and column number j whose components are φij=0,ψij=0before accessing nodes in index struc-tures.This preprocessing halves the CPU cost when calculating R(i.e.r a

j

and r b

j

)with Equation(9).

Let c a

j

be the number of components in the j-th column whereφij=0,and u jk be c a j row numbers of the components in the j-th column(k=1,...,c a

j

). Similarly,forψij,let c b

j

be the number of components in the j-th column whereψij=0,and v jk be c b j row numbers of the components in the j-that column(k= 1,...,c b

j

).Functions for calculating the position of R with less computation time can be obtained by using u jk and v jk:

R=(r a,r b),(10)

r a

j

=p a

j

+

c a

j

k=1

l k·φ(u

jk

)j,

Procedure search(point query,matrix M,

integer k)

1.ΦM:=analyzeMatrix(M);

2.enqueue(a pointer to the root,0);

3.while emptyQueue()=false do

4.N:=dequeue();

5.if N is a data node then

6.for each entry∈N do

7.if d MBB-MBS(M)(query,entry.vector)

≤nnlist[k].dist then 8.if d M(query,entry.vector)

≤nnlist[k].dist then

9.nnlist[k].id:=entry.id;

10.nnlist[k].dist:=d M(query,

entry.vector);

11.sort nnlist by distance;

12.pruneQueue(nnlist[k].dist);

13.endif

14.else

15.for each entry∈N do

16.if d MBB-MBS(M)(query,entry.rectangle)

≤nnlist[k].dist then 17.R:=spatialTransformation(query,

entry.rectangle,ΦM);

18.if d(R,O)≤nnlist[k].dist then

19.if d M(query,entry.rectangle)

≤nnlist[k].dist then 20.enqueue(entry.ptr,d M(query,

entry.rectangle));

21.endif

22.endif

23.enddo

24.output(nnlist);

Figure3:k-nearest neighbor search algorithm for el-lipsoid queries.

r b

j =p a

j

+

c b

j

k=1

l k·ψ(v

jk

)j,

where each c a

j and c b

j

averages d/2.

Figure3shows the search algorithm for ellipsoid queries using tree structures of the R-tree family.The search algorithm utilizes the spatial transformation of rectangles to evaluate the distance of a query point to the rectangles.STT and MBB-MBS approxima-tion techniques incur lower CPU costs compared with the exact quadratic form distance function for distance calculations.Therefore,the search algorithm?rst cal-culates the approximation distance between a query point and a bounding rectangle when evaluating the distance to the bounding rectangle.If the calculated approximation distance is less than or equal to the distance of the query point to the actual k-th nearest neighbor,the exact distance to the rectangle is evalu-ated using the exact quadratic form distance function.

In the search procedure(see Figure3),for initial-ization,the transformation matrix is calculated and its components are checked(step1),and then the pair of a pointer to the root and0is stored in the prior-ity queue(step2).In step4,the function dequeue()dequeues the pair from the top of the priority queue, and extracts a node N.If N is a data node,the MBB-MBS approximation distance of every data object in the node is evaluated.If the approximation distance is less than or equal to the actual k-th nearest neigh-bor distance,the exact distance is evaluated(steps5 to8),and the data object together with its distance is stored in the nearest neighbor list(steps9to12). If N is not a data node,the MBB-MBS approxima-tion distance of every bounding rectangle is evaluated (step16).If the MBB-MBS approximation distance of a rectangle is less than or equal to the actual k-th nearest neighbor distance,the spatial transformation of the rectangle is calculated fromΦM(step17).In step18,the Euclidean distance between O and R ob-tained by the spatial transformation is evaluated.If the distance calculated by the spatial transformation is less than or equal to the actual k-that nearest neigh-bor distance,the exact distance is evaluated(step19).

Our experiments used not only the A-tree but also the R*-tree.The A-tree is useful for ellipsoid queries as well as queries based on the Euclidean distance func-tion.The A-tree search algorithm di?ers somewhat from the other methods in the R-tree family.The de-tails of the A-tree search algorithm are described in [SYUK00].

3.5Dimensionality Reduction

When the?atness of a query matrix is high,there are eigenvectors whose eigenvalue is small.In the space created by the spatial transformation,the dimensions corresponding to the eigenvalues contribute less to ap-proximation quality although the dimensions require the same CPU cost as the others.The STT with dimensionality reduction eliminates dimensions whose eigenvalues are small in order to save on CPU costs.

Let r=(r1,r2,...,r d)be the closest vertex of R to O in S created by the spatial transformation.When using dimensionality reduction,the distance of R to O can be determined as:

?d2(R,O)=

n

i=1

(r i)2,(11) n=COUNT

λj≥

η

d

·

d

i=1

λi

(j=1,...,d),

whereηis a threshold for dimensionality reduction, andλi is arranged in ascending order(i.e.λ1≥λ2≥...≥λd>0).The function COUNT(Γ)gives the number of elements that satisfy requirementΓ.This formula shows that the dimensionality for distance cal-culation in S is limited to n(n≤d).Thus,the dimen-sionality reduction reduces the calculation time rela-tive to Equations(7)(8)(10)as well as Equation(11) to n/d.As query matrix?atness increases,n decreases and higher e?ciency is achieved for the distance cal-culations.

X

similar

Figure4:The multiple spatial transformation tech-nique.

4Multiple Spatial Transformation Technique

In this section,we present the Multiple Spatial Trans-formation Technique(MSTT),which is an extension of STT.STT’s approximation quality is high,however, the number of node accesses increases as query matrix ?atness grows.This is because STT,as well as con-ventional search methods,utilizes a structure that is constructed by the Euclidean distance function.To overcome this problem,MSTT constructs tree struc-tures based on various quadratic form distance func-tions and then chooses a structure that gives su?cient search performance;the search algorithm described in Section3.4utilizes such a chosen structure.

4.1Basic Ideas

We have revealed the node access problem(P2)in Sec-tion2.2.The search methods for adaptive ellipsoid queries presented in[SK97]and[ABKS98]use index structures based on the Euclidean distance function. Accordingly,the number of node accesses increases as query matrix?atness grows,which leads to an increase in CPU cost and number of page accesses.MSTT over-comes this problem by selecting an arbitrary quadratic form distance function before constructing the index structures;the search algorithm utilizes the resulting structure.MSTT reduces both page accesses and CPU cost for ellipsoid queries.

MSTT can handle more than one index structure. For multimedia systems that attach importance to re-trieval performance and can well a?ord the disk space, the use of more than one structure is e?ective in im-proving search performance.Figure4illustrates a re-trieval mechanism based on MSTT.The mechanism ?rst determines a typical ellipsoid query matrix X i (i=1,...,ε)from the user’s query logs,and then con-structs index structures based on X i.In query process-ing,the matrix X similar closest to the query matrix M is chosen,and target objects are found using the struc-ture constructed by X similar.In particular,the query shown in Figure4requires search processing based on the Euclidean distance function if M=X similar.This retrieval mechanism that adopts multiple indexing can accelerate search performance.

Disk prices continue to fall and disk unit capacity is increasing rapidly.[GG97]shows that disk unit ca-pacity and storage cost have increased/decreased a hundred times and ten thousand times,respectively; whereas disk access speeds have increased only ten-fold in the last twenty years.The resulting trend is to em-phasize disk access speed counts over storage cost.In addition,reducing the search cost has a higher priority than reducing the insertion cost in many multimedia databases.It follows that there is a strong rationale for using more than one index to improve search per-formance.

4.2Indexing and Retrieval Mechanisms Structure Construction:

Let C be a matrix for constructing an index structure. The transformation matrix A C of C is:

A C=E C·Λ12C.

All data points included in the data set for construct-ing an index are transformed by A C.For instance, A C transforms a data point p in the data set into p =p·A C.MSTT constructs an index structure I C based on the transformed data points.I C can ef-?ciently support queries whose matrix is C.

Query Processing by MSTT:

For a query point q and a query matrix M,we?rst transform q into q =q·A C to perform this query using I C.Given a new matrix M for the query processing of M using I C:

M =A?1

C

·M·(A?1C)t,(12)

the quadratic form distance of M between p and q can be expanded as follows:

d2M(p,q)=(p?q)·M·(p?q)t

=(p ?q )·A?1C·M·(A?1C)t·(p ?q )t

=(p ?q )·M ·(p ?q )t.

Thus,the query whose matrix is M and point is q us-ing I C leads to the search result of the ellipsoid query of M.In particular,if M=C,I C can e?ciently support the query whose matrix is M,since M is a unit matrix,which means the search is based on the Euclidean distance function.

4.3Similarity of Matrices

When more than one index structure is constructed, the search process must choose one of them for ac-cess.To do so,we de?ne the dissimilarity between a query matrix M and an index I C by using the matrix ?atness.

Queries of M using I C utilize M ,calculated by Equation(12),as the query matrix.LetλM

i

be the

Table 1:Variance of eigenvalues.w r 1101001000d =80.030776.4897998.6800214σ2M

d =2764.777933729.29e89.29e12

Table 2:Dimensions used for ellipsoid queries.

w r 1101001000

d =8

8844n d =27271899i -th dimensional eigenvalue of M and λM be the av-erage of the eigenvalues of M .The variance σ2

M of

the eigenvalues of M

is determined as follows:

σ2M

=

d i =1

(λM i ?λM )2

,

λM =

d j =0

λM

j d

(13)

We employ σ2

M as the measure of dissimilarity be-tween M and I C .For a similarity search using MSTT,

the e?ectiveness of I C relative to M improves as σ2

M decreases.

4.4Normalization of Matrices

To calculate the dissimilarity of queries and indices,all matrices must be normalized,i.e.,det (C )=det (M )=1for matrices C and M .The normalized matrix N of M is obtained by:

N =E M ·ΛN ·E t

M ,

λN i =λM i ·

d

i =1

λM i

?1d

where the diagonal matrix ΛN consists of the eigenval-ues λN i (i =1,...,d )of N .C can also normalized in the same way.

5Performance Evaluation

To verify the e?ectiveness of STT,we implemented the algorithm and compared it with the MBB-MBS approximation technique.We then measured the per-formance of MSTT.

We evaluated its performance using real data sets with size of 100,000.For the data sets,8-D and 27-D feature vectors of color histograms were extracted from images.In assessing search performance,the page access number and CPU time were measured by the average of 100queries.In our evaluation,we used 20-nearest neighbor queries;query data were di?erent from the point data included in the indices,that is,query points were generated randomly and indepen-dently of data points.Page size was 8KB.CPU time was measured on a SUN UltraSPARC-II 450MHz.We used the A-tree [SYUK00],which provides supe-rior performance for high-dimensional data,and chose the code with size of 6bits per dimension for approx-imating the bounding rectangles and data objects in

the A-tree structure.To obtain the similarity matrices M ,we calculated the components m ij of M using the following formula [HSE +95][ABKS98]:

m ij =exp (?α(d w (c i ,c j )/d max )2),

where αis a positive constant,and d w (c i ,c j )denotes the weighted Euclidean distance between the color c i and c j .The factors w =(w r ,w g ,w b )represent the weighting of the red,green and blue components in RGB color space.In our evaluation,αwas 10,and both w g and w b were ?xed to 1.w r was varied from 1to 1,000.We calculated the eigenvalues of every matrix for 8-D and 27-D data.Table 1shows the result of this calculation for various values of w r .As shown in the table,matrix ?atness increases as w r grows when α,w g and w b are ?xed.

For the dimensionality reduction technique,the best threshold (η=0.01)from among three alter-natives,η=0.1,η=0.01,η=0.001,was chosen.Table 2shows the dimensions n used for STT with respect to the matrices created in our experiments.5.1

Search Performance

Figure 5compares STT and the MBB-MBS approxi-mation technique in terms of CPU cost.The A-tree was used as the index structure.The symbol STT(DR)means the CPU cost for the STT with dimensional-ity reduction.The number of page accesses is shown in Figure 6.Since STT and the MBB-MBS approxi-mation technique utilize exact quadratic form distance functions,both require the same number of page ac-cesses to perform ellipsoid queries.Thus,the di?er-ence in search time between STT and the MBB-MBS approximation technique depends on calculation com-plexity.As described in Section 1,ellipsoid queries incur high costs in calculating the distance between bounding rectangles and query points.Figure 5shows that STT reduces CPU cost for all data sets.The ef-fectiveness of STT increases as either dimensionality or matrix ?atness grows.In particular,STT achieves a 74%reduction in CPU cost over that of the MBB-MBS approximation technique for high dimensionality and matrix ?atness.

We evaluated the STT performance using the R*-tree as well as the A-tree.STT using either the R*-tree or the A-tree was superior to the MBB-MBS ap-proximation technique.The details of the experiments using the R*-tree are described in [SYKU01].5.2

Analysis of Approximation Techniques for Elliptical Queries

STT does not utilize the exact quadratic form distance functions to access bounding rectangles whose approx-imation distance from the query point exceeds the ac-tual k -nearest neighbor distance,similar to the MBB-MBS approximation technique.Figure 7shows the percentage of ?ltered exact quadratic form distance

200400600800

1000110

1001000C P U t i m e (m s )

Weight

STT (DR)

STT MBB-MBS

10 20 30 40 50

60110

1001000

P a g e a c c e

s s e s

Weight

0 10 20 30 40 50 60 70 80 901001

10

100

1000

R a t e o f f i l t e r e d e x a c t c a l c u l a t i o n s (%)

Weight

STT (DR)

STT MBB-MBS

(a)d =8

(a)d =8

(a)d =8

200040006000800010000

12000110

1001000

C P U t i m e

(m s )

Weight

STT (DR)

STT MBB-MBS

20

40 60 80100120

140110

1001000P a g e a c c e s s e s

Weight

01020304050607080901001

10

1001000

R a t e o f f i l t e r e d e x a c t c a l c u l a t i o n s (%)

Weight

STT (DR)

STT MBB-MBS

(b)d =27

(b)d =27

(b)d =27

Figure 5:Comparison of STT with the MBB-MBS approxima-tion technique in terms of CPU cost.

Figure 6:The number of page ac-cesses for the MBB-MBS approxi-mation technique.

Figure 7:Rate of ?ltered exact dis-tance calculations for STT.

calculations versus the number of bounding rectan-gles accessed in search processing,that is,this ?g-ure illustrates the e?ectiveness of the approximation techniques.Although the e?ciency of the MBB-MBS approximation technique decreases as the ?atness of query matrix grows,the STT approximations e?-ciently ?lter exact quadratic form distance calculations for all queries.STT proves to be highly e?ective with high-dimensional data and queries whose matrix ?at-ness is high as well as those with lower dimensionality and ?atness.The e?ectiveness of STT yields a low CPU cost as shown in Figure 5.

The dimensionality reduction provided by STT eliminates dimensions that make only a slight contri-bution to the approximation of distance between query points and bounding rectangles.This technique be-comes more e?ective in ellipsoid searches as the ?at-ness of the query matrix increases.Since the ?atness of query w r =1is relatively low,the query uses all dimensions in the search as shown in Table 2.On the other hand,for queries w r =100and w r =1000,both of which have ?at ellipsoids,distance calcula-tions are based on lower dimensionality.As Figure 7shows,STT has high approximation e?ciency with and without dimensionality reduction;as a result,the STT with dimensionality reduction has superior per-formance and a lower CPU cost.5.3E?ectiveness of MSTT

For a given query matrix,MSTT constructs an index structure based on the query matrix in order to sup-

port the query more e?ciently.In these experiments,we measured the performance of MSTT with the fol-lowing structures:

(1)Unit:the index structures constructed from the

unit matrix.(2)W r =10:the index structures constructed from

the matrix w r =10.(3)W r =1000:the index structures constructed

from the matrix w r =1000.Figure 8depicts the search performance for ellipsoid queries using these index structures 3.Table 3shows the dissimilarities between the three index structures and the four ellipsoid queries for 8-D and 27-D dimen-sions.The dissimilarities of index structures to queries were calculated using Equation (13).For any query,choosing the index structure that is most similar to the query minimizes search cost of the query.MSTT sig-ni?cantly reduces CPU costs and the number of page accesses for any query.

In addition,search cost is not proportional to dis-similarity.For example,queries whose dissimilarity is 0incur some search cost since similarity searches en-tail some cost even in the Euclidean distance space.Note that the function is not a cost model.Dissimi-larity allows the search algorithm to choose the index structure well suited to query matrices.

In practical situations,the dissimilarity of a given query matrix must be calculated for each index when

3

Experiments for 8-D data are shown in [SYKU01].

5001000150020002500

3000110

1001000C P U t i m e (m s )

Weight

Unit Wr=10Wr=1000

100200300400500110

1001000

C P U t i m e (m s )

Weight

dissimilarity

Unit Best

500

10001500200025003000

3500110

1001000

C P U t i m e (m s )

Weight

Dissimilarity

Unit Best

(a)CPU time,d =27

(a)CPU time,d =8

(c)CPU time,d =27

20

30 40 50 60 70 80 90

1001101

10

100

1000

P a g e a c c e s s e s

Weight

Unit Wr=10Wr=1000

10 20 30 40 50

60110

1001000

P a g e a c c e s s e s

Weight

dissimilarity

Unit Best

20

40 60 80100120

140

110

1001000

P a g e a c c e s s e s

Weight

dissimilarity

Unit Best

(b)Page accesses,d =27

(b)Page accesses,d =8(d)Page accesses,d =27

Figure 8:Search performance for MSTT.

Figure 9:Behavior of the dissimilarity function.

trying to ?nd the best index structure.The number of dissimilarity calculations grows as the number of created indices increases since the number of dissimi-larity calculations equals the number of indices.How-ever,each calculation incurs only a small CPU cost:2ms for 27-D data in our experiments.Since this is negligible compared with the overall search time,constructing various structures substantially improves search performance.5.4

Properties of the dissimilarity function

In this section,we analyze the properties of the dissim-ilarity function de?ned in Section 4.3.We created 30query matrices for 8and 27dimensions using weight w r as follows:

w r =10random ,where random is a randomly generated number be-tween 0and 3.Figure 9shows how the dissimilar-ity function chooses an index structure.Each search cost for 30queries was measured by the average of 100queries.This experiment used three kinds of index,Unit ,W r =10and W r =1000,such as the ex-periment shown in Section 5.3.The ?gure shows the following search costs:

(1)Dissimilarity:the cost of search using index

structures chosen by the dissimilarity function.(2)Unit:the search cost on index structures con-structed from the unit matrix.(3)Best:the lowest search cost using the optimal

index structure for each query matrix.

In the experiment using 27-dimensional data,the dis-similarity function chooses the index structure Unit for the query matrices whose w r lies between 1and 3,W r =10for w r between 3and 100,and W r =1000for w r between 100and 1,000.The choice of indices for 8-dimensional data is quite similar to that for 27-dimensional data.Although search cost determined by the function is not exactly equal to the search cost achieved by the optimal index structures,the func-tion chooses a good structure for most of the query matrices.Moreover,compared with the index struc-ture based on the Euclidean distance,index struc-tures chosen by the dissimilarity function greatly re-duce the search cost.Unlike the previous works [SK97][ABKS98]that use one index structure based on the Euclidean distance,MSTT constructs various index structures,which allows the dissimilarity function to choose the structure well suited to the query matrices.This analysis demonstrates the e?ectiveness of MSTT.

6Conclusions

This paper described the Spatial Transformation Tech-nique (STT),which o?ers excellent performance when searching for adaptive ellipsoid queries.First,we an-alyzed the MBB-MBS approximation technique and discussed its problems.Then,based on this analysis,we showed how STT can overcome these problems.STT’s high level of performance is due to its use of spatial transformation.Since the spatial transforma-tion provides highly accurate approximations of the distance between query points and bounding rectan-gles,STT eliminates exact distance evaluations for most of the bounding rectangles accessed in the in-dex structures.The mechanism of STT is remarkably

Table3:Dissimilarity of matrices.

(a)Unit

w r1101001000

d=80.03176.4907999800214σ2

M d=2764.777933729.29e89.29e12

(b)W r=10

w r1101001000

d=871.2960194.9919892σ2

M d=27424670127814 1.30e9

(c)W r=1000

w r1101001000

d=874863419892196.010

σ2

M d=27 4.37e12 1.33e91320220

e?cient,especially for queries whose dimensionality or matrix?atness is high.This technique guarantees no false drops.In experiments using various matrices and index structures,STT was found to be superior to the conventional MBB-MBS approximation technique.

This paper also described the Multiple Spatial Transformation Technique(MSTT).MSTT adjusts tree structures to suit ellipsoid queries;the search al-gorithm utilizes the adjusted structures.This tech-nique reduces the number of page accesses as well as the CPU cost for ellipsoid queries.

MSTT can support ellipsoid queries e?ciently be-cause one or more index structures can be used.In the future,we plan to consider an algorithm that de-termines matrices from a log of user’s queries to create various indices.We will also create a matrix decision algorithm whose parameters are a log of queries and the number of indices that can be stored on disk.

References

[ABKS98]Mihael Ankerst,Bernhard Braunm¨u ller,Hans-Peter Kriegel,and Thomas Seidl:“Improving

Adaptable Similarity Query Processing by Us-

ing Approximations”,in Proc.of the24th Inter-

national Conference on Very Large Data Bases

(VLDB),pp.206–217,New York City,NY,Au-

gust1998.

[BKK96]Stefan Berchtold,Daniel A.Keim,and Hans-Peter Kriegel:“The X-tree:An Index Struc-

ture for High-Dimensional Data”,in Proc.

of the22nd International Conference on Very

Large Data Bases(VLDB),pp.28–39,Bombay,

September1996.

[BKSS90]Norbert Beckmann,Hans-Peter Kriegel,Ralf Schneider,and Bernhard Seeger:“The R*-tree:

An E?cient and Robust Access Method for

Points and Rectangles”,in Proc.ACM SIG-

MOD Conf.,pp.322–331,Atlantic City,NJ,

May1990.

[BO97]Tolga Bozkaya and Meral Ozoyoglu:“Distance-Based Indexing for High-Dimensional Metric

Spaces”,in Proc.ACM SIGMOD International

Conference on Management of Data,pp.357–

368,May1997.

[CPZ97]Paolo Ciaccia,Marco Patella,and Pavel Zezula:“M-tree:An E?cient Access Method for Simi-

larity Search in Metric Spaces”,in Proc.of the

23rd International Conference on Very Large

Data Bases(VLDB),pp.426—435,Athens,

August1997.

[FSA+95]M.Flickner,H.S.Sawhney,J.Ashley, Q.Huang, B.Dom,M.Gorkani,J.Hafner,

D.Lee,D.Petkovic,D.Steele,and P.Yanker:

“Query by image and video content:the QBIC

system”,IEEE Computer,Vol.28,No.9,pp.

23–32,September1995.

[GG97]Jim Gray and Goetz Graefe:“The Five-Minute Rule Ten Years Later and Other Computer

Storage Rules of Thumb”,SIGMOD Record,

Vol.26,No.4,pp.63–68,December1997. [GG98]Volker Gaede and Oliver G¨u nther:“Multidi-mensional Access Methods”,ACM Computing

Surveys,Vol.30,No.2,pp.170–231,June1998. [HS95]G.R.Hjaltason and H.Samet:“Ranking in Spatial Databases”,in Proceedings of the4th

Symposium on Spatial Databases,pp.83–95,

Portland,Maine,August1995.

[HSE+95]James L.Hafner,Harpreet S.Sawhney, William Equitz,Myron Flickner,and Wayne

Niblack:“E?cient Color Histogram Indexing

for Quadratic Form Distance Functions”,IEEE

Trans.on Pattern Analysis and Machine Intel-

ligence,Vol.17,No.7,pp.729–736,July1995. [ISF98]Yoshiharu Ishikawa,Ravishankar Subramanya, and Christos Faloutsos:“MindReader:Query-

ing databases through multiple examples”,in

Proc.of the24th International Conference on

Very Large Data Bases(VLDB),pp.218–227,

New York City,NY,August1998.

[RHM97]Y.Rui,T.S.Huang,and S.Mehrotra:“Content-based Image Retrieval with Rele-

vance Feedback in MARS”,in Proc.of IEEE

International Conference on Image Processing,

pp.II–815–818,October1997.

[RKV95]Nick Roussopoulos,Stephen Kelley,and Fr′e d′e ric Vincent:“Nearest Neighbor Queries”,

in Proc.ACM SIGMOD International Confer-

ence on Management of Data,pp.71–79,May

1995.

[SK97]Thomas Seidl and Hans-Peter Kriegel:“Ef-?cient User-Adaptable Similarity Search in

Large Multimedia Databases”,in Proc.of the

23rd International Conference on Very Large

Data Bases(VLDB),pp.506—515,Athens,

August1997.

[SYKU01]Yasushi Sakurai,Masatoshi Yoshikawa,Ry-oji Kataoka,and Shunsuke Uemura:“Similar-

ity Search for Adaptive Ellipsoid Queries Us-

ing Spatial Transformation”,Technical report,

Nara Institute of Science and Technology,2001. [SYUK00]Yasushi Sakurai,Masatoshi Yoshikawa,Shun-suke Uemura,and Haruhiko Kojima:“The A-

tree:An Index Structure for High-Dimensional

Spaces Using Relative Approximation”,in

Proc.of the26th International Conference on

Very Large Data Bases(VLDB),pp.516–526,

Cairo,Egypt,September2000.

EXCEL函数表(最全的函数大全)

函数大全一、数据库函数(13条) 二、日期与时间函数(20条) 三、外部函数(2条) 四、工程函数(39条) 五、财务函数(52条)

六、信息函数(9条) 七、逻辑运算符(6条) 八、查找和引用函数(17条) 九、数学和三角函数(60条)

十、统计函数(80条)

十一、文本和数据函数(28条)

一、数据库函数(13条) 1.DAVERAGE 【用途】返回数据库或数据清单中满足指定条件的列中数值的平均值。 【语法】DAVERAGE(database,field,criteria) 【参数】Database构成列表或数据库的单元格区域。Field指定函数所使用的数据列。Criteria为一组包含给定条件的单元格区域。 2.DCOUNT 【用途】返回数据库或数据清单的指定字段中,满足给定条件并且包含数字的单元格数目。 【语法】DCOUNT(database,field,criteria) 【参数】Database构成列表或数据库的单元格区域。Field指定函数所使用的数据列。Criteria为一组包含给定条件的单元格区域。 3.DCOUNTA 【用途】返回数据库或数据清单指定字段中满足给定条件的非空单元格数目。 【语法】DCOUNTA(database,field,criteria) 【参数】Database构成列表或数据库的单元格区域。Field指定函数所使用的数据列。Criteria为一组包含给定条件的单元格区域。 4.DGET 【用途】从数据清单或数据库中提取符合指定条件的单个值。 【语法】DGET(database,field,criteria) 【参数】Database构成列表或数据库的单元格区域。Field指定函数所使用的数据列。Criteria为一组包含给定条件的单元格区域。 5.DMAX 【用途】返回数据清单或数据库的指定列中,满足给定条件单元格中的最大数值。 【语法】DMAX(database,field,criteria) 【参数】Database构成列表或数据库的单元格区域。Field指定函数所使用的数据列。Criteria为一组包含给定条件的单元格区域。 6.DMIN 【用途】返回数据清单或数据库的指定列中满足给定条件的单元格中的最小数字。 【语法】DMIN(database,field,criteria) 【参数】Database构成列表或数据库的单元格区域。Field指定函数所使用的数据列。Criteria为一组包含给定条件的单元格区域。 7.DPRODUCT 【用途】返回数据清单或数据库的指定列中,满足给定条件单元格中数值乘积。 【语法】DPRODUCT(database,field,criteria) 【参数】同上

Excel常用函数及使用方法

excel常用函数及使用方法 一、数字处理 (一)取绝对值:=ABS(数字) (二)数字取整:=INT(数字) (三)数字四舍五入:=ROUND(数字,小数位数) 二、判断公式 (一)把公式返回的错误值显示为空: 1、公式:C2=IFERROR(A2/B2,"") 2、说明:如果是错误值则显示为空,否则正常显示。 (二)IF的多条件判断 1、公式:C2=IF(AND(A2<500,B2="未到期"),"补款","") 2、说明:两个条件同时成立用AND,任一个成立用OR函数。 三、统计公式 (一)统计两表重复 1、公式:B2=COUNTIF(Sheet15!A:A,A2) 2、说明:如果返回值大于0说明在另一个表中存在,0则不存在。 (二)统计年龄在30~40之间的员工个数 公式=FREQUENCY(D2:D8,{40,29} (三)统计不重复的总人数 1、公式:C2=SUMPRODUCT(1/COUNTIF(A2:A8,A2:A8)) 2、说明:用COUNTIF统计出每人的出现次数,用1除的方式把出现次数变成分母,然后相加。

(四)按多条件统计平均值 =AVERAGEIFS(D:D,B:B,"财务",C:C,"大专") (五)中国式排名公式 =SUMPRODUCT(($D$4:$D$9>=D4)*(1/COUNTIF(D$4:D$9,D$4:D$9))) 四、求和公式 (一)隔列求和 1、公式:H3=SUMIF($A$2:$G$2,H$2,A3:G3) 或=SUMPRODUCT((MOD(COLUMN(B3:G3),2)=0)*B3:G3) 2、说明:如果标题行没有规则用第2个公式 (二)单条件求和 1、公式:F2=SUMIF(A:A,E2,C:C) 2、说明:SUMIF函数的基本用法 (三)单条件模糊求和 说明:如果需要进行模糊求和,就需要掌握通配符的使用,其中星号是表示任意多个字符,如"*A*"就表示a前和后有任意多个字符,即包含A。 (四)多条求模糊求和 1、公式:=SUMIFS(C2:C7,A2:A7,A11&"*",B2:B7,B11) 2、说明:在sumifs中可以使用通配符* (五)多表相同位置求和 1、公式:=SUM(Sheet1:Sheet19!B2) 2、说明:在表中间删除或添加表后,公式结果会自动更新。

ElasticSearch面试题

1:es介绍 Elasticsearch是一个基于Lucene的实时的分布式搜索和分析引擎。设计用于云计算中, 能够达到实时搜索,稳定,可靠,快速,安装使用方便。基于RESTful接口。 普通请求是...get?a=1 rest请求....get/a/1 2:全文搜索的工具有哪些 Lucene Solr Elasticsearch 3:es的bulk的引用场景 1.bulk API可以帮助我们同时执行多个请求 2.create 和index的区别 如果数据存在,使用create操作失败,会提示文档已经存在,使用index则可以成功执行。 3.可以使用文件操作 使用文件的方式 vi requests curl -XPOST/PUT localhost:9200/_bulk --data-binary @request; bulk请求可以在URL中声明/_index 或者/_index/_type 4.bulk一次最大处理多少数据量 bulk会把将要处理的数据载入内存中,所以数据量是有限制的 最佳的数据量不是一个确定的数值,它取决于你的硬件,你的文档大小以及复杂性,你的索引以及搜索的负载 一般建议是1000-5000个文档,如果你的文档很大,可以适当减少队列,大小建议是 5-15MB,默认不能超过100M, 可以在es的配置文件中修改这个值http.max_content_length: 100mb 5.版本控制的一个问题 在读数据与写数据之间如果有其他线程进行写操作,就会出问题,es使用版本控制才避免这种问题。 在修改数据的时候指定版本号,操作一次版本号加1。 6.es的两个web访问工具

EXCEL的文本函数REPLACE、FIND和SEARCH

EXCEL的文本函数REPLACE、FIND和SEARCH 文本中的LEFT、MID、RIGHT、LEN、LENB、SUBSTITUTE有印象不?最后一个类似于替换,你也用过替换命令的,很多时候需要对某个文本中的部分内容进行替换,除了EXCEL原本具有的查找替换功能以外还可以用文本替换函数,而SUBSTITUTE 就是替换函数之一,如果你知道要替换的字符是什么但不知道该字符在文本中的具体位置就可以使用该函数,比如:你好你们好你好,要把你好替换成你不好,条件是被替换的字符串是你好,要替换成你不好,文本中我不确定具体位置就用SUBSTITUTE公式=SUBSTITUTE(A1,"你好","你不好"),字符串:你好你们好你好中,我要把第二个你好替换成你不好呢?也就是说第一个你好我不替换,只替换第二个出现的你好,还是使用SUBSTITUTE,因为我不知道第二个你好的位置在哪 =SUBSTITUTE(A1,"你好","你不好",2)后面的数字2是该函数的第四个参数,代 表替换位置,也就是替换第二个你好,通过组合我们可以完成一些小应用,所以SUBSTITUTE的替换作用还是挺实用的,但我们如何知道一个文本中有几个指定 的字符?这是一个小应用 比如这一串文本中有几个好字?如果知道这个文本中有几个好字? 比如里面有三个好,如果用公式算出来?如何用公式来 整出来?比如原来的字符宽是多少?再判断替换掉好字的文本的宽度,再相减,不就是结果了?不用替换成空格,替换成空 第一步替换好为无=SUBSTITUTE(A1,"好",),第二步判断已替换的字符长 =len(SUBSTITUTE(A1,"好",))第三步用原来的字符长去减 =len(a1)-len(SUBSTITUTE(A1,"好",))得到最后的答案,len是判断字符长度的,这上次已讲过了,赫赫。=len("abc")结果为3代表abc文本中有三个字符len("中华人民共和国")结果是7代表有7个字符,上次讲了LEFT、MID、RIGHT、LEN、LENB、SUBSTITUTE,len是判断字符长度的,一个字母,一个数字,一个汉字都是 一个字符,lenb是判断字节长的,一个半角字母或数字为一个字节,一个汉字是 两个字节,这个可以看上次的聊天记录吧。现在我来讲一个REPLACE,REPLACE 也是替换函数,他的参数描述是replace(原文本,第几个字符,宽度,替换成的新文本)比如A1="abcde"我要把A1的第2和第3个字符变成x,也就是我要把abcde 变成axde,用replace函数就是=replace(a1,2,2,"x")从a1的第2个位置开始,向右截2个字符宽度,以x来替换,replace一般用在已知道具体的替换位置的 应用

excel 函数的公式语法和用法

SUMIF 函数的公式语法和用法。 说明 使用SUMIF函数可以对区域中符合指定条件的值求和。例如,假设在含有数字的某一列中,需要让大于5 的数值相加,请使用以下公式: =SUMIF(B2:B25,">5") 在本例中,应用条件的值即要求和的值。如果需要,可以将条件应用于某个单元格区域,但却对另一个单元格区域中的对应值求和。例如,使用公式=SUMIF(B2:B5, "John", C2:C5)时,该函数仅对单元格区域C2:C5 中与单元格区域B2:B5 中等于“John”的单元格对应的单元格中的值求和。注释若要根据多个条件对若干单元格求和,请参阅SUMIFS 函数。 语法 SUMIF(range, criteria, [sum_range]) SUMIF函数语法具有以下参数: range必需。用于条件计算的单元格区域。每个区域中的单元格都必须是数字或名称、数组或包含数字的引用。空值和文本值将被忽略。 criteria必需。用于确定对哪些单元格求和的条件,其形式可以为数字、表达式、单元格 引用、文本或函数。例如,条件可以表示为32、">32"、B5、32、"32"、"苹果" 或TODAY()。 要点任何文本条件或任何含有逻辑或数学符号的条件都必须使用双引号(") 括起来。如果条件为数字,则无需使用双引号。

sum_range可选。要求和的实际单元格(如果要对未在range 参数中指定的单元格求和)。 如果sum_range参数被省略,Excel 会对在range参数中指定的单元格(即应用条件的单元格)求和。 注释 sum_range 参数与range参数的大小和形状可以不同。求和的实际单元格通过以下方法确定:使用sum_range参数中左上角的单元格作为起始单元格,然后包括与range参数大小和形状相对应的单元格。例如: 如果区域是并且sum_range 是则需要求和的实际单元格是 A1:A5 B1:B5 B1:B5 A1:A5 B1:B3 B1:B5 A1:B4 C1:D4 C1:D4 A1:B4 C1:C2 C1:D4 可以在criteria参数中使用通配符(包括问号(?) 和星号(*))。问号匹配任意单个字符; 星号匹配任意一串字符。如果要查找实际的问号或星号,请在该字符前键入波形符(~)。示例 示例1 如果将示例复制到一个空白工作表中,可能会更容易理解该示例。 如何复制示例? 1.选择本文中的示例。

elasticsearch学习文档

1.全文搜索引擎elasticsearch 1.1.Elasticsearch简介 Elasticsearch是开源的,分布式的,提供rest接口,支持云端调用的,构建在Apache Lucene之上的搜索引擎。 1.2.优点&缺点 优点:开箱即用,分布式,rest 接口,支持云端调用。 缺点:没有大量商业产品应用。分片的数目不能动态调整,只能在初始化索引的时候指定。 2.E lasticsearch的安装 2.1.运行环境 JDK6以上 2.2.下载Elasticsearch 为了更好的对中文进行分词,减少配置问题,下载集成分词的elasticsearch-rtf(基于elasticsearch 0.90.0,目前elasticsearch更新到0.90.5)版本。Rtf集成了ik、mmseg分词以及searchwrapper、thrift等插件。 什么是ElasticSearch-RTF? RTF是Ready To Fly的缩写,在航模里面,表示无需自己组装零件即可直接上手即飞的航空模型,elasticsearch-RTF是针对中文的一个发行版,即使用最新稳定的elasticsearch版本,并且帮你下载测试好对应的插件,如中文分词插件等,还会帮你做好一些默认的配置,目的是让你可以下载下来就可以直接的使用。下载地址如下:https://https://www.wendangku.net/doc/811411714.html,/medcl/elasticsearch-rtf

注释:分词是用于模糊匹配的时候,是把一段话当成词语还是当成单个字来搜索的规则。 2.3.安装 解压elasticsearch-rtf-mast.zip到你指定的目录下即可。 2.4.运行 2.4.1.启动服务 cd/usr/local/elasticsearch/bin/service ./elasticsearch start 第一次启动服务后,在/usr/local/elasticsearch目录生成data目录和logs目录2.4.2.停止服务 cd/usr/local/elasticsearch/bin/service ./elasticsearch stop 3.e lasticsearch配置文件详解 elasticsearch.yml配置文件内容较多,挑几个可能会用的说一下。 https://www.wendangku.net/doc/811411714.html,: elasticsearch 配置es的集群名称,默认是elasticsearch,es会自动发现在同一网段下的es,如果在同一网段下有多个集群,就可以用这个属性来区分不同的集群。 https://www.wendangku.net/doc/811411714.html,: "Franz Kafka" 节点名,默认随机指定一个name列表中名字,该列表在es的jar包中config文件夹里name.txt 文件中,其中有很多作者添加的有趣名字。 node.master: true 指定该节点是否有资格被选举成为node,默认是true,es是默认集群中的第一台机器为master,如果这台机挂了就会重新选举master。 network.bind_host: 192.168.0.1

ElasticSearch使用手册

ElasticSearch使用手册 一、ElasticSearch简介 1.1.什么是ElasticSearch ElasticSearch(以下均检查ES)是Compass(基于Lucene开源项目)作者Shay Banon在2010年发布的高性能、实时、分布式的开源搜索引擎。后来成立了ElasticSearch公司,负责ES相关产品的开发及商用服务支持,ES依旧采用免费开源模式,但部分插件采用商用授权模式,例如Marvel插件(负责ES的监控管理)、Shield插件(提供ES的授权控制)。 1.2.ElasticSearch的基础概念 ?Collection 在SolrCloud集群中逻辑意义上的完整的索引。它常常被划分为一个或多个Shard,它们使用相同的Config Set。如果Shard数超过一个,它就是分布式索引,SolrCloud让你通过Collection名称引用它,而不需要关心分布式检索时需要使用的和Shard相关参数。 ?Config Set Solr Core提供服务必须的一组配置文件。每个config set有一个名字。最小需要包括solrconfig.xml (SolrConfigXml)和schema.xml (SchemaXml),除此之外,依据这两个文件的配置内容,可能还需要包含其它文件。它存储在Zookeeper中。Config sets可以重新上传或者使用upconfig命令更新,使用Solr的启动参数bootstrap_confdir指

定可以初始化或更新它。 ?Core Core也就是Solr Core,一个Solr中包含一个或者多个Solr Core,每个Solr Core可以独立提供索引和查询功能,每个Solr Core对应一个索引或者Collection的Shard,Solr Core的提出是为了增加管理灵活性和共用资源。在SolrCloud中有个不同点是它使用的配置是在Zookeeper中的,传统的Solr core的配置文件是在磁盘上的配置目录中。 ?Leader 赢得选举的Shard replicas。每个Shard有多个Replicas,这几个Replicas需要选举来确定一个Leader。选举可以发生在任何时间,但是通常他们仅在某个Solr实例发生故障时才会触发。当索引documents时,SolrCloud会传递它们到此Shard对应的leader,leader 再分发它们到全部Shard的replicas。 ?Replica Shard的一个拷贝。每个Replica存在于Solr的一个Core中。一个命名为“test”的collection以numShards=1创建,并且指定replicationFactor设置为2,这会产生2个replicas,也就是对应会有2个Core,每个在不同的机器或者Solr实例。一个会被命名为test_shard1_replica1,另一个命名为test_shard1_replica2。它们中的一个会被选举为Leader。 ?Shard

elasticSearch

ElasticSearch:可扩展的开源弹性搜索解决方案 开源的分布式搜索引擎支持时间时间索引和全文检索。 索引:index 存放数据 类型:type 区分储存的对象 文档:document 储存的主要实体 页面: field 角色关系对照 elasticsearch 跟 MySQL 中定义资料格式的角色关系对照表如下 MySQL elasticsearch database index table type table schema mapping row document field field http://localhost:9200/mishu_index/hunanzhaobiaowang/ _search?q=title:嘉禾县基本烟田土地整理施工 ElasticSearch官网:https://www.wendangku.net/doc/811411714.html,/ 先上一张elasticsearch的总体框架图:

ElasticSearch是基于Lucene开发的分布式搜索框架,包含如下特性: 分布式索引、搜索 索引自动分片、负载均衡 自动发现机器、组建集群 支持Restful 风格接口 配置简单等。 下图是ElasticSearch的第三方插件管理工具,通过它可以很清晰的看到它索引分布的情况:哪块分布在那里,占用空间多少都可以看到,并且可以管理索引。

当一台机挂了时,整个系统会对挂机里的内容重新分配到其它机器上,当挂掉的机重新加 入集群时,又会重新把索引分配给它。当然,这些规则都是可以根据参数进行设置的,非 常灵活。ElasticSearch是先把索引的内容保存到内存之中,当内存不够时再把索引持久化 到硬盘中,同时它还有一个队列,是在系统空闲时自动把索引写到硬盘中。 的后端存储方式可以有一下四种: 1. 像普通的 Lucene 索引,存储在本地文件系统中; 2. 存储在分布式文件系统中,如 freeds; 3. 存储在 Hadoop 的 hdfs中; 4. 存储在亚马逊的 S3 云平台中。 它支持插件机制,有丰富的插件。比如和 mongoDB、couchDB 同步的river 插件,分词插件,Hadoop 插件,脚本支持插件等。 下面介绍elasticsearch的几个概念: cluster 代表一个集群,集群中有多个节点,其中有一个为主节点,这个主节点是可以通过选举产 生的,主从节点是对于集群内部来说的。es 的一个概念就是去中心化,字面上理解就是无 中心节点,这是对于集群外部来说的,因为从外部来看 es 集群,在逻辑上是个整体,与 任何一个节点的通信和与整个es 集群通信是等价的。在配置文件中可以配置集群的名字,在同一局域网内的机器,配置相同的cluster名字,将会自动组建集群,不需要其它特殊配置。 shards

Excel函数之Search函数

江西省南昌市2015-2016学年度第一学期期末试卷 (江西师大附中使用)高三理科数学分析 一、整体解读 试卷紧扣教材和考试说明,从考生熟悉的基础知识入手,多角度、多层次地考查了学生的数学理性思维能力及对数学本质的理解能力,立足基础,先易后难,难易适中,强调应用,不偏不怪,达到了“考基础、考能力、考素质”的目标。试卷所涉及的知识内容都在考试大纲的范围内,几乎覆盖了高中所学知识的全部重要内容,体现了“重点知识重点考查”的原则。 1.回归教材,注重基础 试卷遵循了考查基础知识为主体的原则,尤其是考试说明中的大部分知识点均有涉及,其中应用题与抗战胜利70周年为背景,把爱国主义教育渗透到试题当中,使学生感受到了数学的育才价值,所有这些题目的设计都回归教材和中学教学实际,操作性强。 2.适当设置题目难度与区分度 选择题第12题和填空题第16题以及解答题的第21题,都是综合性问题,难度较大,学生不仅要有较强的分析问题和解决问题的能力,以及扎实深厚的数学基本功,而且还要掌握必须的数学思想与方法,否则在有限的时间内,很难完成。 3.布局合理,考查全面,着重数学方法和数学思想的考察 在选择题,填空题,解答题和三选一问题中,试卷均对高中数学中的重点内容进行了反复考查。包括函数,三角函数,数列、立体几何、概率统计、解析几何、导数等几大版块问题。这些问题都是以知识为载体,立意于能力,让数学思想方法和数学思维方式贯穿于整个试题的解答过程之中。 二、亮点试题分析 1.【试卷原题】11.已知,,A B C 是单位圆上互不相同的三点,且满足AB AC → → =,则A BA C →→ ?的最小值为( ) A .1 4- B .12- C .34- D .1-

sumifs函数多条件求和实例

s u m i f s函数多条件求和实 例 Prepared on 22 November 2020

sumifs函数多条件求和实例 内容提要:文章首先介绍sumifs函数基本用法,然后以一个综合的实例来剖析sumifs函数的详细深入使用。 第一部分,sumifs函数用法介绍 excel中sumifs函数是2007以后版本新增的多条件求和函数。 sumifs函数的语法是:SUMIFS(求和区域,条件区域1,条件1,[条件区域2,条件2],...) 说明:[]以内的条件区域2、条件2为可选参数。最多允许127个区域/条件对。 第二部分,sumifs函数实例介绍 项目一:客户A的销售额 =SUMIFS(C2:C10,A2:A10,A2) 项目二:客户A的1月份销售额 =SUMIFS(C2:C10,A2:A10,A2,B2:B10,B2) 项目三:客户A的1月份和3月份销售额 =SUM(SUMIFS(C2:C10,A2:A10,A2,B2:B10,{1,3})) 项目四:客户A和C的销售额 =SUM(SUMIFS(C2:C10,A2:A10,{"A","C"})) 项目五:客户A和C的1月份销售额合计 =SUM(SUMIFS(C2:C10,A2:A10,{"A","C"},B2:B10,B2)) 项目六:客户A的1月份和客户C的3月份销售额合计 =SUM(SUMIFS(C2:C10,A2:A10,{"A","C"},B2:B10,{1,3})) 项目七:客户A和客户C的1月份\3月份\4月份销售额合计 =SUM(SUMIFS(C2:C10,A2:A10,{"A","C"},B2:B10,{1;3;4}))

Elasticsearch权威指南(中文版)

Elasticsearch 权威指南(中文版) 1、入门 Elasticsearch是一个实时分布式搜索和分析引擎。它让你以前所未有的速度 处理大数据成为可能。 它用于全文搜索、结构化搜索、分析以及将这三者混合使用: 维基百科使用Elasticsearch提供全文搜索并高亮关键字,以及输入实时搜索(search-as-you-type)和搜索纠错(did-you-mean)等搜索建议功能。 英国卫报使用Elasticsearch结合用户日志和社交网络数据提供给他们的编辑以实时的反馈,以便及时了解公众对新发表的文章的回应。StackOverflow结合全文搜索与地理位置查询,以及more-like-this功能来找到相关的问题和答案。 Github使用Elasticsearch检索1300亿行的代码。 但是Elasticsearch不仅用于大型企业,它还让像DataDog以及Klout这样的创业公司将最初的想法变成可扩展的解决方案。Elasticsearch可以在你的笔记本上运行,也可以在数以百计的服务器上处理PB级别的数据。Elasticsearch所涉及到的每一项技术都不是创新或者革命性的,全文搜索, 分析系统以及分布式数据库这些早就已经存在了。它的革命性在于将这些独立且有用的技术整合成一个一体化的、实时的应用。它对新用户的门槛很低,当然它也会跟上你技能和需求增长的步伐。 如果你打算看这本书,说明你已经有数据了,但光有数据是不够的,除非你能对这些数据做些什么事情。

很不幸,现在大部分数据库在提取可用知识方面显得异常无能。的确,它们能够通过时间戳或者精确匹配做过滤,但是它们能够进行全文搜索,处理同义词和根据相关性给文档打分吗?它们能根据同一份数据生成分析和聚合的结果吗?最重要的是,它们在没有大量工作进程(线程)的情况下能做到对数据的实时处理吗? 这就是Elasticsearch存在的理由:Elasticsearch鼓励你浏览并利用你的数 据,而不是让它烂在数据库里,因为在数据库里实在太难查询了。Elasticsearch是你新认识的最好的朋友。 1.1、是什么 为了搜索,你懂的 Elasticsearch是一个基于Apache Lucene(TM)的开源搜索引擎。无论在开源还是专有领域,Lucene可以被认为是迄今为止最先进、性能最好的、功能最全的搜索引擎库。 但是,Lucene只是一个库。想要使用它,你必须使用Java来作为开发语言并将其直接集成到你的应用中,更糟糕的是,Lucene非常复杂,你需要深入了解检索的相关知识来理解它是如何工作的。 Elasticsearch也使用Java开发并使用Lucene作为其核心来实现所有索引和搜索的功能,但是它的目的是通过简单的RESTful API来隐藏Lucene的复杂性,从而让全文搜索变得简单。 不过,Elasticsearch不仅仅是Lucene和全文搜索,我们还能这样去描述它: ?分布式的实时文件存储,每个字段都被索引并可被搜索 ?分布式的实时分析搜索引擎 ?可以扩展到上百台服务器,处理PB级结构化或非结构化数据

【优质文档】sumif函数的使用方法word版本 (2页)

【优质文档】sumif函数的使用方法word版本 本文部分内容来自网络整理,本司不为其真实性负责,如有异议或侵权请及时联系,本司将立即删除! == 本文为word格式,下载后可方便编辑和修改! == sumif函数的使用方法 sumif函数的使用方法 使用SUMIF函数可以对区域(区域:工作表上的两个或多个单元格。区域中的单元格可以相邻或不相邻。)中符合指定条件的值求和。例如,假设在含有数字 的某一列中,需要让大于5的数值相加,请使用以下公式: =SUMIF(B2:B25,">5") 在本例中,应用条件的值即要求和的值。如果需要,可以将条件应用于某个单 元格区域,但却对另一个单元格区域中的对应值求和。例如,使用公式 =SUMIF(B2:B5,"俊元",C2:C5)时,该函数仅对单元格区域C2:C5中与单元格区 域B2:B5中等于“俊元”的单元格对应的单元格中的值求和。 注释若要根据多个条件对若干单元格求和,请参阅SUMIFS函数。语法 SUMIF(range,criteria,[sum_range]) SUMIF函数语法具有以下参数(参数:为操作、事件、方法、属性、函数或过程 提供信息的值。):range必需。用于条件计算的单元格区域。每个区域中的 单元格都必须是数字或名称、数组或包含数字的引用。空值和文本值将被忽略。criteria必需。用于确定对哪些单元格求和的条件,其形式可以为数字、表达式、单元格引用、文本或函数。例如,条件可以表示为32、">32"、B5、32、"32"、"苹果"或TODAY()。 要点任何文本条件或任何含有逻辑或数学符号的条件都必须使用双引号(")括起来。如果条件为数字,则无需使用双引号。sum_range可眩要求和的实际单元 格(如果要对未在range参数中指定的单元格求和)。如果sum_range参数被 省略,Excel会对在range参数中指定的单元格(即应用条件的单元格)求和。 注释sum_range参数与range参数的大小和形状可以不同。求和的实际单元格 通过以下方法确定:使用sum_range参数中左上角的单元格作为起始单元格, 然后包括与range参数大小和形状相对应的单元格。例如:如果区域是并且 sum_range是则需要求和的实际单元格是 A1:A5B1:B5B1:B5A1:A5B1:B3B1:B5A1:B4C1:D4C1:D4A1:B4C1:C2C1:D4可以在criteria参数中使用通配符(包括问号(?)和星号(*))。问号匹配任意单个字符;星号匹配任意一串字符。如果要查找实际的问号或星号,请在该字符前键 入波形符(~)。注解使用SUMIF函数匹配超过255个字符的字符串时,将返回不正确的结果#VALUE!。示例示例1

Excel中sumif和sumifs函数进行条件求和的用法

Excel中sumif和sumifs函数进行条件求和的用法 sumif和sumifs函数是Excel2007版本以后新增的函数,功能十分强大,实用性很强,本文介绍下Excel中通过用sumif和sumifs函数的条件求和应用,并对函数进行解释,希望大家能够掌握使用技巧。 工具/原料 Excel 2007 sumif函数单条件求和 1. 1 以下表为例,求数学成绩大于(包含等于)80分的同学的总分之和 2. 2 在J2单元格输入=SUMIF(C2:C22,">=80",I2:I22)

3. 3 回车后得到结果为2114,我们验证一下看到表中标注的总分之和与结果一致 4. 4 那么该函数什么意思呢?SUMIF(C2:C22,">=80",I2:I22)中的C2:C22表示条件数据列,">=80"表示筛选的条件是大于等于80,那么最后面的I2:I22就是我们要求的总分之和

END sumifs函数多条件求和 1. 1 还是以此表为例,求数学与英语同时大于等于80分的同学的总分之和 2. 2 在J5单元格中输入函数=SUMIFS(I2:I22,C2:C22,">=80",D2:D22,">=80")

3. 3 回车后得到结果1299,经过验证我们看到其余标注的总分之和一致 4. 4 该函数SUMIFS(I2:I22,C2:C22,">=80",D2:D22,">=80")表示的意思是,I2:I22是求和列,C2:C22表示数学列,D2:D22表示英语列,两者后面的">=80"都表示是大于等于80

END 注意 1. 1 sumif和sumifs函数中的数据列和条件列是相反的,这点非常重要,千万不要记错咯

LISP函数(分类)大全

AutoLisp函数 一、数学运算功能函数 1.l(十数值数值…)返回:累计实数或整数数值 1.2(一数值数值…)返回:差值 1.3(* 数值数值…)返回:所有数值乘积 1.4(/ 数值数值…)返回:第一个数值除以第二个以后数值的商 1.5(l十数值)返回:数值十l l. 6(1—数值)返回:数值一l l.7(abs 数值)返回:数值的绝对值 1.8(atan 数值)返回:反正切值 1.9(cos 角度)返回:角度的余弦值,角度值为弧度 1.10(exp 数值)返回:数值的指数 1.11(expt 底数指数)返回:底数的指数值 1.12(fix 数值)返回:将数值转换为整数值 1.14(gcd 数值1 数值2)返回:两数值的最大公因数 1.15(log 数值)返回:数值的自然对数值 1.16(max 数值数值…)返回:数值中的最大值 1.17(min 数值数值…)返回:数值中的最小值 1.18 pi 常数∏,其值约为3.1415926 1.19(rem 数值 1数值 2)返回:M数值的相除的余数 l.20(sin 角度)返回:角度的正旋值,角度值为弧度 1.21(sqrt 数值)返回:数值的平方根 二、检验与逻辑运算功能函数 2.l(= 表达式1 表达式2)比较表达式1是否等于式2,适用数值及字符串 2.2 (/= 表达式1 表达式2)比较表达式1是否大于等于表达式2 2.3(<表达式1 表达式2) 比较表达式1是否<小于表达式2 2.4(<= 表达式1 表达式2)比较表达式1是否<一小于等于表达式2 2.5(>表达式1 表达式2)比较表达式1是否>大于表达式2 2.6(>= 表达式1 表达式2)比较表达式1是否大于等于表达式2 2.7 (~数值)返回:数值的位 not值,(1的补码) 2.8 (and 表达式1 表达式2…)返回:逻辑and的结果 2.9(boole 函数整数整数…)返回:位式布尔运算AutoLisp函数2/8 2.10(eq 表达式1 表达式2)比较表达式1与表达式2是否相同,适用列表比较(实际相同) 2.11(equal 表达式 1表达式 2[差量])比较表达式 1与表达式 2是否相同,差量可省略(内容相同) 三、转换运算功能函数 3.l(angtof 字符串[模式])返回:角度值的字符串转成实数 3.2(angtos 角度[模式[精度]])返回:角度转成的字符串值 3.3(atof 字符串)返回:字符串转成实数值 3.4 (atoi 字符串)返回:字符串转成整数值

EXCEL中每个函数代表的含义

EXCEL中每个函数代表的含义 第2章日期和时间函数25 日期和时间函数基础26 TODAY返回当前日期30 NOW返回当前的日期和时间33 DATB返回特定日期的年、月、日35 DATEVALU返回文本字符串所代表的日期序列号39 Y EAR返回某日期对应的年份42 MONT返回某日期对应的月份44 DAY返回某日期对应当月的天数46 TIME返回某一特定时间的小数值49 TIMEVALU返回文本字符串所代表的时间小数值52 HOU返回时间值的小时数55 MINUTE返回时间值中的分钟58 SECOND回时间值的秒数61 WEEKDA返回某日期为星期几63 WEEKNU返回代表一年中第几周的一个数字66 EDAT返回指定月数之前或之后的日期69 EOMONTH回指定日期之前或之后月份的最后一天的日期71 WORKDAY回某日期之前或之后相隔指定工作日的某一日期的日期值 73

NETWORKDA返回开始日期和结束日期之间完整的工作日数值76 DAYS360按照一年360天计算,返回两日期间相差的天数79 Y EARFRA返回开始日期和结束日期之间的天数占全年天数的百分比81 第3章逻辑函数84 IF根据指定的条件返回不同的结果85 AND判定指定的多个条件是否全部成立87 OR判定指定的任一条件是为真,即返回真90 NOT对其参数的逻辑值求反93 TRUE返回逻辑值TRUE 95 FALSE返回逻辑值FALSE 96 IFERRO捕获和处理公式中的错误97 第4章信息函数99 CELL返回引用单元格信息100 ERROR.T YP返回对应错误类型数值103 INFO返回与当前操作环境有关的信息106 N返回转换为数字后的值109 NA返回错误值110 TYPE返回表示值的数据类型的数字112 ISERR判断# N/A以外的错误值114 ISERRO判断错误值115

sumif函数与sumifs函数

1.Sumif函数的基础用法和注意事项 Excel中,单条件求和使用比较广泛,但大部分人习惯用透视表。如果只是求有限的条件,且原始数据比较庞大,这时用透视表,透视过程占用内存,速度缓慢,最后还要筛选,显得繁杂。所以,掌握sumif函数显得很有必要。很多人对这个函数还是比较陌生的,毕竟有三个参数。今天简要介绍下,相信大家看完后,一定会惊呼:原来这么简单啊,是的,就这么简单。

需要注意的是,函数虽然简单,但实际上,容易出现这个现象:用这个公式计算,公式确实没错,但结果和原数据中手工筛选出来的数据核对,结果不一样。主要原因有:一是没搞清楚绝对引用和相对引用,导致下拉公式时,需要固定的数据区域发生了变化;二是原始表格的条件区域表格不规范,如上述城市中,部分城市后面或者前面有空格,这样公式得出的结果肯定不一样,因此可以用trim函数去掉空格,这个在vlookup函数中也会存在类似现象,需要引起大家的注意 2.Sumifs函数的基础用法和注意事项 sumifs函数功能十分强大,可以通过不同范围的条件求规定范围的和,且可以用来进行多条件求和,本文在解释语法以后再展示两个实例,以便大家更好理解sumifs函数。 sumifs函数语法 sumifs(sum_range,criteria_range1,criteria1,[riteria_range2,criteria2]...) sum_range是我们要求和的范围 criteria_range1是条件的范围 criteria1是条件 后面的条件范围和条件可以增加。 详细用法请看实例 下面这张成绩单为例,演示sumifs函数用法, 先求男生的语文成绩之和 在G2单元格输入公式=SUMIFS(C2:C8,B2:B8,"男") 得到结果是228,我们看图中男生成绩得分之和与公式得到的结果一致。 再求语文和数学得分都大于等于90分的学生总分之和 在G4单元格输入公式=SUMIFS(F2:F8,C2:C8,">=90",D2:D8,">=90") 7 看到图中语文和数学都大于等于90分的学生只有一个同学,他的总分就是247分,与公式求得的结果完全一致。 补充知识点:offset函数问题。这个函数相对有点难度。完整的说一共有五个参数。函数速成宝典第88课:Offset函数实现动态查询功能。OFFSET(reference,rows,cols,height,width). OFFSET(起始单元格或区域,向下偏移几行,向右偏移几列,返回几行,返回几列)。在这里,大家要特别注意的是:第2和第3个参数如果都是0,起始点包含本行或本列;如果第2和第3个参数为1,起始点不包含本行或本列,就往下偏移一行;第4和第5个参数如果是1,起始点是包含本行和本列。大家改动下第88课素材文件中的SUM(OFFSET(K11,1,1,4,2))公式中的参数看看,就什么都明白了。 二、column函数和columns函数的问题,两者是有区别的。大家看下第27课:Average与

ElasticSearch5.1 基本概念和配置详解

二、几个基本概念 接近实时(NRT) Elasticsearch 是一个接近实时的搜索平台。这意味着,从索引一个文档直到这个文档能够被搜索到有一个很小的延迟(通常是1 秒)。 集群(cluster) 代表一个集群,集群中有多个节点(node),其中有一个为主节点,这个主节点是可以通过选举产生的,主从节点是对于集群内部来说的。es的一个概念就是去中心化,字面上理解就是无中心节点,这是对于集群外部来说的,因为从外部来看es集群,在逻辑上是个整体,你与任何一个节点的通信和与整个es集群通信是等价的。 索引(index)

ElasticSearch将它的数据存储在一个或多个索引(index)中。用SQL领域的术语来类比,索引就像数据库,可以向索引写入文档或者从索引中读取文档,并通过ElasticSearch内部使用Lucene将数据写入索引或从索引中检索数据。文档(document) 文档(document)是ElasticSearch中的主要实体。对所有使用ElasticSearch 的案例来说,他们最终都可以归结为对文档的搜索。文档由字段构成。 映射(mapping) 所有文档写进索引之前都会先进行分析,如何将输入的文本分割为词条、哪些词条又会被过滤,这种行为叫做映射(mapping)。一般由用户自己定义规则。类型(type) 每个文档都有与之对应的类型(type)定义。这允许用户在一个索引中存储多种文档类型,并为不同文档提供类型提供不同的映射。 分片(shards) 代表索引分片,es可以把一个完整的索引分成多个分片,这样的好处是可以把一个大的索引拆分成多个,分布到不同的节点上。构成分布式搜索。分片的数量只能在索引创建前指定,并且索引创建后不能更改。 副本(replicas) 代表索引副本,es可以设置多个索引的副本,副本的作用一是提高系统的容错性,当个某个节点某个分片损坏或丢失时可以从副本中恢复。二是提高es的查询效率,es会自动对搜索请求进行负载均衡。 数据恢复(recovery)

excel函数说明 必看

数据库和列表管理函数 DAVERAGE 返回选定数据库项的平均值 DCOUNT 计算数据库中包含数字的单元格个数 DCOUNTA 计算数据库中非空单元格的个数 DGET 从数据库中提取满足指定条件的单个记录 DPRODUCT 将数据库中满足条件的记录的特定字段中的数值相乘DSUM 对数据库中满足条件的记录的字段列中的数字求和GETPIVOTDATA 返回存储于数据透视表中的数据 日期和时间函数 DATE 返回特定日期的序列号 DATEVALUE 将文本格式的日期转换为序列号 DAY 将序列号转换为月份中的日 EDATE 返回在开始日期之前或之后指定月数的日期的序列号 MONTH 将序列号转换为月 NOW 返回当前日期和时间的序列号 TIMEVALUE 将文本格式的时间转换为序列号 TODAY 返回今天日期的序列号 WEEKDAY 将序列号转换为星期几 WEEKNUM 将序列号转换为一年中相应的周数 逻辑函数IF 指定要执行的逻辑检测 数学和三角函数 CEILING 将数字舍入为最接近的整数,或最接近的有效数字的倍数ROUND 将数字舍入到指定位数 SUBTOTAL 返回数据库列表或数据库中的分类汇总 SUMIF 按给定条件将指定单元格求和 SUMPRODUCT 返回相对应的数组部分的乘积和 TRUNC 将数字截尾取整 信息函数 CELL 返回有关单元格格式、位置或内容的信息 ISBLANK 如果值为空,则返回TRUE ISNA 如果值为#N/A 错误值,则返回TRUE 查找和引用函数 ADDRESS 以文本形式返回对工作表中某个单元格的引用 CHOOSE 从值的列表中选择一个值 HLOOKUP 在数组的首行查找并返回指定单元格的值 LOOKUP 在向量或数组中查找值 TRANSPOSE 返回数组的转置 VLOOKUP 在数组第一列中查找,然后在行之间移动以返回单元格的值

相关文档