

Cache Updates in a Peer-to-Peer Network of Mobile Agents

Elias Leontiadis, Vassilios V. Dimakopoulos, Evaggelia Pitoura
Department of Computer Science, University of Ioannina

Ioannina, Greece, GR-45110

{ilias,dimako,pitoura}@cs.uoi.gr

Abstract

In open multi-agent systems, agents need resources provided by other agents but they are not aware of which agents provide particular resources. We consider a peer-to-peer approach, in which each agent maintains a local cache with information about k resources, that is, for each of the k resources, an agent that provides it. However, when an agent or a resource moves, cache entries become obsolete. We propose a suite of cache update policies that combine pull-based invalidation, initiated by the agent that maintains the cache, with push-based invalidation, initiated by the agent that moves. We study and compare variations of oblivious flooding-based push/pull along with an informed push approach where each agent maintains a list of the agents that have it cached. Our experimental results indicate that a novel variation of flooding for push, where a moving agent propagates its new location to agents in its old neighborhood, achieves good cache consistency with a small message overhead. The proposed policies are suitable for any peer-to-peer system where peers cache information about other peers and this information becomes obsolete.

1. Introduction

A multi-agent system is a network of software agents that cooperate to solve problems. In open multi-agent systems, the agents that need resources provided by other agents are not aware of which agents provide the particular resources. Most solutions to this problem are based on a central directory that maintains a mapping between agents and resources [14]. However, such solutions do not scale well, since the central directory becomes a bottleneck in terms of both performance and reliability.

In [4, 5], we proposed a peer-to-peer based approach to this resource discovery problem. Each agent A maintains a limited-size local cache in which it keeps information about k different resources, that is, for each of the k resources, it stores the contact information of one agent that provides it. The agents in the cache of agent A are called A's neighbors. An agent searching for a resource uses some form of flooding: it checks its local cache and if there is no information for the resource, it contacts its neighbors, which in turn contact their neighbors and so on, until information for the resource is found in some cache.

In this paper, we consider the case in which the agents are mobile and their contact information changes. This results in caches having incorrect information about the agents. We propose a suite of policies for addressing cache updates in this context. Note that to reach an agent that has moved, there must exist at least one cache in the network having the correct information about the agent's new location.

Our update policies combine pull-based invalidation, initiated by the agent that maintains the cache, with push-based invalidation, initiated by the agent that moves. We consider periodic pull, where an agent periodically updates its cache, and on-demand pull, where an agent updates a cache entry only when the entry becomes obsolete. With push, an agent informs other agents about its new location. We study variations of flooding-based push where an agent A that has moved propagates its new contact point to its neighbors blindly, that is, even if its neighbors do not have A in their cache.

We propose a novel variation of flooding, where agents that receive information about other moving agents maintain it for a short period of time in a snooping directory. Thus, with this method, a moving agent "infects" its old neighbors with its new address. We compare such oblivious flooding-based push approaches with an informed push where, in addition to its local cache, each agent A maintains a list, called the inverted cache, of the agents that have A as their neighbor. When A moves, it pushes its new address only to agents in its inverted cache.

Our experimental results indicate that by fine-tuning the related parameters, snooping directories may attain the same cache consistency as plain flooding-based push but with a ten-fold reduction of the message overhead. The inverted cache method is message-cost effective as compared to the oblivious push approaches, but only when cache replacements are infrequent.

The proposed policies are also applicable to a more generic context in which peers (agents in our case) cache information about the content of other peers. The problem we study here is the general problem of updating the caches when this information becomes obsolete, for instance when the peers are mobile, as is the case with nodes in an ad-hoc wireless network.

The remainder of this paper is structured as follows. In Section 2, we present the peer-to-peer architecture of the cooperating agents as well as the search algorithms that form the basis for maintaining cache consistency. In Section 3, we introduce our cache consistency algorithms, and in Section 4 we report our experimental results. In Section 5, we discuss related work. Finally, in Section 6 we offer conclusions and directions for future work.

2. System model

We assume a multi-agent system with nodes/agents providing a number of resources. To fulfill their goals, agents need to use resources provided by other agents. To use a resource, an agent must contact the agent that provides it.

Figure 1. Part of a cache network: each agent v_i maintains contact information for k = 2 resources

Each agent is equipped with a private cache of size k, in which it stores information about k different resources, that is, for each of the k resources, the contact information of one agent that provides it. The system is modeled as a directed graph G(V, E), called the cache network. Each node corresponds to an agent along with its cache. There is an edge from node v to node u if and only if agent v has in its cache the contact information of agent u, in which case u is called a neighbor of v. There is no knowledge about the size of V or E. An example is shown in Fig. 1.
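To make the model concrete, here is a minimal Python sketch of the cache network (our own illustration; the class and function names are not from the paper):

```python
import random

class Agent:
    """A node in the cache network: provides some resources and caches,
    for each of k resources, the contact info of one provider."""
    def __init__(self, agent_id, resources, k):
        self.id = agent_id
        self.location = 0                  # abstract "contact information"
        self.resources = set(resources)    # resources this agent provides
        self.k = k
        self.cache = {}                    # resource -> (provider_id, location)

def build_cache_network(agents, all_resources):
    """Fill each cache with k distinct resources; each cached entry is an
    edge from the caching agent to the provider in the directed graph G."""
    provider_of = {r: a for a in agents for r in a.resources}
    for a in agents:
        candidates = [r for r in all_resources if r not in a.resources]
        for r in random.sample(candidates, a.k):
            p = provider_of[r]
            a.cache[r] = (p.id, p.location)
```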

In such a setting, a fundamental problem is that of resource location: how can an agent A, called the inquiring agent, that needs a particular type of resource x, find an agent that provides it?

Agent A initially searches its own cache. If it finds the resource there, it extracts the corresponding contact information and the search ends. If resource x is not found in the local cache, A sends a message querying agents found in its cache, that is, some of A's neighbors, which in turn propagate the query to their neighbors and so on. For performance reasons, the search is usually limited to a maximum number of steps t, also known as the time-to-live (TTL). A number of different approaches to this search procedure have been proposed, for example in [13, 4].

Flooding. In flooding, A contacts all its neighbors (i.e., all the agents listed in its cache) by sending an inquiring message asking for information about resource x. Any agent that receives this message searches its own cache. If x is found there, a reply containing the contact information is sent back to the inquiring agent. Otherwise, the intermediate agent contacts all of its own neighbors, thus propagating the inquiring message. As the messages are sent from node to node, a "tree" is unfolded, rooted at the inquiring agent (Fig. 2(a)). Note that the term "tree" is not accurate in graph-theoretic terms, since a node may be contacted by two or more other nodes, but it helps to visualize the situation.
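A sketch of flooding over this model (building on the `Agent` sketch above; `network` maps agent ids to agents, and duplicate deliveries are suppressed for brevity, although the paper notes they do occur):

```python
def flood_search(network, inquirer_id, resource, ttl):
    """Flooding: check the local cache, then forward the inquiring message
    to all neighbors, breadth-first, for at most ttl steps."""
    frontier, visited = [inquirer_id], {inquirer_id}
    for _ in range(ttl + 1):
        next_frontier = []
        for aid in frontier:
            agent = network[aid]
            if resource in agent.cache:
                return agent.cache[resource]      # contact info found
            for nid, _loc in agent.cache.values():
                if nid not in visited:
                    visited.add(nid)
                    next_frontier.append(nid)
        frontier = next_frontier
    return None                                   # not found within the TTL
```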

The most important disadvantage of flooding is the excessive number of messages that have to be transmitted, especially if t is not small. This number grows exponentially with the network density (cache size).

Teeming. To reduce the number of messages, a probabilistic variation of flooding called teeming can be applied. At each step of teeming, if the resource is not found in the local cache of a node, the node propagates the inquiring message only to a random subset of its neighbors. We denote by φ the fixed probability of selecting a particular neighbor. In contrast with flooding, the search tree is not a k-ary one any more (Fig. 2(b)). A node in the search tree may have from 0 to k children, kφ being the average case. Flooding is a special case of teeming for which φ = 1.

Teeming with decay. Flooding and teeming suffer from a high percentage of duplicate messages, since intermediate nodes may receive the same query through different paths. To reduce this overhead, the fan-out of the search in later steps should be decreased. Thus φ should not remain constant; it should rather decrease as the search progresses. In this policy, φ is a function φ = f(s, d), where s is the current step and d a decay parameter. We have experimented with various decay functions, and the best results were obtained when φ decreases exponentially as the step increases: φ = (1 − d)^s, where d < 1.
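Teeming changes only how many neighbors receive the forwarded query; a sketch of the decaying variant, with φ = (1 − d)^s (d = 0 reduces to plain flooding; `random` and `network` are as in the sketches above):

```python
def teeming_search(network, inquirer_id, resource, ttl, d=0.0):
    """Teeming with decay: at step s, each neighbor is selected
    with probability phi = (1 - d) ** s."""
    frontier, visited = [inquirer_id], {inquirer_id}
    for s in range(ttl + 1):
        phi = (1.0 - d) ** s
        next_frontier = []
        for aid in frontier:
            agent = network[aid]
            if resource in agent.cache:
                return agent.cache[resource]
            for nid, _loc in agent.cache.values():
                if nid not in visited and random.random() < phi:
                    visited.add(nid)
                    next_frontier.append(nid)
        frontier = next_frontier
    return None
```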

Figure 2. Searching the cache network of Figure 1: (a) flooding, (b) teeming, (c) random paths (p = 2)

Random paths. In this algorithm, the inquiring agent contacts p ≤ k of its neighbors. However, each intermediate node contacts only one of its own neighbors, chosen randomly. The search space formed ends up being p random paths unfolding concurrently in the network (Fig. 2(c)). The algorithm produces fewer messages than flooding or teeming, but in general it also needs more steps to locate a resource.
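A sketch of random paths under the same assumptions: after the initial fan-out to p neighbors, every walker forwards the query to exactly one randomly chosen neighbor per step:

```python
def random_paths_search(network, inquirer_id, resource, ttl, p):
    """p concurrent random walks, each of length at most ttl."""
    start = network[inquirer_id]
    if resource in start.cache:
        return start.cache[resource]
    neighbors = list(start.cache.values())
    walkers = random.sample(neighbors, min(p, len(neighbors)))
    for _ in range(ttl):
        next_walkers = []
        for nid, _loc in walkers:
            agent = network[nid]
            if resource in agent.cache:
                return agent.cache[resource]
            if agent.cache:        # continue the walk through one neighbor
                next_walkers.append(random.choice(list(agent.cache.values())))
        walkers = next_walkers
    return None
```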

3. Cache consistency algorithms

In mobile agent systems, agents or resources may change location at will. In such an environment, some agents will end up having invalid contact information about their cached resources. Invalid cache entries not only require renewed searches for the resource, but also deteriorate the performance of the search algorithms. We propose a number of strategies to deal with the cache consistency problem. There are two basic categories of algorithms: push-based ones, which are initiated by the moving agents, and pull-based ones, which are initiated by the agents that have invalid cache entries. All proposed strategies use a combination of push- and pull-based methods.

3.1. Pull

Pull-based updates are initiated by agents interested in refreshing their caches. Whenever an agent discovers that a cached resource has invalid contact information (i.e., the provider of the resource has moved), it initiates a search for the required resource. Any of the search algorithms described in the previous section can be utilized.

In general, given that at least one agent knows the correct location of the moving agent, the agent that initiates the search should ultimately find the mover's new location. However, the search procedure may yield diverse results, for two reasons. First, by the time the correct location is communicated back to the inquiring agent, the mover may have moved again. Second, other agents may reply with stale contact information because their own caches have not been updated.

In any case, invalid results require the initiation of another search phase, which can lead to an excessive overhead. We can reduce this overhead by using sequence numbers. Each time an agent moves, it increases its own sequence number by 1 and communicates it along with its new location information. Pulling agents select the search replies with the largest sequence number per provider.
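For instance, a pulling agent could resolve conflicting replies as follows (a sketch; the reply format is our own assumption):

```python
def freshest_replies(replies):
    """Each reply is (provider_id, location, seq_no).  Keep, per provider,
    the reply with the largest sequence number, i.e. the latest move."""
    best = {}
    for provider, location, seq in replies:
        if provider not in best or seq > best[provider][1]:
            best[provider] = (location, seq)
    return best

# Two replies about provider 7, one of them stale:
# freshest_replies([(7, "old-host", 3), (7, "new-host", 5)])
# -> {7: ("new-host", 5)}
```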

In addition to the above on-demand pull, agents can periodically pull their entire cache from the network. This is a prefetch operation aiming at avoiding the related search latency when the agent is actually in need of a resource. As a side effect, because periodic pulls tend to keep cache entries up-to-date most of the time, on-demand pulls will also yield faster and more precise answers. However, periodic pulls cannot be too frequent, because too many pulls could cause high network traffic. Agents should pull their cache when they realize that a large portion of it is invalid.

3.2. Push

When an agent moves, it must inform at least one other agent about its new location. This is to say that pull alone cannot solve the consistency problem. The general idea is that the moving agent pushes (broadcasts) a message containing its new location to the network in order to inform as many agents as possible¹. Next, we present and compare different strategies for implementing a push, including oblivious and informed ones, depending on the moving agent's knowledge about the network.

Plain push

In plain push, the agent broadcasts a message containing its new location. That is, it sends the message to its neighbors, which propagate it to their neighbors and so on. Push can be implemented using any of the algorithms presented in Section 2.

This strategy is oblivious since the moving agent is ignorant of which agents need its new location; the idea is that by flooding the network, most of the agents that have cached the mover will be notified. For the plain push to inform most of the interested agents with high probability, it will have to utilize the flooding or the teeming algorithm with a relatively large number of steps (so as to cover a good percentage of the nodes in the network).

¹ A sequence number is also included in the message in order to assist the pull phases, as discussed in Section 3.1.

Interested agents that are not reached by the push will have to enter a pull phase. There is a tradeoff here: if the push utilizes a large number of steps, the pull phase can use some lighter search algorithm, such as random paths or a very shallow teeming, and vice versa. In general, if agents do not move very frequently, a push with high message overhead is tolerable, as it allows for lighter pulls.

Push with snooping directories

Plain push can be improved if combined appropriately with periodic pulling. The proposed strategy requires that each agent maintain a directory of recently moved agents (termed the snooping directory).

As with the plain push algorithm, moving agents obliviously push their new location to the network. The difference is that any agent receiving a push message stores it in its snooping directory for a small period of time (termed the expiration time), even if the agent is not interested in the mover.

When a pull phase is entered, agents consult their snooping directory before initiating a search. However, the snooping directory may give unreliable information if the agent pulls after a long period of time. Thus, all agents should pull their entire cache periodically in order to obtain possible updates. The interval between pulls should be less than or equal to the expiration time.

Agents do not have to pull information from the entire network, but can pull using a small TTL instead. At the same time, this strategy allows for more relaxed push algorithms, too. For example, one could use a smaller flooding depth or a stricter decay parameter as compared to plain push. By appropriately tuning the involved parameters (push and pull algorithms, directory size, expiration time, pulling frequency), push with snooping directories should be a significant improvement over plain push.
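One way to realize the snooping directory is as a small expiring map that is consulted before any search; a sketch (turn-based time, matching the simulation of Section 4; the names are illustrative):

```python
class SnoopingDirectory:
    """Holds recently overheard push messages for a limited time."""
    def __init__(self, expiration_time):
        self.expiration_time = expiration_time
        self.entries = {}    # mover_id -> (location, seq, expires_at)

    def snoop(self, mover_id, location, seq, now):
        """Store every push message seen, even for uninteresting movers."""
        self.entries[mover_id] = (location, seq, now + self.expiration_time)

    def lookup(self, mover_id, now):
        """Return overheard contact info if still fresh, else None
        (in which case a regular, small-TTL pull is initiated)."""
        entry = self.entries.get(mover_id)
        if entry is None or now > entry[2]:
            self.entries.pop(mover_id, None)
            return None
        return entry[0], entry[1]
```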

Inverted cache

A way to avoid the potentially high message overhead of the push phase is to have some knowledge of where to send the updates. The basic idea behind this informed push strategy is that each agent should keep a directory of the agents to which it is known, called the inverted cache.

When the agent moves, it only needs to inform the agents in its inverted cache; this is trivially optimal and at the same time eliminates the pulls. However, storing a full directory may not always be preferable. For example, popular resources or agents (those cached by many others) may need to maintain a prohibitively large inverted cache. Thus only a limited directory may be maintained. As a result, not every interested agent will be updated by the moving agent, which means that pull phases will also be needed.

The inverted cache strategy can be combined with leasing. Leasing was proposed in [6] to ensure that a download is valid for a certain (leased) period of time. In our case, the lease duration indicates the time interval during which the resource owner guarantees that it will notify the leaser in case the former moves. Once the lease time expires, a leaser has to renew the lease so as to be informed of a possible move; otherwise the resource owner has the choice of evicting the leaser from its inverted cache.

Since the inverted cache maintains agents with valid leases, shorter leases imply smaller space overhead on average. The value of the lease time should be based on the movement frequency and the popularity of the agent. If the agent is highly mobile or very popular, the lease time should be small to keep the size of the inverted cache directory small.

When a leaser removes or replaces a resource from its cache, it must contact the resource owner so that the latter deletes the leaser from the inverted cache. In addition, when a leaser moves, it must contact all its neighbors (the resource owners it knows about) so that the latter update the corresponding contact info in their inverted caches. Thus, when an agent moves, it must contact all neighbors in its cache in addition to the agents in its inverted cache.
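Putting these rules together, a move under the inverted-cache scheme might be handled as follows (a sketch; `seq` and `inverted_cache`, mapping leaser id to [location, lease_expiry], are assumed extensions of the `Agent` class sketched earlier):

```python
def handle_move(agent, new_location, now, network):
    """Informed push: only 1-hop messages to leasers and neighbors."""
    agent.location = new_location
    agent.seq += 1
    # Evict expired leases, then push the new address to each live leaser.
    agent.inverted_cache = {lid: rec for lid, rec in agent.inverted_cache.items()
                            if rec[1] >= now}
    for lid in agent.inverted_cache:
        leaser = network[lid]
        for res, (pid, _old) in list(leaser.cache.items()):
            if pid == agent.id:
                leaser.cache[res] = (pid, new_location)   # entry stays valid
    # The mover is itself a leaser of its neighbors' resources: tell those
    # owners its new address so their inverted-cache records stay reachable.
    for owner_id, _loc in agent.cache.values():
        rec = network[owner_id].inverted_cache.get(agent.id)
        if rec:
            rec[0] = new_location
```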

The proposed algorithm is very efficient, especially when agents move frequently, since the updates become known quickly and with minimum overhead. However, it has increased memory requirements and it should be avoided in systems where the agents frequently change their cache. In addition, resource owners are forced to store and maintain state information about the agents that need them.

4. Experimental Results

To evaluate the performance of the proposed cache consistency protocols, we simulated a peer-to-peer network of agent caches. Each agent owns a number of resources and has a fixed cache size of k other agents/resources. We start our simulation by constructing a random graph of known agents: each agent picks k of the resources randomly and caches their location. We designate some resources as popular and also determine the extent of this popularity. A resource is offered by one agent only, but an agent may hold many resources. All agents start with valid caches. Our simulation runs for a number of rounds (turns). During each turn, an agent may move or add/replace/remove a resource from its cache with a given probability.
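A sketch of one simulation round under the Table 1 defaults (`push_new_location` stands in for whichever push policy is being measured; it is a placeholder of ours, not a routine from the paper):

```python
def run_turn(network, push_new_location, p_move=0.001, p_replace=0.006):
    """One round: each agent may move (0.1%) or replace a cached entry
    (0.6%), per the default probabilities of Table 1."""
    for agent in network.values():
        if random.random() < p_move:
            agent.location += 1                  # abstract relocation
            push_new_location(agent, network)    # plain/snooping/inverted push
        if random.random() < p_replace and agent.cache:
            victim = random.choice(list(agent.cache))
            del agent.cache[victim]              # to be refilled by a pull
```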

Frequent cache replacements result in fewer invalid cache entries, so we kept the cache replacement probability quite low. By doing so, at the end of the simulation runs, the percentage of consistent cache entries is affected mainly by the cache update algorithms.

The reported experiments are for a network of 1000 agents owning 3000 resources, for 250 turns. The popular resources are few (2%) but they are known to many other agents (ten times more agents on average). Table 1 summarizes our parameters.

During the simulation sessions, we keep statistics regarding the message overhead of the pull and push algorithms, the percentage of cache entries that are valid at each agent, and the maximum and average directory size of the agents (inverted cache or snooping directory). Each experiment was repeated 10 times and the results were averaged.

Table 1. Default simulation values

Parameter                                 Default value
Number of agents                          1000
Number of resources                       3000
Number of turns                           250
Cache size k                              8 resources
Percentage of popular agents              2%
Popularity factor                         10x more popular
Resources per agent                       1 to 8
Probability to move at each turn          0.1%
Probability to replace a cached entry     0.6%

4.1. Plain push

We experimented with a variety of algorithms for push (i.e., plain flooding, teeming, random paths). We report our experiments using teeming with decay, which exhibited the best results. The main factor that influences the performance of plain push is the extent of flooding. We use the following decay function:

φ = (1 − d)^s

By varying the decay parameter d and the maximum step t, we can control the extent of flooding. We used the values shown in Table 2. Agents pull information using a light k-random-paths algorithm with t = 3.

Table 2. Push versions

Extent of push flooding     d      t
Full flooding               0.0    5
Wide flooding               0.2    5
Medium flooding             0.3    5
Narrow flooding             0.4    4

The moving frequency has only a linear impact on the message overhead and the cache quality. The important issue in this method is the extent of flooding, since it determines the percentage of agents that receive an update. As shown in Fig. 3, a restrained push algorithm results in a poor cache condition at the end of the simulation. With wide flooding, we achieve satisfactory results: more than 98% of the cache is eventually consistent. Notice that these results were obtained using very low cache replacement probabilities. With higher replacement probabilities, cache validity would not tend to zero.

Figure 3. Narrow plain push with on-demand pull (percentage of valid cache over the simulation turns)

Figure 4. Plain push with on-demand pull: Average number of push and pull messages per move (full, wide, medium, and narrow flooding)

A wider push induces increased message overhead, as shown in Fig. 4. The message overhead increases exponentially with the extent of push. The pull messages decrease too; this is because pull messages do not propagate further when a cache has invalid entries. That is why in the narrow push, where the majority of the cache entries are invalid, the pull messages are very few. Notice also that the cache consistency percentage of wide flooding is almost the same as that of full flooding, while wide flooding has less than half the message overhead of full flooding. Thus fine-tuning the extent of push is important for getting balanced performance: good cache consistency with minimum message overhead.

The cache size k also affects the efficiency of the algorithm, as depicted in Fig. 5. Narrow flooding performs poorly in sparse networks (small caches) and satisfactorily in dense networks (large caches). However, dense networks produce large message traffic (an exponential increase). So in dense networks, we may restrict the extent of push to reduce the message overhead and still attain the same quality of cache consistency.

Figure 5. Plain push with on-demand pull: Cache size affects flooding performance

4.2. Snooping

With snooping, agents periodically pull information. In our simulation, the frequency of pull was set equal to the expiration time of the snooping directory, namely 20 turns. We used a very limited pull phase: we search only the 1-hop neighborhood for recent updates, so we expect the pull message overhead to be small. We used the same push algorithm as in plain push, since it gives the best results. As with plain push, the main parameter that affects the efficiency of snooping is the extent of push.

As shown in Fig. 6, with the same simulation parameters, snooping results in better cache consistency than plain push, while the message overhead is only slightly higher (Fig. 7). This is because the pull phase frequently finds up-to-date information in the snooping directory. The pull message overhead is quite small. Thus, snooping allows us to use a more restricted push while achieving the same cache quality. In particular, as we can see in Figs. 3 and 6, we can use snooping with narrow flooding instead of a plain push with wide flooding to achieve more than 90% valid caches. Snooping achieves this with ten times fewer messages than plain push.

4.3. Inverted cache with leasing

For these experiments, we used four values for the lease time: small (5 turns), medium (25 turns), large (50 turns), and very large (100 turns). We assume that agents do not renew their leases. Expired entries are deleted from the inverted cache when its size becomes larger than k/2; thus a minimum directory of k/2 agents is always kept. This is the only algorithm in which cache replacements induce message overhead.

Figure 6. Snooping: Cache consistency for different extents of push

Figure 7. Snooping: Average number of push and pull messages per move (full, wide, medium, and narrow flooding)

The leasing time affects the final consistency and the size of the inverted cache directory. Figure 8 shows how we can achieve a desired final consistency using appropriate leasing values. With inverted caches, an update is propagated with minimum latency: when an agent moves, it only informs the agents in its inverted and normal caches, thus producing a 1-hop message overhead for push. However, message overhead is created by cache replacements, as shown in Fig. 9. Still, the messages increase linearly with the number of cache replacements. The inverted cache size is affected by the popularity of an agent. Popular agents have larger directories even if we use smaller leasing values for them. Additionally, the density of the network (cache size) affects the inverted cache size, because dense networks result in more popular agents.

Figure 8. Inverted cache with leasing: Cache consistency for different leasing times

Figure 9. Inverted cache with leasing: Message overhead for different cache replacement frequencies (high, medium, low)

4.4. Discussion

When we do not want to use any additional memory for cache consistency, plain push is the only solution. Its performance is satisfactory when wide flooding is utilized. Since an agent that ends up with an invalid cache entry for a resource can pull on demand in order to locate the owner, push can be used in dynamic peer-to-peer systems where agents (peers) go offline frequently. However, push has an increased message overhead when compared to the other schemes, so it should be avoided in the case of frequently moving agents.

The snooping method with periodic pulling is more efficient, since it achieves the same cache quality as plain push but with a narrower flooding. As indicated by our simulation results, it may achieve the same cache consistency as plain push with a message overhead an order of magnitude smaller. However, we must use additional memory (for the snooping directory) to track recently moved agents. This directory can become large in the case of highly mobile agents and dense networks.

The push phase of the inverted cache method induces a small overhead, since only the neighbors in the cache and the inverted cache are contacted. Our experiments indicate that this overhead is almost a hundred times less than that of simple push. Moreover, cache updates are propagated immediately, since we only use 1-hop communications. However, this is the only method for which cache replacements produce message overhead. Replacing a cache entry for a resource requires contacting the resource owner. As such, the inverted cache method is unsuitable for systems with a high replacement-to-mobility ratio. Our experiments demonstrate that the inverted cache method induces a smaller message overhead than the other two methods when this ratio is lower than 100 (that is, when we have at most 100 times more cache replacements than moves). The size of the inverted cache directory depends on the density of the network and the lease time, and can become quite large for popular agents. This method is not appropriate for unreliable open MAS, because it requires agents to be online for maintaining a valid inverted cache, and agents rely on each other to be updated. Finally, agents waste resources keeping other agents informed.

5. Related work

The peer-to-peer cache network for agents was first proposed in [4]. However, [4] did not consider the fact that cache entries may become obsolete. In this paper, we focus on this problem, that is, on maintaining the consistency of the caches.

Distributed cache consistency has been the focus of much research [9, 15, 1, 16]. In general, most of the proposed solutions are based on a stateful server (i.e., a server that keeps information about the sites that maintain the cache entries) and thus apply some form of informed push. Most methods assume a reliable environment and cannot be applied directly to fully distributed and dynamic open MAS. Our proposed methods are reminiscent of the epidemic algorithms first proposed for replicated databases [3]. Rumor spreading [10, 8] and gossip [12] methods are example epidemic algorithms used for lazy transmission of updates to distributed copies of a database. Both assume a peer-to-peer model for probabilistic update propagation among the sites that replicate files. In our case, this update may correspond to the new location of a resource. Push/pull algorithms were used in [2, 11] to make file updates known to all the peers in highly unreliable peer-to-peer systems like Gnutella [7].

Our algorithms differ in the following ways: (i) the update that we want to propagate is the new location of a resource/agent and not a file update or a database replica, and (ii) we use a combination of push and pull algorithms to achieve probabilistic cache consistency in the entire network. To this end, we extend the push algorithms proposed in [2] with two novel push methods (snooping push and inverted cache push) and combine them with pull to achieve efficient hybrid push/pull strategies.

6. Conclusion

In this paper, we consider the problem of cache updates in a peer-to-peer network of mobile agents. Each agent maintains in its cache information about other agents. When agents move, cached entries about them become obsolete. We propose a number of update policies that combine (i) pull-based invalidation, initiated by the agent that maintains the cache, with (ii) push-based invalidation, initiated by the agent that moves. Both push and pull methods use appropriate variations of probabilistic message flooding. We propose a novel variation of push, where agents that receive information about other moving agents maintain it for a short period of time in a snooping directory. Our experimental results indicate that by fine-tuning the related parameters, maintaining a snooping directory leads to attaining the same cache consistency as plain flooding-based push but with a ten-fold reduction of the message overhead. We also compare variations of oblivious push with an informed push approach where each agent maintains a list, called the inverted cache, of the agents that have it cached. Our experiments indicate that the inverted cache method is message-cost effective when compared to the oblivious push approaches, but only when cache replacements are not frequent.

Our future plans include extending our approach to other types of mobile peers besides mobile agents. In particular, we are interested in peer-to-peer systems where peers are mobile nodes in an ad-hoc network. In this case, we plan to extend our cache update policies by taking into account information about how the location changes, e.g., the direction of a mobile peer's movement.

References

[1] P. Cao and C. Liu, "Maintaining strong cache consistency in the World Wide Web," IEEE Transactions on Computers, vol. 47, no. 4, pp. 445–457, Apr. 1998.

[2] A. Datta, M. Hauswirth, and K. Aberer, "Updates in highly unreliable, replicated peer-to-peer systems," in Proc. ICDCS 2003, 23rd International Conference on Distributed Computing Systems, Providence, Rhode Island, May 2003, pp. 76–85.

[3] A. Demers, D. Greene, C. Hauser, W. Irish, J. Larson, S. Shenker, H. Sturgis, D. Swinehart, and D. Terry, "Epidemic algorithms for replicated database maintenance," in Proc. PODC 1987, 6th Annual ACM Symposium on Principles of Distributed Computing, Vancouver, Canada, Aug. 1987, pp. 1–12.

[4] V. V. Dimakopoulos and E. Pitoura, "Performance analysis of distributed search in open agent systems," in Proc. IPDPS '03, International Parallel and Distributed Processing Symposium, Nice, France, May 2003.

[5] ——, "A peer-to-peer approach to resource discovery in multi-agent systems," in Proc. CIA 2003, 8th Int'l Workshop on Cooperative Information Agents, Helsinki, Finland, Aug. 2003, pp. 62–77.

[6] C. Gray and D. Cheriton, "Leases: An efficient fault-tolerant mechanism for distributed file cache consistency," in Proc. SOSP 1989, 12th ACM Symposium on Operating Systems Principles, Litchfield Park, Arizona, Dec. 1989, pp. 202–210.

[7] Gnutella Protocol Specification, Version 0.4.

[8] R. G. Guy, P. L. Reiher, D. Ratner, M. Gunter, W. Ma, and G. J. Popek, "Rumor: Mobile data access through optimistic peer-to-peer replication," in Advances in Database Technologies, ER '98 Workshops, ser. LNCS, vol. 1552. Singapore: Springer, Nov. 1998, pp. 254–265.

[9] A. Kahol, S. Khurana, S. Gupta, and P. Srimani, "A strategy to manage cache consistency in a distributed mobile wireless environment," in Proc. ICDCS 2000, Int'l Conf. on Distributed Computing Systems, Taipei, Taiwan, Apr. 2000, pp. 530–537.

[10] R. M. Karp, C. Schindelhauer, S. Shenker, and B. Vöcking, "Randomized rumor spreading," in Proc. FOCS 2000, 41st Annual Symposium on Foundations of Computer Science, Redondo Beach, CA, Nov. 2000, pp. 565–574.

[11] J. Lan, X. Liu, P. Shenoy, and K. Ramamritham, "Consistency maintenance in peer-to-peer file sharing networks," in Proc. WIAPP '03, 3rd IEEE Workshop on Internet Applications, San Jose, CA, June 2003, pp. 76–85.

[12] M.-J. Lin and K. Marzullo, "Directional gossip: Gossip in a wide area network," in Proc. EDCC-3, 3rd European Dependable Computing Conference, ser. Lecture Notes in Computer Science, vol. 1667. Springer, Sept. 1999, pp. 364–379.

[13] Q. Lv, P. Cao, E. Cohen, K. Li, and S. Shenker, "Search and replication in unstructured peer-to-peer networks," in Proc. of the 16th ACM International Conference on Supercomputing, 2002.

[14] K. Sycara, M. Klusch, S. Widoff, and J. Lu, "Dynamic service matchmaking among agents in open information environments," SIGMOD Record, vol. 28, no. 1, pp. 47–53, Mar. 1999.

[15] J. Xu, X. Tang, D. L. Lee, and Q. Hu, "Cache coherency in location-dependent information services for mobile environment," in Proc. MDA '99, First International Conference on Mobile Data Access, ser. LNCS, vol. 1748. Hong Kong, China: Springer, Dec. 1999, pp. 182–193.

[16] J. Yin, L. Alvisi, M. Dahlin, and A. Iyengar, "Engineering server-driven consistency for large-scale dynamic web services," in Proc. WWW10, 10th Int'l World Wide Web Conference, Hong Kong, China, May 2001, pp. 45–57.
