
Engineering Applications of Artificial Intelligence 19 (2006) 277–287

Prediction of automotive engine power and torque using least squares support vector machines and Bayesian inference

Chi-Man Vong a,*, Pak-Kin Wong b, Yi-Ping Li a

a Department of Computer and Information Science, University of Macau, P.O. Box 3001, Macau, China

b Department of Electromechanical Engineering, University of Macau, P.O. Box 3001, Macau, China

Received 12 November 2004; received in revised form 25 July 2005; accepted 30 August 2005

Available online 21 October 2005

Abstract

Automotive engine power and torque are significantly affected by effective tune-up. Current practice of engine tune-up relies on the experience of the automotive engineer. The engine tune-up is usually done by a trial-and-error method, and the vehicle engine is then run on a dynamometer to show the actual engine output power and torque. Obviously, the current practice costs a large amount of time and money, and may even fail to tune up the engine optimally because a formal power and torque model of the engine has not been determined yet. With an emerging technique, least squares support vector machines (LS-SVM), the approximate power and torque model of a vehicle engine can be determined by training on the sample data acquired from the dynamometer. The number of dynamometer tests for an engine tune-up can therefore be reduced because the estimated engine power and torque functions can replace the dynamometer tests to a certain extent. Besides, a Bayesian framework is also applied to infer the hyperparameters used in LS-SVM so as to eliminate the work of cross-validation, and this leads to a significant reduction in training time. In this paper, the construction, validation and accuracy of the functions are discussed. The study shows that the predicted results using the model estimated by LS-SVM are in good agreement with the actual test results. To illustrate the significance of the LS-SVM methodology, the results are also compared with those regressed using a multilayer feedforward neural network.

© 2005 Elsevier Ltd. All rights reserved.

Keywords: Automotive engine power and torque; Least squares support vector machines; Bayesian inference

1. Introduction

1.1. ECU tune-up

Modern automotive gasoline engines are controlled by the electronic control unit (ECU). The engine output power and torque are significantly affected by the setup of control parameters in the ECU. Many parameters are stored in the ECU using a look-up table or map. Normally, the power and torque data of a car engine are obtained through dynamometer tests. An example of performance data of an engine, output horsepower and torque against speed, is shown in Fig. 1. The engine power and torque reflect the dynamic performance of an engine. Traditionally, the setup of the ECU is done by the vehicle manufacturer. However, in recent years, programmable ECUs and ECU read-only memory (ROM) editors have been widely adopted in many passenger cars. These devices allow non-OEM engineers to tune up their engines according to different add-on components and drivers' requirements.

Current practice of engine tune-up relies on the experience of the automotive engineer, who handles a huge number of combinations of engine control parameters. The relationship between the input and output parameters of a modern car engine is a complex multivariable nonlinear model, which is very difficult to estimate, because a modern automotive engine is an integration of thermo-fluid, electromechanical and computer control systems. Consequently, engine tune-up is usually done by a trial-and-error method.

Firstly, the engineer guesses an ECU setup based on his/her experience and then stores the setup values in the ECU. Then, the engine is run on a dynamometer to test the actual


0952-1976/$ - see front matter © 2005 Elsevier Ltd. All rights reserved.

doi:10.1016/j.engappai.2005.09.001

*Corresponding author. Tel.: +86 853 3974476.

E-mail address: cmvong@umac.mo (C.-M. Vong).

engine power and torque. If the performance is a loss, the engineer adjusts the ECU setting and repeats the procedure until the performance is satisfactory. That is why vehicle manufacturers normally spend many months tuning up the ECU optimally for a new car model. Moreover, the power and torque functions are engine dependent as well. Every engine requires a similar tune-up procedure.

By knowing the power and torque functions, the automotive engineer can predict whether a trial ECU setup is a gain or a loss. The car engine only needs to go through a dynamometer test for verification after a satisfactory setup has been estimated from the functions. Hence, the number of unnecessary dynamometer tests for trial setups can be drastically reduced, saving a large amount of time and money for testing.

1.2. Neural networks and their drawbacks

Recent studies (Brace, 1998; Traver et al., 1999; Su et al., 2002; Yan et al., 2003; Liu and Fei, 2004) have described the use of neural networks for modeling diesel engine emission performance based on experimental data. It is well known that a neural network (Bishop, 1995; Haykin, 1999) is a universal estimator. In general, however, it has two main drawbacks in its learning process (Smola et al., 1996; Schölkopf and Smola, 2002):

(1) The architecture, including the number of hidden neurons, has to be determined a priori or modified during training by heuristics, which results in a not necessarily optimal network structure.

(2) The training process (i.e., the minimization of the residual squared error cost function) in neural networks can easily get stuck in local minima. Various ways of preventing local minima, such as early stopping and weight decay, are employed. However, those methods greatly affect the generalization of the estimated function, i.e., its capacity for handling new input cases.

1.3. Nonlinear regression and its drawbacks

Traditional mathematical methods for nonlinear regression (Sen and Srivastava, 1990; Ryan, 1996; Harrell, 2001; Tabachnick and Fidell, 2001; Seber and Wild, 2003) may be applied to estimate the engine power and torque models. They work by transforming the nonlinear data space into a linear data space, i.e., removing the nonlinearity, and then performing linear regression over the transformed data space. The drawbacks of nonlinear regression methods are:

(1) These nonlinear transformations are not guaranteed to retain the information of the transformed data. Usually, after the transformation, the training data is distorted, which affects the predictability of the model regressed from the transformed training data.

(2) These nonlinear transformations can work well only for low-dimensional data sets. In the current application, an engine setup involves many parameters. Constructing the prediction models in such a high-dimensional and nonlinear data space is very difficult for traditional regression methods. So, it is not recommended to apply traditional nonlinear regression methods to high-dimensional data sets.

1.4. Support vector machines (SVM) for engine output prediction subject to ECU tune-up

With the emerging technique of support vector machines (SVM) (Cristianini and Shawe-Taylor, 2000; Schölkopf and Smola, 2002; Suykens et al., 2002), which combines the advantages of neural networks (handling large amounts of highly nonlinear data) and nonlinear regression (high generalization), the issue of high dimensionality as well as the previous drawbacks of neural networks are overcome. For this reason, SVM is employed to estimate the engine power and torque models, which can be used for precise performance prediction, so that the number of dynamometer tests can be significantly reduced; dynamometer tests normally cost a large amount of money and time.

Moreover, a dynamometer is not always available, particularly in the case of on-road fine tune-up. Research on the prediction of modern gasoline engine output power and torque subject to various parameter setups in the ECU is still quite rare, so the use of SVM for modeling engine output power and torque is a first attempt.

2. Support vector machines

SVM is an interdisciplinary field of machine learning, optimization, statistical learning and generalization theory. It can also be viewed as another category of feed-forward network, as illustrated in Fig. 2 (Haykin, 1999). Basically, it can be used for pattern classification and function estimation (Gunn, 1998). Since this paper focuses on function estimation, the

Fig. 1. Example of engine output horsepower and torque curves.


discussion is only related to function estimation issues. For pattern classification, Gunn (1998), Haykin (1999) and Smola et al. (1996) provide valuable references.

SVM provides a general framework, or methodology, for formulating the mathematical program for the training error function used in any application. Regardless of the application, SVM formulates the training process (i.e., minimization of the squared residual error function) as a quadratic programming (QP) problem for the weights, with a regularization factor included. Since the QP problem is convex, the solution returned is global (or even unique) instead of one of many local ones, unlike neural networks. This ensures higher generalization of the trained SVM models relative to neural networks.

Another important appeal of SVM over traditional regression methods is its ability to handle very high nonlinearity. Similar to nonlinear regression, SVM transforms the low-dimensional nonlinear input data space into a high-dimensional linear feature space through a nonlinear mapping φ: R^n → R^{n_h} (Fig. 3), where n is the dimension of the data space and n_h is the (very high, even infinite) dimension of the unknown feature space. Linear function estimation can then be performed over the feature space. The problem now turns to finding this nonlinear mapping φ for the primal formulation (Fig. 4). Nevertheless, the SVM dual formulation (Fig. 4) provides an inner-product kernel trick, K(x_k, x_l) = φ(x_k)^T φ(x_l), which totally eliminates the effort of finding the nonlinear mapping φ that is necessary in traditional nonlinear regression methods. This trick is also illustrated in Section 2.4 (Eqs. (6) and (7)). So, the kernel function K in the dual formulation is to be defined rather than the nonlinear mapping φ. Fortunately, three common kernel functions (Cristianini and Shawe-Taylor, 2000; Schölkopf and Smola, 2002; Suykens et al., 2002) are available, among which the radial basis function (RBF) kernel is the best for nonlinear function estimation.

2.1. SVM formulation for nonlinear function estimation

Consider the data set D = {(x_1, y_1), ..., (x_N, y_N)} with N data points, where x_i ∈ R^n, y_i ∈ R. The SVM dual formulation for nonlinear regression is expressed as the following equation (Cristianini and Shawe-Taylor, 2000; Schölkopf and Smola, 2002; Suykens et al., 2002):

\min_{\alpha,\alpha^*} J(\alpha,\alpha^*) = \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N}(\alpha_i-\alpha_i^*)(\alpha_j-\alpha_j^*)K(x_i,x_j) + \epsilon\sum_{i=1}^{N}(\alpha_i+\alpha_i^*) - \sum_{i=1}^{N} y_i(\alpha_i-\alpha_i^*)

\text{s.t.}\quad \sum_{i=1}^{N}(\alpha_i-\alpha_i^*) = 0,   (1)

where

α, α*: Lagrange multipliers (each expressed as an N-dimensional vector), with α_i, α_j ∈ α and α_i^*, α_j^* ∈ α^* for 1 ≤ i, j ≤ N, and α_i, α_j, α_i^*, α_j^* ∈ [0, c],

K: kernel function,

ε: user pre-defined regularization constant,

c: user pre-defined positive real constant for capacity control.
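Eq. (1) is a standard convex QP, so it can be handed to any general-purpose QP solver. The following minimal Python sketch (an illustration, not part of the paper) sets the problem up with numpy and the cvxopt package, stacking the unknowns as z = [α; α*]; the ε, c and kernel-variance values are placeholder assumptions.

import numpy as np
from cvxopt import matrix, solvers

def rbf_kernel(X, Y, sigma2):
    # K[i, j] = exp(-||X[i] - Y[j]||^2 / sigma2), the RBF kernel used in the paper
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / sigma2)

def svr_dual_fit(X, y, eps=0.01, c=10.0, sigma2=1.0):
    """Solve the dual QP of Eq. (1) for z = [alpha; alpha_star]."""
    N = len(y)
    K = rbf_kernel(X, X, sigma2)
    # Quadratic term: (1/2) (alpha - alpha*)^T K (alpha - alpha*)
    P = np.block([[K, -K], [-K, K]])
    # Linear term: eps * sum(alpha + alpha*) - sum_i y_i (alpha_i - alpha_i*)
    q = eps * np.ones(2 * N) + np.concatenate([-y, y])
    # Box constraints 0 <= alpha_i, alpha_i* <= c, written as G z <= h
    G = np.vstack([-np.eye(2 * N), np.eye(2 * N)])
    h = np.concatenate([np.zeros(2 * N), c * np.ones(2 * N)])
    # Equality constraint: sum_i (alpha_i - alpha_i*) = 0
    A = np.concatenate([np.ones(N), -np.ones(N)]).reshape(1, -1)
    sol = solvers.qp(matrix(P), matrix(q), matrix(G), matrix(h),
                     matrix(A), matrix(np.zeros(1)))
    z = np.array(sol['x']).ravel()
    return z[:N], z[N:]          # alpha, alpha_star

As discussed later in Section 2.2.2, most entries of the returned α and α* come out (numerically) zero; the data points carrying nonzero support values are the support vectors.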

Fig. 2. Neural networks interpretation of SVM (Haykin, 1999).

Fig. 3. Nonlinear mapping from the nonlinear data space to the high-dimensional linear feature space (Suykens et al., 2002).


Nonzero α_i and α_i^* are known as support values corresponding to the ith data point, where the ith data point means the ith engine setup and output torque. Besides, the RBF with user pre-defined sample variance σ² is chosen as the kernel function because it often gives good results for nonlinear regression (Suykens et al., 2002; Seeger, 2004). After solving Eq. (1) with a commercial optimization package, such as MATLAB and its optimization toolbox, two N-vectors α and α^* are obtained as the solutions, resulting in the following target nonlinear model:

M(x) = \sum_{i=1}^{N}(\alpha_i-\alpha_i^*)K(x,x_i) + b = \sum_{i=1}^{N}(\alpha_i-\alpha_i^*)\,e^{-\|x-x_i\|^2/\sigma^2} + b,   (2)

where b is the bias constant, x the new engine input setup with n parameters and σ² the user-specified sample variance.

In order to obtain the bias b, m training data points d_k = ⟨x_k, y_k⟩ ∈ D, k = 1, 2, ..., m, are selected such that their corresponding α_k and α_k^* ∈ (0, c), i.e., 0 < α_k, α_k^* < c. By substituting x_k into Eq. (2) and setting M(x_k) = y_k, a bias b_k can be obtained. Since there are m biases, the optimal bias value b^* is usually obtained by taking the average of the b_k, as shown in Eq. (3):

b^* = \frac{1}{m}\sum_{k=1}^{m} b_k.   (3)
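As a continuation of the sketch in Section 2.1 (again an illustration, reusing its rbf_kernel and the α, α* it returns), the bias b^* of Eq. (3) can be averaged over the training points whose support values lie strictly inside (0, c), and Eq. (2) then gives the predicted output for a new engine setup; the tolerance value is an assumption.

def svr_bias_and_predict(X, y, alpha, alpha_star, c, sigma2, x_new, tol=1e-6):
    """Average the bias over free support vectors (Eq. (3)) and evaluate Eq. (2)."""
    coef = alpha - alpha_star
    K = rbf_kernel(X, X, sigma2)
    # Points whose multiplier lies strictly inside (0, c); following the paper's
    # recipe, b_k is obtained by setting M(x_k) = y_k for these points.
    free = ((alpha > tol) & (alpha < c - tol)) | ((alpha_star > tol) & (alpha_star < c - tol))
    b_k = y[free] - K[free] @ coef
    b_star = b_k.mean()                          # Eq. (3)
    # Eq. (2): predicted output for a new engine setup x_new
    k_new = rbf_kernel(x_new[None, :], X, sigma2).ravel()
    return k_new @ coef + b_star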

Fig. 4. Primal-dual neural network interpretations of SVM (Suykens et al., 2002): the primal problem estimates w ∈ R^{n_h} (n_h hidden neurons) in y(x) = sign[w^T φ(x) + b], while the dual problem estimates α ∈ R^N via the kernel trick K(x_k, x_l) = φ(x_k)^T φ(x_l), with y(x) = sign[Σ_{k=1}^{#sv} α_k y_k K(x, x_k) + b] and #sv = number of support vectors < N.


2.2. Comparing SVM with neural networks and nonlinear regression

Since SVM combines the advantages of neural networks and nonlinear regression, they are compared below and the respective advantages are listed.

2.2.1. SVM vs. neural networks

Both SVM and neural networks can handle highly nonlinear function estimation. However, they differ in the following ways:

(1) The architecture of an SVM model does not have to be determined before training. Input data of any arbitrary dimensionality can be treated with only linear cost in the number of input dimensions. In addition, the number of support vectors (i.e., hidden neurons in neural networks) does not need to be specified a priori.

(2) SVM treats function estimation as a QP problem of minimizing the data-fitting error function plus regularization, which produces a global (or even unique) solution having minimal fitting error, while high generalization of the estimated model can also be obtained. For neural networks, the formulation of the training error function usually leads to many local solutions.

2.2.2. SVM vs. nonlinear regression

Both SVM and nonlinear regression can estimate nonlinear functions with high generalization. However, they differ in the following ways:

(1) The inner-product kernel trick K(x_k, x_l) = φ(x_k)^T φ(x_l) is applied in the SVM dual formulation so that the nonlinear mapping φ can be ignored, whereas this difficult nonlinear mapping must be explicitly specified or guessed in traditional nonlinear regression methods. This is illustrated in Section 2.4 (Eqs. (6) and (7)). This factor usually prevents nonlinear regression from handling high nonlinearity.

(2) An interesting property of SVM is called sparseness. This is a result of solving the QP formulation in SVM. In solving Eq. (1), most of the support values α_i and α_i^* are set to zero. Hence, only those data points x_k related to non-zero support values α_k and α_k^* are involved in computing Eq. (2). These data points x_k are called support vectors. The number of these support vectors (i.e., #sv) is determined during training time. That explains why SVM does not require the number of support vectors to be specified a priori. The estimated model in Eq. (2) then involves a set of m ≪ N support vectors. This makes the estimated model very compact and efficient at run time. Under this circumstance, SVM can handle a much larger amount of training data (up to millions of points), while nonlinear regression can usually handle only up to hundreds of training data.

2.3. Least squares support vector machines

Least squares support vector machines (LS-SVM) (Suykens et al., 2002) is a variant of SVM which employs a least squares error in the training error function. SVM solves nonlinear function estimation problems by means of convex quadratic programs, and sparseness is obtained as a result of this QP problem. However, QP problems are inherently difficult to solve. Although many commercial packages exist for solving QP problems, a simpler formulation is still preferred. LS-SVM is the variant that modifies the original SVM formulation, leading to a set of linear equations that is easier to solve than a QP problem, while most of the important advantages of SVM are retained. In addition, the advantages of LS-SVM over standard SVM are:

(1) The threshold b is returned automatically as part of the LS-SVM solution, whereas SVM must calculate the threshold b separately.

(2) The number of hyperparameters for tuning is reduced from three (ε, c, σ) to two (γ, σ).

(3) A Bayesian inference procedure has been developed to automatically find the most appropriate values for the hyperparameters γ and σ, which eliminates the burden of the manual cross-validation procedure for estimating the values of ε, c and σ.

2.4. LS-SVM formulation for nonlinear function estimation

Consider the data set D = {(x_1, y_1), ..., (x_N, y_N)} with N data points, where x_k ∈ R^n, y_k ∈ R, k = 1 to N. LS-SVM deals with the following optimization problem in the primal weight space:

\min_{w,b,e} J_P(w,e) = \frac{1}{2}w^{T}w + \gamma\,\frac{1}{2}\sum_{k=1}^{N}e_k^{2} \quad \text{s.t.}\ e_k = y_k - [w^{T}\varphi(x_k)+b],\ k=1,\ldots,N,   (4)

where w ∈ R^{n_h} is the weight vector of the target function, e = [e_1, ..., e_N] is the residual vector, and φ: R^n → R^{n_h} is a nonlinear mapping, n is the dimension of x_k, and n_h is the dimension of the unknown feature space. Solving the dual of Eq. (4) avoids the high (and unknown) dimensionality of w. The LS-SVM dual formulation for nonlinear function estimation is then expressed as follows (Suykens et al., 2002):

Solve in α, b:

\begin{bmatrix} 0 & 1_v^{T} \\ 1_v & \Omega + (1/\gamma)I_N \end{bmatrix} \begin{bmatrix} b \\ \alpha \end{bmatrix} = \begin{bmatrix} 0 \\ y \end{bmatrix},   (5)

where I_N is the N-dimensional identity matrix, y = [y_1, ..., y_N]^T, 1_v = [1, ..., 1]^T is an N×1 vector, α = [α_1, ..., α_N]^T, and γ ∈ R is a scalar for regularization


(which is a hyperparameter for tuning). The kernel trick is employed as follows:

\Omega_{k,l} = \varphi(x_k)^{T}\varphi(x_l) = K(x_k,x_l), \quad k,l = 1,\ldots,N,   (6)

where K is a predefined kernel function.

The resulting LS-SVM model for function estimation becomes

M(x) = \sum_{k=1}^{N}\alpha_k\,\varphi(x_k)^{T}\varphi(x) + b = \sum_{k=1}^{N}\alpha_k K(x_k,x) + b = \sum_{k=1}^{N}\alpha_k \exp\!\left(-\frac{\|x_k-x\|^{2}}{\sigma^{2}}\right) + b,   (7)

where α_k, b ∈ R are the solutions of Eq. (5), x_k is the training data, x is the new input case, and the RBF is chosen as the kernel function K. From the viewpoint of the current application, some parameters in Eqs. (5) and (6) are specified as:

N: total number of engine setups (data points),

x_k: engine input control parameters in the kth sample data point, k = 1, 2, ..., N (i.e., the kth engine setup),

y_k: engine output torque in the kth sample data point.
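To make Eqs. (5)–(7) concrete, the following minimal numpy sketch (an illustration only, not the LS-SVMlab implementation used by the authors) assembles Ω from the RBF kernel, solves the (N+1)×(N+1) linear system of Eq. (5) for (b, α), and evaluates the torque model of Eq. (7) for new engine setups; the hyperparameter values passed in are assumed to come from the procedure of Section 4.3.

import numpy as np

def rbf_kernel(X, Y, sigma2):
    # Omega[k, l] = K(x_k, x_l) = exp(-||x_k - x_l||^2 / sigma2), Eq. (6)
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / sigma2)

def lssvm_train(X, y, gamma, sigma2):
    """Solve the LS-SVM dual linear system of Eq. (5) for (b, alpha)."""
    N = len(y)
    Omega = rbf_kernel(X, X, sigma2)
    A = np.zeros((N + 1, N + 1))
    A[0, 1:] = 1.0                         # first row:    [0, 1_v^T]
    A[1:, 0] = 1.0                         # first column: [0; 1_v]
    A[1:, 1:] = Omega + np.eye(N) / gamma  # Omega + (1/gamma) I_N
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]                 # b, alpha

def lssvm_predict(X_train, b, alpha, sigma2, X_new):
    """Evaluate the model M(x) of Eq. (7) for each row of X_new."""
    return rbf_kernel(X_new, X_train, sigma2) @ alpha + b

For a subset D_r of engine setups, lssvm_train returns the support values α and bias b that define the torque function M_r.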

3. Application of LS-SVM to gasoline engine modeling

In the current application, M(x) in Eq. (7) is the torque function of an automotive engine. The power of the engine is calculated from the engine torque as discussed in Section 4. The issues of LS-SVM for this application domain are discussed in the following sub-sections.

3.1. Schema

The training data set is expressed as D = {d_k} = {(x_k, y_k)}, k = 1 to N. Practically, there are many input control parameters and they are also ECU and engine dependent. Moreover, the engine power and torque curves are normally obtained at full-load condition. For the purpose of demonstrating the LS-SVM methodology, the following common adjustable engine parameters and environmental parameter are selected as the input (i.e., the engine setup) at engine full-load condition:

x = ⟨I_r, O, t_r, f, J_r, d, a, p⟩  and  y = ⟨T_r⟩,

where r is the engine speed (RPM) and r ∈ {1000, 1500, 2000, 2500, ..., 8000}, I_r the ignition spark advance at the corresponding engine speed r (degrees before top dead centre), O the overall ignition trim (± degrees before top dead centre), t_r the fuel injection time at the corresponding engine speed r (ms), f the overall fuel trim (±%), J_r the timing for stopping the fuel injection at the corresponding engine speed r (degrees before top dead centre), d the ignition dwell time at 15 V (ms), a the air temperature (°C), p the fuel pressure (bar) and T_r the engine torque at the corresponding engine speed r (kg m).

The engine speed range for this project has been selected as 1000 to 8000 rpm. Although the engine speed r is a continuous variable, in a practical ECU setup the engineer normally fills in the setup parameters for each category of engine speed in a map format. The map usually divides the speed range discretely with an interval of 500 rpm, i.e., r ∈ {1000, 1500, 2000, 2500, ...}. Therefore, it is unnecessary to build a function across all speeds. So, r is manually divided with a specified interval of 500 instead of taking any integer ranging from 0 to 8000.

As the training data is engine speed dependent, another notation D_r is used to further specify a data set containing the data with respect to a specific r. For example, D_1000 contains the following parameters: ⟨I_1000, O, t_1000, f, J_1000, d, a, p, T_1000⟩, while D_8000 contains ⟨I_8000, O, t_8000, f, J_8000, d, a, p, T_8000⟩.

Consequently, D is separated into fifteen subsets, namely D_1000, D_1500, ..., D_8000. An example of the training data (ECU setup) for D_1000 is shown in Table 1. A subset D_r is passed to the LS-SVM regression module, Eq. (5), in order to construct the torque function M_r (Eq. (7)) with respect to engine speed r. According to this division of the training data, there are 15 torque functions in total, i.e., M_r ∈ {M_1000, M_1500, ..., M_8000}.

In this way, the LS-SVM module is run fifteen times. In every run, a different subset D_r is used as the training set to estimate its corresponding torque function. An engine torque against engine speed curve is then obtained by fitting a curve that passes through all data points generated by M_1000, M_1500, M_2000, ..., M_8000.

Table 1
Example of training data d_i in data set D_1000

       I_1000   O    t_1000   f    J_1000   d     a    p     T_1000
d_1    8        0    7.1      0    385      3     25   2.8   20
d_2    10       2    6.5      0    360      3     25   2.8   11
...
d_N    12       -    7.5      3    360      2.7   30   2.8   12
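The schema above can be sketched in code as follows; this assumes a data layout (one dict per sampled setup, with per-speed maps for I_r, t_r and J_r plus the measured torque T_r) rather than the authors' actual file format, and it reuses lssvm_train from the sketch in Section 2.4.

import numpy as np

SPEEDS = range(1000, 8001, 500)       # r in {1000, 1500, ..., 8000}

def build_subsets(records):
    """Split the sampled data D into the fifteen speed-indexed subsets D_r."""
    subsets = {}
    for r in SPEEDS:
        X = np.array([[rec['I'][r], rec['O'], rec['t'][r], rec['f'],
                       rec['J'][r], rec['d'], rec['a'], rec['p']]
                      for rec in records])
        y = np.array([rec['T'][r] for rec in records])
        subsets[r] = (X, y)
    return subsets

def train_torque_models(subsets, hyperparams):
    """Fit one torque function M_r per engine speed (15 LS-SVM models in total)."""
    return {r: lssvm_train(X, y, *hyperparams[r])   # hyperparams[r] = (gamma_r, sigma2_r)
            for r, (X, y) in subsets.items()}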


4. Data sampling and implementation

In a practical engine setup, the automotive engineer determines an initial setup, which can basically start the engine, and the engine is then fine-tuned by adjusting the parameters about the initial setup values. Therefore, the input parameters are sampled based on data points about an initial setup supplied by the engine manufacturer. In the experiment, a sample data set D of 200 different engine setups along with output torque was acquired from a Honda B16A DOHC engine controlled by a programmable ECU, a MoTeC M4, running on a chassis dynamometer (Fig. 5) at wide open throttle. The engine output data is only the torque against the engine speeds, because the horsepower HP of an engine is calculated using

HP = \frac{2\pi \times r \times 9.81 \times T}{746 \times 60},   (8)

where HP is the engine horsepower (hp), r the engine speed (RPM: revolutions per minute) and T the engine torque (kg m). After collection of the sample data set D, every data subset D_r ⊂ D is randomly divided into two sets: TRAIN_r for training and TEST_r for testing, such that D_r = TRAIN_r ∪ TEST_r, where TRAIN_r contains 80% of D_r and TEST_r holds the remaining 20% (Fig. 6). Every TRAIN_r is then sent to the LS-SVM module for training, which has been implemented using LS-SVMlab (Pelckmans et al., 2003), a MATLAB toolbox, under MS Windows XP. The implementation and other important issues are discussed in the following sub-sections.

4.1. Data pre-processing and post-processing

In order to have a more accurate regression result, the data set is conventionally normalized before training (Pyle, 1999). This prevents any single parameter from dominating the output value. All input and output values are normalized to the range [0, 1] through the following transformation formula:

N(v) = v^* = \frac{v - v_{\min}}{v_{\max} - v_{\min}},   (9)

where v_min and v_max are the minimum and maximum domain values of the input or output parameter v, respectively. For example, if v ∈ [7, 39], then v_min = 7 and v_max = 39. The limits for each input and output parameter of an engine should be predetermined via a number of experiments, expert knowledge or manufacturer data sheets. As all input values are normalized, the output torque value v^* produced by the LS-SVM is not the actual value. It must be de-transformed using the inverse N^{-1} of Eq. (9) in order to obtain the actual output value v.

4.2. Error function

Fig. 5. Car engine performance data acquisition on a chassis dynamometer.

Fig. 6. Further division of data randomly into training sets (TRAIN_r) and test sets (TEST_r).

To verify the accuracy of each function M_r, an error function has been established. For a certain function M_r,


the corresponding validation error is

E_r = \sqrt{\frac{1}{N}\sum_{k=1}^{N}\left(\frac{y_k - M_r(x_k)}{y_k}\right)^{2}},   (10)

where x_k ∈ R^n is the engine input parameter vector of the kth data point in a test set or a validation set, y_k is the true torque value in the data point d_k (d_k = (x_k, y_k) represents the kth data point) and N is the number of data points in the test set or validation set.

The error E_r is the root mean square of the difference between the true torque value y_k of a test point d_k and its corresponding estimated torque value M_r(x_k). The difference is also divided by the true torque y_k, so that the result is normalized within the range [0, 1]. This ensures the error E_r also lies in that range. Hence, the accuracy rate for each torque function M_r is calculated using the following formula:

\text{Accuracy}_r = (1 - E_r) \times 100\%.   (11)
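A direct transcription of Eqs. (9)–(11) is sketched below (an illustration, not from the paper): predictions made in the normalized [0, 1] space are first mapped back to engineering units with the inverse of Eq. (9), and the relative root-mean-square error and accuracy rate are then computed.

import numpy as np

def denormalize(v_star, v_min, v_max):
    # Inverse of Eq. (9): map a normalized value back to its engineering units
    return v_star * (v_max - v_min) + v_min

def accuracy_rate(y_true, y_pred):
    """Validation error E_r of Eq. (10) and accuracy rate of Eq. (11), in percent."""
    rel = (y_true - y_pred) / y_true       # residual divided by the true torque
    E_r = np.sqrt(np.mean(rel ** 2))       # root-mean-square relative error
    return (1.0 - E_r) * 100.0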

4.3. Procedures for tuning hyperparameters

According to Eqs.(5)and (7),it can be noted that the user has to adjust two hyperparameters (g ,s ),where g is the regularization factor and s speci?es the kernel sample variance.Without knowing the best values for these hyperparameters,all estimated engine torque functions couldn’t achieve high generalization.In order to select the best values for these hyperparameters,10-fold cross validation is usually applied but it takes a very long time.Recently,there is a more sophisticated Bayesian frame-work (Suykens et al.,2002;Van Gestel et al.,2001a,b )that can infer the hyperparameter values for LS-SVM.

Given a set of training examples, Bayesian inference is a very robust framework for computing the distribution of the estimated model parameters. Based on the computed distribution of the model parameters, the optimal model parameter values can be predicted. As the theory of using Bayesian inference to predict the hyperparameters γ and σ is beyond the scope of this research, it is not discussed in detail.

The basic idea of the hyperparameter inference procedure using the Bayesian framework (MacKay, 1995; Seeger, 2004; Van Gestel et al., 2001a, b) is based on a modified version of the LS-SVM program in Eq. (12), where μ is now the regularization factor instead of γ, and ζ is the variance of the noise for the residual e_k (assuming constant variance):

\min_{w,b,e} J_P(w,e) = \mu E_W + \zeta E_D \quad \text{s.t.}\ e_k = y_k - [w^{T}\varphi(x_k)+b],\ k = 1,\ldots,N,   (12)

with

E_W = \frac{1}{2}w^{T}w, \qquad E_D = \frac{1}{2}\sum_{k=1}^{N}e_k^{2} = \frac{1}{2}\sum_{k=1}^{N}\left(y_k - [w^{T}\varphi(x_k)+b]\right)^{2},   (13)

whose dual program is the same as Eq. (5), where w ∈ R^{n_h} is the weight vector of the target function and e = [e_1, ..., e_N] is the residual vector. The relationship of γ with μ and ζ is γ = ζ/μ. It should be noted that after substituting Eq. (13) and this relationship for γ into Eq. (12), it directly becomes Eq. (4). Fig. 7 briefly illustrates the algorithm of Bayesian inference for these two hyperparameters based on a certain data set TRAIN_r; this figure is drawn by referring to Van Gestel et al. (2001a). Although the inference procedure is theoretically very complicated, Pelckmans et al. (2003) have provided a MATLAB/C toolbox to handle it.

4.4. Training

The training data is firstly preprocessed using Eq. (9). The hyperparameters (γ, σ), as they appear in Eqs. (5) and (7), for the target torque functions are then inferred. Since there are 15 target torque functions, fifteen individual sets of hyperparameters (γ_r, σ_r) are inferred with respect to r. The detailed inference procedure for a certain training data set TRAIN_r is listed in Fig. 7.
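The Bayesian procedure of Fig. 7 is handled by LS-SVMlab in the authors' setup. As a simpler stand-in (explicitly not the paper's method), the sketch below picks (γ, σ²) by the grid-search 10-fold cross-validation mentioned in Section 4.3, reusing lssvm_train and lssvm_predict from Section 2.4; the candidate grids are assumptions supplied by the caller.

import numpy as np

def select_hyperparameters(X, y, gammas, sigma2s, n_folds=10, seed=0):
    """Grid search with k-fold cross-validation over (gamma, sigma^2) on TRAIN_r."""
    N = len(y)
    folds = np.array_split(np.random.default_rng(seed).permutation(N), n_folds)
    best, best_mse = None, np.inf
    for g in gammas:
        for s2 in sigma2s:
            fold_mse = []
            for idx in folds:
                mask = np.ones(N, dtype=bool)
                mask[idx] = False                      # hold out this fold
                b, alpha = lssvm_train(X[mask], y[mask], g, s2)
                pred = lssvm_predict(X[mask], b, alpha, s2, X[idx])
                fold_mse.append(np.mean((y[idx] - pred) ** 2))
            if np.mean(fold_mse) < best_mse:
                best, best_mse = (g, s2), np.mean(fold_mse)
    return best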

After obtaining the 15 pairs of inferred hyperparameters (γ_MP,r, σ_MP,r), where the subscript MP stands for maximum posterior, the training data set TRAIN_r is used for calculating the support values α and the threshold b in Eq. (5). Finally, the target function M_r can be constructed using Eq. (7).

5. Results

To illustrate the advantages of LS-SVM regression, the results are compared with those obtained from training a multilayer feedforward neural network (MFN) with backpropagation. Since the MFN is similar to SVM and LS-SVM and is also a well-known universal estimator, the results from the MFN can be considered a fairly standard benchmark.

5.1. LS-SVM results

After obtaining all torque functions for an engine, their accuracies are evaluated one by one against their own test sets TEST_r using Eqs. (7) and (8). According to the accuracies shown in Table 2, the predicted results are in good agreement with the actual test results under the hyperparameters (γ_MP,r, σ_MP,r) inferred using the procedure described in Fig. 7. However, it is believed that the function accuracy could be improved by increasing the number of training data. An example of a comparison between the predicted and actual engine torque and horsepower under the same ECU configuration is shown in Fig. 8.


5.2. MFN results

Fifteen neural networks NET_R = {NET_1000, NET_1500, ..., NET_8000} with respect to engine speed r are built based on the fifteen sets of training data TRAIN_r = TR_r ∪ Valid_r. TR_r is used for training the corresponding network NET_r, whereas Valid_r is used as a validation set for early stopping of training so as to provide better network generalization. Every neural network consists of 8 input neurons (the parameters of an engine setup at a certain engine speed r), 1 output neuron (the output torque value T_r), and 50 hidden neurons, which is an initial guess. Normally, 50 hidden neurons can provide enough capability to approximate a highly nonlinear function. The activation function used inside the hidden neurons is the tan-sigmoid transfer function, while for the output neuron a pure linear filter is employed (Fig. 9).

The training method employs the standard backpropagation algorithm (i.e., gradient descent along the negative direction of the gradient) so that the results of the MFN can be considered a standard. The learning rate for the weight updates is set to 0.05. Each network is trained for 1000 epochs. The training results of all NET_r are shown in Table 3. The same test sets TEST_r are also chosen so that the accuracies of the engine torque functions built by LS-SVM and MFN can be compared fairly. The average accuracy of each NET_r shown in Table 3 is calculated using Eqs. (10) and (11).
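For reference, one NET_r of this benchmark can be approximated with the following hedged sketch; the authors used MATLAB, so scikit-learn's MLPRegressor is only a substitute here, with tanh standing in for the tan-sigmoid transfer function and built-in early stopping standing in for the Valid_r split. X_tr_r and y_tr_r are hypothetical, already-normalized arrays for TR_r.

from sklearn.neural_network import MLPRegressor

# One NET_r: 8 inputs -> 50 tanh hidden neurons -> 1 linear output,
# plain gradient-descent backpropagation, learning rate 0.05, 1000 epochs.
net_r = MLPRegressor(hidden_layer_sizes=(50,),
                     activation='tanh',
                     solver='sgd',
                     learning_rate_init=0.05,
                     momentum=0.0,          # plain gradient descent
                     max_iter=1000,
                     early_stopping=True)   # internal validation split, cf. Valid_r
# net_r.fit(X_tr_r, y_tr_r)                 # hypothetical normalized TR_r arrays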

5.3. Comparison of results

With reference to Tables 2 and 3, SVM outperforms the MFN by about 8.08% in overall average accuracy under the same test sets TEST_r. In addition, the issues of hyperparameters and training time have also been compared.

Fig. 7. Inference procedure for the hyperparameters (γ, σ) (Van Gestel et al., 2001a).

In LS-SVM, two hyperparameters (γ_MP, σ_MP) are required. They can be inferred using Bayesian inference, which totally eliminates the user's burden. In the MFN, the learning

rate and the number of hidden neurons are required from the user. Certainly, these parameters can also be found by 10-fold cross-validation. However, the user has to prepare a grid of guessed values for these parameters, and the grid may not cover the best values for the hyperparameters. Therefore, LS-SVM can often produce a better generalization rate than the MFN, as indicated in Tables 2 and 3. The MFN produces less training error than LS-SVM because there is no regularization factor controlling the trade-off between training error and generalization. In contrast, LS-SVM produces better generalization, by about 8.08%, due to the regularization factor γ_MP introduced in the training error function.

Fig. 8. Example of comparison between predicted and actual engine torque and power.

Fig. 9. Architecture (layer diagram) of MFN: an 8-element input p1 feeds a hidden layer of 50 neurons, a1 = tansig(IW1,1 p1 + b1), and a single linear output neuron, y = a2 = purelin(IW2,1 a1 + b2).

Table 3
Training errors and average accuracy of the 15 neural networks

Neural network NET_r    Training error (mean square error) (%)    Average accuracy with test set TEST_r (%)
NET_1000                0.01                                       86.6
NET_1500                0.01                                       85.4
NET_2000                0.03                                       84.3
NET_2500                0.15                                       82.9
NET_3000                0.06                                       83.7
NET_3500                0.14                                       80.3
NET_4000                0.03                                       84.0
NET_4500                0.05                                       78.2
NET_5000                0.10                                       76.2
NET_5500                0.65                                       78.1
NET_6000                0.18                                       81.8
NET_6500                0.25                                       84.2
NET_7000                0.03                                       82.1
NET_7500                0.12                                       84.1
NET_8000                0.19                                       81.7
Overall average         0.13                                       82.24

Table 2
Accuracy of different functions M_r and their corresponding hyperparameter values

Torque function M_r    γ_MP,r    σ_MP,r    Mean square error with training set TRAIN_r (%)    Average accuracy with test set TEST_r (%)
M_1000                 0.28      2.32      0.43                                                91.2
M_1500                 0.31      8.77      0.65                                                91.1
M_2000                 0.22      4.91      0.89                                                90.5
M_2500                 1.14      5.64      0.44                                                91.2
M_3000                 0.59      2.42      0.32                                                91.3
M_3500                 0.74      4.37      0.27                                                91.6
M_4000                 0.98      3.38      0.08                                                92.5
M_4500                 1.33      5.89      1.25                                                84.2
M_5000                 0.10      10.71     2.10                                                81.1
M_5500                 0.49      6.87      1.89                                                83.2
M_6000                 0.59      10.92     1.24                                                88.7
M_6500                 1.23      7.43      0.58                                                90.0
M_7000                 0.43      3.05      0.77                                                91.3
M_7500                 0.75      6.34      0.66                                                90.5
M_8000                 0.61      3.28      0.39                                                90.4
Overall average                            0.80                                                90.32


Another issue is the time required for training. On a 2.4 GHz Pentium 4 PC with 1 GB RAM on board, LS-SVM takes about 10 min to train 200 data points of 8 attributes once. The Bayesian inference for the two hyperparameters takes about 16 min. In other words, fifteen engine torque functions require (10 + 16) × 15 = 390 min of training time. For the MFN, an epoch takes about 10 s ≈ 1/6 min and each network takes 1000 epochs of training. Consequently, it takes about (1000 × 1/6) × 15 = 2500 min for fifteen neural networks. According to this estimation, LS-SVM takes only 15.6% of the training time of the MFN. The major time reduction comes from the one-time error function minimization in LS-SVM as opposed to 1000 epochs in the MFN. Even when LS-SVM is compared with standard SVM, LS-SVM requires less training time because the 10-fold cross-validation for guessing the hyperparameters is eliminated.

6. Conclusions

LS-SVM plus Bayesian inference is applied, for the first time, to produce a set of torque functions for an automotive engine according to different engine speeds. According to Eq. (8), the engine power is calculated from the engine torque. In this research, the torque functions are separately regressed based on fifteen sets of sample data acquired from an automotive engine through a chassis dynamometer. The engine torque functions developed are very useful for vehicle fine tune-up because the effect of any trial ECU setup can be predicted to be a gain or a loss before running the vehicle engine on a dynamometer or road test.

If the engine performance with a trial ECU setup is predicted to be a gain, the vehicle engine is then run on a dynamometer for verification. If the engine performance is predicted to be a loss, the dynamometer test is unnecessary and another engine setup should be made. Hence, the prediction functions can greatly reduce the number of expensive dynamometer tests, which saves not only the time taken for optimal tune-up, but also the large expenditure on fuel, spare parts, lubricants, etc. It is also believed that the functions can let the automotive engineer predict whether his/her new engine setup is a gain or a loss during road tests, where a dynamometer is unavailable.

Moreover, experiments have been done to indicate the accuracy of the torque functions, and the results are highly satisfactory. In comparison with the traditional neural network method, LS-SVM plus Bayesian inference performs better by about 8.08% in overall accuracy under the same test sets, and its training time is approximately 84.4% less than that of the standard neural network.

From the perspective of automotive engineering, the construction of modern automotive gasoline engine power and torque functions using LS-SVM is a new attempt, and this methodology can also be applied to different kinds of vehicle engines.

References

Bishop, C., 1995. Neural Networks for Pattern Recognition. Oxford University Press, New York.

Brace, C., 1998. Prediction of Diesel Engine Exhaust Emission using Artificial Neural Networks. IMechE Seminar S591, Neural Networks in Systems Design, UK.

Cristianini, N., Shawe-Taylor, J., 2000. An Introduction to Support Vector Machines and Other Kernel-based Learning Methods. Cambridge University Press, UK.

Gunn, S., 1998. Support vector machines for classification and regression. ISIS Technical Report ISIS-1-98. Image Speech & Intelligent Systems Research Group, University of Southampton, May 1998, UK.

Harrell, F., 2001. Regression Modelling Strategies with Applications to Linear Models, Logistic Regression, and Survival Analysis. Springer, New York.

Haykin, S., 1999. Neural Networks: A Comprehensive Foundation, second ed. Prentice-Hall, USA.

Liu, Z., Fei, S., 2004. Study of CNG/diesel dual fuel engine's emissions by means of RBF neural network. Journal of Zhejiang University Science 5 (8), 960–965.

MacKay, D., 1995. Probable networks and plausible predictions—a review of practical Bayesian methods for supervised neural networks. Network: Computation in Neural Systems 6, 469–505.

Pelckmans, K., Suykens, J., Van Gestel, T., De Brabanter, J., Lukas, L., Hamers, B., De Moor, B., Vandewalle, J., 2003. LS-SVMlab: a MATLAB/C toolbox for Least Squares Support Vector Machines. Available at http://www.esat.kuleuven.ac.be/sista/lssvmlab

Pyle, D., 1999. Data Preparation for Data Mining. Morgan Kaufmann, USA.

Ryan, T., 1996. Modern Regression Methods. Wiley-Interscience, USA.

Schölkopf, B., Smola, A., 2002. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press, USA.

Seber, G., Wild, C., 2003. Nonlinear Regression, New Edition. Wiley-Interscience, USA.

Seeger, M., 2004. Gaussian processes for machine learning. International Journal of Neural Systems 14 (2), 1–38.

Sen, A., Srivastava, M., 1990. Regression Analysis: Theory, Methods, and Applications. Springer, New York.

Smola, A., Burges, C., Drucker, H., Golowich, S., Van Hemmen, L., Muller, K., Scholkopf, B., Vapnik, V., 1996. Regression estimation with support vector learning machines. Available at http://www.first.gmd.de/~smola

Su, S., Yan, Z., Yuan, G., Cao, Y., Zhou, C., 2002. A method for prediction of in-cylinder compound combustion emissions. Journal of Zhejiang University Science 3 (5), 543–548.

Suykens, J., Van Gestel, T., De Brabanter, J., De Moor, B., Vandewalle, J., 2002. Least Squares Support Vector Machines. World Scientific, Singapore.

Tabachnick, B., Fidell, L., 2001. Using Multivariate Statistics, fourth ed. Allyn and Bacon, USA.

Traver, M., Atkinson, R., Atkinson, C., 1999. Neural network-based diesel engine emissions prediction using in-cylinder combustion pressure. SAE Paper 1999-01-1532.

Van Gestel, T., Suykens, J., De Moor, B., Vandewalle, J., 2001a. Automatic relevance determination for least squares support vector machine classifiers. In: Proceedings of the European Symposium on Artificial Neural Networks (ESANN'2001), Bruges, Belgium, April 2001, pp. 13–18.

Van Gestel, T., Suykens, J., Lambrechts, D., Lanckriet, A., Vandaele, G., De Moor, B., Vandewalle, J., 2001b. Predicting financial time series using least squares support vector machines within the evidence framework. IEEE Transactions on Neural Networks, Special Issue on Financial Engineering 12 (4), 809–821.

Yan, Z., Zhou, C., Su, S., Liu, Z., Wang, X., 2003. Application of neural network in the study of combustion rate of natural gas/diesel dual fuel engine. Journal of Zhejiang University Science 4 (2), 170–174.

