
Error Estimation of Regression Problem in Learning Theory

Chinese Journal of Applied Probability and Statistics, Vol. 26, No. 3, June 2010

LI Ping (Department of Mathematics, Shaoxing College of Arts and Sciences, Shaoxing, 312000)

Abstract: In this paper, under a general loss function $\psi(y - f(x))$, when $\psi(x)$ is a continuous function, error estimation of the regression problem is discussed.

Keywords: Error estimation, regression, learning theory, target function, empirical risk minimization.

AMS Subject Classification: 68T05, 62J02.

Supported by grants 60873206 and 90818020.

§1. Introduction

In recent years, learning theory has been discussed extensively in the literature. We first give some basic concepts of statistical learning theory.

Let $X$ be a compact metric space and $Y = \mathbb{R}$. Let $\rho$ be an unknown probability distribution on $Z = X \times Y$ and let $z = \{(x_i, y_i)\}_{i=1}^m$ be a sample drawn independently according to $\rho$.

Let $C(X)$ be the Banach space of continuous functions on $X$ with the norm $\|f\|_\infty = \sup_{x \in X} |f(x)|$, and let $\mathcal{H}$ be a compact subset of $C(X)$.

The goal of the regression problem is to find a function $f: X \to Y$ (based on the samples) which predicts the value of $y$ given $x$. The possible choices of functions all lie in a function class $\mathcal{H}$, called the hypothesis space for the regression problem.

A quantitative measure of how accurately a function $f \in \mathcal{H}$ approximates $y$ is given by a loss function $\psi(y - f(x))$.

The error (risk) associated with the loss function of $f$ is defined by
$$\mathcal{E}(f) = \int_Z \psi(y - f(x))\, d\rho.$$

The target function $f_{\mathcal{H}}$ in $\mathcal{H}$ is defined to be
$$f_{\mathcal{H}} = \arg\min_{f \in \mathcal{H}} \int_Z \psi(y - f(x))\, d\rho.$$

The empirical error (risk) associated with the loss function is defined to be
$$\mathcal{E}_z(f) = \frac{1}{m} \sum_{i=1}^m \psi(y_i - f(x_i)).$$

The empirical target function $f_z$ associated with the samples $z = \{(x_i, y_i)\}_{i=1}^m \in Z^m$ is defined to be
$$f_z = \arg\min_{f \in \mathcal{H}} \mathcal{E}_z(f)$$
(see [1]).

The main idea of statistical learning theory is Empirical Risk Minimization. Given a training set $z = \{z_i\}_{i=1}^m$ and a function space $\mathcal{H}$, we define Empirical Risk Minimization (ERM) to be a symmetric procedure that selects a function minimizing the empirical risk over all functions $f \in \mathcal{H}$, that is,
$$f_z = \arg\min_{f \in \mathcal{H}} \mathcal{E}_z(f)$$
(see [2]). The theory of ERM studies how well $f_z$ approximates the target function $f_{\mathcal{H}}$ with respect to the error.

The covering number plays a big role in learning theory. For a subset $\mathcal{H}$ of a metric space and $\eta > 0$, the covering number $\mathcal{N}(\mathcal{H}, \eta)$ is defined to be the minimal integer $N \in \mathbb{N}$ such that there exist $N$ disks with radius $\eta$ covering $\mathcal{H}$. For a subset $\mathcal{H}$ of the set $C(X)$ of continuous functions on $X$, we denote by $\mathcal{N}(\mathcal{H}, \eta)$ the covering number of $\mathcal{H}$ (see [1]).

The sample error is defined as $\mathcal{E}(f_z) - \mathcal{E}(f_{\mathcal{H}})$.

By Bernstein inequalities, under the square loss function $\psi(y - f(x)) = (y - f(x))^2$, [1] gave the following result.

Theorem C [1]. Let $\mathcal{H}$ be a compact subset of $C(X)$ and $M_0 > 0$ be a constant. Assume that, for all $f \in \mathcal{H}$, $|f(x) - y| \le M_0$ almost everywhere. Let $\sigma^2 = \sup_{f \in \mathcal{H}} \sigma^2(f_Y)$, where $\sigma^2(f_Y)$ is the variance of $f_Y(x, y) = (f(x) - y)^2$. Then, for all $\varepsilon > 0$,
$$P_{z \in Z^m}\Big\{ \sup_{f \in \mathcal{H}} |\mathcal{E}(f) - \mathcal{E}_z(f)| \le \varepsilon \Big\} \ge 1 - \mathcal{N}\Big(\mathcal{H}, \frac{\varepsilon}{8M_0}\Big) \exp\Big\{ -\frac{m\varepsilon^2}{8(4\sigma^2 + \frac{1}{3}M_0^2 \varepsilon)} \Big\}.$$

In the present paper, under a general loss function $\psi(y - f(x))$, when $\psi(x)$ is a continuous function, error estimation of the regression problem is discussed.

§2. Main Results

Let $\mathcal{H}$ be a compact subset of $C(X)$. Since
$$\mathcal{E}(f_z) - \mathcal{E}(f_{\mathcal{H}}) = \mathcal{E}(f_z) - \mathcal{E}_z(f_z) + \mathcal{E}_z(f_z) - \mathcal{E}_z(f_{\mathcal{H}}) + \mathcal{E}_z(f_{\mathcal{H}}) - \mathcal{E}(f_{\mathcal{H}})$$
and $f_z = \arg\min_{f \in \mathcal{H}} \mathcal{E}_z(f)$, we have $\mathcal{E}_z(f_z) - \mathcal{E}_z(f_{\mathcal{H}}) \le 0$, hence
$$\mathcal{E}(f_z) - \mathcal{E}(f_{\mathcal{H}}) \le \mathcal{E}(f_z) - \mathcal{E}_z(f_z) + \mathcal{E}_z(f_{\mathcal{H}}) - \mathcal{E}(f_{\mathcal{H}}) \le \sup_{f \in \mathcal{H}}(\mathcal{E}(f) - \mathcal{E}_z(f)) + \sup_{f \in \mathcal{H}}(\mathcal{E}_z(f) - \mathcal{E}(f)).$$
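Before turning to the lemma and the theorems, the following minimal sketch illustrates the objects just defined: the empirical risk $\mathcal{E}_z$, the empirical target function $f_z$, and the gap between $\mathcal{E}_z(f_z)$ and a Monte Carlo estimate of $\mathcal{E}(f_z)$. The hypothesis class (a finite grid of constant functions), the continuous loss $\psi(t) = |t|$ and the data distribution are assumptions made only for this example; none of them come from the paper.

```python
import numpy as np

# A minimal ERM sketch over a finite hypothesis class (illustrative assumptions only):
# H is a grid of constant functions f_c(x) = c, the loss is psi(t) = |t|, and the
# data distribution is an arbitrary choice.
rng = np.random.default_rng(0)

def psi(t):
    return np.abs(t)                        # a continuous loss psi: R -> R+

hypotheses = np.linspace(-1.0, 2.0, 61)     # finite stand-in for the hypothesis space H

def sample(m):
    x = rng.uniform(0.0, 1.0, m)
    y = np.sin(2 * np.pi * x) + 0.5 + 0.1 * rng.standard_normal(m)
    return x, y

def empirical_risk(c, x, y):
    return np.mean(psi(y - c))              # E_z(f_c) = (1/m) * sum_i psi(y_i - f_c(x_i))

x, y = sample(200)
risks = np.array([empirical_risk(c, x, y) for c in hypotheses])
c_z = hypotheses[np.argmin(risks)]          # empirical target function f_z

# Monte Carlo estimate of the true risk E(f_z) on a large independent sample
x_new, y_new = sample(200_000)
print(f"f_z = constant {c_z:.3f};  E_z(f_z) ~ {risks.min():.4f};  E(f_z) ~ {empirical_risk(c_z, x_new, y_new):.4f}")
```

The theorems below quantify exactly this kind of gap between $\mathcal{E}_z(f)$ and $\mathcal{E}(f)$, uniformly over the hypothesis space.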
To prove the following theorems, we first introduce a lemma.

Lemma [5] (The bounded differences inequality (McDiarmid)). Let $X_1, X_2, \ldots, X_m$ be independent random variables taking values in a set $\Omega$ and assume that $f: \Omega^m \to \mathbb{R}$ satisfies the bounded differences condition, that is, for every $1 \le i \le m$,
$$\sup_{x_1, \ldots, x_m, x_i' \in \Omega} |f(x_1, \ldots, x_m) - f(x_1, \ldots, x_{i-1}, x_i', x_{i+1}, \ldots, x_m)| \le c_i.$$
Then for every $t > 0$,
$$P\{f(X_1, \ldots, X_m) - \mathbb{E}f(X_1, \ldots, X_m) \ge t\} \le \exp\Big\{-\frac{2t^2}{\sum_{i=1}^m c_i^2}\Big\},$$
$$P\{\mathbb{E}f(X_1, \ldots, X_m) - f(X_1, \ldots, X_m) \ge t\} \le \exp\Big\{-\frac{2t^2}{\sum_{i=1}^m c_i^2}\Big\}.$$

Theorem 2.1. Let $\mathcal{H}$ be a subset of $C(X)$, $M > 0$ be a constant and $\psi: \mathbb{R} \to \mathbb{R}_+$ be a loss function. If every $f \in \mathcal{H}$ satisfies $|\psi(y - f(x)) - \psi(y' - f(x'))| \le M$ for all $(x, y), (x', y') \in X \times Y$, then for every $\varepsilon > 0$ and every $f \in \mathcal{H}$ there holds
$$P_{z \in Z^m}\{|\mathcal{E}(f) - \mathcal{E}_z(f)| \le \varepsilon\} \ge 1 - 2\exp\Big\{-\frac{2m\varepsilon^2}{M^2}\Big\}.$$

Proof. Set $\xi(z_i) = \psi(y_i - f(x_i))$, so that $\mathcal{E}_z(f) = \frac{1}{m}(\xi(z_1) + \cdots + \xi(z_m))$. Since
$$\sup_{z_1, \ldots, z_m, z_i' \in X \times Y} \Big|\frac{1}{m}(\xi(z_1) + \cdots + \xi(z_i) + \cdots + \xi(z_m)) - \frac{1}{m}(\xi(z_1) + \cdots + \xi(z_i') + \cdots + \xi(z_m))\Big|$$
$$= \sup_{z_i, z_i' \in X \times Y} \frac{1}{m}|\xi(z_i) - \xi(z_i')| = \sup_{z_i, z_i' \in X \times Y} \frac{1}{m}|\psi(y_i - f(x_i)) - \psi(y_i' - f(x_i'))| \le \frac{M}{m},$$
applying the Lemma with $c_i = M/m$ (and noting that $\mathbb{E}\,\mathcal{E}_z(f) = \mathcal{E}(f)$), for every $\varepsilon > 0$ we have
$$P_{z \in Z^m}\{|\mathcal{E}_z(f) - \mathcal{E}(f)| \ge \varepsilon\} \le 2\exp\Big\{-\frac{2m\varepsilon^2}{M^2}\Big\}.$$
Hence
$$P_{z \in Z^m}\{|\mathcal{E}(f) - \mathcal{E}_z(f)| \le \varepsilon\} \ge 1 - 2\exp\Big\{-\frac{2m\varepsilon^2}{M^2}\Big\}. \qquad \Box$$

Theorem 2.2. Let $\mathcal{H}$ be a compact subset of $C(X)$, $M > 0$ be a constant and $\psi: \mathbb{R} \to \mathbb{R}_+$ be a loss function. Suppose that $\psi$ is a continuous function and that every $f \in \mathcal{H}$ satisfies $|\psi(y - f(x)) - \psi(y' - f(x'))| \le M$. Then for every $\varepsilon > 0$ there exists $\eta > 0$ such that
$$P_{z \in Z^m}\Big\{\sup_{f \in \mathcal{H}} |\mathcal{E}(f) - \mathcal{E}_z(f)| \le \varepsilon\Big\} \ge 1 - 2\mathcal{N}(\mathcal{H}, \eta)\exp\Big\{-\frac{m\varepsilon^2}{2M^2}\Big\}.$$

Proof. Since $\psi$ is continuous, for every $\varepsilon > 0$ there exists $\eta > 0$ such that $\|f - f_j\|_\infty \le \eta$ implies
$$|\psi(y - f(x)) - \psi(y - f_j(x))| \le \frac{\varepsilon}{4}.$$
Let $N = \mathcal{N}(\mathcal{H}, \eta)$ and choose $\{f_j\}_{j=1}^N \subset \mathcal{H}$ such that for every $f \in \mathcal{H}$ there is $j \in \{1, 2, \ldots, N\}$ satisfying $\|f - f_j\|_\infty \le \eta$.

Hence
$$|\mathcal{E}(f) - \mathcal{E}(f_j)| = \Big|\int_Z [\psi(y - f(x)) - \psi(y - f_j(x))]\, d\rho\Big| \le \int_Z |\psi(y - f(x)) - \psi(y - f_j(x))|\, d\rho \le \frac{\varepsilon}{4}$$
and
$$|\mathcal{E}_z(f) - \mathcal{E}_z(f_j)| = \Big|\frac{1}{m}\sum_{i=1}^m [\psi(y_i - f(x_i)) - \psi(y_i - f_j(x_i))]\Big| \le \frac{1}{m}\sum_{i=1}^m |\psi(y_i - f(x_i)) - \psi(y_i - f_j(x_i))| \le \frac{\varepsilon}{4}.$$
Thus
$$|\mathcal{E}(f) - \mathcal{E}_z(f)| \le |\mathcal{E}(f) - \mathcal{E}(f_j)| + |\mathcal{E}(f_j) - \mathcal{E}_z(f_j)| + |\mathcal{E}_z(f_j) - \mathcal{E}_z(f)| \le \frac{\varepsilon}{2} + |\mathcal{E}(f_j) - \mathcal{E}_z(f_j)|.$$
Therefore
$$\Big\{z \in Z^m: \sup_{f \in \mathcal{H}} |\mathcal{E}(f) - \mathcal{E}_z(f)| \ge \varepsilon\Big\} \subset \bigcup_{j=1}^N \Big\{z \in Z^m: |\mathcal{E}(f_j) - \mathcal{E}_z(f_j)| \ge \frac{\varepsilon}{2}\Big\}.$$
It follows that
$$P_{z \in Z^m}\Big\{\sup_{f \in \mathcal{H}} |\mathcal{E}(f) - \mathcal{E}_z(f)| \ge \varepsilon\Big\} \le \sum_{j=1}^N P_{z \in Z^m}\Big\{|\mathcal{E}(f_j) - \mathcal{E}_z(f_j)| \ge \frac{\varepsilon}{2}\Big\}.$$

Set $\xi(z_i) = \psi(y_i - f_j(x_i))$. As in the proof of Theorem 2.1,
$$\sup_{z_1, \ldots, z_m, z_i' \in X \times Y} \Big|\frac{1}{m}(\xi(z_1) + \cdots + \xi(z_i) + \cdots + \xi(z_m)) - \frac{1}{m}(\xi(z_1) + \cdots + \xi(z_i') + \cdots + \xi(z_m))\Big| \le \frac{M}{m}.$$
Applying the Lemma, we have
$$P_{z \in Z^m}\Big\{|\mathcal{E}(f_j) - \mathcal{E}_z(f_j)| \ge \frac{\varepsilon}{2}\Big\} \le 2\exp\Big\{-\frac{2m(\varepsilon/2)^2}{M^2}\Big\} = 2\exp\Big\{-\frac{m\varepsilon^2}{2M^2}\Big\}.$$
Consequently, for every $\varepsilon > 0$ there holds
$$P_{z \in Z^m}\Big\{\sup_{f \in \mathcal{H}} |\mathcal{E}(f) - \mathcal{E}_z(f)| \le \varepsilon\Big\} \ge 1 - 2\mathcal{N}(\mathcal{H}, \eta)\exp\Big\{-\frac{m\varepsilon^2}{2M^2}\Big\}. \qquad \Box$$

Theorem 2.3. Let $\mathcal{H}$ be a compact subset of $C(X)$, $M > 0$ be a constant and $\psi: \mathbb{R} \to \mathbb{R}_+$ be a loss function. Suppose that $\psi$ is a continuous function and that every $f \in \mathcal{H}$ satisfies $|\psi(y - f(x)) - \psi(y' - f(x'))| \le M$. Then for every $\varepsilon > 0$ there exists $\eta > 0$ such that
$$P_{z \in Z^m}\{\mathcal{E}(f_z) - \mathcal{E}(f_{\mathcal{H}}) \le \varepsilon\} \ge 1 - 2\mathcal{N}(\mathcal{H}, \eta)\exp\Big\{-\frac{m\varepsilon^2}{8M^2}\Big\}.$$

Proof. Since $\psi$ is continuous, for every $\varepsilon > 0$ there exists $\eta > 0$ such that $\|f - f_j\|_\infty \le \eta$ implies
$$|\psi(y - f(x)) - \psi(y - f_j(x))| \le \frac{\varepsilon}{8}.$$
Let $N = \mathcal{N}(\mathcal{H}, \eta)$ and choose $\{f_j\}_{j=1}^N \subset \mathcal{H}$ such that for every $f \in \mathcal{H}$ there is $j \in \{1, 2, \ldots, N\}$ satisfying $\|f - f_j\|_\infty \le \eta$.

Hence
$$\mathcal{E}(f) - \mathcal{E}(f_j) = \int_Z [\psi(y - f(x)) - \psi(y - f_j(x))]\, d\rho \le \frac{\varepsilon}{8}, \qquad \mathcal{E}_z(f_j) - \mathcal{E}_z(f) = \frac{1}{m}\sum_{i=1}^m [\psi(y_i - f_j(x_i)) - \psi(y_i - f(x_i))] \le \frac{\varepsilon}{8}.$$
Thus
$$\mathcal{E}(f) - \mathcal{E}_z(f) = [\mathcal{E}(f) - \mathcal{E}(f_j)] + [\mathcal{E}(f_j) - \mathcal{E}_z(f_j)] + [\mathcal{E}_z(f_j) - \mathcal{E}_z(f)] \le \frac{\varepsilon}{4} + \mathcal{E}(f_j) - \mathcal{E}_z(f_j).$$
Therefore
$$\Big\{z \in Z^m: \sup_{f \in \mathcal{H}} (\mathcal{E}(f) - \mathcal{E}_z(f)) \ge \frac{\varepsilon}{2}\Big\} \subset \bigcup_{j=1}^N \Big\{z \in Z^m: \mathcal{E}(f_j) - \mathcal{E}_z(f_j) \ge \frac{\varepsilon}{4}\Big\}.$$
It follows that
$$P_{z \in Z^m}\Big\{\sup_{f \in \mathcal{H}} (\mathcal{E}(f) - \mathcal{E}_z(f)) \ge \frac{\varepsilon}{2}\Big\} \le \sum_{j=1}^N P_{z \in Z^m}\Big\{\mathcal{E}(f_j) - \mathcal{E}_z(f_j) \ge \frac{\varepsilon}{4}\Big\}.$$

Set $\xi(z_i) = \psi(y_i - f_j(x_i))$. As before,
$$\sup_{z_1, \ldots, z_m, z_i' \in X \times Y} \Big|\frac{1}{m}(\xi(z_1) + \cdots + \xi(z_i) + \cdots + \xi(z_m)) - \frac{1}{m}(\xi(z_1) + \cdots + \xi(z_i') + \cdots + \xi(z_m))\Big| \le \frac{M}{m}.$$
Applying the Lemma, we have
$$P_{z \in Z^m}\Big\{\mathcal{E}(f_j) - \mathcal{E}_z(f_j) \ge \frac{\varepsilon}{4}\Big\} \le \exp\Big\{-\frac{2m(\varepsilon/4)^2}{M^2}\Big\} = \exp\Big\{-\frac{m\varepsilon^2}{8M^2}\Big\},$$
$$P_{z \in Z^m}\Big\{\mathcal{E}_z(f_j) - \mathcal{E}(f_j) \ge \frac{\varepsilon}{4}\Big\} \le \exp\Big\{-\frac{2m(\varepsilon/4)^2}{M^2}\Big\} = \exp\Big\{-\frac{m\varepsilon^2}{8M^2}\Big\}.$$
Hence
$$P_{z \in Z^m}\Big\{\sup_{f \in \mathcal{H}} (\mathcal{E}(f) - \mathcal{E}_z(f)) \ge \frac{\varepsilon}{2}\Big\} \le \mathcal{N}(\mathcal{H}, \eta)\exp\Big\{-\frac{m\varepsilon^2}{8M^2}\Big\}, \qquad P_{z \in Z^m}\Big\{\sup_{f \in \mathcal{H}} (\mathcal{E}_z(f) - \mathcal{E}(f)) \ge \frac{\varepsilon}{2}\Big\} \le \mathcal{N}(\mathcal{H}, \eta)\exp\Big\{-\frac{m\varepsilon^2}{8M^2}\Big\}.$$
Thus, by the decomposition at the beginning of this section,
$$P_{z \in Z^m}\{\mathcal{E}(f_z) - \mathcal{E}(f_{\mathcal{H}}) \ge \varepsilon\} \le P_{z \in Z^m}\Big(\Big\{\sup_{f \in \mathcal{H}} (\mathcal{E}(f) - \mathcal{E}_z(f)) \ge \frac{\varepsilon}{2}\Big\} \cup \Big\{\sup_{f \in \mathcal{H}} (\mathcal{E}_z(f) - \mathcal{E}(f)) \ge \frac{\varepsilon}{2}\Big\}\Big) \le 2\mathcal{N}(\mathcal{H}, \eta)\exp\Big\{-\frac{m\varepsilon^2}{8M^2}\Big\}.$$
Consequently, for every $\varepsilon > 0$ there holds
$$P_{z \in Z^m}\{\mathcal{E}(f_z) - \mathcal{E}(f_{\mathcal{H}}) \le \varepsilon\} \ge 1 - 2\mathcal{N}(\mathcal{H}, \eta)\exp\Big\{-\frac{m\varepsilon^2}{8M^2}\Big\}. \qquad \Box$$
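As an illustration of Theorem 2.1, the following small simulation checks the bound for a single fixed hypothesis. Every modelling choice in it is an assumption made only for the example and is not from the paper: $x$ and $y$ are taken uniform on $[0,1]$, the loss is the square loss, the fixed hypothesis is $f(x) = 0.5$, and consequently $\psi(y - f(x))$ ranges over $[0, 1/4]$, so its oscillation bound is $M = 1/4$.

```python
import numpy as np

# Illustrative check of the Theorem 2.1 bound for one fixed hypothesis f.
# Assumed setting: x, y ~ Uniform[0,1] independently, psi(t) = t^2, f(x) = 0.5,
# so psi(y - f(x)) lies in [0, 1/4] and its oscillation is at most M = 1/4.
rng = np.random.default_rng(1)

m, eps, trials = 100, 0.02, 10_000
M = 0.25

def psi(t):
    return t * t

def f(x):
    return 0.5 * np.ones_like(x)

# Monte Carlo estimate of the true risk E(f) = E[psi(y - f(x))] (exact value is 1/12)
xb, yb = rng.uniform(0, 1, 1_000_000), rng.uniform(0, 1, 1_000_000)
true_risk = np.mean(psi(yb - f(xb)))

deviations = np.empty(trials)
for t in range(trials):
    x, y = rng.uniform(0, 1, m), rng.uniform(0, 1, m)
    deviations[t] = abs(np.mean(psi(y - f(x))) - true_risk)   # |E_z(f) - E(f)|

observed = np.mean(deviations > eps)                # frequency of a deviation larger than eps
bound = 2 * np.exp(-2 * m * eps**2 / M**2)          # Theorem 2.1 / McDiarmid bound
print(f"observed tail ~ {observed:.4f}, Theorem 2.1 bound = {bound:.4f}")
```

With these assumed numbers the observed tail frequency is far below the bound, which is to be expected: the bounded differences inequality only uses the range $M$ of the loss, not its actual variance.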
In this paper we obtain three theorems; we now compare them with Theorem C [1]. First, the conditions of the three theorems are weaker than the conditions in Theorem C [1]. Second, in Theorem C, if $M_0 > 0$ is a very large constant, then the result of Theorem C can be useless. However, in the present paper we may choose a suitable loss function $\psi(y - f(x))$ so that $M > 0$ is a very small constant, and then the results of the theorems in this paper remain useful.

Acknowledgements. The author would like to thank Professor Wang Jing Long for his assistance.

References

[1] Cucker, F. and Smale, S., On the mathematical foundations of learning, Bull. Amer. Math. Soc., 39(2001), 1-49.
[2] Vapnik, V., Statistical Learning Theory, John Wiley and Sons, New York, 1998.
[3] De Vito, E., Caponnetto, A. and Rosasco, L., Model selection for regularized least-squares algorithm in learning theory, Found. Comput. Math., 5(2005), 59-85.
[4] Devroye, L., Györfi, L. and Lugosi, G., A Probabilistic Theory of Pattern Recognition, Springer-Verlag, New York, 1996.
[5] McDiarmid, C., On the method of bounded differences, in Surveys in Combinatorics, Cambridge University Press, Cambridge, 1989.
[6] Talagrand, M., Sharper bounds for Gaussian and empirical processes, Annals of Probability, 22(1994), 28-76.
[7] Wu, Q., Ying, Y. and Zhou, D.X., Learning theory: from regression to classification, Studies in Computational Mathematics, 12(2006), 257-290.
[8] DeVore, R., Kerkyacharian, G., Picard, D. and Temlyakov, V., Approximation methods for supervised learning, Found. Comput. Math., 6(2006), 3-58.

Error Estimation of Regression Problem in Learning Theory

LI Ping (Department of Mathematics, Shaoxing College of Arts and Sciences, Shaoxing, 312000)

Abstract: In this paper, under a general loss function $\psi(y - f(x))$, when $\psi(x)$ is continuous, the error estimation of the regression problem in learning theory is discussed.

Keywords: error estimation, regression, learning theory, target function, empirical risk minimization.

Subject Classification: O211.4.
