This site is based on GitHub/Jekyll.
Graphs are generated dynamically with the R package knitr.
The source code for these posts is available on GitHub.

Latest posts

Boosting Algo

2015-06-28

[1] http://freakonometrics.hypotheses.org/19874

n=300
set.seed(1)
u=sort(runif(n)*2*pi)     # design points on [0, 2*pi]
y=sin(u)+rnorm(n)/4       # noisy sine curve
df=data.frame(x=u,y=y)
plot(df)

The weak learners are linear-by-part (piecewise linear) regression models.

At each iteration, there are 7 parameters to 'estimate', the slopes and the nodes. Consider some constant shrinkage parameter v.
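
As a quick illustration (not in the original code), one can look at the degree-one B-spline basis used by these weak learners: with df=3 in splines::bs, the basis has 3 columns, with two interior knots (the 'nodes') chosen at quantiles of x.

library(splines)
B=bs(df$x,degree=1,df=3)   # piecewise-linear basis with 3 columns
attr(B,"knots")            # the two interior knots (the 'nodes')
dim(B)                     # 300 observations, 3 basis functions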

This is the implementation of the algorithm described above. Note that the loop relies on a shrinkage parameter v, a column of residuals df$yr, and a matrix of shrunk predictions YP, all of which must be initialised before the loop; that initialisation is not shown in this extract.
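
A minimal sketch of that initialisation, assuming a shrinkage value of v = 0.05 (the value actually used is not given in this extract), could be:

library(splines)
v=.05                                   # shrinkage parameter (assumed value)
fit=lm(y~bs(x,degree=1,df=3),data=df)   # first weak learner, fitted on y
yp=predict(fit,newdata=df)
df$yr=df$y-v*yp                         # residuals after the first shrunk fit
YP=v*yp                                 # first column of shrunk predictions

With these objects in place, the boosting loop is the following: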

for(t in 1:100){
  fit=lm(yr~bs(x,degree=1,df=3),data=df)   # weak learner fitted to the current residuals
  yp=predict(fit,newdata=df)
  df$yr=df$yr - v*yp                       # update the residuals
  YP=cbind(YP,v*yp)                        # store the shrunk predictions
}
str(YP)
##  num [1:300, 1:101] 0.0215 0.0215 0.0226 0.023 0.0238 ...
##  - attr(*, "dimnames")=List of 2
##   ..$ : chr [1:300] "1" "2" "3" "4" ...
##   ..$ : chr [1:101] "YP" "" "" "" ...

The red line is our initial guess, without boosting, obtained from a single call to the regression function. The blue one is the fit obtained with boosting. The dotted line is the true model.

nd=data.frame(x=seq(0,2*pi,by=.01))       # fine grid on [0, 2*pi] for plotting
viz=function(M){
  if(M==1) y=YP[,1]
  if(M>1)  y=apply(YP[,1:M],1,sum)        # sum of the first M shrunk predictions
  plot(df$x,df$y,ylab="",xlab="")
  lines(df$x,y,type="l",col="red",lwd=3)
  fit=lm(y~bs(x,degree=1,df=3),data=df)
  yp=predict(fit,newdata=nd)
  lines(nd$x,yp,type="l",col="blue",lwd=3)
  lines(nd$x,sin(nd$x),lty=2)             # dotted line: the true model sin(x)
}

viz(50)
## Warning in bs(x, degree = 1L, knots = structure(c(2.08092116216283,
## 4.02645437093874: some 'x' values beyond boundary knots may cause ill-
## conditioned bases
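
Calling the same function with other values of M (for instance 1, 10, and 100, arbitrary choices shown here for illustration) shows how the boosted fit evolves with the number of iterations:

viz(1)
viz(10)
viz(100)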

## Copyright

All the content on this website is licensed under [CC BY-NC-SA 3.0](http://creativecommons.org/licenses/by-nc-sa/3.0/).