# Things that make me go hmm
This site is built on GitHub/Jekyll.
Graphs are generated dynamically through the R package knitr.
The source code for these posts is available on GitHub.
## Copyright
All the content in this website is licensed under
[CC BY-NC-SA 3.0](http://creativecommons.org/licenses/by-nc-sa/3.0/).
## Boosting Algo
Source: [freakonometrics.hypotheses.org/19874](http://freakonometrics.hypotheses.org/19874)
First, simulate some noisy observations of a sine curve:

```r
n = 300
set.seed(1)
u = sort(runif(n) * 2 * pi)   # predictor: sorted uniform draws on [0, 2*pi]
y = sin(u) + rnorm(n) / 4     # response: sine curve plus Gaussian noise
df = data.frame(x = u, y = y)
plot(df)
```
The weak learners are linear-by-part (piecewise linear) regression models. At each iteration there are 7 parameters to "estimate", the slopes and the knots, and each update is damped by a constant shrinkage parameter.
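To make the weak learner concrete, here is a minimal sketch of a single piecewise-linear fit built with `bs()` from the splines package (`basis` and `fit0` are illustrative names, not part of the original code):

```r
library(splines)

# basis matrix for a piecewise-linear spline: degree 1, df = 3,
# hence df - degree = 2 interior knots chosen from the data
basis = bs(df$x, degree = 1, df = 3)
attr(basis, "knots")    # locations of the two interior knots

fit0 = lm(y ~ bs(x, degree = 1, df = 3), data = df)
length(coef(fit0))      # 4 coefficients: intercept + 3 basis terms
```

The knot locations reported here are the same ones that reappear later in the boundary-knot warning.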
This is the implementation of the algorithm described above. The residuals first have to be initialised with a plain fit; the setup below (with shrinkage v = .05) is reconstructed to be consistent with the `str(YP)` output that follows, which shows 101 columns, i.e. the initial fit plus 100 updates:

```r
library(splines)
v = .05                                   # constant shrinkage parameter
fit = lm(y ~ bs(x, degree = 1, df = 3), data = df)
yp = predict(fit, newdata = df)
df$yr = df$y - v * yp                     # initial residuals
YP = v * yp                               # first shrunken contribution
for(t in 1:100){
  fit = lm(yr ~ bs(x, degree = 1, df = 3), data = df)  # fit the current residuals
  yp = predict(fit, newdata = df)
  df$yr = df$yr - v * yp                  # shrink and update the residuals
  YP = cbind(YP, v * yp)                  # store this round's contribution
}
```
```r
str(YP)
## num [1:300, 1:101] 0.0215 0.0215 0.0226 0.023 0.0238 ...
##  - attr(*, "dimnames")=List of 2
##   ..$ : chr [1:300] "1" "2" "3" "4" ...
##   ..$ : chr [1:101] "YP" "" "" "" ...
```

The 101 columns are the initial shrunken fit plus the 100 boosting updates.
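Since each column of YP holds one shrunken contribution, the boosted prediction after M rounds is simply the row-wise sum of the first M columns. As a quick illustrative check (`yhat` is not a name from the original post):

```r
# full boosted prediction: sum all 101 shrunken contributions per observation
yhat = rowSums(YP)
mean((df$y - yhat)^2)   # in-sample mean squared error of the boosted fit
```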
The red line is the boosted fit after M rounds, the sum of the first M shrunken updates. The blue line is the fit from a single call of the regression function, without boosting: inside `viz`, `lm()` picks up the `y` column of `df`, not the local boosted vector. The dotted line is the true model.
```r
nd = data.frame(x = seq(0, 2 * pi, by = .01))    # grid for plotting
viz = function(M){
  if(M == 1) y = YP[, 1]
  if(M > 1)  y = apply(YP[, 1:M], 1, sum)        # boosted fit after M rounds
  plot(df$x, df$y, ylab = "", xlab = "")
  lines(df$x, y, type = "l", col = "red", lwd = 3)     # boosted fit
  fit = lm(y ~ bs(x, degree = 1, df = 3), data = df)   # single fit on df$y
  yp = predict(fit, newdata = nd)
  lines(nd$x, yp, type = "l", col = "blue", lwd = 3)   # fit without boosting
  lines(nd$x, sin(nd$x), lty = 2)                      # true model
}
viz(50)
```
```
## Warning in bs(x, degree = 1L, knots = structure(c(2.08092116216283,
## 4.02645437093874: some 'x' values beyond boundary knots may cause ill-
## conditioned bases
```
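The warning arises because `nd` spans the full interval [0, 2π] while the boundary knots of `bs()` are the minimum and maximum of the observed `x` values, which fall strictly inside that interval; evaluating the basis beyond the boundary knots can be ill-conditioned. One way to avoid it, sketched here rather than taken from the original post, is to restrict the prediction grid to the observed range:

```r
# keep the prediction grid inside the range of the training data,
# so bs() is never evaluated beyond its boundary knots
nd = data.frame(x = seq(min(df$x), max(df$x), length.out = 500))
viz(50)   # same plot, without the boundary-knot warning
```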