The purpose of this post is to mention the Julia language, a new language for technical computing. Its main strength is speed: it runs faster than R, MATLAB, etc. The code is compiled just-in-time. On the back end it uses, amongst other things, LAPACK and ARPACK.
So check out http://julialang.org/
Wednesday, March 28, 2012
Saturday, March 24, 2012
R Programming Syntax Quickstart
If you have ANY programming experience in other languages, this guide will get you started in R very quickly.
Logic Operators
a == b | a equals b |
a != b | a is not equal to b |
a > b | a is greater than b |
a < b | a is less than b |
a >= b | a is greater than OR equal to b |
a <= b | a is less than OR equal to b |
(condition 1) & (condition 2) | (condition 1) AND (condition 2) |
(condition 1) | (condition 2) | (condition 1) OR (condition 2) |
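Also, try the following to understand "&&" and "||". The single forms "&" and "|" compare vectors element by element, while "&&" and "||" work on single values. Older versions of R silently used only the first element of a longer vector; since R 4.3.0 that is an error, so the examples below pass single elements to "&&" and "||".
> a <- c(1:10)
> b <- a
> c <- b
> c[1:4] <- .5
> (a == b) & (a > c)
[1] TRUE TRUE TRUE TRUE FALSE FALSE FALSE FALSE FALSE FALSE
> (a == b) | (a > c)
[1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
> (a[1] == b[1]) && (a[1] > c[1])
[1] TRUE
> (a[5] == b[5]) && (a[5] > c[5])
[1] FALSE
> (a[5] == b[5]) || (a[5] > c[5])
[1] TRUE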
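An if() statement needs a single TRUE or FALSE, so an element-wise result usually has to be collapsed first. A minimal sketch of that step using the base R functions all() and any(); the variable names elementwise, all.true and any.true are made up for this illustration:

```r
a <- 1:10
b <- a
c <- b
c[1:4] <- 0.5

# Element-wise comparison: one TRUE/FALSE per position.
elementwise <- (a == b) & (a > c)

# Collapse the vector to a single value before using it in if().
all.true <- all(elementwise)  # TRUE only if every element is TRUE
any.true <- any(elementwise)  # TRUE if at least one element is TRUE

if (any.true && !all.true) {
  print("the condition holds for some elements, but not for all")
}
```

Here the condition holds for the first four elements only, so the message is printed.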
IF statements
The general example:
if( condition ) {
} else if( other condition ){
} else {
}
The specific example:
a <- 55
if (a <= 54.9) {
  print("a is less than or equal to 54.9")
} else if (a == 55) {
  print("a equals 55")
} else {
  print("a is greater than 54.9 and not 55")
}
For Loops
The general example:
for(variable in vector) {
}
Specific examples:
#example 1
for (i in 1:10) {
  print(i)
}
#example 2
index.vector <- c(4, 3, 7, 5)
numberz <- runif(10)
print(numberz)
for (i in index.vector) {
  print(numberz[i])
}
#example 3
for (i in 1:10) {
  if (i == 3) {
    next
  } else if (i == 7) {
    break
  }
  print(i)
}
#example 4
mat <- matrix(0, 3, 4)
print(mat)
for (i in 1:3) {
  for (j in 1:4) {
    mat[i, j] <- rnorm(1)
  }
}
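One pitfall when looping over the positions of a vector: 1:length(x) counts down from 1 to 0 when x is empty, so the loop body still runs twice. A small sketch of the usual workaround, the base R function seq_along(); the variables values, empty and count are made up for this illustration:

```r
values <- c(0.2, 0.5, 0.9)

# seq_along(values) gives 1, 2, 3: one index per element.
for (i in seq_along(values)) {
  cat(i, values[i], "\n")
}

# With an empty vector, seq_along() gives nothing to loop over,
# whereas 1:length(empty) would give c(1, 0).
empty <- numeric(0)
count <- 0
for (i in seq_along(empty)) {
  count <- count + 1
}
print(count)
```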
While Loops
General Example:
while(condition) {
}
Note that you must write something inside the while loop that updates at least one of the variables in the condition. Otherwise, you could end up with an infinite loop.
Specific Example:
i <- -1
while (i < 10) {
  print(i)
  i <- i + 1
}
Repeat Loop
In a repeat loop, you must not only update the variables explicitly, you must also test the exit condition explicitly and call break yourself.
Specific Example:
i <- -1
repeat {
  print(i)
  i <- i + 1
  if (i == 10) {
    break
  }
}
Functions
For example, you could have a function that evaluates a formula. A function can call other functions.
General Example:
function_name<-function(parameters) {
return(return_variable)
}
Specific Example:
calcQuadratic <- function(a, b, c, x) {
  y <- a * x * x + b * x + c
  return(y)
}
calcQuadratic(2, 3, 5, .07)
my.var <- calcQuadratic(3.32, 7.6, 5.999, 3.2)
print(my.var)
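Because arithmetic in R is vectorized, a function like this also accepts a whole vector of x values and returns one result per element, with no loop needed. A short sketch, repeating the function so that it runs on its own; x.values and y.values are names made up for this illustration:

```r
calcQuadratic <- function(a, b, c, x) {
  y <- a * x * x + b * x + c
  return(y)
}

# Evaluate 2*x^2 + 3*x + 5 at several points in one call.
x.values <- c(-1, 0, 1, 2)
y.values <- calcQuadratic(2, 3, 5, x.values)
print(y.values)
```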
BANG!!
Testing for seasonal unit roots in R
I will explain seasonal unit root testing in R. Briefly, R is a language for statistical computing. It is very similar to MATLAB, SAS, etc. The website is http://www.r-project.org
Suppose that our dataset is seasonal and that we intend to use a seasonal ARIMA model. We need to test our time series to see if it is seasonally integrated.
Version 3.x of the "forecast" R package has a new function for testing for seasonal unit roots. The function is nsdiffs().
R also comes with a US Accidental Deaths dataset.
So to follow along, open up R and type the following:
>USAccDeaths
You will then see the US Accidental Deaths dataset. You can see that it is monthly.
Now install the "forecast" R package from CRAN. Then load it.
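One possible way to do the install-and-load step, as a sketch. The helper name load.or.install is made up for this illustration; requireNamespace(), install.packages() and library() are the standard R functions. It assumes a CRAN mirror is reachable when the package is missing.

```r
# Hypothetical helper (not part of R or of "forecast"): install a CRAN
# package only when it is missing, then attach it.
load.or.install <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  library(pkg, character.only = TRUE)
}

# Usage (commented out so that nothing is downloaded by accident):
# load.or.install("forecast")
```

In an interactive session, typing install.packages("forecast") followed by library(forecast) does the same thing.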
To view the help file for nsdiffs(), type:
>?nsdiffs
It will bring up a page that is for both nsdiffs and ndiffs.
There are two tests that have been implemented in nsdiffs, the OCSB test (default) and the Canova-Hansen test. You can also specify the seasonal period of your dataset. USAccDeaths is a ts object, and its seasonal period ("frequency") is stored in the object itself, so nsdiffs picks it up automatically.
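The seasonal period can be inspected with base R alone, without the "forecast" package. A short sketch; the variable my.series is made up for this illustration and shows how a plain numeric vector could be given a monthly frequency with ts():

```r
# USAccDeaths ships with R (package "datasets").
print(class(USAccDeaths))      # the object class
print(frequency(USAccDeaths))  # observations per year
print(start(USAccDeaths))      # first year and month
print(end(USAccDeaths))        # last year and month

# A plain vector carries no seasonal period; ts() attaches one.
plain.values <- as.numeric(USAccDeaths)
my.series <- ts(plain.values, start = c(1973, 1), frequency = 12)
print(frequency(my.series))
```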
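To perform the OCSB test (named explicitly here, because later releases of "forecast" changed the default test):
>nsdiffs(USAccDeaths, test="ocsb")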
To perform the Canova-Hansen test:
>nsdiffs(USAccDeaths, test="ch")
The output is the number of seasonal differences needed: "1" means that there is a seasonal unit root and "0" that there is no seasonal unit root.
You will notice that the two different tests give two different answers. This is because the Canova-Hansen test is less likely to decide in favour of a seasonal unit root than the OCSB test. Unlike the Canova-Hansen test, the OCSB test has a null hypothesis of a unit root. The USAccDeaths dataset is "on the edge". Osborn (1990) writes that when in doubt, it's better to seasonally difference.
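If you do decide to seasonally difference, base R's diff() with a lag equal to the seasonal period is enough. A minimal sketch; the names seasonal.period and deaths.sdiff are made up for this illustration:

```r
# Seasonal period of the monthly series (observations per year).
seasonal.period <- frequency(USAccDeaths)

# Seasonal difference: each value minus the value one year earlier.
deaths.sdiff <- diff(USAccDeaths, lag = seasonal.period)

print(length(USAccDeaths))   # length of the original series
print(length(deaths.sdiff))  # one full year shorter
print(start(deaths.sdiff))   # the differenced series starts a year later
```

The first year is lost because there is no earlier year to subtract from it.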
Bibliography:
Osborn DR (1990) "A survey of seasonality in UK macroeconomic variables", International Journal of Forecasting 6(3):327-336.
Osborn DR, Chui APL, Smith J, and Birchenhall CR (1988) "Seasonality and the order of integration for consumption", Oxford Bulletin of Economics and Statistics 50(4):361-377.
Canova F and Hansen BE (1995) "Are Seasonal Patterns Constant over Time? A Test for Seasonal Stability", Journal of Business and Economic Statistics 13(3):237-252.
Analytics Blog
I started with the "Insurance Blog", which began as keyword-laden drivel. However, there is a limited amount of drivel that I can produce. Eventually, good statistical computing info started to flow, interspersed with keyword-laden drivel. According to some research by some start-up analytics company, "insurance" is one of the highest paying keywords. ;-)
When I saw in Google Analytics that my drivel blog was coming up in searches and actually helping people, I became proud of my content. So here is a blog that I am completely proud of: all the info, without the drivel.