All statisticians should be proficient in C (for speed), Perl (for data manipulation), and R (for interactive analyses and graphics). Think "CPR".
As described on the R project web page:
"R is a system for statistical computation and graphics. It consists of a language plus a run-time environment with graphics, a debugger, access to certain system functions, and the ability to run programs stored in script files.
"The core of R is an interpreted computer language which allows branching and looping as well as modular programming using functions. Most of the user-visible functions in R are written in R. It is possible for the user to interface to procedures written in the C, C++, or FORTRAN languages for efficiency. The R distribution contains functionality for a large number of statistical procedures. Among these are: linear and generalized linear models, nonlinear regression models, time series analysis, classical parametric and nonparametric tests, clustering and smoothing. There is also a large set of functions which provide a flexible graphical environment for creating various kinds of data presentations. Additional modules are available for a variety of specific purposes."
I use R for all interactive statistical analyses and graphics. Virtually all of the software I produce is now written as add-on packages for R. The computationally intensive portions of such software are written in C, but writing such software as a package for R makes the data input/output to the C code extremely easy, and makes it easy to create documentation and provide graphical facilities. Moreover, R has a very extensive mathematics library, so I don't need to re-write things that have already been well coded.
I intended this page to provide an introduction similar to my Introduction to Perl page; instead, I am leaving it as a list of links followed by a few tips.
One of the nice additions to R (relative to Splus) is the easy inclusion of mathematical expressions in plots, using the function expression(). Take a look at help(plotmath) to see a big list of what you can do; also look at the examples in the help file for the function legend, and consider the following:
plot(rnorm(100),rnorm(100),xlab=expression(hat(mu)[0]),
ylab=expression(alpha^beta),
main=expression(paste("Plot of ", alpha^beta, " versus ", hat(mu)[0])))
For those running R within Unix or Linux, I highly recommend running R from within emacs. I like to have a really big emacs window, and so I put the following line in my ~/.bashrc file. (I use the bash shell; if you use tcsh or csh, put the analogous line in your ~/.tcshrc or ~/.cshrc file.) After doing this, you can type be to get a big emacs window.
alias be='emacs -bg black -fg white -geometry 99x50+560+122 &'
Place the following lines in your ~/.emacs file in order to get fancy highlighting and to get access to ESS (Emacs Speaks Statistics).
;; load ESS
(load "/sw/share/emacs/site-lisp/ess-5.3.0/lisp/ess-site")
;; automatic Font Lock mode in TeX mode
(add-hook 'tex-mode-hook 'turn-on-font-lock)
;; if Font Lock necessary for other than .tex files, uncomment following
(global-font-lock-mode t)
;; modes for other files
(setq auto-mode-alist (append (list '("\\.c$" . c-mode)
                                    '("\\.tex$" . latex-mode)
                                    '("\\.S$" . S-mode)
                                    '("\\.s$" . S-mode)
                                    '("\\.html$" . html-mode)
                                    '("\\.emacs" . emacs-lisp-mode)
                                    )
                              auto-mode-alist))
;; html helper mode from http://www.farne.uklinux.net/emacs-primer.html
(autoload 'html-helper-mode "html-helper-mode" "Yay HTML" t)
(setq auto-mode-alist (cons '("\\.html$" . html-helper-mode) auto-mode-alist))
(setq html-helper-do-write-file-hooks t)
;; ESS for Sweave files
(defun Rnw-mode ()
  (require 'ess-noweb)
  (noweb-mode)
  (if (fboundp 'R-mode)
      (setq noweb-default-code-mode 'R-mode)))
(add-to-list 'auto-mode-alist '("\\.Rnw\\'" . Rnw-mode))
(add-to-list 'auto-mode-alist '("\\.Snw\\'" . Rnw-mode))
(setq reftex-file-extensions
      '(("Snw" "Rnw" "nw" "tex" ".tex" ".ltx") ("bib" ".bib")))
(setq TeX-file-extensions
      '("Snw" "Rnw" "nw" "tex" "sty" "cls" "ltx" "texi" "texinfo"))
Having done the above, open an emacs window and type M-x R and then enter the subdirectory in which you wish to run R. (M-x means press the Esc key and then press x. Alternatively, hold down the diamond button while pressing x.)
You likely will want to create an ~/.Renviron file in order to let R know some general parameters. My ~/.Renviron file contains the following:
R_PAPERSIZE=letter
R_LIBS=/Users/kbroman/Rlibs
EDITOR=emacs
You might also want to create an ~/.Rprofile file. R code in this file will be run any time you start R, no matter what subdirectory you start in (unless there's an .Rprofile within that subdirectory, in which case that is read instead). Examples of what you may wish to put there include commands to modify the options and/or load certain packages (libraries).
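The lookup rule just described (an .Rprofile in the starting directory takes precedence over the one in your home directory) can be sketched as a small shell function. This is only an illustration: which_rprofile is a hypothetical helper name, not part of R, and it assumes the simple two-location rule above, ignoring the R_PROFILE_USER environment variable, which, if set, names the profile to use instead.

```shell
# which_rprofile [DIR]
# Hypothetical helper (not part of R): print the .Rprofile that would be
# read when R is started in DIR (default: the current directory),
# following the two-location rule described in the text.
which_rprofile() {
    dir=${1:-.}
    if [ -f "$dir/.Rprofile" ]; then
        # A profile in the starting directory wins.
        printf '%s\n' "$dir/.Rprofile"
    elif [ -f "$HOME/.Rprofile" ]; then
        # Otherwise fall back to the one in the home directory.
        printf '%s\n' "$HOME/.Rprofile"
    else
        # Neither file exists: print nothing and signal failure.
        return 1
    fi
}
```

For example, running which_rprofile ~/projects/myanalysis (a made-up directory) before starting R there shows which of the two files will take effect.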
To install an R package (e.g., qtl_0.76.tar.gz) locally (e.g., in the directory /users/student/auser/Rlib), type
R CMD INSTALL --library=/users/student/auser/Rlib qtl_0.76.tar.gz
In your ~/.Renviron file, include the line
R_LIBS=/users/student/auser/Rlib
Within R, when you type library(), you should see separate listings of the packages installed locally and those on the main system.
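The local-install steps above can be collected into one shell function. This is a minimal sketch under stated assumptions: setup_rlib is a hypothetical name, the install uses the R CMD INSTALL form of the command, and the install step is skipped (with a message) when no tarball is given or R is not on the PATH. The R_LIBS line is appended only if the file does not already set R_LIBS, so the function can be rerun safely.

```shell
# setup_rlib LIBDIR RENVIRON_FILE [TARBALL]
# Hypothetical helper: create a local R library directory, optionally
# install a package tarball into it, and record the directory as R_LIBS.
setup_rlib() {
    rlib=$1
    renviron=$2
    pkg=${3:-}

    # Make sure the local library directory exists.
    mkdir -p "$rlib" || return 1

    # Install only when a tarball was given, it exists, and R is available.
    if [ -n "$pkg" ] && [ -f "$pkg" ] && command -v R >/dev/null 2>&1; then
        R CMD INSTALL --library="$rlib" "$pkg" || return 1
    else
        echo "setup_rlib: skipping package installation" >&2
    fi

    # Append R_LIBS only if the file does not already define it.
    if ! grep -q '^R_LIBS=' "$renviron" 2>/dev/null; then
        printf 'R_LIBS=%s\n' "$rlib" >> "$renviron"
    fi
}
```

With the paths used in the text, the call would be setup_rlib /users/student/auser/Rlib ~/.Renviron qtl_0.76.tar.gz.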
Last modified: Mon Jun 29 23:52:05 2009