Statistics 782

Course Description

This course provides an introduction to statistical computing. It explores the skills needed to develop professional-quality software for statistical data analysis. The particular goal of the course is to teach the skills necessary to create extension packages for the R statistical system.

The course incidentally provides an introduction to modern workstation environments and their associated software tools. The course emphasises the Linux operating system and is intended for students who have a strong interest in computing, are progressing on to a research career, or both.

Course Goals

At the end of the course you should be able to make productive use of a number of computing tools which can greatly enhance your ability to carry out statistical research or data analysis. In particular, you should:
  • Have the skills to work productively in modern workstation environments.
  • Be able to produce professional-quality manuscripts for submission to journals and publishers.
  • Be able to use R at an expert level and to develop R packages for your own use, or use by others.
  • Understand how to increase productivity using literate data analysis, and to make it easy for others to validate your analyses.

Assessment

The final grade will be based on a number of assignments given during the course.  You will also be required to give a number of one-on-one demonstrations to show you know how to use the tools covered in the course.  There is no final exam.

Course Topics

The course covers a selection from the following topics:
  • Computers and software environments
  • The Linux desktop environment
  • The command line interface
  • The Emacs text editor
  • Typesetting with LaTeX
  • R programming and package development
  • Literate data analysis using R and LaTeX

Course Materials

There is no required text for the course.  A number of tutorials and manuals will be handed out. These and others will be available from the class web pages.

In addition, the following references may (or may not :-) be useful.

R. A. Becker, J. M. Chambers and A. R. Wilks (1988).
The New S Language.
Pacific Grove, California: Wadsworth & Brooks/Cole.

J. M. Chambers and T. J. Hastie (1993).
Statistical Models in S.
New York: Chapman and Hall.

J. M. Chambers (1998).
Programming with Data: A Guide to the S Language.
New York: Springer.

J. M. Chambers (2008).
Software for Data Analysis: Programming with R.
New York: Springer.

L. Lamport (1994).
LaTeX: A Document Preparation System, 2nd Edition.
Reading, Massachusetts: Addison-Wesley.

F. Mittelbach and M. Goossens, with J. Braams, D. Carlisle and C. Rowley (2004).
The LaTeX Companion, 2nd Edition.
Reading, Massachusetts: Addison-Wesley.

Timetable

Monday 10am-11am, Room 301-444
Wednesday 9am-10am, Room 303S-169
Friday 10am-11am, Room 303S-G87

Office hours will be arranged in class.
You can also make an appointment.

Instructor

Ross Ihaka
Room 275
Extension 85054