Preface. These notes are an introduction to using the statistical software package R for an introductory statistics course. They are meant to accompany an ... SUR: Introduction to Probability and Statistics Using R. Basic R Operations and Concepts ... A graph of a bivariate normal PDF. If you are in need of a local copy, a PDF version is continuously maintained. R code will be typeset using a monospace font which is syntax highlighted: a = 3.

Author: | RENEA GALIPEAU |

Language: | English, Spanish, Indonesian |

Country: | Georgia |

Genre: | Personal Growth |

Pages: | 418 |

Published (Last): | 18.05.2016 |

ISBN: | 474-7-50838-897-9 |


Using R for the study of topics of statistical methodology, such as ... International Standard Book Number (eBook - PDF). Using R for Introductory Statistics. John Verzani. Chapman & Hall/CRC, a CRC Press Company. Boca Raton, London, New York, Washington, D.C. Basic statistics using R: before using any functions in the packages, you need to load the packages. To save a graph, use the menu: File -> Save As -> JPEG / BMP / PDF / PostScript.
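The two points above (loading a package before use, and saving a graph) can be sketched in code. This is a minimal illustration, not taken from either book: the tools package stands in for any package because it ships with R but is not attached by default, and the file name scatter.pdf is hypothetical.

```r
# Load a package before calling its functions. 'tools' ships with base R
# but is not attached by default, so it serves as a stand-in here.
library(tools)
ext <- file_ext("scatter.pdf")   # file_ext() is provided by 'tools'

# Save a graph to a PDF file from code instead of the File -> Save As menu.
out_file <- file.path(tempdir(), "scatter.pdf")  # hypothetical file name
pdf(out_file)                    # open a PDF graphics device
plot(1:10, (1:10)^2, main = "y = x^2")
dev.off()                        # close the device so the file is written
print(file.exists(out_file))
```

The menu route and the pdf()/dev.off() route produce the same kind of file; the scripted version has the advantage that it can be rerun.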

This chapter provides a general presentation of R software. We first describe how to install it, the differences in use depending on the operating system, and the different help available. We then go on to list the various elements used in R (vectors, matrices, etc.). We conclude by presenting the packages, or libraries, of external programs which will be useful throughout this book. The installation of R varies according to the operating system (Windows, Mac OS X, or Linux), but the functions are exactly the same and most of the programs are portable from one system to another. Installing R is very simple: just follow the instructions.

When R starts, it displays a message that includes the following: You are welcome to redistribute it under certain conditions. Type 'license()' or 'licence()' for distribution details. Natural language support but running in an English locale. R is a collaborative project with many contributors. Type 'contributors()' for more information and 'citation()' on how to cite R or R packages in publications. Type 'demo()' for some demos, 'help()' for on-line help, or 'help.start()' for an HTML browser interface to help. Type 'q()' to quit R.

Each instruction must be validated by pressing Enter to be run. If the instruction is incorrect, an error message will appear. For each project, we recommend you create a file in which an image of the session concerning this project will be saved. We also recommend that users write the commands in a text file in order to be able to use them again when needed.
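As a rough illustration of a correct and an incorrect instruction, the sketch below runs one of each. The variable names are hypothetical, and tryCatch() is used only so that the error message can be captured and printed without stopping the script; at the console the error would simply be displayed.

```r
# A correct instruction: create a vector and compute its mean.
x <- c(2, 4, 6)
m <- mean(x)
print(m)

# An incorrect instruction: log() of a character string raises an error.
# tryCatch() captures the error message so the script keeps running.
msg <- tryCatch(log("a"), error = function(e) conditionMessage(e))
print(msg)
```

Keeping such commands in a text file, as recommended above, makes it possible to rerun them later with source().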

By default, R keeps all the objects created during a session (variables, results tables, etc.) in memory. At the end of the session, these objects can be saved in an image of the session, .RData, using the command save.image().
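A minimal sketch of saving a session image follows. The object names and the path project.RData are hypothetical; the image is written to a temporary directory here so the example does not touch an existing .RData file.

```r
# Objects created during the session live in the workspace.
scores <- c(12, 15, 9)
label  <- "pilot study"

# Save an image of the whole workspace. Called without a 'file' argument,
# save.image() writes to ".RData" in the working directory.
image_file <- file.path(tempdir(), "project.RData")  # hypothetical path
save.image(file = image_file)
print(file.exists(image_file))
```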

Saved objects are thus available for future use using the load() function. We will now look at the differences between a session opened in Linux, in Windows, and on a Mac. We can also open an R session directly in the directory in which we wish to work. The destination can be checked using getwd(). The data and the commands will be saved in the same place as R was opened, that is to say, in the same place as R is installed.
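The round trip of saving, removing, and reloading an object can be sketched as follows. The object heights and the file name are hypothetical, and save() is used here (rather than a full session image) only to keep the example small.

```r
# Create an object and save it to a file.
heights <- c(1.62, 1.75, 1.80)
data_file <- file.path(tempdir(), "heights.RData")  # hypothetical path
save(heights, file = data_file)

# Remove the object, then bring it back with load().
rm(heights)
restored <- load(data_file)   # load() returns the names of the restored objects
print(restored)
print(heights)

# Check where R is currently reading and writing files by default.
print(getwd())
```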

The best way to change the working directory is to use the setwd() function or to go to File, then Change dir. To close a session in R, go to File and select Exit, or equivalently type q() in the command window, and answer yes when asked Save workspace image?
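Changing the working directory from code might look like the sketch below. The temporary directory stands in for a real project folder, which is an assumption made for illustration; q() is mentioned only in a comment so that the example does not end the session.

```r
# setwd() changes the working directory and invisibly returns the previous one,
# which makes it easy to switch back afterwards.
old_dir <- setwd(tempdir())   # tempdir() stands in for a project folder
print(getwd())

# Return to the directory the session started in.
setwd(old_dir)
print(getwd())

# To end the session, one would type q() at the prompt (not run here).
```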

Frequency distributions. The centre of a distribution. The dispersion in a distribution. Using a frequency distribution to go beyond the data. Fitting statistical models to the data. What have I discovered about statistics? Building statistical models. Populations and samples. Simple statistical models. The mean: a very simple statistical model. Assessing the fit of the mean: sums of squares, variance and ...

Expressing the mean as a model. Going beyond the data. The standard error. Confidence intervals. Using statistical models to test research questions. Test statistics. One- and two-tailed tests. Type I and Type II errors. Effect sizes. Statistical power. What have I discovered about statistics? Before you start. The R-chitecture. Pros and cons of R.

Downloading and installing R. Versions of R. Getting started. The main windows in R. Menus in R. Using R. Commands, objects and functions. Using scripts. The R workspace. Setting a working directory.

Installing packages. Getting help. Getting data into R. Creating variables. Creating dataframes. Calculating new variables from existing ones. Organizing your data. Missing values. Entering data with R Commander. Creating variables and entering data with R Commander.

Creating coding variables with R Commander. Using other software to enter and edit data. Importing data. Importing SPSS data files directly. Importing data with R Commander.

Things that can go wrong. Saving data. Manipulating data. Selecting parts of a dataframe. Selecting data with the subset function.

Dataframes and matrices.

Reshaping data. What have I discovered about statistics? The art of presenting data. Why do we need graphs?

What makes a good graph? Lies, damned lies, and … erm … graphs. Packages used in this chapter. Introducing ggplot2. The anatomy of a plot. Geometric objects (geoms). Aesthetics. The anatomy of the ggplot function.

Stats and geoms. Avoiding overplotting. Saving graphs. Putting it all together: a quick tutorial. Graphing relationships: the scatterplot. Simple scatterplot. Adding a funky line. Grouped scatterplot. Histograms: a good way to spot obvious problems. Boxplots (box-whisker diagrams). Density plots. Graphing means. Bar charts and error bars.

Line graphs. Themes and options. What have I discovered about statistics? What are assumptions? Quantifying normality with numbers. Exploring groups of data. Testing whether a distribution is normal. Doing the Shapiro-Wilk test in R. Reporting the Shapiro-Wilk test. Testing for homogeneity of variance.

Dealing with outliers. Dealing with non-normality and unequal variances. Transforming the data using R. When it all goes horribly wrong. What have I discovered about statistics? Looking at relationships. How do we measure relationships?

A detour into the murky world of covariance. Standardization and the correlation coefficient. The significance of the correlation coefficient. Confidence intervals for r. A word of warning about interpretation: causality. Data entry for correlation analysis. Bivariate correlation.

Packages for correlation analysis in R. General procedure for correlations using R Commander. General procedure for correlations using R.

Bootstrapping correlations. Biserial and point-biserial correlations. Partial correlation. The theory behind part and partial correlation.

Partial correlation using R. Comparing independent rs. Comparing dependent rs. Calculating the effect size. How to report correlation coefficients. What have I discovered about statistics?

An introduction to regression. Some important information about straight lines.

The method of least squares. Assessing the goodness of fit: sums of squares, R and R². Assessing individual predictors. Packages used in this chapter. General procedure for regression in R.

Doing simple regression using R Commander. Regression in R. Interpreting a simple regression. Overall fit of the object model. Model parameters. Using the model. Multiple regression: the basics.

An example of a multiple regression model. Sums of squares, R and R². Parsimony-adjusted measures of fit. Methods of regression. How accurate is my regression model?

Assessing the regression model I: diagnostics. Some things to think about before the analysis. Multiple regression: running the basic model. Interpreting the basic multiple regression. Comparing models. Diagnostic tests using R Commander. Outliers and influential cases. Assessing the assumption of independence.

Assessing the assumption of no multicollinearity. Checking assumptions about the residuals. What if I violate an assumption?

Robust regression: bootstrapping.

How to report multiple regression. Categorical predictors and multiple regression. Dummy coding. Regression with dummy variables. What have I discovered about statistics? Background to logistic regression. What are the principles behind logistic regression?

Assessing the model: the log-likelihood statistic. Assessing the model: the deviance statistic.