STATISTICAL COMPUTING ACTIVITY: CHAPTERS 1 & 2
Step 1. Download husb.txt from the web site.
Step 2. Run R
Step 3. Go to the File menu and "Change directory" to the location where you have saved the file
>husb=read.table("husb.txt", header=T)
>attach(husb)
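The reading step can be tried without the course file. The sketch below writes a tiny made-up table to a temporary file and reads it back in the same way; the values, the temporary file and the name demo are illustrative assumptions, not the contents of husb.txt.

```r
# Hypothetical stand-in for husb.txt: a header row followed by three made-up rows.
tmp <- tempfile(fileext = ".txt")
writeLines(c("husbage wifeage",
             "49 43",
             "25 28",
             "40 30"), tmp)

# header = TRUE tells R that the first line holds the variable names.
demo <- read.table(tmp, header = TRUE)

names(demo)   # variable names
str(demo)     # data type of each variable
```

Writing TRUE in full is a little safer than T, because T is an ordinary variable that can be overwritten.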
To see the data and the variable names
>husb
To see the data type of each variable
>str(husb)
Let us get some descriptive statistics
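>colMeans(husb)
(Note that mean(husb) and sd(husb) do not work on a whole data file in current versions of R; use colMeans and sapply to get one value per variable)
>var(husb)
>sapply(husb, sd)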
>median(husbage)
(Note that you need the variable name, not the data file name.)
>fivenum(wifeage)
(Note that you need the variable name, not the data file name.)
>summary(husb)
>cov(husb)
>cor(husb)
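As a rough illustration of which summaries work on a whole data file and which need one variable at a time, the sketch below uses a small made-up data frame. The name demo and its values are assumptions for illustration only and are not taken from husb.txt.

```r
# Made-up ages for illustration only; not the values in husb.txt.
demo <- data.frame(husbage = c(49, 25, 40, 52, 58),
                   wifeage = c(43, 28, 30, 57, 52))

colMeans(demo)          # mean of each variable
sapply(demo, sd)        # standard deviation of each variable
sapply(demo, median)    # median of each variable
var(demo)               # variance-covariance matrix (same result as cov(demo))
cor(demo)               # correlation matrix

median(demo$husbage)    # a single variable is picked out with $
```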
The following command will produce Figure 2.1 (a), which is a simple scatterplot of wife's age versus husband's age.
>plot(husbage,wifeage)
Note how the arguments are entered. The form is plot(x,y): first the explanatory variable (x), then the response variable (y).
Now, let us add a y=x line, i.e., at any point on this line husband's age = wife's age. This is Figure 2.1 (b).
>abline(0,1)
Here the form is abline(intercept,slope).
If you would like to add a least squares regression line, you need to use
>abline(lm(wifeage~husbage))
The symbol "~" denotes the relationship (response ~ explanatory); "lm" means linear model.
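To connect the two forms of abline, here is a minimal sketch on made-up data that lie exactly on a known line, so the fitted intercept and slope are easy to check. The names x, y, fit and b and the data values are assumptions for illustration.

```r
# Made-up data lying exactly on y = 2 + 0.5 x, so the fit is easy to check.
x <- c(20, 30, 40, 50, 60)
y <- 2 + 0.5 * x

fit <- lm(y ~ x)          # response on the left of ~, explanatory on the right
b <- unname(coef(fit))    # b[1] is the intercept, b[2] is the slope
b

plot(x, y)
abline(fit)                    # line taken directly from the fitted model
abline(b[1], b[2], lty = 2)    # the same line in abline(intercept, slope) form
```

Both calls draw the same line; passing the fitted model simply saves typing the coefficients.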
Next, let us add a small amount of random noise to husbage and wifeage so that identical points do not overlap. This is called jittering.
> plot(jitter(husbage),jitter(wifeage))
This is Figure 2.1 (c).
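To see what jittering does to the numbers themselves, the sketch below jitters a short made-up vector with ties. The explicit amount argument is used here only so that the size of the noise is known; the handout's commands rely on the default amount, which R chooses from the data.

```r
set.seed(1)                          # make the random noise reproducible
age <- c(30, 30, 30, 45, 45, 52)     # made-up ages with ties

# amount = 0.3 adds uniform noise between -0.3 and 0.3 to each value.
age_j <- jitter(age, amount = 0.3)

age_j                     # tied values are now slightly different
max(abs(age_j - age))     # never larger than the chosen amount
```

Because the noise is random, a jittered plot looks slightly different every time unless the seed is fixed.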
Let us show the marginal distributions of the variables
>rug(jitter(husbage), side=1)
>rug(jitter(wifeage), side=2)
In this case side 1 is the horizontal axis and side 2 is the vertical axis.
Now, we have Figure 2.1 (d).
To analyze the age difference between husband and wife, let us first compute it
>agediff=husbage-wifeage
Let us plot husband's age at first marriage versus the age difference
>plot(agediff, husagefi)
Let us add a line on which there is no age difference, i.e., agediff=0
> abline(v=0)
Here "v" is for vertical and "h" is for horizontal.
Now, we have Figure 2.2.
To produce Figure 2.3 (a) and (b):
>plot(husbhei, wifehei)
>abline(0,1)
For Figure 2.4:
>heidiff=husbhei-wifehei
>plot(agediff,heidiff)
>abline(v=0)
>abline(h=0)
Now, tell me what is the main thing that you did not like in this example. (My brain was fighting with my hands when I was preparing this handout.)
Labeling the Scatterplot with Row Names
species      | bodywt | brainwt
PotarMonkey  |   10.0 |     115
Gorilla      |  207.0 |     406
Human        |   62.0 |    1320
RhesusMonkey |    6.8 |     179
Chip         |   52.2 |     440
Step 1. Run R
Step 2. Go to the File menu and "Change directory" to the location where you have saved the file
>primates=read.table("primates.txt", header=T)
>attach(primates)
>row.names(primates)=species
>plot(bodywt, brainwt)
>text(x=bodywt, y=brainwt, labels=row.names(primates))
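If primates.txt is not at hand, the same labeled plot can be sketched by typing in the five rows from the table above. Using with() instead of attach(), widening xlim and offsetting the labels with pos = 4 are choices made for this illustration and are not part of the handout's commands.

```r
# Values copied from the primates table in this handout.
primates <- data.frame(
  species = c("PotarMonkey", "Gorilla", "Human", "RhesusMonkey", "Chip"),
  bodywt  = c(10.0, 207.0, 62.0, 6.8, 52.2),
  brainwt = c(115, 406, 1320, 179, 440)
)
row.names(primates) <- primates$species

# with() looks up bodywt and brainwt inside primates, so attach() is not needed.
# xlim is widened so that labels near the right edge are not cut off.
with(primates, {
  plot(bodywt, brainwt, xlim = c(0, 260))
  text(x = bodywt, y = brainwt, labels = row.names(primates), pos = 4)
})
```

pos = 4 places each label to the right of its point, which keeps the label from covering the plotting symbol.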
From Bivariate to Multivariate
To produce a scatterplot matrix you can either use
>pairs(husb)
Or
>plot(husb)
To add a smooth curve to each panel below the diagonal
> plot(husb, lower.panel=panel.smooth)
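panel.smooth draws a lowess curve through the points of each panel. As a sketch of what that curve is, the example below computes and draws one lowess smooth by hand; the simulated data and the names x, y and sm are assumptions for illustration.

```r
set.seed(3)
x <- 1:30
y <- 0.5 * x + rnorm(30, sd = 2)   # simulated noisy linear trend

plot(x, y)
sm <- lowess(x, y)       # list with smoothed (x, y) pairs, sorted by x
lines(sm, col = "red")   # the kind of curve panel.smooth adds to a panel
```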
Coplots
To produce a conditional scatterplot of wifeage versus husbage given husbhei, use
>coplot(wifeage~husbage|husbhei)
To smooth the coplot by using lowess
> coplot(wifeage~husbage|husbhei, panel=panel.smooth)
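Each panel of a coplot shows the points whose conditioning variable falls in one of several overlapping intervals. The sketch below lists such intervals for a made-up vector of heights, using co.intervals with the values that coplot uses by default (number = 6, overlap = 0.5); the heights are illustrative and are not from husb.txt.

```r
# Made-up heights (mm) for illustration only.
height <- c(1559, 1590, 1620, 1650, 1680, 1700,
            1720, 1750, 1780, 1810, 1840, 1900)

# Intervals of the conditioning variable, one row per panel.
iv <- co.intervals(height, number = 6, overlap = 0.5)
iv    # first column: lower limit, second column: upper limit
```

Because neighbouring intervals overlap, the same couple can appear in more than one panel.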
Quantile-Quantile Plots (Normal Plots)
To produce a normal plot use
>qqnorm(husbage)
To add a reference line through the first and third quartiles
> qqline(husbage)
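For a feel of what a normal plot looks like when the data really are close to normal, the sketch below uses simulated values in place of husbage. The simulated data and the names x and qq are assumptions for illustration.

```r
set.seed(2)
x <- rnorm(50, mean = 40, sd = 10)   # simulated, roughly normal "ages"

qq <- qqnorm(x)    # draws the plot and invisibly returns the plotted points
qqline(x)          # reference line through the first and third quartiles

# qq$x holds the theoretical normal quantiles, qq$y the data values.
head(cbind(theoretical = qq$x, sample = qq$y))
```

Points that stay close to the line suggest approximate normality; systematic curvature suggests skewness or heavy tails.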