STATISTICAL COMPUTING ACTIVITY:
(CHAPTER 2 CONTD.)
BUBBLE PLOTS
A scatterplot can display the relationship between only two variables. A third variable can be displayed by adding circles with radii proportional to its values. Now, let us produce a bubble plot in R.
Step 1. Download husb.txt from the web site.
Step 2. Run R
Step 3. Go to the File menu and use "Change directory" to select the location where you saved the file.
>husb=read.table("husb.txt", header=TRUE)
>attach(husb)
>husb
>plot(wifeage, husbage)
(Now, let us add circles to the scatterplot)
>symbols(wifeage, husbage, circles=husagefi, inches=0.2,
add=TRUE)
Interpret the graph. Can you think of any other way of displaying a third variable in a scatterplot?
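The bubble plot commands above can also be tried without husb.txt. The following is a minimal sketch on simulated values (the numbers are made up for illustration and are not the course data; the names x, y, and z are placeholders). It also shows one design choice: because circles= supplies radii, passing the square root of the third variable makes the circle areas, rather than the radii, proportional to its values.

```r
# Simulated stand-in data: made-up values, NOT the husb.txt data.
set.seed(42)
n <- 30
x <- rnorm(n, mean = 40, sd = 10)      # plays the role of wifeage
y <- x + rnorm(n, mean = 2, sd = 3)    # plays the role of husbage
z <- runif(n, min = 18, max = 35)      # third variable shown by the circles

op <- par(mfrow = c(1, 2))

# Radii proportional to z, as in the commands above
plot(x, y, main = "Radius proportional to z")
symbols(x, y, circles = z, inches = 0.2, add = TRUE)

# Areas proportional to z: take the square root before passing the radii
plot(x, y, main = "Area proportional to z")
symbols(x, y, circles = sqrt(z), inches = 0.2, add = TRUE)

par(op)
```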
For chiplots, bivariate boxplots, and bivariate density estimate plots, we need to load additional functions into R. To load them, please do the following:
Step 1. On the course website, click on LEARNING NOTES.
Step 2. Click on the R Functions for chiplot, bivariate boxplot, and bivariate density plot.
Step 3. Highlight everything and copy.
Step 4. Go back to R and paste.
CHIPLOTS
Chiplots are designed to help researchers judge whether or not two variables are independent by augmenting the scatterplot with an auxiliary display.
>chiplot(husbage, wifeage, vlabs=c("Husband Age", "Wife Age"))
INTERPRETATION:
In the case of independence, the points will be concentrated in the central region, in the horizontal band indicated on the plot.
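To make the auxiliary display less of a black box, here is a rough sketch of how chi-plot coordinates can be computed, following the usual Fisher and Switzer definitions. This is an assumption about what the plot shows, not the course's chiplot function: the helper name chi_stats is hypothetical, the data are simulated, and the band constant 1.78 is the value commonly quoted for an approximate 95% band.

```r
# Hypothetical helper (not the course's chiplot function).
# For each point i, compare it with all OTHER points:
#   Fx = share with x <= x_i, Gy = share with y <= y_i,
#   H  = share with both x <= x_i and y <= y_i.
chi_stats <- function(x, y) {
  n <- length(x)
  Fx <- Gy <- H <- numeric(n)
  for (i in seq_len(n)) {
    Fx[i] <- sum(x[-i] <= x[i]) / (n - 1)
    Gy[i] <- sum(y[-i] <= y[i]) / (n - 1)
    H[i]  <- sum(x[-i] <= x[i] & y[-i] <= y[i]) / (n - 1)
  }
  # chi: scaled departure of H from the product Fx * Gy expected under independence
  chi <- (H - Fx * Gy) / sqrt(Fx * (1 - Fx) * Gy * (1 - Gy))
  # lambda: signed distance of the point from the centre of the data
  S <- sign((Fx - 0.5) * (Gy - 0.5))
  lambda <- 4 * S * pmax((Fx - 0.5)^2, (Gy - 0.5)^2)
  keep <- is.finite(chi)   # drop extreme points where the denominator is zero
  data.frame(lambda = lambda[keep], chi = chi[keep])
}

# Simulated independent data (made up for illustration)
set.seed(1)
n <- 100
x <- rnorm(n)
y <- rnorm(n)
cs <- chi_stats(x, y)

plot(cs$lambda, cs$chi, xlim = c(-1, 1), ylim = c(-1, 1),
     xlab = "lambda", ylab = "chi")
abline(h = c(-1, 1) * 1.78 / sqrt(n), lty = 2)  # assumed approximate 95% band
```

Replacing y with a variable that depends on x and rerunning the sketch is one way to explore the questions that follow.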
Suppose that the variables are negatively related. Where do you think the points will be concentrated?
Can you say anything about how strong the relationship is by looking at the chiplot? How?
Now produce a chiplot for husbhei and wifehei, and interpret.
BIVARIATE BOXPLOTS
A bivariate boxplot is a two-dimensional analogue of the box-and-whisker plot. Like univariate boxplots, bivariate boxplots are built from robust measures, here of location, scale, and correlation. They are used to understand the distributional properties of the data and to identify possible outliers. Each plot contains two concentric ellipses: the inner one (the "hinge") includes 50% of the data, and the outer one (the "fence") delineates potential outliers. The graph also shows resistant regression lines of both y on x and x on y. A small acute angle between the two regression lines indicates a large absolute value of the correlation.
>bvbox(cbind(husbage,wifeage), xlab="Husband Age",
ylab="Wife Age")
If you would like to use non-robust estimators (means, variances, and the correlation coefficient), use the following:
>bvbox(cbind(husbage,wifeage), xlab="Husband Age",
ylab="Wife Age", method="O")
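The link between the angle of the two regression lines and the correlation can be checked numerically. The sketch below uses ordinary least squares (lm), not the resistant lines drawn by bvbox, and simulated data; the helper name line_angle_deg is hypothetical. It measures the acute angle between the y-on-x line and the x-on-y line for a strongly and a weakly correlated pair.

```r
# Hypothetical helper: acute angle (in degrees) between the least-squares
# line of y on x and the least-squares line of x on y, both drawn in the
# (x, y) plane.
line_angle_deg <- function(x, y) {
  b_yx <- unname(coef(lm(y ~ x))[2])   # slope dy/dx of the y-on-x line
  b_xy <- unname(coef(lm(x ~ y))[2])   # slope dx/dy of the x-on-y line
  d1 <- c(1, b_yx)                     # direction vector of the y-on-x line
  d2 <- c(b_xy, 1)                     # direction vector of the x-on-y line
  cosang <- abs(sum(d1 * d2)) / sqrt(sum(d1^2) * sum(d2^2))
  acos(min(1, cosang)) * 180 / pi      # min() guards against rounding above 1
}

# Simulated data (made up for illustration)
set.seed(2)
x <- rnorm(200)
y_strong <- x + rnorm(200, sd = 0.3)   # strongly related to x
y_weak   <- x + rnorm(200, sd = 3)     # weakly related to x

angle_strong <- line_angle_deg(x, y_strong)
angle_weak   <- line_angle_deg(x, y_weak)

cat("strong: r =", round(cor(x, y_strong), 2),
    " angle =", round(angle_strong, 1), "degrees\n")
cat("weak:   r =", round(cor(x, y_weak), 2),
    " angle =", round(angle_weak, 1), "degrees\n")
```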
Now produce a bivariate boxplot for husbhei and wifehei, and interpret.
BIVARIATE DENSITY ESTIMATES
From scatterplots one can see "clusters" (regions of low and high density of observations) and spot outliers. Bivariate density estimates help researchers with both tasks. A plot of a bivariate density estimate can be viewed as a smoothed two-dimensional histogram.
To get the bivariate density estimates using a normal kernel:
>den1=bivden(husbage, wifeage)
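As a rough illustration of what a normal-kernel estimate computes, the sketch below places a product of two normal densities on every observation and averages them over a grid. It is not the course's bivden function: the function name normal_kernel_density, the bandwidth rule (standard deviation times n^(-1/6)), the padded grid, and the simulated data are all assumptions made for this example. The result uses the same element names (seqx, seqy, den) so that it can be plotted in the same way.

```r
# Hypothetical sketch of a product normal-kernel density estimate.
normal_kernel_density <- function(x, y, ngrid = 30) {
  n <- length(x)
  hx <- sd(x) * n^(-1/6)   # assumed bandwidth rule, not taken from bivden
  hy <- sd(y) * n^(-1/6)
  # Grid padded by three bandwidths so the kernel tails are not cut off
  seqx <- seq(min(x) - 3 * hx, max(x) + 3 * hx, length.out = ngrid)
  seqy <- seq(min(y) - 3 * hy, max(y) + 3 * hy, length.out = ngrid)
  # kx[a, i]: normal kernel centred at x[i], evaluated at grid point seqx[a]
  kx <- outer(seqx, x, function(g, xi) dnorm(g, mean = xi, sd = hx))
  ky <- outer(seqy, y, function(g, yi) dnorm(g, mean = yi, sd = hy))
  # Average over observations of the product of the two kernels
  list(seqx = seqx, seqy = seqy, den = kx %*% t(ky) / n)
}

# Simulated data (made up for illustration, not the husb.txt values)
set.seed(3)
x <- rnorm(200, mean = 40, sd = 10)
y <- x + rnorm(200, mean = 2, sd = 4)

est <- normal_kernel_density(x, y)
persp(est$seqx, est$seqy, est$den, xlab = "x", ylab = "y", zlab = "Density")
```

The commands that follow continue to use den1 from bivden.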
To construct a perspective plot of the density values:
>persp(den1$seqx, den1$seqy, den1$den, xlab="Husband Age",
ylab="Wife Age", zlab="Density")
To change the viewing direction you can define two angles: theta (azimuthal direction) and phi (colatitude).
>persp(den1$seqx, den1$seqy, den1$den, xlab="Husband Age",
ylab="Wife Age", zlab="Density", theta=30, phi=30)
Now, try different viewing directions.
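One convenient way to compare viewing directions is to draw the same surface several times in one window. The sketch below does this for a stand-in surface (a standard bivariate normal density computed on a grid, not the husb data); the particular theta and phi values are arbitrary choices for illustration.

```r
# Stand-in surface: standard bivariate normal density on a grid
gx <- seq(-3, 3, length.out = 30)
gy <- seq(-3, 3, length.out = 30)
z <- outer(gx, gy, function(a, b) dnorm(a) * dnorm(b))

# Arbitrary (theta, phi) pairs to compare
views <- list(c(0, 15), c(30, 30), c(60, 45), c(120, 30))

op <- par(mfrow = c(2, 2), mar = c(1, 1, 2, 1))
for (v in views) {
  # persp() invisibly returns the viewing transformation matrix
  vt <- persp(gx, gy, z, theta = v[1], phi = v[2],
              xlab = "x", ylab = "y", zlab = "Density",
              main = paste0("theta = ", v[1], ", phi = ", v[2]))
}
par(op)
```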
To add a contour plot of the density values to a scatterplot, we need to produce the scatterplot first:
>plot(husbage, wifeage)
Then add the contour plot:
>contour(den1$seqx, den1$seqy, den1$den, nlevels=20,
add=TRUE)
Now change the number of contour levels, nlevels, to 10.
Interpret the graph.
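When comparing nlevels=20 with nlevels=10, it helps to know that base R's contour() chooses its default levels with pretty(), so nlevels is a target rather than an exact count. The sketch below illustrates this on a stand-in surface (a bivariate normal density with an assumed correlation of 0.7, not the husb data).

```r
# Stand-in surface: bivariate normal density with unit variances
# and correlation rho (value chosen only for illustration)
gx <- seq(-3, 3, length.out = 40)
gy <- seq(-3, 3, length.out = 40)
rho <- 0.7
z <- outer(gx, gy, function(a, b) {
  exp(-(a^2 - 2 * rho * a * b + b^2) / (2 * (1 - rho^2))) /
    (2 * pi * sqrt(1 - rho^2))
})

# The levels contour() would pick by default for each setting
levels20 <- pretty(range(z), 20)
levels10 <- pretty(range(z), 10)
cat("levels requested: 20, levels chosen:", length(levels20), "\n")
cat("levels requested: 10, levels chosen:", length(levels10), "\n")

op <- par(mfrow = c(1, 2))
contour(gx, gy, z, nlevels = 20, main = "nlevels = 20")
contour(gx, gy, z, nlevels = 10, main = "nlevels = 10")
par(op)
```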
Produce density estimate plots for husbhei and wifehei, and interpret.