STATISTICAL COMPUTING ACTIVITY: MDS FROM DISSIMILARITY MATRIX

 

 

You can use R only for symmetric dissimilarity (distance) matrices, i.e., matrices in which higher values imply greater distances.

 

PART 1: MOTIVATIONAL EXAMPLE

 

Consider three cities A, B, and C, and another group of cities D, E, F with the following distance matrices:


 

        A     B     C
A       0
B      50     0
C      50    50     0

 

 

        D     E     F
D       0
E      50     0
F     100    50     0

 


Let us use classical MDS on these two distance matrices using R.

 

First enter the matrices:

>abc=matrix(c(0,50,50,50,0,50,50,50,0),nrow=3, dimnames=list(c("A", "B", "C"), c("A", "B", "C")))

>def=matrix(c(0,50,100,50,0,50,100,50,0),nrow=3, dimnames=list(c("D", "E", "F"), c("D", "E", "F")))

 

Note that we have named the rows and columns for plotting. First, guess how these cities will be plotted.
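Because classical MDS in R expects a symmetric dissimilarity matrix, it can be worth checking the entered matrices before scaling them. The following is a minimal sketch; the helper name check_dissimilarity is hypothetical and is not part of R.

```r
# Re-enter the two matrices so this block runs on its own.
abc <- matrix(c(0,50,50,50,0,50,50,50,0), nrow=3,
              dimnames=list(c("A","B","C"), c("A","B","C")))
def <- matrix(c(0,50,100,50,0,50,100,50,0), nrow=3,
              dimnames=list(c("D","E","F"), c("D","E","F")))

# Hypothetical helper: TRUE when m is a square, symmetric,
# non-negative matrix with a zero diagonal.
check_dissimilarity <- function(m) {
  is.matrix(m) && nrow(m) == ncol(m) &&
    isSymmetric(unname(m)) && all(m >= 0) && all(diag(m) == 0)
}

print(check_dissimilarity(abc))
print(check_dissimilarity(def))
```

If either check returns FALSE, the matrix was probably mistyped (for example, values entered in the wrong order).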

 

Now we are ready to use classical MDS:

 

>locabc=cmdscale(abc)

>locabc

>xabc=locabc[,1]

>yabc=locabc[,2]

>plot(xabc,yabc,type="n",xlab="",ylab="",main="cmdscale(abc)")

>text(xabc,yabc,rownames(abc),cex=0.8)    

 

Is the plot meaningful?

 

>locdef=cmdscale(def)

>locdef

>xdef=locdef[,1]

>ydef=locdef[,2]

>plot(xdef,ydef,type="n",xlab="",ylab="",main="cmdscale(def)")

>text(xdef,ydef,rownames(def),cex=0.8)       

 

Is the plot meaningful?
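One way to judge whether a plot is meaningful is to compare the distances between the plotted points with the original distances. The sketch below does this for both groups of cities; the variable names fit_abc, fit_def, err_abc, and err_def are introduced here for illustration only.

```r
# Re-enter the two matrices so this block runs on its own.
abc <- matrix(c(0,50,50,50,0,50,50,50,0), nrow=3,
              dimnames=list(c("A","B","C"), c("A","B","C")))
def <- matrix(c(0,50,100,50,0,50,100,50,0), nrow=3,
              dimnames=list(c("D","E","F"), c("D","E","F")))

# Distances between the points of each fitted configuration.
# cmdscale may warn for def because its second dimension carries
# no information, so the warning is suppressed here.
fit_abc <- as.matrix(dist(cmdscale(abc)))
fit_def <- as.matrix(dist(suppressWarnings(cmdscale(def))))

# Largest absolute difference between fitted and observed distances.
err_abc <- max(abs(fit_abc - abc))
err_def <- max(abs(fit_def - def))
print(c(abc = err_abc, def = err_def))
```

Errors close to zero mean that the two-dimensional configuration reproduces the distance matrix essentially exactly.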

 

Now let us get the one-dimensional solution for both cases:

>plot(xabc,type="n",xlab="",main="cmdscale(abc)")

>text(xabc,labels=rownames(abc),cex=0.8)

>plot(xdef,type="n",xlab="",main="cmdscale(def)")

>text(xdef,labels=rownames(def),cex=0.8)

 

Discuss how meaningful these plots are.

 

Our goal is to reproduce the observed distance matrix using fewer dimensions, and so reduce the observed complexity. This example shows that fewer dimensions may give a worse representation of a distance matrix than more dimensions would. For cities A, B, and C, there is no way to arrange the three cities on one line so that the distances are reproduced. On the other hand, cities D, E, and F can be arranged nicely in one dimension as follows:

 

D ------50 miles------ E ------50 miles------ F
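The contrast between the two groups can also be checked numerically. The sketch below requests a one-dimensional solution with k = 1 and asks cmdscale to return its goodness-of-fit measure with eig = TRUE; the names one_abc, one_def, gof_abc, gof_def, err_abc, and err_def are introduced here for illustration only.

```r
# Re-enter the two matrices so this block runs on its own.
abc <- matrix(c(0,50,50,50,0,50,50,50,0), nrow=3,
              dimnames=list(c("A","B","C"), c("A","B","C")))
def <- matrix(c(0,50,100,50,0,50,100,50,0), nrow=3,
              dimnames=list(c("D","E","F"), c("D","E","F")))

# One-dimensional classical MDS; eig = TRUE makes cmdscale return a list.
one_abc <- cmdscale(abc, k = 1, eig = TRUE)
one_def <- cmdscale(def, k = 1, eig = TRUE)

# Share of the total eigenvalue mass captured by the single dimension.
gof_abc <- one_abc$GOF[1]
gof_def <- one_def$GOF[1]

# Largest absolute error in the distances reproduced on one line.
err_abc <- max(abs(as.matrix(dist(one_abc$points)) - abc))
err_def <- max(abs(as.matrix(dist(one_def$points)) - def))

print(round(c(gof_abc = gof_abc, gof_def = gof_def), 3))
print(c(err_abc = err_abc, err_def = err_def))
```

The goodness of fit should come out at about 0.5 for A, B, C and about 1 for D, E, F, and only the second group should have a reproduction error near zero.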

 

PART 2: COLA DATA FOR SUBJECT 1

 

Step 1. Download cola.txt from the web site.

Step 2. Run R

Step 3. Go to the File menu and use "Change directory" to select the location where you saved the file

 

>cola=read.table("cola.txt", header=TRUE)

 

>loccola=cmdscale(cola)

>loccola

>xcola=loccola[,1]

>ycola=loccola[,2]

>plot(xcola,ycola,type="n",xlab="",ylab="",main="cmdscale(cola)")

>text(xcola,ycola,names(cola),cex=0.8)

 

Interpret the results.

STATISTICAL COMPUTING ACTIVITY: MDS

 

Step 1. Download morse.xls from the course site

 

Step 2. Run SYSTAT

File

            Open

                        Data

 

Change Files of Type to All Files (*.*)

Locate the folder where you saved the file

Select the file

Click Open

 

Now we have to let SYSTAT know that this is a similarity matrix

File

Save As

 

                        Click on Options and select Similarity, then OK

                                    Select the folder where you want to save the file and name it, say, morsesim.

                                                Save

 

 

Open the file that you have saved

 

Analysis

            Scale

                        Multidimensional Scaling

From the Available variable(s) window, select number1, ..., number10 by double-clicking on them

 

Make sure that Square (similarities model) is selected

Click OK

 

Suppose that you had thought this was a dissimilarity matrix. Let us see what happens:

Step 1. Download morse.xls from the course site

 

Step 2. Run SYSTAT

File

            Open

                        Data

 

Change Files of Type to All Files (*.*)

Locate the folder where you saved the file

Select the file

Click Open

 

Now we tell SYSTAT to treat this as a dissimilarity matrix

File

Save As

 

                        Click on Options and select Dissimilarity, then OK

                                    Select the folder where you want to save the file and name it, say, morsedissim.

                                                Save

 

 

Open the file that you have saved

 

Analysis

            Scale

                        Multidimensional Scaling

From the Available variable(s) window, select number1, ..., number10 by double-clicking on them

 

Make sure that Square (similarities model) is selected. (If you start with the data matrix, select Rectangular.)

Click OK

 

Discuss the differences in your interpretation for the similarity and dissimilarity matrices.
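For comparison with the R part of this activity: R only accepts dissimilarities, so a similarity matrix has to be converted before it is scaled. The sketch below shows one simple conversion, subtracting each similarity from the largest one. This rule is an assumption made here for illustration and is not necessarily what SYSTAT does internally; the matrix sim contains made-up values and the names sim, dis, and loc are hypothetical.

```r
# Hypothetical 3 x 3 similarity matrix (made-up values for illustration).
sim <- matrix(c(10,  8,  2,
                 8, 10,  3,
                 2,  3, 10), nrow=3,
              dimnames=list(c("x1","x2","x3"), c("x1","x2","x3")))

# Assumed conversion: subtract each similarity from the largest one,
# so the most similar pairs receive the smallest dissimilarities.
dis <- max(sim) - sim
diag(dis) <- 0

# Classical MDS on the converted matrix.
loc <- cmdscale(dis)
print(dis)
print(loc)
```

Running cmdscale on sim directly would treat the most similar pairs as the farthest apart, which is the kind of misinterpretation this exercise asks you to look for.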

 

Carry out the MDS analysis for the data given in Table 5.3, which is stored in country.xls. Note how the data have been entered.