DATA MINING
Desktop Survival Guide by Graham Williams
A dataset is usually more complex than a simple vector. Often we have
several vectors making up the dataset, and we refer to this as a
matrix. A matrix is a data structure containing items all of the same
data type. We construct a matrix with the matrix and
c functions. Rows and columns of a matrix can have
names, and the functions colnames and
rownames will list the current names. You can also
assign a new vector of names using these same functions:
> ds <- matrix(c(52, 37, 59, 42, 36, 46, 38, 21, 18, 32, 10, 67), nrow=3, byrow=TRUE)
> colnames(ds) <- c("Low", "Medium", "High", "VHigh")
> rownames(ds) <- c("Married", "Prev.Married", "Single")
> ds
             Low Medium High VHigh
Married       52     37   59    42
Prev.Married  36     46   38    21
Single        18     32   10    67
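As a brief sketch of what the names are good for once assigned, the following rebuilds ds, reads the names back, and then indexes the matrix by name. Indexing by name and the dim function are standard R behaviour rather than something covered in the text above, so treat this as an illustration only.

```r
# Rebuild the example matrix from the text.
ds <- matrix(c(52, 37, 59, 42, 36, 46, 38, 21, 18, 32, 10, 67),
             nrow=3, byrow=TRUE)
colnames(ds) <- c("Low", "Medium", "High", "VHigh")
rownames(ds) <- c("Married", "Prev.Married", "Single")

# Called without an assignment, the functions list the current names.
colnames(ds)
rownames(ds)

# dim reports the number of rows and columns.
dim(ds)

# Names can be used in place of numeric indices.
ds["Single", "VHigh"]   # a single cell
ds["Married", ]         # a whole row, returned as a named vector
```

Using names rather than positions tends to keep code readable if the columns are later reordered.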
Of course, manually creating datasets in this way is only useful for
small data collections. A slightly easier approach is to manually
modify and add to the dataset using a simple spreadsheet-like
interface, through either the edit function or the
fix function. The fix function also assigns the result of the
edit back to the variable being edited. Note that the
edit function normally returns the edited dataset, which is
therefore printed to the screen if it is not assigned to a variable.
If all you want to do is browse the dataset, wrap the call in the
invisible function to avoid the dataset being printed to the screen.
> ds <- edit(ds)
> fix(ds)
> invisible(edit(ds))
The cbind function combines each of its arguments,
column-wise (the c in the name is for column), into a
single data structure:
> age <- c(35, 23, 56, 18)
> gender <- c("m", "m", "f", "f")
> people <- cbind(age, gender)
> people
     age  gender
[1,] "35" "m"
[2,] "23" "m"
[3,] "56" "f"
[4,] "18" "f"
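The ages appear in quotes because a matrix holds items of a single data type, so combining a numeric vector with a character vector converts everything to character. The sketch below checks this with the mode function. The height vector is a hypothetical addition, used only to show that two numeric columns stay numeric.

```r
age <- c(35, 23, 56, 18)
gender <- c("m", "m", "f", "f")

# Mixing numeric and character vectors gives a character matrix.
people <- cbind(age, gender)
mode(people)

# Columns of the same type keep that type.
height <- c(170, 182, 165, 158)   # hypothetical values, for illustration only
sizes <- cbind(age, height)
mode(sizes)

# A column can be converted back to numbers when needed.
as.numeric(people[, "age"])
```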
The rbind function similarly combines its arguments, but in a row-wise manner. The result is the same as transposing the column-wise matrix built above with the t function:
> t(people)
       [,1] [,2] [,3] [,4]
age    "35" "23" "56" "18"
gender "m"  "m"  "f"  "f"
> people <- rbind(age, gender)
> people
       [,1] [,2] [,3] [,4]
age    "35" "23" "56" "18"
gender "m"  "m"  "f"  "f"
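Rather than comparing the two printouts by eye, the equivalence can be checked directly. This is a minimal sketch using the identical function. The names by.col and by.row are introduced here only for illustration.

```r
age <- c(35, 23, 56, 18)
gender <- c("m", "m", "f", "f")

by.col <- cbind(age, gender)   # one column per vector
by.row <- rbind(age, gender)   # one row per vector

# The two layouts have their dimensions swapped.
dim(by.col)
dim(by.row)

# Transposing the column-wise result gives the row-wise result.
identical(t(by.col), by.row)
```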