In this lesson we will review different data structures in R, including vectors, matrices, and arrays.
A data structure is a particular way of organizing data so that it can be stored, processed, and retrieved effectively. In this lesson we will review three of the data structures in R: vectors, matrices, and arrays.
In the Data Structure - Part 2 lesson, we will review:
Depending on what you are using R for, you will probably use a specific type of data structure most frequently. For example, in my research, I use data frames and tibbles all the time and rarely use matrices or arrays. However, it is still useful to be aware of and understand the different types of data structures.
In R, a vector is the simplest type of data structure. It is a sequence of data elements of the same basic type.
In the example below, we have three people (who happen to be my two brothers and me): Josh, Jenny, and Brandon. Here, we are creating three separate vectors:
names, which contains character strings of our three names.
age, which contains numeric values representing our respective ages (at least at the time of writing this, but I will happily remain 30 forever).
blue_eyes, which contains logical values representing whether we have blue eyes or not.
names <- c("Josh", "Jenny", "Brandon")
names
[1] "Josh" "Jenny" "Brandon"
age <- c(31, 30, 27)
age
[1] 31 30 27
blue_eyes <- c(TRUE, FALSE, FALSE)
blue_eyes
[1] TRUE FALSE FALSE
Once you run the above R chunk, you can click on the Environment tab and see how the data is stored. It even shows the data types (num, logi, chr).
In this example, it’s important to notice that each vector contains only one type of data. We can also check the type of data stored with the class() function.
The lines of code below are not being assigned (or saved) to any variables, so the results will be returned in the console, but not saved to the Environment.
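The class() calls themselves do not appear in this text, so here is a minimal sketch of what they might look like. It recreates the three vectors from the example above so that it runs on its own; the last line is an extra illustration (not from the original lesson) of what happens when types are mixed inside c().

```r
# Recreate the three vectors from the example above
names <- c("Josh", "Jenny", "Brandon")
age <- c(31, 30, 27)
blue_eyes <- c(TRUE, FALSE, FALSE)

# class() reports the type of data each vector holds
class(names)      # "character"
class(age)        # "numeric"
class(blue_eyes)  # "logical"

# Mixing types inside c() coerces every element to one common type
class(c(1, "a", TRUE))  # "character"
```

Because a vector can hold only one type, R converts the number and the logical value in the last call to character strings rather than raising an error.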
A matrix has 2 dimensions of data and contains only one type of data. Matrices look like a typical table. In my experience, matrices typically contain numeric values, but there can also be character matrices.
We can create a matrix with the matrix() function. Here, we save the matrix as my_matrix.
my_matrix <- matrix(data = 1:25, nrow = 5, ncol = 5)
my_matrix
[,1] [,2] [,3] [,4] [,5]
[1,] 1 6 11 16 21
[2,] 2 7 12 17 22
[3,] 3 8 13 18 23
[4,] 4 9 14 19 24
[5,] 5 10 15 20 25
If you want to fill in your matrix by rows (instead of columns), you can set the byrow argument to TRUE, as in the example below.
Please note, we will review functions and arguments in more detail in a couple of lessons.
my_matrix_byrow <- matrix(data = 1:25, nrow = 5, ncol = 5, byrow = TRUE)
my_matrix_byrow
[,1] [,2] [,3] [,4] [,5]
[1,] 1 2 3 4 5
[2,] 6 7 8 9 10
[3,] 11 12 13 14 15
[4,] 16 17 18 19 20
[5,] 21 22 23 24 25
Here’s an example of a matrix with character strings, specifically the colors of the rainbow.
The data is supplied with the c() function, which simply combines the elements.
rainbow_matrix <- matrix(data = c("red", "orange", "yellow",
                                  "green", "blue", "purple"), nrow = 2, ncol = 3)
rainbow_matrix
[,1] [,2] [,3]
[1,] "red" "yellow" "blue"
[2,] "orange" "green" "purple"
You can access an item within your matrix by using [], where the first number represents the row and the second represents the column.
my_matrix[2,4]
[1] 17
my_matrix_byrow[2,4]
[1] 9
rainbow_matrix[1,3]
[1] "blue"
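The same bracket notation can also return a whole row or column by leaving one position empty. The following is a small sketch of that idea using my_matrix as defined earlier in the lesson; the expected values in the comments follow from the column-wise fill shown above.

```r
# Same matrix as earlier in the lesson, filled column by column
my_matrix <- matrix(data = 1:25, nrow = 5, ncol = 5)

my_matrix[2, 4]   # single element: row 2, column 4 -> 17
my_matrix[2, ]    # entire second row -> 2 7 12 17 22
my_matrix[, 4]    # entire fourth column -> 16 17 18 19 20
dim(my_matrix)    # number of rows and columns -> 5 5
```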
Next, we’ll test out what happens if we try to create a matrix that is smaller than our data:
matrix(data = 1:25, nrow = 4, ncol = 4)
[,1] [,2] [,3] [,4]
[1,] 1 5 9 13
[2,] 2 6 10 14
[3,] 3 7 11 15
[4,] 4 8 12 16
Next, let’s test out what happens if we try to create a matrix that is larger than our given data:
matrix(data = 1:25, nrow = 6, ncol = 6)
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 1 7 13 19 25 6
[2,] 2 8 14 20 1 7
[3,] 3 9 15 21 2 8
[4,] 4 10 16 22 3 9
[5,] 5 11 17 23 4 10
[6,] 6 12 18 24 5 11
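In both outputs above, R quietly adapts the data to the requested size. When the matrix is too small, only the first 16 values are used. When it is too large, R runs out of data after 25 and starts again from 1 (this is called recycling). R also typically prints a warning that the data length does not fit the matrix dimensions. The sketch below checks this behavior; suppressWarnings() is used here only to keep the output quiet, and the names small and large are made up for this illustration.

```r
# suppressWarnings() hides the length-mismatch warning R would otherwise print
small <- suppressWarnings(matrix(data = 1:25, nrow = 4, ncol = 4))
large <- suppressWarnings(matrix(data = 1:25, nrow = 6, ncol = 6))

length(small)   # 16: only the first 16 values are kept
length(large)   # 36: more cells than data values

# Cells 26 to 36 are refilled from the start of the data
large[26:36]    # 1 2 3 4 5 6 7 8 9 10 11
```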
Matrices are often used for data transformation. So as a final example of matrices, let’s see how we can easily transform my_matrix into a new matrix in which every element is multiplied by 2.
# original matrix
my_matrix
[,1] [,2] [,3] [,4] [,5]
[1,] 1 6 11 16 21
[2,] 2 7 12 17 22
[3,] 3 8 13 18 23
[4,] 4 9 14 19 24
[5,] 5 10 15 20 25
# matrix multiplied by 2
my_matrix*2
[,1] [,2] [,3] [,4] [,5]
[1,] 2 12 22 32 42
[2,] 4 14 24 34 44
[3,] 6 16 26 36 46
[4,] 8 18 28 38 48
[5,] 10 20 30 40 50
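Other arithmetic works the same way, element by element. The sketch below is an added illustration (not part of the original lesson) showing a few such transformations on my_matrix, including t(), which swaps rows and columns.

```r
# Same matrix as earlier in the lesson
my_matrix <- matrix(data = 1:25, nrow = 5, ncol = 5)

my_matrix + 10         # adds 10 to every element
my_matrix * my_matrix  # multiplies matching elements (element-wise)
t(my_matrix)           # transpose: rows become columns
```

Note that t(my_matrix) gives the same values as my_matrix_byrow from earlier, since filling by row is the mirror image of filling by column.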
An array has 1 or more dimensions of data, but only contains a single data type.
We can create an array with the array() function. Even though we are calling array(), we can see that this one-dimensional array looks just like a vector.
vector <- 1:25
array(vector)
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
[23] 23 24 25
The array() function has an argument called “dim”, which is where we can set the dimensions.
[,1] [,2] [,3] [,4] [,5]
[1,] 1 6 11 16 21
[2,] 2 7 12 17 22
[3,] 3 8 13 18 23
[4,] 4 9 14 19 24
[5,] 5 10 15 20 25
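The call that produced this output is not shown in this text. One call that reproduces it, assuming the same vector of 1 to 25 as above, is sketched here.

```r
vector <- 1:25

# dim = c(5, 5) asks for 5 rows and 5 columns
array(vector, dim = c(5, 5))
```

With two dimensions, the result has the same layout as the my_matrix example from the matrix section.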
However, arrays can also contain more than 2 dimensions. Let’s see what happens if we add another dimension.
, , 1
[,1] [,2] [,3] [,4] [,5]
[1,] 1 6 11 16 21
[2,] 2 7 12 17 22
[3,] 3 8 13 18 23
[4,] 4 9 14 19 24
[5,] 5 10 15 20 25
, , 2
[,1] [,2] [,3] [,4] [,5]
[1,] 1 6 11 16 21
[2,] 2 7 12 17 22
[3,] 3 8 13 18 23
[4,] 4 9 14 19 24
[5,] 5 10 15 20 25
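Again, the code behind this output is missing from this text. A likely form, assuming the same 25-element vector, adds a third number to dim; the name array_3d is made up for this sketch.

```r
vector <- 1:25

# 5 rows, 5 columns, and 2 "slices"
array_3d <- array(vector, dim = c(5, 5, 2))
array_3d

dim(array_3d)   # 5 5 2
```

The array needs 50 values but the vector has only 25, so R recycles the data and the second slice repeats the first.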
Here’s one final example, but I encourage you to play around with the array() function and test out different numbers and dimensions.
, , 1
[,1] [,2] [,3]
[1,] 1 3 5
[2,] 2 4 6
, , 2
[,1] [,2] [,3]
[1,] 7 9 11
[2,] 8 10 12
, , 3
[,1] [,2] [,3]
[1,] 13 15 17
[2,] 14 16 18
, , 4
[,1] [,2] [,3]
[1,] 19 21 23
[2,] 20 22 24
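The output above is consistent with 24 values arranged in 2 rows, 3 columns, and 4 slices. The sketch below is one way to reproduce it (the data 1:24 and the name my_array are assumptions, since the original call is not shown), along with how to pull out a single element or a whole slice.

```r
# 2 rows, 3 columns, 4 slices: 2 * 3 * 4 = 24 values
my_array <- array(1:24, dim = c(2, 3, 4))
my_array

my_array[1, 2, 3]   # row 1, column 2, slice 3 -> 15
my_array[, , 4]     # the entire fourth slice (a 2 x 3 matrix)
```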
For more examples check out this website.
In this lesson we introduced three types of data structures: vectors, matrices, and arrays.
A vector is the simplest type of data structure: all of its elements are the same data type and it has only 1 dimension.
A matrix also only contains elements of the same data type, but it has 2 dimensions.
An array also only contains elements of the same data type, but an array can be as simple as a vector with 1 dimension or it can be more complex with several dimensions.
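As a quick recap, the dim() function makes the difference between the three structures visible. This is a small added sketch; the names v, m, and a are made up for the illustration.

```r
v <- 1:25                                  # vector
m <- matrix(data = 1:25, nrow = 5, ncol = 5)  # matrix
a <- array(1:24, dim = c(2, 3, 4))         # array with 3 dimensions

dim(v)   # NULL: a plain vector has no dim attribute
dim(m)   # 5 5
dim(a)   # 2 3 4
```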
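Create a vector of artists’ names and save it as artists. Print your results. What is the class of artists?
Create a vector of the artists’ heights and save it as artists_heights. Make sure to save the heights in the same order as you saved the names. What is the class of artists_heights?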
Hint: if you get stuck, try to use Google to learn how to print letters of the alphabet in R.