Quantitative Genomics and Genetics 2016

Computer Lab 4

- 25 February 2016

- Author: Jin Hyun Ju (jj328@cornell.edu)

1. Boolean Data Type

• The Boolean (logical) data type has only two possible values, TRUE and FALSE.

• Its main usage is for testing conditions.

• In R and many other languages, TRUE and FALSE behave as 1 and 0 when they are used in arithmetic.

• This can be illustrated as follows:

sum(TRUE + TRUE)
[1] 2
sum(FALSE)
[1] 0
• Comparison operators return Boolean values. For example, to find out which elements of a vector are greater than or smaller than a certain value, you can use >, <, >=, and <=.

example.vector <- seq(1,25,by= 2)
example.vector
 [1]  1  3  5  7  9 11 13 15 17 19 21 23 25
example.vector > 10
 [1] FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
[12]  TRUE  TRUE
example.vector >= 15
 [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE
[12]  TRUE  TRUE
example.vector <= 5
 [1]  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[12] FALSE FALSE
• Each position where the value meets the condition will be marked with TRUE, and with FALSE otherwise.

• Since TRUE and FALSE are essentially 1 and 0, you can easily find out how many elements satisfy the condition by simply taking the sum of the result.

sum(example.vector > 10)
[1] 8
• The same applies to matrices:
example.mx <- matrix(c(2,5,7,-2,-5,-10), ncol = 3, byrow = TRUE)
example.mx < 1
      [,1]  [,2]  [,3]
[1,] FALSE FALSE FALSE
[2,]  TRUE  TRUE  TRUE
• You can also check if an element is equal to a specific value
dim(example.mx)[1] == 2
[1] TRUE
example.vector == 3
 [1] FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[12] FALSE FALSE
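
• Because TRUE and FALSE are treated as 1 and 0 in arithmetic, the same idea extends beyond sum(). The following is a minimal sketch, reusing example.vector and example.mx from above, that uses mean() to get the proportion of elements meeting a condition and rowSums() to count matches per matrix row. The names n.above, prop.above, and neg.per.row are made up for this illustration.

```r
# Recreate the objects used above so the block runs on its own
example.vector <- seq(1, 25, by = 2)
example.mx <- matrix(c(2, 5, 7, -2, -5, -10), ncol = 3, byrow = TRUE)

# Number of elements greater than 10 (each TRUE counts as 1)
n.above <- sum(example.vector > 10)

# Proportion of elements greater than 10 (mean of 0s and 1s)
prop.above <- mean(example.vector > 10)

# Number of entries smaller than 1 in each row of the matrix
neg.per.row <- rowSums(example.mx < 1)

n.above       # 8
prop.above    # 8/13, roughly 0.615
neg.per.row   # 0 3
```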

2. Boolean Algebra

• Boolean algebra allows you to combine multiple conditions.

• There are three basic operations: AND (&), OR (|), and NOT (!).

• The AND (&) operator returns TRUE only if all conditions are TRUE.

FALSE & FALSE
[1] FALSE
TRUE & FALSE
[1] FALSE
TRUE & TRUE
[1] TRUE
# Example of an AND operator
example.vector > 5 & example.vector < 10
 [1] FALSE FALSE FALSE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE
[12] FALSE FALSE
# if you want to see the actual elements
example.vector[example.vector > 10 & example.vector < 20]
[1] 11 13 15 17 19
• The OR (|) operator returns TRUE when at least one condition is TRUE
FALSE | FALSE
[1] FALSE
TRUE | FALSE
[1] TRUE
TRUE | TRUE
[1] TRUE
# Example of an OR operator
example.vector > 10 | example.vector < 20
 [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
example.vector < 10 | example.vector > 20
 [1]  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE  TRUE
[12]  TRUE  TRUE
• The NOT (!) operator returns the opposite logical value. The related operator != tests whether two values are not equal.
!(TRUE)
[1] FALSE
example.vector != 3
 [1]  TRUE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
[12]  TRUE  TRUE
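
• When a condition is tested on a whole vector, it is often useful to collapse the resulting TRUE/FALSE values into a single answer. The sketch below uses the base R functions any() and all() for this, and also checks one of De Morgan's laws, !(A & B) equals (!A | !B), on example.vector. The variable names are chosen only for this illustration.

```r
example.vector <- seq(1, 25, by = 2)

# any(): is at least one element TRUE?
any.above.20 <- any(example.vector > 20)     # TRUE

# all(): are all elements TRUE?
all.positive <- all(example.vector > 0)      # TRUE
all.above.10 <- all(example.vector > 10)     # FALSE

# De Morgan's law: negating an AND equals the OR of the negations
a <- example.vector > 5
b <- example.vector < 10
de.morgan.holds <- identical(!(a & b), !a | !b)
de.morgan.holds                              # TRUE
```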

3. More on Vector Elements

• If you want to check whether a certain element is present or absent in a vector use the %in% operator
fruits <- c("banana","apple","strawberry","peach","mango")

"mango" %in% fruits
[1] TRUE
"durian" %in% fruits
[1] FALSE
• We can see what the ! operator is doing by wrapping the previous expression with !()
!("durian" %in% fruits)
[1] TRUE
• You can find out the index of a certain entry in a vector by using the which() function
which(fruits == "apple")
[1] 2
• To compare two vectors, %in% checks each element of the first vector for membership in the second:
fruits2 <- c("orange","banana","durian","cherry","mango","apple")

fruits2 %in% fruits
[1] FALSE  TRUE FALSE FALSE  TRUE  TRUE
# show me the position
which(fruits2 %in% fruits)
[1] 2 5 6
#show me the elements
fruits2[fruits2 %in% fruits]
[1] "banana" "mango"  "apple" 
# There is also a function for this
intersect(fruits2, fruits)
[1] "banana" "mango"  "apple" 
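
• Besides intersect(), base R has related set functions. As a small sketch using the same two fruit vectors, setdiff() returns the elements found only in the first vector, union() returns all distinct elements of both, and match() returns the position of each element in the second vector (NA when absent). The result names below are made up for this illustration.

```r
fruits  <- c("banana", "apple", "strawberry", "peach", "mango")
fruits2 <- c("orange", "banana", "durian", "cherry", "mango", "apple")

# Elements of fruits2 that are NOT in fruits, using ! and %in%
only.in.fruits2 <- fruits2[!(fruits2 %in% fruits)]
only.in.fruits2                   # "orange" "durian" "cherry"

# setdiff() returns the same elements
diff.result <- setdiff(fruits2, fruits)

# All distinct elements that occur in either vector
all.fruits <- union(fruits, fruits2)
length(all.fruits)                # 8

# Position in fruits of each element of fruits2 (NA when absent)
positions <- match(fruits2, fruits)
positions                         # NA 1 NA NA 5 2
```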

4. If / Else Statements

• With if and else statements you can make parts of your script run only when specific conditions hold.

• The code inside an if statement will only be executed when the condition is TRUE

• The structure looks like this

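if (condition) {
  # code to run when condition is TRUE
} else {
  # code to run otherwise
}

# OR you can add more levels by using else if

if (condition) {
  # code to run when condition is TRUE
} else if (condition2) {
  # code to run when condition is FALSE and condition2 is TRUE
} else {
  # code to run when neither condition is TRUE
}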
• Here is a simple example
# Loop over individual elements in example.vector
for (i in example.vector) {

  # i is a single value here, so the scalar operator && is used
  if (i < 10) {
    cat(i, "is smaller than 10 \n")
  } else if (10 <= i && i < 20) {
    cat(i, "is in the interval [10,20) \n")
  } else {
    cat(i, "is larger than 20 \n")
  }

}
1 is smaller than 10
3 is smaller than 10
5 is smaller than 10
7 is smaller than 10
9 is smaller than 10
11 is in the interval [10,20)
13 is in the interval [10,20)
15 is in the interval [10,20)
17 is in the interval [10,20)
19 is in the interval [10,20)
21 is larger than 20
23 is larger than 20
25 is larger than 20 
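
• The loop above tests one element at a time. As an alternative sketch, the base R function ifelse() applies a condition to every element of a vector at once, and nesting it gives the same three groups without a loop. The name labels and the label text are chosen only for this illustration.

```r
example.vector <- seq(1, 25, by = 2)

# Nested ifelse() evaluates the conditions for all elements at once
labels <- ifelse(example.vector < 10, "smaller than 10",
          ifelse(example.vector < 20, "in the interval [10,20)",
                 "20 or larger"))

# Count how many elements fall into each group
table(labels)
```

Note that an if statement expects a single TRUE or FALSE, whereas ifelse() returns one result per element, so the two are not interchangeable.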

Exercise

• Using if/else statements and plotting functions, create a function called data_plotter that generates a histogram, a scatter plot, or a density plot depending on its plot_type argument, so that the following calls work:
sample_data <- runif(1000)

data_plotter(sample_data, plot_type = "histogram")

data_plotter(sample_data, plot_type = "scatter")

data_plotter(sample_data, plot_type = "density")
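
• If you get stuck, the following is one possible starting point, not the official solution. It assumes that a "scatter" plot of a single vector means plotting each value against its index, and it uses the base graphics functions hist(), plot(), and density(). Stopping with an error for unknown plot types and returning plot_type invisibly are design choices made for this sketch.

```r
# One possible sketch of data_plotter(); assumptions are noted in comments
data_plotter <- function(data, plot_type = "histogram") {
  if (plot_type == "histogram") {
    hist(data, main = "Histogram")
  } else if (plot_type == "scatter") {
    # Assumption: with a single vector, plot() draws value against index
    plot(data, main = "Scatter plot")
  } else if (plot_type == "density") {
    # density() estimates the distribution; plot() draws the estimate
    plot(density(data), main = "Density plot")
  } else {
    # Fail clearly on any other input instead of silently doing nothing
    stop("Unknown plot_type: ", plot_type)
  }
  # Return the plot type invisibly so that nothing is printed
  invisible(plot_type)
}
```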