Quantitative Genomics and Genetics 2016

Computer Lab 4

– 25 February 2016

– Author: Jin Hyun Ju (jj328@cornell.edu)

1. Boolean Data Type

  • Boolean data type is a data type with only two possible values, TRUE and FALSE.

  • Its main usage is for testing conditions.

  • In R and many other languages, TRUE and FALSE are treated as 1 and 0 when they are used in arithmetic.

  • This can be illustrated as follows:

sum(TRUE + TRUE)
[1] 2
sum(FALSE)
[1] 0
  • Comparisons return Boolean values, which is what makes them useful for testing conditions.

  • For example, to find out whether the elements of a vector are greater than or smaller than a certain value, you can use the comparison operators >, <, >= and <=.

example.vector <- seq(1,25,by= 2)
example.vector
 [1]  1  3  5  7  9 11 13 15 17 19 21 23 25
example.vector > 10
 [1] FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
[12]  TRUE  TRUE
example.vector >= 15
 [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE
[12]  TRUE  TRUE
example.vector <= 5
 [1]  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[12] FALSE FALSE
  • Each position where the value meets the condition will be marked with TRUE, and with FALSE otherwise.

  • Since TRUE and FALSE are essentially 1 and 0, you can easily find out how many elements satisfy the condition by simply taking the sum of the result.

sum(example.vector > 10)
[1] 8
  • Comparisons work the same way on matrices:
example.mx <- matrix(c(2,5,7,-2,-5,-10), ncol = 3, byrow = TRUE)
example.mx < 1
      [,1]  [,2]  [,3]
[1,] FALSE FALSE FALSE
[2,]  TRUE  TRUE  TRUE
  • You can also check if an element is equal to a specific value
dim(example.mx)[1] == 2
[1] TRUE
example.vector == 3
 [1] FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[12] FALSE FALSE
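
  • As a small illustration of the counting idea above, the sketch below (not part of the original lab) uses mean() to get the proportion of elements that satisfy a condition, and any() and all() to ask whether at least one element or every element does. The variable names is.large, n.large, prop.large, any.large and all.large are made up for this example, and example.vector is redefined so the block runs on its own.

```r
# Same vector as in the examples above
example.vector <- seq(1, 25, by = 2)

# Logical vector marking the elements greater than 10
is.large <- example.vector > 10

# Number of TRUE values (each TRUE counts as 1)
n.large <- sum(is.large)

# Proportion of TRUE values (the mean of the 1s and 0s)
prop.large <- mean(is.large)

# Is at least one element greater than 10? Are all of them?
any.large <- any(is.large)
all.large <- all(is.large)
```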

2. Boolean Algebra

  • Boolean algebra allows you to combine multiple conditions.

  • There are three basic operations: AND (&), OR (|) and NOT (!).

  • The AND (&) operator returns TRUE only if all conditions are TRUE

FALSE & FALSE
[1] FALSE
TRUE & FALSE
[1] FALSE
TRUE & TRUE
[1] TRUE
# Example of an AND operator
example.vector > 5 & example.vector < 10
 [1] FALSE FALSE FALSE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE
[12] FALSE FALSE
# Use the logical vector as an index if you want to see the actual elements
example.vector[example.vector > 10 & example.vector < 20]
[1] 11 13 15 17 19
  • The OR (|) operator returns TRUE when at least one condition is TRUE
FALSE | FALSE
[1] FALSE
TRUE | FALSE
[1] TRUE
TRUE | TRUE
[1] TRUE
# Example of an OR operator
example.vector > 10 | example.vector < 20
 [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
example.vector < 10 | example.vector > 20
 [1]  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE  TRUE
[12]  TRUE  TRUE
  • NOT (!) returns the opposite result. Combined with =, it gives the "not equal to" operator !=
!(TRUE)
[1] FALSE
example.vector != 3
 [1]  TRUE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
[12]  TRUE  TRUE
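
  • To see how the three operators fit together, here is a minimal sketch (an added illustration, not from the original lab) that selects the elements outside an interval in two equivalent ways: by negating an AND, and by combining two negated conditions with OR. The names a, b, outside.1 and outside.2 are hypothetical.

```r
example.vector <- seq(1, 25, by = 2)

# Two simple conditions
a <- example.vector > 5
b <- example.vector < 20

# Elements NOT strictly between 5 and 20, written two equivalent ways
outside.1 <- !(a & b)
outside.2 <- !a | !b

# Both expressions give the same logical vector
identical(outside.1, outside.2)

# Use the logical vector as an index to pull out the matching elements
example.vector[outside.1]
```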

3. More on Vector Elements

  • If you want to check whether a certain element is present in a vector, use the %in% operator
fruits <- c("banana","apple","strawberry","peach","mango")

"mango" %in% fruits
[1] TRUE
"durian" %in% fruits
[1] FALSE
  • We can see what the ! operator is doing by wrapping the previous expression with !()
!("durian" %in% fruits)
[1] TRUE
  • You can find out the index of a certain entry in a vector by using the which() function
which(fruits == "apple")
[1] 2
  • If you want to compare two vectors, %in% checks each element of the first vector against the second
fruits2 <- c("orange","banana","durian","cherry","mango","apple")

fruits2 %in% fruits
[1] FALSE  TRUE FALSE FALSE  TRUE  TRUE
# show me the position
which(fruits2 %in% fruits)
[1] 2 5 6
# show me the elements
fruits2[fruits2 %in% fruits]
[1] "banana" "mango"  "apple" 
# There is also a function for this
intersect(fruits2, fruits)
[1] "banana" "mango"  "apple" 
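
  • The opposite question, which elements of one vector are missing from the other, can be sketched by combining ! with %in%. The block below is an added illustration; setdiff() and union() are base R set functions, and the names only.in.fruits2, only.in.fruits2.alt and all.fruits are made up for this example.

```r
fruits  <- c("banana", "apple", "strawberry", "peach", "mango")
fruits2 <- c("orange", "banana", "durian", "cherry", "mango", "apple")

# Elements of fruits2 that are NOT in fruits, using ! with %in%
only.in.fruits2 <- fruits2[!(fruits2 %in% fruits)]

# The same result with the set function setdiff()
only.in.fruits2.alt <- setdiff(fruits2, fruits)

# All distinct elements found in either vector
all.fruits <- union(fruits, fruits2)
```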

4. If / Else Statements

  • By using if and else statements, you can make parts of your script run only under specific conditions

  • The code inside an if statement will only be executed when the condition is TRUE

  • The structure looks like this

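if (condition) {
  # code to run when condition is TRUE
} else {
  # code to run when condition is FALSE
}

# Or you can add more levels by using else if

if (condition) {
  # code to run when condition is TRUE
} else if (condition2) {
  # code to run when condition is FALSE and condition2 is TRUE
} else {
  # code to run when both conditions are FALSE
}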
  • Here is a simple example
# Loop over individual elements in example.vector
for( i in example.vector){
  
    if( i < 10 ){
        cat(i, "is smaller than 10 \n") 
    } else if ( 10 <= i && i < 20){
        cat(i, "is in the interval [10,20) \n")
    } else {
        cat(i, "is larger than 20 \n")
    }

}
1 is smaller than 10 
3 is smaller than 10 
5 is smaller than 10 
7 is smaller than 10 
9 is smaller than 10 
11 is in the interval [10,20) 
13 is in the interval [10,20) 
15 is in the interval [10,20) 
17 is in the interval [10,20) 
19 is in the interval [10,20) 
21 is larger than 20 
23 is larger than 20 
25 is larger than 20 
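
  • The same branching logic can also live inside a function. The following is a minimal sketch under stated assumptions, not part of the original lab: classify.value() is a hypothetical helper that returns a label instead of printing it, and value.labels is a made-up variable name. Because each branch is only reached when the earlier conditions are FALSE, the second test does not need to repeat the lower bound.

```r
# Hypothetical helper: returns a label for a single number
classify.value <- function(x) {
  if (x < 10) {
    "smaller than 10"
  } else if (x < 20) {
    # Only reached when x >= 10
    "in the interval [10,20)"
  } else {
    "20 or larger"
  }
}

example.vector <- seq(1, 25, by = 2)

# Apply the function to every element; sapply() returns a character vector
value.labels <- sapply(example.vector, classify.value)

# Count how many elements fall into each category
table(value.labels)
```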

Exercise

  • Using if/else statements and plotting functions, create a function called data_plotter() that generates a histogram, a scatter plot, or a density plot depending on its plot_type argument. It should work with calls like these:
sample_data <- runif(1000)

data_plotter(sample_data, plot_type = "histogram")

data_plotter(sample_data, plot_type = "scatter")

data_plotter(sample_data, plot_type = "density")