Collinearity

 


In computer programming, flow control refers to managing the order in which a program’s statements are executed. It lets developers direct execution with conditions and loops, so that a program can respond to different situations. Like other programming languages, R offers familiar tools such as if, else, and loops, but it also has some features of its own that can make your code more concise and efficient.



The Conditional Statements

The if statement executes a block of code when a specified condition is true. The optional else statement provides an alternative block of code to execute when the condition of the associated if statement evaluates to false.

if(condition) {
  action 1
} else {
  action 2
}


This simple logic can be extended with else if. R evaluates the conditions in order and performs the action belonging to the first condition that is true; if none is true, it runs the else block. For example:

x <- 10

if (x > 0) {
  print("x is positive")
} else if (x == 0) {
  print("x is zero")
} else {
  print("x is negative")
}
## [1] "x is positive"


In the code chunk above, the first condition evaluates to true, so R executes print("x is positive") and skips the remaining conditions.

Note that the condition passed to if must evaluate to a single Boolean value; vectorization does not apply here. If the condition has length greater than one, R throws an error (since R 4.2.0; earlier versions issued a warning and used only the first element). For example:

conditions <- c(FALSE, TRUE, TRUE)

if (conditions){
  print("Will throw an error!")
}
## Error in if (conditions) {: the condition has length > 1
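
When the test really does involve a whole vector, one option is to collapse it to a single value with any() or all() before handing it to if, or to use the vectorized ifelse() when a result is wanted for every element. A minimal sketch, reusing the conditions vector from above (any_true, all_true and answers are only illustrative names):

```r
conditions <- c(FALSE, TRUE, TRUE)

# Collapse the vector to one logical value before using it in if
any_true <- any(conditions)   # TRUE if at least one element is TRUE
all_true <- all(conditions)   # TRUE only if every element is TRUE

if (any_true) {
  print("At least one condition holds")
}

# ifelse() is vectorized: it returns one result per element
answers <- ifelse(conditions, "yes", "no")
answers
```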


Combining Booleans and Lazy Evaluation

For more complex conditions, we may combine two or more conditions. Like other programming languages, R has AND and OR operators: the element-wise & and |, and the single-value && and ||.

The && and || operators evaluate lazily (often called short-circuit evaluation): the right-hand side is evaluated only if the left-hand side does not already determine the result, which can save time and resources. If you use && or || for flow control and put the cheaper condition first, you can improve the performance of your program. For example:

(FALSE & all(rep(1, 10^8) == 1))
## [1] FALSE

Time for this code chunk to run: 6.36548686027527


(FALSE && all(rep(1, 10^8) == 1))
## [1] FALSE

Time for this code chunk to run: 5.09875917434692


We see that the version using && is faster. This is because & has to evaluate both its left- and right-hand sides, whereas && returns FALSE as soon as its left-hand side is FALSE.

Of course, this only helps when the expression that is quick to evaluate comes first; if you reverse the order of the conditions, there is no notable difference.

(all(rep(1, 10^8) == 1) && FALSE)
## [1] FALSE

Time for this code chunk to run: 6.20408987998962

(all(rep(1, 10^8) == 1) & FALSE)
## [1] FALSE

Time for this code chunk to run: 6.0654890537262
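
One way to see the short-circuit directly, without relying on timings, is to count how often the right-hand side is actually evaluated. The sketch below uses a hypothetical helper, expensive_check(), that increments a counter each time it runs. The last lines show a common use of the same behavior: guarding a comparison that would otherwise produce a zero-length result for NULL.

```r
calls <- 0

# Hypothetical helper: records each call, then does some slow work
expensive_check <- function() {
  calls <<- calls + 1
  all(rep(1, 10^6) == 1)
}

res_short <- FALSE && expensive_check()  # right-hand side is skipped
calls_after_short <- calls               # counter is unchanged

res_full <- FALSE & expensive_check()    # right-hand side is evaluated
calls_after_full <- calls                # counter went up by one

# Guard pattern: the comparison runs only when value is not NULL
value <- NULL
is_positive <- !is.null(value) && value > 0
is_positive
```

Because && stops at the first FALSE, the order of the operands matters: the cheap or protective check goes on the left.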



Iterations

In computer programming, iteration refers to repeatedly executing a block of code until a certain condition is met. R supports for and while loops for this purpose.


For Loops

When you know in advance how many iterations you need, you can use a for loop. The syntax of a for loop in R is:

for(i in vector){
  ...
}


Where

  • vector is the vector that you want to iterate over
  • i is the loop variable, which takes the value of each element of vector in turn

The actions inside { } are performed once for each element of vector. For example:

x = seq(from = 1, to = 10, by = 2)
for(i in x) {
  print(i^2)
}
## [1] 1
## [1] 9
## [1] 25
## [1] 49
## [1] 81


Here x is a numeric vector of length 5 holding 1, 3, 5, 7, 9. R sets i to x[1], x[2], …, x[5] in turn and, for each value, runs the body of the loop, which prints i^2.
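
A for loop can also run over positions rather than values, which helps when the results need to be stored. A minimal sketch, assuming we want to keep the squares instead of printing them (squares and k are illustrative names). seq_along(x) generates 1, 2, ..., length(x) and, unlike 1:length(x), yields an empty sequence when x is empty:

```r
x <- seq(from = 1, to = 10, by = 2)

# Pre-allocate the result vector, then fill it position by position
squares <- numeric(length(x))
for (k in seq_along(x)) {
  squares[k] <- x[k]^2
}
squares

# The same values without a loop, since ^ is vectorized
x^2
```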

Like other flow control constructs, for loops can be nested. For example:

d = 1:5
D = matrix(NA, nrow = length(d), ncol = length(d))
D
##      [,1] [,2] [,3] [,4] [,5]
## [1,]   NA   NA   NA   NA   NA
## [2,]   NA   NA   NA   NA   NA
## [3,]   NA   NA   NA   NA   NA
## [4,]   NA   NA   NA   NA   NA
## [5,]   NA   NA   NA   NA   NA


Suppose that we want to fill in the matrix D element by element. Each diagonal element should hold its row number, and all other elements should be zero:

for (i in 1:nrow(D)) {
  for (j in 1:ncol(D)) {
    if (i == j) {
      D[i,j] = d[i]
    } else {
      D[i,j] = 0
    }
  }
}

D
##      [,1] [,2] [,3] [,4] [,5]
## [1,]    1    0    0    0    0
## [2,]    0    2    0    0    0
## [3,]    0    0    3    0    0
## [4,]    0    0    0    4    0
## [5,]    0    0    0    0    5
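
For this particular pattern a loop is not strictly necessary. As a sketch of a vectorized alternative, base R's diag() builds a diagonal matrix from a vector. The code below rebuilds the matrix with the nested loop and checks that both approaches agree (D_loop and D_diag are illustrative names):

```r
d <- 1:5

# Nested-loop version, as in the text
D_loop <- matrix(NA, nrow = length(d), ncol = length(d))
for (i in seq_len(nrow(D_loop))) {
  for (j in seq_len(ncol(D_loop))) {
    D_loop[i, j] <- if (i == j) d[i] else 0
  }
}

# Vectorized version: d on the diagonal, zeros elsewhere
D_diag <- diag(d)

all(D_loop == D_diag)
```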


While Loops

Often we don’t know in advance how many iterations are needed, but we do know the condition under which to stop. In such cases, you can use a while loop. The syntax of a while loop in R is as follows:

while(condition) {
  ...
}


If the condition evaluates to true, the code inside the block is executed. Once the block has finished, R checks whether the condition still holds; if it does, the block is executed again.

This repeats until the condition evaluates to false or R encounters a break during the execution.

For example, we could use a while loop to find the largest power of 2 less than 1000 as below:

x = 2

while(x * 2 < 1000) {
  x = x * 2
}
x
## [1] 512
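
The same loop can be wrapped in a function so that the limit becomes a parameter. This is only a sketch: largest_power_below() and its arguments are hypothetical names, and it assumes limit > 1 and base > 1 so that the loop terminates. For base 2 and an integer limit, the result can be cross-checked against a closed form using log2().

```r
largest_power_below <- function(limit, base = 2) {
  stopifnot(limit > 1, base > 1)
  x <- 1  # start from base^0
  while (x * base < limit) {
    x <- x * base
  }
  x
}

largest_power_below(1000)

# Cross-check: the largest exponent k with 2^k < 1000 is floor(log2(999))
2^floor(log2(999))
```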


Another slightly less silly example is the modified birthday problem.

Suppose that there are many classes, each filled with 20 randomly selected individuals. Checking the classes one by one, how many classes do you have to check before you find one with two pairs of people who share a birthday, that is, at least two different days that are each shared by two or more people?

Instead of relying solely on a mathematical calculation, we can employ a while loop to iteratively approach the solution.

For each execution, we would like to sample 20 birthdays and check whether there are any matching pairs.

days_in_year = 365
class_size = 20

birthdays = sample(1:days_in_year, class_size, replace = TRUE)
table(birthdays) >= 2
## birthdays
##    11    49    51    63    93   126   143   147   150   157   158   163   201 
## FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE 
##   267   289   291   294   353   357   365 
## FALSE FALSE FALSE FALSE FALSE FALSE FALSE


So, in a while loop, we count the number of classes checked until we encounter a class with two matching pairs of birthdays.

days_in_year = 365
class_size = 20
num_classes = 0

while(TRUE) {
    num_classes = num_classes + 1
    birthdays = sample(1:days_in_year, class_size, replace = TRUE)
    two_pairs = sum(table(birthdays) >= 2) >= 2
    if(two_pairs) {
        break
    }
}

num_classes
## [1] 12
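
A single run gives only one random draw, so the count changes from run to run. To get a more stable picture, one option is to wrap the loop in a function and repeat it many times. The sketch below makes that assumption explicit; classes_until_two_pairs() and n_sims are hypothetical names, and set.seed() is used only to make the run reproducible.

```r
classes_until_two_pairs <- function(class_size = 20, days_in_year = 365) {
  num_classes <- 0
  while (TRUE) {
    num_classes <- num_classes + 1
    birthdays <- sample(1:days_in_year, class_size, replace = TRUE)
    # At least two distinct days that are each shared by two or more people
    if (sum(table(birthdays) >= 2) >= 2) {
      break
    }
  }
  num_classes
}

set.seed(1)
n_sims <- 200
results <- replicate(n_sims, classes_until_two_pairs())

mean(results)   # average number of classes checked
```

Averaging over many repetitions approximates the expected number of classes, which is what the single while loop above samples once.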
