Functions

Topics covered in this section:

➤ Creating a function

➤ Return values

➤ Arguments

➤ Argument Matching

➤ Named Arguments

➤ Lazy evaluation

➤ Variable Arguments

➤ Arguments passed after variable arguments

➤ Extending a function

↪ Creating a function

A function is a block of code that runs only when it is called. Functions in R are “first-class objects”, which means the function can be treated much like any other R object.

Functions are created using the function() directive, as shown below.

      f <- function() { 
        print("I am an R function")
      }
      f()

      ---Output---
      [1] "I am an R function"

Functions are stored like any other R object, and they have class “function”.
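As a small illustration of what “first-class” means in practice, the sketch below stores a function under a second name, places it in a list, and inspects its class. The names greet, say_hello, and toolbox are made up for this example.

```r
# A function is an ordinary R object of class "function".
# greet, say_hello, and toolbox are hypothetical names for this sketch.
greet <- function() {
  "I am an R function"
}

class(greet)                 # "function"
is.function(greet)           # TRUE

# It can be assigned to another name ...
say_hello <- greet
say_hello()                  # "I am an R function"

# ... or stored inside a list and called from there.
toolbox <- list(hello = greet)
toolbox$hello()              # "I am an R function"
```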

↪ Return Values

The value of the last evaluated expression is returned automatically when there is no explicit return statement in the function. An explicit return() statement can be used for returning a value from a function.

The return() statement need not be the last statement of the function. As soon as it is evaluated, the function exits and control returns to the place from which the function was called.

The return() function can return only a single object. However, multiple return values can be wrapped in a list (or other objects) and the return() function can return the wrapped object.
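The three behaviours described above (implicit return, early return, and wrapping several values in a list) might look like the following. The functions safe_divide and summarise_pair are invented for illustration only.

```r
# Early return: control leaves the function as soon as return() is evaluated.
# safe_divide and summarise_pair are hypothetical names for this sketch.
safe_divide <- function(x, y) {
  if (y == 0) {
    return(NA)               # nothing after this line runs when y is 0
  }
  x / y                      # last evaluated expression is returned implicitly
}

safe_divide(10, 2)           # 5
safe_divide(10, 0)           # NA

# Several results wrapped in a single list.
summarise_pair <- function(x, y) {
  list(sum = x + y, product = x * y)
}

result <- summarise_pair(3, 4)
result$sum                   # 7
result$product               # 12
```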

↪ Arguments

Functions can take parameters, also called arguments. Arguments are optional, which means a function may be defined with or without them. Arguments are named in the function definition and can have default values. Values for the arguments are passed when the function is invoked.

      f <- function(x, y) { 
        print("I am an R function")
        x + y                      # calculate the sum of x and y
      }
      f(2, 3)                    # x and y arguments are passed to the function

      ---Output---
      [1] "I am an R function"
      [1] 5

↪ Argument Matching

In R, function arguments can be matched positionally or by name. Positional matching means that R assigns the first value to the first argument, the second value to the second argument, and so on.

Function arguments can also have default values, which are used when the caller does not supply a value for that argument.

      f <- function(x, y, z = 10) {      # Argument z with a default value 10
        print("I am an R function")
        x + y + z
      }
      f(2, 3)

      ---Output---
      [1] "I am an R function"
      [1] 15

In the above function call, the first value 2 is matched to the first argument x, and the second value 3 is matched to the second argument y.

In the above function call, the value for the argument z is not passed, and the default value is used for the computation of the sum. However, when the value is passed to the argument that has a default value, the passed value overrides the default value.

      f(2,3,5)

      ---Output---
      [1] "I am an R function"
      [1] 10

↪ Named Arguments

Values can also be passed to a function by argument name, in which case the order of the arguments in the call does not matter.

      f <- function(x, y, z = 10) { 
        print("I am an R function")
        cat("Values are: ", x, y, z, "\n")
        x + y + z
      }
      f(y=15, z=20, x=5)

      ---Output---
      [1] "I am an R function"
      Values are:  5 15 20 
      [1] 40
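Positional and named matching can be mixed in one call. The sketch below, which uses a made-up function describe with made-up argument names, shows that named arguments are matched first and the remaining values then fill the unmatched arguments from left to right. It also shows partial matching, where a unique prefix of an argument name is accepted.

```r
# describe, first, second, and third are hypothetical names for this sketch.
describe <- function(first, second, third = 10) {
  paste(first, second, third)
}

# Named arguments are matched first; the remaining values fill the
# unmatched arguments from left to right.
describe(1, third = 3, 2)        # "1 2 3"
describe(second = 2, 1)          # "1 2 10"

# A unique prefix of an argument name is enough (partial matching).
describe(1, 2, th = 5)           # "1 2 5"
```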

↪ Lazy evaluation

Arguments to functions are evaluated lazily, so they are evaluated only as needed in the body of the function.

      f <- function(x, y, z) { 
        print("I am an R function")
        x + y 
      }

      f(3,5)

      ---Output---
      [1] "I am an R function"
      [1] 8

The function never uses the argument z, so z is never evaluated and the call succeeds even though no value was supplied for it. The values 3 and 5 are positionally matched to x and y.

When an argument has no default value and is referenced in the function body, omitting it in the call results in an error.

      f <- function(x, y, z) { 
        print("I am an R function")
        x + y 
        z
      }
      f(3,5)

      ---Output---
      [1] "I am an R function"
      Error in f(3, 5) : argument "z" is missing, with no default
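Lazy evaluation has a few further consequences that a short sketch may make clearer: an argument that is never used is never evaluated, a default value is computed only at the point of first use, and missing() can test whether an argument was supplied. All function names below are invented for this example.

```r
# ignore_second, delayed_default, and describe_z are hypothetical names.

# An unused argument is never evaluated, so the stop() call never runs.
ignore_second <- function(x, unused) {
  x
}
ignore_second(1, stop("never evaluated"))   # 1

# A default is evaluated when it is first used, inside the function,
# so it sees the value of x at that moment.
delayed_default <- function(x, doubled = x * 2) {
  x <- x + 1                 # runs before `doubled` is evaluated
  doubled                    # evaluated here, with x already incremented
}
delayed_default(3)           # 8

# missing() checks whether an argument was supplied, without evaluating it.
describe_z <- function(x, z) {
  if (missing(z)) "z was not supplied" else paste("z is", z)
}
describe_z(1)                # "z was not supplied"
describe_z(1, 2)             # "z is 2"
```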

↪ Variable Arguments

The ... argument, known as the ellipsis, represents a variable number of arguments in a function. This is useful when the number of arguments is not known in advance.

      average <- function(x,y) {
        (x + y)/nargs()
      }
      average(10,20)               # output 15

The above function calculates the average of two numbers. What if one wants to calculate the average of 3 numbers or 4 numbers, or if one wants to calculate the average of n numbers? The ... or ellipsis argument solves this variable number of arguments problem.

      average <- function(x,y,...) {
        sum = 0;
        for (i in list(x,y,...)) {
          sum = sum + i
        }
        sum / nargs()
      }
      average(20,30,40)

      ---Output---
      [1] 30

      average(20,30,40,10,15)

      ---Output---
      [1] 23

The above function can be simplified by removing x and y.

      average <- function(...) {
        sum = 0;
        for (i in list(...)) {
          sum = sum + i
        }
        sum / nargs()
      }

      average(20,30,40,10)

      ---Output---
      [1] 25
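As an alternative to looping, the contents of ... can be collected into a vector with c(...) or into a list with list(...). The following is one possible sketch, using the made-up names average_vec and show_arg_names so that it does not replace the average() function defined above.

```r
# average_vec and show_arg_names are hypothetical names for this sketch.

# c(...) gathers all arguments into one vector, so vectorised
# functions such as sum() and length() can be used directly.
average_vec <- function(...) {
  values <- c(...)
  sum(values) / length(values)
}
average_vec(20, 30, 40, 10)      # 25

# list(...) keeps any names the caller attached to the arguments.
show_arg_names <- function(...) {
  args <- list(...)
  names(args)
}
show_arg_names(a = 1, b = 2)     # "a" "b"
```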

↪ Arguments passed after variable arguments

Any arguments that appear after ... in the argument list must be named explicitly in the call. They cannot be matched positionally or by partial name.

      expo_average <- function(..., mult=1) {
        mult * average(...)
      }
      expo_average(20,30,40,10, mult=2)     # Average of 20,30,40,10 is 25. 25*2=50

      ---Output---
      [1] 50
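To see why the name is required, consider what happens when it is left out or abbreviated. In the sketch below, expo_sum is a made-up function built on the base sum() function. An unnamed or partially named value is absorbed into ... instead of being matched to mult.

```r
# expo_sum is a hypothetical name for this sketch.
expo_sum <- function(..., mult = 1) {
  mult * sum(...)
}

expo_sum(1, 2, 3, mult = 2)      # 12: mult is matched by its full name
expo_sum(1, 2, 3, 2)             # 8: the last 2 joins ..., mult stays 1
expo_sum(1, 2, 3, mu = 2)        # 8: no partial matching after ..., so mu = 2 joins ...
```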

↪ Extending a function

When extending a function, the variable argument is often used to avoid copying the entire argument list of the original function.

      expo_average <- function(x, ...) {  #  All variable arguments are passed to the 
        x * average(...)                  #  average function, and multiplied with the  
      }                                   #  first argument x
      expo_average(2,20,30,40,10)         #  average of 20, 30,40,10 = 25.  2*25 = 50

      ---Output---
      [1] 50
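The same idea applies when wrapping a built-in function. As a rough sketch, the made-up function shout below extends the base paste() function by forwarding ... to it unchanged, so none of the arguments of paste() other than sep need to be repeated in the wrapper.

```r
# shout is a hypothetical wrapper around the base function paste().
shout <- function(..., sep = " ") {
  toupper(paste(..., sep = sep))   # forward everything in ... to paste()
}

shout("hello", "world")            # "HELLO WORLD"
shout("a", "b", sep = "-")         # "A-B"
```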

In R, a function can be passed into another function as an argument.

      compute <- function(..., fun) {        # fun is the function to apply
        fun(...)
      }
      compute(20,30,40,10, fun=average)      # 25
      compute(20,30,40,10, fun=sum)          # 100
      compute(20,30,40,10, fun=mean)         # 20, not 25: mean() treats only its first argument as data
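Functions passed as arguments do not need a name. The sketch below passes an anonymous function to a made-up helper called apply_twice, and to the base function sapply(), which applies a function to each element of a vector.

```r
# apply_twice is a hypothetical name for this sketch.
apply_twice <- function(fun, x) {
  fun(fun(x))                              # call the supplied function two times
}

apply_twice(function(v) v * 2, 5)          # 20

# sapply() calls the function once per element and collects the results.
squares <- sapply(c(1, 2, 3), function(v) v^2)
squares                                    # 1 4 9
```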

↪ Debugging

traceback()

The traceback() function prints the call stack of the last uncaught error, i.e., the sequence of calls that led to the error.

      # average() function
      average <- function(...) {
        sum = 0;
        for (i in list(...)) {
          sum = sum + i
        }
        sum / nargs()
      }
 
      # an extended average() function
      expo_average <- function(x, ...) {
        x * average(...)
      }
      expo_average(2,20,30,40,10, X)

      ---Output---
      Error in average(...) : object 'X' not found

      traceback()

      ---Output---
      2: average(...) at #2
      1: expo_average(2, 20, 30, 40, 10, X)

The traceback() function always reports the most recent uncaught error, so it should be called before another error occurs and replaces the stored call stack.

debug()

The debug() function flags a function for debugging. The next time the flagged function is called, R opens an interactive browser in which you can step through the function one expression at a time to pinpoint exactly where an error occurs. The debug() function takes the function itself, not a call to it, as its first argument.

      div <- function(x,y) {
        x / y
      }
      debug(div)             # flag div() for debugging
      div(10,0)              # opens an interactive browser!

The interactive browser opens every time div() is called until the flag is removed. The undebug() function should be called when the debugging session is complete.

      undebug(div)
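Whether a function is currently flagged can be checked with isdebugged(). The short sketch below sets and clears the flag without calling div(), so no browser is opened. The variable names flag_while_set and flag_after_clear are made up for this example.

```r
div <- function(x, y) {
  x / y
}

debug(div)                             # set the debugging flag
flag_while_set <- isdebugged(div)      # TRUE
undebug(div)                           # clear it again
flag_after_clear <- isdebugged(div)    # FALSE

# debugonce() sets a flag that clears itself after the next call,
# so no undebug() is needed afterwards:
# debugonce(div); div(10, 2)
```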

recover()

The recover() function can be used to modify the error behavior of R when an error occurs. This function allows the user to browse directly on any of the currently active function calls and is suitable as an error option. The expression options(error = recover) will make this the error option.

      options(error = recover)

When an error occurs, recover() first prints the function call stack and offers a menu for selecting a frame. Using the frame number, each call on the stack can be inspected to find the problem. As with undebug(), the error option should be reset once debugging is complete.

      options(error = NULL)
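If an error handler may already be set, one option is to save the previous setting and restore it afterwards instead of resetting it to NULL. This works because options() returns the old values of the options it changes. The variable name old_options is made up for this sketch.

```r
# options() returns the previous values of the options it changes.
old_options <- options(error = recover)

# ... reproduce the error interactively here ...

# Restore whatever error handler was in place before.
options(old_options)
```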

browser()

A call to browser() placed inside a function pauses execution at that point and gives control back to the developer. The developer can look at the objects that the function is using, look up their values with the same scoping rules that the function would use, and run code under the same conditions that the function would run.

      average <- function(...) {
        sum = 0;
        for (i in list(...)) {
          sum = sum + i
        }
 
        browser()               # Pauses here and opens the interactive browser
 
        sum / nargs()
      }
 
      average(20,30,40,10)
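Pausing on every call can be inconvenient, so browser() is often wrapped in a condition. The following is a sketch under the assumption that a missing value among the inputs is the situation worth inspecting. The name average_checked is made up, and the interactive() guard keeps scripts run in batch mode from stopping.

```r
# average_checked is a hypothetical name for this sketch.
average_checked <- function(...) {
  values <- c(...)

  # Pause only in an interactive session and only when something looks wrong.
  if (interactive() && anyNA(values)) {
    browser()
  }

  sum(values) / length(values)
}

average_checked(20, 30, 40, 10)        # 25, no pause because no value is NA
```

Inside the browser, n runs the next expression, c continues to the end of the function, and Q quits the browser.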

system.time()

The system.time() function takes an expression as input and returns the amount of time taken, in seconds, to evaluate the expression. If the expression raises an error, system.time() reports the time taken up to that point in a "Timing stopped at:" message.

      average <- function(...) {
        sum = 0;
        for (i in list(...)) {
          sum = sum + i
        }
        sum / nargs()
      }
      system.time(average(20,30,40,10))

      ---Output---
      user  system elapsed 
      0.003   0.000   0.003

The function returns an object of class "proc_time" containing the user, system, and total elapsed times for the currently running R process, and the cumulative sum of user and system times of any child processes spawned by it on which it has waited.
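The individual timings can be extracted from the returned object by name. The sketch below times a simple loop; the variable names timing and total are made up, and the measured values will differ from run to run.

```r
# Time a block of code by wrapping it in braces.
timing <- system.time({
  total <- 0
  for (i in 1:100000) {
    total <- total + i
  }
})

class(timing)              # "proc_time"
names(timing)              # "user.self" "sys.self" "elapsed" "user.child" "sys.child"
timing[["elapsed"]]        # wall-clock seconds; varies between runs
```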

↪ Summary

➤ Functions are defined using the function() directive and are stored like any other R object

➤ An explicit return() statement returns a value immediately from a function

➤ If the function has no return() statement, it returns the value of the last expression evaluated in the function body

➤ Function arguments are named and can have default values

➤ Function arguments are matched by name or by position in the argument list

➤ The ..., known as the ellipsis, represents a variable number of arguments in a function definition

➤ All arguments defined after ... must be passed by name

➤ Functions can be debugged in several ways using the traceback(), debug(), recover(), and browser() functions, and system.time() measures how long an expression takes to run