Variables and Operators

This section covers the following topics:

  ➤ Creating Variables

  ➤ Variable Types

  ➤ Strings

  ➤ Operators

  ➤ Date Variable

  ➤ Time Variable

↪ Creating Variables

A variable in R is created when a value is assigned to it. Either the <- operator or the = operator assigns a value to a variable. To display the value, type the variable name at the console or call the print() function.

      name <- "apple"   # A variable name has been created
      price = 3
      name              # output "apple"
      print(price)      # output 3

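R displays the value of an expression automatically only at the top level of the console. Inside a block enclosed in curly braces, such as a loop body or a function body, intermediate results are not displayed, so call the print() function explicitly to print them.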
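As a minimal sketch of this behaviour, the two helper functions below run the same loop with and without print(). The names double_silently and double_loudly are hypothetical, chosen only for this illustration.

```r
# Hypothetical helper: the loop body computes a value but never prints it.
double_silently <- function(values) {
  for (v in values) {
    v * 2            # evaluated, but nothing is displayed
  }
}

# Hypothetical helper: the same loop with an explicit print() call.
double_loudly <- function(values) {
  for (v in values) {
    print(v * 2)     # displayed on every iteration
  }
}

double_silently(1:3)   # shows nothing
double_loudly(1:3)     # shows [1] 2, [1] 4 and [1] 6 on separate lines
```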

Variables in R are not declared with a type. The type follows the assigned value, so it can change whenever a new value is assigned. The class() function reports the current data type of a variable.

      mars <- "chocolate"   # Type string. It is a "character" type in R.
      class(mars)           # "character"
      mars <- 10            # Type numeric.
      class(mars)           # "numeric"

↪ Variable Types

Basic data types in R are:

  • numeric - The default type for all numbers
  • integer - A whole number declared with the suffix "L", for example 10L
  • complex - For example 1 + 1i, where the term with the "i" suffix is the imaginary part of the number
  • character/string - Text enclosed in single or double quotation marks
  • logical/boolean - TRUE or FALSE

Here are a few examples:

          x <- 2.7         # numeric
          class(x)
          x <- 10L         # integer
          class(x)
          x <- 10 + 10i    # complex
          class(x)
          x <- "Hello"     # character
          class(x)
          x <- TRUE        # logical
          class(x)
    
    

    ↪ Strings

    A string in R is enclosed in either single or double quotation marks. The paste() function concatenates strings and, by default, inserts a single space between them.

          str1 <- "Hello"
          str2 <- "World"
          str <- paste(str1,str2)
          str                          # Prints "Hello World"
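The separator that paste() inserts can be changed. The short sketch below uses illustrative variable names (first, second) together with the sep argument and the paste0() function, both of which are part of base R.

```r
first  <- "Hello"
second <- "World"

with_space <- paste(first, second)               # default separator is a space
with_comma <- paste(first, second, sep = ", ")   # custom separator
no_gap     <- paste0(first, second)              # no separator at all

with_space   # "Hello World"
with_comma   # "Hello, World"
no_gap       # "HelloWorld"
```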
    
    

    A string can span multiple lines. R stores each line break inside the quotation marks as a newline character "\n", and any leading spaces on the continuation lines become part of the string. Typing the variable name shows the escape sequences, while the cat() function prints the text with the line breaks in the same positions as in the code.

          str1 <- "Pound,
           Dollar,
           Euro"
     
          str1                   # Prints "Pound,\n Dollar,\n Euro"
     
          cat(str1)              # Prints 
                                 #     Pound,
                                 #      Dollar,
                                 #      Euro
    
    

    The nchar() function returns the number of characters in a string.

          str <- "Landing on Mars"
          nchar(str)                    # Prints 15
    
    

    ↪ Operators

    Arithmetic Operators

  • + Addition
  • - Subtraction
  • * Multiplication
  • / Division
  • ^ or ** Exponentiation
  • x%%y Modulus (x mod y)
  • x%/%y Integer division

    Logical Operators

  • < Less than
  • <= Less than or equal to
  • > Greater than
  • >= Greater than or equal to
  • == Exactly equal to
  • != Not equal to
  • !x Not x
  • x | y x or y
  • x & y x and y
  • isTRUE(x) Tests whether x is TRUE
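
The arithmetic and logical operators listed in this section can be tried directly at the console. The sketch below uses two arbitrary example values, x and y, chosen only for illustration.

```r
x <- 17
y <- 5

# Arithmetic operators
x + y      # 22
x - y      # 12
x * y      # 85
x / y      # 3.4
x ^ 2      # 289 (x ** 2 gives the same result)
x %% y     # 2, the remainder of 17 divided by 5
x %/% y    # 3, the whole-number part of 17 divided by 5

# Logical operators
x > y              # TRUE
x == y             # FALSE
x != y             # TRUE
!(x > y)           # FALSE
x > 10 & y > 10    # FALSE, both sides must be TRUE
x > 10 | y > 10    # TRUE, one TRUE side is enough
isTRUE(x > y)      # TRUE
```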

    ↪ Date Variable

    In R, date values are typically entered as character strings and then translated into date variables using the as.Date() function. The syntax is as.Date(x, "input_format"), where x is the character string and input_format is built from the date format codes listed below.

  • %d Day of the month as a number (01–31)
  • %a Abbreviated weekday (Mon, Tue, ...)
  • %A Unabbreviated weekday (Monday, Tuesday, ...)
  • %m Month (01–12)
  • %b Abbreviated month (Jan, Feb, ...)
  • %B Unabbreviated month (January, February, ...)
  • %y Two-digit year
  • %Y Four-digit year
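
As a rough illustration of how these codes combine, the sketch below reads the same calendar day from three differently formatted strings. The input strings and variable names are made up for the example. Only numeric codes are used, because codes that produce names (%a, %A, %b, %B) depend on the locale of the R session.

```r
from_slashes    <- as.Date("22/06/2007", "%d/%m/%Y")   # day/month/four-digit year
from_short_year <- as.Date("06-22-07",   "%m-%d-%y")   # month-day-two-digit year
from_compact    <- as.Date("20070622",   "%Y%m%d")     # no separators

from_slashes                        # "2007-06-22"
from_slashes == from_short_year     # TRUE
from_short_year == from_compact     # TRUE
```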

    ↪ Date input format

    The default input format is 'yyyy-mm-dd'. The following statement converts character strings to dates using this default format.
          mydates <- as.Date(c("2007-06-22", "2004-02-13"))
          mydates
    
    
    In contrast, the following statement reads the dates using an 'mm/dd/yyyy' format.
          strdates <- c("01/05/1965", "08/16/1975")
          mydates <- as.Date(strdates, "%m/%d/%Y")
          mydates
    
    

    The input format string can also be stored in a variable.

          myformat <- "%m/%d/%Y"   # %Y reads the four-digit year; %y would read only two digits
          strdates <- c("01/05/1965", "08/16/1975")
          mydates <- as.Date(strdates, myformat)
          mydates
    
    

    ↪ Date output format

    Sys.Date() returns today’s date as a Date value, and date() returns a character string containing the current day of the week, month, day, time, and year.

          Sys.Date()   # Return format "2022-10-05"
          date()       # Return format "Wed Oct  5 17:02:11 2022"
    
    

    The format(x, format="output_format") function outputs dates in the specified output_format and can also extract portions of a date.

          mydate <- Sys.Date()
          format(mydate, format="%B %d %Y")  # Returned format "October 05 2022"
          format(mydate, format="%A")        # Returned format "Wednesday"
    
    

    ↪ Date Arithmetic

    R stores dates internally as the number of days since January 1, 1970. Earlier dates are represented as negative values. Subtracting one date from another gives the number of days between them.

          startdate <- as.Date("2020-01-01")
          enddate   <- as.Date("2022-01-01")
          diff      <- enddate - startdate
          diff
    
    
    The result above prints "Time difference of 731 days". The difftime() function calculates a time interval and expresses it in units of secs, mins, hours, days, or weeks.

          mydate   <- Sys.Date()                       # the comments below assume 2022-10-05
          somedate <- as.Date("2020-01-01")
          difftime(mydate, somedate, units="weeks")    # Time difference of 144 weeks
          difftime(mydate, somedate, units="hours")    # Time difference of 24192 hours
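
The day-count representation behind date arithmetic can be inspected with as.numeric(). The sketch below uses illustrative variable names and fixed dates so that the results do not depend on the day the code is run.

```r
epoch    <- as.Date("1970-01-01")
next_day <- as.Date("1970-01-02")
earlier  <- as.Date("1969-12-31")

as.numeric(epoch)      # 0
as.numeric(next_day)   # 1
as.numeric(earlier)    # -1, dates before 1970 are negative

# Adding a number to a date moves it forward by that many days.
shifted <- as.Date("2020-01-01") + 731
shifted                # "2022-01-01"
```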
    
    

    ↪ Converting dates to character variables

    The as.character() function converts Date values to character values.

          dates    <- as.Date(c("2007-06-22", "2004-02-13"))
          strDates <- as.character(dates)   # "2007-06-22" "2004-02-13"
    
    
    The conversion allows a range of character functions, such as sub(), substr(), strsplit(), and grep(), to be applied to the date values.
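
A brief sketch of these character functions applied to converted dates follows. The dates and variable names are illustrative only.

```r
dates    <- as.Date(c("2007-06-22", "2004-02-13"))
strDates <- as.character(dates)

years   <- substr(strDates, 1, 4)            # first four characters of each string
slashed <- sub("-", "/", strDates)           # replaces only the first "-" in each string
parts   <- strsplit(strDates[1], "-")[[1]]   # splits the first date at each "-"
hits    <- grep("^2004", strDates)           # positions of the strings starting with "2004"

years     # "2007" "2004"
slashed   # "2007/06-22" "2004/02-13"
parts     # "2007" "06" "22"
hits      # 2
```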

    ↪ Time Variable

    The strptime() function takes a character vector of dates and times and converts it into date/time values of class POSIXlt. The syntax is strptime(x, input_format), where x is the character vector and input_format describes the layout of the dates and times.

          datevector <- c("March 12, 2009 23:30", "August 27, 2017 15:20")
          dates <- strptime(datevector, "%B %d, %Y %H:%M")
          dates
    
    

    Format codes for times are:

  • %c Locale-specific date and time
  • %H Hour on a 24-hour clock (00–23)
  • %I Hour on a 12-hour clock (01–12)
  • %j Day of the year (001–366)
  • %M Minute (00–59)
  • %p Locale-specific AM/PM indicator
  • %S Second
  • %U Week of the year (00–53), with Sunday as the first day of the week
  • %w Weekday as a number (0–6, Sunday=0)
  • %W Week of the year (00–53), with Monday as the first day of the week
  • %x Locale-specific date
  • %X Locale-specific time
  • %z Signed offset from UTC in hours and minutes, for example -0800
  • %Z Time zone abbreviation (character)

    For example,

          datevector <- c("March 12, 2009 23:30")
          dates <- strptime(datevector, "%B %d, %Y %H:%M")
          format(dates, format="%U")                          # Prints "10", the week of the year
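
A few more of the time format codes can be tried on the same moment. In the sketch below the input is written with numeric fields and tz = "UTC" is passed explicitly, an assumption made here so that the results do not depend on the locale or time zone of the session. The variable name stamp is illustrative.

```r
stamp <- strptime("2009-03-12 23:30:00", "%Y-%m-%d %H:%M:%S", tz = "UTC")

format(stamp, "%H:%M")   # "23:30"
format(stamp, "%I")      # "11", the hour on a 12-hour clock
format(stamp, "%j")      # "071", the day of the year
format(stamp, "%w")      # "4", a Thursday
format(stamp, "%W")      # "10", the week of the year counted from Monday
```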
    
    

    ↪ Time Variable

    Times in R are represented by two POSIX date/time classes, POSIXct and POSIXlt. Both classes inherit from the virtual class POSIXt, which allows operations such as subtraction to mix the two classes.

    The POSIXct class stores date/time values as the number of seconds since January 1, 1970, while the POSIXlt class stores them as separate human-readable components: second, minute, hour, day of the month, month, year, day of the week, and day of the year. The POSIXct format is optimized for storage and computation.
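
The difference between the two classes can be seen on a single moment. The sketch below is a minimal illustration with made-up variable names (ct, lt, later); tz = "UTC" is an assumption made so that the numbers are the same on every machine.

```r
ct <- as.POSIXct("1970-01-02 00:00:00", tz = "UTC")   # stored as seconds
lt <- as.POSIXlt(ct)                                  # stored as components

as.numeric(ct)    # 86400, the number of seconds in one day
lt$mday           # 2, the day of the month
lt$hour           # 0
class(ct)         # "POSIXct" "POSIXt"

# Subtraction can mix the two classes.
later <- as.POSIXlt("1970-01-02 01:00:00", tz = "UTC")
gap   <- later - ct
gap               # Time difference of 1 hours
```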

    The default input format for POSIX dates consists of the year, month, and day separated by slashes or dashes. For date/time values, the date may be followed by white space and a time in the form hour:minutes:seconds or hour:minutes. Examples of valid POSIX date or date/time inputs are:

  • 1985/6/30
  • 1985-06-30 10:15
  • 1985/06/30 10:15:05
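
The three inputs listed above can be passed to as.POSIXct() without a format string. The sketch below converts them one at a time; the variable names are illustrative, and tz = "UTC" is an assumption made to keep the results independent of the local time zone.

```r
date_only    <- as.POSIXct("1985/6/30", tz = "UTC")
with_minutes <- as.POSIXct("1985-06-30 10:15", tz = "UTC")
with_seconds <- as.POSIXct("1985/06/30 10:15:05", tz = "UTC")

format(date_only,    "%Y-%m-%d %H:%M:%S")   # "1985-06-30 00:00:00"
format(with_minutes, "%Y-%m-%d %H:%M:%S")   # "1985-06-30 10:15:00"
format(with_seconds, "%Y-%m-%d %H:%M:%S")   # "1985-06-30 10:15:05"
```

A missing time is read as midnight, and missing seconds are read as zero.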

    When input date/times are given as the number of seconds since January 1, 1970, POSIX date values can be created by assigning the appropriate POSIX classes directly to those numbers.

          datenum <- c(1129956691, 1535113501)         # numeric class
          class(datenum) <- c('POSIXct', 'POSIXt')
          class(datenum)                               # [1] "POSIXct" "POSIXt"
          datenum
    
          ---Output---       [1] "2005-10-22 10:21:31 IST" "2018-08-24 17:55:01 IST"

    The values print in the local time zone of the R session, which is IST in this output. Date manipulation functions dispatch on these classes, so list the specific class POSIXct first and the virtual class POSIXt last, which is the order R itself uses for date/time objects.

    Several generic functions extract pieces of date and time objects. They work on date/time values, not on plain numbers, so the seconds are converted first.

          datenum <- as.POSIXct(c(1129956691, 1535113501), origin = "1970-01-01")
          weekdays(datenum)     # weekdays() gives the day of the week
          months(datenum)       # months() gives the month name
          quarters(datenum)     # quarters() gives the quarter ("Q1", "Q2", "Q3", or "Q4")
    
    

    A POSIXlt object stores each component of a date/time separately. The function unclass() exposes these components: second, minute, hour, day of the month, month, year, day of the week, day of the year, and a daylight saving time flag. Recent R versions may list additional components such as zone and gmtoff.

          datenum <- as.POSIXct(1129956691, origin = "1970-01-01")
          datenum
          x <- as.POSIXlt(datenum, "Europe/London")     # UK time zone
          x
    
          ---Output---       [1] "2005-10-22 05:51:31 BST"

          names(unclass(x))
    
          ---Output---       [1] "sec" "min" "hour" "mday" "mon" "year" "wday" "yday" "isdst"

          x$sec   # Prints 31
          x$min   # Prints 51
          x$hour  # Prints 5
          x$mday  # Prints 22
          x$mon   # Prints 9. Jan=0, Feb=1, ... Oct=9
          x$year  # Prints 105. Years since 1900, i.e. 1900 + 105 = 2005
          x$wday  # Prints 6. Sun=0, Mon=1,...,Sat=6
          x$yday  # Prints 294. 1st January = 0, so 22nd Oct 2005 is the 295th day of the year
          x$isdst # Prints 1. Daylight saving time is in effect.
    
    

    The month values stored in the POSIXlt object use zero-based indexing, so January is stored as 0.

    The year values are stored as an offset from 1900, so 2005 is stored as 105. For the day of the week, 0 represents Sunday. Similarly, 1st January is stored as 0 for the day of the year.
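
Because of these offsets, the stored components usually need a small adjustment before they are shown to a reader. The sketch below is one possible way to do that. It uses a fixed example moment in UTC (an assumption made for reproducibility) and the built-in month.name constant, which holds the English month names.

```r
x <- as.POSIXlt("2005-10-22 05:51:31", tz = "UTC")

calendar_year  <- x$year + 1900   # stored as years since 1900
calendar_month <- x$mon + 1       # stored with January = 0
day_of_year    <- x$yday + 1      # stored with 1st January = 0

calendar_year                # 2005
calendar_month               # 10
month.name[calendar_month]   # "October"
day_of_year                  # 295
x$wday                       # 6, a Saturday (Sunday = 0)
```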

    ↪ Summary

  • A variable in R is created when a value is assigned to it. Either the <- operator or the = operator assigns a value to a variable.
  • The data type of a variable can be checked using the class() function.
  • Date values are entered as character strings and then translated into date variables using the as.Date() function.
  • R stores dates as the number of days since January 1, 1970.
  • The strptime() function takes a character vector of dates and times and converts it into date/time values.
  • Times in R are represented by two POSIX date/time classes, POSIXct and POSIXlt.
  • The POSIXct class stores date/time values as the number of seconds since January 1, 1970, while the POSIXlt class stores them as separate human-readable components.