Data Structure: Factor

↪ Creating a Factor

The factor() function encodes a vector as a factor. A factor is a special kind of vector used to represent categorical data, either nominal (unordered) or ordinal (ordered).

Creating a gender factor.

      gender <- factor(c("MALE", "FEMALE", "MALE"))
      gender

      ---Output---
      [1] MALE FEMALE MALE
      Levels: FEMALE MALE

The factor() call above creates a factor with two levels, "FEMALE" and "MALE".
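
Internally, a factor stores each value as an integer code that points into its levels. A minimal sketch, reusing the gender example, of how that representation can be inspected:

```r
gender <- factor(c("MALE", "FEMALE", "MALE"))

levels(gender)      # the distinct categories, sorted by default
nlevels(gender)     # number of levels
as.integer(gender)  # integer codes indexing into levels(gender)
```

Here "FEMALE" sorts before "MALE", so "MALE" gets code 2 and "FEMALE" gets code 1.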

Creating a blood type factor by specifying the levels.

      blood <- factor(c("O", "AB", "A"),
                      levels = c("A", "B", "AB", "O"))
      blood

      ---Output---
      [1] O AB A
      Levels: A B AB O
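
Specifying the levels explicitly keeps a level such as "B" even though no value uses it. A short sketch of how this shows up when counting, and how an unused level could be dropped if it is not wanted:

```r
blood <- factor(c("O", "AB", "A"),
                levels = c("A", "B", "AB", "O"))

table(blood)       # counts per level; "B" is listed with a count of 0
droplevels(blood)  # same values, with the unused level "B" removed
```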

Creating an ordered factor.

      symptoms <- factor(c("SEVERE", "MILD", "MODERATE"),
                         levels = c("MILD", "MODERATE", "SEVERE"),
                         ordered = TRUE)
      symptoms

      ---Output---
      [1] SEVERE MILD MODERATE
      Levels: MILD < MODERATE < SEVERE

↪ Adding an element to the factor

A factor can be modified by assigning a new value to one of its positions. However, the new value must be one of the factor's predefined levels.

      symptoms[4] <- "MILD"
      symptoms

      ---Output---
      [1] SEVERE MILD MODERATE MILD
      Levels: MILD < MODERATE < SEVERE

The following line triggers a warning rather than an error, and NA is stored in place of the value.

      symptoms[5] <- "CRITICAL"

      ---Output---
      Warning message:
      In `[<-.factor`(`*tmp*`, 5, value = "CRITICAL") :
        invalid factor level, NA generated

So the "CRITICAL" level must be added before this value can be assigned.
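
One way to avoid the NA is to check a value against levels() before assigning it. A minimal sketch; the variable name new_value is only illustrative and not part of the original example:

```r
symptoms <- factor(c("SEVERE", "MILD", "MODERATE"),
                   levels = c("MILD", "MODERATE", "SEVERE"),
                   ordered = TRUE)

new_value <- "CRITICAL"  # illustrative name, chosen for this sketch

if (new_value %in% levels(symptoms)) {
  # Valid level: append it as the next element
  symptoms[length(symptoms) + 1] <- new_value
} else {
  # Unknown level: leave the factor untouched and report it
  message("'", new_value, "' is not a level of symptoms; value not added")
}
```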

Adding a new level to the existing factor.

      levels(symptoms)
      levels(symptoms) <- c(levels(symptoms), "CRITICAL")
      symptoms[5] <- "CRITICAL"
      symptoms

      ---Output---
      [1] SEVERE MILD MODERATE MILD CRITICAL
      Levels: MILD < MODERATE < SEVERE < CRITICAL

Check for symptoms greater than "MODERATE".

      symptoms > "MODERATE"

      ---Output---
      [1] TRUE FALSE FALSE FALSE TRUE
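
Comparisons like this are only meaningful for ordered factors; on an unordered factor, R returns NA with a warning. As a sketch, the same ordering can also be used to filter, find the maximum, and sort. The factor is rebuilt here so the example runs on its own:

```r
symptoms <- factor(c("SEVERE", "MILD", "MODERATE", "MILD", "CRITICAL"),
                   levels = c("MILD", "MODERATE", "SEVERE", "CRITICAL"),
                   ordered = TRUE)

symptoms[symptoms > "MODERATE"]  # keeps only the SEVERE and CRITICAL entries
max(symptoms)                    # highest level present: CRITICAL
sort(symptoms)                   # values arranged from MILD up to CRITICAL
```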