Data Structure: Factor
Creating a Factor
The factor() function encodes a vector as a factor. A factor is a special
case of a vector that is used to represent nominal, categorical, or ordinal
data.
Creating a gender factor.
gender <- factor(c("MALE", "FEMALE", "MALE"))
gender
---Output---
[1] MALE FEMALE MALE
Levels: FEMALE MALE
The factor() call above creates a factor with the levels "FEMALE" and "MALE". By default, the levels are the sorted unique values of the input.
Creating a blood type factor by specifying the levels.
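To make the relationship between values and levels concrete, here is a minimal sketch that inspects the same gender factor. It uses only base R functions (levels(), nlevels(), as.integer()); the variable name is taken from the example above.

```r
# A factor stores integer codes that index into its levels.
gender <- factor(c("MALE", "FEMALE", "MALE"))

levels(gender)      # the distinct categories: "FEMALE" "MALE"
nlevels(gender)     # number of levels: 2
as.integer(gender)  # underlying codes: 2 1 2 ("MALE" is level 2)
is.factor(gender)   # TRUE
```

The integer codes are why factors are compact to store and why the order of the levels matters for later operations.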
blood <- factor(c("O", "AB", "A"),
                levels = c("A", "B", "AB", "O"))
blood
---Output---
[1] O AB A
Levels: A B AB O
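One practical effect of declaring levels up front is that levels with no observations still appear in summaries. The sketch below assumes the same blood factor and uses base R's table() to count each level.

```r
# Levels that never occur in the data are still counted (as zero).
blood <- factor(c("O", "AB", "A"),
                levels = c("A", "B", "AB", "O"))

counts <- table(blood)
counts  # "B" is listed with a count of 0
```

This is useful when the full set of categories is known in advance but the sample does not contain all of them.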
Creating an ordered factor.
symptoms <- factor(c("SEVERE", "MILD", "MODERATE"),
                   levels = c("MILD", "MODERATE", "SEVERE"),
                   ordered = TRUE)
symptoms
---Output---
[1] SEVERE MILD MODERATE
Levels: MILD < MODERATE < SEVERE
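Because the levels of an ordered factor have a defined rank, order-based functions work on it. The following is a small sketch using base R's is.ordered(), sort(), and max() on the same symptoms factor.

```r
# Ordered factors support sorting and min/max by level rank.
symptoms <- factor(c("SEVERE", "MILD", "MODERATE"),
                   levels = c("MILD", "MODERATE", "SEVERE"),
                   ordered = TRUE)

is.ordered(symptoms)  # TRUE
sort(symptoms)        # MILD MODERATE SEVERE
max(symptoms)         # SEVERE
```

Sorting follows the order given in the levels argument, not alphabetical order.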
Adding an element to the factor
A factor can be modified by assigning a new value to one of its positions. However, the
value must be one of the factor's predefined levels.
symptoms[4] <- "MILD"
symptoms
---Output---
[1] SEVERE MILD MODERATE MILD
Levels: MILD < MODERATE < SEVERE
The following line does not stop with an error. It produces a warning, and NA is stored in place of the value.
symptoms[5] <- "CRITICAL"
Warning message:
In `[<-.factor`(`*tmp*`, 5, value = "CRITICAL") :
  invalid factor level, NA generated
So, the "CRITICAL" level should be added before adding this value.
Adding a new level to the existing factor.
levels(symptoms)
levels(symptoms) <- c(levels(symptoms), "CRITICAL")
symptoms[5] <- "CRITICAL"
symptoms
---Output---
[1] SEVERE MILD MODERATE MILD CRITICAL
Levels: MILD < MODERATE < SEVERE < CRITICAL
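The two steps above (extend the levels, then assign) can be combined into a guarded pattern. This is a sketch only: new_value is an illustrative variable name, and the check with %in% is one possible way to avoid the NA, not the only one.

```r
# Add a level only if it is missing, then append the value.
symptoms <- factor(c("SEVERE", "MILD", "MODERATE"),
                   levels = c("MILD", "MODERATE", "SEVERE"),
                   ordered = TRUE)

new_value <- "CRITICAL"  # illustrative value to append

if (!(new_value %in% levels(symptoms))) {
  # Appending puts the new level at the top of the ordering.
  levels(symptoms) <- c(levels(symptoms), new_value)
}
symptoms[length(symptoms) + 1] <- new_value
symptoms
```

Note that the position of the new level in the levels vector determines its rank in an ordered factor, so appending it makes "CRITICAL" the highest level.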
Check for symptoms greater than "MODERATE".
symptoms > "MODERATE"
---Output---
[1] TRUE FALSE FALSE FALSE TRUE
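The logical vector returned by such a comparison can be used to select elements. The sketch below rebuilds the five-element symptoms factor so that it runs on its own, then filters it with base R indexing and which().

```r
# Use a comparison on an ordered factor to filter its elements.
symptoms <- factor(c("SEVERE", "MILD", "MODERATE", "MILD", "CRITICAL"),
                   levels = c("MILD", "MODERATE", "SEVERE", "CRITICAL"),
                   ordered = TRUE)

above_moderate <- symptoms[symptoms > "MODERATE"]
as.character(above_moderate)   # "SEVERE" "CRITICAL"
which(symptoms > "MODERATE")   # positions 1 and 5
```

Comparisons such as > are only meaningful for ordered factors. On an unordered factor, R returns NA with a warning instead.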