There is a switch statement but I can never seem to get it to work the way I think it should. Since you have not provided an example I will make one using a factor variable:
I later learned that there really are two different switch functions. It's not a generic function, but you should think about it as either switch.numeric or switch.character. If your first argument is an R 'factor', you get switch.numeric behavior, which is likely to cause problems, since most people see factors displayed as character and make the incorrect assumption that all functions will process them as such.
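To make the factor trap concrete, here is a minimal sketch. The factor `f` and the apple/banana alternatives are made up for illustration; the level order is deliberately non-alphabetical so the two behaviours give different answers.

```r
# Hypothetical factor whose level order differs from the order of the names below.
f <- factor("a", levels = c("b", "a"))  # "a" is stored as integer code 2

# Passing the factor directly: switch() treats it as an integer (R warns about this),
# so it picks the 2nd alternative and ignores the names.
by_code <- suppressWarnings(switch(f, a = "apple", b = "banana"))

# Converting to character first gives the name-based matching most people expect.
by_name <- switch(as.character(f), a = "apple", b = "banana")

print(c(by_code = by_code, by_name = by_name))
```

Wrapping the first argument in as.character() is usually the safer habit whenever the value might be a factor.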
The one downside of this is that you have to keep writing the category name (animal, etc.) for each item. It is syntactically more convenient to be able to define our categories as below (see the very similar question How do add a column in a data frame in R).
Have a look at the cases function from the memisc package. It implements case-functionality with two different ways to use it.
From the examples in the package:
z1 <- cases(
  "Condition 1" = x < 0,
  "Condition 2" = y < 0,  # only applies if x >= 0
  "Condition 3" = TRUE
)
I don't like any of these; they are not clear to the reader or the potential user. I just use an anonymous function. The syntax is not as slick as a case statement, but the evaluation is similar to a case statement and not that painful. This also assumes you're evaluating it in the scope where your variables are defined.
result <- (function() {
  # x and y are assumed to be single values, hence || and &&
  if (x == 10 || y < 5) return('foo')
  if (x == 11 && y == 5) return('bar')
})()
All of those () are necessary to enclose and evaluate the anonymous function. If no condition matches, result is NULL.
A case statement actually might not be the right approach here. If this is a factor, which it likely is, just set the levels of the factor appropriately.
Say you have a factor with the letters A to E, like this.
> a <- factor(rep(LETTERS[1:5],2))
> a
[1] A B C D E A B C D E
Levels: A B C D E
To join levels B and C and name it BC, just change the names of those levels to BC.
> levels(a) <- c("A","BC","BC","D","E")
> a
[1] A BC BC D E A BC BC D E
Levels: A BC D E
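If you would rather not depend on the position of each level, base R also accepts a named list on the right-hand side of levels<-. A small sketch using the same factor as above:

```r
a <- factor(rep(LETTERS[1:5], 2))

# Each list name is a new level; its value lists the old levels merged into it.
levels(a) <- list(A = "A", BC = c("B", "C"), D = "D", E = "E")

print(a)
print(levels(a))
```

One thing to watch with this form: any old level not mentioned in the list becomes NA, so every level you want to keep has to appear somewhere.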
Mixing plyr::mutate and dplyr::case_when works for me and is readable.
library(magrittr)  # provides %>%; plyr and dplyr are called via ::
iris %>%
  plyr::mutate(coolness =
    dplyr::case_when(Species == "setosa"     ~ "not cool",
                     Species == "versicolor" ~ "not cool",
                     Species == "virginica"  ~ "super awesome",
                     TRUE                    ~ "undetermined"
    )) -> testIris
head(testIris)
levels(testIris$coolness) ## NULL
testIris$coolness <- as.factor(testIris$coolness)
levels(testIris$coolness) ## ok now
testIris[97:103,4:6]
Bonus points if the column can come out of mutate as a factor instead of character! The last line of the case_when statement, which catches all unmatched rows, is very important.
    Petal.Width    Species      coolness
97          1.3 versicolor      not cool
98          1.3 versicolor      not cool
99          1.1 versicolor      not cool
100         1.3 versicolor      not cool
101         2.5  virginica super awesome
102         1.9  virginica super awesome
103         2.1  virginica super awesome
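On the "bonus points" wish: one way to get a factor straight out of mutate is to wrap the case_when call in factor(). This is only a sketch, assuming dplyr is installed and using dplyr::mutate rather than the plyr version; the name testIris2 is made up so it does not clobber the data frame above.

```r
library(dplyr)

testIris2 <- iris %>%
  mutate(coolness = factor(case_when(
    Species %in% c("setosa", "versicolor") ~ "not cool",
    Species == "virginica"                 ~ "super awesome",
    TRUE                                   ~ "undetermined"  # catch-all for unmatched rows
  )))

levels(testIris2$coolness)  # only levels that actually occur in the data
```

If you want "undetermined" to exist as a level even when no row hits it, pass an explicit levels = argument to factor().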
In the cases you are referring to, I use switch(). It looks like a control statement, but it is actually a function. The expression is evaluated and, based on its value, the corresponding item in the list is returned.
switch works in two distinct ways depending on whether the first argument evaluates to a character string or a number.
What follows is a simple string example which solves your problem of collapsing old categories into new ones.
For the character-string form, have a single unnamed argument as the default after the named values.
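A minimal sketch of the character-string form with a default. The function name collapse_category and the animal categories are hypothetical, chosen only to illustrate the pattern:

```r
# Map an old category (a single character string) to a new, coarser one.
collapse_category <- function(x) {
  switch(x,
    cat     = ,          # empty argument: falls through to the next value
    dog     = "mammal",
    sparrow = "bird",
    "other"              # single unnamed argument acts as the default
  )
}

old <- c("cat", "dog", "sparrow", "trout")
new <- vapply(old, collapse_category, character(1), USE.NAMES = FALSE)
print(new)
```

Because switch only handles one value at a time, vapply (or sapply) is needed to apply it over a whole column.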