Getting and removing the first character of a string

小开

Best answer

See ?substring.

x <- 'hello stackoverflow'
substring(x, 1, 1)
## [1] "h"
substring(x, 2)
## [1] "ello stackoverflow"

The idea of having a pop method that both returns a value and has the side effect of updating the data stored in x is very much a concept from object-oriented programming. So rather than defining a pop function to operate on character vectors, we can make a reference class with a pop method.

PopStringFactory <- setRefClass(
  "PopString",
  fields = list(
    x = "character"
  ),
  methods = list(
    initialize = function(x)
    {
      x <<- x
    },
    pop = function(n = 1)
    {
      if(nchar(x) == 0)
      {
        warning("Nothing to pop.")
        return("")
      }
      first <- substring(x, 1, n)
      x <<- substring(x, n + 1)
      first
    }
  )
)


x <- PopStringFactory$new("hello stackoverflow")
x
## Reference class object of class "PopString"
## Field "x":
## [1] "hello stackoverflow"
replicate(nchar(x$x), x$pop())
## [1] "h" "e" "l" "l" "o" " " "s" "t" "a" "c" "k" "o" "v" "e" "r" "f" "l" "o" "w"
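
If the side effect is not essential, a plain function that returns both pieces may be enough. This is only a minimal sketch: pop_first is a made-up name, not something from base R or from the answer above, and the caller has to reassign the remainder explicitly.

```r
# Hypothetical helper: returns the first n characters and the remainder
# instead of modifying its argument in place.
pop_first <- function(s, n = 1) {
  list(first = substring(s, 1, n), rest = substring(s, n + 1))
}

res <- pop_first("hello stackoverflow")
res$first
## [1] "h"
res$rest
## [1] "ello stackoverflow"
```

The trade-off is that nothing is updated behind the scenes, which is closer to how most R functions behave.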

小开

Use this function from the stringi package:

> library(stringi)
> x <- 'hello stackoverflow'
> stri_sub(x,2)
[1] "ello stackoverflow"

小开

Removing the first character:

x <- 'hello stackoverflow'
substring(x, 2, nchar(x))

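The idea is to select all characters from position 2 up to the number of characters in x. This matters when the words or phrases have different numbers of characters.

Selecting the first letter is trivial, as in the previous answers: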
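
To see why the per-element nchar matters, here is a small sketch with strings of different lengths (the vector words is made up for illustration). Both substring and nchar are vectorized, so each element gets its own end position:

```r
# Strings of different lengths; nchar() gives each element's own length.
words <- c("apple", "fig", "kiwi")
nchar(words)
## [1] 5 3 4

# Drop the first character of every element.
rest <- substring(words, 2, nchar(words))
rest
## [1] "pple" "ig"   "iwi"
```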

substring(x,1,1)

小开

substring is definitely best, but here's one strsplit alternative, since I haven't seen one yet.

> x <- 'hello stackoverflow'
> strsplit(x, '')[[1]][1]
## [1] "h"

or equivalently

> unlist(strsplit(x, ''))[1]
## [1] "h"

You can then paste the remaining characters back together.

> paste0(strsplit(x, '')[[1]][-1], collapse = '')
## [1] "ello stackoverflow"

小开

There is also str_sub from the stringr package:

library(stringr)
x <- 'hello stackoverflow'
str_sub(x, 2) # or
str_sub(x, 2, str_length(x))
## [1] "ello stackoverflow"

小开

Another alternative is to capture sub-expressions with the regular expression functions regmatches and regexec.

# the original example
x <- 'hello stackoverflow'


# grab the substrings
myStrings <- regmatches(x, regexec('(^.)(.*)', x))

This returns the entire string, the first character, and the "popped" result in a list of length 1.

myStrings
[[1]]
[1] "hello stackoverflow" "h"                   "ello stackoverflow"

This is equivalent to list(c(x, substr(x, 1, 1), substr(x, 2, nchar(x)))). That is, it contains a superset of the desired elements: both pieces plus the full string.

Adding sapply will allow this method to work for a character vector of length > 1.

# a slightly more interesting example
xx <- c('hello stackoverflow', 'right back', 'at yah')


# grab the substrings
myStrings <- regmatches(xx, regexec('(^.)(.*)', xx))

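This returns a list in which the full matched string is the first element and the sub-expressions captured by () are the following elements. So in the regular expression '(^.)(.*)', (^.) matches the first character and (.*) matches the remaining characters.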

myStrings
[[1]]
[1] "hello stackoverflow" "h"                   "ello stackoverflow"


[[2]]
[1] "right back" "r"          "ight back"


[[3]]
[1] "at yah" "a"      "t yah"

Now we can use the trusty sapply + [ method to pull out the desired substrings.

myFirstStrings <- sapply(myStrings, "[", 2)
myFirstStrings
[1] "h" "r" "a"
mySecondStrings <- sapply(myStrings, "[", 3)
mySecondStrings
[1] "ello stackoverflow" "ight back"          "t yah"
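
One thing to watch with this approach is the empty string: the pattern needs at least one character, so there is no match and the extracted piece comes back as NA rather than "". A small sketch (the input vector is made up, and vapply is swapped in for sapply only to pin the result type):

```r
# An empty string does not match '(^.)(.*)', so regmatches() yields
# character(0) for it and indexing element 2 gives NA.
xx <- c("hello", "")
m <- regmatches(xx, regexec("(^.)(.*)", xx))
first <- vapply(m, "[", character(1), 2)
first
## [1] "h" NA

# substring() returns "" for the same input instead.
substring(xx, 1, 1)
## [1] "h" ""
```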

小开

Another alternative, using the sub function.

sub('(^.).*', '\\1', 'hello stackoverflow')
[1] "h"


sub('(^.)(.*)', '\\2', 'hello stackoverflow')
[1] "ello stackoverflow"
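
If only the remainder is needed, a shorter pattern without capture groups should do the same job: replace the first character with nothing. A sketch on a made-up vector, relying on sub being vectorized over its input:

```r
# '^.' matches exactly one character at the start of each string;
# replacing it with "" drops that character.
xx <- c("hello stackoverflow", "right back", "at yah")
rest <- sub("^.", "", xx)
rest
## [1] "ello stackoverflow" "ight back"          "t yah"
```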