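Getting and removing the first character of a string

I would like to do some 2-dimensional walks using strings of characters, assigning a different value to each character. I was planning to "pop" the first character of a string, use it, and then repeat for the rest of the string.

How can I achieve something like this?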

x <- 'hello stackoverflow'

I would like to be able to do something like this:

a <- x.pop[1]


print(a)


'h'
print(x)


'ello stackoverflow'

See ?substring.

x <- 'hello stackoverflow'
substring(x, 1, 1)
## [1] "h"
substring(x, 2)
## [1] "ello stackoverflow"
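Because R functions do not modify their arguments, a "pop" built on substring has to return both pieces and leave the reassignment to the caller. A minimal sketch, assuming a hypothetical helper named pop_first (it is not part of base R):

```r
# pop_first is a hypothetical helper, not a base R function.
# It returns the first character and the remainder; the input is unchanged.
pop_first <- function(x) {
  list(first = substring(x, 1, 1), rest = substring(x, 2))
}

x <- 'hello stackoverflow'
consumed <- character(0)
while (nchar(x) > 0) {
  parts <- pop_first(x)
  consumed <- c(consumed, parts$first)  # "use" the popped character
  x <- parts$rest                       # reassign the remainder by hand
}
```

After the loop, consumed holds the characters in order and x is the empty string.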

The idea of having a pop method that both returns a value and has the side effect of updating the data stored in x is very much a concept from object-oriented programming. So rather than defining a pop function to operate on character vectors, we can make a reference class with a pop method.

PopStringFactory <- setRefClass(
  "PopString",
  fields = list(
    x = "character"
  ),
  methods = list(
    initialize = function(x)
    {
      x <<- x
    },
    pop = function(n = 1)
    {
      if(nchar(x) == 0)
      {
        warning("Nothing to pop.")
        return("")
      }
      first <- substring(x, 1, n)
      x <<- substring(x, n + 1)
      first
    }
  )
)


x <- PopStringFactory$new("hello stackoverflow")
x
## Reference class object of class "PopString"
## Field "x":
## [1] "hello stackoverflow"
replicate(nchar(x$x), x$pop())
## [1] "h" "e" "l" "l" "o" " " "s" "t" "a" "c" "k" "o" "v" "e" "r" "f" "l" "o" "w"
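For comparison, the same stateful behaviour can probably be had without a reference class by keeping the string inside a closure. This is a different technique from the one above, sketched under the assumption that a plain function with private state is enough; make_popper is a hypothetical name.

```r
# make_popper is a hypothetical helper: it returns a function that
# remembers the string and shortens it on every call.
make_popper <- function(x) {
  function(n = 1) {
    first <- substring(x, 1, n)
    x <<- substring(x, n + 1)  # update the copy held in the enclosing environment
    first
  }
}

pop <- make_popper('hello stackoverflow')
a <- pop()     # first character
b <- pop(4)    # next four characters
rest <- pop(100)  # asking for more than is left returns what remains
empty <- pop()    # nothing left, so this is an empty string
```

Unlike the reference class, this sketch does not warn when the string is exhausted; it simply returns "".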

Use the stri_sub function from the stringi package:

> library(stringi)
> x <- 'hello stackoverflow'
> stri_sub(x,2)
[1] "ello stackoverflow"

Removing the first character:

x <- 'hello stackoverflow'
substring(x, 2, nchar(x))

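The idea is to select all characters from position 2 up to the number of characters in x. This matters when the words or phrases have unequal numbers of characters.

Selecting the first letter is trivial, as in the previous answers: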

substring(x,1,1)
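To illustrate the point about unequal lengths, here is a small sketch on a character vector (the vector words is made up for this example). Note that substring has a default of last = 1000000L, so omitting nchar(x) should give the same result for any string shorter than a million characters.

```r
# Example data, invented for illustration
words <- c('hello', 'hi', 'a')

first <- substring(words, 1, 1)
rest  <- substring(words, 2, nchar(words))  # per-element end position

# Relying on the default `last` argument instead of nchar()
rest_default <- substring(words, 2)
```

A one-character string leaves an empty string as its remainder.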

substring is definitely best, but here's one strsplit alternative, since I haven't seen one yet.

> x <- 'hello stackoverflow'
> strsplit(x, '')[[1]][1]
## [1] "h"

or equivalently

> unlist(strsplit(x, ''))[1]
## [1] "h"

You can paste the rest of the string back together.

> paste0(strsplit(x, '')[[1]][-1], collapse = '')
## [1] "ello stackoverflow"
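For the 2-dimensional walk in the question, splitting once and working on the whole character vector may be simpler than popping repeatedly. A rough sketch, assuming an invented rule in which vowels step along x and every other character steps along y (the question does not say which values it assigns):

```r
x <- 'hello stackoverflow'
chars <- strsplit(x, '')[[1]]

# Hypothetical step rule: vowels move one unit in x, all else one unit in y
dx <- ifelse(chars %in% c('a', 'e', 'i', 'o', 'u'), 1, 0)
dy <- 1 - dx

# Cumulative sums give the position after each character
path <- cbind(x = cumsum(dx), y = cumsum(dy))
```

Each row of path is the position reached after consuming one more character, so plot(path, type = 'l') would draw the walk.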

There is also str_sub from the stringr package:

library(stringr)
x <- 'hello stackoverflow'
str_sub(x, 2) # or
str_sub(x, 2, str_length(x))
[1] "ello stackoverflow"

Another alternative is to capture sub-expressions with the regular expression functions regmatches and regexec.

# the original example
x <- 'hello stackoverflow'


# grab the substrings
myStrings <- regmatches(x, regexec('(^.)(.*)', x))

This returns the entire string, the first character, and the "popped" result in a list of length 1.

myStrings
[[1]]
[1] "hello stackoverflow" "h"                   "ello stackoverflow"

which is equivalent to list(c(x, substr(x, 1, 1), substr(x, 2, nchar(x)))). That is, it contains the super set of the desired elements as well as the full string.


Adding sapply will allow this method to work for a character vector of length > 1.

# a slightly more interesting example
xx <- c('hello stackoverflow', 'right back', 'at yah')


# grab the substrings
myStrings <- regmatches(xx, regexec('(^.)(.*)', xx))

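This returns a list in which each element holds the full matched string first, followed by the sub-expressions captured by (). So in the regular expression '(^.)(.*)', (^.) matches the first character and (.*) matches the remaining characters.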

myStrings
[[1]]
[1] "hello stackoverflow" "h"                   "ello stackoverflow"


[[2]]
[1] "right back" "r"          "ight back"


[[3]]
[1] "at yah" "a"      "t yah"

Now we can use the trusty sapply + [ method to pull out the desired substrings.

myFirstStrings <- sapply(myStrings, "[", 2)
myFirstStrings
[1] "h" "r" "a"
mySecondStrings <- sapply(myStrings, "[", 3)
mySecondStrings
[1] "ello stackoverflow" "ight back"          "t yah"

Another alternative, using the sub function.

sub('(^.).*', '\\1', 'hello stackoverflow')
[1] "h"


sub('(^.)(.*)', '\\2', 'hello stackoverflow')
[1] "ello stackoverflow"