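How to split a string in Haskell?

Is there a standard way to split a string in Haskell?

lines and words work great for splitting on a space or newline, but surely there is a standard way to split on a comma?

I couldn't find it on Hoogle.

To be specific, I'm looking for something where split "," "my,comma,separated,list" returns ["my","comma","separated","list"].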


Try this one:

import Data.List (unfoldr)

separateBy :: Eq a => a -> [a] -> [[a]]
separateBy chr = unfoldr sep where
  sep [] = Nothing
  sep l  = Just . fmap (drop 1) . break (== chr) $ l

It only works for a single-element delimiter, but it should be easy to extend.
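One possible extension to a multi-element delimiter is sketched below. The name separateByList and the helper go are hypothetical, not part of any library, and the sketch assumes the delimiter is non-empty (an empty delimiter would produce an endless list of empty chunks). Like separateBy above, it drops the empty field after a trailing delimiter.

```haskell
import Data.List (isPrefixOf, unfoldr)

-- Hypothetical helper: split a list on a whole sub-list delimiter.
-- Assumes the delimiter is non-empty.
separateByList :: Eq a => [a] -> [a] -> [[a]]
separateByList delim = unfoldr sep
  where
    n = length delim

    -- Stop when nothing is left; otherwise emit one chunk and the remainder.
    sep [] = Nothing
    sep l  = Just (go l)

    -- go returns (chunk before the next delimiter, input after that delimiter).
    go [] = ([], [])
    go s@(x : xs)
      | delim `isPrefixOf` s = ([], drop n s)
      | otherwise            = let (chunk, rest) = go xs in (x : chunk, rest)
```

For instance, separateByList " and " "this and is and a and test" should give ["this","is","a","test"].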

There is a package for this called split.

cabal install split

Use it like this:

ghci> import Data.List.Split
ghci> splitOn "," "my,comma,separated,list"
["my","comma","separated","list"]

It comes with a lot of other functions for splitting on matching delimiters or on several delimiters at once.

In the module Text.Regex (part of the Haskell Platform), there is a function:

splitRegex :: Regex -> String -> [String]

It splits a string based on a regular expression.

Remember that you can look up the definition of Prelude functions!

http://www.haskell.org/onlinereport/standard-prelude.html

Looking there, the definition of words is,

words   :: String -> [String]
words s =  case dropWhile Char.isSpace s of
             "" -> []
             s' -> w : words s''
                   where (w, s'') = break Char.isSpace s'

So, change it into a function that takes a predicate:

wordsWhen     :: (Char -> Bool) -> String -> [String]
wordsWhen p s =  case dropWhile p s of
                   "" -> []
                   s' -> w : wordsWhen p s''
                         where (w, s'') = break p s'

Then call it with whatever predicate you want!

main = print $ wordsWhen (==',') "break,this,string,at,commas"

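I don't know how to add a comment onto Steve's answer, but I'd like to recommend the GHC libraries documentation, and in there specifically the Sublist functions in Data.List (http://www.haskell.org/ghc/docs/update/html/library/base/Data-List.html#g:10).

It is much better as a reference than just reading the plain Haskell report.

Generically, a fold with a rule on when to create a new sublist to feed should solve it too.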
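As a rough illustration of that fold idea, here is a minimal sketch. The name splitFold is made up for this example; the rule is simply "start a new sublist whenever the delimiter is seen, otherwise push the element onto the current sublist". Unlike some of the other answers, it keeps empty fields, including the one after a trailing delimiter.

```haskell
-- Hypothetical name; a right fold that opens a new sublist at each delimiter.
splitFold :: Eq a => a -> [a] -> [[a]]
splitFold d = foldr step [[]]
  where
    step x acc@(cur : rest)
      | x == d    = [] : acc           -- delimiter: start a fresh sublist
      | otherwise = (x : cur) : rest   -- otherwise: extend the current one
    step _ []     = [[]]               -- unreachable: the accumulator is never empty
```

With this, splitFold ',' "my,comma,separated,list" should give ["my","comma","separated","list"], and an empty input gives [""].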

I started learning Haskell yesterday, so correct me if I'm wrong, but:

split :: Eq a => a -> [a] -> [[a]]
split x y = func x y [[]]
    where
        func x [] z = reverse $ map (reverse) z
        func x (y:ys) (z:zs) = if y==x then
                func x ys ([]:(z:zs))
            else
                func x ys ((y:z):zs)

gives:

*Main> split ' ' "this is a test"
["this","is","a","test"]

or maybe you wanted

*Main> splitWithStr  " and " "this and is and a and test"
["this","is","a","test"]

which would be:

splitWithStr :: Eq a => [a] -> [a] -> [[a]]
splitWithStr x y = func x y [[]]
    where
        func x [] z = reverse $ map (reverse) z
        func x (y:ys) (z:zs) = if (take (length x) (y:ys)) == x then
                func x (drop (length x) (y:ys)) ([]:(z:zs))
            else
                func x ys ((y:z):zs)

If you use Data.Text, there is splitOn:

http://hackage.haskell.org/packages/archive/text/0.11.2.0/doc/html/Data-Text.html#v:splitOn

This is built into the Haskell Platform.

So for instance:

import qualified Data.Text as T
main = print $ T.splitOn (T.pack " ") (T.pack "this is a test")

or:

{-# LANGUAGE OverloadedStrings #-}


import qualified Data.Text as T
main = print $ T.splitOn " " "this is a test"

Use Data.List.Split, which comes from the split package:

[me@localhost]$ ghci
Prelude> import Data.List.Split
Prelude Data.List.Split> let l = splitOn "," "1,2,3,4"
Prelude Data.List.Split> :t l
l :: [[Char]]
Prelude Data.List.Split> l
["1","2","3","4"]
Prelude Data.List.Split> let { convert :: [String] -> [Integer]; convert = map read }
Prelude Data.List.Split> let l2 = convert l
Prelude Data.List.Split> :t l2
l2 :: [Integer]
Prelude Data.List.Split> l2
[1,2,3,4]

split :: Eq a => a -> [a] -> [[a]]
split d [] = []
split d s = x : split d (drop 1 y) where (x,y) = span (/= d) s

E.g.

split ';' "a;bb;ccc;;d"
> ["a","bb","ccc","","d"]

A single trailing delimiter will be dropped:

split ';' "a;bb;ccc;;d;"
> ["a","bb","ccc","","d"]

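In addition to the efficient and pre-built functions given in the other answers, I'll add my own, which are simply part of a repertory of Haskell functions I wrote while learning the language in my own time: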

-- Correct but inefficient implementation
wordsBy :: String -> Char -> [String]
wordsBy s c = reverse (go s []) where
    go s' ws = case (dropWhile (\c' -> c' == c) s') of
        "" -> ws
        rem -> go ((dropWhile (\c' -> c' /= c) rem)) ((takeWhile (\c' -> c' /= c) rem) : ws)


-- Breaks up by predicate function to allow for more complex conditions (\c -> c == ',' || c == ';')
wordsByF :: String -> (Char -> Bool) -> [String]
wordsByF s f = reverse (go s []) where
    go s' ws = case ((dropWhile (\c' -> f c')) s') of
        "" -> ws
        rem -> go ((dropWhile (\c' -> (f c') == False)) rem) (((takeWhile (\c' -> (f c') == False)) rem) : ws)

The solutions are at least tail-recursive, so they won't incur a stack overflow.

Example in ghci:

>  import qualified Text.Regex as R
>  R.splitRegex (R.mkRegex "x") "2x3x777"
>  ["2","3","777"]

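Without importing anything, you can do a straight substitution of a space for the separator character, since the target separator for words is whitespace. Something like:

words [if c == ',' then ' ' else c|c <- "my,comma,separated,list"]

or

words $ let f ',' = ' '; f c = c in map f "my,comma,separated,list"

You can make this into a function with parameters. You can also drop the single character-to-match parameter and match many characters instead, as in: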

 [if elem c ";,.:-+@!$#?" then ' ' else c|c <-"my,comma;separated!list"]
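Wrapped up as a function with parameters, this might look like the sketch below. The name splitOnAny is hypothetical. Note two consequences of going through words: any whitespace already in the input also acts as a separator, and empty fields are discarded.

```haskell
-- Hypothetical name; replaces every separator character with a space,
-- then lets the Prelude's words do the splitting.
splitOnAny :: [Char] -> String -> [String]
splitOnAny seps s = words [if c `elem` seps then ' ' else c | c <- s]
```

For example, splitOnAny ";,.:-+@!$#?" "my,comma;separated!list" should give ["my","comma","separated","list"].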

I find this simpler to understand:

split :: Char -> String -> [String]
split c xs = case break (==c) xs of
  (ls, "") -> [ls]
  (ls, x:rs) -> ls : split c rs

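I am very late, but I would like to add this here for those interested, in case you are looking for a simple solution that does not rely on any bloated packages: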

split :: String -> String -> [String]
split _ "" = []
split delim str =
  split' "" str []
  where
    dl = length delim

    split' :: String -> String -> [String] -> [String]
    split' h t f
      | dl > length t = f ++ [h ++ t]
      | delim == take dl t = split' "" (drop dl t) (f ++ [h])
      | otherwise = split' (h ++ take 1 t) (drop 1 t) f

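There are so many answers, but I don't like any of them. I don't actually know Haskell, but I wrote a shorter and (I think) cleaner version in just 5 minutes: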

splitString :: Char -> [Char] -> [[Char]]
splitString _ [] = []
splitString sep str =
    let (left, right) = break (==sep) str
    in left : splitString sep (drop 1 right)