How to remove carriage returns and new lines in PostgreSQL?

All,

I am stuck again trying to get my data into the format I need. I have a text field that looks like this.

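"Deangelo 001 Deangelo

Local origin of name: Italian

From the American name Deangelo

Meaning: angel

He is a fountain of joy for all.

His good name is his most precious asset. Personality • When you are surrounded by a flock of eagles, it is hard to soar with them. Starts slowly, but a relationship with Deangelo builds over time. Travel & Leisure • A trip of a lifetime is in his future.

Career & Money • A gifted child, Deangelo will need to be constantly challenged.

Life's opportunities: joy and happiness await this blessed person.

Deangelo's lucky numbers: 12 • 38 • 18 • 34 • 29 • 16

"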

What would be the best way to remove the carriage returns and new lines in PostgreSQL? I've tried several things, but none of them want to behave.

select regexp_replace(field, E'\r\c', '  ', 'g') from mytable
WHERE id = 5520805582


SELECT regexp_replace(field, E'[^\(\)\&\/,;\*\:.\>\<[:space:]a-zA-Z0-9-]', ' ')
FROM mytable
WHERE field~ E'[^\(\)\&\/,;\*\:.\<\>[:space:]a-zA-Z0-9-]'
AND id = 5520805582;

Thanks in advance, Adam

select regexp_replace(field, E'[\\n\\r]+', ' ', 'g' )

Read the manual: http://www.postgresql.org/docs/current/static/functions-matching.html
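For anyone who wants to persist the cleaned value rather than just select it, a minimal sketch follows. It reuses the `mytable` and `field` names from the question; the `WHERE` filter is my own addition so that only rows actually containing a line break are touched. Run the preview first and check the output before updating.

```sql
-- Preview: show what the cleaned text would look like, only for rows
-- that contain at least one carriage return or line feed.
SELECT id,
       regexp_replace(field, E'[\\n\\r]+', ' ', 'g') AS cleaned
FROM mytable
WHERE field ~ E'[\\n\\r]';

-- Apply: collapse every run of CR/LF characters into a single space.
UPDATE mytable
SET field = regexp_replace(field, E'[\\n\\r]+', ' ', 'g')
WHERE field ~ E'[\\n\\r]';
```

Because the pattern uses `+`, a Windows-style CRLF pair is replaced by one space, not two.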

select regexp_replace(field, E'[\\n\\r\\u2028]+', ' ', 'g' )

I had the same problem in my Postgres database, but the newline in question wasn't the traditional ASCII CRLF; it was the Unicode line separator, character U+2028. The code snippet above will catch that Unicode variation as well.
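A quick way to check this without touching any table is to run the expression against a literal. The sample string below is made up for illustration, and the `\u2028` escape inside the `E'...'` literal assumes a UTF8 database.

```sql
-- The input contains a CRLF pair and a Unicode line separator (U+2028).
SELECT regexp_replace(
         E'line one\r\nline two\u2028line three',
         E'[\\n\\r\\u2028]+',
         ' ',
         'g'
       ) AS cleaned;
-- cleaned: line one line two line three
```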

Update... although I've only ever encountered the aforementioned characters "in the wild", to follow lmichelbacher's advice to translate even more unicode newline-like characters, use this:

select regexp_replace(field, E'[\\n\\r\\f\\u000B\\u0085\\u2028\\u2029]+', ' ', 'g' )

OP asked specifically about regexes, since it would appear there's concern for a number of other characters as well as newlines, but for those just wanting to strip out newlines, you don't even need a regex. You can simply do:

select replace(field,E'\n','');

I think this is SQL-standard behavior, so it should extend back to all but perhaps the very earliest versions of Postgres. The above tested fine for me in 9.4 and 9.2.
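One caveat: a single `replace` on `E'\n'` leaves any carriage returns in place, so text with Windows-style CRLF endings still needs a second pass. A small sketch on a made-up literal shows two regex-free ways to cover both characters; each column should come back true.

```sql
SELECT
  -- replace() on \n alone leaves the \r behind
  replace(E'one\r\ntwo', E'\n', '') = E'one\rtwo'                       AS cr_left_behind,
  -- nesting a second replace() removes both characters
  replace(replace(E'one\r\ntwo', E'\r', ''), E'\n', '') = 'onetwo'      AS nested_replace,
  -- translate() with an empty "to" string deletes every listed character
  translate(E'one\r\ntwo', E'\r\n', '') = 'onetwo'                      AS with_translate;
```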

If you need to remove line breaks from the beginning or end of the string, you can use this:

UPDATE mytable
SET field = regexp_replace(field, E'(^[\\n\\r]+)|([\\n\\r]+$)', '', 'g' );

Keep in mind that the caret ^ matches the beginning of the string and the dollar sign $ matches the end of the string.

Hope it helps someone.
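To see the anchored pattern in action, here is a small check on a made-up literal. It also shows `btrim`, which I believe gives the same result for this case without a regex; treat that equivalence as a suggestion to verify on your own data. Both columns should come back true, and the line break between the two words is kept.

```sql
SELECT
  -- regex version: strips leading and trailing CR/LF runs only
  regexp_replace(E'\r\n\nfirst\nsecond\r\n',
                 E'(^[\\n\\r]+)|([\\n\\r]+$)', '', 'g') = E'first\nsecond' AS regex_trim,
  -- btrim version: trims any of the listed characters from both ends
  btrim(E'\r\n\nfirst\nsecond\r\n', E'\r\n') = E'first\nsecond'            AS btrim_trim;
```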