How do I decode HTML entities in Swift?

I am pulling a JSON file from a site, and one of the strings received is:

The Weeknd &#8216;King Of The Fall&#8217; [Video Premiere] | @TheWeeknd | #SoPhi

How can I convert things like &#8216; into the correct characters?

I have made an Xcode Playground to demonstrate it:

import UIKit


var error: NSError?
let blogUrl: NSURL = NSURL.URLWithString("http://sophisticatedignorance.net/api/get_recent_summary/")
let jsonData = NSData(contentsOfURL: blogUrl)


let dataDictionary = NSJSONSerialization.JSONObjectWithData(jsonData, options: nil, error: &error) as NSDictionary


var a = dataDictionary["posts"] as NSArray


println(a[0]["title"])
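
For readers on current Swift: the playground above uses Swift 1 era APIs (NSURL.URLWithString, println) that no longer compile. Below is a minimal sketch of the same lookup with JSONDecoder. The Summary and Post types and the inline sample JSON are assumptions made for illustration, not the real API response. It also shows that JSON parsing alone leaves the entities untouched.

```swift
import Foundation

// Hypothetical model of the response: only the fields used here.
struct Summary: Decodable {
    struct Post: Decodable {
        let title: String
    }
    let posts: [Post]
}

// Inline sample standing in for the downloaded data (assumed shape).
let sampleJSON = Data("""
{"posts": [{"title": "The Weeknd &#8216;King Of The Fall&#8217; [Video Premiere]"}]}
""".utf8)

/// Returns the title of the first post, or nil if there are no posts.
func firstPostTitle(in json: Data) throws -> String? {
    let summary = try JSONDecoder().decode(Summary.self, from: json)
    return summary.posts.first?.title
}

if let title = try? firstPostTitle(in: sampleJSON) {
    print(title) // the title still contains &#8216; and &#8217;
}
```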

This answer was last revised for Swift 5.2 and the iOS 13.4 SDK.


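There is no straightforward way to do that, but you can use NSAttributedString magic to make the process as painless as possible (note that this method will strip all HTML tags as well).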

Remember to initialize NSAttributedString from the main thread only. It uses WebKit to parse the HTML underneath, hence the requirement.

// This is a[0]["title"] in your case
let htmlEncodedString = "The Weeknd <em>&#8216;King Of The Fall&#8217;</em>"


guard let data = htmlEncodedString.data(using: .utf8) else {
return
}


let options: [NSAttributedString.DocumentReadingOptionKey: Any] = [
.documentType: NSAttributedString.DocumentType.html,
.characterEncoding: String.Encoding.utf8.rawValue
]


guard let attributedString = try? NSAttributedString(data: data, options: options, documentAttributes: nil) else {
return
}


// The Weeknd ‘King Of The Fall’
let decodedString = attributedString.string

extension String {


init?(htmlEncodedString: String) {


guard let data = htmlEncodedString.data(using: .utf8) else {
return nil
}


let options: [NSAttributedString.DocumentReadingOptionKey: Any] = [
.documentType: NSAttributedString.DocumentType.html,
.characterEncoding: String.Encoding.utf8.rawValue
]


guard let attributedString = try? NSAttributedString(data: data, options: options, documentAttributes: nil) else {
return nil
}


self.init(attributedString.string)


}


}


let encodedString = "The Weeknd <em>&#8216;King Of The Fall&#8217;</em>"
let decodedString = String(htmlEncodedString: encodedString)
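
Regarding the main-thread requirement mentioned above, here is a minimal sketch of a helper that runs a closure on the main thread and hands back its result. The name onMainThread is hypothetical and not part of this answer or of any Apple API.

```swift
import Foundation

/// Runs `work` on the main thread and returns its result.
/// Hypothetical helper, shown only to illustrate the requirement.
func onMainThread<T>(_ work: () -> T) -> T {
    if Thread.isMainThread {
        // Already on the main thread: calling sync here would deadlock.
        return work()
    }
    // Blocks the calling thread until the main queue has run `work`.
    return DispatchQueue.main.sync(execute: work)
}
```

With the extension above this could be used as `let decoded = onMainThread { String(htmlEncodedString: encodedString) }`. Be aware that a synchronous hop can deadlock if the main thread is itself waiting on the calling thread; in UI code, DispatchQueue.main.async with a completion handler is usually the safer choice.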

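@akashivskyy's answer is great and demonstrates how to utilize NSAttributedString to decode HTML entities. One possible disadvantage (as he stated) is that all HTML markup is removed as well, so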

<strong> 4 &lt; 5 &amp; 3 &gt; 2</strong>

becomes

4 < 5 & 3 > 2

On OS X, CFXMLCreateStringByUnescapingEntities() does the job:

let encoded = "<strong> 4 &lt; 5 &amp; 3 &gt; 2 .</strong> Price: 12 &#x20ac;.  &#64; "
let decoded = CFXMLCreateStringByUnescapingEntities(nil, encoded, nil) as String
println(decoded)
// <strong> 4 < 5 & 3 > 2 .</strong> Price: 12 €.  @

but this is not available on iOS.

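Here is a pure Swift implementation. It decodes character entity references like &lt; using a dictionary, and all numeric character entities like &#64; or &#x20ac;. (Note that I did not list all 252 HTML entities explicitly.)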

Swift 4:

// Mapping from XML/HTML character entity reference to character
// From http://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references
private let characterEntities : [ Substring : Character ] = [
// XML predefined entities:
"&quot;"    : "\"",
"&amp;"     : "&",
"&apos;"    : "'",
"&lt;"      : "<",
"&gt;"      : ">",


// HTML character entity references:
"&nbsp;"    : "\u{00a0}",
// ...
"&diams;"   : "♦",
]


extension String {


/// Returns a new string made by replacing in the `String`
/// all HTML character entity references with the corresponding
/// character.
var stringByDecodingHTMLEntities : String {


// ===== Utility functions =====


// Convert the number in the string to the corresponding
// Unicode character, e.g.
//    decodeNumeric("64", 10)   --> "@"
//    decodeNumeric("20ac", 16) --> "€"
func decodeNumeric(_ string : Substring, base : Int) -> Character? {
guard let code = UInt32(string, radix: base),
let uniScalar = UnicodeScalar(code) else { return nil }
return Character(uniScalar)
}


// Decode the HTML character entity to the corresponding
// Unicode character, return `nil` for invalid input.
//     decode("&#64;")    --> "@"
//     decode("&#x20ac;") --> "€"
//     decode("&lt;")     --> "<"
//     decode("&foo;")    --> nil
func decode(_ entity : Substring) -> Character? {


if entity.hasPrefix("&#x") || entity.hasPrefix("&#X") {
return decodeNumeric(entity.dropFirst(3).dropLast(), base: 16)
} else if entity.hasPrefix("&#") {
return decodeNumeric(entity.dropFirst(2).dropLast(), base: 10)
} else {
return characterEntities[entity]
}
}


// ===== Method starts here =====


var result = ""
var position = startIndex


// Find the next '&' and copy the characters preceding it to `result`:
while let ampRange = self[position...].range(of: "&") {
result.append(contentsOf: self[position ..< ampRange.lowerBound])
position = ampRange.lowerBound


// Find the next ';' and copy everything from '&' to ';' into `entity`
guard let semiRange = self[position...].range(of: ";") else {
// No matching ';'.
break
}
let entity = self[position ..< semiRange.upperBound]
position = semiRange.upperBound


if let decoded = decode(entity) {
// Replace by decoded character:
result.append(decoded)
} else {
// Invalid entity, copy verbatim:
result.append(contentsOf: entity)
}
}
// Copy remaining characters to `result`:
result.append(contentsOf: self[position...])
return result
}
}

Example:

let encoded = "<strong> 4 &lt; 5 &amp; 3 &gt; 2 .</strong> Price: 12 &#x20ac;.  &#64; "
let decoded = encoded.stringByDecodingHTMLEntities
print(decoded)
// <strong> 4 < 5 & 3 > 2 .</strong> Price: 12 €.  @
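
To see why decodeNumeric returns an optional, the numeric branch can be tried in isolation. This is a sketch of the same idea as a free function (the name character(fromDigits:radix:) is made up here): the digits are parsed as a UInt32, and the failable Unicode.Scalar initializer rejects values that are not valid scalars.

```swift
/// Converts the digits of a numeric character reference into a Character.
/// Hypothetical standalone version of the decodeNumeric helper above.
func character(fromDigits digits: String, radix: Int) -> Character? {
    guard let code = UInt32(digits, radix: radix),
          let scalar = Unicode.Scalar(code) else { return nil }
    return Character(scalar)
}

let at = character(fromDigits: "64", radix: 10)           // "@"
let euro = character(fromDigits: "20ac", radix: 16)       // "€"
let surrogate = character(fromDigits: "D800", radix: 16)  // nil: surrogates are not scalars
let tooBig = character(fromDigits: "110000", radix: 16)   // nil: beyond U+10FFFF
```

Because invalid input yields nil, the main loop can fall back to copying the entity verbatim instead of crashing.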

Swift 3:

// Mapping from XML/HTML character entity reference to character
// From http://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references
private let characterEntities : [ String : Character ] = [
// XML predefined entities:
"&quot;"    : "\"",
"&amp;"     : "&",
"&apos;"    : "'",
"&lt;"      : "<",
"&gt;"      : ">",


// HTML character entity references:
"&nbsp;"    : "\u{00a0}",
// ...
"&diams;"   : "♦",
]


extension String {


/// Returns a new string made by replacing in the `String`
/// all HTML character entity references with the corresponding
/// character.
var stringByDecodingHTMLEntities : String {


// ===== Utility functions =====


// Convert the number in the string to the corresponding
// Unicode character, e.g.
//    decodeNumeric("64", 10)   --> "@"
//    decodeNumeric("20ac", 16) --> "€"
func decodeNumeric(_ string : String, base : Int) -> Character? {
guard let code = UInt32(string, radix: base),
let uniScalar = UnicodeScalar(code) else { return nil }
return Character(uniScalar)
}


// Decode the HTML character entity to the corresponding
// Unicode character, return `nil` for invalid input.
//     decode("&#64;")    --> "@"
//     decode("&#x20ac;") --> "€"
//     decode("&lt;")     --> "<"
//     decode("&foo;")    --> nil
func decode(_ entity : String) -> Character? {


if entity.hasPrefix("&#x") || entity.hasPrefix("&#X"){
return decodeNumeric(entity.substring(with: entity.index(entity.startIndex, offsetBy: 3) ..< entity.index(entity.endIndex, offsetBy: -1)), base: 16)
} else if entity.hasPrefix("&#") {
return decodeNumeric(entity.substring(with: entity.index(entity.startIndex, offsetBy: 2) ..< entity.index(entity.endIndex, offsetBy: -1)), base: 10)
} else {
return characterEntities[entity]
}
}


// ===== Method starts here =====


var result = ""
var position = startIndex


// Find the next '&' and copy the characters preceding it to `result`:
while let ampRange = self.range(of: "&", range: position ..< endIndex) {
result.append(self[position ..< ampRange.lowerBound])
position = ampRange.lowerBound


// Find the next ';' and copy everything from '&' to ';' into `entity`
if let semiRange = self.range(of: ";", range: position ..< endIndex) {
let entity = self[position ..< semiRange.upperBound]
position = semiRange.upperBound


if let decoded = decode(entity) {
// Replace by decoded character:
result.append(decoded)
} else {
// Invalid entity, copy verbatim:
result.append(entity)
}
} else {
// No matching ';'.
break
}
}
// Copy remaining characters to `result`:
result.append(self[position ..< endIndex])
return result
}
}

Swift 2:

// Mapping from XML/HTML character entity reference to character
// From http://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references
private let characterEntities : [ String : Character ] = [
// XML predefined entities:
"&quot;"    : "\"",
"&amp;"     : "&",
"&apos;"    : "'",
"&lt;"      : "<",
"&gt;"      : ">",


// HTML character entity references:
"&nbsp;"    : "\u{00a0}",
// ...
"&diams;"   : "♦",
]


extension String {


/// Returns a new string made by replacing in the `String`
/// all HTML character entity references with the corresponding
/// character.
var stringByDecodingHTMLEntities : String {


// ===== Utility functions =====


// Convert the number in the string to the corresponding
// Unicode character, e.g.
//    decodeNumeric("64", 10)   --> "@"
//    decodeNumeric("20ac", 16) --> "€"
func decodeNumeric(string : String, base : Int32) -> Character? {
let code = UInt32(strtoul(string, nil, base))
return Character(UnicodeScalar(code))
}


// Decode the HTML character entity to the corresponding
// Unicode character, return `nil` for invalid input.
//     decode("&#64;")    --> "@"
//     decode("&#x20ac;") --> "€"
//     decode("&lt;")     --> "<"
//     decode("&foo;")    --> nil
func decode(entity : String) -> Character? {


if entity.hasPrefix("&#x") || entity.hasPrefix("&#X"){
return decodeNumeric(entity.substringFromIndex(entity.startIndex.advancedBy(3)), base: 16)
} else if entity.hasPrefix("&#") {
return decodeNumeric(entity.substringFromIndex(entity.startIndex.advancedBy(2)), base: 10)
} else {
return characterEntities[entity]
}
}


// ===== Method starts here =====


var result = ""
var position = startIndex


// Find the next '&' and copy the characters preceding it to `result`:
while let ampRange = self.rangeOfString("&", range: position ..< endIndex) {
result.appendContentsOf(self[position ..< ampRange.startIndex])
position = ampRange.startIndex


// Find the next ';' and copy everything from '&' to ';' into `entity`
if let semiRange = self.rangeOfString(";", range: position ..< endIndex) {
let entity = self[position ..< semiRange.endIndex]
position = semiRange.endIndex


if let decoded = decode(entity) {
// Replace by decoded character:
result.append(decoded)
} else {
// Invalid entity, copy verbatim:
result.appendContentsOf(entity)
}
} else {
// No matching ';'.
break
}
}
// Copy remaining characters to `result`:
result.appendContentsOf(self[position ..< endIndex])
return result
}
}

extension String{
func decodeEnt() -> String{
let encodedData = self.dataUsingEncoding(NSUTF8StringEncoding)!
let attributedOptions : [String: AnyObject] = [
NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType,
NSCharacterEncodingDocumentAttribute: NSUTF8StringEncoding
]
let attributedString = NSAttributedString(data: encodedData, options: attributedOptions, documentAttributes: nil, error: nil)!


return attributedString.string
}
}


let encodedString = "The Weeknd &#8216;King Of The Fall&#8217;"


let foo = encodedString.decodeEnt() /* The Weeknd ‘King Of The Fall’ */

This would be my approach. You could add the entities dictionary from https://gist.github.com/mwaterfall/25b4a6a06dc3309d9555 that Michael Waterfall mentions.

extension String {
func htmlDecoded()->String {


guard (self != "") else { return self }


var newStr = self


let entities = [
"&quot;"    : "\"",
"&amp;"     : "&",
"&apos;"    : "'",
"&lt;"      : "<",
"&gt;"      : ">",
]


for (name,value) in entities {
newStr = newStr.stringByReplacingOccurrencesOfString(name, withString: value)
}
return newStr
}
}

Example:

let encoded = "this is so &quot;good&quot;"
let decoded = encoded.htmlDecoded() // "this is so "good""

or

let encoded = "this is so &quot;good&quot;".htmlDecoded() // "this is so "good""
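
One thing to watch with this approach: a Swift Dictionary has no defined iteration order, so whether "&amp;" is replaced before or after "&lt;" can vary between runs, and input such as "&amp;lt;" may come out as "&lt;" or as "<". A minimal sketch that fixes the order with an array and handles "&amp;" last is shown below; the names orderedEntities and htmlDecodedInOrder are hypothetical.

```swift
import Foundation

// Replacement order matters, so use an array instead of a dictionary.
let orderedEntities: [(name: String, value: String)] = [
    ("&quot;", "\""),
    ("&apos;", "'"),
    ("&lt;", "<"),
    ("&gt;", ">"),
    ("&amp;", "&"), // last, so a freshly produced "&" is never decoded again
]

func htmlDecodedInOrder(_ string: String) -> String {
    var result = string
    for (name, value) in orderedEntities {
        result = result.replacingOccurrences(of: name, with: value)
    }
    return result
}

let once = htmlDecodedInOrder("&amp;lt;")  // "&lt;", decoded exactly one level
```

For larger entity tables a single left-to-right scan avoids the problem entirely and also avoids re-scanning the string once per entity.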

Swift 2 version of @akashivskyy's extension,

 extension String {
init(htmlEncodedString: String) {
if let encodedData = htmlEncodedString.dataUsingEncoding(NSUTF8StringEncoding){
let attributedOptions : [String: AnyObject] = [
NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType,
NSCharacterEncodingDocumentAttribute: NSUTF8StringEncoding
]


do{
if let attributedString:NSAttributedString = try NSAttributedString(data: encodedData, options: attributedOptions, documentAttributes: nil){
self.init(attributedString.string)
}else{
print("error")
self.init(htmlEncodedString)     //Returning actual string if there is an error
}
}catch{
print("error: \(error)")
self.init(htmlEncodedString)     //Returning actual string if there is an error
}


}else{
self.init(htmlEncodedString)     //Returning actual string if there is an error
}
}
}

Usage:

NSData dataRes = (nsdata value )


var resString = NSString(data: dataRes, encoding: NSUTF8StringEncoding)

Swift 3 version of @akashivskyy's extension,

extension String {
init(htmlEncodedString: String) {
self.init()
guard let encodedData = htmlEncodedString.data(using: .utf8) else {
self = htmlEncodedString
return
}


let attributedOptions: [String : Any] = [
NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType,
NSCharacterEncodingDocumentAttribute: String.Encoding.utf8.rawValue
]


do {
let attributedString = try NSAttributedString(data: encodedData, options: attributedOptions, documentAttributes: nil)
self = attributedString.string
} catch {
print("Error: \(error)")
self = htmlEncodedString
}
}
}

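I was looking for a pure Swift 3.0 utility to escape to and unescape from HTML character references (i.e. for server-side Swift apps on both macOS and Linux) but did not find any comprehensive solution, so I wrote my own implementation: https://github.com/IBM-Swift/swift-html-entities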

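The package, HTMLEntities, works with HTML4 named character references as well as hexadecimal and decimal numeric character references, and it recognizes special numeric character references per the W3 HTML5 spec (i.e. &#x80; should be unescaped as the Euro sign (Unicode U+20AC) and NOT as the Unicode character U+0080, and certain ranges of numeric character references should be replaced with the replacement character U+FFFD when unescaping).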

Usage example:

import HTMLEntities


// encode example
let html = "<script>alert(\"abc\")</script>"


print(html.htmlEscape())
// Prints "&lt;script&gt;alert(&quot;abc&quot;)&lt;/script&gt;"


// decode example
let htmlencoded = "&lt;script&gt;alert(&quot;abc&quot;)&lt;/script&gt;"


print(htmlencoded.htmlUnescape())
// Prints "<script>alert(\"abc\")</script>"

And for OP's example:

print("The Weeknd &#8216;King Of The Fall&#8217; [Video Premiere] | @TheWeeknd | #SoPhi ".htmlUnescape())
// prints "The Weeknd ‘King Of The Fall’ [Video Premiere] | @TheWeeknd | #SoPhi "

Edit: HTMLEntities now supports HTML5 named character references as of version 2.0.0.
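
To illustrate the special numeric references mentioned above, here is a rough sketch of the mapping step. It is not the package's implementation: the function name is made up, and the override table lists only a handful of the rows that HTML5 defines for the 0x80...0x9F range.

```swift
// Partial table of HTML5 overrides for the C1 range, which browsers
// interpret as Windows-1252. Only a few rows are listed here.
let c1Overrides: [UInt32: UInt32] = [
    0x80: 0x20AC, // euro sign
    0x91: 0x2018, // left single quotation mark
    0x92: 0x2019, // right single quotation mark
    0x93: 0x201C, // left double quotation mark
    0x94: 0x201D, // right double quotation mark
]

/// Maps the number of a numeric character reference to a Unicode scalar.
func scalar(forNumericReference code: UInt32) -> Unicode.Scalar {
    let replacement: Unicode.Scalar = "\u{FFFD}"
    if code == 0 { return replacement }
    if let mapped = c1Overrides[code], let mappedScalar = Unicode.Scalar(mapped) {
        return mappedScalar
    }
    // Surrogates and values above U+10FFFF are not valid scalars.
    return Unicode.Scalar(code) ?? replacement
}

let euroSign = scalar(forNumericReference: 0x80)    // U+20AC
let invalid = scalar(forNumericReference: 0xD800)   // U+FFFD
```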

Updated answer working on Swift 3

extension String {
init?(htmlEncodedString: String) {
let encodedData = htmlEncodedString.data(using: String.Encoding.utf8)!
let attributedOptions = [ NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType]


guard let attributedString = try? NSAttributedString(data: encodedData, options: attributedOptions, documentAttributes: nil) else {
return nil
}
self.init(attributedString.string)
}
}

Computed var version of @yishus' answer

public extension String {
/// Decodes string with HTML encoding.
var htmlDecoded: String {
guard let encodedData = self.data(using: .utf8) else { return self }


let attributedOptions: [String : Any] = [
NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType,
NSCharacterEncodingDocumentAttribute: String.Encoding.utf8.rawValue]


do {
let attributedString = try NSAttributedString(data: encodedData,
options: attributedOptions,
documentAttributes: nil)
return attributedString.string
} catch {
print("Error: \(error)")
return self
}
}
}

Swift 3.0 version with actual font size conversion

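Normally, if you convert HTML content directly to an attributed string, the font size is increased. You can try converting an HTML string to an attributed string and back again to see the difference.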

Instead, here is the actual size conversion, which makes sure the font size does not change by applying a ratio of 0.75 to all fonts:

extension String {
func htmlAttributedString() -> NSAttributedString? {
guard let data = self.data(using: String.Encoding.utf16, allowLossyConversion: false) else { return nil }
guard let attriStr = try? NSMutableAttributedString(
data: data,
options: [NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType],
documentAttributes: nil) else { return nil }
attriStr.beginEditing()
attriStr.enumerateAttribute(NSFontAttributeName, in: NSMakeRange(0, attriStr.length), options: .init(rawValue: 0)) {
(value, range, stop) in
if let font = value as? UIFont {
let resizedFont = font.withSize(font.pointSize * 0.75)
attriStr.addAttribute(NSFontAttributeName,
value: resizedFont,
range: range)
}
}
attriStr.endEditing()
return attriStr
}
}

Swift 4 version

extension String {


init(htmlEncodedString: String) {
self.init()
guard let encodedData = htmlEncodedString.data(using: .utf8) else {
self = htmlEncodedString
return
}


let attributedOptions: [NSAttributedString.DocumentReadingOptionKey : Any] = [
.documentType: NSAttributedString.DocumentType.html,
.characterEncoding: String.Encoding.utf8.rawValue
]


do {
let attributedString = try NSAttributedString(data: encodedData, options: attributedOptions, documentAttributes: nil)
self = attributedString.string
}
catch {
print("Error: \(error)")
self = htmlEncodedString
}
}
}

Swift 4

extension String {
var replacingHTMLEntities: String? {
do {
return try NSAttributedString(data: Data(utf8), options: [
.documentType: NSAttributedString.DocumentType.html,
.characterEncoding: String.Encoding.utf8.rawValue
], documentAttributes: nil).string
} catch {
return nil
}
}
}

Simple usage

let clean = "Weeknd &#8216;King Of The Fall&#8217;".replacingHTMLEntities ?? "default value"

Swift 4

extension String {


mutating func toHtmlEncodedString() {
guard let encodedData = self.data(using: .utf8) else {
return
}


let attributedOptions: [NSAttributedString.DocumentReadingOptionKey : Any] = [
NSAttributedString.DocumentReadingOptionKey(rawValue: NSAttributedString.DocumentAttributeKey.documentType.rawValue): NSAttributedString.DocumentType.html,
NSAttributedString.DocumentReadingOptionKey(rawValue: NSAttributedString.DocumentAttributeKey.characterEncoding.rawValue): String.Encoding.utf8.rawValue
]


do {
let attributedString = try NSAttributedString(data: encodedData, options: attributedOptions, documentAttributes: nil)
self = attributedString.string
}
catch {
print("Error: \(error)")
}
}
}

Swift 4


  • String extension computed var
  • Without extra guard, do, catch, etc.
  • Returns the original string if decoding fails

extension String {
var htmlDecoded: String {
let decoded = try? NSAttributedString(data: Data(utf8), options: [
.documentType: NSAttributedString.DocumentType.html,
.characterEncoding: String.Encoding.utf8.rawValue
], documentAttributes: nil).string


return decoded ?? self
}
}

Have a look at HTMLString, a library written in Swift that allows your program to add and remove HTML entities in strings.

For completeness, I copied the main features from the site:

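  • Adds entities for ASCII and UTF-8/UTF-16 encodings
  • Removes more than 2100 named entities (like &amp;)
  • Supports removing decimal and hexadecimal entities
  • Designed to support Swift Extended Grapheme Clusters (→ 100% emoji-proof)
  • Fully unit tested
  • Fast
  • Documented
  • Compatible with Objective-C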
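
As a rough idea of what "adding entities for ASCII" involves, here is a minimal sketch that escapes the markup characters and turns every non-ASCII scalar into a decimal numeric reference. The function asciiEscaped is hypothetical and is not the HTMLString API; the library itself should be preferred for real use.

```swift
/// Escapes markup characters and all non-ASCII scalars.
/// Hypothetical helper, not part of the HTMLString library.
func asciiEscaped(_ string: String) -> String {
    var output = ""
    for scalar in string.unicodeScalars {
        switch scalar {
        case "&": output += "&amp;"
        case "<": output += "&lt;"
        case ">": output += "&gt;"
        case "\"": output += "&quot;"
        case "'": output += "&#39;"
        default:
            if scalar.isASCII {
                output.unicodeScalars.append(scalar)
            } else {
                // Decimal numeric character reference, e.g. € -> &#8364;
                output += "&#\(scalar.value);"
            }
        }
    }
    return output
}

let escaped = asciiEscaped("12 € <b>")  // "12 &#8364; &lt;b&gt;"
```

Iterating over unicodeScalars rather than UTF-16 code units is what keeps characters outside the Basic Multilingual Plane, such as emoji, intact.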

Elegant Swift 4 solution

If you want a string,

myString = String(htmlString: encodedString)

add this extension to your project:

extension String {


init(htmlString: String) {
self.init()
guard let encodedData = htmlString.data(using: .utf8) else {
self = htmlString
return
}


let attributedOptions: [NSAttributedString.DocumentReadingOptionKey : Any] = [
.documentType: NSAttributedString.DocumentType.html,
.characterEncoding: String.Encoding.utf8.rawValue
]


do {
let attributedString = try NSAttributedString(data: encodedData,
options: attributedOptions,
documentAttributes: nil)
self = attributedString.string
} catch {
print("Error: \(error.localizedDescription)")
self = htmlString
}
}
}

If you want an NSAttributedString with bold, italic, links, etc.,

textField.attributedText = try? NSAttributedString(htmlString: encodedString)

add this extension to your project:

extension NSAttributedString {


convenience init(htmlString html: String) throws {
try self.init(data: Data(html.utf8), options: [
.documentType: NSAttributedString.DocumentType.html,
.characterEncoding: String.Encoding.utf8.rawValue
], documentAttributes: nil)
}


}

Swift 4

func decodeHTML(string: String) -> String? {


var decodedString: String?


if let encodedData = string.data(using: .utf8) {
let attributedOptions: [NSAttributedString.DocumentReadingOptionKey : Any] = [
.documentType: NSAttributedString.DocumentType.html,
.characterEncoding: String.Encoding.utf8.rawValue
]


do {
decodedString = try NSAttributedString(data: encodedData, options: attributedOptions, documentAttributes: nil).string
} catch {
print("\(error.localizedDescription)")
}
}


return decodedString
}

Swift 4:

The complete solution that finally worked for me with HTML code, newline characters and single quotes

extension String {
var htmlDecoded: String {
let decoded = try? NSAttributedString(data: Data(utf8), options: [
.documentType: NSAttributedString.DocumentType.html,
.characterEncoding: String.Encoding.utf8.rawValue
], documentAttributes: nil).string


return decoded ?? self
}
}

Usage:

let yourStringEncoded = yourStringWithHtmlcode.htmlDecoded

I then had to apply some more filters to get rid of single quotes (for example, don't, hasn't, It's, etc.) and newline characters like \n:

var yourNewString = String(yourStringEncoded.filter { !"\n\t\r".contains($0) })
yourNewString = yourNewString.replacingOccurrences(of: "\'", with: "", options: NSString.CompareOptions.literal, range: nil)

Swift 4.1 +

extension String {
    var htmlDecoded: String {
        let attributedOptions: [NSAttributedString.DocumentReadingOptionKey : Any] = [
            NSAttributedString.DocumentReadingOptionKey.documentType : NSAttributedString.DocumentType.html,
            NSAttributedString.DocumentReadingOptionKey.characterEncoding : String.Encoding.utf8.rawValue
        ]

        let decoded = try? NSAttributedString(data: Data(utf8), options: attributedOptions,
                                              documentAttributes: nil).string

        return decoded ?? self
    }
}

Swift 4

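I really like the solution using documentAttributes. However, it may be too slow for parsing files and/or for use in table view cells. I can't believe that Apple does not provide a decent solution for this.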

As a workaround, I found this String extension on GitHub, which works perfectly and decodes quickly.

So for situations in which the given answer is too slow, see the solution suggested in this link: https://gist.github.com/mwaterfall/25b4a6a06dc3309d9555

Note: it does not parse HTML tags.
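
If the slow path cannot be avoided (for instance because the tags need to be stripped too), another option for table view cells is to decode each string only once and reuse the result. This is a minimal sketch of such a cache; the class name is hypothetical, and it is not thread-safe, so it assumes all calls come from one thread.

```swift
/// Memoizes the result of an arbitrary decoding function.
/// Hypothetical helper; not thread-safe, use it from one thread only.
final class DecodedStringCache {
    private var storage: [String: String] = [:]
    private let decode: (String) -> String

    init(decode: @escaping (String) -> String) {
        self.decode = decode
    }

    /// Returns the cached result, decoding only on the first request.
    func decoded(_ encoded: String) -> String {
        if let cached = storage[encoded] { return cached }
        let value = decode(encoded)
        storage[encoded] = value
        return value
    }
}
```

With one of the htmlDecoded extensions from this thread it could be created as `DecodedStringCache { $0.htmlDecoded }` and queried from cellForRowAt, so scrolling back to a row does not trigger another HTML parse.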

Objective-C

+(NSString *) decodeHTMLEnocdedString:(NSString *)htmlEncodedString {
if (!htmlEncodedString) {
return nil;
}


NSData *data = [htmlEncodedString dataUsingEncoding:NSUTF8StringEncoding];
NSDictionary *attributes = @{NSDocumentTypeDocumentAttribute:     NSHTMLTextDocumentType,
NSCharacterEncodingDocumentAttribute:     @(NSUTF8StringEncoding)};
NSAttributedString *attributedString = [[NSAttributedString alloc]     initWithData:data options:attributes documentAttributes:nil error:nil];
return [attributedString string];
}

Swift 5.1 version

import UIKit


extension String {


init(htmlEncodedString: String) {
self.init()
guard let encodedData = htmlEncodedString.data(using: .utf8) else {
self = htmlEncodedString
return
}


let attributedOptions: [NSAttributedString.DocumentReadingOptionKey : Any] = [
.documentType: NSAttributedString.DocumentType.html,
.characterEncoding: String.Encoding.utf8.rawValue
]


do {
let attributedString = try NSAttributedString(data: encodedData, options: attributedOptions, documentAttributes: nil)
self = attributedString.string
}
catch {
print("Error: \(error)")
self = htmlEncodedString
}
}
}

Also, if you want to extract date, images, metadata, title and description, you can use my pod named Readability kit.