What characters are valid in a URL?

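I'm trying to strip the non-URL parts out of a large string. Most of the regular expressions I've found look something like [A-Za-z0-9-_.!~*'()], but a URL can contain more than that, for example http://127.0.0.1:8080/test?v=123#this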

So, what is the current list of valid characters for a URL?


All the gory details can be found in the current RFC on the topic: RFC 3986 (Uniform Resource Identifier (URI): Generic Syntax)

Based on the RFC, you are looking at a list that looks like: A-Z, a-z, 0-9, -, ., _, ~, :, /, ?, #, [, ], @, !, $, &, ', (, ), *, +, ,, ;, %, and =. Everything else must be url-encoded. Also, some of these characters can only exist in very specific spots in a URI and outside of those spots must be url-encoded (e.g. % can only be used in conjunction with url encoding, as in %20); the RFC has all of these specifics.
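As a rough illustration, the RFC 3986 character set (unreserved characters, gen-delims, sub-delims, and % for percent-encoding) can be written as a Python regular expression. This is only a sketch: the names `URI_CHARS`, `only_uri_chars`, and `has_valid_percent_escapes` are made up for this example, and the check covers which characters may appear at all, not whether each one sits in a spot where the RFC allows it.

```python
import re

# Characters RFC 3986 permits somewhere in a URI:
#   unreserved: A-Z a-z 0-9 - . _ ~
#   gen-delims: : / ? # [ ] @
#   sub-delims: ! $ & ' ( ) * + , ; =
#   plus "%" as the percent-encoding introducer
# (hypothetical name, not from any library)
URI_CHARS = re.compile(r"[A-Za-z0-9\-._~:/?#\[\]@!$&'()*+,;=%]+")


def only_uri_chars(text: str) -> bool:
    """Return True if every character in text is in the RFC 3986 set."""
    return URI_CHARS.fullmatch(text) is not None


def has_valid_percent_escapes(text: str) -> bool:
    """Return True if every '%' is followed by exactly two hex digits."""
    return re.search(r"%(?![0-9A-Fa-f]{2})", text) is None


if __name__ == "__main__":
    sample = "http://127.0.0.1:8080/test?v=123#this"
    print(only_uri_chars(sample))
    print(has_valid_percent_escapes("a%20b"))
```

Passing this check does not make a string a well-formed URL. For pulling URLs out of a larger string, the character class is probably best treated as a first filter, followed by a real parser such as `urllib.parse.urlsplit`.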