What is the xs:NCName type, and when should it be used?

I ran an XML file through a schema generator, and everything it produced was as expected, except for one node:

<xs:element name="office" type="xs:NCName"/>

What exactly is xs:NCName, and why would one use it instead of xs:string?


NCName is a non-colonized name, e.g. "name", as opposed to QName, which is a qualified name, e.g. "ns:name". If your names are not supposed to be qualified by different namespaces, then they are NCNames.
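To make the distinction concrete, here is a minimal Python sketch that splits the lexical form of a QName into its optional prefix and its local part, each of which is an NCName. The helper name `split_qname` is hypothetical, and the function only looks at the colon: it neither validates the two parts nor resolves the prefix to a namespace URI.

```python
from typing import Optional, Tuple


def split_qname(qname: str) -> Tuple[Optional[str], str]:
    """Split a lexical QName such as "ns:name" into (prefix, local part).

    Hypothetical helper for illustration only: it checks the colon structure
    but does not verify that either part is a valid NCName.
    """
    prefix, sep, local = qname.partition(":")
    if not sep:
        # No colon at all: the value is an unprefixed name.
        return None, qname
    if not prefix or not local or ":" in local:
        # Empty prefix, empty local part, or a second colon.
        raise ValueError(f"not a QName: {qname!r}")
    return prefix, local
```

With this sketch, `split_qname("ns:name")` yields the pair `("ns", "name")`, while `split_qname("name")` yields `(None, "name")`.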

xs:string puts no restrictions on your names at all, whereas xs:NCName requires the value to be a valid XML name that contains no ":".

http://books.xmlschemata.org/relaxng/ch19-77215.html

No spaces or colons. Allows "_" and "-".

You would use this instead of string so that you can validate that the value is limited to what is allowed. It maps well to certain conventions for name/identifier like django's concept of "slug", for instance.

I'll upvote whoever translates [\i-[:]][\c-[:]]* into English for us.

@skyl practically provoked me to write this answer, so please excuse any redundancy.

NCName stands for "non-colonized name". NCName can be defined as an XML Schema regular expression [\i-[:]][\c-[:]]*

...and what does that regex mean?

\i and \c are multi-character escapes defined in the XML Schema specification.
http://www.w3.org/TR/xmlschema-2/#dt-ccesN
\i is the escape for the set of initial XML name characters and \c is the escape for the set of XML name characters. [\i-[:]] means the set \i minus the set that consists of the colon character :. So in plain English it means "any initial name character, but not :". The whole regular expression reads as "One initial XML name character, but not a colon, followed by zero or more XML name characters, but not a colon."
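Python's `re` module has no \i or \c escapes, so a rough equivalent has to spell the character classes out. The sketch below assumes the NameStartChar and NameChar ranges of XML 1.0 (Fifth Edition) with the colon removed; an XSD 1.0 processor works from older character tables and may differ for some non-ASCII characters. The names `NCNAME_RE` and `is_ncname` are made up for this example.

```python
import re

# \i minus the colon: assumed to be XML 1.0 (Fifth Edition) NameStartChar without ":".
_NAME_START = (
    "A-Z_a-z"
    "\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF"
    "\u0370-\u037D\u037F-\u1FFF\u200C-\u200D"
    "\u2070-\u218F\u2C00-\u2FEF\u3001-\uD7FF"
    "\uF900-\uFDCF\uFDF0-\uFFFD\U00010000-\U000EFFFF"
)

# \c minus the colon: everything above plus "-", ".", digits and a few extras.
_NAME_CHAR = _NAME_START + "\\-.0-9" + "\u00B7\u0300-\u036F\u203F-\u2040"

# One initial name character, then zero or more name characters.
NCNAME_RE = re.compile("[" + _NAME_START + "][" + _NAME_CHAR + "]*")


def is_ncname(value: str) -> bool:
    """Return True if the whole string matches the approximated NCName pattern."""
    return NCNAME_RE.fullmatch(value) is not None
```

Under these assumptions, `is_ncname("office")` and `is_ncname("_id-1.2")` are true, while "ns:office", "1st" and the empty string are rejected.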

Practical restrictions of an NCName

In practice, an NCName cannot contain symbol characters such as :, @, $, %, &, /, +, ,, ;, nor whitespace characters or brackets of any kind. Furthermore, an NCName cannot begin with a digit, a dot or a hyphen, although these characters can appear later in an NCName.

Where are NCNames needed

In namespace-conformant XML documents, all names must be either qualified names or NCNames. The following values must be NCNames (not qualified names):

  • namespace prefixes
  • values representing an ID
  • values representing an IDREF
  • values representing a NOTATION
  • processing instruction targets
  • entity names

Practically speaking...

Allowed characters: -, ., 0-9, A-Z, _, a-z; plus all non-ASCII characters matching \p{L}+.

Also, digits, - and . cannot be used as the first character of the value.

Disallowed characters: space, !, ", #, $, %, &, ', (, ), *, +, ,, /, :, ;, <, =, >, ?, @, [, \, ], ^, `, {, |, }, ~
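The two lists can be cross-checked by walking over the printable ASCII range. This is only a sketch and deliberately ignores non-ASCII characters; the variable names are chosen for this example.

```python
import string

# ASCII characters an NCName may contain (non-ASCII name characters are ignored here).
ALLOWED = set(string.ascii_letters + string.digits + "-._")
# ASCII characters an NCName may start with.
ALLOWED_FIRST = set(string.ascii_letters + "_")

# Printable ASCII, from space (0x20) through "~" (0x7E).
printable = [chr(code) for code in range(0x20, 0x7F)]

# Everything printable that is not allowed anywhere in an NCName.
disallowed = [ch for ch in printable if ch not in ALLOWED]
# Allowed characters that still may not appear in first position.
not_first = sorted(ALLOWED - ALLOWED_FIRST)

print("disallowed:", " ".join(repr(ch) for ch in disallowed))
print("allowed, but not first:", " ".join(not_first))
```

The first line of output lists the 30 printable ASCII characters that can never appear; the second lists the hyphen, the dot and the ten digits.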