Consultor Eletrônico



Kbase P20503: What characters are supported in XML documents using Web Services?
Author   Progress Software Corporation - Progress
Access   Public
Published   3/4/2003
Status: Unverified

GOAL:

What characters are supported in XML documents using Web Services?

FIX:

We use the Xerces 1.3 XML parser to interpret the XML, so the supported characters are those that the XML specification allows. The XML specification (http://www.w3.org/TR/REC-xml) states:

"2.2 Characters
[Definition: A parsed entity contains text, a sequence of characters, which
may represent markup or character data.] [Definition: A character is an
atomic unit of text as specified by ISO/IEC 10646 [ISO/IEC 10646] (see also
[ISO/IEC 10646-2000]). Legal characters are tab, carriage return, line
feed, and the legal characters of Unicode and ISO/IEC 10646. The versions
of these standards cited in A.1 Normative References were current at the
time this document was prepared. New characters may be added to these
standards by amendments or new editions. Consequently, XML processors must
accept any character in the range specified for Char. The use of
"compatibility characters", as defined in section 6.8
of [Unicode] (see also D21 in section 3.6 of [Unicode3]), is discouraged.]

Character Range
[2] Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] |
[#x10000-#x10FFFF] /* any Unicode character, excluding the surrogate
blocks, FFFE, and FFFF. */

The mechanism for encoding character code points into bit patterns may vary
from entity to entity. All XML processors must accept the UTF-8 and UTF-16
encodings of 10646; the mechanisms for signaling which of the two is in
use, or for bringing other encodings into play, are discussed later, in
4.3.3 Character Encoding in Entities."
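
The Char production quoted above can be checked mechanically before a document is handed to the parser. The following is a minimal sketch in Python, not the Xerces implementation; the helper names is_xml_char and find_illegal_chars are hypothetical and exist only for illustration.

```python
# Sketch of the Char production from section 2.2 of the XML specification.
# is_xml_char and find_illegal_chars are hypothetical helper names,
# not part of Xerces or of any Progress API.

def is_xml_char(code_point: int) -> bool:
    """Return True if code_point falls inside the Char production."""
    return (
        code_point in (0x9, 0xA, 0xD)            # tab, line feed, carriage return
        or 0x20 <= code_point <= 0xD7FF          # excludes the other C0 controls
        or 0xE000 <= code_point <= 0xFFFD        # skips surrogates, FFFE and FFFF
        or 0x10000 <= code_point <= 0x10FFFF     # supplementary planes
    )


def find_illegal_chars(text: str) -> list:
    """Return (index, 'U+XXXX') pairs for characters outside the Char range."""
    return [
        (index, f"U+{ord(ch):04X}")
        for index, ch in enumerate(text)
        if not is_xml_char(ord(ch))
    ]


if __name__ == "__main__":
    # NUL (U+0000) and vertical tab (U+000B) are not legal XML characters;
    # tab (U+0009) and the euro sign (U+20AC) are.
    sample = "price\t10\x00 \u20ac\x0b"
    print(find_illegal_chars(sample))  # [(8, 'U+0000'), (11, 'U+000B')]
```

A check like this only covers which code points are legal. How those code points are encoded into bytes (UTF-8, UTF-16, or another declared encoding) is a separate matter, as the last quoted paragraph notes.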