Re: XML encoding - handling foreign lang chars (SOLUTION)



  1. #1
    Lou Guest

    Re: XML encoding - handling foreign lang chars (SOLUTION)


    PROBLEM (restated):
    ==================
    My client creates an XML doc in memory. It sends the xmldoc.xml string to
    an ASP on the server. The server does xmldoc.load(Request) and gets an error:
    "An invalid character was found in text content." because the XML contains
    a foreign language character.
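
A minimal sketch of the same failure mode, written in Python rather than VB/MSXML purely so it is easy to run. It assumes the payload was encoded as ISO-8859-1 and carries no encoding declaration, so the parser falls back to UTF-8 and rejects the byte. The payload and the helper name try_parse are hypothetical, and the error text will differ from the MSXML message.

```python
import xml.etree.ElementTree as ET

# Hypothetical payload: "é" is sent as the single Latin-1 byte 0xE9,
# and nothing in the document says which encoding was used.
payload = "<name>Café</name>".encode("iso-8859-1")


def try_parse(data: bytes) -> str:
    """Return the root element's text, or the parser's error message."""
    try:
        return ET.fromstring(data).text
    except ET.ParseError as exc:
        return f"parse error: {exc}"


# Without a declaration the parser assumes UTF-8, and 0xE9 followed by "<"
# is not a valid UTF-8 sequence, so parsing fails.
result = try_parse(payload)
print(result)
```

A pure-ASCII payload parses fine with the same helper, which is why the problem only shows up once a non-ASCII character appears.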

    SOLUTION
    ========
    I tried having the client do an xmldoc.createProcessingInstruction and set
    the encoding. Still got the error. Then I realized something.... xmldoc.xml
    gives you a string that DOES NOT indicate the encoding!! SO, the encoding
    was lost and the error continued.

    Now I do this:
    sXML = Replace(oXMLDoc.xml, "?>", " encoding=""ISO-8859-1""?>", 1, 1)

    This adds the encoding to the <?xml version="1.0" ?> declaration at the beginning
    of the doc (put there by the createProcessingInstruction). I send the sXML
    string to the ASP and...

    NOW IT WORKS!!! The xmldoc.load(Request) works without error!
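
The Replace call above can be sketched in Python as follows. This is an illustration of the idea, not the MSXML behavior itself: add_encoding is a hypothetical helper, the sample document is made up, and it assumes the bytes sent to the server really are ISO-8859-1.

```python
import xml.etree.ElementTree as ET


def add_encoding(xml_text: str, encoding: str = "ISO-8859-1") -> str:
    """Insert an encoding pseudo-attribute before the first '?>' only."""
    return xml_text.replace("?>", f' encoding="{encoding}"?>', 1)


# Hypothetical document as the client might serialize it: an XML
# declaration is present, but it does not name an encoding.
doc = '<?xml version="1.0" ?><name>Café</name>'

patched = add_encoding(doc)

# The bytes on the wire must match the declared encoding.
wire_bytes = patched.encode("iso-8859-1")

# With the declaration in place, the parser decodes 0xE9 as "é".
root = ET.fromstring(wire_bytes)
print(patched)
print(root.text)
```

Two caveats with this approach: it relies on the first "?>" in the string belonging to the XML declaration, and the declared encoding only helps if the string is actually transmitted in that encoding.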


    SIDE NOTES
    ==========
    1) xmldoc.save("somefile.xml") DOES indeed retain the encoding, but xmldoc.xml
    does NOT.

    2) I use the Internet Transfer Control to send my XML to the server because
    MSXML.XMLHTTPRequest requires IE and I don't want that requirement for
    my users. I could not figure out which IE DLLs MSXML.XMLHTTPRequest really
    needs (it's more than msxml.dll).
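
Side note 1 has a rough analogue in other XML libraries. The sketch below uses Python's xml.etree.ElementTree, not MSXML, and only illustrates the general pattern: serializing to an in-memory text string drops the encoding declaration, while serializing to bytes in a named encoding writes it.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<name>Café</name>")

# Serialized as a text string: no XML declaration, so no encoding is named.
as_text = ET.tostring(root, encoding="unicode")

# Serialized as ISO-8859-1 bytes: a declaration naming the encoding is
# written first, followed by the encoded document.
as_bytes = ET.tostring(root, encoding="ISO-8859-1")

print(as_text)
print(as_bytes)
```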


  2. #2
    Leon Mar Guest

    Re: XML encoding - handling foreign lang chars (SOLUTION)


    "Lou" <l_kvitek@audiblemagic.com> wrote:
    >
    >PROBLEM (restated):
    >==================
    >My client creates an XML doc in memory. It sends the xmldoc.xml string to
    >an ASP on the server. The server does xmldoc.load(Request) and gets an error:
    > "An invalid character was found in text content." because the XML contains
    >a foreign language character.
    >
    >SOLUTION
    >========
    >I tried having the client do an xmldoc.createProcessingInstruction and set
    >the encoding. Still got the error. Then I realized something.... xmldoc.xml
    >gives you a string that DOES NOT indicate the encoding!! SO, the encoding
    >was lost and the error continued.
    >
    >Now I do this:
    >sXML = Replace(oXMLDoc.xml, "?>", " encoding=""ISO-8859-1""?>", 1, 1)
    >

    Would ISO-8859-1 be enough to cater for foreign languages such as CJK? I
    don't think so.

    ISO-8859-1 covers only Latin-1 (Western European) characters. UTF-8 can
    encode every Unicode code point, including CJK.

    The XML DOM only deals with UTF-8 if the document is loaded from a file.
    loadXML() expects a BSTR, which is a UCS-2 string. So forcing a UTF-8 string
    into UCS-2 without first going through MultiByteToWideChar(CP_UTF8, ...) and
    then handing it to loadXML() will cause strange things to happen.
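
Both points can be checked with a short Python sketch (the sample text is arbitrary, and the "widening" step is only a stand-in for copying UTF-8 bytes into a wide string without conversion, not a call into any Windows API):

```python
text = "日本語"  # arbitrary CJK sample

# UTF-8 can represent these characters; each takes three bytes here.
utf8_bytes = text.encode("utf-8")

# ISO-8859-1 cannot represent them at all.
try:
    text.encode("iso-8859-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False

# Naive widening: every byte becomes one character. This mimics treating
# UTF-8 bytes as if they were already wide characters.
widened = "".join(chr(b) for b in utf8_bytes)

# Proper conversion: decode the bytes as UTF-8 to get the original text back.
decoded = utf8_bytes.decode("utf-8")

print(latin1_ok, len(utf8_bytes), len(widened), decoded == text)
```

The widened string has nine garbage characters instead of three CJK characters, which is the kind of "strange thing" to expect when the conversion step is skipped.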


