Thread: How to convert string with double high/wide characters to normal string [VC++6]

  1. #1
    Join Date
    Feb 2005
    Posts
    24

    How to convert string with double high/wide characters to normal string [VC++6]

    My application typically receives a string in the following format:
    " Item $5.69 "

    Some constants I always expect:
    - the LENGTH is always 20 characters
    - the start index of the text is always [5]
    - and, most importantly, the index of the DECIMAL point for the price is always [14]
    To identify this string correctly, I validate all of the expected constants listed above ....

    Some of my clients have now started sending the string with Double-High / Double-Wide values (a pair of characters that represents a single readable character), similar to the following:
    " Item $\x80\x90.\x81\x91\x82\x92 "

    For testing I simply scan the string character-by-character, compare char[i] and char[i+1] and replace these pairs with their corresponding single character when a match is found (works fine) as follows:

    Code:
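    // Collapse each double-high/double-wide byte pair in sData (a std::string)
    // into the single character it represents.
    // Note: every replacement shortens the string by one character.
    for (std::string::size_type i = 0; i + 1 < sData.length(); i++)
    {
       // Read the bytes as unsigned so values >= 0x80 compare correctly.
       unsigned char ch  = static_cast<unsigned char>(sData[i]);
       unsigned char ch2 = static_cast<unsigned char>(sData[i + 1]);

       // The bound i + 1 < length() keeps sData[i + 1] inside the string.
       if (ch == 0x80 && ch2 == 0x90)
          sData.replace(i, 2, "0");   // replace the 2 bytes at i with "0"
       else if (ch == 0x81 && ch2 == 0x91)
          sData.replace(i, 2, "1");
       else if (ch == 0x82 && ch2 == 0x92)
          sData.replace(i, 2, "2");
       ...
       ...
       ...
    }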
    But the result is something like this:
    " Item $5.69 "
    Notice how this no longer matches my expectations: the length is now 17 (instead of 20) because of the 3 conversions, and the decimal point is now at index 13 (instead of 14) because of the conversion of the "5" before the decimal point.


    Ideally, I would like to convert the string to a normal readable format while keeping the constants (length, index of text, index of decimal) in the same place, so the rest of my application stays reusable ... or any other suggestion (I'm pretty much stuck with this)... Is there a STANDARD way of dealing with this type of character?
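
    One possible way to keep those constants stable is to count how many pairs were collapsed on each side of the decimal point and put that many spaces back. The sketch below is only an illustration under several assumptions: the names normalizeDoubleWide, lookupPair and kWidePairs are hypothetical, the table holds only the three pairs listed in this post (any others would have to come from the sender's documentation), and the left padding is inserted just before the '$' sign, which is a guess about the field layout.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// One double-high/double-wide byte pair and the plain character it stands for.
struct WidePair { unsigned char first; unsigned char second; char plain; };

// Only the pairs mentioned in the post; the rest of the table is unknown here.
static const WidePair kWidePairs[] = {
    { 0x80, 0x90, '0' },
    { 0x81, 0x91, '1' },
    { 0x82, 0x92, '2' },
};

// Returns the plain character for a byte pair, or 0 if the pair is not listed.
static char lookupPair(unsigned char a, unsigned char b)
{
    const std::size_t count = sizeof(kWidePairs) / sizeof(kWidePairs[0]);
    for (std::size_t k = 0; k < count; ++k)
        if (kWidePairs[k].first == a && kWidePairs[k].second == b)
            return kWidePairs[k].plain;
    return 0;
}

// Collapses known pairs, then re-pads so the length and the decimal index
// match the raw input.
std::string normalizeDoubleWide(const std::string& raw)
{
    std::string out;
    std::size_t lostBefore = 0;   // pairs collapsed before the decimal point
    std::size_t lostAfter  = 0;   // pairs collapsed after the decimal point
    bool seenDecimal = false;

    std::size_t i = 0;
    while (i < raw.size()) {
        char plain = 0;
        if (i + 1 < raw.size())
            plain = lookupPair(static_cast<unsigned char>(raw[i]),
                               static_cast<unsigned char>(raw[i + 1]));
        if (plain != 0) {
            out += plain;
            if (seenDecimal) ++lostAfter; else ++lostBefore;
            i += 2;               // consume both bytes of the pair
        } else {
            if (raw[i] == '.') seenDecimal = true;
            out += raw[i];
            ++i;
        }
    }

    // Assumption: pad just before '$' so the text start index is untouched.
    // If there is no '$', fall back to the front of the string.
    std::string::size_type padAt = out.find('$');
    if (padAt == std::string::npos) padAt = 0;
    out.insert(padAt, lostBefore, ' ');
    out.append(lostAfter, ' ');

    assert(out.size() == raw.size());   // length is preserved by construction
    return out;
}
```

    With a 20-character input whose price field is "$\x80\x90.\x81\x91\x82\x92", this returns a 20-character string containing "$0.12" with the decimal point still at index 14 and the text still starting at index 5.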

    Any help would be greatly appreciated, I've been stuck on this for a while now ...
    Thanks,

  2. #2
    Join Date
    Dec 2003
    Posts
    3,366
    These are wide chars (Unicode), which are used in real applications because many languages other than English (esp. Chinese, Japanese, etc.) can't fit their alphabets into a mere 256 characters.

    This is all standard, yes. You can either convert it by just discarding every other byte (Unicode matches ASCII for code points 0-127, so it extends ASCII; this means the data is all ASCII when the high-order byte is zero and the low-order byte is below 128), or you can convert it by reading the string as a wide string and using built-in functions. I rarely use Unicode, so rather than give you some sloppy way to do it, let me just recommend that you search your help / C++ references for a good example; the web has examples too. There are 2 approaches that somewhat overlap: your compiler settings can force ALL chars to be wide, i.e. your project is now a "Unicode" project and all characters are 16-bit values, or you can use wchar_t and the associated wide strings to mix and match Unicode and ASCII in one project.
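
    As a rough illustration of the "discard the high-order byte" idea, here is a minimal sketch that narrows a std::wstring to a std::string only when every character is in the ASCII range. The function name narrowAscii is hypothetical, and the sketch only applies if the incoming data really is wide-character text.

```cpp
#include <cstddef>
#include <string>

// Copies `wide` into `narrow` if every character is plain ASCII (0-127).
// Returns false and leaves `narrow` untouched otherwise.
bool narrowAscii(const std::wstring& wide, std::string& narrow)
{
    std::string result;
    result.reserve(wide.size());

    for (std::size_t i = 0; i < wide.size(); ++i) {
        // Cast to an unsigned type so the check works whether wchar_t
        // is signed or unsigned on the target platform.
        if (static_cast<unsigned long>(wide[i]) > 0x7F)
            return false;                       // not representable as ASCII
        result += static_cast<char>(wide[i]);   // keep only the low byte
    }

    narrow.swap(result);
    return true;
}
```

    Rejecting anything above 0x7F, rather than silently truncating it, avoids turning unrelated characters into wrong ASCII ones.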

  3. #3
    Join Date
    Feb 2005
    Posts
    24
    So you recognize those codes? A lot of people I've spoken to say they are NOT unicode so it had me worried - knowing that someone recognized them as such is already a great help ...

    Thanks

  4. #4
    Join Date
    Dec 2003
    Posts
    3,366
    Quote Originally Posted by Shaitan00 View Post
    So you recognize those codes? A lot of people I've spoken to say they are NOT unicode so it had me worried - knowing that someone recognized them as such is already a great help ...

    Thanks
    No, I did not; I just guessed that any 2-byte char is a Unicode entity. Looking at it, it's pretty clear that 80 is not zero, so that's out. There are several other 2-byte char sets out there; I thought they were all obsolete, though. I have no clue what char set would have 81 91 == '1'.
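
    The "80 is not zero" check can be written down as a quick heuristic: 16-bit Unicode text limited to code points 0-255 has a zero byte in every pair. The sketch below uses a hypothetical function name, looksLikeUtf16Latin, and accepts either byte order; it is a plausibility test, not a real encoding detector.

```cpp
#include <cstddef>
#include <string>

// True if `bytes` could be 16-bit text with every code point in 0-255,
// i.e. all even-indexed bytes or all odd-indexed bytes are zero.
bool looksLikeUtf16Latin(const std::string& bytes)
{
    if (bytes.empty() || bytes.size() % 2 != 0)
        return false;                 // 16-bit units need an even byte count

    bool evenAllZero = true;          // would indicate big-endian order
    bool oddAllZero  = true;          // would indicate little-endian order

    for (std::size_t i = 0; i < bytes.size(); ++i) {
        if (bytes[i] != '\0') {
            if (i % 2 == 0) evenAllZero = false;
            else            oddAllZero  = false;
        }
    }
    return evenAllZero || oddAllZero;
}
```

    A pair such as 0x80 0x90 has no zero byte in either position, so it fails this test.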

    Which client programs are sending the data in this format? Perhaps those programs have a spec that tells you which multibyte set is in use. From there, there WILL be a standard, but which standard... ? I just do not know.
