xisto Community
Galahad

VB6 and Unicode: problems displaying Unicode characters with VB6


Everyone knows this, but I thought I would repeat it and offer my solution (well, not a solution, a workaround) for this problem.

As most of you know, VB6 is built internally with Unicode support, but all its controls are ANSI-only, meaning you get a series of ???'s when you insert Unicode characters from, say, a MySQL database.

There are tons of solutions for this: using the API to create a textbox that supports Unicode, using MS Forms 2.0, buying a custom ActiveX control that supports Unicode, and of course migrating to .NET.

All of these are nice, but sometimes they don't work. Using the API to create a Unicode-capable textbox is too much pain and suffering for the effect; it's just not worth it. MS Forms is not redistributable, and besides, it doesn't always work (it didn't work for me). Buying a third-party ActiveX control may not be cost-effective, especially if you're doing something for a school project or just for yourself. I, for one, wouldn't pay $300+ for a third-party ActiveX control for my control center program, which is only for my personal use. Migrating to .NET is not a solution.

This is what I did to be able to display the characters I needed from the Latin script of my language. These characters are "ŠĐČĆŽ", in both upper and lower case. A bunch of ?'s, as far as VB's controls are concerned.

Unicode rules are too extensive to cover here, but I'm sure you will find plenty of information on the internet. What I did was create a substitution table for my characters: whenever I find a sequence of bytes that equals the UTF-8 encoding of one of the characters in question, I replace it with its ANSI equivalent. Now, you might wonder how the ANSI character then shows up correctly. Well, every Windows installation offers multiple input (keyboard) languages. Combine that with the .Charset property of the textbox's font, set to the correct character set, and you can get your national characters in ANSI. I'm not sure how this works for Chinese, Japanese, and other languages that have hundreds of different characters, so I'm not going into that now. As I said, this is ONLY a WORKAROUND, not a way to make VB6 controls recognize Unicode.

Anyway, enough introduction, let's get to the code. Oh, I should mention one more thing: this will only work for strings, or possibly text files (I haven't tried it with files). I made these functions to get the data from the database (hosted at Xisto ;)), display it correctly in my textbox, and, after editing the text, upload it back to the database, again as UTF-8, so that it displays correctly on the web page (I hate that Windows-1250 codepage).
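For anyone wondering where the numbers in the substitution table come from, here is a minimal sketch of how one entry is built. Utf8PairToKey is a hypothetical helper, not part of the class in this post. It only mirrors what the class itself does with a two-byte UTF-8 sequence: the second byte goes in the high half, the first byte in the low half, and the result is written as hex.

```vb
' Hypothetical helper (not part of the class in this post).
' Packs the two bytes of a 2-byte UTF-8 sequence the same way the
' Conversion table does and returns the hex string used as the key.
Public Function Utf8PairToKey(ByVal firstByte As Byte, ByVal secondByte As Byte) As String
    Dim packed As Long
    packed = CLng(secondByte) * 256& + CLng(firstByte)
    Utf8PairToKey = Hex$(packed)
End Function

' "Š" is U+0160, which UTF-8 encodes as the bytes C5 A0,
' so Utf8PairToKey(&HC5, &HA0) gives "A0C5".
```

The other half of each pair is the character's code in the Windows-1250 codepage; for Š that is 138 (&H8A), which is why the table pairs "A0C5" with 138.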

Option Explicit
Option Base 0

Private Conversion As New Collection

' Turns an ANSI (Windows-1250) string into a string whose characters
' each hold one byte of the UTF-8 encoding.
Public Function ANSIToUTF8(ByVal ANSI As String) As String
  Dim i   As Long
  Dim aW  As Long
  Dim aB  As Byte
  Dim s   As String

  s = ""
  For i = 1 To Len(ANSI)
    aW = Asc(Mid(ANSI, i, 1))
    aB = AscB(Mid(ANSI, i, 1))
    If aW >= 0 And aW <= 126 Then
      ' plain ASCII is passed through unchanged
      s = s & StrConv(ChrB(aB), vbUnicode)
    Else
      ' look up the two UTF-8 bytes by ANSI character code
      s = s & StrConv(ChrW(Conversion.Item(Trim(CStr(aW)))), vbUnicode)
    End If
  Next i
  ANSIToUTF8 = s
End Function

' Turns a string of UTF-8 bytes (one byte per character) back into ANSI.
' Only one-byte and two-byte UTF-8 sequences are handled.
Public Function UTF8ToANSI(ByVal UTF As String) As String
  Dim i   As Long
  Dim l1  As Long
  Dim l2  As Long
  Dim J   As Long
  Dim b() As Long
  Dim l() As Long
  Dim s   As String

  ReDim b(0 To Len(UTF) - 1)
  For i = 0 To Len(UTF) - 1
    b(i) = Asc(Mid(UTF, i + 1, 1))
  Next i
  For i = LBound(b) To UBound(b)
    ReDim Preserve l(J)
    If b(i) >= 0 And b(i) <= 126 Then
      l(J) = b(i)
    Else
      ' pack the pair: second byte high, first byte low
      l1 = b(i + 1)
      l1 = l1 * 256
      l2 = b(i) + (b(i + 1) \ 256)
      l(J) = l1 + l2
      i = i + 1
    End If
    J = J + 1
  Next i
  For i = LBound(l) To UBound(l)
    l1 = l(i)
    If l1 >= 0 And l1 <= 126 Then
      s = s & Chr(l1)
    Else
      s = s & Chr(Conversion.Item(Trim(Hex(l1))))
    End If
  Next i
  UTF8ToANSI = s
End Function

Private Sub Class_Initialize()
  ' ANSI -> UTF8 (uppercase)
  ' ANSI charcode is used as a key to access the item in Conversion collection
  ' actual value for Conversion.Item("138") = &HA0C5
  Conversion.Add &HA0C5, "138"
  Conversion.Add &H90C4, "208"
  Conversion.Add &H8CC4, "200"
  Conversion.Add &H86C4, "198"
  Conversion.Add &HBDC5, "142"
  ' ANSI -> UTF8 (lowercase)
  Conversion.Add &HA1C5, "154"
  Conversion.Add &H91C4, "240"
  Conversion.Add &H8DC4, "232"
  Conversion.Add &H87C4, "230"
  Conversion.Add &HBEC5, "158"
  ' UTF8 -> ANSI (uppercase)
  ' UTF charcode (in hex) is used as a key to access the item in Conversion collection
  ' actual value for Conversion.Item("A0C5") = 138
  Conversion.Add 138, "A0C5"
  Conversion.Add 208, "90C4"
  Conversion.Add 200, "8CC4"
  Conversion.Add 198, "86C4"
  Conversion.Add 142, "BDC5"
  ' UTF8 -> ANSI (lowercase)
  Conversion.Add 154, "A1C5"
  Conversion.Add 240, "91C4"
  Conversion.Add 232, "8DC4"
  Conversion.Add 230, "87C4"
  Conversion.Add 158, "BEC5"
End Sub

Private Sub Class_Terminate()
  Set Conversion = Nothing
End Sub

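Calling the class from a form might look roughly like the sketch below. The names are assumptions made for illustration: clsSerbianText is a made-up name for the class module, Text1 is an ordinary VB6 TextBox, and rawFromDb stands in for whatever string comes back from the database.

```vb
' Form code. clsSerbianText is an assumed name for the class module
' holding the conversion code; Text1 is a standard TextBox on the form.
Private Converter As clsSerbianText

Private Sub Form_Load()
    Set Converter = New clsSerbianText
End Sub

' rawFromDb is assumed to hold the UTF-8 bytes read from the database,
' one byte per character.
Private Sub ShowRecord(ByVal rawFromDb As String)
    Text1.Text = Converter.UTF8ToANSI(rawFromDb)
End Sub

' Returns the edited text converted back for upload.
Private Function TextForUpload() As String
    TextForUpload = Converter.ANSIToUTF8(Text1.Text)
End Function
```

The textbox still needs its font character set and the system codepage to match Windows-1250, otherwise the ANSI codes will show up as the wrong glyphs.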
Just paste that entire code into a class module, and off you go. You get instant Serbian Unicode support in VB6.

Change the values in the Conversion collection to add your own characters to this class. Perhaps, if someone here from China or Japan tries this and it works, they could send me the character codes. If it works with all characters and all languages, I guess I could make some sort of project out of it. Actually, this is going to be a project; I'll publish it on my website. Open source, of course ;)

Hope someone found this useful.

EDIT:

It appears that the previous version of this code worked in some cases and still produced a bunch of ???'s in others. I have now added the StrConv() function inside ANSIToUTF8(), and as far as I have tested this iteration, the function now works. I also changed ChrW() to Chr() inside UTF8ToANSI(), in the part that handles conventional alphanumeric characters. Sorry 'bout this error ;)
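As an illustration of adding a character to the Conversion collection, here is a sketch for one more letter, "ü". AddLowercaseUUmlaut is a hypothetical procedure name; it would live inside the class and be called from Class_Initialize. The values assume the same setup as the rest of the table: ü is byte 252 (&HFC) in Windows-1250 and the two bytes C3 BC in UTF-8.

```vb
' Hypothetical addition to the class; call it from Class_Initialize.
Private Sub AddLowercaseUUmlaut()
    ' ANSI -> UTF8: key is the Windows-1250 code, value is the
    ' UTF-8 pair with the second byte (BC) high and the first (C3) low
    Conversion.Add &HBCC3, "252"
    ' UTF8 -> ANSI: key is the same pair as hex text, value is the ANSI code
    Conversion.Add 252, "BCC3"
End Sub
```

Every character needs both entries, one for each direction. Characters whose UTF-8 encoding is longer than two bytes would not fit this scheme without changing UTF8ToANSI.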

Edited by Galahad

Hello,

I don't understand which table you used for the conversion between ANSI and UTF-8.

Thanks,

Andrew


