Internationalization (I18N) - How to determine the encoding
This question is not answered.

This topic has 3 replies on 1 page.
Jwas_J
Posts:6
Registered: 9/22/07
How to determine the encoding   
Sep 25, 2007 9:11 AM
 
 
In my web application, we capture an Arabic name and store it in an Oracle database.
// while saving
 this.arabicDesc=new String(paymentTransactionTypeDetailVO.getArabicDesc().getBytes("ISO8859_1"),"UTF8") ;

// while retrieving from database
paymentTransactionTypeDetailVO.setArabicDesc(new String (arabicDesc.getBytes("UTF8"),"ISO8859_1"));

In all the JSP pages, we are using:
<META HTTP-EQUIV="Content-type" CONTENT="text/html; charset=UTF-8">

All these values display correctly in our JSP pages. But when we try to convert this Arabic data to Cp864 in an applet for printing, we get junk. We are not explicitly converting the data, just using the PrintStream class like this. Any idea why ????? is coming out?
FileOutputStream fos =  new FileOutputStream("LPT1");   
PrintStream pw =  new PrintStream(fos,true,"Cp864");

I checked the database also...
select * from nls_database_parameters
where parameter='NLS_CHARACTERSET';
NLS_CHARACTERSET	UTF8

The data reaches the applet like this:
<object
  id="ReceiptPrinterApplet"
  classid="clsid:CAFEEFAC-0015-0000-0007-ABCDEFFEDCBB"
  width="0" height="0" > 
   <param name="code" value="ReceiptPrinterApplet.class">   
   <param name="printermode" value="Broad">   
   <param name="Party Name1" value="KUANJIKOMBIL VARGHESE ALEXANDER (720216)- PO Box No:245">
   <param name="Party Name Arabic1" value="????? ?????? ?????? ???????">
</object>

Where could the problem be? How can we resolve this issue? We need to convert to Cp864 because the printer supports only Cp864.
 
one_dane
Posts:431
Registered: 8/11/03
Re: How to determine the encoding   
Sep 26, 2007 12:09 PM (reply 1 of 3)  (In reply to original post )
 
 
I can't answer your specific question, but the code you posted for creating byte arrays and strings cannot possibly work if you have correctly encoded Java strings to begin with. See http://globalizer.wordpress.com/2007/09/26/can-we-agree-that-all-those-iso8859_1-hacks-are-just-that-hacks/
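To make that concrete, here is a minimal sketch of what the "save" conversion from the original post does to two different inputs. The class name `Iso8859HackDemo` and the sample word are made up for illustration; only the `getBytes`/`new String` pattern comes from the post. It assumes the first input is a properly decoded Java string and the second is UTF-8 bytes that were wrongly decoded as ISO-8859-1 somewhere upstream.

```java
import java.nio.charset.StandardCharsets;

class Iso8859HackDemo {
    /** Same conversion as the "while saving" line in the original post. */
    static String saveHack(String s) {
        return new String(s.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // A correctly decoded Arabic word, written with escapes so the
        // source file encoding does not matter.
        String arabic = "\u0633\u0644\u0627\u0645";

        // Arabic letters do not exist in ISO-8859-1, so getBytes() replaces
        // each one with the '?' byte and the original text is gone.
        System.out.println(saveHack(arabic)); // prints "????"

        // The hack only "works" on input that was already mis-decoded:
        // UTF-8 bytes that something upstream read as ISO-8859-1.
        String misdecoded = new String(arabic.getBytes(StandardCharsets.UTF_8),
                                       StandardCharsets.ISO_8859_1);
        System.out.println(saveHack(misdecoded).equals(arabic)); // prints "true"
    }
}
```

So the hack is really a repair for a decoding mistake made earlier (for example, request parameters read with the wrong charset). Applied to good strings, it produces exactly the kind of question marks shown in the applet parameters.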
 
Jwas_J
Posts:6
Registered: 9/22/07
Re: How to determine the encoding   
Sep 26, 2007 1:07 PM (reply 2 of 3)  (In reply to original post )
 
 
Thanks for that article. It was very useful. In fact, the example resembles our code. As you said, the problem may be in the initial data itself. Data can either come from our application or have been converted to Unicode as part of migrating old records. Now, how can I find out the exact encoding of the data? If various input sources used their own encodings when the data was persisted and converted to UTF-8, then converting from UTF8 to ISO8859_1 during retrieval may not be correct, right? Does that mean there is no generic solution for this issue?
 
one_dane
Posts:431
Registered: 8/11/03
Re: How to determine the encoding   
Sep 26, 2007 1:22 PM (reply 3 of 3)  (In reply to #2 )
 
Unfortunately, based on your description of how you acquired the data, I believe you may be correct in saying that there is no generic solution. If your database is encoded in UTF-8, and you can confirm that the data is correct in the database, then you should be in reasonable shape - then it is a matter of figuring out where in your code you have some issues.
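One way to narrow down where things go wrong is to print the code points of the string at each stage (right after reading it from the database, before writing the applet parameter, inside the applet) instead of trusting how it looks on screen. The sketch below is only an illustration; `CodePointDump`, `dump` and `containsArabic` are names invented for this example, and the range check covers only the basic Arabic block.

```java
class CodePointDump {
    /** Returns the code points of s as space-separated U+XXXX values. */
    static String dump(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    /** True if at least one code point lies in the basic Arabic block U+0600..U+06FF. */
    static boolean containsArabic(String s) {
        return s.codePoints().anyMatch(cp -> cp >= 0x0600 && cp <= 0x06FF);
    }

    public static void main(String[] args) {
        String good = "\u0633\u0644\u0627\u0645";
        System.out.println(dump(good));             // prints "U+0633 U+0644 U+0627 U+0645"
        System.out.println(containsArabic(good));   // prints "true"
        System.out.println(containsArabic("????")); // prints "false"
    }
}
```

If the dump shows values in the U+0600 range, the string is fine at that point. Values such as U+00D8 or U+00D9 suggest UTF-8 bytes that were decoded as ISO-8859-1, and U+003F means the text had already been replaced by question marks before reaching that stage.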

If, on the other hand, you determine that part of the data is corrupted in the database because it may have been incorrectly converted when it was stored, then it becomes much more difficult to figure out what to do with it, and how to correct it (if that is possible at all - if one of the conversions was to a target encoding where the source code point does not exist, then the data is lost forever).
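Whether a given conversion would lose data can be checked before doing it, by asking the target charset's encoder. This is a rough sketch, not a complete validation routine; `EncodableCheck` is a hypothetical name. Cp864 may not be installed in every Java runtime, so the example only queries it when `Charset.isSupported` says it is available, and I make no claim about what it reports for these particular letters.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodableCheck {
    /** True if every character of s has a mapping in the target charset. */
    static boolean canEncode(String s, Charset target) {
        return target.newEncoder().canEncode(s);
    }

    public static void main(String[] args) {
        String arabic = "\u0633\u0644\u0627\u0645";

        // ISO-8859-1 has no Arabic letters, so converting would destroy the text.
        System.out.println(canEncode(arabic, StandardCharsets.ISO_8859_1)); // prints "false"

        // UTF-8 can represent every Unicode character.
        System.out.println(canEncode(arabic, StandardCharsets.UTF_8)); // prints "true"

        // Cp864 is an optional charset; the result depends on the JDK's mapping table.
        if (Charset.isSupported("Cp864")) {
            System.out.println(canEncode(arabic, Charset.forName("Cp864")));
        } else {
            System.out.println("Cp864 is not available in this runtime");
        }
    }
}
```

Running a check like this against the printer's charset before writing to the PrintStream would show whether the question marks come from the final conversion or from an earlier step.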

Converting UTF-8 encoded data to 8859-1 is really only possible for a small subset of data (data actually in Latin-1 scripts), and there is no good reason to do it, unless it is part of an attempt to correct a previous incorrect conversion (and that should only be attempted with extreme caution, in my opinion).
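If you do end up attempting such a correction, one cautious approach is to convert only when the result can be verified. The sketch below assumes the specific mistake was "UTF-8 bytes decoded as ISO-8859-1"; `MojibakeRepair` and `tryRepair` are names invented here. It refuses to touch a string unless every character fits in one byte and those bytes form strictly valid UTF-8. A successful strict decode is strong evidence, not proof, so spot-check results before updating real records.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

class MojibakeRepair {
    /**
     * Returns the repaired string if suspect looks like UTF-8 bytes that were
     * decoded as ISO-8859-1, or Optional.empty() if it should be left alone.
     */
    static Optional<String> tryRepair(String suspect) {
        boolean hasHighByte = false;
        for (int i = 0; i < suspect.length(); i++) {
            char c = suspect.charAt(i);
            if (c > 0xFF) {
                return Optional.empty(); // real non-Latin-1 text, not this kind of damage
            }
            if (c >= 0x80) {
                hasHighByte = true;
            }
        }
        if (!hasHighByte) {
            return Optional.empty(); // pure ASCII, nothing to repair
        }
        byte[] raw = suspect.getBytes(StandardCharsets.ISO_8859_1);
        try {
            // REPORT makes the decoder throw instead of silently substituting characters.
            String repaired = StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(raw))
                    .toString();
            return Optional.of(repaired);
        } catch (CharacterCodingException e) {
            return Optional.empty(); // bytes are not valid UTF-8, leave the data as is
        }
    }
}
```

Strings that already contain question marks in place of the Arabic text come back as `Optional.empty()`, which matches the point above: once the code points have been replaced, there is nothing left to recover.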
 