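There is a good article on encoding in Java. Its conclusion, in short: Java takes care of it, so you don't need to worry about it. A String holds Unicode text internally, and an encoding only comes into play when the String is converted to or from bytes.
Here is some test code to check this. It merely follows the code in that article.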
import java.io.UnsupportedEncodingException;

public class StringTest
{
    public static void main(String[] args)
    {
        String name = "정용식";
        byte[] strs;
        try
        {
            System.out.println( "original string:" + name );
            System.out.println( "default encoding:" + System.getProperty( "file.encoding" ));
            System.out.println( "========================");

            // 1. Encode and decode with the platform default encoding.
            strs = name.getBytes();
            System.out.println( "length:" + strs.length );
            System.out.println( "hex:" + binary2Hex( strs ) );
            System.out.println( "value:" + new String( strs ) );
            System.out.println();

            // 2. Encode and decode with UTF-8.
            strs = name.getBytes( "utf-8" );
            System.out.println( "length:" + strs.length );
            System.out.println( "hex:" + binary2Hex( strs ) );
            System.out.println( "value:" + new String( strs, "utf-8" ) );
            System.out.println();

            // 3. Rebuild the String from the UTF-8 bytes, then encode it with
            //    the default encoding again: the text is unchanged.
            name = new String( strs, "utf-8" );
            strs = name.getBytes();
            System.out.println( "length:" + strs.length );
            System.out.println( "hex:" + binary2Hex( strs ) );
            System.out.println( "value:" + name );
            System.out.println();

            // 4. Deliberate mismatch: encode as EUC-KR, decode as UTF-8.
            //    The bytes are not valid UTF-8, so the decoder substitutes
            //    U+FFFD. Despite the label, the bytes printed here use the
            //    default encoding, where U+FFFD is written as '?' (3F).
            name = new String( name.getBytes( "euc-kr" ), "utf-8" );
            strs = name.getBytes();
            System.out.println( "length:" + strs.length );
            System.out.println( "euc-kr hex:" + binary2Hex( strs ) );
            System.out.println( "value:" + name );
            System.out.println();

            // 5. The same damaged String encoded as UTF-8: each U+FFFD
            //    becomes EF BF BD.
            strs = name.getBytes( "utf-8" );
            System.out.println( "length:" + strs.length );
            System.out.println( "utf-8 hex:" + binary2Hex( strs ) );
            System.out.println( "value:" + name );
            System.out.println();
        }
        catch (UnsupportedEncodingException e)
        {
            // Report a missing charset instead of silently swallowing it.
            e.printStackTrace();
        }
    }

    // Formats each byte as two uppercase hex digits, each preceded by a space.
    public static String binary2Hex(byte[] buffer)
    {
        StringBuilder res = new StringBuilder();
        for (byte b : buffer)
        {
            // Mask to 0..255 so negative bytes are not sign-extended.
            res.append( ' ' ).append( String.format( "%02X", b & 0xFF ) );
        }
        return res.toString();
    }
}
The output of the code above is as follows.
original string:정용식
default encoding:x-windows-949
========================
length:6
hex: C1 A4 BF EB BD C4
value:정용식

length:9
hex: EC A0 95 EC 9A A9 EC 8B 9D
value:정용식

length:6
hex: C1 A4 BF EB BD C4
value:정용식

length:5
euc-kr hex: 3F 3F 3F 3F 3F
value:?????

length:15
utf-8 hex: EF BF BD EF BF BD EF BF BD EF BF BD EF BF BD
value:?????
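The pattern in this output can be reduced to a small sketch: text survives when it is encoded and decoded with the same charset, and breaks when the two differ. The class and method names below (CharsetRoundTrip, roundTrip, misdecode) are made up for illustration and are not from the referenced article. The sketch also assumes the JRE provides the EUC-KR charset, and it writes the name with Unicode escapes so that the result does not depend on the source file's encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetRoundTrip
{
    // Assumption: EUC-KR is available in the running JRE.
    static final Charset EUC_KR = Charset.forName("EUC-KR");

    // Encode and decode with the same charset: the text comes back unchanged.
    static String roundTrip(String text, Charset charset)
    {
        return new String(text.getBytes(charset), charset);
    }

    // Encode with one charset and decode with another: the mistake behind
    // the "?????" lines in the output above.
    static String misdecode(String text, Charset encodeWith, Charset decodeWith)
    {
        return new String(text.getBytes(encodeWith), decodeWith);
    }

    public static void main(String[] args)
    {
        // "\uC815\uC6A9\uC2DD" is the same name used in the test code above.
        String name = "\uC815\uC6A9\uC2DD";

        System.out.println("utf-8 round trip ok: "
                + roundTrip(name, StandardCharsets.UTF_8).equals(name));
        System.out.println("euc-kr round trip ok: "
                + roundTrip(name, EUC_KR).equals(name));

        // Invalid UTF-8 input is replaced with U+FFFD by the decoder.
        String broken = misdecode(name, EUC_KR, StandardCharsets.UTF_8);
        System.out.println("mismatch contains U+FFFD: "
                + (broken.indexOf('\uFFFD') >= 0));
    }
}
```

Passing a Charset object rather than a charset name also avoids the checked UnsupportedEncodingException, and naming the charset explicitly on both sides keeps the result independent of the platform default encoding.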
References