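There is a good article on character encoding in Java. Its conclusion, in short: Java handles encoding on its own, so you do not need to worry about it.
Below is some test code that checks this. It is simply adapted from the code in that article.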
public class StringTest {
    public static void main(String[] args) {
        String name = "정용식";
        byte[] strs;
        try {
            System.out.println("original string:" + name);
            System.out.println("default encoding:" + System.getProperty("file.encoding"));
            System.out.println("========================");

            // 1. Encode and decode with the platform default encoding.
            strs = name.getBytes();
            System.out.println("length:" + strs.length);
            System.out.println("hex:" + binary2Hex(strs));
            System.out.println("value:" + new String(strs));
            System.out.println();

            // 2. Encode and decode with UTF-8.
            strs = name.getBytes("utf-8");
            System.out.println("length:" + strs.length);
            System.out.println("hex:" + binary2Hex(strs));
            System.out.println("value:" + new String(strs, "utf-8"));
            System.out.println();

            // 3. Rebuild the string from the UTF-8 bytes, then encode it with the default encoding again.
            name = new String(strs, "utf-8");
            strs = name.getBytes();
            System.out.println("length:" + strs.length);
            System.out.println("hex:" + binary2Hex(strs));
            System.out.println("value:" + name);
            System.out.println();

            // 4. Mismatch: encode as EUC-KR but decode as UTF-8. The string is damaged from here on.
            //    The bytes printed under the "euc-kr hex" label are in the default encoding.
            name = new String(name.getBytes("euc-kr"), "utf-8");
            strs = name.getBytes();
            System.out.println("length:" + strs.length);
            System.out.println("euc-kr hex:" + binary2Hex(strs));
            System.out.println("value:" + name);
            System.out.println();

            // 5. Encode the damaged string as UTF-8.
            strs = name.getBytes("utf-8");
            System.out.println("length:" + strs.length);
            System.out.println("utf-8 hex:" + binary2Hex(strs));
            System.out.println("value:" + name);
            System.out.println();
        } catch (Exception e) {
            // Report the failure instead of silently swallowing it.
            e.printStackTrace();
        }
    }

    // Converts a byte array to uppercase hex, each byte written as a space followed by two digits.
    public static String binary2Hex(byte[] buffer) {
        String res = "";
        String token = "";
        for (int i = 0; i < buffer.length; i++) {
            // A negative byte is sign-extended to an int, e.g. "ffffffc1", so keep only the last two digits.
            token = Integer.toHexString(buffer[i]);
            if (token.length() > 2) {
                token = token.substring(token.length() - 2);
            } else if (token.length() < 2) {
                token = "0" + token;
            }
            res += " " + token;
        }
        return res.toUpperCase();
    }
}
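As an aside, the hex helper can be written more compactly. The following is only a sketch of one alternative, not the code from the referenced article: the class name `HexSketch` and the method name `toHex` are made up for illustration, and unlike `binary2Hex` it puts no space before the first byte.

```java
public class HexSketch {
    // Formats each byte as two uppercase hex digits, separated by single spaces.
    public static String toHex(byte[] buffer) {
        StringBuilder sb = new StringBuilder();
        for (byte b : buffer) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            // Masking with 0xFF turns the signed byte into 0..255,
            // so negative bytes are not sign-extended to eight hex digits.
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] sample = { (byte) 0xC1, (byte) 0xA4, 0x0A };
        System.out.println(toHex(sample)); // prints "C1 A4 0A"
    }
}
```

The `& 0xFF` mask replaces the substring and padding logic: once the value is in the range 0..255, `%02X` is enough to get exactly two digits.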
Running StringTest produces the following output.
original string:정용식
default encoding:x-windows-949
========================
length:6
hex: C1 A4 BF EB BD C4
value:정용식

length:9
hex: EC A0 95 EC 9A A9 EC 8B 9D
value:정용식

length:6
hex: C1 A4 BF EB BD C4
value:정용식

length:5
euc-kr hex: 3F 3F 3F 3F 3F
value:?????

length:15
utf-8 hex: EF BF BD EF BF BD EF BF BD EF BF BD EF BF BD
value:?????

Process finished with exit code 0
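The last two blocks of output are where things go wrong: the bytes were produced with EUC-KR but decoded as UTF-8, so the decoder substituted the replacement character U+FFFD, which then shows up as `3F` (`?`) in the default encoding and as `EF BF BD` in UTF-8. The sketch below isolates that step with explicit charsets so that the result does not depend on `file.encoding`. The class name `CharsetRoundTrip` and the helper `recode` are hypothetical names chosen for this illustration, and it assumes the JDK in use provides the EUC-KR charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetRoundTrip {
    // Encodes the text with one charset and decodes the resulting bytes with another.
    public static String recode(String text, Charset encodeWith, Charset decodeWith) {
        return new String(text.getBytes(encodeWith), decodeWith);
    }

    public static void main(String[] args) {
        // Assumption: EUC-KR is available in this JDK (forName throws otherwise).
        Charset eucKr = Charset.forName("EUC-KR");
        // "정용식" written with Unicode escapes so the source file's own encoding does not matter.
        String name = "\uC815\uC6A9\uC2DD";

        // Same charset on both sides: the string survives the round trip.
        String same = recode(name, eucKr, eucKr);
        System.out.println("matched round trip intact: " + same.equals(name));

        // Different charsets: the EUC-KR bytes are not valid UTF-8,
        // so the decoder substitutes U+FFFD and the original text is lost.
        String garbled = recode(name, eucKr, StandardCharsets.UTF_8);
        System.out.println("mismatched round trip intact: " + garbled.equals(name));
        System.out.println("contains U+FFFD: " + (garbled.indexOf('\uFFFD') >= 0));
    }
}
```

Read this way, "Java handles it" holds as long as bytes are decoded with the same charset that produced them. Passing `Charset` objects rather than charset names also avoids the checked `UnsupportedEncodingException` that the string-based overloads declare.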
Reference