Fixing Garbled Chinese Parameters in Tomcat 5.x

凌氏 2014-09-01


Heh, after tinkering all afternoon, I finally got it working.

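I had been using Tomcat 4.x as my server. Today I downloaded Tomcat 5.x, and when I ran my application every Chinese character passed in a request parameter came out garbled. I went through my code for a long time and found nothing wrong, and when I switched back to Tomcat 4.x everything worked again. After a lot of searching online I finally found the fix: the problem lies in Tomcat 5.x itself.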

Cause of the problem:

Tomcat 5.x changed how it handles the encoding of parameters in GET and POST requests. I started by studying the file $TOMCAT_HOME/webapps/tomcat-docs/config/http.html and found the following two attributes:

`URIEncoding`	This specifies the character encoding used to decode the URI bytes, after %xx decoding the URL. If not specified, ISO-8859-1 will be used.
`useBodyEncodingForURI`	This specifies if the encoding specified in contentType should be used for URI query parameters, instead of using the URIEncoding. This setting is present for compatibility with Tomcat 4.1.x, where the encoding specified in the contentType, or explicitly set using Request.setCharacterEncoding method was also used for the parameters from the URL. The default value is `false`.
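To get a feel for what `URIEncoding` controls, here is a small sketch that decodes the same percent-encoded query value with two different charsets. It uses the JDK's `java.net.URLDecoder` as a stand-in, not Tomcat's own decoder, and the class name `UriEncodingDemo` and the sample value are made up for illustration. It assumes the browser sent the Chinese text as UTF-8.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class UriEncodingDemo {

    // Decode a percent-encoded value with the given charset name.
    // The checked exception is wrapped so callers stay simple.
    static String decode(String value, String charsetName) {
        try {
            return URLDecoder.decode(value, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // "%E4%B8%AD%E6%96%87" is the UTF-8 percent-encoding of two Chinese characters.
        String raw = "%E4%B8%AD%E6%96%87";

        // Decoding with the charset the client actually used gives 2 characters.
        String asUtf8 = decode(raw, "UTF-8");
        System.out.println(asUtf8.length());

        // Decoding with ISO-8859-1 turns each of the 6 bytes into its own character.
        String asLatin1 = decode(raw, "ISO-8859-1");
        System.out.println(asLatin1.length());
    }
}
```

The same six bytes give two readable characters in one case and six meaningless ones in the other, which is the kind of garbling described above.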

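Both attributes are set on the <Connector/> element in the server.xml configuration file. According to the documentation, if neither attribute is configured, ISO-8859-1 is used. In practice, however, Tomcat 5.x does not actually handle it that way. Read the following code: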

// This is the function Tomcat uses to decode the query string.
// It lives in the class org.apache.tomcat.util.http.Parameters.
private String urlDecode(ByteChunk bc, String enc)
    throws IOException {
    if (urlDec == null) {
        urlDec = new UDecoder();
    }
    // Resolve the %xx escapes in place.
    urlDec.convert(bc);
    String result = null;
    if (enc != null) {
        // An encoding was supplied: decode the bytes with it.
        bc.setEncoding(enc);
        result = bc.toString();
    } else {
        // No encoding supplied: copy each byte into one char.
        CharChunk cc = tmpNameC;
        cc.allocate(bc.getLength(), -1);
        // Default encoding: fast conversion
        byte[] bbuf = bc.getBuffer();
        char[] cbuf = cc.getBuffer();
        int start = bc.getStart();
        for (int i = 0; i < bc.getLength(); i++) {
            cbuf[i] = (char) (bbuf[i + start] & 0xff);
        }
        cc.setChars(cbuf, 0, bc.getLength());
        result = cc.toString();
        cc.recycle();
    }
    return result;
}
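To see what the "fast conversion" branch does to multi-byte text, here is a standalone sketch of the same byte-to-char loop. It is not Tomcat code: the class `FastConversionDemo` and the method `fastConvert` are names I made up, and the sample input assumes the client sent the characters as UTF-8.

```java
import java.nio.charset.StandardCharsets;

public class FastConversionDemo {

    // Same idea as the loop in the else-branch above:
    // every byte is widened to exactly one char.
    static String fastConvert(byte[] bytes) {
        char[] chars = new char[bytes.length];
        for (int i = 0; i < bytes.length; i++) {
            chars[i] = (char) (bytes[i] & 0xff);
        }
        return new String(chars);
    }

    public static void main(String[] args) {
        // Two Chinese characters, written as escapes so the source file encoding does not matter.
        String original = "\u4e2d\u6587";
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);

        String converted = fastConvert(utf8);

        // Each of the 6 UTF-8 bytes became a separate char.
        System.out.println(converted.length());

        // The result is identical to decoding the bytes as ISO-8859-1.
        System.out.println(converted.equals(new String(utf8, StandardCharsets.ISO_8859_1)));

        // The original text is not recovered.
        System.out.println(converted.equals(original));
    }
}
```

Widening each byte to a char gives the same string as ISO-8859-1 decoding, so any parameter sent in a multi-byte encoding comes out garbled whenever `enc` is null.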