java.net.ProtocolException: Server redirected too many times (20): how can this be fixed?

Views: 1035   Posted: 2016-04-17 10:29:25.0
java.net.ProtocolException: Server redirected too many times (20)
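Looking for Java network programming experts to point me in the right direction!

I wrote a web crawler, and it throws this exception when it fetches Google result pages. How can I fix it?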

Java code
    private byte[] queryData() throws Exception {
        java.net.URL connUrl = new URL(url);
        java.net.HttpURLConnection conn = (HttpURLConnection) connUrl.openConnection();
        conn.setRequestProperty("User-agent", "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 2.0.50727; Maxthon 2.0)");
        java.io.InputStream input = conn.getInputStream();
        byte[] data = new byte[1024];
        int length = 0;
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        while ((length = input.read(data)) > 0) {
            baos.write(data, 0, length);
        }
        conn.disconnect();
        return baos.toByteArray();
    }



The URL is: http://www.google.com.hk/search?q=%E5%A6%87%E5%A5%B3&hl=zh-CN
The exception is as follows:

java.net.ProtocolException: Server redirected too many times (20)
  at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1315)
  at com.xdtech.platform.util.source.SourceFetch.queryData(SourceFetch.java:41)
  at com.xdtech.platform.util.source.SourceFetch.queryUrl(SourceFetch.java:29)
  at com.xdtech.platform.util.source.inter.AbstractSource.queryUrl(AbstractSource.java:72)
  at com.xdtech.platform.util.source.Template.SearchFilteByTemplateChange.filterByPages(SearchFilteByTemplateChange.java:187)
  at com.xdtech.platform.service.source.IndexSourceDataService.collectDataByPage(IndexSourceDataService.java:147)
  at com.xdtech.platform.core.service.SourceFetchExecutorPool$CategoryFetch.run(SourceFetchExecutorPool.java:107)

The frame at com.xdtech.platform.util.source.SourceFetch.queryData(SourceFetch.java:41) corresponds to this line in the code:
Java code
java.io.InputStream input = conn.getInputStream();



Experts, please help me out!

If I remove "&hl=zh-CN" from the URL, the exception goes away, but the returned content is in Traditional Chinese!





------Solution--------------------
Java code
        // Turn off automatic redirect handling and send back the cookie that each
        // 302 response sets; without it the server presumably keeps redirecting.
        // Note: this loop never ends if the server never answers with 200.
        String cookie = "";
        do {
            HttpURLConnection conn = (HttpURLConnection) new URL("http://www.google.com.hk/search?q=%E5%A6%87%E5%A5%B3&hl=zh-CN").openConnection();
            if (cookie.length() != 0)
                conn.setRequestProperty("Cookie", cookie);
            conn.setRequestProperty("User-Agent", "Mozilla/4.0 (compatible; MSIE 8.0)");
            conn.setInstanceFollowRedirects(false);
            int code = conn.getResponseCode();
            if (code == HttpURLConnection.HTTP_MOVED_TEMP) {
                cookie += conn.getHeaderField("Set-Cookie") + ";";
            }
            if (code == HttpURLConnection.HTTP_OK)
                break;
        } while (true);
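
The snippet above only loops until it sees a 200; it does not read the page, and it has no upper bound on retries. A slightly fuller sketch of the same idea might look like the following. It assumes the redirect loop really is caused by a cookie that is never sent back, which this thread suggests but does not prove. The class name RedirectFetch, its helper methods, and the MAX_HOPS limit of 10 are made up for illustration and are not from the original post.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.ProtocolException;
import java.net.URL;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the original post.
class RedirectFetch {
    static final int MAX_HOPS = 10; // arbitrary cap chosen for this sketch

    /** Returns the "name=value" part of a Set-Cookie header, dropping attributes such as Path or Expires. */
    static String cookiePair(String setCookie) {
        int semi = setCookie.indexOf(';');
        return (semi < 0 ? setCookie : setCookie.substring(0, semi)).trim();
    }

    /** Remembers every cookie from the given Set-Cookie header values (later values overwrite earlier ones). */
    static void store(Map<String, String> jar, List<String> setCookies) {
        if (setCookies == null) return;
        for (String sc : setCookies) {
            String pair = cookiePair(sc);
            int eq = pair.indexOf('=');
            if (eq > 0) jar.put(pair.substring(0, eq), pair.substring(eq + 1));
        }
    }

    /** Builds the value of the Cookie request header from the stored cookies. */
    static String cookieHeader(Map<String, String> jar) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : jar.entrySet()) {
            if (sb.length() > 0) sb.append("; ");
            sb.append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }

    /** Follows redirects by hand, carrying cookies along, and returns the body of the first 200 response. */
    static byte[] fetch(String startUrl) throws IOException {
        Map<String, String> jar = new LinkedHashMap<>();
        URL url = new URL(startUrl);
        for (int hop = 0; hop < MAX_HOPS; hop++) {
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setInstanceFollowRedirects(false);
            conn.setRequestProperty("User-Agent", "Mozilla/4.0 (compatible; MSIE 8.0)");
            if (!jar.isEmpty()) conn.setRequestProperty("Cookie", cookieHeader(jar));
            try {
                int code = conn.getResponseCode();
                // Header names are matched case-insensitively; the status line has a null key.
                for (Map.Entry<String, List<String>> h : conn.getHeaderFields().entrySet()) {
                    if ("Set-Cookie".equalsIgnoreCase(h.getKey())) store(jar, h.getValue());
                }
                if (code == HttpURLConnection.HTTP_OK) {
                    try (InputStream in = conn.getInputStream()) {
                        ByteArrayOutputStream out = new ByteArrayOutputStream();
                        byte[] buf = new byte[1024];
                        int n;
                        while ((n = in.read(buf)) != -1) out.write(buf, 0, n);
                        return out.toByteArray();
                    }
                }
                String location = conn.getHeaderField("Location");
                if (code / 100 != 3 || location == null) {
                    throw new IOException("Unexpected HTTP status " + code);
                }
                url = new URL(url, location); // resolves relative Location values too
            } finally {
                conn.disconnect();
            }
        }
        throw new ProtocolException("Gave up after " + MAX_HOPS + " redirects");
    }
}
```

With this in place, queryData() from the question could be reduced to something like return RedirectFetch.fetch(url); and a server that keeps redirecting would produce an exception after MAX_HOPS attempts instead of an endless loop.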