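I'm building a crawler and need to read the response body from the stream whether or not the status is 200. cURL does this, and so does any standard browser.
The code below doesn't return the requested content: even when the response has a body, an HTTP error status code causes an exception to be thrown. I want that output. Is there a way to get it? I'd prefer to keep using this library because it actually keeps connections persistent, which is ideal for the kind of crawling I'm doing.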
    package test;

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.net.MalformedURLException;
    import java.net.URL;
    import java.net.URLConnection;

    public class Test {
        public static void main(String[] args) {
            try {
                URL url = new URL("http://github.com/XXXXXXXXXXXXXX");
                URLConnection connection = url.openConnection();
                // getInputStream() throws an IOException when the server answers with an error status
                BufferedReader reader = new BufferedReader(new InputStreamReader(connection.getInputStream()));
                String inputLine;
                while ((inputLine = reader.readLine()) != null) {
                    System.out.println(inputLine);
                }
                reader.close();
            } catch (MalformedURLException me) {
                System.err.println("MalformedURLException: " + me);
            } catch (IOException ioe) {
                System.err.println("IOException: " + ioe);
            }
        }
    }
Thanks for the help. Here is what I came up with; it's only a rough proof of concept:
    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.net.MalformedURLException;
    import java.net.URL;
    import java.net.URLConnection;

    public class Test {
        public static void main(String[] args) {
            URLConnection connection = null;
            StringBuilder output = new StringBuilder();
            try {
                URL url = new URL("http://verelo.com/asdfrwdfgdg");
                connection = url.openConnection();
                BufferedReader reader = new BufferedReader(new InputStreamReader(connection.getInputStream()));
                String inputLine;
                while ((inputLine = reader.readLine()) != null) {
                    System.out.println(inputLine);
                }
                reader.close();
            } catch (MalformedURLException me) {
                System.err.println("MalformedURLException: " + me);
            } catch (IOException ioe) {
                System.err.println("IOException: " + ioe);
                // On an HTTP error status the body is available from getErrorStream();
                // it is null if the connection failed or the server sent no body.
                InputStream error = (connection instanceof HttpURLConnection)
                        ? ((HttpURLConnection) connection).getErrorStream() : null;
                if (error != null) {
                    try {
                        int data;
                        while ((data = error.read()) != -1) {
                            output.append((char) data);
                        }
                    } catch (IOException ex) {
                        System.err.println("IOException while reading error stream: " + ex);
                    } finally {
                        try { error.close(); } catch (IOException ignored) { }
                    }
                }
            }
            System.out.println(output);
        }
    }
It's simple:
    URLConnection connection = url.openConnection();
    InputStream is;
    if (connection instanceof HttpURLConnection) {
        HttpURLConnection httpConn = (HttpURLConnection) connection;
        // Check the status first: getInputStream() throws an IOException for error statuses.
        int statusCode = httpConn.getResponseCode();
        if (statusCode >= 400) {
            is = httpConn.getErrorStream();
        } else {
            is = httpConn.getInputStream();
        }
    } else {
        is = connection.getInputStream();
    }
You can refer to the Javadoc for an explanation. The way I would handle this is as follows:
    URLConnection connection = url.openConnection();
    InputStream is = null;
    try {
        is = connection.getInputStream();
    } catch (IOException ioe) {
        if (connection instanceof HttpURLConnection) {
            HttpURLConnection httpConn = (HttpURLConnection) connection;
            int statusCode = httpConn.getResponseCode();
            if (statusCode != 200) {
                // May be null if the server sent no body with the error.
                is = httpConn.getErrorStream();
            }
        }
    }
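For a crawler it may help to wrap this in one helper that returns both the status code and the body. The following is only a sketch under some assumptions: the names `FetchAnyStatus`, `Response`, and `fetch` are made up for illustration, the URL is assumed to be http or https (so the cast to `HttpURLConnection` is safe), the body is assumed to be UTF-8, and 400 is used as the threshold for switching to `getErrorStream()`.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class FetchAnyStatus {

    /** Hypothetical holder for this sketch: status code plus decoded body. */
    static final class Response {
        final int status;
        final String body;

        Response(int status, String body) {
            this.status = status;
            this.body = body;
        }
    }

    /** Reads the body whether the server answered with success or an error. */
    static Response fetch(URL url) throws IOException {
        // Assumption: url uses http or https, so this cast succeeds.
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        int status = conn.getResponseCode();

        // For error statuses the body is exposed through getErrorStream().
        InputStream in = status >= 400 ? conn.getErrorStream() : conn.getInputStream();
        if (in == null) {
            // The server sent an error status without any body.
            return new Response(status, "");
        }

        try (InputStream body = in) {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int n;
            while ((n = body.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
            // Assumption: UTF-8; a real crawler would inspect Content-Type.
            return new Response(status, new String(buf.toByteArray(), StandardCharsets.UTF_8));
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("usage: FetchAnyStatus <url>");
            return;
        }
        Response r = fetch(URI.create(args[0]).toURL());
        System.out.println("HTTP " + r.status);
        System.out.println(r.body);
    }
}
```

The helper reads the stream to the end and closes it instead of calling `disconnect()`. As far as I know, that is what allows the JDK to keep the underlying connection alive for the next request to the same host, which matters for the persistent-connection requirement in the question.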