Fetching Web Page Content with HttpClient

2014-11-23 17:39:06

  1. To download the content at a remote URL, you can use HttpClient. The relevant code is collected below:


  Method 1: Transcoding the stream


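  import java.io.BufferedReader;
  import java.io.IOException;
  import java.io.InputStream;
  import java.io.InputStreamReader;
  import java.io.UnsupportedEncodingException;
  import org.apache.commons.httpclient.HttpClient;
  import org.apache.commons.httpclient.HttpException;
  import org.apache.commons.httpclient.methods.GetMethod;

  // Reads the whole stream as GBK-encoded text, one line at a time.
  public String convertStreamToString(InputStream is) throws UnsupportedEncodingException {
      BufferedReader reader = new BufferedReader(new InputStreamReader(is, "gbk"));
      StringBuilder sb = new StringBuilder();
      String line;
      try {
          while ((line = reader.readLine()) != null) {
              // readLine() strips the line terminator, so add one back.
              sb.append(line).append('\n');
          }
      } catch (IOException e) {
          e.printStackTrace();
      } finally {
          try {
              // Closing the reader also closes the underlying stream.
              reader.close();
          } catch (IOException e) {
              e.printStackTrace();
          }
      }
      return sb.toString();
  }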
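
  The read loop above can be tried without any network access. The following is a minimal sketch, not the original author's code: the class name `StreamDecodingSketch`, the helper `readAll`, and the sample text are assumptions made for illustration. It takes the charset as a parameter instead of hard-coding "gbk", and it rethrows `IOException` as `UncheckedIOException` rather than printing the stack trace.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

// Hypothetical demo class; names and sample data are illustrative only.
final class StreamDecodingSketch {

    // Decodes the whole stream with the given charset and ends every line with '\n'.
    static String readAll(InputStream in, Charset charset) {
        StringBuilder sb = new StringBuilder();
        // try-with-resources closes the reader, which also closes the wrapped stream.
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumes the JDK ships the GBK charset, as standard desktop/server JDKs do.
        Charset gbk = Charset.forName("GBK");
        // Two lines of sample text, encoded as GBK bytes to stand in for a response body.
        byte[] body = "\u5929\u6daf\u8bba\u575b\nsecond line".getBytes(gbk);
        System.out.print(readAll(new ByteArrayInputStream(body), gbk));
    }
}
```

  Decoding the same bytes with a different charset would yield garbled text, which is why the charset passed to the reader has to match the one the page was actually written in.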


  // Download the content
  private String urlContent(String urlString) throws HttpException, IOException {
      HttpClient client = new HttpClient();
      // Example URL used in the original post:
      // http://www.tianya.cn/publicforum/articleslist/0/no20.shtml
      GetMethod get = new GetMethod(urlString);
      try {
          client.executeMethod(get);
          // Charset reported for the response; the body is still decoded as GBK below.
          System.out.print(get.getResponseCharSet());
          InputStream iStream = get.getResponseBodyAsStream();
          return convertStreamToString(iStream);
      } finally {
          // Always release the connection, even if the request fails.
          get.releaseConnection();
      }
  }
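
  The method above prints the charset reported for the response but still decodes the body as GBK. One possible refinement, sketched here under stated assumptions, is to choose the charset from a Content-Type header value and fall back to a default only when none is usable. `ContentTypeCharsetSketch` and `charsetFrom` are hypothetical names, and the parsing is deliberately simplified (it does not cover every form the HTTP specification allows).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper; not part of HttpClient or of the original post.
final class ContentTypeCharsetSketch {

    // Returns the charset named in a Content-Type value such as
    // "text/html; charset=GBK", or the fallback if none is present or usable.
    static Charset charsetFrom(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = p.substring("charset=".length()).trim();
                // Strip optional surrounding double quotes.
                if (name.length() >= 2 && name.startsWith("\"") && name.endsWith("\"")) {
                    name = name.substring(1, name.length() - 1);
                }
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Illegal or unsupported charset name: use the fallback.
                    return fallback;
                }
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        System.out.println(charsetFrom("text/html; charset=ISO-8859-1", StandardCharsets.UTF_8));
        System.out.println(charsetFrom("text/html", StandardCharsets.UTF_8));
    }
}
```

  The resulting `Charset` could then be passed to the `InputStreamReader` in place of the fixed "gbk" string. Pages that declare their encoding only in an HTML meta tag would still need separate handling.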


  With the GET method shown above, the content of the web page can be downloaded.
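
  For readers who do not have the Commons HttpClient jar on the classpath, the same GET-and-decode flow can be approximated with the JDK's own `java.net.HttpURLConnection`. This is a swapped-in technique rather than the approach used in the post, and it is only a sketch: the class name `UrlFetchSketch`, the method `fetch`, the 10-second timeouts, and the choice to treat any non-200 status as an error are all assumptions.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.Charset;

// Hypothetical JDK-only counterpart to urlContent(); names are illustrative.
final class UrlFetchSketch {

    // Sends a GET request and decodes the response body with the given charset.
    static String fetch(String urlString, Charset charset) {
        HttpURLConnection conn = null;
        try {
            conn = (HttpURLConnection) URI.create(urlString).toURL().openConnection();
            conn.setRequestMethod("GET");
            conn.setConnectTimeout(10_000); // assumed timeout, in milliseconds
            conn.setReadTimeout(10_000);    // assumed timeout, in milliseconds
            int status = conn.getResponseCode();
            if (status != HttpURLConnection.HTTP_OK) {
                throw new IOException("Unexpected HTTP status " + status + " for " + urlString);
            }
            try (InputStream in = conn.getInputStream()) {
                // Read raw bytes first, then decode once with the chosen charset.
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[8192];
                int n;
                while ((n = in.read(buf)) != -1) {
                    out.write(buf, 0, n);
                }
                return new String(out.toByteArray(), charset);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (conn != null) {
                conn.disconnect();
            }
        }
    }

    public static void main(String[] args) {
        // Defaults to the URL used in the post; pass another URL as the first argument.
        String url = args.length > 0
                ? args[0]
                : "http://www.tianya.cn/publicforum/articleslist/0/no20.shtml";
        System.out.print(fetch(url, Charset.forName("GBK")));
    }
}
```

  Reading the body as bytes and decoding it in one step keeps line terminators exactly as the server sent them, unlike the line-by-line loop in Method 1, which normalizes them to "\n".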

