The URL Class
2021-05-03 02:30
URL: a Uniform Resource Locator consists of four parts: the protocol, the domain name of the host that stores the resource, the port number, and the resource file name. A URL is a pointer to a "resource" on the Internet. The resource can be as simple as a file or a directory, or it can be a reference to a more complex object, such as a query to a database or a search engine.

Original source: https://www.cnblogs.com/LuJunlong/p/12123940.html

Code examples: (1) common URL methods (class TestURL in package aaa); (2) a first web crawler (class TestURL2 in package aaa).
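(1) Common URL methods

package aaa;

import java.net.MalformedURLException;
import java.net.URL;

public class TestURL {
    public static void main(String[] args) throws MalformedURLException {
        // Note: '#' precedes '?' here, so everything after '#' is parsed as the fragment, not as a query string
        URL url = new URL("https://www.baidu.com:80/index.html#aa?username=bjsxt&pwd=bjsxt");
        System.out.println("Protocol: " + url.getProtocol());
        System.out.println("Host: " + url.getHost());
        System.out.println("Port: " + url.getPort());
        // getFile() returns the path plus the query string, if there is one
        System.out.println("File: " + url.getFile());
        // getPath() returns the path only
        System.out.println("Path: " + url.getPath());
        System.out.println("Default port for the protocol: " + url.getDefaultPort());
    }
}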
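The example URL places `#aa` before `?username=...`, so `java.net.URL` treats everything after `#` as the fragment; `getFile()` and `getPath()` then return the same value. The sketch below compares that URL with a reordered variant in which the query comes before the fragment. The variant URL, the class name `UrlPartsDemo`, and the helper `describe` are assumptions made for illustration and do not appear in the original article.

```java
import java.net.MalformedURLException;
import java.net.URL;

class UrlPartsDemo {
    // Hypothetical helper: formats the components that java.net.URL exposes for a given spec.
    static String describe(String spec) {
        try {
            URL url = new URL(spec);
            return "path=" + url.getPath()
                    + " file=" + url.getFile()
                    + " query=" + url.getQuery()
                    + " ref=" + url.getRef();
        } catch (MalformedURLException e) {
            // Rethrow unchecked so callers do not need a throws clause.
            throw new IllegalArgumentException("Not a valid URL: " + spec, e);
        }
    }

    public static void main(String[] args) {
        // URL from the article: '#' comes first, so the "query" ends up inside the fragment.
        System.out.println(describe("https://www.baidu.com:80/index.html#aa?username=bjsxt&pwd=bjsxt"));
        // Assumed variant: query before fragment, so getQuery() and getRef() are both populated.
        System.out.println(describe("https://www.baidu.com:80/index.html?username=bjsxt&pwd=bjsxt#aa"));
    }
}
```

With the article's URL, `getQuery()` returns `null` and `getRef()` returns `aa?username=bjsxt&pwd=bjsxt`. With the reordered variant, `getFile()` includes the query string while `getPath()` does not.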
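(2) A first web crawler

package aaa;

import java.io.*;
import java.net.URL;
import java.nio.charset.StandardCharsets;

/**
 * Web crawler:
 * (1) fetch a resource from the network
 * (2) save it locally
 * @author Administrator
 */
public class TestURL2 {
    public static void main(String[] args) throws IOException {
        // Create the URL object
        URL url = new URL("https://www.baidu.com");
        // try-with-resources closes both streams even if reading or writing fails
        try (
            // Open a byte input stream on the URL and wrap it in a buffered UTF-8 reader
            BufferedReader br = new BufferedReader(
                    new InputStreamReader(url.openStream(), StandardCharsets.UTF_8));
            // Buffered UTF-8 writer for the local copy
            BufferedWriter bw = new BufferedWriter(
                    new OutputStreamWriter(new FileOutputStream("index.html"), StandardCharsets.UTF_8))
        ) {
            // Read and write line by line; the writer is flushed when it is closed
            String line;
            while ((line = br.readLine()) != null) {
                bw.write(line);
                bw.newLine();
            }
        }
    }
}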
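The crawler above decodes the page as UTF-8 text and rewrites it line by line, which changes line terminators and can garble pages served in another encoding. As a possible alternative, the sketch below copies the response byte for byte with `Files.copy` and sets timeouts on the connection. The class name `PageSaver`, the helper `save`, and the 5-second timeout values are assumptions for illustration, not part of the original article.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLConnection;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

class PageSaver {
    // Hypothetical helper: copies the resource at 'url' to 'target' unchanged
    // and returns the number of bytes written.
    static long save(URL url, Path target) {
        try {
            URLConnection conn = url.openConnection();
            conn.setConnectTimeout(5000); // assumed value, in milliseconds
            conn.setReadTimeout(5000);    // assumed value, in milliseconds
            try (InputStream in = conn.getInputStream()) {
                // Overwrite the target if it already exists.
                return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
            }
        } catch (IOException e) {
            // Rethrow unchecked so callers do not need a throws clause.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        long bytes = save(new URL("https://www.baidu.com"), Paths.get("index.html"));
        System.out.println("Saved " + bytes + " bytes to index.html");
    }
}
```

Because no character decoding takes place, the saved file matches what the server sent, whatever its encoding.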