1.2 Getting started with Lucene: environment setup and first program

2021-06-15 12:03


1.1 Requirements

Use Lucene to index and search the book records stored in a database.

1.2 Environment

- JDK: 1.7 or later

- Lucene: 4.10 (from version 4.8 onward, Lucene requires JDK 1.7 or later)

- IDE: Eclipse Indigo

- Database: MySQL 5

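1.3 Project setup

Add the following JARs to the project:

- MySQL driver JAR

- Lucene analysis JAR

- Lucene core JAR

- Lucene QueryParser JAR

- JUnit JAR (optional)

Create the PO class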

public class Book {
    // Book ID
    private Integer id;
    // Book title
    private String name;
    // Book price
    private Float price;
    // Book cover image
    private String pic;
    // Book description
    private String description;

    public Integer getId() {
        return id;
    }
    public void setId(Integer id) {
        this.id = id;
    }
    public String getName() {
        return name;
    }
    public void setName(String name) {
        this.name = name;
    }
    public Float getPrice() {
        return price;
    }
    public void setPrice(Float price) {
        this.price = price;
    }
    public String getPic() {
        return pic;
    }
    public void setPic(String pic) {
        this.pic = pic;
    }
    public String getDescription() {
        return description;
    }
    public void setDescription(String description) {
        this.description = description;
    }
}


DAO

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.util.ArrayList;
import java.util.List;

// BookDao is assumed to be an interface that declares queryBooks();
// the original post does not show it.
public class BookDaoImpl implements BookDao {

    @Override
    public List<Book> queryBooks() {
        // List of books read from the database
        List<Book> list = new ArrayList<Book>();

        try {
            // Load the MySQL JDBC driver
            Class.forName("com.mysql.jdbc.Driver");

            // SQL statement
            String sql = "SELECT * FROM book";

            // try-with-resources (JDK 1.7+) closes the connection,
            // the prepared statement and the result set automatically
            try (Connection connection = DriverManager.getConnection(
                         "jdbc:mysql://localhost:3306/solr", "root", "root");
                 PreparedStatement preparedStatement = connection.prepareStatement(sql);
                 ResultSet resultSet = preparedStatement.executeQuery()) {

                // Map each row of the result set to a Book
                while (resultSet.next()) {
                    Book book = new Book();
                    book.setId(resultSet.getInt("id"));
                    book.setName(resultSet.getString("name"));
                    book.setPrice(resultSet.getFloat("price"));
                    book.setPic(resultSet.getString("pic"));
                    book.setDescription(resultSet.getString("description"));
                    list.add(book);
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }

        return list;
    }

}


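Creating the index

How index creation works:

IndexWriter is the core component of the indexing process. It is used to create a new index, update an index, and delete documents from an index. IndexWriter relies on a Directory to store the index.

Directory describes where the index is stored. It wraps the underlying I/O operations and is responsible for persisting the index. It is an abstract class; its commonly used subclasses include FSDirectory (stores the index in the file system) and RAMDirectory (stores the index in memory).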

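// Imports needed by the test class
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;
import org.junit.Test;

    @Test
    public void createIndex() throws Exception {
        // Collect the data
        BookDao dao = new BookDaoImpl();
        List<Book> list = dao.queryBooks();

        // Wrap the collected data in Document objects
        List<Document> docList = new ArrayList<Document>();
        Document document;
        for (Book book : list) {
            document = new Document();
            // Store.YES means the field value is also stored in the document,
            // so it can be read back from search results
            // Book ID
            Field id = new TextField("id", book.getId().toString(), Store.YES);
            // Book title
            Field name = new TextField("name", book.getName(), Store.YES);
            // Book price
            Field price = new TextField("price", book.getPrice().toString(),
                    Store.YES);
            // Book image path
            Field pic = new TextField("pic", book.getPic(), Store.YES);
            // Book description
            Field description = new TextField("description",
                    book.getDescription(), Store.YES);

            // Add the fields to the Document
            document.add(id);
            document.add(name);
            document.add(price);
            document.add(pic);
            document.add(description);

            docList.add(document);
        }

        // Create the analyzer (standard analyzer)
        Analyzer analyzer = new StandardAnalyzer();

        // Create the IndexWriter
        IndexWriterConfig cfg = new IndexWriterConfig(Version.LUCENE_4_10_3,
                analyzer);
        // Location of the index
        File indexFile = new File("E:\\11-index\\hm19\\");
        Directory directory = FSDirectory.open(indexFile);
        IndexWriter writer = new IndexWriter(directory, cfg);

        // Write the Documents to the index through the IndexWriter
        for (Document doc : docList) {
            writer.addDocument(doc);
        }

        // Close the writer
        writer.close();
    }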
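To give some intuition for what an index holds after documents are added, the following is a toy inverted index in plain Java. It is only a conceptual sketch: the class ToyInvertedIndex and its methods are hypothetical names invented for this illustration, and Lucene's real index format and IndexWriter internals are considerably more elaborate.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Toy inverted index: term -> sorted set of document IDs.
// Hypothetical illustration only; this is not how Lucene stores its index.
public class ToyInvertedIndex {
    private final Map<String, SortedSet<Integer>> postings =
            new HashMap<String, SortedSet<Integer>>();
    private int nextDocId = 0;

    // Adds a document and returns the ID assigned to it.
    public int addDocument(String text) {
        int docId = nextDocId++;
        for (String term : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (term.isEmpty()) {
                continue; // split() can yield a leading empty string
            }
            SortedSet<Integer> ids = postings.get(term);
            if (ids == null) {
                ids = new TreeSet<Integer>();
                postings.put(term, ids);
            }
            ids.add(docId);
        }
        return docId;
    }

    // Returns the IDs of all documents containing the term, in ascending order.
    public List<Integer> lookup(String term) {
        SortedSet<Integer> ids = postings.get(term.toLowerCase(Locale.ROOT));
        if (ids == null) {
            return Collections.<Integer>emptyList();
        }
        return new ArrayList<Integer>(ids);
    }

    public static void main(String[] args) {
        ToyInvertedIndex index = new ToyInvertedIndex();
        index.addDocument("Java programming basics");
        index.addDocument("Lucene in action with Java");
        System.out.println(index.lookup("java"));   // prints [0, 1]
        System.out.println(index.lookup("lucene")); // prints [1]
    }
}
```

The point of the structure is that a lookup goes from a term straight to the documents containing it, instead of scanning every document at query time.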


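Analysis
Analysis in Lucene consists of two main steps: tokenization and filtering.

Tokenization: split the content of a field into individual tokens.
Filtering: process the resulting tokens, for example by removing punctuation, converting upper case to lower case, reducing words to their base form (plural to singular, past tense to present tense), and removing stop words.

Stop words: words that carry little meaning on their own, such as 的 and 啊 in Chinese, or this, is, a, the in English.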
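The two steps can be sketched in plain Java as follows. This is an illustration under stated assumptions, not Lucene's StandardAnalyzer: the class ToyAnalyzer and its four-word stop list are made up for the example, and reduction to base form is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical two-step analyzer: tokenize, then filter.
public class ToyAnalyzer {
    // Small illustrative stop-word list (an assumption, not Lucene's default set).
    private static final Set<String> STOP_WORDS =
            new HashSet<String>(Arrays.asList("this", "is", "a", "the"));

    public static List<String> analyze(String text) {
        List<String> terms = new ArrayList<String>();
        // Step 1: tokenization. Splitting on runs of characters that are
        // neither letters nor digits also discards punctuation.
        for (String token : text.split("[^\\p{L}\\p{N}]+")) {
            if (token.isEmpty()) {
                continue;
            }
            // Step 2a: filtering, convert to lower case
            String term = token.toLowerCase(Locale.ROOT);
            // Step 2b: filtering, drop stop words
            if (!STOP_WORDS.contains(term)) {
                terms.add(term);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(analyze("This is a Lucene book, the BEST one!"));
        // prints [lucene, book, best, one]
    }
}
```

Because the same processing has to be applied to the query text, the analyzer used at search time should match the one used at index time.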

Searching

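// Imports needed by the test class
import java.io.File;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.junit.Test;

    @Test
    public void indexSearch() throws Exception {
        // Create the Query object
        // QueryParser needs an analyzer; the analyzer used for searching
        // must be the same as the one used for indexing
        // First argument: name of the default field to search
        QueryParser parser = new QueryParser("description",
                new StandardAnalyzer());

        // Build the Query with the QueryParser
        // Argument: a Lucene query string (operators such as AND must be upper case)
        // "lucene" has no field prefix, so it is searched in the default field
        Query query = parser.parse("description:java AND lucene");

        // Create the IndexSearcher
        // Location of the index
        File indexFile = new File("E:\\11-index\\hm19\\");
        Directory directory = FSDirectory.open(indexFile);
        IndexReader reader = DirectoryReader.open(directory);
        IndexSearcher searcher = new IndexSearcher(reader);

        // Search the index
        // Second argument: return at most the top N hits
        TopDocs topDocs = searcher.search(query, 10);

        // Total number of documents matching the query
        int count = topDocs.totalHits;
        System.out.println("Total matches: " + count);
        // The matching documents that were returned
        ScoreDoc[] scoreDocs = topDocs.scoreDocs;

        for (ScoreDoc scoreDoc : scoreDocs) {
            // Lucene's internal document ID
            int docId = scoreDoc.doc;

            // Load the document by its ID
            Document doc = searcher.doc(docId);
            System.out.println("Product ID: " + doc.get("id"));
            System.out.println("Product name: " + doc.get("name"));
            System.out.println("Product price: " + doc.get("price"));
            System.out.println("Product image path: " + doc.get("pic"));
            System.out.println("==========================");
            // System.out.println("Product description: " + doc.get("description"));
        }
        // Release resources
        reader.close();
    }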
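One detail of the search code that is easy to miss: totalHits counts every document matching the query, while scoreDocs holds at most the N best hits requested. The plain-Java sketch below illustrates that distinction. The class TopNSketch and its Hit type are hypothetical stand-ins invented for this example, and the sort-then-truncate approach is chosen for clarity, not as a description of how Lucene collects results.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class TopNSketch {
    // Hypothetical hit type: a document ID plus a relevance score.
    public static final class Hit {
        public final int doc;
        public final float score;

        public Hit(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    // Returns the n highest-scoring hits, best first.
    // The input list is left unchanged.
    public static List<Hit> topN(List<Hit> allHits, int n) {
        List<Hit> sorted = new ArrayList<Hit>(allHits);
        Collections.sort(sorted, new Comparator<Hit>() {
            @Override
            public int compare(Hit a, Hit b) {
                return Float.compare(b.score, a.score); // descending by score
            }
        });
        return new ArrayList<Hit>(sorted.subList(0, Math.min(n, sorted.size())));
    }

    public static void main(String[] args) {
        List<Hit> all = new ArrayList<Hit>();
        all.add(new Hit(0, 0.4f));
        all.add(new Hit(1, 1.2f));
        all.add(new Hit(2, 0.9f));

        List<Hit> top = topN(all, 2);
        System.out.println("total hits: " + all.size());   // prints total hits: 3
        System.out.println("returned: " + top.size());     // prints returned: 2
        System.out.println("best doc: " + top.get(0).doc); // prints best doc: 1
    }
}
```

In the search method above, this means the loop over scoreDocs prints at most 10 books even when the reported total is larger.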


Original article: http://www.cnblogs.com/lht001/p/7274941.html
