Connecting to ElasticSearch from Java with RestClient

ElasticSearch

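Quick overview

Index

An index is similar to a database in a relational system: it is a collection of documents of the same kind.

Document

A document corresponds to a row in a relational database and is the basic unit that can be indexed.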

ElasticSearch        RDBMS
Index                Database
Type                 Table
Document             Row
Field                Column
Mapping              Schema
Full-text index      Index
Query DSL            SQL
GET                  SELECT
PUT/POST             INSERT/UPDATE
DELETE               DELETE

Maven dependencies

<project>
    <!--......................-->
    <properties>
        <elasticsearch-rest-client.version>7.13.2</elasticsearch-rest-client.version>
        <elasticsearch-rest-high-level-client.version>7.13.2</elasticsearch-rest-high-level-client.version>
    </properties>
    <dependencies>
        <!-- <dependency>-->
        <!-- <groupId>org.elasticsearch.client</groupId>-->
        <!-- <artifactId>elasticsearch-rest-client</artifactId>-->
        <!-- <version>${elasticsearch-rest-client.version}</version>-->
        <!-- </dependency>-->
        <dependency>
            <groupId>org.elasticsearch.client</groupId>
            <artifactId>elasticsearch-rest-high-level-client</artifactId>
            <version>${elasticsearch-rest-high-level-client.version}</version>
        </dependency>
        <!--......................-->
    </dependencies>
    <build>
        <plugins>
            <plugin>
                <!-- shade: bundle the packages below into a single jar, relocated under hidden.* -->
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-shade-plugin</artifactId>
                <version>3.2.0</version>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>shade</goal>
                        </goals>
                        <configuration>
                            <relocations>
                                <relocation>
                                    <pattern>org.apache.http</pattern>
                                    <shadedPattern>hidden.org.apache.http</shadedPattern>
                                </relocation>
                                <relocation>
                                    <pattern>org.apache.logging</pattern>
                                    <shadedPattern>hidden.org.apache.logging</shadedPattern>
                                </relocation>
                                <relocation>
                                    <pattern>org.apache.commons.codec</pattern>
                                    <shadedPattern>hidden.org.apache.commons.codec</shadedPattern>
                                </relocation>
                                <relocation>
                                    <pattern>org.apache.commons.logging</pattern>
                                    <shadedPattern>hidden.org.apache.commons.logging</shadedPattern>
                                </relocation>
                            </relocations>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
        </plugins>
        <!--......................-->
    </build>
    <repositories>
        <!-- add the elasticsearch repo -->
        <repository>
            <id>elasticsearch-releases</id>
            <url>https://artifacts.elastic.co/maven</url>
            <releases>
                <enabled>true</enabled>
            </releases>
            <snapshots>
                <enabled>false</enabled>
            </snapshots>
        </repository>
    </repositories>
    <pluginRepositories>
        <pluginRepository>
            <id>alimaven</id>
            <name>aliyun maven</name>
            <url>http://maven.aliyun.com/nexus/content/groups/public/</url>
            <releases>
                <enabled>true</enabled>
            </releases>
            <snapshots>
                <enabled>false</enabled>
            </snapshots>
        </pluginRepository>
    </pluginRepositories>
</project>

Step 1: index a document (IndexRequest)

Indexing a document

Create an index with a tokenizer

ES 7.13.2

Kibana
PUT posts2/
{
    see EsIndexConfig.json
}

Dev Tools

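Parameters:

  • posts: the index name
  • _doc: the type
  • id_1: the document ID (primary key); when an ID is included, the Java client sends the request as PUT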
POST /posts/_doc/id_1?refresh=wait_for&version=-4&timeout=10s
{
"tag2":"不"
}

Note: with opType=index the client sends [http://127.0.0.1:9200/posts/_doc?timeout=1m] with method [POST].
Note: with opType=create the client sends [http://127.0.0.1:9200/posts/_create?version=-4&timeout=1m] with method [POST].
However, /posts/_create must be followed by a document ID, as in /posts/_create/ID_xxx.
Normally op_type=create is passed as a URL query parameter instead.
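
The notes above can be reproduced with a small model of how the request line depends on the operation type and the document ID. This is only a sketch under assumptions, not the client's actual source: the class `EndpointSketch` and its methods are hypothetical names, and only the typeless `/{index}/_doc` and `/{index}/_create` paths are modelled.

```java
public class EndpointSketch {
    public enum OpType { INDEX, CREATE }

    /** Assumed rule: PUT when a document ID is supplied, POST otherwise. */
    public static String method(String id) {
        return id == null ? "POST" : "PUT";
    }

    /** Builds /{index}/_doc[/{id}] for INDEX and /{index}/_create[/{id}] for CREATE. */
    public static String path(String index, String id, OpType opType) {
        String action = opType == OpType.CREATE ? "_create" : "_doc";
        StringBuilder sb = new StringBuilder("/").append(index).append('/').append(action);
        if (id != null) {
            sb.append('/').append(id);
        }
        return sb.toString();
    }

    /** An _create path is only usable when it ends with a document ID. */
    public static boolean isValid(String id, OpType opType) {
        return opType == OpType.INDEX || id != null;
    }

    public static void main(String[] args) {
        // No ID, opType=index: matches the first note
        System.out.println(method(null) + " " + path("posts", null, OpType.INDEX));
        // No ID, opType=create: the path from the second note, which is rejected
        System.out.println(method(null) + " " + path("posts", null, OpType.CREATE)
                + " valid=" + isValid(null, OpType.CREATE));
        // With an ID, opType=create: the form that works
        System.out.println(method("id_1") + " " + path("posts", "id_1", OpType.CREATE)
                + " valid=" + isValid("id_1", OpType.CREATE));
    }
}
```

In this model the only invalid combination is create without an ID, which is the case the second note runs into.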

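The logic in brief

  • CREATE, no document ID: create
  • CREATE, document ID given, no version: check whether the document exists; if not, create
  • CREATE, document ID given, no version: check whether the document exists; if it does, error
  • CREATE, document ID given, version given: error; use INDEX when specifying a version
  • INDEX, no document ID: create
  • INDEX, document ID given, no version: check whether the ID exists; if it does, update
  • INDEX, document ID given, no version: check whether the ID exists; if not, create
  • INDEX, document ID given, version given: check whether the ID exists; if it does, validate the externally supplied version
  • INDEX, document ID given, version given: check whether the ID exists; if not, create
  • UPDATE: full update
  • UPDATE: partial update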
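
The CREATE and INDEX rows of the list above can be written down as a single decision function. The following is a minimal sketch, not server code: `WriteDecisionSketch`, `Op` and `Outcome` are hypothetical names, and the version check assumes the external version type, where a write succeeds only if the supplied version is strictly greater than the stored one.

```java
public class WriteDecisionSketch {
    public enum Op { CREATE, INDEX }

    public enum Outcome {
        CREATED, UPDATED, ERROR_ALREADY_EXISTS, ERROR_VERSION_NOT_ALLOWED, ERROR_VERSION_CONFLICT
    }

    /**
     * @param id              document ID, or null to let the server generate one
     * @param externalVersion version supplied by the caller, or null if none
     * @param storedVersion   version of the existing document, or null if it does not exist
     */
    public static Outcome decide(Op op, String id, Long externalVersion, Long storedVersion) {
        if (id == null) {
            return Outcome.CREATED; // generated ID: always a new document
        }
        boolean exists = storedVersion != null;
        if (op == Op.CREATE) {
            if (externalVersion != null) {
                return Outcome.ERROR_VERSION_NOT_ALLOWED; // use INDEX when a version is given
            }
            return exists ? Outcome.ERROR_ALREADY_EXISTS : Outcome.CREATED;
        }
        // op == INDEX
        if (!exists) {
            return Outcome.CREATED;
        }
        if (externalVersion == null) {
            return Outcome.UPDATED;
        }
        // Assumption: external versioning, new version must be strictly greater
        return externalVersion > storedVersion ? Outcome.UPDATED : Outcome.ERROR_VERSION_CONFLICT;
    }

    public static void main(String[] args) {
        System.out.println(decide(Op.CREATE, "id_1", null, null)); // CREATED
        System.out.println(decide(Op.CREATE, "id_1", null, 1L));   // ERROR_ALREADY_EXISTS
        System.out.println(decide(Op.INDEX, "id_1", 2L, 5L));      // ERROR_VERSION_CONFLICT
        System.out.println(decide(Op.INDEX, "id_1", 6L, 5L));      // UPDATED
    }
}
```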

Java

Common properties
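import java.io.IOException;

import org.elasticsearch.action.DocWriteRequest;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.index.IndexResponse;
import org.elasticsearch.action.support.WriteRequest;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.common.unit.TimeValue;
import org.elasticsearch.common.xcontent.XContentType;
import org.elasticsearch.index.VersionType;

public class Demo {
    /* Assumed to be created and configured elsewhere */
    private RestHighLevelClient restHighLevelClient;

    void demo() throws IOException {
        /* Index name */
        String indexName = "posts";
        IndexRequest indexRequest = new IndexRequest(indexName);
        /* Document source */
        String jsonString = "{\"user\":\"maxzhao\",\"addtime\":\"2013-01-30\",\"message\":\"add data Elasticsearch\"}";
        indexRequest.source(jsonString, XContentType.JSON);
        /* Routing value */
        indexRequest.routing("routing");
        /* Timeout as a TimeValue */
        indexRequest.timeout(TimeValue.timeValueSeconds(60));
        /* Timeout as a string (equivalent form) */
        indexRequest.timeout("1m");
        /* Refresh policy as an enum */
        indexRequest.setRefreshPolicy(WriteRequest.RefreshPolicy.WAIT_UNTIL);
        /* Refresh policy as a string: wait_for, true, false */
        indexRequest.setRefreshPolicy("wait_for");
        /* Document ID */
        indexRequest.id("id_1");
        /* Version */
        indexRequest.version(2);
        /* Version type */
        indexRequest.versionType(VersionType.EXTERNAL);
        /* Operation type: INDEX (default) or CREATE. CREATE requires a document ID
           and cannot be combined with an explicit version, so INDEX is used here */
        indexRequest.opType(DocWriteRequest.OpType.INDEX);
        /* Operation type as a string: "create" or "index" (default) */
        indexRequest.opType("index");
        /* Name of the pipeline to run before indexing; it must already exist */
        indexRequest.setPipeline("pipeline");
        /* Deprecated: indexRequest.type("_doc"); */
        /* Send the request and get the response */
        IndexResponse indexResponse = restHighLevelClient.index(indexRequest, RequestOptions.DEFAULT);
    }
}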
Four ways to supply the document source

Options one and two are recommended.

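import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Date;
import java.util.HashMap;
import java.util.Map;

import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.common.xcontent.XContentBuilder;
import org.elasticsearch.common.xcontent.XContentFactory;
import org.elasticsearch.common.xcontent.XContentType;

public class Demo {
    /**
     * Build an IndexRequest, supplying the source in four different ways
     */
    public void buildIndexRequest() {
        /* Option one: a JSON string */
        String indexName = "posts";
        IndexRequest indexRequest = new IndexRequest(indexName);
        String jsonString = "{" +
                "\"user\":\"maxzhao\"," +
                "\"message\":\"add data Elasticsearch\"" +
                "}";
        indexRequest.source(jsonString, XContentType.JSON);
        /* Option two: a Map, converted to JSON by the client */
        Map<String, Object> jsonMap = new HashMap<>();
        jsonMap.put("user", "maxzhao");
        jsonMap.put("addtime", new Date());
        jsonMap.put("message", "add data Elasticsearch");
        indexRequest = new IndexRequest(indexName)
                .source(jsonMap);
        /* Option three: an XContentBuilder */
        try {
            XContentBuilder builder = XContentFactory.jsonBuilder();
            builder.startObject();
            {
                builder.field("user", "maxzhao");
                builder.timeField("addtime", new Date());
                builder.field("message", "add data Elasticsearch");
            }
            builder.endObject();
            indexRequest = new IndexRequest(indexName)
                    .source(builder);
        } catch (IOException e) {
            /* Do not swallow the failure silently */
            throw new UncheckedIOException(e);
        }
        /* Option four: alternating field names and values */
        indexRequest = new IndexRequest(indexName)
                .source("user", "maxzhao",
                        "addtime", new Date(),
                        "message", "add data Elasticsearch");
    }
}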
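
Option four relies on the arguments alternating between field name and value. As an illustration of what that implies, here is a minimal sketch of folding such an argument list into a map. `KeyValueSourceSketch.toMap` is a hypothetical helper written for this example, not the client's implementation, and it assumes an odd number of arguments should be rejected.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class KeyValueSourceSketch {
    /** Folds name1, value1, name2, value2, ... into an insertion-ordered map. */
    public static Map<String, Object> toMap(Object... keyValues) {
        if (keyValues.length % 2 != 0) {
            throw new IllegalArgumentException(
                    "expected an even number of arguments but got " + keyValues.length);
        }
        Map<String, Object> map = new LinkedHashMap<>();
        for (int i = 0; i < keyValues.length; i += 2) {
            // Field names are taken from the even positions, values from the odd ones
            map.put(String.valueOf(keyValues[i]), keyValues[i + 1]);
        }
        return map;
    }

    public static void main(String[] args) {
        System.out.println(toMap("user", "maxzhao", "message", "add data Elasticsearch"));
    }
}
```

A dangling field name with no value is easy to produce with this style, which is one reason the JSON string and Map forms are easier to review.
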
Sending the request
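import java.io.IOException;

import org.elasticsearch.ElasticsearchException;
import org.elasticsearch.action.ActionListener;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.index.IndexResponse;
import org.elasticsearch.client.Cancellable;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.rest.RestStatus;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class Demo {
    /* Assumes SLF4J is on the classpath */
    private static final Logger log = LoggerFactory.getLogger(Demo.class);
    /* Assumed to be created and configured elsewhere */
    private RestHighLevelClient restHighLevelClient;

    void demo(IndexRequest indexRequest) {
        /* Synchronous */
        try {
            IndexResponse indexResponse = restHighLevelClient.index(indexRequest, RequestOptions.DEFAULT);
            handleIndexResponse(indexResponse);
        } catch (ElasticsearchException e) {
            log.warn(e.toString());
            /* Thrown on a version conflict. The same happens when opType is
               create and a document with the same index and ID already exists */
            if (e.status() == RestStatus.CONFLICT) {
                /* Handle the conflict here */
            }
        } catch (IOException e) {
            log.warn(e.toString());
        }
        /* Asynchronous; the returned handle can be used to cancel the request */
        Cancellable cancellable = restHighLevelClient.indexAsync(indexRequest, RequestOptions.DEFAULT, new ActionListener<IndexResponse>() {
            @Override
            public void onResponse(IndexResponse indexResponse) {
                handleIndexResponse(indexResponse);
            }

            @Override
            public void onFailure(Exception e) {
                log.warn(e.toString());
            }
        });
    }

    /* Placeholder: the original does not show how the response is handled */
    private void handleIndexResponse(IndexResponse indexResponse) {
        log.info(indexResponse.toString());
    }
}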
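
The asynchronous call above reports its outcome through a listener with an onResponse and an onFailure method. When callers would rather compose results, a callback of that shape can be bridged to a `CompletableFuture`. The sketch below uses only the JDK: `Listener` and `AsyncCall` are hypothetical stand-ins defined here for illustration, not Elasticsearch types.

```java
import java.util.concurrent.CompletableFuture;

public class AsyncBridgeSketch {
    /** Stand-in for a listener with the two callbacks used above. */
    public interface Listener<T> {
        void onResponse(T response);
        void onFailure(Exception e);
    }

    /** Stand-in for any call that accepts a listener and reports to it later. */
    public interface AsyncCall<T> {
        void execute(Listener<T> listener);
    }

    /** Completes the future from whichever callback fires. */
    public static <T> CompletableFuture<T> toFuture(AsyncCall<T> call) {
        CompletableFuture<T> future = new CompletableFuture<>();
        call.execute(new Listener<T>() {
            @Override
            public void onResponse(T response) {
                future.complete(response);
            }

            @Override
            public void onFailure(Exception e) {
                future.completeExceptionally(e);
            }
        });
        return future;
    }

    public static void main(String[] args) {
        CompletableFuture<String> ok = toFuture(l -> l.onResponse("created"));
        System.out.println(ok.join()); // prints "created"
        CompletableFuture<String> failed = toFuture(l -> l.onFailure(new IllegalStateException("conflict")));
        System.out.println(failed.isCompletedExceptionally()); // prints "true"
    }
}
```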

Update (UpdateRequest)

Delete (DeleteRequest)

Bulk requests (BulkRequest)

Reindex (ReindexRequest)

Copies documents from one or more source indices into a destination index.
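
To make "one or more source indices" concrete, the sketch below builds the JSON body that the REST `_reindex` endpoint accepts, with a list of sources and a single destination. It is a plain-Java illustration rather than a use of ReindexRequest: `ReindexBodySketch` is a hypothetical helper, `posts_copy` is an invented destination name, and no JSON escaping is done because index names cannot contain quotes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class ReindexBodySketch {
    /** Returns {"source":{"index":[...]},"dest":{"index":"..."}} for the given names. */
    public static String body(List<String> sourceIndices, String destIndex) {
        if (sourceIndices.isEmpty()) {
            throw new IllegalArgumentException("at least one source index is required");
        }
        String sources = sourceIndices.stream()
                .map(name -> "\"" + name + "\"")
                .collect(Collectors.joining(","));
        return "{\"source\":{\"index\":[" + sources + "]},"
                + "\"dest\":{\"index\":\"" + destIndex + "\"}}";
    }

    public static void main(String[] args) {
        // Body for: POST _reindex
        System.out.println(body(Arrays.asList("posts", "posts2"), "posts_copy"));
    }
}
```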

Get (GetRequest, GetSourceRequest)

Multi-get (MultiGetRequest)

Search (SearchRequest)

  1. Structured search
  2. Full-text search

In more detail

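Synonyms in an index

At index creation

Configure this when creating the index. A filter marked updateable can be reloaded without recreating the index, but Elasticsearch only permits it in search-time analyzers; drop the flag if the analyzer is to be applied at index time.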

{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_hanlp_analyzer": {
          "type": "custom",
          "tokenizer": "hanlp",
          "char_filter": [
            "html_strip"
          ],
          "filter": [
            "my_synonym"
          ]
        }
      },
      "filter": {
        "my_synonym": {
          "type": "synonym_graph",
          "synonyms_path": "analysis/synonym/synonym.txt",
          "updateable": true
        }
      }
    }
  }
}

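Configuration file

Path: ${ES_HOME}/config/analysis/synonym/synonym.txt (synonyms_path is resolved relative to the Elasticsearch config directory)

Synonym rule formats

  • 中文,汉语,汉字
    • With this form, any of these terms found during analysis is expanded to all three, so 中文, 汉语 and 汉字 are all emitted as tokens.
  • 毛衣,毛裤 => 线服
    • With this form, 毛衣 and 毛裤 are both rewritten to 线服 during analysis, and only 线服 is emitted as a token.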
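
The difference between the two rule formats can be seen by parsing them into a lookup table. The sketch below is a simplified model and not the filter's implementation: `SynonymRuleSketch` is a hypothetical class, single-word terms are assumed, a later rule for the same term simply overwrites an earlier one, and ASCII stand-ins are used in place of the rules above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SynonymRuleSketch {
    /** Maps each input term to the list of terms emitted in its place. */
    public static Map<String, List<String>> parse(List<String> lines) {
        Map<String, List<String>> table = new LinkedHashMap<>();
        for (String line : lines) {
            String rule = line.trim();
            if (rule.isEmpty() || rule.startsWith("#")) {
                continue; // blank lines and comments
            }
            if (rule.contains("=>")) {
                // Explicit mapping: every left-hand term is replaced by the right-hand terms
                String[] sides = rule.split("=>", 2);
                List<String> targets = terms(sides[1]);
                for (String input : terms(sides[0])) {
                    table.put(input, targets);
                }
            } else {
                // Equivalent terms: each term expands to the whole group
                List<String> group = terms(rule);
                for (String input : group) {
                    table.put(input, group);
                }
            }
        }
        return table;
    }

    private static List<String> terms(String commaSeparated) {
        List<String> result = new ArrayList<>();
        for (String part : commaSeparated.split(",")) {
            String term = part.trim();
            if (!term.isEmpty()) {
                result.add(term);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> table = parse(Arrays.asList(
                "chinese, mandarin, hanzi",
                "sweater, leggings => knitwear"));
        System.out.println(table.get("mandarin")); // [chinese, mandarin, hanzi]
        System.out.println(table.get("sweater"));  // [knitwear]
    }
}
```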

Tokenization

See the HanLP documentation.

Token filters / case-insensitive matching

Configure this when creating the index.

{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1,
    "analysis": {
      "analyzer": {
        "my_hanlp_analyzer": {
          "type": "custom",
          "tokenizer": "my_hanlp",
          "char_filter": [
            "html_strip"
          ],
          "filter": [
            "my_lowercase"
          ]
        },
        "default": {
          "type": "hanlp"
        }
      },
      "tokenizer": {
        "my_hanlp": {
          "type": "hanlp",
          "enable_stop_dictionary": false,
          "enable_custom_config": false
        }
      },
      "filter": {
        "my_lowercase": {
          "type": "lowercase",
          "language": "greek"
        }
      }
    }
  }
}
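
Why a lowercase filter gives case-insensitive matching can be shown with a toy analyzer: if both the indexed text and the query pass through the same lowercasing step, their tokens compare equal regardless of the original case. This is a rough sketch only. `LowercaseFilterSketch` is a hypothetical class, and whitespace splitting stands in for the real tokenizer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class LowercaseFilterSketch {
    /** Toy analyzer: split on whitespace, then lowercase every token. */
    public static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                tokens.add(token.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    /** A query matches when all of its analyzed tokens occur in the analyzed text. */
    public static boolean matches(String indexedText, String query) {
        return analyze(indexedText).containsAll(analyze(query));
    }

    public static void main(String[] args) {
        System.out.println(analyze("Add Data ElasticSearch"));
        System.out.println(matches("Add Data ElasticSearch", "elasticsearch ADD"));
    }
}
```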

Appendix

Article URL: https://github.com/maxzhao-it/blog/post/34329dsafadsfasdfgh/