ElasticSearch
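Quick overview
Index
An index is analogous to a database in an RDBMS: it is a collection of documents of the same kind.
Document
A document corresponds to a row in a relational database and is the basic unit that can be indexed.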
| ElasticSearch | RDBMS |
| --- | --- |
| Index | Database |
| Type | Table |
| Document | Row |
| Field | Column |
| Mapping | Schema |
| Full-text index | Index |
| Query DSL | SQL |
| GET | select |
| PUT/POST | insert / update |
| DELETE | delete |
Maven dependencies

    <project>
      <properties>
        <elasticsearch-rest-client.version>7.13.2</elasticsearch-rest-client.version>
        <elasticsearch-rest-high-level-client.version>7.13.2</elasticsearch-rest-high-level-client.version>
      </properties>
      <dependencies>
        <dependency>
          <groupId>org.elasticsearch.client</groupId>
          <artifactId>elasticsearch-rest-high-level-client</artifactId>
          <version>${elasticsearch-rest-high-level-client.version}</version>
        </dependency>
      </dependencies>
      <build>
        <plugins>
          <!-- Relocate the client's transitive dependencies to avoid version clashes -->
          <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-shade-plugin</artifactId>
            <version>3.2.0</version>
            <executions>
              <execution>
                <phase>package</phase>
                <goals>
                  <goal>shade</goal>
                </goals>
                <configuration>
                  <relocations>
                    <relocation>
                      <pattern>org.apache.http</pattern>
                      <shadedPattern>hidden.org.apache.http</shadedPattern>
                    </relocation>
                    <relocation>
                      <pattern>org.apache.logging</pattern>
                      <shadedPattern>hidden.org.apache.logging</shadedPattern>
                    </relocation>
                    <relocation>
                      <pattern>org.apache.commons.codec</pattern>
                      <shadedPattern>hidden.org.apache.commons.codec</shadedPattern>
                    </relocation>
                    <relocation>
                      <pattern>org.apache.commons.logging</pattern>
                      <shadedPattern>hidden.org.apache.commons.logging</shadedPattern>
                    </relocation>
                  </relocations>
                </configuration>
              </execution>
            </executions>
          </plugin>
        </plugins>
      </build>
      <repositories>
        <repository>
          <id>elasticsearch-releases</id>
          <url>https://artifacts.elastic.co/maven</url>
          <releases>
            <enabled>true</enabled>
          </releases>
          <snapshots>
            <enabled>false</enabled>
          </snapshots>
        </repository>
      </repositories>
      <pluginRepositories>
        <pluginRepository>
          <id>alimaven</id>
          <name>aliyun maven</name>
          <url>http://maven.aliyun.com/nexus/content/groups/public/</url>
          <releases>
            <enabled>true</enabled>
          </releases>
          <snapshots>
            <enabled>false</enabled>
          </snapshots>
        </pluginRepository>
      </pluginRepositories>
    </project>

Step 1: index a document (IndexRequest)
Indexing a document
Creating an index with an analyzer
ES 7.13.2, run from the Kibana console:

    PUT posts2/
    { see EsIndexConfig.json }

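Parameters:
- posts: the global index name
- _doc: the type
- id_1: the document ID (primary key); when the ID is supplied, the request can be sent as PUT

Example request:

    POST /posts/_doc/id_1?refresh=wait_for&version=-4&timeout=10s
    { "tag2":"不" }
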
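Note: with opType=index, the resulting request is:
[http://127.0.0.1:9200/posts/_doc?timeout=1m] and method [POST]
Note: with opType=create, the resulting request is:
[http://127.0.0.1:9200/posts/_create?version=-4&timeout=1m] and method [POST]
However, /posts/_create must be followed by a document ID, as in /posts/_create/ID_xxx.
Normally, op_type=create is passed as a URL query parameter (&op_type=create).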
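The two endpoint forms above can be summarised in a small helper. This is only a sketch of the REST path rules: `IndexEndpoints` and `pathFor` are hypothetical names, not part of any Elasticsearch client, and the code assumes index names and IDs that need no URL encoding.

```java
import java.util.Objects;

/** Hypothetical helper; it only builds path strings and never contacts Elasticsearch. */
final class IndexEndpoints {
    private IndexEndpoints() {}

    /**
     * Builds the REST path for indexing a single document.
     *
     * @param index      target index name
     * @param id         document ID, or null to let Elasticsearch generate one
     * @param createOnly true if the write should fail when the document already exists
     */
    static String pathFor(String index, String id, boolean createOnly) {
        Objects.requireNonNull(index, "index");
        if (id == null) {
            // Without an ID, /_create cannot be used: pass op_type as a query parameter.
            return "/" + index + "/_doc" + (createOnly ? "?op_type=create" : "");
        }
        // With an ID, /_create/<id> behaves like /_doc/<id>?op_type=create.
        return "/" + index + (createOnly ? "/_create/" : "/_doc/") + id;
    }
}
```

Under these assumptions, `IndexEndpoints.pathFor("posts", "id_1", true)` yields `/posts/_create/id_1`, while passing `null` as the ID yields `/posts/_doc?op_type=create`.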
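The basic logic
- CREATE - no document ID: create
- CREATE - document ID given - no version - check whether the document exists - it does not: create
- CREATE - document ID given - no version - check whether the document exists - it does: error
- CREATE - document ID given - version given: error (use INDEX when specifying a version)
- INDEX - no document ID: create
- INDEX - document ID given - no version - check whether the document ID exists - it does: update
- INDEX - document ID given - no version - check whether the document ID exists - it does not: create
- INDEX - document ID given - version given - check whether the document ID exists - it does: validate the externally supplied version
- INDEX - document ID given - version given - check whether the document ID exists - it does not: create
- UPDATE - full update
- UPDATE - partial update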
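The CREATE and INDEX rules in the list above can be written as one decision function. This is a sketch that models the list only: `WriteDecision`, `OpType` and the returned strings are hypothetical, and nothing here calls Elasticsearch.

```java
/** Hypothetical model of the CREATE/INDEX rules; the outcome strings are illustrative. */
final class WriteDecision {
    enum OpType { CREATE, INDEX }

    private WriteDecision() {}

    /**
     * @param op         requested operation type
     * @param hasId      whether the request carries a document ID
     * @param hasVersion whether the request carries an explicit version
     * @param exists     whether a document with that ID already exists
     */
    static String decide(OpType op, boolean hasId, boolean hasVersion, boolean exists) {
        if (!hasId) {
            // Elasticsearch generates the ID, so the write is always a create.
            return "create";
        }
        if (op == OpType.CREATE) {
            if (hasVersion) {
                return "error: use INDEX when specifying a version";
            }
            return exists ? "error: document already exists" : "create";
        }
        // INDEX with an explicit ID.
        if (!exists) {
            return "create";
        }
        return hasVersion ? "update if the external version check passes" : "update";
    }
}
```

For instance, `WriteDecision.decide(WriteDecision.OpType.INDEX, true, false, true)` returns `"update"`, matching the sixth rule in the list.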
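Java
Common properties
The setters below are collected for reference. Several are alternative spellings of the same option, and not every combination is valid in a single request (for example, CREATE cannot be combined with an explicit version, as listed above), so treat this as a catalogue rather than a request to send unchanged.

    import java.io.IOException;
    import org.elasticsearch.action.DocWriteRequest;
    import org.elasticsearch.action.index.IndexRequest;
    import org.elasticsearch.action.index.IndexResponse;
    import org.elasticsearch.action.support.WriteRequest;
    import org.elasticsearch.client.RequestOptions;
    import org.elasticsearch.client.RestHighLevelClient;
    import org.elasticsearch.common.unit.TimeValue;
    import org.elasticsearch.common.xcontent.XContentType;
    import org.elasticsearch.index.VersionType;

    public class IndexRequestOptionsDemo {
        private final RestHighLevelClient restHighLevelClient;

        public IndexRequestOptionsDemo(RestHighLevelClient restHighLevelClient) {
            this.restHighLevelClient = restHighLevelClient;
        }

        void demo() throws IOException {
            String indexName = "posts";
            IndexRequest indexRequest = new IndexRequest(indexName);
            String jsonString = "{\"user\":\"maxzhao\",\"addtime\":\"2013-01-30\",\"message\":\"add data Elasticsearch\"}";
            indexRequest.source(jsonString, XContentType.JSON);

            // Timeout waiting for the primary shard: as a TimeValue or as a string
            indexRequest.timeout(TimeValue.timeValueSeconds(60));
            indexRequest.timeout("1m");
            // Refresh policy: as an enum or as a string
            indexRequest.setRefreshPolicy(WriteRequest.RefreshPolicy.WAIT_UNTIL);
            indexRequest.setRefreshPolicy("wait_for");
            // Document ID (the original called the no-argument getter, which has no effect)
            indexRequest.id("id_1");
            // Version and version type
            indexRequest.version(2);
            indexRequest.versionType(VersionType.EXTERNAL);
            // Operation type: as an enum or as a string (not valid together with an explicit version)
            indexRequest.opType(DocWriteRequest.OpType.CREATE);
            indexRequest.opType("create");
            // Name of the ingest pipeline to run before indexing
            indexRequest.setPipeline("pipeline");

            IndexResponse indexResponse = restHighLevelClient.index(indexRequest, RequestOptions.DEFAULT);
        }
    }
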
Four ways to supply the document source
Options 1 and 2 are recommended.

    import java.io.IOException;
    import java.io.UncheckedIOException;
    import java.util.Date;
    import java.util.HashMap;
    import java.util.Map;
    import org.elasticsearch.action.index.IndexRequest;
    import org.elasticsearch.common.xcontent.XContentBuilder;
    import org.elasticsearch.common.xcontent.XContentFactory;
    import org.elasticsearch.common.xcontent.XContentType;

    public class IndexSourceDemo {

        public void buildIndexRequests() {
            String indexName = "posts";

            // Option 1: a JSON string
            IndexRequest indexRequest = new IndexRequest(indexName);
            String jsonString = "{"
                    + "\"user\":\"maxzhao\","
                    + "\"message\":\"add data Elasticsearch\""
                    + "}";
            indexRequest.source(jsonString, XContentType.JSON);

            // Option 2: a Map, converted to JSON automatically
            Map<String, Object> jsonMap = new HashMap<>();
            jsonMap.put("user", "maxzhao");
            jsonMap.put("addtime", new Date());
            jsonMap.put("message", "add data Elasticsearch");
            indexRequest = new IndexRequest(indexName)
                    .source(jsonMap);

            // Option 3: an XContentBuilder
            try {
                XContentBuilder builder = XContentFactory.jsonBuilder();
                builder.startObject();
                {
                    builder.field("user", "maxzhao");
                    builder.timeField("addtime", new Date());
                    builder.field("message", "add data Elasticsearch");
                }
                builder.endObject();
                indexRequest = new IndexRequest(indexName)
                        .source(builder);
            } catch (IOException e) {
                // The original swallowed this exception; surface it instead
                throw new UncheckedIOException(e);
            }

            // Option 4: alternating key/value arguments
            indexRequest = new IndexRequest(indexName)
                    .source("user", "maxzhao",
                            "addtime", new Date(),
                            "message", "add data Elasticsearch");
        }
    }

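Sending the request
The original snippet does not show the `log` field or the response handler; the Log4j 2 logger and the `handleIndexResponse` body below are placeholders added so that the class compiles.

    import java.io.IOException;
    import org.apache.logging.log4j.LogManager;
    import org.apache.logging.log4j.Logger;
    import org.elasticsearch.ElasticsearchException;
    import org.elasticsearch.action.ActionListener;
    import org.elasticsearch.action.index.IndexRequest;
    import org.elasticsearch.action.index.IndexResponse;
    import org.elasticsearch.client.Cancellable;
    import org.elasticsearch.client.RequestOptions;
    import org.elasticsearch.client.RestHighLevelClient;
    import org.elasticsearch.rest.RestStatus;

    public class IndexExecuteDemo {
        // Assumption: a Log4j 2 logger; the original only references `log`
        private static final Logger log = LogManager.getLogger(IndexExecuteDemo.class);
        private final RestHighLevelClient restHighLevelClient;

        public IndexExecuteDemo(RestHighLevelClient restHighLevelClient) {
            this.restHighLevelClient = restHighLevelClient;
        }

        void demo(IndexRequest indexRequest) {
            // Synchronous execution
            try {
                IndexResponse indexResponse = restHighLevelClient.index(indexRequest, RequestOptions.DEFAULT);
                handleIndexResponse(indexResponse);
            } catch (ElasticsearchException e) {
                log.warn(e.toString());
                if (e.status() == RestStatus.CONFLICT) {
                    // Version conflict, or the document already exists with opType=create
                }
            } catch (IOException e) {
                log.warn(e.toString());
            }

            // Asynchronous execution; the returned handle can cancel the request
            Cancellable cancellable = restHighLevelClient.indexAsync(indexRequest, RequestOptions.DEFAULT,
                    new ActionListener<IndexResponse>() {
                        @Override
                        public void onResponse(IndexResponse indexResponse) {
                            handleIndexResponse(indexResponse);
                        }

                        @Override
                        public void onFailure(Exception e) {
                            log.warn(e.toString());
                        }
                    });
        }

        // Placeholder handler: logs where the document went and what happened to it
        private void handleIndexResponse(IndexResponse indexResponse) {
            log.info("{}/{}: {}", indexResponse.getIndex(), indexResponse.getId(), indexResponse.getResult());
        }
    }
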
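Update (UpdateRequest)
Delete (DeleteRequest)
Batched requests (BulkRequest)
Copying an index (ReindexRequest)
A ReindexRequest copies documents from one or more source indices into a destination index.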
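At the REST level, a reindex is a `POST _reindex` call whose body names the source indices and the destination index. The sketch below only assembles that body as a string. `ReindexBody` and `build` are hypothetical names, and the code assumes plain index names that need no JSON escaping.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper that assembles the JSON body for POST _reindex. */
final class ReindexBody {
    private ReindexBody() {}

    static String build(List<String> sourceIndices, String destIndex) {
        if (sourceIndices.isEmpty()) {
            throw new IllegalArgumentException("at least one source index is required");
        }
        // Quote each source index name and join them into a JSON array.
        String sources = sourceIndices.stream()
                .map(name -> "\"" + name + "\"")
                .collect(Collectors.joining(","));
        return "{\"source\":{\"index\":[" + sources + "]},"
                + "\"dest\":{\"index\":\"" + destIndex + "\"}}";
    }
}
```

Calling `ReindexBody.build(List.of("posts", "posts2"), "posts_all")` produces a body that copies both indices into `posts_all`, a destination name made up for this illustration.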
Get (GetRequest, GetSourceRequest)
Multi-get (MultiGetRequest)
Search (SearchRequest)
- Structured search
- Full-text search
In depth
Index synonyms
At index creation
Synonyms are configured in the index settings when the index is created. A custom analyzer also requires a `tokenizer` entry, which this fragment omits; the lowercase-filter configuration later in this article shows one.

    {
      "settings": {
        "analysis": {
          "analyzer": {
            "my_hanlp_analyzer": {
              "type": "custom",
              "char_filter": [
                "html_strip"
              ],
              "filter": [
                "my_synonym"
              ]
            }
          },
          "filter": {
            "my_synonym": {
              "type": "synonym_graph",
              "synonyms_path": "analysis/synonym/synonym.txt",
              "updateable": true
            }
          }
        }
      }
    }

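Configuration file
Path: ${ES_HOME}/config/analysis/synonym/synonym.txt (`synonyms_path` is resolved relative to the Elasticsearch config directory)
Synonym rule formats
- 中文,汉语,汉字
- With this form, wherever one of the listed terms (for example 中文) occurs, it is expanded during analysis to 中文, 汉语 and 汉字, and all three are written to the index.
- 毛衣,毛裤 => 线服
- With this form, both 毛衣 and 毛裤 are replaced by 线服 during analysis, and only 线服 is written to the index.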
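The difference between the two rule forms can be seen by parsing them into a lookup table. This is a simplified sketch of the rule semantics only, not how the synonym filter is implemented: `SynonymRules` and `parse` are hypothetical names, and multi-word terms, comments and blank lines are not handled.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical parser for the two rule forms: "a,b,c" and "a,b => c". */
final class SynonymRules {
    private SynonymRules() {}

    /** Maps each input term to the terms that end up in the index. */
    static Map<String, List<String>> parse(List<String> lines) {
        Map<String, List<String>> rules = new LinkedHashMap<>();
        for (String line : lines) {
            String[] sides = line.split("=>");
            List<String> inputs = terms(sides[0]);
            // "a,b,c": every term expands to the whole group.
            // "a,b => c": every left-hand term is replaced by the right-hand side.
            List<String> outputs = sides.length == 2 ? terms(sides[1]) : inputs;
            for (String input : inputs) {
                rules.put(input, outputs);
            }
        }
        return rules;
    }

    /** Splits one side of a rule on commas and trims whitespace. */
    private static List<String> terms(String side) {
        List<String> result = new ArrayList<>();
        for (String term : side.split(",")) {
            String trimmed = term.trim();
            if (!trimmed.isEmpty()) {
                result.add(trimmed);
            }
        }
        return result;
    }
}
```

Parsing the two example rules above maps 中文 to all three terms of its group, while 毛衣 and 毛裤 each map to 线服 alone.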
Tokenization
See the HanLP documentation.
Token filters / case-insensitive matching
Configure this in the index settings when the index is created. The `language` parameter of the `lowercase` filter is optional and selects a language-specific variant (Greek here); omit it to use the default lowercasing.

    {
      "settings": {
        "number_of_shards": 1,
        "number_of_replicas": 1,
        "analysis": {
          "analyzer": {
            "my_hanlp_analyzer": {
              "type": "custom",
              "tokenizer": "my_hanlp",
              "char_filter": [
                "html_strip"
              ],
              "filter": [
                "my_lowercase"
              ]
            },
            "default": {
              "type": "hanlp"
            }
          },
          "tokenizer": {
            "my_hanlp": {
              "type": "hanlp",
              "enable_stop_dictionary": false,
              "enable_custom_config": false
            }
          },
          "filter": {
            "my_lowercase": {
              "type": "lowercase",
              "language": "greek"
            }
          }
        }
      }
    }

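Lowercasing makes matching case-insensitive because the same normalization runs on both the indexed tokens and the query term. The sketch below imitates that idea with plain Java strings. `LowercaseMatch` is a hypothetical name, and `Locale.ROOT` lowercasing stands in for the filter's default behaviour, not for its language-specific variants.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical illustration of case-insensitive matching through lowercasing. */
final class LowercaseMatch {
    private LowercaseMatch() {}

    /** Lowercases every token, as a lowercase token filter would at index time. */
    static List<String> normalize(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            out.add(token.toLowerCase(Locale.ROOT));
        }
        return out;
    }

    /** Applies the same normalization to the query term before comparing. */
    static boolean matches(List<String> indexedTokens, String queryTerm) {
        return normalize(indexedTokens).contains(queryTerm.toLowerCase(Locale.ROOT));
    }
}
```

With this normalization, a query for `ELASTICSEARCH` matches a document whose tokens include `Elasticsearch`.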
Appendix
Article URL: https://github.com/maxzhao-it/blog/post/34329dsafadsfasdfgh/