ElasticSearch Synonym Configuration

Edited on 2022-07-04 In ELK, ElasticSearch

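Preface

Synonyms are part of an index's analysis settings, so they need to be configured when the index is created.

Synonym configuration

When creating the index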

{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_hanlp_analyzer": {
          "type": "custom",
          "tokenizer": "hanlp",
          "char_filter": [
            "html_strip"
          ],
          "filter": [
            "my_synonym"
          ]
        }
      },
      "filter": {
        "my_synonym": {
          "type": "synonym_graph",
          "synonyms_path": "analysis/synonym/synonym.txt",
          "updateable": true
        }
      }
    }
  }
}
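A settings body like this is sent when the index is created. The following is a minimal sketch, not the author's procedure, that builds the corresponding `PUT` request with Python's standard library. The host `http://localhost:9200` and the index name `my_index` are hypothetical, and the `hanlp` tokenizer assumes the HanLP plugin is installed, as in the appendix configuration. The second helper targets the `_reload_search_analyzers` endpoint, which re-reads the synonym file of a filter marked `updateable`.

```python
import json
import urllib.request

# Hypothetical values for illustration; adjust them to your cluster.
BASE_URL = "http://localhost:9200"
INDEX_NAME = "my_index"

# Same structure as the settings above; "hanlp" assumes the HanLP plugin.
INDEX_BODY = {
    "settings": {
        "analysis": {
            "analyzer": {
                "my_hanlp_analyzer": {
                    "type": "custom",
                    "tokenizer": "hanlp",
                    "char_filter": ["html_strip"],
                    "filter": ["my_synonym"],
                }
            },
            "filter": {
                "my_synonym": {
                    "type": "synonym_graph",
                    "synonyms_path": "analysis/synonym/synonym.txt",
                    "updateable": True,
                }
            },
        }
    }
}


def build_create_index_request(base_url, index_name, body):
    """Build (but do not send) the PUT request that creates the index."""
    return urllib.request.Request(
        url=f"{base_url}/{index_name}",
        data=json.dumps(body, ensure_ascii=False).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def build_reload_request(base_url, index_name):
    """Build the POST request that reloads updateable search analyzers."""
    return urllib.request.Request(
        url=f"{base_url}/{index_name}/_reload_search_analyzers",
        method="POST",
    )


create_request = build_create_index_request(BASE_URL, INDEX_NAME, INDEX_BODY)
print(create_request.get_method(), create_request.full_url)
# To actually send it: urllib.request.urlopen(create_request)
```

Note that Elasticsearch only accepts an `updateable` filter in an analyzer used at search time, so an analyzer containing it should be referenced through `search_analyzer` rather than `analyzer`.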

`ElasticSearch` configuration file

Path

${ES_HOME}/config/analysis/synonym/synonym.txt

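Synonym rule formats

中文,汉语,汉字
- Equivalent synonyms: during analysis, any one of 中文, 汉语, 汉字 is expanded to all three terms, and all three are stored in the index.
毛衣,毛裤 => 线服
- Explicit mapping: during analysis, 毛衣 and 毛裤 are both replaced by 线服, and only 线服 is stored in the index.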
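The difference between the two rule formats can be illustrated with a small simulation. This is only a simplified sketch of the behaviour described above, not Elasticsearch's actual implementation: it handles single-token synonyms only and ignores token positions. The function names `parse_rules` and `expand` and the unmapped token 围巾 are hypothetical.

```python
def parse_rules(lines):
    """Parse synonym lines into a {term: [replacement terms]} mapping."""
    mapping = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if "=>" in line:
            # Explicit mapping: left-hand terms are replaced by right-hand terms.
            left, right = line.split("=>", 1)
            sources = [t.strip() for t in left.split(",") if t.strip()]
            targets = [t.strip() for t in right.split(",") if t.strip()]
        else:
            # Equivalent synonyms: every term expands to the whole group.
            sources = targets = [t.strip() for t in line.split(",") if t.strip()]
        for source in sources:
            replacements = mapping.setdefault(source, [])
            for target in targets:
                if target not in replacements:
                    replacements.append(target)
    return mapping


def expand(tokens, mapping):
    """Return, for each input token, the list of terms emitted in its place."""
    return [mapping.get(token, [token]) for token in tokens]


rules = ["中文,汉语,汉字", "毛衣,毛裤 => 线服"]
mapping = parse_rules(rules)
result = expand(["汉语", "毛裤", "围巾"], mapping)
# result == [["中文", "汉语", "汉字"], ["线服"], ["围巾"]]
```

In this sketch 汉语 yields all three equivalent terms, 毛裤 yields only 线服, and a token without a rule passes through unchanged.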

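Appendix

Index creation settings

This configuration combines the synonym filter, a custom HanLP tokenizer, and case-insensitive matching through a lowercase filter.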

{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1,
    "analysis": {
      "analyzer": {
        "my_hanlp_analyzer": {
          "type": "custom",
          "tokenizer": "my_hanlp",
          "char_filter": [
            "html_strip"
          ],
          "filter": [
            "my_lowercase",
            "my_synonym"
          ]
        },
        "default": {
          "type": "hanlp"
        }
      },
      "tokenizer": {
        "my_hanlp": {
          "type": "hanlp",
          "enable_stop_dictionary": false,
          "enable_custom_config": false
        }
      },
      "filter": {
        "my_lowercase": {
          "type": "lowercase"
        },
        "my_synonym": {
          "type": "synonym_graph",
          "synonyms_path": "analysis/synonym/synonym.txt"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "my_hanlp_analyzer",
        "search_analyzer": "my_hanlp_analyzer"
      },
      "content": {
        "type": "text",
        "analyzer": "my_hanlp_analyzer",
        "search_analyzer": "my_hanlp_analyzer"
      }
    }
  }
}
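One way to check that the analyzer behaves as intended is the `_analyze` API. The sketch below only builds the request and parses a response; it assumes a hypothetical host `http://localhost:9200` and index name `my_index`, and the helper names are illustrative. If the synonym file contains the rules described earlier, analysing 毛衣 would be expected to return 线服, though the actual tokens depend on the HanLP plugin and dictionary in use.

```python
import json
import urllib.request

# Hypothetical values for illustration; adjust them to your cluster.
BASE_URL = "http://localhost:9200"
INDEX_NAME = "my_index"


def build_analyze_request(base_url, index_name, analyzer, text):
    """Build (but do not send) a POST request to the index's _analyze endpoint."""
    body = {"analyzer": analyzer, "text": text}
    return urllib.request.Request(
        url=f"{base_url}/{index_name}/_analyze",
        data=json.dumps(body, ensure_ascii=False).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_tokens(response_body):
    """Return the token strings from an _analyze JSON response body."""
    payload = json.loads(response_body)
    return [entry["token"] for entry in payload.get("tokens", [])]


request = build_analyze_request(BASE_URL, INDEX_NAME, "my_hanlp_analyzer", "毛衣")
print(request.get_method(), request.full_url)
# To send it and read the tokens:
#   with urllib.request.urlopen(request) as response:
#       print(extract_tokens(response.read()))
```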

ElasticSearch API

Java Rest Client 7.13.2

Article URL: https://github.com/maxzhao-it/blog/post/28907/
