Common Elasticsearch APIs


Basic Queries

Elasticsearch queues only a bounded number of requests; the search thread pool queue holds 1000 by default. If earlier queries stall or a burst of requests arrives, the queue fills up and new requests are rejected with an exception.
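
When requests fail under this kind of load, a common client-side reaction is to retry with a growing delay. The following is a minimal sketch, assuming rejected requests surface as HTTP 429; `send` is a hypothetical callable standing in for whatever client call you use, and `call_with_backoff` is not part of any Elasticsearch client.

```python
import time
from typing import Callable, Tuple


def call_with_backoff(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,
    base_delay: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call `send` until it returns a status other than 429 or attempts run out.

    `send` is a hypothetical function returning (http_status, body).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            # Exponential backoff: base_delay, 2*base_delay, 4*base_delay, ...
            sleep(base_delay * (2 ** attempt))
    # Every attempt was rejected; hand the last response back to the caller.
    return status, body
```

The `sleep` parameter is injectable so the delays can be observed or skipped in tests.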

Create

POST /a/_doc/2
{"content":"公安部：各地校车将享最高路权"}
POST /a/_doc/1
{"content":"男人老狗穿什么连衣裙"}

Query

Return only selected fields of a document

?_source=title,text

get

GET /a/_doc/1
GET /a/_doc/2
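
The `_source` parameter and the get endpoint can be combined in a single URL. The sketch below only builds that URL and sends nothing; `get_doc_url` and the `http://localhost:9200` base address are hypothetical, and it assumes the typeless `/_doc/` path used by Elasticsearch 7 and later.

```python
from urllib.parse import quote, urlencode


def get_doc_url(base, index, doc_id, source_fields=None):
    """Build the URL for fetching one document, optionally limited to some fields."""
    url = f"{base}/{quote(index, safe='')}/_doc/{quote(str(doc_id), safe='')}"
    if source_fields:
        # Keep commas readable: _source=title,text
        url += "?" + urlencode({"_source": ",".join(source_fields)}, safe=",")
    return url


print(get_doc_url("http://localhost:9200", "a", 1, ["title", "text"]))
```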

Update

Partial update

/_update
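
A partial update sends only the changed fields, wrapped in a `doc` object. A minimal sketch of such a request, assuming the Elasticsearch 7+ path `/<index>/_update/<id>`; the helper name and the example field value are illustrative, and nothing is actually sent.

```python
import json


def partial_update_request(index, doc_id, changes):
    """Build (method, path, body) for a partial update of one document."""
    path = f"/{index}/_update/{doc_id}"
    # Only the fields listed in `changes` are merged into the stored document.
    body = json.dumps({"doc": changes}, ensure_ascii=False)
    return "POST", path, body


method, path, body = partial_update_request("a", 1, {"content": "updated text"})
print(method, path)
print(body)
```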

Retrieve multiple documents

/_mget
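
When all documents live in the same index, `_mget` accepts a plain list of ids. The sketch below builds such a request as a hedged illustration; `mget_request` is a hypothetical helper, not a client API.

```python
import json


def mget_request(index, ids):
    """Build (method, path, body) for fetching several documents from one index."""
    path = f"/{index}/_mget"
    body = json.dumps({"ids": [str(i) for i in ids]})
    return "GET", path, body


method, path, body = mget_request("a", [1, 2])
print(method, path)
print(body)
```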

Analyze

GET _analyze
{
  "analyzer" : "standard",
  "text" : "this is a test"
}

Shards

PUT test
{
    "settings" : {
        "number_of_shards" : 1
    },
    "mappings" : {
        "properties" : {
            "field1" : { "type" : "text" }
        }
    }
}
GET /kimchy,elasticsearch/_search?q=tag:wow
GET /_all/_search?q=tag:wow
GET _cat/indices

System Queries

Health check

GET /_cluster/health

Plugin-Based Queries

elasticsearch-analysis-ik

When using this plugin, set the analyzers in the mappings when you create the index. The analyzer of an existing field cannot be changed afterwards (new fields can still be added).

# Elasticsearch 7+ uses typeless mappings; on 6.x, nest "properties" under the type name "_doc".
PUT /a
{
	"mappings": {
		"properties": {
			"content": {
				"type": "text",
				"analyzer": "ik_max_word",
				"search_analyzer": "ik_smart"
			}
		}
	}
}

The online hot-update interface has a limitation: new words only affect documents indexed after the update. Existing documents keep their old tokens until they are reindexed, so adding words does not re-segment old data on its own.

Hot-updated words are kept in memory; they are not written back to the dic files.
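
Re-segmenting old data therefore means copying it into an index again so it passes through the analyzer once more. A rough sketch of a `_reindex` request body, assuming a destination index (here the hypothetical name `a_v2`) was already created with the ik analyzers in its mappings; `reindex_request` is an illustrative helper and sends nothing.

```python
import json


def reindex_request(source_index, dest_index):
    """Build (method, path, body) for copying all documents into another index."""
    body = {
        "source": {"index": source_index},
        "dest": {"index": dest_index},
    }
    return "POST", "/_reindex", json.dumps(body)


method, path, body = reindex_request("a", "a_v2")
print(method, path)
print(body)
```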

Shard Management

Default Template Settings

POST _template/default
{
  "index_patterns": ["*"],
  "order": -1,
  "settings": {
    "number_of_replicas": "0"
  }
}
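
The `order` value decides which template wins when several match the same index: lower orders are applied first and higher orders overwrite them, which is why a catch-all default uses -1. The sketch below is a simplified model of that merge, not Elasticsearch's own code; the `logs-*` template is a made-up example and settings are treated as flat keys.

```python
from fnmatch import fnmatch


def effective_settings(templates, index_name):
    """Merge the settings of all matching templates, lowest order first."""
    matching = [
        t for t in templates
        if any(fnmatch(index_name, pattern) for pattern in t["index_patterns"])
    ]
    merged = {}
    for t in sorted(matching, key=lambda t: t.get("order", 0)):
        merged.update(t.get("settings", {}))  # higher order overwrites lower
    return merged


templates = [
    {"index_patterns": ["*"], "order": -1, "settings": {"number_of_replicas": "0"}},
    # Hypothetical, more specific template that overrides the default.
    {"index_patterns": ["logs-*"], "order": 0, "settings": {"number_of_replicas": "1"}},
]

print(effective_settings(templates, "a"))
print(effective_settings(templates, "logs-2020"))
```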

Custom Template - Set Replica Count Default to 0

curl -XPUT 0.0.0.0:9200/_template/zeroreplicas -H 'Content-Type: application/json' -d '
{
  "index_patterns": "*",
  "settings": {
    "number_of_replicas": 0
  }
}'

Scale Down

PUT */_settings
{
  "settings": {
    "index": {
      "number_of_replicas": "0"
    }
  }
}

ingest/pipeline Usage

Ingest is a node role in Elasticsearch. Pipelines are defined through the ingest API and run on ingest nodes.

A pipeline is a preprocessor: a chain of processors that cleans or transforms each document before it is indexed.

For example, the definition of this pipeline:

PUT _ingest/pipeline/monthlyindex
{
    "description" : "servicelog-test monthlyindex",
    "processors" : [
      {
        "date_index_name" : {
          "field" : "timestamp",
          "date_formats" : [
            "UNIX"
          ],
          "timezone" : "Asia/Shanghai",
          "index_name_prefix" : "servicelog-test_",
          "date_rounding" : "M"
        }
      },
      {
        "date" : {
          "field" : "timestamp",
          "formats" : [
            "UNIX"
          ],
          "timezone" : "Asia/Shanghai"
        }
      },
      {
        "remove" : {
          "field" : "timestamp"
        }
      },
      {
        "remove" : {
          "field" : "type"
        }
      }
    ]
}

This pipeline routes documents written to the "servicelog-test" index into monthly indices.

A request that writes to "servicelog-test" with this pipeline applied ends up in an index such as servicelog-test_2020-02-01, the index for the month of the document's timestamp.
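
To see how a timestamp maps to an index name, the sketch below approximates what the `date_index_name` processor does with `date_rounding` set to "M" and the default yyyy-MM-dd name format. It is an illustration only: it uses a fixed UTC+8 offset in place of the Asia/Shanghai time zone database entry, and `monthly_index_name` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC+8 offset stands in for "Asia/Shanghai".
SHANGHAI = timezone(timedelta(hours=8))


def monthly_index_name(unix_seconds, prefix="servicelog-test_", tz=SHANGHAI):
    """Round a UNIX timestamp down to the first day of its month and name the index."""
    local = datetime.fromtimestamp(unix_seconds, tz)
    first_of_month = local.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    return prefix + first_of_month.strftime("%Y-%m-%d")


# 1581000000 is 2020-02-06 14:40:00 UTC
print(monthly_index_name(1581000000))
```

Note that rounding happens in local time: a document stamped 2020-01-31 16:00:00 UTC already belongs to the February index in UTC+8.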

This pipeline removes the need to write everything into a single Elasticsearch index. We no longer need delete by query: old data is dropped by deleting whole past indices, which is also the approach Elasticsearch recommends.
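
Deleting past indices then comes down to picking index names by their date suffix. A minimal sketch of that selection, assuming names of the form prefix plus yyyy-MM-dd as produced above; `expired_indices` and the retention of two previous months are illustrative choices, and the actual deletion call is left out.

```python
from datetime import date


def expired_indices(names, prefix, today, keep_months):
    """Return index names older than `keep_months` full months before today's month."""
    cutoff = today.year * 12 + (today.month - 1) - keep_months
    expired = []
    for name in names:
        if not name.startswith(prefix):
            continue
        try:
            day = date.fromisoformat(name[len(prefix):])
        except ValueError:
            continue  # suffix is not a date, leave the index alone
        if day.year * 12 + (day.month - 1) < cutoff:
            expired.append(name)
    return sorted(expired)


names = [
    "servicelog-test_2019-11-01",
    "servicelog-test_2019-12-01",
    "servicelog-test_2020-01-01",
    "servicelog-test_2020-02-01",
    "other_2019-01-01",
]
print(expired_indices(names, "servicelog-test_", date(2020, 2, 15), keep_months=2))
```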

Reference link: Date Index Name Processor

Paid Features (_xpack)

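ES has no password by default. When these notes were written, user authentication required a commercial license; since versions 6.8 and 7.1 the core security features are included in the free Basic license.

security-api-users

GET /_xpack/security/user
# 7.x and later: GET /_security/user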

Deprecated Queries in 7.0

As of version 7.0 Elasticsearch will require that a [field] parameter is provided when a [seed] is set

Change to:

"random_score": {
    "seed": 10,
    "field": "_seq_no"
}
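
For context, a `random_score` block like this usually sits inside a `function_score` query. The sketch below builds one possible search body; the helper name is hypothetical and the surrounding query shape is an assumption about how the fragment is used.

```python
import json


def random_score_query(seed, field="_seq_no"):
    """Build a search body that scores documents reproducibly at random."""
    return {
        "query": {
            "function_score": {
                # A seed now has to be paired with a field.
                "random_score": {"seed": seed, "field": field}
            }
        }
    }


print(json.dumps(random_score_query(10), indent=2))
```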

Deprecation: Deprecated field [inline] used, expected [source] instead

Rename inline to source in the script object:

		"_script": {
			"script": {
				"source": "doc['xxx'].value>0?1:0"
			},

inline
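
If many stored queries still use the old key, the rename can be done mechanically. The following is a rough sketch, assuming the queries are available as parsed JSON; `rename_inline` is a hypothetical helper that only touches `inline` keys directly inside a `script` object.

```python
def rename_inline(node):
    """Return a copy of `node` with "inline" renamed to "source" in every "script" object."""
    if isinstance(node, list):
        return [rename_inline(item) for item in node]
    if not isinstance(node, dict):
        return node
    out = {}
    for key, value in node.items():
        value = rename_inline(value)  # fresh copy, the input is never mutated
        if (
            key == "script"
            and isinstance(value, dict)
            and "inline" in value
            and "source" not in value
        ):
            value["source"] = value.pop("inline")
        out[key] = value
    return out


old = {"sort": {"_script": {"script": {"inline": "doc['xxx'].value>0?1:0"}}}}
print(rename_inline(old))
```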


