SQL REST API

edit

概述

edit

The SQL search API 接受JSON文档中的SQL,执行它,并返回结果。例如:

POST /_sql?format=txt
{
  "query": "SELECT * FROM library ORDER BY page_count DESC LIMIT 5"
}

返回结果:

     author      |        name        |  page_count   | release_date
-----------------+--------------------+---------------+------------------------
Peter F. Hamilton|Pandora's Star      |768            |2004-03-02T00:00:00.000Z
Vernor Vinge     |A Fire Upon the Deep|613            |1992-06-01T00:00:00.000Z
Frank Herbert    |Dune                |604            |1965-06-01T00:00:00.000Z
Alastair Reynolds|Revelation Space    |585            |2000-03-15T00:00:00.000Z
James S.A. Corey |Leviathan Wakes     |561            |2011-06-02T00:00:00.000Z

使用 Kibana 控制台

如果你正在使用Kibana Console (强烈推荐),在创建查询时利用三重引号"""。这不仅会自动转义查询字符串中的双引号("),还支持多行,如下所示:

控制台三重引号

响应数据格式

edit

虽然文本格式对人类很友好,但计算机更喜欢更结构化的东西。

Elasticsearch SQL 可以以以下格式返回数据,这些格式可以通过 URL 中的 format 属性或通过设置 Accept HTTP 头来设置:

URL参数优先于Accept HTTP头。如果两者都未指定,则响应将以与请求相同的格式返回。

格式

Accept HTTP 头

描述

人类可读

csv

文本/csv

逗号分隔值

json

application/json

JSON(JavaScript 对象表示法)人类可读的格式

tsv

text/tab-separated-values

制表符分隔值

txt

text/plain

类似命令行的表示

yaml

application/yaml

YAML(YAML 不是标记语言)人类可读的格式

二进制格式

cbor

application/cbor

简洁二进制对象表示法

微笑

application/smile

Smile 二进制数据格式类似于CBOR

CSV 格式接受一个格式化 URL 查询属性 delimiter,该属性指示应使用哪个字符来分隔 CSV 值。它默认为逗号 (,),并且不能使用以下值:双引号 (")、回车 (\r) 和换行 (\n)。制表符 (\t) 也不能使用,需要使用 tsv 格式。

以下是一些人类可读格式的示例:

CSV

edit
POST /_sql?format=csv
{
  "query": "SELECT * FROM library ORDER BY page_count DESC",
  "fetch_size": 5
}

返回结果为:

author,name,page_count,release_date
Peter F. Hamilton,Pandora's Star,768,2004-03-02T00:00:00.000Z
Vernor Vinge,A Fire Upon the Deep,613,1992-06-01T00:00:00.000Z
Frank Herbert,Dune,604,1965-06-01T00:00:00.000Z
Alastair Reynolds,Revelation Space,585,2000-03-15T00:00:00.000Z
James S.A. Corey,Leviathan Wakes,561,2011-06-02T00:00:00.000Z

或者:

POST /_sql?format=csv&delimiter=%3b
{
    "query": "SELECT * FROM library ORDER BY page_count DESC",
    "fetch_size": 5
}

返回结果为:

author;name;page_count;release_date
Peter F. Hamilton;Pandora's Star;768;2004-03-02T00:00:00.000Z
Vernor Vinge;A Fire Upon the Deep;613;1992-06-01T00:00:00.000Z
Frank Herbert;Dune;604;1965-06-01T00:00:00.000Z
Alastair Reynolds;Revelation Space;585;2000-03-15T00:00:00.000Z
James S.A. Corey;Leviathan Wakes;561;2011-06-02T00:00:00.000Z

JSON

edit
POST /_sql?format=json
{
  "query": "SELECT * FROM library ORDER BY page_count DESC",
  "fetch_size": 5
}

返回结果:

{
  "columns": [
    {"name": "author",       "type": "text"},
    {"name": "name",         "type": "text"},
    {"name": "page_count",   "type": "short"},
    {"name": "release_date", "type": "datetime"}
  ],
  "rows": [
    ["Peter F. Hamilton",  "Pandora's Star",       768, "2004-03-02T00:00:00.000Z"],
    ["Vernor Vinge",       "A Fire Upon the Deep", 613, "1992-06-01T00:00:00.000Z"],
    ["Frank Herbert",      "Dune",                 604, "1965-06-01T00:00:00.000Z"],
    ["Alastair Reynolds",  "Revelation Space",     585, "2000-03-15T00:00:00.000Z"],
    ["James S.A. Corey",   "Leviathan Wakes",      561, "2011-06-02T00:00:00.000Z"]
  ],
  "cursor": "sDXF1ZXJ5QW5kRmV0Y2gBAAAAAAAAAAEWWWdrRlVfSS1TbDYtcW9lc1FJNmlYdw==:BAFmBmF1dGhvcgFmBG5hbWUBZgpwYWdlX2NvdW50AWYMcmVsZWFzZV9kYXRl+v///w8="
}

TSV

edit
POST /_sql?format=tsv
{
  "query": "SELECT * FROM library ORDER BY page_count DESC",
  "fetch_size": 5
}

返回结果:

author	name	page_count	release_date
Peter F. Hamilton	Pandora's Star	768	2004-03-02T00:00:00.000Z
Vernor Vinge	A Fire Upon the Deep	613	1992-06-01T00:00:00.000Z
Frank Herbert	Dune	604	1965-06-01T00:00:00.000Z
Alastair Reynolds	Revelation Space	585	2000-03-15T00:00:00.000Z
James S.A. Corey	Leviathan Wakes	561	2011-06-02T00:00:00.000Z

TXT

edit
POST /_sql?format=txt
{
  "query": "SELECT * FROM library ORDER BY page_count DESC",
  "fetch_size": 5
}

返回结果:

     author      |        name        |  page_count   |      release_date
-----------------+--------------------+---------------+------------------------
Peter F. Hamilton|Pandora's Star      |768            |2004-03-02T00:00:00.000Z
Vernor Vinge     |A Fire Upon the Deep|613            |1992-06-01T00:00:00.000Z
Frank Herbert    |Dune                |604            |1965-06-01T00:00:00.000Z
Alastair Reynolds|Revelation Space    |585            |2000-03-15T00:00:00.000Z
James S.A. Corey |Leviathan Wakes     |561            |2011-06-02T00:00:00.000Z

YAML

edit
POST /_sql?format=yaml
{
  "query": "SELECT * FROM library ORDER BY page_count DESC",
  "fetch_size": 5
}

返回结果:

columns:
- name: "author"
  type: "text"
- name: "name"
  type: "text"
- name: "page_count"
  type: "short"
- name: "release_date"
  type: "datetime"
rows:
- - "Peter F. Hamilton"
  - "Pandora's Star"
  - 768
  - "2004-03-02T00:00:00.000Z"
- - "Vernor Vinge"
  - "A Fire Upon the Deep"
  - 613
  - "1992-06-01T00:00:00.000Z"
- - "Frank Herbert"
  - "Dune"
  - 604
  - "1965-06-01T00:00:00.000Z"
- - "Alastair Reynolds"
  - "Revelation Space"
  - 585
  - "2000-03-15T00:00:00.000Z"
- - "James S.A. Corey"
  - "Leviathan Wakes"
  - 561
  - "2011-06-02T00:00:00.000Z"
cursor: "sDXF1ZXJ5QW5kRmV0Y2gBAAAAAAAAAAEWWWdrRlVfSS1TbDYtcW9lc1FJNmlYdw==:BAFmBmF1dGhvcgFmBG5hbWUBZgpwYWdlX2NvdW50AWYMcmVsZWFzZV9kYXRl+v///w8="

分页浏览大型响应

edit

使用来自上一节的示例,可以通过发送回游标字段继续到下一页。在CSV、TSV和TXT格式的情况下,游标在Cursor HTTP头中返回。

POST /_sql?format=json
{
  "cursor": "sDXF1ZXJ5QW5kRmV0Y2gBAAAAAAAAAAEWYUpOYklQMHhRUEtld3RsNnFtYU1hQQ==:BAFmBGRhdGUBZgVsaWtlcwFzB21lc3NhZ2UBZgR1c2Vy9f///w8="
}

看起来像:

{
  "rows" : [
    ["Dan Simmons",        "Hyperion",             482,  "1989-05-26T00:00:00.000Z"],
    ["Iain M. Banks",      "Consider Phlebas",     471,  "1987-04-23T00:00:00.000Z"],
    ["Neal Stephenson",    "Snow Crash",           470,  "1992-06-01T00:00:00.000Z"],
    ["Frank Herbert",      "God Emperor of Dune",  454,  "1981-05-28T00:00:00.000Z"],
    ["Frank Herbert",      "Children of Dune",     408,  "1976-04-21T00:00:00.000Z"]
  ],
  "cursor" : "sDXF1ZXJ5QW5kRmV0Y2gBAAAAAAAAAAEWODRMaXBUaVlRN21iTlRyWHZWYUdrdw==:BAFmBmF1dGhvcgFmBG5hbWUBZgpwYWdlX2NvdW50AWYMcmVsZWFzZV9kYXRl9f///w8="
}

请注意,columns 对象仅是第一页的一部分。

当结果中没有返回cursor时,您已到达最后一页。与Elasticsearch的scroll类似,SQL可能会在Elasticsearch中保持状态以支持游标。与scroll不同,接收最后一页足以保证Elasticsearch状态被清除。

要清除较早的状态,请使用清除游标 API

POST /_sql/close
{
  "cursor": "sDXF1ZXJ5QW5kRmV0Y2gBAAAAAAAAAAEWYUpOYklQMHhRUEtld3RsNnFtYU1hQQ==:BAFmBGRhdGUBZgVsaWtlcwFzB21lc3NhZ2UBZgR1c2Vy9f///w8="
}

这将类似于返回

{
  "succeeded" : true
}

使用 Elasticsearch 查询 DSL 进行过滤

edit

可以使用标准的Elasticsearch查询DSL通过在过滤器参数中指定查询来过滤SQL将运行的结果。

POST /_sql?format=txt
{
  "query": "SELECT * FROM library ORDER BY page_count DESC",
  "filter": {
    "range": {
      "page_count": {
        "gte" : 100,
        "lte" : 200
      }
    }
  },
  "fetch_size": 5
}

返回结果:

    author     |                name                |  page_count   | release_date
---------------+------------------------------------+---------------+------------------------
Douglas Adams  |The Hitchhiker's Guide to the Galaxy|180            |1979-10-12T00:00:00.000Z

标准查询DSL过滤的一个有用且不那么明显的用途是根据特定的路由键搜索文档。 由于Elasticsearch SQL不支持routing参数,因此可以指定terms过滤器用于_routing字段

POST /_sql?format=txt
{
  "query": "SELECT * FROM library",
  "filter": {
    "terms": {
      "_routing": ["abc"]
    }
  }
}

列式结果

edit

最广为人知的显示SQL查询结果的方式是,每个单独的记录/文档代表一行。对于某些格式,Elasticsearch SQL可以以列式方式返回结果:一行代表当前结果页中某一列的所有值。

以下格式可以以列式方向返回:jsonyamlcborsmile

POST /_sql?format=json
{
  "query": "SELECT * FROM library ORDER BY page_count DESC",
  "fetch_size": 5,
  "columnar": true
}

返回结果:

{
  "columns": [
    {"name": "author", "type": "text"},
    {"name": "name", "type": "text"},
    {"name": "page_count", "type": "short"},
    {"name": "release_date", "type": "datetime"}
  ],
  "values": [
    ["Peter F. Hamilton", "Vernor Vinge", "Frank Herbert", "Alastair Reynolds", "James S.A. Corey"],
    ["Pandora's Star", "A Fire Upon the Deep", "Dune", "Revelation Space", "Leviathan Wakes"],
    [768, 613, 604, 585, 561],
    ["2004-03-02T00:00:00.000Z", "1992-06-01T00:00:00.000Z", "1965-06-01T00:00:00.000Z", "2000-03-15T00:00:00.000Z", "2011-06-02T00:00:00.000Z"]
  ],
  "cursor": "sDXF1ZXJ5QW5kRmV0Y2gBAAAAAAAAAAEWWWdrRlVfSS1TbDYtcW9lc1FJNmlYdw==:BAFmBmF1dGhvcgFmBG5hbWUBZgpwYWdlX2NvdW50AWYMcmVsZWFzZV9kYXRl+v///w8="
}

使用 cursor 进行的任何后续调用仍然必须包含 columnar 参数以保留方向,这意味着初始查询不会 记住 列选项。

POST /_sql?format=json
{
  "cursor": "sDXF1ZXJ5QW5kRmV0Y2gBAAAAAAAAAAEWWWdrRlVfSS1TbDYtcW9lc1FJNmlYdw==:BAFmBmF1dGhvcgFmBG5hbWUBZgpwYWdlX2NvdW50AWYMcmVsZWFzZV9kYXRl+v///w8=",
  "columnar": true
}

看起来像:

{
  "values": [
    ["Dan Simmons", "Iain M. Banks", "Neal Stephenson", "Frank Herbert", "Frank Herbert"],
    ["Hyperion", "Consider Phlebas", "Snow Crash", "God Emperor of Dune", "Children of Dune"],
    [482, 471, 470, 454, 408],
    ["1989-05-26T00:00:00.000Z", "1987-04-23T00:00:00.000Z", "1992-06-01T00:00:00.000Z", "1981-05-28T00:00:00.000Z", "1976-04-21T00:00:00.000Z"]
  ],
  "cursor": "46ToAwFzQERYRjFaWEo1UVc1a1JtVjBZMmdCQUFBQUFBQUFBQUVXWjBaNlFXbzNOV0pVY21Wa1NUZDJhV2t3V2xwblp3PT3/////DwQBZgZhdXRob3IBBHRleHQAAAFmBG5hbWUBBHRleHQAAAFmCnBhZ2VfY291bnQBBGxvbmcBAAFmDHJlbGVhc2VfZGF0ZQEIZGF0ZXRpbWUBAAEP"
}

将参数传递给查询

edit

在查询条件中使用值,例如,或在HAVING语句中,可以通过将值集成到查询字符串本身中来“内联”完成:

POST /_sql?format=txt
{
	"query": "SELECT YEAR(release_date) AS year FROM library WHERE page_count > 300 AND author = 'Frank Herbert' GROUP BY year HAVING COUNT(*) > 0"
}

或者可以通过提取参数列表中的值并在查询字符串中使用问号占位符(?)来完成:

POST /_sql?format=txt
{
	"query": "SELECT YEAR(release_date) AS year FROM library WHERE page_count > ? AND author = ? GROUP BY year HAVING COUNT(*) > ?",
	"params": [300, "Frank Herbert", 0]
}

推荐的向查询传递值的方式是使用问号占位符,以避免任何尝试的攻击或SQL注入。

使用运行时字段

edit

使用 runtime_mappings 参数在搜索期间从现有字段中提取并创建 运行时字段 或列。

以下搜索从release_date创建一个release_day_of_week运行时字段,并在响应中返回它。

POST _sql?format=txt
{
  "runtime_mappings": {
    "release_day_of_week": {
      "type": "keyword",
      "script": """
        emit(doc['release_date'].value.dayOfWeekEnum.toString())
      """
    }
  },
  "query": """
    SELECT * FROM library WHERE page_count > 300 AND author = 'Frank Herbert'
  """
}

API返回:

    author     |     name      |  page_count   |      release_date      |release_day_of_week
---------------+---------------+---------------+------------------------+-------------------
Frank Herbert  |Dune           |604            |1965-06-01T00:00:00.000Z|TUESDAY

运行异步 SQL 搜索

edit

默认情况下,SQL 搜索是同步的。它们会等待完整的结果返回响应。然而,对于跨大数据集或冻结数据的搜索,结果可能需要更长时间。

为了避免长时间等待,请运行异步SQL搜索。将wait_for_completion_timeout设置为您希望等待同步结果的时间长度。

POST _sql?format=json
{
  "wait_for_completion_timeout": "2s",
  "query": "SELECT * FROM library ORDER BY page_count DESC",
  "fetch_size": 5
}

如果搜索在此时间段内未完成,搜索将变为异步。API返回:

  • 搜索的id
  • is_partial值为true,表示搜索结果不完整。
  • is_running值为true,表示搜索仍在后台运行。

对于CSV、TSV和TXT响应,API在这些相应的HTTP头中返回这些值:Async-IDAsync-partialAsync-running

{
  "id": "FnR0TDhyWUVmUmVtWXRWZER4MXZiNFEad2F5UDk2ZVdTVHV1S0xDUy00SklUdzozMTU=",
  "is_partial": true,
  "is_running": true,
  "rows": [ ]
}

要检查异步搜索的进度,请使用搜索ID与获取异步SQL搜索状态API

GET _sql/async/status/FnR0TDhyWUVmUmVtWXRWZER4MXZiNFEad2F5UDk2ZVdTVHV1S0xDUy00SklUdzozMTU=

如果 is_runningis_partial 都是 false,则异步搜索已完成并返回完整结果。

{
  "id": "FnR0TDhyWUVmUmVtWXRWZER4MXZiNFEad2F5UDk2ZVdTVHV1S0xDUy00SklUdzozMTU=",
  "is_running": false,
  "is_partial": false,
  "expiration_time_in_millis": 1611690295000,
  "completion_status": 200
}

要获取结果,请使用搜索 ID 与 获取异步 SQL 搜索 API。如果搜索仍在运行,请指定您希望等待的时间长度,使用 wait_for_completion_timeout。您还可以指定响应的 format

GET _sql/async/FnR0TDhyWUVmUmVtWXRWZER4MXZiNFEad2F5UDk2ZVdTVHV1S0xDUy00SklUdzozMTU=?wait_for_completion_timeout=2s&format=json

更改搜索保留期

edit

默认情况下,Elasticsearch 存储异步 SQL 搜索的时间为五天。在此期限之后,Elasticsearch 会删除该搜索及其结果,即使搜索仍在运行。要更改此保留期限,请使用 keep_alive 参数。

POST _sql?format=json
{
  "keep_alive": "2d",
  "wait_for_completion_timeout": "2s",
  "query": "SELECT * FROM library ORDER BY page_count DESC",
  "fetch_size": 5
}

您可以使用异步 SQL 搜索 API 的 keep_alive 参数来稍后更改保留期。新的保留期从请求运行后开始。

GET _sql/async/FmdMX2pIang3UWhLRU5QS0lqdlppYncaMUpYQ05oSkpTc3kwZ21EdC1tbFJXQToxOTI=?keep_alive=5d&wait_for_completion_timeout=2s&format=json

使用删除异步SQL搜索APIkeep_alive周期结束之前删除异步搜索。如果搜索仍在运行,Elasticsearch将取消它。

DELETE _sql/async/delete/FmdMX2pIang3UWhLRU5QS0lqdlppYncaMUpYQ05oSkpTc3kwZ21EdC1tbFJXQToxOTI=

存储同步SQL搜索

edit

默认情况下,Elasticsearch 仅存储异步 SQL 搜索。要保存同步搜索,请指定 wait_for_completion_timeout 并将 keep_on_completion 设置为 true

POST _sql?format=json
{
  "keep_on_completion": true,
  "wait_for_completion_timeout": "2s",
  "query": "SELECT * FROM library ORDER BY page_count DESC",
  "fetch_size": 5
}

如果 is_partialis_running 都是 false,则搜索是同步的并返回完整结果。

{
  "id": "Fnc5UllQdUVWU0NxRFNMbWxNYXplaFEaMUpYQ05oSkpTc3kwZ21EdC1tbFJXQTo0NzA=",
  "is_partial": false,
  "is_running": false,
  "rows": ...,
  "columns": ...,
  "cursor": ...
}

您可以使用搜索 ID 通过 get async SQL search API 稍后获取相同的结果。

保存的同步搜索仍然受限于keep_alive保留期。当此期限结束时,Elasticsearch会删除搜索结果。您也可以使用删除异步SQL搜索API来删除保存的搜索。