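Searching data in Amazon OpenSearch Service

There are several common methods for searching documents in Amazon OpenSearch Service, including URI searches and request body searches. OpenSearch Service offers additional functionality that improves the search experience, such as custom packages, SQL support, and asynchronous search. For a comprehensive OpenSearch search API reference, see the OpenSearch documentation.

Note

The following sample requests work with OpenSearch APIs. Some requests might not work with older Elasticsearch versions.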

URI searches

Universal Resource Identifier (URI) searches are the simplest form of search. In a URI search, you specify the query as an HTTP request parameter:

GET https://search-my-domain.us-west-1.es.amazonaws.com/_search?q=house

A sample response might look like the following:

{
  "took": 25,
  "timed_out": false,
  "_shards": {
    "total": 10,
    "successful": 10,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 85,
      "relation": "eq"
    },
    "max_score": 6.6137657,
    "hits": [
      {
        "_index": "movies",
        "_type": "movie",
        "_id": "tt0077975",
        "_score": 6.6137657,
        "_source": {
          "directors": ["John Landis"],
          "release_date": "1978-07-27T00:00:00Z",
          "rating": 7.5,
          "genres": ["Comedy", "Romance"],
          "image_url": "http://ia.media-imdb.com/images/M/MV5BMTY2OTQxNTc1OF5BMl5BanBnXkFtZTYwNjA3NjI5._V1_SX400_.jpg",
          "plot": "At a 1962 College, Dean Vernon Wormer is determined to expel the entire Delta Tau Chi Fraternity, but those troublemakers have other plans for him.",
          "title": "Animal House",
          "rank": 527,
          "running_time_secs": 6540,
          "actors": ["John Belushi", "Karen Allen", "Tom Hulce"],
          "year": 1978,
          "id": "tt0077975"
        }
      },
      ...
    ]
  }
}

By default, this query searches all fields of all indices for the term house. To narrow the search, specify an index (movies) and a document field (title) in the URI:

GET https://search-my-domain.us-west-1.es.amazonaws.com/movies/_search?q=title:house

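You can include additional parameters in the request, but the supported parameters provide only a small subset of the OpenSearch search options. The following request returns 20 results (instead of the default of 10) and sorts them by year (rather than by _score):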

GET https://search-my-domain.us-west-1.es.amazonaws.com/movies/_search?q=title:house&size=20&sort=year:desc
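The three URI searches above differ only in the path and the query parameters, so the URL can be assembled programmatically. The following is a minimal Python sketch, not an official client: the helper name build_uri_search is hypothetical, the endpoint is the placeholder domain used in the examples, and the code only builds the URL without sending a request.

```python
from urllib.parse import urlencode

# Placeholder endpoint taken from the examples above; replace it with your own domain.
DOMAIN = "https://search-my-domain.us-west-1.es.amazonaws.com"


def build_uri_search(query, index=None, field=None, size=None, sort=None):
    """Build a URI search URL (hypothetical helper, for illustration only)."""
    # Search one index if given, otherwise all indices.
    path = f"/{index}/_search" if index else "/_search"
    # Restrict the term to one field with the field:term syntax.
    q = f"{field}:{query}" if field else query
    params = {"q": q}
    if size is not None:
        params["size"] = size
    if sort is not None:
        params["sort"] = sort
    # Keep ':' unescaped so the URL matches the form shown in the examples.
    return f"{DOMAIN}{path}?{urlencode(params, safe=':')}"


print(build_uri_search("house"))
print(build_uri_search("house", index="movies", field="title", size=20, sort="year:desc"))
```

Using urlencode keeps terms that contain spaces or reserved characters correctly escaped, which is easy to get wrong when concatenating strings by hand.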

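Request body searches

To perform more complex searches, use the HTTP request body and the OpenSearch domain-specific language (DSL) for queries. The query DSL lets you specify the full range of OpenSearch search options. The following query_string query is similar to the final URI search example: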

POST https://search-my-domain.us-west-1.es.amazonaws.com/movies/_search
{
  "size": 20,
  "sort": {
    "year": {
      "order": "desc"
    }
  },
  "query": {
    "query_string": {
      "default_field": "title",
      "query": "house"
    }
  }
}

Note

For request body searches, the _search API accepts HTTP GET and POST, but not all HTTP clients support adding a request body to a GET request. POST is the more universal choice.
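As a rough illustration of sending the query DSL in a POST body, the following Python sketch prepares the request with the standard library. The helper name build_search_request is hypothetical, the URL is the placeholder domain from the examples, and the sketch neither sends the request nor handles authentication, which your domain will likely require.

```python
import json
from urllib.request import Request

# Placeholder endpoint taken from the examples above; replace it with your own domain.
SEARCH_URL = "https://search-my-domain.us-west-1.es.amazonaws.com/movies/_search"


def build_search_request(body, url=SEARCH_URL):
    """Wrap a query DSL body in a POST request object (hypothetical helper)."""
    data = json.dumps(body).encode("utf-8")
    return Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",  # POST rather than GET, because not every client sends a GET body
    )


query_body = {
    "size": 20,
    "sort": {"year": {"order": "desc"}},
    "query": {"query_string": {"default_field": "title", "query": "house"}},
}

request = build_search_request(query_body)
print(request.get_method(), request.full_url)
```

Passing the prepared object to urllib.request.urlopen would send it; that step is left out here so the sketch runs without network access.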

In many cases, you might want to search several fields, but not all fields. Use the multi_match query:

POST https://search-my-domain.us-west-1.es.amazonaws.com/movies/_search
{
  "size": 20,
  "query": {
    "multi_match": {
      "query": "house",
      "fields": ["title", "plot", "actors", "directors"]
    }
  }
}

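Boosting fields

You can improve search relevancy by "boosting" certain fields. Boosts are multipliers that weigh matches in one field more heavily than matches in other fields. In the following example, a match for john in the title field influences _score twice as much as a match in the plot field and four times as much as a match in the actors or directors fields. The result is that films like John Wick and John Carter are near the top of the search results, and films starring John Travolta are near the bottom.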

POST https://search-my-domain.us-west-1.es.amazonaws.com/movies/_search
{
  "size": 20,
  "query": {
    "multi_match": {
      "query": "john",
      "fields": ["title^4", "plot^2", "actors", "directors"]
    }
  }
}
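The field^boost notation is plain string formatting, so the fields list can be generated from a mapping of field names to multipliers. The following Python sketch assumes a hypothetical helper named multi_match_query and treats a multiplier of 1 as "no explicit boost"; it only builds the request body.

```python
import json


def multi_match_query(text, boosts, size=20):
    """Build a multi_match body from a {field: multiplier} mapping (hypothetical helper)."""
    # A multiplier of 1 is written as the bare field name; anything else uses field^boost.
    fields = [name if boost == 1 else f"{name}^{boost}" for name, boost in boosts.items()]
    return {"size": size, "query": {"multi_match": {"query": text, "fields": fields}}}


body = multi_match_query("john", {"title": 4, "plot": 2, "actors": 1, "directors": 1})
print(json.dumps(body, indent=2))
```

Keeping the multipliers in one mapping makes the relative weights easy to review and tune in a single place.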

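Paginating search results

If you need to display a large number of search results, you can implement pagination using the from parameter. The following request returns results 20-39 of the zero-indexed list of search results: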

POST https://search-my-domain.us-west-1.es.amazonaws.com/movies/_search
{
  "from": 20,
  "size": 20,
  "query": {
    "multi_match": {
      "query": "house",
      "fields": ["title^4", "plot^2", "actors", "directors"]
    }
  }
}
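The from value for any page follows from the page number and the page size. The following Python sketch shows the arithmetic under the assumption that pages are numbered from zero; the helper name page_window is hypothetical.

```python
def page_window(page, page_size=20):
    """Return the from/size pair for a zero-based page number (hypothetical helper)."""
    if page < 0 or page_size <= 0:
        raise ValueError("page must be >= 0 and page_size must be > 0")
    # Page 0 starts at result 0, page 1 at result page_size, and so on.
    return {"from": page * page_size, "size": page_size}


window = page_window(1)
first = window["from"]
last = window["from"] + window["size"] - 1
print(window, f"covers results {first}-{last}")
```

Merging the returned dictionary into a request body reproduces the from and size values of the request above for the second page.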

Search result highlighting

The highlight option tells OpenSearch to return an additional object inside of the hits array if the query matched one or more fields:

POST https://search-my-domain.us-west-1.es.amazonaws.com/movies/_search
{
  "size": 20,
  "query": {
    "multi_match": {
      "query": "house",
      "fields": ["title^4", "plot^2", "actors", "directors"]
    }
  },
  "highlight": {
    "fields": {
      "plot": {}
    }
  }
}

If the query matched the content of the plot field, a hit might look like the following:

{
  "_index": "movies",
  "_type": "movie",
  "_id": "tt0091541",
  "_score": 11.276199,
  "_source": {
    "directors": ["Richard Benjamin"],
    "release_date": "1986-03-26T00:00:00Z",
    "rating": 6,
    "genres": ["Comedy", "Music"],
    "image_url": "http://ia.media-imdb.com/images/M/MV5BMTIzODEzODE2OF5BMl5BanBnXkFtZTcwNjQ3ODcyMQ@@._V1_SX400_.jpg",
    "plot": "A young couple struggles to repair a hopelessly dilapidated house.",
    "title": "The Money Pit",
    "rank": 4095,
    "running_time_secs": 5460,
    "actors": ["Tom Hanks", "Shelley Long", "Alexander Godunov"],
    "year": 1986,
    "id": "tt0091541"
  },
  "highlight": {
    "plot": [
      "A young couple struggles to repair a hopelessly dilapidated <em>house</em>."
    ]
  }
}
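Because the highlight object appears only on hits where a highlighted field matched, client code should treat it as optional. The following Python sketch reads fragments from a trimmed copy of the hit above; the helper name highlighted_fragments is hypothetical.

```python
def highlighted_fragments(hit, field):
    """Return the highlight fragments for a field, or an empty list if there are none."""
    # The highlight key is absent when no highlighted field matched.
    return hit.get("highlight", {}).get(field, [])


# Trimmed copy of the sample hit shown above.
sample_hit = {
    "_id": "tt0091541",
    "_source": {"title": "The Money Pit"},
    "highlight": {
        "plot": [
            "A young couple struggles to repair a hopelessly dilapidated <em>house</em>."
        ]
    },
}

for fragment in highlighted_fragments(sample_hit, "plot"):
    print(sample_hit["_source"]["title"], "->", fragment)
```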

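By default, OpenSearch wraps the matching string in <em> tags, provides up to 100 characters of context around the match, and breaks content into sentences by identifying punctuation marks, spaces, tabs, and line breaks. All of these settings are customizable: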

POST https://search-my-domain.us-west-1.es.amazonaws.com/movies/_search
{
  "size": 20,
  "query": {
    "multi_match": {
      "query": "house",
      "fields": ["title^4", "plot^2", "actors", "directors"]
    }
  },
  "highlight": {
    "fields": {
      "plot": {}
    },
    "pre_tags": "<strong>",
    "post_tags": "</strong>",
    "fragment_size": 200,
    "boundary_chars": ".,!? "
  }
}
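If several queries share the same highlight settings, they can be attached to an existing request body in one step. The following Python sketch mirrors the customized request above; the helper name with_highlight and its default values are assumptions chosen to match that request, not part of any OpenSearch client.

```python
import copy


def with_highlight(body, fields, tag="strong", fragment_size=200, boundary_chars=".,!? "):
    """Return a copy of a request body with highlight settings added (hypothetical helper)."""
    result = copy.deepcopy(body)  # leave the caller's body untouched
    result["highlight"] = {
        "fields": {name: {} for name in fields},
        "pre_tags": f"<{tag}>",
        "post_tags": f"</{tag}>",
        "fragment_size": fragment_size,
        "boundary_chars": boundary_chars,
    }
    return result


base = {
    "size": 20,
    "query": {
        "multi_match": {
            "query": "house",
            "fields": ["title^4", "plot^2", "actors", "directors"],
        }
    },
}
highlighted = with_highlight(base, ["plot"])
print(highlighted["highlight"])
```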

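Count API

If you're not interested in the contents of your documents and just want to know the number of matches, you can use the _count API instead of the _search API. The following request uses the query_string query to identify romantic comedies: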

POST https://search-my-domain.us-west-1.es.amazonaws.com/movies/_count
{
  "query": {
    "query_string": {
      "default_field": "genres",
      "query": "romance AND comedy"
    }
  }
}

A sample response might look like the following:

{
  "count": 564,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  }
}
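A count is only as complete as the shards that answered, so it can be worth checking the _shards section before trusting the number. The following Python sketch parses the sample response above; the helper name match_count and the choice to raise an error on failed shards are assumptions, not required behavior.

```python
import json

# The sample _count response shown above, as it would arrive over HTTP.
SAMPLE_RESPONSE = """
{
  "count": 564,
  "_shards": {"total": 5, "successful": 5, "skipped": 0, "failed": 0}
}
"""


def match_count(response_text):
    """Return the match count, refusing partial results (hypothetical helper)."""
    payload = json.loads(response_text)
    failed = payload["_shards"]["failed"]
    if failed > 0:
        # Some shards did not answer, so the count may be too low.
        raise RuntimeError(f"{failed} shard(s) failed; count may be incomplete")
    return payload["count"]


print(match_count(SAMPLE_RESPONSE))
```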