Test an analyzeredit
The analyze
API is an invaluable tool for viewing the
terms produced by an analyzer. A built-in analyzer can be specified inline in
the request:
POST _analyze { "analyzer": "whitespace", "text": "The quick brown fox." }
The API returns the following response:
{ "tokens": [ { "token": "The", "start_offset": 0, "end_offset": 3, "type": "word", "position": 0 }, { "token": "quick", "start_offset": 4, "end_offset": 9, "type": "word", "position": 1 }, { "token": "brown", "start_offset": 10, "end_offset": 15, "type": "word", "position": 2 }, { "token": "fox.", "start_offset": 16, "end_offset": 20, "type": "word", "position": 3 } ] }
You can also test combinations of:
- A tokenizer
- Zero or more token filters
- Zero or more character filters
POST _analyze { "tokenizer": "standard", "filter": [ "lowercase", "asciifolding" ], "text": "Is this déja vu?" }
The API returns the following response:
{ "tokens": [ { "token": "is", "start_offset": 0, "end_offset": 2, "type": "<ALPHANUM>", "position": 0 }, { "token": "this", "start_offset": 3, "end_offset": 7, "type": "<ALPHANUM>", "position": 1 }, { "token": "deja", "start_offset": 8, "end_offset": 12, "type": "<ALPHANUM>", "position": 2 }, { "token": "vu", "start_offset": 13, "end_offset": 15, "type": "<ALPHANUM>", "position": 3 } ] }
Alternatively, a custom
analyzer can be
referred to when running the analyze
API on a specific index:
PUT my-index-000001 { "settings": { "analysis": { "analyzer": { "std_folded": { "type": "custom", "tokenizer": "standard", "filter": [ "lowercase", "asciifolding" ] } } } }, "mappings": { "properties": { "my_text": { "type": "text", "analyzer": "std_folded" } } } } GET my-index-000001/_analyze { "analyzer": "std_folded", "text": "Is this déjà vu?" } GET my-index-000001/_analyze { "field": "my_text", "text": "Is this déjà vu?" }
The API returns the following response:
{ "tokens": [ { "token": "is", "start_offset": 0, "end_offset": 2, "type": "<ALPHANUM>", "position": 0 }, { "token": "this", "start_offset": 3, "end_offset": 7, "type": "<ALPHANUM>", "position": 1 }, { "token": "deja", "start_offset": 8, "end_offset": 12, "type": "<ALPHANUM>", "position": 2 }, { "token": "vu", "start_offset": 13, "end_offset": 15, "type": "<ALPHANUM>", "position": 3 } ] }