answerstu

elasticsearch - Ngram Tokenizer on field, not on query

I'm having trouble finding the solution for a use case here. Basically, it's pretty simple: I need to perform a "contains" query, like a SQL LIKE '%...%'. I've seen there is a regexp query, which I actually managed to get working perfectly, but as it seems to scale badly, I'm trying out nGrams. Now, I've played around with them before and know "how they work", but the behaviour isn't what I expect. Basically, I've configured my analyzer with min_gram = 2 and max_gram = 20. Say I index a user called "Christophe". I want the query "Chris" to...
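To make the n-gram behaviour in this question concrete, here is a minimal sketch in plain Python of what a character n-gram tokenizer with a minimum length of 2 and a maximum of 20 would emit. It is not Elasticsearch's implementation; the function name char_ngrams and the second user name "richard" are hypothetical. It also suggests why n-gramming the query text as well tends to match more than intended: the query is split into short grams that occur in unrelated names.

```python
def char_ngrams(text, min_gram=2, max_gram=20):
    """Return every substring of text whose length is between min_gram and max_gram."""
    grams = []
    for start in range(len(text)):
        for size in range(min_gram, max_gram + 1):
            end = start + size
            if end > len(text):
                break  # longer grams would run past the end of the text
            grams.append(text[start:end])
    return grams


indexed = set(char_ngrams("christophe"))
other = set(char_ngrams("richard"))  # hypothetical second user

# Query kept as a single term: only names that contain "chris" match.
print("chris" in indexed)
print("chris" in other)

# Query n-grammed as well: any shared gram counts as a match.
query_grams = set(char_ngrams("chris"))
print(sorted(query_grams & other))
```

Under these assumptions the whole-term lookup finds only "christophe", while the n-grammed query also overlaps with "richard" through short grams, which is the usual argument for applying n-grams at index time only.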

elasticsearch - How to do Incremental/Search as you type full text search on 5 million records sets using Elastic search

I'm using Elasticsearch on a huge dataset of all Wikipedia article names, approximately 5 million of them. The database field name is articlenames.
curl -XPUT "http://localhost:9200/index_wiki_articlenames/" -d '{ "settings":{ "analysis":{ "filter":{ "nGram_filter":{ "type":"edgeNGram", "min_gram":1, "max_gram":20, "token_chars":[ "letter", "digit", "punctuation", "symbol" ] ...
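As a rough illustration of the edge n-gram idea used in the settings above (min_gram 1, max_gram 20), the following plain-Python sketch lists the prefixes that would be indexed for a single token. It is not Elasticsearch code; the function name edge_ngrams is hypothetical.

```python
def edge_ngrams(token, min_gram=1, max_gram=20):
    """Return the prefixes of token with lengths from min_gram up to max_gram."""
    upper = min(len(token), max_gram)
    return [token[:size] for size in range(min_gram, upper + 1)]


# Each prefix becomes a searchable term, so typing "wik" can hit "wiki...".
print(edge_ngrams("wiki"))
```

Because every prefix is stored as its own term, search-as-you-type becomes an exact term lookup at query time, at the cost of a larger index.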

nest - Unwind in ElasticSearch

I currently have the index below in Elasticsearch:
PUT my_index { "mappings": { "doc": { "properties": { "type" : { "type": "text", "fielddata": true }, "id" : { "type": "text", "fielddata": true }, "nestedTypes": { "type": "nested", "properties": { "nestedTypeId":{ "type": "integer" }, "nestedType":{ "type": "text", "fielddata": true }, "isLead":{ ...

elasticsearch composite aggs with nested object

I have an object with a nested field:
"parameters": { "type": "nested", "properties": { "id": { "type": "integer" }, "values": { "type": "keyword" } } }
I'm trying this aggregation:
GET places/place/_search?size=0 { "query": { "match_all": {} }, "aggs": { "parameters": { "nested": { "path": "parameters" }, "aggs": { "parameters_cnt_i": { "terms": { "field": "parameters.id", "size": 100 }, "aggs": {...

replication - How does Elasticsearch recover from a quorum that is not unanimous

When using replication with a quorum, Elasticsearch allows writes to fail for some (a small number of) replica shards. Writing to a replica might fail only because it is temporarily unavailable (because of a temporary network partition, for example). When that shard becomes available again (the network is fixed, for example), what happens? Does Elasticsearch automatically detect that the shard is out of date (stale, inconsistent with the primary shard) and update it in the background? Or must you perform a manual operation? When the shard return...

elasticsearch - What are the drawbacks of using a Lucene directory as a primary file store?

I want to use a Lucene MMapDirectory as a primary file store. Each file would be stored in a separate document as a byte array in a StoredField. All file properties that should be searchable, like file name, size, etc., would be stored in indexable fields in the same document. My questions would be: What are the drawbacks of using Lucene directories for storing files, especially with regard to indexing and search performance and memory (RAM) consumption? If this is not a "no-go", is there a better/faster way of storing files in the directory than ...

Elasticsearch filter with multi_match

I'm trying to write a query in Elasticsearch where I combine multi_match with a filter for one ID or a number of IDs. This is what I have so far:
{ "query": { "bool": { "must": { "multi_match": { "query": "Kasper", "fields": ["name", "first_name", "last_name"] } }, "filter": { "term": { "user_id": "ea7528f0-1b8a-11e8-a492-13e39bbd17cb" } } } } }
The "must" part of the query...
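The excerpt is cut off before the actual problem is stated, so the following is only a sketch of one way to cover "one ID or a number of IDs": build the request body in Python and use a `terms` filter, which accepts a list of values. The helper name build_query is hypothetical, and the sketch assumes user_id is indexed as an exact (not analyzed) value, which the excerpt does not confirm.

```python
import json


def build_query(text, fields, user_ids):
    """Build a bool query: multi_match for scoring, terms filter for the IDs."""
    return {
        "query": {
            "bool": {
                "must": {"multi_match": {"query": text, "fields": list(fields)}},
                # "terms" takes a list, so one ID and many IDs share one code path
                "filter": {"terms": {"user_id": list(user_ids)}},
            }
        }
    }


body = build_query(
    "Kasper",
    ["name", "first_name", "last_name"],
    ["ea7528f0-1b8a-11e8-a492-13e39bbd17cb"],
)
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a search request; the filter clause restricts the hits without contributing to the score.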

elasticsearch - Append to nested object field

I want to append to my Elasticsearch nested object while updating:
{ "_index": "feed", "_type": "feed", "_id": "41", "_version": 1, "found": true, "_source": { "id": 1, "name": "Trip to LA", "stats": { "likes": 40, "comments": 50 } } }
Here is the query:
POST feed/feed/41/_update { "script": { "source" : "ctx._source.stats.add(params.abc)", "params": { "abc": { "likes":1 } } } }...

groovy - Elasticsearch data does not match mapping

I am migrating Elasticsearch production data from v1.4.3 to v5.5, for which I am using reindex. When I try to reindex the old ES index into the new ES index, the reindexing fails with an exception. Failed Reason: mapper [THROUGHPUT_ROWS_PER_SEC] cannot be changed from type [long] to [float]. Failed Type: illegal_argument_exception. ES mapping for the task_history index in ES v1.4.3:
{ "task_history": { "mappings": { "task_run_hist": { "_all": { "enabled": false }, "_routing": { "required": true,...

groovy - Elasticsearch: storing tags without duplicates

I want to store tags in my documents in a way that I won't have duplicates. My documents have a Tags field defined as:
..."Tags": { "type": "string" }...
I add the tags to its Tags field from Python:
es.update(index=ES_INDEX, doc_type=ES_DOC_TYPE, id=user_id, body=doc)
My update document:
doc = { "script": { "lang": "groovy", "inline": "ctx._source.Tags.addAll(tags)", "params": { "tags": [ "c#", "winforms", "type-conversion", "decimal", "opacity" ] } } }
This works, but the tags are potentially...
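The excerpt is truncated, but the stated goal is a tag list without duplicates. As a sketch of that merge logic only, here is an order-preserving union in plain Python. The helper name merge_tags is hypothetical, and this is a client-side illustration, not the Groovy update script from the question; merging on the client and writing the result back can race with concurrent updates, which a server-side script avoids.

```python
def merge_tags(existing, new_tags):
    """Return existing followed by new_tags, with duplicates removed, order kept."""
    # dict.fromkeys keeps the first occurrence of each key in insertion order
    return list(dict.fromkeys(list(existing) + list(new_tags)))


current = ["c#", "winforms"]
incoming = ["winforms", "type-conversion", "decimal", "c#"]
print(merge_tags(current, incoming))
```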

How to search for a part of a word with ElasticSearch

I've recently started using Elasticsearch and I can't seem to make it search for part of a word. Example: I have three documents from my CouchDB indexed in Elasticsearch:
{ "_id" : "1", "name" : "John Doeman", "function" : "Janitor" }
{ "_id" : "2", "name" : "Jane Doewoman", "function" : "Teacher" }
{ "_id" : "3", "name" : "Jimmy Jackal", "function" : "Student" }
So now I want to search for all documents containing "Doe":
curl http://localhost:9200/my_idx/my_type/_search?q=Doe
That doesn't return any hits. But if I search for
curl http://local...

nest - Favor exact matches over ngram matches in ElasticSearch when mapping

I have partial matching of words working with ngrams. How can I modify the mapping to always favor exact matches over ngram tokens? I do not want to modify the query. One search box will search multiple types, each with their own fields. For example, let's say I'm searching job titles: one person has a title of "field engineer", the other a title of "engine technician". If a user searches for "engine", I'd want ES to return the latter as more relevant. I'm using this mapping almost verbatim: https://stackoverflow.com/a/19874785/978622-Exceptio...

How to use wildcards with ngrams in ElasticSearch

Is it possible to combine wildcard matches and ngrams in Elasticsearch? I'm already using ngrams of length 3-11. As a very small example, I have records C1239123 and C1230123. The user wants to return both of these. This is the only info they know: C123?12
The above case won't work on my full match analyzer because the query is missing the 3 on the end. I was under the impression wildcard matches would work out of the box, but if I perform a search similar to the above I get gibberish. Query:
.Search<ElasticSearchProject>(a => a .Size...
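To illustrate why C123?12 misses both records on a whole-term match, here is a small sketch using Python's fnmatch as a stand-in for wildcard semantics, where ? stands for exactly one character and * for any run of characters. It is not Elasticsearch code; the helper name wildcard_match and the third record are hypothetical.

```python
from fnmatch import fnmatchcase

records = ["C1239123", "C1230123", "C1240123"]  # third record is hypothetical


def wildcard_match(pattern, values):
    """Return the values that the pattern matches in full (case-sensitive)."""
    return [value for value in values if fnmatchcase(value, pattern)]


# Seven-character pattern against eight-character records: nothing matches.
print(wildcard_match("C123?12", records))

# A trailing * absorbs the missing final character.
print(wildcard_match("C123?12*", records))
```

In this stand-in the pattern has to cover the whole value, so the trailing * is what lets both target records through while the unrelated third record is still excluded.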

n gram - Elasticsearch Auto complete using ngram

I'm fairly new to Elasticsearch and I have a question about implementing an autocomplete feature using NGram. From what I've read online, I understand that the NGram implementation allows a flexible solution (matching from the middle, highlighting, etc.) compared to using the built-in completion suggesters. Thus, I have the following field mapping for one of my index types:
"suggest_keywords": { "type": "string", "analyzer": "nGram_analyzer", "search_analyzer": "whitespace_analyzer" },
nGram analyzer config:
"nGram_analyzer": { "filter": [ "low...