Will & Skill Developers

Thoughts, snippets and ideas from the team at Will & Skill AB, Stockholm.

Faisal Mahmud
Author

“The mind is not a vessel to be filled, but a fire to be kindled.” ― Plutarch


How to enable Swedish funny characters åäö when using ElasticSearch 2.3.x

Faisal Mahmud

I ran into an issue when indexing and searching for strings containing the Swedish characters åäö. Fortunately there is a simple fix for this.

You can read more about Unicode Character Folding at https://www.elastic.co/guide/en/elasticsearch/guide/current/character-folding.html

I had some problems when I tried to make the changes as described in the article above, but I found a way around them.

The necessary steps are:

  1. Close <MYINDEX> with _close
  2. Make changes to <MYINDEX> via a PUT call
  3. Open <MYINDEX> with _open
  4. Reindex your data

NOTE: Remember to change <MYINDEX> to the actual name of the index you intend to modify!
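
The close, update and open steps above can also be scripted. The following is a minimal sketch using only the Python standard library. It assumes a local node on http://localhost:9200; the function names plan_settings_update and run_plan and the index name myindex are hypothetical, and step 4 (reindexing) is left to whatever tool populates your index.

```python
import json
import urllib.request


def plan_settings_update(index, analysis, host="http://localhost:9200"):
    """Return the close / update-settings / open calls as (method, url, body) tuples."""
    base = f"{host}/{index}"
    body = json.dumps({"settings": {"analysis": analysis}}).encode("utf-8")
    return [
        ("POST", f"{base}/_close", None),    # step 1: close the index
        ("PUT", f"{base}/_settings", body),  # step 2: change the analysis settings
        ("POST", f"{base}/_open", None),     # step 3: open the index again
    ]


def run_plan(plan):
    """Send each call in order and stop at the first one that is not acknowledged."""
    for method, url, body in plan:
        req = urllib.request.Request(url, data=body, method=method)
        if body is not None:
            req.add_header("Content-Type", "application/json")
        with urllib.request.urlopen(req) as resp:
            reply = json.loads(resp.read().decode("utf-8"))
        if not reply.get("acknowledged"):
            raise RuntimeError(f"{method} {url} was not acknowledged: {reply}")
```

Calling run_plan(plan_settings_update("myindex", analysis)) with the analysis block from step 2 would send the three requests in order. Keeping the plan separate from the sending makes it easy to print and inspect the calls before touching the server.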

Step 1

curl -X POST 'http://localhost:9200/<MYINDEX>/_close?pretty=1'

Step 2

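The snippet below is adapted from the article linked above. The unicodeSetFilter tells icu_folding which characters it may fold. The leading ^ excludes åäö and ÅÄÖ, so they are left alone while other characters are still folded. Adjust the set if you want to protect other characters. Note that icu_folding and icu_tokenizer come from the ICU analysis plugin, which must be installed.

PUT /<MYINDEX>/_settings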
{
  "settings": {
    "analysis": {
      "filter": {
        "swedish_folding": { 
          "type": "icu_folding",
          "unicodeSetFilter": "[^åäöÅÄÖ]"
        }
      },
      "analyzer": {
        "swedish_analyzer": { 
          "tokenizer": "icu_tokenizer",
          "filter":  [ "swedish_folding", "lowercase" ]
        }
      }
    }
  }
}

This is how it looks if you use curl:

curl -X PUT -H "Content-Type: application/json" -d '{"settings": { "analysis": { "filter": { "swedish_folding": { "type": "icu_folding", "unicodeSetFilter": "[^åäöÅÄÖ]"}}, "analyzer": {"swedish_analyzer": { "tokenizer": "icu_tokenizer", "filter":  [ "swedish_folding", "lowercase" ]}}}}}' "localhost:9200/<MYINDEX>/_settings?pretty=1"  
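
To get a feel for what excluding åäöÅÄÖ from folding means, here is a rough illustration in plain Python. It is only an approximation: the real icu_folding filter does considerably more than strip diacritics, and the helper names strip_marks and fold are made up for this example.

```python
import unicodedata

# Characters the filter "[^åäöÅÄÖ]" excludes from folding.
PROTECTED = frozenset("åäöÅÄÖ")


def strip_marks(ch):
    """Decompose one character and drop its combining marks (é -> e)."""
    decomposed = unicodedata.normalize("NFD", ch)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def fold(text, protected=frozenset()):
    """Lowercase text and strip diacritics, except on protected characters."""
    text = unicodedata.normalize("NFC", text)  # make sure å is a single code point
    return "".join(
        ch.lower() if ch in protected else strip_marks(ch).lower()
        for ch in text
    )


print(fold("Räksmörgås på Café"))             # raksmorgas pa cafe
print(fold("Räksmörgås på Café", PROTECTED))  # räksmörgås på cafe
```

Without the exclusion, a search for "raksmorgas" and one for "räksmörgås" would hit the same terms. With it, the Swedish letters stay distinct while accents such as é are still folded.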

Step 3

curl -X POST 'http://localhost:9200/<MYINDEX>/_open?pretty=1'  

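Make sure to pay attention to the output from the ElasticSearch server after each call! A successful call returns "acknowledged": true. If you get an error instead, fix it before moving on to the next step.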
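
One way to check that the new analyzer is active once the index is open again is the _analyze API. The sketch below only builds the request and parses a response. The helper names build_analyze_request and tokens_from_response are hypothetical, and the host, the index name myindex and the sample text are assumptions. If the filter works, the returned tokens should keep å, ä and ö instead of folding them to a and o.

```python
import json
import urllib.request


def build_analyze_request(index, text, analyzer="swedish_analyzer",
                          host="http://localhost:9200"):
    """Build a request that runs `text` through `analyzer` on the given index."""
    body = json.dumps({"analyzer": analyzer, "text": text}).encode("utf-8")
    return urllib.request.Request(
        f"{host}/{index}/_analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def tokens_from_response(raw):
    """Extract the token strings from an _analyze response body."""
    return [t["token"] for t in json.loads(raw)["tokens"]]


# Usage against a running node (not executed here):
#   req = build_analyze_request("myindex", "Räksmörgås")
#   with urllib.request.urlopen(req) as resp:
#       print(tokens_from_response(resp.read().decode("utf-8")))
```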

Step 4

I am using a Django library named django-elasticsearch-dsl (https://github.com/sabricot/django-elasticsearch-dsl), so I run the command below to rebuild my index.

$ python manage.py search_index --rebuild