Character Filters

Character filters preprocess the string of characters before it is passed to the tokenizer. A character filter may be used to strip out HTML markup, or to convert “&” characters to the word “and”.

Mapping Char Filter #

mapping

Replaces characters in the analyzed text according to the given mappings.

Settings:

  • mappings
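As a rough illustration of the mappings setting, the index settings sketch below defines a mapping char filter that turns “&” into “and” and attaches it to a custom analyzer. The names my_mapping and my_analyzer are hypothetical, and the choice of the standard tokenizer is only an assumption made for the example.

```json
{
  "settings": {
    "analysis": {
      "char_filter": {
        "my_mapping": {
          "type": "mapping",
          "mappings": ["& => and"]
        }
      },
      "analyzer": {
        "my_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "char_filter": ["my_mapping"]
        }
      }
    }
  }
}
```

Each entry in mappings is written as "key => value". The replacement happens before tokenization, so the tokenizer sees the word “and” and never the “&” character.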

HTML Strip Char Filter #

html_strip

Strips out HTML elements from an analyzed text.
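Since html_strip needs no settings in its basic form, it can presumably be referenced by name directly in an analyzer's char_filter list. The sketch below assumes a hypothetical custom analyzer called my_html_analyzer that uses the standard tokenizer.

```json
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_html_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "char_filter": ["html_strip"]
        }
      }
    }
  }
}
```

With a configuration along these lines, an input such as "<p>Hello <b>world</b></p>" should reach the tokenizer with its tags stripped out.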

Pattern Replace Char Filter #

pattern_replace

Allows the use of a regex to manipulate the characters in a string before analysis. The regular expression is defined using the pattern parameter, and the replacement string can be provided using the replacement parameter.

Settings:

  • pattern
  • replacement
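A minimal sketch of how pattern and replacement fit together: the settings below define a pattern_replace char filter that is intended to replace a hyphen between digits with an underscore, so that a value like "123-456-789" is not split at the hyphens. The filter and analyzer names (digits_hyphen_to_underscore, my_pattern_analyzer) are hypothetical, and the pattern assumes Java regular expression syntax, with backslashes doubled for JSON.

```json
{
  "settings": {
    "analysis": {
      "char_filter": {
        "digits_hyphen_to_underscore": {
          "type": "pattern_replace",
          "pattern": "(\\d+)-(?=\\d)",
          "replacement": "$1_"
        }
      },
      "analyzer": {
        "my_pattern_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "char_filter": ["digits_hyphen_to_underscore"]
        }
      }
    }
  }
}
```

In the replacement string, $1 refers to the first capture group of the pattern, here the run of digits before the hyphen.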