Lucene Analyzer

Configure the CMS to use a different Lucene Analyzer (Lucene is the text search engine), for example one for Dutch, German or French instead of English.

Introduction

Hippo Repository uses org.hippoecm.repository.query.lucene.StandardHippoAnalyzer as the default Lucene Analyzer for stored content. This analyzer strips stopwords for English, German, Dutch, French, Spanish and Brazilian Portuguese. It also applies an ISO Latin 1 accent filter, which replaces accented letters with their unaccented equivalents, for example ç with c and ï with i.

You can configure custom language analyzers that, for example, also add stemming. The side effect is that this breaks wildcard searching. Explaining why is beyond the scope of this page, as it involves general concepts of inverted indexes such as the one Lucene uses. We advise sticking with the StandardHippoAnalyzer if you want to avoid wildcard searching issues.

Add dependency

To configure a different Lucene Analyzer, first add the lucene-analyzers dependency to the pom.xml of each subproject that contains its own repository. This is usually the "CMS" subproject. If your website and CMS use their own repositories, add it to every project that needs the different analyzer.

<dependency>
  <groupId>org.apache.lucene</groupId>
  <artifactId>lucene-analyzers</artifactId>
  <!-- check the version for your current Hippo Repository -->
  <version>2.3.2</version>
</dependency>

Modify the Analyzer class

The Analyzer class is configured in the repository.xml file. If you do not have one yet, you can copy it from cms/target/work/webapp/WEB-INF/classes/org/hippoecm/repository/repository.xml after you have installed your CMS subproject.

In a running project it can be read from a filesystem location or from the application package. If you want to keep this file in your application package, copy it to src/main/resources/org/hippoecm/repository/repository.xml in your Maven project.

Change the value of

<param name="analyzer" value="org.apache.lucene.analysis.standard.StandardAnalyzer"/>

to the fully qualified class name of your analyzer.
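
As an illustration, the line below selects a Dutch analyzer. It is a sketch that assumes the lucene-analyzers version on your classpath provides org.apache.lucene.analysis.nl.DutchAnalyzer with a no-argument constructor, as the 2.x releases do. Verify the class name and constructor against the version that matches your Hippo Repository.

<!-- Example only: assumes DutchAnalyzer is available in your lucene-analyzers version -->
<param name="analyzer" value="org.apache.lucene.analysis.nl.DutchAnalyzer"/>

Keep in mind that a language analyzer like this one applies stemming, so the wildcard searching caveat from the introduction applies.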

Optionally configure the location of repository.xml

If the repository.xml is not in the default location, also modify the web.xml:

<!-- Repository -->
<servlet>
  <servlet-name>Repository</servlet-name>
  <servlet-class>org.hippoecm.repository.RepositoryServlet</servlet-class>
  <init-param>
    <param-name>repository-config</param-name>
    <param-value>repository.xml</param-value>
    <description>The location of the repository configuration file.
      Unless the location starts with file://, the location is
      retrieved from within the application package as resource
    </description>
  </init-param>
  <load-on-startup>4</load-on-startup>
</servlet>
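
To load the configuration from the filesystem instead of the application package, the description above implies that the value should start with file://. The fragment below is a sketch of such an init-param. The path /opt/cms/conf/repository.xml is a hypothetical location chosen for illustration, so replace it with the actual location on your server.

<init-param>
  <param-name>repository-config</param-name>
  <!-- Hypothetical filesystem path; adjust to your environment -->
  <param-value>file:///opt/cms/conf/repository.xml</param-value>
</init-param>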
