This article covers Hippo CMS version 7.7. There's an updated version available that covers our most recent release.

Faceted Navigation Configuration 

This document describes the faceted navigation that is supported by Hippo CMS. This description mainly focuses on configuration options, for example how to configure ranges.

General:

A faceted classification system allows the assignment of multiple classifications to an object, enabling the classifications to be ordered in multiple ways, rather than in a single, pre-determined, taxonomic order.

Hippo Repository Faceted navigation:

Faceted navigation can be done on the metadata (facets) of documents. In JCR terms, we do faceted navigation on the properties of nodes. At the moment, faceted navigation takes place on direct properties of documents, not on properties of descendant nodes.

The exception is free text search: this also searches the properties of all descendant nodes of the document (though you can also restrict the search to a single property).

Also see How to structure your data and content

Can't wait to see it in action:

Deploy the latest HST2 testsuite (http://svn.onehippo.org/repos/hippo/hippo-cms7/testsuite/trunk) cms war and site war, or check out and start up the demosite/cms and demosite/site. Then visit the URL /site/preview and browse to faceted.

Currently supported features:

  1. We support faceted navigation on a subset of the repository. You do this by defining one or more 'root folders' below which the faceted navigation is created. Access through a preview or live entry point is accounted for (in live, only published documents count).

  2. We support multivalued properties. In other words, you can have a property author that has two values, 'admin' and 'siteuser'. Such a document can be found in faceted navigation
    below author 'admin' as well as below author 'siteuser'. Even multivalued dates (with ranges) are supported.

  3. Faceted navigation is 'free drill path' based: When you define that the navigation should be based on 'brand' and 'product', you can start browsing either by brand or by product.

  4. You can configure a 'guided drill path': for example, the facet month is only available after facet year has been chosen.

  5. Configurable limit of facet values per facet.

  6. Configurable sorting (count, facetvalue, by config) per facet.

  7. Configurable limit of the documents and sorting of the documents in the resultset (default resultset ordering is Lucene scoring of best matching documents in case of free text search)

  8. Faceted navigation supports resolution-based browsing (see later) for date properties. Thus you can navigate through year / month / day as if there were three facets (year, month and day), though there is only a single date property. Note that month numbers are 0-based, thus 0-11 where 11 == December; also see java.util.Calendar. When browsing on the date without resolution, the date values are presented in milliseconds since epoch. To get the corresponding java.util.Calendar, create one with Calendar.getInstance() and call setTimeInMillis(long value) on it.

  9. Faceted navigation supports ranges on Strings, Doubles, Longs and resolution (for example 'this year' , 'this month') based on Dates.

  10. Possibility to configure filters: for example faceted navigation only on documents of type news, or only on documents that contain the word 'Hippo', or only on documents that contain some word in the title. Or a combination of these filters.

  11. Free text search support (Lucene QueryParser syntax)

  12. New XPath queries as in jsr-170

All combinations mentioned above are supported, thus multivalued date properties with ranges in combination with resolution browsing and normal faceted browsing are supported, also combined with free text search and even XPath queries. As a side note, faceted navigation on a single multivalued property, such as 'hippostd:tags', can be used directly for tagged browsing (show me the 10 most used tags, then choose one tag, and below that one, show me the 10 most used tags among the documents that already have the selected tag). Thus, tagged browsing is a subset of our faceted navigation and is possible out-of-the-box. Tagged browsing in combination with faceted navigation is thus also possible, merely by configuration.

The resultset from faceted navigation is the set of nodes that match the selected facet-value combinations. The resultset can be configured to be sorted on one or more properties, where per property you can specify descending or ascending.

Configuration options: A step by step guide from simple to advanced

As of this writing, faceted navigation configuration is done in the ecm console interface.

a) Simplest minimal form:

(a1): create a new node of type 'hippofacnav:facetnavigation' with a name of your choice.
(a2.1): add a new property named 'hippo:docbase'. Its value is the uuid of the folder that is the 'root folder' below which the documents should be added to the faceted navigation.
(a2.2): look for the uuid in the hippo:paths property of the folder of your choice.
(a3): add a multivalued string property called 'hippofacnav:facets' and add the property names you want to have faceted navigation on. For example:

  • myproject:brand

  • myproject:product

  • jcr:primaryType

This is enough for the simplest form. Write the changes. To update the tree, you need to click in the browser navigation again to reload it; this is (currently) needed whenever you modify the
'hippofacnav:facetnavigation' node and write the changes.

Now you can navigate the faceted view. You will see that the properties 'myproject:brand' and 'myproject:product' are the starting points, and that these names return in deeper
levels as well. For frontend simplicity, a chosen facet-value pair is repeated one more time as descendant nodes, and then it stops (to avoid recursion).

Note that the property jcr:primaryType is a built-in facet which is always available.
Note: if you want more than one 'root folder', you can use a comma-separated list of docbases.

b) Showing pretty names:

(b1): add the optional multivalued string property 'hippofacnav:facetnodenames'. Here you can add the 'pretty' facet names. In this example thus:

  • brand

  • product

  • doctype

Note: when using 'hippofacnav:facetnodenames', the configured number of values must be equal to the number of values in 'hippofacnav:facets'.

Now, in the faceted view, you see nodes named 'brand' instead of 'myproject:brand'.

c) Limiting the number of documents in the resultset

(c1): add the optional long property 'hippofacnav:limit' and fill in a long value. This limits the number of results in the resultset. This is for performance and memory reasons,
as the resultset can easily contain tens or hundreds of thousands of results. You most likely need no more than some number X of them, so set this as the limit. Not doing so can use too much memory and cpu unnecessarily.

d) Sorting the documents in the resultset

(d1): when using a limit on the resultset, you might first want to sort the resultset on one or more properties (facets). You can sort on multiple properties (for when the first
is not distinctive enough). Add the multivalued string property 'hippofacnav:sortby', for example:

  • myproject:startdate

  • myproject:enddate

The default sort order is 'ascending'.

(d2): When using 'hippofacnav:sortby', you can change the default 'ascending' per property by adding the multivalued string property 'hippofacnav:sortorder'.
The only allowed values are 'descending' and 'ascending'. For example:

  • descending

  • descending

Note: when using 'hippofacnav:sortorder', the configured number of values must be equal to the number of values in 'hippofacnav:sortby'.
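The combined effect of 'hippofacnav:sortby', 'hippofacnav:sortorder' and 'hippofacnav:limit' can be pictured as a chained comparison followed by a cut-off. The sketch below is plain Java and only illustrates that idea; it is not the repository's implementation. The class Doc and its fields are hypothetical stand-ins for documents carrying myproject:startdate and myproject:enddate, both sorted 'descending' as in the example above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class ResultSetSortSketch {

    /** Hypothetical stand-in for a document with two date properties (millis since epoch). */
    static final class Doc {
        final String name;
        final long startDate;
        final long endDate;

        Doc(String name, long startDate, long endDate) {
            this.name = name;
            this.startDate = startDate;
            this.endDate = endDate;
        }
    }

    /** Sorts on startDate descending, then endDate descending, and keeps at most 'limit' names. */
    static List<String> sortAndLimit(List<Doc> docs, int limit) {
        Comparator<Doc> byStart = Comparator.comparingLong((Doc d) -> d.startDate).reversed();
        Comparator<Doc> byEnd = Comparator.comparingLong((Doc d) -> d.endDate).reversed();

        List<Doc> sorted = new ArrayList<>(docs);
        // The second property only decides when the first one is not distinctive.
        sorted.sort(byStart.thenComparing(byEnd));

        List<String> names = new ArrayList<>();
        for (Doc d : sorted.subList(0, Math.min(limit, sorted.size()))) {
            names.add(d.name);
        }
        return names;
    }

    public static void main(String[] args) {
        List<Doc> docs = Arrays.asList(
                new Doc("a", 10L, 20L), new Doc("b", 30L, 40L), new Doc("c", 30L, 50L));
        System.out.println(sortAndLimit(docs, 2));
    }
}
```

Documents "b" and "c" share the same start date, so the end date decides their relative order; the limit is applied only after sorting.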

e) Date faceted navigation

(e1) You can use date properties in a 'resolution'-based way. Normally, when specifying a date facet, the faceted view will return the facet values
in milliseconds since epoch. What if you want to navigate first through the year, then the month, and then the day? You can achieve this by specifying the facet name, then the delimiter '$', and then the resolution.
Thus suppose your current configuration is:

hippofacnav:facets

  • myproject:startdate

hippofacnav:facetnodenames

  • startdate

Change this into:

hippofacnav:facets

  • myproject:startdate$year

  • myproject:startdate$month

  • myproject:startdate$day

hippofacnav:facetnodenames

  • year

  • month

  • day

Now the single property 'myproject:startdate' is used for year, month and day. We also support a predefined order in which the facets become available: for example, only show the month after year has been chosen. See Guided drill paths.

Supported date resolutions are

  • year

  • month

  • week

  • dayofyear

  • dayofweek

  • day ( = day of month)

  • hour

  • minute

  • second

Note: The 'month' resolution is 0-based: thus January is number 0 and December is number 11. This is in line with java.util.Calendar.
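On the frontend, the two date conventions mentioned in this section (values in milliseconds since epoch, and 0-based months) usually need a small conversion step. The following is a minimal sketch using only java.util.Calendar; the class and method names are made up for illustration, and the time zone is passed in explicitly because the text does not say which zone the repository uses.

```java
import java.util.Calendar;
import java.util.Locale;
import java.util.TimeZone;

final class FacetDateValueSketch {

    /** Turns a facet value in milliseconds since epoch into a Calendar in the given zone. */
    static Calendar toCalendar(long millisSinceEpoch, TimeZone zone) {
        // Locale.ROOT is used so that a Gregorian calendar is returned.
        Calendar cal = Calendar.getInstance(zone, Locale.ROOT);
        cal.setTimeInMillis(millisSinceEpoch);
        return cal;
    }

    /** Converts a 0-based 'month' facet value (0 = January) to the usual 1-12 numbering. */
    static int toHumanMonth(int zeroBasedMonth) {
        if (zeroBasedMonth < 0 || zeroBasedMonth > 11) {
            throw new IllegalArgumentException("month facet value must be 0-11: " + zeroBasedMonth);
        }
        return zeroBasedMonth + 1;
    }

    public static void main(String[] args) {
        Calendar cal = toCalendar(0L, TimeZone.getTimeZone("UTC"));
        System.out.println(cal.get(Calendar.YEAR) + "-" + toHumanMonth(cal.get(Calendar.MONTH)));
    }
}
```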

f) Ranges faceted navigation

Faceted navigation supports dynamic range navigation. For example, we consider a facet like 'this week' dynamic, as the range changes automatically every week, whereas a facet like myproject:startdate$year never changes and is in this sense static.

Dynamic ranges are dynamic in two ways:

  1. On the 'hippofacnav:facetnavigation' you can configure the ranges. You can have multiple faceted navigations, with different ranges, you can have one and the same property added multiple times with different ranges, etc etc.

  2. When configuring ranges for dates, the 'ranges' themselves shift in time. For example, a range for 'today' will have shifted to another date range tomorrow. This obviously only applies to ranges on dates, and not to ranges on, for example, properties containing a double (say price), a long or even a String.

Range configuration uses a notation in JSON array format after a delimiter '$', see http://json-lib.sourceforge.net. So, for example, this would be a range configuration for myproject:startdate:

myproject:startdate$[{name:'today', resolution:'day', begin:0, end:1},
{name:'yesterday', resolution:'day', begin:-1, end:0}]

The part after the $ is JSON format for an array.

(f1) The range configuration for Dates, Doubles and Longs is equivalent:

Supported attributes:

  • name (mandatory ; type:String)

  • resolution (mandatory ; type:String, options: 'long', 'double', 'year', 'month', 'week', 'day', 'hour')

  • begin (optional ; type:double, inclusive, default Double.NEGATIVE_INFINITY)

  • end (optional ; type:double, exclusive, default Double.POSITIVE_INFINITY)

Note: begin = INCLUSIVE, end = EXCLUSIVE
Note: the 'name' is the name of the node that contains all the items in the configured range

A date range example:

Suppose again you have the current configuration:

hippofacnav:facets

  • myproject:startdate

hippofacnav:facetnodenames

  • startdate

Now, you would like to navigate the tree as follows:

  • today

  • yesterday

  • this week

  • this month

  • this year

  • before this year

Let's start with today. Suppose today is the 24th of December. We want to navigate everything of today. Therefore, the range format is:

{name:'today', resolution:'day', begin:0, end:1}

If you wanted to navigate not today, but the last 24 hours, you would have to configure:

{name:'last 24 hours', resolution:'hour', begin:-23, end:1}

Now, yesterday is:

{name:'yesterday', resolution:'day', begin:-1, end:0}

And this week is:

{name:'this week', resolution:'week', begin:0, end:1}

etc etc.

So, note the following:

If you would like a range of the last three months, you have to ask yourself: do I want the last three calendar months, as in October, November and December, or do I want the
last ~ 91 days? For the first, you use

{name:'last 3 months', resolution:'month', begin:-2, end:1}

and for the latter you use

{name:'last 3 months', resolution:'day', begin:-90, end:1}

Finally, you can configure 'before this year' as follows:

{name:'before this year', resolution:'year', end:0}

Note that there is no begin, so the range is open-ended towards the past.
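One way to read the date range examples above is: take the start of the current unit (day, hour, week, month or year) and shift it by 'begin' and 'end' units. The sketch below works that reading out with java.time. It is an interpretation inferred from the examples, not the repository's code; the class and method names are hypothetical, and treating Monday as the first day of the week is an assumption.

```java
import java.time.DayOfWeek;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.temporal.ChronoUnit;
import java.time.temporal.TemporalAdjusters;

final class DateRangeSketch {

    /** Start of the year, month, week, day or hour that contains 'now'. */
    static ZonedDateTime floor(ZonedDateTime now, ChronoUnit resolution) {
        switch (resolution) {
            case YEARS:
                return now.withDayOfYear(1).truncatedTo(ChronoUnit.DAYS);
            case MONTHS:
                return now.withDayOfMonth(1).truncatedTo(ChronoUnit.DAYS);
            case WEEKS:
                // Assumption: weeks start on Monday.
                return now.with(TemporalAdjusters.previousOrSame(DayOfWeek.MONDAY))
                          .truncatedTo(ChronoUnit.DAYS);
            case DAYS:
            case HOURS:
                return now.truncatedTo(resolution);
            default:
                throw new IllegalArgumentException("Unsupported resolution: " + resolution);
        }
    }

    /** Concrete instant for a 'begin' (inclusive) or 'end' (exclusive) offset. */
    static ZonedDateTime bound(ZonedDateTime now, ChronoUnit resolution, long offset) {
        return floor(now, resolution).plus(offset, resolution);
    }

    public static void main(String[] args) {
        ZonedDateTime now = ZonedDateTime.of(2010, 12, 24, 15, 30, 0, 0, ZoneOffset.UTC);
        // 'last 3 months' by calendar month versus by day count
        System.out.println(bound(now, ChronoUnit.MONTHS, -2) + " .. " + bound(now, ChronoUnit.MONTHS, 1));
        System.out.println(bound(now, ChronoUnit.DAYS, -90) + " .. " + bound(now, ChronoUnit.DAYS, 1));
    }
}
```

Under this reading, on the 24th of December 2010 the month-based range starts on the 1st of October, while the day-based range starts on the 25th of September; both end at the next month or day boundary.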

Putting this all together (you need to wrap the {} items in [], see below), the somewhat more verbose configuration of the range property becomes:

hippofacnav:facets
      - myproject:startdate$[{name:'today', resolution:'day', begin:0, end:1},
{name:'yesterday', resolution:'day', begin:-1, end:0}, ...... ,
{name:'before this year', resolution:'year', end:0}]

hippofacnav:facetnodenames
      - startdate

Note that you can also use range navigation intertwined with 'date facet overloading' and with navigation on any other property: you can combine faceted navigation by year, month and day, and at the same time
use ranges for the same date property in another facet.

A price range example:

Assume you have stored prices as a Double in the property myproject:price. Now, you'd like to show prices less than 10.000, between 10.000 and 50.000, and more than 50.000. Then your configuration would be something like:

hippofacnav:facets
- myproject:price$[{name:'less 10.000', resolution:'double', end:10000},
  {name:'10.000 - 50.000', resolution:'double', begin:10000, end:50000},
  {name:'more than 50.000', resolution:'double', begin:50000}]

hippofacnav:facetnodenames
- price

Of course, you can combine the date range with the price range, and with any other facets.
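The 'begin inclusive, end exclusive' rule and the infinite defaults are easiest to see at the boundaries of the price ranges. The sketch below mirrors the three configured ranges in plain Java; it only illustrates the rule as described in (f1) and is not the repository's implementation. The class and method names are made up.

```java
final class NumericRangeSketch {

    /** True when value is in [begin, end); a null bound means the range is open on that side. */
    static boolean inRange(double value, Double begin, Double end) {
        double lo = (begin == null) ? Double.NEGATIVE_INFINITY : begin;
        double hi = (end == null) ? Double.POSITIVE_INFINITY : end;
        return value >= lo && value < hi;
    }

    /** Returns the name of the configured price range that the given price falls into. */
    static String priceRangeName(double price) {
        if (inRange(price, null, 10000.0)) {
            return "less 10.000";
        }
        if (inRange(price, 10000.0, 50000.0)) {
            return "10.000 - 50.000";
        }
        return "more than 50.000";
    }

    public static void main(String[] args) {
        System.out.println(priceRangeName(9999.99));
        System.out.println(priceRangeName(10000.0));
        System.out.println(priceRangeName(50000.0));
    }
}
```

A price of exactly 10000 lands in '10.000 - 50.000' and a price of exactly 50000 lands in 'more than 50.000', because each 'end' is exclusive and each 'begin' is inclusive.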

(f2) The range query for String values:
A String range query is very useful for creating for example an index page, where for each range and index word you want to display the number of documents. String range configuration is similar to the range configuration for Longs, Doubles and Dates with a small difference:

Supported attributes:

  • name (mandatory ; type:String)

  • resolution (mandatory ; type:String, options: 'string')

  • lower (optional ; type:string, inclusive)

  • upper (optional ; type:string, exclusive)

When no lower is defined, all facet values before upper are counted. When no upper is defined, all facet values after lower are counted. If both are missing, all values are counted.

Limitation: If you use a lower and an upper bound, the configured ranges must have an equal number of chars. Thus 'aa' to 'ab' is correct, 'aa' to 'b' is not supported and will throw an exception.

A String range example

Assume you have a property myproject:brand, and want to create an index page for all brands, grouped together in 3 groups, with one extra group for 'all'. The following configuration would do this:

hippofacnav:facets
- myproject:brand$[{name:'all', resolution:'string'}, {name:'a - f', resolution:'string', lower:'a', upper:'g'},
  {name:'g - m', resolution:'string', lower:'g', upper:'n'}, {name:'n - z', resolution:'string', lower:'n', upper:'{'} ]

hippofacnav:facetnodenames
- brand

Note: After the char 'z' comes '{', hence this is the upper in the last range

Note: We support ranges for String values with the limitation that the bounds of a range can be at most 3 chars long. So you can for example define a String range from 'a' to 'b', from 'aa' to 'ab' or from 'aaa' to 'aab', but not from 'aaaa' to 'aaab'. This is for performance reasons, and such a range is also unlikely to be needed.
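The choice of '{' as the upper bound of the last range follows from plain character ordering. The sketch below assumes that String ranges compare values lexicographically by character code, with lower inclusive and upper exclusive; that comparison rule is an assumption about the behaviour, and the class and method names are hypothetical. It also assumes lower-case brand values, since upper-case letters sort before 'a'.

```java
final class StringRangeSketch {

    /** True when value is in [lower, upper); a null bound means the range is open on that side. */
    static boolean inRange(String value, String lower, String upper) {
        boolean atOrAfterLower = (lower == null) || value.compareTo(lower) >= 0;
        boolean beforeUpper = (upper == null) || value.compareTo(upper) < 0;
        return atOrAfterLower && beforeUpper;
    }

    /** The character that directly follows c in character-code order. */
    static char nextChar(char c) {
        return (char) (c + 1);
    }

    public static void main(String[] args) {
        System.out.println("char after 'z': " + nextChar('z'));
        System.out.println("zebra in n - z: " + inRange("zebra", "n", "{"));
        System.out.println("gucci in a - f: " + inRange("gucci", "a", "g"));
    }
}
```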

g) Sorting and limiting the facet values
When configuring your faceted navigation node, you usually know how you want the facet values to be shown. If for example the facet is myproject:date$year, then you quite likely want the values ordered descending or ascending by year number, and not according to the default faceted navigation ordering, which is the count. Another reason to influence the ordering and limit is when you want to expose a tag cloud: you only want the 10 most common tags, not all the thousands of other tags. Or, in case you want to show the most uncommon tags, you reverse the ordering.

Therefore, per facet you can define on hippofacnav:facetnodenames how its facet values should be returned, with respect to:

a) sortby (type:String): what to order on. Options:
a1) count (sorting on count; descending is the default, except for ranges)
a2) facetvalue
a3) config (in case of configured ranges: the default order for config is ascending, which is the order in which the ranges are configured)
b) sortorder (type:String, options: descending/ascending)
c) limit (type:int): the number of unique facet values shown (useful for tag clouds and for performance when thousands of unique facet values are present)

Thus, suppose we again have the facet brand, and we want to show all the brands ordered by their name. The configuration then should be something like:

hippofacnav:facets
- myproject:brand

hippofacnav:facetnodenames
- brand${sortby:'facetvalue', sortorder:'ascending'}

This shows all brand facet values in ascending alphabetical order. Now, suppose you also have a year facet and a range facet on myproject:date. The configuration can be:

hippofacnav:facets
- myproject:brand
- myproject:date$year
- myproject:date$[{name:'today', resolution:'day', begin:0, end:1}, {name:'yesterday', resolution:'day', begin:-1,  end:0}]

hippofacnav:facetnodenames
- brand${sortby:'facetvalue', sortorder:'ascending'}
- year${sortby:'facetvalue', sortorder:'descending'}
- range${sortby:'config', sortorder:'descending'}

Now, you also have all available years, ordered descending. The ordering for years is done numerically instead of alphabetically. Furthermore, you also have the range 'today' and 'yesterday', where 'yesterday' comes first because the order is 'descending' (which is the reverse of the configured order)

So, back to only brands. Assume you have hundreds/thousands of brands, and only want to show the top 25 brands. You can easily achieve this by configuring a limit as well:

hippofacnav:facets
- myproject:brand
hippofacnav:facetnodenames
- brand${sortby:'count', sortorder:'descending', limit:25}

If you are interested in the 25 least used brands instead, you only need to change 'descending' into 'ascending'.

Note: ordering on facetvalue works for Longs, Doubles, Dates and Strings. We do a runtime check whether we are dealing with numerical values or Strings.
Note: sorting and limiting facet values is different from sorting the resultset described earlier.
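What sortby:'count' combined with sortorder and limit does to a set of facet values can be sketched in a few lines of plain Java. This is an illustration of the described behaviour only, with a hypothetical class name; breaking ties on the facet value is an assumption added here to make the output deterministic.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class FacetValueOrderSketch {

    /** Orders facet values by count and returns at most 'limit' of them. */
    static List<String> orderByCount(Map<String, Integer> counts, boolean descending, int limit) {
        Comparator<Map.Entry<String, Integer>> cmp = (a, b) -> {
            int byCount = Integer.compare(a.getValue(), b.getValue());
            if (descending) {
                byCount = -byCount;
            }
            // Assumption: equal counts fall back to alphabetical order of the facet value.
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        };

        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort(cmp);

        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> e : entries.subList(0, Math.min(limit, entries.size()))) {
            result.add(e.getKey());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        counts.put("green", 5);
        counts.put("blue", 7);
        counts.put("red", 1);
        System.out.println(orderByCount(counts, true, 2));   // most used first
        System.out.println(orderByCount(counts, false, 2));  // least used first
    }
}
```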

Tagged browsing

Combining a limit with faceted navigation on multivalued properties gives you out-of-the-box tagged browsing. Suppose you have a couple of hundred different tags, where every document can contain multiple tags. Now, if you have this configuration:

hippofacnav:facets
- myproject:tags

hippofacnav:facetnodenames
- tags${sortby:'count', sortorder:'descending', limit:10}

Then you only see the 10 most common tags. Now, suppose one of the most common tags is called 'green'. Navigating to this tag gives, below this tag, the 10 most common tags for all documents that have the tag 'green'. So, tagged browsing is just a subset of our faceted navigation engine! You can also combine tagged browsing with everything the normal faceted navigation has, for example:

hippofacnav:facets
- myproject:tags
- myproject:startdate$[{name:'today', resolution:'day', begin:0, end:1}, {name:'this week', resolution:'week', begin:0, end:1},
  {name:'this year', resolution:'year', begin:0, end:1}]

hippofacnav:facetnodenames
- tags${sortby:'count', sortorder:'descending', limit:10}
- startdate

So, if you first go to 'this week', then below it you have tagged browsing for the documents of 'this week'.
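The drill-down step of tagged browsing amounts to recounting tags over only those documents that carry the selected tag. The sketch below shows that recount on an in-memory list of tag sets; it is a conceptual illustration with hypothetical names, not how the repository computes its counts.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class TagDrillDownSketch {

    /** Counts tags over the documents that have selectedTag (or over all documents when it is null). */
    static Map<String, Integer> countTagsWithin(List<Set<String>> docs, String selectedTag) {
        Map<String, Integer> counts = new HashMap<>();
        for (Set<String> tags : docs) {
            if (selectedTag != null && !tags.contains(selectedTag)) {
                continue; // document is outside the current drill-down
            }
            for (String tag : tags) {
                counts.merge(tag, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Set<String>> docs = new ArrayList<>();
        docs.add(new HashSet<>(Arrays.asList("green", "eco")));
        docs.add(new HashSet<>(Arrays.asList("green", "car")));
        docs.add(new HashSet<>(Arrays.asList("blue", "car")));
        System.out.println(countTagsWithin(docs, "green"));
    }
}
```

Because a document can carry several tags, the selected tag itself shows up again in the recount; sorting these counts descending and applying the limit yields the tags shown at the next level.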

h) Guided drill paths

By default, faceted navigation is completely free with respect to how you traverse into the facets. Thus, if you configure facets 'brand', 'color' and 'type', then you can start traversing first by type, then color then brand, but also through color, type and then brand. In the case of multivalued facets, you can traverse the same facet multiple times (if color is blue and grey, you can go to /color/grey/color/blue but also /color/blue/color/grey).

We do however support guiding the possible paths, which we call guided drill paths. For example, you might want to have the facet 'month' available only after the facet 'year' has been chosen. You can do this as follows:

hippofacnav:facets
- myproject:startdate$year
- myproject:startdate$month
- myproject:startdate$day

hippofacnav:facetnodenames
- year
- month${after:'year'}
- day${after:'month'}

The setting

${after:'year'}

means the month is visible after facet 'year' has been chosen. Note that you have to refer to the facet node name 'year' and not to the facet 'myproject:startdate$year'. Normally you would also like to have the months sorted by facetvalue, ascending, so it would become:

month${after:'year', sortby:'facetvalue', sortorder:'ascending'}

Besides making a facet available after some other facet has been chosen, you can also choose to hide one or more facets after another facet has been chosen. You can also hide the facet that has been chosen itself: this way, it is not displayed again in the faceted navigation tree. This can be useful if you want to avoid all kinds of frontend logic as to when to show which facets. For example:

hippofacnav:facets
- myproject:startdate$year
- myproject:startdate$month
- myproject:startdate$day

hippofacnav:facetnodenames
- year${hide:'year'}
- month${after:'year', hide:'month'}
- day${after:'month'}

This configuration works as follows: when you browse to the facet year and then to 2010, thus /year/2010, you won't get the facet 'year' below 2010 again. Normally, this facet is shown again, as it makes frontends really simple. A last thing about guided drill paths is that you can also hide multiple facets. So, suppose that when you have chosen facet 'year', you want to hide not only the facet 'year', but also the facet 'range' (which might make sense). You can then use

year${hide:['range', 'year']}
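The interplay of 'after' and 'hide' can be summarised as: a facet is shown when its 'after' facet (if any) has been chosen and no chosen facet hides it. The sketch below encodes that reading for the year/month/day configuration above. It is an interpretation of the described behaviour, with hypothetical class names, and not the repository's implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class GuidedDrillPathSketch {

    /** Hypothetical model of one facetnodenames entry with its 'after' and 'hide' settings. */
    static final class FacetRule {
        final String name;
        final String after;      // null when the facet is available from the start
        final Set<String> hide;  // facets hidden once this facet has been chosen

        FacetRule(String name, String after, String... hide) {
            this.name = name;
            this.after = after;
            this.hide = new HashSet<>(Arrays.asList(hide));
        }
    }

    /** Returns the facet names that are shown, given the facets chosen so far. */
    static List<String> visibleFacets(List<FacetRule> rules, Set<String> chosen) {
        Set<String> hidden = new HashSet<>();
        for (FacetRule rule : rules) {
            if (chosen.contains(rule.name)) {
                hidden.addAll(rule.hide);
            }
        }
        List<String> visible = new ArrayList<>();
        for (FacetRule rule : rules) {
            boolean unlocked = rule.after == null || chosen.contains(rule.after);
            if (unlocked && !hidden.contains(rule.name)) {
                visible.add(rule.name);
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        List<FacetRule> rules = Arrays.asList(
                new FacetRule("year", null, "year"),
                new FacetRule("month", "year", "month"),
                new FacetRule("day", "month"));
        System.out.println(visibleFacets(rules, new HashSet<>(Arrays.asList("year"))));
    }
}
```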

i) Filtering

You can configure filters on your faceted navigation through the multivalued property hippofacnav:filters. All filters are AND-ed. We currently support the following filter formats:

0) jcr:primaryType = nodeType
1) propertyname = propertyvalue
2) propertyname != propertyvalue
3) contains(. , text)
4) contains(propertyname, text)
5) not(expression)
6) text

Explanations and examples:
Ad0) You can configure for example

hippofacnav:filters
- jcr:primaryType = myproject:news

This makes sure only documents of type 'myproject:news' are reflected in the current faceted navigation.

Ad1) You can configure for example

hippofacnav:filters
- myproject:brand = hippo

Now only documents with property myproject:brand = hippo are reflected.

Ad2) See Ad1, only now a ! is added to indicate not equal.

Ad3) This is a 'document' scoped free text search, so it also searches in any property of any descendant node. If no specific ordering of the resultset is configured, the documents are returned according to Lucene scoring. The 'text' follows the Lucene QueryParser syntax (except for some specifics, see below), see http://lucene.apache.org/java/2_3_2/queryparsersyntax.html. The operator for a space is AND.

Examples:

Returns all documents containing the word 'jumps':

hippofacnav:filters
- contains(.,jumps)

Returns all documents containing 'quick' AND 'brown':

contains(.,quick brown)

Returns all documents containing 'quick' OR 'jump':

contains(.,quick OR jump)

Returns all documents that contain the words 'brown fox jumps' in this exact order:

contains(.,"brown fox jumps")

Returns all documents with words that match bro?n, where ? is any single char:

contains(.,bro?n)

Returns all documents with words that match laz*, where * is any sequence of chars:

contains(., laz*)

Returns all documents containing 'quick' OR 'jump' and, in the scoring, considers the word 'quick' 10 times more important:

contains(.,quick^10 OR jump)

Ad4) This is the same as Ad3, only now instead of the '.', you use a specific property of the document (this cannot be done on descendant node properties). For example:

contains(myproject:title, hippo OR onehippo)

Ad5) You can invert any of the expressions above by wrapping a not() around it, for example:

not(myproject:brand = hippo)

Ad6) If only text is filled in, the filter is treated as if it were contains(.,text).

For example, this:

hippofacnav:filters
- quick brown fox

is equal to

hippofacnav:filters
- contains(.,quick brown fox )

Note: fuzzy searches, proximity searches and range queries from the Lucene QueryParser syntax are not supported. Furthermore, we do not allow a prefix wildcard.

j) Free Text Search

Free text search is treated as a 'runtime' filter, and is thus equivalent to the fixed text filtering, see Ad6 in section i). This feature is available from 2.16.01 and higher. It can be used by giving an extra argument to the faceted navigation node in the jcr path. If for example in the path:

/content/document/mysite/facetednews/year/2009

the node 'facetednews' is the faceted navigation node (type = 'hippofacnav:facetnavigation'), then you can combine a free text search as follows:

/content/document/mysite/facetednews[{'hippo'}]/year/2009

This will return in the faceted navigation only documents that contain 'hippo'. You can also use two words, for example

/content/document/mysite/facetednews[{'hippo gogreen'}]/year/2009

This will only return documents that contain 'hippo' and 'gogreen'. You can use any syntax that is allowed in the Jackrabbit QueryParser, which is, apart from some subtleties, exactly the same as the Lucene QueryParser.

You can test this very easily in the /repository servlet.

k) XPath queries as in jsr-170

An XPath query is treated as a 'runtime' filter, just like free text search. The only thing that differs is the syntax. When using the XPath query

//*[jcr:contains(.,'hippo')]

it will give the same result as a normal free text search.

We support XPath according to jsr-170. We can thus expose XPath search results in your desired faceted navigation view! This is very nice we think.

Written as an XPath example, the same free text query from above looks like:

/content/document/mysite/facetednews[{xpath(//*[jcr:contains(.,'hippo')])}]/year/2009

The syntax is thus:

[{xpath(???)}]

where you fill in an XPath query in place of the ???. The HST2 (see Faceted Navigation combined with Free Text Search and HstQueries) makes this very easy, as it has utility methods that do all of this for you. You can just pass an HstQuery to such a utility method.
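If you build these paths by hand rather than through the HST2 utilities, both the free text form [{'...'}] and the XPath form [{xpath(...)}] come down to string concatenation on the faceted navigation node's path. The sketch below shows only that concatenation; the class and method names are hypothetical, and it does no escaping because the text does not specify how special characters in the query must be handled.

```java
final class FacetedPathSketch {

    /** Appends a free text argument to the faceted navigation node, then the drill path. */
    static String withFreeText(String facetNavPath, String text, String drillPath) {
        return facetNavPath + "[{'" + text + "'}]" + drillPath;
    }

    /** Appends an XPath argument to the faceted navigation node, then the drill path. */
    static String withXPath(String facetNavPath, String xpath, String drillPath) {
        return facetNavPath + "[{xpath(" + xpath + ")}]" + drillPath;
    }

    public static void main(String[] args) {
        String facetNav = "/content/document/mysite/facetednews";
        System.out.println(withFreeText(facetNav, "hippo gogreen", "/year/2009"));
        System.out.println(withXPath(facetNav, "//*[jcr:contains(.,'hippo')]", "/year/2009"));
    }
}
```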

An XPath query like the following is also supported:

//*[jcr:contains(.,'hippo') OR jcr:contains(@myproject:title,'hippo')]

This boosts hits that have 'hippo' in the title as well, so that these hits end up ranked higher in the faceted navigation resultset.

You can also specify ordering in the XPath query. The default order is the one configured on the faceted navigation (see 'd) Sorting the documents in the resultset') or, when that is missing, @jcr:score descending. When you specify an order in the query, it overrides the sorting configured on the faceted navigation. For example:

//*[jcr:contains(.,'hippo') OR jcr:contains(@myproject:title,'hippo')] order by @myproject:date descending

Note: we also support range queries in XPath. Some range queries in XPath on large data sets, like date ranges on many documents, are cpu intensive and slow, and they will also be slow in faceted navigation. If you can use 'f) Ranges faceted navigation' instead, that is much more efficient.