HST Rewriting Rich Text Fields at Runtime

Introduction

Goal

Rewrite rich text content in documents at runtime.

Background

As a developer, you might want to modify the content of a document's rich text field at runtime. You might even want those modifications to be context aware.

For example:

  • When an internal link cannot be resolved, remove the entire <a> element

  • When a link is created, add a tooltip

  • When the channel is a mobile channel, take images of lower resolution

  • Create a lightbox for images (show some small variant that is clickable to show a large one)

  • Create a context aware lightbox for images: depending on the context, show a differently sized image when clicking

  • Etc

To use your own content rewriter, use one of the following options:

  • configure it on a per template basis (HST Front Ends)
  • override the global default (HST Front Ends)
  • override the default content rewriter Spring bean (Delivery API)

Rewrite Rich Text Content in HST Front End

Configure on a Per Template Basis

Normally, when displaying rich text content, you use something like

JSP

<hst:html hippohtml="${requestScope.document.html}"/>

Freemarker

<@hst.html hippohtml=document.html />

This assumes content rewriting is done with the built-in HST SimpleContentRewriter. However, you can use your own custom content rewriter as well. Your script then becomes something like:

JSP

<hst:html hippohtml="${requestScope.document.html}" 
          contentRewriter="${requestScope.myContentRewriter}"/>

Freemarker

<@hst.html hippohtml=document.html contentRewriter=myContentRewriter />

You also need to set myContentRewriter as an attribute on the request. For example, your BaseComponent could contain something like:

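import org.hippoecm.hst.component.support.bean.BaseHstComponent;
import org.hippoecm.hst.core.component.HstRequest;
import org.hippoecm.hst.core.component.HstResponse;

public abstract class BaseComponent extends BaseHstComponent {
   // a single shared instance of the custom rewriter
   public static final MyContentRewriter myContentRewriter =
       new MyContentRewriter();
   @Override
   public void doBeforeRender(HstRequest request, HstResponse response) {
        // always have the custom content rewriter available
        request.setAttribute("myContentRewriter", myContentRewriter);
   }
}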

Configure Global Default

Available from HST 3.0.2 and 3.1.0 onwards

By default, the HST uses the SimpleContentRewriter. You can override this default by configuring default.hst.contentrewriter.class in the file hst-config.properties, which is typically located in the site/webapp/src/main/webapp/WEB-INF directory in a project.
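As a sketch, the entry in hst-config.properties could look like the fragment below. The class name org.example.MyContentRewriter is a placeholder for the fully qualified name of your own rewriter. That the class needs a public no-argument constructor so the HST can instantiate it is an assumption, not something stated above.

```properties
# hst-config.properties
# org.example.MyContentRewriter is a hypothetical class name; replace it with your own
default.hst.contentrewriter.class = org.example.MyContentRewriter
```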

Writing a Custom Content Rewriter

Next, your custom content rewriter needs to be written. It needs to implement org.hippoecm.hst.content.rewriter.ContentRewriter. The easiest way is to extend from org.hippoecm.hst.content.rewriter.impl.AbstractContentRewriter, or even from SimpleContentRewriter which gives you many rewriting utilities already.
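To give an idea of the kind of transformation a rewriter can apply, here is a minimal sketch of one of the use cases from the background section: adding a tooltip to links. It is plain Java without HST dependencies, intended as a post-processing step on already rewritten HTML. LinkTooltips and addTitle are hypothetical names, and the regex-based approach is an assumption that only suits simple markup (no ">" inside attribute values, no self-closing anchors).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the HST API.
final class LinkTooltips {

    // Opening <a> tag; group 1 captures the attribute text, if any.
    private static final Pattern A_TAG =
            Pattern.compile("<a(\\s[^>]*)?>", Pattern.CASE_INSENSITIVE);

    // Detects an existing title attribute so it is not overwritten.
    private static final Pattern TITLE_ATTR =
            Pattern.compile("\\stitle\\s*=", Pattern.CASE_INSENSITIVE);

    private LinkTooltips() {
    }

    // Adds title="..." to every <a> element that does not have one yet.
    static String addTitle(String html, String title) {
        if (html == null) {
            return null;
        }
        // escape characters that would break the attribute value
        String escaped = title.replace("&", "&amp;").replace("\"", "&quot;");
        Matcher matcher = A_TAG.matcher(html);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String attrs = matcher.group(1) == null ? "" : matcher.group(1);
            String tag = TITLE_ATTR.matcher(attrs).find()
                    ? matcher.group()
                    : "<a" + attrs + " title=\"" + escaped + "\">";
            matcher.appendReplacement(out, Matcher.quoteReplacement(tag));
        }
        matcher.appendTail(out);
        return out.toString();
    }
}
```

In a subclass of SimpleContentRewriter, such a helper could be applied to the result of super.rewrite(html, node, requestContext, targetMount), so that link and image rewriting are still handled by the superclass.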

Assume you want to write a content rewriter that adds style=color:red to broken internal links. The easiest way to achieve this is to extend SimpleContentRewriter. The SimpleContentRewriter does string based rewriting of the rich text field, but for this use case it is easier to use org.htmlcleaner.HtmlCleaner. Hence, our content rewriter needs to override the rewrite method of SimpleContentRewriter. The BrokenLinksMarkerContentRewriter below should pretty much do what we want.

Note one important thing: the SimpleContentRewriter rewrites both HTML links and images. If you override rewrite to change only the way links are rewritten, you need to call super.rewrite(..) at the end, unless you make sure that you also rewrite images yourself. The example below rewrites images as well, and therefore does not need to call super.rewrite(..).

If you also need access to the HstRequest / HstResponse in your ContentRewriter, then you can use the following code

HstRequest hstRequest = HstRequestUtils.getHstRequest(
                                requestContext.getServletRequest());
HstResponse hstResponse = HstRequestUtils.getHstResponse(
                                requestContext.getServletRequest(),
                                requestContext.getServletResponse()
                                );

BrokenLinksMarkerContentRewriter:

import javax.jcr.Node;
import org.apache.commons.lang.StringUtils;
import org.hippoecm.hst.configuration.hosting.Mount;
import org.hippoecm.hst.content.rewriter.impl.SimpleContentRewriter;
import org.hippoecm.hst.core.linking.HstLink;
import org.hippoecm.hst.core.request.HstRequestContext;
import org.htmlcleaner.CleanerProperties;
import org.htmlcleaner.HtmlCleaner;
import org.htmlcleaner.TagNode;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class BrokenLinksMarkerContentRewriter
                    extends SimpleContentRewriter {

    private static final Logger log =
                    LoggerFactory.getLogger(BrokenLinksMarkerContentRewriter.class);

    // volatile, because getHtmlCleaner() reads this flag outside the synchronized block
    private static volatile boolean htmlCleanerInitialized;
    private static HtmlCleaner cleaner;

    private static synchronized void initCleaner() {
        if (!htmlCleanerInitialized) {
            cleaner = new HtmlCleaner();
            CleanerProperties properties = cleaner.getProperties();
            properties.setOmitHtmlEnvelope(true);
            properties.setTranslateSpecialEntities(false);
            properties.setOmitXmlDeclaration(true);
            properties.setRecognizeUnicodeChars(false);
            properties.setOmitComments(true);
            htmlCleanerInitialized = true;
        }
    }

    protected static HtmlCleaner getHtmlCleaner() {
        if (!htmlCleanerInitialized) {
            initCleaner();
        }

        return cleaner;
    }
    @Override
    public String rewrite(final String html, final Node node,
                          final HstRequestContext requestContext,
                          final Mount targetMount) {

        // nothing to rewrite
        if (html == null) {
            return null;
        }
        try {
            TagNode rootNode =  getHtmlCleaner().clean(html);
            TagNode [] links = rootNode.getElementsByName("a", true);
            // rewrite of links
            // THIS IS WHERE THE EXAMPLE IS ABOUT: WHEN A LINK CANNOT BE
            // RESOLVED, WE REMOVE THE href AND SET A STYLE
            for (TagNode link : links) {
                String documentPath =  link.getAttributeByName("href");
                // anchors without an href (e.g. <a name="..">) have nothing to rewrite
                if (documentPath == null || isExternal(documentPath)) {
                   continue;
                } else {
                    String queryString =
                            StringUtils.substringAfter(documentPath, "?");
                    boolean hasQueryString =
                            !StringUtils.isEmpty(queryString);
                    if (hasQueryString) {
                        documentPath =
                            StringUtils.substringBefore(documentPath, "?");
                    }
                    HstLink href = getDocumentLink(documentPath,node,
                                                   requestContext,
                                                   targetMount);
                    // if the link is null, marked as notFound or has an
                    // empty path, we mark the link element with a
                    // style=color:red
                    if (href == null || href.isNotFound() ||
                        href.getPath() == null) {

                        // mark the element and remove the href
                        link.removeAttribute("href");
                        setAttribute(link, "style", "color:red");
                    } else {
                        String rewrittenHref = href.toUrlForm(
                                 requestContext, isFullyQualifiedLinks());
                        if (hasQueryString) {
                            rewrittenHref += "?"+ queryString;
                        }
                        // override the href attr
                        setAttribute(link, "href", rewrittenHref);
                    }
                }
            }

            // BELOW IS FOR REWRITING IMAGE SRC ATTR WHICH RESULTS IN
            // VERY SAME BEHAVIOR AS SimpleContentRewriter
            // We could skip the code below altogether, and rewrite the
            // result below from getHtmlCleaner().getInnerHtml(bodyNode);
            // with super.rewrite() from SimpleContentRewriter
            TagNode [] images = rootNode.getElementsByName("img", true);
            for (TagNode image : images) {
                String srcPath =  image.getAttributeByName("src");
                // images without a src have nothing to rewrite
                if (srcPath == null || isExternal(srcPath)) {
                    continue;
                } else {
                    HstLink binaryLink = getBinaryLink(srcPath, node,
                                                       requestContext,
                                                       targetMount);
                    if (binaryLink != null &&
                        binaryLink.getPath() != null) {

                        String rewrittenSrc = binaryLink.toUrlForm(
                                  requestContext, isFullyQualifiedLinks());
                        setAttribute(image, "src", rewrittenSrc);
                    } else {
                        log.warn("Skip src because url is null");
                    }
                }
            }

            // everything is rewritten. Now write the "body" element
            // as result
            TagNode [] targetNodes =
                         rootNode.getElementsByName("body", true);
            if (targetNodes.length > 0 ) {
                TagNode bodyNode = targetNodes[0];
                return getHtmlCleaner().getInnerHtml(bodyNode);
            }  else {
                log.warn("Cannot rewrite content for '{}' because there is no 'body' element", node.getPath());
            }

        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return null;
    }

    private void setAttribute(TagNode tagNode, String attrName, String attrValue) {
        if (tagNode.hasAttribute(attrName)) {
            tagNode.removeAttribute(attrName);
        }
        tagNode.addAttribute(attrName, attrValue);
    }

}
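The query string handling in the link loop of BrokenLinksMarkerContentRewriter can also be isolated into a small helper, which makes it easier to test separately. The sketch below uses only the JDK. LinkParts, split and reattachTo are hypothetical names, and the helper is meant to mirror the behavior of the example above rather than any HST API.

```java
// Hypothetical helper, not part of the HST API.
final class LinkParts {

    final String path;
    final String query; // null when the href has no non-empty query string

    private LinkParts(String path, String query) {
        this.path = path;
        this.query = query;
    }

    // Splits an href into the document path and the query string.
    static LinkParts split(String href) {
        int idx = href.indexOf('?');
        if (idx < 0 || idx == href.length() - 1) {
            // no '?' or nothing after it: keep the href untouched,
            // as the rewriter above does
            return new LinkParts(href, null);
        }
        return new LinkParts(href.substring(0, idx), href.substring(idx + 1));
    }

    // Appends the original query string, if any, to the rewritten path.
    String reattachTo(String rewrittenPath) {
        return query == null ? rewrittenPath : rewrittenPath + "?" + query;
    }
}
```

With such a helper, the link loop would resolve the path part into an HstLink and pass the resulting URL through reattachTo before setting the href attribute.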

Rewrite Rich Text Content in Delivery API

In headless implementation scenarios using the Delivery API, the default content rewriter is org.hippoecm.hst.pagemodelapi.v10.content.rewriter.HtmlContentRewriter. It can be overridden using Spring bean configuration.

Extend HtmlContentRewriter

Make sure your custom content rewriter class extends org.hippoecm.hst.pagemodelapi.v10.content.rewriter.HtmlContentRewriter, for example:

site/components/src/main/java/org/example/CustomRewriter.java

package org.example;

import javax.jcr.Node;

import org.hippoecm.hst.configuration.hosting.Mount;
import org.hippoecm.hst.core.request.HstRequestContext;
import org.hippoecm.hst.pagemodelapi.v10.content.rewriter.HtmlContentRewriter;
import org.htmlcleaner.HtmlCleaner;

public class CustomRewriter extends HtmlContentRewriter {

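    // The Spring bean definition passes an HtmlCleaner as constructor
    // argument, so the subclass is assumed to need a matching constructor
    public CustomRewriter(final HtmlCleaner htmlCleaner) {
        super(htmlCleaner);
    }

    // implementation here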
}

Override HtmlContentRewriter Spring Bean

Override the org.hippoecm.hst.pagemodelapi.v10.content.rewriter.HtmlContentRewriter Spring bean in an XML file placed in a location matching classpath*:META-INF/hst-assembly/overrides/addon/org/hippoecm/hst/pagemodelapi/v10/*.xml, for example:

site/components/src/main/resources/META-INF/hst-assembly/overrides/addon/org/hippoecm/hst/pagemodelapi/v10/custom.xml

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd">

  <bean id="org.hippoecm.hst.pagemodelapi.v10.content.rewriter.HtmlContentRewriter"
        class="org.example.CustomRewriter">
    <constructor-arg>
      <bean class="org.hippoecm.hst.content.rewriter.HtmlCleanerFactoryBean" />
    </constructor-arg>
    <property name="removeAnchorTagOfBrokenLink" value="${pagemodelapi.v10.removeAnchorTagOfBrokenLink:false}" />
  </bean>

</beans>