Contact us
  • en
  • fr
  • Solutions
    • Company IntranetDiscover our company intranet software to connect and engage your employees
    • Collaboration PlatformDiscover our digital collaboration platform that boost your employee engagement and performance
    • Knowledge ManagementHow To build and manage your knowledge base with eXo Platform
    • Employee EngagementDiscover how to boost your employee engagement
  • Product
    • Overview
      • Software TourAn overview of our software capabilities
      • Why eXoHow eXo improves your employee experience
    • Features
      • CommunicationFeatures to facilitate employee communications
      • CollaborationFeatures to create efficient teams
      • KnowledgeKnowledge management capabilities
      • ProductivityEmployee productivity tools to engage employees
    • Technology
      • Open sourceAn overview of our technology
      • IntegrationsDiscover how eXo integrates with your tools and systems
      • SecurityHow eXo Platform ensures your data security
  • Pricing
    • Product pricingLearn about our business model and pricing
    • ServicesLearn about our professional services
    • FAQsQuestions about the software, the community and our offers
  • Resources
    • Resource center
      • Case studies
      • White Papers
      • Datasheets
      • Videos
    • Technical documentation
      • Getting started
      • Developer Guide
      • Technical administration guide
      • REST APIs
    • From The Blog
      • eXo Platform 6.4 is here, available in both Enterprise and Community editions!
      • eXo Platform community edition is back
      • Cloud Vs On-premise Digital Workplace: Which one is right for your business?
  • Company
    • About us
    • Careers
    • Customers
    • Newsroom
    • Contact us
    • Partners
  • Menu mobile
    • Pricing
    • About us
    • Services
    • FAQs
    • Customers
    • Resource center
    • Contact us
    • Blog
Company Intranet Discover our company intranet software to connect and engage your employees
Collaboration Platform Discover our digital collaboration platform that boost your employee engagement and performance
Knowledge Management How To build and manage your knowledge base with eXo Platform
Employee Engagement Discover how to boost your employee engagement
Overview
  • Software Tour An overview of our software capabilities
  • Why eXo How eXo improves your employee experience
Features
  • Communication Features to facilitate employee communications
  • Collaboration Features to create efficient teams
  • Knowledge Knowledge management capabilities
  • Productivity Employee productivity tools to engage employees
Technology
  • Open source An overview of our technology
  • Integrations Discover how eXo integrates with your tools and systems
  • Security How eXo Platform ensures your data security
Product pricing Learn about our business model and pricing
Services Learn about our professional services
FAQs Questions about the software, the community and our offers
Resource center
  • Case studies
  • White Papers
  • Datasheets
  • Videos
Technical documentation
  • Getting started
  • Developer Guide
  • Technical administration guide
  • REST APIs
From The Blog
  • eXo Platform 6.4 is here, available in both Enterprise and Community editions!
  • eXo Platform community edition is back
  • Cloud Vs On-premise Digital Workplace: Which one is right for your business?
About us
Careers
Customers
Newsroom
Contact us
Partners
Pricing
About us
Services
FAQs
Customers
Resource center
Contact us
Blog
  1. Accueil
  2. Uncategorized
  3. Use ElasticSearch to search and use external sources, like Wikipedia, inside eXo Platform

Use ElasticSearch to search and use external sources, like Wikipedia, inside eXo Platform

Since its version 4, eXo Platform has added a new unified search, which greatly improves its search capabilities. All the platform’s resources (contents, files, wiki pages, etc.) can now be easily found from a single and centralized location.

01-default-unified-search

Besides these out-of-the-box capabilities, a new API allows creation of custom search connectors in order to extend the search scope and enrich the results. This blog post explains how to implement and configure such a connector.

For this blog post, the search connector will retrieve data indexed by ElasticSearch, a highly powerful and easy to use search engine. It is of course up to you to decide what your search connector returns (data indexed by another search engine, data from a database, other custom data stored in eXo, etc.)

ElasticSearch

The first step if we want to use ElasticSearch is to install and configure it! The only thing to do here is to download it, extract it, and start it with:

bin/elasticsearch.sh -f

Index data in ElasticSearch

As is, ElasticSearch is empty, no data has been indexed. So we need to feed it. For this purpose, we will use the Wikipedia River plugin. A river is an ElasticSearch component which feeds ElasticSearch with data to index. The Wikipedia River simply feeds ElasticSearch with Wikipedia pages.

After stopping your ElasticSearch server you can install the plugin with:

bin/plugin -install elasticsearch/elasticsearch-river-wikipedia/1.3.0

After restarting ElasticSearch you should see logs similar to the following:

[2013-11-27 11:55:48,716][INFO ][node] [It, the Living Colossus] version[0.90.7], pid[14776], build[36897d0/2013-11-13T12:06:54Z]
[2013-11-27 11:55:48,716][INFO ][node] [It, the Living Colossus] initializing ...
[2013-11-27 11:55:48,725][INFO ][plugins] [It, the Living Colossus] loaded [river-wikipedia], sites []
[2013-11-27 11:55:50,632][INFO ][node] [It, the Living Colossus] initialized
[2013-11-27 11:55:50,632][INFO ][node] [It, the Living Colossus] starting ...
[2013-11-27 11:55:50,718][INFO ][transport] [It, the Living Colossus] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/192.168.0.5:9300]}

This ensures that the Wikipedia River plugin is correctly installed (loaded [river-wikipedia]).

We can now start indexing Wikipedia pages in ElasticSearch by creating the river with a REST call (we are using curl here; feel free to use your favorite tool):

curl -XPUT localhost:9200/_river/my_river/_meta -d '
{
  "type" : "wikipedia"
}
'

A lot of data is now being indexed by ElasticSearch (yes, Wikipedia is a huge source of data :)). You can check this by executing a search with:

curl -XGET 'http://localhost:9200/_search?q=test'

Warning: the Wikipedia River will index a lot of data. You should stop the river after a few minutes to avoid filling your entire disk space ;-). This can be done by deleting the river with:

curl -XDELETE localhost:9200/_river/my_river

Now that we have data indexed by ElasticSearch, let’s dig into the eXo search connector!

eXo search connector

A search connector is a simple class that extends org.exoplatform.commons.api.search.SearchServiceConnector and implements the “search” method:

package org.exoplatform.search.elasticsearch;

import ...

public class ElasticSearchConnector extends SearchServiceConnector {

  public ElasticSearchConnector(InitParams initParams) {
    super(initParams);
  }

  @Override
  public Collection<SearchResult> search(SearchContext context, String query, Collection<String> sites, int offset, int limit, String sort, String order) {
      // Fetch data
  }
}

It needs to be declared in the eXo configuration, either in an extension or directly in the jar which will contain the connector class. Let’s go for the jar method:

  • add the class in your jar
  • add a file named configuration.xml in conf/portal in your jar with the following content (the “type” tag contains the FQN of your connector class):
<configuration
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xsi:schemaLocation="http://www.exoplatform.org/xml/ns/kernel_1_2.xsd http://www.exoplatform.org/xml/ns/kernel_1_2.xsd"
   xmlns="http://www.exoplatform.org/xml/ns/kernel_1_2.xsd">

    <external-component-plugins>
      <target-component>org.exoplatform.commons.api.search.SearchService</target-component>
      <component-plugin>
        <name>ElasticSearchConnector</name>
        <set-method>addConnector</set-method>
        <type>org.exoplatform.search.elasticsearch.ElasticSearchConnector</type>
        <description>ElasticSearch Connector</description>
        <init-params>
          <properties-param>
             <name>constructor.params</name>
             <property name="searchType" value="wikipedia"/>
             <property name="displayName" value="Wikipedia"/>
          </properties-param>
        </init-params>      
      </component-plugin>
    </external-component-plugins>

</configuration>

We now have the skeleton of our search connector. The last step is to implement the search method.

Fetching results from ElasticSearch

We need to call ElasticSearch in order to fetch Wikipedia pages based on the input parameters of the search (query text, offset, limit, sort field, sort order). ElasticSearch provides a Java Client API (TransportClient). Sadly, it depends on Lucene artifacts, and since eXo Platform already embeds Lucene artifacts that are not necessarily in the same version as the ones needed by ElasticSearch, it can cause conflicts. Instead we will directly use the REST API:

package org.exoplatform.search.elasticsearch;

import org.apache.commons.io.IOUtils;
import org.apache.http.HttpResponse;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.entity.StringEntity;
import org.apache.http.impl.client.DefaultHttpClient;
import org.exoplatform.commons.api.search.SearchServiceConnector;
import org.exoplatform.commons.api.search.data.SearchContext;
import org.exoplatform.commons.api.search.data.SearchResult;
import org.exoplatform.container.xml.InitParams;
import org.json.simple.JSONArray;
import org.json.simple.JSONObject;
import org.json.simple.parser.JSONParser;

import java.io.StringWriter;
import java.util.*;

public class ElasticSearchConnector extends SearchServiceConnector {

  private Map<String, String> sortMapping = new HashMap<String, String>();

  public ElasticSearchConnector(InitParams initParams) {
    super(initParams);

    sortMapping.put("date", "title"); // no date field on wikipedia results
    sortMapping.put("relevancy", "_score");
    sortMapping.put("title", "title");
  }

  @Override
  public Collection<SearchResult> search(SearchContext context, String query, Collection<String> sites, int offset, int limit, String sort, String order) {
    Collection<SearchResult> results = new ArrayList<SearchResult>();

    String esQuery = "{\n" +
            "     \"from\" : " + offset + ", \"size\" : " + limit + ",\n" +
            "     \"sort\" : [\n" +
            "       { \"" + sortMapping.get(sort) + "\" : {\"order\" : \"" + order + "\"}}\n" +
            "     ],\n" +
            "     \"query\": {\n" +
            "        \"filtered\" : {\n" +
            "            \"query\" : {\n" +
            "                \"query_string\" : {\n" +
            "                    \"query\" : \"" + query + "\"\n" +
            "                }\n" +
            "            }\n" +
            "        }\n" +
            "     },\n" +
            "     \"highlight\" : {\n" +
            "       \"fields\" : {\n" +
            "         \"text\" : {\"fragment_size\" : 150, \"number_of_fragments\" : 3}\n" +
            "       }\n" +
            "     }\n" +
            "}";

    try {
      HttpClient client = new DefaultHttpClient();
      HttpPost request = new HttpPost("http://localhost:9200/_search");
      StringEntity input = new StringEntity(esQuery);
      request.setEntity(input);

      HttpResponse response = client.execute(request);
      StringWriter writer = new StringWriter();
      IOUtils.copy(response.getEntity().getContent(), writer, "UTF-8");
      String jsonResponse = writer.toString();

      JSONParser parser = new JSONParser();

      Map json = (Map)parser.parse(jsonResponse);
      JSONObject jsonResult = (JSONObject) json.get("hits");
      JSONArray jsonHits = (JSONArray) jsonResult.get("hits");
      for(Object jsonHit : jsonHits) {
        JSONObject hitSource = (JSONObject) ((JSONObject) jsonHit).get("_source");
        String title = (String) hitSource.get("title");
        JSONObject hitHighlights = (JSONObject) ((JSONObject) jsonHit).get("highlight");
        JSONArray hitHighlightsTexts = (JSONArray) hitHighlights.get("text");
        String text = "";
        for(Object hitHighlightsText : hitHighlightsTexts) {
          text += (String) hitHighlightsText + " ... ";
        }

        results.add(new SearchResult(
                "http://wikipedia.org",
                title,
                text,
                "",
                "http://upload.wikimedia.org/wikipedia/commons/thumb/7/77/Wikipedia_svg_logo.svg/45px-Wikipedia_svg_logo.svg.png",
                new Date().getTime(),
                1
        ));
      }

    } catch (Exception e) {
      e.printStackTrace();
    }

    return results;
  }
}

Requests and responses are full JSON. You can find more details about ElasticSearch query syntax in their documentation. The important point here about the search connector is that each result has to be a SearchResult object returned in a collection.

Deploy it in eXo, and enjoy!

We can now deploy our jar (which contains the SearchConnector class and the XML configuration file) in the libs of the application server (/lib of Tomcat for example) and start eXo.

A search using the quick search in the toolbar now retrieves contents from Wikipedia:

02-search-results-preview

When the unified search screen is displayed, we can see that a new Wikipedia filter is listed, and our search results contain some Wikipedia pages:

03-search-results-1

If you don’t want to see Wikipedia contents in your results, simply uncheck the filter:

04-search-results-2

The code source is available here, as a Maven project.

Learn more about this project and what you can do with eXo Platform; join the eXo tribe!

Brahim Jaouane

I am a Digital Marketing specialist specialized in SEO at eXo Platform. Passionate about new technologies and Digital Marketing. With 10 years' experience, I support companies in their digital communication strategies and implement the tools necessary for their success. My approach combines the use of different traffic acquisition levers and an optimization of the user experience to convert visitors into customers. After various digital experiences in communication agencies as well as in B2B company, I have a wide range of skills and I am able to manage the digital marketing strategy of small and medium-sized companies.

Full-featured digital workplace with everything your employees need to work efficiently, smartly integrated for a compelling employee experience

  • Product
    • Software tour
    • Communication
    • Collaboration
    • Knowledge
    • Productivity
    • Open Source
    • Integrations
    • Security
  • Uses cases
    • Digital Workplace
    • Intranet software
    • Collaboration software
    • Knowledge management software
    • Entreprise Social Network
    • Employee Engagement platform
  • Roles
    • Internal Communications
    • Human Resources
    • Information Technology
  • Company
    • Product offer
    • Services Offer
    • Customers
    • Partners
    • About us
  • Resources
    • FAQs
    • Resource Center
    • Collaboration guide
    • What is a Digital workplace?
    • What is an intranet?
    • Employee engagement
  • Terms and Conditions
  • Legal
  • Privacy Policy
  • Accessibility
  • Contact us
  • Sitemap
  • Facebook
  • Twitter
  • LinkedIn
wpDiscuz