Overview: This blog outlines how to integrate a Drupal website with Coveo search.

Target Audience: Search Engineers

Problem Statement: Every customer or user should be able to search the pages of a Drupal website through Coveo. To index Drupal pages in Coveo, we first need to understand what Drupal is and what Coveo can do during indexing. The main focus is on creating REST APIs in Drupal and using Coveo to make those pages searchable in a search interface.

Drupal: Content management is a set of processes and technologies that supports the collection, management, and publishing of information in any form or medium. When stored and accessed via computers, this information may be more specifically referred to as digital content, or simply as content. Drupal is a content management system (CMS) of this kind and is used to build many of the websites and applications you use every day.

Features:

  • Easy content authoring
  • Reliable performance and excellent security
  • Flexibility and modularity
  • Tools that help you build the versatile, structured content that dynamic web experiences need
  • A great choice for creating integrated digital frameworks

Coveo search: Coveo Cloud provides intelligent search, delivering the most relevant results possible from a potentially large amount of data. It works at an enterprise level. You can leverage the power of Coveo Cloud by making good use of its tools and search features.

Features of Coveo search:

  • Drive content into context
  • Relevancy
  • Get better results with less effort
  • Power of Coveo unified search
  • Provides information every time, everywhere and for everyone

Prerequisites:

  • Sign up for Pantheon (an online hosting platform for Drupal).
  • Once you have signed up for Pantheon, install the REST services modules to generate the APIs.
  • Sign up for Coveo and create an organization.

This is the basic flow of how the Drupal pages are indexed in Coveo:

This is the architecture flow diagram for Drupal:

Steps to create a website using Drupal:

  • Go to the Drupal website: https://drupal.org
  • Click on Try Drupal and then choose the online demo.
  • Then click on Pantheon (which hosts the website online).
  • Log in with your credentials.
  • Once you’re logged in, click on Create a site.
  • You can select any version from Drupal 7 to Drupal 9 (the latest version at the time of writing).
  • Once the installation completes, you’re all set.
  • Give the website a name.
  • Click on Visit Development Site to build the website.

  • To start, add content to the website.
  • Content can be added as an Article or a Basic page.

  • Give the page a title, a body and a text format, and save.

The sample website looks like this:

  • To integrate with Coveo, we need to generate a REST API for the website pages we’ve created.
  • Go to Modules on the home page and click on Install New Module.
  • Install the modules required for the REST API, such as Services, REST Server, Libraries and CTools, plus an authentication module if needed.

  • For each module, copy the link address of its download file, paste it into the Install New Module form, and click Install.

  • Once all the modules are installed, enable them by selecting their checkboxes.
  • Then save the configuration.

To generate the REST services, go to Structure on the home page and click on Services:

  • Services are collections of methods available to remote applications. They are defined in modules and may be accessed in a number of ways through server modules.
  • Add a service by clicking on Add.

  • Add a new endpoint:

  • Give the machine-readable name, set the server to REST, enter the path to the endpoint (you can get it from the pages you’ve created), and then save the configuration.
  • Once the configuration is saved, edit the resources or methods you would like to enable, and click Save.

  • On the Resources page, check all the resources you want to enable. I checked node.

  • In the same service, go to the Server tab at the top right, select the response formats (JSON, XML) and request parser types (application/json or application/xml) you need, then enable and save.

  • Now prefix the page path with the machine-readable name and check that the response comes back in the required format.
  • Ex: https://dev-samplefocalcxm.pantheonsite.io/products/node/6 (products is the machine-readable name and node/6 is the path.)
  • The API for the Order Zen page is: https://dev-samplefocalcxm.pantheonsite.io/products/node/7.
  • Open this API URL in a browser or Postman to get the JSON response, and then construct the JSON configuration according to the Coveo format.
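
The same check can be scripted instead of using a browser or Postman. The following is a minimal Python sketch, not part of the original setup: the helper names build_node_url and fetch_node are hypothetical, and it assumes the endpoint returns JSON when sent an Accept: application/json header, which depends on the response formats enabled above.

```python
import json
from urllib.parse import urljoin
from urllib.request import Request, urlopen


def build_node_url(site_url: str, endpoint_name: str, node_id: int) -> str:
    """Join the site URL, the machine-readable endpoint name and the node path."""
    base = site_url if site_url.endswith("/") else site_url + "/"
    return urljoin(base, f"{endpoint_name}/node/{node_id}")


def fetch_node(url: str, timeout: float = 10.0) -> dict:
    """GET the node and decode the JSON body (needs network access).

    Assumption: JSON is enabled as a response format on the endpoint.
    """
    request = Request(url, headers={"Accept": "application/json"})
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Build the URL for the Order Zen page used in this post.
url = build_node_url("https://dev-samplefocalcxm.pantheonsite.io", "products", 7)
print(url)
```

Calling fetch_node(url) should return the node as a dictionary whose keys (nid, title, body and so on) can then be mapped into the Coveo JSON configuration.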

How to Integrate the Pages from Drupal into Coveo Search?

  • A source is a Coveo Cloud virtual container holding all items related to a repository, such as your company website or SharePoint system. This content is searchable through a search interface.
  • To retrieve the content and create a source, Coveo Cloud uses a connector, a module that establishes a connection with a specific type of repository. A connector extracts the desired data as well as the corresponding permissions and stores them in your index.
  • Coveo supports multiple connectors to crawl data from a source. Generic REST API is one of them.
  • A Generic REST API source allows you to crawl content from a remote repository that exposes its data through a REST API. Since we’re generating APIs from Drupal, we can use Coveo’s Generic REST API connector to index the Drupal pages directly.
  • To start, sign up for or log in to Coveo.
  • On a successful login, the Coveo admin console opens.

  • To create a new source, click on Add source.

  • Select the Generic REST API connector from the options list.

  • Using the API response, construct the JSON configuration according to the Coveo format.
  • In the Configuration tab, give the source name, the authentication (if required) and the JSON configuration.
  • Enable optical character recognition (OCR) if you want to make text found in image files, or in PDF files containing images, searchable.
  • JSON configuration according to the Coveo format for: https://dev-samplefocalcxm.pantheonsite.io/products/node/7.

{
  "services": [
    {
      "url": "https://dev-samplefocalcxm.pantheonsite.io/",
      "Paging": {
        "PageSize": 10,
        "OffsetStart": 0,
        "OffsetType": "item",
        "Parameters": {
          "Limit": "limit",
          "Offset": "offset"
        }
      },
      "endpoints": [
        {
          "path": "/products/node/7/",
          "method": "GET",
          "ItemType": "page",
          "Uri": "https://dev-samplefocalcxm.pantheonsite.io/products/node/7/",
          "ClickableUri": "https://dev-samplefocalcxm.pantheonsite.io/products/node/7/",
          "vid": "%[vid]",
          "uid": "%[uid]",
          "title": "%[title]",
          "nid": "%[nid]",
          "type": "%[type]",
          "language": "%[language]",
          "body": "%[body.und]",
          "und": [
            {
              "value": "%[und.value]",
              "format": "%[und.format]",
              "safe_value": "%[und.safe_value]"
            }
          ],
          "name": "%[name]",
          "data": "%[data]"
        }
      ]
    }
  ],
  "SubQueries": [
    {
      "Path": "https://dev-samplefocalcxm.pantheonsite.io/products/node/%[nid]",
      "Method": "GET"
    }
  ]
}
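
To see how the %[...] placeholders in the configuration relate to the Drupal response, here is a small Python illustration. It is only a sketch of the idea and not how Coveo resolves values internally: the lookup and resolve helpers are hypothetical, the sample node is made-up data, and only plain dotted paths are handled.

```python
import re

# Matches placeholders of the form %[some.path]
PLACEHOLDER = re.compile(r"%\[([^\]]+)\]")


def lookup(item: dict, dotted_path: str):
    """Walk a dotted path such as 'body.und' through nested dictionaries."""
    value = item
    for key in dotted_path.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


def resolve(template: str, item: dict) -> str:
    """Replace each %[path] placeholder with the matching value from item."""
    def substitute(match):
        value = lookup(item, match.group(1))
        return "" if value is None else str(value)
    return PLACEHOLDER.sub(substitute, template)


# Made-up sample shaped like a node response; the values are illustrative only.
sample_node = {"nid": "7", "title": "Order Zen", "type": "page", "language": "und"}

print(resolve("%[title]", sample_node))
print(resolve("https://dev-samplefocalcxm.pantheonsite.io/products/node/%[nid]", sample_node))
```

Running a few fields of the real API response through a helper like this is a quick way to spot a placeholder whose path does not exist in the response before building the source.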

  • Then go to the next tab (Content Security) and choose “Everyone” so that all users can access the same content in the search interface.

  • Check the permissions in the Access tab (Users/Admins), then click Save and build source.

  • The indexing pipeline extension (IPE) feature provides a way to execute Python conversion scripts in a securely isolated, non-persistent container, allowing developers to customize how items get indexed. Extension scripts can be executed at two different stages of the indexing pipeline: pre-conversion and post-conversion.
  • To add an extension, click on the source, then click on More and then Manage Extensions.

Click on Add extension to write an IPE.

  • Give the extension a name, a description and the extension script, and click on Save.

  • For this API, we’re changing the ClickableUri: the uri_replacements setting maps “products/node/7” to “node/7”.
  • Once the source is built, we can see all the indexed items (pages) in the Content Browser.
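
The extension script itself is not reproduced in this post, so the following Python sketch only illustrates the replacement logic described above. The function name apply_uri_replacements is hypothetical, and writing the result back to the item through the extension's document object is left out because it depends on the Coveo extension API.

```python
def apply_uri_replacements(uri: str, replacements: dict) -> str:
    """Apply each old -> new substring replacement to the URI, in order."""
    for old, new in replacements.items():
        uri = uri.replace(old, new)
    return uri


# Mapping taken from the step above: drop the "products/" API prefix so the
# clickable link opens the regular Drupal page rather than the API response.
uri_replacements = {"products/node/7": "node/7"}

original = "https://dev-samplefocalcxm.pantheonsite.io/products/node/7/"
print(apply_uri_replacements(original, uri_replacements))
```

The point of the replacement is that the API URL returns JSON, while users clicking a search result should land on the rendered page.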

  • Set up a search page so that you can search for all the items that have been indexed.
  • To set up a search page, go to Search Pages and click Add page.

  • Give the page a name and an HTML title, and then click on Add page.

  • When we click on the Drupal link, the search interface opens, where we can see all the pages from Drupal as results in the result list.
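
The indexed pages can also be queried programmatically rather than through the hosted search page. The sketch below only builds the HTTP request and does not send it. It assumes the Coveo Search API endpoint at platform.cloud.coveo.com/rest/search/v2 with a bearer token; the organization ID and API key shown are placeholders, and build_search_request is a hypothetical helper.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Assumed Search API endpoint; check it against your organization's region.
SEARCH_ENDPOINT = "https://platform.cloud.coveo.com/rest/search/v2"


def build_search_request(query: str, organization_id: str, api_key: str,
                         number_of_results: int = 10) -> Request:
    """Build (but do not send) a POST request for a basic keyword query."""
    body = json.dumps({"q": query, "numberOfResults": number_of_results})
    url = f"{SEARCH_ENDPOINT}?{urlencode({'organizationId': organization_id})}"
    return Request(
        url,
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


# Placeholder credentials: replace with your own organization ID and key.
request = build_search_request("Order Zen", "myorganizationid", "xx-placeholder-api-key")
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen with valid credentials should return a JSON body listing the matching items.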

  • Once the source is built and all the items are indexed in Coveo, what happens if we add, update or delete pages in Drupal and want Coveo to display the latest results? Coveo provides three source update types, Refresh, Rescan and Rebuild, to ensure that the source is up to date with the system and that source configuration changes are applied.
  • Rebuild: The rebuild update type scans all items within the source. Once the scan is complete, all of the items are re-indexed.  Rebuild is only necessary following a source configuration change that affects the indexed content. When needed, you can start a rebuild manually.
  • Rescan: The rescan update type scans all items within the source. Once the scan is complete, only the content and permissions that have been modified are re-indexed. A rescan can be performed manually or automatically through a schedule. It has a medium impact on resources since it crawls an entire source but only re-indexes the items that have changed since the last update.
  • Refresh: The refresh update type scans the items and permissions that have been identified by the source system as having been modified since the last update. Once the scan is complete, the changes are retrieved and the Coveo index is updated. A refresh can be performed manually or automatically through a schedule. It has the smallest impact on resources since it only scans and re-indexes the items that have changed since the last update.
  • To create or update a schedule for refresh and rescan, go to the source (Ex: Focal products), click on More, and then click Manage schedules.

  • In the Edit a Source Schedule panel, enable the update types for which you want to configure schedules, choose a frequency such as hourly, daily or weekly, and click on Update schedule.
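
The difference between the three update types can be summarized with a toy Python model. This is purely illustrative and is not Coveo code: the function items_to_reindex and the page list are hypothetical, and the model only captures which items each update type scans and which it re-indexes, as described above.

```python
def items_to_reindex(update_type: str, all_items: list, changed_items: set):
    """Return (scanned, reindexed) item lists for the given update type."""
    changed = [item for item in all_items if item in changed_items]
    if update_type == "rebuild":
        # Scans everything and re-indexes everything.
        return list(all_items), list(all_items)
    if update_type == "rescan":
        # Scans everything but re-indexes only what changed.
        return list(all_items), changed
    if update_type == "refresh":
        # Scans and re-indexes only what the source system reports as changed.
        return changed, changed
    raise ValueError(f"unknown update type: {update_type}")


pages = ["node/6", "node/7", "node/8"]
modified = {"node/7"}

for update_type in ("rebuild", "rescan", "refresh"):
    scanned, reindexed = items_to_reindex(update_type, pages, modified)
    print(update_type, len(scanned), len(reindexed))
```

This is why refresh has the smallest impact on resources, rescan a medium one, and rebuild the largest.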

Reference Links:

For Drupal:

  • https://en.wikipedia.org/wiki/Content_management
  • https://www.drupal.org/about

To install modules:

  • https://www.drupal.org/project/services
  • https://www.drupal.org/project/restws
  • https://www.drupal.org/project/libraries
  • https://www.drupal.org/project/ctools

For Coveo Search:

  • https://docs.coveo.com/en/0/coveo-documentation-home
  • https://docs.coveo.com/en/1702/index-content/connector-directory
  • https://docs.coveo.com/en/1933/index-content/edit-a-source-schedule

Generic REST API:

  • https://docs.coveo.com/en/1896/index-content/add-or-edit-a-generic-rest-api-source
  • https://docs.coveo.com/en/2030/index-content/generic-rest-api-source-tutorial
  • https://docs.coveo.com/en/3300/tutorials/lesson-1-creating-a-generic-rest-api-source-part-1