Umbraco Examine v4x Powerful Umbraco Indexing


This post is outdated. For the latest information on Examine, please refer to either the Examine page on our site or the Examine CodePlex project home.

Umbraco Examine is a powerful, fully configurable, and extensible library for indexing Umbraco content to allow fast and easy content searching. It uses the Lucene.Net library, which is included in the Umbraco installation (v2.x). It is extremely easy to set up and caters for everything from simple indexing and searching to very complex scenarios through its fully extensible codebase and its event model. The library was built with .NET 3.5 SP1 and has not been tested with earlier versions of .NET.

Basic Setup

  • Copy the DLL files to the bin folder
  • Add the following to the <configSections> portion of your Web.config file:
<section name="UmbLuceneIndex" 
type="TheFarm.Umbraco.Lucene.Configuration.IndexSets, TheFarm.Umbraco.Lucene" />
  • For the most basic setup, add the following to the configuration in your Web.config (Also see the readme.txt and app.config files in the binaries download!):
<UmbLuceneIndex DefaultIndexSet="MyIndexSet" EnableDefaultActionHandler="true">
<IndexSet SetName="MyIndexSet" IndexPath="~/data/UmbracoExamine/" MaxResults="100">
<IndexUmbracoFields>
<add Name="id" /> <!-- REQUIRED -->
<add Name="nodeName" /> <!-- REQUIRED -->
<add Name="updateDate" />
<add Name="writerName" />
<add Name="path" />
<add Name="nodeTypeAlias" /> <!-- REQUIRED -->
</IndexUmbracoFields>
<IndexUserFields>
<add Name="PageTitle"/>
<add Name="PageContent"/>
</IndexUserFields>
<IncludeNodeTypes />
<ExcludeNodeTypes />
</IndexSet>
</UmbLuceneIndex>
  • Create the folder ~/data/UmbracoExamine/, since this is the index path specified above.
    • Ensure that the IIS user has full control on this folder.
  • Since EnableDefaultActionHandler is set to true, each time a node is published it will be indexed based on the rules supplied in the configuration. When a node is unpublished, it will automatically be removed from the index.
  • Log into Umbraco, publish a node and verify that files have been created in the index path as specified above.
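
The final verification step can also be scripted. The following is a minimal sketch, assuming the configured index path has already been mapped to a physical folder (for example with Server.MapPath). The IndexFolderCheck class and its HasIndexFiles method are hypothetical helpers and are not part of the library:

```csharp
using System;
using System.IO;

public static class IndexFolderCheck
{
    // Returns true when the folder exists and contains at least one file.
    // physicalPath is assumed to be the mapped form of the IndexPath setting.
    public static bool HasIndexFiles(string physicalPath)
    {
        if (!Directory.Exists(physicalPath))
        {
            return false;
        }

        // Look in subfolders too, since the exact on-disk layout is not assumed here.
        string[] files = Directory.GetFiles(physicalPath, "*", SearchOption.AllDirectories);
        return files.Length > 0;
    }
}
```

If the check returns false after publishing a node, the folder permissions mentioned above are a likely first thing to review.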

Basic Search

  • To perform a search:
UmbracoIndexer examine = new UmbracoIndexer();
List<SearchResult> results = examine.Search("find this", true);
  • The returned structure is simple, containing 3 properties: Id, Score and Fields:
public int Id { get; set; }
public float Score { get; set; }
public Dictionary<string, string> Fields { get; set; }
  • The Fields property contains all of the field data that has been configured in the web.config file.
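
As an illustration of reading the returned structure, here is a minimal sketch that turns a list of results into display text. The SearchResult class below is only a stand-in that mirrors the three properties listed above so that the sketch compiles on its own; ResultFormatter and Describe are hypothetical names. The sketch also assumes that a field missing from the configuration is simply absent from the Fields dictionary:

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text;

// Stand-in mirroring the three properties listed above; in a real site,
// use the SearchResult type that ships with the library instead.
public class SearchResult
{
    public int Id { get; set; }
    public float Score { get; set; }
    public Dictionary<string, string> Fields { get; set; }
}

public static class ResultFormatter
{
    // Builds one line per result in the form "<id> (<score>): <nodeName>".
    public static string Describe(List<SearchResult> results)
    {
        StringBuilder sb = new StringBuilder();
        foreach (SearchResult result in results)
        {
            // Fall back to a placeholder when nodeName is not available.
            string name = "(no nodeName)";
            string value;
            if (result.Fields != null && result.Fields.TryGetValue("nodeName", out value))
            {
                name = value;
            }

            sb.AppendLine(string.Format(
                CultureInfo.InvariantCulture, "{0} ({1:0.00}): {2}", result.Id, result.Score, name));
        }
        return sb.ToString();
    }
}
```

Looking fields up with TryGetValue rather than the indexer avoids an exception when a field has not been configured for indexing.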

Advanced Setup

You can create multiple indexes depending on your needs. For example, you may want different indexes for different portal sites in your content tree, or separate indexes for different types of content, such as one for News and one for Forum. Creating different indexes is easy:

<UmbLuceneIndex DefaultIndexSet="Site1" EnableDefaultActionHandler="true"> 
<!-- Create an index for a site called 'Site1' which has a starting parent node in the
content tree of 1234. Only node 1234 and its children will be indexed. -->
<IndexSet SetName="Site1" IndexPath="~/data/indexes/site1/" MaxResults="100" IndexParentId="1234">
<IndexUmbracoFields>
<add Name="id" /> <!-- REQUIRED -->
<add Name="nodeName" /> <!-- REQUIRED -->
<add Name="updateDate" />
<add Name="writerName" />
<add Name="path" />
<add Name="nodeTypeAlias" /> <!-- REQUIRED -->
<add Name="parentID"/>
</IndexUmbracoFields>
<IndexUserFields>
<add Name="PageTitle"/>
<add Name="PageContent"/>
<add Name="CommentText"/>
<add Name="CommentUser"/>
<add Name="umbracoNaviHide"/>
</IndexUserFields>
<IncludeNodeTypes>
<add Name="HomePage" />
<add Name="BasicPage" />
<add Name="Comment" />
</IncludeNodeTypes>
<ExcludeNodeTypes />
</IndexSet>

<!-- Create an index for a site called 'Site2' which has a starting parent node in the
content tree of 4567. Only node 4567 and its children will be indexed. -->
<IndexSet SetName="Site2" IndexPath="~/data/indexes/site2/" MaxResults="100" IndexParentId="4567">
<IndexUmbracoFields>
<add Name="id" /> <!-- REQUIRED -->
<add Name="nodeName" /> <!-- REQUIRED -->
<add Name="updateDate" />
<add Name="writerName" />
<add Name="path" />
<add Name="nodeTypeAlias" /> <!-- REQUIRED -->
</IndexUmbracoFields>
<IndexUserFields>
<add Name="PageTitle"/>
<add Name="PageContent"/>
<add Name="umbracoNaviHide"/> <!-- You can add as many user fields here as you would like to be indexed... -->
</IndexUserFields>
<IncludeNodeTypes />
<ExcludeNodeTypes><!-- Index everything except for document types of 'UserNotes' -->
<add Name="UserNotes" />
</ExcludeNodeTypes>
</IndexSet>
</UmbLuceneIndex>
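
With one index set per site, calling code needs some way to decide which set name to use. The following is a minimal sketch of one option, a lookup by request host name. The host names, the IndexSetResolver class, and the ForHost method are all hypothetical; only the set names Site1 and Site2 come from the configuration above:

```csharp
using System;
using System.Collections.Generic;

public static class IndexSetResolver
{
    // Hypothetical host names; the values match the SetName attributes above.
    private static readonly Dictionary<string, string> SetsByHost =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "site1.example.com", "Site1" },
            { "site2.example.com", "Site2" }
        };

    // Returns the index set name for a host, or null when the host is unknown.
    public static string ForHost(string host)
    {
        if (host == null)
        {
            return null;
        }

        string setName;
        if (SetsByHost.TryGetValue(host, out setName))
        {
            return setName;
        }
        return null;
    }
}
```

A null return can be treated as a signal to fall back to the set named in DefaultIndexSet.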

Advanced Search

Several overloaded Search methods are available for performing different types of searches; which one to use depends on the kind of results you want to achieve:

// Creates a new examiner that searches Site1, since Site1 is
// listed as the default index set in the configuration.
UmbracoIndexer examine = new UmbracoIndexer();
List<SearchResult> results = examine.Search("find this in Site1", true);

// Creates a new examiner that searches Site2.
UmbracoIndexer examine2 = new UmbracoIndexer("Site2");
List<SearchResult> results2 = examine2.Search("find this in Site2", true);

// Disables wildcard searching.
List<SearchResult> results3 = examine2.Search("find exact matches in Site2", false);

// Searches Site2, but only in NewsArticle document types.
List<SearchResult> results4 = examine2.Search("find news in Site2", "NewsArticle", true, null);

// Searches Site2, but only for nodes that are children of the node with ID 4999.
List<SearchResult> results5 = examine2.Search("find something in Site2", "", true, 4999);

// Searches Site1 in all of its configured document types, but only in the
// properties PageTitle and PageContent, returning a maximum of 10 results.
List<SearchResult> results6 =
    examine.Search("find in Site1", "", true, null, new string[] {"PageTitle","PageContent"}, 10);
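
Because umbracoNaviHide is indexed as a user field in the configuration above, results can be filtered on it after a search. The following is a minimal sketch of such a filter. NaviHideFilter and IsHidden are hypothetical names, and the sketch assumes that a ticked umbracoNaviHide checkbox is stored in the index as the string "1", which should be verified against your own index:

```csharp
using System;
using System.Collections.Generic;

public static class NaviHideFilter
{
    // Returns true when the indexed fields mark the node as hidden.
    // Assumption: a ticked umbracoNaviHide checkbox is indexed as "1".
    public static bool IsHidden(Dictionary<string, string> fields)
    {
        if (fields == null)
        {
            return false;
        }

        string value;
        return fields.TryGetValue("umbracoNaviHide", out value) && value == "1";
    }
}
```

Applied to any of the result lists above, this could look like results.FindAll(r => !NaviHideFilter.IsHidden(r.Fields)).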

 

