This week I found myself needing to do something very specific for the first time: remove from the index ALL items based on a given list of templates, index ALL items based on a different list of templates, and finally index only a few items based on one very specific template.
The first two requirements are easy to meet: all I needed was the well-known excluded templates feature to list the templates that must be excluded. Every other template would be included.
The third point was the complicated one. It was the first time I had faced this requirement, and since it looked like a pretty common one, I decided it was time to write something here and potentially help someone in the future (probably myself).
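For reference, a template exclusion list in a Sitecore index configuration usually looks something like the fragment below. This is only a sketch: the parent element and the hint name can differ between search providers and versions, and the element name and GUID are hypothetical placeholders, so check the configuration files shipped with your own instance.

```xml
<!-- Sketch only: parent element and hint name may vary by provider and version -->
<documentOptions>
  <exclude hint="list:AddExcludedTemplate">
    <!-- Hypothetical entry: the element name is just a label, the value is the template ID -->
    <landingPage>{11111111-1111-1111-1111-111111111111}</landingPage>
  </exclude>
</documentOptions>
```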
The most important thing to keep in mind when defining the strategy to achieve our goal here is that we can’t just remove a few items from the index. We need to remove those items during the indexing process, not after.
It doesn’t matter whether you are using Solr or Coveo: you can always use an inbound filter to keep anything you don’t want out of your indexes. And that’s exactly the strategy we are going to use here.
Let’s take Coveo as the example in this article, since it was the search provider used by my client. If this is your first time creating an Inbound Filter for Coveo, don’t worry. It couldn’t be simpler, and the documentation Coveo provides is all you need.
Creating the Filter Processor
We need to create a new filter processor that inherits from Coveo.SearchProvider.InboundFilters.AbstractCoveoInboundFilterProcessor. The class is really simple: all it needs to do is override the Process(…) method.
The idea here is to have a very flexible processor that can be used for any template in your Sitecore instance.
Basically, it needs two parameters:
- The Template ID we want to exclude partially;
- The list of items using the above template that should not be excluded, identified by their full paths.
The Process(…) method receives every item that is about to be indexed and checks whether its template is the partially excluded one. If so, it checks whether the current item is one of the exceptions that should be kept. If it is not, the processor marks the item as excluded. Otherwise, the item is left alone and Sitecore moves on to the next processor in the same pipeline.
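    public class RemoveItemsFromTemplateFilter : AbstractCoveoInboundFilterProcessor
    {
        // Set from the <excludedTemplateId> element in the configuration.
        public string ExcludedTemplateId { get; set; }

        // Full paths of the items that must stay in the index.
        private IList<string> PagePathsToInclude { get; set; }

        public RemoveItemsFromTemplateFilter()
        {
            PagePathsToInclude = new List<string>();
        }

        // Called once per child node of <pagesToInclude> (see the raw: hint in the config).
        public void AddPagePathsToInclude(XmlNode node)
        {
            string pagePath = XmlUtil.GetValue(node);
            PagePathsToInclude.Add(pagePath);
        }

        public override void Process(CoveoInboundFilterPipelineArgs p_Args)
        {
            // Skip items that a previous processor has already excluded.
            if (p_Args.IndexableToIndex != null && !p_Args.IsExcluded && ShouldExecute(p_Args))
            {
                // Exclude only items of the target template that are not listed as exceptions.
                p_Args.IsExcluded = IsExcludedTemplate(p_Args.IndexableToIndex)
                                    && !KeepThisItem(p_Args.IndexableToIndex);
            }
        }

        private bool IsExcludedTemplate(IIndexableWrapper item)
        {
            // Data.ID resolves to Sitecore.Data.ID when this class lives in a Sitecore.* namespace.
            return item.Item.Template.ID == new Data.ID(ExcludedTemplateId);
        }

        private bool KeepThisItem(IIndexableWrapper item)
        {
            // Exact (case-sensitive) match on the item's full path.
            return PagePathsToInclude.Contains(item.Item.Paths.FullPath);
        }
    }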
Changing the Configuration
Since I’m using Coveo in this project, I’m going to use Coveo.SearchProvider.Custom.config to add the new processor to my Sitecore instance. If you are using Solr, all you need to do is add the processor to the right pipeline.
    [...]
    <coveoInboundFilterPipeline>
      <processor type="Sitecore.Feature.Coveo.Processors.RemoveItemsFromTemplateFilter, Sitecore.Feature.Coveo">
        <excludedTemplateId>{8EE208F9-A6A6-41E2-88A0-C188737A178C}</excludedTemplateId>
        <pagesToInclude hint="raw:AddPagePathsToInclude">
          <work>/sitecore/content/MySite/Home/Work</work>
          <about>/sitecore/content/MySite/Home/About</about>
        </pagesToInclude>
      </processor>
    </coveoInboundFilterPipeline>
    [...]
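For anyone on Solr, a rough equivalent could look like the sketch below. It assumes Sitecore’s stock InboundIndexFilterProcessor base class from Sitecore.ContentSearch; the class name and namespace are hypothetical, and the code has not been tested against a real instance, so treat it as a starting point rather than a finished implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;
using Sitecore.ContentSearch;
using Sitecore.ContentSearch.Pipelines.IndexingFilters;
using Sitecore.Data;
using Sitecore.Xml;

namespace MySite.Feature.Search.Processors // hypothetical namespace
{
    // Hypothetical Solr counterpart of the Coveo filter shown above.
    public class RemoveItemsFromTemplateSolrFilter : InboundIndexFilterProcessor
    {
        // Set from the <excludedTemplateId> element in the configuration.
        public string ExcludedTemplateId { get; set; }

        // Assumption: paths are compared case-insensitively here,
        // unlike the exact match used in the Coveo version.
        private readonly HashSet<string> _pagePathsToInclude =
            new HashSet<string>(StringComparer.OrdinalIgnoreCase);

        // Called once per child node of <pagesToInclude hint="raw:AddPagePathsToInclude">.
        public void AddPagePathsToInclude(XmlNode node)
        {
            _pagePathsToInclude.Add(XmlUtil.GetValue(node));
        }

        public override void Process(InboundIndexFilterArgs args)
        {
            // Respect exclusions made by earlier processors.
            if (args.IsExcluded) return;

            // Only Sitecore items are relevant; ignore other indexables.
            var indexable = args.IndexableToIndex as SitecoreIndexableItem;
            if (indexable == null || indexable.Item == null) return;

            var item = indexable.Item;

            // Leave items of every other template untouched.
            if (item.TemplateID != new ID(ExcludedTemplateId)) return;

            // Exclude the item unless its path is listed as an exception.
            args.IsExcluded = !_pagePathsToInclude.Contains(item.Paths.FullPath);
        }
    }
}
```

The processor would then be registered under the indexing.filterIndex.inbound pipeline, with the same excludedTemplateId and pagesToInclude child elements used in the Coveo configuration.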
How to Apply Everything
Alright, everything is in place and now we need to make it happen. There is no magic here: the only way to remove items that are already indexed is to rebuild the entire index. Yeah, I know, it takes a long time… but there is no other way.
After rebuilding your index, you should be able to verify that all items based on the given template are no longer part of your index, except for those explicitly listed in the pagesToInclude configuration.
I hope this article helps you, and please let me know if you still have questions about anything related to this subject! See you next time!