I ran into a problem while crawling our SharePoint sites: the crawl failed on Outlook message (.msg) files that contained embedded PDF documents.
The crawl log showed these errors:
1. The filtering process could not be initialized. Verify that the file extension is a known type and is correct
2. Error HRESULT E_FAIL has been returned from a call to a COM component.
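To see which file types are affected, it can help to count how often each of the two errors above appears per file extension. The following is only a rough sketch: it assumes the crawl log has been exported to text with one `URL<TAB>message` entry per line, which is a hypothetical layout and not a format SharePoint guarantees. The names `KNOWN_ERRORS` and `summarize` are made up for illustration.

```python
import posixpath
from collections import Counter
from urllib.parse import urlparse

# The two messages quoted above, keyed by a short label (labels are made up).
KNOWN_ERRORS = {
    "filter_init": "The filtering process could not be initialized",
    "com_e_fail": "Error HRESULT E_FAIL has been returned from a call to a COM component",
}


def summarize(lines):
    """Count known crawl errors per (error label, file extension).

    Assumes each line looks like "<url>\t<message>" (hypothetical export format).
    """
    counts = Counter()
    for line in lines:
        url, _, message = line.partition("\t")
        # Take the extension from the URL path only, e.g. ".pdf" or ".msg".
        ext = posixpath.splitext(urlparse(url.strip()).path)[1].lower()
        for label, text in KNOWN_ERRORS.items():
            if text.lower() in message.lower():
                counts[(label, ext)] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "http://server/docs/report.pdf\tThe filtering process could not be initialized.",
        "http://server/docs/mail.msg\tError HRESULT E_FAIL has been returned from a call to a COM component.",
        "http://server/docs/ok.docx\tCrawled",
    ]
    for (label, ext), n in summarize(sample).items():
        print(label, ext, n)
```

If the counts cluster on `.pdf` and `.msg`, that points to a missing or misconfigured PDF filter rather than a general crawl problem.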
From what I found online, I needed to install an additional filter (an IFilter) on the server so that the crawler can read and index PDF files. There are two free filters available: Adobe PDF IFilter 6.0 and the Foxit PDF IFilter.
I tried both filters, but I still got errors for PDF and message files. I knew I was missing some small configuration or installation step.
Finally, I found one very useful link. According to it, Adobe has not released a separate IFilter for PDF files since Adobe Reader 7.
Instead, it suggests installing Adobe Reader 8 or Reader 9 on the server, because after Reader 7 Adobe packaged the IFilter functionality into Reader itself as a plug-in.
After following the configuration steps in the links below, it worked: I can now crawl message files and PDF documents on our SharePoint sites.
MAIN IMPORTANT LINK:
Good reference links:
http://www.adobe.com/support/downloads/detail.jsp?ftpID=2611&promoid=DNRLI (All about Adobe PDF IFilter v6.0)
http://www.adobe.com/support/downloads/product.jsp?product=1&platform=Windows (Different products from Adobe)
http://blog.tylerholmes.com/2008/04/walkthrough-installing-adobe-v6-pdf.html (How to install the Adobe filter)
http://downloads.fuxinsoftware.com.cn/pub/foxit/manual/enu/FoxitPDFIFilter10forMOSS_manual.pdf (Manual for the Foxit filter)