FAQ: Problems indexing PDF on Windows XP
When Windows XP Professional is used as the server platform, some
installations report problems extracting the raw text from PDF documents for
indexing. These problems have been reported for version 6.0 of the iFilter
for PDF provided by Adobe.
In powerKNOW this usually shows up in one of the following ways:
- The error "Unable to extract raw text from text given..." is reported when
storing a knowledge document with an attachment of type PDF.
- A full-text search for the contents of an attachment of type PDF does not
deliver the expected results.
Until a fix for PDF iFilter 6.0 is available, you can use version 5.0
to index PDF documents. Version 5.0 can be found at
http://www.adobe.com/support/downloads/detail.jsp?ftpID=1276.
See Solution 2 in "Adobe PDF IFilter returns unexpected search results" for
a description of how to get version 5.0 running on Windows XP.
To use the Adobe iFilter with powerKNOW, you only need to change the following registry key:
HKEY_CLASSES_ROOT\CLSID\{4C904448-74A9-11d0-AF6E-00C04FD8DC02}\InprocServer32
Under this registry key, the value "ThreadingModel" has to be changed from
"Apartment" to "Both" (steps 1 to 5 in the Adobe description).
None of the other changes are necessary for using the iFilter with powerKNOW.
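The registry change described above can also be prepared as a .reg file and imported with the Registry Editor instead of editing the value by hand. The following is only a minimal sketch in Python: the key path and the "ThreadingModel" value are taken from this FAQ, while the function name build_reg_file is a hypothetical helper chosen for illustration. It assumes the standard "Windows Registry Editor Version 5.00" file format. Back up the registry before importing any file.

```python
# Sketch: generate the text of a .reg file that sets ThreadingModel to "Both".
# The key path and value name come from this FAQ; build_reg_file is an
# illustrative (hypothetical) helper, not part of powerKNOW or the Adobe iFilter.

KEY_PATH = (
    r"HKEY_CLASSES_ROOT\CLSID"
    r"\{4C904448-74A9-11d0-AF6E-00C04FD8DC02}"
    r"\InprocServer32"
)


def build_reg_file(key_path: str = KEY_PATH, threading_model: str = "Both") -> str:
    """Return .reg file text that sets "ThreadingModel" under key_path."""
    lines = [
        "Windows Registry Editor Version 5.00",  # header for the .reg format
        "",
        f"[{key_path}]",                          # key to modify
        f'"ThreadingModel"="{threading_model}"',  # string value to set
        "",
    ]
    # .reg files use Windows line endings.
    return "\r\n".join(lines)


if __name__ == "__main__":
    # Print the file content; save it with a .reg extension and review it
    # before importing it on the server.
    print(build_reg_file())
```

After importing such a file, the value can be checked in the Registry Editor under the key named above; it should read "Both".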
If you already have knowledge documents with attachments of type PDF,
consider reindexing those knowledge documents. See
"Reindexing existing documents" for details.