FAQ: Fulltext indexing for additional document types (PDF, ...)
On a standard Windows machine, the fulltext filters for Microsoft Office documents
are already installed. Accordingly, the internal configuration of
powerKNOW is set to put the following attachment types into the
fulltext index (required for fulltext search on attachments):
- Microsoft Word documents (.doc)
- Microsoft Excel documents (.xls)
- Microsoft PowerPoint documents (.ppt)
- HTML documents (.html, .htm)
- Plain text documents (.txt)
If you want other types of documents (as attachments) to be indexed as well,
additional filtering software is required. Follow these steps to set up a new
document type for indexing:
- Get an iFilter module for the required document type and install it as
documented for the module.
These filters are available from various suppliers for different
document types.
Adobe offers an iFilter module for PDF free of charge
(see: iFilter for PDF).
Other iFilters can be found at www.iFilter.org.
- Configure powerKNOW to index this document type.
To do this, add the file extension of this document type (e.g. 'pdf' for
PDF documents) to the config key
'idxExtensionsIndexed'.
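
The configuration step can be sketched as follows. This is a minimal illustration
only: this FAQ does not describe the value format of 'idxExtensionsIndexed', so the
whitespace-separated list, the sample default value and the helper name
add_extension are all assumptions. Check your powerKNOW configuration for the
actual format before changing it.

```python
def add_extension(current_value: str, extension: str) -> str:
    """Return the extension list with `extension` added exactly once."""
    # Normalize the input so that 'PDF', '.pdf' and ' pdf ' all become 'pdf'.
    ext = extension.strip().lstrip(".").lower()
    if not ext:
        raise ValueError("extension must not be empty")

    # ASSUMPTION: the key holds a whitespace-separated list of extensions.
    extensions = current_value.split()

    # Append only if the extension is not already listed (case-insensitive).
    if ext not in (e.lower() for e in extensions):
        extensions.append(ext)
    return " ".join(extensions)


# Hypothetical current value covering the document types listed above.
current_value = "doc xls ppt html htm txt"
print(add_extension(current_value, ".PDF"))  # prints: doc xls ppt html htm txt pdf
```

The helper leaves the value unchanged when the extension is already present, so
running it twice for the same document type does not create a duplicate entry.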
After completing these steps, new attachments of the new document type will be indexed
by powerKNOW. The contents of these attachments will be available for
fulltext search.
If you are using the Adobe iFilter module for PDF on Windows XP and you are
experiencing problems, please have a look at our FAQ
"Problems indexing PDF on Windows XP".