About fileextension.properties file.

Added by sachin patil 550 days ago

Hi,

I have a question about the fileextension.properties file.
As I understand it, this file contains the list of file extensions that should be scanned, but
I can scan/search *.cpp, *.py etc. files, which are not listed in fileextension.properties.

Can you please tell me how the fileextension.properties file is used in scanning?

Thanks in advance.

Best Regards,
Sachin


Replies

RE: About fileextension.properties file. - Added by Karl Heinz Marbaise 549 days ago

Hi Sachin,

With this file it is possible to associate particular extensions with a particular document parser.

For example, the usual configuration
zip = com.soebes.supose.scan.document.ScanArchiveDocument
jar = com.soebes.supose.scan.document.ScanArchiveDocument
tar = com.soebes.supose.scan.document.ScanArchiveDocument
tar.gz = com.soebes.supose.scan.document.ScanArchiveDocument
tar.bz2 = com.soebes.supose.scan.document.ScanArchiveDocument
tgz = com.soebes.supose.scan.document.ScanArchiveDocument
tbz2 = com.soebes.supose.scan.document.ScanArchiveDocument

will not extract the contents of jar, zip, tar.gz archives. If you like, you can instead configure it to extract the contents of such archives and have them indexed as well:
#zip = com.soebes.supose.scan.document.ScanArchiveWithContentDocument
jar = com.soebes.supose.scan.document.ScanArchiveWithContentDocument
tar = com.soebes.supose.scan.document.ScanArchiveWithContentDocument
tar.gz = com.soebes.supose.scan.document.ScanArchiveWithContentDocument
tar.bz2 = com.soebes.supose.scan.document.ScanArchiveWithContentDocument
tgz = com.soebes.supose.scan.document.ScanArchiveWithContentDocument
tbz2 = com.soebes.supose.scan.document.ScanArchiveWithContentDocument

Or you can associate a particular extension with a particular scan type, e.g. if you have files with the extension .xyz that you would like to have parsed as XML files, you can define that in this file.

At the moment the default behaviour is that every document which Subversion recognizes as non-binary is parsed by the default handler, but you can change this.
If a file is of a binary type, SupoSE will only index its file name etc., but not its content.
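
To illustrate why files such as *.cpp are still scanned although they are not listed, here is a minimal Java sketch of an extension lookup with a fallback. This is not SupoSE's actual code: the class ExtensionLookup, its methods, the DEFAULT_HANDLER placeholder and the "longest extension first" matching are all assumptions made for illustration. Only the ScanArchiveDocument class name is taken from the configuration above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Locale;
import java.util.Properties;

public class ExtensionLookup {
    // Hypothetical placeholder: the real default handler class is not named in this thread.
    static final String DEFAULT_HANDLER = "DEFAULT";

    // Parse "extension = handler class" lines in the standard properties format.
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Return the configured handler for a file name, or the default if none matches.
    static String handlerFor(Properties props, String fileName) {
        String name = fileName.toLowerCase(Locale.ROOT);
        // Assumption: try the longest candidate extension first,
        // so that "tar.gz" is found before "gz".
        int dot = name.indexOf('.');
        while (dot >= 0) {
            String handler = props.getProperty(name.substring(dot + 1));
            if (handler != null) {
                return handler;
            }
            dot = name.indexOf('.', dot + 1);
        }
        // No entry for this extension: fall back to the default handler.
        return DEFAULT_HANDLER;
    }

    public static void main(String[] args) {
        Properties props = load(
            "jar = com.soebes.supose.scan.document.ScanArchiveDocument\n"
          + "tar.gz = com.soebes.supose.scan.document.ScanArchiveDocument\n");
        System.out.println(handlerFor(props, "lib/tool.jar"));
        System.out.println(handlerFor(props, "dist/src.tar.gz"));
        System.out.println(handlerFor(props, "src/main.cpp"));
    }
}
```

In this sketch main.cpp has no entry, so it ends up with the default handler. That matches the behaviour described above: the file only overrides the parser for the listed extensions, it is not a list of what gets scanned.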

Kind regards
Karl Heinz Marbaise