Class ResourceSearchTextExtractorImpl
java.lang.Object
ch.tocco.nice2.dms.impl.entitylistener.ResourceSearchTextExtractorImpl
- All Implemented Interfaces:
ResourceSearchTextExtractor
@Component
public class ResourceSearchTextExtractorImpl
extends Object
implements ResourceSearchTextExtractor
Extracts the text content of a binary with Apache Tika.
The service can be configured through application properties:
nice2.dms.fullTextIndex.ignoreFileExtensions: a comma-separated blacklist of file extensions for which the extraction process is skipped.
nice2.dms.fullTextIndex.maxContentSizeInMb: if the content is larger than this threshold, an empty string is returned.
nice2.dms.fullTextIndex.maxFileSizeInMb: if the file is larger than this threshold, the content is not extracted and an empty string is returned.
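The following is a minimal sketch of how the three properties above could gate extraction. It is not the actual implementation: the class `ExtractionLimits`, its method names, the use of 1024 * 1024 bytes per MB, and measuring content size in UTF-8 bytes are all assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper, not part of the nice2 API. */
final class ExtractionLimits {
    // Assumption: one MB is 1024 * 1024 bytes.
    private static final double BYTES_PER_MB = 1024.0 * 1024.0;

    private final Set<String> ignoredExtensions; // lower-case, without leading dot
    private final double maxFileSizeInMb;
    private final double maxContentSizeInMb;

    ExtractionLimits(Set<String> ignoredExtensions,
                     double maxFileSizeInMb,
                     double maxContentSizeInMb) {
        this.ignoredExtensions = ignoredExtensions;
        this.maxFileSizeInMb = maxFileSizeInMb;
        this.maxContentSizeInMb = maxContentSizeInMb;
    }

    /** Returns false for blacklisted extensions and for files over the file size threshold. */
    boolean shouldExtract(String fileName, long fileSizeInBytes) {
        int dot = fileName.lastIndexOf('.');
        String extension = dot < 0
                ? ""
                : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        if (ignoredExtensions.contains(extension)) {
            return false;
        }
        return fileSizeInBytes <= maxFileSizeInMb * BYTES_PER_MB;
    }

    /** Returns the text unchanged, or an empty string if it exceeds the content threshold. */
    String limitContent(String extractedText) {
        // Assumption: content size is measured as the UTF-8 encoded length.
        long sizeInBytes = extractedText.getBytes(StandardCharsets.UTF_8).length;
        return sizeInBytes > maxContentSizeInMb * BYTES_PER_MB ? "" : extractedText;
    }
}
```

In this sketch, `shouldExtract` would be checked before the binary is handed to Tika, and `limitContent` would be applied to the extracted text afterwards, so that both skip cases yield an empty string as described above.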
-
Constructor Summary
-
Method Summary
Modifier and Type    Method    Description
                     extractContent(Binary binaryValue)
void                 setIgnoreFileExtensions(String ignoreFileExtensions)
void                 setMaxContentSizeInMb(double maxContentSizeInMb)
void                 setMaxFileSizeInMb(double maxFileSizeInMb)
-
Constructor Details
-
ResourceSearchTextExtractorImpl
public ResourceSearchTextExtractorImpl(org.slf4j.Logger log)
-
-
Method Details
-
extractContent
- Specified by:
extractContent in interface ResourceSearchTextExtractor
- Throws:
IOException
-
setIgnoreFileExtensions
@Value("${nice2.dms.fullTextIndex.ignoreFileExtensions}")
public void setIgnoreFileExtensions(String ignoreFileExtensions)
-
setMaxContentSizeInMb
@Value("${nice2.dms.fullTextIndex.maxContentSizeInMb}")
public void setMaxContentSizeInMb(double maxContentSizeInMb)
-
setMaxFileSizeInMb
@Value("${nice2.dms.fullTextIndex.maxFileSizeInMb}")
public void setMaxFileSizeInMb(double maxFileSizeInMb)
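
Because setIgnoreFileExtensions receives the whole property value as a single String, the comma-separated list has to be split somewhere. The sketch below shows one way this could be done. The class `IgnoredExtensionsParser` is hypothetical, and trimming whitespace, dropping a leading dot, and lower-casing are assumptions about sensible normalization, not documented behavior of this class.

```java
import java.util.Arrays;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

/** Hypothetical helper, not part of the nice2 API. */
final class IgnoredExtensionsParser {

    /** Splits a comma-separated property value into a normalized set of extensions. */
    static Set<String> parse(String ignoreFileExtensions) {
        if (ignoreFileExtensions == null || ignoreFileExtensions.isBlank()) {
            return Set.of();
        }
        return Arrays.stream(ignoreFileExtensions.split(","))
                .map(String::trim)
                // Accept both "zip" and ".zip" (assumption).
                .map(s -> s.startsWith(".") ? s.substring(1) : s)
                .map(s -> s.toLowerCase(Locale.ROOT))
                // Drop empty entries produced by stray commas.
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toUnmodifiableSet());
    }
}
```

With an illustrative property value such as `zip, .ISO ,,mp4`, this sketch produces a set containing `zip`, `iso`, and `mp4`.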
-