We have a requirement to identify the non-OCR documents (non-text-searchable PDFs) in Livelink 9.7.1. Is there any flag I can use to identify non-OCR documents in Livelink 9.7.1?
I believe that if you put any document in Livelink and do not specifically exclude its MIME type from full-text (FT) indexing, Livelink will run it through its search DCS conversion process to FT-index it. In the case of a PDF with no readable text it will mark that and move on (obviously, anything readable it will put into the index). Being 9.7.1, your search engine is very old, so you may not benefit from newer advances in this area. However, OTSearch.jar is a single jar file that does all that magic of searching; if you contact OT they will probably have better ideas for you.
Otherwise there is a region (regions are what we search through in Livelink) that holds a summary of the text. I think it is called OTSummary or something like that. My gut feel is that if the document is readable you will have something in that region, otherwise it will be empty, so you may have to use some kind of set logic: a search on OTSummary would give you your searchable PDFs, and then with the database and a little creativity you can find the rest.
If I were doing this I would put the items in a collection and see if there are any free APIs I could use to mark each one as good/bad. Your best bet is OT, because they get these kinds of requests day in, day out.
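The set logic could be sketched roughly like this. It is only an illustration, assuming you have already exported two lists of node IDs by whatever means you have: every PDF known to the database, and the PDFs returned by a search on the summary region. The function name, variable names, and sample IDs are all hypothetical, and the region name itself is my guess, so check it on your system first.

```python
def find_non_ocr_candidates(all_pdf_ids, ids_with_summary):
    """Return the IDs of PDFs that have no indexed summary text, sorted.

    all_pdf_ids      -- hypothetical export of every PDF node ID (e.g. from a database report)
    ids_with_summary -- hypothetical export of PDF node IDs found by searching the summary region
    """
    # Set difference: everything that is a PDF but did not come back from the search.
    return sorted(set(all_pdf_ids) - set(ids_with_summary))


# Made-up sample data, for illustration only.
all_pdf_ids = [1001, 1002, 1003, 1004]
ids_with_summary = [1001, 1003]

non_ocr_candidates = find_non_ocr_candidates(all_pdf_ids, ids_with_summary)
print(non_ocr_candidates)
```

Whatever is left over is only a list of candidates: a PDF could also be missing from the search results because it was never indexed or because its conversion failed for some other reason, so it is worth spot-checking a few by hand.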
Well, if I called the wrong number, why did you answer the phone?
James Thurber, New Yorker cartoon caption, June 5, 1937
Certified OT Developer, Livelink ECM Champion 2008, Livelink ECM Champion 2010