I am indexing 34,000+ documents physically located on the hard drive.
Windows Server 2008 SP2
CF9
Oracle
Thanks to advice in another thread I started, I am indexing the folders one at a time, followed by an update after each. Some of the PDFs can be huge (130 MB), but the average is closer to 1 MB. On occasion I will hit a PDF that is corrupt (if I copy it to my desktop and attempt to open it, Acrobat Pro says it is corrupt).
I have tried using cfpdf to read the header info inside a cftry block, with the catch creating a log entry. That should work, but it hangs while trying to read the document (I assume that is what is happening with Solr too). No exception seems to be thrown: I get no log entry, and the request continues to hang until it times out.
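For reference, a minimal sketch of the kind of check described above. The file path and log file name are hypothetical placeholders, and getInfo is assumed to be the cfpdf action used to read the document info:

```cfml
<!--- Hypothetical path; in practice this would come from looping over the folder. --->
<cfset pdfPath = "D:\docs\sample.pdf">

<cftry>
    <!--- Try to read the PDF's document info before handing the file to the indexer. --->
    <cfpdf action="getInfo" source="#pdfPath#" name="pdfInfo">

    <cfcatch type="any">
        <!--- Only reached if cfpdf throws; a hang never gets here. --->
        <cflog file="pdfIndexErrors" type="error"
               text="Could not read #pdfPath#: #cfcatch.message#">
    </cfcatch>
</cftry>
```

With a corrupt file, the cfpdf call never returns, so the cfcatch branch never runs and nothing is logged.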
Can anyone think of a way to break out of a hung file and continue to index the remaining files?
Thanks