BatchLoader Tuning

Recently I have been tuning BatchLoader. Using sample PDFs of around 100 KB each, throughput started out pretty good at 16 files per second but got progressively worse as more batches were loaded, eventually reaching a low of 2 files per second. Looking at the database activity while BatchLoader was running, it was apparent that the following query was consuming 90% of the overall processing time:

SELECT DocMeta.dID FROM Revisions, DocMeta
upper(dDocName)=upper(:"SYS_B_0") AND
NOT(DocMeta.xCollectionID=:"SYS_B_1") AND

There was already an index on dDocName for the Revisions table, but because this query compares upper(dDocName) rather than the column itself, the existing index is not used! To solve this problem, a new function-based index was created on the Revisions table for upper(dDocName). The result was a clear performance improvement, with throughput in excess of 27 files per second after the change.
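
For anyone wanting to try the same thing, here is a rough sketch of what creating and checking such an index might look like on Oracle. The index name REVISIONS_UPPER_DDOCNAME and the literal 'sample_doc' are made up for illustration, and the SELECT is a simplified stand-in for the BatchLoader query rather than the real one, so treat this as a starting point and not the exact statements used.

```sql
-- Function-based index on the uppercased document name.
-- The index name is hypothetical; pick one that fits your naming scheme.
CREATE INDEX REVISIONS_UPPER_DDOCNAME
    ON Revisions (UPPER(dDocName));

-- Ask the optimizer how it would run a lookup that uses the same expression.
-- 'sample_doc' is a placeholder value, and this query is a simplified
-- stand-in for the one BatchLoader issues.
EXPLAIN PLAN FOR
SELECT dID
  FROM Revisions
 WHERE UPPER(dDocName) = UPPER('sample_doc');

-- Show the plan; an index range scan on the new index suggests it is picked up.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

If the plan still shows a full table scan on Revisions, it may be worth refreshing the optimizer statistics for the table before drawing conclusions.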

