java - My Code is too slow( uses jdbc, oracle.sql.BLOB and searches a string in the blob)

Below is my code snippet

    PreparedStatement preStatement = conn.prepareStatement(query);
    System.out.println("preparing sql ststement: " + query);

    ResultSet result = preStatement.executeQuery();
    System.out.println(" result stored in result set ");
    while(result.next()){
        try{
            int readCount = 0;
            oracle.sql.BLOB blob=((OracleResultSet)result).getBLOB("payload");
            InputStream in=blob.getBinaryStream();
            BufferedReader br = new BufferedReader(new java.io.InputStreamReader(in));  
            String total="";   
            String str;   
            while ((str = br.readLine()) != null) {   
               total += str;   
               //System.out.println(total);
            }   
            System.out.println(total);
            in.close();  

the line ResultSet result = preStatement.executeQuery(); takes approximately 1 minutes to execute. but, the conversion of Blob data to text string takes 4 minutes below is that part of the code

oracle.sql.BLOB blob=((OracleResultSet)result).getBLOB("payload");
                InputStream in=blob.getBinaryStream();
                BufferedReader br = new BufferedReader(new java.io.InputStreamReader(in));  
                String total="";   
                String str;   
                while ((str = br.readLine()) != null) {   
                   total += str;   
                   //System.out.println(total);
                }   

and then, a search for a string is done in 'total' based on that the result set columns are stored in hashmap.

can someone suggest how to make this processing faster ?

3 Answers

  1. Keith- Reply

    2019-11-14

    Can you move the search step into the SQL fragment that you pass to executeQuery, so that the database does the search instead of the Java code?

    Otherwise, the Blob interface supports more direct methods to get the data than using a stream, e.g. getBytes() can get a byte array that can be passed to a String constructor, or e.g. position(), which searches the blob for a pattern.

  2. Ken- Reply

    2019-11-14

    I see several possible enhancements, given the information you provided. Not all might apply though:

    1. Store structured data instead of blobs. An RDBMS is best used when searchable information is stored in normalized tables and columns. Searching inside of blobs is not what an RDBMS is made for.
    2. Use the full-text search capabilities of Oracle
    3. Use a real search engine, such as Lucene
    4. Use a StringBuilder instead of concatenating strings and thus generating tons of garbage.
    5. Search in the lines you just read, ans stop reading as soon as you found what you were looking for, instead of searching only once you have read the whole BLOB in memory
  3. Kenneth- Reply

    2019-11-14

    For the first part (the slow sql query) I can't provide much help since we do not know the query, the size of your blob and your database configuration. However if the blob is not too large, this is not a natural response time. You should really use either Oracle Text features or a third party search engine as Lucene. This usually makes a huge difference in search response times.

    For your second problem (reading the blob), I believe that you are going to see some improvement from the following:

    • Use a buffer size other than the default (8K) for the BufferedReader i.e. 64 or 128K. You can try which value provides better results in your case.
    • Use a StringBuilder to concatenate strings

    As an additional thing to watch out for, use the constructor of the InputStreamReader that takes a character set argument and set it to the same character set your database is configured for. The code you are currently using, has the potential to corrupt your data if they are UTF8

Leave a Reply

Your email address will not be published. Required fields are marked *

You can use these HTML tags and attributes <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>