Extracting Hibernate SQL Queries from Log Files and Exporting Them to CSV: A Complete Beginner’s Guide

If you work with Java applications that use Hibernate, you’ve probably encountered large log files containing hundreds or thousands of SQL statements. Analyzing these queries manually can be time-consuming, especially when running automated test cases where each test generates its own logs.

In this guide, you’ll learn how to automatically extract Hibernate SQL queries from multiple log files and export them into a CSV file for analysis, reporting, troubleshooting, or performance testing.


What Problem Are We Solving?

Imagine you have a folder structure like this:

TestCase1/
    cdc-server.log

TestCase2/
    cdc-server.log

TestCase3/
    cdc-server.log

...

TestCase16/
    cdc-server.log

Each cdc-server.log contains Hibernate SQL statements such as:

2024-04-26 10:21:25.856 [main] DEBUG org.hibernate.SQL RequestId= -

    select
        consentdat0_.id as id1_16_,
        consentdat0_.delete_status as delete_s4_16_
    from
        consent_data consentdat0_
    where
        consentdat0_.delete_status=?

The SQL query may span multiple lines and can be followed by another SQL statement or a completely different log entry:

2024-04-26 10:21:25.876 [main] INFO c.p.cdc.OpenBankingApplication RequestId= -
FailedTokenFallback.csv already exists

The goal is to generate a CSV file like this:

Test CaseTimestampSQL Query
TestCase12024-04-26 10:21:25.856SELECT …
TestCase12024-04-26 10:21:30.112INSERT …
TestCase22024-04-26 10:22:15.443UPDATE …

Why Extract Hibernate Queries?

There are many reasons to extract SQL queries from logs:

Performance Analysis

Identify slow-running queries.

Database Optimization

Review generated SQL and improve indexing.

Test Automation

Compare queries executed by different test cases.

Auditing

Maintain a record of database operations.

Troubleshooting

Detect unexpected table access or missing joins.


Understanding Hibernate SQL Logs

Hibernate logs SQL statements using:

DEBUG org.hibernate.SQL

A typical query begins with:

2024-04-26 10:21:03.473 [main] DEBUG org.hibernate.SQL RequestId= -

and continues until:

  • another log entry starts
  • another Hibernate SQL statement starts
  • the file ends

Example:

create table achq_payments (
   transaction_id varchar(255) not null,
   achq_token varchar(255),
   primary key (transaction_id)
)

Notice that the SQL spans multiple lines.


Challenges When Parsing Hibernate Logs

Challenge 1: Multi-Line Queries

Unlike simple log messages, SQL statements often span dozens of lines.

Example:

select
    id,
    name,
    email
from
    customer
where
    status=?

A parser must collect all lines belonging to the same query.


Challenge 2: Consecutive SQL Statements

Example:

2024-04-26 10:21:03.473 DEBUG org.hibernate.SQL ...

create table table1 ...

2024-04-26 10:21:03.519 DEBUG org.hibernate.SQL ...

create table table2 ...

The parser must recognize where one query ends and the next begins.


Challenge 3: Mixed Log Levels

Example:

DEBUG org.hibernate.SQL ...

followed by:

INFO OpenBankingApplication ...

The parser must stop collecting SQL when another log entry appears.


Python Solution

The following Python script processes all test case folders and extracts Hibernate SQL statements into a CSV file.

import os
import re
import csv

BASE_PATH = r"C:\Logs"

sql_start_pattern = re.compile(
    r'^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}).*DEBUG org\.hibernate\.SQL'
)

log_line_pattern = re.compile(
    r'^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}'
)

with open('hibernate_queries.csv', 'w', newline='', encoding='utf-8') as csvfile:

    writer = csv.writer(csvfile)
    writer.writerow(['Test Case', 'Timestamp', 'SQL Query'])

    for test_case in os.listdir(BASE_PATH):

        folder_path = os.path.join(BASE_PATH, test_case)

        if not os.path.isdir(folder_path):
            continue

        log_file = os.path.join(folder_path, 'cdc-server.log')

        if not os.path.exists(log_file):
            continue

        with open(log_file, 'r', encoding='utf-8', errors='ignore') as f:

            current_timestamp = None
            current_query = []
            collecting = False

            for line in f:

                sql_match = sql_start_pattern.match(line)

                if sql_match:

                    if current_query:
                        writer.writerow([
                            test_case,
                            current_timestamp,
                            ' '.join(current_query)
                        ])

                    current_timestamp = sql_match.group(1)
                    current_query = []
                    collecting = True
                    continue

                if collecting:

                    if log_line_pattern.match(line):

                        writer.writerow([
                            test_case,
                            current_timestamp,
                            ' '.join(current_query)
                        ])

                        current_query = []
                        collecting = False

                        if sql_start_pattern.match(line):
                            current_timestamp = sql_start_pattern.match(line).group(1)
                            collecting = True

                    else:
                        current_query.append(line.strip())

            if current_query:
                writer.writerow([
                    test_case,
                    current_timestamp,
                    ' '.join(current_query)
                ])

print("Done.")

Sample Output

The generated CSV may look like:

Test Case,Timestamp,SQL Query
TestCase1,2024-04-26 10:21:03.473,"create table achq_payments (...)"
TestCase1,2024-04-26 10:21:03.519,"create table finicity_bank (...)"
TestCase1,2024-04-26 10:21:03.559,"create table finicity_bank_address (...)"
TestCase2,2024-04-26 10:21:25.856,"select consentdat0_.id ... order by consentdat0_.decision_limit_time asc"

Enhancements You Can Add

Export to Excel

Instead of CSV, use:

import pandas as pd

df = pd.read_csv("hibernate_queries.csv")
df.to_excel("hibernate_queries.xlsx")

Count Queries per Test Case

SELECT COUNT(*)
FROM extracted_queries
GROUP BY test_case

Or in Python:

from collections import Counter

counter = Counter()

counter[test_case] += 1

Detect Slow Queries

If Hibernate logs execution times, you can extract:

Query executed in 523 ms

and identify performance bottlenecks.


Compare Queries Between Test Runs

Store results from multiple executions:

Run1.csv
Run2.csv
Run3.csv

Then compare:

  • Missing queries
  • Additional queries
  • Different SQL generation

This is useful during application upgrades and regression testing.


Best Practices

Enable SQL Logging Only When Needed

Hibernate SQL logging can generate huge files.

Example:

logging.level.org.hibernate.SQL=DEBUG

Use it primarily in:

  • Development
  • Testing
  • Troubleshooting

Archive Log Files

Large test suites can produce gigabytes of logs.

Compress old logs:

zip testlogs.zip *.log

Store Extracted Queries Separately

Keep:

Raw Logs
Extracted Queries
Performance Reports

in separate folders for easier analysis.


Conclusion

Hibernate SQL logs contain valuable information about the database operations performed by your application. When executing multiple automated test cases, manually reviewing these logs becomes impractical.

By using a simple Python parser, you can automatically:

  • Scan hundreds of log files
  • Detect Hibernate SQL statements
  • Handle multi-line queries
  • Identify query boundaries
  • Export results to CSV or Excel
  • Simplify debugging and performance analysis

This approach is particularly useful for QA teams, database administrators, performance engineers, and Java developers working with Hibernate-based applications. Proper SQL extraction can dramatically reduce investigation time and provide deeper insight into application behavior during testing and production troubleshooting.

This article is inspired by real-world challenges we tackle in our projects. If you're looking for expert solutions or need a team to bring your idea to life,

Let's talk!

    Please fill your details, and we will contact you back

      Please fill your details, and we will contact you back