
Advances in SonarQube's Bug Detection


Denis Troller

Product Manager

7 min read

  • Announcement
  • SonarQube Server
  • SonarQube

At Sonar, we pride ourselves on bringing the best analysis to our users. To us, this means SonarQube accurately finds difficult-to-spot bugs before they become a problem. This is true whether we are talking about issues impacting the Maintainability, Reliability, or Security of your software.


Today I want to share the advances we are making in our bug detection technology, which finds the issues that impact reliability.


What are Reliability-impacting issues?

Reliability issues are what any developer would generally classify as “bugs”. This class of issues requires immediate attention, in the same way security issues warrant quick resolution. If they are left in code that makes it to production, your end users feel the impact directly and become frustrated with the poor behavior of your software. They can lead to a cascade of problems for you:

  • Your software crashes
  • Your software misbehaves: the flow of your code deviates from what was intended, which means your business logic produces wrong results
  • You have to divert developers from their current focus to understand, debug, and fix the problem
  • Your team’s velocity in delivering new features is reduced because of increased work to resolve issues
  • You might have to deal with SLA penalties with your customers
  • The general public perception of your software and company is negatively impacted


Any of these outcomes is bad enough on its own, but, taken together, they mean any such issue should be prevented from making it into production at all costs. The nature of these issues is also that, very often, they manifest only under a specific set of circumstances, lurking in your software and appearing weeks, maybe months, after release. As a result, diagnosing and resolving them can take even longer, since by then your team has probably moved on to work on other features or functionality.


To give you an idea of the impact of bugs on your dev team, industry data consistently indicates that developers spend between 30% and 50% of their working hours identifying and resolving software defects (Stripe, The Developer Coefficient, 2018). Some of this time is accounted for during the development phase, in unit testing, but a good chunk of it comes from discovering bugs in production, because issues found later in the development cycle take developers longer to resolve.


Because of this “high priority” status, it is important to make sure the issues being raised are relevant. The problem is similar to security issues: because findings need to be dealt with as soon as they are detected, and because they often require more time to understand and fix, false positives must be kept to a minimum so as not to overload developers with bogus findings.


But wait, we test our code!

Of course, the first line of defense against introducing bugs is to test your software. But the reality is, no amount of testing will catch everything. Some bugs simply resist detection by tests, and the sheer complexity of modern software makes exhaustively testing all code paths with all potential inputs impossible.


Moving from measuring test coverage at the line level to branch and condition coverage helps, but it is simply not feasible to reach a level that would catch everything. Developers would spend more time writing tests than features, voiding any productivity gains from not having to debug code.
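To make this concrete, here is a small, hypothetical Java example (the class and method names are invented for illustration, not taken from any real codebase). A single test executes every line, so line coverage reports 100%, yet the one input that triggers the bug is never exercised.

// Hypothetical illustration: a single test reaches 100% line coverage,
// yet the failure path is never exercised.
public class CoverageGap {

    // Returns the discount for a customer tier (0 = none, 1 = silver, 2 = gold).
    static int discountPercent(int[] tierDiscounts, int tier) {
        // Bug: no bounds check. A tier value of 3 (say, a new "platinum" tier
        // introduced upstream) throws ArrayIndexOutOfBoundsException at runtime.
        return tierDiscounts[tier];
    }

    public static void main(String[] args) {
        int[] discounts = {0, 5, 10};
        // This "test" executes every line of discountPercent, so line coverage
        // reports 100%, yet the dangerous input (tier >= discounts.length)
        // is never tried, and the bug ships.
        System.out.println(discountPercent(discounts, 2)); // prints 10
    }
}

Branch or condition coverage would not flag anything here either, because the missing bounds check is precisely the code that was never written.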


Without a doubt, testing is a critical aspect of developing high quality code, but it is simply not enough on its own.


What about AI generated code?

AI-generated code is a hot topic these days, and for good reason. The promises it makes in terms of productivity gains, and the incredible speed of progress in the field, are too tempting to ignore. However, AI models are trained on human-written code, and there is no reason today to believe AI-generated code will be less prone to these kinds of issues.


In fact, it is probable that using AI agents to write code will lead to worse outcomes in terms of time spent debugging. The time spent by a developer debugging an issue increases when the code has been written by another person. When the developer is not familiar with the specific code being debugged, it takes more time to understand it, and thus to debug it. It also increases the likelihood that making a change will cause other undetected problems. This stems from the fact that debugging code is about formulating hypotheses and testing them. With less knowledge about the code, the developer has to first formulate broader hypotheses before homing in on the actual problem, which takes more time (Arab, Liang, Hong, LaToza, How developers choose debugging strategies for challenging web application defects, 2025).


It stands to reason that AI agents will generate the same proportion of bugs as humans. We expect AI (through code assistants or agents) to dramatically increase the speed at which code is created, resulting in rapidly growing codebases that contain even more bugs. All of these factors will act as multipliers on the time lost to debugging, because no developer on the team will be familiar with the code. This means more cost sunk into debugging, and less time spent developing new features.


Still, because of the initial productivity gains, nobody is ready to pass on this opportunity. The best solution is to equip the agent with the safety net of a tool that finds those bugs.


Advances in Java and Python bug detection

Because we know these issues can have a serious impact on the stability of applications, we chose to develop advanced bug detection engines for the most popular languages. One of these engines targets Java and Python. It is cross-procedural, meaning it follows data across method boundaries, so it can find more complex bugs than standard approaches.


An engine that only looks at methods in isolation will still find potential bugs. However, it will either be perceived as noisy, raising too many false positives because, by design, your codebase might never use the method in a dangerous way, or it will be too cautious and stay silent for fear of raising those false positives. The best way to systematically reduce false positives is to know more about the context and look at how the code actually unfolds in its entirety. This is what our engine does, as the sketch below illustrates.
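Here is a rough, hypothetical Java sketch (the class, methods, and scenario are invented for illustration, not taken from our analyzers or benchmarks) of a bug that can only be judged with that kind of context: the division is only dangerous because of a value that originates in a different method.

// Hypothetical sketch: the bug only exists on a path that crosses method boundaries.
public class AverageCalculator {

    // Looked at in isolation, this method is neither safe nor unsafe:
    // dividing by a parameter is only a problem if some caller can pass zero.
    static int average(int total, int count) {
        return total / count; // the kind of division by zero that rule S6649 targets
    }

    static int loadCompletedOrders() {
        // Assumption for the example: this can legitimately return 0
        // (for instance, no orders completed yet today).
        return 0;
    }

    public static void main(String[] args) {
        int totalValue = 420;
        int orders = loadCompletedOrders();
        // A method-local analysis must either flag every division by a parameter
        // (noisy) or stay silent (missing this case). A cross-procedural engine can
        // follow the zero from loadCompletedOrders() into average() and raise one
        // well-founded issue on this exact path.
        System.out.println(average(totalValue, orders)); // throws ArithmeticException: / by zero
    }
}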


It’s an engine because such analysis goes beyond simple rules: it traverses the code to figure out what could happen if your app were to execute in a myriad of different ways, with the objective of detecting which paths lead to an issue. As a result of this deeper analysis, we achieve a much higher true positive rate and a very low false positive rate.


This engine, named the Dataflow Bug Detection (DBD) Engine, is already being used in combination with our “historical” non-cross-procedural engine for detecting issues in Java and Python code. We always planned on completely replacing our historical engine with the DBD Engine. We have been taking steps in that direction for some time now. In order to completely switch over, we needed to ensure the new engine’s detection capability reached a high level of quality, which we have been working very hard to achieve.


Our goal this past year was to make this engine better for both our Python and Java users, and we have done so. Now, we are confident that it’s time to make the switch.


Enough, give me some numbers!

We test our modifications extensively to assess the impact of every change we make to the engine.


For this first release, we chose to focus on the rules that raise the most prevalent types of issues. Rest assured, the other rules will be ported to this new engine in due time (javabugs:S6320, javabugs:S6417, javabugs:S6322, pythonbugs:S6464, pythonbugs:S6465, pythonbugs:S6417, pythonbugs:S6899, pythonbugs:S5633, and pythonbugs:S6886 will be ported in subsequent releases).


Here’s a summary of the gains we’ve made on some of our rules for Python and for Java, based on our internal benchmarks. In the table below, false positives are the issues we should not raise; reducing that number is very important for keeping developers productive, since false positives divert their attention from the actual problems (true positives). For these rules, we have made real progress both in how many real issues we find (an increased true positive rate) and in being less noisy (a decreased false positive rate).




New Engine results:

Language | Rule                         | True Positives (increased rate) | False Positives (decreased rate)
Python   | S6466 (Out of bounds access) | 1.35                            | 1.3
Java     | S6466 (Out of bounds access) |                                 |
Java     | S6555 (Null dereference)     | 1.2                             | 113
Java     | S6649 (Division by zero)     | 14                              |
The rate of increase of True Positives and decrease of False Positives by migrating each issue to the advanced Dataflow Bug Detection engine.
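To make the rule names in the table a little more concrete, here is one more hypothetical Java sketch (again with invented names, not code from our benchmarks) of the kind of null dereference that rule S6555 targets, where the null value and the dereference live in different methods.

// Hypothetical sketch of a null dereference spread across two methods.
import java.util.HashMap;
import java.util.Map;

public class UserGreeter {

    private static final Map<String, String> DISPLAY_NAMES = new HashMap<>();

    // Returns null for unknown users; nothing in the signature says so.
    static String displayName(String userId) {
        return DISPLAY_NAMES.get(userId);
    }

    static String greeting(String userId) {
        String name = displayName(userId);
        // Bug: on the "unknown user" path, name is null and the call below throws
        // a NullPointerException. A dataflow engine can connect the null returned
        // by displayName() to this dereference and report the exact path.
        return "Hello, " + name.toUpperCase() + "!";
    }

    public static void main(String[] args) {
        System.out.println(greeting("someone-we-have-never-seen"));
    }
}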


Generally speaking, there will be a bit of adjustment. Some new false positives will be raised, of course, because no engine is perfect. But the large majority of what we find will be genuine issues that need to be tackled. This has a real impact on developers, day after day: they will be more confident that what SonarQube finds matters, waste less time looking at irrelevant findings, and spend their time fixing actual problems that could otherwise end up costing a lot.


Just as missing important security issues is worse than tolerating a few false positives, we know this trade-off will be worth it. Keep in mind that, in this day and age, any bug has the potential to become a security liability.


What’s next?

To reduce the sudden impact of these rule changes and the difference in findings, our plan is to transition the rules gradually. We will be retiring the historical Java rules, starting with S2259, in favor of new ones based on this new engine. You can follow the corresponding announcements on our Community.


This is the beginning of the journey, but we are very excited for these changes!


This new engine version is available today on SonarQube Cloud for all plans, and will be available in all the editions of SonarQube Server 2025 Release 4 this summer. 

