DeepDiff: Next-Generation Binary Diffing for Precise Vulnerability and Patch Detection

We're excited to announce the launch of DeepDiff, Deepbits' groundbreaking solution for vulnerability detection and binary diffing. DeepDiff represents a major leap forward in security analysis. It helps security researchers, reverse engineers, and development teams to pinpoint vulnerable functions and generate precise diffing views across binary files with unmatched accuracy.

DeepDiff Demo:

The Challenge: Detecting Vulnerabilities and Patches in Complex Software Systems

In today's rapidly evolving software ecosystem, security vulnerabilities pose an ever-present risk to organizations. Security teams face two key challenges when analyzing large, complex software systems:

Identifying newly discovered vulnerabilities within massive binary files, especially when debug symbols are unavailable.
Confirming the presence of security patches in compiled binaries, a task often hindered by compiler optimizations and architectural variations.

These challenges become even more critical as software scales, because it becomes more likely that security patches applied at the source level do not make it into the final release.

Why This Matters: Security at Scale

The stakes couldn't be higher. Undetected vulnerabilities can lead to:

Costly data breaches affecting millions of users
Exploitation of critical infrastructure by threat actors
Compromised user privacy and security
Regulatory non-compliance, leading to fines and legal repercussions

The ability to rapidly identify vulnerable functions across binary files isn't just a nice-to-have capability—it's essential for maintaining robust security postures in modern organizations.

Limitations of Existing Binary Diffing Tools

Traditional binary diffing solutions, such as BinDiff ¹ and Diaphora ², suffer from key shortcomings:

Accuracy issues: Traditional binary diffing tools often rely on various heuristics to find matched functions, but they struggle with compiler optimizations and architectural differences, resulting in high false positive/negative rates.
Limited semantic understanding: Most tools focus on syntactic differences, generating noisy results that make patch detection difficult.
High expertise and manual effort required: Existing tools cannot independently confirm vulnerabilities or generate executive reports, making vulnerability analysis labor-intensive and reliant on expert knowledge.

DeepDiff overcomes these limitations with an entirely new approach to binary diffing.

DeepDiff: A Smarter, More Accurate Approach

DeepDiff tackles these challenges with innovative technology. It converts the decompiled code of each function in a binary file into embeddings, then searches the vulnerability database for similar function code. When a potentially vulnerable function is identified, DeepDiff leverages control flow and data flow analysis to detect logic-altering code changes while filtering out modifications introduced by compiler and architectural differences.
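
To make the embedding search step more concrete, here is a minimal sketch of a nearest-neighbor lookup by cosine similarity. It is not DeepDiff's implementation: the three-dimensional vectors, the 0.9 threshold, and the helper names cosine_similarity and find_candidates are all assumptions chosen for illustration, and a real system would use a learned embedding model and a proper vector index.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def find_candidates(target_embedding, vulnerability_db, threshold=0.9):
    """Return (name, score) pairs at or above the threshold, best match first."""
    scored = [
        (name, cosine_similarity(target_embedding, embedding))
        for name, embedding in vulnerability_db.items()
    ]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy embeddings; real embeddings have hundreds of dimensions.
vulnerability_db = {
    "SDP_AddAttribute": [0.9, 0.1, 0.4],
    "rw_i93_send_to_upper": [0.1, 0.8, 0.2],
}
target = [0.88, 0.12, 0.41]  # embedding of an unnamed function in the target binary

candidates = find_candidates(target, vulnerability_db)
for name, score in candidates:
    print(f"{name}: {score:.4f}")
```

Only the functions that pass the similarity threshold would move on to the more expensive control flow and data flow comparison.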

Key Capabilities

Accurate Vulnerability Function Detection – DeepDiff precisely identifies function matches, even in heavily optimized binaries.
Semantic Diffing Views – Unlike traditional tools that focus on raw assembly, DeepDiff generates high-level semantic diffing views, making it easier to verify patch applications.
Non-Code Change Resiliency – DeepDiff can be applied across architectures and optimization levels without its matching and diffing performance being significantly affected.
AI-Powered Analysis – Leveraging LLM-based reasoning, DeepDiff evaluates whether detected differences stem from actual security patches.
Comprehensive Report Generation – DeepDiff not only produces detailed diffing results but also generates an executive summary explaining code changes and whether they originate from the security patch. This allows non-experts to assess vulnerabilities easily, reducing labor costs and minimizing the need for specialized knowledge.

Real-World Application: Patch Verification in Samsung Galaxy Note10 Firmware

To demonstrate DeepDiff's capabilities, we analyzed Samsung Galaxy Note10 firmware versions SM-N970F_AUT_N970FXXS9HWG9 and SM-N970F_AUT_N970FXXS8HVJ1. Our goal was to confirm whether the Android Security Bulletin patches for the Bluetooth vulnerability CVE-2023-21273 ³ and the NFC vulnerability CVE-2023-21241 ⁴ were applied in the latest firmware (security patch 2023-08-01).

Overview of CVE-2023-21273

CVE-2023-21273 is a security vulnerability affecting Google's Android operating system in versions 11, 12, 12L, and 13. In SDP_AddAttribute of sdp_db.cc, there is a possible out-of-bounds write due to an incorrect bounds check. This could lead to remote (proximal/adjacent) code execution with no additional execution privileges needed. The security patch adds two checks for early termination.

diff --git a/system/stack/sdp/sdp_db.cc b/system/stack/sdp/sdp_db.cc
index 297b312..acef4a5 100644
--- a/system/stack/sdp/sdp_db.cc
+++ b/system/stack/sdp/sdp_db.cc
@@ -355,6 +355,11 @@
   uint16_t xx, yy, zz;
   tSDP_RECORD* p_rec = &sdp_cb.server_db.record[0];
 
+  if (p_val == nullptr) {
+    SDP_TRACE_WARNING("Trying to add attribute with p_val == nullptr, skipped");
+    return (false);
+  }
+
   if (sdp_cb.trace_level >= BT_TRACE_LEVEL_DEBUG) {
     if ((attr_type == UINT_DESC_TYPE) ||
         (attr_type == TWO_COMP_INT_DESC_TYPE) ||
@@ -402,6 +407,13 @@
     if (p_rec->record_handle == handle) {
       tSDP_ATTRIBUTE* p_attr = &p_rec->attribute[0];
 
+      // error out early, no need to look up
+      if (p_rec->free_pad_ptr >= SDP_MAX_PAD_LEN) {
+        SDP_TRACE_ERROR("the free pad for SDP record with handle %d is "
+                        "full, skip adding the attribute", handle);
+        return (false);
+      }
+
       /* Found the record. Now, see if the attribute already exists */
       for (xx = 0; xx < p_rec->num_attributes; xx++, p_attr++) {
         /* The attribute exists. replace it */
@@ -440,15 +452,13 @@
           attr_len = 0;
       }
 
-      if ((attr_len > 0) && (p_val != 0)) {
+      if (attr_len > 0) {
         p_attr->len = attr_len;
         memcpy(&p_rec->attr_pad[p_rec->free_pad_ptr], p_val, (size_t)attr_len);
         p_attr->value_ptr = &p_rec->attr_pad[p_rec->free_pad_ptr];
         p_rec->free_pad_ptr += attr_len;
-      } else if ((attr_len == 0 &&
-                  p_attr->len !=
-                      0) || /* if truncate to 0 length, simply don't add */
-                 p_val == 0) {
+      } else if (attr_len == 0 && p_attr->len != 0) {
+        /* if truncate to 0 length, simply don't add */
         SDP_TRACE_ERROR(
             "SDP_AddAttribute fail, length exceed maximum: ID %d: attr_len:%d ",
             attr_id, attr_len);
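
The effect of the two early-termination checks can be modeled in a few lines. The sketch below is a deliberately simplified model, not a port of SDP_AddAttribute: the SdpRecord class and add_attribute function are hypothetical, SDP_MAX_PAD_LEN is assumed to be 600 (the constant that appears in the decompiled output later in this post), and the length handling is reduced to a simple truncation.

```python
SDP_MAX_PAD_LEN = 600  # assumption: taken from the constant seen in the decompiled code


class SdpRecord:
    """Hypothetical stand-in for the record's attribute pad and write offset."""

    def __init__(self):
        self.attr_pad = bytearray(SDP_MAX_PAD_LEN)
        self.free_pad_ptr = 0


def add_attribute(rec, value):
    """Copy value into the pad; return False when a guard rejects the call."""
    # Guard 1 (first hunk of the patch): refuse a missing value up front.
    if value is None:
        return False
    # Guard 2 (second hunk): refuse to write once the pad is already full.
    if rec.free_pad_ptr >= SDP_MAX_PAD_LEN:
        return False
    # Simplification: truncate to the remaining room before copying.
    room = SDP_MAX_PAD_LEN - rec.free_pad_ptr
    data = value[:room]
    rec.attr_pad[rec.free_pad_ptr:rec.free_pad_ptr + len(data)] = data
    rec.free_pad_ptr += len(data)
    return True


rec = SdpRecord()
print(add_attribute(rec, None))          # rejected by guard 1
print(add_attribute(rec, b"A" * 600))    # fills the pad exactly
print(add_attribute(rec, b"B"))          # rejected by guard 2
```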

Overview of CVE-2023-21241

CVE-2023-21241 is a security vulnerability affecting Google's Android operating system in versions 11, 12, 12L, and 13. In rw_i93_send_to_upper of rw_i93.cc, there is a possible out-of-bounds write due to an integer overflow. This could lead to local escalation of privilege with no additional execution privileges needed. The security patch adds an if condition to detect the integer overflow.

diff --git a/src/nfc/tags/rw_i93.cc b/src/nfc/tags/rw_i93.cc
index 2b246e8..4056a02 100644
--- a/src/nfc/tags/rw_i93.cc
+++ b/src/nfc/tags/rw_i93.cc
@@ -540,6 +540,15 @@
     case I93_CMD_GET_MULTI_BLK_SEC:
     case I93_CMD_EXT_GET_MULTI_BLK_SEC:
 
+      if (UINT16_MAX - length < NFC_HDR_SIZE) {
+        rw_data.i93_cmd_cmpl.status = NFC_STATUS_FAILED;
+        rw_data.i93_cmd_cmpl.command = p_i93->sent_cmd;
+        rw_cb.tcb.i93.sent_cmd = 0;
+
+        event = RW_I93_CMD_CMPL_EVT;
+        break;
+      }
+
       /* forward tag data or security status */
       p_buff = (NFC_HDR*)GKI_getbuf((uint16_t)(length + NFC_HDR_SIZE));
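
The arithmetic behind this check is easy to reproduce. The sketch below mimics the 16-bit truncation of (uint16_t)(length + NFC_HDR_SIZE) in Python, where integers do not wrap on their own. NFC_HDR_SIZE is assumed to be 8 bytes, which matches the constant seen in the decompiled code later in this post, and the function names are hypothetical.

```python
UINT16_MAX = 0xFFFF
NFC_HDR_SIZE = 8  # assumption: inferred from the "< 8" test in the decompiled code


def alloc_size_unpatched(length):
    """Mimic (uint16_t)(length + NFC_HDR_SIZE): the sum is truncated to 16 bits."""
    return (length + NFC_HDR_SIZE) & UINT16_MAX


def would_overflow(length):
    """The patched guard: true when length + NFC_HDR_SIZE does not fit in 16 bits."""
    return UINT16_MAX - length < NFC_HDR_SIZE


length = 0xFFFC
print(alloc_size_unpatched(length))  # 4: far smaller than the data to be copied
print(would_overflow(length))        # True: the patched code bails out here
```

With a length of 0xFFFC, the unpatched code would request a 4-byte buffer and then forward far more data into it, which is the out-of-bounds write the guard prevents.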

DeepDiff's Approach

DeepDiff leverages our advanced function matching techniques to pinpoint the vulnerable functions in a binary file. It then analyzes data flow to filter out code changes and noise caused by compiler and architecture differences, so that it can focus on the changes that alter program logic. Using DeepDiff, detecting whether a patch is present is simple. The tool requires:

A JSON file with metadata about the vulnerability
A target binary for analysis

A sample JSON configuration:

{
    "patchedBinary": "/mnt/libnfc-sec.so_new",
    "address": "001dde38",
    "patch": "/mnt/nfc.patch"
}

where patchedBinary refers to a binary file that contains the patched function, address is the function address of the patched function, and patch is the path to the git diff of the security patch.
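
For readers who want to script around such a file, the following sketch loads and sanity-checks a configuration with the three keys described above. It is an assumption about how a caller might validate the file, not part of DeepDiff; parse_config and REQUIRED_KEYS are hypothetical names.

```python
import json

REQUIRED_KEYS = ("patchedBinary", "address", "patch")


def parse_config(text):
    """Parse the JSON text and convert the function address to an integer."""
    config = json.loads(text)
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise ValueError(f"missing keys: {', '.join(missing)}")
    # The sample address has no 0x prefix, so parse it explicitly as base 16.
    config["address"] = int(config["address"], 16)
    return config


sample = """
{
    "patchedBinary": "/mnt/libnfc-sec.so_new",
    "address": "001dde38",
    "patch": "/mnt/nfc.patch"
}
"""

config = parse_config(sample)
print(hex(config["address"]))
```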

DeepDiff then finds and generates:

The vulnerable function in the target binary
A diff view of the decompiled code between the two functions
An analysis of whether the changes in the decompiled code are caused by the security patch
A full report of the tested binaries, CVE numbers, our conclusion, and the supporting evidence

In this demo, we used copies of libnfc-sec.so and libbluetooth.so that are known to be free of the aforementioned vulnerabilities as the patched binaries, and tested them against the files extracted from the Samsung firmware.

Example 1: NFC CVE-2023-21241

DeepDiff Analysis

DeepDiff not only pinpoints the vulnerable function but also highlights the new if condition. Its analysis then reasons about whether the new condition matches the one in the security patch and finally gives a conclusion. The decompiled code, together with the analysis, is easy to read and understand, which makes testing for patch presence straightforward.

Time DeepDiff takes: 17 minutes.

+ local_180 = 0;
+ NStack383 = (NFC_HDR)0x0;
- pcVar5 = (char *)((ulong)local_198 | 1);
+ pcVar5 = (char *)((ulong)local_198 | 1);
+ if ((local_198[0] & 1) != 0) {
+ uVar7 = local_190;
+ if ((ushort)-sVar3 < 8) {
...
=======================
I looked closely at the two diffs. In the source diff a new branch is inserted under the I93_CMD_GET_MULTI_BLK_SEC (and EXT_GET_MULTI_BLK_SEC) case that checks whether there is enough “head‐room” (i.e. if UINT16_MAX – length is less than NFC_HDR_SIZE) before proceeding. If this condition is true the routine sets the error completion status (and resets sent_cmd) and breaks out early rather than continuing toward allocating and forwarding tag data.

In the decompiled diff we now see an extra conditional branch (the “if ((ushort)-sVar3 < 8)” test) along with a flurry of new variable assignments. Although the names are obfuscated, the comparison “< 8” very likely corresponds to testing against NFC_HDR_SIZE (which in the source seems to be 8 bytes). Also, the extra assignments (for example, setting local_180, adjusting fields in the rw_cb structure, and the differing constants like 0xab versus 0xa9) indicate that the error‐handling branch was added. In other words, the decompiled code now first checks a condition (derived from what is “UINT16_MAX - length < NFC_HDR_SIZE” in the source) and then takes an alternate branch—exactly as the source change does.

So, the differences seen in the decompiled code (the extra conditional check and the new assignments) very likely result from the source code additions.

Confidence Level: I'm about 80% confident that the decompiled diff's differences are directly caused by the source code change that adds the extra check and error branch.

BinDiff Result

BinDiff is a comparison tool for binary files that allows researchers and engineers to quickly find differences and similarities in assembly code. In this example, BinDiff identifies the vulnerable function, but determining from assembly code whether the differences result from the security patch is non-trivial. Specifically, BinDiff shows two added basic blocks and four changed basic blocks. The first basic block ANDs W9 with 0xFFFF and compares the result with 7. Users need enough expert knowledge to understand that the assembly code is checking whether W9 is larger than 7.

BinDiff takes about 10 seconds to produce the result.

BinDiff Diff View 1 - NFC

BinDiff Diff View 2 - NFC

Diaphora Result

Like BinDiff, Diaphora finds the vulnerable function, and it shows that six basic blocks have been updated. However, Diaphora's diff view is hard to read, so determining whether the new basic blocks were created by the security patch is also non-trivial.

Diaphora takes about 2 minutes to produce the result.

Diaphora Diff View 1 - NFC

Diaphora Diff View 2 - NFC

Example 2: Bluetooth CVE-2023-21273

DeepDiff Analysis

- if ((param_5 != (undefined8 *)0x0) && (param_4 != 0)) {
+ if (param_5 == (undefined8 *)0x0) {
+ if (1 < DAT_009bc5ac) {
+ FUN_005206f0(0xa0001,"Trying to add attribute with p_val == nullptr, skipped");
- if (*(char *)((long)param_5 + (ulong)(param_4 - 1)) != '\0') goto LAB_004f2a44;
+ if (*(char *)((long)param_5 + (ulong)(param_4 - 1)) != '\0') goto LAB_004f2dfc;
...
+ if (*puVar14 < 600) {
+ if (DAT_009bc5ac != 0) {
+ FUN_005206f0(0xa0000,
+ "the free pad for SDP record with handle %d is full, skip adding the attribute"
+ ,param_1);
=======================
Both diffs show changes that match each other. In the source diff one new branch immediately checks if the input value pointer (p_val) is null and logs a warning before exiting. In the decompiled diff you see the corresponding check "if (param_5 == (undefined8 *)0x0)" followed by a call that prints the warning "Trying to add attribute with p_val == nullptr, skipped" (depending on the runtime flag). Also, the source diff adds an early error for a full free pad (using SDP_MAX_PAD_LEN) and in the decompiled version you see the similar check (using what appears to be a numerical limit, e.g. 600) and an associated error message "the free pad for SDP record with handle %d is full, skip adding the attribute." 

These indicate that the additional branches and logging found in the decompiled diff are indeed a result of the modifications made in the source code diff.

Confidence Level: High.

In a separate analysis of the Bluetooth stack, DeepDiff successfully detected patch-related modifications. The newly introduced strings and conditional checks are strong indicators that the code changes are a result of the security patch. Decompiled code and the corresponding analysis make this clear and easy to understand.
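
Some lines in the decompiled diff above, such as the two goto statements, differ only in an address-derived label. DeepDiff filters this kind of noise with control flow and data flow analysis. The sketch below swaps in a much simpler technique, regex normalization followed by a textual diff with Python's difflib, purely to illustrate the idea. The normalize and semantic_diff helpers are hypothetical names, and the pattern only covers the LAB_, FUN_, and DAT_ names that appear in the output above.

```python
import difflib
import re

# Names the decompiler derives from addresses, e.g. LAB_004f2a44 or FUN_005206f0.
ADDRESS_NAME = re.compile(r"\b(LAB|FUN|DAT)_[0-9a-fA-F]+\b")


def normalize(line):
    """Replace address-derived names with a placeholder so relocations do not count."""
    return ADDRESS_NAME.sub(lambda match: match.group(1) + "_X", line.strip())


def semantic_diff(old_lines, new_lines):
    """Return only the added or removed lines that survive normalization."""
    old = [normalize(line) for line in old_lines]
    new = [normalize(line) for line in new_lines]
    return [
        entry
        for entry in difflib.ndiff(old, new)
        if entry.startswith(("+ ", "- "))
    ]


old_code = [
    "if (*(char *)((long)param_5 + (ulong)(param_4 - 1)) != '\\0') goto LAB_004f2a44;",
]
new_code = [
    "if (param_5 == (undefined8 *)0x0) {",
    "if (*(char *)((long)param_5 + (ulong)(param_4 - 1)) != '\\0') goto LAB_004f2dfc;",
]

changes = semantic_diff(old_code, new_code)
for entry in changes:
    print(entry)
```

After normalization, the relocated goto no longer shows up as a change, and only the newly added null check remains.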

BinDiff Result

BinDiff again identifies the vulnerable code, but its assembly-level diff view makes it difficult to determine whether the patch has been applied. There are too many change sites.

BinDiff Diff View 1 - Bluetooth

BinDiff Diff View 2 - Bluetooth

Diaphora Result

Like BinDiff, Diaphora highlights too many changes, many of them unrelated to the security patch. This makes it very challenging for average users to determine whether a patch is present.

Diaphora Diff View 1 - Bluetooth

Diaphora Diff View 2 - Bluetooth

Getting Started with DeepDiff

DeepDiff is available now for enterprise deployment. To experience the power of DeepDiff for yourself:

Request a demo at sales@deepbits.com
Contact our sales team to discuss licensing options tailored to your organization's needs

Conclusion: The Future of Binary Security Analysis

With DeepDiff, Deepbits is proud to offer a solution that addresses one of the most persistent challenges in software security. By enabling precise, efficient identification of vulnerable functions across binary files, we're empowering organizations to protect their assets, their customers, and their reputations in an increasingly complex threat landscape.

Stay tuned for more updates as we continue to enhance DeepDiff with additional capabilities based on customer feedback and evolving security requirements.
