Binary Analysis: Memory Dump Forensics
Analyze memory dumps and firmware images to identify repeated structures, strings, and allocated blocks for security research and malware analysis.
The Problem
Memory dumps and firmware images contain significant redundancy:
- Repeated allocations - Same strings or structs allocated many times
- Template structures - Identical object instances in memory
- String pools - Duplicate string constants
- Padding and alignment - Repeated fill bytes
- Large dump sizes - Multi-gigabyte memory images are expensive to analyze
Deduplication helps identify unique structures and reduces analysis overhead.
Input Data
memory-dump.bin
Binary memory dump (374 bytes) containing repeated memory structures:
- String allocations (4 identical instances)
- Struct data (3 identical instances)
- Buffer blocks (2 identical instances)
- Unique heap metadata (1 instance)
Blocks separated by padding bytes (0xFF).
Hex dump (first 20 lines):
00000000: 7573 6572 5f73 6573 7369 6f6e 5f41 4141 user_session_AAA
00000010: 4141 4141 4141 4141 4141 4141 4141 4141 AAAAAAAAAAAAAAAA
00000020: 4141 4100 0078 5634 12ff ffff ffff ffff AA..xV4.........
00000030: ffef bead de01 0000 0000 0100 0002 434f ..............CO
00000040: 4e46 4947 0000 ffff ffff ffff ffff ff75 NFIG...........u
...
Output Data
expected-memory-dedup.bin
Deduplicated memory dump (242 bytes):
- 1× String allocation (3 duplicates removed)
- 1× Struct data (2 duplicates removed)
- 1× Buffer block (1 duplicate removed)
- 1× Heap metadata (kept)
Result: 35% size reduction (374 → 242 bytes)
Solution
Options:
--byte-mode: Process binary memory data--delimiter-hex ff: Split on padding byte (0xFF)--window-size 1: Deduplicate individual memory blocks--quiet: Suppress statistics output
from uniqseq import UniqSeq
uniqseq = UniqSeq(
delimiter=b"\xff", # (1)!
window_size=1, # (2)!
)
with open("memory-dump.bin", "rb") as f:
with open("output.bin", "wb") as out:
data = f.read()
# Split on delimiter, keeping empty chunks (consecutive delimiters)
blocks = data.split(b'\xff')
# Process all but last block (last is after trailing delimiter)
for block in blocks[:-1]:
uniqseq.process_line(block, out)
# Process last block if non-empty
if blocks[-1]:
uniqseq.process_line(blocks[-1], out)
uniqseq.flush_to_stream(out)
- Use bytes delimiter for binary mode
- Deduplicate individual memory blocks
How It Works
Binary deduplication identifies identical memory structures:
Before (374 bytes with duplicates):
[String: "user_session_AAA..."] <-- Keep
[Padding: 0xFFFFFFFF...]
[Struct: {0xDEADBEEF, ...}] <-- Keep
[Padding: 0xFFFFFFFF...]
[String: "user_session_AAA..."] <-- Duplicate, remove
[Padding: 0xFFFFFFFF...]
[Buffer: "Buffer: AAA..."] <-- Keep
...
(more duplicates)
After (242 bytes, unique only):
[String: "user_session_AAA..."]
[Struct: {0xDEADBEEF, ...}]
[Buffer: "Buffer: AAA..."]
[Heap metadata: {0x7fff0000, ...}]
Real-World Workflows
Firmware Analysis
Extract unique strings and structures from firmware images:
# Analyze firmware image
uniqseq firmware.bin \
--byte-mode \
--delimiter-hex 00 \
--window-size 1 \
--quiet > unique-firmware-strings.bin
# Convert to readable strings
strings unique-firmware-strings.bin > firmware-strings.txt
# Analyze for hardcoded credentials, URLs, etc.
grep -E "(password|api_key|http)" firmware-strings.txt
Malware Memory Analysis
Identify unique malware artifacts in memory dumps:
# Extract process memory
volatility -f memory.dmp --profile=Win10x64 memdump -p 1234 -D .
# Deduplicate memory blocks
uniqseq 1234.dmp \
--byte-mode \
--delimiter-hex ff \
--window-size 1 \
--stats-format json \
> unique-blocks.bin \
2> analysis-stats.json
# Check redundancy
jq '.statistics.redundancy_pct' analysis-stats.json
Output: 67.5% (high redundancy indicates many repeated structures)
Heap Spray Detection
Detect heap spray attacks by finding repeated allocations:
# Analyze process heap
uniqseq heap-dump.bin \
--byte-mode \
--delimiter-hex 00 \
--annotate | \
grep "DUPLICATE" | \
wc -l
High duplicate count may indicate heap spray attack.
String Pool Analysis
Extract unique strings from memory:
# Dump memory, extract null-terminated strings
uniqseq memory.dmp \
--byte-mode \
--delimiter-hex 00 \
--quiet | \
strings -n 8 | \
head -20
Shows 20 longest unique strings from memory.
Struct Pattern Discovery
Find repeated data structures:
# Identify 16-byte aligned structures
uniqseq memory.bin \
--byte-mode \
--delimiter-hex "00000000" \
--window-size 1 \
--stats-format json \
--quiet 2>&1 | \
jq '.statistics'
Statistics reveal how many repeated structures exist.
Advanced Patterns
Multi-Block Sequences
Find repeated allocation patterns:
# Detect sequences of 3 consecutive blocks
uniqseq memory-dump.bin \
--byte-mode \
--delimiter-hex ff \
--window-size 3 \
--annotate | \
grep "DUPLICATE"
Identifies repeated multi-block patterns (e.g., object hierarchies).
Variable Data Normalization
Normalize pointers before comparison:
# Replace 8-byte pointers with placeholder
uniqseq memory.bin \
--byte-mode \
--delimiter-hex 00 \
--hash-transform 'sed "s/\x00\x00\x00\x00\x7f\x00\x00\x00/XXXXXXXX/g"' \
--quiet
Groups structures with different pointer values but identical layout.
Save Unique Blocks
Build library of unique memory structures:
# Extract unique blocks to library
uniqseq memory-dump.bin \
--byte-mode \
--delimiter-hex ff \
--library-dir ./unique-structures \
--quiet > /dev/null
# Each file in library is a unique block
ls -lh unique-structures/
Compare Memory Snapshots
Detect new allocations between snapshots:
# Baseline: Deduplicate snapshot 1
uniqseq snapshot1.bin --byte-mode --delimiter-hex ff \
--library-dir ./baseline --quiet > /dev/null
# Analysis: Find new blocks in snapshot 2
uniqseq snapshot2.bin --byte-mode --delimiter-hex ff \
--library-dir ./baseline --annotate | \
grep "NEW PATTERN"
Shows memory blocks allocated between snapshots.
Forensics Use Cases
Credential Extraction
Find repeated credential structures:
# Extract blocks matching credential patterns
uniqseq memory.dmp --byte-mode --delimiter-hex 00 --quiet | \
strings | \
grep -E "(password|token|key)" | \
uniq
Rootkit Detection
Identify injected code patterns:
# Compare process memory against known clean state
uniqseq suspicious-process.dmp \
--byte-mode --delimiter-hex ff \
--library-dir ./clean-baseline \
--inverse | \ # Show only known patterns
wc -l
Low match count indicates many unknown blocks (possible injection).
Memory Leak Analysis
Track repeated object allocations:
#!/bin/bash
# Analyze memory leaks
for snapshot in snapshot-*.bin; do
echo "=== $snapshot ==="
uniqseq "$snapshot" --byte-mode --delimiter-hex 00 \
--stats-format json --quiet 2>&1 | \
jq -r '.statistics |
"Unique blocks: \(.lines.emitted)
Duplicate blocks: \(.lines.skipped)
Redundancy: \(.redundancy_pct)%"'
done
Increasing duplicate count over time may indicate memory leak.
Binary Diff for Firmware
Compare firmware versions:
# Extract unique blocks from firmware v1
uniqseq firmware-v1.bin --byte-mode --delimiter-hex ff \
--library-dir ./v1-blocks --quiet > /dev/null
# Find differences in firmware v2
uniqseq firmware-v2.bin --byte-mode --delimiter-hex ff \
--library-dir ./v1-blocks --annotate | \
grep "NEW PATTERN" > firmware-changes.txt
# Analyze what changed
wc -l firmware-changes.txt
Performance Benefits
Reduced Analysis Time
# Before: Analyze full memory dump
$ time strings memory-full.dmp | wc -l
real 0m45.2s
12,450,000 strings
# After: Analyze deduplicated dump
$ time uniqseq memory-full.dmp --byte-mode --delimiter-hex 00 --quiet |
strings | wc -l
real 0m15.8s # 3× faster
4,150,000 strings (67% reduction)
Storage Savings
# Original dump
$ ls -lh process-dump.bin
2.4G
# Deduplicated
$ uniqseq process-dump.bin --byte-mode --delimiter-hex ff --quiet | wc -c
805306368 # 768 MB → 68% reduction
Common Delimiters
| Delimiter | Use Case | Example |
|---|---|---|
0x00 |
Null-terminated strings | C strings, paths |
0xFF |
Memory padding/alignment | Heap allocations |
0x00000000 |
4-byte aligned structures | 32-bit pointers |
0x0000000000000000 |
8-byte aligned structures | 64-bit pointers |
0xCC |
Debug fill pattern | MSVC debug heap |
0xCD |
Uninitialized memory | MSVC runtime |
Detecting the Right Delimiter
# Find most common byte value (likely delimiter)
xxd -p memory.bin | \
fold -w2 | \
sort | \
uniq -c | \
sort -rn | \
head -5
Integration Examples
Volatility Plugin
# Extract process memory with Volatility
volatility -f memory.dmp --profile=Win10x64 memdump -p $PID -D ./dumps
# Deduplicate for analysis
for dump in dumps/*.dmp; do
uniqseq "$dump" --byte-mode --delimiter-hex ff \
--quiet > "dedup/$(basename $dump)"
done
radare2 Analysis
# Load deduplicated binary in radare2
uniqseq firmware.bin --byte-mode --delimiter-hex 00 --quiet > firmware-dedup.bin
r2 -a arm -b 32 firmware-dedup.bin
# Analyze with /x, afl, pdf commands
Binary Ninja
# Python script for Binary Ninja
from uniqseq import UniqSeq
# Deduplicate before loading
uniqseq = UniqSeq(delimiter=b"\xff")
with open("large-firmware.bin", "rb") as f:
with open("firmware-dedup.bin", "wb") as out:
data = f.read()
for block in data.split(b'\xff'):
if block:
uniqseq.process_line(block, out)
uniqseq.process_line(b'\xff', out)
uniqseq.flush_to_stream(out)
# Load deduplicated firmware in Binary Ninja
bv = binaryninja.open_view("firmware-dedup.bin")
When to Use This
Good candidates: - ✅ Memory dumps with repeated allocations - ✅ Firmware images with string tables - ✅ Process heap analysis for malware - ✅ Rootkit detection (compare against baseline) - ✅ Memory leak investigation
Not recommended: - ❌ Encrypted memory regions - ❌ Compressed firmware images - ❌ Small dumps (<10 MB) - ❌ Heavily obfuscated malware
See Also
- Byte Mode - Binary data processing
- Custom Delimiters - Delimiter configuration
- Library Mode - Saving unique blocks
- Network Capture - Binary protocol analysis