Application Logs: Removing Repeated Stack Traces
Your application retries failed operations, logging the same error stack trace multiple times. Remove duplicate stack traces to make logs more readable while preserving unique errors.
Input Data
app.log
2024-11-25 10:15:01 INFO Starting request handler
2024-11-25 10:15:02 ERROR Database connection failed
2024-11-25 10:15:02 at db.connect (db.js:45)
2024-11-25 10:15:02 at app.init (app.js:12)
2024-11-25 10:15:03 INFO Retrying connection (attempt 1/3)
2024-11-25 10:15:04 ERROR Database connection failed
2024-11-25 10:15:04 at db.connect (db.js:45)
2024-11-25 10:15:04 at app.init (app.js:12)
2024-11-25 10:15:05 INFO Retrying connection (attempt 2/3)
2024-11-25 10:15:06 ERROR Database connection failed
2024-11-25 10:15:06 at db.connect (db.js:45)
2024-11-25 10:15:06 at app.init (app.js:12)
2024-11-25 10:15:07 WARN Max retries exceeded
2024-11-25 10:15:08 INFO Shutting down gracefully
Highlighted lines show three identical 3-line stack traces (ERROR + 2 stack frames) occurring during connection retries.
Output Data
output.log
2024-11-25 10:15:01 INFO Starting request handler
2024-11-25 10:15:02 ERROR Database connection failed
2024-11-25 10:15:02 at db.connect (db.js:45)
2024-11-25 10:15:02 at app.init (app.js:12)
2024-11-25 10:15:03 INFO Retrying connection (attempt 1/3)
2024-11-25 10:15:05 INFO Retrying connection (attempt 2/3)
2024-11-25 10:15:07 WARN Max retries exceeded
2024-11-25 10:15:08 INFO Shutting down gracefully
Result: 6 duplicate lines removed. First stack trace kept, retries collapsed to just their INFO messages.
Solution
Options:
--window-size 3: Match 3-line sequences (ERROR + 2 stack frames)--skip-chars 20: Ignore timestamp prefix when comparing
from uniqseq import UniqSeq
uniqseq = UniqSeq(
window_size=3, # (1)!
skip_chars=20, # (2)!
)
with open("app.log") as f:
with open("output.log", "w") as out:
for line in f:
uniqseq.process_line(line.rstrip("\n"), out)
uniqseq.flush_to_stream(out)
- Match 3-line sequences (error + stack trace)
- Skip first 20 chars (timestamp)
How It Works
Each error produces a 3-line sequence:
1. ERROR: Database connection failed
2. at db.connect (db.js:45)
3. at app.init (app.js:12)
This pattern repeats three times with different timestamps. We need:
--window-size 3: Detect that these 3-line stack traces are identical--skip-chars 20: Ignore the timestamp prefix2024-11-25 10:15:02when comparing
Visual Breakdown
Input (14 lines): Output (8 lines):
1. INFO Starting → 1. INFO Starting
2. ERROR Database failed → 2. ERROR Database failed
3. at db.connect → 3. at db.connect
4. at app.init → 4. at app.init
5. INFO Retrying (1/3) → 5. INFO Retrying (1/3)
6. ERROR Database failed ✗ (duplicate stack trace removed)
7. at db.connect ✗
8. at app.init ✗
9. INFO Retrying (2/3) → 6. INFO Retrying (2/3)
10. ERROR Database failed ✗ (duplicate stack trace removed)
11. at db.connect ✗
12. at app.init ✗
13. WARN Max retries → 7. WARN Max retries
14. INFO Shutting down → 8. INFO Shutting down
The cleaned log clearly shows: - One stack trace (the actual error) - Retry attempts progressing (1/3, 2/3) - Final outcome (max retries exceeded)
Benefits
Before deduplication: 14 lines with repetitive stack traces obscuring the flow
After deduplication: 8 lines with clear progression of events
This makes it easier to: - Understand the error sequence - Count retry attempts - Focus on unique issues - Reduce log storage
Real-World Usage
# Process application logs in real-time
tail -f /var/log/app.log | uniqseq --window-size 3 --skip-chars 20
# Clean up archived logs
for log in /var/log/app/*.log; do
uniqseq "$log" --window-size 3 --skip-chars 20 > "${log}.clean"
done
# Combine with pattern filtering to deduplicate only errors
uniqseq app.log --track "ERROR" --window-size 3 --skip-chars 20
See Also
- CI Build Logs - Similar multi-line deduplication
- Ignoring Prefixes - How skip-chars works
- Window Size - Choosing the right window