Cutting Splunk log storage without rewriting a single query
Cutting Splunk volume usually costs information: drop noisy lines, sample, or strip fields. There is a lossless way. One log.info("User {} logged in from {}", userId, ip) call site, hit a million times, emits a million events differing in two values. Store each as a hash of the shared template plus those values, and every event survives at a fraction of the bytes. The catch: every dashboard breaks. We built 10x for Splunk so it does not, in twelve lines of jQuery.
The approach rests on one property: the template hash must be stable. Templates inferred from a sliding window of live data drift as the window moves, which makes them a poor storage key. We derive the vocabulary from the format strings and class names in the application's repos, so a compile pass yields a deterministic library and the hash never moves. Where that library comes from is its own post; here I take the hash as stable across queries, deploys, and nodes.
Two facts up front. The reduction is a modeled 50 to 80% of indexed volume, lossless, specific to destinations that store the encoded form; and the Splunk app in this post is Apache 2.0 and inspectable end to end, while the Receiver that produces the encoded events is the commercial piece.
Now, search time. A panel that yesterday showed 2026-04-14 10:23:01 INFO admin logged in from 192.168.1.1 now shows ~x7Kp2m,1776162181000,admin,192.168.1.1. Taken at face value, every SPL query needs editing. Our requirement was the opposite: expand events back without touching a single saved search, dashboard, or alert.
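To make the encoded shape concrete, here is a minimal sketch that builds the line from the example above and compares its length with the raw event. The function name `encode_event` is hypothetical, and the real encoder lives in the Receiver. I am also assuming values never contain a comma, since escaping is not covered here. The length comparison describes this one event only and says nothing about the modeled reduction.

```python
from typing import List


def encode_event(template_hash: str, epoch_ms: int, values: List[str]) -> str:
    """Join hash, timestamp, and variable values into one compact line.

    Hypothetical helper: it only mirrors the line shape shown in the post.
    Assumption: no value contains a comma (escaping rules are not described).
    """
    return "~" + ",".join([template_hash, str(epoch_ms), *values])


raw = "2026-04-14 10:23:01 INFO admin logged in from 192.168.1.1"
encoded = encode_event("x7Kp2m", 1776162181000, ["admin", "192.168.1.1"])

print(encoded)
print(len(raw), "characters raw,", len(encoded), "characters encoded")
```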
Three places to intercept; we picked the hook
A custom search command is Splunk's official extension: add | tenxsearch to every query. Easy to build, but every saved search, panel, and alert needs editing: a deployment wall.
Expanding at index time sidesteps that. The forwarder inflates events before indexing, so everything downstream works. But it pays for the full raw stream again: Splunk meters on indexed volume, the very bill compaction was meant to remove.
Hooking the browser's search submission holds up. We point the URL the dashboard POSTs to at our REST endpoint, which rewrites the SPL, submits it, and returns a standard sid.
The interception is twelve lines
Splunk auto-loads dashboard.js from an app's appserver/static/ directory. Ours is 37 lines: it waits for the page, then calls TenxSearchHook.execute(true). The hook lives in tenx_search_hook.js, a 49-line module; its twelve interception lines:
// tenx-for-splunk/appserver/static/javascript/search/tenx_search_hook.js
$.ajaxSetup({
  beforeSend: function (xhr, settings) {
    if ((settings.type == "POST") &&
        (settings.url.endsWith("/search/jobs"))) {
      var baseUrl = settings.url.substring(
        0, settings.url.length - "/search/jobs".length);
      settings.url = baseUrl + "/tenx-search";
    }
  }
});
The response keeps its sid field, and the dispatchState polling loop still works.
The remaining lines handle a startup race: some dashboards fire searches before the hook loads. The execute(true) call walks the search managers Splunk already instantiated, via mvc.Components.getInstances() plus startSearch({refresh: true}), and re-issues those through the intercepted path. One $.ajax chokepoint in the classic framework makes this work.
What /tenx-search does with the SPL
The REST endpoint is registered in restmap.conf as a persistent Python 3 handler. It hands the search to TenxSearchBuilder, which resolves it in four steps.
Parse the SPL. The builder calls Splunk's /services/search/parser endpoint with parse_only=true, then modifies only the leading search command. Everything after the first pipe passes through.
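The "only the leading command changes" rule can be illustrated with a small scanner that finds the first top-level pipe. This is a stand-in for illustration, not the builder's code: the real handler relies on Splunk's parser endpoint, and `split_leading_command` is a hypothetical name. The sketch assumes double-quoted strings and square-bracket subsearches are the only constructs that can hide a pipe.

```python
from typing import Tuple


def split_leading_command(spl: str) -> Tuple[str, str]:
    """Return (leading command, remainder starting at the first top-level pipe)."""
    depth = 0          # nesting depth of [ ... ] subsearches
    in_quotes = False  # inside a double-quoted string
    i = 0
    while i < len(spl):
        ch = spl[i]
        if in_quotes and ch == "\\":
            i += 2     # skip the escaped character inside a quoted string
            continue
        if ch == '"':
            in_quotes = not in_quotes
        elif not in_quotes:
            if ch == "[":
                depth += 1
            elif ch == "]":
                depth -= 1
            elif ch == "|" and depth == 0:
                return spl[:i].rstrip(), spl[i:]
        i += 1
    return spl.strip(), ""  # no pipe: the whole query is the leading command


head, tail = split_leading_command(
    'search index=main "a|b" [search x | head 1] | stats count by host'
)
print(head)
print(tail)
```

Only `head` would be rewritten; `tail` is appended back untouched.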
Check for encoded data. It inspects props.conf for the REPORT-tenx = tenx-hash-vars-extraction extraction that splits the encoded line into hash and variable slots. Sourcetypes lacking it use native search.
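A rough sketch of that check, run against props.conf text rather than a live Splunk instance. How the handler actually reads the configuration is not shown in this post, so treat the `configparser` approach and the function name `has_tenx_extraction` as assumptions. Real props.conf files can contain constructs this parser does not handle.

```python
import configparser

TENX_TRANSFORM = "tenx-hash-vars-extraction"


def has_tenx_extraction(props_text: str, sourcetype: str) -> bool:
    """True if the sourcetype stanza wires REPORT-tenx to the tenx transform."""
    parser = configparser.ConfigParser(
        interpolation=None,   # props.conf values often contain '%'
        strict=False,         # tolerate repeated stanzas or keys
        delimiters=("=",),
    )
    parser.optionxform = str  # keep key case: REPORT-tenx, not report-tenx
    parser.read_string(props_text)
    if not parser.has_section(sourcetype):
        return False
    value = parser.get(sourcetype, "REPORT-tenx", fallback="")
    # A REPORT- setting may list several transforms separated by commas.
    return TENX_TRANSFORM in [name.strip() for name in value.split(",")]


sample_props = """
[tenx_encoded]
REPORT-tenx = tenx-hash-vars-extraction

[access_combined]
TIME_FORMAT = %d/%b/%Y:%H:%M:%S
"""

print(has_tenx_extraction(sample_props, "tenx_encoded"))
print(has_tenx_extraction(sample_props, "access_combined"))
```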
Find the matching templates. The search words are joined with OR and run against tenx_dml_pure; matches yield their hashes.
Build the combined search. In admin logged in, admin is data in the encoded event; logged in is template text, reachable only by hash. One clause catches each:
Original:
search index=main sourcetype=tenx_encoded admin logged in
Resolved:
search index=main sourcetype=tenx_encoded
((admin OR logged OR in) OR (tenx_hash IN (x7Kp2m,r9Qw3n)))
| `tenx-inflate`
| extract
| search admin logged in
Here tenx_hash is the template hash under its on-the-wire name. The OR pulls in events matching either place; the trailing | search drops anything whose decoded _raw lacks the original terms.
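The string assembly behind the resolved query can be sketched in a few lines. `build_resolved_search` is a hypothetical name, and two details are my assumptions rather than documented behavior: the user's remaining pipeline is appended after the final filter, and an empty hash list falls back to the term clause alone.

```python
from typing import List


def build_resolved_search(scope: str, terms: List[str],
                          hashes: List[str], tail: str = "") -> str:
    """Assemble the combined search shown above from its pieces."""
    term_clause = "(" + " OR ".join(terms) + ")"
    if hashes:
        hash_clause = "(tenx_hash IN (" + ",".join(hashes) + "))"
        match_clause = "(" + term_clause + " OR " + hash_clause + ")"
    else:
        match_clause = term_clause  # assumption: no matching templates found
    stages = [
        "search " + scope + " " + match_clause,
        "| `tenx-inflate`",                  # decode matched events
        "| extract",                         # replay the user's extractions
        "| search " + " ".join(terms),       # drop false positives
    ]
    if tail:
        stages.append(tail)                  # assumption: rest of the pipeline goes last
    return "\n".join(stages)


resolved = build_resolved_search(
    scope="index=main sourcetype=tenx_encoded",
    terms=["admin", "logged", "in"],
    hashes=["x7Kp2m", "r9Qw3n"],
)
print(resolved)
```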
On a shape it cannot model, the parser flips to COMPLEX, one of four states alongside SUCCESS, FAILURE, and PENDING, and forwards the query to Splunk's native endpoint unchanged. I would rather return raw encoded results than silently break a query. Subsearches resolve recursively, and any failed step returns the original untouched.
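The fallback rule reduces to a small decision: use the rewritten query only on a clean success, otherwise return the original. A minimal sketch follows, with the four state names taken from the paragraph above. The `resolve_or_passthrough` wrapper and the resolver signature are hypothetical.

```python
from enum import Enum
from typing import Callable, Tuple


class ParseState(Enum):
    PENDING = "PENDING"
    SUCCESS = "SUCCESS"
    FAILURE = "FAILURE"
    COMPLEX = "COMPLEX"


Resolver = Callable[[str], Tuple[ParseState, str]]


def resolve_or_passthrough(spl: str, resolver: Resolver) -> str:
    """Return the rewritten SPL only when resolution fully succeeded."""
    try:
        state, rewritten = resolver(spl)
    except Exception:
        return spl  # any failed step returns the original untouched
    return rewritten if state is ParseState.SUCCESS else spl


def ok(q: str) -> Tuple[ParseState, str]:
    return ParseState.SUCCESS, q + " | `tenx-inflate`"


def too_complex(q: str) -> Tuple[ParseState, str]:
    return ParseState.COMPLEX, ""


def broken(q: str) -> Tuple[ParseState, str]:
    raise RuntimeError("lookup failed")


print(resolve_or_passthrough("search admin", ok))
print(resolve_or_passthrough("search admin", too_complex))
print(resolve_or_passthrough("search admin", broken))
```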
The inflate macro reassembles events in pure SPL
tenx-inflate reconstructs events in pure SPL, with no Python. A regex in transforms.conf splits the event into hash, first variable, and the rest:
REGEX = ^~?(?<tenx_hash>[^,]+),(?<tenx_var_0>[^,]+)(?:,(?<tenx_vars>.*))?
The leading ~ marks an event as encoded. When the template has a timestamp, the first slot carries it, so the macro treats tenx_var_0 as the epoch.
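The extraction regex can be exercised outside Splunk. The sketch below ports it to Python syntax, where the only change is `(?P<name>...)` for named groups, and then applies the comma split that makemv performs in the macro. The helper name `split_encoded` is hypothetical.

```python
import re
from typing import Dict, List, Optional, Union

# Same pattern as the transforms.conf REGEX, with Python-style named groups.
TENX_LINE = re.compile(
    r"^~?(?P<tenx_hash>[^,]+),(?P<tenx_var_0>[^,]+)(?:,(?P<tenx_vars>.*))?"
)


def split_encoded(line: str) -> Optional[Dict[str, Union[str, List[str]]]]:
    """Split an encoded line into hash, first variable, and remaining variables."""
    match = TENX_LINE.match(line)
    if match is None:
        return None
    rest = match.group("tenx_vars")
    return {
        "tenx_hash": match.group("tenx_hash"),
        "tenx_var_0": match.group("tenx_var_0"),
        # Equivalent of: makemv delim="," tenx_vars
        "tenx_vars": rest.split(",") if rest else [],
    }


fields = split_encoded("~x7Kp2m,1776162181000,admin,192.168.1.1")
print(fields)
```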
makemv delim="," tenx_vars
| lookup tenx-dml-lookup _key AS tenx_hash
OUTPUT part_0 AS tenx_part_0,
pattern_parts AS tenx_log_parts,
pattern_terminator AS tenx_log_term,
timestamp_format AS tenx_ts_f
| eval tenx_ts_sec=if(tenx_var_0 > 10000000000000,
tenx_var_0 / 1000000000,
tenx_var_0 / 1000)
| eval _raw=if(isnull(tenx_log_term), _raw,
if(tenx_ts_f == "",
if(isnull(tenx_var_0), tenx_log_term,
mvjoin(mvappend(tenx_part_0, tenx_var_0,
mvzip(tenx_log_parts, tenx_vars, ""),
tenx_log_term), "")),
replace(
mvjoin(mvappend(
mvzip(tenx_log_parts, tenx_vars, ""),
tenx_log_term), ""),
"__TENX_TS__",
strftime(tenx_ts_sec, tenx_ts_f))))
| fields - tenx_hash, tenx_log_parts, tenx_log_term,
tenx_ts_f, tenx_part_0, tenx_var_0,
tenx_vars, tenx_ts_sec
The lookup pulls the template's static text from the KV Store by _key; mvzip, mvappend, and mvjoin weld each segment and value back into _raw. For timestamped templates, replace swaps the __TENX_TS__ placeholder for the formatted timestamp, and a check against 10 trillion tells nanoseconds from milliseconds. A final | extract replays the user's own props.conf extractions against _raw.
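For readers who find nested eval hard to trace, here is the same branching written as a Python sketch. It follows the macro's logic as shown, with several assumptions: the `Template` record and its sample contents are hypothetical, a missing terminator stands in for a failed lookup, and timestamps are rendered in UTC, whereas Splunk's strftime uses the search-time timezone.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional

TS_PLACEHOLDER = "__TENX_TS__"


@dataclass
class Template:
    """Hypothetical stand-in for one KV Store record."""
    part_0: str = ""
    pattern_parts: List[str] = field(default_factory=list)
    pattern_terminator: Optional[str] = None  # None models a failed lookup
    timestamp_format: str = ""


def epoch_seconds(var_0: str) -> float:
    """Values above 10 trillion are read as nanoseconds, else milliseconds."""
    value = float(var_0)
    return value / 1_000_000_000 if value > 10_000_000_000_000 else value / 1_000


def inflate(raw: str, var_0: Optional[str], variables: List[str], t: Template) -> str:
    if t.pattern_terminator is None:
        return raw  # unknown template: leave the event as is
    # Equivalent of mvzip(parts, vars, ""): pair up and concatenate.
    zipped = [part + value for part, value in zip(t.pattern_parts, variables)]
    if t.timestamp_format == "":
        if var_0 is None:
            return t.pattern_terminator
        return "".join([t.part_0, var_0, *zipped, t.pattern_terminator])
    stamp = datetime.fromtimestamp(epoch_seconds(var_0), tz=timezone.utc)
    body = "".join([*zipped, t.pattern_terminator])
    return body.replace(TS_PLACEHOLDER, stamp.strftime(t.timestamp_format))


login = Template(
    pattern_parts=["__TENX_TS__ INFO ", " logged in from "],
    pattern_terminator="",
    timestamp_format="%Y-%m-%d %H:%M:%S",
)
decoded = inflate("~x7Kp2m,1776162181000,admin,192.168.1.1",
                  "1776162181000", ["admin", "192.168.1.1"], login)
print(decoded)
```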
Templates arrive via HEC and a two-minute cron
Templates enter Splunk via HEC as JSON on the tenx_dml_raw_json sourcetype. Each carries a templateHash and a template like User $ logged in from $, each $ a value slot. A saved search named "Consume KV" fires every two minutes, splitting each template on $ into the tenx_dml KV Store collection and writing a flat copy to tenx_dml_pure for matching. The cron is a latency gap: a new log format is not expandable until the next run picks it up.
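The split on `$` can be sketched as below. The mapping of segments to part_0, pattern_parts, and pattern_terminator is my inference from the field names the inflate macro reads, not a description of the saved search itself, and it covers only templates without a timestamp slot. `split_template` is a hypothetical name.

```python
from typing import Dict, List, Union


def split_template(template: str) -> Dict[str, Union[str, List[str]]]:
    """Split a template on '$' into the static pieces around each value slot."""
    segments = template.split("$")
    if len(segments) == 1:
        # No value slots: the whole text is the terminator.
        return {"part_0": "", "pattern_parts": [], "pattern_terminator": template}
    return {
        "part_0": segments[0],               # text before the first value
        "pattern_parts": segments[1:-1],     # text between consecutive values
        "pattern_terminator": segments[-1],  # text after the last value
    }


record = split_template("User $ logged in from $")
print(record)
```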
What the hook doesn't cover
Anything bypassing jQuery is invisible to the hook: server-side scheduled searches, REST calls from Python, SDK queries from external tools. Those call tenx-inflate directly, or decode the format in their own code with the standalone Java or JavaScript decoder.
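For a Python REST client, calling the macro directly means placing it right after the leading search, before any transforming commands. A minimal sketch of building the form body for a search job follows. `inflate_job_body` is a hypothetical helper, and nothing is sent over the network. Because this path skips the template lookup, only variable values are searchable in the base search; words that live in the template text are not.

```python
from urllib.parse import parse_qs, urlencode


def inflate_job_body(base_search: str, tail: str = "") -> str:
    """Form-encode a search that decodes events before the rest of the pipeline."""
    stages = [base_search.strip(), "| `tenx-inflate`", "| extract"]
    if tail.strip():
        stages.append(tail.strip())
    return urlencode({"search": " ".join(stages), "output_mode": "json"})


body = inflate_job_body(
    "search index=main sourcetype=tenx_encoded admin",
    "| stats count by host",
)
print(parse_qs(body)["search"][0])
```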
The substantive gap is Dashboard Studio, Splunk's React-based framework and the recommended path for new dashboards. It uses fetch, not jQuery, so its searches slip past the hook and fall back to the macro. Covered: classic dashboards, the Search & Reporting app, the saved-search dialog.
The full app is Apache 2.0 at github.com/log-10x/splunk-app; the setup guide covers install, HEC tokens, and forwarder config. Whether browser-level interception ages well as Dashboard Studio grows is the open question, probably not in this form. The next post handles it without rebuilding the hook.
Related: why the template hash is stable, the same problem in Elasticsearch, solved one layer down inside Lucene, and the same idea on ClickHouse in plain SQL.