Recipe: bucket statistics summary
Prompt example: Show me a summary of how many assets, folders, and files are in each bucket, broken down by status, media type, and storage class, with totals across all buckets.
Enumerate every top-level bucket and report a full breakdown - asset types, statuses, media types, storage classes, size, and duration - all from a single search query. This is read-only, Class A (safe on any environment, including prod).
Key facts
- Buckets have `assetType == 5`. Query `assetType Equals 5` to enumerate them - do not scope by `parentId`, because buckets span the root level and a `parentId` filter would miss any that aren't direct children of the sentinel.
- Each bucket's search result already carries a fully populated `assetStats` object computed at index time. No per-bucket follow-up queries are needed.
- `assetStats` contains:
  - `assetTypeCounts` - counts keyed by type name ("file", "folder", "bucket")
  - `assetStatusCounts` - counts keyed by status name ("available", "registering", "uploading", "placeholder", "archived", "error", etc.)
  - `mediaTypeCounts` - counts keyed by media type name ("video", "image", "audio", "document", "text", "mediamanifest", etc.)
  - `storageClassCounts` - counts keyed by storage class name ("standard", "intelligenttiering", "glacier", etc.)
  - `totalContentLength` / `totalContentLengthDisplay` - total size in bytes / human-readable
  - `totalVideoDuration` / `totalVideoDurationDisplay` - total video duration
  - `totalAudioDuration` / `totalAudioDurationDisplay` - total audio duration
- All keys in the count dicts are discovered dynamically - new statuses or storage classes appear automatically without code changes.
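
To make the shape described above concrete, here is a minimal sketch of reading `assetStats` from a single bucket row. The `sample_bucket` dict and all of its values are invented for illustration, and `flatten_counts` is a hypothetical helper name; only the field names come from the list above.

```python
# Hypothetical bucket search-result item. Every value here is made up;
# only the field names follow the "Key facts" list.
sample_bucket = {
    "id": "bucket-001",
    "identifiers": {"displayName": "Marketing"},
    "assetStats": {
        "assetTypeCounts": {"file": 120, "folder": 14},
        "assetStatusCounts": {"available": 118, "error": 2},
        "mediaTypeCounts": {"video": 80, "image": 40},
        "storageClassCounts": {"standard": 100, "glacier": 20},
        "totalContentLengthDisplay": "50 MB",
    },
}


def flatten_counts(stats):
    """Flatten the three dynamic count dicts into prefixed keys.

    Keys are taken from the data itself, so a status or storage class
    that is not listed here still shows up in the output.
    """
    out = {}
    for prefix, field in (
        ("status_", "assetStatusCounts"),
        ("media_", "mediaTypeCounts"),
        ("storage_", "storageClassCounts"),
    ):
        for key, value in (stats.get(field) or {}).items():
            out[prefix + key] = int(value or 0)
    return out


flat = flatten_counts(sample_bucket["assetStats"])
```

A count dict that is missing or null contributes no keys rather than raising, which mirrors the defensive `or {}` pattern used in the recipe code.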
Python

    # component: search
    def get_buckets(sdk):
        """One search call - returns all bucket rows with assetStats populated."""
        flt = [{"fieldName": "assetType", "operator": "Equals", "values": 5}]
        # Fall back to an empty list if the call returns nothing or has no items.
        return (search(sdk, filters=flt, size=200) or {}).get("items") or []


    def bucket_stats_summary(sdk):
        """Return per-bucket stat rows plus a grand-total row.

        Each row dict contains:
          name, id, files, folders,
          all assetStatusCounts keys (prefixed "status_"),
          all mediaTypeCounts keys (prefixed "media_"),
          all storageClassCounts keys (prefixed "storage_"),
          size, video_dur, audio_dur.
        The last row has name="TOTAL" and summed integer fields.
        """
        def _int(v):
            # Coerce a count to int, treating missing or malformed values as 0.
            try:
                return int(v) if v is not None else 0
            except (TypeError, ValueError):
                return 0

        buckets = get_buckets(sdk)
        rows = []
        for b in buckets:
            # Prefer the display name; fall back to the id when it is missing.
            name = (b.get("identifiers") or {}).get("displayName") or b.get("id")
            s = b.get("assetStats") or {}
            row = {
                "name": name,
                "id": b.get("id"),
                "files": _int((s.get("assetTypeCounts") or {}).get("file")),
                "folders": _int((s.get("assetTypeCounts") or {}).get("folder")),
                "size": s.get("totalContentLengthDisplay") or "0 bytes",
                "video_dur": s.get("totalVideoDurationDisplay") or "0 sec",
                "audio_dur": s.get("totalAudioDurationDisplay") or "0 sec",
            }
            # Dynamic keys: whatever the index reports becomes a column.
            for k, v in (s.get("assetStatusCounts") or {}).items():
                row[f"status_{k}"] = _int(v)
            for k, v in (s.get("mediaTypeCounts") or {}).items():
                row[f"media_{k}"] = _int(v)
            for k, v in (s.get("storageClassCounts") or {}).items():
                row[f"storage_{k}"] = _int(v)
            rows.append(row)

        # Largest buckets (by files + folders) first.
        rows.sort(key=lambda r: r["files"] + r["folders"], reverse=True)

        # Collect all dynamic keys, then build the grand-total row.
        all_keys = {k for r in rows for k in r
                    if k.startswith(("status_", "media_", "storage_"))}
        grand = {"name": "TOTAL", "id": None,
                 "files": sum(r["files"] for r in rows),
                 "folders": sum(r["folders"] for r in rows)}
        for k in all_keys:
            grand[k] = sum(r.get(k, 0) for r in rows)
        return rows + [grand]

JavaScript

    // component: search
    async function getBuckets(sdk) {
      // One call - assetStats is populated on every bucket result.
      const flt = [{ fieldName: "assetType", operator: "Equals", values: 5 }];
      const res = await search(sdk, null, flt, null, 200);
      // Fall back to an empty array if the call returns nothing or has no items.
      return res?.items ?? [];
    }

    export async function bucketStatsSummary(sdk) {
      const buckets = await getBuckets(sdk);
      const rows = buckets.map((b) => {
        // Prefer the display name; fall back to the id when it is missing.
        const name = b.identifiers?.displayName ?? b.id;
        const s = b.assetStats ?? {};
        const row = {
          name,
          id: b.id,
          files: s.assetTypeCounts?.file ?? 0,
          folders: s.assetTypeCounts?.folder ?? 0,
          size: s.totalContentLengthDisplay ?? "0 bytes",
          videoDur: s.totalVideoDurationDisplay ?? "0 sec",
          audioDur: s.totalAudioDurationDisplay ?? "0 sec",
        };
        // Dynamic keys: whatever the index reports becomes a column.
        for (const [k, v] of Object.entries(s.assetStatusCounts ?? {})) row[`status_${k}`] = v ?? 0;
        for (const [k, v] of Object.entries(s.mediaTypeCounts ?? {})) row[`media_${k}`] = v ?? 0;
        for (const [k, v] of Object.entries(s.storageClassCounts ?? {})) row[`storage_${k}`] = v ?? 0;
        return row;
      });

      // Largest buckets (by files + folders) first.
      rows.sort((a, b) => (b.files + b.folders) - (a.files + a.folders));

      // Grand total - sum all integer fields discovered across all rows.
      const allKeys = [...new Set(rows.flatMap((r) =>
        Object.keys(r).filter((k) =>
          k.startsWith("status_") || k.startsWith("media_") || k.startsWith("storage_"))
      ))];
      const grand = {
        name: "TOTAL",
        id: null,
        files: rows.reduce((s, r) => s + r.files, 0),
        folders: rows.reduce((s, r) => s + r.folders, 0),
      };
      for (const k of allKeys) grand[k] = rows.reduce((s, r) => s + (r[k] ?? 0), 0);
      return [...rows, grand];
    }

Notes
- One query total. `assetStats` is computed at index time and embedded in every bucket's search result - no per-bucket follow-up calls are needed. The previous approach (2 extra `uuidSearchField` queries per bucket) is superseded by reading `assetStats` directly.
- Dynamic keys. The count dicts (`assetStatusCounts`, `mediaTypeCounts`, `storageClassCounts`) are iterated at runtime - new values (e.g. a new storage tier, a new status) appear automatically without code changes.
- Index lag. `assetStats` is refreshed when the bucket is re-indexed. Very recently ingested assets may not be reflected until the next index cycle.
- Buckets with identical names. Display names repeat on some deployments (see `folder-navigation.md`). The summary keeps all rows - disambiguate by `id` if needed.
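
As one way to act on the dynamic-keys and identical-names notes, the sketch below labels repeated bucket names with a short `id` prefix and prints the rows as a tab-separated table whose columns are discovered from the data. `label_rows`, `format_table`, and `sample_rows` are hypothetical names and data, not part of the recipe; the rows only imitate the shape returned by `bucket_stats_summary`.

```python
from collections import Counter


def label_rows(rows):
    """Append a short id prefix to any name that occurs more than once."""
    counts = Counter(r["name"] for r in rows)
    labeled = []
    for r in rows:
        name = r["name"]
        if counts[name] > 1 and r.get("id"):
            name = f"{name} ({str(r['id'])[:8]})"
        labeled.append({**r, "name": name})
    return labeled


def format_table(rows, columns=("name", "files", "folders")):
    """Render rows as tab-separated text; dynamic columns are sorted by name."""
    dynamic = sorted({k for r in rows for k in r
                      if k.startswith(("status_", "media_", "storage_"))})
    header = list(columns) + dynamic
    lines = ["\t".join(header)]
    for r in rows:
        # A row that lacks a dynamic key is shown with a count of 0.
        lines.append("\t".join(str(r.get(c, 0)) for c in header))
    return "\n".join(lines)


# Invented rows in the same shape as the recipe's output.
sample_rows = [
    {"name": "Media", "id": "aaaaaaaa-1111", "files": 10, "folders": 2,
     "status_available": 10},
    {"name": "Media", "id": "bbbbbbbb-2222", "files": 4, "folders": 1,
     "storage_glacier": 4},
    {"name": "TOTAL", "id": None, "files": 14, "folders": 3,
     "status_available": 10, "storage_glacier": 4},
]

labeled = label_rows(sample_rows)
table = format_table(labeled)
print(table)
```

The eight-character prefix is an arbitrary choice for readability; use the full `id` wherever the label has to be unambiguous.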
