Metadata Folders and Structures

How and where processed asset metadata is stored across the metadata bucket and metadata-archive bucket.

All metadata generated for an asset is stored in one of two S3 buckets. In general, the metadata bucket holds public-facing metadata and the metadata-archive bucket holds metadata used internally to derive other information.

Metadata Bucket

Items stored in the metadata bucket (generally public-facing, non-exhaustive):

  • Screenshots
  • Transcoded video
  • Audio previews
  • Job rollups
  • Asset, segment, and screenshot manifests
  • Transcription text files
  • VTT subtitle files

Metadata Archive Bucket

Items stored in the metadata-archive bucket (internal use, non-exhaustive):

  • imageinfo.json
  • mediainfo.json
  • All Rekognition output JSON
  • Transcription JSON

File Path Format

All files in both buckets follow the same path structure:

<bucket>/<first-2-chars-of-assetId>/<next-2-chars>/<next-2-chars>/<full-assetId>/<filetype>.<extension>

Example:

metadata-content/4d/5b/02/4d5b02b2-b829-47c8-b8a3-eb5f68af4bd7/manifest_asset.json

Processor output files are named after the processor type, for example:

rekognitionimagedetecttext.json
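
The path format above can be sketched as a small helper. This is only an illustration of the layout described here, not the system's actual code: the function name metadata_key and its parameters are hypothetical, and the length check is an added safeguard.

```python
def metadata_key(bucket: str, asset_id: str, filename: str) -> str:
    """Build <bucket>/<aa>/<bb>/<cc>/<assetId>/<filename> from an asset ID.

    Hypothetical helper: the name and signature are assumptions made
    for illustration only.
    """
    if len(asset_id) < 6:
        raise ValueError("asset_id needs at least 6 characters to form the folders")
    # The three folder levels are the first three 2-character slices of the ID.
    shards = [asset_id[i:i + 2] for i in (0, 2, 4)]
    return "/".join([bucket, *shards, asset_id, filename])


# Reproduces the example path shown above.
print(metadata_key(
    "metadata-content",
    "4d5b02b2-b829-47c8-b8a3-eb5f68af4bd7",
    "manifest_asset.json",
))
```

Passing a processor file name such as rekognitionimagedetecttext.json as the last argument yields the corresponding path for that processor's output.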

Public vs. Private Folders in the Metadata Bucket

The system configuration includes a cloudfrontMappings section that determines whether a file lands in the public or non-public folder within the metadata bucket. An example mapping entry:

{
  "bucketName": "demo3-system-metadata-s0of3mkw3ewn",
  "prefix": "pmc",
  "restrict": false,
  "url": "https://demo3-content.nomad.media"
}

The system checks the asset's bucketName and objectKey against each mapping entry:

  • If the asset's bucketName matches the entry's bucketName, its objectKey matches the entry's prefix, and restrict is false → file goes to the public folder
  • If no mapping matches → file goes to the non-public folder (default)

The metadata-archive bucket is always non-public regardless of CloudFront mappings.
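
The public/non-public decision for the metadata bucket can be sketched roughly as follows. This is a minimal illustration under several assumptions, not the system's implementation: the function name is_public is hypothetical, "prefix matches" is assumed to mean the objectKey starts with the prefix, a matching entry with restrict set to true is assumed to fall through to the non-public default, and the object keys in the usage lines are made up.

```python
from typing import Any, Iterable, Mapping


def is_public(bucket_name: str, object_key: str,
              mappings: Iterable[Mapping[str, Any]]) -> bool:
    """Return True if the file should go to the public folder.

    Hypothetical sketch. Assumptions: a prefix "matches" when the object key
    starts with it, and a matching entry with restrict=true is treated the
    same as no match (non-public).
    """
    return any(
        entry["bucketName"] == bucket_name
        and object_key.startswith(entry["prefix"])
        and not entry["restrict"]
        for entry in mappings
    )


# Mapping entry copied from the example configuration above.
cloudfront_mappings = [
    {
        "bucketName": "demo3-system-metadata-s0of3mkw3ewn",
        "prefix": "pmc",
        "restrict": False,
        "url": "https://demo3-content.nomad.media",
    }
]

# Made-up object keys, for illustration only.
print(is_public("demo3-system-metadata-s0of3mkw3ewn", "pmc/clip.mp4", cloudfront_mappings))
print(is_public("demo3-system-metadata-s0of3mkw3ewn", "other/clip.mp4", cloudfront_mappings))
```

Because the default is non-public, an empty mapping list or a mismatch on either field results in the non-public folder.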