Catalog

Description

Builds a catalog either by walking the file tree below a root URL or by reading a TSV file list, and returns a catalog manifest.

The catalog command can be used to create a list of files in a specified subtree and store the result together with explicit version information in a catalog (aka manifest) file. It will not capture empty folders.
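The idea of a tree-walk catalog can be sketched with the standard library alone. This is a conceptual illustration only, not WRAPP's implementation or manifest format: the function name `walk_catalog`, the use of SHA-256 as the version marker, and the local temporary directory are all assumptions made for the example. It shows why empty folders never appear: only files produce entries.

```python
import hashlib
import os
import tempfile


def walk_catalog(root: str) -> dict[str, str]:
    """Map each file's path relative to root to the SHA-256 of its content."""
    entries = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            full_path = os.path.join(dirpath, name)
            rel_path = os.path.relpath(full_path, root).replace(os.sep, "/")
            with open(full_path, "rb") as handle:
                entries[rel_path] = hashlib.sha256(handle.read()).hexdigest()
    return entries


# Build a tiny tree with one file and one empty folder (hypothetical names).
with tempfile.TemporaryDirectory() as demo_root:
    os.makedirs(os.path.join(demo_root, "textures"))
    os.makedirs(os.path.join(demo_root, "empty_folder"))
    with open(os.path.join(demo_root, "textures", "wood.png"), "wb") as handle:
        handle.write(b"fake image bytes")
    demo_entries = walk_catalog(demo_root)

# Only the file is listed; the empty folder leaves no trace in the result.
print(sorted(demo_entries))
```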

The JSON file produced records the files and their versions at the time the command is run. Because the files live on a server, running the command again may produce a different catalog if files have been added, deleted, or updated in the meantime.

To determine whether the cataloged version is still current, use the Diff command to compare two catalogs made at different points in time, or even catalogs made from different copy locations of the same asset.
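What such a comparison reports can be illustrated with two plain dictionaries. This is a hedged sketch of the concept, not the Diff command itself: `compare_catalogs`, the path-to-version mapping, and the sample data are hypothetical and do not reflect the actual catalog file layout.

```python
def compare_catalogs(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Classify paths as added, removed, or changed between two path->version maps."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    changed = sorted(path for path in set(old) & set(new) if old[path] != new[path])
    return {"added": added, "removed": removed, "changed": changed}


# Two hypothetical snapshots of the same subtree taken at different times.
monday = {"scene.usd": "v1", "textures/wood.png": "v3", "notes.txt": "v1"}
friday = {"scene.usd": "v2", "textures/wood.png": "v3", "props/chair.usd": "v1"}

result = compare_catalogs(monday, friday)
print(result)
```

An empty result in all three lists would indicate that the cataloged version is unchanged.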

Use --hash-cache-file (--hcf) to speed up repeated catalog operations by caching file hashes between runs. See Generic Parameters for details. Legacy pickle cache files (WRAPP <=2.1.0) are rejected; see CLI for JSON format and one-time migration guidance.
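Why a hash cache helps on repeated runs can be shown with a small standalone sketch. It is an assumption-laden illustration, not WRAPP's cache format or invalidation logic: the helper `cached_hash`, the size-plus-modification-time fingerprint, and the JSON layout are all invented for this example.

```python
import hashlib
import json
import os
import tempfile


def cached_hash(path: str, cache: dict) -> tuple[str, bool]:
    """Return (sha256 hex digest, cache_hit) for path, updating cache on a miss."""
    stat = os.stat(path)
    # Assumed validity check: same size and modification time means same content.
    fingerprint = [stat.st_size, stat.st_mtime_ns]
    entry = cache.get(path)
    if entry is not None and entry["fingerprint"] == fingerprint:
        return entry["hash"], True
    with open(path, "rb") as handle:
        digest = hashlib.sha256(handle.read()).hexdigest()
    cache[path] = {"fingerprint": fingerprint, "hash": digest}
    return digest, False


with tempfile.TemporaryDirectory() as workdir:
    data_file = os.path.join(workdir, "asset.bin")
    cache_file = os.path.join(workdir, "hash_cache.json")
    with open(data_file, "wb") as handle:
        handle.write(b"payload")

    # First run: nothing is cached yet, so the file is hashed and the cache saved.
    first_cache = {}
    first_digest, first_hit = cached_hash(data_file, first_cache)
    with open(cache_file, "w", encoding="utf-8") as handle:
        json.dump(first_cache, handle)

    # Second run: the cache is reloaded and the unchanged file is not read again.
    with open(cache_file, encoding="utf-8") as handle:
        second_cache = json.load(handle)
    second_digest, second_hit = cached_hash(data_file, second_cache)

print(first_hit, second_hit)
```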

For usage examples, see the Tutorial. For CLI options, run wrapp catalog --help.

Python API Reference

async wrapp.catalog(
    root_dir: str | None = None,
    file_list: str | List[Tuple[str, str]] | None = None,
    tags: bool = False,
    ignores: str | IgnoreEvaluator | None = None,
    local_hash: bool = False,
    checkpoints: bool = True,
    skip_missing: bool = False,
    *,
    context: CommandParameters = CommandParameters(debug=False, verbose=False, dry_run=False, log_file=None, hash_cache_file=None),
    scheduler: SchedulerContext | None = None,
) → Catalog

Builds a catalog either by walking the file tree below a root URL or from a file list (a TSV file or an explicit list of files), and returns a Catalog.

Parameters:
  • root_dir – URL of the root of the files to be cataloged

  • file_list – Alternative to root_dir. Specify either the name of a tab-separated file listing the files to include, or a list that explicitly names every file as a (base path, relative path) tuple.

  • tags – Set this to additionally query the tagging service and make tags part of the catalog

  • ignores – Specify the name of an ignore file containing rules for ignoring items, or an IgnoreEvaluator object

  • local_hash – Flag to allow the catalog operation to calculate the hash locally, potentially after a download of the data

  • checkpoints – Flag, default on, to list checkpoints of local items and add the latest checkpoint into the source URL if the source URL supports checkpoints

  • skip_missing – When True and using file_list mode, silently skip files that are not found instead of raising an error. This is useful for targeted cataloging where some files may not exist in the destination.

  • context – Global configuration parameters

  • scheduler – Optional pre-constructed SchedulerContext. When calling many functions in a row, pre-construct the scheduler once instead of letting each call create its own.

Returns:

Catalog created

Raises:
  • FailedCommand – Raised when prerequisites are not met

  • StorageOperationError – Raised when network or file operations fail
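The file_list mode with explicit tuples might be used along the following lines. This is a sketch under stated assumptions: `split_file_list` and `catalog_selected_files` are hypothetical helpers, the example URLs are placeholders, and whether the base path should end with a slash is a guess rather than documented behavior. Only the `file_list` and `skip_missing` parameters come from the signature above.

```python
from typing import List, Tuple


def split_file_list(base_url: str, file_urls: List[str]) -> List[Tuple[str, str]]:
    """Turn full URLs below base_url into (base path, relative path) tuples."""
    base = base_url.rstrip("/") + "/"  # trailing slash is an assumption
    pairs = []
    for url in file_urls:
        if not url.startswith(base):
            raise ValueError(f"{url!r} is not below {base!r}")
        pairs.append((base, url[len(base):]))
    return pairs


async def catalog_selected_files(base_url: str, file_urls: List[str]):
    """Catalog only the listed files, skipping any that are not found."""
    import wrapp  # imported lazily so the helper above works without WRAPP installed

    return await wrapp.catalog(
        file_list=split_file_list(base_url, file_urls),
        skip_missing=True,
    )


# Hypothetical server and paths, for illustration only.
pairs = split_file_list(
    "https://example.com/projects/demo",
    [
        "https://example.com/projects/demo/scene.usd",
        "https://example.com/projects/demo/textures/wood.png",
    ],
)
print(pairs)
```

Because wrapp.catalog is a coroutine, a synchronous script would run the second helper with `asyncio.run(catalog_selected_files(base_url, file_urls))`, and would likely want to handle FailedCommand and StorageOperationError around that call.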