Catalog

Description

Builds a catalog either by walking the file tree under a root URL or by reading a file list (a TSV file), and returns a Catalog.

The catalog command creates a list of the files in a specified subtree and stores it, together with explicit version information, in a catalog (also known as a manifest) file. Empty folders are not captured.
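As a rough illustration of why empty folders do not appear in a catalog, the following sketch walks a local directory tree and records files only. It uses only the Python standard library; the helper names `list_files` and `demo` are hypothetical, and this is not how wrapp itself is implemented (it also omits the version information a real catalog stores).

```python
import os
import tempfile


def list_files(root):
    """Return sorted paths, relative to root, of every file under root."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        # Only file names are recorded; a folder without files adds nothing.
        for name in filenames:
            full = os.path.join(dirpath, name)
            found.append(os.path.relpath(full, root).replace(os.sep, "/"))
    return sorted(found)


def demo():
    """Build a small temporary tree, including one empty folder, and list it."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "textures"))
        os.makedirs(os.path.join(root, "empty_folder"))  # contains no files
        for rel in ("scene.usd", os.path.join("textures", "wood.png")):
            with open(os.path.join(root, rel), "w") as fh:
                fh.write("data")
        return list_files(root)


print(demo())
```

The folder `empty_folder` never shows up in the result because a file-based listing has no entry to attach it to.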

The JSON file produced records the files and their versions at the moment the command was run. Because the source is a live server, running the command again may produce a different catalog if files were added, deleted, or updated in the meantime.

To determine whether the cataloged version is still current, use the diff command to compare two catalogs made at different points in time, or catalogs of different copy locations of the same asset.
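The idea behind comparing two catalogs can be sketched in a few lines. This example assumes each catalog has been reduced to a plain mapping of relative path to version string; that layout, the function name `diff_catalogs`, and the sample data are hypothetical, and the actual catalog JSON format and the output of the diff command are not described here.

```python
def diff_catalogs(old, new):
    """Compare two {path: version} mappings and classify every difference."""
    added = sorted(p for p in new if p not in old)
    removed = sorted(p for p in old if p not in new)
    changed = sorted(p for p in old if p in new and old[p] != new[p])
    return {"added": added, "removed": removed, "changed": changed}


# Two hypothetical snapshots of the same subtree taken at different times.
monday = {"scene.usd": "v1", "textures/wood.png": "v1", "notes.txt": "v1"}
friday = {"scene.usd": "v2", "textures/wood.png": "v1", "props/chair.usd": "v1"}

result = diff_catalogs(monday, friday)
print(result)
```

If all three lists come back empty, the cataloged versions are unchanged between the two snapshots.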

For usage examples, see the Tutorial. For CLI options, run wrapp catalog --help.

Python API Reference

async wrapp.catalog(
    root_dir: str | None = None,
    file_list: str | List[Tuple[str, str]] | None = None,
    tags: bool = False,
    ignores: str | IgnoreEvaluator | None = None,
    local_hash: bool = False,
    checkpoints: bool = True,
    *,
    context: CommandParameters = CommandParameters(debug=False, verbose=False, dry_run=False, log_file=None, hash_cache_file=None),
    scheduler: SchedulerContext | None = None,
) → Catalog

Builds a catalog either by walking the file tree under a root URL or by reading a file list (a TSV file), and returns a Catalog.

Parameters:
  • root_dir – URL of the root of the files to be cataloged

  • file_list – Alternative to root_dir: either the name of a tab-separated file listing the files to include, or a list that explicitly names every file as a (base path, relative path) tuple

  • tags – Set this to additionally query the tagging service and make tags part of the catalog

  • ignores – Name of an ignore file containing rules for ignoring items; per the signature, an IgnoreEvaluator is also accepted

  • local_hash – Flag to allow the catalog operation to calculate the hash locally, potentially after a download of the data

  • checkpoints – Flag, enabled by default, to list checkpoints of local items and to add the latest checkpoint to the source URL if the source URL supports checkpoints

  • context – Global configuration parameters

  • scheduler – Optional pre-constructed SchedulerContext. When calling many functions in a row, pre-construct the scheduler and pass it to each call.

Returns:

The Catalog that was created

Raises:
  • FailedCommand – Raised when prerequisites are not met

  • StorageOperationError – Raised when network or file operations fail
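The reference above suggests a common calling pattern: catalog several roots in a row, pass one pre-constructed scheduler to every call, and handle the documented exceptions. The sketch below shows that pattern under stated assumptions. The helper `catalog_roots` is hypothetical; it takes the catalog coroutine and the exception classes as arguments because this page does not say where FailedCommand and StorageOperationError are imported from, or how a SchedulerContext is constructed. A stand-in coroutine, `fake_catalog`, replaces wrapp.catalog so the example runs without wrapp installed.

```python
import asyncio


async def catalog_roots(catalog_fn, root_dirs, expected_errors, scheduler=None):
    """Catalog each root in turn, reusing one scheduler for all calls.

    Returns (catalogs, failures): results keyed by root, and the exception
    raised for each root that could not be cataloged.
    """
    catalogs, failures = {}, {}
    for root in root_dirs:
        try:
            # root_dir and scheduler are keyword names from the signature above.
            catalogs[root] = await catalog_fn(root_dir=root, scheduler=scheduler)
        except expected_errors as exc:
            failures[root] = exc
    return catalogs, failures


async def fake_catalog(root_dir=None, scheduler=None):
    """Stand-in for wrapp.catalog (assumption: used only for this demo)."""
    if root_dir.endswith("/missing"):
        raise OSError(f"cannot read {root_dir}")
    return {"root": root_dir}


catalogs, failures = asyncio.run(
    catalog_roots(fake_catalog, ["server/a", "server/missing"], (OSError,))
)
print(catalogs)
print(sorted(failures))
```

With wrapp available, the same helper would presumably be called with wrapp.catalog as catalog_fn, a tuple containing FailedCommand and StorageOperationError as expected_errors, and a pre-constructed SchedulerContext as scheduler.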