Bucket

Module for bucket object.

class qarnot.bucket.Bucket(connection, name, create=True, filtering: Filtering | None = None, resources_transformation: ResourcesTransformation | None = None, cacheTTLSec: int | None = None)[source]

Bases: Storage

Represents a resource/result bucket.

This class is the interface to manage resources or results from a Bucket.

Raises:

BucketStorageUnavailableException – the bucket storage engine is not available

Note

Paths given as ‘remote’ arguments (or as path arguments for Bucket.directory()) must be valid unix-like paths.
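
As a rough illustration of the note above, a local path can be normalized to forward slashes before it is passed as a ‘remote’ argument. The helper name to_remote_path is hypothetical and not part of the qarnot SDK. The exact rules the SDK applies to remote paths are not stated here, so this sketch only converts separators.

```python
from pathlib import PureWindowsPath


def to_remote_path(local_path: str) -> str:
    """Return local_path with forward slashes as separators.

    Hypothetical helper, not part of the qarnot SDK.
    """
    # PureWindowsPath treats both "\\" and "/" as separators, so the same
    # code handles paths written in either style on any platform.
    return PureWindowsPath(local_path).as_posix()
```

Note that a backslash inside a genuine unix file name would also be converted, so this is only suitable when backslashes are known to be separators.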

__init__(connection, name, create=True, filtering: Filtering | None = None, resources_transformation: ResourcesTransformation | None = None, cacheTTLSec: int | None = None)[source]
to_json()[source]

Get a dict ready to be json packed from this bucket.

classmethod from_json(connection, json_bucket)[source]

Create a Bucket object from a JSON advanced bucket.

Parameters:
  • connection (Connection) – the cluster connection

  • json_bucket (dict) – Dictionary representing the bucket

Returns:

The created Bucket.

with_filtering(filtering)[source]

Create a new Bucket object from the given bucket with a specific filtering.

Examples:

filtered_bucket = bucket.with_filtering(BucketPrefixFiltering("prefix1"))
other_filtered_bucket = Bucket(connection, "name", False).with_filtering(BucketPrefixFiltering("prefix1"))

Parameters:

filtering (AbstractFiltering) – Filtering to add to the bucket.

Returns:

The created Bucket.

with_resource_transformation(resource)[source]

Create a new Bucket object from the given bucket with a specific resource transformation.

Examples:

trans_bucket = Bucket(connection, "name", False).with_resource_transformation(PrefixResourcesTransformation("prefix2"))
trans_filtered_bucket = bucket.with_resource_transformation(PrefixResourcesTransformation("prefix2")).with_filtering(BucketPrefixFiltering("prefix1"))

Parameters:

resource (AbstractResourcesTransformation) – The resource transformation to add to the bucket.

Returns:

The created Bucket.

with_cache_ttl(ttl: int)[source]

Create a new Bucket object from the given bucket with a specific cache ttl (in seconds).

Examples:

new_bucket = bucket.with_cache_ttl(2592000)
new_bucket = Bucket(connection, "name", False).with_cache_ttl(2592000)

Parameters:

ttl (int) – Time to live for the bucket resource cache.

Returns:

The created Bucket.
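
Since the TTL is expressed in seconds, a small conversion helper can make values easier to read. The sketch below is only an illustration; ttl_seconds is a hypothetical name and not part of the qarnot SDK.

```python
from datetime import timedelta


def ttl_seconds(**duration) -> int:
    """Convert timedelta keyword arguments (days=, hours=, ...) to whole seconds.

    Hypothetical helper, not part of the qarnot SDK.
    """
    return int(timedelta(**duration).total_seconds())


# 30 days is 2592000 seconds, the value used in the with_cache_ttl examples.
THIRTY_DAYS = ttl_seconds(days=30)

# With an existing Bucket object, the result would be passed as:
# new_bucket = bucket.with_cache_ttl(THIRTY_DAYS)
```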

delete()[source]

Delete the bucket represented by this Bucket.

list_files()[source]

List files in the bucket.

Return type:

list(S3.ObjectSummary)

Returns:

A list of ObjectSummary resources

directory(directory='')[source]

List files in a directory of the bucket according to prefix.

Return type:

list(S3.ObjectSummary)

Returns:

A list of ObjectSummary resources
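
The returned entries can be post-processed like any list. The sketch below assumes each entry exposes key and size attributes, as boto3's S3.ObjectSummary does; the function names are hypothetical and not part of the qarnot SDK.

```python
def summarize_listing(entries):
    """Return (file_count, total_size) for a list of listing entries.

    Assumes each entry has a numeric ``size`` attribute.
    """
    count = 0
    total = 0
    for entry in entries:
        count += 1
        total += entry.size
    return count, total


def keys_under(entries, prefix):
    """Return the sorted keys of the entries whose key starts with prefix."""
    return sorted(entry.key for entry in entries if entry.key.startswith(prefix))


# With an existing Bucket object, this might be used as:
# count, total = summarize_listing(bucket.directory("results/"))
```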

sync_directory(directory, verbose=False, remote=None)[source]

Synchronize a local directory with the remote bucket.

Parameters:
  • directory (str) – The local directory to use for synchronization

  • verbose (bool) – Print information about synchronization operations

  • remote (str) – path of the directory on remote node (defaults to local)

Warning

Local changes are reflected on the server: a file present in the bucket but not in the local directory will be deleted from the bucket.

A file present in the directory but not in the bucket will be uploaded.

Note

The following parameters are used to determine whether synchronization is required:

  • name

  • size

  • sha1sum
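
To make the three criteria concrete, the sketch below computes a (name, size, sha1sum) triple for a local file and compares two such triples. It illustrates the idea only and is not the SDK's implementation; local_fingerprint and needs_sync are hypothetical names.

```python
import hashlib
import os


def local_fingerprint(path: str, name: str) -> tuple:
    """Return (name, size, sha1sum) for the local file at path."""
    sha1 = hashlib.sha1()
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            sha1.update(chunk)
    return name, os.path.getsize(path), sha1.hexdigest()


def needs_sync(local: tuple, remote: tuple) -> bool:
    """A file needs synchronizing when any of the three values differs."""
    return local != remote
```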

sync_files(files, verbose=False, remote=None)[source]

Synchronize files with the remote bucket.

Parameters:
  • files (dict) – Dictionary of synchronized files

  • verbose (bool) – Print information about synchronization operations

  • remote (str) – path of the directory on remote node (defaults to local)

Raises:

MissingBucketException – the bucket is not on the server

Each dictionary key is a remote file path and its value is the corresponding local file path.

Warning

Local changes are reflected on the server: a file present in the bucket but not in the local directory will be deleted from the bucket.

A file present in the directory but not in the bucket will be uploaded.

Note

The following parameters are used to determine whether synchronization is required:

  • name

  • size

  • sha1sum
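
The files dictionary maps remote paths to local paths. One possible way to build it from a local directory tree is sketched below; build_sync_map and its remote_prefix parameter are hypothetical and not part of the qarnot SDK.

```python
import os


def build_sync_map(local_dir: str, remote_prefix: str = "") -> dict:
    """Map unix-like remote paths (keys) to local file paths (values)."""
    mapping = {}
    for root, _dirs, names in os.walk(local_dir):
        for name in names:
            local_path = os.path.join(root, name)
            # Path relative to local_dir, with forward slashes for the remote side.
            relative = os.path.relpath(local_path, local_dir).replace(os.sep, "/")
            if remote_prefix:
                remote = remote_prefix.rstrip("/") + "/" + relative
            else:
                remote = relative
            mapping[remote] = local_path
    return mapping


# With an existing Bucket object, the mapping would be passed as:
# bucket.sync_files(build_sync_map("inputs", "job-42/inputs"))
```

Keep the warning above in mind before calling sync_files with such a mapping, since synchronization can delete files from the bucket.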

add_string(string, remote)[source]

Add a string to the storage.

Parameters:
  • string (str) – the string to add

  • remote (str) – name of the remote file

add_file(local_or_file, remote=None)[source]

Add a local file or a Python File to the storage.

Note

You can also use object[remote] = local

Parameters:
  • local_or_file (str or File) – path of the local file or an opened Python File

  • remote (str) – name of the remote file (defaults to local_or_file)
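
A minimal sketch of how add_string and add_file might be combined is shown below. The function upload_inputs and the remote name config/settings.txt are hypothetical; the function only relies on the two method signatures documented here and accepts any object that provides them.

```python
def upload_inputs(bucket, settings_text, files):
    """Upload a settings string plus a set of local files to a bucket.

    ``files`` maps remote names (keys) to local paths (values).
    Returns the list of remote names that were written.
    """
    settings_remote = "config/settings.txt"  # hypothetical remote name
    # add_string(string, remote)
    bucket.add_string(settings_text, settings_remote)
    for remote, local in files.items():
        # add_file(local_or_file, remote=None)
        bucket.add_file(local, remote)
    return [settings_remote, *files]
```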

get_all_files(output_dir, progress=None)[source]

Get all files from the storage.

Parameters:
  • output_dir (str) – local directory for the retrieved files.

  • progress (bool or function(float, float, str)) – can be a callback (read,total,filename) or True to display a progress bar

Warning

Overwrites the content of output_dir.
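
Because existing content in output_dir is at risk, one cautious pattern is to download into a freshly created directory. The sketch below is an illustration under that assumption; download_to_fresh_dir is a hypothetical name and not part of the qarnot SDK.

```python
import tempfile


def download_to_fresh_dir(bucket, parent=None):
    """Download all files of a bucket into a new, empty directory.

    Returns the path of the directory that was created.
    """
    # mkdtemp always creates a new directory, so nothing existing is touched.
    output_dir = tempfile.mkdtemp(prefix="bucket-", dir=parent)
    bucket.get_all_files(output_dir)
    return output_dir
```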

get_file(remote, local=None, progress=None)[source]

Get a file from the storage. Create needed subfolders.

Parameters:
  • remote (str) – the name of the remote file

  • local (str) – local name of the retrieved file (defaults to remote)

  • progress (bool or function(float, float, str)) – can be a callback (read,total,filename) or True to display a progress bar

Return type:

str

Returns:

The name of the output file.

Raises:

ValueError – no such file
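
The progress parameter accepts a callback taking (read, total, filename). A minimal sketch of such a callback follows; the names format_progress and print_progress are hypothetical, and the unit of read and total is not stated in this reference.

```python
def format_progress(read: float, total: float, filename: str) -> str:
    """Build a one-line progress message for a transfer."""
    # Guard against a zero total (for example an empty file).
    percent = 100.0 * read / total if total else 100.0
    return f"{filename}: {percent:.1f}% ({int(read)}/{int(total)})"


def print_progress(read: float, total: float, filename: str) -> None:
    """Callback matching the documented (read, total, filename) signature."""
    print(format_progress(read, total, filename))


# With an existing Bucket object, the callback would be passed as:
# bucket.get_file("results/out.csv", progress=print_progress)
```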

add_directory(local, remote='')[source]

Add a directory to the storage. Does not follow symlinks. File hierarchy is preserved.

Parameters:
  • local (str) – path of the local directory to add

  • remote (str) – path of the directory on remote node (defaults to local)

Raises:

IOError – not a valid directory

copy_file(source, dest)[source]

Create a copy of a file.

Parameters:
  • source (str) – name of the existing file to duplicate

  • dest (str) – name of the created file

flush()[source]

Ensure all background uploads are complete.

Deprecated since version 2.6.0: This will be removed in 3.0. Legacy function

update(flush=False)[source]

Update object from remote endpoint.

Parameters:

flush (bool) – bypass cache

Deprecated since version 2.6.0: This will be removed in 3.0. Legacy function

delete_file(remote)[source]

Delete a file from the storage.

Parameters:

remote (str) – the name of the remote file

property uuid

Bucket identifier

property description

Bucket identifier