Cloud helpers

We provide several helper methods for interacting with cloud storage, beyond load/save.

cloud_copy_file

Method to copy content from disk/cloud (depending on dist_origin value) to cloud.

def cloud_copy_file(origin: Path,
                    dest: Path,
                    dist_origin: bool = False,
                    cloud: Cloud = CLOUD,
                    location: str | None = None
                    ) -> None:

Arguments are:

  • origin: a pathlib Path specifying where the data is located: on disk (if dist_origin is False) or in the S3 bucket / blob container (if dist_origin is True). Remember to read the forge key page to learn how to properly... Well, forge a key 😊.
  • dest: a pathlib Path specifying the cloud location to copy the file to.
  • dist_origin (optional): whether the file is already on the cloud storage (True) or on disk (False). False (disk) by default.
  • cloud (optional): the cloud provider used to connect to remote storage. Read the installation guide to learn how to connect to your S3-like or blob cloud provider, and consult cloud details. By default, the environment variable cloud_provider is used.
  • location (optional): the S3 bucket / blob container to connect to. By default, the s3_bucket_name environment variable is used if cloud=Cloud.AWS, and container is used if cloud=Cloud.Azure.

cloud_exists

Method checking if a file exists on the cloud storage

def cloud_exists(file_path: Path, cloud: Cloud = CLOUD) -> bool:

Arguments are:

  • file_path: a pathlib Path specifying where the data is located in the s3 bucket / blob container. Remember to read the forge key page to learn how to properly... Well, forge a key 😊.
  • cloud (optional): the cloud provider used to connect to remote storage. Read the installation guide to learn how to connect to your S3-like or blob cloud provider, and consult cloud details. By default, the environment variable cloud_provider is used.

cloud_is_dir

Method checking whether the passed file_path is a folder (the check looks at the file extension).

def cloud_is_dir(file_path: Path) -> bool:

Arguments are:

  • file_path: a pathlib Path specifying where the data is located in the s3 bucket / blob container. Remember to read the forge key page to learn how to properly... Well, forge a key 😊.
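
To make the "look at file extension" remark concrete, here is a minimal sketch of what an extension-based check could look like. The function looks_like_dir is a hypothetical stand-in written for illustration; it is an assumption about the general idea, not the actual implementation of cloud_is_dir.

```python
from pathlib import Path


def looks_like_dir(file_path: Path) -> bool:
    """Hypothetical stand-in: treat a path without a file extension as a folder."""
    # Path.suffix is the final ".ext" component, or "" when there is none.
    return file_path.suffix == ""


print(looks_like_dir(Path("reports/2024")))            # True: no extension
print(looks_like_dir(Path("reports/2024/sales.csv")))  # False: ".csv" extension
print(looks_like_dir(Path("data/v1.2")))               # False: ".2" counts as an extension
```

A check of this kind never contacts the storage, which would be consistent with cloud_is_dir taking no cloud argument. The trade-off is that a folder named like v1.2, or a file with no extension, would be misclassified by this sketch.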

cloud_iterdir

Method creating an Iterator over the files directly underneath the provided file_path, as pathlib's iterdir does on disk.

def cloud_iterdir(file_path: Path, cloud: Cloud = CLOUD) -> Iterator[Path]:

Arguments are:

  • file_path: a pathlib Path specifying where the data is located in the s3 bucket / blob container. Remember to read the forge key page to learn how to properly... Well, forge a key 😊.
  • cloud (optional): the cloud provider used to connect to remote storage. Read the installation guide to learn how to connect to your S3-like or blob cloud provider, and consult cloud details. By default, the environment variable cloud_provider is used.
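
Since cloud_iterdir is described as mirroring the pathlib disk version, the following sketch shows that disk behaviour on a temporary folder. It only uses the standard library; whether cloud_iterdir also yields sub-folder entries the way Path.iterdir does is not stated here, so treat that part as an open assumption.

```python
import tempfile
from pathlib import Path


def build_tree(root: Path) -> None:
    """Create root/a.txt and root/sub/b.txt for the demonstration."""
    (root / "sub").mkdir()
    (root / "a.txt").write_text("a")
    (root / "sub" / "b.txt").write_text("b")


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    build_tree(root)
    # iterdir only lists direct children: sub/b.txt is not reached.
    direct_children = sorted(p.name for p in root.iterdir())

print(direct_children)  # ['a.txt', 'sub']
```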

cloud_move_file

Method to move content from disk/cloud (depending on dist_origin value) to cloud.

def cloud_move_file(origin: Path,
                    dest: Path,
                    dist_origin: bool = False,
                    delete_file: bool = True,
                    cloud: Cloud = CLOUD
                    ) -> None:

Arguments are:

  • origin: a pathlib Path specifying where the data is located: on disk (if dist_origin is False) or in the S3 bucket / blob container (if dist_origin is True). Remember to read the forge key page to learn how to properly... Well, forge a key 😊.
  • dest: a pathlib Path specifying the cloud location to move the file to.
  • dist_origin (optional): whether the file is already on the cloud storage (True) or on disk (False). False (disk) by default.
  • delete_file (optional): whether the origin is deleted once the transfer is done (True, a move) or kept (False, a copy). True by default.
  • cloud (optional): the cloud provider used to connect to remote storage. Read the installation guide to learn how to connect to your S3-like or blob cloud provider, and consult cloud details. By default, the environment variable cloud_provider is used.
  • location (optional): the S3 bucket / blob container to connect to. By default, the s3_bucket_name environment variable is used if cloud=Cloud.AWS, and container is used if cloud=Cloud.Azure.

cloud_move_folder

Method to move/copy (depending on delete_file value) content from disk/cloud (depending on dist_origin value) to cloud.

def cloud_move_folder(origin: Path,
                      dest: Path,
                      dist_origin: bool = False,
                      delete_file: bool = True,
                      cloud: Cloud = CLOUD
                      ) -> None:

Arguments are:

  • origin: a pathlib Path specifying where the folder is located: on disk (if dist_origin is False) or in the S3 bucket / blob container (if dist_origin is True). Remember to read the forge key page to learn how to properly... Well, forge a key 😊.
  • dest: a pathlib Path specifying the cloud location to move the folder to.
  • dist_origin (optional): whether the folder is already on the cloud storage (True) or on disk (False). False (disk) by default.
  • delete_file (optional): whether the origin is deleted once the transfer is done (True, a move) or kept (False, a copy). True by default.
  • cloud (optional): the cloud provider used to connect to remote storage. Read the installation guide to learn how to connect to your S3-like or blob cloud provider, and consult cloud details. By default, the environment variable cloud_provider is used.
  • location (optional): the S3 bucket / blob container to connect to. By default, the s3_bucket_name environment variable is used if cloud=Cloud.AWS, and container is used if cloud=Cloud.Azure.
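
The move/copy switch driven by delete_file can be pictured with a local-disk analogy. The sketch below uses shutil on a temporary folder; transfer_folder is a hypothetical helper written for this illustration and is not how cloud_move_folder is implemented.

```python
import shutil
import tempfile
from pathlib import Path


def transfer_folder(origin: Path, dest: Path, delete_file: bool = True) -> None:
    """Local-disk analogy: copy origin to dest, then optionally delete origin."""
    shutil.copytree(origin, dest)
    if delete_file:
        # Deleting the origin after the copy is what turns the copy into a move.
        shutil.rmtree(origin)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    src = root / "src"
    src.mkdir()
    (src / "data.txt").write_text("hello")

    # delete_file=False behaves like a copy: the origin is still there.
    transfer_folder(src, root / "copied", delete_file=False)
    kept_after_copy = src.exists()

    # delete_file=True behaves like a move: the origin is gone afterwards.
    transfer_folder(src, root / "moved", delete_file=True)
    kept_after_move = src.exists()
    moved_content = (root / "moved" / "data.txt").read_text()

print(kept_after_copy, kept_after_move, moved_content)  # True False hello
```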

cloud_rglob

Method creating an Iterator over all files underneath the provided file_path, recursively, as pathlib's rglob does on disk.

def cloud_rglob(file_path: Path,
                pattern: str | None = None,
                cloud: Cloud = CLOUD
                ) -> Iterator[Path]:

Arguments are:

  • file_path: a pathlib Path specifying where the data is located in the s3 bucket / blob container. Remember to read the forge key page to learn how to properly... Well, forge a key 😊.
  • pattern (optional): a matching pattern (not a regex: it is inspired by pathlib's rglob. Read this for more information). Only keys matching this pattern are returned.
  • cloud (optional): the cloud provider used to connect to remote storage. Read the installation guide to learn how to connect to your S3-like or blob cloud provider, and consult cloud details. By default, the environment variable cloud_provider is used.
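
Because the pattern is described as pathlib-inspired rather than a regex, the disk version is a reasonable way to get a feel for it. The sketch below runs Path.rglob on a temporary folder using only the standard library. Note that Path.rglob requires a pattern, whereas cloud_rglob accepts None, which presumably means no filtering; that reading is an assumption.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub").mkdir()
    (root / "a.csv").write_text("1")
    (root / "sub" / "b.csv").write_text("2")
    (root / "sub" / "c.txt").write_text("3")

    # "*.csv" is a glob pattern: "*" matches any run of characters in a name.
    csv_files = sorted(p.relative_to(root).as_posix() for p in root.rglob("*.csv"))

    # "*" matches every entry at any depth, folders included.
    everything = sorted(p.relative_to(root).as_posix() for p in root.rglob("*"))

print(csv_files)   # ['a.csv', 'sub/b.csv']
print(everything)  # ['a.csv', 'sub', 'sub/b.csv', 'sub/c.txt']
```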

delete_cloud_content

Method deleting the file file_path in the cloud storage

def delete_cloud_content(file_path: Path, cloud: Cloud = CLOUD) -> None:

Arguments are:

  • file_path: a pathlib Path specifying where the data is located in the S3 bucket / blob container. Remember to read the forge key page to learn how to properly... Well, forge a key 😊.
  • cloud (optional): the cloud provider used to connect to remote storage. Read the installation guide to learn how to connect to your S3-like or blob cloud provider, and consult cloud details. By default, the environment variable cloud_provider is used.

download_cloud_object

Method downloading the file file_path from the cloud storage to local_path on disk.

def download_cloud_object(file_path: Path, local_path: Path, cloud: Cloud = CLOUD) -> None:

Arguments are:

  • file_path: a pathlib Path specifying where the data is located in the S3 bucket / blob container. Remember to read the forge key page to learn how to properly... Well, forge a key 😊.
  • local_path: a pathlib Path specifying where on the local disk to store the downloaded file.
  • cloud (optional): the cloud provider used to connect to remote storage. Read the installation guide to learn how to connect to your S3-like or blob cloud provider, and consult cloud details. By default, the environment variable cloud_provider is used.

get_cloud_url

Method returning a URL to the file file_path that stays valid for a limited duration.

The aim is to give a person who has no cloud storage credentials access to one specific file, for a limited duration. It proves handy for letting a client download a large file, for instance, when you obviously do not want to give the client access to the full cloud storage.

def get_cloud_url(file_path: Path, timeout: int = 3600, cloud: Cloud = CLOUD) -> str | None:

Arguments are:

  • file_path: a pathlib Path specifying where the data is located in the S3 bucket / blob container. Remember to read the forge key page to learn how to properly... Well, forge a key 😊.
  • timeout (optional): how long the URL stays valid for downloading the data located at file_path, in seconds (3600 by default, i.e. one hour).
  • cloud (optional): the cloud provider used to connect to remote storage. Read the installation guide to learn how to connect to your S3-like or blob cloud provider, and consult cloud details. By default, the environment variable cloud_provider is used.
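
Since timeout is expressed in seconds, it can be convenient to build the value from a readable duration. The helper below is a small sketch using only the standard library; to_timeout_seconds is a hypothetical name introduced for this example and is not part of the helpers documented here.

```python
from datetime import timedelta


def to_timeout_seconds(duration: timedelta) -> int:
    """Convert a duration into the whole number of seconds a timeout expects."""
    return int(duration.total_seconds())


two_hours = to_timeout_seconds(timedelta(hours=2))
print(two_hours)  # 7200
```

The resulting integer could then be passed as the timeout argument. Also note that the return type is str | None, so callers should be prepared for a None result.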