Cloud read

One of the two central methods of ecodev-cloud is load_cloud_data, which loads data from the cloud provider specified in the cloud_provider environment variable.

Its interface reads:

def load_cloud_data(file_path: Path,
                    cloud: Cloud = CLOUD,
                    location: str | None = None
                    ) -> Any:

where:

file_path: a pathlib Path specifying where the data is located in the S3 bucket / blob container. (As an aside, we advise replacing os-based code with pathlib wherever you can: for dealing with files and folders, pathlib is simpler and more user friendly.) Remember to read the forge key page to learn how to properly... well, forge a key 😊.
cloud (optional): the cloud provider used to connect to a remote storage. Read the installation guide to learn how to connect to your S3-like/blob cloud provider, and also consult cloud details. By default, the cloud_provider environment variable is used.
location (optional): the S3 bucket / blob container to connect to. By default, the s3_bucket_name environment variable is used if cloud=Cloud.AWS, and container is used if cloud=Cloud.Azure.

As of 2024/05, this method can read the following file types (it relies on the file extension given in file_path to pick the appropriate loader):

csv
xlsx
npy
npy.npz (compressed numpy)
json
netcdf (a very useful format when dealing with climate data)
tex
txt
tif
gpkg
shp (subtlety: the shapefile must be stored zipped so that it can be retrieved easily. See the source code, in particular the load_zipped_shp method, to learn more)
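
To illustrate what choosing a loader from the file extension can look like, here is a minimal sketch that works on local files and uses only the standard library. It is an illustration under stated assumptions, not the implementation used by ecodev-cloud: load_local_data, LOADERS and the _read_* helpers are hypothetical names, and only a few of the formats listed above are covered.

```python
import csv
import json
from pathlib import Path
from typing import Any, Callable


def _read_csv(path: Path) -> list[dict[str, str]]:
    """Read a CSV file into a list of row dictionaries keyed by the header."""
    with path.open(newline='', encoding='utf-8') as f:
        return list(csv.DictReader(f))


def _read_json(path: Path) -> Any:
    """Parse a JSON file into the corresponding Python object."""
    with path.open(encoding='utf-8') as f:
        return json.load(f)


def _read_text(path: Path) -> str:
    """Return the whole file content as a string."""
    return path.read_text(encoding='utf-8')


# Hypothetical registry mapping a file extension to the function that reads it.
LOADERS: dict[str, Callable[[Path], Any]] = {
    '.csv': _read_csv,
    '.json': _read_json,
    '.txt': _read_text,
    '.tex': _read_text,
}


def load_local_data(file_path: Path) -> Any:
    """Pick a loader from the last extension of file_path and apply it."""
    suffix = file_path.suffix.lower()
    try:
        loader = LOADERS[suffix]
    except KeyError:
        raise ValueError(f'Unsupported file type: {suffix!r}') from None
    return loader(file_path)
```

A registry like this keeps the supported formats in one place: adding a format means adding one entry, and an unknown extension fails with an explicit error.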