S3 Object Storage
The object storage is internally hosted via MinIO. There is a web interface which allows browsing through the store. Data can also be downloaded via the MinIO client.
Web-Interface: https://minio.strg1.lan/login
MinIO client setup
Install instructions: https://min.io/docs/minio/linux/reference/minio-mc.html
S3 API URL: https://s3.strg1.lan
The CLI tool first requires setting up an alias for the storage. To do so, you need an access key, which can be created via the web interface.
You can then either directly download a credentials.json, or assemble it yourself from the given accessKey and secretKey. NOTE: The URL provided by the web interface is wrong; you need to specify the S3 API URL given above.
Assuming you have the mc command available on the command line, and the credentials in credentials.json, you can set up an alias with the name “strg1” via:
mc alias import strg1 credentials.json
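Alternatively, if you prefer not to use the credentials file, the alias can be set directly from the access key and secret key (the key values below are placeholders):
mc alias set strg1 https://s3.strg1.lan YOURACCESSKEY YOURSECRETKEY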
If everything worked out, you should be able to ls the storage:
mc ls strg1
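Once the alias resolves, data can be downloaded recursively with mc cp. The bucket and prefix below are placeholders; use mc ls strg1 to find the real names:
mc cp --recursive strg1/some-bucket/some-prefix/ ./local-dir/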
Failed to verify certificate
If you get the following error, you still need to add the certificate (see bottom of the page) to your system.
mc: <ERROR> Unable to list folder. Get "https://s3.strg1.lan/": tls: failed to verify certificate: x509: certificate signed by unknown authority
To add the certificate, write it to a ca.crt file, move it to the path where certificates are stored on your OS (e.g. /etc/pki/trust/anchors/ca.crt on openSUSE Leap 15.6), and then run update-ca-certificates (same command on Ubuntu or openSUSE).
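As a concrete sketch of these steps on openSUSE Leap 15.6 (on Ubuntu, the usual anchor directory is /usr/local/share/ca-certificates/ instead):
sudo cp ca.crt /etc/pki/trust/anchors/ca.crt
sudo update-ca-certificates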
Example ''credentials.json'':
{"url":"https://s3.strg1.lan",
"accessKey":"YOURACCESSKEY",
"secretKey":"YOURSECRETKEY",
"api":"s3v4",
"path":"auto"}
Accessing the Buckets from Python
To access the storage from Python you can either use the client provided by the `minio` package, or the `s3fs` package together with `boto3`. The latter has wider support and is e.g. required for using zarr.
An important point is that the SSL certificate of the storage is self-signed, so a certificate file has to be provided.
Note: NEVER store secrets directly in your code, as the code will be pushed to repositories.
Instead, create a .env file in your project root where you store your credentials, and later load them in Python using the `dotenv` package.
Example .env file:
STORAGE_ACCESS_KEY=...
STORAGE_SECRET_KEY=...
ENDPOINT=s3.strg1.lan
ENDPOINT_FULL=https://s3.strg1.lan
Example Using minio package:
This assumes a ca.crt.cer file is available.
For this snippet, the following dependencies were used in the pyproject.toml file:
dotenv = "^0.9.9"
minio = "^7.2.15"
Code Snippet
from pathlib import Path
import os

import minio
import urllib3
from dotenv import load_dotenv

# project root (this file is assumed to live one level below it)
_root = Path(__file__).parent.parent

# load credentials from the .env file
load_dotenv()
access_key_id = str(os.getenv("STORAGE_ACCESS_KEY"))
secret_access_key = str(os.getenv("STORAGE_SECRET_KEY"))
endpoint_url = str(os.getenv("ENDPOINT"))
endpoint_url_full = str(os.getenv("ENDPOINT_FULL"))

# Specify the path to your custom CA certificate
ca_cert_path = _root / "ca.crt.cer"
assert Path(ca_cert_path).is_file()

# custom HTTP client for making requests to endpoints with self-signed certs
http_client = urllib3.PoolManager(
    cert_reqs="CERT_REQUIRED",
    ca_certs=str(ca_cert_path),
)

# create Minio client with the custom http client
# (replace with a boto3 client for AWS)
minio_client = minio.Minio(
    endpoint_url,
    secure=True,
    access_key=access_key_id,
    secret_key=secret_access_key,
    http_client=http_client,
)

print(minio_client.bucket_exists("rekonas-dataset-101-nights"))
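As a quick follow-up, the same client can list and download objects. A minimal sketch, assuming the bucket from above exists; the object name "some/object.bin" and the local file name are placeholders:
# list all objects in the bucket
for obj in minio_client.list_objects("rekonas-dataset-101-nights", recursive=True):
    print(obj.object_name)

# download a single object to a local file (placeholder names)
minio_client.fget_object(
    "rekonas-dataset-101-nights", "some/object.bin", "local_copy.bin"
)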
Example Using s3fs and boto3 package (required for zarr v2):
This assumes a ca.crt.cer file is available.
For this snippet, the following dependencies were used in the pyproject.toml file.
Note: There are weird dependency issues in poetry with s3fs and boto3.
The dependencies below should however resolve those.
dotenv = "^0.9.9"
s3fs = {extras = ["boto3"], version = ">=2023.12.0"}
Code Snippet
import os
from pathlib import Path

import s3fs
from dotenv import load_dotenv

# load credentials from the .env file
load_dotenv()
access_key_id = os.getenv("STORAGE_ACCESS_KEY")
secret_access_key = os.getenv("STORAGE_SECRET_KEY")
endpoint_url = os.getenv("ENDPOINT")
endpoint_url_full = os.getenv("ENDPOINT_FULL")

# Specify the path to your custom CA certificate
ca_cert_path = "ca.crt.cer"
assert Path(ca_cert_path).is_file()

# Create an s3fs filesystem that verifies against the custom cert
fs = s3fs.S3FileSystem(
    client_kwargs={"endpoint_url": endpoint_url_full, "verify": str(ca_cert_path)},
    key=access_key_id,
    secret=secret_access_key,
    use_ssl=True,
)

# sanity check, exchange with some other bucket of interest
assert fs.exists("rekonas-dataset-nch-sleep-databank")

# Create a zarr store and group within a bucket
# import zarr
# store = s3fs.S3Map(root="test-bucket-fabricio-zarr", s3=fs)
# z = zarr.group(store=store, path="test_group")
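With the filesystem object in place, buckets can be browsed like directories. A minimal sketch for listing and reading; the object key "some/file.bin" is a placeholder:
# list the contents of a bucket
print(fs.ls("rekonas-dataset-nch-sleep-databank"))

# read a single object into memory ("some/file.bin" is a placeholder key)
with fs.open("rekonas-dataset-nch-sleep-databank/some/file.bin", "rb") as f:
    data = f.read()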
Example Using obstore (required for zarr v3):
Dependencies:
dotenv = "^0.9.9" obstore = "^0.6.0" zarr = ">=3.0.8"
import os
from pathlib import Path

import obstore as obs
import zarr
from dotenv import load_dotenv
from obstore.store import S3Store
from zarr.storage import ObjectStore

# load credentials from the .env file
load_dotenv()
access_key_id = str(os.getenv("STORAGE_ACCESS_KEY"))
secret_access_key = str(os.getenv("STORAGE_SECRET_KEY"))
endpoint_url_full = str(os.getenv("ENDPOINT_FULL"))

# Specify the path to your custom CA certificate
ca_cert_path = "ca.crt.cer"
assert Path(ca_cert_path).is_file()

# obstore does not go through Python's ssl module, so the self-signed
# certificate cannot be loaded via an ssl context; certificate
# verification is disabled via client_options instead
ob_store = S3Store(
    "test-bucket-fabricio-zarr",
    endpoint=endpoint_url_full,
    access_key_id=access_key_id,
    secret_access_key=secret_access_key,
    virtual_hosted_style_request=False,
    region="ch-bsl-1",  # the required region
    client_options={"allow_invalid_certificates": True},
)

# ls to see files that exist in the bucket
list_of_files = obs.list(ob_store).collect()

# create a small array for testing zarr
store = ObjectStore(store=ob_store)
zarr.create_array(store=store, shape=(2,), dtype="float64")
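To check that the test array actually landed in the bucket, it can be reopened through the same store. A minimal sketch, reusing the store object from above:
# reopen the array from the bucket, write some values, and read them back
z = zarr.open_array(store=store)
z[:] = [1.0, 2.0]
print(z[:])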
SSL Certificate
Below you can find the certificate as of May 06, 2025.
Simply save it into a ca.crt.cer file to use as shown above.
-----BEGIN CERTIFICATE-----
MIIB+DCCAX+gAwIBAgIUNOXxe14mKQCbT9gKVouhzCD3TL0wCgYIKoZIzj0EAwQw
FTETMBEGA1UEAwwKUmVrb25hcyBDQTAeFw0yMzAyMjIxNDA0MzVaFw0zMzAyMTkx
NDA0MzVaMBUxEzARBgNVBAMMClJla29uYXMgQ0EwdjAQBgcqhkjOPQIBBgUrgQQA
IgNiAAR6Nija/wfPLwmX/KW2rsowfxbLIJ3JMTJmltFOqrul074ZQkVQWsyShp67
2GlehcDP+oLR7VJg8oCEIFDQYug00x2QWlnqDHMxkE0ZtN6vH5lq/RaUUf0hdYy3
eP6l+qijgY8wgYwwDAYDVR0TBAUwAwEB/zAdBgNVHQ4EFgQUWtnLStq5o/+O4B2Y
3Fsc11dadqwwUAYDVR0jBEkwR4AUWtnLStq5o/+O4B2Y3Fsc11dadqyhGaQXMBUx
EzARBgNVBAMMClJla29uYXMgQ0GCFDTl8XteJikAm0/YClaLocwg90y9MAsGA1Ud
DwQEAwIBBjAKBggqhkjOPQQDBANnADBkAjBVVoWkAHc2jQpkobopyGhS+bLDRjEm
3ZtGVo9Blvk0TNciDBSgeQ6onuAjorLP3/ICMD5G2CR4rmfCh6Ed+mag7wMlBQYf
1q5iT+kB7u9gG8lhIeB+1MT5JIeIK7ygmC6g/Q==
-----END CERTIFICATE-----
Issues with VPN: MTU Problems
In case you have issues accessing the MinIO store over the VPN, check the MTU settings of your VPN interface:
ip a | grep mtu
A working MTU for packets transmitted over the VPN via the Docker bridge interface is 1360. If you see a different value, you can clamp the TCP MSS manually with the following command. Note that this will not change the value shown by the `ip a` command.
Replace `<your_vpn_interface>`.
sudo iptables -I FORWARD -i docker0 -o <your_vpn_interface> -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --set-mss 1360
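To check whether a given MTU actually fits through the tunnel, you can send pings with the don't-fragment flag set; the payload size is the MTU minus 28 bytes of IP/ICMP headers, so a payload of 1332 probes an MTU of 1360:
ping -M do -s 1332 s3.strg1.lan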