S3 is a popular object storage service from Amazon Web Services. It is a common requirement to download files from the S3 bucket to Azure Databricks. You can mount object storage to the Databricks workspace, but in this example, I am showing how to recursively download and sync the files from a folder within an AWS S3 bucket to DBFS. This program will overwrite the files that already exist.
The program is given below. This program uses boto3 python package for interacting with AWS S3.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters