How to Copy (or Move) Files From One S3 Bucket to Another Using Boto3 [Python]?

Introduction

Boto3 is the AWS SDK for Python. It allows users to create and manage AWS services such as EC2 and S3, and it provides both an object-oriented API and low-level access to AWS services.

S3 (Simple Storage Service) allows you to store files as objects. It is also known as an object-based storage service.

In this tutorial, you’ll learn how to

  • Copy S3 object from one bucket to another using Boto3
  • Copy all files from one S3 bucket to another using Boto3
  • Move S3 object from one bucket to another using Boto3
  • Move all files from one S3 bucket to another using Boto3
  • Copy all files from one S3 bucket to another using s3cmd (Directly from terminal)
  • Run Boto3 script from Command line (EC2)

You’ll use the Boto3 Session and Resources to copy and move files between S3 buckets. Resources provide an object-oriented interface to AWS services and represent a higher-level abstraction of those services, whereas the Boto3 Client provides low-level service calls to AWS services. Hence, it is recommended to use Boto3 Resources rather than the Boto3 Client.
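For instance, here is a minimal sketch (assuming your credentials are already configured and using the placeholder bucket name your_source_bucket_name) that contrasts the two interfaces: the client call returns plain dictionaries, while the resource exposes Python objects.

import boto3

# Low-level client: calls map directly to the S3 API and return dictionaries.
s3_client = boto3.client('s3')
response = s3_client.list_objects_v2(Bucket='your_source_bucket_name')

# High-level resource: the bucket and its objects are represented as Python objects.
s3_resource = boto3.resource('s3')
bucket = s3_resource.Bucket('your_source_bucket_name')
for obj in bucket.objects.all():
    print(obj.key)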

Side Note: This post was originally published on my blog askvikram.com

Prerequisites

Before you start, you’ll need the following.

  • If you do not have servers in the cloud, create an AWS EC2 Ubuntu server instance by following the guide How to launch an EC2 Instance.
  • [Important] – Update the list of upgradeable packages on the server using sudo apt update
  • [Important] – Upgrade the packages on the server to the latest versions using sudo apt upgrade

  • Install Boto3 using the command pip3 install boto3
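As a quick sanity check (a minimal sketch; the printed version depends on your environment), you can confirm that Boto3 is importable from a Python shell.

import boto3

# Print the installed Boto3 version to confirm the installation succeeded.
print(boto3.__version__)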

Copying S3 Object From One Bucket to Another Using Boto3

In this section, you’ll copy an S3 object from one bucket to another.

Each part of the Python script is explained separately below. At the end of each section, you’ll find the full Python script to perform the copy or move operation.

Creating Boto3 Session

First, you’ll create a session with Boto3. You need to specify credentials to connect Boto3 to S3; the credentials required are the AWS access key ID and the secret access key.

Use the below code snippet to create a Boto3 Session.

import boto3

#Creating Session With Boto3.
session = boto3.Session(
    aws_access_key_id='Your Access Key ID',
    aws_secret_access_key='Your Secret access key'
)

Boto3 session is created.

Creating S3 Resource

Next, you’ll create an S3 resource using the Boto3 session. Use the below code to create an S3 resource.

#Creating S3 Resource From the Session.
s3 = session.resource('s3')

Resource is created.

Next, you’ll create the python objects necessary to copy the S3 objects to another bucket.

Creating Source Bucket Dictionary

A source bucket dictionary is needed to copy the object using the bucket.copy() method.

A dictionary is the Python implementation of a data structure known as an associative array. It consists of a collection of key-value pairs, where each key maps to its associated value.
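As a quick, generic illustration (unrelated to S3), a dictionary maps each key to a value:

# A generic Python dictionary: keys map to values.
example = {'name': 'report.csv', 'size_bytes': 1024}

# Look up a value by its key.
print(example['name'])   # prints: report.csv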

You’ll create a source bucket dictionary named copy_source containing the source bucket name and the key of the object that needs to be copied to the other bucket.

Use the code below to create a source bucket dictionary.

#create a source dictionary that specifies bucket name and key name of the object to be copied
copy_source = {
    'Bucket': 'your_source_bucket_name',
    'Key': 'Object_Key_with_file_extension'
}

Source bucket dictionary is created.

Creating Target Bucket Representation From S3 Resource

You’ll create a Boto3 resource that represents your target AWS S3 bucket using the s3.Bucket() function.

Use the below code to create the target bucket representation from the s3 resource.

bucket = s3.Bucket('target_bucket_name')

The target S3 bucket representation from resources is created.

Copying the S3 Object to Target Bucket

Finally, you’ll copy the S3 object to the target bucket using the Boto3 resource’s copy() function.

Use the below code to copy the objects between the buckets.

bucket.copy(copy_source, 'target_object_name_with_extension')
  • bucket – the target bucket, created as a Boto3 resource
  • copy() – function to copy the object to the target bucket
  • copy_source – dictionary with the source bucket name and the object key
  • target_object_name_with_extension – name under which the object will be copied. You can either keep the same name as the source or specify a different one.

This is the step-by-step code you can use to copy an S3 object from one bucket to another.

The full Python script to copy an S3 object from one bucket to another is given below.

import boto3

#Creating Session With Boto3.
session = boto3.Session(
    aws_access_key_id='Your Access Key ID',
    aws_secret_access_key='Your Secret access key'
)

#Creating S3 Resource From the Session.
s3 = session.resource('s3')

#Create a Source Dictionary That Specifies the Bucket Name and Key Name of the Object to Be Copied
copy_source = {
    'Bucket': 'your_source_bucket_name',
    'Key': 'Object_Key_with_file_extension'
}

bucket = s3.Bucket('target_bucket_name')

bucket.copy(copy_source, 'target_object_name_with_extension')


# Printing the Information That the File Is Copied.
print('Single File is copied')

Update the placeholder values with your bucket names and object names, and you’ll be able to copy your S3 objects.

You’ve learnt how to copy an S3 object from one bucket to another using Boto3.

Next, you’ll learn how to copy all files.

Copying All Files From One Bucket to Another Using Boto3

In this section, you’ll copy all files existing in one bucket to another bucket using Boto3.

For copying all files, you need to iterate over all the objects available in the source bucket.

So you need to create a source S3 bucket representation and the destination s3 bucket representation from the S3 resource you created in the previous section.

Use the below code to create a source s3 bucket representation.

srcbucket = s3.Bucket('your_source_bucket_name')

Use the below code to create a target s3 bucket representation.

destbucket = s3.Bucket('your_target_bucket_name')

Next, you need to iterate through the objects in your source bucket using the objects.all() method available on the bucket representation object.

Use the below code to iterate through s3 bucket objects.

for file in srcbucket.objects.all():

During each iteration, the file object holds the details of the current object, including its key (name).
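For example, a minimal sketch (assuming the srcbucket created above) that only prints the details of each object during iteration:

# Iterate over every object in the source bucket and print its details.
for file in srcbucket.objects.all():
    # file.key is the object's key (its name in the bucket),
    # file.size its size in bytes, file.last_modified its timestamp.
    print(file.key, file.size, file.last_modified)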

Now, create a source bucket dictionary which can be used to copy files from one bucket to another.

    #Create a Source Dictionary That Specifies the Bucket Name and Key Name of the Object to Be Copied
    copy_source = {
        'Bucket': 'your_source_bucket_name',

        #file.key holds the name of the current object. Pass that name as the key.
        'Key': file.key
    }

Next, copy the object from the source bucket to the destination bucket using the bucket.copy() function available on the S3 Bucket representation object.

Use the below code to copy the object from source to target.

destbucket.copy(copy_source, file.key)

Now, during each iteration, the file object will be copied to the target bucket.

The full Python script to copy all S3 objects from one bucket to another is given below.

import boto3

#Creating Session With Boto3.
session = boto3.Session(
    aws_access_key_id='Your Access Key ID',
    aws_secret_access_key='Your Secret access key'
)

#Creating S3 Resource From the Session.
s3 = session.resource('s3')

srcbucket = s3.Bucket('your_source_bucket_name')

destbucket = s3.Bucket('your_target_bucket_name')

# Iterate All Objects in Your S3 Bucket Over the for Loop
for file in srcbucket.objects.all():

    #Create a Source Dictionary That Specifies the Bucket Name and Key Name of the Object to Be Copied
    copy_source = {
        'Bucket': 'your_source_bucket_name',
        'Key': file.key
    }

    destbucket.copy(copy_source, file.key)

    print(file.key +'- File Copied')

Update the placeholder values with your bucket names and object names, and you’ll be able to copy all files to another S3 bucket using Boto3.

Next, you’ll learn how to move objects between S3 buckets.

Moving S3 Object From One Bucket to Another Using Boto3

In this section, you’ll learn how to move an S3 object from one bucket to another.

In principle, there is no native method available for moving an S3 object between buckets. However, the move operation can be achieved by copying the file to your target bucket and then deleting the object from the source bucket.
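A minimal sketch of this copy-then-delete pattern (with a hypothetical helper name move_object, assuming the s3 resource created earlier) could look like this:

# Hypothetical helper: "move" one object by copying it, then deleting the source.
def move_object(s3, src_bucket_name, dest_bucket_name, key):
    copy_source = {'Bucket': src_bucket_name, 'Key': key}
    s3.Bucket(dest_bucket_name).copy(copy_source, key)
    s3.Object(src_bucket_name, key).delete()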

Copying an object to another bucket is done exactly as shown in the copy section of this tutorial.

Additionally, to delete the file from the source bucket, you can use the s3.Object().delete() function.

s3.Object('your_source_bucket_name','Object_Key_with_file_extension').delete()
  • s3 – resource created using the Boto3 session
  • Object() – function to create a resource representing the object in your source bucket
  • delete() – function to delete the object from your S3 bucket

The full Python script to move an S3 object from one bucket to another is given below. It copies the object to the target bucket and then deletes it from the source bucket.

import boto3

#Creating Session With Boto3.
session = boto3.Session(
    aws_access_key_id='Your Access Key ID',
    aws_secret_access_key='Your Secret access key'
)

#Creating S3 Resource From the Session.
s3 = session.resource('s3')

#Create a Source Dictionary That Specifies the Bucket Name and Key Name of the Object to Be Copied
copy_source = {
    'Bucket': 'your_source_bucket_name',
    'Key': 'Object_Key_with_file_extension'
}

#Creating Destination Bucket 
destbucket = s3.Bucket('your_target_bucket_name')

#Copying the Object to the Target Bucket
destbucket.copy(copy_source, 'Object_Key_with_file_extension')

#Deleting the Object From the Source Bucket After Copying It
s3.Object('your_source_bucket_name','Object_Key_with_file_extension').delete()

# Printing the File Moved Information
print('Single File is moved')

Update the placeholder values with your bucket names and object names, and you’ll be able to move S3 objects to another bucket using Boto3.

Next, you’ll learn how to move all objects to another s3 bucket.

Moving All Files From One S3 Bucket to Another Using Boto3

In this section, you’ll move all files from one S3 bucket to another using Boto3.

As noted in the previous section, there is no native method available for moving all S3 objects between buckets. The move operation can be achieved by copying all the files to your target bucket and deleting the objects from the source bucket.

Copying all objects to another bucket can be done as shown in the Copy all files section of this tutorial.

Additionally, to delete each file from the source bucket, you can use the s3.Object.delete() function. You already have the S3 object during the iteration for the copy task, so once it is copied, you can directly call its delete() function to delete the file within the same iteration.

The full Python script to move all S3 objects from one bucket to another is given below. It copies all the objects to the target bucket and deletes each object from the source bucket once it has been copied.

import boto3

#Creating Session With Boto3.
session = boto3.Session(
    aws_access_key_id='Your Access Key ID',
    aws_secret_access_key='Your Secret access key'
)

#Creating S3 Resource From the Session.
s3 = session.resource('s3')

srcbucket = s3.Bucket('your_source_bucket_name')

destbucket = s3.Bucket('your_target_bucket_name')

# Iterate All Objects in Your S3 Bucket Over the for Loop
for file in srcbucket.objects.all():

    #Create a Source Dictionary That Specifies the Bucket Name and Key Name of the Object to Be Copied
    copy_source = {
        'Bucket': 'your_source_bucket_name',
        #file.key holds the name of the current object, so each file in the loop is copied
        'Key': file.key
    }

    destbucket.copy(copy_source, file.key)

    #to delete the file after copying   
    file.delete()

    print(file.key +'- File Moved')

Copy All Files From One S3 Bucket to Another Using S3cmd Sync

In this section, you’ll learn how to copy all files from one s3 bucket to another using s3cmd.

All the files can be copied to another S3 bucket by running a single command in the terminal. It performs a sync, which means it copies only the files that don’t already exist in the target bucket.

You can also check which files would be copied by adding the --dry-run option to the sync command. It shows the list of files that would be copied to the target bucket without actually copying them.

Use the below command to copy all files from your source bucket to the target bucket.

s3cmd sync s3://your_source_bucket/ s3://your_target_bucket/

Running the Boto3 Script From the Command Line

You can run a Boto3 script from the command line using the python3 command. You must have Python 3 and the Boto3 package installed on your machine (EC2) before you can run the script.

For example, assume your python script to copy all files from one s3 bucket to another is saved as copy_all_objects.py.

You can run this file by using the below command.

python3 copy_all_objects.py

For more detailed information on running a Python script from the command line, refer to How to Run Python File in Terminal [Beginners Guide].

Conclusion

In this tutorial, you’ve learnt how to copy a single S3 object to another bucket, copy all files from one S3 bucket to another, move a single object to another bucket, and move all files to another bucket using Boto3.

If you have any questions or if you face any problem while following the tutorial, feel free to comment below.

What Next?

You may find the other AWS articles on my blog askvikram.com useful.