Introduction to Amazon S3

Accredian Publication

5 min read · Jan 31, 2022

by Pronay Ghosh and Hiren Rupchandani

In the previous article, we saw how to set up and deploy a machine learning model on Amazon EC2.
We learned about cloud computing, AWS and its services, deploying a machine learning model on localhost, and finally deploying it on Amazon EC2.
From this article onwards, we will be starting a brand new series on Amazon SageMaker.
However, before jumping into SageMaker as a service, we first have to know the basics of Amazon Simple Storage Service (Amazon S3).

Amazon SageMaker and Amazon S3 are separate services that are often used together, and most companies that use AWS rely on Amazon S3 for storage.
Apart from this, S3 automatically creates and stores copies of all uploaded objects across multiple systems.
This keeps your data safe from failures, errors, and threats, and readily available when needed.
In this article, we will look at what Amazon S3 is as a service and at some of its most popular features.

What is Amazon S3?

Amazon S3 is an object storage service built to store, protect, and retrieve data from “buckets” at any time, from anywhere, on any device.
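To make the idea of storing and retrieving data from a bucket concrete, here is a minimal sketch in Python. It assumes the boto3 SDK for the real calls (`put_object` and `get_object`), and the bucket and key names are hypothetical. The helper functions take the client as an argument, so they can also be tried with a stand-in object.

```python
from typing import Any


def save_text(s3_client: Any, bucket: str, key: str, text: str) -> None:
    """Store a string as an object under `key` in `bucket`."""
    s3_client.put_object(Bucket=bucket, Key=key, Body=text.encode("utf-8"))


def load_text(s3_client: Any, bucket: str, key: str) -> str:
    """Read an object back from the bucket and decode it as UTF-8 text."""
    response = s3_client.get_object(Bucket=bucket, Key=key)
    return response["Body"].read().decode("utf-8")


# Usage against real S3 (assumes boto3 is installed and AWS credentials are
# configured; "my-example-bucket" is a hypothetical bucket name):
#   import boto3
#   s3 = boto3.client("s3")
#   save_text(s3, "my-example-bucket", "notes/hello.txt", "Hello, S3!")
#   print(load_text(s3, "my-example-bucket", "notes/hello.txt"))
```

Each object is addressed by a bucket name plus a key, which is why S3 is often described as a key-value style object store rather than a database.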

Why Amazon S3?

One can use Amazon S3 for a variety of reasons. Some of them are listed below:

Organizations of any size in any industry can use this service.
Use cases include websites, mobile apps, archiving, data backups, and providing the underlying storage layer for your data lake.
Amazon S3 includes management features that allow you to optimize, organize, and configure access to data to suit your specific business, operational, and compliance needs.

We know that Amazon S3 stores data in buckets, but is Amazon S3 the same as an SQL or NoSQL database?

The answer is a big NO!
SQL and NoSQL databases typically store structured or semi-structured data, whereas a bucket lets you store unstructured data as objects.
The image below also shows the differences between Amazon S3 and DynamoDB (the NoSQL database service of AWS).

Top 3 Amazon S3 Features

The top 3 features of Amazon S3 are as follows:

Storage classes

Amazon S3 offers several storage classes designed for different access patterns.
One of them, S3 Intelligent-Tiering, lets you store data with changing or unknown access patterns.
It optimizes storage costs by automatically moving data between four access tiers when your access patterns change.
One can learn more about S3 storage classes here.
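As a rough illustration, the storage class is chosen per object at upload time. The sketch below builds the arguments for boto3's `put_object` call with `StorageClass` set to `INTELLIGENT_TIERING`. The helper name, bucket, and key are hypothetical.

```python
from typing import Any, Dict


def intelligent_tiering_put_args(bucket: str, key: str, body: bytes) -> Dict[str, Any]:
    """Build keyword arguments for put_object that select S3 Intelligent-Tiering."""
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        # Without this argument, S3 stores the object in the default STANDARD class.
        "StorageClass": "INTELLIGENT_TIERING",
    }


# Usage (assumes boto3 and configured credentials; names are hypothetical):
#   import boto3
#   s3 = boto3.client("s3")
#   s3.put_object(**intelligent_tiering_put_args(
#       "my-example-bucket", "data/train.csv", b"a,b\n1,2\n"))
```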

Storage management

Amazon S3 includes storage management features that help you manage costs.
They also help in meeting regulatory requirements, reducing latency, and saving multiple distinct copies of data for compliance.
The four storage management features of Amazon S3 are S3 Lifecycle, S3 Object Lock, S3 Replication, and S3 Batch Operations.

With S3 Lifecycle, one can configure a lifecycle policy to manage objects and store them cost-effectively throughout their lifecycle.
With S3 Object Lock, one can prevent Amazon S3 objects from being deleted or overwritten for a set period of time or indefinitely.
S3 Replication copies objects and their associated metadata to other buckets, which helps with reduced latency, compliance, security, and other use cases.
With a single S3 API request or a few clicks in the Amazon S3 console, S3 Batch Operations can manage billions of objects at scale.
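Of the four features above, S3 Lifecycle is the easiest to show in a few lines. The following is a minimal sketch of a lifecycle configuration in the shape accepted by boto3's `put_bucket_lifecycle_configuration`. The prefix, day counts, rule ID, and helper name are illustrative assumptions, not recommendations.

```python
import json
from typing import Any, Dict


def build_lifecycle_configuration(prefix: str, days_to_ia: int, days_to_expire: int) -> Dict[str, Any]:
    """Build a rule that moves objects to STANDARD_IA and later deletes them."""
    if days_to_expire <= days_to_ia:
        raise ValueError("objects must expire after they transition")
    return {
        "Rules": [
            {
                "ID": "archive-then-expire",  # hypothetical rule name
                "Filter": {"Prefix": prefix},  # apply only to keys under this prefix
                "Status": "Enabled",
                "Transitions": [
                    {"Days": days_to_ia, "StorageClass": "STANDARD_IA"}
                ],
                "Expiration": {"Days": days_to_expire},
            }
        ]
    }


config = build_lifecycle_configuration("logs/", 30, 365)
print(json.dumps(config, indent=2))

# Applying it (assumes boto3 and configured credentials; bucket name is hypothetical):
#   import boto3
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="my-example-bucket", LifecycleConfiguration=config)
```

Here, objects under `logs/` would move to a cheaper storage class after 30 days and be deleted after a year.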

Access management

Amazon S3 includes tools for auditing and managing access to buckets and objects.
S3 buckets and the objects contained within them are private by default.
By default, users can access only the S3 resources that they create.
You can use the following features to grant granular resource permissions that support your specific use case or to audit the permissions of your Amazon S3 resources.

S3 Block Public Access

This feature blocks public access to S3 buckets and objects, so they are not accessible to the general public.
Block Public Access settings are turned on by default at the account and bucket levels.
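The Block Public Access settings consist of four boolean flags. The sketch below shows them in the shape used by boto3's `put_public_access_block`, together with a small, hypothetical helper for auditing a configuration returned by `get_public_access_block`.

```python
from typing import Dict

# The four Block Public Access flags, all switched on.
PUBLIC_ACCESS_BLOCK: Dict[str, bool] = {
    "BlockPublicAcls": True,        # reject new public ACLs
    "IgnorePublicAcls": True,       # ignore any existing public ACLs
    "BlockPublicPolicy": True,      # reject bucket policies that grant public access
    "RestrictPublicBuckets": True,  # restrict access to buckets with public policies
}


def is_fully_blocked(config: Dict[str, bool]) -> bool:
    """Return True only if every Block Public Access flag is enabled."""
    return all(config.get(name) is True for name in PUBLIC_ACCESS_BLOCK)


# Usage (assumes boto3 and configured credentials; bucket name is hypothetical):
#   import boto3
#   s3 = boto3.client("s3")
#   s3.put_public_access_block(
#       Bucket="my-example-bucket",
#       PublicAccessBlockConfiguration=PUBLIC_ACCESS_BLOCK)
#   current = s3.get_public_access_block(Bucket="my-example-bucket")
#   print(is_fully_blocked(current["PublicAccessBlockConfiguration"]))
```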

AWS IAM

To manage access to your Amazon S3 resources, you can create IAM users in your AWS account.
AWS Identity and Access Management (IAM) provides fine-grained access control across all of AWS.
With IAM, you can specify who can access which services and resources, and under which conditions.
With IAM policies, you manage permissions for your workforce and systems to ensure least-privilege permissions.
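As an illustration of least privilege, here is a sketch of an IAM policy document that grants read-only access to a single bucket. The bucket name, prefix, statement IDs, and helper name are hypothetical. The resulting JSON would be attached to a user or role through IAM.

```python
import json
from typing import Any, Dict


def read_only_policy(bucket: str, prefix: str = "") -> Dict[str, Any]:
    """Build an identity-based policy that allows listing and reading only."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ListBucket",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                # ListBucket applies to the bucket itself, not to its objects.
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                "Sid": "ReadObjects",
                "Effect": "Allow",
                "Action": "s3:GetObject",
                # GetObject applies to object ARNs under the given prefix.
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
            },
        ],
    }


print(json.dumps(read_only_policy("my-example-bucket", "reports/"), indent=2))
```

Because no write or delete actions are listed, a principal with only this policy cannot upload, overwrite, or remove objects.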

Bucket Policies

With bucket policies, you can configure resource-based permissions for your S3 buckets and the objects in them using the IAM-based policy language.
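Unlike an IAM policy, a bucket policy is attached to the bucket itself and names a principal. A commonly used example, sketched below under the assumption of a hypothetical bucket name, denies any request that does not use HTTPS. The policy is passed to boto3's `put_bucket_policy` as a JSON string.

```python
import json
from typing import Any, Dict


def deny_insecure_transport_policy(bucket: str) -> Dict[str, Any]:
    """Build a bucket policy that rejects requests not made over HTTPS."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",  # applies to every caller
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{bucket}",    # the bucket itself
                    f"arn:aws:s3:::{bucket}/*",  # every object in it
                ],
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            }
        ],
    }


policy_json = json.dumps(deny_insecure_transport_policy("my-example-bucket"))

# Applying it (assumes boto3 and configured credentials):
#   import boto3
#   boto3.client("s3").put_bucket_policy(Bucket="my-example-bucket", Policy=policy_json)
```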

S3 Object Ownership

Disable ACLs and take ownership of all objects in your bucket to simplify access management for Amazon S3 data.
As the bucket owner, you automatically own and control every object in the bucket, and access control for your data is based on policies.
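In API terms, disabling ACLs corresponds to the `BucketOwnerEnforced` setting of S3 Object Ownership. The sketch below builds the configuration in the shape expected by boto3's `put_bucket_ownership_controls`. The helper and bucket names are hypothetical.

```python
from typing import Any, Dict


def bucket_owner_enforced_controls() -> Dict[str, Any]:
    """Build ownership controls that disable ACLs for a bucket."""
    return {
        "Rules": [
            # With BucketOwnerEnforced, ACLs no longer affect permissions and
            # the bucket owner owns every object in the bucket.
            {"ObjectOwnership": "BucketOwnerEnforced"}
        ]
    }


# Usage (assumes boto3 and configured credentials; bucket name is hypothetical):
#   import boto3
#   boto3.client("s3").put_bucket_ownership_controls(
#       Bucket="my-example-bucket",
#       OwnershipControls=bucket_owner_enforced_controls())
```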

Conclusion

In this article, we covered a high-level overview of Amazon S3.
We saw what exactly S3 is, why one should choose S3, and the top 3 features of Amazon S3.
In the next article, we will learn how Amazon S3 works, and then we will take a bird's-eye view of Amazon SageMaker.
After that, we will learn how to build, train, and deploy a machine learning model with the help of Amazon SageMaker.
