NetApp sets new standard in ease of use with latest release of StorageGRID object storage solution

By Ronnie Chan, Product Manager, Object Storage, NetApp

NetApp StorageGRID Webscale, the industry’s most mature object storage product, with a continuously developed architecture, is setting a new standard in ease of use with the latest release, version 10.3.

Ease of use begins at the out-of-box moment or, for a software-defined object store like StorageGRID, at initial deployment. NetApp made simplicity the theme for this release, and extended tests by internal and external beta users, including some with no prior StorageGRID experience, have demonstrated how easy it is to deploy the new 10.3 version of StorageGRID Webscale.

[Figure: StorageGRID, the most mature object storage product]

Source: Philippe Nicolas. June 2016. www.theregister.co.uk/2016/07/15/the_history_boys_cas_and_object_storage_map

From zero to one petabyte and serving I/O in 30 minutes

Installing StorageGRID Webscale 10.3 is fast and easy because deployment is:

Simple and flexible

Regardless of topology, deployment uses a common set of binaries: a virtual disk image and a set of OVF or YAML templates. The virtual disk image packages the operating system and containerized StorageGRID nodes for every node type; the OVF and YAML templates determine which type of node is deployed. When planning a deployment, users can mix and match engineered object storage appliances from NetApp and virtual machine (VM) nodes that use NetApp or third-party storage in the same grid. The same deployment process lets users rapidly deploy a StorageGRID Webscale system across multiple sites, physical appliances, and virtual machines in three top-level steps:

  1. Deploy the Primary Admin Node, which hosts the single-pane-of-glass, web-based UI for system installation, expansion, maintenance and administration.
  2. Deploy any additional VM nodes and appliance nodes onto the network. For virtual nodes, StorageGRID Webscale supports VMware ESXi and KVM QEMU hypervisors. Users can use vSphere vCenter or OpenStack Dashboard and Heat to deploy VMs; users can also use NetApp-provided scripts to automate the VM deployment through vCenter and Heat. Appliance setup is also wizard-driven, and startup scripts automatically configure the appliance for deployment.
  3. Browse to the Primary Admin Node and launch the deployment wizard to install the StorageGRID software.

Intuitive deployment wizard

The deployment wizard guides the user through configuring the license, passwords, NTP, DNS, sites, and networks. Virtual and appliance nodes on the network are automatically discovered. The user selects which nodes to join into the grid and configures additional networking details. As the final step, the user starts up the StorageGRID system with the click of a mouse. StorageGRID starts up automatically, ready for the user to configure S3 and Swift API tenants and start serving I/O.

[Screenshots: StorageGRID install welcome page and install wizard]

Build or run apps with best-in-class Amazon S3-compatible API

A second measure of ease of use is the building and integrating of S3 applications.

Best-in-class Amazon S3-compatible API

Since release 10.2, StorageGRID Webscale has led the industry with its implementation of the Amazon S3-compatible API. StorageGRID implements a broad set of Amazon S3 API operations, including advanced features such as path-style and virtual-hosted-style access, V2 and V4 authentication, pre-signed URLs, anonymous access, multipart upload, server-side encryption, IAM-style bucket and group policies, and identity federation with AD/LDAP. Developers can use Amazon SDKs to code applications for the StorageGRID S3 API.

StorageGRID has also enhanced the capabilities of Amazon S3. For example, StorageGRID makes it easy for API users to retrieve account usage information, such as the aggregate amount of data stored in the account and for each bucket in the account, by providing a modified GET Service request with the x-ntap-sg-usage parameter.
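
As a rough illustration of the usage request described above, the sketch below only builds the request URL. It assumes the x-ntap-sg-usage parameter is sent as a bare query key on the service root; the host name is hypothetical and port 8082 is borrowed from the s3cmd examples later in this collection. Request signing is left out entirely, so a real client would still have to authenticate the request.

```python
from urllib.parse import urlunsplit


def usage_request_url(host, port=8082):
    """Build the URL for a GET Service request that carries x-ntap-sg-usage."""
    # Assumption: the parameter is a bare query key with no value.
    netloc = "%s:%d" % (host, port)
    return urlunsplit(("https", netloc, "/", "x-ntap-sg-usage", ""))


# "grid.example.com" is a placeholder host name, not a real endpoint.
print(usage_request_url("grid.example.com"))
```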

New to 10.3: S3-compatible object versioning

In release 10.3, StorageGRID adds S3 object versioning, supporting operations including PUT/GET/DELETE object versions, GET Bucket Object versions, GET Bucket versioning, and PUT Bucket versioning. Versioning enables API users to easily restore an older version of an object, or “undelete” a previously deleted object. When enabled on a bucket, each new object created in the bucket receives a version ID. When a DELETE Object request for a previously stored object is received, StorageGRID inserts a delete marker into the bucket so that the object appears to be removed, but retains it and allows it to be retrieved with a GET Object request that specifies the version ID. A DELETE Object request that specifies the version ID will cause that object version to be permanently deleted.
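
The versioning behavior described above can be made concrete with a toy in-memory model. This is a sketch for illustration only, not StorageGRID code: the class name, the "v1", "v2" style version IDs, and the use of KeyError for a missing object are all assumptions made for the example.

```python
import itertools


class VersionedBucket:
    """Toy in-memory model of S3-style versioning (not StorageGRID code)."""

    def __init__(self):
        self._versions = {}              # key -> list of (version_id, data)
        self._ids = itertools.count(1)   # hypothetical version ID generator

    def put(self, key, data):
        # Every PUT appends a new version and returns its version ID.
        version_id = "v%d" % next(self._ids)
        self._versions.setdefault(key, []).append((version_id, data))
        return version_id

    def delete(self, key, version_id=None):
        if version_id is None:
            # No version given: add a delete marker (data=None) on top.
            return self.put(key, None)
        # Version given: that one version is removed permanently.
        kept = [v for v in self._versions.get(key, []) if v[0] != version_id]
        self._versions[key] = kept
        return version_id

    def get(self, key, version_id=None):
        versions = self._versions.get(key, [])
        if version_id is None:
            # The newest entry wins; a delete marker makes the object look gone.
            if not versions or versions[-1][1] is None:
                raise KeyError(key)
            return versions[-1][1]
        for vid, data in versions:
            if vid == version_id and data is not None:
                return data
        raise KeyError((key, version_id))


bucket = VersionedBucket()
first = bucket.put("report.txt", b"first draft")
marker = bucket.delete("report.txt")          # object now appears deleted
print(bucket.get("report.txt", first))        # older version is still readable
bucket.delete("report.txt", marker)           # dropping the marker "undeletes"
print(bucket.get("report.txt"))
```

In this model, "undelete" is simply a DELETE that names the delete marker's own version ID, which matches how Amazon S3 versioning treats delete markers.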

Unique to StorageGRID: tunable bucket consistency

Release 10.3 also adds new S3 enhancements, such as tunable consistency at the bucket level. Using the x-ntap-sg-consistency HTTP header, users can specify customized values to tune the consistency setting of a bucket with modified PUT Bucket and GET Bucket operations.

The default consistency behavior is the same as Amazon S3: guaranteed read-after-write consistency within the site for newly created objects in normal operating conditions. In other words, any GET following a successfully completed PUT at the same site will be able to read the newly written data. Like Amazon S3, overwrites of existing objects, metadata updates, and deletes are eventually consistent.

API users can choose a stronger consistency setting for their bucket. For example, the value “strong-global” guarantees read-after-write consistency across all sites, regardless of operating conditions; if that guarantee cannot be met, the request fails. Stronger consistency trades off against availability: with strong-global, a request may fail if nodes at a remote site are unreachable, even when the site where the request was made is operating normally.
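
The availability tradeoff can be sketched as a small decision function. This is a deliberately simplified toy model, not the product's logic: it treats "may fail" as "fails", collapses node health to one flag per site, and uses "default" as a hypothetical label for the default setting (only "strong-global" is named in the text).

```python
def request_allowed(consistency, local_site_ok, remote_sites_ok):
    """Toy availability check for one request under a consistency setting.

    consistency     -- "default" (hypothetical label) or "strong-global"
    local_site_ok   -- True if the site receiving the request is healthy
    remote_sites_ok -- list of booleans, one per remote site
    """
    if not local_site_ok:
        return False
    if consistency == "strong-global":
        # Every site must be reachable, otherwise the request is failed.
        return all(remote_sites_ok)
    # Default: read-after-write is only guaranteed within the local site,
    # so an unreachable remote site does not block the request.
    return True


# One remote site is down while the local site is healthy.
print(request_allowed("default", True, [True, False]))
print(request_allowed("strong-global", True, [True, False]))
```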

Industry’s most powerful object lifecycle policy

Finally, a third measure of ease of use comes from helping users manage the gigantic quantities of objects and data in a geo-distributed object store. As data is the “lifeblood of the digital economy”, the ability to access and share data, across premises and in the cloud, drives innovation and revenue. The value of data, and thus the demand for it, can ebb and flow throughout its lifecycle. The only way to efficiently manage such vast quantities of data is metadata-driven, policy-based object lifecycle management, which is something StorageGRID Webscale does uniquely well.

Release 10.3 makes it easier than ever for IT to customize granular object lifecycle policies to manage their users’ data. In a recent conversation, a global enterprise described one application for which it wanted to store a full copy of the data at its West Coast data center for an initial period, use geo erasure coding across its West Coast, Mountain, and East Coast data centers to protect the data, and drop the full copy at the West Coast data center after some time. With StorageGRID, the user can do this with a single Information Lifecycle Management rule.

[Figure: StorageGRID ILM rule]
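
To show what such a time-based rule expresses, here is a minimal sketch that maps an object's age to its placements. It is not how ILM rules are defined in StorageGRID (those are configured in the UI). The function name and the full_copy_days parameter are hypothetical, since the text only mentions "an initial period", and the sketch reads the example as erasure coding being in effect from the start.

```python
def placements(age_days, full_copy_days):
    """Return where an object's data should live at a given age.

    full_copy_days is a hypothetical parameter: the customer example only
    says "an initial period" and gives no number.
    """
    sites = ["West Coast", "Mountain", "East Coast"]
    # Geo erasure coding across all three data centers applies throughout.
    rules = [("erasure-coded", site) for site in sites]
    if age_days < full_copy_days:
        # During the initial period, also keep a full copy on the West Coast.
        rules.insert(0, ("full-copy", "West Coast"))
    return rules


print(placements(age_days=5, full_copy_days=30))
print(placements(age_days=45, full_copy_days=30))
```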

StorageGRID object lifecycle policies also let users move data to and from the cloud easily and granularly, by allowing objects to be tiered to Amazon S3 or S3-compatible clouds. Objects can be secured with AES-256 server-side encryption by StorageGRID before being sent to the cloud. Users maintain control of object metadata, and thus the namespace, on premises; objects in the cloud can be retrieved seamlessly on the data path and can be repatriated on premises by changing the policy.

Object storage is changing the way IT organizations around the world store and protect their unstructured digital data. Learn more about NetApp StorageGRID at cloud.netapp.com/storagegrid.

Using StorageGRID Webscale to host Static Websites and Content

Introduction

NetApp’s StorageGRID Webscale is a massively scalable, distributed, multi-site object store. It supports standard object protocols like S3 and OpenStack Swift and is suited to deployment across many data centers (up to 16). In a multi-site deployment, e.g., across the US, Europe and Asia, StorageGRID provides a single namespace. This means that regardless of where a user inserts data, it can be accessed from all locations. In the backend, data is replicated or erasure coded to the different sites. Where data lands is driven by the administrator’s policies or directly by metadata attached to the object.

While websites are usually dynamic and hosted on special webservers, many parts of them are actually static: HTML files, downloadable files, images, videos, and similar content. In this post, we show how this data can be hosted on StorageGRID Webscale. This provides an easy way to serve static content for download or streaming to clients in multiple geographies.

Hosting Website Content on StorageGRID Webscale

First, we create a new bucket “website-bucket” which will hold our static website and its content (e.g., files, audio, images, videos, etc.). Next, we create a JSON file with a bucket policy that allows public, non-authenticated, read-only access to the bucket:

> cat website.json
{
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": "*",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "urn:sgws:s3:::website-bucket",
        "urn:sgws:s3:::website-bucket/*"
      ]
    }
  ]
}
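
The same policy document can also be generated from Python, which helps when the policy has to be produced for several buckets. This is a minimal sketch using only the standard library; the function name is hypothetical, and the policy content mirrors the JSON shown above.

```python
import json


def public_read_policy(bucket):
    """Return a policy dict granting anonymous read and list access."""
    resource = "urn:sgws:s3:::" + bucket
    return {
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": "*",
                "Action": ["s3:GetObject", "s3:ListBucket"],
                # The bucket itself (for listing) and every object in it.
                "Resource": [resource, resource + "/*"],
            }
        ]
    }


# Print the document; redirect the output to website.json to reuse it.
print(json.dumps(public_read_policy("website-bucket"), indent=2))
```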

With s3cmd, we apply the bucket policy to the bucket:

> s3cmd setpolicy website.json s3://website-bucket
s3://website-bucket/: Policy updated

If we now point a browser to the bucket, we’ll notice that we have access:

[Screenshot: the bucket opened in a browser]

Next, we’ll upload our static website files and content into the bucket. If we refresh our browser, we’ll see that the bucket has been populated:

[Screenshot: the bucket listing after the upload]

All we now need to do is point the browser to the index.html file and we’ll be able to browse the website:

[Screenshot: the website served from the bucket]

If we point the browser to a file that we want to offer for download, the download begins.

Summary

StorageGRID Webscale allows you to easily host static websites and website content like large files, images, videos or audio files for download or streaming access. By using StorageGRID’s policy engine and global namespace, this data can be easily distributed across many sites or worldwide. This allows users to access data at low latency and with high throughput, regardless of the location they are at.

If you have questions, feel free to reach out to me on Twitter: @clemenssiebler

Anonymous/Public Bucket Access

Introduction

This quick example shows how we can update the bucket access policy in StorageGRID 10.2 to allow anonymous access. This makes it possible to access the bucket without S3 credentials, e.g. through a browser.

Instructions

In this example, we use s3cmd to connect to StorageGRID Webscale. To get s3cmd talking to StorageGRID, update the following fields in ~/.s3cfg as shown below. Note that disabling SSL certificate checks is not advised for production workloads; set the ca_certs_file field instead.

$ cat ~/.s3cfg (only important fields shown)
access_key = <S3 access key>
secret_key = <S3 secret access key>
check_ssl_certificate = False
check_ssl_hostname = False
host_base = <StorageGRID address>:8082
host_bucket = <StorageGRID address>:8082/%(bucket)

Next, we create a JSON document to enable access to a bucket “public-bucket1” which we will expose to the public:

$ cat anonymous_access.json
{
  "Statement": [
   {
    "Effect": "Allow",
    "Principal": "*",
    "Action": [
    "s3:GetObject",
    "s3:ListBucket"
   ],
   "Resource": [
     "urn:sgws:s3:::public-bucket1",
     "urn:sgws:s3:::public-bucket1/*"
    ]
 }
 ]
}

Lastly, we use s3cmd to apply the policy to the bucket:

$ s3cmd setpolicy anonymous_access.json s3://public-bucket1 
s3://public-bucket1/: Policy updated

We are now able to point a browser to an object and download it without requiring credentials, e.g. via https://<StorageGRID address>:8082/public-bucket1/objectkey.
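
Object keys often contain spaces or other characters that need escaping in a URL. The sketch below builds the path-style URL for an object in a public bucket using only the standard library. The function name and the example host are hypothetical; port 8082 and the bucket name come from the example above.

```python
from urllib.parse import quote


def public_object_url(host, bucket, key, port=8082):
    """Build the path-style URL of an object in a publicly readable bucket."""
    # quote() percent-encodes spaces and other unsafe characters but keeps "/",
    # so keys with "folder" prefixes stay readable.
    return "https://%s:%d/%s/%s" % (host, port, quote(bucket), quote(key))


# "grid.example.com" stands in for <StorageGRID address>.
print(public_object_url("grid.example.com", "public-bucket1", "docs/my file.txt"))
```

The resulting URL can be opened in a browser or fetched with any HTTP client, since no S3 credentials are needed once the policy is in place.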

If you have questions, feel free to reach out via @clemenssiebler.

Swift API Access to StorageGRID

Introduction

This post gives a short overview of how StorageGRID Webscale 10.2’s Swift API can be accessed from Python through python-swiftclient.

Preparation

First, we need to install the Swift Client for Python:
sudo pip install python-swiftclient
Next, we need to create a Swift Account in the StorageGRID NMS GUI by selecting: Grid Management -> Storage Tenants -> Tenant Accounts. After creating the new user, you can copy the Swift Tenant ID from the same view. Together, the tenant ID and the user name form the API username in the format “Swift Tenant ID:Swift Username”.

API Examples

Connect to the StorageGRID Swift API Endpoint:
import swiftclient
username = '11071826158283910917:swiftadmin'
password = 'supersecret'
authurl = 'https://10.65.57.176:8083/auth/v1.0'
swift = swiftclient.client.Connection(auth_version='1', user=username, key=password, insecure=True, authurl=authurl)

In order to rely on proper TLS security, you can pass in a CA certificate:
cacert = '/path/to/server/cert'
swift = swiftclient.client.Connection(auth_version='1', user=username, key=password, cacert=cacert, authurl=authurl)

Now, we can start creating new containers via:
swift.put_container('test-container')

In order to list all containers, including various metrics, we can use the following snippet:

# get_account() returns a tuple: (response headers, list of container dicts)
response = swift.get_account()
containers = response[1]
for c in containers:
    name = c['name']
    num_objects = c['count']
    size = c['bytes']
    print("Container name: %s (total: %s objects, %s bytes)" % (name, num_objects, size))

We can delete an empty container via:
swift.delete_container('test-container')

Finally, let’s put some objects into StorageGRID and associate some metadata:
swift.put_object('test-container', 'test-object',
                 contents='This is my object\'s content',
                 headers={'X-Object-Meta-CustomerID': '42',
                          'X-Object-Meta-Color': 'red'})

And, obviously, we also can retrieve our objects again:
response = swift.get_object('test-container', 'test-object')
object_headers = response[0]
object_content = response[1]
print("Object headers: ", object_headers)
print("Object content: ", object_content)
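
The custom metadata set earlier comes back as part of the object headers. The helper below is a small sketch for picking those entries out of the headers dictionary. The function name is hypothetical, and the sample dictionary is hand-written to stand in for response[0]; it assumes header names arrive lowercased, which is why the comparison is case-insensitive.

```python
PREFIX = "x-object-meta-"


def custom_metadata(object_headers):
    """Pick the user-defined metadata out of a Swift object's headers."""
    return {
        name[len(PREFIX):]: value
        for name, value in object_headers.items()
        # Compare case-insensitively; header names may arrive lowercased.
        if name.lower().startswith(PREFIX)
    }


# Hand-written sample standing in for response[0] from get_object().
sample_headers = {
    "content-length": "27",
    "x-object-meta-customerid": "42",
    "x-object-meta-color": "red",
}
print(custom_metadata(sample_headers))
```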

For more examples, visit https://github.com/csiebler/storagegrid-examples.
Feel free to reach out via @clemenssiebler