Standard Object Storage Benchmarks


 I'd love to hear your thoughts on what makes a good test set for the comparative performance of object storage protocols (S3 and Swift primarily). I've talked with others in the business and there doesn't seem to be a standard set today. Chime in via comments and let's figure it out together.

My thoughts so far:

  • The CosBench tool seems pretty well suited for this
  • The test should include the job files and a script to launch them from the CLI
  • There should be a document that clearly explains how to read the results
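As a rough illustration of the launch-script idea, here is a minimal sketch that submits every job file in a directory in a fixed order. It assumes CosBench's `cli.sh submit <file>` form and XML job files; the `jobs` directory, the function names, and the script path are hypothetical, so check them against your own install.

```python
import subprocess
from pathlib import Path


def build_commands(job_dir, cli_script="./cli.sh"):
    """Return one submit command per job file, in sorted (repeatable) order."""
    # Assumption: job files are XML workload definitions stored in job_dir.
    jobs = sorted(Path(job_dir).glob("*.xml"))
    return [["sh", str(cli_script), "submit", str(job)] for job in jobs]


def launch(job_dir, cli_script="./cli.sh", dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in build_commands(job_dir, cli_script):
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
```

Calling `launch("jobs")` only prints the commands; pass `dry_run=False` to actually submit them. Sorting the job files keeps the run order identical from one run to the next.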

Thoughts on these points, and on what the test set should contain?

Be sure to expand the comments so that we can have a healthy discussion!

Personally, I think there are several benchmarking philosophies that can help determine which tool to use. My perspective is that many tools focus on running various workloads and reporting results but don't necessarily help figure out where underlying problems may exist. So I wrote my own to meet my own needs, called getput (since putget was already taken at the time).

Getput does NOT do workloads. Instead, you give it a specific object size, run time, and number of threads to test against. You can also specify multiple sizes and multiple thread counts, and it will run the various permutations for very extensive profiling. The key point is repeatability, which you don't necessarily get with workloads, since those are more statistics based. In about an hour, I can generate numbers for object sizes from 1K to about 100M over parallel thread counts from 1 to 100 or more. Using multiple clients, I can drive the thread count up to well over 1000!

Let's say I want to compare the performance of 1K objects against 10K objects over a long period of time. I don't want a mix of object sizes or puts/gets polluting my numbers. If I want to tune the system or add/remove servers, I want to see exactly what happens with those object sizes. If I want to see what else is happening on the system with respect to CPU, network, memory, etc., I also want to see it for fixed object sizes and measure it all with collectl (a shameless plug for another tool I wrote ;)).

Using this methodology, I found a number of performance bugs the Swift developers didn't even know existed, though they had been there for years! For example, who would have ever guessed a 7887-byte PUT was twice as fast as a 7888-byte PUT? Workload-based tools just tell you that an overall workload performed at X. I'm not saying don't run workload-based tools; what I am saying is that perhaps you need multiple tools to get the whole picture.
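The fixed-size approach described above can be sketched roughly as follows. This is not getput's code or interface, only an illustration of the idea: each case uses a single object size and a single thread count for a fixed run time, and the sweep covers every permutation. The `put` callable and all function names are hypothetical stand-ins for a real client call.

```python
import itertools
import time
from concurrent.futures import ThreadPoolExecutor


def run_case(put, size, threads, run_time):
    """Issue fixed-size PUTs from `threads` workers for `run_time` seconds."""
    payload = b"x" * size
    deadline = time.monotonic() + run_time

    def worker(_):
        count = 0
        while time.monotonic() < deadline:
            put(payload)  # hypothetical client call, one object per call
            count += 1
        return count

    with ThreadPoolExecutor(max_workers=threads) as pool:
        total = sum(pool.map(worker, range(threads)))
    return total / run_time  # approximate operations per second


def sweep(put, sizes, thread_counts, run_time):
    """Run every (size, threads) permutation, one fixed size at a time."""
    results = {}
    for size, threads in itertools.product(sizes, thread_counts):
        results[(size, threads)] = run_case(put, size, threads, run_time)
    return results
```

Because no case ever mixes sizes or operation types, a change in one cell of the results can be attributed to that exact size and thread count.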

CosBench looks quite good for this. The main question is what kind of workload we should emulate, since an archive-type workload is completely different from a web-app data workload. The other question is what we want to measure: the implemented solution, the storage software, or the hardware beneath the storage software?


CosBench is the standard answer, indeed.


I think a spread of 4k, 32k, 128k, 1M, and 4M makes sense, in a matrix of 80% seq read, 80% seq write, and random mixed 70% read / 30% write. Thoughts?
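To make the size of that matrix concrete, here is a small sketch that enumerates it. It assumes the "k" and "M" suffixes are binary units (1k = 1024 bytes), and it keeps the three profiles as plain labels because the remaining 20% in the first two profiles is not specified. All names are hypothetical.

```python
import itertools

# Assumption: "k" and "M" are binary units (1k = 1024 bytes).
UNITS = {"k": 1024, "M": 1024 ** 2}

SIZES = ["4k", "32k", "128k", "1M", "4M"]
PROFILES = [
    "80% seq read",
    "80% seq write",
    "random mixed 70% read / 30% write",
]


def to_bytes(label):
    """Convert a size label such as '32k' or '4M' to a byte count."""
    return int(label[:-1]) * UNITS[label[-1]]


def build_matrix():
    """Return every (size label, size in bytes, profile) combination."""
    return [(s, to_bytes(s), p) for s, p in itertools.product(SIZES, PROFILES)]
```

Five sizes against three profiles gives 15 test cases, which is small enough to run in full each time.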


David, always here to help :)


