We recently needed to compare our file storage areas on Amazon and Rackspace. We mainly use Amazon S3, with Rackspace Cloud Files as a backup for the files in S3. I’ve written a couple of simple PHP scripts that fetch information (name, size, ETag, date modified) about the files kept on Amazon and Rackspace through their respective APIs, then save that information in a MySQL database. Here are the results for fetching info about 100k+ files:

  • It took over 35 minutes with the Amazon API
  • It took over 11.5 hours with the Rackspace API

These are two different API implementations with two very different results: the Rackspace run took roughly 20 times longer than the Amazon one.
That’s why we should aim for a robust system from the beginning, every time.
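
Once both listings are in a database, the comparison itself is a pair of joins. The following is only a minimal sketch of that step, not the original PHP scripts: it is written in Python, uses an in-memory SQLite database as a stand-in for MySQL, and the table names (`s3_files`, `cloudfiles_files`), the helper functions, and the sample rows are all made up for illustration.

```python
import sqlite3


def load_inventory(conn, table, rows):
    """Create `table` and fill it with (name, size, etag, modified) rows.

    The table name is a hypothetical, trusted constant here; it is not user input.
    """
    conn.execute(
        f"CREATE TABLE {table} "
        "(name TEXT PRIMARY KEY, size INTEGER, etag TEXT, modified TEXT)"
    )
    conn.executemany(f"INSERT INTO {table} VALUES (?, ?, ?, ?)", rows)


def compare_inventories(conn, primary, backup):
    """Return (missing, changed) file names, comparing `primary` against `backup`."""
    # Files listed in the primary store that have no row in the backup.
    missing = [row[0] for row in conn.execute(
        f"SELECT p.name FROM {primary} p "
        f"LEFT JOIN {backup} b ON b.name = p.name "
        "WHERE b.name IS NULL ORDER BY p.name"
    )]
    # Files present in both stores whose size or ETag differs.
    changed = [row[0] for row in conn.execute(
        f"SELECT p.name FROM {primary} p "
        f"JOIN {backup} b ON b.name = p.name "
        "WHERE p.size <> b.size OR p.etag <> b.etag ORDER BY p.name"
    )]
    return missing, changed


# Made-up sample data standing in for the real API listings.
conn = sqlite3.connect(":memory:")
load_inventory(conn, "s3_files", [
    ("a.jpg", 100, "e1", "2012-01-01"),
    ("b.jpg", 200, "e2", "2012-01-02"),
    ("c.jpg", 300, "e3", "2012-01-03"),
])
load_inventory(conn, "cloudfiles_files", [
    ("a.jpg", 100, "e1", "2012-01-01"),
    ("b.jpg", 200, "eX", "2012-01-02"),
])

missing, changed = compare_inventories(conn, "s3_files", "cloudfiles_files")
print("missing from backup:", missing)
print("differs in backup:", changed)
```

One caveat with this kind of check: ETags are only comparable when both services compute them the same way. S3 ETags for multipart uploads, for example, are not a plain MD5 of the file, so a size comparison is kept alongside the ETag comparison.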