This post is a follow-up to the Power and Scalability in One Location article I wrote for the CenturyLink Cloud blog. It covers technical details of what we did to get data from Elasticsearch (ES), store it in Orchestrate, and make it available via REST API using Ruby on Rails. Here's a summary of the original post:

  • My team released our next generation Object Storage. We badly needed business metrics for our product.
  • Our internal Analytics team supports an ELK stack for our product teams to use.
  • My team already had an auditor job that regularly queried our Object Storage system, reporting bucket sizes to our platform billing system.

Log the Data to Kafka

The first step to begin capturing our data was getting it logged into Kafka. Since our auditor runs inside of our platform, it already had access to the platform logging tools. All we had to do was wire our logging to those tools. (Our other logging process, outside of the platform, uses Fluentd.)
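Our auditor simply called the platform's logging tools, but to make the step concrete, here is a hedged sketch of what an audit record headed for Kafka can look like. The field names mirror the ones we query out of Elasticsearch later; the broker address, topic, and sample values are hypothetical:

```ruby
require 'json'

# A hypothetical audit record; field names mirror what our auditor logs.
record = {
  '@fields.objectStorageAuditorJob_accountAlias' => 'ACCT',
  '@fields.objectStorageAuditorJob_bucket'       => 'my-bucket',
  '@fields.objectStorageAuditorJob_quantity'     => '1024',
  '@fields.objectStorageAuditorJob_location'     => 'NY1'
}
payload = JSON.dump(record)

# With the ruby-kafka gem and a reachable broker, delivery would look like:
#   kafka =['broker:9092'], client_id: 'auditor')
#   kafka.deliver_message(payload, topic: 'platform_nlog')
puts payload
```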

Getting the Data out of Elasticsearch

We had some options for how to get the data out of ES. Since we used Ruby, we could have used the Elasticsearch gem, but instead opted to go with RestClient as the use case was clean and simple.

body = '{
  "query": {
    "filtered": {
      "query": {
        "bool": {
          "should": [
            { "query_string": { "query": "@fields.objectStorageAuditorJob_quantity: *" } }
          ]
        }
      }
    }
  },
  "fields": ["@fields.objectStorageAuditorJob_accountAlias", "@fields.objectStorageAuditorJob_quantity",
             "@fields.objectStorageAuditorJob_bucket", "@fields.objectStorageAuditorJob_location",
             "header.receivedTimestamp"],
  "sort": { "header.receivedTimestamp": { "order": "desc" }}
}'

url = 'https://your_elastic_url/es/platform_nlog-2016.04.18/_search?pretty=true&size=50000&from=0'

res = RestClient::Request.execute(:url => url, :method => :post, :payload => body, :user => "username", :password => "password")

Our filter returns any records that have the @fields.objectStorageAuditorJob_quantity field. We limit the fields returned to only those we care about, hit a particular daily index, and order by header.receivedTimestamp descending. Our ES instance is protected by HTTP Basic Auth, hence the user/password. Don't forget to enter your real credentials if you are behind Basic Auth.

Parking the Data in Orchestrate

Since we now had the response in a variable res, it was time to parse the data and store it away. This section assumes you've already set up your Orchestrate account, have an application (in CenturyLink NY1 - US East), have a collection called AuditorData, and have retrieved your key. This sounds like a lot, but it is really just a few clicks and keystrokes. Here is the Orchestrate Walkthrough KB which takes you step-by-step through the process.

require 'orchestrate'
require 'json'

parsed = JSON.parse(res)

client ='your_orchestrate_api_key')

hits = parsed["hits"]
puts "Total: " + hits["total"].to_s

hits["hits"].each_with_index do |hit, index|

  fields = hit["fields"]
  acct_alias = fields["@fields.objectStorageAuditorJob_accountAlias"].first
  quantity = fields["@fields.objectStorageAuditorJob_quantity"].first
  bucket = fields["@fields.objectStorageAuditorJob_bucket"].first
  timestamp = fields["header.receivedTimestamp"].first
  location = fields["@fields.objectStorageAuditorJob_location"].first
  key = timestamp + "-" + bucket

  res = client.put(:AuditorData, key, {'timestamp' => timestamp, 'acct_alias' => acct_alias, 'bucket' => bucket, 'quantity' => quantity, 'location' => location})

  if res.status != 201
    puts 'fail: ' + res.to_s
  end

  puts "Finished: #{index}" if index % 100 == 0
end

puts 'Done: ' + hits["hits"].length.to_s

We parsed the JSON and extracted the fields we wanted, then instantiated the Orchestrate client and stored each record under a key made up of the timestamp concatenated with the bucket name. We left the puts trace statements in so we could track the status of the job.

A cool feature of Orchestrate is that you can use the Orchestrate UI to List/Search your collections. Instructions for that are in the Orchestrate Walkthrough KB.
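Since Orchestrate search takes Lucene-style query strings, queries you try in the UI also work verbatim from the client. A small sketch of query strings built from our field names (the helper methods are ours, purely for illustration):

```ruby
# Build Lucene-style query strings that Orchestrate search accepts.
def bucket_query(name)
  %Q(bucket:"#{name}")
end

def account_location_query(acct_alias, location)
  %Q(acct_alias:"#{acct_alias}" AND location:"#{location}")
end

puts bucket_query('foo')                    # => bucket:"foo"
puts account_location_query('ACCT', 'NY1')  # => acct_alias:"ACCT" AND location:"NY1"
```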

Creating our Rails app

So now that we had our data in Orchestrate, we needed a way to retrieve it. Given that we wanted fast spin-up time and were already writing Ruby, Rails was a logical choice. We set up Rails and initialized our app (we're not covering that here, as there are plenty of examples on the Rails GitHub repository and elsewhere). Now if only we had an easy way to run this Rails app without having to care about infrastructure...

Get Running in AppFog

AppFog was an easy choice for many reasons (rapid deployment, no infrastructure concerns, just to name a couple). Since we were already using Orchestrate's US East region, we chose to use AppFog's US East option.

To get started, we followed the AppFog Getting Started KB. Next, we set up our AppFog membership following the AppFog Membership KB. Then, we set up the CloudFoundry CLI following the AppFog Login Using CF CLI KB. At this point we had a local Rails app (the default app), a functioning CF install, and were logged in via the cf login command. The last step was to push the app to AppFog using cf push example-app from the root directory of the app. We watched CF work its magic, pushing our new Rails app up to AppFog. After it finished, we hit the URL for our site and saw the default Rails 'Welcome aboard' page.

Building our API

Now we were finally at the point where we could do something fun with the data. We generated a Rails controller and view named ApiBucket. (Again, we're not covering that here). Here is the code from our controller:

require 'json'

class ApiBucketController < ApplicationController

  def index

    name = params[:name]

    unless name
      render nothing: true, status: :bad_request
      return
    end

    bucket_name = 'bucket:"' + name + '"'

    client ='your_orchestrate_api_key')
    results = []
    offset = 0
    limit = 100

    loop do
      res =, bucket_name, limit: limit, offset: offset, sort: :timestamp)

      if res.status == 200
        chunk_of_results = res.body['results']
        results +={ |r| one_value(r['value']) }
      else
        puts 'fail: ' + res.to_s
        render nothing: true, status: :service_unavailable
        return
      end

      offset += limit
      puts 'loaded ' + offset.to_s
      break if res.next_link.nil?
    end

    render json: JSON.dump(results: results)
  end

  def one_value(value)
    {
      timestamp: value['timestamp'],
      value: {
        bucket: value['bucket'],
        quantity: value['quantity'],
        acct_alias: value['acct_alias'],
        location: value['location']
      }
    }
  end
end
We accepted a name parameter, which is the name of the bucket, and used the Orchestrate client to search for entries with a matching bucket field. We looped through the results, building a JSON response to return to the caller, with some mild error handling and logic to page through Orchestrate's results 100 at a time.
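The paging logic is independent of Orchestrate itself. Here is a self-contained sketch of the same loop against a stubbed data source, where fetch_page stands in for client.search and a nil next_link signals the last page:

```ruby
# Generic offset-based paging, mirroring the controller loop above.
# DATA and fetch_page are stand-ins for Orchestrate.
DATA = (1..250).to_a

def fetch_page(offset, limit)
  page = DATA[offset, limit] || []
  # next_link is nil once the last page has been returned
  next_link = (offset + limit < DATA.length) ? "offset=#{offset + limit}" : nil
  [page, next_link]
end

results = []
offset = 0
limit = 100

loop do
  page, next_link = fetch_page(offset, limit)
  results += page
  offset += limit
  break if next_link.nil?
end
# results now holds all 250 items, fetched 100 at a time
```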


Finally, we added ApiBucket to our routes file.

Rails.application.routes.draw do
  get 'home/index'
  scope '/api' do
    scope '/bucket' do
      get '/' => 'api_bucket#index'
    end
  end
end
We did a cf push example-app to push up our changes. Now we were in business: we could do a GET on /api/bucket?name=foo and receive JSON data for our foo bucket! From here, we wrote a little JavaScript to massage the data and feed it into a graph.
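For that JavaScript step, the response shape matters more than the code. Here is a sketch (in Ruby, for consistency) of turning a response body in the shape our controller renders into chartable points; the sample values are invented:

```ruby
require 'json'

# A hypothetical response body in the shape the controller renders.
body = '{"results":[{"timestamp":"2016-04-18T12:00:00Z",' \
       '"value":{"bucket":"foo","quantity":"1024","acct_alias":"ACCT","location":"NY1"}}]}'

data = JSON.parse(body)

# Reduce each result to a [timestamp, quantity] pair for graphing.
points = data['results'].map { |r| [r['timestamp'], r['value']['quantity'].to_i] }
puts points.inspect  # => [["2016-04-18T12:00:00Z", 1024]]
```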


So that's it. Obviously there is room for improvement here (tests, moving that logic from the controller into a model, etc.), but it gives you a basic idea of how you can pull data from Elasticsearch, put it in Orchestrate, and expose the data via Rails in AppFog.


Thanks for reading, Phil Jensen