If your life is anything like mine, there are spreadsheets everywhere. Even if you don't use them yourself, you likely have co-workers or clients tossing them around. There are a lot of ways spreadsheets are super useful, but they certainly aren't a database. When it comes to searching, or pulling out insights beyond SUM functions, you need something more powerful. In this post, I'll show how to quickly bring a spreadsheet into our Orchestrate NoSQL database, so you can do more with your data.

Why NoSQL?

You may be wondering why we're using a NoSQL database. Spreadsheets look a lot like the tables of a relational database, after all. While it's true that there's a strong metaphor match, there is rarely a relational aspect to most spreadsheet data. Without the need for the major benefit a relational database can provide, it makes sense to go for the simplicity and speed that a document store provides.

Orchestrate combines the power of multiple flavors of NoSQL, giving us key-value and full-text search in a single interface... plus a few others we'll explore toward the end.

Find the Right Kind of Data

You may already have some spreadsheets in mind, which is fine. As you consider data to import into NoSQL, look for these characteristics:

  • Data in one table, likely in a single sheet
  • Data that is "denormalized" and does not require pivot tables to be useful
  • Data that contains a mix of text and numbers will be most interesting

Even if you don't have data at the ready, you can find it all over. For example, there are thousands of datasets on the US Data.gov site in either Excel or CSV formats, both of which are essentially spreadsheets.

Convert Your Spreadsheet to JSON

Orchestrate, like any NoSQL database, stores JSON objects (also called documents). A table in a spreadsheet (or in a relational database, for that matter) can easily be converted into JSON format. The column names become field names, and each row is its own individual object.

Take this simple example:

character possession
Mary little lamb
Old McDonald farm

In JSON, it becomes:

[{
  "character": "Mary",
  "possession": "little lamb"
},
{
  "character": "Old McDonald",
  "possession": "farm"
}]

You could manually convert your spreadsheet, but that could take awhile. A better choice would be to use a tool like CSV2JSON.

CSV to JSON Example

If your data is in Excel format, you should be able to save it as CSV before uploading. Or, you could just copy your table from any spreadsheet program. CSV2JSON auto-detects field separators (typically tabs in copied spreadsheet data).

Click the Convert button, then highlight and copy the contents of the JSON box. Store that to a file, because it's what we'll use to import into our database.

Import Data into Orchestrate NoSQL

Now that we have our JSON data, we're about ready to import it into the Orchestrate database. Since we'll be using Orchestrate bulk operations, we need to make some tweaks to the JSON so it knows how to handle the data.

We need to change these two things:

  1. Envelope each record within a "value" object
  2. Split our records into chunks of 250, if needed

The first is required by the bulk data format Orchestrate uses. The same operations can be used to import events and relations, as well as specify keys for every object. The second is to limit the size of each request, for everyone's benefit.

Our updated JSON for bulk operations looks like this:

{"value": {
  "character": "Mary",
  "possession": "little lamb"
}}
{"value":{
  "character": "Old McDonald",
  "possession": "farm"
}}

The commas between each object, as well as the [ and ] array brackets are not necessary in the JSON Stream format supported by Orchestrate. You should be able to alter your data with a simple find and replace of your JSON document.

Next, import the data to Orchestrate using cURL:

curl -i "https://api.ctl-va1-a.orchestrate.io/v0/characters" -XPOST -H "Content-Type: application/orchestrate-export-stream+json" -u "API_KEY:" -d @characters.json

You'll need to replace API_KEY with the key from your Orchestrate dashboard. If you don't have an account yet, see how to Get Started with Orchestrate. Also, you may need to change the API URL based on your Orchestrate data center.

Also, note that my JSON object was stored in the characters.json file. You will want to change the name to match every file you import (if you have more than 250 records).

Do More With Your Data

With your data added to Orchestrate, you have access to all the query types that it provides. For starters, everything added to Orchestrate is automatically indexed for full-text search. You can query by field, within ranges, across fields, across collections, and even perform aggregations on your searches.

Point and Radius Geospatial Search

If your data is geographic (i.e., contains latitude and longitude fields), you can immediately take advantage of geographic search with Orchestrate's bounding box or point-and-radius location search.

You could also use your new data import as a foundation for a project. With the objects as your anchor data, you can use Orchestrate's graph features to create relations between them (or across other collections of data you load). You could also create a log of events, time series data, associated with each object.

Create an Orchestrate account today and get insights from your spreadsheet data.