Periodically, we turn over control of the CenturyLink Cloud® blog to members of our certified technology Ecosystem to share how they leverage our platform to enable customer success. This week’s guest author from the Cloud Marketplace Provider Program is Tom Coffing, President and CEO at Coffing Data Warehousing.
Experts say that up to 75% of the entire world’s data will be on Hadoop by the year 2020. Yes, that’s right: 75%! This means a huge influx of data over the next several years, which translates into a huge opportunity for your organization. Capturing data that goes uncollected today will become commonplace.
What is Hadoop, Where did it Come From, and Why is it so Popular?
Hadoop grew out of techniques that Google published for analyzing massive amounts of web data. Doug Cutting and Mike Cafarella built an open source implementation of those ideas, which was then utilized and expanded by Yahoo! in collaboration with the open source community. The name Hadoop came from Doug Cutting, chief architect of Cloudera and one of the original creators. Cutting's son, then 2 years old, was just beginning to talk and called his beloved stuffed yellow elephant "Hadoop".
Hadoop is so popular because it is all about unstructured data. As much as 80% of data created each day is unstructured, and impossible to mine as a result. Hadoop brings structure to the chaos, helping to store the data sets across distributed clusters of servers at a much lower cost than with legacy servers.
Companies are beginning to collect, analyze, and mine data via different avenues. Sqoop transfers data between Hadoop and legacy databases like Oracle, Teradata, SQL Server and DB2. Companies will use Flume to gather weblogs from places like Twitter, Facebook, and LinkedIn, or any mechanical logs (e.g., trucking logs or smart thermostats). This will allow a company to mix structured data from legacy systems with unstructured data from websites, social media and logs.
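To make the idea of mixing structured and unstructured data concrete, here is a minimal Python sketch. It does not use Sqoop or Flume; it only assumes that log lines and customer records have already been collected. The log layout, the customer records, and the function names are all invented for illustration.

```python
import re
from collections import Counter

# Hypothetical structured records, as they might arrive from a legacy database.
customers = {
    "c-100": {"name": "Acme Freight", "region": "Midwest"},
    "c-200": {"name": "Blue Sky Heating", "region": "West"},
}

# Hypothetical raw log lines; this layout is an assumption made for the example.
raw_logs = [
    "2015-03-01T08:15:00 customer=c-100 event=page_view path=/pricing",
    "2015-03-01T08:17:42 customer=c-200 event=sensor_reading temp=71",
    "2015-03-01T08:19:05 customer=c-100 event=page_view path=/contact",
    "malformed line without any fields",
]

LOG_PATTERN = re.compile(
    r"^(?P<timestamp>\S+) customer=(?P<customer_id>\S+) event=(?P<event>\S+)"
)


def parse_log_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


def events_per_region(logs, customer_table):
    """Count parsed log events by the region stored on the customer record."""
    counts = Counter()
    for line in logs:
        record = parse_log_line(line)
        if record is None:
            continue  # skip lines that carry no usable structure
        customer = customer_table.get(record["customer_id"])
        if customer is not None:
            counts[customer["region"]] += 1
    return dict(counts)


region_counts = events_per_region(raw_logs, customers)
print(region_counts)
```

The point of the sketch is only that once a log line is parsed into named fields, it can be combined with a structured record through a shared key such as a customer ID.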
“Sentiment” is a Hadoop data-type term used to describe how your customers “feel”. Companies can analyze “Sentiment” by tracking Twitter feeds to find out what society thinks of their branding. “Sensor Machine” is a Hadoop data-type term used to discover patterns in data streaming from remote sensors and machines. Companies can also track their vehicles’ driving patterns based on logs associated with each vehicle. The possibilities are endless!
Big Data Fosters New Realities
Big data has become too big to keep on a single proprietary system, so the demand for integration is increasing rapidly. At the same time, competition for each customer's business has increased, and publicly available social media has begun to be treated as a new source of data. Together, these pressures forced companies to rethink their traditional data processing strategies, which led to significant changes as they adapted to new IT environments for managing and deploying business intelligence.
Creating New Realities:
- A line of business could no longer be on its own disparate platform
- Cloud computing needs to be implemented to save costs and improve economics
- Data needs to be integrated across many different sources
- The IT staff and business users need to work closer together than ever before
- Data visualization and automation became a necessity to simplify processes
- Coalescing of data is needed to improve customers’ lives and buying experiences
- Customers expect increased capabilities without being constrained by antiquated
Hadoop has paved the way for analysis of data volumes never before imagined. However, a gap existed; software was the missing link to unleashing the power of data stores across disparate systems.
The Big Data Challenges that are Solved by the Nexus Chameleon on CenturyLink Cloud
The Nexus Chameleon helps CenturyLink Cloud® customers by unifying their data warehousing environment under one software solution. Multiple tools for multiple systems are no longer needed as part of the CenturyLink Cloud Blueprint. The Nexus Chameleon has automated the most difficult challenges in today’s computing environment by doing things nobody else in the world is doing, and doing so with the click of a mouse. Since its inception, Nexus Chameleon was designed to work on all platforms. It is designed to cover all aspects of data, including query capabilities, data movement, graphing, charting, and visualization, with an emphasis on simplicity. The Nexus Chameleon meets the market-based need for robust, economical software to integrate data across systems with cross-system join capabilities and advanced graphing and analysis tools.
The Nexus Chameleon Database Mover
Nexus uses the Teradata Parallel Transport (TPT) utilities from Teradata and the Bulk Copy Program (BCP) from Microsoft to move data to and from Teradata and SQL Server platforms. For other platforms, it uses whatever data movement utility each vendor provides. Customers can point and click to move any single table or an entire database of tables. Nexus will continue to enhance its data movement strategy until it can move any single table or an entire database to and from all systems.
The Nexus Garden of Analysis
Nexus can simultaneously run queries on over 24 data sources. Each data source query is in its own separate tab and each returns its own individual answer set. The Nexus Garden of Analysis is designed to take all answer sets and place them in the garden, where any answer set can be re-queried or graphed without leaving the desktop. Nexus acts as its own database and does all the work internally. Garden users merely drag and drop column headers onto pre-defined garden templates. This allows a customer to produce over 15 different reports and over 30 different graphs and charts in less than one minute! The graphs and charts are then placed inside a dashboard where users can see them in thumbnail view, a slideshow presentation, scrolling across the screen, or in a comparison mode.
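The idea of re-querying answer sets locally can be sketched in a few lines of Python. This is not how Nexus is implemented; it is a minimal illustration that assumes two answer sets have already come back from different systems, and it uses an in-memory SQLite database to stand in for the local query engine. The table names, columns, and sample rows are hypothetical.

```python
import sqlite3

# Hypothetical answer sets, as if returned by queries against two different systems.
orders_answer_set = [(1, "c-100", 250.0), (2, "c-200", 75.5), (3, "c-100", 120.0)]
customers_answer_set = [("c-100", "Acme Freight"), ("c-200", "Blue Sky Heating")]


def requery_locally(orders, customers):
    """Load both answer sets into an in-memory database and join them there."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE orders (order_id INTEGER, customer_id TEXT, amount REAL)"
        )
        conn.execute("CREATE TABLE customers (customer_id TEXT, name TEXT)")
        conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", orders)
        conn.executemany("INSERT INTO customers VALUES (?, ?)", customers)
        # The join runs entirely on the local copy; neither source system is contacted.
        rows = conn.execute(
            "SELECT c.name, SUM(o.amount) AS total "
            "FROM orders o JOIN customers c ON c.customer_id = o.customer_id "
            "GROUP BY c.name ORDER BY c.name"
        ).fetchall()
    finally:
        conn.close()
    return rows


totals = requery_locally(orders_answer_set, customers_answer_set)
print(totals)
```

Because the answer sets are held locally, follow-up questions such as totals per customer do not require another round trip to either source system.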
The Nexus Conversion Tool
Nexus has always been designed as a cross-platform tool, and one essential philosophy has always been to convert table structures, formally expressed in Data Definition Language (DDL), between systems. CoffingDW has concentrated heavily on converting to and from all key systems integrated into Nexus. Below is a picture that shows Nexus converting tables from Teradata to table structures for Hadoop. Nexus transforms what has typically been a 6-week process into an efficient conversion taking seconds. Nexus dynamically makes the DDL and data-type conversions using an iterative, proven approach that took years to develop. CoffingDW has automated this process in order to save companies time and money, and it has been implemented as a key foundation for cross-system joins.
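For readers who have not seen a DDL conversion before, the following Python sketch shows the general shape of the problem. It is not the Nexus conversion logic. It assumes a small, illustrative mapping from Teradata column types to Hive column types, and the function names and sample table are hypothetical. A real conversion would also need to handle indexes, partitioning, character sets, and many more data types.

```python
import re

# Illustrative mapping only; it covers a handful of common types.
TERADATA_TO_HIVE = {
    "BYTEINT": "TINYINT",
    "SMALLINT": "SMALLINT",
    "INTEGER": "INT",
    "BIGINT": "BIGINT",
    "FLOAT": "DOUBLE",
    "DECIMAL": "DECIMAL",
    "CHAR": "CHAR",
    "VARCHAR": "VARCHAR",
    "DATE": "DATE",
    "TIMESTAMP": "TIMESTAMP",
}

# Splits a type such as "DECIMAL(10,2)" into its base name and optional arguments.
TYPE_PATTERN = re.compile(r"^(?P<base>[A-Za-z]+)\s*(?P<args>\(.*\))?$")


def convert_type(teradata_type):
    """Translate one Teradata column type into a Hive column type."""
    match = TYPE_PATTERN.match(teradata_type.strip())
    if match is None:
        raise ValueError(f"Cannot parse type: {teradata_type!r}")
    base = match.group("base").upper()
    if base not in TERADATA_TO_HIVE:
        raise ValueError(f"No mapping defined for type: {base}")
    args = match.group("args") or ""
    if base == "TIMESTAMP":
        args = ""  # Hive's TIMESTAMP type takes no precision argument
    return TERADATA_TO_HIVE[base] + args


def to_hive_ddl(table_name, columns):
    """Build a Hive CREATE TABLE statement from (column name, Teradata type) pairs."""
    column_lines = [f"  {name} {convert_type(col_type)}" for name, col_type in columns]
    return f"CREATE TABLE {table_name} (\n" + ",\n".join(column_lines) + "\n)"


hive_ddl = to_hive_ddl(
    "sales",
    [("sale_id", "INTEGER"), ("amount", "DECIMAL(10,2)"), ("sold_at", "TIMESTAMP(6)")],
)
print(hive_ddl)
```

Even this toy version shows why hand conversion is slow: every column type has to be checked against the target system's rules, one table at a time.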
If you are ready to get started, but are not yet a CenturyLink Cloud customer, no problem.
- Migrate to the CenturyLink Cloud Platform with free on-boarding assistance and receive a matching spend credit based on your commitment for your initial period of platform use with us.
- Coffing Data Warehousing is providing CenturyLink users a free trial license as part of this Blueprint. Please contact us to secure your free fully-functional trial license of the Nexus Chameleon.
- How-to Video: Nexus in 120 Seconds.
- Learn more about the Nexus Chameleon on CenturyLink Cloud.
- Getting Started with Coffing Data Warehousing Nexus Chameleon Blueprint on CenturyLink Cloud.