Building A Successful and Scalable Data System
Choose the things that matter most: this is one important rule in scaling your analytics. Imagine that you are in the middle of a baseball game. You have to hit the ball before you can make a home run: “this first before that”. Things happen in order; they always do. If you are trying to scale your data, your ultimate goal is to help your business make informed decisions. You present the information, and the business makes its judgment from it. You are in charge of producing valuable information for the business to succeed. The data needs to be organized, accurate and easy to understand; it should stand on its own. It has to say something more than plain words and numbers. The challenge is how to do it. How do you build a successful and scalable data system?
In a webinar hosted by KISSmetrics, Alyson Murphy, a senior data analyst at Moz, discussed, along with Convert, how to scale analytics in a mature organization. Convert learned that we need to focus on some key areas when scaling a data system. Data systems are complex structures that need to be carefully built. Like any real structure, they need a good foundation.
Have you ever looked at something, a wall perhaps, and wondered what is on the other side? The same goes for scaling a data system. The end users need to see what is happening on the other side of their business. Are they winning? Are they losing? Why did that happen? What are the opportunities for improvement? Your job is to build data solutions so they can look at their business from a different perspective.
Scaling a Data System: How to Do It Right the First Time
According to Ian Gorton in “Four Principles of Engineering Scalable, Big Data Software Systems”, “the more complex a solution, the less likely it will scale”. Know your priorities, think of the business problems that you want to solve, and start from there. In scaling a data system, we need to focus on six key areas:
- Data infrastructure
- Data integrity
- Data access
- Data visualization
- Infrastructure change process
- Data utilization process
Data Infrastructure. This is the structural and physical setup that you need so your data can be produced. Avoid creating many different views and metrics, and avoid joins that cross servers, because such queries take a long time to run. Simplify. Save query time by replicating the data onto one reporting server. Once the infrastructure is in place, you can join the data it produces with data from your web analytics tool for a more comprehensive presentation.
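To make the idea of joining replicated data with web analytics data concrete, here is a minimal sketch. It assumes an in-memory SQLite database as a stand-in for the single reporting server; the table names (`signups`, `web_visits`), the columns and the sample rows are all made up for illustration and do not come from the webinar.

```python
import sqlite3


def build_reporting_db():
    """Create a stand-in reporting database (hypothetical tables and sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Replica of product data (assumed shape)
        CREATE TABLE signups (day TEXT, signups INTEGER);
        -- Export from a web analytics tool (assumed shape)
        CREATE TABLE web_visits (day TEXT, visits INTEGER);
        INSERT INTO signups VALUES ('2024-01-01', 12), ('2024-01-02', 9);
        INSERT INTO web_visits VALUES ('2024-01-01', 300), ('2024-01-02', 450);
        """
    )
    return conn


def conversion_by_day(conn):
    """Join both sources on the day and compute signups per visit."""
    query = """
        SELECT s.day, s.signups, w.visits,
               ROUND(1.0 * s.signups / w.visits, 4) AS conversion
        FROM signups AS s
        JOIN web_visits AS w ON w.day = s.day
        ORDER BY s.day
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for row in conversion_by_day(build_reporting_db()):
        print(row)
```

Because both tables live on the same server, the join is a single local query rather than a lookup across systems, which is the simplification the paragraph above argues for.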
Data Integrity. Now that you have a way to present your data, your next assignment is to ensure that you are presenting accurate and valid information. How confident are you that your data is error-free? Have an audit system. You can do this manually with spot checks, or you can automate the process by moving to a more efficient data audit system.
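An automated audit can start very small. The sketch below is one possible shape, not a description of any particular audit tool: the function name `audit_rows`, the `AuditResult` type and the two checks (required fields present, key values unique) are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class AuditResult:
    check: str
    passed: bool
    detail: str


def audit_rows(rows, required_fields, key_field):
    """Run two basic integrity checks on a list of dict records."""
    results = []

    # Check 1: every required field is present and non-empty.
    missing = [
        (index, field)
        for index, row in enumerate(rows)
        for field in required_fields
        if row.get(field) in (None, "")
    ]
    results.append(
        AuditResult("required_fields", not missing, f"{len(missing)} missing value(s)")
    )

    # Check 2: the key field does not repeat across records.
    seen, duplicates = set(), set()
    for row in rows:
        key = row.get(key_field)
        if key in seen:
            duplicates.add(key)
        seen.add(key)
    results.append(
        AuditResult("unique_key", not duplicates, f"{len(duplicates)} duplicated key(s)")
    )
    return results
```

Running a function like this on a schedule replaces the manual spot check with a repeatable one, and new checks can be appended as the reporting grows.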
Building a minimum viable data system should be the number one goal in scaling your analytics.
Data Access and Visualization. This is where you ensure that the data is easily accessible and presented in a way that everyone in the organization can understand. The data must speak the organization’s common language.
Infrastructure change process. Changing your data infrastructure changes things in the back end. The start-up’s engineers may not realize that you have built a report on some of the data, and when they make changes they could break your reporting or make its numbers no longer true. To prevent this, appoint a person, somebody from your analytics team who knows the reporting, to keep watch over the changes being made. This allows you to adapt easily to the changes.
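The person watching over changes can be supported by a simple check. The following sketch assumes you keep a list of the tables and columns each report depends on; the function name `find_broken_reports` and the data shapes are hypothetical, chosen only to illustrate the idea.

```python
def find_broken_reports(report_dependencies, current_schema):
    """Return {report: ["table.column", ...]} for reports that lost a column.

    report_dependencies: {report name: [(table, column), ...]}  (assumed format)
    current_schema:      {table name: set of column names}      (assumed format)
    """
    broken = {}
    for report, needed in report_dependencies.items():
        # A dependency is missing if the table or the column no longer exists.
        missing = sorted(
            f"{table}.{column}"
            for table, column in needed
            if column not in current_schema.get(table, set())
        )
        if missing:
            broken[report] = missing
    return broken


if __name__ == "__main__":
    dependencies = {"weekly_signups": [("users", "created_at"), ("users", "plan")]}
    schema_after_change = {"users": {"id", "created_at"}}  # "plan" was dropped
    print(find_broken_reports(dependencies, schema_after_change))
```

Running such a check before a back-end change ships gives the analytics team a chance to update the affected reports instead of discovering the break afterwards.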
Data utilization process. People like to reuse existing analyses because it is quicker to use data that already exists than to create a new analysis. However, as your organization matures, you may want some form of quality assurance so that the right conclusions are drawn from your data.
The data is created for production purposes: it was built to function in the system, not for easy data analysis. Set the engineering side aside. How the data is transported from one source to the next, how it is produced, how it can be extracted and how the script is written are questions that you no longer need to care so much about. Work not from an engineer’s perspective but from a data analyst’s perspective.