Many organizations today are experiencing the growing pains of a legacy data landscape and architecture for their analytics and reporting needs. Many of these issues derive from the now-outdated architecture of the enterprise data warehouse, which no longer delivers digital value or the data insights needed to drive business innovation. While these legacy systems originally provided significant value by integrating data from multiple sources, supporting better decision making, and enabling robust enterprise reporting, they are no longer up to the task of delivering that same business value from newer, more modern data sources.
“Most existing legacy enterprise data warehouse systems were originally designed and architected for online analytic processing (OLAP) and reporting systems. These systems (many of which are outdated today) were often architected for highly normalized and structured data only.” — Forbes Magazine, 2019
Legacy data warehouses present other challenges as well:
- Heightened risk of security or compliance breaches
- Slow time to market
- System fragility and frequent downtime
- Data that sits untapped or unused in silos
- High costs of maintenance and hardware refreshes
- Greater difficulty innovating, which often leads to missed business opportunities
As organizations grow, evolve, and seek to innovate in order to better serve their customers, their sources of data are also changing. Where organizations previously relied solely on highly normalized, structured data from line-of-business applications, they now find themselves needing to analyze and understand data from a plethora of new and ever-evolving sources. Artificial intelligence, IoT, social data, customer surveys, textual information, and sentiment analysis of customer interactions in social media all generate data in massive quantities.
Examples of these types of data include social media feeds, text and multimedia content, e-mail messages, word processing documents, videos, photos, audio files, presentations, webpages, and many other kinds of business documents. This data is largely unstructured and non-normalized, and it requires modern tools to ingest, transform, store, and generate analytics for insight-driven business decision making.
These changes in organizations, the market, customer behavior, and data require organizations to redesign, rearchitect, re-platform, redeploy, or otherwise augment their analytics capabilities. The complexity arising from these ever-expanding data types and needs is further compounded by requirements for additional types of reporting, prescriptive and predictive analytics, massive storage capacity, and significant processing and compute power.
- 91.6% of executives report that the pace of investment in Big Data and AI is increasing, while 87.8% report a greater urgency to invest (Forbes)
- There will be more than 175 zettabytes of data by 2025, with 30% consumed in real-time (Seagate)
- More than 150B devices will be connected across the globe by 2025, most of which will be creating data in real time (Internet of Business)
- Investment in data could yield an annual revenue increase of $5.2 million, with organizations seeing a potential 547 percent return (Snaplogic)
How will organizations rise to these challenges, innovate, deliver digital value and derive actionable insights from their data?
“… systems of intelligence encompass your people, processes and technology. And they will ultimately define your competitiveness and ability to change the landscape of the industries you participate in…” — Satya Nadella, CEO of Microsoft
- Financial institutions say that legacy data platforms are the biggest obstacles to improving their data management and analytics capabilities (https://www.comparex-group.com)
- IT managers indicate that securing legacy data is one of their top costs
- Enterprises consider ease of use and flexible deployment as the top business considerations for new data management and analytics capabilities
- Professionals say transitioning from legacy infrastructure/systems is too complex
Modern businesses require a modern data estate! Businesses thrive on data. Leading enterprises need to collect, store, and process huge volumes of data from a variety of sources at near real-time speeds.
According to TDWI surveys, the leading driver for modernizing a data warehouse is enabling business-focused, data-driven use cases that are new to the organization. https://tinyurl.com/wktyeb3
There are several high-level business use cases driving the move to data estate modernization or a modern data warehouse experience.
Common business use cases for modernizing your data landscape include:
- Data warehouse-to-business alignment is the leading driver for most organizations today. The business changes direction, customer focus, operating models, etc., and the data must change to support it.
- Newer and more advanced analytics (big data, artificial intelligence, data science, IoT, social feeds, sentiment analysis)
- Support for the deployment and use of newer technology tools (Hadoop, data lakes, more modern reporting and analysis tools)
- Support for self-service, end-user, or “third wave” BI
- More real-time business monitoring
- Real-time or near-real-time analytics and reporting
- The need for data to be continually online for reporting
Other examples of new data sources that require more modern approaches:
- Modern digital marketing, marketing analytics and insights
- Sentiment analysis (natural language processing)
- Digital supply chain management
- Location data
Another frequent hallmark of the modernized data estate is a multi-platform approach combined with hybrid deployments.
There are still valid, viable business use cases for the traditional enterprise data warehouse. Organizations may still derive operational insights from line-of-business application data in these environments. Therefore, an on-premises or hybrid data warehouse may be just one component of the modern data estate, which could also include technologies like Hadoop, Cloudera, HDInsight, Databricks, etc., in a hybrid or cloud architecture.
Additionally, there are four stages for processing big data, AI or data science solutions that are common to all data warehouse architectures:
- Ingestion – The ingestion phase identifies the technology and processes used to acquire the source data. This data can come from files, logs, and other types of unstructured data that must be put into the Data Lake Store. The technology used will vary depending on the frequency with which the data is transferred.
- Store – The store phase identifies where the ingested data should be placed.
- Prep and train – The prep and train phase identifies the technologies that are used to perform data preparation and model training and scoring for data science solutions.
- Model and serve – Finally, the model and serve phase involves the technologies that will present the data to users.
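The four stages above can be illustrated with a minimal, self-contained Python sketch. Plain in-memory structures stand in for the cloud services (e.g., Azure Data Factory for ingestion, Data Lake Storage for the store, Databricks for prep and train); all function and field names here are hypothetical, and the "model" is deliberately trivial.

```python
# Illustrative sketch of the four common stages, with in-memory stand-ins
# for the cloud services. All names here are hypothetical.

import csv
import io
import statistics

def ingest(raw: str) -> list[dict]:
    """Ingestion: acquire source data (here, a CSV extract from an app)."""
    return list(csv.DictReader(io.StringIO(raw)))

def store(records: list[dict], lake: dict, path: str) -> None:
    """Store: land the ingested records in a 'data lake' (a dict stand-in)."""
    lake[path] = records

def prep_and_train(lake: dict, path: str) -> dict:
    """Prep and train: clean the data and fit a trivial 'model' (the mean)."""
    amounts = [float(r["amount"]) for r in lake[path] if r["amount"]]
    return {"mean_amount": statistics.mean(amounts)}

def serve(model: dict) -> str:
    """Model and serve: present results to users (e.g., via a BI report)."""
    return f"Average order amount: {model['mean_amount']:.2f}"

raw_extract = "order_id,amount\n1,10.50\n2,20.00\n3,\n4,9.50\n"
lake: dict = {}
store(ingest(raw_extract), lake, "raw/orders/2025-01-01.csv")
print(serve(prep_and_train(lake, "raw/orders/2025-01-01.csv")))
```

In a real estate, each of these functions would be a managed service rather than in-process code, but the staged hand-off (ingest, then store, then prep/train, then serve) is the same.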
Modern Data Estate Architecture Example
The architecture uses Azure Data Lake Storage at the center of the solution for a modern data warehouse. Integration Services is replaced by Azure Data Factory, which ingests data into the data lake from a business application. This is the source for the predictive model built in Azure Databricks.
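A rough local analog of this lake-centered flow can be sketched in a few lines of Python: a temp directory stands in for Data Lake Storage, and SQLite stands in for the warehouse layer that a bulk-load tool would target. All file names, table names, and columns are hypothetical.

```python
# Local analog of the lake-to-warehouse flow: an application extract is
# landed in the "lake", then loaded into a relational warehouse table.
# A temp directory stands in for Data Lake Storage and SQLite for the
# warehouse; all paths and names are hypothetical.

import csv
import sqlite3
import tempfile
from pathlib import Path

# "Data lake": a landing folder holding an ingested extract.
lake = Path(tempfile.mkdtemp())
(lake / "orders.csv").write_text("order_id,amount\n1,10.50\n2,20.00\n")

# "Warehouse": load the landed file into a relational table.
dw = sqlite3.connect(":memory:")
dw.execute("CREATE TABLE fact_orders (order_id INTEGER, amount REAL)")
with open(lake / "orders.csv", newline="") as f:
    rows = [(int(r["order_id"]), float(r["amount"])) for r in csv.DictReader(f)]
dw.executemany("INSERT INTO fact_orders VALUES (?, ?)", rows)

# "Serving": an aggregate query of the kind a semantic/caching layer
# or a BI report would issue against the warehouse.
total = dw.execute("SELECT SUM(amount) FROM fact_orders").fetchone()[0]
print(f"Total historical order amount: {total:.2f}")
```

The cloud services replace each stand-in here (Data Factory for the landing step, PolyBase-style bulk loading for the insert, a semantic layer for the final query), but the shape of the flow is the same.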
PolyBase is used to transfer the historical data into a big-data relational format held in Azure SQL Data Warehouse, which also stores the results of the trained model from Databricks. Azure Analysis Services provides a caching layer over SQL Data Warehouse to serve many users and to present the data through Power BI reports.

How do organizations get started? Where do they begin this level of transformation to modernize their data estate? The process can quickly become challenging and complex, and it requires an effective methodology, framework, and approach to ensure success.
One such approach follows an agile-style process with distinct phases:
- Look at existing legacy databases
- Rehost them – No database changes; servers & container images are migrated to the cloud
- Refactor – Take advantage of Azure Data Services with minimal changes
- Rearchitect – Data is rearchitected for the cloud (rearchitect / refactor / rewrite)
A sample modernization methodology:
- Capture and define – capture current business, process and technology priorities. Define requirements
- Assess – an analysis and assessment of current state including data systems, upstream and downstream systems, performance analysis and baselining
- Design – future state architectural design and documentation, leveraging target capabilities
- Iterative design, modernization and deployment processes – ongoing testing, refinement and improvement
- Deploy – ensure organizational readiness and lifecycle support
In summary, organizations must modernize in order to innovate, realize the promises of digital transformation, derive digital value from data assets, better monitor customer behavior and patterns, get to market more quickly, improve employee productivity, and improve profitability, among a host of other benefits.
You can further mitigate risk and ensure success by observing several key principles:
- Never make this an IT only or technology project
- Involve the business
- Leverage your data assets as well as subject matter experts on your data and applications (your controller or CFO knows your financial data and processes far better than any of your IT team)
- Keep it simple – leverage the right platform for the right data
Contact the Oakwood Data and Analytics team today for more information, onsite Lunch and Learn, Proof of Concept or custom presentation to your organization. We are also happy to partner with you and broker a deep dive data and analytics architectural discussion and design session with the Microsoft Technology Center and Microsoft team!
Need more information? Send the Data & Analytics team at Oakwood a message below and we’ll be happy to help!