Building reliable data pipelines with AI and DataOps


The industry’s use of analytics is ubiquitous and highly varied: correlating all the components in a technology ecosystem, learning from and adapting to new events, and automating and optimising processes. In their many different ways, these use cases are all about assisting the humans in the loop, making them more productive and reducing error rates.

Analytics is increasingly seen as the glue, or brain, that drives the emerging business and social ecosystems which can transform, and already are transforming, our economy and the way we live, work and play.

From people data to ‘thing’ data

The old touchstone of the technology industry - ‘people, processes and technology’ - is firmly entrenched, but we might start replacing ‘technology’ with ‘things’, increasingly so as embedded and unseen technology becomes truly ubiquitous through sensors and connected devices in everything around us.

As we become more connected, it’s been called an Internet of Things or an internet of everything, but for a truly connected and efficient system we are beginning to layer a much-needed ‘analytics of things’ on top. Forrester talks of ‘systems of insight’ and believes these are the engines powering future-proofed digital businesses. This layer is required because it’s only through analytics that businesses and institutions can synchronise the varied components of the complex ecosystem driving business and social transformation. Put another way: if we can’t understand and make use of all this data, why are we bothering to generate it?

While having a digital fabric means that so much can connect together, from varied enterprise solutions to manufacturing, or even consumer digital solutions like home control applications, it is analytics that coordinates and adapts to demand, using cognitive capabilities in the face of new forces and events. Analytics is needed to automate and optimise processes, making humans more productive and able to respond to pressures like the money markets, global social media feeds and other complex systems in a timely and adaptive manner.

However, the fly in the analytics ointment has tended to be the well-known plethora of problems with data warehouses - even well-designed ones. Data warehouses have been good at answering known questions, but business has tended to ask them to do too much. A warehouse is generally ideal for reporting and dashboarding, with some ad hoc analysis around those views, but it is just one aspect of many data pipelines, and it has tended to be slow to deploy, hard to change, expensive to maintain, and ill-suited to many ad hoc queries or big data requirements.


Spaghetti data pipelines

The modern data environment relies on a variety of sources beyond the data warehouse - production databases, applications, data marts, enterprise service buses, big data stores, social media and other external data sources, plus unstructured data too. The trouble is, it often relies on a spaghetti architecture to join these up with the ecosystem and the targets: production applications, analytics, reporting, dashboards, websites and apps.

To get from these sources to the right endpoints, data pipelines consist of a number of steps that convert data as a raw material into a usable output. Some pipelines are relatively simple, such as ‘export this data into a CSV file and place into this file folder’. But many are more complex, such as ‘move select tables from ten sources into the target database, merge common fields, array into a dimensional schema, aggregate by year, flag null values, convert into an extract for a BI tool, and generate personalised dashboards based on the data’.
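As a minimal sketch of the simple end of that spectrum, the following Python fragment exports one table to a CSV file and drops it into a folder. The database file, table name and output folder here are illustrative assumptions, not references to any particular product:

```python
import csv
import sqlite3
from pathlib import Path

# Illustrative names: a real pipeline would read its source connection
# and output folder from configuration rather than hard-coding them.
SOURCE_DB = "sales.db"
OUTPUT_DIR = Path("exports")

def export_table_to_csv(table: str) -> Path:
    """Dump one table into a CSV file in the output folder."""
    OUTPUT_DIR.mkdir(exist_ok=True)
    out_path = OUTPUT_DIR / f"{table}.csv"
    with sqlite3.connect(SOURCE_DB) as conn:
        cursor = conn.execute(f"SELECT * FROM {table}")
        with out_path.open("w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow([col[0] for col in cursor.description])  # header row
            writer.writerows(cursor)  # data rows stream straight from the query
    return out_path

export_table_to_csv("orders")
```

The complex pipeline described above is essentially a chain of many such steps - extract, merge, reshape, aggregate, validate, publish - which is exactly why the orchestration and monitoring discussed next matter so much.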

Complementary pipelines can also run together, such as operations and development, where development feeds innovative new processes into the operations workflow at the right moment - usually before data transformation is passed into data analysis.

As long as the process works efficiently, effectively and repeatably - pulling data from sources, through the various data processes, to the business users who need it, be they data explorers, users, analysts, scientists or consumers - then it’s a successful pipeline.

Dimensions of DataOps

DataOps brings a series of values into the mix. From the agile perspective, Scrum, kanban, sprints and self-organising teams keep development on the right path. DevOps relies on continuous integration, deployment and testing, with code and configuration repositories and containers. Total quality management is derived from performance metrics, continuous monitoring, benchmarking and a commitment to continuous improvement. Lean techniques feed into automation, orchestration, efficiency and simplicity.
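To make the continuous-testing idea concrete, here is a minimal sketch of the kind of automated data quality check that could run on every pipeline deployment; the row layout, column names and threshold are invented for illustration:

```python
def check_null_ratio(rows, column, max_null_ratio=0.01):
    """Fail the run if too many values in a column are missing."""
    nulls = sum(1 for row in rows if row.get(column) is None)
    ratio = nulls / max(len(rows), 1)
    if ratio > max_null_ratio:
        raise ValueError(
            f"{column}: {ratio:.1%} nulls exceeds the {max_null_ratio:.0%} budget"
        )

# A toy batch: two clean rows and one with a missing customer id.
batch = [
    {"order_id": 1, "customer_id": 42},
    {"order_id": 2, "customer_id": 7},
    {"order_id": 3, "customer_id": None},
]
check_null_ratio(batch, "customer_id", max_null_ratio=0.10)  # raises ValueError
```

Wired into a continuous integration job, a check like this stops a bad batch before it reaches the dashboards, which is the DataOps equivalent of a failing unit test blocking a code deployment.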

The benefits this miscellany of dimensions brings include speed, with faster cycle times and faster changes; economy, with more reuse and coordination; quality, with fewer defects and more automation; and higher satisfaction, based on a greater trust in data and in the process.

AI can add considerable value to the DataOps mix, as data plus AI together are becoming the default stack upon which many modern enterprise applications are built. There’s no part of the DataOps framework that AI cannot optimise: the data processes (development, deployment, orchestration), the data technologies (capture, integration, preparation, analytics), or the pipeline itself, from ingestion through engineering to analytics.

This AI value will come from machine learning and advanced analytics applied beyond troubleshooting (though that alone will bring massive cost, resource and time savings), through greater automation and by rightsizing the process and its parts to work in optimal harmony.
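As one hedged illustration of what that automation might look like at its simplest, the sketch below flags pipeline runs whose duration deviates sharply from their own history, using a basic z-score test; the job durations are invented, and a production system would use far richer models and signals:

```python
from statistics import mean, stdev

def is_anomalous(history, latest, threshold=3.0):
    """Flag a run whose duration deviates sharply from its own history."""
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return latest != mu
    return abs(latest - mu) / sigma > threshold

# Invented durations (in seconds) for a nightly job, plus one slow outlier.
past_runs = [312, 298, 305, 320, 301, 315, 308]
print(is_anomalous(past_runs, 612))  # True: this run warrants a look
print(is_anomalous(past_runs, 310))  # False: within normal variation
```

Even this crude statistical baseline catches a pipeline that has silently doubled in runtime; machine learning extends the same idea across many metrics at once.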

Where DataOps adds value

The goal of good architecture is to coordinate and simplify data pipelines, and the goal of DataOps is to automate, monitor and optimise them. Enterprises need to inventory their data pipelines and explore DataOps processes and tools carefully, so that they solve their challenges with right-sized tools. AI will layer on top, drawing the ultimate value from DataOps.

Kunal Agarwal, CEO of Unravel Data

Kunal Agarwal co-founded Unravel Data in 2013 and serves as CEO. Kunal has led sales and implementation of Oracle products at several Fortune 100 companies. He co-founded Yuuze.com, a pioneer in personalised shopping and what-to-wear recommendations. Before Yuuze.com, he helped Sun Microsystems run Big Data infrastructure such as Sun's Grid Computing Engine.