The data platform serves as the nerve center of your organization. It cuts across the entire business, serving users of all technical skill levels, from data engineers, scientists, and analysts to machine learning teams, operations managers, and business executives. By integrating data from all corners of your business into an accessible and understandable ecosystem, it transforms raw event and transactional data into definitions, metrics, and meaning that drive business decisions and unlock new opportunities.
The data platform achieves all this on the foundation of the infrastructure platform, with an eye towards enabling the potentially most valuable (and most challenging) data features: ML and AI models.
Data Platform
The specific technologies used for a data platform will depend on your business and requirements, but generally a data platform offers answers to the following questions:
Data Discovery & Availability
- what data is available?
- how was it generated?
- where is it stored?
- how can I access it?
Data Quality & Reliability
- what is the history (schema, migrations) of the data source?
- how fresh, reliable and accurate is the data?
- how do I define and measure data quality?
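One way to make "fresh" and "accurate" concrete is to express them as small, measurable checks. The sketch below is illustrative only: the `freshness` and `completeness` helpers, the sample rows, and the two-hour threshold are all assumptions, not part of any particular platform.

```python
from datetime import datetime, timedelta, timezone


def freshness(latest_event_time, now):
    """Return how old the newest record is."""
    return now - latest_event_time


def completeness(rows, column):
    """Return the fraction of rows with a non-null value in `column`."""
    if not rows:
        return 0.0
    filled = sum(1 for row in rows if row.get(column) is not None)
    return filled / len(rows)


# Hypothetical sample of an orders table.
rows = [
    {"order_id": 1, "amount": 20.0,
     "updated_at": datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)},
    {"order_id": 2, "amount": None,
     "updated_at": datetime(2024, 1, 1, 13, 0, tzinfo=timezone.utc)},
]

# A fixed "now" keeps the example deterministic.
now = datetime(2024, 1, 1, 14, 0, tzinfo=timezone.utc)

lag = freshness(max(row["updated_at"] for row in rows), now)
amount_completeness = completeness(rows, "amount")

# Assumed service level: data should be at most two hours old.
is_fresh = lag <= timedelta(hours=2)
```

Checks like these can run on a schedule, with the thresholds agreed between the teams that produce and consume the data.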
Data Relationships & Integration
- how do different data assets relate to each other?
- can I track data lineage across various transformations and integrations?
- how do I resolve discrepancies between data sources?
- how do I integrate with external data sources and APIs?
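Lineage questions reduce to walking a dependency graph. As a minimal sketch, assume lineage is stored as a mapping from each asset to the assets it is derived from; the asset names and the `upstream` helper here are hypothetical.

```python
# Hypothetical lineage map: each asset lists the assets it is derived from.
LINEAGE = {
    "revenue_dashboard": ["daily_revenue"],
    "daily_revenue": ["orders_clean", "refunds_clean"],
    "orders_clean": ["raw_orders"],
    "refunds_clean": ["raw_refunds"],
    "raw_orders": [],
    "raw_refunds": [],
}


def upstream(asset, lineage):
    """Return every asset that `asset` depends on, directly or transitively."""
    seen = set()
    stack = list(lineage.get(asset, []))
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already visited via another path
        seen.add(current)
        stack.extend(lineage.get(current, []))
    return seen


dependencies = upstream("revenue_dashboard", LINEAGE)

# Raw sources are the upstream assets with no dependencies of their own.
sources = sorted(a for a in dependencies if not LINEAGE[a])
```

The same traversal, run in the opposite direction, answers the impact question: which dashboards and models are affected when a raw source changes.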
Data Governance & Security
- who has access to specific data assets?
- at what granularity are data controls implemented?
- how do I implement a data control?
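To illustrate what "granularity" can mean in practice, here is a rough sketch of a column-level control. The roles, column names, and the `apply_column_policy` helper are assumptions made for the example; real platforms usually enforce this in the warehouse or query layer rather than in application code.

```python
# Hypothetical policy: which columns each role may read.
POLICY = {
    "analyst": {"order_id", "amount"},
    "support": {"order_id", "email"},
}


def apply_column_policy(row, role, policy):
    """Return a copy of `row` containing only the columns `role` may see."""
    allowed = policy.get(role, set())  # unknown roles see nothing
    return {column: value for column, value in row.items() if column in allowed}


row = {"order_id": 1, "amount": 20.0, "email": "user@example.com"}

analyst_view = apply_column_policy(row, "analyst", POLICY)
support_view = apply_column_policy(row, "support", POLICY)
unknown_view = apply_column_policy(row, "intern", POLICY)
```

Defaulting unknown roles to an empty set means a missing policy entry denies access rather than granting it.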
Data Processing & ETL
- how do I define and run ETL pipelines?
- how can I combine data from multiple sources?
- how can I locally test and iterate on data pipelines?
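One common way to keep pipelines testable locally is to write each transformation as a pure function over in-memory rows, so it can be exercised without a warehouse or scheduler. The sketch below combines two hypothetical sources (orders and customers); the function and field names are illustrative assumptions.

```python
def join_orders_with_customers(orders, customers):
    """Inner-join orders to customers on `customer_id`."""
    customers_by_id = {c["customer_id"]: c for c in customers}
    joined = []
    for order in orders:
        customer = customers_by_id.get(order["customer_id"])
        if customer is None:
            continue  # drop orders with no matching customer
        joined.append({
            "order_id": order["order_id"],
            "amount": order["amount"],
            "region": customer["region"],
        })
    return joined


def revenue_by_region(joined_rows):
    """Sum order amounts per customer region."""
    totals = {}
    for row in joined_rows:
        totals[row["region"]] = totals.get(row["region"], 0.0) + row["amount"]
    return totals


# Small in-memory fixtures stand in for the real sources.
orders = [
    {"order_id": 1, "customer_id": "a", "amount": 10.0},
    {"order_id": 2, "customer_id": "b", "amount": 5.0},
    {"order_id": 3, "customer_id": "a", "amount": 2.5},
    {"order_id": 4, "customer_id": "zz", "amount": 99.0},  # no such customer
]
customers = [
    {"customer_id": "a", "region": "EU"},
    {"customer_id": "b", "region": "US"},
]

result = revenue_by_region(join_orders_with_customers(orders, customers))
```

Because the transformations take and return plain data, the same functions can be covered by unit tests locally and then wired into whichever orchestrator the platform uses.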
Business Intelligence & Analytics
- how can I generate and share reports, dashboards and visualizations?
- how can I run backward-looking analytics on my data?
- how can I run backward-looking analytics on my big data?
Data Definitions & Metadata
- how do I centralize and broadcast business definitions of transformed data and their raw data dependencies?
Advanced Analytics & AI Integration
- how does a data platform work with a machine learning platform?
- how does a data platform work with an AI platform?
The data platform is a collaboration zone where data engineers, analysts, scientists, product managers, and the executive team come together. It’s a place where teams can see what’s happening in the business at any given moment and begin to ask why.
Strong data platforms and teams are the unsung heroes of the bright and shiny ML & AI features titillating the market. While ML & AI investment shouldn’t wait on the perfect data platform, data platform investment should co-occur with any ML & AI features you want to deploy to more than a PowerPoint.