Are you practicing Big Data blocking and tackling? Ron Bodkin (@ronbodkin) penned a good post describing the value of having information in one place, which is the mantra of data warehouses and data marts.
While Hadoop makes it easier to warehouse data (thanks to its flexible schema model), effective analytics across disparate data sources still requires defining data semantics, data mappings, and master data sources. Don’t forget these foundational building blocks.
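To make the mapping and master-source idea concrete, here is a minimal sketch in Python. The source names (`crm`, `billing`), field names, and the `MASTER_SOURCE` table are all hypothetical, invented for illustration; a real implementation would draw them from your own data dictionary and governance decisions.

```python
# Hypothetical field mappings: source field name -> canonical field name.
FIELD_MAPS = {
    "crm": {"cust_id": "customer_id", "full_name": "name", "email_addr": "email"},
    "billing": {"CustomerNumber": "customer_id", "CustomerName": "name", "Email": "email"},
}

# Hypothetical choice of which source is the master for each canonical field.
MASTER_SOURCE = {"name": "crm", "email": "billing"}


def to_canonical(source, record):
    """Rename a source record's fields to canonical names, dropping unmapped fields."""
    mapping = FIELD_MAPS[source]
    return {mapping[k]: v for k, v in record.items() if k in mapping}


def merge_customer(records_by_source):
    """Merge per-source records for one customer, preferring the master source per field."""
    canonical = {s: to_canonical(s, r) for s, r in records_by_source.items()}
    merged = {}
    # First pass: take the first value seen for each field.
    for rec in canonical.values():
        for field, value in rec.items():
            merged.setdefault(field, value)
    # Second pass: the designated master source overrides where it has a value.
    for field, source in MASTER_SOURCE.items():
        if source in canonical and field in canonical[source]:
            merged[field] = canonical[source][field]
    return merged


customer = merge_customer({
    "crm": {"cust_id": "42", "full_name": "Ada Lovelace", "email_addr": "ada@old.example"},
    "billing": {"CustomerNumber": "42", "CustomerName": "A. Lovelace", "Email": "ada@example.com"},
})
print(customer)
```

The code itself is trivial. The hard part is agreeing on the contents of the two tables at the top, which is exactly the little data work that a flexible schema does not do for you.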
The Path to Big Data Requires Little Data
In a recent workshop with industry IT practitioners, the focus was on little data problems. The following problems will inhibit scaling little data to big data:
- Uneven data management maturity across the organization
◦ Emerging master data management practices
◦ Minimal identification of single source of truth
◦ Little agreement on core data entity representation
- Enterprise information sharing platform not in place
◦ Fragmented data silos and data repositories
◦ Ad hoc, project-level data integration
◦ Limited data virtualization and data services
◦ Proliferation of unknown Excel spreadsheets
In addition to copying legacy data, some BDP implementation roadmaps tie directly into business activity message streams and don’t wait for bulk copies.
Big Data Reading Recommendations