Last week I attended the first full day of RapidMiner World here in Aberdeen’s hometown of Boston. Appropriately, the event took place in the new District Hall, a public space designed to foster innovation and collaboration in Boston’s fast-growing Seaport District.
RapidMiner CEO Ingo Mierswa opened the event and introduced the theme of “All Data for All Enterprises.” Mr. Mierswa emphasized simplicity as the key to this ambitious goal, and drew an analogy between selecting a data mining solution and buying a shirt: cheap shirts and high-end designer options each have drawbacks (the latter look nice but are often over-priced and unnecessary), so the best option is a tailored fit.
Acceleration was the underlying topic of the rest of Mr. Mierswa’s remarks. Speed is one of Aberdeen’s Three Pillars of Big Data Strategy. In Q4, I’ll be publishing a full report on the state of data speed heading into 2015 (keep an eye out for the state of data availability in September). My colleague Michael Lock’s recent research has also focused on the idea of “fast data” as it relates to integration.
The keynote speaker for the event was Usama Fayyad, the Chief Data Officer of Barclays. Mr. Fayyad emphasized that the distinction between big data and “classic” data is quickly disappearing. This is largely due to companies’ ability to augment classic data sets, which may be relatively small, with outside data to perform more robust analysis.
Mr. Fayyad listed seven data axioms that are worth sharing, along with Aberdeen research that supports them:
- Data gains value exponentially when you integrate · Big Data Becomes Fast Data with Accelerated Integration
- Fusing data together from disparate sources is difficult to achieve · Interactive Data Visualization: The IT Perspective
- Standardization is key · The ROI of Data Scientists
- Data governance and policies must be centralized · Collaborative Data Governance: Peeling the Red Tape off Data Discovery
- Recency matters · Real-Time Executives: Streaming Data into the C-Suite
- Data infrastructure needs rapid renewal and modernization · The Best-in-Class Data Warehouse: Fast, Simple, Impactful
- Data is a primary competency and not a side activity · The Modern Data Analyst: Generating Insight at the Line of Business
I particularly enjoyed Mr. Fayyad’s use of Pieter Bruegel the Elder’s The Tower of Babel to drive home his point on the importance of data standardization. I may have to borrow the biblical metaphor for an upcoming report.
Mr. Fayyad continued his presentation with some thoughts on Hadoop. He laid out the top reasons why more and more data is moving naturally to Hadoop: the cost of storage and convenience compared with ETL. He shared compelling figures on the cost savings of Hadoop on commodity storage versus more traditional storage. Aberdeen hopes to publish several Hadoop case studies in the coming months based on conversations with members of our readership community.
Finally, Mr. Fayyad wrapped things up by preaching against the evils of batch thinking, a matter that is near and dear to my heart (and to my research on streaming data and real-time analytics).
There were more interesting speakers and ideas throughout the rest of the day than I can share here, but suffice it to say RapidMiner put together a great lineup. I’d like to thank the fine folks at RapidMiner for having (and feeding) me, and I look forward to following the company’s activity in the coming months.
For more on this topic, read the Aberdeen report Three Pillars of Big Data Strategy.