Strata + Hadoop World wrapped up on February 20, and this year’s four-day event focused on big data, with keynote speeches from leaders as prominent as U.S. President Barack Obama himself. During the show, the President announced the appointment of DJ Patil as Chief Data Scientist of the United States, highlighting the government’s commitment to new technologies and further cementing the importance of big data. Perhaps surprisingly, Hadoop wasn’t the star of the show; instead, a newer Apache project, Spark, quickly became the talk of the town.
Spark, which was created at the AMPLab at UC Berkeley, is designed to work with Hadoop’s file system and is primarily an in-memory data-processing framework that’s faster and easier to use than MapReduce. In addition to its core engine, the Spark ecosystem includes related projects such as its own in-memory file system called Tachyon, along with machine learning, stream processing, NoSQL, interactive SQL technologies and GraphX. Spark is being commercialized by a company called Databricks.
Pretty much everywhere, the main attraction was technology companies lining up to support Spark: Databricks (naturally), Intel, Altiscale, MemSQL, Qubole and Zoomdata were among them. In the coming years we may begin to see tension between the Spark and Hadoop platforms, or possibly an evolution in how they relate and work together, but for now we expect that most companies will continue to use trusted platforms like Hadoop MapReduce, Hive or Impala for their big data workloads.
Beyond the Spark excitement, several companies introduced new products and ideas, including Pivotal, Cloudera, MapR, Microsoft, HP, Oracle, Salesforce.com and more. Here are a few key takeaways from the event:
All in all, the conference reinforced the fact that big data is quickly becoming one of the most important forces in the IT field, as well as in business in general. The rapid improvement of analytics software and the increasing development of machine learning products promise to change the face of computing in the coming years, and you can be sure Collabera will be on top of every update as the field continues to mature.