Big data startup Databricks has teamed with education specialist O’Reilly Media in a bid to create a critical mass of developers who know how to write apps for Apache Spark, an open source cluster computing engine that fits into the Hadoop analytics community.

Databricks head of business development Arsalan Tavakoli, once part of the team that originally created Spark, said his mission is to build sufficient momentum behind the open source processing engine in order for the data center industry to reap the benefits.

“Hadoop created a cheaper data management system. Now we need the next piece in place. So the next question is: can we get a big enough processing machine in place, with popular support, in order to run these data management systems effectively,” said Tavakoli.

Though Spark has many admirers in the open source movement, few developers know how to write applications that use it to full effect, said Tavakoli. The open source movement faces a challenge in recruiting enough partners and developers to create a mass movement. When that happens, the open source engine, which should work with any distribution of Hadoop, will become the de facto standard, according to Tavakoli.

The programme, devised by Databricks’ Spark experts and O’Reilly’s editorial team, promises a wide variety of freely available training materials such as online tutorials and training videos, with certification granted to anyone who can demonstrate sufficient knowledge in an exam. “Too often training and certification is seen as a revenue stream and a way to lock people into a vendor. We have no motivation to do that because we’ve seen that too often open source fails when different parties descend into infighting,” said Tavakoli.

O’Reilly Media is also developing a full suite of training materials to support the certification.

Recent Databricks initiatives have included the ‘Certified on Spark’ and ‘Certified Spark Distribution’ programs designed to ensure compatibility between Spark applications and distributions.

“The adoption of Apache Spark by businesses is growing at an incredible rate across a wide range of industries, and the demand for developers with certified expertise is quickly following suit,” said John Tripier, alliances and ecosystem lead at Databricks.

Spark Developer Certification will formally launch at O’Reilly’s Strata Conference + Hadoop World in New York, October 15-17. Training materials can be found at Databricks.com.