Apache Spark - Big Data Platform for All

by Sumit Mund | May 19, 2015

Apache Spark is a powerful open-source, in-memory cluster computing framework built around speed, ease of use, and sophisticated analytics. It runs on Hadoop (YARN), on Mesos, in standalone mode, or in the cloud, and it can access diverse data sources including HDFS, Cassandra, HBase, and S3. Spark powers a stack of high-level tools: Spark SQL, MLlib for machine learning, GraphX for graph processing, and Spark Streaming for building scalable, fault-tolerant streaming applications.


Partnership with Hortonworks!

by Sumit Mund | Nov 22, 2013

Today, Hadoop is synonymous with big data, having become the platform of choice for big data processing. Apache™ Hadoop® is an open-source project governed by the Apache Software Foundation (ASF) that lets you gain insight from massive amounts of structured and unstructured data quickly and without significant investment.
