Course Overview
Introduction to Apache Spark:
Apache Spark™ is a fast and general engine for large-scale data processing. It is designed to run programs faster than Hadoop MapReduce. It is easy to use: you can write applications quickly in Java, Scala, Python, or R. It is also general-purpose and can combine SQL, streaming, and complex analytics in the same application. It is versatile in deployment and runs on Hadoop, Mesos, standalone, or in the cloud. It can access diverse data sources including HDFS, Cassandra, HBase, and S3.
Through this overview course on Apache Spark, you will learn the fundamental mechanisms and basic internals of the framework and understand why Spark is needed. The course is intended for users who are interested in Apache Spark and are just starting to learn what it does. It gives you an overview of Apache Spark, the reasons to use Spark, and Spark Core.
Course Objectives:
- To understand the need for Apache Spark
- To get an overview of Spark Core
Target Audience:
- Students and professionals interested in learning about Apache Spark
- Anyone who wants to learn about data and analytics
- Data Engineers
- Analysts
- Architects
- Software Engineers
- IT operations
- Technical managers
Prerequisites:
- Basic Computer Knowledge
- Coding experience
- Knowledge of the MapReduce paradigm
- Basic knowledge of any of these: Java, Scala, or Python