Details
What is Spark?
Apache Spark is an open source data analytics cluster computing framework. It was originally developed in the AMPLab at UC Berkeley. Spark fits into the Hadoop open source community and builds on top of the Hadoop Distributed File System (HDFS). However, Spark is not tied to the two-stage MapReduce paradigm, and it promises significantly faster performance than Hadoop MapReduce for certain applications. Spark provides primitives for in-memory cluster computing, which allow user programs to load data into a cluster's memory and query it repeatedly. This makes Spark well suited to machine learning algorithms.
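As a rough illustration of loading data into memory and querying it repeatedly, here is a minimal sketch using Spark's RDD API in Scala. The object name CacheSketch, the helper summarize, and the sample log lines are hypothetical and chosen only for this example. It assumes Spark is on the classpath and runs in local mode rather than on a real cluster.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object CacheSketch {
  // Hypothetical helper: returns (number of ERROR lines, total lines).
  def summarize(sc: SparkContext): (Long, Long) = {
    // Small in-line data set standing in for a large file stored on HDFS.
    val lines = sc.parallelize(Seq(
      "INFO start", "ERROR disk full", "WARN slow", "ERROR timeout"))

    // cache() marks the RDD to be kept in memory once it has been computed.
    val cached = lines.cache()

    // The first action computes the data and populates the cache;
    // the second action reuses the in-memory copy.
    val errors = cached.filter(_.startsWith("ERROR")).count()
    val total = cached.count()
    (errors, total)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: local mode with two threads, for illustration only.
    val conf = new SparkConf().setAppName("CacheSketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    val (errors, total) = summarize(sc)
    println(s"errors=$errors total=$total")
    sc.stop()
  }
}
```

With a data set this small the cache makes no visible difference. The benefit appears when the same large data set is scanned many times, as in iterative machine learning algorithms.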
Spark became an Apache top-level project after previously being an Apache Incubator project. It has received code contributions from large companies that use Spark, including Yahoo and Intel, as well as from many individual developers representing different companies. The software is written in Scala, Java, and Python. It is available for the Linux, Mac OS, and Windows operating systems. Spark is available for use under the Apache License 2.0. The official website is spark.apache.org.
Outline
Module 1
Introduction to Scala
Learning Objectives – In this module, you will understand the basic concepts of Scala and the motivation for learning a new language, and get your setup ready.
Topics
1) Why Scala?
2) What is Scala?
3) Introducing Scala
4) Installing Scala
5) Journey – Java to Scala
6) First Dive – Interactive Scala
7) Writing Scala Scripts – Compiling Scala Programs
8) Scala Basics
9) Scala Basic Types
10) Defining Functions
11) IDE for Scala, Scala Community
Module 2
Scala Essentials
Learning Objectives – In this module, you will learn the essentials of Scala needed to work with it.
Topics
1) Immutability in Scala – Semicolons
2) Method Declaration, Literals
3) Lists
4) Tuples
5) Options
6) Maps
7) Reserved Words
8) Operators
9) Precedence Rules
10) If statements
11) Scala For Comprehensions
12) While Loops
13) Do-While Loops
14) Conditional Operators
15) Pattern Matching
16) Enumerations
Module 3
Traits and OOPs in Scala
Learning Objectives – In this module, you will understand the implementation of OOP concepts in Scala and use Traits as Mixins.
Topics
1) Traits Intro – Traits as Mixins
2) Stackable Traits
3) Creating Traits, Basic OOP – Class and Object Basics
4) Scala Constructors
5) Nested Classes
6) Visibility Rules
Module 4
Functional Programming in Scala
Learning Objectives – In this module, you will understand functional programming know-how for Scala.
Topics
1) What is Functional Programming?
2) Functional Literals and Closures
3) Recursion
4) Tail Calls
5) Functional Data Structures
6) Implicit Function Parameters
7) Call by Name
8) Call by Value
Module 5
Introduction to Big Data and Spark
Learning Objectives – In this module, you will understand what Big Data is, its associated challenges, and the various frameworks available, and you will get a first-hand introduction to Spark.
Topics
1) Introduction to Big Data
2) Challenges with Big Data
3) Batch Vs. Real-Time Big Data Analytics
4) Batch Analytics – Hadoop Ecosystem Overview
5) Real-Time Analytics Options, Streaming Data – Storm
6) In Memory Data – Spark
7) What is Spark?
8) Modes of Spark
9) Spark Installation Demo
10) Overview of Spark on a cluster
11) Spark Standalone Cluster
Module 6
Spark Baby Steps
Learning Objectives – In this module, you will learn how to invoke Spark shell and use it for various standard operations.
Topics
1) Invoking Spark Shell
2) Loading a File in Shell
3) Performing Some Basic Operations on Files in Spark Shell
4) Building and Running a Spark Project with sbt
5) Caching Overview, Distributed Persistence
6) Spark Streaming Overview
7) Example: Streaming Word Count
Module 7
Playing with RDDs
Learning Objectives – In this module, you will learn about RDDs, one of the building blocks of Spark, and the related manipulations for implementing business logic.
Topics
1) RDDs
2) Transformations in RDD
3) Actions in RDD
4) Loading Data in RDD
5) Saving Data through RDD
6) Scala and Hadoop Integration Hands-on
Module 8
Shark – When Spark meets Hive
Learning Objectives – In this module, you will see different offshoots of Spark such as Shark, Spark SQL, and MLlib. This session is primarily interactive and discusses industrial use cases of Spark and the latest developments in this area.
Topics
1) Why Shark?
2) Installing Shark
3) Running Shark
4) Loading of Data
5) Hive Queries through Spark
6) Testing Tips in Scala
7) Performance Tuning Tips in Spark
8) Shared Variables: Broadcast Variables
9) Shared Variables: Accumulators