About The Program:

Aiming to build a healthy ecosystem in line with industry standards, REGex Software offers a Winter Training/Internship Program on “Big Data”. The program is designed to improve the knowledge and skills of students and professionals, so that they can become experts in the field of Big Data and land their dream jobs in software development at big MNCs.

REGex Software Services’ Big Data course is a valuable resource for beginners and experts alike. It introduces you to Hadoop, HDFS, Hive, Apache Spark, Amazon EMR, and more, from basic to advanced level. If you are preparing for a coding interview, this course is for you.

Batch 1

Starting from 21st Dec, 2020
Timing: 08:30 PM – 10:30 PM (TTS)
Platform: Google Meet
Duration: 8 Weeks

Batch 2

Starting from 4th Jan, 2021
Timing: 08:30 PM – 10:30 PM (TTS)
Platform: Google Meet
Duration: 8 Weeks

Batch 3

Starting from 11th Jan, 2021
Timing: 08:30 PM – 10:30 PM (TTS)
Platform: Google Meet
Duration: 8 Weeks

What you will Learn

  • Big Data Analytics & Hadoop
  • HDFS [ Hadoop Distributed File System ]
  • Map-Reduce [ Data Processing ]
  • HIVE
  • Apache Spark on Azure Databricks
  • Neo4j Graph Analytics & NoSQL Database
  • Amazon EMR
  • Learn how to use these tools in the field of Data Analytics

Study Material

  • E-Notes
  • Daily Assignments
  • Daily Poll Tests
  • Weekly Tests
  • 60+ hours of on-demand Live Video Lectures
  • Offline Access to Lecture Videos & Notes
  • 24*7 Mentorship Support
  • Working on Live Projects


  • Helps you build a career in the Data Analytics domain
  • Ability to think outside the box
  • Expertise in Big Data tools such as HDFS, Hive, Apache Spark, and Amazon EMR
  • Ability to solve interview questions asked by top MNCs
  • Data Analyst packages at big MNCs start from 6 LPA

Live Sessions

Live sessions are led by expert trainers, and access to recorded sessions is also available.

Live Projects

Get a chance to work on industry-oriented projects and apply what you have learned.

24*7 Support

24*7 mentorship support is available to all students to clear their doubts.


REGex provides internship and job opportunities at different companies to its best students.

Course Content

Introduction to Python
Introduction to Python Programming
● Why do we need Python?
● Program structure in Python
Execution steps
● Interactive Shell
● Executable or script files
● User Interface or IDE
Data Types and Operations
● Numbers, Strings, List, Tuple, Dictionary
● Other Core Types
Statements and Syntax in Python
● Assignments, Expressions and prints
● If tests and Syntax Rules
● While and For Loops
● Iterations and Comprehensions
Functions in Python
● Function definition and call
● Function Scope, Arguments
● Function Objects
● Anonymous Functions
Modules and Packages-Basic
● Module Creations and Usage
● Package Creation and Importing
Classes in Python
● Classes and instances
● Class method calls
File Operations
● Opening a file
● Using Files
● Other File tools
● Importing a library
● Math, NumPy
Introduction to LINUX Operating System and Basic LINUX commands
● Introduction to LINUX Operating System and Basic LINUX commands
● Operating System
● Basic LINUX Commands
LINUX File System
● LINUX File System
● File Types
● File Permissions
● File Related Commands
● Filters
o Simple Filters
o Advanced Filters
Vi Editor
● Vi Editor
● Input Mode Commands
● Vi Editor – Save & Quit
● Cursor Movement Commands
Shell Programming
● Shell Variables
● Environmental Variables
● Shell script Commands
● Arithmetic Operations
● Command Substitution
● Command Line Arguments
Data Warehouse & Modeling Concepts
Business Intelligence
● Business Intelligence
● Need for Business Intelligence
● Terms used in BI
● Components of BI
General concept of Data Warehouse
● Data Warehouse
● History of Data Warehousing
● Need for Data Warehouse
● Data Warehouse Architecture
● Data Mining Works with DWH
● Features of Data warehouse
● Data Mart
● Application Areas
Dimensional modeling
● Dimensional modeling
● Fact and Dimension tables
● Database schema
● Schema Design for Modeling
● Star and Snowflake schemas
● Fact Constellation schema
● Use of Data mining
● Data mining and Business Intelligence
● Types of data used in Data mining
● Data mining applications
● Data mining products
Big Data Platform
Big Data Overview
● What’s Big Data?
● Big Data: 3V’s
● Explosion of Data
● What’s driving Big Data
● Applications for Big Data Analytics
● Big Data Use Cases
● Benefits of Big Data
● History of Hadoop
● Distributed File System
● What is Hadoop
● Characteristics of Hadoop
● RDBMS Vs Hadoop
● Hadoop Generations
● Components of Hadoop
● HDFS Blocks and Replication
● How Files Are Stored
● HDFS Commands
● Hadoop Daemons
Hadoop 2.0 & YARN
● Difference between Hadoop 1.0 and 2.0
● New Components in Hadoop 2.x
● Configuration Files in Hadoop 2.x
● Major Hadoop Distributors/Vendors
● Cluster Management & Monitoring
● Hadoop Downloads
Map Reduce
● What is distributed computing
● Introduction to Map Reduce
● Map Reduce components
● How MapReduce works
● Word Count execution
● Suitable & unsuitable use cases for MapReduce
● Architecture
● Basic Syntax
● Import data from a table in a relational database into HDFS
● Import the results of a query from a relational database into HDFS
● Import a table from a relational database into a new or existing Hive table
● Insert or update data from HDFS into a table in a relational database
Hive Programming
● Define a Hive-managed table
● Define a Hive external table
● Define a partitioned Hive table
● Define a bucketed Hive table
● Define a Hive table from a select query
● Define a Hive table that uses the ORCFile format
● Create a new ORCFile table from the data in an existing non-ORCFile Hive table
● Specify the delimiter of a Hive table
● Load data into a Hive table from a local directory
● Load data into a Hive table from an HDFS directory
● Load data into a Hive table as the result of a query
● Load a compressed data file into a Hive table
● Update a row in a Hive table
● Delete a row from a Hive table
● Insert a new row into a Hive table
● Join two Hive tables
● Use a subquery within a Hive query
Introduction to Scala
● An overview of functional programming
● Why Scala?
● Working with functions
● Objects and inheritance
● Working with lists and collections
● Abstract classes
Spark in Memory
SPARK Basics
● What is Spark?
● History of Spark
● Spark Architecture
● Spark Shell
Working with RDDs in Spark
● RDD Basics
● Creating RDDs in Spark
● RDD Operations
● Passing Functions to Spark
● Transformations and Actions in Spark
● Spark RDD Persistence
Working with Key/Value Pairs
● Pair RDDs
● Transformations on Pair RDDs
● Actions Available on Pair RDDs
● Data Partitioning (Advanced)
● Loading and Saving the Data
Spark Advanced
● Accumulators
● Broadcast Variables
● Piping to External Programs
● Numeric RDD Operations
● Spark Runtime Architecture
● Deploying Applications
● Spark SQL Overview
● Spark SQL Architecture
DataFrames
● What are DataFrames?
● Manipulating DataFrames
● Reading data from different file formats
● Group By & aggregation functions
Spark streaming
● What is Spark streaming?
● Spark Streaming example
NoSQL Databases
Introduction to HBase
● Introduction to HBase
● Comparison with traditional databases
● HBase Data Model (Logical and Physical models)
● HBase Architecture
● Regions and Region Servers
● Partitions
● Compaction (Major and Minor)
● Shell Commands
● HBase using APIs
Talend Basics
● Pre-requisites
● Introduction
● Architecture
Talend Data Integration
● Installation and Configuration
● Repository
● Projects
● Metadata Connection
● Context Parameters
● Jobs / Joblets
● Components
● Important components
● Aggregation & working with Input & output data
Pseudo Live Project (PLP)
● The Pseudo Live Project (PLP) program is primarily meant to handhold participants who are new to the technology. In PLP, more importance is given to “Process Adherence”.
● The following SDLC activities are carried out during PLP
o Requirement Analysis
o Design (High Level Design and Low Level Design)
o Design of UTP (Unit Test Plan) with test cases
o Coding
o Code Review
o Testing
o Deployment
o Configuration Management
o Final Presentation

Fee Structure

(for each program)

Indian Fee

Price: ₹4999/- (Flat 80% off) => ₹999/- 

International Fee

Price: $100 (Flat 70% off) => $30 
