Data Engineering (BigData)

(Batches Start from July, August, September & October 2024)

About The Program

To help build a healthy ecosystem that meets industry standards, REGex Software offers a Summer Industrial Internship & Training Program on “Data Engineering (BigData)”. We organize training and internship programs to improve the knowledge and skills of students and professionals, so that they can become experts in the field of Big Data and land their dream job in software development at big MNCs.

REGex Software Services’ BigData program is a valuable resource for beginners and experts alike. It introduces you to Hadoop, HDFS, Hive, Apache Spark, Amazon EMR and more, from basic to advanced level. If you want to become a Big Data Analyst, this program is for you.

Key Benefits & Perks

  • Get Summer Internship Offer Letter
  • Need to Spend Min. 5 hours with REGex
  • Get Internship Project Completion Certificate
  • No Previous Knowledge Required
  • Get Summer Training Certificate
  • Get Performance based Letter of Recommendation (LOR)

July Batches Dates

Batch 1: 1st July 2024
Batch 2: 8th July 2024
Batch 3: 19th July 2024
Batch 4: 26th July 2024

August Batches Dates

Batch 1: 6th August 2024
Batch 2: 17th August 2024
Batch 3: 27th August 2024

September Batches Dates

Batch 1: 7th September 2024
Batch 2: 17th September 2024
Batch 3: 24th September 2024

October Batches Dates

Batch 1: 1st October 2024
Batch 2: 8th October 2024
Batch 3: 19th October 2024
Batch 4: 26th October 2024

Weekly Duration

20 Hours per Week

Mode: Physical (Jaipur) or Online (Google Meet)
Duration: 45 – 60 Days (Summer Batch) or 3 – 4 Months (Regular Batch)
Batch Size: 25 – 30 per Batch

What you will Learn

  • Linux basics 
  • Big Data Analytics & Hadoop
  • HDFS [ Hadoop Distributed File System ]
  • Map-Reduce [ Data Processing ]
  • HIVE
  • Apache Spark on Azure Databricks
  • Neo4j Graph Analytics & NoSQL Databases
  • Amazon EMR
  • Learn how to use these tools in the field of Data Analytics

Study Material

  • E-Notes
  • Daily assignments
  • Daily poll tests
  • 60+ hours of live video lectures, available on demand
  • Access to lecture videos & notes
  • 24/7 mentorship support
  • Work on live projects


  • Helps you build skills in the Data Analytics domain
  • Able to think outside the box
  • Expertise in Big Data tools such as HDFS, Hive, Apache Spark and Amazon EMR
  • Able to solve interview questions asked by top MNCs
  • Able to get a Data Analyst package of up to 30 LPA at big MNCs

Why Choose Us

Live Sessions

Live sessions by expert trainers, with access to recorded sessions also available.

Live Projects

Get a chance to work on industry-oriented projects to apply what you learn.

24/7 Support

24/7 mentorship support is available to all students to clear your doubts.

REGex provides internship and job opportunities at different companies to the best students.

Course Content

  • Basics of Python 
  • OOPs Concepts
  • File & Exception Handling
  • Working with Pandas, NumPy & Matplotlib
    ■ Working with Missing Data
    ■ Data Grouping
    ■ Data Subsetting
    ■ Merging & Joining Data Frames
  • Importing Libraries & Datasets

● Introduction to LINUX Operating System and Basic LINUX commands
● Operating System
● Basic LINUX Commands

● LINUX File System
● File Types
● File Permissions
● File Related Commands
● Filters
o Simple Filters
o Advanced Filters

● Vi Editor
● Input Mode Commands
● Vi Editor – Save & Quit
● Cursor Movement Commands

● Shell Variables
● Environmental Variables
● Shell script Commands
● Arithmetic Operations
● Command Substitution
● Command Line Arguments

● Business Intelligence
● Need for Business Intelligence
● Terms used in BI
● Components of BI

● Data Warehouse
● History of Data Warehousing
● Need for Data Warehouse
● Data Warehouse Architecture
● How Data Mining Works with DWH
● Features of Data Warehouse
● Data Mart
● Application Areas

● Dimensional Modeling
● Fact and Dimension tables
● Database schema
● Schema Design for Modeling
● Star and Snowflake schemas
● Fact Constellation schema
● Use of Data mining
● Data mining and Business Intelligence
● Types of data used in Data mining
● Data mining applications
● Data mining products

● What’s Big Data?
● Big Data: 3V’s
● Explosion of Data
● What’s driving Big Data
● Applications for Big Data Analytics
● Big Data Use Cases
● Benefits of Big Data

● History of Hadoop
● Distributed File System
● What is Hadoop
● Characteristics of Hadoop
● RDBMS Vs Hadoop
● Hadoop Generations
● Components of Hadoop
● HDFS Blocks and Replication
● How Files Are Stored
● HDFS Commands
● Hadoop Daemons

● Difference between Hadoop 1.0 and 2.0
● New Components in Hadoop 2.x
● Configuration Files in Hadoop 2.x
● Major Hadoop Distributors/Vendors
● Cluster Management & Monitoring
● Hadoop Downloads

● What is distributed computing
● Introduction to Map Reduce
● Map Reduce components
● How MapReduce works
● Word Count execution
● Suitable & unsuitable use cases for MapReduce

● Architecture
● Basic Syntax
● Import data from a table in a relational database into HDFS
● Import the results of a query from a relational database into HDFS
● Import a table from a relational database into a new or existing Hive table
● Insert or update data from HDFS into a table in a relational database

● Define a Hive-managed table
● Define a Hive external table
● Define a partitioned Hive table
● Define a bucketed Hive table
● Define a Hive table from a select query
● Define a Hive table that uses the ORCFile format
● Create a new ORCFile table from the data in an existing non-ORCFile Hive table
● Specify the delimiter of a Hive table
● Load data into a Hive table from a local directory
● Load data into a Hive table from an HDFS directory
● Load data into a Hive table as the result of a query
● Load a compressed data file into a Hive table
● Update a row in a Hive table
● Delete a row from a Hive table
● Insert a new row into a Hive table
● Join two Hive tables
● Use a subquery within a Hive query

● An overview of functional programming
● Why Scala?
● Working with functions
● Objects and inheritance
● Working with lists and collections
● Abstract classes

● What is Spark?
● History of Spark
● Spark Architecture
● Spark Shell

● RDD Basics
● Creating RDDs in Spark
● RDD Operations
● Passing Functions to Spark
● Transformations and Actions in Spark
● Spark RDD Persistence

● Pair RDDs
● Transformations on Pair RDDs
● Actions Available on Pair RDDs
● Data Partitioning (Advanced)
● Loading and Saving the Data

● Accumulators
● Broadcast Variables
● Piping to External Programs
● Numeric RDD Operations
● Spark Runtime Architecture
● Deploying Applications

● Spark SQL Overview
● Spark SQL Architecture

● What are DataFrames?
● Manipulating DataFrames
● Reading data from different file formats
● Group By & Aggregation functions

● What is Spark streaming?
● Spark Streaming example

● Introduction to HBase
● Comparison with traditional databases
● HBase Data Model (Logical and Physical models)
● HBase Architecture
● Regions and Region Servers
● Partitions
● Compaction (Major and Minor)
● Shell Commands
● HBase using APIs

● Pre-requisites
● Introduction
● Architecture

● Installation and Configuration
● Repository
● Projects
● Metadata Connection
● Context Parameters
● Jobs / Joblets
● Components
● Important components
● Aggregation & working with Input & output data

● The Pseudo Live Project (PLP) program is primarily meant to handhold participants who are new to the technology. In PLP, more importance is given to “Process Adherence”.
● The following SDLC activities are carried out during PLP
o Requirement Analysis
o Design (High-Level Design and Low-Level Design)
o Design of UTP (Unit Test Plan) with test cases
o Coding
o Code Review
o Testing
o Deployment
o Configuration Management
o Final Presentation

Note: Content may be subject to change by REGex as per requirements.

Extra Sessions

Additional sessions on Git, Linux, Docker, AWS basics, Jenkins and more for all students.

Projects you may work on

Live client projects with the development team, under the guidance of a mentor

Fee Structure

Indian Fee

[Summer Batch(45-60 days)]

Price: ₹59,999/- (Flat 75% off) => ₹14,999/- => ₹8,000/-
(Flat 50% Extra off – Limited Period Special Offer)

Indian Fee (Physical)

[Regular Batch(3-4 Months)]

Price: ₹59,999/- (Flat 75% off) => ₹14,999/-  
(Limited Period Special Offer)

Indian Fee (Online)

[Summer & Regular Batch]

Price: ₹59,999/- (Flat 75% off) => ₹14,999/- => ₹7,500/-
(Flat 50% off – Limited Period Special Offer)

International Fee

Price: $1200 (Flat 75% off) => $300
(Limited Period Special Offer)

The fee can be paid as a no-cost EMI over 6 months @ 2500/month.

Cashback Policy

  • You will get your Unique Referral Code after successful paid registration.
  • You will get up to ₹1000 cashback directly in your account for each paid registration made with your Unique Referral Code (after registrations for this program close).
  • For example, if we receive 10 paid registrations through your Unique Referral Code, you will receive up to ₹1000 × 10 = ₹10,000.

For frequent course updates and information, join our Telegram Group

Summer Industrial Internship/Training Program 2024

For webinar videos and demo sessions, join our YouTube Channel

Enroll Now

*The extra discount applies to one-time payment only. Seats may fill up or the price may increase at any time. Refunds are not available.*