Data Engineering (BigData) Summer Internship Program

(Batches Start in May, June & July 2025)

About The Program

At Regex Software, we offer one of the most comprehensive Data Engineering Summer Internship Programs, specifically designed for aspiring data professionals who want to dive deep into the world of Big Data, data pipelines, and cloud infrastructure. With data now at the center of every decision-making process, our program equips learners with the essential tools and skills to manage, process, and analyze vast datasets using modern technology.

This specialization covers key areas such as Big Data architecture, distributed systems, Apache Spark, Hadoop, Hive, HDFS, Amazon EMR, and data modeling. Our Data Engineering program ensures hands-on experience through real-world projects and personalized mentorship. Whether you’re a beginner or a professional aiming to upskill, Regex Software’s training is tailored to industry standards and hiring trends.

Key Benefits & Perks

  • Get Summer Internship Offer Letter
  • Need to Spend Min. 5 hours with REGex
  • Get Internship Project Completion Certificate
  • No Previous Knowledge Required
  • Get Summer Training Certificate
  • Get Performance-Based Letter of Recommendation (LOR)

May Batch Dates

Batch 1: 05th May 2025
Batch 2: 12th May 2025
Batch 3: 19th May 2025
Batch 4: 26th May 2025

June Batch Dates

Batch 1: 02nd June 2025
Batch 2: 09th June 2025
Batch 3: 16th June 2025
Batch 4: 23rd June 2025
Batch 5: 30th June 2025

July Batch Dates

Batch 1: 07th July 2025
Batch 2: 14th July 2025
Batch 3: 21st July 2025
Batch 4: 28th July 2025

Weekly Duration

20 Hours Per week

Location

Physical (Jaipur)
or 
Online (Google Meet)

Duration

45 – 60 Days

Participants

25 – 30 per Batch

What you will Learn

  • Linux Basics
  • Big Data Analytics & Hadoop
  • HDFS [ Hadoop Distributed File System ]
  • MapReduce [ Data Processing ]
  • Hive
  • Apache Spark on Azure Databricks
  • Neo4j Graph Analytics & NoSQL Databases
  • Amazon EMR
  • Learn how to use these tools in the field of Data Analytics

Study Material

  • E-Notes
  • Daily Assignments
  • Daily Poll Tests
  • Live & Hands-on Sessions
  • Access to Lecture Videos & Notes
  • 24/7 Mentorship Support
  • Working on Live Projects

Output

  • Build a Foundation in the Data Engineering Domain
  • Expertise in Big Data Tools like HDFS, Hive, Apache Spark, Amazon EMR
  • Understand the End-to-End Data Workflow
  • Out-of-the-Box Problem-Solving Skills

Why Choose Us

Live Sessions

Live Sessions by Expert Trainers; Access to Recorded Sessions is also available.

Live Projects

Get a chance to work on Industry-Oriented Projects to implement your learning.

24/7 Support

24/7 Mentorship Support is available for all Students to clear all of your doubts.

Opportunities

REGex provides Internship / Job opportunities to the best Students in different Companies.

Course Content

  • Basics of Python 
  • OOP Concepts
  • File & Exception Handling
  • Working with Pandas, NumPy & Matplotlib
    ■ Working with Missing Data
    ■ Data Grouping
    ■ Data Subsetting
    ■ Merging & Joining Data Frames
  • Importing Libraries & Datasets

● History of Hadoop
● Distributed File System
● What is Hadoop
● Characteristics of Hadoop
● RDBMS Vs Hadoop
● Hadoop Generations
● Components of Hadoop
● HDFS Blocks and Replication
● How Files Are Stored
● HDFS Commands
● Hadoop Daemons

● Difference between Hadoop 1.0 and 2.0
● New Components in Hadoop 2.x
● YARN/MRv2
● Configuration Files in Hadoop 2.x
● Major Hadoop Distributors/Vendors
● Cluster Management & Monitoring
● Hadoop Downloads

● What is Spark?
● History of Spark
● Spark Architecture
● Spark Shell

● RDD Basics
● Creating RDDs in Spark
● RDD Operations
● Passing Functions to Spark
● Transformations and Actions in Spark
● Spark RDD Persistence

● Pair RDDs
● Transformations on Pair RDDs
● Actions Available on Pair RDDs
● Data Partitioning (Advanced)
● Loading and Saving the Data

● Accumulators
● Broadcast Variables
● Piping to External Programs
● Numeric RDD Operations
● Spark Runtime Architecture
● Deploying Applications

● Spark SQL Overview
● Spark SQL Architecture

● What is Spark Streaming?
● Spark Streaming Example

  • EC2 Server
  • S3 Bucket
  • Lambda
  • EMR
  • Glue
  • Athena

  • What is Snowflake?
  • Why Use Snowflake?
  • Key Features

  • Introduction to Databricks
  • All-in-One Platform
  • Notebook Interface

Note: Content may be Subject to Change by REGex as per Requirement

Extra Sessions

Additional Sessions on Git, Linux, Docker, AWS Basics, Jenkins and many more for all students.

Projects you may work on

Live Client Projects With Development Team Under The Guidance Of Mentor

Fee Structure

Indian Fee
(Physical)

[Summer Batch(45-60 days)]

Price: ₹10,000/- (Flat 40% off) => ₹6,000/-

Indian Fee
(Online)

[Summer Batch(45-60 days)]

Price: ₹10,000/- (Flat 40% off) => ₹6,000/-

For Frequent Course Updates and Information, Join our Telegram Group

Join 100% Placement Guaranteed Programs

For Webinar Videos and Demo Sessions, Join our YouTube Channel

Enroll Now
