Data Engineering Specialization With Cloud

(Batches Start from 8th, 19th, 29th January 2024)

About The Program

With the aim of building a healthy ecosystem aligned with industry standards, REGex Software brings you a Training/Internship Program on “Big Data & AWS Cloud”. We organize this Training/Internship Program to improve the knowledge and skills of students and professionals so that they can become experts in the field of Big Data and land their dream job in software development at large MNCs.

REGex Software Services’ Data Engineering Specialization with Cloud is a valuable resource for both beginners and experts. This specialization program introduces you to the domains of Data Engineering and Cloud, including Hadoop, MapReduce, Hive, Apache Spark, Kafka streaming, SQL, Amazon EMR, and connecting, managing, deploying and updating cloud services, and much more, from basics to advanced. If you want to become a Data Engineer or Business Analyst, REGex has designed this program for you.

Weekly Duration

20 Hours Per week

Location

Physical (Jaipur)
or 
Online (Google Meet)

Duration

6 Months

Participants

25 – 30 per Batch

What you will Learn

Python

Duration: 20 Hours

SQL

Duration: 60 Hours

Hadoop

Duration: 20 Hours

Hive

Duration: 20 Hours

Spark

Duration: 40 Hours

Apache Kafka

Duration: 20 Hours

NoSQL

Duration: 20 Hours

Amazon EMR

Duration: 10 Hours

AWS

Duration: 10 Hours

What you will Learn

  • Linux basics
  • Big Data Analytics & Hadoop
  • HDFS [ Hadoop Distributed File System ]
  • Map-Reduce [ Data Processing ]
  • HIVE
  • Apache Spark on Azure DataBricks
  • NoSQL DataBase
  • Data visualization
  • SQL
  • Power Query & Editor
  • Dashboard & Graph
  • Amazon EMR
  • Learn how to use these tools in the field of Data Analytics
  • AWS Foundations and Services
  • AWS Security & Costs
  • AWS Cloud Services Overview
  • Compute Services Design, Implementation & Management
  • Identity and Access Management (IAM)
  • Auto Scaling Solutions
  • Virtual Network Services – DNS
  • AWS Application Deployment
  • AWS Database Design & Deployment
  • Additional AWS Services.

Study Material

  • E-Notes
  • Poll Tests & Assignments
  • 300+ hours of live video lectures available on demand
  • Access to lecture videos and notes
  • 24*7 Mentorship Support
  • Real-time project assignments

Output

  • Able to think outside the box
  • Expertise in different Big Data tools like HDFS, Hive, Apache Spark and Amazon EMR
  • Work on multiple projects and get the opportunity for an internship at REGex or another company through us
  • Understand how to create data insights by connecting data sets, transforming and cleaning the data into data models, and then creating charts/graphs to visualize the data
  • Become a Data Engineer after completing this program
  • Able to get a package of up to 30 LPA

Why Choose Us

Live Sessions

Live sessions by expert trainers; access to recorded sessions is also available.

Live Projects
Get a chance to work on industry-oriented projects to implement your learning.
24*7 Support
24*7 mentorship support is available for all students to clear all of their doubts.
Opportunities
REGex provides internship/job opportunities to the best students in different companies.


Course Content

  • Basics of Python
  • OOPs Concepts
  • File & Exception Handling
  • Working with Pandas, NumPy & Matplotlib
  • Working with Missing Data
  • Data Grouping
  • Data Subsetting
  • Merging & Joining DataFrames
  • Importing Libraries & Datasets
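To give a flavour of the Pandas topics listed above (missing data, grouping, subsetting, merging), here is a minimal sketch; the data and column names are made up for illustration.

```python
import pandas as pd
import numpy as np

# Hypothetical sales data with a missing value
sales = pd.DataFrame({
    "region": ["north", "north", "south", "south"],
    "amount": [100.0, np.nan, 250.0, 80.0],
})
regions = pd.DataFrame({
    "region": ["north", "south"],
    "manager": ["Asha", "Ravi"],
})

sales["amount"] = sales["amount"].fillna(0)                       # working with missing data
north_only = sales[sales["region"] == "north"]                    # data subsetting
totals = sales.groupby("region", as_index=False)["amount"].sum()  # data grouping
report = totals.merge(regions, on="region")                       # merging / joining DataFrames
print(report)
```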

  • Introduction to the LINUX Operating System and basic LINUX commands
  • Operating System
  • Basic LINUX Commands

Linux File System
  • LINUX File System
  • File Types
  • File Permissions
  • File Related Commands
  • Filters
  • Simple Filters
  • Advanced Filters
Vi Editor
  • Vi Editor
  • Input Mode Commands
  • Vi Editor – Save & Quit
  • Cursor Movement Commands
Shell Programming
  • Shell Variables
  • Environmental Variables
  • Shell script Commands
  • Arithmetic Operations
  • Command Substitution
  • Command Line Arguments
  •  Business Intelligence
  •  Need for Business Intelligence
  • Terms used in BI
  • Components of BI
General concept of Data Warehouse
  • Data Warehouse
  • History of Data Warehousing
  • Need for Data Warehouse
  • Data Warehouse Architecture
  • Data Mining Works with DWH
  • Features of Data warehouse
  • Data Mart
  • Application Areas
Dimensional modeling
  • Dimension modeling
  • Fact and Dimension tables
  • Database schema
  • Schema Design for Modeling
  • Star, SnowFlake
  • Fact Constellation schema
  • Use of Data mining
  • Data mining and Business Intelligence
  • Types of data used in Data mining
  • Data mining applications
  • Data mining Products

● What’s Big Data?
● Big Data: 3V’s
● Explosion of Data
● What’s driving Big Data
● Applications for Big Data Analytics
● Big Data Use Cases
● Benefits of Big Data

    • Functional Dependency
    • Closure of Attributes
    • Types of Keys: Primary Key, Candidate Key & Super Key in DBMS
    • Normalization
    • Indexing
    • Transaction and Concurrency Control
    • Transaction in DBMS
    • ACID Properties in DBMS
    • Joins in DBMS
    • Create & Alter Table
    • Constraints in SQL
    • SQL Queries & Subqueries
    • SQL Stored Procedure
    • View, Cursor & Trigger in SQL
    • Common Table Expression
    • Replace Null and Coalesce Function
    • Running Total In SQL
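A small sketch of a few of the SQL topics above (Common Table Expression, COALESCE, running total), run through Python's built-in sqlite3 module; the sales table and its columns are hypothetical, and window functions assume SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (id INTEGER PRIMARY KEY, region TEXT, amount REAL);
    INSERT INTO sales (region, amount) VALUES
        ('north', 100), ('north', NULL), ('south', 250), ('south', 80);
""")

query = """
WITH cleaned AS (                                      -- Common Table Expression
    SELECT id, region, COALESCE(amount, 0) AS amount   -- replace NULL with 0
    FROM sales
)
SELECT region, amount,
       SUM(amount) OVER (PARTITION BY region ORDER BY id) AS running_total
FROM cleaned;
"""
for row in conn.execute(query):
    print(row)
```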
 

Big Data Tools

● History of Hadoop
● Distributed File System
● What is Hadoop
● Characteristics of Hadoop
● RDBMS Vs Hadoop
● Hadoop Generations
● Components of Hadoop
● HDFS Blocks and Replication
● How Files Are Stored
● HDFS Commands
● Hadoop Daemons
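The HDFS commands above can also be driven from Python. The sketch below simply wraps a few `hdfs dfs` subcommands with subprocess; it assumes a configured Hadoop client on PATH, and the paths and file names are placeholders.

```python
import subprocess

def hdfs(*args):
    """Run an `hdfs dfs` subcommand and return its output."""
    return subprocess.run(["hdfs", "dfs", *args],
                          check=True, capture_output=True, text=True).stdout

hdfs("-mkdir", "-p", "/user/demo")             # create a directory in HDFS
hdfs("-put", "local_data.csv", "/user/demo/")  # copy a local file into HDFS
print(hdfs("-ls", "/user/demo"))               # list the directory
print(hdfs("-cat", "/user/demo/local_data.csv")[:200])  # read file contents
```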

● Difference between Hadoop 1.0 and 2.0
● New Components in Hadoop 2.x
● YARN/MRv2
● Configuration Files in Hadoop 2.x
● Major Hadoop Distributors/Vendors
● Cluster Management & Monitoring
● Hadoop Downloads

● What is distributed computing
● Introduction to Map Reduce
● Map Reduce components
● How MapReduce works
● Word Count execution
● Suitable & unsuitable use cases for MapReduce
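Word count is the classic MapReduce example referenced above. Below is a minimal Hadoop Streaming style sketch in Python; the file name and the local pipeline shown in the docstring are illustrative assumptions.

```python
#!/usr/bin/env python3
"""Word-count sketch in the Hadoop Streaming style (mapper and reducer in one file).

Hypothetical local usage, without a cluster:
    python wordcount.py map < input.txt | sort | python wordcount.py reduce
"""
import sys

def mapper():
    # Map phase: emit a (word, 1) pair per word
    for line in sys.stdin:
        for word in line.split():
            print(f"{word}\t1")

def reducer():
    # Reduce phase: input arrives sorted by key, so sum consecutive runs of the same word
    current, count = None, 0
    for line in sys.stdin:
        word, _, value = line.rstrip("\n").partition("\t")
        if word == current:
            count += int(value)
        else:
            if current is not None:
                print(f"{current}\t{count}")
            current, count = word, int(value)
    if current is not None:
        print(f"{current}\t{count}")

if __name__ == "__main__":
    mapper() if sys.argv[1:] == ["map"] else reducer()
```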

● Architecture
● Basic Syntax
● Import data from a table in a relational database into HDFS
● Import the results of a query from a relational database into HDFS
● Import a table from a relational database into a new or existing Hive table
● Insert or update data from HDFS into a table in a relational database

● Define a Hive-managed table
● Define a Hive external table
● Define a partitioned Hive table
● Define a bucketed Hive table
● Define a Hive table from a select query
● Define a Hive table that uses the ORCFile format
● Create a new ORCFile table from the data in an existing non-ORCFile Hive table
● Specify the delimiter of a Hive table
● Load data into a Hive table from a local directory
● Load data into a Hive table from an HDFS directory
● Load data into a Hive table as the result of a query
● Load a compressed data file into a Hive table
● Update a row in a Hive table
● Delete a row from a Hive table
● Insert a new row into a Hive table
● Join two Hive tables
● Use a subquery within a Hive query
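A short sketch of a few of the Hive operations above (partitioned ORC table, loading from a query, join with a subquery), issued as HiveQL through Spark SQL; it assumes a Spark build with Hive support and a configured metastore, and all table and column names are hypothetical.

```python
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("hive-sketch")
         .enableHiveSupport()     # lets spark.sql() run HiveQL against the metastore
         .getOrCreate())

# Partitioned, ORC-backed managed table (hypothetical schema)
spark.sql("""
    CREATE TABLE IF NOT EXISTS demo_orders (
        order_id INT,
        amount   DOUBLE
    )
    PARTITIONED BY (order_date STRING)
    STORED AS ORC
""")

# Load one partition from the result of a query (staging_orders is hypothetical)
spark.sql("""
    INSERT INTO demo_orders PARTITION (order_date = '2024-01-08')
    SELECT id, total FROM staging_orders WHERE dt = '2024-01-08'
""")

# Join a table with a subquery (customers is hypothetical)
spark.sql("""
    SELECT o.order_id, c.name
    FROM demo_orders o
    JOIN (SELECT * FROM customers WHERE active = 1) c
      ON o.order_id = c.last_order_id
""").show()
```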

● What is Spark?
● History of Spark
● Spark Architecture
● Spark Shell

● RDD Basics
● Creating RDDs in Spark
● RDD Operations
● Passing Functions to Spark
● Transformations and Actions in Spark
● Spark RDD Persistence

● Pair RDDs
● Transformations on Pair RDDs
● Actions Available on Pair RDDs
● Data Partitioning (Advanced)
● Loading and Saving the Data
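A compact PySpark sketch of the RDD topics above (creating RDDs, transformations and actions, pair RDDs, persistence); the input strings are made up for illustration.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("rdd-sketch").getOrCreate()
sc = spark.sparkContext

lines = sc.parallelize(["spark makes rdds", "rdds are resilient", "spark is fast"])

words = lines.flatMap(lambda line: line.split())   # transformation
pairs = words.map(lambda w: (w, 1))                # pair RDD
counts = pairs.reduceByKey(lambda a, b: a + b)     # transformation on a pair RDD
counts.persist()                                   # RDD persistence

print(counts.collect())                            # action
print(counts.sortByKey().take(3))                  # action on a pair RDD

spark.stop()
```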

● Accumulators
● Broadcast Variables
● Piping to External Programs
● Numeric RDD Operations
● Spark Runtime Architecture
● Deploying Applications

  •  Spark SQL Overview
  • Spark SQL Architecture

Data Frame

  • What are DataFrames?
  • Manipulating DataFrames
  • Reading data from different file formats
  • Group By & aggregation functions
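A minimal DataFrame sketch covering the points above (reading a file, manipulating columns, group by and aggregation); the CSV path and column names are assumptions.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("dataframe-sketch").getOrCreate()

# Reading data from a file format (hypothetical path and columns)
orders = spark.read.csv("orders.csv", header=True, inferSchema=True)

# Manipulating DataFrames: filter, derive a column, then group and aggregate
summary = (orders
           .filter(F.col("amount") > 0)
           .withColumn("year", F.year("order_date"))
           .groupBy("region", "year")
           .agg(F.sum("amount").alias("total_amount"),
                F.count(F.lit(1)).alias("num_orders")))

summary.show()
spark.stop()
```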

Spark Streaming

  • What is Spark Streaming?
  • Spark Streaming example

  • Understand the fundamentals of Kafka.
  • Understand the distributed nature of Kafka and its scalability.
  • Understand how data is organized into topics and partitions.
  • Install and set up Kafka on your local machine or a cluster.
  • Learn how to create topics, produce messages, and consume messages using
    Kafka APIs.
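To illustrate producing and consuming messages with a Kafka API, here is a minimal sketch using the third-party kafka-python package; the broker address and topic name are assumptions, and the topic is expected to exist already.

```python
from kafka import KafkaProducer, KafkaConsumer

BROKER = "localhost:9092"   # hypothetical broker
TOPIC = "demo-events"       # hypothetical topic

# Produce a few messages
producer = KafkaProducer(bootstrap_servers=BROKER)
for i in range(3):
    producer.send(TOPIC, value=f"event-{i}".encode("utf-8"))
producer.flush()

# Consume them from the beginning of the topic
consumer = KafkaConsumer(
    TOPIC,
    bootstrap_servers=BROKER,
    auto_offset_reset="earliest",
    consumer_timeout_ms=5000,   # stop iterating when no new messages arrive
)
for message in consumer:
    print(message.partition, message.offset, message.value.decode("utf-8"))
```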

  • Overview of Amazon EMR and its features.
  • Setting up and configuring Amazon EMR clusters.
  • Running big data processing jobs on EMR.
  • Integrating Amazon EMR with other AWS services.
  • Monitoring and optimizing EMR clusters.
  • Security considerations for EMR.
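Setting up an EMR cluster can also be scripted with boto3 (the AWS SDK for Python). The sketch below is illustrative only: the region, cluster name, instance types and log bucket are placeholders, and it assumes the default EMR service roles already exist in the account.

```python
import boto3

emr = boto3.client("emr", region_name="us-east-1")    # hypothetical region

response = emr.run_job_flow(
    Name="demo-emr-cluster",                          # hypothetical cluster name
    ReleaseLabel="emr-6.15.0",
    Applications=[{"Name": "Spark"}, {"Name": "Hive"}],
    LogUri="s3://my-demo-bucket/emr-logs/",           # hypothetical log bucket
    Instances={
        "InstanceGroups": [
            {"InstanceRole": "MASTER", "InstanceType": "m5.xlarge", "InstanceCount": 1},
            {"InstanceRole": "CORE",   "InstanceType": "m5.xlarge", "InstanceCount": 2},
        ],
        "KeepJobFlowAliveWhenNoSteps": True,
    },
    JobFlowRole="EMR_EC2_DefaultRole",
    ServiceRole="EMR_DefaultRole",
)
print("Cluster id:", response["JobFlowId"])
```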

● Introduction of HBase
● Comparison with traditional database
● HBase Data Model (Logical and Physical models)
● HBase Architecture
● Regions and Region Servers
● Partitions
● Compaction (Major and Minor)
● Shell Commands
● HBase using APIs
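A brief sketch of HBase access through an API, using the third-party happybase library over the HBase Thrift server; the host, table and column-family names are hypothetical and the table is assumed to already exist (for example, created from the HBase shell).

```python
import happybase

# Connect to a (hypothetical) HBase Thrift server
connection = happybase.Connection("localhost", port=9090)
table = connection.table("demo_users")     # hypothetical table with family 'info'

# Put a row: column names are 'family:qualifier', values are bytes
table.put(b"user-001", {b"info:name": b"Asha", b"info:city": b"Jaipur"})

# Get a single row back
row = table.row(b"user-001")
print(row[b"info:name"].decode())

# Scan a range of row keys
for key, data in table.scan(row_prefix=b"user-"):
    print(key, data)

connection.close()
```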

NoSQL

  • Introduction to NoSQL databases and their characteristics.
  • Types of NoSQL databases: Document-oriented, Key-Value, Column-Family, Graph.
  • Use cases for NoSQL databases.
  • MongoDB: A popular document-oriented NoSQL database.
  • Redis: A widely used key-value NoSQL database.
  • Cassandra: A column-family NoSQL database.
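As a concrete taste of a document-oriented NoSQL database from the list above, here is a minimal pymongo sketch; it assumes a local MongoDB server and uses made-up database and collection names.

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")   # hypothetical local server
db = client["demo_db"]
students = db["students"]

# Insert a document (schema-less: each document is just a JSON-like dict)
students.insert_one({"name": "Asha", "course": "Data Engineering", "score": 88})

# Query documents with a filter
for doc in students.find({"score": {"$gte": 80}}):
    print(doc["name"], doc["score"])

# Update a document in place
students.update_one({"name": "Asha"}, {"$set": {"score": 91}})
```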

  • Understand the fundamentals of ETL (Extract, Transform, Load) processes.
  • Learn how to install and configure Talend.
  • Explore Talend’s interface and understand its key components.
  • Practice using Talend to extract data from different sources, perform transformations, and load it into target systems.

● Pre-requisites
● Introduction
● Architecture

● Installation and Configuration
● Repository
● Projects
● Metadata Connection
● Context Parameters
● Jobs / Joblets
● Components
● Important components
● Aggregation & working with Input & output data

● The Pseudo Live Project (PLP) program is primarily meant to handhold participants who are new to the technology. In PLP, more importance is given to “Process Adherence”.
● The following SDLC activities are carried out during PLP:
o Requirement Analysis
o Design (High-Level Design and Low-Level Design)
o Design of UTP (Unit Test Plan) with test cases
o Coding
o Code Review
o Testing
o Deployment
o Configuration Management
o Final Presentation

AWS Cloud

1. Design Resilient Architectures

  • AWS global infrastructure (for example, Availability Zones, AWS Regions, Amazon Route 53)
  • AWS managed services with appropriate use cases (for example, Amazon Comprehend, Amazon Polly)
  • Basic networking concepts (for example, route tables)
  • Disaster recovery (DR) strategies (for example, backup and restore, pilot light, warm standby, active-active failover, recovery point objective [RPO], recovery time objective [RTO])
  • Distributed design patterns
  • Failover strategies
  • Immutable infrastructure
  • Load balancing concepts (for example, Application Load Balancer)
  • Proxy concepts (for example, Amazon RDS Proxy)
  • Service quotas and throttling (for example, how to configure the service quotas for a workload in a standby environment)
  • Storage options and characteristics (for example, durability, replication)
  • Workload visibility (for example, AWS X-Ray)

Skills in:
  • Determining automation strategies to ensure infrastructure integrity
  • Determining the AWS services required to provide a highly available and/or fault-tolerant architecture across AWS Regions or Availability Zones
  • Identifying metrics based on business requirements to deliver a highly available solution
  • Implementing designs to mitigate single points of failure
  • Implementing strategies to ensure the durability and availability of data (for example, backups)
  • Selecting an appropriate DR strategy to meet business requirements
  • Using AWS services that improve the reliability of legacy applications and applications not built for the cloud (for example, when application changes are not possible)
  • Using purpose-built AWS services for workloads

2. Design High-Performing Architectures

  • Hybrid storage solutions to meet business requirements
  • Storage services with appropriate use cases (for example, Amazon S3, Amazon Elastic File System [Amazon EFS], Amazon Elastic Block Store [Amazon EBS])
  • Storage types with associated characteristics (for example, object, file, block)

Skills in:
  • Determining storage services and configurations that meet performance demands
  • Determining storage services that can scale to accommodate future needs
  • AWS compute services with appropriate use cases (for example, AWS Batch, Amazon EMR, Fargate)
  • Distributed computing concepts supported by AWS global infrastructure and edge services
  • Queuing and messaging concepts (for example, publish/subscribe)
  • Scalability capabilities with appropriate use cases (for example, Amazon EC2 Auto Scaling, AWS Auto Scaling)
  • Serverless technologies and patterns (for example, Lambda, Fargate)
  • The orchestration of containers (for example, Amazon ECS, Amazon EKS)

Skills in:
  • Decoupling workloads so that components can scale independently
  • Identifying metrics and conditions to perform scaling actions
  • Selecting the appropriate compute options and features (for example, EC2 instance types) to meet business requirements
  • Selecting the appropriate resource type and size (for example, the amount of Lambda memory) to meet business requirements
  • AWS global infrastructure (for example, Availability Zones, AWS Regions)
  • Caching strategies and services (for example, Amazon ElastiCache)
  • Data access patterns (for example, read-intensive compared with write-intensive)
  • Database capacity planning (for example, capacity units, instance types, Provisioned IOPS)
  • Database connections and proxies
  • Database engines with appropriate use cases (for example, heterogeneous migrations, homogeneous migrations)
  • Database replication (for example, read replicas)
  • Database types and services (for example, serverless, relational compared with non-relational, in-memory)

3. Design Cost-Optimized Architectures

  • Access options (for example, an S3 bucket with Requester Pays object storage)
  • AWS cost management service features (for example, cost allocation tags, multi-account billing)
  • AWS cost management tools with appropriate use cases (for example, AWS Cost Explorer, AWS Budgets, AWS Cost and Usage Report)
  • AWS storage services with appropriate use cases (for example, Amazon FSx, Amazon EFS, Amazon S3, Amazon EBS)
  • Backup strategies
  • Block storage options (for example, hard disk drive [HDD] volume types, solid state drive [SSD] volume types)
  • Data lifecycles
  • Hybrid storage options (for example, DataSync, Transfer Family, Storage Gateway)
  • Storage access patterns
  • Storage tiering (for example, cold tiering for object storage)
  • Storage types with associated characteristics (for example, object, file, block)
  • AWS cost management service features (for example, cost allocation tags, multi-account billing)
  • AWS cost management tools with appropriate use cases (for example, Cost Explorer, AWS Budgets, AWS Cost and Usage Report)
  • AWS global infrastructure (for example, Availability Zones, AWS Regions)
  • AWS purchasing options (for example, Spot Instances, Reserved Instances, Savings Plans)
  • Distributed compute strategies (for example, edge processing)
  • Hybrid compute options (for example, AWS Outposts, AWS Snowball Edge)
  • Instance types, families, and sizes (for example, memory optimized, compute optimized, virtualization)
  • Optimization of compute utilization (for example, containers, serverless computing, microservices)
  • Scaling strategies (for example, auto scaling, hibernation)
  • AWS cost management service features (for example, cost allocation tags, multi-account billing)
  • AWS cost management tools with appropriate use cases (for example, Cost Explorer, AWS Budgets, AWS Cost and Usage Report)
  • Caching strategies
  • Data retention policies
  • Database capacity planning (for example, capacity units)
  • Database connections and proxies
  • Database engines with appropriate use cases (for example, heterogeneous migrations, homogeneous migrations)
  • Database replication (for example, read replicas)
  • Database types and services (for example, relational compared with non-relational, Aurora, DynamoDB)
  • AWS cost management service features (for example, cost allocation tags, multi-account billing)
  • AWS cost management tools with appropriate use cases (for example, Cost Explorer, AWS Budgets, AWS Cost and Usage Report)
  • Load balancing concepts (for example, Application Load Balancer)
  • NAT gateways (for example, NAT instance costs compared with NAT gateway costs)
  • Network connectivity (for example, private lines, dedicated lines, VPNs)
  • Network routing, topology, and peering (for example, AWS Transit Gateway, VPC peering)
  • Network services with appropriate use cases (for example, DNS)

Note: Content may be subject to change by REGex as per requirement.

Extra Sessions

Additional sessions on Git, Linux, Docker, AWS basics, Jenkins and many more for all students.

Fee Structure

Indian Fee

Price: ₹59,999/- (Flat 75% off) => ₹25,000/-  
(Limited Period Special Offer)

International Fee

Price: $1200 (Flat 75% off) => $500
(Limited Period Special Offer)

Fee can be paid as No-Cost EMI @ 2,500/month

Cashback Policy

  • You will get your Unique Referral Code after successful paid registration.
  • You will get ₹2,000 cashback directly in your account for each paid registration made with your Unique Referral Code, on a monthly basis (after registrations for this program close).
  • For example: if we receive 10 paid registrations from your Unique Referral Code, you will receive ₹2,000 × 10 = ₹20,000 on a monthly basis.

Related Programs

Data Engineering Specialization with Cloud
AWS Cloud Specialization with DevOps
ML & DL Specialization with Python Django

Enroll Now

(Batches Start from 8th, 19th & 29th January 2024)

*It will help us to reach more
*Extra discount is applicable on one-time payment only. Seats may fill up or the price may increase at any time. No refund policy is available.*