Talent's Information
-
Location
Nagpur, India
-
Rate
$15.00 per Hour
-
Experience
8 Years
-
Languages Known
English, Hindi
About Piyusha R
8+ years of experience in the IT industry, including technical proficiency in the big data environment. Hands-on experience with Big Data core components and the Hadoop ecosystem, including data ingestion and data processing. Good understanding of Hadoop architecture and its major core components, such as Hive, HDFS, PySpark, and YARN, as well as Spark architecture, including Spark Core, Spark SQL, and DataFrames. Good hands-on experience with Databricks, performing tasks such as writing scripts and SQL queries. Experience writing PySpark code to migrate data from RDBMS to HBase and Hive tables using PySpark jobs. Experience writing Spark programs to perform data cleansing and data ingestion using PySpark. Solved performance issues in Spark with an understanding of joins, groups, and aggregations. Good knowledge of partitioning, bucketing, and loading in Hive; a short sketch of these Hive techniques follows.
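As a minimal illustration of the Hive partitioning and bucketing mentioned above, the sketch below creates a partitioned, bucketed table through PySpark and loads one day of data. The table and column names (transactions, staging_transactions, txn_date, customer_id) are hypothetical, not taken from the profile.

```python
from pyspark.sql import SparkSession

# Minimal sketch of Hive partitioning and bucketing; all names are hypothetical.
spark = (
    SparkSession.builder
    .appName("hive-partitioning-sketch")
    .enableHiveSupport()
    .getOrCreate()
)

# Partition by date so queries filtering on txn_date prune whole directories;
# bucket by customer_id so joins on that key can avoid a full shuffle.
spark.sql("""
    CREATE TABLE IF NOT EXISTS transactions (
        txn_id      BIGINT,
        customer_id BIGINT,
        amount      DOUBLE
    )
    PARTITIONED BY (txn_date STRING)
    CLUSTERED BY (customer_id) INTO 32 BUCKETS
    STORED AS ORC
""")

# Load one day's data into its partition from a (hypothetical) staging table.
spark.sql("""
    INSERT OVERWRITE TABLE transactions PARTITION (txn_date = '2023-08-01')
    SELECT txn_id, customer_id, amount
    FROM staging_transactions
    WHERE txn_date = '2023-08-01'
""")
```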
Tech Stack Expertise
- Python: 4 Years
- AWS, AWS S3: 7 Years
Work Experience

Data Engineer
- January 2013 - July 2023 - 10 Years
- India
Projects

Financial Intelligence Hub
- May 2019 - August 2023 - 52 Months
-
In today's world, people are moving toward a credit-based economy, i.e., leveraging more credit to fulfill their daily needs and wants. However, in order to limit delinquencies and reduce default rates on payments, companies need to do smart analytics. The purpose of the project is to analyze the data to find the total outstanding limit of a particular customer.
Role and Responsibilities-
Importing historical data from the RDBMS into the NoSQL database (HBase) using PySpark scripts.
Understanding the client's requirements for the project being developed.
Writing Hive DDL to create Hive external and Hive managed tables to optimize query performance.
Writing PySpark jobs for importing data into HBase and from HBase into Hive.
Implementing data validation and quality checks, and dealing with nulls.
Working on the optimization of Spark jobs.
Creating UDFs when needed.
Importing data from Hive managed tables into Hive external tables.
Writing Hive queries for data analysis to meet the business requirements.
Creating PySpark jobs for data transformation and aggregation (a sketch of this flow follows the list).
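Below is a minimal sketch of the RDBMS-to-Hive ingestion and cleansing flow described in these responsibilities, assuming a hypothetical MySQL source and hypothetical table and column names (customers, credit_limit, pan_number). Writing into HBase would additionally require an HBase-Spark connector, which is omitted here; the sketch persists to Hive instead.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.functions import udf
from pyspark.sql.types import StringType

spark = (
    SparkSession.builder
    .appName("fin-hub-ingest-sketch")
    .enableHiveSupport()
    .getOrCreate()
)

# 1. Import historical data from the RDBMS over JDBC (connection details hypothetical).
raw = (
    spark.read.format("jdbc")
    .option("url", "jdbc:mysql://rdbms-host:3306/finance")
    .option("dbtable", "customers")
    .option("user", "etl_user")
    .option("password", "***")
    .load()
)

# 2. Data validation / quality checks: drop rows missing the key,
#    default missing credit limits to zero.
clean = (
    raw.dropna(subset=["customer_id"])
       .fillna({"credit_limit": 0.0})
)

# 3. A small UDF, e.g. masking a sensitive column before it lands in Hive.
@udf(StringType())
def mask_pan(pan):
    return None if pan is None else "XXXX" + pan[-4:]

clean = clean.withColumn("pan_number", mask_pan(F.col("pan_number")))

# 4. Aggregate the total outstanding limit per customer and persist to a
#    (hypothetical) Hive table for analysis.
outstanding = clean.groupBy("customer_id").agg(
    F.sum("credit_limit").alias("total_outstanding_limit")
)
outstanding.write.mode("overwrite").saveAsTable("analytics.customer_outstanding")
```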

Data Migration Project
- October 2015 - March 2019 - 42 Months
-
We get daily data in an S3 bucket on AWS. We then clean that data, perform some transformations, and store the result in another S3 bucket, or sometimes load it into Redshift, as per the client's requirements.
Role and Responsibilities-
Extracting data from the raw S3 bucket.
Writing a PySpark job to clean the data and storing the cleaned data in another S3 bucket.
Validating the data and appending it into Redshift for further use (a sketch of this flow follows the list).
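A minimal sketch of this daily S3-to-Redshift flow follows. The bucket names, paths, and Redshift connection details are hypothetical, and the Redshift load is shown as a plain JDBC append; production pipelines commonly use a Redshift-specific Spark connector that stages through S3 and issues COPY.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("s3-migration-sketch").getOrCreate()

# 1. Extract the day's raw data from the landing bucket (hypothetical path;
#    reading s3a:// paths requires the hadoop-aws package on the classpath).
raw = spark.read.json("s3a://raw-bucket/daily/2019-03-01/")

# 2. Clean and transform: drop duplicates and rows missing the key column.
clean = (
    raw.dropDuplicates()
       .dropna(subset=["order_id"])
       .withColumn("ingest_date", F.lit("2019-03-01"))
)

# 3. Store the cleaned data in the curated bucket.
clean.write.mode("overwrite").parquet("s3a://clean-bucket/daily/2019-03-01/")

# 4. Validate, then append into Redshift for further use (needs the Redshift
#    JDBC driver; URL, table, and credentials are hypothetical).
if clean.count() > 0:
    (
        clean.write.format("jdbc")
        .option("url", "jdbc:redshift://cluster:5439/dev")
        .option("dbtable", "public.daily_orders")
        .option("user", "etl_user")
        .option("password", "***")
        .mode("append")
        .save()
    )
```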
Education

BE
Nagpur University - June 2012 - June 2015