Data Engineer

Available for

  • Remote

About Piyusha R

8+ years of experience in the IT industry, with technical proficiency in big data environments. Hands-on experience with core Big Data components and the Hadoop ecosystem, including data ingestion and data processing. Good understanding of Hadoop architecture and its major components, such as Hive, HDFS, PySpark, and YARN, as well as Spark architecture, including Spark Core, Spark SQL, and DataFrames. Hands-on experience with Databricks, performing tasks such as writing scripts and SQL queries. Experienced in writing PySpark code to migrate data from RDBMS to HBase and Hive tables, and in writing Spark programs for data cleansing and data ingestion. Solved performance issues in Spark through an understanding of joins, groups, and aggregations. Good knowledge of partitioning, bucketing, and loading in Hive.

Tech Stack Expertise

  • 4 Years
  • AWS, AWS S3 - 7 Years

Work Experience


Data Engineer

  • January 2013 - July 2023 - 10 Years
  • India



Financial Intelligence Hub

  • May 2019 - August 2023 - 52 Months
Role & Responsibility
    Today, people are moving toward a credit-based economy, i.e., leveraging more credit to fulfill their daily needs and wants. However, in order to limit delinquencies and reduce default rates on payments, companies need to do smart analytics. The purpose of the project is to analyze the data to find the total outstanding limit of a particular customer.

    Imported historical data from the RDBMS into the NoSQL database (HBase) using PySpark scripts.
    Understood the client's requirements for the project under development.
    Wrote Hive DDL to create Hive external and managed tables to optimize query performance.
    Wrote PySpark jobs for importing data into HBase and from HBase into Hive.
    Implemented data validation and quality checks, and handled nulls.
    Worked on optimization of Spark jobs.
    Created UDFs when needed.
    Imported data from Hive managed tables into Hive external tables.
    Wrote Hive queries for data analysis to meet business requirements.
    Created PySpark jobs for data transformation and aggregation.

Data Migration Project

  • October 2015 - March 2019 - 42 Months
Role & Responsibility
     We receive daily data into an AWS S3 bucket. We then clean that data, perform some transformations, and store the result in another S3 bucket, or sometimes load it into Redshift, as per client requirements.

    Extracted data from the raw S3 bucket.
    Wrote a PySpark job to clean the data and stored the cleaned data in another S3 bucket.
    Validated the data and appended it to Redshift for further use.

Education

BE

Nagpur University
  • June 2012 - June 2015
