Data Engineer

  • BMT Score: 86%
  • Available for: Remote
About Kunal

Big Data professional with 7+ years of experience in the design, development, and deployment of big data applications.
Extensive experience developing Spark applications using Spark Core, Spark SQL, and the Spark Structured Streaming APIs.
Worked on Hadoop ecosystem components such as MapReduce, HDFS, Hive, HBase, Sqoop, Oozie, Kafka, Spark, Impala, and Hue.
Extensive experience with AWS cloud services such as S3, RDS, CloudWatch, Glue, EMR, EC2, and Athena.
Experience with the Git version control system.
Experience working with Agile methodologies.

Tech Stack Expertise

  • ABCL: Apache Airflow, Apache NiFi (4 years)
  • Python (1 year)
  • AWS (1 year)
  • Java: Core Java (2 years)

Work Experience


Data Engineer

  • January 2016 - February 2023 - 7 years
  • India

Projects


Baazi Games

  • February 2021 - February 2023 - 25 months
Role & Responsibilities
    Designed a three-layered Data Lake architecture for different real-money gaming applications.
    Developed a Delta Lake solution by integrating Databricks with AWS.
    Migrated data to and from data sources such as CleverTap, Google Analytics 4, and Branch.
    Set up Apache Airflow infrastructure in the AWS environment using EC2 instances and Docker.
    Developed a framework to ingest structured historical and incremental data from RDS onto S3 using custom Airflow operators and Apache NiFi.
    Built a Spark application to manage upserts from a relational database into Parquet-backed tables in the AWS Glue Catalog.


    Project Description: Developed and supported end-to-end data pipelines to process production data and create summarized tables used downstream for audit, data visualization, and analysis across multiple teams. Migrated some of the data pipelines from AWS to GCP. Developed Spark jobs to process data using Dataproc. Developed a pipeline to collect clickstream data using Kinesis Data Streams and Firehose. Responsible for integration testing of datasets with Tableau using the Athena connector.
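The upsert application described above can be illustrated with a minimal, Spark-free sketch of the merge semantics (the actual job applied the same update-or-insert logic with Spark against Parquet-backed catalog tables; the key name and record shape here are assumptions):

```python
def upsert(existing, incoming, key="id"):
    """Merge incoming change records into an existing snapshot.

    Records whose key matches an existing row overwrite it (update);
    unmatched incoming records are appended (insert).
    """
    merged = {row[key]: row for row in existing}
    for row in incoming:
        merged[row[key]] = row  # update-or-insert by key
    return sorted(merged.values(), key=lambda r: r[key])

# Hypothetical sample data to show the merge behavior.
existing = [{"id": 1, "balance": 100}, {"id": 2, "balance": 50}]
incoming = [{"id": 2, "balance": 75}, {"id": 3, "balance": 20}]
result = upsert(existing, incoming)
```

In the real pipeline the same semantics would be expressed as a join-and-overwrite (or Delta Lake `MERGE`) over partitioned Parquet data rather than in-memory dictionaries.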
     

Australian Bank

  • April 2019 - December 2020 - 21 months
Role & Responsibilities
    Worked as part of the development team to onboard different datasets into the Enterprise Data Hub for risk management.
    Developed a metadata-driven validation and ETL framework based on PySpark.
    Migrated complex ETL pipelines from Alteryx to Spark jobs.
    Applied row-level security to data exposed to end users using Impala views.

    Project Description: Migrated data from Sybase and MySQL to Hive backed by S3 using Sqoop. Developed Spark logic to load incremental data from upstream into Hive tables. Involved in building scripts for automated deployment of applications into the production environment using Gradle, Bamboo, and the organization's legacy tools. Worked as part of the DevOps team to monitor and troubleshoot scheduled jobs in production.
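The metadata-driven validation framework lends itself to a small sketch. This is plain Python rather than the PySpark implementation described above, and the rule schema and column names are illustrative assumptions:

```python
# Hypothetical rule metadata: each rule names a column, whether it is
# required, and its expected Python type. In the real framework these
# rules would be loaded from metadata and applied to Spark DataFrames.
RULES = [
    {"column": "account_id", "required": True,  "type": int},
    {"column": "amount",     "required": True,  "type": float},
    {"column": "comment",    "required": False, "type": str},
]

def validate(record, rules=RULES):
    """Return a list of rule violations for one record."""
    errors = []
    for rule in rules:
        col = rule["column"]
        value = record.get(col)
        if value is None:
            if rule["required"]:
                errors.append(f"{col}: missing required value")
        elif not isinstance(value, rule["type"]):
            errors.append(f"{col}: expected {rule['type'].__name__}")
    return errors

good = {"account_id": 7, "amount": 12.5}
bad = {"amount": "oops"}  # missing account_id, wrong type for amount
```

Because the rules live in data rather than code, onboarding a new dataset only requires new metadata, not new validation logic — the core appeal of the metadata-driven approach.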
     

Sterlite

  • August 2018 - April 2019 - 9 months
Role & Responsibilities
    Worked as an Agile team member to develop, test, and deploy an IP log management product.
    Built data pipelines to ingest and transform data using Spark, loading the output into multiple sinks such as HDFS, Hive, and HBase.
    Developed components using the Spark Structured Streaming API to build DataFrames from real-time streaming data on Kafka and performed aggregations on them.
    Scheduled Spark jobs on the cluster using Oozie.
    Project Description: Developed an "Integrator" component as part of a framework to call REST APIs on Spark DataFrames. Developed JUnit tests for various components in Scala. Contributed to the development of Java modules to read JSON configuration and ingest data into HDFS. Involved in the installation of HDP 3.0 on the development cluster. Performed functional testing of various business scenarios of the product.
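The streaming aggregation described above can be sketched without Spark as a tumbling-window count, the kind of aggregation a Structured Streaming job typically performs on Kafka events (the event shape and the 60-second window size here are assumptions):

```python
from collections import Counter

def tumbling_window_counts(events, window_seconds=60):
    """Group (timestamp, key) events into fixed, non-overlapping windows
    and count occurrences per key within each window."""
    counts = Counter()
    for ts, key in events:
        window_start = (ts // window_seconds) * window_seconds
        counts[(window_start, key)] += 1
    return dict(counts)

# Hypothetical clickstream events as (epoch_seconds, event_type) pairs.
events = [(5, "login"), (30, "login"), (70, "logout"), (95, "login")]
agg = tumbling_window_counts(events)
```

In Structured Streaming the equivalent would be a `groupBy(window(...), ...).count()` over a Kafka-sourced DataFrame, with Spark handling late data and incremental state; this sketch only shows the windowing arithmetic.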
     

Leading Pharmaceutical

  • January 2016 - August 2018 - 32 months
Role & Responsibilities
    Ingested data into Hive from Teradata using Sqoop.
    Queried Hive tables using joins, grouping, and views for ad hoc analysis.
    Migrated data from RDBMS sources using Sqoop.
    Orchestrated and scheduled periodic jobs using Oozie.

    Project Description: Involved in various POCs on the big data stack. Understood business rules and prototyped the automated generation of SQL queries. Developed JUnit tests for unit testing of modules. Tested and validated individual modules and their integration using VBA-based macros. Automated triggering SQL queries and collecting results using JDBC.
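The incremental loads Sqoop performs can be sketched as watermark-based extraction: only rows newer than the last recorded value of a check column are pulled, mirroring Sqoop's `--incremental` mode with `--check-column` and `--last-value` (the row shape here is an assumption):

```python
def incremental_extract(rows, last_value, check_column="updated_at"):
    """Select only rows newer than the stored watermark and return the
    new watermark to persist for the next run."""
    fresh = [r for r in rows if r[check_column] > last_value]
    new_watermark = max((r[check_column] for r in fresh), default=last_value)
    return fresh, new_watermark

# Hypothetical source rows; in practice this filter becomes a WHERE
# clause that Sqoop pushes down to the source database.
rows = [{"id": 1, "updated_at": 10}, {"id": 2, "updated_at": 25}]
fresh, wm = incremental_extract(rows, last_value=10)
```

Persisting the returned watermark between runs is what makes repeated executions pull only the delta rather than re-importing the full table.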
     


Education


Bachelor of Technology (B.Tech)

Delhi University
  • June 2011 - June 2015
