$12.0 per Hour
Around 7 years of IT experience in Java, data engineering, data analytics, and data modeling across all phases of the project life cycle, including requirement gathering, building ETL pipelines using ADF and NiFi, building data lakes, and data modeling.
Dynamic software professional with a strong academic background (B.Tech in IT) and well-developed skills in multiple Big Data platforms.
Strong skills in designing ETL pipelines using ADF (Azure Data Factory), Data Flows, Databricks, Logic Apps, Azure Functions, and Azure SQL DB in Azure Cloud, and Apache NiFi on the Cloudera and Hortonworks platforms.
Good understanding of Azure AD and authentication types such as service principals and managed identities, including the use of Key Vault.
Hands-on experience in PySpark and Python.
Good understanding of AWS Glue and Athena for ETL.
Good understanding of Azure DevOps and the pipelines release process.
Good knowledge of the Snowflake cloud data warehouse and extensive experience developing complex stored procedures.
Hands-on experience with Apache Kafka and developing Kafka Streams applications.
Commendable knowledge of the Spark ecosystem, including Spark Core, Spark SQL, and Spark Streaming.
Hands-on experience with the Hadoop distributions Hortonworks and Cloudera.
Hands-on experience in the Core Java and Scala programming languages.
Strong hands-on experience with Hadoop ecosystem components such as Apache Sqoop, MapReduce, and Hive.
Strong knowledge of developing Spark batch and streaming jobs in Scala.
Good knowledge of developing both batch and streaming jobs using Spark Streaming and integrating them with Apache Kafka.
Hands-on experience developing star schema and snowflake schema data models for supply chain use cases using Hive.
Tech-savvy professional with solid data engineering and data modeling skills, coupled with a proven ability to develop ETL pipelines as per project needs.
A systematic, organized, hardworking and dedicated team player with an analytical bent of mind, determined to be a part of a growth-oriented organization.
Ideal combination of technical and communication skills; creative problem solver, able to think logically and pay close attention to detail; proficient at gathering user and industry requirements and customizing data architecture and plans as needed.
Tech Stack Expertise
Azure Data Factory, Azure Functions, Azure SQL Database, Azure Data Flows: 4 Years
Apache Spark, Apache NiFi, Apache Ranger: 0 Years
Apache Kafka: 0 Years
- January 2016 - January 2023 - 7 Years
- February 2017 - June 2018 - 17 Months
This project provides optimized routes to truck drivers and predicts the consumption of material in a tank. The role of the Data Engineer in this project is to supply data from different source systems to downstream users. The major data sources are Nalco SAP, sensors attached to the tanks, and a service providing distance data between two zip codes.
Design the technical requirement and data flow design documents for data coming from source systems such as Nalco SAP, OIP (sensor data), and distance data (PC Miler API).
Build generic pipelines using ADF and Logic Apps for data engineers to reuse and refer to when building new pipelines.
Review the pipelines.
Develop the pipeline for streaming data coming from SAP -> Tibco (queue) using Azure Functions and Logic Apps, storing the data in Azure SQL DB.
Building Data Lake.
- June 2019 - January 2020 - 8 Months
The main goal of this project is to build a data lake from the supply chain data held in SAP.
This helps the organization track its orders, understand detention costs, and derive advanced analytics.
Create an ingestion pipeline to bring data from SAP to Azure.
Design a pipeline using Apache NiFi to store data in an organized manner and ingest it into the Hive database.
Create Apache Spark jobs that preprocess incoming files to reduce unnecessary load on NiFi and achieve the best performance.
Create automated Apache Spark code for generating HiveQL metadata.
Identify the right candidates for partitioning and bucketing of data stored in Hive.
Create Spark code to identify metadata mismatches between source and target files.
- June 2020 - January 2021 - 8 Months
Find Work is a product that identifies talent with the right skill sets for companies, and vice versa.
It has its own matching algorithm for identifying the right candidates for the right jobs.
Beyond matching, Find Work also provides coaching for talent and has an Educators platform where talent can enhance their skills.
Create Apache Spark batch jobs for identifying the right talent for the right jobs, generating the monthly billing cycle, and similar tasks.
Create Spark Streaming jobs using Apache Kafka to capture the changes made by talent and match them accordingly.
Build the index DB using Elasticsearch and Apache Spark for faster search results and best performance.
Create Employer/Educator KPI dashboards using Spark to help them understand their business.
Analyze weekly data based on marketing campaigns using the Python pandas library.
- June 2021 - January 2022 - 8 Months
This project mainly focuses on analyzing the detention cost in supply chain data.
Built the curation layer by reading data from the data lake (raw data) using Apache Spark.
Created Apache Spark jobs to load data from Hortonworks into the Hive database.
Built the Star Schema Model for various use cases.
As part of the data modeling work, identified the dimension tables.
Identified the fact table and the measures required in it.
Created a view that the Power BI report points to.
Identified the delta data refresh strategy for dimensions and facts.
B.Tech in Computer Engineering, Karnataka University
- June 2009 - June 2012