$11.00 per hour
3+ years of total experience in software development with Hadoop/Spark.
Pursuing an Executive PG Course in Data Science from IIT Roorkee and a Certification in Artificial Intelligence from CloudxLab.
Analyze and transform stored data by writing Spark jobs in Python, Hive scripts, and Pig scripts on Amazon AWS services, based on business requirements. Good working knowledge of Hive, Spark, Python, HBase, Sqoop, and Oozie.
Good exposure to writing SQL queries, analyzing data, and fixing data issues across facts and dimensions for data warehouses and data marts.
Good working experience processing huge volumes of data (in TBs), including data ingestion, data cleansing, and data transformation.
Adept at end-to-end development, from requirement analysis and system study through designing, coding, testing, debugging, and documentation.
Used the Agile software development methodology in business processes.
Good communication and interpersonal skills.
- January 2019 - December 2022 - 4 Years
- January 2021 - December 2022 - 24 Months
The Customer Insights Service (CIS) was built to replace the previous legacy ACI platform. Business requirements were used as the basis to refresh the technology stack and, where appropriate, to move to more cloud-native solutions: tool modernization and integration, self-service data and analytics, automated data management, and streamlined service support. The solution architecture comprises the following tiers: Raw Data, Cleansed Data, Curated Data, Data Mart, and Dashboard Visualization and Reporting.
Roles and Responsibilities:
Developed various functionalities for a data ingestion framework using Python.
Created JSON configuration files for cleansing data through a generic Spark framework.
Developed Spark SQL jobs and processed data in Amazon S3 against complex business requirements to create data marts for various markets.
Developed various Python functions used in data cleansing and data transformation.
Processed master reference data with an SCD2 (Slowly Changing Dimension Type 2) implementation using Spark.
Scheduled daily, weekly, and monthly workflows for Spark jobs using AWS Glue, and created and scheduled cron jobs where needed.
Processed history and incremental data using Spark, and used Amazon Redshift for data validation and historical data processing.
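The SCD2 processing mentioned above can be sketched in plain Python. This is a minimal illustration of the Type 2 merge logic only, not the production code (the actual jobs used Spark DataFrames, and all field names here are hypothetical):

```python
from datetime import date

HIGH_DATE = date(9999, 12, 31)  # open-ended end date for current records

def scd2_merge(dimension, updates, key, tracked, load_date):
    """Apply a Type 2 slowly-changing-dimension merge.

    dimension: list of dicts with key, tracked attrs, 'start_date',
               'end_date', 'is_current'
    updates:   list of dicts with key and tracked attrs (incoming snapshot)
    """
    merged = list(dimension)
    current = {row[key]: row for row in merged if row["is_current"]}
    for upd in updates:
        old = current.get(upd[key])
        if old is not None and all(old[a] == upd[a] for a in tracked):
            continue  # no change in tracked attributes: keep existing row
        if old is not None:
            old["end_date"] = load_date   # expire the previous version
            old["is_current"] = False
        merged.append({key: upd[key],
                       **{a: upd[a] for a in tracked},
                       "start_date": load_date,
                       "end_date": HIGH_DATE,
                       "is_current": True})
    return merged
```

In a Spark job the same logic is typically expressed as a join between the incoming snapshot and the current dimension rows, but the expire-and-insert pattern is identical.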
Atos Syntel, Hadoop Developer
- March 2019 - January 2021 - 23 Months
T-Mobile is an American multinational mobile network corporation headquartered in Bellevue, Washington, that provides mobile network services across the USA. T-Mobile wanted to introduce a new ETL process that runs faster and loads data into Teradata, from which business users generate reports.
HCL Technologies Ltd. provides the ETL solution, called IDW (Integrated Data Warehouse), which processes data using Apache Hadoop and finally loads it into Teradata. IDW has three phases: Ingestion, Preparation, and Dispatch.
Ingestion: Upstream systems land the data in Hadoop as files on HDFS or in HBase.
Preparation: Ingested data is processed in Hadoop using Pig scripts, Hive, HDFS, HBase, Oozie, and MapReduce.
Dispatch: The processed data is finally loaded into Teradata.
We supported the company's business processes by developing Hadoop jobs that moved data from individual systems into HDFS, transformed it according to business requirements, and exported it to external systems.
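The three IDW phases can be sketched as a toy Python pipeline. This is purely illustrative of the Ingestion → Preparation → Dispatch flow described above; the function names and in-memory lists are stand-ins for HDFS/HBase, the Pig/Hive jobs, and Teradata:

```python
def ingest(source_files):
    """Ingestion: land raw records from upstream files in a staging
    area (a list stands in for HDFS/HBase here)."""
    staged = []
    for records in source_files:
        staged.extend(records)
    return staged

def prepare(staged, transform):
    """Preparation: apply business-rule transformations to each staged
    record (Pig/Hive scripts in the real pipeline)."""
    return [transform(record) for record in staged]

def dispatch(prepared, sink):
    """Dispatch: load prepared records into the target store (Teradata
    in the real pipeline) and return the row count."""
    sink.extend(prepared)
    return len(prepared)
```

Chaining the three, `dispatch(prepare(ingest(files), rule), warehouse)` mirrors how each IDW phase consumes the previous phase's output.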
Roles and Responsibilities:
Analyzed the S2TM (source-to-target mapping), covering table descriptions, schema details, and data types for both source and target while mapping columns, transformations, and the Teradata target table structure.
Wrote Pig and Hive scripts to transform raw data based on the transformations specified in the S2TM.
Prepared source and target DDLs based on the S2TM and created tables accordingly in Hive and HBase.
Prepared and executed pre-preparation, preparation, and dispatcher jobs.
Developed the dispatcher job to move data from the preparation output to Teradata using Sqoop or MLoad scripts.
Processed both history and daily/incremental loads with the Hadoop jobs.
Wrote shell scripts used within the jobs.
Extracted data from databases into HDFS using Sqoop jobs with incremental/full loads.
Resolved performance issues in Pig and Hive scripts through an understanding of joins, grouping, and aggregation.
Developed Oozie workflows for scheduling and orchestrating the ETL process.
Worked with the onsite team to handle defects raised in QA and to monitor jobs in production.
Prepared design documents and unit test case documents for the pre-preparation, preparation, and dispatcher jobs.
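The incremental Sqoop extraction pattern above can be sketched as a small command builder. The Sqoop flags used (`--incremental append`, `--check-column`, `--last-value`) are standard Sqoop import arguments; the connection string and table name in the example are hypothetical:

```python
def sqoop_incremental_import(connect, table, check_column, last_value,
                             target_dir, full_load=False):
    """Build a Sqoop import command for a full or incremental load.

    Incremental loads use Sqoop's append mode, pulling only rows whose
    check_column value exceeds the last recorded watermark.
    """
    cmd = ["sqoop", "import",
           "--connect", connect,
           "--table", table,
           "--target-dir", target_dir]
    if not full_load:
        cmd += ["--incremental", "append",
                "--check-column", check_column,
                "--last-value", str(last_value)]
    return cmd
```

For example, `sqoop_incremental_import("jdbc:mysql://host/db", "orders", "order_id", 1000, "/data/raw/orders")` builds a command that imports only rows with `order_id` greater than 1000; persisting the new high-water mark after each run is what makes the daily loads incremental.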
B.Tech (Computer Science), Lendi Institute of Engineering and Technology, JNT
- June 2017 - June 2019