Data Engineer

Available for
  • Hybrid
About Bhagya L

  • 8+ years of professional IT experience in Big Data using the Hadoop framework, covering analysis, design, development, documentation, deployment, and integration with SQL and Big Data technologies as well as Java/J2EE technologies on AWS and Azure.
  • Experience with Hadoop ecosystem components such as Hive, HDFS, Sqoop, Spark, Kafka, and Pig.
  • Experience in architecting, designing, installing, configuring, and managing Apache Hadoop clusters on the MapR, Hortonworks, and Cloudera distributions.
  • Good understanding of Hadoop architecture and hands-on experience with components such as ResourceManager, NodeManager, NameNode, DataNode, MapReduce concepts, and the HDFS framework.
  • Expertise in data migration, data profiling, data ingestion, data cleansing, transformation, data import, and data export using ETL tools such as Informatica PowerCenter.
  • Working knowledge of Spark RDD, DataFrame API, Dataset API, Data Source API, Spark SQL, and Spark Streaming.
  • Experience importing and exporting data with Sqoop between HDFS and relational database systems, and loading it into partitioned Hive tables.
  • Worked with HQL for data extraction and join operations, with good experience optimizing Hive queries.
  • Experience with partitioning and bucketing concepts in Hive; designed both managed and external tables to optimize performance.
  • Developed Spark code using Scala, Python, and Spark SQL/Spark Streaming for faster data processing.
  • Implemented Spark Streaming jobs in Scala by developing RDDs (Resilient Distributed Datasets), using PySpark and spark-shell as appropriate.
  • Strong experience creating real-time data streaming solutions using Apache Spark/Spark Streaming, Kafka, and Flume.
  • Good knowledge of using Apache NiFi to automate data movement between different Hadoop systems.
  • Good experience handling messaging services using Apache Kafka.
  • Knowledge of data mining and data warehousing using ETL tools; proficient in building reports and dashboards in Tableau (BI tool).
  • Excellent knowledge of workflow scheduling and coordination services such as Oozie and ZooKeeper.
  • Good understanding and knowledge of NoSQL databases like HBase and Cassandra.
  • Good understanding of Amazon Web Services (AWS), including EC2 for compute, S3 for storage, EMR, Step Functions, Lambda, Redshift, and DynamoDB.
  • Good understanding and knowledge of Microsoft Azure services such as HDInsight clusters, Blob Storage, ADLS, Data Factory, and Logic Apps.
  • Worked with various file formats such as delimited text, JSON, and XML. Skilled in columnar formats like RC, ORC, and Parquet, with a good understanding of compression codecs used in Hadoop processing such as Gzip, Snappy, and LZO.
  • Hands-on experience building enterprise applications using Java, J2EE, Spring, Hibernate, JSF, JMS, XML, EJB, JSP, Servlets, JSON, JNDI, HTML, CSS, JavaScript, SQL, and PL/SQL.
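The Hive partition pruning mentioned above relies on a directory-naming convention: partition columns become key=value subdirectories under the table path, so a filter on a partition column only touches matching directories. A minimal Python sketch of that layout convention, with hypothetical table and column names:

```python
from pathlib import Path
import tempfile

# Hive lays out partitioned table data as key=value subdirectories,
# e.g. warehouse/events/dt=2024-01-01/country=US/part-00000.
# "events", "dt", and "country" are hypothetical illustration names.
warehouse = Path(tempfile.mkdtemp())

rows = [
    {"dt": "2024-01-01", "country": "US", "payload": "a"},
    {"dt": "2024-01-01", "country": "IN", "payload": "b"},
    {"dt": "2024-01-02", "country": "US", "payload": "c"},
]

for row in rows:
    # Partition column values become directory names, not file contents.
    part_dir = warehouse / "events" / f"dt={row['dt']}" / f"country={row['country']}"
    part_dir.mkdir(parents=True, exist_ok=True)
    with open(part_dir / "part-00000", "a") as f:
        f.write(row["payload"] + "\n")

# Partition pruning: a query filtered on dt=2024-01-01 scans only
# the matching directory subtree, skipping dt=2024-01-02 entirely.
pruned = sorted(p.name for p in (warehouse / "events").glob("dt=2024-01-01/*"))
print(pruned)  # → ['country=IN', 'country=US']
```

This is why queries that filter on partition columns are so much cheaper than full scans: the engine prunes at the directory level before reading any file data.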
