Muhammad Zaid

Senior Azure Data Engineer | AWS Certified Cloud Practitioner | AWS Certified Solutions Architect - Associate

About

Big Data Engineer with over 3 years of experience designing, implementing, and optimizing ETL processes and cloud-based data platforms. Uses Python, SQL, Apache Spark, and Azure/AWS services to build scalable data pipelines, improve data quality, and raise reporting performance and cost efficiency. Collaborates with cross-functional teams to deliver data solutions that have reduced project costs by up to 20%.

Work Experience

Senior Software Engineer

HCLTech

Nov 2024 - Present

Noida, Uttar Pradesh, IN

Led enterprise data migration and optimized scalable data pipelines, enhancing analytics and reporting for cloud-based solutions.

  • Spearheaded the successful migration of enterprise data from an on-premises Oracle database to Azure Data Lake Storage Gen2, establishing a secure and scalable foundation for cloud-based analytics.
  • Engineered and optimized high-performance data pipelines with Apache Spark, achieving a 40% reduction in data latency across large distributed datasets.
  • Developed and fine-tuned over 30 complex SQL queries and stored procedures, boosting ETL efficiency and reporting performance by 35%.
  • Implemented comprehensive data validation, schema checks, and unit testing frameworks, ensuring end-to-end pipeline reliability and enhancing data quality assurance.
  • Integrated diverse Azure services, including Data Factory, Data Lake Storage Gen2, and Databricks, to establish secure, code-driven data movement and transformation workflows.

Data Engineer

Wipro

Apr 2022 - Nov 2024

Greater Noida, Uttar Pradesh, IN

Facilitated POCs for data migration to Azure and implemented PySpark optimizations to improve data processing efficiency and cluster utilization.

  • Facilitated critical Proofs of Concept (POCs) for Azure data migration, achieving 100% data accuracy across ingestion, transformation, and storage processes.
  • Coordinated a 4-member team during POCs, keeping data workflows built in Azure Data Factory fully consistent across team members.
  • Implemented advanced PySpark optimizations, including partitioning, broadcast joins, and caching, significantly reducing job runtimes and enhancing cluster utilization.
  • Executed comprehensive performance tuning on Spark jobs and SQL queries, resulting in a 30-40% improvement in data processing times during critical POC validation cycles.

Education

Computer Application

Dr Virendra Swaroop Institute of Computer Studies

72%

Jan 2019 - Jan 2022

Certificates

AWS Certified Cloud Practitioner

Amazon Web Services (AWS)

AWS Certified Solutions Architect - Associate

Amazon Web Services (AWS)

Languages

English

Skills

Cloud Platforms

  • Microsoft Azure
  • Azure SQL Database
  • Azure Data Factory
  • Azure Databricks
  • Amazon Web Services (AWS)
  • EC2
  • S3
  • Lambda
  • Glue
  • ADLS Gen2
  • Delta Lake

Big Data Technologies

  • Apache Spark
  • PySpark
  • Hadoop Ecosystem
  • Apache Hive
  • Kafka

Programming & Scripting

  • Python
  • SQL (Advanced Queries)

Relational Databases

  • Oracle
  • PostgreSQL

Data Visualization & Reporting

  • Microsoft Excel (Data Analysis, Pivot Tables, Formulas, Charts)
  • Power BI
  • Tableau