Sai Kiran Siddhabathula

Data Engineer

LinkedIn | GitHub

About

Data Engineer with 4 years of experience designing, scaling, and optimizing data pipelines, ETL/ELT workflows, and cloud-based data platforms. Proficient in Python, SQL, and Spark, with hands-on experience across AWS and Azure in data modeling, warehousing, and SQL query optimization. Proven ability to translate complex business requirements into data-driven solutions and to support analytics teams with reliable, scalable data infrastructure.

Work Experience

Teaching Assistant – Big Data

St. Francis Xavier University

Sep 2024 - Dec 2024

Antigonish, NS, CA

Assisted in delivering Big Data concepts and practical exercises, supporting student learning and problem-solving in distributed computing.

  • Helped teach distributed computing, Hadoop, Spark, and big data architectures to strengthen student comprehension.
  • Mentored over 40 students on Python, SQL, and data engineering assignments, fostering skill development and project success.
  • Designed practical lab sessions and exercises for real-world big data pipelines, providing hands-on experience.
  • Evaluated assignments and provided detailed feedback, strengthening student problem-solving and analytical skills.

Website Coordinator

St. Francis Xavier University Students' Union

Sep 2024 - Dec 2024

Antigonish, NS, CA

Led the redesign and maintenance of the student union website, enhancing user experience and operational efficiency through modern web technologies.

  • Rebuilt and maintained the student union website using React.js and Firebase, significantly improving load time and accessibility for the student body.
  • Integrated secure login, event scheduling, and payment modules, streamlining student services and enhancing operational efficiency.
  • Developed analytics dashboards to measure engagement, boosting participation in campus activities by providing data-driven insights.
  • Automated content updates and optimized hosting for scalability, ensuring reliable performance for all users.
  • Coordinated with cross-functional teams (administration, IT, student reps) to gather requirements and deliver comprehensive web solutions.

Subject Matter Expert – Computer Science

Chegg Inc.

Mar 2021 - Apr 2024

Remote, US

Provided expert-level solutions and explanations for advanced computer science problems, supporting global student learning and exam preparation.

  • Solved over 1,000 advanced computer science problems spanning algorithms, data structures, and system design, demonstrating deep technical expertise.
  • Authored detailed explanations and optimized solutions for a global student audience, enhancing comprehension and learning outcomes.
  • Supported students in preparing for exams, interviews, and coding assignments, contributing to their academic and career success.
  • Consistently ranked among top contributors for accuracy and responsiveness, maintaining high quality and timely support.

Specialist Programmer – Client Project: Allied World Insurance, Singapore

Infosys Limited

May 2022 - Aug 2023

Hyderabad, Telangana, IN

Developed and optimized backend systems and ETL workflows for insurance operations, enhancing performance and deploying solutions across cloud environments.

  • Built robust REST APIs and microservices with Java Spring Boot, supporting critical insurance workflows for claims, underwriting, and policy issuance.
  • Developed ETL workflows in Python and SQL to automate insurance reporting and compliance checks, ensuring data accuracy and regulatory adherence.
  • Migrated legacy processes to AWS (S3, Lambda, RDS), improving system performance and significantly reducing downtime.
  • Implemented CI/CD pipelines using Jenkins and GitHub, cutting deployment time by 40% and enhancing release efficiency.
  • Conducted peer code reviews and mentored junior engineers on backend and cloud best practices, fostering team growth and code quality.

Systems Engineer – Client Project: MetLife Insurance, USA

Infosys Limited

Jul 2021 - May 2022

Hyderabad, Telangana, IN

Engineered backend modules and optimized data processes for customer-facing insurance portals, improving system performance and ensuring production stability.

  • Developed backend modules in Java and front-end components in React.js for customer-facing insurance portals, enhancing user experience and functionality.
  • Optimized SQL queries and batch jobs, reducing response times for data-heavy processes and improving system efficiency.
  • Collaborated with Business Analysts to gather requirements and deliver production-ready features, ensuring alignment with business needs.
  • Automated regression testing pipelines, improving release quality and significantly reducing manual QA effort.
  • Provided L3 support during critical production incidents, resolving issues within strict SLAs and maintaining system uptime.

Software Engineering Intern – Client Project: Sun Life Insurance, Canada

Infosys Limited

Sep 2020 - Jun 2021

Hyderabad, Telangana, IN

Supported software development and quality assurance for insurance product enhancements, contributing to API integration and Agile practices.

  • Designed Python scripts to automate repetitive QA and data validation tasks, improving testing efficiency.
  • Supported development of Java modules for insurance product enhancements, contributing to feature delivery.
  • Assisted in API integration testing and bug fixing for web applications, ensuring functional stability.
  • Learned and applied Agile development practices, actively contributing to daily stand-ups and sprint planning.

Education

Applied Computer Science

St. Francis Xavier University

Sep 2023 - May 2025

Antigonish, NS, CA

Computer Science Engineering

Jawaharlal Nehru Technological University

Jun 2017 - Jul 2021

Hyderabad, Telangana, IN

Certificates

Microsoft Certified: Azure Fundamentals (AZ-900)

Microsoft

Google Data Analytics Professional Certificate

Google / Coursera

Python Core and Advanced

Udemy

Python Data Structures

Udemy

Machine Learning Training

Internshala

Projects

Intelligent Multi-Agent Platform (In Progress)

May 2024 - Present

Building an AI multi-agent platform using LangChain and AutoGen, integrating various cognitive services for contextual memory and workflow orchestration.

Credit Card Churn Prediction and Retention Strategies

Jan 2024 - Apr 2024

Applied machine learning models and conducted extensive data analysis to predict customer churn and propose retention strategies.

Solving N-Queens with Constraint Satisfaction

Oct 2023 - Dec 2023

Implemented and optimized a backtracking and constraint satisfaction approach to solve the classical N-Queens problem.

Amazon Price Tracker

Jul 2023 - Sep 2023

Developed a Python-based web scraper to track Amazon product prices and automate price alerts.

Mobile Controlled Mouse

Apr 2023 - Jun 2023

Developed an IoT-based application that turns a smartphone into a wireless mouse, enhancing accessibility.

Languages

English

Skills

APIs

  • REST
  • gRPC
  • API Integration

Programming Languages

  • Python
  • SQL
  • Java
  • C
  • C++
  • JavaScript

Big Data Technologies

  • Apache Spark
  • PySpark
  • Databricks
  • Hadoop
  • Hive
  • Pig
  • HDFS
  • MapReduce

Data Streaming

  • Apache Kafka
  • AWS Kinesis
  • Google Cloud Pub/Sub

ETL & Orchestration

  • Airflow
  • AWS Glue
  • Azure Data Factory
  • Prefect

Data Warehousing & Modeling

  • Snowflake
  • Redshift
  • Delta Lake
  • Data Modeling
  • SQL Query Optimization

Data Lakes

  • AWS S3
  • Azure Data Lake Storage (ADLS)

Databases

  • PostgreSQL
  • MySQL
  • MongoDB

Data Quality & Testing

  • dbt
  • Great Expectations
  • pytest

Cloud Platforms

  • AWS (S3, EMR, Lambda, RDS, EC2, IAM)
  • Azure (Databricks, SQL DB, ADF)

DevOps & IaC

  • Docker
  • Kubernetes
  • Terraform
  • Jenkins
  • GitHub Actions

Data Visualization

  • Power BI
  • Plotly
  • Matplotlib
  • Seaborn

AI & Machine Learning

  • scikit-learn
  • TensorFlow
  • Keras
  • NLP (LangChain)
  • AutoGen
  • Feature Engineering

Collaboration & Monitoring

  • GitHub
  • Bitbucket
  • Jira
  • Confluence
  • Datadog
  • OpenSearch
  • AppDynamics

Methodologies

  • Agile/Scrum
  • Cross-functional Collaboration
  • Requirements Gathering
  • Problem-Solving
  • Data Transformation
  • Backend Services
  • Microservices
  • CI/CD
  • Data Quality Assurance
  • L3 Support