Professional Projects
Project: Enterprise Data Warehouse Migration: SQL Server to Snowflake
Overview:
Led the end-to-end migration of a legacy on-premises SQL Server data warehouse to Snowflake, enabling a scalable, cloud-native architecture for enterprise analytics and reporting.
Client Served:
A top-tier international financial advisory firm, served through one of my employers.
Tech Stack:
Snowflake, Snowpipe, SnowSQL, Streams & Tasks, Azure Data Factory, Azure Blob Storage, ADLS Gen2, Alteryx
Key Contributions:
- Conducted comprehensive schema discovery and object mapping, translating complex SQL Server structures into Snowflake-compatible designs.
- Designed and implemented robust data extraction and ETL pipelines using Azure Data Factory and Snowpipe, ensuring secure and efficient data flow between Azure storage and Snowflake (a simplified load-and-validate step is sketched after this list).
- Built orchestration workflows for automated data loading, validation, and fault-tolerant backfills, enabling reliable daily operations.
- Identified and resolved performance bottlenecks, optimized ETL and extraction logic, and implemented proactive error handling and pipeline resiliency.
- Re-engineered legacy SQL code to align with Snowflake SQL conventions, ensuring consistency and maintainability in the new environment.
- Led post-migration QA and monitoring, addressing data quality gaps, object mismatches, and downstream dependency issues.
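A simplified sketch of the load-and-validate pattern behind these pipelines, using snowflake-connector-python. The stage, table, and warehouse names are placeholders rather than the client's objects, and the error check assumes the standard COPY INTO result layout:

```python
import os
import snowflake.connector

def load_daily_extract(table: str, stage_path: str) -> None:
    """COPY one day's extract from an external Azure stage into Snowflake,
    then fail loudly if any rows were rejected."""
    conn = snowflake.connector.connect(
        account=os.environ["SF_ACCOUNT"],
        user=os.environ["SF_USER"],
        password=os.environ["SF_PASSWORD"],
        warehouse="ETL_WH",   # placeholder warehouse
        database="DW",        # placeholder database
        schema="STAGING",
    )
    try:
        cur = conn.cursor()
        # ON_ERROR=CONTINUE lets the load finish; rejected rows are counted
        # afterwards instead of aborting the whole batch mid-file.
        cur.execute(
            f"COPY INTO {table} FROM @AZURE_EXTRACT_STAGE/{stage_path} "
            "FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1) ON_ERROR = CONTINUE"
        )
        # Column 5 of the COPY result is ERRORS_SEEN per file.
        rejected = sum(row[5] or 0 for row in cur.fetchall())
        if rejected:
            raise RuntimeError(f"{rejected} rows rejected while loading {table}")
    finally:
        conn.close()
```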
Outcome:
Successfully migrated mission-critical workloads to Snowflake, achieving enhanced scalability, faster data processing, and reduced operational overhead.
Key Learnings:
Strengthened skills in cloud data warehousing, ELT architecture, process automation, and cross-functional collaboration with business stakeholders during large-scale data migrations.
Project: Backend Design & Development of a Data Integration and Reporting Platform
Overview:
Led the backend architecture and development of a scalable, automated data integration and reporting system, enabling seamless ingestion, processing, and secure access to structured and semi-structured data for analytics.
Client Served:
A global financial advisory and asset management firm, served through one of my employers.
Tech Stack:
Python, PySpark, SQL, Snowflake, SnowSQL, Azure Function Apps, REST APIs, Azure Blob Storage, GitLab, Postman
Key Contributions:
- Designed and implemented Python-based REST APIs to ingest and process data from various sources, including third-party APIs (JSON), CSV/image/PDF files from Azure Blob Storage, and direct integrations with Snowflake and SQL databases.
- Leveraged Azure Function Apps to orchestrate automated workflows for data validation, transformation, and scheduled SQL execution, as well as for real-time alerting and error handling.
- Implemented Change Data Capture (CDC) using Snowflake Streams and Tasks to support Slowly Changing Dimensions (SCD Type I & II), ensuring accurate historical tracking and real-time data updates (the apply pattern is sketched after this list).
- Enforced role-based access control (RBAC) and secure data consumption through Snowflake views, stored procedures, and user-defined functions, with proper integration of Row Access Policies and Masking Policies.
- Actively mentored junior developers, conducting regular code reviews, knowledge-sharing sessions, and guidance on best practices aligned with business needs and technology standards.
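A hedged sketch of the stream-driven SCD Type II apply step: both statements consume the same stream inside one transaction, so the stream offset advances exactly once on COMMIT. All object and column names here are illustrative, not the client's:

```python
import snowflake.connector  # connection setup as in the previous sketch

SCD2_STATEMENTS = [
    "BEGIN",
    # Close the current version of every changed record.
    """
    UPDATE DW.DIM_CLIENT
       SET valid_to = CURRENT_TIMESTAMP(), is_current = FALSE
      FROM DW.CLIENT_CHANGES_STREAM s
     WHERE DW.DIM_CLIENT.client_id = s.client_id
       AND DW.DIM_CLIENT.is_current
       AND s.METADATA$ACTION = 'INSERT'
    """,
    # Open a new current version from the same stream contents.
    """
    INSERT INTO DW.DIM_CLIENT (client_id, client_name, valid_from, valid_to, is_current)
    SELECT s.client_id, s.client_name, CURRENT_TIMESTAMP(), NULL, TRUE
      FROM DW.CLIENT_CHANGES_STREAM s
     WHERE s.METADATA$ACTION = 'INSERT'
    """,
    "COMMIT",
]

def apply_scd2(conn: snowflake.connector.SnowflakeConnection) -> None:
    """Run the apply step on an open connection. With Streams and Tasks,
    the same statements can run inside a scheduled Task instead of
    client-side Python."""
    with conn.cursor() as cur:
        for stmt in SCD2_STATEMENTS:
            cur.execute(stmt)
```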
Outcome:
Delivered a fully automated, self-sustaining backend system for enterprise reporting, capable of handling high-volume data ingestion, transformation, and secure, policy-driven access for dashboards and analytics.
Key Learnings:
Gained advanced experience in bulk data processing, Python–Snowflake integration, event-driven design using Azure Function Apps, and implementation of data security frameworks within a modern cloud data stack.
Project: WHAutomate: Automated Multi-Platform Data Warehouse Provisioning
Overview:
WHAutomate is a SaaS platform that automates the creation of database schemas across different data warehouse systems like Snowflake, MySQL, and SQL Server. Users log in to the application, upload a JSON schema containing table definitions and constraints, select a target platform, and the system dynamically generates and executes the corresponding SQL DDL. This eliminates the need for manual SQL scripting and reduces setup time for data teams and engineers.
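Illustratively, the uploaded schema can be modelled with Pydantic along these lines; the field names below are assumptions about the payload shape, not the product's actual contract:

```python
from pydantic import BaseModel

class ColumnDef(BaseModel):
    name: str
    type: str                     # logical type, e.g. "INTEGER" or "TEXT"
    nullable: bool = True
    primary_key: bool = False

class TableDef(BaseModel):
    name: str
    columns: list[ColumnDef]
    foreign_keys: list[str] = []  # e.g. "customer_id -> customers.id"

class WarehouseSchema(BaseModel):
    target: str                   # "snowflake" | "mysql" | "sqlserver"
    tables: list[TableDef]
```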
Client Served:
Internal product prototype designed as a proof-of-concept for enterprise data teams, cloud consultancies, and BI engineers who frequently work with multi-platform data warehousing solutions.
Tech Stack:
Python (FastAPI), SQLAlchemy, Snowflake Connector, pyodbc, mysql-connector-python, Pydantic, Docker, Git
Key Contributions:
- Designed and implemented the core engine that parses JSON schemas and maps them to SQL DDL scripts tailored for different SQL dialects (Snowflake, SQL Server, MySQL); a condensed version of this mapping appears after this list.
- Built a modular database connector layer that securely connects to user-provided databases and executes DDL scripts with error handling and rollback support.
- Developed a JSON schema validator to enforce constraints like primary/foreign keys, datatype consistency, and naming conventions.
- Integrated user authentication and project-level management for schema history and audit logging.
- Ensured extensibility by architecting the system in plug-and-play modules to support future DB platforms or transformation pipelines.
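Building on the models sketched in the overview, a condensed sketch of the dialect-mapping idea; real quoting rules, foreign-key clauses, and rollback handling are omitted:

```python
# One logical-to-physical type table per target, one generator walking the
# validated schema. TableDef comes from the Pydantic sketch above.
TYPE_MAP = {
    "snowflake": {"INTEGER": "NUMBER(38,0)", "TEXT": "VARCHAR"},
    "mysql":     {"INTEGER": "INT",          "TEXT": "TEXT"},
    "sqlserver": {"INTEGER": "INT",          "TEXT": "NVARCHAR(MAX)"},
}

def render_create_table(table: TableDef, target: str) -> str:
    types = TYPE_MAP[target]
    lines = []
    for col in table.columns:
        sql_type = types.get(col.type.upper(), col.type)  # pass through unknown types
        null = "" if col.nullable else " NOT NULL"
        lines.append(f"  {col.name} {sql_type}{null}")
    pk = [c.name for c in table.columns if c.primary_key]
    if pk:
        lines.append(f"  PRIMARY KEY ({', '.join(pk)})")
    return f"CREATE TABLE {table.name} (\n" + ",\n".join(lines) + "\n);"
```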
Outcome:
Reduced schema provisioning effort, eliminating the need for manual SQL script writing across multiple platforms.
Accelerated onboarding and prototyping for analytics teams by enabling schema deployment in under 2 minutes.
Key Learnings:
Learned to design platform-agnostic systems with modular connectors and validators. Gained experience in building backend SaaS architectures with production-grade authentication, error handling, and security considerations.
Project: Design and Development of a Python FastAPI-Based Data & Reporting Platform for the Property Management Domain
Overview:
Built a backend data and reporting solution powering a property management analytics platform (integrated with third-party APIs), enabling dynamic dashboards, real-time data access, and self-service reporting for leasing teams.
Client Served:
A US-based client in the property management domain.
Tech Stack:
Python, FastAPI, OAuth2, Azure Blob Storage, Swagger, GitHub, Azure App Services, Pandas, Excel/PDF libraries, REST APIs
Key Contributions:
- Engineered a suite of RESTful APIs using Python FastAPI to handle data ingestion, transformation, and delivery for dashboards and downloadable reports.
- Implemented OAuth 2.0 authentication for secure, role-based access to API endpoints and user-specific data views (the endpoint pattern is sketched after this list).
- Designed a self-service reporting workflow, allowing users to generate real-time reports and dashboards by selecting filters and parameters via the frontend interface.
- Enabled dynamic export of reports in both Excel and PDF formats, along with web-based dashboard views for in-platform analysis.
- Integrated Azure Blob Storage for handling CSV uploads/downloads and storing associated image or PDF content used in reports.
- Managed version-controlled development with GitHub and implemented CI/CD deployments to Azure, ensuring fast and reliable release cycles.
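A minimal sketch of the endpoint pattern: OAuth2 bearer authentication plus an on-demand Excel export built with pandas. The route name, token check, and data are placeholders, not the client's API:

```python
import io
import pandas as pd
from fastapi import Depends, FastAPI, HTTPException
from fastapi.responses import StreamingResponse
from fastapi.security import OAuth2PasswordBearer

app = FastAPI()
oauth2_scheme = OAuth2PasswordBearer(tokenUrl="token")

def current_user(token: str = Depends(oauth2_scheme)) -> str:
    # Placeholder check: a real implementation validates a JWT and
    # derives the caller's role for row-level filtering.
    if not token:
        raise HTTPException(status_code=401, detail="Invalid token")
    return token

@app.get("/reports/occupancy.xlsx")
def occupancy_report(user: str = Depends(current_user)) -> StreamingResponse:
    # A role-filtered query would go here; a static frame stands in for it.
    df = pd.DataFrame({"property": ["A", "B"], "occupancy_pct": [92.5, 88.0]})
    buf = io.BytesIO()
    df.to_excel(buf, index=False)  # requires an engine such as openpyxl
    buf.seek(0)
    return StreamingResponse(
        buf,
        media_type="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
        headers={"Content-Disposition": "attachment; filename=occupancy.xlsx"},
    )
```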
Outcome:
Released a secure, scalable backend system providing real-time reporting and dashboards for a property management platform.
Key Learnings:
Gained hands-on experience implementing secure OAuth 2.0-based access control, managing deployment pipelines with GitHub and Azure, and using Swagger for API testing and documentation.
Project: Development of a Data Integration and Reporting Platform for Marketing & Campaign Analytics
Overview:
Designed and implemented a data platform that centralized marketing, campaign, and survey data from diverse sources. The system streamlined data ingestion, preprocessing, and reporting workflows to support business insights and performance tracking for marketing initiatives.
Client Served:
Clients in the Indian FMCG sector requiring data-driven evaluation of marketing and outreach efforts.
Tech Stack:
MySQL, Excel, Python, NumPy, Pandas, Matplotlib, Flask, REST APIs
Key Contributions:
- Aggregated multi-source data related to marketing activities, campaigns, and customer surveys.
- Built and maintained data pipelines to load structured data into centralized MySQL tables.
- Conducted data preprocessing and transformation to ensure readiness for downstream analytics.
- Designed relational schemas aligning with business logic and reporting needs.
- Developed REST APIs to automate third-party data ingestion into on-premise systems.
- Performed data analysis to evaluate marketing strategies, measure revenue impact, and compute ROI (a minimal example appears after this list).
- Created interactive and visually appealing dashboards and reports using Python libraries.
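As a small worked example, an ROI calculation of the kind described above; the column names and figures are invented for illustration:

```python
import pandas as pd

# Hypothetical campaign spend and attributed revenue, joined on campaign_id.
spend = pd.DataFrame({"campaign_id": [1, 2], "spend": [50_000, 80_000]})
revenue = pd.DataFrame({"campaign_id": [1, 2], "revenue": [140_000, 95_000]})

report = spend.merge(revenue, on="campaign_id")
report["roi_pct"] = (report["revenue"] - report["spend"]) / report["spend"] * 100
print(report.sort_values("roi_pct", ascending=False))
```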
Outcome:
Enabled faster and more accurate insights into campaign performance and business metrics. The platform improved operational efficiency, reduced manual data handling, and empowered stakeholders with actionable data.
Key Learnings:
Deepened expertise in API-driven data workflows, relational schema design, and marketing analytics. Gained experience in automating data ingestion and in visual storytelling through dashboards.