Stories
Over the past few years, I’ve worked on systems that operate at real production scale — from real-time dashboards and AI processing pipelines to distributed backend services and cost-optimized data platforms. Many of the most valuable engineering lessons don’t appear in bullet points on a résumé: they emerge while solving performance bottlenecks, debugging production incidents, or redesigning systems under tight constraints. This page collects a few of those stories. Each one describes a real engineering problem I encountered, the architectural decisions behind the solution, and the impact it had on performance, scalability, or developer productivity.
Eliminating CI/CD Bottlenecks by Redesigning GitHub Actions Workflows
Quantiphi Analytics
Our CI/CD pipelines relied on GitHub Actions, but the organization account could run only two workflows in parallel. Each pipeline built and deployed the entire microservice stack, so a single run took nearly 1.5 hours. As the project approached the UAT phase and pull requests increased, workflows began competing for the limited runners: queue times grew, failure rates climbed, and shipping code slowed significantly. I redesigned the pipeline to deploy only the microservices affected by a pull request, introducing service-level dependency detection and targeted build steps in GitHub Actions. This cut workflow runtime from 1.5 hours to about 20 minutes (a roughly 78% reduction), which effectively unlocked the capacity to process 5–7 pull request workflows in parallel. The change eliminated pipeline congestion, reduced infrastructure costs, and restored development velocity during a critical release window, with estimated savings of 900+ developer hours.
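The core of the dependency-detection step can be sketched as a small function that maps the files changed in a pull request to the services that own them. This is an illustrative sketch, not the production workflow: the directory layout, service names, and the idea of feeding `git diff --name-only` output into it are assumptions for the example.

```python
# Hypothetical sketch of service-level dependency detection: map changed
# file paths (e.g. from `git diff --name-only`) to the microservices that
# own them, so the workflow builds and deploys only those services.
# Directory names here are illustrative, not the real repo structure.

def detect_affected_services(changed_files, service_roots):
    """Return the set of service names whose directories contain a changed file.

    changed_files: iterable of repo-relative paths.
    service_roots: mapping of service name -> directory prefix.
    """
    affected = set()
    for path in changed_files:
        for service, root in service_roots.items():
            if path.startswith(root):
                affected.add(service)
    return affected

if __name__ == "__main__":
    roots = {"auth": "services/auth/", "billing": "services/billing/"}
    changed = ["services/auth/api.py", "docs/README.md"]
    print(detect_affected_services(changed, roots))
```

In GitHub Actions, the same idea can also be expressed declaratively with path filters per job, so only jobs whose paths changed are scheduled.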
Designing a Secure and Cost-Efficient Static Content Hosting Layer for an LMS
Quantiphi Analytics
While building a Learning Management System serving more than 4,000 concurrent users and processing over 10,000 learning-content updates daily, we needed a scalable way to host thousands of static HTML learning modules stored in cloud buckets. The initial architecture assumed these files could be served directly from storage using signed URLs or a CDN. However, as we approached the UAT phase we discovered a critical issue: the HTML learning modules contained multiple relative imports (CSS, JavaScript, images). A single signed URL could not authorize all dependent assets, and generating signed URLs for every imported file would have been computationally expensive and difficult to maintain. Integrating a CDN at that stage would also have required significant architectural changes and ongoing operational cost while the system was close to release. We needed a solution that could be integrated quickly into the existing architecture, remain secure because the learning material was copyrighted, and require minimal maintenance. I designed a REST-based virtual file system layer where cloud storage buckets were mounted as volumes on our application servers, allowing HTML files to be served locally so relative imports resolved naturally. To keep the application stateless while still managing access efficiently, we used our existing Redis cluster to cache signed URLs and file metadata such as age and expiry. Authentication was enforced through secure cookies so only authorized learners could access the content. This solution allowed us to deliver the feature before UAT, support thousands of concurrent learners, and avoid the complexity and cost of introducing a CDN late in the development cycle.
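The signed-URL caching piece can be sketched as a small cache-aside layer with a time-to-live. The sketch below uses an in-memory dict in place of the Redis cluster, and the TTL, refresh margin, and injectable clock are illustrative assumptions rather than the production values.

```python
import time

class SignedUrlCache:
    """Minimal sketch of the signed-URL cache described above, with a dict
    standing in for Redis. TTL and refresh margin are illustrative."""

    def __init__(self, sign_fn, ttl_seconds=3600, refresh_margin=300, clock=time.time):
        self._sign = sign_fn           # generates a fresh signed URL for a path
        self._ttl = ttl_seconds
        self._margin = refresh_margin  # re-sign shortly before expiry
        self._clock = clock
        self._store = {}               # path -> (url, expires_at)

    def get(self, path):
        now = self._clock()
        entry = self._store.get(path)
        if entry and entry[1] - now > self._margin:
            return entry[0]            # cached URL is still comfortably valid
        url = self._sign(path)
        self._store[path] = (url, now + self._ttl)
        return url
```

Keeping the expiry metadata alongside the URL is what lets the application stay stateless: any server can consult the shared cache and re-sign only when a URL is close to expiring.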
Scaling a Real-Time Visualization Dashboard to Handle High Event Throughput
Quantiphi Analytics
We were building a micro-frontend visualization dashboard where chart configurations were sourced from Looker Studio while the underlying data streamed from BigQuery queries and Google Cloud Pub/Sub event pipelines. Because the dashboard displayed event-driven operational metrics, the UI needed to ingest high-frequency updates in near real time. Initially, every event was pushed directly to the frontend, which resulted in excessive message rates and frequent UI re-renders across micro-frontend components. We optimized the system on two fronts. First, on the backend we introduced event batching for a predefined interval, intentionally converting the system from strict real-time to near-real-time streaming. This drastically reduced the number of messages per minute while preserving the freshness needed for monitoring dashboards. Second, on the frontend we redesigned the data layer using custom React hooks and a shared state store built with Zustand. Instead of each component opening its own streaming connection, a single Server-Sent Events (SSE) connection collected all updates and distributed them through the shared store. This eliminated redundant network requests and prevented unnecessary component re-renders. Together, these changes allowed the dashboard to sustain more than 500 streaming events per second while maintaining sub-300 ms perceived latency across the UI.
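The backend batching step can be sketched as a buffer that emits one combined message per interval instead of forwarding every event. This is a simplified, single-threaded sketch; the interval, the injectable clock, and the `emit` callback (standing in for the SSE push to clients) are illustrative assumptions.

```python
class EventBatcher:
    """Sketch of the backend batching described above: accumulate events
    and emit one combined message per interval, trading strict real-time
    for near-real-time delivery. Interval and clock are illustrative."""

    def __init__(self, flush_interval, emit, clock):
        self._interval = flush_interval
        self._emit = emit              # callback pushing a batch to SSE clients
        self._clock = clock
        self._buffer = []
        self._last_flush = clock()

    def add(self, event):
        self._buffer.append(event)
        if self._clock() - self._last_flush >= self._interval:
            self.flush()

    def flush(self):
        if self._buffer:
            self._emit(list(self._buffer))
            self._buffer.clear()
        self._last_flush = self._clock()
```

On the frontend, the mirror image of this design is the single SSE connection fanning batches out through the shared Zustand store, so components subscribe to state slices rather than to the network.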
Processing a 5-Year Document Backlog with an Event-Driven AI Pipeline
Quantiphi Analytics
We initially built a proof-of-concept system to process a large backlog of documents using AI services, but the prototype quickly evolved into the production system. The platform had to process more than 50,000 PDFs accumulated over five years. We designed an event-driven architecture with three AI microservices that consumed Vertex AI services for different stages of document processing. Because each AI call took a variable amount of time, synchronous workflows would have caused severe bottlenecks. To solve this, we used Google Cloud Pub/Sub to orchestrate asynchronous processing between services. Intermediate results were stored in transactional tables in Postgres so each stage could resume reliably and maintain job state. We used Cloud Scheduler to archive the transaction-table data to BigQuery every day at midnight, and integrated Cloud Functions as hooks to notify users when processing jobs completed. To maximize throughput, we tuned CPU and memory allocations on Cloud Run deployments and optimized the Python workers for parallel processing. Cloud Run’s free tier of 1M requests per month meant our compute cost was effectively zero, so the system only incurred costs for Vertex AI inference. The final pipeline processed the entire backlog within roughly three hours after deployment.
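The resumability pattern can be sketched as an idempotent stage handler: persist the stage result before publishing, and treat a redelivered message as a no-op. In this sketch a dict stands in for the Postgres transaction tables and a callback stands in for the Pub/Sub publish; the stage names are hypothetical.

```python
# Illustrative sketch of one resumable pipeline stage. `state_store`
# stands in for the Postgres transaction tables, `publish` for the
# Pub/Sub topic handoff to the next stage. Stage names are hypothetical.

def run_stage(job_id, stage, process, state_store, publish):
    """Process one stage of a job idempotently, then hand it onward."""
    key = (job_id, stage)
    if key in state_store:
        result = state_store[key]   # redelivered message: reuse stored result
    else:
        result = process(job_id)    # the (slow, variable-latency) AI call
        state_store[key] = result   # persist before publishing downstream
    publish(job_id, stage, result)
```

Because each stage is idempotent, Pub/Sub's at-least-once delivery is safe: duplicates re-publish the stored result instead of re-running the expensive Vertex AI call.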
Building a JSON-Driven Form Builder for a Human-in-the-Loop AI Pipeline
Quantiphi Analytics
Our team was developing a Human-in-the-Loop (HITL) pipeline where users uploaded batches of PDF documents and AI models extracted structured information from them, eliminating large amounts of manual data entry. The AI also produced confidence scores for each extracted field so human reviewers could quickly verify or correct the results. A major challenge emerged because the PDFs existed in many historical formats and versions, which meant every variation required a slightly different verification form in the UI. Initially these forms were hardcoded in the repository, creating a maintenance nightmare as new formats were introduced. I solved this by designing a JSON-driven form builder. Instead of embedding forms in code, the application fetched a JSON schema from a cloud storage bucket based on the detected PDF version. The React frontend dynamically rendered the verification form in real time using this schema. I also built a lightweight internal tool to generate the schema files and wrote Jest tests to ensure the components behaved correctly across schema variations. This architecture removed the need to hardcode multiple forms and dramatically simplified maintenance. The tool proved useful beyond our project, and sister teams working on similar PDF processing HITL pipelines adopted the form builder across their applications.
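The heart of the form builder is turning a versioned JSON schema into renderable field descriptors. The production renderer was React; the Python sketch below just shows the shape of that transformation, and the schema keys (`version`, `fields`, `name`, `type`, `required`) are illustrative assumptions rather than the real format.

```python
import json

# Hypothetical sketch of the schema-to-form step: parse a versioned JSON
# schema (fetched from cloud storage per detected PDF version) into flat
# field descriptors a UI layer can render. Schema keys are illustrative.

def build_form(schema_json):
    """Turn a form schema document into a list of field descriptors."""
    schema = json.loads(schema_json)
    fields = []
    for field in schema["fields"]:
        fields.append({
            "name": field["name"],
            # Derive a human-readable label when the schema omits one.
            "label": field.get("label", field["name"].replace("_", " ").title()),
            "type": field.get("type", "text"),
            "required": field.get("required", False),
        })
    return {"version": schema["version"], "fields": fields}
```

Sensible defaults (label, type, required) are what keep new PDF versions cheap to support: a schema author only specifies what differs from the common case.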
Reducing BigQuery Costs and Latency for User-Generated Reports
Quantiphi Analytics
We were building a self-service analytics platform where business executives could configure and schedule their own reports. The data was sourced from BigQuery tables containing more than 10,000 columns and petabyte-scale datasets. Because BigQuery charges based on the amount of data processed, poorly written queries could become extremely expensive. Since the report creators were not data engineers, many queries contained redundant filters or scanned far more data than necessary, resulting in slow dashboard renders and high processing costs. To address this, I implemented several guardrails directly in the UI layer. We dynamically queried table schemas from BigQuery and highlighted redundant filters while suggesting filter pushdowns before a query was executed. For every saved report query, we ran a BigQuery dry-run and surfaced the estimated query cost to the user, making them aware of the financial impact of their queries. We also enforced server-side pagination even when users did not explicitly request it, limiting the amount of data scanned and returned to the UI. These optimizations reduced query processing overhead and improved response times from roughly 3–5 seconds to around 250–500 milliseconds depending on the query, while significantly lowering BigQuery processing costs.
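The dry-run guardrail works because BigQuery reports how many bytes a query would scan without executing it; converting that figure into dollars is a one-line calculation. The rate below is an illustrative placeholder for on-demand pricing, not a quoted price.

```python
# Sketch of the cost estimate surfaced to report creators: BigQuery's
# dry-run returns the bytes a query would process, and on-demand pricing
# is charged per TiB scanned. The rate here is an illustrative
# placeholder -- check current BigQuery pricing before relying on it.

TIB = 1024 ** 4
ILLUSTRATIVE_USD_PER_TIB = 5.00

def estimate_query_cost(total_bytes_processed, usd_per_tib=ILLUSTRATIVE_USD_PER_TIB):
    """Estimate on-demand query cost (USD) from a dry-run's bytes figure."""
    return round(total_bytes_processed / TIB * usd_per_tib, 4)
```

Showing this number next to the "save report" button made the cost of a full-table scan concrete to non-engineers, which did more to change query-writing habits than any guideline document.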
