
Keshav Saraogi

Mumbai

Summary

Results-driven IT professional with hands-on experience building enterprise AI solutions and scalable backend architectures. Proficient in developing RAG systems, fine-tuning LLMs, and designing data pipelines with tools such as LangChain, Airflow, and FastAPI. Demonstrated success in microservices deployment, CI/CD optimization, and API engineering. Adept at solving complex data and system integration challenges in fast-paced environments.

Overview

3 years of professional experience

Work History

Information Technology Intern

Indorama Ventures
Bangkok
04.2025 - Current
  • Engineered a scalable Retrieval-Augmented Generation (RAG) platform, leveraging LangChain, OpenAI Embeddings, and FastAPI, backed by Pinecone VectorDB, to deliver real-time, context-aware LLM outputs across 10GB+ of SAP ERP and Excel data, enhancing semantic search and decision support for enterprise users.
  • Developed robust, production-ready ETL pipelines using Apache Airflow, seamlessly integrating SAP connectors and Excel data sources; optimized data transformation with Pandas and OpenPyXL, achieving a 70% reduction in manual data prep time, and accelerating analytics workflows.
  • Customized OpenAI LLMs with procurement-specific datasets using few-shot learning, advanced prompt engineering, and LangChain prompt templates; implemented role-sensitive response conditioning and dynamic context management, boosting response accuracy and task alignment in internal QA tests by over 60%.
  • Architected a multi-agent reasoning framework using LangChain’s AgentExecutor, combining SQL and Pandas agents with custom insight-generation tools to process hybrid data queries; ensured consistent multi-turn interactions while mitigating hallucinations through context-aware agent orchestration.
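The retrieval step at the core of a RAG platform like the one above can be sketched in a few lines. This is an illustrative toy, not code from the actual system: a hash-seeded vector stands in for OpenAI Embeddings, an in-memory list stands in for Pinecone, and all function names (`embed`, `retrieve`, `build_prompt`) are hypothetical.

```python
import hashlib
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    """Toy embedding: hash-seeded random unit vector.
    A real pipeline would call an embedding API here."""
    seed = int.from_bytes(hashlib.sha256(text.encode()).digest()[:4], "big")
    v = np.random.default_rng(seed).standard_normal(dim)
    return v / np.linalg.norm(v)

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Return the k corpus chunks most similar to the query (cosine similarity)."""
    q = embed(query)
    sims = [float(q @ embed(doc)) for doc in corpus]
    top = np.argsort(sims)[::-1][:k]
    return [corpus[i] for i in top]

def build_prompt(query: str, corpus: list[str]) -> str:
    """Assemble the context-augmented prompt that would be sent to the LLM."""
    context = "\n".join(retrieve(query, corpus))
    return f"Context:\n{context}\n\nQuestion: {query}"
```

The production version swaps the toy pieces for real ones (embedding API, vector database, LLM call) but keeps the same shape: embed the query, fetch the most similar chunks, and prepend them as context.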

Software Development Intern

Patton Labs
Jacksonville
01.2023 - 04.2023
  • Contributed to the design and deployment of a scalable microservices ecosystem for a cross-functional team of five, leveraging Docker, Kubernetes, Linux, and TypeScript to ensure modularity, fault tolerance, and ease of maintenance.
  • Streamlined the CI/CD pipeline by integrating automated testing, build, and deployment workflows with Docker and Kubernetes, reducing build and release cycle times by 20%, and accelerating feature delivery.
  • Enhanced microservice communication efficiency by architecting RESTful APIs and implementing asynchronous messaging with RabbitMQ, resulting in a 17% increase in inter-service data throughput, and reduced latency.
  • Boosted database performance by 22% through schema optimization and the development of high-efficiency SQL and NoSQL queries, eliminating key backend bottlenecks, and improving overall system responsiveness.
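The asynchronous messaging described above follows the classic producer/consumer pattern. The sketch below uses `asyncio.Queue` as a self-contained stand-in for RabbitMQ; the event names and the `None` sentinel are illustrative choices, not details of the actual services.

```python
import asyncio

async def producer(queue: asyncio.Queue, events: list[str]) -> None:
    """Publish events to the queue, then a sentinel marking the end of the stream."""
    for e in events:
        await queue.put(e)   # fire-and-forget publish
    await queue.put(None)    # sentinel: no more messages

async def consumer(queue: asyncio.Queue, handled: list[str]) -> None:
    """Drain the queue, processing each message until the sentinel arrives."""
    while True:
        msg = await queue.get()
        if msg is None:
            break
        handled.append(f"processed:{msg}")

async def run(events: list[str]) -> list[str]:
    queue: asyncio.Queue = asyncio.Queue()
    handled: list[str] = []
    await asyncio.gather(producer(queue, events), consumer(queue, handled))
    return handled
```

With a real broker the queue lives outside both processes, which is what decouples the services: the producer never blocks on a slow consumer, and either side can be restarted independently.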

Education

Master of Science - Computer Science

Boston University
Boston, Massachusetts
12.2024

Bachelor of Science - Computer Science

Temple University
Philadelphia, Pennsylvania
12.2022

Skills

  • Docker, Kubernetes, AWS, Azure Data Lake Storage
  • LangChain, GitHub, Linux, Apache Spark
  • Python, JavaScript, TypeScript, R, Java
  • HTML, CSS, Tableau, MySQL, PostgreSQL
  • MongoDB, SQL, NoSQL, REST APIs, Cloud Services
  • TensorFlow, Keras, PyTorch, NumPy, Pandas
  • Matplotlib, PyTest, CI/CD, GitHub Actions, Bash
  • LLMs, prompt engineering, RAG systems, OpenAI APIs

Technical Projects

Artist classification and recognition using big data and deep learning

  • Designed a scalable audio ingestion and preprocessing pipeline to handle 10GB+ of raw audio data using the Hadoop Distributed File System (HDFS) and Apache Spark on AWS EMR, optimizing for the efficient preparation of high-volume datasets for ML tasks.
  • Developed modular ETL workflows with PySpark and Spark MLlib to extract audio features (MFCCs, Chroma, Spectral Contrast) from WAV/MP3 files, transforming them into time-frequency spectrogram matrices, resulting in a 30% reduction in data preparation latency.
  • Automated the end-to-end data processing lifecycle using AWS Step Functions, integrating raw data retrieval, feature extraction, and S3-based storage with prefix-based partitioning and versioning to enhance auditability and scalability.
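The spectrogram transformation at the heart of the feature-extraction step can be illustrated without Spark. This NumPy sketch shows the idea only (frame the signal, window each frame, FFT); the project itself ran the equivalent at scale with PySpark and extracted richer features such as MFCCs, and the frame/hop sizes below are arbitrary example values.

```python
import numpy as np

def spectrogram(signal, frame_len: int = 256, hop: int = 128) -> np.ndarray:
    """Split a 1-D signal into overlapping Hann-windowed frames and
    FFT each one, yielding a (n_frames, frame_len//2 + 1) time-frequency matrix."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1))
```

Each row is one time slice; the column with the largest magnitude indicates the dominant frequency in that slice, which is exactly the representation downstream classifiers consume.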

Object detection pipeline on AWS

  • Engineered a complete image data pipeline for object detection using AWS S3, SageMaker Processing Jobs, and OpenCV, supporting real-time ingestion and preprocessing of large-scale annotated datasets.
  • Implemented robust data augmentation and annotation standardization workflows using Pandas and Boto3, reducing preprocessing time by 40% and ensuring consistency across diverse metadata formats.
  • Built ETL components for bounding box extraction, format conversion (Pascal VOC ↔ YOLO ↔ COCO), and dataset partitioning (train/validation/test) using Python and Dockerized SageMaker scripts for deployment at scale.
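The Pascal VOC ↔ YOLO leg of the format conversion reduces to coordinate arithmetic: VOC stores absolute pixel corners `(xmin, ymin, xmax, ymax)`, while YOLO stores a normalized center and size. A minimal sketch (function names are illustrative, not from the project's codebase):

```python
def voc_to_yolo(box, img_w: int, img_h: int):
    """(xmin, ymin, xmax, ymax) in pixels -> normalized (x_center, y_center, w, h)."""
    xmin, ymin, xmax, ymax = box
    return ((xmin + xmax) / 2 / img_w,
            (ymin + ymax) / 2 / img_h,
            (xmax - xmin) / img_w,
            (ymax - ymin) / img_h)

def yolo_to_voc(box, img_w: int, img_h: int):
    """Normalized (x_center, y_center, w, h) -> pixel corners, rounded to ints."""
    xc, yc, w, h = box
    return (round((xc - w / 2) * img_w), round((yc - h / 2) * img_h),
            round((xc + w / 2) * img_w), round((yc + h / 2) * img_h))
```

Because YOLO coordinates are normalized by image size, the same label file stays valid after resizing, which is why the conversion also has to round-trip cleanly when partitioning datasets.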
