Rohit Kumar

Professional Experience

Building AI solutions at scale for leading organizations

AI/ML Engineer

SUFFESCOM SOLUTIONS PVT. LTD.

March 2026 – June 2026 | Mohali, India

Designed and deployed scalable AI solutions using LLMs, RAG architectures, and agentic AI systems
Built end-to-end ML pipelines including feature engineering, model training, fine-tuning, and evaluation
Productionized ML/DL models through robust APIs, cloud deployment, and performance optimization
Implemented MLOps best practices with CI/CD, DVC, model versioning, monitoring, and automated retraining
Developed multi-agent workflows using LangGraph, CrewAI, AutoGen, and intelligent data automation pipelines

LLMs Feature Engineering Fine-Tuning MLOps DVC LangGraph/CrewAI/AutoGen LangChain/OpenAI SDK Deployment RAG

Associate AI/ML Engineer

EMINENCE INTERNET TECHNOLOGY PVT. LTD.

April 2024 – March 2026 | Mohali, India

Architected and deployed production-ready AI/ML models using LangChain, LangGraph, and modern LLM frameworks
Built enterprise RAG systems with ChromaDB and Pinecone, handling 100K+ queries/day with high accuracy
Implemented real-time conversational AI with WebSocket integration, TTS/STT, and emotion detection
Fine-tuned transformer models (BERT, T5, LLaMA) achieving high accuracy on custom datasets
Developed MLOps pipelines for model versioning, monitoring, and automated deployment

LangChain LangGraph PyTorch RAG NLP FastAPI Vector DBs MLOps

Software Developer

Shine Dezign Pvt Ltd

3 months (Internship)

Developed and maintained PHP-based web applications with an emphasis on backend performance, clean architecture, and reliable API integrations. Worked on responsive UI integration and optimized server-side logic to improve application efficiency and user experience.

Featured Projects

Production-grade AI solutions across multiple domains

📱 Product Image Classification

Built an end-to-end deep learning pipeline that classifies product images as Mobile Phones or Laptops using TensorFlow and MobileNetV2. Implements the complete ML workflow, including dataset preparation, training, evaluation, and deployment-ready inference via FastAPI. Uses binary classification with transfer learning, achieving high accuracy on custom datasets for e-commerce automation.

TensorFlow MobileNetV2 FastAPI Transfer Learning Image Classification E-commerce AI

🎵 Deepfake Audio Detection Model

Developed a machine learning system to detect deepfake/synthetic audio using Wav2Vec2 embeddings and classical ML classifiers. Achieved 92.86% accuracy with Logistic Regression on the Real vs Fake Human Voice dataset (700k samples). Pipeline extracts 768-dimensional feature vectors, handles variable-length audio, and implements preprocessing with StandardScaler normalization. Trained and compared Logistic Regression (best), SVM, and Random Forest models.

Wav2Vec2 Audio ML Logistic Regression SVM Deepfake Detection
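
The feature-preparation step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes variable-length audio is handled by mean pooling the per-frame embeddings (the description does not say which pooling is used), and it reimplements the z-score scaling that StandardScaler performs in plain Python. The function names are hypothetical, and the demo uses 2-dimensional vectors instead of 768 for readability.

```python
from statistics import fmean, pstdev

def mean_pool(frames):
    """Collapse a variable-length list of frame embeddings into one fixed-size vector."""
    if not frames:
        raise ValueError("audio clip produced no frames")
    dim = len(frames[0])
    return [fmean(frame[i] for frame in frames) for i in range(dim)]

def fit_scaler(vectors):
    """Per-dimension mean and population std; a zero std is replaced by 1.0."""
    dim = len(vectors[0])
    means = [fmean(v[i] for v in vectors) for i in range(dim)]
    stds = [pstdev([v[i] for v in vectors]) or 1.0 for i in range(dim)]
    return means, stds

def transform(vector, means, stds):
    """Apply z-score normalization with previously fitted statistics."""
    return [(x - m) / s for x, m, s in zip(vector, means, stds)]

# Two clips of different lengths (2 frames and 3 frames) end up the same size.
clip_a = [[1.0, 2.0], [3.0, 4.0]]
clip_b = [[0.0, 0.0], [2.0, 2.0], [4.0, 4.0]]
pooled = [mean_pool(clip_a), mean_pool(clip_b)]
means, stds = fit_scaler(pooled)
scaled = [transform(v, means, stds) for v in pooled]
```

The scaled vectors would then be passed to a classifier such as Logistic Regression; the statistics should be fitted on training data only and reused at inference time.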

🤖 Multi-Agent RAG System with LangGraph

Architected an enterprise-grade multi-agent RAG system using LangGraph orchestration. Implements specialized agents for document retrieval, synthesis, fact-checking, and response generation. Features dynamic routing, agent collaboration, and context-aware memory management with 95%+ answer accuracy on domain-specific queries.

LangGraph Multi-Agent RAG Pinecone FastAPI Redis
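
The dynamic routing between agents can be illustrated with a small, framework-free sketch. It deliberately does not use the LangGraph API; every node, the routing rule, and the retry limit of two attempts are hypothetical stand-ins chosen only to show the control flow of retrieval, synthesis, fact-checking, and response generation.

```python
def retrieve(state):
    # Stand-in retriever: keep documents that share a word with the question.
    words = set(state["question"].lower().split())
    state["docs"] = [d for d in state["corpus"] if words & set(d.lower().split())]
    return state

def synthesize(state):
    # Stand-in synthesis: concatenate the retrieved documents into a draft.
    state["draft"] = " ".join(state["docs"])
    return state

def fact_check(state):
    # Stand-in check: the draft counts as verified only if a document backs it.
    state["verified"] = bool(state["docs"])
    state["attempts"] = state.get("attempts", 0) + 1
    return state

def respond(state):
    fallback = "I could not find a grounded answer."
    state["answer"] = state["draft"] if state["verified"] else fallback
    return state

NODES = {"retrieve": retrieve, "synthesize": synthesize,
         "fact_check": fact_check, "respond": respond}

def route(node, state):
    """Pick the next node from the current state (None ends the run)."""
    if node == "retrieve":
        return "synthesize"
    if node == "synthesize":
        return "fact_check"
    if node == "fact_check":
        # Loop back to retrieval once if the draft could not be verified.
        if not state["verified"] and state["attempts"] < 2:
            return "retrieve"
        return "respond"
    return None

def run(question, corpus):
    state = {"question": question, "corpus": corpus}
    node = "retrieve"
    while node is not None:
        state = NODES[node](state)
        node = route(node, state)
    return state
```

Calling `run("pinecone indexing", corpus)` walks the graph once, while a question with no matching document is retried before falling back to the refusal message.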

🎯 Fine-Tuned Sentiment Analysis Models

Fine-tuned BERT and T5 transformer models for domain-specific sentiment analysis. Implemented transfer learning, data augmentation, and advanced preprocessing. Achieved a 94% F1-score on a custom dataset with balanced precision and recall. Deployed with FastAPI for real-time inference.

BERT T5 Transfer Learning HuggingFace PyTorch FastAPI

💻 LLaMA Command Intent Classifier

Fine-tuned LLaMA 3.1 8B to classify Linux commands and natural-language input into predefined intents. Built for AI terminal assistants and DevOps automation. Achieved 96% accuracy using LoRA fine-tuning with a custom prompt-completion dataset.

LLaMA 3.1 LoRA Transformers NLP Classification
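
A prompt-completion dataset of the kind mentioned above might be assembled along these lines. This is only a sketch: the intent labels, the prompt wording, and the helper names are hypothetical, since the actual dataset format is not described here.

```python
import json

# Hypothetical intent labels, for illustration only.
INTENTS = ("list_files", "disk_usage", "kill_process")

def make_record(text, intent):
    """Build one prompt-completion pair for supervised fine-tuning."""
    if intent not in INTENTS:
        raise ValueError(f"unknown intent: {intent}")
    prompt = f"Classify the intent of this input.\nInput: {text}\nIntent:"
    # A leading space keeps the completion separate from the prompt's last token.
    return {"prompt": prompt, "completion": f" {intent}"}

def to_jsonl(pairs):
    """Serialize (text, intent) pairs as JSON Lines, one record per line."""
    return "\n".join(json.dumps(make_record(t, i)) for t, i in pairs)

examples = [("ls -la /var/log", "list_files"),
            ("how much space is left on my drive", "disk_usage")]
jsonl = to_jsonl(examples)
```

Mixing raw commands and natural-language phrasings under the same label, as in the two examples, is what lets a single classifier serve both input styles.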

🎙️ AI-Powered Podcast Intelligence Platform

Built a production-grade podcast processing system with speaker diarization, multi-host/guest identification, and automatic music filtering. Implemented real-time line-level editing, WebSocket-based progress tracking, and chunked long-form processing using a sliding-window approach for LLM limits. Integrated LLM-driven summarization, sentiment analysis with timestamps, and semantic search via Pinecone, with transcripts securely stored in AWS S3.

AssemblyAI ElevenLabs AWS S3 WebSocket LangChain Pinecone Speaker Diarization Music Detection Real-time Editing Chunked Processing Background Jobs NLP
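
The sliding-window chunking used to fit long transcripts within LLM limits can be sketched as below. It is a simplified illustration under stated assumptions: words stand in for model tokens, and the function name, window size, and overlap are hypothetical rather than the platform's real settings.

```python
def sliding_windows(words, size, overlap):
    """Split a sequence into windows of `size` items that share `overlap` items."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + size])
        # Stop once a window reaches the end, to avoid a redundant tail chunk.
        if start + size >= len(words):
            break
    return chunks

transcript = "welcome back to the show today we talk about vector search".split()
chunks = sliding_windows(transcript, size=5, overlap=2)
```

The overlap carries a little context from one window into the next, so a sentence cut at a window boundary still appears whole in at least one chunk when it is shorter than the overlap.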

📩 AI-Driven Customer Support Automation

Built an intelligent complaint handling system using CrewAI multi-agent framework. Automatically processes text and audio inputs, classifies issues, verifies against policies, generates responses, and integrates with CRM. Reduced response time by 70% while maintaining quality.

CrewAI OpenAI MySQL CRM Automation Multi-Agent

🤖 n8n Multi-Agent Workflow Orchestration

Engineered a sophisticated n8n workflow with dual AI agents, persistent MongoDB memory, and intelligent routing. Implements context-aware conversations, webhook triggers, and modular architecture for scalable automation across multiple domains and use cases.

n8n OpenAI MongoDB Webhook Automation

🤖 IoT Device Control via ADB & MQTT

Developed a remote control system for Hisense TV and Fire TV using ADB and MQTT protocols. Enables seamless device communication, Android automation, and real-time command execution for smart home integration.

ADB MQTT IoT Python
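
One way to bridge the two protocols is to translate each incoming MQTT payload into an `adb` invocation. The sketch below assumes a hypothetical JSON payload of the form `{"action": "..."}` and a hypothetical device serial; the key codes are the standard Android `KeyEvent` values. MQTT client wiring is left out, so the function only builds the command.

```python
import json

# Standard Android KeyEvent codes for a few remote-control actions.
KEYCODES = {
    "home": 3,
    "back": 4,
    "volume_up": 24,
    "volume_down": 25,
    "power": 26,
    "play_pause": 85,
}

def adb_command(payload, serial):
    """Turn an MQTT message payload into an adb argument list."""
    message = json.loads(payload)
    action = message["action"]
    if action not in KEYCODES:
        raise ValueError(f"unsupported action: {action}")
    return ["adb", "-s", serial, "shell", "input", "keyevent", str(KEYCODES[action])]

# Example: a message received on a (hypothetical) control topic.
command = adb_command(b'{"action": "volume_up"}', "192.168.1.50:5555")
```

The resulting list can be handed to `subprocess.run(command, check=True)` from inside the MQTT client's message callback; rejecting unknown actions keeps arbitrary shell input off the device.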

🕷️ ML-Powered Intelligent Web Scraper

Built an ML-powered web scraping system to handle dynamic popups, CAPTCHA detection, and anti-bot measures. Trained a CNN-based popup detection model with 98% accuracy, enabling automated interaction and seamless scraping across 50+ websites using Selenium with headless Chrome.

Selenium CNN Computer Vision Web Scraping BeautifulSoup Automation Anti-Bot

🎤 Ultra-Low Latency Voice AI (~1.5s)

Built a high-performance speech-to-speech system with roughly 1.5-second end-to-end latency. Uses Deepgram for STT, a Groq-hosted LLM for rapid inference, and ElevenLabs for natural TTS. Implements streaming responses, ChromaDB for knowledge retrieval, and an optimized pipeline that achieves consistent sub-2-second response times.

Deepgram STT Groq LLM ElevenLabs TTS ChromaDB Streaming Low Latency
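
A common way to cut perceived latency in a streaming pipeline like this is to forward each finished sentence to TTS while the LLM is still generating. The generator below is a minimal sketch of that idea, not the project's code: the function name is hypothetical, and the sentence splitting is deliberately naive (it would also split on the period in "1.5").

```python
def sentences_from_stream(tokens):
    """Yield complete sentences as soon as they appear in a token stream."""
    buffer = ""
    for token in tokens:
        buffer += token
        while True:
            # Find the first sentence-ending punctuation mark, if any.
            cut = next((i for i, ch in enumerate(buffer) if ch in ".!?"), -1)
            if cut == -1:
                break
            sentence, buffer = buffer[:cut + 1].strip(), buffer[cut + 1:]
            if sentence:
                yield sentence
    # Flush whatever is left when the stream ends without punctuation.
    if buffer.strip():
        yield buffer.strip()

stream = ["Hel", "lo the", "re. How", " are you", "? Fine"]
spoken = list(sentences_from_stream(stream))
```

Each yielded sentence could be sent to the TTS service immediately, so audio playback starts after the first sentence rather than after the full response.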

💬 Enterprise Conversational AI Platform

Developed a production-ready chatbot with LangChain and ChromaDB. Features voice interaction, real-time streaming via WebSocket, conversation memory, and automatic HTML transcript generation sent via SMTP. Handles context across sessions with personalized responses.

LangChain ChromaDB OpenAI TTS/STT WebSocket SMTP
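
The HTML transcript step can be sketched with Python's standard library. The function name, subject line, and markup below are hypothetical choices for illustration; the sketch only builds the message, with a plain-text fallback alongside the HTML part.

```python
from email.message import EmailMessage
from html import escape

def build_transcript_email(turns, sender, recipient):
    """Build a multipart email holding a conversation transcript as HTML."""
    # Escape user text so that stray markup cannot break the transcript.
    rows = "".join(
        f"<p><b>{escape(speaker)}:</b> {escape(text)}</p>" for speaker, text in turns
    )
    message = EmailMessage()
    message["Subject"] = "Your conversation transcript"
    message["From"] = sender
    message["To"] = recipient
    message.set_content("\n".join(f"{s}: {t}" for s, t in turns))
    message.add_alternative(f"<html><body>{rows}</body></html>", subtype="html")
    return message

turns = [("User", "Is 2 < 3?"), ("Assistant", "Yes, it is.")]
email = build_transcript_email(turns, "bot@example.com", "user@example.com")
```

Sending would then be a matter of `smtplib.SMTP(host).send_message(email)`, with the host and credentials supplied by the deployment.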

🎥 InstantMeet - WebRTC Video Platform

Built a production-grade video conferencing app with FastAPI and WebRTC. Features instant meeting creation, multi-participant support, text chat, user authentication, OTP recovery, and optional recordings. Optimized for low latency and high concurrent user capacity.

FastAPI WebRTC WebSocket SQLite Real-time
