AI Researcher & Engineer, Visionary Leader, Tech Innovator!
“Advancing AI to decode intelligence, enhance perception, and shape the future of human-machine collaboration!”
I am a first-principles-driven AI Research Engineer with extensive experience in designing and building AI-powered solutions across Healthcare, Retail, Transportation and Financial Services. My research interests lie broadly in Computer Vision, particularly the application of Deep Learning to Object Detection, Visual Recognition, and Automated Decision-Making in dynamic environments. I am also passionate about advancing model interpretability and robustness, with a focus on optimizing these technologies for scalable deployment.
My recent work explores Explainable AI (XAI) methodologies for high-stakes decision-making, bridging deep learning with model interpretability through Local Interpretable Model-Agnostic Explanations (LIME), SHapley Additive exPlanations (SHAP), and counterfactual reasoning to enhance transparency and trust in AI-driven systems. I have developed real-time AI-driven computer vision systems, leveraging YOLO-based architectures for object detection and semantic segmentation, to solve high-impact challenges in Retail and Healthcare.
By incorporating post-hoc explainability, I ensure that AI-driven decisions are transparent, trustworthy, and actionable, allowing domain experts to validate model predictions effectively.
As CTO of Leap2X, I collaborated with Vizuara Labs at Massachusetts Institute of Technology on pioneering multi-modal AI research, integrating vision and large language models such as CLIP (Contrastive Language-Image Pretraining) and BLIP-2 (Bootstrapped Language-Image Pretraining) to develop intelligent systems that seamlessly bridge visual perception with natural language reasoning, redefining human-machine interaction.
I have extensive experience with TensorFlow, PyTorch, OpenCV, and cloud-based AI deployment on AWS SageMaker and Google Vertex AI. My work also incorporates Scikit-Learn for traditional ML, YOLO for real-time object detection, and LIME & SHAP for explainability, ensuring interpretability in AI-driven decision-making.
I am actively involved in AI mentorship and open-source initiatives, working to bridge the gap between academia and real-world AI adoption.
Beyond research and engineering, I am an avid explorer and lifelong learner. I have traveled to 70 countries across all seven continents, immersing myself in diverse cultures and histories.
As a history buff, I enjoy engaging in socio-political conversations and deep philosophical discussions. I have been practicing Siddha Yoga meditation for over 20 years, blending mindfulness with my scientific pursuits.
When I’m not working on AI, I’m embracing the outdoors. I’ve climbed five of the Seven Summits, the highest peaks on each of the seven continents, and hold a Nidan (Second-Degree Black Belt) in Shotokan Karate. I was the Gold Medalist at the UK National Level Shotokan Karate Championships, held at the Osaka Kyobashi Branch of the Japan Karate Association (JKA) 🇯🇵.
And when I find a rare moment to relax, you’ll probably find me strumming my six-string.
Degree: Postgraduate in Artificial Intelligence & Machine Learning: Jan 2023 - Jan 2024
Rank: 2nd of 110 | CGPA: 4.33 of 4.33 | Percentage: 98.95%
Degree: Bachelor’s in Computer Engineering: Jun 1996 - May 1999
Thesis: "Crack Detection in Scanned Pipes"
Degree: Diploma in Computer Technology: Jun 1992 - Apr 1996
Thesis: "Braille Translation to convert AutoCAD 3D files to Text"
Degree: Diploma in Computer Engineering: Jun 1994 - Apr 1995
Degree: Primary & Secondary Education: Jun 1992
Founder & CTO London | London, United Kingdom | Jan 2021 - Present
Leading a team of Data Scientists to design and build AI-driven solutions using Computer Vision and Large Language Models on the Azure and AWS platforms
Senior Technical Program Manager | London, United Kingdom | Sep 2017 - Sep 2020
Led a team of Machine Learning Engineers to design and build Retail solutions on the AWS platform
Senior Technical Program Manager | London, United Kingdom | Oct 2010 - Aug 2017
Led the Digital Transformation team to deliver FinTech solutions across AWS & Google Cloud Platform
Technical Program Manager | London, United Kingdom | Oct 2008 - Sep 2010
Led a team of Data Scientists to deliver Risk Management solutions
Technical Program Manager | London, United Kingdom | Nov 2005 - Aug 2008
Led the Customer Insights & Data Analytics team to build eCommerce solutions for the Retail business
Principal Software Engineer | Multiple Locations - US, UK & India | Feb 2001 - Oct 2005
Led eCommerce Product Development teams to design and build eCommerce solutions across Retail
Research Internship | Mumbai, India | Feb 2000 - Jan 2001
Object detection and tracking is a critical computer vision task that not only identifies the location and class of each object within a frame but also maintains a unique ID for every detected object as the video progresses. This project uses the YOLO model for object detection and classification in real-time videos and images.
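The idea of keeping a stable ID per object can be illustrated with a deliberately simplified tracker. The sketch below is not the project's implementation: it assumes detections arrive as `(x1, y1, x2, y2)` boxes from some detector, and it matches them to the previous frame greedily by Intersection over Union (IoU). The class name `IouTracker` and the `threshold` value are hypothetical.

```python
def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


class IouTracker:
    """Hypothetical minimal tracker: greedy IoU matching between frames."""

    def __init__(self, threshold=0.3):
        self.threshold = threshold  # minimum IoU to reuse an existing ID
        self.tracks = {}            # track ID -> box seen in the last frame
        self.next_id = 0

    def update(self, detections):
        """Assign an ID to every detection; return a list of (id, box)."""
        unmatched = dict(self.tracks)
        assigned = {}
        results = []
        for box in detections:
            best_id, best_score = None, self.threshold
            for track_id, previous in unmatched.items():
                score = iou(box, previous)
                if score >= best_score:
                    best_id, best_score = track_id, score
            if best_id is None:
                # No sufficiently overlapping track: start a new one.
                best_id = self.next_id
                self.next_id += 1
            else:
                # Each existing track can be matched at most once per frame.
                del unmatched[best_id]
            assigned[best_id] = box
            results.append((best_id, box))
        self.tracks = assigned  # tracks with no match in this frame are dropped
        return results


tracker = IouTracker()
frame1 = tracker.update([(0, 0, 10, 10), (50, 50, 60, 60)])
frame2 = tracker.update([(51, 50, 61, 60), (1, 0, 11, 10)])
frame3 = tracker.update([(200, 200, 210, 210)])
```

Production trackers typically add motion models and appearance features on top of this kind of overlap matching, so that IDs survive occlusions and missed detections.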
This project aims to implement automated face obfuscation techniques on the ImageNet dataset using computer vision. The system detects and blurs or pixelates human faces while preserving the rest of the image, thereby ensuring privacy protection while maintaining dataset usability for non-facial recognition tasks.
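As a rough illustration of the pixelation step, the sketch below replaces each small block inside a face bounding box with its mean intensity. It assumes a grayscale image stored as a list of rows and a box already supplied by a face detector; the function name `pixelate_region`, the `(top, left, bottom, right)` box layout, and the `block` size are hypothetical choices for this example, not details from the project.

```python
def pixelate_region(image, box, block=2):
    """Return a copy of `image` with the region `box` pixelated.

    image: 2D list of integer grayscale values (rows of pixels).
    box:   (top, left, bottom, right), with bottom/right exclusive.
    block: side length of each pixelation cell, in pixels.
    """
    top, left, bottom, right = box
    out = [row[:] for row in image]  # leave the input untouched
    for by in range(top, bottom, block):
        for bx in range(left, right, block):
            ys = range(by, min(by + block, bottom))
            xs = range(bx, min(bx + block, right))
            values = [image[y][x] for y in ys for x in xs]
            mean = sum(values) // len(values)
            # Every pixel in the cell takes the cell's mean intensity.
            for y in ys:
                for x in xs:
                    out[y][x] = mean
    return out


image = [
    [0, 10, 20, 30],
    [40, 50, 60, 70],
    [80, 90, 100, 110],
    [120, 130, 140, 150],
]
blurred = pixelate_region(image, box=(0, 0, 2, 2), block=2)
```

Pixels outside the box are copied unchanged, which is what keeps the rest of the image usable for non-facial tasks.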
This project focuses on multi-person, real-time pose detection and estimation using deep learning-based computer vision techniques. The project explores applications in human activity recognition, sports analytics and motion tracking. Optimizations like multi-scale feature extraction, edge inference, and lightweight architectures ensure efficiency for real-world deployment.
This project focuses on Object Recognition, Tracking and Speed estimation. The goal is to estimate the speed of objects in real-time by utilizing YOLOv9 for real-time object detection and DeepSORT for multi-object tracking. By analyzing consecutive frames and calculating displacement over time, the system determines object speeds and overlays them on the video. This approach is useful for applications in traffic monitoring, sports analytics, and autonomous systems.
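The displacement-over-time calculation can be sketched as follows. This is a minimal example under stated assumptions: the tracker is taken to provide one pixel centroid per consecutive frame for a given object, and `meters_per_pixel` is a hypothetical calibration constant that a real system would derive from the camera geometry.

```python
import math


def estimate_speeds(track, fps, meters_per_pixel):
    """Return the speed in km/h between each pair of consecutive frames.

    track: list of (x, y) pixel centroids for one tracked object.
    fps: video frame rate in frames per second.
    meters_per_pixel: assumed scale factor from pixels to meters.
    """
    speeds = []
    for (x0, y0), (x1, y1) in zip(track, track[1:]):
        pixels = math.hypot(x1 - x0, y1 - y0)  # displacement in pixels
        meters = pixels * meters_per_pixel     # displacement in meters
        meters_per_second = meters * fps       # one frame lasts 1/fps seconds
        speeds.append(meters_per_second * 3.6) # convert m/s to km/h
    return speeds


# An object moving 5 pixels per frame at 30 fps with 0.05 m per pixel
# travels 7.5 m/s, which is roughly 27 km/h.
track = [(100, 200), (103, 204), (106, 208)]
speeds = estimate_speeds(track, fps=30, meters_per_pixel=0.05)
```

Per-frame estimates like these are noisy in practice, so averaging over a short window of frames is a common refinement.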
This project focuses on real-time object detection and classification using deep learning models trained on the COCO (Common Objects in Context) dataset. By leveraging state-of-the-art architectures such as YOLO (You Only Look Once), Faster R-CNN, and SSD (Single Shot MultiBox Detector), the system identifies and classifies multiple objects in images and videos with high accuracy. The project also incorporates bounding box annotations and confidence scores to enhance interpretability. Potential applications include autonomous driving, smart surveillance, and augmented reality.
This project improves medical image segmentation by integrating guided self-attention into CNNs, enhancing feature selection and long-range dependencies. Using an MRI-based dataset, a U-Net transformer-based model is trained with dice loss and evaluated using DSC and IoU metrics. The approach outperforms standard models, achieving higher accuracy and lower variance, making it valuable for automated medical diagnosis.
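For reference, the two evaluation metrics named above can be computed as in the sketch below. It assumes flattened binary masks (lists of 0s and 1s) and treats two empty masks as a perfect match, which is a convention chosen for this example rather than something stated in the project; the function name `dice_and_iou` is likewise hypothetical.

```python
def dice_and_iou(pred, target):
    """Return (Dice coefficient, IoU) for two flat binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    intersection = sum(p & t for p, t in zip(pred, target))
    pred_sum, target_sum = sum(pred), sum(target)
    union = pred_sum + target_sum - intersection
    if union == 0:
        return 1.0, 1.0  # both masks empty: treated as perfect agreement
    dice = 2 * intersection / (pred_sum + target_sum)
    iou = intersection / union
    return dice, iou


pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 1, 1, 0]
dice, iou = dice_and_iou(pred, target)
dice_loss = 1.0 - dice  # the quantity minimized when training with Dice loss
```

The two metrics are monotonically related (Dice = 2·IoU / (1 + IoU)), so they rank models the same way on a single image but can differ when averaged across a dataset.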
This project uses U-Net to automate femur bone contour extraction in knee X-rays, aiding surgeons' decision-making. Manual annotation is time-consuming and repetitive, making deep learning a valuable solution. The model improves efficiency, accuracy, and consistency, reducing effort in surgical planning.
This project implements a 3D CNN for automatic glioma segmentation in MRI scans, capturing multi-scale contextual information to handle tumor variations. It hierarchically segments glioma subregions and achieves state-of-the-art Dice scores on the BraTS 2017 dataset, ensuring compactness, efficiency, and accuracy. This approach is highly valuable for surgeons and radiologists, aiding in tumor monitoring and treatment planning.
This project aims to modernize agriculture by leveraging AI and Deep Learning to automate plant seedling recognition and reduce reliance on manual labor. By efficiently distinguishing between crops and weeds, the system enhances accuracy, productivity, and crop yields while freeing agricultural workers for higher-level decision-making. This innovation not only optimizes farming operations but also contributes to sustainable agricultural practices, making the industry more efficient and environmentally friendly in the long run.
Managing online paid advertisements across multiple platforms presents significant challenges due to fragmented interfaces, complex account structures, and the manual effort required to optimize campaigns effectively. Advertisers using channels like Google Ads, Facebook Ads, LinkedIn, and Twitter face difficulties in real-time ad modifications, performance monitoring, and implementing quick changes based on user insights. Current systems often lack the flexibility to allow advertisers to make immediate adjustments directly from the ad display environment, relying instead on separate, platform-specific tools. Additionally, analyzing the performance of new or edited ads is cumbersome with existing tools, which provide limited capabilities for real-time feedback and optimization. Advertisers struggle to track performance changes across channels simultaneously, and users viewing the ads have restricted options to offer suggestions or initiate modifications efficiently. This results in time-consuming processes, delayed campaign improvements, and suboptimal ad performance. This patent introduces an innovative method and system for managing online paid advertisements that simplifies campaign management across multiple channels. It enables advertisers to modify ad content and features in real time, directly from the ad interface, while simultaneously monitoring performance metrics. By streamlining ad management, enhancing cross-channel performance tracking, and reducing the complexity of campaign optimization, this system improves advertising efficiency, maximizes ROI, and empowers advertisers to respond swiftly to market demands.
Many merchants face challenges in managing multi-channel purchases, where offline and online transactions are handled separately, leading to inefficiencies in order management, sales attribution, and customer engagement. Additionally, generating personalized promotions requires manual intervention, delaying the process and missing real-time opportunities to influence purchasing decisions. Competitive product analysis is often reactive, lacking real-time insights that could enhance pricing strategies and improve customer conversion rates. Existing e-commerce systems struggle to integrate offline and online sales seamlessly, provide dynamic promotions, and deliver real-time competitive product analysis. Merchants often face operational delays, inaccurate sales metrics, and missed opportunities for personalized customer engagement due to fragmented systems and manual processes. This patent introduces an advanced, computer-implemented system that unifies multi-channel prospective purchases into a single draft order, streamlines promotion recommendations using automated analysis, and performs real-time competitive product searches. By enhancing sales attribution accuracy, automating personalized promotions, and providing real-time competitive insights, the system optimizes sales processes, improves customer experiences, and drives revenue growth across diverse retail environments.
Many borrowers face challenges financing mid-sized projects, such as home improvements or furniture purchases, which are often too large for standard credit lines but too small for traditional loans or home equity lines of credit. Additionally, merchants prefer credit card payments, making mortgage loans impractical for such needs. Even when borrowers have funds, they may seek loans with favorable terms, but the mortgage process can be time-consuming and stressful. Existing lending systems match borrowers with specific lender pools, requiring repeated applications if denied, causing delays and discomfort for both borrowers and merchants. There is a need for flexible systems that streamline loan applications, provide access to multiple lenders, and offer a minimally intrusive user experience. Moreover, solutions that enable near-instant loan fund access, allowing merchants to process payments like typical credit card transactions, are essential. This patent introduces an innovative system and method for loan application, origination, and assignment, designed to simplify the borrowing process. It offers borrowers quick access to funds, seamless payment integration for merchants, and flexible connections to multiple lenders, reducing friction, enhancing convenience, and transforming the lending experience.
**Abstract** : High-ash coal presents unique challenges for gasification due to its low calorific value and high ash content, which impede the production of syngas—a versatile fuel for industrial applications. These characteristics result in reduced gasification efficiency, elevated emissions, and increased operational complexities, posing environmental and economic difficulties. Traditional modeling approaches, reliant on empirical correlations, often fail to capture the complex, nonlinear dynamics of high-ash coal gasification. These dynamics involve intricate interactions among variables like temperature, pressure, chemical reactions, and material properties, all of which are critical for accurately predicting outcomes such as syngas composition, carbon conversion, and calorific value.
**Abstract** : Closed Circuit Television (CCTV) surveys are widely used in Mumbai to assess the structural integrity of underground drainage and sewer systems, which play a critical role in the city's infrastructure. Given Mumbai’s high population density and heavy monsoons, maintaining underground pipelines is essential to prevent urban flooding and structural collapses. The manual visual inspection of pipeline video footage for defect classification is labor-intensive, prone to subjectivity, and highly inefficient, particularly in a city with thousands of kilometers of aging sewer networks. This paper presents ongoing research into the automatic assessment of underground pipelines in Mumbai using AI-based image processing, reducing human fatigue and error. By leveraging Computer Vision techniques, the system aims to streamline preventive maintenance operations for the Brihanmumbai Municipal Corporation (BMC) and similar urban planning authorities, ensuring sustainable urban infrastructure management. This study introduces an advanced, automated image analysis framework that integrates traditional image processing techniques with a Bayesian inference model to improve detection accuracy and manage uncertainty in complex imaging environments. The framework employs Gaussian filtering for noise reduction and image enhancement, followed by Canny edge detection to delineate potential crack boundaries. Validated on real-world datasets, this approach demonstrates superior robustness compared to deterministic methods, significantly reducing false positives and offering scalable, reliable solutions for automated underground pipe inspections. The integration of probabilistic modeling with traditional image analysis not only advances the state-of-the-art in structural health monitoring but also sets the foundation for future research in intelligent infrastructure diagnostics. 
This scalable framework presents a reliable solution for automated underground pipe inspection, with potential applications in infrastructure monitoring, predictive maintenance, and industrial diagnostics.
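The abstract does not spell out the Bayesian model, so the following is only a generic illustration of how an edge-detection result could update a crack probability using Bayes' rule. The prior and the two likelihoods are hypothetical numbers, and `posterior_crack` is a name invented for this sketch.

```python
def posterior_crack(prior, p_edge_given_crack, p_edge_given_clean, edge_detected):
    """Update the probability that a pipe region contains a crack.

    prior: probability of a crack before looking at this observation.
    p_edge_given_crack: chance the edge detector fires on a cracked region.
    p_edge_given_clean: chance it fires on a clean region (false alarm).
    edge_detected: whether the edge detector fired for this observation.
    """
    if edge_detected:
        like_crack, like_clean = p_edge_given_crack, p_edge_given_clean
    else:
        like_crack, like_clean = 1 - p_edge_given_crack, 1 - p_edge_given_clean
    evidence = like_crack * prior + like_clean * (1 - prior)
    return like_crack * prior / evidence


# Hypothetical values: 10% prior, 90% hit rate, 20% false-alarm rate.
belief = 0.1
history = []
for fired in [True, True, False]:
    belief = posterior_crack(belief, 0.9, 0.2, fired)
    history.append(belief)
```

Carrying the belief from one observation to the next is what lets a single spurious edge raise the probability only moderately, while repeated detections push it toward certainty.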
**Abstract** : Until recently, aseptic loosening was the most common cause for revision of a Total Hip Replacement (THR). According to the National Joint Registry, this has now been replaced by post-operative periprosthetic femoral fractures (POPPF). Managing this condition involves complex surgery, which is expensive and associated with high mortality and morbidity. The incidence of POPPF appears to be increasing year on year, and the cumulative probability of sustaining a POPPF within 10 years of THR was 1%, with 15% of these patients dying within one year of surgery. While long-term follow-up of joint replacements was previously the norm, this is no longer the case in most institutions. It is therefore imperative that Orthopaedic Surgeons in general and arthroplasty surgeons in particular develop a strategy to identify patients at risk and take pre-emptive action on an elective basis. This would generally involve early revision to prevent the occurrence of a fracture. This research proposes an innovative approach leveraging Computer Vision (CV) and Explainable Artificial Intelligence (XAI) to develop an automated Early Warning System for detecting pre-fracture indicators such as implant loosening, positional changes, and bone loss from serial post-operative radiographs. By enabling remote monitoring and risk assessment without requiring in-person hospital visits, this AI-driven system has the potential to enhance clinical decision-making, reduce patient morbidity and mortality, and deliver substantial cost savings to healthcare systems.
**Abstract** : Efficient inventory management is a cornerstone of retail success, directly influencing revenue, profitability, operational efficiency, and customer satisfaction. Traditional inventory tracking methods are often labor-intensive, error-prone, and reactive rather than proactive, leading to stockouts, overstocking, and lost sales opportunities. This research explores the integration of Computer Vision (CV) and Explainable AI (XAI) to develop an intelligent, real-time shelf monitoring system that automates inventory tracking, detects stock shortages, and ensures optimal shelf replenishment. Leveraging state-of-the-art deep learning techniques such as YOLO-based object detection and semantic segmentation, the system continuously monitors shelf conditions with high precision and speed. Furthermore, Explainable AI methodologies enhance transparency by providing interpretable insights into the system's decision-making process, empowering store managers with actionable intelligence to optimize restocking strategies, reduce revenue loss, and enhance the overall customer shopping experience. The proposed solution has the potential to revolutionize inventory management by enabling data-driven decision-making, improving supply chain efficiency, and transforming retail operations at scale.
**Abstract** : Drawing Interpreter is a robust framework designed for the automated extraction and modification of geometric and textual entities from AutoCAD files. It streamlines the processing of complex technical drawings by leveraging traditional image processing techniques and CAD manipulation methods to interpret AutoCAD DXF and DWG files. Utilizing the powerful ezdxf library, the framework efficiently parses key CAD entities, including lines, circles, polylines, and text annotations, converting them into structured, human-readable formats. Beyond data extraction, the Drawing Interpreter enables seamless automated modifications such as adjusting entity positions, resizing dimensions, modifying layer properties, and adding or removing specific design elements based on user-defined parameters. This non-visual CAD editing capability eliminates the need for manual interaction with AutoCAD software, significantly reducing human effort in repetitive design workflows. The framework demonstrates high efficiency, accuracy, and compatibility with industry-standard CAD tools through rigorous testing on real-world engineering drawings. Modified files can be exported back into DXF or DWG formats, ensuring seamless integration with existing CAD workflows. The Drawing Interpreter offers substantial value across industries that rely on scalable, automated CAD file manipulation, driving productivity gains and enhancing design automation.
**Abstract** : E-Governance has emerged as a transformative force across various industries, with higher education institutions increasingly adopting its principles to enhance operational efficiency and service delivery. Universities, in particular, are leveraging e-Governance to streamline administrative processes and improve student experiences. A notable success story in this domain is the implementation of Student e-Services Management through e-Governance in India. Driven by the objectives of reducing costs, saving time, and optimizing administrative efforts, the Maharashtra State Government, in collaboration with the Maharashtra Knowledge Corporation Limited (MKCL), launched the 'e-Suvidha' project in 2006. This initiative provides a comprehensive suite of e-Services to students across state universities at an exceptionally affordable cost of just $1 per student per year. The project has significantly improved the efficiency of academic and administrative workflows, including admissions, examinations, results, and certification processes. Its success has inspired similar implementations across other states in India, positioning 'e-Suvidha' as a role model for universities globally. This publication serves as a valuable resource for professionals, academicians, and students in the fields of e-Governance, Information Technology, Management, and Education. It offers in-depth insights into the design, implementation, and impact of e-Governance solutions in higher education, highlighting best practices and lessons learned from the 'e-Suvidha' project. By exploring the intersection of technology and education administration, this work aims to inspire the development of innovative, cost-effective, and scalable e-Governance models worldwide.
I’m an avid hiker and mountaineer. I’ve climbed five of the Seven Summits, the highest peaks on each of the seven continents, and am currently training to summit Mount Aconcagua in 2026 and Mount Everest in 2027.
1. Mount Everest (Asia): 8,849 meters (29,032 feet) - Scheduled for 2027
2. Mount Aconcagua (South America): 6,961 meters (22,838 feet) - Scheduled for 2026
3. Mount Kilimanjaro (Africa): 5,895 meters (19,341 feet) - Summited in 2025
4. Denali (North America): 6,190 meters (20,310 feet) - Summited in 2024
5. Mount Elbrus (Europe): 5,642 meters (18,510 feet) - Summited in 2020
6. Mount Vinson (Antarctica): 4,892 meters (16,050 feet) - Summited in 2019
7. Mount Puncak Jaya (Oceania): 4,884 meters (16,023 feet) - Summited in 2018
As much as I love the mountains, I am equally mesmerized by the ocean. I am a PADI-certified Scuba Instructor and I have had the privilege of training individuals across Europe, from the kelp forests of the UK to the sea caves of Croatia. My role involves not only imparting essential diving skills but also fostering a deep appreciation for marine ecosystems. This experience has honed my teaching abilities, cultural adaptability, and commitment to safety. Leading diverse groups in dynamic underwater environments has equipped me with unique perspectives on discipline, precision, and the importance of continuous learning.
I am currently undergoing deep-sea diving training and have embarked on expeditions to some of the world's most renowned dive sites. At Australia's Great Barrier Reef, the largest coral reef system globally, I explored its vibrant marine life and intricate coral formations. Diving into Belize's Great Blue Hole, a UNESCO World Heritage site, I descended into its vast underwater sinkhole, witnessing its unique geological structures. In Lake Titicaca, straddling the border between Peru and Bolivia, I navigated the high-altitude freshwater lake, uncovering its hidden underwater landscapes. These experiences have enriched my understanding of diverse aquatic ecosystems and advanced my diving proficiency.
I am the cofounder of DoAR - Donate An Hour (🌍 donateanhour.org), a non-profit organization in India committed to bridging the digital divide and fostering sustainable development in underserved communities.
Recognizing education as a catalyst for change, I lead the Computer Science & Technology chapter, bringing together Professors, Educators, and Software Engineers to volunteer their expertise in teaching coding, computational thinking, and digital literacy. This initiative not only enhances technical skills but also nurtures problem-solving abilities and career opportunities, creating a scalable impact in marginalized communities.
My leadership in this initiative reflects my deep commitment to leveraging technology for social good, a principle that also drives my research aspirations in computer science, AI, and equitable technology development. Through this work, I aim to bridge academia and real-world impact, ensuring that technological advancements are accessible and transformative for all.
The Symphonica is a blues band that performs at concerts and corporate gigs. I am the co-founder and current co-Music Director of The Symphonica Blues, which means I am part of the Executive Board. I also lead the group musically and organize rehearsals, in addition to other responsibilities. I've held many other positions in the group over the past couple of years, including Auditions Manager, Publicity Chair, Social Chair, and Choreographer.
Copyright © 2025 Rahul Kulkarni