ABOUT ME
Bennett University
August 2022 - August 2026
Bachelor of Technology (Computer Science) | CGPA: 9.04
Greater Noida, India
Delhi Public School, Rohini
March 2020 - July 2022
Science | 10th: 82% — 12th: 85%
Delhi, India
WORK EXPERIENCE
Deep Learning Intern
Thothica
April 2024 - Dec 2024
Remote
Responsibilities:
- Developed and implemented an evaluation method to assess the performance of Tesseract and Surya OCR engines in processing complex Urdu and Hindi scripts. Optimized Surya OCR, resulting in a 35% improvement in Hindi recognition accuracy.
- Built a proof of concept for a client within 24 hours using the OpenAI API, integrated it into a website, and hosted it on an AWS EC2 instance.
- Automated web scraping for Indian literature using Selenium, boosting efficiency by over 70%.
- Researched Retrieval-Augmented Reasoning, exploring methods to improve information retrieval and reasoning in AI systems.
Technologies:
Python, PyTorch, TensorFlow, Selenium, GenAI, LLM, OpenAI, AWS, NextJS, GitHub
MY PROJECTS
- Developed a custom Generative Adversarial Network (GAN) model using a convolutional neural network (CNN) as the generator and a CNN-based discriminator in PyTorch.
- Trained the GAN model on a custom dataset of NFT-style images sourced from HuggingFace, enabling the generation of unique and diverse NFT-style images through unsupervised learning. Developed an API using FastAPI for easy access and integration.
- Designed and directed the website's front end for consistent functionality, and integrated the image generator into the website for a better user experience.
Python, PyTorch, NumPy, Flask, HTML, TailwindCSS
- Restructured a deep learning model to transcribe text from videos of individuals speaking, leveraging lip movements for improved accuracy.
- Preprocessed video data using dlib and OpenCV to extract essential facial information necessary for the model.
- Defined a neural network architecture that combines spatial pooling layers to capture spatial lip movements with LSTM layers to model temporal sequences.
Python, TensorFlow, Streamlit
- Developed a multi-knowledge chatbot using LangChain to streamline access to university policies and protocols for students and staff.
- Implemented advanced natural language processing techniques to facilitate user queries, resulting in a significant reduction in response time for policy-related inquiries.
- Designed and executed a user-friendly interface that enables seamless interaction with the chatbot, improving overall engagement and satisfaction among users.
Flask, TailwindCSS, VectorDB, LLM, LangChain
- Evaluated and compared various computer vision frameworks, including MediaPipe and YOLO V7, and selected YOLO V7 for its superior performance and accuracy.
- Explored the potential of custom keypoint detection using Detectron.
- Used the YOLO V7 object detection algorithm to train a model capable of real-time pose estimation for exercises such as shoulder press and chest press.
Python, YOLO V7, Streamlit, MediaPipe
RESEARCH & PUBLICATIONS
How could we add emotional nuances to AI-generated music?
Feb 2025 - Current
In Process
A Comparative Study of BDM and TM in the Age of AI
Jan 2025 - March 2025
In review process
ISLR using Deep Learning: Attention is Everywhere
Sept 2024 - Dec 2024
Accepted
AI in Finance: Navigating Ethical Quandaries
Aug 2024 - Sept 2024
Accepted
ACHIEVEMENTS & CERTIFICATIONS
Fork-it Hackathon
2nd Position
IIIT Delhi
Engineering Project Showcase 2023
3rd Position
Bennett University
DeepLearning.AI TensorFlow Developer
Certification
DeepLearning.AI
Algorithmic Toolbox
Certification
UC San Diego
Fundamentals of Network Communication
Certification
University of Colorado
Introduction to Computers and Operating Systems and Security
Certification
Microsoft
Made with ❤️ by Moaksh Kakkar
© 2025 All rights reserved.