Shaida Muhammad

Machine Learning Engineer

About Me

I am a dedicated and analytical Machine Learning Engineer with extensive experience in developing and deploying NLP and classification models. My background includes hands-on work with sentiment analysis, text classification, and large language models (LLMs) such as Meta Llama 2 and Phi-3.5, all applied to understanding and analyzing text data.

I possess a strong foundation in data engineering and machine learning, demonstrated through various projects involving text summarization, classification, and conversational AI. My expertise in leveraging transformer-based models allows me to extract meaningful insights from complex data sets.

I am committed to continuous learning and staying updated with the latest research in machine learning. I thrive in collaborative environments and am eager to contribute to projects that require technical proficiency and innovation.

Work Experience

January 2024 – June 2024

Sr. Machine Learning Engineer

Love For Data, Rawalpindi, Pakistan
  • Worked on Large Language Models (LLMs) for text summarization and information extraction, enhancing content comprehension.
  • Implemented text classification models using HuggingFace transformers (BERT, RoBERTa) to analyze textual data, including sentiment analysis tasks.

October 2022 – December 2023

Machine Learning Engineer

Datacrypt, Remote, UAE
  • Developed intelligent chatbots using the RASA Framework, improving user interaction and feedback collection.
  • Designed and maintained data automation pipelines, ensuring efficient data handling and processing.
  • Deployed multiple machine learning applications on AWS, Azure, and GCP, ensuring scalability and reliability.

April 2021 – May 2022

Machine Learning Engineer

QLU.ai, Islamabad, Pakistan
  • Enhanced the company's product with machine learning-based components, focusing on NLP for better user engagement.
  • Trained and deployed NLP models for text classification and analysis, including sentiment analysis models.
  • Conducted data engineering tasks on textual data to improve model performance.

Education

Sep 2019 – 2023

MS in Computer Science (18 years of education)

National University of Sciences and Technology (NUST), Islamabad, Pakistan

CGPA: 3.90

Core Courses: Advanced Algorithms Analysis, Mathematical Methods for Computing, Theory of Computation, Operating Systems

Elective Courses: Machine Learning, Deep Learning, Data Mining, Natural Language Processing (NLP)

Oct 2015 – Mar 2018

M.Sc in Computer Science (16 years of education)

University of Peshawar, Peshawar, Pakistan

Marks: 80.58%

Courses: Data Structures, Algorithms, Databases, Networking, C++, E-Commerce, Digital Logic Design, Software Engineering, Compiler, Artificial Intelligence, Operating Systems

Oct 2013 – Aug 2015

Bachelor of Computer Science (14 years of education)

Bacha Khan University Charsadda, Charsadda, Pakistan

Courses: Computer Science, Mathematics-A, Physics

Skills

Programming Languages & Frameworks

Python, Java, C++, Flask, Django, FastAPI

Machine Learning Libraries

PyTorch, Scikit-learn, NLTK, SciPy, NumPy, Pandas, TensorFlow, Keras, HuggingFace

Other Skills

Git, GitHub, GitLab, SQL, Linux

Projects

Voice-Based AI Assistant for Elderly

Phi-3.5, Open-Source TTS/STT

Developed a voice-based AI assistant system designed to support elderly individuals by answering queries related to their health, medicines, and other relevant topics. This system utilized the Phi-3.5 model and integrated various open-source text-to-speech and speech-to-text models to create a comprehensive pipeline. The assistant functions similarly to Siri, providing voice-activated responses to help users manage their daily needs and inquiries.

Document Summarization System

Django, PyTorch, Meta Llama 2, Quantized Model

Developed a document summarization system using Meta Llama 2, integrated into a Django-based web app. The project ran in a secure environment with limited resources and summarized documents of varying lengths and formats, from a few sentences to hundreds of pages. Using a quantized Meta Llama 2 13B model and working closely with the Django and DevOps teams, I ensured efficient processing and accurate summaries tailored to specific formats and requirements.
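
For illustration, one common way to summarize a document longer than a model's input budget is to summarize it chunk by chunk and then summarize the partial summaries. The sketch below shows that general idea only. It is an assumption about the technique, not the project's actual code: `summarize` is a hypothetical stand-in for the model call, and the word-based `max_words` budget is a simplification of a real token limit.

```python
from typing import Callable, List


def split_into_chunks(text: str, max_words: int) -> List[str]:
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def summarize_long_document(
    text: str,
    summarize: Callable[[str], str],  # hypothetical model call: text in, summary out
    max_words: int = 1500,            # assumed budget, counted in words for simplicity
) -> str:
    """Summarize each chunk, then repeat on the joined partial summaries."""
    chunks = split_into_chunks(text, max_words)
    if not chunks:
        return ""
    if len(chunks) == 1:
        return summarize(chunks[0])

    combined = " ".join(summarize(chunk) for chunk in chunks)

    # Guard against a summarizer that does not shrink its input:
    # truncate to the budget and stop instead of recursing forever.
    if len(combined.split()) >= len(text.split()):
        return summarize(" ".join(combined.split()[:max_words]))

    return summarize_long_document(combined, summarize, max_words)


if __name__ == "__main__":
    # Toy summarizer that keeps the first three words of its input.
    first_three = lambda s: " ".join(s.split()[:3])
    document = " ".join(f"w{i}" for i in range(20))
    print(summarize_long_document(document, first_three, max_words=5))
```

Short documents take the single-chunk path and reach the model unchanged, so the same entry point can serve inputs from a few sentences to many pages.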

Domain Classification System

PyTorch, RoBERTa, Synthetic Data, Pandas

Developed a domain classification system for a complex document dataset with nearly 100 domains. Overcame limited and noisy labeled data by generating synthetic data and fine-tuning RoBERTa base models. Designed a hierarchical system with 16 parent models for initial broad domain classification and various child models for subdomain classification, achieving around 80% accuracy.
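
The parent-then-child routing described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the production design: each classifier is modeled as a plain callable from text to label, the parent stage is treated as a single callable even if several models sit behind it, and the domain names are hypothetical.

```python
from typing import Callable, Dict, Tuple

# A classifier is modeled as any callable mapping text to a label.
Classifier = Callable[[str], str]


def classify_hierarchically(
    text: str,
    parent: Classifier,
    children: Dict[str, Classifier],
) -> Tuple[str, str]:
    """Return (broad_domain, subdomain) for a document.

    The parent stage picks a broad domain; the child classifier
    registered for that domain (if any) picks the subdomain.
    """
    broad = parent(text)
    child = children.get(broad)
    if child is None:
        # No child model for this domain: fall back to the broad label.
        return broad, broad
    return broad, child(text)


if __name__ == "__main__":
    # Keyword rules stand in for fine-tuned models (hypothetical labels).
    def toy_parent(text: str) -> str:
        return "finance" if "bank" in text.lower() else "health"

    toy_children = {
        "finance": lambda t: "banking" if "loan" in t.lower() else "insurance",
    }
    print(classify_hierarchically("Bank loan terms", toy_parent, toy_children))
```

Routing this way means each child only has to separate the subdomains of one broad domain, which is an easier problem than separating all domains at once.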

Sentiment Analysis System

PyTorch, RoBERTa, Pandas

Developed a sentiment analysis model that classifies paragraphs within documents into five categories: extremely negative, negative, neutral, positive, and extremely positive. The model provides paragraph-level sentiment for document content, making the text easier to understand and interpret.

Data Processing Pipeline System

Python, NumPy, Pandas, lxml

Designed and implemented a data processing pipeline system for the Azadea Group, covering brands including Azadea, Decathlon, and Adidas. The system handled over 10GB of XML files daily, generating multiple CSV and XML files. Working with limited hardware (8GB RAM on a Windows laptop), I processed the data sequentially rather than loading it all at once, which avoided memory overload.
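
A minimal sketch of sequential XML processing that keeps memory use flat regardless of file size. It uses the standard library's `xml.etree.ElementTree.iterparse` so it runs without extra dependencies; lxml provides an `iterparse` with a similar interface. The record tag and field names are hypothetical, and this is an illustration of the approach rather than the original pipeline code.

```python
import csv
import io
import xml.etree.ElementTree as ET
from typing import IO, Sequence


def stream_records_to_csv(
    xml_source,            # file path or binary file-like object
    csv_out: IO[str],      # text stream that receives the CSV rows
    record_tag: str,       # hypothetical tag that wraps one record
    fields: Sequence[str], # hypothetical child tags to export as columns
) -> int:
    """Write one CSV row per record element, parsing incrementally."""
    writer = csv.writer(csv_out)
    writer.writerow(fields)
    count = 0
    # "end" events fire once an element and all its children are parsed.
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        if elem.tag == record_tag:
            writer.writerow([elem.findtext(f, default="") for f in fields])
            count += 1
            # Drop the record's children and text so they can be freed.
            elem.clear()
    return count


if __name__ == "__main__":
    sample = (
        b"<items>"
        b"<item><sku>A1</sku><price>10</price></item>"
        b"<item><sku>B2</sku><price>20</price></item>"
        b"</items>"
    )
    out = io.StringIO()
    stream_records_to_csv(io.BytesIO(sample), out, "item", ["sku", "price"])
    print(out.getvalue())
```

Because each record is written and cleared before the next one is parsed, peak memory depends on the size of a single record rather than the size of the file.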

Message Health Checker

PyTorch, Transformers, RoBERTa, NumPy, Pandas

Developed "Message Health Checker," a RoBERTa-based classification system for LinkedIn messages. The tool scores each message on 5 categories in a multi-label setup, with 5 ranks per category, to assess tone and effectiveness. It helps users craft highly personalized messages, boosting engagement by 10x and tracking campaign success with detailed conversion metrics.
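
One way to read "5 categories with 5 ranks each" is a model head that emits 25 scores, one block of 5 per category, with the highest score in each block giving that category's rank. The sketch below decodes such an output. The flat layout, the 1 to 5 rank numbering, and the category names are all assumptions made for illustration; the source does not specify them.

```python
from typing import Dict, Sequence

# Hypothetical category names; the real ones are not given in the source.
CATEGORIES = ["tone", "clarity", "personalization", "length", "call_to_action"]
RANKS_PER_CATEGORY = 5


def decode_ranks(
    logits: Sequence[float],
    categories: Sequence[str] = CATEGORIES,
    ranks: int = RANKS_PER_CATEGORY,
) -> Dict[str, int]:
    """Map a flat score vector to one rank (1..ranks) per category."""
    expected = len(categories) * ranks
    if len(logits) != expected:
        raise ValueError(f"expected {expected} scores, got {len(logits)}")

    result: Dict[str, int] = {}
    for i, name in enumerate(categories):
        block = logits[i * ranks:(i + 1) * ranks]
        # Index of the highest score within this category's block.
        best = max(range(ranks), key=lambda r: block[r])
        result[name] = best + 1  # report ranks as 1-based
    return result


if __name__ == "__main__":
    scores = [0.0] * 25
    for category_index, rank_index in enumerate([2, 0, 4, 1, 3]):
        scores[category_index * 5 + rank_index] = 1.0
    print(decode_ranks(scores))
```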

Message Assistant

PyTorch, Scikit-learn, Transformers, BERT, RoBERTa, fastText, NumPy

Developed a multiclass, multilabel text classification model for job advertisement messages sent by recruiters to job candidates on LinkedIn. The classifier rates the quality of each message and works alongside an AI-based text assistant. This project is a component of QLU.ai's primary product.

Job Description Splitting into Sections

Python, NLTK

Created a hybrid algorithm to split job descriptions into logical sections. This component acts as a helper that improves the precision and recall of other components of QLU.ai's primary product.
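
The source does not say which techniques the hybrid combines, so the sketch below shows only a plausible rule-based part: treating short lines that end in a colon or contain a known heading keyword as section boundaries. The keyword list, the five-word limit, and the "intro" default section are hypothetical choices for illustration.

```python
from typing import Dict, List

# Hypothetical heading keywords; a real list would be tuned on actual data.
HEADING_KEYWORDS = ("responsibilities", "requirements", "qualifications", "benefits", "about")
MAX_HEADING_WORDS = 5  # assumed: headings are short lines


def looks_like_heading(line: str) -> bool:
    """Heuristic: a short line ending in ':' or containing a heading keyword."""
    stripped = line.strip()
    core = stripped.rstrip(":").lower()
    if not core or len(core.split()) > MAX_HEADING_WORDS:
        return False
    return stripped.endswith(":") or any(k in core for k in HEADING_KEYWORDS)


def split_into_sections(text: str) -> Dict[str, List[str]]:
    """Group non-empty lines under the most recent heading."""
    sections: Dict[str, List[str]] = {"intro": []}
    current = "intro"
    for line in text.splitlines():
        if not line.strip():
            continue
        if looks_like_heading(line):
            current = line.strip().rstrip(":")
            sections.setdefault(current, [])
        else:
            sections[current].append(line.strip())
    return sections


if __name__ == "__main__":
    job_description = (
        "We are hiring an engineer.\n"
        "Responsibilities:\n"
        "Build models.\n"
        "Deploy services.\n"
        "Requirements\n"
        "Python experience.\n"
    )
    for heading, lines in split_into_sections(job_description).items():
        print(heading, "->", lines)
```

A rule like this misfires on short body lines that happen to contain a keyword, which is one reason to pair it with a second, learned signal.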

Job Description Sections Classification

PyTorch, HuggingFace Transformers

Trained transformer-based models such as BERT, RoBERTa, and XLNet to classify each section of a job description. This helper component increases the accuracy of other components of QLU.ai's primary product, thereby enhancing customer trust in the product.

Text Summarization, Paraphrasing, and Grammar Correction

HuggingFace, PyTorch

Researched pre-trained models with good accuracy for paraphrasing, text summarization, and grammar correction. These models serve as helper components for QLU.ai's message generation feature.

Conversational AI System (Chatbot)

RASA Framework

Developed a conversational AI system to handle communication between recruiters and candidates on LinkedIn.

Pokémon Image Classification

PyTorch, NumPy

Academic Project: Applied Feed-Forward Neural Network (FFNN) and Convolutional Neural Network (CNN) to classify Pokémon images into different categories.

Names Generation

PyTorch, NLTK

Academic Project: Developed a text generation model based on RNN to generate names.

Generating Meaningful Word Embeddings

Python, NumPy, SciPy

Academic Project: Trained Skip-gram, CBoW, and SVD models on a custom dataset to generate meaningful word embeddings.
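
As a small illustration of the data preparation behind Skip-gram, the sketch below generates (center, context) training pairs from a token list using a symmetric window. The function name and window size are illustrative assumptions rather than the project's code. CBoW is built from the same windows, but predicts the center word from its context words instead.

```python
from typing import List, Tuple


def skipgram_pairs(tokens: List[str], window: int) -> List[Tuple[str, str]]:
    """Return (center, context) pairs for every token within the window."""
    pairs: List[Tuple[str, str]] = []
    for i, center in enumerate(tokens):
        start = max(0, i - window)
        stop = min(len(tokens), i + window + 1)
        for j in range(start, stop):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


if __name__ == "__main__":
    for pair in skipgram_pairs(["the", "cat", "sat"], window=1):
        print(pair)
```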

Deployments on Cloud

AWS, Azure, GCP, EC2

Deployed various services on cloud hosting platforms like AWS, GCP, and Azure. Developed scalable data pipelines using AWS and Azure services. Utilized EC2 instances with external FTP, S3 connected to AWS Lambda for data processing, and Azure Virtual Machines with Azure Functions linked to Blob Storage for efficient data handling. Integrated GitHub Actions as a CI/CD pipeline to automate deployment and streamline workflows across both cloud environments.
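
For the S3-to-Lambda part of such a pipeline, a handler typically starts by reading the bucket name and object key out of the S3 event notification. The sketch below shows only that step and leaves the processing as a placeholder. It is an illustrative assumption rather than the deployed code, and the sample bucket and key are hypothetical. Object keys arrive URL-encoded in S3 events, so they are decoded before use.

```python
from typing import Any, Dict, List
from urllib.parse import unquote_plus


def extract_s3_objects(event: Dict[str, Any]) -> List[Dict[str, str]]:
    """Collect (bucket, key) for every S3 record in a notification event."""
    objects: List[Dict[str, str]] = []
    for record in event.get("Records", []):
        s3 = record.get("s3", {})
        bucket = s3.get("bucket", {}).get("name")
        key = s3.get("object", {}).get("key")
        if bucket and key:
            # Keys are URL-encoded in the event (spaces arrive as '+').
            objects.append({"bucket": bucket, "key": unquote_plus(key)})
    return objects


def lambda_handler(event: Dict[str, Any], context: Any) -> Dict[str, Any]:
    """Entry point: list the uploaded objects this invocation should handle."""
    objects = extract_s3_objects(event)
    # Placeholder: download, transform, and store each object here.
    return {"processed": len(objects), "objects": objects}


if __name__ == "__main__":
    sample_event = {
        "Records": [
            {
                "s3": {
                    "bucket": {"name": "example-bucket"},  # hypothetical
                    "object": {"key": "incoming/daily+report%281%29.xml"},
                }
            }
        ]
    }
    print(lambda_handler(sample_event, None))
```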

Research Publications

Conference Papers

Journal Papers

Teaching Experience

March 3, 2025 – March 28, 2025

Instructor – 4-Week Python Online Course

Python FREE Online Ramzan Course 2025, Remote, Worldwide
  • Delivered a 4-week online Python programming course from beginner to advanced level during the month of Ramzan.
  • Taught through live sessions that combined real-time coding, assignments, and quizzes.
  • Used Google Classroom, Google Meet, and Google Chat for content delivery and interaction with students.
  • Provided all materials—slides, code examples, assignments, and quizzes—open-source on GitHub.
  • GitHub Repository: github.com/ShaidaMuhammad/Python-Free-Ramzan-Course-2025

September 2018 – March 2019

Visiting Faculty – Computer Science

Government Shaheed Rizwan Sareer Higher Secondary School, Charsadda, Pakistan
  • Taught Computer Science to F.Sc students with a focus on both theory and practical programming.
  • Conducted coding practice sessions in the computer lab to reinforce programming concepts.
  • Assigned and evaluated assignments and quizzes to monitor student progress.
  • Managed attendance records and maintained classroom discipline and engagement.

Awards

Stoori da Pakhtunkhwa (Stars of Pakhtunkhwa) Scholarship

BISE Peshawar

Secured 15th position among the Top 20 students from Government Colleges in BISE Peshawar.

January 2014

Top 1% in GAT General Test

National Testing Service

The GAT is taken by graduates across Pakistan and is a requirement for applying for a Master's degree (MS). Scored 83 out of 100 marks, placing in the 99.62nd percentile.

January 2019

Certifications

Coursera Certifications

Natural Language Processing Specialization on Coursera [L9XZX72XDF2A]

Natural Language Processing with Classification and Vector Spaces [HV2BR2GHE7SU]

Natural Language Processing with Probabilistic Models [9ZKN4QEUBSDC]

Natural Language Processing with Sequence Models [Z84QTPRB9C6L]

Natural Language Processing with Attention Models [7SYPZ5D8DPDN]

Introduction to Large Language Models [LEKT5FXJHJ56]

Generative AI with Large Language Models [RBXM49D5MZH4]

Build Basic Generative Adversarial Networks (GANs) [Z7LPD7T6M3A4]

Introduction to Self-Driving Cars [5AYN5M4FG8LW]

What is Data Science? [XDT4LQRWLGNB]

Python Essentials for MLOps [96WYAFV6J2WJ]

AI For Everyone [46G53XPMEA6S]

How Google does Machine Learning [G3HM7DJ4XG5Z]

Artificial Intelligence on Microsoft Azure [2Q55W93P5JPL]

Introduction to Deep Learning [UF89T3S2QVQG]

Neural Networks and Deep Learning [LVB52JUQ9HU7]

Supervised Machine Learning: Regression and Classification [Y7ECGZDYCVWY]

Mathematics for Machine Learning: Linear Algebra [RYF7JRYNYMJS]

Get In Touch

I'm currently looking for new opportunities in machine learning and NLP. Whether you have a question or just want to say hi, I'll try my best to get back to you!

Email Me LinkedIn