Available for opportunities

Hi, I'm Sohaib Shamsi

Full-Stack Engineer & AI/ML Specialist crafting intelligent systems that bridge cutting-edge research with production-grade software.

About Me

Turning Complex Problems Into Elegant Solutions

Sohaib Shamsi

I'm a Computer Science graduate from FAST NUCES, Karachi, with a deep focus on Artificial Intelligence, Machine Learning, and Full-Stack Engineering.

I specialize in designing and deploying end-to-end AI pipelines: from data ingestion and model training to production APIs and scalable infrastructure. My work spans fraud detection systems, NLP engines, recommendation algorithms, reinforcement learning environments, and distributed data architectures.

Beyond code, I mentor aspiring AI engineers, compete in programming contests (Top 10 nationally in IEEE Xtreme), and regularly publish technical content on system design and ML engineering.

AI/ML Engineering

End-to-end ML pipelines, model deployment, MLOps

Full-Stack Dev

React, Node.js, Python, REST APIs, databases

Data Engineering

Apache NiFi, Solr, Ozone, distributed systems

Skills

My Tech Arsenal

Technologies and tools I use to bring ideas to life

AI / Machine Learning

PyTorch, TensorFlow, Scikit-learn, Hugging Face, LangChain, OpenAI API, Anthropic API, XGBoost, NLTK / spaCy, OpenCV, MLflow, W&B, RAG Pipelines, RLHF

Full-Stack Development

React, Next.js, Node.js, Express, FastAPI, Flask, TypeScript, Tailwind CSS, PostgreSQL, MongoDB, Redis, GraphQL

Data & Cloud Engineering

Apache NiFi, Apache Solr, Apache Ozone, Apache Tika, Apache Kafka, Docker, Kubernetes, AWS, CI/CD, Terraform, Airflow, Spark

Tools & Practices

Git / GitHub, Linux, VS Code, Jupyter, Agile / Scrum, System Design, REST APIs, Microservices, Testing, Documentation

Projects

What I've Built

A selection of projects that showcase my expertise in AI, ML, and full-stack development

Fraud Alert Triage & Evaluation Pipeline

AI-driven alert triage module for fraud risk management at Paysys. It automates the prioritization and classification of high-volume transaction alerts using an ML pipeline that retrains dynamically over elastic data windows to adapt to evolving fraud patterns.

Python, XGBoost, Scikit-learn, FastAPI, PostgreSQL, Docker

  • Dynamic retraining with elastic data windows
  • Real-time alert classification at scale
  • Continuous feedback loop integration

Enterprise Data Archiving Pipeline

Engineered a structured and unstructured data archiving pipeline using Apache NiFi, Tika, and Solr for content extraction, indexing, and searchability. Integrated long-term storage with Apache Ozone (S3-compatible) for scalability and durability.

Apache NiFi, Apache Tika, Apache Solr, Apache Ozone, Java, S3

  • Multi-format content extraction
  • Full-text search with Solr indexing
  • S3-compatible lifecycle management

Agentic RL Framework for LLM Teaching

Designed and implemented an agentic AI framework for reinforcement learning tasks using Anthropic models at Preference Model. Built RL agents with frozen language models for inference, developing a teaching pipeline where agents curate high-quality offline datasets.

Python, Anthropic API, PyTorch, RL, Transformers, RLHF

  • Agentic AI framework for RL tasks
  • Frozen LLM inference pipeline
  • Automated judge evaluation system

Insulin Resistance & ML Analysis

Applied advanced ML models to public physiological and clinical time-series datasets to study insulin resistance under noisy, non-stationary data conditions. Built robust preprocessing pipelines for biomedical signal analysis.

Python, TensorFlow, Pandas, SciPy, Matplotlib, Jupyter

  • Time-series clinical data modeling
  • Noise-robust feature engineering
  • Biomedical signal processing

Collaborative Filtering Recommendation System

Implemented matrix factorization and neural collaborative filtering models on user–item interaction datasets (MovieLens-style), analyzing sparsity, cold-start, and scalability trade-offs.

Python, PyTorch, Surprise, NumPy, Pandas, Flask

  • Matrix factorization & neural CF
  • Cold-start mitigation strategies
  • Scalability benchmarking

Reinforcement Learning Task Design

Designed custom RL environments with synthetic and semi-realistic datasets, focusing on reward shaping, evaluation stability, and policy gradient methods for complex sequential decision-making tasks.

Python, OpenAI Gym, Stable Baselines3, PyTorch, Ray RLlib

  • Custom Gym environment design
  • Advanced reward shaping
  • Policy gradient evaluation

RAG-Powered Knowledge Engine

Built a Retrieval-Augmented Generation system combining vector databases with LLMs for context-aware question answering over large document corpora. Features semantic search, chunk optimization, and hallucination reduction.

LangChain, OpenAI, Pinecone, FastAPI, React, Docker

  • Semantic vector search
  • Context-aware response generation
  • Hallucination guard rails

Real-Time Sentiment Analysis Dashboard

End-to-end NLP pipeline streaming social media data through Kafka, performing real-time sentiment classification with transformer models, and visualizing trends on a live React dashboard with WebSocket updates.

Transformers, Kafka, React, WebSocket, D3.js, FastAPI

  • Sub-second streaming inference
  • Transformer-based classification
  • Interactive D3.js visualizations

AI-Powered Code Review Agent

Autonomous code review bot using LLMs to analyze pull requests, detect bugs, suggest refactoring, and enforce coding standards. Integrates with GitHub Actions so reviews run as part of the CI/CD pipeline.

GPT-4, LangChain, GitHub API, Node.js, TypeScript, Docker

  • Automated PR analysis
  • Bug detection & code smell alerts
  • GitHub Actions integration

Distributed ML Training Platform

Scalable distributed training infrastructure supporting data and model parallelism across GPU clusters. Features automatic hyperparameter tuning, experiment tracking, and one-click model deployment with MLflow.

PyTorch DDP, Ray, MLflow, Kubernetes, Terraform, AWS

  • Multi-GPU distributed training
  • Automated hyperparameter search
  • One-click model serving

Computer Vision Quality Inspector

Deep learning-based visual inspection system for manufacturing defect detection. Uses custom-trained YOLO and EfficientNet models with real-time inference on edge devices, achieving 98.5% defect detection accuracy.

YOLOv8, EfficientNet, OpenCV, ONNX, TensorRT, FastAPI

  • 98.5% detection accuracy
  • Edge-optimized inference
  • Real-time video processing

ILF Cross-Currency Payment System

Led the Interledger Framework (ILF) project at Paysys enabling cross-currency payments using the Interledger Protocol. Built a robust settlement system handling multi-currency transactions with real-time exchange rates.

Node.js, TypeScript, Interledger, PostgreSQL, Redis, Docker

  • Cross-currency settlement engine
  • Interledger Protocol integration
  • Real-time FX rate handling

Experience

Career & Achievements

Nov 2024 — 2025

AI/ML Engineer — Preference Model

Remote, USA

Designed and implemented an agentic AI framework for reinforcement learning tasks using Anthropic models.

  • Built RL agents with frozen language models for inference, focusing on teaching LLMs effective strategies
  • Developed an RL-based teaching pipeline where agents curate high-quality offline datasets
  • Implemented automated judge system for model performance evaluation
Jun 2024 — Present

Software Engineer & AI — Paysys

Karachi, Pakistan

Working on AI-driven fraud risk management and cross-currency payment systems.

  • Built rule engines and AI models for fraud risk management using Tazama's platform
  • Led ILF (Interledger Framework) project for cross-currency payments with Interledger Protocol
  • Engineered data archiving pipeline with Apache NiFi, Tika, Solr, and Ozone
2021 — 2025

BS Computer Science — FAST NUCES, Karachi

Relevant Coursework: Machine Learning, Deep Learning, NLP, Computer Vision, Distributed Systems, Data Structures & Algorithms, Database Systems, Operating Systems

Competition

IEEE Xtreme Programming Contest

Achieved Top 10 nationally in the IEEE Xtreme 24-hour competitive programming marathon, demonstrating exceptional problem-solving skills under pressure.

Ongoing

AI/ML Mentor & Technical Writer

Mentored 50+ students in AI and ML fundamentals through online platforms. Regularly publish technical content, system design explanations, and ML engineering guides.

Contact

Let's Build Something Amazing

Have a project in mind or want to collaborate? I'd love to hear from you.

Location

Karachi, Pakistan