Architected high-performance backend systems and custom AI agents. Built a B2B outreach ecosystem processing millions of records and an Enterprise RAG system capable of querying 5,000+ Arabic/English research docs. Engineered a Go-based credit scoring engine focusing on concurrency and error handling.
Specialized in Document Intelligence and Generative AI. Fine-tuned DONUT transformer models for OCR-free extraction on French datasets. Architected a Next.js AI E-Learning platform integrating Tavily for live web browsing and course generation.
Author of 'Blitz Parse' (Rust/Python), a high-performance library for PDF extraction outperforming PyMuPDF. Winner of Stability AI Hackathon (LabLab.ai) for creating a local AI code-completion VS Code extension. Built 'Invoice Watcher', a cross-platform Electron app for real-time invoice automation.
A proprietary verification engine replacing commercial tools like Reoon. Reduced costs by 90% while processing hundreds of thousands of emails daily with high accuracy.
Commercial verification services were prohibitively expensive at cold-outreach scale. I built a custom engine deployed on GCP that cuts verification costs by 90%. It combines syntax validation, DNS/MX resolution, and SMTP connection simulation (probing the mail server without sending a message) to verify emails in real time.
- Implemented IP rotation and rate-limiting to avoid blacklists.
- Built a "Guesser" module generating valid emails from corporate patterns.
- Uses Celery/Redis for distributed task queuing across scalable workers.
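A minimal Python sketch of the first and last verification stages described above (names like `verifier.example.com` are placeholders; the MX lookup, IP rotation, and Celery wiring are omitted):

```python
import re
import smtplib

# Cheap first pass: a pragmatic (not fully RFC 5322) address pattern.
EMAIL_RE = re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$")

def syntax_valid(email: str) -> bool:
    """Reject obviously malformed addresses before any network work."""
    return bool(EMAIL_RE.match(email))

def smtp_probe(email: str, mx_host: str, timeout: float = 10.0) -> bool:
    """Simulate delivery: HELO / MAIL FROM / RCPT TO, then disconnect
    without ever sending a message body. A 250 reply to RCPT TO indicates
    the mailbox exists. Sender identity here is a placeholder."""
    with smtplib.SMTP(mx_host, 25, timeout=timeout) as smtp:
        smtp.helo("verifier.example.com")
        smtp.mail("probe@verifier.example.com")
        code, _ = smtp.rcpt(email)
        return code == 250
```

In production this probe runs behind the rate limiter, since repeated RCPT TO probes from one IP are a common blacklist trigger.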
Advanced RAG system for a major Arabic academic platform. Scales to 1.5 million documents using agentic workflows for deep research queries.
Processing unstructured Arabic academic texts (RTL, diacritics) is difficult. I built a hierarchical agent system using LangGraph where a "Meta-Agent" routes queries to specific document agents optimized for summarization or deep Q&A.
- Scaled from 5k to 1.5M documents using incremental indexing.
- Handled Arabic linguistic nuances with language-specific embedding models (AraBERT).
- Utilized multi-threading for high-throughput embedding generation.
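The Meta-Agent's routing step can be sketched in plain Python. This stands in for the LangGraph conditional edge: keyword hints replace the LLM classifier used in the real system, and both agent functions are hypothetical stubs:

```python
from typing import Callable

def summarize_agent(query: str) -> str:
    """Stub for the summarization-optimized document agent."""
    return f"[summary] {query}"

def deep_qa_agent(query: str) -> str:
    """Stub for the deep Q&A document agent."""
    return f"[deep-qa] {query}"

# Illustrative routing hints; a production Meta-Agent would classify
# intent with an LLM call instead of keyword matching.
SUMMARY_HINTS = ("summarize", "overview", "abstract")

def meta_agent(query: str) -> str:
    """Route a query to the agent best suited to answer it."""
    route: Callable[[str], str]
    if any(hint in query.lower() for hint in SUMMARY_HINTS):
        route = summarize_agent
    else:
        route = deep_qa_agent
    return route(query)
```

In LangGraph terms, `meta_agent` corresponds to a router node whose conditional edges point at the specialized document agents.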
Comprehensive data platform hosting millions of B2B leads. Features streaming uploads, robust credit systems, and real-time administrative control.
Needed to ingest and serve millions of records rapidly. I architected a full-stack solution using async FastAPI and streaming parsers to handle massive file uploads without memory spikes, while maintaining sub-second search latency.
- Implemented a fraud-resistant credit consumption system.
- Background workers (Celery) handle continuous data enrichment.
- Tiered data storage: Verified vs. Unverified leads.
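The core of the streaming-upload approach is parsing the file as chunks arrive instead of buffering it whole. A simplified, framework-free sketch (in the real service this would consume an async FastAPI request stream, which is an assumption here):

```python
import csv
from typing import Iterable, Iterator

def stream_rows(chunks: Iterable[bytes], batch_size: int = 1000) -> Iterator[list]:
    """Parse CSV bytes incrementally and yield row batches, so a multi-GB
    upload never needs to fit in memory. Only complete lines are parsed;
    the trailing partial line is carried over to the next chunk."""
    buffer = ""
    header = None
    batch = []
    for chunk in chunks:
        buffer += chunk.decode("utf-8")
        *lines, buffer = buffer.split("\n")  # keep the incomplete tail
        for line in lines:
            if not line.strip():
                continue
            fields = next(csv.reader([line]))
            if header is None:
                header = fields  # first complete line is the header
                continue
            batch.append(dict(zip(header, fields)))
            if len(batch) >= batch_size:
                yield batch
                batch = []
    if batch:
        yield batch
```

Each yielded batch can be bulk-inserted or handed to a Celery enrichment task, keeping peak memory proportional to `batch_size` rather than file size.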
Internal operations project managing a massive database of 1 billion raw contacts for automated, personalized acquisition campaigns.
Managed a 1B+ record dataset. Deployed high-RAM instances for cleaning and deduplication. Built a custom TypeScript crawler to scrape company data, then used LLMs to generate personalized email variants at scale.
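At 1B+ records, deduplication hinges on a cheap, deterministic key per contact. A sketch of one plausible normalization scheme (the dot/plus stripping is a Gmail-style heuristic applied broadly here, which is an assumption, not the exact production rule):

```python
import hashlib

def contact_key(email: str) -> str:
    """Deterministic dedup key: lowercase the address, strip '+tag'
    suffixes and dots from the local part, then hash the result."""
    local, _, domain = email.strip().lower().partition("@")
    local = local.split("+", 1)[0].replace(".", "")
    return hashlib.sha1(f"{local}@{domain}".encode()).hexdigest()

def dedupe(records: list) -> list:
    """Keep the first record seen for each normalized email key."""
    seen, unique = set(), []
    for rec in records:
        key = contact_key(rec["email"])
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique
```

On the full dataset this keying would run partitioned across the high-RAM instances, with the `seen` set replaced by a disk-backed or sharded store.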
Demo for SWCC (Saline Water Conversion Corporation). Allows secure, conversational interaction with sensitive databases.
Enabling non-technical users to query DBs via chat requires strict security. I built an agentic workflow that translates natural language to SQL but implements safeguards to block destructive commands (DELETE/DROP) while allowing safe queries.
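The guard layer can be sketched as a validator that runs on every LLM-generated statement before execution. This is an illustrative defense-in-depth check, not the full production safeguard; a read-only database role should back it up regardless:

```python
import re

# Statement types refused outright (assumption: chat access is read-only).
FORBIDDEN = re.compile(
    r"\b(delete|drop|truncate|alter|update|insert|grant|revoke)\b",
    re.IGNORECASE,
)

def is_safe_sql(sql: str) -> bool:
    """Allow only a single SELECT statement containing no destructive
    keyword. Splitting on ';' also blocks stacked-query injection like
    'SELECT 1; DROP TABLE users'."""
    statements = [s for s in sql.strip().split(";") if s.strip()]
    if len(statements) != 1:
        return False
    stmt = statements[0].strip()
    return stmt.lower().startswith("select") and not FORBIDDEN.search(stmt)
```

Only queries passing `is_safe_sql` are forwarded to the database; anything else is returned to the agent for reformulation or surfaced to the user as refused.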