
Software Engineer specializing in backend systems, large-scale data pipelines, and applied machine learning. Experienced in building distributed web-scraping frameworks, ETL workflows, and event-driven architectures on AWS. Skilled in designing REST APIs, orchestrating data workflows with Airflow and Metaflow, and integrating ML features such as embeddings, NLP, and semantic search into production systems. Background includes developing migration pipelines, compliance engines, LLM-powered enrichments, and high-volume data services for enterprise products. Currently completing a Master’s in Artificial Intelligence and Machine Learning (UAX, Spain). Strong focus on reliability and system design, and on delivering clean data, insights, and automation for real-world applications.
Languages: Python, JavaScript/TypeScript, SQL
Backend & APIs: Django, DRF, FastAPI, Flask, REST API development, Authentication & RBAC, Schema modeling
Web Scraping: Playwright, Apify, Anti-bot evasion (proxies, fingerprints, stealth), API reverse engineering (REST, GraphQL, XHR), Distributed scraping frameworks
Data Engineering & ETL: Apache Airflow, Prefect, Metaflow, Event-driven pipelines (S3, SQS, EventBridge), Bronze–Silver–Gold modeling, Data normalization & ingestion
Cloud & DevOps: AWS (ECS Fargate, Step Functions, Lambda, S3, RDS, VPC, IAM), Docker, Terraform, CI/CD with GitHub Actions
Databases: PostgreSQL, Relational modeling, Data lakes & S3 storage
Aether Data Platform (4-Repo Architecture)
End-to-end data & ML platform for collecting, processing, enriching, and serving product and review data.
Built from four modular services, one per repository:
• Hermes – Distributed scraping engine (Playwright, proxy rotation, anti-bot evasion).
• Athanor – ETL orchestration (Bronze → Silver pipelines, S3, SQS, Step Functions).
• Intel – ML enrichment layer (embeddings, NLP, signals, pgvector).
• Aurum – FastAPI insights service (REST endpoints, semantic search).
Repos: https://github.com/Aether-Data-Platform
Demo video available upon request.