Technical Stack & Tooling

Core Competencies

Languages, frameworks, and systems that underpin my machine learning and infrastructure work.

Languages & Core Frameworks

  • Python – primary development language for ML, orchestration, and tooling.
  • Rust – for systems-level performance and backend microservices.
  • C – low-level optimization, bindings, and hardware interfacing (see the ctypes sketch after this list).
  • Bash / Shell – job orchestration, automation, and system scripting.
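
Where Python meets the C layer, the standard-library ctypes module is the lightest-weight bridge. The sketch below is illustrative only: libfastops.so and scale_array are hypothetical stand-ins for a compiled C routine.

```python
import ctypes

import numpy as np

# Load a hypothetical shared library compiled from C (name is a stand-in).
lib = ctypes.CDLL("./libfastops.so")

# Declare the C signature: void scale_array(double *buf, size_t n, double factor)
lib.scale_array.argtypes = [
    ctypes.POINTER(ctypes.c_double),
    ctypes.c_size_t,
    ctypes.c_double,
]
lib.scale_array.restype = None

# Hand a NumPy buffer to C without copying.
x = np.arange(4, dtype=np.float64)
lib.scale_array(x.ctypes.data_as(ctypes.POINTER(ctypes.c_double)), x.size, 2.0)
print(x)  # scaled in place by the C routine
```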

Machine Learning & AI Frameworks

  • PyTorch – primary framework for model training, inference, and fine-tuning.
  • Transformers (Hugging Face) – model loading, tokenization, and adaptation pipelines (a minimal inference example follows this list).
  • TRL / RLHF frameworks – reinforcement learning from human feedback for open-source LLMs.
  • vLLM / ExLlama / FlashAttention – optimized inference backends for large model serving.
  • Diffusers / VAEs – generative modeling, diffusion-based scientific simulations.
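
A minimal sketch of the PyTorch + Transformers load-and-generate path. The model ID is a placeholder, device_map="auto" assumes the accelerate package is installed, and vLLM or ExLlama would take over this path for high-throughput serving.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "my-org/my-finetuned-llm"  # placeholder model ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # halves memory on GPUs that support bf16
    device_map="auto",           # requires accelerate; places layers across devices
)

inputs = tokenizer(
    "Explain FlashAttention in one sentence.", return_tensors="pt"
).to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```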

Model Training & Experimentation

  • Weights & Biases (W&B) – experiment tracking, telemetry, and visualization.
  • Axolotl / Swift / Megatron-LM – large-scale fine-tuning, distillation, and evaluation pipelines.
  • QLoRA / GLoRA – parameter-efficient fine-tuning on constrained GPUs (sketched after this list).
  • Hugging Face Hub / Datasets – versioned dataset and model hosting.
  • Kaggle GPU Runtime – fast prototyping and benchmarking of scientific ML workloads.
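
A sketch of the parameter-efficient path named above: 4-bit quantization (bitsandbytes) plus a LoRA adapter from peft, with metrics logged to W&B. The model ID, project name, and target modules are assumptions, not fixed choices.

```python
import torch
import wandb
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

wandb.init(project="qlora-finetune")  # hypothetical project name

bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base = AutoModelForCausalLM.from_pretrained(
    "my-org/base-llm",            # placeholder base model
    quantization_config=bnb,
    device_map="auto",
)

lora = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # typical attention projections; model-dependent
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora)
model.print_trainable_parameters()  # sanity check: only adapter weights are trainable

# Inside the training loop, metrics stream to W&B, e.g.:
# wandb.log({"loss": loss.item(), "lr": scheduler.get_last_lr()[0]})
```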

Data Engineering & Automation

  • NumPy / Pandas / Arrow / JSON Schema – data processing, validation, and ingestion.
  • FastAPI – for internal APIs, model serving, and research agent interfaces (see the sketch after this list).
  • AsyncIO / Multiprocessing – asynchronous task orchestration for dataset generation and evaluation.
  • Pydantic – schema enforcement, validation, and structured data I/O.
  • OpenAI / Exa APIs – external integrations for dataset generation and research automation.
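
A minimal sketch of the FastAPI + Pydantic pattern behind those internal APIs; the endpoint, schema, and run_model helper are illustrative, not an actual service.

```python
import asyncio

from fastapi import FastAPI
from pydantic import BaseModel, Field

app = FastAPI()

class GenerateRequest(BaseModel):
    prompt: str
    max_tokens: int = Field(default=64, ge=1, le=1024)  # validated at the boundary

class GenerateResponse(BaseModel):
    text: str

async def run_model(prompt: str, max_tokens: int) -> str:
    # Stand-in for a real inference call (vLLM client, local model, etc.).
    await asyncio.sleep(0)
    return f"echo: {prompt[:max_tokens]}"

@app.post("/generate", response_model=GenerateResponse)
async def generate(req: GenerateRequest) -> GenerateResponse:
    text = await run_model(req.prompt, req.max_tokens)
    return GenerateResponse(text=text)
```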

DevOps, Systems, & Environment

  • Docker – containerization for reproducible compute environments.
  • Tailscale – secure peer-to-peer network for distributed nodes.
  • Linux (Arch, Ubuntu) – main OS environments for development and HPC workloads.
  • Git / GitHub Actions – version control, CI/CD for pipelines and infra modules.
  • S3 (MinIO / AWS) – telemetry and dataset storage (client wiring sketched below).
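
One way the storage layer is typically wired from Python with boto3; the MinIO endpoint, bucket, and credentials below are placeholders (in practice they come from the environment). Dropping endpoint_url points the same client at AWS S3.

```python
import boto3

# Point the standard S3 client at a self-hosted MinIO endpoint (placeholder URL).
s3 = boto3.client(
    "s3",
    endpoint_url="http://minio.internal:9000",
    aws_access_key_id="PLACEHOLDER_KEY",
    aws_secret_access_key="PLACEHOLDER_SECRET",
)

# Upload a run's telemetry; bucket and key names are illustrative.
s3.upload_file("run_metrics.json", "telemetry", "runs/2024-01-01/metrics.json")
```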

Visualization & Research Tools

  • Plotly / Matplotlib / Seaborn – experiment visualization and scientific plotting (example after this list).
  • Obsidian / Markdown – research documentation, knowledge graph integration.
  • Jupyter / VSCode / Neovim – main development and analysis environments.
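
A small, self-contained example of the kind of experiment plot this layer produces; the loss values here are synthetic.

```python
import matplotlib.pyplot as plt
import numpy as np

steps = np.arange(0, 1000, 10)
loss = 2.5 * np.exp(-steps / 300) + 0.1 * np.random.rand(steps.size)  # synthetic curve

fig, ax = plt.subplots(figsize=(6, 4))
ax.plot(steps, loss, label="train loss")
ax.set_xlabel("step")
ax.set_ylabel("loss")
ax.legend()
fig.savefig("loss_curve.png", dpi=150)
```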

Hardware & Compute Layer

  • Multi-GPU orchestration – A100 / RTX 5090 clusters via NexaCompute.
  • Mixed-precision training (bf16 / fp8) – for performance and cost efficiency (bf16 pattern sketched below).
  • Distributed compute (Slurm-like job API) – internal orchestration across consumer + HPC nodes.
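
A sketch of the bf16 mixed-precision pattern used on this hardware, via PyTorch autocast; the model and batch are stand-ins and a CUDA device is assumed. Because bf16 keeps fp32's exponent range, no gradient scaler is needed, unlike fp16.

```python
import torch
import torch.nn as nn

model = nn.Linear(1024, 1024).cuda()      # stand-in model
opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
x = torch.randn(32, 1024, device="cuda")  # stand-in batch
target = torch.randn(32, 1024, device="cuda")

# bf16 autocast: forward pass runs in bfloat16, master weights stay fp32.
with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
    loss = nn.functional.mse_loss(model(x), target)

loss.backward()  # gradients computed outside the autocast region as usual
opt.step()
opt.zero_grad()
```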