Algorithm/model development and fine-tuning skill. Use for tasks like dataset design/cleaning, supervised fine-tuning (SFT), preference optimization (DPO/RLHF concepts), LoRA/QLoRA, training configs, evaluation (offline/online), safety checks, deployment packaging, and cost/performance trade-offs.