"LLM Finetuning Strategies" by Raghunaathan in Towards AI: Unlocking Precision: Tailor Your LLM to Perfectly Fit Your Needs! (Sep 24, 2024)
"DeepSpeed Deep Dive — Model Implementations for Inference (MII)" by Heiko Hotz in TDS Archive: A closer look at the latest open-source library from DeepSpeed (Nov 17, 2022)
"LLM Inference on multiple GPUs with 🤗 Accelerate" by Geronimo: Minimal working examples and performance benchmark (Nov 27, 2023)
"💥 Training Neural Nets on Larger Batches: Practical Tips for 1-GPU, Multi-GPU & Distributed setups" by Thomas Wolf in HuggingFace: Training neural networks with larger batches in PyTorch: gradient accumulation, gradient checkpointing, multi-GPUs and distributed setups… (Oct 15, 2018)
"Fine tuning Vs Pre-training" by Eduardo Ordax: The objective of my articles is to ensure clarity and simplicity in technical explanations. To achieve this, I will skip over certain… (Jan 15, 2024)
"Pre-training vs. Fine-tuning [With code implementation]" by Talib in Level Up Coding: TL;DR: Enhancing the performance of large language models (LLMs) in certain tasks and circumstances requires fine-tuning them. This blog… (Jun 25, 2024)
"Multi-GPU Fine-tuning for Llama 3.1 70B with FSDP and QLoRA" by Benjamin Marie in TDS Archive: What you can do with only 2x24 GB GPUs and a lot of CPU RAM (Aug 8, 2024)