Parallelism Configuration Guide

This guide explains the parallelism configuration fields used in SGLang model configurations and how they map to SGLang server command-line arguments.

Quick Reference

| Config Field | SGLang CLI Argument | Description |
| --- | --- | --- |
| tp | --tp-size, --tensor-parallel-size | Tensor Parallelism - splits model across GPUs |
| dp | --dp-size, --data-parallel-size | Data Parallelism - runs multiple model replicas |
| ep | --ep-size, --expert-parallel-size, --ep | Expert Parallelism - distributes MoE experts |
| enable_dp_attention | --enable-dp-attention | DP for attention, TP for FFN (hybrid) |
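
To make the Quick Reference mapping concrete, here is a minimal Python sketch that turns a config dictionary into the corresponding command-line arguments. The helper name `build_parallelism_args`, the dictionary layout, and the example values are hypothetical and exist only for illustration. The sketch also assumes that `--enable-dp-attention` is a valueless switch, which is how it appears in the reference above; only the first flag spelling listed for each field is used.

```python
from typing import Any, Dict, List

# Config fields that take an integer value, mapped to the first CLI
# spelling listed for each in the Quick Reference.
VALUE_FLAGS = {
    "tp": "--tp-size",
    "dp": "--dp-size",
    "ep": "--ep-size",
}

# Config fields treated as on/off switches (assumption: no value follows the flag).
BOOLEAN_FLAGS = {
    "enable_dp_attention": "--enable-dp-attention",
}


def build_parallelism_args(config: Dict[str, Any]) -> List[str]:
    """Hypothetical helper: translate config fields into CLI arguments."""
    args: List[str] = []
    for field, flag in VALUE_FLAGS.items():
        value = config.get(field)
        if value is not None:
            # Emit the flag followed by its integer value.
            args += [flag, str(int(value))]
    for field, flag in BOOLEAN_FLAGS.items():
        if config.get(field):
            # Emit the switch only when the field is truthy.
            args.append(flag)
    return args


# Illustrative values only; they are not a recommended configuration.
example = {"tp": 8, "dp": 8, "enable_dp_attention": True}
print(" ".join(build_parallelism_args(example)))
```

Fields that are absent from the config produce no argument at all, so the server's own defaults apply to them.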