Tips & Tricks for Dia
- For **Dialogue** mode, clearly mark speaker turns using `[S1]` and `[S2]`.
- Add non-verbal sounds like `(laughs)`, `(sighs)`, `(clears throat)` within the text where desired.
- For **Voice Clone** mode, upload a clean reference audio file (`.wav`/`.mp3`) using the "Load" button. Crucially, include the exact transcript of the reference audio at the beginning of your text input (e.g., `[S1] Reference transcript. [S1] Target text...`).
- Experiment with **CFG Scale** (higher = more adherence to text, potentially less natural) and **Temperature** (higher = more random/varied).
- The **Speed Factor** adjusts playback speed (0.8 = slower, 1.0 = original).
- Use the `/v1/audio/speech` endpoint for OpenAI compatibility. Use the `voice` parameter to specify mode ('S1', 'S2', 'dialogue', 'reference_file.wav').
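
As a rough illustration of the tips above, here is a minimal Python sketch that prepares text for Voice Clone mode and builds a request for the `/v1/audio/speech` endpoint. Several things here are assumptions rather than confirmed details of this server: the base URL is a placeholder, the `model` value is hypothetical, and the `input` and `response_format` fields are borrowed from OpenAI's speech API on the assumption that the compatible endpoint accepts them. The helper names (`clone_prompt`, `build_speech_request`, `synthesize`) are made up for this example.

```python
import json
import urllib.request

# Assumption: placeholder address; replace with wherever your server is listening.
BASE_URL = "http://localhost:8000"


def clone_prompt(reference_transcript: str, target_text: str) -> str:
    """Put the exact reference transcript first, then the target text, both tagged [S1]."""
    return f"[S1] {reference_transcript.strip()} [S1] {target_text.strip()}"


def build_speech_request(text: str, voice: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request for the /v1/audio/speech endpoint."""
    payload = {
        "model": "dia",            # hypothetical value; OpenAI-style APIs expect a model field
        "input": text,             # assumed field name, following OpenAI's speech API
        "voice": voice,            # 'S1', 'S2', 'dialogue', or a reference file name
        "response_format": "wav",  # assumed to be supported
    }
    return urllib.request.Request(
        f"{base_url}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text: str, voice: str, out_path: str, base_url: str = BASE_URL) -> None:
    """Send the request and write the returned audio bytes to out_path."""
    with urllib.request.urlopen(build_speech_request(text, voice, base_url)) as resp:
        audio = resp.read()
    with open(out_path, "wb") as f:
        f.write(audio)
```

With a server running, a dialogue request might look like `synthesize("[S1] Hi there. [S2] Hello! (laughs)", "dialogue", "out.wav")`, and a voice-clone request like `synthesize(clone_prompt("Reference transcript.", "Target text..."), "reference_file.wav", "clone.wav")`.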