Image Generation
Image model ID: fal-ai/flux-2, bytedance/seedream-4-5, dall-e-3
Text description of the image to generate
Image size: 1024x1024, 1024x1792, 1792x1024
Number of images to generate (1-4). Default: 1
Response
Video Generation
Video model ID: openai/sora-2, google/veo3-1, kwai/kling-2.6
Text description of the video
Video duration in seconds (5-60)
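The video parameters above can be checked on the client before a job is submitted. The following is a minimal sketch, not the service's own client: the field names `model`, `prompt`, and `duration` are hypothetical, since this reference lists only the parameter descriptions. The model IDs and the 5-60 second range are taken from the list above.

```python
from dataclasses import dataclass, asdict

# Model IDs copied from the list above.
VIDEO_MODELS = {"openai/sora-2", "google/veo3-1", "kwai/kling-2.6"}


@dataclass
class VideoRequest:
    # Field names are assumptions; the reference gives descriptions only.
    model: str
    prompt: str
    duration: int

    def __post_init__(self):
        if self.model not in VIDEO_MODELS:
            raise ValueError(f"unknown video model: {self.model!r}")
        if not self.prompt.strip():
            raise ValueError("prompt must not be empty")
        if not 5 <= self.duration <= 60:
            raise ValueError("duration must be between 5 and 60 seconds")


def build_video_payload(model, prompt, duration):
    """Validate the arguments and return a plain dict ready to serialize as JSON."""
    return asdict(VideoRequest(model, prompt, duration))
```

Calling `build_video_payload("google/veo3-1", "waves at sunset", 8)` returns a dict, while a duration outside 5-60 raises `ValueError` before any request is sent.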
Job Response
Poll for Status
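Video generation returns a job rather than a finished file, so a client typically polls until the job reaches a terminal state. The sketch below shows one generic way to do that. The `status` key and the values `completed` and `failed` are assumptions, since the job fields are not shown in this reference, and `fetch_status` is a hypothetical callable supplied by the caller that performs the actual status request.

```python
import time


def poll_until_done(fetch_status, interval=2.0, timeout=600.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() repeatedly until the job is finished or time runs out.

    fetch_status: hypothetical callable returning the job as a dict.
    The "status" key and its terminal values are assumed, not documented here.
    """
    deadline = clock() + timeout
    while True:
        job = fetch_status()
        if job.get("status") in ("completed", "failed"):
            return job
        if clock() >= deadline:
            raise TimeoutError("job did not finish before the timeout")
        sleep(interval)
```

Passing `sleep` and `clock` as parameters keeps the loop testable without real waiting. A production client might also add backoff between attempts.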
Text-to-Speech
TTS model ID: tts-1, tts-1-hd
Text to synthesize (max 4096 characters)
Voice: alloy, echo, fable, onyx, nova, shimmer
Speech Recognition
Audio file (mp3, mp4, mpeg, mpga, m4a, wav, webm)
ASR model ID: whisper-1
ISO-639-1 language code
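A quick local check of the audio file and language code can catch mistakes before an upload. This is only a sketch: the function name is hypothetical, the extension set is copied from the list above, and the language check tests only the two-letter lowercase shape of an ISO-639-1 code, not whether the code is actually assigned.

```python
from pathlib import Path

# Accepted audio formats, copied from the list above.
AUDIO_EXTENSIONS = {"mp3", "mp4", "mpeg", "mpga", "m4a", "wav", "webm"}


def check_transcription_input(path, language=None):
    """Return the file extension if the inputs look valid, else raise ValueError."""
    ext = Path(path).suffix.lower().lstrip(".")
    if ext not in AUDIO_EXTENSIONS:
        raise ValueError(f"unsupported audio format: {ext or '(none)'}")
    if language is not None:
        # Shape check only: two lowercase ASCII letters, e.g. "en" or "de".
        ok = (len(language) == 2 and language.isascii()
              and language.isalpha() and language.islower())
        if not ok:
            raise ValueError(f"not an ISO-639-1 style code: {language!r}")
    return ext
```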
Embeddings
Embedding model: text-embedding-3-large, intfloat/e5-mistral-7b-instruct
Text or array of texts to embed
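Because the input may be either a single text or an array of texts, client code often normalizes it to a list first so the rest of the pipeline handles one shape. A minimal sketch, with a hypothetical helper name:

```python
def normalize_embedding_input(texts):
    """Accept one string or an iterable of strings; always return a list of strings."""
    if isinstance(texts, str):
        items = [texts]
    else:
        items = list(texts)
    if not items:
        raise ValueError("at least one text is required")
    for item in items:
        if not isinstance(item, str) or not item:
            raise ValueError("every input must be a non-empty string")
    return items
```

With this helper, `normalize_embedding_input("hello")` and `normalize_embedding_input(["hello"])` both produce `["hello"]`.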
Music Generation
Music model: google/lyria2
Text description of the music to generate
Duration in seconds (10-300)