Query model Speech-to-Text
Query our Speech-to-Text model. The input format is the same as the OpenAI API. In async mode, use the following endpoint to get the result
Path parameters
product_id: AI API product identifier
Body Parameters
application/json
Defines the maximum duration of an active segment, in seconds. For subtitle tasks, a short duration (5-10 seconds) is recommended to avoid long sentences.
The audio file to transcribe (25 MB max; supported types: mp3, mp4, aac, wav, flac, ogg, opus, wma, m4a)
The language of the input audio. Supplying the input language will translate the output.
ID of the model to use.
If the no_speech probability is higher than this value AND the average log probability over sampled tokens is below log_prob_threshold, consider the segment as silent.
An optional text to guide the model's style or continue a previous audio segment. The prompt should match the audio language.
The format of the transcript output (default: json)
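The silence rule described in the parameters above combines two conditions, and both must hold for a segment to be dropped. The following is a minimal sketch of that rule as read from the description, not the service's actual implementation. The function name `isSilentSegment` and the threshold values used in the calls are illustrative assumptions.

```php
<?php
// Hypothetical helper (not part of the API): mirrors the documented rule
// "no_speech probability above the threshold AND average log probability
// below log_prob_threshold means the segment is treated as silent".
function isSilentSegment(
    float $noSpeechProb,
    float $avgLogProb,
    float $noSpeechThreshold,
    float $logProbThreshold
): bool {
    return $noSpeechProb > $noSpeechThreshold
        && $avgLogProb < $logProbThreshold;
}

// Illustrative thresholds only: 0.6 for no-speech, -1.0 for log probability.
var_dump(isSilentSegment(0.9, -1.5, 0.6, -1.0)); // bool(true): both conditions hold
var_dump(isSilentSegment(0.9, -0.2, 0.6, -1.0)); // bool(false): the model is confident in its tokens
```

Because the conditions are joined with AND, a segment with a high no_speech probability is still kept when the model is confident about the decoded tokens.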
Response Body
The ID of the batch dispatched to handle the transcription.
Example request
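<?php
// Assumes Guzzle is installed via Composer.
require 'vendor/autoload.php';

use GuzzleHttp\Client;
use GuzzleHttp\Psr7\Request;

$client = new Client();
$headers = [
    'Authorization' => 'Bearer YOUR-TOKEN-HERE',
    'Content-Type' => 'application/json'
];
// Request body: the audio file to transcribe and the ID of the model to use.
$body = '{
    "file": "example",
    "model": "whisper"
}';
// Replace {product_id} with your AI API product identifier.
$request = new Request('POST', 'https://api.infomaniak.com/1/ai/{product_id}/openai/audio/transcriptions', $headers, $body);
// Send the request and block until the response arrives.
$res = $client->sendAsync($request)->wait();
// Print the raw JSON response, which contains the batch_id.
echo $res->getBody();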
Example response
application/json
{"batch_id":"9b9fec49-cc95-44d5-8d3a-be56a6e05970"}
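Since the response carries only the batch ID, a client will typically want to extract and keep it so the result can be fetched later. The sketch below parses a response body shaped like the example above. The helper name `extractBatchId` is hypothetical, and the code assumes the body is a JSON object with a string `batch_id` field, as shown in the example response.

```php
<?php
// Hypothetical helper: pulls the batch ID out of the JSON response body.
function extractBatchId(string $responseBody): string
{
    // JSON_THROW_ON_ERROR makes malformed JSON raise a JsonException.
    $data = json_decode($responseBody, true, 512, JSON_THROW_ON_ERROR);

    if (!is_array($data) || !isset($data['batch_id']) || !is_string($data['batch_id'])) {
        throw new UnexpectedValueException('Response does not contain a batch_id string.');
    }

    return $data['batch_id'];
}

// Using the example response from this page.
echo extractBatchId('{"batch_id":"9b9fec49-cc95-44d5-8d3a-be56a6e05970"}'), PHP_EOL;
// prints 9b9fec49-cc95-44d5-8d3a-be56a6e05970
```

Failing loudly on a missing or non-string `batch_id` keeps an unexpected error payload from being mistaken for a valid batch.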