POST /predict
Analyze audio (Predict)
curl --request POST \
  --url https://aurigin.ai/api-ext/predict \
  --header 'Content-Type: multipart/form-data' \
  --header 'x-api-key: <api-key>' \
  --form file=@example-file
{
  "error": [
    null,
    null,
    null
  ],
  "global_probability": [
    0.9584,
    0.9585,
    0.9123
  ],
  "predictions": [
    "fake",
    "fake",
    "real"
  ]
}

Endpoint

POST https://aurigin.ai/api-ext/predict

Headers

x-api-key: aurigin_test_1234567890abcdef

Request Body

You can send audio data in two ways:

Option 1: Multipart form data (direct file upload)

Content-Type: multipart/form-data
Parameters:
  • file (required): Audio file (WAV, MP3, M4A, etc.)
  • user_id (optional): User identifier for tracking

Option 2: JSON with presigned URL

Content-Type: application/json
Parameters:
  • presigned_url (required): URL to the audio file
  • user_id (optional): User identifier for tracking

Examples

Direct file upload:
curl -X POST "https://aurigin.ai/api-ext/predict" \
  -H "x-api-key: $AURIGIN_API_KEY" \
  -F "file=@recording.wav" \
  -F "user_id=user123"
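
For Option 2 (JSON with a presigned URL), the request could be assembled roughly as follows. This is a minimal sketch using only the Python standard library, not an official client: the helper name `build_presigned_request` and the example file URL are hypothetical, while the endpoint, header name, and body fields are the ones listed above.

```python
import json
import urllib.request

PREDICT_URL = "https://aurigin.ai/api-ext/predict"


def build_presigned_request(presigned_url, api_key, user_id=None):
    """Build (but do not send) the JSON request described as Option 2."""
    payload = {"presigned_url": presigned_url}
    if user_id is not None:
        payload["user_id"] = user_id  # optional tracking identifier
    return urllib.request.Request(
        PREDICT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-api-key": api_key,
        },
        method="POST",
    )


# Hypothetical values for illustration only.
request = build_presigned_request(
    "https://example.com/audio/recording.wav",
    api_key="aurigin_test_1234567890abcdef",
    user_id="user123",
)
```

Passing `request` to `urllib.request.urlopen` would send it; the response body should then be the JSON object shown in the Response section.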

Response

{
  "error": [null, null, null],
  "global_probability": [0.9584, 0.9585, 0.9123],
  "predictions": ["fake", "fake", "real"]
}

Response Fields

  • predictions (array): AI detection results for each 5-second chunk of the audio
    • "fake": Audio segment is likely AI-generated
    • "real": Audio segment is likely human-generated
    • Array length = number of 5-second chunks in the audio file
  • global_probability (array): Confidence scores (0.0-1.0) for each prediction
    • Higher values indicate higher confidence in the prediction
    • Corresponds 1:1 with the predictions array
  • error (array): Error messages for each chunk (null if successful)
    • Corresponds 1:1 with the predictions array
    • Contains error details if processing failed for a specific chunk
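
Because the three arrays correspond 1:1, they can be combined into one record per chunk. The sketch below is illustrative: `summarize_chunks` is a hypothetical helper, and the start and end offsets assume that chunks are consecutive, non-overlapping, and begin at 0 seconds, which the field descriptions imply but do not state outright.

```python
CHUNK_SECONDS = 5  # chunk length stated in the field descriptions


def summarize_chunks(response):
    """Combine the parallel response arrays into one dict per chunk."""
    predictions = response["predictions"]
    probabilities = response["global_probability"]
    errors = response["error"]
    if not (len(predictions) == len(probabilities) == len(errors)):
        raise ValueError("response arrays differ in length")

    rows = []
    for index, (label, probability, error) in enumerate(
        zip(predictions, probabilities, errors)
    ):
        rows.append({
            # Assumed offsets: chunk i covers [5*i, 5*(i+1)) seconds;
            # the final chunk may end earlier than end_s.
            "start_s": index * CHUNK_SECONDS,
            "end_s": (index + 1) * CHUNK_SECONDS,
            "label": label,
            "confidence": probability,
            "error": error,
        })
    return rows


sample = {
    "error": [None, None, None],
    "global_probability": [0.9584, 0.9585, 0.9123],
    "predictions": ["fake", "fake", "real"],
}
rows = summarize_chunks(sample)
```

With the sample response, the third record describes the chunk starting at 10 seconds, labelled "real" with confidence 0.9123.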

Audio Processing Details

The API processes your audio file in 5-second chunks:
  • Minimum duration: 3 seconds (1 chunk)
  • Maximum file size: 50MB
  • Chunk size: 5 seconds each
  • Output: One prediction per 5-second chunk
For example:
  • A 12-second audio file → 3 chunks → 3 predictions
  • A 30-second audio file → 6 chunks → 6 predictions
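
The chunk counts in these examples match rounding the duration up to the next multiple of 5 seconds. A small sketch of that arithmetic, assuming a partial final chunk always counts as one chunk (inferred from the 12-second example, not stated explicitly); `expected_chunks` is a hypothetical helper name:

```python
import math

CHUNK_SECONDS = 5  # chunk size from the list above
MIN_SECONDS = 3    # minimum duration from the list above


def expected_chunks(duration_s):
    """Estimate how many predictions a file of this duration yields."""
    if duration_s < MIN_SECONDS:
        raise ValueError("audio is shorter than the 3-second minimum")
    # Assumption: a trailing partial chunk still produces a prediction.
    return math.ceil(duration_s / CHUNK_SECONDS)
```

Under this assumption, `expected_chunks(12)` gives 3 and `expected_chunks(30)` gives 6, matching the examples above.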

Supported Formats

  • WAV, MP3, M4A, FLAC, OGG
  • Mono or stereo
  • Various bitrates and sample rates
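
A client can catch obvious problems before uploading. The following sketch checks the file extension and size locally; it is an assumption-laden convenience, not part of the API. In particular, `check_upload` is a hypothetical helper, the extension test is only a proxy for the real format, and "50MB" is read here as 50 × 1024 × 1024 bytes, which the documentation does not specify.

```python
import os

SUPPORTED_EXTENSIONS = {".wav", ".mp3", ".m4a", ".flac", ".ogg"}
MAX_BYTES = 50 * 1024 * 1024  # assumption: "50MB" interpreted as 50 MiB


def check_upload(filename, size_bytes):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    extension = os.path.splitext(filename)[1].lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {extension or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append("file exceeds the 50MB limit")
    return problems
```

For a file on disk, `check_upload(path, os.path.getsize(path))` runs both checks; the server remains the authority on what it accepts.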

Error Codes

  • 400: Invalid input or file too large (50MB max)
  • 403: Authentication failed (check x-api-key)
  • 500: Internal error or upstream unavailability
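
One way a client might act on these codes is sketched below. The messages are taken from the list above; the `describe_status` helper and the decision to treat only 500 as worth retrying are assumptions, since the documentation does not prescribe a retry policy.

```python
def describe_status(status):
    """Map an HTTP status code to (message, retryable)."""
    known = {
        200: ("OK", False),
        400: ("Invalid input or file too large (50MB max)", False),
        403: ("Authentication failed (check x-api-key)", False),
        # Assumption: an upstream outage may be transient, so retry later.
        500: ("Internal error or upstream unavailability", True),
    }
    return known.get(status, (f"Unexpected status {status}", False))
```

A caller could retry with a delay when the second element is true and surface the message to the user otherwise.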

Authorizations

  • x-api-key (string, header, required)

Body

  • file (file, required)
  • user_id (string): Optional user identifier

Response

200 (application/json): OK

  • error ((string | null)[]): Error messages for each 5-second chunk (null if successful)
  • global_probability (number[]): Confidence scores (0.0-1.0) for each prediction, one per 5-second chunk
  • predictions (enum<string>[]): AI detection results for each 5-second chunk of the audio. Array length equals the number of 5-second chunks in the audio file.
