Analyze audio (Predict)

AI-voice detection

curl --request POST \
  --url https://api.aurigin.ai/v0/predict \
  --header 'Content-Type: multipart/form-data' \
  --header 'x-api-key: <api-key>' \
  --form file='@example-file' \
  --form user_id=speaker_123

{
  "predictions": [
    "fake",
    "fake",
    "real"
  ],
  "global_probability": [
    0.9584,
    0.9585,
    0.9123
  ],
  "error": [
    null,
    null,
    null
  ],
  "model": "apollo-4-2025-10-20",
  "processing_time": 1.350719928741455,
  "audio_duration": 69.91
}


You can send audio data in two ways: via multipart form data (direct file upload) or via JSON with a presigned URL.

How It Works

The API processes your audio file in 5-second chunks:

Minimum duration: 3 seconds (1 chunk)
Maximum file size: 4MB
Chunk size: 5 seconds each
Output: One prediction per 5-second chunk

For example:

A 12-second audio file → 3 chunks → 3 predictions
A 30-second audio file → 6 chunks → 6 predictions
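The chunk arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name `expected_chunks` is hypothetical, and rounding up for a trailing partial chunk is an assumption inferred from the 12-second example rather than documented behavior.

```python
import math

CHUNK_SECONDS = 5  # chunk size stated above
MIN_SECONDS = 3    # minimum duration stated above


def expected_chunks(duration_seconds: float) -> int:
    """Estimate how many predictions a clip of this length should return.

    Assumption: a trailing partial chunk still gets its own prediction,
    which is what the 12-second -> 3 chunks example implies.
    """
    if duration_seconds < MIN_SECONDS:
        raise ValueError(f"audio must be at least {MIN_SECONDS} seconds long")
    return math.ceil(duration_seconds / CHUNK_SECONDS)


if __name__ == "__main__":
    for seconds in (3, 12, 30):
        print(seconds, "seconds ->", expected_chunks(seconds), "chunk(s)")
```

Comparing this estimate with the length of the returned predictions array is a cheap sanity check on a response.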

Make a Request

JSON with presigned URL

curl -X POST "https://aurigin.ai/api-ext/predict" \
  -H "Content-Type: application/json" \
  -H "x-api-key: <YOUR_API_KEY>" \
  -d '{
    "presigned_url": "https://example.com/presigned.wav",
    "user_id": "optional-user-id"
  }'

Multipart form upload

curl -X POST "https://aurigin.ai/api-ext/predict" \
  -H "x-api-key: <YOUR_API_KEY>" \
  -F "file=@/path/to/audio.wav" \
  -F "user_id=optional-user-id"
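The JSON variant of the request can also be assembled from Python using only the standard library. The sketch below mirrors the curl example; the helper names `build_presigned_request` and `predict` are hypothetical, and the endpoint URL is simply copied from the curl command above.

```python
import json
import urllib.request

# Endpoint copied from the curl example above.
PREDICT_URL = "https://aurigin.ai/api-ext/predict"


def build_presigned_request(api_key, presigned_url, user_id=None):
    """Build (but do not send) a JSON predict request for a presigned URL."""
    payload = {"presigned_url": presigned_url}
    if user_id is not None:
        payload["user_id"] = user_id  # optional field, as in the curl example
    return urllib.request.Request(
        PREDICT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


def predict(request, timeout=30.0):
    """Send the prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes it easy to inspect or log the payload before any network call is made.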

Authenticate

Include your x-api-key header with every request.

Choose upload method

Use multipart upload for local files or supply a presigned_url when the file already lives in storage.

Inspect chunked results

Review the per-chunk predictions to understand which segments are AI-generated.
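As a rough illustration of inspecting chunked results, the sketch below turns the per-chunk arrays into time ranges flagged as fake. It assumes that chunks are consecutive, non-overlapping, and start at 0 seconds, which the documentation implies but does not state; `fake_segments` is a hypothetical helper, and the sample values are taken from the example response.

```python
CHUNK_SECONDS = 5  # chunk size stated in "How It Works"


def fake_segments(response, chunk_seconds=CHUNK_SECONDS):
    """Return (start_s, end_s, confidence) for every chunk labelled "fake".

    Assumption: chunk i covers [i * chunk_seconds, (i + 1) * chunk_seconds);
    the end of the last chunk may extend past the actual end of the audio.
    """
    segments = []
    pairs = zip(response["predictions"], response["global_probability"])
    for index, (label, confidence) in enumerate(pairs):
        if label == "fake":
            start = index * chunk_seconds
            segments.append((start, start + chunk_seconds, confidence))
    return segments


# Sample values copied from the example response.
sample = {
    "predictions": ["fake", "fake", "real"],
    "global_probability": [0.9584, 0.9585, 0.9123],
    "error": [None, None, None],
}

if __name__ == "__main__":
    for start, end, confidence in fake_segments(sample):
        print(f"{start}s-{end}s flagged as fake (confidence {confidence:.4f})")
```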

Response Breakdown

Supported Formats

WAV, MP3, M4A, FLAC, OGG
Mono or stereo
Various bitrates and sample rates
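A client-side check before uploading can catch obviously unsupported files early. The following is a minimal sketch, not part of the API: `preflight_problems` is a hypothetical helper, matching on file extension is only a heuristic for the formats listed above, and interpreting "4MB" as 4 * 1024 * 1024 bytes is an assumption.

```python
from pathlib import PurePath

# Extensions for the formats listed above.
SUPPORTED_EXTENSIONS = {".wav", ".mp3", ".m4a", ".flac", ".ogg"}
# Assumption: the documented "4MB" limit means 4 MiB.
MAX_BYTES = 4 * 1024 * 1024


def preflight_problems(filename, size_bytes):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    extension = PurePath(filename).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension {extension or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file is {size_bytes} bytes, limit is {MAX_BYTES}")
    return problems
```

For a file on disk, `preflight_problems(path.name, path.stat().st_size)` with a `pathlib.Path` would supply both arguments.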

Error codes

Code	Description
`400`	Invalid input or file too large (4MB max)
`403`	Authentication failed (check `x-api-key`)
`500`	Internal error or upstream unavailability

Tips for production use

Cache presigned URLs for their validity window to minimize round trips.
Retry with exponential backoff when you receive 500 errors.
Monitor confidence scores to spot borderline results and trigger manual review.
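The backoff tip can be sketched as a small wrapper around whatever function sends the request. Everything here is illustrative: `call_with_backoff` is a hypothetical helper, `send` is assumed to return a `(status_code, body)` tuple, and the attempt count and base delay are arbitrary starting points rather than recommended values.

```python
import time


def call_with_backoff(send, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call send() until it returns a status other than 500.

    The wait doubles after each failed attempt: base_delay, 2x, 4x, ...
    Only 500 is retried; 400 and 403 are returned to the caller at once,
    since repeating the same request would not fix them.
    """
    for attempt in range(max_attempts):
        status, body = send()
        if status != 500:
            return status, body
        if attempt < max_attempts - 1:
            sleep(base_delay * (2 ** attempt))
    return status, body  # still 500 after the final attempt
```

Passing `sleep` as a parameter keeps the helper testable without real waiting; adding random jitter to the delay is a common refinement when many clients retry at once.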

Authorizations

x-api-key: string, header, required

Body

file: required

user_id: string

Optional user identifier

Response

error: (string | null)[]

Error messages for each 5-second chunk (null if successful). Aligns 1:1 with the predictions array.

global_probability: number<float>[]

Confidence scores (0.0-1.0) for each prediction, one per 5-second chunk. Aligns 1:1 with the predictions array.
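One way to act on these confidence scores is to flag low-confidence chunks for manual review. The sketch below is an assumption-laden example: `chunks_needing_review` is a hypothetical helper and the default threshold of 0.75 is an arbitrary placeholder, not a value recommended by the API.

```python
# Hypothetical cut-off; tune it against your own labelled data.
REVIEW_THRESHOLD = 0.75


def chunks_needing_review(response, threshold=REVIEW_THRESHOLD):
    """Return the indexes of chunks whose confidence is below the threshold."""
    return [
        index
        for index, confidence in enumerate(response["global_probability"])
        if confidence < threshold
    ]
```

With the example response's scores of 0.9584, 0.9585 and 0.9123, a threshold of 0.95 would flag only the third chunk (index 2).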

predictions: enum<string>[]

AI detection results for each 5-second chunk of the audio. Array length equals the number of 5-second chunks in the audio file.

Available options: fake, real
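Because the three response arrays align 1:1, a client may want to zip them into per-chunk records and fail fast when they do not line up. This is a sketch only: `ChunkResult` and `parse_chunks` are hypothetical names, and skipping the label check for chunks that carry an error is an assumption, since the documentation does not say what a failed chunk's prediction contains.

```python
from dataclasses import dataclass
from typing import Optional

ALLOWED_LABELS = {"fake", "real"}  # the documented enum values


@dataclass
class ChunkResult:
    index: int
    prediction: str
    confidence: float
    error: Optional[str]


def parse_chunks(response):
    """Combine the aligned response arrays into one record per chunk."""
    predictions = response["predictions"]
    confidences = response["global_probability"]
    errors = response["error"]
    if not (len(predictions) == len(confidences) == len(errors)):
        raise ValueError("predictions, global_probability and error differ in length")
    results = []
    for index, (label, confidence, error) in enumerate(
        zip(predictions, confidences, errors)
    ):
        # Assumption: only successful chunks are guaranteed a valid label.
        if error is None and label not in ALLOWED_LABELS:
            raise ValueError(f"unexpected prediction {label!r} at chunk {index}")
        results.append(ChunkResult(index, label, confidence, error))
    return results
```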
