Simple API. Superpower capabilities.
Security and intelligence for every voice interaction. Deepfake detection, speaker profiling, intent analysis: all in one call.
Early Access
Ask for early access. Start today.
Get your audio intelligence superpowers now. Early access users shape the roadmap, get priority support, and start securing their voice pipelines immediately.
Capabilities
Beyond deepfake detection: full voice intelligence from a single API.
Liveness Verification
Confirm the speaker is a live human in real time
Deepfake Detection
Identify AI-generated and synthetic speech with confidence scoring
Voice Cloning Detection
Catch cloned voices using formant and prosody analysis
Forensics Traceability
Full audit trail with model version, raw scores, and analytics
Intelligence
Speaker Age Estimation (Beta)
Estimate speaker age range from vocal characteristics
Gender Classification (Beta)
Classify speaker gender from audio signal
Intent Analysis (Coming Soon)
Detect conversational intent: urgency, deception cues, emotional state
Malicious Intent Detection (Coming Soon)
Flag social engineering, phishing attempts, and coercion patterns
Emotion Recognition (Coming Soon)
Identify anger, stress, fear, and other emotional markers in speech
Language & Accent Detection (Coming Soon)
Identify spoken language and regional accent from audio
Speaker Verification (Coming Soon)
Verify if two audio samples belong to the same speaker
IVR & Voicebot Detection (Coming Soon)
Detect automated voice response systems and distinguish bots from human callers
Noise & Environment Analysis (Coming Soon)
Classify background environment and audio quality metrics
Quick Start
curl -X POST https://api.vocos.io/v1/detect \
-H "X-API-Key: your-api-key" \
-F "file=@audio.wav"
# Response:
# {
# "prediction": "real",
# "confidence": 0.973,
# "raw_score": 0.986,
# "model_version": "forensics-0.3B"
# }
Base URL
https://api.vocos.io
All requests require an X-API-Key header.
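The Quick Start call above can also be made from Python. What follows is a minimal sketch using only the standard library, not an official client. The helper names `build_detect_request` and `is_confident_real`, the `application/octet-stream` part type, and the 0.9 threshold are assumptions for illustration; only the URL, the X-API-Key header, the `file` form field, and the response fields come from the Quick Start.

```python
import json
import uuid
import urllib.request

API_URL = "https://api.vocos.io/v1/detect"  # taken from the Quick Start


def build_detect_request(audio_bytes, filename, api_key):
    """Build (but do not send) a multipart/form-data POST carrying the audio file."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode(),
        b"Content-Type: application/octet-stream\r\n\r\n",  # assumed part type
        audio_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    headers = {
        "X-API-Key": api_key,
        "Content-Type": f"multipart/form-data; boundary={boundary}",
    }
    return urllib.request.Request(API_URL, data=body, headers=headers, method="POST")


def is_confident_real(response_text, threshold=0.9):
    """Return True if the verdict is "real" with confidence at or above threshold.

    The threshold is a hypothetical application-side choice, not part of the API.
    """
    result = json.loads(response_text)
    return result["prediction"] == "real" and result["confidence"] >= threshold


# Usage (sends a real request, so it is left commented out):
# with open("audio.wav", "rb") as f:
#     req = build_detect_request(f.read(), "audio.wav", "your-api-key")
# with urllib.request.urlopen(req) as resp:
#     print(is_confident_real(resp.read().decode()))
```

Keeping request construction separate from sending makes the header and body easy to inspect before any audio leaves the machine.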
Endpoints
/v1/detect
Upload an audio file and receive a real/fake verdict with confidence score.
Parameters
file (audio file, required): WAV, MP3, FLAC, or OGG. Max 50 MB.
session_id (string, optional): Group multiple segments for sliding-window analysis.
Response
{
"request_id": "abc-123",
"prediction": "real",
"confidence": 0.973,
"raw_score": 0.986,
"model_version": "forensics-0.3B"
}
/v1/analyze
Full intelligence analysis: deepfake detection plus speaker profiling, age, gender, and environment.
Parameters
file (audio file, required): WAV, MP3, FLAC, or OGG. Max 50 MB.
features (string[], optional): Features to enable: "age", "gender", "language", "emotion". Defaults to all.
Response
{
"request_id": "abc-456",
"security": {
"prediction": "deepfake",
"confidence": 0.961,
"liveness": false
},
"intelligence": {
"age_range": "25-34",
"gender": "male",
"language": "en-US",
"emotion": "neutral",
"environment": "office"
},
"model_version": "forensics-0.3B"
}
/v1/health
Check API status, model readiness, and GPU availability.
Response
{
"status": "healthy",
"model_loaded": true,
"device": "cuda",
"model_version": "forensics-0.3B"
}
/v1/chat/stream
Stream a forensic AI analysis of the detection results via Server-Sent Events.
Parameters
session_id (string, required): Session from a prior /v1/detect call.
message (string, required): User question about the audio.
Response
SSE stream of forensic analysis tokens
Rate Limits
Free: 50 requests / day
Pro: 10,000 requests / day
Enterprise: Unlimited, custom SLA
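To stay inside a daily quota, a client can count its own requests before sending them. The sketch below is one possible client-side approach, not a description of how the server enforces limits. The `DailyBudget` class is hypothetical, and it assumes the quota resets at midnight UTC, which the tiers above do not specify.

```python
from datetime import datetime, timezone

# Daily request limits from the tiers listed above (Enterprise is unlimited).
DAILY_LIMITS = {"free": 50, "pro": 10_000, "enterprise": None}


class DailyBudget:
    """Client-side counter that refuses requests once the daily quota is used."""

    def __init__(self, tier):
        self.limit = DAILY_LIMITS[tier]
        self.day = None
        self.used = 0

    def try_acquire(self, now=None):
        """Return True and count one request if quota remains for the current day."""
        now = now or datetime.now(timezone.utc)
        today = now.date()
        if today != self.day:  # assumed reset at midnight UTC
            self.day = today
            self.used = 0
        if self.limit is not None and self.used >= self.limit:
            return False
        self.used += 1
        return True
```

Calling `try_acquire()` before each request lets an application queue or skip work locally instead of relying on the API to reject it.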
Ready to build with voice intelligence?
Ask for early access and get your audio intelligence superpowers today.