Voice Conversion

Transform your audio files using professional voice models with the VoiceByAuribus API.

Overview

Voice conversion is the process of transforming source audio to match the characteristics of a selected voice model while preserving the original content, timing, and emotional expression. VoiceByAuribus provides high-quality voice conversion with pitch shifting capabilities.

How It Works

  1. Upload your source audio file
  2. Select a voice model from our voice models library
  3. Configure pitch shifting (optional)
  4. Process the conversion job
  5. Download your converted audio

The entire process is handled asynchronously, allowing you to submit multiple conversion jobs and receive notifications when they complete.

Creating a Conversion

Basic Conversion

Create a voice conversion with default settings:

curl -X POST https://api.auribus.io/api/v1/voice-conversions \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"audio_file_id": "660e8400-e29b-41d4-a716-446655440001",
"voice_model_id": "550e8400-e29b-41d4-a716-446655440000",
"pitch_shift": "same_octave",
"use_preview": false
}'

Response:

{
  "success": true,
  "data": {
    "id": "770e8400-e29b-41d4-a716-446655440002",
    "audio_file_id": "660e8400-e29b-41d4-a716-446655440001",
    "audio_file_name": "my-audio.wav",
    "voice_model_id": "550e8400-e29b-41d4-a716-446655440000",
    "voice_model_name": "Sarah Mitchell",
    "pitch_shift": "same_octave",
    "use_preview": false,
    "status": "pending_preprocessing",
    "output_url": null,
    "created_at": "2025-01-15T10:40:00Z",
    "completed_at": null
  }
}
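
The same request can be assembled in JavaScript. The sketch below is illustrative only: `buildConversionRequest` is a hypothetical helper, not part of any official SDK, and it uses only the endpoint, headers, and body fields shown in the curl example above. It builds the arguments for `fetch` without sending anything, which keeps it easy to test.

```javascript
// Base URL and path are taken from the curl example above.
const API_BASE = 'https://api.auribus.io/api/v1';

// Hypothetical helper: returns the URL and options to pass to fetch().
function buildConversionRequest(
  token,
  { audioFileId, voiceModelId, pitchShift = 'same_octave', usePreview = false }
) {
  if (!audioFileId || !voiceModelId) {
    throw new Error('audioFileId and voiceModelId are required');
  }
  return {
    url: `${API_BASE}/voice-conversions`,
    options: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${token}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        audio_file_id: audioFileId,
        voice_model_id: voiceModelId,
        pitch_shift: pitchShift,
        use_preview: usePreview,
      }),
    },
  };
}

// Usage (Node 18+ or a browser, where fetch is global):
// const { url, options } = buildConversionRequest(token, { audioFileId, voiceModelId });
// const { data } = await (await fetch(url, options)).json();
// console.log(data.id, data.status);
```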

Pitch Shifting Options

Adjust the pitch of the converted audio to match your desired vocal range. VoiceByAuribus provides seven pitch shifting options:

| Option | Semitones | Description |
|---|---|---|
| same_octave | 0 | No pitch change (default) |
| third_down | -4 | Subtle deepening |
| third_up | +4 | Subtle brightening |
| fifth_down | -7 | Moderate lowering |
| fifth_up | +7 | Moderate raising |
| lower_octave | -12 | Dramatic deepening |
| higher_octave | +12 | Dramatic brightening |

Example with pitch shift:

curl -X POST https://api.auribus.io/api/v1/voice-conversions \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"audio_file_id": "660e8400-e29b-41d4-a716-446655440001",
"voice_model_id": "550e8400-e29b-41d4-a716-446655440000",
"pitch_shift": "fifth_up"
}'

Choosing Pitch Shifts

Start with subtle adjustments (third_up or third_down) and increase incrementally. Thirds and fifths generally sound more natural than full octave shifts.
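
To get a feel for how large each shift is, it can help to convert semitones into a frequency ratio. The sketch below copies the semitone values from the pitch shifting table and assumes equal-tempered semitones, where a shift of n semitones multiplies frequency by 2^(n/12). The helper names are hypothetical, and the service's internal processing may differ.

```javascript
// Semitone offsets copied from the pitch shifting table.
const PITCH_SHIFT_SEMITONES = {
  same_octave: 0,
  third_down: -4,
  third_up: 4,
  fifth_down: -7,
  fifth_up: 7,
  lower_octave: -12,
  higher_octave: 12,
};

// Frequency multiplier for an option, assuming equal temperament: 2^(n/12).
function frequencyRatio(option) {
  if (!Object.prototype.hasOwnProperty.call(PITCH_SHIFT_SEMITONES, option)) {
    throw new Error(`Unknown pitch_shift option: ${option}`);
  }
  return Math.pow(2, PITCH_SHIFT_SEMITONES[option] / 12);
}

// Picks the option whose offset is closest to a desired shift in semitones.
function nearestPitchShift(semitones) {
  let best = 'same_octave';
  for (const [option, offset] of Object.entries(PITCH_SHIFT_SEMITONES)) {
    const currentDistance = Math.abs(PITCH_SHIFT_SEMITONES[best] - semitones);
    if (Math.abs(offset - semitones) < currentDistance) best = option;
  }
  return best;
}

console.log(frequencyRatio('higher_octave')); // 2
console.log(frequencyRatio('fifth_up').toFixed(3)); // "1.498"
console.log(nearestPitchShift(5)); // "third_up"
```

An octave doubles or halves every frequency, while a third changes it by roughly 26%, which is one way to see why thirds count as the subtle options.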

Conversion Lifecycle

Status Flow

pending_preprocessing → queued → processing → completed

A conversion that cannot finish ends in failed instead.

| Status | Description | Actions Available |
|---|---|---|
| pending_preprocessing | Waiting for audio file preprocessing to complete | Wait or poll status |
| queued | Audio ready, waiting in conversion queue | Wait or poll status |
| processing | Actively converting audio | Wait or poll status |
| completed | Conversion finished successfully | Download output files |
| failed | Conversion failed | Contact support, retry |
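
Client code usually needs to know whether a status is final and what to do next. The sketch below encodes the status table as two small helpers. The status strings come from the table; the function names and the returned action labels are hypothetical and chosen only for illustration.

```javascript
// Statuses after which a conversion no longer changes.
const TERMINAL_STATUSES = new Set(['completed', 'failed']);

function isTerminal(status) {
  return TERMINAL_STATUSES.has(status);
}

// Maps a status to a coarse next step (labels are illustrative, not API values).
function nextAction(status) {
  switch (status) {
    case 'pending_preprocessing':
    case 'queued':
    case 'processing':
      return 'wait';
    case 'completed':
      return 'download';
    case 'failed':
      return 'retry_or_contact_support';
    default:
      throw new Error(`Unknown status: ${status}`);
  }
}

console.log(isTerminal('processing')); // false
console.log(nextAction('completed')); // "download"
```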

Checking Status

Poll the conversion status:

curl -X GET https://api.auribus.io/api/v1/voice-conversions/{id} \
-H "Authorization: Bearer $TOKEN"

Response (completed):

{
  "success": true,
  "data": {
    "id": "770e8400-e29b-41d4-a716-446655440002",
    "use_preview": false,
    "status": "completed",
    "output_url": "https://s3.amazonaws.com/bucket/output/conversion-id.wav?X-Amz-Algorithm=...",
    "completed_at": "2025-01-15T10:45:00Z"
  }
}

Use Webhooks Instead

Instead of polling, use webhook notifications to receive updates as soon as conversions complete. This is more efficient and provides a better user experience.
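
Where polling is still the practical choice, such as a one-off script, a bounded loop avoids waiting forever. The following is a minimal sketch, not an official client: `waitForConversion`, its option names, and the 5-second default interval are assumptions. The caller supplies `getConversion`, any async function that returns the `data` object from the status endpoint.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Polls until the conversion is completed or failed, or until maxAttempts is reached.
async function waitForConversion(
  getConversion,
  { intervalMs = 5000, maxAttempts = 60, sleepFn = sleep } = {}
) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const conversion = await getConversion();
    if (conversion.status === 'completed') return conversion;
    if (conversion.status === 'failed') {
      throw new Error(`Conversion ${conversion.id} failed`);
    }
    // Still pending_preprocessing, queued, or processing: wait before the next check.
    if (attempt < maxAttempts) await sleepFn(intervalMs);
  }
  throw new Error(`Conversion did not finish after ${maxAttempts} checks`);
}

// Usage (Node 18+, where fetch is global; id and token come from your application):
// const getConversion = async () => {
//   const res = await fetch(`https://api.auribus.io/api/v1/voice-conversions/${id}`, {
//     headers: { Authorization: `Bearer ${token}` },
//   });
//   return (await res.json()).data;
// };
// const done = await waitForConversion(getConversion);
// console.log(done.output_url);
```

Passing the fetch logic in as a function keeps the loop independent of the HTTP layer, so it can be exercised with a stub in tests.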

Output Files

Each completed conversion provides a converted audio file accessible via the output_url field.

Understanding use_preview

When creating a conversion, the use_preview parameter determines which audio is converted:

  • use_preview: false (default): Converts the full audio file
  • use_preview: true: Converts only a 10-second preview from the beginning

Both options return a single output_url containing the converted audio.

Downloading Converted Audio

To download the converted audio, make a GET request to the URL in data.output_url:

curl -X GET "$OUTPUT_URL" --output converted-audio.wav

Output Properties:

  • Format: WAV
  • Duration: Full audio length if use_preview: false, or ~10 seconds if use_preview: true
  • URL Validity: 12 hours

URL Expiration

Download URLs expire after 12 hours. If a URL expires, call GET /api/v1/voice-conversions/{id} to get a fresh URL.
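
A client can check whether a stored URL is still usable before attempting a download. The sketch below assumes that `output_url` is a standard AWS Signature Version 4 presigned URL carrying `X-Amz-Date` and `X-Amz-Expires` query parameters. The example response only shows `X-Amz-Algorithm=...`, so treat this as an assumption. Both function names are hypothetical. When the parameters are missing, the helper reports that a refresh is needed.

```javascript
// Returns the expiry time of a SigV4 presigned URL, or null if it cannot be determined.
function presignedUrlExpiry(outputUrl) {
  const params = new URL(outputUrl).searchParams;
  const amzDate = params.get('X-Amz-Date') || ''; // format: YYYYMMDDTHHMMSSZ
  const expires = params.get('X-Amz-Expires') || ''; // lifetime in seconds
  const match = /^(\d{4})(\d{2})(\d{2})T(\d{2})(\d{2})(\d{2})Z$/.exec(amzDate);
  if (!match || !/^\d+$/.test(expires)) return null;
  const [year, month, day, hour, minute, second] = match.slice(1).map(Number);
  const signedAtMs = Date.UTC(year, month - 1, day, hour, minute, second);
  return new Date(signedAtMs + Number(expires) * 1000);
}

// True when the URL has expired, is about to expire, or has an unknown expiry.
function needsRefresh(outputUrl, now = new Date(), safetyMarginMs = 60000) {
  const expiry = presignedUrlExpiry(outputUrl);
  return expiry === null || now.getTime() + safetyMarginMs >= expiry.getTime();
}

const example =
  'https://example.com/out.wav?X-Amz-Date=20250115T104500Z&X-Amz-Expires=43200';
console.log(presignedUrlExpiry(example).toISOString()); // 2025-01-15T22:45:00.000Z
```

If `needsRefresh` returns true, request the conversion again to obtain a new `output_url` rather than retrying the old one.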

When to Use Preview Mode

Preview mode (use_preview: true) is useful for:

  • Quick testing: Test voice models without processing the entire audio
  • Faster results: Preview conversions complete much faster
  • Cost optimization: Process only what you need to evaluate

# Create a preview conversion for testing
curl -X POST https://api.auribus.io/api/v1/voice-conversions \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"audio_file_id": "660e8400-e29b-41d4-a716-446655440001",
"voice_model_id": "550e8400-e29b-41d4-a716-446655440000",
"pitch_shift": "same_octave",
"use_preview": true
}'

Best Practices

1. Use Webhooks for Production

Instead of polling, implement webhook notifications to receive instant updates:

# Create webhook subscription
curl -X POST https://api.auribus.io/api/v1/webhooks/subscriptions \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"url": "https://your-app.com/webhooks/conversions",
"events": ["conversion_completed", "conversion_failed"]
}'
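
On the receiving side, the endpoint needs to route each notification by event type. The sketch below shows one way to do that. The event names come from the subscription request above, but the payload shape (`{ event, data }`) is an assumption made for illustration, since the delivery format is not described here. `handleConversionEvent` is a hypothetical helper.

```javascript
// ASSUMPTION: the webhook body parses to { event: string, data: object }.
// Adjust the property names to match the payload your endpoint actually receives.
function handleConversionEvent(payload, handlers) {
  switch (payload.event) {
    case 'conversion_completed':
      return handlers.onCompleted(payload.data);
    case 'conversion_failed':
      return handlers.onFailed(payload.data);
    default:
      // Ignore event types this endpoint did not subscribe to.
      return undefined;
  }
}

// Usage with a parsed request body:
handleConversionEvent(
  { event: 'conversion_completed', data: { id: '770e8400-e29b-41d4-a716-446655440002' } },
  {
    onCompleted: (data) => console.log('Ready to download:', data.id),
    onFailed: (data) => console.error('Conversion failed:', data.id),
  }
);
```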

2. Start Conversions Immediately

You don't need to wait for audio file preprocessing to finish before creating a conversion:

# Step 1: Upload audio (-s hides curl's progress meter)
UPLOAD_RESPONSE=$(curl -s -X POST https://api.auribus.io/api/v1/audio-files ...)
AUDIO_ID=$(echo "$UPLOAD_RESPONSE" | jq -r '.data.id')

# Step 2: Upload to S3
# Set UPLOAD_URL to the upload URL issued for this file before running this step
curl -X PUT "$UPLOAD_URL" --upload-file audio.wav

# Step 3: Create conversion immediately (no waiting!)
curl -X POST https://api.auribus.io/api/v1/voice-conversions \
-d '{"audio_file_id": "'"$AUDIO_ID"'",...}'

The system automatically queues the conversion and starts processing when the audio file is ready.

3. Download Files Promptly

Download URLs expire after 12 hours:

  • Download files soon after conversion completes
  • Don't store download URLs in your database
  • Request fresh URLs if needed by calling GET /api/v1/voice-conversions/{id}

4. Handle Failures Gracefully

Implement proper error handling:

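// `options` holds the method, headers, and JSON body, as in the curl examples
const response = await fetch('https://api.auribus.io/api/v1/voice-conversions', options);

if (!response.ok) {
  // Assumes the error body is JSON with a `message` field
  const error = await response.json();
  console.error(`Conversion request failed (HTTP ${response.status}):`, error.message);
  // Implement retry logic or user notification
}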
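
The retry logic mentioned in the comment can be a small wrapper with exponential backoff. This is a sketch under stated assumptions: `withRetry`, its option names, and the default of three retries starting at 500 ms are illustrative choices, not values recommended by the API. It retries on any thrown error, so in practice you may want to rethrow immediately for errors that a retry cannot fix, such as an invalid request.

```javascript
const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Runs `operation`, retrying on failure with delays of base, 2x base, 4x base, ...
async function withRetry(
  operation,
  { retries = 3, baseDelayMs = 500, sleepFn = delay } = {}
) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await operation(attempt);
    } catch (err) {
      lastError = err;
      if (attempt < retries) await sleepFn(baseDelayMs * 2 ** attempt);
    }
  }
  // All attempts failed: surface the most recent error to the caller.
  throw lastError;
}

// Usage (url and options are your request details):
// const data = await withRetry(async () => {
//   const response = await fetch(url, options);
//   if (!response.ok) throw new Error(`HTTP ${response.status}`);
//   return (await response.json()).data;
// });
```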

5. Test with Preview Mode

Use use_preview: true to quickly test voice models and settings before processing full audio:

# Create a preview conversion (10 seconds)
curl -X POST https://api.auribus.io/api/v1/voice-conversions \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"audio_file_id": "'"$AUDIO_ID"'",
"voice_model_id": "'"$VOICE_ID"'",
"pitch_shift": "same_octave",
"use_preview": true
}'

# Listen to preview result, then create full conversion if satisfied

Troubleshooting

Conversion Stays in Pending

Cause: The audio file is still being preprocessed, so the conversion remains in pending_preprocessing.

Solution: Wait for audio file preprocessing to complete. No action is needed: the conversion starts automatically once the file is ready.

Conversion Failed

Causes:

  • Source audio file is corrupted
  • Unsupported audio format
  • Audio file too large (over 100 MB)
  • System error

Solution: Check audio file quality and format. Contact support if the issue persists.
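
The size limit is the one cause in this list that is easy to check before uploading. The sketch below is a hypothetical pre-flight check. It assumes "100MB" can be treated as 100,000,000 bytes, the stricter of the two common readings, so a file that passes here is under the limit either way. The exact byte threshold is not stated in this guide.

```javascript
// ASSUMPTION: treat the 100 MB limit as 100,000,000 bytes (the conservative reading).
const MAX_AUDIO_BYTES = 100 * 1000 * 1000;

// Returns { ok: true } or { ok: false, reason } for a file size in bytes.
function checkAudioSize(sizeInBytes) {
  if (!Number.isFinite(sizeInBytes) || sizeInBytes <= 0) {
    return { ok: false, reason: 'File is empty or its size is unknown' };
  }
  if (sizeInBytes > MAX_AUDIO_BYTES) {
    const sizeMb = (sizeInBytes / 1e6).toFixed(1);
    return { ok: false, reason: `File is ${sizeMb} MB; the limit is 100 MB` };
  }
  return { ok: true };
}

// Usage in Node:
// const { size } = require('fs').statSync('audio.wav');
// const result = checkAudioSize(size);
// if (!result.ok) console.error(result.reason);
```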

Download URL Returns 403 Forbidden

Cause: Download URL has expired (>12 hours old).

Solution:

# Get a fresh URL
curl -X GET https://api.auribus.io/api/v1/voice-conversions/{id} \
-H "Authorization: Bearer $TOKEN"

Conversion Sounds Unnatural

Causes:

  • Pitch shift too extreme
  • Voice model not suitable for source audio
  • Low-quality source audio

Solutions:

  1. Try a more subtle pitch shift (e.g., third_up or third_down instead of full octaves)
  2. Test different voice models
  3. Improve source audio quality

Rate Limits

Rate limits may apply depending on your plan and traffic patterns.

Stay Within Limits

Use webhooks instead of frequent polling to reduce API traffic.

Getting Help

Need assistance with voice conversions? We're here to help:

  • Email: support@auribus.io
  • Technical Support: Get help choosing voice models and optimizing conversions