Voice Conversion

Transform your audio files using professional voice models with the VoiceByAuribus API.

Overview

Voice conversion is the process of transforming source audio to match the characteristics of a selected voice model while preserving the original content, timing, and emotional expression. VoiceByAuribus provides high-quality voice conversion with pitch shifting capabilities.

How It Works

Upload your source audio file
Select a voice model from our voice model library
Configure pitch shifting (optional)
Process the conversion job
Download your converted audio

The entire process is handled asynchronously, allowing you to submit multiple conversion jobs and receive notifications when they complete.

Creating a Conversion

Basic Conversion

Create a voice conversion with default settings. The request below spells out the default values of pitch_shift and use_preview:

curl -X POST https://api.auribus.io/api/v1/voice-conversions \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "audio_file_id": "660e8400-e29b-41d4-a716-446655440001",
    "voice_model_id": "550e8400-e29b-41d4-a716-446655440000",
    "pitch_shift": "same_octave",
    "use_preview": false
  }'

Response:

{
  "success": true,
  "data": {
    "id": "770e8400-e29b-41d4-a716-446655440002",
    "audio_file_id": "660e8400-e29b-41d4-a716-446655440001",
    "audio_file_name": "my-audio.wav",
    "voice_model_id": "550e8400-e29b-41d4-a716-446655440000",
    "voice_model_name": "Sarah Mitchell",
    "pitch_shift": "same_octave",
    "use_preview": false,
    "status": "pending_preprocessing",
    "output_url": null,
    "created_at": "2025-01-15T10:40:00Z",
    "queued_at": null,
    "processing_started_at": null,
    "completed_at": null,
    "error_message": null
  }
}
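
The same request can be sketched in JavaScript. This is a minimal illustration, assuming an environment where fetch is available globally (for example Node.js 18+ or a browser). The helper name buildConversionRequest and its camelCase option names are hypothetical; the endpoint, headers, and JSON fields are taken from the curl example above.

```javascript
// Hypothetical helper: builds the URL and fetch() options for a create-conversion call.
const CONVERSIONS_URL = 'https://api.auribus.io/api/v1/voice-conversions';

function buildConversionRequest(token, {
  audioFileId,
  voiceModelId,
  pitchShift = 'same_octave', // documented default
  usePreview = false,         // documented default
}) {
  if (!audioFileId || !voiceModelId) {
    throw new Error('audioFileId and voiceModelId are required');
  }
  return {
    url: CONVERSIONS_URL,
    options: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${token}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        audio_file_id: audioFileId,
        voice_model_id: voiceModelId,
        pitch_shift: pitchShift,
        use_preview: usePreview,
      }),
    },
  };
}

// Usage (inside an async function):
//   const { url, options } = buildConversionRequest(TOKEN, { audioFileId, voiceModelId });
//   const response = await fetch(url, options);
//   const conversion = (await response.json()).data;
```

Keeping request construction separate from the network call makes the payload easy to inspect or log before it is sent.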

Pitch Shifting Options

Adjust the pitch of the converted audio to match your desired vocal range. VoiceByAuribus provides seven pitch shifting options:

Option	Semitones	Description
`same_octave`	0	No pitch change (default)
`third_down`	-4	Subtle deepening
`third_up`	+4	Subtle brightening
`fifth_down`	-7	Moderate lowering
`fifth_up`	+7	Moderate raising
`lower_octave`	-12	Dramatic deepening
`higher_octave`	+12	Dramatic brightening

Example with pitch shift:

curl -X POST https://api.auribus.io/api/v1/voice-conversions \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "audio_file_id": "660e8400-e29b-41d4-a716-446655440001",
    "voice_model_id": "550e8400-e29b-41d4-a716-446655440000",
    "pitch_shift": "fifth_up"
  }'

Choosing Pitch Shifts

Start with subtle adjustments (third_up or third_down) and increase incrementally. Thirds and fifths generally sound more natural than full octave shifts.
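
To get a feel for how large each shift is, the semitone values in the table can be converted to frequency ratios. The sketch below uses the standard equal-temperament relation (a shift of n semitones multiplies frequency by 2^(n/12)). The semitone counts come from the table above; that the service applies exactly this ratio internally is an assumption, and the names PITCH_SHIFT_SEMITONES and pitchShiftRatio are hypothetical.

```javascript
// Semitone offsets copied from the pitch shifting table.
const PITCH_SHIFT_SEMITONES = {
  same_octave: 0,
  third_down: -4,
  third_up: 4,
  fifth_down: -7,
  fifth_up: 7,
  lower_octave: -12,
  higher_octave: 12,
};

// Equal-temperament frequency ratio for a named option: 2^(semitones / 12).
function pitchShiftRatio(option) {
  if (!Object.prototype.hasOwnProperty.call(PITCH_SHIFT_SEMITONES, option)) {
    throw new Error(`Unknown pitch_shift option: ${option}`);
  }
  return Math.pow(2, PITCH_SHIFT_SEMITONES[option] / 12);
}

console.log(pitchShiftRatio('higher_octave'));        // 2
console.log(pitchShiftRatio('fifth_up').toFixed(3));  // 1.498
```

A fifth already raises every frequency by roughly 50%, which is one way to see why thirds tend to sound more natural than larger shifts.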

Conversion Lifecycle

Status Flow

pending_preprocessing → queued → processing → completed
                                          ↓
                                        failed

Status	Description	Actions Available
`pending_preprocessing`	Waiting for audio file preprocessing to complete	Wait or poll status
`queued`	Audio ready, waiting in conversion queue	Wait or poll status
`processing`	Actively converting audio	Wait or poll status
`completed`	Conversion finished successfully	Download the output file
`failed`	Conversion failed	Contact support, retry
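
Client code often needs to know whether a status is final and what to do next. The following is a small sketch that encodes the table above; the function names isTerminal and nextAction and the returned action labels are hypothetical, while the status strings are the documented ones.

```javascript
// The two statuses after which a conversion no longer changes.
const TERMINAL_STATUSES = new Set(['completed', 'failed']);

function isTerminal(status) {
  return TERMINAL_STATUSES.has(status);
}

// Maps a documented status to the client-side action suggested by the table.
function nextAction(status) {
  switch (status) {
    case 'pending_preprocessing':
    case 'queued':
    case 'processing':
      return 'wait';
    case 'completed':
      return 'download';
    case 'failed':
      return 'retry_or_contact_support';
    default:
      throw new Error(`Unknown status: ${status}`);
  }
}
```

Throwing on an unknown status surfaces new or misspelled values early instead of silently waiting forever.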

Checking Status

Poll the conversion status:

curl -X GET https://api.auribus.io/api/v1/voice-conversions/{id} \
  -H "Authorization: Bearer $TOKEN"

Response (completed):

{
  "success": true,
  "data": {
    "id": "770e8400-e29b-41d4-a716-446655440002",
    "audio_file_id": "660e8400-e29b-41d4-a716-446655440001",
    "audio_file_name": "my-audio.wav",
    "voice_model_id": "550e8400-e29b-41d4-a716-446655440000",
    "voice_model_name": "Sarah Mitchell",
    "pitch_shift": "same_octave",
    "use_preview": false,
    "status": "completed",
    "output_url": "https://s3.amazonaws.com/bucket/output/conversion-id.wav?X-Amz-Algorithm=...",
    "created_at": "2025-01-15T10:40:00Z",
    "queued_at": "2025-01-15T10:41:00Z",
    "processing_started_at": "2025-01-15T10:42:00Z",
    "completed_at": "2025-01-15T10:45:00Z",
    "error_message": null
  }
}
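
A polling loop around this endpoint might look like the sketch below. It is an illustration under stated assumptions, not a reference client: the helper name waitForConversion, the 5-second interval, and the 60-attempt cap are all hypothetical choices, and fetchImpl and sleep are injectable only so the loop can be exercised without network access.

```javascript
// Polls GET /api/v1/voice-conversions/{id} until the conversion completes or fails.
async function waitForConversion(id, token, {
  fetchImpl = fetch,
  sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
  intervalMs = 5000, // assumed interval, not a documented recommendation
  maxAttempts = 60,  // assumed cap so the loop cannot run forever
} = {}) {
  const url = `https://api.auribus.io/api/v1/voice-conversions/${encodeURIComponent(id)}`;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const response = await fetchImpl(url, {
      headers: { Authorization: `Bearer ${token}` },
    });
    if (!response.ok) {
      throw new Error(`Status check failed with HTTP ${response.status}`);
    }
    const { data } = await response.json();
    if (data.status === 'completed') return data;
    if (data.status === 'failed') {
      throw new Error(data.error_message || 'Conversion failed');
    }
    await sleep(intervalMs); // still pending_preprocessing, queued, or processing
  }
  throw new Error(`Conversion ${id} did not finish after ${maxAttempts} checks`);
}

// Usage (inside an async function):
//   const conversion = await waitForConversion(conversionId, TOKEN);
//   console.log(conversion.output_url);
```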

Use Webhooks Instead

Instead of polling, use webhook notifications to receive instant updates when conversions complete. This is more efficient and provides a better user experience.

Output Files

Each completed conversion provides a converted audio file accessible via the output_url field.

Understanding `use_preview`

When creating a conversion, the use_preview parameter determines which audio is converted:

use_preview: false (default): Converts the full audio file
use_preview: true: Converts only a 10-second preview from the beginning

Both options return a single output_url containing the converted audio.

Downloading Converted Audio

To download the converted audio, make a GET request to the URL in data.output_url:

curl -X GET "$OUTPUT_URL" --output converted-audio.wav

Output Properties:

Format: WAV
Duration: Full audio length if use_preview: false, or ~10 seconds if use_preview: true
URL Validity: 12 hours

URL Expiration

Download URLs expire after 12 hours. If a URL expires, call GET /api/v1/voice-conversions/{id} to get a fresh URL.
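
If your application keeps a URL in memory for a while, a simple age check can decide when to request a new one. The sketch below assumes the 12-hour window starts when the API returns the URL, which this page does not state explicitly; the helper name isOutputUrlStale and the 5-minute safety margin are hypothetical.

```javascript
const URL_VALIDITY_MS = 12 * 60 * 60 * 1000; // 12 hours, as documented above

// Returns true when a URL obtained at `fetchedAt` should be treated as expired at `now`.
// safetyMarginMs is an assumed buffer so a download does not start right before expiry.
function isOutputUrlStale(fetchedAt, now = new Date(), safetyMarginMs = 5 * 60 * 1000) {
  const ageMs = now.getTime() - fetchedAt.getTime();
  return ageMs >= URL_VALIDITY_MS - safetyMarginMs;
}

// Usage:
//   if (isOutputUrlStale(urlFetchedAt)) { /* call GET /api/v1/voice-conversions/{id} again */ }
```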

When to Use Preview Mode

Preview mode (use_preview: true) is useful for:

Quick testing: Test voice models without processing the entire audio
Faster results: Preview conversions complete much faster
Cost optimization: Process only what you need to evaluate

# Create a preview conversion for testing
curl -X POST https://api.auribus.io/api/v1/voice-conversions \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "audio_file_id": "660e8400-e29b-41d4-a716-446655440001",
    "voice_model_id": "550e8400-e29b-41d4-a716-446655440000",
    "pitch_shift": "same_octave",
    "use_preview": true
  }'

Best Practices

1. Use Webhooks for Production

Instead of polling, implement webhook notifications to receive instant updates:

# Create webhook subscription
curl -X POST https://api.auribus.io/api/v1/webhooks/subscriptions \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://your-app.com/webhooks/conversions",
    "events": ["conversion_completed", "conversion_failed"]
  }'
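
On the receiving side, the handler for these events can be a small dispatcher. The sketch below is illustrative only: the payload shape ({ event, data }) is an assumption, since this page shows only the event names, and the function name handleConversionEvent and the callback names are hypothetical. Check the webhook documentation for the real payload before relying on these fields.

```javascript
// ASSUMPTION: payload looks like { event: string, data: { id, output_url, error_message } }.
// Only the two event names below come from the subscription example above.
function handleConversionEvent(payload, { onCompleted, onFailed }) {
  switch (payload.event) {
    case 'conversion_completed':
      return onCompleted(payload.data);
    case 'conversion_failed':
      return onFailed(payload.data);
    default:
      return undefined; // ignore events this handler does not know about
  }
}

// Usage with a parsed request body from your HTTP framework of choice:
//   handleConversionEvent(body, {
//     onCompleted: (conversion) => console.log('ready:', conversion.id),
//     onFailed: (conversion) => console.error('failed:', conversion.id),
//   });
```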

2. Start Conversions Immediately

You don't need to wait for audio file preprocessing to finish before creating a conversion:

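# Step 1: Register the audio file (request details elided)
UPLOAD_RESPONSE=$(curl -s -X POST https://api.auribus.io/api/v1/audio-files ...)
AUDIO_ID=$(echo "$UPLOAD_RESPONSE" | jq -r '.data.id')

# Step 2: Upload the file to S3
# ($UPLOAD_URL is assumed to hold the upload URL obtained in step 1; see the Uploading Audio guide)
curl -X PUT "$UPLOAD_URL" --upload-file audio.wav

# Step 3: Create the conversion immediately (no waiting)
curl -X POST https://api.auribus.io/api/v1/voice-conversions \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"audio_file_id": "'"$AUDIO_ID"'",...}'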

The system automatically queues the conversion and starts processing when the audio file is ready.

3. Download Files Promptly

Download URLs expire after 12 hours:

Download files soon after conversion completes
Don't store download URLs in your database
Request a fresh URL if needed by calling GET /api/v1/voice-conversions/{id}

4. Handle Failures Gracefully

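Check the HTTP status of the create request, and separately handle conversions that end in the failed status (the error_message field describes the failure):

// Request options; TOKEN, AUDIO_ID, and VOICE_ID are supplied by your application
const options = {
  method: 'POST',
  headers: { Authorization: `Bearer ${TOKEN}`, 'Content-Type': 'application/json' },
  body: JSON.stringify({ audio_file_id: AUDIO_ID, voice_model_id: VOICE_ID }),
};
const response = await fetch('https://api.auribus.io/api/v1/voice-conversions', options);

if (!response.ok) {
  // The error body shape is not documented on this page, so `message` may be absent
  const error = await response.json().catch(() => ({}));
  console.error('Conversion request failed:', response.status, error.message);
  // Implement retry logic or user notification
}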

5. Test with Preview Mode

Use use_preview: true to quickly test voice models and settings before processing full audio:

# Create a preview conversion (10 seconds)
curl -X POST https://api.auribus.io/api/v1/voice-conversions \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "audio_file_id": "'"$AUDIO_ID"'",
    "voice_model_id": "'"$VOICE_ID"'",
    "pitch_shift": "same_octave",
    "use_preview": true
  }'

# Listen to preview result, then create full conversion if satisfied

Troubleshooting

Conversion Stays in Pending

Cause: The audio file is still preprocessing, so the conversion remains in pending_preprocessing.

Solution: No action is needed. The conversion starts automatically once preprocessing completes.

Conversion Failed

Causes:

Source audio file is corrupted
Unsupported audio format
Audio file too large (>100MB)
System error

Solution: Check the audio file's quality and format. Contact support if the issue persists.
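
One of the causes above, file size, can be checked before uploading. The sketch below is a pre-flight check under an explicit assumption: "100MB" is read conservatively as 100,000,000 bytes, because this page does not say whether the limit uses decimal or binary megabytes. The names MAX_AUDIO_BYTES and isWithinSizeLimit are hypothetical.

```javascript
// ASSUMPTION: 100MB interpreted as decimal megabytes (the stricter reading).
const MAX_AUDIO_BYTES = 100 * 1000 * 1000;

// Returns true when a file of the given size is at or under the assumed limit.
function isWithinSizeLimit(sizeInBytes) {
  if (!Number.isFinite(sizeInBytes) || sizeInBytes < 0) {
    throw new TypeError('sizeInBytes must be a non-negative number');
  }
  return sizeInBytes <= MAX_AUDIO_BYTES;
}

// Usage in Node.js:
//   const { statSync } = require('node:fs');
//   if (!isWithinSizeLimit(statSync('audio.wav').size)) { /* compress or split the file */ }
```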

Download URL Returns 403 Forbidden

Cause: Download URL has expired (>12 hours old).

Solution:

# Get a fresh URL
curl -X GET https://api.auribus.io/api/v1/voice-conversions/{id} \
  -H "Authorization: Bearer $TOKEN"
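
The refresh can also be folded into the download itself. The sketch below tries the stored output_url, and on a 403 re-reads the conversion once to obtain a new URL before retrying. It assumes a global fetch (Node.js 18+ or a browser); the helper name downloadWithRefresh is hypothetical, and treating every 403 as an expired URL is a simplifying assumption.

```javascript
// Downloads the converted audio, refreshing the URL once if it has expired.
async function downloadWithRefresh(conversion, token, { fetchImpl = fetch } = {}) {
  let response = await fetchImpl(conversion.output_url);
  if (response.status === 403) {
    // Likely an expired URL: fetch the conversion again to get a fresh output_url.
    const refreshed = await fetchImpl(
      `https://api.auribus.io/api/v1/voice-conversions/${encodeURIComponent(conversion.id)}`,
      { headers: { Authorization: `Bearer ${token}` } }
    );
    if (!refreshed.ok) {
      throw new Error(`Refresh failed with HTTP ${refreshed.status}`);
    }
    const { data } = await refreshed.json();
    response = await fetchImpl(data.output_url);
  }
  if (!response.ok) {
    throw new Error(`Download failed with HTTP ${response.status}`);
  }
  return response.arrayBuffer(); // raw WAV bytes
}

// Usage (inside an async function):
//   const bytes = await downloadWithRefresh(conversion, TOKEN);
```

Retrying only once keeps the helper from looping if the 403 has a different cause, such as a revoked token.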

Conversion Sounds Unnatural

Causes:

Pitch shift too extreme
Voice model not suitable for source audio
Low-quality source audio

Solutions:

Try a more subtle pitch shift (e.g., third_up or third_down instead of full octaves)
Test different voice models
Improve source audio quality

Rate Limits

Rate limits may apply depending on your plan and traffic patterns.

Stay Within Limits

Use webhooks instead of frequent polling to reduce API traffic.

Next Steps

Webhook Notifications: Set up real-time conversion notifications
Voice Models: Browse and select voice models
Uploading Audio: Detailed guide on audio file uploads
Quickstart Guide: Complete end-to-end example

Getting Help

Need assistance with voice conversions? We're here to help:

Email: support@auribus.io
Technical Support: Get help choosing voice models and optimizing conversions
