INTL
Freelancer
전문가
외주
원격 가능
Python Developer Needed – Voice Accent Softening
예산
$30~$250 USD
예상 기간
2~3일
난이도
전문가
기술 스택
Python
Audio Services
Linux
Audio Processing
Signal Processing
Deep Learning
Audio Engineering
Natural Language Processing
Machine Learning Algorithms
Speech Synthesis
GPU Programming
Real-time Audio Processing
AI 분석 요약
나이지리아 악센트를 실시간으로 중화하는 Python 스크립트를 개발해야 합니다. 핵심은 150ms 미만의 지연 시간 내에 음성 변환 모델을 사용하여 라이브 오디오를 처리하는 것이며, AI/ML 기반의 오디오 처리 및 GPU 최적화 역량이 요구됩니다.
프로젝트 원문 설명
Our client runs a call center with Nigerian agents calling US businesses. They need a Python script that makes the agents' accent sound more neutral and clear to American ears in real time during live phone calls.
The hard part is already done. There is a fully built Node.js server that handles the live call audio. We just need you to write one Python script that plugs into it.
What the script needs to do:
Receive live audio from the server, process it through a voice conversion model to soften the Nigerian accent, and send the processed audio back. The whole process must happen in under 150ms so the conversation feels natural with no delay.
Tech details:
Audio comes in as mulaw 8kHz chunks. You can resample internally for better model performance then convert back before sending out. RVC is the preferred model but we are open to alternatives that meet the latency requirement. Script runs on a RunPod RTX A5000 GPU.
Before we hire:
You must send a before and after audio demo showing accent softening on any Nigerian English voice sample. You do not need our server or infrastructure to do this. Just run a Nigerian English audio sample through your model and send both files. This is how we evaluate all candidates.
Deliverable: One working Python script tested on GPU with latency confirmed under 150ms.
Timeline: 2 to 3 days after demo is approved.
Please include your demo files and your fixed price in your proposal.
The hard part is already done. There is a fully built Node.js server that handles the live call audio. We just need you to write one Python script that plugs into it.
What the script needs to do:
Receive live audio from the server, process it through a voice conversion model to soften the Nigerian accent, and send the processed audio back. The whole process must happen in under 150ms so the conversation feels natural with no delay.
Tech details:
Audio comes in as mulaw 8kHz chunks. You can resample internally for better model performance then convert back before sending out. RVC is the preferred model but we are open to alternatives that meet the latency requirement. Script runs on a RunPod RTX A5000 GPU.
Before we hire:
You must send a before and after audio demo showing accent softening on any Nigerian English voice sample. You do not need our server or infrastructure to do this. Just run a Nigerian English audio sample through your model and send both files. This is how we evaluate all candidates.
Deliverable: One working Python script tested on GPU with latency confirmed under 150ms.
Timeline: 2 to 3 days after demo is approved.
Please include your demo files and your fixed price in your proposal.
Freelancer에서 원본 확인
원본 보기