Job Details

Research Intern - Audio-Visual VoiceAI (Open Source)

2025-11-17 WhissleAI Santa Rosa, CA

Description:

We're looking for a Research Intern to join WhissleAI and help advance our open-source work at the intersection of speech, vision, and structured understanding, inspired by projects like advanced speech recognition (asr.whissle.ai) and recent multi-modal alignment research.

You'll work on developing audio-visual foundation models that connect voice, context, and environment — enabling systems that can listen, see, and act coherently in real time. Most of this work is open-source and contributes directly to the broader research community.

Ideal candidate

Undergrad, Master's, or PhD student in CS, AI, or a related field
Prior research experience (conference/workshop publications a plus)
Strong background in one or more of: multimodal learning, audio-visual representation learning, speech modeling, or self-supervised methods
Experience with PyTorch, Hugging Face, or similar frameworks

What you'll do

Prototype and evaluate audio-visual alignment models
Extend our open-source ASR and meta-speech pipelines
Collaborate on papers, demos, and real-time VoiceAI applications

Location: Remote

Type: Paid internship / research collaboration
