David Samuel Roth

Research

End-to-End Fluent Speech Transcription using Hidden Unit BERT

Using raw audio and pretrained encoders for robust, end-to-end speech transcription.

Variations and Relaxations of Normalizing Flows

This paper covers model classes that emerge from relaxing invertibility constraints in normalizing flows and explores their relationship to VAEs, score-based diffusion, and the broader family of generative models.

Multi-Modal Inductive Graph Learning with Zillow

Learning connections in large, multimodal graphs (text+image) using CLIP priors.