title: Speculative Speech Recognition by Audio-Prefixed Low-Rank Adaptation of Language Models

publish date:

2024-07-05

authors:

Bolaji Yusuf et al.

paper id:

2407.04641v1

abstract:

This paper explores speculative speech recognition (SSR), where we empower conventional automatic speech recognition (ASR) with speculation capabilities, allowing the recognizer to run ahead of audio. We introduce a metric for measuring SSR performance and we propose a model which does SSR by combining an RNN-Transducer-based ASR system with an audio-prefixed language model (LM). The ASR system transcribes ongoing audio and feeds the resulting transcripts, along with an audio-dependent prefix, to the LM, which speculates likely completions for the transcriptions. We experiment with a variety of ASR datasets, on which we show the efficacy of our method and the feasibility of SSR as a method of reducing ASR latency.
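
The data flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function `run_speculative_asr`, the callables passed to it, and the toy stand-ins below are all hypothetical names introduced here. The real system uses an RNN-Transducer for transcription and an audio-prefixed LM for completion, neither of which is modeled. The sketch only shows the loop in which a partial transcript and an audio-dependent prefix are handed to a completion model after each audio chunk.

```python
from typing import Callable, List, Sequence

Words = List[str]


def run_speculative_asr(
    audio_chunks: Sequence[str],
    transcribe_chunk: Callable[[str], Words],
    make_audio_prefix: Callable[[Sequence[str]], object],
    complete: Callable[[object, Words], Words],
) -> List[Words]:
    """After each audio chunk, return the partial transcript plus a speculated completion.

    All three callables are hypothetical placeholders for the ASR system,
    the audio-dependent prefix, and the language model mentioned in the abstract.
    """
    partial: Words = []
    outputs: List[Words] = []
    for i, chunk in enumerate(audio_chunks):
        # The ASR system transcribes the audio received so far.
        partial = partial + transcribe_chunk(chunk)
        # An audio-dependent prefix is derived from the audio seen so far.
        prefix = make_audio_prefix(audio_chunks[: i + 1])
        # The LM speculates a completion that runs ahead of the audio.
        outputs.append(partial + complete(prefix, partial))
    return outputs


# Toy stand-ins (assumptions for illustration only, not real models).
def toy_transcribe(chunk: str) -> Words:
    # Pretend each "audio chunk" is already text and split it into words.
    return chunk.split()


def toy_prefix(chunks_so_far: Sequence[str]) -> int:
    # Pretend the audio prefix is just the number of chunks heard.
    return len(chunks_so_far)


TOY_COMPLETIONS = {"on": ["the", "lights"], "kitchen": ["lights"]}


def toy_complete(prefix: object, partial: Words) -> Words:
    # Look up a canned continuation keyed on the last transcribed word.
    return TOY_COMPLETIONS.get(partial[-1], []) if partial else []


hypotheses = run_speculative_asr(
    ["turn on", "the kitchen"], toy_transcribe, toy_prefix, toy_complete
)
for hyp in hypotheses:
    print(" ".join(hyp))
```

In this toy run the first hypothesis is produced before the second chunk has been "heard", which is the sense in which the recognizer runs ahead of the audio.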

Edited and compiled by: wanghaisheng. Last updated: July 9, 2024