Speech-to-Text Webcam Overlay

Web App for Overlaying Machine-Transcribed Text on Webcam Video
Web Application, Open Source Project 2020

Problem

In video conferencing tools that offer transcription, subtitles and video are displayed separately, which makes it difficult to tell which text belongs to which speaker. An earlier proposal composites the transcribed text in front of the speaker's face using the Live Transcribe Android app and a video switcher. However, that approach is difficult to use because it requires an Android smartphone and a video switcher, and the setup is complicated.

Solution

We implemented a web page that overlays live transcribed subtitles on the webcam video; all you need to do is open it in a web browser. By sharing the screen in a video conferencing tool or using a screen capture tool, you can join a video conference while showing your face and the text at the same time. For speech recognition, we use the Web Speech API, which is available in Google Chrome. The source code is available on GitHub as open source.
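
The idea can be illustrated with a minimal TypeScript sketch. It is not the project's actual source code: the function names `buildCaption` and `startOverlay`, the 80-character limit, and the restart-on-end behavior are assumptions made for illustration. The sketch shows the webcam stream in a video element and writes the text reported by the Web Speech API into a caption element, which is assumed to be positioned over the video with CSS.

```typescript
// Hypothetical helper: join recognized phrases into one caption line and
// keep only the most recent maxChars characters so the overlay stays short.
function buildCaption(transcripts: string[], maxChars = 80): string {
  const text = transcripts
    .map((t) => t.trim())
    .filter((t) => t.length > 0)
    .join(" ");
  if (text.length <= maxChars) {
    return text;
  }
  // Drop the oldest characters and mark the cut with an ellipsis.
  return "…" + text.slice(text.length - maxChars + 1);
}

// Hypothetical wiring for a browser page. `video` is a <video> element and
// `caption` is an element layered on top of it. Both are typed as `any` so
// the sketch does not depend on a particular DOM type library.
function startOverlay(video: any, caption: any): void {
  const g = globalThis as any;

  // Show the webcam stream in the video element.
  g.navigator.mediaDevices
    .getUserMedia({ video: true })
    .then((stream: any) => {
      video.srcObject = stream;
      return video.play();
    });

  // Chrome exposes the recognizer under a vendor prefix.
  const Recognition = g.SpeechRecognition ?? g.webkitSpeechRecognition;
  if (!Recognition) {
    caption.textContent = "Speech recognition is not supported in this browser.";
    return;
  }

  const recognition = new Recognition();
  recognition.continuous = true; // keep listening across phrases
  recognition.interimResults = true; // update the caption while speaking

  recognition.onresult = (event: any) => {
    // Collect the results that changed since the previous event.
    const transcripts: string[] = [];
    for (let i = event.resultIndex; i < event.results.length; i++) {
      transcripts.push(event.results[i][0].transcript);
    }
    caption.textContent = buildCaption(transcripts);
  };

  // Assumption: recognition may stop after a pause, so restart it.
  recognition.onend = () => recognition.start();
  recognition.start();
}
```

On a page that contains a video element and a caption element, `startOverlay` would be called once with those two elements after the page loads. Keeping the text handling in the separate `buildCaption` function means it can be checked without a browser, a microphone, or a camera.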

Materials

Application (Google Chrome Only)

Demonstration Video