Speech to Text to Speech
See also: Artificial Intelligence (AI)
Speech to Text
Speech to Text on Debian
See also: Creating a YouTube transcript
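$ VIDEO="tgws21.mp4"
$ AUDIO="audio.wav"
# Extract the audio track as WAV (output format is inferred from the file extension)
$ ffmpeg -i "${VIDEO}" "${AUDIO}"
# Alternative: drop the video stream (-vn) and encode the audio as Ogg Vorbis
$ ffmpeg -i "${VIDEO}" -vn -acodec libvorbis audio.ogg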
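Many speech recognition engines work best with 16 kHz, mono, 16-bit PCM input, so it can help to resample while extracting. The following is a minimal sketch under that assumption; the function name `to_asr_wav`, the output file name, and the `DRY_RUN` switch are hypothetical and only serve the illustration. The ffmpeg options used are `-vn` (no video), `-ar` (sample rate), `-ac` (channel count) and `-c:a` (audio codec).

```shell
# Hypothetical helper: convert any media file to 16 kHz mono 16-bit PCM WAV.
# With DRY_RUN=1 the ffmpeg command is only printed, not executed.
to_asr_wav() {
    in="$1"     # input video or audio file
    out="$2"    # output WAV file
    # Build the command as positional parameters so that quoting is preserved.
    set -- ffmpeg -i "$in" -vn -ar 16000 -ac 1 -c:a pcm_s16le "$out"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}
```

Calling `DRY_RUN=1 to_asr_wav tgws21.mp4 audio16k.wav` prints the command for inspection; without `DRY_RUN` the conversion runs and requires ffmpeg to be installed.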
Open Source Speech Recognition Software Solutions
Source: https://www.goodfirms.co/speech-recognition-software/blog/best-free-open-source-speech-recognition-software
Sources:
2021/2022:
21 best online software tools for speech output
OCR
Source: https://www.goodfirms.co/ocr-software/blog/best-free-open-source-ocr-software
Text correction
Audio to text
- Ferdinand, Linux News, 2026-01-09: EasySpeak: Voice control for the Linux desktop
- DeepSpeech documentation
- GitHub: Mozilla DeepSpeech
- Yujian Tang, October 13, 2021: DeepSpeech for Dummies - A Tutorial and Overview
- Immo Junghärtchen, c’t Magazin / heise.de, May 26, 2023: Speech recognition and transcription with AI: converting speech to text with Whisper - The open-source speech recognizer Whisper transcribes speech from audio files with a very good recognition rate and even handles punctuation.
- Ralf Hersel, GNU/Linux.ch, August 24, 2020: Open source speech recognition - An overview of current speech-to-text systems
- DeepSpeech, Kaldi, Julius, Wav2Letter++, DeepSpeech2, OpenSeq2Seq, Fairseq, Vosk, Athena, ESPnet
- Vanessa Arnold, neuroflash, July 10, 2023: Whisper OpenAI: Converting speech to text like a pro
- Kaldi: https://kaldi-asr.org/
https://github.com/kaldi-asr/kaldi
- Julius: dead? Packages!
julius.osdn.jp/en_index.php
https://github.com/julius-speech/julius
https://github.com/julius-speech/julius/blob/master/Sample.jconf
CLI
https://www.voxforge.org/home/dev/acousticmodels/linux/create/htkjulius/tutorial/run-julius
https://oceanai.mit.edu/pavlab/pdfs/app_uspeech_rec.pdf
- wav2letter / Wav2Letter++ -> Flashlight ASR
https://github.com/flashlight/wav2letter
- DeepSpeech2, PaddleSpeech
- OpenSeq2Seq
- Fairseq
- Vosk
- Athena
- ESPnet / espnet
- KDE Simon, no packages!
https://simon.kde.org/
https://speechify.com/product-reviews/simon-speech-recognition/?landing_url=https%3A%2F%2Fspeechify.com%2Fproduct-reviews%2Fsimon-speech-recognition%2F
https://github.com/KDE/simon
- Kdenlive (based on Vosk), packages are there!
https://docs.kdenlive.org/en/effects_and_compositions/speech_to_text.html
-> Overcomplicated!!!
- KMouth, text to speech!!!
- Gnome Orca, screen reader: https://help.gnome.org/users/orca/stable/introduction.html.en
- Speech Recognition Software: https://tldp.org/HOWTO/Speech-Recognition-HOWTO/software.html
- https://en.wikipedia.org/wiki/Speech_recognition_software_for_Linux
- https://github.com/flashlight/flashlight
- https://github.com/cmirnow/Google-Cloud-TTS-Rails
- https://unix.stackexchange.com/questions/256138/is-there-any-decent-speech-recognition-software-for-linux
- https://www.goodfirms.co/speech-recognition-software/blog/best-free-open-source-speech-recognition-software
- https://fosspost.org/open-source-speech-recognition/
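The Whisper entry in the list above describes transcribing speech from audio files. A minimal sketch of such a call with the `whisper` command-line tool from the openai-whisper package might look as follows. The helper name `transcribe`, the choice of the `small` model, the default language `de`, and the `DRY_RUN` switch are assumptions made for this example and do not come from the linked articles.

```shell
# Hypothetical helper: transcribe an audio file to plain text with the Whisper CLI.
# Assumes the openai-whisper package is installed and provides the `whisper` command.
# With DRY_RUN=1 the command is only printed, not executed.
transcribe() {
    audio="$1"          # audio file to transcribe
    lang="${2:-de}"     # spoken language, defaults to German (assumption)
    set -- whisper "$audio" --model small --language "$lang" --output_format txt
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}
```

For example, `transcribe audio.wav` should write the transcript as a `.txt` file into the current directory, and `transcribe audio.wav en` would do the same for an English recording.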