Speech mode of Debian installer
Roland Clobus
2024-11-19 11:40:01 UTC
Hello list,

Recently I've enabled the recording of the audio that is generated by
espeakup in the speech version of the installer (netinst image) in
openQA. The first step of the installer is recorded.

You can see the result here:
https://openqa.debian.net/tests/325775

The most striking recording is at step 2:6:1
https://openqa.debian.net/tests/325775/file/bootwalk_2:6:1-captured.wav
which is about 5 minutes long and lists 78 language options.

Before asking questions on the debian-accessibility mailing list, I'll
ask some technical questions here:
* Are all these languages supported by the speech generators? (I've
noticed e.g. 'Chinese letter - Chinese letter' being spoken at 1:00) ->
i.e. should the list of languages be reduced for this specific variant
of the installer, because the speech module cannot read it?
* Could a different font be used to show the UTF-8 characters, similar
to the text installer? (I've noticed the square symbols for missing glyphs)
* The spoken text 'Prompt. For help' does not speak the most important
bit, i.e. that the question mark will show the help text
* Nowadays newer TTS voices exist that speak a more natural language
(e.g. piper https://rhasspy.github.io/piper-samples/), could this be
used instead?

I'm very well aware of the huge amount of work needed to implement this,
for a team that is already under load.
At least I'll be able to help with the automated testing side, on openQA.

With kind regards,
Roland Clobus
Samuel Thibault
2024-11-19 13:30:01 UTC
Hello,

Cc-ing debian-accessibility at least for information.
Post by Roland Clobus
The most striking recording is at step 2:6:1
https://openqa.debian.net/tests/325775/file/bootwalk_2:6:1-captured.wav
which is about 5 minutes long and lists 78 language options.
Yes. One can use arrow keys instead to quickly go over the list.
Post by Roland Clobus
* Are all these languages supported by the speech generators?
I don't remember whether that's 100% the case; as far as I recall, the
coverage is at least very broad.
Post by Roland Clobus
(I've noticed e.g. 'Chinese letter - Chinese letter' being spoken
at 1:00) -> i.e. should the list of languages be reduced for this
specific variant of the installer, because the speech module cannot
read it?
Ideally that could be implemented in localechooser, by looking at the
list of voices in espeak to filter the list.
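
As a rough sketch of that idea (not localechooser's actual code, which is
not written in Python), the filtering could look like the following. It
assumes the table layout printed by `espeak-ng --voices`, where the second
column is the language tag. The function names and the sample data are
hypothetical, and matching on the base language code alone is a
simplification: espeak-ng lists Mandarin under `cmn`, for example, so a
real implementation would need a mapping from installer locale codes.

```python
import subprocess


def parse_voice_languages(voices_output):
    """Collect base language codes from `espeak-ng --voices` style output."""
    languages = set()
    for line in voices_output.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) >= 2:
            # Second column is the language tag, e.g. "fr-fr"; keep "fr".
            languages.add(fields[1].split("-")[0].lower())
    return languages


def filter_speakable(installer_languages, spoken):
    """Keep only (code, name) entries whose base language code has a voice."""
    return [
        (code, name)
        for code, name in installer_languages
        if code.split("_")[0].lower() in spoken
    ]


def installed_voice_languages():
    """Query the local espeak-ng binary (assumes it is installed)."""
    result = subprocess.run(
        ["espeak-ng", "--voices"], capture_output=True, text=True, check=True
    )
    return parse_voice_languages(result.stdout)


# Hypothetical sample data, for illustration only.
SAMPLE_VOICES = """Pty Language       Age/Gender VoiceName          File
 5  af              --/M      Afrikaans          gmw/af
 5  fr-fr           --/M      French_(France)    roa/fr
 5  de              --/M      German             gmw/de
"""

SAMPLE_INSTALLER_LANGUAGES = [
    ("fr", "French"),
    ("zh_CN", "Chinese (Simplified)"),
    ("de", "German"),
]

speakable = filter_speakable(
    SAMPLE_INSTALLER_LANGUAGES, parse_voice_languages(SAMPLE_VOICES)
)
```

With the sample data, only the French and German entries remain, because
the sample voice table has no entry matching `zh`.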
Post by Roland Clobus
* Could a different font be used to show the UTF-8 characters, similar to
the text installer?
For the screen reader to work, we have to use the Linux console, not
fbterm. So we are limited to the Linux console's capabilities, and thus
cannot display every character at the same time. In practice this is not a
problem because the speech is correct, and the font is switched once a
language is selected.
Post by Roland Clobus
* The spoken text 'Prompt. For help' does not speak the most important bit,
i.e. that the question mark will show the help text
This is Debian bug #690343:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=690343
which has been forwarded upstream:
https://github.com/espeak-ng/espeak-ng/issues/150
Post by Roland Clobus
* Nowadays newer TTS voices exist that speak a more natural language (e.g.
piper https://rhasspy.github.io/piper-samples/), could this be used instead?
The problem is the size. espeak-ng supports a very wide range of
languages with quite a small disk footprint. Piper etc. (we have had
mbrola for a long time already) take a *lot* of space. We have packages
ready for including e.g. mbrola voices, but we cannot really include them
on the default images; they are better suited to specialized images.

Samuel
