Subject : Re: Voice compression
From : pozzugno (at) *nospam* gmail.com (pozz)
Newsgroups : comp.arch.embedded
Date : 03. Apr 2025, 18:53:28
Organisation : A noiseless patient Spider
Message-ID : <vsmhuo$188tp$1@dont-email.me>
References : 1 2
User-Agent : Mozilla Thunderbird
On 02/04/2025 19:55, Rafael Deliano wrote:
CVSD uses a bit-serial data stream. Harris datasheets for obsolete codecs are HC55516, HC55532. The "recording" circuit can be an analog hack (Kop, flip-flop, 4-bit shift register) that sends data via SPI.
The "playback" would have to emulate this circuit in software and output via an 8-bit D/A (R2R resistor network, but serial ICs may be easier in SMD).
16kBit/sec is very moderate quality, 24kBit/sec more reasonable.
We used these in the '80s for digital answering machines in cars for the analog telephone system via radio that predated GSM in Germany. 24kBit was for incoming messages in RAM, 16kBit for the fixed messages from EPROM. CVSD was OK, as the analog radio was a bit noisy anyway.
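To make the bit rates above concrete, here is a small sketch of the storage they imply. It assumes "kBit" means 1000 bits; `bytes_for_speech` is a hypothetical helper name used only for this illustration. Note that in a 1-bit delta modulator each input sample produces one bit, so the sample clock equals the bit rate.

```python
def bytes_for_speech(bit_rate_bps, seconds):
    """Bytes needed to store `seconds` of audio at a constant bit rate."""
    return bit_rate_bps * seconds // 8


# Compare the rates mentioned in the thread for a 10-second message.
for rate in (16_000, 24_000, 32_000, 64_000):
    print(f"{rate // 1000} kbit/s: {bytes_for_speech(rate, 10)} bytes for 10 s")
```

At 16 kbit/s this works out to 2000 bytes per second of speech, against 8000 bytes per second for 64 kbit/s PCM.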
Thank you for the suggestion. I implemented a simple CVSD codec in Python just to test the quality, and ended up with these two functions[1].
I started from this audio[2] and obtained this one[3] after an encode/decode round trip. It's a short speech sample with an Italian voice. I think you can hear how bad the quality of the decoded audio is.
I suspect I made some errors, because I don't think this is the real quality of the codec. You said it was used in the past, but even if the quality expected back then wasn't high, what I get from my implementation is very poor, practically unusable.
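To put a number on the degradation rather than judging only by ear, one option is a plain signal-to-noise ratio between the original and the decoded samples. This is a minimal sketch, assuming both signals are sequences of PCM sample values at the same sample rate and already time-aligned; `snr_db` is a hypothetical helper, not part of the functions in [1].

```python
import math


def snr_db(reference, decoded):
    """Return the SNR in dB of `decoded` measured against `reference`."""
    n = min(len(reference), len(decoded))  # compare only the overlapping part
    signal = sum(r * r for r in reference[:n])
    noise = sum((r - d) ** 2 for r, d in zip(reference[:n], decoded[:n]))
    if noise == 0:
        return math.inf   # identical signals
    if signal == 0:
        return -math.inf  # silent reference, any error is pure noise
    return 10.0 * math.log10(signal / noise)
```

A single SNR figure does not capture perceived speech quality well, but it is enough to tell whether a change to the codec moves things in the right direction.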
[2]
https://we.tl/t-RmC6EszYRS
[3]
https://we.tl/t-oVbXFy5twW
At 32kBit/sec ADPCM is better, but you probably do not intend to use a 64kBit PCM codec as a frontend. If you use a handset or a digital PCM link, the quality of CVSD may not be competitive. For playback via a loudspeaker it is sufficient, as there is usually enough background noise.
My sounds are quite clear: they are generated by a TTS engine and then flashed into the chip's memory.
[1]
def cvsd_encode(samples):
    prev_sample = 0
    step_size = 16
    STEP_SIZE_MIN = 16
    STEP_SIZE_MAX = 16384
    encoded_stream = bytearray()
    encoded_byte = ""
    last_bits = 0x00
    for sample in samples:
        bit = 1 if sample >= prev_sample else 0
        # Update the value of the previous sample
        if bit == 1:
            prev_sample += step_size
        else:
            prev_sample -= step_size
        # Adapt the step size by looking at the last 3 bits
        last_bits = last_bits << 1
        last_bits += 1 if bit == 1 else 0
        last_bits &= 0x07
        if last_bits == 0x00 or last_bits == 0x07:
            step_size = step_size * 2
        else:
            step_size = step_size // 2
        # Limit the step size
        if step_size > STEP_SIZE_MAX:
            step_size = STEP_SIZE_MAX
        elif step_size < STEP_SIZE_MIN:
            step_size = STEP_SIZE_MIN
        # Pack bits MSB first; trailing bits that do not fill a byte are dropped
        encoded_byte += "1" if bit == 1 else "0"
        if len(encoded_byte) == 8:
            encoded_stream += bytes([int(encoded_byte,2)])
            encoded_byte = ""
    return encoded_stream
def cvsd_decode(bitstream):
    prev_sample = 0
    step_size = 16
    STEP_SIZE_MIN = 16
    STEP_SIZE_MAX = 16384
    samples = []
    last_bits = 0x00
    for byte in bitstream:
        # Unpack each byte MSB first, mirroring the encoder
        for sbit in f"{byte:08b}":
            bit = 1 if sbit == "1" else 0
            if bit == 1:
                prev_sample += step_size
            else:
                prev_sample -= step_size
            samples += [prev_sample]
            # Adapt the step size by looking at the last 3 bits
            last_bits = last_bits << 1
            last_bits += 1 if bit == 1 else 0
            last_bits &= 0x07
            if last_bits == 0x00 or last_bits == 0x07:
                step_size = step_size * 2
            else:
                step_size = step_size // 2
            # Limit the step size
            if step_size > STEP_SIZE_MAX:
                step_size = STEP_SIZE_MAX
            elif step_size < STEP_SIZE_MIN:
                step_size = STEP_SIZE_MIN
    return samples
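
One point that may be worth comparing against: the functions in [1] double or halve the step on every sample, whereas descriptions of CVSD generally adapt the step slowly (a "syllabic" time constant of a few milliseconds) and use a leaky integrator for the signal estimate. The following is a minimal sketch under those assumptions, not a verified reference implementation. The class and function names are hypothetical, and the sample rate, step limits and time constants are illustrative values that would need tuning. It works on a list of bits rather than packed bytes to keep it short.

```python
import math


class CvsdState:
    """State shared by encoder and decoder (all constants are assumptions)."""

    def __init__(self, sample_rate=16000, step_min=10.0, step_max=1280.0,
                 syllabic_tau=0.004, leak_tau=0.001):
        self.step_min = step_min
        self.step_max = step_max
        # Per-sample retention factors derived from the two time constants
        self.step_keep = math.exp(-1.0 / (syllabic_tau * sample_rate))
        self.leak = math.exp(-1.0 / (leak_tau * sample_rate))
        self.step = step_min
        self.estimate = 0.0
        self.history = 0

    def update(self, bit):
        """Feed one bit, adapt the step gradually, return the new estimate."""
        self.history = ((self.history << 1) | bit) & 0x07
        # Three equal bits in a row: drift toward step_max, else toward step_min
        run = self.history in (0x00, 0x07)
        target = self.step_max if run else self.step_min
        self.step = self.step_keep * self.step + (1.0 - self.step_keep) * target
        # Leaky integrator: the estimate decays toward zero as it follows the bits
        delta = self.step if bit else -self.step
        self.estimate = self.leak * self.estimate + delta
        return self.estimate


def cvsd_encode_syllabic(samples, **params):
    """Return one bit (0 or 1) per input sample."""
    state = CvsdState(**params)
    bits = []
    for sample in samples:
        bit = 1 if sample >= state.estimate else 0
        state.update(bit)
        bits.append(bit)
    return bits


def cvsd_decode_syllabic(bits, **params):
    """Rebuild integer samples by running the same state update as the encoder."""
    state = CvsdState(**params)
    return [int(round(state.update(bit))) for bit in bits]
```

Because encoder and decoder share the same `update` method, they stay in step as long as both are created with the same parameters. The input must be sampled at the intended bit rate, since each sample yields exactly one bit.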