Re: Voice compression

Subject : Re: Voice compression
From : pozzugno (at) *nospam* gmail.com (pozz)
Newsgroups : comp.arch.embedded
Date : 3 Apr 2025, 18:53:28
Organisation : A noiseless patient Spider
Message-ID : <vsmhuo$188tp$1@dont-email.me>
References : 1 2
User-Agent : Mozilla Thunderbird
On 02/04/2025 19:55, Rafael Deliano wrote:
> CVSD uses a bit-serial data stream. Harris datasheets for obsolete codecs are HC55516, HC55532. The "recording" circuit can be an analog hack (Kop, flipflop, 4-bit shift register) that sends data via SPI.
> The "playback" would have to emulate this circuit in software and output via an 8-bit D/A (R2R resistor network, but serial ICs may be easier in SMD).
> 16 kBit/sec is very moderate quality, 24 kBit/sec more reasonable.
> We used these in the '80s for digital answering machines in cars for the analog telephone system via radio that predated GSM in Germany. 24 kBit was for incoming messages in RAM, 16 kBit for the fixed messages from EPROM. CVSD was ok, as the analog radio was a bit noisy anyway.

Thank you for the suggestion. I tried to implement a simple CVSD codec in Python just to test the quality. I ended up with these two functions [1].
I started from this audio [2] and obtained this one [3] after an encode/decode round trip. It is a short speech sample with an Italian voice. I think you can hear how bad the decoded audio is.
I suspect I made some errors, because I don't think this is the real quality of this codec. You said it was used in the past, but even if the quality expected back then wasn't high, what I get from my implementation is very poor, practically unusable.
[2] https://we.tl/t-RmC6EszYRS
[3] https://we.tl/t-oVbXFy5twW
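
To put a number on the degradation instead of judging only by ear, a crude SNR over the encode/decode round trip can help. This is only a sketch: snr_db and roundtrip_snr are hypothetical helper names, the codec is passed in as a pair of functions so that cvsd_encode/cvsd_decode from [1] can be plugged in, and the 8-bit truncation codec is just a stand-in to keep the block self-contained. Sample-by-sample SNR penalizes any delay in the decoder, so treat the figure as a rough indicator.

```python
import math

def snr_db(original, decoded):
    """Signal-to-noise ratio in dB over the overlapping part of two sequences."""
    n = min(len(original), len(decoded))
    signal = sum(x * x for x in original[:n])
    noise = sum((x - y) ** 2 for x, y in zip(original[:n], decoded[:n]))
    if noise == 0:
        return math.inf
    if signal == 0:
        return -math.inf
    return 10.0 * math.log10(signal / noise)

def roundtrip_snr(samples, encode, decode):
    """Encode, decode, and compare against the input."""
    return snr_db(samples, decode(encode(samples)))

# Stand-in codec (assumption, for illustration only):
# keeps the upper 8 bits of each signed 16-bit sample.
def trunc_encode(samples):
    return bytes((s >> 8) & 0xFF for s in samples)

def trunc_decode(data):
    return [((b - 256) if b >= 128 else b) << 8 for b in data]

# 0.1 s of a 440 Hz tone at an assumed 16 kHz sampling rate
tone = [int(8000 * math.sin(2 * math.pi * 440 * n / 16000)) for n in range(1600)]
print(f"round-trip SNR: {roundtrip_snr(tone, trunc_encode, trunc_decode):.1f} dB")
```

With the functions from [1], the call would be roundtrip_snr(samples, cvsd_encode, cvsd_decode), where samples is the list of 16-bit values read from the source audio.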

> At 32 kBit/sec ADPCM is better, but you probably do not intend to use a 64 kBit PCM codec as a frontend. If you use a handset or a digital PCM link, the quality of CVSD may not be competitive. For playback via a loudspeaker it is sufficient; there is usually enough background noise.

My sounds are quite clean: they are generated by a TTS engine and then flashed into the chip's memory.
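
Since the clips end up in flash, the bit rate translates directly into memory footprint. CVSD produces one bit per input sample, so its bit rate equals the sampling rate. A small arithmetic sketch follows; flash_bytes is a hypothetical helper, the 10-second clip length is only an illustrative assumption, and the rates are the ones mentioned in this thread.

```python
import math

def flash_bytes(duration_s, bit_rate):
    """Bytes needed to store duration_s seconds of audio at bit_rate bits/s."""
    return math.ceil(duration_s * bit_rate / 8)

RATES = [
    ("CVSD 16 kbit/s", 16000),
    ("CVSD 24 kbit/s", 24000),
    ("ADPCM 32 kbit/s", 32000),
    ("PCM 64 kbit/s", 64000),
]

for name, rate in RATES:
    print(f"{name}: {flash_bytes(10, rate)} bytes for 10 s")
```
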
[1]
def cvsd_encode(samples):
     prev_sample = 0
     step_size = 16
     STEP_SIZE_MIN = 16
     STEP_SIZE_MAX = 16384
     encoded_stream = bytearray()
     encoded_byte = ""
     last_bits = 0x00
     for sample in samples:
         bit = 1 if sample >= prev_sample else 0
         # Update the previous sample value
         if bit == 1:
             prev_sample += step_size
         else:
             prev_sample -= step_size
         # Adapt the step size by looking at the last 3 bits
         last_bits = last_bits << 1
         last_bits += 1 if bit == 1 else 0
         last_bits &= 0x07
         if last_bits == 0x00 or last_bits == 0x07:
             step_size = step_size * 2
         else:
             step_size = step_size // 2
         # Limit the step size
         if step_size > STEP_SIZE_MAX:
             step_size = STEP_SIZE_MAX
         elif step_size < STEP_SIZE_MIN:
             step_size = STEP_SIZE_MIN
         encoded_byte += "1" if bit == 1 else "0"
         if len(encoded_byte) == 8:
             encoded_stream += bytes([int(encoded_byte,2)])
             encoded_byte = ""
     return encoded_stream

def cvsd_decode(bitstream):
     prev_sample = 0
     step_size = 16
     STEP_SIZE_MIN = 16
     STEP_SIZE_MAX = 16384
     samples = []
     last_bits = 0x00
     for byte in bitstream:
         for sbit in f"{byte:08b}":
             bit = 1 if sbit == "1" else 0
             if bit == 1:
                 prev_sample += step_size
             else:
                 prev_sample -= step_size
             samples += [prev_sample]
             # Adapt the step size by looking at the last 3 bits
             last_bits = last_bits << 1
             last_bits += 1 if bit == 1 else 0
             last_bits &= 0x07
             if last_bits == 0x00 or last_bits == 0x07:
                 step_size = step_size * 2
             else:
                 step_size = step_size // 2
             # Limit the step size
             if step_size > STEP_SIZE_MAX:
                 step_size = STEP_SIZE_MAX
             elif step_size < STEP_SIZE_MIN:
                 step_size = STEP_SIZE_MIN
     return samples
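
For comparison, descriptions of CVSD usually adapt the step size gradually (a "syllabic" filter) rather than doubling or halving it on every bit, and let the reconstructed value decay slightly on each sample (a leaky integrator). Below is a minimal sketch of that variant, not a reference implementation: the class and function names are my own, and the constants STEP_MIN, STEP_MAX, STEP_DECAY, LEAK and RUN are illustrative assumptions for 16-bit samples, not values taken from the HC55516 datasheet or from any standard. Unlike the functions above, it also pads the last byte with zero bits instead of dropping leftover bits.

```python
import math

class CvsdState:
    """Predictor state shared by encoder and decoder so both stay in step."""
    STEP_MIN = 16.0     # assumed smallest step, in 16-bit sample units
    STEP_MAX = 2048.0   # assumed largest step
    STEP_DECAY = 0.98   # assumed decay of the step when no run is detected
    LEAK = 0.99         # assumed per-sample leak of the integrator
    RUN = 3             # identical bits in a row treated as slope overload

    def __init__(self):
        self.estimate = 0.0
        self.step = self.STEP_MIN
        self.last_bit = None
        self.run = 0

    def predict(self):
        """Value the input is compared against: the leaked previous estimate."""
        return self.estimate * self.LEAK

    def update(self, bit):
        """Adapt the step from the bit history, then move the estimate."""
        self.run = self.run + 1 if bit == self.last_bit else 1
        self.last_bit = bit
        if self.run >= self.RUN:
            self.step = min(self.step + self.STEP_MIN, self.STEP_MAX)
        else:
            self.step = max(self.step * self.STEP_DECAY, self.STEP_MIN)
        value = self.predict() + (self.step if bit else -self.step)
        self.estimate = max(-32768.0, min(32767.0, value))
        return self.estimate

def cvsd_encode_leaky(samples):
    state = CvsdState()
    out = bytearray()
    acc, nbits = 0, 0
    for sample in samples:
        bit = 1 if sample >= state.predict() else 0
        state.update(bit)
        acc = (acc << 1) | bit
        nbits += 1
        if nbits == 8:
            out.append(acc)
            acc, nbits = 0, 0
    if nbits:
        out.append(acc << (8 - nbits))  # pad the final byte with zero bits
    return bytes(out)

def cvsd_decode_leaky(data):
    state = CvsdState()
    samples = []
    for byte in data:
        for shift in range(7, -1, -1):  # most significant bit first
            bit = (byte >> shift) & 1
            samples.append(int(round(state.update(bit))))
    return samples

# 0.1 s of a 200 Hz tone at an assumed 16 kHz sampling rate
tone = [int(4000 * math.sin(2 * math.pi * 200 * n / 16000)) for n in range(1600)]
encoded = cvsd_encode_leaky(tone)
decoded = cvsd_decode_leaky(encoded)
print(len(tone), "samples ->", len(encoded), "bytes ->", len(decoded), "samples")
```

Because of the padding, the decoder can return up to 7 extra samples at the end. The constants would need tuning by ear against the actual sampling rate and signal level.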

