Front-end Audio Processing: Reflections on Issues, Requirements, and Solutions


Presentation Transcript

Slide 1

Front-end Audio Processing: Reflections on Issues, Requirements, and Solutions Tomas Gaensler mh acoustics www.mhacoustics.com Summit NJ/Burlington VT USA

Slide 2

Front-end Audio Processing: Processing to enhance perceived and/or measured sound quality in communication and recording devices. Then. Now.

Slide 3

Not So Famous Quotes (Acoustic Jewelry/Bluetooth Headset). Gary Elko (mh/Bell Labs colleague) at IWAENC 1995: "Acoustic echo cancellation will not be needed in the future when people wear acoustic jewelry." Arno Penzias (1978 Nobel Prize laureate): "Nobody would want acoustic jewelry since people would think the users talking to themselves are crazy." I'm glad the success of Bluetooth headsets proves that both were completely wrong!

Slide 4

Classical Front-end Architectures - POTS. Large coupling loss in handset mode. Switched loss in phones that support speakerphone mode. Carbon microphone with an expansion effect that reduces noise. Figure label: Switched Loss.

Slide 5

Classical Front-end Architectures – Cellphone 1995

Slide 6

Classical Front-end Architectures – Cellphone 2005 - 2010

Slide 7

Cellphones and Handsfree. Common problems: Far-end listener does not hear near-end talker. Near-end listener does not understand far-end talker. Why? Form factor – size. Limited understanding of physics and acoustics(?)

Slide 8

RX/TX Levels, Coupling and Doubletalk. Far-end: ≈ 95-100 dBSPL at loudspeaker. Echo: 85-90 dBSPL at mic. Near-end talker: ≈ 55-70 dBSPL at mic. Echo is louder than near-end speech. Linear AEC ERLE ≈ 20-30 dB. After cancellation, Residual Echo to Near-end Ratio (RENR): RENR ≈ 90 - 20 - 70 = 0 dB. > 20 dB of residual echo suppression needed. Duplexness suffers.
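The RENR arithmetic on this slide can be checked with a few lines of Python. This is only a sketch of the level budget in dB; the function name `renr_db` is made up for illustration, and the numbers are the ones quoted on the slide.

```python
def renr_db(echo_at_mic_dbspl, erle_db, near_end_at_mic_dbspl):
    """Residual Echo to Near-end Ratio (dB) after linear echo cancellation."""
    residual_echo = echo_at_mic_dbspl - erle_db  # echo level left after the AEC
    return residual_echo - near_end_at_mic_dbspl


# Slide example: 90 dBSPL echo, 20 dB ERLE, 70 dBSPL near-end talker.
print(renr_db(90, 20, 70))
# Quieter near-end talker (55 dBSPL) with the same echo and ERLE.
print(renr_db(90, 20, 55))
```

With a loud talker the residual echo is as loud as the near-end speech; with a quiet talker it is 15 dB louder, which is why additional residual echo suppression is needed.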

Slide 9

TX: Dynamic Range and Noise. Echo level: 90 dBSPL. Near-end speech level: 70 dBSPL. Echo 90 dBSPL ⇒ peak echo ≈ 105-110 dB. No saturation of echo in the TX path. Actual speech to room noise ratio is only about 27 dB at best. Gain is needed to get sufficiently loud output. Perceived noise level is ~20 dB above normal room noise level.

Slide 10

TX: Fixed-point Processing and Quantization Noise. Q-noise increases by 6log2(N) dB! N=64 ⇒ Q-noise increases by 36 dB. Double-precision "needed".
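The 6log2(N) rule is easy to tabulate. The sketch below only evaluates the slide's formula; the function name is made up, and N is taken to be whatever size parameter the slide refers to (the transcript does not say).

```python
import math


def q_noise_increase_db(n):
    """Quantization-noise increase in dB according to the slide's 6*log2(N) rule."""
    return 6.0 * math.log2(n)


for n in (16, 64, 256):
    print(n, q_noise_increase_db(n))
```

At roughly 6 dB per bit, the 36 dB increase for N=64 corresponds to about 6 bits of lost precision, which is the motivation for double-precision arithmetic.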

Slide 11

RX: Dynamic Range and Distortion. Small loudspeakers have a rather high cut-off frequency (high-pass). EQ is often needed to get acceptable "sound" (frequency response). However, EQ means: Loss of signal-to-noise ratio and dynamic range. Increased (analog) distortion. Many manufacturers compensate for the loss of signal level with excessive digital gain and thus get (digital) saturation. Figure labels: Analog gain, Digital gain, To AEC.

Slide 12

What Can or Should be Done? Minimize acoustical coupling by good physical design. TX: Use noise suppression, but not excessively. Double-precision, block scaling, or floating point. RX: Compression instead of fixed gain. 10% or less loudspeaker/driver THD is desired.

Slide 13

What about Non-linear AEC Algorithms? Hot topic, proposed and worked on for many years. Not practical in most AEC applications because of: Complicated model. Gain and therefore saturation, possibly in both TX and RX paths. Added complexity and system cost. Often slow convergence. Difficult to tune in the field. Even when non-linear cancellation works perfectly, the user still perceives a distorted loudspeaker signal!

Slide 14

Classical Front-end Architectures – Cellphone 2005 - 2010 Why RX NS? Why TX NS?

Slide 15

Single Channel Noise Suppression. Basic single channel noise suppressor. An extremely successful signal processing invention by Manfred Schroeder in the 1960s. Musical tones – is it a (solved) problem? How do we evaluate and improve quality? What about convergence rate?

Slide 16

Background to Single Channel Noise Suppressors. Block processing: Frequency domain model: Linear time-varying filter: Wiener filter: Figure labels: "enhanced" speech, NS, speech, noise.
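As a rough illustration of the Wiener filter named on this slide, the sketch below computes a per-frequency-bin suppression gain in the textbook form H = 1 - Pnn/Pxx, where Pxx is the noisy-speech power spectrum and Pnn the noise power spectrum. This is the standard formulation, not necessarily the exact expression used in the talk. The gain floor (`max_suppression_db`) and all names are assumptions added for illustration.

```python
def wiener_gains(noisy_psd, noise_psd, max_suppression_db=12.0):
    """Per-bin Wiener-style gains, limited to a maximum suppression depth.

    noisy_psd: power spectrum of speech + noise for one block (list of floats)
    noise_psd: estimated noise power spectrum for the same bins
    max_suppression_db: assumed limit on attenuation (hypothetical parameter)
    """
    floor = 10.0 ** (-max_suppression_db / 20.0)  # amplitude gain floor
    gains = []
    for p_x, p_n in zip(noisy_psd, noise_psd):
        gain = 1.0 - p_n / p_x if p_x > 0.0 else floor
        gains.append(max(gain, floor))  # never suppress more than the floor
    return gains


# Bin 0 is dominated by speech, bin 1 contains only noise.
print(wiener_gains([10.0, 1.0], [1.0, 1.0]))
```

The "enhanced" speech spectrum for the block is then the noisy spectrum multiplied bin by bin with these gains.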

Slide 17

Background to Single Channel Noise Suppressors. Estimation of spectra is usually done recursively: Frequency smoothing: , when speech is "not" present
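A common way to estimate a spectrum recursively is first-order (exponential) averaging, with the noise estimate frozen whenever speech is present. The sketch below assumes that form. The smoothing constant `alpha` and the `speech_present` flag (which would come from some voice activity decision) are hypothetical and not taken from the slides.

```python
def update_noise_psd(noise_psd, frame_psd, speech_present, alpha=0.9):
    """One recursive update of the noise power spectrum estimate.

    The estimate is only updated when speech is "not" present; otherwise
    the previous estimate is kept unchanged.
    """
    if speech_present:
        return list(noise_psd)
    return [alpha * old + (1.0 - alpha) * new
            for old, new in zip(noise_psd, frame_psd)]


estimate = [1.0, 1.0]
estimate = update_noise_psd(estimate, [2.0, 4.0], speech_present=False, alpha=0.5)
print(estimate)
estimate = update_noise_psd(estimate, [100.0, 100.0], speech_present=True)
print(estimate)
```

A larger `alpha` gives a smoother estimate with lower variance but slower tracking of changing noise.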

Slide 18

Musical Tones – Is it a (Solved) Problem? Examples: Original ("Sally Sievers' reel, June-Sept. 1964" by Manfred Schroeder and Mohan Sondhi at Bell Labs). Original + noise (iSNR ~ 6 dB). Schroeder – 1960s. "Generic spectral subtraction" – Boll 1979. IS-127 – 1995. "A problem of the last century", only a constraint in design. Controlling variance of suppression gains. Any NS algorithm should be constrained not to have musical tones. Must only minimally affect voice quality.

Slide 19

Quality Metrics. Most importantly: Listen! SNR: Total. Segmental. During speech. Distortion measures: ISD (Itakura-Saito distance). ITU-T P.862: PESQ/MOS-LQO.
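The three SNR variants listed here differ only in how the averaging is done. The sketch below assumes time-aligned clean and processed signals as plain lists. The frame length, the clamping limits of -10 and 35 dB, and the per-frame `speech_flags` are assumptions chosen for illustration, not values from the talk.

```python
import math


def total_snr_db(clean, processed):
    """SNR over the whole signal; assumes non-zero signal and error energy."""
    signal = sum(s * s for s in clean)
    error = sum((s - y) ** 2 for s, y in zip(clean, processed))
    return 10.0 * math.log10(signal / error)


def frame_snr_db(clean, processed, lo=-10.0, hi=35.0):
    """SNR of one frame, clamped to [lo, hi] dB (assumed limits)."""
    signal = sum(s * s for s in clean)
    error = sum((s - y) ** 2 for s, y in zip(clean, processed))
    if signal == 0.0:
        return lo
    if error == 0.0:
        return hi
    return min(hi, max(lo, 10.0 * math.log10(signal / error)))


def segmental_snr_db(clean, processed, frame_len=160, speech_flags=None):
    """Mean of per-frame SNRs; with speech_flags, only frames during speech."""
    values = []
    for i in range(len(clean) // frame_len):
        if speech_flags is not None and not speech_flags[i]:
            continue
        a = i * frame_len
        values.append(frame_snr_db(clean[a:a + frame_len],
                                   processed[a:a + frame_len]))
    if not values:
        raise ValueError("no frames to average")
    return sum(values) / len(values)


clean = [1.0] * 320
processed = [0.9] * 160 + [1.0] * 160   # error only in the first frame
print(total_snr_db(clean, processed))
print(segmental_snr_db(clean, processed))
print(segmental_snr_db(clean, processed, speech_flags=[True, False]))
```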

Slide 20

Quality Metric – P.862 (PESQ/MOS-LQO). MOS-LQO (MOS Listening Quality Objective). Alg-1/2 – Wiener methods with 12 dB noise suppression. What can the best noise suppressor achieve?

Slide 21

Quality Metric – "My Rule of Thumb". Ideal MOS (PESQ) performance bound is given by shifting the unprocessed PESQ curve to the left. Example for 12 dB suppression: 12 dB shift to the left.
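Read as a formula, the rule of thumb says that the bound at a given input SNR is the unprocessed score at an SNR that is higher by the suppression amount. The sketch below assumes the curve is MOS-LQO plotted against input SNR. `toy_curve` is a made-up placeholder used only to exercise the function; it is not measured PESQ data.

```python
def mos_bound(unprocessed_mos, input_snr_db, suppression_db=12.0):
    """Upper bound on MOS-LQO: the unprocessed curve shifted left."""
    return unprocessed_mos(input_snr_db + suppression_db)


def toy_curve(snr_db):
    """Placeholder MOS-vs-SNR curve (hypothetical, NOT measured data)."""
    return min(4.5, max(1.0, 1.0 + 0.1 * snr_db))


# Unprocessed score at 10 dB input SNR versus the bound for 12 dB suppression.
print(toy_curve(10.0), mos_bound(toy_curve, 10.0))
```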

Slide 22

Convergence Rate. Important performance criterion: Non-stationary noise conditions. Frame loss. Main objective: Maximize convergence rate while maintaining speech quality.

Slide 23

Convergence Rate – A Useful Test. Input sequence. IS-127. Wiener based. A spectral subtraction m-script retrieved from the web.

Slide 24

Convergence Rate and MOS-LQO. Figure labels: "Typical", "Quick". Axis: MOS-LQO.

Slide 25

Current Applications and Drivers of NS Technology. Where is NS going in industry now? Beyond "12 dB" of suppression. Multi-microphone solutions: Two or more channel suppressors. Linear beamforming. Applications: Mobile phones (a few two-microphone models have reached the market). Bluetooth headsets: great "new" application for signal processing (Ericsson BT headset 2000).

Slide 26

Background to Linear Beamforming. N: Number of microphones. Broadside linear beamforming (e.g. delay-sum): Directional gain: 10log(N). White Noise Gain (WNG)>0. Practical size: "large" (~30cm). Endfire differential beamforming: Directional gain: 20log(N). WNG<0. Practical size: "small" (1.5-5cm). ⇒ Differential beamformers are more suitable for small form factors.
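The two directional-gain expressions can be compared directly for small microphone counts. The sketch below simply evaluates 10log(N) and 20log(N), assuming base-10 logarithms and treating both as the gains quoted on the slide; the function names are made up.

```python
import math


def delay_sum_gain_db(n_mics):
    """Directional gain of a broadside delay-sum beamformer: 10*log10(N)."""
    return 10.0 * math.log10(n_mics)


def differential_gain_db(n_mics):
    """Directional gain of an endfire differential beamformer: 20*log10(N)."""
    return 20.0 * math.log10(n_mics)


for n in (2, 3, 4):
    print(n, round(delay_sum_gain_db(n), 2), round(differential_gain_db(n), 2))
```

For two microphones this gives about 3 dB versus about 6 dB, at the cost of the negative white noise gain noted on the slide.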

Slide 27

Background to Linear Beamforming. What do we gain? Less reverberation (increased intelligibility). Less (environmental) noise. No (or low) distortion on axis. Possible interference rejection by spatial null(s). Some issues: Performance is given by critical distance! Increase in sensor noise (WNG, differential beamforming).

Slide 28

Beamforming: Critical Distance. Critical distance (reverberation radius): the distance at which the reverberant-to-direct path energy ratio is 0 dB. DI = Directivity Index: gain of direct to reverberant energy over an omni-directional microphone. Order of finite differences used (1st: ⇒ 2 mics, 2nd: ⇒ 3 mics, etc.)
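To give a feel for the numbers, the sketch below uses the common Sabine-based approximation r_c ≈ 0.057·sqrt(Q·V/T60), where V is the room volume in m³, T60 the reverberation time in seconds, and Q = 10^(DI/10) the directivity factor. This is the standard textbook approximation and is assumed here; it may differ from the expression on the original slide. The room values in the example are hypothetical.

```python
import math


def critical_distance_m(volume_m3, rt60_s, di_db=0.0):
    """Approximate critical distance in meters (Sabine-based rule of thumb)."""
    q = 10.0 ** (di_db / 10.0)  # directivity factor from the Directivity Index
    return 0.057 * math.sqrt(q * volume_m3 / rt60_s)


# Hypothetical room: 100 m^3 with T60 = 0.5 s.
print(critical_distance_m(100.0, 0.5))             # omni-directional microphone
print(critical_distance_m(100.0, 0.5, di_db=6.0))  # microphone with DI = 6 dB
```

Every 6 dB of directivity index roughly doubles the critical distance, which is the sense in which beamformer performance "is given by critical distance".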

Slide 29

First-Order Differential Beamforming

Slide 30

Classical First-Order Beamformer Responses Cardioid Hypercardioid Dipole
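The three named patterns are usually written as B(θ) = α + (1 - α)·cos θ, with α = 0.5 for the cardioid, α = 0.25 for the hypercardioid, and α = 0 for the dipole. The sketch below assumes this standard parameterization (the slide's figures are not reproduced in the transcript) and computes each pattern's rear response and directivity index in a diffuse field.

```python
import math

# Standard first-order pattern parameters (assumed parameterization).
PATTERNS = {"cardioid": 0.5, "hypercardioid": 0.25, "dipole": 0.0}


def response(alpha, theta_rad):
    """First-order directional response B(theta) = alpha + (1 - alpha) cos(theta)."""
    return alpha + (1.0 - alpha) * math.cos(theta_rad)


def directivity_index_db(alpha):
    """Directivity index of a first-order pattern in a spherically diffuse field."""
    return 10.0 * math.log10(1.0 / (alpha ** 2 + (1.0 - alpha) ** 2 / 3.0))


for name, alpha in PATTERNS.items():
    rear = response(alpha, math.pi)  # response toward 180 degrees
    print(name, round(rear, 3), round(directivity_index_db(alpha), 2))
```

The hypercardioid maximizes the directivity index among first-order patterns at about 6 dB, which matches the 20log(N) figure for N = 2.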

Slide 31

Beamforming Demo: DEWIND processing
