Speech Recognition


Presentation Transcript

Slide 1

Speech Recognition

Slide 2

Components of a Recognition System

Slide 3

Frontend Feature extractor

Slide 4

Frontend Feature extractor Mel-Frequency Cepstral Coefficients (MFCCs) Feature vectors
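The first steps of a frontend can be sketched as follows. This is a minimal illustration, not Sphinx4's frontend: the function names are hypothetical, the 25 ms frame length and 10 ms shift are assumed typical values, and the mel formula is one commonly used variant. A full MFCC pipeline would go on to apply a window, an FFT, a mel filterbank, a log, and a DCT to each frame.

```python
import math


def hz_to_mel(freq_hz):
    """Convert a frequency in Hz to the mel scale (one common formula)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def frame_signal(samples, sample_rate, frame_ms=25, shift_ms=10):
    """Split a list of samples into overlapping fixed-length frames."""
    frame_len = int(sample_rate * frame_ms / 1000)
    shift = int(sample_rate * shift_ms / 1000)
    frames = []
    start = 0
    while start + frame_len <= len(samples):
        frames.append(samples[start:start + frame_len])
        start += shift
    return frames


# One second of silence at 16 kHz, cut into 25 ms frames every 10 ms.
one_second = [0.0] * 16000
frames = frame_signal(one_second, 16000)
```

Each frame would become one feature vector, so the decoder sees roughly 100 vectors per second of audio under these assumed settings.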

Slide 5

Hidden Markov Models ( HMMs ) Acoustic Observations

Slide 6

Hidden Markov Models ( HMMs ) Acoustic Observations Hidden States

Slide 7

Hidden Markov Models ( HMMs ) Acoustic Observations Hidden States Acoustic Observation probabilities

Slide 8

Hidden Markov Models ( HMMs ) "Six"

Slide 9

Hidden Markov Models ( HMMs )

Slide 10

Acoustic Model Builds the HMMs of phones Produces observation probabilities

Slide 11

Acoustic Model Builds the HMMs for units of speech Produces observation probabilities Sampling rate is critical! WSJ vs. WSJ_8k

Slide 12

Acoustic Model Builds the HMMs for units of speech Produces observation probabilities Sampling rate is critical! WSJ vs. WSJ_8k TIDIGITS, RM1, AN4, HUB4
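Because the audio must match the rate the acoustic model was trained on, it can help to check a recording before decoding. A minimal sketch using Python's standard `wave` module; the function name is hypothetical, and the rates in the usage comment (16 kHz for WSJ, 8 kHz for WSJ_8k) are assumptions based on the model names.

```python
import wave


def sample_rate_matches(wav_path, expected_hz):
    """Return True if the WAV file's sample rate equals the model's rate."""
    with wave.open(wav_path, "rb") as wav_file:
        return wav_file.getframerate() == expected_hz


# Assumed usage: sample_rate_matches("input.wav", 8000) before using WSJ_8k.
```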

Slide 13

Language Model Word probabilities

Slide 14

Language Model ARPA format Example:
1-grams:
-3.7839 board -0.1552
-2.5998 bottom -0.3207
-3.7839 bunch -0.2174
2-grams:
-0.7782 as the -0.2717
-0.4771 at all 0.0000
-0.7782 at the -0.2915
3-grams:
-2.4450 in the least
-0.5211 in the center
-2.4450 in the on
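In an ARPA file the first number on each line is a log10 probability and the trailing number, when present, is a log10 backoff weight. A minimal sketch of a bigram lookup with backoff, using only the entries from the example; the table and function names are hypothetical, and a real model would list every word as a 1-gram.

```python
# (log10 probability, log10 backoff weight), copied from the example above.
UNIGRAMS = {
    "board": (-3.7839, -0.1552),
    "bottom": (-2.5998, -0.3207),
    "bunch": (-3.7839, -0.2174),
}
BIGRAMS = {
    ("as", "the"): (-0.7782, -0.2717),
    ("at", "all"): (-0.4771, 0.0),
    ("at", "the"): (-0.7782, -0.2915),
}


def bigram_logprob(previous, word):
    """Return log10 P(word | previous), backing off to the 1-gram if needed."""
    if (previous, word) in BIGRAMS:
        return BIGRAMS[(previous, word)][0]
    # Unseen bigram: backoff weight of the history plus the 1-gram log prob.
    backoff_weight = UNIGRAMS[previous][1]
    return backoff_weight + UNIGRAMS[word][0]


seen = bigram_logprob("at", "the")         # listed 2-gram
unseen = bigram_logprob("board", "bunch")  # falls back to 1-grams
```

Raising 10 to the returned value gives the ordinary probability, so the listed 2-gram "at the" has a probability of about 0.17.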

Slide 15

Grammar
public <basicCmd> = <startPolite> <command> <endPolite>;
public <startPolite> = (please | kindly | could you) *;
public <endPolite> = [ please | thanks | thank you ];
<command> = <action> <object>;
<action> = (open | close | delete | move);
<object> = [the | a] (window | file | menu);
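To see which utterances a grammar like this accepts, the rules can be mirrored as a regular expression. This is only an illustration of the rule structure, using a simplified subset of the word lists; it is not how Sphinx4 processes grammars, and all names below are hypothetical.

```python
import re

# Each pattern mirrors one grammar rule; * becomes a repeated group and
# [ ... ] becomes an optional group.
START_POLITE = r"(?:please )*"
ACTION = r"(?:open|close|move)"
OBJECT = r"(?:(?:the|a) )?(?:window|menu)"
END_POLITE = r"(?: (?:please|thanks|thank you))?"
BASIC_CMD = re.compile(f"^{START_POLITE}{ACTION} {OBJECT}{END_POLITE}$")


def accepts(utterance):
    """Return True if the utterance is a sentence of the simplified grammar."""
    return BASIC_CMD.match(utterance.lower()) is not None
```

For instance, "please open the window thanks" is accepted, while "window open" is rejected because the action must precede the object.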

Slide 16

Dictionary Maps words to phoneme sequences

Slide 17

Dictionary Example from cmudict.06d
POULTICE P OW L T AH S
POULTICES P OW L T AH S IH Z
POULTON P AW L T AH N
POULTRY P OW L T R IY
POUNCE P AW N S
POUNCED P AW N S T
POUNCEY P AW N S IY
POUNCING P AW N S IH NG
POUNCY P UW NG K IY
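Each dictionary line is a word followed by its phonemes, so loading it into a lookup table is straightforward. A minimal sketch with a hypothetical function name; it does not handle alternate pronunciations, which the full dictionary marks with suffixes such as (2).

```python
def parse_dictionary(lines):
    """Map each word to its list of phonemes."""
    pronunciations = {}
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        pronunciations[parts[0]] = parts[1:]
    return pronunciations


entries = [
    "POULTRY P OW L T R IY",
    "POUNCE P AW N S",
    "POUNCED P AW N S T",
]
lexicon = parse_dictionary(entries)
```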

Slide 18

Linguist Builds the search graph of HMMs from: Acoustic Model, Statistical Language Model ~or~ Grammar, Dictionary

Slide 19

Search Graph

Slide 20

Search Graph

Slide 21

Search Graph Can be statically or dynamically constructed

Slide 22

Linguist Types FlatLinguist

Slide 23

Linguist Types FlatLinguist DynamicFlatLinguist

Slide 24

Linguist Types FlatLinguist DynamicFlatLinguist LexTreeLinguist

Slide 25

Decoder Maps feature vectors to search graph

Slide 26

Search Manager Searches the graph for the "best fit"

Slide 27

Search Manager Searches the graph for the "best fit" P(sequence of feature vectors | word/phone), also known as P(O|W) -> "how likely is the input to have been generated by the word"
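The role of P(O|W) in finding the "best fit" follows from Bayes' rule. This is the standard textbook formulation rather than something stated on the slide: the acoustic model supplies P(O|W), the language model supplies P(W), and P(O) is constant across candidate word sequences, so it can be dropped.

```latex
\hat{W} = \arg\max_{W} P(W \mid O)
        = \arg\max_{W} \frac{P(O \mid W)\, P(W)}{P(O)}
        = \arg\max_{W} P(O \mid W)\, P(W)
```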

Slide 28

F ay v f ay v f ay v f ay v f ay v f ay v f ay v …

Slide 29

Viterbi Algorithm Time O1 O2 O3

Slide 30

Pruner Uses algorithms to weed out low-scoring paths during decoding

Slide 31

Result Words!

Slide 32

Word Error Rate Most basic metric Measures the # of modifications needed to transform the recognized sentence into the reference sentence

Slide 33

Word Error Rate Reference: "This is a reference sentence." Result: "This is neuroscience."

Slide 34

Word Error Rate Reference: "This is a reference sentence." Result: "This is neuroscience." Requires 2 deletions, 1 substitution

Slide 35

Word Error Rate Reference: "This is a reference sentence." Result: "This is neuroscience."

Slide 36

Word Error Rate Reference: "This is a reference sentence." Result: "This is neuroscience." D S D
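The edit count can be computed with a word-level edit distance. A minimal sketch, assuming the usual definition WER = (substitutions + deletions + insertions) / number of reference words; the function name is hypothetical and punctuation is assumed to be stripped beforehand.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # dist[i][j] = edits needed to align the first i reference words
    # with the first j hypothesis words.
    dist = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dist[i][0] = i
    for j in range(len(hyp) + 1):
        dist[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            substitution = 0 if ref[i - 1] == hyp[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,                 # deletion
                dist[i][j - 1] + 1,                 # insertion
                dist[i - 1][j - 1] + substitution,  # match or substitution
            )
    return dist[len(ref)][len(hyp)] / len(ref)


wer = word_error_rate("this is a reference sentence", "this is neuroscience")
```

For the example pair this gives 3 edits over 5 reference words, a WER of 0.6.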

Slide 37

Sphinx4 Implementation

Slide 38

Sphinx4 Implementation

Slide 39

Sphinx4 Implementation

Slide 40

Sphinx4 Implementation

Slide 41

Sphinx4 Implementation

Slide 42

Sphinx4 Implementation

Slide 43

Sphinx4 Implementation

Slide 44

Sphinx4 Implementation

Slide 45

Sphinx4 Implementation

Slide 46

Sphinx4 Implementation

Slide 47

Sphinx4 Implementation

Slide 48

Where Speech Recognition Works Limited Vocab Multi-Speaker

Slide 49

Where Speech Recognition Works Limited Vocab Multi-Speaker Large Vocab Single Speaker

Slide 50

Where Speech Recognition Works *If you have noisy audio input, multiply the expected error rate x 2

Slide 51

Where Speech Recognition Works Other factors: - Continuous vs. Isolated - Conversational vs. Read - Dialect

Slide 52

Questions?

Slide 53

Appendix I: Viterbi Algorithm Time O1 O2 O3

Slide 54

Appendix I: Viterbi Algorithm P(ay | f) * P(O2|ay) P(f|f) * P(O2 | f) Time O1 O2 O3

Slide 55

Appendix I: Viterbi Algorithm P (O1) * P(ay | f) * P(O2|ay) Time O1 O2 O3

Slide 56

Appendix I: Viterbi Algorithm Time O1 O2 O3
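The trellis above can be sketched in code. This is a minimal illustration for the three-state word "five" (f, ay, v) over observations O1..O3; every probability below is invented for the example, the names are hypothetical, and a real decoder would work with log probabilities to avoid underflow.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely state path and its probability."""
    # best[t][s] = (probability of the best path ending in s at time t,
    #               previous state on that path)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            column[s] = max(
                (best[-1][p][0] * trans_p[p].get(s, 0.0) * emit_p[s][obs], p)
                for p in states
            )
        best.append(column)
    # Trace back from the best final state.
    last = max(states, key=lambda s: best[-1][s][0])
    path = [last]
    for t in range(len(best) - 1, 0, -1):
        path.append(best[t][path[-1]][1])
    path.reverse()
    return path, best[-1][last][0]


states = ["f", "ay", "v"]
start_p = {"f": 1.0, "ay": 0.0, "v": 0.0}
trans_p = {
    "f": {"f": 0.5, "ay": 0.5},   # e.g. P(f|f) and P(ay|f)
    "ay": {"ay": 0.5, "v": 0.5},
    "v": {"v": 1.0},
}
emit_p = {
    "f": {"O1": 0.8, "O2": 0.1, "O3": 0.1},
    "ay": {"O1": 0.1, "O2": 0.8, "O3": 0.1},
    "v": {"O1": 0.1, "O2": 0.1, "O3": 0.8},
}
path, probability = viterbi(["O1", "O2", "O3"], states, start_p, trans_p, emit_p)
```

Each cell keeps only the best incoming path, which is how repeated sub-paths are scored once rather than once per full path.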

Slide 57

Appendix II: FAQs Common Sphinx4 FAQs can be found online: http://cmusphinx.sourceforge.net/sphinx4/doc/Sphinx4-faq.html What follows are some less frequently asked questions

Slide 58

Appendix II: FAQs Q. Is a search graph created for each recognition result, or one for the whole recognition application? A. This depends on which Linguist is used. The FlatLinguist generates the entire search graph and holds it in memory. It is useful for small-vocabulary recognition tasks. The LexTreeLinguist dynamically generates search states, allowing it to handle large vocabularies.

Slide 59

Appendix II: FAQs Q. How does the Viterbi algorithm save computation over exhaustive search? A. The Viterbi algorithm saves memory and computation by reusing subproblems already solved within the larger solution. In this way, probability calculations that repeat in different paths through the search graph do not get calculated multiple times. Viterbi cost = n^2 to n^3. Exhaustive search cost = 2^n to 3^n.

Slide 60

Appendix II: FAQs Q. Does the Linguist use a grammar to construct the search graph if one is available? A. Yes, a grammar graph is created.

Slide 61

Appendix II: FAQs Q. What algorithm does the Pruner use? A. Sphinx4 uses absolute and relative beam pruning.
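The two kinds of beam pruning can be sketched as follows. This illustrates the general technique, not Sphinx4's Pruner: the function name and example scores are hypothetical, and scores are assumed to be log10 values, whereas Sphinx4 uses its own internal log base.

```python
import math


def prune(hypotheses, absolute_beam_width, relative_beam_width):
    """Apply relative, then absolute, beam pruning to {path: log10 score}."""
    if not hypotheses:
        return {}
    best_score = max(hypotheses.values())
    # Relative beam: drop paths scoring below best * relative_beam_width.
    threshold = best_score + math.log10(relative_beam_width)
    survivors = {p: s for p, s in hypotheses.items() if s >= threshold}
    # Absolute beam: keep at most absolute_beam_width of the best paths.
    ranked = sorted(survivors.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:absolute_beam_width])


active = prune({"a": -10.0, "b": -12.0, "c": -40.0, "d": -11.0}, 2, 1e-20)
```

Here path "c" falls outside the relative beam, and the absolute beam then keeps only the two best remaining paths.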

Slide 62

Appendix III: Configuration Parameters
Absolute Beam Width - # active search paths
<property name="absoluteBeamWidth" value="5000"/>

Slide 63

Appendix III: Configuration Parameters
Absolute Beam Width - # active search paths
<property name="absoluteBeamWidth" value="5000"/>
Relative Beam Width - probability threshold
<property name="relativeBeamWidth" value="1E-120"/>

Slide 64

Appendix III: Configuration Parameters
Absolute Beam Width - # active search paths
<property name="absoluteBeamWidth" value="5000"/>
Relative Beam Width - probability threshold
<property name="relativeBeamWidth" value="1E-120"/>
Word Insertion Probability - word break likelihood
<property name="wordInsertionProbability" value="0.7"/>

Slide 65

Appendix III: Configuration Parameters
Absolute Beam Width - # active search paths
<property name="absoluteBeamWidth" value="5000"/>
Relative Beam Width - probability threshold
<property name="relativeBeamWidth" value="1E-120"/>
Word Insertion Probability - word break likelihood
<property name="wordInsertionProbability" value="0.7"/>
Language Weight - boosts language model scores
<property name="languageWeight" value="10.5"/>
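One common way these two parameters enter a path score is shown below. This is the textbook log-domain formulation, offered as an assumption about how such parameters are typically applied rather than as Sphinx4's exact arithmetic; the function name is hypothetical and the defaults are the values from this slide.

```python
import math


def combined_log_score(acoustic_log_p, lm_log_p,
                       language_weight=10.5,
                       word_insertion_probability=0.7):
    """Combine log10 acoustic and language model scores for one word."""
    return (
        acoustic_log_p
        + language_weight * lm_log_p            # scale the language model
        + math.log10(word_insertion_probability)  # per-word penalty
    )


score = combined_log_score(-100.0, -2.0, language_weight=10.0,
                           word_insertion_probability=1.0)
```

A larger language weight makes the decoder trust the language model more, and a smaller word insertion probability discourages hypotheses with many short words.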

Slide 66

Appendix III: Configuration Parameters
Silence Insertion Probability - likelihood of inserting silence
<property name="silenceInsertionProbability" value=".1"/>

Slide 67

Appendix III: Configuration Parameters
Silence Insertion Probability - likelihood of inserting silence
<property name="silenceInsertionProbability" value=".1"/>
Filler Insertion Probability - likelihood of inserting filler words
<property name="fillerInsertionProbability" value="1E-10"/>

Slide 68

Appendix IV: Python Note
To call a Java example from Python:
import subprocess
subprocess.call(["java", "-mx1000m", "-jar", "/Users/Username/sphinx4/bin/Transcriber.jar"])
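The same call can be wrapped so the heap size and jar path are parameters. A minimal sketch with hypothetical helper names; it assumes `java` is on the PATH and that the jar path passed in exists.

```python
import subprocess


def build_command(jar_path, max_heap="1000m"):
    """Build the argument list for running a jar with a maximum heap size."""
    return ["java", f"-mx{max_heap}", "-jar", jar_path]


def run_jar(jar_path, max_heap="1000m"):
    """Run the jar and return the Java process's exit code."""
    return subprocess.call(build_command(jar_path, max_heap))
```

Passing the arguments as a list, with no spaces inside each flag, avoids shell quoting problems.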

Slide 69

References
Jurafsky, Daniel and James Martin. Speech and Language Processing, 2nd Ed. Pearson, 2009.
Luger, George. Artificial Intelligence, 6th Ed. Addison Wesley, 2009.
Sphinx Whitepaper. http://cmusphinx.sourceforge.net/sphinx4/#whitepaper
Sphinx Forum. https://sourceforge.net/projects/cmusphinx/forums
