Multi-Level Learning in Hybrid Deliberative/Reactive Mobile Robot Architectural Software Systems

Presentation Transcript

Slide 1

Multi-Level Learning in Hybrid Deliberative/Reactive Mobile Robot Architectural Software Systems. DARPA MARS Review Meeting - January 2000. Approved for public release: distribution unlimited.

Slide 2

Personnel. Georgia Tech College of Computing: Prof. Ron Arkin, Prof. Chris Atkeson, Prof. Sven Koenig. Georgia Tech Research Institute: Dr. Tom Collins. Mobile Intelligence Inc.: Dr. Doug MacKenzie. Students: Amin Atrash, Bhaskar Dutt, Brian Ellenberger, Mel Eriksen, Max Likachev, Brian Lee, Sapan Mehta.

Slide 3

Adaptation and Learning Methods. Case-based reasoning for: deliberative guidance ("wizardry"); reactive situation-dependent behavioral configuration. Reinforcement learning for: run-time behavioral adjustment; behavioral assemblage selection. Probabilistic behavioral transitions: gentler context switching; experience-based planning guidance. Figure: Available Robots and MissionLab Console.

Slide 4

1. Learning Momentum. Reactive learning via dynamic gain alteration (parametric adjustment). Continuous adaptation based on recent experience. Situational analyses required. In a nutshell: if it works, keep doing it a bit harder; if it doesn't, try something different.
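
The "keep doing it a bit harder" rule can be sketched roughly as follows. This is only an illustration, not the MissionLab CNL implementation: the situation names, the progress threshold, the delta values, and the gain bounds are all assumptions invented for the example.

```python
from dataclasses import dataclass

# Hypothetical per-situation gain deltas: (move-to-goal delta, avoid-obstacles delta).
DELTAS = {
    "progress": (+0.10, -0.05),     # it works: push toward the goal a bit harder
    "no_progress": (-0.10, +0.10),  # it does not: shift weight to something else
}


@dataclass
class Gains:
    move_to_goal: float = 1.0
    avoid_obstacles: float = 1.0


def clamp(x, lo=0.1, hi=5.0):
    """Keep a gain inside assumed bounds so it cannot grow without limit."""
    return max(lo, min(hi, x))


def classify(recent_progress, threshold=0.05):
    """Situational analysis reduced to one assumed rule on recent progress."""
    return "progress" if recent_progress > threshold else "no_progress"


def adjust(gains, recent_progress):
    """Return new gains after one learning-momentum step."""
    d_goal, d_avoid = DELTAS[classify(recent_progress)]
    return Gains(
        clamp(gains.move_to_goal + d_goal),
        clamp(gains.avoid_obstacles + d_avoid),
    )
```

Calling `adjust(Gains(), 0.2)` nudges the goal gain up and the avoidance gain down; repeated calls with no progress drift the other way until the bounds are reached.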

Slide 5

Learning Momentum - Design. Integrated into MissionLab in the CNL library. Works with MOVE_TO_GOAL, COOP, and AVOID_OBSTACLES. Has not yet been extended to all behaviors.

Slide 6

Simple Example

Slide 7

Learning Momentum - Future Work. Extension to additional CNL behaviors. Make thresholds for state determination rules accessible from cfgedit. Integrate with CBR and RL.

Slide 8

2. CBR for Behavioral Selection. Another form of reactive learning. Previous systems include ACBARR and SINS. Discontinuous behavioral switching.

Slide 9

Case-Based Reasoning for Behavioral Selection - Current Design. The CBR module is designed as a stand-alone module. A hard-coded library of eight cases for MoveToGoal tasks. Case: a set of parameters for each primitive behavior in the current assemblage, plus an index into the library.
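
A case as described above (an index plus parameters for each primitive behavior) might be represented and retrieved along these lines. The sketch is hypothetical throughout: the two index features, the three example cases (the slide mentions eight, which are not listed), the parameter names, and nearest-neighbor retrieval by Euclidean distance are assumptions, not the module's actual design.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    name: str
    index: tuple   # assumed features: (obstacle_density, distance_to_goal), both in 0..1
    params: dict   # primitive behavior -> parameter settings


# Hypothetical hard-coded library for a MoveToGoal task.
LIBRARY = [
    Case("open_terrain", (0.1, 0.8),
         {"MoveToGoal": {"gain": 1.0}, "AvoidObstacles": {"gain": 0.5}}),
    Case("cluttered", (0.7, 0.8),
         {"MoveToGoal": {"gain": 0.6}, "AvoidObstacles": {"gain": 1.5}}),
    Case("near_goal", (0.3, 0.1),
         {"MoveToGoal": {"gain": 1.2}, "AvoidObstacles": {"gain": 0.8}}),
]


def retrieve(features, library=LIBRARY):
    """Return the case whose index is closest to the current feature vector."""
    return min(library, key=lambda case: math.dist(case.index, features))
```

For example, `retrieve((0.8, 0.6))` selects the `cluttered` case, whose parameters would then be applied to the primitive behaviors of the running assemblage.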

Slide 10

Case-Based Reasoning for Behavioral Selection - Current Results. Left: MoveToGoal without the CBR module. Right: MoveToGoal with the CBR module.

Slide 11

Case-Based Reasoning for Behavioral Selection - Future Plans. Two levels of operation: selecting and adapting parameters for the chosen behavior assemblages, as well as selecting and adapting entirely new behavior assemblages. Automatic learning and modification of cases through experience. Improvement of case/index/feature selection and adaptation. Integration with Q-learning and Learning Momentum. Identification of relevant task-domain case libraries.

Slide 12

3. Reinforcement Learning for Behavioral Assemblage Selection. Reinforcement learning at coarse granularity (behavioral assemblage selection). State space is tractable. Operates at a level above learning momentum (selection as opposed to adjustment). Have added the ability to dynamically choose which behavioral assemblage to execute. Ability to learn which assemblage to choose using a wide variety of reinforcement learning methods: Q-learning, Value Iteration (Policy Iteration in the near future).
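
As a rough illustration of learning which assemblage to choose, here is a minimal tabular Q-learning sketch in which the actions are behavioral assemblages. The assemblage names, state labels, and the values of the learning rate, discount, and exploration rate are assumptions for the example; none of this is taken from the MissionLab code.

```python
import random
from collections import defaultdict

# Hypothetical assemblage names used as the action set.
ASSEMBLAGES = ["MoveToGoal", "Wander", "AvoidAndProbe"]


class QSelector:
    """Tabular Q-learning over (environmental state, assemblage) pairs."""

    def __init__(self, actions, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.alpha = alpha        # learning rate
        self.gamma = gamma        # discount factor
        self.epsilon = epsilon    # exploration probability
        self.q = defaultdict(float)
        self.rng = random.Random(seed)

    def choose(self, state):
        """Epsilon-greedy choice of the assemblage to execute next."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        """Standard Q-learning update toward reward + gamma * max_a' Q(s', a')."""
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        td_error = reward + self.gamma * best_next - self.q[(state, action)]
        self.q[(state, action)] += self.alpha * td_error
```

Because the agent chooses among a handful of assemblages rather than tuning continuous gains, the table stays small, which is the tractability point the slide makes.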

Slide 13

Selecting Behavioral Assemblages - Specifics. Replace the FSA with an interface allowing the user to specify the environmental and behavioral states. The agent learns transitions between behavior states. The learning algorithm is implemented as an abstract module, so different learning algorithms can be swapped in and out as desired. A CNL function interfaces the robot executable and the learning algorithm.
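
The idea of an abstract learning module with interchangeable algorithms could look something like the sketch below. The class names, method names, and the `control_step` glue function (standing in for the role the slide assigns to the CNL function) are hypothetical; the real interface is not shown in the slides.

```python
import random
from abc import ABC, abstractmethod


class AssemblageLearner(ABC):
    """Assumed abstract interface that any learning algorithm would implement."""

    @abstractmethod
    def select(self, env_state):
        """Return the name of the assemblage to run in this state."""

    @abstractmethod
    def learn(self, env_state, assemblage, reward, next_env_state):
        """Update internal estimates from one observed transition."""


class RandomLearner(AssemblageLearner):
    """Baseline that ignores experience."""

    def __init__(self, assemblages, seed=0):
        self.assemblages = list(assemblages)
        self.rng = random.Random(seed)

    def select(self, env_state):
        return self.rng.choice(self.assemblages)

    def learn(self, env_state, assemblage, reward, next_env_state):
        pass


class AverageRewardLearner(AssemblageLearner):
    """Picks the assemblage with the best mean reward seen so far in a state."""

    def __init__(self, assemblages):
        self.assemblages = list(assemblages)
        self.total = {}
        self.count = {}

    def select(self, env_state):
        def mean(a):
            key = (env_state, a)
            n = self.count.get(key, 0)
            return self.total.get(key, 0.0) / n if n else 0.0
        return max(self.assemblages, key=mean)

    def learn(self, env_state, assemblage, reward, next_env_state):
        key = (env_state, assemblage)
        self.total[key] = self.total.get(key, 0.0) + reward
        self.count[key] = self.count.get(key, 0) + 1


def control_step(learner, env_state, execute):
    """Glue between executable and learner: select, run, then learn."""
    assemblage = learner.select(env_state)
    reward, next_state = execute(assemblage)
    learner.learn(env_state, assemblage, reward, next_state)
    return assemblage, next_state
```

Since `control_step` depends only on the abstract interface, swapping one learner for another is a one-line change at construction time.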

Slide 14

Integrated System

Slide 15

Architecture. Diagram labels: Cfgedit, Environmental States, Behavioral States, CDL code, CNL function, MissionLab, Learning Algorithm (Q-learning).

Slide 16

RL - Next Steps. Change the implementation of Behavioral Assemblages in MissionLab from simply being statically compiled into the CDL code to a more dynamic representation. Create relevant scenarios and test MissionLab's ability to learn good solutions. Look at new learning algorithms to exploit the advantages of Behavioral Assemblage selection. Conduct extensive simulation studies, then implement on robot platforms.

Slide 17

4. CBR "Wizardry". Experience-driven assistance in mission specification. At the deliberative level, above the existing plan representation (FSA). Provides mission planning support in context.

Slide 18

CBR Wizardry/Usability Improvements. Current method: using the GUI to construct the FSA, which may be difficult for inexperienced users. Goal: automate plan creation as much as possible while providing unobtrusive support to the user.

Slide 19

Tentative Insertion of FSA Elements. A user support mechanism currently being worked on. Some FSA elements very frequently occur together, and statistical data on this can be gathered. When the user places a state, a trigger and state that follow this state often enough can be tentatively inserted into the FSA. Comparable to URL completion features in web browsers. Diagram labels: User places State A; Statistical Data; Tentative Additions: Trigger B, State C.
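
One way to gather and use such co-occurrence statistics is sketched below. It is an assumption-laden illustration: the class name, the two thresholds, and the example state and trigger names are invented, and the slides do not say how "often enough" is actually decided.

```python
from collections import Counter, defaultdict


class TentativeInserter:
    """Counts which (trigger, next state) pairs follow each state in past plans."""

    def __init__(self, min_share=0.6, min_count=3):
        self.min_share = min_share   # assumed: minimum fraction of observations
        self.min_count = min_count   # assumed: minimum absolute number of observations
        self.followers = defaultdict(Counter)

    def observe(self, state, trigger, next_state):
        """Record one state -> trigger -> state edge from an existing FSA."""
        self.followers[state][(trigger, next_state)] += 1

    def suggest(self, state):
        """Return a (trigger, next_state) pair to insert tentatively, or None."""
        counts = self.followers.get(state)
        if not counts:
            return None
        pair, n = counts.most_common(1)[0]
        total = sum(counts.values())
        if n >= self.min_count and n / total >= self.min_share:
            return pair
        return None
```

As with URL completion, a suggestion would be offered only when one continuation clearly dominates; otherwise the user is left alone.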

Slide 20

Recording Plan Creation Process. Pinpointing where the user has trouble during plan creation is important to improving software usability. Previously there was no way to record the plan creation process in MissionLab. A module has now been created that records the user's actions as (s)he creates the plan. This recording can later be played back, so the points where the user stumbled can be identified. Figure: The Creation of a Plan.
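
A recorder of this kind can be approximated by a timestamped event log that is saved and later reloaded for playback. The sketch below is hypothetical: the event fields, the JSON storage format, and the long-pause heuristic for flagging possible stumbling points are assumptions, not features of the MissionLab module.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Event:
    t: float        # seconds since the start of the session
    action: str     # e.g. "add_state", "delete_trigger" (hypothetical names)
    target: str     # the FSA element acted on


class PlanRecorder:
    def __init__(self):
        self.events = []

    def record(self, t, action, target):
        """Append one user action to the log."""
        self.events.append(Event(t, action, target))

    def dumps(self):
        """Serialize the session so it can be played back later."""
        return json.dumps([asdict(e) for e in self.events])

    @classmethod
    def loads(cls, text):
        """Rebuild a recorder from a serialized session."""
        recorder = cls()
        recorder.events = [Event(**d) for d in json.loads(text)]
        return recorder

    def long_pauses(self, threshold=10.0):
        """Return (before, after) event pairs separated by more than threshold seconds."""
        pairs = zip(self.events, self.events[1:])
        return [(a, b) for a, b in pairs if b.t - a.t > threshold]
```

A long gap followed by a deletion is one plausible sign of trouble, though a human reviewing the playback would still make the final judgment.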

Slide 21

Wizardry - Future Work. Use of plan creation recordings during usability studies to identify stumbling blocks in the process. Creation of plan templates (frameworks of some commonly used plan types, e.g. surveillance missions). Collection of a library of plans which can be placed at different points in a "plan creation tree". This can then be used in a plan creation wizard. Diagram: Plan Creation Tree with nodes Plan 1 through Plan 8.

Slide 22

5. Probabilistic Planning and Execution. "Softer, kinder" method for matching situations and their perceptual triggers. Expectations are generated from situational probabilities regarding behavioral performance (e.g., obstacle densities and safety) and used at planning stages for behavioral selection. Markov Decision Process, Dempster-Shafer, and Bayesian methods to be investigated.

Slide 23

Probabilistic Planning and Execution - Concept. Find the optimal plan despite sensor uncertainty about the current environment. Diagram labels: Mission Editor, MissionLab, .cdl, POMDP Specification, POMDP Solver, FSA.

Slide 24

Probabilistic Methods: Current Status. State "mine": scan -5, move -5000, clear mine -50; P(detect mine | mine) = 0.8. State "no mine": scan -5, move 100, clear mine -50; P(detect mine | no mine) = 0. Diagram labels: MissionLab (current work), FSA, POMDP.
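
The sensor model on this slide is enough to illustrate how a scan changes the belief that a mine is present. The sketch below applies Bayes' rule using the slide's detection probabilities and computes the expected one-step reward of an action under a belief. The prior used in the usage note and the function names are assumptions, and this is not a POMDP solver: it ignores state transitions and lookahead.

```python
# Detection probabilities and rewards as read from the slide.
P_DETECT_GIVEN_MINE = 0.8
P_DETECT_GIVEN_NO_MINE = 0.0

REWARD = {
    "scan": {"mine": -5, "no_mine": -5},
    "move": {"mine": -5000, "no_mine": 100},
    "clear": {"mine": -50, "no_mine": -50},
}


def update_belief(p_mine, detected):
    """Posterior P(mine) after one scan, by Bayes' rule."""
    if detected:
        like_mine, like_none = P_DETECT_GIVEN_MINE, P_DETECT_GIVEN_NO_MINE
    else:
        like_mine, like_none = 1 - P_DETECT_GIVEN_MINE, 1 - P_DETECT_GIVEN_NO_MINE
    numerator = like_mine * p_mine
    evidence = numerator + like_none * (1 - p_mine)
    if evidence == 0:
        raise ValueError("observation has zero probability under this belief")
    return numerator / evidence


def expected_reward(action, p_mine):
    """Expected immediate reward of an action given belief P(mine)."""
    r = REWARD[action]
    return p_mine * r["mine"] + (1 - p_mine) * r["no_mine"]
```

Starting from an assumed prior of 0.5, a scan that detects nothing lowers P(mine) to 1/6, while a detection raises it to 1 because false alarms have probability 0. Each further clean scan shrinks the belief again, which is why paying the scan cost several times can be preferable to risking the -5000 move penalty.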

Slide 25

Varying Costs, Different Plans. State "mine": scan -5, move -5000, clear mine -100; P(detect mine | mine) = 0.8. State "no mine": scan -5, move 100, clear mine -50; P(detect mine | no mine) = 0. Diagram labels: MissionLab (current work), POMDP, FSA.

Slide 26

MIC's Role. Develop a conceptual plan for integrating learning algorithms into MissionLab. Guide students performing the integration. Assist in designing usability studies to evaluate the integrated system. Guide performance and evaluation of usability studies. Identify key technologies in MissionLab which could be commercialized. Support technology transfer to a designated company for commercialization.

Slide 27