GPO Federal Digital System Architecture and Design


Presentation Transcript

Slide 1

GPO Federal Digital System Architecture and Design
Deng Wu, Paul Nelson, Johnny Gee

Slide 2

Agenda
- System Architecture
- Conceptual Model
- Data Model and OAIS Implementation
- Application Architecture
- Application Security
- FDsys Ingest Process Model
- Data Processing and Search
- Collections, Packages and Granules
- Architectural Philosophy
- Search Features
- Accessing Documents
- Repository Design
- Q & A

Slide 3

System Architecture
Deng Wu

Slide 4

System Architecture: Conceptual Model

Slide 5

Data Model
- OAIS Implementation: SIP, AIP, DIP
- Metadata Standards: MODS, PREMIS, METS
- FDsys Package Model
  - Package: packaging scheme, content files, metadata
  - SIP, ACP, AIP and DIP each IS A Package
[Diagram labels: CMS-Based Logical Implementation; Self-Describing Implementation (shown twice)]

Slide 6

Data Model (cont.): SIP/ACP
[Diagram, SIP & ACP: package folder 1 > rendition folder 1 (content files), rendition folder 2 (content files), granule folder (granule files); package folder 2 > rendition folder 1 (content files), rendition folder 2 (content files)]

Slide 7

Data Model (cont.): AIP
[Diagram labels, in order: package folder 1; rendition folder 1; content files; rendition folder 2; package Folder-2; content files; aip.xml; mods.xml; premis.xml]

Slide 8

Application Architecture

Slide 9

Application Security
- Roles and groups enforce application security control
- User access privileges depend on the user's roles and groups
  - Content Originator, Service Specialist, Preservation Specialist, etc.
- Users, roles and groups are all managed in LDAP
[Diagram labels: FDsys Applications; Authentication & Authorization; Oracle Internet Directory; Content Originator, Service Provider, EPA, DOE; Documentum LDAP synchronize; Service Specialist, Preservation Specialist, U.S. Senate, U.S. House, GPO; Microsoft Active Directory]

Slide 10

FDsys Ingest Process Model

Slide 11

Data Processing and Search
Paul Nelson

Slide 12

Collections, Packages, and Granules

Slide 13

What is a collection?
- Inside FDsys: a group of documents which are all processed the same way
  - Identified with a "Processing Code"
  - Examples: Federal Register, Congressional Bills, Congressional Record, Public and Private Laws
- To the public: a group of documents which logically belong together
  - Identified with a "Collection Code"
  - Examples: Federal Register, Congressional Bills, Congressional Record, Public and Private Laws

Slide 14

Processing Codes and Collection Codes
- Metadata is used to determine the collection code at index time
- Example where the two are not the same: Presidential Budgets
  - BUDGET2005, BUDGET2006 -> BUDGET
  - Processed separately but searched together
- Congressional Reports printed in the Congressional Record
  - Will have two collection codes: CR, CRPT
  - Will show up in both collections!
- Possibility for "virtual collections" in the future
  - For example: all documents related to Education
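The relationship between processing codes and collection codes can be sketched as a small lookup. This is a minimal illustration, not FDsys code: the function name `collection_codes`, the `printed_in_cr` metadata flag, and the assumption that such a report is processed under CRPT are all hypothetical. Only the code values BUDGET2005, BUDGET2006, BUDGET, CR and CRPT come from the slide.

```python
# Hypothetical sketch: derive collection codes from a processing code plus metadata.
# Only the code values appear on the slide; names and rule layout are assumptions.

PROCESSING_TO_COLLECTION = {
    "BUDGET2005": "BUDGET",  # processed separately ...
    "BUDGET2006": "BUDGET",  # ... but searched together
}


def collection_codes(processing_code, metadata=None):
    """Return the set of collection codes a document would be indexed under."""
    metadata = metadata or {}
    # Default: the collection code is the same as the processing code.
    codes = {PROCESSING_TO_COLLECTION.get(processing_code, processing_code)}
    # Assumed rule: a congressional report printed in the Congressional Record
    # is listed in both the CRPT and CR collections.
    if processing_code == "CRPT" and metadata.get("printed_in_cr"):
        codes.add("CR")
    return codes
```

Returning a set rather than a single code leaves room for a document to appear in more than one collection, which is also what a future "virtual collection" would need.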

Slide 15

Packages
- Roughly equivalent to a bound, paper document
- Examples:
  - One issue of the Federal Register
  - One issue of the Congressional Record
  - A single Congressional Bill
  - One issue of the Weekly Compilation of Pres. Docs.
  - One volume of the Code of Federal Regulations (!)
  - One volume of the United States Code
  - The 9-11 Report
  - A single congressional committee report

Slide 16

Packages Example

Slide 17

Granules
- "The most usefully searchable unit"
- Examples:
  - A single FR article
  - A single CR unit of business (separated by )
  - A CFR section (e.g. § 57.402 in Title 40)
  - An entire bill
  - An entire report
  - A Pres. speech

Slide 18

Implications of Granules
- Search over granules
  - Search results are much more accurate
- Retrieve individual granules
  - Can view a single granule rather than sifting through an entire issue or volume
- Granules can be digitally signed
  - To ensure authenticity of the data
- No single page retrieval
  - Instead, you get the best granule
  - PDF viewer automatically jumps to the page you requested

Slide 19

Architectural Philosophy

Slide 20

FDsys is a Data Driven Architecture
[Diagram labels: Packages; Raw Content; Extract Metadata; group into packages; metadata; content; content delivery; Search; create MODS; Browse]

Slide 21

Metadata Flow Diagram

Slide 22

Parsing Introduction
- Runs regular expressions to extract metadata
- Regular Expression: (Public Law|Pub. L.|PL|P. L.) (1[0-9][0-9])-([0-9]+)
- Example: Pub. L. 109-130
- Produces: <publicLaw congressNum="109">130</publicLaw>
- Produces an instance of fdsys.xml
- Parsed metadata is also available in the MODS
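The public-law parsing step can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the FDsys parser: the function name `parse_public_laws` is hypothetical, the dots in the pattern are escaped so they match literal periods, and no space is assumed after the hyphen, matching the "Pub. L. 109-130" example.

```python
import re

# Pattern adapted from the slide; dots are escaped here so that "Pub. L."
# only matches literal periods (a tightening, not necessarily what FDsys does).
PUBLIC_LAW = re.compile(
    r"(Public Law|Pub\. L\.|PL|P\. L\.) (1[0-9][0-9])-([0-9]+)"
)


def parse_public_laws(text):
    """Return one <publicLaw> element string per citation found in text."""
    return [
        f'<publicLaw congressNum="{m.group(2)}">{m.group(3)}</publicLaw>'
        for m in PUBLIC_LAW.finditer(text)
    ]


if __name__ == "__main__":
    for element in parse_public_laws("As amended by Pub. L. 109-130."):
        print(element)
```

Group 2 captures the congress number and group 3 the law number, which is how the citation text becomes structured, searchable metadata.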

Slide 23

Federal Register Example Metadata
[Screenshot callouts: agencies, title, action, summary, dates, contact, FR Doc Number, Billing Code]

Slide 24

Extracted Metadata Example
<descMdGroup id="id-05-10658">
  <title>Editorial Modifications of the Commission's Rules</title>
  <printPageRange first="31372" last="31374"/>
  <migratedDocID>fr01jn05-12</migratedDocID>
  <collectionSpecific>
    <accessId>05-10658</accessId>
    <granuleClass>RULE</granuleClass>
    <agency order="1">FEDERAL COMMUNICATIONS COMMISSION</agency>
    <effectiveDate>2005-04-14</effectiveDate>
    <billingCode>6712-01-P</billingCode>
    <frDocNumber>05-10658</frDocNumber>
    <action>Final rule.</action>
    <summary>This document corrects twelve sections ... updating the postal address of the Arecibo Radio Astronomy Observatory.</summary>
    <dates>Effective April 14, 2005.</dates>
    <contact>Rodney Small, Office of Engineering and Technology, (202) 418-2452.</contact>
    <cfr title="47">
      <part number="25"/>
      <part number="73"/>
      <part number="74"/>
    </cfr>
    <tocSubject1>Practice and procedure:</tocSubject1>
    <tocDoc>Commission's rules; editorial modifications, </tocDoc>
  </collectionSpecific>
</descMdGroup>

Slide 25

Using the Table of Contents
- Subject-oriented table of contents
  - Federal Register: Subject headings
  - Weekly Compilation of Presidential Documents: Category headings
  - Congressional Record: Daily Digest subject headings
- Used to describe granules

Slide 26

TOC Metadata Example
<descMdGroup id="id-05-10658">
  <title>Editorial Modifications of the Commission's Rules</title>
  <printPageRange first="31372" last="31374"/>
  <migratedDocID>fr01jn05-12</migratedDocID>
  <collectionSpecific>
    <accessId>05-10658</accessId>
    <granuleClass>RULE</granuleClass>
    <agency order="1">FEDERAL COMMUNICATIONS COMMISSION</agency>
    <effectiveDate>2005-04-14</effectiveDate>
    <billingCode>6712-01-P</billingCode>
    <frDocNumber>05-10658</frDocNumber>
    <action>Final rule.</action>
    <summary>This document corrects twelve sections ... updating the postal address of the Arecibo Radio Astronomy Observatory.</summary>
    <dates>Effective April 14, 2005.</dates>
    <contact>Rodney Small, Office of Engineering and Technology, (202) 418-2452.</contact>
    <cfr title="47">
      <part number="25"/>
      <part number="73"/>
      <part number="74"/>
    </cfr>
    <tocSubject1>Practice and procedure:</tocSubject1>
    <tocDoc>Commission's rules; editorial modifications, </tocDoc>
  </collectionSpecific>
</descMdGroup>

Slide 27

TOC Searches
- Find all articles in the Federal Register which are listed as "meetings" in the Contents
- Find all sections from the body of the Congressional Record listed under "Measures Passed" in the Daily Digest
- Find all passages from the body of the Congressional Record identified as "The Patriot Act" in the Daily Digest
- Find every presidential document identified as "Communications to Congress" in the contents
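A table-of-contents search of this kind amounts to filtering granules by their TOC heading. The sketch below is a hedged illustration using Python's standard `xml.etree.ElementTree`. The `<granules>` wrapper element, the second record (`id-example-2`, its title and its NOTICE class) and the function name `find_by_toc_subject` are invented for the example; the first record is trimmed from the Federal Register metadata shown earlier. A production system would query a search index rather than scan XML.

```python
import xml.etree.ElementTree as ET

# Two granule records. The first is trimmed from the slide's example; the
# second and the <granules> wrapper are hypothetical, added for illustration.
SAMPLE = """
<granules>
  <descMdGroup id="id-05-10658">
    <title>Editorial Modifications of the Commission's Rules</title>
    <collectionSpecific>
      <granuleClass>RULE</granuleClass>
      <tocSubject1>Practice and procedure:</tocSubject1>
    </collectionSpecific>
  </descMdGroup>
  <descMdGroup id="id-example-2">
    <title>Hypothetical Meeting Notice</title>
    <collectionSpecific>
      <granuleClass>NOTICE</granuleClass>
      <tocSubject1>Meetings:</tocSubject1>
    </collectionSpecific>
  </descMdGroup>
</granules>
"""


def normalize(heading):
    """Lower-case a TOC heading and drop surrounding space and trailing colon."""
    return (heading or "").strip().rstrip(":").strip().lower()


def find_by_toc_subject(xml_text, subject):
    """Return ids of granules whose tocSubject1 heading matches subject."""
    root = ET.fromstring(xml_text)
    wanted = normalize(subject)
    return [
        group.get("id")
        for group in root.iter("descMdGroup")
        if any(normalize(n.text) == wanted for n in group.iter("tocSubject1"))
    ]
```

Normalizing the heading before comparison means a query such as "meetings" still matches a stored heading of "Meetings:".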

Slide 28

Table of Contents Example

Slide 29

Congress Member and Committee Normalization
- Congress members and committees are normalized
  - Congress Member Codes
  - Congressional Committee Codes
<congMember authorityId="308" chamber="S" congress="109" party="D" role="SPONSOR" state="ND">
  <name type="parsed">Mr. Dorgan </name>
  <name type="authority-fnf">Byron L. Dorgan</name>
  <name type="authority-lnf">Dorgan, Byron L.</name>
  <name type="authority-other">Byron Leslie Dorgan</name>
</congMember>
<congCommittee authorityId="ssga00" chamber="S" congress="109" type="S">
  <name type="authority-standard"> Committee on Homeland Security and Governmental Affairs</name>
  <name type="authority-short"> Homeland and Governmental Affairs</name>
</congCommittee>

Slide 30

Uses of Normalized Names
- Display the official names of congress members, presidents, and committees
- Search on a committee even if the name changes over time
- Search by congress member state codes or party affiliations
  - "Find all documents sponsored by a senator from Maryland"
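Because each normalized member record carries chamber, state, party and role attributes, a query like the one quoted above reduces to attribute filtering. The following is a minimal sketch with Python's standard `xml.etree.ElementTree`. The `<sponsors>` wrapper and the function names `official_name` and `sponsors_from` are hypothetical; the `congMember` record is the one from the normalization slide.

```python
import xml.etree.ElementTree as ET

# The congMember record comes from the slide; the <sponsors> wrapper is an
# assumption made so the sample parses as a single XML document.
SAMPLE = """
<sponsors>
  <congMember authorityId="308" chamber="S" congress="109" party="D" role="SPONSOR" state="ND">
    <name type="parsed">Mr. Dorgan </name>
    <name type="authority-fnf">Byron L. Dorgan</name>
    <name type="authority-lnf">Dorgan, Byron L.</name>
    <name type="authority-other">Byron Leslie Dorgan</name>
  </congMember>
</sponsors>
"""


def official_name(member, style="authority-fnf"):
    """Return the member's name in the requested authority style, if present."""
    for name in member.findall("name"):
        if name.get("type") == style:
            return (name.text or "").strip()
    return None


def sponsors_from(xml_text, state, chamber="S"):
    """List official names of sponsors from a given state and chamber."""
    root = ET.fromstring(xml_text)
    return [
        official_name(member)
        for member in root.iter("congMember")
        if member.get("role") == "SPONSOR"
        and member.get("state") == state
        and member.get("chamber") == chamber
    ]
```

Matching on attributes instead of the parsed display name ("Mr. Dorgan") is what lets a search find the same member however the name was printed.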

Slide 31

Search Features

Slide 32

Search Results
[Screenshot callouts: action (first 20 chars), firstpage, volume, section, title, hidden collection, link to content detail, publishdate]

Slide 33

Navigators

Slide 34

Collection Browsing
[Screenshot callouts: congress, law type, number range, List of Laws]

Slide 35

Advanced Search Form

Slide 36

Data Mapping Examples Internal Data Storage 110 V FR RULE congnum 2006-02-01 [2006-02-01;] accode=PPL
