Segment Indexed Integrated Databases (SIID)
originally published in IITRI Productivity Frontiers
Structuring data to create new knowledge
A large share of the money spent researching novel ideas results in workable solutions that
lie fallow. Waste of intellectual property resources is an extravagance taken for granted
by those experiencing government largess. Programs that attempt to locate and to transfer
technology satisfy few individuals attempting to locate ideas/solutions from Agency/Source
"A" for Country/Agency "B." Information reuse successes are limitless to those capable of
developing the means to identify meaningful potential applications. Our enemies, involved
in attempts to steal our knowledge, do not exhibit the creativity defined herein.
We created a logical method to identify sources of archived information that provide a
focus to projects that suffer from supposed secrecy. Outcomes emanate from locating and
integrating unfocused facts. We have generated new products/ideas in niche markets for
entrepreneurs willing to seek out solutions their competition has ignored. Many ideas, as
originally conceived, are probably pedestrian in their original industry/universe. These
ideas become fast track when reapplied to a more amenable niche. Most of the data we identify
and restructure currently reside, unnoticed, in public-domain, corporate, university and
government archival holdings. Our results offer solutions and approaches that are
little used, perhaps forgotten. These warehoused facts have a high probability of causing
damage when related to or associated with a specific weapon system.
The logic of a Segment Indexed Integrated Database (SIID) is a unique refocusing of
manufacturing planning and control techniques into areas outside their originally intended
scope. SIID begins by using the idea of a Work Breakdown Structure (WBS) not unlike
that defined in the original USAF PERT/COST. The WBS is enhanced using the USAF
Wright Laboratory Manufacturing Technology Division ICAM logic. ICAM Group
Technology Classification & Coding (GTCC) binds related data organizations, i.e., grouping
related physical characteristics. SIID groups ideas. Focus and control are evident when
the relation between ideas and their original intended use (what we term a simplified matrix,
or SIMPLIMATRIX) creates data structures delineating new solutions (knowledge).
As discussed later, the SIID methodology identified 16,000
manufacturing terms containing fewer than 4,500 unique definitions. The theory speaks
pointedly of the duplication and redundancy that are eliminated when logic and common
sense are applied to structure archived intellectual resources. Federally sponsored
Research and Development components integrate to create or optimize intellectual
Technologists generate data in gigabytes in every subject. Proper capture, logging and
organization of data suggest previously undefined thoughts. SIID allows those who research
secrets to expand and capitalize upon useful knowledge while purging chaff.
SIID uses existing catchwords to structure a search for the data necessary to produce new
facts. SIID optimizes federally funded archival research. Ideas such as CAD, CAE, CAM,
and CIM begat the major USAF investment in the Integrated Computer Aided
Manufacturing (ICAM) programs. The evolution of an automated architecture served as
the basis for attempts to join islands of automation. One such effort involved the transfer
of production information between IBM, Honeywell and DEC hardware. USAF-sponsored
integration research cost almost three (3) million dollars. Data distribution between end
users occurred in a traditional and pedestrian fashion. The ICAM Program Office awarded
us a small contract to optimize this research. We used it to formalize a set of rules for data
integration that we termed SEGMENT INDEXED INTEGRATED DATABASES (SIID).
SIID collects archived information/facts and structures them to produce unanticipated and
For years we have researched knowledge engineering via code structures to foster
information caching from dissimilar data base (file) structures hosted on dissimilar
hardware. Our government has tools that, when used non-traditionally, allow knowledge
integration to occur in a less complex manner than is presently evolving as mining of
warehoused (archived) information. Available methods that aid integration include Work
Breakdown Structure (WBS), Group Technology (GT), GT Classification and Coding (GTCC),
Integrated Definition (IDEF), etc. These methods are applied ubiquitously outside their
originally intended use.
Our initial test bed SIID was an integration of the Army, Navy and Air Force data for
Manufacturing Technology. A work breakdown structure of ideas was constructed based
on the six technology thrusts of the USAF Manufacturing Technology Advisory Group (MTAG).
Initial integration of collected data identified a problem of inter-service semantics. A search
by Armed Forces and Contractor managers for an individuality (unique product identity)
created a maze of complexity that obscured knowledge integration and concept reuse.
By Baselining (establishing a standard definition) terms in a relational matrix, we
established common terms. Now, twenty-two years later, the Society of Manufacturing
Engineers still publishes our Manufacturing Glossary under the title: "A Glossary of Terms
Used in Computer Aided Manufacturing." By tagging like ideas we reduced 16,000
self-centered terms to a manageable 4,300 meaningful yet common definitions without
loss of content or context.
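The baselining step described above can be pictured with a minimal sketch. It assumes each service-specific term arrives with a definition text and that terms sharing the same definition words belong together. The sample rows, the `normalize` rule and the `baseline` function are hypothetical illustrations, not the original SIID software, and real definitions would need far more careful matching.

```python
from collections import defaultdict
import re


def normalize(definition: str) -> str:
    """Reduce a definition to a comparable key: lowercase, no punctuation, sorted unique words."""
    words = re.findall(r"[a-z0-9]+", definition.lower())
    return " ".join(sorted(set(words)))


def baseline(glossary):
    """Group (source, term, definition) rows that share one normalized definition."""
    groups = defaultdict(list)
    for source, term, definition in glossary:
        groups[normalize(definition)].append((source, term))
    return dict(groups)


# Hypothetical sample rows; real input would come from the service glossaries.
glossary = [
    ("Army", "orientation", "Direction of a part relative to the fixture."),
    ("Navy", "direction", "direction of a part, relative to the fixture"),
    ("USAF", "placement", "Relative direction of a part to the fixture"),
    ("USAF", "lot", "A group of parts produced together."),
]

baselined = baseline(glossary)
for key, members in baselined.items():
    print(len(members), "term(s) share the definition key:", key)
```

In this toy input, four terms collapse to two definitions, the same kind of reduction the glossary work achieved at much larger scale.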
We created a Work Breakdown Structure of then-current terminology. Since many basic
terms appear at different levels in an organizational stratum, the idea of Group Technology
(GT) was applied non-traditionally. In place of the grouping of like parts, we grouped like
concepts, methods and ideas. The result was akin to a Japanese Bonsai tree in shape.
It allowed any use of a term, any place in the organizational strata, to be identified and
tracked as one rather than multiple entities. This procedure eliminates supposed duplicate
ideas made different only by their location in an archival stratum. SIID allows a DBMS
index to reference fewer terms. This allowed the relocation of pertinent portions of
mainframe-based DBMS to personal computer/work station hard drives. The indexed structure
is quantified by using GTCC to classify and build a SIC/NAICS-like structure.
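A minimal sketch of tracking a term as one entity regardless of where it appears in the breakdown structure. The WBS paths, the terms and the code format (`C0001`) are hypothetical; the codes only stand in for a GTCC-style classification and are not real GTCC codes.

```python
from collections import defaultdict

# Hypothetical WBS entries: (WBS path, term as used at that location).
wbs_entries = [
    ("1.2.1", "tolerance"),
    ("1.4.3", "tolerance"),
    ("2.1.1", "tolerance"),
    ("2.1.2", "fixture"),
]


def build_concept_index(entries):
    """Track each term as a single entity with many WBS locations."""
    index = defaultdict(list)
    for path, term in entries:
        index[term].append(path)
    return dict(index)


def assign_codes(index, prefix="C"):
    """Give each concept one illustrative classification code (not a real GTCC code)."""
    return {term: f"{prefix}{n:04d}" for n, term in enumerate(sorted(index), start=1)}


index = build_concept_index(wbs_entries)
codes = assign_codes(index)
for term, paths in index.items():
    print(codes[term], term, "used at", ", ".join(paths))
```

Four WBS entries reduce to two indexed concepts here, which is the effect that lets a DBMS index reference fewer terms.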
Initially, information from the three Armed Services that operated manufacturing data bases
was integrated. Defense RDT&E On-Line System (DROLS), National Technical Information
Service (NTIS), Smithsonian Science Information Exchange (SSIE), Federal Projects in
Progress (FPIP) and Lockheed's Compendex Engineering Indexes were added over time
to supplement research needs for wider ranges of appropriate data. This expansion
occurred as we researched the manageable span of control within an SIID model. Our
later research, accomplished at no cost to Government, integrated ideas, identified
misplaced knowledge and created new applications in yet to be considered markets. An
SIID conducted today might collect inputs from several thousand data bases worldwide
being accessed by numerous search engines. The extracted data segments and their
reference tags are the fodder of a unique and innovative SIID integration.
We place each located item of data precisely in a relational matrix. Dissimilar data
segments eventually integrated within the SIID context provide an integrated universe of
previously unrelated information that becomes simple solutions to heretofore complex
problems. The initial collection process involved writing software to extract data as a flat
file, convert it to ASCII, transmit it to a newly designated host machine and, when
assembled, merge identified data with other like files. SIID structures the merged result and
manipulates data using newly assigned common format tags (similar to the original IBM
Corporation "key-words-in-context"). The source DBMS and its host hardware are
irrelevant to the process once we leech the relevant compatible data.
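The extract, convert and merge sequence above might look like the following sketch. It assumes two already-extracted ASCII flat files with different delimiters and field names; the sample records, the `TAG_MAP` table and the `extract` function are hypothetical and stand in for whatever common format tags a given SIID would assign.

```python
import csv
import io

# Hypothetical ASCII flat-file extracts from two dissimilar sources.
SOURCE_A = "PARTNO|DESCR\nA-100|wing spar fitting\n"
SOURCE_B = "item_id,item_text\nB-7,engine mount bracket\n"

# Hypothetical mapping from each source's field names to common format tags.
TAG_MAP = {
    "A": {"PARTNO": "ID", "DESCR": "TEXT"},
    "B": {"item_id": "ID", "item_text": "TEXT"},
}


def extract(text, delimiter, source):
    """Read one flat file and relabel its fields with the common tags."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    for row in reader:
        record = {TAG_MAP[source][field]: value for field, value in row.items()}
        record["SOURCE"] = source  # keep a reference tag back to the origin
        yield record


merged = list(extract(SOURCE_A, "|", "A")) + list(extract(SOURCE_B, ",", "B"))
for record in merged:
    print(record)
```

Once records carry the common tags, the code that manipulates them no longer needs to know which DBMS or hardware produced them.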
We sold, and still sell, knowledge obtained from public access to Federal and Corporate data bases
back to agencies of the Federal Government (USAF/AFSC/B1B/PMW, USN/DTNSRDC,
DOD/MTAG, USAF/ASD/XRX, etc.) and to corporations needing the newly identified solutions.
The test bed host repository used was a fast micro computer using commercial data base
software manipulated by expert system macros produced for the purpose of identifying
targeted topics in various keyword combinations. Later, we migrated to the CLIPS
expert systems engine developed by the NASA Johnson Space Center and
extended at NASA Langley.
In a USAF ASD effort we contracted for and constructed an SIID structure and two
software packages for USAF ASD/YP (F-16, Fighting Falcon). We identified and integrated
data bases from General Dynamics/Fort Worth, Ogden Air Logistics Center, ASD/YPC,
ASD/YPP, and ASD/SI. The source data had previously resided on one mainframe, two
DEC minicomputers and about a dozen micro computers (Zenith 248 & PC 80386). We
reduced the General Dynamics Automated Configuration Tracking Information (ACTION)
to a PC-hosted information network (AACTION). The system functioned on a micro
computer or networked from a LAN server feeding an HP-hosted Financial Management
System (FMS) SIID also created for the F-16 SPO. While not tasked to do so, the final
model integrated the complete F-16 Program by tail number, lot, etc. This included Air
Frame (YP), engines (YZ), hardware (RW) and PMRT (OO-ALC).
SIID integrates any DBMS to provide unique, innovative and as yet unconsidered
integration of components into solutions and/or new facts.
In any endeavor the quest for automated productivity is driven by a need for increased
profit. While the Government is not a 'for-profit' organization, its Commands, Divisions and
Systems compete for scarce resources. Those auditing operations judge the
cost-to-performance ratio as an indicator of managerial success. Attention to return on
investment (yield) optimizes available resources so they might accomplish more for less. SIID data transfer
functions bidirectionally. Data created to optimize production can be recycled to bear on
cost effective engineering upgrades. In the case of weapons systems, SIID might identify
currently ignored weaknesses and suggest counter force options.
In any organization with a highly mobile staff, long-term projects suffer from a discontinuity
of corporate memory. To assure consistent and continuous corporate memory, data must
be collected and stored in a way amenable to prompt, efficient and organized retrieval.
The system in operation must be congruous with the policy, standards and regulations that
govern determined need and use. Most computer systems in use by both government and
industry are dissimilar in goals, architecture and/or structure. Policy, standards and
regulations are not considered by systems creators. This causes redundant and duplicate
data. Potential for efficiency is limited as users depend on incomplete pieces of knowledge
that do not readily integrate nor meet program/project needs.
The development of indexable logic structures for integrating knowledge significantly
reduces the cost of finding needed data and expanding its availability. Further, by Baselining,
i.e., establishing a matrix of terminology with common definitions and data distribution
methods, SIID integrates tangential yet related support functions. The ability to manipulate
common data simplifies the search process. The combined effect of common structured
data, input and validated once yet available across operational boundaries, can
significantly reduce the administrative overhead associated with identifying alternate use
The extension of SIID mechanisms simplifies arrangement and conversion of dissimilar
data. Common data appears transparent in a designated user format. In place of the
current practice of duplication at high cost, a data management-based host-device can be
programmed to create new unanticipated uses of data at minimal cost. Grouped
information can be modeled and altered to conform to evolving needs. Commonplace
technologies within industry "A" become the esoteric solutions to problems plaguing
industries "B", "C", and "D."
Uniform data encourages development of trend information within and among topics being
monitored. We can paper test (model) resultant trends against many scenarios to optimize
solutions. In the same manner as data moving from design to production, manufacturing
trends can be relocated to help in optimizing new engineering designs and configurations
without increased experimentation.
The SIID model is the tip of an as yet unmapped information manipulation iceberg. The use
of an integrated knowledge-based system allows definition of concepts that can be
reasonably transferred to other uses. Quantified decisions allow the prioritization of
variables against fixed rules similar to an engineering design of experiment.
Potential product changes, simulated to predict the effects on time and cost, before
commitment of resources, reduce investment. Depiction of modeled results as
three-dimensional spreadsheets is realistically attainable. As we gain experience,
the SIID becomes a generic methodology, a real-time lesson learned, to manage and
train cadre for future efforts within the universe of the information being reapplied.
Segment Indexed Integrated Database (SIID) methodology is proprietary. Use of SIID is
a cornerstone upon which unique automated technology transitioning is constructed. A
reference library of previous SIID studies allows for the reduction of development
expenses. Use of SIID is a prime factor in the achievement of Return on Investment from
any ongoing weapons system development.
THE COMPONENTS OF AN SIID
The Segment Indexed Integrated Database makes use of existing technologies joined to
integrate dissimilar file structures (databases) originally hosted on dissimilar hardware.
The SIID allows information to be located, merged and manipulated for the benefit of all
The SIID modules include:
We define sponsor goals and lay the groundwork for desired results (an estimate of the
boundaries of the universe of information to be reused).
Baselining identifies the specific data to be collected and integrated. It establishes the record
structure, field demographics and descriptions of data items in glossary form. The SIID
Engineer generates a 'where-used' matrix (SIMPLIMATRIX) to specify data redundancy
and omissions based upon user needs. The result of Baselining is the creation of a new
integrated relational DBMS record, i.e., orientation = direction = placement = etc.
A Work Breakdown Structure (WBS) of logical relations is next constructed. The WBS is
fleshed out using ICAM's Integrated Definition (IDEF) methodology. SIID structure is
generated using Group Technology (GT, GTCC). A dimensionality is assigned based on
each data item's value/priority to create a fully relational DBMS structure. Finally, a cross index
of the RDBMS data items to source DBMS file structures, as defined by the Baselining
module, is produced.
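A 'where-used' matrix of the kind described for Baselining can be sketched as an inversion of "which source holds which data item". The source names, data items and user needs below are hypothetical; the sketch only shows how redundancy and omissions fall out of such a matrix, not how the SIMPLIMATRIX was actually built.

```python
# Hypothetical inputs: the data items each source holds, and the items users need.
sources = {
    "mainframe": {"tail_number", "lot", "engine_serial"},
    "mini_1": {"tail_number", "contract_value"},
    "micro_3": {"lot", "tail_number"},
}
user_needs = {"tail_number", "lot", "engine_serial", "delivery_date"}


def where_used(sources):
    """Invert source -> items into item -> sorted list of sources holding it."""
    matrix = {}
    for source, items in sources.items():
        for item in items:
            matrix.setdefault(item, []).append(source)
    return {item: sorted(found) for item, found in matrix.items()}


matrix = where_used(sources)

# Redundancy: an item held by more than one source.
redundant = sorted(item for item, found in matrix.items() if len(found) > 1)
# Omission: an item users need that no source holds.
omitted = sorted(user_needs - set(matrix))

print("redundant:", redundant)
print("omitted:", omitted)
```

The same matrix doubles as the cross index from integrated data items back to their source file structures.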
We define architectures, data interchange standards and communications protocols for
collecting specific data. We specify the points of interface between existing hardware and
the SIID. Transfer bridge software between and among architectures is designed and
compiled from a SIID subroutine library. (NB: the library is lifelike, growing and adjusting
with each new SIID.) We establish and test data integrity verification procedures to assure
validity of collected data. The transfer bridge need not be complicated. It could be as
simple as a modified "PC Anywhere" legacy product.
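One simple way to picture a data integrity verification procedure for the transfer bridge is to compare a record count and a digest computed on both sides of the transfer. This is an assumed technique chosen for illustration, not the procedure the SIID library used; the sample records and function names are hypothetical.

```python
import hashlib


def fingerprint(records):
    """Return (record count, SHA-256 hex digest) for a list of ASCII text records."""
    digest = hashlib.sha256()
    for record in records:
        digest.update(record.encode("ascii"))
        digest.update(b"\n")  # record separator so record boundaries affect the digest
    return len(records), digest.hexdigest()


def verify_transfer(sent, received):
    """True when both sides agree on record count and content digest."""
    return fingerprint(sent) == fingerprint(received)


# Hypothetical records as they leave the source and as they arrive at the host.
sent = ["A-100|wing spar fitting", "B-7|engine mount bracket"]
intact = list(sent)
damaged = ["A-100|wing spar fitting", "B-7|engine mount brackel"]

print("intact copy verified:", verify_transfer(sent, intact))
print("damaged copy verified:", verify_transfer(sent, damaged))
```

A check of this kind is independent of the source DBMS, so it fits a bridge that is meant to stay simple.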
Working with end users, data gatherers define required screen and report formats. We
explore use patterns and auto-statistic methods of data capture that optimize search and
retrieval functions. User-friendly design and placement of error messages, 'help'
messages and 'redo' loops is completed. We write and test a user manual to make data
collection specific to sponsor needs.
The SIID RDBMS is created by a data leeching process. In the 'Transfer Bridge' we test
previously written software. Transferring data from specific file structures builds an SIID
RDBMS into the SIID integration shell. We test LAN connectivity and make it operational.
Technical documentation is produced as we train end users.
Post Implementation Audit
After ninety (90) days, we survey users to define modifications that enhance the, now
The non-automated expert SIID is a 'snapshot' of the source databases. If a source
database is modified, the SIID must be 'reoriented' to assure that the data locations and
relations have not lost relevancy to any user need. Experience with testing operational
SIID logic shows that the customization process requires corresponding adjustments.
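Because the SIID is a snapshot, a source change has to be detected before the SIID can be 'reoriented'. The sketch below assumes the snapshot recorded each source field and its type, and compares that record with the source as it stands now. The field names, types and functions are hypothetical.

```python
def diff_structure(snapshot, current):
    """Compare two field -> type maps and report what changed since the snapshot."""
    old, new = set(snapshot), set(current)
    return {
        "added": sorted(new - old),
        "removed": sorted(old - new),
        "changed": sorted(f for f in old & new if snapshot[f] != current[f]),
    }


def needs_reorientation(report):
    """True when any field was added, removed or retyped."""
    return any(report.values())


# Hypothetical structure at snapshot time and as found today.
snapshot = {"tail_number": "char(8)", "lot": "int", "engine_serial": "char(12)"}
current = {"tail_number": "char(10)", "lot": "int", "delivery_date": "date"}

report = diff_structure(snapshot, current)
print(report)
print("reorient SIID:", needs_reorientation(report))
```

The report lists exactly which data locations and relations should be rechecked against user needs.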
We can design SIID to function on a personal computer/server (80XXX, Apple System X)
or a DEC VAX 11/7XX and 8000 series or other compatible equipment. NB: older,
cost-effective surplus equipment is often suggested to host the SIID. This is further proof
that the system need not be expensive to be successful.
Any PC device may contact the SIID RDBMS via a modem and appropriate communications software.