B4GEN: DATA MODELING; BASIC TERMS AND CONCEPTS- REVIEW

Source: By Action Research Team

LESSON-1

1. BASIC TERMS AND CONCEPTS – Review

1.1 INTRODUCTION

The Data-Modeling techniques are fairly recent adoption of system analysis practices of developments so far, as is known. There is infrequent comprehensive exposition of the state-of-the art system analysis and design practices which attempt to embody experiences gained by several years’ participation in their course of proficiency of work.

Prior to the emergence of business system information regular objects, the system designers and analysts were captive of whims in circumambulation of twists and turns, while fixing one naught and then caught up into another. The consequences were gloomy system design. Ultimate handicaps were interruptions on business floors for smooth flow of automated business systems. Inheritably, the economic competitiveness was refrained beyond certain efficiency, undoubtedly causable to catastrophic business disrupts.

This monograph is authored as a part of series in pipeline for Instrumental Data Modeling of Management Information Systems (MIS) using state-of-the art technology for online milieu of transaction processing of business applications. The made-easy professional tips and tricks for Information System practitioners have been construed here to dealing with trivial and nontrivial paradigms.

Here, the focus of this monograph is to enrich broad-spectrum skills and practices of instrumental system designing, which are in soaring industrial demand of Information Technology. Characteristically, these acumens of articulation enforce; the integrity business system processes of online transactions, their events of tasks synchronized with security and performance irrespective of time and geographical bonds. All these are of high worth to build and integrate MIS typical objects.

Equally, the monograph employs; set theory basics, logical operators, business system processes breakdown, and translating linguistic semantics into facts made of predicates. Further, in a logical framework store and retrieve information objects and to extract from database retrievals of pertinent objects for construer of MIS. The beginners are axiomatic recipient of these data modeling skills’ potentials of identifying the relevant information objects used in business transactions and entailing operations; insert, delete and update, and MIS retrievals. The practitioners may find the document a source to enrich their skills as engineering way-out to grasp system designs standards. Novelty is, these solutions constituent incremental roll-model, all set for later on alterations to coup with business processes growth.

In order to coup with the evolution direction of system designing skills, authors utilize the monograph as analytics of resolving issues of systems’ paradigms. They, successfully disseminate, as made easy training document to boost system modeling skills to wide variety of youngsters, who have found this fruitful in building up their confidence level. The continuum series of industrial monograph is to lead advances in system modeling embedded with class and object designing while introducing Object Oriented Relational databases as industrial practice.

This document reflects many advantages in binding theory and practices as an engine of the design nucleus, especially responding ‘why and how’ in data modeling.

NOTE: This version of document is raised on the understanding that neither the publishers, nor the authors shall be responsible for absolute correctness or sufficiency of any information or illustrated tables and queries contained in it. The authors reserve the right to edit excerpts for reasons of clarity and preciseness, any time.

1.2 THE LEAST YOU NEED TO KNOW

The terms are arranged here in sequence with natural needs of their learning;

Data/ Information

Data are most preliminary atomic component, syntactically string of characters with no apparent semantics on their own. Semantics are associated meaning with data which may vary under different contexts.

Example-1.2.1;

A string of 8 characters, say, 02111999 means nothing, unless one knows the associative knowledge, as group of digits representing contents such as the first two digits stand for the day, next two for the month, next two digits for the century, and the last two digits for the year.

Type

It is a generic general term to identify data component of a database.

Example-1.2.2;

Data types are; data item type, record type, file type, and many more.

Occurrence of Type

An occurrence identifies a specific set of values of a type. While Type is a general term, occurrence is a specific term.

Example-1.2.3; Data Item/ Field:

FIELD OCCURRENCES

Student-Name John Hock

Age 52

Gender Male

Major Computer Science

RECORD; Student Records’ Fields

Student-Name Shirley

Address Boulevard Avenue, Technology Park, Super City.

Gender Female

Phone 99-45-9567891

Data Type

The associated Field Values mark Data Type and Domain to encompass correctness of their prescribed algebraic operations, which require Apple with Apple.

Example-1.2.4;

DOMAIN DATA-TYPE DATA-VALUE

Employee Name Characters Jones

Social Security Identifier Integers 510-64-4478

Date-of-Birth Date 24-May-1943

Data-Field

Data-fields by name designate characteristic of basic values to comprise a record. The content of the data-fields are called data values, which are stored, maintained, and provide on access by software programs to users.

Example-1.2.5;

DATA-FIELD: Name Social Security Birth Date Birth Place

FIELD VALUE: Jones, 510-64-4478, 24-May-1943, New York

Input/ Output

INPUT; Inputs are the data values, which come from outside world into information system as facts and figures of pertinent instances of business system needs. These are checked for their correctness or integrity before they are finally accepted as recorded data values into the systems.

OUTPUT; The processed or crunched inputs for various judgment-needs, and provided to endusers.

Example-1.2.6;

Input; Monthly Payroll Transaction

Output: Monthly Pay-slip

Data Structure

Physical form of organizing data in memory or storage media is generally called Data Structure. The software programs manipulates through appropriate algorithms (for efficiency) over those physical forms aiming specific data crunching. Data structures are static/ essential part where the technologies of information manipulation act and react for change of values or states associating semantic values of the desired effects.

Example-1.2.7;

BASIC DATA STRUCTURE: Arrays; single and multiple dimensions, Character Strings, Lists, Inverted Lists, Tables, and many more.

SECONDARY DATA STRUCTURES: Trees, Indexes, Decision Tables, Knowledge Tables and many more.

Algorithm

Algorithm is part and parcel of Data Structure, which is a navigational/ execution path in memory or Storage-media required to arriving at solution. On this path the software program manipulates data values wherever whatever exists on data structure media.

Consequently, to secure a solution the algorithm navigates over data values organized in structure to acquire solution based on various stored values. The algorithm is not an exclusive technical word employed in mathematical computation; rather, it is equally in use for non-mathematical problems too.

Example-1.2.8; (non-mathematical paradigm)

Workout solution ways of the problem that is described below;

There are eight balls equal in weight except one, which is heavier. A two panel balance device is available for precision.

Various algorithms may be worked out to get a solution ‘which ball is heavier’ see below;

a. Brute Force Algorithm; Try all combination of pairs of balls, track one which is heavier. To configure all combinations needs and annihilate one by one of equal weights, requires control and management system before one gets lost in the course of picking the right solution.

b. Moderately Skilled Algorithm; Fixed one ball in a panel of balance device and keep changing ball in another panel in multiple weighing operations. It allows as minimum one and maximum 7 weighing operations, the guaranteed result is there as soon the device shows the first time heavier sign.

c. Intelligent Algorithm; Make two groups of balls each of four balls. In the first weighing operation eliminate lighter group of four balls. In the second, eliminate lighter group of two balls as above. In the last step of two balls heavier group in hand eliminate lighter balls as above. Thus, the heavier ball is separated in three shots.

Example-1.2.9;

A teacher is coerced to compile results of his class students, held during a semester for a large group of students. It is better since beginning prepare a two-dimensional table where the first column bears students’ names, and the second and more columns test-id for test results of respective students. The compilation ends up in column-wise and row wise navigations. The indicated navigation is algorithm and the two dimensional table is the data structure.

NOTE: The choice of optimized structure and algorithm is the programming capability, and is open in devising solutions, which vary in characteristics.

Record

A record is a collection of related data-fields containing elemental data items. Fields may be related explicitly to instance ID or in some way related to each other. Fields describe some specific instance, say, a person, an object, or an event, which has instance ID. A record of an employee contains information such as Name, Social Security, Birth Date, and Birth Place. Name data value is ID also.

Example-1.2.10; RECORD: EMPLOYEE

   Jones          510-64-4478          24-May-1943            New York

FIELDS: Name Social Security Birth Date Birth Place

File

A file is a collection of similar records stored on computer devices. A file has a name known to the operating system for control. Normally, the file has single structure (layout) and organization of records, which is determined/ employed by file access program(s), in order to read or write specific record(s).

NOTE: While organizing records in a file, it is not a good idea to mix different type (layout) of records in a file. Thus, file ‘Employee’ may contain records of employees or file ‘Department’ of department only.

Example-1.2.11; File EMPLOYEE Record Layout of Fields

LAYOUT (in-char): 1 – 10 11 - 26 27 - 37 38 - 57

FIELDS: Name Social Security Birth Date Birth Place

Transaction

The actions that go to up-to-date recording of business system information. These causes changes in the pertinent-field(s) of pertinent record(s) of pertinent file(s).

Example-1.2.12; Transaction types include;

Change of Data-items(s);

Insertion; Inserting data-item comprising record(s) in pertinent files in appropriate records organization of that file without disturbing other records or organization.
Updating; Updating of data-items of pertinent record of a file without disturbing the record placement where ever it is.
Deletion; Deletion of record(s) that is comprised of a set of data-item(s) as physical removal, and there are multiple ways to handle these physical removal.

Read Only - Information Mining;

Retrieval; Retrieval of Information that is made of chunk of data items of record(s) of File(s).

Master File

This is a type of file consists of master records, whose data-item(s) are up-to-date in nature pertaining to some logical facts of some business processes. The file is fairly permanent, data-items are retained relatively longer period of time.

Example-1.2.13;

Students’ Master File; containing up-to-date list of all active students

Transaction File

It is a collection of transactional records of a certain event(s) of certain time duration that contains records waiting to react with records of their Master File. Transaction Files are short lived, once the transactions react to records of Master File, it outlives the utility. The specific software programs process these transaction file records to react with the corresponding master record(s) in MASTER FILE. These reactions leave net effect over the pertinent record(s) and pertinent data-item(s)/ field(s) of Master record(s). These reactions include appending, updating, or deleting a record.

Example-1.2.14; Transaction Reactions to Master

Transactions update record of Master File such as; Delete transaction to delete a particular Student record from the pertinent Master File as per prescribed method. Insert transaction creates record(s) either places as append or insert in appropriate order of the pertinent Master File. Modify transaction is to update some parts of record(s) of pertinent Master File.

File Processing Modes

There are various modes of processing transaction(s) to react with the corresponding Master file.

Batch Mode Processing; Transaction processing is done to react with their Master record(s) on periodic basis. In that gap duration, transaction records are cropped up as a batch of transactions in a Transaction File for processing (by a specific program) in Batch Mode. There are various naming conventions in practical terms of such transaction files, satisfying their identification needs.

Example-1.2.15;

Monthly payroll has transaction file of new employee records, transfers of posting, salary increment, termination, stop salary, and many more. The payroll Master File has master records of employees with up-to-date data values carried over from last processing months.

Real-time Mode Processing (on-time when event occurs); Transactions are made to react with respective Master record(s) in Master File instantly on event basis. This processing action is triggered as when needed.

Example-1.2.16;

Updating inventory balance on each sale of item(s), the transactions of sale reacts to reduce quantity in stock of saleable inventory of that item. This event will leave the stock quantity of that item short by one for display.

Data-Integrity

Data Integrity is concerned with the degree of correctness, timeliness, and relevancy of data-item(s) within a pertinent context. This needs correct occurrences in all respects;

Syntactically and semantically Correctness; The syntactical correctness means data items within the physical domains, and semantically correctness means prescribed logical domains of rules involving multiple predicate and data items (business rules).

Every data item is a corporate resource of organization so its syntax and semantic must be uniform too.

Example-1.2.17;

Syntactical Correctness (Physical Domain); Data-item occurrence must be from the specific range physical domain, or may be unique, similarly many more;

Salary per month of a sales-employee is prescribed between US$ 1500 to 2000. Say, a candidate data-item is US$2200 or A200 is invalid.

Semantic Correctness (Logical Correctness); Date of joining of employee should not be in violence of Child Labor Law. If the two date values; Date of Birth and Date of joining are syntactically correct, the difference of days computed from the entered dates should also be correct to avoid violation of Child Labor Law.

Example-1.2.18;

In an educational environment of a University, students are the key components. The fields of the name of students and their address may be common in various information systems of a University and maintained in different standalone environments. One system may have more emphasis on financial figures and so their correctness, while names spell or addresses are of secondary importance, and vary in degree of their correctness. These specific correctness’ enforcements prohibit their share-ability.

Records Identity

Since relevant data records are collected in a file that has name (recognized by computer oriented file system). The file is organized in a suitable order. This makes essential to identify records of file(s) by specific data-items. Further, it is also an imperative need for interaction of business floor with the computer oriented file system, in order to interact with specific record(s). Characteristically, these should be;

Must be same through out the records of a file,
Must be unique for each record
Must be everlasting
Must be precise and optimized in usage and growth
Must be compatible to business floor needs
Must be easy to use in computer organization and floor world

NOTE; There are various means and ways to configure such records identity. Here this discourse is confined up to its usage in file organizations.

File Organizations

A file is a collection of records, which is organized underlying any one of the feasible organizations of records by record-ids for the purpose of convenience of access to specific records. Inheritably, these access disciplines are managed and controlled on plug-in basis by the operating systems. The accesses organizations include;

Sequential Access Organization; Records are organized on computer storage in sequence, while the insertion is appended, as they are created, and their access is sequential that means access each and every record from beginning of the file and check for desired record by record-id.

Example-1.2.19;

File of Enrollment of Students in a University, which is allocated a unique enrollment number as the identity for external and internal use of computer storage, in sequence the next available empty identity and duly recorded in Enrollment file.

Random Access Organization; This allows to inserting and accessing records on storage media enabling their access as random. This means without any access preconditions and fetching the desired record straight away.

Example-1.2.20;

Common instances; index files, inverted files, also databases

Keyed File Organization and Access: This is organized as groups and subgroups of component data-items, which are the part of records whose value singles out a record. The value of its key is used to derive place of insertion and access a record. The indicated components often called the primary key, secondary key and tertiary key and so on. The techniques to obtain address of a record from its key components include; Indexing, Hashing, and Lookup tables.

Example-1.2.21;

An employee record ‘745-35-8951’ as a very simple determination allocates its record number ‘8951’ within superlative group ‘35’ that suppurated by ‘745’ where it can be fetched for retrieval from storage media as stored over there.

Programming Languages

Basically, these are artificial languages with limited logical syntax and semantics for man machine interaction with limited obligations of perception and execution by machine. However the commands initiated by outside world for inside computer executions are translated at atomic level by middle layer of software. The evolution of these languages, since its early origination in post Second World era has gone through painful stages to arrive at the state-of-the art. These programming languages by classification of hazy generations are known to be;

Example-1.2.22;

First Generation -Machine Languages; These were atomic level machine instructions directly executable by computers’ architecture.

Second Generation - Assembly Languages; it was the first attempt to make programming instructions partly in communication human lexis by using mixed alpha-numeric codes to identify classification of instructions. Imperatively, it required translation of programming instructions into machine executables. Translator software programs were employed, called, Translators.

Third Generation – Compilation Oriented Languages; These include BASIC, FORTRAN, PASCAL, C, and many more. These were classified as high-level languages, which had more logical consistency in expressing group of machine executable instructions. To make machine executables they required special software that parses the instructions text and converted into machine executable atomic instructions. The evolution phase of these languages brought them up to structured constructs of programming, especially, in dealing loops; repeat set of instructions and termination, recursive loops, and many more. The process was called compilation with enhanced functionality of computer technology.

Fourth Generation – Structured Query Languages; SQL based INGRES, SYBASE, ORACLE, DB2, and many more. These were tailored to meet the needs of Relational Databases. The databases era inculcated MIS current movement dragging the technology out of File based processing systems. This implanted a revolutionary change in MIS thought process where information of events is key resource of business systems dynamics and so business competitiveness.

Fifth Generation Languages – Window Based Processing; Object Oriented Languages; C+, JAVA, window based ORACLE, advanced versions of LISP and PROLOG and many more. They are capable of providing graphic supports through window-based breakdowns of programming components.

Compiler

It is a special kind of software with specific obligations of translating high-level sequence of programming instructions to translate into machine executable programming code. The levels of conversions include translation, compilation and memory addresses generations assigned to programming instructions, and merging them all into one machine language compiled executable unit, often called ‘exe’ file. The compilers are specific to programming languages, for specific machine architecture and version and brands.

Example-1.2.23;

Compliers; FORTRAN for IBM Computer, ICL computers, Honeywell, and others

Operating System

Computer machine by itself is a black box of electronics chip circuiting. The primary function of an operating system is to make the capabilities (execution of programming instructions) of computer system available to software programs’ execution. The Operating System software is loaded into the special part of computer memory, which enables to control and manage machine operations by dynamically linking different components of a computer system in operational mode.

Example-1.2.24;

Operating systems; DOS, UNIX, LINUX, WINDOWS 98, WINDOWS 2000, WINDOWS XP, SUN SOLARIS, and many more

Software Program

Specific task oriented set of computer instructions in any of the generation of languages above indicated which crunch data items on files and produce results as outputs, in simple words these are called software programs. There are a variety of software programs, which are categorized according to their nature of use.

Example-1.2.25;

Software Program; Operating Systems, Drivers, Networking Communication Systems, Internet Surfing, Virus scanning and fixing, and many more

Data Bases – File Oriented

A database is a collection of inter-related files (data files), file layouts, field domains, quick reference for access, and any specific that is needed to maintain and use it. Databases, which are maintained on computers for insert, delete, update, and retrieval from computer files usually called, computerized database of file system.

Example-1.2.26;

University Students Examination System, Railway Booking System, Air Line Booking System, and many more

Database Management System (DBMS)

A collection of software programs required to store and process; insert, delete, update, and retrieve events on database file system is generally called DBMS. The Software Companies in IT industry marketed the branded DBMS products.

Example-1.2.27;

On-line Library Search, Accounting Packages for small size companies, Inventory Control System.

Database Key(s)

It refers to the address of a record occurrence stored in a database file. Usually it is composed of certain identifier. The database key value of a record is generated by the DBMS in use.

Example-1.2.28;

Data Independence

The data values in computer processing pass through various Operating Systems, processing environment (batch or on-line), and programming languages, consequently data gets insulated and does not suit to any change of Operating System, environment and language.

Example-1.2.29;

Data files of mainframe computers may not meaningful to Personal Computers

Data Processing

When data is being processed two basic operations are performed;

Maintenance; refers to changes in data/ records of a file in some manner. These are;

Example-1.2.30; adding new records;

Inserting record(s) in appropriate place of its organization of records of student(s) in Student File

Example-1.2.31; Deleting of record;

Student record(s) of specific Identity to be deleted from Student file, if already exists in that file. In situations physical deletion of record(s) is done in steps.

Example1.2.32; Updating a record;

Making effective, new spellings of student’s name, if, the specific student record already exists in a file. This updating means change of spelling stored in the field of student-name only, without disturbing other contents of that record(s).

Managing Data;

Commonly there are three important problems of managing data, which are;

Data Retrieval; retrieval of data in many manners from file system, some times within a very short time.

Data Integrity; Reduction in data duplication and uniformity of data syntax and semantics to make it a common resource.
Data Independence; to make data independent of insulation of environment, operating system and other machine layers.

Retrieval; It means reading data from file into computer processing memory.

Example-1.2.33;

List down students residing in dormitory

Most Desirable Requirements - Retrieval;

a. User should have a great deal of retrieval flexibility from file database, and be able to link data items from various inter related files.

b. User should be allowed reformatting of files need specific in order to reduce duplication and storage on need specific.

c. Since all users will access data in universal manner, there must be of high level of independence.

d. All these goals should be achieved while the database is operating in real time mode.

Types of Database User

Enduser; A person who directly extracts information from the database to fulfill job functionality of the business floor.

Specialist User; A knowledge-able user of database may be designer, developer, database analyst, database administrator, system auditor, and many more.

System Analysts; Analyze the business paradigms to recognize the actual needs of information objects and services to cater opportunities of existing business promotion.

Database Analysts; Focuses on determining requirement by standard analysis means and produce design of database components of information system.

Programmers/ Application Developers; Capable to read database design requirements and develop programs that have embedded in their commands to maintain and access the database.

Besides an application developer two basic skills are essential, one is capability to read and perceive a relational data model that enables designer and developers to imagine and devise all its effects on structural (entity) components and characteristics in a business system context.

Database Administrators; Administers consistency and integrity across databases, and provide consultancy and training of internal database structure and resources to other team members.

Technical Experts; networking, operating system, testing, and documentation experts.

1.3 BASIC TERMINOLOGY OF MODERN DATABASES – Quick look

Concept of Set and Information System Designing

In plain analytics, Set is a collection of objects without duplication. Expediting, this definition in physical form means, collection of objects would not require any discipline or order of their arrangements. The ID of objects would be acquired by data-values (non-duplicate) in standalone or in group and subgroup classifications.

Our domain of interest regarding the Set is confined to a collection of similar records (objects) that may be organized in a file structure or otherwise.

In simple words, the concept of set is deployed in a two level file structure of two different class of record types designed to link the two files’ records as set members. Each file records formulate a set, and link means associating with a record of a file a set certain record(s) of the second file set.

Example-1.3.1;

Conceptually, to represent Department and Faculty where each department consists of none, one, or more faculty member(s) that means there is a set of departments in Department file and sets of faculties for each department in Faculty File, while the links identifies each faculty with their department. This relationship of Department records with Faculty records is called of the type (1:n).

In a set of records Department is designated as owner record type (the ‘1’ of (1:n)), and Faculty is designated consisting of member record (the ‘n’ of (1:n)).

The graphic depiction of that design is;

DEPARTMENT

Owner Record Type:

Set (file) of department records

Relationship of 1:n

FACULTY

Member Record Type:

Set (file) of faculty records

DBMS – File Databases

Database Management System (DBMS) package of software products is organized to manage file storages under different schemas for the purpose of records management and maintenance; insert, delete and updates of records, and to some extent simple retrievals like Query By Example (QBE).

Example-1.3.2; DBMS

Insert Record

Update Record Retrieve Info

Delete Record

DBMS Types

Prominently speaking there is four categories of (schemas) of DBMS File systems, which are in use in File oriented environments. In brief these are described here as states of evolutions of state-of-the art databases. These include;

Hierarchical DBMS File Structure
Network DBMS File Structure
Inverted List DBMS File Structure
Relational DBMS Table Structure

Hierarchical DBMS File Structure – marketed brands

Conceptually hierarchy stands for a characteristic property that a child has one and only parent. The hierarchical DBMS deals with physical structure of records organizing as parent-child while child records are stored as physically adjacent to their parent record in the same file, prior to another parent. Classically, the navigations over these records were restricted due to hierarchical origin, because a child is originated from one and only one parent that also signifies its path to access. A parent may have none, one and more child records. The uniqueness is determined by combination of set of child and parent and not by individual set members standalone.

Example-1.3.3;

Consider again Department and Faculty example of (1:n) relationship of Example-1.3.1, which were designated as owner and member records of that relationship. The same problem in hierarchy setup would be called parent records instead of owner and child record instead of member, but both are stored in one file. The physical adjacency all in one file, where a parent record immediately followed by child their records, till the new parent record is stored. Incidentally, it is beyond our discussion to consider its pluses and minuses at this stage. See depicted version;

Exercise-1.1; for practice;

Underneath is a hierarchical schema setup, demonstrate hierarchical DBMS File structure of the same.

Exercise-1.2; for practice;

Conduct exercise to identify navigation paths to retrieve information from;

a. Parent to child level

b. Child to parent level

c. Child to child level

Exercise-1.3; for practice;

Diagram hierarchical DBMS design in File system database of the file organizational setup;

a. Directors

b. Divisional Heads

c. Supervisors

d. Office Clerks

Exercise-1.4; for practice;

A game board has many clubs as members, and each club has their members various players. Construct hierarchical DBMS file system, having information of tournament and their matches, matches results, the team members, captain and vice captain. The navigation for the following type of queries is essential;

a. Matches, results and teams

b. Matches scores of teams

c. Highest and lowest teams

d. Players panel events

Network DBMS File Structure – marketed brands

Conceptually, network DBMS; means a child may be associated with multiple parents and so vice versa. It also refers to two level relationship between different record types.

Analytically the navigations of these records would be coerced to know the choice of parent to path the access to a record, because a child may be originated from multiple parents and so the path(s) links the pieces of information where they pass through.

Since it is both ways multiple, these are designated in relationship, symbolically (n:n). Often these situations were handled with multiple hierarchies DBMS File Structure (based on adjacency of parent child records) that means much duplication of information system and so their efforts too. The uniqueness is determined by combination set of child and parent and not individual set members’ standalone.

Example -1.3.4; a simple case;

An Instructor works for more than one department, and vice versa.

à points to child

INSTRUCTOR

In hierarchical split these look like;

(1: n) (1: n)

INSTRUCTOR

DEPARTMENT

Exercise-1.5; for practice;

Orders are placed of multiple items to multiple suppliers each day.

Diagram network-file DBMS design

Inverted List DBMS File Structure – marketed brands

Inverted list file is a special structures providing the organization of header and references tail (open end). These were recommended for quick referencing for accessing specific data records on the basis of attribute data values. Likewise; to extract information specific to female gender, the inverted list provides ready list whose header is ‘female’ and followed by records or links pertaining to that attribute data value in this case it is ‘female’. The references may be organized in adjacency with of each header, or by pointers

GENDER VALUES RECORD ID

FEMALE R1, R4, R5, R8, R10

MALE R2, R3, R6, R7

The DBMS products were capable to produce inverted list files and allowing on them the set theory algebraic operations like Union, and Intersection for retrieval.

Example-1.3.5;

A Library has 1000 books on different subjects. Book records consist of information on book title, author, subject, and publisher. Indices of subjects are organized in Inverted List File structures depicted below;

BOOK RECORD

BOOK REC: Book#, Title, Author, Publisher, Subject, Other-Info

SUBJECT INDICES SET; Subject with physical adjacency of Book Nos

Subject-1,

Flat Files Databases & Relational Databases

The classical set theory is the backbone of relational databases, and so their analysis and design, that provides ground for instant logical manipulation for MIS needs.

The system analysis and design analytics transform business systems’ paradigm into a robust set of facts. The business system processes are built up around those facts. The facts are construed strictly in pursuit of the information calculus.

The facts engineering mechanism provides guaranteed support of information algebraic operations for ready extraction of MIS in support of ongoing business activities, which are duly recorded in the database. In plain words, the relational database is a dynamic repository of relational business facts in its own integrity.

The mounted techniques enable to convert business facts event into a standard design with predefined relationships amongst them, practically at the cost of no redundancy and high degree of data integrity and data independence. Rather, easily enforcing business rules and integrity constraints.

Further, these information objects are construing the MIS, which can be manipulated by means of set operations. These operations have already been programmed and stored as software units of RDBMS tools as package including ORACLE and others, and their uses are made rationally straight forward.

NOTE: Although the business description semantics is translated into syntactical repository of pertinent facts in the form of databases, while the semantics of facts is derivable from the acquired syntactical database repository.

Flat Files Databases are portrayed as under;

The physical structure of a flat file means organizing files of records layout; with fixed fields and their length irrespective of their basic type. Behaviorally, these are not tables but look like tables.

The physical structure of relational databases is different in many respects than a flat file system of databases of fixed fields and fixed length records files.

NOTE: RDBMS structure is neither flat files nor a matrix representation deployed in matrix algebra, where a matrix structure is two dimensional arrays of cells with identity of row and column numbers holding values of one type may be numeric only, and complete matrix is one operand unit of the prescribed matrix algebra.

Relational Databases;

The relationships of entities spin the application system, while entities have pivotal role in relational database. The entities catalog classes pertaining to sets of identical facts, which are in relationship with facts of other entities.

Since this industrial monograph is dedicated on the subject of relational Databases design that is called here data modeling, all its key features are dealt one by one as software engineering practices focusing on the same. The deliberations here are confined to an introductory buildup level aiming to better grip of delicacy of skills to this effect.

Relational Databases are portrayed as under;

In simple words - what is a Data Model?

Conceptually, a Data Model is a static structure for holding facts and effects on them of business processes, which formulate the dynamics of business system. The system dynamics results into change(s) in state of the facts, which are cohesively inter-related and describing the new state. The resultant state of facts, represent a prescribed status of the business system.

The system analysis and design identify paradigms of the facts and their relationship, from start and end of the business process flow.

Classically, a data model has the basic two-dimensional static structural meanings;

a. A Data Model is a Set of Entities or Tables (Entity/ Table are synonyms).

b. Each entity is a set of facts, characteristically comprised by attributes; as column(s), their occurrences appear in row(s) of similar semantics.

c. Entity rows are unique as collection of facts without duplication.

d. Relationships do exist at the birth of occurrences in between;

1. Entities

2. Columns (attribute) components of fact

3. Rows (facts) between collections of facts of an entity.

Implementation Structure, it is implemented in table type structure, which is different then file or flat file structure. The internal database structure pertains tables’ names, pertains rows (objects) of that table, each row (object) of a table has object ID (not row ID) and is made of attributes (columns) data values in specific domain. All rows of a table are made of fixed number of attributes throughout their life time, and secure value from one and only one domain, which comprehends its’ semantics. An Illustration;

Example-1.3.6;

A table type structures for STUDENT, that physically describes a student in that particular context, and every attribute has prescribed domain, say, Name comprises of maximum 20 characters of commonly known character set.

STUDENT table is set of facts of students with attribute values; STD_ID, NAME, ADDRESS, BIRTH_DT, and GENDER

TABLE: STUDENT

A table type structure of COURSE table is set of course facts with attributes data values; namely, CRSE_ID, CRSE_NAME, DEPRATMENT and SEC#, rest same as above.

TABLE: COURSE

A table type structure of FACULTY table is set of facts of faculty with attribute values; FAC_ID, FAC_NAME, DEPARTMENT, and GENDER, rests same as above.

TABLE: FACULTY

Example-1.3.7;

A database can be constructed cohesively, by adding here table as repository of events of the business system dynamics that is students’ registration in courses. This is meant here; specific student registered in specific courses and taught by specific faculty(s).

A table type structures of REGISTRATION.

TABLE: REGISTRATION

Discussion; The courses registration table relates occurrences of three different tables as a critical important business rule. The facts recorded in these three tables form objects of REGISTRATION table to mark an event of registration of business process describing the student, the course, and the teaching faculty.

This example shows entities relate specific instances of related facts.

Example-1.3.8; SCHMEA

Depicting this relationship of four entities (tables) below which is often called schema (logical model);

STUDENT COURSE FACULTY

REGISTRATION

Discussion; The above-depicted schema of the business events relating (cohesively) the facts of four entities is in parent-child relationship with parent entities STUDENT, COURSE, FACULTY and child entity of all the three is REGISTRATION. Uniqueness of occurrence of each fact will be focused upon in next chapters.

Exercise1.6; for practice;

For better understanding of Example 1.3.8 answer the following questions;

a. List down each parent-child relationship marking their occurrences form parent entity and corresponding child entity.

b. An occurrence in child entity could have instance, which does not exist in the relevant parent(s).

c. How many maximum entries can occur in child table where counts of the parent entries are taken from Examples 1.3.7.

d. In case a student takes a course whose faculty is not known how that student course registration would be made effective, discuss.

Exercise 1.7; for practice;

Let in Example-1.3.7 the REGISTERATION is replaced by the following table and data;.

Argue whether the Four tables STUDENT, FACULTY, COURSE, REGISTRATION are relational or not?

TABLE: REGISTRATION

1.4 ANSWER TIPS ON EXERCISES

Exercise-1.1; ANSWER TIP of Hierarchical Design Database

Diagram hierarchical DBMS design in File system database of the underneath hierarchical file organizational setup.

UNIVERSITY-2

Exercise-1.2; ANSWER TIP – Navigation on Hierarchical Database records;

Conduct exercise to identify navigation paths to retrieve information from;

a. Parent to child level; The hierarchical database of Example.1.3.1 characteristically holding child records in adjacency, so the navigation from parent to child is finding the parent record and from next these are the child records of same parent till another parent is found or end of file.

b. Child to parent level; The hierarchy is one directional path from parent to child, child to parent is reverse adjacent navigation till the parent is found

c. Child to child level; It is combination of both a) and b) above.

Exercise-1.3; ANSWER TIP

Diagram hierarchical DBMS design in File system database of the organizational setup is similar to the Answer Tip of Exercise-1.1.

Exercise-1.4; Hierarchical Database Design

A game board has many clubs as members, and each club has their members various players. Construct hierarchical DBMS file system having information of tournament and their matches, matches results, the team members, captain and vice captain. The navigation for the following type of queries is essential;

e. Matches, results and teams

f. Matches scores of teams

g. Highest and lowest teams

h. Players penalty events

Game board has clubs, each club has members, and teams are drawn from club members. Teams make scores in matches so retrieval for highest and lowest scores. Players in matches earn penalty on illegal events. These answer form the hierarchy, and suits to hierarchical database design.

Exercise-1.5; ANSWER TIP

Orders are placed of multiple items to multiple suppliers each day.

Diagram network file DBMS design.

ORDER

ITEM

Design Two databases on above lines, a simple approach

Exercise1.6; ANSWER TIP

For better understanding of Example 1.3.7 answer the following questions;

a. List down each parent-child relationship marking their occurrences form parent entity and corresponding child entity.

b. An occurrence in child entity could have instance, which does not exist in the relevant parent.

c. How many maximum entries can occur in child table where counts of parent entries is taken from Examples 1.3.7.

d. In case a student takes a course whose faculty is not known how that student course registration would be made effective, discuss.

ANSWERS TIPS

a. Specific Student fact of STUDENT table is parent and same student fact of

REGISTERATION is child, and so on.

b. NO.

c. Entry count in STUDENT (multiply) entry count in COURSE (multiply) entry count in FACULTY

d. Not Allowed.

Exercise 1.7; ANSWER TIPS

In Example 1.3.7 the table REGISTRATION shown in Exercise 1.7 replaced the REGISTERATION as new structure.

Argue whether the Four tables STUDENT, FACULTY, COURSE, and REGISTRATION are demonstrating data of relational events or not?

ANSWER TIPS;

The new table REGISTRATION violates the relational characteristics above noted, because of fields of CRSE-NAME and FAC-NAME over there, are redundant.

END OF LESSON –1

B4GEN

Thursday, 30 August 2018

DATA MODELING; BASIC TERMS AND CONCEPTS- REVIEW

No comments:

Post a Comment