Misplaced Pages

Data dictionary: Difference between revisions

Article snapshot taken from[REDACTED] with creative commons attribution-sharealike license. Give it a read and then ask your questions in the chat. We can research this topic together.
Browse history interactively← Previous editContent deleted Content addedVisualWikitext
Revision as of 16:45, 21 June 2022 editFfffrr (talk | contribs)Extended confirmed users99,524 edits Importing Wikidata short description: "Set of metadata that contains definitions and representations of data elements"Tag: Shortdesc helper← Previous edit Latest revision as of 08:01, 28 September 2024 edit undoSauer202 (talk | contribs)Extended confirmed users20,406 edits See also 
(10 intermediate revisions by 6 users not shown)
Line 1: Line 1:
{{Short description|Set of metadata that contains definitions and representations of data elements}} {{Short description|Set of metadata that contains definitions and representations of data elements}}
{{Distinguish|Dictionary (data structure)}} {{Distinguish|Dictionary (data structure)}}
]
{{Use dmy dates|date=July 2013}} {{Use dmy dates|date=October 2023}}
A '''data dictionary''', or ], as defined in the ''IBM Dictionary of Computing'', is a "centralized repository of information about data such as meaning, relationships to other data, origin, usage, and format".<ref>ACM, , 10th edition, 1993</ref> '']'' defines it as a collection of tables with metadata. The term can have one of several closely related meanings pertaining to ]s and ]s (DBMS): A '''data dictionary''', or ], as defined in the ''IBM Dictionary of Computing'', is a "centralized repository of information about data such as meaning, relationships to other data, origin, usage, and format".<ref>ACM, , 10th edition, 1993</ref> '']'' defines it as a collection of tables with metadata. The term can have one of several closely related meanings pertaining to ]s and ]s (DBMS):


Line 11: Line 12:
The terms ''data dictionary'' and ''data repository'' indicate a more general software utility than a catalogue. A ''catalogue'' is closely coupled with the DBMS software. It provides the information stored in it to the user and the DBA, but it is mainly accessed by the various software modules of the DBMS itself, such as ] and ] compilers, the query optimiser, the transaction processor, report generators, and the constraint enforcer. On the other hand, a ''data dictionary'' is a data structure that stores ], i.e., (structured) data about information. The software package for a stand-alone data dictionary or data repository may interact with the software modules of the DBMS, but it is mainly used by the designers, users and administrators of a computer system for information resource management. These systems maintain information on system hardware and software configuration, documentation, application and users as well as other information relevant to system administration.<ref>Ramez Elmasri, Shamkant B. Navathe: ''Fundamentals of Database Systems'', 3rd. ed. sect. 17.5, p. 582</ref> The terms ''data dictionary'' and ''data repository'' indicate a more general software utility than a catalogue. A ''catalogue'' is closely coupled with the DBMS software. It provides the information stored in it to the user and the DBA, but it is mainly accessed by the various software modules of the DBMS itself, such as ] and ] compilers, the query optimiser, the transaction processor, report generators, and the constraint enforcer. On the other hand, a ''data dictionary'' is a data structure that stores ], i.e., (structured) data about information. The software package for a stand-alone data dictionary or data repository may interact with the software modules of the DBMS, but it is mainly used by the designers, users and administrators of a computer system for information resource management. These systems maintain information on system hardware and software configuration, documentation, application and users as well as other information relevant to system administration.<ref>Ramez Elmasri, Shamkant B. Navathe: ''Fundamentals of Database Systems'', 3rd. ed. sect. 17.5, p. 582</ref>


If a data dictionary system is used only by the designers, users, and administrators and not by the DBMS Software, it is called a ''passive data dictionary.'' Otherwise, it is called an ''active data dictionary'' or ''data dictionary.'' When a passive data dictionary is updated, it is done so manually and independently from any changes to a DBMS (database) structure. With an active data dictionary, the dictionary is updated first and changes occur in the DBMS automatically as a result. If a data dictionary system is used only by the designers, users, and administrators and not by the DBMS Software, it is called a ''passive data dictionary.'' Otherwise, it is called an ''active data dictionary'' or ''data dictionary.'' When a passive data dictionary is updated, it is done so manually and independently from any changes to a DBMS (database) structure. With an active data dictionary, the dictionary is updated first and changes occur in the DBMS automatically as a result.


Database ] and ] developers can benefit from an authoritative data dictionary document that catalogs the organization, contents, and conventions of one or more databases.<ref>TechTarget, ''SearchSOA'', </ref> This typically includes the names and descriptions of various ] (records or Entities) and their contents (]) plus additional details, like the ] and length of each ]. Another important piece of information that a data dictionary can provide is the relationship between Tables. This is sometimes referred to in Entity-Relationship diagrams, or if using Set descriptors, identifying which Sets database Tables participate in. Database ] and ] developers can benefit from an authoritative data dictionary document that catalogs the organization, contents, and conventions of one or more databases.<ref>TechTarget, ''SearchSOA'', {{Webarchive|url=https://web.archive.org/web/20090212170420/http://searchsoa.techtarget.com/sDefinition/0,,sid26_gci211896,00.html |date=12 February 2009 }}</ref> This typically includes the names and descriptions of various ] (] or ]) and their contents (]) plus additional details, like the ] and length of each ]. Another important piece of information that a data dictionary can provide is the relationship between tables. This is sometimes referred to in ] diagrams (ERDs), or if using set descriptors, identifying which sets database tables participate in.


In an active data dictionary constraints may be placed upon the underlying data. For instance, a Range may be imposed on the value of numeric data in a data element (field), or a Record in a Table may be FORCED to participate in a set relationship with another Record-Type. Additionally, a distributed DBMS may have certain location specifics described within its active data dictionary (e.g. where Tables are physically located). In an active data dictionary constraints may be placed upon the underlying data. For instance, a range may be imposed on the value of numeric data in a data element (field), or a record in a table may be forced to participate in a set relationship with another record-type. Additionally, a distributed DBMS may have certain location specifics described within its active data dictionary (e.g. where tables are physically located).


The data dictionary consists of record types (tables) created in the database by systems generated command files, tailored for each supported back-end DBMS. Oracle has a list of specific views for the "sys" user. This allows users to look up the exact information that is needed. Command files contain SQL Statements for CREATE TABLE, CREATE UNIQUE INDEX, ALTER TABLE (for referential integrity), etc., using the specific statement required by that type of database. The data dictionary consists of record types (tables) created in the database by systems generated command files, tailored for each supported back-end DBMS. Oracle has a list of specific views for the "sys" user. This allows users to look up the exact information that is needed. Command files contain SQL Statements for <code>CREATE TABLE</code>, <code>CREATE UNIQUE INDEX</code>, <code>ALTER TABLE</code> (for referential integrity), etc., using the specific statement required by that type of database.


There is no universal standard as to the level of detail in such a document. There is no universal standard as to the level of detail in such a document.


==Middleware== ==Middleware==
In the construction of database applications, it can be useful to introduce an additional layer of data dictionary software, i.e. ], which communicates with the underlying DBMS data dictionary. Such a "high-level" data dictionary may offer additional features and a degree of flexibility that goes beyond the limitations of the native "low-level" data dictionary, whose primary purpose is to support the basic functions of the DBMS, not the requirements of a typical application. For example, a high-level data dictionary can provide alternative ]s tailored to suit different applications that share a common database.<ref>U.S. Patent 4774661, , 19 November 1985, AT&T</ref> Extensions to the data dictionary also can assist in ] against ]s.<ref>U.S. Patent 4769772, , 28 February 1985, Honeywell Bull</ref> Additionally, DBA functions are often automated using restructuring tools that are tightly coupled to an active data dictionary. In the construction of database applications, it can be useful to introduce an additional layer of data dictionary software, i.e. ], which communicates with the underlying DBMS data dictionary. Such a "high-level" data dictionary may offer additional features and a degree of flexibility that goes beyond the limitations of the native "low-level" data dictionary, whose primary purpose is to support the basic functions of the DBMS, not the requirements of a typical application. For example, a high-level data dictionary can provide alternative ]s tailored to suit different applications that share a common database.<ref>U.S. Patent 4774661, , 19 November 1985, AT&T</ref> Extensions to the data dictionary also can assist in ] against ]s.<ref>U.S. Patent 4769772, , 28 February 1985, Honeywell Bull</ref> Additionally, DBA functions are often automated using restructuring tools that are tightly coupled to an active data dictionary.


]s aimed at ] sometimes include high-level data dictionary facilities, which can substantially reduce the amount of programming required to build ], ], reports, and other components of a database application, including the database itself. For example, PHPLens includes a ] ] to automate the creation of tables, indexes, and ] constraints ] for multiple databases.<ref>PHPLens, {{Webarchive|url=https://web.archive.org/web/20071107170943/http://www.phplens.com/lens/adodb/docs-datadict.htm |date=7 November 2007 }}</ref> Another PHP-based data dictionary, part of the RADICORE toolkit, automatically generates program ], ], and SQL code for menus and forms with ] and complex ].<ref>RADICORE, </ref> For the ] environment, ] data dictionary provides cross-DBMS facilities for automated database creation, data validation, performance enhancement (] and index utilization), ], and extended ]s.<ref>Base One International Corp., </ref> ] features<ref>VISUAL DATAFLEX,</ref> provides the ability to use DataDictionaries as class files to form middle layer between the user interface and the underlying database. The intent is to create standardized rules to maintain data integrity and enforce business rules throughout one or more related applications. ]s aimed at ] sometimes include high-level data dictionary facilities, which can substantially reduce the amount of programming required to build ], ], reports, and other components of a database application, including the database itself. For example, PHPLens includes a ] ] to automate the creation of tables, indexes, and ] constraints ] for multiple databases.<ref>PHPLens, {{Webarchive|url=https://web.archive.org/web/20071107170943/http://www.phplens.com/lens/adodb/docs-datadict.htm |date=7 November 2007 }}</ref> Another PHP-based data dictionary, part of the RADICORE toolkit, automatically generates program ], ], and SQL code for menus and forms with ] and complex ].<ref>RADICORE, </ref> For the ] environment, ] data dictionary provides cross-DBMS facilities for automated database creation, data validation, performance enhancement (] and index utilization), ], and extended ]s.<ref>Base One International Corp., </ref> ] features<ref>VISUAL DATAFLEX, {{Webarchive|url=https://web.archive.org/web/20180405135738/http://www.visualdataflex.com/features.asp?pageid=1030 |date=5 April 2018 }}</ref> provides the ability to use DataDictionaries as class files to form middle layer between the user interface and the underlying database. The intent is to create standardized rules to maintain data integrity and enforce business rules throughout one or more related applications.


Some industries use generalized data dictionaries as technical standards to ensure interoperability between systems. The real estate industry, for example, abides by a to which the ] mandates<ref>{{Cite web|title=Real Estate Transaction Standards (RETS) Web API|url=https://www.nar.realtor/about-nar/policies/mls-policy/real-estate-transaction-standards-rets-web-api|access-date=2020-10-11|website=www.nar.realtor|language=en}}</ref> its ] comply with through its policy handbook.<ref>{{Cite web|title=Handbook on Multiple Listing Policy|url=https://www.nar.realtor/handbook-on-multiple-listing-policy|access-date=2020-10-11|website=www.nar.realtor|language=en}}</ref> This intermediate mapping layer for MLSs' native databases is supported by software companies which provide API services to MLS organizations. Some industries use generalized data dictionaries as technical standards to ensure interoperability between systems. The real estate industry, for example, abides by a to which the ] mandates<ref>{{Cite web|title=Real Estate Transaction Standards (RETS) Web API|url=https://www.nar.realtor/about-nar/policies/mls-policy/real-estate-transaction-standards-rets-web-api|access-date=2020-10-11|website=nar.realtor|date=23 January 2015 |language=en}}</ref> its ] comply with through its policy handbook.<ref>{{Cite web|title=Handbook on Multiple Listing Policy|url=https://www.nar.realtor/handbook-on-multiple-listing-policy|access-date=2020-10-11|website=nar.realtor|date=January 2015 |language=en}}</ref> This intermediate mapping layer for MLSs' native databases is supported by software companies which provide API services to MLS organizations.


==Platform-specific examples== ==Platform-specific examples==
Developers use a ''data description specification'' (''DDS'') to describe data attributes in file descriptions that are external to the application program that processes the data, in the context of an ].<ref>{{cite web |url=http://publib.boulder.ibm.com/infocenter/iseries/v5r3/topic/dds/rbafpddsmain.htm |title=DDS documentation for IBM System i V5R3}}</ref> The ''sys.ts$'' table in Oracle stores information about every table in the database. It is part of the data dictionary that is created when the ] is created.<ref>{{Cite web|url=http://www.dba-oracle.com/concepts/data_dictionary.htm|title=Oracle Concepts - Data Dictionary|website=www.dba-oracle.com|access-date=2017-02-13}}</ref> Developers may also use DDS context from FOSS for structured and transactional queries in open environments. Developers use a ] (DDS) to describe data attributes in file descriptions that are external to the application program that processes the data, in the context of an ].<ref>{{cite web |url=http://publib.boulder.ibm.com/infocenter/iseries/v5r3/topic/dds/rbafpddsmain.htm |title=DDS documentation for IBM System i V5R3}}</ref> The ''sys.ts$'' table in Oracle stores information about every table in the database. It is part of the data dictionary that is created when the ] is created.<ref>{{Cite web|url=http://www.dba-oracle.com/concepts/data_dictionary.htm|title=Oracle Concepts - Data Dictionary|website=dba-oracle.com|access-date=2017-02-13}}</ref> Developers may also use DDS context from ] (FOSS) for structured and transactional queries in open environments.


==Typical attributes== ==Typical attributes==
Line 38: Line 39:
* Field name, such as ] field name * Field name, such as ] field name
* Displayed field title. May default to field name if blank. * Displayed field title. May default to field name if blank.
* Field ] (string, integer, date, etc.) * Field ] (string, integer, date, etc.)
* Dimension(s) such as min and max values, display width, or number of decimal places. Different field types may interpret this differently. An alternative is to have different attributes depending on field type. * ] such as min and max values, display width, or number of decimal places. Different field types may interpret this differently. An alternative is to have different attributes depending on field type.
* Field display order or tab order * Field display order or tab order
* Coordinates on screen (if a positional or grid-based UI) * Coordinates on screen (if a positional or grid-based UI)
* Default value * Default value
* Prompt type, such as drop-down list, combo-box, check-boxes, range, etc. * Prompt type, such as drop-down list, combo-box, check-boxes, range, etc.
* Is-required (Boolean) - If 'true', the value can't be blank, null, or only white-spaces * Is-required (Boolean) - If 'true', the value can not be blank, null, or only white-spaces
* Is-read-only (Boolean) * Is-read-only (Boolean)
* Reference table name, if a foreign key. Can be used for validation or selection lists. * Reference table name, if a foreign key. Can be used for validation or selection lists.
Line 55: Line 56:
*] *]
*] *]
*]
*] *]
*] *]
Line 63: Line 65:


==References== ==References==
{{Reflist|30em}} {{Reflist}}


==External links== ==External links==
Line 69: Line 71:
*Yourdon, ''Structured Analysis Wiki'', *Yourdon, ''Structured Analysis Wiki'',
*Octopai, *Octopai,



{{Data warehouse}} {{Data warehouse}}

Latest revision as of 08:01, 28 September 2024

Set of metadata that contains definitions and representations of data elements Not to be confused with Dictionary (data structure).
A simple layout of a data dictionary

A data dictionary, or metadata repository, as defined in the IBM Dictionary of Computing, is a "centralized repository of information about data such as meaning, relationships to other data, origin, usage, and format". Oracle defines it as a collection of tables with metadata. The term can have one of several closely related meanings pertaining to databases and database management systems (DBMS):

  • A document describing a database or collection of databases
  • An integral component of a DBMS that is required to determine its structure
  • A piece of middleware that extends or supplants the native data dictionary of a DBMS

Documentation

The terms data dictionary and data repository indicate a more general software utility than a catalogue. A catalogue is closely coupled with the DBMS software. It provides the information stored in it to the user and the DBA, but it is mainly accessed by the various software modules of the DBMS itself, such as DDL and DML compilers, the query optimiser, the transaction processor, report generators, and the constraint enforcer. On the other hand, a data dictionary is a data structure that stores metadata, i.e., (structured) data about information. The software package for a stand-alone data dictionary or data repository may interact with the software modules of the DBMS, but it is mainly used by the designers, users and administrators of a computer system for information resource management. These systems maintain information on system hardware and software configuration, documentation, application and users as well as other information relevant to system administration.

If a data dictionary system is used only by the designers, users, and administrators and not by the DBMS Software, it is called a passive data dictionary. Otherwise, it is called an active data dictionary or data dictionary. When a passive data dictionary is updated, it is done so manually and independently from any changes to a DBMS (database) structure. With an active data dictionary, the dictionary is updated first and changes occur in the DBMS automatically as a result.

Database users and application developers can benefit from an authoritative data dictionary document that catalogs the organization, contents, and conventions of one or more databases. This typically includes the names and descriptions of various tables (records or entities) and their contents (fields) plus additional details, like the type and length of each data element. Another important piece of information that a data dictionary can provide is the relationship between tables. This is sometimes referred to in entity-relationship diagrams (ERDs), or if using set descriptors, identifying which sets database tables participate in.

In an active data dictionary constraints may be placed upon the underlying data. For instance, a range may be imposed on the value of numeric data in a data element (field), or a record in a table may be forced to participate in a set relationship with another record-type. Additionally, a distributed DBMS may have certain location specifics described within its active data dictionary (e.g. where tables are physically located).

The data dictionary consists of record types (tables) created in the database by systems generated command files, tailored for each supported back-end DBMS. Oracle has a list of specific views for the "sys" user. This allows users to look up the exact information that is needed. Command files contain SQL Statements for CREATE TABLE, CREATE UNIQUE INDEX, ALTER TABLE (for referential integrity), etc., using the specific statement required by that type of database.

There is no universal standard as to the level of detail in such a document.

Middleware

In the construction of database applications, it can be useful to introduce an additional layer of data dictionary software, i.e. middleware, which communicates with the underlying DBMS data dictionary. Such a "high-level" data dictionary may offer additional features and a degree of flexibility that goes beyond the limitations of the native "low-level" data dictionary, whose primary purpose is to support the basic functions of the DBMS, not the requirements of a typical application. For example, a high-level data dictionary can provide alternative entity-relationship models tailored to suit different applications that share a common database. Extensions to the data dictionary also can assist in query optimization against distributed databases. Additionally, DBA functions are often automated using restructuring tools that are tightly coupled to an active data dictionary.

Software frameworks aimed at rapid application development sometimes include high-level data dictionary facilities, which can substantially reduce the amount of programming required to build menus, forms, reports, and other components of a database application, including the database itself. For example, PHPLens includes a PHP class library to automate the creation of tables, indexes, and foreign key constraints portably for multiple databases. Another PHP-based data dictionary, part of the RADICORE toolkit, automatically generates program objects, scripts, and SQL code for menus and forms with data validation and complex joins. For the ASP.NET environment, Base One's data dictionary provides cross-DBMS facilities for automated database creation, data validation, performance enhancement (caching and index utilization), application security, and extended data types. Visual DataFlex features provides the ability to use DataDictionaries as class files to form middle layer between the user interface and the underlying database. The intent is to create standardized rules to maintain data integrity and enforce business rules throughout one or more related applications.

Some industries use generalized data dictionaries as technical standards to ensure interoperability between systems. The real estate industry, for example, abides by a RESO's Data Dictionary to which the National Association of REALTORS mandates its MLSs comply with through its policy handbook. This intermediate mapping layer for MLSs' native databases is supported by software companies which provide API services to MLS organizations.

Platform-specific examples

Developers use a data description specification (DDS) to describe data attributes in file descriptions that are external to the application program that processes the data, in the context of an IBM i. The sys.ts$ table in Oracle stores information about every table in the database. It is part of the data dictionary that is created when the Oracle Database is created. Developers may also use DDS context from free and open-source software (FOSS) for structured and transactional queries in open environments.

Typical attributes

Here is a non-exhaustive list of typical items found in a data dictionary for columns or fields:

  • Entity or form name or their ID (EntityID or FormID). The group this field belongs to.
  • Field name, such as RDBMS field name
  • Displayed field title. May default to field name if blank.
  • Field type (string, integer, date, etc.)
  • Measures such as min and max values, display width, or number of decimal places. Different field types may interpret this differently. An alternative is to have different attributes depending on field type.
  • Field display order or tab order
  • Coordinates on screen (if a positional or grid-based UI)
  • Default value
  • Prompt type, such as drop-down list, combo-box, check-boxes, range, etc.
  • Is-required (Boolean) - If 'true', the value can not be blank, null, or only white-spaces
  • Is-read-only (Boolean)
  • Reference table name, if a foreign key. Can be used for validation or selection lists.
  • Various event handlers or references to. Example: "on-click", "on-validate", etc. See event-driven programming.
  • Format code, such as a regular expression or COBOL-style "PIC" statements
  • Description or synopsis
  • Database index characteristics or specification

See also

References

  1. ACM, IBM Dictionary of Computing, 10th edition, 1993
  2. Ramez Elmasri, Shamkant B. Navathe: Fundamentals of Database Systems, 3rd. ed. sect. 17.5, p. 582
  3. TechTarget, SearchSOA, What is a data dictionary? Archived 12 February 2009 at the Wayback Machine
  4. U.S. Patent 4774661, Database management system with active data dictionary, 19 November 1985, AT&T
  5. U.S. Patent 4769772, Automated query optimization method using both global and parallel local optimizations for materialization access planning for distributed databases, 28 February 1985, Honeywell Bull
  6. PHPLens, ADOdb Data Dictionary Library for PHP Archived 7 November 2007 at the Wayback Machine
  7. RADICORE, What is a Data Dictionary?
  8. Base One International Corp., Base One Data Dictionary
  9. VISUAL DATAFLEX,features Archived 5 April 2018 at the Wayback Machine
  10. "Real Estate Transaction Standards (RETS) Web API". nar.realtor. 23 January 2015. Retrieved 11 October 2020.
  11. "Handbook on Multiple Listing Policy". nar.realtor. January 2015. Retrieved 11 October 2020.
  12. "DDS documentation for IBM System i V5R3".
  13. "Oracle Concepts - Data Dictionary". dba-oracle.com. Retrieved 13 February 2017.

External links

Data warehouses
Creating a data warehouse
Concepts
Variants
Elements
Fact
Dimension
Filling
Using a data warehouse
Concepts
Languages
Tools
Related
People
Products
Categories:
Data dictionary: Difference between revisions Add topic