Why create a data dictionary?

A data dictionary describes your data. It describes the choices made about column names, codes, methods, or sampling. It enables anyone to better find, understand, reuse, and manage that data.

The many benefits

The benefits of a data dictionary depend on your relationship to the data and what you are trying to achieve.

For individuals

Data dictionaries:

  • enable others to find and reuse your data
  • improve data consistency by employing and encouraging standards
  • improve the quality of your data and the measurability of that quality
  • reduce data redundancy, which saves you time and effort.

For organisations

Data dictionaries:

  • enable the timely and efficient use of your data
  • reduce data redundancy, because the cost of redundant data storage adds up
  • enable internal data sharing
  • demonstrate to, and build trust with, the public that you are managing public data responsibly
  • reveal poor design decisions
  • provide documentation about your data, making it easier to improve and manage quality.

For Aotearoa NZ

Data dictionaries:

  • enable government data to be found and reused by all
  • reduce data redundancy in Aotearoa NZ government, by enabling sharing and reuse
  • enable data from different agencies to be readily combined to provide more insights and uses.

Data inventories and findable data 

Downloadable data dictionaries in CSV and PDF formats serve the basic needs of data analysts. That is, if the data is readily available, then a data dictionary in CSV format will typically give them the information they need to confidently analyse your data.
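As a rough illustration of how an analyst might use a CSV data dictionary, the sketch below loads one and checks a data record against it. The column names (variable, description, type, allowed_values), the example variables, and the helper functions are all assumptions made for this example; they are not a prescribed layout.

```python
import csv
import io

# Hypothetical data dictionary in CSV form. The columns and variables are
# illustrative assumptions, not a required or standard layout.
DICTIONARY_CSV = """variable,description,type,allowed_values
age_group,Age band of the respondent,code,child|adolescent|adult
region,Regional council area,text,
income,Annual income in NZD,integer,
"""


def load_dictionary(text):
    """Return a mapping of variable name to its data dictionary entry."""
    return {row["variable"]: row for row in csv.DictReader(io.StringIO(text))}


def check_record(record, dictionary):
    """Return a list of problems found in one data record (all values are strings)."""
    problems = []
    for name, value in record.items():
        entry = dictionary.get(name)
        if entry is None:
            problems.append(f"{name}: not described in the data dictionary")
            continue
        # Integer variables must look like whole numbers.
        if entry["type"] == "integer" and not value.lstrip("-").isdigit():
            problems.append(f"{name}: expected an integer, got {value!r}")
        # Coded variables must use one of the documented codes.
        allowed = entry["allowed_values"]
        if allowed and value not in allowed.split("|"):
            problems.append(f"{name}: {value!r} is not a documented code")
    return problems


dictionary = load_dictionary(DICTIONARY_CSV)
for problem in check_record({"age_group": "teen", "income": "52000"}, dictionary):
    print(problem)
```

Here the record uses the code 'teen', which the hypothetical dictionary does not document, so the check reports it. The same dictionary file remains readable by a person in any spreadsheet tool.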

However, these data dictionaries are merely static 'codebooks'. CSV formats are, at least, human and machine readable. But these 'codebooks' can result in a lack of standardisation, as each analyst or data producer creates their own data dictionaries based on their own style.

Also, there are fantastic software products that can now 'intelligently' (digitally) bring together and maintain your data in interactive data catalogues. These will help your organisation keep track of your data, help others find your data, and link your data to similar data published by other organisations.

If every organisation maintained data dictionaries and catalogues like this, then the findability and accessibility of Aotearoa NZ data would be phenomenal.   

The Data Documentation Initiative

Software products like Colectica can help you to create these standardised, maintainable, and interactive data inventories. Furthermore, these products follow metadata standards from the Data Documentation Initiative (DDI). Following these international standards helps the interoperability of all our metadata and data.

These standards and software products also improve the findability of your data. For instance, a basic search on a data catalogue often looks for your search term in only the title and description. But I am sure you are familiar with titles or descriptions that make no sense or use unexpected words for certain ideas. This is common, and as a result searches often fail to find all of the relevant data.

In comparison, rather than relying on a match between your search term and the title and description, these technologies and standards allow you to search for a concept or theme across all of the metadata.

Searching for 'adolescent' would then discover any data related to, linked by, or containing variables related to the concept of 'adolescent'. 
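The difference between the two kinds of search can be sketched in a few lines. The catalogue entries, the 'concepts' tags on each variable, and both functions below are hypothetical and greatly simplified; they are not the DDI data model or any product's actual search interface.

```python
# Hypothetical catalogue: each dataset has a title, a description, and
# variables tagged with the concepts they measure (an assumption for this sketch).
CATALOGUE = [
    {
        "title": "Youth wellbeing survey 2019",
        "description": "Survey of secondary school students.",
        "variables": [{"name": "age_band", "concepts": {"adolescent", "age"}}],
    },
    {
        "title": "Adolescent health study",
        "description": "Health outcomes for young people.",
        "variables": [{"name": "bmi", "concepts": {"health"}}],
    },
    {
        "title": "Household travel survey",
        "description": "Trips made by household members.",
        "variables": [{"name": "trip_mode", "concepts": {"transport"}}],
    },
]


def title_search(term):
    """Basic search: match the term against title and description only."""
    term = term.lower()
    return [
        d["title"]
        for d in CATALOGUE
        if term in d["title"].lower() or term in d["description"].lower()
    ]


def concept_search(term):
    """Metadata search: also match the concepts attached to each variable."""
    term = term.lower()
    hits = []
    for d in CATALOGUE:
        in_text = term in d["title"].lower() or term in d["description"].lower()
        in_concepts = any(term in v["concepts"] for v in d["variables"])
        if in_text or in_concepts:
            hits.append(d["title"])
    return hits


print(title_search("adolescent"))
print(concept_search("adolescent"))
```

In this sketch the basic search misses the youth wellbeing survey because its title and description never use the word 'adolescent', while the concept search finds it through the tagged age_band variable.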

You can see an example of this in the following websites:

The Question Variable database
The Closer Discovery database

Learn more about DDI

If you want to explore DDI and the benefits for your organisation, the following documents have further explanation and examples. Unfortunately, they are only available in PDF and PowerPoint formats. If you need them in alternate formats for accessibility purposes, contact datalead@stats.govt.nz.

Basic

An introduction to DDI [PDF 947 KB]

Intermediate

What can DDI do for you? [PPTX 5.4 MB]

Advanced 

Question driven harmonisation of data – The variable cascade in practice [PDF 3.7 MB]

Data dictionaries and stewardship

Data stewardship is the careful and responsible creation, collection, management, and use of data. Data dictionaries are one of the first, easiest, and best ways for an organisation to improve how it works with data and how data works for Aotearoa NZ.

Data stewardship

Contact us

If you’d like more information, have a question, or want to provide feedback, email datalead@stats.govt.nz.

Content last reviewed 11 January 2021.