Introduction to data management

Learn about the lifecycle of a dataset, and data management plans (DMP), why they are important, and how to write one.

Introduction to data management, part 1 e-learning [PDF 1.2 MB]
Data management plan template [DOCX 1 MB]

What is the lifecycle of data?

Data has a lifecycle: collect, describe, store short-term, analyse and check, use, save or destroy. You will often see 'plan' added to the beginning of this lifecycle, but we like to think that planning relates to all stages. 

Data management is a way of (positively) influencing how your data moves through this cycle. A data management plan documents all the information and decisions made about the data. 

[Figure: the six lifecycle stages listed above, with 'plan' in the centre of the flow, connected to every stage.]

Why manage data?

Demand

Growing demand for evidence-based decision-making has increased the demand for data. The uses for data are also changing rapidly, so data needs to be more versatile.

Transparency and security

If people are going to continue to provide data, they need to know organisations are taking good care of it.

Use and re-use

Data has the potential to be used more than once, and not just for one purpose. This means the ‘metadata’ (which describes and gives the context for the data) is as important as the actual data.

Legislative requirements

The Public Records Act 2005 requires that all data collected as part of government business is managed until it is archived or destroyed.

What do I need to know about DMPs?

A good data management plan (DMP):

  • manages the dataset as well as describing (giving information about) it
  • is a gateway to everything to do with the dataset. It must be clearly linked to the dataset, the metadata, and any other relevant documents or records. If the dataset is small, the plan may even contain the metadata about the dataset.
  • should always be in place, even if that means adding one later to manage data retrieved from a long-established system, such as a taxation system
  • can apply to more than one dataset if the datasets’ governance and related documentation are the same
  • shows that the data is being handled safely and securely
  • identifies any legislative or contractual requirements for accessing or using the data.
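
The "gateway" idea in the list above can be sketched as a small record that holds a plan's links and reports which ones are still missing. This is only an illustration: the class, its field names, and the `missing_links` helper are hypothetical and are not part of the data management plan template.

```python
from dataclasses import dataclass, field


@dataclass
class DataManagementPlan:
    """Hypothetical DMP record; all field names are illustrative assumptions."""
    title: str
    # One plan may cover several datasets if their governance is the same.
    dataset_links: list[str] = field(default_factory=list)
    # Metadata can be linked, or held in the plan itself for a small dataset.
    metadata_link: str = ""
    inline_metadata: dict[str, str] = field(default_factory=dict)
    related_documents: list[str] = field(default_factory=list)
    legal_requirements: list[str] = field(default_factory=list)


def missing_links(plan: DataManagementPlan) -> list[str]:
    """Return what the plan still lacks before it can act as a gateway."""
    gaps = []
    if not plan.dataset_links:
        gaps.append("dataset_links")
    if not plan.metadata_link and not plan.inline_metadata:
        gaps.append("metadata")
    return gaps
```

A plan with no dataset link and no metadata would report both gaps; a plan for a small dataset that carries its own metadata would report none.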

What are the parts of a good DMP?

  • Governance and access
  • Discovery, use, and re-use
  • Retention, preservation, and disposal. 

Governance and access

Governance

Governance is about properly looking after the dataset, and applies to both individuals and organisations.

An individual is accountable for their datasets: knowing how to access them, how to keep them secure and how the datasets contribute value inside and outside their organisation.

An organisation is accountable for how it manages its data assets, so they are accessible, secure, usable and re-usable. 

Responsibilities for data management begin with the data creators/collectors. They need to be sure, for example:

  • who will be responsible for the dataset
  • how informed consent will be handled
  • that legislative requirements will be met
  • that relevant principles/frameworks are followed
  • that documents regarding decisions are stored securely and are discoverable and accessible.

The easiest way to make sense of a data management plan is to create one yourself. Open the data management plan template, then save the document to your own system.

Data management plan template [Word 1 MB]

You can refer back to this guide as you complete the template. 

Access and security

Access and security are important to those who provide data. We need to be transparent about the openness of data we collect, including access limits:

  • How will we manage security?
  • How will we make sure the data remains uncorrupted?
  • What barriers might there be that would stop sharing?
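
One common way to check that data remains uncorrupted is to record a checksum when the dataset is stored and compare it again later. The sketch below assumes the dataset is a single file and uses SHA-256 from Python's standard library; the function names are illustrative, and this guide does not prescribe any particular method.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path: Path, recorded_checksum: str) -> bool:
    """Return True if the file still matches the checksum recorded earlier."""
    return sha256_of(path) == recorded_checksum
```

The recorded checksum could be kept in the DMP or the metadata, so that anyone with access can repeat the check.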

Discovery, use, and re-use

Data documentation

  • Describe how the dataset was extracted or created.
  • Enable use and re-use by explaining what the data items mean.
  • Use consistent names to identify the data through its lifecycle (such as raw, processed, final).

The users of any particular dataset do not usually have the opportunity to talk with the creators. So describing the dataset is vital: it means the data can be discovered, used, and potentially re-used.
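
The point about consistent names can be sketched as a small helper that builds every file name the same way. The pattern used here (name, stage, date) is an assumption for illustration, not a required convention; the stage names come from the example in the list above.

```python
from datetime import date

# Lifecycle stages taken from the example in the guidance above.
STAGES = ("raw", "processed", "final")


def dataset_filename(name: str, stage: str, version_date: date, extension: str = "csv") -> str:
    """Build a file name of the form <name>_<stage>_<YYYY-MM-DD>.<extension>."""
    if stage not in STAGES:
        raise ValueError(f"stage must be one of {STAGES}, got {stage!r}")
    slug = "-".join(name.lower().split())  # lower-case, spaces become hyphens
    return f"{slug}_{stage}_{version_date.isoformat()}.{extension}"


print(dataset_filename("Household Survey", "raw", date(2021, 4, 23)))
# prints: household-survey_raw_2021-04-23.csv
```

Rejecting unknown stage names keeps ad hoc labels such as "final2" from creeping in.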

Data formats, volume, and storage

  • What form will the dataset be kept in?
  • What software is needed to use the dataset?
  • Where is the dataset stored?
  • What is the size of the dataset?
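
Some of these questions can be answered directly from the stored file. The sketch below assumes the dataset is a single file on a file system and reports its location, format (taken from the file extension), and size; the function name and the returned keys are illustrative. The software needed to use the dataset still has to be recorded by hand.

```python
from pathlib import Path


def storage_summary(path: Path) -> dict[str, object]:
    """Summarise where a dataset file lives, its format, and its size."""
    resolved = path.resolve()
    return {
        "location": str(resolved.parent),
        # The format is inferred from the extension, which is only a hint.
        "format": resolved.suffix.lstrip(".").lower() or "unknown",
        "size_bytes": resolved.stat().st_size,
    }
```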

Retention, preservation, and disposal

Managing datasets well means both creators and users know that the data is being looked after. For example, they know:

  • how long datasets will be kept and how long-term access will be managed
  • how decisions will be made about disposal
  • that reviews are scheduled in the data lifecycle
  • that the final archived data is read-only.
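
For a dataset archived as a file, the read-only point above can be sketched by removing write permission. This is a minimal illustration that assumes a POSIX-style file system; the function names are hypothetical, and an archive system may enforce read-only access in other ways.

```python
import os
import stat
from pathlib import Path

# Write-permission bits for owner, group, and others.
WRITE_BITS = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH


def make_read_only(path: Path) -> None:
    """Remove all write permissions from a file, keeping its other bits."""
    current = stat.S_IMODE(path.stat().st_mode)
    os.chmod(path, current & ~WRITE_BITS)


def is_read_only(path: Path) -> bool:
    """Return True if no write-permission bit is set on the file."""
    return stat.S_IMODE(path.stat().st_mode) & WRITE_BITS == 0
```

File permissions guard against accidental edits only; an administrator can still change them, so they do not replace access controls on the archive itself.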

How do you practically implement a DMP?

Make sure that your data management plan (DMP) is:

  • easily accessible
  • user-friendly
  • easy to maintain.

You can find other simple Word or Excel versions online, as well as software programs that collect data, put it into a database, and create interactive reports.

DMPTool
Public templates
Publicly accessible example DMPs 

Where should a DMP live?

All your DMPs should be stored together in a shared location. Each DMP should clearly link to its datasets, relevant documents, and shared drives as appropriate.

How can I learn more?

Digital Curation Centre
A wonderful collection of resources for everything data and information management. 

Contact us

If you’d like more information, have a question, or want to provide feedback on this page, email datalead@stats.govt.nz.

Content last reviewed 23 April 2021.
