We have collated the guidance provided by the Data Ethics Advisory Group (DEAG) to government agencies that have brought specific initiatives and challenges to the Group. The guidance captures and themes the advice given in response to these specific requests; however, much of it will generalise to many other use cases.
Preliminary considerations: deciding whether to proceed
Consider use cases individually and be clear about what is in and out of scope. Describe each use case and consider the ethics relating to each. Learn from others e.g., international initiatives.
When working with people’s personal information, complete a Privacy Impact Assessment (PIA) but note that a PIA alone is insufficient to identify ethical issues.
Consider all actual and perceived risks
- Note the foundational risks of trust, confidence, and engagement with the data system (when the public trusts government with their data, good data quality results, strengthening official statistics)
- Note that data continues to have a life of its own once it is shared
- Remember that data released publicly can then be used commercially
- Ask questions like - could disadvantaged populations be further disadvantaged by initiatives or interventions?
- What level of residual risk is acceptable?
Consider the potential benefits
- What use is the data being put to?
- Will it make people's lives better?
- Who may benefit from the data use and application?
- Are the benefits being returned to the public of New Zealand?
- Will this foster local innovation - or are the benefits going overseas or to commercial entities? If entities are going to make a lot of money overseas, what commercial gains would need to flow back into New Zealand?
- What meaningful impact will eventuate?
- Measure where the impacts may occur over time, recognising that communities' needs may also change.
Consider equity, inclusion and reciprocity
- Who gets to influence decisions and decide what good use is?
- A digital divide is real – will this exacerbate it? How could digital inclusion occur?
- Is upskilling of capability needed?
- Promote accessibility e.g., give people access to the data about them and their communities
- Reciprocate where possible and give data back to communities
- The environment and future generations also have rights
- Use an ethical framework, for example:
- Consequentialism
- Utility and fairness
- Rights and justice
- Underpin ethical principles with human principles, to ensure people are at the centre of the decisions made.
Consider the balance of benefits and potential harms to reach a final decision
- Follow the principle of ‘doing good while doing no harm’. Consider a wide definition of harm for individuals, families and groups.
- Protect people and communities.
- It is important to reflect on what is possible and what is necessary.
- Carefully consider ‘should we do this?’
- Consider the ethics of not doing this.
- When risk and sensitivities are involved, there is a need to proceed very carefully.
Establishing the authorising environment
It is important to operate within the right ethical and legal frameworks and to express explicitly how decisions align with these. Identify possible consequences if data is used in ways that could lead to harm, or in ways that could extinguish rights.
Consider the legislative and ethical environment
- Data sovereignty rights.
- Jurisdictional risks e.g. data held in, or shared with, other countries. Jurisdictional risks refer to the legal pathway for foreign governments, for example Australia, to access NZ data held in their jurisdictions.
- Territorial reach is also legally possible, under current mutual assistance laws, but has not yet been experienced. This is where other governments, in certain circumstances, can request access to data held in NZ.
- Privacy Law and whether engagement with the Office of the Privacy Commissioner is advisable.
- Human Rights.
- The rights of the environment and future generations.
Seek to maintain and build social license, trust and confidence
- Trust is paramount; it should be prioritised and built proactively.
- New Zealanders need to have trust and confidence in the ability of those collecting and using data.
- Align to trust criteria and incorporate established philosophical accounts of trust.
- Put in place strong guardrails and frameworks to ensure government data use does not overreach.
- Be aware of power imbalances, e.g., the role of Stats NZ as one of the few agencies that people must give information to, and the recognition of its constitutional role.
Develop strong governance
- View data as a system, rather than managing it in a siloed approach.
- Ensure there are clear roles and accountabilities for data.
- Build in a transparent, ethical review process from the start.
- A good practice is to have an ethical panel overseeing significant projects that includes diverse membership and service users.
- Independent reviews of best practice can be beneficial when expanding or changing data use.
- Consider an independent monitoring function to ensure the agency is following necessary assurance rules. This may alleviate public concerns and provide reassurance to ministers.
- Develop risk management in a way that people feel comfortable to report issues and concerns.
- Ensure the proposed use of the data aligns with the original purpose of data collection. For the individuals that the data relates to, would they reasonably expect it to be used in this way?
- Consider system choices that support system-to-system interoperability.
- Make decisions in a transparent manner.
Security
- Put in place rules and controls as to who can access what data.
- Take a ‘zero-trust’ approach when considering technical parameters and culture to prevent transgressions.
- Deploy modern data governance tools to enforce policies and protocols at a technological level.
- Data access rules need to be applied and monitored rigorously, with trackable audit capability incorporated (who accesses what data and how it is used). Establish guidance on the triggers, or ‘red flags’, that would necessitate an audit.
- Breaches must have serious consequences and individuals need to be reassured that their data is stored and used safely, especially within small communities.
- Physical safe haven access systems (for an example see the SafePod network) can allow secure access to data hosted by another organisation.
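For illustration only, the access-rule and audit-trail points above can be sketched in code. This is a minimal, hypothetical example (all class, role, and dataset names are invented for demonstration): every access request, allowed or denied, is recorded so that ‘red flag’ patterns can trigger an audit.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical sketch: role-based access rules with a trackable audit log.
@dataclass
class DataAccessController:
    # Maps role -> set of dataset names that role may read (illustrative policy).
    policy: dict
    audit_log: list = field(default_factory=list)

    def request_access(self, user: str, role: str, dataset: str, purpose: str) -> bool:
        allowed = dataset in self.policy.get(role, set())
        # Every request is recorded, allowed or not, so an audit can
        # reconstruct who accessed what data and for what stated purpose.
        self.audit_log.append({
            "when": datetime.now(timezone.utc).isoformat(),
            "user": user, "role": role, "dataset": dataset,
            "purpose": purpose, "allowed": allowed,
        })
        return allowed

    def red_flags(self, threshold: int = 3):
        # Example audit trigger: users with repeated denied requests.
        denied = {}
        for entry in self.audit_log:
            if not entry["allowed"]:
                denied[entry["user"]] = denied.get(entry["user"], 0) + 1
        return [user for user, count in denied.items() if count >= threshold]
```

The design choice worth noting is that denials are logged as well as grants: an audit trail that only records successful access cannot surface probing behaviour.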
Capability
- Invest in training so staff, partners, and researchers understand how to use data responsibly.
- Invest time in data ethics training – you can contact Stats NZ’s Centre for Data Ethics and Innovation for free training materials and guidance.
- Create a learning environment so that you regularly assess and continuously improve your data practices.
Core practices
Some data and data practices require certain ethical considerations
- Sensitive data: data can be considered sensitive by different people, cultures, and communities e.g., Māori data that is tapu, rainbow communities who are fearful to self-identify, imaging, cervical screening. How should this data be treated?
- Ethnicity data: ethnicity goes to the heart of peoples’ identity; therefore, it is important that individuals and communities can define ethnicity on their terms. Be aware that some people do not feel safe in self-identifying their ethnicity in certain situations, fearing that they may be at risk of negative outcomes, so they may report ethnicity differently in different contexts. Ethnicity can also change over time. Consider using ontology over taxonomy, as ontology provides flexibility in how different ethnicities can be grouped for different purposes, and the ability to include greater detail (e.g., enabling finer self-identification). For transparency, publish mappings and collection contexts.
- Missing data: it is important to understand who is not in the data. These individuals and communities are not represented and not visible. How can the data be more inclusive?
- Imputation: consider human rights and autonomy issues when imputing missing data and the value of non-responses.
- Administrative data: consider how data was collected when considering whether it is appropriate to use or not, e.g., was information provided by the individual or someone else? was it collected under duress or stress? Note that people can give their information (e.g., ethnicity) in different ways to different places. There are many factors that affect the quality of administrative data that need to be considered.
- Synthetic data: when synthetic data closely models real populations, carefully consider any privacy concerns.
- Proxy measures: be cautious of using data attributes as proxy measures as these can prove to be poor substitutes and can introduce inaccuracies and bias.
- Inferences: inferences can also result in harm, such as discrimination and stigma.
- Simulation modelling: there are often trade-offs to be made when creating models. Careful thought is needed around what data and attributes are used, including sampling methodology that ensures representativeness.
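The ontology-over-taxonomy point made for ethnicity data above can be sketched as follows. This is a hypothetical illustration (the groupings and purpose names are invented, not Stats NZ classifications): one detailed, self-identified entry is grouped differently depending on the purpose, while full detail is preserved whenever no grouping applies.

```python
# Hypothetical sketch: a small ethnicity 'ontology' where one detailed,
# self-identified entry maps into different groupings for different purposes.
# All values and purpose names here are illustrative only.
GROUPINGS = {
    "health_reporting": {
        "Samoan": "Pacific Peoples",
        "Tongan": "Pacific Peoples",
        "Chinese": "Asian",
    },
    "language_services": {
        "Samoan": "Samoan",
        "Tongan": "Tongan",
        "Chinese": "Chinese",
    },
}

def group_ethnicity(detailed: str, purpose: str) -> str:
    """Group a detailed ethnicity for a given purpose; keep detail otherwise."""
    return GROUPINGS.get(purpose, {}).get(detailed, detailed)
```

Because the detailed value is the source of truth and groupings are purpose-specific mappings, new purposes can be added without re-collecting data, and the mappings themselves can be published for transparency, as the bullet recommends.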
Engagement
Engage with groups and communities in ways that build trust and value people
- Engage early and build trust.
- Consider how to bring the public on the journey, to increase trust and confidence.
- Communicate processes in a way that is transparent for communities to engage with. Be transparent about data uses including secondary uses of data, e.g., by another agency, such as sharing data with Stats NZ to integrate into the Integrated Data Infrastructure (IDI).
- Language is important. Consider the content and tone.
- Design for accessibility and inclusivity, e.g., translate consultation documents into other languages and allow people to make verbal submissions.
- Don’t assume that communities understand what their data needs are, what data is available, how it can be used, or the level of data granularity required. Presenting product choices to a group can be a helpful approach instead.
- Consider appropriate reciprocity for people’s time and expertise.
Engage with key demographics and priority populations
- Engage with Māori and design specific engagement with mana whenua.
- Conduct more targeted engagement with those likely to be most impacted. These are the people you most need to hear from and where engagement effort should be targeted, rather than engaging with those who are easy to find.
- Engage with those who are traditionally misrepresented, under-represented or not present in the data (this is especially true for administrative data sources). For example, communities poorly served by government.
- We recommend a focus on working with minorities, with a rationale that what works well for minority groups will work well for the majority of people. For example, if products are developed that meet the specific needs of people with a disability, usability will be improved for everyone.
- Consider engaging with rainbow, disabled, Pacific, ethnic minorities, and communities that have low trust in Government, etc.
- Consider ways that people with lived experience can be involved in the design of new data processes and in research about them.
Data Collection
When collecting data from people it is important to allow for autonomy and to build trust
- Before collecting additional data, can you demonstrate what you are doing with the data you currently have?
- Don’t over-collect, i.e., don’t collect more data than is needed for the existing purpose.
- Carefully consider the choice of sampling methodology and recruitment to ensure representation of those who have not previously been well included in the data.
- Consider things such as 'if one person in the household shares their data, how could this impact on the household or group of households/groups they belong to?'
- Consider the attributes to be collected to allow for visibility and intersectionality e.g., avoid binary choices such as Māori and non-Māori.
- Creating diverse and representative datasets in research will contribute to more accurate insights, and ultimately, better outcomes for all.
- Be cautious when collecting ‘deficit based’ data about people. Ensure that this will not bias decisions against them in the future or result in stigmatisation.
- Be thoughtful and careful when seeking to collect information from people in very vulnerable situations, where this could be potentially risky for them, e.g., family violence. Consider what protective steps might be taken.
- Data quality issues can occur when staff enter ‘best guesses’ into databases rather than information provided by the individual, e.g., ethnicity. Wherever possible, allow people to self-identify.
Seek consent to collect data
- Ideally consent is given freely and voluntarily, after the person has been clearly informed of the purpose of the data collection and what will happen throughout the data lifecycle. It is given without undue influence and can be withdrawn without consequence. Consent needs to be collected and evidenced.
- Allow autonomy and choice by providing options and giving people the ability to not answer a question.
- Use clear language and a high level of transparency. Messaging should be user-friendly, accessible, and informative so that people understand what they are being asked to consent to.
- Consent is not achieved by providing a note in small print in a service agreement that states that personal and service use information will be collected and used for research/quality control purposes by government agencies/service funders.
- When it is not realistic or practical to give service users a choice about what happens to their data (the ideal), then other standards for ethical data use are going to become even more important. A lot more needs to be done in terms of justifying a research project in the absence of consent, especially around data management, local governance, and consultation with affected communities.
- Other situations that require thoughtfulness and care are when indirect information is collected. For example: when information is provided by others on individuals who may not provide consent if they were directly approached; adults providing information on children who are not consulted directly.
- Separate out consent for services and consent for data sharing. Both need to be sought, in a way that doesn’t prevent an organisation from doing its job.
- For further guidance on obtaining informed consent when running a survey, please refer to the section Getting informed consent.
Data sharing
Data sharing, especially with those who the data is from, can empower communities and enable benefits for New Zealanders, but sharing data needs to be done carefully to ensure that privacy and ethics are maintained. Consider whether sharing data is the right thing to do, or if not sharing data is the better decision. Is data sharing being done for a ‘good purpose’? Is identifiable data needed? When informed consent of the people whose data is being shared is absent, an ethics lens is especially critical to ensure harm is mitigated.
It is important to consider equity, transparency, use purpose, and data minimisation when sharing data
- Consider issues of equity and fairness when allowing access to data.
- Be transparent about what data is being shared, with whom, and for what purpose.
- The purpose the data will be used for should align with the purpose of the initial data collection.
- If this data is being shared with commercial entities, what benefit is flowing back to those whose data it is?
- Only share the minimal amount of data necessary, at the minimum level of data detail.
- Consider whether the organisation you are sharing data with may require support, e.g., to comply with standards such as the Information Sharing Standard.
Where the data is personal information, ensure protection of privacy and confidentiality – for both individuals and groups
The risk of re-identification:
- There is increasing ability to re-identify information and greater capability to reverse engineer de-identification processes (e.g., hashing).
- The risk of re-identification is greater when bringing together data from different data sources.
Consider technical ways of sharing information that minimise the risk of identifying individuals and groups, for example:
- Technical approaches that preserve privacy but enable data activities, e.g., data masking, homomorphic encryption (a form of encryption that allows computations to be performed on encrypted data without first having to decrypt it).
- There are options to be more permissive in how the data is used, while maintaining protective control of the data, e.g., receiving queries from entities and then releasing the aggregated output of those queries, not the data itself.
- Federated data models allow agencies and groups to retain control over their data and what data is shared.
- Temporary links to data can be used to tightly manage access.
- Differentiated privacy settings can enable entities to access different levels of data.
- Derived data products (with varying levels of granularity) can broaden access without exposing source microdata directly.
- Data tags can protect some data for specific use, while other data may be made accessible more readily.
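The ‘release aggregate outputs, not the data’ approach listed above can be sketched in a few lines. This is an assumed, simplified illustration: the data holder answers count queries and suppresses small cells that could identify individuals. The minimum cell size of 6 is a common convention used here for demonstration, not an official threshold.

```python
# Hypothetical sketch of aggregate-only query release with small-cell
# suppression. The threshold is an assumption for illustration.
MIN_CELL_SIZE = 6

def aggregate_query(records, group_key, predicate=lambda r: True):
    """Return group -> count over matching records, suppressing small cells."""
    counts = {}
    for record in records:
        if predicate(record):
            key = record[group_key]
            counts[key] = counts.get(key, 0) + 1
    # Release counts only; cells below the threshold are suppressed rather
    # than exposing exact low counts that could identify individuals.
    return {k: (v if v >= MIN_CELL_SIZE else "suppressed")
            for k, v in counts.items()}

records = [{"region": "North"} for _ in range(20)] + \
          [{"region": "South"} for _ in range(3)]
print(aggregate_query(records, "region"))
# → {'North': 20, 'South': 'suppressed'}
```

The key property is that the requester never receives the records themselves, only the vetted aggregate, so the data holder retains protective control as described above.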
Working with data of a group
It can be difficult to balance publishing information that is valuable and has an impact while ensuring no harm could happen to the group in question, including the risks of identifiability and mis-identifiability. This is a complex problem and public policy challenge. It is important to build and enhance trust while reducing or eliminating harm. It’s about balance and affording as much privacy as possible to individuals—especially around sensitive matters—without restricting the rights and interests of others or impeding a common good.
It is difficult to describe a group or assign individuals to a collective, since the boundary of any group can be difficult to draw and individuals belong to many intersecting groups.
Questions to consider when defining a group:
- How are the boundaries of the group defined?
- Is the group self-defined or algorithmically determined? How does the community define themselves?
- Just because different individuals share certain characteristics, does that mean they belong to the group?
- Does the group have agency around shared interests and goals?
- Who can legitimately speak for the group? Some communities have recognised authorities to speak on their behalf (e.g., iwi leaders), but some don’t (e.g., LGBTQI+).
- Who owns the data of a community (e.g., Māori data sovereignty)?
Consider risks and mitigations
The nature and extent of any outcome depends on how sensitive an individual is to being identified; that sensitivity determines the consequences.
There are wider issues around the potential for group harms in big data than just potential breaches of privacy, or ‘outing’ an individual as a member of a group (or inferring that they are with a reasonable degree of confidence). Consider whether the group is asserting a moral right: that it is wrong to infer or reveal anything about individuals in the group on the basis that they share (sensitive) characteristics with others.
We can’t be confident that the same safeguards that protect privacy can mitigate these other risks. Be alert, have guardrails in place and be prepared, all while taking a learning-system approach. Treat group privacy risks explicitly when publishing high impact statistics, especially at intersections of marginalised groups.
Data should be objective. Deficit statistics and stories lead to harm. Questioning who is in control of the narrative and what data is being sought and exposed is critical. Allow people who understand the community being researched to give contextual understanding to the data being used.
Note that:
- Identifiability issues do not only apply to small groups.
- There is a distinction between what people can observe and infer, versus the data that is published and the likelihood of exposure at the intersection of two marginalised groups.
- There is a need for careful consideration when a group already feels targeted by society. To protect and reduce harm for groups that face bias, stigma and discrimination, it might be helpful to think about where this duty of protection comes from. For example, it may not be coming only from privacy rights.
- Questions about fairness will arise if the perception becomes that one group is more deserving of active protection than others.
- The Government’s commitment to the United Nations Convention on the Rights of Persons with Disabilities (UNCRPD) provides the mandate to legitimise disability representation.
Possible approaches:
- One approach would be to engage with one group as a test case, working to understand whether standard or bespoke processes and guardrails would be needed.
- Don’t assume that communities understand their data needs, what data is available, how it can be used, and the level of data granularity required. Presenting product choices to a group can be a helpful approach.
- Protocols are needed to ensure the Five Safes are always applied, while also considering, case by case, the problem definition and any possible implications. Be context-sensitive and vigilant about ramifications.
- This is a multi-dimensional problem where it is hard to find a solution and therefore it can be helpful to look at a ‘near adjacent’ for part-solutions, such as the data approaches of Māori communities.
- It is important that this work is done ‘with and by’ communities, rather than ‘for’, or ‘to’ communities. Approaches include evolving technical and social relationship elements.
- Weigh privileged access for certain groups as a potential safeguard in cases where public release may create risk, alongside Five Safes ‘safe outputs’ controls.
- Efforts in good faith towards transparency and engagement will build trust.
Oversight of data sharing
When sharing data, it is important to maintain oversight.
- Retain a high level of visibility to ensure that there is no ‘loss-of-sight’ of the data shared and the obligations of the third party.
- Determine when shared data must be deleted by the recipient, along with a confirmation process to ensure that deletion of the data has occurred, e.g., certification.
- Have clear roles and responsibilities for each party so if a privacy breach occurs, everyone knows what is needed, including who will notify that there has been a breach and how to address harm to the individuals whose personal information is involved.
Designing data products and services
When designing data products and services:
- Anchor designs on values like transparency, usefulness, resilience, equitable access, incrementalism, and clear principles to resist bad actors.
- Use new technologies carefully and be watchful of data bias.
- Incorporate an ethical framework.
- Realise the local value in data and ensure data is easily accessible in a local context.
- Distinguish clearly between general ‘accessibility’ and ‘accessibility for disabled people’. Engage disability experts early for advice and co-design.
- Learn with each increment and adapt as necessary.
Specific cases
Social investment
To support an ethical approach to social investment, DEAG advises that service users are always put first. To minimise harm and maximise the value for communities the following practices should be considered:
- Seek consent to collect data
- Provide training for NGOs on ethical data collection, e.g., consider an auditable certification regime – this could be online.
- Provide a practical risk framework, for example, a ‘traffic light system’ where ‘green’ (low risk) could be managed by following good practice exemplars; ‘orange’ (medium risk) would indicate a pathway for engagement; and ‘red’ (high risk) should not be contemplated.
- Develop a multi-language, standardised information brochure for service users to clearly outline what their decisions would mean, including how identifying information will be removed.
- Wrap a learning system around the approach.
- Consider establishing an independent group to monitor and determine if objectives, informed consent, data quality and safety, are achieved in the process.
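The ‘traffic light system’ proposed above can be sketched as a simple triage function. This is purely illustrative: the numeric risk scale and thresholds are assumptions for demonstration, not DEAG policy, and any real framework would define its bands through the engagement processes described in this guidance.

```python
# Illustrative sketch of the 'traffic light' risk triage described above.
# The 0-10 risk scale and band thresholds are assumed for demonstration.
def triage(risk_score: int) -> str:
    """Map a hypothetical risk score (0-10) to a traffic-light pathway."""
    if not 0 <= risk_score <= 10:
        raise ValueError("risk_score must be between 0 and 10")
    if risk_score <= 3:
        return "green: proceed, following good-practice exemplars"
    if risk_score <= 7:
        return "orange: pause and follow the engagement pathway"
    return "red: do not proceed"
```

A coded triage like this is only as good as the scoring behind it; the guidance above makes clear that ‘red’ cases should not be contemplated, so the value of the sketch is in forcing explicit, auditable band definitions.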
Artificial intelligence (AI)
Moral responsibility is needed in this space where people’s lives are impacted. Technology can be misused where ‘what is a tool to someone can be a weapon to someone else’. Data standards, data quality, data governance, data protection, and privacy are of critical importance. Ask: is this technology necessary?
Responsible AI governance is needed
- Use a responsible AI Governance framework
- Conduct an AI Impact Assessment (AIA)
- Rigorously test before implementing models. Consider testing external-facing models, like chatbots, internally first and then with testers that reflect the diversity of potential users.
- Monitor downstream effects for bias or model drift and understand what is and isn’t working for users.
Accuracy and reliability
- Accuracy: Consider accuracy in relation to AI, both in terms of timeliness and veracity. Check all information before it is relied on. Test and report on the accuracy of data. An ability to tag and preserve the original base data is needed. Archive old content so that it is not available to AI models.
- Hallucinations: Clearly define ‘hallucinations’. These will be unavoidable when using genAI and need to be called out as a key risk. Erroneous information and responses will affect trust and confidence.
- Fake information: Generative AI has the potential to create fake information at scale. This is a key concern to be aware of.
Consider risks associated with security, privacy and potential overreach
- Privacy by design is critical along with knowing where data is stored and retained – could it be open to misuse?
- Security concerns exist, including:
- the potential for adversarial attacks to identify training data
- the potential to train the AI model to act in ways it was not meant to
- the rule of law within the countries that overseas suppliers may be operating out of, and the history of response by the suppliers to issues of jurisdiction, and how they might protect New Zealand organisations.
- Consider the potential for profiling and surveillance capabilities of a system, including its ability to build a comprehensive picture of a person’s movements (e.g., Automatic Number Plate Recognition) and the privacy risks associated with this, along with impacts on trust and confidence in Government.
- Identify if ‘scope creep’ is a risk, where technology could be applied beyond the current use case being assessed.
Trialling AI tools and approaches
- Ask whether the use of AI could erode trust.
- Ensure humans are in control.
- When trialling multiple technologies consider separating the technologies for risk assessment.
- Consider including the status quo in test scenarios.
- Build in learning to evaluate and understand both the technical behaviour and the human response and impact, e.g., does the approach elevate unconscious bias?
- Consider non-functional requirements such as safeguards that need to be implemented.
- Where a particular demographic or group chooses to opt out because AI is involved, consider what the possible implications could be and address them accordingly.
- Be transparent and provide both simple messaging and more technical documents to meet different people’s needs for understanding. Messaging should be up front and easy to find.
- Share learning with other government agencies.
Assess for potential bias and aim for equity
- Training data: It is important to understand the data that an AI model is trained on. Models trained on overseas data (e.g., most ‘off the shelf’ offerings) will need to be customised for the New Zealand context by the inclusion of relevant training data. Clarity of ownership over this training data is also essential. High impact areas like the NZ Health system, especially need data that reflects the diversity of New Zealanders. Facial recognition AI is a known case where the composition of the training data does not reflect NZ demographics and consequently the AI has the potential to not recognise certain NZ demographics, leading to false positives.
- Māori and Māori worldview: Generative AI doesn’t presently cater to a Māori consideration of data, which means inherent biases exist in AI against Māori and the Māori worldview. There are no checks that data is accurate or authentic. Cultural differences need to be acknowledged; failing to do so could lead to cultural harm.
- Automation bias: Automation bias is the human tendency to rely on machine outputs, often at the expense of their own judgement. For example, it appears there may have been automation bias at play in the recent Rotorua case[1] where facial recognition technology used by Foodstuffs North Island Limited incorrectly identified a person as a shoplifter. There is a need for agencies to appropriately train the staff involved in monitoring AI decisions.
- Representation in the data: The risks surrounding AI may not be shared equally amongst all citizens and populations, for example, those that may be personally identifiable within data sets, or those that may not be within any data sets.
Investigate data provenance and protect against cultural appropriation or breaches to copyright
- There are risks associated with cultural appropriation where generative AI may inappropriately use mātauranga Māori, which has an intellectual property part and also a cultural responsibility part.
- Be authentic and treat the information with integrity; meaning an acknowledgement that the data belongs to the source, rather than the collector or repository of the data.
Ensure transparency and explain decisions made when using AI tools
- Name and describe the genAI tools being used, what data is collected, where data will be stored and what protections exist. Seek feedback to test the comfort levels of users.
- When developing genAI models, include links to where the information has been retrieved from so that users can check the original sources.
- Government automated decision-making can significantly impact the public. There is a need for people to be able to understand how those decisions are made so they can challenge decisions that unfairly impact them.
- Consider carefully when and where an AI model requires a human-in-the-loop, and which decisions need to be made by a human.
- When testing or using AI tools in public (e.g., facial recognition tools), display easy-to-understand signage advising that the tool is in operation, including alignment with the Privacy Act 2020. Consider using signage in multiple languages.
Provide training and resources to lift the capability to work with AI
- Provide training opportunities for staff so they understand that generative AI is a sophisticated predictive text system, and how best to use it, especially with sensitive data.
- AI skills training and awareness are needed for executives and leadership teams and those workforces where there are going to be significant changes due to AI and automation. Different business sectors may require different support.
- Resources are needed to demystify AI and engage people, with a focus on data ethics.
AI procurement considerations
There are some key differences between procuring AI and procuring other technologies, and smaller government departments rely heavily on larger departments to undertake due diligence as part of their procurement processes.
As the Government is the largest technology procurer in New Zealand, there is also an opportunity to influence and encourage appropriate AI governance by suppliers through the procurement process. For example, requiring vendors to demonstrate how they are taking steps to manage accuracy, privacy, bias and explainability risks will help encourage a greater focus on minimising the well-established risks of AI.
Given New Zealand’s size and the common issues faced by all agencies (e.g., the need for local configuration and testing, sourcing representative training data, safe deployment), there is a need to work together. Cross-agency consistency in pre-procurement, ethical assessment, and approval processes is essential. This is especially so in the context of the delegation powers for microdata access under the Data and Statistics Act (where the Government Statistician can give access to microdata to other agencies) and potential uses of AI over microdata in different environments.
There is value in sharing case studies of where things have gone wrong elsewhere to bring the potential risks to life, for example, the UK Post Office Horizon scandal[2], Australia's Robodebt scheme[3], and the Dutch child welfare scandal[4].
Pre-procurement: Analysis and evaluation phase. Do we need it? Does it need to be AI?
- Only procure AI when it provides the best solution to the problem, not simply because it is 'shiny new technology'.
- Include appropriate evaluation criteria as part of procurement due diligence processes to understand potential risks e.g., privacy, security, bias, transparency, explainability, etc.
- Assess whether the AI is fit for New Zealand purposes, including accessibility considerations such as using appropriate language for disabled people.
During the procurement process: Require prospective vendors to communicate what they are doing to manage AI risks.
- Compare providers to identify how they contribute to equity, benefits, and relevance to New Zealand systems. It is important to foster local innovation and to avoid innovations that underserve New Zealanders by relying on overseas models and algorithms.
- Consider if there are options to collaborate with private entities and share benefits.
- Ensure that both the procurers and users of AI understand their responsibilities. Not all responsibility lies with the supplier.
- Require the completion of privacy and AI impact assessments by both the AI entity and the procuring agency.
- Understand the governance practices for these commercial entities in the use of AI and in managing bias, transparency, etc.
- Be aware of the risks of closed-source AI, e.g., vendor lock-in.
Post-procurement and governance: Implement and monitor ethical responses within the agency and with the vendor.
- Continual monitoring and testing are needed to ensure that the model does not drift, and that bias is not present.
- Make audit principles explicit upfront to set clear expectations for the public (note that the Global Partnership on Artificial Intelligence (GPAI) is currently developing audit requirements).
- Understand the appropriate place for a human-in-the-loop.
- Provide training for those who will use the AI.
Getting informed consent
Steps to support informed consent when developing a survey
- Create a clear Information Sheet containing the key information, rather than embedding it in the survey itself where it could easily be lost.
- Add Consent Statements at the start of the survey that convey exactly what participants are clicking 'Agree' to, i.e., that they have read and understood the Information Sheet, have had the opportunity to ask questions, know that they can withdraw, and, in cases of data matching, agree to this (e.g., that their responses will be linked to a limited set of variables for the purposes of this study only).
- Identify the parties involved in collecting, processing and managing the data, including branding logos.
- Identify the project leader or contact person by name and provide full contact details, not a generic email address. Whom should participants contact if they have questions or complaints, or wish to withdraw?
- Consider providing a contact for Māori Cultural Support, and any other counselling or support that may be helpful, depending on the nature of the questions being asked.
- Explain how the individual's contact address has been obtained.
- Detail the anticipated benefits, explaining who will be interested in the results. Some history of the survey and the gains that have resulted would provide valuable context.
- Insert a withdrawal statement, providing an indicative date up to which it is possible to withdraw data identifiable as belonging to a person, even if they have completed and submitted the survey.
- Explain any consultation on the data collection/survey approach.
- Provide reassurance where possible, e.g., when data will be deleted, who will have access to it, whether it will be de-identified, what it will and won't be used for, and that it won't be used for any other purpose.
- Distinguish the management of identifiable and coded/anonymised data (access, storage, destruction).
- Consider using a diagram or flow chart for those who are more visual, to show what happens to their data, e.g., the data sources, the flow of data, any sharing of data with other parties, and the various gates data moves through for linking, de-identifying, etc. over the study lifecycle.
Recommended reading
A lot of good work has been done over the years to understand how New Zealanders feel about their data being collected and used for varying purposes. The following resources have been recommended by DEAG members for anyone considering new uses of data:
- Māori Data Governance Model[5] – Te Kāhui Raraunga. Designed by Māori data experts for use across the Aotearoa New Zealand public service.
- Data Futures Partnership – an independent advisory body tasked by the government to draft guidelines for both private and public organisations wanting to use people's personal data.
Similar resources include references [6] to [11] below.
References
[1] Supermarket facial recognition failure: why automated systems must put the human factor first (theconversation.com)
[2] Post Office Horizon IT Inquiry (postofficehorizoninquiry.org.uk)
[3] Royal Commission into the Robodebt Scheme; 'Robodebt: Illegal Australian welfare hunt drove people to despair'
[4] Dutch scandal serves as a warning for Europe over risks of using algorithms – POLITICO
[5] Māori Data Governance Model – Te Kāhui Raraunga (PDF)
[6] Data Futures Partnership overview (nzdf-partnership-overview.pdf)
[7] Our Data, Our Way
[8] Sharing information for wellbeing : M... – National Library of New Zealand
[9] A Path to Social Licence: Guidelines for Trusted Data Use (August 2017, PDF)
[10] What you told us: findings of the 'Y... – National Library of New Zealand
[11] Towards trustworthy and trusted automated decision-making in Aotearoa.
Appendix: Advice to specific agencies
The Data Ethics Advisory Group provided the following advice to government agencies on issues relating to data use and innovation.