
How to export all datasets from data.govt.nz into a CSV

Using the ckanapi-exporter tool, you can extract the metadata for every dataset on data.govt.nz into a single CSV file.

Learning outcomes

  • Understand the tools you will need to extract all dataset metadata from data.govt.nz.
  • Learn the steps required to carry out the extract.
  • Configure the columns of the extracted dataset.

Requirements

Steps

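  1. Install the requirements. ckanapi-exporter is a Python package, so it can typically be installed with pip install ckanapi-exporter.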
  2. Create a columns.json file. It defines a preset list of data properties to extract from the data.govt.nz API, and you can customise it if required. See below for the file contents.
  3. Run the following command in your terminal:
ckanapi-exporter --url 'https://catalogue.data.govt.nz' --columns columns.json > datasets.csv
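
Once the export finishes, datasets.csv can be opened in any CSV-aware tool. As a minimal sketch, assuming the file was produced with the columns.json on this page (so it has an Agency column), the following Python counts how many datasets each agency has published. The function name count_datasets_by_agency and the "(no agency)" label are illustrative choices, not part of ckanapi-exporter.

```python
import csv
from collections import Counter


def count_datasets_by_agency(csv_path, agency_column="Agency"):
    """Return a Counter mapping each agency name to its number of datasets.

    Assumes csv_path is a CSV with a header row that includes agency_column,
    as produced when columns.json defines an "Agency" column.
    """
    counts = Counter()
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            # Rows with an empty or missing agency are grouped under a placeholder.
            counts[row.get(agency_column) or "(no agency)"] += 1
    return counts


# Example usage, after the export has been run:
#     for agency, total in count_datasets_by_agency("datasets.csv").most_common(10):
#         print(total, agency)
```

The csv module is used rather than splitting on commas by hand, because fields such as Description can themselves contain commas and line breaks.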

columns.json

{
 "Title": {
     "pattern": "^title$"
 },
 "Agency": {
     "pattern": ["^organization$", "^title$"]
 },
 "URL": {
     "pattern": "^url$"
 },
 "CatalogueCreated": {
     "pattern": "^metadata_created$",
     "max_length": 10
 },
 "CatalogueLastUpdated": {
     "pattern": "^metadata_modified$",
     "max_length": 10
 },
 "DatasetCreated": {
     "pattern": "^issued$",
     "max_length": 10
 },
 "DatasetLastUpdated": {
     "pattern": "^modified$",
     "max_length": 10
 },
 "FrequencyOfUpdate": {
     "pattern": "^frequency_of_update$"
 },
 "Rights": {
     "pattern": "^license_title$"
 },
 "FormatsAvailable": {
     "pattern": ["^resources$", "^format$"],
     "case_sensitive": true,
     "deduplicate": true
 },
 "Description": {
     "pattern": "^notes$"
 },
 "Tags": {
     "pattern": ["^tags$", "^display_name$"]
 },
 "Groups": {
     "pattern": ["^groups$", "^display_name$"]
 },
 "AgencyContact": {
     "pattern": "^author$"
 },
 "AgencyContactEmail": {
     "pattern": "^author_email$"
 },
 "AgencyContactPhone": {
     "pattern": "^author_phone$"
 },
 "DatasetContact": {
     "pattern": "^maintainer$"
 },
 "DatasetContactEmail": {
     "pattern": "^maintainer_email$"
 },
 "DatasetContactPhone": {
     "pattern": "^maintainer_phone$"
 },
 "PermanentIdentifier": {
     "pattern": "^id$"
 },
 "SourceIdentifier": {
     "pattern": "^source_identifier$"
 }
}
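
Each entry in columns.json names a CSV column and gives a regular expression, or a list of them, that selects a value from each dataset's metadata. A list describes a path into nested data: ["^organization$", "^title$"] means "the title inside organization". The Python below is a simplified illustration of that idea only. It is not the tool's actual implementation, the extract function and the sample package are hypothetical, and the handling of case_sensitive, deduplicate and max_length reflects an assumed reading of those options.

```python
import re


def extract(value, patterns, case_sensitive=False):
    """Follow a chain of key patterns through nested dicts and lists.

    Returns a list of every value reached by matching the patterns in order.
    """
    if isinstance(patterns, str):
        patterns = [patterns]
    if not patterns:
        return [value]  # end of the path: this is a selected value
    if isinstance(value, list):
        # Apply the remaining patterns to every item (e.g. each resource).
        results = []
        for item in value:
            results.extend(extract(item, patterns, case_sensitive))
        return results
    if not isinstance(value, dict):
        return []
    flags = 0 if case_sensitive else re.IGNORECASE
    results = []
    for key, child in value.items():
        if re.search(patterns[0], key, flags):
            results.extend(extract(child, patterns[1:], case_sensitive))
    return results


# A made-up dataset record, shaped like CKAN package metadata.
package = {
    "title": "Example dataset",
    "organization": {"title": "Example Agency", "name": "example-agency"},
    "resources": [{"format": "CSV"}, {"format": "XLSX"}, {"format": "CSV"}],
    "metadata_created": "2018-09-25T09:12:00.000000",
}

print(extract(package, "^title$"))                      # ['Example dataset']
print(extract(package, ["^organization$", "^title$"]))  # ['Example Agency']

formats = extract(package, ["^resources$", "^format$"], case_sensitive=True)
print(list(dict.fromkeys(formats)))                     # deduplicated: ['CSV', 'XLSX']

# Cutting a timestamp to 10 characters keeps only the date part.
print(extract(package, "^metadata_created$")[0][:10])   # 2018-09-25
```

The ^ and $ anchors matter: without them, a pattern such as "title" would also match a key like license_title.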

Last Updated: 25/09/2018 9:12am