data.govt.nz

Topic: Guidance on creating datasets (566 views)
  • Kimeros

    Guidance on creating datasets

    Hi,

    I haven't been able to find guidance on this website for government departments on best practice for creating datasets and the recommended formats.

    Could you please direct me to this information if it is available, or consider providing it if it is not?

    Regards,
    Kimeros.

  • Anthony

    Re: Guidance on creating datasets

    Hi Kimeros

    As part of this work, we'll be contributing to guidance on that for departments, which is to say the guidance doesn't exist yet. It will be created as part of cross-govt work, and we consider it high priority. In the meantime:

    Data.govt.nz indexes CSV, spreadsheets, XML, geospatial formats (increasingly; we are working with LINZ to get this right) and HTML. We index HTML because the benefits (data importance, ‘pastability’), at this stage, outweigh the disadvantages. We let people know where there's an API. We don’t index PDFs.

    When we talk to agencies we ask them to release or re-format their data in CSV and, if they want, as spreadsheets. Where they are providing HTML from databases, we recommend enabling users to export it in open formats. Providing feeds is also important.
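
    To make the "export in open formats" point concrete, here is a minimal sketch in Python of writing rows to CSV. It assumes the rows have already been fetched from a database; the function name, column names and values are made-up placeholders, not anything data.govt.nz prescribes.

```python
import csv
import io


def rows_to_csv(header, rows):
    """Serialise a header row and data rows to CSV text.

    Hypothetical helper for illustration only. The csv module quotes any
    field that contains a comma, quote or newline.
    """
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)  # default dialect: comma-separated, CRLF line endings
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# Placeholder data standing in for the result of a database query.
header = ["region_code", "region_name"]
rows = [["R1", "Region A"], ["R2", "Region B, North"]]

csv_text = rows_to_csv(header, rows)
```

    The same text can be offered as a download link next to the HTML table, so users get the data without having to copy it off the page.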

    We have found that the issues aren't so much technical as 'business': establishing who in the agency owns the data, what status it has (official, advisory, caveat-heavy or good-to-go), and sometimes who in the business pays for conversion, release, updating, etc. That's just for converting data, not releasing new data. The issues are often the same as those for getting agencies to use web standards.

    Other info:

    The site currently includes, or will include, the following broad types:

    * “Reference” datasets and quick-changing datasets. Reference: eg, region definitions, political borders. Quick-changing: eg, teacher registrations, real-time river flows, traffic reports.
    * Statistical and non-statistical. Statistical: eg, Pinus radiata logging prices. Non-statistical: eg, threatened environments classification.
    * "Authoritative" and non-authoritative data. Authoritative: eg, current account balance. Non-authoritative: eg, traffic webcams (API).

    Then there's the UK (RDFa etc) ...

    http://cloudofdata.com/2009/07/talking-with-mark-birbeck-about-rdfa-and-its-use-in-government/

    It would be good to get your thoughts on that.
