Making Uploading Open Data Easier - Zip Extractor Tool

Allan Barger | 23 Feb 2016

The zip extractor tool gives data custodians a more effective avenue to publish open data. This opt-in tool automatically searches for and extracts ‘interesting’ files from zipped resources. Nominated interesting files currently include:

The zip extractor follows some simple rules:

  1. Multiple zip files within a single dataset can be enabled for automatic extraction.
  2. There must be more than one interesting file per zip for the extractor to run.
  3. If the zip extractor encounters a folder inside a zipped file that contains ‘interesting’ files, it will extract the folder as an additional zipped resource.
  4. Any additional extracted files will be placed at the bottom of the resource list of a dataset.
  5. New resource names will reflect their filename.
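
The rules above can be sketched in a few lines of Python. This is only an illustration of the described behaviour, not the portal's actual implementation. The extension set, the function name plan_extraction and the "folder name plus .zip" naming for extracted folders are all assumptions made for the example.

```python
import io
import zipfile
from pathlib import PurePosixPath

# Hypothetical set of 'interesting' extensions, chosen for illustration only.
# The portal's real nominated list is not reproduced here.
INTERESTING_EXTENSIONS = {".csv", ".json", ".xml"}


def is_interesting(name: str) -> bool:
    """Return True if the file extension is in the (assumed) interesting set."""
    return PurePosixPath(name).suffix.lower() in INTERESTING_EXTENSIONS


def plan_extraction(zip_bytes: bytes, existing_resources: list) -> list:
    """Return the dataset's resource list as it might look after extraction."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        interesting = [
            name for name in archive.namelist()
            if not name.endswith("/") and is_interesting(name)
        ]

    # Rule 2 as stated: the extractor only runs when the zip holds
    # more than one interesting file.
    if len(interesting) <= 1:
        return list(existing_resources)

    new_names = []
    for name in interesting:
        parts = PurePosixPath(name).parts
        if len(parts) > 1:
            # Rule 3: a folder with interesting files becomes one zipped
            # resource (naming it "<folder>.zip" is an assumption).
            resource_name = parts[0] + ".zip"
        else:
            # Rule 5: the new resource name reflects the filename.
            resource_name = parts[0]
        if resource_name not in new_names:
            new_names.append(resource_name)

    # Rule 4: extracted resources go at the bottom of the resource list.
    return list(existing_resources) + new_names
```

Under these assumptions, a zip containing a.csv, notes.txt and a maps folder with a CSV file inside would add two resources, a.csv and maps.zip, below the original zipped resource.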

Extracting ‘interesting’ files allows APIs to be created for correctly formatted data files.

This tool improves the workflow for data custodians when adding datasets with a large number of resource files. Instead of uploading each file individually, users can combine all related files into a single zipped upload. The tool will then automatically extract any interesting files.

The zip extractor is a scheduled task and runs daily across new resources at 18:00 AEST.
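
To estimate when a newly enabled resource will be processed, the next run time can be worked out from the 18:00 AEST schedule. The sketch below is an illustration only. The function name next_run is hypothetical, AEST is modelled as a fixed UTC+10 offset, and treating a resource enabled at exactly 18:00 as waiting until the following day is an assumption.

```python
from datetime import datetime, time, timedelta, timezone

# AEST modelled as a fixed UTC+10 offset (no daylight saving).
AEST = timezone(timedelta(hours=10), "AEST")
RUN_AT = time(18, 0)  # daily run time stated in the article


def next_run(enabled_at: datetime) -> datetime:
    """Return the next 18:00 AEST run after a timezone-aware timestamp."""
    local = enabled_at.astimezone(AEST)
    candidate = datetime.combine(local.date(), RUN_AT, tzinfo=AEST)
    # Assumption: a resource enabled at or after 18:00 waits for tomorrow's run.
    if candidate <= local:
        candidate += timedelta(days=1)
    return candidate
```

For example, a resource enabled at 09:00 AEST would be picked up at 18:00 the same day, while one enabled at 18:30 AEST would wait until 18:00 the next day. The timestamp passed in should be timezone-aware, since a naive datetime would be interpreted in the machine's local time.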

Enabling the Zip Extractor Tool

The zip extractor tool can be enabled for new resources at the Add Data screen by selecting the ‘Extract Resources from Zip Files’ tick box.

For existing resources, users can enable the extractor tool by:

  1. Browsing to the zipped resource
  2. Clicking the Manage button
  3. Selecting the ‘Extract Resources from Zip Files’ tick box

Extraction does not happen immediately: the files inside the zip are queued for the next scheduled run.