Tools

Dataset Tools provides crystallographers with a means of depositing raw diffraction data into institutional digital repositories for worldwide download.

The current implementation is compatible with Fedora repositories version 2.2 and above, and runs on Windows, OS X and Linux.

Note: The Java 6 JRE is required as Java 5 has trouble handling large files.

The Java-based package consists of four programs:

Project Descriptor
Allows a crystallographer to enter basic information about a project, such as title, author and citation. Once executed, Project Descriptor creates a Fedora-compatible XML description file, conforming to the Metadata Encoding and Transmission Standard (METS), to be ingested into Fedora along with the data.
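
To give a feel for what a METS description file looks like, here is a minimal sketch in Java that wraps a title and an author in a METS descriptive metadata section. It is not Project Descriptor's actual output: the class name MetsSketch, the method buildMets, the use of Dublin Core elements for title and author, and the ID value DMD1 are all assumptions made for illustration. Only the METS and Dublin Core namespace URIs are standard.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical illustration; not the layout Project Descriptor really writes.
class MetsSketch {
    static final String METS_NS = "http://www.loc.gov/METS/";
    static final String DC_NS = "http://purl.org/dc/elements/1.1/";
    static final String XMLNS_NS = "http://www.w3.org/2000/xmlns/";

    public static String buildMets(String title, String author) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder().newDocument();

        // Root element, with both namespace prefixes declared explicitly.
        Element mets = doc.createElementNS(METS_NS, "mets:mets");
        mets.setAttributeNS(XMLNS_NS, "xmlns:mets", METS_NS);
        mets.setAttributeNS(XMLNS_NS, "xmlns:dc", DC_NS);
        doc.appendChild(mets);

        // Descriptive metadata section: dmdSec > mdWrap > xmlData.
        Element dmdSec = doc.createElementNS(METS_NS, "mets:dmdSec");
        dmdSec.setAttribute("ID", "DMD1"); // assumed identifier
        mets.appendChild(dmdSec);

        Element mdWrap = doc.createElementNS(METS_NS, "mets:mdWrap");
        mdWrap.setAttribute("MDTYPE", "DC"); // Dublin Core payload (assumption)
        dmdSec.appendChild(mdWrap);

        Element xmlData = doc.createElementNS(METS_NS, "mets:xmlData");
        mdWrap.appendChild(xmlData);

        Element dcTitle = doc.createElementNS(DC_NS, "dc:title");
        dcTitle.setTextContent(title);
        xmlData.appendChild(dcTitle);

        Element dcCreator = doc.createElementNS(DC_NS, "dc:creator");
        dcCreator.setTextContent(author);
        xmlData.appendChild(dcCreator);

        // Serialize the DOM tree to an indented XML string.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(buildMets("Example project", "A. Crystallographer"));
    }
}
```

A real description file would carry more than this (the citation, for instance, and references to the deposited files), so treat the sketch only as an outline of the METS wrapper structure.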


Dataset Packager
Performs several procedures on a set of images. ‘Packaging’ a dataset means converting a set of diffraction images (a dataset) into a repository-suitable format, complete with technical metadata that describes the image set. Once packaged, the images are scanned and instrument metadata is extracted, calculated and written to XML conforming to our own datasets schema.


Project Depositor
A user specifies a project description file created with Project Descriptor, together with a directory containing their packaged datasets and any other files they wish to include. All files are then deposited into a nominated repository, complete with metadata exposed for eventual indexing by this website.


Dataset UnPackager
Dataset Unpackager extracts the diffraction images from files created with Dataset Packager.
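
The pack and unpack round trip described above can be sketched as follows. This page does not describe the container format that Dataset Packager actually writes, so the sketch assumes a plain ZIP archive purely for illustration. The class name PackageSketch, the methods pack and unpackImages, and the entry name metadata.xml are likewise hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical illustration; ZIP is an assumed container, not the real format.
class PackageSketch {
    static final String METADATA_ENTRY = "metadata.xml"; // assumed entry name

    // Bundle the metadata XML and every image into one in-memory archive.
    public static byte[] pack(Map<String, byte[]> images, String metadataXml)
            throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buffer)) {
            zip.putNextEntry(new ZipEntry(METADATA_ENTRY));
            zip.write(metadataXml.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
            for (Map.Entry<String, byte[]> image : images.entrySet()) {
                zip.putNextEntry(new ZipEntry(image.getKey()));
                zip.write(image.getValue());
                zip.closeEntry();
            }
        }
        return buffer.toByteArray();
    }

    // Recover the images only, skipping the metadata entry.
    public static Map<String, byte[]> unpackImages(byte[] archive)
            throws IOException {
        Map<String, byte[]> images = new LinkedHashMap<>();
        try (ZipInputStream zip =
                new ZipInputStream(new ByteArrayInputStream(archive))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.getName().equals(METADATA_ENTRY)) {
                    continue;
                }
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] chunk = new byte[8192];
                int n;
                while ((n = zip.read(chunk)) != -1) {
                    out.write(chunk, 0, n);
                }
                images.put(entry.getName(), out.toByteArray());
            }
        }
        return images;
    }
}
```

Real diffraction datasets are far too large to hold in memory, so a practical version would stream from and to files on disk. Byte arrays are used here only to keep the round trip short.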


Workflow

Workflow Diagram

For a live demonstration of a typical workflow, we recommend downloading Dataset Tools and running through the example included in the package.

Download

The download includes a partial diffraction dataset as an example. The readme.txt also includes a guide to setting up a compatible Fedora repository.

Click here to view a tutorial video


Mac OS X - DatasetTools.tar.gz (64.4 MB)

Linux - DatasetTools.tar.gz (64.4 MB)

Windows - DatasetTools.zip (64.4 MB)