Research Platform Services Wiki


Downloading Data From Mediaflux

Because Mediaflux supports many protocols, there are many ways to access your data. For downloading, users fall into two categories:

  1. Those that have their own accounts (either local or via the Australian Access Federation) and can authenticate directly to Mediaflux.
  2. Those that don't have accounts. These users gain access via (usually temporary) secure tokens provisioned by somebody else. This mechanism also works for users who do have their own accounts; they just don't need those accounts in this instance.

1 Authenticated Users

This means you have an account and can log in directly to Mediaflux. The details on how to log in with the various protocols and interfaces are described under access mechanisms.

1.1 Generic Users

We recommend downloading data either with sFTP (if you are already familiar with it and have a client installed) or with the download client that UniMelb supplies (see below).
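For users who prefer to script an sFTP download, the following is a minimal sketch that drives the OpenSSH `sftp` command-line client in batch mode. The host name, user name and remote path are hypothetical placeholders, not real Mediaflux values; use the ones described under access mechanisms. Batch mode (`-b`) does not prompt for a password, so this sketch assumes non-interactive authentication (for example an SSH key) is available to you. If it is not, run `sftp` interactively and type the same `get -r` command.

```python
import subprocess
import tempfile
from pathlib import Path

# Hypothetical placeholders: substitute the host and account you were given.
SFTP_HOST = "mediaflux.example.org"
SFTP_USER = "your-username"


def quote(path):
    """Wrap a path in double quotes so that spaces survive sftp's parser."""
    if '"' in path:
        raise ValueError("paths containing double quotes are not handled in this sketch")
    return f'"{path}"'


def build_batch(remote_dir, local_dir):
    """Return sftp batch commands that recursively fetch remote_dir."""
    return f"get -r {quote(remote_dir)} {quote(local_dir)}\nbye\n"


def build_sftp_command(batch_file, user=SFTP_USER, host=SFTP_HOST):
    """Return the argument list for running sftp against a batch file."""
    return ["sftp", "-b", batch_file, f"{user}@{host}"]


def download(remote_dir, local_dir):
    """Write a temporary batch file and run sftp with it."""
    Path(local_dir).mkdir(parents=True, exist_ok=True)
    with tempfile.NamedTemporaryFile("w", suffix=".sftp", delete=False) as fh:
        fh.write(build_batch(remote_dir, local_dir))
    try:
        # check=True raises CalledProcessError if sftp exits with an error.
        subprocess.run(build_sftp_command(fh.name), check=True)
    finally:
        Path(fh.name).unlink()
```

Calling `download("/projects/proj-demo-1234", "data")` (an invented remote path) would then fetch that directory into `data`. Note that plain sFTP transfers files one at a time, which is one reason the download client is offered as an alternative.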

1.1.1 From the Vendor [Arcitecta]

1.1.2 From Research Platform Services, The University of Melbourne

Our clients can be downloaded from our Available downloads page.

  • download client - Efficient, restartable, parallel, synchronizable download client (one of several clients described on this page)
  • checking client - Efficient client that checks the local file system against assets in Mediaflux (one of several clients described on this page)
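To illustrate the idea behind checking a local copy against what the server holds, here is a minimal sketch. It is not the checking client itself and does not talk to Mediaflux: it assumes you already have a hypothetical manifest mapping each relative file path to its expected size in bytes, and it reports the files that are missing locally or whose size differs.

```python
from pathlib import Path


def find_missing_or_changed(local_root, expected):
    """Compare local files against a manifest of expected sizes.

    expected maps a relative path (POSIX style) to a size in bytes.
    Returns a sorted list of relative paths that are absent locally
    or whose local size differs from the manifest.
    """
    root = Path(local_root)
    problems = []
    for rel_path, size in expected.items():
        local = root / rel_path
        if not local.is_file() or local.stat().st_size != size:
            problems.append(rel_path)
    return sorted(problems)
```

A size comparison is cheap but only catches truncated or absent files; comparing checksums would be stricter at the cost of reading every file.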

2 Non-Authenticated Users

The following mechanisms mean that you don't need to have an account and don't need to log in directly to Mediaflux. You may of course still have an account; you just don't need to use it in these cases.

These mechanisms all authenticate to Mediaflux with a secure identity token. The token usually expires after a fixed time (the person who provisions your data will tell you how long) and is granted access only to the data that it needs.

Direct shareable links are fine for a small amount of data (they download serially into a container and are not restartable). The more complex indirect shareable links are much better for big data, as they don't pack the data into a container, can be restarted, and can download the data in parallel.

    • A direct shareable link downloads data to a container (e.g. a tar or zip container). Once that is downloaded, the user must unpack the container (so double the storage is temporarily required)
    • Direct links are not good for big data because if the process fails (e.g. a network interruption) you have to start again from the beginning, and because twice the storage is required
    • Direct shareable links can be provisioned by users via the Mediaflux Explorer (see the shareable link how to video on the Explorer how to videos page)
    • An indirect shareable link downloads a downloader (e.g. a script or application) which itself downloads the data
    • These are best for big data because they can restart, may be able to download data in parallel, and don't pack the data into a container (so you don't need double the storage)
    • Currently, indirect shareable links can only be provisioned by specific users using scripts provided by ResPlat
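The storage cost of a direct shareable link can be made concrete with a small sketch. It assumes the container has already been saved to disk from the link, and it uses only Python's standard library; the function names are illustrative and not part of any Mediaflux tool. The space check uses the "twice the container size" rule of thumb described above, which is an approximation (a compressed container may expand to more than its own size).

```python
import shutil
from pathlib import Path


def has_room_for(container_size, dest_dir):
    """True if dest_dir's file system has at least twice container_size free.

    The factor of two covers the container itself plus its unpacked contents.
    """
    free = shutil.disk_usage(dest_dir).free
    return free >= 2 * container_size


def unpack_and_remove(container, dest_dir):
    """Unpack a zip or tar container, then delete it to reclaim the space.

    Returns the sorted relative paths of the files found under dest_dir.
    """
    container = Path(container)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # The archive format is inferred from the file extension (.zip, .tar, ...).
    shutil.unpack_archive(str(container), str(dest))
    container.unlink()
    return sorted(p.relative_to(dest).as_posix() for p in dest.rglob("*") if p.is_file())
```

Checking `has_room_for` with the expected container size before starting the download avoids discovering the shortfall only at the unpacking step.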
data_management/mediaflux/howto/downloaddata.txt · Last modified: 2019/10/31 15:03 by Neil Killeen