Research Platform Services Wiki

If you're a researcher, we'll help you do stuff.


data_management:mediaflux:howto:downloaddata:indirect_shareable_links
  • This means somebody else wants to share data with you and they have sent you a URL to paste in your browser (or use as an argument to Unix tools like wget or curl)
  • When you paste it into your browser and press return, this will activate a download process.
  • The difference from direct shareable links is that, rather than downloading the data directly, the link downloads a zip file holding scripts/applications (perhaps a download manager) that will in turn download the data for you.
  • With this approach, the download can be more efficient and robust:
    • Can be restartable
    • Can download in parallel in some cases
  • After download, unpack the container (it will be a standard zip file) to access the scripts
  • We offer two types of data download scripts at present (and the person provisioning your link will have discussed with you which is more suitable). We also offer scripts for Unix and Windows (and both flavours will be in the downloaded zip file).
    • ATERM wrapper
      • These wrappers need Java 8 installed on the computer you are running them on (contact your local IT if you need help)
      • They utilise the ATERM download command described in Section 1 above
      • All you need to do is execute the script (on Unix systems you can make it executable [chmod +x <my script>] or use the source command to execute it).
      • The script will then fetch the ATERM Jar file and use it to download your data - it will be held in a temporary directory that is destroyed after it finishes.
      • You can specify the output directory as the only argument to the script if you want. By default the data go into the working directory. The person who provisioned the shareable link will have decided how many parallel threads to use (typically at least 2 and not more than 4). You can change that by editing the script if you want.
      • If the process fails, e.g. your network drops out, you can restart it. The application will skip files it has already downloaded (checking for them takes a little time, of course).
    • Shell Wrapper
      • In these scripts, each asset is downloaded from Mediaflux with one line of the script per asset.
      • They use pre-installed tools like curl and wget on Unix systems and powershell on Windows systems. So Java is not required.
      • These kinds of scripts cannot download data in parallel.
      • If the process fails, e.g. your network drops out, you can restart it. The script will skip files it has already downloaded (checking for them takes a little time, of course).
    • When launching scripts (shell or ATERM wrappers) that may need to run for a long time (for example, when downloading a lot of data), we recommend that on Unix systems you prefix the command with nohup (which ignores terminal hangups) and run it in the background, so that you can log out if you want. The log will be stored in a file called nohup.out
      • nohup myscript &
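
The nohup launch described above can be sketched as a small shell helper. This is only an illustration: run_detached is a hypothetical name that is not part of the downloaded zip file, and myscript stands in for whichever wrapper script you unpacked. The output is redirected to nohup.out explicitly, because nohup only creates that file by itself when it is started from a terminal.

```shell
#!/bin/sh
# Hypothetical helper (not shipped in the zip file): start a wrapper script
# so that it keeps running after you log out.
# Usage: run_detached <script> [output-directory]
run_detached() {
    script=$1
    shift
    # A bare name such as "myscript" is not searched for in the current
    # directory, so turn it into "./myscript".
    case "$script" in
        */*) ;;
        *) script="./$script" ;;
    esac
    # Make the script executable (chmod +x), as described for Unix systems.
    chmod +x "$script" || return 1
    # nohup ignores terminal hangups; '&' runs the job in the background.
    # The log is written to nohup.out explicitly so that it exists even
    # when the command is not started from an interactive terminal.
    nohup "$script" "$@" > nohup.out 2>&1 &
    echo "started $script with PID $!"
}
```

For example, run_detached myscript /data/downloads (where /data/downloads is a placeholder output directory) starts the download, and tail -f nohup.out lets you follow its progress.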
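
The restart behaviour of the shell wrappers described above can be illustrated with a minimal sketch of the one-line-per-asset pattern. It is an assumption about how such a script could be structured, not the actual script generated for your link; the function name fetch_asset, the URLs and the file names are all hypothetical.

```shell
#!/bin/sh
# Sketch only: NOT the script generated for your shareable link.
# Usage: fetch_asset <url> <destination-file>
fetch_asset() {
    url=$1
    dest=$2
    # Restart support: skip anything that is already present and non-empty.
    if [ -s "$dest" ]; then
        echo "skip $dest (already downloaded)"
        return 0
    fi
    # Download to a temporary name first, so that an interrupted transfer
    # is never mistaken for a complete file on the next run.
    if curl --fail --silent --show-error --location \
            --output "$dest.part" "$url"; then
        mv "$dest.part" "$dest"
        echo "done $dest"
    else
        rm -f "$dest.part"
        echo "FAILED $dest" >&2
        return 1
    fi
}

# One line per asset, downloaded one after another (no parallelism).
# Placeholder URLs; a real script would list your actual assets.
# fetch_asset "https://mediaflux.example.org/asset/1" "file1.dat"
# fetch_asset "https://mediaflux.example.org/asset/2" "file2.dat"
```

Because every line checks for an existing file before contacting the server, re-running the whole script after a network failure only transfers the assets that are still missing.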
data_management/mediaflux/howto/downloaddata/indirect_shareable_links.txt · Last modified: 2018/06/12 14:43 by Neil Killeen