Exporting FaceBase data

1. Export Options

Available data files displayed on a page may be downloaded individually by clicking the link indicated by the download icon.

The Export button offers additional options, including “BDBag”.

To use the “BDBag” option, follow the instructions in the rest of this page.

2. Export the BDBag

To create the initial “BDBag”, you will need a FaceBase account. If you do not have one yet, click this link or the “Sign Up” link in the upper right corner of the navigation menu.

Go to the page with the data you want, click the Export button towards the upper right part of the screen and choose BDBag.

We’ll use this study as an example: https://www.facebase.org/chaise/record/#1/isa:dataset/RID=1-DY3G

A ZIP file (e.g. dataset_1-DY3G.zip) downloads to your local environment. It contains the metadata files describing the data, including the appropriate MD5 checksums, but not yet the actual data files.
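To see what the downloaded archive holds before fetching any data, a short Python sketch like the following may help. It assumes the ZIP follows the usual BagIt layout, in which a fetch.txt file lists one remote file per line as URL, length and filename; the function name summarize_bag_zip is hypothetical and is not part of the BDBag tools.

```python
import zipfile


def summarize_bag_zip(zip_path):
    """Return (member_names, fetch_entries) for a BDBag ZIP archive.

    Assumption: remote files are listed in a member named fetch.txt,
    one per line, as "URL LENGTH FILENAME" (the BagIt convention).
    """
    with zipfile.ZipFile(zip_path) as archive:
        names = archive.namelist()
        fetch_entries = []
        for name in names:
            # Match any archive member whose base name is fetch.txt.
            if name.split("/")[-1] != "fetch.txt":
                continue
            text = archive.read(name).decode("utf-8")
            for line in text.splitlines():
                parts = line.split(None, 2)
                if len(parts) != 3:
                    continue  # skip blank or malformed lines
                url, length, filename = parts
                fetch_entries.append((url, length, filename.strip()))
    return names, fetch_entries
```

Calling summarize_bag_zip with the path you noted (for example, the path to dataset_1-DY3G.zip) returns the list of archive members and the remote files still to be downloaded.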

Make a note of the path to this file.

3. Export the data files

Use the BDBag client to export the files. You can use either a GUI, for simple transfer of files to your local computer, or the command-line client, which offers more options and may be used on a remote server.

If you haven’t already signed up for a FaceBase account (i.e., someone else created the BDBag with their account and sent it to you), click this link or the “Sign Up” link in the upper right corner of the navigation menu. (Note that you’ll need to wait until the Hub approves your request before your account becomes active.)

3.1 Download and install DERIVA Client

DERIVA Client is a set of software packages that includes the BDBag client, available as a command line program (bdbag) and a GUI, either of which may be used for downloading the files, and the DERIVA Authentication Agent for authenticating your downloads.

Detailed installation instructions are available here.

3.2 If you use the GUI application

The GUI application is quick and easy to use but has fewer options than the command line. This is fine if you just want to download all of the files to a local environment. For use on a remote server, or for full access to the export options, see 3.3 If you use the command line client.

To use the GUI:

3.3 If you use the command line client

Using the command line client, bdbag, requires more steps but offers many more options. In general, the steps are:

  1. Authenticate with our Authentication Agent.
  2. Use the DERIVA Command Line Applications terminal window to run the appropriate bdbag command, e.g., bdbag download\path\dataset_1-DY3G.zip --materialize

3.3.1 Authenticate with the DERIVA Authentication Agent

In the DERIVA Client Tools directory, open the “DERIVA Authentication Agent” application.

The first time you log in, you’ll see a mostly empty window.

In the “Server:” field, type www.facebase.org and click the Add or Login button.

You should now see the login screen.

Log in with the credentials you chose when you created your FaceBase account.

After logging in, you’ll see an “Authentication Successful” message.

3.3.2 Download the data files with the bdbag command

The following sections describe two methods of downloading the data files: to your local computer and to a remote server.

3.3.2.1 Download to your local computer

If you used a Windows/Mac installer to install DERIVA Client, go to the “DERIVA Client Tools” folder on your computer and open “DERIVA Command Line Applications”. This opens a special terminal window in which to run your BDBag commands.

Run the following command to export (i.e., materialize) the data and download them to the current directory:

bdbag download\path\dataset_1-DY3G.zip --materialize

This program outputs messages as it extracts the BDBag, looks up the remote addresses of the data files and downloads each file. This program also automatically performs full validation of the files after they have been downloaded.

The downloaded files are located in the folder ./data/asset/ relative to the current directory.

You can refine the download further with the following options. These commands point to the unzipped BDBag folder rather than the ZIP file.

To download only files with a particular file extension (e.g. .fastq.gz):

bdbag --resolve-fetch all --fetch-filter filename$*fastq.gz "download\path\dataset_1-DY3G"

To download only the missing files with a particular file extension (e.g. .fastq.gz):

bdbag --resolve-fetch missing --fetch-filter filename$*fastq.gz "download\path\dataset_1-DY3G"

Use the option --resolve-fetch missing if your download quit before retrieving all of the files. In a macOS or Linux shell, enclose the filter in single quotes (e.g. 'filename$*fastq.gz') so that the shell does not expand $*.

To filter and download only files at or below a specific size:

bdbag --resolve-fetch all --fetch-filter "length<=n" "download\path\dataset_1-DY3G"

where n is the size limit in bytes. For example, to download only files of 500 MB or less, the command would be:

bdbag --resolve-fetch all --fetch-filter "length<=500000000" "download\path\dataset_1-DY3G"
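Before running a filtered download, it can be useful to preview which files a filter would select. The Python sketch below approximates the filename-suffix and length filters by reading lines from the bag's fetch.txt. It is an illustration only, not bdbag's own filter logic: the function names are hypothetical, max_length is compared in whatever units the length column of fetch.txt uses, and entries with an unknown length ("-") are skipped when a size limit is given, which is a choice made for this sketch.

```python
def parse_fetch_lines(lines):
    """Yield (url, length, filename) tuples from fetch.txt lines."""
    for line in lines:
        parts = line.split(None, 2)
        if len(parts) == 3:
            url, length, filename = parts
            yield url, length, filename.strip()


def preview_selection(lines, suffix=None, max_length=None):
    """Return the filenames that pass the optional suffix and size checks."""
    selected = []
    for _url, length, filename in parse_fetch_lines(lines):
        # Rough equivalent of a filter such as filename$*fastq.gz
        if suffix is not None and not filename.endswith(suffix):
            continue
        # Rough equivalent of a filter such as length<=n
        if max_length is not None:
            if not length.isdigit() or int(length) > max_length:
                continue
        selected.append(filename)
    return selected
```

For example, opening fetch.txt inside the unzipped BDBag folder and passing the file object with suffix="fastq.gz" lists the files that the extension filter above would be expected to fetch.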

You may find the complete list of command line options here:

https://github.com/fair-research/bdbag/blob/master/doc/cli.md#bdbag-command-line-interface-cli

3.3.2.2 Download to a remote cluster/server

If you want to download data files to a remote cluster or server, you will need to install the DERIVA Client tools on the remote environment as well as locally. The following instructions assume you have already installed DERIVA Client locally (as described in 3.1 Download and install DERIVA Client).

A Python 3.5.4 or greater system installation is required to use the client tools. The latest stable version of Python is recommended.

  1. On the remote cluster/server, install bdbag using the following command:
    pip install bdbag
    
  2. On your local computer, log in using DERIVA Auth (as described in 3.3.1 Authenticate with the DERIVA Authentication Agent).

  3. Under the local folder /home/username/.bdbag, find the deriva-cookies.txt file and copy it from your local computer to the remote server.

  4. Copy the downloaded and unzipped BDBag folder (e.g. “dataset_1-DY3G”) to the remote server.

  5. SSH to the remote server and run the bdbag commands from 3.3.2.1 Download to your local computer.
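The two copy steps above could be scripted along the following lines. This is a sketch under stated assumptions: it only builds the ssh and scp argument lists, so you can inspect them before running anything; the remote host name is a placeholder; and placing the cookie file in a .bdbag folder under the remote home directory is an assumption that mirrors the local location, not something confirmed here. The function name build_copy_commands is hypothetical.

```python
import os


def build_copy_commands(remote_host, bag_dir, cookie_file=None):
    """Build ssh/scp argument lists for copying the cookie file and the bag.

    Assumption: the remote bdbag looks for deriva-cookies.txt in a .bdbag
    folder under the remote home directory, as on the local machine.
    """
    if cookie_file is None:
        cookie_file = os.path.join(
            os.path.expanduser("~"), ".bdbag", "deriva-cookies.txt"
        )
    return [
        # Make sure the target folder exists in the remote home directory.
        ["ssh", remote_host, "mkdir", "-p", ".bdbag"],
        # Copy the authentication cookies.
        ["scp", cookie_file, f"{remote_host}:.bdbag/"],
        # Copy the unzipped BDBag folder into the remote home directory.
        ["scp", "-r", bag_dir, f"{remote_host}:"],
    ]
```

Each list can be passed to subprocess.run(command, check=True) once you have confirmed the paths. The cookie file holds your login session, so treat it like a password.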

4. Troubleshooting

4.1 If you used DERIVA Auth but are still getting an error that you are not allowed to download data

Make sure that you logged in to the right server. A mix-up is more likely if you use this procedure with several different websites. Double-check the content of the “Server” field in the DERIVA Authentication Agent.

4.2 If your BDBag export failed

BDBag keeps track of which files were exported. If the transfer is interrupted, it retries several times.

If the transfer stops completely, you can rerun the same command with the option --resolve-fetch missing, and it will fetch only the files that are still missing.
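To check what is still outstanding after an interrupted transfer, a small Python sketch such as the one below can compare the bag's fetch.txt with the files already on disk. It assumes the unzipped BDBag folder contains a fetch.txt in the usual BagIt format (URL, length, filename per line), and it only tests whether each file exists, so a partially written file would not be reported. The function name missing_files is hypothetical.

```python
import os


def missing_files(bag_dir):
    """Return filenames listed in fetch.txt that are not present in bag_dir."""
    missing = []
    fetch_path = os.path.join(bag_dir, "fetch.txt")
    with open(fetch_path, encoding="utf-8") as handle:
        for line in handle:
            parts = line.split(None, 2)
            if len(parts) != 3:
                continue  # skip blank or malformed lines
            filename = parts[2].strip()
            # Filenames in fetch.txt are relative to the bag folder.
            if not os.path.isfile(os.path.join(bag_dir, filename)):
                missing.append(filename)
    return missing
```

An empty list suggests that every listed file is present; a non-empty list names the files that a rerun with --resolve-fetch missing would be expected to retrieve.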

4.3 If you want to verify you received all of the files in the BDBag

Re-run your BDBag command with --validate full at the end to confirm the data integrity.
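For an independent spot check, the MD5 checksums shipped with the bag can also be compared by hand. The Python sketch below assumes the unzipped BDBag folder contains a manifest named manifest-md5.txt in the standard BagIt format (one "checksum path" pair per line); the manifest name and the function names are assumptions for illustration, and this does not replace bdbag's own validation.

```python
import hashlib
import os


def md5_of(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_manifest(bag_dir, manifest_name="manifest-md5.txt"):
    """Return {relative_path: problem} for files that are missing or differ."""
    problems = {}
    manifest_path = os.path.join(bag_dir, manifest_name)
    with open(manifest_path, encoding="utf-8") as handle:
        for line in handle:
            parts = line.split(None, 1)
            if len(parts) != 2:
                continue  # skip blank or malformed lines
            expected, rel_path = parts[0].lower(), parts[1].strip()
            path = os.path.join(bag_dir, rel_path)
            if not os.path.isfile(path):
                problems[rel_path] = "missing"
            elif md5_of(path) != expected:
                problems[rel_path] = "checksum mismatch"
    return problems
```

An empty dictionary means every file listed in the manifest is present and matches its recorded checksum.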