Introduction to Basemount
Overview
The main way to interact with your BaseSpace Sequence Hub (BSSH) data is through the website at basespace.illumina.com. For some use cases, however, it is useful to work with the same data from the Linux command line interface (CLI). This gives direct programmatic access, so that users can write ad-hoc scripts and use tools like find, xargs and command line loops to work with their data in bulk.
This is the concept behind our BaseSpace Sequence Hub command line tool, BaseMount, a FUSE driver that allows command line access to your BaseSpace Sequence Hub data.
What is BaseMount?
BaseMount is a tool to mount your BaseSpace Sequence Hub data as a Linux file system. You can navigate through projects, samples, runs and app results and interact directly with the associated files exactly as you would with any other local file system. BaseMount is based on FUSE and uses the BaseSpace Sequence Hub API to populate the contents of each directory.
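As a minimal sketch of the kind of bulk access this enables, the function below loops over the sample directories of one project and prints how many files each one contains. The function name and the Projects/<project>/Samples/<sample>/Files layout are assumptions made for illustration; adjust the paths to what your own mount shows.

```shell
# Print "<sample name><TAB><file count>" for every sample in one project.
# Assumed (hypothetical) layout: <mount>/Projects/<project>/Samples/<sample>/Files
count_files_per_sample() {
  mount_point=$1
  project=$2
  for sample_dir in "$mount_point/Projects/$project/Samples"/*/; do
    # Skip the unexpanded pattern when the project has no samples.
    [ -d "$sample_dir" ] || continue
    sample=$(basename "$sample_dir")
    # Count regular files only; a missing Files directory counts as 0.
    count=$(find "$sample_dir/Files" -type f 2>/dev/null | wc -l | tr -d ' ')
    printf '%s\t%s\n' "$sample" "$count"
  done
}
```

Called as count_files_per_sample ~/BaseSpace_Mount MyProject (both arguments are placeholders), it prints one line per sample.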
Tutorial Videos
This video series starts by preparing you for your first mount and ends with filtering samples from the command line based on their metadata.
The script of each video is included here for quick reference.
Step 1 - Install
Step 2 - Import Data to Account
{% embed url="https://youtu.be/R8ou0yLE_Ts" %}
```
In a browser, starting from basespace.illumina.com
- Log in
- Add public dataset (2 runs + 1 project): "HiSeq X Ten TruSeq PCR Free (16 NA12878 1 plex)"
  (Use the following import links to avoid going through the web GUI)
  - Run 1 : https://basespace.illumina.com/s/0FCHjcXGsMMX
  - Run 2 : https://basespace.illumina.com/s/EXDT8tjo6Zbc
  - Project: https://basespace.illumina.com/s/mYwAqCV3Pe7R
```

```
sudo bash -c "$(curl -L https://basemount.basespace.illumina.com/uninstall)"

basemount [--config ]

mkdir ~/BaseSpace_Mount
basemount --config user1 ~/BaseSpace_Mount

basemount [--config ] --api-server <API URL or alias | help>

basemount --plugin=bsfs

# Add string property
echo "female" > gender

# Add Sample property
ln -s ../../../Samples/myExistingSample mySampleProperty

# ... or any path
ln -s myNiceEntity

# Add Sample property
ln -s input.sample
```
Alternatively, you may create a property called input.samples with the type "array of samples":
Marking AppSessions as Complete
Inside an AppResult directory, run basemount-cmd mark-as-complete to change the status of the associated appsession to "Complete".
The directory will become read-only. In case of failure, an explicit error message is issued.
Renaming AppSessions
When creating an appresult, BaseSpace Sequence Hub automatically creates an associated appsession and automatically allocates it the name "BaseSpaceCLI - {creation time}".
If you wish to choose a different name to organise your workspace more effectively, you can use the mv command in the AppSessions directory to rename appsessions.
Deleting data
Starting with version 0.14, BaseMount can move data to and from the trash.
Warning about access token scopes
Access tokens obtained through authentication are created with a specific set of scopes.
Starting with version 0.14, the "MOVETOTRASH GLOBAL" scope is requested.
If you authenticated with an older version of BaseMount, your stored access token may not contain the scope needed for trash operations.
To fix this, you need to delete your current configuration (by using basemount --remove-config [--config=<config>]) and run BaseMount again to re-authenticate.
Move to trash
You can delete BaseSpace Sequence Hub entities with any of these methods:
- basemount-cmd --path <entity directory> move-to-trash. In case of error (e.g. lack of permissions, app still running, etc.), the tool will report the error and exit with error code 1.
- cd into the entity's directory and run basemount-cmd move-to-trash. Warning: In case of success, the current directory will become invalid, as the entity will have been deleted.
- cd into the entity's parent directory and run rmdir <entity name>. In case of error, the entity won't be deleted, and the error message will be added to the .error file in the current directory.
Move base call Run files to trash
With runs, users often wish to delete the large amount of base call data while retaining the BaseSpace entries and small files such as metrics and monitoring files.
This is achieved with basemount-cmd move-to-trash-preserve-metadata, which deletes only the Data directory from the run, preserving the entity and the other files.
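To apply this to many runs, a loop can issue the command once per run directory. The sketch below only prints the commands it would run, so they can be reviewed before anything is moved to the trash. The function name is hypothetical, and it assumes that --path can be combined with move-to-trash-preserve-metadata in the same way as with move-to-trash.

```shell
# Dry run: print one move-to-trash-preserve-metadata command per run directory.
# Nothing is deleted; review the output, then run the lines you want.
# Assumption: the argument is a directory whose sub-directories are runs.
print_trash_commands() {
  runs_dir=$1
  for run_dir in "$runs_dir"/*/; do
    # Skip the unexpanded pattern when there are no run directories.
    [ -d "$run_dir" ] || continue
    # Strip the trailing slash left by the glob before printing.
    printf 'basemount-cmd --path "%s" move-to-trash-preserve-metadata\n' "${run_dir%/}"
  done
}
```

Printing rather than executing is a deliberate choice here: the operation removes large amounts of data, so a reviewable list is safer than a loop that acts immediately.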
Restore from trash
The directory <mount point>/.Trash (note the dot) contains the list of items stored in your trash.
In order to restore one of these items, run:
basemount-cmd --path <path_of_item_in_trash> restore-from-trash.
Warning: If you run this command without --path, from inside the trash item's directory, the current directory will become invalid as soon as the entity is restored, as it will no longer be in the trash.
Note: If the restored entity's parent directory had previously been accessed, a manual refresh may be needed to see the restored entity:
basemount-cmd --path <restored entity's parent path> refresh
Trash item types
Each item in BaseMount's .Trash directory is itself a directory that contains the usual .json and other metadata files, giving you some information about the deleted item.
A text file called TrashItemType exposes a string of the form <Type> (<includes>), where:
- <Type> takes values such as "DeletedProject", "DeletedRun", "DeletedAppSession", etc.
- <includes> is a list of '+'-separated entries as returned by the API, currently either "FILEDATA+METADATA" for items that have been deleted with move-to-trash, or just "FILEDATA" for items that have been deleted with move-to-trash-preserve-metadata.
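The <Type> (<includes>) string can be split with plain shell parameter expansion. The helper names below are hypothetical; the sketch assumes only the format described above.

```shell
# Split a TrashItemType string of the form "<Type> (<includes>)".

# Everything before the first " (" is the type, e.g. "DeletedRun".
trash_item_type() {
  printf '%s\n' "${1%% \(*}"
}

# Text between the parentheses, e.g. "FILEDATA+METADATA".
trash_item_includes() {
  includes=${1#*\(}
  printf '%s\n' "${includes%\)*}"
}

# Succeeds when only FILEDATA is listed, i.e. the item was deleted
# with move-to-trash-preserve-metadata.
is_metadata_preserved() {
  [ "$(trash_item_includes "$1")" = "FILEDATA" ]
}
```

Inside a trash item directory, trash_item_type "$(cat TrashItemType)" would print the type of the deleted item.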
Protection against rm -rf
Don't use rm -rf to delete a BaseSpace Sequence Hub entity, as it could delete its properties before moving the entity itself to the trash.
As a safeguard, any attempt to delete an invalid item (such as the Projects directory or a .json file) blocks any other deletion for 5 seconds. As rm -rf usually starts with such invalid items, it should block itself before deleting any data.
Emptying the trash
We keep this feature hard to access to prevent accidental loss of data.
To empty the trash, you will need an access token with the "EMPTY TRASH" scope, and you will need to issue the DELETE v1pre3/users/current/trash API call yourself (for example with curl).
Please contact our support team if needed.
BaseMount with v2 API
BaseSpace Sequence Hub has two parallel API versions, v1 and v2. The main differences are that v1 exposes Samples (collections of FASTQ files) and AppResults (generic collections of files from apps). v2 contains Datasets (any collection of files) and Biosamples (sample metadata and pointers to FASTQ Datasets). AppResults and Samples that were created with the v1 API are transparently exposed as v2 Datasets and BioSamples, which can then be augmented with some new metadata, attributes, etc.
In order to use the v2 API with BaseMount you can launch:
basemount --use-v2-api
The basemount-cmd tool
Running basemount-cmd (or its shorter alias bm-cmd) displays the list of available commands. This list of commands will vary based on your current directory, for example mark-as-complete only appears for AppResults that are not yet in status==Complete, whereas the refresh command appears in most directories.
The basemount-cmd tool was introduced to:
- Enable bi-directional communication between the user and BaseMount (typical filesystem commands such as cat, echo, etc. only read or write to a file, but can't return explicit information or error messages conveniently)
- Give a higher level of abstraction for certain commands by grouping multiple filesystem commands, which are intentionally kept at the same granularity as the BaseSpace Sequence Hub API
- Give access to bash-completion (the ability to see the list of commands by typing basemount-cmd <TAB><TAB>)
The current version of basemount-cmd is the first version and is still experimental.
How it works
In a selected subset of BaseMount directories, extra commands are available, which you can call using basemount-cmd.
Typing basemount-cmd <TAB><TAB> displays the list of available commands.
Running basemount-cmd without arguments also shows a description for each command.
Description of commands
refresh: Refresh this directory and sub-directories Available in: all entities and lists thereof
move-to-trash: Delete current entity (Warning: current directory will become invalid) Available in: project, run, sample, appresult and appsession entities
move-to-trash-preserve-metadata: Delete only the Data directory from the run, preserving the entity and the other files Available in: run entities
restore-from-trash: Restore entity to main account - Warning: current directory disappears Available in: .Trash/{entity-name} directories
mark-as-complete: Finalize and switch (sample or appresult) to read-only. Change the state of the associated appsession to Complete Available in: sample, appresult and appsession entities
rename-appsession: Rename appsession associated to the appresult Available in: appresult entities
relaunch: Relaunch current appsession by running the equivalent of cat LaunchPayload > Application/.AppLauncher (see show-launch-files below) Available in: appsession entities
show-launch-files: Expose LaunchPayload and LaunchApp files Available in: appsession entities
unmount: Force unmount. Warning: the current directory will become invalid Available in: top level directory
Note: Commands that are not listed here are "use at your own risk", and may disappear in future versions (well... those listed here may disappear as well anyway... it's an alpha release after all).
Limitations of BaseMount Alpha
Each new directory access made by BaseMount relies on the local FUSE device (/dev/fuse), the BaseSpace Sequence Hub API and the user's credentials. This mechanism means that, as currently available, BaseMount does not support the following types of accesses:
Cluster access, where many compute nodes can access the files. FUSE-mounted filesystems are per-host and cannot be accessed from other hosts without additional infrastructure.
In general, lots of concurrent requests can cause stability issues on a resource-constrained system. Keep in mind, this is an early release and stability will increase.
If you have terabytes of data in BaseSpace, doing a "find" command or recursive "ls" or recursive "grep" on the mount is not recommended: it is likely to start consuming many GB of RAM and may crash when the memory runs out.
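One way to stay within these limits is to bound how deep a traversal goes, so that only a known number of directory levels is ever listed. The sketch below uses the -mindepth and -maxdepth options of find (available in GNU and BSD find). The function name is hypothetical, and the assumption that visiting fewer directories keeps memory use lower follows from the description above rather than from measurements.

```shell
# List directories under a starting point, descending at most max_depth levels
# (default: 2). The starting directory itself is not printed.
list_dirs_bounded() {
  start_dir=$1
  max_depth=${2:-2}
  find "$start_dir" -mindepth 1 -maxdepth "$max_depth" -type d | sort
}
```

Starting from a single project or run directory, rather than from the mount point, narrows the traversal further.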
Troubleshooting
Generic BaseMount debugging
When experiencing problems with BaseMount, the following actions can help identify the root cause:
Check .error files, created in the directory where the error occurred. Disadvantage: multiple errors in the same directory overwrite each other.
Check the latest entries in /tmp/basemount.errorlog. Critical errors and crash stack traces (very useful to developers, in case you plan to report the problem) are reported there.
Re-launch BaseMount with the -f option (basemount -f ...) to keep BaseMount in the foreground and give you a (very!) comprehensive log output. Disadvantage: it keeps one terminal busy and slows BaseMount down.
Error: "Bad address"
As BaseMount is exposed as a filesystem, it has the inconvenience of only being accessible by unidirectional commands: either you read from a file or you write to a file. Commands operating on files (such as echo, cat, cp, etc.) don't have the ability to return a personalized error message from the filesystem driver to the user. When an error occurs, BaseMount returns a generic error code (usually translated as "Bad address") and stores a more comprehensive error message in a file called .error, created in the directory where the error occurred. Very important errors are also logged in the /tmp/basemount.errorlog file.
Follow "Generic BaseMount debugging" to figure out what they mean.
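The two places where error details end up can be checked in one step. The helper below is a sketch with a hypothetical name; it relies only on the .error file and the /tmp/basemount.errorlog path described above, and the choice of showing the last 20 log lines is arbitrary.

```shell
# Show the .error file of a directory (if any) and the tail of the error log.
# Usage: show_basemount_errors [directory] [log file]
show_basemount_errors() {
  dir=${1:-.}
  log=${2:-/tmp/basemount.errorlog}
  if [ -f "$dir/.error" ]; then
    echo "== $dir/.error =="
    cat "$dir/.error"
  fi
  if [ -f "$log" ]; then
    echo "== last lines of $log =="
    tail -n 20 "$log"
  fi
}
```

Running show_basemount_errors right after a command fails with "Bad address", from the directory where it failed, prints the message BaseMount stored for that error.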
Error: "Failed to open mountpoint for reading: Permission denied"
You don't have root access to the directory where you are creating your mount point. Some file systems may be configured to not allow root access by default.
Error: "Timeout was reached - Shared error buffer: Operation too slow. Less than 1000 bytes/sec transfered [in] the last 30 seconds"
This error message is reported in the /tmp/basemount.errorlog file and is related to file (from Files directories) downloads.
Rare occurrences (fewer than once per 100 GB) of this message can be ignored, as connections to the S3 object store seem to break from time to time. The 30-second timeout is here to restart downloading the affected blocks on those occasions.
Regular occurrences, on the other hand, are important to address, and usually indicate a connection to S3 worse than our expected worst case, or a deeper problem with the stability of your internet connection.
If you believe the problem comes from BaseMount, the answer to the next question describes command line parameters that may ease the bandwidth/latency requirements.
Error: "Timeout was reached - Shared error buffer: Operation timed out after 300000 milliseconds with 8668643 out of 16777216 bytes received"
This error message is reported in the /tmp/basemount.errorlog file and is related to file (from Files directories) downloads.
By default, BaseMount tries to download 16MB blocks using 8 threads. If a block takes more than 300 seconds to arrive, it issues this error message (and tries to resume the download 3 times before aborting).
16MB/thread * 8 threads * 8 bits/byte / 300s = 3.4 Mbps.
If your connection is slower than that, two BaseMount options may help address this problem:
--threads=<n> sets the number of concurrent download threads
By default, n=8, so --threads=2 may help.
But... in many settings, download speed is improved by using multiple threads. In this case, reducing the size of each downloaded block to something smaller than 16MB may be a good option:
--cache-opts=<interactive block size>:<large block size>:<total cache size>
Default values, in MB, are: 2:16:512 (Note: the interactive block size is used when files are accessed in a non-sequential manner).
For example --cache-opts=2:4:512 would make BaseMount download 4MB blocks.
A middle ground can be achieved with both options: --threads=4 --cache-opts=2:8:512
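The arithmetic behind the 3.4 Mbps figure can be reused to estimate the minimum bandwidth for other settings. The sketch below applies the same formula (block size x threads x 8 bits/byte / timeout); the function name is hypothetical, and the 300-second default is the timeout described above.

```shell
# Minimum sustained bandwidth (Mbps) needed so that every thread receives
# its block before the timeout: block_mb * threads * 8 bits/byte / timeout_s.
# Usage: min_mbps <large block size in MB> <threads> [timeout in seconds]
min_mbps() {
  block_mb=$1
  threads=$2
  timeout_s=${3:-300}
  awk -v b="$block_mb" -v t="$threads" -v s="$timeout_s" \
    'BEGIN { printf "%.1f\n", b * t * 8 / s }'
}
```

For example, min_mbps 16 8 prints 3.4 for the defaults, while min_mbps 4 8 and min_mbps 8 4 both print 0.9, which is why smaller blocks or fewer threads relax the requirement.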
Questions and Answers
How can I refresh the contents of BaseMount's directories?
You can run either:
- basemount-cmd refresh
- echo refresh > .commands
Can I check which permissions/scopes my access token has?
Yes, you can see this as part of the following json response:
cat <mount point>/.AccessToken/.json
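To check for one particular scope, for example the one needed for trash operations, the JSON can be searched directly. The helper below is a sketch with a hypothetical name. It assumes the scope appears in the file as plain text; the exact spelling and structure of the JSON are not specified here, so it uses a case-insensitive fixed-string match instead of parsing.

```shell
# Succeed if the given scope string appears anywhere in the token's .json file.
# Usage: token_has_scope <path to .json> <scope text>
token_has_scope() {
  json_file=$1
  scope=$2
  # -F: fixed string, -i: ignore case, -q: no output, only the exit status.
  grep -qiF -- "$scope" "$json_file"
}
```

For instance, token_has_scope "<mount point>/.AccessToken/.json" "MOVETOTRASH GLOBAL" succeeds only if that text is present in the file.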
How can I install a previous version of BaseMount?
The install script always installs the latest version of BaseMount. If you want to lock in a specific version as part of your system setup scripts, please use the following steps.
1. Configure the BaseMount repository as described in the "BaseMount Installation, Manual Install" section, but without the final apt-get/yum install basemount command
2. Optional: Configure the BSFS repository as described in the "Appendix 1 - BSFS installation" section, but without the final apt-get/yum install bsfs command
3. Install BaseMount using the following commands:
v0.1 Alpha
Ubuntu
CentOS
v0.11 Alpha
Ubuntu
CentOS
Package version discovery
You can discover which versions of BaseMount are available with the following commands:
Ubuntu
CentOS
ChangeLog
Tue Jun 21 2016 - v0.14 Alpha
Refresh command
Move-to-trash and restore-from-trash
New bm-cmd tool, a shorter alias for basemount-cmd
Moved passphrase encryption to --passphrase
Tue Mar 1 2016 - v0.12 Alpha
Write-mode: project and appresult creation, file upload
Properties can be viewed and edited
Improved documentation
Relaxed timeout for low bandwidth
Offers to create mount point when it doesn't exist
Offers to unmount if mount point refers to a mounted path
Unmount assistance, listing blocking processes and offering lazy-unmount
Raised RAM requirements to 1.5GB, with the option to proceed after a confirmation prompt
Tue Jan 26 2016 - v0.11 Alpha
Proxy support
"Files" directories are not symlinks anymore
Run Files are automatically mounted, now that they load more interactively
.basemount directories for BaseSpaceCLI support
BSFS available as a plugin
Fri Jul 24 2015 - v0.1 Alpha
Initial release
