Download SRA sequences from Entrez search results
Obtain search results
Task: find RNA-Seq records for lymph node tissue in BALB/c mice in SRA Entrez
To learn how to use the Advanced Search Builder, please refer to Search in SRA.
- In the Entrez search bar enter the query: ((("mus musculus"[Organism]) AND BALB/c*) AND "lymph*") AND "rna seq"[Strategy].
- To limit your search to aligned data only, append AND "aligned data"[Properties] to the above query.
- Click the checkboxes next to records (experiments) to select data of interest. Leave all checkboxes unchecked to select all records (experiments) from your search.
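The query above is four terms nested with AND. As a minimal sketch, assuming you want to assemble or vary the query in a shell before pasting it into the search bar (the variable names are illustrative and not part of any NCBI tool):

```shell
# Each term of the Entrez query. Single quotes keep the inner double quotes
# and the * wildcard literal.
organism='"mus musculus"[Organism]'
strain='BALB/c*'
tissue='"lymph*"'
strategy='"rna seq"[Strategy]'

# Nest the terms in the same order as the query shown above.
query="(((${organism}) AND ${strain}) AND ${tissue}) AND ${strategy}"

# Optional restriction to aligned data.
aligned_query="${query} AND \"aligned data\"[Properties]"

printf '%s\n' "$query"
printf '%s\n' "$aligned_query"
```

Swapping a single variable (for example the tissue term) produces a new query without retyping the parentheses.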
Obtain run accessions
Run accessions are used to download SRA data. To download a list of Run accessions selected from your Entrez search:
- Click Send to at the top of the page, check the File radio button, and select Accession List.
- Save this file in the location from which you are running the SRA Toolkit.
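Before starting a download it can help to confirm that the saved list contains what you expect. The following is a small sketch, assuming the file holds one Run accession per line and that Run accessions start with SRR, ERR, or DRR followed by digits; `count_runs` is a hypothetical helper name, not an SRA Toolkit command.

```shell
# count_runs FILE
# Print the number of lines that look like Run accessions
# (SRR, ERR, or DRR followed by digits). Carriage returns are removed first
# in case the file was saved with Windows line endings.
count_runs() {
    tr -d '\r' < "$1" | grep -c -E '^[SED]RR[0-9]+$'
}
```

For example, `count_runs SraAccList.txt` should print the number of Runs you selected on the search results page.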
Download sequence data files using SRA Toolkit
- SRA Toolkit latest release
- SRA Toolkit documentation
- Accessing SRA data
- SRA Toolkit Installation and Configuration Guide
Downloading public data
Prefetch is part of the SRA Toolkit. This program downloads Runs (sequence files in the compressed SRA format) and all additional data necessary to convert the Run from the SRA format to a more commonly used format. Prefetch can also be used to repair and finish an incomplete Run download.
Use this prefetch command to download the Runs from the previous example in SRA format:
prefetch --option-file SraAccList.txt
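Since prefetch can finish an incomplete download, one way to cope with an unreliable connection is simply to rerun the same command until it succeeds. The wrapper below is a sketch under that assumption; the function name, the `PREFETCH` variable, and the default of three attempts are illustrative choices, not SRA Toolkit features.

```shell
# Command to run; override PREFETCH to point at a specific installation.
PREFETCH=${PREFETCH:-prefetch}

# prefetch_with_retry LIST [MAX_ATTEMPTS]
# Rerun prefetch on the accession list until it exits successfully
# or MAX_ATTEMPTS (default 3) is reached.
prefetch_with_retry() {
    list=$1
    max=${2:-3}
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        if "$PREFETCH" --option-file "$list"; then
            return 0
        fi
        echo "prefetch attempt $attempt of $max failed" >&2
        attempt=$((attempt + 1))
    done
    return 1
}
```

Usage: `prefetch_with_retry SraAccList.txt 5`.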
fastq-dump and sam-dump are also part of the SRA Toolkit and can be used to convert the prefetched Runs from compressed SRA format to fastq or sam format. For example:
$ fastq-dump -X 5 -Z --split-files SRR000001
You can also skip the prefetch step and download and convert the Run in one step by entering just the Run accession, without the .sra extension, in your fastq-dump or sam-dump command:
$ fastq-dump -X 5 -Z --split-files SRR000001
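To convert every Run in a saved accession list rather than a single accession, the command can be wrapped in a loop. This is a minimal sketch, assuming one accession per line; `convert_runs` and the `FASTQ_DUMP` variable are hypothetical names introduced here, and the `-X 5 -Z` options from the example above are omitted so that whole Runs are written to files.

```shell
# Command to run; override FASTQ_DUMP to point at a specific installation.
FASTQ_DUMP=${FASTQ_DUMP:-fastq-dump}

# convert_runs LIST
# Run fastq-dump --split-files once for every accession listed in LIST.
convert_runs() {
    while IFS= read -r acc || [ -n "$acc" ]; do
        # Drop a trailing carriage return and skip blank lines.
        acc=$(printf '%s' "$acc" | tr -d '\r')
        [ -n "$acc" ] || continue
        # Keep the tool from consuming the list on stdin; report failures
        # and carry on with the next accession.
        "$FASTQ_DUMP" --split-files "$acc" < /dev/null \
            || echo "conversion failed: $acc" >&2
    done < "$1"
}
```

Usage: `convert_runs SraAccList.txt`.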
Downloading protected data
Download metadata associated with SRA data
From the search result page
SRA Run files do not contain the metadata (sample information, etc.) linked to the data themselves.
To download metadata for each Run in your Entrez query, click Send to at the top of the page, check the File radio button, and select RunInfo in the pull-down menu.
This will generate a tabular SraRunInfo.csv file with metadata available for each Run.
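Once the file is saved, individual columns can be pulled out on the command line. The sketch below assumes the first line of SraRunInfo.csv is a header row and that no field contains a quoted comma (a plain comma split would mis-parse such fields); `column_values` is a hypothetical helper, and the column name you pass must match a name in your file's header.

```shell
# column_values FILE COLUMN
# Print every non-empty value of the column whose header equals COLUMN.
column_values() {
    awk -F',' -v name="$2" '
        { sub(/\r$/, "") }                 # tolerate Windows line endings
        NR == 1 {                          # locate the column in the header
            for (i = 1; i <= NF; i++) if ($i == name) col = i
            next
        }
        col && $col != "" { print $col }
    ' "$1"
}
```

For example, if the header contains a column named Run, `column_values SraRunInfo.csv Run` prints one accession per line.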
From Run Selector
A slightly different set of metadata can be downloaded in a tab-delimited file from Run Selector.
To download metadata for each Run in your Entrez query:
- Click Send to at the top of the page, check the Run Selector radio button, and click the Go button.
- If necessary, refine your results by using various filters provided by the Run Selector's interface.
- Click the RunInfo Table button. This will generate a tabular SraRunTable.txt file with metadata available for each Run.
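The downloaded table can be narrowed further on the command line. As a sketch, assuming SraRunTable.txt is tab-delimited with a header row as described above, the hypothetical helper below keeps the header plus the rows where one column equals a given value; the column names depend on your data and are not fixed by this example.

```shell
# filter_rows FILE COLUMN VALUE
# Print the header line and every row whose COLUMN field equals VALUE.
filter_rows() {
    awk -F'\t' -v name="$2" -v want="$3" '
        { sub(/\r$/, "") }                 # tolerate Windows line endings
        NR == 1 {                          # locate the column, keep the header
            for (i = 1; i <= NF; i++) if ($i == name) col = i
            print
            next
        }
        col && $col == want { print }
    ' "$1"
}
```

Usage: `filter_rows SraRunTable.txt COLUMN VALUE > subset.txt`, with COLUMN and VALUE replaced by a header name and a value from your own table.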
Download sequence data from the Run Browser
Run Browser allows limited download (one run at a time over HTTP) of unaligned and aligned sequences.
Unaligned sequences example
- Open the selected run in the Run Browser.
- Click the Reads tab.
- To find specific reads, apply a Filter; to include all reads, leave the Filter field empty.
- Click the Filtered Download button.
- Select an available download format and click the Download link.
Aligned sequences example
- Open the selected run in the Run Browser.
- Click the Alignment tab.
- Select an available download format in the pull-down menu and click the Screen or File button to output the run to the screen or to a file.
Contact SRA
Contact SRA staff for assistance at sra@ncbi.nlm.nih.gov