getblast
Retrieve BLAST report from NCBI website
Description
blastdata = getblast(RID,Name,Value) retrieves the BLAST report for the request ID RID, using additional options specified by one or more name-value arguments.
Examples
Perform BLAST search
Perform a BLAST search on a protein sequence and save the results to an XML file.
Get a sequence from the Protein Data Bank and create a MATLAB structure.
S = getpdb('1CIV');
Use the structure as input for the BLAST search with a significance threshold of 1e-10. The first output is the request ID, and the second output is the estimated time (in minutes) until the search is completed.
[RID1,RTOE] = blastncbi(S,'blastp','expect',1e-10);
Get the search results from the report. You can save the XML-formatted report to a file for offline access. Use RTOE as the wait time to retrieve the results.
report1 = getblast(RID1,'WaitTime',RTOE,'ToFile','1CIV_report.xml')
Blast results are not available yet. Please wait ...
report1 = struct with fields:
                RID: 'R49TJMCF014'
          Algorithm: 'BLASTP 2.6.1+'
           Database: 'nr'
            QueryID: 'Query_224139'
    QueryDefinition: 'unnamed protein product'
               Hits: [1×100 struct]
         Parameters: [1×1 struct]
         Statistics: [1×1 struct]
Use blastread to read BLAST data from the XML-formatted BLAST report file.
blastdata = blastread('1CIV_report.xml')
blastdata = struct with fields:
                RID: ''
          Algorithm: 'BLASTP 2.6.1+'
           Database: 'nr'
            QueryID: 'Query_224139'
    QueryDefinition: 'unnamed protein product'
               Hits: [1×100 struct]
         Parameters: [1×1 struct]
         Statistics: [1×1 struct]
Alternatively, run the BLAST search with an NCBI accession number.
RID2 = blastncbi('AAA59174','blastp','expect',1e-10)
RID2 = 'R49WAPMH014'
Get the search results from the report.
report2 = getblast(RID2)
Blast results are not available yet. Please wait ...
report2 = struct with fields:
                RID: 'R49WAPMH014'
          Algorithm: 'BLASTP 2.6.1+'
           Database: 'nr'
            QueryID: 'AAA59174.1'
    QueryDefinition: 'insulin receptor precursor [Homo sapiens]'
               Hits: [1×100 struct]
         Parameters: [1×1 struct]
         Statistics: [1×1 struct]
Input Arguments
RID
— Request ID for NCBI BLAST search
character vector | string
Request ID for retrieving results from a specific NCBI BLAST search, specified as a character vector or string.
Example: 'GTF033EZ015'
Name-Value Arguments
Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.
Before R2021a, use commas to separate each name and value, and enclose Name in quotes.
Example: 'ToFile','report.xml' saves the results to a file named report.xml.
ToFile
— Name of file to save report data to
character vector | string
Name of the file to save the report data to, specified as the comma-separated pair consisting of 'ToFile' and a character vector or string. The report is saved in XML format.
Example: 'ToFile','Report.xml'
WaitTime
— Time to wait for report
0 (default) | nonnegative integer
Time (in minutes) to wait for the report from NCBI to be ready, specified as the comma-separated pair consisting of 'WaitTime' and a nonnegative integer. If the report is still not ready after the specified time, an error is generated.
The default value is 0, that is, there is no delay in retrieving the report.
Tip
Use the RTOE (request time of execution) returned by the blastncbi function as the wait time here.
Example: 'WaitTime',2
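Because getblast generates an error when the report is still not ready after the wait time, a script may want to catch that error instead of stopping. The following is a minimal sketch, not a definitive pattern: it assumes network access to NCBI and reuses the accession number and options from the example above. The variable names and the warning text are illustrative only.

```matlab
% Submit a search; RTOE is the estimated time (in minutes) until completion.
[RID,RTOE] = blastncbi('AAA59174','blastp','expect',1e-10);

try
    % Wait up to RTOE minutes for the report to become available.
    report = getblast(RID,'WaitTime',RTOE);
catch err
    % Reached if getblast errors, for example because the report
    % is still not ready after the specified wait time.
    warning('Could not retrieve report %s: %s',RID,err.message);
    report = [];
end
```

If report is empty afterward, calling getblast again later with the same RID is one option, since the request ID identifies the search.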
TimeOut
— Connection timeout
5 (default) | positive scalar
Connection timeout (in seconds) for each request, specified as a positive scalar.
Example: 'TimeOut',10
Output Arguments
blastdata
— BLAST report data
structure
BLAST report data, returned as a structure that contains the following fields:
Field | Description |
---|---|
RID | Request ID for retrieving results from a specific NCBI BLAST search |
Algorithm | NCBI algorithm used to perform the BLAST search |
Database | All databases searched |
QueryID | Identifier of the query sequence |
QueryDefinition | Definition of the query sequence |
Hits | Structure containing information on the hit sequences, such as IDs, accession numbers, lengths, and HSPs (high-scoring segment pairs) |
Parameters | Structure containing information on the input parameters used to perform the search |
Statistics | Summary of statistical details about the performed search, such as lambda, kappa, and entropy values |
More About
Hits
This table lists each field of blastdata.Hits.
Field | Description |
---|---|
ID | ID of the subject sequence that matched the query sequence |
Definition | Description of the subject sequence |
Accession | Accession of the subject sequence |
Length | Length of the subject sequence |
Hsps | Structure containing information on the high-scoring segment pairs (HSPs) |
Hits.Hsps
This table summarizes the fields of Hits.Hsps.
Field | Description |
---|---|
Score | Pairwise alignment score for a high-scoring segment pair between the query sequence and a subject sequence. |
BitScore | Bit score for a high-scoring segment pair. |
Expect | Expectation value for a high-scoring segment pair. |
Identities | Number of identical residues for a high-scoring segment pair between the query sequence and a subject sequence. |
Positives | Number of identical or similar residues for a high-scoring segment pair between the query sequence and a subject amino acid sequence. This field applies only to translated nucleotide or amino acid query sequences and databases. |
Gaps | Nonaligned residues for a high-scoring segment pair. |
AlignmentLength | Length of the alignment for a high-scoring segment pair. |
QueryIndices | Indices of the query sequence residue positions for a high-scoring segment pair. |
SubjectIndices | Indices of the subject sequence residue positions for a high-scoring segment pair. |
Frame | Reading frame of the translated nucleotide sequence for a high-scoring segment pair. |
Alignment | 3-by-N character array showing the alignment for a high-scoring segment pair between the query sequence and a subject sequence. The first row is the query sequence, the second row is the alignment, and the third row is the subject sequence. |
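The nesting described in the two tables above (each hit carries an array of HSPs) can be walked with ordinary structure indexing. The sketch below runs offline on a small hand-built structure rather than a real report. The variable mock, the accession strings, and the numeric values are hypothetical, and only a few of the documented fields are included, under the assumption that Expect holds a numeric scalar.

```matlab
% Hand-built stand-in for a getblast result (hypothetical values,
% reduced to the Accession, Hsps, BitScore, and Expect fields).
hspA = struct('BitScore',310.5,'Expect',2e-105);
hspB = struct('BitScore',45.1,'Expect',3e-4);
hspC = struct('BitScore',30.0,'Expect',0.02);
mock.Hits(1) = struct('Accession','HYPO_0001','Hsps',hspA);
mock.Hits(2) = struct('Accession','HYPO_0002','Hsps',[hspB hspC]);

% Find the smallest expectation value among the HSPs of each hit.
nHits = numel(mock.Hits);
bestExpect = zeros(1,nHits);
for k = 1:nHits
    bestExpect(k) = min([mock.Hits(k).Hsps.Expect]);
end

% Keep only the hits whose best HSP passes the threshold.
keep = bestExpect < 1e-10;
accessions = {mock.Hits(keep).Accession};
```

With this mock input, only the first hit passes the 1e-10 threshold, so accessions contains 'HYPO_0001'. The same indexing should apply to a structure returned by getblast or blastread, provided the fields are populated as the tables describe.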
Version History
Introduced before R2006a
R2017b: 'Alignments' option has been removed
The 'Alignments' name-value pair has been removed. The number of hits returned in the output is controlled by the number of hits in the input BLAST report.
R2017b: 'Descriptions' option has been removed
The 'Descriptions' name-value pair has been removed. The number of hits returned in the output is controlled by the number of hits in the input BLAST report.
R2017b: 'FileFormat' option has been removed
The 'FileFormat' name-value pair has been removed. The file is XML-formatted automatically.