FaFile {Rsamtools} | R Documentation |
Manipulate indexed fasta files.
Description
Use FaFile()
to create a reference to an indexed fasta
file. The reference remains open across calls to methods, avoiding
costly index re-loading.
FaFileList()
provides a convenient way of managing a list of
FaFile
instances.
Usage
## Constructors
FaFile(file, index=sprintf("%s.fai", file),
gzindex=sprintf("%s.gzi", file))
FaFileList(...)
## Opening / closing
## S3 method for class 'FaFile'
open(con, ...)
## S3 method for class 'FaFile'
close(con, ...)
## accessors; also path(), index()
## S4 method for signature 'FaFile'
gzindex(object, asNA=TRUE)
## S4 method for signature 'FaFileList'
gzindex(object, asNA=TRUE)
## S4 method for signature 'FaFile'
isOpen(con, rw="")
## actions
## S4 method for signature 'FaFile'
indexFa(file, ...)
## S4 method for signature 'FaFile'
scanFaIndex(file, ...)
## S4 method for signature 'FaFileList'
scanFaIndex(file, ..., as=c("GRangesList", "GRanges"))
## S4 method for signature 'FaFile'
seqinfo(x)
## S4 method for signature 'FaFile'
countFa(file, ...)
## S4 method for signature 'FaFile,GRanges'
scanFa(file, param, ...,
as=c("DNAStringSet", "RNAStringSet", "AAStringSet"))
## S4 method for signature 'FaFile,IntegerRangesList'
scanFa(file, param, ...,
as=c("DNAStringSet", "RNAStringSet", "AAStringSet"))
## S4 method for signature 'FaFile,missing'
scanFa(file, param, ...,
as=c("DNAStringSet", "RNAStringSet", "AAStringSet"))
## S4 method for signature 'FaFile'
getSeq(x, param, ...)
## S4 method for signature 'FaFileList'
getSeq(x, param, ...)
Arguments
con , object , x |
An instance of |
file , index , gzindex |
A character(1) vector of the fasta or fasta index
or bgzip index file path (for |
asNA |
logical indicating if missing output should be NA or character() |
param |
An optional |
... |
Additional arguments.
|
rw |
Mode of file; ignored. |
as |
A character(1) vector indicating the type of object to return.
|
Objects from the Class
Objects are created by calls of the form FaFile()
.
Fields
The FaFile
class inherits fields from the
RsamtoolsFile
class.
Functions and methods
FaFileList
inherits methods from
RsamtoolsFileList
and SimpleList
.
Opening / closing:
- open.FaFile
Opens the (local or remote)
path
andindex
files. Returns aFaFile
instance.- close.FaFile
Closes the
FaFile
con
; returning (invisibly) the updatedFaFile
. The instance may be re-opened withopen.FaFile
.
Accessors:
- path
Returns a character(1) vector of the fasta path name.
- index
Returns a character(1) vector of fasta index name (minus the '.fai' extension).
Methods:
- indexFa
Visit the path in
path(file)
and create an index file (with the extension ‘.fai’).- scanFaIndex
Read the sequence names and and widths of recorded in an indexed fasta file, returning the information as a
GRanges
object.- seqinfo
Consult the index file for defined sequences (
seqlevels()
) and lengths (seqlengths()
).- countFa
Return the number of records in the fasta file.
- scanFa
Return the sequences indicated by
param
as aDNAStringSet
instance.seqnames(param)
selects the sequences to return;start(param)
andend{param}
define the (1-based) region of the sequence to return. Values ofend(param)
greater than the width of the sequence cause an error; useseqlengths(FaFile(file))
to discover sequence widths. Whenparam
is missing, all records are selected. Whenlength(param)==0
no records are selected.- getSeq
Returns the sequences indicated by
param
from the indexed fasta file(s) offile
.For the
FaFile
method, the return type is aDNAStringSet
. ThegetSeq,FaFile
andscanFa,FaFile,GRanges
methods differ in thatgetSeq
will reverse complement sequences selected from the minus strand.For the
FaFileList
method, theparam
argument must be aGRangesList
of the same length asfile
, creating a one-to-one mapping between the ith element offile
and the ith element ofparam
; the return type is aSimpleList
ofDNAStringSet
instances, with elements of the list in the same order as the input elements.- show
Compactly display the object.
Author(s)
Martin Morgan
Examples
fl <- system.file("extdata", "ce2dict1.fa", package="Rsamtools",
mustWork=TRUE)
fa <- open(FaFile(fl)) # open
countFa(fa)
(idx <- scanFaIndex(fa))
(dna <- scanFa(fa, param=idx[1:2]))
ranges(idx) <- narrow(ranges(idx), -10) # last 10 nucleotides
(dna <- scanFa(fa, param=idx[1:2]))