TabixFile {Rsamtools} | R Documentation |
Manipulate tabix indexed tab-delimited files.
Description
Use TabixFile() to create a reference to a Tabix file (and its index). Once opened, the reference remains open across calls to methods, avoiding costly index re-loading.
TabixFileList() provides a convenient way of managing a list of TabixFile instances.
Usage
## Constructors
TabixFile(file, index = paste(file, "tbi", sep="."), ...,
          yieldSize=NA_integer_)
TabixFileList(...)
## Opening / closing
## S3 method for class 'TabixFile'
open(con, ...)
## S3 method for class 'TabixFile'
close(con, ...)
## accessors; also path(), index(), yieldSize()
## S4 method for signature 'TabixFile'
isOpen(con, rw="")
## actions
## S4 method for signature 'TabixFile'
seqnamesTabix(file, ...)
## S4 method for signature 'TabixFile'
headerTabix(file, ...)
## S4 method for signature 'TabixFile,GRanges'
scanTabix(file, ..., param)
## S4 method for signature 'TabixFile,IntegerRangesList'
scanTabix(file, ..., param)
## S4 method for signature 'TabixFile,missing'
scanTabix(file, ..., param)
## S4 method for signature 'character,ANY'
scanTabix(file, ..., param)
## S4 method for signature 'character,missing'
scanTabix(file, ..., param)
countTabix(file, ...)
Arguments
con |
An instance of TabixFile. |
file |
For TabixFile(), a character(1) vector giving the tabix file path; can be remote (http://, ftp://). For |
index |
A character(1) vector of the tabix file index. |
yieldSize |
Number of records to yield each time the file is read from using scanTabix. |
param |
An instance of GRanges or IntegerRangesList, used to select which records to scan. |
... |
Additional arguments. For |
rw |
character() indicating mode of file; not used for TabixFile. |
Objects from the Class
Objects are created by calls of the form TabixFile().
Fields
The TabixFile class inherits fields from the RsamtoolsFile class.
Functions and methods
TabixFileList inherits methods from RsamtoolsFileList and SimpleList.
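As a minimal sketch of the list-like behaviour inherited from SimpleList, the following builds a TabixFileList from two TabixFile instances and applies ordinary list operations to it. It assumes Rsamtools is installed and reuses the example.gtf.gz file shipped with the package; the variable names (tbxs, first, paths) are illustrative only, and the same file is referenced twice simply because only one example file is at hand.

```r
library(Rsamtools)

# Example tabix-indexed file shipped with Rsamtools (same file as in Examples)
fl <- system.file("extdata", "example.gtf.gz", package="Rsamtools",
                  mustWork=TRUE)

# Build a list from individual TabixFile instances (assumption: the same
# file is used twice purely for illustration)
tbxs <- TabixFileList(TabixFile(fl), TabixFile(fl))

length(tbxs)              # number of TabixFile elements in the list
first <- tbxs[[1]]        # list-style extraction yields a single element
is(first, "TabixFile")

# Apply an accessor to every element of the list
paths <- vapply(as.list(tbxs), path, character(1))
basename(paths)
```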
Opening / closing:
- open.TabixFile
Opens the (local or remote) path and index. Returns a TabixFile instance. yieldSize determines the number of records parsed during each call to scanTabix; NA indicates that all records are to be parsed.
- close.TabixFile
Closes the TabixFile con, returning (invisibly) the updated TabixFile. The instance may be re-opened with open.TabixFile.
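The open / close life cycle described above can be sketched as follows. This is an illustration rather than a prescribed workflow: it assumes Rsamtools is installed, uses the example.gtf.gz file shipped with the package, and the names opened, n, and res are introduced here only for the example.

```r
library(Rsamtools)

fl <- system.file("extdata", "example.gtf.gz", package="Rsamtools",
                  mustWork=TRUE)

tbx <- TabixFile(fl)      # create the reference; the file is not yet open
isOpen(tbx)

tbx <- open(tbx)          # open the path and its index
opened <- isOpen(tbx)

# Successive calls reuse the open reference
param <- GRanges("chr1", IRanges(1, width=100000))
n <- countTabix(tbx, param=param)
res <- scanTabix(tbx, param=param)

tbx <- close(tbx)         # close() returns the updated TabixFile invisibly
isOpen(tbx)
```

Opening explicitly is mainly useful when several calls are made against the same file, since the index is then not re-loaded for each call.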
Accessors:
- path
Returns a character(1) vector of the tabix path name.
- index
Returns a character(1) vector of the tabix index name.
- yieldSize, yieldSize<-
Return or set an integer(1) vector indicating yield size.
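A short sketch of the accessors listed above, assuming Rsamtools is installed and using the example.gtf.gz file shipped with the package. The yield sizes 100L and 50L are arbitrary values chosen for illustration.

```r
library(Rsamtools)

fl <- system.file("extdata", "example.gtf.gz", package="Rsamtools",
                  mustWork=TRUE)

tbx <- TabixFile(fl, yieldSize=100L)

path(tbx)                 # character(1): the tabix file path
index(tbx)                # character(1): the index path (default: <file>.tbi)
yieldSize(tbx)            # integer(1): records yielded per read

yieldSize(tbx) <- 50L     # the replacement form updates the yield size
yieldSize(tbx)
```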
Methods:
- seqnamesTabix
Visit the path in path(file), returning the sequence names present in the file.
- headerTabix
Visit the path in path(file), returning the sequence names, column indices used to sort the file, the number of lines skipped while indexing, the comment character used while indexing, and the header lines (preceded by the comment character, at the start of the file).
- countTabix
Return the number of records in each range of param, or the count of all records in the file (when param is missing).
- scanTabix
For signature(file="TabixFile"), visit the path in path(file), returning the result of scanTabix applied to the specified path. For signature(file="character"), call the corresponding method after coercing file to TabixFile.
- indexTabix
This method operates on file paths, rather than TabixFile objects, to index tab-separated files. See indexTabix.
- show
Compactly display the object.
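The following sketch exercises seqnamesTabix, headerTabix, and the two scanTabix signatures described above. It assumes Rsamtools is installed and uses the example.gtf.gz file shipped with the package; the names seqs, hdr, from_path, and from_tbx are introduced only for this illustration.

```r
library(Rsamtools)

fl <- system.file("extdata", "example.gtf.gz", package="Rsamtools",
                  mustWork=TRUE)
tbx <- TabixFile(fl)

seqs <- seqnamesTabix(tbx)    # sequence names present in the file
hdr <- headerTabix(tbx)       # indexing metadata and header lines
str(hdr)

# The character method coerces the path to a TabixFile before scanning,
# so both calls are expected to select the same records
param <- GRanges("chr1", IRanges(1, width=100000))
from_path <- scanTabix(fl, param=param)
from_tbx <- scanTabix(tbx, param=param)
identical(from_path, from_tbx)
```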
Author(s)
Martin Morgan
Examples
fl <- system.file("extdata", "example.gtf.gz", package="Rsamtools",
                  mustWork=TRUE)
tbx <- TabixFile(fl)
param <- GRanges(c("chr1", "chr2"), IRanges(c(1, 1), width=100000))
countTabix(tbx)
countTabix(tbx, param=param)
res <- scanTabix(tbx, param=param)
sapply(res, length)
res[["chr1:1-100000"]][1:2]
## parse to list of data.frame's
dff <- Map(function(elt) {
    read.csv(textConnection(elt), sep="\t", header=FALSE)
}, res)
dff[["chr1:1-100000"]][1:5,1:8]
## parse 100 records at a time
length(scanTabix(tbx)[[1]]) # total number of records
tbx <- open(TabixFile(fl, yieldSize=100))
while(length(res <- scanTabix(tbx)[[1]]))
    cat("records read:", length(res), "\n")
close(tbx)