
Help wanted: looking for someone experienced with CQN (Conditional Quantile Normalization)

2021-06-17 15:41 Author: 天马行空的坦克兵

CQN (conditional quantile normalization) for RNA-Seq data

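I will first attach the CQN help content and then describe my question. If you have used CQN (conditional quantile normalization), please share your thoughts in the comments; thanks in advance to all readers. (If anything in this post is stated incorrectly, please point it out so we can all improve.) Anyone who can clear up my confusion will be properly thanked.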

Examples

library(cqn)

data(montgomery.subset)
data(sizeFactors.subset)
data(uCovar)

cqn.subset <- cqn(montgomery.subset,                 # expression count matrix
                  lengths = uCovar$length,           # length of each gene
                  x = uCovar$gccontent,              # GC content of each gene
                  sizeFactors = sizeFactors.subset,  # library sizes (optional argument)
                  verbose = TRUE)

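This is the example from R. My question: where does this GC content come from? I am stuck on this point and would like help understanding it. Anyone who can clear this up will be thanked.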

The help documentation does not explain it either.
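For illustration only: the help text does not say how the example's GC values were produced, but a per-gene GC content is commonly computed as the fraction of G and C bases in each gene's sequence. The sketch below does this in base R. The sequences, the names `gene_seqs` and `gc_fraction`, and the resulting `covar` table are all made up for this example; real sequences would have to come from a genome or transcript annotation matching the rows of the count matrix.

```r
# Hypothetical gene sequences (assumption: real ones come from an annotation source).
gene_seqs <- c(geneA = "ATGCGCGCTA",
               geneB = "ATATATATAT",
               geneC = "GGGCCCATGC")

# Fraction of G or C bases in a single sequence (case-insensitive).
gc_fraction <- function(s) {
  bases <- strsplit(toupper(s), "")[[1]]
  sum(bases %in% c("G", "C")) / length(bases)
}

# One GC value and one length per gene, in the same order as the sequences.
gc_content  <- vapply(gene_seqs, gc_fraction, numeric(1))
gene_length <- nchar(gene_seqs)

# Hypothetical covariate table: one row per gene.
covar <- data.frame(length = gene_length, gc = gc_content)
print(covar)
```

Vectors built this way could then be supplied as the `x` and `lengths` arguments of cqn(), provided they have one entry per row of the count matrix and are in the same row order.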

This is the matrix from the help documentation.

Description

This function implements CQN (conditional quantile normalization) for RNA-Seq data.


Usage

cqn(counts, x, lengths, sizeFactors = NULL, subindex = NULL, tau = 0.5, sqn = TRUE,

    lengthMethod = c("smooth", "fixed"), verbose = FALSE)

## S3 method for class 'cqn'

print(x, ...)

Arguments

counts

An object that can be coerced to a matrix of region by sample counts. Ought to have integer values.


x

This is a covariate whose systematic influence on the counts will be removed. Typically the GC content. Has to have the same length as the number of rows of counts.


lengths

The lengths (in bp) of the regions in counts. Has to have the same length as the number of rows of counts.


sizeFactors

An optional vector of sizeFactors, i.e. the sequencing effort of the various samples. If NULL this is calculated as the column sums of counts.


subindex

An optional vector of indices into the rows of counts. If not given, this becomes the indices of genes with row means of counts greater than 50.


tau

This argument is passed to rq; it indicates what quantile is being fit. The default should only be changed by expert users.


sqn

This argument indicates whether the residuals from the systematic fit are (subset) quantile normalized. The default should only be changed by expert users.


lengthMethod

Should length enter the model as a smooth function or not.


verbose

Is the function verbose?


...

Not used.
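The two documented defaults above (sizeFactors as column sums, subindex as rows with mean count above 50) can be reproduced by hand, which may help when checking what cqn() would use if those arguments are left as NULL. This is a minimal base R sketch on a made-up count matrix; the names `counts`, `size_factors`, and `sub_index` are chosen for this example and the code is not taken from the package source.

```r
# Toy count matrix (hypothetical data): 4 regions (rows) by 3 samples (columns).
counts <- matrix(c(120, 80, 100,
                     3,  0,   5,
                    60, 55,  70,
                    10, 20,  15),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(paste0("gene", 1:4), paste0("sample", 1:3)))

# Documented default when sizeFactors = NULL: column sums of counts.
size_factors <- colSums(counts)

# Documented default when subindex is not given: rows with mean count > 50.
sub_index <- which(rowMeans(counts) > 50)

print(size_factors)
print(sub_index)
```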


Details

These functions implement the CQN (conditional quantile normalization) for RNA-Seq data. The functions remove a single systematic effect, contained in the argument x, which will typically be GC content. The effect of lengths will either be modelled as a smooth function (which we recommend), if you are using lengthMethod = "smooth", or as an offset (equivalent to modelling using RPKMs), if you are using lengthMethod = "fixed". Length can be completely removed from the model by having lengthMethod = "fixed" and setting all lengths to 1000.


Final corrected values are equal to value$y + value$offset.
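As a small sketch of the sentence above, the corrected values are an element-wise sum of the `y` and `offset` components of the returned list. To keep the snippet runnable without the package, `value` below is a hand-built stand-in with made-up numbers, not real cqn() output.

```r
# Stand-in for the list returned by cqn(); the numbers are invented for illustration.
value <- list(
  y      = matrix(c(5.0, 6.5, 4.2, 7.1), nrow = 2),
  offset = matrix(c(0.3, -0.2, 0.1, 0.4), nrow = 2)
)

# Final corrected values, as stated in the help text: y + offset.
corrected <- value$y + value$offset
print(corrected)
```

With the fitted object from the Examples section, the same expression would presumably read cqn.subset$y + cqn.subset$offset.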


Value

A list with the following components


counts

The value of argument counts.


x

The value of argument x.


lengths

The value of argument lengths.


sizeFactors

The value of argument sizeFactors. In case the argument was NULL, this is the value used internally.


subindex

The value of argument subindex. In case the argument was NULL, this is the value used internally.


y

The dependent value used in the systematic effect fit. Equal to log2 transformed reads per million.


offset

The estimated offset.


offset0

A single number used internally for identifiability.


glm.offset

An offset useful for supplying to a GLM type model function. It is on the natural log scale and includes correcting for sizeFactors.


func1

The estimated effect of function 1 (argument x). This is a matrix of function values on a grid. Columns are samples and rows are grid points.


grid1

The grid points on which function 1 (argument x) was evaluated.


knots1

The knots used for function 1 (argument x).


func2

The estimated effect of function 2 (lengths). This is a matrix of function values on a grid. Columns are samples and rows are grid points.


grid2

The grid points on which function 2 (lengths) was evaluated.


knots2

The knots used for function 2 (lengths).


call

The call.


Note

Internally, the function uses a custom implementation of subset quantile normalization, contained in the (not exported) SQN2 function.


Author(s)

Kasper Daniel Hansen, Zhijin Wu


References

KD Hansen, RA Irizarry, and Z Wu, Removing technical variability in RNA-seq data using conditional quantile normalization. Biostatistics 2012 vol. 13(2) pp. 204-216.


See Also

The package vignette.
