Title: | Exploratory Chemometrics for 2D Spectroscopy |
---|---|
Description: | A collection of functions for exploratory chemometrics of 2D spectroscopic data sets such as COSY (correlated spectroscopy) and HSQC (heteronuclear single quantum coherence) 2D NMR (nuclear magnetic resonance) spectra. 'ChemoSpec2D' deploys methods aimed primarily at classification of samples and the identification of spectral features which are important in distinguishing samples from each other. Each 2D spectrum (a matrix) is treated as the unit of observation, and thus the physical sample in the spectrometer corresponds to the sample from a statistical perspective. In addition to chemometric tools, a few tools are provided for plotting 2D spectra, but these are not intended to replace the functionality typically available on the spectrometer. 'ChemoSpec2D' takes many of its cues from 'ChemoSpec' and tries to create consistent graphical output and to be very user friendly. |
Authors: | Bryan A. Hanson [aut, cre] |
Maintainer: | Bryan A. Hanson <[email protected]> |
License: | GPL-3 |
Version: | 0.5.0 |
Built: | 2024-11-19 04:56:47 UTC |
Source: | https://github.com/bryanhanson/chemospec2d |
Description: A collection of functions for exploratory chemometrics of 2D spectroscopic data sets such as COSY and HSQC NMR spectra. ChemoSpec2D deploys methods aimed primarily at classification of samples and the identification of spectral features which are important in distinguishing samples from each other. Each 2D spectrum (a matrix) is treated as the unit of observation, and thus the physical sample in the spectrometer corresponds to the sample from a statistical perspective. In addition to chemometric tools, a few tools are provided for plotting 2D spectra, but these are not intended to replace the functionality typically available on the spectrometer. ChemoSpec2D takes many of its cues from ChemoSpec and tries to create consistent graphical output and to be very user friendly. A vignette is available.
Bryan A. Hanson.
Maintainer: Bryan A. Hanson [email protected]
Given a matrix or vector input, this function will assist in selecting levels for preparing
contour and image type plots. For instance, levels can be spaced evenly,
logrithmically, exponentially or using a cumulative distribution function.
NA
values are ignored.
calcLvls( M, n = 10, mode = "even", lambda = 1.5, base = 2, showHist = FALSE, ... )
calcLvls( M, n = 10, mode = "even", lambda = 1.5, base = 2, showHist = FALSE, ... )
M |
A numeric matrix or vector. |
n |
An integer giving the number of levels desired:
|
mode |
Character. One of |
lambda |
Numeric. A non-zero exponent used with |
base |
Integer. The base used with |
showHist |
Logical. Shall a histogram be drawn showing the location of the chosen levels? |
... |
Arguments to be passed downstream. |
A numeric vector giving the levels.
Bryan A. Hanson, DePauw University. [email protected]
set.seed(9) MM <- matrix(runif(100, -1, 1), nrow = 10) # test data tsts <- c("even", "log", "poslog", "exp", "posexp", "ecdf", "NMR") for (i in 1:length(tsts)) { nl <- 20 if (tsts[i] == "ecdf") nl <- seq(0.1, 0.9, 0.1) levels <- calcLvls( M = MM, n = nl, mode = tsts[i], showHist = TRUE, main = tsts[i] ) }
set.seed(9) MM <- matrix(runif(100, -1, 1), nrow = 10) # test data tsts <- c("even", "log", "poslog", "exp", "posexp", "ecdf", "NMR") for (i in 1:length(tsts)) { nl <- 20 if (tsts[i] == "ecdf") nl <- seq(0.1, 0.9, 0.1) levels <- calcLvls( M = MM, n = nl, mode = tsts[i], showHist = TRUE, main = tsts[i] ) }
This function will optionally center, and optionally scale, a Spectra2D
object along the
samples dimension (i.e. this is pixel-wise scaling in the language of multivariate
image analysis). Several scaling options are available.
centscaleSpectra2D(spectra, center = FALSE, scale = "noscale")
centscaleSpectra2D(spectra, center = FALSE, scale = "noscale")
spectra |
An object of S3 class |
center |
Logical. Should the spectra be centered before possibly scaling? Will give an error
if |
scale |
A character string indicating the type of scaling to apply. One of
|
An object of S3 class Spectra2D
.
Bryan A. Hanson, DePauw University.
R. Bro and A. K. Smilde "Centering and Scaling in Component Analysis" J. Chemometrics vol. 17 pgs 16-33 (2003).
normSpectra2D
for another means of scaling.
data(MUD1) tst <- centscaleSpectra2D(MUD1)
data(MUD1) tst <- centscaleSpectra2D(MUD1)
This function is used by ChemoSpec
and ChemoSpec2D
, but is
formally part of ChemoSpecUtils
. You can access
full documentation via check4Gaps
.
This function is used by ChemoSpec
and ChemoSpec2D
, but is
formally part of ChemoSpecUtils
. You can access
full documentation via chkSpectra
.
You can access
full documentation via colorSymbol
.
This function takes a range of frequencies for each dimension, and calculates the volume of the enclosed region.
computeVolume(spectra, F2range = NULL, F1range = NULL)
computeVolume(spectra, F2range = NULL, F1range = NULL)
spectra |
An object of S3 class |
F2range |
A formula giving a frequency range. May include
"low" or "high" representing the extremes of the spectra. Values below or above the range of
F2 are tolerated without notice and are handled as |
F1range |
As for |
A numeric vector of volumes, one for each spectrum.
Bryan A. Hanson, DePauw University.
data(MUD1) tst <- computeVolume(MUD1, F2range = 3 ~ 4, F1range = 55 ~ 70)
data(MUD1) tst <- computeVolume(MUD1, F2range = 3 ~ 4, F1range = 55 ~ 70)
This function imports data into a Spectra2D
object. It primarily uses
read.table
to read files so it is
very flexible in regard to file formatting. Be sure to see the ...
argument below for important details you need to provide.
files2Spectra2DObject( gr.crit = NULL, gr.cols = "auto", fmt = NULL, nF2 = NULL, x.unit = "no frequency unit provided", y.unit = "no frequency unit provided", z.unit = "no intensity unit provided", descrip = "no description provided", fileExt = "\\.(csv|CSV)$", out.file = "mydata", debug = 0, chk = TRUE, allowSloppy = FALSE, ... )
files2Spectra2DObject( gr.crit = NULL, gr.cols = "auto", fmt = NULL, nF2 = NULL, x.unit = "no frequency unit provided", y.unit = "no frequency unit provided", z.unit = "no intensity unit provided", descrip = "no description provided", fileExt = "\\.(csv|CSV)$", out.file = "mydata", debug = 0, chk = TRUE, allowSloppy = FALSE, ... )
gr.crit |
Group Criteria. A vector of character strings which will be
searched for among the file/sample names in order to assign an individual
spectrum to group membership. This is done using grep, so characters
like "." (period/dot) do not have their literal meaning (see below).
Warnings are issued if there are file/sample
names that don't match entries in |
gr.cols |
Group Colors. See
Colors will be assigned one for one, so the first element of
|
fmt |
A character string giving the format of the data. Consult
|
nF2 |
Integer giving the number of data points in the F2 (x) dimension. Note: If any dimension is zero-filled you may need to study the acquistion details to get the correct value for this argument. This may be vendor-dependent. |
x.unit |
A character string giving the units for the F2 dimension (frequency or wavelength corresponding to the x dimension). |
y.unit |
A character string giving the units for the F1 dimension (frequency or wavelength corresponding to the y dimension). |
z.unit |
A character string giving the units of the z-axis (some sort of intensity). |
descrip |
A character string describing the data set. |
fileExt |
A character string giving the extension of the files to be
processed. |
out.file |
A file name. The completed object of S3 class |
debug |
Integer. Set to 1 or |
chk |
Logical. Should the |
allowSloppy |
Logical. Experimental Feature If |
... |
Arguments to be passed to |
files2Spectra2DObject
acts on all files in the current working
directory with the specified fileExt
so there should be no
extraneous files with that extension in the directory.
One of these objects:
If allowSloppy = FALSE
, the default, an object of class Spectra2D
.
If allowSloppy = TRUE
, an object of undocumented class SloppySpectra2D
.
These objects are experimental and are not checked by chkSpectra
.
For these objects spectra$F1
and spectra$F2
are NA
, and each
spectra$data
entry is a list with elements F1, F2 and M, which is the matrix
of imported data (basically, the object returned by import2Dspectra
).
In each case,
an unnamed object of S3 class Spectra2D
or SloppySpectra2D
is also written to out.file
.
To read it back into the workspace, use new.name <- loadObject(out.file)
(loadObject
is package R.utils).
The matching of gr.crit
against the sample file names is done one at
a time, in order, using grep. While powerful, this has the potential to lead
to some "gotchas" in certain cases, noted below.
Your file system may allow file/sample names which R
will not like, and will
cause confusing behavior. File/sample names become variables in ChemoSpec
, and R
does not like things like "-" (minus sign or hyphen) in file/sample names. A hyphen
is converted to a period (".") if found, which is fine for a variable name.
However, a period in gr.crit
is interpreted from the grep point of view,
namely a period matches any single character. At this point, things may behave
very differently than one might hope. See make.names
for allowed
characters in R
variables and make sure your file/sample names comply.
The entries in gr.crit
must be
mutually exclusive. For example, if you have files with names like
"Control_1" and "Sample_1" and use gr.crit = c("Control", "Sample")
groups will be assigned as you would expect. But, if you have file names
like "Control_1_Shade" and "Sample_1_Sun" you can't use gr.crit =
c("Control", "Sample", "Sun", "Shade")
because each criteria is grepped in
order, and the "Sun/Shade" phrases, being last, will form the basis for your
groups. Because this is a grep process, you can get around this by using
regular expressions in your gr.crit
argument to specify the desired
groups in a mutually exclusive manner. In this second example, you could
use gr.crit = c("Control(.*)Sun"
, "Control(.*)Shade"
, "Sample(.*)Sun"
,
"Sample(.*)Shade")
to have your groups assigned based upon both phrases in
the file names.
To summarize, gr.crit
is used as a grep pattern, and the file/sample names
are the target. Make sure your file/sample names comply with make.names
.
Finally, samples whose names are not matched using gr.crit
are still
incorporated into the Spectra2D
object, but they are not
assigned a group. Therefore they don't plot, but they do take up space in a
plot! A warning is issued in these cases, since one wouldn't normally want
a spectrum to be orphaned this way.
All these problems can generally be identified by running sumSpectra
once the data is imported.
The ... argument can be used to pass any argument to read.table
or list.files
.
This includes the possibility of passing arguments that will cause trouble later, for instance
na.strings
in read.table
. While one might successfully read in data with NA
,
it will eventually cause problems. The intent of this feature is to allow one to recurse
a directory tree containing the data, and/or to specify a starting point other than the current
working directory. So for instance if the current working directory is not the directory containing
the data files, you can use path = "my_path"
to point to the desired top-level
directory, and recursive = TRUE
to work your way through a set of subdirectories. In addition,
if you are reading in JCAMP-DX files, you can pass arguments to readJDX
via ..., e.g. SOFC = FALSE
.
Finally, while argument fileExt
appears to be a file extension (from its
name and the description elsewhere), it's actually just a grep pattern that you can apply
to any part of the file name if you know how to construct the proper pattern.
Bryan A. Hanson, DePauw University.
Align the spectra in a Spectra2D
object using an implementation
of the HATS algorithm described by Robinette et al.. Currently, only
global, not local, alignment is carried out.
hats_alignSpectra2D( spectra, maxF2 = NULL, maxF1 = NULL, dist_method = "cosine", minimize = FALSE, thres = 0.99, no.it = 20L, restarts = 2L, method = "MBO", fill = "noise", plot = FALSE, debug = 1 )
hats_alignSpectra2D( spectra, maxF2 = NULL, maxF1 = NULL, dist_method = "cosine", minimize = FALSE, thres = 0.99, no.it = 20L, restarts = 2L, method = "MBO", fill = "noise", plot = FALSE, debug = 1 )
spectra |
An object of S3 class |
maxF2 |
Integer. The most extreme positive |
maxF1 |
Integer. As for |
dist_method |
Character. The distance method to use in the objective function.
See |
minimize |
Logical. Is the goal to minimize the objective function? If so, use |
thres |
Numeric. Prior to launching the optimization, the objective function is evaluated
for no shift in case this is actually the best alignment (saving a great deal of time).
If this initial check exceeds the value of |
no.it |
Integer. The maximum number of iterations in the optimization. |
restarts |
Integer. The maximum number of independent rounds of optimization. |
method |
Character. Currently only |
fill |
Aligning spectra requires that at least some spectra be shifted left/right and up/down. When a spectrum is shifted, spaces are opened that must be filled with something:
|
plot |
Logical. Shall a plot of the alignment progress be made? The plot is useful for diagnostic purposes. Every step of the alignment has a corresponding plot so you should probably direct the output to a pdf file. |
debug |
Integer.
|
An object of S3 class Spectra2D
.
I suggest that you plot your possibly mis-aligned spectra first, zooming in on a small region where
alignment might be an issue, and get an idea of the size of the alignment problem. This will help
you choose good values for maxF1
and maxF2
which will speed up the search.
The algorithm uses random numbers to initialize the search, so set the seed for reproducible results. Different seeds may give different results; you may find it useful to experiment a bit and see how the alignment turns out.
Be sure that your choice of thres
, minimize
and dist_method
are self-consistent.
Some dist_method
choices are bounded, others unbounded, and some should be minimized, others maximized.
You should use sampleDist
to visualize the distances ahead of time. The method
chosen should return a wide numerical range between samples or it won't give a good alignment result.
Bryan A. Hanson, DePauw University.
Roughly follows the algorithm described in Robinette et al. 2011 Anal. Chem. vol. 83, 1649-1657 (2011) dx.doi.org/10.1021/ac102724x
## Not run: set.seed(123) library("ggally") # for diagnostic plots data(MUD2) sumSpectra(MUD2) mylvls <- seq(3, 30, 3) # Plot before alignment plotSpectra2D(MUD2, which = c(2, 3, 5, 6), showGrid = TRUE, lvls = LofL(mylvls, 4), cols = LofC(c("black", "red", "blue", "green"), 4, 10, 2) ) # Carry out alignment # You might want to direct the diagnostic output here to a pdf file # This alignment takes about 90 seconds including the plotting overhead MUD2a <- hats_alignSpectra2D(MUD2, method = "MBO", debug = 1, plot = TRUE) # Plot after alignment plotSpectra2D(MUD2a, which = c(2, 3, 5, 6), showGrid = TRUE, lvls = LofL(mylvls, 4), cols = LofC(c("black", "red", "blue", "green"), 4, 10, 2) ) ## End(Not run)
## Not run: set.seed(123) library("ggally") # for diagnostic plots data(MUD2) sumSpectra(MUD2) mylvls <- seq(3, 30, 3) # Plot before alignment plotSpectra2D(MUD2, which = c(2, 3, 5, 6), showGrid = TRUE, lvls = LofL(mylvls, 4), cols = LofC(c("black", "red", "blue", "green"), 4, 10, 2) ) # Carry out alignment # You might want to direct the diagnostic output here to a pdf file # This alignment takes about 90 seconds including the plotting overhead MUD2a <- hats_alignSpectra2D(MUD2, method = "MBO", debug = 1, plot = TRUE) # Plot after alignment plotSpectra2D(MUD2a, which = c(2, 3, 5, 6), showGrid = TRUE, lvls = LofL(mylvls, 4), cols = LofC(c("black", "red", "blue", "green"), 4, 10, 2) ) ## End(Not run)
This function imports a single file (for instance, a csv) containing a 2D
spectroscopic data set. The current version handles various types of
ASCII text files as well as a few other types.
This function is called by files2Spectra2DObject
and is exported and documented to assist in developing new format codes.
import2Dspectra(file, fmt, nF2, debug = 0, ...)
import2Dspectra(file, fmt, nF2, debug = 0, ...)
file |
Character string giving the path to a file containing a 2D spectrum. |
fmt |
Character string giving the format code to use. Details below. |
nF2 |
Integer giving the number of data points in the F2 (x) dimension. Note: If any dimension is zero-filled you may need to study the acquistion details to get the correct value for this argument. This may be vendor-dependent. |
debug |
Integer. Applies to |
... |
Arguments to be passed to |
A list with 3 elements:
A matrix of the z values. The no. of columns = nF2
and the no. of rows
follows from the size of the imported data.
A vector giving the F2 (x) values.
A vector giving the F1 (y) values.
ASCII format codes are constructed in two parts separated by a hyphen. Three or more columns are expected. The first part gives the order of the columns in the file, e.g. F2F1R means the first column has the F2 values, the second column has the F1 values and the third the real-valued intensities. The second part of the format code relates to the order of the rows, i.e. which column varies fastest and in what direction. These codes are best understood in relation to how the data is stored internally in a matrix. The internal matrix is organized exactly as the data appears on the screen, with F2 decreasing left-to-right, and F1 increasing top-to-bottom. There are many possible formats (only those listed are implemented, please e-mail for help creating additional combinations):
"F2F1R-F2decF1dec"
Columns in the file are F2 (x), F1 (y), real. Both F2 and F1 are decreasing.
Last row is first in the file. This format is used at least some of the time by
nmrPipe.
fmt = "F1F2RI-F1decF2dec2"
Columns in the file are F1 (y), F2 (x), real and imaginary
(imaginary data will be skipped). F1 is held at a fixed value while F2 decreases. F1 starts high
and decreases, so last row is first in the file. There are two sets of data in the file:
The data after FT'ing along F2 only, and the data after FT'ing along both dimensions. The "2"
in the format name means we are taking the second data set.
This format is used by JEOL when exporting to "generic ascii". Argument nF2
is ignored
with this format as the value is sought from the corresponding *.hdr
file. Doing so
also allows one to import files with slightly different F1 and or F2, but for this to be successful
you will need to 1) set allowSloppy = TRUE
in the call to files2Spectra2DObject
and
2) harmonize the dimensions manually after initial import.
Here are some other format codes you can use:
"SimpleM"
. Imports matrices composed of z values. The F2 and F1 values
are created from the dimension of the matrix. After import, you will have to manually
convert the F2 and F1 values to ppm. You may also have to transpose the matrices, or
perhaps invert the order of the rows or columns. Imported via read.table
.
"Btotxt"
. This format imports Bruker data written to a file using the Bruker
"totxt" command. Tested with TopSpin 4.0.7. This format is read via readLines
and thus the ... argument does not apply.
"dx"
. This format imports files written in the JCAMP-DX format, via
package readJDX
.
Bryan A. Hanson, DePauw University.
Given a Spectra2D
object, this function will assist in selecting levels
for preparing contour and image type plots.
Any of the arguments to calcLvls
can be used
to compute the levels, or you can choose your own by inspection.
inspectLvls(spectra, which = 1, ...)
inspectLvls(spectra, which = 1, ...)
spectra |
An object of S3 class |
which |
Integer. The spectrum/spectra to be analyzed. If a vector, the intensities are combined. |
... |
Arguments to be passed downstream to |
A numeric vector giving the levels (invisibly).
Bryan A. Hanson, DePauw University. [email protected]
See pfacSpectra2D
for further examples.
data(MUD1) inspectLvls(MUD1, ylim = c(0, 300), main = "MUD1 Spectrum 1, mode = even") inspectLvls(MUD1, ylim = c(0, 300), mode = "NMR", main = "MUD1 Spectrum 1, mode = NMR")
data(MUD1) inspectLvls(MUD1, ylim = c(0, 300), main = "MUD1 Spectrum 1, mode = even") inspectLvls(MUD1, ylim = c(0, 300), mode = "NMR", main = "MUD1 Spectrum 1, mode = NMR")
When overlaying multiple 2D NMR spectra, one needs to supply a vector of colors for each spectrum contour,
as a list with length equal to the number of spectra to be plotted.
This is a convenience function which takes a vector of colors and copies it into a list, ready for
use with plotSpectra2D
.
LofC(cols, nspec = 1L, ncon = 1L, mode = 1L)
LofC(cols, nspec = 1L, ncon = 1L, mode = 1L)
cols |
Character. A vector of color designations. |
nspec |
Integer. The number of spectra to be plotted. |
ncon |
Integer. The number of contour levels. |
mode |
Integer. How to replicate the colors:
|
A list of length nspec
, the number of spectra to be plotted; each entry is a vector of colors.
mycols <- c("red", "green", "blue") LofC(mycols, 1, 3, 1) LofC(mycols, 3, 3, 2)
mycols <- c("red", "green", "blue") LofC(mycols, 1, 3, 1) LofC(mycols, 3, 3, 2)
When overlaying multiple 2D NMR spectra, one needs to supply a vector of contour levels for each spectrum, as a list.
This is a convenience function which takes a set of levels and copies it into a list, ready for
use with plotSpectra2D
.
LofL(lvls, n)
LofL(lvls, n)
lvls |
Numeric. A vector of the desired levels. |
n |
Integer. The number of spectra to be plotted, which is also the number of times to replicate the levels. |
A list of length n
; each entry is a copy of lvls
.
plotSpectra2D
for an example.
Carry out multivariate image analysis of a Spectra2D
object
(multivariate image analysis is the same as a Tucker1 analysis).
Function pcasup1
from package ThreeWay is used.
miaSpectra2D(spectra)
miaSpectra2D(spectra)
spectra |
An object of S3 class |
A list per pcasup1
. Of particular interest are the
elements C
containing the eigenvectors and 1c
containing the eigenvalues.
We add the class mia
to the list for our use later, as well as a method
element for annotating plots.
Bryan A. Hanson, DePauw University.
A. Smilde, R. Bro and P. Geladi "Multi-way Analysis: Applications in the Chemical Sciences" Wiley (2004). See especially Example 4.5.
P. Geladi and H. Grahn "Multivariate Image Analysis" Wiley (1996). Note that in this text the meanings of scores and loadings are reversed from the usual spectroscopic uses of the terms.
For other data reduction methods for Spectra2D
objects, see
pfacSpectra2D
and popSpectra2D
.
library("ggplot2") data(MUD1) res <- miaSpectra2D(MUD1) # plotScores & plotScree use ggplot2 graphics p1 <- plotScores(MUD1, res, tol = 1.0, ellipse = "cls") p1 <- p1 + ggtitle("MIA Scores") p1 p2 <- plotScree(res) p2 # plotLoadings2D uses base graphics MUD1a <- plotLoadings2D(MUD1, res, load_lvls = seq(-90, 0, 10), main = "MIA Comp. 1 Loadings" ) # Selection of loading matrix levels can be aided by the following # Use MUD1a$names to find the index of the loadings inspectLvls(MUD1a, which = 11, ylim = c(0, 80), main = "Histogram of Loadings Matrix" )
library("ggplot2") data(MUD1) res <- miaSpectra2D(MUD1) # plotScores & plotScree use ggplot2 graphics p1 <- plotScores(MUD1, res, tol = 1.0, ellipse = "cls") p1 <- p1 + ggtitle("MIA Scores") p1 p2 <- plotScree(res) p2 # plotLoadings2D uses base graphics MUD1a <- plotLoadings2D(MUD1, res, load_lvls = seq(-90, 0, 10), main = "MIA Comp. 1 Loadings" ) # Selection of loading matrix levels can be aided by the following # Use MUD1a$names to find the index of the loadings inspectLvls(MUD1a, which = 11, ylim = c(0, 80), main = "Histogram of Loadings Matrix" )
Made Up Data that resemble simple, HSQC-like 2D NMR data sets. Lean, low resolution
and designed primarily to check graphics and test functions. As this is made up
data, there is no underlying tri-linear structure and therefore one should NOT try to interpret
the output of miaSpectra2D
or pfacSpectra2D
run on this data.
MUD1
is intended to test and demonstrate data reduction functions. The HSQC-like data
is derived from the 1H and 13C spectra of 3-methyl-1-butanol and the corresponding ethyl ether,
idealized slightly for simplicity. There are 10 spectra. Sample 1 is the alcohol; samples 2-5 are
the alcohol with local shifts (specifically, two peaks have been shifted +/- one data point).
Samples 6-10 are the ether, treated in a similar fashion.
MUD2
is intended to test and demonstrate alignment algorithms. The HSQC-like data
is derived from the 1H and 13C spectra of 3-methyl-1-butanol, idealized slightly for
simplicity. There are 10 spectra. The first one is "correct" and the other samples have global shifts on one
or both dimensions.
The data are stored as a Spectra2D
object.
Bryan A. Hanson, DePauw University.
Created from scratch. Contact the author for a script if interested.
This function carries out normalization of the spectra in a
Spectra2D
object. The current options are:
"zero2one"
normalizes each 2D spectrum to a [0 ... 1] scale.
"minusPlus"
normalizes each 2D spectrum to a [-1 ... 1] scale.
"TotInt"
normalizes each 2D spectrum so that the total area is one.
normSpectra2D(spectra, method = "zero2one")
normSpectra2D(spectra, method = "zero2one")
spectra |
An object of S3 class |
method |
Character string giving the method for normalization. |
An object of S3 class Spectra2D
.
Bryan A. Hanson, DePauw University.
centscaleSpectra2D
for another means of scaling.
data(MUD1) MUD1n <- normSpectra2D(MUD1) MUD1b <- removeFreq(MUD1, remF2 = 2.5 ~ 3.5) MUD1bn <- normSpectra2D(MUD1b)
data(MUD1) MUD1n <- normSpectra2D(MUD1) MUD1b <- removeFreq(MUD1, remF2 = 2.5 ~ 3.5) MUD1bn <- normSpectra2D(MUD1b)
Carry out PARAFAC analysis of a Spectra2D
object.
Function parafac
from multiway is used.
For large data sets, computational time may be long enough that
it might desirable to run in batch mode and possibly use parallel processing.
pfacSpectra2D(spectra, parallel = FALSE, setup = FALSE, nfac = 2, ...)
pfacSpectra2D(spectra, parallel = FALSE, setup = FALSE, nfac = 2, ...)
spectra |
An object of S3 class |
parallel |
Logical. Should parallel processing be used?
Unless you love waiting, you should use parallel processing for larger data sets.
If you are working on a shared machine and/or another process (created by you or
another user) might also try to access all or some of the cores in your CPU,
you should be careful to avoid hogging the cores.
|
setup |
Logical. If |
nfac |
Integer. The number of factors/components to compute. |
... |
Additional parameters to be passed to function |
An object of class pfac
and parafac
, modified to include a list
element called $method
which is parafac
.
To get reproducible results you will need to set.seed()
. See the example.
Bryan A. Hanson, DePauw University.
R. Bro "PARAFAC. Tutorial and applications" Chemometrics and Intelligent Laboratory Systems vol. 38 pgs. 149-171 (1997).
A. Smilde, R. Bro and P. Geladi "Multi-way Analysis: Applications in the Chemical Sciences" Wiley (2004).
For other data reduction methods for Spectra2D
objects, see
miaSpectra2D
and popSpectra2D
.
library("ggplot2") data(MUD1) set.seed(123) res <- pfacSpectra2D(MUD1, parallel = FALSE, nfac = 2) # plotScores uses ggplot2 graphics p1 <- plotScores(MUD1, res, leg.loc = "topright", ellipse = "cls") p1 <- p1 + ggtitle("PARAFAC Score Plot") p1 # plotLoadings2D uses base graphics res1 <- plotLoadings2D(MUD1, res, load_lvls = c(1, 5, 10, 15, 25), main = "PARAFAC Comp. 1 Loadings") res2 <- plotLoadings2D(MUD1, res, load_lvls = c(1, 5, 10, 15, 25), ref = 2, ref_lvls = seq(5, 35, 5), ref_cols = rep("black", 7), main = "PARAFAC Comp. 1 Loadings + Ref. Spectrum") # Selection of loading matrix levels can be aided by the following # Use res1$names to find the index of the loadings inspectLvls(res1, which = 11, ylim = c(0, 50), main = "Histogram of Loadings Matrix")
library("ggplot2") data(MUD1) set.seed(123) res <- pfacSpectra2D(MUD1, parallel = FALSE, nfac = 2) # plotScores uses ggplot2 graphics p1 <- plotScores(MUD1, res, leg.loc = "topright", ellipse = "cls") p1 <- p1 + ggtitle("PARAFAC Score Plot") p1 # plotLoadings2D uses base graphics res1 <- plotLoadings2D(MUD1, res, load_lvls = c(1, 5, 10, 15, 25), main = "PARAFAC Comp. 1 Loadings") res2 <- plotLoadings2D(MUD1, res, load_lvls = c(1, 5, 10, 15, 25), ref = 2, ref_lvls = seq(5, 35, 5), ref_cols = rep("black", 7), main = "PARAFAC Comp. 1 Loadings + Ref. Spectrum") # Selection of loading matrix levels can be aided by the following # Use res1$names to find the index of the loadings inspectLvls(res1, which = 11, ylim = c(0, 50), main = "Histogram of Loadings Matrix")
Computes (if necessary) and plots loadings from a PARAFAC, MIA or POP analysis of a
Spectra2D
object. The loadings matrix has has dimensions
F1 x F2 and is a 2D pseudo-spectrum. A reference spectrum may also be drawn.
plotLoadings2D( spectra, so, load = 1, ref = NULL, load_lvls = NULL, ref_lvls = NULL, load_cols = NULL, ref_cols = NULL, plot = TRUE, ... )
plotLoadings2D( spectra, so, load = 1, ref = NULL, load_lvls = NULL, ref_lvls = NULL, load_cols = NULL, ref_cols = NULL, plot = TRUE, ... )
spectra |
An object of S3 class |
so |
("Score Object") One of the following:
|
load |
An integer specifying the loading to plot. |
ref |
An integer giving the spectrum in |
load_lvls |
A vector specifying the contour levels
for the loadings pseudo-spectrum.
If |
ref_lvls |
A vector specifying the levels at which to compute contours
for the reference spectrum.
If |
load_cols |
A vector specifying the colors for the contours in the laoding spectrum.
If |
ref_cols |
A vector specifying the colors for the contours in the reference
spectrum. If |
plot |
Logical. Shall a plot be made? Plotting large data sets can be slow.
Run the function with |
... |
Additional parameters to be passed to plotting functions. For instance
|
The modified Spectra2D
object is returned invisibly.
The loadings matrix will be appended with a sample of name of Loadings_x where
x = load
. Side effect is a plot.
You can view the color scale for the plot via showScale
.
The number of levels and colors must match, and they are used 1 for 1. If you
provide n
colors, and no levels, the automatic calculation of levels may return
a number of levels other than n
, in which case the function will override your colors and
assign new colors for the number of levels it computed (with a message). To get
exactly what you want, specify both levels and colors in equal numbers. Function
inspectLvls
can help you choose appropriate levels.
If you specify more than one spectrum to plot, e.g. which = c(1,2)
, then
arguments lvls
and cols
must be lists of levels and colors, one list
element for each spectrum to be plotted (if specified at all). Two convenience functions exist to
make this process easier: LofL
and LofC
. See the examples.
Bryan A. Hanson, DePauw University.
Please see pfacSpectra2D
, miaSpectra2D
or
popSpectra2D
for examples.
This function is used by ChemoSpec
and ChemoSpec2D
, but is
formally part of ChemoSpecUtils
. You can access
full documentation via plotScores
.
This function is used by ChemoSpec
and ChemoSpec2D
, but is
formally part of ChemoSpecUtils
. You can access
full documentation via plotScree
.
Plots a slice of a 2D spectrum stored in a Spectra2D
object.
plotSlice(spectra, which = 1, F2 = NULL, F1 = NULL, showGrid = TRUE, ...)
plotSlice(spectra, which = 1, F2 = NULL, F1 = NULL, showGrid = TRUE, ...)
spectra |
An object of S3 class |
which |
A single integer specifying which 2D spectrum from which to plot the slice. |
F2 |
A single frequency to plot. Matched to the nearest value. |
F1 |
As for |
showGrid |
Logical. If TRUE, show a dotted gray line at each x axis tick mark. |
... |
Additional parameters to be passed to the plotting routines. |
Side effect is a plot.
Only one of F2
or F1
should be given.
Bryan A. Hanson, DePauw University.
data(MUD1) plotSlice(MUD1, F1 = 22, main = "Slice @ F1 = 22 ppm")
data(MUD1) plotSlice(MUD1, F1 = 22, main = "Slice @ F1 = 22 ppm")
Plots a 2D spectrum stored in a Spectra2D
object.
This is primarily for inspection and for preparation of final plots.
If you need to do extensive exploration, you should probably go back
to the spectrometer.
plotSpectra2D( spectra, which = 1, lvls = NULL, cols = NULL, showNA = TRUE, showGrid = FALSE, ... )
plotSpectra2D( spectra, which = 1, lvls = NULL, cols = NULL, showNA = TRUE, showGrid = FALSE, ... )
spectra |
An object of S3 class |
which |
An integer specifying which spectrum to plot. May be a vector. |
lvls |
A numeric vector specifying the levels at which to compute contours.
If |
cols |
A vector of valid color designations. If provided, must be of the
the same length as |
showNA |
Logical. Should the locations of peaks removed by |
showGrid |
Logical. If TRUE, show a dotted gray line at each tick mark. |
... |
Additional parameters to be passed to the plotting routines. |
Side effect is a plot.
One cannot remove frequencies from the interior of a 2D NMR data set and expect to get a meaningful contour plot, because doing so puts unrelated peaks adjacent in the data set. This would lead to contours being drawn that don't exist in the original data set. This function will check for missing frequencies and stops if any are found.
You can view the color scale for the plot via showScale
.
The number of levels and colors must match, and they are used 1 for 1. If you
provide n
colors, and no levels, the automatic calculation of levels may return
a number of levels other than n
, in which case the function will override your colors and
assign new colors for the number of levels it computed (with a message). To get
exactly what you want, specify both levels and colors in equal numbers. Function
inspectLvls
can help you choose appropriate levels.
If you specify more than one spectrum to plot, e.g. which = c(1,2)
, then
arguments lvls
and cols
must be lists of levels and colors, one list
element for each spectrum to be plotted (if specified at all). Two convenience functions exist to
make this process easier: LofL
and LofC
. See the examples.
Bryan A. Hanson, DePauw University.
data(MUD1) mylvls <- seq(5, 30, 5) plotSpectra2D(MUD1, which = 7, lvls = mylvls, main = "MUD1 Sample 7" ) # Invert the screen which makes the colors pop! op <- par(no.readonly = TRUE) par(bg = "black", fg = "white", col.axis = "white", col.main = "white") plotSpectra2D(MUD1, which = 7, lvls = mylvls, main = "MUD1 Sample 7" ) par(op) # Overlay multiple spectra: plotSpectra2D(MUD1, which = c(6, 1), lvls = LofL(mylvls, 2), cols = LofC(c("red", "black"), 2, length(mylvls), 2), main = "MUD1 Sample 1 (red) & Sample 6 (black)\n(4 of 6 peaks overlap)" )
data(MUD1) mylvls <- seq(5, 30, 5) plotSpectra2D(MUD1, which = 7, lvls = mylvls, main = "MUD1 Sample 7" ) # Invert the screen which makes the colors pop! op <- par(no.readonly = TRUE) par(bg = "black", fg = "white", col.axis = "white", col.main = "white") plotSpectra2D(MUD1, which = 7, lvls = mylvls, main = "MUD1 Sample 7" ) par(op) # Overlay multiple spectra: plotSpectra2D(MUD1, which = c(6, 1), lvls = LofL(mylvls, 2), cols = LofC(c("red", "black"), 2, length(mylvls), 2), main = "MUD1 Sample 1 (red) & Sample 6 (black)\n(4 of 6 peaks overlap)" )
This function unstacks a Spectra2D
object and conducts IRLBA
PCA on it.
To unstack, each F1 slice (parallel to F2) is concatenated one after the other
so that each 2D spectrum becomes a 1D spectrum. The length of this spectrum will be
equal to the length of the F2 dimension times the length of the F1 dimension.
PCA is performed on the collection of 1D spectra (one spectrum from each 2D spectrum).
The IRLBA algorithm is used because the resulting matrix (n samples in rows x F1 * F2 columns)
can be very large, and other PCA algorithms can struggle.
popSpectra2D(spectra, n = 3, choice = "noscale", ...)
popSpectra2D(spectra, n = 3, choice = "noscale", ...)
spectra |
An object of S3 class |
n |
Integer. The number of components desired. |
choice |
A character string indicating the choice of scaling. One of
|
... |
Other parameters to be passed to |
The scale choice autoscale
scales the columns by their standard
deviation. Pareto
scales by the square root of the standard
deviation. "autoscale"
is called "standard normal variate" or "correlation matrix PCA"
in some literature. This action is performed on the unstacked matrix, as is centering.
An object of classes prcomp
, pop
and computed_via_irlba
modified to include a list element called $method
, a character string describing the
pre-processing carried out and the type of PCA performed (used to annotate
plots).
Bryan A. Hanson, DePauw University.
J. Baglama and L. Reichel, "Augmented Implicitly Restarted Lanczos Bidiagonalization Methods" SIAM J. Sci. Comput. (2005).
For other data reduction methods for Spectra2D
objects, see
miaSpectra2D
and pfacSpectra2D
.
library("ggplot2") data(MUD1) res <- popSpectra2D(MUD1) # plotScores & plotScree use ggplot2 graphics p1 <- plotScores(MUD1, res, ellipse = "cls") p1 <- p1 + ggtitle("POP Scores") p1 p2 <- plotScree(res) p2 # plotLoadings2D uses base graphics MUD1a <- plotLoadings2D(MUD1, res, load_lvls = c(-0.2, -0.1, 0.1, 0.2), load_cols = rep("black", 4), main = "POP Comp. 1 Loadings")
library("ggplot2") data(MUD1) res <- popSpectra2D(MUD1) # plotScores & plotScree use ggplot2 graphics p1 <- plotScores(MUD1, res, ellipse = "cls") p1 <- p1 + ggtitle("POP Scores") p1 p2 <- plotScree(res) p2 # plotLoadings2D uses base graphics MUD1a <- plotLoadings2D(MUD1, res, load_lvls = c(-0.2, -0.1, 0.1, 0.2), load_cols = rep("black", 4), main = "POP Comp. 1 Loadings")
This function is used by ChemoSpec
and ChemoSpec2D
, but is
formally part of ChemoSpecUtils
. You can access
full documentation via removeFreq
.
This function is used by ChemoSpec
and ChemoSpec2D
, but is
formally part of ChemoSpecUtils
. You can access
full documentation via removeGroup
.
This function sets peaks at specified frequencies in a Spectra2D
object to NA
. This effectively removes these peaks from calculations of
contours which can speed things up and clarifies the visual presentation of data.
This function is useful for removing regions with large
interfering peaks (e.g. the water peak in 1H NMR), or regions that are primarily
noise. This function leaves the frequency axes intact. Note that the
parafac
function, used by pfacSpectra2D
, does not allow NA
in the input data matrices. See removeFreq
for a way to shrink
the data set without introducing NA
s.
removePeaks2D(spectra, remF2 = NULL, remF1 = NULL)
removePeaks2D(spectra, remF2 = NULL, remF1 = NULL)
spectra |
An object of S3 class |
remF2 |
A formula giving the range of frequencies to be set to |
remF1 |
As for |
An object of S3 class Spectra2D
.
Bryan A. Hanson, DePauw University.
# Note we will set contours a bit low to better # show what is going on. data(MUD1) plotSpectra2D(MUD1, which = 7, lvls = 0.1, cols = "black", main = "MUD1 Sample 7: Complete Data Set" ) MUD1a <- removePeaks2D(MUD1, remF2 = 2.5 ~ 4) sumSpectra(MUD1a) plotSpectra2D(MUD1a, which = 7, lvls = 0.1, cols = "black", main = "MUD1 Sample 7\nRemoved Peaks: F2 2.5 ~ 4" ) MUD1b <- removePeaks2D(MUD1, remF2 = low ~ 2) sumSpectra(MUD1b) plotSpectra2D(MUD1b, which = 7, lvls = 0.1, cols = "black", main = "MUD1 Sample 7\nRemoved Peaks: F2 low ~ 2" ) MUD1c <- removePeaks2D(MUD1, remF1 = high ~ 23) sumSpectra(MUD1c) plotSpectra2D(MUD1c, which = 7, lvls = 0.1, cols = "black", main = "MUD1 Sample 7\nRemoved Peaks: F1 high ~ 23" ) MUD1d <- removePeaks2D(MUD1, remF2 = 2.5 ~ 4, remF1 = 45 ~ 55) sumSpectra(MUD1d) plotSpectra2D(MUD1d, which = 7, lvls = 0.1, cols = "black", main = "MUD1 Sample 7\nRemoved Peaks: F2 2.5 ~ 4 & F1 45 ~ 55" )
# Note we will set contours a bit low to better # show what is going on. data(MUD1) plotSpectra2D(MUD1, which = 7, lvls = 0.1, cols = "black", main = "MUD1 Sample 7: Complete Data Set" ) MUD1a <- removePeaks2D(MUD1, remF2 = 2.5 ~ 4) sumSpectra(MUD1a) plotSpectra2D(MUD1a, which = 7, lvls = 0.1, cols = "black", main = "MUD1 Sample 7\nRemoved Peaks: F2 2.5 ~ 4" ) MUD1b <- removePeaks2D(MUD1, remF2 = low ~ 2) sumSpectra(MUD1b) plotSpectra2D(MUD1b, which = 7, lvls = 0.1, cols = "black", main = "MUD1 Sample 7\nRemoved Peaks: F2 low ~ 2" ) MUD1c <- removePeaks2D(MUD1, remF1 = high ~ 23) sumSpectra(MUD1c) plotSpectra2D(MUD1c, which = 7, lvls = 0.1, cols = "black", main = "MUD1 Sample 7\nRemoved Peaks: F1 high ~ 23" ) MUD1d <- removePeaks2D(MUD1, remF2 = 2.5 ~ 4, remF1 = 45 ~ 55) sumSpectra(MUD1d) plotSpectra2D(MUD1d, which = 7, lvls = 0.1, cols = "black", main = "MUD1 Sample 7\nRemoved Peaks: F2 2.5 ~ 4 & F1 45 ~ 55" )
This function is used by ChemoSpec
and ChemoSpec2D
, but is
formally part of ChemoSpecUtils
. You can access
full documentation via removeSample
.
This function is used by ChemoSpec
and ChemoSpec2D
, but is
formally part of ChemoSpecUtils
. You can access
full documentation via sampleDist
.
Shift the spectra in a Spectra2D
object manually. During shifting, some rows
or columns are thrown away and new rows or columns are introduced. These new entries may
be filled with zeros, or noise from the original spectra.
(+) shiftF2
- shift right: trim right, fill left
(-) shiftF2
- shift left: trim left, fill right
(+) shiftF1
- shift up: trim top, fill bottom
(-) shiftF1
- shift down: trim bottom, fill top
shiftSpectra2D( spectra, which = NULL, shiftF2 = 0L, shiftF1 = 0L, fill = "noise" )
shiftSpectra2D( spectra, which = NULL, shiftF2 = 0L, shiftF1 = 0L, fill = "noise" )
spectra |
An object of S3 class |
which |
An integer specifying which spectra to shift. May be a vector. |
shiftF2 |
Integer. The number of data points to shift along the F2 dimension. See Details. |
shiftF1 |
As per |
fill |
Aligning spectra requires that at least some spectra be shifted left/right and up/down. When a spectrum is shifted, spaces are opened that must be filled with something:
|
An object of S3 class Spectra2D
.
Bryan A. Hanson, DePauw University.
data(MUD2) # Show the first two spectra, overlaid mylvls <- seq(5, 35, 5) plotSpectra2D(MUD2, which = 1:2, lvls = LofL(mylvls, 2), cols = LofC(c("red", "black"), 2, length(mylvls), 2), main = "MUD2 Sample 1 (black) & Sample 2 (red)" ) # Now shift Sample 2 MUD2s <- shiftSpectra2D(MUD2, which = 2, shiftF1 = -2) plotSpectra2D(MUD2s, which = 1:2, lvls = LofL(mylvls, 2), cols = LofC(c("red", "black"), 2, length(mylvls), 2), main = "MUD2 Sample 1 (black) & Sample 2 (red)\n(samples now aligned/overlap)" )
data(MUD2) # Show the first two spectra, overlaid mylvls <- seq(5, 35, 5) plotSpectra2D(MUD2, which = 1:2, lvls = LofL(mylvls, 2), cols = LofC(c("red", "black"), 2, length(mylvls), 2), main = "MUD2 Sample 1 (black) & Sample 2 (red)" ) # Now shift Sample 2 MUD2s <- shiftSpectra2D(MUD2, which = 2, shiftF1 = -2) plotSpectra2D(MUD2s, which = 1:2, lvls = LofL(mylvls, 2), cols = LofC(c("red", "black"), 2, length(mylvls), 2), main = "MUD2 Sample 1 (black) & Sample 2 (red)\n(samples now aligned/overlap)" )
This function opens the files containing the contour scale in an appropriate viewer. One file has a white background and the other a black background.
showScale()
showScale()
## Not run: showScale() ## End(Not run)
## Not run: showScale() ## End(Not run)
In ChemoSpec2D
, spectral data sets are stored in an S3 class called
Spectra2D
, which contains a variety of information in addition to the
spectra themselves. Spectra2D
objects are created by
files2Spectra2DObject
.
The structure of a Spectra2D
object is a list of eight
elements and an attribute as follows:
w
element | type | description |
$F2 | num | A common frequency (or wavelength) axis corresponding |
to the F2 dimension in NMR or the x axis more generally. | ||
Must be sorted ascending. | ||
$F1 | num | A common frequency (or wavelength) axis corresponding |
to the F1 dimension in NMR or the y axis more generally. | ||
Must be sorted ascending. | ||
$data | num | A list of matrices. Each matrix contains a 2D spectrum. |
Each matrix should have length(F1) rows and |
||
length(F2) columns. The matrix must not have dimnames. |
||
The low end of the F2 dimension is last column of the last row | ||
(lower right hand corner as typically displayed). The low end of | ||
the F1 dimension is the last column of the first row (upper right hand corner). | ||
In other words, the spectrum is stored as typically displayed. | ||
The list of matrices, if named, should have the same names as | ||
names . However, this is not currently enforced. |
||
$names | chr | The sample names for the spectra; length must be no. samples. |
$groups | factor | The group classification of the samples; length must be no. samples. |
$colors | character | Colors for plotting; length must be no. samples. |
Colors correspond to groups. | ||
$units | chr | Three entries, the first giving the F2 (x) axis unit, the |
second the F1 (y) axis unit, and the third the z axis unit, | ||
usually some kind of intensity. | ||
$desc | chr | A character string describing the data set. |
- attr | chr | "Spectra2D" The S3 class designation. |
Bryan A. Hanson, DePauw University.
sumSpectra
to summarize a Spectra2D
object.
sumGroups
to summarize group membership of a Spectra2D
object. chkSpectra
to verify the integrity of a
Spectra2D
object.
data(MUD1) str(MUD1) sumSpectra(MUD1)
data(MUD1) str(MUD1) sumSpectra(MUD1)
This function is used by ChemoSpec
and ChemoSpec2D
, but is
formally part of ChemoSpecUtils
. You can access
full documentation via sumGroups
.
This function is used by ChemoSpec
and ChemoSpec2D
, but is
formally part of ChemoSpecUtils
. You can access
full documentation via sumSpectra
.
This function is used by ChemoSpec
and ChemoSpec2D
, but is
formally part of ChemoSpecUtils
. You can access
full documentation via updateGroups
.