From d.scott at auckland.ac.nz Fri Apr 1 00:26:33 2011
From: d.scott at auckland.ac.nz (David Scott)
Date: Fri, 01 Apr 2011 11:26:33 +1300
Subject: [R] generate random numbers
In-Reply-To: 
References: 
Message-ID: <4D94FF99.3040903@auckland.ac.nz>

On 01/04/11 08:50, Ted Harding wrote:
> On 31-Mar-11 19:23:33, Anna Lee wrote:
>> Hey List,
>> does anyone know how I can generate a vector of random numbers
>> from a given distribution? Something like "rnorm" just for non
>> normal distributions???
>>
>> Thanks a lot!
>> Anna
> Suppose we give your distribution the name "Dist".
>
> The generic approach would start by defining a function for
> the inverse of its cumulative distribution. Call this qDist.
> Then
>
> qDist(runif(1000))
>
> would generate 1000 values from the distribution "Dist".
>
> As a ready-made example, qnorm is the inverse of pnorm,
> the cumulative distribution function of the Normal distribution.
> Then
>
> qnorm(runif(1000))
>
> would act just like rnorm(1000), though the sequence of values
> would be different (a different algorithm) -- and also rnorm()
> would be more efficient (being specially written).
>
> Depending on what your desired distribution is, you may find
> that an "rDist" has already been written for it. There are
> many distributions already in R for which the family of
> functions dDist, pDist, qDist and rDist are provided.
>
> For more specific advice, please give us information about
> the specific distribution you want to sample from!
>
> Ted.
>
I can point to one general implementation which might be helpful, and
even the function names are the same. In the version of
DistributionUtils on R-Forge you will find functions pDist and qDist
which should give the distribution function and quantile function of
any continuous unimodal distribution.

Provisos: there may be problems with distributions with very heavy
tails, and generally the routines could be slow.
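The inverse-CDF recipe described above can be sketched in a few lines. This is only an illustration, not code from the thread: the qDist defined here is a hypothetical stand-in written for the exponential distribution (chosen because its quantile function has a closed form), and it is not the qDist from DistributionUtils.

```r
# Inverse-transform sampling: push Uniform(0, 1) draws through a
# quantile function (the inverse of the cumulative distribution).
# qDist is a hypothetical stand-in, here the Exp(rate) quantile function.
qDist <- function(p, rate = 2) -log(1 - p) / rate

set.seed(1)                  # make the draws reproducible
x <- qDist(runif(10000))     # 10000 values from Exp(rate = 2)

mean(x)                      # should be close to 1 / rate = 0.5

# The same idea with a built-in quantile function:
y <- qnorm(runif(10000))     # behaves like rnorm(10000)
```

For a distribution without a closed-form quantile function, the same pattern applies as long as some numerical qDist is available, at the cost of speed.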
David Scott

-- 
_________________________________________________________________
David Scott
Department of Statistics
The University of Auckland, PB 92019
Auckland 1142, NEW ZEALAND
Phone: +64 9 923 5055, or +64 9 373 7599 ext 85055
Email: d.scott at auckland.ac.nz, Fax: +64 9 373 7018

Director of Consulting, Department of Statistics

From ssefick at gmail.com Fri Apr 1 00:38:00 2011
From: ssefick at gmail.com (stephen sefick)
Date: Thu, 31 Mar 2011 17:38:00 -0500
Subject: [R] Linear Model with curve fitting parameter?
Message-ID: 

I have a model Q=K*A*(R^r)*(S^s)

A, R, and S are data I have and K is a curve fitting parameter. I
have linearized as

log(Q)=log(K)+log(A)+r*log(R)+s*log(S)

I have taken the log of the data that I have and this is the model
formula without the K part

lm(Q~offset(A)+R+S, data=x)

What is the formula that I should use?

Thanks for all of your help. I can provide a subset of data if necessary.

-- 
Stephen Sefick
____________________________________
| Auburn University                 |
| Biological Sciences               |
| 331 Funchess Hall                 |
| Auburn, Alabama                   |
| 36849                             |
|___________________________________|
| sas0025 at auburn.edu             |
| http://www.auburn.edu/~sas0025    |
|___________________________________|

Let's not spend our time and resources thinking about things that are
so little or so large that all they really do for us is puff us up and
make us feel like gods. We are mammals, and have not exhausted the
annoying little problems of being mammals.

                                -K. Mullis

"A big computer, a complex algorithm and a long time does not equal science."

                              -Robert Gentleman

From jholtman at gmail.com Fri Apr 1 01:02:49 2011
From: jholtman at gmail.com (jim holtman)
Date: Thu, 31 Mar 2011 19:02:49 -0400
Subject: [R] Create Variable names dynamically
In-Reply-To: <5C0BB463-2983-4854-8D53-9123BCD8CBBC@smartmediacorp.com>
References: <5C0BB463-2983-4854-8D53-9123BCD8CBBC@smartmediacorp.com>
Message-ID: 

The best thing to do is to understand how to use 'list' for this
purpose. Much easier to handle the information.

On Thu, Mar 31, 2011 at 4:42 PM, Noah Silverman wrote:
> Hi,
>
> I want to create variable names from within my code, but can't find any documentation for this.
>
> An example is probably the best way to illustrate. I am reading data in from a file, doing a bunch of stuff, and want to generate variables with my output. (I could make a "list of lists" and name all the elements, but I really want separate variables.)
>
>
> #################
> #This is just a dummy example, please excuse any shortcuts...
>
> data <- read.table("file", ....)
> animals <- (data[,animal])
> animals
>> "cat", "dog", "horse"  # Not known what these are before I read the data file
>
> # do a bunch of stuff
>
> mean_cat <- abc
> var_cat <- dfd
> mean_dog <- 123
> var_dog <- 453
> etc..
> ##############
>
> I thought of trying to use the paste() function to create the variable name, but that doesn't work:
> for( animal in animals){
>        paste("mean", animal "_") <- 123
> }
>
> Any ideas???
>
> Thanks
>
>
> --
> Noah Silverman
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
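The list-based approach recommended in this reply might look roughly like the sketch below. The data frame and its weight column are invented for illustration (the poster's file is not available), and the assign() loop at the end is shown only as one possible way to get the separate variables the poster asked for.

```r
# Hypothetical stand-in for the data read from the poster's file
data <- data.frame(
  animal = c("cat", "dog", "horse", "cat", "dog", "horse"),
  weight = c(4, 20, 500, 6, 30, 600)
)

# One list entry per animal, named by the animal, instead of
# separate variables such as mean_cat, var_cat, ...
stats <- lapply(split(data$weight, data$animal), function(w) {
  list(mean = mean(w), var = var(w))
})

stats$cat$mean        # access by name
stats[["dog"]]$var    # or with the name supplied as a string

# If separate variables really are required, assign() builds the
# variable name from a string:
for (animal in names(stats)) {
  assign(paste("mean", animal, sep = "_"), stats[[animal]]$mean)
}
```

Keeping the results in one named list means later code can loop over names(stats) without knowing the animal names in advance.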
From arrayprofile at yahoo.com Fri Apr 1 01:31:56 2011
From: arrayprofile at yahoo.com (array chip)
Date: Thu, 31 Mar 2011 16:31:56 -0700 (PDT)
Subject: [R] regular expression
Message-ID: <536214.83139.qm@web125802.mail.ne1.yahoo.com>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: 

From carson.farmer at gmail.com Fri Apr 1 02:23:03 2011
From: carson.farmer at gmail.com (Carson Farmer)
Date: Fri, 1 Apr 2011 01:23:03 +0100
Subject: [R] rank of Matrix
In-Reply-To: 
References: 
Message-ID: 

Hmm, looks like I'm 'answering' my own question here...

library(Matrix)
data(KNex)
mm <- KNex$mm
str(mmQR <- qr(mm))
# new stuff:
R <- mmQR@R
Rdiag <- diag(R)
rank <- sum(Rdiag > max(dim(mm))*.Machine$double.eps*abs(R[1,1]))
# this is the matlab default I think?
# 712

For comparison, rankMatrix from package Matrix gives the same answer,
but takes considerably more time and memory!
rankMatrix(mm)
# 712

Does the above make sense to others? Alternatively, is it possible to
derive rank from the model itself (I can't see how)?

library(MatrixModels)
trial <- data.frame(counts=c(18,17,15,20,10,20,25,13,12),
                    outcome=gl(3,1,9,labels=LETTERS[1:3]),
                    treatment=gl(3,3,labels=letters[1:3]))
glmS <- glm4(counts ~ 0+outcome + treatment, family=poisson, data=trial,
             verbose = TRUE, sparse = TRUE)
str(glmS)

-- 
Carson J. Q. 
Farmer ISSP Doctoral Fellow National Centre for Geocomputation National University of Ireland, Maynooth, http://www.carsonfarmer.com/ From bernd.weiss at uni-koeln.de Fri Apr 1 02:32:25 2011 From: bernd.weiss at uni-koeln.de (Bernd Weiss) Date: Thu, 31 Mar 2011 20:32:25 -0400 Subject: [R] regular expression In-Reply-To: <536214.83139.qm@web125802.mail.ne1.yahoo.com> References: <536214.83139.qm@web125802.mail.ne1.yahoo.com> Message-ID: <4D951D19.4050902@uni-koeln.de> Am 31.03.2011 19:31, schrieb array chip: > Hi, I am stuck on this: how to specify a match pattern that means not > to include "abc"? > > I tried: > > grep("^(abc)", "hello", value=T) should return "hello". > grep("[^(abc)]", "hello", value=T) [1] "hello" HTH, Bernd From arrayprofile at yahoo.com Fri Apr 1 02:49:28 2011 From: arrayprofile at yahoo.com (array chip) Date: Thu, 31 Mar 2011 17:49:28 -0700 (PDT) Subject: [R] regular expression In-Reply-To: <4D951D19.4050902@uni-koeln.de> References: <536214.83139.qm@web125802.mail.ne1.yahoo.com> <4D951D19.4050902@uni-koeln.de> Message-ID: <791670.28123.qm@web125805.mail.ne1.yahoo.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From peter.langfelder at gmail.com Fri Apr 1 02:55:26 2011 From: peter.langfelder at gmail.com (Peter Langfelder) Date: Thu, 31 Mar 2011 17:55:26 -0700 Subject: [R] regular expression In-Reply-To: <791670.28123.qm@web125805.mail.ne1.yahoo.com> References: <536214.83139.qm@web125802.mail.ne1.yahoo.com> <4D951D19.4050902@uni-koeln.de> <791670.28123.qm@web125805.mail.ne1.yahoo.com> Message-ID: On Thu, Mar 31, 2011 at 5:49 PM, array chip wrote: > Thanks Bernd! I tried your approach with my real example, sometimes it worked, > sometimes it didn't. For example > > grep('[^(arg)]\\.symptom',"stomach.symptom",value=T) > [1] "stomach.symptom" > > grep('[^(arg)]\\.symptom',"liver.symptom",value=T) > character(0) > > I think both examples should return the text, but the 2nd example didn't. 
> > What was wrong here? Operator error :) Since you exclude 'r' before the '.', liver.symptom does not match the pattern. Peter From arrayprofile at yahoo.com Fri Apr 1 03:06:41 2011 From: arrayprofile at yahoo.com (array chip) Date: Thu, 31 Mar 2011 18:06:41 -0700 (PDT) Subject: [R] regular expression In-Reply-To: References: <536214.83139.qm@web125802.mail.ne1.yahoo.com> <4D951D19.4050902@uni-koeln.de> <791670.28123.qm@web125805.mail.ne1.yahoo.com> Message-ID: <213318.24649.qm@web125808.mail.ne1.yahoo.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From bernd.weiss at uni-koeln.de Fri Apr 1 03:19:25 2011 From: bernd.weiss at uni-koeln.de (Bernd Weiss) Date: Thu, 31 Mar 2011 21:19:25 -0400 Subject: [R] regular expression In-Reply-To: <213318.24649.qm@web125808.mail.ne1.yahoo.com> References: <536214.83139.qm@web125802.mail.ne1.yahoo.com> <4D951D19.4050902@uni-koeln.de> <791670.28123.qm@web125805.mail.ne1.yahoo.com> <213318.24649.qm@web125808.mail.ne1.yahoo.com> Message-ID: <4D95281D.7090504@uni-koeln.de> Am 31.03.2011 21:06, schrieb array chip: > Ok then this code didn't do what I wanted. I want "not including > 'arg' before '.symptom'", not individual letters of "arg", but rather > as a word. > > Bill Dunlap suggested using invert=T, it works for single 1 > condition, but not for 2 conditions here: not including "arg" before > ".", but at the same time, does include ".symptom". > > Any other suggestions would be appreciated This does work (but I am by no means an expert in regex...). I am using 'negative lookbehind'[1] to define an expression like 'arg'. > grep('(? grep('(? References: <8512_1301611104_1301611104_AANLkTikOxmcE=oMvHBuB8x61fxXJwnexXJkr+Qp3Tawp@mail.gmail.com> Message-ID: > -----Original Message----- > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of stephen sefick > Sent: March-31-11 3:38 PM > To: R help > Subject: [R] Linear Model with curve fitting parameter? 
> > I have a model Q=K*A*(R^r)*(S^s) > > A, R, and S are data I have and K is a curve fitting parameter. I > have linearized as > > log(Q)=log(K)+log(A)+r*log(R)+s*log(S) > > I have taken the log of the data that I have and this is the model > formula without the K part > > lm(Q~offset(A)+R+S, data=x) > > What is the formula that I should use? Let Z = Q - A for your logged data. Fitting lm(Z ~ R + S, data = x) should yield intercept parameter estimate = estimate for log(K) R coefficient parameter estimate = estimate for r S coefficient parameter estimate = estimate for s Steven McKinney Statistician Molecular Oncology and Breast Cancer Program British Columbia Cancer Research Centre > > Thanks for all of your help. I can provide a subset of data if necessary. > > > > -- > Stephen Sefick > ____________________________________ > | Auburn University? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?? ? ? ? | > | Biological Sciences ? ? ? ? ? ? ? ? ? ? ?? ? ? ? ?? ? ? ?| > | 331 Funchess Hall? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?| > | Auburn, Alabama? ? ? ? ? ? ? ? ? ? ? ? ? ?? ? ? ? ? ??? | > | 36849? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?? ? ? ? | > |___________________________________| > | sas0025 at auburn.edu? ? ? ? ? ? ? ? ? ? ? ? ? ?? ? ? ?| > | http://www.auburn.edu/~sas0025? ? ? ? ? ?? ? ? | > |___________________________________| > > Let's not spend our time and resources thinking about things that are > so little or so large that all they really do for us is puff us up and > make us feel like gods.? We are mammals, and have not exhausted the > annoying little problems of being mammals. > > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? -K. Mullis > > "A big computer, a complex algorithm and a long time does not equal science." > > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? 
-Robert Gentleman
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

From pjmiller_57 at yahoo.com Fri Apr 1 04:18:23 2011
From: pjmiller_57 at yahoo.com (Paul Miller)
Date: Thu, 31 Mar 2011 19:18:23 -0700 (PDT)
Subject: [R] Cox Proportional Hazards model with a time-varying covariate
Message-ID: <963854.3092.qm@web161619.mail.bf1.yahoo.com>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: 

From jfox at mcmaster.ca Fri Apr 1 05:03:20 2011
From: jfox at mcmaster.ca (John Fox)
Date: Thu, 31 Mar 2011 23:03:20 -0400
Subject: [R] Effects - plot the marginal effect
In-Reply-To: 
References: 
Message-ID: 

Dear Tomas,

Write the model as

mreg01 = lm(enep1 ~ enpres * proximity1, data=a90)

That is, it's not necessary to index a90 as a list since it's given as the data argument to lm, and doing so confuses the effect() function.
Also, enpres*proximity1 will include both the enpres:proximity1 interaction and enpres + proximity1, which are marginal to the interaction. Next, you must quote the name of the term for which you want to compute effects, thus "enpres:proximity1" in the call to effect(). Finally, effect() doesn't compute what are usually termed marginal effects. If you want more information about what it does, see the references given in ?effect. I hope this helps, John ------------------------------------------------ John Fox Sen. William McMaster Prof. of Social Statistics Department of Sociology McMaster University Hamilton, Ontario, Canada http://socserv.mcmaster.ca/jfox/ On Thu, 31 Mar 2011 22:09:32 +0200 Tomii wrote: > Hello, > > I try to plot the marginal effect by using package "effects" (example of the > graph i want to get is in the attached picture). > All variables are continuous. > > Here is regression function, results and error effect function gives: > > > mreg01 = lm(a90$enep1 ~ a90$enpres + a90$proximity1 + (a90$enpres * a90$proximity1), data=a90)> summary(mreg01) > Call: > lm(formula = a90$enep1 ~ a90$enpres + a90$proximity1 + (a90$enpres * > a90$proximity1), data = a90) > > Residuals: > Min 1Q Median 3Q Max > -2.3173 -1.3349 -0.5713 0.8938 8.1084 > > Coefficients: > Estimate Std. Error t value Pr(>|t|) > (Intercept) 4.2273 0.3090 13.683 < 2e-16 *** > a90$enpres 0.4225 0.2319 1.822 0.072250 . > a90$proximity1 -3.8797 1.0984 -3.532 0.000696 *** > a90$enpres:a90$proximity1 0.8953 0.4101 2.183 0.032025 * > --- > Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 
1 > > Residual standard error: 2.029 on 78 degrees of freedom > Multiple R-squared: 0.2128, Adjusted R-squared: 0.1826 > F-statistic: 7.031 on 3 and 78 DF, p-value: 0.0003029 > > plot(effect(a90$enpres:a90$proximity1, mreg01))Warning messages:1: In a90$enpres:a90$proximity1 : > numerical expression has 82 elements: only the first used2: In > a90$enpres:a90$proximity1 : > numerical expression has 82 elements: only the first used3: In > analyze.model(term, mod, xlevels, default.levels) : > 0 does not appear in the modelError in > plot(effect(a90$enpres:a90$proximity1, mreg01)) : > error in evaluating the argument 'x' in selecting a method for function 'plot' > > > > > Thanks in advance. > Tomas From m_hofert at web.de Fri Apr 1 07:33:23 2011 From: m_hofert at web.de (Marius Hofert) Date: Fri, 1 Apr 2011 07:33:23 +0200 Subject: [R] lattice: wireframe "eats up" points; how to make points on wireframe visible? In-Reply-To: References: <1D3CE86E-61F3-4D79-B3CB-BCDB79FFEFC1@web.de> Message-ID: okay, I found a solution: library(lattice) f <- function(x) 1/((1-x[1])*(1-x[2])+1) u <- seq(0, 1, length.out=20) grid <- expand.grid(x=u, y=u) x <- grid[,1] y <- grid[,2] z <- apply(grid, 1, f) pt.x <- c(0.4, 0.7) pt.y <- c(0.6, 0.8) eps <- 0.4 pts <- rbind(c(pt.x, f(pt.x)-eps), c(pt.y, f(pt.y))) # points to add to the wireframe trellis.device("pdf", onefile=FALSE, paper="special", width=5.4, height=5.4) wireframe(z~x*y, pts=pts, aspect=1, scales=list(col=1, arrows=FALSE), zlim=c(0,1), panel.3d.wireframe = function(x,y,z,xlim,ylim,zlim,xlim.scaled, ylim.scaled,zlim.scaled,pts,drape=drape,...){ panel.3dwire(x=x, y=y, z=z, xlim=xlim, ylim=ylim, zlim=zlim, xlim.scaled=xlim.scaled, ylim.scaled=ylim.scaled, zlim.scaled=zlim.scaled,drape=TRUE,...) panel.3dscatter(x=pts[,1], y=pts[,2], z=pts[,3], xlim=xlim, ylim=ylim, zlim=zlim, xlim.scaled=xlim.scaled, ylim.scaled=ylim.scaled, zlim.scaled=zlim.scaled, type="p", col=c(2,3), cex=1.8, pch=c(3,4), .scale=TRUE, ...) 
}) dev.off() On 2011-03-30, at 23:56 , Marius Hofert wrote: > Dear Deepayan, > > thanks for answering. It's never too late to be useful. > > I see your point in the minimal example. I checked the z-axis limits in my > original problem for the point to be inside and it wasn't there. I can't easily > reproduce it from the minimal example though. I'll get back to you if I run into > this problem again. > > In the example below, both points are shown. Although one lies clearly below/under > the surface, it looks as if it lies above. One would probably have to plot this > point first so that the wire frame is above the point. But still, this is > misleading since the eye believes that the wireframe is *not* transparent. This > happens because the lines connecting (0,1,0)--(1,1,0)--(1,0,0) [dashed ones] are > not completely visible [also not the one from (1,1,0) to (1,1,1)]. How can I make > them visible even if they lie behind/under the wireframe? I tried to work with > col="transparent" and with alpha=... but neither did work as I expected. > My goal is to make the small "rectangles" between the wire transparent. > I also use these plots in posters with a certain gradient-like background color > and so it's a bit annoying that the "rectangles" are filled with white color. 
> > Cheers and many thanks for helping [as usual], > > Marius > > library(lattice) > > f <- function(x) 1/((1-x[1])*(1-x[2])+1) > > u <- seq(0, 1, length.out=20) > grid <- expand.grid(x=u, y=u) > x <- grid[,1] > y <- grid[,2] > z <- apply(grid, 1, f) > > pt.x <- c(0.4, 0.7) > pt.y <- c(0.6, 0.8) > eps <- 0.4 > pts <- rbind(c(pt.x, f(pt.x)-eps), c(pt.y, f(pt.y))) # points to add to the wireframe > > trellis.device("pdf", onefile=FALSE, paper="special", width=5.4, height=5.4) > wireframe(z~x*y, pts=pts, aspect=1, scales=list(col=1, arrows=FALSE), > zlim=c(0,1), > panel.3d.wireframe = function(x,y,z,xlim,ylim,zlim,xlim.scaled, > ylim.scaled,zlim.scaled,pts,...){ > panel.3dwire(x=x, y=y, z=z, xlim=xlim, ylim=ylim, zlim=zlim, > xlim.scaled=xlim.scaled, ylim.scaled=ylim.scaled, > zlim.scaled=zlim.scaled,...) > panel.3dscatter(x=pts[,1], y=pts[,2], z=pts[,3], > xlim=xlim, ylim=ylim, zlim=zlim, > xlim.scaled=xlim.scaled, ylim.scaled=ylim.scaled, > zlim.scaled=zlim.scaled, type="p", col=c(2,3), > cex=1.8, pch=c(3,4), .scale=TRUE, ...) > }) > dev.off() > > On 2011-03-30, at 10:52 , Deepayan Sarkar wrote: > >> On Fri, Mar 4, 2011 at 1:47 PM, Marius Hofert wrote: >>> Dear expeRts, >>> >>> I would like to add two points to a wireframe plot. The points have (x,y,z) coordinates >>> where z is determined to be on the wireframe [same z-value]. Now something strange >>> happens. One point is perfectly plotted, the other isn't shown at all. It only >>> appears if I move it upwards in z-direction by adding a positive number. So somehow >>> it disappears in the wireframe-surface *although* the plot symbol [the cross] has >>> a positive length in each dimension [I also chose cex=5 to make it large enough so >>> that it should (theoretically) be visible]. >>> >>> My wireframe plot is a complicated function which I cannot post here. Below is a minimal >>> example, however, it didn't show the same problem [the surface is too nice I guess]. 
>>> I therefore *artifically* create the problem in the example below so that you know >>> what I mean. For one of the points, I subtract an epsilon [=0.25] in z-direction and >>> suddenly the point completely disappears. The strange thing is that the point is >>> not even "under" the surface [use the screen-argument to rotate the wireframe plot to check this], >>> it's simply gone, eaten up by the surface. >>> >>> How can I make the two points visible? >>> I also tried to use the alpha-argument to make the wireframe transparent, but I couldn't >>> solve the problem. >>> >>> Cheers, >>> >>> Marius >>> >>> PS: One also faces this problem for example if one wants to make points visible that are on "opposite sides" of the wireframe. >>> >>> library(lattice) >>> >>> f <- function(x) 1/((1-x[1])*(1-x[2])+1) >>> >>> u <- seq(0, 1, length.out=20) >>> grid <- expand.grid(x=u, y=u) >>> x <- grid[,1] >>> y <- grid[,2] >>> z <- apply(grid, 1, f) >>> >>> pt.x <- c(0.2, 0.5) >>> pt.y <- c(0.6, 0.8) >>> eps <- 0.25 >>> pts <- rbind(c(pt.x, f(pt.x)-eps), c(pt.y, f(pt.y))) # points to add to the wireframe >> >> The reason in this case is fairly obvious: you have >> >>> pts >> [,1] [,2] [,3] >> [1,] 0.2 0.5 0.4642857 >> [2,] 0.6 0.8 0.9259259 >> >> So the z-value for Point 1 is 0.4642857, which is less than 0.5, the >> minimum of the z-axis.panel.3dscatter() "clips" any points outside the >> range of the bounding box, so this point is not plotted. >> >> I can't say what the problem was in your original example without >> looking at it, but I would guess it's caused by something similar. >> >> Hope this is not too late to be useful. 
>> >> -Deepayan >> >> >> >>> >>> wireframe(z~x*y, pts=pts, aspect=1, scales=list(col=1, arrows=FALSE), >>> panel.3d.wireframe = function(x,y,z,xlim,ylim,zlim,xlim.scaled, >>> ylim.scaled,zlim.scaled,pts,...){ >>> panel.3dwire(x=x, y=y, z=z, xlim=xlim, ylim=ylim, zlim=zlim, >>> xlim.scaled=xlim.scaled, ylim.scaled=ylim.scaled, >>> zlim.scaled=zlim.scaled, ...) >>> panel.3dscatter(x=pts[,1], y=pts[,2], z=pts[,3], >>> xlim=xlim, ylim=ylim, zlim=zlim, >>> xlim.scaled=xlim.scaled, ylim.scaled=ylim.scaled, >>> zlim.scaled=zlim.scaled, type="p", col=c(2,3), >>> cex=1.8, .scale=TRUE, ...) >>> }, key=list(x=0.5, y=0.95, points=list(col=c(2,3)), >>> text=list(c("Point 1", "Point 2")), >>> cex=1, align=TRUE, transparent=TRUE)) >>> >>> ______________________________________________ >>> R-help at r-project.org mailing list >>> https://stat.ethz.ch/mailman/listinfo/r-help >>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >>> and provide commented, minimal, self-contained, reproducible code. >>> > From Yusuke.Fukuda at nt.gov.au Fri Apr 1 07:22:30 2011 From: Yusuke.Fukuda at nt.gov.au (Yusuke Fukuda) Date: Fri, 1 Apr 2011 14:52:30 +0930 Subject: [R] ANCOVA for linear regressions without intercept In-Reply-To: References: Message-ID: Thanks Bert. I have read "?formula" again and again, and I'm still struggling; >lm(body_length ~ head_length-1) This removes intercept from each individual regression (for male, female, unknown). When they are taken together, >lm(body_length ~ sex*head_length) This shows differences in slopes and intercepts between the regressions (but I want to compare the slopes of the regressions WITHOUT intercepts). If I put > lm(body_length ~ sex:head_length-1) This shows slopes for each sex without intercepts, but NOT differences in the slope between the regressions. I also tried > lm(body_length ~ sex*head_length-1) > lm(body_length ~ sex*head_length-sex-1) But none of them worked. Would anyone be able to help me? 
All I want to do is to compare the slopes of three linear regressions that go through the origin (0,0) so that I can say if their difference is significant or not.

Thanks for your help.

________________________________________
From: Bert Gunter [mailto:gunter.berton at gene.com]
Sent: Friday, 1 April 2011 12:56 AM
To: Yusuke Fukuda
Cc: r-help at r-project.org
Subject: Re: [R] ANCOVA for linear regressions without intercept

If you haven't already received an answer, a careful reading of

?formula

will provide it.

-- Bert

On Wed, Mar 30, 2011 at 11:42 PM, Yusuke Fukuda wrote:

Hello R experts

I have three linear regressions for sexes (Male, Female, Unknown). All have a good correlation between body length (response variable) and head length (explanatory variable). I know it is not recommended, but for a good practical reason (the purpose of study is to find a single conversion factor from head length to body length), the regressions need to go through the origin (0 intercept).

Is it possible to do ANCOVA for these regressions without intercepts? When I do

summary(lm(body length ~ sex*head length))

this will include the intercepts as below

Coefficients:
                       Estimate Std. Error t value Pr(>|t|)
(Intercept)            -6.49697    1.68497  -3.856 0.000118 ***
sexMale                -9.39340    1.97760  -4.750 2.14e-06 ***
sexUnknown             -1.33791    2.35453  -0.568 0.569927
head_length             7.12307    0.05503 129.443  < 2e-16 ***
sexMale:head_length     0.31631    0.06246   5.064 4.37e-07 ***
sexUnknown:head_length  0.19937    0.07022   2.839 0.004556 **
---

Is there any way I can remove the intercepts so that I can simply compare the slopes with no intercept taken into account?

Thanks for help in advance.
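One possible way to set up the comparison asked about here is sketched below. It is not taken from the thread: the data are simulated (the real measurements are not available), and the formula is just one parameterisation that keeps every regression through the origin while still reporting slope differences between the sexes.

```r
# Simulated stand-in data; the slopes and noise level are invented
set.seed(42)
n   <- 60
sex <- factor(rep(c("Female", "Male", "Unknown"), each = n))
head_length <- runif(3 * n, 10, 40)
slope <- c(7.0, 7.4, 7.0)[as.integer(sex)]   # Female, Male, Unknown
body_length <- slope * head_length + rnorm(3 * n, sd = 3)

# A single common slope through the origin
fit0 <- lm(body_length ~ head_length - 1)

# A separate slope per sex, still with no intercept.
# Coefficients: the slope for the reference level (Female), then the
# Male-minus-Female and Unknown-minus-Female slope differences.
fit1 <- lm(body_length ~ head_length + head_length:sex - 1)

summary(fit1)      # t-tests on the individual slope differences
anova(fit0, fit1)  # overall F-test: do the slopes differ at all?
```

Because head_length appears as a main effect, R codes sex by contrasts inside the interaction, so the interaction coefficients are differences from the reference slope rather than three separate slopes (which is what sex:head_length - 1 alone produces).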
Yusuke Fukuda ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- "Men by nature long to get on to the ultimate truths, and will often be impatient with elementary studies or fight shy of them. If it were possible to reach the ultimate truths without the elementary studies usually prefixed to them, these would not be preparatory studies but superfluous diversions." ? -- Maimonides (1135-1204) ? Bert Gunter Genentech Nonclinical Biostatistics From nuncio.m at gmail.com Fri Apr 1 08:22:57 2011 From: nuncio.m at gmail.com (nuncio m) Date: Fri, 1 Apr 2011 11:52:57 +0530 Subject: [R] principal components Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From rroa at azti.es Fri Apr 1 08:28:54 2011 From: rroa at azti.es (=?iso-8859-1?Q?Rub=E9n_Roa?=) Date: Fri, 1 Apr 2011 08:28:54 +0200 Subject: [R] Simple lattice question In-Reply-To: <4D94A701.9040201@ucalgary.ca> References: <5CD78996B8F8844D963C875D3159B94A021BCB2B@dsrcorreo> <4D948489.8000000@ucalgary.ca> <5CD78996B8F8844D963C875D3159B94A021BCD81@dsrcorreo> <4D94A701.9040201@ucalgary.ca> Message-ID: <5CD78996B8F8844D963C875D3159B94A021BCF2B@dsrcorreo> > -----Mensaje original----- > De: Peter Ehlers [mailto:ehlers at ucalgary.ca] > Enviado el: jueves, 31 de marzo de 2011 18:09 > Para: Rub?n Roa > CC: r-help at r-project.org > Asunto: Re: [R] Simple lattice question > > On 2011-03-31 06:58, Rub?n Roa wrote: > > Thanks Peters! 
> > > > Just a few minor glitches now: > > > > require(lattice) > > data<- > data.frame(SP=sort(rep(as.factor(c('A','B','C','D','E')),12)), > > x=rpois(60,10), > > y=rep(c(rep(0,4),rep(10,4),rep(20,4)),5), > > z=rep(1:4,15)) > > > xyplot(x~y|SP,data=data,groups=z,layout=c(2,3),pch=1:4,lty=1:4 ,col='black',type='b', > > key=list(x = .65, y = .75, corner = c(0, 0), points=TRUE, > > lines=TRUE, pch=1:4, lty=1:4, type='b', > > text=list(lab = as.character(unique(data$z))))) > > > > I have an extra column of symbols on the legend, > > > > and, > > > > would like to add a title to the legend. Such as 'main' for plots. > > > > Any suggestions? > > for key(), make 'lines' into a list: > > xyplot(x~y|SP,data=data,groups=z,layout=c(2,3), > pch=1:4,lty=1:4,col='black',type='b', > key=list(x = .65, y = .75, corner = c(0, 0), > title="title here", cex.title=.9, lines.title=3, > lines=list(pch=1:4, lty=1:4, type='b'), > text=list(lab = as.character(unique(data$z))))) > > Peter Ehlers ... that works. Thanks Peter (sorry I misspelled your name b4). The clean code is now: require(lattice) data <- data.frame(SP=sort(rep(as.factor(c('A','B','C','D','E')),12)), x=rpois(60,10), y=rep(c(rep(0,4),rep(10,4),rep(20,4)),5), z=rep(1:4,15)) xyplot(x~y|SP,data=data,groups=z,layout=c(2,3),pch=1:4,lty=1:4,col='black',type='b', key=list(x = .65, y = .75, corner = c(0, 0), lines=list(pch=1:4, lty=1:4, type='b'), title=expression(CO^2), text=list(lab = as.character(unique(data$z))))) David's code works too (thanks to you too!) and is somewhat shorter xyplot(x~y|SP, data=data,groups=z, layout=c(2,3), par.settings=simpleTheme(pch=1:4,lty=1:4,col='black'), type="b", auto.key = list(x = .6, y = .7, lines=TRUE, corner = c(0, 0))) but the lines and symbols are on different columns, and the line types look as if they were in bold. Rub?n ____________________________________________________________________________________ Dr. 
Rub?n Roa-Ureta AZTI - Tecnalia / Marine Research Unit Txatxarramendi Ugartea z/g 48395 Sukarrieta (Bizkaia) SPAIN From andreas.borg at unimedizin-mainz.de Fri Apr 1 09:25:47 2011 From: andreas.borg at unimedizin-mainz.de (Andreas Borg) Date: Fri, 01 Apr 2011 09:25:47 +0200 Subject: [R] Assign Names of columns in data.frame dinamically In-Reply-To: References: Message-ID: <4D957DFB.7040904@unimedizin-mainz.de> Hi Marcos, I'd be glad to help, but your example is not usable for anyone but yourself - I don't have the files that your code reads and I don't have any idea what the resulting data frames look like. Please provide examples of the data and the result. Andreas Marcos Amaris Gonzalez schrieb: > Hello List. > > I have many files of ECG, each one with 7 column and I need only the > second column and the name of each ECG. I am doing this but althought > I have various days trying this i haven't gotten, so I ask help with > this, if somebody cans help me I'll be so thankfully. > > -------------------------------------------------------------------------------------------------------- > rm(list=ls()) > > # loadEcgFiles <- function(Dir=".") { > Dir <- "."; > > txtfiles <- list.files(paste(Dir),'.txt$'); > ecg = data.frame(ncol=1); > len = length(txtfiles); > > for (i in 1:5 ) { > # i <- 1; > filename = paste(projectDir, "/" , > txtfiles[i],sep=''); > sample = read.table(filename, nrow=75000); > sampleName = filename; sampleName=gsub("./", > "",sampleName); sampleName=gsub(".txt", "", sampleName); > temp <- sample$V2; > names(temp) <- sampleName; > ecg = cbind(ecg, temp); > # } > > Thanks and sorry for my english. > > --- > Marcos Amaris Gonz?lez - Linux Counter #462840 > ------------------------------------------------------------------------------ > No al SPAM! > "No es m?s rico el que m?s tiene, sino el que menos necesita." 
> ------------------------------------------------------------------------------
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--
Andreas Borg
Medizinische Informatik
UNIVERSITÄTSMEDIZIN der Johannes Gutenberg-Universität
Institut für Medizinische Biometrie, Epidemiologie und Informatik
Obere Zahlbacher Straße 69, 55131 Mainz
www.imbei.uni-mainz.de
Phone +49 (0) 6131 175062
E-Mail: borg at imbei.uni-mainz.de

This e-mail contains confidential and/or legally protected information. If you are not the intended recipient or have received this e-mail in error, please notify the sender immediately and delete this e-mail. Unauthorized copying and unauthorized forwarding of this e-mail and of the information it contains are not permitted.

From Bernhard_Pfaff at fra.invesco.com Fri Apr 1 10:21:52 2011
From: Bernhard_Pfaff at fra.invesco.com (Pfaff, Bernhard Dr.)
Date: Fri, 1 Apr 2011 09:21:52 +0100
Subject: [R] TV VECM (formerly: VECM with UNRESTRICTED TREND)
In-Reply-To:
References:
Message-ID:

Dear Renoir,

are you referring to: http://econ.la.psu.edu/~hbierens/TVCOINT.PDF ?

If so, no, but you could set up this framework fairly easily and thereby employ the functions of urca. But this should already be evident from the package's manual.

Best,
Bernhard

> -----Original Message-----
> From: renoir vieira [mailto:renoirvieira at gmail.com]
> Sent: Thursday, 31 March 2011 22:27
> To: Grzegorz Konat
> Cc: Pfaff, Bernhard Dr.; r-help at r-project.org
> Subject: Re: [R] VECM with UNRESTRICTED TREND
>
> Dear Pfaff,
>
> Would that be possible to fit a Time varying VECM using urca?
> > Yours, > Renoir > > On Thursday, March 31, 2011, Grzegorz Konat > wrote: > > The code you gave me works fine with Finland, but the same > for my data > > - does not! > > I do: > > > > library(urca) > > data(my.data) > > dat1 <- my.data[, c("dY", "X", "dM")] > > trend <- matrix(1:nrow(dat1), ncol = 1) > > colnames(trend) <- "trd" > > yxm.vecm <- ca.jo(dat1, type = "trace", ecdet = "const", K > = 2, spec = > > "longrun", dumvar = trend) > > > > and the result is again: > > > > Error in r[i1, , drop = FALSE] - r[-nrow(r):-(nrow(r) - lag > + 1L), , > > drop = FALSE] : > > ?non-numeric argument to binary operator > > > > I attach my dataset in xls format. If you have 5 minutes > and wish to > > check it out, I'd be extremely grateful! > > > > Best, > > Greg > > > > > > > > 2011/3/31 Pfaff, Bernhard Dr. > > > >> ?Well, without further information, I do not know, but try the > >> following > >> > >> library(urca) > >> example(ca.jo) > >> trend <- matrix(1:nrow(sjf), ncol = 1) > >> colnames(trend) <- "trd" > >> ca.jo(sjf, type = "trace", ecdet = "const", K = 2, spec = > "longrun", > >> dumvar = trend) > >> > >> Best, > >> Bernhard > >> > >> > >> > >> ?------------------------------ > >> *Von:* Grzegorz Konat [mailto:grzegorz.konat at ibrkk.pl] > >> *Gesendet:* Donnerstag, 31. M?rz 2011 14:40 > >> > >> *An:* Pfaff, Bernhard Dr.; r-help at r-project.org > >> *Betreff:* Re: [R] VECM with UNRESTRICTED TREND > >> > >> 'time' was a trend variable from my.data set. Equivalent to the > >> output of the command 'matrix' you just gave me. 
> >> > >> So now I did: > >> > >> ?library(urca) > >> data(my.data) > >> names(my.data) > >> attach(my.data) > >> dat1 <- my.data[, c("dY", "X", "dM")] > >> mat1 <- matrix(seq(1:nrow(dat1)), ncol = 1) > >> args('ca.jo') > >> yxm.vecm <- ca.jo(dat1, type = "trace", ecdet = "const", K > = 2, spec > >> = "longrun", dumvar=mat1) > >> > >> and the output is: > >> > >> ?Error in r[i1, , drop = FALSE] - r[-nrow(r):-(nrow(r) - > lag + 1L), , > >> drop = FALSE] : > >> ? non-numeric argument to binary operator In addition: Warning > >> message: > >> In ca.jo(dat1, type = "trace", ecdet = "const", K = 2, spec = > >> "longrun", > >> ?: > >> No column names in 'dumvar', using prefix 'exo' instead. > >> > >> What do I do wrong? > >> > >> Best, > >> Greg > >> > >> > >> 2011/3/31 Pfaff, Bernhard Dr. > >> > >>> > >>> > >>> > >>> ?Hello Bernhard, > >>> > >>> thank You so much one again! Now I (more or less) understand the > >>> idea, but still have problem with its practical application. > >>> > >>> I do (somewhat following example 8.1 in your textbook): > >>> > >>> ?library(urca) > >>> data(my.data) > >>> names(my.data) > >>> attach(my.data) > >>> dat1 <- my.data[, c("dY", "X", "dM")] > >>> dat2 <- cbind(time) > >>> > >>> What is 'time'? Just employ matrix(seq(1:nrow(dat1)), > ncol = 1) for > >>> creating the trend variable. > >>> > >>> Best, > >>> Bernhard > >>> > >>> > >>> ?args('ca.jo') > >>> yxm.vecm <- ca.jo(dat1, type = "trace", ecdet = "trend", > K = 2, spec > >>> = "longrun", dumvar=dat2) > >>> > >>> The above code produces following output: > >>> > >>> ?Error in r[i1, , drop = FALSE] - r[-nrow(r):-(nrow(r) - > lag + 1L), > >>> , drop = FALSE] : > >>> ? non-numeric argument to binary operator > >>> > >>> What does that mean? Should I use cbind command to dat1 > as well? And > >>> doesn't it transform the series into series of integer numbers? > >>> > >>> Thank you once again (especially for your patience). 
> >>>
> >>> Best,
> >>> Greg
> >>>
> >>> 2011/3/31 Pfaff, Bernhard Dr.
> >>>
> >>>> Hello Greg,
> >>>>
> >>>> you include your trend as a (Nx1) matrix and use this for 'dumvar'.
> >>>> The matrix 'dumvar' is just added to the VECM as deterministic
> >>>> regressors and while you are referring to case 5, this is basically
> >>>> what you are after, if I am not mistaken. But be aware that this
> >>>> implies a quadratic trend for the levels
> *****************************************************************
Confidentiality Note: The information contained in this ...{{dropped:10}}

From chrismcowen at gmail.com Fri Apr 1 10:24:02 2011
From: chrismcowen at gmail.com (Chris Mcowen)
Date: Fri, 1 Apr 2011 09:24:02 +0100
Subject: [R] Using a variable in a covariance structure and as a predictor - is this OK?
Message-ID:

Dear list,

I have the model below, which I am using to account for spatial autocorrelation:

exponential <-corExp(form = ~ Longitude + Latitude)
explanation_mod_all <- gls(Lower_PD~Area+Elevation+Temperature+Preceipitation+Agriculture+Urban+Human.footprint+Population, correlation = exponential)

I want to see if Latitude and Longitude are significant. Is it statistically and methodologically correct to add them to the model, i.e.

explanation_mod_all <- gls(Lower_PD~Area+Elevation+Temperature+Preceipitation+Agriculture+Urban+Human.footprint+Population+Longitude+Latitude, correlation = exponential)

?

Thanks
Chris

From lebatsnok at gmail.com Fri Apr 1 11:24:34 2011
From: lebatsnok at gmail.com (Kenn Konstabel)
Date: Fri, 1 Apr 2011 12:24:34 +0300
Subject: [R] How to update R?
In-Reply-To: <392248.50969.qm@web30806.mail.mud.yahoo.com>
References: <8360A74801605D4487C1D39EF1FDE13502B5A5B5@EXCHANGEVS-03.ad.wsu.edu> <392248.50969.qm@web30806.mail.mud.yahoo.com>
Message-ID:

On Thu, Mar 31, 2011 at 9:04 PM, Shi, Tao wrote:
> This question has been asked by many people already.
> The easiest way is:
>
> 1) install the new version
> 2) copy all or the libraries that you installed later from the "library" folder
> of older version to the new version
> 3) uninstall the old version
> 4) do a library update in the new version

On Windows, I've found that it is actually better to uninstall the old version first. Uninstalling removes file associations for .Rdata and other files even if a more recent version of R is present (although perhaps there is a way to tell it not to), but this won't be a problem if you do it in the following order:

> 3) uninstall the old version
> 1) install the new version
> 2) copy all or the libraries that you installed later from the "library" folder
> of older version to the new version
> 4) do a library update in the new version

Kenn

From henriMone at gmail.com Fri Apr 1 12:46:01 2011
From: henriMone at gmail.com (Henri Mone)
Date: Fri, 1 Apr 2011 12:46:01 +0200
Subject: [R] MySql Versus R
Message-ID:

Dear R Users,

For my data crunching I use a combination of MySQL and GNU R. I have to handle huge/medium-sized data which is stored in a MySQL database; R executes a SQL command to fetch the data and does the plotting with the built-in R plotting functions.

The (low-level) calculations like summing, dividing, grouping, sorting etc. can be done either with the SQL command on the MySQL side or on the R side. My question is: which is faster for these low-level calculations / data rearrangements, MySQL or R? Is there a general rule of thumb for what to shift to the MySQL side and what to the R side?

Thanks
Henri

From kamauallan at gmail.com Fri Apr 1 13:08:30 2011
From: kamauallan at gmail.com (Allan Kamau)
Date: Fri, 1 Apr 2011 14:08:30 +0300
Subject: [R] MySql Versus R
In-Reply-To:
References:
Message-ID:

On Fri, Apr 1, 2011 at 1:46 PM, Henri Mone wrote:
> Dear R Users,
>
> I use for my data crunching a combination of MySQL and GNU R.
I have > to handle huge/ middle seized data which is stored in a MySql > database, R executes a SQL command to fetch the data and does the > plotting with the build in R plotting functions. > > The (low level) calculations like summing, dividing, grouping, sorting > etc. can be done either with the sql command on the MySQL side or on > the R side. > My question is what is faster for this low level calculations / data > rearrangement MySQL or R? Is there a general rule of thumb what to > shift to the MySql side and what to the R side? > > Thanks > Henri > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > I would assume RDBMS have advanced memory management capabilities and are designed for the manipulation and handling of large amounts of data. These are primary features for most database management server software. This way the database management server software should (in most cases) be used to store, manipulate then return only the processed and qualifying records to the client or other application for further specialized processing and/or data visualization. Allan. From e.hofstadler at gmail.com Fri Apr 1 13:08:51 2011 From: e.hofstadler at gmail.com (E Hofstadler) Date: Fri, 1 Apr 2011 14:08:51 +0300 Subject: [R] programming: telling a function where to look for the entered variables Message-ID: Hi there, Could someone help me with the following programming problem..? I have written a function that works for my intended purpose, but it is quite closely tied to a particular dataframe and the names of the variables in this dataframe. However, I'd like to use the same function for different dataframes and variables. My problem is that I'm not quite sure how to tell my function in which dataframe the entered variables are located. 
Here's some reproducible data and the function:

# create reproducible data
set.seed(124)
xvar <- sample(0:3, 1000, replace = T)
yvar <- sample(0:1, 1000, replace=T)
zvar <- rnorm(100)
lvar <- sample(0:1, 1000, replace=T)
Fulldf <- as.data.frame(cbind(xvar,yvar,zvar,lvar))
Fulldf$xvar <- factor(xvar, labels=c("blue","green","red","yellow"))
Fulldf$yvar <- factor(yvar, labels=c("area1","area2"))
Fulldf$lvar <- factor(lvar, labels=c("yes","no"))

and here's the function in the form that it currently works: from a subset of the dataframe Fulldf, a contingency table is created (in my actual data, several other operations are then performed on that contingency table, but these are not relevant for the problem in question, therefore I've deleted it).

# function as it currently works: tailored to a particular dataframe (Fulldf)

myfunct <- function(subgroup){
    # enter a particular subgroup for which the contingency table should be
    # calculated (i.e. a particular value of the factor lvar)

    # restrict dataframe to given subgroup and two columns of the original
    # dataframe
    Data.tmp <- subset(Fulldf, lvar==subgroup, select=c("xvar","yvar"))
    Data.tmp <- na.omit(Data.tmp) # exclude missing values
    indextable <- table(Data.tmp$xvar, Data.tmp$yvar) # make contingency table
    return(indextable)
}

Since I need to use the function with different dataframes and variable names, I'd like to be able to tell my function the name of the dataframe and variables it should use for calculating the index. This is how I tried to modify the first part of the function, but it didn't work:

# function as I would like it to work: independent of any particular
# dataframe or variable names (doesn't work)

myfunct.better <- function(subgroup, lvarname, yvarname, dataframe){
    # enter the subgroup, the variable names to be used and the dataframe
    # in which they are found

    # trying to subset the given dataframe for the given subgroup of the
    # given variable. (The variable "xvar" happens to have the same name in
    # all dataframes, but the variable yvarname has different names in the
    # different dataframes.)
    Data.tmp <- subset(dataframe, lvarname==subgroup, select=c("xvar",
        deparse(substitute(yvarname))))
    Data.tmp <- na.omit(Data.tmp)
    # create the contingency table on the basis of the entered variables
    indextable <- table(Data.tmp$xvar, Data.tmp$yvarname)
    return(indextable)
}

calling

myfunct.better("yes", lvarname=lvar, yvarname=yvar, dataframe=Fulldf)

results in the following error:

Error in `[.data.frame`(x, r, vars, drop = drop) :
  undefined columns selected

My feeling is that R doesn't know where to look for the entered variables (lvar, yvar), but I'm not sure how to solve this problem. I tried using with() and even attach() within the function, but that didn't work.

Any help is greatly appreciated.

Best,
Esther

P.S.:
Are there books that elaborate programming in R for beginners -- and I mean things like how to best use vectorization instead of loops and general "best practice" tips for programming. Most of the books I've been looking at focus on applying R for particular statistical analyses, and only comparably briefly deal with more general programming aspects. I was wondering if there's any books or tutorials out there that cover the latter aspects in a more elaborate and systematic way...?
From ripley at stats.ox.ac.uk Fri Apr 1 13:15:09 2011 From: ripley at stats.ox.ac.uk (Prof Brian Ripley) Date: Fri, 1 Apr 2011 12:15:09 +0100 (BST) Subject: [R] MySql Versus R In-Reply-To: References: Message-ID: On Fri, 1 Apr 2011, Henri Mone wrote: > Dear R Users, > > I use for my data crunching a combination of MySQL and GNU R. I have > to handle huge/ middle seized data which is stored in a MySql > database, R executes a SQL command to fetch the data and does the > plotting with the build in R plotting functions. > > The (low level) calculations like summing, dividing, grouping, sorting > etc. can be done either with the sql command on the MySQL side or on > the R side. > My question is what is faster for this low level calculations / data > rearrangement MySQL or R? Is there a general rule of thumb what to > shift to the MySql side and what to the R side? The data transfer costs almost always dominate here: since such low-level computations would almost always be a trivial part of the total costs, you should do things which can reduce the size (e.g. summarizations) in the DBMS. I do wonder what you think the R-sig-db list is for if not questions such as this one. Please subscribe and use it next time. > Thanks > Henri -- Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 From ggrothendieck at gmail.com Fri Apr 1 13:32:43 2011 From: ggrothendieck at gmail.com (Gabor Grothendieck) Date: Fri, 1 Apr 2011 07:32:43 -0400 Subject: [R] MySql Versus R In-Reply-To: References: Message-ID: On Fri, Apr 1, 2011 at 6:46 AM, Henri Mone wrote: > Dear R Users, > > I use for my data crunching a combination of MySQL and GNU R. 
I have > to handle huge/ middle seized data which is stored in a MySql > database, R executes a SQL command to fetch the data and does the > plotting with the build in R plotting functions. > > The (low level) calculations like summing, dividing, grouping, sorting > etc. can be done either with the sql command on the MySQL side or on > the R side. > My question is what is faster for this low level calculations / data > rearrangement MySQL or R? Is there a general rule of thumb what to > shift to the MySql side and what to the R side? The sqldf package makes it easy to use sqlite, h2 or postgresql from R to carry out data manipulation tasks and this has facilitated some benchmarks by users using sqlite's capability of using an "in memory" database. In the cases cited on the sqldf home page sqlite was faster than R despite the overhead of moving the data into the database and out again. See http://sqldf.googlecode.com In general the answer would depend on the database, what has been cached, the particular query, size of data and how well you had optimized your sql and R queries. There are entire books on optimizing MySQL so this is an extensive subject. Various comparisons of different approaches can easily result in different ordering from fastest to slowest based on what would appear to be relatively minor aspects of the problem so you would have to benchmark the important queries to really get an answer that pertains to your situation. Check out the rbenchmark package for this which makes it relatively simple to do. -- Statistics & Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com From nick.sabbe at ugent.be Fri Apr 1 13:34:17 2011 From: nick.sabbe at ugent.be (Nick Sabbe) Date: Fri, 1 Apr 2011 13:34:17 +0200 Subject: [R] programming: telling a function where to look for the entered variables In-Reply-To: References: Message-ID: <04ac01cbf060$bc4e5d10$34eb1730$@sabbe@ugent.be> See the warning in ?subset. 
Passing the column name of lvar is not the same as passing the 'contextual column' (as I coin it in these circumstances). You can solve it by indeed using [] instead. For my own comfort, here is the relevant line from your original function: Data.tmp <- subset(Fulldf, lvar==subgroup, select=c("xvar","yvar")) Which should become something like (untested but should be close): Data.tmp <- Fulldf[Fulldf[,"lvar"]==subgroup, c("xvar","yvar")] This should be a lot easier to translate based on column names, as the column names are now used as such. HTH, Nick Sabbe -- ping: nick.sabbe at ugent.be link: http://biomath.ugent.be wink: A1.056, Coupure Links 653, 9000 Gent ring: 09/264.59.36 -- Do Not Disapprove -----Original Message----- From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of E Hofstadler Sent: vrijdag 1 april 2011 13:09 To: r-help at r-project.org Subject: [R] programming: telling a function where to look for the entered variables Hi there, Could someone help me with the following programming problem..? I have written a function that works for my intended purpose, but it is quite closely tied to a particular dataframe and the names of the variables in this dataframe. However, I'd like to use the same function for different dataframes and variables. My problem is that I'm not quite sure how to tell my function in which dataframe the entered variables are located. 
Here's some reproducible data and the function: # create reproducible data set.seed(124) xvar <- sample(0:3, 1000, replace = T) yvar <- sample(0:1, 1000, replace=T) zvar <- rnorm(100) lvar <- sample(0:1, 1000, replace=T) Fulldf <- as.data.frame(cbind(xvar,yvar,zvar,lvar)) Fulldf$xvar <- factor(xvar, labels=c("blue","green","red","yellow")) Fulldf$yvar <- factor(yvar, labels=c("area1","area2")) Fulldf$lvar <- factor(lvar, labels=c("yes","no")) and here's the function in the form that it currently works: from a subset of the dataframe Fulldf, a contingency table is created (in my actual data, several other operations are then performed on that contingency table, but these are not relevant for the problem in question, therefore I've deleted it) . # function as it currently works: tailored to a particular dataframe (Fulldf) myfunct <- function(subgroup){ # enter a particular subgroup for which the contingency table should be calculated (i.e. a particular value of the factor lvar) Data.tmp <- subset(Fulldf, lvar==subgroup, select=c("xvar","yvar")) #restrict dataframe to given subgroup and two columns of the original dataframe Data.tmp <- na.omit(Data.tmp) # exclude missing values indextable <- table(Data.tmp$xvar, Data.tmp$yvar) # make contingency table return(indextable) } #Since I need to use the function with different dataframes and variable names, I'd like to be able to tell my function the name of the dataframe and variables it should use for calculating the index. 
This is how I tried to modify the first part of the #function, but it didn't work: # function as I would like it to work: independent of any particular dataframe or variable names (doesn't work) myfunct.better <- function(subgroup, lvarname, yvarname, dataframe){ #enter the subgroup, the variable names to be used and the dataframe in which they are found Data.tmp <- subset(dataframe, lvarname==subgroup, select=c("xvar", deparse(substitute(yvarname)))) # trying to subset the given dataframe for the given subgroup of the given variable. The variable "xvar" happens to have the same name in all dataframes) but the variable yvarname has different names in the different dataframes Data.tmp <- na.omit(Data.tmp) indextable <- table(Data.tmp$xvar, Data.tmp$yvarname) # create the contingency table on the basis of the entered variables return(indextable) } calling myfunct.better("yes", lvarname=lvar, yvarname=yvar, dataframe=Fulldf) results in the following error: Error in `[.data.frame`(x, r, vars, drop = drop) : undefined columns selected My feeling is that R doesn't know where to look for the entered variables (lvar, yvar), but I'm not sure how to solve this problem. I tried using with() and even attach() within the function, but that didn't work. Any help is greatly appreciated. Best, Esther P.S.: Are there books that elaborate programming in R for beginners -- and I mean things like how to best use vectorization instead of loops and general "best practice" tips for programming. Most of the books I've been looking at focus on applying R for particular statistical analyses, and only comparably briefly deal with more general programming aspects. I was wondering if there's any books or tutorials out there that cover the latter aspects in a more elaborate and systematic way...? 
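
A minimal sketch of the bracket-based approach suggested in the reply above, assuming the caller passes the column names as character strings. The function name myfunct_str and the xvarname argument are hypothetical and not part of the thread; the data frame is rebuilt here only so that the example is self-contained.

```r
# Rebuild a data frame with the same layout as Fulldf in the thread.
set.seed(124)
Fulldf <- data.frame(
  xvar = factor(sample(0:3, 1000, replace = TRUE),
                labels = c("blue", "green", "red", "yellow")),
  yvar = factor(sample(0:1, 1000, replace = TRUE),
                labels = c("area1", "area2")),
  lvar = factor(sample(0:1, 1000, replace = TRUE),
                labels = c("yes", "no"))
)

# Hypothetical helper: all column names arrive as strings, so they can be
# used directly with [[ ]] and [ , ] without any non-standard evaluation.
myfunct_str <- function(dataframe, subgroup, lvarname, yvarname,
                        xvarname = "xvar") {
  # Logical row filter; rows with a missing grouping value are dropped.
  keep <- !is.na(dataframe[[lvarname]]) & dataframe[[lvarname]] == subgroup
  Data.tmp <- na.omit(dataframe[keep, c(xvarname, yvarname)])
  # [[ ]] looks the column up by the *value* of yvarname, whereas
  # Data.tmp$yvarname would look for a column literally called "yvarname".
  table(Data.tmp[[xvarname]], Data.tmp[[yvarname]])
}

tab <- myfunct_str(Fulldf, "yes", lvarname = "lvar", yvarname = "yvar")
print(tab)
```

Passing strings keeps the function free of subset(), whose help page warns against programmatic use, at the cost of requiring quotes in the call.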
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

From b.rowlingson at lancaster.ac.uk Fri Apr 1 13:36:44 2011
From: b.rowlingson at lancaster.ac.uk (Barry Rowlingson)
Date: Fri, 1 Apr 2011 12:36:44 +0100
Subject: [R] MySql Versus R
In-Reply-To:
References:
Message-ID:

On Fri, Apr 1, 2011 at 11:46 AM, Henri Mone wrote:
> Dear R Users,
>
> I use for my data crunching a combination of MySQL and GNU R. I have
> to handle huge/medium-sized data which is stored in a MySQL
> database, R executes a SQL command to fetch the data and does the
> plotting with the built-in R plotting functions.
>
> The (low level) calculations like summing, dividing, grouping, sorting
> etc. can be done either with the sql command on the MySQL side or on
> the R side.
> My question is what is faster for this low level calculations / data
> rearrangement MySQL or R? Is there a general rule of thumb what to
> shift to the MySql side and what to the R side?

Given that you are already set up to test this yourself, why don't you? SELECT everything from a table and add it in R, and then SELECT sum(everything) from a table and compare the time (obviously your example might be more complex). Post some benchmark test results together with your hardware spec. Probably best sent to the db-flavour R mailing list.

Is the MySQL server running locally, i.e. on the same machine? Maybe PostgreSQL will be even faster?

So many of these questions are problem-specific and hardware-setup related. You can get massive speedups by having more RAM, or more disk, or spreading your giant database onto multiple servers.

Rules of thumb are rare in this world, since everyone's thumbs are different sizes and are being stuck into different sized problems.
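
The comparison suggested above can be sketched roughly as follows. This is only an illustration under stated assumptions: an in-memory SQLite database, reached through the DBI and RSQLite packages, stands in for the MySQL server (neither package is mentioned in this message), and the table name toy, the columns g and v, and the data are made up. Timings from such a toy setup say nothing about a real server and network.

```r
# Made-up data: one grouping column and one numeric column.
set.seed(1)
df <- data.frame(g = sample(letters[1:5], 1e5, replace = TRUE),
                 v = runif(1e5),
                 stringsAsFactors = FALSE)

# Variant 1: all rows are already in R, aggregate on the R side.
t_r <- system.time(res_r <- tapply(df$v, df$g, sum))

# Variant 2: let the database aggregate and return only five rows.
# Assumption: DBI and RSQLite are installed; skipped otherwise.
if (requireNamespace("DBI", quietly = TRUE) &&
    requireNamespace("RSQLite", quietly = TRUE)) {
  con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
  DBI::dbWriteTable(con, "toy", df)
  t_db <- system.time(
    res_db <- DBI::dbGetQuery(
      con, "SELECT g, SUM(v) AS total FROM toy GROUP BY g ORDER BY g")
  )
  DBI::dbDisconnect(con)
  cat("R side elapsed:  ", t_r[["elapsed"]], "\n")
  cat("DB side elapsed: ", t_db[["elapsed"]], "\n")
}
```

With a real MySQL server the first variant would also have to include the time to fetch every row, which is where the transfer cost discussed in this thread shows up.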
Barry

From therneau at mayo.edu Fri Apr 1 13:49:36 2011
From: therneau at mayo.edu (Terry Therneau)
Date: Fri, 01 Apr 2011 06:49:36 -0500
Subject: [R] Cox Proportional Hazards model with a time-varying covariate
Message-ID: <1301658576.20397.24.camel@wallace>

> The R results I'm getting are similar to the SAS results but not very
> close. For example, my coefficient for TRT is .2913 in R but SAS
> gives .3450. I'd like to be able to run the R Code with method =
> "exact" to make it as same as the SAS example but I can't seem to get
> it to work

You need to read the documentation more carefully.

The "exact partial likelihood" (Cox's label, not mine), which is appropriate for data on a discrete time scale, is invoked by the "exact" option for ties in R and by using the "discrete" option in SAS. (In hindsight I actually like their choice of a label somewhat better, since users more often make the correct association of method to data.) This method can take nearly forever to compute, however, when there are a lot of subjects tied at a given time. For instance, at time 20 your data set has 10 events out of 77 subjects; the term in the exact partial likelihood requires a sum over all the possible subsets of 10 chosen from 77, approximately 10^12 terms: you might retire before it finishes.

The "exact marginal likelihood" proposed by Prentice is invoked in SAS using ties=exact; it is not implemented in R. I have never found any compelling reason to program it.

Statistical comment: in statistics an "exact" method means that the solution can be computed exactly; the label has absolutely no connection with the question of whether the method is a sensible thing to do. In the case of a Cox model, I prefer and strongly recommend using the Efron approximation for ties; this is the default in R. The Breslow approximation is usually quite good enough; this is the default in SAS.
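
As a rough illustration of the points above, the following sketch first counts the subsets involved in the 10-out-of-77 example and then fits the same model under the three tie-handling options. The toy data set and its column names are invented purely for illustration and have nothing to do with the poster's data; the sketch assumes a version of the survival package in which coxph() accepts a ties argument (older releases spell it method).

```r
# Number of ways to choose the 10 tied events from 77 subjects at risk:
# the size of the sum in one term of the exact partial likelihood.
n_terms <- choose(77, 10)
print(n_terms)

# Assumption: the survival package is installed; skipped otherwise.
if (requireNamespace("survival", quietly = TRUE)) {
  library(survival)

  # Invented toy data with tied event times.
  toy <- data.frame(
    time   = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5),
    status = c(1, 1, 1, 0, 1, 1, 0, 1, 1, 0),
    trt    = c(0, 1, 0, 1, 1, 0, 1, 0, 0, 1)
  )

  # Same model, three ways of handling ties. "exact" is the exact partial
  # likelihood, which the message says corresponds to "discrete" in SAS.
  tie_methods <- c(efron = "efron", breslow = "breslow", exact = "exact")
  fits <- lapply(tie_methods, function(m)
    coxph(Surv(time, status) ~ trt, data = toy, ties = m))

  # Coefficients differ slightly between methods when ties are present.
  print(sapply(fits, coef))
}
```

On a data set this small the exact option finishes instantly; the cost only becomes prohibitive when many subjects are tied at one time, as in the 77-subject example.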
Terry Therneau From deepayan.sarkar at gmail.com Fri Apr 1 13:58:54 2011 From: deepayan.sarkar at gmail.com (Deepayan Sarkar) Date: Fri, 1 Apr 2011 17:28:54 +0530 Subject: [R] lattice: wireframe "eats up" points; how to make points on wireframe visible? In-Reply-To: References: <1D3CE86E-61F3-4D79-B3CB-BCDB79FFEFC1@web.de> Message-ID: On Thu, Mar 31, 2011 at 3:26 AM, Marius Hofert wrote: > Dear Deepayan, > > thanks for answering. It's never too late to be useful. > > I see your point in the minimal example. I checked the z-axis limits in my > original problem for the point to be inside and it wasn't there. I can't easily > reproduce it from the minimal example though. I'll get back to you if I run into > this problem again. > > In the example below, both points are shown. Although one lies clearly below/under > the surface, it looks as if it lies above. One would probably have to plot this > point first so that the wire frame is above the point. But still, this is > misleading since the eye believes that the wireframe is *not* transparent. This > happens because the lines connecting (0,1,0)--(1,1,0)--(1,0,0) [dashed ones] are > not completely visible [also not the one from (1,1,0) to (1,1,1)]. How can I make > them visible even if they lie behind/under the wireframe? I tried to work with > col="transparent" and with alpha=... but neither did work as I expected. > My goal is to make the small "rectangles" between the wire transparent. > I also use these plots in posters with a certain gradient-like background color > and so it's a bit annoying that the "rectangles" are filled with white color. Yes, that probably needs a new argument; the default computation is a bit of a hack. 
You can try the following workaround for now:

wireframe(z~x*y, pts=pts, aspect=1, scales=list(col=1, arrows=FALSE),
          zlim=c(0,1),
          par.settings = list(background = list(col = "#ffffff11")), ## <- NEW
          panel.3d.wireframe = function(x,y,z,xlim,ylim,zlim,xlim.scaled,
          ylim.scaled,zlim.scaled,pts,...){
              panel.3dwire(x=x, y=y, z=z, xlim=xlim, ylim=ylim, zlim=zlim,
                           xlim.scaled=xlim.scaled, ylim.scaled=ylim.scaled,
                           zlim.scaled=zlim.scaled, ...)
              panel.3dscatter(x=pts[,1], y=pts[,2], z=pts[,3],
                              xlim=xlim, ylim=ylim, zlim=zlim,
                              xlim.scaled=xlim.scaled, ylim.scaled=ylim.scaled,
                              zlim.scaled=zlim.scaled, type="p", col=c(2,3),
                              cex=1.8, pch=c(3,4), .scale=TRUE, ...)
          })

col = "#ffffff00" instead will give you full transparency (but "transparent" will not work), and col = "#ffffff77" will be less transparent and so on.

-Deepayan

From e.hofstadler at gmail.com Fri Apr 1 14:28:23 2011
From: e.hofstadler at gmail.com (E Hofstadler)
Date: Fri, 1 Apr 2011 15:28:23 +0300
Subject: [R] programming: telling a function where to look for the entered variables
In-Reply-To: <8023110704927321697@unknownmsgid>
References: <8023110704927321697@unknownmsgid>
Message-ID:

Thanks Nick and Juan for your replies.

Nick, thanks for pointing out the warning in subset(). I'm not sure though I understand the example you provided -- because despite using subset() rather than bracket notation, the original function (myfunct) does what is expected of it. The problem I have is with the second function (myfunct.better), where variable names + dataframe are not fixed within the function but passed to the function when calling it -- and even with bracket notation I don't quite manage to tell R where to look for the columns that related to the entered column names.
(but then perhaps I misunderstood you)

This is what I tried (using bracket notation):

myfunct.better <- function(dataframe, subgroup, lvarname, yvarname){
    Data.tmp <- dataframe[dataframe[,deparse(substitute(lvarname))]==subgroup,
                          c("xvar",deparse(substitute(yvarname)))]
}

but this creates an empty contingency table only -- perhaps because my use of deparse() is flawed (I think what is converted into a string is "lvarname" and "yvarname", rather than the column names that these two function-variables represent in the dataframe)?

2011/4/1 Nick Sabbe :
> See the warning in ?subset.
> Passing the column name of lvar is not the same as passing the 'contextual
> column' (as I coin it in these circumstances).
> You can solve it by indeed using [] instead.
>
> For my own comfort, here is the relevant line from your original function:
> Data.tmp <- subset(Fulldf, lvar==subgroup, select=c("xvar","yvar"))
> Which should become something like (untested but should be close):
> Data.tmp <- Fulldf[Fulldf[,"lvar"]==subgroup, c("xvar","yvar")]
>
> This should be a lot easier to translate based on column names, as the
> column names are now used as such.
>
> HTH,
>
>
> Nick Sabbe
> --
> ping: nick.sabbe at ugent.be
> link: http://biomath.ugent.be
> wink: A1.056, Coupure Links 653, 9000 Gent
> ring: 09/264.59.36
>
> -- Do Not Disapprove
>
>
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
> Behalf Of E Hofstadler
> Sent: Friday 1 April 2011 13:09
> To: r-help at r-project.org
> Subject: [R] programming: telling a function where to look for the entered
> variables
>
> Hi there,
>
> Could someone help me with the following programming problem..?
>
> I have written a function that works for my intended purpose, but it
> is quite closely tied to a particular dataframe and the names of the
> variables in this dataframe. However, I'd like to use the same
> function for different dataframes and variables.
My problem is that > I'm not quite sure how to tell my function in which dataframe the > entered variables are located. > > Here's some reproducible data and the function: > > # create reproducible data > set.seed(124) > xvar <- sample(0:3, 1000, replace = T) > yvar <- sample(0:1, 1000, replace=T) > zvar <- rnorm(100) > lvar <- sample(0:1, 1000, replace=T) > Fulldf <- as.data.frame(cbind(xvar,yvar,zvar,lvar)) > Fulldf$xvar <- factor(xvar, labels=c("blue","green","red","yellow")) > Fulldf$yvar <- factor(yvar, labels=c("area1","area2")) > Fulldf$lvar <- factor(lvar, labels=c("yes","no")) > > and here's the function in the form that it currently works: from a > subset of the dataframe Fulldf, a contingency table is created (in my > actual data, several other operations are then performed on that > contingency table, but these are not relevant for the problem in > question, therefore I've deleted it) . > > # function as it currently works: tailored to a particular dataframe > (Fulldf) > > myfunct <- function(subgroup){ # enter a particular subgroup for which > the contingency table should be calculated (i.e. a particular value of > the factor lvar) > Data.tmp <- subset(Fulldf, lvar==subgroup, select=c("xvar","yvar")) > #restrict dataframe to given subgroup and two columns of the original > dataframe > Data.tmp <- na.omit(Data.tmp) # exclude missing values > indextable <- table(Data.tmp$xvar, Data.tmp$yvar) # make contingency table > return(indextable) > } > > #Since I need to use the function with different dataframes and > variable names, I'd like to be able to tell my function the name of > the dataframe and variables it should use for calculating the index. 
> This is how I tried to modify the first part of the #function, but it > didn't work: > > # function as I would like it to work: independent of any particular > dataframe or variable names (doesn't work) > > myfunct.better <- function(subgroup, lvarname, yvarname, dataframe){ > #enter the subgroup, the variable names to be used and the dataframe > in which they are found > ? ?Data.tmp <- subset(dataframe, lvarname==subgroup, select=c("xvar", > deparse(substitute(yvarname)))) # trying to subset the given dataframe > for the given subgroup of the given variable. The variable "xvar" > happens to have the same name in all dataframes) but the variable > yvarname has different names in the different dataframes > Data.tmp <- na.omit(Data.tmp) > ? ?indextable <- table(Data.tmp$xvar, Data.tmp$yvarname) # create the > contingency table on the basis of the entered variables > return(indextable) > } > > calling > > myfunct.better("yes", lvarname=lvar, yvarname=yvar, dataframe=Fulldf) > > results in the following error: > > Error in `[.data.frame`(x, r, vars, drop = drop) : > ?undefined columns selected > > My feeling is that R doesn't know where to look for the entered > variables (lvar, yvar), but I'm not sure how to solve this problem. I > tried using with() and even attach() within the function, but that > didn't work. > > Any help is greatly appreciated. > > Best, > Esther > > P.S.: > Are there books that elaborate programming in R for beginners -- and I > mean things like how to best use vectorization instead of loops and > general "best practice" tips for programming. Most of the books I've > been looking at focus on applying R for particular statistical > analyses, and only comparably briefly deal with more general > programming aspects. I was wondering if there's any books or tutorials > out there that cover the latter aspects in a more elaborate and > systematic way...? 
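
For the myfunct.better question quoted above, a minimal sketch of one possible generalisation follows. It assumes that all column names are passed as quoted strings; the helper name subgroup_table, its argument order and the simulated data are illustrative only and do not come from the thread.

```r
# Hypothetical helper (name and arguments are assumptions, not from the thread):
# build a contingency table for one subgroup, with the data frame and all
# column names supplied by the caller as character strings.
subgroup_table <- function(dataframe, subgroup, lvarname, xvarname, yvarname) {
  # rows where the grouping column equals `subgroup`;
  # which() silently drops rows where the comparison is NA
  rows <- which(dataframe[[lvarname]] == subgroup)
  tmp <- dataframe[rows, c(xvarname, yvarname)]
  tmp <- na.omit(tmp)                      # exclude missing values
  table(tmp[[xvarname]], tmp[[yvarname]])  # contingency table
}

# Simulated data in the spirit of the Fulldf example
set.seed(124)
Fulldf <- data.frame(
  xvar = factor(sample(0:3, 1000, replace = TRUE), levels = 0:3,
                labels = c("blue", "green", "red", "yellow")),
  yvar = factor(sample(0:1, 1000, replace = TRUE), levels = 0:1,
                labels = c("area1", "area2")),
  lvar = factor(sample(0:1, 1000, replace = TRUE), levels = 0:1,
                labels = c("yes", "no"))
)

tab <- subgroup_table(Fulldf, "yes",
                      lvarname = "lvar", xvarname = "xvar", yvarname = "yvar")
print(tab)
```

Passing strings avoids both deparse(substitute()) and subset(), whose help page recommends the standard subsetting functions such as [ for programming.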
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

From nick.sabbe at ugent.be Fri Apr 1 14:42:22 2011
From: nick.sabbe at ugent.be (Nick Sabbe)
Date: Fri, 1 Apr 2011 14:42:22 +0200
Subject: [R] programming: telling a function where to look for the entered variables
In-Reply-To:
References: <8023110704927321697@unknownmsgid>
Message-ID: <04ad01cbf06a$3f55fa70$be01ef50$@sabbe@ugent.be>

This should be a version that does what you want.
Because you named the variable lvarname, I assumed you were already passing
"lvar" instead of trying to pass lvar (without the quotes), which is in no
way a 'name'.

myfunct.better <- function(subgroup, lvarname, xvarname, yvarname, dataframe)
{
    # enter the subgroup, the variable names to be used and the dataframe
    # in which they are found
    Data.tmp <- dataframe[dataframe[,lvarname]==subgroup, c(xvarname,yvarname)]
    Data.tmp <- na.omit(Data.tmp)
    # create the contingency table on the basis of the entered variables
    indextable <- table(Data.tmp[,xvarname], Data.tmp[,yvarname])
    # actually, if I remember well, you could simply use
    # indextable <- table(Data.tmp) here;
    # that would allow for some more simplifications (replace xvarname and
    # yvarname by columnsOfInterest or similar, and pass that instead of
    # c(xvarname, yvarname))
    return(indextable)
}

myfunct.better("yes", lvarname="lvar", xvarname="xvar", yvarname="yvar",
               dataframe=Fulldf)

HTH,

Nick Sabbe
--
ping: nick.sabbe at ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove

From ssefick at gmail.com Fri Apr 1 14:44:01 2011
From: ssefick at gmail.com (stephen sefick)
Date: Fri, 1 Apr 2011 07:44:01 -0500
Subject: [R] Linear Model with curve fitting parameter?
In-Reply-To:
References: <8512_1301611104_1301611104_AANLkTikOxmcE=oMvHBuB8x61fxXJwnexXJkr+Qp3Tawp@mail.gmail.com>
Message-ID:

Setting Z=Q-A would give the incorrect dimensions. I could set Z=Q/A. Is
fitting an nls model the same as fitting an ols? These data are hydraulic
data from ~47 sites. To assess predictive ability I am removing one
site, fitting a new model, and then assessing the fit with a myriad of
model assessment criteria. Should I get the same answer with ols vs nls?
Thank you for all of your help.

Stephen

On Thu, Mar 31, 2011 at 8:34 PM, Steven McKinney wrote:
>
>> -----Original Message-----
>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of stephen sefick
>> Sent: March-31-11 3:38 PM
>> To: R help
>> Subject: [R] Linear Model with curve fitting parameter?
>>
>> I have a model Q=K*A*(R^r)*(S^s)
>>
>> A, R, and S are data I have and K is a curve fitting parameter.
>> I have linearized as
>>
>> log(Q)=log(K)+log(A)+r*log(R)+s*log(S)
>>
>> I have taken the log of the data that I have and this is the model
>> formula without the K part
>>
>> lm(Q~offset(A)+R+S, data=x)
>>
>> What is the formula that I should use?
>
> Let Z = Q - A for your logged data.
>
> Fitting lm(Z ~ R + S, data = x) should yield
> intercept parameter estimate = estimate for log(K)
> R coefficient parameter estimate = estimate for r
> S coefficient parameter estimate = estimate for s
>
>
> Steven McKinney
>
> Statistician
> Molecular Oncology and Breast Cancer Program
> British Columbia Cancer Research Centre
>
>>
>> Thanks for all of your help. I can provide a subset of data if necessary.

--
Stephen Sefick
____________________________________
| Auburn University                 |
| Biological Sciences               |
| 331 Funchess Hall                 |
| Auburn, Alabama                   |
| 36849                             |
|___________________________________|
| sas0025 at auburn.edu            |
| http://www.auburn.edu/~sas0025    |
|___________________________________|

Let's not spend our time and resources thinking about things that are
so little or so large that all they really do for us is puff us up and
make us feel like gods. We are mammals, and have not exhausted the
annoying little problems of being mammals.

                                -K. Mullis

"A big computer, a complex algorithm and a long time does not equal science."

                              -Robert Gentleman

From ehlers at ucalgary.ca Fri Apr 1 14:45:45 2011
From: ehlers at ucalgary.ca (Peter Ehlers)
Date: Fri, 01 Apr 2011 05:45:45 -0700
Subject: [R] Simple lattice question
In-Reply-To: <5CD78996B8F8844D963C875D3159B94A021BCF2B@dsrcorreo>
References: <5CD78996B8F8844D963C875D3159B94A021BCB2B@dsrcorreo> <4D948489.8000000@ucalgary.ca> <5CD78996B8F8844D963C875D3159B94A021BCD81@dsrcorreo> <4D94A701.9040201@ucalgary.ca> <5CD78996B8F8844D963C875D3159B94A021BCF2B@dsrcorreo>
Message-ID: <4D95C8F9.3010005@ucalgary.ca>

Ruben,

One more thing you might try is to add a 'size' component to the lines list.
The default is size=5, but since you have enough space in
the plot, try size=8 or 9; the line types will show up more clearly:

   ....
   lines=list(pch=1:4, lty=1:4, type='b', size=8),
   ....

Peter Ehlers

On 2011-03-31 23:28, Rubén Roa wrote:
>> -----Original Message-----
>> From: Peter Ehlers [mailto:ehlers at ucalgary.ca]
>> Sent: Thursday, 31 March 2011 18:09
>> To: Rubén Roa
>> CC: r-help at r-project.org
>> Subject: Re: [R] Simple lattice question
>>
>> On 2011-03-31 06:58, Rubén Roa wrote:
>>> Thanks Peters!
>>>
>>> Just a few minor glitches now:
>>>
>>> require(lattice)
>>> data<- data.frame(SP=sort(rep(as.factor(c('A','B','C','D','E')),12)),
>>>                   x=rpois(60,10),
>>>                   y=rep(c(rep(0,4),rep(10,4),rep(20,4)),5),
>>>                   z=rep(1:4,15))
>>> xyplot(x~y|SP,data=data,groups=z,layout=c(2,3),pch=1:4,lty=1:4,col='black',type='b',
>>>        key=list(x = .65, y = .75, corner = c(0, 0), points=TRUE,
>>>                 lines=TRUE, pch=1:4, lty=1:4, type='b',
>>>                 text=list(lab = as.character(unique(data$z)))))
>>>
>>> I have an extra column of symbols on the legend,
>>>
>>> and,
>>>
>>> would like to add a title to the legend. Such as 'main' for plots.
>>>
>>> Any suggestions?
>>
>> for key(), make 'lines' into a list:
>>
>> xyplot(x~y|SP,data=data,groups=z,layout=c(2,3),
>>        pch=1:4,lty=1:4,col='black',type='b',
>>        key=list(x = .65, y = .75, corner = c(0, 0),
>>                 title="title here", cex.title=.9, lines.title=3,
>>                 lines=list(pch=1:4, lty=1:4, type='b'),
>>                 text=list(lab = as.character(unique(data$z)))))
>>
>> Peter Ehlers
>
> ... that works. Thanks Peter (sorry I misspelled your name before). The clean code is now:
>
> require(lattice)
> data<- data.frame(SP=sort(rep(as.factor(c('A','B','C','D','E')),12)),
>                   x=rpois(60,10),
>                   y=rep(c(rep(0,4),rep(10,4),rep(20,4)),5),
>                   z=rep(1:4,15))
> xyplot(x~y|SP,data=data,groups=z,layout=c(2,3),pch=1:4,lty=1:4,col='black',type='b',
>        key=list(x = .65, y = .75, corner = c(0, 0),
>                 lines=list(pch=1:4, lty=1:4, type='b'),
>                 title=expression(CO^2),
>                 text=list(lab = as.character(unique(data$z)))))
>
> David's code works too (thanks to you too!) and is somewhat shorter
>
> xyplot(x~y|SP, data=data,groups=z, layout=c(2,3),
>        par.settings=simpleTheme(pch=1:4,lty=1:4,col='black'), type="b",
>        auto.key = list(x = .6, y = .7, lines=TRUE, corner = c(0, 0)))
>
> but the lines and symbols are in different columns, and the line types look as if they were in bold.
>
> Rubén
>
> ____________________________________________________________________________________
>
> Dr. Rubén Roa-Ureta
> AZTI - Tecnalia / Marine Research Unit
> Txatxarramendi Ugartea z/g
> 48395 Sukarrieta (Bizkaia)
> SPAIN

From d.firth at warwick.ac.uk Fri Apr 1 14:13:43 2011
From: d.firth at warwick.ac.uk (David Firth)
Date: Fri, 1 Apr 2011 13:13:43 +0100
Subject: [R] useR! 2011 abstract deadline extended by 2 days
Message-ID: <201104011313.43495.d.firth@warwick.ac.uk>

useR! 2011 Conference, University of Warwick, Coventry, UK
16-18 August, 2011

(International R user conference; see
http://www.r-project.org/conferences.html for the history.)

The abstract submission period for this year's useR! conference, which
was due to end today, is *extended* by 2 days. The FINAL DEADLINE is:

Monday 4 April at 0900 GMT.

For all conference information, including the abstract submission
details, please see http://www.warwick.ac.uk/statsdept/useR-2011/

David Firth (for the useR! 2011 organisers)

--
Professor David Firth
http://go.warwick.ac.uk/dfirth

From january.weiner at mpiib-berlin.mpg.de Fri Apr 1 12:32:19 2011
From: january.weiner at mpiib-berlin.mpg.de (January Weiner)
Date: Fri, 1 Apr 2011 12:32:19 +0200
Subject: [R] Syntax coloring in R console
Message-ID:

Dear all,

I am a happy user of the R console, but I would like to see syntax
coloring. I use R 2.12 in Ubuntu Linux. I have found the packages
"xterm256" and "highlight", but I was not able to figure out how to
use them to highlight the syntax in console output.

Also, I tried several GUI interfaces, but I was not able to find
something that suits me better than the default R console. R cmdr is
definitely not for me, as I don't want to fundamentally change the way
I am managing my data in R. RKWard seems to be nice (from screenshots),
but its installation requires all the base KDE libraries, which I
don't want to install. I tried JGR, the GUI for R, but I have found
the following problems with this package:

- I was not able to change the background color from a repulsive grey,
- apparently, GNU readline is not implemented in that package, that
  is, there is no functionality similar to ctrl-r (which searches
  through the history for matching commands), something I use
  frequently, and
- tab expansion is of limited use (e.g. doesn't browse files in the
  current directory when expanding quoted arguments e.g. in
  "read.table").

All in all, I'd be happy to continue using the plain R console, but
syntax highlighting would be nice. Any advice would be extremely
welcome.

Kind regards,
January

> sessionInfo()
R version 2.12.2 (2011-02-25)
Platform: i486-pc-linux-gnu (32-bit)

locale:
 [1] LC_CTYPE=en_US.utf8 LC_NUMERIC=C LC_TIME=en_US.utf8 LC_COLLATE=en_US.utf8 LC_MONETARY=C
 [6] LC_MESSAGES=en_US.utf8 LC_PAPER=en_US.utf8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

--

From deepayan.sarkar at gmail.com Fri Apr 1 14:53:57 2011
From: deepayan.sarkar at gmail.com (Deepayan Sarkar)
Date: Fri, 1 Apr 2011 18:23:57 +0530
Subject: [R] lattice xscale.components: different ticks on top/bottom axis
In-Reply-To: <201103101854.p2AIsL0A020203@hypatia.math.ethz.ch>
References: <201103101854.p2AIsL0A020203@hypatia.math.ethz.ch>
Message-ID:

On Fri, Mar 11, 2011 at 12:28 AM, wrote:
> Good afternoon,
>
> I am trying to create a plot where the bottom and top axes have the same
> scale but different tick marks. I tried a user-defined xscale.components
> function but it does not produce the desired results. Can anybody suggest
> where my use of the xscale.components function is incorrect?
>
> For example, the code below tries to create a plot where horizontal axes
> limits are c(0,10), top axis has ticks at odd integers, and bottom axis
> has ticks at even integers.
>
> library(lattice)
>
> df <- data.frame(x=1:10,y=1:10)
>
> xscale.components.A <- function(...,user.value=NULL) {
>   # get default axes definition list; print user.value
>   ans <- xscale.components.default(...)
>   print(user.value)
>
>   # start with the same definition of bottom and top axes
>   ans$top <- ans$bottom
>
>   # - bottom labels
>   ans$bottom$labels$at <- seq(0,10,by=2)
>   ans$bottom$labels$labels <- paste("B",seq(0,10,by=2),sep="-")
>
>   # - top labels
>   ans$top$labels$at <- seq(1,9,by=2)
>   ans$top$labels$labels <- paste("T",seq(1,9,by=2),sep="-")
>
>   # return axes definition list
>   return(ans)
> }
>
> oltc <- xyplot(y~x,data=df,
>                scales=list(x=list(limits=c(0,10),at=0:10,alternating=3)),
>                xscale.components=xscale.components.A,
>                user.value=1)
> print(oltc)
>
> The code generates a figure with incorrectly placed bottom and top
> labels. Bottom labels "B-0", "B-2", ... are at 0, 1, ... and top labels
> "T-1", "T-3", ... are at 0, 1, ... When the axis function runs out of
> labels, it replaces labels with NA.
>
> It appears that lattice uses top$ticks$at to place labels and
> top$labels$labels for labels. Is there a way to override this behaviour
> (other than to expand the "labels$labels" vector to be as long as
> "ticks$at" vector and set necessary elements to "")?

Well, $ticks$at is used to place the ticks, and $labels$at is used to
place the labels. They should typically be the same, but you have
changed one and not the other. Everything seems to work if you set
$ticks$at to the same values as $labels$at:

  ## - bottom labels
+ ans$bottom$ticks$at <- seq(0,10,by=2)
  ans$bottom$labels$at <- seq(0,10,by=2)
  ans$bottom$labels$labels <- paste("B",seq(0,10,by=2),sep="-")

  ## - top labels
+ ans$top$ticks$at <- seq(1,9,by=2)
  ans$top$labels$at <- seq(1,9,by=2)
  ans$top$labels$labels <- paste("T",seq(1,9,by=2),sep="-")

> Also, can a user parameter be passed into the xscale.components() function?
> (For example, locations and labels of ticks on the top axis). In the
> code above, print(user.value) returns NULL even though in the xyplot()
> call user.value is 1.

No. Unrecognized arguments are passed to the panel function only, not
to any other function. However, you can always define an inline
function:

oltc <- xyplot(y~x,data=df,
               scales=list(x=list(limits=c(0,10), at = 0:10, alternating=3)),
               xscale.components = function(...)
                   xscale.components.A(..., user.value=1))

Hope that helps (and sorry for the late reply).

-Deepayan

From e.hofstadler at gmail.com Fri Apr 1 14:54:08 2011
From: e.hofstadler at gmail.com (E Hofstadler)
Date: Fri, 1 Apr 2011 15:54:08 +0300
Subject: [R] programming: telling a function where to look for the entered variables
In-Reply-To: <3788224312490154257@unknownmsgid>
References: <8023110704927321697@unknownmsgid> <3788224312490154257@unknownmsgid>
Message-ID:

2011/4/1 Nick Sabbe :
> This should be a version that does what you want.

Indeed it does, thank you very much!

> Because you named the variable lvarname, I assumed you were already passing
> "lvar" instead of trying to pass lvar (without the quotes), which is in no
> way a 'name'.

Sorry about that, I can see how my variable names were somewhat confusing.

Many thanks once again!

From johannes_hain at web.de Fri Apr 1 13:59:08 2011
From: johannes_hain at web.de (Johannes Hain)
Date: Fri, 1 Apr 2011 13:59:08 +0200 (CEST)
Subject: [R] guidelines for using the R logo
Message-ID: <1860388401.413654.1301659148608.JavaMail.fmail@mwmweb014>

Hello,

I am writing an R guide which is aimed especially at students at German
universities
(http://www.rrzn.uni-hannover.de/buch.html?&no_cache=1&titel=statistik_r).
For the title page, I want to use the R logo.
I found a similar question in the R mailing list about the same topic. The answer was that the usage of the logo is OK. However, this was already a few years ago and I was wondering whether this is still up-to-date or whether there are any guidelines I would violate when using the logo.

Thanks

--
Johannes Hain
University of Wuerzburg
Chair of Mathematics VIII (Statistics)
+49 931 31-84969

From orsolyatvincze at gmail.com Fri Apr 1 13:36:35 2011
From: orsolyatvincze at gmail.com (Tímea Vincze)
Date: Fri, 1 Apr 2011 14:36:35 +0300
Subject: [R] Information theoretic approach in GLS
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From Steve_Friedman at nps.gov Fri Apr 1 14:59:23 2011
From: Steve_Friedman at nps.gov (Steve_Friedman at nps.gov)
Date: Fri, 1 Apr 2011 08:59:23 -0400
Subject: [R] Syntax coloring in R console
In-Reply-To:
Message-ID:

RStudio is a new interface. I just started using it and find it very good. You might want to check it out. It does have Ubuntu capabilities, but you have to be sure that your graphics drivers are the same as those specified in the RStudio system.

Hope this helps

Steve Friedman Ph. D.
Ecologist / Spatial Statistical Analyst
Everglades and Dry Tortugas National Park
950 N Krome Ave (3rd Floor)
Homestead, Florida 33034

Steve_Friedman at nps.gov
Office (305) 224 - 4282
Fax (305) 224 - 4147

January Weiner
Sent by: r-help-bounces at r-project.org
04/01/2011 06:32 AM
Subject: [R] Syntax coloring in R console

Dear all,

I am a happy user of R console, but I would like to see syntax coloring. I use R 2.12 in Ubuntu Linux. I have found the packages "xterm256" and "highlight", but I was not able to figure out how to use it to highlight the syntax in console output.

Also, I tried several GUI interfaces, but I was not able to find something that suits me better than the default R console. R cmdr is definitely not for me, as I don't want to fundamentally change the way I am managing my data in R. Rkwrd seems to be nice (from screenshots), but its installation requires all the base KDE libraries, which I don't want to install.

I tried JGR, the GUI for R, but I have found the following problems with this package:

- I was not able to change the background color from a repulsive grey,
- apparently, GNU readline is not implemented in that package, that is, there is no functionality similar to ctrl-r (which searches through the history for matching commands), something I use frequently, and
- tab expansion is of limited use (e.g. doesn't browse files in the current directory when expanding quoted arguments e.g. in "read.table").

All in all, I'd be happy to continue using the plain R console, but syntax highlighting would be nice. Any advice would be extremely welcome.

Kind regards,

January

> sessionInfo()
R version 2.12.2 (2011-02-25)
Platform: i486-pc-linux-gnu (32-bit)

locale:
 [1] LC_CTYPE=en_US.utf8       LC_NUMERIC=C              LC_TIME=en_US.utf8        LC_COLLATE=en_US.utf8     LC_MONETARY=C
 [6] LC_MESSAGES=en_US.utf8    LC_PAPER=en_US.utf8       LC_NAME=C                 LC_ADDRESS=C              LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

--

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

From landronimirc at gmail.com Fri Apr 1 15:02:33 2011
From: landronimirc at gmail.com (Liviu Andronic)
Date: Fri, 1 Apr 2011 15:02:33 +0200
Subject: [R] Syntax coloring in R console
In-Reply-To:
References:
Message-ID:

Hello

On Fri, Apr 1, 2011 at 12:32 PM, January Weiner wrote:
> I have found the packages "xterm256" and "highlight", but I was not
> able to figure out how to use it to highlight the syntax in console
> output.
>
I've been in this position before, without finding a solution for syntax highlighting in the default R console.

> Also, I tried several GUI interfaces, but I was not able to find
> something that suits me better than the default R console. R cmdr is
> definitely not for me, as I don't want to fundamentally change the way
> I am managing my data in R. Rkwrd seems to be nice (from screenshots),
> but its installation requires all the base KDE libraries, which I
> don't want to install.
>
Recently RStudio was introduced [1] and, although beta, it received quasi-unanimous acclaim from the community, so you risk finding it useful too.

[1] http://alternativeto.net/software/rstudio/about

Regards
Liviu

--
Do you know how to read?
http://www.alienetworks.com/srtest.cfm
http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader
Do you know how to write?
http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail

From deepayan.sarkar at gmail.com Fri Apr 1 15:04:51 2011
From: deepayan.sarkar at gmail.com (Deepayan Sarkar)
Date: Fri, 1 Apr 2011 18:34:51 +0530
Subject: [R] lattice (panel.3dscatter): how to make plot symbol thicker?
In-Reply-To:
References: <4D76D2C8.1040708@yahoo.de> <4D76E5A1.8090903@ucalgary.ca>
Message-ID:

On Fri, Mar 11, 2011 at 12:29 PM, Marius Hofert wrote:
> Dear Deepayan,
>
> many thanks for answering.
>
> Another thing I am wondering is the following: I know you can have
> (3d-like) "crosses" in the wireframe plot. But are there any other
> 3d-like plot symbols? Of course one can use different colors to
> distinguish between several points. The problem is that most of the
> scientific journals do not allow colors [or it's expensive]. I am thus
> wondering if the plot symbols have equivalent 3d-versions. It looks a
> bit odd to draw a 2d cross in a 3d wireframe. Or a circle. Concerning
> the circle, it could be a small ball in 3d for example. Is there
> anything like this?
>

No, at least not easily.
The 3D crosses are done by mapping each point into 3 perpendicular line segments, then projecting the endpoints into the 2D space, and then joining them. This is hardcoded in panel.3dscatter() -- search for the section inside 'if (cross)'. For other 3D plotting characters, you need to replicate this process (which is not really that difficult, but a bit tedious).

If you are going to be doing a lot of fancy 3D graphics, I would strongly suggest considering rgl, which is a "real" 3D graphics system. Proper 3D graphics are difficult in systems based on vector graphics (like R graphics).

-Deepayan

From Samuel.Le at srlglobal.com Fri Apr 1 15:34:48 2011
From: Samuel.Le at srlglobal.com (Samuel Le)
Date: Fri, 1 Apr 2011 13:34:48 +0000
Subject: [R] controlling the labels width of a barplot
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From wwwhsd at gmail.com Fri Apr 1 15:39:53 2011
From: wwwhsd at gmail.com (Henrique Dallazuanna)
Date: Fri, 1 Apr 2011 10:39:53 -0300
Subject: [R] controlling the labels width of a barplot
In-Reply-To:
References:
Message-ID:

Try the cex.names argument in the barplot function.

On Fri, Apr 1, 2011 at 10:34 AM, Samuel Le wrote:
> Dear all,
>
> I am trying the barplot command but some of the labels are disappearing as there is not enough space on the graph to put them all.
>
> Here is an example of code that doesn't show all the labels:
>
> barplot(sort(runif(9,0,0.2),decreasing=TRUE),xlim=c(0,20),width=2,names.arg=c("first name","second name","third name","fourth name","fifth name","sixth name","seventh name","eigth name","nineth name"))
>
> Does someone know a way to control the size of the font in the barplot function, or to give them an inclination angle?
>
> Many thanks,
>
> Samuel
>
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O

From deepayan.sarkar at gmail.com Fri Apr 1 15:51:17 2011
From: deepayan.sarkar at gmail.com (Deepayan Sarkar)
Date: Fri, 1 Apr 2011 19:21:17 +0530
Subject: [R] XYPlot Conditioning Variable in Specific, Non-Alphanumeric Order. -- Resending with corrected .txt file
In-Reply-To:
References:
Message-ID:

On Sat, Mar 19, 2011 at 2:53 AM, Guy Jett wrote:
> Due to an error on my part, I have renamed the previously attached file from
>   T_5-04b_LTC-SE-SO-Compared.csv to
>   T_5-04b_LTC-SE-SO-Compared.txt.
> It remains a comma-delimited file so the extension can be changed and used per the script, or loaded separately.
> My sincere apologies,
> Guy
>
> -----Original Message-----
> From: Guy Jett
> Sent: Friday, March 18, 2011 1:13 PM
> To: 'r-help at R-project.org'
> Subject: XYPlot Conditioning Variable in Specific, Non-Alphanumeric Order.
>
> # I need to create an xyplot() where I control the specific order of
> #  both my conditioning variables.  The default code below plots the
> #  data correctly (dispersed across all 14 columns), but fails in two
> #  ways.  Both the primary conditioning variable (Transect), and the
> #  secondary conditioning variable (Offset) are in alphanumeric order,
> #  rather than the specific order I need.
>
> # Here is a call to the input datafile, which should be attached.  You may rename that .txt file to .csv for processing in the following line.
>     df <- read.csv(file = "T_5-04b_LTC-SE-SO-Compared.csv")
>
> # Basic default plot (correct data, incorrect layout):
>     xyplot((sbd+sed)/2 ~ Result | Offset+Transect, groups = PARLABEL, as.table = TRUE,
>     data = df,
>     layout = c(14,4), type = "b")
>
> # I attempted to control the order following the method described in
> #  the thread "[R] xyplot() - can you control how the plots are
> #  ordered?", but I appear to be missing, or misunderstanding
> #  something.  The modeled code is here.  It does put all the
> #  individual 'lattices'(?) in the needed order, BUT the graphics
> #  for the individual sets dump all the measurements into a single
> #  cell, on the diagonal, as if it's treating the conditioning
> #  variables as an [i,j] index.  Again not what I want.
>
> #  Draft code (incorrect data, correct layout):
>     Transects <- c("LNF02", "LSF02", "LUR01", "LURT1", "LUR03",
>                    "LUR05", "LUR09", "LUR11", "LUR12", "LUR15",
>                    "LUR16", "LUR21", "LURT3", "LUR25", "LURT4",
>                    "LUR28", "LUR36", "LUR38", "LUR46", "LURT5",
>                    "LUR48", "LLR04", "LLR10", "LLR11", "LLRT1",
>                    "LLR17", "LLRT2", "LLR18", "LLRT3", "LLR19")
>     Transects <- factor(Transects, levels = Transects)
>
>     Offsets <- c("T", "U", "V", "Y", "Z", "A", "B", "C", "D", "E", "F", "G", "H")
>     Offsets <- factor(Offsets, levels = Offsets)
>
>     xyplot((sbd+sed)/2 ~ Result | Offsets+Transects, groups = PARLABEL, as.table = TRUE,
>     data = df,
>     layout = c(13,5), type = "b")
>
> # What I am looking for is a combination of the default plot, but ordered in
> #  the layout of the second code fragment.

You need to specify the order of the levels explicitly (to override the default). Here is how to do it for one, you can similarly do the other:

> levels(df$Offset)
 [1] "T" "U" "V" "Y" "Z" "A" "B" "C" "D" "E" "F" "G" "H"
> df$Offset <- factor(df$Offset,
+                     levels = c("T", "U", "V", "Y", "Z", "A",
+                                "B", "C", "D", "E", "F", "G", "H"))
> levels(df$Offset)
 [1] "T" "U" "V" "Y" "Z" "A" "B" "C" "D" "E" "F" "G" "H"

Once you make these changes, your original call should work as desired.

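As a minimal, self-contained sketch of the re-levelling advice above: the data frame "toy" and its columns are invented purely for illustration and are not taken from the attached file, and the chosen level order is an arbitrary assumption. It only shows that factor() sorts levels by default and that an explicit levels= argument overrides that order, which is the order lattice uses for conditioning panels.

```r
# Hypothetical toy data; names and values are illustrative only.
toy <- data.frame(
  Offset = factor(c("A", "T", "Z", "U", "A", "T", "Z", "U")),
  x = 1:8,
  y = c(2, 4, 1, 3, 5, 7, 6, 8)
)

# Default behaviour: factor() sorts the levels.
default_levels <- levels(toy$Offset)

# Override: list the levels in the order the panels should appear.
# The underlying data values are unchanged; only the level order differs.
toy$Offset <- factor(toy$Offset, levels = c("T", "U", "Z", "A"))
custom_levels <- levels(toy$Offset)

# lattice arranges conditioning panels in level order. The plot object is
# only built here; call print(p) to draw it (needs the lattice package).
if (requireNamespace("lattice", quietly = TRUE)) {
  p <- lattice::xyplot(y ~ x | Offset, data = toy, as.table = TRUE, type = "b")
}
```

The same re-levelling would have to be applied to each conditioning variable before the plotting call, rather than building separate factor vectors outside the data frame.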
-Deepayan

From deepayan.sarkar at gmail.com Fri Apr 1 15:54:26 2011
From: deepayan.sarkar at gmail.com (Deepayan Sarkar)
Date: Fri, 1 Apr 2011 19:24:26 +0530
Subject: [R] lattice histogram function and groups
In-Reply-To: <00A769DD3A6D594C9902BC12EF1CAF8E055F1E88@SOAANCMSG01.soa.alaska.gov>
References: <00A769DD3A6D594C9902BC12EF1CAF8E055F1E88@SOAANCMSG01.soa.alaska.gov>
Message-ID:

On Sat, Mar 19, 2011 at 6:11 AM, Evans, David G (DFG) wrote:
> Hi,
>
> From the following code (tweaked from another user):
>
> variable<-sample(rep(1:2,100))
> individual<-rep(1:3, length(variable))
> group<-rep(LETTERS[1:2],length(variable)/2)
> mydata<-data.frame(variable,individual,group)
> individual<-as.factor(individual)
> group<-as.factor(group)
> histogram(~variable|individual+group)
>
> I get six panels, one for each of individuals 1-3 in group A and one
> for each of individuals 1-3 in group B.
>
> What I want is three panels, one for each individual, but with A and B
> paired in the same panel. I think the "groups=" argument does this sort
> of superposition for other lattice functions.

But not for histogram (intentionally). See

http://tolstoy.newcastle.edu.au/R/e2/help/07/04/14490.html

-Deepayan

From M.Rosario.Garcia at slu.se Fri Apr 1 15:43:31 2011
From: M.Rosario.Garcia at slu.se (Rosario Garcia Gil)
Date: Fri, 1 Apr 2011 15:43:31 +0200
Subject: [R] Large number of Y and X variables
Message-ID: <74776A1FD44FB94E9182E2C524E78772BD0783AE81@exmbx3.ad.slu.se>

Hello

I have two issues

1. I have a linear model with several thousand Y and X variables, so if I do not want to write them down one by one in the lm(Y ~ X) model, how should I go about it?

2. Also, if I suspect that some of the Y variables might not be independent, what function should I apply?

Regards
Rosario

From january.weiner at mpiib-berlin.mpg.de Fri Apr 1 15:48:23 2011
From: january.weiner at mpiib-berlin.mpg.de (January Weiner)
Date: Fri, 1 Apr 2011 15:48:23 +0200
Subject: [R] Syntax coloring in R console
In-Reply-To:
References:
Message-ID:

> Recently RStudio was introduced [1] and, although beta, it received
> quasi-unanimous acclaim from the community, so you risk finding it
> useful too.

Dear Liviu,

RStudio might be a fine program, but it does not feature syntax highlighting, which is the only thing I am missing from R Console (it only colors the commands typed). Moreover, the very idea of squeezing all R windows into one "window-desktop" would be counterproductive in my particular case.

Thank you anyways!

j.

> [1] http://alternativeto.net/software/rstudio/about
>
> Regards
> Liviu

--
-------- Dr. January Weiner 3 --------------------------------------
Max Planck Institute for Infection Biology
Charitéplatz 1
D-10117 Berlin, Germany
Web   : www.mpiib-berlin.mpg.de
Tel   : +49-30-28460514

From Ezekiel_Landes at hms.harvard.edu Fri Apr 1 15:54:17 2011
From: Ezekiel_Landes at hms.harvard.edu (Landes, Ezekiel)
Date: Fri, 1 Apr 2011 09:54:17 -0400
Subject: [R] converting affybatch object to matrix
Message-ID:

I have an Affybatch object called "batch" :

>
> batch
AffyBatch object
size of arrays=1050x1050 features (196 kb)
cdf=HuGene-1_0-st-v1 (32321 affyids)
number of samples=384
number of genes=32321
annotation=hugene10stv1
notes=
>
>

Is there a way of converting a portion of this data into a matrix? More specifically, a matrix where the 384 samples are columns and the 32321 genes are rows? The "exprs" function returns a matrix that has 384 columns but for some reason there are 1050^2 rows.

Thanks!

From comp611 at gmail.com Fri Apr 1 16:00:49 2011
From: comp611 at gmail.com (hongsheng wu)
Date: Fri, 1 Apr 2011 10:00:49 -0400
Subject: [R] read.table question #only need to change column names
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From Thierry.ONKELINX at inbo.be Fri Apr 1 16:03:16 2011
From: Thierry.ONKELINX at inbo.be (ONKELINX, Thierry)
Date: Fri, 1 Apr 2011 14:03:16 +0000
Subject: [R] Syntax coloring in R console
In-Reply-To:
References:
Message-ID:

Dear January,

Have a look at Eclipse with the STAT-ET plugin http://www.walware.de/goto/statet

Best regards,

Thierry

----------------------------------------------------------------------------
ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek
team Biometrie & Kwaliteitszorg
Gaverstraat 4
9500 Geraardsbergen
Belgium

Research Institute for Nature and Forest
team Biometrics & Quality Assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium

tel. + 32 54/436 185
Thierry.Onkelinx at inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey

> -----Original message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On behalf of January Weiner
> Sent: Friday 1 April 2011 15:48
> To: r-help at stat.math.ethz.ch
> Subject: Re: [R] Syntax coloring in R console
>
> > Recently RStudio was introduced [1] and, although beta, it received
> > quasi-unanimous acclaim from the community, so you risk finding it
> > useful too.
>
> Dear Liviu,
>
> RStudio might be a fine program, but it does not feature
> syntax highlighting, which is the only thing I am missing
> from R Console (it only colors the commands typed). Moreover,
> the very idea of squeezing all R windows into one
> "window-desktop" would be counterproductive in my particular case.
>
> Thank you anyways!
>
> j.
>

From jorge.nieves at moorecap.com Fri Apr 1 16:12:39 2011
From: jorge.nieves at moorecap.com (Jorge Nieves)
Date: Fri, 1 Apr 2011 10:12:39 -0400
Subject: [R] Rexcel path problem
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From landronimirc at gmail.com Fri Apr 1 16:20:55 2011
From: landronimirc at gmail.com (Liviu Andronic)
Date: Fri, 1 Apr 2011 16:20:55 +0200
Subject: [R] Syntax coloring in R console
In-Reply-To:
References:
Message-ID:

On Fri, Apr 1, 2011 at 3:48 PM, January Weiner wrote:
> RStudio might be a fine program, but it does not feature syntax
> highlighting, which is the only thing I am missing from R Console (it
> only colors the commands typed).
>
The idea is that you shouldn't use the R console for your main programming needs, but only for quick and dirty checks. For the bulk of programming tasks you are invited to use the integrated editor (File > New > Script). The editor window does feature syntax highlighting, and a very helpful completion mechanism (via ). Sending lines for execution to the terminal is as easy as clicking 'run lines' or ctrl+enter. If you're not a fan of keeping scripts for your projects you can easily use temporary files that you don't save.

> Moreover, the very idea of squeezing
> all R windows into one "window-desktop" would be counterproductive in
> my particular case.
>
Notice that all panes are freely resizable and can be resized to the point of becoming hidden. Future releases will give more control over the panes layout (I think).

Regards
Liviu

--
Do you know how to read?
http://www.alienetworks.com/srtest.cfm
http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader
Do you know how to write?
http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail

From jdnewmil at dcn.davis.ca.us Fri Apr 1 16:20:49 2011
From: jdnewmil at dcn.davis.ca.us (Jeff Newmiller)
Date: Fri, 1 Apr 2011 07:20:49 -0700
Subject: [R] Rexcel path problem
In-Reply-To:
References:
Message-ID: <74a2b362-35b8-425d-9c69-3f1971cf90bb@email.android.com>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From rmh at temple.edu Fri Apr 1 16:23:34 2011
From: rmh at temple.edu (Richard M. Heiberger)
Date: Fri, 1 Apr 2011 10:23:34 -0400
Subject: [R] Rexcel path problem
In-Reply-To:
References:
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From bhh at xs4all.nl Fri Apr 1 16:28:12 2011
From: bhh at xs4all.nl (Berend Hasselman)
Date: Fri, 1 Apr 2011 16:28:12 +0200
Subject: [R] Rexcel path problem
In-Reply-To:
References:
Message-ID: <7E2502A2-1D73-449B-9F7F-80AD175F6E42@xs4all.nl>

On 01-04-2011, at 16:12, Jorge Nieves wrote:

>
> Hi,
>
> I am running a test to call an R script from within Excel using VBA. My VBA
> code is shown below.
> The middle section of this mail also includes the
> content of my R script. The bottom part shows the error message from the
> R console.
>
> It seems that Excel is opening the R console without any problems.
>
> The problem I am seeing is that the Rinterface.RRun instruction is
> interpreting the "\T" part of the path as an R command. It does not
> recognize "X:\Trading\Energy\JorgeSpace\TMPholder\cpixe\home models\"
> as one single string, or path.
>
> Any ideas how I can fix the problem?
>
> Thanks,
>
> Jorge
>
>
> VB code
> Sub tester()
>
>     Rinterface.StartRServer
>
>     Rinterface.RRun ("source('X:\Trading\Energy\JorgeSpace\TMPholder\cpixe\home models\toto.R')")
>
>     Rinterface.StopRServer
>
> End Sub
>
>
> "R script code toto.R"
> path = getwd()
> setwd(path)
> a = 5
> b=5
> x = matrix(rnorm(a*b),a,b)
> a = 5
> b=5
>
> y = matrix(rnorm(a*b),a,b)
> z = x %*% y
> savefile = paste(path,"/","testresults.csv",sep="")
> write.csv(z, file = savefile)
>
> Rconsole message
>
> Type 'demo()' for some demos, 'help()' for on-line help, or
> 'help.start()' for an HTML browser interface to help.
> Type 'q()' to quit R.
>
> Loading required package: rcom
> Loading required package: rscproxy
> Error: '\T' is an unrecognized escape in character string starting
> "source('X:\T"

R is just telling you that it doesn't recognize \T as a valid escape sequence in a character string.
See R for Windows FAQ 2.16 and 5.1.

Use \\ instead of a single \ as path separator or use the forward slash /.

Berend

From mazatlanmexico at yahoo.com Fri Apr 1 16:28:47 2011
From: mazatlanmexico at yahoo.com (Felipe Carrillo)
Date: Fri, 1 Apr 2011 07:28:47 -0700
Subject: [R] Rexcel path problem
In-Reply-To:
References:
Message-ID: <18490.9435.qm@web56604.mail.re3.yahoo.com>

Jorge:
You can save your scripts in the same folder where your workbook is and run them like this:

Sub Scatter()
Call rinterface.StartRServer 'Put the dataframe into R ????Call rinterface.PutDataframe("scatter", DownRightFrom(Range("RData!A1")), WithRowNames:=False) Call rinterface.RRun("attach(scatter)") 'Run the RScatter script rinterface.RunRFile ThisWorkbook.Path & "\RScatter.r" Call rinterface.StopRServer End Sub I ? Felipe D. Carrillo Supervisory Fishery Biologist Department of the Interior US Fish & Wildlife Service California, USA http://www.fws.gov/redbluff/rbdd_jsmp.aspx ----- Original Message ---- > From: Jorge Nieves > To: r-help at stat.math.ethz.ch > Sent: Fri, April 1, 2011 7:12:39 AM > Subject: [R] Rexcel path problem > > > Hi, > > I am running a test to call an R script with in excel using VBA. My VBA > code is shown bellow. The middle section of this mail also includes the > content of my Rscript.? The bottom part shows the error message form the > R console. > > It seems that Excel is? opening the R console without any problems. > > The problem I am seeing is that Rinterface.RRun? instruction is > interpreting the "\T" part of the path as an R command. It does not > recognize the "X:\Trading\Energy\JorgeSpace\TMPholder\cpixe\home > models\" as a one single string, or path. > > > Any ideas how can I fix the problem? > > Thanks ,' > > Jorge > > > VB code > Sub tester() > > ? ? Rinterface.StartRServer > > ? ? Rinterface.RRun > ("source('X:\Trading\Energy\JorgeSpace\TMPholder\cpixe\home > models\toto.R')")? ? > > ? ? Rinterface.StopRServer > ? ? > End Sub > > > > "R cript code toto.R" > path = getwd() > setwd(path) > a = 5 > b=5 > x = matrix(rnorm(a*b),a,b) > a = 5 > b=5 > > y = matrix(rnorm(a*b),a,b) > z = x %*% y > savefile = paste(path,"/","testresults.csv",sep="") > write.csv(z, file = savefile) > > > > > > Rconsole message > > Type 'demo()' for some demos, 'help()' for on-line help, or > 'help.start()' for an HTML browser interface to help. > Type 'q()' to quit R. 
>
> Loading required package: rcom
> Loading required package: rscproxy
> Error: '\T' is an unrecognized escape in character string starting
> "source('X:\T"
>
>
>
>
> Jorge Nieves
>
>
>
>     [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

From deepayan.sarkar at gmail.com Fri Apr 1 16:29:32 2011
From: deepayan.sarkar at gmail.com (Deepayan Sarkar)
Date: Fri, 1 Apr 2011 19:59:32 +0530
Subject: [R] Colour makes my life; but not my bwplot (panel.violin)
In-Reply-To:
References: <37243060-3D21-474A-BA1E-3633F40F8AF9@comcast.net>
Message-ID:

On Fri, Mar 25, 2011 at 3:59 PM, JP wrote:
> Hi there David,
>
> Many thanks for your time and reply
>
> I created a small test set, and ran your proposed solution... and this is
> what I get http://i.imgur.com/vlsSQ.png
> This is not what I want - I want separate grp_1 and grp_2 panels and in each
> panel a red violin plot and a blue one. So like this -->
> http://i.imgur.com/NnsE0.png but with red for condition_a and blue for
> condition_b. You would think that something like this is trivial to
> achieve... I just spent a whole day on this :(( Maybe I am just thick
>
> I included the test data I am using:
>
> # some dummy data
> p <- rep(c(rep("condition_a", 4), rep("condition_b", 4)), 2)
> q <- c(rep("grp_1", 8), rep("grp_2", 8))
> r <- rnorm(16)
> test_data <- data.frame(p, q, r)
>
> # your solution
> bwplot(r ~ p,
>   groups = q,
>   data=test_data,
>   col = c("red", "blue"),
>   panel=panel.superpose,
>   panel.groups = function(..., box.ratio){
> panel.violin(..., cut = 1, varwidth = FALSE, box.ratio = box.ratio)
> panel.bwplot(..., box.ratio = .1)
> },
> par.settings = list(plot.symbol = list(pch = 21, col = "gray"),
>   box.rectangle = list(col = "black"),   # not sure these are working
> properly
> box.umbrella = list(col = "black"))
> )

Umm, isn't this slight modification of the above what you want (only
first two lines changed -- your formula with the right 'groups'
variable)?

bwplot(r ~ p | q,
       groups = p,
       data=test_data,
       col = c("red", "blue"),
       panel=panel.superpose,
       panel.groups = function(..., box.ratio){
           panel.violin(..., cut = 1, varwidth = FALSE, box.ratio = box.ratio)
           panel.bwplot(..., box.ratio = .1)
       },
       par.settings = list(plot.symbol = list(pch = 21, col = "gray"),
                           box.rectangle = list(col = "black"),
                           box.umbrella = list(col = "black"))
       )

Some further modifications will get you closer to David's solution:

bwplot(r ~ p | q,
       groups = p,
       data=test_data,
       col = c("red", "blue"),
       fill = c("red", "blue"),
       panel=panel.superpose,
       panel.groups = function(..., box.ratio, col, pch){
           panel.violin(..., cut = 1, varwidth = FALSE,
                        box.ratio = box.ratio, col = col)
           panel.bwplot(..., box.ratio = .1, col = "black", pch = 16)
       },
       par.settings = list(plot.symbol = list(pch = 21, col = "gray"),
                           box.rectangle = list(col = "black"),
                           box.umbrella = list(col = "black"))
       )

-Deepayan

> # my non working one for completeness
>
> bwplot(r ~ p | q,
> data=test_data,
> col = c("red", "blue"),
> panel = function(..., box.ratio){
> panel.violin(..., cut = 1, varwidth = FALSE, box.ratio = box.ratio)
> panel.bwplot(..., box.ratio = .1)
> },
> par.settings = list(plot.symbol = list(pch = 21, col = "gray"),
> box.rectangle = list(col = "black"),   # not sure these are working properly
> box.umbrella = list(col = "black"))
> )
>
>
> On 24 March 2011 21:59, David Winsemius wrote:
>
>>
>> On Mar 24, 2011, at 1:37 PM, JP wrote:
>>
>>  Using Trellis, am successfully setting up a number of panels (25) in which
>>> I
>>> have two box and violin plots.
>>>
>>> I would like to colour - one plot as RED and the other as BLUE (in each
>>> panel).
>>> I can do that with the box plots, but the violin density areas
>>> just
>>> take on one colour.
>>>
>>> My basic call is as follows:
>>>
>>>
>> I took the suggestion of Sarkar's:
>> http://finzi.psych.upenn.edu/Rhelp10/2010-April/234191.html
>>
>> Identified with a search on: " panel.violin color"
>>
>> .... a bit of trial and error with a re-worked copy of the `singer`
>> data.frame meant I encountered errors and needed to throw out some of your
>> pch arguments, and suggest this reworking of your code:
>>
>>
>> bwplot(rmsd ~ file , groups= code,
>>   data=spread_data.filtered, col = c("red", "blue"),
>>    panel=panel.superpose,
>>     panel.groups = function(..., box.ratio){
>>       panel.violin(..., cut = 1, varwidth = FALSE,
>>                       box.ratio = box.ratio)
>>       panel.bwplot(..., box.ratio = .1)
>>
>>       },
>>   par.settings = list(plot.symbol = list(pch = 21, col = "gray"),
>>   box.rectangle = list(col = "black"),   # not sure these are working
>> properly
>>
>>   box.umbrella = list(col = "black"))
>> )
>>
>> Obviously it cannot be tested without some data, but I did get alternating
>> colors to the violin plots.
>> There is an modifyList function that you might
>> want to look up in the archives for changing par.settings:
>>
>>
>> http://search.r-project.org/cgi-bin/namazu.cgi?query=par.settings+modifyList&max=100&result=normal&sort=score&idxname=functions&idxname=Rhelp08&idxname=Rhelp10&idxname=Rhelp02
>>
>>
>> --
>>
>> David Winsemius, MD
>> West Hartford, CT
>>
>>
>
>
> --
>
> Jean-Paul Ebejer
> Early Stage Researcher
>
> InhibOx Ltd
> Pembroke House
> 36-37 Pembroke Street
> Oxford
> OX1 1BP
> UK
>
> (+44 / 0) 1865 262 034
>
>
>
> This email and any files transmitted with it are confide...{{dropped:22}}
>

From Murali.Menon at avivainvestors.com Fri Apr 1 16:35:10 2011
From: Murali.Menon at avivainvestors.com (Murali.Menon at avivainvestors.com)
Date: Fri, 1 Apr 2011 15:35:10 +0100
Subject: [R] choosing best 'match' for given factor
In-Reply-To:
References: <05C1ABBB41112A4DB61380AD45F18EB61A24067399@SWVLONCUEXDP01.im.root-domain.net>
Message-ID: <05C1ABBB41112A4DB61380AD45F18EB61A240673A9@SWVLONCUEXDP01.im.root-domain.net>

Interesting variety of solutions! Thanks very much.

Murali

-----Original Message-----
From: Henrique Dallazuanna [mailto:wwwhsd at gmail.com]
Sent: 31 March 2011 18:26
To: Menon Murali
Cc: r-help at r-project.org
Subject: Re: [R] choosing best 'match' for given factor

Try this:

bestMatch <- function(search, match) {
    colnames(match)[pmax(apply(match[,search], 2, which.max) - 1, 1)]
}

On Thu, Mar 31, 2011 at 11:46 AM, wrote:
> Folks,
>
> I have a 'matching' matrix between variables A, X, L, O:
>
>> a <- structure(c(1, 0.41, 0.58, 0.75, 0.41, 1, 0.6, 0.86, 0.58,
> 0.6, 1, 0.83, 0.75, 0.86, 0.83, 1), .Dim = c(4L, 4L), .Dimnames = list(
>     c("A", "X", "L", "O"), c("A", "X", "L", "O")))
>
>> a
>      A     X     L     O
> A  1.00  0.41  0.58  0.75
> X  0.41  1.00  0.60  0.86
> L  0.58  0.75  1.00  0.83
> O  0.60  0.86  0.83  1.00
>
> And I have a search vector of variables
>
>> v <- c("X", "O")
>
> I want to write a function bestMatch(searchvector, matchMat) such that for each variable in searchvector, I get the variable that it has the highest match to - but searching only among variables to the left of it in the 'matching' matrix, and not matching with any variable in searchvector itself.
>
> So in the above example, although "X" has the highest match (0.86) with "O", I can't choose "O" as it's to the right of X (and also because "O" is in the searchvector v already); I'll have to choose "A".
>
> For "O", I will choose "L", the variable it's best matched with - as it can't match "X" already in the search vector.
>
> My function bestMatch(v, a) will then return c("A", "L")
>
> My matrix a is quite large, and I have a long list of search vectors v, so I need an efficient method.
>
> I wrote this:
>
> bestMatch <- function(searchvector, matchMat) {
>        sapply(searchvector, function(cc) {
>                             y <- matchMat[!(rownames(matchMat) %in% searchvector) & (index(rownames(matchMat)) < match(cc, rownames(matchMat))), cc, drop = FALSE];
>                             rownames(y)[which.max(y)]
>        })
> }
>
> Any advice?
>
> Thanks,
>
> Murali
>

-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49°
16' 22" O

From deepayan.sarkar at gmail.com Fri Apr 1 16:43:23 2011
From: deepayan.sarkar at gmail.com (Deepayan Sarkar)
Date: Fri, 1 Apr 2011 20:13:23 +0530
Subject: [R] bwplot [lattice]: how to get different y-axis scales for each row?
In-Reply-To: <1F0CC393-B4EE-413E-BBE0-F144C1769452@web.de>
References: <313D1279-0B78-43D1-803B-7A296871E8DF@web.de>
 <1F0CC393-B4EE-413E-BBE0-F144C1769452@web.de>
Message-ID:

On Sun, Mar 27, 2011 at 8:50 PM, Marius Hofert wrote:
> Dear expeRts,
>
> I partially managed to obtain what I wanted by using latticeExtra. However, the
> following questions remain:
> 1) why do not all x-axis labels appear? [compare bw and bw2]

You have unnecessarily asked for relation="free" for the x-axis (even
though they don't change across panels), and then tried to combine
them using combineLimits(), which doesn't know how to handle
categorical axes. Change to using relation="free" for the y-axis only,
and you should be fine.

bw <- bwplot(error ~ methods | attr * groups, data=df,
             as.table=TRUE, notch=TRUE,
             scales = list(y = list(alternating = c(1,1), tck=c(1,0),
                                    relation = "free")))

> 2) Can I have the y-axis labels on the right margin/side of the plot? Changing
> the "alternating" argument does not do the job since relation="free"

No. (Well, you can provide your own axis() function to do it, in which
case you will need to allocate space too.)

-Deepayan

> Cheers,
>
> Marius
>
>
>
> library(lattice)
> library(latticeExtra)
>
> ## build example data set
> dim <- c(100, 6, 4, 3) # n, groups, methods, attributes
> dimnames <- list(n=paste("n=", seq_len(100), sep=""),
>                 groups=paste("group=", seq_len(6), sep=""),
>                 methods=paste("method=", seq_len(4), sep=""),
>                 attr=paste("attribute=", seq_len(3), sep=""))
> set.seed(1)
> data <- rexp(prod(dim))
> arr <- array(data=data, dim=dim, dimnames=dimnames)
> arr[,2,,] <- arr[,2,,]*10
> arr[,4,2,2] <- arr[,4,2,2]*10
> z <- abs(sweep(arr, 3, 1))
> df <- as.data.frame.table(z, responseName="error")
>
> ## box plot
> bw <- bwplot(error ~ methods | attr * groups, data=df,
>             as.table=TRUE, notch=TRUE,
>             scales=list(y=list(alternating=c(1,1), tck=c(1,0)),
>             relation="free"))
> (bw2 <- useOuterStrips(combineLimits(bw, extend=FALSE))))
>
>
>
> On 2011-03-26, at 09:34 , Marius Hofert wrote:
>
>> Dear expeRts,
>>
>> How can I get ...
>> (1) different y-axis scales for each row
>> (2) while having the same y-axis scales for different columns?
>>
>> I coulnd't manage to do this with relation="free" [which gives (1) but not (2)].
>> I also tried relation="sliced", but it did not give the same y-axis scales
>> within each row (see the fourth row). Further, it "separates" the panels.
>>
>> Cheers,
>>
>> Marius
>>
>> ## minimal example:
>>
>> library(lattice)
>>
>> ## build example data set
>> dim <- c(100, 6, 2, 3) # n, groups, methods, attributes
>> dimnames <- list(n=paste("n=", seq_len(100), sep=""),
>>                groups=paste("group=", seq_len(6), sep=""),
>>                methods=paste("method=", seq_len(2), sep=""),
>>                attr=paste("attribute=", seq_len(3), sep=""))
>> set.seed(1)
>> data <- rexp(prod(dim))
>> arr <- array(data=data, dim=dim, dimnames=dimnames)
>> arr[,2,,] <- arr[,2,,]*10
>> arr[,4,2,2] <- arr[,4,2,2]*10
>> z <- abs(sweep(arr, 3, 1))
>> df <- as.data.frame.table(z, responseName="error")
>>
>> ## box plot
>> bwplot(error ~ methods | attr * groups, data=df,
>>       as.table=TRUE, notch=TRUE,
>>       scales=list(alternating=c(1,1), tck=c(1,0)))
>>
>> ## with relation="sliced"
>> bwplot(error ~ methods | attr * groups, data=df,
>>       as.table=TRUE, notch=TRUE,
>>       scales=list(alternating=c(1,1), tck=c(1,0), relation="sliced"))
>>
>

From dwinsemius at comcast.net Fri Apr 1 16:48:17 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Fri, 1 Apr 2011 10:48:17 -0400
Subject: [R] Plotting symbols and colors based upon data values
In-Reply-To: <7B08ACC0-FDD2-4D5A-8A81-218F72079DD7@comcast.net>
References: <00fa01cbe119$a152c7e0$e3f857a0$@gmail.com>
 <29A6BC18-0052-48FD-8A7C-8946899EFCEC@comcast.net>
 <015601cbe1e2$0100f6d0$0302e470$@gmail.com>
 <7B08ACC0-FDD2-4D5A-8A81-218F72079DD7@comcast.net>
Message-ID: <09DCFFCC-FAE4-4DDA-BB8D-39ACB304D768@comcast.net>

Thanks to Deepayan for sending me the code for the correct approach
using subscripts.

On Mar 13, 2011, at 9:46 PM, David Winsemius wrote:
>
> On Mar 13, 2011, at 8:51 PM, Mark Linderman wrote:
>
>> David, thank you for your quick reply. I spent a few minutes
>> getting your
>> command to work with some sparse synthetic data, and then spent
>> several
>> hours trying to figure out why my data didn't work (at least for
>> symbols,
>> colors look okay). I have massaged my data to where it is
>> practically
>> indistinguishable from the synthetic data - yet it still doesn't
>> work.
>> Attached are the two data files that can be plotted as follows: >> >> broken = read.table("broken.table",header=TRUE) >> works = read.table("works.table",header=TRUE) >> xyplot(Y ~ X | A, data=works, pch=works$C , col=works$B) >> xyplot(Y ~ X | A, data=broken, pch=broken$C , col=broken$B) xyplot(Y ~ X | A, data=works, pch=works$C , col=as.character(works$B), panel = function(..., pch, col, subscripts) { panel.xyplot(..., pch = pch[subscripts], col = col[subscripts]) }) -- David (for Deepayan Sarkar) > > I get the same problem and after experimenting for a while I think I > can solve it by randomizing the order of the entries: > > > broken <- broken[sample(417), ] > > > xyplot(Y ~ X | A, data=broken, pch=broken$C, col=broken$B) > > Why xyplot should fail to properly assign pch values just because > all "1"'s are at the beginning seems to me to be a bug. > > -- > David. > > After confirming the the problem recurs when re-order()-ed by broken > $C, I am appending dput( ordered-broken) for others to experiment > > > dput(broken[order(broken$C), ]) > structure(list(rown = c(91L, 193L, 128L, 8L, 143L, 46L, 60L, > 99L, 112L, 67L, 25L, 15L, 188L, 93L, 115L, 4L, 190L, 64L, 147L, > 119L, 82L, 120L, 23L, 139L, 28L, 42L, 180L, 24L, 145L, 71L, 13L, > 95L, 94L, 104L, 149L, 74L, 32L, 184L, 11L, 114L, 90L, 70L, 63L, > 141L, 192L, 126L, 153L, 172L, 26L, 151L, 109L, 133L, 79L, 35L, > 61L, 43L, 52L, 29L, 30L, 80L, 154L, 7L, 121L, 122L, 106L, 182L, > 16L, 2L, 175L, 34L, 102L, 174L, 117L, 178L, 100L, 68L, 48L, 31L, > 53L, 168L, 59L, 165L, 123L, 69L, 55L, 62L, 163L, 39L, 108L, 96L, > 97L, 113L, 87L, 164L, 169L, 33L, 118L, 45L, 148L, 129L, 22L, > 116L, 101L, 157L, 191L, 89L, 75L, 156L, 137L, 183L, 98L, 150L, > 124L, 144L, 127L, 155L, 57L, 36L, 14L, 161L, 187L, 138L, 111L, > 146L, 20L, 107L, 140L, 110L, 125L, 41L, 105L, 159L, 103L, 132L, > 44L, 166L, 56L, 171L, 195L, 40L, 135L, 5L, 58L, 37L, 54L, 83L, > 17L, 142L, 77L, 162L, 170L, 160L, 78L, 38L, 194L, 21L, 167L, > 27L, 81L, 185L, 47L, 
66L, 73L, 3L, 134L, 158L, 51L, 173L, 50L, > 18L, 12L, 6L, 189L, 72L, 85L, 65L, 92L, 179L, 86L, 49L, 130L, > 177L, 152L, 176L, 9L, 10L, 76L, 88L, 131L, 181L, 19L, 186L, 136L, > 1L, 84L, 366L, 235L, 196L, 224L, 206L, 288L, 204L, 274L, 199L, > 239L, 271L, 295L, 266L, 305L, 284L, 340L, 268L, 296L, 293L, 262L, > 300L, 212L, 336L, 208L, 358L, 242L, 221L, 237L, 369L, 292L, 201L, > 338L, 233L, 217L, 227L, 225L, 270L, 267L, 345L, 205L, 219L, 278L, > 337L, 230L, 380L, 291L, 229L, 367L, 339L, 241L, 228L, 263L, 349L, > 348L, 371L, 202L, 207L, 351L, 282L, 222L, 200L, 213L, 285L, 375L, > 302L, 231L, 223L, 386L, 352L, 363L, 353L, 357L, 359L, 350L, 283L, > 362L, 218L, 198L, 374L, 301L, 286L, 364L, 368L, 220L, 298L, 280L, > 214L, 273L, 303L, 382L, 354L, 238L, 373L, 234L, 356L, 216L, 289L, > 370L, 381L, 343L, 361L, 306L, 281L, 203L, 341L, 355L, 346L, 272L, > 264L, 360L, 334L, 210L, 197L, 342L, 299L, 378L, 236L, 333L, 294L, > 347L, 275L, 385L, 365L, 209L, 297L, 240L, 265L, 379L, 304L, 269L, > 372L, 384L, 344L, 287L, 332L, 376L, 261L, 377L, 383L, 215L, 232L, > 277L, 276L, 211L, 290L, 335L, 226L, 279L, 399L, 307L, 395L, 400L, > 411L, 388L, 319L, 403L, 320L, 309L, 318L, 407L, 402L, 308L, 326L, > 251L, 260L, 246L, 408L, 331L, 312L, 387L, 414L, 253L, 315L, 413L, > 416L, 327L, 393L, 322L, 390L, 317L, 389L, 249L, 325L, 329L, 398L, > 397L, 323L, 396L, 255L, 415L, 245L, 391L, 412L, 259L, 417L, 311L, > 392L, 409L, 328L, 254L, 248L, 310L, 258L, 405L, 324L, 250L, 406L, > 316L, 394L, 257L, 404L, 243L, 252L, 410L, 313L, 256L, 330L, 321L, > 244L, 401L, 247L, 314L), X = c(0.701250601327047, 0.164821685524657, > 0.606994603062049, 0.863256110809743, 0.956295087235048, > 0.94587846682407, > 0.838799783028662, 0.523805776145309, 0.562612239504233, > 0.0359199855010957, > 0.142975208582357, 0.459868715610355, 0.579013091977686, > 0.384347806917503, > 0.161508617224172, 0.96426909067668, 0.504280025139451, > 0.438289026031271, > 0.373645842541009, 0.439572562696412, 0.25889431219548, > 
0.0467256724368781, > 0.365111395483837, 0.40517632686533, 0.847934616263956, > 0.0139284294564277, > 0.228810637025163, 0.976755930809304, 0.537870434345677, > 0.831849699141458, > 0.735547028947622, 0.107985522132367, 0.200033176457509, > 0.250281900400296, > 0.578671747585759, 0.289995870785788, 0.440369168063626, > 0.364585015457124, > 0.905479809269309, 0.446940524037927, 0.658691298449412, > 0.0173427225090563, > 0.24269335786812, 0.430798843270168, 0.164247164269909, > 0.357896727975458, > 0.381168011575937, 0.466935358708724, 0.598047381266952, > 0.236625553574413, > 0.075431430246681, 0.729021292412654, 0.332617457257584, > 0.800470217363909, > 0.658606661716476, 0.857558676274493, 0.95546124689281, > 0.319315308239311, > 0.26408554892987, 0.736351884901524, 0.998718089656904, > 0.0957230781204998, > 0.689594832016155, 0.422279811929911, 0.9711551531218, > 0.564745565643534, > 0.493862147675827, 0.569259794196114, 0.0411112532019615, > 0.0377507081720978, > 0.725350870285183, 0.174974237568676, 0.403428635792807, > 0.291210540104657, > 0.196486078668386, 0.0656222854740918, 0.0509774188976735, > 0.78418469009921, > 0.907518392894417, 0.718859917717054, 0.162560145836323, > 0.165355861652642, > 0.239028699230403, 0.794382674852386, 0.258753027301282, > 0.856053869239986, > 0.753498190781102, 0.128293008776382, 0.97426181170158, > 0.73234709375538, > 0.88877343875356, 0.0330339481588453, 0.114545000018552, > 0.34120128583163, > 0.623125589219853, 0.296904754126444, 0.0341241161804646, > 0.827014627167955, > 0.269123075297102, 0.835532493656501, 0.378544366452843, > 0.0417758887633681, > 0.701557072810829, 0.991019503446296, 0.430038925725967, > 0.30387532315217, > 0.212177407694981, 0.0739604290574789, 0.767785993637517, > 0.989211868261918, > 0.724100521299988, 0.192399453371763, 0.339564701542258, > 0.54304352379404, > 0.485609688796103, 0.842413938138634, 0.446879531955346, > 0.794351557036862, > 0.096292131813243, 0.258302961708978, 0.58616296085529, > 
0.278098736191168, > 0.843206173274666, 0.565877866232768, 0.355487501248717, > 0.931851770961657, > 0.0385297290049493, 0.753262906335294, 0.186055560130626, > 0.324502500705421, > 0.143642185255885, 0.0232619508169591, 0.618703046115115, > 0.340094977291301, > 0.458663430297747, 0.313175080576912, 0.311025382485241, > 0.740189846139401, > 0.387821438955143, 0.127946690190583, 0.0982711461838335, > 0.530347143299878, > 0.226925036637112, 0.352387776365504, 0.24819467542693, > 0.0116725896950811, > 0.154650345211849, 0.393079981207848, 0.728091803612188, > 0.170153956860304, > 0.81173292431049, 0.604964130092412, 0.195516420993954, > 0.665194702101871, > 0.902374199125916, 0.875123467994854, 0.28142825467512, > 0.312998358858749, > 0.629422497935593, 0.945258686086163, 0.63372730021365, > 0.248635908588767, > 0.544222480617464, 0.587964891456068, 0.252189125167206, > 0.2657802302856, > 0.989964423933998, 0.0520109671633691, 0.211115221725777, > 0.723641818389297, > 0.21277131116949, 0.999876993708313, 0.115524013759568, > 0.107035915134475, > 0.807371424278244, 0.558987217256799, 0.831789107760414, > 0.824789069592953, > 0.26601968659088, 0.0976237277500331, 0.656752997078001, > 0.417558990651742, > 0.928754845634103, 0.642699809512123, 0.289895867696032, > 0.771415231283754, > 0.252410312648863, 0.181261786725372, 0.343963136896491, > 0.151824467582628, > 0.410438629798591, 0.316298315767199, 0.89474390889518, > 0.347615822451189, > 0.492034191964194, 0.0415450807195157, 0.828365112189204, > 0.230966808507219, > 0.0422265736851841, 0.402152558788657, 0.684953848132864, > 0.899216906866059, > 0.922379054129124, 0.550890099955723, 0.42850927147083, > 0.146120680728927, > 0.222744381986558, 0.637204843340442, 0.975538540631533, > 0.533271077787504, > 0.0438263991381973, 0.288163386518136, 0.276471544289961, > 0.204844454303384, > 0.724974561249837, 0.0446081308182329, 0.49430369422771, > 0.19497368298471, > 2.32525635510683e-05, 0.904675911879167, 0.493794195353985, 
> 0.72478751721792, > 0.142712019383907, 0.663267731433734, 0.231417116709054, > 0.0173127462621778, > 0.57666564756073, 0.484273905167356, 0.997436377452686, > 0.000396451214328408, > 0.510367450769991, 0.591025563655421, 0.224653659854084, > 0.773361603729427, > 0.379073723452166, 0.448086899705231, 0.041542210849002, > 0.309524232754484, > 0.647234397474676, 0.637066879775375, 0.616037875413895, > 0.162085753167048, > 0.958705822238699, 0.602349029621109, 0.598767473595217, > 0.113397455308586, > 0.698689580429345, 0.825687980279326, 0.290552897378802, > 0.507397164823487, > 0.397019035648555, 0.223723065108061, 0.701426188694313, > 0.850980734452605, > 0.756329476134852, 0.0340954945422709, 0.893199543701485, > 0.74836050788872, > 0.87006417219527, 0.487111459486187, 0.290695097995922, > 0.551357613410801, > 0.78720832709223, 0.443636126350611, 0.909595184028149, > 0.0358559342566878, > 0.0801119154784828, 0.839801122667268, 0.666993780760095, > 0.577966008568183, > 0.422719019465148, 0.630310772219673, 0.883910533739254, > 0.941940967924893, > 0.232898708432913, 0.539576183073223, 0.285852419212461, > 0.481135553214699, > 0.565562826581299, 0.345754055306315, 0.862464904552326, > 0.793464061338454, > 0.0559016007464379, 0.124836669070646, 0.896915214369074, > 0.936814778018743, > 0.74106968473643, 0.0435592739377171, 0.825898392824456, > 0.725313652539626, > 0.950716170715168, 0.481058384524658, 0.693162805400789, > 0.0216092106420547, > 0.277436587261036, 0.402496351627633, 0.719264863058925, > 0.439484613249078, > 0.398182060336694, 0.236637223977596, 0.244894647505134, > 0.479899478610605, > 0.165471526794136, 0.0460625963751227, 0.397890231572092, > 0.329886695370078, > 0.635740406578407, 0.552641975693405, 0.244620074750856, > 0.330058194464073, > 0.928089746274054, 0.532479231012985, 0.405185123672709, > 0.918767123483121, > 0.193489346886054, 0.282202445436269, 0.35137809580192, > 0.737104135332629, > 0.0308404462412, 0.505957348737866, 
0.936959875747561, > 0.123565464280546, > 0.189931713975966, 0.125042418483645, 0.135540294926614, > 0.583715724293143, > 0.0357968790922314, 0.64392837928608, 0.19056866155006, > 0.950899359304458, > 0.098964711651206, 0.88134005269967, 0.888160075061023, > 0.0511809457093477, > 0.702107470482588, 0.608718459727243, 0.416799532948062, > 0.117909522727132, > 0.633046145318076, 0.88943671900779, 0.803786197211593, > 0.775628343923017, > 0.290075656725094, 0.592150817392394, 0.741318783024326, > 0.77316367207095, > 0.44796843174845, 0.635858315508813, 0.295597444288433, > 0.600949247833341, > 0.76223914208822, 0.419811620842665, 0.705567310331389, > 0.708347749663517, > 0.263079840457067, 0.50989875337109, 0.973575493320823, > 0.94332129508257, > 0.819637563312426, 0.192222638521343, 0.396144614787772, > 0.983340895501897, > 0.827303341357037, 0.756905926857144, 0.22044308623299, > 0.59880581847392, > 0.535745044238865, 0.722211508080363, 0.871434730477631, > 0.11978330113925, > 0.652976502198726, 0.732404118636623, 0.861010160297155, > 0.89550770772621, > 0.36102738394402, 0.597046557813883, 0.694743229541928, > 0.80066374479793, > 0.664067747537047, 0.698023943696171, 0.730355188250542, > 0.569261560449377, > 0.906895149964839, 0.534028419991955, 0.161579012637958, > 0.486382132628933, > 0.176803272450343, 0.996078292140737, 0.166760441381484, > 0.130956800654531, > 0.412682609632611, 0.667715679854155, 0.337738514645025, > 0.51363705820404, > 0.881870723795146, 0.724578340072185, 0.941882126033306, > 0.158593302126974, > 0.452429827069864, 0.491405379958451, 0.973311392823234, > 0.99337683757767, > 0.249422027729452, 0.930915406672284, 0.46549045224674, > 0.878254638286307, > 0.059840910602361, 0.998892408795655, 0.114133176626638, > 0.234312520828098, > 0.300854196073487, 0.149864102946594, 0.636824406450614, > 0.671343131922185, > 0.872752726078033, 0.00751097011379898, 0.582277687266469), Y = > c(0.171890107216313, > 0.370797614334151, 0.924361743032932, 
0.28134711808525, > 0.66432327334769, > 0.796134414616972, 0.976940092863515, 0.604996162932366, > 0.00418127072043717, > 0.316872535273433, 0.99756994890049, 0.309717640280724, > 0.131888416828588, > 0.245327473850921, 0.645931946812198, 0.92422404163517, > 0.339793691877276, > 0.261232751654461, 0.900882654357702, 0.64631098182872, > 0.00424177665263414, > 0.473244683118537, 0.77987119066529, 0.97352181491442, > 0.369361791759729, > 0.767924362327904, 0.10399404889904, 0.0438599628396332, > 0.29301310935989, > 0.73782938462682, 0.156529521103948, 0.671467063948512, > 0.400057458085939, > 0.661995379254222, 0.298377772793174, 0.372027936391532, > 0.380989259341732, > 0.562391041079536, 0.752812439808622, 0.7302008070983, > 0.818077584030107, > 0.877294855890796, 0.413850854616612, 0.823451421456411, > 0.147629920160398, > 0.0302557460963726, 0.773684116546065, 0.589168650330976, > 0.468369670212269, > 0.388508658157662, 0.611251852475107, 0.148683512816206, > 0.240981192560866, > 0.625521472422406, 0.69594865757972, 0.845864138565958, > 0.0306010833010077, > 0.587514291750267, 0.146518325898796, 0.151491977507249, > 0.888296207878739, > 0.090270600747317, 0.878451642813161, 0.984217442339286, > 0.0249567097052932, > 0.333548737457022, 0.296019930159673, 0.320528760086745, > 0.0295929634012282, > 0.635236620903015, 0.392915730830282, 0.439254282508045, > 0.0461037687491626, > 0.301570050418377, 0.472936083795503, 0.261422283714637, > 0.0222742764744908, > 0.355823787860572, 0.987022530985996, 0.834863429190591, > 0.740066486410797, > 0.391710012918338, 0.871678836410865, 0.12019352382049, > 0.277163289953023, > 0.98267021495849, 0.345335704274476, 0.922220463398844, > 0.424633938586339, > 0.278999223839492, 0.714344000443816, 0.56897996342741, > 0.465939020272344, > 0.712276648031548, 0.72533538704738, 0.986942887306213, > 0.229512252379209, > 0.580829019192606, 0.226183731108904, 0.167294949525967, > 0.375515706604347, > 0.713610262144357, 0.431350194616243, 
0.547398295486346, > 0.699540067696944, > 0.455317207612097, 0.372894094558433, 0.492133665131405, > 0.0603628207463771, > 0.23433334287256, 0.758064843481407, 0.064469970529899, > 0.240423953859136, > 0.142249457305297, 0.101748930523172, 0.368909977609292, > 0.235771276289597, > 0.465952947735786, 0.191509356489405, 0.136731109814718, > 0.304088074946776, > 0.802979400614277, 0.543293120339513, 0.0068712888751179, > 0.664302490185946, > 0.295362222241238, 0.199966921936721, 0.38276062393561, > 0.09960433607921, > 0.971819909987971, 0.753774431766942, 0.381981828948483, > 0.710454542655498, > 0.535177865996957, 0.935759501997381, 0.469148830277845, > 0.694085463415831, > 0.993797857081518, 0.551567627117038, 0.766783748753369, > 0.290656099561602, > 0.796439147554338, 0.45645066886209, 0.32463817903772, > 0.706545975524932, > 0.608359589707106, 0.870380551321432, 0.644623076310381, > 0.583964630262926, > 0.464653216535226, 0.348849157337099, 0.672243025619537, > 0.30125402030535, > 0.844146094284952, 0.735730222426355, 0.137696442659944, > 0.909995779395103, > 0.67104962747544, 0.193171724444255, 0.719128533499315, > 0.000234761741012335, > 0.115613663569093, 0.43861810490489, 0.359372491715476, > 0.0316722302231938, > 0.170181025052443, 0.327365997713059, 0.213334964588284, > 0.174400835763663, > 0.549330030335113, 0.308011762565002, 0.172261784551665, > 0.905044082552195, > 0.0104124981444329, 0.109363237163052, 0.629955994896591, > 0.90461226599291, > 0.843848718563095, 0.788788010831922, 0.410448123002425, > 0.164807833498344, > 0.703740650322288, 0.262160507496446, 0.871268867049366, > 0.0789999319240451, > 0.373864559689537, 0.0346520259045064, 0.439821641892195, > 0.66595608741045, > 0.929020783631131, 0.0182372040580958, 0.236733148572966, > 0.159761383896694, > 0.533487406792119, 0.933626463869587, 0.0902693625539541, > 0.441298491321504, > 0.542474026791751, 0.564687464153394, 0.126976538216695, > 0.789098215056583, > 0.38124479772523, 0.228319370187819, 
0.921609675046057, > 0.208050262648612, > 0.239139189943671, 0.396728692576289, 0.817313153296709, > 0.560780925443396, > 0.400536111090332, 0.441601832862943, 0.988719131564721, > 0.000827041920274496, > 0.516230726381764, 0.283795825671405, 0.837254045763984, > 0.810963656520471, > 0.0731901943217963, 0.75502674584277, 0.572658375371248, > 0.588822845835239, > 0.69323132908903, 0.454046661267057, 0.0532575217075646, > 0.864072882803157, > 0.108720119809732, 0.941018613055348, 0.68864845763892, > 0.106074716662988, > 0.59618656639941, 0.983284626156092, 0.447163598611951, > 0.323577877366915, > 0.43200644152239, 0.230878660688177, 0.330381851643324, > 0.228483391692862, > 0.14531171717681, 0.947160384617746, 0.0644653548952192, > 0.933858478674665, > 0.867367077618837, 0.960713072912768, 0.0106419362127781, > 0.0958437097724527, > 0.318360113538802, 0.89607287850231, 0.194603573530912, > 0.0140892933122814, > 0.78282424225472, 0.154727570246905, 0.762999390950426, > 0.489407370565459, > 0.423035337822512, 0.19474891689606, 0.681294421199709, > 0.225796846672893, > 0.6234319685027, 0.636501412605867, 0.159554290119559, > 0.553764832671732, > 0.0210408298298717, 0.68121306411922, 0.995034355204552, > 0.399561397265643, > 0.403975014341995, 0.852582210907713, 0.125726311234757, > 0.992505229078233, > 0.422545881476253, 0.662919480586424, 0.316766428994015, > 0.618335535051301, > 0.441973085515201, 0.851687799440697, 0.842560255667195, > 0.633133036317304, > 0.324292129138485, 0.381184502737597, 0.78097918536514, > 0.238214250188321, > 0.927592432824895, 0.841349175665528, 0.8883684319444, > 0.848109701182693, > 0.215758525533602, 0.440561114577577, 0.667430794099346, > 0.555521365720779, > 0.0360126029700041, 0.355077777989209, 0.802172212162986, > 0.0323124176356941, > 0.764302607392892, 0.556403563823551, 0.0513982712291181, > 0.167700330493972, > 0.275772945489734, 0.121303298510611, 0.355494713177904, > 0.619331127265468, > 0.722270094789565, 0.13826255640015, 
0.83197070308961, > 0.208093572407961, > 0.417050289222971, 0.552277022507042, 0.532353505026549, > 0.825634842040017, > 0.0584846271667629, 0.206079388037324, 0.55847840802744, > 0.787833330687135, > 0.265674144495279, 0.632736003026366, 0.935361107578501, > 0.892695550108328, > 0.810330303385854, 0.504313444718719, 0.484284303616732, > 0.268657196313143, > 0.79852058342658, 0.837990865344182, 0.0854433339554816, > 0.457756362855434, > 0.930622784886509, 0.30546832550317, 0.364406730281189, > 0.895903791300952, > 0.61879310826771, 0.705111734103411, 0.229004139779136, > 0.153806942980736, > 0.388366017956287, 0.384744216687977, 0.131829720223323, > 0.933241792721674, > 0.828655388206244, 0.478957881452516, 0.163506358396262, > 0.202536955475807, > 0.521721071796492, 0.934954703785479, 0.922832843149081, > 0.0890378498006612, > 0.744039923418313, 0.342938947491348, 0.829126243945211, > 0.438954021781683, > 0.342147971037775, 0.904643931658939, 0.618884094292298, > 0.53136019455269, > 0.28578051389195, 0.261097583221272, 0.731547623872757, > 0.925990937277675, > 0.392090859822929, 0.344719064421952, 0.566447681514546, > 0.676267065340653, > 0.0889970965217799, 0.79778384277597, 0.454307504929602, > 0.0324128807988018, > 0.367819492006674, 0.748151563573629, 0.117547858972102, > 0.609072768129408, > 0.0297117948066443, 0.425113417906687, 0.59103324636817, > 0.89295660564676, > 0.961610725149512, 0.844706527423114, 0.538759749848396, > 0.818922623759136, > 0.549228129675612, 0.126476648263633, 0.861659712390974, > 0.613700804766268, > 0.409116324270144, 0.686794322915375, 0.438312869053334, > 0.878093276405707, > 0.755687783006579, 0.00695069995708764, 0.138217013562098, > 0.411313445540145, > 0.907310962677002, 0.701067975489423, 0.1852589645423, > 0.150231995387003, > 0.934385694563389, 0.353562438627705, 0.464768649777398, > 0.765283492859453, > 0.905872487463057, 0.0849798938725144, 0.773121788864955, > 0.0939909212756902, > 0.596990453079343, 0.725830281618983, 
0.598506234120578, > 0.458693271037191, > 0.281013652216643, 0.458665299229324, 0.0339348095003515, > 0.799791351892054, > 0.000570902600884438, 0.804609716171399, 0.812421002890915, > 0.886078594485298, > 0.525463038356975, 0.145692070946097, 0.78209150978364, > 0.905050198314711 > ), A = structure(c(1L, 3L, 2L, 1L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, > 1L, 3L, 2L, 2L, 1L, 3L, 1L, 2L, 2L, 1L, 2L, 1L, 2L, 1L, 1L, 3L, > 1L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 3L, 1L, 2L, 1L, 1L, 1L, > 2L, 3L, 2L, 3L, 3L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, > 1L, 3L, 1L, 2L, 2L, 2L, 3L, 1L, 1L, 3L, 1L, 2L, 3L, 2L, 3L, 2L, > 1L, 1L, 1L, 1L, 3L, 1L, 3L, 2L, 1L, 1L, 1L, 3L, 1L, 2L, 2L, 2L, > 2L, 1L, 3L, 3L, 1L, 2L, 1L, 2L, 2L, 1L, 2L, 2L, 3L, 3L, 1L, 1L, > 3L, 2L, 3L, 2L, 2L, 2L, 2L, 2L, 3L, 1L, 1L, 1L, 3L, 3L, 2L, 2L, > 2L, 1L, 2L, 2L, 2L, 2L, 1L, 2L, 3L, 2L, 2L, 1L, 3L, 1L, 3L, 3L, > 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 3L, 3L, 3L, 1L, 1L, 3L, > 1L, 3L, 1L, 1L, 3L, 1L, 1L, 1L, 1L, 2L, 3L, 1L, 3L, 1L, 1L, 1L, > 1L, 3L, 1L, 1L, 1L, 1L, 3L, 1L, 1L, 2L, 3L, 3L, 3L, 1L, 1L, 1L, > 1L, 2L, 3L, 1L, 3L, 2L, 1L, 1L, 6L, 4L, 4L, 4L, 4L, 5L, 4L, 5L, > 4L, 4L, 5L, 5L, 5L, 5L, 5L, 6L, 5L, 5L, 5L, 5L, 5L, 4L, 6L, 4L, > 6L, 4L, 4L, 4L, 6L, 5L, 4L, 6L, 4L, 4L, 4L, 4L, 5L, 5L, 6L, 4L, > 4L, 5L, 6L, 4L, 6L, 5L, 4L, 6L, 6L, 4L, 4L, 5L, 6L, 6L, 6L, 4L, > 4L, 6L, 5L, 4L, 4L, 4L, 5L, 6L, 5L, 4L, 4L, 6L, 6L, 6L, 6L, 6L, > 6L, 6L, 5L, 6L, 4L, 4L, 6L, 5L, 5L, 6L, 6L, 4L, 5L, 5L, 4L, 5L, > 5L, 6L, 6L, 4L, 6L, 4L, 6L, 4L, 5L, 6L, 6L, 6L, 6L, 5L, 5L, 4L, > 6L, 6L, 6L, 5L, 5L, 6L, 6L, 4L, 4L, 6L, 5L, 6L, 4L, 6L, 5L, 6L, > 5L, 6L, 6L, 4L, 5L, 4L, 5L, 6L, 5L, 5L, 6L, 6L, 6L, 5L, 6L, 6L, > 5L, 6L, 6L, 4L, 4L, 5L, 5L, 4L, 5L, 6L, 4L, 5L, 6L, 5L, 6L, 6L, > 6L, 6L, 5L, 6L, 5L, 5L, 5L, 6L, 6L, 5L, 5L, 4L, 4L, 4L, 6L, 5L, > 5L, 6L, 6L, 4L, 5L, 6L, 6L, 5L, 6L, 5L, 6L, 5L, 6L, 4L, 5L, 5L, > 6L, 6L, 5L, 6L, 4L, 6L, 4L, 6L, 6L, 4L, 6L, 5L, 6L, 6L, 5L, 4L, > 4L, 5L, 4L, 6L, 5L, 4L, 6L, 5L, 6L, 4L, 6L, 4L, 4L, 6L, 5L, 4L, > 
5L, 5L, 4L, 6L, 4L, 5L), .Label = c("Cat A", "Cat B", "Cat C", > "Cat D", "Cat E", "Cat F"), class = "factor"), B = structure(c(3L, > 1L, 1L, 1L, 4L, 3L, 3L, 4L, 4L, 1L, 4L, 4L, 1L, 1L, 3L, 1L, 3L, > 4L, 1L, 1L, 1L, 3L, 4L, 1L, 1L, 1L, 4L, 4L, 4L, 4L, 4L, 1L, 1L, > 1L, 3L, 4L, 4L, 3L, 1L, 4L, 4L, 4L, 1L, 1L, 4L, 3L, 3L, 4L, 4L, > 4L, 4L, 1L, 4L, 1L, 4L, 4L, 4L, 4L, 4L, 1L, 4L, 4L, 1L, 4L, 4L, > 4L, 3L, 4L, 4L, 4L, 4L, 4L, 1L, 1L, 4L, 3L, 1L, 1L, 4L, 1L, 4L, > 3L, 4L, 4L, 1L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 3L, > 4L, 1L, 3L, 4L, 1L, 4L, 1L, 4L, 4L, 3L, 1L, 4L, 3L, 4L, 3L, 4L, > 3L, 4L, 3L, 1L, 3L, 4L, 4L, 4L, 3L, 4L, 3L, 3L, 4L, 4L, 3L, 3L, > 4L, 4L, 4L, 1L, 1L, 1L, 4L, 3L, 4L, 3L, 4L, 4L, 4L, 4L, 3L, 4L, > 4L, 3L, 1L, 4L, 3L, 4L, 1L, 4L, 3L, 3L, 4L, 1L, 4L, 4L, 1L, 1L, > 1L, 3L, 1L, 4L, 4L, 4L, 4L, 4L, 1L, 1L, 1L, 4L, 4L, 4L, 3L, 4L, > 4L, 4L, 3L, 3L, 1L, 4L, 4L, 1L, 4L, 4L, 1L, 4L, 3L, 3L, 4L, 4L, > 4L, 4L, 1L, 2L, 3L, 2L, 3L, 2L, 3L, 2L, 3L, 2L, 3L, 3L, 3L, 2L, > 3L, 4L, 3L, 3L, 4L, 3L, 3L, 3L, 4L, 4L, 3L, 2L, 2L, 2L, 3L, 3L, > 2L, 4L, 2L, 3L, 3L, 3L, 2L, 3L, 3L, 2L, 2L, 3L, 3L, 2L, 3L, 3L, > 2L, 1L, 3L, 2L, 3L, 3L, 3L, 3L, 2L, 2L, 3L, 3L, 3L, 2L, 3L, 3L, > 3L, 1L, 2L, 3L, 3L, 2L, 3L, 3L, 3L, 3L, 4L, 2L, 3L, 3L, 2L, 3L, > 3L, 2L, 3L, 2L, 2L, 3L, 3L, 2L, 4L, 3L, 3L, 2L, 1L, 3L, 2L, 2L, > 1L, 2L, 3L, 3L, 2L, 3L, 3L, 2L, 3L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, > 3L, 3L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 2L, 2L, 3L, 2L, > 3L, 3L, 2L, 3L, 3L, 4L, 3L, 3L, 4L, 1L, 3L, 2L, 3L, 4L, 2L, 4L, > 2L, 2L, 3L, 3L, 3L, 3L, 3L, 4L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 3L, > 4L, 3L, 4L, 3L, 3L, 3L, 1L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 3L, 3L, > 3L, 3L, 4L, 3L, 3L, 3L, 4L, 3L, 4L, 3L, 3L, 3L, 4L, 3L, 4L, 3L, > 3L, 4L, 3L, 4L, 2L, 2L, 3L, 3L, 4L, 4L, 3L, 3L, 4L, 4L, 3L, 3L, > 4L, 3L, 4L, 3L, 4L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 3L, 4L, 3L, 3L > ), .Label = c("black", "blue", "orange", "red"), class = "factor"), > C = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, > 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, > 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, > 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, > 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, > 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, > 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, > 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, > 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, > 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, > 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, > 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, > 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, > 1L, 1L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, > 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, > 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, > 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, > 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, > 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, > 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, > 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, > 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, > 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, > 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, > 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, > 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, > 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, > 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L)), .Names > = c("rown", > "X", "Y", "A", "B", "C"), row.names = c(91L, 193L, 128L, 8L, > 143L, 46L, 60L, 99L, 112L, 67L, 25L, 15L, 188L, 93L, 115L, 4L, > 190L, 64L, 147L, 119L, 82L, 120L, 23L, 139L, 28L, 42L, 180L, > 24L, 145L, 71L, 13L, 95L, 94L, 104L, 149L, 74L, 32L, 184L, 11L, > 114L, 90L, 70L, 63L, 141L, 192L, 126L, 153L, 172L, 26L, 151L, > 109L, 
133L, 79L, 35L, 61L, 43L, 52L, 29L, 30L, 80L, 154L, 7L, > 121L, 122L, 106L, 182L, 16L, 2L, 175L, 34L, 102L, 174L, 117L, > 178L, 100L, 68L, 48L, 31L, 53L, 168L, 59L, 165L, 123L, 69L, 55L, > 62L, 163L, 39L, 108L, 96L, 97L, 113L, 87L, 164L, 169L, 33L, 118L, > 45L, 148L, 129L, 22L, 116L, 101L, 157L, 191L, 89L, 75L, 156L, > 137L, 183L, 98L, 150L, 124L, 144L, 127L, 155L, 57L, 36L, 14L, > 161L, 187L, 138L, 111L, 146L, 20L, 107L, 140L, 110L, 125L, 41L, > 105L, 159L, 103L, 132L, 44L, 166L, 56L, 171L, 195L, 40L, 135L, > 5L, 58L, 37L, 54L, 83L, 17L, 142L, 77L, 162L, 170L, 160L, 78L, > 38L, 194L, 21L, 167L, 27L, 81L, 185L, 47L, 66L, 73L, 3L, 134L, > 158L, 51L, 173L, 50L, 18L, 12L, 6L, 189L, 72L, 85L, 65L, 92L, > 179L, 86L, 49L, 130L, 177L, 152L, 176L, 9L, 10L, 76L, 88L, 131L, > 181L, 19L, 186L, 136L, 1L, 84L, 366L, 235L, 196L, 224L, 206L, > 288L, 204L, 274L, 199L, 239L, 271L, 295L, 266L, 305L, 284L, 340L, > 268L, 296L, 293L, 262L, 300L, 212L, 336L, 208L, 358L, 242L, 221L, > 237L, 369L, 292L, 201L, 338L, 233L, 217L, 227L, 225L, 270L, 267L, > 345L, 205L, 219L, 278L, 337L, 230L, 380L, 291L, 229L, 367L, 339L, > 241L, 228L, 263L, 349L, 348L, 371L, 202L, 207L, 351L, 282L, 222L, > 200L, 213L, 285L, 375L, 302L, 231L, 223L, 386L, 352L, 363L, 353L, > 357L, 359L, 350L, 283L, 362L, 218L, 198L, 374L, 301L, 286L, 364L, > 368L, 220L, 298L, 280L, 214L, 273L, 303L, 382L, 354L, 238L, 373L, > 234L, 356L, 216L, 289L, 370L, 381L, 343L, 361L, 306L, 281L, 203L, > 341L, 355L, 346L, 272L, 264L, 360L, 334L, 210L, 197L, 342L, 299L, > 378L, 236L, 333L, 294L, 347L, 275L, 385L, 365L, 209L, 297L, 240L, > 265L, 379L, 304L, 269L, 372L, 384L, 344L, 287L, 332L, 376L, 261L, > 377L, 383L, 215L, 232L, 277L, 276L, 211L, 290L, 335L, 226L, 279L, > 399L, 307L, 395L, 400L, 411L, 388L, 319L, 403L, 320L, 309L, 318L, > 407L, 402L, 308L, 326L, 251L, 260L, 246L, 408L, 331L, 312L, 387L, > 414L, 253L, 315L, 413L, 416L, 327L, 393L, 322L, 390L, 317L, 389L, > 249L, 325L, 329L, 398L, 397L, 323L, 396L, 255L, 415L, 245L, 391L, > 
412L, 259L, 417L, 311L, 392L, 409L, 328L, 254L, 248L, 310L, 258L, > 405L, 324L, 250L, 406L, 316L, 394L, 257L, 404L, 243L, 252L, 410L, > 313L, 256L, 330L, 321L, 244L, 401L, 247L, 314L), class = "data.frame") > >> >> Only difference I see is that my data is largely sorted by $C >> whereas the >> working data frame is not. Not sure why that would make a >> difference. >> >> Thanks again for your help! >> Mark >> >> >>> head(broken) >> X Y A B C >> 1 0.3476158 0.5334874 Cat A red 1 >> 2 0.5692598 0.3205288 Cat A red 1 >> 3 0.5879649 0.3593725 Cat A black 1 >> 4 0.9642691 0.9242240 Cat A black 1 >> 5 0.5303471 0.7964391 Cat A red 1 >> 6 0.9998770 0.1722618 Cat A black 1 >> >>> head(works) >> X Y A B C >> 1 0.55722499 31 cat D yellow 2 >> 2 0.75100600 32 cat B red 5 >> 3 0.21665005 33 cat C green 4 >> 4 0.01201102 34 cat B red 3 >> 5 0.78503588 35 cat B black 2 >> 6 0.53589896 36 cat D blue 5 >> -----Original Message----- >> From: David Winsemius [mailto:dwinsemius at comcast.net] >> Sent: Saturday, March 12, 2011 10:39 PM >> To: Mark Linderman >> Cc: r-help at r-project.org >> Subject: Re: [R] Plotting symbols and colors based upon data values >> >> >> On Mar 12, 2011, at 7:57 PM, Mark Linderman wrote: >> >>> I am new to R and am sure this is simple, but I been unable to >>> find a >>> solution. >>> >>> I have 5 columns of data labeled "X", "Y", "A","B","C". I can >>> easily >>> xyplot(Y ~ X | A) but I want the colors of the symbols to be based >>> upon the >>> values of B and the shape of the symbols to be determined by C. >>> There are >>> approximately four distinct values of B and C (say >>> "b1","b2","b3","b4" >>> and "c1","c2","c3","c4", respectively) >>> >> No data to check it against (despite the request for such that >> accompanies >> every posting) but see if this give the desired result: >> >> xyplot(Y ~ X | A, data=dfrm2, pch=dfrm2$C , col=dfrm2$B) >> >> >>> Either a solution or a pointer to a specific reference/example is >>> greatly appreciated. 
>>
>> There are many in the contributed documentation as well as in Sarkar's
>> book website and in the graphics galleries. As you suggested, it's
>> pretty basic stuff since you are benefiting from Sarkar's effort to
>> carry over some of the argument names from basic graphics. The one
>> "trick" is to not rely on the argument being assumed to come from the
>> environment of the `data` argument.
>>
>> --

David Winsemius, MD
West Hartford, CT

From dwinsemius at comcast.net Fri Apr 1 16:50:05 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Fri, 1 Apr 2011 10:50:05 -0400
Subject: [R] read.table question #only need to change column names
In-Reply-To:
References:
Message-ID: <17839A96-0C70-4B5D-BE8F-5EBB1F62B8E0@comcast.net>

On Apr 1, 2011, at 10:00 AM, hongsheng wu wrote:

> Hi all,
>
> I have a huge data set. All I want to do is to change its column
> names. So if I use read.table, I can read the data in, change the
> colnames, and then write it back, such as,
>
> t <- read.table("a.txt", header = T, sep = "\t")
> colnames(t)
> colnames(t) <- c("....") # new column names
> write.table(t, "a.txt", quote = F, sep = "\t", row.names = F,
>             col.names = T)
>
> My question is: is there a better way to do this without reading in
> and writing out the data entirely?

With an editor.

--
David Winsemius, MD
West Hartford, CT

From jwiley.psych at gmail.com Fri Apr 1 17:02:12 2011
From: jwiley.psych at gmail.com (Joshua Wiley)
Date: Fri, 1 Apr 2011 08:02:12 -0700
Subject: [R] Syntax coloring in R console
In-Reply-To:
References:
Message-ID:

Dear January,

Have you looked at Emacs + ESS? http://ess.r-project.org/

It highlights in the text editor and the actual R process besides
coming with a rich set of features and a mailing list filled with
helpful Emacs & R users. I've tried several different interfaces and
ended up being happiest by far with Emacs + ESS.
I also seem to recall Kate or Kedit at least highlighted .R files, but I
am not in a position to check right now.

Happy hunting,

Josh

On Fri, Apr 1, 2011 at 3:32 AM, January Weiner wrote:
> Dear all,
>
> I am a happy user of R console, but I would like to see syntax
> coloring. I use R 2.12 in Ubuntu Linux.
>
> I have found the packages "xterm256" and "highlight", but I was not
> able to figure out how to use them to highlight the syntax in console
> output.
>
> Also, I tried several GUI interfaces, but I was not able to find
> something that suits me better than the default R console. R cmdr is
> definitely not for me, as I don't want to fundamentally change the way
> I am managing my data in R. RKWard seems to be nice (from screenshots),
> but its installation requires all the base KDE libraries, which I
> don't want to install.
>
> I tried JGR, the GUI for R, but I have found the following problems
> with this package:
>
> - I was not able to change the background color from a repulsive grey,
> - apparently, GNU readline is not implemented in that package, that
> is, there is no functionality similar to ctrl-r (which searches
> through the history for matching commands), something I use
> frequently, and
> - tab expansion is of limited use (e.g. doesn't browse files in the
> current directory when expanding quoted arguments e.g. in
> "read.table").
>
> All in all, I'd be happy to continue using the plain R console, but
> syntax highlighting would be nice. Any advice would be extremely
> welcome.
>
> Kind regards,
>
> January
>
>
>
>> sessionInfo()
> R version 2.12.2 (2011-02-25)
> Platform: i486-pc-linux-gnu (32-bit)
>
> locale:
>  [1] LC_CTYPE=en_US.utf8       LC_NUMERIC=C
> LC_TIME=en_US.utf8        LC_COLLATE=en_US.utf8     LC_MONETARY=C
>  [6] LC_MESSAGES=en_US.utf8    LC_PAPER=en_US.utf8       LC_NAME=C
>            LC_ADDRESS=C              LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
>
> --
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/

From jianfeng.mao at gmail.com Fri Apr 1 17:02:19 2011
From: jianfeng.mao at gmail.com (Mao Jianfeng)
Date: Fri, 1 Apr 2011 17:02:19 +0200
Subject: [R] a statistical problem - more statistical strategy for making quality control
Message-ID:

Dear R-listers,

I would like your help in devising a good quality-control strategy
based on several quality-control variables. That is, I need a good way
to choose a cutoff for each variable, or a single cutoff across all of
them at once.

For quality control we use several values. Each product has a value for
each of these variables:
(1) concordance (float, from 0 to 1; 0.5 is the expected best value)
(2) coverage (integer, >= 1; the larger the better)
(3) base quality (integer, 1 to 40; the larger the better)

Here, concordance may be the most important variable for quality
control. The best products as judged by concordance are those with
values of 0.5. Obviously, smaller values (<0.1) and bigger values (>0.9)
are not good. Coverage may also play an important role: products with a
concordance of 0.5 and 0.1 coverage may not be good calls. Base quality
behaves like coverage: products with a bigger base quality should be
better.

I want to find a good strategy for setting a cutoff for our products
based on all three variables, or on concordance and coverage alone. I
would prefer a more statistical approach. Could you please give me any
ideas or directions?

Thanks in advance.
###########################################################
# As an example, I create dummy data for my question.

product <- 1:100
concordance <- rnorm(100, mean=0.2)
coverage <- sample(1:50, 100, replace = T)
base.quality <- sample(1:40, 100, replace = T)

dummy <- cbind(product, concordance, coverage, base.quality)

Best,
Jian-Feng, Mao

From alex at chaotic-neutral.de Fri Apr 1 17:17:22 2011
From: alex at chaotic-neutral.de (Alexander Engelhardt)
Date: Fri, 1 Apr 2011 17:17:22 +0200
Subject: [R] Syntax coloring in R console
In-Reply-To:
References:
Message-ID: <4D95EC82.4090709@chaotic-neutral.de>

On 01.04.2011 17:02, Joshua Wiley wrote:
> Dear January,
>
> Have you looked at Emacs + ESS? http://ess.r-project.org/
>
> It highlights in the text editor and the actual R process besides
> coming with a rich set of features and a mailing list filled with
> helpful Emacs & R users. I've tried several different interfaces and
> ended up being happiest by far with Emacs + ESS. I also seem to
> recall Kate or Kedit at least highlighted .R files, but I am not in a
> position to check right now.

Another vote for Emacs + ESS here.
I think sooner or later you'll need some kind of editor (with built-in
syntax highlighting) to save your code. The "naked" R console won't do
it forever :)

Although I didn't like Emacs at first because of its steep learning
curve, after a while you'll love the keyboard shortcuts for everything.

Also, this add-on is very useful. It makes Shift-Enter send your lines
to R (and a bit more):
http://www.kieranhealy.org/blog/archives/2009/10/12/make-shift-enter-do-a-lot-in-ess/

From wangz at kuhp.kyoto-u.ac.jp Fri Apr 1 17:21:28 2011
From: wangz at kuhp.kyoto-u.ac.jp (Zhipeng Wang)
Date: Sat, 2 Apr 2011 00:21:28 +0900
Subject: [R] Connect observed values by a smooth curve, is there any method to get coordinate value of arbitrary point on the curve?
In-Reply-To:
References: <4D93150E.3050802@gmail.com>
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From y.jiao at ucl.ac.uk Fri Apr 1 17:24:06 2011
From: y.jiao at ucl.ac.uk (Yan Jiao)
Date: Fri, 1 Apr 2011 16:24:06 +0100
Subject: [R] regression line on boxplots
Message-ID: <7785EFE1CC2D9F4A96F6A415BC1C2BD0168DD480@PC6-46.pogb.cancer.ucl.ac.uk>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From fabio.ciotola at gmail.com Fri Apr 1 16:34:10 2011
From: fabio.ciotola at gmail.com (Fabio Ciotola)
Date: Fri, 1 Apr 2011 15:34:10 +0100
Subject: [R] Simple AR(2)
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From january.weiner at mpiib-berlin.mpg.de Fri Apr 1 17:05:13 2011
From: january.weiner at mpiib-berlin.mpg.de (January Weiner)
Date: Fri, 1 Apr 2011 17:05:13 +0200
Subject: [R] Syntax coloring in R console
In-Reply-To:
References:
Message-ID:

Dear Liviu,

thanks for the programming tip! However, I do all my editing in vim,
which has had syntax highlighting for quite a while, as well as
auto-completion and a number of other goodies. But while I do most of
my programming in vim, I do most of my scientific studies simply using
the R interface and keeping a manually edited "lab book" apart from
the scripts. Clearly, one can do as much with RStudio -- I just don't
see any advantages, but I do see a disadvantage in my specific case:
all sub-windows are confined to the "desktop". In other words, having
multiple plots on one monitor and a terminal with R command line is
not possible.

A question, though. Given that I have projects assorted in various
directories, how can I start RStudio opening a project stored in
.RData and .Rhistory of a given directory? I.e., how can I make
RStudio open the current directory (like R does), and not $HOME?

Thanks nonetheless,

j.
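
[Editorial sketch] What plain R does when started in a project directory
(use that directory as the working directory and restore .RData and
.Rhistory) can be approximated by hand from inside any front end. The
following is only a minimal sketch under stated assumptions: the names
`open_project` and `project_dir` are hypothetical, the throwaway
directory stands in for a real project, and nothing here relies on any
RStudio-specific feature. History loading is wrapped in try() because
it is not available in every front end.

```r
# Sketch only: `open_project` and `project_dir` are hypothetical names,
# not part of R or RStudio.
open_project <- function(dir) {
  old <- setwd(dir)                      # make the project the working directory
  if (file.exists(".RData")) {
    load(".RData", envir = globalenv())  # restore the saved workspace
  }
  if (interactive() && file.exists(".Rhistory")) {
    # History support depends on the front end, hence the try().
    try(utils::loadhistory(".Rhistory"), silent = TRUE)
  }
  invisible(old)                         # previous working directory
}

# Usage with a throwaway directory standing in for a real project:
project_dir <- file.path(tempdir(), "my_project")
dir.create(project_dir, showWarnings = FALSE)
x <- 42
save(x, file = file.path(project_dir, ".RData"))
rm(x)
open_project(project_dir)
```

Returning the previous working directory makes it easy to switch back
with setwd() once work on the project is finished.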

On Fri, Apr 1, 2011 at 4:20 PM, Liviu Andronic wrote:
> On Fri, Apr 1, 2011 at 3:48 PM, January Weiner
> wrote:
>> RStudio might be a fine program, but it does not feature syntax
>> highlighting, which is the only thing I am missing from R Console (it
>> only colors the commands typed).
>>
> The idea is that you shouldn't use the R console for your main
> programming needs, but only for quick and dirty checks. For the bulk
> of programming tasks you are invited to use the integrated editor
> (File > New > Script). The editor window does feature syntax
> highlighting, and a very helpful completion mechanism (via ).
> Sending lines for execution to the terminal is as easy as clicking
> 'run lines' or ctrl+enter. If you're not a fan of keeping scripts for
> your projects you can easily use temporary files that you don't save.
>
>
>> Moreover, the very idea of squeezing
>> all R windows into one "window-desktop" would be counterproductive in
>> my particular case.
>>
> Notice that all panes are freely resizable and can be resized to the
> point of becoming hidden. Future releases will give more control over
> the panes layout (I think).
>
> Regards
> Liviu
>
>
>> Thank you anyways!
>>
>> j.
>>
>>
>>
>>> [1] http://alternativeto.net/software/rstudio/about
>>>
>>> Regards
>>> Liviu
>>>
>>>
>>>> I tried JGR, the GUI for R, but I have found the following problems
>>>> with this package:
>>>>
>>>> - I was not able to change the background color from a repulsive grey,
>>>> - apparently, GNU readline is not implemented in that package, that
>>>> is, there is no functionality similar to ctrl-r (which searches
>>>> through the history for matching commands), something I use
>>>> frequently, and
>>>> - tab expansion is of limited use (e.g. doesn't browse files in the
>>>> current directory when expanding quoted arguments e.g. in
>>>> "read.table").
>>>>
>>>> All in all, I'd be happy to continue using the plain R console, but
>>>> syntax highlighting would be nice. Any advice would be extremely
>>>> welcome.
>>>>
>>>> Kind regards,
>>>>
>>>> January
>>>>
>>>>
>>>>
>>>>> sessionInfo()
>>>> R version 2.12.2 (2011-02-25)
>>>> Platform: i486-pc-linux-gnu (32-bit)
>>>>
>>>> locale:
>>>>  [1] LC_CTYPE=en_US.utf8       LC_NUMERIC=C
>>>> LC_TIME=en_US.utf8        LC_COLLATE=en_US.utf8     LC_MONETARY=C
>>>>  [6] LC_MESSAGES=en_US.utf8    LC_PAPER=en_US.utf8       LC_NAME=C
>>>>            LC_ADDRESS=C              LC_TELEPHONE=C
>>>> [11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C
>>>>
>>>> attached base packages:
>>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>>>
>>>>
>>>> --
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>
>>>
>>>
>>> --
>>> Do you know how to read?
>>> http://www.alienetworks.com/srtest.cfm
>>> http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader
>>> Do you know how to write?
>>> http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail
>>>
>>
>>
>>
>> --
>> -------- Dr. January Weiner 3 --------------------------------------
>> Max Planck Institute for Infection Biology
>> Charitéplatz 1
>> D-10117 Berlin, Germany
>> Web   : www.mpiib-berlin.mpg.de
>> Tel   : +49-30-28460514
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
>
> --
> Do you know how to read?
> http://www.alienetworks.com/srtest.cfm
> http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader
> Do you know how to write?
> http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail
>

--
-------- Dr. January Weiner 3 --------------------------------------
Max Planck Institute for Infection Biology
Charitéplatz 1
D-10117 Berlin, Germany
Web   : www.mpiib-berlin.mpg.de
Tel   : +49-30-28460514

From pjmiller_57 at yahoo.com Fri Apr 1 16:28:57 2011
From: pjmiller_57 at yahoo.com (Paul Miller)
Date: Fri, 1 Apr 2011 07:28:57 -0700 (PDT)
Subject: [R] Cox Proportional Hazards model with a time-varying covariate
Message-ID: <648355.43722.qm@web161607.mail.bf1.yahoo.com>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From dwinsemius at comcast.net Fri Apr 1 17:42:02 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Fri, 1 Apr 2011 11:42:02 -0400
Subject: [R] regression line on boxplots
In-Reply-To: <7785EFE1CC2D9F4A96F6A415BC1C2BD0168DD480@PC6-46.pogb.cancer.ucl.ac.uk>
References: <7785EFE1CC2D9F4A96F6A415BC1C2BD0168DD480@PC6-46.pogb.cancer.ucl.ac.uk>
Message-ID: <56498B89-3BCA-4C5C-86C9-89F5DD96E8C0@comcast.net>

On Apr 1, 2011, at 11:24 AM, Yan Jiao wrote:

> boxplot(c(1:3),c(4:6),c(5:8))

df <- structure(list(grp = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 3),
                     val = c(1L, 2L, 3L, 4L, 5L, 6L, 5L, 6L, 7L, 8L)),
                .Names = c("grp", "val"), row.names = c(NA, -10L),
                class = "data.frame")
boxplot(val~grp,data=df)
lm(val~grp, data=df)
#Call:
#lm(formula = val ~ grp, data = df)
#Coefficients:
#(Intercept)          grp
#    0.04348      2.21739

?abline
abline(reg=lm(val~grp, data=df) )

--
David Winsemius, MD
West Hartford, CT

From bernd.weiss at uni-koeln.de Fri Apr 1 17:45:03 2011
From: bernd.weiss at uni-koeln.de (Bernd Weiss)
Date: Fri, 01 Apr 2011 11:45:03 -0400
Subject: [R] regression line on boxplots
In-Reply-To: <7785EFE1CC2D9F4A96F6A415BC1C2BD0168DD480@PC6-46.pogb.cancer.ucl.ac.uk>
References: <7785EFE1CC2D9F4A96F6A415BC1C2BD0168DD480@PC6-46.pogb.cancer.ucl.ac.uk>
Message-ID: <4D95F2FF.5000606@uni-koeln.de>

On 01.04.2011 11:24, Yan Jiao wrote:
> Dear R users,
>
> I'm trying to add a regression line on my boxplots (something
> like:boxplot(c(1:3),c(4:6),c(5:8))) But I can't see it. Please help
> !!! It's not an April fool's joke!!!

Your sample data does not make any sense (at least to me). I would do it
as follows:

set.seed(8)
df <- data.frame(y = rnorm(100), x = factor(rep(1:4)))
boxplot(y ~ x, data = df)
regline <- lm(y ~ as.numeric(x), data = df)
abline(regline)

HTH,

Bernd

From gunter.berton at gene.com Fri Apr 1 17:39:15 2011
From: gunter.berton at gene.com (Bert Gunter)
Date: Fri, 1 Apr 2011 08:39:15 -0700
Subject: [R] Syntax coloring in R console
In-Reply-To:
References:
Message-ID:

Please move this thread off r-help. It's about RStudio, not R.

-- Bert

On Fri, Apr 1, 2011 at 8:05 AM, January Weiner wrote:
> Dear Liviu,
>
> thanks for the programming tip! However, I do all my editing in vim,
> which has had syntax highlighting for quite a while, as well as
> auto-completion and a number of other goodies. But while I do most of
> my programming in vim, I do most of my scientific studies simply using
> the R interface and keeping a manually edited "lab book" apart from
> the scripts. Clearly, one can do as much with RStudio -- I just don't
> see any advantages, but I do see a disadvantage in my specific case:
> all sub-windows are confined to the "desktop". In other words, having
> multiple plots on one monitor and a terminal with R command line is
> not possible.
>
> A question, though. Given that I have projects assorted in various
> directories, how can I start RStudio opening a project stored in
> .RData and .Rhistory of a given directory? I.e., how can I make
> RStudio open the current directory (like R does), and not $HOME?
>
> Thanks nonetheless,
>
> j.
>
>
> On Fri, Apr 1, 2011 at 4:20 PM, Liviu Andronic wrote:
>> On Fri, Apr 1, 2011 at 3:48 PM, January Weiner
>> wrote:
>>> RStudio might be a fine program, but it does not feature syntax
>>> highlighting, which is the only thing I am missing from R Console (it
>>> only colors the commands typed).
>>>
>> The idea is that you shouldn't use the R console for your main
>> programming needs, but only for quick and dirty checks. For the bulk
>> of programming tasks you are invited to use the integrated editor
>> (File > New > Script). The editor window does feature syntax
>> highlighting, and a very helpful completion mechanism (via ).
>> Sending lines for execution to the terminal is as easy as clicking
>> 'run lines' or ctrl+enter. If you're not a fan of keeping scripts for
>> your projects you can easily use temporary files that you don't save.
>>
>>
>>> Moreover, the very idea of squeezing
>>> all R windows into one "window-desktop" would be counterproductive in
>>> my particular case.
>>>
>> Notice that all panes are freely resizable and can be resized to the
>> point of becoming hidden. Future releases will give more control over
>> the panes layout (I think).
>>
>> Regards
>> Liviu
>>
>>
>>> Thank you anyways!
>>>
>>> j.
>>>
>>>
>>>
>>>> [1] http://alternativeto.net/software/rstudio/about
>>>>
>>>> Regards
>>>> Liviu
>>>>
>>>>
>>>>> I tried JGR, the GUI for R, but I have found the following problems
>>>>> with this package:
>>>>>
>>>>> - I was not able to change the background color from a repulsive grey,
>>>>> - apparently, GNU readline is not implemented in that package, that
>>>>> is, there is no functionality similar to ctrl-r (which searches
>>>>> through the history for matching commands), something I use
>>>>> frequently, and
>>>>> - tab expansion is of limited use (e.g. doesn't browse files in the
>>>>> current directory when expanding quoted arguments e.g. in
>>>>> "read.table").
>>>>>
>>>>> All in all, I'd be happy to continue using the plain R console, but
>>>>> syntax highlighting would be nice. Any advice would be extremely
>>>>> welcome.
>>>>>
>>>>> Kind regards,
>>>>>
>>>>> January
>>>>>
>>>>>
>>>>>
>>>>>> sessionInfo()
>>>>> R version 2.12.2 (2011-02-25)
>>>>> Platform: i486-pc-linux-gnu (32-bit)
>>>>>
>>>>> locale:
>>>>>  [1] LC_CTYPE=en_US.utf8       LC_NUMERIC=C
>>>>> LC_TIME=en_US.utf8        LC_COLLATE=en_US.utf8     LC_MONETARY=C
>>>>>  [6] LC_MESSAGES=en_US.utf8    LC_PAPER=en_US.utf8       LC_NAME=C
>>>>>            LC_ADDRESS=C              LC_TELEPHONE=C
>>>>> [11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C
>>>>>
>>>>> attached base packages:
>>>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>>>>
>>>>>
>>>>> --
>>>>>
>>>>> ______________________________________________
>>>>> R-help at r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Do you know how to read?
>>>> http://www.alienetworks.com/srtest.cfm
>>>> http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader
>>>> Do you know how to write?
>>>> http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail
>>>>
>>>
>>>
>>>
>>> --
>>> -------- Dr. January Weiner 3 --------------------------------------
>>> Max Planck Institute for Infection Biology
>>> Charitéplatz 1
>>> D-10117 Berlin, Germany
>>> Web   : www.mpiib-berlin.mpg.de
>>> Tel   : +49-30-28460514
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>>
>>
>> --
>> Do you know how to read?
>> http://www.alienetworks.com/srtest.cfm
>> http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader
>> Do you know how to write?
>> http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail
>>
>
>
>
> --
> -------- Dr. January Weiner 3 --------------------------------------
> Max Planck Institute for Infection Biology
> Charitéplatz 1
> D-10117 Berlin, Germany
> Web   : www.mpiib-berlin.mpg.de
> Tel   : +49-30-28460514
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--
"Men by nature long to get on to the ultimate truths, and will often
be impatient with elementary studies or fight shy of them. If it were
possible to reach the ultimate truths without the elementary studies
usually prefixed to them, these would not be preparatory studies but
superfluous diversions."

-- Maimonides (1135-1204)

Bert Gunter
Genentech Nonclinical Biostatistics

From landronimirc at gmail.com Fri Apr 1 17:46:18 2011
From: landronimirc at gmail.com (Liviu Andronic)
Date: Fri, 1 Apr 2011 17:46:18 +0200
Subject: [R] Syntax coloring in R console
In-Reply-To:
References:
Message-ID:

On Fri, Apr 1, 2011 at 5:05 PM, January Weiner wrote:
> Dear Liviu,
>
> thanks for the programming tip! However, I do all my editing in vim,
> which has had syntax highlighting for quite a while, as well as
> auto-completion and a number of other goodies. But while I do most of
> my programming in vim,
>
For info, vim has a plug-in for R.

> I do most of my scientific studies simply using
> the R interface and keeping a manually edited "lab book" apart from
> the scripts. Clearly, one can do as much with RStudio -- I just don't
> see any advantages, but I do see a disadvantage in my specific case:
> all sub-windows are confined to the "desktop". In other words, having
> multiple plots on one monitor and a terminal with R command line is
> not possible.
>
I see. This would be more of a hack, but you could use playwith as an
external graphics device. Otherwise, there seems to be an easy hack:
> x11()
> plot(1:10)

> A question, though. Given that I have projects assorted in various
> directories, how can I start RStudio opening a project stored in
> .RData and .Rhistory of a given directory? I.e., how can I make
> RStudio open the current directory (like R does), and not $HOME?
>
This is a known issue and the RStudio devels plan to address this.
Currently I simply open RStudio in $HOME (and make sure that there is
no existing .RData there), then open the .RData from my project's
directory (by clicking on it in the right-hand Files pane) and then in
the same pane I hit More > Set as working dir. A bit cumbersome, but
seems to work, and a small price to pay for all the available
functionality.

Regards
Liviu

> Thanks nonetheless,
>
> j.
>
>
> On Fri, Apr 1, 2011 at 4:20 PM, Liviu Andronic wrote:
>> On Fri, Apr 1, 2011 at 3:48 PM, January Weiner
>> wrote:
>>> RStudio might be a fine program, but it does not feature syntax
>>> highlighting, which is the only thing I am missing from R Console (it
>>> only colors the commands typed).
>>>
>> The idea is that you shouldn't use the R console for your main
>> programming needs, but only for quick and dirty checks. For the bulk
>> of programming tasks you are invited to use the integrated editor
>> (File > New > Script). The editor window does feature syntax
>> highlighting, and a very helpful completion mechanism (via ).
>> Sending lines for execution to the terminal is as easy as clicking
>> 'run lines' or ctrl+enter. If you're not a fan of keeping scripts for
>> your projects you can easily use temporary files that you don't save.
>> >> >>> Moreover, the very idea of squeezing >>> all R windows into one "window-desktop" would be counterproductive in >>> my particular case. >>> >> Notice that all panes are freely resizable and can be resized to the >> point of becoming hidden. Future releases will give more control over >> the panes layout (I think). >> >> Regards >> Liviu >> >> >>> Thank you anyways! >>> >>> j. >>> >>> >>> >>>> [1] http://alternativeto.net/software/rstudio/about >>>> >>>> Regards >>>> Liviu >>>> >>>> >>>>> I tried JGR, the GUI for R, but I have found the following problems >>>>> with this package: >>>>> >>>>> - I was not able to change the background color from a repulsive grey, >>>>> - apparently, GNU readline is not implemented in that package, that >>>>> is, there is no functionality similar to ctrl-r (which searches >>>>> through the history for matching commands), something I use >>>>> frequently, and >>>>> - tab expansion is of limited use (e.g. doesn't browse files in the >>>>> current directory when expanding quoted arguments e.g. in >>>>> "read.table"). >>>>> >>>>> All in all, I'd be happy to continue using the plain R console, but >>>>> syntax highlighting would be nice. Any advice would be extremely >>>>> welcome. >>>>> >>>>> Kind regards, >>>>> >>>>> January >>>>> >>>>> >>>>> >>>>>> sessionInfo() >>>>> R version 2.12.2 (2011-02-25) >>>>> Platform: i486-pc-linux-gnu (32-bit) >>>>> >>>>> locale: >>>>> ?[1] LC_CTYPE=en_US.utf8 ? ? ? LC_NUMERIC=C >>>>> LC_TIME=en_US.utf8 ? ? ? ?LC_COLLATE=en_US.utf8 ? ? LC_MONETARY=C >>>>> ?[6] LC_MESSAGES=en_US.utf8 ? ?LC_PAPER=en_US.utf8 ? ? ? LC_NAME=C >>>>> ? ? ? ? ? ?LC_ADDRESS=C ? ? ? ? ? ? ?LC_TELEPHONE=C >>>>> [11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C >>>>> >>>>> attached base packages: >>>>> [1] stats ? ? graphics ?grDevices utils ? ? datasets ?methods ? 
base
>>>>>
--
Do you know how to read?
http://www.alienetworks.com/srtest.cfm http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader Do you know how to write? http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail From ehlers at ucalgary.ca Fri Apr 1 18:04:31 2011 From: ehlers at ucalgary.ca (Peter Ehlers) Date: Fri, 01 Apr 2011 09:04:31 -0700 Subject: [R] ANCOVA for linear regressions without intercept In-Reply-To: References: Message-ID: <4D95F78F.10501@ucalgary.ca> See inline. On 2011-03-31 22:22, Yusuke Fukuda wrote: > Thanks Bert. > > I have read "?formula" again and again, and I'm still struggling; > >> lm(body_length ~ head_length-1) > > This removes intercept from each individual regression (for male, female, unknown). > > When they are taken together, > >> lm(body_length ~ sex*head_length) > > This shows differences in slopes and intercepts between the regressions (but I want to compare the slopes of the regressions WITHOUT intercepts). > > If I put > >> lm(body_length ~ sex:head_length-1) > > This shows slopes for each sex without intercepts, but NOT differences in the slope between the regressions. You probably want: lm(body_length ~ head_length + sex:head_length-1) or, in short form: lm(body_length ~ head_length/sex-1) You might then compare the model 'without intercepts' (i.e. with intercepts forced to zero) with a model that includes intercepts. If the intercepts turn out to be significantly nonzero, what will you do? Peter Ehlers > > I also tried > >> lm(body_length ~ sex*head_length-1) >> lm(body_length ~ sex*head_length-sex-1) > > But none of them worked. > > Would anyone be able to help me? All I want to do is to compare the slopes of three linear regressions that go through the origin (0,0) so that I can say if their difference is significant or not. > > Thanks for your help. 
>
>
>
> ________________________________________
> From: Bert Gunter [mailto:gunter.berton at gene.com]
> Sent: Friday, 1 April 2011 12:56 AM
> To: Yusuke Fukuda
> Cc: r-help at r-project.org
> Subject: Re: [R] ANCOVA for linear regressions without intercept
>
> If you haven't already received an answer, a careful reading of
>
> ?formula
>
> will provide it.
>
> -- Bert
> On Wed, Mar 30, 2011 at 11:42 PM, Yusuke Fukuda wrote:
>
> Hello R experts
>
> I have two linear regressions for sexes (Male, Female, Unknown). All have a good correlation between body length (response variable) and head length (explanatory variable). I know it is not recommended, but for a good practical reason (the purpose of study is to find a single conversion factor from head length to body length), the regressions need to go through the origin (0 intercept).
>
> Is it possible to do ANCOVA for these regressions without intercepts? When I do
>
> summary(lm(body length ~ sex*head length))
>
> this will include the intercepts as below
>
>
> Coefficients:
>                        Estimate Std. Error t value Pr(>|t|)
> (Intercept)            -6.49697    1.68497  -3.856 0.000118 ***
> sexMale                -9.39340    1.97760  -4.750 2.14e-06 ***
> sexUnknown             -1.33791    2.35453  -0.568 0.569927
> head_length             7.12307    0.05503 129.443  < 2e-16 ***
> sexMale:head_length     0.31631    0.06246   5.064 4.37e-07 ***
> sexUnknown:head_length  0.19937    0.07022   2.839 0.004556 **
> ---
>
> Is there any way I can remove the intercepts so that I can simply compare the slopes with no intercept taken into account?
>
> Thanks for help in advance.
>
> Yusuke Fukuda
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
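The formulas suggested in this thread can be tried out on made-up data. What follows is only a sketch: the data are simulated (the slopes 7.0, 7.3 and 7.1, the noise level and the sample sizes are assumptions chosen for illustration, not the poster's measurements), and the variable names simply mirror those used in the thread. It fits one common slope through the origin, then a separate slope per sex through the origin, and compares the two fits with an F test.

```r
# Simulated, hypothetical data: three groups, all lines through the origin.
set.seed(1)
n <- 60
sex <- factor(rep(c("Female", "Male", "Unknown"), each = n))
head_length <- runif(3 * n, min = 10, max = 60)
slopes <- c(7.0, 7.3, 7.1)[as.integer(sex)]   # assumed true slopes per group
body_length <- slopes * head_length + rnorm(3 * n, sd = 5)
dat <- data.frame(body_length, head_length, sex)

# One conversion factor for everybody (single slope, no intercept).
fit_common <- lm(body_length ~ head_length - 1, data = dat)

# Separate slope for each sex, still no intercepts.
# Coefficients: slope for the baseline level, then slope differences
# for the remaining levels relative to that baseline.
fit_sep <- lm(body_length ~ head_length + sex:head_length - 1, data = dat)

# F test: do the slopes differ between the groups?
cmp <- anova(fit_common, fit_sep)

print(coef(fit_sep))
print(cmp)
```

The separate-slopes model can equally be written with the short form body_length ~ head_length/sex - 1. A small p-value in the anova table would suggest that a single conversion factor does not describe all groups equally well; whether forcing the lines through the origin is reasonable is a separate question that this comparison does not answer.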
> > > From ehlers at ucalgary.ca Fri Apr 1 18:14:39 2011 From: ehlers at ucalgary.ca (Peter Ehlers) Date: Fri, 1 Apr 2011 09:14:39 -0700 Subject: [R] Linear Model with curve fitting parameter? In-Reply-To: References: <8512_1301611104_1301611104_AANLkTikOxmcE=oMvHBuB8x61fxXJwnexXJkr+Qp3Tawp@mail.gmail.com> Message-ID: <4D95F9EF.2080205@ucalgary.ca> On 2011-04-01 05:44, stephen sefick wrote: > Setting Z=Q-A would be the incorrect dimensions. I could Z=Q/A. Is > fitting a nls model the same as fitting an ols? These data are > hydraulic data from ~47 sites. To access predictive ability I am > removing one site fitting a new model and then accessing the fit with > a myriad of model assessment criteria. I should get the same answer > with ols vs nls? Thank you for all of your help. No, ols and nls won't give the same result. If you use ols on the logged data, you're assuming additive errors on the log scale. With nls, you assume additive errors on the original scale. But your model looks simple enough - why not run it through both functions and see what the difference is. Ultimately, everything depends on what assumptions you're comfortable with. Peter Ehlers > > Stephen > > On Thu, Mar 31, 2011 at 8:34 PM, Steven McKinney wrote: >> >>> -----Original Message----- >>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of stephen sefick >>> Sent: March-31-11 3:38 PM >>> To: R help >>> Subject: [R] Linear Model with curve fitting parameter? >>> >>> I have a model Q=K*A*(R^r)*(S^s) >>> >>> A, R, and S are data I have and K is a curve fitting parameter. I >>> have linearized as >>> >>> log(Q)=log(K)+log(A)+r*log(R)+s*log(S) >>> >>> I have taken the log of the data that I have and this is the model >>> formula without the K part >>> >>> lm(Q~offset(A)+R+S, data=x) >>> >>> What is the formula that I should use? >> >> Let Z = Q - A for your logged data. 
>> >> Fitting lm(Z ~ R + S, data = x) should yield >> intercept parameter estimate = estimate for log(K) >> R coefficient parameter estimate = estimate for r >> S coefficient parameter estimate = estimate for s >> >> >> >> Steven McKinney >> >> Statistician >> Molecular Oncology and Breast Cancer Program >> British Columbia Cancer Research Centre >> >> >> >>> >>> Thanks for all of your help. I can provide a subset of data if necessary. >>> >>> >>> >>> -- >>> Stephen Sefick >>> ____________________________________ >>> | Auburn University | >>> | Biological Sciences | >>> | 331 Funchess Hall | >>> | Auburn, Alabama | >>> | 36849 | >>> |___________________________________| >>> | sas0025 at auburn.edu | >>> | http://www.auburn.edu/~sas0025 | >>> |___________________________________| >>> >>> Let's not spend our time and resources thinking about things that are >>> so little or so large that all they really do for us is puff us up and >>> make us feel like gods. We are mammals, and have not exhausted the >>> annoying little problems of being mammals. >>> >>> -K. Mullis >>> >>> "A big computer, a complex algorithm and a long time does not equal science." >>> >>> -Robert Gentleman >>> ______________________________________________ >>> R-help at r-project.org mailing list >>> https://stat.ethz.ch/mailman/listinfo/r-help >>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >>> and provide commented, minimal, self-contained, reproducible code. >> > > > From jeff.a.ryan at gmail.com Fri Apr 1 18:28:19 2011 From: jeff.a.ryan at gmail.com (Jeff Ryan) Date: Fri, 1 Apr 2011 11:28:19 -0500 Subject: [R] R/Finance 2011 Conference Agenda Message-ID: R community: We're excited to post a preliminary agenda for the upcoming 3rd conference on R and Applied Finance, to be held in Chicago on April 29th and 30th. 
In addition to keynotes from John Bollinger, Mebane Faber, Stefano Iacus and Louis Kates, we are excited to have 31 additional talks covering the state of R and applied finance. This represents a phenomenal opportunity to meet and interact with some of the leading contributors in the field of finance, all with relevant contributions using R. We expect more than 200 participants from industry, government, and academia for the 2 day event. In addition, a conference dinner Friday evening in the heart of the financial district along Chicago's picturesque river will offer an unprecedented opportunity to enjoy amazing food, drink and conversation. http://www.rinfinance.com/agenda/index.html Registration is open, though pre-conference workshops are rapidly filling up. Register now and join your fellow colleagues at R/Finance 2011! http://www.rinfinance.com/register/ Thanks to our 2011 Co-Sponsors and Sponsors: International Center for Futures and Derivatives at UIC REvolution Analytics OneMarketData RStudio lemnica From m_hofert at web.de Fri Apr 1 18:32:39 2011 From: m_hofert at web.de (Marius Hofert) Date: Fri, 1 Apr 2011 18:32:39 +0200 Subject: [R] How to paste a vector of expressions and a character vector? Message-ID: <42460E70-FC82-41D6-B6E2-F3F92712F935@web.de> Dear expeRts, I know I can't paste expressions in the normal way, but I just couldn't figure out how to get the following (I want to paste a character vector to an expression vector) right with bquote() or substitute. 
vec1 <- c("a", expression(tilde(b)), "c") vec2 <- c("1", "2", "3") main <- as.expression(paste(vec1, vec2)) plot(0,0, main=main[2]) Cheers, Marius From arrayprofile at yahoo.com Fri Apr 1 18:33:57 2011 From: arrayprofile at yahoo.com (array chip) Date: Fri, 1 Apr 2011 09:33:57 -0700 (PDT) Subject: [R] regular expression In-Reply-To: <4D95281D.7090504@uni-koeln.de> References: <536214.83139.qm@web125802.mail.ne1.yahoo.com> <4D951D19.4050902@uni-koeln.de> <791670.28123.qm@web125805.mail.ne1.yahoo.com> <213318.24649.qm@web125808.mail.ne1.yahoo.com> <4D95281D.7090504@uni-koeln.de> Message-ID: <898597.20978.qm@web125807.mail.ne1.yahoo.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From wwwhsd at gmail.com Fri Apr 1 19:14:03 2011 From: wwwhsd at gmail.com (Henrique Dallazuanna) Date: Fri, 1 Apr 2011 14:14:03 -0300 Subject: [R] regular expression In-Reply-To: <898597.20978.qm@web125807.mail.ne1.yahoo.com> References: <536214.83139.qm@web125802.mail.ne1.yahoo.com> <4D951D19.4050902@uni-koeln.de> <791670.28123.qm@web125805.mail.ne1.yahoo.com> <213318.24649.qm@web125808.mail.ne1.yahoo.com> <4D95281D.7090504@uni-koeln.de> <898597.20978.qm@web125807.mail.ne1.yahoo.com> Message-ID: Try this also: > grep("arg", "arg.symptom", value = TRUE, invert = TRUE) character(0) > grep("arg", "liver.symptom", value = TRUE, invert = TRUE) [1] "liver.symptom" > On Fri, Apr 1, 2011 at 1:33 PM, array chip wrote: > Great. thank you Bernd! Learned a new thing here. > > John > > > > > ________________________________ > From: Bernd Weiss > > Cc: r-help at r-project.org > Sent: Thu, March 31, 2011 6:19:25 PM > Subject: Re: [R] regular expression > > Am 31.03.2011 21:06, schrieb array chip: >> Ok then this code didn't do what I wanted. I want "not including >> 'arg' before '.symptom'", not individual letters of "arg", but rather >> as a word. 
>> >> Bill Dunlap suggested using invert=T, it works for single 1 >> condition, but not for 2 conditions here: not including "arg" before >> ".", but at the same time, does include ".symptom". >> >> Any other suggestions would be appreciated > > This does work (but I am by no means an expert in regex...). I am using > 'negative lookbehind'[1] to define an expression like 'arg'. > >> grep('(? character(0) > >> grep('(? [1] "liver.symptom" > > Bernd > > [1] http://www.regular-expressions.info/lookaround.html > > ? ? ? ?[[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Henrique Dallazuanna Curitiba-Paran?-Brasil 25? 25' 40" S 49? 16' 22" O From jim.silverton at gmail.com Fri Apr 1 19:19:32 2011 From: jim.silverton at gmail.com (Jim Silverton) Date: Fri, 1 Apr 2011 13:19:32 -0400 Subject: [R] Fisher's test Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From y.jiao at ucl.ac.uk Fri Apr 1 19:20:41 2011 From: y.jiao at ucl.ac.uk (Yan Jiao) Date: Fri, 1 Apr 2011 18:20:41 +0100 Subject: [R] mean in the boxplot Message-ID: <7785EFE1CC2D9F4A96F6A415BC1C2BD0168DD487@PC6-46.pogb.cancer.ucl.ac.uk> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From dwinsemius at comcast.net Fri Apr 1 19:22:26 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Fri, 1 Apr 2011 13:22:26 -0400 Subject: [R] How to paste a vector of expressions and a character vector? 
In-Reply-To: <42460E70-FC82-41D6-B6E2-F3F92712F935@web.de>
References: <42460E70-FC82-41D6-B6E2-F3F92712F935@web.de>
Message-ID:

On Apr 1, 2011, at 12:32 PM, Marius Hofert wrote:

> Dear expeRts,
>
> I know I can't paste expressions in the normal way, but I just
> couldn't figure out
> how to get the following (I want to paste a character vector to an
> expression vector)
> right with bquote() or substitute.
>
> vec1 <- c("a", expression(tilde(b)), "c")
> vec2 <- c("1", "2", "3")
> main <- as.expression(paste(vec1, vec2))
> plot(0,0, main=main[2])

Do not use `paste` ... it coerces your expression to a character value ... use `c` instead:

> main <- as.expression(c(vec1, vec2))
> plot(1,1, main=main[2])

And then, even the as.expression is superfluous:

> main <- c(vec1, vec2)
> plot(1,1, main=main[2])

--
David Winsemius, MD
West Hartford, CT

From dwinsemius at comcast.net Fri Apr 1 19:28:04 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Fri, 1 Apr 2011 13:28:04 -0400
Subject: [R] mean in the boxplot
In-Reply-To: <7785EFE1CC2D9F4A96F6A415BC1C2BD0168DD487@PC6-46.pogb.cancer.ucl.ac.uk>
References: <7785EFE1CC2D9F4A96F6A415BC1C2BD0168DD487@PC6-46.pogb.cancer.ucl.ac.uk>
Message-ID: <103EAFAD-E888-43E3-A8FC-EEFC698A2F53@comcast.net>

On Apr 1, 2011, at 1:20 PM, Yan Jiao wrote:

> Dear R users,
>
> How to show mean in the boxplot instead of median ?

Read ... help(boxplot)

Use boxplot to create a set of values, substitute the mean for the groups, then pass the mangled set of values to bxp().

--
David Winsemius, MD
West Hartford, CT

From m_hofert at web.de Fri Apr 1 19:56:13 2011
From: m_hofert at web.de (Marius Hofert)
Date: Fri, 1 Apr 2011 19:56:13 +0200
Subject: [R] How to paste a vector of expressions and a character vector?
In-Reply-To:
References: <42460E70-FC82-41D6-B6E2-F3F92712F935@web.de>
Message-ID:

Dear David,

thanks for your reply.
The paste() meant to paste the vectors vec1 and vec2 together, so main should be a vector of length 3 of the form "a 1", "b 2", "c 3" # with b being tilde(b) However, with c() it is a vector of length 6: expression("a", tilde(b), "c", "1", "2", "3") Do you know a solution for that? Cheers, Marius On 2011-04-01, at 19:22 , David Winsemius wrote: > > On Apr 1, 2011, at 12:32 PM, Marius Hofert wrote: > >> Dear expeRts, >> >> I know I can't paste expressions in the normal way, but I just couldn't figure out >> how to get the following (I want to paste a character vector to an expression vector) >> right with bquote() or substitute. >> >> vec1 <- c("a", expression(tilde(b)), "c") >> vec2 <- c("1", "2", "3") >> main <- as.expression(paste(vec1, vec2)) >> plot(0,0, main=main[2]) > > Do not use `paste` ... it coerces your expression to a character value ... use `c` instead: > > > main <- as.expression(c(vec1, vec2)) > > plot(1,1, main=main[2]) > > And then, even the as.expression is superfluous: > > > main <- c(vec1, vec2) > > plot(1,1, main=main[2]) > > -- > > David Winsemius, MD > West Hartford, CT > From dwinsemius at comcast.net Fri Apr 1 20:34:44 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Fri, 1 Apr 2011 14:34:44 -0400 Subject: [R] How to paste a vector of expressions and a character vector? In-Reply-To: References: <42460E70-FC82-41D6-B6E2-F3F92712F935@web.de> Message-ID: <3A7884B9-2F89-41FD-B1F6-88FD57F433F6@comcast.net> On Apr 1, 2011, at 1:56 PM, Marius Hofert wrote: > Dear David, > > thanks for your reply. 
The paste() meant to paste the vectors vec1
> and vec2 together, so main should be a vector of length 3 of the form
> "a 1", "b 2", "c 3" # with b being tilde(b)
> However, with c() it is a vector of length 6:
> expression("a", tilde(b), "c", "1", "2", "3")

lls <-c("a", "tilde(b)","c")
mains <- bquote(.(parse(text=paste(lls,"~",nns,sep=""))) )
mains

#expression(a~1, tilde(b)~2, c~3)
#attr(,"srcfile")

attr(,"wholeSrcref")
a~1
tilde(b)~2
c~3
#-----

> mains[2]
#expression(tilde(b)~2)

plot(0,0, main=mains[2])

This would also work:

> mains2 <- as.expression(parse(text=paste(lls,"~",nns,sep="")))
> mains2
expression(a~1, tilde(b)~2, c~3)
attr(,"srcfile")

attr(,"wholeSrcref")
a~1
tilde(b)~2
c~3

I try to avoid using spaces in expressions and instead use "~"'s
because I don't really understand how spaces get parsed in expression
operations, and I do know how plotmath operations work with "~".

>
> Do you know a solution for that?
>
> Cheers,
>
> Marius
>
> On 2011-04-01, at 19:22 , David Winsemius wrote:
>
>>
>> On Apr 1, 2011, at 12:32 PM, Marius Hofert wrote:
>>
>>> Dear expeRts,
>>>
>>> I know I can't paste expressions in the normal way, but I just
>>> couldn't figure out
>>> how to get the following (I want to paste a character vector to an
>>> expression vector)
>>> right with bquote() or substitute.
>>>
>>> vec1 <- c("a", expression(tilde(b)), "c")
>>> vec2 <- c("1", "2", "3")
>>> main <- as.expression(paste(vec1, vec2))
>>> plot(0,0, main=main[2])
>>
>> Do not use `paste` ... it coerces your expression to a character
>> value ...
use `c` instead: >> >>> main <- as.expression(c(vec1, vec2)) >>> plot(1,1, main=main[2]) >> >> And then, even the as.expression is superfluous: >> >>> main <- c(vec1, vec2) >>> plot(1,1, main=main[2]) >> >> -- >> >> David Winsemius, MD >> West Hartford, CT >> > David Winsemius, MD West Hartford, CT From dwinsemius at comcast.net Fri Apr 1 20:36:58 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Fri, 1 Apr 2011 14:36:58 -0400 Subject: [R] How to paste a vector of expressions and a character vector? In-Reply-To: <3A7884B9-2F89-41FD-B1F6-88FD57F433F6@comcast.net> References: <42460E70-FC82-41D6-B6E2-F3F92712F935@web.de> <3A7884B9-2F89-41FD-B1F6-88FD57F433F6@comcast.net> Message-ID: On Apr 1, 2011, at 2:34 PM, David Winsemius wrote: > > On Apr 1, 2011, at 1:56 PM, Marius Hofert wrote: > >> Dear David, >> >> thanks for your reply. The paste() meant to paste the vectors vec1 >> and vec2 together, so main should be a vector of length 3 of the form >> "a 1", "b 2", "c 3" # with b being tilde(b) >> However, with c() it is a vector of length 6: >> expression("a", tilde(b), "c", "1", "2", "3") > Forgot one variable, nns: > lls <-c("a", "tilde(b)","c") nns <-1:3 > mains <- bquote(.(parse(text=paste(lls,"~",nns,sep=""))) ) > mains > > #expression(a~1, tilde(b)~2, c~3) > #attr(,"srcfile") > > attr(,"wholeSrcref") > a~1 > tilde(b)~2 > c~3 > #----- > > > mains[2] > #expression(tilde(b)~2) > > > plot(0,0, main=mains[2]) > > > This would also work: > > > mains2 <- as.expression(parse(text=paste(lls,"~",nns,sep=""))) > > mains2 > expression(a~1, tilde(b)~2, c~3) > attr(,"srcfile") > > attr(,"wholeSrcref") > a~1 > tilde(b)~2 > c~3 > > I try to avoid using spaces in expressions and instead use "~"'s > because I don't really understand how spaces get parsed in > expression operations, and I do know how plotmath operations work > with "~". > >> >> Do you know a solution for that? 
>> >> Cheers, >> >> Marius >> >> On 2011-04-01, at 19:22 , David Winsemius wrote: >> >>> >>> On Apr 1, 2011, at 12:32 PM, Marius Hofert wrote: >>> >>>> Dear expeRts, >>>> >>>> I know I can't paste expressions in the normal way, but I just >>>> couldn't figure out >>>> how to get the following (I want to paste a character vector to >>>> an expression vector) >>>> right with bquote() or substitute. >>>> >>>> vec1 <- c("a", expression(tilde(b)), "c") >>>> vec2 <- c("1", "2", "3") >>>> main <- as.expression(paste(vec1, vec2)) >>>> plot(0,0, main=main[2]) >>> >>> Do not use `paste` ... it coerces your expression to a character >>> value ... use `c` instead: >>> >>>> main <- as.expression(c(vec1, vec2)) >>>> plot(1,1, main=main[2]) >>> >>> And then, even the as.expression is superfluous: >>> >>>> main <- c(vec1, vec2) >>>> plot(1,1, main=main[2]) >>> >>> -- >>> >>> David Winsemius, MD >>> West Hartford, CT >>> >> > > David Winsemius, MD > West Hartford, CT > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT From dwinsemius at comcast.net Fri Apr 1 20:40:20 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Fri, 1 Apr 2011 14:40:20 -0400 Subject: [R] How to paste a vector of expressions and a character vector? In-Reply-To: <3A7884B9-2F89-41FD-B1F6-88FD57F433F6@comcast.net> References: <42460E70-FC82-41D6-B6E2-F3F92712F935@web.de> <3A7884B9-2F89-41FD-B1F6-88FD57F433F6@comcast.net> Message-ID: <95DB95FD-338F-4196-9769-1467E39D4DDF@comcast.net> On Apr 1, 2011, at 2:34 PM, David Winsemius wrote: > > On Apr 1, 2011, at 1:56 PM, Marius Hofert wrote: > >> Dear David, >> >> thanks for your reply. 
The paste() meant to paste the vectors vec1 >> and vec2 together, so main should be a vector of length 3 of the form >> "a 1", "b 2", "c 3" # with b being tilde(b) >> However, with c() it is a vector of length 6: >> expression("a", tilde(b), "c", "1", "2", "3") > > lls <-c("a", "tilde(b)","c") nns <- 1:3 #Even easier would be: > mains3 <- parse(text=paste(lls,"~",nns,sep="")) > mains3 expression(a~1, tilde(b)~2, c~3) attr(,"srcfile") attr(,"wholeSrcref") a~1 tilde(b)~2 c~3 > mains <- bquote(.(parse(text=paste(lls,"~",nns,sep=""))) ) > mains > > #expression(a~1, tilde(b)~2, c~3) > #attr(,"srcfile") > > attr(,"wholeSrcref") > a~1 > tilde(b)~2 > c~3 > #----- > > > mains[2] > #expression(tilde(b)~2) > > > plot(0,0, main=mains[2]) > > > This would also work: > > > mains2 <- as.expression(parse(text=paste(lls,"~",nns,sep=""))) > > mains2 > expression(a~1, tilde(b)~2, c~3) > attr(,"srcfile") > > attr(,"wholeSrcref") > a~1 > tilde(b)~2 > c~3 > > I try to avoid using spaces in expressions and instead use "~"'s > because I don't really understand how spaces get parsed in > expression operations, and I do know how plotmath operations work > with "~". > >> >> Do you know a solution for that? >> >> Cheers, >> >> Marius >> >> On 2011-04-01, at 19:22 , David Winsemius wrote: >> >>> >>> On Apr 1, 2011, at 12:32 PM, Marius Hofert wrote: >>> >>>> Dear expeRts, >>>> >>>> I know I can't paste expressions in the normal way, but I just >>>> couldn't figure out >>>> how to get the following (I want to paste a character vector to >>>> an expression vector) >>>> right with bquote() or substitute. >>>> >>>> vec1 <- c("a", expression(tilde(b)), "c") >>>> vec2 <- c("1", "2", "3") >>>> main <- as.expression(paste(vec1, vec2)) >>>> plot(0,0, main=main[2]) >>> >>> Do not use `paste` ... it coerces your expression to a character >>> value ... 
use `c` instead:
>>>
>>>> main <- as.expression(c(vec1, vec2))
>>>> plot(1,1, main=main[2])
>>>
>>> And then, even the as.expression is superfluous:
>>>
>>>> main <- c(vec1, vec2)
>>>> plot(1,1, main=main[2])
>>>
>>> --
>>>
>>> David Winsemius, MD
>>> West Hartford, CT
>>>
>>
>
> David Winsemius, MD
> West Hartford, CT
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
West Hartford, CT

From Steve_Friedman at nps.gov Fri Apr 1 20:48:31 2011
From: Steve_Friedman at nps.gov (Steve_Friedman at nps.gov)
Date: Fri, 1 Apr 2011 14:48:31 -0400
Subject: [R] filled contour plot with contour lines
Message-ID:

I'm stumped, can anyone find my error in this sequence.

for(j in 1:(varsize[4]-1))
temp <- get.var.ncdf(nc=input, varid="p_foraging",c(1,1,j),c(varsize[1],varsize[2],1))
filled.contour(x, y, temp, color = terrain.colors,
    plot.title = title(main = paste("Everglades Wood Stork Foraging Potential \nYear", (2000+j)),
        xlab = "UTM East", ylab = "UTM North") ,
    plot.axes = { contour(temp, add=T)
        axis(1, seq(450000 , 580000, by = 10000))
        axis(2, seq(2800000,4000000, by = 10000)) },
    key.title = title(main="Probability") ,
    key.axes = axis(4, seq(0 , 1 , by = 0.1))

The routine will work in a modified form without adding the coordinates (the axis lines) but when I include these the routine produces various errors, such as "dimension mismatch", or "unexpected end encountered."

I tried to follow the example on filled.contour help page.

Thanks in advance
Steve

Steve Friedman Ph. D.
Ecologist / Spatial Statistical Analyst
Everglades and Dry Tortugas National Park
950 N Krome Ave (3rd Floor)
Homestead, Florida 33034
Steve_Friedman at nps.gov
Office (305) 224 - 4282
Fax (305) 224 - 4147

From smckinney at bccrc.ca Fri Apr 1 21:44:23 2011
From: smckinney at bccrc.ca (Steven McKinney)
Date: Fri, 1 Apr 2011 12:44:23 -0700
Subject: [R] Linear Model with curve fitting parameter?
In-Reply-To:
References: <8512_1301611104_1301611104_AANLkTikOxmcE=oMvHBuB8x61fxXJwnexXJkr+Qp3Tawp@mail.gmail.com>
Message-ID:

> -----Original Message-----
> From: stephen sefick [mailto:ssefick at gmail.com]
> Sent: April-01-11 5:44 AM
> To: Steven McKinney
> Cc: R help
> Subject: Re: [R] Linear Model with curve fitting parameter?
>
> Setting Z=Q-A would be the incorrect dimensions. I could Z=Q/A.

I suspect this is confusion about what Q is.
I was presuming that the Q in this following formula was log(Q) with Q from the original data.

> >> I have taken the log of the data that I have and this is the model
> >> formula without the K part
> >>
> >> lm(Q~offset(A)+R+S, data=x)

If the model is

Q=K*A*(R^r)*(S^s)

then

log(Q) = log(K) + log(A) + r*log(R) + s*log(S)

Rearranging yields

log(Q) - log(A) = log(K) + r*log(R) + s*log(S)

so what I labeled 'Z' below is

Z = log(Q) - log(A) = log(Q/A)

so

Z = log(K) + r*log(R) + s*log(S)

and a linear model fit of

Z ~ log(R) + log(S)

will yield parameter estimates for the linear equation

E(Z) = B0 + B1*log(R) + B2*log(S)

(E(Z) = expected value of Z)

so
B0 estimate is an estimate of log(K)
B1 estimate is an estimate of r
B2 estimate is an estimate of s

More details and careful notation will eventually lead to a reasonable description and analysis strategy.

Best

Steve McKinney

> Is fitting a nls model the same as fitting an ols? These data are
> hydraulic data from ~47 sites. To access predictive ability I am
> removing one site fitting a new model and then accessing the fit with
> a myriad of model assessment criteria.
I should get the same answer > with ols vs nls? Thank you for all of your help. > > Stephen > > On Thu, Mar 31, 2011 at 8:34 PM, Steven McKinney wrote: > > > >> -----Original Message----- > >> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of stephen > sefick > >> Sent: March-31-11 3:38 PM > >> To: R help > >> Subject: [R] Linear Model with curve fitting parameter? > >> > >> I have a model Q=K*A*(R^r)*(S^s) > >> > >> A, R, and S are data I have and K is a curve fitting parameter. I > >> have linearized as > >> > >> log(Q)=log(K)+log(A)+r*log(R)+s*log(S) > >> > >> I have taken the log of the data that I have and this is the model > >> formula without the K part > >> > >> lm(Q~offset(A)+R+S, data=x) > >> > >> What is the formula that I should use? > > > > Let Z = Q - A for your logged data. > > > > Fitting lm(Z ~ R + S, data = x) should yield > > intercept parameter estimate = estimate for log(K) > > R coefficient parameter estimate = estimate for r > > S coefficient parameter estimate = estimate for s > > > > > > > > Steven McKinney > > > > Statistician > > Molecular Oncology and Breast Cancer Program > > British Columbia Cancer Research Centre > > > > > > > >> > >> Thanks for all of your help. I can provide a subset of data if necessary. > >> > >> > >> > >> -- > >> Stephen Sefick > >> ____________________________________ > >> | Auburn University | > >> | Biological Sciences | > >> | 331 Funchess Hall | > >> | Auburn, Alabama | > >> | 36849 | > >> |___________________________________| > >> | sas0025 at auburn.edu | > >> | http://www.auburn.edu/~sas0025 
| > >> |___________________________________| > >> > >> Let's not spend our time and resources thinking about things that are > >> so little or so large that all they really do for us is puff us up and > >> make us feel like gods. We are mammals, and have not exhausted the > >> annoying little problems of being mammals. > >> > >> -K. Mullis > >> > >> "A big computer, a complex algorithm and a long time does not equal science." > >> > >> -Robert Gentleman > >> ______________________________________________ > >> R-help at r-project.org mailing list > >> https://stat.ethz.ch/mailman/listinfo/r-help > >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > >> and provide commented, minimal, self-contained, reproducible code. > > > > > > -- > Stephen Sefick > ____________________________________ > | Auburn University | > | Biological Sciences | > | 331 Funchess Hall | > | Auburn, Alabama | > | 36849 | > |___________________________________| > | sas0025 at auburn.edu | > | http://www.auburn.edu/~sas0025 | > |___________________________________| > > Let's not spend our time and resources thinking about things that are > so little or so large that all they really do for us is puff us up and > make us feel like gods. We are mammals, and have not exhausted the > annoying little problems of being mammals. > > -K. Mullis > > "A big computer, a complex algorithm and a long time does not equal science." > > 
-Robert Gentleman From ehlers at ucalgary.ca Fri Apr 1 21:57:12 2011 From: ehlers at ucalgary.ca (Peter Ehlers) Date: Fri, 01 Apr 2011 12:57:12 -0700 Subject: [R] Fisher's test In-Reply-To: References: Message-ID: <4D962E18.4090802@ucalgary.ca> On 2011-04-01 10:19, Jim Silverton wrote: > I have a matrix with 2 columns and I want to do fishers exact test for these > with the totals for each row being 100 say. > > The data has the form: > 23 12 > 32 21 > 12 2 > > and these represents the tables: > > 23 12 > 77 88 > > 32 21 > 78 79 > > 12 2 > 88 98 > > > How do I use apply to speed up aclculation of the fisher.exact test? > apply(yourMatrix, 1, function(x) fisher.test(cbind(x, 100 - x))) or, if you only want the P-value: apply(yourMatrix, 1, function(x) fisher.test(cbind(x, 100 - x))$p.value) Peter Ehlers From Boris.Vasiliev at forces.gc.ca Fri Apr 1 21:59:11 2011 From: Boris.Vasiliev at forces.gc.ca (Boris.Vasiliev at forces.gc.ca) Date: Fri, 1 Apr 2011 15:59:11 -0400 Subject: [R] lattice xscale.components: different ticks on top/bottom axis In-Reply-To: References: <201103101854.p2AIsL0A020203@hypatia.math.ethz.ch> Message-ID: <201104012000.p31K0INU017710@hypatia.math.ethz.ch> > On Fri, Mar 11, 2011 at 12:28 AM, > wrote: > > Good afternoon, > > > > I am trying to create a plot where the bottom and top axes have the > > same scale but different tick marks. I tried user-defined > > xscale.component function but it does not produce desired results. > > Can anybody suggest where my use of xscale.component > > function is incorrect? > > > > For example, the code below tries to create a plot where horizontal > > axes limits are c(0,10), top axis has ticks at odd integers, and > > bottom axis has ticks at even integers. > > > > library(lattice) > > > > df <- data.frame(x=1:10,y=1:10) > > > > xscale.components.A <- function(...,user.value=NULL) { > > # get default axes definition list; print user.value > > ans <- xscale.components.default(...) 
> > print(user.value) > > > > # start with the same definition of bottom and top axes > > ans$top <- ans$bottom > > > > # - bottom labels > > ans$bottom$labels$at <- seq(0,10,by=2) > > ans$bottom$labels$labels <- paste("B",seq(0,10,by=2),sep="-") > > > > # - top labels > > ans$top$labels$at <- seq(1,9,by=2) > > ans$top$labels$labels <- paste("T",seq(1,9,by=2),sep="-") > > > > # return axes definition list > > return(ans) > > } > > > > oltc <- xyplot(y~x,data=df, > > > > scales=list(x=list(limits=c(0,10),at=0:10,alternating=3)), > > xscale.components=xscale.components.A, > > user.value=1) > > print(oltc) > > > > The code generates a figure with incorrectly placed bottom and top > > labels. Bottom labels "B-0", "B-2", ... are at 0, 1, ... and top > > labels "T-1", "T-3", ... are at 0, 1, ... When axis-function runs out > > of labels, it replaces labels with NA. > > > > It appears that lattice uses top$ticks$at to place labels and > > top$labels$labels for labels. Is there a way to override this > > behaviour (other than to expand the "labels$labels" vector to be as > > long as "ticks$at" vector and set necessary elements to "")? > > Well, $ticks$at is used to place the ticks, and > $labels$at is used to place the labels. They should typically > be the same, but you have changed one and not the other. > Everything seems to work if you set $ticks$at to the same > values as $labels$at: > > > ## - bottom labels > + ans$bottom$ticks$at <- seq(0,10,by=2) > ans$bottom$labels$at <- seq(0,10,by=2) > ans$bottom$labels$labels <- paste("B",seq(0,10,by=2),sep="-") > > ## - top labels > + ans$top$ticks$at <- seq(1,9,by=2) > ans$top$labels$at <- seq(1,9,by=2) > ans$top$labels$labels <- paste("T",seq(1,9,by=2),sep="-") > > > > Also, can user-parameter be passed into xscale.components() > > function? (For example, locations and labels of ticks on the top > > axis). 
In the code above, print(user.value) returns NULL even > > though in the xyplot() call user.value is 1. > > No. Unrecognized arguments are passed to the panel function > only, not to any other function. However, you can always > define an inline > function: > > oltc <- xyplot(y~x,data=df, > scales=list(x=list(limits=c(0,10), at = 0:10, > alternating=3)), > xscale.components = function(...) > xscale.components.A(..., user.value=1)) > > Hope that helps (and sorry for the late reply). > > -Deepayan > Deepayan, Thank you very much for your reply. It makes things a bit clearer. In other words, in the list prepared by xscale.components(), vectors $ticks$at and $labels$at must be the same. If only every second tick is to be labelled then every second label should be set explicitly to an empty string: ans$bottom$ticks$at <- seq(0,10,by=1) ans$bottom$labels$at <- seq(0,10,by=1) ans$bottom$labels$labels <- paste("B",seq(0,10,by=1),sep="-") # replace "B-1", "B-3", ... with "" ans$bottom$labels$labels[seq(2,11,by=2)] <- "" Sincerely, Boris. From ehlers at ucalgary.ca Fri Apr 1 22:53:00 2011 From: ehlers at ucalgary.ca (Peter Ehlers) Date: Fri, 01 Apr 2011 13:53:00 -0700 Subject: [R] mean in the boxplot In-Reply-To: <103EAFAD-E888-43E3-A8FC-EEFC698A2F53@comcast.net> References: <7785EFE1CC2D9F4A96F6A415BC1C2BD0168DD487@PC6-46.pogb.cancer.ucl.ac.uk> <103EAFAD-E888-43E3-A8FC-EEFC698A2F53@comcast.net> Message-ID: <4D963B2C.5010802@ucalgary.ca> On 2011-04-01 10:28, David Winsemius wrote: > > On Apr 1, 2011, at 1:20 PM, Yan Jiao wrote: > >> Dear R users, >> >> How to show mean in the boxplot instead of median ? > > Read ... help(boxplot) > > Use boxplot to create a set of values, substitute th emean for the > groups then pass the mangled set of values to bxp(). > Another option is to add the means as points after plotting the boxplot with: boxplot(....) points(1:n, Means) This is particularly useful if you want both the medians and the means on the plot. 
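A minimal sketch of the points() approach, run on simulated data. The data frame `dat`, its columns, and the vector `means` are invented for illustration and do not come from the thread:

```r
# Simulated data: three groups of 20 values each (made up for this sketch).
set.seed(1)
dat <- data.frame(
  group = factor(rep(c("a", "b", "c"), each = 20)),
  value = c(rnorm(20, 10), rnorm(20, 12), rnorm(20, 11))
)

# One mean per factor level, in the same order as the boxes are drawn.
means <- tapply(dat$value, dat$group, mean)

# Draw the usual boxplot (median lines included), then overlay the means.
boxplot(value ~ group, data = dat)
points(seq_along(means), means, pch = 19, col = "red")
```

The boxes sit at x positions 1, 2, ..., one per factor level, which is why seq_along(means) lines the points up with them.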
If you don't want the median line, you can get rid of it by setting medcol="transparent" in the boxplot() call. But then again, the whole point of a boxplot is to represent general distributional shape for which the median is surely more effective. Peter Ehlers From GJett at itsi.com Fri Apr 1 23:05:47 2011 From: GJett at itsi.com (Guy Jett) Date: Fri, 1 Apr 2011 21:05:47 +0000 Subject: [R] XYPlot Conditioning Variable in Specific, Non-Alphanumeric Order. -- Resending with corrected .txt file In-Reply-To: References: Message-ID: Thank you very much Deepayan. This does exactly what I needed! Cheers, Guy Jett gjett at itsi.com -----Original Message----- From: Deepayan Sarkar [mailto:deepayan.sarkar at gmail.com] Sent: Friday, April 01, 2011 6:51 AM To: Guy Jett Cc: r-help at R-project.org Subject: Re: [R] XYPlot Conditioning Variable in Specific, Non-Alphanumeric Order. -- Resending with corrected .txt file You need to specify the order of the levels explicitly (to override the default). Here is how to do it for one, you can similarly do the other: > levels(df$Offset) [1] "T" "U" "V" "Y" "Z" "A" "B" "C" "D" "E" "F" "G" "H" > df$Offset <- factor(df$Offset, + levels = c("T", "U", "V", "Y", "Z", "A", + "B", "C", "D", "E", "F", "G", "H")) > levels(df$Offset) [1] "T" "U" "V" "Y" "Z" "A" "B" "C" "D" "E" "F" "G" "H" Once you make these changes, your original call should work as desired. -Deepayan From jshleap at Dal.Ca Fri Apr 1 22:47:55 2011 From: jshleap at Dal.Ca (Jose Hleap Lozano) Date: Fri, 01 Apr 2011 17:47:55 -0300 Subject: [R] hc2Newick is different than th hclust dendrogram Message-ID: <20110401174755.20670jzo8r2746pc@wm4.dal.ca> Hi R helpers... I am having troubles because of the discrepancy between the dendrogram plotted from hclust and what is wrote in the hc2Newick file. I've got a matrix C: > hc <- hclust(dist(C)) > plot(hc) with the: > write(hc2Newick(hc),file='test.newick') both things draw completely different "trees"... 
I have also tried with the raw distance matrix D and also the agnes function, but the same happens. The hclus and agnes dendrogram is logical, whereas the newick tree is not. Thanks for any help! -- Jose Sergio Hleap Lozano, M. Sc. Ph. D. Student, Dalhousie University Researcher, SQUALUS Foundation From ehlers at ucalgary.ca Fri Apr 1 23:08:17 2011 From: ehlers at ucalgary.ca (Peter Ehlers) Date: Fri, 01 Apr 2011 14:08:17 -0700 Subject: [R] filled contour plot with contour lines In-Reply-To: References: Message-ID: <4D963EC1.2020509@ucalgary.ca> Aren't you missing a set of parentheses? I can't run your code since it's not reproducible, but to my aging eyes it seems that you need a set of '{}' around the contents of your loop: for(j in 1:(varsize[4]-1)) { loop stuff } Peter Ehlers On 2011-04-01 11:48, Steve_Friedman at nps.gov wrote: > > I'm stumped, can anyone find my error in this sequence. > > for(j in 1:(varsize[4]-1)) > temp<- get.var.ncdf(nc=input, > varid="p_foraging",c(1,1,j),c(varsize[1],varsize[2],1)) > filled.contour(x, y, temp, color = terrain.colors, > plot.title = title(main = paste("Everglades Wood Stork > Foraging Potential \nYear", (2000+j)), > xlab = "UTM East", ylab = "UTM North") , > plot.axes = { contour(temp, add=T) > axis(1, seq(450000 , 580000, by = 10000)) > axis(2, seq(2800000,4000000, by = 10000)) }, > key.title = title(main="Probability") , > key.axes = axis(4, seq(0 , 1 , by = 0.1)) > > > The routine will work in a modified form without adding the coordinates > (the axis lines) but when I include these the routine produces various > errors, such as "dimension mismatch", or "unexpected end encountered." > > I tried to follow the example on filled.contour help page. > > Thanks in advance > Steve > > Steve Friedman Ph. D. 
> Ecologist / Spatial Statistical Analyst > Everglades and Dry Tortugas National Park > 950 N Krome Ave (3rd Floor) > Homestead, Florida 33034 > > Steve_Friedman at nps.gov > Office (305) 224 - 4282 > Fax (305) 224 - 4147 > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From Steve_Friedman at nps.gov Fri Apr 1 23:19:51 2011 From: Steve_Friedman at nps.gov (Steve_Friedman at nps.gov) Date: Fri, 1 Apr 2011 17:19:51 -0400 Subject: [R] filled contour plot with contour lines In-Reply-To: <4D963EC1.2020509@ucalgary.ca> Message-ID: Hi Peter, Thanks for taking the time to consider the problem. I realize the procedure is not reproducible. I use a 4 dimensional netCDF file (4.2 GB) in size to pull data into this process. Nobody in their right mind should work with such things. It's a spatial temporal database with 10 years of daily data in a irregular area spanning 405 x 287 cells. Anyway, I tried your suggestion adding a { in front of the for loop and a closing } following the last line. It did not work. Steve Friedman Ph. D. Ecologist / Spatial Statistical Analyst Everglades and Dry Tortugas National Park 950 N Krome Ave (3rd Floor) Homestead, Florida 33034 Steve_Friedman at nps.gov Office (305) 224 - 4282 Fax (305) 224 - 4147 Peter Ehlers To "Steve_Friedman at nps.gov" 04/01/2011 05:08 PM cc "r-help at r-project.org" Subject Re: [R] filled contour plot with contour lines Aren't you missing a set of parentheses? I can't run your code since it's not reproducible, but to my aging eyes it seems that you need a set of '{}' around the contents of your loop: for(j in 1:(varsize[4]-1)) { loop stuff } Peter Ehlers On 2011-04-01 11:48, Steve_Friedman at nps.gov wrote: > > I'm stumped, can anyone find my error in this sequence. 
> > for(j in 1:(varsize[4]-1)) > temp<- get.var.ncdf(nc=input, > varid="p_foraging",c(1,1,j),c(varsize[1],varsize[2],1)) > filled.contour(x, y, temp, color = terrain.colors, > plot.title = title(main = paste("Everglades Wood Stork > Foraging Potential \nYear", (2000+j)), > xlab = "UTM East", ylab = "UTM North") , > plot.axes = { contour(temp, add=T) > axis(1, seq(450000 , 580000, by = 10000)) > axis(2, seq(2800000,4000000, by = 10000)) }, > key.title = title(main="Probability") , > key.axes = axis(4, seq(0 , 1 , by = 0.1)) > > > The routine will work in a modified form without adding the coordinates > (the axis lines) but when I include these the routine produces various > errors, such as "dimension mismatch", or "unexpected end encountered." > > I tried to follow the example on filled.contour help page. > > Thanks in advance > Steve > > Steve Friedman Ph. D. > Ecologist / Spatial Statistical Analyst > Everglades and Dry Tortugas National Park > 950 N Krome Ave (3rd Floor) > Homestead, Florida 33034 > > Steve_Friedman at nps.gov > Office (305) 224 - 4282 > Fax (305) 224 - 4147 > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
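A minimal sketch of the loop from this thread with braces around the whole loop body and the filled.contour() call closed, run on made-up data instead of the netCDF file. The array `slices`, the grid in `x` and `y`, and the two-year range are assumptions for illustration; in the original post `temp` comes from get.var.ncdf():

```r
# Stand-in for the netCDF slices: one matrix layer per year (invented values).
x <- seq(450000, 580000, length.out = 40)     # UTM East
y <- seq(2800000, 2900000, length.out = 30)   # UTM North
nyears <- 2
slices <- array(runif(length(x) * length(y) * nyears),
                dim = c(length(x), length(y), nyears))

for (j in seq_len(nyears)) {   # braces keep the plot call inside the loop
  temp <- slices[, , j]        # in the thread: get.var.ncdf(...)
  filled.contour(x, y, temp, color.palette = terrain.colors,
    plot.title = title(main = paste("Foraging Potential\nYear", 2000 + j),
                       xlab = "UTM East", ylab = "UTM North"),
    plot.axes = {
      contour(x, y, temp, add = TRUE)   # same x and y as the filled plot
      axis(1, seq(450000, 580000, by = 10000))
      axis(2, seq(2800000, 2900000, by = 10000))
    },
    key.title = title(main = "Probability"),
    key.axes = axis(4, seq(0, 1, by = 0.1)))
}
```

Without the braces only the assignment to `temp` is repeated, so a single plot is drawn from the last slice after the loop has finished.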
From sahin at hsu-hh.de Fri Apr 1 23:24:48 2011 From: sahin at hsu-hh.de (sahin at hsu-hh.de) Date: Fri, 1 Apr 2011 23:24:48 +0200 Subject: [R] qcc.overdispersion-test Message-ID: <20110401232448.20386ud3gd2ka1ls@webmail.unibw-hamburg.de> Hi all, I have made an overdispersion test for a data set and get the following result Overdispersion test Obs.Var/Theor.Var Statistic p-value poisson data 16.24267 47444.85 0 after deleting the outliers from the data set I get the following result Overdispersion test Obs.Var/Theor.Var Statistic p-value poisson data 16.27106 0 1 The problem is that the overdispersion parameter does not really change, but how could the p-value and the statistic change so that the null hypothesis is accepted?? I would be very grateful if someone could help me? From Steve_Friedman at nps.gov Fri Apr 1 23:38:00 2011 From: Steve_Friedman at nps.gov (Steve_Friedman at nps.gov) Date: Fri, 1 Apr 2011 17:38:00 -0400 Subject: [R] filled contour plot with contour lines In-Reply-To: <4D963EC1.2020509@ucalgary.ca> Message-ID: Hi Peter, Thanks for taking the time to consider the problem. I realize the procedure is not reproducible. I use a 4 dimensional netCDF file (4.2 GB) in size to pull data into this process. Nobody in their right mind should work with such things. It's a spatial temporal database with 10 years of daily data in a irregular area spanning 405 x 287 cells. Anyway, I tried your suggestion adding a { in front of the for loop and a closing } following the last line. It did not work. Steve Friedman Ph. D. Ecologist / Spatial Statistical Analyst Everglades and Dry Tortugas National Park 950 N Krome Ave (3rd Floor) Homestead, Florida 33034 Steve_Friedman at nps.gov Office (305) 224 - 4282 Fax (305) 224 - 4147 Peter Ehlers To "Steve_Friedman at nps.gov" 04/01/2011 05:08 PM cc "r-help at r-project.org" Subject Re: [R] filled contour plot with contour lines Aren't you missing a set of parentheses? 
I can't run your code since it's not reproducible, but to my aging eyes it seems that you need a set of '{}' around the contents of your loop: for(j in 1:(varsize[4]-1)) { loop stuff } Peter Ehlers On 2011-04-01 11:48, Steve_Friedman at nps.gov wrote: > > I'm stumped, can anyone find my error in this sequence. > > for(j in 1:(varsize[4]-1)) > temp<- get.var.ncdf(nc=input, > varid="p_foraging",c(1,1,j),c(varsize[1],varsize[2],1)) > filled.contour(x, y, temp, color = terrain.colors, > plot.title = title(main = paste("Everglades Wood Stork > Foraging Potential \nYear", (2000+j)), > xlab = "UTM East", ylab = "UTM North") , > plot.axes = { contour(temp, add=T) > axis(1, seq(450000 , 580000, by = 10000)) > axis(2, seq(2800000,4000000, by = 10000)) }, > key.title = title(main="Probability") , > key.axes = axis(4, seq(0 , 1 , by = 0.1)) > > > The routine will work in a modified form without adding the coordinates > (the axis lines) but when I include these the routine produces various > errors, such as "dimension mismatch", or "unexpected end encountered." > > I tried to follow the example on filled.contour help page. > > Thanks in advance > Steve > > Steve Friedman Ph. D. > Ecologist / Spatial Statistical Analyst > Everglades and Dry Tortugas National Park > 950 N Krome Ave (3rd Floor) > Homestead, Florida 33034 > > Steve_Friedman at nps.gov > Office (305) 224 - 4282 > Fax (305) 224 - 4147 > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From clint at ecy.wa.gov Fri Apr 1 23:47:49 2011 From: clint at ecy.wa.gov (Clint Bowman) Date: Fri, 1 Apr 2011 14:47:49 -0700 (PDT) Subject: [R] filled contour plot with contour lines In-Reply-To: References: Message-ID: Steve, I use filled.contour but have a semicolon, ";", between the two axis calls. 
Do you need one after "axis(1, seq(450000 , 580000, by = 10000))" ? Clint -- Clint Bowman INTERNET: clint at ecy.wa.gov Air Quality Modeler INTERNET: clint at math.utah.edu Department of Ecology VOICE: (360) 407-6815 PO Box 47600 FAX: (360) 407-7534 Olympia, WA 98504-7600 USPS: PO Box 47600, Olympia, WA 98504-7600 Parcels: 300 Desmond Drive, Lacey, WA 98503-1274 On Fri, 1 Apr 2011, Steve_Friedman at nps.gov wrote: > Hi Peter, > > Thanks for taking the time to consider the problem. I realize the procedure > is not reproducible. I use a 4 dimensional netCDF file (4.2 GB) in size to > pull data into this process. Nobody in their right mind should work with > such things. It's a spatial temporal database with 10 years of daily data > in a irregular area spanning 405 x 287 cells. > > Anyway, I tried your suggestion adding a { in front of the for loop and a > closing } following the last line. > > It did not work. > > > Steve Friedman Ph. D. > Ecologist / Spatial Statistical Analyst > Everglades and Dry Tortugas National Park > 950 N Krome Ave (3rd Floor) > Homestead, Florida 33034 > > Steve_Friedman at nps.gov > Office (305) 224 - 4282 > Fax (305) 224 - 4147 > > > > Peter Ehlers > ca> To > "Steve_Friedman at nps.gov" > 04/01/2011 05:08 > PM cc > "r-help at r-project.org" > > Subject > Re: [R] filled contour plot with > contour lines > > > > > > > > > > > Aren't you missing a set of parentheses? > I can't run your code since it's not reproducible, but to > my aging eyes it seems that you need a set of '{}' around > the contents of your loop: > > for(j in 1:(varsize[4]-1)) { loop stuff } > > Peter Ehlers > > On 2011-04-01 11:48, Steve_Friedman at nps.gov wrote: >> >> I'm stumped, can anyone find my error in this sequence. 
>> >> for(j in 1:(varsize[4]-1)) >> temp<- get.var.ncdf(nc=input, >> varid="p_foraging",c(1,1,j),c(varsize[1],varsize[2],1)) >> filled.contour(x, y, temp, color = terrain.colors, >> plot.title = title(main = paste("Everglades Wood Stork >> Foraging Potential \nYear", (2000+j)), >> xlab = "UTM East", ylab = "UTM North") , >> plot.axes = { contour(temp, add=T) >> axis(1, seq(450000 , 580000, by = 10000)) >> axis(2, seq(2800000,4000000, by = 10000)) }, >> key.title = title(main="Probability") , >> key.axes = axis(4, seq(0 , 1 , by = 0.1)) >> >> >> The routine will work in a modified form without adding the coordinates >> (the axis lines) but when I include these the routine produces various >> errors, such as "dimension mismatch", or "unexpected end encountered." >> >> I tried to follow the example on filled.contour help page. >> >> Thanks in advance >> Steve >> >> Steve Friedman Ph. D. >> Ecologist / Spatial Statistical Analyst >> Everglades and Dry Tortugas National Park >> 950 N Krome Ave (3rd Floor) >> Homestead, Florida 33034 >> >> Steve_Friedman at nps.gov >> Office (305) 224 - 4282 >> Fax (305) 224 - 4147 >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From arrayprofile at yahoo.com Fri Apr 1 23:56:06 2011 From: arrayprofile at yahoo.com (array chip) Date: Fri, 1 Apr 2011 14:56:06 -0700 (PDT) Subject: [R] function in argument Message-ID: <463943.82411.qm@web125807.mail.ne1.yahoo.com> An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From dwinsemius at comcast.net Fri Apr 1 23:59:51 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Fri, 1 Apr 2011 17:59:51 -0400 Subject: [R] filled contour plot with contour lines In-Reply-To: References: Message-ID: <2A4A3667-DECA-4D70-881B-AD35789E1456@comcast.net> On Apr 1, 2011, at 5:38 PM, Steve_Friedman at nps.gov wrote: > Hi Peter, > > Thanks for taking the time to consider the problem. I realize the > procedure > is not reproducible. I use a 4 dimensional netCDF file (4.2 GB) in > size to > pull data into this process. Nobody in their right mind should work > with > such things. It's a spatial temporal database with 10 years of daily > data > in a irregular area spanning 405 x 287 cells. > > Anyway, I tried your suggestion adding a { in front of the for loop > and a > closing } following the last line. That suggestion was probably intended to get you a more informative error message: for(j in 1:(varsize[4]-1) ) temp <- get.var.ncdf( nc=input, varid="p_foraging", c(1,1,j), c(varsize[1], varsize[2], 1) ) # I think Peter is correct. That loop will complete before the call to filled.contour, # so you will only have the last version of temp plot. filled.contour(x, y, temp, color = terrain.colors, plot.title = title( main = paste("Everglades Wood Stork Foraging Potential \nYear", (2000+j)), #that "double call" to title looks incomplete as well. The comma looks premature. # Shouldn't it be something like: plot.title = bquote("Everglades Wood Stork Foraging Potential \nYear", .(2000+j) ) #(And since the for loop is already complete `j` will not exist. # Which is probably the source of the error you are getting.) xlab = "UTM East", ylab = "UTM North") , plot.axes = { contour(temp, add=T) axis(1, seq(450000 , 580000, by = 10000)) axis(2, seq(2800000,4000000, by = 10000)) }, key.title = title(main="Probability") , key.axes = axis(4, seq(0 , 1 , by = 0.1)) > > It did not work. > > > Steve Friedman Ph. D. 
> Ecologist / Spatial Statistical Analyst > Everglades and Dry Tortugas National Park > 950 N Krome Ave (3rd Floor) > Homestead, Florida 33034 > > Steve_Friedman at nps.gov > Office (305) 224 - 4282 > Fax (305) 224 - 4147 > > > > Peter Ehlers > > ca> To > "Steve_Friedman at nps.gov" > 04/01/2011 05:08 > > PM cc > "r-help at r-project.org" > > > Subject > Re: [R] filled contour plot with > contour lines > > > > > > > > > > > Aren't you missing a set of parentheses? > I can't run your code since it's not reproducible, but to > my aging eyes it seems that you need a set of '{}' around > the contents of your loop: > > for(j in 1:(varsize[4]-1)) { loop stuff } > > Peter Ehlers > > On 2011-04-01 11:48, Steve_Friedman at nps.gov wrote: >> >> I'm stumped, can anyone find my error in this sequence. >> >> for(j in 1:(varsize[4]-1)) >> temp<- get.var.ncdf(nc=input, >> varid="p_foraging",c(1,1,j),c(varsize[1],varsize[2],1)) >> filled.contour(x, y, temp, color = terrain.colors, >> plot.title = title(main = paste("Everglades Wood Stork >> Foraging Potential \nYear", (2000+j)), >> xlab = "UTM East", ylab = "UTM North") , >> plot.axes = { contour(temp, add=T) >> axis(1, seq(450000 , 580000, by = 10000)) >> axis(2, seq(2800000,4000000, by = 10000)) }, >> key.title = title(main="Probability") , >> key.axes = axis(4, seq(0 , 1 , by = 0.1)) >> >> >> The routine will work in a modified form without adding the >> coordinates >> (the axis lines) but when I include these the routine produces >> various >> errors, such as "dimension mismatch", or "unexpected end >> encountered." >> >> I tried to follow the example on filled.contour help page. >> >> Thanks in advance >> Steve >> >> Steve Friedman Ph. D. 
>> Ecologist / Spatial Statistical Analyst >> Everglades and Dry Tortugas National Park >> 950 N Krome Ave (3rd Floor) >> Homestead, Florida 33034 >> >> Steve_Friedman at nps.gov >> Office (305) 224 - 4282 >> Fax (305) 224 - 4147 >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT From dwinsemius at comcast.net Sat Apr 2 00:06:43 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Fri, 1 Apr 2011 18:06:43 -0400 Subject: [R] function in argument In-Reply-To: <463943.82411.qm@web125807.mail.ne1.yahoo.com> References: <463943.82411.qm@web125807.mail.ne1.yahoo.com> Message-ID: <065AB2E1-BD10-4CD0-9B20-FD1B25B183F3@comcast.net> On Apr 1, 2011, at 5:56 PM, array chip wrote: > Hi, I tried to pass the function dist() as an argument, but got an > error > message. However, almost the same code with mean() as the function > to be passed, > it works ok. > > foo<-function (x, > xfun = dist) > { > xfun(x) > } > > foo(matrix(1:100,nrow=5)) > Error in foo(matrix(1:100, nrow = 5)) : could not find function "xfun" > Works on my machine. > foo(matrix(1:100,nrow=5)) 1 2 3 4 2 4.472136 3 8.944272 4.472136 4 13.416408 8.944272 4.472136 5 17.888544 13.416408 8.944272 4.472136 You have probably overwritten `dist` with a non-functional object. Try rm(dist) and re-run: > > foo<-function (x, > xfun = mean) > { > xfun(x) > } > > foo(1:10) > [1] 5.5 > > what am I missing here? Thinking about your full workspace, I would guess. 
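A small sketch of the masking problem described above, assuming a numeric object named `dist` was created in the workspace by accident (the value 42 is arbitrary, and `foo2` is a hypothetical variant, not code from the thread):

```r
foo <- function(x, xfun = dist) xfun(x)
m <- matrix(1:100, nrow = 5)

ok <- foo(m)      # works while `dist` still refers to stats::dist

dist <- 42        # hypothetical accident: a number now masks dist()
err <- tryCatch(foo(m), error = function(e) conditionMessage(e))
print(err)        # an error message instead of a distance object

rm(dist)          # remove the masking object
again <- foo(m)   # stats::dist is found again

# Qualifying the default with its namespace avoids the lookup problem;
# `::` is enough here because dist is exported from stats.
foo2 <- function(x, xfun = stats::dist) xfun(x)
res2 <- foo2(m)
```

The default argument is looked up as an ordinary variable, so it picks up whatever object called `dist` comes first on the search path, function or not.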
-- David Winsemius, MD West Hartford, CT From arrayprofile at yahoo.com Sat Apr 2 00:06:53 2011 From: arrayprofile at yahoo.com (array chip) Date: Fri, 1 Apr 2011 15:06:53 -0700 (PDT) Subject: [R] function in argument In-Reply-To: <463943.82411.qm@web125807.mail.ne1.yahoo.com> References: <463943.82411.qm@web125807.mail.ne1.yahoo.com> Message-ID: <463487.54423.qm@web125810.mail.ne1.yahoo.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From wwwhsd at gmail.com Sat Apr 2 00:10:46 2011 From: wwwhsd at gmail.com (Henrique Dallazuanna) Date: Fri, 1 Apr 2011 19:10:46 -0300 Subject: [R] function in argument In-Reply-To: <463487.54423.qm@web125810.mail.ne1.yahoo.com> References: <463943.82411.qm@web125807.mail.ne1.yahoo.com> <463487.54423.qm@web125810.mail.ne1.yahoo.com> Message-ID: Or: foo<-function (x, xfun = dist) { match.fun(xfun)(x) } On Fri, Apr 1, 2011 at 7:06 PM, array chip wrote: > OK, I figured it out, need to add stats::: before dist > > foo<-function (x, >    xfun = stats:::dist) > { > xfun(x) > } > > > John > > > ________________________________ > > To: r-help at r-project.org > Sent: Fri, April 1, 2011 2:56:06 PM > Subject: [R] function in argument > > Hi, I tried to pass the function dist() as an argument, but got an error > message. However, almost the same code with mean() as the function to be passed, > > it works ok. > > foo<-function (x, >    xfun = dist) > { > xfun(x) > } > > foo(matrix(1:100,nrow=5)) > Error in foo(matrix(1:100, nrow = 5)) : could not find function "xfun" > > > foo<-function (x, >    xfun = mean) > { > xfun(x) > } > > foo(1:10) > [1] 5.5 > > what am I missing here? > > Thanks > > John > > 
[[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40" S 49° 16' 22" O From allen_pam at hotmail.com Sat Apr 2 00:21:28 2011 From: allen_pam at hotmail.com (Pam Allen) Date: Fri, 1 Apr 2011 17:21:28 -0500 (CDT) Subject: [R] Lattice wireframe or cloud plot with different colours by a group Message-ID: <1301696488288-3421296.post@n4.nabble.com> I have a question about wireframe 3-D plots and how to apply colors. I have a large dataset of river flow (m^3/s) over time, and I have coded these flows based on their height. I would like to produce a wireframe plot that colors the graph based on the flow code, i.e. I would like high flows to be red, medium to be green and low flows to be blue. Here is some sample data with the basic wireframe plot: flow.dat=cbind.data.frame(flow=sin(2*pi/53*c(1:3000))+1,day=as.numeric(format(as.Date(c(1:3000)), format="%j")), year=as.numeric(format(as.Date(c(1:3000)), format="%Y")),grp=c(rep(c("1.high","2.med","3.low"),1000))) wireframe(flow~day*year, data=flow.dat, shade=T) Is there any way to specify what colours are passed to the plot? I.e. wireframe(flow~day*year, data=flow.dat, shade=T, groups=grp, col.group=c("#FF3030","#551A8B","#43CD80")) I would also be happy if I could do this with a cloud plot, but I can't get the colors to plot correctly. 
cloud(flow~day*year, data=flow.dat, shade=T, groups=grp,
    col.group=c("#FF3030","#43CD80","#1E90FF"), pch=20)

Any help is much appreciated! Thank you.

-Pam Allen

allen_pam at hotmail.com

--
View this message in context: http://r.789695.n4.nabble.com/Lattice-wireframe-or-cloud-plot-with-different-colours-by-a-group-tp3421296p3421296.html
Sent from the R help mailing list archive at Nabble.com.

From ehlers at ucalgary.ca Sat Apr 2 00:48:00 2011
From: ehlers at ucalgary.ca (Peter Ehlers)
Date: Fri, 01 Apr 2011 15:48:00 -0700
Subject: [R] filled contour plot with contour lines
In-Reply-To: <2A4A3667-DECA-4D70-881B-AD35789E1456@comcast.net>
References: <2A4A3667-DECA-4D70-881B-AD35789E1456@comcast.net>
Message-ID: <4D965620.10403@ucalgary.ca>

Steve,

I was just about to send off another suggestion which began with ...

  Try the following or wait for David W. to chime in.

But I see that he's already done that. Anyway, in case it's
still of some use:

First I would check that you don't have a variable 'T' in
your workspace. (Never use T in place of TRUE.)

Then I would check that x and y are as they should be.

Then I would make sure that temp is of the correct type
(a matrix) by letting j take a few values (1,2,varsize[4]-1),
generating temp, and running str(temp).
I assume that varsize[4] is >= 2.

Then I would run the following stripped-down version of
your loop (note that I modified your contour() call):

for(j in 1:2) {
  temp <- get.var.ncdf(nc=input, varid="p_foraging",
                       c(1,1,j), c(varsize[1],varsize[2],1))
  filled.contour(x, y, temp,
    plot.axes = { contour(x, y, temp, add = TRUE)
                  axis(1, seq(450000 , 580000, by = 10000))
                  axis(2, seq(2800000,4000000, by = 10000)) }
  )   # closes filled.contour()
}

If that gives reasonable results, you can add the prettifying
statements back in. If you still get errors, then either I'm
out-to-lunch (quite possible!) or you need to use debug() to
figure out what's going on.
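
For readers without the netCDF file, the two points in this thread (braces around the loop body, and passing x and y to contour() inside plot.axes) can be tried on built-in data. This is only a sketch under stated assumptions: the volcano matrix stands in for one time slice of temp, the offset 10 * j fakes a change from slice to slice, and x, y and the titles are invented for illustration. None of it comes from Steve's data.

```r
# Stand-in coordinates: one x value per row and one y value per column of z.
x <- seq_len(nrow(volcano))   # hypothetical, must be increasing
y <- seq_len(ncol(volcano))   # hypothetical, must be increasing

for (j in 1:2) {                      # braces keep BOTH statements in the loop
  temp <- volcano + 10 * j            # placeholder for get.var.ncdf(...) slice j

  # filled.contour() needs length(x) == nrow(z) and length(y) == ncol(z).
  stopifnot(length(x) == nrow(temp), length(y) == ncol(temp))

  filled.contour(x, y, temp,
    color.palette = terrain.colors,
    plot.title = title(main = paste("Stand-in slice", j),
                       xlab = "x", ylab = "y"),
    plot.axes = {
      contour(x, y, temp, add = TRUE) # same x and y as the filled plot
      axis(1)
      axis(2)
    },
    key.title = title(main = "Height"))
}
```

Because j is used only inside the braces, the title always matches the slice that was just drawn.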
Peter Ehlers On 2011-04-01 14:59, David Winsemius wrote: > > On Apr 1, 2011, at 5:38 PM, Steve_Friedman at nps.gov wrote: > >> Hi Peter, >> >> Thanks for taking the time to consider the problem. I realize the >> procedure >> is not reproducible. I use a 4 dimensional netCDF file (4.2 GB) in >> size to >> pull data into this process. Nobody in their right mind should work >> with >> such things. It's a spatial temporal database with 10 years of daily >> data >> in a irregular area spanning 405 x 287 cells. >> >> Anyway, I tried your suggestion adding a { in front of the for loop >> and a >> closing } following the last line. > > That suggestion was probably intended to get you a more informative > error message: > > for(j in 1:(varsize[4]-1) ) > temp<- get.var.ncdf( > nc=input, > varid="p_foraging", > c(1,1,j), > c(varsize[1], varsize[2], 1) > ) > > # I think Peter is correct. That loop will complete before the call to > filled.contour, > # so you will only have the last version of temp plot. > > filled.contour(x, y, temp, color = terrain.colors, > plot.title = title( main = > paste("Everglades Wood Stork Foraging Potential \nYear", > (2000+j)), > > #that "double call" to title looks incomplete as well. The comma looks > premature. > # Shouldn't it be something like: > plot.title = bquote("Everglades Wood Stork Foraging Potential > \nYear", .(2000+j) ) > > #(And since the for loop is already complete `j` will not exist. > # Which is probably the source of the error you are getting.) > > xlab = "UTM East", ylab = "UTM North") , > plot.axes = { contour(temp, add=T) > axis(1, seq(450000 , 580000, by = 10000)) > axis(2, seq(2800000,4000000, by = 10000)) }, > key.title = title(main="Probability") , > key.axes = axis(4, seq(0 , 1 , by = 0.1)) >> >> It did not work. >> >> >> Steve Friedman Ph. D. 
>> Ecologist / Spatial Statistical Analyst >> Everglades and Dry Tortugas National Park >> 950 N Krome Ave (3rd Floor) >> Homestead, Florida 33034 >> >> Steve_Friedman at nps.gov >> Office (305) 224 - 4282 >> Fax (305) 224 - 4147 >> >> >> >> Peter Ehlers >> > >> ca> To >> "Steve_Friedman at nps.gov" >> 04/01/2011 05:08 >> >> PM cc >> "r-help at r-project.org" >> >> >> Subject >> Re: [R] filled contour plot with >> contour lines >> >> >> >> >> >> >> >> >> >> >> Aren't you missing a set of parentheses? >> I can't run your code since it's not reproducible, but to >> my aging eyes it seems that you need a set of '{}' around >> the contents of your loop: >> >> for(j in 1:(varsize[4]-1)) { loop stuff } >> >> Peter Ehlers >> >> On 2011-04-01 11:48, Steve_Friedman at nps.gov wrote: >>> >>> I'm stumped, can anyone find my error in this sequence. >>> >>> for(j in 1:(varsize[4]-1)) >>> temp<- get.var.ncdf(nc=input, >>> varid="p_foraging",c(1,1,j),c(varsize[1],varsize[2],1)) >>> filled.contour(x, y, temp, color = terrain.colors, >>> plot.title = title(main = paste("Everglades Wood Stork >>> Foraging Potential \nYear", (2000+j)), >>> xlab = "UTM East", ylab = "UTM North") , >>> plot.axes = { contour(temp, add=T) >>> axis(1, seq(450000 , 580000, by = 10000)) >>> axis(2, seq(2800000,4000000, by = 10000)) }, >>> key.title = title(main="Probability") , >>> key.axes = axis(4, seq(0 , 1 , by = 0.1)) >>> >>> >>> The routine will work in a modified form without adding the >>> coordinates >>> (the axis lines) but when I include these the routine produces >>> various >>> errors, such as "dimension mismatch", or "unexpected end >>> encountered." >>> >>> I tried to follow the example on filled.contour help page. >>> >>> Thanks in advance >>> Steve >>> >>> Steve Friedman Ph. D. 
>>> Ecologist / Spatial Statistical Analyst >>> Everglades and Dry Tortugas National Park >>> 950 N Krome Ave (3rd Floor) >>> Homestead, Florida 33034 >>> >>> Steve_Friedman at nps.gov >>> Office (305) 224 - 4282 >>> Fax (305) 224 - 4147 >>> >>> ______________________________________________ >>> R-help at r-project.org mailing list >>> https://stat.ethz.ch/mailman/listinfo/r-help >>> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >>> and provide commented, minimal, self-contained, reproducible code. >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > > David Winsemius, MD > West Hartford, CT > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From ritacarreira at hotmail.com Sat Apr 2 00:49:00 2011 From: ritacarreira at hotmail.com (Rita Carreira) Date: Fri, 1 Apr 2011 22:49:00 +0000 Subject: [R] package MICE, squeeze function, calling several variables at once Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From skfglades at gmail.com Sat Apr 2 01:23:39 2011 From: skfglades at gmail.com (skfglades at gmail.com) Date: Fri, 1 Apr 2011 19:23:39 -0400 Subject: [R] filled contour plot with contour lines In-Reply-To: <4D965620.10403@ucalgary.ca> References: <2A4A3667-DECA-4D70-881B-AD35789E1456@comcast.net> <4D965620.10403@ucalgary.ca> Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From dwinsemius at comcast.net Sat Apr 2 01:46:59 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Fri, 1 Apr 2011 19:46:59 -0400 Subject: [R] Lattice wireframe or cloud plot with different colours by a group In-Reply-To: <1301696488288-3421296.post@n4.nabble.com> References: <1301696488288-3421296.post@n4.nabble.com> Message-ID: <5E2A1321-BB93-47C3-8080-48743E4B21A8@comcast.net> On Apr 1, 2011, at 6:21 PM, Pam Allen wrote: > I have a question about wireframe 3-D plots and how to apply > colors. I have > a large dataset of river flow (m^3/s) over time, and I have coded > these > flows based on their height. I would like to produce a wireframe > plot that > colors the graph based on the flow code, i.e. I would like high > flows to be > red, medium to be green and low flows to be blue. Here is some > sample data > with the basic wireframe plot: > > flow.dat=cbind.data.frame(flow=sin(2*pi/53*c(1:3000)) > +1,day=as.numeric(format(as.Date(c(1:3000)), > format="%j")), year=as.numeric(format(as.Date(c(1:3000)), > format="%Y")),grp=c(rep(c("1.high","2.med","3.low"),1000))) It does not look to me that high flows are properly grouped with `grp`. > > wireframe(flow~day*year, data=flow.dat, shade=T) wireframe(flow~day*year, data=flow.dat, drape=TRUE, col.regions=c("#FF3030","#551A8B","#43CD80"), at=c(0,.6,1.3,1.8)) Because of the coding at the corner and the long thin polygons you have specified with your day*year splits you get slanting colors. > > Is there any way to specify what colours are passed to the plot? I.e. > wireframe(flow~day*year, data=flow.dat, shade=T, groups=grp, > col.group=c("#FF3030","#551A8B","#43CD80")) > > I would also be happy if I could do this with a cloud plot, but I > can't get > the colors to plot correctly. > cloud(flow~day*year, data=flow.dat, shade=T, groups=grp, > col.group=c("#FF3030","#43CD80","#1E90FF"), pch=20) > > Any help is much appreciated! Thank you. 
> > -Pam Allen > > allen_pam at hotmail.com > > > -- > View this message in context: http://r.789695.n4.nabble.com/Lattice-wireframe-or-cloud-plot-with-different-colours-by-a-group-tp3421296p3421296.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT From bjameshunter at gmail.com Sat Apr 2 02:39:14 2011 From: bjameshunter at gmail.com (Ben Hunter) Date: Fri, 1 Apr 2011 17:39:14 -0700 Subject: [R] Putting a loop in a function Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From jwiley.psych at gmail.com Sat Apr 2 04:08:05 2011 From: jwiley.psych at gmail.com (Joshua Wiley) Date: Fri, 1 Apr 2011 19:08:05 -0700 Subject: [R] Putting a loop in a function In-Reply-To: References: Message-ID: Hi Ben, I am having some trouble figuring out what it is exactly you want. A workable example would be nice and then (even if just manually) typed out what you would like it to return based on a given input. I almost think you just want grep(). grep("test", c("test", "not", "test2", "not"), value = TRUE) [1] "test" "test2" using your example: grep("string", d[,"Column.You.Want"], value = TRUE) On Fri, Apr 1, 2011 at 5:39 PM, Ben Hunter wrote: > I'm stuck here. The following code works great if I just use it on the > command line in a 'for' loop. Once I try to define it as a function as seen > below, I get bad results. I'm trying to return a vector of column indices > based on a short string that is contained in a selection (length of about > 70) of long column names that changes from time to time. Is there an > existing function that will do this for me? 
The more I think about this
> problem, the more I feel there has to be a function out there. I've not
> found it.
>
>
> ind <- function(col, string, vector){  # this is really the problem. I don't
> feel like I'm declaring these arguments properly.
>  indices <- vector(mode = 'numeric') # I am not entirely confident that
> this use is necessary. Is indices <- c() okay?

indices <- c() would sort of work, but vector() is better. Also, if
you actually want your function to be returning integers, you should
instantiate indices as an integer class vector, not numeric.

>  for (i in 1:length(col)){
>    num <- regexpr(str, col[i])

str() is a function, and as far as I can tell, a variable "str" is not
defined anywhere in your function or your function's arguments. Did you
mean "string"?

>    if (num != -1){
>         indices <- c(vector, i) # I've also had success with indices <-
> append(indices, i)

why are you combining "vector" with i? Nothing accumulates this way:
every match overwrites indices with c(vector, i), i.e. the value of
"vector" followed by the latest i. Is that what you want, or did you
mean c(indices, i)?

>  }
> }
> indices
> }
>
> ind(d[,'Column.I.want'], 'string', 'output.vector')
>
> Am I wrong here? I've read that the last statement in the function is what
> it will return, and what I want is a vector of integers.

yes, if return() is not explicitly specified inside the function, then
it will return the output of the last statement.

>
> Thanks,

Thank you for providing code of what you tried. For future reference,
it is good to at least provide usable input data and then show us the
output you would like. For example:

"Blah blah blah, I have a vector d <- c(1, 2, 3), how can I find the
average of this (i.e., 2)?"

To which you would get the answer: "mean(d)" or some such.

If grep() is not actually what you are after, can you let us know a
little bit more about what you want?

Hope this helps,

Josh

>
> -Ben
>
?[[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/ From josephpaulson at gmail.com Sat Apr 2 07:08:45 2011 From: josephpaulson at gmail.com (Joseph N. Paulson) Date: Sat, 2 Apr 2011 01:08:45 -0400 Subject: [R] Matrix manipulation Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From djmuser at gmail.com Sat Apr 2 07:56:36 2011 From: djmuser at gmail.com (Dennis Murphy) Date: Fri, 1 Apr 2011 22:56:36 -0700 Subject: [R] Matrix manipulation In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From djandrija at gmail.com Sat Apr 2 08:33:56 2011 From: djandrija at gmail.com (andrija djurovic) Date: Sat, 2 Apr 2011 08:33:56 +0200 Subject: [R] Matrix manipulation In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From bhh at xs4all.nl Sat Apr 2 09:51:49 2011 From: bhh at xs4all.nl (Berend Hasselman) Date: Sat, 2 Apr 2011 02:51:49 -0500 (CDT) Subject: [R] Putting a loop in a function In-Reply-To: References: Message-ID: <1301730709132-3421768.post@n4.nabble.com> Ben Hunter wrote: > > I'm stuck here. The following code works great if I just use it on the > command line in a 'for' loop. Once I try to define it as a function as > seen > below, I get bad results. I'm trying to return a vector of column indices > based on a short string that is contained in a selection (length of about > 70) of long column names that changes from time to time. Is there an > existing function that will do this for me? 
The more I think about this > problem, the more I feel there has to be a function out there. I've not > found it. > > > ind <- function(col, string, vector){ # this is really the problem. I > don't > feel like I'm declaring these arguments properly. > indices <- vector(mode = 'numeric') # I am not entirely confident > that > this use is necessary. Is indices <- c() okay? > for (i in 1:length(col)){ > num <- regexpr(str, col[i]) > if (num != -1){ > indices <- c(vector, i) # I've also had success with indices > <- > append(indices, i) > } > } > indices > } > > ind(d[,'Column.I.want'], 'string', 'output.vector') > > Am I wrong here? I've read that the last statement in the function is what > it will return, and what I want is a vector of integers. > It is not clear what you want. Give a simple example with input and desired output. About your function: - what's the purpose of the argument. It isn't being used in your function. - what is the argument? Character? Object? in the function body vector is used as a built-in function. Berend -- View this message in context: http://r.789695.n4.nabble.com/Putting-a-loop-in-a-function-tp3421439p3421768.html Sent from the R help mailing list archive at Nabble.com. From szimine at gmail.com Sat Apr 2 10:21:55 2011 From: szimine at gmail.com (stan zimine) Date: Sat, 2 Apr 2011 10:21:55 +0200 Subject: [R] R gui on windows how to force to always show the last line of output Message-ID: Hi. Googled but did not found the answer for the following little issue. how to force R gui on windows (maybe a specific setting) to always show the last line of output in the window console. My program in R makes measurements every 5 mins in indefinite loop and prints results in the console. The problem: last messages are not visible, The scrolling bar of the gui console gets shorter. I.e. you have to scroll for the last messages. Thanks if anybody knows the sol to this prob. 
SZ

From erich.neuwirth at univie.ac.at Sat Apr 2 12:02:40 2011
From: erich.neuwirth at univie.ac.at (Erich Neuwirth)
Date: Sat, 02 Apr 2011 12:02:40 +0200
Subject: [R] fonts in mosaic
In-Reply-To: <4D949CC8.5080100@univie.ac.at>
References: <4D93757D.4050200@univie.ac.at> <4D939897.3000704@univie.ac.at> <4D949CC8.5080100@univie.ac.at>
Message-ID: <4D96F440.2020809@univie.ac.at>

The problem of changing the default font used for the windows graphics
device seems to be quite complicated.
As I wrote already

windowsFonts(myfont="Consolas")
par(family="myfont")

will make the windows graphics device use Consolas as the
default font for labels.

Therefore, a function call like
plot(1:10,main="Title")
will use Consolas for labeling.
It will, however, not change the default for

mosaic(UCBAdmissions)

The mosaic plot will still use the original default font
for the device, which in a standard configuration
seems to be Arial.

par(family=NULL)
will reset the current default font to the original default font,
so after this command
plot(1:10,main="Title")
will use Arial again.

The font family used in graphics from package grid is set by
explicitly using the gpar parameter in calls to functions from that package.
gpar has default values, but there seems to be no way
of changing the default values of the gpar object.

Does anybody on the list know if there is a way of changing the default
values of gpar objects?

From bt_jannis at yahoo.de Sat Apr 2 13:15:16 2011
From: bt_jannis at yahoo.de (Jannis)
Date: Sat, 02 Apr 2011 13:15:16 +0200
Subject: [R] R gui on windows how to force to always show the last line of output
In-Reply-To:
References:
Message-ID: <4D970544.2080609@yahoo.de>

If I were you, I would use another GUI. The standard GUI (to me) seems
to be very basic and lacks many handy features (e.g. autosave etc).
I am not sure which GUI does what you want, but just try a few (list is
sorted from intuitive to more complicated):

-RStudio (still in Beta but very nice)
-TinnR
-Rkward
-Emacs
-ESS (I am quite sure that this one does what you want)
-Eclipse - StatET

Just try a few....

Jannis

On 04/02/2011 10:21 AM, stan zimine wrote:
> Hi.
> Googled but did not found the answer for the following little issue.
>
> how to force R gui on windows (maybe a specific setting) to always
> show the last line of output in the window console.
>
>
> My program in R makes measurements every 5 mins in indefinite loop and
> prints results in the console.
>
> The problem: last messages are not visible, The scrolling bar of the
> gui console gets shorter. I.e. you have to scroll for the last
> messages.
>
> Thanks if anybody knows the sol to this prob.
>
> SZ
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

From ivowel at gmail.com Sat Apr 2 14:24:07 2011
From: ivowel at gmail.com (ivo welch)
Date: Sat, 2 Apr 2011 08:24:07 -0400
Subject: [R] uniroot speed and vectorization?
Message-ID:

curiosity---given that vector operations are so much faster than
scalar operations, would it make sense to make uniroot vectorized? if
I read the uniroot docs correctly, uniroot() calls an external C
routine which seems to be a scalar function. that must be slow. I am
thinking a vectorized version would be useful for an example such as

of <- function(x,a) ( log(x)+x+a )
uniroot( of, c( 1e-7, 100 ), a=rnorm(1000000) )

I would have timed this, but I would have used a 'for' loop, which is
probably not the "R way" of doing this. has someone already written a
package that does this?
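
For reference, one way to get a root per value of a with base R alone is to call uniroot() once per element through vapply(). This is a sketch, not a vectorized solver and not a claim about any existing package: the sample size of 1000, the seed and the tolerance are assumptions chosen so the example runs quickly, and the interval is the one from the post.

```r
# Objective from the post: solve log(x) + x + a = 0 for x, given a.
of <- function(x, a) log(x) + x + a

set.seed(1)          # assumed, only to make the sketch reproducible
a <- rnorm(1000)     # 1000 instead of 1e6 so it runs in a moment

# uniroot() handles one scalar problem at a time, so call it once per a[i].
# On (1e-7, 100) the function is negative at the left end and positive at
# the right end for any a from a standard normal, so a root is bracketed.
roots <- vapply(
  a,
  function(ai) uniroot(of, interval = c(1e-7, 100), a = ai, tol = 1e-9)$root,
  numeric(1)
)

# Largest residual over all problems; should be close to zero.
max(abs(of(roots, a)))
```

This is still a loop in disguise, so it mainly tidies the code; it does not remove the per-call overhead the post asks about.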
/iaw ---- Ivo Welch (ivo.welch at brown.edu, ivo.welch at gmail.com) From dwinsemius at comcast.net Sat Apr 2 14:27:53 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Sat, 2 Apr 2011 08:27:53 -0400 Subject: [R] R gui on windows how to force to always show the last line of output In-Reply-To: References: Message-ID: On Apr 2, 2011, at 4:21 AM, stan zimine wrote: > Hi. > Googled but did not found the answer for the following little issue. > > how to force R gui on windows (maybe a specific setting) to always > show the last line of output in the window console. > > > My program in R makes measurements every 5 mins in indefinite loop and > prints results in the console. > > The problem: last messages are not visible, The scrolling bar of the > gui console gets shorter. I.e. you have to scroll for the last > messages. > > Thanks if anybody knows the sol to this prob. You may want to add flush.console() to the code. -- David Winsemius, MD West Hartford, CT From dwinsemius at comcast.net Sat Apr 2 13:14:28 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Sat, 2 Apr 2011 07:14:28 -0400 Subject: [R] Lattice wireframe or cloud plot with different colours by a group In-Reply-To: <5E2A1321-BB93-47C3-8080-48743E4B21A8@comcast.net> References: <1301696488288-3421296.post@n4.nabble.com> <5E2A1321-BB93-47C3-8080-48743E4B21A8@comcast.net> Message-ID: On Apr 1, 2011, at 7:46 PM, David Winsemius wrote: > > On Apr 1, 2011, at 6:21 PM, Pam Allen wrote: > >> I have a question about wireframe 3-D plots and how to apply >> colors. I have >> a large dataset of river flow (m^3/s) over time, and I have coded >> these >> flows based on their height. I would like to produce a wireframe >> plot that >> colors the graph based on the flow code, i.e. I would like high >> flows to be >> red, medium to be green and low flows to be blue. 
Here is some >> sample data >> with the basic wireframe plot: >> >> flow.dat=cbind.data.frame(flow=sin(2*pi/53*c(1:3000)) >> +1,day=as.numeric(format(as.Date(c(1:3000)), >> format="%j")), year=as.numeric(format(as.Date(c(1:3000)), >> format="%Y")),grp=c(rep(c("1.high","2.med","3.low"),1000))) > > It does not look to me that high flows are properly grouped with > `grp`. > >> >> wireframe(flow~day*year, data=flow.dat, shade=T) > > wireframe(flow~day*year, data=flow.dat, drape=TRUE, > col.regions=c("#FF3030","#551A8B","#43CD80"), > at=c(0,.6,1.3,1.8)) > > Because of the coding at the corner and the long thin polygons you > have specified with your day*year splits you get slanting colors. If you want to use grp as the color index then you need first to convert it to something that can be used as an index (with as.numeric), then use that in "[" to pull from a vector of colors: cloud(flow~day*year, data=flow.dat, col=c("#FF3030", "#551A8B", "#43CD80")[ as.numeric(flow.dat$grp)], pch=20) -- David > >> >> Is there any way to specify what colours are passed to the plot? >> I.e. >> wireframe(flow~day*year, data=flow.dat, shade=T, groups=grp, >> col.group=c("#FF3030","#551A8B","#43CD80")) >> >> I would also be happy if I could do this with a cloud plot, but I >> can't get >> the colors to plot correctly. >> cloud(flow~day*year, data=flow.dat, shade=T, groups=grp, >> col.group=c("#FF3030","#43CD80","#1E90FF"), pch=20) >> >> Any help is much appreciated! Thank you. >> >> -Pam Allen >> >> allen_pam at hotmail.com >> >> >> -- >> View this message in context: http://r.789695.n4.nabble.com/Lattice-wireframe-or-cloud-plot-with-different-colours-by-a-group-tp3421296p3421296.html >> Sent from the R help mailing list archive at Nabble.com. 
>> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > > David Winsemius, MD > West Hartford, CT > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT From jrkrideau at yahoo.ca Sat Apr 2 16:04:37 2011 From: jrkrideau at yahoo.ca (John Kane) Date: Sat, 2 Apr 2011 07:04:37 -0700 (PDT) Subject: [R] how to do t-test in r for difference of mean In-Reply-To: Message-ID: <92074.9017.qm@web38404.mail.mud.yahoo.com> http://www.gardenersown.co.uk/Education/Lectures/R/basics.htm#t_test --- On Thu, 3/31/11, arkajyoti jana wrote: > From: arkajyoti jana > Subject: [R] how to do t-test in r for difference of mean > To: r-help at r-project.org > Received: Thursday, March 31, 2011, 9:07 AM > I am trying to do t-test to test > whether the mean of one one column of the > data frame is greater then another. please help me out. > > -- > Arkajyoti Jana > M. Phil/ 2nd semester > Centre for Economic Studies and planning > School of Social Sciences > Jawaharlal Nehru University > New Delhi-67 > > ??? [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org > mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, > reproducible code. 
> From dwinsemius at comcast.net Sat Apr 2 17:16:44 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Sat, 2 Apr 2011 11:16:44 -0400 Subject: [R] fonts in mosaic In-Reply-To: <4D96F440.2020809@univie.ac.at> References: <4D93757D.4050200@univie.ac.at> <4D939897.3000704@univie.ac.at> <4D949CC8.5080100@univie.ac.at> <4D96F440.2020809@univie.ac.at> Message-ID: <62AAD21A-4414-4D5B-81C2-8D768935B452@comcast.net> On Apr 2, 2011, at 6:02 AM, Erich Neuwirth wrote: > The problem of changing the default font used fpr the windows graphics > device seems to be quite complicated. > As I wrote already > > windowsFonts(myfont="Consolas") > par(family="myfont") > > will make the windows graphics device use Consolas as the > defaut font for labels. > > Therefore, a function call like > plot(1:10,main="Title") > will use Consolas for labeling. > It will, however, not change the default for > > mosaic(UCBAdmissions) > > The mosaic plot will still use the original default font > for the device, which in a standard configuration > seems to be Arial. > > par(family=NULL) > will reset the current default font to the original default font, > so after this command > plot(1:10,main="Title") > will use Arial again. > > The the font family used in graphics from package grid is set by > explicitly using gpar paramenter in calls to functions from that > package. > gpar has default values, but there seems to be no way > of changing the default values of the gpar object. > > Does anybody on the list know if there is a way of changing the > default > values of gpar objects? Make one with your desired values: gpCon <- gpar(fontfamily="Consolas", fontsize=8) Then pass it to the appropriate argument. I don't have a Windows device nor a copy of that font but the following changes several of the defaults on a Mac if I use "Times" instead of"Consolas". 
mosaic(Titanic, main="Test of gpar", gp_varnames=gpCon, gp_labels=gpCon ) You _might_ be able to change the defaults with grid:::set.gpar() or with grid:::grid.Call("L_setGPar", list(...)) but my efforts did not succeed (unless success is measured by undesirable side-effects on one's system.) Neither of those functions is "exposed", so it would appear that their use is not recommended (or documented). I was able to change the internal values returned by get.gpar() but passing them on to mosaic() did not succeed. -- David Winsemius, MD West Hartford, CT From liuwensui at gmail.com Sat Apr 2 17:18:12 2011 From: liuwensui at gmail.com (Wensui Liu) Date: Sat, 2 Apr 2011 11:18:12 -0400 Subject: [R] recommendation on r scripting tutorial? Message-ID: Good morning, dear listers I am wondering if you could recommend a good tutorial / book for r scripting. thank you so much in advance! WenSui Liu Credit Risk Manager, 53 Bancorp wensui.liu at 53.com 513-295-4370 From Thomas.Adams at noaa.gov Sat Apr 2 17:55:24 2011 From: Thomas.Adams at noaa.gov (Thomas.Adams at noaa.gov) Date: Sat, 02 Apr 2011 11:55:24 -0400 Subject: [R] recommendation on r scripting tutorial? In-Reply-To: References: Message-ID: Wensui, use google and search for "r stats scripts" and find among others: http://cran.r-project.org/doc/contrib/Lemon-kickstart/kr_scrpt.html Tom ----- Original Message ----- From: Wensui Liu Date: Saturday, April 2, 2011 11:22 am Subject: [R] recommendation on r scripting tutorial? To: r-help > Good morning, dear listers > > I am wondering if you could recommend a good tutorial / book for r scripting. > > thank you so much in advance! > > WenSui Liu > Credit Risk Manager, 53 Bancorp > wensui.liu at 53.com > 513-295-4370 > > ______________________________________________ > R-help at r-project.org mailing list > > PLEASE do read the posting guide > and provide commented, minimal, self-contained, reproducible code. 
From irene_vrbik at hotmail.com Sat Apr 2 17:06:05 2011 From: irene_vrbik at hotmail.com (statfan) Date: Sat, 2 Apr 2011 10:06:05 -0500 (CDT) Subject: [R] truncated distributions Message-ID: <1301756765123-3422245.post@n4.nabble.com> I am sampling from the truncated multivariate student t distribution "rtmvt" in the package {tmvtnorm}. My question is about the mean vector. Is it possible to define a mean vector outside of the truncated region? Thank you in advance for any help. -- View this message in context: http://r.789695.n4.nabble.com/truncated-distributions-tp3422245p3422245.html Sent from the R help mailing list archive at Nabble.com. From dwinsemius at comcast.net Sat Apr 2 18:41:17 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Sat, 2 Apr 2011 12:41:17 -0400 Subject: [R] truncated distributions In-Reply-To: <1301756765123-3422245.post@n4.nabble.com> References: <1301756765123-3422245.post@n4.nabble.com> Message-ID: <43E7A77A-0AD9-4EAB-971F-1550865487B2@comcast.net> On Apr 2, 2011, at 11:06 AM, statfan wrote: > I am sampling from the truncated multivariate student t distribution > "rtmvt" > in the package {tmvtnorm}. My question is about the mean vector. Is > it > possible to define a mean vector outside of the truncated region? > Thank you > in advance for any help. In what sense are you interpreting the word "mean"? The "mean" in the specification of a truncated distribution is probably not going to be the expected value of a random variable from such a distribution, but rather refers to the parent distribution's mean. 
> print(x=rtmvnorm(10, mean=0, sigma=1, lower=0.5, upper=1), digits=3) [,1] [1,] 0.984 [2,] 0.528 [3,] 0.529 [4,] 0.550 [5,] 0.832 [6,] 0.788 [7,] 0.775 [8,] 0.631 [9,] 0.832 [10,] 0.558 -- David Winsemius, MD West Hartford, CT From bjameshunter at gmail.com Sat Apr 2 19:35:09 2011 From: bjameshunter at gmail.com (Ben Hunter) Date: Sat, 2 Apr 2011 10:35:09 -0700 Subject: [R] Putting a loop in a function In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From dwinsemius at comcast.net Sat Apr 2 20:47:57 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Sat, 2 Apr 2011 14:47:57 -0400 Subject: [R] fonts in mosaic In-Reply-To: <62AAD21A-4414-4D5B-81C2-8D768935B452@comcast.net> References: <4D93757D.4050200@univie.ac.at> <4D939897.3000704@univie.ac.at> <4D949CC8.5080100@univie.ac.at> <4D96F440.2020809@univie.ac.at> <62AAD21A-4414-4D5B-81C2-8D768935B452@comcast.net> Message-ID: <6B26672F-0160-4539-9EC6-DD23E4DBE59F@comcast.net> On Apr 2, 2011, at 11:16 AM, David Winsemius wrote: > > On Apr 2, 2011, at 6:02 AM, Erich Neuwirth wrote: > >> The problem of changing the default font used fpr the windows >> graphics >> device seems to be quite complicated. >> As I wrote already >> >> windowsFonts(myfont="Consolas") >> par(family="myfont") >> >> will make the windows graphics device use Consolas as the >> defaut font for labels. >> >> Therefore, a function call like >> plot(1:10,main="Title") >> will use Consolas for labeling. >> It will, however, not change the default for >> >> mosaic(UCBAdmissions) >> >> The mosaic plot will still use the original default font >> for the device, which in a standard configuration >> seems to be Arial. >> >> par(family=NULL) >> will reset the current default font to the original default font, >> so after this command >> plot(1:10,main="Title") >> will use Arial again. 
>> >> The the font family used in graphics from package grid is set by >> explicitly using gpar paramenter in calls to functions from that >> package. >> gpar has default values, but there seems to be no way >> of changing the default values of the gpar object. >> >> Does anybody on the list know if there is a way of changing the >> default >> values of gpar objects? > > Make one with your desired values: > > gpCon <- gpar(fontfamily="Consolas", fontsize=8) > > Then pass it to the appropriate argument. I don't have a Windows > device nor a copy of that font but the following changes several of > the defaults on a Mac if I use "Times" instead of"Consolas". > > mosaic(Titanic, main="Test of gpar", > gp_varnames=gpCon, > gp_labels=gpCon ) > > You _might_ be able to change the defaults with grid:::set.gpar() > or with grid:::grid.Call("L_setGPar", list(...)) but my efforts did > not succeed (unless success is measured by undesirable side-effects > on one's system.) Neither of those functions is "exposed", so it > would appear that their use is not recommended (or documented). I > was able to change the internal values returned by get.gpar() but > passing them on to mosaic() did not succeed. 
And then I re-read Prof Ripley's reply and the light dawned: pdf.options(family="Times") pdf(); mosaic(Titanic, main="Test of gpar") dev.off() # Works My remaining puzzlement is how to apply this strategy to the Mac screen device: quartz(family="serif"); mosaic(Titanic, main="Test of gpar") # Does not work # Nor does quartzFonts("serif"); mosaic(Titanic, main="Test of gpar") # Nor any other of several permuations of arguments -- David Winsemius, MD West Hartford, CT From Peter.Brecknock at bp.com Sat Apr 2 21:26:40 2011 From: Peter.Brecknock at bp.com (Pete Brecknock) Date: Sat, 2 Apr 2011 14:26:40 -0500 (CDT) Subject: [R] cumsum while maintaining NA In-Reply-To: <1301710743873-3421513.post@n4.nabble.com> References: <1301710743873-3421513.post@n4.nabble.com> Message-ID: <1301772400362-3422619.post@n4.nabble.com> Here's one way ..... Lines<-"x1 x2 x3 x4 x5 x6 NA NA 3 4 NA NA 5 3 4 NA NA NA 7 3 4 4 NA NA 11 3 4 5 NA NA 67 4 4 NA NA NA" d <- read.table(textConnection(Lines), header = TRUE, colClasses=c("integer")) closeAllConnections() res = t(apply(d, 1, function(x) ave(x,is.na(x),FUN=cumsum))) print(res) x1 x2 x3 x4 x5 x6 [1,] NA NA 3 7 NA NA [2,] 5 8 12 NA NA NA [3,] 7 10 14 18 NA NA [4,] 11 14 18 23 NA NA [5,] 67 71 75 NA NA NA HTH Pete -- View this message in context: http://r.789695.n4.nabble.com/cumsum-while-maintaining-NA-tp3421513p3422619.html Sent from the R help mailing list archive at Nabble.com.
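To see why the ave() call above leaves the NAs where they were, it may help to look at a single row. The following is only a sketch: the vector x and the helper name cumsum_keep_na are made up for illustration and do not come from the thread. ave() splits the values by is.na(x), so the running sum is taken over the non-missing entries only, and the missing group stays NA.

```r
# Hypothetical helper (not from the thread): cumulative sum over the
# non-missing entries of a vector, leaving NA wherever the input was NA.
cumsum_keep_na <- function(x) ave(x, is.na(x), FUN = cumsum)

# One made-up row with missing values at positions 1 and 4
x <- c(NA, 3, 4, NA, 5)
res <- cumsum_keep_na(x)
print(res)
```

Applying such a helper row by row with apply(d, 1, ...) and transposing the result is what the one-liner in the message does.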
From irene_vrbik at hotmail.com Sat Apr 2 19:15:20 2011 From: irene_vrbik at hotmail.com (statfan) Date: Sat, 2 Apr 2011 12:15:20 -0500 (CDT) Subject: [R] truncated distributions In-Reply-To: <43E7A77A-0AD9-4EAB-971F-1550865487B2@comcast.net> References: <1301756765123-3422245.post@n4.nabble.com> <43E7A77A-0AD9-4EAB-971F-1550865487B2@comcast.net> Message-ID: <1301764520218-3422434.post@n4.nabble.com> The definition of the "mean vector" is essentially what my question boils down to. In the functions details, the author states "We sample x ~ T(mean, Sigma, df) subject to the rectangular truncation lower <= x <= upper. Currently, two random number generation methods are implemented: rejection sampling and the Gibbs Sampler." So if the mean vector in the "rtmvt" function is the mean of the parent distribution's mean (as I hope it is), then it would be acceptable to define a mean vector outside of the truncated range. Clarification of this point would be greatly appreciated. -- View this message in context: http://r.789695.n4.nabble.com/truncated-distributions-tp3422245p3422434.html Sent from the R help mailing list archive at Nabble.com. From padmanabhan.vijayan at gmail.com Sat Apr 2 19:31:40 2011 From: padmanabhan.vijayan at gmail.com (Vijayan Padmanabhan) Date: Sat, 2 Apr 2011 23:01:40 +0530 Subject: [R] help Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From daniel at umd.edu Sat Apr 2 22:07:35 2011 From: daniel at umd.edu (Daniel Malter) Date: Sat, 2 Apr 2011 15:07:35 -0500 (CDT) Subject: [R] Plotting MDS (multidimensional scaling) Message-ID: <1301774855661-3422670.post@n4.nabble.com> Hi, I just encountered what I thought was strange behavior in MDS. However, it turned out that the mistake was mine. The lesson learned from my mistake is that one should plot on a square pane when plotting results of an MDS. Not doing so can be very misleading. Follow the example of an equilateral triangle below to see what I mean. 
I hope this helps others to avoid this kind of headache.

Let's say I have an equilateral triangle. Then the three Euclidean distances between points A, B, and C are all equal. That is, dist(AB)=dist(AC)=dist(BC). Let the points A, B, and C have (x,y)-coordinates (0,0), (2,0), and (1,sqrt(3)). Then MDS should reproduce an equilateral triangle, which it does if there are only three points.

require(MASS)
x=c(0,2,1,0,0,sqrt(3))
dim(x)=c(3,2)
d1=dist(x)
fit1<-isoMDS(d1)
plot(fit1$points, xlab="Coordinate 1", ylab="Coordinate 2", main="Metric MDS",type="n")
text(fit1$points, labels = c('A','B','C'), cex=1)

So far so good, until I add more points. Now assume I add a fourth point D at (0,2*sqrt(3)). This produces the right triangle ABD with hypotenuse BD, which encompasses the smaller triangle ABC such that C lies midway between B and D. Then MDS should reproduce the right triangle ABD and the equilateral triangle ABC within it. However, even though the distance matrix d2 below still indicates that ABC is an equilateral triangle, the plot of the MDS does not confirm this.

x=c(0,2,1,0,0,0,sqrt(3),2*sqrt(3))
dim(x)=c(4,2)
d2=dist(x)
fit2<-isoMDS(d2)
plot(fit2$points, xlab="Coordinate 1", ylab="Coordinate 2", main="Metric MDS",type="n")
text(fit2$points, labels = c('A','B','C','D'), cex=1)

The reason for this is that the axes of the plot are automatically scaled to fit the points. This distorts the visual impression of the distances, angular relationships, and relative locations. If you plot on a square pane, however, peace and order are restored in the galaxy.

plot(fit2$points, xlab="Coordinate 1", ylab="Coordinate 2", main="Metric MDS",type="n",xlim=c(-3,3),ylim=c(-3,3))
text(fit2$points, labels = c('A','B','C','D'), cex=1)

Best,
Daniel

--
View this message in context: http://r.789695.n4.nabble.com/Plotting-MDS-multidimensional-scaling-tp3422670p3422670.html
Sent from the R help mailing list archive at Nabble.com.
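The point of the message above, that the distortion comes from the plot and not from the MDS solution, can also be checked numerically without drawing anything. The sketch below is an assumption-laden illustration, not the code from the message: it swaps in classical MDS via cmdscale() from the stats package instead of isoMDS(), and the object names fit, D and abc are made up here.

```r
# Same four points as in the message: A, B, C, D (one row per point)
x <- matrix(c(0, 2, 1, 0,
              0, 0, sqrt(3), 2 * sqrt(3)), ncol = 2)
d2 <- dist(x)

# Classical (metric) MDS from stats, used here in place of isoMDS()
fit <- cmdscale(d2, k = 2)

# Pairwise distances in the recovered configuration, labelled by point
D <- as.matrix(dist(fit))
dimnames(D) <- list(c("A", "B", "C", "D"), c("A", "B", "C", "D"))

# The sub-triangle ABC: all three sides should still have the same length
abc <- D[c("A", "B", "C"), c("A", "B", "C")]
print(round(abc, 6))
```

If the three off-diagonal entries agree, the configuration itself is fine and any apparent distortion is a matter of how the axes are scaled on the device.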
From dwinsemius at comcast.net Sat Apr 2 22:36:19 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Sat, 2 Apr 2011 16:36:19 -0400 Subject: [R] truncated distributions In-Reply-To: <1301764520218-3422434.post@n4.nabble.com> References: <1301756765123-3422245.post@n4.nabble.com> <43E7A77A-0AD9-4EAB-971F-1550865487B2@comcast.net> <1301764520218-3422434.post@n4.nabble.com> Message-ID: <1F045A1E-0311-4F16-9D54-F0B220E37499@comcast.net> On Apr 2, 2011, at 1:15 PM, statfan wrote: > The definition of the "mean vector" is essentially what my question > boils > down to. In the functions details, the author states > > "We sample x ~ T(mean, Sigma, df) subject to the rectangular > truncation > lower <= x <= upper. Currently, two random number generation methods > are > implemented: rejection sampling and the Gibbs Sampler." > > So if the mean vector in the "rtmvt" function is the mean of the > parent > distribution's mean (as I hope it is), Given the results of what I posted earlier ... how could it be otherwise? > then it would be acceptable to define > a mean vector outside of the truncated range. Clarification of this > point > would be greatly appreciated. > > -- > View this message in context: http://r.789695.n4.nabble.com/truncated-distributions-tp3422245p3422434.html > Sent from the R help mailing list archive at Nabble.com. -- David Winsemius, MD West Hartford, CT From nandan.amar at gmail.com Sat Apr 2 22:05:16 2011 From: nandan.amar at gmail.com (nandan amar) Date: Sun, 3 Apr 2011 01:35:16 +0530 Subject: [R] help In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From alex at chaotic-neutral.de Sat Apr 2 23:05:25 2011 From: alex at chaotic-neutral.de (Alexander Engelhardt) Date: Sat, 02 Apr 2011 23:05:25 +0200 Subject: [R] I think I just broke R Message-ID: <4D978F95.10602@chaotic-neutral.de> I swear, I didn't touch it! 
I can't fit GLM's anymore, and I can't make it talk english (for googling the error messages) anymore. > y <- c(1,1,0,1,0,1) > x <- c(2,7,3,5,2,4) > glm(y~x, binomial) Fehler in runif(length(pi)) : Element 1 ist leer; Der Teil der Argumentliste 'length' der berechnet wurde war: (pi) > Sys.setenv(LANG="EN") > glm(y~x, binomial) Fehler in runif(length(pi)) : Element 1 ist leer; Der Teil der Argumentliste 'length' der berechnet wurde war: (pi) I may suffer from sleep deprivation and minor confusion... maybe. But this is just weird and I can't explain it. I run R from Emacs/ESS, if that matters. Any help would be appreciated! -- Alex From savicky at praha1.ff.cuni.cz Sat Apr 2 23:25:08 2011 From: savicky at praha1.ff.cuni.cz (Petr Savicky) Date: Sat, 2 Apr 2011 23:25:08 +0200 Subject: [R] uniroot speed and vectorization? In-Reply-To: References: Message-ID: <20110402212507.GA30149@praha1.ff.cuni.cz> On Sat, Apr 02, 2011 at 08:24:07AM -0400, ivo welch wrote: > curiosity---given that vector operations are so much faster than > scalar operations, would it make sense to make uniroot vectorized? if > I read the uniroot docs correctly, uniroot() calls an external C > routine which seems to be a scalar function. that must be slow. I am > thinking a vectorized version would be useful for an example such as > > of <- function(x,a) ( log(x)+x+a ) > uniroot( of, c( 1e-7, 100 ), a=rnorm(1000000) ) Hi. The slowest part of the solution using uniroot() is the repeated evaluation of the R level input function. If this can be vectorized, then a faster algorithm could be possible. The following is an example of a vectorized bisection, which is simpler and less efficient than "zeroin" used in uniroot(). 
of <- function(x,a) { log(x)+x+a } a <- rnorm(1000) x1 <- rep(1e-7, times=length(a)) x2 <- rep(100, times=length(a)) stopifnot(of(x1, a) < 0) stopifnot(of(x2, a) > 0) for (i in 1:60) { x3 <- (x1 + x2)/2 pos <- of(x3, a) > 0 y1 <- ifelse(pos, x1, x3) y2 <- ifelse(pos, x3, x2) x1 <- y1 x2 <- y2 } print(range(of(x1, a))) print(range(of(x2, a))) It can be more efficient to find approximations of the roots using a few iterations of the above approach and then switch to the Newton method, which can be vectorized easily. Hope this helps for a start. Petr Savicky. From m_hofert at web.de Sun Apr 3 00:29:53 2011 From: m_hofert at web.de (Marius Hofert) Date: Sun, 3 Apr 2011 00:29:53 +0200 Subject: [R] lattice: wireframe "eats up" points; how to make points on wireframe visible? In-Reply-To: References: <1D3CE86E-61F3-4D79-B3CB-BCDB79FFEFC1@web.de> Message-ID: Dear Deepayan, thank you very much, this works great. Cheers, Marius On 2011-04-01, at 13:58 , Deepayan Sarkar wrote: > On Thu, Mar 31, 2011 at 3:26 AM, Marius Hofert wrote: >> Dear Deepayan, >> >> thanks for answering. It's never too late to be useful. >> >> I see your point in the minimal example. I checked the z-axis limits in my >> original problem for the point to be inside and it wasn't there. I can't easily >> reproduce it from the minimal example though. I'll get back to you if I run into >> this problem again. >> >> In the example below, both points are shown. Although one lies clearly below/under >> the surface, it looks as if it lies above. One would probably have to plot this >> point first so that the wire frame is above the point. But still, this is >> misleading since the eye believes that the wireframe is *not* transparent. This >> happens because the lines connecting (0,1,0)--(1,1,0)--(1,0,0) [dashed ones] are >> not completely visible [also not the one from (1,1,0) to (1,1,1)]. How can I make >> them visible even if they lie behind/under the wireframe? 
I tried to work with >> col="transparent" and with alpha=... but neither did work as I expected. >> My goal is to make the small "rectangles" between the wire transparent. >> I also use these plots in posters with a certain gradient-like background color >> and so it's a bit annoying that the "rectangles" are filled with white color. > > Yes, that probably needs a new argument; the default computation is a > bit of a hack. You can try the following workaround for now: > > wireframe(z~x*y, pts=pts, aspect=1, scales=list(col=1, arrows=FALSE), > zlim=c(0,1), > par.settings = list(background = list(col = "#ffffff11")), ## <- NEW > panel.3d.wireframe = function(x,y,z,xlim,ylim,zlim,xlim.scaled, > ylim.scaled,zlim.scaled,pts,...){ > panel.3dwire(x=x, y=y, z=z, xlim=xlim, ylim=ylim, zlim=zlim, > xlim.scaled=xlim.scaled, ylim.scaled=ylim.scaled, > zlim.scaled=zlim.scaled, ...) > panel.3dscatter(x=pts[,1], y=pts[,2], z=pts[,3], > xlim=xlim, ylim=ylim, zlim=zlim, > xlim.scaled=xlim.scaled, ylim.scaled=ylim.scaled, > zlim.scaled=zlim.scaled, type="p", col=c(2,3), > cex=1.8, pch=c(3,4), .scale=TRUE, ...) > }) > > col = "#ffffff00" instead will give you full transparency (but > "transparent" will not work), and col = "#ffffff77" will be less > transparent and so on. > > -Deepayan From vanopen at gmail.com Sun Apr 3 03:00:18 2011 From: vanopen at gmail.com (Yue Wu) Date: Sun, 3 Apr 2011 09:00:18 +0800 Subject: [R] [w32] Bug? When trying to auto-complete, rest of the input from cursor is wiped out. Message-ID: <20110403010018.GA5556@fbsd.t60.cpu> Hi, list, I'm using Rterm on windows, I find that when I try to complete the code with , the rest of the input from cursor will be wiped out. How to reproduce(`|' means the cursor position): > write.table(file='foo',fileEn|="") Then hit , R will try to complete the code, but the rest is gone: > write.table(file='foo',fileEncoding| No this issue when on *nix. 
-- Regards, Yue Wu Key Laboratory of Modern Chinese Medicines Department of Traditional Chinese Medicine China Pharmaceutical University No.24, Tongjia Xiang Street, Nanjing 210009, China From Tranlm at berkeley.edu Sun Apr 3 03:31:55 2011 From: Tranlm at berkeley.edu (Linh Tran) Date: Sat, 2 Apr 2011 18:31:55 -0700 Subject: [R] Discretizing data rows into regular intervals Message-ID: Hi guys, I'd like to thank you ahead of time for any help that you can offer me. I'm kind of stuck trying to do this. I have a data frame with dates and values (note: only two columns shown): head(test) date value stop 1 01/02/05 100 12/01/07 2 07/16/05 200 12/01/07 3 12/20/05 150 12/01/07 4 04/01/06 250 12/01/07 5 10/01/06 10 12/01/07 What I need to do is create regularly spaced 3-month intervals (starting with the first observed date) with values that are closest to but recorded after the date created. I would stop at the stop date. So the result would look like: new_date value 1 01/02/05 100 2 04/02/05 100 3 07/02/05 100 4 10/02/05 200 5 01/02/06 150 6 04/02/06 250 7 07/02/06 250 8 10/02/06 10 9 01/02/07 10 etc etc etc 10/02/07 --- ## Final obs since next one would be 1/2/08 (after stop date) Again, I would be extremely grateful for any help. Thanks, -linh From emorway at usgs.gov Sun Apr 3 01:09:49 2011 From: emorway at usgs.gov (emorway) Date: Sat, 2 Apr 2011 18:09:49 -0500 (CDT) Subject: [R] Help with filled.contour() In-Reply-To: <758246BF-9625-4F20-948A-CFC120689A83@virginia.edu> References: <758246BF-9625-4F20-948A-CFC120689A83@virginia.edu> Message-ID: <1301785789629-3422837.post@n4.nabble.com> Michael, Although this is a rather old post I'm responding to, I recently came across it and have a suggestions for getting rid of the legend. 
Simply modify the code associated with the function and stuff it into a new function, edm<-function (x = seq(0, 1, length.out = nrow(z)), y = seq(0, 1, length.out = ncol(z)), z, xlim = range(x, finite = TRUE), ylim = range(y, finite = TRUE), zlim = range(z, finite = TRUE), levels = pretty(zlim, nlevels), nlevels = 20, color.palette = cm.colors, col = color.palette(length(levels) - 1), plot.title, plot.axes, key.title, key.axes, asp = NA, xaxs = "i", yaxs = "i", las = 1, axes = TRUE, frame.plot = axes, ...) { if (missing(z)) { if (!missing(x)) { if (is.list(x)) { z <- x$z y <- x$y x <- x$x } else { z <- x x <- seq.int(0, 1, length.out = nrow(z)) } } else stop("no 'z' matrix specified") } else if (is.list(x)) { y <- x$y x <- x$x } if (any(diff(x) <= 0) || any(diff(y) <= 0)) stop("increasing 'x' and 'y' values expected") mar.orig <- (par.orig <- par(c("mar", "las", "mfrow")))$mar on.exit(par(par.orig)) w <- (3 + mar.orig[2L]) * par("csi") * 2.54 #layout(matrix(c(2, 1), ncol = 2L), widths = c(1, lcm(w))) par(las = las) mar <- mar.orig mar[4L] <- mar[2L] mar[2L] <- 1 par(mar = mar) # plot.new() # plot.window(xlim = c(0, 1), ylim = range(levels), xaxs = "i", # yaxs = "i") # rect(0, levels[-length(levels)], 1, levels[-1L], col = col) # if (missing(key.axes)) { # if (axes) # axis(4) # } # else key.axes # box() # if (!missing(key.title)) # key.title # mar <- mar.orig # mar[4L] <- 1 # par(mar = mar) plot.new() plot.window(xlim, ylim, "", xaxs = xaxs, yaxs = yaxs, asp = asp) if (!is.matrix(z) || nrow(z) <= 1L || ncol(z) <= 1L) stop("no proper 'z' matrix specified") if (!is.double(z)) storage.mode(z) <- "double" .Internal(filledcontour(as.double(x), as.double(y), z, as.double(levels), col = col)) if (missing(plot.axes)) { if (axes) { title(main = "", xlab = "", ylab = "") Axis(x, side = 1) Axis(y, side = 2) } } else plot.axes if (frame.plot) box() if (missing(plot.title)) title(...) 
else plot.title invisible() } Then call this new function, edm(x, y, z, axes = F, frame.plot = F, asp = 1, col = palette(gray(seq(0, 0.9, len = 25))), nlevels = 25) Eric -- View this message in context: http://r.789695.n4.nabble.com/R-Help-with-filled-contour-tp815296p3422837.html Sent from the R help mailing list archive at Nabble.com. From daniel at umd.edu Sun Apr 3 03:51:29 2011 From: daniel at umd.edu (Daniel Malter) Date: Sat, 2 Apr 2011 20:51:29 -0500 (CDT) Subject: [R] I think I just broke R In-Reply-To: <4D978F95.10602@chaotic-neutral.de> References: <4D978F95.10602@chaotic-neutral.de> Message-ID: <1301795489478-3422932.post@n4.nabble.com> Check whether x, y, or glm have been redefined. If not, restart R. D. -- View this message in context: http://r.789695.n4.nabble.com/I-think-I-just-broke-R-tp3422737p3422932.html Sent from the R help mailing list archive at Nabble.com. From daniel at umd.edu Sun Apr 3 03:53:57 2011 From: daniel at umd.edu (Daniel Malter) Date: Sat, 2 Apr 2011 20:53:57 -0500 (CDT) Subject: [R] Discretizing data rows into regular intervals In-Reply-To: References: Message-ID: <1301795637492-3422933.post@n4.nabble.com> The ?cut function should work for that. Have you tried? D. -- View this message in context: http://r.789695.n4.nabble.com/Discretizing-data-rows-into-regular-intervals-tp3422921p3422933.html Sent from the R help mailing list archive at Nabble.com. From jwiley.psych at gmail.com Sun Apr 3 04:20:20 2011 From: jwiley.psych at gmail.com (Joshua Wiley) Date: Sat, 2 Apr 2011 19:20:20 -0700 Subject: [R] conditionally weighted mean with NAs Message-ID: Dear List, I am trying to take the weighted average of two numbers (visual acuity measures from the left and right eye). For each row, the lowest value should get a weight of .75, and the highest a weight of .25. My problem is, if one value is missing (NA), the remaining one should get a weight of 1 (i.e., just return the nonmissing value), if both are missing, NA should be returned. 
Below is some example data and the code I tried to write (Desired is what I actually want). Any thoughts or comments would be welcome. Thanks, Josh VA <- cbind(OS = c(.2, 0, 1, -.1, NA, 3, NA), OD = c(.3, -.1, .2, -.1, NA, 0, .1), Desired = c(0.225, -0.075, 0.4, -0.1, NA, 0.75, 0.1)) ## What I tried weight.combine <- function(left, right) { out.r <- right out.l <- left right <- ifelse(out.r <= out.l, out.r * .75, out.r * .25) left <- ifelse(out.r > out.l, out.l * .75, out.l * .25) rowSums(cbind(left, right), na.rm = TRUE) } ## This "works", except it does not handle NAs properly weight.combine(VA[, "OS"], VA[, "OD"]) -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/ From joe.yuan0309 at gmail.com Sun Apr 3 04:24:18 2011 From: joe.yuan0309 at gmail.com (Xing Yuan) Date: Sat, 2 Apr 2011 22:24:18 -0400 Subject: [R] How can I generate a random correlation matrix with some off-diagonal elements being zero Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From dwinsemius at comcast.net Sun Apr 3 04:59:20 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Sat, 2 Apr 2011 22:59:20 -0400 Subject: [R] conditionally weighted mean with NAs In-Reply-To: References: Message-ID: <16CDE8DF-B34C-4AE6-9F24-578B45C9921F@comcast.net> On Apr 2, 2011, at 10:20 PM, Joshua Wiley wrote: > Dear List, > > > I am trying to take the weighted average of two numbers (visual acuity > measures from the left and right eye). For each row, the lowest value > should get a weight of .75, and the highest a weight of .25. My > problem is, if one value is missing (NA), the remaining one should get > a weight of 1 (i.e., just return the nonmissing value), if both are > missing, NA should be returned. Below is some example data and the > code I tried to write (Desired is what I actually want). Any thoughts > or comments would be welcome. 
> > Thanks, > > Josh > > > VA <- cbind(OS = c(.2, 0, 1, -.1, NA, 3, NA), > OD = c(.3, -.1, .2, -.1, NA, 0, .1), > Desired = c(0.225, -0.075, 0.4, -0.1, NA, 0.75, 0.1)) > > ## What I tried > weight.combine <- function(left, right) { > out.r <- right > out.l <- left > right <- ifelse(out.r <= out.l, out.r * .75, out.r * .25) > left <- ifelse(out.r > out.l, out.l * .75, out.l * .25) > rowSums(cbind(left, right), na.rm = TRUE) > } > VA <- cbind(VA, 0.75*pmin(VA[, "OS"], VA[, "OD"], na.rm=TRUE) + 0.25*pmax(VA[, "OS"], VA[, "OD"], na.rm=TRUE) ) > VA OS OD Desired [1,] 0.2 0.3 0.225 0.225 [2,] 0.0 -0.1 -0.075 -0.075 [3,] 1.0 0.2 0.400 0.400 [4,] -0.1 -0.1 -0.100 -0.100 [5,] NA NA NA NA [6,] 3.0 0.0 0.750 0.750 [7,] NA 0.1 0.100 0.100 > ## This "works", except it does not handle NAs properly > weight.combine(VA[, "OS"], VA[, "OD"]) > > > -- > Joshua Wiley David Winsemius, MD West Hartford, CT From daniel at umd.edu Sun Apr 3 05:06:51 2011 From: daniel at umd.edu (Daniel Malter) Date: Sat, 2 Apr 2011 22:06:51 -0500 (CDT) Subject: [R] Discretizing data rows into regular intervals In-Reply-To: References: Message-ID: <1301800011430-3422975.post@n4.nabble.com> Sorry, I did not get the question because I read it too sloppily. I hope this is not homework. 
You can proceed along this example:

set.seed(32345)
#Value of observation (one value per observation day)
value=rpois(50,100)
#Day of observation
day=sample(1:1080,50,replace=FALSE)
day=sort(day)
#Assume 3 years
#Assume months have all 30 days
#For real dates, the breaks in cut()
#have to be defined properly
all.days=seq(from=1,to=1080)
quarter=cut(all.days,breaks=12)
quarter=as.factor(as.numeric(quarter))
#In which quarter is a certain observation
quarter.of.day=quarter[day]
#What's the minimum day in a quarter
min.day=tapply(day,quarter.of.day,min)
#What's the value at that day
values.at.min.day=value[which(day%in%min.day)]

hth,
daniel

--
View this message in context: http://r.789695.n4.nabble.com/Discretizing-data-rows-into-regular-intervals-tp3422921p3422975.html
Sent from the R help mailing list archive at Nabble.com.

From jwiley.psych at gmail.com Sun Apr 3 05:16:39 2011
From: jwiley.psych at gmail.com (Joshua Wiley)
Date: Sat, 2 Apr 2011 20:16:39 -0700
Subject: [R] conditionally weighted mean with NAs
In-Reply-To: <16CDE8DF-B34C-4AE6-9F24-578B45C9921F@comcast.net>
References: <16CDE8DF-B34C-4AE6-9F24-578B45C9921F@comcast.net>
Message-ID:

On Sat, Apr 2, 2011 at 7:59 PM, David Winsemius wrote:
>
> On Apr 2, 2011, at 10:20 PM, Joshua Wiley wrote:
>> VA <- cbind(OS = c(.2, 0, 1, -.1, NA, 3, NA),
>>  OD = c(.3, -.1, .2, -.1, NA, 0, .1),
>>  Desired = c(0.225, -0.075, 0.4, -0.1, NA, 0.75, 0.1))
>
>  VA <- cbind(VA, 0.75*pmin(VA[, "OS"], VA[, "OD"], na.rm=TRUE) +
>                  0.25*pmax(VA[, "OS"], VA[, "OD"], na.rm=TRUE) )
>
>> VA
>        OS   OD Desired
> [1,]  0.2  0.3   0.225  0.225
> [2,]  0.0 -0.1  -0.075 -0.075
> [3,]  1.0  0.2   0.400  0.400
> [4,] -0.1 -0.1  -0.100 -0.100
> [5,]   NA   NA      NA     NA
> [6,]  3.0  0.0   0.750  0.750
> [7,]   NA  0.1   0.100  0.100
>

That works wonderfully, and it is so elegant! Thank you, David.
Josh > > David Winsemius, MD > West Hartford, CT > From mvalle at cscs.ch Sun Apr 3 07:37:31 2011 From: mvalle at cscs.ch (Mario Valle) Date: Sun, 3 Apr 2011 07:37:31 +0200 Subject: [R] Plotting MDS (multidimensional scaling) In-Reply-To: <1301774855661-3422670.post@n4.nabble.com> References: <1301774855661-3422670.post@n4.nabble.com> Message-ID: <4D98079B.6030204@cscs.ch> Also try the asp parameter in plot. plot(fit2$points, xlab="Coordinate 1", ylab="Coordinate 2",main="Metric MDS",type='n',asp=1) text(fit2$points, labels = c('A','B','C','D'), cex=1) Hope it helps mario On 02-Apr-11 22:07, Daniel Malter wrote: > Hi, > > I just encountered what I thought was strange behavior in MDS. However, it > turned out that the mistake was mine. The lesson learned from my mistake is > that one should plot on a square pane when plotting results of an MDS. Not > doing so can be very misleading. Follow the example of an equilateral > triangle below to see what I mean. I hope this helps others to avoid this > kind of headache. > > Let's say I have an equilateral triangle. Then, the three Euclidean > distances between points A, B, and C are all equal. That is, > dist(AB)=dist(AC)=dist(BC). Let the points A, B, and C have > (x,y)-coordinates (0,0), (2,0), and (1,sqrt(3)). Then, MDS should reproduce > an equilateral triangle, which it does if there are only three points. > > require(MASS) > x=c(0,2,1,0,0,sqrt(3)) > dim(x)=c(3,2) > d1=dist(x) > fit1<-isoMDS(d1) > plot(fit1$points, xlab="Coordinate 1", ylab="Coordinate 2", > main="Metric MDS",type="n") > text(fit1$points, labels = c('A','B','C'), cex=1) > > So far so good, until I add more points. Now assume, I add a fourth point D > at {0,2*sqrt(3)}. This produces the rectangular triangle ABD with > hypothenuse BD that encompasses the smaller triangle ABC such that C lies in > the middle between B and D. Then, MDS should reproduce the rectangular > triangle ABD and the equilateral triangle ABC within it. 
However, even > though distance matrix d2 below still indicates that ABC is an equilateral > triangle, the plot of the MDS does not confirm this. > > x=c(0,2,1,0,0,0,sqrt(3),2*sqrt(3)) > dim(x)=c(4,2) > d2=dist(x) > fit2<-isoMDS(d2) > plot(fit2$points, xlab="Coordinate 1", ylab="Coordinate 2", > main="Metric MDS",type="n") > text(fit2$points, labels = c('A','B','C','D'), cex=1) > > The reason for this is that the dimension of the plot is automatically > scaled to fit the points. This distorts the visual impression of the > distances, angular relationships, and relative locations. If you plot on a > square pane, however, peace and order are restored in the galaxy. > > plot(fit2$points, xlab="Coordinate 1", ylab="Coordinate 2", > main="Metric MDS",type="n",xlim=c(-3,3),ylim=c(-3,3)) > text(fit2$points, labels = c('A','B','C','D'), cex=1) > > Best, > Daniel > > > > -- > View this message in context: http://r.789695.n4.nabble.com/Plotting-MDS-multidimensional-scaling-tp3422670p3422670.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. -- Ing. Mario Valle Data Analysis and Visualization Group | http://www.cscs.ch/~mvalle Swiss National Supercomputing Centre (CSCS) | Tel: +41 (91) 610.82.60 v. Cantonale Galleria 2, 6928 Manno, Switzerland | Fax: +41 (91) 610.82.82 From jari.oksanen at oulu.fi Sun Apr 3 08:22:12 2011 From: jari.oksanen at oulu.fi (Jari Oksanen) Date: Sun, 3 Apr 2011 06:22:12 +0000 Subject: [R] Plotting MDS (multidimensional scaling) References: <1301774855661-3422670.post@n4.nabble.com> Message-ID: Daniel Malter umd.edu> writes: > > Let's say I have an equilateral triangle. Then, the three Euclidean > distances between points A, B, and C are all equal. 
That is, > dist(AB)=dist(AC)=dist(BC). Let the points A, B, and C have > (x,y)-coordinates (0,0), (2,0), and (1,sqrt(3)). Then, MDS should reproduce > an equilateral triangle, which it does if there are only three points. > > require(MASS) > x=c(0,2,1,0,0,sqrt(3)) > dim(x)=c(3,2) > d1=dist(x) > fit1<-isoMDS(d1) > plot(fit1$points, xlab="Coordinate 1", ylab="Coordinate 2", > main="Metric MDS",type="n") > text(fit1$points, labels = c('A','B','C'), cex=1) > > So far so good, until I add more points. Now assume, I add a fourth point D > at {0,2*sqrt(3)}. This produces the rectangular triangle ABD with > hypothenuse BD that encompasses the smaller triangle ABC such that C lies in > the middle between B and D. Then, MDS should reproduce the rectangular > triangle ABD and the equilateral triangle ABC within it. However, even > though distance matrix d2 below still indicates that ABC is an equilateral > triangle, the plot of the MDS does not confirm this. > > x=c(0,2,1,0,0,0,sqrt(3),2*sqrt(3)) > dim(x)=c(4,2) > d2=dist(x) > fit2<-isoMDS(d2) > plot(fit2$points, xlab="Coordinate 1", ylab="Coordinate 2", > main="Metric MDS",type="n") > text(fit2$points, labels = c('A','B','C','D'), cex=1) > Daniel, Mario Valle already told you about asp=1 in plot() to force equal aspect ratio, and MASS also has eqscplot() function for plots with geometrically equal scaling. However, your example above hints that there is something else you should take care of: You label your plot as "Metric MDS", but isoMDS does not do metric MDS. The title in its documentation reads "Kruskal's Non-metric Multidimensional Scaling". In this case you happened to have metric MDS, because isoMDS uses metric scaling as its default starting configuration, and in this case that starting configuration is a perfect fit (stress = 0), and isoMDS() makes no iterations to change the starting configuration. If you want to work with metric MDS, use cmdscale() which does metric MDS. 
Cheers, jari Oksanen

> The reason for this is that the dimension of the plot is automatically
> scaled to fit the points. This distorts the visual impression of the
> distances, angular relationships, and relative locations. If you plot on a
> square pane, however, peace and order are restored in the galaxy.
>
> plot(fit2$points, xlab="Coordinate 1", ylab="Coordinate 2",
> main="Metric MDS",type="n",xlim=c(-3,3),ylim=c(-3,3))
> text(fit2$points, labels = c('A','B','C','D'), cex=1)
>
> Best,
> Daniel
>
> --
> View this message in context: http://r.789695.n4.nabble.com/Plotting-MDS-multidimensional-scaling-tp3422670p3422670.html
> Sent from the R help mailing list archive at Nabble.com.
>
>

From alex at chaotic-neutral.de Sun Apr 3 09:30:50 2011
From: alex at chaotic-neutral.de (Alexander Engelhardt)
Date: Sun, 03 Apr 2011 09:30:50 +0200
Subject: [R] I think I just broke R
In-Reply-To: <1301795489478-3422932.post@n4.nabble.com>
References: <4D978F95.10602@chaotic-neutral.de> <1301795489478-3422932.post@n4.nabble.com>
Message-ID: <4D98222A.6050706@chaotic-neutral.de>

On 03.04.2011 03:51, Daniel Malter wrote:
> Check whether x, y, or glm have been redefined. If not, restart R.

I wouldn't call my function 'glm'. However, I did call one 'binomial'. That was my mistake. Thanks :)

A few weeks ago I asked how to set my error messages to English, and Richard Heiberger told me to use 'Sys.setenv(LANG="EN")'. He used this example, which did work for me at first, but no longer works:

> Sys.setenv(LANG="DE")
> 2+"a"
Fehler in 2 + "a" : nicht-numerisches Argument für binären Operator
> Sys.setenv(LANG="EN")
> 2+"a"
Fehler in 2 + "a" : nicht-numerisches Argument für binären Operator

Does someone have any idea why that could be the case?
My sessionInfo() is here: > sessionInfo() R version 2.10.1 (2009-12-14) i486-pc-linux-gnu locale: [1] LC_CTYPE=de_DE.utf8 LC_NUMERIC=C [3] LC_TIME=de_DE.utf8 LC_COLLATE=de_DE.utf8 [5] LC_MONETARY=C LC_MESSAGES=de_DE.utf8 [7] LC_PAPER=de_DE.utf8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=de_DE.utf8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base From krishnakirti at gmail.com Sun Apr 3 09:24:38 2011 From: krishnakirti at gmail.com (Krishna Kirti Das) Date: Sun, 3 Apr 2011 01:24:38 -0600 Subject: [R] Unbalanced Anova: What is the best approach? Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From pdalgd at gmail.com Sun Apr 3 11:10:34 2011 From: pdalgd at gmail.com (peter dalgaard) Date: Sun, 3 Apr 2011 11:10:34 +0200 Subject: [R] Unbalanced Anova: What is the best approach? In-Reply-To: References: Message-ID: <639DF089-69B0-4A13-8698-A1EE431E5E04@gmail.com> On Apr 3, 2011, at 09:24 , Krishna Kirti Das wrote: > I have a three-way unbalanced ANOVA that I need to calculate (fixed effects > plus interactions, no random effects). But word has it that aov() is good > only for balanced designs. I have seen a number of different recommendations > for working with unbalanced designs, but they seem to differ widely (car, > nlme, lme4, etc.). So I would like to know what is the best or most usual > way to go about working with unbalanced designs and extracting a reliable > ANOVA table from them in R? Actually, without random effects, aov() is not too crazy, but you might as well use plain lm(). In both cases, the main point is that you need to be aware that there is no such thing as "the" ANOVA table: Sums of squares will depend on the order of testing, and there is nothing to do about that (except getting balanced data). 
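The order dependence can be seen on a small made-up data set. This is only a sketch: the data frame d, its cell counts and the effect sizes are hypothetical, chosen so that the two factors are not orthogonal. It uses just lm(), anova() and drop1() from the stats package.

```r
set.seed(1)
# Hypothetical unbalanced two-factor data with unequal cell counts
d <- data.frame(
  A = factor(rep(c("a1", "a2"), times = c(12, 6))),
  B = factor(c(rep(c("b1", "b2"), times = c(9, 3)),
               rep(c("b1", "b2"), times = c(2, 4))))
)
d$y <- rnorm(nrow(d)) + (d$A == "a2") + 0.5 * (d$B == "b2")

# Sequential tables: the sums of squares for A and B depend on
# which term enters the model first
tab_AB <- anova(lm(y ~ A + B, data = d))
tab_BA <- anova(lm(y ~ B + A, data = d))

# drop1() tests each term given the other, so term order does not matter
d1_AB <- drop1(lm(y ~ A + B, data = d), test = "F")
d1_BA <- drop1(lm(y ~ B + A, data = d), test = "F")

print(tab_AB)
print(tab_BA)
print(d1_AB)
```

The drop1() sum of squares for a term agrees with the sequential table in which that term is entered last.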
Pragmatically, I'd test the three-factor interaction, then use drop1() on a model with two-factor interactions; if nothing glaringly obvious pops up, try reduction to an additive model and then use drop1() again. Obviously, if significant interactions appear, you cannot just remove them and need to investigate what they mean.

--
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com

From ggrothendieck at gmail.com Sun Apr 3 11:39:15 2011
From: ggrothendieck at gmail.com (Gabor Grothendieck)
Date: Sun, 3 Apr 2011 05:39:15 -0400
Subject: [R] Discretizing data rows into regular intervals
In-Reply-To:
References:
Message-ID:

On Sat, Apr 2, 2011 at 9:31 PM, Linh Tran wrote:
> Hi guys,
>
> I'd like to thank you ahead of time for any help that you can offer me.
> I'm kind of stuck trying to do this.
>
> I have a data frame with dates and values (note: only two columns shown):
>
> head(test)
>         date    value      stop
> 1   01/02/05      100  12/01/07
> 2   07/16/05      200  12/01/07
> 3   12/20/05      150  12/01/07
> 4   04/01/06      250  12/01/07
> 5   10/01/06       10  12/01/07
>
> What I need to do is create regularly spaced 3-month intervals (starting
> with the first observed date) with values that are closest to but recorded
> after the date created. I would stop at the stop date. So the result would
> look like:
>
>     new_date    value
> 1   01/02/05      100
> 2   04/02/05      100
> 3   07/02/05      100
> 4   10/02/05      200
> 5   01/02/06      150
> 6   04/02/06      250
> 7   07/02/06      250
> 8   10/02/06       10
> 9   01/02/07       10
> etc
> etc
> etc 10/02/07      ---  ## Final obs since next one would be 1/2/08 (after
> stop date)
>

See question #13 in the zoo-faq vignette:
http://cran.r-project.org/web/packages/zoo/index.html
and note the existence of zoo's yearqtr class.
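For comparison, here is a base-R sketch of the same task. It is not the approach from the zoo FAQ; it assumes that "value" on each grid date means the most recent observation at or before that date, which is what the desired output in the question shows. The names stop_date, idx and result are illustrative.

```r
# Data from the question, with dates parsed as month/day/two-digit year
test <- data.frame(
  date  = as.Date(c("01/02/05", "07/16/05", "12/20/05", "04/01/06", "10/01/06"),
                  format = "%m/%d/%y"),
  value = c(100, 200, 150, 250, 10)
)
stop_date <- as.Date("12/01/07", format = "%m/%d/%y")

# Regular 3-month grid from the first observed date up to the stop date
new_date <- seq(min(test$date), stop_date, by = "3 months")

# For each grid date, index of the last observation at or before it
# (test$date must be sorted in increasing order)
idx <- findInterval(as.numeric(new_date), as.numeric(test$date))

result <- data.frame(new_date = new_date, value = test$value[idx])
print(result)
```

The grid stops at the last 3-month step that does not pass the stop date.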
--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

From ligges at statistik.tu-dortmund.de Sun Apr 3 12:16:03 2011
From: ligges at statistik.tu-dortmund.de (Uwe Ligges)
Date: Sun, 03 Apr 2011 12:16:03 +0200
Subject: [R] I think I just broke R
In-Reply-To: <4D98222A.6050706@chaotic-neutral.de>
References: <4D978F95.10602@chaotic-neutral.de> <1301795489478-3422932.post@n4.nabble.com> <4D98222A.6050706@chaotic-neutral.de>
Message-ID: <4D9848E3.9000304@statistik.tu-dortmund.de>

On 03.04.2011 09:30, Alexander Engelhardt wrote:
> On 03.04.2011 03:51, Daniel Malter wrote:
>> Check whether x, y, or glm have been redefined. If not, restart R.
>
> I wouldn't call my function 'glm'. However, I did call one 'binomial'.
> That was my mistake. Thanks :)
>
> A few weeks ago I asked how to set my error messages to English, and
> Richard Heiberger told me to use 'Sys.setenv(LANG="EN")'.
>
> He used this example, which did work for me at first, but doesn't work
> now anymore:
>
> > Sys.setenv(LANG="DE")
> > 2+"a"
> Fehler in 2 + "a" : nicht-numerisches Argument für binären Operator
> > Sys.setenv(LANG="EN")
> > 2+"a"
> Fehler in 2 + "a" : nicht-numerisches Argument für binären Operator
>
> Does someone have any idea why that could be the case?

Use "LANGUAGE" rather than "LANG" as the environment variable.
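A short sketch of that suggestion inside a running session. Whether the message language really changes mid-session depends on the platform, so the code only checks that the environment variable is set and captures the error text for inspection. The names old, lang_now and msg are illustrative.

```r
# Remember the current setting so it can be restored afterwards
old <- Sys.getenv("LANGUAGE", unset = NA)

# Set the environment variable LANGUAGE (not LANG) for this session
Sys.setenv(LANGUAGE = "en")
lang_now <- Sys.getenv("LANGUAGE")

# Trigger the error from the example and capture its message text
msg <- tryCatch(2 + "a", error = function(e) conditionMessage(e))
print(lang_now)
print(msg)

# Restore the previous state
if (is.na(old)) Sys.unsetenv("LANGUAGE") else Sys.setenv(LANGUAGE = old)
```

If the message still appears in German, the setting probably has to be made before R starts.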
> My sessionInfo() is here: > > > sessionInfo() > R version 2.10.1 (2009-12-14) and time to upgrade R Best, Uwe Ligges > i486-pc-linux-gnu > > locale: > [1] LC_CTYPE=de_DE.utf8 LC_NUMERIC=C > [3] LC_TIME=de_DE.utf8 LC_COLLATE=de_DE.utf8 > [5] LC_MONETARY=C LC_MESSAGES=de_DE.utf8 > [7] LC_PAPER=de_DE.utf8 LC_NAME=C > [9] LC_ADDRESS=C LC_TELEPHONE=C > [11] LC_MEASUREMENT=de_DE.utf8 LC_IDENTIFICATION=C > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code From muzna.alvi at gmail.com Sun Apr 3 12:56:09 2011 From: muzna.alvi at gmail.com (Muzna Alvi) Date: Sun, 3 Apr 2011 16:26:09 +0530 Subject: [R] kernel density plot Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From mathijsdevaan at gmail.com Sun Apr 3 12:15:50 2011 From: mathijsdevaan at gmail.com (mathijsdevaan) Date: Sun, 3 Apr 2011 05:15:50 -0500 (CDT) Subject: [R] Plotting data on a US County Map Message-ID: <1301825750088-3423342.post@n4.nabble.com> Hi, I have a data frame listing US counties and a quantity ("number") per county and I have a shapefile of the US with county ID's. I would like to plot the "number" variable on a map (in the shapefile) using a color range per county (e.g. white = min(number) = 2, black = max(number) = 15). Can anyone help me actually plotting the data on the map? This is how far I got. Thanks! 
DF = data.frame(read.table(textConnection(" A CNTY_FIPS number 1 US001 2 2 US002 8 3 US003 3 4 US004 5 5 US005 6 6 US006 7 7 US007 9 8 US008 9 9 US009 10 10 US010 11 11 US011 13 12 US012 15"),head=TRUE,stringsAsFactors=FALSE)) library(maptools) library(ggplot2) library(gpclib) gpclibPermit() setwd("C:/Documents") us_counties.shp <- readShapeSpatial("uscounties.shp") us_counties.shp.p <- fortify.SpatialPolygonsDataFrame(us_counties.shp, region="CNTY_FIPS") us <- merge(us_counties.shp.p, us_counties.shp, by.x="id", by.y="CNTY_FIPS") p <- ggplot(data=us, aes(x=long, y=lat, group=group)) + geom_polygon(fill="#CFCFCF") p <- p + geom_path(color="white") + coord_equal() ggsave(p, width=11.69, height=8.27, file="us_counties_a.jpg") -- View this message in context: http://r.789695.n4.nabble.com/Plotting-data-on-a-US-County-Map-tp3423342p3423342.html Sent from the R help mailing list archive at Nabble.com. From ripley at stats.ox.ac.uk Sun Apr 3 13:12:22 2011 From: ripley at stats.ox.ac.uk (Prof Brian Ripley) Date: Sun, 3 Apr 2011 12:12:22 +0100 (BST) Subject: [R] I think I just broke R In-Reply-To: <4D9848E3.9000304@statistik.tu-dortmund.de> References: <4D978F95.10602@chaotic-neutral.de> <1301795489478-3422932.post@n4.nabble.com> <4D98222A.6050706@chaotic-neutral.de> <4D9848E3.9000304@statistik.tu-dortmund.de> Message-ID: On Sun, 3 Apr 2011, Uwe Ligges wrote: > > > On 03.04.2011 09:30, Alexander Engelhardt wrote: >> Am 03.04.2011 03:51, schrieb Daniel Malter: >>> Check whether x, y, or glm have been redefined. If not, restart R. >> >> I wouldn't call my function 'glm'. However, I did call one 'binomial'. >> That was my mistake. Thanks :) >> >> A few weeks ago I asked how to set my error messages to english, and >> Richard Heiberger told me to use 'Sys.setenv(LANG="EN")'. 
>>
>> He used this example, which did work for me at first, but doesn't work
>> now anymore:
>>
>> > Sys.setenv(LANG="DE")
>> > 2+"a"
>> Fehler in 2 + "a" : nicht-numerisches Argument für binären Operator
>> > Sys.setenv(LANG="EN")
>> > 2+"a"
>> Fehler in 2 + "a" : nicht-numerisches Argument für binären Operator
>>
>> Does someone have any idea why that could be the case?
>
>
> Use "LANGUAGE" rather than "LANG" as the environment variable.

Also, set it outside your R session, e.g. in your .Renviron file.

You are supposed to be able to change this during an R session, but if you rely on OS facilities (as you probably do on Linux) rather than the gettext in the R sources, we have seen instances of the OS breaking this.

>
>
>> My sessionInfo() is here:
>>
>> > sessionInfo()
>> R version 2.10.1 (2009-12-14)
>
> and time to upgrade R
>
>
> Best,
> Uwe Ligges
>
>
>
>> i486-pc-linux-gnu
>>
>> locale:
>> [1] LC_CTYPE=de_DE.utf8 LC_NUMERIC=C
>> [3] LC_TIME=de_DE.utf8 LC_COLLATE=de_DE.utf8
>> [5] LC_MONETARY=C LC_MESSAGES=de_DE.utf8
>> [7] LC_PAPER=de_DE.utf8 LC_NAME=C
>> [9] LC_ADDRESS=C LC_TELEPHONE=C
>> [11] LC_MEASUREMENT=de_DE.utf8 LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] stats graphics grDevices utils datasets methods base
>

--
Brian D.
Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595 From alex at chaotic-neutral.de Sun Apr 3 13:50:22 2011 From: alex at chaotic-neutral.de (Alexander Engelhardt) Date: Sun, 03 Apr 2011 13:50:22 +0200 Subject: [R] I think I just broke R In-Reply-To: References: <4D978F95.10602@chaotic-neutral.de> <1301795489478-3422932.post@n4.nabble.com> <4D98222A.6050706@chaotic-neutral.de> <4D9848E3.9000304@statistik.tu-dortmund.de> Message-ID: <4D985EFE.1050209@chaotic-neutral.de> Am 03.04.2011 13:12, schrieb Prof Brian Ripley: > On Sun, 3 Apr 2011, Uwe Ligges wrote: >> Use "LANGUAGE" rather than "LANG" as the environment variable. > > Also, set it outside your R session, e.g. in your .Renviron file. > > You are supposed to be able to change this during an R session, but if > you rely on OS facilities (as you probably do on Linux) rather than the > gettext in the R sources, we have seen instances of the OS breaking this. I did use LANGUAGE too, didn't work as well. Creating a ~/.Rprofile file with LANGUAGE="EN" had no effect (weird..). When I edited /etc/R/Renviron.site to include LANGUAGE="EN", it worked. Thanks for the hints! >>> > sessionInfo() >>> R version 2.10.1 (2009-12-14) >> >> and time to upgrade R I'm still fighting to find out how to upgrade stuff on Ubuntu. After a repository update the newest available version was still 2.10.1. 
I'll figure it out, sooner or later :)

From dwinsemius at comcast.net Sun Apr 3 13:55:05 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Sun, 3 Apr 2011 07:55:05 -0400
Subject: [R] kernel density plot
In-Reply-To:
References:
Message-ID:

On Apr 3, 2011, at 6:56 AM, Muzna Alvi wrote:

> I am using the following commands for plotting kernel density for
> three kinds of crops
>
> density(s22$Net_income_Total.1, bw="nrd0",adjust=1,
> kernel=c("gaussian"))->t
> plot(t, xlim=c(-30000,40000), main="Net Income Distribution", axes=F,
> ylim=c(0,0.00035), xlab="Value in Rupees")
> par(new=T)
> density(s33$Net_income_Total.1, bw="nrd0",adjust=1,
> kernel=c("gaussian"))->u
> plot(u, xlim=c(-30000,40000), axes=F, main="", col="red",
> ylim=c(0,0.00035))
> par(new=T)
> density(s44$Net_income_Total.1, bw="nrd0",adjust=1,
> kernel=c("gaussian"))->v
> plot(v, xlim=c(-30000,40000), col="blue", axes=F, main="",
> ylim=c(0,0.00035))
>
> the problem is that in the graph that is drawn
>
> 1. the xlab gets hidden with the [N= and the bandwidth=] values
> 2. when I do par(new=T) this N and bandwidth value appears multiple
> times, overlapping each time and making the graph look untidy.
>
>
> Is there any way of making these N and Bandwidth values not appear
> in the graph?

Why not just set ylab="" in subsequent calls to plot?
--
David Winsemius, MD
West Hartford, CT

From dwinsemius at comcast.net Sun Apr 3 13:59:38 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Sun, 3 Apr 2011 07:59:38 -0400
Subject: [R] kernel density plot
In-Reply-To:
References:
Message-ID: <9AA0E4F8-24BE-4E68-80AF-6924D3B4340D@comcast.net>

On Apr 3, 2011, at 7:55 AM, David Winsemius wrote:

>
> On Apr 3, 2011, at 6:56 AM, Muzna Alvi wrote:
>
>> I am using the following commands for plotting kernel density for
>> three kinds of crops
>>
>> density(s22$Net_income_Total.1, bw="nrd0",adjust=1,
>> kernel=c("gaussian"))->t
>> plot(t, xlim=c(-30000,40000), main="Net Income Distribution", axes=F,
>> ylim=c(0,0.00035), xlab="Value in Rupees")
>> par(new=T)
>> density(s33$Net_income_Total.1, bw="nrd0",adjust=1,
>> kernel=c("gaussian"))->u
>> plot(u, xlim=c(-30000,40000), axes=F, main="", col="red",
>> ylim=c(0,0.00035))
>> par(new=T)
>> density(s44$Net_income_Total.1, bw="nrd0",adjust=1,
>> kernel=c("gaussian"))->v
>> plot(v, xlim=c(-30000,40000), col="blue", axes=F, main="",
>> ylim=c(0,0.00035))
>>
>> the problem is that in the graph that is drawn
>>
>> 1. the xlab gets hidden with the [N= and the bandwidth=] values
>> 2. when I do par(new=T) this N and bandwidth value appears multiple
>> times, overlapping each time and making the graph look untidy.
>>
>>
>> Is there any way of making these N and Bandwidth values not appear
>> in the graph?
>
> Why not just set ylab="" in subsequent calls to plot?

Sorry, I meant xlab="".
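A different route to the same end is to draw the first curve with plot() and add the others with lines(), so that the default "N = ... Bandwidth = ..." label is drawn at most once and can be replaced by an explicit xlab. The sketch below uses simulated vectors (inc1, inc2, inc3) as hypothetical stand-ins for the poster's income columns, which are not available here.

```r
set.seed(42)
# Hypothetical stand-ins for s22$Net_income_Total.1 and the other two columns
inc1 <- rnorm(200, mean = 5000,  sd = 8000)
inc2 <- rnorm(150, mean = 10000, sd = 9000)
inc3 <- rnorm(180, mean = 0,     sd = 7000)

# density() defaults: bw = "nrd0", gaussian kernel
d1 <- density(inc1)
d2 <- density(inc2)
d3 <- density(inc3)

# Common axis limits computed from all three curves
xr <- range(d1$x, d2$x, d3$x)
yr <- c(0, max(d1$y, d2$y, d3$y))

# The first curve sets up the plot; an explicit xlab replaces the
# default "N = ...  Bandwidth = ..." label
plot(d1, xlim = xr, ylim = yr,
     main = "Net Income Distribution", xlab = "Value in Rupees")

# lines() adds to the existing plot, so no labels are drawn again
lines(d2, col = "red")
lines(d3, col = "blue")
legend("topright", legend = c("crop 1", "crop 2", "crop 3"),
       col = c("black", "red", "blue"), lty = 1)
```

Computing the limits from the density objects avoids hard-coding xlim and ylim for each call.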
>

--
David Winsemius, MD
West Hartford, CT

From marchywka at hotmail.com Sun Apr 3 13:51:42 2011
From: marchywka at hotmail.com (Mike Marchywka)
Date: Sun, 3 Apr 2011 07:51:42 -0400
Subject: [R] help
In-Reply-To:
References: ,
Message-ID:

> Date: Sun, 3 Apr 2011 01:35:16 +0530
> From: nandan.amar at gmail.com
> To: padmanabhan.vijayan at gmail.com
> CC: r-help at r-project.org
> Subject: Re: [R] help
>
> One way that you might have thought of is to create the plot as a PDF in R
> and then use pdftools.
> Additionally, one can also think of running the R script using R CMD and then
> using pdftools in a .sh script file if you are on Linux.
> I am not aware of any pdftools capability in R.
>
> On 2 April 2011 23:01, Vijayan Padmanabhan wrote:
>
> > Dear R Help group
> > I need to run a command line script from within an R session. I am not clear
> > how I can achieve this. I tried the shell and system functions, but I am missing
> > something critical. Can someone provide help?
> > My intention is to create a pdf file of a plot in R and then attach
> > existing files from my system as attachments to the newly created pdf
> > file.
> > Any help would be greatly appreciated. Here is the command line script I
> > want to execute from within R.
> >
> >
> > pdftools -S "attachfiles=C:\test1.pdf" -i C:\test2.pdf -o C:\test4.pdf
> >
> > Regards
> > Vijayan Padmanabhan

I just tried

> system("pdftk --help")

and it appeared to work, as I have pdftk from cygwin. I routinely do this the other way around, however: I invoke R from a bash script and then use external tools like this from the bash script after R is done. If I'm generating various pieces, it seems to make sense to get them all first and release any resources R has accumulated, as pdf manipulation itself can often require lots of memory etc.
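Calling an external tool from within R could be sketched as below. The helper run_tool() is hypothetical, written only for this illustration; it checks that the command is on the PATH before calling it through system2(). No options of pdftools are assumed, and the only external call made is the pdftk help request mentioned above.

```r
# Hypothetical helper: run an external command only if it can be found
run_tool <- function(cmd, args = character()) {
  path <- unname(Sys.which(cmd))
  if (!nzchar(path)) {
    message("'", cmd, "' not found on the PATH")
    return(NULL)
  }
  # Capture the tool's standard output and error as a character vector
  system2(path, args, stdout = TRUE, stderr = TRUE)
}

# Step 1: write the plot to a PDF file from R
plot_file <- file.path(tempdir(), "plot.pdf")
pdf(plot_file)
plot(1:10)
dev.off()

# Step 2: hand over to the external tool (here only asking for its help text)
out <- run_tool("pdftk", "--help")
```

The real attachment step would replace the help request with whatever arguments the installed tool documents.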
From shahab.mokari at gmail.com Sun Apr 3 14:02:57 2011 From: shahab.mokari at gmail.com (shahab) Date: Sun, 3 Apr 2011 15:02:57 +0300 Subject: [R] Error in "color2D.matplot" : "Error in plot.new() : figure margins too large" Message-ID: Hi, I am using color2D.matplot (...) function of "plotrix" package. I used a matrix of size around 20*20 However, apparently it failed to visualize the matrix and gave the following exception, which I don't have any idea about possible source of this error. "Error in plot.new() : figure margins too large" It would be appreciated if someone points me to the right origin of this error. best, /Shahab From murdoch.duncan at gmail.com Sun Apr 3 14:10:50 2011 From: murdoch.duncan at gmail.com (Duncan Murdoch) Date: Sun, 03 Apr 2011 08:10:50 -0400 Subject: [R] I think I just broke R In-Reply-To: <4D985EFE.1050209@chaotic-neutral.de> References: <4D978F95.10602@chaotic-neutral.de> <1301795489478-3422932.post@n4.nabble.com> <4D98222A.6050706@chaotic-neutral.de> <4D9848E3.9000304@statistik.tu-dortmund.de> <4D985EFE.1050209@chaotic-neutral.de> Message-ID: <4D9863CA.1020107@gmail.com> On 11-04-03 7:50 AM, Alexander Engelhardt wrote: > Am 03.04.2011 13:12, schrieb Prof Brian Ripley: >> On Sun, 3 Apr 2011, Uwe Ligges wrote: >>> Use "LANGUAGE" rather than "LANG" as the environment variable. >> >> Also, set it outside your R session, e.g. in your .Renviron file. >> >> You are supposed to be able to change this during an R session, but if >> you rely on OS facilities (as you probably do on Linux) rather than the >> gettext in the R sources, we have seen instances of the OS breaking this. > > I did use LANGUAGE too, didn't work as well. > Creating a ~/.Rprofile file with LANGUAGE="EN" had no effect (weird..). That's not weird: you just created an R variable named LANGUAGE, not an environment variable. Duncan Murdoch > When I edited /etc/R/Renviron.site to include LANGUAGE="EN", it worked. > Thanks for the hints! 
> >>>>> sessionInfo() >>>> R version 2.10.1 (2009-12-14) >>> >>> and time to upgrade R > > I'm still fighting to find out how to upgrade stuff on Ubuntu. After a > repository update the newest available version was still 2.10.1. > I'll figure it out, sooner or later :) > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From alex at chaotic-neutral.de Sun Apr 3 14:30:19 2011 From: alex at chaotic-neutral.de (Alexander Engelhardt) Date: Sun, 03 Apr 2011 14:30:19 +0200 Subject: [R] I think I just broke R In-Reply-To: <4D9863CA.1020107@gmail.com> References: <4D978F95.10602@chaotic-neutral.de> <1301795489478-3422932.post@n4.nabble.com> <4D98222A.6050706@chaotic-neutral.de> <4D9848E3.9000304@statistik.tu-dortmund.de> <4D985EFE.1050209@chaotic-neutral.de> <4D9863CA.1020107@gmail.com> Message-ID: <4D98685B.5040201@chaotic-neutral.de> Am 03.04.2011 14:10, schrieb Duncan Murdoch: > That's not weird: you just created an R variable named LANGUAGE, not an > environment variable. > > Duncan Murdoch Silly me. It works now: alexx at derp:~$ cat ~/.Renviron LANGUAGE="EN" Thanks :) From jrkrideau at yahoo.ca Sun Apr 3 14:44:45 2011 From: jrkrideau at yahoo.ca (John Kane) Date: Sun, 3 Apr 2011 05:44:45 -0700 (PDT) Subject: [R] Error in "color2D.matplot" : "Error in plot.new() : figure margins too large" In-Reply-To: Message-ID: <809072.21577.qm@web38402.mail.mud.yahoo.com> Totally guessing but did you make an earlier call to par() to adjust margins? Otherwise you may want to supply the matrix here for other people to experiment with. See ?dput as probably the best way to do this. 
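Both suggestions can be tried quickly. The sketch below uses a hypothetical 20 x 20 matrix m in place of the poster's data, image() as a base-graphics stand-in for color2D.matplot(), and a null pdf device so that nothing is written to disk. The margin values are R's documented defaults for par("mar").

```r
# Hypothetical 20 x 20 matrix standing in for the poster's data
m <- matrix(seq_len(400), nrow = 20)

# Text form that can be pasted into a message; dput(m) prints the same text
txt <- paste(deparse(m), collapse = "\n")

# A reader can rebuild the matrix from that text
m_back <- eval(parse(text = txt))

# Open a device, inspect the current margins, then reset them to the default
pdf(NULL)
print(par("mar"))
par(mar = c(5, 4, 4, 2) + 0.1)
image(m)
dev.off()
```

If the margins reported by par("mar") are much larger than the defaults, an earlier par() call is a likely cause of the error; a very small plotting window is another.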
--- On Sun, 4/3/11, shahab wrote:

> From: shahab
> Subject: [R] Error in "color2D.matplot" : "Error in plot.new() : figure margins too large"
> To: r-help at r-project.org
> Received: Sunday, April 3, 2011, 8:02 AM
> Hi,
>
> I am using color2D.matplot (...) function of "plotrix"
> package. I used
> a matrix of size around 20*20
> However, apparently it failed to visualize the matrix and
> gave the
> following exception, which I don't have any idea about
> possible source
> of this error.
>
> "Error in plot.new() : figure margins too large"
>
> It would be appreciated if someone points me to the right
> origin of this error.
>
> best,
> /Shahab
>

From lcn918 at gmail.com Sun Apr 3 15:01:37 2011
From: lcn918 at gmail.com (lcn)
Date: Sun, 3 Apr 2011 21:01:37 +0800
Subject: [R] recommendation on r scripting tutorial?
In-Reply-To:
References:
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From emammendes at gmail.com Sun Apr 3 15:03:30 2011
From: emammendes at gmail.com (Eduardo M. A. M.Mendes)
Date: Sun, 3 Apr 2011 10:03:30 -0300
Subject: [R] Inverse noncentral Beta
Message-ID: <008101cbf1ff$8ad795d0$a086c170$@gmail.com>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From lcn918 at gmail.com Sun Apr 3 15:05:37 2011
From: lcn918 at gmail.com (lcn)
Date: Sun, 3 Apr 2011 21:05:37 +0800
Subject: [R] R gui on windows how to force to always show the last line of output
In-Reply-To:
References:
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available URL: From murdoch.duncan at gmail.com Sun Apr 3 15:14:26 2011 From: murdoch.duncan at gmail.com (Duncan Murdoch) Date: Sun, 03 Apr 2011 09:14:26 -0400 Subject: [R] recommendation on r scripting tutorial? In-Reply-To: References: Message-ID: <4D9872B2.5000006@gmail.com> On 11-04-03 9:01 AM, lcn wrote: > The documents accompanying the distribution can be a good start. > > And a suggestion for searching help on R over Google, use "r-help" as a > basic keyword, coz a single letter of "r" hardly helps you find the desired > topics. When I google for "R tutorial" or "r tutorial", the entire first page looks relevant (though I'm not familiar with most of the tutorials, so can't give a recommendation). I think it's a myth that Google doesn't know what you mean when you ask about "R". Or perhaps it is tailoring its results to what it has seen me choose in the past. Duncan Murdoch > > 2011/4/2 Wensui Liu > >> Good morning, dear listers >> >> I am wondering if you could recommend a good tutorial / book for r >> scripting. >> >> thank you so much in advance! >> >> WenSui Liu >> Credit Risk Manager, 53 Bancorp >> wensui.liu at 53.com >> 513-295-4370 >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
From ted.harding at wlandres.net Sun Apr 3 15:18:02 2011
From: ted.harding at wlandres.net ( (Ted Harding))
Date: Sun, 03 Apr 2011 14:18:02 +0100 (BST)
Subject: [R] Inverse noncentral Beta
In-Reply-To: <008101cbf1ff$8ad795d0$a086c170$@gmail.com>
Message-ID:

On 03-Apr-11 13:03:30, Eduardo M. A. M.Mendes wrote:
> Hello
> I could not find whether there is any R-function that implements
> the inverse of a noncentral Beta. Could someone out there tell
> me where I can find it? Or how to implement it?
>
> Many thanks
> Ed

Have a look at the 'ncp' parameter in ?qbeta -- this (default=0) is the non-centrality parameter for the beta distribution, and qbeta is the inverse function of the distribution.

The noncentral Beta distribution (with 'ncp' = lambda) is defined (Johnson et al, 1995, pp. 502) as the distribution of X/(X+Y) where X ~ chi^2_2a(lambda) and Y ~ chi^2_2b.

i.e. as in

qbeta(p,a,b,ncp=lambda)

Hoping this helps,
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding)
Fax-to-email: +44 (0)870 094 0861
Date: 03-Apr-11 Time: 14:17:58
------------------------------ XFMail ------------------------------

From jfox at mcmaster.ca Sun Apr 3 15:28:29 2011
From: jfox at mcmaster.ca (John Fox)
Date: Sun, 3 Apr 2011 09:28:29 -0400
Subject: [R] Unbalanced Anova: What is the best approach?
In-Reply-To:
References:
Message-ID: <001201cbf203$05c56f80$11504e80$@mcmaster.ca>

Dear Krishna,

Although it's difficult to explain briefly, I'd argue that balanced and unbalanced ANOVA are not fundamentally different, in that the focus should be on the hypotheses that are tested, and these are naturally expressed as functions of cell means and marginal means. For example, in a two-way ANOVA, the null hypothesis of no interaction is equivalent to parallel profiles of cell means for one factor across levels of the other. What is different, though, is that in a balanced ANOVA all common approaches to constructing an ANOVA table coincide.
Without getting into the explanation in detail (which you can find in a text like my Applied Regression Analysis and Generalized Linear Models), so-called type-I (or sequential) tests, such as those performed by the standard anova() function in R, test hypotheses that are rarely of substantive interest, and, even when they are, are of interest only by accident. So-called type-II tests, such as those performed by default by the Anova() function in the car package, test hypotheses that are almost always of interest. Type-III tests, which the Anova() function in car can perform optionally, require careful formulation of the model for the hypotheses tested to be sensible, and even then have less power than corresponding type-II tests in the circumstances in which a test would be of interest. Since you're addressing fixed-effects models, I'm not sure why you introduced nlme and lme4 into the discussion, but I note that Anova() in the car package has methods that can produce type-II and -III Wald tests for the fixed effects in mixed models fit by lme() and lmer(). Your question has been asked several times before on the r-help list. For example, if you enter terms like "type-II" or "unbalanced ANOVA" in the RSeek search engine and look under the "Support Lists" tab, you'll see many hits -- e.g., . I hope this helps, John -------------------------------- John Fox Senator William McMaster Professor of Social Statistics Department of Sociology McMaster University Hamilton, Ontario, Canada http://socserv.mcmaster.ca/jfox > -----Original Message----- > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] > On Behalf Of Krishna Kirti Das > Sent: April-03-11 3:25 AM > To: r-help at r-project.org > Subject: [R] Unbalanced Anova: What is the best approach? > > I have a three-way unbalanced ANOVA that I need to calculate (fixed > effects plus interactions, no random effects). But word has it that aov() > is good only for balanced designs. 
I have seen a number of different > recommendations for working with unbalanced designs, but they seem to > differ widely (car, nlme, lme4, etc.). So I would like to know what is the > best or most usual way to go about working with unbalanced designs and > extracting a reliable ANOVA table from them in R? > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting- > guide.html > and provide commented, minimal, self-contained, reproducible code. From krishnakirti at gmail.com Sun Apr 3 16:35:30 2011 From: krishnakirti at gmail.com (Krishna Kirti Das) Date: Sun, 3 Apr 2011 08:35:30 -0600 Subject: [R] Unbalanced Anova: What is the best approach? In-Reply-To: <001201cbf203$05c56f80$11504e80$@mcmaster.ca> References: <001201cbf203$05c56f80$11504e80$@mcmaster.ca> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From krishnakirti at gmail.com Sun Apr 3 16:36:54 2011 From: krishnakirti at gmail.com (Krishna Kirti Das) Date: Sun, 3 Apr 2011 08:36:54 -0600 Subject: [R] Unbalanced Anova: What is the best approach? In-Reply-To: <639DF089-69B0-4A13-8698-A1EE431E5E04@gmail.com> References: <639DF089-69B0-4A13-8698-A1EE431E5E04@gmail.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From mustafa_binar at mynet.com Sun Apr 3 15:10:33 2011 From: mustafa_binar at mynet.com (mustafabinar) Date: Sun, 3 Apr 2011 16:10:33 +0300 (EEST) Subject: [R] :HELP Message-ID: <59163.78.167.98.215.1301836233.mynet@webmail191.mynet.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From spencer.graves at prodsyse.com Sun Apr 3 17:06:46 2011 From: spencer.graves at prodsyse.com (Spencer Graves) Date: Sun, 03 Apr 2011 08:06:46 -0700 Subject: [R] Unbalanced Anova: What is the best approach? 
In-Reply-To: References: <001201cbf203$05c56f80$11504e80$@mcmaster.ca> Message-ID: <4D988D06.8050202@prodsyse.com> Hi, Krishna: On 4/3/2011 7:35 AM, Krishna Kirti Das wrote: > Thank you, John. > > Yes, your answers do help. For me it's mainly about getting familiar with > the "R" way of doing things. > > Thus your response also confirms what I suspected, that there is no explicit > user-interface (at least one that is widely used) in terms of > functions/packages that represents an unbalanced design in the same way that > aov would represent a balanced one. Analyzing balanced and unbalanced data > are obviously possible, but with balanced designs via aov what has to be > done is intuitive within the language but unintuitive for unbalanced > designs. Intuition is subject to one's background and expectations. If you think in terms of a series of nested hypotheses, then the standard R anova is very intuitive. I never use aov, because it's not intuitive to me and not very general. 'aov' is only useful for a balanced design with normal independent errors with constant variance. The real world is rarely so simple. The 'aov' algorithm was wonderful over half a century ago, when all computations were done by hand or using a mechanical calculator (e.g., an abacus or a calculator with gears). Unbalanced designs were largely impractical because of computational difficulties. There were many procedures for imputing missing values for a design that was "almost balanced". I encourage you to think in terms of alternative sequences of nested hypotheses, including the implications of A being significant by itself, but not with B already present, except that the A:B interaction is or is not significant. > I did notice that this question gets asked several times and in slightly > different ways, and I think the lack of an interface that represents an > unbalanced design in the same way aov represents balanced designs is why the > question will probably keep getting asked again. 
> > I had mentioned nlme and lme4 because I saw in some of the discussions that > using those were recommended for working with unbalanced designs. And > specifying random effects with zero variance, for example, would probably > serve my purposes. I'd be surprised if nlme or lme4 changes what I wrote above. Hope this helps. Spencer > Thank you for your help. > > Sincerely, > > Krishna > > On Sun, Apr 3, 2011 at 7:28 AM, John Fox wrote: > >> Dear Krishna, >> >> Although it's difficult to explain briefly, I'd argue that balanced and >> unbalanced ANOVA are not fundamentally different, in that the focus should >> be on the hypotheses that are tested, and these are naturally expressed as >> functions of cell means and marginal means. For example, in a two-way >> ANOVA, >> the null hypotheses of no interaction is equivalent to parallel profiles of >> cell means for one factor across levels of the other. What is different, >> though, is that in a balanced ANOVA all common approaches to constructing >> an >> ANOVA table coincide. >> >> Without getting into the explanation in detail (which you can find in a >> text >> like my Applied Regression Analysis and Generalized Linear Models), >> so-called type-I (or sequential) tests, such as those performed by the >> standard anova() function in R, test hypotheses that are rarely of >> substantive interest, and, even when they are, are of interest only by >> accident. So-called type-II tests, such as those performed by default by >> the >> Anova() function in the car package, test hypotheses that are almost always >> of interest. Type-III tests, which the Anova() function in car can perform >> optionally, require careful formulation of the model for the hypotheses >> tested to be sensible, and even then have less power than corresponding >> type-II tests in the circumstances in which a test would be of interest. 
>> >> Since you're addressing fixed-effects models, I'm not sure why you >> introduced nlme and lme4 into the discussion, but I note that Anova() in >> the >> car package has methods that can produce type-II and -III Wald tests for >> the >> fixed effects in mixed models fit by lme() and lmer(). >> >> Your question has been asked several times before on the r-help list. For >> example, if you enter terms like "type-II" or "unbalanced ANOVA" in the >> RSeek search engine and look under the "Support Lists" tab, you'll see many >> hits -- e.g., >> . >> >> I hope this helps, >> John >> >> -------------------------------- >> John Fox >> Senator William McMaster >> Professor of Social Statistics >> Department of Sociology >> McMaster University >> Hamilton, Ontario, Canada >> http://socserv.mcmaster.ca/jfox >> >> >> >>> -----Original Message----- >>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] >>> On Behalf Of Krishna Kirti Das >>> Sent: April-03-11 3:25 AM >>> To: r-help at r-project.org >>> Subject: [R] Unbalanced Anova: What is the best approach? >>> >>> I have a three-way unbalanced ANOVA that I need to calculate (fixed >>> effects plus interactions, no random effects). But word has it that aov() >>> is good only for balanced designs. I have seen a number of different >>> recommendations for working with unbalanced designs, but they seem to >>> differ widely (car, nlme, lme4, etc.). So I would like to know what is >> the >>> best or most usual way to go about working with unbalanced designs and >>> extracting a reliable ANOVA table from them in R? >>> >>> [[alternative HTML version deleted]] >>> >>> ______________________________________________ >>> R-help at r-project.org mailing list >>> https://stat.ethz.ch/mailman/listinfo/r-help >>> PLEASE do read the posting guide http://www.R-project.org/posting- >>> guide.html >>> and provide commented, minimal, self-contained, reproducible code. 
>> > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Spencer Graves, PE, PhD President and Chief Operating Officer Structure Inspection and Monitoring, Inc. 751 Emerson Ct. San José, CA 95126 ph: 408-655-4567 From jfox at mcmaster.ca Sun Apr 3 17:24:23 2011 From: jfox at mcmaster.ca (John Fox) Date: Sun, 3 Apr 2011 11:24:23 -0400 Subject: [R] Unbalanced Anova: What is the best approach? In-Reply-To: <4D988D06.8050202@prodsyse.com> References: <001201cbf203$05c56f80$11504e80$@mcmaster.ca> <4D988D06.8050202@prodsyse.com> Message-ID: <001901cbf213$36436d00$a2ca4700$@mcmaster.ca> Dear Spencer, > -----Original Message----- > From: Spencer Graves [mailto:spencer.graves at prodsyse.com] > Sent: April-03-11 11:07 AM > To: Krishna Kirti Das > Cc: John Fox; r-help at r-project.org > Subject: Re: [R] Unbalanced Anova: What is the best approach? > > Hi, Krishna: > > > > > On 4/3/2011 7:35 AM, Krishna Kirti Das wrote: > > Thank you, John. > > > > Yes, your answers do help. For me it's mainly about getting familiar > > with the "R" way of doing things. > > > > Thus your response also confirms what I suspected, that there is no > > explicit user-interface (at least one that is widely used) in terms of > > functions/packages that represents an unbalanced design in the same > > way that aov would represent a balanced one. Analyzing balanced and > > unbalanced data are obviously possible, but with balanced designs via > > aov what has to be done is intuitive within the language but > > unintuitive for unbalanced designs. > > Intuition is subject to one's background and expectations. If you > think in terms of a series of nested hypotheses, then the standard R anova > is very intuitive.
I never use aov, because it's not intuitive to me and > not very general. 'aov' is only useful for a balanced design with normal > independent errors with constant variance. The real world is rarely so > simple. The 'aov' algorithm was wonderful over half a century ago, when > all computations were done by hand or using a mechanical calculator (e.g., > an abacus or a calculator with gears). > Unbalanced designs were largely impractical because of computational > difficulties. There were many procedures for imputing missing values for > a design that was "almost balanced". > > > I encourage you to think in terms of alternative sequences of > nested hypotheses, including the implications of A being significant by > itself, but not with B already present, except that the A:B interaction is > or is not significant. So-called type-II tests do exactly that -- that is, obey the principle of marginality; they are maximally powerful if the higher-order term(s) to which a particular term is marginal are 0. Best, John > > > I did notice that this question gets asked several times and in > > slightly different ways, and I think the lack of an interface that > > represents an unbalanced design in the same way aov represents > > balanced designs is why the question will probably keep getting asked > again. > > > > I had mentioned nlme and lme4 because I saw in some of the discussions > > that using those were recommended for working with unbalanced designs. > > And specifying random effects with zero variance, for example, would > > probably serve my purposes. > > I'd be surprised if nlme or lme4 changes what I wrote above. > > > Hope this helps. > Spencer > > > Thank you for your help. 
> >> > > [[alternative HTML version deleted]] > > > > ______________________________________________ > > R-help at r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide > > http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > > > > > -- > Spencer Graves, PE, PhD > President and Chief Operating Officer > Structure Inspection and Monitoring, Inc. > 751 Emerson Ct. > San Jos?, CA 95126 > ph: 408-655-4567 From jfox at mcmaster.ca Sun Apr 3 17:35:01 2011 From: jfox at mcmaster.ca (John Fox) Date: Sun, 3 Apr 2011 11:35:01 -0400 Subject: [R] Unbalanced Anova: What is the best approach? In-Reply-To: References: <001201cbf203$05c56f80$11504e80$@mcmaster.ca> Message-ID: <001a01cbf214$b2a29be0$17e7d3a0$@mcmaster.ca> Dear Krishna, > -----Original Message----- > From: Krishna Kirti Das [mailto:krishnakirti at gmail.com] > Sent: April-03-11 10:36 AM > To: John Fox > Cc: r-help at r-project.org > Subject: Re: [R] Unbalanced Anova: What is the best approach? > > Thank you, John. > > Yes, your answers do help. For me it's mainly about getting familiar with > the "R" way of doing things. > > Thus your response also confirms what I suspected, that there is no > explicit user-interface (at least one that is widely used) in terms of > functions/packages that represents an unbalanced design in the same way > that aov would represent a balanced one. Analyzing balanced and unbalanced > data are obviously possible, but with balanced designs via aov what has to > be done is intuitive within the language but unintuitive for unbalanced > designs. I don't agree with your characterization. For example, the representation of a two-way crossed ANOVA model as an R model formula is precisely the same for balanced and unbalanced data: for response Y and factors A and B, Y ~ A*B. Moreover, the issue of how to formulate tests is independent of the software you use. 
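To make the point above concrete, here is a minimal base-R sketch. It assumes a small made-up unbalanced data set (the data frame d, its cell counts and its values are hypothetical and not from this thread). The model formula is the same Y ~ A*B that one would write for balanced data; what changes is that the sequential sums of squares reported by anova() depend on the order of the terms. Anova() from the car package, mentioned earlier in the thread, is deliberately not called so that the sketch runs without extra packages.

```r
# Hypothetical unbalanced two-way layout (made-up data, illustration only)
set.seed(1)
d <- data.frame(
  A = factor(rep(c("a1", "a2"), times = c(7, 5))),
  B = factor(c("b1", "b1", "b1", "b1", "b2", "b2", "b2",
               "b1", "b2", "b2", "b2", "b2"))
)
d$Y <- rnorm(nrow(d)) + as.numeric(d$A) + 0.5 * as.numeric(d$B)

# Same model, two term orders
fit_AB <- lm(Y ~ A * B, data = d)
fit_BA <- lm(Y ~ B * A, data = d)

# Sequential (type-I) tables: with unbalanced data the sums of
# squares for A and B change with the order of the terms
tab_AB <- anova(fit_AB)
tab_BA <- anova(fit_BA)
print(tab_AB)
print(tab_BA)

# Sum of squares for A adjusted for B (the quantity a type-II test
# of A is built on), obtained by comparing two nested models
a_after_b <- anova(lm(Y ~ B, data = d), lm(Y ~ A + B, data = d))
print(a_after_b)
```

In this sketch the sum of squares for A in tab_BA (A entered after B) matches the one in a_after_b, while the one in tab_AB (A entered first) generally does not. The choice of test is therefore a question about hypotheses, not about which fitting function was used.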
> > I did notice that this question gets asked several times and in slightly > different ways, and I think the lack of an interface that represents an > unbalanced design in the same way aov represents balanced designs is why > the question will probably keep getting asked again. I suspect that the issue gets asked repeatedly for two reasons: (1) More fundamentally, I believe that the general level of understanding of hypothesis tests in unbalanced data is low; (2) people don't necessarily read previous posts to r-help. > > I had mentioned nlme and lme4 because I saw in some of the discussions > that using those were recommended for working with unbalanced designs. And > specifying random effects with zero variance, for example, would probably > serve my purposes. I don't think that either lme() or lmer() will allow you to fit a model without random effects, but even if they did there wouldn't be much sense in doing so. You can compute a mean with lm() or glm(), but would you? Best, John > > Thank you for your help. > > Sincerely, > > Krishna > > On Sun, Apr 3, 2011 at 7:28 AM, John Fox wrote: > > > Dear Krishna, > > Although it's difficult to explain briefly, I'd argue that balanced > and > unbalanced ANOVA are not fundamentally different, in that the focus > should > be on the hypotheses that are tested, and these are naturally > expressed as > functions of cell means and marginal means. For example, in a two-way > ANOVA, > the null hypotheses of no interaction is equivalent to parallel > profiles of > cell means for one factor across levels of the other. What is > different, > though, is that in a balanced ANOVA all common approaches to > constructing an > ANOVA table coincide. 
> > > > I have a three-way unbalanced ANOVA that I need to calculate (fixed > > effects plus interactions, no random effects). But word has it that > aov() > > is good only for balanced designs. I have seen a number of > different > > recommendations for working with unbalanced designs, but they seem > to > > differ widely (car, nlme, lme4, etc.). So I would like to know what > is the > > best or most usual way to go about working with unbalanced designs > and > > extracting a reliable ANOVA table from them in R? > > > > > [[alternative HTML version deleted]] > > > > ______________________________________________ > > R-help at r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide http://www.R-project.org/posting- > > guide.html > > and provide commented, minimal, self-contained, reproducible code. > > > From jrkrideau at yahoo.ca Sun Apr 3 17:50:25 2011 From: jrkrideau at yahoo.ca (John Kane) Date: Sun, 3 Apr 2011 08:50:25 -0700 (PDT) Subject: [R] :HELP In-Reply-To: <59163.78.167.98.215.1301836233.mynet@webmail191.mynet.com> Message-ID: <818002.79301.qm@web38403.mail.mud.yahoo.com> Your message is severely garbled. However a={1,2,3,4,5 6,7,8,9,10} is not going to work since {} is incorrect. you need c() Anyway try this: a=matrix(1:10,ncol=2) sum(a[1:3,1]) sum(a[1:3,2]) --- On Sun, 4/3/11, mustafabinar wrote: > From: mustafabinar > Subject: [R] :HELP > To: r-help at r-project.org > Received: Sunday, April 3, 2011, 9:10 AM > Hello, >   > I want to sum first three terms of each column of matrix. > But I don't calculate with "apply" function. >   > skwkrt<-function(N=10000,mu=0,sigma=1,n=100, > nboot=1000,alpha=0.05){ > x<-rnorm(N,mu,sigma)#population > samplex<-matrix(sample(x,n*nboot,replace=T),nrow=nboot) > #... > } >   > is that: suppose a is a 5x2 matrix. >  a={1,2,3,4,5 > 6,7,8,9,10} >   > I want to sum first three terms. > sm[1]=1+2+3 > sm[2]=6+7+8 > But I don't calculate. please help me!!! > > ??? 
[[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org > mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, > reproducible code. > From Samuel.Le at srlglobal.com Sun Apr 3 18:14:51 2011 From: Samuel.Le at srlglobal.com (Samuel Le) Date: Sun, 3 Apr 2011 16:14:51 +0000 Subject: [R] converting "call" objects into character Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From dwinsemius at comcast.net Sun Apr 3 18:42:20 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Sun, 3 Apr 2011 12:42:20 -0400 Subject: [R] converting "call" objects into character In-Reply-To: References: Message-ID: <037B6CC7-D60B-4B6F-A9B3-5F95F33B8174@comcast.net> On Apr 3, 2011, at 12:14 PM, Samuel Le wrote: > Dear all, > > > > I would like to log the calls to my functions. I am trying to do > this using the function match.call(): fTest<-function(x) { theCall<-match.call() print(theCall) return(list(x=x, logf = theCall)) } > > fTest(x=2)$x [1] 2 > fTest(x=2)$logf fTest(x = 2) > str(fTest(x=2)$logf) language fTest(x = 2) You may want to convert that call component to a character object, since: > cat(fTest(x=2)$logf) Error in cat(list(...), file, sep, fill, labels, append) : argument 1 (type 'language') cannot be handled by 'cat' > > I can see "theCall" printed into the console, but I don't manage to > convert it into a character to write it into a log file with other > informations. > > Can anyone help? David Winsemius, MD West Hartford, CT From axel.urbiz at gmail.com Sun Apr 3 18:56:12 2011 From: axel.urbiz at gmail.com (Axel Urbiz) Date: Sun, 3 Apr 2011 12:56:12 -0400 Subject: [R] Help in splitting ists into sub-lists Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available
URL: 

From bates at stat.wisc.edu Sun Apr 3 19:22:24 2011
From: bates at stat.wisc.edu (Douglas Bates)
Date: Sun, 3 Apr 2011 12:22:24 -0500
Subject: [R] converting "call" objects into character
In-Reply-To: <037B6CC7-D60B-4B6F-A9B3-5F95F33B8174@comcast.net>
References: <037B6CC7-D60B-4B6F-A9B3-5F95F33B8174@comcast.net>
Message-ID: 

On Sun, Apr 3, 2011 at 11:42 AM, David Winsemius wrote:
>
> On Apr 3, 2011, at 12:14 PM, Samuel Le wrote:
>
>> Dear all,
>>
>> I would like to log the calls to my functions. I am trying to do this
>> using the function match.call():
>
> fTest<-function(x)
> {  theCall<-match.call()
>    print(theCall)
>    return(list(x=x, logf = theCall))
> }
>
>> fTest(x=2)$x
> [1] 2
>> fTest(x=2)$logf
> fTest(x = 2)
>> str(fTest(x=2)$logf)
>  language fTest(x = 2)
>
> You may want to convert that call component to a character object, since:
>
>> cat(fTest(x=2)$logf)
> Error in cat(list(...), file, sep, fill, labels, append) :
>   argument 1 (type 'language') cannot be handled by 'cat'

If you want to examine a call object you need to ensure that it is not
evaluated. Evaluating a number or a character string is not a problem
because

  eval(4)

is the same as

  4

However, evaluating a function call should be different from the call
itself. As David shows, the str function is careful not to evaluate
the call object. (Martin and I found ourselves going around in
circles when looking at the structure of a fitted model object that
included a call and he kindly changed the behavior of str().)

So you need to decide when a function, such as print(), evaluates its
arguments or when it doesn't, which can get kind of complicated. An
alternative is to use match.call() repeatedly instead of trying to
save the value, as in

> fTest
function(x) {
    print(match.call())
    list(x=x, logf = match.call())
}
> fTest(x=2)
fTest(x = 2)
$x
[1] 2

$logf
fTest(x = 2)

The trick there is that the value of match.call() is the unevaluated
call whereas

  myCall <- match.call()
  print(myCall)

evaluates myCall in the call to print, thereby evaluating the function
fTest again.

Is this sufficiently confusing? :-)

>>
>> I can see "theCall" printed into the console, but I don't manage to
>> convert it into a character to write it into a log file with other
>> informations.
>>
>> Can anyone help?
>
> David Winsemius, MD
> West Hartford, CT

From tyler_rinker at hotmail.com Sun Apr 3 19:44:03 2011
From: tyler_rinker at hotmail.com (Tyler Rinker)
Date: Sun, 3 Apr 2011 13:44:03 -0400
Subject: [R] Function for finding NA's
Message-ID: 

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: 

From djmuser at gmail.com Sun Apr 3 19:52:47 2011
From: djmuser at gmail.com (Dennis Murphy)
Date: Sun, 3 Apr 2011 10:52:47 -0700
Subject: [R] :HELP
In-Reply-To: <59163.78.167.98.215.1301836233.mynet@webmail191.mynet.com>
References: <59163.78.167.98.215.1301836233.mynet@webmail191.mynet.com>
Message-ID: 

An embedded and charset-unspecified text was scrubbed...
Name: not available URL: From dwinsemius at comcast.net Sun Apr 3 19:55:54 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Sun, 3 Apr 2011 13:55:54 -0400 Subject: [R] converting "call" objects into character In-Reply-To: References: <037B6CC7-D60B-4B6F-A9B3-5F95F33B8174@comcast.net> Message-ID: On Apr 3, 2011, at 1:22 PM, Douglas Bates wrote: > On Sun, Apr 3, 2011 at 11:42 AM, David Winsemius > wrote: >> >> On Apr 3, 2011, at 12:14 PM, Samuel Le wrote: >> >>> Dear all, >>> >>> >>> >>> I would like to log the calls to my functions. I am trying to do >>> this >>> using the function match.call(): >> >> fTest<-function(x) >> >> { theCall<-match.call() >> print(theCall) >> return(list(x=x, logf = theCall)) >> } >> >>> >>> fTest(x=2)$x >> [1] 2 >>> fTest(x=2)$logf >> fTest(x = 2) >>> str(fTest(x=2)$logf) >> language fTest(x = 2) >> >> You may want to convert that call component to a character object, >> since: >> >>> cat(fTest(x=2)$logf) >> Error in cat(list(...), file, sep, fill, labels, append) : >> argument 1 (type 'language') cannot be handled by 'cat' > > If you want to examine a call object you need to ensure that it is not > evaluated. Evaluating a number or a character string is not a problem > because > > eval(4) > > is the same as > > 4 > > However, evaluating a function call should be different from the call > itself. As David shows, the str function is careful not to evaluate > the call object. (Martin and I found ourselves going around in > circles when looking at the structure of a fitted model object that > included a call and he kindly changed the behavior of str().) > > So you need to decide when a function, such as print(), evaluates its > arguments or when it doesn't, which can get kind of complicated. 
An > alternative is to use match.call() repeatedly instead of trying to > save the value, as in > >> fTest > function(x) { > print(match.call()) > list(x=x, logf = match.call()) > } >> fTest(x=2) > fTest(x = 2) > $x > [1] 2 > > $logf > fTest(x = 2) > > The trick there is that the value of match.call() is the unevaluated > call whereas > > myCall <- match.call() > print(myCall) > > evaluates myCall in the call to print, thereby evaluating the function > fTest again. > > Is this sufficiently confusing? :-) Yes, I am now sufficiently confused^W , ... er, motivated to look for another route. I think the way out of the confusion is to turn the call into text and since as.character doesn't do a very neat a job, I would suggest instead: deparse() > fTest <- function(x) { + print(match.call()) + list(x=x, logf = deparse(match.call())) + } > fTest(x=3)$logf fTest(x = 3) [1] "fTest(x = 3)" > cat(fTest(x=3)$logf) fTest(x = 3) fTest(x = 3) cat() is a convenient test of the capacity of an object to be written to a file. It has an append parameter that implies it could serve the logging function requested by the OP. >>> >>> I can see "theCall" printed into the console, but I don't manage to >>> convert it into a character to write it into a log file with other >>> informations. >>> >>> Can anyone help? David Winsemius, MD West Hartford, CT From dwinsemius at comcast.net Sun Apr 3 20:19:40 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Sun, 3 Apr 2011 14:19:40 -0400 Subject: [R] Function for finding NA's In-Reply-To: References: Message-ID: <816BCC9F-7F84-4344-9921-AAAC10BF75F7@comcast.net> On Apr 3, 2011, at 1:44 PM, Tyler Rinker wrote: > > Quick question, > > I tried to find a function in available packages to find NA's for an > entire data set (or single variables) and report the row of missing > values (NA's for each column). I searched the typical routes > through the blogs and the help manuals for 15 minutes. 
Rather than > spend any more time searching I created my own function to do this > (probably in less time than it would have taken me to find the > function). > > Now I still have the same question: Is this function (NAhunter I > call it) already in existence? If so please direct me (because I'm > sure they've written better code more efficiently). I highly doubt > I'm this first person to want to find all the missing values in a > data set so I assume there is a function for it but I just didn't > spend enough time looking. If there is no existing function (big if > here), is this something people feel is worthwhile for me to put > into a package of some sort? I'm not sure that it would have occurred to people to include it in a package. Consider: getNa <- function(dfrm) lapply(dfrm, function(x) which(is.na(x) ) ) > cities long lat city pop 1 -58.38194 -34.59972 Buenos Aires NA 2 14.25000 40.83333 NA > getNa(cities) $long integer(0) $lat integer(0) $city [1] 2 $pop [1] 1 2 There are several packages with functions by the name `describe` that do most or all of rest of what you have proposed. I happen to use Harrell's Hmisc but the other versions should also be reviewed if you want to avoid re-inventing the wheel. -- David. 
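For anyone who wants to try the one-liner, a self-contained sketch follows. The data frame is a hypothetical stand-in that mimics the two-row cities example shown above (the placement of its missing values is an assumption made to match the printed result), and the colSums()/colMeans() calls are an added base-R way to get per-column counts and proportions of the kind NAhunter reports.

```r
# One-line helper from the reply above: row indices of NAs, per column
getNa <- function(dfrm) lapply(dfrm, function(x) which(is.na(x)))

# Hypothetical stand-in for the 'cities' data frame
cities <- data.frame(
  long = c(-58.38194, 14.25),
  lat  = c(-34.59972, 40.83333),
  city = c("Buenos Aires", NA),
  pop  = c(NA_real_, NA_real_)
)

na_rows   <- getNa(cities)            # list: rows with NA in each column
na_counts <- colSums(is.na(cities))   # number of NAs per column
na_share  <- colMeans(is.na(cities))  # proportion of NAs per column

print(na_rows)
print(na_counts)
print(na_share)
```

Columns with no missing values come back as integer(0) from getNa(), which makes it easy to filter them out with Filter(length, na_rows) if only the affected columns are of interest.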
> > Tyler > > Here's the code: > > NAhunter<-function(dataset) > { > find.NA<-function(variable) > { > if(is.numeric(variable)){ > n<-length(variable) > mean<-mean(variable, na.rm=T) > median<-median(variable, na.rm=T) > sd<-sd(variable, na.rm=T) > NAs<-is.na(variable) > total.NA<-sum(NAs) > percent.missing<-total.NA/n > descriptives<-data.frame(n,mean,median,sd,total.NA,percent.missing) > rownames(descriptives)<-c(" ") > Case.Number<-1:n > Missing.Values<-ifelse(NAs>0,"Missing Value"," ") > missing.value<-data.frame(Case.Number,Missing.Values) > missing.values<-missing.value[ which(Missing.Values=='Missing > Value'),] > list("NUMERIC DATA","DESCRIPTIVES"=t(descriptives),"CASE # OF > MISSING VALUES"=missing.values[,1]) > } > else{ > n<-length(variable) > NAs<-is.na(variable) > total.NA<-sum(NAs) > percent.missing<-total.NA/n > descriptives<-data.frame(n,total.NA,percent.missing) > rownames(descriptives)<-c(" ") > Case.Number<-1:n > Missing.Values<-ifelse(NAs>0,"Missing Value"," ") > missing.value<-data.frame(Case.Number,Missing.Values) > missing.values<-missing.value[ which(Missing.Values=='Missing > Value'),] > list("CATEGORICAL DATA","DESCRIPTIVES"=t(descriptives),"CASE # OF > MISSING VALUES"=missing.values[,1]) > } > } > dataset<-data.frame(dataset) > options(scipen=100) > options(digits=2) > lapply(dataset,find.NA) > } > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
David Winsemius, MD West Hartford, CT From Greg.Snow at imail.org Sun Apr 3 20:41:41 2011 From: Greg.Snow at imail.org (Greg Snow) Date: Sun, 3 Apr 2011 12:41:41 -0600 Subject: [R] kernel density plot In-Reply-To: References: Message-ID: It is better to replace your later calls to plot with calls to lines instead, then you don't need to use par(new=T) which as you see tends to cause problems. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.snow at imail.org 801.408.8111 > -----Original Message----- > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r- > project.org] On Behalf Of Muzna Alvi > Sent: Sunday, April 03, 2011 4:56 AM > To: r-help at r-project.org > Subject: [R] kernel density plot > > I am using the following commands for plotting kernel density for three > kinds of crops > > density(s22$Net_income_Total.1, bw="nrd0",adjust=1, > kernel=c("gaussian"))->t > plot(t, xlim=c(-30000,40000), main="Net Income Distribution", axes=F, > ylim=c(0,0.00035). xlab="Value in Rupees") > par(new=T) > density(s33$Net_income_Total.1, bw="nrd0",adjust=1, > kernel=c("gaussian"))->u > plot(u, xlim=c(-30000,40000), axes=F, main="", col="red", > ylim=c(0,0.00035)) > par(new=T) > density(s44$Net_income_Total.1, bw="nrd0",adjust=1, > kernel=c("gaussian"))->v > plot(v, xlim=c(-30000,40000), col="blue", axes=F, main="", > ylim=c(0,0.00035)) > > the problem is that in the graph that is drawn > > 1. the xlab gets hidden with the [N= and the bandwidth=] values > 2. when i do par(new=T) this N and bandwidth value appears multiple > times..overlapping each time and making the graph look untidy.. > > > Is there any way of making these N and Bandwidth values not appear in > the > graph? 
> > Thanks > > -- > -- > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting- > guide.html > and provide commented, minimal, self-contained, reproducible code. From muzna.alvi at gmail.com Sun Apr 3 20:52:21 2011 From: muzna.alvi at gmail.com (Muzna Alvi) Date: Mon, 4 Apr 2011 00:22:21 +0530 Subject: [R] kernel density plot In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From djandrija at gmail.com Sun Apr 3 21:02:35 2011 From: djandrija at gmail.com (andrija djurovic) Date: Sun, 3 Apr 2011 21:02:35 +0200 Subject: [R] Help in splitting ists into sub-lists In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From Greg.Snow at imail.org Sun Apr 3 21:49:40 2011 From: Greg.Snow at imail.org (Greg Snow) Date: Sun, 3 Apr 2011 13:49:40 -0600 Subject: [R] kernel density plot In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From sebastian.daza at gmail.com Sun Apr 3 22:04:11 2011 From: sebastian.daza at gmail.com (=?ISO-8859-1?Q?Sebasti=E1n_Daza?=) Date: Sun, 03 Apr 2011 15:04:11 -0500 Subject: [R] setCoefTemplate Message-ID: <4D98D2BB.6000504@gmail.com> Hi everyone, I am trying to build a table putting standard errors horizontally. I haven't been able to do it. 
library(memisc) berkeley <- aggregate(Table(Admit,Freq)~.,data=UCBAdmissions) berk0 <- glm(cbind(Admitted,Rejected)~1,data=berkeley,family="binomial") berk1 <- glm(cbind(Admitted,Rejected)~Gender,data=berkeley,family="binomial") berk2 <- glm(cbind(Admitted,Rejected)~Gender+Dept,data=berkeley,family="binomial") setCoefTemplate(est.se=c(est = "($est:#)($se:#)")) mtable(berk0,berk1,berk2, + coef.style="est.se", + summary.stats=c("Deviance","AIC","N")) Error in dim(ans) <- newdims : dims [product 1] do not match the length of object [2] Thank you in advance. -- Sebasti?n Daza sebastian.daza at gmail.com From mazatlanmexico at yahoo.com Sun Apr 3 21:41:01 2011 From: mazatlanmexico at yahoo.com (Felipe Carrillo) Date: Sun, 3 Apr 2011 12:41:01 -0700 Subject: [R] another question on shapefiles and geom_point in ggplot2 In-Reply-To: <4D981220.7060404@gmail.com> References: <4D9772A4.40208@gmail.com> <276101.2012.qm@web56602.mail.re3.yahoo.com> <4D97BE22.4040002@gmail.com> <6775.63438.qm@web56603.mail.re3.yahoo.com> <4D97C9E1.7080207@gmail.com> <297554.87949.qm@web56603.mail.re3.yahoo.com> <4D980737.7030401@gmail.com> <472714.79454.qm@web56606.mail.re3.yahoo.com> <4D981080.9040701@gmail.com> <173268.37139.qm@web56608.mail.re3.yahoo.com> <4D981220.7060404@gmail.com> Message-ID: <969108.33353.qm@web56603.mail.re3.yahoo.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From mnovak1 at ucsc.edu Sun Apr 3 17:58:19 2011 From: mnovak1 at ucsc.edu (Mark Novak) Date: Sun, 03 Apr 2011 08:58:19 -0700 Subject: [R] zoo:rollapply by multiple grouping factors Message-ID: <4D98991B.10909@ucsc.edu> # Hi there, # I am trying to apply a function over a moving-window for a large number of multivariate time-series that are grouped in a nested set of factors. I have spent a few days searching for solutions with no luck, so any suggestions are much appreciated. 
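One base R way to approach the grouped moving-window problem described above, without reshaping, is to let ave() apply a rolling function within each Site/Plot/Sp group and write the result straight back into the long-format data frame. The following is a sketch only: roll_cv and the toy data are made up here, and it assumes that the time steps within each group are consecutive once the rows are sorted by Time (gaps in the survey times would need extra handling).

```r
# Right-aligned rolling coefficient of variation; the first w-1 values are NA.
roll_cv <- function(x, w = 2) {
  n <- length(x)
  out <- rep(NA_real_, n)
  if (n >= w) {
    for (i in w:n) {
      win <- x[(i - w + 1):i]
      out[i] <- sd(win) / mean(win)
    }
  }
  out
}

# Toy data in long format, invented for this example
set.seed(1)
toy <- data.frame(Site  = 1,
                  Plot  = rep(1:3, each = 8),
                  Time  = rep(rep(1:4, each = 2), times = 3),
                  Sp    = rep(1:2, times = 12),
                  Count = as.numeric(sample(24)))

# Sort so that rows are in time order within each group, then let ave()
# run roll_cv() group by group and put the values back in place.
toy <- toy[order(toy$Site, toy$Plot, toy$Sp, toy$Time), ]
toy$CV <- ave(toy$Count, toy$Site, toy$Plot, toy$Sp,
              FUN = function(x) roll_cv(x, w = 2))
toy
```

Because ave() returns a vector as long as its input, the CV column lines up with the original rows and the Time column is never lost.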
# The data I have are for the abundance dynamics of multiple species observed in multiple fixed plots at multiple sites. (I total I have 7 sites, ~3-5 plots/site, ~150 species/plot, for 60 time-steps each.) So my data look something like this: dat<-data.frame(Site=rep(1), Plot=rep(c(rep(1,8),rep(2,8),rep(3,8)),1), Time=rep(c(1,1,2,2,3,3,4,4)), Sp=rep(1:2), Count=sample(24)) dat # Let the function I want to apply over a right-aligned window of w=2 time steps be: cv<-function(x){sd(x)/mean(x)} w<-2 # The final output I want would look something like this: Out<-data.frame(dat,CV=round(c(NA,NA,runif(6,0,1),c(NA,NA,runif(6,0,1))),2)) # I could reshape and apply zoo:rollapply() to a given plot at a given site, and reshape again as follows: library(zoo) a<-subset(dat,Site==1&Plot==1) b<-reshape(a[-c(1,2)],v.names='Count',idvar='Time',timevar='Sp',direction='wide') d<-zoo(b[,-1],b[,1]) d out<-rollapply(d, w, cv, na.pad=T, align='right') out # I would thereby have to loop through all my sites and plots which, although it deals with all species at once, still seems exceedingly inefficient. # So the question is, how do I use something like aggregate.zoo or tapply or even lapply to apply rollapply on each species' time series. # The closest I've come is the following two approaches: # First let: datx<-list(Site=dat$Site,Plot=dat$Plot,Sp=dat$Sp) daty<-dat$Count # Method 1. out1<-tapply(seq(along=daty),datx,function(i,x=daty){ rollapply(zoo(x[i]), w, cv, na.pad=T, align='right') }) out1 out1[,,1] # Which "works" in that it gives me the right answers, but in a format from which I can't figure out how to get back into the format I want. # Method 2. fun<-function(x){y<-zoo(x);coredata(rollapply(y, w, cv,na.pad=T,align='right'))} out2<-aggregate(daty,by=datx,fun) out2 # Which superficially "works" better, but again only in a format I can't figure out how to use because the output seems to be a mix of data.frame and lists. 
out2[1,4] out2[1,5] is.data.frame(out2) is.list(out2) # The situation is made more problematic by the fact that the time point of first survey can differ between plots (e.g., site1-plot3 may only start at time-point 3). As in... dat2<-dat dat2<-dat2[-which(dat2$Plot==3 & dat2$Time<3),] dat2 # I must therefore ensure that I'm keeping track of the true time associated with each value, not just the order of their occurences. This information is (seemingly) lost by both methods. datx<-list(Site=dat2$Site,Plot=dat2$Plot,Sp=dat2$Sp) daty<-dat2$Count # Method 1. out3<-tapply(seq(along=daty),datx,function(i,x=daty){ rollapply(zoo(x[i]), w, cv, na.pad=T, align='right') }) out3 out3[1,3,1] time(out3[1,3,1]) # Method 2 out4<-aggregate(daty,by=datx,fun) out4 time(out4[3,4]) # Am I going about this all wrong? Is there a different package to try? Any thoughts and suggestions are much appreciated! # R 2.12.2 GUI 1.36 Leopard build 32-bit (5691); zoo 1.6-4 # Thanks! # -mark -- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-- Ecology & Evolutionary Biology University of California, Santa Cruz Long Marine Laboratory 100 Shaffer Road Santa Cruz, CA 95060-5730 Ph: 773-256-8645 Fax: 831-459-3383 http://people.ucsc.edu/~mnovak1/ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~-- From sebastian.daza at gmail.com Sun Apr 3 18:42:56 2011 From: sebastian.daza at gmail.com (=?ISO-8859-1?Q?Sebasti=E1n_Daza?=) Date: Sun, 03 Apr 2011 16:42:56 -0000 Subject: [R] setCoefTemplate Message-ID: <4819F302.20501@gmail.com> Hi everyone, I am trying to build a table putting standard errors horizontally. I haven't been able to do it. 
library(memisc)
berkeley <- aggregate(Table(Admit,Freq)~.,data=UCBAdmissions)

berk0 <- glm(cbind(Admitted,Rejected)~1,data=berkeley,family="binomial")
berk1 <- glm(cbind(Admitted,Rejected)~Gender,data=berkeley,family="binomial")
berk2 <- glm(cbind(Admitted,Rejected)~Gender+Dept,data=berkeley,family="binomial")

setCoefTemplate(est.se=c(est = "($est:#)($se:#)"))

mtable(berk0,berk1,berk2,
+ coef.style="est.se",
+ summary.stats=c("Deviance","AIC","N"))
Error in dim(ans) <- newdims :
  dims [product 1] do not match the length of object [2]

Thank you in advance.

--
Sebastián Daza
sebastian.daza at gmail.com

From sebastian.daza at gmail.com Sun Apr 3 19:46:38 2011
From: sebastian.daza at gmail.com (=?ISO-8859-1?Q?Sebasti=E1n_Daza?=)
Date: Sun, 03 Apr 2011 17:46:38 -0000
Subject: [R] style question
Message-ID: <481A01F3.5010300@gmail.com>

Hi everyone,
I am trying to build a table putting standard errors horizontally. I haven't been able to do it.
library(memisc)
berkeley <- aggregate(Table(Admit,Freq)~.,data=UCBAdmissions)

berk0 <- glm(cbind(Admitted,Rejected)~1,data=berkeley,family="binomial")
berk1 <- glm(cbind(Admitted,Rejected)~Gender,data=berkeley,family="binomial")
berk2 <- glm(cbind(Admitted,Rejected)~Gender+Dept,data=berkeley,family="binomial")

setCoefTemplate(est.se=c(est = "($est:#)($se:#)"))

mtable(berk0,berk1,berk2,
+ coef.style="est.se",
+ summary.stats=c("Deviance","AIC","N"))
Error in dim(ans) <- newdims :
  dims [product 1] do not match the length of object [2]

Thank you in advance.

--
Sebastián Daza
sebastian.daza at gmail.com

From algotr8der at gmail.com Sun Apr 3 19:57:49 2011
From: algotr8der at gmail.com (algotr8der)
Date: Sun, 3 Apr 2011 12:57:49 -0500 (CDT)
Subject: [R] R-project: plot 2 zoo objects (price series) that have some date mis-matches
Message-ID: <1301853469771-3423899.post@n4.nabble.com>

I have 2 zoo objects -

1) Interest rate spread between 10-YR-US-Treasury and 2-YR-US-Treasury (object name = sprd)
2) S&P 500 index (object name = spy)

> str(spy)
‘zoo’ series from 1976-06-01 to 2011-03-31
  Data: num [1:8791] 99.8 100.2 100.1 99.2 98.6 ...
  Index: Class 'Date' num [1:8791] 2343 2344 2345 2346 2349 ...

> str(sprd)
‘zoo’ series from 1976-06-01 to 2011-03-31
  Data: num [1:9088] 0.68 0.71 0.7 0.77 0.79 0.79 0.82 0.86 0.83 0.83 ...
  Index: Class 'Date' num [1:9088] 2343 2344 2345 2346 2349 ...

Since there are NA data points in object 'sprd' I created another object that omits "NA". The name of that object is 'sprdtmp'.

> str(sprdtmp)
‘zoo’ series from 1976-06-01 to 2011-03-31
  Data: atomic [1:8704] 0.68 0.71 0.7 0.77 0.79 0.79 0.82 0.86 0.83 0.83 ...
  - attr(*, "na.action")=Class 'omit' int [1:384] 25 70 95 111 118 128 149 190 224 260 ...
  Index: Class 'Date' num [1:8704] 2343 2344 2345 2346 2349 ...

I want to plot both time series on the same plot with time/date on the x-axis and the axis label quarterly (or monthly).
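For series whose dates only partly overlap, one common approach is to align them on their shared dates first and then draw the second series against its own right-hand axis. The sketch below uses plain data frames and base graphics so that it runs without extra packages; the data, the column names and the helper plot_two_scales are all invented for illustration. With zoo objects the alignment step would be merge(sprdtmp, spy, all = FALSE), which keeps only the dates present in both.

```r
set.seed(42)
# Two made-up daily series; the second one is missing a few dates
d1 <- seq(as.Date("2011-01-03"), by = "day", length.out = 60)
d2 <- d1[-c(5, 17, 30)]
sprd <- data.frame(date = d1, sprd = 2.5  + cumsum(rnorm(60, 0, 0.02)))
spy  <- data.frame(date = d2, spy  = 1300 + cumsum(rnorm(57, 0, 5)))

# Inner join: keep only the dates that occur in both series
both <- merge(sprd, spy, by = "date")

# Hypothetical helper: left axis for the spread, right axis for the index
plot_two_scales <- function(df) {
  op <- par(mar = c(5, 4, 4, 4) + 0.1)   # leave room for the right-hand axis
  on.exit(par(op))
  plot(df$date, df$sprd, type = "l", xlab = "Date", ylab = "Spread")
  par(new = TRUE)                         # overlay the second series
  plot(df$date, df$spy, type = "l", col = "red",
       axes = FALSE, xlab = "", ylab = "")
  axis(side = 4)
  mtext("Index level", side = 4, line = 3)
}

pdf(NULL)                # null device, so the sketch also runs non-interactively
plot_two_scales(both)
invisible(dev.off())
```

In an interactive session the pdf(NULL) and dev.off() lines can simply be dropped.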
One problem is that the sprdtmp and spy objects do not have the same number of data points, as there are times when the equity markets are closed while the interest rate markets are open. For the most part the dates overlap. Would this matter if I try to plot both on the same plot? And how would I go about plotting these objects in one plot?

The second part of the plot requires that both are on different scales. I guess I could take a log of the S&P and plot that along with the rate spread. But it would be nice to know, for future reference, how to plot 2 series in one plot with 2 different scales.

I spent all night yesterday and this morning trying various options but I can't seem to get this to work. I have read through the ?plot, ?plot.zoo and ?axis documentation without any success.

I would greatly appreciate it if anyone can point me in the right direction.

Thank you kindly.

--
View this message in context: http://r.789695.n4.nabble.com/R-project-plot-2-zoo-objects-price-series-that-have-some-date-mis-matches-tp3423899p3423899.html
Sent from the R help mailing list archive at Nabble.com.

From friedman.steve at gmail.com Sun Apr 3 20:35:23 2011
From: friedman.steve at gmail.com (Steve Friedman)
Date: Sun, 3 Apr 2011 14:35:23 -0400
Subject: [R] installing ncdf on Ubuntu 10.04
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From antrael at hotmail.com Sun Apr 3 21:38:03 2011
From: antrael at hotmail.com (jouba)
Date: Sun, 3 Apr 2011 14:38:03 -0500 (CDT)
Subject: [R] Structural equation modeling in R(lavaan,sem)
In-Reply-To:
References: <1301253139729-3409642.post@n4.nabble.com> <4D9073F7.2040309@gmail.com> <1301426701835-3415954.post@n4.nabble.com>
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available URL: From tyler_rinker at hotmail.com Sun Apr 3 21:46:54 2011 From: tyler_rinker at hotmail.com (Tyler Rinker) Date: Sun, 3 Apr 2011 15:46:54 -0400 Subject: [R] Function for finding NA's In-Reply-To: <816BCC9F-7F84-4344-9921-AAAC10BF75F7@comcast.net> References: , <816BCC9F-7F84-4344-9921-AAAC10BF75F7@comcast.net> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From ericstrom at aol.com Sun Apr 3 22:12:31 2011 From: ericstrom at aol.com (eric) Date: Sun, 3 Apr 2011 15:12:31 -0500 (CDT) Subject: [R] How do I modify uniroot function to return .0001 if error ? Message-ID: <1301861551841-3424092.post@n4.nabble.com> I am calling the uniroot function from inside another function using these lines (last two lines of the function) : d <- uniroot(k, c(.001, 250), tol=.05) return(d$root) The problem is that on occasion there's a problem with the values I'm passing to uniroot. In those instances uniroot stops and sends a message that it can't calculate the root because f.upper * f.lower is greater than zero. All I'd like to do in those cases is be able to set the return value of my calling function "return(d$root)" to .0001. But I'm not sure how to pull that off. I tried a few modifications to uniroot but so far no luck. 
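One way to get that behaviour without touching uniroot itself is to wrap the call in tryCatch() and hand back the fallback value from the error handler. This is only a sketch: safe_root, k1 and k2 are names invented for the example, and the handler as written swallows every error from uniroot, not just the sign-change one.

```r
# Hypothetical wrapper: return the root, or a fallback value if uniroot() errors.
safe_root <- function(f, interval, fallback = 1e-4, ...) {
  tryCatch(
    uniroot(f, interval, ...)$root,
    error = function(e) fallback
  )
}

k1 <- function(x) x^2 - 4   # changes sign on (.001, 250), root near 2
k2 <- function(x) x^2 + 1   # never changes sign, so uniroot() stops with an error

r1 <- safe_root(k1, c(.001, 250), tol = .05)   # a root estimate close to 2
r2 <- safe_root(k2, c(.001, 250), tol = .05)   # the fallback, 1e-04
```

Inside the calling function, the last two lines would then become something like return(safe_root(k, c(.001, 250), tol = .05)).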
For convenience, the uniroot function is shown below:

uniroot <- function (f, interval, ..., lower = min(interval), upper = max(interval),
    f.lower = f(lower, ...), f.upper = f(upper, ...), tol = .Machine$double.eps^0.25,
    maxiter = 1000)
{
    if (!missing(interval) && length(interval) != 2L)
        stop("'interval' must be a vector of length 2")
    if (!is.numeric(lower) || !is.numeric(upper) || lower >= upper)
        stop("lower < upper is not fulfilled")
    if (is.na(f.lower))
        stop("f.lower = f(lower) is NA")
    if (is.na(f.upper))
        stop("f.upper = f(upper) is NA")
    if (f.lower * f.upper > 0)
        stop("f.up * f.down > 0")
    val <- .Internal(zeroin2(function(arg) f(arg, ...), lower,
        upper, f.lower, f.upper, tol, as.integer(maxiter)))
    iter <- as.integer(val[2L])
    if (iter < 0) {
        warning("_NOT_ converged in ", maxiter, " iterations")
        iter <- maxiter
    }
    list(root = val[1L], f.root = f(val[1L], ...), iter = iter,
        estim.prec = val[3L])
}

--
View this message in context: http://r.789695.n4.nabble.com/How-do-I-modify-uniroot-function-to-return-0001-if-error-tp3424092p3424092.html
Sent from the R help mailing list archive at Nabble.com.

From dylan.glynn at englund.lu.se Sun Apr 3 18:47:27 2011
From: dylan.glynn at englund.lu.se (dsg)
Date: Sun, 3 Apr 2011 11:47:27 -0500 (CDT)
Subject: [R] Homals package color function problem
Message-ID: <1301849247672-3423814.post@n4.nabble.com>

Hello

The Homals package and its plot options are excellent. However, I am unable to manipulate the colour in the plots.

In a call such as:

plot(mc_analysis, plot.dim = c(1,3), plot.type = "jointplot", col = 1)

this should be straightforward - but I can't seem to affect the plotted colours

I have tried various combinations for "col" commands in other plot packages I know
col =
col = c('red', 'green')
ccol =
rcol =
also the different ways of designating the colour
col = 1, col = "black", col = (#6698FF)

I get no error messages, I have tried flushing R, restarting etc.
the colours never change I'm under MacOS- 10.5.8, in R- 2.12.2 I'm sure it is something so obvious I'll crawl away and hide when I find it, but for now.... ideas anyone? Dylan -- View this message in context: http://r.789695.n4.nabble.com/Homals-package-color-function-problem-tp3423814p3423814.html Sent from the R help mailing list archive at Nabble.com. From naba9_ming at hotmail.com Sun Apr 3 19:01:39 2011 From: naba9_ming at hotmail.com (Kin Ming Wong) Date: Mon, 4 Apr 2011 01:01:39 +0800 Subject: [R] Standard Error for Cointegration Results Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From wuleiwong at gmail.com Sun Apr 3 17:36:14 2011 From: wuleiwong at gmail.com (wulei wong) Date: Sun, 3 Apr 2011 23:36:14 +0800 Subject: [R] R gui on windows how to force to always show the last line of output In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From wuleiwong at gmail.com Sun Apr 3 17:36:14 2011 From: wuleiwong at gmail.com (wulei wong) Date: Sun, 3 Apr 2011 23:36:14 +0800 Subject: [R] Discretizing data rows into regular intervals In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From daniel at umd.edu Sun Apr 3 22:32:17 2011 From: daniel at umd.edu (Daniel Malter) Date: Sun, 3 Apr 2011 15:32:17 -0500 (CDT) Subject: [R] Plotting MDS (multidimensional scaling) In-Reply-To: References: <1301774855661-3422670.post@n4.nabble.com> Message-ID: <1301862737089-3424135.post@n4.nabble.com> The header "metric mds" was actually a leftover because I initially used cmdscale and did not bother changing it for the example. Thanks, Daniel -- View this message in context: http://r.789695.n4.nabble.com/Plotting-MDS-multidimensional-scaling-tp3422670p3424135.html Sent from the R help mailing list archive at Nabble.com. 
From tlumley at uw.edu Sun Apr 3 23:02:02 2011
From: tlumley at uw.edu (Thomas Lumley)
Date: Mon, 4 Apr 2011 09:02:02 +1200
Subject: [R] installing ncdf on Ubuntu 10.04
In-Reply-To:
References:
Message-ID:

On Mon, Apr 4, 2011 at 6:35 AM, Steve Friedman wrote:
> Hello
>
> I'm trying to install the "ncdf" package on a Ubuntu 10.04 laptop.
>
> Can you offer suggestions to install this package?

> checking for netcdf.h... no
> configure: error: netcdf header netcdf.h not found
> ERROR: configuration failed for package ‘ncdf’
> * removing ‘/home/steve/R/i486-pc-linux-gnu-library/2.12/ncdf’

Looks like you need the netCDF libraries. https://launchpad.net/ubuntu/+source/netcdf suggests that you need both the binary and a -dev package with headers.

The ncdf binaries on Windows and Mac have been specially built to include the netCDF libraries, but the source package doesn't include them.

   -thomas

--
Thomas Lumley
Professor of Biostatistics
University of Auckland

From dwinsemius at comcast.net Sun Apr 3 23:44:55 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Sun, 3 Apr 2011 17:44:55 -0400
Subject: [R] Function for finding NA's
In-Reply-To:
References: , <816BCC9F-7F84-4344-9921-AAAC10BF75F7@comcast.net>
Message-ID: <1D52625D-87CD-4C82-92FE-A0D9634F161F@comcast.net>

On Apr 3, 2011, at 3:46 PM, Tyler Rinker wrote:

> Thanks David,
>
> After seeing the simplicity of your function versus the convoluted
> mess I worked up I now understand why it's not necessary to have a
> package to find NA's (and from what you said is a part of other
> packages such as Hmisc already).

I'm actually not aware that any of the `describe` variants will return the indices of NA's. In the case of a real dataset such an object could be fairly large. It was the other descriptive functions that I said were probably already coded.

>
> I am at the 2 1/2 month mark as an R user and have loads to learn.
> Simpler is better.
Thanks David for your time and I will take the > information you gave and put it to use in new situations. You should also familiarize yourself with complete.cases() and the various functions that handle na.action parameters (linked from that help page). Note that complete.cases returns a logical vector (not the cases themselves) and is designed for indexing matrices or dataframes. > > Tyler > > > CC: r-help at r-project.org > > From: dwinsemius at comcast.net > > To: tyler_rinker at hotmail.com > > Subject: Re: [R] Function for finding NA's > > Date: Sun, 3 Apr 2011 14:19:40 -0400 > > > > > > On Apr 3, 2011, at 1:44 PM, Tyler Rinker wrote: > > > > > > > > Quick question, > > > > > > I tried to find a function in available packages to find NA's > for an > > > entire data set (or single variables) and report the row of > missing > > > values (NA's for each column). I searched the typical routes > > > through the blogs and the help manuals for 15 minutes. Rather than > > > spend any more time searching I created my own function to do this > > > (probably in less time than it would have taken me to find the > > > function). > > > > > > Now I still have the same question: Is this function (NAhunter I > > > call it) already in existence? If so please direct me (because I'm > > > sure they've written better code more efficiently). I highly doubt > > > I'm this first person to want to find all the missing values in a > > > data set so I assume there is a function for it but I just didn't > > > spend enough time looking. If there is no existing function (big > if > > > here), is this something people feel is worthwhile for me to put > > > into a package of some sort? > > > > I'm not sure that it would have occurred to people to include it > in a > > package. 
Consider: > > > > getNa <- function(dfrm) lapply(dfrm, function(x) which(is.na(x) ) ) > > > > > cities > > long lat city pop > > 1 -58.38194 -34.59972 Buenos Aires NA > > 2 14.25000 40.83333 NA > > > getNa(cities) > > $long > > integer(0) > > > > $lat > > integer(0) > > > > $city > > [1] 2 > > > > $pop > > [1] 1 2 > > > > There are several packages with functions by the name `describe` > that > > do most or all of rest of what you have proposed. I happen to use > > Harrell's Hmisc but the other versions should also be reviewed if > you > > want to avoid re-inventing the wheel. > > -- > > David. > > > > > > > > Tyler > > > > > > Here's the code: > > > > > > NAhunter<-function(dataset) > > > { > > > find.NA<-function(variable) > > > { > > > if(is.numeric(variable)){ > > > n<-length(variable) > > > mean<-mean(variable, na.rm=T) > > > median<-median(variable, na.rm=T) > > > sd<-sd(variable, na.rm=T) > > > NAs<-is.na(variable) > > > total.NA<-sum(NAs) > > > percent.missing<-total.NA/n > > > descriptives<- > data.frame(n,mean,median,sd,total.NA,percent.missing) > > > rownames(descriptives)<-c(" ") > > > Case.Number<-1:n > > > Missing.Values<-ifelse(NAs>0,"Missing Value"," ") > > > missing.value<-data.frame(Case.Number,Missing.Values) > > > missing.values<-missing.value[ which(Missing.Values=='Missing > > > Value'),] > > > list("NUMERIC DATA","DESCRIPTIVES"=t(descriptives),"CASE # OF > > > MISSING VALUES"=missing.values[,1]) > > > } > > > else{ > > > n<-length(variable) > > > NAs<-is.na(variable) > > > total.NA<-sum(NAs) > > > percent.missing<-total.NA/n > > > descriptives<-data.frame(n,total.NA,percent.missing) > > > rownames(descriptives)<-c(" ") > > > Case.Number<-1:n > > > Missing.Values<-ifelse(NAs>0,"Missing Value"," ") > > > missing.value<-data.frame(Case.Number,Missing.Values) > > > missing.values<-missing.value[ which(Missing.Values=='Missing > > > Value'),] > > > list("CATEGORICAL DATA","DESCRIPTIVES"=t(descriptives),"CASE # OF > > > MISSING 
VALUES"=missing.values[,1]) > > > } > > > } > > > dataset<-data.frame(dataset) > > > options(scipen=100) > > > options(digits=2) > > > lapply(dataset,find.NA) > > > } > > > [[alternative HTML version deleted]] > > > > > > ______________________________________________ > > > R-help at r-project.org mailing list > > > https://stat.ethz.ch/mailman/listinfo/r-help > > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > > > and provide commented, minimal, self-contained, reproducible code. > > > > David Winsemius, MD > > West Hartford, CT > > David Winsemius, MD West Hartford, CT From dwinsemius at comcast.net Sun Apr 3 23:57:36 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Sun, 3 Apr 2011 17:57:36 -0400 Subject: [R] style question In-Reply-To: <481A01F3.5010300@gmail.com> References: <481A01F3.5010300@gmail.com> Message-ID: On May 1, 2008, at 1:46 PM, Sebasti?n Daza wrote: > Hi everyone, > I am trying to build a table putting standard errors horizontally. I > haven't been able to do it. > > library(memisc) > berkeley <- aggregate(Table(Admit,Freq)~.,data=UCBAdmissions) > > berk0 <- > glm(cbind(Admitted,Rejected)~1,data=berkeley,family="binomial") > berk1 <- > glm(cbind(Admitted,Rejected)~Gender,data=berkeley,family="binomial") > berk2 <- glm(cbind(Admitted,Rejected)~Gender > +Dept,data=berkeley,family="binomial") > > setCoefTemplate(est.se=c(est = "($est:#)($se:#)")) I'm not a skilled user of that package but just looking at the value of the last four leaves of the list you created makes me wonder if you meant to do something like the `ci.se.horizontal` variant? 
$ci.se.horizontal
    est         se
est "($est:#)"  "(($se:#))"
ci  "[($lwr:#)" "($upr:#)]"

$ci.p
        est           p         lwr         upr
 "($est:#)"  "(($p:#))" "[($lwr:#)"  "($upr:#)]"

$ci.p.horizontal
    est         se
est "($est:#)"  "(($p:#))"
ci  "[($lwr:#)" "($upr:#)]"

$est.se   # Your addition doesn't really look like the others in the list
              est
"($est:#)($se:#)"

This runs without error:

> mtable(berk0,berk1,berk2, coef.style="ci.se.horizontal", summary.stats=c("Deviance","AIC","N"))

On the other hand maybe you wanted this, (note a two item list):

tt<- setCoefTemplate(est.se=list(est = "($est:#)", se="($se:#)"))

> tt['est.se']
$est.se
       est         se
"($est:#)"  "($se:#)"

> mtable(berk0,berk1,berk2,
+ coef.style="est.se",
+ summary.stats=c("Deviance","AIC","N"))

Calls:
berk0: glm(formula = cbind(Admitted, Rejected) ~ 1, family = "binomial", data = berkeley)
berk1: glm(formula = cbind(Admitted, Rejected) ~ Gender, family = "binomial", data = berkeley)
berk2: glm(formula = cbind(Admitted, Rejected) ~ Gender + Dept, family = "binomial", data = berkeley)

===============================================
                        berk0    berk1    berk2
-----------------------------------------------
(Intercept)            -0.457   -0.220    0.582
                        0.031    0.039    0.069
Gender: Female/Male             -0.610    0.100
                                 0.064    0.081
Dept: B/A                                -0.043
                                          0.110
Dept: C/A                                -1.263
                                          0.107
Dept: D/A                                -1.295
                                          0.106
Dept: E/A                                -1.739
                                          0.126
Dept: F/A                                -3.306
                                          0.170
-----------------------------------------------
Deviance              877.056  783.607   20.204
AIC                   947.996  856.547  103.144
N                        4526     4526     4526
===============================================

>
> mtable(berk0,berk1,berk2,
> + coef.style="est.se",
> + summary.stats=c("Deviance","AIC","N"))
> Error in dim(ans) <- newdims :
> dims [product 1] do not match the length of object [2]
>
> Thank you in advance.
>
> --
> Sebastián Daza
> sebastian.daza at gmail.com
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
West Hartford, CT

From mspinola10 at gmail.com Sun Apr 3 22:51:02 2011
From: mspinola10 at gmail.com (=?ISO-8859-1?Q?Manuel_Sp=EDnola?=)
Date: Sun, 3 Apr 2011 14:51:02 -0600
Subject: [R] another question on shapefiles and geom_point in ggplot2
In-Reply-To: <969108.33353.qm@web56603.mail.re3.yahoo.com>
References: <4D9772A4.40208@gmail.com> <276101.2012.qm@web56602.mail.re3.yahoo.com> <4D97BE22.4040002@gmail.com> <6775.63438.qm@web56603.mail.re3.yahoo.com> <4D97C9E1.7080207@gmail.com> <297554.87949.qm@web56603.mail.re3.yahoo.com> <4D980737.7030401@gmail.com> <472714.79454.qm@web56606.mail.re3.yahoo.com> <4D981080.9040701@gmail.com> <173268.37139.qm@web56608.mail.re3.yahoo.com> <4D981220.7060404@gmail.com> <969108.33353.qm@web56603.mail.re3.yahoo.com>
Message-ID: <4D98DDB6.9030503@gmail.com>

Thank you very much Felipe,

Did you see the solution from ahmadou dicko? He doesn't use gpclibPermit()

I have another option but I cannot get the right fill for the id. See attached map.
ai_biotica = readOGR(dsn="C:/ProyectosRespacial/ICE/SIG_Biotica_PHED", layer="AI_BIOTICA_010411_CRTM05")
str(ai_biotica)

# fortify to get the data
fortify.ai_biotica <- fortify.SpatialPolygonsDataFrame(ai_biotica,region='Area_Influ')
names(fortify.ai_biotica)
str(fortify.ai_biotica)
levels(fortify.ai_biotica$group)

# map
ggplot(fortify.ai_biotica, aes(x = long, y=lat, group = group)) +
  geom_polygon(colour = "black", fill = NA)

geo = read.csv("riqueza_out.csv", sep = ",", header = T)
names(geo)
str(geo)
summary(geo)

# map with species richness
p = ggplot(geo, aes(x, y))
p + geom_point(aes(size = ACE, colour = ACE)) + theme_bw() +
  scale_size(name = "Número de especies", breaks = c(2, 4, 6, 8, 10, 12, 14, 16, 18, 20)) +
  scale_colour_gradientn(name = 'Número de especies', colours = heat.colors(10), breaks = c(2, 4, 6, 8, 10, 12, 14, 16, 18, 20))+
  xlab("Longitud") + ylab("Latitud") +
  opts(axis.text.x = theme_text(size = 8, vjust = 1)) +
  opts(axis.text.y = theme_text(size = 8, hjust = 1)) +
  geom_path(aes(x=long,y=lat,group=group, fill=id),data=fortify.ai_biotica)

Best,

Manuel

On 03/04/2011 01:41 p.m., Felipe Carrillo wrote:
> Manuel:
> I changed your variable names from x to 'long' and y to 'lat' on the
> riqueza_out.csv file.
> The code below should do what you want. Also, since the legend title
> is kind of long, I broke it
> down into three lines so you can see more plot area. I am cc'ing the
> other groups so more people
> use it if needed.
> library(rgdal)
> library(ggplot2)
> library(sp)
> library(maptools)
> gpclibPermit()
> manuel <- readOGR(dsn=".", layer="AI_BIOTICA_010411_CRTM05")
> names(manuel);dim(manuel)
> slotNames(manuel) # look at the slot names
> # add the 'id' variable to the shapefile and use it to merge both files
> manuel@data$id = rownames(manuel@data)
> # convert shapefile to dataframe
> manuel.df <- as.data.frame(manuel)
> # fortify to plot with ggplot2
> manuel_fort <- fortify(manuel,region="id")
> head(manuel_fort)
> # Merge shapefile and the as.dataframe shapefile
> manuel_merged <- join(manuel_fort,manuel.df, by ="id")
> head(manuel_merged)
> # Read in the csv file
> manuel_points <- read.csv("riqueza_out.csv")
> head(manuel_points);dim(manuel_points)
> # fortify this one too for the points or else an error will occur
> manuel_points <- fortify(manuel_points)
> manuel_points
> # Plot the shapefile and overlay the points on it
> p <- ggplot(manuel_merged, aes(long,lat,group=group)) +
> geom_polygon(aes(data=manuel_merged,fill=Area_Influ)) +
> geom_path(color="white") + theme_bw() # remove this if you don't want black and white background
> p + geom_point(data=manuel_points,aes(size=ACE,colour=ACE,group=NULL)) +
> scale_size(name = "Número\nde\nespecies", breaks = c(2, 4, 6, 8, 10, 12, 14, 16, 18, 20)) +
> scale_colour_gradientn(name = 'Número\nde\nespecies',
> colours = rainbow(6), breaks = c(2, 4, 6, 8, 10, 12, 14, 16, 18, 20))+
> xlab("Longitud") + ylab("Latitud") +
> opts(axis.text.x = theme_text(size = 8, vjust = 1)) +
> opts(axis.text.y = theme_text(size = 8, hjust = 1))
>
>
> Felipe D. Carrillo
> Supervisory Fishery Biologist
> Department of the Interior
> US Fish & Wildlife Service
> California, USA
> http://www.fws.gov/redbluff/rbdd_jsmp.aspx
>
>
> *From:* Manuel Spínola
> *To:* Felipe Carrillo
> *Sent:* Sat, April 2, 2011 11:22:24 PM
> *Subject:* Re: another question on shapefiles and geom_point in ggplot2
>
> No problem, thank you very much Felipe.
>
> Best,
>
> Manuel
>
> On 03/04/2011 12:19 a.m., Felipe Carrillo wrote:
>> I meant to send you this one. Let me clean up the code a little bit
>> and I will send it to you. Do you mind if I send it to you in the
>> morning?
>>
>> Felipe D. Carrillo
>> Supervisory Fishery Biologist
>> Department of the Interior
>> US Fish & Wildlife Service
>> California, USA
>> http://www.fws.gov/redbluff/rbdd_jsmp.aspx
>>
>> *From:* Manuel Spínola
>> *To:* Felipe Carrillo
>> *Sent:* Sat, April 2, 2011 11:15:28 PM
>> *Subject:* Re: another question on shapefiles and geom_point in ggplot2
>>
>> Yes Felipe. That is the graph I was looking for.
>>
>> I got something closer but not like yours. How did you do it?
>>
>> Manuel
>>
>> On 03/04/2011 12:10 a.m., Felipe Carrillo wrote:
>>> I was able to open them. I am attaching a picture of the graph I
>>> created. Is that what you had in mind?
>>>
>>> *From:* Manuel Spínola
>>> *To:* Felipe Carrillo
>>> *Sent:* Sat, April 2, 2011 10:35:51 PM
>>> *Subject:* Re: another question on shapefiles and geom_point in ggplot2
>>>
>>> It should be. I am sending them again.
>>>
>>> Manuel
>>>
>>> On 02/04/2011 10:23 p.m., Felipe Carrillo wrote:
>>>> Manuel:
>>>> I can't open the shapefile, is this the original one?
>>>> Is the csv file the one that you are trying to overlay
>>>> on top of the shapefile?
>>>>
>>>> *From:* Manuel Spínola
>>>> *To:* Felipe Carrillo
>>>> *Sent:* Sat, April 2, 2011 6:14:09 PM
>>>> *Subject:* Re: another question on shapefiles and geom_point in ggplot2
>>>>
>>>> Files attached.
>>>>
>>>> On 02/04/2011 07:04 p.m., Felipe Carrillo wrote:
>>>>> If you want individual points overlaid on the shapefile, you need
>>>>> to add another variable to it before you fortify it.
>>>>> After you fortify, merge both the fortified dataset and the
>>>>> original shapefile. Go ahead and post your shapefile to see if
>>>>> I can figure it out. Do you just want the points or want text also?
>>>>>
>>>>> *From:* Manuel Spínola
>>>>> *To:* Felipe Carrillo
>>>>> *Sent:* Sat, April 2, 2011 5:24:02 PM
>>>>> *Subject:* Re: another question on shapefiles and geom_point in ggplot2
>>>>>
>>>>> Hi Felipe,
>>>>>
>>>>> I did the same thing that I am trying now, attached is how it looks.
>>>>>
>>>>> Best,
>>>>>
>>>>> Manuel
>>>>>
>>>>> On 02/04/2011 06:09 p.m., Felipe Carrillo wrote:
>>>>>> Manuel:
>>>>>> I did something similar a few weeks ago. If you post your
>>>>>> shapefile and describe what you are expecting I might be able
>>>>>> to help.
>>>>>>
>>>>>> *From:* Manuel Spínola
>>>>>> *To:* "ggplot2 at googlegroups.com"
>>>>>> *Sent:* Sat, April 2, 2011 12:01:56 PM
>>>>>> *Subject:* another question on shapefiles and geom_point in ggplot2
>>>>>>
>>>>>> Dear list members,
>>>>>>
>>>>>> This is a different question from my previous post.
>>>>>> I managed to read my shapefile with ggplot2 following
>>>>>> https://github.com/hadley/ggplot2/wiki/plotting-polygon-shapefiles
>>>>>>
>>>>>> p = ggplot(ai_biotica.df, aes(long,lat,group=group,fill=Area_Influ)) +
>>>>>>   geom_polygon() +
>>>>>>   geom_path(color="white") +
>>>>>>   coord_equal()
>>>>>>
>>>>>> I got a nice map.
>>>>>>
>>>>>> Now I want to plot some points with geom_point but I got an error.
>>>>>>
>>>>>> > p + geom_point(geo, aes(size = ACE, colour = ACE)) +
>>>>>>     scale_size(name = "Número de especies",
>>>>>>       breaks = c(2, 4, 6, 8, 10, 12, 14, 16, 18, 20)) +
>>>>>>     scale_colour_gradientn(name = 'Número de especies',
>>>>>>       colours = rainbow(6),
>>>>>>       breaks = c(2, 4, 6, 8, 10, 12, 14, 16, 18, 20)) +
>>>>>>     xlab("Longitud") + ylab("Latitud") +
>>>>>>     opts(axis.text.x = theme_text(size = 8, vjust = 1)) +
>>>>>>     opts(axis.text.y = theme_text(size = 8, hjust = 1))
>>>>>> Error: ggplot2 doesn't know how to deal with data of class uneval
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Manuel
>>>>>>
>>>>>> --
>>>>>> *Manuel Spínola, Ph.D.*
>>>>>> Instituto Internacional en Conservación y Manejo de Vida Silvestre
>>>>>> Universidad Nacional
>>>>>> Apartado 1350-3000
>>>>>> Heredia
>>>>>> COSTA RICA
>>>>>> mspinola at una.ac.cr
>>>>>> mspinola10 at gmail.com
>>>>>> Teléfono: (506) 2277-3598
>>>>>> Fax: (506) 2237-7036
>>>>>> Personal website: Lobito de río
>>>>>> Institutional website: ICOMVIS
>>>>>>
>>>>>> --
>>>>>> You received this message because you are subscribed to the
>>>>>> ggplot2 mailing list.
>>>>>> Please provide a reproducible example:
>>>>>> http://gist.github.com/270442
>>>>>>
>>>>>> To post: email ggplot2 at googlegroups.com
>>>>>> To unsubscribe: email ggplot2+unsubscribe at googlegroups.com
>>>>>> More options: http://groups.google.com/group/ggplot2
>

-- 
*Manuel Spínola, Ph.D.*
Instituto Internacional en Conservación y Manejo de Vida Silvestre
Universidad Nacional
Apartado 1350-3000
Heredia
COSTA RICA
mspinola at una.ac.cr
mspinola10 at gmail.com
Teléfono: (506) 2277-3598
Fax: (506) 2237-7036
Personal website: Lobito de río
Institutional website: ICOMVIS
-------------- next part --------------
A non-text attachment was scrubbed...
Name: riqueza.png
Type: image/png
Size: 14825 bytes
Desc: riqueza.png
URL: 

From rhelpme at hotmail.com  Mon Apr  4 00:04:39 2011
From: rhelpme at hotmail.com (Xiaoxi Gao)
Date: Sun, 3 Apr 2011 18:04:39 -0400
Subject: [R] Quick question about duplicating vectors
Message-ID: 

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: 

From daniel at umd.edu  Mon Apr  4 00:16:04 2011
From: daniel at umd.edu (Daniel Malter)
Date: Sun, 3 Apr 2011 17:16:04 -0500 (CDT)
Subject: [R] Avoiding loops in creating a coinvestment matrix
Message-ID: <1301868964137-3424298.post@n4.nabble.com>

Hi,

I am working on a dataset in which a number of venture capitalists invest in
a number of firms. What I am creating is an asymmetric matrix M in which
m(ij) is the volume (sum) of coinvestments of VC i with VC j (i.e., how much
has VC i invested in companies that VC j also has investments in). The
output should look like the "coinvestments" matrix produced with the code
below. If possible I would like to avoid loops and optimize the code for
speed because the real data is huge. If anybody has suggestions, I would be
grateful.

invest=c(20,50,40,30,10,20,20,30,40)
vc=rep(c('A','B','C'),each=3)
company=c('E','F','G','F','G','H','G','H','I')
data=data.frame(vc,company,invest)
data #data

inv.mat=tapply(invest,list(vc,company),sum)
inv.mat=replace(inv.mat,which(is.na(inv.mat)==T),0)
inv.mat #investment matrix

exist.mat=inv.mat>0

coinvestments<-matrix(0,nrow=length(unique(vc)),ncol=length(unique(vc)))

for(i in unique(vc)){
  for(j in unique(vc)){
    i.is=which(unique(vc)==i)
    j.is=which(unique(vc)==j)
    i.invests=exist.mat[i,]
    j.invests=exist.mat[j,]
    which.i=which(i.invests==T)
    which.j=which(j.invests==T)
    i.invests.with.j=which.i[which.i%in%which.j]
    coinvestments[i.is,j.is]=sum(inv.mat[i.is,i.invests.with.j])
  }
}

coinvestments

system.time(
for(i in unique(vc)){
  for(j in unique(vc)){
    i.is=which(unique(vc)==i)
    j.is=which(unique(vc)==j)
    i.invests=exist.mat[i,]
    j.invests=exist.mat[j,]
    which.i=which(i.invests==T)
    which.j=which(j.invests==T)
    i.invests.with.j=which.i[which.i%in%which.j]
    coinvestments[i.is,j.is]=sum(inv.mat[i.is,i.invests.with.j])
  }
}
)

Thanks much,
Daniel

--
View this message in context: http://r.789695.n4.nabble.com/Avoiding-loops-in-creating-a-coinvestment-matrix-tp3424298p3424298.html
Sent from the R help mailing list archive at Nabble.com.

From dwinsemius at comcast.net  Mon Apr  4 00:17:55 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Sun, 3 Apr 2011 18:17:55 -0400
Subject: [R] Quick question about duplicating vectors
In-Reply-To: 
References: 
Message-ID: 

On Apr 3, 2011, at 6:04 PM, Xiaoxi Gao wrote:

>
> Hello R users,
>
> I have a quick question, if I have a data set a, say
> a <- c(1,2,3,4,5)

You might as well learn to use more precise R terminology... that is a
`vector`.

> and I want to get a new data set b,
> b <- c(1,1,1,2,2,2,3,3,3,4,4,4,5,5,5)
> Could anyone tell me how I can obtain this?

?rep

-- 
David Winsemius, MD
West Hartford, CT

From ggrothendieck at gmail.com  Mon Apr  4 00:27:07 2011
From: ggrothendieck at gmail.com (Gabor Grothendieck)
Date: Sun, 3 Apr 2011 18:27:07 -0400
Subject: [R] zoo:rollapply by multiple grouping factors
In-Reply-To: <4D98991B.10909@ucsc.edu>
References: <4D98991B.10909@ucsc.edu>
Message-ID: 

On Sun, Apr 3, 2011 at 11:58 AM, Mark Novak wrote:
> # Hi there,
> # I am trying to apply a function over a moving-window for a large number
> # of multivariate time-series that are grouped in a nested set of factors.
> # I have spent a few days searching for solutions with no luck, so any
> # suggestions are much appreciated.
>
> # The data I have are for the abundance dynamics of multiple species
> # observed in multiple fixed plots at multiple sites. (In total I have 7
> # sites, ~3-5 plots/site, ~150 species/plot, for 60 time-steps each.) So
> # my data look something like this:
>
> dat<-data.frame(Site=rep(1), Plot=rep(c(rep(1,8),rep(2,8),rep(3,8)),1),
>                 Time=rep(c(1,1,2,2,3,3,4,4)), Sp=rep(1:2), Count=sample(24))
> dat
>
> # Let the function I want to apply over a right-aligned window of w=2
> # time steps be:
> cv<-function(x){sd(x)/mean(x)}
> w<-2
>
> # The final output I want would look something like this
> # (one NA,NA,<6 values> block per plot, so 24 rows in total):
> Out<-data.frame(dat,CV=round(c(NA,NA,runif(6,0,1),NA,NA,runif(6,0,1),
>                                NA,NA,runif(6,0,1)),2))
>
> # I could reshape and apply zoo:rollapply() to a given plot at a given
> # site, and reshape again as follows:
> library(zoo)
> a<-subset(dat,Site==1&Plot==1)
> b<-reshape(a[-c(1,2)],v.names='Count',idvar='Time',timevar='Sp',direction='wide')
> d<-zoo(b[,-1],b[,1])
> d
> out<-rollapply(d, w, cv, na.pad=T, align='right')
> out
>
> # I would thereby have to loop through all my sites and plots which,
> # although it deals with all species at once, still seems exceedingly
> # inefficient.
>
> # So the question is, how do I use something like aggregate.zoo or tapply
> # or even lapply to apply rollapply on each species' time series.
>
> # The closest I've come is the following two approaches:
>
> # First let:
> datx<-list(Site=dat$Site,Plot=dat$Plot,Sp=dat$Sp)
> daty<-dat$Count
>
> # Method 1.
> out1<-tapply(seq(along=daty),datx,function(i,x=daty){
>   rollapply(zoo(x[i]), w, cv, na.pad=T, align='right') })
> out1
> out1[,,1]
>
> # Which "works" in that it gives me the right answers, but in a format
> # from which I can't figure out how to get back into the format I want.
>
> # Method 2.
> fun<-function(x){y<-zoo(x);coredata(rollapply(y, w, cv,na.pad=T,align='right'))}
> out2<-aggregate(daty,by=datx,fun)
> out2
>
> # Which superficially "works" better, but again only in a format I can't
> # figure out how to use because the output seems to be a mix of
> # data.frame and lists.
> out2[1,4]
> out2[1,5]
> is.data.frame(out2)
> is.list(out2)
>
> # The situation is made more problematic by the fact that the time point
> # of first survey can differ between plots (e.g., site1-plot3 may only
> # start at time-point 3). As in...
> dat2<-dat
> dat2<-dat2[-which(dat2$Plot==3 & dat2$Time<3),]
> dat2
>
> # I must therefore ensure that I'm keeping track of the true time
> # associated with each value, not just the order of their occurrences.
> # This information is (seemingly) lost by both methods.
> datx<-list(Site=dat2$Site,Plot=dat2$Plot,Sp=dat2$Sp)
> daty<-dat2$Count
>
> # Method 1.
> out3<-tapply(seq(along=daty),datx,function(i,x=daty){
>   rollapply(zoo(x[i]), w, cv, na.pad=T, align='right') })
> out3
> out3[1,3,1]
> time(out3[1,3,1])
>
> # Method 2
> out4<-aggregate(daty,by=datx,fun)
> out4
> time(out4[3,4])
>
>
> # Am I going about this all wrong? Is there a different package to try?
> # Any thoughts and suggestions are much appreciated!
>
> # R 2.12.2 GUI 1.36 Leopard build 32-bit (5691); zoo 1.6-4
>
> # Thanks!
> # -mark
>

Try ave:

dat$cv <- ave(dat$Count, dat[c("Site", "Plot", "Sp")],
  FUN = function(x) rollapply(zoo(x), 2, cv, na.pad = TRUE, align = "right"))

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

From ggrothendieck at gmail.com  Mon Apr  4 00:34:12 2011
From: ggrothendieck at gmail.com (Gabor Grothendieck)
Date: Sun, 3 Apr 2011 18:34:12 -0400
Subject: [R] R-project: plot 2 zoo objects (price series) that have some
	date mis-matches
In-Reply-To: <1301853469771-3423899.post@n4.nabble.com>
References: <1301853469771-3423899.post@n4.nabble.com>
Message-ID: 

On Sun, Apr 3, 2011 at 1:57 PM, algotr8der wrote:
> I have 2 zoo objects -
>
> 1) Interest rate spread between 10-YR-US-Treasury and 2-YR-US-Treasury
> (object name = sprd)
>
> 2) S&P 500 index (object name = spy)
>
>
>> str(spy)
> ‘zoo’ series from 1976-06-01 to 2011-03-31
>   Data: num [1:8791] 99.8 100.2 100.1 99.2 98.6 ...
>   Index: Class 'Date'  num [1:8791] 2343 2344 2345 2346 2349 ...
>
>> str(sprd)
> ‘zoo’ series from 1976-06-01 to 2011-03-31
>   Data: num [1:9088] 0.68 0.71 0.7 0.77 0.79 0.79 0.82 0.86 0.83 0.83 ...
>   Index: Class 'Date'  num [1:9088] 2343 2344 2345 2346 2349 ...
>
>
> Since there are NA data points in object 'sprd' I created another object
> that omits "NA". The name of that object is 'sprdtmp'.
>
>
>> str(sprdtmp)
> ‘zoo’ series from 1976-06-01 to 2011-03-31
>   Data: atomic [1:8704] 0.68 0.71 0.7 0.77 0.79 0.79 0.82 0.86 0.83 0.83 ...
>   - attr(*, "na.action")=Class 'omit'  int [1:384] 25 70 95 111 118 128 149
> 190 224 260 ...
>   Index: Class 'Date'  num [1:8704] 2343 2344 2345 2346 2349 ...
>
> I want to plot both time series on the same plot with time/date on the
> x-axis and the axis labelled quarterly (or monthly). One problem is that
> the sprdtmp and spy objects do not have the same number of data points as
> there are times when the equity markets are closed while the interest rate
> markets are open. For the most part the dates overlap. Would this matter
> if I try to plot both on the same plot? And how would I go about plotting
> these objects in one plot?
>
> The second part of the plot requires that both are on different scales. I
> guess I could take a log of the S&P and plot that along with the rate
> spread. But it would be nice to know, for future reference, how I could
> plot 2 series in one plot with 2 different scales.
>
> I spent all night yesterday and this morning trying various options but I
> can't seem to get this to work. I have read through the ?plot, ?plot.zoo
> and ?axis documentation without any success. I would greatly appreciate it
> if anyone can point me in the right direction. Thank you kindly.
>

Try this:

  plot(na.approx(cbind(z1, z2)), screen = 1)

where z1 and z2 are two zoo series, and see the multivariate plotting
example in ?plot.zoo. Omit screen = 1 if you want them in different panels.

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

From mazatlanmexico at yahoo.com  Mon Apr  4 00:37:22 2011
From: mazatlanmexico at yahoo.com (Felipe Carrillo)
Date: Sun, 3 Apr 2011 15:37:22 -0700
Subject: [R] another question on shapefiles and geom_point in ggplot2
In-Reply-To: <4D98DDB6.9030503@gmail.com>
References: <4D9772A4.40208@gmail.com>
	<276101.2012.qm@web56602.mail.re3.yahoo.com>
	<4D97BE22.4040002@gmail.com>
	<6775.63438.qm@web56603.mail.re3.yahoo.com>
	<4D97C9E1.7080207@gmail.com>
	<297554.87949.qm@web56603.mail.re3.yahoo.com>
	<4D980737.7030401@gmail.com>
	<472714.79454.qm@web56606.mail.re3.yahoo.com>
	<4D981080.9040701@gmail.com>
	<173268.37139.qm@web56608.mail.re3.yahoo.com>
	<4D981220.7060404@gmail.com>
	<969108.33353.qm@web56603.mail.re3.yahoo.com>
	<4D98DDB6.9030503@gmail.com>
Message-ID: <874959.51997.qm@web56604.mail.re3.yahoo.com>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: 

From rhelpme at hotmail.com  Mon Apr  4 00:23:53 2011
From: rhelpme at hotmail.com (Xiaoxi Gao)
Date: Sun, 3 Apr 2011 18:23:53 -0400
Subject: [R] Quick question about duplicating vectors
In-Reply-To: 
References: ,
	
Message-ID: 

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: 

From timspier at hotmail.com  Mon Apr  4 01:10:49 2011
From: timspier at hotmail.com (Timothy Spier)
Date: Sun, 3 Apr 2011 23:10:49 +0000
Subject: [R] power of 2 way ANOVA with interaction
Message-ID: 

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: 

From bert.jacobs at figurestofacts.be  Mon Apr  4 01:39:34 2011
From: bert.jacobs at figurestofacts.be (Bert Jacobs)
Date: Mon, 4 Apr 2011 01:39:34 +0200
Subject: [R] replace last 3 characters of string
Message-ID: <000001cbf258$637e5a60$2a7b0f20$@jacobs@figurestofacts.be>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: 

From edd at debian.org  Mon Apr  4 01:39:14 2011
From: edd at debian.org (Dirk Eddelbuettel)
Date: Sun, 3 Apr 2011 23:39:14 +0000
Subject: [R] replace last 3 characters of string
In-Reply-To: <000001cbf258$637e5a60$2a7b0f20$@jacobs@figurestofacts.be>
References: <000001cbf258$637e5a60$2a7b0f20$@jacobs@figurestofacts.be>
Message-ID: <20110403233914.GA8957@master.debian.org>

On Mon, Apr 04, 2011 at 01:39:34AM +0200, Bert Jacobs wrote:
> I would like to replace the last three characters of the values of a certain
> column in a dataframe.
>
> This replacement should only take place if the last three characters
> correspond to the value "/:/" and they should be replaced with ""(blank)
>
> I cannot perform a simple gsub because the characters /:/ might also be
> present somewhere else in the string values and then they should not be
> replaced.

Keep reading up on regular expressions, this tends to pay off.

Here we use the fact that you can anchor a regexp to the end of a string:

R> aString <- "abc:def:::"
R> gsub(":::$", "", aString)
[1] "abc:def"
R>

Hth, Dirk

-- 
Three out of two people have difficulties with fractions.

From bert.jacobs at figurestofacts.be  Mon Apr  4 01:48:00 2011
From: bert.jacobs at figurestofacts.be (Bert Jacobs)
Date: Mon, 4 Apr 2011 01:48:00 +0200
Subject: [R] replace last 3 characters of string
In-Reply-To: <20110403233914.GA8957@master.debian.org>
References: <000001cbf258$637e5a60$2a7b0f20$@jacobs@figurestofacts.be>
	<20110403233914.GA8957@master.debian.org>
Message-ID: <000501cbf259$90e2e0b0$b2a8a210$@jacobs@figurestofacts.be>

Thx I could imagine it was so simple.:)

Bert

-----Original Message-----
From: Dirk Eddelbuettel [mailto:edd at master.debian.org] On Behalf Of Dirk
Eddelbuettel
Sent: 04 April 2011 01:39
To: Bert Jacobs
Cc: r-help at r-project.org
Subject: Re: [R] replace last 3 characters of string

On Mon, Apr 04, 2011 at 01:39:34AM +0200, Bert Jacobs wrote:
> I would like to replace the last three characters of the values of a certain
> column in a dataframe.
>
> This replacement should only take place if the last three characters
> correspond to the value "/:/" and they should be replaced with ""(blank)
>
> I cannot perform a simple gsub because the characters /:/ might also be
> present somewhere else in the string values and then they should not be
> replaced.

Keep reading up on regular expressions, this tends to pay off.

Here we use the fact that you can anchor a regexp to the end of a string:

R> aString <- "abc:def:::"
R> gsub(":::$", "", aString)
[1] "abc:def"
R>

Hth, Dirk

-- 
Three out of two people have difficulties with fractions.

From ssefick at gmail.com  Mon Apr  4 02:34:42 2011
From: ssefick at gmail.com (stephen sefick)
Date: Sun, 3 Apr 2011 19:34:42 -0500
Subject: [R] Linear Model with curve fitting parameter?
In-Reply-To: 
References: 
	<8512_1301611104_1301611104_AANLkTikOxmcE=oMvHBuB8x61fxXJwnexXJkr+Qp3Tawp@mail.gmail.com>
Message-ID: 

Steven:

You are exactly right, sorry, I was confused.

#######################################################
so log(y-intercept)+log(K) is a constant called b0 (is this right?)

lm(log(Q)~log(A)+log(R)+log(S)-1)
is fitting the model
log(Q)=a*log(A)+r*log(R)+s*log(S) (no beta 0)

and

lm(log(Q)~log(A)+log(R)+log(S))
is fitting the model
log(Q)=b0+a*log(A)+r*log(R)+s*log(S)
######################################################

These are the models I am trying to fit and if I have reasoned
correctly above then I should be able to fit the below models
similarly.

manning
log(Q)=log(b0)+log(K)+log(A)+r*log(R)+s*log(S)

dingman
log(Q)=log(b0)+log(K)+a*log(A)+r*log(R)+s*(log(S))^2

bjerklie
log(Q)=log(b0)+log(K)+a*log(A)+r*log(R)+s*log(S)
#######################################################

Thank you for all of your help!

Stephen

On Fri, Apr 1, 2011 at 2:44 PM, Steven McKinney wrote:
>
>> -----Original Message-----
>> From: stephen sefick [mailto:ssefick at gmail.com]
>> Sent: April-01-11 5:44 AM
>> To: Steven McKinney
>> Cc: R help
>> Subject: Re: [R] Linear Model with curve fitting parameter?
>>
>> Setting Z=Q-A would be the incorrect dimensions.  I could Z=Q/A.
>
> I suspect this is confusion about what Q is.  I was presuming that
> the Q in this following formula was log(Q) with Q from the original data.
>
>> >> I have taken the log of the data that I have and this is the model
>> >> formula without the K part
>> >>
>> >> lm(Q~offset(A)+R+S, data=x)
>
> If the model is
>
>   Q=K*A*(R^r)*(S^s)
>
> then
>
>   log(Q) = log(K) + log(A) + r*log(R) + s*log(S)
>
> Rearranging yields
>
>   log(Q) - log(A) = log(K) + r*log(R) + s*log(S)
>
> so what I labeled 'Z' below is
>
>   Z = log(Q) - log(A) = log(Q/A)
>
> so
>
>   Z = log(K) + r*log(R) + s*log(S)
>
> and a linear model fit of
>
>   Z ~ log(R) + log(S)
>
> will yield parameter estimates for the linear equation
>
>   E(Z) = B0 + B1*log(R) + B2*log(S)
>
> (E(Z) = expected value of Z)
>
> so B0 estimate is an estimate of log(K)
>    B1 estimate is an estimate of r
>    B2 estimate is an estimate of s
>
> More details and careful notation will eventually lead
> to a reasonable description and analysis strategy.
>
>
> Best
>
> Steve McKinney
>
>
>
>> Is fitting a nls model the same as fitting an ols?  These data are
>> hydraulic data from ~47 sites.  To assess predictive ability I am
>> removing one site, fitting a new model and then assessing the fit with
>> a myriad of model assessment criteria.  I should get the same answer
>> with ols vs nls?  Thank you for all of your help.
>>
>> Stephen
>>
>> On Thu, Mar 31, 2011 at 8:34 PM, Steven McKinney wrote:
>> >
>> >> -----Original Message-----
>> >> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of stephen sefick
>> >> Sent: March-31-11 3:38 PM
>> >> To: R help
>> >> Subject: [R] Linear Model with curve fitting parameter?
>> >>
>> >> I have a model Q=K*A*(R^r)*(S^s)
>> >>
>> >> A, R, and S are data I have and K is a curve fitting parameter.  I
>> >> have linearized as
>> >>
>> >> log(Q)=log(K)+log(A)+r*log(R)+s*log(S)
>> >>
>> >> I have taken the log of the data that I have and this is the model
>> >> formula without the K part
>> >>
>> >> lm(Q~offset(A)+R+S, data=x)
>> >>
>> >> What is the formula that I should use?
>> >
>> > Let Z = Q - A for your logged data.
>> >
>> > Fitting lm(Z ~ R + S, data = x) should yield
>> > intercept parameter estimate = estimate for log(K)
>> > R coefficient parameter estimate = estimate for r
>> > S coefficient parameter estimate = estimate for s
>> >
>> >
>> >
>> > Steven McKinney
>> >
>> > Statistician
>> > Molecular Oncology and Breast Cancer Program
>> > British Columbia Cancer Research Centre
>> >
>> >
>> >>
>> >> Thanks for all of your help.  I can provide a subset of data if necessary.
>> >
>>
>

-- 
Stephen Sefick
_____________________________________
| Auburn University                 |
| Biological Sciences               |
| 331 Funchess Hall                 |
| Auburn, Alabama                   |
| 36849                             |
|___________________________________|
| sas0025 at auburn.edu             |
| http://www.auburn.edu/~sas0025    |
|___________________________________|

Let's not spend our time and resources thinking about things that are
so little or so large that all they really do for us is puff us up and
make us feel like gods.  We are mammals, and have not exhausted the
annoying little problems of being mammals.

                                -K. Mullis

"A big computer, a complex algorithm and a long time does not equal science."

                              -Robert Gentleman

From ghubona at gmail.com  Mon Apr  4 02:56:40 2011
From: ghubona at gmail.com (Geoffrey Hubona)
Date: Sun, 3 Apr 2011 20:56:40 -0400
Subject: [R] LIVE,
	ONLINE COURSE: Using R Software for Academic Research Analyses
Message-ID: 

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: 

From sjdennis3 at gmail.com  Mon Apr  4 04:55:33 2011
From: sjdennis3 at gmail.com (Samuel Dennis)
Date: Mon, 4 Apr 2011 14:55:33 +1200
Subject: [R] Graph many points without hiding some
In-Reply-To: 
References: 
Message-ID: 

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: 

From hwborchers at googlemail.com  Mon Apr  4 05:25:10 2011
From: hwborchers at googlemail.com (Hans W Borchers)
Date: Mon, 4 Apr 2011 03:25:10 +0000
Subject: [R] How do I modify uniroot function to return .0001 if error ?
References: <1301861551841-3424092.post@n4.nabble.com>
Message-ID: 

eric aol.com> writes:

>
> I am calling the uniroot function from inside another function using these
> lines (last two lines of the function) :
>
> d <- uniroot(k, c(.001, 250), tol=.05)
> return(d$root)
>
> The problem is that on occasion there's a problem with the values I'm
> passing to uniroot. In those instances uniroot stops and sends a message
> that it can't calculate the root because f.upper * f.lower is greater than
> zero. All I'd like to do in those cases is be able to set the return value
> of my calling function "return(d$root)" to .0001. But I'm not sure how to
> pull that off. I tried a few modifications to uniroot but so far no luck.
>

Do not modify uniroot(). Use 'try' or 'tryCatch', for example

    e <- try( d <- uniroot(k, c(.001, 250), tol=.05), silent = TRUE )
    if (class(e) == "try-error") {
        return(0.0001)
    } else {
        return(d$root)
    }

--Hans Werner

From klebyn at yahoo.com.br  Mon Apr  4 05:29:34 2011
From: klebyn at yahoo.com.br (Cleber N. Borges)
Date: Mon, 04 Apr 2011 00:29:34 -0300
Subject: [R] RGtk2: How to populate an GtkListStore data model?
Message-ID: <4D993B1E.907@yahoo.com.br>

hello all

I am trying to learn how to use the RGtk2 package...
so, my first problem is: I can't find the right way to populate my
gtkListStore object!
any help is welcome... because I have been trying for several days to put
the code together...

Thanks in advance
Cleber N. Borges

---------------------------

# my testing code

library(RGtk2)
win <- gtkWindowNew()
datamodel <- gtkListStoreNew('gchararray')
treeview <- gtkTreeViewNew()
renderer <- gtkCellRendererText()
col_0 <- gtkTreeViewColumnNewWithAttributes(title="TitleXXX",
    cell=renderer, "text"="Bar")
nc_next <- gtkTreeViewInsertColumn(object=treeview, column=col_0, position=0)
gtkTreeViewSetModel( treeview, datamodel )
win$add( treeview )

# is there an alternative function for this?
# iter <- gtkTreeModelGetIterFirst( datamodel )[[2]]
# this function doesn't give a VALID iter
# gtkListStoreIterIsValid( datamodel, iter ) results in FALSE

iter <- gtkListStoreInsert( datamodel, position=0 )[[2]]
gtkListStoreIterIsValid( datamodel, iter )

# the help of this function says to terminate with a -1 value
# but -1 crashes the R package (or the gtk)...
gtkListStoreSet(object=datamodel, iter=iter, 0, "textFoo")

# doesn't make any difference in the window... :-(

----

R version 2.13.0 alpha (2011-03-27 r55091)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=Portuguese_Brazil.1252
[2] LC_CTYPE=Portuguese_Brazil.1252
[3] LC_MONETARY=Portuguese_Brazil.1252
[4] LC_NUMERIC=C
[5] LC_TIME=Portuguese_Brazil.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods
[7] base

other attached packages:
[1] RGtk2_2.20.8

loaded via a namespace (and not attached):
[1] tools_2.13.0
>

my gtk version == 2.16.2

From jlaurentum at yahoo.com  Mon Apr  4 05:55:01 2011
From: jlaurentum at yahoo.com (jose romero)
Date: Sun, 3 Apr 2011 20:55:01 -0700 (PDT)
Subject: [R] loading R object files on an RWeb server
Message-ID: <338068.95921.qm@web30605.mail.mud.yahoo.com>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: 

From jeroenooms at gmail.com  Mon Apr  4 05:02:19 2011
From: jeroenooms at gmail.com (Jeroen Ooms)
Date: Sun, 3 Apr 2011 22:02:19 -0500 (CDT)
Subject: [R] detect filetype (as in unix 'file')
Message-ID: <1301886139248-3424562.post@n4.nabble.com>

Is there a way in R (in Linux) to detect the type of a file without invoking
a shell? E.g. to do this:

> system("file density.plot")
density.plot: PDF document, version 1.4

but without using system()? I tried file() and file.info(), but neither
displays the information I am looking for.

--
View this message in context: http://r.789695.n4.nabble.com/detect-filetype-as-in-unix-file-tp3424562p3424562.html
Sent from the R help mailing list archive at Nabble.com.

From vivs at ucla.edu  Mon Apr  4 05:51:33 2011
From: vivs at ucla.edu (Vivian Shih)
Date: Sun, 03 Apr 2011 20:51:33 -0700
Subject: [R] Ordering every row of a matrix while ignoring off diagonal
	elements
Message-ID: <20110403205133.16345p77nnjj8scg@mail.ucla.edu>

Sorry if this is a stupid question but I've been stuck on how to code
so that I can order rows of a matrix without paying attention to the
diagonal elements.

Say, for example, I have a matrix d:
> d
         [,1]     [,2]      [,3]      [,4]
[1,] 0.000000 2.384158 2.0065682 2.2998856
[2,] 2.384158 0.000000 1.4599928 2.4333213
[3,] 2.006568 1.459993 0.0000000 0.9733285
[4,] 2.299886 0.000000 0.9733285 0.0000000

Then I'd like ordered d to be like:
     [,1] [,2] [,3]
[1,]    3    4    2
[2,]    3    1    4
[3,]    4    2    1
[4,]    2    3    1

So subject 1's smallest value is in column 3. Subject 2's second
smallest value would be 3, etc. Note that subject 4 has two zeros (a
"tie") but if the diagonals are not in the equation, then the minimum
value for this subject is from column 2.

Right now I coded off diagonals as missing and then order it that way
but I feel like it's cheating. Suggestions??
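
A minimal sketch of one possible approach to the question above, assuming d is a square numeric matrix: drop the diagonal entry of each row before calling order(), then map the resulting positions back to the original column numbers. The helper name order_offdiag is made up for illustration and is not from the thread or from any package. No recoding of the diagonal as missing is needed.

```r
# The example matrix from the message above.
d <- matrix(c(0.000000, 2.384158, 2.0065682, 2.2998856,
              2.384158, 0.000000, 1.4599928, 2.4333213,
              2.006568, 1.459993, 0.0000000, 0.9733285,
              2.299886, 0.000000, 0.9733285, 0.0000000),
            nrow = 4, byrow = TRUE)

# Hypothetical helper: for every row i, rank the columns other than i
# by increasing value and return their original column indices.
order_offdiag <- function(m) {
  n <- nrow(m)
  stopifnot(n == ncol(m))
  do.call(rbind, lapply(seq_len(n), function(i) {
    cols <- seq_len(n)[-i]      # all columns except the diagonal one
    cols[order(m[i, cols])]     # reorder them by the row's values
  }))
}

ord <- order_offdiag(d)
ord
```

Because the diagonal cell is removed rather than compared, the zero in row 4, column 2 is treated as that row's minimum, which is the behaviour asked for. Ties among off-diagonal values fall back to column order, since order() keeps tied elements in their original sequence.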
From pierre.roudier at gmail.com Mon Apr 4 06:23:46 2011
From: pierre.roudier at gmail.com (Pierre Roudier)
Date: Mon, 4 Apr 2011 16:23:46 +1200
Subject: [R] another question on shapefiles and geom_point in ggplot2
In-Reply-To: <874959.51997.qm@web56604.mail.re3.yahoo.com>
References: <4D9772A4.40208@gmail.com>
 <276101.2012.qm@web56602.mail.re3.yahoo.com>
 <4D97BE22.4040002@gmail.com>
 <6775.63438.qm@web56603.mail.re3.yahoo.com>
 <4D97C9E1.7080207@gmail.com>
 <297554.87949.qm@web56603.mail.re3.yahoo.com>
 <4D980737.7030401@gmail.com>
 <472714.79454.qm@web56606.mail.re3.yahoo.com>
 <4D981080.9040701@gmail.com>
 <173268.37139.qm@web56608.mail.re3.yahoo.com>
 <4D981220.7060404@gmail.com>
 <969108.33353.qm@web56603.mail.re3.yahoo.com>
 <4D98DDB6.9030503@gmail.com>
 <874959.51997.qm@web56604.mail.re3.yahoo.com>
Message-ID:

Hi all,

2011/4/4 Felipe Carrillo :
> Manuel:
> As far as I know one needs gpclibPermit() in order to fortify
> see this:
> Note: polygon geometry computations in maptools
>         depend on the package gpclib, which has a
>         restricted licence. It is disabled by default;
>         to enable gpclib, type gpclibPermit()
> I am going to guess that ahmadou dicko doesn't show gpclibPermit() on his
> code because he loaded it with Rprofile or some other way. I tried to run his
> code without gpclibPermit() and it wouldn't let me fortify, so not sure how
> he did it.

On that specific point, Colin Rundel and Roger Bivand released the rgeos package on CRAN a few days ago [1]. This is a great achievement as it brings bindings to the GEOS C++ lib [2] - long story short, it does the job the non-free [3] gpclib used to do.

In its latest release, maptools has an option to check whether rgeos is present - if that is the case, it is used instead of gpclib:

> library(maptools)
Loading required package: foreign
Loading required package: sp
Loading required package: lattice
Note: polygon geometry computations in maptools depend on the package gpclib, which has a restricted licence.
It is disabled by default; to enable gpclib, type gpclibPermit()

Checking rgeos availability as gpclib substitute:
TRUE
> ?gpclibPermit

Pierre

[1] http://cran.r-project.org/web/packages/rgeos/
[2] http://trac.osgeo.org/geos/
[3] https://stat.ethz.ch/pipermail/r-sig-geo/2010-January/007400.html

--
Scientist
Landcare Research, New Zealand

From jholtman at gmail.com Mon Apr 4 06:41:58 2011
From: jholtman at gmail.com (jim holtman)
Date: Mon, 4 Apr 2011 00:41:58 -0400
Subject: [R] Ordering every row of a matrix while ignoring off diagonal elements
In-Reply-To: <20110403205133.16345p77nnjj8scg@mail.ucla.edu>
References: <20110403205133.16345p77nnjj8scg@mail.ucla.edu>
Message-ID:

I assume that this is what you did, and I would not call that cheating; it is just a reasonable way to solve the problem:

> x <- as.matrix(read.table(textConnection(" 0.000000 2.384158 2.0065682 2.2998856
+ 2.384158 0.000000 1.4599928 2.4333213
+ 2.006568 1.459993 0.0000000 0.9733285
+ 2.299886 0.000000 0.9733285 0.0000000")))
> closeAllConnections()
> # put NAs in diagonals
> diag(x) <- NA
> # get the order
> x.ord <- t(apply(x, 1, order))
> # remove last column since this is where NAs sort
> x.ord[, -ncol(x.ord)]
     [,1] [,2] [,3]
[1,]    3    4    2
[2,]    3    1    4
[3,]    4    2    1
[4,]    2    3    1
>

On Sun, Apr 3, 2011 at 11:51 PM, Vivian Shih wrote:
> Sorry if this is a stupid question but I've been stuck on how to code so
> that I can order rows of a matrix without paying attention to the diagonal
> elements.
>
> Say, for example, I have a matrix d:
>>
>> d
>
>          [,1]     [,2]      [,3]      [,4]
> [1,] 0.000000 2.384158 2.0065682 2.2998856
> [2,] 2.384158 0.000000 1.4599928 2.4333213
> [3,] 2.006568 1.459993 0.0000000 0.9733285
> [4,] 2.299886 0.000000 0.9733285 0.0000000
>
> Then I'd like ordered d to be like:
>      [,1] [,2] [,3]
> [1,]    3    4    2
> [2,]    3    1    4
> [3,]    4    2    1
> [4,]    2    3    1
>
> So subject 1's smallest value is in column 3.
> Subject 2's second smallest
> value would be 3, etc. Note that subject 4 has two zeros (a "tie") but if
> the diagonals are not in the equation, then the minimum value for this
> subject is from column 2.
>
> Right now I code the diagonals as missing and then order it that way, but I
> feel like it's cheating. Suggestions??
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?

From jholtman at gmail.com Mon Apr 4 06:46:36 2011
From: jholtman at gmail.com (jim holtman)
Date: Mon, 4 Apr 2011 00:46:36 -0400
Subject: [R] replace last 3 characters of string
In-Reply-To: <-8099779560792882433@unknownmsgid>
References: <-8099779560792882433@unknownmsgid>
Message-ID:

Will this do it for you:

> x <- c('asdfasdf', 'sadfasdf/:/', 'sadf', 'asdf/:/')
> sub("/:/$", '', x)
[1] "asdfasdf" "sadfasdf" "sadf"     "asdf"

On Sun, Apr 3, 2011 at 7:39 PM, Bert Jacobs wrote:
> Hi,
>
> I would like to replace the last three characters of the values of a certain
> column in a dataframe.
>
> This replacement should only take place if the last three characters
> correspond to the value "/:/" and they should be replaced with ""(blank)
>
> I cannot perform a simple gsub because the characters /:/ might also be
> present somewhere else in the string values and then they should not be
> replaced.
>
> Could someone help me out on this one?
>
> Thx
>
> Bert
>
>        [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?

From jeremy.miles at gmail.com Mon Apr 4 07:12:45 2011
From: jeremy.miles at gmail.com (Jeremy Miles)
Date: Sun, 3 Apr 2011 22:12:45 -0700
Subject: [R] Structural equation modeling in R(lavaan,sem)
In-Reply-To:
References: <1301253139729-3409642.post@n4.nabble.com>
 <4D9073F7.2040309@gmail.com>
 <1301426701835-3415954.post@n4.nabble.com>
Message-ID:

On 3 April 2011 12:38, jouba wrote:
>
> Dear all,
> I have a question concerning longitudinal data:
> When we have a longitudinal data and we have to do sem analysis there is in the package lavaan some functions,options in this package that help to do this or we can treat these data like non longitudinal data
>

No, and (qualified) no.

1. There are (AFAIK) no options or functions that are specific to longitudinal data.

2. You don't treat these data as non-longitudinal data; you add the parameters that are appropriate, though. For example, look at the model shown on http://lavaan.ugent.be. dem60 and dem65 are two measures of the same construct at different timepoints, so there are correlations over time for each pair of measured variables that are measures of that construct - i.e. y1 ~~ y5

3. You would get much better answers on the SEM mailing list - semnet. You can join it here: http://www2.gsu.edu/~mkteer/semnet.html#Joining.
Jeremy

--
Jeremy Miles
Psychology Research Methods Wiki: www.researchmethodsinpsychology.com

From ripley at stats.ox.ac.uk Mon Apr 4 08:06:16 2011
From: ripley at stats.ox.ac.uk (Prof Brian Ripley)
Date: Mon, 4 Apr 2011 07:06:16 +0100 (BST)
Subject: [R] detect filetype (as in unix 'file')
In-Reply-To: <1301886139248-3424562.post@n4.nabble.com>
References: <1301886139248-3424562.post@n4.nabble.com>
Message-ID:

On Sun, 3 Apr 2011, Jeroen Ooms wrote:

> Is there a way in R (in Linux) to detect the type of a file without invoking
> a shell? E.g. to do this:
>
>> system("file density.plot")
> density.plot: PDF document, version 1.4
>
> but without using system()? I tried file() and file.info(), but neither
> displays the information I am looking for.

No, but what is wrong with using system()?

'file' is large and complex because it tries to be comprehensive (but it still does not know about some common systems, e.g. 64-bit Windows binaries). There simply is no point in replicating that in R, which is why we chose rather to port 'file' to Windows and provide it in Rtools.

--
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

From felipe.parra at quantil.com.co Mon Apr 4 08:32:32 2011
From: felipe.parra at quantil.com.co (Luis Felipe Parra)
Date: Mon, 4 Apr 2011 14:32:32 +0800
Subject: [R] RmetricsTools.R
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: no disponible URL: From dieter.menne at menne-biomed.de Mon Apr 4 08:41:26 2011 From: dieter.menne at menne-biomed.de (Dieter Menne) Date: Mon, 4 Apr 2011 01:41:26 -0500 (CDT) Subject: [R] loading R object files on an RWeb server In-Reply-To: <338068.95921.qm@web30605.mail.mud.yahoo.com> References: <338068.95921.qm@web30605.mail.mud.yahoo.com> Message-ID: <1301899286025-3424761.post@n4.nabble.com> jose romero-3 wrote: > > For one thing, i am using google docs to host the R object file and google > docs has secure https URL's, which apparently cannot be handled by R's > url(). So my questions are these: > > Try ?getURL in the RCurl package Dieter -- View this message in context: http://r.789695.n4.nabble.com/loading-R-object-files-on-an-RWeb-server-tp3424613p3424761.html Sent from the R help mailing list archive at Nabble.com. From srinivas.eswar at gmail.com Mon Apr 4 08:11:37 2011 From: srinivas.eswar at gmail.com (psombe) Date: Mon, 4 Apr 2011 01:11:37 -0500 (CDT) Subject: [R] Support Counting Message-ID: <1301897497861-3424730.post@n4.nabble.com> Hi, I'm new to R and trying to some simple analysis. I have a data set with about 88000 transactions and i want to perform a simple support count analysis of an itemset which is say not a complete transaction but a subset of a transaction. say {A,B,D} is a transaction and i want to find support of {A,B} even though it never occurs as only A,B in the entire set To this i needed to create a new itemsets class and then use the support function but somehow the answers never seem to tally. Thanks in advance Srinivas -- View this message in context: http://r.789695.n4.nabble.com/Support-Counting-tp3424730p3424730.html Sent from the R help mailing list archive at Nabble.com. 
From yrosseel at gmail.com Mon Apr 4 09:13:58 2011 From: yrosseel at gmail.com (yrosseel) Date: Mon, 04 Apr 2011 09:13:58 +0200 Subject: [R] Structural equation modeling in R(lavaan,sem) In-Reply-To: References: <1301253139729-3409642.post@n4.nabble.com> <4D9073F7.2040309@gmail.com> <1301426701835-3415954.post@n4.nabble.com> Message-ID: <4D996FB6.2020903@gmail.com> On 04/03/2011 09:38 PM, jouba wrote: > > Daer all, I have a question concerning longitudinal data: When we > have a longitudinal data and we have to do sem analysis there is in > the package lavaan some functions,options in this package that help > to do this or we can treat these data like non longitudinal data The function 'growth' (in the lavaan package) can be used for (standard) growth modeling. Good material about growth modeling (using Mplus) can be found here: http://statistics.ats.ucla.edu/stat/mplus/seminars/gm/default.htm Next, you can read how to do growth modeling with lavaan by reading section 7 in the lavaan intro, which you can download from the documentation section on the lavaan website (http://lavaan.org). Yves. -- Yves Rosseel -- http://www.da.ugent.be Department of Data Analysis, Ghent University Henri Dunantlaan 1, B-9000 Gent, Belgium From M.Rosario.Garcia at slu.se Mon Apr 4 08:58:45 2011 From: M.Rosario.Garcia at slu.se (Rosario Garcia Gil) Date: Mon, 4 Apr 2011 08:58:45 +0200 Subject: [R] multiple variables Y and X Message-ID: <74776A1FD44FB94E9182E2C524E78772BD0783AE8A@exmbx3.ad.slu.se> Hello I have a model with several hundred Y variables, and also several 1000 X variables. The model is linear lm(Y ~ X). My questions are: 1.- how to avoid writing all Xs variables? is list() the right function? 2.- about the multiple Ys with dependence among some of them, how to incorporate that information in the linear model? 
Thank you Rosario From henriMone at gmail.com Mon Apr 4 09:49:43 2011 From: henriMone at gmail.com (Henri Mone) Date: Mon, 4 Apr 2011 09:49:43 +0200 Subject: [R] MySql Versus R In-Reply-To: References: Message-ID: Dear All, Thank you to all of you for your fast reply. I will run the test and subscribe to the R-sig-db list. Cheers, Henri From maechler at stat.math.ethz.ch Mon Apr 4 10:17:28 2011 From: maechler at stat.math.ethz.ch (Martin Maechler) Date: Mon, 4 Apr 2011 10:17:28 +0200 Subject: [R] RmetricsTools.R In-Reply-To: References: Message-ID: <19865.32408.790933.287750@stat.math.ethz.ch> >>>>> "LFP" == Luis Felipe Parra >>>>> on Mon, 4 Apr 2011 14:32:32 +0800 writes: LFP> Hello I read on the Rmetrics webpage that all the LFP> development packages could be intalled using the LFP> following command LFP> source("RmetricsTools.R") LFP> install.RmetricsDev("[Rmetrics package to be installed] LFP> I would like to know where I could get this LFP> RmetricsTools.R . I suppose it might be somewhere on LFP> R-Forge but I haven't been able to find it. Thank you It's "top-level". With the standard subversion client, you can get it, e.g., by svn export svn://svn.r-forge.r-project.org/svnroot/rmetrics/pkg/RmetricsTools.R -- Martin Maechler, ETH Zurich From jim at bitwrit.com.au Mon Apr 4 11:19:37 2011 From: jim at bitwrit.com.au (Jim Lemon) Date: Mon, 04 Apr 2011 19:19:37 +1000 Subject: [R] Error in "color2D.matplot" : "Error in plot.new() : figure margins too large" In-Reply-To: References: Message-ID: <4D998D29.2010309@bitwrit.com.au> On 04/03/2011 10:02 PM, shahab wrote: > Hi, > > I am using color2D.matplot (...) function of "plotrix" package. I used > a matrix of size around 20*20 > However, apparently it failed to visualize the matrix and gave the > following exception, which I don't have any idea about possible source > of this error. 
>
> "Error in plot.new() : figure margins too large"
>
> It would be appreciated if someone points me to the right origin of this error.
>
Hi Shahab,
If you could send me the data you used (real or fake, as long as it produces the error), I'll try to work out what has happened.

Jim

From nuncio.m at gmail.com Mon Apr 4 10:51:19 2011
From: nuncio.m at gmail.com (nuncio m)
Date: Mon, 4 Apr 2011 14:21:19 +0530
Subject: [R] svd
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From pdalgd at gmail.com Mon Apr 4 11:03:13 2011
From: pdalgd at gmail.com (peter dalgaard)
Date: Mon, 4 Apr 2011 11:03:13 +0200
Subject: [R] power of 2 way ANOVA with interaction
In-Reply-To:
References:
Message-ID:

On Apr 4, 2011, at 01:10 , Timothy Spier wrote:

>
> I've been searching for an answer to this for a while but no joy. I have a simple 2-way ANOVA with an interaction. I'd like to determine the power of this test for each factor (factor A, factor B, and the A*B interaction). How can I do this in R? I used to do this with "proc Glmpower" in SAS, but I can find no analogue in R.

They're not massively hard to do by hand, if you know what you're doing (which, admittedly is a bit hard to be sure of in this case). The basic structure can be lifted from power.anova.test and the name of the game is to work out the noncentrality parameter of the relevant F tests.

E.g., lifting an example from the SAS manual:

> twoway <- cbind(expand.grid(ex=factor(1:2),var=factor(1:3)),x=c(14,10,16,15,21,16))
> with(twoway,tapply(x,list(ex,var),mean))
   1  2  3
1 14 16 21
2 10 15 16

Now, you have 10 replicates of this with a specified SD of 5. If we do a "skeleton analysis" of the above table, we get

> anova(lm(x~ex*var,twoway))
Analysis of Variance Table

Response: x
          Df Sum Sq Mean Sq F value Pr(>F)
ex         1 16.667 16.6667
var        2 42.333 21.1667
ex:var     2  4.333  2.1667
Residuals  0  0.000
Warning message:
In anova.lm(lm(x ~ ex * var, twoway)) :
  ANOVA F-tests on an essentially perfect fit are unreliable

In a 10-fold replication, the SS would be 10 times bigger, and the residual Df would be 54; also, we need to take the error variance of 5^2 = 25 into account. The noncentrality for the interaction term is thus 43.333/25 and you can work out the power as

> pf(qf(.95,2,54),2,54,ncp=43.333/25,lower=F)
[1] 0.1914457

Similarly, the main effect powers are

> pf(qf(.95,2,54),2,54,423.333/25,lower=F)
[1] 0.956741
> pf(qf(.95,1,54),1,54,166.66667/25,lower=F)
[1] 0.7176535

(whatever that means in the presence of interaction, but that is a different discussion)

--
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com

From nuncio.m at gmail.com Mon Apr 4 11:22:30 2011
From: nuncio.m at gmail.com (nuncio m)
Date: Mon, 4 Apr 2011 14:52:30 +0530
Subject: [R] svd
In-Reply-To:
References:
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From coforfe at gmail.com Mon Apr 4 11:23:47 2011
From: coforfe at gmail.com (Carlos Ortega)
Date: Mon, 4 Apr 2011 11:23:47 +0200
Subject: [R] multiple variables Y and X
In-Reply-To: <74776A1FD44FB94E9182E2C524E78772BD0783AE8A@exmbx3.ad.slu.se>
References: <74776A1FD44FB94E9182E2C524E78772BD0783AE8A@exmbx3.ad.slu.se>
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From savicky at praha1.ff.cuni.cz Mon Apr 4 11:37:25 2011
From: savicky at praha1.ff.cuni.cz (Petr Savicky)
Date: Mon, 4 Apr 2011 11:37:25 +0200
Subject: [R] Support Counting
In-Reply-To: <1301897497861-3424730.post@n4.nabble.com>
References: <1301897497861-3424730.post@n4.nabble.com>
Message-ID: <20110404093724.GA21431@praha1.ff.cuni.cz>

On Mon, Apr 04, 2011 at 01:11:37AM -0500, psombe wrote:
> Hi,
> I'm new to R and trying to some simple analysis. I have a data set with
> about 88000 transactions and i want to perform a simple support count
> analysis of an itemset which is say not a complete transaction but a subset
> of a transaction.
> say
>
> {A,B,D} is a transaction and i want to find support of {A,B} even though it
> never occurs as only A,B in the entire set
>
>
> To this i needed to create a new itemsets class and then use the support
> function but somehow the answers never seem to tally.

Hi.

The answer depends on the representation of the data set. Can you describe the representation?

A possible representation of a data set for itemsets counting is a matrix of 0/1. Using this representation, computing the support may be done as follows.

  db <- matrix(0, nrow=5, ncol=5, dimnames=list(NULL, LETTERS[1:5]))
  db[1, c("A", "B", "D")] <- 1
  db[2, c("A", "B")] <- 1
  db[3, c("A", "D", "E")] <- 1
  db[4, c("B", "C", "D")] <- 1
  db[5, c("A", "B", "C")] <- 1
  db

       A B C D E
  [1,] 1 1 0 1 0
  [2,] 1 1 0 0 0
  [3,] 1 0 0 1 1
  [4,] 0 1 1 1 0
  [5,] 1 1 1 0 0

  itemset <- c("A", "B")

  # for each transaction, whether it contains c("A", "B")
  rowSums(db[, itemset]) == length(itemset)

  [1]  TRUE  TRUE FALSE FALSE  TRUE

  # the number of transactions containing c("A", "B")
  sum(rowSums(db[, itemset]) == length(itemset))

  [1] 3

Hope this helps.

Petr Savicky.
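
The 0/1-matrix counting shown above can be wrapped in a small reusable function. The following is a sketch only: itemset_support and its relative argument are hypothetical names introduced for illustration, not functions from any package. The drop = FALSE guard is an addition so that a single-item itemset still yields a matrix for rowSums().

```r
# Toy transaction matrix, rebuilt exactly as in the message above.
db <- matrix(0, nrow = 5, ncol = 5, dimnames = list(NULL, LETTERS[1:5]))
db[1, c("A", "B", "D")] <- 1
db[2, c("A", "B")] <- 1
db[3, c("A", "D", "E")] <- 1
db[4, c("B", "C", "D")] <- 1
db[5, c("A", "B", "C")] <- 1

# itemset_support() is a hypothetical helper name.
itemset_support <- function(db, itemset, relative = FALSE) {
  # TRUE for every transaction (row) that contains all items of the itemset
  hits <- rowSums(db[, itemset, drop = FALSE]) == length(itemset)
  if (relative) mean(hits) else sum(hits)
}

abs_ab <- itemset_support(db, c("A", "B"))          # absolute support count
rel_d  <- itemset_support(db, "D", relative = TRUE) # fraction of transactions
```

For a data set of about 88000 transactions a dense matrix may be wasteful; whether a sparse representation is needed depends on the number of distinct items, which the original question does not state.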
From dicko.ahmadou at gmail.com Mon Apr 4 11:37:47 2011 From: dicko.ahmadou at gmail.com (ahmadou dicko) Date: Mon, 4 Apr 2011 09:37:47 +0000 Subject: [R] another question on shapefiles and geom_point in ggplot2 In-Reply-To: References: <4D9772A4.40208@gmail.com> <276101.2012.qm@web56602.mail.re3.yahoo.com> <4D97BE22.4040002@gmail.com> <6775.63438.qm@web56603.mail.re3.yahoo.com> <4D97C9E1.7080207@gmail.com> <297554.87949.qm@web56603.mail.re3.yahoo.com> <4D980737.7030401@gmail.com> <472714.79454.qm@web56606.mail.re3.yahoo.com> <4D981080.9040701@gmail.com> <173268.37139.qm@web56608.mail.re3.yahoo.com> <4D981220.7060404@gmail.com> <969108.33353.qm@web56603.mail.re3.yahoo.com> <4D98DDB6.9030503@gmail.com> <874959.51997.qm@web56604.mail.re3.yahoo.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From kitty.a1000 at gmail.com Mon Apr 4 12:35:08 2011 From: kitty.a1000 at gmail.com (kitty) Date: Mon, 4 Apr 2011 11:35:08 +0100 Subject: [R] Deriving formula with deriv Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From y.jiao at ucl.ac.uk Mon Apr 4 12:35:51 2011 From: y.jiao at ucl.ac.uk (Yan Jiao) Date: Mon, 4 Apr 2011 11:35:51 +0100 Subject: [R] add zero in front of numbers Message-ID: <7785EFE1CC2D9F4A96F6A415BC1C2BD0168DD4C1@PC6-46.pogb.cancer.ucl.ac.uk> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From luridao at gmail.com Mon Apr 4 12:43:19 2011 From: luridao at gmail.com (Luis Ridao) Date: Mon, 4 Apr 2011 11:43:19 +0100 Subject: [R] add zero in front of numbers In-Reply-To: <7785EFE1CC2D9F4A96F6A415BC1C2BD0168DD4C1@PC6-46.pogb.cancer.ucl.ac.uk> References: <7785EFE1CC2D9F4A96F6A415BC1C2BD0168DD4C1@PC6-46.pogb.cancer.ucl.ac.uk> Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From gavin.simpson at ucl.ac.uk Mon Apr 4 13:08:26 2011 From: gavin.simpson at ucl.ac.uk (Gavin Simpson) Date: Mon, 04 Apr 2011 12:08:26 +0100 Subject: [R] principal components In-Reply-To: References: Message-ID: <1301915306.2986.10.camel@chrysothemis.geog.ucl.ac.uk> On Fri, 2011-04-01 at 11:52 +0530, nuncio m wrote: > HI all, > I am trying to compute the EOF of a matrix using prcomp but unable to get > the expansion co-efficients. > is it possible using prcomp or are there any other methods > thanks > nuncio > *sigh* > RSiteSearch("EOF") It is at times like this that an R equivalent of: http://lmgtfy.com/ would be handy ;-) Or it slightly cruder cousin... G -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% From bialozyt at biologie.uni-marburg.de Mon Apr 4 11:11:08 2011 From: bialozyt at biologie.uni-marburg.de (bialozyt at biologie.uni-marburg.de) Date: Mon, 04 Apr 2011 02:11:08 -0700 Subject: [R] Clarks 2Dt function in R Message-ID: <29430419.17555.1301908268731.JavaMail.nabble@joe.nabble.com> Dear Ben, you answerd to Nancy Shackelford about Clarks 2Dt function. 
Since the thread ended just after your reply, I would like to ask, if you have an idea how to use this function in R I defined it the following way: function(x , p, u) { (p/(pi*u))*(1+(x^2/u))^(p+1) } and would like to fit this one to my obeservational data (count) [,1] [,2] [1,] 15 12 [2,] 45 13 [3,] 75 10 [4,] 105 8 [5,] 135 16 [6,] 165 5 [7,] 195 15 [8,] 225 8 [9,] 255 9 [10,] 285 12 [11,] 315 5 [12,] 345 4 [13,] 375 1 [14,] 405 1 [15,] 435 1 [16,] 465 0 [17,] 495 1 [18,] 525 2 [19,] 555 0 [20,] 585 0 [21,] 615 0 [22,] 645 0 [23,] 675 0 but I am not able to fit anything. Do you have an idea? I guess there is something wrong in my formula for Clarks 2Dt Thank you for reading Ciao Ronald Bialozyt From s.zaidi.ke at amu.ac.in Mon Apr 4 11:15:15 2011 From: s.zaidi.ke at amu.ac.in (Sadaf Zaidi) Date: Mon, 4 Apr 2011 15:45:15 +0630 Subject: [R] Please help Message-ID: <20110404091303.M91337@mail.amu.ac.in> Dear Sir/Madam, I am stuck with a nagging problem in using R for SVM regression. My data has 5 dimensions and 400 observations. The independent variables are : Peb, Ksub, Sub, and Xtt. The dependent variable is: Rexp. I tried using the svm.tune function to tune the hyper parameters: gamma, epsilon and C. I am getting the following error message: Error in predict.svm(ret, xhold, decision.values+TRUE): Model is empty! May you please help me! SADAF ZAIDI Associate Professor Department of Aligarh Muslim University Aligarh 202002. From mitra at informatik.uni-tuebingen.de Mon Apr 4 11:54:25 2011 From: mitra at informatik.uni-tuebingen.de (suparna mitra) Date: Mon, 4 Apr 2011 11:54:25 +0200 Subject: [R] ticklabs in scatterplot3d Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available
URL:

From sadaf63in at yahoo.com Mon Apr 4 12:43:35 2011
From: sadaf63in at yahoo.com (sadaf zaidi)
Date: Mon, 4 Apr 2011 03:43:35 -0700 (PDT)
Subject: [R] Problem using svm.tune
Message-ID: <208248.58466.qm@web37905.mail.mud.yahoo.com>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From january.weiner at mpiib-berlin.mpg.de Mon Apr 4 12:55:13 2011
From: january.weiner at mpiib-berlin.mpg.de (January Weiner)
Date: Mon, 4 Apr 2011 12:55:13 +0200
Subject: [R] add zero in front of numbers
In-Reply-To: <7785EFE1CC2D9F4A96F6A415BC1C2BD0168DD4C1@PC6-46.pogb.cancer.ucl.ac.uk>
References: <7785EFE1CC2D9F4A96F6A415BC1C2BD0168DD4C1@PC6-46.pogb.cancer.ucl.ac.uk>
Message-ID:

Dear Yan,

apart from formatC, you can also use sprintf, which works almost exactly like the C sprintf function. To convert an integer x to a string padded with leading zeros to a total width of 5 characters, you do:

sprintf( "%05d", x )

For the three-character form in your example (1->001, 19->019), the format would be "%03d".

Best regards,
j.

On Mon, Apr 4, 2011 at 12:35 PM, Yan Jiao wrote:
> Dear R users,
>
> I need to add 0 in front of a series of numbers, e.g. 1->001, 19->019,
> Is there a fast way of doing that?
>
> Many thanks
>
> yan
>
>        [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--
-------- Dr. January Weiner 3 --------------------------------------
Max Planck Institute for Infection Biology
Charitéplatz 1
D-10117 Berlin, Germany
Web : www.mpiib-berlin.mpg.de
Tel : +49-30-28460514

From m_hofert at web.de Mon Apr 4 13:39:57 2011
From: m_hofert at web.de (Marius Hofert)
Date: Mon, 4 Apr 2011 13:39:57 +0200
Subject: [R] lattice: how to "center" a subtitle?
Message-ID: Dear expeRts, I recently asked for a real "centered" title (see, e.g., http://tolstoy.newcastle.edu.au/R/e13/help/11/01/0135.html). A nice solution (from Deepayan Sarkar) is to use "xlab.top" instead of "main": library(lattice) trellis.device("pdf") print(xyplot(0 ~ 0, xlab.top = "This title is now 'centered' for the human's eye", sub = "but subtitles are not centered", scales = list(alternating = c(1,1), tck = c(1,0)))) dev.off() My question is whether there is something similar for *sub*titles [so something like "xlab.bottom"]? As you can see from the plot, the subtitle does not seem to be "centered" for the human's eye. I would like to center it according to the x-axis label. Cheers, Marius From mspinola10 at gmail.com Mon Apr 4 13:55:41 2011 From: mspinola10 at gmail.com (=?ISO-8859-1?Q?Manuel_Sp=EDnola?=) Date: Mon, 4 Apr 2011 05:55:41 -0600 Subject: [R] another question on shapefiles and geom_point in ggplot2 In-Reply-To: References: <4D9772A4.40208@gmail.com> <276101.2012.qm@web56602.mail.re3.yahoo.com> <4D97BE22.4040002@gmail.com> <6775.63438.qm@web56603.mail.re3.yahoo.com> <4D97C9E1.7080207@gmail.com> <297554.87949.qm@web56603.mail.re3.yahoo.com> <4D980737.7030401@gmail.com> <472714.79454.qm@web56606.mail.re3.yahoo.com> <4D981080.9040701@gmail.com> <173268.37139.qm@web56608.mail.re3.yahoo.com> <4D981220.7060404@gmail.com> <969108.33353.qm@web56603.mail.re3.yahoo.com> <4D98DDB6.9030503@gmail.com> <874959.51997.qm@web56604.mail.re3.yahoo.com> Message-ID: <4D99B1BD.6050702@gmail.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From rasanpreet.kaur at gmail.com Mon Apr 4 13:24:51 2011 From: rasanpreet.kaur at gmail.com (rasanpreet kaur suri) Date: Mon, 4 Apr 2011 13:24:51 +0200 Subject: [R] system() command in R Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From skfglades at gmail.com Mon Apr 4 14:30:43 2011 From: skfglades at gmail.com (Steve Friedman) Date: Mon, 4 Apr 2011 08:30:43 -0400 Subject: [R] moving mean and moving variance functions Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From david.carslaw at kcl.ac.uk Mon Apr 4 14:31:19 2011 From: david.carslaw at kcl.ac.uk (carslaw) Date: Mon, 4 Apr 2011 07:31:19 -0500 (CDT) Subject: [R] Examples of web-based Sweave use? Message-ID: <1301920279327-3425324.post@n4.nabble.com> I appreciate that this is OT, but I'd be grateful for pointers to examples of where Sweave has been used for web-based applications. In particular, examples of where reports/analyses are produced automatically through submission of data to a web-sever. I am mostly interested in situations where pdf reports have been produced rather than, say, a plot/table etc shown on a web page. I've had limited success finding examples on this. Many thanks. David Carslaw Environmental Research Group MRC-HPA Centre for Environment and Health King's College London Franklin Wilkins Building Stamford Street London SE1 9NH david.carslaw at kcl.ac.uk -- View this message in context: http://r.789695.n4.nabble.com/Examples-of-web-based-Sweave-use-tp3425324p3425324.html Sent from the R help mailing list archive at Nabble.com. From thomas.rusch at wu.ac.at Mon Apr 4 14:24:55 2011 From: thomas.rusch at wu.ac.at (Thomas Rusch) Date: Mon, 04 Apr 2011 14:24:55 +0200 Subject: [R] I think I just broke R Message-ID: <1301919895.1642.8.camel@arbeitstier> > >> and time to upgrade R > > I'm still fighting to find out how to upgrade stuff on Ubuntu. After > a > repository update the newest available version was still 2.10.1. > I'll figure it out, sooner or later :) > That's simple. 
Just add $deb http:///bin/linux/ubuntu maverick/ to /etc/apt/sources.list (or whatever Ubuntu version you use) and type $sudo aptitude update $sudo aptitude safe-upgrade See http://cran.r-project.org/bin/linux/ubuntu/ Regards From jshleap at DAL.CA Mon Apr 4 14:24:34 2011 From: jshleap at DAL.CA (Jose Hleap Lozano) Date: Mon, 04 Apr 2011 09:24:34 -0300 Subject: [R] hc2Newick is different than th hclust dendrogram In-Reply-To: <20110401174755.20670jzo8r2746pc@wm4.dal.ca> References: <20110401174755.20670jzo8r2746pc@wm4.dal.ca> Message-ID: <20110404092434.93826vox72jcqv7k@wm1.dal.ca> Hi R helpers... I am having troubles because of the discrepancy between the dendrogram plotted from hclust and what is wrote in the hc2Newick file. I've got a matrix C: hc <- hclust(dist(C)) plot(hc) with the: write(hc2Newick(hc),file='test.newick') both things draw completely different "trees"... I have also tried with the raw distance matrix D and also the agnes function, but the same happens. The hclus and agnes dendrogram is logical, whereas the newick tree is not. Thanks for any help! PS> attached C matrix that is a 112x112 matrix -- Jose Sergio Hleap Lozano, M. Sc. Ph. D. Student, Dalhousie University Researcher, SQUALUS Foundation From ggrothendieck at gmail.com Mon Apr 4 15:17:16 2011 From: ggrothendieck at gmail.com (Gabor Grothendieck) Date: Mon, 4 Apr 2011 09:17:16 -0400 Subject: [R] moving mean and moving variance functions In-Reply-To: References: Message-ID: On Mon, Apr 4, 2011 at 8:30 AM, Steve Friedman wrote: > Hello > > > Lets say as an example I have a dataframe with the following attributes: > rownum(1:405), colnum(1:287), year(2000:2009), daily(rownum x colnum x year) > and foragePotential (0:1, by 0.01). ?The data is actually stored in a netcdf > file and I'm trying to provide a conceptual version of the data. > > Ok. I need to calculate a moving mean and a moving variance for each cell on > the following temporal > windows - 7 day, 14 day, and 28 day. 
So far I have code for the moving
> average.
>
> ma <- function(x , n) {
>          filter(x, rep(1/n, n), sides = 1)
>      }   # note that when the function is used, n is defined for the
> temporal period (7, 14, and 28), and x is the input variable.
>
>
> ma7 <-   ma(dat, 7)  # where dat is accessing the foraging potential of the
> birds.
> ma14 <- ma(dat, 14)
> ma28 <- ma(dat, 28)
>
> This works fine.  What I don't have is the code for a moving variance.
>
> filter in the function above is included in the stats package and conducts a
> linear filtering on a Time Series.
>
> Is there comparable code some place in R for a moving variance?
>

See rollmean and rollapply in the zoo package and runmean and runsd in
the caTools package.

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

From lawrence.michael at gene.com Mon Apr 4 15:33:07 2011
From: lawrence.michael at gene.com (Michael Lawrence)
Date: Mon, 4 Apr 2011 06:33:07 -0700
Subject: [R] RGtk2: How to populate an GtkListStore data model?
In-Reply-To: <4D993B1E.907@yahoo.com.br>
References: <4D993B1E.907@yahoo.com.br>
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From tal.galili at gmail.com Mon Apr 4 15:38:36 2011
From: tal.galili at gmail.com (Tal Galili)
Date: Mon, 4 Apr 2011 15:38:36 +0200
Subject: [R] Examples of web-based Sweave use?
In-Reply-To: <1301920279327-3425324.post@n4.nabble.com>
References: <1301920279327-3425324.post@n4.nabble.com>
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From den.alpin at gmail.com Mon Apr 4 14:49:17 2011
From: den.alpin at gmail.com (Den Alpin)
Date: Mon, 4 Apr 2011 14:49:17 +0200
Subject: [R] How to speed up grouping time series, help please
Message-ID:

I retrieve for a few hundred times a group of time series (10-15 ts
with 10000 values each), on every group I do some calculation, graphs
etc.
I wonder if there is a faster method than what presented below to get an appropriate timeseries object. Making a query with RODBC for every group I get a data frame like this: > X ID DATE VALUE 14 3 2000-01-01 00:00:03 0.5726334 4 1 2000-01-01 00:00:03 0.8830174 1 1 2000-01-01 00:00:00 0.2875775 15 3 2000-01-01 00:00:04 0.1029247 11 3 2000-01-01 00:00:00 0.9568333 9 2 2000-01-01 00:00:03 0.5514350 7 2 2000-01-01 00:00:01 0.5281055 6 2 2000-01-01 00:00:00 0.0455565 12 3 2000-01-01 00:00:01 0.4533342 8 2 2000-01-01 00:00:02 0.8924190 3 1 2000-01-01 00:00:02 0.4089769 13 3 2000-01-01 00:00:02 0.6775706 And I want to get a timeSeries object or xts object like this: 1 2 3 2000-01-01 00:00:00 0.2875775 0.0455565 0.9568333 2000-01-01 00:00:01 NA 0.5281055 0.4533342 2000-01-01 00:00:02 0.4089769 0.8924190 0.6775706 2000-01-01 00:00:03 0.8830174 0.5514350 0.5726334 2000-01-01 00:00:04 NA NA 0.1029247 Input data can be sorted or unsorted (the most complicated case is in the example, unsorted and missing data) in the sense that I can sort in query if I can take an advantage from this. Some considerations: - Xts is generally faster than timeSeries - both accept a matrix so if I can create a matrix like the one represented above and an array of characters representing dates faster than what possible with xts:::cbind, for examole,I will have a faster implementation (package data.table ?). - create timeseries objects in multithread and then merge (package plyr ?) - faster merge algorithms? 
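The "build a matrix first" idea in the considerations above can be tried in base R alone. What follows is only a minimal sketch, assuming the data frame has the ID, DATE and VALUE columns shown in the post and at most one VALUE per (DATE, ID) pair. The names `dates`, `ids` and `wide` are introduced here for illustration and do not come from the original message.

```r
# Long-format data as shown in the post (unsorted, with gaps).
X <- data.frame(
  ID    = c(3, 1, 1, 3, 3, 2, 2, 2, 3, 2, 1, 3),
  DATE  = c("2000-01-01 00:00:03", "2000-01-01 00:00:03", "2000-01-01 00:00:00",
            "2000-01-01 00:00:04", "2000-01-01 00:00:00", "2000-01-01 00:00:03",
            "2000-01-01 00:00:01", "2000-01-01 00:00:00", "2000-01-01 00:00:01",
            "2000-01-01 00:00:02", "2000-01-01 00:00:02", "2000-01-01 00:00:02"),
  VALUE = c(0.5726334, 0.8830174, 0.2875775, 0.1029247, 0.9568333, 0.5514350,
            0.5281055, 0.0455565, 0.4533342, 0.8924190, 0.4089769, 0.6775706),
  stringsAsFactors = FALSE
)

# Row and column keys of the wide matrix. ISO-formatted date strings
# sort chronologically, so a plain character sort is enough here.
dates <- sort(unique(X$DATE))
ids   <- sort(unique(X$ID))

# Start from an all-NA matrix so missing (DATE, ID) pairs stay NA.
wide <- matrix(NA_real_, nrow = length(dates), ncol = length(ids),
               dimnames = list(dates, as.character(ids)))

# Fill every observed cell in one vectorised assignment,
# using a two-column (row, column) index matrix.
wide[cbind(match(X$DATE, dates), match(X$ID, ids))] <- X$VALUE

print(wide)
```

The matrix, together with `as.POSIXct(rownames(wide), tz = "GMT")` as the index, could then be handed to a time-series constructor. Whether this is actually faster than a merge-based route would need to be timed on the real data.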
Below some code to generate the test case above:


set.seed(123)
N <- 5 # number of observations
K <- 3 # number of timeseries ID

X <- data.frame(
  ID = rep(1:K, each = N),
  DATE = as.character(rep(as.POSIXct("2000-01-01", tz = "GMT")+ 0:(N-1), K)),
  VALUE = runif(N*K), stringsAsFactors = FALSE)

X <- X[sample(1:(N*K), N*K),] # sample observations to get random order (optional)
X <- X[-(sample(1:nrow(X), floor(nrow(X)*0.2))),] # 20% missing

head(X, 15)

# use explicitly environments to avoid '<<-'
buildTimeSeriesFromDataFrame <- function(x, env)
{
  {
    if(exists("xx", envir = env)) # if exist variable xx in env cbind
      assign("xx",
        cbind(get("xx", env), timeSeries(x$VALUE, x$DATE,
          format = '%Y-%m-%d %H:%M:%S',
          zone = 'GMT', units = as.character(x$ID[1]))),
        envir = env)
    else  # create xx in env
      assign("xx",
        timeSeries(x$VALUE, x$DATE, format = '%Y-%m-%d %H:%M:%S',
          zone = 'GMT', units = as.character(x$ID[1])),
        envir = env)

    return(TRUE)
  }
}

# use package plyr, faster than 'by' function
tsDaply <- function(...)
{
  library(plyr)
  e1 <- new.env(parent = baseenv()) #create a new env
  res <- daply(X, "ID", buildTimeSeriesFromDataFrame,
      env = e1)
  return(get("xx", e1)) # return xx from env
}

##replicate 100 times
#Time03 <- replicate(100,
#  system.time(tsDaply(X, X$ID))[[1]])
#median(Time03)

# result
tsDaply(X, X$ID)


Thanks in advance for any input, best regards,
Den

From sharma.ram.h at gmail.com Mon Apr 4 15:32:14 2011
From: sharma.ram.h at gmail.com (Ram H. Sharma)
Date: Mon, 4 Apr 2011 09:32:14 -0400
Subject: [R] reading from text file that have different rowlength and create a data frame
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From andrew.decker.steen at gmail.com Mon Apr 4 15:39:34 2011
From: andrew.decker.steen at gmail.com (Andrew D. Steen)
Date: Mon, 4 Apr 2011 15:39:34 +0200
Subject: [R] gap.barplot doesn't support data arrays?
Message-ID: <4d99ca0f.cc7e0e0a.0b7e.ffffa600@mx.google.com> I am trying to make a barplot with a broken axis using gap.barplot (in the indispensable plotrix package). This works well when the data is a vector: > twogrp<-c(rnorm(10)+4,rnorm(10)+20) > gap.barplot(twogrp,gap=c(8,16),xlab="Index",ytics=c(3,6,17,20),ylab="Group values",main="Barplot with gap") But when the data is an array (for a bar plot with multiple series) I get an error and a strange plot with no y-tics and bars stretching downwards, as if all the values were negative: > twogrp2<-array(twogrp, dim=c(2,5)) > gap.barplot(twogrp2,gap=c(8,16),xlab="Index",ytics=c(3,6,17,20),ylab="Group values",main="Barplot with gap") Error in rect(xtics[bigones] - halfwidth, botgap, xtics[bigones] + halfwidth, : cannot mix zero-length and non-zero-length coordinates However, the main title and axis labels do appear correctly. Are data arrays unsupported for gap.barplot, or am I missing something? Thanks, Drew Steen From jholtman at gmail.com Mon Apr 4 15:57:25 2011 From: jholtman at gmail.com (jim holtman) Date: Mon, 4 Apr 2011 09:57:25 -0400 Subject: [R] reading from text file that have different rowlength and create a data frame In-Reply-To: References: Message-ID: ?read.table Then look at the 'fill' & 'flush' parameters; this may do the trick On Mon, Apr 4, 2011 at 9:32 AM, Ram H. Sharma wrote: > Hi R-experts > > I have many text files to read and combined them into one into R that are > output from other programs. My textfile have unbalanced number of rows for > example: > > ;this is example > ; r help > Var1 ? ? Var2 ? ? ?Var3 ? ? ?Var4 ? ? ?Var5 > 0 ? ? ? ? ? ? 0.05 ? ? 0.01 ? ? ? ?12 > 1 ? ? ? ? ? ? 0.04 ? ? 0.06 ? ? ? ?18 ? ? ? ?A > 2 ? ? ? ? ? ? 0.05 ? ? 0.08 ? ? ? ?14 > 3 ? ? ? ? ? ? 0.01 ? ? 0.06 ? ? ? ?15 ? ? ? B > 4 ? ? ? ? ? ? 0.05 ? ? 0.07 ? ? ? ?14 ? ? ? 
C > > and so on > Inames<-as.data.frame(read.table("CLG1mpd.asc",header=T,comment=";")) > Inames<-as.matrix(read.table("example.txt",header=T,comment=";")) > Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, > : > ?line 1 did not have 5 elements > > > In bestcase scenerio, I want to fill the blank space with NA's, with matrix > or dataframe > ?Var1 ? ? Var2 ? ? ?Var3 ? ? ?Var4 ? ? ?Var5 > 0 ? ? ? ? ? ? 0.05 ? ? 0.01 ? ? ? ?12 ? ? ? ?NA > 1 ? ? ? ? ? ? 0.04 ? ? 0.06 ? ? ? ?18 ? ? ? ?A > 2 ? ? ? ? ? ? 0.05 ? ? 0.08 ? ? ? ?14 ? ? ? NA > 3 ? ? ? ? ? ? 0.01 ? ? 0.06 ? ? ? ?15 ? ? ? B > 4 ? ? ? ? ? ? 0.05 ? ? 0.07 ? ? ? ?14 ? ? ? C > > The minimum would be to ?remove the column Var5, so that my data.frame would > look like the follows: > ?Var1 ? ? Var2 ? ? ?Var3 ? ? ?Var4 > 0 ? ? ? ? ? ? 0.05 ? ? 0.01 ? ? ? ?12 > 1 ? ? ? ? ? ? 0.04 ? ? 0.06 ? ? ? ?18 > 2 ? ? ? ? ? ? 0.05 ? ? 0.08 ? ? ? ?14 > 3 ? ? ? ? ? ? 0.01 ? ? 0.06 ? ? ? ?15 > 4 ? ? ? ? ? ? 0.05 ? ? 0.07 ? ? ? ?14 > -- > Thank you in advance for the help. > > Ram H > > ? ? ? ?[[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? From jim.silverton at gmail.com Mon Apr 4 16:00:56 2011 From: jim.silverton at gmail.com (Jim Silverton) Date: Mon, 4 Apr 2011 10:00:56 -0400 Subject: [R] Difference in mixture normals and one density Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From jholtman at gmail.com Mon Apr 4 16:01:27 2011 From: jholtman at gmail.com (jim holtman) Date: Mon, 4 Apr 2011 10:01:27 -0400 Subject: [R] reading from text file that have different rowlength and create a data frame In-Reply-To: References: Message-ID: try this: > x <- read.table(textConnection(";this is example + ; r help + Var1 Var2 Var3 Var4 Var5 + 0 0.05 0.01 12 + 1 0.04 0.06 18 A + 2 0.05 0.08 14 + 3 0.01 0.06 15 B + 4 0.05 0.07 14 C") + , comment = ';' + , fill = TRUE + , header = TRUE + , na.strings = '' + ) > closeAllConnections() > > x Var1 Var2 Var3 Var4 Var5 1 0 0.05 0.01 12 2 1 0.04 0.06 18 A 3 2 0.05 0.08 14 4 3 0.01 0.06 15 B 5 4 0.05 0.07 14 C On Mon, Apr 4, 2011 at 9:32 AM, Ram H. Sharma wrote: > Hi R-experts > > I have many text files to read and combined them into one into R that are > output from other programs. My textfile have unbalanced number of rows for > example: > > ;this is example > ; r help > Var1 ? ? Var2 ? ? ?Var3 ? ? ?Var4 ? ? ?Var5 > 0 ? ? ? ? ? ? 0.05 ? ? 0.01 ? ? ? ?12 > 1 ? ? ? ? ? ? 0.04 ? ? 0.06 ? ? ? ?18 ? ? ? ?A > 2 ? ? ? ? ? ? 0.05 ? ? 0.08 ? ? ? ?14 > 3 ? ? ? ? ? ? 0.01 ? ? 0.06 ? ? ? ?15 ? ? ? B > 4 ? ? ? ? ? ? 0.05 ? ? 0.07 ? ? ? ?14 ? ? ? C > > and so on > Inames<-as.data.frame(read.table("CLG1mpd.asc",header=T,comment=";")) > Inames<-as.matrix(read.table("example.txt",header=T,comment=";")) > Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, > : > ?line 1 did not have 5 elements > > > In bestcase scenerio, I want to fill the blank space with NA's, with matrix > or dataframe > ?Var1 ? ? Var2 ? ? ?Var3 ? ? ?Var4 ? ? ?Var5 > 0 ? ? ? ? ? ? 0.05 ? ? 0.01 ? ? ? ?12 ? ? ? ?NA > 1 ? ? ? ? ? ? 0.04 ? ? 0.06 ? ? ? ?18 ? ? ? ?A > 2 ? ? ? ? ? ? 0.05 ? ? 0.08 ? ? ? ?14 ? ? ? NA > 3 ? ? ? ? ? ? 0.01 ? ? 0.06 ? ? ? ?15 ? ? ? B > 4 ? ? ? ? ? ? 0.05 ? ? 0.07 ? ? ? ?14 ? ? ? 
C > > The minimum would be to ?remove the column Var5, so that my data.frame would > look like the follows: > ?Var1 ? ? Var2 ? ? ?Var3 ? ? ?Var4 > 0 ? ? ? ? ? ? 0.05 ? ? 0.01 ? ? ? ?12 > 1 ? ? ? ? ? ? 0.04 ? ? 0.06 ? ? ? ?18 > 2 ? ? ? ? ? ? 0.05 ? ? 0.08 ? ? ? ?14 > 3 ? ? ? ? ? ? 0.01 ? ? 0.06 ? ? ? ?15 > 4 ? ? ? ? ? ? 0.05 ? ? 0.07 ? ? ? ?14 > -- > Thank you in advance for the help. > > Ram H > > ? ? ? ?[[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? From josh.m.ulrich at gmail.com Mon Apr 4 16:07:58 2011 From: josh.m.ulrich at gmail.com (Joshua Ulrich) Date: Mon, 4 Apr 2011 09:07:58 -0500 Subject: [R] How to speed up grouping time series, help please In-Reply-To: References: Message-ID: Hi Dan, On Mon, Apr 4, 2011 at 7:49 AM, Den Alpin wrote: > I retrieve for a few hundred times a group of time series (10-15 ts > with 10000 values each), on every group I do some calculation, graphs > etc. I wonder if there is a faster method than what presented below to > get an appropriate timeseries object. > > Making a query with RODBC for every group I get a data frame like this: > >> X > ? ID ? ? ? ? ? ? ? ?DATE ? ? VALUE > 14 ?3 2000-01-01 00:00:03 0.5726334 > 4 ? 1 2000-01-01 00:00:03 0.8830174 > 1 ? 1 2000-01-01 00:00:00 0.2875775 > 15 ?3 2000-01-01 00:00:04 0.1029247 > 11 ?3 2000-01-01 00:00:00 0.9568333 > 9 ? 2 2000-01-01 00:00:03 0.5514350 > 7 ? 2 2000-01-01 00:00:01 0.5281055 > 6 ? 2 2000-01-01 00:00:00 0.0455565 > 12 ?3 2000-01-01 00:00:01 0.4533342 > 8 ? 2 2000-01-01 00:00:02 0.8924190 > 3 ? 
1 2000-01-01 00:00:02 0.4089769 > 13 ?3 2000-01-01 00:00:02 0.6775706 > > And I want to get a timeSeries object or xts object like this: > > ? ? ? ? ? ? ? ? ? ? ? ? ? ?1 ? ? ? ? 2 ? ? ? ? 3 > 2000-01-01 00:00:00 0.2875775 0.0455565 0.9568333 > 2000-01-01 00:00:01 ? ? ? ?NA 0.5281055 0.4533342 > 2000-01-01 00:00:02 0.4089769 0.8924190 0.6775706 > 2000-01-01 00:00:03 0.8830174 0.5514350 0.5726334 > 2000-01-01 00:00:04 ? ? ? ?NA ? ? ? ?NA 0.1029247 > > > Input data can be sorted or unsorted (the most complicated case is in > the example, unsorted and missing data) in the sense that I can sort > in query if I can take an advantage from this. > > Some considerations: > - Xts is generally faster than timeSeries > - both accept a matrix so if I can create a matrix like the one > represented above and an array of characters representing dates faster > than what possible with xts:::cbind, for examole,I will have a faster > implementation (package data.table ?). > - create timeseries objects in multithread and then merge (package plyr ?) > - faster merge algorithms? > > Below some code to generate the test case above: > > > set.seed(123) > N <- 5 # number of observations > K <- 3 # number of timeseries ID > > X <- data.frame( > ?ID = rep(1:K, each = N), > ?DATE = as.character(rep(as.POSIXct("2000-01-01", tz = "GMT")+ 0:(N-1), K)), > ?VALUE = runif(N*K), stringsAsFactors = FALSE) > > X <- X[sample(1:(N*K), N*K),] # sample observations to get random > order (optional) > X <- X[-(sample(1:nrow(X), floor(nrow(X)*0.2))),] # 20% missing > > head(X, 15) > > # use explicitly environments to avoid '<<-' > buildTimeSeriesFromDataFrame <- function(x, env) > { > ?{ > ? ?if(exists("xx", envir = env)) # if exist variable xx in env cbind > ? ? ?assign("xx", > ? ? ? ?cbind(get("xx", env), timeSeries(x$VALUE, x$DATE, > ? ? ? ? ?format = '%Y-%m-%d %H:%M:%S', > ? ? ? ? ?zone = 'GMT', units = as.character(x$ID[1]))), > ? ? ? ?envir = env) > ? ?else ?# create xx in env > ? ? ?assign("xx", > ? ? 
? ?timeSeries(x$VALUE, x$DATE, format = '%Y-%m-%d %H:%M:%S', > ? ? ? ? ?zone = 'GMT', units = as.character(x$ID[1])), > ? ? ? ?envir = env) > > ? ?return(TRUE) > ?} > } > > # use package plyr, faster than 'by' function > tsDaply <- function(...) > { > ?library(plyr) > ?e1 <- new.env(parent = baseenv()) #create a new env > ?res <- daply(X, "ID", buildTimeSeriesFromDataFrame, > ? ? ?env = e1) > ?return(get("xx", e1)) # return xx from env > } > > ##replicate 100 times > #Time03 <- replicate(100, > # ?system.time(tsDaply(X, X$ID))[[1]]) > #median(Time03) > > # result > tsDaply(X, X$ID) > > > Thanks in advance for any input, best regards, > Den > > Here's how I would do it with xts: x <- xts(X[,c("ID","VALUE")], as.POSIXct(X[,"DATE"])) do.call(merge, split(x$VALUE,x$ID)) My xts solution compares favorably to your solution: > Time03 <- replicate(100, + system.time(tsDaply(X, X$ID))[[1]]) > median(Time03) [1] 0.02 > xtsTime <- replicate(100, + system.time(do.call(merge, split(x$VALUE,x$ID)))[[1]]) > median(xtsTime) [1] 0 Best, -- Joshua Ulrich | FOSS Trading: www.fosstrading.com From ggrothendieck at gmail.com Mon Apr 4 16:08:13 2011 From: ggrothendieck at gmail.com (Gabor Grothendieck) Date: Mon, 4 Apr 2011 10:08:13 -0400 Subject: [R] How to speed up grouping time series, help please In-Reply-To: References: Message-ID: On Mon, Apr 4, 2011 at 8:49 AM, Den Alpin wrote: > I retrieve for a few hundred times a group of time series (10-15 ts > with 10000 values each), on every group I do some calculation, graphs > etc. I wonder if there is a faster method than what presented below to > get an appropriate timeseries object. > > Making a query with RODBC for every group I get a data frame like this: > >> X > ? ID ? ? ? ? ? ? ? ?DATE ? ? VALUE > 14 ?3 2000-01-01 00:00:03 0.5726334 > 4 ? 1 2000-01-01 00:00:03 0.8830174 > 1 ? 1 2000-01-01 00:00:00 0.2875775 > 15 ?3 2000-01-01 00:00:04 0.1029247 > 11 ?3 2000-01-01 00:00:00 0.9568333 > 9 ? 2 2000-01-01 00:00:03 0.5514350 > 7 ? 
2 2000-01-01 00:00:01 0.5281055 > 6 ? 2 2000-01-01 00:00:00 0.0455565 > 12 ?3 2000-01-01 00:00:01 0.4533342 > 8 ? 2 2000-01-01 00:00:02 0.8924190 > 3 ? 1 2000-01-01 00:00:02 0.4089769 > 13 ?3 2000-01-01 00:00:02 0.6775706 > > And I want to get a timeSeries object or xts object like this: > > ? ? ? ? ? ? ? ? ? ? ? ? ? ?1 ? ? ? ? 2 ? ? ? ? 3 > 2000-01-01 00:00:00 0.2875775 0.0455565 0.9568333 > 2000-01-01 00:00:01 ? ? ? ?NA 0.5281055 0.4533342 > 2000-01-01 00:00:02 0.4089769 0.8924190 0.6775706 > 2000-01-01 00:00:03 0.8830174 0.5514350 0.5726334 > 2000-01-01 00:00:04 ? ? ? ?NA ? ? ? ?NA 0.1029247 > > > Input data can be sorted or unsorted (the most complicated case is in > the example, unsorted and missing data) in the sense that I can sort > in query if I can take an advantage from this. > > Some considerations: > - Xts is generally faster than timeSeries > - both accept a matrix so if I can create a matrix like the one > represented above and an array of characters representing dates faster > than what possible with xts:::cbind, for examole,I will have a faster > implementation (package data.table ?). > - create timeseries objects in multithread and then merge (package plyr ?) > - faster merge algorithms? > > Below some code to generate the test case above: > > > set.seed(123) > N <- 5 # number of observations > K <- 3 # number of timeseries ID > > X <- data.frame( > ?ID = rep(1:K, each = N), > ?DATE = as.character(rep(as.POSIXct("2000-01-01", tz = "GMT")+ 0:(N-1), K)), > ?VALUE = runif(N*K), stringsAsFactors = FALSE) > > X <- X[sample(1:(N*K), N*K),] # sample observations to get random > order (optional) > X <- X[-(sample(1:nrow(X), floor(nrow(X)*0.2))),] # 20% missing > > head(X, 15) > > # use explicitly environments to avoid '<<-' > buildTimeSeriesFromDataFrame <- function(x, env) > { > ?{ > ? ?if(exists("xx", envir = env)) # if exist variable xx in env cbind > ? ? ?assign("xx", > ? ? ? ?cbind(get("xx", env), timeSeries(x$VALUE, x$DATE, > ? ? ? ? 
format = '%Y-%m-%d %H:%M:%S',
>          zone = 'GMT', units = as.character(x$ID[1]))),
>        envir = env)
>    else  # create xx in env
>      assign("xx",
>        timeSeries(x$VALUE, x$DATE, format = '%Y-%m-%d %H:%M:%S',
>          zone = 'GMT', units = as.character(x$ID[1])),
>        envir = env)
>
>    return(TRUE)
>  }
> }
>
> # use package plyr, faster than 'by' function
> tsDaply <- function(...)
> {
>  library(plyr)
>  e1 <- new.env(parent = baseenv()) #create a new env
>  res <- daply(X, "ID", buildTimeSeriesFromDataFrame,
>      env = e1)
>  return(get("xx", e1)) # return xx from env
> }
>
> ##replicate 100 times
> #Time03 <- replicate(100,
> #  system.time(tsDaply(X, X$ID))[[1]])
> #median(Time03)
>
> # result
> tsDaply(X, X$ID)
>

Haven't checked how fast it is, but using read.zoo it's just one line
of code to produce the required matrix:

# set up input data frame, DF

Lines <- "ID,DATE,VALUE
3,2000-01-01 00:00:03,0.5726334
1,2000-01-01 00:00:03,0.8830174
1,2000-01-01 00:00:00,0.2875775
3,2000-01-01 00:00:04,0.1029247
3,2000-01-01 00:00:00,0.9568333
2,2000-01-01 00:00:03,0.5514350
2,2000-01-01 00:00:01,0.5281055
2,2000-01-01 00:00:00,0.0455565
3,2000-01-01 00:00:01,0.4533342
2,2000-01-01 00:00:02,0.8924190
1,2000-01-01 00:00:02,0.4089769
3,2000-01-01 00:00:02,0.6775706"
DF <- read.table(textConnection(Lines), header = TRUE, sep = ",")

# create zoo matrix

library(zoo)
z <- read.zoo(DF, split = 1, index = 2, tz = "")

The last line gives:

> z
                            1         2         3
2000-01-01 00:00:00 0.2875775 0.0455565 0.9568333
2000-01-01 00:00:01        NA 0.5281055 0.4533342
2000-01-01 00:00:02 0.4089769 0.8924190 0.6775706
2000-01-01 00:00:03 0.8830174 0.5514350 0.5726334
2000-01-01 00:00:04        NA        NA 0.1029247


--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com From andy_liaw at merck.com Mon Apr 4 16:24:15 2011 From: andy_liaw at merck.com (Liaw, Andy) Date: Mon, 4 Apr 2011 10:24:15 -0400 Subject: [R] Difference in mixture normals and one density In-Reply-To: References: Message-ID: Is something like this what you're looking for? R> library(nor1mix) R> nmix2 <- norMix(c(2, 3), sig2=c(25, 4), w=c(.2, .8)) R> dnorMix(1, nmix2) - dnorm(1, 2, 5) [1] 0.03422146 Andy > -----Original Message----- > From: r-help-bounces at r-project.org > [mailto:r-help-bounces at r-project.org] On Behalf Of Jim Silverton > Sent: Monday, April 04, 2011 10:01 AM > To: r-help at r-project.org > Subject: Re: [R] Difference in mixture normals and one density > > Hello, > I am trying to find out if R can do the following: > > I have a mixture of normals say f = 0.2*Normal(2, 5) + 0.8*Normal(3,2) > How do I find the difference in the densities at any > particular point of f > and at Normal(2,5)? > > -- > Thanks, > Jim. > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > Notice: This e-mail message, together with any attachme...{{dropped:11}} From mailinglist.honeypot at gmail.com Mon Apr 4 16:31:45 2011 From: mailinglist.honeypot at gmail.com (Steve Lianoglou) Date: Mon, 4 Apr 2011 10:31:45 -0400 Subject: [R] Please help In-Reply-To: <20110404091303.M91337@mail.amu.ac.in> References: <20110404091303.M91337@mail.amu.ac.in> Message-ID: Hi, On Mon, Apr 4, 2011 at 5:15 AM, Sadaf Zaidi wrote: > Dear Sir/Madam, > I am stuck with a nagging problem in using R for SVM regression. My data has 5 > dimensions and 400 observations. The independent variables are : > Peb, Ksub, Sub, and Xtt. > The dependent variable is: Rexp. 
> I tried using the svm.tune function to tune the hyper parameters: gamma, epsilon and C. > I am getting the following error message: > Error in predict.svm(ret, xhold, decision.values+TRUE): Model is empty! > May you please help me! You're not giving us much to go on, can you show us the code that you are using that gets you into this problem? (i) For example -- by "svm.tune", do you mean the "tune" function from the e1071 package? (ii) What is the exact function call you are using that gives you this error. (iii) Can you build a "normal" svm model without "tuning" it. For instance, does svm(x,y,..) work with your data? (iv) Are you sure that the values you are inputting to the svm (and/or tune function) are of the correct type? With your follow up email that provides the code you tried and answers to some of the Q's above. Also provide a small bit of your data that we can use to help you debug. You can easily do so by using the `dput` function. Say your data (predictors and label) are in a variable `x`, paste the output of the following command in your follow up email: R> dput(x[sample(nrow(x), 10),]) This will give us 10 random rows from your data that people trying to help you can use. -steve -- Steve Lianoglou Graduate Student: Computational Systems Biology ?| Memorial Sloan-Kettering Cancer Center ?| Weill Medical College of Cornell University Contact Info: http://cbio.mskcc.org/~lianos/contact From ehlers at ucalgary.ca Mon Apr 4 16:33:06 2011 From: ehlers at ucalgary.ca (Peter Ehlers) Date: Mon, 04 Apr 2011 07:33:06 -0700 Subject: [R] gap.barplot doesn't support data arrays? In-Reply-To: <4d99ca0f.cc7e0e0a.0b7e.ffffa600@mx.google.com> References: <4d99ca0f.cc7e0e0a.0b7e.ffffa600@mx.google.com> Message-ID: <4D99D6A2.7070604@ucalgary.ca> On 2011-04-04 06:39, Andrew D. Steen wrote: > I am trying to make a barplot with a broken axis using gap.barplot (in the > indispensable plotrix package). 
This works well when the data is a vector: > >> twogrp<-c(rnorm(10)+4,rnorm(10)+20) >> gap.barplot(twogrp,gap=c(8,16),xlab="Index",ytics=c(3,6,17,20),ylab="Group > values",main="Barplot with gap") > > But when the data is an array (for a bar plot with multiple series) I get an > error and a strange plot with no y-tics and bars stretching downwards, as if > all the values were negative: > >> twogrp2<-array(twogrp, dim=c(2,5)) >> > gap.barplot(twogrp2,gap=c(8,16),xlab="Index",ytics=c(3,6,17,20),ylab="Group > values",main="Barplot with gap") > > Error in rect(xtics[bigones] - halfwidth, botgap, xtics[bigones] + > halfwidth, : > cannot mix zero-length and non-zero-length coordinates > > However, the main title and axis labels do appear correctly. > > Are data arrays unsupported for gap.barplot, or am I missing something? Looks like they're not supported, as you could easily see from the code. But do gapped stacked barplots even make sense? Not to me. Still, I think that the plotrix documentation is somewhat spotty. The help page for gap.barplot indicates that the input 'y' should be 'data values'; not overly informative. Peter Ehlers > > Thanks, > Drew Steen > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From michalseneca at gmail.com Mon Apr 4 16:24:23 2011 From: michalseneca at gmail.com (michalseneca) Date: Mon, 4 Apr 2011 09:24:23 -0500 (CDT) Subject: [R] Creating multiple vector/list names-novice Message-ID: <1301927063332-3425616.post@n4.nabble.com> Hi I have very simple issue as I am still new to the group of R I have basically vector of names for which i want to create mutliple combinations and then place them in different vectors. 
In some other language I can just place a third dimension to separate list (or matrix) but i do not know how to do it in R. My issue is simple I use cc<-combn(colnames(DD),2) I would need to have this as vector1 or like vector[,,1] : cc<-combn(colnames(DD),2) vector2or like vector[,,2] cc<-combn(colnames(DD),3) etc..for up to k combinations something so then I can use for loop to go through the al of these combinations example: string<-"a", "b" , "c" ",d" vector/list(1) ab ac ad bc bd be cd ce de vector/list (2) abc abd abe bcd bce bde cde Can you help me with this.. I know that it is a simple question for this thread thank you.. Michal -- View this message in context: http://r.789695.n4.nabble.com/Creating-multiple-vector-list-names-novice-tp3425616p3425616.html Sent from the R help mailing list archive at Nabble.com. From bbolker at gmail.com Mon Apr 4 16:42:25 2011 From: bbolker at gmail.com (Ben Bolker) Date: Mon, 4 Apr 2011 14:42:25 +0000 Subject: [R] Clarks 2Dt function in R References: <29430419.17555.1301908268731.JavaMail.nabble@joe.nabble.com> Message-ID: biologie.uni-marburg.de> writes: > > Dear Ben, > > you answerd to Nancy Shackelford about Clarks 2Dt function. > Since the thread ended just after your reply, > I would like to ask, if you have an idea how to use this function in R > Dear Ronald, I got started on your problem, but I didn't finish it. I got a plausible answer to start with, but when checking the answer I ran into some trouble. Unfortunately, fitting these functions is a bit harder than one might expect ... it takes quite a bit of fussing to get a good, reliable answer. My partly-worked solution is below. > I defined it the following way: You were multiplying instead of dividing by the second term (I changed it by raising the term to a negative power instead. The lesson here: *always* do some sanity checks (graphical or otherwise) of your functions. 
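A numerical version of such a sanity check can be sketched in base R. This is only an illustration under stated assumptions: it uses the corrected form of the kernel (the term raised to a negative power), arbitrary parameter values p = 2 and u = 10 chosen here for the check, and it tests two properties one would expect of this formula: that it decreases with distance, and that it integrates to 1 over the plane.

```r
# Corrected 2Dt kernel: density per unit area at distance x.
clark2Dt <- function(x, p, u = 1) {
  (p / (pi * u)) / (1 + x^2 / u)^(p + 1)
}

p <- 2    # arbitrary shape value, chosen only for this check
u <- 10   # arbitrary scale value, chosen only for this check

# Check 1: the kernel should fall off monotonically with distance.
x <- seq(0, 50, by = 0.5)
dens <- clark2Dt(x, p = p, u = u)
decreasing <- all(diff(dens) < 0)

# Check 2: integrating over the plane in polar coordinates,
# i.e. the integral of 2 * pi * r * f(r) from 0 to Inf, should give 1.
total <- integrate(function(r) 2 * pi * r * clark2Dt(r, p = p, u = u),
                   lower = 0, upper = Inf)$value

print(decreasing)
print(total)
```

A version of the function that multiplies by the second term instead of dividing would fail the first check, because the curve would then rise with distance.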
I actually did the whole fit before I tried to plot the curves
and found that they were increasing rather than decreasing ...

## fixed
clark2Dt <- function(x , p, u=1) {
  (p/(pi*u))/(1+(x^2/u))^(p+1)
}

It might be preferable to define this in terms of s=sqrt(u) instead
(then s would be a scale parameter with the same units as x, more
easily interpretable ...

Sanity checks:

par(las=1,bty="l") ## personal preferences
curve(clark2Dt(x,p=6),from=0,to=5)
curve(clark2Dt(x,p=4),col=2,add=TRUE)
curve(clark2Dt(x,p=2),col=4,add=TRUE)
legend("topright",paste("p",c(6,4,2),sep="="),col=c(1,2,4),lty=1)

Grab data (in the future, if possible, please use dput(), which
puts your data in the most convenient form, or write out
a statement like this to define your data ...)

X <- as.data.frame(matrix(
c(15,12,
45,13,
75,10,
105,8,
135,16,
165,5,
195,15,
225,8,
255,9,
285,12,
315,5,
345,4,
375,1,
405,1,
435,1,
465,0,
495,1,
525,2,
555,0,
585,0,
615,0,
645,0,
675,0),
ncol=2,byrow=TRUE,
dimnames=list(NULL,c("dist","count"))))

## assume these are traps/samples with unit size
## (if not, it will get absorbed into the "fecundity" constant

library(bbmle)
m1 <- mle2(count~dnbinom(mu=f*clark2Dt(dist,p,u),size=k),
  data=X,start=list(f=20,u=10,p=5,k=2),
  lower=rep(0.002,4),method="L-BFGS-B")

## we get a plausible-looking fit ...
with(X,plot(count~dist,pch=16,las=1,bty="l"))
newdat <- data.frame(dist=1:700) ## overkill but harmless
lines(newdat$dist,predict(m1,newdata=newdat))

## but the coefficients look funny, especially f
coef(m1)

## tried resetting parscale but it's bogus (gets stuck at a worse likelihood)
m2 <- mle2(count~dnbinom(mu=f*clark2Dt(dist,p,u),size=k),
  data=X,start=list(f=20,u=10,p=5,k=2),
  control=list(parscale=abs(coef(m1))),
  lower=rep(0.002,4),method="L-BFGS-B")

m3 <- mle2(count~dnbinom(mu=exp(logf)*clark2Dt(dist,exp(logp),exp(logu)),
    size=exp(logk)),
  data=X,start=list(logf=log(20),logu=log(10),logp=log(5),
    logk=log(2)),
  method="Nelder-Mead")
exp(coef(m3))
coef(m1)
summary(m1)

## hmm.
## Redefine in terms of s instead of u and (more importantly)
## with f = seed density at r=0 rather the

cov2cor(vcov(m1)) ## shows that f and u are horribly correlated

newclark2Dt <- function(x , p, s=1, eps=1e-70) {
  d <- (1+(x/s)^2)
  r <- 1/d^(p+1)
  if (any(!is.finite(r))) browser()
  r
}

dnbinom_pen <- function(x,mu,size,pen=1000,log=TRUE) {
  mu <- rep(mu,length.out=length(x))
  ## elementwise '&' so that ifelse() tests every observation
  logval <- ifelse(mu==0 & x==0,pen*x^2,dnbinom(x,mu=mu,size=size,log=TRUE))
  if (log) logval else exp(logval)
}

## needed for predict()
snbinom_pen <- snbinom

m4 <- mle2(count~dnbinom(mu=f*newclark2Dt(dist,p,s),size=k),
  data=X,start=list(f=20,s=10,p=5,k=2),
  lower=rep(0.002,4),method="L-BFGS-B")

m5 <- mle2(count~dnbinom_pen(mu=f*newclark2Dt(dist,1/(pinv),s),size=exp(logk)),
  data=X,start=list(f=15,s=10,pinv=100,logk=1),trace=TRUE,
  ## control=list(parscale=c(200,0.002,1.66,3600)),
  lower=rep(0.002,4),method="L-BFGS-B")

with(X,plot(count~dist,pch=16,las=1,bty="l"))
newdat <- data.frame(dist=1:700) ## overkill but harmless
lines(newdat$dist,predict(m1,newdata=newdat))
lines(newdat$dist,predict(m5,newdata=newdat),col=2)

> but I am not able to fit anything.
> Do you have an idea?
> I guess there is something wrong in my formula for Clarks 2Dt
>
> Thank you for reading
>
> Ciao
> Ronald Bialozyt
>
>

From wwwhsd at gmail.com Mon Apr 4 16:43:34 2011
From: wwwhsd at gmail.com (Henrique Dallazuanna)
Date: Mon, 4 Apr 2011 11:43:34 -0300
Subject: [R] Creating multiple vector/list names-novice
In-Reply-To: <1301927063332-3425616.post@n4.nabble.com>
References: <1301927063332-3425616.post@n4.nabble.com>
Message-ID:

Try this:

lapply(2:3, FUN = combn, x = string, paste, collapse = '')

On Mon, Apr 4, 2011 at 11:24 AM, michalseneca wrote:
> Hi I have very simple issue as I am still new to the group of R
>
> I have basically
>
> vector of names for which i want to create mutliple combinations and then
> place them in different vectors.
In some other language I can just place a > third dimension to separate list (or matrix) but i do not know how to do it > in R. > > My issue is simple I use > cc<-combn(colnames(DD),2) > > I would need to have this as > > vector1 or like vector[,,1] : ? ? ? ? ? cc<-combn(colnames(DD),2) > vector2or like vector[,,2] ? ? ? ? ? ? ?cc<-combn(colnames(DD),3) > > etc..for up to k combinations > > something so then I can use for loop to go through the al of these > combinations > > example: > > string<-"a", "b" , "c" ",d" > > vector/list(1) ab ac ad bc bd be cd ce de > vector/list ?(2) abc abd abe bcd bce bde cde > > > Can you help me with this.. I know that it is a simple question for this > thread thank you.. > > Michal > > > -- > View this message in context: http://r.789695.n4.nabble.com/Creating-multiple-vector-list-names-novice-tp3425616p3425616.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Henrique Dallazuanna Curitiba-Paran?-Brasil 25? 25' 40" S 49? 16' 22" O From dwinsemius at comcast.net Mon Apr 4 16:47:02 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Mon, 4 Apr 2011 10:47:02 -0400 Subject: [R] lattice: how to "center" a subtitle? In-Reply-To: References: Message-ID: On Apr 4, 2011, at 7:39 AM, Marius Hofert wrote: > Dear expeRts, > > I recently asked for a real "centered" title (see, e.g., http://tolstoy.newcastle.edu.au/R/e13/help/11/01/0135.html) > . 
> A nice solution (from Deepayan Sarkar) is to use "xlab.top" instead > of "main": > > library(lattice) > trellis.device("pdf") > print(xyplot(0 ~ 0, xlab.top = "This title is now 'centered' for the > human's eye", sub = "but subtitles are not centered", scales = > list(alternating = c(1,1), tck = c(1,0)))) > dev.off() library(lattice) trellis.device("pdf") print(xyplot(0 ~ 0, xlab.top = "This title is now 'centered' for the human's eye", xlab = "but subtitles are _now_ centered", scales = list(alternating = c(1,1), tck = c(1,0)))) dev.off() (I realize that those are not really subtitles by a 'lab', but that appears acceptable in your current test case.) > > My question is whether there is something similar for *sub*titles > [so something like "xlab.bottom"]? As you can see from the plot, the > subtitle does not seem to be "centered" for the human's eye. I would > like to center it according to the x-axis label. > David Winsemius, MD West Hartford, CT From alison.callahan at gmail.com Mon Apr 4 16:56:10 2011 From: alison.callahan at gmail.com (Alison Callahan) Date: Mon, 4 Apr 2011 10:56:10 -0400 Subject: [R] simulating a VARXls model using dse Message-ID: Hello, Using the dse package I have estimated a VAR model using estVARXls(). I can perform forecasts using forecast() with no problems, but when I try to use simulate() with the same model, I get the following error: Error in diag(Cov, p) : 'nrow' or 'ncol' cannot be specified when 'x' is a matrix Can anyone shed some light on the meaning of this error? How can I simulate output using a model created with estVARXls()? 
Thanks, Alison From alison.callahan at gmail.com Mon Apr 4 17:16:46 2011 From: alison.callahan at gmail.com (Alison Callahan) Date: Mon, 4 Apr 2011 11:16:46 -0400 Subject: [R] simulating a VARXls model using dse In-Reply-To: <6441154A9FF1CD4386AF4ABF141A056D2153E1EC@WMEXOSCD2-N1.bocad.bank-banque-canada.ca> References: <6441154A9FF1CD4386AF4ABF141A056D2153E1EC@WMEXOSCD2-N1.bocad.bank-banque-canada.ca> Message-ID: Hi Paul, I am using R v. 2.12.2, and the "dse" package with build 2.12.2. I have attached some sample data to this email, and the R code I use to create the model and then forecast with it. Thanks, Alison On Mon, Apr 4, 2011 at 11:02 AM, Paul Gilbert wrote: > Could you please send me a reproducible example, and R and dse version info. > > Thanks, > Paul > >> -----Original Message----- >> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r- >> project.org] On Behalf Of Alison Callahan >> Sent: April 4, 2011 10:56 AM >> To: r-help at r-project.org >> Subject: [R] simulating a VARXls model using dse >> >> Hello, >> >> Using the dse package I have estimated a VAR model using estVARXls(). >> I can perform forecasts using forecast() with no problems, but when I >> try to use simulate() with the same model, I get the following error: >> >> Error in diag(Cov, p) : >> ? 'nrow' or 'ncol' cannot be specified when 'x' is a matrix >> >> Can anyone shed some light on the meaning of this error? How can I >> simulate output using a model created with estVARXls()? >> >> Thanks, >> >> Alison >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting- >> guide.html >> and provide commented, minimal, self-contained, reproducible code. > ==================================================================================== > > La version fran?aise suit le texte anglais. 
> > ------------------------------------------------------------------------------------ > > This email may contain privileged and/or confidential information, and the Bank of > Canada does not waive any related rights. Any distribution, use, or copying of this > email or the information it contains by other than the intended recipient is > unauthorized. If you received this email in error please delete it immediately from > your system and notify the sender promptly by email that you have done so. > > ------------------------------------------------------------------------------------ > > Le pr?sent courriel peut contenir de l'information privil?gi?e ou confidentielle. > La Banque du Canada ne renonce pas aux droits qui s'y rapportent. Toute diffusion, > utilisation ou copie de ce courriel ou des renseignements qu'il contient par une > personne autre que le ou les destinataires d?sign?s est interdite. Si vous recevez > ce courriel par erreur, veuillez le supprimer imm?diatement et envoyer sans d?lai ? > l'exp?diteur un message ?lectronique pour l'aviser que vous avez ?limin? de votre > ordinateur toute copie du courriel re?u. > From den.alpin at gmail.com Mon Apr 4 17:20:50 2011 From: den.alpin at gmail.com (Den Alpin) Date: Mon, 4 Apr 2011 17:20:50 +0200 Subject: [R] How to speed up grouping time series, help please In-Reply-To: References: Message-ID: I did some tests on Your and Gabor solutions, below my findings: - Your solution is fast as my solution in xts (below) but MUCH MORE READABLE, in particular I think your test should take into account xts creation from the data.frame (see below); - Gabor's solution with read.zoo is fast as xts but give an xts object that has some problems with time zones. Any better idea to speed up grouping time series? Thanks! 
Below are a few lines of code to test (I suggest growing the size of X to get better comparison results):

xtsSplit <- function(x)
{
  x <- xts(x[,c("ID","VALUE")], as.POSIXct(x[,"DATE"]))
  x <- do.call(merge, split(x$VALUE,x$ID))
  return(x)
}

xtsSplitTime <- replicate(100,
  system.time(xtsSplit(X))[[1]])
median(xtsSplitTime)

zooReadTime <- replicate(100,
  system.time(z <- read.zoo(X, split = 1, index = 2, tz = ""))[[1]])
median(zooReadTime)

And my (unreadable) solution:

library(xts)
library(plyr)  # provides daply()

buildXtsFromDataFrame <- function(x, env)
{
  {
    if(exists("xx", envir = env))
    {
      # xx already exists in env: bind the new series to it
      VALUE <- as.matrix(x$VALUE)
      colnames(VALUE) <- as.character(x$ID[1])
      assign("xx",
        cbind(get("xx", env),
          xts(VALUE, as.POSIXct(x$DATE, tz = "GMT",
            format = '%Y-%m-%d %H:%M:%S'), tzone = 'GMT')),
        envir = env)
    }
    else
    {
      # first series: create xx in env
      VALUE <- as.matrix(x$VALUE)
      colnames(VALUE) <- as.character(x$ID[1])
      assign("xx",
        xts(VALUE, as.POSIXct(x$DATE, tz = "GMT",
          format = '%Y-%m-%d %H:%M:%S'), tzone = 'GMT'),
        envir = env)
    }
    return(TRUE)
  }
}

xtsDaply <- function(...)
{
  e1 <- new.env(parent = baseenv())
  res <- daply(X, "ID", buildXtsFromDataFrame, env = e1)
  return(get("xx", e1))
}

Time04 <- replicate(100,
  system.time(xtsDaply(X, X$ID))[[1]])

2011/4/4 Joshua Ulrich :
> Hi Dan,
>
> On Mon, Apr 4, 2011 at 7:49 AM, Den Alpin wrote:
>> I retrieve for a few hundred times a group of time series (10-15 ts
>> with 10000 values each), on every group I do some calculation, graphs
>> etc. I wonder if there is a faster method than what is presented below to
>> get an appropriate timeseries object.
>>
>> Making a query with RODBC for every group I get a data frame like this:
>>
>>> X
>>    ID                DATE     VALUE
>> 14  3 2000-01-01 00:00:03 0.5726334
>> 4   1 2000-01-01 00:00:03 0.8830174
>> 1   1 2000-01-01 00:00:00 0.2875775
>> 15  3 2000-01-01 00:00:04 0.1029247
>> 11  3 2000-01-01 00:00:00 0.9568333
>> 9   2 2000-01-01 00:00:03 0.5514350
>> 7   2 2000-01-01 00:00:01 0.5281055
>> 6   2 2000-01-01 00:00:00 0.0455565
>> 12  3 2000-01-01 00:00:01 0.4533342
>> 8
2 2000-01-01 00:00:02 0.8924190 >> 3 ? 1 2000-01-01 00:00:02 0.4089769 >> 13 ?3 2000-01-01 00:00:02 0.6775706 >> >> And I want to get a timeSeries object or xts object like this: >> >> ? ? ? ? ? ? ? ? ? ? ? ? ? ?1 ? ? ? ? 2 ? ? ? ? 3 >> 2000-01-01 00:00:00 0.2875775 0.0455565 0.9568333 >> 2000-01-01 00:00:01 ? ? ? ?NA 0.5281055 0.4533342 >> 2000-01-01 00:00:02 0.4089769 0.8924190 0.6775706 >> 2000-01-01 00:00:03 0.8830174 0.5514350 0.5726334 >> 2000-01-01 00:00:04 ? ? ? ?NA ? ? ? ?NA 0.1029247 >> >> >> Input data can be sorted or unsorted (the most complicated case is in >> the example, unsorted and missing data) in the sense that I can sort >> in query if I can take an advantage from this. >> >> Some considerations: >> - Xts is generally faster than timeSeries >> - both accept a matrix so if I can create a matrix like the one >> represented above and an array of characters representing dates faster >> than what possible with xts:::cbind, for examole,I will have a faster >> implementation (package data.table ?). >> - create timeseries objects in multithread and then merge (package plyr ?) >> - faster merge algorithms? >> >> Below some code to generate the test case above: >> >> >> set.seed(123) >> N <- 5 # number of observations >> K <- 3 # number of timeseries ID >> >> X <- data.frame( >> ?ID = rep(1:K, each = N), >> ?DATE = as.character(rep(as.POSIXct("2000-01-01", tz = "GMT")+ 0:(N-1), K)), >> ?VALUE = runif(N*K), stringsAsFactors = FALSE) >> >> X <- X[sample(1:(N*K), N*K),] # sample observations to get random >> order (optional) >> X <- X[-(sample(1:nrow(X), floor(nrow(X)*0.2))),] # 20% missing >> >> head(X, 15) >> >> # use explicitly environments to avoid '<<-' >> buildTimeSeriesFromDataFrame <- function(x, env) >> { >> ?{ >> ? ?if(exists("xx", envir = env)) # if exist variable xx in env cbind >> ? ? ?assign("xx", >> ? ? ? ?cbind(get("xx", env), timeSeries(x$VALUE, x$DATE, >> ? ? ? ? ?format = '%Y-%m-%d %H:%M:%S', >> ? ? ? ? 
?zone = 'GMT', units = as.character(x$ID[1]))), >> ? ? ? ?envir = env) >> ? ?else ?# create xx in env >> ? ? ?assign("xx", >> ? ? ? ?timeSeries(x$VALUE, x$DATE, format = '%Y-%m-%d %H:%M:%S', >> ? ? ? ? ?zone = 'GMT', units = as.character(x$ID[1])), >> ? ? ? ?envir = env) >> >> ? ?return(TRUE) >> ?} >> } >> >> # use package plyr, faster than 'by' function >> tsDaply <- function(...) >> { >> ?library(plyr) >> ?e1 <- new.env(parent = baseenv()) #create a new env >> ?res <- daply(X, "ID", buildTimeSeriesFromDataFrame, >> ? ? ?env = e1) >> ?return(get("xx", e1)) # return xx from env >> } >> >> ##replicate 100 times >> #Time03 <- replicate(100, >> # ?system.time(tsDaply(X, X$ID))[[1]]) >> #median(Time03) >> >> # result >> tsDaply(X, X$ID) >> >> >> Thanks in advance for any input, best regards, >> Den >> >> > > Here's how I would do it with xts: > > x <- xts(X[,c("ID","VALUE")], as.POSIXct(X[,"DATE"])) > do.call(merge, split(x$VALUE,x$ID)) > > My xts solution compares favorably to your solution: >> Time03 <- replicate(100, > + ? system.time(tsDaply(X, X$ID))[[1]]) >> median(Time03) > [1] 0.02 >> xtsTime <- replicate(100, > + ? 
system.time(do.call(merge, split(x$VALUE,x$ID)))[[1]])
>> median(xtsTime)
> [1] 0
>
> Best,
> --
> Joshua Ulrich  |  FOSS Trading: www.fosstrading.com
>

From dwinsemius at comcast.net Mon Apr 4 17:22:33 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Mon, 4 Apr 2011 11:22:33 -0400
Subject: [R] Deriving formula with deriv
In-Reply-To:
References:
Message-ID: <856A8A52-20F6-45D3-A4F4-59484D23557A@comcast.net>

On Apr 4, 2011, at 6:35 AM, kitty wrote:

> Dear list,
>
> Hi,
>
> I am trying to get the second derivative of a logistic formula, in R
> summary
> the model is given as :
>
> ###
>> $nls
>> Nonlinear regression model
>>   model:  data ~ logistic(time, A, mu, lambda, addpar)
>>    data:  parent.frame()
>>        A      mu  lambda
>>  0.53243 0.03741 6.94296
> ###
>
> but I know the formula used is
>
> # y~'A'/(1+exp((4*'mu'/'A')*('lambda'-'time'))+2)) # from the
> grofit (
> package I am using to fit the model) documentation.
>
> I have attempted to use the R function 'deriv' to get the
> first derivative from which I can then reuse the deriv function to
> get the
> second derivative.... unfortunately this does not seem to work
>
> ###
>> express<-expression(y~'A'/(1+exp((4*'mu'/'A')*('lambda'-'time'))+2))
>> express
> expression(y ~ "A"/(1 + exp((4 * "mu"/"A") * ("lambda" - "time")) +
> 2))
>>
>> d1<-deriv(express)
> Error in deriv.default(express) : element 2 is empty;
>   the part of the args list of '.Internal' being evaluated was:
>   (expr, namevec, function.arg, tag, hessian)
> ####

For one thing you are not specifying what variable you want to differentiate with-respect-to: Assuming this to be `A` then:

express <- expression( A/(1+exp((4*mu/A)*(lambda-time))+ 2))
# The quotes looked "wrong" inside an expression, so I removed them
d1<-deriv(express, "A")
# but the diff w.r.t variable needs to be quoted.
d1
expression({
    .expr1 <- 4 * mu
    .expr3 <- lambda - time
    .expr5 <- exp(.expr1/A * .expr3)
    .expr7 <- 1 + .expr5 + 2
    .value <- A/.expr7
    .grad <- array(0, c(length(.value), 1L), list(NULL, c("A")))
    .grad[, "A"] <- 1/.expr7 + A * (.expr5 * (.expr1/A^2 * .expr3))/.expr7^2
    attr(.value, "gradient") <- .grad
    .value
})

All this should have been clear if you had looked at the examples in help(deriv).

>
> Why is this not working and how do I get the second derivative?

That, too, is clearly exemplified in the help page.

>
> Thank you for reading my post, all help is appreciated,
> Kitty

-- 
David Winsemius, MD
West Hartford, CT

From dieter.menne at menne-biomed.de Mon Apr 4 17:34:04 2011
From: dieter.menne at menne-biomed.de (Dieter Menne)
Date: Mon, 4 Apr 2011 10:34:04 -0500 (CDT)
Subject: [R] D'Agostino test
In-Reply-To: <1301907377701-3424952.post@n4.nabble.com>
References: <1301907377701-3424952.post@n4.nabble.com>
Message-ID: <1301931244416-3425833.post@n4.nabble.com>

Juraj17 wrote:
>
> Do I have to write my own, or it exists yet? How name has it, or how can I
> use it.
>

Try the R-function search. It returns the function you are looking for as the first match.

Dieter

--
View this message in context: http://r.789695.n4.nabble.com/D-Agostino-test-tp3424952p3425833.html
Sent from the R help mailing list archive at Nabble.com.
From ggrothendieck at gmail.com Mon Apr 4 17:38:07 2011
From: ggrothendieck at gmail.com (Gabor Grothendieck)
Date: Mon, 4 Apr 2011 11:38:07 -0400
Subject: [R] How to speed up grouping time series, help please
In-Reply-To:
References:
Message-ID:

On Mon, Apr 4, 2011 at 11:20 AM, Den Alpin wrote:
> I did some tests on Your and Gabor solutions, below my findings:
>
> - Your solution is fast as my solution in xts (below) but MUCH MORE
> READABLE, in particular I think your test should take into account xts
> creation from the data.frame (see below);
> - Gabor's solution with read.zoo is fast as xts but give an xts object
> that has some problems with time zones.

read.zoo gives a zoo object, not an xts object. If you want an xts object, try as.xts(z). If you were expecting an xts object, that may be the source of your other problems.

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

From dieter.menne at menne-biomed.de Mon Apr 4 17:41:30 2011
From: dieter.menne at menne-biomed.de (Dieter Menne)
Date: Mon, 4 Apr 2011 10:41:30 -0500 (CDT)
Subject: [R] replace last 3 characters of string
In-Reply-To: <000001cbf258$637e5a60$2a7b0f20$@jacobs@figurestofacts.be>
References: <000001cbf258$637e5a60$2a7b0f20$@jacobs@figurestofacts.be>
Message-ID: <1301931690007-3425855.post@n4.nabble.com>

Bert Jacobs-2 wrote:
>
> I would like to replace the last three characters of the values of a
> certain
> column in a dataframe.
>
>

Besides the standard method already mentioned: I found the subset of string operations in Hadley Wickham's stringr package helpful. They have a much more consistent interface than the standard functions.

str_sub: accepts negative positions, which are calculated from the left of the last character.

Dieter

--
View this message in context: http://r.789695.n4.nabble.com/replace-last-3-characters-of-string-tp3424354p3425855.html
Sent from the R help mailing list archive at Nabble.com.
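
A minimal sketch of the two approaches mentioned in the message above, for replacing the last three characters of every value in a data frame column. The data frame `df`, its column `code`, and the replacement string "XYZ" are made up for illustration and do not come from the original post. The stringr part assumes that package is installed and is skipped otherwise.

```r
# Hypothetical sample data, not from the original thread.
df <- data.frame(code = c("alpha123", "beta456"), stringsAsFactors = FALSE)

# Base R: the regular expression ".{3}$" matches exactly the last three
# characters of each string; sub() replaces that match.
base_result <- sub(".{3}$", "XYZ", df$code)
print(base_result)

# stringr (assumed to be installed): negative positions count from the end
# of the string, so -3 to -1 selects the last three characters.
if (requireNamespace("stringr", quietly = TRUE)) {
  y <- df$code
  stringr::str_sub(y, -3, -1) <- "XYZ"
  print(y)
}
```

With base R the result can be assigned straight back to the column, e.g. df$code <- base_result. Strings shorter than three characters are left unchanged by the sub() pattern, which may or may not be what is wanted.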
From andrew.decker.steen at gmail.com Mon Apr 4 16:47:52 2011 From: andrew.decker.steen at gmail.com (Andrew D. Steen) Date: Mon, 4 Apr 2011 16:47:52 +0200 Subject: [R] gap.barplot doesn't support data arrays? In-Reply-To: <4D99D6A2.7070604@ucalgary.ca> References: <4d99ca0f.cc7e0e0a.0b7e.ffffa600@mx.google.com> <4D99D6A2.7070604@ucalgary.ca> Message-ID: <4d99da12.4d790e0a.646a.ffffab94@mx.google.com> True - gapped stacked bar plots make no sense at all. I'm working my way up to a gapped bar plot with series next to each other (and error bars!), what you'd get if you put a gap in the y-axis of > twogrp2<-array(twogrp, dim=c(2,5)) > barplot(twogrp2, beside=TRUE) I'm guessing I can do this if I spend enough time with the axis.break() function. But is there an easier way? --Drew > -----Original Message----- > From: Peter Ehlers [mailto:ehlers at ucalgary.ca] > Sent: Monday, April 04, 2011 4:33 PM > To: Andrew D. Steen > Cc: r-help at r-project.org > Subject: Re: [R] gap.barplot doesn't support data arrays? > > On 2011-04-04 06:39, Andrew D. Steen wrote: > > I am trying to make a barplot with a broken axis using gap.barplot > (in the > > indispensable plotrix package). This works well when the data is a > vector: > > > >> twogrp<-c(rnorm(10)+4,rnorm(10)+20) > >> > gap.barplot(twogrp,gap=c(8,16),xlab="Index",ytics=c(3,6,17,20),ylab="Gr > oup > > values",main="Barplot with gap") > > > > But when the data is an array (for a bar plot with multiple series) I > get an > > error and a strange plot with no y-tics and bars stretching > downwards, as if > > all the values were negative: > > > >> twogrp2<-array(twogrp, dim=c(2,5)) > >> > > > gap.barplot(twogrp2,gap=c(8,16),xlab="Index",ytics=c(3,6,17,20),ylab="G > roup > > values",main="Barplot with gap") > > > > Error in rect(xtics[bigones] - halfwidth, botgap, xtics[bigones] + > > halfwidth, : > > cannot mix zero-length and non-zero-length coordinates > > > > However, the main title and axis labels do appear correctly. 
> > > > Are data arrays unsupported for gap.barplot, or am I missing > something? > > Looks like they're not supported, as you could easily see from > the code. But do gapped stacked barplots even make sense? Not > to me. > > Still, I think that the plotrix documentation is somewhat > spotty. The help page for gap.barplot indicates that the > input 'y' should be 'data values'; not overly informative. > > Peter Ehlers > > > > > Thanks, > > Drew Steen > > > > ______________________________________________ > > R-help at r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide http://www.R-project.org/posting- > guide.html > > and provide commented, minimal, self-contained, reproducible code. From rosyara at msu.edu Mon Apr 4 16:41:34 2011 From: rosyara at msu.edu (Umesh Rosyara) Date: Mon, 4 Apr 2011 10:41:34 -0400 Subject: [R] merging data list in to single data frame Message-ID: <002701cbf2d6$65405870$2fc10950$@edu> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From january.weiner at mpiib-berlin.mpg.de Mon Apr 4 17:02:28 2011 From: january.weiner at mpiib-berlin.mpg.de (January Weiner) Date: Mon, 4 Apr 2011 17:02:28 +0200 Subject: [R] Adjusting p values of a matrix Message-ID: Dear all, I have an n x n matrix of p-values. The matrix is symmetrical, as it describes the "each against each" p values of correlation coefficients. How can I best correct the p values of the matrix? Notably, the total number of the tests performed is n(n-1)/2, since I do not test the correlation of each variable with itself. That means, I only want to correct one half of the matrix, not including the diagonal. Therefore, simply writing pmat <- p.adjust( pmat, method= "fdr" ) # where pmat is an n x n matrix ...doesn't cut it. Of course, I can turn the matrix in to a three column data frame with n(n-1)/2 rows, but that is slow and not elegant. regards, j. -- -------- Dr. 
January Weiner 3 -------------------------------------- Max Planck Institute for Infection Biology Charit?platz 1 D-10117 Berlin, Germany Web?? : www.mpiib-berlin.mpg.de Tel? ?? : +49-30-28460514 From padmanabhan.vijayan at gmail.com Mon Apr 4 17:03:24 2011 From: padmanabhan.vijayan at gmail.com (Vijayan Padmanabhan) Date: Mon, 4 Apr 2011 20:33:24 +0530 Subject: [R] help In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From michalseneca at gmail.com Mon Apr 4 17:09:32 2011 From: michalseneca at gmail.com (michalseneca) Date: Mon, 4 Apr 2011 10:09:32 -0500 (CDT) Subject: [R] Creating multiple vector/list names-novice In-Reply-To: References: <1301927063332-3425616.post@n4.nabble.com> Message-ID: <1301929772084-3425759.post@n4.nabble.com> Hi Thanks ,however I would need something different still... I would need to return a vector so if as to choose cc[[3]] [2] would return vector/list as in terms of c(b,d,e) Thanks -- View this message in context: http://r.789695.n4.nabble.com/Creating-multiple-vector-list-names-novice-tp3425616p3425759.html Sent from the R help mailing list archive at Nabble.com. From aurelien.chateigner at googlemail.com Mon Apr 4 17:35:52 2011 From: aurelien.chateigner at googlemail.com (=?iso-8859-1?Q?Aur=E9lien_Chateigner?=) Date: Mon, 4 Apr 2011 17:35:52 +0200 Subject: [R] Multithreading of Geneland Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From gunter.berton at gene.com Mon Apr 4 17:54:32 2011 From: gunter.berton at gene.com (Bert Gunter) Date: Mon, 4 Apr 2011 08:54:32 -0700 Subject: [R] Adjusting p values of a matrix In-Reply-To: References: Message-ID: 1. This is not an R question, AFAICS. 2. Sounds like a research topic. I don't think there's a meaningful simple answer. I suspect it strongly depends on the model and context. 
-- Bert On Mon, Apr 4, 2011 at 8:02 AM, January Weiner wrote: > Dear all, > > I have an n x n matrix of p-values. The matrix is symmetrical, as it > describes the "each against each" p values of correlation > coefficients. > > How can I best correct the p values of the matrix? Notably, the total > number of the tests performed is n(n-1)/2, since I do not test the > correlation of each variable with itself. That means, I only want to > correct one half of the matrix, not including the diagonal. Therefore, > simply writing > > pmat <- p.adjust( pmat, method= "fdr" ) > # where pmat is an n x n matrix > > ...doesn't cut it. > > Of course, I can turn the matrix in to a three column data frame with > n(n-1)/2 rows, but that is slow and not elegant. > > regards, > j. > > -- > -------- Dr. January Weiner 3 -------------------------------------- > Max Planck Institute for Infection Biology > Charit?platz 1 > D-10117 Berlin, Germany > Web?? : www.mpiib-berlin.mpg.de > Tel? ?? : +49-30-28460514 > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- "Men by nature long to get on to the ultimate truths, and will often be impatient with elementary studies or fight shy of them. If it were possible to reach the ultimate truths without the elementary studies usually prefixed to them, these would not be preparatory studies but superfluous diversions." -- Maimonides (1135-1204) Bert Gunter Genentech Nonclinical Biostatistics From Greg.Snow at imail.org Mon Apr 4 18:02:46 2011 From: Greg.Snow at imail.org (Greg Snow) Date: Mon, 4 Apr 2011 10:02:46 -0600 Subject: [R] power of 2 way ANOVA with interaction In-Reply-To: References: Message-ID: You can use simulation: 1. decide what you think your data will look like 2. 
decide how you plan to analyze your data 3. write a function that simulates a dataset (common arguments include sample size(s) and effect sizes) then analyzes the data in your planned manner and returns the p-value(s) or other statistic(s) of interest 4. run the function from 3 a bunch (1000 or more) times, the replicate function is useful for this (progress bars can also be useful) 5. the proportion of times that the results are significant is your estimate of power -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.snow at imail.org 801.408.8111 > -----Original Message----- > From: r-help-bounces at r-project.org [mailto:r-help-bounces at r- > project.org] On Behalf Of Timothy Spier > Sent: Sunday, April 03, 2011 5:11 PM > To: R Help > Subject: [R] power of 2 way ANOVA with interaction > > > I've been searching for an answer to this for a while but no joy. I > have a simple 2-way ANOVA with an interaction. I'd like to determine > the power of this test for each factor (factor A, factor B, and the A*B > interaction). How can I do this in R? I used to do this with "proc > Glmpower" in SAS, but I can find no analogue in R. > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting- > guide.html > and provide commented, minimal, self-contained, reproducible code. From PDalgd at gmail.com Mon Apr 4 18:03:40 2011 From: PDalgd at gmail.com (peter dalgaard) Date: Mon, 4 Apr 2011 18:03:40 +0200 Subject: [R] Adjusting p values of a matrix In-Reply-To: References: Message-ID: On Apr 4, 2011, at 17:02 , January Weiner wrote: > Dear all, > > I have an n x n matrix of p-values. The matrix is symmetrical, as it > describes the "each against each" p values of correlation > coefficients. > > How can I best correct the p values of the matrix? 
Notably, the total
> number of the tests performed is n(n-1)/2, since I do not test the
> correlation of each variable with itself. That means, I only want to
> correct one half of the matrix, not including the diagonal. Therefore,
> simply writing
>
> pmat <- p.adjust( pmat, method= "fdr" )
> # where pmat is an n x n matrix
>
> ...doesn't cut it.
>
> Of course, I can turn the matrix in to a three column data frame with
> n(n-1)/2 rows, but that is slow and not elegant.

I don't think there's a really elegant way (have a look inside pairwise.table if you care). If you start one step further back, you could just use pairwise.table with a suitably defined comparison function.

Otherwise, how about

ltri <- lower.tri(pmat)
utri <- upper.tri(pmat)
pmat[ltri] <- p.adjust(pmat[ltri], method = "fdr")
pmat[utri] <- t(pmat)[utri]

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com

From spencer.graves at structuremonitoring.com Mon Apr 4 18:08:20 2011
From: spencer.graves at structuremonitoring.com (Spencer Graves)
Date: Mon, 4 Apr 2011 09:08:20 -0700
Subject: [R] Adjusting p values of a matrix
In-Reply-To:
References:
Message-ID: <4D99ECF4.9030501@structuremonitoring.com>

There are also the multcomp and multcompView packages that might provide something of interest in this regard. "multcomp" has a companion book, "Multiple Comparisons Using R" (Bretz, Hothorn, Westfall, 2010, CRC Press), which I believe provides an excellent overview of the state of the art in multiple comparisons.

The simple rule is Bonferroni, which involves multiplying the p-values by n or n(n-1)/2.

Note, also, that one of the most important innovations in statistical methods of the past quarter century is the development of "false discovery rate", which estimates the false alarm rate among the cases that the user actually sees, which is a mixture of true and false hypotheses.
The p value, by contrast, is the probability of a decision error only among hypotheses that are true. For more info, see the Wikipedia entries on Bonferroni or false discovery rate -- or the book by Bretz, Hothorn and Westfall or the vignettes accompanying the multcomp package. Hope this helps. Spencer On 4/4/2011 8:54 AM, Bert Gunter wrote: > 1. This is not an R question, AFAICS. > > 2. Sounds like a research topic. I don't think there's a meaningful > simple answer. I suspect it strongly depends on the model and context. > > -- Bert > > On Mon, Apr 4, 2011 at 8:02 AM, January Weiner > wrote: >> Dear all, >> >> I have an n x n matrix of p-values. The matrix is symmetrical, as it >> describes the "each against each" p values of correlation >> coefficients. >> >> How can I best correct the p values of the matrix? Notably, the total >> number of the tests performed is n(n-1)/2, since I do not test the >> correlation of each variable with itself. That means, I only want to >> correct one half of the matrix, not including the diagonal. Therefore, >> simply writing >> >> pmat<- p.adjust( pmat, method= "fdr" ) >> # where pmat is an n x n matrix >> >> ...doesn't cut it. >> >> Of course, I can turn the matrix in to a three column data frame with >> n(n-1)/2 rows, but that is slow and not elegant. >> >> regards, >> j. >> >> -- >> -------- Dr. January Weiner 3 -------------------------------------- >> Max Planck Institute for Infection Biology >> Charit?platz 1 >> D-10117 Berlin, Germany >> Web : www.mpiib-berlin.mpg.de >> Tel : +49-30-28460514 >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > -- Spencer Graves, PE, PhD President and Chief Operating Officer Structure Inspection and Monitoring, Inc. 751 Emerson Ct. 
San José, CA 95126
ph: 408-655-4567

From ligges at statistik.tu-dortmund.de Mon Apr 4 18:12:16 2011
From: ligges at statistik.tu-dortmund.de (Uwe Ligges)
Date: Mon, 04 Apr 2011 18:12:16 +0200
Subject: [R] add zero in front of numbers
In-Reply-To: <7785EFE1CC2D9F4A96F6A415BC1C2BD0168DD4C1@PC6-46.pogb.cancer.ucl.ac.uk>
References: <7785EFE1CC2D9F4A96F6A415BC1C2BD0168DD4C1@PC6-46.pogb.cancer.ucl.ac.uk>
Message-ID: <4D99EDE0.6030006@statistik.tu-dortmund.de>

On 04.04.2011 12:35, Yan Jiao wrote:
> Dear R users,
>
> I need to add 0 in front of a series of numbers, e.g. 1->001, 19->019,
> Is there a fast way of doing that?

formatC(c(1, 19), flag="0", width=3)

Uwe Ligges

> Many thanks
>
> yan
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

From ligges at statistik.tu-dortmund.de Mon Apr 4 18:21:52 2011
From: ligges at statistik.tu-dortmund.de (Uwe Ligges)
Date: Mon, 04 Apr 2011 18:21:52 +0200
Subject: [R] merging data list in to single data frame
In-Reply-To: <002701cbf2d6$65405870$2fc10950$@edu>
References: <002701cbf2d6$65405870$2fc10950$@edu>
Message-ID: <4D99F020.8020508@statistik.tu-dortmund.de>

On 04.04.2011 16:41, Umesh Rosyara wrote:
> Dear R community members
>
> I did find a good way to merge my 200 text data files in to a single data
> file with one column added will show indicator for that file.
>
> filelist = list.files(pattern = "K*cd.txt")

I doubt you meant "K*cd.txt" but "^K[[:digit:]]*cd\\.txt$".
# the file names are K1cd.txt > .................to K200cd.txt > > data_list<-lapply(filelist, read.table, header=T, comment=";", fill=T) Replace by: data_list <- lapply(filelist, function(x) cbind(Filename = x, read.table(x, header=TRUE, comment.char=";", fill=TRUE))) And then: result <- do.call("rbind", data_list) Uwe Ligges > > > > This will create list, but this is not what I want. > > > > I want a single dataframe (all separate dataframes have same variable > headings) with additional row for example > > > > ; just for example, two small datasets are created by my component datasets > are huge, need automation > > ;read from file K1cd.txt > > var1 var2 var3 var4 > > 1 6 0.3 8 > > 3 4 0.4 9 > > 2 3 0.4 6 > > 1 0.4 0.9 3 > > > > ;read from file K2cd.txt > > var1 var2 var3 var4 > > 1 16 0.6 7 > > 3 14 0.4 6 > > 2 1 3 0.4 5 > > 1 0.6 0.9 2 > > > > the output dataframe should look like > > > > Fileno var1 var2 var3 var4 > > 1 1 6 0.3 8 > > 1 3 4 0.4 9 > > 1 2 3 0.4 6 > > 1 1 0.4 0.9 3 > > 2 1 16 0.6 7 > > 2 3 14 0.4 6 > > 2 2 1 3 0.4 5 > > 2 1 0.6 0.9 2 > > > > Please note that new file no column is added > > > > Thank you for the help. > > > > Umesh R > > > > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From puetz at mpipsykl.mpg.de Mon Apr 4 18:21:52 2011 From: puetz at mpipsykl.mpg.de (=?iso-8859-1?Q?Benno_P=FCtz?=) Date: Mon, 4 Apr 2011 18:21:52 +0200 Subject: [R] Adjusting p values of a matrix In-Reply-To: References: Message-ID: <8BC2E7AD-5B2F-4861-837A-A67932E1CD58@mpipsykl.mpg.de> How about as.matrix(p.adjust(as.dist(pmat))) Benno On 4.Apr.2011, at 17:02, January Weiner wrote: > Dear all, > > I have an n x n matrix of p-values.
The matrix is symmetrical, as it > describes the "each against each" p values of correlation > coefficients. > > How can I best correct the p values of the matrix? Notably, the total > number of the tests performed is n(n-1)/2, since I do not test the > correlation of each variable with itself. That means, I only want to > correct one half of the matrix, not including the diagonal. Therefore, > simply writing > > pmat <- p.adjust( pmat, method= "fdr" ) > # where pmat is an n x n matrix > > ...doesn't cut it. > > Of course, I can turn the matrix in to a three column data frame with > n(n-1)/2 rows, but that is slow and not elegant. > > regards, > j. > > -- > -------- Dr. January Weiner 3 -------------------------------------- > Max Planck Institute for Infection Biology > Charitéplatz 1 > D-10117 Berlin, Germany > Web : www.mpiib-berlin.mpg.de > Tel : +49-30-28460514 > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From josh.m.ulrich at gmail.com Mon Apr 4 18:35:53 2011 From: josh.m.ulrich at gmail.com (Joshua Ulrich) Date: Mon, 4 Apr 2011 11:35:53 -0500 Subject: [R] 'RQuantLib for 2.12 version In-Reply-To: <4d91fced.d27bdc0a.4f7b.48bb@mx.google.com> References: <4d91fced.d27bdc0a.4f7b.48bb@mx.google.com> Message-ID: Hi Mauricio, A Windows binary is now available on CRAN: http://dirk.eddelbuettel.com/blog/2011/04/04/#rquantlib_0.3.7 Best, -- Joshua Ulrich | FOSS Trading: www.fosstrading.com On Tue, Mar 29, 2011 at 10:38 AM, Mauricio Romero wrote: > Dear R users, > > I have been trying to use RQuantLib in the 2.12.2 R version.
I downloaded > the .zip file from > > http://sourceforge.net/projects/quantlib/files/QuantLib/ which didn't work, > so I > > I downloaded the package source from > > http://sourceforge.net/projects/quantlib/files/QuantLib/1.0.1/ > > > > Then I run: > > install.packages("RQuantLib_0.3.6.tar.gz", type="source", repos=NULL) > > and get the following error: > > > > Installing package(s) into 'C:\Users\Mauricio\Documents/R/win-library/2.12' > > (as 'lib' is unspecified) > > * installing *source* package 'RQuantLib' ... > > > > ********************************************** > > WARNING: this package has a configure script > > It probably needs manual configuration > > ********************************************** > > > > > > ** libs > > > > *** arch - i386 > > cygwin warning: > > MS-DOS style path detected: C:/PROGRA~1/R/R-212~1.2/etc/i386/Makeconf > > Preferred POSIX equivalent is: > /cygdrive/c/PROGRA~1/R/R-212~1.2/etc/i386/Makeconf > > CYGWIN environment variable option "nodosfilewarning" turns off this > warning. > > Consult the user's guide for more details about POSIX paths: > > http://cygwin.com/cygwin-ug-net/using.html#using-pathnames > > g++ -I"C:/PROGRA~1/R/R-212~1.2/include" > -I"C:/Users/Mauricio/Documents/R/win-library/2.12/Rcpp/include" -I -I. > -O2 -Wall -c asian.cpp -o asian.o > > asian.cpp:26:23: fatal error: rquantlib.h: No such file or directory > > compilation terminated. > > make: *** [asian.o] Error 1 > > ERROR: compilation failed for package 'RQuantLib' > > * removing 'C:/Users/Mauricio/Documents/R/win-library/2.12/RQuantLib' > > Warning messages: > > 1: running command 'C:\PROGRA~1\R\R-212~1.2/bin/i386/R CMD INSTALL -l > "C:\Users\Mauricio\Documents/R/win-library/2.12" "RQuantLib_0.3.6.tar.gz"' > had status 1 > > 2: In install.packages("RQuantLib_0.3.6.tar.gz", type = "source", repos = > NULL) : > > installation of package 'RQuantLib_0.3.6.tar.gz' had non-zero exit status > > > > > > Any ideas?
> > > > Thanks, > > > > Mauricio Romero > > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From m_hofert at web.de Mon Apr 4 18:45:31 2011 From: m_hofert at web.de (Marius Hofert) Date: Mon, 4 Apr 2011 18:45:31 +0200 Subject: [R] lattice: how to "center" a subtitle? In-Reply-To: References: Message-ID: Dear David, I intended to use another x-label. But your suggestion brings me to the idea of just using a two-line xlab, so s.th. like print(xyplot(0 ~ 0, xlab.top = "This title is now 'centered' for the human's eye", xlab = "but subtitles are _now_ centered\nbla", scales = list(alternating = c(1,1), tck = c(1,0)))) Thanks! Cheers, Marius On 2011-04-04, at 16:47 , David Winsemius wrote: > > On Apr 4, 2011, at 7:39 AM, Marius Hofert wrote: > >> Dear expeRts, >> >> I recently asked for a real "centered" title (see, e.g., http://tolstoy.newcastle.edu.au/R/e13/help/11/01/0135.html). >> A nice solution (from Deepayan Sarkar) is to use "xlab.top" instead of "main": >> >> library(lattice) >> trellis.device("pdf") >> print(xyplot(0 ~ 0, xlab.top = "This title is now 'centered' for the human's eye", sub = "but subtitles are not centered", scales = list(alternating = c(1,1), tck = c(1,0)))) >> dev.off() > > library(lattice) > trellis.device("pdf") > print(xyplot(0 ~ 0, xlab.top = "This title is now 'centered' for the human's eye", xlab = "but subtitles are _now_ centered", scales = list(alternating = c(1,1), tck = c(1,0)))) > dev.off() > > > (I realize that those are not really subtitles by a 'lab', but that appears acceptable in your current test case.) > >> >> My question is whether there is something similar for *sub*titles [so something like "xlab.bottom"]?
As you can see from the plot, the subtitle does not seem to be "centered" for the human's eye. I would like to center it according to the x-axis label. >> > > > David Winsemius, MD > West Hartford, CT > From juliet.hannah at gmail.com Mon Apr 4 18:46:36 2011 From: juliet.hannah at gmail.com (Juliet Hannah) Date: Mon, 4 Apr 2011 12:46:36 -0400 Subject: [R] converting affybatch object to matrix In-Reply-To: References: Message-ID: Use exprs on the output from RMA (or another method you like) library("affy") myData <-ReadAffy() myRMA <- rma(myData) e = exprs(myRMA) Also, check out the Bioconductor mailing list where Bioconductor-related topics are discussed. On Fri, Apr 1, 2011 at 9:54 AM, Landes, Ezekiel wrote: > I have an Affybatch object called "batch" : > >> >> batch > AffyBatch object > size of arrays=1050x1050 features (196 kb) > cdf=HuGene-1_0-st-v1 (32321 affyids) > number of samples=384 > number of genes=32321 > annotation=hugene10stv1 > notes= >> >> > > Is there a way of converting a portion of this data into a matrix? More specifically, a matrix where the 384 samples are columns and the 32321 genes are rows? The "exprs" function returns a matrix that has 384 columns but for some reason there are 1050^2 rows. > > Thanks! > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From spencer.graves at prodsyse.com Mon Apr 4 18:06:29 2011 From: spencer.graves at prodsyse.com (Spencer Graves) Date: Mon, 4 Apr 2011 09:06:29 -0700 Subject: [R] Adjusting p values of a matrix In-Reply-To: References: Message-ID: <4D99EC85.7050103@prodsyse.com> There are, however, the multcomp and multcompView packages that might provide something of interest in this regard. 
"multcomp" has a companion book, "Multiple Comparisons Using R" (Bretz, Hothorn, Westfall, 2010, CRC Press), which I believe provides an excellent overview of the state of the art in multiple comparisons. The simple rule is Bonferroni, which involves multiplying the p-values by n or n(n-1)/2. Note, also, that one of the most important innovations in statistical methods of the past quarter century is the development of "false discovery rate", which estimates the false alarm rate among the cases that the user actually sees, which is a mixture of true and false hypotheses. The p value, by contrast, is the probability of a decision error only among hypotheses that are true. For more info, see the Wikipedia entries on Bonferroni or false discovery rate -- or the book by Bretz, Hothorn and Westfall or the vignettes accompanying the multcomp package. Hope this helps. Spencer On 4/4/2011 8:54 AM, Bert Gunter wrote: > 1. This is not an R question, AFAICS. > > 2. Sounds like a research topic. I don't think there's a meaningful > simple answer. I suspect it strongly depends on the model and context. > > -- Bert > > On Mon, Apr 4, 2011 at 8:02 AM, January Weiner > wrote: >> Dear all, >> >> I have an n x n matrix of p-values. The matrix is symmetrical, as it >> describes the "each against each" p values of correlation >> coefficients. >> >> How can I best correct the p values of the matrix? Notably, the total >> number of the tests performed is n(n-1)/2, since I do not test the >> correlation of each variable with itself. That means, I only want to >> correct one half of the matrix, not including the diagonal. Therefore, >> simply writing >> >> pmat<- p.adjust( pmat, method= "fdr" ) >> # where pmat is an n x n matrix >> >> ...doesn't cut it. >> >> Of course, I can turn the matrix in to a three column data frame with >> n(n-1)/2 rows, but that is slow and not elegant. >> >> regards, >> j. >> >> -- >> -------- Dr. 
January Weiner 3 -------------------------------------- >> Max Planck Institute for Infection Biology >> Charitéplatz 1 >> D-10117 Berlin, Germany >> Web : www.mpiib-berlin.mpg.de >> Tel : +49-30-28460514 >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > -- Spencer Graves, PE, PhD President and Chief Operating Officer Structure Inspection and Monitoring, Inc. 751 Emerson Ct. San José, CA 95126 ph: 408-655-4567 From roland.taber at gmx.at Mon Apr 4 18:35:04 2011 From: roland.taber at gmx.at (user84) Date: Mon, 4 Apr 2011 11:35:04 -0500 (CDT) Subject: [R] r-squared for object timeseries Message-ID: <1301934904761-3425984.post@n4.nabble.com> Hi, I am new in this forum. I hope someone can help me, or correct me if this is the wrong "subforum" to write this in. I have to choose the "best" ARIMA model from different possibilities for a time series. I know the AIC, BIC and similar. But now I would like to check the value called r-squared or adjusted r-squared in a linear model (in German called: Bestimmtheitsmaß). How can I get it in R? The function "summary" is not available for the type "timeseries". If there is a routine or function that gets it, perfect; if not, I am able to calculate this value for linear models. Is the equivalent thing for time series given if I simply change the variable y from a linear model by the time? Thanks for any help! -- View this message in context: http://r.789695.n4.nabble.com/r-squared-for-object-timeseries-tp3425984p3425984.html Sent from the R help mailing list archive at Nabble.com.
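[Editorial sketch, not part of the thread.] One common way to get an R-squared-like figure for an ARIMA fit is to compare the residual sum of squares of the fitted model with the total sum of squares of the series around its mean. The following is a minimal sketch under assumptions: the simulated series `y`, the AR(1) order and the names `r2` and `adj_r2` are illustrative only, and for a model with differencing the comparison is arguably better made on the differenced series.

```r
# Pseudo R-squared for an ARIMA fit (illustrative; arima() itself reports no R-squared)
set.seed(1)
y   <- arima.sim(model = list(ar = 0.7), n = 200)  # simulated example series (assumption)
fit <- arima(y, order = c(1, 0, 0))                # AR(1) with mean, from package stats

res <- residuals(fit)                              # in-sample one-step-ahead residuals
r2  <- 1 - sum(res^2) / sum((y - mean(y))^2)       # 1 - RSS / TSS

# Adjusted version, penalising the number of ARMA coefficients (intercept excluded)
n      <- length(y)
k      <- sum(names(coef(fit)) != "intercept")
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - k - 1)

c(r2 = r2, adj_r2 = adj_r2)
```

Such a figure is only a descriptive complement; AIC and BIC, which the poster already uses, remain the usual criteria for choosing between ARIMA candidates.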
From rosyara at msu.edu Mon Apr 4 18:37:07 2011 From: rosyara at msu.edu (Umesh Rosyara) Date: Mon, 4 Apr 2011 12:37:07 -0400 Subject: [R] Questions remaining: define any character as na.string RE: merging data list in to single data frame In-Reply-To: <4D99F020.8020508@statistik.tu-dortmund.de> References: <002701cbf2d6$65405870$2fc10950$@edu> <4D99F020.8020508@statistik.tu-dortmund.de> Message-ID: <004701cbf2e6$89e09db0$9da1d910$@edu> Dear Uwe and R community members Thank you Uwe for the help. I have still a question remaining, I am trying to find answer from long time. While exporting my data, I have some characters mixed into it. I want to define any characters as na.string? Is it possible to do so? Thanks; Umesh -----Original Message----- From: Uwe Ligges [mailto:ligges at statistik.tu-dortmund.de] Sent: Monday, April 04, 2011 12:22 PM To: Umesh Rosyara Cc: r-help at r-project.org; rosyaraur at gmail.com Subject: Re: [R] merging data list in to single data frame On 04.04.2011 16:41, Umesh Rosyara wrote: > Dear R community members > > > > I did find a good way to merge my 200 text data files in to a single data > file with one column added will show indicator for that file. > > > > filelist = list.files(pattern = "K*cd.txt") I doubt you meant "K*cd.txt" but "^K[[:digit:]]*cd\\.txt$". # the file names are K1cd.txt > .................to K200cd.txt > > data_list<-lapply(filelist, read.table, header=T, comment=";", fill=T) Replace by: data_list <- lapply(filelist, function(x) cbind(Filename = x, read.table(x, header=T, comment=";", fill=TRUE)) And then: result <- do.call("rbind", data_list) Uwe Ligges > > > > This will create list, but this is not what I want. 
> > > > I want a single dataframe (all separate dataframes have same variable > headings) with additional row for example > > > > ; just for example, two small datasets are created by my component datasets > are huge, need automation > > ;read from file K1cd.txt > > var1 var2 var3 var4 > > 1 6 0.3 8 > > 3 4 0.4 9 > > 2 3 0.4 6 > > 1 0.4 0.9 3 > > > > ;read from file K2cd.txt > > var1 var2 var3 var4 > > 1 16 0.6 7 > > 3 14 0.4 6 > > 2 1 3 0.4 5 > > 1 0.6 0.9 2 > > > > the output dataframe should look like > > > > Fileno var1 var2 var3 var4 > > 1 1 6 0.3 8 > > 1 3 4 0.4 9 > > 1 2 3 0.4 6 > > 1 1 0.4 0.9 3 > > 2 1 16 0.6 7 > > 2 3 14 0.4 6 > > 2 2 1 3 0.4 5 > > 2 1 0.6 0.9 2 > > > > Please note that new file no column is added > > > > Thank you for the help. > > > > Umesh R > > > > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From dwinsemius at comcast.net Mon Apr 4 18:59:06 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Mon, 4 Apr 2011 12:59:06 -0400 Subject: [R] lattice: how to "center" a subtitle? In-Reply-To: References: Message-ID: <57C6A6F6-4CB1-49B2-A0DA-4480EFD385A3@comcast.net> On Apr 4, 2011, at 12:45 PM, Marius Hofert wrote: > Dear David, > > I intended to use another x-label. But your suggestion brings me to > the idea of just using a two-line xlab, so s.th. 
like > print(xyplot(0 ~ 0, xlab.top = "This title is now 'centered' for the > human's eye", xlab = "but subtitles are _now_ centered\nbla", scales > = list(alternating = c(1,1), tck = c(1,0)))) And if you wanted different fontface (underline, italic or bold) then you could use plotmath expressions: xyplot(0 ~ 0, xlab.top = "This title is now 'centered' for the human's eye", xlab = expression( atop(but~subtitles2~are~underline(now)~centered, bold(bla) )), scales = list(alternating = c(1,1), tck = c(1,0))) -- David. > > Thanks! > > Cheers, > > Marius > > On 2011-04-04, at 16:47 , David Winsemius wrote: > >> >> On Apr 4, 2011, at 7:39 AM, Marius Hofert wrote: >> >>> Dear expeRts, >>> >>> I recently asked for a real "centered" title (see, e.g., http://tolstoy.newcastle.edu.au/R/e13/help/11/01/0135.html) >>> . >>> A nice solution (from Deepayan Sarkar) is to use "xlab.top" >>> instead of "main": >>> >>> library(lattice) >>> trellis.device("pdf") >>> print(xyplot(0 ~ 0, xlab.top = "This title is now 'centered' for >>> the human's eye", sub = "but subtitles are not centered", scales = >>> list(alternating = c(1,1), tck = c(1,0)))) >>> dev.off() >> >> library(lattice) >> trellis.device("pdf") >> print(xyplot(0 ~ 0, xlab.top = "This title is now 'centered' for >> the human's eye", xlab = "but subtitles are _now_ centered", scales >> = list(alternating = c(1,1), tck = c(1,0)))) >> dev.off() >> >> >> (I realize that those are not really subtitles by a 'lab', but that >> appears acceptable in your current test case.) >> >>> >>> My question is whether there is something similar for *sub*titles >>> [so something like "xlab.bottom"]? As you can see from the plot, >>> the subtitle does not seem to be "centered" for the human's eye. I >>> would like to center it according to the x-axis label. 
>>> >> >> >> David Winsemius, MD >> West Hartford, CT >> > David Winsemius, MD West Hartford, CT From dimitri.liakhovitski at gmail.com Mon Apr 4 19:09:02 2011 From: dimitri.liakhovitski at gmail.com (Dimitri Liakhovitski) Date: Mon, 4 Apr 2011 13:09:02 -0400 Subject: [R] merging 2 frames while keeping all the entries from the "reference" frame Message-ID: Hello! I have my data frame "mydata" (below) and data frame "reference" - that contains all the dates I would like to be present in the final data frame. I am trying to merge them so that the result data frame contains all 8 dates in both subgroups (i.e., Group1 should have 8 rows and Group2 too). But when I merge it it's not coming out this way. Any hint would be greatly appreciated! Dimitri mydata<-data.frame(mydate=rep(seq(as.Date("2008-12-29"), length = 8, by = "week"),2), group=c(rep("Group1",8),rep("Group2",8)),values=rnorm(16,1,1)) (mydata) set.seed(1234) out<-sample(1:16,5,replace=F) mydata<-mydata[-out,]; dim(mydata) (mydata) # "reference" contains the dates I want to be present in the final data frame: reference<-data.frame(mydate=seq(as.Date("2008-12-29"), length = 8, by = "week")) # Merging: new.data<-merge(mydata,reference,by="mydate",all.x=T,all.y=T) new.data<-new.data[order(new.data$group,new.data$mydate),] (new.data) # my new.data contains only 7 rows in Group 1 and 4 rows in Group 2 -- Dimitri Liakhovitski Ninah Consulting From dwinsemius at comcast.net Mon Apr 4 19:22:53 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Mon, 4 Apr 2011 13:22:53 -0400 Subject: [R] Questions remaining: define any character as na.string RE: merging data list in to single data frame In-Reply-To: <004701cbf2e6$89e09db0$9da1d910$@edu> References: <002701cbf2d6$65405870$2fc10950$@edu> <4D99F020.8020508@statistik.tu-dortmund.de> <004701cbf2e6$89e09db0$9da1d910$@edu> Message-ID: On Apr 4, 2011, at 12:37 PM, Umesh Rosyara wrote: > Dear Uwe and R community members > > Thank you Uwe for the help.
> > I have still a question remaining, I am trying to find answer from > long > time. > > While exporting my data, I have some characters mixed into it. I > want to > define any characters as na.string? Is it possible to do so? Option 1: do it in an editor that is regex aware. Option 2: input the file with readLines, use gsub to remove the unwanted characters, read.table(textConnection(obj)) on the resulting object. [There are many worked examples in the archives. Search on "read.table(textConnection(" .] -- David. > > Thanks; > > Umesh > > > > -----Original Message----- > From: Uwe Ligges [mailto:ligges at statistik.tu-dortmund.de] > Sent: Monday, April 04, 2011 12:22 PM > To: Umesh Rosyara > Cc: r-help at r-project.org; rosyaraur at gmail.com > Subject: Re: [R] merging data list in to single data frame > > > > On 04.04.2011 16:41, Umesh Rosyara wrote: >> Dear R community members >> >> >> >> I did find a good way to merge my 200 text data files in to a >> single data >> file with one column added will show indicator for that file. >> >> >> >> filelist = list.files(pattern = "K*cd.txt") > > > I doubt you meant "K*cd.txt" but "^K[[:digit:]]*cd\\.txt$". > > > > # the file names are K1cd.txt >> .................to K200cd.txt >> >> data_list<-lapply(filelist, read.table, header=T, comment=";", >> fill=T) > > > Replace by: > > data_list <- lapply(filelist, function(x) > cbind(Filename = x, read.table(x, header=T, comment=";", > fill=TRUE)) > > > And then: > > result <- do.call("rbind", data_list) > > Uwe Ligges > > >> >> >> >> This will create list, but this is not what I want. 
>> >> >> >> I want a single dataframe (all separate dataframes have same variable >> headings) with additional row for example >> >> >> >> ; just for example, two small datasets are created by my component > datasets >> are huge, need automation >> >> ;read from file K1cd.txt >> >> var1 var2 var3 var4 >> >> 1 6 0.3 8 >> >> 3 4 0.4 9 >> >> 2 3 0.4 6 >> >> 1 0.4 0.9 3 >> >> >> >> ;read from file K2cd.txt >> >> var1 var2 var3 var4 >> >> 1 16 0.6 7 >> >> 3 14 0.4 6 >> >> 2 1 3 0.4 5 >> >> 1 0.6 0.9 2 >> >> >> >> the output dataframe should look like >> >> >> >> Fileno var1 var2 var3 var4 >> >> 1 1 6 0.3 8 >> >> 1 3 4 0.4 9 >> >> 1 2 3 0.4 6 >> >> 1 1 0.4 0.9 3 >> >> 2 1 16 0.6 7 >> >> 2 3 14 0.4 6 >> >> 2 2 1 3 0.4 5 >> >> 2 1 0.6 0.9 2 >> >> >> >> Please note that new file no column is added >> >> >> >> Thank you for the help. >> >> >> >> Umesh R >> >> >> >> >> [[alternative HTML version deleted]] >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
David Winsemius, MD West Hartford, CT From dimitri.liakhovitski at gmail.com Mon Apr 4 19:24:34 2011 From: dimitri.liakhovitski at gmail.com (Dimitri Liakhovitski) Date: Mon, 4 Apr 2011 13:24:34 -0400 Subject: [R] merging 2 frames while keeping all the entries from the "reference" frame In-Reply-To: References: Message-ID: To clarify just in case, here is the result I am trying to get: mydate group values 12/29/2008 Group1 0.453466522 1/5/2009 Group1 NA 1/12/2009 Group1 0.416548943 1/19/2009 Group1 2.066275155 1/26/2009 Group1 2.037729638 2/2/2009 Group1 -0.598040483 2/9/2009 Group1 1.658999227 2/16/2009 Group1 -0.869325211 12/29/2008 Group2 NA 1/5/2009 Group2 NA 1/12/2009 Group2 NA 1/19/2009 Group2 0.375284194 1/26/2009 Group2 0.706785401 2/2/2009 Group2 NA 2/9/2009 Group2 2.104937151 2/16/2009 Group2 2.880393978 On Mon, Apr 4, 2011 at 1:09 PM, Dimitri Liakhovitski wrote: > Hello! > I have my data frame "mydata" (below) and data frame "reference" - > that contains all the dates I would like to be present in the final > data frame. > I am trying to merge them so that the the result data frame contains > all 8 dates in both subgroups (i.e., Group1 should have 8 rows and > Group2 too). But when I merge it it's not coming out this way. Any > hint would be greatly appreciated! 
> Dimitri > > mydata<-data.frame(mydate=rep(seq(as.Date("2008-12-29"), length = 8, > by = "week"),2), > group=c(rep("Group1",8),rep("Group2",8)),values=rnorm(16,1,1)) > (reference);(mydata) > set.seed(1234) > out<-sample(1:16,5,replace=F) > mydata<-mydata[-out,]; dim(mydata) > (mydata) > > # "reference" contains the dates I want to be present in the final data frame: > reference<-data.frame(mydate=seq(as.Date("2008-12-29"), length = 8, by > = "week")) > > # Merging: > new.data<-merge(mydata,reference,by="mydate",all.x=T,all.y=T) > new.data<-new.data[order(new.data$group,new.data$mydate),] > (new.data) > # my new.data contains only 7 rows in Group 1 and 4 rows in Group 2 > > > -- > Dimitri Liakhovitski > Ninah Consulting > -- Dimitri Liakhovitski Ninah Consulting www.ninah.com From m_hofert at web.de Mon Apr 4 19:27:55 2011 From: m_hofert at web.de (Marius Hofert) Date: Mon, 4 Apr 2011 19:27:55 +0200 Subject: [R] lattice: how to "center" a subtitle? In-Reply-To: <57C6A6F6-4CB1-49B2-A0DA-4480EFD385A3@comcast.net> References: <57C6A6F6-4CB1-49B2-A0DA-4480EFD385A3@comcast.net> Message-ID: Dear David, do you know how to get plotmath-like symbols in both rows? I tried s.th. like: lab <- expression(paste(alpha==1, ", ", beta==2, sep="")) xlab <- substitute(expression( atop(lab==lab., bold(foo)) ), list(lab.=lab)) xyplot(0 ~ 0, xlab = xlab) Cheers, Marius On 2011-04-04, at 18:59 , David Winsemius wrote: > > On Apr 4, 2011, at 12:45 PM, Marius Hofert wrote: > >> Dear David, >> >> I intended to use another x-label. But your suggestion brings me to the idea of just using a two-line xlab, so s.th. 
like >> print(xyplot(0 ~ 0, xlab.top = "This title is now 'centered' for the human's eye", xlab = "but subtitles are _now_ centered\nbla", scales = list(alternating = c(1,1), tck = c(1,0)))) > > And if you wanted different fontface (underline, italic or bold) then you could use plotmath expressions: > > xyplot(0 ~ 0, xlab.top = "This title is now 'centered' for the human's eye", xlab = expression( atop(but~subtitles2~are~underline(now)~centered, bold(bla) )), scales = list(alternating = c(1,1), tck = c(1,0))) > > -- > David. > >> >> Thanks! >> >> Cheers, >> >> Marius >> >> On 2011-04-04, at 16:47 , David Winsemius wrote: >> >>> >>> On Apr 4, 2011, at 7:39 AM, Marius Hofert wrote: >>> >>>> Dear expeRts, >>>> >>>> I recently asked for a real "centered" title (see, e.g., http://tolstoy.newcastle.edu.au/R/e13/help/11/01/0135.html). >>>> A nice solution (from Deepayan Sarkar) is to use "xlab.top" instead of "main": >>>> >>>> library(lattice) >>>> trellis.device("pdf") >>>> print(xyplot(0 ~ 0, xlab.top = "This title is now 'centered' for the human's eye", sub = "but subtitles are not centered", scales = list(alternating = c(1,1), tck = c(1,0)))) >>>> dev.off() >>> >>> library(lattice) >>> trellis.device("pdf") >>> print(xyplot(0 ~ 0, xlab.top = "This title is now 'centered' for the human's eye", xlab = "but subtitles are _now_ centered", scales = list(alternating = c(1,1), tck = c(1,0)))) >>> dev.off() >>> >>> >>> (I realize that those are not really subtitles by a 'lab', but that appears acceptable in your current test case.) >>> >>>> >>>> My question is whether there is something similar for *sub*titles [so something like "xlab.bottom"]? As you can see from the plot, the subtitle does not seem to be "centered" for the human's eye. I would like to center it according to the x-axis label. 
>>>> >>> >>> >>> David Winsemius, MD >>> West Hartford, CT >>> >> > > David Winsemius, MD > West Hartford, CT > From dwinsemius at comcast.net Mon Apr 4 19:58:43 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Mon, 4 Apr 2011 13:58:43 -0400 Subject: [R] lattice: how to "center" a subtitle? In-Reply-To: References: <57C6A6F6-4CB1-49B2-A0DA-4480EFD385A3@comcast.net> Message-ID: On Apr 4, 2011, at 1:27 PM, Marius Hofert wrote: > Dear David, > > do you know how to get plotmath-like symbols in both rows? > I tried s.th. like: > > lab <- expression(paste(alpha==1, ", ", beta==2, sep="")) > xlab <- substitute(expression( atop(lab==lab., bold(foo)) ), > list(lab.=lab)) > xyplot(0 ~ 0, xlab = xlab) I _did_ have plotmath functions in both rows. But here is your solution: xyplot(0 ~ 0, xlab = expression( atop(paste(alpha==1, " ", beta==2), bold(bla) )) ) Note that `paste` in plotmath is different than `paste` in regular R. It has no `sep` argument. I did try both substitute and bquote on your externally defined expression, but lattice seems to be doing some non-standard evaluation and I never got it to "work". Using what I thought _should_ work, does work with `plot`: > x=1;y=2 > plot(0 ~ 0, xlab = bquote( atop(alpha==.(x)*","~beta==.(y), bold(foo) ) ) + ) But the same expression throws an error with xyplot: > x=1;y=2 > xyplot(0 ~ 0, xlab = bquote( atop(alpha==.(x)*","~beta==.(y), bold(foo) ) ) + ) Error in trellis.skeleton(formula = 0 ~ 0, cond = list(1L), aspect = "fill", : could not find function "atop" -- David. > > Cheers, > > Marius > > On 2011-04-04, at 18:59 , David Winsemius wrote: > >> >> On Apr 4, 2011, at 12:45 PM, Marius Hofert wrote: >> >>> Dear David, >>> >>> I intended to use another x-label. But your suggestion brings me >>> to the idea of just using a two-line xlab, so s.th.
like
>>> print(xyplot(0 ~ 0, xlab.top = "This title is now 'centered' for
>>> the human's eye", xlab = "but subtitles are _now_ centered\nbla",
>>> scales = list(alternating = c(1,1), tck = c(1,0))))
>>
>> And if you wanted different fontface (underline, italic or bold)
>> then you could use plotmath expressions:
>>
>> xyplot(0 ~ 0, xlab.top = "This title is now 'centered' for the
>> human's eye", xlab =
>> expression( atop(but~subtitles2~are~underline(now)~centered,
>> bold(bla) )), scales = list(alternating = c(1,1), tck = c(1,0)))
>>
>> --
>> David.
>>
>>>
>>> Thanks!
>>>
>>> Cheers,
>>>
>>> Marius
>>>
>>> On 2011-04-04, at 16:47 , David Winsemius wrote:
>>>
>>>>
>>>> On Apr 4, 2011, at 7:39 AM, Marius Hofert wrote:
>>>>
>>>>> Dear expeRts,
>>>>>
>>>>> I recently asked for a real "centered" title (see, e.g., http://tolstoy.newcastle.edu.au/R/e13/help/11/01/0135.html).
>>>>> A nice solution (from Deepayan Sarkar) is to use "xlab.top"
>>>>> instead of "main":
>>>>>
>>>>> library(lattice)
>>>>> trellis.device("pdf")
>>>>> print(xyplot(0 ~ 0, xlab.top = "This title is now 'centered' for
>>>>> the human's eye", sub = "but subtitles are not centered", scales
>>>>> = list(alternating = c(1,1), tck = c(1,0))))
>>>>> dev.off()
>>>>
>>>> library(lattice)
>>>> trellis.device("pdf")
>>>> print(xyplot(0 ~ 0, xlab.top = "This title is now 'centered' for
>>>> the human's eye", xlab = "but subtitles are _now_ centered",
>>>> scales = list(alternating = c(1,1), tck = c(1,0))))
>>>> dev.off()
>>>>
>>>>
>>>> (I realize that those are not really subtitles by a 'lab', but
>>>> that appears acceptable in your current test case.)
>>>>
>>>>>
>>>>> My question is whether there is something similar for
>>>>> *sub*titles [so something like "xlab.bottom"]? As you can see
>>>>> from the plot, the subtitle does not seem to be "centered" for
>>>>> the human's eye. I would like to center it according to the
>>>>> x-axis label.
>>>>>
>>>>
>>>>
>>>> David Winsemius, MD
>>>> West Hartford, CT
>>>>
>>>
>>
>> David Winsemius, MD
>> West Hartford, CT
>>

David Winsemius, MD
West Hartford, CT

From ehlers at ucalgary.ca Mon Apr 4 20:01:51 2011
From: ehlers at ucalgary.ca (Peter Ehlers)
Date: Mon, 4 Apr 2011 11:01:51 -0700
Subject: [R] lattice: how to "center" a subtitle?
In-Reply-To:
References: <57C6A6F6-4CB1-49B2-A0DA-4480EFD385A3@comcast.net>
Message-ID: <4D9A078F.5080408@ucalgary.ca>

On 2011-04-04 10:27, Marius Hofert wrote:
> Dear David,
>
> do you know how to get plotmath-like symbols in both rows?
> I tried s.th. like:
>
> lab<- expression(paste(alpha==1, ", ", beta==2, sep=""))
> xlab<- substitute(expression( atop(lab==lab., bold(foo)) ), list(lab.=lab))
> xyplot(0 ~ 0, xlab = xlab)

Marius,

I always find paste a bit tricky with plotmath.
Maybe this will do what you want:

mylab <- expression( atop(lab==list(alpha==1, beta==2), bold(foo)) )
xyplot(0 ~ 0, xlab = mylab)

Peter Ehlers

>
> Cheers,
>
> Marius
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

From wwwhsd at gmail.com Mon Apr 4 20:03:44 2011
From: wwwhsd at gmail.com (Henrique Dallazuanna)
Date: Mon, 4 Apr 2011 15:03:44 -0300
Subject: [R] lattice: how to "center" a subtitle?
In-Reply-To:
References: <57C6A6F6-4CB1-49B2-A0DA-4480EFD385A3@comcast.net>
Message-ID:

Maybe:

xyplot(0 ~ 0, xlab = bquote(expression(atop(alpha==.(x)*","~beta==.(y), bold(foo) )) ))

On Mon, Apr 4, 2011 at 2:58 PM, David Winsemius wrote:
>
> On Apr 4, 2011, at 1:27 PM, Marius Hofert wrote:
>
>> Dear David,
>>
>> do you know how to get plotmath-like symbols in both rows?
>> I tried s.th. like:
>>
>> lab <- expression(paste(alpha==1, ", ", beta==2, sep=""))
>> xlab <- substitute(expression( atop(lab==lab., bold(foo)) ),
>> list(lab.=lab))
>> xyplot(0 ~ 0, xlab = xlab)
>
> I _did_ have plotmath functions in both rows. But here is your solution:
>
> xyplot(0 ~ 0, xlab =
>    expression( atop(paste(alpha==1, "   ", beta==2), bold(bla) )) )
>
> Note that `paste` in plotmath is different than `paste` in regular R. It has
> no `sep` argument. I did try both substitute and bquote on your external
> expression, but lattice seems to be doing some non-standard evaluation and
> I never got it to "work". Using what I thought _should_ work does work with
> `plot`:
>
> > x=1;y=2
> > plot(0 ~ 0, xlab = bquote( atop(alpha==.(x)*","~beta==.(y), bold(foo) ) )
> + )
>
> But the same expression throws an error with xyplot:
> > x=1;y=2
> > xyplot(0 ~ 0, xlab = bquote( atop(alpha==.(x)*","~beta==.(y), bold(foo) ) )
> + )
> Error in trellis.skeleton(formula = 0 ~ 0, cond = list(1L), aspect = "fill", :
>   could not find function "atop"
>
> --
> David.

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O

From i.petzev at gmail.com Mon Apr 4 20:23:47 2011
From: i.petzev at gmail.com (ivan)
Date: Mon, 4 Apr 2011 20:23:47 +0200
Subject: [R] Granger Causality in a VAR Model
Message-ID:

Dear Community,

I am new to R and have a question concerning the causality() test in
the vars package. I need to test whether, say, the variable y Granger
causes the variable x, given z as a control variable.

I estimated the VAR model as follows:

> model <- VAR(cbind(x, y, z), p = 2)

Then I did the following:

> causality(model, cause = "y")

I think this tests the Granger causality of y on the vector (x, z),
though. How can I implement the test for y causing x, controlled for z?
That is, the F-test comparing the two models

M1: x ~ lagged(x) + lagged(z)
M2: x ~ lagged(x) + lagged(y) + lagged(z)

Thank you in advance.

Best Regards

From m_hofert at web.de Mon Apr 4 20:34:01 2011
From: m_hofert at web.de (Marius Hofert)
Date: Mon, 4 Apr 2011 20:34:01 +0200
Subject: [R] lattice: how to "center" a subtitle?
In-Reply-To:
References: <57C6A6F6-4CB1-49B2-A0DA-4480EFD385A3@comcast.net>
Message-ID:

Dear all,

many thanks, that helped a lot!

Cheers,

Marius

From geraldes at mail.ubc.ca Mon Apr 4 19:13:04 2011
From: geraldes at mail.ubc.ca (geral)
Date: Mon, 4 Apr 2011 12:13:04 -0500 (CDT)
Subject: [R] automating regression or correlations for many variables
Message-ID: <1301937184597-3426091.post@n4.nabble.com>

Dear All,

I have a large data frame with 10 rows and 82 columns. I want to apply
the same function to all of the columns with a single command. e.g.
zl <- lm (snp$a_109909 ~ snp$lat) will fit a linear model to the values
in lat and a_109909. What I want to do is fit linear models for the
values in each column against lat. I tried doing
zl <- (snp[,2:82] ~ snp$lat[,1]) but got the following error message
"Error in model.frame.default(formula = snp[, 2:82] ~ snp[, 1],
drop.unused.levels = TRUE) : invalid type (list) for variable
'snp[, 2:82]'". Does this mean I cannot use a data frame to do this?
Should I have my data in a matrix instead?

Thanks

--
View this message in context: http://r.789695.n4.nabble.com/automating-regression-or-correlations-for-many-variables-tp3426091p3426091.html
Sent from the R help mailing list archive at Nabble.com.

From antrael at hotmail.com Mon Apr 4 19:14:36 2011
From: antrael at hotmail.com (jouba)
Date: Mon, 4 Apr 2011 12:14:36 -0500 (CDT)
Subject: [R] Structural equation modeling in R(lavaan,sem)
In-Reply-To: <4D996FB6.2020903@gmail.com>
References: <4D9073F7.2040309@gmail.com> <1301426701835-3415954.post@n4.nabble.com> <4D996FB6.2020903@gmail.com>
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From ggrothendieck at gmail.com Mon Apr 4 20:53:51 2011
From: ggrothendieck at gmail.com (Gabor Grothendieck)
Date: Mon, 4 Apr 2011 14:53:51 -0400
Subject: [R] merging 2 frames while keeping all the entries from the "reference" frame
In-Reply-To:
References:
Message-ID:

On Mon, Apr 4, 2011 at 1:09 PM, Dimitri Liakhovitski wrote:
> Hello!
> I have my data frame "mydata" (below) and data frame "reference" -
> that contains all the dates I would like to be present in the final
> data frame.
> I am trying to merge them so that the result data frame contains
> all 8 dates in both subgroups (i.e., Group1 should have 8 rows and
> Group2 too). But when I merge it it's not coming out this way. Any
> hint would be greatly appreciated!
> Dimitri
>
> mydata<-data.frame(mydate=rep(seq(as.Date("2008-12-29"), length = 8,
> by = "week"),2),
> group=c(rep("Group1",8),rep("Group2",8)),values=rnorm(16,1,1))
> (reference);(mydata)
> set.seed(1234)
> out<-sample(1:16,5,replace=F)
> mydata<-mydata[-out,]; dim(mydata)
> (mydata)
>
> # "reference" contains the dates I want to be present in the final data frame:
> reference<-data.frame(mydate=seq(as.Date("2008-12-29"), length = 8, by
> = "week"))
>
> # Merging:
> new.data<-merge(mydata,reference,by="mydate",all.x=T,all.y=T)
> new.data<-new.data[order(new.data$group,new.data$mydate),]
> (new.data)
> # my new.data contains only 7 rows in Group 1 and 4 rows in Group 2
>

It might make more sense to put each group into its own column since
then the object is a multivariate time series:

> library(zoo)
> z <- merge(read.zoo(mydata, split = 2), zoo(, reference[[1]]), all = c(FALSE, TRUE))
> z
              Group1    Group2
2008-12-29 2.0266215        NA
2009-01-05        NA        NA
2009-01-12 1.0255344        NA
2009-01-19 1.3880938 0.8135788
2009-01-26 1.4380978 1.6068682
2009-02-02 1.1764965        NA
2009-02-09 1.1578531 1.4484447
2009-02-16 0.6673568 1.4760864

although if you really need to you could string them out like this:

library(reshape2)
melt(data.frame(time(z), coredata(z)), id = 1)

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

From wwwhsd at gmail.com Mon Apr 4 21:07:25 2011
From: wwwhsd at gmail.com (Henrique Dallazuanna)
Date: Mon, 4 Apr 2011 16:07:25 -0300
Subject: [R] merging 2 frames while keeping all the entries from the "reference" frame
In-Reply-To:
References:
Message-ID:

Try this:

merge(mydata, cbind(reference, group = rep(unique(mydata$group), each = nrow(reference))), all = TRUE)

On Mon, Apr 4, 2011 at 2:24 PM, Dimitri Liakhovitski wrote:
> To clarify just in case, here is the result I am trying to get:
>
> mydate       group    values
> 12/29/2008   Group1    0.453466522
> 1/5/2009     Group1    NA
> 1/12/2009    Group1    0.416548943
> 1/19/2009    Group1    2.066275155
> 1/26/2009    Group1    2.037729638
> 2/2/2009     Group1   -0.598040483
> 2/9/2009     Group1    1.658999227
> 2/16/2009    Group1   -0.869325211
> 12/29/2008   Group2    NA
> 1/5/2009     Group2    NA
> 1/12/2009    Group2    NA
> 1/19/2009    Group2    0.375284194
> 1/26/2009    Group2    0.706785401
> 2/2/2009     Group2    NA
> 2/9/2009     Group2    2.104937151
> 2/16/2009    Group2    2.880393978
>
> --
> Dimitri Liakhovitski
> Ninah Consulting
> www.ninah.com

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O

From djmuser at gmail.com Mon Apr 4 21:24:57 2011
From: djmuser at gmail.com (Dennis Murphy)
Date: Mon, 4 Apr 2011 12:24:57 -0700
Subject: [R] merging data list in to single data frame
In-Reply-To: <002701cbf2d6$65405870$2fc10950$@edu>
References: <002701cbf2d6$65405870$2fc10950$@edu>
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From djmuser at gmail.com Mon Apr 4 21:53:38 2011
From: djmuser at gmail.com (Dennis Murphy)
Date: Mon, 4 Apr 2011 12:53:38 -0700
Subject: [R] automating regression or correlations for many variables
In-Reply-To: <1301937184597-3426091.post@n4.nabble.com>
References: <1301937184597-3426091.post@n4.nabble.com>
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From ggrothendieck at gmail.com Mon Apr 4 21:54:34 2011
From: ggrothendieck at gmail.com (Gabor Grothendieck)
Date: Mon, 4 Apr 2011 15:54:34 -0400
Subject: [R] zoo:rollapply by multiple grouping factors
In-Reply-To: <4D9A1EA9.1010203@ucsc.edu>
References: <4D98991B.10909@ucsc.edu> <4D9A1EA9.1010203@ucsc.edu>
Message-ID:

On Mon, Apr 4, 2011 at 3:40 PM, Mark Novak wrote:
> Thank you very much Gabor!  It looks like that's gonna work wonderfully.  I
> didn't even know 'ave' existed.
>
> For others out there:  I only needed to add a comma:   dat[,c("Site",
> "Plot", "Sp")]

Actually, if dd is a data frame dd[, ix] and dd[ix] give the same result. e.g.

> dd <- data.frame(a = 1:3, b = 11:13, c = 21:23)
> identical(dd[, c("b", "c")], dd[c("b", "c")])
[1] TRUE

> Small follow up Q:  Is there any reason to use 'aggregate' vs. 'ave' in
> general?

aggregate reduces the data to fewer rows. ave adds a potentially
additional column to the original data.

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

From smckinney at bccrc.ca Mon Apr 4 21:58:53 2011
From: smckinney at bccrc.ca (Steven McKinney)
Date: Mon, 4 Apr 2011 12:58:53 -0700
Subject: [R] Linear Model with curve fitting parameter?
In-Reply-To:
References: <8512_1301611104_1301611104_AANLkTikOxmcE=oMvHBuB8x61fxXJwnexXJkr+Qp3Tawp@mail.gmail.com>
Message-ID:

> -----Original Message-----
> From: stephen sefick [mailto:ssefick at gmail.com]
> Sent: April-03-11 5:35 PM
> To: Steven McKinney
> Cc: R help
> Subject: Re: [R] Linear Model with curve fitting parameter?
>
> Steven:
>
> You are exactly right sorry I was confused.
>
>
> #######################################################
> so log(y-intercept)+log(K) is a constant called b0 (is this right?)

Doesn't look right to me based on the information you've provided.
I don't see anything labeled "y" in your previous emails, so I'm
not clear on what y is and how it relates to the original model
you described

> >> I have a model Q=K*A*(R^r)*(S^s)
> >>
> >> A, R, and S are data I have and K is a curve fitting parameter.

If the model is

   Q=K*A*(R^r)*(S^s)

then

   log(Q) = log(K) + log(A) + r*log(R) + s*log(S)

Rearranging yields

   log(Q) - log(A) = log(K) + r*log(R) + s*log(S)

Let Z = log(Q) - log(A) = log(Q/A)

so

   Z = log(K) + r*log(R) + s*log(S)

and a linear model fit of

   Z ~ log(R) + log(S)

will yield parameter estimates for the linear equation

   E(Z) = B0 + B1*log(R) + B2*log(S)

(E(Z) = expected value of Z)

so B0 estimate is an estimate of log(K)
   B1 estimate is an estimate of r
   B2 estimate is an estimate of s

and these are the only parameters you described in the original model.

> lm(log(Q)~log(A)+log(R)+log(S)-1)
>
> is fitting the model
>
> log(Q)=a*log(A)+r*log(R)+s*log(S) (no beta 0)
>
> and
>
> lm(log(Q)~log(A)+log(R)+log(S))
>
>
> is fitting the model
>
> log(Q)=b0+a*log(A)+r*log(R)+s*log(S)

K has disappeared from these equations so these model fits do not
correspond to the model originally described.
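
A minimal sketch of the fit of Z ~ log(R) + log(S) described above, run
on simulated data. Everything numeric here is an assumption made for
illustration and does not come from the thread: the "true" values of K,
r and s, the ranges of A, R and S, and the multiplicative log-normal
error. The sketch fits Z = log(Q/A), back-transforms the intercept to
estimate K, and also fits the equivalent offset() form, which should
give the same coefficients.

```r
# Illustration only: simulated data, not the poster's hydraulic data.
set.seed(42)
n <- 50
A <- runif(n, 1, 10)        # assumed range
R <- runif(n, 0.5, 3)       # assumed range
S <- runif(n, 0.001, 0.01)  # assumed range

# Assumed "true" parameters and multiplicative log-normal noise.
K_true <- 2
r_true <- 0.67
s_true <- 0.5
Q <- K_true * A * R^r_true * S^s_true * exp(rnorm(n, sd = 0.05))

# Z = log(Q/A) regressed on log(R) and log(S).
fit <- lm(log(Q / A) ~ log(R) + log(S))
est <- coef(fit)

K_hat <- exp(est[["(Intercept)"]])  # intercept estimates log(K)
r_hat <- est[["log(R)"]]            # slope on log(R) estimates r
s_hat <- est[["log(S)"]]            # slope on log(S) estimates s

# Same model written with an offset: log(A) enters with coefficient fixed at 1.
fit_offset <- lm(log(Q) ~ offset(log(A)) + log(R) + log(S))
```

Note that the offset has to be log(A), not A, when the response is
log(Q); exp() of the intercept recovers K only under the assumed
multiplicative error structure.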
Now a b0 appears, and is used in models below. I think changing
notation is also adding confusion. What are "y" and "intercept"
you discuss above, in relation to your original notation?

>
> ######################################################
>
> These are the models I am trying to fit and if I have reasoned
> correctly above then I should be able to fit the below models
> similarly.

You will be able to fit models appropriately once you have
a clearly defined system of notation that allows you to
map between the proposed data model, the parameters in that
model, and the corresponding regression equations.

Once you have consistent notation, you will be able to see
if you can express your model as a linear regression, or
if not, what kind of non-linear regression you will need
to do to get estimates for the parameters in your model.

Best

Steve McKinney

>
> manning
> log(Q)=log(b0)+log(K)+log(A)+r*log(R)+s*log(S)
>
> dingman
> log(Q)=log(b0)+log(K)+a*log(A)+r*log(R)+s*(log(S))^2
>
> bjerklie
> log(Q)=log(b0)+log(K)+a*log(A)+r*log(R)+s*log(S)
>
> #######################################################
>
> Thank you for all of your help!
>
> Stephen
>
> On Fri, Apr 1, 2011 at 2:44 PM, Steven McKinney wrote:
> >
> >> -----Original Message-----
> >> From: stephen sefick [mailto:ssefick at gmail.com]
> >> Sent: April-01-11 5:44 AM
> >> To: Steven McKinney
> >> Cc: R help
> >> Subject: Re: [R] Linear Model with curve fitting parameter?
> >>
> >> Setting Z=Q-A would be the incorrect dimensions.  I could Z=Q/A.
> >
> > I suspect this is confusion about what Q is.  I was presuming that
> > the Q in this following formula was log(Q) with Q from the original data.
> >
> >> >> I have taken the log of the data that I have and this is the model
> >> >> formula without the K part
> >> >>
> >> >> lm(Q~offset(A)+R+S, data=x)
> >
> > If the model is
> >
> >   Q=K*A*(R^r)*(S^s)
> >
> > then
> >
> >   log(Q) = log(K) + log(A) + r*log(R) + s*log(S)
> >
> > Rearranging yields
> >
> >   log(Q) - log(A) = log(K) + r*log(R) + s*log(S)
> >
> > so what I labeled 'Z' below is
> >
> >   Z = log(Q) - log(A) = log(Q/A)
> >
> > so
> >
> >   Z = log(K) + r*log(R) + s*log(S)
> >
> > and a linear model fit of
> >
> >   Z ~ log(R) + log(S)
> >
> > will yield parameter estimates for the linear equation
> >
> >   E(Z) = B0 + B1*log(R) + B2*log(S)
> >
> > (E(Z) = expected value of Z)
> >
> > so B0 estimate is an estimate of log(K)
> >   B1 estimate is an estimate of r
> >   B2 estimate is an estimate of s
> >
> > More details and careful notation will eventually lead
> > to a reasonable description and analysis strategy.
> >
> >
> > Best
> >
> > Steve McKinney
> >
> >> Is fitting a nls model the same as fitting an ols?  These data are
> >> hydraulic data from ~47 sites.  To assess predictive ability I am
> >> removing one site fitting a new model and then assessing the fit with
> >> a myriad of model assessment criteria.  I should get the same answer
> >> with ols vs nls?  Thank you for all of your help.
> >>
> >> Stephen
> >>
> >> On Thu, Mar 31, 2011 at 8:34 PM, Steven McKinney wrote:
> >> >
> >> >> -----Original Message-----
> >> >> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of stephen sefick
> >> >> Sent: March-31-11 3:38 PM
> >> >> To: R help
> >> >> Subject: [R] Linear Model with curve fitting parameter?
> >> >>
> >> >> I have a model Q=K*A*(R^r)*(S^s)
> >> >>
> >> >> A, R, and S are data I have and K is a curve fitting parameter.  I
> >> >> have linearized as
> >> >>
> >> >> log(Q)=log(K)+log(A)+r*log(R)+s*log(S)
> >> >>
> >> >> I have taken the log of the data that I have and this is the model
> >> >> formula without the K part
> >> >>
> >> >> lm(Q~offset(A)+R+S, data=x)
> >> >>
> >> >> What is the formula that I should use?
> >> >
> >> > Let Z = Q - A for your logged data.
> >> >
> >> > Fitting lm(Z ~ R + S, data = x) should yield
> >> > intercept parameter estimate = estimate for log(K)
> >> > R coefficient parameter estimate = estimate for r
> >> > S coefficient parameter estimate = estimate for s
> >> >
> >> > Steven McKinney
> >> >
> >> > Statistician
> >> > Molecular Oncology and Breast Cancer Program
> >> > British Columbia Cancer Research Centre
> >> >
> >> >> Thanks for all of your help.  I can provide a subset of data if necessary.

From mnovak1 at ucsc.edu Mon Apr 4 21:40:25 2011
From: mnovak1 at ucsc.edu (Mark Novak)
Date: Mon, 04 Apr 2011 12:40:25 -0700
Subject: [R] zoo:rollapply by multiple grouping factors
In-Reply-To:
References: <4D98991B.10909@ucsc.edu>
Message-ID: <4D9A1EA9.1010203@ucsc.edu>

Thank you very much Gabor! It looks like that's gonna work wonderfully.
I didn't even know 'ave' existed.

For others out there: I only needed to add a comma: dat[,c("Site", "Plot", "Sp")]

Small follow up Q: Is there any reason to use 'aggregate' vs. 'ave' in general?

-mark

On 4/3/11 3:27 PM, Gabor Grothendieck wrote:
> Try ave:
>
> dat$cv<- ave(dat$Count, dat[c("Site", "Plot", "Sp")], FUN =
> function(x) rollapply(zoo(x), 2, cv, na.pad = TRUE, align = "right"))
> On Sun, Apr 3, 2011 at 11:58 AM, Mark Novak wrote:
>> # Hi there,
>> # I am trying to apply a function over a moving-window for a large number of
>> multivariate time-series that are grouped in a nested set of factors. I
>> have spent a few days searching for solutions with no luck, so any
>> suggestions are much appreciated.
>>
>> # The data I have are for the abundance dynamics of multiple species
>> observed in multiple fixed plots at multiple sites. (In total I have 7
>> sites, ~3-5 plots/site, ~150 species/plot, for 60 time-steps each.)
So my >> data look something like this: >> >> dat<-data.frame(Site=rep(1), Plot=rep(c(rep(1,8),rep(2,8),rep(3,8)),1), >> Time=rep(c(1,1,2,2,3,3,4,4)), Sp=rep(1:2), Count=sample(24)) >> dat >> >> # Let the function I want to apply over a right-aligned window of w=2 time >> steps be: >> cv<-function(x){sd(x)/mean(x)} >> w<-2 >> >> # The final output I want would look something like this: >> Out<-data.frame(dat,CV=round(c(NA,NA,runif(6,0,1),c(NA,NA,runif(6,0,1))),2)) >> >> # I could reshape and apply zoo:rollapply() to a given plot at a given site, >> and reshape again as follows: >> library(zoo) >> a<-subset(dat,Site==1&Plot==1) >> b<-reshape(a[-c(1,2)],v.names='Count',idvar='Time',timevar='Sp',direction='wide') >> d<-zoo(b[,-1],b[,1]) >> d >> out<-rollapply(d, w, cv, na.pad=T, align='right') >> out >> >> # I would thereby have to loop through all my sites and plots which, >> although it deals with all species at once, still seems exceedingly >> inefficient. >> >> # So the question is, how do I use something like aggregate.zoo or tapply or >> even lapply to apply rollapply on each species' time series. >> >> # The closest I've come is the following two approaches: >> >> # First let: >> datx<-list(Site=dat$Site,Plot=dat$Plot,Sp=dat$Sp) >> daty<-dat$Count >> >> # Method 1. >> out1<-tapply(seq(along=daty),datx,function(i,x=daty){ rollapply(zoo(x[i]), >> w, cv, na.pad=T, align='right') }) >> out1 >> out1[,,1] >> >> # Which "works" in that it gives me the right answers, but in a format from >> which I can't figure out how to get back into the format I want. >> >> # Method 2. >> fun<-function(x){y<-zoo(x);coredata(rollapply(y, w, >> cv,na.pad=T,align='right'))} >> out2<-aggregate(daty,by=datx,fun) >> out2 >> >> # Which superficially "works" better, but again only in a format I can't >> figure out how to use because the output seems to be a mix of data.frame and >> lists. 
>> out2[1,4] >> out2[1,5] >> is.data.frame(out2) >> is.list(out2) >> >> # The situation is made more problematic by the fact that the time point of >> first survey can differ between plots (e.g., site1-plot3 may only start at >> time-point 3). As in... >> dat2<-dat >> dat2<-dat2[-which(dat2$Plot==3& dat2$Time<3),] >> dat2 >> >> # I must therefore ensure that I'm keeping track of the true time associated >> with each value, not just the order of their occurences. This information >> is (seemingly) lost by both methods. >> datx<-list(Site=dat2$Site,Plot=dat2$Plot,Sp=dat2$Sp) >> daty<-dat2$Count >> >> # Method 1. >> out3<-tapply(seq(along=daty),datx,function(i,x=daty){ rollapply(zoo(x[i]), >> w, cv, na.pad=T, align='right') }) >> out3 >> out3[1,3,1] >> time(out3[1,3,1]) >> >> # Method 2 >> out4<-aggregate(daty,by=datx,fun) >> out4 >> time(out4[3,4]) >> >> >> # Am I going about this all wrong? Is there a different package to try? >> Any thoughts and suggestions are much appreciated! >> >> # R 2.12.2 GUI 1.36 Leopard build 32-bit (5691); zoo 1.6-4 >> >> # Thanks! >> # -mark >> > From macrakis at alum.mit.edu Mon Apr 4 22:15:18 2011 From: macrakis at alum.mit.edu (Stavros Macrakis) Date: Mon, 4 Apr 2011 16:15:18 -0400 Subject: [R] General binary search? Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From rosyara at msu.edu Mon Apr 4 22:10:40 2011 From: rosyara at msu.edu (Umesh Rosyara) Date: Mon, 4 Apr 2011 16:10:40 -0400 Subject: [R] merging data list in to single data frame In-Reply-To: References: <002701cbf2d6$65405870$2fc10950$@edu> Message-ID: <006301cbf304$5f429950$1dc7cbf0$@edu> An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From geraldes at mail.ubc.ca Mon Apr 4 22:33:15 2011 From: geraldes at mail.ubc.ca (geral) Date: Mon, 4 Apr 2011 15:33:15 -0500 (CDT) Subject: [R] automating regression or correlations for many variables In-Reply-To: References: <1301937184597-3426091.post@n4.nabble.com> Message-ID: <1301949195112-3426519.post@n4.nabble.com> Thanks! I must confess I am just a beginner, but I followed your suggestion and did 'm <- lm(as.matrix(snp[, -1]) ~ lat, data = snp) ' and it worked perfectly. I would like to understand what is being done here. as.matrix I understand makes my data frame be a matrix, but I don't understand the part snp[,-1]. I think I am saying use all rows of... but I don't understand the -1, does that mean, all columns but column 1? I really appreciate the help! It was really really valuable. A related question is, what is m? If I do is.object(m) I get true. so I guess it is an object. I now need to use each of the fitted models for more things, such as calculate anova, etc... How do I do that? When I do this column by column I just do anova(m), but it does not seem to work now... Thanks! AG -- View this message in context: http://r.789695.n4.nabble.com/automating-regression-or-correlations-for-many-variables-tp3426091p3426519.html Sent from the R help mailing list archive at Nabble.com. From wdunlap at tibco.com Mon Apr 4 22:50:28 2011 From: wdunlap at tibco.com (William Dunlap) Date: Mon, 4 Apr 2011 13:50:28 -0700 Subject: [R] General binary search? In-Reply-To: References: Message-ID: <77EB52C6DD32BA4D87471DCD70C8D7000412520C@NA-PA-VBE03.na.tibco.com> > -----Original Message----- > From: r-help-bounces at r-project.org > [mailto:r-help-bounces at r-project.org] On Behalf Of Stavros Macrakis > Sent: Monday, April 04, 2011 1:15 PM > To: r-help > Subject: [R] General binary search? > > Is there a generic binary search routine in a standard library which > > a) works for character vectors > b) runs in O(log(N)) time? 
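The greatest-lower-bound lookup asked about above can be sketched in plain R. This is only an illustration, not a routine from a standard library: the name `glb_index` is hypothetical, and it assumes `tab` is already sorted in increasing order under the same collation that `>` uses in the current locale.

```r
# Hypothetical helper, not part of base R or any package.
# Returns the largest index i with tab[i] <= val, or 0 if there is none.
# Assumes 'tab' is sorted in increasing order under the current locale.
glb_index <- function(val, tab) {
  lo <- 1L
  hi <- length(tab)
  while (lo <= hi) {
    mid <- lo + (hi - lo) %/% 2L
    if (tab[mid] > val) {
      hi <- mid - 1L        # the answer lies strictly left of mid
    } else {
      lo <- mid + 1L        # tab[mid] <= val, so mid is a candidate
    }
  }
  hi                        # last position known to satisfy tab[i] <= val
}

# Made-up lookup table and query values.
tab <- sort(c("apple", "banana", "cherry", "damson"))
queries <- c("blueberry", "aardvark", "zucchini", "cherry")
idx <- vapply(queries, glb_index, integer(1), tab = tab)
```

Each lookup takes O(log(N)) comparisons, but the loop itself runs in interpreted R, so it may still be slower than a native routine for large batches of queries.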
>
> I'm aware of findInterval(x,vec), but it is restricted to
> numeric vectors.

xtfrm(x) will convert a character (or other) vector to
a numeric vector with the same ordering.  findInterval
can work on that.  E.g.,

  > f0 <- function(x, vec) {
      tmp <- xtfrm(c(x, vec))
      findInterval(tmp[seq_along(x)], tmp[-seq_along(x)])
    }
  > f0(c("Baby", "Aunt", "Dog"), LETTERS)
  [1] 2 1 4

I've never looked at its speed.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

>
> I'm also aware of various hashing solutions (e.g. new.env(hash=TRUE) and
> fastmatch), but I need the greatest-lower-bound match in my
> application.
>
> findInterval is also slow for large N=length(vec) because of the O(N)
> checking it does, as Duncan Murdoch has pointed out: though
> its documentation says it runs in O(n * log(N)), it actually
> runs in O(n * log(N) + N), which is quite noticeable for largish N.  But
> that is easy enough to work around by writing a variant of findInterval
> which calls find_interv_vec without checking.
>
>             -s
>
> PS Yes, binary search is a one-liner in R, but I always prefer to use
> standard, fast native libraries when possible....
>
> binarysearch <- function(val,tab,L,H) {while (H>=L) { M=L+(H-L) %/% 2; if
> (tab[M]>val) H<-M-1 else if (tab[M]<val) L<-M+1 else return(M)};
> return(L-1)}
>
> 	[[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

From skfglades at gmail.com Mon Apr 4 23:12:59 2011
From: skfglades at gmail.com (skfglades at gmail.com)
Date: Mon, 4 Apr 2011 17:12:59 -0400
Subject: [R] moving mean and moving variance functions
In-Reply-To:
References:
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From yee at post.harvard.edu Mon Apr 4 23:30:36 2011
From: yee at post.harvard.edu (Andrew Yee)
Date: Mon, 4 Apr 2011 17:30:36 -0400
Subject: [R] how to handle no lines in input with pipe()
Message-ID:

This has to do with using pipe() and grep and read.csv()

I have a .csv file that I grep using pipe() and read.csv() as follows:

read.csv(pipe('grep foo bar.csv'))

However, is there a way to have this command run when for example,
there is no "foo" text in the bar.csv file?  I get an error message
(appropriately):

Error in read.table(file = file, header = header, sep = sep, quote = quote,  :
  no lines available in input

Is there a way to "inspect" the output of pipe before passing it on to
read.csv()?

Thanks,
Andrew

From jwshaw at uic.edu Mon Apr 4 23:39:04 2011
From: jwshaw at uic.edu (James Warren Shaw)
Date: Mon, 4 Apr 2011 16:39:04 -0500
Subject: [R] AIC for robust regression
Message-ID: <003f01cbf310$b8b50b10$2a1f2130$@edu>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From ripley at stats.ox.ac.uk Mon Apr 4 23:46:14 2011
From: ripley at stats.ox.ac.uk (Prof Brian Ripley)
Date: Mon, 4 Apr 2011 22:46:14 +0100 (BST)
Subject: [R] how to handle no lines in input with pipe()
In-Reply-To:
References:
Message-ID:

On Mon, 4 Apr 2011, Andrew Yee wrote:

> This has to do with using pipe() and grep and read.csv()
>
> I have a .csv file that I grep using pipe() and read.csv() as follows:
>
> read.csv(pipe('grep foo bar.csv'))
>
> However, is there a way to have this command run when for example,
> there is no "foo" text in the bar.csv file?  I get an error message
> (appropriately):
>
> Error in read.table(file = file, header = header, sep = sep, quote = quote,  :
>  no lines available in input
>
> Is there a way to "inspect" the output of pipe before passing it on to
> read.csv()?

You have to read from a pipe to 'inspect' it.
So

tmp <- readLines(pipe('grep foo bar.csv'))
if (!length(tmp)) {
     ## do something else
} else {
     res <- read.csv(con <- textConnection(tmp))
     close(con)
}

OTOH, unless the file is enormous you could simply read it into R and
use grep(value = TRUE) on the character vector.

>
> Thanks,
> Andrew
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

From Michael.Folkes at dfo-mpo.gc.ca Tue Apr 5 00:05:21 2011
From: Michael.Folkes at dfo-mpo.gc.ca (Folkes, Michael)
Date: Mon, 4 Apr 2011 15:05:21 -0700
Subject: [R] RODBC excel - need to preserve (or extract) numeric column names
Message-ID: <63F107BCC37AEA49A75FD94AA3E07CB004AFD71A@pacpbsex01.pac.dfo-mpo.ca>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From hadley at rice.edu Tue Apr 5 00:10:44 2011
From: hadley at rice.edu (Hadley Wickham)
Date: Mon, 4 Apr 2011 17:10:44 -0500
Subject: [R] merging data list in to single data frame
In-Reply-To: <006301cbf304$5f429950$1dc7cbf0$@edu>
References: <002701cbf2d6$65405870$2fc10950$@edu> <006301cbf304$5f429950$1dc7cbf0$@edu>
Message-ID:

> filelist = list.files(pattern = "K*cd.txt") # the file names are K1cd.txt
> .................to K200cd.txt

It's very easy:

names(filelist) <- basename(filelist)
data_list <- ldply(filelist, read.table, header=T, comment=";", fill=T)

Hadley

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/

From dwinsemius at comcast.net Tue Apr 5 00:33:27 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Mon, 4 Apr 2011 18:33:27 -0400
Subject: [R] RODBC excel - need to preserve (or extract) numeric column names
In-Reply-To: <63F107BCC37AEA49A75FD94AA3E07CB004AFD71A@pacpbsex01.pac.dfo-mpo.ca>
References: <63F107BCC37AEA49A75FD94AA3E07CB004AFD71A@pacpbsex01.pac.dfo-mpo.ca>
Message-ID: <880A0AF0-AE12-4AC3-80D3-5AB5B3DF2F99@comcast.net>

On Apr 4, 2011, at 6:05 PM, Folkes, Michael wrote:

> I'm using RODBC to read an excel file (not mine!).  But I'm
> struggling to find a way to preserve the column names that have a
> numeric value.  sqlFetch() drops the value and calls them f1, f2,
> f3,... (ie field number).  this is a different approach from
> read.csv, which will append "V" prior to the numeric column name.

read.table() (and perhaps read.csv) has a check.names argument which
defaults to TRUE but can be set to FALSE. You will then need to take
special care with the result, since those are not "safe" column names.
Another way would be to read only one line in with readLines and then
assign to names(dfrm) which would be read in with `skip` = 1.

> sqlFetch isn't so helpful.
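A small sketch of the two read.table()/read.csv() suggestions above, using an in-memory CSV rather than an Excel sheet, so it says nothing about how RODBC's sqlFetch() behaves. The data and the object names (`csv_text`, `safe`, `raw`, `hdr`) are made up for illustration; with a real file the header line could be taken from readLines(file, n = 1) instead of `csv_text[1]`.

```r
# Made-up data: a header row whose column names are numbers.
csv_text <- c("2009,2010,2011", "1,2,3", "4,5,6")

# Default behaviour: check.names = TRUE rewrites the names via make.names().
safe <- read.csv(text = csv_text)

# Option 1: keep the header exactly as written.
raw <- read.csv(text = csv_text, check.names = FALSE)

# Option 2: take the header line separately, then read the body with skip = 1.
hdr  <- strsplit(csv_text[1], ",", fixed = TRUE)[[1]]
dfrm <- read.csv(text = csv_text, header = FALSE, skip = 1)
names(dfrm) <- hdr

# Non-syntactic names need [[ ]] or backticks when they are used.
col_2010 <- raw[["2010"]]
col_2011 <- dfrm$`2011`
```

With the default settings the numeric names come back with an "X" prefix (for example "X2009"), which is what make.names() produces for names that start with a digit.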
> > Is there a way to get the first line of data from the excel file and > place it in a vector? Perhaps I can use that method and rename the > dataframe column names later? > > thanks! > Michael -- David Winsemius, MD West Hartford, CT From djmuser at gmail.com Tue Apr 5 00:42:13 2011 From: djmuser at gmail.com (Dennis Murphy) Date: Mon, 4 Apr 2011 15:42:13 -0700 Subject: [R] automating regression or correlations for many variables In-Reply-To: <1301949195112-3426519.post@n4.nabble.com> References: <1301937184597-3426091.post@n4.nabble.com> <1301949195112-3426519.post@n4.nabble.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From smithlaura937 at gmail.com Tue Apr 5 00:48:28 2011 From: smithlaura937 at gmail.com (Laura Smith) Date: Mon, 4 Apr 2011 17:48:28 -0500 Subject: [R] PERT/CPM on R Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From axel.urbiz at gmail.com Tue Apr 5 00:57:48 2011 From: axel.urbiz at gmail.com (Axel Urbiz) Date: Mon, 4 Apr 2011 18:57:48 -0400 Subject: [R] Help in sub-setting a List Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From smckinney at bccrc.ca Tue Apr 5 01:10:25 2011 From: smckinney at bccrc.ca (Steven McKinney) Date: Mon, 4 Apr 2011 16:10:25 -0700 Subject: [R] Linear Model with curve fitting parameter? In-Reply-To: References: <8512_1301611104_1301611104_AANLkTikOxmcE=oMvHBuB8x61fxXJwnexXJkr+Qp3Tawp@mail.gmail.com> Message-ID: > -----Original Message----- > From: stephen sefick [mailto:ssefick at gmail.com] > Sent: April-04-11 2:49 PM > To: Steven McKinney > Subject: Re: [R] Linear Model with curve fitting parameter? > > Steven: > > I am really sorry for my confusion. I hope this now makes sense. > > b0 == y intercept == y-intercept == (intercept) fit by lm > > a <- 1:10 > b <- 1:10 > > summary(lm(a~b)) > #to show what I was calling b0 > > So... 
>
> ################################################
> manning
>
> Q = K*A*(R^b2)*(S^b3)
>
> log(Q) = log(K)+log(A)+(b2*log(R))+(b3*log(S))

Okay, using this notation, this appears to be the original
model you queried about.  So for this model, as I showed
before,

Let Z = log(Q) - log(A)

E(Z) = b0     + b2*log(R) + b3*log(S)
     = log(K) + b2*log(R) + b3*log(S)

Fitting the model  lm(Z ~ log(R) + log(S))
will yield parameter estimates b_hat_0, b_hat_2, b_hat_3
where
b_hat_0 (the fitted model intercept) is an estimate of b0 (which is log(K)),
b_hat_2 is an estimate of b2,
b_hat_3 is an estimate of b3.

So in answer to your previous question, b0 is
log(K) (estimated by b_hat_0), not ( log(Qintercept)+log(K) ),
so an estimate for K is exp(b_hat_0)

>
> ################################################
> dingman
> Q = K*(A^b1)*(R^b2)*(S^b3*log(S))
>
> log(Q) = log(K)+(b1*log(A))+(b2*log(R))+(b3*(log(S))^2)

The dingman model notation is ambiguous.  Is the last
term  S^(b3*log(S))  or  (S^b3)*log(S) ?

Previous email showed

  > dingman
  > log(Q)=log(b0)+log(K)+a*log(A)+r*log(R)+s*(log(S))^2

which implies (if I ignore the log(b0) term)
 Q = K*(A^a)*(R^r)*(exp(log(S)*log(S))^s)
   = K*(A^a)*(R^r)*(S^(log(S)*s))

This is linearizable as

log(Q) = log(K) + a*log(A)  + r*log(R)  + s*(log(S))^2
       = b0     + b1*log(A) + b2*log(R) + b3*(log(S)^2)

Fitting lm(log(Q) ~ log(A) + log(R) + I(log(S)^2) ... )
will yield estimates b_hat_0, b_hat_1, b_hat_2 and b_hat_3
where b_hat_0 is an estimate of b0 = log(K) so an estimate of K is exp(b_hat_0),
b_hat_1 is an estimate of b1 = a,
b_hat_2 is an estimate of b2 = r,
b_hat_3 is an estimate of b3 = s

>
> ################################################
>
> Bjerklie
>
> Q = K*(A^b1)*(R^b2)*(S^b3)
>
> log(Q) = log(K)+(b1*log(A))+(b2*log(R))+(b3*log(S))

Fitting lm(log(Q) ~ log(A) + log(R) + log(S) ...
) will yield estimates b_hat_0, b_hat_1, b_hat_2 and b_hat_3 where b_hat_0 is an estimate of b0 = log(K) so an estimate of K is exp(b_hat_0), b_hat_1 is an estimate of b1 = a, b_hat_2 is an estimate of b2 = r, b_hat_3 is an estimate of b3 = s Best Steve McKinney > > ################################################ > > > > > > On Mon, Apr 4, 2011 at 2:58 PM, Steven McKinney wrote: > > > >> -----Original Message----- > >> From: stephen sefick [mailto:ssefick at gmail.com] > >> Sent: April-03-11 5:35 PM > >> To: Steven McKinney > >> Cc: R help > >> Subject: Re: [R] Linear Model with curve fitting parameter? > >> > >> Steven: > >> > >> You are exactly right sorry I was confused. > >> > >> > >> ####################################################### > >> so log(y-intercept)+log(K) is a constant called b0 (is this right?) > > > > Doesn't look right to me based on the information you've provided. > > I don't see anything labeled "y" in your previous emails, so I'm > > not clear on what y is and how it relates to the original model > > you described > > > > > >> I have a model Q=K*A*(R^r)*(S^s) > > > >> > > > >> A, R, and S are data I have and K is a curve fitting parameter. > > > > If the model is > > > > Q=K*A*(R^r)*(S^s) > > > > then > > > > log(Q) = log(K) + log(A) + r*log(R) + s*log(S) > > > > Rearranging yields > > > > log(Q) - log(A) = log(K) + r*log(R) + s*log(S) > > > > Let Z = log(Q) - log(A) = log(Q/A) > > > > so > > > > Z = log(K) + r*log(R) + s*log(S) > > > > and a linear model fit of > > > > Z ~ log(R) + log(S) > > > > will yield parameter estimates for the linear equation > > > > E(Z) = B0 + B1*log(R) + B2*log(S) > > > > (E(Z) = expected value of Z) > > > > so B0 estimate is an estimate of log(K) > > B1 estimate is an estimate of r > > B2 estimate is an estimate of s > > > > and these are the only parameters you described in the original model. 
> > > > > >> > >> lm(log(Q)~log(A)+log(R)+log(S)-1) > >> > >> is fitting the model > >> > >> log(Q)=a*log(A)+r*log(R)+s*log(S) (no beta 0) > >> > >> and > >> > >> lm(log(Q)~log(A)+log(R)+log(S)) > >> > >> > >> is fitting the model > >> > >> log(Q)=b0+a*log(A)+r*log(R)+s*log(S) > > > > K has disappeared from these equations so these model fits do > > not correspond to the model originally described. Now a b0 > > appears, and is used in models below. I think changing notation > > is also adding confusion. What are "y" and "intercept" you > > discuss above, in relation to your original notation? > > > >> > >> ###################################################### > >> > >> These are the models I am trying to fit and if I have reasoned > >> correctly above then I should be able to fit the below models > >> similarly. > > > > You will be able to fit models appropriately once you have a > > clearly defined system of notation that allows you to map between > > the proposed data model, the parameters in that model, and the > > corresponding regression equations. > > > > Once you have consistent notation, you will be able to see > > if you can express your model as a linear regression, or > > if not, what kind of non-linear regression you will need to > > do to get estimates for the parameters in your model. > > > > Best > > > > Steve McKinney > > > >> > >> manning > >> log(Q)=log(b0)+log(K)+log(A)+r*log(R)+s*log(S) > >> > >> dingman > >> log(Q)=log(b0)+log(K)+a*log(A)+r*log(R)+s*(log(S))^2 > >> > >> bjerklie > >> log(Q)=log(b0)+log(K)+a*log(A)+r*log(R)+s*log(S) > >> > >> ####################################################### > >> > >> Thank you for all of your help! 
> >> > >> Stephen > >> > >> On Fri, Apr 1, 2011 at 2:44 PM, Steven McKinney wrote: > >> > > >> >> -----Original Message----- > >> >> From: stephen sefick [mailto:ssefick at gmail.com] > >> >> Sent: April-01-11 5:44 AM > >> >> To: Steven McKinney > >> >> Cc: R help > >> >> Subject: Re: [R] Linear Model with curve fitting parameter? > >> >> > >> >> Setting Z=Q-A would be the incorrect dimensions. I could Z=Q/A. > >> > > >> > I suspect this is confusion about what Q is. I was presuming that > >> > the Q in this following formula was log(Q) with Q from the original data. > >> > > >> >> >> I have taken the log of the data that I have and this is the model > >> >> >> formula without the K part > >> >> >> > >> >> >> lm(Q~offset(A)+R+S, data=x) > >> > > >> > If the model is > >> > > >> > Q=K*A*(R^r)*(S^s) > >> > > >> > then > >> > > >> > log(Q) = log(K) + log(A) + r*log(R) + s*log(S) > >> > > >> > Rearranging yields > >> > > >> > log(Q) - log(A) = log(K) + r*log(R) + s*log(S) > >> > > >> > so what I labeled 'Z' below is > >> > > >> > Z = log(Q) - log(A) = log(Q/A) > >> > > >> > so > >> > > >> > Z = log(K) + r*log(R) + s*log(S) > >> > > >> > and a linear model fit of > >> > > >> > Z ~ log(R) + log(S) > >> > > >> > will yield parameter estimates for the linear equation > >> > > >> > E(Z) = B0 + B1*log(R) + B2*log(S) > >> > > >> > (E(Z) = expected value of Z) > >> > > >> > so B0 estimate is an estimate of log(K) > >> > B1 estimate is an estimate of r > >> > B2 estimate is an estimate of s > >> > > >> > More details and careful notation will eventually lead > >> > to a reasonable description and analysis strategy. > >> > > >> > > >> > Best > >> > > >> > Steve McKinney > >> > > >> > > >> > > >> >> Is fitting a nls model the same as fitting an ols? These data are > >> >> hydraulic data from ~47 sites. To access predictive ability I am > >> >> removing one site fitting a new model and then accessing the fit with > >> >> a myriad of model assessment criteria. 
I should get the same answer > >> >> with ols vs nls? Thank you for all of your help. > >> >> > >> >> Stephen > >> >> > >> >> On Thu, Mar 31, 2011 at 8:34 PM, Steven McKinney wrote: > >> >> > > >> >> >> -----Original Message----- > >> >> >> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of stephen > >> >> sefick > >> >> >> Sent: March-31-11 3:38 PM > >> >> >> To: R help > >> >> >> Subject: [R] Linear Model with curve fitting parameter? > >> >> >> > >> >> >> I have a model Q=K*A*(R^r)*(S^s) > >> >> >> > >> >> >> A, R, and S are data I have and K is a curve fitting parameter. I > >> >> >> have linearized as > >> >> >> > >> >> >> log(Q)=log(K)+log(A)+r*log(R)+s*log(S) > >> >> >> > >> >> >> I have taken the log of the data that I have and this is the model > >> >> >> formula without the K part > >> >> >> > >> >> >> lm(Q~offset(A)+R+S, data=x) > >> >> >> > >> >> >> What is the formula that I should use? > >> >> > > >> >> > Let Z = Q - A for your logged data. > >> >> > > >> >> > Fitting lm(Z ~ R + S, data = x) should yield > >> >> > intercept parameter estimate = estimate for log(K) > >> >> > R coefficient parameter estimate = estimate for r > >> >> > S coefficient parameter estimate = estimate for s > >> >> > > >> >> > > >> >> > > >> >> > Steven McKinney > >> >> > > >> >> > Statistician > >> >> > Molecular Oncology and Breast Cancer Program > >> >> > British Columbia Cancer Research Centre > >> >> > > >> >> > > >> >> > > >> >> >> > >> >> >> Thanks for all of your help. I can provide a subset of data if necessary. 
> >> >> >> > >> >> >> > >> >> >> > >> >> >> -- > >> >> >> Stephen Sefick > >> >> >> ____________________________________ > >> >> >> | Auburn University | > >> >> >> | Biological Sciences | > >> >> >> | 331 Funchess Hall | > >> >> >> | Auburn, Alabama | > >> >> >> | 36849 | > >> >> >> |___________________________________| > >> >> >> | sas0025 at auburn.edu | > >> >> >> | http://www.auburn.edu/~sas0025 | > >> >> >> |___________________________________| > >> >> >> > >> >> >> Let's not spend our time and resources thinking about things that are > >> >> >> so little or so large that all they really do for us is puff us up and > >> >> >> make us feel like gods. We are mammals, and have not exhausted the > >> >> >> annoying little problems of being mammals. > >> >> >> > >> >> >> -K. Mullis > >> >> >> > >> >> >> "A big computer, a complex algorithm and a long time does not equal science." > >> >> >> > >> >> >> -Robert Gentleman > >> >> >> ______________________________________________ > >> >> >> R-help at r-project.org mailing list > >> >> >> https://stat.ethz.ch/mailman/listinfo/r-help > >> >> >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > >> >> >> and provide commented, minimal, self-contained, reproducible code. > >> >> > > >> >> > >> >> > >> >> > >> >> -- > >> >> Stephen Sefick > >> >> ____________________________________ > >> >> | Auburn University | > >> >> | Biological Sciences | > >> >> | 331 Funchess Hall | > >> >> | Auburn, Alabama | > >> >> | 36849 | > >> >> |___________________________________| > >> >> | sas0025 at auburn.edu | > >> >> | http://www.auburn.edu/~sas0025 | > >> >> |___________________________________| > >> >> > >> >> Let's not spend our time and resources thinking about things that are > >> >> so little or so large that all they really do for us is puff us up and > >> >> make us feel like gods. We are mammals, and have not exhausted the > >> >> annoying little problems of being mammals. > >> >> > >> >> -K. 
Mullis > >> >> > >> >> "A big computer, a complex algorithm and a long time does not equal science." > >> >> > >> >> -Robert Gentleman > >> > > >> > >> > >> > >> -- > >> Stephen Sefick > >> ____________________________________ > >> | Auburn University | > >> | Biological Sciences | > >> | 331 Funchess Hall | > >> | Auburn, Alabama | > >> | 36849 | > >> |___________________________________| > >> | sas0025 at auburn.edu | > >> | http://www.auburn.edu/~sas0025 | > >> |___________________________________| > >> > >> Let's not spend our time and resources thinking about things that are > >> so little or so large that all they really do for us is puff us up and > >> make us feel like gods. We are mammals, and have not exhausted the > >> annoying little problems of being mammals. > >> > >> -K. Mullis > >> > >> "A big computer, a complex algorithm and a long time does not equal science." > >> > >> -Robert Gentleman > > > > > > -- > Stephen Sefick > ____________________________________ > | Auburn University | > | Biological Sciences | > | 331 Funchess Hall | > | Auburn, Alabama | > | 36849 | > |___________________________________| > | sas0025 at auburn.edu | > | http://www.auburn.edu/~sas0025 | > |___________________________________| > > Let's not spend our time and resources thinking about things that are > so little or so large that all they really do for us is puff us up and > make us feel like gods. We are mammals, and have not exhausted the > annoying little problems of being mammals. > > -K. Mullis > > "A big computer, a complex algorithm and a long time does not equal science." 
>
> -Robert Gentleman

From Soren.Hojsgaard at agrsci.dk Tue Apr 5 01:29:37 2011
From: Soren.Hojsgaard at agrsci.dk (Søren Højsgaard)
Date: Tue, 5 Apr 2011 01:29:37 +0200
Subject: [R] Assigning a class attribute to a list or vector slows "[" down
Message-ID: <9F0721FDD4F12D4B95AD894274F388EC020C63F0FB5C@DJFEXMBX01.djf.agrsci.dk>

Dear list,

I've noticed that if a list or a vector is given a class (by class(x) <- "something") then the "[" selection operator slows down - quite a bit. For example:

> lll <- as.list(letters)
> system.time({for(ii in 1:200000)lll[-(1:4)]})
   user  system elapsed
   0.48    0.00    0.49
>
> class(lll) <- "foo"
> system.time({for(ii in 1:200000)lll[-(1:4)]})
   user  system elapsed
   2.57    0.00    2.58
>
> vvv <- 1:100
> system.time({for(ii in 1:200000)vvv[-(1:4)]})
   user  system elapsed
   0.71    0.00    0.72
>
> class(vvv) <- "foo"
> system.time({for(ii in 1:200000)vvv[-(1:4)]})
   user  system elapsed
   2.85    0.00    2.87

I guess that what happens is that R looks for a "["-method for "foo" objects and when such a method is not found, a default "["-method is called? Is that so?

What should one do to avoid such a slowdown when wanting to select elements from a list or a vector with a class? Using unclass is one option:

> class(vvv) <- "foo"
> system.time({for(ii in 1:200000)unclass(vvv)[-(1:4)]})
   user  system elapsed
   0.94    0.00    0.94

Are there better ways?

Best regards
Søren

PS: I am using R 2.12.2 on Windows 7.

From ssefick at gmail.com Tue Apr 5 01:50:34 2011
From: ssefick at gmail.com (stephen sefick)
Date: Mon, 4 Apr 2011 18:50:34 -0500
Subject: [R] Linear Model with curve fitting parameter?
In-Reply-To:
References: <8512_1301611104_1301611104_AANLkTikOxmcE=oMvHBuB8x61fxXJwnexXJkr+Qp3Tawp@mail.gmail.com>
Message-ID:

Thank you very much for all of your help.
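For readers following this thread, here is a minimal sketch of the Manning-form fit, Q = K*A*(R^b2)*(S^b3), on simulated data. The "true" values (K = 2.5, b2 = 0.67, b3 = 0.5), the ranges of A, R and S, and the noise level are all invented for illustration, and the multiplicative log-normal error is an assumption that the thread itself does not state.

```r
set.seed(1)
n <- 47                                   # roughly the number of sites mentioned
A <- runif(n, 1, 50)
R <- runif(n, 0.2, 3)
S <- runif(n, 0.0005, 0.02)
K <- 2.5; b2 <- 0.67; b3 <- 0.5           # invented "true" values
Q <- K * A * R^b2 * S^b3 * exp(rnorm(n, sd = 0.05))
dat <- data.frame(Q, A, R, S)

# Form 1: move log(A) to the left-hand side, i.e. Z = log(Q) - log(A).
fit_z <- lm(log(Q / A) ~ log(R) + log(S), data = dat)

# Form 2: keep log(Q) as the response and fix the coefficient of log(A) at 1.
fit_off <- lm(log(Q) ~ offset(log(A)) + log(R) + log(S), data = dat)

# The intercept estimates log(K), so back-transform it to get an estimate of K.
K_hat <- exp(coef(fit_z)[["(Intercept)"]])
```

The two formulas should give the same coefficients, since an offset with coefficient 1 is equivalent to subtracting log(A) from the response. Note that the offset is applied to log(A) computed inside the formula, whereas the formula in the original question used data that had already been logged.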
On Mon, Apr 4, 2011 at 6:10 PM, Steven McKinney wrote: > > >> -----Original Message----- >> From: stephen sefick [mailto:ssefick at gmail.com] >> Sent: April-04-11 2:49 PM >> To: Steven McKinney >> Subject: Re: [R] Linear Model with curve fitting parameter? >> >> Steven: >> >> I am really sorry for my confusion. ?I hope this now makes sense. >> >> b0 == y intercept == y-intercept == (intercept) fit by lm >> >> a <- 1:10 >> b <- 1:10 >> >> summary(lm(a~b)) >> #to show what I was calling b0 >> >> So... >> >> ################################################ >> manning >> >> Q = K*A*(R^b2)*(S^b3) >> >> log(Q) = log(K)+log(A)+(b2*log(R))+(b3*log(S)) > > Okay, using this notation, this appears to be the original > model you queried about. ?So for this model, as I showed > before, > > Let Z = log(Q) - log(A) > > E(Z) = b0 ? ? + b2*log(R) + b3*log(S) > ? ? = log(K) + b2*log(R) + b3*log(S) > > Fitting the model ?lm(Z ~ log(R) + log(S)) > will yield parameter estimates b_hat_0, b_hat_2, b_hat_3 > where > b_hat_0 (the fitted model intercept) is an estimate of b0 (which is log(K)), > b_hat_2 is an estimate of b2, > b_hat_3 is an estimate of b3. > > So in answer to your previous question, b0 is an > estimate of log(K), not ( log(Qintercept)+log(K) ) > so an estimate for K is exp(b_hat_0) > > >> >> ################################################ >> dingman >> Q = K*(A^b1)*(R^b2)*(S^b3*log(S)) >> >> log(Q) = log(K)+(b1*log(A))+(b2*log(R))+(b3*(log(S))^2) > > The dingman model notation is ambiguous. ?Is the last > term ?S^(b3*log(S)) ?or ?(S^b3)*log(S) ? > > Previous email showed > > ? > dingman > ? > log(Q)=log(b0)+log(K)+a*log(A)+r*log(R)+s*(log(S))^2 > > which implies (if I ignore the log(b0) term) > ?Q = K*(A^a)*(R^r)*(exp(log(S)*log(S))^s) > ? ?= K*(A^a)*(R^r)*(S^(log(S)*s)) > > This is linearizable as > > log(Q) = log(K) + a*log(A) + r*log(R) + s*(log(S))^2 > ? ? ? = b0 ? ? + b1*log(A) + b2*log(R) + b3*(log(S)^2) > > Fitting lm(log(Q) ~ log(A) + log(R) + I(log(S)^2) ... 
) > will yield estimates b_hat_0, b_hat_1, b_hat_2 and b_hat_3 > where b_hat_0 is an estimate of b0 = log(K) so an estimate of K is exp(b_hat_0), > b_hat_1 is an estimate of b1 = a, > b_hat_2 is an estimate of b2 = r, > b_hat_3 is an estimate of b3 = s > > > >> >> ################################################ >> >> Bjerklie >> >> Q = K*(A^b1)*(R^b2)*(S^b3) >> >> log(Q) = log(K)+(b1*log(A))+(b2*log(R))*(b3*log(S)) > > Fitting lm(log(Q) ~ log(A) + log(R) + log(S) ... ) > will yield estimates b_hat_0, b_hat_1, b_hat_2 and b_hat_3 > where b_hat_0 is an estimate of b0 = log(K) so an estimate of K is exp(b_hat_0), > b_hat_1 is an estimate of b1 = a, > b_hat_2 is an estimate of b2 = r, > b_hat_3 is an estimate of b3 = s > > > Best > > Steve McKinney > >> >> ################################################ >> >> >> >> >> >> On Mon, Apr 4, 2011 at 2:58 PM, Steven McKinney wrote: >> > >> >> -----Original Message----- >> >> From: stephen sefick [mailto:ssefick at gmail.com] >> >> Sent: April-03-11 5:35 PM >> >> To: Steven McKinney >> >> Cc: R help >> >> Subject: Re: [R] Linear Model with curve fitting parameter? >> >> >> >> Steven: >> >> >> >> You are exactly right sorry I was confused. >> >> >> >> >> >> ####################################################### >> >> so log(y-intercept)+log(K) is a constant called b0 (is this right?) >> > >> > Doesn't look right to me based on the information you've provided. >> > I don't see anything labeled "y" in your previous emails, so I'm >> > not clear on what y is and how it relates to the original model >> > you described >> > >> > ? > >> I have a model Q=K*A*(R^r)*(S^s) >> > ? > >> >> > ? > >> A, R, and S are data I have and K is a curve fitting parameter. >> > >> > If the model is >> > >> > ? Q=K*A*(R^r)*(S^s) >> > >> > then >> > >> > ? log(Q) = log(K) + log(A) + r*log(R) + s*log(S) >> > >> > Rearranging yields >> > >> > ? 
log(Q) - log(A) = log(K) + r*log(R) + s*log(S) >> > >> > Let ?Z = log(Q) - log(A) = log(Q/A) >> > >> > so >> > >> > ? Z = log(K) + r*log(R) + s*log(S) >> > >> > and a linear model fit of >> > >> > ? Z ~ log(R) + log(S) >> > >> > will yield parameter estimates for the linear equation >> > >> > ? E(Z) = B0 + B1*log(R) + B2*log(S) >> > >> > (E(Z) = expected value of Z) >> > >> > so B0 estimate is an estimate of log(K) >> > ? B1 estimate is an estimate of r >> > ? B2 estimate is an estimate of s >> > >> > and these are the only parameters you described in the original model. >> > >> > >> >> >> >> lm(log(Q)~log(A)+log(R)+log(S)-1) >> >> >> >> is fitting the model >> >> >> >> log(Q)=a*log(A)+r*log(R)+s*log(S) (no beta 0) >> >> >> >> and >> >> >> >> lm(log(Q)~log(A)+log(R)+log(S)) >> >> >> >> >> >> is fitting the model >> >> >> >> log(Q)=b0+a*log(A)+r*log(R)+s*log(S) >> > >> > K has disappeared from these equations so these model fits do >> > not correspond to the model originally described. ?Now a b0 >> > appears, and is used in models below. ?I think changing notation >> > is also adding confusion. ?What are "y" and "intercept" you >> > discuss above, in relation to your original notation? >> > >> >> >> >> ###################################################### >> >> >> >> These are the models I am trying to fit and if I have reasoned >> >> correctly above then I should be able to fit the below models >> >> similarly. >> > >> > You will be able to fit models appropriately once you have a >> > clearly defined system of notation that allows you to map between >> > the proposed data model, the parameters in that model, and the >> > corresponding regression equations. >> > >> > Once you have consistent notation, you will be able to see >> > if you can express your model as a linear regression, or >> > if not, what kind of non-linear regression you will need to >> > do to get estimates for the parameters in your model. 
>> >
>> > Best
>> >
>> > Steve McKinney
>> >
>> >> manning
>> >> log(Q)=log(b0)+log(K)+log(A)+r*log(R)+s*log(S)
>> >>
>> >> dingman
>> >> log(Q)=log(b0)+log(K)+a*log(A)+r*log(R)+s*(log(S))^2
>> >>
>> >> bjerklie
>> >> log(Q)=log(b0)+log(K)+a*log(A)+r*log(R)+s*log(S)
>> >>
>> >> #######################################################
>> >>
>> >> Thank you for all of your help!
>> >>
>> >> Stephen
>> >>
>> >> On Fri, Apr 1, 2011 at 2:44 PM, Steven McKinney wrote:
>> >> >
>> >> >> -----Original Message-----
>> >> >> From: stephen sefick [mailto:ssefick at gmail.com]
>> >> >> Sent: April-01-11 5:44 AM
>> >> >> To: Steven McKinney
>> >> >> Cc: R help
>> >> >> Subject: Re: [R] Linear Model with curve fitting parameter?
>> >> >>
>> >> >> Setting Z=Q-A would give the incorrect dimensions.  I could set Z=Q/A.
>> >> >
>> >> > I suspect this is confusion about what Q is.  I was presuming that
>> >> > the Q in this following formula was log(Q) with Q from the original data.
>> >> >
>> >> >> >> I have taken the log of the data that I have and this is the model
>> >> >> >> formula without the K part
>> >> >> >>
>> >> >> >> lm(Q~offset(A)+R+S, data=x)
>> >> >
>> >> > If the model is
>> >> >
>> >> >   Q=K*A*(R^r)*(S^s)
>> >> >
>> >> > then
>> >> >
>> >> >   log(Q) = log(K) + log(A) + r*log(R) + s*log(S)
>> >> >
>> >> > Rearranging yields
>> >> >
>> >> >   log(Q) - log(A) = log(K) + r*log(R) + s*log(S)
>> >> >
>> >> > so what I labeled 'Z' below is
>> >> >
>> >> >   Z = log(Q) - log(A) = log(Q/A)
>> >> >
>> >> > so
>> >> >
>> >> >   Z = log(K) + r*log(R) + s*log(S)
>> >> >
>> >> > and a linear model fit of
>> >> >
>> >> >   Z ~ log(R) + log(S)
>> >> >
>> >> > will yield parameter estimates for the linear equation
>> >> >
>> >> >   E(Z) = B0 + B1*log(R) + B2*log(S)
>> >> >
>> >> > (E(Z) = expected value of Z)
>> >> >
>> >> > so B0 estimate is an estimate of log(K)
>> >> >    B1 estimate is an estimate of r
>> >> >    B2 estimate is an estimate of s
>> >> >
>> >> > More details and careful notation will eventually lead
>> >> > to a reasonable description and analysis strategy.
>> >> >
>> >> > Best
>> >> >
>> >> > Steve McKinney
>> >> >
>> >> >> Is fitting an nls model the same as fitting an ols model?  These data are
>> >> >> hydraulic data from ~47 sites.  To assess predictive ability I am
>> >> >> removing one site, fitting a new model, and then assessing the fit with
>> >> >> a myriad of model assessment criteria.  Should I get the same answer
>> >> >> with ols vs nls?  Thank you for all of your help.
>> >> >>
>> >> >> Stephen
>> >> >>
>> >> >> On Thu, Mar 31, 2011 at 8:34 PM, Steven McKinney wrote:
>> >> >> >
>> >> >> >> -----Original Message-----
>> >> >> >> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of stephen sefick
>> >> >> >> Sent: March-31-11 3:38 PM
>> >> >> >> To: R help
>> >> >> >> Subject: [R] Linear Model with curve fitting parameter?
>> >> >> >>
>> >> >> >> I have a model Q=K*A*(R^r)*(S^s)
>> >> >> >>
>> >> >> >> A, R, and S are data I have and K is a curve fitting parameter.  I
>> >> >> >> have linearized as
>> >> >> >>
>> >> >> >> log(Q)=log(K)+log(A)+r*log(R)+s*log(S)
>> >> >> >>
>> >> >> >> I have taken the log of the data that I have and this is the model
>> >> >> >> formula without the K part
>> >> >> >>
>> >> >> >> lm(Q~offset(A)+R+S, data=x)
>> >> >> >>
>> >> >> >> What is the formula that I should use?
>> >> >> >
>> >> >> > Let Z = Q - A for your logged data.
>> >> >> >
>> >> >> > Fitting lm(Z ~ R + S, data = x) should yield
>> >> >> > intercept parameter estimate = estimate for log(K)
>> >> >> > R coefficient parameter estimate = estimate for r
>> >> >> > S coefficient parameter estimate = estimate for s
>> >> >> >
>> >> >> > Steven McKinney
>> >> >> >
>> >> >> > Statistician
>> >> >> > Molecular Oncology and Breast Cancer Program
>> >> >> > British Columbia Cancer Research Centre
>> >> >> >
>> >> >> >> Thanks for all of your help.  I can provide a subset of data if necessary.
>> >> >> >>
>> >> >> >> --
>> >> >> >> Stephen Sefick
>

--
Stephen Sefick

From geraldes at mail.ubc.ca Tue Apr 5 01:53:30 2011
From: geraldes at mail.ubc.ca (geral)
Date: Mon, 4 Apr 2011 18:53:30 -0500 (CDT)
Subject: [R] automating regression or correlations for many variables
In-Reply-To:
References: <1301937184597-3426091.post@n4.nabble.com> <1301949195112-3426519.post@n4.nabble.com>
Message-ID: <1301961210499-3426887.post@n4.nabble.com>

Thanks! You are awesome! I am not sure I follow everything, but I am trying!

AG

--
View this message in context: http://r.789695.n4.nabble.com/automating-regression-or-correlations-for-many-variables-tp3426091p3426887.html
Sent from the R help mailing list archive at Nabble.com.

From d.scott at auckland.ac.nz Tue Apr 5 02:42:00 2011
From: d.scott at auckland.ac.nz (David Scott)
Date: Tue, 5 Apr 2011 12:42:00 +1200
Subject: [R] lattice: how to "center" a subtitle?
In-Reply-To:
References: <57C6A6F6-4CB1-49B2-A0DA-4480EFD385A3@comcast.net>
Message-ID: <4D9A6558.4050405@auckland.ac.nz>

On 05/04/11 05:58, David Winsemius wrote:
> On Apr 4, 2011, at 1:27 PM, Marius Hofert wrote:
>
>> Dear David,
>>
>> do you know how to get plotmath-like symbols in both rows?
>> I tried s.th.
like:
>>
>> lab<- expression(paste(alpha==1, ", ", beta==2, sep=""))
>> xlab<- substitute(expression( atop(lab==lab., bold(foo)) ),
>> list(lab.=lab))
>> xyplot(0 ~ 0, xlab = xlab)
> I _did_ have plotmath functions in both rows: But here is your solution:
>
> xyplot(0 ~ 0, xlab =
> expression( atop(paste(alpha==1, " ", beta==2), bold(bla) ))
> )
>
> Note that `paste` in plotmath is different than `paste` in regular R.
> It has no `sep` argument. I did try both substitute and bquote on your
> external expression, but lattice seems to be doing some
> non-standard evaluation and I never got it to "work". Using what I thought
> _should_ work, does work with `plot`:
>
> > x=1;y=2
> > plot(0 ~ 0, xlab = bquote( atop(alpha==.(x)*","~beta==.(y),
> bold(foo) ) )
> + )
>
> But the same expression throws an error with xyplot:
>
> > x=1;y=2
> > xyplot(0 ~ 0, xlab = bquote( atop(alpha==.(x)*","~beta==.(y),
> bold(foo) ) )
> + )
> Error in trellis.skeleton(formula = 0 ~ 0, cond = list(1L), aspect =
> "fill", :
> could not find function "atop"
>

I am not sure where I read it and I can't find it again, but my understanding is that expressions using bquote with lattice need to be enclosed in as.expression() to work. That is in contrast to what happens in base graphics.

Here is a simple example.

a <- 2
plot(1:10, a*(1:10), main = bquote(alpha == .(a)))
require(lattice)
xyplot(a*(1:10)~ 1:10, main = bquote(alpha == .(a)))
xyplot(a*(1:10)~ 1:10, main = as.expression(bquote(alpha == .(a))))

Which produces:

> a <- 2
> plot(1:10, a*(1:10), main = bquote(alpha == .(a)))
> require(lattice)
Loading required package: lattice
> xyplot(a*(1:10)~ 1:10, main = bquote(alpha == .(a)))
Error in trellis.skeleton(formula = a * (1:10) ~ 1:10, cond = list(c(1L, :
  object 'alpha' not found
> xyplot(a*(1:10)~ 1:10, main = as.expression(bquote(alpha == .(a))))

Using expression() rather than as.expression() doesn't produce the desired effect. Try it yourself.

As to why this is the case .....
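
[Editorial sketch, not part of the original thread.] One possible explanation, offered as an assumption and not checked against lattice's source: `bquote()` returns a bare call, and a bare call placed in an argument list handed to `do.call()` is evaluated when the argument is used, whereas an expression vector is passed through untouched. The base-R sketch below only illustrates that general behaviour. `show_label` is a hypothetical stand-in for a plotting function, and no claim is made that `trellis.skeleton` works this way.

```r
a <- 2
lab_call <- bquote(alpha == .(a))      # an unevaluated call: alpha == 2
lab_expr <- as.expression(lab_call)    # the same call wrapped in an expression vector

print(class(lab_call))                 # "call"
print(class(lab_expr))                 # "expression"

# Hypothetical stand-in for a function that receives its label via do.call()
show_label <- function(main) class(main)

# The bare call is evaluated when 'main' is touched, so R looks for an
# object named 'alpha' and fails; the error message is captured here.
res_call <- tryCatch(do.call(show_label, list(main = lab_call)),
                     error = function(e) conditionMessage(e))
print(res_call)

# The expression vector survives the trip and arrives as a label object.
res_expr <- do.call(show_label, list(main = lab_expr))
print(res_expr)
```

If this is indeed the mechanism, it would be consistent with the "object 'alpha' not found" message in the transcript above, but that remains a guess.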
David Scott -- _________________________________________________________________ David Scott Department of Statistics The University of Auckland, PB 92019 Auckland 1142, NEW ZEALAND Phone: +64 9 923 5055, or +64 9 373 7599 ext 85055 Email: d.scott at auckland.ac.nz, Fax: +64 9 373 7018 Director of Consulting, Department of Statistics From dwinsemius at comcast.net Tue Apr 5 03:03:14 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Mon, 4 Apr 2011 21:03:14 -0400 Subject: [R] lattice: how to "center" a subtitle? In-Reply-To: <4D9A6558.4050405@auckland.ac.nz> References: <57C6A6F6-4CB1-49B2-A0DA-4480EFD385A3@comcast.net> <4D9A6558.4050405@auckland.ac.nz> Message-ID: On Apr 4, 2011, at 8:42 PM, David Scott wrote: > On 05/04/11 05:58, David Winsemius wrote: >> On Apr 4, 2011, at 1:27 PM, Marius Hofert wrote: >> >>> Dear David, >>> >>> do you know how to get plotmath-like symbols in both rows? >>> I tried s.th. like: >>> >>> lab<- expression(paste(alpha==1, ", ", beta==2, sep="")) >>> xlab<- substitute(expression( atop(lab==lab., bold(foo)) ), >>> list(lab.=lab)) >>> xyplot(0 ~ 0, xlab = xlab) >> I _did_ have plotmath functions in both rows: But here is your >> solution: >> >> xyplot(0 ~ 0, xlab = >> expression( atop(paste(alpha==1, " ", beta==2), bold(bla) )) ) >> ) >> >> Note that `paste` in plotmath is different than `paste` in regular R. >> It has no `sep` argument. I did try both substitute and bquote on you >> externally expression, but lattice seems to be doing some non- >> standard evaluation and I never got it to "work". 
Using what I >> thought >> _should_ work, does work with `plot`: >> >> > x=1;y=2 >> > plot(0 ~ 0, xlab = bquote( atop(alpha==.(x)*","~beta==.(y), >> bold(foo) ) ) >> + ) >> >> But the same expression throws an error with xyplot: >> > x=1;y=2 >> > xyplot(0 ~ 0, xlab = bquote( atop(alpha==.(x)*","~beta==.(y), >> bold(foo) ) ) >> + ) >> Error in trellis.skeleton(formula = 0 ~ 0, cond = list(1L), aspect = >> "fill", : >> could not find function "atop" >> > I am not sure where I read it and I can't find it again, but my > understanding is that expressions using bquote with lattice need to > be enclosed in as.expression() to work. That is in contrast to what > happens in base graphics. > > Here is a simple example. > > a <- 2 > plot(1:10, a*(1:10), main = bquote(alpha == .(a))) > require(lattice) > xyplot(a*(1:10)~ 1:10, main = bquote(alpha == .(a))) > xyplot(a*(1:10)~ 1:10, main = as.expression(bquote(alpha == .(a)))) > > Which produces: > > > a <- 2 > > plot(1:10, a*(1:10), main = bquote(alpha == .(a))) > > require(lattice) > Loading required package: lattice > > xyplot(a*(1:10)~ 1:10, main = bquote(alpha == .(a))) > Error in trellis.skeleton(formula = a * (1:10) ~ 1:10, cond = > list(c(1L, : > object 'alpha' not found > > xyplot(a*(1:10)~ 1:10, main = as.expression(bquote(alpha == .(a)))) > > Using expression() rather than as.expression() doesn't produce the > desired affect. Try it yourself. I did. Your theory seems supported by the experimental evidence: lab <- bquote(paste(alpha==1, ", ", beta==2)) xyplot(0 ~ 0, xlab = as.expression(bquote(atop(.(lab), bold(foo) ) ))) # Works! > > As to why this is the case ..... Right. It does seem as though we should need to be deducing the inner workings of the subatomic particles by throwing text at the interpreter though. 
-- David Winsemius, MD West Hartford, CT From thomas.levine at gmail.com Tue Apr 5 03:05:12 2011 From: thomas.levine at gmail.com (Thomas Levine) Date: Mon, 4 Apr 2011 21:05:12 -0400 Subject: [R] Sample size estimation for sample surveys Message-ID: Hi, Is there an R package for estimating sample size requirements for parameter estimation in sample surveys? In particular, I'm interested in sample size estimation for stratified and systematic sampling. I have a textbook with appropriate formulae, but it'd be nice if I didn't have to type in all of the equations. Thanks Tom From dwinsemius at comcast.net Tue Apr 5 03:14:40 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Mon, 4 Apr 2011 21:14:40 -0400 Subject: [R] lattice: how to "center" a subtitle? In-Reply-To: References: <57C6A6F6-4CB1-49B2-A0DA-4480EFD385A3@comcast.net> <4D9A6558.4050405@auckland.ac.nz> Message-ID: <5E815824-53F2-4863-911F-CC754C76AE61@comcast.net> On Apr 4, 2011, at 9:03 PM, David Winsemius wrote: > > On Apr 4, 2011, at 8:42 PM, David Scott wrote: > >> On 05/04/11 05:58, David Winsemius wrote: >>> On Apr 4, 2011, at 1:27 PM, Marius Hofert wrote: >>> >>>> Dear David, >>>> >>>> do you know how to get plotmath-like symbols in both rows? >>>> I tried s.th. like: >>>> >>>> lab<- expression(paste(alpha==1, ", ", beta==2, sep="")) >>>> xlab<- substitute(expression( atop(lab==lab., bold(foo)) ), >>>> list(lab.=lab)) >>>> xyplot(0 ~ 0, xlab = xlab) >>> I _did_ have plotmath functions in both rows: But here is your >>> solution: >>> >>> xyplot(0 ~ 0, xlab = >>> expression( atop(paste(alpha==1, " ", beta==2), bold(bla) )) ) >>> ) >>> >>> Note that `paste` in plotmath is different than `paste` in regular >>> R. >>> It has no `sep` argument. I did try both substitute and bquote on >>> you >>> externally expression, but lattice seems to be doing some non- >>> standard evaluation and I never got it to "work". 
Using what I >>> thought >>> _should_ work, does work with `plot`: >>> >>> > x=1;y=2 >>> > plot(0 ~ 0, xlab = bquote( atop(alpha==.(x)*","~beta==.(y), >>> bold(foo) ) ) >>> + ) >>> >>> But the same expression throws an error with xyplot: >>> > x=1;y=2 >>> > xyplot(0 ~ 0, xlab = bquote( atop(alpha==.(x)*","~beta==.(y), >>> bold(foo) ) ) >>> + ) >>> Error in trellis.skeleton(formula = 0 ~ 0, cond = list(1L), aspect = >>> "fill", : >>> could not find function "atop" > >>> >> I am not sure where I read it and I can't find it again, but my >> understanding is that expressions using bquote with lattice need to >> be enclosed in as.expression() to work. That is in contrast to what >> happens in base graphics. Perhaps here: http://finzi.psych.upenn.edu/Rhelp10/2010-August/250832.html Or here: http://finzi.psych.upenn.edu/Rhelp10/2009-July/203714.html Although I disagree with Heimstra that reading the help(bquote) provides "more details" that might shed light on why this is so. -- David. >> >> Here is a simple example. >> >> a <- 2 >> plot(1:10, a*(1:10), main = bquote(alpha == .(a))) >> require(lattice) >> xyplot(a*(1:10)~ 1:10, main = bquote(alpha == .(a))) >> xyplot(a*(1:10)~ 1:10, main = as.expression(bquote(alpha == .(a)))) >> >> Which produces: >> >> > a <- 2 >> > plot(1:10, a*(1:10), main = bquote(alpha == .(a))) >> > require(lattice) >> Loading required package: lattice >> > xyplot(a*(1:10)~ 1:10, main = bquote(alpha == .(a))) >> Error in trellis.skeleton(formula = a * (1:10) ~ 1:10, cond = >> list(c(1L, : >> object 'alpha' not found >> > xyplot(a*(1:10)~ 1:10, main = as.expression(bquote(alpha == .(a)))) >> >> Using expression() rather than as.expression() doesn't produce the >> desired affect. Try it yourself. > > I did. Your theory seems supported by the experimental evidence: > > lab <- bquote(paste(alpha==1, ", ", beta==2)) > xyplot(0 ~ 0, xlab = as.expression(bquote(atop(.(lab), bold(foo) ) ))) > > # Works! > >> >> As to why this is the case ..... > > Right. 
It does seem as though we should need to be deducing the
> inner workings of the subatomic particles by throwing text at the
> interpreter though.
>
> --
>
> David Winsemius, MD
> West Hartford, CT

David Winsemius, MD
West Hartford, CT

From darcy.webber at gmail.com Tue Apr 5 03:15:07 2011
From: darcy.webber at gmail.com (Darcy Webber)
Date: Tue, 5 Apr 2011 13:15:07 +1200
Subject: [R] lists within lists
Message-ID:

Hello R users,

I am dealing with some reasonably big data sets that I have split up into lists based on various factors. In the code below, I produce 100 values between each pair of points in point1x and point2x for the first matrix in my list.

for (k in 1:length(point1x[[1]][, 1])) {
  linex[[k]] = seq(point1x[[1]][, 1][k], point2x[[1]][, 1][k], length = 100)
}

This works properly when I specify point1x[[1]] and point2x[[1]], but I need to repeat this process for point1x[[2]] ... point1x[[j]] and append the results within another list. Perhaps something along the lines of this:

for (j in 1:length(something)) {
  for (k in 1:length(point1x[[j]][, 1])) {
    linex[[j]][[k]] = seq(point1x[[j]][, 1][k], point2x[[j]][, 1][k], length = 100)
  }
}

But R won't let me do this, so my question is: how can I produce lists within lists in R using code similar to the above? I could do this manually by changing the value of j each time and then setting up the list using

biglist[[1]] = linex  # for j = 1
biglist[[2]] = linex  # for j = 2

etc. I can then refer to lists within the list using biglist[[1]][[4]] etc., but I need to automate all of this. Am I missing something basic with respect to list structures?

Thanks again,
Darcy.
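
[Editorial sketch, not part of the original thread.] A nested assignment such as `linex[[j]][[k]] <- ...` usually fails when `linex[[j]]` does not exist yet, so one common fix is to create the outer list and each inner list before filling them. The sketch below assumes each element of `point1x` and `point2x` is a matrix whose first column holds the coordinates. The small stand-in matrices are hypothetical, and `biglist2` is just an illustrative name for the loop-free variant.

```r
# Hypothetical stand-in data: one-column matrices of start and end points
point1x <- list(matrix(c(0, 1, 2), ncol = 1), matrix(c(10, 20), ncol = 1))
point2x <- list(matrix(c(1, 2, 3), ncol = 1), matrix(c(11, 21), ncol = 1))

# Loop version: allocate the outer list, then each inner list, then fill
biglist <- vector("list", length(point1x))
for (j in seq_along(point1x)) {
  n_rows <- nrow(point1x[[j]])
  biglist[[j]] <- vector("list", n_rows)
  for (k in seq_len(n_rows)) {
    biglist[[j]][[k]] <- seq(point1x[[j]][k, 1], point2x[[j]][k, 1],
                             length.out = 100)
  }
}

# Same structure without explicit loops: the outer Map() walks the matrices,
# the inner Map() walks the rows of each pair of matrices
biglist2 <- Map(function(p1, p2) Map(seq, p1[, 1], p2[, 1], length.out = 100),
                point1x, point2x)

print(length(biglist))           # number of matrices
print(lengths(biglist))          # number of rows handled per matrix
print(head(biglist[[2]][[2]]))   # first values of the line for matrix 2, row 2
```

Elements are then reached as in the original post, for example `biglist[[1]][[3]]`.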
From rosyara at msu.edu Tue Apr 5 02:38:50 2011 From: rosyara at msu.edu (Umesh Rosyara) Date: Mon, 4 Apr 2011 20:38:50 -0400 Subject: [R] merging data list in to single data frame In-Reply-To: References: <002701cbf2d6$65405870$2fc10950$@edu> <006301cbf304$5f429950$1dc7cbf0$@edu> Message-ID: <1AC73670A58B443ABECB52815DC82552@OwnerPC> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From Yusuke.Fukuda at nt.gov.au Tue Apr 5 02:45:49 2011 From: Yusuke.Fukuda at nt.gov.au (Yusuke Fukuda) Date: Tue, 5 Apr 2011 10:15:49 +0930 Subject: [R] ANCOVA for linear regressions without intercept In-Reply-To: <4D95F78F.10501@ucalgary.ca> References: <4D95F78F.10501@ucalgary.ca> Message-ID: Thank you for your suggestions, stats experts. Much appreciated. I still haven't got what I wanted but someone suggested looking into contrasts and this is looking worth trying http://finzi.psych.upenn.edu/R/library/gmodels/html/fit.contrast.html Regards, Yusuke -----Original Message----- From: Peter Ehlers [mailto:ehlers at ucalgary.ca] Sent: Saturday, 2 April 2011 1:35 AM To: Yusuke Fukuda Cc: 'Bert Gunter'; r-help at r-project.org Subject: Re: [R] ANCOVA for linear regressions without intercept See inline. On 2011-03-31 22:22, Yusuke Fukuda wrote: > Thanks Bert. > > I have read "?formula" again and again, and I'm still struggling; > >> lm(body_length ~ head_length-1) > > This removes intercept from each individual regression (for male, female, unknown). > > When they are taken together, > >> lm(body_length ~ sex*head_length) > > This shows differences in slopes and intercepts between the regressions (but I want to compare the slopes of the regressions WITHOUT intercepts). > > If I put > >> lm(body_length ~ sex:head_length-1) > > This shows slopes for each sex without intercepts, but NOT differences in the slope between the regressions. 
You probably want:

  lm(body_length ~ head_length + sex:head_length-1)

or, in short form:

  lm(body_length ~ head_length/sex-1)

You might then compare the model 'without intercepts' (i.e. with intercepts forced to zero) with a model that includes intercepts. If the intercepts turn out to be significantly nonzero, what will you do?

Peter Ehlers

>
> I also tried
>
>> lm(body_length ~ sex*head_length-1)
>> lm(body_length ~ sex*head_length-sex-1)
>
> But none of them worked.
>
> Would anyone be able to help me? All I want to do is to compare the slopes of three linear regressions that go through the origin (0,0) so that I can say if their difference is significant or not.
>
> Thanks for your help.
>
> ________________________________________
> From: Bert Gunter [mailto:gunter.berton at gene.com]
> Sent: Friday, 1 April 2011 12:56 AM
> To: Yusuke Fukuda
> Cc: r-help at r-project.org
> Subject: Re: [R] ANCOVA for linear regressions without intercept
>
> If you haven't already received an answer, a careful reading of
>
> ?formula
>
> will provide it.
>
> -- Bert
> On Wed, Mar 30, 2011 at 11:42 PM, Yusuke Fukuda wrote:
>
> Hello R experts
>
> I have three linear regressions, one per sex (Male, Female, Unknown). All have a good correlation between body length (response variable) and head length (explanatory variable). I know it is not recommended, but for a good practical reason (the purpose of study is to find a single conversion factor from head length to body length), the regressions need to go through the origin (0 intercept).
>
> Is it possible to do ANCOVA for these regressions without intercepts? When I do
>
> summary(lm(body length ~ sex*head length))
>
> this will include the intercepts as below
>
> Coefficients:
>                         Estimate Std. Error t value Pr(>|t|)
> (Intercept)             -6.49697    1.68497  -3.856 0.000118 ***
> sexMale                 -9.39340    1.97760  -4.750 2.14e-06 ***
> sexUnknown              -1.33791    2.35453  -0.568 0.569927
> head_length              7.12307    0.05503 129.443  < 2e-16 ***
> sexMale:head_length      0.31631    0.06246   5.064 4.37e-07 ***
> sexUnknown:head_length   0.19937    0.07022   2.839 0.004556 **
> ---
>
> Is there any way I can remove the intercepts so that I can simply compare the slopes with no intercept taken into account?
>
> Thanks for help in advance.
>
> Yusuke Fukuda
>

From djmuser at gmail.com Tue Apr 5 03:38:29 2011
From: djmuser at gmail.com (Dennis Murphy)
Date: Mon, 4 Apr 2011 18:38:29 -0700
Subject: [R] Sample size estimation for sample surveys
In-Reply-To:
References:
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From davejaiminm at gmail.com Tue Apr 5 03:39:45 2011
From: davejaiminm at gmail.com (Jaimin Dave)
Date: Mon, 4 Apr 2011 20:39:45 -0500
Subject: [R] Grid on Map
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From thomas.levine at gmail.com Tue Apr 5 03:50:21 2011
From: thomas.levine at gmail.com (Thomas Levine)
Date: Mon, 4 Apr 2011 21:50:21 -0400
Subject: [R] Sample size estimation for sample surveys
In-Reply-To:
References:
Message-ID:

Awesome! Thanks, David and Dennis! And now I know how to search for packages more effectively.

Tom

On Mon, Apr 4, 2011 at 9:38 PM, Dennis Murphy wrote:
> Start here:
>
> library(sos)      # install first if necessary
> findFn('sample size survey')
>
> I got 238 hits, many of which could be relevant.
> > HTH, > Dennis > > On Mon, Apr 4, 2011 at 6:05 PM, Thomas Levine > wrote: >> >> Hi, >> >> Is there an R package for estimating sample size requirements for >> parameter estimation in sample surveys? In particular, I'm interested >> in sample size estimation for stratified and systematic sampling. I >> have a textbook with appropriate formulae, but it'd be nice if I >> didn't have to type in all of the equations. >> >> Thanks >> >> Tom >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > > From d.scott at auckland.ac.nz Tue Apr 5 03:53:04 2011 From: d.scott at auckland.ac.nz (David Scott) Date: Tue, 5 Apr 2011 13:53:04 +1200 Subject: [R] lattice: how to "center" a subtitle? In-Reply-To: <5E815824-53F2-4863-911F-CC754C76AE61@comcast.net> References: <57C6A6F6-4CB1-49B2-A0DA-4480EFD385A3@comcast.net> <4D9A6558.4050405@auckland.ac.nz> <5E815824-53F2-4863-911F-CC754C76AE61@comcast.net> Message-ID: <4D9A7600.8070708@auckland.ac.nz> On 05/04/11 13:14, David Winsemius wrote: > On Apr 4, 2011, at 9:03 PM, David Winsemius wrote: > >> On Apr 4, 2011, at 8:42 PM, David Scott wrote: >> >>> On 05/04/11 05:58, David Winsemius wrote: >>>> On Apr 4, 2011, at 1:27 PM, Marius Hofert wrote: >>>> >>>>> Dear David, >>>>> >>>>> do you know how to get plotmath-like symbols in both rows? >>>>> I tried s.th. 
like: >>>>> >>>>> lab<- expression(paste(alpha==1, ", ", beta==2, sep="")) >>>>> xlab<- substitute(expression( atop(lab==lab., bold(foo)) ), >>>>> list(lab.=lab)) >>>>> xyplot(0 ~ 0, xlab = xlab) >>>> I _did_ have plotmath functions in both rows: But here is your >>>> solution: >>>> >>>> xyplot(0 ~ 0, xlab = >>>> expression( atop(paste(alpha==1, " ", beta==2), bold(bla) )) ) >>>> ) >>>> >>>> Note that `paste` in plotmath is different than `paste` in regular >>>> R. >>>> It has no `sep` argument. I did try both substitute and bquote on >>>> you >>>> externally expression, but lattice seems to be doing some non- >>>> standard evaluation and I never got it to "work". Using what I >>>> thought >>>> _should_ work, does work with `plot`: >>>> >>>>> x=1;y=2 >>>>> plot(0 ~ 0, xlab = bquote( atop(alpha==.(x)*","~beta==.(y), >>>> bold(foo) ) ) >>>> + ) >>>> >>>> But the same expression throws an error with xyplot: >>>>> x=1;y=2 >>>>> xyplot(0 ~ 0, xlab = bquote( atop(alpha==.(x)*","~beta==.(y), >>>> bold(foo) ) ) >>>> + ) >>>> Error in trellis.skeleton(formula = 0 ~ 0, cond = list(1L), aspect = >>>> "fill", : >>>> could not find function "atop" >>> I am not sure where I read it and I can't find it again, but my >>> understanding is that expressions using bquote with lattice need to >>> be enclosed in as.expression() to work. That is in contrast to what >>> happens in base graphics. > Perhaps here: > http://finzi.psych.upenn.edu/Rhelp10/2010-August/250832.html I am pretty sure that was where I saw it. I knew it was out there somewhere. > Or here: > http://finzi.psych.upenn.edu/Rhelp10/2009-July/203714.html > > Although I disagree with Heimstra that reading the help(bquote) > provides "more details" that might shed light on why this is so. 
> David Scott -- _________________________________________________________________ David Scott Department of Statistics The University of Auckland, PB 92019 Auckland 1142, NEW ZEALAND Phone: +64 9 923 5055, or +64 9 373 7599 ext 85055 Email: d.scott at auckland.ac.nz, Fax: +64 9 373 7018 Director of Consulting, Department of Statistics From S.Ellison at lgc.co.uk Tue Apr 5 04:35:13 2011 From: S.Ellison at lgc.co.uk (slre) Date: Mon, 4 Apr 2011 21:35:13 -0500 (CDT) Subject: [R] R CMD build creates tar file instead of tar.gz file In-Reply-To: <7299A0B7339263459A05620F8C814D03029FA1F897@r4-ce-s8-03> References: <7299A0B7339263459A05620F8C814D03029FA1F897@r4-ce-s8-03> Message-ID: <1301970913009-3427046.post@n4.nabble.com> I had an identical problem building in R.2.12 with the latest (last week's) Rtools. Interestingly I found that if I used a DOS path in rcmd build (eg rcmd build 0.9\pkg) I got a .tar, but if I replaced it with the unix-like path as in rcmd build 0.9/pkg I got a .tar.gz Weird but workable. S Ellison LGC -- View this message in context: http://r.789695.n4.nabble.com/R-CMD-build-creates-tar-file-instead-of-tar-gz-file-tp3402687p3427046.html Sent from the R help mailing list archive at Nabble.com. From smckinney at bccrc.ca Tue Apr 5 04:38:08 2011 From: smckinney at bccrc.ca (Steven McKinney) Date: Mon, 4 Apr 2011 19:38:08 -0700 Subject: [R] ANCOVA for linear regressions without intercept In-Reply-To: <26212_1301966525_1301966525_B38A1433CF2D7E48BF60B9D93FCD5A260D45C5DE5F@emdch-es2.prod.main.ntgov> References: <4D95F78F.10501@ucalgary.ca>, <26212_1301966525_1301966525_B38A1433CF2D7E48BF60B9D93FCD5A260D45C5DE5F@emdch-es2.prod.main.ntgov> Message-ID: Hi Yusuke, Does the following get what you are after? ### Make some test data. 
> set.seed(123)
> edf <- data.frame(sex = c(rep("Male", 10), rep("Female", 10), rep("Unknown", 10)),
+                   head_length = c(1.2 * c(170:179 + rnorm(10)), 0.8 * c(150:159 + rnorm(10)), c(160:169 + rnorm(10)))/10,
+                   body_length = c(c(170:179 + rnorm(10)), c(150:159 + rnorm(10)), c(160:169 + rnorm(10)))
+                   )
> edf$sex <- factor(as.character(edf$sex))
> plot(edf$head_length, edf$body_length, pch = as.numeric(edf$sex), col = as.numeric(edf$sex), xlim = c(0, 25), ylim = c(0, 190))
> lmf <- lm(body_length ~ head_length * sex, data = edf)

### The full model - do keep an eye on those intercepts and try to ensure they are not far from 0.
> summary(lmf)

Call:
lm(formula = body_length ~ head_length * sex, data = edf)

Residuals:
     Min       1Q   Median       3Q      Max
-2.73783 -0.68133  0.02147  0.50858  2.38931

Coefficients:
                       Estimate Std. Error t value Pr(>|t|)
(Intercept)              -3.578     25.425  -0.141   0.8893
head_length              12.772      2.054   6.218    2e-06 ***
sexMale                  15.122     37.464   0.404   0.6901
sexUnknown               40.308     33.137   1.216   0.2357
head_length:sexMale      -4.977      2.438  -2.042   0.0523 .
head_length:sexUnknown   -4.971      2.428  -2.047   0.0517 .
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.384 on 24 degrees of freedom
Multiple R-squared: 0.9802, Adjusted R-squared: 0.9761
F-statistic: 237.7 on 5 and 24 DF,  p-value: < 2.2e-16

### Now suppress intercepts. head_length:sex should give interactions (slopes) only.
> lmrf <- lm(body_length ~ -1 + head_length : sex, data = edf)
> summary(lmrf)

Call:
lm(formula = body_length ~ -1 + head_length:sex, data = edf)

Residuals:
     Min       1Q   Median       3Q      Max
-3.02782 -0.61861 -0.01079  0.68785  2.57544

Coefficients:
                       Estimate Std. Error t value Pr(>|t|)
head_length:sexFemale  12.48253    0.03549   351.8   <2e-16 ***
head_length:sexMale     8.34500    0.02097   398.0   <2e-16 ***
head_length:sexUnknown 10.03844    0.02677   375.0   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.389 on 27 degrees of freedom
Multiple R-squared: 0.9999, Adjusted R-squared: 0.9999
F-statistic: 1.409e+05 on 3 and 27 DF,  p-value: < 2.2e-16

### Check the numeric coding of the factor
> with(edf, table(sex, as.numeric(sex)))

sex        1  2  3
  Female  10  0  0
  Male     0 10  0
  Unknown  0  0 10
> abline(a = 0, b = coef(lmrf)[1], col = 1) ## Females = Black
> abline(a = 0, b = coef(lmrf)[2], col = 2) ## Males = Red
> abline(a = 0, b = coef(lmrf)[3], col = 3) ## Unknown = Green

### If no diff between males and females, then males and females can be combined into one group.
> edf$MvF <- as.character(edf$sex)
> edf$MvF[edf$MvF != "Unknown"] <- "MorF"
> edf$MvF <- factor(edf$MvF)
> with(edf, table(MvF, sex))
         sex
MvF       Female Male Unknown
  MorF        10   10       0
  Unknown      0    0      10
> lmr1f <- lm(body_length ~ -1 + head_length : MvF, data = edf)
> summary(lmr1f)

Call:
lm(formula = body_length ~ -1 + head_length:MvF, data = edf)

Residuals:
    Min      1Q  Median      3Q     Max
-23.976 -21.656   0.077  35.899  39.839

Coefficients:
                       Estimate Std. Error t value Pr(>|t|)
head_length:MvFMorF      9.4156     0.3429   27.46   <2e-16 ***
head_length:MvFUnknown  10.0384     0.5085   19.74   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 26.39 on 28 degrees of freedom
Multiple R-squared: 0.9761, Adjusted R-squared: 0.9744
F-statistic: 571.9 on 2 and 28 DF,  p-value: < 2.2e-16

### Test the hypothesis that male and female heights are equivalent
> anova(lmr1f, lmrf)
Analysis of Variance Table

Model 1: body_length ~ -1 + head_length:MvF
Model 2: body_length ~ -1 + head_length:sex
  Res.Df     RSS Df Sum of Sq     F    Pr(>F)
1     28 19496.1
2     27    52.1  1     19444 10077 < 2.2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

### Plot the reduced model regression lines
> abline(a = 0, b = coef(lmr1f)[1], col = "blue", lty = 2)
> abline(a = 0, b = coef(lmr1f)[2], col = "orange", lty = 2, lwd = 4)
>

The other two tests can be set up and run similarly.
Don't forget to adjust for multiple comparisons... HTH Steve Steven McKinney, Ph.D. Statistician Molecular Oncology and Breast Cancer Program British Columbia Cancer Research Centre ________________________________________ From: r-help-bounces at r-project.org [r-help-bounces at r-project.org] On Behalf Of Yusuke Fukuda [Yusuke.Fukuda at nt.gov.au] Sent: April 4, 2011 5:45 PM To: 'Peter Ehlers' Cc: r-help at r-project.org; 'Bert Gunter' Subject: Re: [R] ANCOVA for linear regressions without intercept Thank you for your suggestions, stats experts. Much appreciated. I still haven't got what I wanted but someone suggested looking into contrasts and this is looking worth trying http://finzi.psych.upenn.edu/R/library/gmodels/html/fit.contrast.html Regards, Yusuke -----Original Message----- From: Peter Ehlers [mailto:ehlers at ucalgary.ca] Sent: Saturday, 2 April 2011 1:35 AM To: Yusuke Fukuda Cc: 'Bert Gunter'; r-help at r-project.org Subject: Re: [R] ANCOVA for linear regressions without intercept See inline. On 2011-03-31 22:22, Yusuke Fukuda wrote: > Thanks Bert. > > I have read "?formula" again and again, and I'm still struggling; > >> lm(body_length ~ head_length-1) > > This removes intercept from each individual regression (for male, female, unknown). > > When they are taken together, > >> lm(body_length ~ sex*head_length) > > This shows differences in slopes and intercepts between the regressions (but I want to compare the slopes of the regressions WITHOUT intercepts). > > If I put > >> lm(body_length ~ sex:head_length-1) > > This shows slopes for each sex without intercepts, but NOT differences in the slope between the regressions. You probably want: lm(body_length ~ head_length + sex:head_length-1) or, in short form: lm(body_length ~ head_length/sex-1) You might then compare the model 'without intercepts' (i.e. with intercepts forced to zero) with a model that includes intercepts. If the intercepts turn out to be significantly nonzero, what will you do? 
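A small sketch of the formula suggested above may help readers of the archive. The data are simulated and hypothetical (the slopes 12, 8 and 10 and the object names d, fit_common and fit_diff are made up for illustration); only the two formulas come from the thread.

```r
set.seed(1)

# Hypothetical data: three groups whose lines all pass through the origin
sex <- factor(rep(c("Female", "Male", "Unknown"), each = 10))
head_length <- runif(30, 10, 25)
true_slope <- c(12, 8, 10)[as.integer(sex)]  # levels sort as Female, Male, Unknown
body_length <- true_slope * head_length + rnorm(30)
d <- data.frame(sex, head_length, body_length)

# One common slope, no intercept
fit_common <- lm(body_length ~ head_length - 1, data = d)

# Reference slope plus slope differences, no intercept
fit_diff <- lm(body_length ~ head_length + sex:head_length - 1, data = d)

# First coefficient: slope of the reference level (Female here);
# remaining coefficients: differences of the other slopes from it
coef(fit_diff)

# F-test of "all slopes equal" against "slopes differ by sex"
slope_test <- anova(fit_common, fit_diff)
slope_test
```

The t-tests in summary(fit_diff) compare each non-reference slope with the reference slope only, so a comparison between the two non-reference groups would need a releveled factor or a contrast.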
Peter Ehlers > > I also tried > >> lm(body_length ~ sex*head_length-1) >> lm(body_length ~ sex*head_length-sex-1) > > But none of them worked. > > Would anyone be able to help me? All I want to do is to compare the slopes of three linear regressions that go through the origin (0,0) so that I can say if their difference is significant or not. > > Thanks for your help. > > > > ________________________________________ > From: Bert Gunter [mailto:gunter.berton at gene.com] > Sent: Friday, 1 April 2011 12:56 AM > To: Yusuke Fukuda > Cc: r-help at r-project.org > Subject: Re: [R] ANCOVA for linear regressions without intercept > > If you haven't already received an answer, a careful reading of > > ?formula > > will provide it. > > -- Bert > On Wed, Mar 30, 2011 at 11:42 PM, Yusuke Fukuda wrote: > > Hello R experts > > I have two linear regressions for sexes (Male, Female, Unknown). All have a good correlation between body length (response variable) and head length (explanatory variable). I know it is not recommended, but for a good practical reason (the purpose of study is to find a single conversion factor from head length to body length), the regressions need to go through the origin (0 intercept). > > Is it possible to do ANCOVA for these regressions without intercepts? When I do > > summary(lm(body length ~ sex*head length)) > > this will include the intercepts as below > > > Coefficients: > Estimate Std. Error t value Pr(>|t|) > (Intercept) -6.49697 1.68497 -3.856 0.000118 *** > sexMale -9.39340 1.97760 -4.750 2.14e-06 *** > sexUnknown -1.33791 2.35453 -0.568 0.569927 > head_length 7.12307 0.05503 129.443< 2e-16 *** > sexMale:head_length 0.31631 0.06246 5.064 4.37e-07 *** > sexUnknown:head_length 0.19937 0.07022 2.839 0.004556 ** > --- > > Is there any way I can remove the intercepts so that I can simply compare the slopes with no intercept taken into account? > > Thanks for help in advance. 
> > Yusuke Fukuda > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > > > ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. From mvalle at cscs.ch Tue Apr 5 04:48:23 2011 From: mvalle at cscs.ch (Mario Valle) Date: Tue, 5 Apr 2011 04:48:23 +0200 Subject: [R] hc2Newick is different than th hclust dendrogram In-Reply-To: <20110404164736.53062rcc9hate0sg@wm3.dal.ca> References: <20110401174755.20670jzo8r2746pc@wm4.dal.ca> <20110404092434.93826vox72jcqv7k@wm1.dal.ca> <4D99D526.6000001@cscs.ch> <20110404164736.53062rcc9hate0sg@wm3.dal.ca> Message-ID: <4D9A82F7.6070100@cscs.ch> Well, I could not call them "entirely different". See attached (tree-tv from TreeView, tree-r from R). Yes, I had to rotate and mirror the tree in TreeView but that's all. And yes, I have to ignore the tree length values from the file. Maybe it is better to post your inquiry to the Bioconductor list (from where hc2newick apparently arrives in package 'ctc'). Some humble suggestions: - don't call a text file .Rdata! Rdata is a specific binary format read using load() and saved using save(). - add to the description of your problem how to read the data, i.e.: z <- scan('data.Rdata') C <- matrix(z, 112, 112, byrow=TRUE) - add to your problem description all the packages you loaded to make the example work, i.e.: library('ctc') Hope it helps mario On 04-Apr-11 21:47, Jose Hleap Lozano wrote: > It was... I have it in my sent box is called data.Rdata... > > I am attaching it again! > > Thanks -- Ing. 
Mario Valle Data Analysis and Visualization Group | http://www.cscs.ch/~mvalle Swiss National Supercomputing Centre (CSCS) | Tel: +41 (91) 610.82.60 v. Cantonale Galleria 2, 6928 Manno, Switzerland | Fax: +41 (91) 610.82.82 -------------- next part -------------- A non-text attachment was scrubbed... Name: tree-tv.png Type: image/png Size: 7035 bytes Desc: tree-tv.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: tree-r.png Type: image/png Size: 14855 bytes Desc: tree-r.png URL: From jdnewmil at dcn.davis.ca.us Tue Apr 5 06:22:47 2011 From: jdnewmil at dcn.davis.ca.us (Jeff Newmiller) Date: Mon, 04 Apr 2011 21:22:47 -0700 Subject: [R] how to handle no lines in input with pipe() In-Reply-To: References: Message-ID: <98d92287-6e2d-4128-bb88-bdb3634e79a8@email.android.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From thpe at simecol.de Tue Apr 5 07:35:03 2011 From: thpe at simecol.de (Thomas Petzoldt) Date: Tue, 05 Apr 2011 07:35:03 +0200 Subject: [R] precompiled ode with spline input In-Reply-To: <1300190446.8565.8.camel@dkubuntu-laptop> References: <1300190446.8565.8.camel@dkubuntu-laptop> Message-ID: <4D9AAA07.3080405@simecol.de> Hi Daniel, thanks for your positive response about using compiled ODE functions. Did you use package deSolve? Regarding the use of splines (as forcing functions ?) we don't have an out of the box method yet, but you may consider to approximate the splines by linear segments or contribute your own spline code to the deSolve project. The example Forcing_lv.R / Forcing_lv.c shows how to use (linearly) interpolated forcing data within compiled code, and the interpolation code itself (Initdeforc, updateforc) is found in forcings.c. Depending on your particular problem there may be other possibilities too. 
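The idea of approximating a spline by linear segments can be sketched on the R side as follows. This is only an illustration with made-up observation values; it builds a two-column (time, value) table, which is the general shape in which deSolve accepts forcing data, and checks how far linear interpolation of that table deviates from the spline. The grid spacing of 0.05 is an arbitrary assumption.

```r
# Hypothetical input observed at a few time points
obs_t <- c(0, 2, 5, 7, 10)
obs_y <- c(1.0, 2.5, 1.5, 3.0, 2.0)

# Spline through the observations
spl <- splinefun(obs_t, obs_y, method = "natural")

# Evaluate the spline on a dense grid; the finer the grid, the closer
# the piecewise-linear table follows the spline
grid_t <- seq(0, 10, by = 0.05)
forc_tab <- cbind(time = grid_t, value = spl(grid_t))

# Linear interpolation of the table, as a forcing routine would do it
lin <- approxfun(forc_tab[, "time"], forc_tab[, "value"], rule = 2)

# Largest deviation between the linear segments and the spline
check_t <- seq(0, 10, by = 0.013)
max_err <- max(abs(lin(check_t) - spl(check_t)))
max_err
```

If the deviation is too large for the problem at hand, a finer grid reduces it at the cost of a longer table.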
I don't want to go more into the details here on r-help, but you may consider to post additional questions to the mailing list of the respective special interest group https://stat.ethz.ch/mailman/listinfo/r-sig-dynamic-models This is also a good idea in general for such questions for getting more immediate responses. Thomas Petzoldt On 3/15/2011 1:00 PM, Daniel Kaschek wrote: > I tried to use lsode with precompiled C code instead of an R function > defining the derivatives. No problem so far. > > However, now I need to implement an ODE that contains spline functions, > i.e. the derivatives at given time points depend on the value of a > spline function at this time point. What is an efficient way to > implement this in precompiled C code? > > > > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From jcg3441 at rit.edu Tue Apr 5 07:20:41 2011 From: jcg3441 at rit.edu (jcg3441) Date: Tue, 5 Apr 2011 00:20:41 -0500 (CDT) Subject: [R] kmeans clustering java Message-ID: <1301980841118-3427159.post@n4.nabble.com> Hello, I have been trying for a few days to do kmeans on a matrix of data which I populate from java. 
Here's my code:

String[] Rargs = {"--vanilla"};
Rengine re = new Rengine(Rargs, false, null);
System.out.println("Rengine created, waiting for R");
// the engine creates R in a new thread, so we should wait until it's
// ready
if (!re.waitForR()) {
    System.out.println("Cannot load R");
    return null;
}
re.eval ("rmatrix <- matrix(data = NA, nrow = "+rows+", ncol ="+(columns-1)+", byrow = FALSE)");//,dimnames = )");
REXP rp= re.eval(hDr);
//loop through the matrix and give the upgma_matrix the correct values
for (int i = 0; i < rows-1; i++) {
    re.eval ("i<- " +i);
    for (int j = 1; j < columns; j++) {
        re.eval ("j<- " +j);
        //R matrices start at index 1 (java at 0), so add 1 to current position
        re.eval ("ii <- i+1");
        re.eval ("jj <- j");
        //add values for the lower triangle..
        re.eval ("rmatrix [ii,jj] <- "+ data[i][j].toString());
        System.out.print(data[i][j].toString()+",");
    }
    System.out.println();
}
REXP rt = re.eval("r_matrix");
String bindString = "DATAMATRIX <- cbind(rmatrix[,1],";
for (int k = 0; k

Hi

1. I have two raster files *.asc (identical size)
2. The data in each contain presence or absence data in each cell represented by a 1 or 0 respectively
3. I would like to take the location of each 1 (presence cell) in raster file 1 and measure the Euclidean distance to the nearest 1 (presence cell) in raster file 2. Obviously in some cases there will be overlap so the distance will be zero.
4. I would like the output file to have each individual measurement on a separate line in a single file.

I am very new to R, so any help would be appreciated.
Best regards Paul -- Paul Duckett - PhD Candidate Conservation Genetics Lab E8A 264 Biological Sciences Faculty of Science Macquarie University North Ryde NSW 2109 http://paulduckett.redbubble.com From nikhil.abhyankar at gmail.com Tue Apr 5 07:23:03 2011 From: nikhil.abhyankar at gmail.com (Nikhil Abhyankar) Date: Tue, 5 Apr 2011 10:53:03 +0530 Subject: [R] Saving console and graph output to same file Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From janusjlarsen at gmail.com Tue Apr 5 08:29:24 2011 From: janusjlarsen at gmail.com (sunaj) Date: Tue, 5 Apr 2011 01:29:24 -0500 (CDT) Subject: [R] several Filled.contour plots on the same device... In-Reply-To: <1282263386198-2332041.post@n4.nabble.com> References: <39828.134.157.178.21.1172142897.squirrel@www.lodyc.jussieu.fr> <07E228A5BE53C24CAD490193A7381BBB83C554@LP-EXCHVS07.CO.IHC.COM> <1282263386198-2332041.post@n4.nabble.com> Message-ID: <1301984964914-3427236.post@n4.nabble.com> Billy.Requena, I bow myself into the dust - exactly what I was looking for. Thx, Sunaj -- View this message in context: http://r.789695.n4.nabble.com/several-Filled-contour-plots-on-the-same-device-tp819040p3427236.html Sent from the R help mailing list archive at Nabble.com. From michalseneca at gmail.com Tue Apr 5 09:07:55 2011 From: michalseneca at gmail.com (michalseneca) Date: Tue, 5 Apr 2011 02:07:55 -0500 (CDT) Subject: [R] Creating multiple vector/list names-novice In-Reply-To: <1301929772084-3425759.post@n4.nabble.com> References: <1301927063332-3425616.post@n4.nabble.com> <1301929772084-3425759.post@n4.nabble.com> Message-ID: <1301987275755-3427283.post@n4.nabble.com> To be exact: I should then be able to choose "a" from "abc", for example, and I cannot do that. -- View this message in context: http://r.789695.n4.nabble.com/Creating-multiple-vector-list-names-novice-tp3425616p3427283.html Sent from the R help mailing list archive at Nabble.com.
From l.cattarino at uq.edu.au Tue Apr 5 09:40:59 2011 From: l.cattarino at uq.edu.au (Lorenzo Cattarino) Date: Tue, 5 Apr 2011 17:40:59 +1000 Subject: [R] do not execute newline command Message-ID: <2869E75AAA158C4E936333A119467ADD1F7CD129C9@UQEXMB06.soe.uq.edu.au> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From nandan.amar at gmail.com Tue Apr 5 10:07:12 2011 From: nandan.amar at gmail.com (nandan amar) Date: Tue, 5 Apr 2011 13:37:12 +0530 Subject: [R] system() command in R In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From Yusuke.Fukuda at nt.gov.au Tue Apr 5 05:24:16 2011 From: Yusuke.Fukuda at nt.gov.au (Yusuke Fukuda) Date: Tue, 5 Apr 2011 12:54:16 +0930 Subject: [R] ANCOVA for linear regressions without intercept In-Reply-To: References: <4D95F78F.10501@ucalgary.ca>, <26212_1301966525_1301966525_B38A1433CF2D7E48BF60B9D93FCD5A260D45C5DE5F@emdch-es2.prod.main.ntgov> Message-ID: Hi Steve Wow, this could be the way to get around to what I was after. I will have a close look and see if it works with my data. Will let you know how it goes. Thank you. Yusuke -----Original Message----- From: Steven McKinney [mailto:smckinney at bccrc.ca] Sent: Tuesday, 5 April 2011 12:08 PM To: Yusuke Fukuda; 'Peter Ehlers' Cc: r-help at r-project.org Subject: RE: [R] ANCOVA for linear regressions without intercept Hi Yusuke, Does the following get what you are after? ### Make some test data. 
[...]
> > Yusuke Fukuda > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > > > ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. From Yusuke.Fukuda at nt.gov.au Tue Apr 5 07:08:45 2011 From: Yusuke.Fukuda at nt.gov.au (Yusuke Fukuda) Date: Tue, 5 Apr 2011 14:38:45 +0930 Subject: [R] ANCOVA for linear regressions without intercept In-Reply-To: References: <4D95F78F.10501@ucalgary.ca>, <26212_1301966525_1301966525_B38A1433CF2D7E48BF60B9D93FCD5A260D45C5DE5F@emdch-es2.prod.main.ntgov> Message-ID: Hi Steve It worked with my data! I didn't think of combining the categories before doing ANOVA to test for the difference. This is the final answer to my question. Thank you very much for your time. Regards, Yusuke -----Original Message----- From: Steven McKinney [mailto:smckinney at bccrc.ca] Sent: Tuesday, 5 April 2011 12:08 PM To: Yusuke Fukuda; 'Peter Ehlers' Cc: r-help at r-project.org Subject: RE: [R] ANCOVA for linear regressions without intercept Hi Yusuke, Does the following get what you are after? ### Make some test data. 
[...]
>
> Yusuke Fukuda
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>
>
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

From Bernhard_Pfaff at fra.invesco.com Tue Apr 5 10:11:12 2011
From: Bernhard_Pfaff at fra.invesco.com (Pfaff, Bernhard Dr.)
Date: Tue, 5 Apr 2011 09:11:12 +0100
Subject: [R] Granger Causality in a VAR Model
In-Reply-To:
References:
Message-ID:

The email below was cross-posted to R-Sig-Finance and has been answered there.

> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of ivan
> Sent: Monday, 4 April 2011 20:24
> To: r-help at r-project.org
> Subject: [R] Granger Causality in a VAR Model
>
> Dear Community,
>
> I am new to R and have a question concerning the causality()
> test in the vars package. I need to test whether, say, the
> variable y Granger causes the variable x, given z as a
> control variable.
>
> I estimated the VAR model as follows: >model<-VAR(cbind(x,y,z),p=2)
>
> Then I did the following: >causality(model, cause="y"). I
> think this tests the Granger causality of y on the vector
> (x,z), though. How can I implement the test for y causing x
> controlled for z? Thus, the F-test comparing the two models
> M1:x~lagged(x)+lagged(z) and M2:x~lagged(x)+lagged(y)+lagged(z)?
>
> Thank you in advance.
> > Best Regards > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > ***************************************************************** Confidentiality Note: The information contained in this ...{{dropped:10}} From lebatsnok at gmail.com Tue Apr 5 10:27:39 2011 From: lebatsnok at gmail.com (Kenn Konstabel) Date: Tue, 5 Apr 2011 11:27:39 +0300 Subject: [R] do not execute newline command In-Reply-To: <2869E75AAA158C4E936333A119467ADD1F7CD129C9@UQEXMB06.soe.uq.edu.au> References: <2869E75AAA158C4E936333A119467ADD1F7CD129C9@UQEXMB06.soe.uq.edu.au> Message-ID: On Tue, Apr 5, 2011 at 10:40 AM, Lorenzo Cattarino wrote: > Hi R-users, > > To automate the creation of scripts, I converted the code (example below) into a character string and wrote the object to a file: > > Repeat <- " > myvec <- c(1:12) > cat('vector= ', myvec, '\n') > " > > write (Repeat, 'yourpath/test.R') > > the problem is that one line of the code is a "cat" command. In the output file (i.e. test.R), the newline symbol gets executed and I don't want that. > > Any idea on how to do that? You can "escape" the newline symbol (i.e., write an extra \ before it): Repeat <- " myvec <- c(1:12) cat('vector= ', myvec, '\\n') " write (Repeat, 'test.R') > Thanks > Lorenzo > > > ? ? ? ?[[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
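The escaping advice above can be tried end to end with a small sketch. The output location below is a temporary file chosen for illustration, not the poster's path. Inside a double-quoted R string, "\\n" stores the two characters backslash and n, so the generated script still contains a literal \n for cat() to interpret when it runs:

```r
## Script text held as a string; '\\n' here ends up as \n in the file
script_text <- "
myvec <- c(1:12)
cat('vector= ', myvec, '\\n')
"

out_file <- file.path(tempdir(), "test.R")  # hypothetical output location
writeLines(script_text, out_file)

## Show the generated file, then run it
writeLines(readLines(out_file))
source(out_file)
```

With a single backslash in the string, R would instead insert a real line break when the string is parsed, which splits the cat() line in the generated file.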
> From r.m.krug at gmail.com Tue Apr 5 10:28:53 2011 From: r.m.krug at gmail.com (Rainer M Krug) Date: Tue, 05 Apr 2011 10:28:53 +0200 Subject: [R] Euclidean Distance in R In-Reply-To: References: Message-ID: <4D9AD2C5.1060907@gmail.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 05/04/11 03:43, Paul Duckett wrote: > Hi Hi > > 1. I have two raster files *.asc (identical size) This question is much more appropriate for the r-sig-geo mailing list (https://stat.ethz.ch/mailman/listinfo/r-sig-geo), which focusses on spatial analysis / modelling in R. I am sure you will get an answer there. I take the liberty to CC this mail to the list - and I would encourage you to subscribe to the mailing list. Cheers, Rainer > 2. The data in each contain presence or absence data in each cell > represented by a 1 or 0 respectively > 3. I would like to take the location of each 1 (presence cell) in > raster file 1 and measure the euclidean distance to the nearest 1 > (presence cell) in raster file 2. > > Obviously in some cases there will be overlap so the distance will be zero. > > 4. I would like the output file to have each individual measurement on > a seperate line in a single file. > > > I am very new to R, so any help would be appreciated. > > Best regards > Paul - -- Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation Biology, UCT), Dipl. Phys. 
(Germany) Centre of Excellence for Invasion Biology Natural Sciences Building Office Suite 2039 Stellenbosch University Main Campus, Merriman Avenue Stellenbosch South Africa Tel: +33 - (0)9 53 10 27 44 Cell: +27 - (0)8 39 47 90 42 Fax (SA): +27 - (0)8 65 16 27 82 Fax (D) : +49 - (0)3 21 21 25 22 44 Fax (FR): +33 - (0)9 58 10 27 44 email: Rainer at krugs.de Skype: RMkrug -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk2a0sUACgkQoYgNqgF2egqpqACfa2FdwXYwn7i+woC6RnFnURE8 p2kAn1Q833jkNyG9EfkQUIoycsdlDJWp =aykq -----END PGP SIGNATURE----- From Thierry.ONKELINX at inbo.be Tue Apr 5 10:31:14 2011 From: Thierry.ONKELINX at inbo.be (ONKELINX, Thierry) Date: Tue, 5 Apr 2011 08:31:14 +0000 Subject: [R] Euclidean Distance in R In-Reply-To: References: Message-ID: Dear Paul, The command RSiteSearch("nearest neighbour") Will give you the answer that you need. (The second hit is the function you want). Best regards, Thierry ---------------------------------------------------------------------------- ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek team Biometrie & Kwaliteitszorg Gaverstraat 4 9500 Geraardsbergen Belgium Research Institute for Nature and Forest team Biometrics & Quality Assurance Gaverstraat 4 9500 Geraardsbergen Belgium tel. + 32 54/436 185 Thierry.Onkelinx at inbo.be www.inbo.be To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. 
~ John Tukey > -----Oorspronkelijk bericht----- > Van: r-help-bounces at r-project.org > [mailto:r-help-bounces at r-project.org] Namens Paul Duckett > Verzonden: dinsdag 5 april 2011 3:44 > Aan: R-help at r-project.org > Onderwerp: [R] Euclidean Distance in R > > Hi > > 1. I have two raster files *.asc (identical size) 2. The data > in each contain presence or absence data in each cell > represented by a 1 or 0 respectively 3. I would like to take > the location of each 1 (presence cell) in raster file 1 and > measure the euclidean distance to the nearest 1 (presence > cell) in raster file 2. > > Obviously in some cases there will be overlap so the distance > will be zero. > > 4. I would like the output file to have each individual > measurement on a seperate line in a single file. > > > I am very new to R, so any help would be appreciated. > > Best regards > Paul > -- > Paul Duckett - PhD Candidate > Conservation Genetics Lab > E8A 264 > Biological Sciences > Faculty of Science > Macquarie University > North Ryde > NSW 2109 > http://paulduckett.redbubble.com > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From p.pagel at wzw.tum.de Tue Apr 5 10:38:36 2011 From: p.pagel at wzw.tum.de (Philipp Pagel) Date: Tue, 5 Apr 2011 10:38:36 +0200 Subject: [R] Saving console and graph output to same file In-Reply-To: References: Message-ID: <20110405083836.GA4629@maker> On Tue, Apr 05, 2011 at 10:53:03AM +0530, Nikhil Abhyankar wrote: > Hello All, > > How do I save the output of the R console and the graphic output to the same > PDF file and append these to each other? > > I need to have a frequency table and a corresponding graph, one below the > other in a file. 
I have tried with sending the cross table to the graph
> window using 'textplot' and then saving the graphic output. However, the
> table does not look nice in the graph output.
>
> Is there any way the output from the console can be saved in a file and then
> the output from the graph window be appended to the same file?

Sweave and odfWeave are very nice methods for generating reports that combine text, R code, results from R and graphics.

cu
Philipp

--
Dr. Philipp Pagel
Lehrstuhl für Genomorientierte Bioinformatik
Technische Universität München
Wissenschaftszentrum Weihenstephan
Maximus-von-Imhof-Forum 3
85354 Freising, Germany
http://webclu.bio.wzw.tum.de/~pagel/

From mdowle at mdowle.plus.com Tue Apr 5 10:42:33 2011
From: mdowle at mdowle.plus.com (Matthew Dowle)
Date: Tue, 5 Apr 2011 09:42:33 +0100
Subject: [R] General binary search?
References:
Message-ID:

Try data.table:::sortedmatch, which is implemented in C.
It requires its input to be sorted (and doesn't check).

"Stavros Macrakis" wrote in message
news:BANLkTi=j2LF5SyXYTV1Dd4k9wr0ZGK88ig at mail.gmail.com...
> Is there a generic binary search routine in a standard library which
>
> a) works for character vectors
> b) runs in O(log(N)) time?
>
> I'm aware of findInterval(x,vec), but it is restricted to numeric vectors.
>
> I'm also aware of various hashing solutions (e.g. new.env(hash=TRUE) and
> fastmatch), but I need the greatest-lower-bound match in my application.
>
> findInterval is also slow for large N=length(vec) because of the O(N)
> checking it does, as Duncan Murdoch has pointed
> out:
> though
> its documentation says it runs in O(n * log(N)), it actually runs in O(n *
> log(N) + N), which is quite noticeable for largish N. But that is easy
> enough to work around by writing a variant of findInterval which calls
> find_interv_vec without checking.
>
> -s
>
> PS Yes, binary search is a one-liner in R, but I always prefer to use
> standard, fast native libraries when possible....
>
> binarysearch <- function(val,tab,L,H) {while (H>=L) { M=L+(H-L) %/% 2; if
> (tab[M]>val) H<-M-1 else if (tab[M]<val) L<-M+1 else return(M)};
> return(L-1)}
>
> [[alternative HTML version deleted]]
>

From enricoschumann at yahoo.de Tue Apr 5 10:53:35 2011
From: enricoschumann at yahoo.de (Enrico Schumann)
Date: Tue, 5 Apr 2011 10:53:35 +0200
Subject: [R] RODBC excel - need to preserve (or extract) numeric column names
In-Reply-To: <63F107BCC37AEA49A75FD94AA3E07CB004AFD71A@pacpbsex01.pac.dfo-mpo.ca>
References: <63F107BCC37AEA49A75FD94AA3E07CB004AFD71A@pacpbsex01.pac.dfo-mpo.ca>
Message-ID:

At least for Excel 2003 on my computer (Win XP) I can "persuade" Excel to treat cells like text by prepending a ' to the entry (e.g., '1000). Then sqlFetch/RODBC should import these cells as character. [But a number would not be a valid column name for a data.frame, and you may run into other trouble. See ?make.names]

regards,
enrico

> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of Folkes, Michael
> Sent: Tuesday, 5 April 2011 00:05
> To: r-help at r-project.org
> Subject: [R] RODBC excel - need to preserve (or extract)
> numeric column names
>
> I'm using RODBC to read an excel file (not mine!). But I'm
> struggling to find a way to preserve the column names that
> have a numeric value. sqlFetch() drops the value and calls
> them f1, f2, f3,... (i.e. field number). This is a different
> approach from read.csv, which will append "V" prior to the
> numeric column name. sqlFetch isn't so helpful.
>
> Is there a way to get the first line of data from the excel
> file and place it in a vector? Perhaps I can use that method
> and rename the dataframe column names later?
>
> thanks!
> Michael
>
> _______________________________________________________
> Michael Folkes
> Salmon Stock Assessment
> Canadian Dept. of Fisheries & Oceans
> Pacific Biological Station
> 3190 Hammond Bay Rd.
> Nanaimo, B.C., Canada > V9T-6N7 > Ph (250) 756-7264 Fax (250) 756-7053 > Michael.Folkes at dfo-mpo.gc.ca > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From leray.guillaume at gmail.com Tue Apr 5 11:30:29 2011 From: leray.guillaume at gmail.com (guillaume Le Ray) Date: Tue, 5 Apr 2011 11:30:29 +0200 Subject: [R] grImport/ghostscript problems In-Reply-To: References: <4D8F8E7A.60105@auckland.ac.nz> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From ligges at statistik.tu-dortmund.de Tue Apr 5 11:47:59 2011 From: ligges at statistik.tu-dortmund.de (Uwe Ligges) Date: Tue, 05 Apr 2011 11:47:59 +0200 Subject: [R] Creating multiple vector/list names-novice In-Reply-To: <1301987275755-3427283.post@n4.nabble.com> References: <1301927063332-3425616.post@n4.nabble.com> <1301929772084-3425759.post@n4.nabble.com> <1301987275755-3427283.post@n4.nabble.com> Message-ID: <4D9AE54F.10503@statistik.tu-dortmund.de> On 05.04.2011 09:07, michalseneca wrote: > The exact would be for example that I shoul be able then to choose "a" from > "abc". and I cannot do that. The is rather unhelpful for helpers without quoting what your original question and the answers were. Uwe Ligges > -- > View this message in context: http://r.789695.n4.nabble.com/Creating-multiple-vector-list-names-novice-tp3425616p3427283.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
From ligges at statistik.tu-dortmund.de Tue Apr 5 12:00:06 2011 From: ligges at statistik.tu-dortmund.de (Uwe Ligges) Date: Tue, 05 Apr 2011 12:00:06 +0200 Subject: [R] Problem using svm.tune In-Reply-To: <208248.58466.qm@web37905.mail.mud.yahoo.com> References: <208248.58466.qm@web37905.mail.mud.yahoo.com> Message-ID: <4D9AE826.50909@statistik.tu-dortmund.de> On 04.04.2011 12:43, sadaf zaidi wrote: > Dear Sir, > > I am stuck with a nagging problem in using R for SVM regression. My data has 5 > dimensions and 400 observations. The independent variables are : > Peb, Ksub, Sub, and Xtt. > The dependent variable is: Rexp. > I tried using the svm.tune function as well as<_tune(svm.....), to tune the > hyper parameters: gamma, epsilon and C. > > Since I am new to R, I am probably not using the svm.tune function properly. I > am getting the following error message: > Error in predict.svm(ret, xhold, decision.values=TRUE): Model is empty! > May you please help me!SADAF ZAIDI Please show us a reproducible examples with all your code. Otherwise it is hard to find where the error comes from. Uwe Ligges > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From Samuel.Le at srlglobal.com Tue Apr 5 12:00:16 2011 From: Samuel.Le at srlglobal.com (Samuel Le) Date: Tue, 5 Apr 2011 10:00:16 +0000 Subject: [R] converting "call" objects into character In-Reply-To: References: <037B6CC7-D60B-4B6F-A9B3-5F95F33B8174@comcast.net> Message-ID: Hi, David and Douglas, thanks for the effort in helping me. It seems that deparse(match.call()) is doing the trick. I learned that the class "call" is not easy to handle in R. 
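For the record, a minimal sketch of the logging idea, assuming a plain-text log written with cat(); the log file location and the timestamp format are illustrative choices, not part of the original question:

```r
log_file <- tempfile(fileext = ".log")  # hypothetical log location

fTest <- function(x) {
  ## deparse() converts the unevaluated call to character; collapsing guards
  ## against long calls that deparse() splits over several strings
  call_text <- paste(deparse(match.call()), collapse = " ")
  cat(format(Sys.time()), call_text, "\n", file = log_file, append = TRUE)
  x
}

fTest(x = 2)
fTest(3)
readLines(log_file)  # one line per call, each ending in the deparsed call
```

Because match.call() fills in argument names, the second call is logged as fTest(x = 3) even though it was typed as fTest(3).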
Samuel -----Original Message----- From: David Winsemius [mailto:dwinsemius at comcast.net] Sent: 03 April 2011 18:56 To: Douglas Bates Cc: Samuel Le; r-help at r-project.org Subject: Re: [R] converting "call" objects into character On Apr 3, 2011, at 1:22 PM, Douglas Bates wrote: > On Sun, Apr 3, 2011 at 11:42 AM, David Winsemius > wrote: >> >> On Apr 3, 2011, at 12:14 PM, Samuel Le wrote: >> >>> Dear all, >>> >>> >>> >>> I would like to log the calls to my functions. I am trying to do >>> this >>> using the function match.call(): >> >> fTest<-function(x) >> >> { theCall<-match.call() >> print(theCall) >> return(list(x=x, logf = theCall)) >> } >> >>> >>> fTest(x=2)$x >> [1] 2 >>> fTest(x=2)$logf >> fTest(x = 2) >>> str(fTest(x=2)$logf) >> language fTest(x = 2) >> >> You may want to convert that call component to a character object, >> since: >> >>> cat(fTest(x=2)$logf) >> Error in cat(list(...), file, sep, fill, labels, append) : >> argument 1 (type 'language') cannot be handled by 'cat' > > If you want to examine a call object you need to ensure that it is not > evaluated. Evaluating a number or a character string is not a problem > because > > eval(4) > > is the same as > > 4 > > However, evaluating a function call should be different from the call > itself. As David shows, the str function is careful not to evaluate > the call object. (Martin and I found ourselves going around in > circles when looking at the structure of a fitted model object that > included a call and he kindly changed the behavior of str().) > > So you need to decide when a function, such as print(), evaluates its > arguments or when it doesn't, which can get kind of complicated. 
An > alternative is to use match.call() repeatedly instead of trying to > save the value, as in > >> fTest > function(x) { > print(match.call()) > list(x=x, logf = match.call()) > } >> fTest(x=2) > fTest(x = 2) > $x > [1] 2 > > $logf > fTest(x = 2) > > The trick there is that the value of match.call() is the unevaluated > call whereas > > myCall <- match.call() > print(myCall) > > evaluates myCall in the call to print, thereby evaluating the function > fTest again. > > Is this sufficiently confusing? :-) Yes, I am now sufficiently confused^W , ... er, motivated to look for another route. I think the way out of the confusion is to turn the call into text and since as.character doesn't do a very neat a job, I would suggest instead: deparse() > fTest <- function(x) { + print(match.call()) + list(x=x, logf = deparse(match.call())) + } > fTest(x=3)$logf fTest(x = 3) [1] "fTest(x = 3)" > cat(fTest(x=3)$logf) fTest(x = 3) fTest(x = 3) cat() is a convenient test of the capacity of an object to be written to a file. It has an append parameter that implies it could serve the logging function requested by the OP. >>> >>> I can see "theCall" printed into the console, but I don't manage to >>> convert it into a character to write it into a log file with other >>> informations. >>> >>> Can anyone help? David Winsemius, MD West Hartford, CT __________ Information from ESET NOD32 Antivirus, version of virus signature database 6011 (20110403) __________ The message was checked by ESET NOD32 Antivirus. http://www.eset.com __________ Information from ESET NOD32 Antivirus, version of virus signature database 6016 (20110405) __________ The message was checked by ESET NOD32 Antivirus. 
http://www.eset.com

From alexandrovich at mathematik.uni-marburg.de Tue Apr 5 11:43:57 2011
From: alexandrovich at mathematik.uni-marburg.de (Grigory Alexandrovich)
Date: Tue, 05 Apr 2011 11:43:57 +0200
Subject: [R] Animation for pers3d
Message-ID: <4D9AE45D.70205@mathematik.uni-marburg.de>

Hello all,

I use persp3d from the rgl package to plot a surface. The typical call is persp3d(x, y, z), with coordinate vectors x, y and a matrix z of function values.

Now I have different z's, say z_1,...,z_n

Question:

Is it possible to generate an animation from a sequence of such calls, for different z's?
I would like to see how the surface changes over time.

Thank you
Grigory Alexandrovich

From hsanz at imim.es Tue Apr 5 10:49:42 2011
From: hsanz at imim.es (hsr)
Date: Tue, 5 Apr 2011 03:49:42 -0500 (CDT)
Subject: [R] frailty
Message-ID: <1301993382370-3427452.post@n4.nabble.com>

Hi R-users

I spent a lot of time searching the web but I didn't find a clear answer. I have some doubts about the 'frailty' function of the 'survival' package.
The following model was fitted with the R function 'coxph':

modx <- coxph(Surv(to_stroke, stroke) ~ age + sbp + dbp + sex +
frailty(center,distribution = "gamma", method='aic'), data=datax)

Then I get survival (e.g. to 10 years) for the mean of the covariates:

survfit1 <- survfit(modx)
timesele<- 3652.25
tab <- as.data.frame(cbind(survfit1$time, survfit1$surv))
names(tab) <- c("time", "surv")
meansurv <- tab[tab$time==max(tab$time[tab$time
References: <8BC2E7AD-5B2F-4861-837A-A67932E1CD58@mpipsykl.mpg.de>
Message-ID:

>        as.matrix(p.adjust(as.dist(pmat)))

Perfect! Thanks.

j.

>
>
> Benno
>
> On 4.Apr.2011, at 17:02, January Weiner wrote:
>
>> Dear all,
>>
>> I have an n x n matrix of p-values. The matrix is symmetrical, as it
>> describes the "each against each" p values of correlation
>> coefficients.
>>
>> How can I best correct the p values of the matrix?
Notably, the total >> number of the tests performed is n(n-1)/2, since I do not test the >> correlation of each variable with itself. That means, I only want to >> correct one half of the matrix, not including the diagonal. Therefore, >> simply writing >> >> pmat <- p.adjust( pmat, method= "fdr" ) >> # where pmat is an n x n matrix >> >> ...doesn't cut it. >> >> Of course, I can turn the matrix in to a three column data frame with >> n(n-1)/2 rows, but that is slow and not elegant. >> >> regards, >> j. >> >> -- >> -------- Dr. January Weiner 3 -------------------------------------- >> Max Planck Institute for Infection Biology >> Charit?platz 1 >> D-10117 Berlin, Germany >> Web ? : www.mpiib-berlin.mpg.de >> Tel ? ? : +49-30-28460514 >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > -- -------- Dr. January Weiner 3 -------------------------------------- Max Planck Institute for Infection Biology Charit?platz 1 D-10117 Berlin, Germany Web?? : www.mpiib-berlin.mpg.de Tel? ?? : +49-30-28460514 From january.weiner at mpiib-berlin.mpg.de Tue Apr 5 12:02:35 2011 From: january.weiner at mpiib-berlin.mpg.de (January Weiner) Date: Tue, 5 Apr 2011 12:02:35 +0200 Subject: [R] Adjusting p values of a matrix In-Reply-To: References: Message-ID: > 1. This is not an R question, AFAICS. I am afraid I was not clear enough. I am wondering how to best correct p values that are stored in a matrix, or, in more general: how to apply a function that takes a vector as an argument to the upper right (or, equivalently, lower left) half of a matrix, excluding the diagonal. for... in loop is a trivial, but slow and not elegant solution. 
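To make the question concrete, here is a minimal sketch of one loop-free way to apply a vector function to the upper triangle only, using upper.tri() for logical indexing. The matrix below is made up for illustration, and "fdr" is used only because it appears earlier in the thread; this says nothing about which correction is appropriate:

```r
## Made-up symmetric p-value matrix (assumption: illustration only)
set.seed(42)
n <- 5
pmat <- matrix(NA_real_, n, n)
pmat[upper.tri(pmat)] <- runif(n * (n - 1) / 2)
pmat[lower.tri(pmat)] <- t(pmat)[lower.tri(pmat)]

## Adjust only the n(n-1)/2 distinct tests in the upper triangle
padj <- pmat
ut <- upper.tri(padj)
padj[ut] <- p.adjust(pmat[ut], method = "fdr")

## Mirror the adjusted values into the lower triangle
padj[lower.tri(padj)] <- t(padj)[lower.tri(padj)]
```

The diagonal stays NA throughout, so the self-correlations never enter the correction.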
Naturally, what correction should I use in case of tests which clearly are not independent is another matter, and I agree on that with you. Best regards, January > > 2. Sounds like a research topic. ?I don't think there's a meaningful > simple answer. I suspect it strongly depends on the model and context. > > -- Bert > > On Mon, Apr 4, 2011 at 8:02 AM, January Weiner > wrote: >> Dear all, >> >> I have an n x n matrix of p-values. The matrix is symmetrical, as it >> describes the "each against each" p values of correlation >> coefficients. >> >> How can I best correct the p values of the matrix? Notably, the total >> number of the tests performed is n(n-1)/2, since I do not test the >> correlation of each variable with itself. That means, I only want to >> correct one half of the matrix, not including the diagonal. Therefore, >> simply writing >> >> pmat <- p.adjust( pmat, method= "fdr" ) >> # where pmat is an n x n matrix >> >> ...doesn't cut it. >> >> Of course, I can turn the matrix in to a three column data frame with >> n(n-1)/2 rows, but that is slow and not elegant. >> >> regards, >> j. >> >> -- >> -------- Dr. January Weiner 3 -------------------------------------- >> Max Planck Institute for Infection Biology >> Charit?platz 1 >> D-10117 Berlin, Germany >> Web?? : www.mpiib-berlin.mpg.de >> Tel? ?? : +49-30-28460514 >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > > > -- > "Men by nature long to get on to the ultimate truths, and will often > be impatient with elementary studies or fight shy of them. If it were > possible to reach the ultimate truths without the elementary studies > usually prefixed to them, these would not be preparatory studies but > superfluous diversions." 
> > -- Maimonides (1135-1204) > > Bert Gunter > Genentech Nonclinical Biostatistics > -- -------- Dr. January Weiner 3 -------------------------------------- Max Planck Institute for Infection Biology Charit?platz 1 D-10117 Berlin, Germany Web?? : www.mpiib-berlin.mpg.de Tel? ?? : +49-30-28460514 From rasanpreet.kaur at gmail.com Tue Apr 5 10:45:12 2011 From: rasanpreet.kaur at gmail.com (rasanpreet kaur suri) Date: Tue, 5 Apr 2011 10:45:12 +0200 Subject: [R] system() command in R In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From lars52r at gmail.com Tue Apr 5 12:46:57 2011 From: lars52r at gmail.com (Lars Bishop) Date: Tue, 5 Apr 2011 06:46:57 -0400 Subject: [R] Help in splitting a list Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From jim at bitwrit.com.au Tue Apr 5 14:11:51 2011 From: jim at bitwrit.com.au (Jim Lemon) Date: Tue, 05 Apr 2011 22:11:51 +1000 Subject: [R] gap.barplot doesn't support data arrays? In-Reply-To: <4d99ca0f.cc7e0e0a.0b7e.ffffa600@mx.google.com> References: <4d99ca0f.cc7e0e0a.0b7e.ffffa600@mx.google.com> Message-ID: <4D9B0707.7070006@bitwrit.com.au> On 04/04/2011 11:39 PM, Andrew D. Steen wrote: > I am trying to make a barplot with a broken axis using gap.barplot (in the > indispensable plotrix package). Aww, gee, you've won me. 
> This works well when the data is a vector: > >> twogrp<-c(rnorm(10)+4,rnorm(10)+20) >> gap.barplot(twogrp,gap=c(8,16),xlab="Index",ytics=c(3,6,17,20),ylab="Group > values",main="Barplot with gap") > > But when the data is an array (for a bar plot with multiple series) I get an > error and a strange plot with no y-tics and bars stretching downwards, as if > all the values were negative: > >> twogrp2<-array(twogrp, dim=c(2,5)) >> > gap.barplot(twogrp2,gap=c(8,16),xlab="Index",ytics=c(3,6,17,20),ylab="Group > values",main="Barplot with gap") > > Error in rect(xtics[bigones] - halfwidth, botgap, xtics[bigones] + > halfwidth, : > cannot mix zero-length and non-zero-length coordinates > > However, the main title and axis labels do appear correctly. > > Are data arrays unsupported for gap.barplot, or am I missing something? > Hi Drew, You are right, as is Peter, gap.barplot doesn't support arrays, only vectors (I'll fix the docs). However, it wasn't too hard to whip up a rough but perhaps serviceable fix in the attached function. You may need to do some mods to the function to get exactly what you want. Try this: # your twogrp2 left out the high values twogrp2<-array(twogrp,dim=c(2,10)) source("gap.barp.R") gap.barp(twogrp2,gap=c(8,16),xlab="Index",ytics=c(3,6,17,20), xtics=1:10,ylab="Groupvalues",main="Barplot with gap",col=2:3) gap.barp returns the modified y values so that you can include the error bars. To paraphrase the immortal P.J. O'Rourke, "Perhaps R users shouldn't do some things in plots, but they certainly can do them" Jim -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... 
Name: gap.barp.R URL: From ggrothendieck at gmail.com Tue Apr 5 13:13:30 2011 From: ggrothendieck at gmail.com (Gabor Grothendieck) Date: Tue, 5 Apr 2011 07:13:30 -0400 Subject: [R] converting "call" objects into character In-Reply-To: References: Message-ID: On Sun, Apr 3, 2011 at 12:14 PM, Samuel Le wrote: > Dear all, > > > > I would like to log the calls to my functions. I am trying to do this using the function match.call(): > > > > fTest<-function(x) > > { > > ? ? ?theCall<-match.call() > > ? ? ?print(theCall) > > ? ? ?return(x) > > } > > > >> fTest(2) > > fTest(x = 2) > > [1] 2 > > > > I can see "theCall" printed into the console, but I don't manage to convert it into a character to write it into a log file with other informations. > See ?trace -- Statistics & Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com From jim at bitwrit.com.au Tue Apr 5 14:17:31 2011 From: jim at bitwrit.com.au (Jim Lemon) Date: Tue, 05 Apr 2011 22:17:31 +1000 Subject: [R] Error in "color2D.matplot" : "Error in plot.new() : figure margins too large" Message-ID: <4D9B085B.7080709@bitwrit.com.au> Hi all, Just to let you know that the error was in the calculation (and the error message maybe peculiar to the original poster's system), not color2D.matplot. The message sent to the original poster was long and not very meaningful to those who didn't get the data and code (i.e. everybody except me). Jim From murdoch.duncan at gmail.com Tue Apr 5 13:24:48 2011 From: murdoch.duncan at gmail.com (Duncan Murdoch) Date: Tue, 05 Apr 2011 07:24:48 -0400 Subject: [R] Animation for pers3d In-Reply-To: <4D9AE45D.70205@mathematik.uni-marburg.de> References: <4D9AE45D.70205@mathematik.uni-marburg.de> Message-ID: <4D9AFC00.4070000@gmail.com> On 11-04-05 5:43 AM, Grigory Alexandrovich wrote: > Hello all, > > I use persp3d from the rgl-package to plot a sruface. 
The typical call > is persp3d(x, y, z) > with coordinate vectors x, y and a matrix of function values z. > > Now I have different z's, say z_1,...,z_n > > Question: > > Is it possible to generate an animation from a sequence of such calls, > for different z's? > I would like to see how the surface changes over time. Yes, you can do animations. See example(persp3d) for one that changes the viewpoint. If you want to change the content of the plot, save the result of persp3d() on the first call, e.g. objs <- persp3d( ... ) surface <- objs["surface"] then delete and re-plot the "surface" element: rgl.pop(id=surface) # compute new x y z surface <- surface3d(x, y, z) If the scale changes during the animation it's likely to look ugly, so set the limits in the persp3d call (or turn off the axes). Duncan Murdoch From andrew.decker.steen at gmail.com Tue Apr 5 13:34:49 2011 From: andrew.decker.steen at gmail.com (Andrew D. Steen) Date: Tue, 5 Apr 2011 13:34:49 +0200 Subject: [R] gap.barplot doesn't support data arrays? In-Reply-To: <4D9B0707.7070006@bitwrit.com.au> References: <4d99ca0f.cc7e0e0a.0b7e.ffffa600@mx.google.com> <4D9B0707.7070006@bitwrit.com.au> Message-ID: <4d9afe5a.8b7c0e0a.2d37.0f69@mx.google.com> Jim, That works great. Thanks much for the quick help. Cheers, Drew > -----Original Message----- > From: Jim Lemon [mailto:jim at bitwrit.com.au] > Sent: Tuesday, April 05, 2011 2:12 PM > To: Andrew D. Steen > Cc: r-help at r-project.org > Subject: Re: [R] gap.barplot doesn't support data arrays? > > On 04/04/2011 11:39 PM, Andrew D. Steen wrote: > > I am trying to make a barplot with a broken axis using gap.barplot > (in > > the indispensable plotrix package). > > Aww, gee, you've won me.
> > > This works well when the data is a vector: > > > >> twogrp<-c(rnorm(10)+4,rnorm(10)+20) > >> > gap.barplot(twogrp,gap=c(8,16),xlab="Index",ytics=c(3,6,17,20),ylab=" > >> Group > > values",main="Barplot with gap") > > > > But when the data is an array (for a bar plot with multiple series) I > > get an error and a strange plot with no y-tics and bars stretching > > downwards, as if all the values were negative: > > > >> twogrp2<-array(twogrp, dim=c(2,5)) > >> > > > gap.barplot(twogrp2,gap=c(8,16),xlab="Index",ytics=c(3,6,17,20),ylab=" > > Group > > values",main="Barplot with gap") > > > > Error in rect(xtics[bigones] - halfwidth, botgap, xtics[bigones] + > > halfwidth, : > > cannot mix zero-length and non-zero-length coordinates > > > > However, the main title and axis labels do appear correctly. > > > > Are data arrays unsupported for gap.barplot, or am I missing > something? > > > Hi Drew, > You are right, as is Peter, gap.barplot doesn't support arrays, only > vectors (I'll fix the docs). However, it wasn't too hard to whip up a > rough but perhaps serviceable fix in the attached function. You may > need to do some mods to the function to get exactly what you want. Try > this: > > # your twogrp2 left out the high values > twogrp2<-array(twogrp,dim=c(2,10)) > source("gap.barp.R") > gap.barp(twogrp2,gap=c(8,16),xlab="Index",ytics=c(3,6,17,20), > xtics=1:10,ylab="Groupvalues",main="Barplot with gap",col=2:3) > > gap.barp returns the modified y values so that you can include the > error bars. > > To paraphrase the immortal P.J. 
O'Rourke, > > "Perhaps R users shouldn't do some things in plots, but they certainly > can do them" > > Jim From wwwhsd at gmail.com Tue Apr 5 13:53:16 2011 From: wwwhsd at gmail.com (Henrique Dallazuanna) Date: Tue, 5 Apr 2011 08:53:16 -0300 Subject: [R] Help in splitting a list In-Reply-To: References: Message-ID: Try this: mapply('[', lapply(mylist, as.data.frame), c(index, lapply(index, `!`)), SIMPLIFY = FALSE) On Tue, Apr 5, 2011 at 7:46 AM, Lars Bishop wrote: > Dear R users, > > Let's say I have a list with components being 'm' matrices (as exemplified > in the "mylist" object below). Now, I'd like to subset this list based on an > index vector, which will partition each matrix 'm' in 2 sub-matrices. My > questions are: > > 1. Is there an elegant way to have the results shown in mylist2 for an > arbitrary number of matrices in mylist? > > 2. The column names are 'lost' for mylist2[[2]] and mylist2[[4]] (but not > for mylist2[[1]] and mylist2[[3]]). Is there a way to keep the column names > in the results of mylist2? > > mylist <- list(matrix(1:9,3,3), matrix(10:18,3,3)) > colnames(mylist[[1]])=c('x1','x2','x3') > colnames(mylist[[2]])=c('x4','x5','x6') > index <- list(2) > index[[1]] <- c(TRUE,FALSE,TRUE) > index[[2]] <- c(FALSE,TRUE,TRUE) > mylist2 <- list(as.matrix(mylist[[1]][,index[[1]]]), > as.matrix(mylist[[1]][,!index[[1]]]), > as.matrix(mylist[[2]][,index[[2]]]), > as.matrix(mylist[[2]][,!index[[2]]])) > Thanks for any help, > Lars. > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40" S 49°
16' 22" O From deepayan.sarkar at gmail.com Tue Apr 5 14:47:24 2011 From: deepayan.sarkar at gmail.com (Deepayan Sarkar) Date: Tue, 5 Apr 2011 18:17:24 +0530 Subject: [R] lattice: how to "center" a subtitle? In-Reply-To: <4D9A6558.4050405@auckland.ac.nz> References: <57C6A6F6-4CB1-49B2-A0DA-4480EFD385A3@comcast.net> <4D9A6558.4050405@auckland.ac.nz> Message-ID: On Tue, Apr 5, 2011 at 6:12 AM, David Scott wrote: [...] > I am not sure where I read it and I can't find it again, but my > understanding is that expressions using bquote with lattice need to be > enclosed in as.expression() to work. That is in contrast to what happens in > base graphics. > > Here is a simple example. > > a <- 2 > plot(1:10, a*(1:10), main = bquote(alpha == .(a))) > require(lattice) > xyplot(a*(1:10)~ 1:10, main = bquote(alpha == .(a))) > xyplot(a*(1:10)~ 1:10, main = as.expression(bquote(alpha == .(a)))) > > Which produces: > >> a <- 2 >> plot(1:10, a*(1:10), main = bquote(alpha == .(a))) >> require(lattice) > Loading required package: lattice >> xyplot(a*(1:10)~ 1:10, main = bquote(alpha == .(a))) > Error in trellis.skeleton(formula = a * (1:10) ~ 1:10, cond = list(c(1L, : > object 'alpha' not found >> xyplot(a*(1:10)~ 1:10, main = as.expression(bquote(alpha == .(a)))) > > Using expression() rather than as.expression() doesn't produce the desired > effect. Try it yourself. > > As to why this is the case ..... Let's see: ?xyplot says 'main': Typically a character string or expression describing the main title to be placed on top of each page. [...] So, lattice is fairly explicit, by R standards, in requiring 'main' to be "character" or "expression". On the other hand, ?title says The labels passed to 'title' can be character strings or language objects (names, calls or expressions), or [...] so it additionally accepts "names" and "calls".
Now, we have > a <- 2 > foo <- bquote(alpha == .(a)) > foo # Looks OK alpha == 2 > mode(foo) # But [1] "call" > is.expression(foo) # not an expression [1] FALSE > is.expression(expression(foo)) ## YES, but [1] TRUE > expression(foo) ## not what we want expression(foo) > is.expression(as.expression(foo)) [1] TRUE > as.expression(foo) ## This IS what we want expression(alpha == 2) So I submit that lattice is behaving exactly as suggested by its documentation. Now you would naturally argue that this is hiding behind technicalities, and if "call" objects work for plot(), it should work for lattice as well. But watch this: > plot(1:10, main = foo) # works perfectly > arglist <- list(1:10, main = foo) > arglist # Looks like what we want [[1]] [1] 1 2 3 4 5 6 7 8 9 10 $main alpha == 2 > do.call(plot, arglist) Error in as.graphicsAnnot(main) : object 'alpha' not found ...which I would say is "unexpected" behaviour, if not a bug. The moral of the story is that unevaluated calls are dangerous objects (try this one out for fun: foo <- bquote(q(.(x)), list(x = "no")) do.call(plot, list(1:10, main = foo)) ), and carrying them around is not a good idea. Lattice does use the do.call paradigm quite a bit, and I think it might be quite difficult to fix it up to handle non-expression language objects (which will still not fix the type of problem shown above). -Deepayan From marchywka at hotmail.com Tue Apr 5 14:15:30 2011 From: marchywka at hotmail.com (Mike Marchywka) Date: Tue, 5 Apr 2011 08:15:30 -0400 Subject: [R] system() command in R In-Reply-To: References: , Message-ID: ---------------------------------------- > Date: Tue, 5 Apr 2011 13:37:12 +0530 > From: nandan.amar at gmail.com > To: rasanpreet.kaur at gmail.com > CC: r-help at r-project.org > Subject: Re: [R] system() command in R > > On 4 April 2011 16:54, rasanpreet kaur suri wrote: > > > Hi all, > > I have a local server insalled on my system and have to start that from > > within my R function. 
> > > > here is how I start it: > > > > cmd<-"sh start-server.sh" > > > > system(cmd, wait=FALSE) > > > > My function has to start the server and proceed with further steps. The > > server starts but the further steps of the program are not executed. The > > cursor keeps waiting after the server is started. > > > > How are you executing further steps after starting server, meant for server > from R ?? > > > I tried removing the wait=FALSE, but it still keeps waiting. > > > > I also tried putting the start-server in a separate function and my further > > script in a separate function and then run them together, but it still > > waits. The transition from the start of server to next step is not > > happening. > > > > Please help. I have been stuck on this for quite some time now. > > > > -- I haven't done this in R but expect to do so soon. I just got done with some Java code to do something similar, and you can expect that in any implementation these things will be system dependent. It often helps to have simple test cases to isolate the problem. Here I made a test script called "foo" that takes a minute or so to execute and generates some output. If I type system("./foo",wait=F) the prompt comes back right away, but stdout still seems to go to my console, and maybe stdin is not redirected either, so it could eat your input (no idea, but this is probably not what you want). I did try the following, which could fix your problem; on Debian, at least, it seems to work: system("nohup ./foo &") See "man nohup" for details. > > Rasanpreet Kaur > > > > [[alternative HTML version deleted]] > > > > ______________________________________________ > > R-help at r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide > > http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code.
> > > > > > -- > Amar Kumar Nandan > Karnataka, India, 560100 > http://aknandan.co.nr > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. From pengchy at gmail.com Tue Apr 5 13:09:12 2011 From: pengchy at gmail.com (Pengcheng Yang) Date: Tue, 05 Apr 2011 19:09:12 +0800 Subject: [R] how to label customized y axis when using lattice parallel parameter common.scale=TRUE Message-ID: <4D9AF858.90200@gmail.com> Dear all, When I use parallel function in lattice package, I want to label the y-axis with customized numbers. Like this: parallel(~iris[1:4] | Species, iris,horiz=FALSE,common.scale=TRUE, scales=list(y=list(at=c(0,2,3)))) But only "Min" label in the y-axis, nothing happened. Could anyone help me? Thanks. Regards, Pengcheng Yang From ravi.kulk at gmail.com Tue Apr 5 14:35:04 2011 From: ravi.kulk at gmail.com (Ravi Kulkarni) Date: Tue, 5 Apr 2011 07:35:04 -0500 (CDT) Subject: [R] Time series example in Koop Message-ID: <1302006904537-3427897.post@n4.nabble.com> I am trying to reproduce the output of a time series example in Koop's book "Analysis of Financial Data". Koop does the example in Excel and I used the ts function followed by the lm function. I am unable to get the exact coefficients that Koop gives - my coefficients are slightly different. After loading the data file and attaching the frame, my code reads: > y = ts(m.cap) > x = ts(oil.price) > d = ts.union(y,x,x1=lag(x,-1),x2=lag(x,-2),x3=lag(x,-3),x4=lag(x,-4)) > mod1 = lm(y~x+x1+x2+x3+x4, data=d) > summary(mod1) Koop gives an intercept of 92001.51, while the code above gives 91173.32. The other coefficients are also slightly off. This is the example in Table 8.3 of Koop. 
I also attach a plain text version of the tab separated file "badnews.txt". http://r.789695.n4.nabble.com/file/n3427897/badnews.txt badnews.txt Any light on why I do not get Koop's coefficients is most welcome... Ravi -- View this message in context: http://r.789695.n4.nabble.com/Time-series-example-in-Koop-tp3427897p3427897.html Sent from the R help mailing list archive at Nabble.com. From rhelpforum at gmail.com Tue Apr 5 13:20:40 2011 From: rhelpforum at gmail.com (beatleb) Date: Tue, 5 Apr 2011 06:20:40 -0500 (CDT) Subject: [R] Value between which elements of a vector? Message-ID: <1302002440873-3427751.post@n4.nabble.com> Dear R-useRs, I am looking for a way to perform the following: specialweeks<-c(0,2,5,12,18,19,20) weeks<-c(1:30) Now, for every week, I would like to find out between which elements of the vector specialweeks it lies. For weeks after 20, the value NA or 20, or even 20-30 is fine. Thus for week 1: 0-2 week 2: 2-5 week 3: 2-5 week 4: 2-5 week 5: 5-12 etc. It is not relevant if those intervals are captured in a matrix, in a vector or whatever. I hope that you can help me! With best regards, Brenda Grondman -- View this message in context: http://r.789695.n4.nabble.com/Value-between-which-elements-of-a-vector-tp3427751p3427751.html Sent from the R help mailing list archive at Nabble.com. From SVonfelten at uhbs.ch Tue Apr 5 12:38:34 2011 From: SVonfelten at uhbs.ch (Stefanie Von Felten) Date: Tue, 05 Apr 2011 12:38:34 +0200 Subject: [R] Confidence interval for the difference between proportions - method used in prop.test() Message-ID: <4D9B0D4A020000E100036749@mail.uhbs.ch> An embedded and charset-unspecified text was scrubbed...
Name: not available URL: From deepayan.sarkar at gmail.com Tue Apr 5 15:21:10 2011 From: deepayan.sarkar at gmail.com (Deepayan Sarkar) Date: Tue, 5 Apr 2011 18:51:10 +0530 Subject: [R] lattice xscale.components: different ticks on top/bottom axis In-Reply-To: <4d962ecc.0132640a.0557.ffffcf28SMTPIN_ADDED@mx.google.com> References: <201103101854.p2AIsL0A020203@hypatia.math.ethz.ch> <4d962ecc.0132640a.0557.ffffcf28SMTPIN_ADDED@mx.google.com> Message-ID: On Sat, Apr 2, 2011 at 1:29 AM, wrote: > >> On Fri, Mar 11, 2011 at 12:28 AM, >> wrote: >> > Good afternoon, >> > >> > I am trying to create a plot where the bottom and top axes have the >> > same scale but different tick marks. I tried user-defined >> > xscale.component function but it does not produce desired results. >> > Can anybody suggest where my use of xscale.component >> > function is incorrect? >> > >> > For example, the code below tries to create a plot where horizontal >> > axes limits are c(0,10), top axis has ticks at odd integers, and >> > bottom axis has ticks at even integers. >> > >> > library(lattice) >> > >> > df <- data.frame(x=1:10,y=1:10) >> > >> > xscale.components.A <- function(...,user.value=NULL) { >> > # get default axes definition list; print user.value >> > ans <- xscale.components.default(...) >> > print(user.value) >> > >> > # start with the same definition of bottom and top axes >> > ans$top <- ans$bottom >> > >> > # - bottom labels >> > ans$bottom$labels$at <- seq(0,10,by=2) >> > ans$bottom$labels$labels <- paste("B",seq(0,10,by=2),sep="-") >> > >> > # - top labels >> > ans$top$labels$at <- seq(1,9,by=2) >> > ans$top$labels$labels <- paste("T",seq(1,9,by=2),sep="-") >> > >> > # return axes definition list >> > return(ans) >> > } >> > >> > oltc <- xyplot(y~x,data=df, >> > >> > scales=list(x=list(limits=c(0,10),at=0:10,alternating=3)), >> > xscale.components=xscale.components.A, >> >
user.value=1) >> > print(oltc) >> > >> > The code generates a figure with incorrectly placed bottom and top >> > labels. Bottom labels "B-0", "B-2", ... are at 0, 1, ... and top >> > labels "T-1", "T-3", ... are at 0, 1, ... When axis-function runs out >> > of labels, it replaces labels with NA. >> > >> > It appears that lattice uses top$ticks$at to place labels and >> > top$labels$labels for labels. Is there a way to override this >> > behaviour (other than to expand the "labels$labels" vector to be as >> > long as "ticks$at" vector and set necessary elements to "")? >> >> Well, $ticks$at is used to place the ticks, and >> $labels$at is used to place the labels. They should typically >> be the same, but you have changed one and not the other. >> Everything seems to work if you set $ticks$at to the same >> values as $labels$at: >> >> >> ## - bottom labels >> + ans$bottom$ticks$at <- seq(0,10,by=2) >> ans$bottom$labels$at <- seq(0,10,by=2) >> ans$bottom$labels$labels <- paste("B",seq(0,10,by=2),sep="-") >> >> ## - top labels >> + ans$top$ticks$at <- seq(1,9,by=2) >> ans$top$labels$at <- seq(1,9,by=2) >> ans$top$labels$labels <- paste("T",seq(1,9,by=2),sep="-") >> >> >> > Also, can user-parameter be passed into xscale.components() >> > function? (For example, locations and labels of ticks on the top >> > axis). In the code above, print(user.value) returns NULL even >> > though in the xyplot() call user.value is 1. >> >> No. Unrecognized arguments are passed to the panel function >> only, not to any other function. However, you can always >> define an inline >> function: >> >> oltc <- xyplot(y~x,data=df, >> scales=list(x=list(limits=c(0,10), at = 0:10, >> alternating=3)), >> xscale.components = function(...) >> xscale.components.A(..., user.value=1)) >> >> Hope that helps (and sorry for the late reply).
>> >> -Deepayan >> > > Deepayan, > > Thank you very much for your reply. It makes things a bit clearer. > > In other words, in the list prepared by xscale.components(), vectors > $ticks$at and $labels$at must be the same. > If only every second tick is to be labelled then every second label > should be set explicitly to empty strings: Now when you put it that way, the current behaviour does seem wrong (I didn't read your original post carefully enough). I guess this was one of the not-yet-implemented things mentioned in the Details section of ?xscale.components.default. I have added support for different ticks$at and labels$at in the SVN sources in r-forge. You can test it from there (your original code works as expected). I won't make a new release on CRAN until after R 2.13 is released (we are almost in code freeze now). -Deepayan From djandrija at gmail.com Tue Apr 5 15:33:04 2011 From: djandrija at gmail.com (andrija djurovic) Date: Tue, 5 Apr 2011 15:33:04 +0200 Subject: [R] Value between which elements of a vector? In-Reply-To: <1302002440873-3427751.post@n4.nabble.com> References: <1302002440873-3427751.post@n4.nabble.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From deepayan.sarkar at gmail.com Tue Apr 5 15:42:51 2011 From: deepayan.sarkar at gmail.com (Deepayan Sarkar) Date: Tue, 5 Apr 2011 19:12:51 +0530 Subject: [R] how to label customized y axis when using lattice parallel parameter common.scale=TRUE In-Reply-To: <4D9AF858.90200@gmail.com> References: <4D9AF858.90200@gmail.com> Message-ID: On Tue, Apr 5, 2011 at 4:39 PM, Pengcheng Yang wrote: > Dear all, > > When I use parallel function in lattice package, I want to label the y-axis > with customized numbers. Like this: > > parallel(~iris[1:4] | Species, iris,horiz=FALSE,common.scale=TRUE, > scales=list(y=list(at=c(0,2,3)))) Parallel does not directly support that, and will insist on scaling the data.
However, you can control the scaling (using 'lower' and 'upper'), and override a couple of other arguments to get what you want: parallel(~iris[1:4] | Species, iris, xlim = extendrange(range(iris[1:4])), scales = list(x = list(at = NULL, labels = NULL)), lower = 0, upper = 1) This is for horizontal.axis = TRUE, adjust accordingly for FALSE. -Deepayan > > But only "Min" label in the y-axis, nothing happened. Could anyone help me? > > Thanks. > Regards, > > Pengcheng Yang > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From ggrothendieck at gmail.com Tue Apr 5 15:46:51 2011 From: ggrothendieck at gmail.com (Gabor Grothendieck) Date: Tue, 5 Apr 2011 09:46:51 -0400 Subject: [R] Time series example in Koop In-Reply-To: <1302006904537-3427897.post@n4.nabble.com> References: <1302006904537-3427897.post@n4.nabble.com> Message-ID: On Tue, Apr 5, 2011 at 8:35 AM, Ravi Kulkarni wrote: > I am trying to reproduce the output of a time series example in Koop's book > "Analysis of Financial Data". Koop does the example in Excel and I used the > ts function followed by the lm function. > I am unable to get the exact coefficients that Koop gives - my coefficients > are slightly different. > After loading the data file and attaching the frame, my code reads: > >> y = ts(m.cap) >> x = ts(oil.price) >> d = ts.union(y,x,x1=lag(x,-1),x2=lag(x,-2),x3=lag(x,-3),x4=lag(x,-4)) >> mod1 = lm(y~x+x1+x2+x3+x4, data=d) >> summary(mod1) > > Koop gives an intercept of 92001.51, while the code above gives 91173.32. > The other coefficients are also ?slightly off. > > This is the example in Table 8.3 of Koop. I also attach a plain text version > of the tab separated file "badnews.txt". 
> http://r.789695.n4.nabble.com/file/n3427897/badnews.txt badnews.txt > > Any light on why I do not get Koop's coefficients is most welcome... > It looks like he erroneously left out the first point. > URL <- "http://r.789695.n4.nabble.com/file/n3427897/badnews.txt" > BAD <- read.table(URL, header = TRUE) > library(dyn) > dyn$lm(m.cap ~ lag(oil.price, -(0:4)), as.zoo(BAD)) Call: lm(formula = dyn(m.cap ~ lag(oil.price, -(0:4))), data = as.zoo(BAD)) Coefficients: (Intercept) lag(oil.price, -(0:4))1 lag(oil.price, -(0:4))2 91173.32 -131.99 -449.86 lag(oil.price, -(0:4))3 lag(oil.price, -(0:4))4 lag(oil.price, -(0:4))5 -422.52 -187.10 -27.77 > > # without first point > dyn$lm(m.cap ~ lag(oil.price, -(0:4)), tail(as.zoo(BAD), -1)) Call: lm(formula = dyn(m.cap ~ lag(oil.price, -(0:4))), data = tail(as.zoo(BAD), -1)) Coefficients: (Intercept) lag(oil.price, -(0:4))1 lag(oil.price, -(0:4))2 92001.5 -145.0 -462.1 lag(oil.price, -(0:4))3 lag(oil.price, -(0:4))4 lag(oil.price, -(0:4))5 -424.5 -199.5 -36.9 > -- Statistics & Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com From jwiley.psych at gmail.com Tue Apr 5 15:59:02 2011 From: jwiley.psych at gmail.com (Joshua Wiley) Date: Tue, 5 Apr 2011 06:59:02 -0700 Subject: [R] Confidence interval for the difference between proportions - method used in prop.test() In-Reply-To: <4D9B0D4A020000E100036749@mail.uhbs.ch> References: <4D9B0D4A020000E100036749@mail.uhbs.ch> Message-ID: Hi Stefanie, Just to be clear, we are talking about differences in the third or lower decimal place (at least with R version 2.13.0 alpha (2011-03-17 r54849), Epi_1.1.20). This strikes me as small enough that both functions may be implementing the same method, but maybe slightly different ways of going about it? If you are really concerned and need to know *exactly*, look at the source code for both functions. 
In case you did not know, if you type the function name at the console without parentheses or any arguments, just like: > prop.test > ci.pd it will show the actual function code. It looks to me like both of them are implemented purely in R, and without even calling any other complex functions (at least based on a quick glance through). This means if you have the Newcombe text, you should be able to sit down and go through the code step by step comparing it. Cheers, Josh FYI, you can use a matrix with prop.test, and then its transpose for ci.pd. ## mymat <- cbind(Successes = c(21, 41), Failures = c(345, 345) - c(21, 41)) require(Epi) results <- list(prop.test(mymat, correct=FALSE), ci.pd(t(mymat))) results[[1]][["conf.int"]] - results[[2]][6:7] On Tue, Apr 5, 2011 at 3:38 AM, Stefanie Von Felten wrote: > Hello, > > Does anyone know which method from Newcombe (1998)* is implemented in prop.test for comparing two proportions? > I would guess it is the method based on the Wilson score (for single proportion), with and without continuity correction for prop.test(..., correct=FALSE) and prop.test(..., correct=TRUE). These methods would correspond to no. 10 and 11 tested in Newcombe, respectively. Can someone confirm this? If not, which other methods are implemented by prop.test? > > * Newcombe R.G. (1998) Two-Sided Confidence Intervals for the Single Proportion: Comparison of Seven Methods. Statistics in Medicine *17*, 857-872. > > There is also the function ci.pd() from the R-package Epi, which should implement method no. 10 from Newcombe. However, prop.test(..., correct=FALSE) and ci.pd do not give the same result if I do the following: > > successes <- c(21, 41) > total <- c(345, 345) > prop.test(successes, total, correct=FALSE) > library(Epi) > ci.pd(matrix(c(successes, total-successes),ncol=2, byrow=TRUE)) > > Can someone explain why?
> > Best wishes > Stefanie von Felten > > > Stefanie von Felten, PhD > Statistician > Clinical Trial Unit, CTU > University Hospital Basel > Schanzenstrasse 55 > CH-4031 Basel > > Phone: ++41(0)61 556 54 98 > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/ From dwinsemius at comcast.net Tue Apr 5 16:10:31 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Tue, 5 Apr 2011 10:10:31 -0400 Subject: [R] RODBC excel - need to preserve (or extract) numeric column names In-Reply-To: References: <63F107BCC37AEA49A75FD94AA3E07CB004AFD71A@pacpbsex01.pac.dfo-mpo.ca> Message-ID: <4486C24A-4D2D-40BD-8411-FA08BE38B994@comcast.net> On Apr 5, 2011, at 4:53 AM, Enrico Schumann wrote: > At least for Excel 2003 on my computer (Win XP) I can "persuade" > Excel to > treat cells like text by prepending a ' to the entry (eg, '1000). Then > sqlFetch/RODBC should import these cells as character. [But a number > would > not be valid column name for a data.frame, and you may run into other > trouble. See ?make.names] > If the problem is that Excel is failing to treat cells as text, then you could also try selecting the cells, then using the Format/Cells panel to specify "Text" rather than "General". But as I suggested before, I suspect the problem is your effort to defeat the usual checking for valid R names. -- David. > regards, > enrico > > > >> -----Original Message----- >> From: r-help-bounces at r-project.org >> [mailto:r-help-bounces at r-project.org] On Behalf Of Folkes, Michael >> Sent: Tuesday, 5
April 2011 00:05 >> To: r-help at r-project.org >> Subject: [R] RODBC excel - need to preserve (or extract) >> numeric column names >> >> I'm using RODBC to read an excel file (not mine!). But I'm >> struggling to find a way to preserve the column names that >> have a numeric value. sqlFetch() drops the value and calls >> them f1, f2, f3,... (ie field number). this is a different >> approach from read.csv, which will append "V" prior to the >> numeric column name. sqlFetch isn't so helpful. >> >> Is there a way to get the first line of data from the excel >> file and place it in a vector? Perhaps I can use that method >> and rename the dataframe column names later? >> >> thanks! >> Michael >> >> _______________________________________________________ >> Michael Folkes >> Salmon Stock Assessment >> Canadian Dept. of Fisheries & Oceans >> Pacific Biological Station >> 3190 Hammond Bay Rd. >> Nanaimo, B.C., Canada >> V9T-6N7 >> Ph (250) 756-7264 Fax (250) 756-7053 >> Michael.Folkes at dfo-mpo.gc.ca >> >> [[alternative HTML version deleted]] >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code.
David Winsemius, MD West Hartford, CT From jdnewmil at dcn.davis.ca.us Tue Apr 5 16:36:15 2011 From: jdnewmil at dcn.davis.ca.us (Jeff Newmiller) Date: Tue, 05 Apr 2011 07:36:15 -0700 Subject: [R] system() command in R In-Reply-To: References: , Message-ID: <6a754eb7-04f7-4f46-bdec-07c9b791977f@email.android.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From yrosseel at gmail.com Tue Apr 5 16:48:14 2011 From: yrosseel at gmail.com (yrosseel) Date: Tue, 05 Apr 2011 16:48:14 +0200 Subject: [R] Structural equation modeling in R(lavaan,sem) In-Reply-To: References: <4D9073F7.2040309@gmail.com> <1301426701835-3415954.post@n4.nabble.com> <4D996FB6.2020903@gmail.com> Message-ID: <4D9B2BAE.2050602@gmail.com> On 04/04/2011 07:14 PM, jouba wrote: > > > Thanks you for your response > For lavaan package can i have more information about this example you have applied in the section 7 > the meanings of The variables (c1,c2,c3,c4, i ,s ,x1,x2) > I think i have need more information to learn more about how able to apply growth model in my data (longitudianl data) In the example, c1-c4 are time-varying covariates, i and s are the random intercept and slope respectively, and x1 and x2 are two exogenous covariates influencing the intercept and slope. Please note: the lavaanIntroduction document is hardly useful to _learn_ about growth models (or any SEM model for that matter). It only explains how to fit them using the lavaan package. To learn about growth models, you may want to read any one of the books below: Latent Curve Models: A Structural Equation Perspective (Wiley Series in Probability and Statistics) by Kenneth A. Bollen and Patrick J. Curran (Hardcover - Dec 23, 2005) Latent Growth Curve Modeling (Quantitative Applications in the Social Sciences) by Dr. Kristopher J. Preacher, Aaron Lee Wichman, Robert Charles MacCallum and Dr. Nancy E. 
Briggs (Paperback - Jun 27, 2008) An Introduction to Latent Variable Growth Curve Modeling: Concepts, Issues, and Applications (Quantitative Methodology) (Quantitative Methodology Series) by Terry E. Duncan, Susan C. Duncan and Lisa A. Strycker (Paperback - May 23, 2006) Yves. From jwiley.psych at gmail.com Tue Apr 5 17:03:30 2011 From: jwiley.psych at gmail.com (Joshua Wiley) Date: Tue, 5 Apr 2011 08:03:30 -0700 Subject: [R] Confidence interval for the difference between proportions - method used in prop.test() In-Reply-To: <4D9B42B1020000E1000367BD@mail.uhbs.ch> References: <4D9B42B1020000E1000367BD@mail.uhbs.ch> Message-ID: Dear Steffi, On Tue, Apr 5, 2011 at 7:26 AM, Stefanie Von Felten wrote: > Dear Josh, > > Thanks for your help! > > Does your answer mean, that you agree the two methods should do the same, > and what I was guessing, despite the small differences? That would be my guess, but I have not actually read the reference in discussion. Still, the documentation for prop.test uses the same Newcombe reference as ci.pd, so if method 10 is the clear winner, it seems reasonable that prop.test is an implementation of it. > > What I prefer about ci.pd is, that the help clearly says which method is > implemented, which is not the case for prop.test. But I do not know who has > programmed the function. Then for reporting, use ci.pd, and say it is method 10 from Newcombe. You can always check the results with prop.test() which is part of R core so you can be pretty confident whatever it does, it does it correctly (and will be updated if necessary with future releases of R). 
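One way to run that kind of check by hand is sketched here, in base R only. It computes the plain Wald interval and Newcombe's method 10 (a hybrid built from the two single-sample Wilson score intervals) for the counts in the original question, so that either can be set beside the conf.int component of prop.test(). The helper names wilson_ci, wald_diff_ci and newcombe10_ci are made up for this illustration and are not part of R or Epi; this is a sketch written from the published formulas, not the code that either function actually runs.

```r
# Hypothetical helpers (not from base R or Epi), written from the
# textbook formulas; 'conf' is the two-sided confidence level.

# Wilson score interval for a single proportion x/n (no continuity correction)
wilson_ci <- function(x, n, conf = 0.95) {
  z <- qnorm(1 - (1 - conf) / 2)
  p <- x / n
  centre <- (p + z^2 / (2 * n)) / (1 + z^2 / n)
  half   <- z * sqrt(p * (1 - p) / n + z^2 / (4 * n^2)) / (1 + z^2 / n)
  c(lower = centre - half, upper = centre + half)
}

# Plain Wald interval for p1 - p2 (no continuity correction)
wald_diff_ci <- function(x, n, conf = 0.95) {
  z <- qnorm(1 - (1 - conf) / 2)
  p <- x / n
  d <- p[1] - p[2]
  w <- z * sqrt(sum(p * (1 - p) / n))
  c(lower = d - w, upper = d + w)
}

# Newcombe's method 10: combine the two Wilson intervals around p1 - p2
newcombe10_ci <- function(x, n, conf = 0.95) {
  p  <- x / n
  d  <- p[1] - p[2]
  w1 <- wilson_ci(x[1], n[1], conf)
  w2 <- wilson_ci(x[2], n[2], conf)
  c(lower = d - sqrt((p[1] - w1[["lower"]])^2 + (w2[["upper"]] - p[2])^2),
    upper = d + sqrt((w1[["upper"]] - p[1])^2 + (p[2] - w2[["lower"]])^2))
}

# Counts from the original question
successes <- c(21, 41)
total     <- c(345, 345)

# Put the three intervals side by side for comparison
pt <- prop.test(successes, total, correct = FALSE)$conf.int
rbind(prop.test  = as.numeric(pt),
      wald       = wald_diff_ci(successes, total),
      newcombe10 = newcombe10_ci(successes, total))
```

Whichever hand-computed row agrees with the prop.test row to within rounding is the method in use; the same table can be extended with the ci.pd() result if Epi is installed.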
Sincerely, Josh > > Best wishes > Steffi > > > Stefanie von Felten, PhD > Statistician > Clinical Trial Unit, CTU > University Hospital Basel > Schanzenstrasse 55 > CH-4031 Basel > > Phone: ++41(0)61 556 54 98 >>>> Joshua Wiley 05.04.11 15.59 Uhr >>> > Hi Stefanie, > > Just to be clear, we are talking about differences in the third or > lower decimal place (at least with R version 2.13.0 alpha (2011-03-17 > r54849), Epi_1.1.20). This strikes me as small enough that both > functions may be implementing the same method, but maybe slightly > different ways of going about it? > > If you are really concerned and need to know *exactly*, look at the > source code for both functions. In case you did not know, if you type > the function name at the console with parentheses or any arguments, > just like: > >> prop.test >> ci.pd > > it will show the actual function code. It looks to me like both of > them are implemented purely in R, and without even calling any other > complex functions (at least based on a quick glance through). This > means if you have the Newscomb text, you should be able to sit down > and go through the code step by step comparing it. > > Cheers, > > Josh > > FYI, you can use a matrix with prop.test, and then its transpose for ci.pd. > ## > mymat <- cbind(Successes = c(21, 41), Failures = c(345, 345) - c(21, 41)) > require(Epi) > results <- list(prop.test(mymat, correct=FALSE), ci.pd(t(mymat))) > results[[1]][["conf.int"]] - results[[2]][6:7] > > On Tue, Apr 5, 2011 at 3:38 AM, Stefanie Von Felten > wrote: >> Hello, >> >> Does anyone know which method from Newcombe (1998)* is implemented in >> prop.test for comparing two proportions? >> I would guess it is the method based on the Wilson score (for single >> proportion), with and without continuity correction for prop.test(..., >> correct=FALSE) and prop.test(..., correct=TRUE). These methods would >> correspond to no. 10 and 11 tested in Newcombe, respectively. Can someone >> confirm this? 
If not, which other methods are implemented by prop.test? >> >> * Newcombe R.G. (1998) Two-Sided Confidence Intervals for the Single >> Proportion: Comparison of Seven Methods. Statistics in Medicine *17*, >> 857-872. >> >> There is also the function ci.pd() from the R-package Epi, which should >> implement method no. 10 from Newcombe. However, prop.test(..., >> correct=FALSE) and ci.pd do not give the same result if I do the following: >> >> successes <- c(21, 41) >> total <- c(345, 345) >> prop.test(successes, total, correct=FALSE) >> library(Epi) >> ci.pd(matrix(c(successes, total-successes),ncol=2, byrow=TRUE)) >> >> Can someone explain why? >> >> Best wishes >> Stefanie von Felten >> >> >> Stefanie von Felten, PhD >> Statistician >> Clinical Trial Unit, CTU >> University Hospital Basel >> Schanzenstrasse 55 >> CH-4031 Basel >> >> Phone: ++41(0)61 556 54 98 >> >> [[alternative HTML version deleted]] >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > > > -- > Joshua Wiley > Ph.D. Student, Health Psychology > University of California, Los Angeles > http://www.joshuawiley.com/ > -- Joshua Wiley Ph.D. 
Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/ From marchywka at hotmail.com Tue Apr 5 16:01:56 2011 From: marchywka at hotmail.com (Mike Marchywka) Date: Tue, 5 Apr 2011 10:01:56 -0400 Subject: [R] Time series example in Koop In-Reply-To: <1302006904537-3427897.post@n4.nabble.com> References: <1302006904537-3427897.post@n4.nabble.com> Message-ID: > Date: Tue, 5 Apr 2011 07:35:04 -0500 > From: ravi.kulk at gmail.com > To: r-help at r-project.org > Subject: [R] Time series example in Koop > > I am trying to reproduce the output of a time series example in Koop's book > "Analysis of Financial Data". Koop does the example in Excel and I used the > ts function followed by the lm function. > I am unable to get the exact coefficients that Koop gives - my coefficients > are slightly different. > After loading the data file and attaching the frame, my code reads: > > > y = ts(m.cap) > > x = ts(oil.price) > > d = ts.union(y,x,x1=lag(x,-1),x2=lag(x,-2),x3=lag(x,-3),x4=lag(x,-4)) > > mod1 = lm(y~x+x1+x2+x3+x4, data=d) > > summary(mod1) > > Koop gives an intercept of 92001.51, while the code above gives 91173.32. > The other coefficients are also slightly off. The differences here seem to be of order 1 percent. You could suspect a number of things, including the published data file being published to less precision than that used in the book numbers(also look at number of points and see if any were added or dropped etc ). However, you may want to judge these based on what they do to your error which they presumably are both supposed to minimize but the calculation of which could be subject to various roundoff errors etc. Unless minimization is done analytically, it is of course subject to limitations of convergence or iteration count. Plotting both fits over the data and looking at residuals may help too. Depending on what you are really trying to do, you may want to change your error calculation etc. 
Details of numerical results often depend on details of implementation. This is why stats packages that are not open source have limitations in applicability. With "real models" of course things get even more confusing. ( take a look at credit rating agencies results for example LOL). > > This is the example in Table 8.3 of Koop. I also attach a plain text version > of the tab separated file "badnews.txt". > http://r.789695.n4.nabble.com/file/n3427897/badnews.txt badnews.txt > > Any light on why I do not get Koop's coefficients is most welcome... > > Ravi From moliterno.camargo at gmail.com Tue Apr 5 17:06:55 2011 From: moliterno.camargo at gmail.com (Ulisses.Camargo) Date: Tue, 5 Apr 2011 10:06:55 -0500 (CDT) Subject: [R] Help to check data before putting it in a database Message-ID: <1302016015254-3428318.post@n4.nabble.com> The example scene: I have a database with stats about each goal made by my soccer team. This database (a data frame in R) is organized in lines (goals) with a set of columns containing data about these goals (player name, tactic position, etc). For now, this database will be called "data.frame1". What I need is to feed this "data.frame1" with new information about my team goals. I will call this new information "data.frame2". This set of new goals is organized in the same way as in "data.frame1" (equal numbers of cols). Where help is needed: I need help in finding a way to check the player-name column in "data.frame2" before feeding "data.frame1" with it. What I need is a way to verify the name of the player on each line of "data.frame2" with the names of players that already exist on a col in "data.frame1". Moreover, I need R to make two main things: First, the lines of ?data.frame2? with player names that already exists in ?data.frame1? must be added to ?data.frame1?. Second: lines of ?data.frame2? with player names that does not exist on ?data.frame1? must be listed in an output to be manually checked and corrected. 
After this verification, corrected lines and new-player-names lines must be incorporated in "data.frame1". What I want is to guarantee that will be no lines with wrong player names in my database. At the same time, my script must permit new information to be added (new player names). Is there somebody who could help me with this? Thanks for your attention Best wishes Ulisses -- View this message in context: http://r.789695.n4.nabble.com/Help-to-check-data-before-putting-it-in-a-database-tp3428318p3428318.html Sent from the R help mailing list archive at Nabble.com. From pengchy at gmail.com Tue Apr 5 16:14:21 2011 From: pengchy at gmail.com (Pengcheng Yang) Date: Tue, 05 Apr 2011 22:14:21 +0800 Subject: [R] how to label customized y axis when using lattice parallel parameter common.scale=TRUE In-Reply-To: References: <4D9AF858.90200@gmail.com> Message-ID: <4D9B23BD.9070002@gmail.com> Thanks Deepayan, It works! On 2011-4-5 21:42, Deepayan Sarkar wrote: > On Tue, Apr 5, 2011 at 4:39 PM, Pengcheng Yang wrote: >> Dear all, >> >> When I use parallel function in lattice package, I want to label the y-axis >> with customized numbers. Like this: >> >> parallel(~iris[1:4] | Species, iris,horiz=FALSE,common.scale=TRUE, >> scales=list(y=list(at=c(0,2,3)))) > Parallel does not directly support that, and will insist on scaling > the data. However, you can control the scaling (using 'lower' and > 'upper'), and override a couple of other arguments to get what you > want: > > parallel(~iris[1:4] | Species, iris, > xlim = extendrange(range(iris[1:4])), > scales = list(x = list(at = NULL, labels = NULL)), > lower = 0, upper = 1) > > This is for horizontal.axis = TRUE, adjust accordingly for FALSE. > > -Deepayan > >> But only "Min" label in the y-axis, nothing happened. Could anyone help me? >> >> Thanks. 
>> Regards, >> >> Pengcheng Yang >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> From srinivas.eswar at gmail.com Tue Apr 5 15:43:34 2011 From: srinivas.eswar at gmail.com (psombe) Date: Tue, 5 Apr 2011 08:43:34 -0500 (CDT) Subject: [R] Support Counting In-Reply-To: <20110404093724.GA21431@praha1.ff.cuni.cz> References: <1301897497861-3424730.post@n4.nabble.com> <20110404093724.GA21431@praha1.ff.cuni.cz> Message-ID: <1302011014314-3428062.post@n4.nabble.com> well im using the "arules" package and i'm trying to use the support command. my data is read form a file using the "read.transactions" command and a line of data looks something like this. there are aboutt 88000 rows and 16000 different items > inspect(dset[3]) items 1 {33, 34, 35} > inspect(dset[1]) items 1 {0, 1, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 2, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 3, 4,5, 6, 7, 8, 9} So in order to use support i have to make an object of class "itemsets" and im kind of struggling with the "new" command. I made an object of class itemsets by first creating a presence/absence matrix and with something like 16000 items this is really sort of tedious. I wonder if there is a better way. 
# Currently I'm doing this
avec = array(dim=400)   # dim goes up to the largest item number I'm concerned with
avec[1:400] = 0
avec[27] = 1
avec[63] = 1
# and so on for all the items I want
amat = matrix(data = avec, ncol = 400)
aset = as(amat, "transactions")   # coercing the matrix to the transactions class

then say my data is "dat" I can use

> support(aset, dat)
[1] 0.001406470

There has to be a better way.

Thanks once again

--
View this message in context: http://r.789695.n4.nabble.com/Support-Counting-tp3424730p3428062.html
Sent from the R help mailing list archive at Nabble.com.

From SVonfelten at uhbs.ch Tue Apr 5 16:26:25 2011
From: SVonfelten at uhbs.ch (Stefanie Von Felten)
Date: Tue, 05 Apr 2011 16:26:25 +0200
Subject: [R] Antw: Re: Confidence interval for the difference between proportions - method used in prop.test()
Message-ID: <4D9B42B1020000E1000367BD@mail.uhbs.ch>

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From jwiley.psych at gmail.com Tue Apr 5 17:21:41 2011
From: jwiley.psych at gmail.com (Joshua Wiley)
Date: Tue, 5 Apr 2011 08:21:41 -0700
Subject: [R] Help to check data before putting it in a database
In-Reply-To: <1302016015254-3428318.post@n4.nabble.com>
References: <1302016015254-3428318.post@n4.nabble.com>
Message-ID:

Hi Ulisses,

Look at the functions ?match and ?rbind

If you do not want to do it by hand, you can make a little function as below.

HTH,

Josh

d1 <- data.frame(goals = 4:1, players = LETTERS[1:4])
d2 <- data.frame(goals = c(1, 3, 2, 5), players = LETTERS[3:6])

f <- function(old, new, check) {
  ## rows of 'new' whose value in column 'check' already occurs in 'old'
  index <- new[, check] %in% old[, check]
  dat <- rbind(old, new[index, ])
  tocheck <- new[!index, ]
  list(merged = dat, tocheck = tocheck)
}

dmerged <- f(d1, d2, "players")
## check "tocheck" and once it is correct
dfinal <- do.call("rbind", dmerged)

On Tue, Apr 5, 2011 at 8:06 AM, Ulisses.Camargo wrote:
> The example scene:
>
> I have a database with stats about each goal made by my soccer team.
This > database (a data frame in R) is organized in lines (goals) with a set of > columns containing data about these goals (player name, tactic position, > etc). For now, this database will be called "data.frame1". > > What I need is to feed this "data.frame1" with new information about my team > goals. I will call this new information "data.frame2". This set of new goals > is organized in the same way as in "data.frame1" (equal numbers of cols). > > Where help is needed: > > I need help in finding a way to check the player-name column in > "data.frame2" before feeding "data.frame1" with it. What I need is a way to > verify the name of the player on each line of "data.frame2" with the names > of players that already exist on a col in "data.frame1". Moreover, I need R > to make two main things: > > First, the lines of ?data.frame2? with player names that already exists in > ?data.frame1? must be added to ?data.frame1?. > > Second: lines of ?data.frame2? with player names that does not exist on > ?data.frame1? must be listed in an output to be manually checked and > corrected. > After this verification, corrected lines and new-player-names lines must be > incorporated in "data.frame1". > > What I want is to guarantee that will be no lines with wrong player names in > my database. > At the same time, my script must permit new information to be added (new > player names). > > Is there somebody who could help me with this? > > Thanks for your attention > > Best wishes > Ulisses > > -- > View this message in context: http://r.789695.n4.nabble.com/Help-to-check-data-before-putting-it-in-a-database-tp3428318p3428318.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. 
> -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/ From jdnewmil at dcn.davis.ca.us Tue Apr 5 17:36:28 2011 From: jdnewmil at dcn.davis.ca.us (Jeff Newmiller) Date: Tue, 05 Apr 2011 08:36:28 -0700 Subject: [R] Help to check data before putting it in a database In-Reply-To: <1302016015254-3428318.post@n4.nabble.com> References: <1302016015254-3428318.post@n4.nabble.com> Message-ID: <051d57a1-e957-4908-a9c3-9ed5e3a41515@email.android.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From moliterno.camargo at gmail.com Tue Apr 5 17:30:49 2011 From: moliterno.camargo at gmail.com (Ulisses.Camargo) Date: Tue, 5 Apr 2011 10:30:49 -0500 (CDT) Subject: [R] Help to check data before putting it in a database In-Reply-To: References: <1302016015254-3428318.post@n4.nabble.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From pengchy at gmail.com Tue Apr 5 17:16:29 2011 From: pengchy at gmail.com (Pengcheng Yang) Date: Tue, 05 Apr 2011 23:16:29 +0800 Subject: [R] how to label customized y axis when using lattice parallel parameter common.scale=TRUE In-Reply-To: References: <4D9AF858.90200@gmail.com> Message-ID: <4D9B324D.3060605@gmail.com> I have readjust the script as follows to retain the complete information of original graph, the background vertical bar. parallel(~iris[1:4] | Species, iris,horizon=FALSE, ylim = extendrange(range(iris[1:4])), scales = list(y = list(at = NULL, labels = NULL),x=list(rot=45)), lower = 0, upper = 1, panel=function(x,y,z,...){ panel.abline(v=1:4,col="gray90") panel.parallel(x,y,z,...) }) On 2011-4-5 21:42, Deepayan Sarkar wrote: > On Tue, Apr 5, 2011 at 4:39 PM, Pengcheng Yang wrote: >> Dear all, >> >> When I use parallel function in lattice package, I want to label the y-axis >> with customized numbers. 
Like this: >> >> parallel(~iris[1:4] | Species, iris,horiz=FALSE,common.scale=TRUE, >> scales=list(y=list(at=c(0,2,3)))) > Parallel does not directly support that, and will insist on scaling > the data. However, you can control the scaling (using 'lower' and > 'upper'), and override a couple of other arguments to get what you > want: > > parallel(~iris[1:4] | Species, iris, > xlim = extendrange(range(iris[1:4])), > scales = list(x = list(at = NULL, labels = NULL)), > lower = 0, upper = 1) > > This is for horizontal.axis = TRUE, adjust accordingly for FALSE. > > -Deepayan > >> But only "Min" label in the y-axis, nothing happened. Could anyone help me? >> >> Thanks. >> Regards, >> >> Pengcheng Yang >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> From thomas.triebs at cantab.net Tue Apr 5 17:33:11 2011 From: thomas.triebs at cantab.net (Thomas) Date: Tue, 05 Apr 2011 16:33:11 +0100 Subject: [R] loop question Message-ID: <4D9B3637.7020407@cantab.net> Dear all, I am trying to set up a list with 1:c objects each meant to capture the coefficients for one coefficient and 100 replications. I receive the following error message: Error in betaboot[[p]] : subscript out of bounds. My code is below. Where is my mistake? 
Many thanks,

Thomas

_________________________________
betaboot<-list(NULL)

for (i in 1:c) {
betaboot[[i]]<-cbind()
}


num <- 100 # this is the number of bootstraps

for (i in 1:num) {

    [BOOTSTRAP]

  coef.temp <- coef(model.temp, data=newdata)

  for (p in 1:c){
  betaboot[[p]] <- cbind(betaboot[[p]], coef.temp[,p])
  }

  }

From jwiley.psych at gmail.com Tue Apr 5 17:56:08 2011
From: jwiley.psych at gmail.com (Joshua Wiley)
Date: Tue, 5 Apr 2011 08:56:08 -0700
Subject: [R] loop question
In-Reply-To: <4D9B3637.7020407@cantab.net>
References: <4D9B3637.7020407@cantab.net>
Message-ID:

Dear Thomas,

On Tue, Apr 5, 2011 at 8:33 AM, Thomas wrote:
> Dear all,
>
> I am trying to set up a list with 1:c objects each meant to capture the
> coefficients for one coefficient and 100 replications. I receive the
> following error message:
>
> Error in betaboot[[p]] : subscript out of bounds.
>
> My code is below. Where is my mistake?
>
> Many thanks,
>
> Thomas
>
> _________________________________
> betaboot<-list(NULL)

If you know the number of bootstraps (which you seem to later on), a
preferred way to instantiate the list would be:

betaboot <- vector(mode = "list", length = yourlength)

>
> for (i in 1:c) {

Because "c()" is such an important function, I would strongly
encourage you not to use it also as a variable.

> betaboot[[i]]<-cbind()

Don't use this to build an empty list.

> }
>
>
> num <- 100 # this is the number of bootstraps
>
> for (i in 1:num) {
>
>    [BOOTSTRAP]
>
>  coef.temp <- coef(model.temp, data=newdata)
>
>  for (p in 1:c){
>  betaboot[[p]] <- cbind(betaboot[[p]], coef.temp[,p])

This should work assuming betaboot is instantiated properly. That
said, it looks like you have a nested for loop and then just keep
cbind()ing each element of betaboot bigger and bigger. You may get a
performance increase if you also instantiate each matrix/dataframe
inside betaboot.
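
A minimal sketch of that preallocation, under some assumptions: n_coef stands in for the variable called c in the original code, n_rows is a made-up number of rows in the coefficient matrix, and the rnorm() call is only a placeholder for the [BOOTSTRAP] step and coef(model.temp). The first few lines also show what I believe is behind the original error: cbind() with no arguments returns NULL, and assigning NULL with [[<- removes a list element rather than creating one, so the list never grows.

```r
# Why the original setup fails: assigning NULL never creates elements.
broken <- list(NULL)
for (i in 1:3) broken[[i]] <- cbind()
length(broken)

set.seed(1)
n_coef <- 3    # hypothetical: number of coefficient columns
n_rows <- 2    # hypothetical: rows of the coefficient matrix
num    <- 100  # number of bootstrap replications

# One pre-sized matrix per coefficient; each column holds one replication.
betaboot <- replicate(n_coef,
                      matrix(NA_real_, nrow = n_rows, ncol = num),
                      simplify = FALSE)

for (i in seq_len(num)) {
  # Placeholder for the bootstrap fit; the real code would call coef() here.
  coef.temp <- matrix(rnorm(n_rows * n_coef), nrow = n_rows, ncol = n_coef)
  for (p in seq_len(n_coef)) {
    # Fill column i of the p-th matrix instead of growing it with cbind().
    betaboot[[p]][, i] <- coef.temp[, p]
  }
}

str(betaboot)
```

The list index is the coefficient and the column index is the replication, which mirrors the layout the original cbind() loop was building.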
Then the call would become something like:

betaboot[[p]][, i] <- coef.temp[, p]

that is, you can use a chained series of extraction operators to get
to the appropriate column in the matrix/dataframe inside the
appropriate list element. Then rather than constantly using cbind(),
you just place coef.temp[,p] where you want it. The only requirement
is that you know the sizes of the matrices/dataframes going in so you
can create empty ones from the get go.

Cheers,

Josh

>  }
>
>  }
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/

From liliana.pacheco24 at gmail.com Tue Apr 5 18:23:16 2011
From: liliana.pacheco24 at gmail.com (Liliana Pacheco)
Date: Tue, 5 Apr 2011 11:23:16 -0500
Subject: [R] Gibbs sampling
Message-ID:

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL:

From daniel at umd.edu Tue Apr 5 18:38:46 2011
From: daniel at umd.edu (Daniel Malter)
Date: Tue, 5 Apr 2011 11:38:46 -0500 (CDT)
Subject: [R] Precision of summary() when summarizing variables in a data frame
Message-ID: <1302021526605-3428570.post@n4.nabble.com>

Hi,

I summary() a variable with 409908 numeric observations. The variable is
part of a data.frame. The problem is that the min and max returned by
summary() do not equal the ones returned by min() and max(). Does anybody
know why that is?

> min(data$vc)
[1] 15452
> max(data$vc)
[1] 316148
> summary(data$vc)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  15450   21670   40980   55500   63880  316100

sessionInfo()
R version 2.11.1 (2010-05-31)
x86_64-apple-darwin9.8.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] sqldf_0.3-5           chron_2.3-39          gsubfn_0.5-5
[4] proto_0.3-8           RSQLite.extfuns_0.0.1 RSQLite_0.9-4
[7] DBI_0.2-5

Thanks much,
Daniel

--
View this message in context: http://r.789695.n4.nabble.com/Precision-of-summary-when-summarizing-variables-in-a-data-frame-tp3428570p3428570.html
Sent from the R help mailing list archive at Nabble.com.

From lodeag at gmail.com Tue Apr 5 18:27:07 2011
From: lodeag at gmail.com (lorena delgadillo)
Date: Tue, 5 Apr 2011 13:27:07 -0300
Subject: [R] lorena
Message-ID:

Dear list,

I would like to know how to use the Croston method in R. Do I need to
download a package for it?

So far I have modelled the series as a SARIMA; is this correct?
The series contains many zero values. I proposed the following model,
but I have many doubts about its predictions:

M3 = arima(d1, order = c(2,1,4), n.ahead = 4,
           seasonal = list(order = c(2,1,4), period = 4))

where d1 = diff(series)

Many thanks in advance

Kind regards,
LORENA DELGADILLO AGUIRRE
Licentiate in Statistics, PUCV
www.lorenadelgadilloaguirre.blogspot.com
lodeag at gmail.com
09-668.48.60
-------------- next part --------------
ID Serie
1.000000 0.000000
2.000000 0.000000
3.000000 0.000000
4.000000 152.100000
5.000000 179.580000
6.000000 195.474000
7.000000 197.937200
8.000000 216.097160
9.000000 223.071448
10.000000 245.966874
11.000000 252.880644
12.000000 0.000000
13.000000 273.109027
14.000000 281.577342
15.000000 287.372776
16.000000 280.555955
17.000000 286.182042
18.000000 293.301229
19.000000 303.959187
20.000000 303.197473
21.000000 312.053897
22.000000 0.000000
23.000000 322.755681
24.000000 0.000000
25.000000 321.304233
26.000000 329.709470
27.000000 335.897999
28.000000 0.000000
29.000000 329.701277
30.000000 325.349956
31.000000 338.850093
32.000000 343.215070
33.000000 0.000000
34.000000 0.000000
35.000000 334.615434
36.000000 343.551063
37.000000 337.402394
38.000000 336.177021
39.000000 340.881856
40.000000 348.523187
41.000000 349.106735
42.000000 351.637707
43.000000 353.120839
44.000000 344.560442
45.000000 0.000000
46.000000 0.000000
47.000000 0.000000
48.000000 0.000000
49.000000 345.231065
50.000000 351.480541
51.000000 342.707539
52.000000 348.914086
53.000000 354.102022
54.000000 356.273026
55.000000 345.428623
56.000000 0.000000
57.000000 342.699023
58.000000 350.816239
59.000000 0.000000
60.000000 0.000000
61.000000 355.108240
62.000000 0.000000
63.000000 353.261693
64.000000 349.328213
65.000000 349.388740
66.000000 352.443813
67.000000 0.000000
68.000000 344.539521
69.000000 350.581009
70.000000 357.618759
71.000000 346.653108
72.000000 341.684363
73.000000 0.000000
74.000000 342.738677
75.000000 350.762222
76.000000 0.000000
77.000000 347.803138
78.000000 344.820875
79.000000 0.000000
80.000000 342.851699

From markleeds2 at gmail.com Tue Apr 5 18:43:01 2011
From: markleeds2 at gmail.com (Mark Leeds)
Date: Tue, 5 Apr 2011 12:43:01 -0400
Subject: [R] Gibbs sampling
In-Reply-To:
References:
Message-ID:

Hi Liliana,

You should read the attached before doing that. Also, unrelated to the
above (I think?), from experience there can be problems when you
have restrictions on a parameter and then try to use Gibbs sampling to get
the distribution of that parameter. In a totally different context (not
bivariate normal), I tried it using many different approaches (Gibbs,
Metropolis, AdMit) and was never successful. Good luck.

On Tue, Apr 5, 2011 at 12:23 PM, Liliana Pacheco <
liliana.pacheco24 at gmail.com> wrote:

> HI R users, perhaps you can help me with this:
>
> I am planning on using the Gibbs sampler for the correlation coefficient of
> a bivariate normal. I have a posterior distribution for rho, besides that,
> the conditional distribution for all the parameters of this posterior
> distribution. The thing is that, since I have to get a sample, size 10000
> for rho, obtain a 95% confidence interval, and repeat this procedure 1000
> times; and repeat this procedure for 50 scenarios, I'm thinking this is
> going to take forever.
>
> Is there a library for making this work faster? I've heard of gibbs.met,
> but
> I don't know if it's going to work or even more, I didn't understand the
> examples.
>
> Thanks!
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: gelman_meng.pdf
Type: application/pdf
Size: 402188 bytes
Desc: not available
URL:

From jholtman at gmail.com Tue Apr 5 18:48:48 2011
From: jholtman at gmail.com (jim holtman)
Date: Tue, 5 Apr 2011 12:48:48 -0400
Subject: [R] Precision of summary() when summarizing variables in a data frame
In-Reply-To: <1302021526605-3428570.post@n4.nabble.com>
References: <1302021526605-3428570.post@n4.nabble.com>
Message-ID:

They are probably the same. It is just that summary is printing out 4
significant digits. Try:

options(digits = 20)

On Tue, Apr 5, 2011 at 12:38 PM, Daniel Malter wrote:
> Hi,
>
> I summary() a variable with 409908 numeric observations. The variable is
> part of a data.frame. The problem is that the min and max returned by
> summary() do not equal the ones returned by min() and max(). Does anybody
> know why that is?
>
>> min(data$vc)
> [1] 15452
>> max(data$vc)
> [1] 316148
>> summary(data$vc)
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>   15450   21670   40980   55500   63880  316100
>
>
> sessionInfo()
> R version 2.11.1 (2010-05-31)
> x86_64-apple-darwin9.8.0
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] sqldf_0.3-5           chron_2.3-39          gsubfn_0.5-5
> [4] proto_0.3-8           RSQLite.extfuns_0.0.1 RSQLite_0.9-4
> [7] DBI_0.2-5
>
> Thanks much,
> Daniel
>
> --
> View this message in context: http://r.789695.n4.nabble.com/Precision-of-summary-when-summarizing-variables-in-a-data-frame-tp3428570p3428570.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> -- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? From michalseneca at gmail.com Tue Apr 5 18:59:49 2011 From: michalseneca at gmail.com (michalseneca) Date: Tue, 5 Apr 2011 11:59:49 -0500 (CDT) Subject: [R] Creating multiple vector/list names-novice In-Reply-To: <4D9AE54F.10503@statistik.tu-dortmund.de> References: <1301927063332-3425616.post@n4.nabble.com> <1301929772084-3425759.post@n4.nabble.com> <1301987275755-3427283.post@n4.nabble.com> <4D9AE54F.10503@statistik.tu-dortmund.de> Message-ID: <1302022789920-3428617.post@n4.nabble.com> Thanks I already found out solution :) -- View this message in context: http://r.789695.n4.nabble.com/Creating-multiple-vector-list-names-novice-tp3425616p3428617.html Sent from the R help mailing list archive at Nabble.com. From lnharris at zoology.ubc.ca Tue Apr 5 20:37:09 2011 From: lnharris at zoology.ubc.ca (Les) Date: Tue, 5 Apr 2011 13:37:09 -0500 (CDT) Subject: [R] Arrangement of Lattice Histograms - Top to bottom and then left to right? Message-ID: <1302028629569-3428825.post@n4.nabble.com> Hi List, Using Lattice, I have created a plot of histograms showing Fork Length by Year. The plot shows the histograms in 3 columns and 5 rows. Using the as.table=T function I can get the years to start on top. However, what I would like to do is have the first year start in the top left (column 1, row 1; as it is now) and add the subsequent histograms to the plot going down the column and then over by row (example and current code is shown below). Is this possible? I have spent a great deal of time searching, and have not found any clues. Any help/ideas would be greatly appreciated. 
For example, what is being done now:

1974 1975 1976
1977 1978 1979
1980 1981 1982

and what I am hoping for:

1974 1977 1980
1975 1978 1981
1976 1979 1982

Thank you kindly,
Les

data=read.csv("AllData.csv",sep=",",header=T)
data$year=as.factor(data$year)
ferg=data[data$wtrbody=="Ferguson",c(1:12)]
library(lattice)

strip.background=trellis.par.get("strip.background")
trellis.par.set(strip.background = list(col = grey(7:1/8)))

histogram(~fl|year, data=ferg, as.table=T, type="count",
        col='dark grey',layout=c(3,5),lwd=2, lty=1,
        xlab=list("Fork Length (mm)", cex=1.4,font=2),
        breaks=seq(from=300,to=900,by=25),
        ylab=list("Frequency",cex=1.4,font=2),
        scales=list(font=2,cex=1.1, tck=c(1,0), alternating=1,
        y=list(relation="free",tick.number=3)),
        par.strip.text=list(cex=1.2),font=2,
        aspect=0.6)

--
View this message in context: http://r.789695.n4.nabble.com/Arrangement-of-Lattice-Histograms-Top-to-bottom-and-then-left-to-right-tp3428825p3428825.html
Sent from the R help mailing list archive at Nabble.com.

From eriki at ccbr.umn.edu Tue Apr 5 20:53:32 2011
From: eriki at ccbr.umn.edu (Erik Iverson)
Date: Tue, 05 Apr 2011 13:53:32 -0500
Subject: [R] Precision of summary() when summarizing variables in a data frame
In-Reply-To: 
References: <1302021526605-3428570.post@n4.nabble.com>
Message-ID: <4D9B652C.4040409@ccbr.umn.edu>

jim holtman wrote:
> They are probably the same. It is just that summary is printing out 4
> significant digits. Try:
>
> options(digits = 20)

FYI, the default summary method also has its own digits argument.

>
>
>
> On Tue, Apr 5, 2011 at 12:38 PM, Daniel Malter wrote:
>> Hi,
>>
>> I summary() a variable with 409908 numeric observations. The variable is
>> part of a data.frame. The problem is that the min and max returned by
>> summary() do not equal the ones returned by min() and max(). Does anybody
>> know why that is?
>>
>>> min(data$vc)
>> [1] 15452
>>> max(data$vc)
>> [1] 316148
>>> summary(data$vc)
>>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>> 15450 21670 40980 55500 63880 316100 >> >> >> sessionInfo() >> R version 2.11.1 (2010-05-31) >> x86_64-apple-darwin9.8.0 >> >> locale: >> [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8 >> >> attached base packages: >> [1] stats graphics grDevices utils datasets methods base >> >> other attached packages: >> [1] sqldf_0.3-5 chron_2.3-39 gsubfn_0.5-5 >> [4] proto_0.3-8 RSQLite.extfuns_0.0.1 RSQLite_0.9-4 >> [7] DBI_0.2-5 >> >> Thanks much, >> Daniel >> >> -- >> View this message in context: http://r.789695.n4.nabble.com/Precision-of-summary-when-summarizing-variables-in-a-data-frame-tp3428570p3428570.html >> Sent from the R help mailing list archive at Nabble.com. >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > > From gunter.berton at gene.com Tue Apr 5 21:04:41 2011 From: gunter.berton at gene.com (Bert Gunter) Date: Tue, 5 Apr 2011 12:04:41 -0700 Subject: [R] Arrangement of Lattice Histograms - Top to bottom and then left to right? In-Reply-To: <1302028629569-3428825.post@n4.nabble.com> References: <1302028629569-3428825.post@n4.nabble.com> Message-ID: Look for the index.cond argument at the bottom of the Help page for xyplot. -- Bert On Tue, Apr 5, 2011 at 11:37 AM, Les wrote: > Hi List, > Using Lattice, I have created a plot of histograms showing Fork Length by > Year. The plot shows the histograms in 3 columns and 5 rows. Using the > as.table=T function I can get the years to start on top. However, what I > would like to do is have the first year start in the top left (column 1, row > 1; as it is now) and add the subsequent histograms to the plot going down > the column and then over by row (example and current code is shown below). > Is this possible? 
I have spent a great deal of time searching, and have not > found any clues. Any help/ideas would be greatly appreciated. > > For example, what is being done now: > > 1974 1975 1976 > 1977 1978 1979 > 1980 1981 1982 > > and what I am hoping for: > > 1974 1977 1980 > 1975 1978 1981 > 1976 1979 1982 > > Thank you kindly, > Les > > data=read.csv("AllData.csv",sep=",",header=T) > data$year=as.factor(data$year) > ferg=data[data$wtrbody=="Ferguson",c(1:12)] > library(lattice) > > strip.background=trellis.par.get("strip.background") > trellis.par.set(strip.background = list(col = grey(7:1/8))) > > histogram(~fl|year, data=ferg, as.table=T, type="count", > ? ? ? ?col='dark grey',layout=c(3,5),lwd=2, lty=1, > ? ? ? ?xlab=list("Fork Length (mm)", cex=1.4,font=2), > ? ? ? ?breaks=seq(from=300,to=900,by=25), > ? ? ? ?ylab=list("Frequency",cex=1.4,font=2), > ? ? ? ?scales=list(font=2,cex=1.1, tck=c(1,0), alternating=1, > ? ? ? ?y=list(relation="free",tick.number=3)), > ? ? ? ?par.strip.text=list(cex=1.2),font=2, > ? ? ? ?aspect=0.6) > > -- > View this message in context: http://r.789695.n4.nabble.com/Arrangement-of-Lattice-Histograms-Top-to-bottom-and-then-left-to-right-tp3428825p3428825.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- "Men by nature long to get on to the ultimate truths, and will often be impatient with elementary studies or fight shy of them. If it were possible to reach the ultimate truths without the elementary studies usually prefixed to them, these would not be preparatory studies but superfluous diversions." 
-- Maimonides (1135-1204) Bert Gunter Genentech Nonclinical Biostatistics From jim.silverton at gmail.com Tue Apr 5 21:14:48 2011 From: jim.silverton at gmail.com (Jim Silverton) Date: Tue, 5 Apr 2011 15:14:48 -0400 Subject: [R] Changing parameter in local fdr R code Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From jeroenooms at gmail.com Tue Apr 5 21:30:16 2011 From: jeroenooms at gmail.com (Jeroen Ooms) Date: Tue, 5 Apr 2011 12:30:16 -0700 Subject: [R] detect filetype (as in unix 'file') In-Reply-To: References: <1301886139248-3424562.post@n4.nabble.com> Message-ID: > No, but what is wrong with using system()? The application is running in a very sandboxed environment and might not have permission to execute 'file'. > 'file' is large and complex because it tries to be comprehensive (but it > still does not know about some common systems, e.g. 64-bit Windows > binaries). ?There simply is no point in replicating that in R: which is why > we chose rather to port 'file' to Windows and provide in in Rools. Alright that makes sense, thanks. From daniel at umd.edu Tue Apr 5 21:54:35 2011 From: daniel at umd.edu (Daniel Malter) Date: Tue, 5 Apr 2011 14:54:35 -0500 (CDT) Subject: [R] Precision of summary() when summarizing variables in a data frame In-Reply-To: <4D9B652C.4040409@ccbr.umn.edu> References: <1302021526605-3428570.post@n4.nabble.com> <4D9B652C.4040409@ccbr.umn.edu> Message-ID: <1302033275639-3429022.post@n4.nabble.com> Thanks all. No I wasn't aware of the fact that summary is rounding in this case. Da. -- View this message in context: http://r.789695.n4.nabble.com/Precision-of-summary-when-summarizing-variables-in-a-data-frame-tp3428570p3429022.html Sent from the R help mailing list archive at Nabble.com. 
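
The rounding discussed in the summary() thread above can be sketched in a few lines of R. The vector vc is made-up stand-in data (only its minimum and maximum match the figures Daniel posted), and what summary() prints is version dependent: R 2.11.1, as used in the thread, rounded the values to four significant digits, while newer releases appear to round only when printing. signif() is used here only to make the four-digit rounding visible in any version.

```r
# Made-up stand-in for data$vc; only the min and max match the thread.
vc <- c(15452, 21670, 40980, 63880, 316148)

exact   <- range(vc)          # what min() and max() report
rounded <- signif(exact, 4)   # four significant digits, as in the posted summary()

exact    # 15452 316148
rounded  # 15450 316100

# The digits argument Erik mentions asks summary() for more precision.
full <- as.numeric(summary(vc, digits = 7))
full[c(1, 6)]  # minimum and maximum as stored by summary()
```

Passing digits to summary() changes only that one call, whereas options(digits = 20) changes printing for the rest of the session.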
From angerusso1980 at gmail.com Tue Apr 5 22:30:04 2011 From: angerusso1980 at gmail.com (Angel Russo) Date: Tue, 5 Apr 2011 16:30:04 -0400 Subject: [R] Hazard ratio calculation and KM plot p-value: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From antrael at hotmail.com Tue Apr 5 21:06:31 2011 From: antrael at hotmail.com (jouba) Date: Tue, 5 Apr 2011 14:06:31 -0500 (CDT) Subject: [R] Structural equation modeling in R(lavaan,sem) In-Reply-To: <4D9B2BAE.2050602@gmail.com> References: <4D9073F7.2040309@gmail.com> <1301426701835-3415954.post@n4.nabble.com> <4D996FB6.2020903@gmail.com> <4D9B2BAE.2050602@gmail.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From lnharris at zoology.ubc.ca Tue Apr 5 22:45:16 2011 From: lnharris at zoology.ubc.ca (Les) Date: Tue, 5 Apr 2011 15:45:16 -0500 (CDT) Subject: [R] Arrangement of Lattice Histograms - Top to bottom and then left to right? In-Reply-To: <1302028629569-3428825.post@n4.nabble.com> References: <1302028629569-3428825.post@n4.nabble.com> Message-ID: <1302036316638-3429123.post@n4.nabble.com> Thank you. The help is much appreciated. Les -- View this message in context: http://r.789695.n4.nabble.com/Arrangement-of-Lattice-Histograms-Top-to-bottom-and-then-left-to-right-tp3428825p3429123.html Sent from the R help mailing list archive at Nabble.com. From dan.abner99 at gmail.com Tue Apr 5 21:58:17 2011 From: dan.abner99 at gmail.com (Dan Abner) Date: Tue, 5 Apr 2011 15:58:17 -0400 Subject: [R] IFELSE function XXXX Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From xueke at pdx.edu Tue Apr 5 21:56:35 2011 From: xueke at pdx.edu (xueke at pdx.edu) Date: Tue, 5 Apr 2011 12:56:35 -0700 Subject: [R] simple save question Message-ID: <20110405125635.31101nfivkjlngfn@webmail.pdx.edu> Hi, When I run the survfit function, I want to get the restricted mean value and the standard error also. I found out using the "print" function to do so, as shown below, print(km.fit,print.rmean=TRUE) Call: survfit(formula = Surv(diff, status) ~ 1, type = "kaplan-meier") records n.max n.start events *rmean *se(rmean) median 200.000 200.000 200.000 129.000 0.145 0.237 1.158 0.95LCL 0.95UCL 0.450 1.730 * restricted mean with upper limit = 2.97 The questions is, is there any way to extract these values from the print command? Thanks a lot. Xueke From wdunlap at tibco.com Tue Apr 5 23:09:59 2011 From: wdunlap at tibco.com (William Dunlap) Date: Tue, 5 Apr 2011 14:09:59 -0700 Subject: [R] IFELSE function XXXX In-Reply-To: References: Message-ID: <77EB52C6DD32BA4D87471DCD70C8D700041B27AF@NA-PA-VBE03.na.tibco.com> > -----Original Message----- > From: r-help-bounces at r-project.org > [mailto:r-help-bounces at r-project.org] On Behalf Of Dan Abner > Sent: Tuesday, April 05, 2011 12:58 PM > To: r-help at r-project.org > Subject: [R] IFELSE function XXXX > > Hello everyone, > > This IFELSE function call is not working properly. I do not > receive an error > message, but the actions are not executed conditional as I > was hoping. Any > assistance is appreciated. > > set.seed(12345) > res1<-rbinom(10000,1,.1) > rdata3<-transform(data.frame(res1),input1=rnorm(10000,50,10)) > data3 > #inducing correlation between res1 & input1 > ifelse(data3$res1==1,data3$input1<-data3$input1+10,data3$input 1<-data3$input1) Avoid assignments in arguments to function calls, especially multiple assignments to the same object, except when you know what you are doing and want to write obscure code. 
Change the above line to
  data3$input1 <- ifelse(data3$res1==1, data3$input1+10, data3$input1)

There is nothing too special about ifelse here; you would get
similar results if you put assignment statements in other
function calls. The assignment gets evaluated just as things
like log(12) or data3$input1+10 get evaluated when given as
arguments. E.g.,
  > x <- 1
  > cat(x <- 2:1, x <- 6:3, x <- 10:4, "\n")
  2 1 6 5 4 3 10 9 8 7 6 5 4
  > x
  [1] 10  9  8  7  6  5  4
Which version of x you end up with depends on which
argument the function evaluates last.
  > f <- function(arg1, arg2) arg2 + arg1
  > f( x <- 7, x <- 101 )
  [1] 108
  > x
  [1] 7

An example of obscure code involving multiple assignments
to one object is
   if (is.null(tms <- x$terms) && is.null(tms <- attr(x, "terms"))) {
      stop("Cannot find a list component or attribute called terms")
   }
   tms # x$terms if it exists, otherwise attr(x, "terms")
Since the order of evaluation of the 2 arguments to && is well
defined (left then right, and the right won't be evaluated unless
the left is TRUE), this produces a trustworthy answer. Most
functions don't promise any particular order of evaluation.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> data3
>
> Thank you,
>
> Dan
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

From p.murrell at auckland.ac.nz Tue Apr 5 23:15:24 2011
From: p.murrell at auckland.ac.nz (Paul Murrell)
Date: Wed, 06 Apr 2011 09:15:24 +1200
Subject: [R] grImport/ghostscript problems
In-Reply-To: 
References: <4D8F8E7A.60105@auckland.ac.nz>
Message-ID: <4D9B866C.9010200@auckland.ac.nz>

Hi

On 5/04/2011 9:30 p.m., guillaume Le Ray wrote:
> Hi Al,
>
> I'm facing exactly the same problem as you are, have you managed to
> fix it?
If yes I eager to know the trick. Al's problem turned out to be a bug in 'grImport', so one thing you can try is to install the latest version of 'grImport'. If that still fails, you might be able to get more information about the problem by looking at the end of the XML file that is created by PostScriptTrace(). If ghostscript has hit trouble it's error messages will hopefully be at the end of that XML file. Paul > > Regards, > > Guillaume > > 2011/3/27 Al Roark > >> Paul Murrell auckland.ac.nz> writes: >> >>> >>> Hi >>> >>> On 28/03/2011 8:13 a.m., Al Roark wrote: >>>> >>>> Hi All: I've been struggling for a while trying to get grImport >>>> up and running. I'm on a Windows 7 (home premium 64 bit) >>>> machine running R-2.12.2 along with GPL Ghostscript 9.01. I've >>>> set my Windows PATH variable to point to the Ghostscript \bin >>>> and \lib directories, and I've created the R_GSCMD environment >>>> variable pointing to gswin32c.exe. I don't have any experience >>>> with Ghostscript, but with the setup described above I can view >>>> the postscript file with the following command to the Windows >>>> command prompt: gswin32c.exe D:\Sndbx\vasarely.ps However, I >>>> can't get the PostScriptTrace() function to work on the same >>>> file. Submitting PostScriptTrace("D:/Sndbx/vasarely.ps") gives >>>> me the error: Error in PostScriptTrace("D:/Sndbx/vasarely.ps") >>>> : status 127 in running command 'gswin32c.exe -q -dBATCH >>>> -dNOPAUSE -sDEVICE=pswrite >>>> -sOutputFile=C:\Users\Al\AppData\Local\Temp\RtmppPjDAf\file5db99cb >>>> >>>> -sstdout=vasarely.ps.xml capturevasarely.ps' Your suggestions are >>>> much appreciated. Cheers, Al [[alternative HTML version >>>> deleted]] >>> >>> You could try running the ghostscript command that is printed in >>> the error message at the Windows command prompt to see more info >>> about the problem (might need to remove the '-q' so that >>> ghostscript prints messages to the screen). >>> >>> Paul >>> >> >> Thanks for your reply. 
>> >> Perhaps this is a Ghostscript problem. When I run the Ghostscript >> command, I'm met with the rather unhelpful error: 'GPL Ghostscript >> 9.01: Unrecoverable error, exit code 1 (occurs whether or not I >> remove the -q)'. >> >> Interestingly, if I remove the final argument (in this case, >> capturevasarely.ps) the Ghostscript command executes, placing a >> file (appears to be xml) in the temporary directory. However, I'm >> not sure what to do with this result. >> >> ______________________________________________ R-help at r-project.org >> mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do >> read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > [[alternative HTML version deleted]] > > ______________________________________________ R-help at r-project.org > mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do > read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. -- Dr Paul Murrell Department of Statistics The University of Auckland Private Bag 92019 Auckland New Zealand 64 9 3737599 x85392 paul at stat.auckland.ac.nz http://www.stat.auckland.ac.nz/~paul/ From d.scott at auckland.ac.nz Tue Apr 5 23:30:46 2011 From: d.scott at auckland.ac.nz (David Scott) Date: Wed, 6 Apr 2011 09:30:46 +1200 Subject: [R] lattice: how to "center" a subtitle? In-Reply-To: References: <57C6A6F6-4CB1-49B2-A0DA-4480EFD385A3@comcast.net> <4D9A6558.4050405@auckland.ac.nz> Message-ID: <4D9B8A06.5020503@auckland.ac.nz> On 6/04/2011 12:47 a.m., Deepayan Sarkar wrote: > On Tue, Apr 5, 2011 at 6:12 AM, David Scott wrote: > > [...] > >> I am not sure where I read it and I can't find it again, but my >> understanding is that expressions using bquote with lattice need to be >> enclosed in as.expression() to work. That is in contrast to what happens in >> base graphics. 
>> >> Here is a simple example. >> >> a<- 2 >> plot(1:10, a*(1:10), main = bquote(alpha == .(a))) >> require(lattice) >> xyplot(a*(1:10)~ 1:10, main = bquote(alpha == .(a))) >> xyplot(a*(1:10)~ 1:10, main = as.expression(bquote(alpha == .(a)))) >> >> Which produces: >> >>> a<- 2 >>> plot(1:10, a*(1:10), main = bquote(alpha == .(a))) >>> require(lattice) >> Loading required package: lattice >>> xyplot(a*(1:10)~ 1:10, main = bquote(alpha == .(a))) >> Error in trellis.skeleton(formula = a * (1:10) ~ 1:10, cond = list(c(1L, : >> object 'alpha' not found >>> xyplot(a*(1:10)~ 1:10, main = as.expression(bquote(alpha == .(a)))) >> >> Using expression() rather than as.expression() doesn't produce the desired >> affect. Try it yourself. >> >> As to why this is the case ..... > > Let's see: ?xyplot says > > 'main': Typically a character string or expression describing > the main title to be placed on top of each page. [...] > > So, lattice is fairly explicit, by R standards, in requiring 'main' to > be "character" or "expression". On the other hand, ?title says > > The labels passed to 'title' can be character strings or language > objects (names, calls or expressions), or [...] > > so it additionally accepts "names" and "calls". > > Now, we have > >> a<- 2 >> foo<- bquote(alpha == .(a)) > >> foo # Looks OK > alpha == 2 >> mode(foo) # But > [1] "call" >> is.expression(foo) # not an expression > [1] FALSE > >> is.expression(expression(foo)) ## YES, but > [1] TRUE >> expression(foo) ## not what we want > expression(foo) > >> is.expression(as.expression(foo)) > [1] TRUE >> as.expression(foo) ## This IS what we want > expression(alpha == 2) > > So I submit that lattice is behaving exactly as suggested by its documentation. > > Now you would naturally argue that this is hiding behind > technicalities, and if "call" objects work for plot(), it should work > for lattice as well. 
But watch this: > >> plot(1:10, main = foo) # works perfectly > >> arglist<- list(1:10, main = foo) >> arglist # Looks like what we want > [[1]] > [1] 1 2 3 4 5 6 7 8 9 10 > > $main > alpha == 2 > >> do.call(plot, arglist) > Error in as.graphicsAnnot(main) : object 'alpha' not found > > ...which I would say is "unexpected" behaviour, if not a bug. > > The moral of the story is that unevaluated calls are dangerous objects > (try this one out for fun: > > foo<- bquote(q(.(x)), list(x = "no")) > do.call(plot, list(1:10, main = foo)) > > ), and carrying them around is not a good idea. > > Lattice does use the do.call paradigm quite a bit, and I think it > might be quite difficult to fix it up to handle non-expression > language objects (which will still not fix the type of problem shown > above). > > -Deepayan Thanks very much for this explanation Deepayan. Part of my intention in contributing to this thread was to have something explicit in the archives for future reference, and your reply is excellent in that regard. And many thanks for your work on lattice. David Scott -- _________________________________________________________________ David Scott Department of Statistics The University of Auckland, PB 92019 Auckland 1142, NEW ZEALAND Phone: +64 9 923 5055, or +64 9 373 7599 ext 85055 Email: d.scott at auckland.ac.nz, Fax: +64 9 373 7018 Director of Consulting, Department of Statistics From felipe.parra at quantil.com.co Wed Apr 6 00:32:51 2011 From: felipe.parra at quantil.com.co (Luis Felipe Parra) Date: Wed, 6 Apr 2011 06:32:51 +0800 Subject: [R] solveRsocp in fPortfolio Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: no disponible URL: From mjdubya at gmail.com Wed Apr 6 00:34:50 2011 From: mjdubya at gmail.com (mjdubya) Date: Tue, 5 Apr 2011 17:34:50 -0500 (CDT) Subject: [R] Search arrays based on similar values Message-ID: <1302042890316-3429381.post@n4.nabble.com> Hey folks, I have two arrays: "A" (1X100) with non-ordered values ranging 1-14 "B" (2X14) containing 14 decimal values. I would like to create a new array (1X100) that contains only the decimal values from array B by associating the integers from A and B. In other words, for each value of A find the same integer value of B, select the associated decimal value of B. A B newarray 1 1 0.1 0.1 1 2 0.3 0.1 1 3 0.14 0.1 2 4 0.2 0.3 3 5 0.82 0.14 3 6 0.21 0.14 4 . . 0.2 7 . . . 14 . . . 4 14 0.03 . 3 . 5 . . . . . Any suggestions are greatly appreciated! -- View this message in context: http://r.789695.n4.nabble.com/Search-arrays-based-on-similar-values-tp3429381p3429381.html Sent from the R help mailing list archive at Nabble.com. From bbolker at gmail.com Wed Apr 6 00:48:50 2011 From: bbolker at gmail.com (Ben Bolker) Date: Tue, 5 Apr 2011 22:48:50 +0000 Subject: [R] IFELSE function XXXX References: <77EB52C6DD32BA4D87471DCD70C8D700041B27AF@NA-PA-VBE03.na.tibco.com> Message-ID: William Dunlap tibco.com> writes: > Avoid assignments in arguments to function calls, especially > multiple assignments to the same object, except when you know > what you are doing and want to write obscure code. 
>
> Change the above line to
> data3$input1 <- ifelse(data3$res1==1, data3$input1+10, data3$input1)
>

data3 <- transform(data3,input1=ifelse(res1==1,input1+10,input1))

might be even clearer

From dwinsemius at comcast.net Wed Apr 6 00:59:42 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Tue, 5 Apr 2011 18:59:42 -0400
Subject: [R] Search arrays based on similar values
In-Reply-To: <1302042890316-3429381.post@n4.nabble.com>
References: <1302042890316-3429381.post@n4.nabble.com>
Message-ID: <78F28735-C69C-40A2-A52D-159B47ACA702@comcast.net>

On Apr 5, 2011, at 6:34 PM, mjdubya wrote:

> Hey folks,
> I have two arrays: "A" (1X100) with non-ordered values ranging 1-14
> "B" (2X14) containing 14 decimal values.
> I would like to create a new array (1X100) that contains only the
> decimal
> values from array B by associating the integers from A and B. In
> other
> words, for each value of A find the same integer value of B, select
> the
> associated decimal value of B.

Sounds like a job for merge(). Tested solutions offered when
reproducible examples are provided.

>
> A      B          newarray
> 1      1   0.1    0.1
> 1      2   0.3    0.1
> 1      3   0.14   0.1
> 2      4   0.2    0.3
> 3      5   0.82   0.14
> 3      6   0.21   0.14
> 4      .   .      0.2
> 7      .   .      .
> 14     .   .      .
> 4      14  0.03   .
> 3                 .
> 5                 .
> .                 .
> .                 .

David Winsemius, MD
West Hartford, CT

From szimine at gmail.com Wed Apr 6 01:30:04 2011
From: szimine at gmail.com (stan zimine)
Date: Wed, 6 Apr 2011 01:30:04 +0200
Subject: [R] R gui on windows how to force to always show the last line of output
In-Reply-To: 
References: 
Message-ID: 

thank you, David, for your answer, which has a therapeutic effect.

Otherwise i was launching my prod code from Emacs ESS just as Janice suggested.

On Sat, Apr 2, 2011 at 2:27 PM, David Winsemius wrote:
>
> On Apr 2, 2011, at 4:21 AM, stan zimine wrote:
>
>> Hi.
>> Googled but did not find the answer for the following little issue.
>> >> how to force R gui ?on windows (maybe a specific setting) ?to always >> show the last line of output in the window console. >> >> >> My program in R makes measurements every 5 mins in indefinite loop and >> prints ?results in the console. >> >> The problem: ?last messages are not visible, ?The scrolling bar of the >> gui console ?gets shorter. I.e. ?you have to scroll for the last >> messages. >> >> Thanks if anybody knows the sol to this prob. > > You may want to add flush.console() to the code. > > -- > > David Winsemius, MD > West Hartford, CT > > From dimitri.liakhovitski at gmail.com Wed Apr 6 01:56:23 2011 From: dimitri.liakhovitski at gmail.com (Dimitri Liakhovitski) Date: Tue, 5 Apr 2011 19:56:23 -0400 Subject: [R] merging 2 frames while keeping all the entries from the "reference" frame In-Reply-To: References: Message-ID: Thanks a lot - these solutions are much more elegant than my own: new.data<-merge(mydata[mydata$group %in% levels(mydata$group)[1],],reference,by="mydate",all.x=T,all.y=T) new.data[["group"]][is.na(new.data[["group"]])]<-levels(mydata$group)[1] new.data[["values"]][is.na(new.data[["values"]])]<-0 # Continue Merging - starting with Group2: for(i in 2:nlevels(mydata$group)){ #i<-2 temp<-merge(mydata[mydata$group %in% levels(mydata$group)[i],],reference,by="mydate",all.x=T,all.y=T) temp[["group"]][is.na(temp[["group"]])]<-levels(mydata$group)[i] temp[["values"]][is.na(temp[["values"]])]<-0 new.data<-rbind(new.data,temp) } Dimitri On Mon, Apr 4, 2011 at 3:07 PM, Henrique Dallazuanna wrote: > Try this: > > ?merge(mydata, cbind(reference, group = rep(unique(mydata$group), each > = nrow(reference))), all = TRUE) > > On Mon, Apr 4, 2011 at 2:24 PM, Dimitri Liakhovitski > wrote: >> To clarify just in case, here is the result I am trying to get: >> >> mydate ?group ? values >> 12/29/2008 ? ? ?Group1 ?0.453466522 >> 1/5/2009 ? ? ? ?Group1 ?NA >> 1/12/2009 ? ? ? Group1 ?0.416548943 >> 1/19/2009 ? ? ? Group1 ?2.066275155 >> 1/26/2009 ? ? ? 
Group1 ?2.037729638 >> 2/2/2009 ? ? ? ?Group1 ?-0.598040483 >> 2/9/2009 ? ? ? ?Group1 ?1.658999227 >> 2/16/2009 ? ? ? Group1 ?-0.869325211 >> 12/29/2008 ? ? ?Group2 ?NA >> 1/5/2009 ? ? ? ?Group2 ?NA >> 1/12/2009 ? ? ? Group2 ?NA >> 1/19/2009 ? ? ? Group2 ?0.375284194 >> 1/26/2009 ? ? ? Group2 ?0.706785401 >> 2/2/2009 ? ? ? ?Group2 ?NA >> 2/9/2009 ? ? ? ?Group2 ?2.104937151 >> 2/16/2009 ? ? ? Group2 ?2.880393978 >> >> >> >> On Mon, Apr 4, 2011 at 1:09 PM, Dimitri Liakhovitski >> wrote: >>> Hello! >>> I have my data frame "mydata" (below) and data frame "reference" - >>> that contains all the dates I would like to be present in the final >>> data frame. >>> I am trying to merge them so that the the result data frame contains >>> all 8 dates in both subgroups (i.e., Group1 should have 8 rows and >>> Group2 too). But when I merge it it's not coming out this way. Any >>> hint would be greatly appreciated! >>> Dimitri >>> >>> mydata<-data.frame(mydate=rep(seq(as.Date("2008-12-29"), length = 8, >>> by = "week"),2), >>> group=c(rep("Group1",8),rep("Group2",8)),values=rnorm(16,1,1)) >>> (reference);(mydata) >>> set.seed(1234) >>> out<-sample(1:16,5,replace=F) >>> mydata<-mydata[-out,]; dim(mydata) >>> (mydata) >>> >>> # "reference" contains the dates I want to be present in the final data frame: >>> reference<-data.frame(mydate=seq(as.Date("2008-12-29"), length = 8, by >>> = "week")) >>> >>> # Merging: >>> new.data<-merge(mydata,reference,by="mydate",all.x=T,all.y=T) >>> new.data<-new.data[order(new.data$group,new.data$mydate),] >>> (new.data) >>> # my new.data contains only 7 rows in Group 1 and 4 rows in Group 2 >>> >>> >>> -- >>> Dimitri Liakhovitski >>> Ninah Consulting >>> >> >> >> >> -- >> Dimitri Liakhovitski >> Ninah Consulting >> www.ninah.com >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> 
and provide commented, minimal, self-contained, reproducible code.
>>
>
>
>
> --
> Henrique Dallazuanna
> Curitiba-Paraná-Brasil
> 25° 25' 40" S 49° 16' 22" O
>

--
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com

From gunter.berton at gene.com Wed Apr 6 02:17:36 2011
From: gunter.berton at gene.com (Bert Gunter)
Date: Tue, 5 Apr 2011 17:17:36 -0700
Subject: [R] A fortunes candidate?
Message-ID: 

A fortunes candidate?

On Tue, Apr 5, 2011 at 3:59 PM, David Winsemius wrote:
...
" Tested solutions offered when reproducible examples are provided. "

--
"Men by nature long to get on to the ultimate truths, and will often
be impatient with elementary studies or fight shy of them. If it were
possible to reach the ultimate truths without the elementary studies
usually prefixed to them, these would not be preparatory studies but
superfluous diversions."

-- Maimonides (1135-1204)

Bert Gunter
Genentech Nonclinical Biostatistics

From quan.poko2000 at gmail.com Wed Apr 6 02:28:02 2011
From: quan.poko2000 at gmail.com (Quan Zhou)
Date: Tue, 5 Apr 2011 20:28:02 -0400
Subject: [R] Error in match.names(clabs, names(xi))
Message-ID: 

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: 

From jrkrideau at yahoo.ca Wed Apr 6 04:06:55 2011
From: jrkrideau at yahoo.ca (John Kane)
Date: Tue, 5 Apr 2011 19:06:55 -0700 (PDT)
Subject: [R] A fortunes candidate?
In-Reply-To: 
Message-ID: <937029.89951.qm@web38404.mail.mud.yahoo.com>

Sounds more like an advertisement

A bit like the old TV story "Paladin, have gun, will travel".

--- On Tue, 4/5/11, Bert Gunter wrote:

> From: Bert Gunter
> Subject: [R] A fortunes candidate?
> To: "David Winsemius"

> A fortunes candidate?
> On Tue, Apr 5, 2011 at 3:59 PM, David Winsemius
> wrote:
>
> " Tested solutions offered when reproducible examples are
> provided.
" > From dwinsemius at comcast.net Wed Apr 6 04:44:52 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Tue, 5 Apr 2011 22:44:52 -0400 Subject: [R] Error in match.names(clabs, names(xi)) In-Reply-To: References: Message-ID: <6B4148F4-7949-4EAF-8FF5-8761BE12A140@comcast.net> On Apr 5, 2011, at 8:28 PM, Quan Zhou wrote: > Hi Guys, > I have this part of a program: > library(survival) > Gastric <- cbind.data.frame(Gp=c(rep(1,45),rep(0,45)), ### 2nd gp 0 > time=c(1,63,105,129,182,216,250,262,301,301,342,354,356,358, > 380,383, 383,388,394,408,460,489,499,523,524,535,562,569,675,676, > > 748,778,786,797,955,968,1000,1245,1271,1420,1551,1694,2363,2754,2950, > > 17,42,44,48,60,72,74,95,103,108,122,144,167,170,183,185,193,195,197, > > 208,234,235,254,307,315,401,445,464,484,528,542,547,577,580,795,855, > 1366,1577,2060,2412,2486,2796,2802,2934,2988), Dth=c(rep(1,43), > 0,0, rep(1,39), rep(0,6))) > CoxG0 <- coxph(Surv(time,Dth) ~ Gp, Gastric) > srvGastA <- survfit(Surv(Gastric$time,Gastric$Dth)~1) ## 88 distinct > times > #Gastric$time is all the time points either death or largest > obervation > time. > #srvGastA$time is all the unique times > newGas <- data.frame(start=0, stop=1, Dth=1, Ploidy=1, tim=0) > #newGas <- r(0,1,1,1,0) > for (i in 2:90) { > timind <- match(Gastric$time[i],srvGastA$time) > tmpmat <- array(0, dim=c(timind,5))#build an array with > dim('index',5) > tmpmat[,4] <- rep(Gastric[i,1], timind)#fourth column, return i's > group > tmpmat[timind,3] <- Gastric$Dth[i] > tmpmat[,1] <- if(timind>1) c(0,srvGastA$time[1:(timind-1)]) else 0 > tmpmat[,2] <- srvGastA$time[1:timind] > newGas <- rbind(newGas,tmpmat) } You haven't told us what you expect (or even what you are trying to do), but if you add an assignment of colnames before the final rbind()-ing of a data.frame and a matrix you get some sort of result: ..... 
+ tmpmat[,2] <- srvGastA$time[1:timind] ; colnames(tmpmat) <- names(newGas) + newGas <- rbind(newGas,tmpmat) } > str(newGas) 'data.frame': 3988 obs. of 5 variables: $ start : num 0 0 1 17 42 44 48 60 0 1 ... $ stop : num 1 1 17 42 44 48 60 63 1 17 ... $ Dth : num 1 0 0 0 0 0 0 1 0 0 ... $ Ploidy: num 1 1 1 1 1 1 1 1 1 1 ... $ tim : num 0 0 0 0 0 0 0 0 0 0 ... > ____________________________________________________________________ > I found when include "the last line" in the for loop. the error will > jump > out. But I do not know how to fix it. I initialize newGas without > names. Good luck with trying to make a data.frame without column names. > it > does not work either.... > It would be great if anyone knows how to fix the problem. Thanks a > lot. > BR > Quan -- David Winsemius, MD West Hartford, CT From sarah.kalicin at intel.com Wed Apr 6 01:48:52 2011 From: sarah.kalicin at intel.com (Kalicin, Sarah) Date: Tue, 5 Apr 2011 16:48:52 -0700 Subject: [R] Pulling strings from a Flat file Message-ID: <9DA5872FEF993D41B7173F58FCF6BE9451283934@orsmsx504.amr.corp.intel.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From dwinsemius at comcast.net Wed Apr 6 04:59:47 2011 From: dwinsemius at comcast.net (David Winsemius) Date: Tue, 5 Apr 2011 22:59:47 -0400 Subject: [R] Pulling strings from a Flat file In-Reply-To: <9DA5872FEF993D41B7173F58FCF6BE9451283934@orsmsx504.amr.corp.intel.com> References: <9DA5872FEF993D41B7173F58FCF6BE9451283934@orsmsx504.amr.corp.intel.com> Message-ID: <86A3DB71-05FE-48D9-949F-9F3AFDE92CC0@comcast.net> On Apr 5, 2011, at 7:48 PM, Kalicin, Sarah wrote: > Hi, > > I have a flat file that contains a bunch of strings that look like > this. The file was originally in Unix and brought over into Windows: > > E123456E234567E345678E456789E567891E678910E. . . . > Basically the string starts with E and is followed with 6 numbers. > One string=E123456, length=7 characters. 
This file contains 10,000's > of these strings. I want to separate them into one vector the length > of the number of strings in the flat file, where each string is it's > on unique value. > > cc<-c(7,7,7,7,7,7,7) >> aa<- file("Master","r", raw=TRUE) >> readChar(aa, cc, useBytes = FALSE) > [1] "E123456" "\nE23456" "7\nE3456" "78\nE456" "789\nE56" > "7891\nE6" "78910\nE" >> close(aa) >> unlink("Master") > txt <- "E123456E234567E345678E456789E567891E678910E" # You could use readLines to bring in from the file # and assign to a character vector for work in R. > gsub("(E[[:digit:]]{6})", "\\1\n", txt) [1] "E123456\nE234567\nE345678\nE456789\nE567891\nE678910\nE" # Seems to be "working" properly > ?scan > scan(textConnection(gsub("(E[[:digit:]]{6})", "\\1\n", txt)), what="character") Read 7 items [1] "E123456" "E234567" "E345678" "E456789" "E567891" "E678910" "E" You might be able to use read.table or variants. > > The biggest issue is I am getting \n added into the string, which I > am not sure where it is coming from, and splices the strings. Any > suggestions on getting rid of the /n and create an infinite sequence > of 7's for the string length for the cc vector? Is there a better > way to do this? > > Sarah > David Winsemius, MD West Hartford, CT From deepayan.sarkar at gmail.com Wed Apr 6 06:04:45 2011 From: deepayan.sarkar at gmail.com (Deepayan Sarkar) Date: Wed, 6 Apr 2011 09:34:45 +0530 Subject: [R] Arrangement of Lattice Histograms - Top to bottom and then left to right? In-Reply-To: References: <1302028629569-3428825.post@n4.nabble.com> Message-ID: On Wed, Apr 6, 2011 at 12:34 AM, Bert Gunter wrote: > Look for the index.cond argument at the bottom of the Help page for xyplot. > -- Bert Also ?print.trellis and ?packet.panel.default for a more general (non-example-specific) approach. -Deepayan > On Tue, Apr 5, 2011 at 11:37 AM, Les wrote: >> Hi List, >> Using Lattice, I have created a plot of histograms showing Fork Length by >> Year. 
The plot shows the histograms in 3 columns and 5 rows. Using the >> as.table=T function I can get the years to start on top. However, what I >> would like to do is have the first year start in the top left (column 1, row >> 1; as it is now) and add the subsequent histograms to the plot going down >> the column and then over by row (example and current code is shown below). >> Is this possible? I have spent a great deal of time searching, and have not >> found any clues. Any help/ideas would be greatly appreciated. >> >> For example, what is being done now: >> >> 1974 1975 1976 >> 1977 1978 1979 >> 1980 1981 1982 >> >> and what I am hoping for: >> >> 1974 1977 1980 >> 1975 1978 1981 >> 1976 1979 1982 >> >> Thank you kindly, >> Les >> >> data=read.csv("AllData.csv",sep=",",header=T) >> data$year=as.factor(data$year) >> ferg=data[data$wtrbody=="Ferguson",c(1:12)] >> library(lattice) >> >> strip.background=trellis.par.get("strip.background") >> trellis.par.set(strip.background = list(col = grey(7:1/8))) >> >> histogram(~fl|year, data=ferg, as.table=T, type="count", >> col='dark grey',layout=c(3,5),lwd=2, lty=1, >> xlab=list("Fork Length (mm)", cex=1.4,font=2), >> breaks=seq(from=300,to=900,by=25), >> ylab=list("Frequency",cex=1.4,font=2), >> scales=list(font=2,cex=1.1, tck=c(1,0), alternating=1, >> y=list(relation="free",tick.number=3)), >> par.strip.text=list(cex=1.2),font=2, >> aspect=0.6) >> >> -- >> View this message in context: http://r.789695.n4.nabble.com/Arrangement-of-Lattice-Histograms-Top-to-bottom-and-then-left-to-right-tp3428825p3428825.html >> Sent from the R help mailing list archive at Nabble.com. >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. 
>> > > > > -- > "Men by nature long to get on to the ultimate truths, and will often > be impatient with elementary studies or fight shy of them. If it were > possible to reach the ultimate truths without the elementary studies > usually prefixed to them, these would not be preparatory studies but > superfluous diversions." > > -- Maimonides (1135-1204) > > Bert Gunter > Genentech Nonclinical Biostatistics > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From mwege at zoology.up.ac.za Wed Apr 6 08:21:27 2011 From: mwege at zoology.up.ac.za (mwege) Date: Tue, 5 Apr 2011 23:21:27 -0700 (PDT) Subject: [R] Package diveMove readTDR problem Message-ID: <1302070887814-3429980.post@n4.nabble.com> Hi, I am trying to read my TDR data into R using the readTDR function for the diveMove package. > seal <- readTDR("file location and name here", dateCol=1, depthCol=3, > speed=FALSE, subsamp=1, concurrentCols=4:5) But I keep getting the following error: > Error: all(!is.na(time)) is not TRUE All my columns to have values in them (there are no empty records) The manual and vignette of the package diveMove doesnt give a proper description of how to read data into R. It only describes how to access the data in the system file that comes with the package. What am I doing wrong? Thank you -- View this message in context: http://r.789695.n4.nabble.com/Package-diveMove-readTDR-problem-tp3429980p3429980.html Sent from the R help mailing list archive at Nabble.com. From lrrylsrst at gmail.com Wed Apr 6 08:13:26 2011 From: lrrylsrst at gmail.com (Larry) Date: Tue, 5 Apr 2011 23:13:26 -0700 Subject: [R] executing from .R source file in the src package Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From savicky at praha1.ff.cuni.cz Wed Apr 6 08:36:29 2011 From: savicky at praha1.ff.cuni.cz (Petr Savicky) Date: Wed, 6 Apr 2011 08:36:29 +0200 Subject: [R] Search arrays based on similar values In-Reply-To: <1302042890316-3429381.post@n4.nabble.com> References: <1302042890316-3429381.post@n4.nabble.com> Message-ID: <20110406063629.GA11296@praha1.ff.cuni.cz> On Tue, Apr 05, 2011 at 05:34:50PM -0500, mjdubya wrote: > Hey folks, > I have two arrays: "A" (1X100) with non-ordered values ranging 1-14 > "B" (2X14) containing 14 decimal values. > I would like to create a new array (1X100) that contains only the decimal > values from array B by associating the integers from A and B. In other > words, for each value of A find the same integer value of B, select the > associated decimal value of B. > > A B newarray > 1 1 0.1 0.1 > 1 2 0.3 0.1 > 1 3 0.14 0.1 > 2 4 0.2 0.3 > 3 5 0.82 0.14 > 3 6 0.21 0.14 > 4 . . 0.2 > 7 . . . > 14 . . . > 4 14 0.03 . > 3 . > 5 . > . . > . . Hi. The first column of B seems to contain consecutive integers 1:nrow(B). If this is true, then try following B <- cbind(1:6, c(0.1, 0.3, 0.14, 0.2, 0.82, 0.21)) A <- rbind(1, 1, 1, 2, 3, 3, 4, 6) cbind(B[A[, 1], 2]) Hope this helps. Petr Savicky. From Bill.Venables at csiro.au Wed Apr 6 09:05:08 2011 From: Bill.Venables at csiro.au (Bill.Venables at csiro.au) Date: Wed, 6 Apr 2011 17:05:08 +1000 Subject: [R] Pulling strings from a Flat file In-Reply-To: <9DA5872FEF993D41B7173F58FCF6BE9451283934@orsmsx504.amr.corp.intel.com> References: <9DA5872FEF993D41B7173F58FCF6BE9451283934@orsmsx504.amr.corp.intel.com> Message-ID: <1BDAE2969943D540934EE8B4EF68F95FB28F8935C3@EXNSW-MBX03.nexus.csiro.au> Isn't all you need read.fwf? 
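A minimal sketch of the fixed-width idea, under the assumption (suggested by the "\n" showing up in the readChar() output) that the file holds one 7-character code per line. The temporary file and the sample codes are stand-ins for the poster's "Master" file, not her real data.

```r
# Hypothetical stand-in for the "Master" file: one 7-character code per line.
tmp <- tempfile()
writeLines(c("E123456", "E234567", "E345678"), tmp)

# If each code sits on its own line, readLines() already gives one string per code.
codes <- readLines(tmp)

# read.fwf() does the same with an explicit field width of 7;
# colClasses keeps the column as character rather than factor.
codes_fwf <- read.fwf(tmp, widths = 7, colClasses = "character")[[1]]

# If the codes really were one unbroken run of text, fixed-width slicing
# with substring() splits it without needing a vector of 7's in advance.
txt <- "E123456E234567E345678"
starts <- seq(1, nchar(txt), by = 7)
codes_run <- substring(txt, starts, starts + 6)

unlink(tmp)
```

Either route avoids building the cc vector by hand; which one applies depends on whether the line breaks are really in the file.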
________________________________________ From: r-help-bounces at r-project.org [r-help-bounces at r-project.org] On Behalf Of Kalicin, Sarah [sarah.kalicin at intel.com] Sent: 06 April 2011 09:48 To: r-help at r-project.org Subject: [R] Pulling strings from a Flat file Hi, I have a flat file that contains a bunch of strings that look like this. The file was originally in Unix and brought over into Windows: E123456E234567E345678E456789E567891E678910E. . . . Basically the string starts with E and is followed with 6 numbers. One string=E123456, length=7 characters. This file contains 10,000's of these strings. I want to separate them into one vector the length of the number of strings in the flat file, where each string is it's on unique value. cc<-c(7,7,7,7,7,7,7) > aa<- file("Master","r", raw=TRUE) > readChar(aa, cc, useBytes = FALSE) [1] "E123456" "\nE23456" "7\nE3456" "78\nE456" "789\nE56" "7891\nE6" "78910\nE" > close(aa) > unlink("Master") The biggest issue is I am getting \n added into the string, which I am not sure where it is coming from, and splices the strings. Any suggestions on getting rid of the /n and create an infinite sequence of 7's for the string length for the cc vector? Is there a better way to do this? Sarah [[alternative HTML version deleted]] ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. From laomeng.3 at gmail.com Wed Apr 6 09:16:21 2011 From: laomeng.3 at gmail.com (Lao Meng) Date: Wed, 6 Apr 2011 15:16:21 +0800 Subject: [R] qcc.overdispersion-test In-Reply-To: <20110401232448.20386ud3gd2ka1ls@webmail.unibw-hamburg.de> References: <20110401232448.20386ud3gd2ka1ls@webmail.unibw-hamburg.de> Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From savicky at praha1.ff.cuni.cz Wed Apr 6 09:55:09 2011 From: savicky at praha1.ff.cuni.cz (Petr Savicky) Date: Wed, 6 Apr 2011 09:55:09 +0200 Subject: [R] Support Counting In-Reply-To: <1302011014314-3428062.post@n4.nabble.com> References: <1301897497861-3424730.post@n4.nabble.com> <20110404093724.GA21431@praha1.ff.cuni.cz> <1302011014314-3428062.post@n4.nabble.com> Message-ID: <20110406075509.GB11296@praha1.ff.cuni.cz> On Tue, Apr 05, 2011 at 08:43:34AM -0500, psombe wrote: > well im using the "arules" package and i'm trying to use the support command. Hi. R-help can provide help for some of the frequently used CRAN packages, but not for all. There are too many of them. It is not clear, whether there is someone on R-help, who uses "arules". One of my students is using Eclat for association rules directly, but not from R. I am using R, but not for association rules. Try to determine, whether your question is indeed specific to "arules". If the question may be formulated without "arules", it has a good chance to be replied here. Otherwise, send a query to the package maintainer. Package maintainers usually welcome feedback. > my data is read form a file using the "read.transactions" command and a line > of data looks something like this. there are aboutt 88000 rows and 16000 > different items > > inspect(dset[3]) > items > 1 {33, > 34, > 35} > > inspect(dset[1]) > items > 1 {0, 1, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 2, 20, 21, 22, 23, 24, > 25, 26, 27, 28, 29, 3, 4,5, 6, 7, 8, 9} > > So in order to use support i have to make an object of class "itemsets" and > im kind of struggling with the "new" command. > I made an object of class itemsets by first creating a presence/absence > matrix and with something like 16000 items this is really sort of tedious. I > wonder if there is a better way. 
> > //Currently im doing this > > avec = array(dim=400) //dim is till the max number of the item im concerned > with > avec[1:400] = 0 > avec[27] = 1 > avec[63] = 1 //and do on for all the items i want > > amat = matrix(data = avec,ncol = 400) Up to here, this may be simplified, if the required indices are stored in a vector, say, "indices". For example indices <- c(3, 5, 6, 10) avec <- array(0, dim=14) avec[indices] <- 1 amat <- rbind(avec) or amat <- matrix(0, nrow=1, ncol=14) amat[1, indices] <- 1 amat [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] avec 0 0 1 0 1 1 0 0 0 1 0 0 0 0 Hope this helps. Petr Savicky. From leray.guillaume at gmail.com Wed Apr 6 10:23:30 2011 From: leray.guillaume at gmail.com (guillaume Le Ray) Date: Wed, 6 Apr 2011 10:23:30 +0200 Subject: [R] grImport/ghostscript problems In-Reply-To: <4D9B866C.9010200@auckland.ac.nz> References: <4D8F8E7A.60105@auckland.ac.nz> <4D9B866C.9010200@auckland.ac.nz> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From krishna at primps.com.sg Wed Apr 6 10:45:57 2011 From: krishna at primps.com.sg (SNV Krishna) Date: Wed, 6 Apr 2011 16:45:57 +0800 Subject: [R] syntax to subset for multiple values from a single variable Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From Achim.Zeileis at uibk.ac.at Wed Apr 6 11:07:53 2011 From: Achim.Zeileis at uibk.ac.at (Achim Zeileis) Date: Wed, 6 Apr 2011 11:07:53 +0200 (CEST) Subject: [R] A fortunes candidate? In-Reply-To: References: Message-ID: On Tue, 5 Apr 2011, Bert Gunter wrote: > A fortunes candidate? Definitely! Added to the devel version on R-Forge. thx, Z > On Tue, Apr 5, 2011 at 3:59 PM, David Winsemius wrote: > > ... > > " Tested solutions offered when reproducible examples are provided. " > > > > > > -- > "Men by nature long to get on to the ultimate truths, and will often > be impatient with elementary studies or fight shy of them. 
If it were > possible to reach the ultimate truths without the elementary studies > usually prefixed to them, these would not be preparatory studies but > superfluous diversions." > > -- Maimonides (1135-1204) > > Bert Gunter > Genentech Nonclinical Biostatistics > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > From ivan.calandra at uni-hamburg.de Wed Apr 6 11:38:19 2011 From: ivan.calandra at uni-hamburg.de (Ivan Calandra) Date: Wed, 06 Apr 2011 11:38:19 +0200 Subject: [R] syntax to subset for multiple values from a single variable In-Reply-To: References: Message-ID: <4D9C348B.7010901@uni-hamburg.de> Hi, I'm not sure what you're looking for because it looks to me that you have the answer already... Is this what you want: subset(df, x %in% c('a','b')) ? Ivan On 4/6/2011 10:45, SNV Krishna wrote: > Hi All, > > Is it possible to use the subset() function to select data based on multiple > values of a single variable from a data frame? > > My actual data set is much bigger and I would like to illustrate with the > following dataset >> df = data.frame(x = c('a','b','c','d','e','f','g','h','a','a','b','b'), y > = 1:12) > I would like to select all rows where x = a or b. > > >> subset(df, x == c('a','b')) # this command did not return all rows where x > is equal to a or b > > x y > > 1 a 1 > > 2 b 2 > > 9 a 9 > > 12 b 12 > >> df[df$x %in% c('a','b'),] # subsetting using subscripts returned all rows > x y > > 1 a 1 > > 2 b 2 > > 9 a 9 > > 10 a 10 > > 11 b 11 > > 12 b 12 > > I know there might be a problem with the subset syntax that I have used, but > couldn't figure out what it is. Any insights from members will be highly > appreciated and thanks for the same. > > Regards, > > S.N.V.
Krishna > > > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Ivan CALANDRA PhD Student University of Hamburg Biozentrum Grindel und Zoologisches Museum Abt. Säugetiere Martin-Luther-King-Platz 3 D-20146 Hamburg, GERMANY +49(0)40 42838 6231 ivan.calandra at uni-hamburg.de ********** http://www.for771.uni-bonn.de http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php From ian_renner at yahoo.com Wed Apr 6 12:12:57 2011 From: ian_renner at yahoo.com (Ian Renner) Date: Wed, 6 Apr 2011 03:12:57 -0700 (PDT) Subject: [R] Layout within levelplot from the lattice package Message-ID: <350451.41580.qm@web39401.mail.mud.yahoo.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From y.jiao at ucl.ac.uk Wed Apr 6 12:35:32 2011 From: y.jiao at ucl.ac.uk (Yan Jiao) Date: Wed, 6 Apr 2011 11:35:32 +0100 Subject: [R] function order Message-ID: <7785EFE1CC2D9F4A96F6A415BC1C2BD0168DD5C0@PC6-46.pogb.cancer.ucl.ac.uk> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From shreyasee.pradhan at gmail.com Wed Apr 6 12:44:01 2011 From: shreyasee.pradhan at gmail.com (Shreyasee) Date: Wed, 6 Apr 2011 16:14:01 +0530 Subject: [R] CSV file in "tm" package Message-ID: An embedded and charset-unspecified text was scrubbed... 
Name: not available URL: From jim at bitwrit.com.au Wed Apr 6 14:09:26 2011 From: jim at bitwrit.com.au (Jim Lemon) Date: Wed, 06 Apr 2011 22:09:26 +1000 Subject: [R] function order In-Reply-To: <7785EFE1CC2D9F4A96F6A415BC1C2BD0168DD5C0@PC6-46.pogb.cancer.ucl.ac.uk> References: <7785EFE1CC2D9F4A96F6A415BC1C2BD0168DD5C0@PC6-46.pogb.cancer.ucl.ac.uk> Message-ID: <4D9C57F6.8070908@bitwrit.com.au> On 04/06/2011 08:35 PM, Yan Jiao wrote: > Dear All > > I'm trying to sort a matrix using function order, > Something really odd: > > e.g. > abc<-cbind(c(1,6,2),c(2,5,3),c(3,2,1))## matrix I want to sort > > if I do > abc[ order(abc[,3]), increasing = TRUE] > > the result is correct > [,1] [,2] [,3] > [1,] 2 3 1 > [2,] 6 5 2 > [3,] 1 2 3 > > But if I want to sort in decreasing order: > abc[ order(abc[,3]), decreasing = TRUE] > > the result is wrong > [,1] [,2] [,3] > [1,] 2 3 1 > [2,] 6 5 2 > [3,] 1 2 3 > > Also if I use > abc[ order(abc[,3]), increasing = FALSE] > it returns nothing > [1,] > [2,] > [3,] > > Why is that? > Hi Yan, It is because you have put the "decreasing" argument outside the parentheses, and it is not being used in the "order" function. Jim From p.pagel at wzw.tum.de Wed Apr 6 13:42:35 2011 From: p.pagel at wzw.tum.de (Philipp Pagel) Date: Wed, 6 Apr 2011 13:42:35 +0200 Subject: [R] function order In-Reply-To: <7785EFE1CC2D9F4A96F6A415BC1C2BD0168DD5C0@PC6-46.pogb.cancer.ucl.ac.uk> References: <7785EFE1CC2D9F4A96F6A415BC1C2BD0168DD5C0@PC6-46.pogb.cancer.ucl.ac.uk> Message-ID: <20110406114235.GB5890@maker> On Wed, Apr 06, 2011 at 11:35:32AM +0100, Yan Jiao wrote: > abc<-cbind(c(1,6,2),c(2,5,3),c(3,2,1))## matrix I want to sort > > if I do > abc[ order(abc[,3]), increasing = TRUE] Jim already pointed out that the argument needs to go inside the parentheses of the order function. In addition, order has an argument called 'decreasing', but none called 'increasing'.
Finally, you are lacking a comma in your subsetting of the matrix: > abc[ order(abc[,3], decreasing=F)] [1] 2 6 1 But you probably mean: > abc[ order(abc[,3], decreasing=F), ] [,1] [,2] [,3] [1,] 2 3 1 [2,] 6 5 2 [3,] 1 2 3 cu Philipp -- Dr. Philipp Pagel Lehrstuhl für Genomorientierte Bioinformatik Technische Universität München Wissenschaftszentrum Weihenstephan Maximus-von-Imhof-Forum 3 85354 Freising, Germany http://webclu.bio.wzw.tum.de/~pagel/ From fabrice.ciup at gmail.com Wed Apr 6 10:48:04 2011 From: fabrice.ciup at gmail.com (Fabrice Tourre) Date: Wed, 6 Apr 2011 10:48:04 +0200 Subject: [R] Calculated mean value based on another column bin from dataframe. Message-ID: Dear list, I have a dataframe with two columns as follows. > head(dat) V1 V2 0.15624 0.94567 0.26039 0.66442 0.16629 0.97822 0.23474 0.72079 0.11037 0.83760 0.14969 0.91312 I want to get the mean of column V2 within each bin of column V1. I wrote the code as follows. It works, but I think this is not the elegant way. Any suggestions? dat<-read.table("dat.txt",head=F) ran<-seq(0,0.5,0.05) mm<-NULL for (i in c(1:(length(ran)-1))) { fil<- dat[,1] > ran[i] & dat[,1]<=ran[i+1] m<-mean(dat[fil,2]) mm<-c(mm,m) } mm Here are the first 20 lines of my data. > dput(head(dat,20)) structure(list(V1 = c(0.15624, 0.26039, 0.16629, 0.23474, 0.11037, 0.14969, 0.16166, 0.09785, 0.36417, 0.08005, 0.29597, 0.14856, 0.17307, 0.36718, 0.11621, 0.23281, 0.10415, 0.1025, 0.04238, 0.13525), V2 = c(0.94567, 0.66442, 0.97822, 0.72079, 0.8376, 0.91312, 0.88463, 0.82432, 0.55582, 0.9429, 0.78956, 0.93424, 0.87692, 0.83996, 0.74552, 0.9779, 0.9958, 0.9783, 0.92523, 0.99022 )), .Names = c("V1", "V2"), row.names = c(NA, 20L), class = "data.frame")
From mitra at informatik.uni-tuebingen.de Wed Apr 6 10:39:21 2011 From: mitra at informatik.uni-tuebingen.de (suparna mitra) Date: Wed, 6 Apr 2011 10:39:21 +0200 Subject: [R] Creating a symmetric contingency table from two vectors with different length of levels in R Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From raji.sankaran at gmail.com Wed Apr 6 12:19:38 2011 From: raji.sankaran at gmail.com (Raji) Date: Wed, 6 Apr 2011 03:19:38 -0700 (PDT) Subject: [R] Help in kmeans Message-ID: <1302085178483-3430433.post@n4.nabble.com> Hi All, I was using the following command for performing kmeans for Iris dataset. Kmeans_model<-kmeans(dataFrame[,c(1,2,3,4)],centers=3) This was giving proper results for me.
But, in my application we generate the R commands dynamically and there was a requirement that the column names will be sent instead of column indices to the R commands.Hence, to incorporate this, i tried using the R commands in the following way. kmeans_model<-kmeans((SepalLength+SepalWidth+PetalLength+PetalWidth),centers=3) or kmeans_model<-kmeans(as.matrix(SepalLength,SepalWidth,PetalLength,PetalWidth),centers=3) In both the ways, we found that the results are different from what we saw with the first command (with column indices). can you please let us know what is going wrong here.If so, can you please let us know how the column names can be used in kmeans to obtain the correct results? Many thanks, Raji -- View this message in context: http://r.789695.n4.nabble.com/Help-in-kmeans-tp3430433p3430433.html Sent from the R help mailing list archive at Nabble.com. From rubenbauar at gmx.de Wed Apr 6 13:55:57 2011 From: rubenbauar at gmx.de (Chris82) Date: Wed, 6 Apr 2011 04:55:57 -0700 (PDT) Subject: [R] Problem to convert date to number Message-ID: <1302090957068-3430571.post@n4.nabble.com> Hi R users, I have a maybe small problem which I cannot solve by myself. I want to convert "chron" "dates" "times" (04/30/06 11:35:00) to a number with as.POSIXct. The Problem is that I can't choose different timezones. I always get "CEST" and not "UTC" what I need. date = as.POSIXct(y,tz="UTC") "2006-04-30 11:35:00 CEST" Then I tried to use as.POSIXlt. date = as.POSIXlt(y,tz="UTC") "2006-04-30 11:35:00 UTC" The advantage is I get time in UTC but now the problem is that I can calculate numbers. date <- as.double(date)/86400 it is not working with as.POSIXlt but with as.POSIXct Thanks! With best regards -- View this message in context: http://r.789695.n4.nabble.com/Problem-to-convert-date-to-number-tp3430571p3430571.html Sent from the R help mailing list archive at Nabble.com. 
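A minimal sketch of the date-to-number conversion asked about above. A literal timestamp stands in for the chron object y, which is an assumption; the point is only that a POSIXct value is stored as seconds since 1970-01-01 00:00:00 UTC, so the number does not depend on the zone used when printing.

```r
# A literal timestamp stands in for the chron object 'y' from the question.
stamp <- as.POSIXct("2006-04-30 11:35:00", tz = "UTC")

# POSIXct is a count of seconds since the epoch in UTC; the printed zone
# is only a display attribute and does not change this number.
secs <- as.numeric(stamp)
days <- secs / 86400

# format() with an explicit tz controls how the same instant is displayed.
shown <- format(stamp, tz = "UTC", usetz = TRUE)

# POSIXlt is a list of broken-down fields; converting it back to POSIXct
# first gives a plain number to do arithmetic on.
lt <- as.POSIXlt(stamp, tz = "UTC")
days_lt <- as.numeric(as.POSIXct(lt)) / 86400
```

So a value that prints as "CEST" can still carry the intended UTC-based number; checking as.numeric() directly is a quick way to tell a display issue from a real offset.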
From djmuser at gmail.com Wed Apr 6 14:00:33 2011 From: djmuser at gmail.com (Dennis Murphy) Date: Wed, 6 Apr 2011 05:00:33 -0700 Subject: [R] function order In-Reply-To: <7785EFE1CC2D9F4A96F6A415BC1C2BD0168DD5C0@PC6-46.pogb.cancer.ucl.ac.uk> References: <7785EFE1CC2D9F4A96F6A415BC1C2BD0168DD5C0@PC6-46.pogb.cancer.ucl.ac.uk> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From chrish at stats.ucl.ac.uk Wed Apr 6 14:00:27 2011 From: chrish at stats.ucl.ac.uk (Christian Hennig) Date: Wed, 6 Apr 2011 13:00:27 +0100 (BST) Subject: [R] Help in kmeans In-Reply-To: <1302085178483-3430433.post@n4.nabble.com> References: <1302085178483-3430433.post@n4.nabble.com> Message-ID: I'm not going to comment on column names, but this is just to make you aware that the results of k-means depend on random initialisation. This means that it is possible that you get different results if you run it several times. It basically gives you a local optimum and there may be more than one of these. Use set.seed to see whether this explains your problem. Best regards, Christian On Wed, 6 Apr 2011, Raji wrote: > Hi All, > > I was using the following command for performing kmeans for Iris dataset. > > Kmeans_model<-kmeans(dataFrame[,c(1,2,3,4)],centers=3) > > This was giving proper results for me. But, in my application we generate > the R commands dynamically and there was a requirement that the column names > will be sent instead of column indices to the R commands.Hence, to > incorporate this, i tried using the R commands in the following way. > > kmeans_model<-kmeans((SepalLength+SepalWidth+PetalLength+PetalWidth),centers=3) > > or > > kmeans_model<-kmeans(as.matrix(SepalLength,SepalWidth,PetalLength,PetalWidth),centers=3) > > In both the ways, we found that the results are different from what we saw > with the first command (with column indices). 
> > can you please let us know what is going wrong here.If so, can you please > let us know how the column names can be used in kmeans to obtain the correct > results? > > Many thanks, > Raji > > -- > View this message in context: http://r.789695.n4.nabble.com/Help-in-kmeans-tp3430433p3430433.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > *** --- *** Christian Hennig University College London, Department of Statistical Science Gower St., London WC1E 6BT, phone +44 207 679 1698 chrish at stats.ucl.ac.uk, www.homepages.ucl.ac.uk/~ucakche From murdoch.duncan at gmail.com Wed Apr 6 14:05:03 2011 From: murdoch.duncan at gmail.com (Duncan Murdoch) Date: Wed, 06 Apr 2011 08:05:03 -0400 Subject: [R] executing from .R source file in the src package In-Reply-To: References: Message-ID: <4D9C56EF.2090009@gmail.com> On 11-04-06 2:13 AM, Larry wrote: > Can I run R code straight from R src (.R) file instead of .rdb/.rdx? I of > course tried simply unzipping tar.gz in the R_LIBS directory but R complains > with "not a valid installed package". > > Real issue: I am very new to R and all, so this could be something basic. > I'm trying to use ess-tracebug (Emacs front-end to trace/browser et al). > It works great when I trace functions in .R files because browser outputs > filename+line-number when it hits a breakpoint. i.e. something like this: > > debug at /home/lgd/test/R/test3.R#1: { > > It even moves the cursor as you step through the function. This is just > lovely as I'm sure everyone knows. However, in case of a trace on function > in a package (ie loaded from a .rdb/.rdx) there is no filename/linenum > information probably because its not retained. i.e. 
it prints something > like this: > > debugging in: train(net, P, target, error.criterium = "LMS", report = TRUE, > > Any way to work around this? Thanks for all insights. I don't know ESS, but to get debug info in a package, you need to install the package with debug information included. By default source() includes it, installed packages don't. Setting the environment variable R_KEEP_PKG_SOURCE=yes before running R CMD INSTALL foo.tar.gz will install the debug information. Then the browser, etc. will list filename and line number information. This is also necessary for setBreakpoint to work, but then you'll also probably need to say which environments to look in, e.g. setBreakpoint("Sweave.R#70", env=environment(Sweave)) will set a breakpoint in whatever function is defined at line 70 of Sweave.R, as long as that function lives in the same environment as Sweave. Duncan Murdoch From djandrija at gmail.com Wed Apr 6 14:11:59 2011 From: djandrija at gmail.com (andrija djurovic) Date: Wed, 6 Apr 2011 14:11:59 +0200 Subject: [R] Creating a symmetric contingency table from two vectors with different length of levels in R In-Reply-To: References: Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From wwwhsd at gmail.com Wed Apr 6 14:16:46 2011 From: wwwhsd at gmail.com (Henrique Dallazuanna) Date: Wed, 6 Apr 2011 09:16:46 -0300 Subject: [R] Calculated mean value based on another column bin from dataframe. In-Reply-To: References: Message-ID: Try this: fil <- sapply(ran, '<', e1 = dat[,1]) & sapply(ran[2:(length(ran) + 1)], '>=', e1 = dat[,1]) mm <- apply(fil, 2, function(idx)mean(dat[idx, 2])) On Wed, Apr 6, 2011 at 5:48 AM, Fabrice Tourre wrote: > Dear list, > > I have a dataframe with two column as fellow. > >> head(dat) > ? ? ? V1 ? ? 
?V2 > ?0.15624 0.94567 > ?0.26039 0.66442 > ?0.16629 0.97822 > ?0.23474 0.72079 > ?0.11037 0.83760 > ?0.14969 0.91312 > > I want to get the column V2 mean value based on the bin of column of > V1. I write the code as fellow. It works, but I think this is not the > elegant way. Any suggestions? > > dat<-read.table("dat.txt",head=F) > ran<-seq(0,0.5,0.05) > mm<-NULL > for (i in c(1:(length(ran)-1))) > { > ? ?fil<- dat[,1] > ran[i] & dat[,1]<=ran[i+1] > ? ?m<-mean(dat[fil,2]) > ? ?mm<-c(mm,m) > } > mm > > Here is the first 20 lines of my data. > >> dput(head(dat,20)) > structure(list(V1 = c(0.15624, 0.26039, 0.16629, 0.23474, 0.11037, > 0.14969, 0.16166, 0.09785, 0.36417, 0.08005, 0.29597, 0.14856, > 0.17307, 0.36718, 0.11621, 0.23281, 0.10415, 0.1025, 0.04238, > 0.13525), V2 = c(0.94567, 0.66442, 0.97822, 0.72079, 0.8376, > 0.91312, 0.88463, 0.82432, 0.55582, 0.9429, 0.78956, 0.93424, > 0.87692, 0.83996, 0.74552, 0.9779, 0.9958, 0.9783, 0.92523, 0.99022 > )), .Names = c("V1", "V2"), row.names = c(NA, 20L), class = "data.frame") > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Henrique Dallazuanna Curitiba-Paran?-Brasil 25? 25' 40" S 49? 16' 22" O From wwwhsd at gmail.com Wed Apr 6 14:16:46 2011 From: wwwhsd at gmail.com (Henrique Dallazuanna) Date: Wed, 6 Apr 2011 09:16:46 -0300 Subject: [R] Calculated mean value based on another column bin from dataframe. In-Reply-To: References: Message-ID: Try this: fil <- sapply(ran, '<', e1 = dat[,1]) & sapply(ran[2:(length(ran) + 1)], '>=', e1 = dat[,1]) mm <- apply(fil, 2, function(idx)mean(dat[idx, 2])) On Wed, Apr 6, 2011 at 5:48 AM, Fabrice Tourre wrote: > Dear list, > > I have a dataframe with two column as fellow. > >> head(dat) > ? ? ? V1 ? ? 
V2
>  0.15624 0.94567
>  0.26039 0.66442
>  0.16629 0.97822
>  0.23474 0.72079
>  0.11037 0.83760
>  0.14969 0.91312
>
> I want to get the column V2 mean value based on the bin of column of
> V1. I write the code as follows. It works, but I think this is not the
> elegant way. Any suggestions?
>
> dat<-read.table("dat.txt",head=F)
> ran<-seq(0,0.5,0.05)
> mm<-NULL
> for (i in c(1:(length(ran)-1)))
> {
>    fil<- dat[,1] > ran[i] & dat[,1]<=ran[i+1]
>    m<-mean(dat[fil,2])
>    mm<-c(mm,m)
> }
> mm
>
> Here is the first 20 lines of my data.
>
>> dput(head(dat,20))
> structure(list(V1 = c(0.15624, 0.26039, 0.16629, 0.23474, 0.11037,
> 0.14969, 0.16166, 0.09785, 0.36417, 0.08005, 0.29597, 0.14856,
> 0.17307, 0.36718, 0.11621, 0.23281, 0.10415, 0.1025, 0.04238,
> 0.13525), V2 = c(0.94567, 0.66442, 0.97822, 0.72079, 0.8376,
> 0.91312, 0.88463, 0.82432, 0.55582, 0.9429, 0.78956, 0.93424,
> 0.87692, 0.83996, 0.74552, 0.9779, 0.9958, 0.9783, 0.92523, 0.99022
> )), .Names = c("V1", "V2"), row.names = c(NA, 20L), class = "data.frame")
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O

From therneau at mayo.edu Wed Apr 6 14:49:51 2011
From: therneau at mayo.edu (Terry Therneau)
Date: Wed, 6 Apr 2011 07:49:51 -0500
Subject: [R] simple save question
Message-ID: <1302094191.18197.8.camel@nemo>

--- begin inclusion--
Hi,

When I run the survfit function, I want to get the restricted mean
value and the standard error also. I found out using the "print"
function to do so, as shown below,
....
The question is, is there any way to extract these values from the print
command?
----- end inclusion ---

Use sfit <- summary(fit).
Then sfit$table contains the data that the print method produces.
No, there isn't a way to extract them from the print command; the
standard in S/R is for all print commands to return the object passed to
them, without embellishment.

Terry Therneau

From dwinsemius at comcast.net Wed Apr 6 14:53:46 2011
From: dwinsemius at comcast.net (David Winsemius)
Date: Wed, 6 Apr 2011 08:53:46 -0400
Subject: [R] Problem to convert date to number
In-Reply-To: <1302090957068-3430571.post@n4.nabble.com>
References: <1302090957068-3430571.post@n4.nabble.com>
Message-ID: <962B411B-983F-455A-A9BF-D3247159561D@comcast.net>

On Apr 6, 2011, at 7:55 AM, Chris82 wrote:

> Hi R users,
>
> I have a maybe small problem which I cannot solve by myself.
>
> I want to convert
>
> "chron" "dates" "times"
>
> (04/30/06 11:35:00)

Using the example from help(chron)

> as.POSIXlt(x)   # chron times are assumed to be UTC but are printed with the current local value
[1] "1992-02-27 18:03:20 EST" "1992-02-27 17:29:56 EST"
[3] "1992-01-13 20:03:30 EST" "1992-02-28 13:21:03 EST"
[5] "1992-02-01 11:56:26 EST"
> as.POSIXlt(x, tz="UTC")
[1] "1992-02-27 23:03:20 UTC" "1992-02-27 22:29:56 UTC"
[3] "1992-01-14 01:03:30 UTC" "1992-02-28 18:21:03 UTC"
[5] "1992-02-01 16:56:26 UTC"
> as.POSIXlt(x, tz="CEST")   # "not working"
[1] "1992-02-27 23:03:20 UTC" "1992-02-27 22:29:56 UTC"
[3] "1992-01-14 01:03:30 UTC" "1992-02-28 18:21:03 UTC"
[5] "1992-02-01 16:56:26 UTC"

So it makes me wonder if as.POSIXct considers CEST to be a valid tz value.

> as.POSIXlt(x, tz="XYZST")
[1] "1992-02-27 23:03:20 UTC" "1992-02-27 22:29:56 UTC"
[3] "1992-01-14 01:03:30 UTC" "1992-02-28 18:21:03 UTC"
[5] "1992-02-01 16:56:26 UTC"
> as.POSIXlt(x, tz="EST5EDT")   # where I am, and seems to be working
[1] "1992-02-27 18:03:20 EST" "1992-02-27 17:29:56 EST"
[3] "1992-01-13 20:03:30 EST" "1992-02-28 13:21:03 EST"
[5] "1992-02-01 11:56:26 EST"

But despite the returned code from Sys.time() that TLA (EDT) does not "work":

> Sys.time()
[1] "2011-04-06 08:44:01 EDT"
> as.POSIXlt(x, tz="EDT")   # "EDT" printed as UTC values
[1] "1992-02-27 23:03:20 UTC" "1992-02-27 22:29:56 UTC"
[3] "1992-01-14 01:03:30 UTC" "1992-02-28 18:21:03 UTC"
[5] "1992-02-01 16:56:26 UTC"

But an unambiguous version does return the expected offset. All of this
can be specific to your system (not provided) and your locale setting
(also not provided).

> as.POSIXlt(x, tz="UTC+02")
[1] "1992-02-27 21:03:20 UTC" "1992-02-27 20:29:56 UTC"
[3] "1992-01-13 23:03:30 UTC" "1992-02-28 16:21:03 UTC"
[5] "1992-02-01 14:56:26 UTC"

--
David.

>
>
> to a number with as.POSIXct.
>
> The Problem is that I can't choose different timezones. I always get
> "CEST"
> and not "UTC" what I need.
>
> date = as.POSIXct(y,tz="UTC")
>
> "2006-04-30 11:35:00 CEST"
> Then I tried to use as.POSIXlt.
>
> date = as.POSIXlt(y,tz="UTC")
>
> "2006-04-30 11:35:00 UTC"
>
> The advantage is I get time in UTC but now the problem is that I cannot
> calculate numbers.
>
> date <- as.double(date)/86400
>
> it is not working with as.POSIXlt but with as.POSIXct
>
>
> Thanks!
>
> With best regards
>
>
>
> --
> View this message in context: http://r.789695.n4.nabble.com/Problem-to-convert-date-to-number-tp3430571p3430571.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
West Hartford, CT

From therneau at mayo.edu Wed Apr 6 15:10:02 2011
From: therneau at mayo.edu (Terry Therneau)
Date: Wed, 06 Apr 2011 08:10:02 -0500
Subject: [R] frailty and survival curves
Message-ID: <1302095402.18197.26.camel@nemo>

With respect to Cox models + frailty, and post-fit survival curves.

1. There are two possible survival curves, the conditional curve where
we identify which center a subject comes from, and the marginal curve
where we have integrated out center and give survival for an
"unspecified" individual. I find the first more useful. More
importantly to your case, the survival package currently has no code to
calculate the second of these.

2. When the number of centers is large the coxph code may have used a
sparse approximation to the variance matrix, for speed reasons. In this
particular case one cannot use the "newdata" argument. The reason is
entirely practical --- the code turned out to be very hard to write.
The need for this comes up very rarely, and the work around is to use
    coxph(....... + frailty(center, sparse=1000, ....)
where we set the "sparse computation" threshold to be some number larger
than the number of centers, i.e., force non-sparse computation.
Terry Therneau

From Boris.Vasiliev at forces.gc.ca Wed Apr 6 15:26:44 2011
From: Boris.Vasiliev at forces.gc.ca (Boris.Vasiliev at forces.gc.ca)
Date: Wed, 6 Apr 2011 09:26:44 -0400
Subject: [R] lattice xscale.components: different ticks on top/bottom axis
In-Reply-To:
References: <201103101854.p2AIsL0A020203@hypatia.math.ethz.ch><4d962ecc.0132640a.0557.ffffcf28SMTPIN_ADDED@mx.google.com>
Message-ID: <201104061326.p36DQrif011923@hypatia.math.ethz.ch>

> On Sat, Apr 2, 2011 at 1:29 AM, wrote:
> >
> >
> >> On Fri, Mar 11, 2011 at 12:28 AM,
> >> wrote:
> >> > Good afternoon,
> >> >
> >> > I am trying to create a plot where the bottom and top
> >> > axes have the
> >> > same scale but different tick marks.  I tried user-defined
> >> > xscale.component function but it does not produce
> >> > desired results.
> >> > Can anybody suggest where my use of xscale.component function is
> >> > incorrect?
> >> >
> >> > For example, the code below tries to create a plot where
> >> > horizontal
> >> > axes limits are c(0,10), top axis has ticks at odd integers, and
> >> > bottom axis has ticks at even integers.
> >> >
> >> > library(lattice)
> >> >
> >> > df <- data.frame(x=1:10,y=1:10)
> >> >
> >> > xscale.components.A <- function(...,user.value=NULL) {
> >> >   # get default axes definition list; print user.value
> >> >   ans <- xscale.components.default(...)
> >> >   print(user.value)
> >> >
> >> >   # start with the same definition of bottom and top axes
> >> >   ans$top <- ans$bottom
> >> >
> >> >   # - bottom labels
> >> >   ans$bottom$labels$at <- seq(0,10,by=2)
> >> >   ans$bottom$labels$labels <- paste("B",seq(0,10,by=2),sep="-")
> >> >
> >> >   # - top labels
> >> >   ans$top$labels$at <- seq(1,9,by=2)
> >> >   ans$top$labels$labels <- paste("T",seq(1,9,by=2),sep="-")
> >> >
> >> >   # return axes definition list
> >> >   return(ans)
> >> > }
> >> >
> >> > oltc <- xyplot(y~x,data=df,
> >> >
> >> > scales=list(x=list(limits=c(0,10),at=0:10,alternating=3)),
> >> >               xscale.components=xscale.components.A,
> >> >               user.value=1)
> >> > print(oltc)
> >> >
> >> > The code generates a figure with incorrectly placed
> >> > bottom and top
> >> > labels.  Bottom labels "B-0", "B-2", ... are at 0, 1, ...
> >> > and top labels "T-1", "T-3", ... are at 0, 1, ...  When
> >> > axis-function runs out of labels, it replaces labels with NA.
> >> >
> >> > It appears that lattice uses top$ticks$at to place labels and
> >> > top$labels$labels for labels.  Is there a way to override this
> >> > behaviour (other than to expand the "labels$labels" vector to
> >> > be as long as "ticks$at" vector and set necessary elements to "")?
> >>
> >> Well, $ticks$at is used to place the ticks, and
> >> $labels$at is used to place the labels. They should
> >> typically be the
> >> same, but you have changed one and not the other.
> >> Everything seems to work if you set $ticks$at to the same
> >> values as $labels$at:
> >>
> >>
> >>     ##  - bottom labels
> >> +   ans$bottom$ticks$at <- seq(0,10,by=2)
> >>     ans$bottom$labels$at <- seq(0,10,by=2)
> >>     ans$bottom$labels$labels <- paste("B",seq(0,10,by=2),sep="-")
> >>
> >>     ##  - top labels
> >> +   ans$top$ticks$at <- seq(1,9,by=2)
> >>     ans$top$labels$at <- seq(1,9,by=2)
> >>     ans$top$labels$labels <- paste("T",seq(1,9,by=2),sep="-")
> >>
> >>
> >> > Also, can user-parameter be passed into xscale.components()
> >> > function? (For example, locations and labels of ticks on the top
> >> > axis).  In the code above, print(user.value) returns NULL even
> >> > though in the xyplot() call user.value is 1.
> >>
> >> No. Unrecognized arguments are passed to the panel function only, not
> >> to any other function. However, you can always define an inline
> >> function:
> >>
> >> oltc <- xyplot(y~x,data=df,
> >>                scales=list(x=list(limits=c(0,10), at = 0:10,
> >>                            alternating=3)),
> >>                xscale.components = function(...)
> >>                            xscale.components.A(..., user.value=1))
> >>
> >> Hope that helps (and sorry for the late reply).
> >>
> >> -Deepayan
> >>
> >
> > Deepayan,
> >
> > Thank you very much for your reply.  It makes things a bit clearer.
> >
> > In other words in the list prepared by xscale.components(), vectors
> > $ticks$at and $labels$at must be the same.
> > If only every second tick is to be labelled then every second label
> > should be set explicitly to empty strings:
>
> Now when you put it that way, the current behaviour does seem
> wrong (I didn't read your original post carefully enough). I
> guess this was one of the not-yet-implemented things
> mentioned in the Details section of ?xscale.components.default.
>
> I have added support for different ticks$at and labels$at in
> the SVN sources in r-forge. You can test it from there (your
> original code works as expected). I won't make a new release
> on CRAN until after R
> 2.13 is released (we are almost in code freeze now).
>
> -Deepayan
>

Great! Thank you very much.

Boris.

From dieter.menne at menne-biomed.de Wed Apr 6 15:39:46 2011
From: dieter.menne at menne-biomed.de (Dieter Menne)
Date: Wed, 6 Apr 2011 06:39:46 -0700 (PDT)
Subject: [R] Layout within levelplot from the lattice package
In-Reply-To: <350451.41580.qm@web39401.mail.mud.yahoo.com>
References: <350451.41580.qm@web39401.mail.mud.yahoo.com>
Message-ID: <1302097186206-3430812.post@n4.nabble.com>

Ian Renner wrote:
>
> Hi,
>
> I'm a novice with levelplot and need some assistance! Basically, I want a
> window
> which contains 6 levelplots of equal size presented in 3 columns and 2
> rows.
> ...
> Is there any way to concatenate levelplots from a factor vertically as
> opposed
> to horizontally?
>
>

Thanks for providing a self-contained example. Remembering my early
struggles with lattice, you must have needed some hours to get this
working!

Your last question is easy to answer: use a slightly different version of
split (see below).
To keep the code more transparent, separating plot generation from
display, I prefer the following scheme:

p = xyplot(...)
print(p, split....)

Normally one would try to use one data frame and panels for your type of
plot, but as I see this cannot be done here because you want different
color regions which is not vectorized. So doing it in three runs seems to
be fine, if Deepayan has no other solution.

To get the plots closer together you must find the correct par.settings.
This is one of the tricky parts in lattice, but try

str(trellis.par.get())

to find what is possible.

Dieter

# ---------------------------------------------------
library(lattice)
start = expand.grid(1:10,1:14)
start2 = rbind(start,start,start,start,start,start)
z = rnorm(840)
factor.1 = c(rep("A", 280), rep("B", 280), rep("C", 280))
factor.2 = c(rep("1", 140), rep("2", 140), rep("1", 140), rep("2", 140),
             rep("1", 140), rep("2", 140))
data = data.frame(start2, z, factor.1, factor.2)
names(data)[1:2] = c("x", "y")
data.A = data[data$factor.1 == "A",]
data.B = data[data$factor.1 == "B",]
data.C = data[data$factor.1 == "C",]
## End of data generation

doLevels = function(data, col.regions){
  levelplot(z~x*y|factor.2,data, col.regions=col.regions,asp="iso",
    xlab = "", ylab = "", colorkey = list(space="bottom"),
    scales=list(y=list(draw=F),x=list(draw=F)),
    par.settings=list(layout.widths=list(
      right.padding=-1,
      left.padding = -1
    ))
  )
}

p1 = doLevels(data.A,heat.colors)
p2 = doLevels(data.B,topo.colors)
p3 = doLevels(data.C,terrain.colors)

print(p1,split=c(1,1,3,1),more=TRUE)
print(p2,split=c(2,1,3,1),more=TRUE)
print(p3,split=c(3,1,3,1))

--
View this message in context: http://r.789695.n4.nabble.com/Layout-within-levelplot-from-the-lattice-package-tp3430421p3430812.html
Sent from the R help mailing list archive at Nabble.com.
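
[Editorial sketch, not part of the original post.] The scheme above
(build each trellis object first, then place it with print and split) can
be combined with an explicit layout to force the two panels of each plot
to stack vertically. This is only a minimal sketch under assumptions: the
data frame "toy" and the helper "stacked" are hypothetical names, the
colours are passed as plain vectors, and layout = c(1, 2) is one possible
way to get the 3 columns by 2 rows arrangement that was asked for, not
necessarily what the poster had in mind.

```r
library(lattice)

# Hypothetical toy data: a 10 x 14 grid repeated for two levels of a factor.
grid <- expand.grid(x = 1:10, y = 1:14)
set.seed(1)
toy <- data.frame(rbind(grid, grid),
                  z = rnorm(2 * nrow(grid)),
                  f = factor(rep(c("1", "2"), each = nrow(grid))))

# Hypothetical helper: layout = c(columns, rows), so c(1, 2) puts the two
# panels of the conditioning factor on top of each other.
stacked <- function(data, col.regions) {
  levelplot(z ~ x * y | f, data = data,
            col.regions = col.regions,
            layout = c(1, 2),
            aspect = "iso", xlab = "", ylab = "",
            colorkey = list(space = "bottom"),
            scales = list(draw = FALSE))
}

p1 <- stacked(toy, heat.colors(100))
p2 <- stacked(toy, topo.colors(100))
p3 <- stacked(toy, terrain.colors(100))

# split = c(x, y, nx, ny): position in a 3-wide, 1-high array of plots.
print(p1, split = c(1, 1, 3, 1), more = TRUE)
print(p2, split = c(2, 1, 3, 1), more = TRUE)
print(p3, split = c(3, 1, 3, 1))
```

Each of the three objects then occupies one column of the page and holds
two vertically stacked panels, which gives six levelplots in 3 columns
and 2 rows. The padding settings from the post above can be added to
par.settings in the same way if the columns should sit closer together.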
From bbolker at gmail.com Wed Apr 6 16:18:17 2011 From: bbolker at gmail.com (Ben Bolker) Date: Wed, 6 Apr 2011 14:18:17 +0000 Subject: [R] Package diveMove readTDR problem References: <1302070887814-3429980.post@n4.nabble.com> Message-ID: mwege zoology.up.ac.za> writes: > I am trying to read my TDR data into R using the readTDR function for the > diveMove package. > > > seal <- readTDR("file location and name here", dateCol=1, depthCol=3, > > speed=FALSE, subsamp=1, concurrentCols=4:5) > > But I keep getting the following error: > > Error: all(!is.na(time)) is not TRUE > > All my columns to have values in them (there are no empty records) > > The manual and vignette of the package diveMove doesnt give a proper > description of how to read data into R. It only describes how to access the > data in the system file that comes with the package. > What am I doing wrong? It's hard to answer this without a reproducible example, and it's generally harder to get answers about less-used/more specialized packages. I'm going to guess that at least some of your dates are not in the same format as specified (from the manual page: default is "%d/%m/%Y %H:%M:%S" -- you can change this with the 'dtformat' argument). You shouldn't need to specify arguments to the function that are the same as the defaults: I would expect that seal <- readTDR("file location and name here", subsamp=1, concurrentCols=4:5) would work equally well (or poorly ...) "%d/%m/%Y %H:%M:%S" From bkmooney at gmail.com Wed Apr 6 17:33:48 2011 From: bkmooney at gmail.com (Brigid Mooney) Date: Wed, 6 Apr 2011 11:33:48 -0400 Subject: [R] Decimal Accuracy Loss? Message-ID: This is hopefully a quick question on decimal accuracy. Is any decimal accuracy lost when casting a numeric vector as a matrix? And then again casting the result back to a numeric? 
I'm finding that my calculation values are different when I run for loops that manually calculate matrix multiplication as compared to when I cast the vectors as matrices and multiply them using "%*%". (The errors are very small, but the process is run iteratively thousands of times, at which point the error between the two differences becomes noticeable.) I've read FAQ # 7.31 "Why doesn't R think these numbers are equal?", but just want to confirm that the differences in values are due to differences in the matrix multiplication operator and manual calculation via for loops, rather than information that is lost when casting a numeric as a matrix and back again. Thanks in advance for the help, Brigid From gunter.berton at gene.com Wed Apr 6 17:45:06 2011 From: gunter.berton at gene.com (Bert Gunter) Date: Wed, 6 Apr 2011 08:45:06 -0700 Subject: [R] Decimal Accuracy Loss? In-Reply-To: References: Message-ID: Confirmed. "Casting" just adds/removes the dim attribute to the numeric vector/matrix. -- Bert On Wed, Apr 6, 2011 at 8:33 AM, Brigid Mooney wrote: > This is hopefully a quick question on decimal accuracy. ?Is any > decimal accuracy lost when casting a numeric vector as a matrix? ?And > then again casting the result back to a numeric? > > I'm finding that my calculation values are different when I run for > loops that manually calculate matrix multiplication as compared to > when I cast the vectors as matrices and multiply them using "%*%". > (The errors are very small, but the process is run iteratively > thousands of times, at which point the error between the two > differences becomes noticeable.) > > I've read FAQ # 7.31 "Why doesn't R think these numbers are equal?", > but just want to confirm that the differences in values are due to > differences in the matrix multiplication operator and manual > calculation via for loops, rather than information that is lost when > casting a numeric as a matrix and back again. 
> > Thanks in advance for the help, > Brigid > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- "Men by nature long to get on to the ultimate truths, and will often be impatient with elementary studies or fight shy of them. If it were possible to reach the ultimate truths without the elementary studies usually prefixed to them, these would not be preparatory studies but superfluous diversions." -- Maimonides (1135-1204) Bert Gunter Genentech Nonclinical Biostatistics From bkmooney at gmail.com Wed Apr 6 17:50:14 2011 From: bkmooney at gmail.com (Brigid Mooney) Date: Wed, 6 Apr 2011 11:50:14 -0400 Subject: [R] Decimal Accuracy Loss? In-Reply-To: References: Message-ID: Thanks, Bert. That's a big help. -Brigid On Wed, Apr 6, 2011 at 11:45 AM, Bert Gunter wrote: > Confirmed. "Casting" just adds/removes the dim attribute to the > numeric vector/matrix. > > -- Bert > > On Wed, Apr 6, 2011 at 8:33 AM, Brigid Mooney wrote: >> This is hopefully a quick question on decimal accuracy. ?Is any >> decimal accuracy lost when casting a numeric vector as a matrix? ?And >> then again casting the result back to a numeric? >> >> I'm finding that my calculation values are different when I run for >> loops that manually calculate matrix multiplication as compared to >> when I cast the vectors as matrices and multiply them using "%*%". >> (The errors are very small, but the process is run iteratively >> thousands of times, at which point the error between the two >> differences becomes noticeable.) 
>> >> I've read FAQ # 7.31 "Why doesn't R think these numbers are equal?", >> but just want to confirm that the differences in values are due to >> differences in the matrix multiplication operator and manual >> calculation via for loops, rather than information that is lost when >> casting a numeric as a matrix and back again. >> >> Thanks in advance for the help, >> Brigid >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > > > -- > "Men by nature long to get on to the ultimate truths, and will often > be impatient with elementary studies or fight shy of them. If it were > possible to reach the ultimate truths without the elementary studies > usually prefixed to them, these would not be preparatory studies but > superfluous diversions." > > -- Maimonides (1135-1204) > > Bert Gunter > Genentech Nonclinical Biostatistics > From savicky at praha1.ff.cuni.cz Wed Apr 6 17:58:08 2011 From: savicky at praha1.ff.cuni.cz (Petr Savicky) Date: Wed, 6 Apr 2011 17:58:08 +0200 Subject: [R] Decimal Accuracy Loss? In-Reply-To: References: Message-ID: <20110406155808.GA30208@praha1.ff.cuni.cz> On Wed, Apr 06, 2011 at 11:33:48AM -0400, Brigid Mooney wrote: > This is hopefully a quick question on decimal accuracy. Is any > decimal accuracy lost when casting a numeric vector as a matrix? And > then again casting the result back to a numeric? > > I'm finding that my calculation values are different when I run for > loops that manually calculate matrix multiplication as compared to > when I cast the vectors as matrices and multiply them using "%*%". > (The errors are very small, but the process is run iteratively > thousands of times, at which point the error between the two > differences becomes noticeable.) 
> > I've read FAQ # 7.31 "Why doesn't R think these numbers are equal?", > but just want to confirm that the differences in values are due to > differences in the matrix multiplication operator and manual > calculation via for loops, rather than information that is lost when > casting a numeric as a matrix and back again. Others already confirmed that casting a numeric as a matrix and back again does not change the numbers. It is likely that the library operator "%*%" is more accurate than a straightforward for loop. For example, sum(x) uses a more accurate algorithm than iteration of s <- s + x[i] in double precision. Petr Savicky. From krcabrer at une.net.co Wed Apr 6 18:04:11 2011 From: krcabrer at une.net.co (KENNETH R CABRERA) Date: Wed, 06 Apr 2011 11:04:11 -0500 Subject: [R] Use of the dot.dot.dot option in functions. Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From dimitri.liakhovitski at gmail.com Wed Apr 6 18:14:02 2011 From: dimitri.liakhovitski at gmail.com (Dimitri Liakhovitski) Date: Wed, 6 Apr 2011 12:14:02 -0400 Subject: [R] summing values by week - based on daily dates - but with some dates missing In-Reply-To: References: Message-ID: Guys, sorry to bother you again: I am running everything as before (see code below - before the line with a lot of ######). But now I am getting an error: Error in eval(expr, envir, enclos) : could not find function "na.locf" I also noticed that after I run the 3rd line from the bottom: "wk <- as.numeric(format(myframe$dates, "%Y.%W"))" - there are some weeks that end with .00 And then, after I run the 2nd line from the bottom: "is.na(wk) <- wk %% 1 == 0" those weeks turn into NAs. Whether I run the second line or not - I get the same error about it not finding the function "na.locf". Do you know what might be going on? Thanks a lot! 
Dimitri ### Creating a longer example data set: mydates<-rep(seq(as.Date("2008-12-29"), length = 500, by = "day"),2) myfactor<-c(rep("group.1",500),rep("group.2",500)) set.seed(123) myvalues<-runif(1000,0,1) myframe<-data.frame(dates=mydates,group=myfactor,value=myvalues) (myframe) dim(myframe) ## Removing same rows (dates) unsystematically: set.seed(123) removed.group1<-sample(1:500,size=150,replace=F) set.seed(456) removed.group2<-sample(501:1000,size=150,replace=F) to.remove<-c(removed.group1,removed.group2);length(to.remove) to.remove<-to.remove[order(to.remove)] myframe<-myframe[-to.remove,] (myframe) dim(myframe) names(myframe)# write.csv(myframe,file="x.test.csv",row.names=F) wk <- as.numeric(format(myframe$dates, "%Y.%W")) is.na(wk) <- wk %% 1 == 0 solution<-aggregate(value ~ group + na.locf(wk), myframe, FUN = sum) ############################################################### On Wed, Mar 30, 2011 at 5:25 PM, Henrique Dallazuanna wrote: > You're right: > > wk <- as.numeric(format(myframe$dates, "%Y.%W")) > is.na(wk) <- wk %% 1 == 0 > solution<-aggregate(value ~ group + na.locf(wk), myframe, FUN = sum) > > > On Wed, Mar 30, 2011 at 6:10 PM, Dimitri Liakhovitski > wrote: >> Yes, zoo! That's what I forgot. It's great. >> Henrique, thanks a lot! One question: >> >> if the data are as I originally posted - then week numbered 52 is >> actually the very first week (it straddles 2008-2009). >> What if the data much longer (like in the code below - same as before, >> but more dates) so that we have more than 1 year to deal with. >> It looks like this code is lumping everything into 52 weeks. And my >> goal is to keep each week independent. If I have 2 years, then it >> should be 100+ weeks. Makes sense? >> Thank you! 
>> >> ### Creating a longer example data set: >> mydates<-rep(seq(as.Date("2008-12-29"), length = 500, by = "day"),2) >> myfactor<-c(rep("group.1",500),rep("group.2",500)) >> set.seed(123) >> myvalues<-runif(1000,0,1) >> myframe<-data.frame(dates=mydates,group=myfactor,value=myvalues) >> (myframe) >> dim(myframe) >> >> ## Removing same rows (dates) unsystematically: >> set.seed(123) >> removed.group1<-sample(1:500,size=150,replace=F) >> set.seed(456) >> removed.group2<-sample(501:1000,size=150,replace=F) >> to.remove<-c(removed.group1,removed.group2);length(to.remove) >> to.remove<-to.remove[order(to.remove)] >> myframe<-myframe[-to.remove,] >> (myframe) >> dim(myframe) >> names(myframe) >> >> library(zoo) >> wk <- as.numeric(format(myframe$dates, '%W')) >> is.na(wk) <- wk == 0 >> solution<-aggregate(value ~ group + na.locf(wk), myframe, FUN = sum) >> solution<-solution[order(solution$group),] >> write.csv(solution,file="test.csv",row.names=F) >> >> >> >> On Wed, Mar 30, 2011 at 4:45 PM, Henrique Dallazuanna wrote: >>> Try this: >>> >>> library(zoo) >>> wk <- as.numeric(format(myframe$dates, '%W')) >>> is.na(wk) <- wk == 0 >>> aggregate(value ~ group + na.locf(wk), myframe, FUN = sum) >>> >>> >>> >>> On Wed, Mar 30, 2011 at 4:35 PM, Dimitri Liakhovitski >>> wrote: >>>> Henrique, this is great, thank you! >>>> >>>> It's almost what I was looking for! Only one small thing - it doesn't >>>> "merge" the results for weeks that "straddle" 2 years. In my example - >>>> last week of year 2008 and the very first week of 2009 are one week. >>>> Any way to "join them"? >>>> Asking because in reality I'll have many years and hundreds of groups >>>> - hence, it'll be hard to do it manually. >>>> >>>> >>>> BTW - does format(dates,"%Y.%W") always consider weeks as starting with Mondays? >>>> >>>> Thank you very much! 
>>>> Dimitri >>>> >>>> >>>> On Wed, Mar 30, 2011 at 2:55 PM, Henrique Dallazuanna wrote: >>>>> Try this: >>>>> >>>>> aggregate(value ~ group + format(dates, "%Y.%W"), myframe, FUN = sum) >>>>> >>>>> >>>>> On Wed, Mar 30, 2011 at 11:23 AM, Dimitri Liakhovitski >>>>> wrote: >>>>>> Dear everybody, >>>>>> >>>>>> I have the following challenge. I have a data set with 2 subgroups, >>>>>> dates (days), and corresponding values (see example code below). >>>>>> Within each subgroup: I need to aggregate (sum) the values by week - >>>>>> for weeks that start on a Monday (for example, 2008-12-29 was a >>>>>> Monday). >>>>>> I find it difficult because I have missing dates in my data - so that >>>>>> sometimes I don't even have the date for some Mondays. So, I can't >>>>>> write a proper loop. >>>>>> I want my output to look something like this: >>>>>> group ? dates ? value >>>>>> group.1 2008-12-29 ?3.0937 >>>>>> group.1 2009-01-05 ?3.8833 >>>>>> group.1 2009-01-12 ?1.362 >>>>>> ... >>>>>> group.2 2008-12-29 ?2.250 >>>>>> group.2 2009-01-05 ?1.4057 >>>>>> group.2 2009-01-12 ?3.4411 >>>>>> ... >>>>>> >>>>>> Thanks a lot for your suggestions! 
The code is below: >>>>>> Dimitri >>>>>> >>>>>> ### Creating example data set: >>>>>> mydates<-rep(seq(as.Date("2008-12-29"), length = 43, by = "day"),2) >>>>>> myfactor<-c(rep("group.1",43),rep("group.2",43)) >>>>>> set.seed(123) >>>>>> myvalues<-runif(86,0,1) >>>>>> myframe<-data.frame(dates=mydates,group=myfactor,value=myvalues) >>>>>> (myframe) >>>>>> dim(myframe) >>>>>> >>>>>> ## Removing same rows (dates) unsystematically: >>>>>> set.seed(123) >>>>>> removed.group1<-sample(1:43,size=11,replace=F) >>>>>> set.seed(456) >>>>>> removed.group2<-sample(44:86,size=11,replace=F) >>>>>> to.remove<-c(removed.group1,removed.group2);length(to.remove) >>>>>> to.remove<-to.remove[order(to.remove)] >>>>>> myframe<-myframe[-to.remove,] >>>>>> (myframe) >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> Dimitri Liakhovitski >>>>>> Ninah Consulting >>>>>> www.ninah.com >>>>>> >>>>>> ______________________________________________ >>>>>> R-help at r-project.org mailing list >>>>>> https://stat.ethz.ch/mailman/listinfo/r-help >>>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >>>>>> and provide commented, minimal, self-contained, reproducible code. >>>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> Henrique Dallazuanna >>>>> Curitiba-Paran?-Brasil >>>>> 25? 25' 40" S 49? 16' 22" O >>>>> >>>> >>>> >>>> >>>> -- >>>> Dimitri Liakhovitski >>>> Ninah Consulting >>>> www.ninah.com >>>> >>> >>> >>> >>> -- >>> Henrique Dallazuanna >>> Curitiba-Paran?-Brasil >>> 25? 25' 40" S 49? 16' 22" O >>> >> >> >> >> -- >> Dimitri Liakhovitski >> Ninah Consulting >> www.ninah.com >> > > > > -- > Henrique Dallazuanna > Curitiba-Paran?-Brasil > 25? 25' 40" S 49? 
16' 22" O > -- Dimitri Liakhovitski Ninah Consulting www.ninah.com From dimitri.liakhovitski at gmail.com Wed Apr 6 18:18:56 2011 From: dimitri.liakhovitski at gmail.com (Dimitri Liakhovitski) Date: Wed, 6 Apr 2011 12:18:56 -0400 Subject: [R] summing values by week - based on daily dates - but with some dates missing In-Reply-To: References: Message-ID: Sorry - never mind. It turns out I did not load the zoo package. That was the reason. On Wed, Apr 6, 2011 at 12:14 PM, Dimitri Liakhovitski wrote: > Guys, sorry to bother you again: > > I am running everything as before (see code below - before the line > with a lot of ######). But now I am getting an error: > Error in eval(expr, envir, enclos) : could not find function "na.locf" > I also noticed that after I run the 3rd line from the bottom: "wk <- > as.numeric(format(myframe$dates, "%Y.%W"))" - there are some weeks > that end with .00 > And then, after I run the 2nd line from the bottom: "is.na(wk) <- wk > %% 1 == 0" those weeks turn into NAs. > Whether I run the second line or not - I get the same error about it > not finding the function "na.locf". > Do you know what might be going on? > Thanks a lot! 
> Dimitri > > ### Creating a longer example data set: > mydates<-rep(seq(as.Date("2008-12-29"), length = 500, by = "day"),2) > myfactor<-c(rep("group.1",500),rep("group.2",500)) > set.seed(123) > myvalues<-runif(1000,0,1) > myframe<-data.frame(dates=mydates,group=myfactor,value=myvalues) > (myframe) > dim(myframe) > > ## Removing same rows (dates) unsystematically: > set.seed(123) > removed.group1<-sample(1:500,size=150,replace=F) > set.seed(456) > removed.group2<-sample(501:1000,size=150,replace=F) > to.remove<-c(removed.group1,removed.group2);length(to.remove) > to.remove<-to.remove[order(to.remove)] > myframe<-myframe[-to.remove,] > (myframe) > dim(myframe) > names(myframe)# write.csv(myframe,file="x.test.csv",row.names=F) > > wk <- as.numeric(format(myframe$dates, "%Y.%W")) > is.na(wk) <- wk %% 1 == 0 > solution<-aggregate(value ~ group + na.locf(wk), myframe, FUN = sum) > > > > > ############################################################### > > On Wed, Mar 30, 2011 at 5:25 PM, Henrique Dallazuanna wrote: >> You're right: >> >> wk <- as.numeric(format(myframe$dates, "%Y.%W")) >> is.na(wk) <- wk %% 1 == 0 >> solution<-aggregate(value ~ group + na.locf(wk), myframe, FUN = sum) >> >> >> On Wed, Mar 30, 2011 at 6:10 PM, Dimitri Liakhovitski >> wrote: >>> Yes, zoo! That's what I forgot. It's great. >>> Henrique, thanks a lot! One question: >>> >>> if the data are as I originally posted - then week numbered 52 is >>> actually the very first week (it straddles 2008-2009). >>> What if the data much longer (like in the code below - same as before, >>> but more dates) so that we have more than 1 year to deal with. >>> It looks like this code is lumping everything into 52 weeks. And my >>> goal is to keep each week independent. If I have 2 years, then it >>> should be 100+ weeks. Makes sense? >>> Thank you! 
>>> >>> ### Creating a longer example data set: >>> mydates<-rep(seq(as.Date("2008-12-29"), length = 500, by = "day"),2) >>> myfactor<-c(rep("group.1",500),rep("group.2",500)) >>> set.seed(123) >>> myvalues<-runif(1000,0,1) >>> myframe<-data.frame(dates=mydates,group=myfactor,value=myvalues) >>> (myframe) >>> dim(myframe) >>> >>> ## Removing same rows (dates) unsystematically: >>> set.seed(123) >>> removed.group1<-sample(1:500,size=150,replace=F) >>> set.seed(456) >>> removed.group2<-sample(501:1000,size=150,replace=F) >>> to.remove<-c(removed.group1,removed.group2);length(to.remove) >>> to.remove<-to.remove[order(to.remove)] >>> myframe<-myframe[-to.remove,] >>> (myframe) >>> dim(myframe) >>> names(myframe) >>> >>> library(zoo) >>> wk <- as.numeric(format(myframe$dates, '%W')) >>> is.na(wk) <- wk == 0 >>> solution<-aggregate(value ~ group + na.locf(wk), myframe, FUN = sum) >>> solution<-solution[order(solution$group),] >>> write.csv(solution,file="test.csv",row.names=F) >>> >>> >>> >>> On Wed, Mar 30, 2011 at 4:45 PM, Henrique Dallazuanna wrote: >>>> Try this: >>>> >>>> library(zoo) >>>> wk <- as.numeric(format(myframe$dates, '%W')) >>>> is.na(wk) <- wk == 0 >>>> aggregate(value ~ group + na.locf(wk), myframe, FUN = sum) >>>> >>>> >>>> >>>> On Wed, Mar 30, 2011 at 4:35 PM, Dimitri Liakhovitski >>>> wrote: >>>>> Henrique, this is great, thank you! >>>>> >>>>> It's almost what I was looking for! Only one small thing - it doesn't >>>>> "merge" the results for weeks that "straddle" 2 years. In my example - >>>>> last week of year 2008 and the very first week of 2009 are one week. >>>>> Any way to "join them"? >>>>> Asking because in reality I'll have many years and hundreds of groups >>>>> - hence, it'll be hard to do it manually. >>>>> >>>>> >>>>> BTW - does format(dates,"%Y.%W") always consider weeks as starting with Mondays? >>>>> >>>>> Thank you very much! 
>>>>> Dimitri >>>>> >>>>> >>>>> On Wed, Mar 30, 2011 at 2:55 PM, Henrique Dallazuanna wrote: >>>>>> Try this: >>>>>> >>>>>> aggregate(value ~ group + format(dates, "%Y.%W"), myframe, FUN = sum) >>>>>> >>>>>> >>>>>> On Wed, Mar 30, 2011 at 11:23 AM, Dimitri Liakhovitski >>>>>> wrote: >>>>>>> Dear everybody, >>>>>>> >>>>>>> I have the following challenge. I have a data set with 2 subgroups, >>>>>>> dates (days), and corresponding values (see example code below). >>>>>>> Within each subgroup: I need to aggregate (sum) the values by week - >>>>>>> for weeks that start on a Monday (for example, 2008-12-29 was a >>>>>>> Monday). >>>>>>> I find it difficult because I have missing dates in my data - so that >>>>>>> sometimes I don't even have the date for some Mondays. So, I can't >>>>>>> write a proper loop. >>>>>>> I want my output to look something like this: >>>>>>> group   dates   value >>>>>>> group.1 2008-12-29  3.0937 >>>>>>> group.1 2009-01-05  3.8833 >>>>>>> group.1 2009-01-12  1.362 >>>>>>> ... >>>>>>> group.2 2008-12-29  2.250 >>>>>>> group.2 2009-01-05  1.4057 >>>>>>> group.2 2009-01-12  3.4411 >>>>>>> ... >>>>>>> >>>>>>> Thanks a lot for your suggestions! 
The code is below: >>>>>>> Dimitri >>>>>>> >>>>>>> ### Creating example data set: >>>>>>> mydates<-rep(seq(as.Date("2008-12-29"), length = 43, by = "day"),2) >>>>>>> myfactor<-c(rep("group.1",43),rep("group.2",43)) >>>>>>> set.seed(123) >>>>>>> myvalues<-runif(86,0,1) >>>>>>> myframe<-data.frame(dates=mydates,group=myfactor,value=myvalues) >>>>>>> (myframe) >>>>>>> dim(myframe) >>>>>>> >>>>>>> ## Removing same rows (dates) unsystematically: >>>>>>> set.seed(123) >>>>>>> removed.group1<-sample(1:43,size=11,replace=F) >>>>>>> set.seed(456) >>>>>>> removed.group2<-sample(44:86,size=11,replace=F) >>>>>>> to.remove<-c(removed.group1,removed.group2);length(to.remove) >>>>>>> to.remove<-to.remove[order(to.remove)] >>>>>>> myframe<-myframe[-to.remove,] >>>>>>> (myframe) >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Dimitri Liakhovitski >>>>>>> Ninah Consulting >>>>>>> www.ninah.com >>>>>>> >>>>>>> ______________________________________________ >>>>>>> R-help at r-project.org mailing list >>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help >>>>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >>>>>>> and provide commented, minimal, self-contained, reproducible code. >>>>>>> >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> Henrique Dallazuanna >>>>>> Curitiba-Paraná-Brasil >>>>>> 25° 25' 40" S 49° 16' 22" O >>>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> Dimitri Liakhovitski >>>>> Ninah Consulting >>>>> www.ninah.com >>>>> >>>> >>>> >>>> >>>> -- >>>> Henrique Dallazuanna >>>> Curitiba-Paraná-Brasil >>>> 25° 25' 40" S 49° 16' 22" O >>>> >>> >>> >>> >>> -- >>> Dimitri Liakhovitski >>> Ninah Consulting >>> www.ninah.com >>> >> >> >> >> -- >> Henrique Dallazuanna >> Curitiba-Paraná-Brasil >> 25° 25' 40" S 49° 
16' 22" O >> > > > > -- > Dimitri Liakhovitski > Ninah Consulting > www.ninah.com > -- Dimitri Liakhovitski Ninah Consulting www.ninah.com From pdalgd at gmail.com Wed Apr 6 18:21:49 2011 From: pdalgd at gmail.com (peter dalgaard) Date: Wed, 6 Apr 2011 18:21:49 +0200 Subject: [R] Decimal Accuracy Loss? In-Reply-To: <20110406155808.GA30208@praha1.ff.cuni.cz> References: <20110406155808.GA30208@praha1.ff.cuni.cz> Message-ID: <2E562D58-BEC9-4F21-8100-E2A244D84629@gmail.com> On Apr 6, 2011, at 17:58 , Petr Savicky wrote: > On Wed, Apr 06, 2011 at 11:33:48AM -0400, Brigid Mooney wrote: >> This is hopefully a quick question on decimal accuracy. Is any >> decimal accuracy lost when casting a numeric vector as a matrix? And >> then again casting the result back to a numeric? >> >> I'm finding that my calculation values are different when I run for >> loops that manually calculate matrix multiplication as compared to >> when I cast the vectors as matrices and multiply them using "%*%". >> (The errors are very small, but the process is run iteratively >> thousands of times, at which point the error between the two >> differences becomes noticeable.) >> >> I've read FAQ # 7.31 "Why doesn't R think these numbers are equal?", >> but just want to confirm that the differences in values are due to >> differences in the matrix multiplication operator and manual >> calculation via for loops, rather than information that is lost when >> casting a numeric as a matrix and back again. > > Others already confirmed that casting a numeric as a matrix and back > again does not change the numbers. It is likely that the library > operator "%*%" is more accurate than a straightforward for loop. > For example, sum(x) uses a more accurate algorithm than iteration > of s <- s + x[i] in double precision. Even more likely, %*% is optimized for speed by reordering the additions and multiplications to take advantage of pipelining and CPU caches at several levels. 
This may or may not improve accuracy, but certainly does affect the last bits in the results. -- Peter Dalgaard Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com From cheba.meier at googlemail.com Wed Apr 6 18:53:56 2011 From: cheba.meier at googlemail.com (cheba meier) Date: Wed, 6 Apr 2011 17:53:56 +0100 Subject: [R] metaplot Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From murdoch.duncan at gmail.com Wed Apr 6 18:56:01 2011 From: murdoch.duncan at gmail.com (Duncan Murdoch) Date: Wed, 06 Apr 2011 12:56:01 -0400 Subject: [R] Use of the dot.dot.dot option in functions. In-Reply-To: References: Message-ID: <4D9C9B21.8010601@gmail.com> On 06/04/2011 12:04 PM, KENNETH R CABRERA wrote: > Hi R users: > > I try this code, where "fun" is a parameter of a random generating > function name, and I pretend to use "..." parameter to pass the parameters > of different random generating functions. > > What am I doing wrong? > > f1<-function(nsim=20,n=10,fun=rnorm,...){ > vp<-replicate(nsim,t.test(fun(n,...),fun(n,...))$p.value) > return(vp) > } > > This works! > f1() > f1(n=20,mean=10) > > This two fails: > f1(n=10,fun=rexp) > f1(n=10,fun=rbeta,shape1=1,shape2=2) > > Thank you for your help. I imagine it's a scoping problem: replicate() is probably not evaluating the ... in the context you think it is. You could debug this by writing a function like showArgs <- function(n, ...) { print(n) print(list(...)) } and calling f1(n=10, fun=showArgs), but it might be easier just to avoid the problem: f1 <- function(nsim=20,n=10,fun=rnorm,...){ force(fun) force(n) localfun <- function() fun(n, ...) 
vp<-replicate(nsim,t.test(localfun(), localfun())$p.value) return(vp) } From mtmorgan at fhcrc.org Wed Apr 6 18:59:25 2011 From: mtmorgan at fhcrc.org (Martin Morgan) Date: Wed, 06 Apr 2011 09:59:25 -0700 Subject: [R] General binary search? In-Reply-To: <77EB52C6DD32BA4D87471DCD70C8D7000412520C@NA-PA-VBE03.na.tibco.com> References: <77EB52C6DD32BA4D87471DCD70C8D7000412520C@NA-PA-VBE03.na.tibco.com> Message-ID: <4D9C9BED.9020804@fhcrc.org> On 04/04/2011 01:50 PM, William Dunlap wrote: >> -----Original Message----- >> From: r-help-bounces at r-project.org >> [mailto:r-help-bounces at r-project.org] On Behalf Of Stavros Macrakis >> Sent: Monday, April 04, 2011 1:15 PM >> To: r-help >> Subject: [R] General binary search? >> >> Is there a generic binary search routine in a standard library which >> >> a) works for character vectors >> b) runs in O(log(N)) time? >> >> I'm aware of findInterval(x,vec), but it is restricted to >> numeric vectors. > > xtfrm(x) will convert a character (or other) vector to > a numeric vector with the same ordering. findInterval > can work on that. E.g., > > f0<- function(x, vec) { > tmp<- xtfrm(c(x, vec)) > findInterval(tmp[seq_along(x)], tmp[-seq_along(x)]) > } > > f0(c("Baby", "Aunt", "Dog"), LETTERS) > [1] 2 1 4 > I've never looked at its speed. For a little progress (though no 'generic binary search in a standard library'), here's the 'one-liner' bsearch1 <- function(val, tab, L=1L, H=length(tab)) { while (H >= L) { M <- L + (H - L) %/% 2L if (tab[M] > val) H <- M - 1L else if (tab[M] < val) L <- M + 1L else return(M) } return(L - 1L) } It seems like a good candidate for the new (R-2.13) 'compiler' package, so library(compiler) bsearch2 <- cmpfun(bsearch1) And Bill's suggestion bsearch3 <- function(val, tab) { tmp <- xtfrm(c(val, tab)) findInterval(tmp[seq_along(val)], tmp[-seq_along(val)]) } which will work best when 'val' is a vector to be looked up. 
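A minimal sketch of what "lower bound" means for the scalar one-liner, assuming a small sorted numeric table (the vector `tab` is hypothetical and chosen only so the answers can be checked by hand; bsearch1 is repeated from the message so the block runs on its own):

```r
# bsearch1 as posted in the message, repeated here for self-containment.
bsearch1 <- function(val, tab, L = 1L, H = length(tab)) {
    while (H >= L) {
        M <- L + (H - L) %/% 2L
        if (tab[M] > val) H <- M - 1L
        else if (tab[M] < val) L <- M + 1L
        else return(M)
    }
    return(L - 1L)
}

tab <- c(1, 3, 5, 7)        # hypothetical sorted table, not from the thread

exact   <- bsearch1(5, tab) # exact hit: index of the matching element
between <- bsearch1(4, tab) # no hit: index of the largest element below val
below   <- bsearch1(0, tab) # smaller than every element: 0
above   <- bsearch1(9, tab) # larger than every element: length(tab)
```

A return value of 0 signals that no element of the table is less than or equal to the value, which is the same convention findInterval uses.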
A quick look at data.table:::sortedmatch seemed to return matches, whereas Stavros is looking for lower bounds. It seems that one could shift the weight more to C code by 'vectorizing' the one-liner, first as bsearch5 <- function(val, tab, L=1L, H=length(tab)) { b <- cbind(L=rep(L, length(val)), H=rep(H, length(val))) i0 <- seq_along(val) repeat { M <- b[i0,"L"] + (b[i0,"H"] - b[i0,"L"]) %/% 2L i <- tab[M] > val[i0] b[i0 + i * length(val)] <- ifelse(i, M - 1L, ifelse(tab[M] < val[i0], M + 1L, M)) i0 <- which(b[i0, "H"] >= b[i0, "L"]) if (!length(i0)) break; } b[,"L"] - 1L } and then a little more thoughtfully (though more room for improvement) as bsearch7 <- function(val, tab, L=1L, H=length(tab)) { b <- cbind(L=rep(L, length(val)), H=rep(H, length(val))) i0 <- seq_along(val) repeat { updt <- M <- b[i0,"L"] + (b[i0,"H"] - b[i0,"L"]) %/% 2L tabM <- tab[M] val0 <- val[i0] i <- tabM < val0 updt[i] <- M[i] + 1L i <- tabM > val0 updt[i] <- M[i] - 1L b[i0 + i * length(val)] <- updt i0 <- which(b[i0, "H"] >= b[i0, "L"]) if (!length(i0)) break; } b[,"L"] - 1L } none of bsearch 3, 5, or 7 is likely to benefit substantially from compilation. Here's a little test data set converting numeric to character as an easy cheat. 
set.seed(123L) x <- sort(as.character(rnorm(1e6))) y <- as.character(rnorm(1e4)) There seems to be some significant initial overhead, so we warm things up (and also introduce the paradigm for multiple look-ups in bsearch 1, 2) warmup <- function(y, x) { lapply(y, bsearch1, x) lapply(y, bsearch2, x) bsearch3(y, x) bsearch5(y, x) bsearch7(y, x) } replicate(3, warmup(y, x)) and then time > system.time(res1 <- unlist(lapply(y, bsearch1, x), use.names=FALSE)) user system elapsed 2.692 0.000 2.696 > system.time(res2 <- unlist(lapply(y, bsearch2, x), use.names=FALSE)) user system elapsed 1.379 0.000 1.380 > identical(res1, res2) [1] TRUE > system.time(res3 <- bsearch3(y, x)); identical(res1, res3) user system elapsed 8.339 0.001 8.350 [1] TRUE > system.time(res5 <- bsearch5(y, x)); identical(res1, res5) user system elapsed 0.700 0.000 0.702 [1] TRUE > system.time(res7 <- bsearch7(y, x)); identical(res1, res7) user system elapsed 0.222 0.000 0.222 [1] TRUE Martin > > Bill Dunlap > Spotfire, TIBCO Software > wdunlap tibco.com > >> >> I'm also aware of various hashing solutions (e.g. >> new.env(hash=TRUE) and >> fastmatch), but I need the greatest-lower-bound match in my >> application. >> >> findInterval is also slow for large N=length(vec) because of the O(N) >> checking it does, as Duncan Murdoch has pointed >> out: >> though >> its documentation says it runs in O(n * log(N)), it actually >> runs in O(n * >> log(N) + N), which is quite noticeable for largish N. But >> that is easy >> enough to work around by writing a variant of findInterval which calls >> find_interv_vec without checking. >> >> -s >> >> PS Yes, binary search is a one-liner in R, but I always prefer to use >> standard, fast native libraries when possible.... 
>> >> binarysearch<- function(val,tab,L,H) {while (H>=L) { >> M=L+(H-L) %/% 2; if >> (tab[M]>val) H<-M-1 else if (tab[M]<val) L<-M+1 else return(M)} >> return(L-1)} >> >> [[alternative HTML version deleted]] >> >> ______________________________________________ >> R-help at r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. -- Computational Biology Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: M1-B861 Telephone: 206 667-2793 From tal.galili at gmail.com Wed Apr 6 19:08:30 2011 From: tal.galili at gmail.com (Tal Galili) Date: Wed, 6 Apr 2011 20:08:30 +0300 Subject: [R] Examples of web-based Sweave use? In-Reply-To: References: <1301920279327-3425324.post@n4.nabble.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From scttchamberlain4 at gmail.com Wed Apr 6 19:09:34 2011 From: scttchamberlain4 at gmail.com (Scott Chamberlain) Date: Wed, 6 Apr 2011 12:09:34 -0500 Subject: [R] metaplot In-Reply-To: References: Message-ID: <555A67D111024124AA898630658BE499@gmail.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From jhibschman at gmail.com Wed Apr 6 19:37:34 2011 From: jhibschman at gmail.com (Johann Hibschman) Date: Wed, 6 Apr 2011 12:37:34 -0500 Subject: [R] unexpected sort order with merge Message-ID: `merge` lists sorted as if by character, not by the actual class of the by-columns. 
> tmp <- merge(data.frame(f=ordered(c("a","b","b","a","b"), levels=c("b","a")), x=1:5), data.frame(f=ordered(c("a","b"), levels=c("b","a")), y=c(10,20))) > tmp f x y 1 a 1 10 2 a 4 10 3 b 2 20 4 b 3 20 5 b 5 20 > tmp[order(tmp$f),] f x y 3 b 2 20 4 b 3 20 5 b 5 20 1 a 1 10 2 a 4 10 I expected the second order, not the first. I actually ran into this issue when merging zoo yearmon columns, but that adds a package dependency. In that context, I observed different behavior depending on whether I had one key or two: > library(zoo) > d1 <- data.frame(date=as.yearmon(2000 + (0:5)/12), icpn=500, foo=1:6) > d2 <- data.frame(date=as.yearmon(2000 + (0:5)/12), icpn=500, bar=10*1:6) > merge(d1,d2) date icpn foo bar 1 Apr 2000 500 4 40 2 Feb 2000 500 2 20 3 Jan 2000 500 1 10 4 Jun 2000 500 6 60 5 Mar 2000 500 3 30 6 May 2000 500 5 50 > d1 <- data.frame(date=as.yearmon(2000 + (0:5)/12), foo=1:6) > d2 <- data.frame(date=as.yearmon(2000 + (0:5)/12), bar=10*1:6) > merge(d1,d2) date foo bar 1 Jan 2000 1 10 2 Feb 2000 2 20 3 Mar 2000 3 30 4 Apr 2000 4 40 5 May 2000 5 50 6 Jun 2000 6 60 The first example appears to sort by the name of the date, not by the actual date value. The documentation of `merge` says the sort is "lexicographic", but I assumed that was in the cartesian-product sense, not in some convert-everything-to-character sense. Is this behavior expected? Thanks, Johann P.S. 
> sessionInfo() R version 2.10.1 (2009-12-14) x86_64-unknown-linux-gnu locale: [1] C attached base packages: [1] grid splines stats graphics grDevices utils datasets [8] methods base other attached packages: [1] ggplot2_0.8.8 reshape_0.8.3 Rauto_1.0 plyr_1.1 [5] zoo_1.6-4 Hmisc_3.7-0 survival_2.35-8 ascii_0.7 [9] proto_0.3-8 loaded via a namespace (and not attached): [1] cluster_1.12.1 digest_0.4.2 lattice_0.17-26 tools_2.10.1 From krcabrer at une.net.co Wed Apr 6 21:00:58 2011 From: krcabrer at une.net.co (KENNETH R CABRERA) Date: Wed, 06 Apr 2011 14:00:58 -0500 Subject: [R] [SOLVED] Re: Use of the dot.dot.dot option in functions. In-Reply-To: <4D9C9B21.8010601@gmail.com> References: <4D9C9B21.8010601@gmail.com> Message-ID: An embedded and charset-unspecified text was scrubbed... Name: not available URL: From bogaso.christofer at gmail.com Wed Apr 6 21:40:12 2011 From: bogaso.christofer at gmail.com (Bogaso Christofer) Date: Thu, 7 Apr 2011 01:10:12 +0530 Subject: [R] A zoo related question Message-ID: <008401cbf492$78301250$689036f0$@gmail.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From bogaso.christofer at gmail.com Wed Apr 6 21:57:18 2011 From: bogaso.christofer at gmail.com (Bogaso Christofer) Date: Thu, 7 Apr 2011 01:27:18 +0530 Subject: [R] A zoo related question Message-ID: <008901cbf494$d84c7e60$88e57b20$@gmail.com> An embedded and charset-unspecified text was scrubbed... Name: not available URL: From ggrothendieck at gmail.com Wed Apr 6 21:33:45 2011 From: ggrothendieck at gmail.com (Gabor Grothendieck) Date: Wed, 6 Apr 2011 15:33:45 -0400 Subject: [R] A zoo related question In-Reply-To: <008401cbf492$78301250$689036f0$@gmail.com> References: <008401cbf492$78301250$689036f0$@gmail.com> Message-ID: On Wed, Apr 6, 2011 at 3:40 PM, Bogaso